Training method, prediction method and device of advertisement click rate prediction model

Document No.: 9092  Publication date: 2021-09-17

1. A training method of an advertisement click-through rate prediction model, wherein the advertisement click-through rate prediction model comprises a feature extraction network and a prediction network which are sequentially cascaded, and the method comprises the following steps:

acquiring a training sample data set, wherein the training sample data set comprises a plurality of sample data groups, and each sample data group comprises user characteristic data and advertisement characteristic data;

inputting the sample data groups in the training sample data set into the feature extraction network, and outputting effective feature data, wherein the feature extraction network comprises a feature combination network and an effective feature extraction network which are sequentially cascaded; and

training the prediction network by using the effective feature data to obtain a trained advertisement click-through rate prediction model.

2. The method of claim 1, wherein the feature combination network comprises N feature combination sub-networks;

the inputting the training sample data set into the feature extraction network and outputting effective feature data comprises:

inputting the training sample data set into the N feature combination sub-networks, wherein the N feature combination sub-networks respectively output first feature values;

generating a second feature value from the N first feature values by using a one-hot encoding algorithm; and

inputting the second feature value into the effective feature extraction network, and outputting the effective feature data.

3. The method of claim 2, wherein the effective feature extraction network comprises a convolutional layer and a pooling layer cascaded in sequence;

the inputting the second feature value into the effective feature extraction network and outputting the effective feature data comprises:

inputting the second feature value into the convolutional layer and outputting first feature data; and

inputting the first feature data into the pooling layer and outputting the effective feature data.

4. The method of claim 2, wherein the inputting the training sample data set into the N feature combination sub-networks comprises:

splicing the user characteristic data and the advertisement characteristic data to obtain feature data, and inputting the feature data into the N feature combination sub-networks.

5. The method of claim 1, wherein each sample data group further comprises label information;

the training the prediction network by using the effective feature data to obtain a trained advertisement click-through rate prediction model comprises:

inputting the effective feature data into the prediction network, and outputting a prediction result, wherein the prediction result represents the probability of a sample user clicking a sample advertisement, the sample user comprises a user corresponding to the user characteristic data, and the sample advertisement comprises an advertisement corresponding to the advertisement characteristic data; and

iteratively adjusting the network parameters of the prediction network and the effective feature extraction network according to the prediction result, with the goal of making the prediction result approach the label information, until the prediction network and the effective feature extraction network converge, to obtain the trained advertisement click-through rate prediction model.

6. The method of claim 1, wherein the feature combination network comprises a feature combination network constructed based on a gradient boosting decision tree algorithm;

the effective feature extraction network comprises an effective feature extraction network constructed based on a convolutional neural network.

7. The method of claim 1, wherein the user characteristic data comprises one or more of: user location information, user basic information, user equipment information.

8. The method of claim 1, wherein the advertisement characteristic data comprises one or more of: advertisement type, advertiser name, advertisement height, advertisement width, and consumer group corresponding to the advertisement.

9. An advertisement click-through rate prediction method, comprising:

acquiring a data set to be tested, wherein the data set to be tested comprises a data group to be tested, and the data group to be tested comprises target advertisement characteristic data and target user characteristic data; and

inputting the data set to be tested into an advertisement click-through rate prediction model, and outputting a prediction result, wherein the prediction result represents the probability of a target user clicking a target advertisement, and the advertisement click-through rate prediction model is trained by the training method according to any one of claims 1 to 8.

10. A training device of an advertisement click-through rate prediction model, wherein the advertisement click-through rate prediction model comprises a feature extraction network and a prediction network which are sequentially cascaded, and the device comprises:

a first acquisition module, configured to acquire a training sample data set, wherein the training sample data set comprises a plurality of sample data groups, and each sample data group comprises user characteristic data and advertisement characteristic data;

an input module, configured to input the sample data groups in the training sample data set into the feature extraction network and output effective feature data, wherein the feature extraction network comprises a feature combination network and an effective feature extraction network which are sequentially cascaded; and

a training module, configured to train the prediction network by using the effective feature data to obtain a trained advertisement click-through rate prediction model.

11. An advertisement click-through rate prediction apparatus comprising:

a second acquisition module, configured to acquire a data set to be tested, wherein the data set to be tested comprises a data group to be tested, and the data group to be tested comprises target advertisement characteristic data and target user characteristic data; and

a prediction module, configured to input the data set to be tested into an advertisement click-through rate prediction model and output a prediction result, wherein the prediction result represents the probability of a target user clicking a target advertisement, and the advertisement click-through rate prediction model is trained by the training method according to any one of claims 1 to 8.

12. An electronic device, comprising:

one or more processors;

a storage device for storing one or more programs,

wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method of any of claims 1-9.

13. A computer readable storage medium having stored thereon executable instructions which, when executed by a processor, cause the processor to perform the method of any one of claims 1 to 9.

14. A computer program product comprising a computer program which, when executed by a processor, implements a method according to any one of claims 1 to 9.

Background

Advertisement click-through rate prediction has important reference value in actual business. It predicts the probability that a user clicks an advertisement from advertisement data and user data.

In the process of realizing the concept of the present disclosure, the inventors found that advertisement click-through rate prediction methods in the related art do not make good use of advertisement data and user data, and therefore often suffer from the technical problem of inaccurate prediction results.

Disclosure of Invention

In view of the foregoing, the present disclosure provides a training method of an advertisement click-through rate prediction model, an advertisement click-through rate prediction method, apparatus, device, medium, and program product.

According to a first aspect of the present disclosure, there is provided a training method for an advertisement click-through rate prediction model, where the advertisement click-through rate prediction model includes a feature extraction network and a prediction network which are sequentially cascaded, and the method includes:

acquiring a training sample data set, wherein the training sample data set comprises a plurality of sample data groups, and each sample data group comprises user characteristic data and advertisement characteristic data;

inputting the sample data groups in the training sample data set into the feature extraction network, and outputting effective feature data, wherein the feature extraction network comprises a feature combination network and an effective feature extraction network which are sequentially cascaded; and

training the prediction network by using the effective feature data to obtain a trained advertisement click-through rate prediction model.

According to an embodiment of the present disclosure, the feature combination network includes N feature combination subnetworks;

the inputting the training sample data set into the feature extraction network and outputting effective feature data includes:

inputting the training sample data set into the N feature combination sub-networks, wherein the N feature combination sub-networks respectively output first feature values;

generating a second feature value from the N first feature values by using a one-hot encoding algorithm; and

inputting the second feature value into the effective feature extraction network and outputting the effective feature data.

According to an embodiment of the present disclosure, the effective feature extraction network includes a convolutional layer and a pooling layer which are sequentially cascaded;

the inputting the second feature value into the effective feature extraction network and outputting the effective feature data includes:

inputting the second feature value into the convolutional layer and outputting first feature data; and

inputting the first feature data into the pooling layer and outputting the effective feature data.

According to an embodiment of the present disclosure, the inputting the training sample data set into the N feature combination subnetworks includes:

splicing the user characteristic data and the advertisement characteristic data to obtain feature data, and inputting the feature data into the N feature combination sub-networks.

According to an embodiment of the present disclosure, each sample data group further includes label information;

the training the prediction network by using the effective feature data to obtain the trained advertisement click-through rate prediction model includes:

inputting the effective feature data into the prediction network, and outputting a prediction result, wherein the prediction result represents the probability of a sample user clicking a sample advertisement, the sample user comprises a user corresponding to the user characteristic data, and the sample advertisement comprises an advertisement corresponding to the advertisement characteristic data; and

iteratively adjusting the network parameters of the prediction network and the effective feature extraction network according to the prediction result, with the goal of making the prediction result approach the label information, until the prediction network and the effective feature extraction network converge, to obtain the trained advertisement click-through rate prediction model.
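The iterative adjustment described above can be illustrated with a deliberately simplified sketch. It assumes a single sigmoid unit in place of the full prediction network and effective feature extraction network, a cross-entropy loss, plain gradient descent, and a convergence test based on the change in loss; none of these choices is specified by the disclosure. The names `train`, `predict`, `lr`, `tol`, and the sample values are hypothetical.

```python
import math

def predict(weights, bias, x):
    """Predicted click probability for one feature vector (sigmoid unit)."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def log_loss(weights, bias, samples):
    """Mean cross-entropy between predictions and label information."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for x, y in samples:
        p = predict(weights, bias, x)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(samples)

def train(samples, lr=0.5, tol=1e-6, max_iters=10000):
    """Adjust parameters until the loss stops changing (assumed convergence test)."""
    n = len(samples[0][0])
    weights, bias = [0.0] * n, 0.0
    prev = cur = log_loss(weights, bias, samples)
    for _ in range(max_iters):
        grad_w, grad_b = [0.0] * n, 0.0
        for x, y in samples:
            err = predict(weights, bias, x) - y  # prediction minus label
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        weights = [w - lr * g / len(samples) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(samples)
        cur = log_loss(weights, bias, samples)
        if abs(prev - cur) < tol:
            break
        prev = cur
    return weights, bias, cur

# Hypothetical effective feature data with label information (1 = clicked).
samples = [([0, 1, 0, 0, 1], 1), ([1, 0, 0, 1, 0], 0)]
weights, bias, final_loss = train(samples)
```

In a full implementation, the same loop would update the parameters of both the prediction network and the effective feature extraction network, typically by backpropagation.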

According to an embodiment of the present disclosure, the feature combination network includes a feature combination network constructed based on a gradient boosting decision tree algorithm;

the effective feature extraction network comprises an effective feature extraction network constructed based on a convolutional neural network.

According to an embodiment of the present disclosure, the user characteristic data includes one or more of the following: user location information, user basic information, user equipment information.

According to an embodiment of the present disclosure, the advertisement characteristic data includes one or more of: advertisement type, advertiser name, advertisement height, advertisement width, and consumer group corresponding to the advertisement.

A second aspect of the present disclosure provides an advertisement click rate prediction method, including:

acquiring a data set to be tested, wherein the data set to be tested comprises a data group to be tested, and the data group to be tested comprises target advertisement characteristic data and target user characteristic data; and

inputting the data set to be tested into an advertisement click-through rate prediction model, and outputting a prediction result, wherein the prediction result represents the probability of a target user clicking a target advertisement, and the advertisement click-through rate prediction model is trained by the training method of the advertisement click-through rate prediction model provided by the embodiments of the present disclosure.

A third aspect of the present disclosure provides a training apparatus for an advertisement click-through rate prediction model, where the advertisement click-through rate prediction model includes a feature extraction network and a prediction network that are sequentially cascaded, and the apparatus includes:

a first acquisition module, configured to acquire a training sample data set, wherein the training sample data set comprises a plurality of sample data groups, and each sample data group comprises user characteristic data and advertisement characteristic data;

an input module, configured to input the sample data groups in the training sample data set into the feature extraction network and output effective feature data, wherein the feature extraction network includes a feature combination network and an effective feature extraction network that are sequentially cascaded; and

a training module, configured to train the prediction network by using the effective feature data to obtain a trained advertisement click-through rate prediction model.

A fourth aspect of the present disclosure provides an advertisement click rate prediction apparatus, including:

a second acquisition module, configured to acquire a data set to be tested, wherein the data set to be tested comprises a data group to be tested, and the data group to be tested comprises target advertisement characteristic data and target user characteristic data; and

a prediction module, configured to input the data set to be tested into an advertisement click-through rate prediction model and output a prediction result, wherein the prediction result represents the probability of the target user clicking the target advertisement, and the advertisement click-through rate prediction model is trained by the training method of the advertisement click-through rate prediction model provided by the embodiments of the present disclosure.

A fifth aspect of the present disclosure provides an electronic device, comprising: one or more processors; and a memory for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the training method of the advertisement click-through rate prediction model and the advertisement click-through rate prediction method described above.

A sixth aspect of the present disclosure provides a computer-readable storage medium having stored thereon executable instructions that, when executed by a processor, cause the processor to perform the training method of the advertisement click-through rate prediction model and the advertisement click-through rate prediction method described above.

A seventh aspect of the present disclosure provides a computer program product including a computer program which, when executed by a processor, implements the training method of the advertisement click-through rate prediction model and the advertisement click-through rate prediction method described above.

Drawings

The foregoing and other objects, features and advantages of the disclosure will be apparent from the following description of embodiments of the disclosure, which proceeds with reference to the accompanying drawings, in which:

FIG. 1 schematically shows an application scenario diagram of a training method of an advertisement click-through rate prediction model, an advertisement click-through rate prediction method, a training device of an advertisement click-through rate prediction model, and an advertisement click-through rate prediction device according to an embodiment of the disclosure;

FIG. 2 schematically illustrates a flow chart of a method of training an advertisement click-through rate prediction model according to an embodiment of the present disclosure;

FIG. 3 schematically illustrates a flow chart of inputting a training sample data set into a feature extraction network and outputting effective feature data according to an embodiment of the present disclosure;

FIG. 4 schematically illustrates a flow chart of inputting a second feature value into an effective feature extraction network and outputting effective feature data according to an embodiment of the present disclosure;

FIG. 5 schematically illustrates a flow chart for training a predictive network using effective feature data to obtain a trained advertisement click-through rate prediction model according to an embodiment of the present disclosure;

FIG. 6 schematically illustrates a flow chart of a method of advertisement click-through rate prediction according to an embodiment of the present disclosure;

FIG. 7 is a block diagram schematically illustrating an architecture of a training apparatus for an advertisement click-through rate prediction model according to an embodiment of the present disclosure;

FIG. 8 is a block diagram schematically illustrating an advertisement click-through rate prediction apparatus according to an embodiment of the present disclosure; and

FIG. 9 schematically illustrates a block diagram of an electronic device adapted to implement a training method of an advertisement click-through rate prediction model, an advertisement click-through rate prediction method, according to an embodiment of the present disclosure.

Detailed Description

Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is illustrative only and is not intended to limit the scope of the present disclosure. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present disclosure.

The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.

All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It is noted that the terms used herein should be interpreted as having a meaning that is consistent with the context of this specification and should not be interpreted in an idealized or overly formal sense.

Where a convention analogous to "at least one of A, B and C, etc." is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., "a system having at least one of A, B and C" would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, C together, etc.).

Click-through rate (CTR) prediction has important reference value in actual business; it predicts the probability of a user clicking an advertisement from advertisement data and user data. However, advertisement data is usually high-dimensional and mostly sparse, so the information within features and the mutual information among features are difficult to mine, which in turn degrades CTR prediction performance.

In order to at least partially solve the technical problems in the related art, the present disclosure provides a training method for an advertisement click-through rate prediction model, which can be applied to the financial field and the artificial intelligence technical field. The training method of the advertisement click-through rate prediction model comprises the following steps: acquiring a training sample data set, wherein the training sample data set comprises a plurality of sample data groups, and each sample data group comprises user characteristic data and advertisement characteristic data; inputting the sample data groups in the training sample data set into a feature extraction network, and outputting effective feature data, wherein the feature extraction network comprises a feature combination network and an effective feature extraction network which are sequentially cascaded; and training a prediction network by using the effective feature data to obtain a trained advertisement click-through rate prediction model. The disclosure also provides an advertisement click-through rate prediction method, a training device of an advertisement click-through rate prediction model, an advertisement click-through rate prediction device, an electronic device, a storage medium and a program product.

It should be noted that the method and apparatus provided by the embodiments of the present disclosure may be used in the financial field and the artificial intelligence technology field, and may also be used in any other field.

Fig. 1 schematically shows an application scenario diagram of a training method of an advertisement click-through rate prediction model, an advertisement click-through rate prediction method, a training device of an advertisement click-through rate prediction model, and an advertisement click-through rate prediction device according to an embodiment of the disclosure.

As shown in fig. 1, the application scenario 100 according to this embodiment may include terminal devices 101, 102, 103, a network 104 and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.

The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like. The terminal devices 101, 102, 103 may have installed thereon various communication client applications, such as shopping applications, web browser applications, search applications, instant messaging tools, mailbox clients, social platform software, etc. (by way of example only).

The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop computers, desktop computers, and the like.

The server 105 may be a server providing various services, such as a background management server (for example only) providing support for websites browsed by users using the terminal devices 101, 102, 103. The background management server may analyze and perform other processing on the received data such as the user request, and feed back a processing result (e.g., a webpage, information, or data obtained or generated according to the user request) to the terminal device.

It should be noted that the training method of the advertisement click-through rate prediction model and the advertisement click-through rate prediction method provided by the embodiment of the present disclosure may be generally executed by the server 105. Accordingly, the training device of the advertisement click-through rate prediction model and the advertisement click-through rate prediction device provided by the embodiments of the present disclosure may be generally disposed in the server 105. The training method of the advertisement click-through rate prediction model and the advertisement click-through rate prediction method provided by the embodiments of the present disclosure may also be executed by a server or a server cluster that is different from the server 105 and is capable of communicating with the terminal devices 101, 102, 103 and/or the server 105. Accordingly, the training device of the advertisement click-through rate prediction model and the advertisement click-through rate prediction device provided by the embodiment of the present disclosure may also be disposed in a server or a server cluster that is different from the server 105 and is capable of communicating with the terminal devices 101, 102, 103 and/or the server 105.

It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.

The training method of the advertisement click-through rate prediction model according to embodiments of the present disclosure will be described in detail below with reference to fig. 2 to 5, based on the scenario described in fig. 1.

According to the embodiment of the disclosure, the advertisement click-through rate prediction model comprises a feature extraction network and a prediction network which are sequentially cascaded.

FIG. 2 schematically illustrates a flow chart of a method of training an advertisement click-through rate prediction model according to an embodiment of the present disclosure.

As shown in fig. 2, the training method of the advertisement click-through rate prediction model of this embodiment includes operations S201 to S203.

In operation S201, a training sample data set is obtained, where the training sample data set includes a plurality of sample data groups, and each sample data group includes user characteristic data and advertisement characteristic data.

According to embodiments of the present disclosure, the user characteristic data and/or the advertisement characteristic data may be vector data.

According to an embodiment of the present disclosure, the user characteristic data and the advertisement characteristic data in each sample data group may form one feature data pair.

According to an embodiment of the present disclosure, for example, the training sample data set may be $Y = \{A_1(a_1, b_1) \mid A_2(a_2, b_2)\}$, where $A_1$ and $A_2$ may represent sample data groups, $a_1$ and $a_2$ may represent user characteristic data, and $b_1$ and $b_2$ may represent advertisement characteristic data.
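As a minimal sketch of how such a training sample data set might be held in memory, the following assumes each sample data group stores its user characteristic data and advertisement characteristic data as plain numeric vectors. The class name `SampleDataGroup`, the method `spliced`, and all feature values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class SampleDataGroup:
    """One sample data group A_i(a_i, b_i): a user/advertisement feature pair."""
    user_features: List[float]  # a_i, user characteristic data as a vector
    ad_features: List[float]    # b_i, advertisement characteristic data as a vector

    def spliced(self) -> List[float]:
        # Splice (concatenate) the two vectors into a single feature vector.
        return self.user_features + self.ad_features

# Hypothetical training sample data set Y with two sample data groups.
training_set = [
    SampleDataGroup(user_features=[1.0, 0.0, 35.0], ad_features=[0.0, 1.0]),
    SampleDataGroup(user_features=[0.0, 1.0, 22.0], ad_features=[1.0, 0.0]),
]
```

The `spliced` method corresponds to the splicing of user characteristic data and advertisement characteristic data that precedes the feature combination sub-networks.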

In operation S202, the sample data groups in the training sample data set are input to a feature extraction network, and effective feature data is output, where the feature extraction network includes a feature combination network and an effective feature extraction network that are sequentially cascaded.

According to an embodiment of the present disclosure, the feature combination network may have good nonlinear fitting capability and can effectively mine low-order information in the feature data.

According to an embodiment of the present disclosure, the effective feature extraction network may have good feature correlation mining capability, so that it can perform further feature extraction on the feature data output by the feature combination network and mine the effective data in the user characteristic data and the advertisement characteristic data.

In operation S203, the prediction network is trained using the effective feature data, and a trained advertisement click rate prediction model is obtained.

According to the embodiment of the disclosure, the prediction network can be constructed by sequentially cascading the fully-connected layer and the output layer.

According to an embodiment of the present disclosure, the output layer may include an output neuron that outputs the prediction result of the prediction network. The prediction result may be output in vector form, but is not limited thereto; it may also be output in numerical form. For example, a prediction result of 0.8 indicates that the probability that the user corresponding to the user characteristic data in the sample data group clicks the advertisement corresponding to the advertisement characteristic data is 80%.

According to an embodiment of the present disclosure, the activation function of the output neuron may comprise a softmax activation function.

According to embodiments of the present disclosure, a predictive network may include two fully connected layers.
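A forward pass through such a prediction network could look like the sketch below. It assumes two fully connected layers with ReLU activations followed by an output layer whose softmax spans two classes (no click, click), so that the second component can be read as the click probability. The layer sizes, the ReLU choice, the two-class layout, and the randomly initialized weights are all illustrative assumptions rather than details given in the disclosure.

```python
import math
import random

def dense(x, weights, bias):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]

def relu(v):
    return [max(0.0, t) for t in v]

def softmax(v):
    m = max(v)  # subtract the maximum for numerical stability
    exps = [math.exp(t - m) for t in v]
    total = sum(exps)
    return [e / total for e in exps]

def init_layer(n_in, n_out, rng):
    """Hypothetical initialization: small uniform weights, zero biases."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def predict_click_probability(features, layers):
    (w1, b1), (w2, b2), (w_out, b_out) = layers
    hidden = relu(dense(features, w1, b1))      # first fully connected layer
    hidden = relu(dense(hidden, w2, b2))        # second fully connected layer
    probs = softmax(dense(hidden, w_out, b_out))  # output layer
    return probs[1]  # assumed layout: index 1 is the "click" class

rng = random.Random(0)
layers = [init_layer(5, 4, rng), init_layer(4, 3, rng), init_layer(3, 2, rng)]
p = predict_click_probability([0.0, 1.0, 0.0, 0.0, 1.0], layers)
```

With untrained weights the returned value carries no meaning beyond being a valid probability; training would adjust the weights so that it approaches the label information.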

According to embodiments of the present disclosure, the user characteristic data and the advertisement characteristic data undergo two rounds of feature extraction, through the feature combination network and the effective feature extraction network, so that both kinds of data are better utilized and the correlation between them is mined. The effective feature data produced by these two rounds of feature extraction is then used to train the prediction network, yielding the trained advertisement click-through rate prediction model. When the trained model is used to predict the advertisement click-through rate, the technical problem of inaccurate prediction results in related-art advertisement click-through rate prediction methods can be at least partially solved.

According to an embodiment of the present disclosure, the feature combination network includes N feature combination subnetworks.

Fig. 3 schematically shows a flowchart of inputting a training sample data set into a feature extraction network and outputting effective feature data according to an embodiment of the present disclosure.

As shown in fig. 3, inputting a training sample data set into a feature extraction network and outputting effective feature data according to this embodiment includes operations S301 to S303.

In operation S301, the training sample data set is input into the N feature combination sub-networks, which respectively output first feature values.

In operation S302, a second feature value is generated from the N first feature values using a one-hot encoding algorithm.

In operation S303, the second feature value is input to the effective feature extraction network, and effective feature data is output.

According to an embodiment of the present disclosure, a feature combination sub-network may include a plurality of nodes, each of which may output a node value.

According to an embodiment of the present disclosure, the node values output by all nodes of the feature combination sub-network constitute a first feature value.

According to an embodiment of the present disclosure, for example, the feature combination network comprises a feature combination subnetwork 1 and a feature combination subnetwork 2, wherein the feature combination subnetwork 1 comprises three nodes and the feature combination subnetwork 2 comprises two nodes.

According to an embodiment of the present disclosure, for example, the training sample data set may be input into the feature combination sub-network 1 and the feature combination sub-network 2 respectively. If the node values output by the three nodes of the feature combination sub-network 1 are 0, 1, and 0, the first feature value output by the feature combination sub-network 1 may be 010. If the node values output by the two nodes of the feature combination sub-network 2 are 0 and 1, the first feature value output by the feature combination sub-network 2 may be 01. The two first feature values, 010 and 01, can then be concatenated to generate 01001, i.e. the second feature value of operation S302.
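The worked example above can be reproduced with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the node values are taken directly from the example rather than computed by real sub-networks.

```python
def first_feature_value(node_values):
    """Join the 0/1 node values of one sub-network into a string such as '010'."""
    return "".join(str(v) for v in node_values)


def concatenate(first_values):
    """Concatenate the first feature values of all sub-networks, in order."""
    return "".join(first_values)


# Node values from the example: sub-network 1 has three nodes, sub-network 2 has two.
sub1 = first_feature_value([0, 1, 0])
sub2 = first_feature_value([0, 1])

combined = concatenate([sub1, sub2])
print(combined)  # 01001
```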

According to the embodiments of the present disclosure, by regarding the output value of each of the N feature combination sub-networks as one category of feature and then generating valid feature data from the outputs of the N feature combination sub-networks, it is possible to acquire feature data highly correlated with the click rate accuracy.

According to an embodiment of the present disclosure, the effective feature extraction network includes a convolutional layer and a pooling layer that are cascaded in sequence.

Fig. 4 schematically shows a flowchart of inputting a second feature value into a valid feature extraction network and outputting valid feature data according to an embodiment of the present disclosure.

As shown in fig. 4, inputting the second feature value into the valid feature extraction network and outputting valid feature data according to this embodiment includes operations S401 to S402.

In operation S401, the second feature value is input to the convolutional layer, and the first feature data is output.

In operation S402, the first feature data is input into the pooling layer, and valid feature data is output.

According to the embodiment of the disclosure, the convolutional layer can use local perception and a parameter sharing mechanism to learn different feature domains in the second feature value and output the first feature data. The pooling layer then extracts high-order feature data from the first feature data by scaling the feature domain of the first feature data, and outputs the valid feature data.

According to embodiments of the present disclosure, the high-order feature data may include, for example, feature data highly correlated with click rate accuracy.
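Operations S401 and S402 can be sketched with a one-dimensional convolution followed by pooling. The kernel values, the choice of max pooling, and the window size are assumptions made for illustration; the disclosure does not fix them. The same kernel is applied at every position, which is what parameter sharing refers to, and each output depends only on a short window of the input, which corresponds to local perception.

```python
def conv1d(signal, kernel):
    """Slide one shared kernel over every window of the input (no padding)."""
    k = len(kernel)
    return [sum(kernel[j] * signal[i + j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def max_pool1d(values, size):
    """Non-overlapping max pooling; a trailing partial window is kept."""
    return [max(values[i:i + size]) for i in range(0, len(values), size)]


second_feature_value = [0, 1, 0, 0, 1]  # the code 01001 as a list of ints
kernel = [1, 0, 1]                      # hypothetical kernel, for illustration only

first_feature_data = conv1d(second_feature_value, kernel)   # operation S401
valid_feature_data = max_pool1d(first_feature_data, 2)      # operation S402

print(first_feature_data)  # [0, 1, 1]
print(valid_feature_data)  # [1, 1]
```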

According to an embodiment of the present disclosure, inputting the training sample data set into the N feature combination sub-networks in operation S301 further includes the following operation:

Splicing the user feature data and the advertisement feature data to obtain feature data, and inputting the feature data into the N feature combination sub-networks.

According to the embodiment of the present disclosure, for example, the user characteristic data and the advertisement characteristic data may be sequentially connected to generate the characteristic data, but the present disclosure is not limited thereto, and the advertisement characteristic data and the user characteristic data may be sequentially connected to generate the characteristic data.
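A minimal sketch of the splicing step follows. The function name, the `user_first` flag, and the feature values are hypothetical; the only point illustrated is that either ordering produces a single feature vector. Whichever order is chosen would presumably need to stay the same between training and prediction.

```python
def splice(user_features, ad_features, user_first=True):
    """Concatenate the two feature vectors into one, in the chosen order."""
    if user_first:
        return list(user_features) + list(ad_features)
    return list(ad_features) + list(user_features)


user = [0.2, 1.0]      # hypothetical user feature data
ad = [0.0, 1.0, 3.0]   # hypothetical advertisement feature data

print(splice(user, ad))                    # [0.2, 1.0, 0.0, 1.0, 3.0]
print(splice(user, ad, user_first=False))  # [0.0, 1.0, 3.0, 0.2, 1.0]
```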

According to an embodiment of the present disclosure, the sample data set further includes tag information.

According to an embodiment of the present disclosure, the tag information of the sample data set may be, for example, tag information of a feature data pair composed of user feature data and advertisement feature data.

According to an embodiment of the present disclosure, the tag information may represent a probability value expected for a user corresponding to the user characteristic data to click on an advertisement corresponding to the advertisement characteristic data.

According to an embodiment of the present disclosure, for example, the training sample data set may be $Y = \{A_1(a_1, b_1): 75\% \mid A_2(a_2, b_2): 90\%\}$, where $A_1$ and $A_2$ may represent sample data groups, $a_1$ and $a_2$ may represent user feature data, and $b_1$ and $b_2$ may represent advertisement feature data. $A_1(a_1, b_1): 75\%$ may represent that the expected probability that the user corresponding to $a_1$ clicks the advertisement corresponding to $b_1$ is 75%.
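One possible in-memory representation of such labelled sample data is sketched below. The class name, field names, and feature values are hypothetical; only the two labels, 75% and 90%, come from the example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class SampleDataGroup:
    """One (user, advertisement) pair with its expected click probability."""
    user_features: Sequence[float]
    ad_features: Sequence[float]
    label: float  # expected click probability, in [0, 1]

    def __post_init__(self):
        if not 0.0 <= self.label <= 1.0:
            raise ValueError("label must be a probability in [0, 1]")


# Y = {A1(a1, b1): 75% | A2(a2, b2): 90%}, with made-up feature values.
Y = [
    SampleDataGroup(user_features=(0.2, 1.0), ad_features=(0.0, 1.0, 3.0), label=0.75),
    SampleDataGroup(user_features=(0.7, 0.0), ad_features=(1.0, 0.0, 2.0), label=0.90),
]

for group in Y:
    print(group.label)
```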

FIG. 5 schematically illustrates a flowchart for training a predictive network using effective feature data to obtain a trained advertisement click-through rate prediction model according to an embodiment of the present disclosure.

As shown in fig. 5, the training of the prediction network by using the effective feature data to obtain the trained advertisement click-through rate prediction model according to the embodiment includes operations S501 to S502.

In operation S501, the effective feature data is input into the prediction layer, and a prediction result is output, where the prediction result represents a probability that a sample user clicks a sample advertisement, the sample user includes a user corresponding to the user feature data, and the sample advertisement includes an advertisement corresponding to the advertisement feature data.

In operation S502, the network parameters of the prediction network and the effective feature extraction network are iteratively adjusted according to the prediction result, with the goal of making the prediction result approach the label information, until the prediction network and the effective feature extraction network converge, so as to obtain the trained advertisement click-through rate prediction model.

According to an embodiment of the present disclosure, for example, the sample data set may be $Y = \{A_1(a_1, b_1): 75\%\}$. After the feature extraction network generates effective feature data from the sample data set, the effective feature data is input into the prediction layer, and the prediction result output by the prediction layer may be, for example, 60%. Because the tag information of $A_1$ is 75%, the prediction result output by the prediction layer differs from, and is smaller than, the tag information. The network parameters of the prediction network and the effective feature extraction network can therefore be adjusted in the direction that increases the prediction result output by the prediction layer.
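The direction of adjustment in this example can be illustrated with a single gradient step. This is a deliberately reduced sketch: it updates one sigmoid unit rather than the full prediction network and effective feature extraction network, and the cross-entropy loss, the learning rate, and the initial weight are assumptions that the disclosure does not specify. The initial weight is chosen so that the starting prediction is 60%.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(weights, bias, features):
    """Predicted click probability of a single sigmoid unit."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


def sgd_step(weights, bias, features, label, lr):
    """One gradient step on cross-entropy loss; d(loss)/d(logit) = p - label."""
    error = forward(weights, bias, features) - label
    new_weights = [w - lr * error * x for w, x in zip(weights, features)]
    new_bias = bias - lr * error
    return new_weights, new_bias


features = [1.0]
weights = [math.log(1.5)]  # chosen so that the initial prediction is 0.60
bias = 0.0
label = 0.75               # tag information of A1

p_before = forward(weights, bias, features)
weights, bias = sgd_step(weights, bias, features, label, lr=1.0)
p_after = forward(weights, bias, features)

print(p_before < p_after < label)  # True
```

Because the prediction is below the label, the error term is negative and the update raises the logit, so the prediction moves from 60% toward 75%.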

According to the embodiment of the disclosure, the feature combination network comprises a feature combination network constructed based on a gradient boosting decision tree algorithm.

According to an embodiment of the present disclosure, the feature combination network may include a pre-trained gradient boosting decision tree model.
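As a rough illustration of how tree-based feature combination can yield a code such as 01001, the sketch below routes one sample through two hand-written decision trees and one-hot encodes the leaf reached in each. The trees, thresholds, and sample values are hypothetical stand-ins for a trained gradient boosting decision tree model, and treating each leaf as one node of a sub-network is an interpretation, not something the disclosure states.

```python
def leaf_index(tree, sample):
    """Route a sample through a small decision tree; return the leaf index reached."""
    node = tree
    while isinstance(node, dict):  # internal nodes are dicts, leaves are ints
        go_left = sample[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node


def one_hot(index, size):
    """Vector of zeros with a single 1 at the given position."""
    return [1 if i == index else 0 for i in range(size)]


# Hypothetical trees: the first has three leaves, the second has two.
tree1 = {"feature": 0, "threshold": 0.5,
         "left": 0,
         "right": {"feature": 1, "threshold": 2.0, "left": 1, "right": 2}}
tree2 = {"feature": 2, "threshold": 10.0, "left": 0, "right": 1}
trees = [(tree1, 3), (tree2, 2)]

sample = [0.9, 1.0, 25.0]  # hypothetical spliced feature data

encoded = []
for tree, n_leaves in trees:
    encoded.extend(one_hot(leaf_index(tree, sample), n_leaves))

print(encoded)  # [0, 1, 0, 0, 1]
```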

According to an embodiment of the present disclosure, the effective feature extraction network includes an effective feature extraction network constructed based on a convolutional neural network.

According to an embodiment of the present disclosure, the user characteristic data comprises one or more of: user location information, user basic information, user equipment information.

According to embodiments of the present disclosure, the user device information may include, for example, the make and model of the electronic device used by the user.

According to an embodiment of the present disclosure, the user basic information may include, for example, the age, educational background, and sex of the user.

It should be noted that, in the technical solution of the present disclosure, the acquisition, storage, and application of the personal information of the users involved all comply with the relevant laws and regulations, necessary security measures are taken, and public order and good morals are not violated.

According to an embodiment of the present disclosure, the advertisement characteristic data includes one or more of: advertisement type, advertiser name, advertisement height, advertisement width, and consumer group corresponding to the advertisement.

According to embodiments of the present disclosure, the advertisement height and advertisement width may be, for example, the height and width of the advertisement when displayed on an electronic device used by a user.

FIG. 6 schematically shows a flow chart of an advertisement click-through rate prediction method according to an embodiment of the present disclosure.

As shown in fig. 6, the advertisement click-through rate prediction method of this embodiment includes operations S601 to S602.

In operation S601, a to-be-detected data set is obtained, where the to-be-detected data set includes a to-be-detected data group, and the data group includes target advertisement feature data and target user feature data.

In operation S602, the data set to be tested is input into the advertisement click-through rate prediction model, and a prediction result is output, where the prediction result represents a probability that the target user clicks the target advertisement, and the advertisement click-through rate prediction model is obtained by using the training method of the advertisement click-through rate prediction model provided in the embodiments of the present disclosure.

Based on the training method of the advertisement click rate prediction model, the disclosure also provides a training device of the advertisement click rate prediction model. The apparatus will be described in detail below with reference to fig. 7.

FIG. 7 is a block diagram schematically illustrating a training apparatus for an advertisement click-through rate prediction model according to an embodiment of the present disclosure.

As shown in fig. 7, the training apparatus 700 of the advertisement click-through rate prediction model of this embodiment includes a first obtaining module 701, an input module 702, and a training module 703.

The first obtaining module 701 is configured to obtain a training sample data set, where the training sample data set includes a plurality of sample data groups, and each sample data group includes user characteristic data and advertisement characteristic data. In an embodiment, the first obtaining module 701 may be configured to perform the operation S201 described above, which is not described herein again.

The input module 702 is configured to input a sample data set in a training sample data set into a feature extraction network, and output valid feature data, where the feature extraction network includes a feature combination network and a valid feature extraction network that are sequentially cascaded. In an embodiment, the input module 702 may be configured to perform the operation S202 described above, which is not described herein again.

The training module 703 is configured to train the prediction network by using the effective feature data to obtain a trained advertisement click rate prediction model. In an embodiment, the training module 703 may be configured to perform the operation S203 described above, which is not described herein again.

According to an embodiment of the present disclosure, the feature combination network includes N feature combination subnetworks.

According to an embodiment of the present disclosure, the input module 702 includes a first input unit, a generation unit, and an output unit.

The first input unit is configured to input the training sample data set into the N feature combination sub-networks, and the N feature combination sub-networks respectively output first feature values.

The generating unit is configured to generate a second feature value by using a one-hot encoding algorithm according to the N first feature values.

The output unit is configured to input the second feature value into the effective feature extraction network and output the effective feature data.

According to an embodiment of the present disclosure, the effective feature extraction network includes a convolutional layer and a pooling layer that are cascaded in sequence.

According to an embodiment of the present disclosure, an output unit includes a first input subunit and an output subunit.

The first input subunit is configured to input the second feature value into the convolutional layer and output the first feature data.

The output subunit is configured to input the first feature data into the pooling layer and output the effective feature data.

According to an embodiment of the present disclosure, the first input unit includes a stitching subunit.

The splicing subunit is configured to splice the user feature data and the advertisement feature data to obtain feature data, and to input the feature data into the N feature combination sub-networks.

According to an embodiment of the present disclosure, the sample data set further includes tag information.

According to an embodiment of the present disclosure, the training module 703 comprises a second input unit and an adjustment unit.

The second input unit is configured to input the effective feature data into the prediction layer and output a prediction result, where the prediction result represents the probability that a sample user clicks a sample advertisement, the sample user includes a user corresponding to the user feature data, and the sample advertisement includes an advertisement corresponding to the advertisement feature data.

The adjusting unit is configured to iteratively adjust the network parameters of the prediction network and the effective feature extraction network according to the prediction result, with the goal of making the prediction result approach the label information, until the prediction network and the effective feature extraction network converge, so as to obtain the trained advertisement click-through rate prediction model.

According to the embodiment of the disclosure, the feature combination network comprises a feature combination network constructed based on a gradient boosting decision tree algorithm;

the effective feature extraction network comprises an effective feature extraction network constructed based on a convolutional neural network.

According to an embodiment of the present disclosure, the user characteristic data comprises one or more of: user location information, user basic information, user equipment information.

According to an embodiment of the present disclosure, the advertisement characteristic data includes one or more of: advertisement type, advertiser name, advertisement height, advertisement width, and consumer group corresponding to the advertisement.

Based on the advertisement click-through rate prediction method, the disclosure also provides an advertisement click-through rate prediction device. The apparatus will be described in detail below with reference to fig. 8.

FIG. 8 is a block diagram schematically illustrating an advertisement click-through rate prediction apparatus according to an embodiment of the present disclosure.

As shown in fig. 8, the advertisement click-through rate prediction apparatus 800 of this embodiment includes a second obtaining module 801 and a prediction module 802.

The second obtaining module 801 is configured to obtain a data set to be tested, where the data set to be tested includes a data group to be tested, and the data group includes target advertisement feature data and target user feature data. In an embodiment, the second obtaining module 801 may be configured to perform the operation S601 described above, which is not described herein again.

The prediction module 802 is configured to input the data set to be tested into an advertisement click-through rate prediction model, and output a prediction result, where the prediction result represents a probability that a target user clicks a target advertisement, and the advertisement click-through rate prediction model is obtained by using the training method of the advertisement click-through rate prediction model provided in the embodiments of the present disclosure. In an embodiment, the prediction module 802 may be configured to perform the operation S602 described above, which is not described herein again.

According to the embodiment of the present disclosure, any plurality of the first obtaining module 701, the input module 702, the training module 703, the second obtaining module 801, and the prediction module 802 may be combined and implemented in one module, or any one of them may be split into a plurality of modules. Alternatively, at least part of the functionality of one or more of these modules may be combined with at least part of the functionality of the other modules and implemented in one module. According to an embodiment of the present disclosure, at least one of the first obtaining module 701, the input module 702, the training module 703, the second obtaining module 801, and the predicting module 802 may be implemented at least in part as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, an Application Specific Integrated Circuit (ASIC), or may be implemented by hardware or firmware in any other reasonable manner of integrating or packaging a circuit, or implemented by any one of three implementations of software, hardware, and firmware, or an appropriate combination of any of them. Alternatively, at least one of the first obtaining module 701, the input module 702, the training module 703, the second obtaining module 801 and the prediction module 802 may be at least partially implemented as a computer program module, which when executed may perform the corresponding functions.

FIG. 9 schematically illustrates a block diagram of an electronic device adapted to implement a training method of an advertisement click-through rate prediction model, an advertisement click-through rate prediction method, according to an embodiment of the present disclosure.

As shown in fig. 9, an electronic apparatus 900 according to an embodiment of the present disclosure includes a processor 901 which can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM)902 or a program loaded from a storage portion 908 into a Random Access Memory (RAM) 903. Processor 901 may comprise, for example, a general purpose microprocessor (e.g., a CPU), an instruction set processor and/or associated chipset, and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), among others. The processor 901 may also include on-board memory for caching purposes. The processor 901 may comprise a single processing unit or a plurality of processing units for performing the different actions of the method flows according to embodiments of the present disclosure.

In the RAM 903, various programs and data necessary for the operation of the electronic apparatus 900 are stored. The processor 901, the ROM 902, and the RAM 903 are connected to each other through a bus 904. The processor 901 performs various operations of the method flows according to the embodiments of the present disclosure by executing programs in the ROM 902 and/or the RAM 903. Note that the programs may also be stored in one or more memories other than the ROM 902 and the RAM 903. The processor 901 may also perform various operations of the method flows according to embodiments of the present disclosure by executing programs stored in the one or more memories.

Electronic device 900 may also include input/output (I/O) interface 905, input/output (I/O) interface 905 also connected to bus 904, according to an embodiment of the present disclosure. The electronic device 900 may also include one or more of the following components connected to the I/O interface 905: an input portion 906 including a keyboard, a mouse, and the like; an output section 907 including components such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker; a storage portion 908 including a hard disk and the like; and a communication section 909 including a network interface card such as a LAN card, a modem, or the like. The communication section 909 performs communication processing via a network such as the internet. The drive 910 is also connected to the I/O interface 905 as necessary. A removable medium 911 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 910 as necessary, so that a computer program read out therefrom is mounted into the storage section 908 as necessary.

The present disclosure also provides a computer-readable storage medium, which may be contained in the apparatus/device/system described in the above embodiments; or may exist separately and not be assembled into the device/apparatus/system. The computer-readable storage medium carries one or more programs which, when executed, implement the method according to an embodiment of the disclosure.

According to embodiments of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium, which may include, for example but is not limited to: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. For example, according to embodiments of the present disclosure, a computer-readable storage medium may include the ROM 902 and/or the RAM 903 described above and/or one or more memories other than the ROM 902 and the RAM 903.

Embodiments of the present disclosure also include a computer program product comprising a computer program containing program code for performing the method illustrated in the flow chart. When the computer program product runs in a computer system, the program code is used for causing the computer system to realize the training method and the advertisement click rate prediction method of the advertisement click rate prediction model provided by the embodiment of the disclosure.

The computer program performs the above-described functions defined in the system/apparatus of the embodiments of the present disclosure when executed by the processor 901. The systems, apparatuses, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the present disclosure.

In one embodiment, the computer program may be hosted on a tangible storage medium such as an optical storage device, a magnetic storage device, or the like. In another embodiment, the computer program may also be transmitted, distributed in the form of a signal on a network medium, and downloaded and installed through the communication section 909 and/or installed from the removable medium 911. The computer program containing program code may be transmitted using any suitable network medium, including but not limited to: wireless, wired, etc., or any suitable combination of the foregoing.

In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 909, and/or installed from the removable medium 911. The computer program, when executed by the processor 901, performs the above-described functions defined in the system of the embodiment of the present disclosure. The systems, devices, apparatuses, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the present disclosure.

In accordance with embodiments of the present disclosure, program code for executing computer programs provided by embodiments of the present disclosure may be written in any combination of one or more programming languages, and in particular, these computer programs may be implemented using high level procedural and/or object oriented programming languages, and/or assembly/machine languages. The programming languages include, but are not limited to, Java, C++, Python, the "C" language, or the like. The program code may execute entirely on the user computing device, partly on the user device, partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., through the internet using an internet service provider).

The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

Those skilled in the art will appreciate that the features recited in the various embodiments and/or claims of the present disclosure can be combined and/or integrated in various ways, even if such combinations or integrations are not expressly recited in the present disclosure. In particular, such combinations and/or integrations may be made without departing from the spirit or teaching of the present disclosure. All such combinations and/or integrations fall within the scope of the present disclosure.

The embodiments of the present disclosure have been described above. However, these examples are for illustrative purposes only and are not intended to limit the scope of the present disclosure. Although the embodiments are described separately above, this does not mean that the measures in the embodiments cannot be used in advantageous combination. The scope of the disclosure is defined by the appended claims and equivalents thereof. Various alternatives and modifications can be devised by those skilled in the art without departing from the scope of the present disclosure, and such alternatives and modifications are intended to be within the scope of the present disclosure.
