Cross-border e-commerce recommendation method based on heterogeneous graph expression learning

Document No. 9123. Published 2021-09-17.

1. A cross-border e-commerce recommendation method based on heterogeneous graph expression learning, characterized by comprising the following steps:

step 1): extracting raw data of cross-border e-commerce users to obtain order data and product description data; wherein the order data is expressed as a 3-tuple <User_ID, Item_ID, Quantity>, where User_ID is the user identifier, Item_ID is the unique identifier of a product purchased by the user, and Quantity is the cumulative quantity of that product purchased by the user; the product description data is expressed as <Item_ID, Title, Price>, where Title is the title of the product and Price is its price; removing noisy and missing records from the order data to obtain the final "user-product" purchase matrix M, and removing noisy and missing records from the product description data to obtain the product text description data D; proceeding to step 2);

step 2): mining the latent semantic topics of the products based on the product text description data D, and using the topic information to identify the interest preferences of users; generalizing every product with a latent semantic topic model to obtain the topics of all e-commerce products and, finally, a "topic-product" matrix T; proceeding to step 3);

step 3): constructing a cross-border e-commerce "user-product-topic" tripartite graph based on the "user-product" purchase matrix M and the "topic-product" matrix T:

wherever an element of the "user-product" purchase matrix M is non-empty, an edge is generated between the corresponding user u and product i, and wherever an element of the "topic-product" matrix T is non-empty, an edge is generated between the corresponding topic t and product i; traversing the elements of M and T in this way constructs the "user-product-topic" tripartite graph, denoted G = (V, R), where V is the node set and R is the edge set of the tripartite graph; proceeding to step 4);

step 4): dividing the constructed cross-border e-commerce "user-product-topic" tripartite graph G into a Training Set and a Test Set, and establishing HNGR; in the training stage, inputting the Training Set into HNGR and, adopting the information propagation architecture of a conventional graph neural network, capturing collaborative filtering signals along the structure of the "user-product-topic" tripartite graph, thereby obtaining the representation vector r_u of the user and the representation vector r_i of the product, and generating a recommendation result through an activation function; in the optimization stage, obtaining the optimal parameter configuration of HNGR with an Adam optimizer and saving the trained HNGR; proceeding to step 5);

step 5): inputting the Test Set into the trained HNGR and computing, for each user to be recommended in the Test Set, a personalized e-commerce product recommendation list, thereby realizing cross-border e-commerce product recommendation.

2. The cross-border e-commerce recommendation method based on heterogeneous graph expression learning of claim 1, wherein: in step 1), the cross-border e-commerce "user-product" purchase matrix M is first analyzed quantitatively from three angles: the distribution of the number of product types purchased per user, the distribution of user order frequency, and the distribution of product sales.

3. The cross-border e-commerce recommendation method based on heterogeneous graph expression learning of claim 2, wherein: in step 2), the topic probability distribution of any product i in the product text description data D is obtained by generalization with the latent semantic topic model and denoted θ_i = {θ_{i,k}}, k = 1, 2, …, K, where K is the number of topics obtained by generalization and k is the topic index; the topic corresponding to the largest probability value in θ_i is selected as the final topic of the product, which is abstracted into a function

where t_k denotes the topic to which product i is generalized; finally, the "topic-product" matrix T is obtained.

4. The cross-border e-commerce recommendation method based on heterogeneous graph expression learning of claim 3, wherein: in step 4), the constructed cross-border e-commerce "user-product-topic" tripartite graph G is divided into a Training Set and a Test Set in a ratio of 4:1 for training and testing HNGR.

5. The cross-border e-commerce recommendation method based on heterogeneous graph expression learning of claim 4, wherein: in step 4), HNGR adopts the information propagation architecture of the graph neural network to capture collaborative filtering signals along the structure of the cross-border e-commerce "user-product-topic" tripartite graph, thereby obtaining the representation vector r_u of the user and the representation vector r_i of the product; the representation vector r_u of user u is obtained through the following specific steps:

1) information propagation: in a generic single-layer GNN, for the conventional bipartite graph constructed from the "user-product" purchase matrix M, any edge-connected "user-product" pair is denoted (u, i), meaning that user u has a purchase record for product i, and the information propagated from product i to user u is denoted m_{u←i}:

m_{u←i} = f(x_i, x_u, c_{u,i})

here, f(·) is the encoding function of the information; x_i and x_u are the representation vectors of product i and user u respectively, where x_i is obtained by One-Hot encoding and x_u is obtained from a trained BERT model; c_{u,i} is a decay factor, expressed by a regularization term, that controls propagation along any edge (u, i); f(·) is implemented as:

where N_u denotes the number of products connected by an edge to user u; W_1, W_2 and W_3 are trainable weight matrices in the GNN, used to extract useful information during propagation; and the concatenation symbol denotes vector concatenation; the above formula simplifies to:

similarly, for any edge-connected pair (u, i) in the "user-product-topic" tripartite graph, the information propagated from product i to user u is denoted m_{u←i}:

where z ranges over all products belonging to the same topic as product i, the accompanying count is the number of products contained in the topic to which product i belongs, and W'_1, W'_2 and W'_3 are trainable weight matrices in the GNN;

2) information aggregation: building on information propagation, the information propagated from all neighbor nodes of user u is aggregated to obtain the representation vector of user u; the neighbor nodes of user u comprise the neighbor nodes in the conventional bipartite graph and the neighbor nodes reached through the "user-product-topic" tripartite graph G; the information aggregation function h_u is defined as:

where σ(·) is the activation function, chosen as ReLU(·) = max(0, ·);

to obtain the final representation vector of user u, the vector h_u is transformed as follows:

r_u = σ(W_u h_u + b_u),

where W_u and b_u denote a trainable weight matrix and bias vector respectively, and r_u denotes the representation vector of user u learned by the embedding propagation layer of the GNN; ReLU is again used as the activation function.

6. The cross-border e-commerce recommendation method based on heterogeneous graph expression learning of claim 5, wherein: in step 4), the representation vector r_i of product i is obtained in a manner similar to the calculation of the representation vector r_u of user u.

7. The cross-border e-commerce recommendation method based on heterogeneous graph expression learning of claim 6, wherein: in step 4), the representation vector r_u of user u and the representation vector r_i of product i are used to predict the interaction score of user u for product i, defined as:

where W_j and b_j denote the trainable weight matrices and bias vectors in the MLP, and l denotes the total number of layers of the MLP; σ(·) is the activation function, chosen as ReLU; the final output of the MLP is the interaction score of user u for product i; given this interaction score, an output function is applied to obtain the output of the model, namely the probability that user u purchases product i.

In the training stage, with respect to recommending products to a user, the positive labels are the set of products actually purchased by the user, i.e. pairs for which an interaction exists, denoted Y+; the negative labels are obtained by removing the positive labels from the product set I and applying log-uniform sampling, i.e. pairs for which no interaction exists, denoted Y-; HNGR adopts a loss function based on binary cross-entropy, that is, the loss between the predicted purchase probability and the ground truth is defined as:

where y_{u,i} is the true purchase label of product i for user u; specifically, y_{u,i} = 1 if (u, i) ∈ Y+, and y_{u,i} = 0 otherwise.

Background

Mature recommendation algorithms are widely applied in traditional shopping scenarios. The three most classical approaches are collaborative filtering, matrix factorization and content-based recommendation. In cross-border e-commerce, however, product information is varied and the product categories are complex, the "user-item" matrix is extremely sparse and the cold-start problem is prominent, so these three kinds of model struggle to work well. In addition, improved recommendation models based on collaborative filtering or matrix factorization consider only the users' explicit and implicit feedback on products; they ignore the graph structure formed by users and items and the latent topic associations between products, so their recommendation performance hardly meets the needs of platforms and users.

Disclosure of Invention

The invention aims to provide a cross-border e-commerce recommendation method based on heterogeneous graph expression learning. Specifically, a real cross-border e-commerce data set is analyzed quantitatively, the topic probability distribution of each cross-border e-commerce product is obtained through a latent semantic topic model (Latent Dirichlet Allocation, LDA), and the topic with the largest probability value is selected as the final topic of the product. A "user-product-topic" tripartite graph is then constructed, and a Heterogeneous Neural Graph Recommendation (HNGR) model is designed for the users and items connected by high-order edge relations in that graph. HNGR performs embedding propagation learning for users and products respectively, comprising information propagation and information aggregation, to obtain high-quality user and product representation vectors. The "user-product" interaction is modeled with a Multi-Layer Perceptron (MLP), the interaction score of a user for a product is predicted on that basis, and finally an output function yields the output of the model (i.e., the probability that user u purchases each product in the recommendation candidate set).

The technical solution for realizing the purpose of the invention is as follows: a cross-border e-commerce recommendation method based on heterogeneous graph expression learning comprises the following steps:

Step 1): extracting raw data of cross-border e-commerce users to obtain order data and product description data; wherein the order data is expressed as a 3-tuple <User_ID, Item_ID, Quantity>, where User_ID is the user identifier, Item_ID is the unique identifier of a product purchased by the user, and Quantity is the cumulative quantity of that product purchased by the user; the product description data is expressed as <Item_ID, Title, Price>, where Title is the title of the product and Price is its price; removing noisy and missing records from the order data to obtain the final "user-product" purchase matrix M, and removing noisy and missing records from the product description data to obtain the product text description data D; proceeding to step 2).

Step 2): mining the latent semantic topics of the products based on the product text description data D, and using the topic information to identify the interest preferences of users; generalizing every product with a latent semantic topic model to obtain the topics of all e-commerce products and, finally, a "topic-product" matrix T; proceeding to step 3).

Step 3): constructing a cross-border e-commerce "user-product-topic" tripartite graph based on the "user-product" purchase matrix M and the "topic-product" matrix T:

wherever an element of the "user-product" purchase matrix M is non-empty, an edge is generated between the corresponding user u and product i, and wherever an element of the "topic-product" matrix T is non-empty, an edge is generated between the corresponding topic t and product i; traversing the elements of M and T in this way constructs the "user-product-topic" tripartite graph, denoted G = (V, R), where V is the node set and R is the edge set of the tripartite graph; proceeding to step 4).

Step 4): dividing the constructed cross-border e-commerce "user-product-topic" tripartite graph G into a Training Set and a Test Set, and establishing HNGR; in the training stage, inputting the Training Set into HNGR and, adopting the information propagation architecture of a conventional graph neural network, capturing collaborative filtering signals along the structure of the "user-product-topic" tripartite graph, thereby obtaining the representation vector r_u of the user and the representation vector r_i of the product, and generating a recommendation result through an activation function; in the optimization stage, obtaining the optimal parameter configuration of HNGR with an Adam optimizer and saving the trained HNGR; proceeding to step 5).

Step 5): inputting the Test Set into the trained HNGR and computing, for each user to be recommended in the Test Set, a personalized e-commerce product recommendation list, thereby realizing cross-border e-commerce product recommendation.

Compared with the prior art, the invention has the remarkable advantages that:

(1) The invention provides a cross-border e-commerce recommendation method based on heterogeneous graph expression learning, which is used for personalized product recommendation for users of cross-border e-commerce platforms.

(2) The method performs representation learning on the complex interaction information between products and users. Using the latent semantic topic model as a bridge, the aggregation layer aggregates more user and product neighbor nodes and thus obtains richer information, yielding high-quality user and product representation vectors. The "user-product" interaction is modeled with a Multi-Layer Perceptron (MLP), and the interaction score of a user for a product is predicted on that basis.

(3) The invention minimizes the loss function by means of an Adam optimizer, thereby tuning the parameters of the model to the optimal configuration. Compared with conventional recommendation methods, the method can effectively mine useful information in the negative samples and reduce the computational cost of model training, so that it can be trained more easily on large volumes of e-commerce interaction data.

Drawings

In order to illustrate the embodiments of the present invention or the prior-art solutions more clearly, the drawings used in their description are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without inventive effort.

FIG. 1 is a histogram of the distribution of the number of product types purchased per user.

FIG. 2 is a graph of the distribution of the number of purchases per user.

FIG. 3 is a graph of the product sales distribution.

FIG. 4 is a system framework diagram of the graph neural network recommendation model based on heterogeneous graph expression learning.

FIG. 5 is the "user-product-topic" tripartite graph.

FIG. 6 is a visualization of the latent topics.

FIG. 7 is a flowchart of the cross-border e-commerce recommendation method based on heterogeneous graph expression learning according to the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.

Further, in the description of the present invention, "a plurality" means two or more unless otherwise specified.

Because cross-border e-commerce product information is varied and the product categories are complex, the "user-item" matrix is extremely sparse and the cold-start problem is prominent, traditional collaborative filtering, matrix factorization and content-based recommendation are hard to apply effectively. Improved recommendation models based on collaborative filtering or matrix factorization consider only the users' explicit and implicit feedback on products and ignore the graph structure formed by users and items and the latent topic relations between products, so their recommendation performance hardly meets the needs of platforms and users. The invention provides a cross-border e-commerce recommendation method based on heterogeneous graph expression learning which, with reference to FIG. 7, comprises the following steps:

Step 1): extracting raw data of cross-border e-commerce users to obtain order data and product description data. The order data can be expressed as a 3-tuple <User_ID, Item_ID, Quantity>, where User_ID is the user identifier, Item_ID is the unique identifier of a product purchased by the user, and Quantity is the cumulative quantity of that product purchased by the user; the product description data is expressed as <Item_ID, Title, Price>, where Title is the title of the product and Price is its price. Noisy and missing records are removed from both kinds of data to obtain the final "user-product" purchase matrix M and the product text description data D. The characteristics of the cross-border e-commerce "user-product" purchase matrix M are then analyzed statistically: the matrix is extremely sparse, the user cold-start problem is severe, and the long-tail phenomenon in product sales is pronounced. These characteristics indicate the difficulties and challenges the invention faces in designing a cross-border e-commerce recommendation method. Proceed to step 2).

Step 2): mining the latent semantic topics of the products based on the cross-border e-commerce product text description data D, and using the topic information to identify the interest preferences of users; generalizing every product with a latent semantic topic model (Latent Dirichlet Allocation, LDA) to obtain the topics of all e-commerce products and, finally, a "topic-product" matrix T; proceeding to step 3).

Step 3): constructing a cross-border e-commerce "user-product-topic" tripartite graph based on the cross-border e-commerce "user-product" purchase matrix M and the "topic-product" matrix T:

wherever an element of the "user-product" purchase matrix is non-empty, an edge is generated between the corresponding user u and product i, and wherever an element of the "topic-product" matrix is non-empty, an edge is generated between the corresponding topic t and product i; traversing the elements of M and T in this way constructs the "user-product-topic" tripartite graph, denoted G = (V, R), where V is the node set and R is the edge set of the tripartite graph; the node set V is divided into 3 types, namely the user set U, the product set I and the topic set T; proceeding to step 4).

Step 4): establishing HNGR (Heterogeneous Neural Graph Recommendation), the cross-border e-commerce recommendation method based on heterogeneous graph expression learning, and dividing the constructed cross-border e-commerce "user-product-topic" tripartite graph G into a Training Set and a Test Set; in the training stage, inputting the Training Set into HNGR and, adopting the information propagation architecture of a conventional Graph Neural Network (GNN), capturing collaborative filtering signals along the structure of the "user-product-topic" tripartite graph, thereby obtaining the representation vector r_u of the user and the representation vector r_i of the product, and generating a recommendation result through an activation function; in the optimization stage, obtaining the optimal parameter configuration of HNGR with an Adam optimizer and saving the trained HNGR; proceeding to step 5).

Step 5): inputting the Test Set into the trained HNGR and computing, for each user to be recommended in the Test Set, a personalized e-commerce product recommendation list, thereby realizing cross-border e-commerce product recommendation. The method can accurately analyze the interest preferences of users and recommend cross-border e-commerce products, thereby improving the order conversion rate of the platform and the user experience. The method can also address the sparsity of the "user-product" matrix and the cold-start problem faced by traditional recommendation methods (such as collaborative filtering and matrix factorization).

The above steps will be described one by one with reference to the accompanying drawings.

The cross-border e-commerce product data set used in step 1) comes from a well-known cross-border e-commerce platform in China. The data fall into 2 main categories: order data and product description data. The order data can be expressed as a 3-tuple <User_ID, Item_ID, Quantity>, where User_ID is the user identifier, Item_ID is the unique identifier of a product purchased by the user, and Quantity is the cumulative quantity of that product purchased by the user; the product description data is expressed as <Item_ID, Title, Price>, where Title is the title of the product and Price is its price. The invention removes noisy and missing records from both kinds of data to obtain the final "user-product" purchase matrix M and the product text description data D.
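The preprocessing described above can be sketched as follows. This is a minimal illustration, not the implementation used in the invention: the function names, the toy records, and the cleaning rules (dropping records with a missing field, a non-positive quantity, or an empty title) are assumptions, since the source does not specify what counts as noise.

```python
from collections import defaultdict

def build_purchase_matrix(orders):
    """Aggregate <User_ID, Item_ID, Quantity> tuples into a sparse matrix M.

    M is stored as {user: {item: cumulative quantity}}.
    """
    M = defaultdict(dict)
    for user_id, item_id, quantity in orders:
        if user_id is None or item_id is None or quantity is None:
            continue  # missing data
        if quantity <= 0:
            continue  # assumed noise rule, not taken from the source
        M[user_id][item_id] = M[user_id].get(item_id, 0) + quantity
    return {u: dict(items) for u, items in M.items()}

def build_descriptions(products):
    """Keep <Item_ID, Title, Price> records that have a non-empty title (D)."""
    return {item_id: title.strip()
            for item_id, title, price in products
            if item_id is not None and title and title.strip()}

# Hypothetical toy records
orders = [("u1", "i1", 2), ("u1", "i1", 1), ("u2", "i2", 1),
          ("u3", None, 4), ("u2", "i3", -1)]
products = [("i1", "Stainless steel bottle", 9.9), ("i2", "", 3.0),
            ("i3", "LED desk lamp", 15.0)]

M = build_purchase_matrix(orders)
D = build_descriptions(products)
```

A dictionary of dictionaries is used here because the purchase matrix is extremely sparse, so storing only the non-empty cells keeps the sketch small.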

TABLE 1 characteristics of the Pre-processed purchase matrix

Table 1 describes the basic characteristics of the "user-product" purchase matrix M in the data set. First, the number of products is much smaller than the number of users. If a "user-item" purchase matrix is built from these data and a collaborative filtering algorithm is applied to it, only 1.27% of the entries of the matrix are non-zero, whereas the corresponding figure for the "user-item" rating matrix of the commonly used MovieLens 100K data set is 6.3%. The invention also examines the distribution of the number of different cross-border e-commerce products purchased by each user. As shown in FIG. 1, the distribution has a pronounced long tail: 24211 users (77.2%) purchased only 1 cross-border e-commerce product, and only 492 users (1.56%) purchased 5 or more. A traditional collaborative filtering algorithm is therefore difficult to run directly on the "user-item" purchase matrix.

FIG. 2 shows the distribution of the cumulative number of purchases per user. Users with only one purchase record account for as much as 64.8%, i.e. more than 60% of users are cold-start users, while users with more than three cumulative purchases account for only 16.8%. The user cold-start problem in the cross-border e-commerce data set is therefore severe. If a "user-item" purchase frequency matrix is constructed directly, matrix sparsity cannot be avoided and traditional collaborative filtering algorithms are hard to apply effectively.

FIG. 3 shows the distribution of cross-border e-commerce product sales. The distribution has a pronounced long tail, i.e. only a small fraction of products is purchased frequently: only 8 products (4.9%) sold more than 1 thousand units, while as many as 116 products (71.6%) sold fewer than 1 thousand. It is well known that recommending popular products is easy and trivial for a recommendation system, whereas recommending long-tail products increases the novelty of the recommendations and is challenging. How to design a new recommendation model that recommends more long-tail products matching user interest preferences is therefore a focus of cross-border e-commerce recommendation.

The above analysis examines the cross-border e-commerce "user-product" purchase matrix M quantitatively from three angles: the distribution of the number of product types purchased per user, the distribution of user order frequency, and the distribution of product sales. The results reveal the difficulties and challenges the invention faces in designing a cross-border e-commerce recommendation method: the "user-product" purchase matrix is extremely sparse, the user cold-start problem is severe, and the long-tail phenomenon in product sales is pronounced.
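The statistics behind this kind of analysis can be sketched as below, assuming M is stored as a hypothetical {user: {item: quantity}} dictionary. The toy data are invented, and order frequency is not covered because the 3-tuple only carries cumulative quantities.

```python
from collections import Counter

def matrix_density(M, n_items):
    """Fraction of non-empty cells in the user-product matrix."""
    filled = sum(len(items) for items in M.values())
    return filled / (len(M) * n_items)

def distinct_products_per_user(M):
    """Histogram: number of distinct products -> number of users."""
    return Counter(len(items) for items in M.values())

def sales_per_product(M):
    """Total quantity sold for each product."""
    sales = Counter()
    for items in M.values():
        for item_id, qty in items.items():
            sales[item_id] += qty
    return sales

# Hypothetical toy matrix
M = {"u1": {"i1": 3}, "u2": {"i1": 1, "i2": 1}, "u3": {"i1": 2}, "u4": {"i3": 1}}

density = matrix_density(M, n_items=3)
hist = distinct_products_per_user(M)
sales = sales_per_product(M)
single_share = hist[1] / len(M)  # share of users with exactly one distinct product
```

On real data, a small `density`, a large `single_share`, and a `sales` counter dominated by a few products would correspond to the sparsity, cold-start, and long-tail observations respectively.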

A cross-border e-commerce "user-product-topic" tripartite graph, denoted G = (V, R), is constructed from the cross-border e-commerce "user-product" purchase matrix M and the "topic-product" matrix T, where V is the node set and R is the edge set of the tripartite graph.
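A minimal sketch of this construction is shown below. The dictionary layouts of M and T and the typed-tuple node encoding are illustrative assumptions; the source only states that a non-empty matrix element yields an edge.

```python
def build_tripartite_graph(M, T):
    """Return (V, R): typed nodes and undirected edges of the tripartite graph.

    M: {user: {product: quantity}}, T: {topic: set of products}.
    """
    V, R = set(), set()
    for user, items in M.items():
        for item in items:  # non-empty cell of M -> user-product edge
            V.update({("user", user), ("product", item)})
            R.add((("user", user), ("product", item)))
    for topic, items in T.items():
        for item in items:  # non-empty cell of T -> topic-product edge
            V.update({("topic", topic), ("product", item)})
            R.add((("topic", topic), ("product", item)))
    return V, R

# Hypothetical toy inputs
M = {"u1": {"i1": 3}, "u2": {"i2": 1}}
T = {0: {"i1", "i3"}, 2: {"i2"}}

V, R = build_tripartite_graph(M, T)
```

Tagging each node with its type keeps user, product, and topic identifiers from colliding, and it makes the three node classes U, I and T easy to separate later.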

Any product i in the product text description data D is generalized by a latent semantic topic model (Latent Dirichlet Allocation, LDA) to obtain its topic probability distribution, denoted θ_i = {θ_{i,k}}, k = 1, 2, …, K, where K is the number of topics obtained by generalization and k is the topic index; the topic corresponding to the largest probability value in θ_i is selected as the final topic of the product, which is abstracted into a function

where t_k denotes the topic to which product i is generalized; finally, the "topic-product" matrix T is obtained.
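The selection rule above amounts to an argmax over θ_i. The sketch below assumes the topic distributions have already been produced by a topic model; the `theta` values are invented, topics are indexed from 0 for convenience, and ties are broken in favor of the lowest index (a choice the source does not specify).

```python
def assign_topics(theta):
    """Map each product to the index k of its most probable topic."""
    return {item: max(range(len(dist)), key=dist.__getitem__)
            for item, dist in theta.items()}

def topic_product_matrix(assignment, n_topics):
    """Build T as {topic index: set of products assigned to that topic}."""
    T = {k: set() for k in range(n_topics)}
    for item, k in assignment.items():
        T[k].add(item)
    return T

# Hypothetical topic distributions theta_i for K = 3 topics
theta = {"i1": [0.7, 0.2, 0.1], "i2": [0.1, 0.3, 0.6], "i3": [0.5, 0.4, 0.1]}

assignment = assign_topics(theta)
T = topic_product_matrix(assignment, n_topics=3)
```

Because every product receives exactly one topic, each product node has a single topic neighbor in the resulting graph.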

The constructed cross-border e-commerce "user-product-topic" tripartite graph is divided into a Training Set and a Test Set in a ratio of 4:1 for training and testing HNGR. In practice, the products in a user's interaction history often reveal that user's interest preferences, and the group of users who have interacted with the same product can be regarded as a feature of that product and can reflect the similarity between products. HNGR adopts the information propagation architecture of a Graph Neural Network (GNN) to capture collaborative filtering signals along the structure of the "user-product-topic" tripartite graph, thereby obtaining the representation vectors of users and products. The representation vector of a user u is obtained as follows:

1) Information propagation: in a generic single-layer GNN (Graph Neural Network), for the conventional bipartite graph constructed from the "user-product" purchase matrix M, any edge-connected "user-product" pair can be denoted (u, i), meaning that user u has a purchase record for product i, and the information propagated from product i to user u is denoted m_{u←i}:

m_{u←i} = f(x_i, x_u, c_{u,i}),

Here, f(·) is the encoding function of the information; x_i and x_u are the representation vectors of product i and user u respectively, where x_i is obtained by One-Hot encoding and x_u is obtained from a trained BERT model; c_{u,i} is a decay factor, expressed by a regularization term, that controls propagation along any edge (u, i); f(·) is implemented as:

where N_u denotes the number of products connected by an edge to user u; W_1, W_2 and W_3 are trainable weight matrices in the GNN, used to extract useful information during propagation; and the concatenation symbol denotes vector concatenation; the above formula simplifies to:

Similarly, for any edge-connected pair (u, i) in the "user-product-topic" tripartite graph, the information propagated from product i to user u is denoted m_{u←i}:

where z ranges over all products belonging to the same topic as product i, and the accompanying count is the number of products contained in the topic to which product i belongs. W'_1, W'_2 and W'_3 are trainable weight matrices in the GNN.

2) Information aggregation: building on information propagation, the information propagated from all neighbor nodes of user u is aggregated to obtain the representation vector of user u; the neighbor nodes of user u comprise the neighbor nodes in the conventional bipartite graph and the neighbor nodes reached through the topics of the "user-product-topic" tripartite graph G. The information aggregation function h_u is defined as:

where σ(·) is the activation function, chosen as ReLU(·) = max(0, ·).

To obtain the final representation vector of user u, the vector h_u is transformed as follows:

r_u = σ(W_u h_u + b_u),

where W_u and b_u denote a trainable weight matrix and bias vector respectively, and r_u denotes the representation vector of user u learned by the embedding propagation layer of the GNN; ReLU is again used as the activation function.
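The aggregation and transformation steps can be illustrated with a small pure-Python sketch. The transformation r_u = ReLU(W_u h_u + b_u) follows the formula above; the aggregation rule (ReLU of the element-wise sum of incoming messages) is only an assumption, because the defining formula for h_u is not reproduced in this text. All numeric values and helper names are hypothetical.

```python
def relu(v):
    """Element-wise ReLU(x) = max(0, x)."""
    return [max(0.0, x) for x in v]

def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def aggregate(messages):
    """Assumed aggregation: ReLU of the element-wise sum of all messages."""
    summed = [sum(parts) for parts in zip(*messages)]
    return relu(summed)

def transform(h_u, W_u, b_u):
    """r_u = ReLU(W_u h_u + b_u)."""
    return relu([z + b for z, b in zip(matvec(W_u, h_u), b_u)])

# Hypothetical messages m_{u<-i} from three neighbors of user u
messages = [[0.5, -1.0], [0.25, 0.5], [-0.25, 0.0]]
h_u = aggregate(messages)

W_u = [[1.0, 0.0], [-2.0, 1.0]]  # toy trainable weight matrix
b_u = [0.1, 0.5]                 # toy bias vector
r_u = transform(h_u, W_u, b_u)
```

In a real model the messages would come from both bipartite neighbors and topic-mediated neighbors, and W_u and b_u would be learned rather than fixed.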

The representation vector r_i of product i is obtained in a manner similar to the calculation of the representation vector r_u of user u. In summary, tripartite graph representation learning based on a graph neural network uses the embedding propagation layer to explicitly exploit connection information to relate user and item representations; at the same time, with the latent topics as a bridge, the aggregation layer aggregates more user and product neighbor nodes to obtain richer information, thereby yielding high-quality user and product representation vectors.

Using the expression vector r of user uuAnd the expression vector r of product iiPredicting the interaction score of the user u to the product i, and defining the interaction score as follows:

where W_j and b_j denote the trainable weight matrix and bias vector of the j-th layer of the MLP, and l denotes the total number of layers of the MLP network; σ(·) is the activation function, for which ReLU is selected. The final output of the MLP is the interaction score of user u for product i. Given this interaction score, an output function is applied to obtain the output of the model, i.e., the probability that user u purchases product i.
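The scoring step can be sketched as follows. Several details are assumptions made only for illustration: r_u and r_i are combined by concatenation, the last MLP layer is left linear so that it emits a scalar score, and a sigmoid is used as the output function that turns the score into a probability. All names and values are hypothetical.

```python
import math


def relu(vec):
    return [max(0.0, x) for x in vec]


def affine(weight, vec, bias):
    """weight @ vec + bias, with weight given as a list of rows."""
    return [sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(weight, bias)]


def sigmoid(x):
    """Numerically stable logistic function (assumed output function)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def interaction_probability(r_u, r_i, layers):
    """layers is a list of (W_j, b_j); hidden layers use ReLU."""
    x = r_u + r_i                      # concatenation (assumption)
    for W, b in layers[:-1]:
        x = relu(affine(W, x, b))
    W_out, b_out = layers[-1]
    score = affine(W_out, x, b_out)[0]  # scalar interaction score
    return score, sigmoid(score)


# Toy 2-layer MLP with hypothetical weights.
layers = [
    ([[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, -1.0]], [0.0, 0.0]),
    ([[2.0, 1.0]], [-3.0]),
]
score, prob = interaction_probability([1.0, 0.0], [0.5, 2.0], layers)
```

With these toy weights the hidden layer yields [1.5, 0.0], the score is 0.0, and the resulting probability is 0.5.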

In the training phase, when recommending products to a user, the positive labels are the set of products the user actually purchased, i.e., an interaction exists, and are denoted Y+. The negative labels are obtained by removing the positive labels from the product set I and then performing log-uniform sampling, i.e., no interaction exists, and are denoted Y-. HNGR adopts a loss function based on binary cross-entropy; the loss between the predicted purchase probability and the ground truth is as follows:

where y_{u,i} is the ground-truth label indicating whether product i was actually purchased by user u; specifically, y_{u,i} = 1 if (u, i) ∈ Y+, and y_{u,i} = 0 otherwise.
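A minimal sketch of this loss, assuming the standard summed binary cross-entropy form, is given below. Whether the loss is summed or averaged, the clipping constant `eps`, and the function name `bce_loss` are assumptions for illustration.

```python
import math


def bce_loss(predictions, positives, eps=1e-12):
    """Summed binary cross-entropy.

    predictions: dict mapping (user, product) -> predicted purchase probability
    positives:   set of (user, product) pairs in Y+; all other pairs in
                 `predictions` are treated as sampled negatives (Y-).
    """
    total = 0.0
    for pair, p in predictions.items():
        p = min(max(p, eps), 1.0 - eps)          # avoid log(0)
        y = 1.0 if pair in positives else 0.0    # y_{u,i}
        total -= y * math.log(p) + (1.0 - y) * math.log(1.0 - p)
    return total


# Toy example (hypothetical): one positive and one sampled negative.
predictions = {("u1", "i1"): 0.5, ("u1", "i2"): 0.5}
loss = bce_loss(predictions, positives={("u1", "i1")})
```

With both probabilities at 0.5, each pair contributes log 2, so the loss equals 2 log 2; confident correct predictions drive it toward zero.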

Here, the invention minimizes the loss function by means of an Adam optimizer, thereby tuning the parameters of the HNGR model to the optimal configuration. Compared with existing training schemes, the method can effectively mine useful information from the negative samples and also reduces the computational cost of model training. The HNGR model can therefore be trained more easily on massive cross-border e-commerce interaction data.
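The negative sampling used in training can be sketched as follows. The sketch assumes that candidate products are ordered by popularity rank and that rank k is drawn with probability (log(k + 2) - log(k + 1)) / log(N + 1), a common definition of a log-uniform (Zipfian) sampler; the ordering, sampling with replacement, and all names are assumptions for illustration.

```python
import math
import random


def log_uniform_prob(k, n):
    """Probability of drawing rank k (0-based) out of n ranks."""
    return (math.log(k + 2) - math.log(k + 1)) / math.log(n + 1)


def log_uniform_rank(n, rng):
    """Draw a rank in [0, n) by inverting the log-uniform CDF."""
    u = rng.random()                               # u in [0, 1)
    k = int(math.exp(u * math.log(n + 1))) - 1
    return min(k, n - 1)                           # guard against rounding


def sample_negatives(items_by_popularity, purchased, num_samples, rng):
    """Remove the user's positives from I, then sample with replacement."""
    candidates = [i for i in items_by_popularity if i not in purchased]
    if not candidates:
        return []
    n = len(candidates)
    return [candidates[log_uniform_rank(n, rng)] for _ in range(num_samples)]


rng = random.Random(0)                             # fixed seed for repeatability
negatives = sample_negatives(["a", "b", "c", "d"], {"b"}, 5, rng)
```

Lower ranks are drawn more often, so popular products that a user did not buy appear more frequently among the negatives than rarely sold ones.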

By inputting the Test Set into the trained HNGR, a personalized e-commerce product recommendation list is generated for each user to be recommended in the Test Set, thereby realizing cross-border e-commerce product recommendation. The method can accurately analyze the interest preferences of users and recommend cross-border e-commerce products, thereby improving the order conversion rate of the platform and the user experience. The method can also alleviate the sparsity of the "user-product" matrix and the cold-start problem faced by traditional recommendation methods (such as collaborative filtering and matrix factorization).

Fig. 1 illustrates the distribution of the number of cross-border e-commerce product categories purchased by users in the example data set. The chart shows that, across 64730 purchase records, the 24211 users who purchased from only one product category account for 77.2% of all users, while only 492 users (1.56%) purchased items from five or more categories. Traditional collaborative filtering algorithms are difficult to run directly on such a data matrix.

Fig. 2 shows the distribution of the 31357 users in the data set by number of purchases. As many as 64.8% of users have only one purchase record, i.e., over 60% of users are cold-start users, and users with more than three purchases account for only 16.8%. If the User-Item purchase frequency matrix is constructed directly, matrix sparsity cannot be avoided, and traditional collaborative filtering algorithms struggle to produce useful results.

Fig. 3 illustrates the distribution of product sales across the data set. The graph shows a pronounced long-tail phenomenon, i.e., only a small fraction of products are frequently purchased: only 8 products (4.9%) sold in excess of 1 million, while as many as 116 products (71.6%) sold below 1 thousand. Recommending popular goods is easy for a recommendation system but adds little value, whereas recommending long-tail items increases the novelty of the recommendations and is more challenging. Therefore, how to design a novel recommendation model that recommends more long-tail products matching the user's interest preferences is a focus of cross-border e-commerce recommendation.

Fig. 4 shows a framework diagram of the HNGR recommendation model. Data such as user purchase records are analyzed to obtain "user-product" interaction information, from which a three-part graph is constructed. Embedding propagation learning is then performed on the three-part graph, and finally the "user-product" interaction is modeled by a Multi-Layer Perceptron (MLP) to learn the nonlinear collaborative signals in the interaction process. HNGR comprises 4 sub-modules: a heterogeneous graph construction layer, an information transmission and aggregation layer, an interaction modeling layer, and a score prediction layer.

Fig. 5 illustrates a constructed "user-product-theme" three-part graph. Let G = (V, R) be the "user-product-theme" three-part graph constructed as shown on the left side of Fig. 4, where V and R are the sets of nodes and edges in the graph, respectively. The nodes in the graph are divided into 3 types: a user set U, a product set I and a theme set T.
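A minimal sketch of constructing G = (V, R) from the non-empty entries of the user-product purchase matrix M and the theme-product matrix T is shown below. The dictionary-based input format, the tagged node tuples, and the helper `extended_product_neighbors` are assumptions chosen for illustration.

```python
def build_three_part_graph(purchases, theme_products):
    """Return (V, R) for the user-product-theme graph.

    purchases:      dict user -> dict product -> quantity (entries of M)
    theme_products: dict theme -> set of products (non-empty entries of T)
    """
    nodes, edges = set(), set()
    for user, bought in purchases.items():
        for product, quantity in bought.items():
            if quantity:                           # non-empty entry of M
                u, i = ("user", user), ("product", product)
                nodes.update((u, i))
                edges.add((u, i))
    for theme, products in theme_products.items():
        for product in products:                   # non-empty entry of T
            t, i = ("theme", theme), ("product", product)
            nodes.update((t, i))
            edges.add((t, i))
    return nodes, edges


def extended_product_neighbors(user, purchases, theme_products):
    """Products bought directly, and extra products reached via a shared theme."""
    direct = {p for p, q in purchases.get(user, {}).items() if q}
    via_theme = set()
    for products in theme_products.values():
        if direct & set(products):
            via_theme |= set(products)
    return direct, via_theme - direct


purchases = {"u1": {"i1": 2}, "u2": {"i2": 1}}
theme_products = {"t1": {"i1", "i2"}, "t2": {"i3"}}
V, R = build_three_part_graph(purchases, theme_products)
direct, via_theme = extended_product_neighbors("u1", purchases, theme_products)
```

In this toy graph, user u1 bought only i1, but theme t1 also links u1 to i2, which illustrates how themes enlarge a sparse neighborhood.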

Fig. 6 shows the 100 words with the highest probability values for each of the 6 topics of the LDA topic model. The following phenomena can be observed. First, the products covered by the 6 topics are diverse and include milk powders, adult and infant health products, cosmetics and beverages. Second, each topic has its own main features; for example, topics 1, 2 and 3 mainly reflect users' purchases of milk powder and health care products, while topic 5 focuses on health care products and sports drinks. Third, the products within each topic are semantically related; for example, the milk powder brands within topic 1 include hui, Nestle and fond, and the origins include gang and Germany. These observations show that building the heterogeneous graph neural network with the LDA topic model on top of the traditional bipartite graph allows more latent association information to be propagated and helps identify the latent interest preferences of users.

The above description is only for the specific embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the appended claims.
