Group recommendation method based on a hybrid attention network
1. A group recommendation method based on a hybrid attention network, characterized in that the method uses a feature input layer, a feature representation layer, a feature crossing layer and a score prediction layer, and specifically comprises the following steps:
step 1, the feature input layer first acquires the historical interaction records of users and builds a user-item score matrix, then performs matrix factorization optimized with stochastic gradient descent to obtain the initial embedding features p_u and q_v of user u and item v, and inputs the two embedding features into the feature representation layer;
step 2, the feature representation layer first uses graph attention networks on the user-item interaction graph to aggregate the embedding features, obtained in step 1, of each user, each item and their neighbor nodes, and thereby forms the latent feature representations of users and items;
step 3, the feature representation layer applies the local and global attention units of a sequence attention network to the latent features, obtained in step 2, of the users in a group and of the items, so as to model the user-item and user-user interaction relationships, outputs a local representation and a global representation of the group latent feature vector, and then combines the two to generate the final group latent feature representation;
step 4, the feature representation layer inputs the latent features of groups, users and items obtained in steps 2 and 3 into the feature crossing layer;
step 5, the feature crossing layer inputs the concatenated group-item vectors or the concatenated user-item vectors from step 4 into a multi-layer perceptron with shared parameters, which performs high-order feature crossing and outputs the result to the final score prediction layer to obtain the predicted scores of groups for items and of users for items;
step 6, the group-item and user-item score predictions are jointly optimized to update the parameters of the model.
2. The group recommendation method based on the hybrid attention network as claimed in claim 1, wherein: in step 2, extracting the latent feature representations of users and items with the graph attention networks specifically includes the following steps:
S21, extracting the item latent feature with a graph attention network based on the user dimension: first calculating the attention weights of item v itself and of each user in N(v) with respect to item v, and then outputting the final item latent representation by weighted fusion; the fusion function is defined as follows:
wherein if j = v, then h_vj = q_v; if j ∈ N(v), then h_vj = p_j; α_vj is the attention weight, which is calculated using the following formula:
where σ is the sigmoid nonlinear activation function, [ , ] denotes the concatenation operation, and the remaining symbols are model parameters of the graph attention network based on the user dimension;
S22, extracting the user latent feature with a graph attention network based on the item dimension: first calculating the attention weights β_uj of user u itself and of each item in C(u) with respect to user u, and then outputting the final user latent representation by weighted fusion; the calculation formula is as follows:
wherein if j = u, then h_uj = p_u; if j ∈ C(u), then h_uj = q_j; the remaining symbols are parameters of the graph attention network based on the item dimension.
3. The group recommendation method based on the hybrid attention network as claimed in claim 1, wherein: in step 3, extracting the group latent feature representation with the sequence attention network specifically includes the following steps:
S31, first calculating, through the local attention unit, the attention weight of each user according to the interaction relationship between that user and the item to be predicted, and outputting the local representation E_local of the group latent feature by weighted fusion of the latent feature vectors of the users in the group, the extraction process being expressed as follows:
wherein the remaining symbols are the network parameters of the local attention unit, and α_ui is the attention weight of the i-th user;
S32, extracting, through the global attention unit, the attention weight of each user according to the interaction relationships among the users in the group, and outputting the global representation E_global of the group by weighted fusion of the latent feature vectors of the users in the group.
4. The group recommendation method based on the hybrid attention network as claimed in claim 2, wherein: in said step S32, the global representation E_global of the group latent feature is extracted as follows:
S321, mapping the latent feature vectors of the users in the group to two additional auxiliary spaces, and using the three feature spaces together to learn the similarity between the latent features of the users in the group, wherein the three feature spaces are expressed as follows:
Q = X W_Q
K = X W_K
V = X W_V
wherein X ∈ R^{t×d} is the feature matrix composed of the latent feature vectors of the t users in the group, and W_Q ∈ R^{d×d}, W_K ∈ R^{d×d} and W_V ∈ R^{d×d} are the parameter matrices that map the feature vectors into the respective spaces;
S322, after obtaining Q, K and V, calculating the attention weight β_{uj,ui} between each other user uj in the group and the designated user ui:
wherein Q_uj denotes the latent feature vector of user uj in Q, and K_ui denotes the latent feature vector of user ui in K;
S323, summing and normalizing the attention weights between the other users and the designated user ui to obtain the attention weight of that user within the group:
S324, performing weighted fusion over the users of the group using these attention weights, the global latent feature being expressed as follows:
wherein V_ui denotes the feature vector of user ui in V; the local representation and the global representation obtained above are then fused in proportion, and the final group latent feature representation is output as follows:
wherein ε is the proportionality coefficient.
5. The group recommendation method based on the hybrid attention network as claimed in claim 1, wherein: the feature crossing and score prediction in step 5 are implemented as follows: the individual user score prediction and the group score prediction share the parameters of the multi-layer perceptron in the feature crossing layer, and the final score prediction is output after the feature crossing layer, wherein the feature concatenation, feature crossing and score prediction for a group and an item are expressed as follows:
wherein the input vector is the concatenated feature of the group and the item, k denotes the number of hidden layers, and the output is the predicted score of group g for item v; the following objective function for group score prediction is defined:
where v is a positive sample observed in the group-item interaction records R_G, D_G is the set of triples each consisting of a group, a positive item sample and a negative item sample, Θ is a regularization parameter, and the objective function optimizes the parameters by maximizing the difference between the predictions for positive and negative samples;
the feature concatenation, feature crossing and score prediction for a user and an item are performed in the same way, as follows:
wherein the user score prediction and the group score prediction share parameters in the feature crossing layer, the output is the predicted score of user u for item v, and the following objective function for user score prediction is defined:
wherein D_U is the set of triples each consisting of a user, a positive item sample and a negative item sample.
6. The group recommendation method based on the hybrid attention network as claimed in claim 1, wherein: in step 6, the joint optimization is calculated as follows:
S61, first optimizing the loss function of individual score prediction on the user-item interaction data, learning the network parameters, and outputting the optimized user latent feature representations;
S62, then optimizing the loss function of group score prediction on the group-item interaction data, learning the parameters specific to the group prediction process, and fine-tuning the parameters shared by the two prediction processes;
S63, finally, iterating the two processes until the model as a whole converges.
Background
Traditional recommendation methods mainly serve individual users. With the rapid development of social networks in recent years, however, group activities have become increasingly frequent, and research on group recommendation has attracted growing attention.
Group recommendation refers to recommending, in an online system, items that match the interest preferences of a group of users. Most traditional group recommendation methods learn the interests (features) of groups and users, or the features of items, directly; that is, they model these features independently. They often neglect the various interaction relationships among groups, users and items, such as the interactions among users within a group and the interactions between users and items. As a result, the feature representations of users and items are not extracted sufficiently, the differences in influence weight among the users in a group are not taken into account, and a robust and adaptive group feature representation cannot be learned for the target item to be predicted, all of which ultimately harms the recommendation effect.
CN 110502704A, a group recommendation method and system based on an attention mechanism, preprocesses user data, discovers latent user groups with an improved density peak clustering method, and places users of high similarity into the same group; it applies an attention mechanism network to the members of a group, designing an attention mechanism model (AMGR) to calculate the weights of the group members and fuse their preferences; and it uses the Neural Collaborative Filtering (NCF) framework to learn from the interaction data and predict the scores of users or groups for different items, thereby realizing group recommendation.
CN 112732932A, a user entity group recommendation method based on knowledge graph embedding, builds profiles of the user entities in a knowledge graph and, according to the profile features, returns the top-K most relevant user entity groups to the target user entity. Although this method improves the precision of user entity group recommendation, it ignores the various interaction relationships among groups, users and items, so feature extraction is insufficient and the final recommendation effect suffers.
Disclosure of Invention
To solve the above problems, the invention provides a group recommendation method based on a hybrid attention network. The method uses graph attention networks to extract the interaction information of users and items and obtain structured feature representations of users and items, and then uses a sequence attention network to model the interaction relationships among the users within a group and obtain the feature representation of the group. Finally, neural collaborative filtering is used to obtain the predicted scores of groups for items and of users for items, and the model parameters are updated by jointly optimizing the two prediction targets. In this way, the various relationships among groups, users and items are fully modeled and the feature representations of groups, users and items are extracted effectively, thereby improving the recommendation effect.
To achieve this purpose, the invention adopts the following technical solution:
The invention relates to a group recommendation method based on a hybrid attention network, which uses a feature input layer, a feature representation layer, a feature crossing layer and a score prediction layer, and specifically comprises the following steps:
step 1, the feature input layer first acquires the historical interaction records of users and builds a user-item score matrix, then performs matrix factorization optimized with stochastic gradient descent to obtain the initial embedding features p_u and q_v of user u and item v, and inputs the two embedding features into the feature representation layer;
step 2, the feature representation layer first uses graph attention networks on the user-item interaction graph to aggregate the embedding features, obtained in step 1, of each user, each item and their neighbor nodes, and thereby forms the latent feature representations of users and items;
step 3, the feature representation layer applies the local and global attention units of a sequence attention network to the latent features, obtained in step 2, of the users in a group and of the items, so as to model the user-item and user-user interaction relationships, outputs a local representation and a global representation of the group latent feature vector, and then fuses the two to generate the final group latent feature representation;
step 4, the feature representation layer inputs the latent features of groups, users and items obtained in steps 2 and 3 into the feature crossing layer;
step 5, the feature crossing layer inputs the concatenated group-item vectors or the concatenated user-item vectors from step 4 into a multi-layer perceptron with shared parameters, which performs high-order feature crossing and outputs the result to the final score prediction layer to obtain the predicted scores of groups for items and of users for items;
step 6, the group-item and user-item score predictions are jointly optimized to update the parameters of the model.
A further improvement of the invention is that: in step 2, the graph attention network extracts the latent features of users and items as follows: in the user-item interaction graph, the item j whose latent feature is to be extracted is regarded as the central node, and the users who have interaction records with item j are regarded as its neighbor nodes; an attention mechanism is used to calculate the attention weights of the item itself and of its neighbor nodes, and the latent feature of the item is obtained by fusing the embedding feature of the item with the embedding features of the neighbor nodes according to these attention weights.
The extraction of user latent features based on the item dimension follows the same procedure as for items.
A further improvement of the invention is that: in step 3, the group latent feature representation is extracted with the sequence attention network as follows:
S31, first, the local attention unit calculates the attention weight of each user according to the interaction relationship between that user and the item to be predicted, and outputs the local representation of the group latent feature by weighted fusion of the latent feature vectors of the users in the group.
S32, the global attention unit extracts the attention weight of each user according to the interaction relationships among the users in the group, and outputs the global representation E_global of the group by weighted fusion of the latent feature vectors of the users in the group.
The local representation and the global representation of the group latent feature are then fused to output the final group latent feature representation.
A further improvement of the invention is that: in step 5, the feature crossing and score prediction are implemented mainly by establishing loss functions for users and for groups according to the idea of pair-wise ranking.
A further improvement of the invention is that: in step 6, the joint optimization is mainly calculated as follows:
S61, first, the loss function of individual score prediction is optimized on the user-item interaction data, the network parameters are learned, and the optimized user latent feature representations are output;
S62, then, the loss function of group score prediction is optimized on the group-item interaction data, the parameters specific to the group prediction process are learned, and the parameters shared by the two prediction processes are fine-tuned;
S63, finally, the two processes are iterated until the model as a whole converges.
The invention has the beneficial effects that:
(1) The graph attention network adopted in the invention makes effective use of the user-item interaction graph, extracts richer feature representations of users and items, and helps to alleviate, to a certain extent, the cold start problem encountered in group recommendation.
(2) The sequence attention network in the invention integrates a local attention unit and a global attention unit, takes into account the influence weights of the users in a group under different interaction relationships, and learns a robust and adaptive group feature representation for the target item to be predicted, thereby improving the satisfaction with group recommendation.
(3) Through joint optimization, the method can use the large number of user-item interactions to train the network parameters shared by the group score prediction and the individual score prediction, which compensates for the insufficient parameter training caused by the scarcity of group-item interactions.
Drawings
FIG. 1 is a diagram showing the overall structure of the method according to the embodiment of the present invention.
FIG. 2 is a schematic diagram of local and global attention units of an embodiment of the present invention.
Detailed Description
In the following description, for purposes of explanation, numerous implementation details are set forth in order to provide a thorough understanding of the embodiments of the invention. It should be understood, however, that these implementation details are not to be interpreted as limiting the invention. That is, in some embodiments of the invention, such implementation details are not necessary.
For data acquisition, the method uses the Meetup data set, which comes from the website Meetup. The data set mainly comprises 16330 groups, 5893887 users and 2510 items, with 685 people in each group, 31214 group-item interactions and 3195246 user-item interactions.
Fig. 1 shows the overall structure of the method according to an embodiment of the present invention. The method mainly uses four layers, namely a feature input layer, a feature representation layer, a feature crossing layer and a score prediction layer, and specifically comprises the following steps:
A. feature input layer
The feature input layer first acquires the historical interaction records of users and builds a user-item score matrix, then performs matrix factorization optimized with stochastic gradient descent to obtain the initial embedding features p_u and q_v of user u and item v, and inputs the two embedding features into the feature representation layer.
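As a non-limiting illustration of the feature input layer, the following Python sketch factorizes a small user-item score matrix with stochastic gradient descent. The function names `factorize` and `mse`, the hyperparameters (`dim`, `lr`, `reg`, `epochs`), the toy matrix and the convention that a zero entry means "no interaction" are assumptions made for this example; the method itself does not specify them.

```python
import numpy as np

def factorize(ratings, dim=8, lr=0.05, reg=0.01, epochs=200, seed=0):
    """Approximate the score matrix as P @ Q.T using plain SGD.

    ratings: 2-D array; zero entries are treated as unobserved (assumption).
    Returns (P, Q): rows of P are user embeddings p_u, rows of Q are
    item embeddings q_v.
    """
    rng = np.random.default_rng(seed)
    n_users, n_items = ratings.shape
    P = rng.normal(scale=0.1, size=(n_users, dim))
    Q = rng.normal(scale=0.1, size=(n_items, dim))
    observed = np.argwhere(ratings > 0)          # (user, item) index pairs
    for _ in range(epochs):
        rng.shuffle(observed)                    # visit pairs in random order
        for u, v in observed:
            p_u = P[u].copy()                    # keep the pre-update value
            err = ratings[u, v] - p_u @ Q[v]
            P[u] += lr * (err * Q[v] - reg * p_u)
            Q[v] += lr * (err * p_u - reg * Q[v])
    return P, Q

def mse(ratings, P, Q):
    """Mean squared error over the observed entries only."""
    mask = ratings > 0
    return float(np.mean((ratings[mask] - (P @ Q.T)[mask]) ** 2))

R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [0, 1, 5, 4]], dtype=float)
P, Q = factorize(R)
```

The rows of `P` and `Q` would then play the role of p_u and q_v passed on to the feature representation layer.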
B. Feature representation layer
The feature representation layer uses the hybrid attention network to obtain the latent feature representations of users, items and groups. The hybrid attention network includes a graph attention network and a sequence attention network: the graph attention network extracts the latent features of users and items, and the sequence attention network extracts the latent features of groups.
The graph attention network extracts the latent feature representations of users and items through the following steps:
S21, extracting the item latent feature with a graph attention network based on the user dimension. The extraction proceeds as follows: the set N(v) of users who have interacted with item v is first retrieved; the user-dimension graph attention network then calculates the attention weights of item v itself and of each user in N(v) with respect to item v, and outputs the final item latent representation by weighted fusion.
The fusion function is defined as follows:
wherein if j = v, then h_vj = q_v; if j ∈ N(v), then h_vj = p_j; α_vj is the attention weight, which is calculated using the following formula:
where σ is the sigmoid nonlinear activation function, [ , ] denotes the concatenation operation, and the remaining symbols are model parameters of the graph attention network based on the user dimension;
S22, extracting the user latent feature with a graph attention network based on the item dimension. The set C(u) of items that have interaction records with user u is first retrieved; the item-dimension graph attention network then calculates the attention weights β_uj of user u itself and of each item in C(u) with respect to user u, and outputs the final user latent representation by weighted fusion. The calculation formula is as follows:
wherein if j = u, then h_uj = p_u; if j ∈ C(u), then h_uj = q_j; the remaining symbols are parameters of the graph attention network based on the item dimension.
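One possible reading of S21 and S22 is sketched below in Python, purely as a non-limiting illustration. The exact attention formula of the method is not assumed here: the scoring form (a sigmoid layer over the concatenation [center, h_j] followed by a softmax), the parameter shapes `W` and `w`, and the function name `attentive_fusion` are all hypothetical choices made for the example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def attentive_fusion(center, neighbors, W, w):
    """Fuse a node's own embedding with its neighbors' embeddings.

    center:    (d,)   embedding of the central node, e.g. q_v for item v.
    neighbors: (n, d) embeddings of its neighbors, e.g. p_j for j in N(v).
    W: (h, 2d), w: (h,) hypothetical attention parameters.
    Returns (fused, alpha), where alpha[0] is the weight of the node itself.
    """
    H = np.vstack([center, neighbors])                     # h_j for j in {v} and N(v)
    pairs = np.hstack([np.tile(center, (len(H), 1)), H])   # rows [center, h_j]
    scores = sigmoid(pairs @ W.T) @ w                      # one scalar per j
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()                                   # softmax (assumed)
    return alpha @ H, alpha

rng = np.random.default_rng(0)
d, h = 4, 6
q_v = rng.normal(size=d)                  # item embedding from step 1
p_neighbors = rng.normal(size=(3, d))     # embeddings of the users in N(v)
W, w = rng.normal(size=(h, 2 * d)), rng.normal(size=h)
item_latent, alpha = attentive_fusion(q_v, p_neighbors, W, w)
```

The user side of S22 would call the same function with `center = p_u` and the item embeddings q_j, j ∈ C(u), as neighbors, using a separate set of parameters.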
Next, the latent features of the group are extracted using the sequence attention network. The input to the sequence attention network is the matrix E_s of the latent vectors of the users in the group, where t denotes the number of users, d denotes the feature dimension, and E_s ∈ R^{d×t}.
Step 3, with reference to FIG. 2, the local and global attention units of the sequence attention network are applied to the latent features, obtained in step 2, of the users in the group and of the items, so as to model the user-item and user-user interaction relationships; the local representation and the global representation of the group latent feature vector are output and then fused to generate the final group latent feature representation.
S31, the local attention unit calculates the attention weight of each user according to the interaction relationship between that user and the item to be predicted, and outputs the local representation E_local of the group latent feature by weighted fusion of the latent feature vectors of the users in the group. The extraction process is expressed as follows:
wherein the remaining symbols are the network parameters of the local attention unit, and α_ui is the attention weight of the i-th user.
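By way of a non-limiting sketch, a local attention unit of this kind could be written in Python as follows. The scoring network (one ReLU layer over the concatenation of a user vector and the target item vector, followed by a softmax) and the names `local_attention`, `W` and `w` are assumptions for illustration and are not taken from the method.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def local_attention(E_users, e_item, W, w):
    """Weight each group member by its relation to the target item.

    E_users: (t, d) latent vectors of the t users in the group.
    e_item:  (d,)   latent vector of the item to be predicted.
    W: (h, 2d), w: (h,) hypothetical parameters of the local attention unit.
    Returns (E_local, alpha), alpha[i] being the weight of the i-th user.
    """
    item_tiled = np.tile(e_item, (len(E_users), 1))
    pairs = np.hstack([E_users, item_tiled])       # rows [user_i, item]
    scores = np.maximum(pairs @ W.T, 0.0) @ w      # ReLU scoring layer (assumed)
    alpha = softmax(scores)
    return alpha @ E_users, alpha

rng = np.random.default_rng(0)
t, d, h = 5, 4, 6
E_users = rng.normal(size=(t, d))
e_item = rng.normal(size=d)
W, w = rng.normal(size=(h, 2 * d)), rng.normal(size=h)
E_local, alpha = local_attention(E_users, e_item, W, w)
```

Because the weights depend on `e_item`, the same group yields a different local representation for each candidate item, which is the adaptive behavior described for the local unit.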
S32, the global attention unit extracts the attention weight of each user according to the interaction relationships among the users in the group, and outputs the global representation E_global of the group by weighted fusion of the latent feature vectors of the users in the group.
The global representation E_global of the group latent feature is extracted as follows:
S321, to capture the fine-grained interactions between one user and the other users in the group, the global representation maps the latent feature vectors of the users in the group to two additional auxiliary spaces, and the three feature spaces are used together to learn the similarity between the latent features of the users in the group. The three feature spaces are expressed as follows:
Q = X W_Q
K = X W_K
V = X W_V
wherein X ∈ R^{t×d} is the feature matrix composed of the latent feature vectors of the t users in the group, W_Q ∈ R^{d×d}, W_K ∈ R^{d×d} and W_V ∈ R^{d×d} are the parameter matrices that map the feature vectors into the respective spaces, and Q, K and V are respectively the Queries, Keys and Values defined in the conventional attention mechanism;
S322, after Q, K and V are obtained, the attention weight β_{uj,ui} between each other user uj in the group and the designated user ui is calculated; β_{uj,ui} represents the degree of attention that the other user uj pays to user ui:
wherein Q_uj denotes the latent feature vector of user uj in Q, and K_ui denotes the latent feature vector of user ui in K;
S323, the attention weights between the other users and the designated user ui are then summed and normalized to obtain the attention weight of that user within the group:
S324, weighted fusion is performed over the users of the group using these attention weights, the global latent feature being expressed as follows:
wherein V_ui denotes the feature vector of user ui in V; the local representation and the global representation obtained above are then fused in proportion, and the final group latent feature representation is output as follows:
wherein ε is the proportionality coefficient.
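Steps S321 to S324 and the final fusion can be sketched in Python as below, as a non-limiting illustration only. The projections Q = XW_Q, K = XW_K, V = XW_V follow the text; the scaled dot product used for β_{uj,ui}, the softmax normalization, and the convex combination ε·E_local + (1 − ε)·E_global are assumptions, since the exact expressions are not fixed here. The names `global_attention` and `fuse` are hypothetical.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def global_attention(X, W_q, W_k, W_v):
    """X: (t, d) user latent vectors; W_q, W_k, W_v: (d, d) projections."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v        # S321: three feature spaces
    pair = (Q @ K.T) / np.sqrt(X.shape[1])     # S322: pair[j, i] ~ beta_{uj,ui}
    np.fill_diagonal(pair, 0.0)                # count other users only (j != i)
    beta = softmax(pair.sum(axis=0))           # S323: sum over j, then normalize
    return beta @ V, beta                      # S324: weighted fusion of V

def fuse(E_local, E_global, eps=0.5):
    """Proportional fusion; eps plays the role of the coefficient epsilon."""
    return eps * E_local + (1.0 - eps) * E_global

rng = np.random.default_rng(0)
t, d = 5, 4
X = rng.normal(size=(t, d))
W_q, W_k, W_v = (rng.normal(size=(d, d)) for _ in range(3))
E_global, beta = global_attention(X, W_q, W_k, W_v)
E_local = X.mean(axis=0)                       # stand-in for the local representation
E_group = fuse(E_local, E_global, eps=0.3)
```

Unlike the local unit, `beta` here does not depend on the target item: it reflects only how much attention each user receives from the other members of the group.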
After the latent feature representations of users, items and groups are obtained, they are input into the feature crossing layer.
C. Feature crossing layer and score prediction layer
Step 4, the feature representation layer inputs the latent features of groups, users and items obtained in steps 2 and 3 into the feature crossing layer.
Step 5, the feature crossing layer inputs the concatenated group-item vectors or the concatenated user-item vectors from step 4 into a multi-layer perceptron with shared parameters, which performs high-order feature crossing and outputs the result to the final score prediction layer to obtain the predicted scores of groups for items and of users for items. That is, the individual user score prediction and the group score prediction share the parameters of the multi-layer perceptron in the feature crossing layer, and the final score prediction is output after the feature crossing layer.
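The parameter sharing of step 5 can be illustrated with the following Python sketch, in which one small multi-layer perceptron scores both group-item and user-item pairs. The class name `SharedMLP`, the hidden sizes, the ReLU activation and the linear output layer are assumptions chosen for the example rather than details of the method.

```python
import numpy as np

class SharedMLP:
    """Small ReLU MLP reused for group-item and user-item pairs."""

    def __init__(self, in_dim, hidden=(16, 8), seed=0):
        rng = np.random.default_rng(seed)
        dims = (in_dim, *hidden)
        # One (weight, bias) pair per hidden layer.
        self.layers = [(rng.normal(scale=0.1, size=(a, b)), np.zeros(b))
                       for a, b in zip(dims[:-1], dims[1:])]
        self.out = rng.normal(scale=0.1, size=hidden[-1])

    def score(self, entity, item):
        """entity: latent vector of a group or of a user; item: item vector."""
        z = np.concatenate([entity, item])       # concatenated (spliced) feature
        for W, b in self.layers:
            z = np.maximum(z @ W + b, 0.0)       # high-order feature crossing
        return float(z @ self.out)               # score prediction layer

rng = np.random.default_rng(1)
d = 4
mlp = SharedMLP(in_dim=2 * d)
e_group, e_user, e_item = rng.normal(size=(3, d))
y_group = mlp.score(e_group, e_item)    # group-item prediction
y_user = mlp.score(e_user, e_item)      # user-item prediction, same weights
```

Because both calls go through the same `layers` and `out`, gradients from the plentiful user-item data would also update the weights used for group-item prediction.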
The feature concatenation, feature crossing and score prediction for a group and an item are expressed as follows:
wherein the input vector is the concatenated feature of the group and the item, k denotes the number of hidden layers, and the output is the predicted score of group g for item v. Treating the group recommendation task as a pair-wise ranking problem, the following objective function for group score prediction is defined:
where v is a positive sample observed in the group-item interaction records R_G; v', which is not in R_G, serves as a negative sample; D_G is the set of triples each consisting of a group, a positive item sample and a negative item sample; Θ is a regularization parameter; and the objective function optimizes the parameters by maximizing the difference between the predictions for positive and negative samples.
Owing to the joint optimization, the large amount of user-item score data can be used to train the network parameters shared by the group score prediction and the individual score prediction, which compensates for the insufficient parameter training caused by the scarcity of group-item score data.
The feature concatenation, feature crossing and score prediction for a user and an item are therefore performed in the same way, as follows:
wherein the user score prediction and the group score prediction share parameters in the feature crossing layer, and the output is the predicted score of user u for item v. Similarly, treating the personalized user recommendation task as a pair-wise ranking problem, the following objective function for user score prediction is defined:
wherein D_U is the set of triples each consisting of a user, a positive item sample and a negative item sample.
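A pair-wise objective of the kind described for L_G and L_U can be sketched in Python as follows. This is a non-limiting illustration: the logistic (BPR-style) form −ln σ(ŷ_pos − ŷ_neg) and the L2 penalty are assumptions, one common way of rewarding a large gap between positive and negative predictions, and are not asserted to be the exact objective of the method. The name `pairwise_loss` and its arguments are hypothetical.

```python
import numpy as np

def pairwise_loss(pos_scores, neg_scores, params=(), reg=0.0):
    """Pair-wise ranking loss over (entity, positive item, negative item) triples.

    pos_scores[i], neg_scores[i]: predicted scores for the positive and the
    negative item of the i-th triple. params: arrays to regularize (optional).
    """
    diff = np.asarray(pos_scores, float) - np.asarray(neg_scores, float)
    # -ln(sigmoid(diff)) written stably as ln(1 + exp(-diff)).
    rank_term = np.logaddexp(0.0, -diff).sum()
    l2 = sum(float(np.sum(p ** 2)) for p in params)
    return float(rank_term + reg * l2)

# Triples where the positive item is scored higher give a smaller loss.
well_ranked = pairwise_loss([2.0, 1.5], [0.0, -0.5])
badly_ranked = pairwise_loss([0.0, -0.5], [2.0, 1.5])
```

The same function would serve for both objectives: with triples drawn from D_G and group-item scores it plays the role of L_G, and with triples from D_U and user-item scores that of L_U.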
Step 6, the group-item and user-item score predictions are jointly optimized to update the parameters of the model.
To optimize the objective functions L_G and L_U, the two-stage joint optimization method of steps S61 to S63 below is adopted; the training process updates the parameters with the stochastic gradient descent algorithm and proceeds as follows:
S61, first, the loss function of individual score prediction is optimized on the user-item interaction data, the network parameters are learned, and the optimized user latent feature representations are output;
S62, then, the loss function of group score prediction is optimized on the group-item interaction data, the parameters specific to the group prediction process are learned, and the parameters shared by the two prediction processes are fine-tuned;
S63, finally, the two processes are iterated until the model as a whole converges, and the items are then ranked according to the predicted scores to generate a service recommendation list for the group.
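The alternation of S61 to S63 and the final ranking can be outlined in Python as below, as a non-limiting sketch. The callables `user_step` and `group_step` are hypothetical placeholders for one pass of stochastic gradient descent over the user-item and group-item triples respectively; the stopping rule (change in total loss below `tol`) and the toy steps used in the demonstration are likewise assumptions.

```python
def joint_train(user_step, group_step, max_rounds=50, tol=1e-4):
    """Alternate user-level and group-level updates until the loss stabilizes.

    user_step / group_step: callables that perform one optimization pass
    and return the resulting loss value (hypothetical interface).
    """
    history = []
    for _ in range(max_rounds):
        loss_u = user_step()     # S61: optimize individual score prediction
        loss_g = group_step()    # S62: optimize group prediction, tune shared weights
        history.append(loss_u + loss_g)
        if len(history) > 1 and abs(history[-2] - history[-1]) < tol:
            break                # S63: stop once the whole model has converged
    return history

def recommend(scores, k):
    """Return the indices of the k items with the highest predicted scores."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]

# Toy stand-ins whose losses simply halve on every pass.
state = {"u": 1.0, "g": 1.0}

def user_step():
    state["u"] *= 0.5
    return state["u"]

def group_step():
    state["g"] *= 0.5
    return state["g"]

history = joint_train(user_step, group_step)
top_items = recommend([0.1, 0.9, 0.5], k=2)
```

In a real model the two steps would update overlapping parameter sets, so each round of S62 fine-tunes weights that S61 has already trained on the larger user-item data.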
The invention improves the accuracy and interpretability of group recommendation.
The above description is only an embodiment of the present invention, and is not intended to limit the present invention. Various modifications and alterations to this invention will become apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the scope of the claims of the present invention.