Privacy protection-based multi-social platform user recommendation method and system
1. A multi-social platform user recommendation method based on privacy protection is characterized by comprising the following steps:
step 1: each participant client locally integrates initial data and acquires a training model initialized by the server and an initialized key;
step 2: each participant client uses locally integrated initial data to train the training model based on the target precision set by the client, encrypts training parameters obtained after training and sends the encrypted training parameters to the server;
step 3: each participant client updates the training model based on the aggregation result of the training parameters of all the participant clients, and determines whether the updated training model reaches the target precision set by that participant client; if not, step 2 is repeated, and if so, step 4 is executed; the aggregation result of the training parameters of all the participant clients is obtained by decryption and aggregation at the server;
step 4: each participant client updates the label vectors of all social users in its platform by using the trained model, and mutually recommends the social users with similar label vectors.
2. The privacy protection based multi-social platform user recommendation method of claim 1, wherein: the initial data locally integrated by each participant client in step 1 includes: social user account ID, social user relationship information, social user characteristics, social user tags.
3. The privacy protection based multi-social platform user recommendation method of claim 2, wherein: the method for locally integrating the initial data by the client of each participant in the step 1 comprises the following steps:
step 1.1: constructing an adjacency matrix: constructing a social network of each social platform by taking the social users as nodes and the social user relations as connecting edges; obtaining an adjacency matrix of the network based on the constructed social network;
step 1.2: constructing a characteristic matrix: numbering all social user characteristics, taking the social user characteristic number as a column number, and taking an account ID of a social user as a row number to establish a characteristic matrix;
step 1.3: constructing a label matrix: numbering all the social user tags, and constructing a tag matrix of the social users based on the numbers of the social user tags.
4. The privacy protection based multi-social platform user recommendation method of claim 1, wherein: the method for initializing the key in the step 1 comprises the following steps:
s1, generating two prime numbers $p$ and $q$, and calculating their product $b = p \times q$ and the Euler function value $m = (p-1)(q-1)$;
s2, generating a public key $e$, wherein $e$ and $m$ are relatively prime and $1 < e < m$;
s3, generating a private key $d$, wherein the remainder of $e \times d$ divided by $m$ is 1;
s4, the method for encrypting the training parameters comprises the following steps:
s5, acquiring a number $a$ to be encrypted, wherein the ciphertext $c$ is the remainder of $a^{e}$ divided by $b$, i.e. $c = a^{e} \bmod b$;
the method for decrypting the training parameters comprises the following steps:
obtaining a ciphertext $c$, wherein the decrypted number $a$ is the remainder of $c^{d}$ divided by $b$, i.e. $a = c^{d} \bmod b$.
5. The privacy protection based multi-social platform user recommendation method of claim 3, wherein: the method for training the training model by each participant client in the step 2 comprises the following steps:
each participant client inputs the feature matrix $H$ and the adjacency matrix $A$ obtained by locally integrating the initial data into the training model, and outputs a predicted label matrix $L'$:
$L' = A H'$ (1)
$H' = \{h'_1, h'_2, \ldots, h'_i, \ldots, h'_n\}^{T}$ (2)
wherein $W \in \mathbb{R}^{f \times y}$ and $w \in \mathbb{R}^{1 \times 2y}$ are trainable weight matrices, LeakyReLU(·) is the activation function, $\|$ is the matrix concatenation operation, $N_i$ is the set of first-order neighbor nodes of node $i$, $\alpha_{ij}$ represents the importance index of node $j$ to node $i$, $h_i$, $h_j$ and $h_k$ are the feature vectors of node $i$, node $j$ and node $k$ respectively, $H'$ is the reconstructed feature matrix, $h'_i$ is the reconstructed feature vector of node $i$, and $n$ is the total number of users of a single platform;
calculating the cross entropy between the predicted label matrix $L'$ and the label matrix $L$ obtained by locally integrating the initial data, and iteratively updating the local weight matrices $W$ and $w$ by minimizing the cross entropy.
6. The privacy protection based multi-social platform user recommendation method of claim 5, wherein: the training parameters of all the participant clients are aggregated in step 3 as shown in the following formula:
wherein $W'$ and $w'$ are the updated weight matrices, $z$ is the number of clients participating in training, and $W_t$ and $w_t$ are the weight matrices of the $t$-th participant client.
7. The privacy protection based multi-social platform user recommendation method of claim 1, wherein: step 4, in the process of mutual recommendation of social users with similar label vectors, if the difference index of the label vectors of the two social users is smaller than a preset threshold, mutual recommendation is performed through the participant client, and the difference index of the label vectors of the two social users is shown as the following formula:
where $y$ is the total number of user tags across all social platforms, and $l_{1s}$ and $l_{2s}$ are the values of the $s$-th element in the tag vectors of the first social user and the second social user, respectively.
8. The privacy protection based multi-social platform user recommendation system according to any one of claims 1-7, comprising: a plurality of participant clients and a server; each of the participant clients includes: a preprocessing module, a training module, a first encryption module and an output module; the server includes: an initialization module, a second encryption module and an aggregation module;
the preprocessing module locally integrates initial data, inputs the initial data as local original data, and outputs an adjacency matrix, a feature matrix and a label matrix to the training module;
the initialization module initializes the training model and the secret key and outputs the training model and the secret key to the training module of each participant client and the second encryption module of the server;
the training module takes as input the initial data integrated by the preprocessing module and the aggregation result of the training parameters of all the participant clients, trains the training model based on the target precision set by the training module, updates the training model based on the aggregation result, determines whether the updated training model reaches the target precision, and outputs a trained label matrix to the output module if the target precision is reached;
the first encryption module encrypts training parameters obtained after the participant client is trained, inputs trained parameter plaintext from the training module, and outputs encrypted parameter ciphertext to the second encryption module;
the second encryption module decrypts the received encrypted training parameters, inputs the encrypted parameter ciphertext from the first encryption module of the participant client, inputs the key from the initialization module, and outputs the decrypted parameter plaintext to the aggregation module;
the aggregation module aggregates the training parameters sent by the client sides of all the participants, inputs the decrypted parameter plaintext from the second encryption module, and outputs the aggregated parameters to the training modules of the client sides of the participants;
and the output module updates the label vectors of all social users in the platform based on the trained model, carries out mutual recommendation on the social users with similar label vectors, inputs the trained label matrix from the training module and outputs the user recommendation result.
Background
The emergence of internet social platforms such as QQ, WeChat, Weibo and DingTalk has brought great convenience to daily communication. Different social platforms serve different social needs, so the same user often has accounts on several platforms. As the number of social platforms grows, people's social circles are scattered across them. How to let users of one social platform obtain friend recommendations from other platforms, or get to know more friends with shared interests, has therefore become a current concern for social platforms.
Currently, each social platform makes user recommendations from its own data: for example, by comparing the number of friends two users have in common, as in "People You May Know" in QQ; by using the users' geographic location, as in nearby friends in QQ and WeChat; by using address book information; or by random online matching, as in "Shake" in WeChat. However, because a single platform may hold only a small amount of data or information, these methods struggle to make more accurate user recommendations. Social platforms also find it difficult to pool their data for training, because they must protect user privacy and business confidentiality, so better results cannot be obtained. It is therefore necessary to provide a multi-social-platform user recommendation method and system based on privacy protection.
Disclosure of Invention
The invention aims to solve the technical problems in the prior art and provides a multi-social platform user recommendation method and system based on privacy protection.
The method and the system can effectively improve the user recommendation accuracy in the platform per se and protect the user data privacy of each platform.
In order to achieve the purpose, the invention provides the following scheme: the invention provides a privacy protection-based multi-social platform user recommendation method, which comprises the following steps:
step 1, each participant client locally integrates initial data, and acquires a training model initialized by a server and an initialized key;
step 2, each participant client trains the training model by adopting locally integrated initial data based on target precision set by the participant client, encrypts training parameters obtained after training and sends the training parameters to a server;
step 3, each participant client updates the training model based on the aggregation result of the training parameters of all the participant clients, and calculates whether the updated training model reaches the target precision set by the participant client, if not, the previous step is repeatedly executed, and if so, the next step is executed; the aggregation results of the training parameters of the client sides of all the participants are obtained by decryption and aggregation of the server sides;
step 4, each participant client updates the label vectors of all social users in its platform by using the trained model, and mutually recommends the social users with similar label vectors.
Preferably, the initial data locally integrated by each participant client in step 1 includes: social user account ID, social user relationship information, social user characteristics, social user tags.
Preferably, the method for locally integrating the initial data by each participant client in step 1 includes:
step 1.1, constructing an adjacency matrix: constructing a social network of each social platform by taking the social users as nodes and the social user relations as connecting edges; obtaining an adjacency matrix of the network based on the constructed social network;
step 1.2, constructing a feature matrix: numbering all social user characteristics, taking the social user characteristic number as a column number, and taking an account ID of a social user as a row number to establish a characteristic matrix;
step 1.3, constructing a label matrix: numbering all the social user tags, and constructing a tag matrix of the social users based on the numbers of the social user tags.
Preferably, the method for training the training model by each participant client in step 2 includes:
each participant client inputs the feature matrix $H$ and the adjacency matrix $A$ obtained by locally integrating the initial data into the training model, and outputs a predicted label matrix $L'$:
$L' = A H'$ (1)
$H' = \{h'_1, h'_2, \ldots, h'_i, \ldots, h'_n\}^{T}$ (2)
wherein $W \in \mathbb{R}^{f \times y}$ and $w \in \mathbb{R}^{1 \times 2y}$ are trainable weight matrices, LeakyReLU(·) is the activation function, $\|$ is the matrix concatenation operation, $N_i$ is the set of first-order neighbor nodes of node $i$, $\alpha_{ij}$ represents the importance index of node $j$ to node $i$, $h_i$, $h_j$ and $h_k$ are the feature vectors of node $i$, node $j$ and node $k$ respectively, $H'$ is the reconstructed feature matrix, $h'_i$ is the reconstructed feature vector of node $i$, and $n$ is the total number of users of a single platform;
calculating the cross entropy between the predicted label matrix $L'$ and the label matrix $L$ obtained by locally integrating the initial data, and iteratively updating the local weight matrices $W$ and $w$ by minimizing the cross entropy.
Preferably, the method for initializing the key in step 1 comprises:
s1, generating two prime numbers $p$ and $q$, and calculating their product $b = p \times q$ and the Euler function value $m = (p-1)(q-1)$;
s2, generating a public key $e$, wherein $e$ and $m$ are relatively prime and $1 < e < m$;
s3, generating a private key $d$, wherein the remainder of $e \times d$ divided by $m$ is 1;
s4, the method for encrypting the training parameters comprises the following steps:
s5, acquiring a number $a$ to be encrypted, wherein the ciphertext $c$ is the remainder of $a^{e}$ divided by $b$, i.e. $c = a^{e} \bmod b$;
the method for decrypting the training parameters comprises the following steps:
obtaining a ciphertext $c$, wherein the decrypted number $a$ is the remainder of $c^{d}$ divided by $b$, i.e. $a = c^{d} \bmod b$.
Preferably, the training parameters of all the participant clients are aggregated in step 3 as shown in the following formula:
wherein $W'$ and $w'$ are the updated weight matrices, $z$ is the number of clients participating in training, and $W_t$ and $w_t$ are the weight matrices of the $t$-th participant client.
Preferably, in the process of mutually recommending the social users with similar tag vectors in step 4, if the difference index of the tag vectors of the two social users is smaller than a preset threshold, the mutual recommendation is performed through the participant client, and the difference index of the tag vectors of the two social users is shown as the following formula:
where $y$ is the total number of user tags across all social platforms, and $l_{1s}$ and $l_{2s}$ are the values of the $s$-th element in the tag vectors of the first social user and the second social user, respectively.
The invention also provides a multi-social platform user recommendation system based on privacy protection, which comprises: a plurality of participant clients and a server; each of the participant clients includes: a preprocessing module, a training module, a first encryption module and an output module; the server includes: an initialization module, a second encryption module and an aggregation module;
the preprocessing module locally integrates initial data, inputs the initial data as local original data, and outputs an adjacency matrix, a feature matrix and a label matrix to the training module;
the initialization module initializes the training model and the key and outputs them to the training module of each participant client and the second encryption module of the server;
the training module takes as input the initial data locally integrated by the preprocessing module and the aggregation result of the training parameters of all the participant clients, trains the training model based on the target precision set by the training module, updates the training model based on the aggregation result, determines whether the updated training model reaches the target precision, and outputs a trained label matrix to the output module if the target precision is reached;
the first encryption module encrypts training parameters obtained after the participant client is trained, inputs trained parameter plaintext from the training module, and outputs encrypted parameter ciphertext to the second encryption module;
the second encryption module decrypts the received encrypted training parameters, inputs the encrypted parameter ciphertext from the first encryption module of the participant client, inputs the key from the initialization module, and outputs the decrypted parameter plaintext to the aggregation module;
the aggregation module is used for aggregating the training parameters sent by the client sides of all the participants, inputting decrypted parameter plaintext from the second encryption module and outputting the aggregated parameters to the training modules of the client sides of the participants;
and the output module updates the label vectors of all social users in the platform based on the trained model, carries out mutual recommendation on the social users with similar label vectors, inputs the trained label matrix from the training module and outputs the user recommendation result.
The invention has the advantages that:
according to the method, the social network is constructed, the model is trained by the social network data set and the user characteristic data in a matrix form, and the training effect of the local model is improved; the participating platforms are trained together in a mode of sharing the model but not sharing data, and shared training parameters are aggregated in an encryption mode, so that the user recommendation accuracy in the platforms per se is improved, and the user data privacy of each platform is also protected.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the embodiments are briefly described below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a schematic diagram of the system of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
Referring to fig. 1, the embodiment provides a multi-social-platform user recommendation method based on privacy protection, including:
T1: the initial data is locally integrated by each participant client, and the training model and the key are initialized by the server and distributed to each participant client.
A participant client is a computing device provided by a social platform that participates in joint user recommendation. The local initial data is the data of each social platform, and comprises: social user account IDs (these may differ between platforms but must be unique within a single platform); social user relationship information (depending on the social platform, the relationship may be friendship, following, likes and so on); social user features (depending on the platform, features may be hobbies, categories of content browsed, number of times content is browsed, active time periods and so on); and social user tags (tags assigned to social users according to the characteristics of each social platform, such as movie enthusiast, outdoor sports, or product promotion).
The method for locally integrating the initial data by the client of each participant specifically comprises the following steps:
Constructing an adjacency matrix: a social network of each social platform is constructed by taking the social users as nodes and the social user relationships as connecting edges; based on the constructed social network, an adjacency matrix of the network is obtained and denoted $A$ (the elements of matrix $A$ are 0 and 1: 0 indicates that there is no direct relationship between the two social users corresponding to the row and column, and 1 indicates that a social user relationship exists between them).
Constructing a feature matrix: all social user features are numbered 1 to $f$, where $f$ is the total number of social user features across all social platforms. A feature matrix $H$ is established by taking the social user feature number as the column number and the user account ID as the row number, wherein $H = \{h_1, h_2, \ldots, h_i, \ldots, h_n\}^{T}$, $h_i$ is the feature vector of the $i$-th social user, and $n$ is the total number of users of a single platform (the elements of matrix $H$ are 0 and 1: 0 indicates that the social user corresponding to the row does not have the feature, and 1 indicates that the social user has it).
Constructing a label matrix: all tags are numbered 1 to $y$, where $y$ is the total number of user tags across all social platforms, and a tag matrix $L$ of the social users is constructed according to the numbers of the social user tags, wherein $L = \{l_1, l_2, \ldots, l_i, \ldots, l_n\}^{T}$ and $l_i$ is the label vector of the $i$-th user (the elements of matrix $L$ are 0 and 1: 0 indicates that the social user does not have that tag, and 1 indicates that the social user has it).
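The construction of the three matrices can be illustrated with a minimal sketch. The function name `build_matrices`, the dictionary-based input layout and the treatment of relationships as undirected are assumptions made for illustration only; the embodiment does not prescribe a data layout.

```python
def build_matrices(user_ids, relations, user_features, user_tags, f, y):
    """Build adjacency A, feature H and label L matrices as nested 0/1 lists.

    user_ids: account IDs in row order; relations: iterable of (id, id) pairs;
    user_features / user_tags: dict mapping an ID to a set of 1-based numbers.
    """
    index = {uid: row for row, uid in enumerate(user_ids)}
    n = len(user_ids)
    A = [[0] * n for _ in range(n)]
    for u, v in relations:
        i, j = index[u], index[v]
        A[i][j] = 1
        A[j][i] = 1  # assumption: a relationship is treated as undirected
    # Row = user, column = feature number (1..f) or tag number (1..y).
    H = [[1 if k in user_features.get(uid, set()) else 0 for k in range(1, f + 1)]
         for uid in user_ids]
    L = [[1 if k in user_tags.get(uid, set()) else 0 for k in range(1, y + 1)]
         for uid in user_ids]
    return A, H, L


A, H, L = build_matrices(
    user_ids=["u1", "u2", "u3"],
    relations=[("u1", "u2"), ("u2", "u3")],
    user_features={"u1": {1, 3}, "u2": {2}, "u3": {1}},
    user_tags={"u1": {1}, "u3": {2}},
    f=3,
    y=2,
)
```

In this toy input, user u2 has no tags, so its row in the label matrix is all zeros.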
After the user tag vectors are constructed, each social platform extracts a portion of its social users and divides them into a training set, a verification set and a test set. In this embodiment the ratio of training set to verification set to test set is 1:2:7, although other ratios may be used. Users whose tag vectors are not all 0 are selected preferentially, and the number of users selected is not limited.
The server is one or more computing devices provided by a participant or a third party. Initializing the model means randomly assigning initial values to the weight matrices used for model training; the server also generates the parameters required by the subsequent encryption process. The content sent by the server to each participant client is the training model with its initialized parameters, together with the public key values $e$ and $b$.
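One possible way to carry out such a split is sketched below. The function name `split_users`, the `sample_size` parameter and the fixed random seed are hypothetical; only the 1:2:7 ratio and the preference for users with non-zero tag vectors come from the embodiment.

```python
import random


def split_users(label_matrix, ratios=(1, 2, 7), sample_size=None, seed=0):
    """Return (train, val, test) row indices, preferring users that have tags."""
    rng = random.Random(seed)
    labelled = [i for i, row in enumerate(label_matrix) if any(row)]
    unlabelled = [i for i, row in enumerate(label_matrix) if not any(row)]
    rng.shuffle(labelled)
    rng.shuffle(unlabelled)
    pool = labelled + unlabelled          # users with tags are taken first
    if sample_size is not None:
        pool = pool[:sample_size]
    total = sum(ratios)
    n_train = len(pool) * ratios[0] // total
    n_val = len(pool) * ratios[1] // total
    train = pool[:n_train]
    val = pool[n_train:n_train + n_val]
    test = pool[n_train + n_val:]         # the remainder goes to the test set
    return train, val, test


labels = [[1, 0]] * 8 + [[0, 0]] * 4      # 8 tagged users, 4 untagged users
train, val, test = split_users(labels, sample_size=10)
```

Because tagged users are placed at the front of the pool, the small training and verification sets are filled with tagged users before any untagged user is used.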
T2: each participant client trains the training model using the locally integrated initial data, based on the target precision it has set, encrypts the training parameters obtained after training, and sends the encrypted training parameters to the server.
The target precision set by a participant client is the test set classification accuracy ac that the social platform expects to achieve through joint training.
The training process of the training model is as follows:
The feature matrix $H$ and the adjacency matrix $A$ of the platform are input, and a predicted label matrix $L'$ is output.
$L' = A H'$ (1)
$H' = \{h'_1, h'_2, \ldots, h'_i, \ldots, h'_n\}^{T}$ (2)
Wherein $W \in \mathbb{R}^{f \times y}$ and $w \in \mathbb{R}^{1 \times 2y}$ are trainable weight matrices, LeakyReLU(·) is the activation function, $\|$ is the matrix concatenation operation, $N_i$ is the set of first-order neighbor nodes of node $i$, $\alpha_{ij}$ represents the importance index of node $j$ to node $i$, $h_j$ is the feature vector of node $j$, $H'$ is the reconstructed feature matrix, and $h'_i$ is the reconstructed feature vector of node $i$.
After the predicted label matrix $L'$ is calculated, the cross entropy between the label matrix $L$ and the predicted label matrix $L'$ is calculated, and the local weight matrices $W$ and $w$ are iteratively updated by minimizing the cross entropy.
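The forward pass can be sketched as follows. The formulas that define $\alpha_{ij}$ and $h'_i$ are not reproduced in the text, so this sketch assumes the common graph-attention form that fits the stated symbols: each feature vector is projected by $W$, a score LeakyReLU($w \cdot [h_i W \,\|\, h_j W]$) is normalized by softmax over the neighbors in $N_i$ to give $\alpha_{ij}$, and $h'_i$ is the $\alpha$-weighted sum of the projected neighbor vectors. The LeakyReLU slope, the handling of isolated nodes and the row-wise softmax inside the cross entropy are likewise assumptions, and the weight update itself is not shown.

```python
import math


def leaky_relu(x, slope=0.2):  # slope value is an assumption
    return x if x > 0 else slope * x


def project(h, W):
    """Row vector h (length f) times matrix W (f x y), giving a length-y list."""
    return [sum(h[k] * W[k][c] for k in range(len(h))) for c in range(len(W[0]))]


def forward(A, H, W, w):
    """Return the predicted label matrix L' = A H' (equation (1))."""
    n = len(H)
    Z = [project(h, W) for h in H]                    # projected features, n x y
    H_new = []
    for i in range(n):
        neighbours = [j for j in range(n) if A[i][j] == 1]
        if not neighbours:
            H_new.append([0.0] * len(Z[i]))           # assumption: isolated node maps to zeros
            continue
        # Score for each neighbour: LeakyReLU(w . [z_i || z_j])
        scores = [leaky_relu(sum(a * b for a, b in zip(w, Z[i] + Z[j])))
                  for j in neighbours]
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]            # importance index alpha_ij
        H_new.append([sum(alpha * Z[j][c] for alpha, j in zip(alphas, neighbours))
                      for c in range(len(Z[i]))])
    return [[sum(A[i][j] * H_new[j][c] for j in range(n)) for c in range(len(H_new[0]))]
            for i in range(n)]


def cross_entropy(L_true, L_pred):
    """Mean cross entropy between 0/1 labels and a row-wise softmax of L'."""
    loss = 0.0
    for true_row, pred_row in zip(L_true, L_pred):
        peak = max(pred_row)
        log_norm = peak + math.log(sum(math.exp(p - peak) for p in pred_row))
        loss -= sum(t * (p - log_norm) for t, p in zip(true_row, pred_row))
    return loss / len(L_true)


A = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
H = [[1, 0], [0, 1], [1, 0]]
W = [[1, 0], [0, 1]]          # f = 2, y = 2
w = [0, 0, 0, 0]              # length 2y; all zeros gives uniform attention
L_pred = forward(A, H, W, w)
loss = cross_entropy([[1, 0], [0, 1], [1, 0]], L_pred)
```

With `w` set to zeros every neighbor receives the same attention weight, which makes the toy output easy to check by hand; in training, `W` and `w` would be adjusted by an optimizer to reduce `loss`.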
The key initialization, encryption and decryption method comprises the following steps:
Key initialization process: the server generates two prime numbers $p$ and $q$ (generally 1024 bits), calculates their product $b = p \times q$ and the Euler function value $m = (p-1)(q-1)$, generates a public key $e$ such that $e$ and $m$ are relatively prime and $1 < e < m$, and generates a private key $d$ such that the remainder of $e \times d$ divided by $m$ is 1.
The encryption process is as follows: a number $a$ to be encrypted is obtained, and the ciphertext $c$ is the remainder of $a^{e}$ divided by $b$, i.e. $c = a^{e} \bmod b$.
The decryption process is as follows: a ciphertext $c$ is obtained, and the decrypted number $a$ is the remainder of $c^{d}$ divided by $b$, i.e. $a = c^{d} \bmod b$.
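The key initialization, encryption and decryption steps can be traced with small numbers. The sketch below uses toy primes (61 and 53) instead of 1024-bit primes, and it assumes a fixed-point encoding (the hypothetical constant `SCALE`) to turn a real-valued training parameter into a non-negative integer smaller than $b$; the embodiment does not state how parameters are encoded.

```python
from math import gcd


def make_keys(p, q, e=None):
    """Return ((e, b), (d, b)) following the key initialization process."""
    b = p * q
    m = (p - 1) * (q - 1)
    if e is None:
        # Smallest odd e with 1 < e < m that is relatively prime to m.
        e = next(k for k in range(3, m, 2) if gcd(k, m) == 1)
    d = pow(e, -1, m)          # modular inverse, so (e * d) % m == 1 (Python 3.8+)
    return (e, b), (d, b)


def encrypt(a, public_key):
    e, b = public_key
    return pow(a, e, b)        # c = a^e mod b


def decrypt(c, private_key):
    d, b = private_key
    return pow(c, d, b)        # a = c^d mod b


SCALE = 1000  # hypothetical fixed-point scale; not specified by the embodiment

public_key, private_key = make_keys(61, 53, e=17)
weight = 0.731
encoded = round(weight * SCALE)           # integer to encrypt, must be below b
cipher = encrypt(encoded, public_key)
decoded = decrypt(cipher, private_key) / SCALE
```

Negative parameters would need an additional offset or sign encoding before encryption, which this sketch leaves out.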
T3: the server decrypts and aggregates the collected encrypted training parameters, and returns the aggregation result of the training parameters to all the participant clients.
The aggregation process is shown by the following formula:
wherein $W'$ and $w'$ are the updated weight matrices, $z$ is the number of clients participating in training, and $W_t$ and $w_t$ are the weight matrices of the $t$-th participant client.
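The aggregation formula itself is not reproduced in the text. Given the symbols defined for it, one plausible reading is an element-wise average over the $z$ clients, $W' = \frac{1}{z}\sum_{t=1}^{z} W_t$ and $w' = \frac{1}{z}\sum_{t=1}^{z} w_t$. The sketch below implements that reading and should be taken as an assumption, not as the formula of the embodiment.

```python
def average_matrices(matrices):
    """Element-wise mean of z equally shaped matrices (assumed aggregation rule)."""
    z = len(matrices)
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [[sum(m[r][c] for m in matrices) / z for c in range(cols)]
            for r in range(rows)]


# Decrypted parameters from z = 2 participant clients (toy values).
client_W = [[[1.0, 2.0], [3.0, 4.0]], [[3.0, 4.0], [5.0, 6.0]]]
client_w = [[[0.5, 1.5]], [[1.5, 0.5]]]

W_new = average_matrices(client_W)
w_new = average_matrices(client_w)
```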
Each participant client updates the training model based on the aggregation result of the training parameters, and determines whether the updated training model reaches the target precision set by that participant client; if not, step T2 is repeated, and if so, step T4 is executed.
T4: training is finished; each participant client updates the label vectors of all social users in its platform by using the trained model, and mutually recommends the social users with similar label vectors.
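The repetition of steps T2 and T3 until the target precision is reached can be sketched as a simple loop. `ToyClient`, `run_rounds` and the integer "accuracy" are stand-ins invented for illustration; encryption is omitted, and the sketch assumes that rounds continue until every client has reached its own target.

```python
class ToyClient:
    """Stand-in participant whose 'model' is a single integer (illustration only)."""

    def __init__(self, start, target):
        self.param = start
        self.target = target           # target precision, here in percent

    def train(self):
        return self.param + 10         # pretend local training improves the parameter

    def apply(self, merged):
        self.param = merged            # adopt the aggregated parameter

    def accuracy(self):
        return min(self.param, 100)    # pretend accuracy tracks the parameter


def run_rounds(clients, aggregate, max_rounds=100):
    """Repeat T2 (local training) and T3 (aggregation) until all targets are met."""
    for round_no in range(1, max_rounds + 1):
        updates = [client.train() for client in clients]      # T2
        merged = aggregate(updates)                           # T3, on the server
        for client in clients:
            client.apply(merged)
        if all(client.accuracy() >= client.target for client in clients):
            return round_no                                   # proceed to T4
    return None


rounds_needed = run_rounds(
    [ToyClient(50, 80), ToyClient(70, 90)],
    aggregate=lambda updates: sum(updates) // len(updates),
)
```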
Two label vectors are considered similar when their difference index Er is smaller than a preset threshold; in this embodiment, the preset threshold of Er is 0.2.
Where $y$ is the total number of user tags across all social platforms, and $l_{1s}$ and $l_{2s}$ are the values of the $s$-th element in the tag vectors of the first social user and the second social user, respectively.
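The formula for Er is not reproduced in the text. The sketch below assumes one form that is consistent with the symbols $y$, $l_{1s}$ and $l_{2s}$, namely the mean absolute difference $\mathrm{Er} = \frac{1}{y}\sum_{s=1}^{y} |l_{1s} - l_{2s}|$; the actual index may be defined differently. The names `difference_index` and `recommend_pairs` are hypothetical.

```python
def difference_index(tags_a, tags_b):
    """Assumed form of Er: mean absolute difference over the y tag positions."""
    y = len(tags_a)
    return sum(abs(a - b) for a, b in zip(tags_a, tags_b)) / y


def recommend_pairs(tag_vectors, threshold=0.2):
    """Return user pairs whose difference index is below the preset threshold."""
    ids = sorted(tag_vectors)
    return [(u, v)
            for i, u in enumerate(ids)
            for v in ids[i + 1:]
            if difference_index(tag_vectors[u], tag_vectors[v]) < threshold]


tag_vectors = {
    "alice": [1, 0, 1, 0, 1, 0],
    "bob":   [1, 0, 1, 0, 1, 1],
    "carol": [0, 1, 0, 1, 0, 0],
}
pairs = recommend_pairs(tag_vectors)
```

Here alice and bob differ in one of six tags (Er of about 0.17 under the assumed formula), so they fall below the 0.2 threshold and would be recommended to each other.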
Referring to fig. 2, the embodiment further provides a privacy protection-based multi-social platform user recommendation system, which includes a plurality of participant clients and a server; each participant client comprises a preprocessing module, a training module, a first encryption module and an output module; the server comprises an initialization module, a second encryption module and an aggregation module;
the preprocessing module, the training module and the first encryption module are sequentially connected, and the training module is connected with the output module; the initialization module, the second encryption module and the aggregation module are sequentially connected; the initialization module and the aggregation module are respectively connected with the training module, and the first encryption module is connected with the second encryption module.
The preprocessing module is used for locally integrating the initial data, specifically: each social platform constructs its own social network and obtains the adjacency matrix, the feature matrix and the tag matrix used as input.
The initialization module initializes the training model and the secret key and outputs the training model and the secret key to the training module of each participant client and the second encryption module of the server;
the training module trains a training model by adopting locally integrated initial data based on the target precision set by the training module, and is also used for updating the training model based on the aggregation result of the training parameters of all the participant client sides and calculating whether the updated training model reaches the target precision set by the training module. The training process of the training model is as follows:
The feature matrix $H$ and the adjacency matrix $A$ of the platform are input, and a predicted label matrix $L'$ is output.
$L' = A H'$ (1)
$H' = \{h'_1, h'_2, \ldots, h'_i, \ldots, h'_n\}^{T}$ (2)
Wherein $W \in \mathbb{R}^{f \times y}$ and $w \in \mathbb{R}^{1 \times 2y}$ are trainable weight matrices, LeakyReLU(·) is the activation function, $\|$ is the matrix concatenation operation, $N_i$ is the set of first-order neighbor nodes of node $i$, $\alpha_{ij}$ represents the importance index of node $j$ to node $i$, $h_j$ is the feature vector of the $j$-th social user, $H'$ is the reconstructed feature matrix, and $h'_i$ is the reconstructed feature vector of node $i$.
After the predicted label matrix $L'$ is calculated, the cross entropy between the label matrix $L$ and the predicted label matrix $L'$ is calculated, and the local weight matrices $W$ and $w$ are iteratively updated by minimizing the cross entropy.
The first encryption module is used for encrypting the training parameters obtained after the participant client is trained.
The second encryption module decrypts the received encrypted training parameters, inputs the encrypted parameter ciphertext from the first encryption module of the participant client, inputs the key from the initialization module, and outputs the decrypted parameter plaintext to the aggregation module;
The aggregation module is used for aggregating the training parameters of the participant clients and returning the aggregation result to the participant clients. The aggregation process is shown by the following formula:
wherein $W'$ and $w'$ are the updated weight matrices, $z$ is the number of clients participating in training, and $W_t$ and $w_t$ are the weight matrices of the $t$-th participant client.
The output module updates the label vectors of all social users in the platform based on the trained model and mutually recommends the social users with similar label vectors. Two label vectors are considered similar when their difference index Er is smaller than a set value; in this embodiment, the set value of Er is 0.2.
Where $y$ is the total number of user tags across all social platforms, and $l_{1s}$ and $l_{2s}$ are the values of the $s$-th element in the first and second label vectors, respectively.
In the first encryption module and the second encryption module, the key initialization, encryption and decryption methods are as follows:
Key initialization process: the server generates two prime numbers $p$ and $q$ (generally 1024 bits), calculates their product $b = p \times q$ and the Euler function value $m = (p-1)(q-1)$, generates a public key $e$ such that $e$ and $m$ are relatively prime and $1 < e < m$, and generates a private key $d$ such that the remainder of $e \times d$ divided by $m$ is 1.
The encryption process is as follows: a number $a$ to be encrypted is obtained, and the ciphertext $c$ is the remainder of $a^{e}$ divided by $b$, i.e. $c = a^{e} \bmod b$.
The decryption process is as follows: a ciphertext $c$ is obtained, and the decrypted number $a$ is the remainder of $c^{d}$ divided by $b$, i.e. $a = c^{d} \bmod b$.
The above-described embodiments are merely illustrative of the preferred embodiments of the present invention, and do not limit the scope of the present invention, and various modifications and improvements of the technical solutions of the present invention can be made by those skilled in the art without departing from the spirit of the present invention, and the technical solutions of the present invention are within the scope of the present invention defined by the claims.