Node relationship acquisition method based on a signed network, and storage medium

Document No.: 8092    Published: 2021-09-17

1. A node relationship acquisition method based on a signed network, characterized in that the method comprises the following steps:

step 1: establishing a social network model, and acquiring the degree and the clustering coefficient of each node;

step 2: determining the PA index and the affinity between nodes;

step 3: determining the probability that a potential link exists between nodes;

step 4: determining the relevant feature attributes of the nodes;

step 5: fusing the feature attributes of the nodes, and determining the polarity of the relationship between nodes using a logistic regression model.

2. The node relationship acquisition method based on a signed network according to claim 1, wherein step 1 specifically comprises:

abstracting the social network data set into an undirected graph G = (V, E), wherein V represents the set of nodes in the network and E represents the set of edges in the network; an edge that does not exist in the network is expressed as (x, y) ∈ U - E, wherein x, y ∈ V and U represents the set of all possible edges in the network; and obtaining the degree and clustering coefficient attributes of each node.

3. The node relationship acquisition method based on a signed network according to claim 1, wherein the PA index between nodes in step 2 is specifically:

wherein k(x) and k(y) represent the degrees of nodes x and y, respectively.

4. The method according to claim 1, wherein the affinity between nodes in step 2 is specifically:

wherein Γ(x) and Γ(y) are the sets of neighbor nodes of node x and node y, respectively; k_x and k_y are the degrees of nodes x and y, respectively; and the 1 in the numerator indicates that an edge exists between node x and node y.

5. The node relationship acquisition method based on a signed network according to claim 1, wherein the probability that a potential link exists between nodes in step 3 is calculated as:

6. The node relationship acquisition method based on a signed network according to claim 1, wherein step 4 specifically comprises:

determining the node features, the node similarity features and the structural balance features of the nodes.

7. The method as claimed in claim 6, wherein the node features include a positive in-degree ratio, a negative in-degree ratio, a positive out-degree ratio, a negative out-degree ratio, and the PA similarity; the PA similarity is the PA index;

the positive in-degree ratio is calculated as:

the negative in-degree ratio is calculated as:

the positive out-degree ratio is calculated as:

the negative out-degree ratio is calculated as:

wherein d_in(u) represents the total in-degree of node u; d_out(u) represents the total out-degree of node u; d_in^+(u) and d_in^-(u) respectively represent the positive in-degree and the negative in-degree of node u; and d_out^+(u) and d_out^-(u) respectively represent the positive out-degree and the negative out-degree of node u.

8. The method according to claim 6, wherein the node similarity features include a positive similarity S^+(u, v) and a negative similarity S^-(u, v), calculated respectively as:

wherein W^+ represents the set of nodes that have positive links to v; W^- represents the set of nodes that have negative links to v; and sim(u, w) is the similarity between node u and node w;

the similarity sim(u, w) between nodes is calculated as:

wherein e_(u,i) and e_(w,i) are the relationship labels of the links pointing from node u and node w, respectively, to node i, and I is the set of common neighbor nodes of u and w.

9. The method according to claim 6, wherein the structural balance features are determined by negative triple and negative quadruple features extracted from the triple and quadruple attributes; the negative triple ratio of nodes u and v is calculated as:

wherein W represents the set of common neighbors of node u and node v, and |W| is the number of common neighbors of node u and node v;

the negative quadruple ratio of node u and node v is calculated as:

wherein the path-count term represents the total number of all paths of length 3 from node u to node v.

10. A storage medium, wherein the storage medium stores the node relationship acquisition method based on a signed network according to any one of claims 1 to 9.

Background

Relationship type prediction in a signed network means, simply put, inferring the latent attitude of a user node toward other nodes. This research direction can be used to provide personalized user services for enterprises or individuals, and it also has important theoretical significance and application value for further study of the topology, function and dynamic behavior of social networks.

Relationships between users in an online social network include not only explicit relationships, formed by adding friends or following one another, but also implicit relationships, whose existence is judged from the perspective of user behavior and preference by whether a similarity exceeds a given threshold. Most existing research directly ignores negative relationships (i.e. distrust relationships) and treats all existing links as positive by default; in fact, negative relationships are no less important than positive ones in a social network.

Disclosure of Invention

The invention aims to overcome the above defects of the prior art and to provide a node relationship acquisition method based on a signed network, and a storage medium, which can effectively acquire node relationships in a signed network with high accuracy.

The purpose of the invention can be achieved by the following technical solution:

A node relationship acquisition method based on a signed network comprises the following steps:

step 1: establishing a social network model, and acquiring the degree and the clustering coefficient of each node;

step 2: determining the PA index and the affinity between nodes;

step 3: determining the probability that a potential link exists between nodes;

step 4: determining the relevant feature attributes of the nodes;

step 5: fusing the feature attributes of the nodes, and determining the polarity of the relationship between nodes using a logistic regression model.

Preferably, step 1 specifically comprises:

abstracting the social network data set into an undirected graph G = (V, E), wherein V represents the set of nodes in the network and E represents the set of edges in the network; an edge that does not exist in the network is expressed as (x, y) ∈ U - E, wherein x, y ∈ V and U represents the set of all possible edges in the network; and obtaining the degree and clustering coefficient attributes of each node.
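As a minimal sketch of this step, the following Python builds the undirected graph from an edge list and computes each node's degree and local clustering coefficient, plus the set U - E of nonexistent edges. The function names and the toy graph are illustrative assumptions, not part of the original disclosure, and the clustering coefficient used is the standard local definition (the fraction of a node's neighbor pairs that are themselves connected).

```python
from itertools import combinations


def build_graph(edges):
    """Return an adjacency map {node: set(neighbors)} for an undirected graph."""
    adj = {}
    for x, y in edges:
        if x == y:
            continue  # self-loops are ignored in this sketch
        adj.setdefault(x, set()).add(y)
        adj.setdefault(y, set()).add(x)
    return adj


def degree(adj, x):
    """Degree k(x): number of neighbors of x."""
    return len(adj[x])


def clustering_coefficient(adj, x):
    """Fraction of pairs of x's neighbors that are themselves connected."""
    k = len(adj[x])
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(adj[x], 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


def nonexistent_edges(adj):
    """Node pairs in U - E, i.e. all pairs that are not connected."""
    return [(x, y) for x, y in combinations(sorted(adj), 2) if y not in adj[x]]


# Toy graph (hypothetical): a triangle a-b-c with a pendant node d attached to c.
adj = build_graph([("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")])
```

For realistic data sets a graph library would normally replace these helpers; the sketch only makes the definitions concrete.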

Preferably, the PA index between nodes in step 2 is specifically:

wherein k(x) and k(y) represent the degrees of nodes x and y, respectively.
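The formula itself is not reproduced in this text. In the link prediction literature, the PA (preferential attachment) index of a node pair is conventionally the product of the two degrees, PA(x, y) = k(x) · k(y). The sketch below assumes that conventional form; the adjacency map and function name are hypothetical.

```python
def pa_index(adj, x, y):
    """Preferential attachment score, assumed here to be k(x) * k(y)."""
    return len(adj[x]) * len(adj[y])


# Hypothetical adjacency map: triangle a-b-c with a pendant node d on c.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}

# Score the two unconnected pairs; the higher score marks the likelier link.
scores = {pair: pa_index(adj, *pair) for pair in [("a", "d"), ("b", "d")]}
```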

Preferably, the affinity between nodes in step 2 is specifically:

wherein Γ(x) and Γ(y) are the sets of neighbor nodes of node x and node y, respectively; k_x and k_y are the degrees of nodes x and y, respectively; and the 1 in the numerator indicates that an edge exists between node x and node y.

Preferably, the probability that a potential link exists between nodes in step 3 is calculated as:

Preferably, step 4 specifically comprises:

determining the node features, the node similarity features and the structural balance features of the nodes.

More preferably, the node features include a positive in-degree ratio, a negative in-degree ratio, a positive out-degree ratio, a negative out-degree ratio, and the PA similarity; the PA similarity is the PA index;

the positive in-degree ratio is calculated as:

the negative in-degree ratio is calculated as:

the positive out-degree ratio is calculated as:

the negative out-degree ratio is calculated as:

wherein d_in(u) represents the total in-degree of node u; d_out(u) represents the total out-degree of node u; d_in^+(u) and d_in^-(u) respectively represent the positive in-degree and the negative in-degree of node u; and d_out^+(u) and d_out^-(u) respectively represent the positive out-degree and the negative out-degree of node u.
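The four ratio formulas are not reproduced in this text. A natural reading of the definitions above, assumed in the following sketch, is that each ratio divides a signed degree by the corresponding total degree, for example positive in-degree ratio = d_in^+(u) / d_in(u). The edge-list format, the function name and the convention of returning 0 for a node with no links in a given direction are all illustrative assumptions.

```python
from collections import Counter


def sign_ratios(signed_edges, u):
    """Signed degree ratios of node u.

    signed_edges: iterable of (source, target, sign) with sign +1 or -1.
    """
    counts = Counter()
    for src, dst, sign in signed_edges:
        if dst == u:
            counts["in+" if sign > 0 else "in-"] += 1
        if src == u:
            counts["out+" if sign > 0 else "out-"] += 1

    d_in = counts["in+"] + counts["in-"]      # total in-degree d_in(u)
    d_out = counts["out+"] + counts["out-"]   # total out-degree d_out(u)

    def ratio(part, total):
        # Assumed convention: a node with no links in this direction gets 0.
        return part / total if total else 0.0

    return {
        "pos_in": ratio(counts["in+"], d_in),
        "neg_in": ratio(counts["in-"], d_in),
        "pos_out": ratio(counts["out+"], d_out),
        "neg_out": ratio(counts["out-"], d_out),
    }


# Hypothetical signed, directed edges around node "u".
edges = [("a", "u", 1), ("b", "u", -1), ("c", "u", 1), ("u", "a", -1)]
ratios = sign_ratios(edges, "u")
```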

More preferably, the node similarity features include a positive similarity S^+(u, v) and a negative similarity S^-(u, v), calculated respectively as:

wherein W^+ represents the set of nodes that have positive links to v; W^- represents the set of nodes that have negative links to v; and sim(u, w) is the similarity between node u and node w;

the similarity sim(u, w) between nodes is calculated as:

wherein e_(u,i) and e_(w,i) are the relationship labels of the links pointing from node u and node w, respectively, to node i, and I is the set of common neighbor nodes of u and w.

More preferably, the structural balance features are determined by negative triple and negative quadruple features extracted from the triple and quadruple attributes; the negative triple ratio of nodes u and v is calculated as:

wherein W represents the set of common neighbors of node u and node v, and |W| is the number of common neighbors of node u and node v;

the negative quadruple ratio of node u and node v is calculated as:

wherein the path-count term represents the total number of all paths of length 3 from node u to node v.
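The two ratio formulas are not reproduced in this text, so their numerators cannot be stated here. The sketch below therefore computes only the two counting quantities that the text defines: the common-neighbor set W (and |W|) and the number of paths of length 3 from u to v. It assumes simple paths (no repeated node) over the undirected structure, ignoring link direction and sign; the function names and the toy graph are hypothetical.

```python
def common_neighbors(adj, u, v):
    """Set W of nodes adjacent to both u and v."""
    return adj[u] & adj[v]


def count_paths_len3(adj, u, v):
    """Number of simple paths u - a - b - v (three edges, four distinct nodes)."""
    count = 0
    for a in adj[u]:
        if a == v:
            continue  # a must be an intermediate node, not the endpoint
        for b in adj[a]:
            if b == u or b == v:
                continue  # b must differ from both endpoints
            if v in adj[b]:
                count += 1
    return count


# Hypothetical undirected graph with edges u-a, a-b, b-v, u-c, c-v, a-c.
adj = {
    "u": {"a", "c"},
    "a": {"u", "b", "c"},
    "b": {"a", "v"},
    "c": {"u", "v", "a"},
    "v": {"b", "c"},
}
```

In this toy graph the only common neighbor of u and v is c, and the two length-3 paths are u-a-b-v and u-a-c-v.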

A storage medium, wherein the storage medium stores any one of the above node relationship acquisition methods based on a signed network.

Compared with the prior art, the invention has the following beneficial effects:

Node relationships in a signed network are acquired: the node relationship acquisition method of the invention makes full use of the topological features and node similarity attributes of a signed social network and provides a negative relationship mining technique based on the signed network. Because existing link prediction techniques pay little attention to negative relationships in the network, the method fuses the attributes of the nodes with the similarity features between nodes, explores features suited to relationship type prediction in view of the latent connection between positive and negative relationships, and thereby judges the relationship type effectively and with high accuracy.

Drawings

FIG. 1 is a flow chart of the node relationship acquisition method of the present invention;

FIG. 2 is a graph comparing the link prediction AUC values of the signed-network-based negative relationship mining method of the embodiment of the present invention and the reference algorithms on 3 data sets;

FIG. 3 is a graph comparing the relationship type prediction F1 and AUC values of the signed-network-based negative relationship mining method of the embodiment of the present invention and the reference algorithms on 3 data sets;

wherein FIG. 3(a) compares the F1 values of the prediction results, and FIG. 3(b) compares the AUC values of the prediction results.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, shall fall within the scope of protection of the present invention.

A node relationship acquisition method based on a signed network, the flow of which is shown in FIG. 1, includes:

step 1: establishing a social network model, and acquiring the degree and the clustering coefficient of each node;

abstracting the social network data set into an undirected graph G = (V, E), wherein V represents the set of nodes in the network and E represents the set of edges in the network; an edge that does not exist in the network is expressed as (x, y) ∈ U - E, wherein x, y ∈ V and U represents the set of all possible edges in the network; and obtaining the degree and clustering coefficient attributes of each node;

step 2: determining the PA index and the affinity between nodes;

the PA index between nodes is specifically:

wherein k(x) and k(y) represent the degrees of nodes x and y, respectively;

the affinity between nodes is specifically:

wherein Γ(x) and Γ(y) are the sets of neighbor nodes of node x and node y, respectively; k_x and k_y are the degrees of nodes x and y, respectively; and the 1 in the numerator indicates that an edge exists between node x and node y;

step 3: determining the probability that a potential link exists between nodes;

the probability that a potential link exists between nodes is calculated as:

step 4: determining the relevant feature attributes of the nodes;

determining the node features, the node similarity features and the structural balance features of the nodes;

the node features include a positive in-degree ratio, a negative in-degree ratio, a positive out-degree ratio, a negative out-degree ratio, and the PA similarity; the PA similarity is the PA index;

the positive in-degree ratio is calculated as:

the negative in-degree ratio is calculated as:

the positive out-degree ratio is calculated as:

the negative out-degree ratio is calculated as:

wherein d_in(u) represents the total in-degree of node u; d_out(u) represents the total out-degree of node u; d_in^+(u) and d_in^-(u) respectively represent the positive in-degree and the negative in-degree of node u; and d_out^+(u) and d_out^-(u) respectively represent the positive out-degree and the negative out-degree of node u;

the node similarity features include a positive similarity S^+(u, v) and a negative similarity S^-(u, v), calculated respectively as:

wherein W^+ represents the set of nodes that have positive links to v; W^- represents the set of nodes that have negative links to v; and sim(u, w) is the similarity between node u and node w;

the similarity sim(u, w) between nodes is calculated as:

wherein e_(u,i) and e_(w,i) are the relationship labels of the links pointing from node u and node w, respectively, to node i, and I is the set of common neighbor nodes of u and w;

the structural balance features are determined by negative triple and negative quadruple features extracted from the triple and quadruple attributes; the negative triple ratio of nodes u and v is calculated as:

wherein W represents the set of common neighbors of node u and node v, and |W| is the number of common neighbors of node u and node v;

the negative quadruple ratio of node u and node v is calculated as:

wherein the path-count term represents the total number of all paths of length 3 from node u to node v;

step 5: fusing the feature attributes of the nodes, determining the polarity of the relationship between nodes using a logistic regression model, and inferring whether the sign e_uv of a given edge e(u, v) is negative.
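The fusion in step 5 can be sketched as a binary logistic regression that maps a per-edge feature vector to the probability that the edge is negative. The following is a minimal, dependency-free illustration trained by batch gradient descent; the two feature columns, the toy values, the learning rate and the 0.5 decision threshold are all hypothetical, and a practical system would more likely rely on an established library implementation.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_logreg(X, y, lr=0.5, epochs=2000):
    """Fit weights w and bias b by batch gradient descent on the log loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_negative(w, b, x, threshold=0.5):
    """True if the model judges the edge with feature vector x to be negative."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= threshold


# Hypothetical features per edge (u, v): [negative out-degree ratio of u,
# negative in-degree ratio of v]; label 1 means the edge is negative.
X = [[0.9, 0.8], [0.8, 0.7], [0.7, 0.9], [0.1, 0.2], [0.2, 0.1], [0.3, 0.2]]
y = [1, 1, 1, 0, 0, 0]
w, b = train_logreg(X, y)
predictions = [predict_negative(w, b, x) for x in X]
```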

The effect of the node relationship acquisition method in this embodiment can be further explained by the following experiments.

The experimental conditions are as follows: the experiments were carried out on the JetBrains PyCharm Community platform, on an Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz 2.0GHz running Windows 10.

The experimental contents are as follows: the method of the invention and 5 existing techniques, namely the Common Neighbors (CN), Adamic-Adar (AA), Preferential Attachment (PA), Jaccard and Resource Allocation (RA) algorithms, were each used for link prediction and relationship type prediction experiments on three signed network data sets: Bitcoin-Alpha, Bitcoin-OTC and Slashdot.

Experiment 1: link prediction.

In this experiment 10% of the edges are set aside as the test set. In each trial, one edge is drawn at random from the test set and one from the set of nonexistent edges, and the similarity scores of the two are compared: 1 is added if the test-set edge scores higher than the nonexistent edge, and 0.5 is added if the two scores are equal. After n independent trials, the accumulated score averaged over the n trials gives the AUC, which is used as the evaluation metric. The results of the experiment are shown in FIG. 2. As FIG. 2 shows, the method of the present invention (PACD) achieves significantly higher prediction accuracy than the 5 reference algorithms.
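The sampling procedure described above can be sketched as follows. The function name, the default number of trials and the fixed seed are illustrative assumptions; the scores passed in would be the similarity scores of the test-set edges and of the nonexistent edges under whichever index is being evaluated.

```python
import random


def sampled_auc(test_scores, nonexistent_scores, n=10000, seed=0):
    """Estimate AUC by n independent comparisons of randomly drawn scores.

    Each trial draws one test-set edge score and one nonexistent-edge score,
    adds 1 if the test edge scores higher and 0.5 on a tie, then averages.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        s_test = rng.choice(test_scores)
        s_none = rng.choice(nonexistent_scores)
        if s_test > s_none:
            total += 1.0
        elif s_test == s_none:
            total += 0.5
    return total / n
```

An index that ranks every held-out edge above every nonexistent edge scores 1.0, while an index that cannot tell them apart scores about 0.5.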

Experiment 2: relationship type prediction.

In this experiment 10% of the data is randomly drawn from the original data set as the test set and the remaining data is used to train the model; once training is complete, the model is evaluated on the test data. The whole process is repeated 10 times to ensure that the evaluation results are reliable. Finally, the method proposed by the invention is compared with 3 existing sign prediction methods. The results of the experiment are shown in FIG. 3. As FIG. 3 shows, the proposed relationship type prediction model Ne-LP achieves nearly the best performance on all three data sets, which indicates that the invention selects suitable network topology attributes and successfully applies the relevant social theory to the model, so that prediction performance is improved to a certain extent.
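A minimal sketch of this evaluation protocol is given below: a random 10% hold-out split that can be repeated with different seeds, and the F1 score used in FIG. 3(a). The function names are hypothetical, and which class counts as "positive" for the F1 computation is an assumption, since the text does not state it.

```python
import random


def holdout_split(edges, test_fraction=0.1, rng=None):
    """Randomly split edges into (train, test) with the given test fraction."""
    rng = rng or random.Random()
    shuffled = list(edges)
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the chosen positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Ten repetitions with different seeds, mirroring the repeated protocol.
edges = list(range(50))  # stand-ins for labelled edges
splits = [holdout_split(edges, rng=random.Random(seed)) for seed in range(10)]
```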

This embodiment also relates to a storage medium storing any one of the above node relationship acquisition methods.

While the invention has been described with reference to specific embodiments, the invention is not limited thereto, and various equivalent modifications and substitutions can be easily made by those skilled in the art within the technical scope of the invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
