Method, system, computer and storage medium for quantifying employee cooperation strength
1. An employee partnership strength quantification method, characterized by comprising the following steps:
an interaction sequence obtaining step, namely obtaining session interaction data of a target employee, encrypting the session interaction data, dividing the session interaction data into a plurality of session units according to a preset segmentation unit, and outputting the session units as an employee interaction sequence;
a sequence model obtaining step, namely constructing and training a sequence model through a sequence modeling method based on the employee interaction sequence;
a sequence vector obtaining step, namely obtaining a sequence vector of the employee interaction sequence based on the sequence model;
and a relationship intimacy calculation step, namely storing the sequence vector in a multi-dimensional vector database, calculating employee relationship intimacy based on the sequence vector to obtain an intimacy score, and quantifying the employee cooperative relationship strength based on the intimacy score.
2. The employee partnership strength quantification method according to claim 1, wherein the sequence model is a BERT model, and the sequence model obtaining step further comprises:
a BERT lexicon building step, namely representing all employees of a target enterprise as an employee dictionary library;
a sequence preprocessing step, namely constructing the employee interaction sequence into the BERT standard training data format to obtain session sequence data for training the model;
and a model training step, namely training the BERT model based on the session sequence data.
3. The employee partnership strength quantification method according to claim 1 or 2, further comprising:
an employee relationship intimacy query step, namely retrieving the relationship intimacy of the target employee based on a query request of a user, and outputting a relationship intimacy ranking list sorted by intimacy score.
4. The employee partnership strength quantification method according to claim 1 or 2, further comprising:
a model iteration step, namely obtaining incremental session interaction data for a preset incremental period, and iteratively training the sequence model based on the incremental data;
and a relationship intimacy dynamic calculation step, namely obtaining a sequence vector from the sequence model and dynamically calculating the employee relationship intimacy.
5. An employee partnership strength quantification system, comprising:
an interaction sequence acquisition module, configured to acquire session interaction data of a target employee, encrypt the session interaction data, divide the session interaction data into a plurality of session units according to a preset segmentation unit, and output the session units as an employee interaction sequence;
a sequence model acquisition module, configured to construct and train a sequence model through a sequence modeling method based on the employee interaction sequence;
a sequence vector acquisition module, configured to acquire a sequence vector of the employee interaction sequence based on the sequence model;
and a relationship intimacy calculation module, configured to store the sequence vector in a multi-dimensional vector database, calculate employee relationship intimacy based on the sequence vector to obtain an intimacy score, and quantify the employee cooperative relationship strength based on the intimacy score.
6. The employee partnership strength quantification system according to claim 5, wherein the sequence model is a BERT model, and the sequence model acquisition module further comprises:
a BERT lexicon building module, configured to represent all employees of a target enterprise as an employee dictionary library;
a sequence preprocessing module, configured to construct the employee interaction sequence into the BERT standard training data format to obtain session sequence data for training the model;
and a model training module, configured to train the BERT model based on the session sequence data.
7. The employee partnership strength quantification system according to claim 5 or 6, further comprising:
an employee relationship intimacy query module, configured to retrieve the relationship intimacy of the target employee based on a query request of a user, and to output a relationship intimacy ranking list sorted by intimacy score.
8. The employee partnership strength quantification system according to claim 5 or 6, further comprising:
a model iteration module, configured to acquire incremental session interaction data for a preset incremental period and to iteratively train the sequence model based on the incremental data;
and a relationship intimacy dynamic calculation module, configured to acquire a sequence vector from the sequence model and dynamically calculate the employee relationship intimacy.
9. A computer device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor when executing the computer program implements the employee partnership strength quantification method of any one of claims 1 to 4.
10. A computer-readable storage medium on which a computer program is stored, the program, when executed by a processor, implementing the employee partnership strength quantification method as claimed in any one of claims 1 to 4.
Background
Social networks originated from online social networking, whose starting point was email. Today, social networking takes place on platforms including, but not limited to, enterprise WeChat, QQ, and microblogs. The core of a social network is the users who participate in it and the relationships among them. From an enterprise management perspective, characterizing the intimacy of employee relationships gives a better understanding of the cooperation strength and cooperation patterns of employees within the enterprise, which in turn helps promote cooperation among employees and improve the operating efficiency of the enterprise.
Currently, calculating an employee intimacy index from enterprise session data is an effective way to quantify employee partnership strength. The existing technical scheme mainly uses simple rules: it counts the conversation frequency between employees and the interaction frequency in chat groups within the enterprise, and aggregates these into the communication and interaction frequency of all employees in order to characterize employee intimacy.
This approach is simple, fast, and convenient, but it relies on statistical aggregation over the data. When the data volume is large and the data grows quickly, the computation cost rises accordingly. Moreover, the original data must be stored, and the storage cost increases rapidly as the data keeps growing.
Disclosure of Invention
The embodiment of the application provides a method, a system, computer equipment and a computer readable storage medium for quantifying the intensity of the cooperative relationship of employees, so as to at least solve the problems of high computing cost and high storage cost in the related art.
In a first aspect, an embodiment of the present application provides a method for quantifying employee partnership strength, including:
an interaction sequence obtaining step, namely obtaining session interaction data of a target employee, encrypting the session interaction data, dividing the session interaction data into a plurality of session units according to a preset segmentation unit, and outputting the session units as an employee interaction sequence in chronological order;
a sequence model obtaining step, namely constructing and training a sequence model through a sequence modeling method based on the employee interaction sequence; specifically, the sequence model includes one or any combination of a BERT (Bidirectional Encoder Representations from Transformers) model, a Word2Vec model, and a GloVe model.
A sequence vector obtaining step, namely obtaining a sequence vector of the employee interaction sequence based on the sequence model;
and a relationship intimacy calculation step, namely storing the sequence vector in a multi-dimensional vector database, calculating employee relationship intimacy based on the sequence vector to obtain an intimacy score, and quantifying the employee cooperative relationship strength based on the intimacy score.
Based on the above steps, the embodiment of the application evaluates employee cooperative relationship strength from relationship intimacy: the session interaction data is processed into an employee interaction sequence, and employee relationships are expressed as vectors by the sequence model, which greatly reduces the cost of storing the original data. In addition, unlike the existing calculation approach, the method places no limit on the data volume; a larger data volume even improves the model training effect. This reduces the computation cost and addresses the problem that computation cost grows with massive data.
In some embodiments, the sequence model is a BERT model, and the sequence model obtaining step further includes:
a BERT lexicon building step, namely representing all employees of a target enterprise as an employee dictionary library;
a sequence preprocessing step, namely constructing the employee interaction sequence into the BERT standard training data format to obtain session sequence data for training the model; specifically, each sequence is represented by four parts: bert_input, bert_label, segment_label, and is_next. The bert_input part holds the employee interaction sequence split into two parts, with some employees in the split sequence randomly hidden, forming two sub-interaction sequences that contain the hidden mark Mask; bert_label identifies the employees that the Mask marks in the sub-interaction sequences actually represent; segment_label distinguishes the two split sub-interaction sequences from each other; is_next indicates whether the sub-interaction sequences in bert_input are contiguous.
And a model training step, namely training the BERT model based on the session sequence data.
Based on the above steps, the sequence model of the embodiment of the application is built and trained on the employee interaction sequence, so that the employee interaction sequence can be conveniently represented as vectors by the sequence model, which reduces the data storage cost.
In some embodiments, the method further comprises:
an employee relationship intimacy query step, namely retrieving the relationship intimacy of the target employee based on a query request of a user, and outputting a relationship intimacy ranking list sorted by intimacy score.
In some embodiments, the method further comprises:
a model iteration step, namely obtaining incremental session interaction data for a preset incremental period, and iteratively training the sequence model based on the incremental data;
and a relationship intimacy dynamic calculation step, namely obtaining a sequence vector from the sequence model and dynamically calculating the employee relationship intimacy.
Based on the above steps, the embodiment of the application dynamically calculates employee relationship intimacy through model iteration. When data grows quickly, the intimacy is updated rapidly and directly by iterating the model, which further facilitates employee relationship analysis, such as statistical analysis of how employee relationships change over time and whether the change is related to a cooperation project.
In a second aspect, an embodiment of the present application provides an employee partnership strength quantification system, including:
an interaction sequence acquisition module, configured to acquire session interaction data of a target employee, encrypt the session interaction data, divide the session interaction data into a plurality of session units according to a preset segmentation unit, and output the session units as an employee interaction sequence in chronological order;
a sequence model acquisition module, configured to construct and train a sequence model through a sequence modeling method based on the employee interaction sequence; specifically, the sequence model includes one or any combination of a BERT model, a Word2Vec model, and a GloVe model.
A sequence vector acquisition module, configured to acquire a sequence vector of the employee interaction sequence based on the sequence model;
and a relationship intimacy calculation module, configured to store the sequence vector in a multi-dimensional vector database, calculate employee relationship intimacy based on the sequence vector to obtain an intimacy score, and quantify the employee cooperative relationship strength based on the intimacy score.
Based on the above structure, the embodiment of the application evaluates employee cooperative relationship strength from relationship intimacy: the session interaction data is processed into an employee interaction sequence, and employee relationships are expressed as vectors by the sequence model, which greatly reduces the cost of storing the original data. In addition, unlike the existing calculation approach, the system places no limit on the data volume; a larger data volume even improves the model training effect. This reduces the computation cost and addresses the problem that computation cost grows with massive data.
In some embodiments, the sequence model is a BERT model, and the sequence model acquisition module further includes:
a BERT lexicon building module, configured to represent all employees of a target enterprise as an employee dictionary library;
a sequence preprocessing module, configured to construct the employee interaction sequence into the BERT standard training data format to obtain session sequence data for training the model; specifically, each sequence is represented by four parts: bert_input, bert_label, segment_label, and is_next. The bert_input part holds the employee interaction sequence split into two parts, with some employees in the split sequence randomly hidden, forming two sub-interaction sequences that contain the hidden mark Mask; bert_label identifies the employees that the Mask marks in the sub-interaction sequences actually represent; segment_label distinguishes the two split sub-interaction sequences from each other; is_next indicates whether the sub-interaction sequences in bert_input are contiguous.
And a model training module, configured to train the BERT model based on the session sequence data.
Based on the above structure, the sequence model of the embodiment of the application is built and trained on the employee interaction sequence, so that the employee interaction sequence can be conveniently represented as vectors by the sequence model, which reduces the data storage cost.
In some of these embodiments, the system further comprises:
an employee relationship intimacy query module, configured to retrieve the relationship intimacy of the target employee based on a query request of a user, and to output a relationship intimacy ranking list sorted by intimacy score.
In some of these embodiments, the system further comprises:
a model iteration module, configured to acquire incremental session interaction data for a preset incremental period and to iteratively train the sequence model based on the incremental data;
and a relationship intimacy dynamic calculation module, configured to acquire a sequence vector from the sequence model and dynamically calculate the employee relationship intimacy.
Based on the above structure, the embodiment of the application dynamically calculates employee relationship intimacy through model iteration. When data grows quickly, the intimacy is updated rapidly and directly by iterating the model, which further facilitates employee relationship analysis, such as statistical analysis of how employee relationships change over time and whether the change is related to a cooperation project.
In a third aspect, an embodiment of the present application provides a computer device, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, and when the processor executes the computer program, the method for quantifying the employee partnership strength as described in the first aspect above is implemented.
In a fourth aspect, the present application provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the method for quantifying the employee partnership strength as described in the first aspect above.
Compared with the related art, the employee partnership strength quantification method, system, computer device, and computer-readable storage medium provided by the embodiments of the application relate in particular to marketing intelligence technology. Employee relationship intimacy is calculated on encrypted data, which effectively protects data security and privacy. By representing session interaction data as vectors, the embodiments address the high data storage cost and high computation cost caused by large data volumes and fast data growth in the current big data environment, and effectively reduce both costs.
The details of one or more embodiments of the application are set forth in the accompanying drawings and the description below to provide a more thorough understanding of the application.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
FIG. 1 is a flow chart of a method for employee partnership strength quantification according to an embodiment of the present application;
FIG. 2 is a preferred flow chart of a method for employee partnership strength quantification according to an embodiment of the present application;
FIG. 3 is another preferred flow chart of a method for employee partnership strength quantification according to an embodiment of the present application;
FIG. 4 is a flow chart of a method for employee partnership strength quantification in accordance with a preferred embodiment of the present application;
FIG. 5 is a diagram of conversational interaction data, according to a preferred embodiment of the present application;
FIG. 6 is a schematic diagram of an employee interaction sequence in accordance with a preferred embodiment of the present application;
FIG. 7 is a schematic diagram of a library of Bert words in accordance with a preferred embodiment of the present application;
FIG. 8 is a schematic diagram of a sequence vector according to a preferred embodiment of the present application;
FIG. 9 is a schematic diagram of the principle of some sub-steps of an employee partnership strength quantification method according to a preferred embodiment of the present application;
FIG. 10 is a schematic diagram of the principle of other sub-steps of an employee partnership strength quantification method according to a preferred embodiment of the present application;
FIG. 11 is a block diagram of an employee partnership strength quantification system according to an embodiment of the present application;
FIG. 12 is a block diagram of a preferred structure of an employee partnership strength quantification system according to an embodiment of the present application.
Wherein:
1. an interaction sequence acquisition module; 2. a sequence model acquisition module; 3. a sequence vector acquisition module;
4. a relationship intimacy calculation module; 5. an employee relationship intimacy query module; 6. a model iteration module;
7. a relationship intimacy dynamic calculation module;
201. a BERT lexicon building module; 202. a sequence preprocessing module; 203. a model training module.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be described and illustrated below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments provided in the present application without any inventive step are within the scope of protection of the present application.
It is obvious that the drawings in the following description are only examples or embodiments of the present application, and that it is also possible for a person skilled in the art to apply the present application to other similar contexts on the basis of these drawings without inventive effort. Moreover, it should be appreciated that in the development of any such actual implementation, as in any engineering or design project, numerous implementation-specific decisions must be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which may vary from one implementation to another.
Reference in the specification to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the specification. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of ordinary skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments without conflict.
Unless defined otherwise, technical or scientific terms referred to herein shall have the ordinary meaning as understood by those of ordinary skill in the art to which this application belongs. References to "a," "an," "the," and similar words throughout this application are not to be construed as limiting in number, and may refer to the singular or the plural. The terms "including," "comprising," "having," and any variations thereof, as used in this application, are intended to cover non-exclusive inclusion; for example, a process, method, system, article, or apparatus that comprises a list of steps or modules (elements) is not limited to the listed steps or elements, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus. References to "connected," "coupled," and the like in this application are not intended to be limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. The term "plurality" as referred to herein means two or more. "And/or" describes an association relationship of associated objects and means that three relationships may exist; for example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship. The terms "first," "second," "third," and the like herein merely distinguish similar objects and do not denote a particular ordering of the objects.
Implicit interaction: implicit interactions are extracted from the "mention" ("@") and "forward" behaviors of users. When such an interaction ("mention" or "forward") occurs between users, the probability that an association is established between those users increases.
The employee partnership strength quantification method is proposed in order to calculate employee intimacy from massive employee session interaction data, to cope with very large data volumes and data increments, to reduce the data storage cost and computation cost, and to take into account the data security and privacy issues that employee session interaction data involves.
The embodiment provides a method for quantifying the intensity of the cooperative relationship of the employees. Fig. 1 is a flowchart of an employee partnership strength measurement method according to an embodiment of the present application, and as shown in fig. 1, the flowchart includes the following steps:
an interaction sequence obtaining step S1, obtaining session interaction data of a target employee, encrypting the session interaction data, dividing the session interaction data into a plurality of session units according to a preset segmentation unit, and outputting the session units as an employee interaction sequence in chronological order; specifically, encrypting the session interaction data means replacing each employee name with an anonymous ID, for example and without limitation by using the MD5 (Message-Digest Algorithm 5) algorithm; MD5 is a one-way hash algorithm, so that privacy protection and data security are effectively implemented. Optionally, the preset segmentation unit may be a day, week, month, year, or the like, and the embodiment of the application supports dividing the single chat data and the group chat data in the session interaction data into session units using either the same preset segmentation unit or different preset segmentation units.
A sequence model obtaining step S2, constructing and training a sequence model through a sequence modeling method based on the employee interaction sequence; specifically, the sequence model includes one or any combination of a BERT model, a Word2Vec model, and a GloVe model.
A sequence vector obtaining step S3, obtaining a sequence vector of the employee interaction sequence based on the sequence model; the sequence vector obtained in this step may be a dense vector that represents each employee in 32 bits.
And a relationship intimacy calculation step S4, wherein the sequence vector is stored in a multi-dimensional vector database, the employee relationship intimacy is calculated based on the sequence vector to obtain an intimacy score, and the employee cooperative relationship strength is quantified based on the intimacy score. Optionally, the multi-dimensional vector database may be, for example, FAISS or Annoy. Optionally, the relationship intimacy of the target employee may be calculated using, but not limited to, vector similarity.
Based on the above steps, the embodiment of the application evaluates employee cooperative relationship strength from relationship intimacy: the session interaction data is processed into an employee interaction sequence, and employee relationships are expressed as vectors by the sequence model, which greatly reduces the cost of storing the original data. In addition, unlike the existing calculation approach, the method places no limit on the data volume; a larger data volume even improves the model training effect. This reduces the computation cost and addresses the problem that computation cost grows with massive data.
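As an illustration of step S1 only, the following minimal Python sketch replaces employee names with MD5-based anonymous IDs and divides session records into per-day session units in chronological order. The record layout (timestamp, sender, receiver) and the function names `anonymize` and `build_interaction_sequence` are assumptions made for this example and are not specified by the embodiment.

```python
import hashlib
from collections import defaultdict
from datetime import datetime


def anonymize(name: str) -> str:
    """Replace an employee name with an anonymous ID (MD5 hex digest)."""
    return hashlib.md5(name.encode("utf-8")).hexdigest()


def build_interaction_sequence(records):
    """Split records into per-day session units, ordered by time.

    `records` is an iterable of (timestamp, sender, receiver) tuples,
    a hypothetical layout chosen for this sketch. The result is a list
    of session units, one per day in chronological order; each unit
    lists anonymous IDs in the order in which the employees interacted.
    """
    units = defaultdict(list)
    for ts, sender, receiver in sorted(records, key=lambda r: r[0]):
        # The day is used here as the preset segmentation unit.
        units[ts.date()].extend([anonymize(sender), anonymize(receiver)])
    return [units[day] for day in sorted(units)]


records = [
    (datetime(2021, 3, 2, 14, 15), "alice", "carol"),
    (datetime(2021, 3, 1, 9, 30), "alice", "bob"),
    (datetime(2021, 3, 1, 10, 0), "bob", "carol"),
]
sequence = build_interaction_sequence(records)
```

A week or month unit could be obtained in the same way by grouping on a different key derived from the timestamp.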
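The intimacy calculation of step S4 can be sketched in the same hedged way. The example below assumes cosine similarity as the vector similarity measure and uses an in-memory dictionary in place of the multi-dimensional vector database; both choices, and the names `cosine_similarity` and `intimacy_scores`, are assumptions for illustration, since the embodiment only requires that some vector similarity be used.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # define similarity with a zero vector as 0
    return dot / (norm_u * norm_v)


def intimacy_scores(vectors, target):
    """Score every other employee against `target`.

    `vectors` maps anonymous employee IDs to their sequence vectors;
    it stands in for the vector database in this sketch.
    """
    return {
        other: cosine_similarity(vectors[target], vec)
        for other, vec in vectors.items()
        if other != target
    }


vectors = {
    "id_a": [1.0, 0.0, 1.0],
    "id_b": [1.0, 0.0, 0.9],
    "id_c": [0.0, 1.0, 0.0],
}
scores = intimacy_scores(vectors, "id_a")
```

In this toy data, `id_b` points in nearly the same direction as `id_a` and therefore receives a higher score than `id_c`.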
In some embodiments, the sequence model is a BERT model, and the sequence model obtaining step S2 further includes:
a BERT lexicon building step S201, wherein all employees of a target enterprise are represented as an employee dictionary library; optionally, the BERT lexicon may sort the employees by occurrence frequency and assign dictionary sequence numbers in ascending or descending order, or may directly assign each employee a unique sequence number.
A sequence preprocessing step S202, constructing the employee interaction sequence into the BERT standard training data format to obtain session sequence data for training the model; specifically, each sequence is represented by four parts: bert_input, bert_label, segment_label, and is_next. The bert_input part holds the employee interaction sequence split into two parts, with some employees in the split sequence randomly hidden, forming two sub-interaction sequences that contain the hidden mark Mask; bert_label identifies the employees that the Mask marks in the sub-interaction sequences actually represent; segment_label distinguishes the two split sub-interaction sequences from each other, and optionally the former block is marked with 0, for example 0000, and the latter block is marked with 1, for example 1111; is_next indicates whether the sub-interaction sequences in bert_input are contiguous, and optionally its value is configured as 0 or 1, where 0 indicates that the sub-interaction sequences are not contiguous and 1 indicates that they are.
A model training step S203, training the BERT model based on the session sequence data.
Based on the above steps, the sequence model of the embodiment of the application is built and trained on the employee interaction sequence, so that the employee interaction sequence can be conveniently represented as vectors by the sequence model, which reduces the data storage cost.
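The lexicon building of step S201 can be sketched as follows, under the assumption that employees are ordered by descending occurrence frequency. The reserved `[PAD]` and `[MASK]` entries and the function name `build_lexicon` are assumptions added for this example; the embodiment does not name any special entries.

```python
from collections import Counter


def build_lexicon(sequences, specials=("[PAD]", "[MASK]")):
    """Map each employee ID to a dictionary sequence number.

    Employees are numbered by descending occurrence frequency. The
    special entries come first; they are an assumption of this sketch.
    """
    counts = Counter(emp for seq in sequences for emp in seq)
    # Most frequent first; ties broken by ID so the result is deterministic.
    ordered = sorted(counts, key=lambda emp: (-counts[emp], emp))
    lexicon = {token: index for index, token in enumerate(specials)}
    for emp in ordered:
        lexicon[emp] = len(lexicon)
    return lexicon


sequences = [["e1", "e2", "e1"], ["e3", "e1", "e2"]]
lexicon = build_lexicon(sequences)
```

Directly assigning unique sequence numbers without frequency ordering, which the embodiment also allows, would only change the sort key.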
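The four-part training format of step S202 can be illustrated with a minimal sketch. The masking probability of 0.15, the use of `None` for positions that are not hidden, the way a non-contiguous pair is formed from a second sequence, and the function name `make_training_sample` are all assumptions for this example rather than details given by the embodiment.

```python
import random

MASK = "[MASK]"


def make_training_sample(sequence, other=None, mask_prob=0.15, rng=None):
    """Build one sample with bert_input, bert_label, segment_label, is_next.

    `sequence` is a list of employee tokens. If `other` is given, the
    second half is taken from it instead, giving a non-contiguous pair.
    """
    if rng is None:
        rng = random.Random()
    half = len(sequence) // 2
    first = sequence[:half]
    if other is None:
        second, is_next = sequence[half:], 1  # contiguous halves
    else:
        second, is_next = other[len(other) // 2:], 0  # unrelated second half

    bert_input, bert_label = [], []
    for token in first + second:
        if rng.random() < mask_prob:
            bert_input.append(MASK)   # hide this employee
            bert_label.append(token)  # remember who was hidden
        else:
            bert_input.append(token)
            bert_label.append(None)   # nothing to predict here
    # 0 marks the former block, 1 marks the latter block.
    segment_label = [0] * len(first) + [1] * len(second)
    return {
        "bert_input": bert_input,
        "bert_label": bert_label,
        "segment_label": segment_label,
        "is_next": is_next,
    }


sample = make_training_sample(["e1", "e2", "e3", "e4"], rng=random.Random(0))
```

Before training, the tokens would be converted to dictionary sequence numbers using the employee dictionary library.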
The embodiment also provides a method for quantifying the intensity of the employee cooperative relationship. Fig. 2 is a preferred flowchart of an employee partnership strength measurement method according to an embodiment of the present application, and as shown in fig. 2, the flowchart includes all the steps of the above embodiment, and further includes the following steps:
an employee relationship intimacy query step S5, wherein the relationship intimacy of the target employee is retrieved based on a query request of a user, and a relationship intimacy ranking list is output, sorted by intimacy score. Notably, to protect data security and privacy, the employees in the intimacy ranking list appear as anonymous IDs, but downstream business applications can convert the anonymous IDs back to names based on the employee dictionary library.
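A minimal sketch of the query in step S5 is given below. It assumes that the intimacy scores of a target employee are already available as a mapping from anonymous ID to score; the names `rank_intimacy`, `to_names`, and `id_to_name` are hypothetical.

```python
def rank_intimacy(scores, top_k=None):
    """Return (anonymous_id, score) pairs sorted by descending score."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked if top_k is None else ranked[:top_k]


def to_names(ranked, id_to_name):
    """Convert anonymous IDs to names for a downstream consumer.

    `id_to_name` stands in for a lookup derived from the employee
    dictionary library; IDs without an entry are left unchanged.
    """
    return [(id_to_name.get(anon_id, anon_id), score) for anon_id, score in ranked]


scores = {"id_b": 0.91, "id_c": 0.12, "id_d": 0.57}
ranking = rank_intimacy(scores, top_k=2)
named = to_names(ranking, {"id_b": "Bob"})
```

The ranking list itself contains only anonymous IDs; the name conversion is a separate step that a downstream application may choose to apply.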
The embodiment also provides a method for quantifying the intensity of the employee cooperative relationship. Fig. 3 is a preferred flowchart of an employee partnership strength measurement method according to an embodiment of the present application, and as shown in fig. 3, the flowchart includes the following steps in addition to the steps of the above embodiment:
a model iteration step S6, obtaining incremental session interaction data for a preset incremental period, and iteratively training the sequence model based on the incremental data;
and a relationship intimacy dynamic calculation step S7, namely obtaining a sequence vector from the sequence model and dynamically calculating the employee relationship intimacy.
Based on the above steps, the embodiment of the application dynamically calculates employee relationship intimacy through model iteration. When data grows quickly, the intimacy is updated rapidly and directly by iterating the model, which further facilitates employee relationship analysis, such as statistical analysis of how employee relationships change over time and whether the change is related to a cooperation project.
The embodiments of the present application are described and illustrated below by means of preferred embodiments.
Fig. 4 is a flowchart of an employee partnership strength measurement method according to a preferred embodiment of the present application, and as shown in fig. 4, the employee partnership strength measurement method includes the following steps:
Employee interaction sequence generation S401: employee session interaction data is obtained which, as shown in FIG. 5, includes implicit interactions between employees. Employee names are converted into anonymous IDs using the MD5 algorithm; the single chat data and group chat data of the session interaction data are segmented into session units with the day as the preset segmentation unit; and the session units are then arranged in time order to generate the employee interaction sequence. A specific example of the employee interaction sequence is shown in FIG. 6.
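By way of a non-limiting illustration, step S401 may be sketched in Python as follows. The record layout (conversation ID, sender name, timestamp), the function names and the choice to represent each session unit as the time-ordered list of anonymous sender IDs are assumptions made for this sketch only; the actual sequence format is the one shown in FIG. 6.

```python
import hashlib
from collections import defaultdict
from datetime import datetime


def anonymize(name: str) -> str:
    """Map an employee name to a 32-character hexadecimal MD5 digest."""
    return hashlib.md5(name.encode("utf-8")).hexdigest()


def build_interaction_sequences(messages):
    """Group messages into (conversation, day) session units.

    `messages` is an iterable of (conversation_id, sender_name, timestamp)
    tuples (a hypothetical layout). Returns a dict that maps each session
    unit to the time-ordered list of anonymous sender IDs.
    """
    units = defaultdict(list)
    for conversation_id, sender, timestamp in messages:
        # The day is the preset segmentation unit in this sketch.
        key = (conversation_id, timestamp.date())
        units[key].append((timestamp, anonymize(sender)))

    sequences = {}
    for key, events in units.items():
        events.sort(key=lambda event: event[0])  # chronological order
        sequences[key] = [employee_id for _, employee_id in events]
    return sequences


# Hypothetical sample records: one group chat over two days, one single chat.
messages = [
    ("group-1", "Alice", datetime(2021, 3, 1, 9, 5)),
    ("group-1", "Bob", datetime(2021, 3, 1, 9, 0)),
    ("group-1", "Alice", datetime(2021, 3, 2, 10, 0)),
    ("chat-7", "Carol", datetime(2021, 3, 1, 11, 0)),
]
sequences = build_interaction_sequences(messages)
```

A week, month or year could be used as the segmentation unit by changing how the grouping key is derived from the timestamp.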
Employee interaction sequence modeling S402: sequence modeling is performed on the employee interaction sequence data. The modeling can be carried out with a sequence modeling method such as Bert, Word2Vec or Glove; this embodiment preferably uses the Bert model, and the specific steps are as follows:
Firstly, a Bert lexicon is constructed, as shown in FIG. 7. In this embodiment the Bert lexicon refers to an employee library, that is, a dictionary representation of all employees of the enterprise. Specifically, the employees may be sorted in descending order of their number of occurrences and then assigned dictionary sequence numbers in ascending order; alternatively, non-repeating sequence numbers may be assigned directly, without sorting by number of occurrences.
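As a non-limiting sketch, the two numbering schemes for the employee lexicon may be written in Python as follows. The function name `build_lexicon`, the tie-breaking rule and the zero-based numbering are assumptions of this sketch; reserved entries for special tokens such as the mask marker are not handled here.

```python
from collections import Counter


def build_lexicon(sequences, sort_by_frequency=True):
    """Assign a unique dictionary sequence number to every employee ID.

    With sort_by_frequency=True, employees are sorted in descending order of
    their number of occurrences and numbered in ascending order. Otherwise
    non-repeating numbers are assigned in order of first appearance.
    """
    counts = Counter(emp for sequence in sequences for emp in sequence)
    if sort_by_frequency:
        # Ties are broken by ID so that the numbering is deterministic.
        ordered = sorted(counts, key=lambda emp: (-counts[emp], emp))
    else:
        ordered = list(counts)  # Counter keeps first-appearance order
    return {emp: index for index, emp in enumerate(ordered)}


# Hypothetical short IDs stand in for the MD5 anonymous IDs.
sequences = [["c", "a"], ["a", "b", "a", "b"]]
by_frequency = build_lexicon(sequences)
by_appearance = build_lexicon(sequences, sort_by_frequency=False)
```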
Then the employee interaction sequence data is preprocessed into the standard Bert training data format. Each sequence is constructed as four parts:
A. bert_input: the sequence is split into two sequences, and some of the employees in them are randomly masked (hidden), yielding two sequences with masks, for example: df3fbbc24d477636c6aaf0a5b313d1 mask 53b09914459873dff81470c6dac3c583, f82cc5c1ffa5b91bac2321a2e712d1 mask c165594b9b2251e3afd0d61ca897577c.
B. bert_label: identifies the employees that the masks stand for, for example: 0f88c9d61e724500baa262a80e6afb6c, 66161204b7e1baf380b0b4c33fe091c6.
C. segment_label: distinguishes the two blocks in the above bert_input; for example, the former block is marked with 0 and the latter block with 1, such as: 0000, 1111.
D. is_next: indicates whether the two blocks of sequences in the bert_input are consecutive: 0 if not, and 1 if so.
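The four parts above may be illustrated by the following minimal Python sketch. The masking probability of 0.15, the split at the midpoint, the optional `distractor` block used to produce a non-consecutive pair, and all function names are assumptions of this sketch and are not fixed by the embodiment.

```python
import random

MASK = "mask"  # hidden-employee marker, written as in the example above


def mask_block(block, mask_prob, rng):
    """Randomly hide employees in one block; return (masked block, hidden IDs)."""
    masked, labels = [], []
    for employee_id in block:
        if rng.random() < mask_prob:
            masked.append(MASK)
            labels.append(employee_id)
        else:
            masked.append(employee_id)
    return masked, labels


def make_sample(sequence, mask_prob=0.15, rng=None, distractor=None):
    """Build one training sample from an employee interaction sequence.

    The sequence is split at its midpoint. If `distractor` is given, it
    replaces the second block and the pair is labelled as not consecutive.
    """
    rng = rng or random.Random()
    mid = len(sequence) // 2
    first, second = list(sequence[:mid]), list(sequence[mid:])
    is_next = 1
    if distractor is not None:
        second, is_next = list(distractor), 0

    masked_first, labels_first = mask_block(first, mask_prob, rng)
    masked_second, labels_second = mask_block(second, mask_prob, rng)
    return {
        "bert_input": masked_first + masked_second,
        "bert_label": labels_first + labels_second,
        "segment_label": [0] * len(masked_first) + [1] * len(masked_second),
        "is_next": is_next,
    }


# Hypothetical short IDs stand in for the MD5 anonymous IDs.
sample = make_sample(["a", "b", "c", "d"], rng=random.Random(0))
```

Here bert_label lists only the hidden employees, in order, as in the example of part B; implementations that keep labels position-aligned with bert_input would pad the unmasked positions instead.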
And finally, the employee sequence model is trained on the session sequence data; specifically, the processed session sequence data is fed directly into the Bert model as training data to complete the training of the sequence model.
Employee digital representation S403: the vector representation of the employees is completed based on the sequence model to obtain the sequence vectors of the employee interaction sequence; specifically, each employee is represented as a 32-dimensional dense vector, as shown in FIG. 8.
Computing employee relationship affinity using vector similarity S404: all the 32-dimensional dense vectors obtained in step S403 are stored in a multi-dimensional vector database, the employee relationship affinity is then obtained by vector similarity calculation, and the results are finally output in order of relationship affinity, so that the employee relationship strength is quantified based on the affinity score. The specific principle is shown in FIG. 9.
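The embodiment does not fix a particular similarity measure. As one common choice, assumed here purely for illustration, the affinity score between employees a and b with 32-dimensional vectors v_a and v_b may be taken as the cosine similarity:

```latex
\mathrm{sim}(a, b)
  = \frac{\mathbf{v}_a \cdot \mathbf{v}_b}{\lVert \mathbf{v}_a \rVert \, \lVert \mathbf{v}_b \rVert}
  = \frac{\sum_{i=1}^{32} v_{a,i}\, v_{b,i}}
         {\sqrt{\sum_{i=1}^{32} v_{a,i}^{2}} \; \sqrt{\sum_{i=1}^{32} v_{b,i}^{2}}}
```

Under this choice the score lies in the range [-1, 1], and a larger value indicates a closer relationship.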
Outputting a relationship affinity ranking list for a given employee S405: according to the query request, the retrieved employees and their affinity scores are sorted by affinity.
Data increment and model iteration S406: using the incremental data of each day or of a fixed number of days, the model training process and the subsequent digital representation of the employees are started directly by way of model iteration, after which the employee relationship affinity is calculated by vector similarity and sorted. Referring to FIG. 10, S406 is a loop that continuously processes incremental data, iterates the model, rebuilds the digital representation of the employees and completes the vector similarity calculation.
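A non-limiting Python sketch of the ranking query is given below. It assumes cosine similarity as the affinity score and uses an in-memory dictionary with a brute-force scan in place of a vector database; the function names, the `top_k` parameter and the two-dimensional toy vectors (instead of 32 dimensions) are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_affinity(target_id, vectors, top_k=10):
    """Return up to top_k (anonymous ID, affinity score) pairs, best first."""
    target = vectors[target_id]
    scored = [
        (other_id, cosine_similarity(target, vector))
        for other_id, vector in vectors.items()
        if other_id != target_id
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy vectors keyed by hypothetical anonymous IDs.
vectors = {
    "id_a": [1.0, 0.0],
    "id_b": [0.9, 0.1],
    "id_c": [0.0, 1.0],
}
ranking = rank_affinity("id_a", vectors)
```

For a large number of employees, the scan would typically be replaced by a nearest-neighbour query against the multi-dimensional vector database.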
It should be noted that the steps illustrated in the above-described flow diagrams or in the flow diagrams of the figures may be performed in a computer system, such as a set of computer-executable instructions, and that, although a logical order is illustrated in the flow diagrams, in some cases, the steps illustrated or described may be performed in an order different than here.
The embodiment also provides a system for quantifying the strength of the employee cooperative relationship. The system is used to implement the foregoing embodiments and preferred embodiments, and what has already been described is not repeated. As used hereinafter, the terms "module," "unit," "subunit," and the like may refer to a combination of software and/or hardware that implements a predetermined function. While the system described in the embodiments below is preferably implemented in software, implementations in hardware, or in a combination of software and hardware, are also possible and contemplated.
Fig. 11 is a block diagram of a structure of an employee partnership strength quantifying system according to an embodiment of the present application, and as shown in fig. 11, the system includes:
the interaction sequence acquisition module 1 is used for acquiring session interaction data of a target employee, encrypting the session interaction data, dividing the session interaction data into a plurality of session units according to a preset segmentation unit, and outputting the session units in time order as an employee interaction sequence; specifically, the encryption processing of the session interaction data consists in converting employee names into anonymous IDs, for example and without limitation based on the MD5 algorithm, so as to effectively achieve privacy protection and data security. Optionally, the preset segmentation unit may be a day, week, month, year, and the like, and the embodiment of the application supports segmenting the single chat data and the group chat data in the session interaction data into session units using the same preset segmentation unit or different preset segmentation units.
The sequence model acquisition module 2 is used for constructing and training a sequence model by a sequence modeling method based on the employee interaction sequence; specifically, the sequence model includes one or any combination of a Bert model, a Word2Vec model, and a Glove model. The sequence model of this embodiment is preferably a Bert model, and the sequence model acquisition module 2 further includes: the Bert lexicon constructing module 201, which represents all employees of a target enterprise as an employee dictionary library; optionally, the Bert lexicon supports sorting the employees by their number of occurrences and labeling dictionary sequence numbers in ascending or descending order, and also supports directly labeling the employees with non-repeating sequence numbers. The sequence preprocessing module 202 is used for constructing the employee interaction sequence into the standard Bert training data format to obtain session sequence data for training the model; specifically, each sequence is represented as four parts: bert_input, bert_label, segment_label and is_next.
The bert_input splits the employee interaction sequence into two parts and randomly hides some of the employees in the split sequence, forming two sub-interaction sequences that carry the hidden marker Mask; the bert_label identifies the employees that the Mask markers in the sub-interaction sequences stand for; the segment_label distinguishes the two parts of the split employee interaction sequence, where optionally the former block is marked with 0, e.g. 0000, and the latter block with 1, e.g. 1111; the is_next indicates whether the two sub-interaction sequences in the bert_input are consecutive, where optionally its value is configured as 0 or 1: 0 if the sub-interaction sequences are not consecutive, and 1 otherwise. The model training module 203 trains the Bert model based on the session sequence data. On this basis, the sequence model of the embodiment of the application is built and trained from the employee interaction sequence, so that the employee interaction sequence can conveniently be represented as vectors through the sequence model and the data storage cost is reduced.
The sequence vector acquisition module 3 is used for acquiring a sequence vector of the employee interaction sequence based on the sequence model; the sequence vector obtained by this module may represent each employee as a 32-dimensional dense vector.
The relationship affinity calculation module 4 is used for storing the sequence vector into a multi-dimensional vector database, calculating the employee relationship affinity based on the sequence vector to obtain an affinity score, and quantifying the employee cooperative relationship strength based on the affinity score. Optionally, the multi-dimensional vector database may be FAISS, Annoy, or the like. Optionally, the relationship affinity of the target employee may be calculated using, but not limited to, vector similarity.
The employee relationship affinity query module 5 is used for retrieving the relationship affinity of the target employee based on a query request of a user and outputting a relationship affinity ranking list in order of affinity score. Notably, to protect data security and privacy, employees in the affinity ranking list appear as anonymous IDs; downstream services that use the list can convert the anonymous IDs to names based on the employee dictionary library.
Based on the structure, the embodiment of the application evaluates the cooperative relationship strength of the staff based on relationship intimacy, processes the session interaction data into a staff interaction sequence and realizes the vectorization expression of the staff relationship based on a sequence model, thereby greatly reducing the storage cost of the original data. In addition, compared with the existing calculation mode, the method does not limit the size of the data volume, and even the larger the data volume is, the better the model training effect is, so that the calculation cost of the data is reduced, and the problem that the calculation cost is increased for mass data is solved.
Fig. 12 is a block diagram of a preferred structure of an employee partnership strength quantifying system according to an embodiment of the present application, and as shown in fig. 12, the system includes all the modules shown in fig. 11, and further includes:
the model iteration module 6 is used for acquiring incremental data of session interaction data of a preset incremental period and carrying out iterative training on the sequence model based on the incremental data;
and the relationship intimacy degree dynamic calculation module 7 is used for obtaining the sequence vector according to the sequence model and dynamically calculating the relationship intimacy degree of the staff.
Based on the structure, the embodiment of the application realizes model iteration dynamic calculation of the staff relationship intimacy, and the staff relationship intimacy is rapidly and dynamically updated directly in a model iteration mode under the condition of fast data increment, so that staff relationship analysis is further facilitated, such as statistical analysis on the change trend of the staff relationship and whether the change trend is related to a cooperation project.
The above modules may be functional modules or program modules, and may be implemented by software or hardware. For a module implemented by hardware, the modules may be located in the same processor; or the modules can be respectively positioned in different processors in any combination.
In addition, the method for quantifying the employee cooperative relationship strength in the embodiment of the present application described in conjunction with FIG. 1 may be implemented by a computer device. The computer device may include a processor and a memory storing computer program instructions. In particular, the processor may include a Central Processing Unit (CPU) or an Application Specific Integrated Circuit (ASIC), or may be configured as one or more integrated circuits that implement the embodiments of the present application.
The memory may include, among other things, mass storage for data or instructions. By way of example, and not limitation, the memory may include a Hard Disk Drive (HDD), a floppy disk drive, a Solid State Drive (SSD), flash memory, an optical disk, a magneto-optical disk, tape, or a Universal Serial Bus (USB) drive, or a combination of two or more of these. The memory may include removable or non-removable (or fixed) media, where appropriate. The memory may be internal or external to the data processing apparatus, where appropriate. In a particular embodiment, the memory is a non-volatile memory. In particular embodiments, the memory includes Read-Only Memory (ROM) and Random Access Memory (RAM). The ROM may be mask-programmed ROM, Programmable ROM (PROM), Erasable PROM (EPROM), Electrically Erasable PROM (EEPROM), Electrically Alterable ROM (EAROM), or flash memory (FLASH), or a combination of two or more of these, where appropriate. The RAM may be a Static Random-Access Memory (SRAM) or a Dynamic Random-Access Memory (DRAM), where the DRAM may be a Fast Page Mode Dynamic Random-Access Memory (FPMDRAM), an Extended Data Output Dynamic Random-Access Memory (EDODRAM), a Synchronous Dynamic Random-Access Memory (SDRAM), and the like.
The memory may be used to store or cache various data files for processing and/or communication use, as well as possibly computer program instructions for execution by the processor.
The processor reads and executes the computer program instructions stored in the memory to implement the employee partnership strength quantifying method in any one of the above embodiments.
In addition, in combination with the method for quantifying the employee cooperative relationship strength in the foregoing embodiments, the embodiments of the present application may be implemented by providing a computer-readable storage medium. The computer-readable storage medium has computer program instructions stored thereon; the computer program instructions, when executed by a processor, implement any of the employee cooperative relationship strength quantifying methods of the embodiments described above.
The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments express only several implementations of the present application. Their description is relatively specific and detailed, but it is not to be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.