Big data service processing method based on artificial intelligence and artificial intelligence server
1. A big data service processing method based on artificial intelligence is characterized by comprising the following steps:
acquiring the interactive data correlation of each online service interactive data in the online service interactive data set;
screening out hot service interaction data subsets from the online service interaction data sets according to the set judgment value and the interaction data correlation of each online service interaction data;
determining dynamic service interaction data in the online service interaction data set and target service parameters of the dynamic service interaction data based on the hot service interaction data subset, wherein the target service parameters of the dynamic service interaction data are larger than the set judgment value;
acquiring a cold business interaction data subset in the online business interaction data set according to static business interaction data in the online business interaction data set except for the dynamic business interaction data and the incidence relation between the static business interaction data;
determining target service parameters of each online service interaction data in the cold service interaction data subset based on the cold service interaction data subset and the dynamic service interaction data; and the determined target service parameters are used for generating service interaction portrait information corresponding to the corresponding online service interaction data.
2. The method of claim 1, wherein obtaining the interaction data correlation of each online service interaction data in the online service interaction data set comprises:
acquiring the online service interaction data set;
determining the distribution condition of the associated service interaction data of each online service interaction data in the online service interaction data set;
and taking the distribution condition of the associated service interaction data as the interaction data correlation of the corresponding online service interaction data.
3. The method of claim 1, further comprising:
acquiring office assistance content corresponding to the interactive object service signature;
acquiring office assistance execution behaviors among the interactive object service signatures according to the office assistance content;
generating a business assistance interaction data set according to the office assistance execution behavior; the online business interaction data of the business assistance interaction data set represents an interaction object business signature, and the incidence relation between two online business interaction data in the business assistance interaction data set represents that an office assistance interaction behavior exists between two corresponding interaction object business signatures.
4. The method according to claim 1, wherein screening out the hot service interaction data subset from the online service interaction data set according to the set judgment value and the interaction data correlation of each online service interaction data comprises:
acquiring the set judgment value;
and filtering the online business interaction data with the interaction data correlation smaller than or equal to the set judgment value and the incidence relation corresponding to the online business interaction data from the online business interaction data set, and obtaining the hot business interaction data subset according to the incidence relation between the static business interaction data and the static business interaction data in the online business interaction data set.
5. The method of claim 1, wherein determining the dynamic business interaction data in the online business interaction data set and the target business parameters of the dynamic business interaction data based on the hot business interaction data subset comprises:
acquiring the interactive data correlation of each online service interactive data in the hot service interactive data subset according to the distribution condition of the associated service interactive data of each online service interactive data in the hot service interactive data subset, and taking the interactive data correlation in the hot service interactive data subset as the initial current target service parameter of the corresponding online service interactive data;
for each online service interaction data in the hot service interaction data subset, cyclically determining the global correlation corresponding to the online service interaction data according to the current target service parameters of the associated service interaction data of the online service interaction data in the hot service interaction data subset;
when the global correlation is smaller than or equal to the set judgment value, filtering the online service interaction data from the hot service interaction data subset;
when the global correlation is larger than the set judgment value and smaller than the current target service parameter of the online service interaction data, updating the current target service parameter of the online service interaction data according to the global correlation of the online service interaction data, the cycle being terminated when the current target service parameter of no online service interaction data in the hot service interaction data subset is updated during a pass of the cycle;
taking online service interaction data in the hot service interaction data subset obtained when the cycle is terminated as the dynamic service interaction data, and taking the current target service parameter of the dynamic service interaction data when the cycle is terminated as the target service parameter corresponding to the dynamic service interaction data;
correspondingly, the method further comprises the following steps:
after a pass of the cycle ends, recording the online service interaction data whose current target service parameters were updated during that pass;
the recorded online service interaction data is used for indicating that, when the next pass starts, the associated service interaction data of the recorded online service interaction data in the hot service interaction data subset is used as target online service interaction data whose global correlation needs to be determined again in the next pass;
for each online service interaction data in the hot service interaction data subset, determining a global relevancy corresponding to the online service interaction data according to a current target service parameter of associated service interaction data of the online service interaction data in the hot service interaction data subset, including:
and for the target online service interaction data in the hot service interaction data subset, determining the global correlation corresponding to the target online service interaction data according to the current target service parameters of the associated service interaction data of the target online service interaction data in the hot service interaction data subset.
6. The method of claim 1, wherein obtaining the cold business interaction data subset in the online business interaction data set according to static business interaction data in the online business interaction data set except for the dynamic business interaction data and an association relationship between the static business interaction data comprises:
filtering the dynamic service interaction data from the online service interaction data set;
and obtaining the cold business interaction data subset according to the static business interaction data after the dynamic business interaction data are filtered out and the incidence relation between the static business interaction data.
7. The method of claim 1, wherein the determining the target service parameters of each online service interaction data in the subset of cold service interaction data based on the subset of cold service interaction data and the dynamic service interaction data comprises:
initializing current target service parameters of each online service interaction data in the cold service interaction data subset according to the distribution condition of each online service interaction data in the cold service interaction data subset in the original online service interaction data set associated with the service interaction data;
for each online service interaction data in the cold service interaction data subset, cyclically determining the global correlation corresponding to the online service interaction data according to the current target service parameters of the associated service interaction data of the online service interaction data in the online service interaction data set;
when the global correlation is smaller than the current target service parameter of the online service interaction data, updating the current target service parameter of the online service interaction data according to the global correlation of the online service interaction data, the cycle being terminated when the current target service parameter of no online service interaction data in the cold service interaction data subset is updated during a pass of the cycle;
taking the current target service parameter of the online service interaction data when the circulation is terminated as the target service parameter corresponding to the online service interaction data;
correspondingly, the method further comprises the following steps:
after a pass of the cycle ends, recording the online service interaction data whose current target service parameters were updated during that pass; the recorded online service interaction data is used for indicating that, when the next pass starts, the associated service interaction data of the recorded online service interaction data in the cold service interaction data subset is used as target online service interaction data whose global correlation needs to be determined again in the next pass;
for each online service interaction data in the cold service interaction data subset, determining a global relevancy corresponding to the online service interaction data according to a current target service parameter of associated service interaction data of the online service interaction data in the online service interaction data set, including:
and for the target online service interaction data in the cold service interaction data subset, determining the global relevancy corresponding to the target online service interaction data according to the current target service parameters of the associated service interaction data of the target online service interaction data in the online service interaction data set.
8. The method according to claim 5 or 7, wherein the determining the global correlation corresponding to the online service interaction data comprises:
if the online service interaction data meet the condition that the current target service parameter of i associated service interaction data in the associated service interaction data is greater than or equal to i and the current target service parameter of i +1 associated service interaction data is not greater than or equal to i +1, determining the global correlation corresponding to the online service interaction data as i, wherein i is a positive integer;
correspondingly, the method further comprises the following steps:
when the round-robin process is started, initializing the cumulative update times of the online service interaction data to be 0, wherein the cumulative update times of the online service interaction data are used for recording the distribution condition of the online service interaction data of which the current target service parameters are updated in the round-robin process;
counting the distribution condition of online service interaction data of which the current target service parameters are updated in the round-robin process;
updating the cumulative updating times of the online service interaction data according to the distribution condition;
if the accumulated updating times of the online service interaction data is not 0 when the round-robin process is ended, continuing the next round-robin process;
and if the cumulative updating times of the online service interaction data are 0 when the round-robin process is finished, terminating the round-robin.
9. The method according to any one of claims 1 to 7, wherein the online business interaction data set is a business assistance interaction data set, the online business interaction data in the business assistance interaction data set represents an interaction object business signature, and an association relationship between two online business interaction data in the business assistance interaction data set represents that there is office assistance interaction between two corresponding interaction object business signatures, the method further comprising:
generating service interaction portrait information corresponding to an interaction object service signature represented by the online service interaction data according to the target service parameters of the online service interaction data in the service assistance interaction data set;
and mining office assistance portrait characteristics corresponding to the interactive object service signature based on the service interactive portrait information through a pre-trained portrait information recognition model.
10. An artificial intelligence server comprising a processing engine, a network module, and a memory; the processing engine and the memory communicate through the network module, and the processing engine reads a computer program from the memory and runs it to perform the method of any one of claims 1 to 9.
Background
In recent years, the rapid rise of big data technologies represented by Hadoop has overcome the limited data storage and processing capabilities of the database era. In addition, the large-scale adoption of cloud computing, led by vendors such as Amazon and Alibaba Cloud, has greatly reduced the cost of processing and computing capacity, so that large-scale cluster computing systems have become very cheap and data analysis has expanded from sampled data to full data sets.
The continuous development of Machine Learning (ML) enables Artificial Intelligence (AI) and big data to be deeply combined, thereby realizing user portrait mining of business big data. A user portrait (persona) is a tagged user model abstracted from information such as user attributes, user preferences, living habits, and user behavior. In other words, the user is tagged through analysis of the user information, and each tag is a highly refined feature identifier. With tagging, a user can be described by a few highly generalized, easily understood features, which makes the user easier for a person to understand and easier for a computer to process.
With the variety of service interaction behaviors in the digital era, accurate mining of user portraits is important for service optimization and upgrading. Service analysis of big service data is a common technical means of obtaining user portrait information, but related data analysis technologies have certain shortcomings.
Disclosure of Invention
In view of the foregoing, the present application provides the following.
The scheme of one embodiment of the application provides a big data service processing method based on artificial intelligence, and the method comprises the following steps: acquiring the interactive data correlation of each online service interactive data in the online service interactive data set; screening out hot service interaction data subsets from the online service interaction data sets according to the set judgment value and the interaction data correlation of each online service interaction data; determining dynamic service interaction data in the online service interaction data set and target service parameters of the dynamic service interaction data based on the hot service interaction data subset, wherein the target service parameters of the dynamic service interaction data are larger than the set judgment value; acquiring a cold business interaction data subset in the online business interaction data set according to static business interaction data in the online business interaction data set except for the dynamic business interaction data and the incidence relation between the static business interaction data; determining target service parameters of each online service interaction data in the cold service interaction data subset based on the cold service interaction data subset and the dynamic service interaction data; and the determined target service parameters are used for generating service interaction portrait information corresponding to the corresponding online service interaction data.
The scheme of one embodiment of the application provides an artificial intelligence server, which comprises a processing engine, a network module and a memory; the processing engine and the memory communicate through the network module, and the processing engine reads the computer program from the memory and operates to perform the above-described method.
Additional features will be set forth in part in the description that follows. These features will be in part apparent to those skilled in the art upon examination of the following description and the accompanying drawings, or may be learned by production or use of the examples. The features of the present application may be realized and attained by practice or use of various aspects of the methodologies, instrumentalities and combinations particularly pointed out in the detailed examples that follow.
Drawings
The present application will be further explained by way of exemplary embodiments, which will be described in detail by way of the accompanying drawings. These embodiments are not intended to be limiting, and in these embodiments like numerals are used to indicate like structures, wherein:
FIG. 1 is a flow diagram of an exemplary artificial intelligence based big data business processing method and/or process, according to some embodiments of the present application;
FIG. 2 is a block diagram of an exemplary artificial intelligence based big data business processing apparatus, shown in accordance with some embodiments of the present application;
FIG. 3 is a block diagram of an exemplary artificial intelligence based big data business processing system, shown in accordance with some embodiments of the present application; and
FIG. 4 is a diagram illustrating the hardware and software components of an exemplary artificial intelligence server, according to some embodiments of the present application.
Detailed Description
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings used in the description of the embodiments are briefly introduced below. The drawings in the following description are only examples or embodiments of the application; a person skilled in the art can apply the application to other similar scenarios based on these drawings without inventive effort. Unless otherwise apparent from the context, or otherwise indicated, like reference numbers in the figures refer to the same structure or operation.
It should be understood that "system", "device", "unit" and/or "module" as used herein are terms for distinguishing different components, elements, parts, portions or assemblies at different levels. However, these terms may be replaced by other expressions that accomplish the same purpose.
As used in this application and the appended claims, the terms "a," "an," and/or "the" do not refer specifically to the singular and may also include the plural, unless the context clearly dictates otherwise. In general, the terms "comprises" and "comprising" merely indicate that explicitly identified steps and elements are included; these steps and elements do not form an exclusive list, and a method or apparatus may include other steps or elements.
Flow charts are used herein to illustrate operations performed by systems according to embodiments of the present application. It should be understood that the operations are not necessarily performed in the exact order shown. Rather, the various steps may be processed in reverse order or simultaneously. Other operations may also be added to the processes, or one or more operations may be removed from them.
In order to better understand the technical solutions of the present invention, the following detailed descriptions of the technical solutions of the present invention are provided with the accompanying drawings and the specific embodiments, and it should be understood that the specific features in the embodiments and the examples of the present invention are the detailed descriptions of the technical solutions of the present invention, and are not limitations of the technical solutions of the present invention, and the technical features in the embodiments and the examples of the present invention may be combined with each other without conflict.
The whole scheme of the big data service processing method based on artificial intelligence and the artificial intelligence server can be summarized as follows: the method comprises the steps of firstly screening hot service interaction data subsets from online service interaction data sets through interaction data correlation of all online service interaction data, secondly determining dynamic service interaction data and target service parameters of the dynamic service interaction data in the online service interaction data sets according to the hot service interaction data subsets, then determining cold service interaction data subsets in the online service interaction data sets based on association relations between static service interaction data and static service interaction data in the online service interaction data sets, and finally determining the target service parameters of all online service interaction data in the cold service interaction data subsets based on the cold service interaction data subsets and the dynamic service interaction data.
It can be understood that the scheme analyzes and processes the interactive data correlation and the target service parameters of various service interactive data, and can further determine the static service interactive data by screening and determining the hot service interactive data subset and the dynamic service interactive data so as to determine the cold service interactive data subset through the incidence relation between the static service interactive data. Therefore, the related target service parameters can be determined by combining the cold service interaction data subsets and the dynamic service interaction data, the determined target service parameters can be used for generating corresponding service interaction portrait information, the cold service interaction data subsets are respectively associated with the static service interaction data, the dynamic service interaction data and the hot service interaction data subsets, high correlation between the service interaction portrait information corresponding to the cold service interaction data subsets and the online service is ensured as far as possible, and meanwhile, more accurate potential portrait information can be mined as far as possible by combining different types of service interaction data. Therefore, the problem of low efficiency of the related art in business analysis processing of cold data can be solved.
First, an exemplary artificial intelligence based big data business processing method is described. Referring to FIG. 1, which is a flowchart illustrating an exemplary artificial intelligence based big data business processing method and/or process according to some embodiments of the present application, the method may include the technical solutions described in the following steps 100 to 500.
Step 100, acquiring the interactive data correlation of each online service interactive data in the online service interactive data set.
For example, the online service interaction data may be data generated when the service user equipment and the artificial intelligence server are in communication interaction, or data generated when different service user equipment are in communication interaction. And the online service may include online shopping, online video viewing, online service inquiry, etc.
In addition, the interactive data correlation may be used to express the correlation between different online service interactive data at the service event level or the service object level. In general, the interactive data correlation may be expressed by a correlation coefficient, including but not limited to the Pearson correlation coefficient, the Spearman correlation coefficient, or the Kendall correlation coefficient.
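As an illustration of one of the coefficients named above, the following minimal sketch computes the Pearson correlation coefficient of two equally long numeric sequences. The function name `pearson` and the idea of representing each online service interaction data item as a numeric feature sequence are assumptions made for this example; the present application does not specify how the coefficient is applied.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences.

    Hypothetical helper: the feature sequences are an assumption of this sketch.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance numerator and the two standard-deviation numerators.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("coefficient is undefined for a constant sequence")
    return cov / (dev_x * dev_y)
```

The result lies between -1 and 1; values near 1 indicate that the two sequences rise and fall together.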
In some possible embodiments, the obtaining of the interaction data correlation of each online service interaction data in the online service interaction data set in step 100 may be implemented by: acquiring the online service interaction data set; determining the distribution condition of the associated service interaction data of each online service interaction data in the online service interaction data set; and taking the distribution condition of the associated service interaction data as the interaction data correlation of the corresponding online service interaction data.
For example, the associated service interaction data of each online service interaction data may be understood as service interaction data carrying an association tag that corresponds to the online service interaction data. The distribution condition of the associated service interaction data may be used to represent the distribution position or distribution area of the associated service interaction data in the online service interaction data set, for example, the subset in which it is located. Therefore, the interactive data correlation of the corresponding online service interactive data can be determined from the distribution probability or distribution ratio corresponding to the distribution condition, in which case a correlation coefficient calculation method may be used.
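Step 100 can be sketched as follows under a simplifying assumption: the distribution condition of the associated service interaction data is summarized as the number of distinct associated items. The names `interaction_correlation`, `items`, and `associations` are hypothetical, and a distribution probability or ratio could be substituted for the count.

```python
from collections import defaultdict


def interaction_correlation(items, associations):
    """Map each item to the number of distinct items associated with it.

    items: iterable of item identifiers (online service interaction data).
    associations: iterable of (u, v) pairs, each an undirected association.
    Assumption of this sketch: the count stands in for the "distribution
    condition" described in the text.
    """
    neighbors = defaultdict(set)
    for u, v in associations:
        if u != v:  # ignore self-associations
            neighbors[u].add(v)
            neighbors[v].add(u)
    return {item: len(neighbors[item]) for item in items}
```

Items that appear in no association receive a correlation of 0.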
Step 200, screening out hot service interaction data subsets from the online service interaction data set according to the set judgment value and the interaction data correlation of each online service interaction data.
In this embodiment, the set judgment value may be used to provide a quantitative standard for analyzing and processing different types of service interaction data, thereby improving the efficiency of data analysis and processing. The hot service interaction data subset may refer to service interaction data with a high service interaction frequency or a high search index.
In some possible embodiments, screening out the hot service interaction data subset from the online service interaction data set according to the set judgment value and the interaction data correlation of each online service interaction data, as described in step 200, may be implemented as follows: acquiring the set judgment value; and filtering the online business interaction data with the interaction data correlation smaller than or equal to the set judgment value and the incidence relation corresponding to the online business interaction data from the online business interaction data set, and obtaining the hot business interaction data subset according to the incidence relation between the static business interaction data and the static business interaction data in the online business interaction data set.
For example, the set judgment value may be set or adjusted according to actual conditions. For instance, for interaction data correlation judgment, the value range of the set judgment value may be 0 to 1. Further, the association relationship corresponding to the online service interaction data can be expressed in the form of a graph or a curve. Furthermore, filtering out the online service interaction data whose interaction data correlation is less than or equal to the set judgment value, together with the corresponding incidence relations, effectively reduces noisy service interaction data and the corresponding data volume, which improves the efficiency and accuracy of the subsequent screening of the hot service interaction data subset. For a description of static service interaction data, reference is made to the following.
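The filtering in step 200 can be sketched as a subgraph selection. This is a minimal illustration, assuming the association relationships are held as an adjacency mapping from each item to the set of its associated items and the interaction data correlation is already available per item; the names `screen_hot_subset`, `adjacency`, `correlation`, and `threshold` are hypothetical.

```python
def screen_hot_subset(adjacency, correlation, threshold):
    """Keep items whose correlation exceeds the threshold, plus the
    associations that connect two kept items.

    adjacency: dict mapping item -> set of associated items.
    correlation: dict mapping item -> interaction data correlation.
    threshold: the set judgment value.
    """
    keep = {item for item in adjacency if correlation[item] > threshold}
    # Dropping an item also drops every association that touches it.
    return {item: adjacency[item] & keep for item in adjacency if item in keep}
```

With count-valued correlations the threshold is an integer; if a ratio in the range 0 to 1 is used instead, the same comparison applies unchanged.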
Step 300, determining dynamic service interaction data in the online service interaction data set and target service parameters of the dynamic service interaction data based on the hot service interaction data subset.
In this embodiment, the dynamic service interaction data may be understood as service interaction data with temporal variability and service variability; for example, the dynamic service interaction data may change over time or with a jump between service classes. Further, the target service parameter of the dynamic service interaction data is greater than the set judgment value. The target service parameters can be understood as interaction dimension characteristics of the dynamic service interaction data, such as interaction event characteristics, interaction behavior characteristics, interaction object characteristics, interaction time period characteristics, and the like. Accordingly, the set judgment value corresponding to the target service parameter may be a set dimension.
It is understood that, in some examples, the determining of the dynamic service interaction data in the online service interaction data set and the target service parameters of the dynamic service interaction data based on the hot service interaction data subset described in step 300 above may include the following steps 310-350.
Step 310, obtaining the interactive data correlation of each online service interactive data in the hot service interactive data subset according to the distribution condition of the associated service interactive data of each online service interactive data in the hot service interactive data subset, and taking the interactive data correlation in the hot service interactive data subset as the initial current target service parameter of the corresponding online service interactive data.
Step 320, for each online service interaction data in the hot service interaction data subset, cyclically determining the global correlation corresponding to the online service interaction data according to the current target service parameter of the associated service interaction data of the online service interaction data in the hot service interaction data subset.
For example, the global relevance may be understood as the relevance of the online business interaction data at the overall business level, and may be calculated by combining the relevant global weights.
And 330, when the global correlation is smaller than or equal to a set judgment value, filtering the online business interaction data from the hot business interaction data subset.
Step 340, when the global correlation is greater than the decision value and less than the current target service parameter of the online service interaction data, updating the current target service parameter of the online service interaction data according to the global correlation of the online service interaction data, and ending the cycle until the current target service parameter of each online service interaction data in the hot service interaction data subset is not updated in the round robin process.
For example, updating the current target business parameters of the online business interaction data may be understood as increasing or decreasing the relevant interaction feature dimension.
Step 350, taking the online service interaction data in the hot service interaction data subset obtained when the cycle is terminated as the dynamic service interaction data, and taking the current target service parameter of the dynamic service interaction data when the cycle is terminated as the target service parameter corresponding to the dynamic service interaction data.
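Steps 310 to 350 can be sketched as follows. This is a minimal sketch under stated assumptions, not the definitive implementation: the data set is represented as a hypothetical adjacency mapping from each item to the set of its associated items; the global correlation is computed with the counting rule that the description gives later for steps 320 and 520, which is only one of the permitted options; and a removal in step 330 is also treated as a change that triggers another round. The names `h_index` and `refine_hot_subset` are illustrative only.

```python
def h_index(values):
    """Largest i such that at least i of the values are >= i (0 if none)."""
    h = 0
    for i, v in enumerate(sorted(values, reverse=True), start=1):
        if v < i:
            break
        h = i
    return h


def refine_hot_subset(adjacency, hot, threshold):
    """Return {item: target parameter} for the items that survive the loop."""
    hot = set(hot)
    # Step 310: the initial parameter is the number of associated items
    # that are themselves inside the hot subset.
    param = {n: len(adjacency[n] & hot) for n in hot}
    changed = True
    while changed:
        changed = False
        for node in sorted(hot):  # sorted() copies, so removal below is safe
            # Step 320: global correlation from the neighbours' parameters.
            g = h_index([param[m] for m in adjacency[node] & hot])
            if g <= threshold:
                # Step 330: filter the item out (assumed to count as a change).
                hot.discard(node)
                del param[node]
                changed = True
            elif g < param[node]:
                # Step 340: lower the current parameter to the global correlation.
                param[node] = g
                changed = True
    # Step 350: the survivors are the dynamic data with their final parameters.
    return param


# Hypothetical example: a 4-clique a-b-c-d with a tail a-e-f; "f" was not
# screened into the hot subset.
adjacency = {
    "a": {"b", "c", "d", "e"}, "b": {"a", "c", "d"}, "c": {"a", "b", "d"},
    "d": {"a", "b", "c"}, "e": {"a", "f"}, "f": {"e"},
}
dynamic = refine_hot_subset(adjacency, {"a", "b", "c", "d", "e"}, threshold=1)
```

In this example the tail item "e" is filtered out in the first round, and the four clique items end with a parameter of 3.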
In some examples, the dynamic service interaction data may also be understood as service interaction data with a changed interaction feature dimension, and thus, the dynamic service interaction data and the target service parameters corresponding to the dynamic service interaction data may be determined based on a continuous loop iterative update process, so as to ensure the accuracy of the dynamic service interaction data and the target service parameters corresponding to the dynamic service interaction data.
In some optional embodiments, on the basis of the above steps 310 to 350, the method may further include the following: after a round ends, recording the online service interaction data whose current target service parameters were updated during that round; when the next round starts, the associated service interaction data of the recorded online service interaction data within the hot service interaction data subset is taken as the target online service interaction data whose global correlation needs to be determined again in that round.
In some possible embodiments, for each online service interaction data in the hot service interaction data subset described in the above step 320, determining the global correlation corresponding to the online service interaction data according to the current target service parameters of the associated service interaction data of the online service interaction data in the hot service interaction data subset may be implemented as follows: for the target online service interaction data in the hot service interaction data subset, determining the global correlation corresponding to the target online service interaction data according to the current target service parameters of the associated service interaction data of the target online service interaction data in the hot service interaction data subset.
Therefore, by analyzing the current target service parameters, different interaction characteristic dimensions can be considered, and the accuracy and the reliability of the global correlation degree corresponding to the target online service interaction data are further ensured.
Step 400, obtaining a cold business interaction data subset in the online business interaction data set according to static business interaction data in the online business interaction data set except the dynamic business interaction data and the incidence relation between the static business interaction data.
For example, static service interaction data is defined relative to dynamic service interaction data, such as service interaction data having fixed data content. Likewise, the cold business interaction data subset is defined relative to the hot business interaction data subset. The cold business interaction data subset usually attracts little attention, and therefore some potential portrait information may be ignored.
Based on this, the obtaining of the cold business interaction data subset in the online business interaction data set according to the static business interaction data in the online business interaction data set except for the dynamic business interaction data and the association relationship between the static business interaction data described in the above step 400 can be implemented by the following implementation manners: filtering the dynamic service interaction data from the online service interaction data set; and obtaining the cold business interaction data subset according to the static business interaction data after the dynamic business interaction data are filtered out and the incidence relation between the static business interaction data.
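The filtering in step 400 can be illustrated with a short sketch. It assumes, hypothetically, that the data set is held as an adjacency mapping from each item to the set of its associated items; the function name `cold_subset` is illustrative and does not come from the original.

```python
def cold_subset(adjacency, dynamic):
    """Drop the dynamic items and keep only associations among static items."""
    static = set(adjacency) - set(dynamic)
    return {item: adjacency[item] & static for item in static}


# Hypothetical example: items a-d are dynamic, e and f are static.
adjacency = {
    "a": {"b", "c", "d", "e"}, "b": {"a", "c", "d"}, "c": {"a", "b", "d"},
    "d": {"a", "b", "c"}, "e": {"a", "f"}, "f": {"e"},
}
cold = cold_subset(adjacency, {"a", "b", "c", "d"})
```

Here the association between "e" and "a" is dropped from the cold subset because "a" is dynamic, while the association between "e" and "f" is kept.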
Step 500, determining target service parameters of each online service interaction data in the cold service interaction data subset based on the cold service interaction data subset and the dynamic service interaction data.
In a related embodiment, the determined target service parameters are used to generate service interaction portrait information corresponding to the corresponding online service interaction data. In other words, the determined target business parameters are used to generate business interaction portrait information (potential portrait information) for each online business interaction data in the cold business interaction data subset. Therefore, the cold business interaction data subset can be processed in a targeted manner, so that the potential portrait information corresponding to the cold business interaction data subset can be accurately and reliably mined, which facilitates the subsequent pushing of related business service products.
In a related embodiment, the determining the target service parameter of each online service interaction data in the cold service interaction data subset based on the cold service interaction data subset and the dynamic service interaction data described in the above step 500 may be implemented by the following steps 510 to 540.
Step 510, initializing the current target service parameter of each online service interaction data in the cold service interaction data subset according to the distribution condition of the associated service interaction data of each online service interaction data in the original online service interaction data set.
Step 520, cyclically traversing each online service interaction data in the cold service interaction data subset, and determining the global correlation corresponding to the online service interaction data according to the current target service parameters of the associated service interaction data of the online service interaction data in the online service interaction data set.
Step 530, when the global correlation is smaller than the current target service parameter of the online service interaction data, updating the current target service parameter of the online service interaction data according to the global correlation of the online service interaction data, and repeating the cycle until no current target service parameter of any online service interaction data in the cold service interaction data subset is updated during a round, at which point the cycle ends.
Step 540, taking the current target service parameter of the online service interaction data when the cycle ends as the target service parameter corresponding to the online service interaction data.
It can be understood that, by performing the above steps 510 to 540, the target service parameters corresponding to the cold service interaction data subset are determined through a cyclic update process in combination with the global correlation, so that, while the accuracy of the target service parameters is ensured, the association at the service level between the target service parameters corresponding to the cold service interaction data subset and those corresponding to the hot service interaction data subset can also be ensured.
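Steps 510 to 540 can be sketched in the same style. This is a minimal sketch under assumptions: the data set is a hypothetical adjacency mapping; the parameters of the dynamic items are taken as already fixed and are passed in as `dynamic_param`; and the global correlation uses the counting rule that the description gives for steps 320 and 520, which is only one permitted option. All function and variable names are illustrative.

```python
def h_index(values):
    """Largest i such that at least i of the values are >= i (0 if none)."""
    h = 0
    for i, v in enumerate(sorted(values, reverse=True), start=1):
        if v < i:
            break
        h = i
    return h


def cold_parameters(adjacency, dynamic_param):
    """Return {item: target parameter} for every item outside dynamic_param."""
    param = dict(dynamic_param)  # parameters of dynamic items stay fixed
    cold = [n for n in adjacency if n not in dynamic_param]
    # Step 510: initial parameter = number of associated items in the full set.
    for n in cold:
        param[n] = len(adjacency[n])
    changed = True
    while changed:
        changed = False
        for n in cold:
            # Step 520: neighbours are looked up in the full data set, so
            # dynamic neighbours contribute their fixed parameters.
            g = h_index([param[m] for m in adjacency[n]])
            if g < param[n]:
                # Step 530: lower the current parameter.
                param[n] = g
                changed = True
    # Step 540: the parameters at termination are the target parameters.
    return {n: param[n] for n in cold}


# Hypothetical example: a-d are dynamic with parameter 3; e and f are cold.
adjacency = {
    "a": {"b", "c", "d", "e"}, "b": {"a", "c", "d"}, "c": {"a", "b", "d"},
    "d": {"a", "b", "c"}, "e": {"a", "f"}, "f": {"e"},
}
cold_param = cold_parameters(adjacency, {"a": 3, "b": 3, "c": 3, "d": 3})
```

Because the cold items read the fixed parameters of their dynamic neighbours, their final parameters remain tied to the result obtained for the hot subset.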
On the basis of the above steps 510 to 540, the method may further include the following: after a round ends, recording the online service interaction data whose current target service parameters were updated during that round; when the next round starts, the associated service interaction data of the recorded online service interaction data within the cold service interaction data subset is taken as the target online service interaction data whose global correlation needs to be determined again in that round.
In some optional embodiments, the determining, for each online service interaction data in the cold service interaction data subset, of the global correlation corresponding to the online service interaction data according to the current target service parameters of the associated service interaction data of the online service interaction data in the online service interaction data set, as described in the above step 520, may include the following: for the target online service interaction data in the cold service interaction data subset, determining the global correlation corresponding to the target online service interaction data according to the current target service parameters of the associated service interaction data of the target online service interaction data in the online service interaction data set.
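The recording of updated data and the restriction of the next round to target data can be sketched as a worklist. This is a hedged sketch under assumptions: the data set is a hypothetical adjacency mapping, the first round treats every cold item as a target, and the global correlation uses the counting rule described for steps 320 and 520. The names are illustrative only.

```python
def h_index(values):
    """Largest i such that at least i of the values are >= i (0 if none)."""
    h = 0
    for i, v in enumerate(sorted(values, reverse=True), start=1):
        if v < i:
            break
        h = i
    return h


def cold_parameters_incremental(adjacency, dynamic_param):
    """Like a full sweep, but each round only revisits the target items."""
    param = dict(dynamic_param)
    cold = {n for n in adjacency if n not in dynamic_param}
    for n in cold:
        param[n] = len(adjacency[n])
    targets = set(cold)  # assumption: the first round visits every cold item
    while targets:
        updated = set()  # items whose parameter changed in this round
        for n in sorted(targets):
            g = h_index([param[m] for m in adjacency[n]])
            if g < param[n]:
                param[n] = g
                updated.add(n)
        # Next round: only cold items associated with an updated item can
        # see a different global correlation, so only they are re-evaluated.
        targets = {m for n in updated for m in adjacency[n] if m in cold}
    return {n: param[n] for n in cold}


adjacency = {
    "a": {"b", "c", "d", "e"}, "b": {"a", "c", "d"}, "c": {"a", "b", "d"},
    "d": {"a", "b", "c"}, "e": {"a", "f"}, "f": {"e"},
}
result = cold_parameters_incremental(adjacency, {"a": 3, "b": 3, "c": 3, "d": 3})
```

An item whose associated items did not change since its last evaluation would yield the same global correlation again, which is why skipping it does not alter the final parameters.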
It can be understood that, for the above step 320 or step 520, the global correlation corresponding to the online service interaction data can also be determined as follows: if, among the associated service interaction data of the online service interaction data, there are i associated service interaction data whose current target service parameters are greater than or equal to i, and there are not i+1 associated service interaction data whose current target service parameters are greater than or equal to i+1, determining that the global correlation corresponding to the online service interaction data is i, wherein i is a positive integer. Of course, in actual implementation, the global correlation corresponding to the online service interaction data may also be calculated in other manners.
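The counting rule above is the same computation as an h-index over the parameters of the associated data. A minimal sketch follows; the function name `global_correlation` is illustrative, and returning 0 when no positive i qualifies is an assumption, since the text only defines the result for positive i.

```python
def global_correlation(neighbour_params):
    """Largest i such that at least i of the parameters are >= i."""
    result = 0
    ranked = sorted(neighbour_params, reverse=True)
    for i, value in enumerate(ranked, start=1):
        if value < i:
            break  # fewer than i parameters reach i, so stop here
        result = i
    return result


# Three associated items have a parameter >= 3, but there are not four
# associated items with a parameter >= 4, so the result is 3.
example = global_correlation([3, 3, 3, 1])
```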
In some optional embodiments, on the basis of step 300 or step 500, the method may further include the following: when a round starts, initializing the cumulative update count of the online service interaction data to 0, wherein the cumulative update count is used for recording the number of online service interaction data whose current target service parameters are updated during the round; counting the online service interaction data whose current target service parameters are updated during the round; updating the cumulative update count according to the counted result; if the cumulative update count is not 0 when the round ends, continuing with the next round; and if the cumulative update count is 0 when the round ends, terminating the cycle. Therefore, whether to start another round or to end the cycle can be decided according to the cumulative update count, which keeps the cyclic update process orderly and avoids confusion.
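The termination rule based on the number of updates per round can be sketched as a small driver loop. This is an illustrative sketch: `run_until_stable` and the callback `try_update` are hypothetical names, and the callback is assumed to return True exactly when it changed the parameter of the given item.

```python
def run_until_stable(items, try_update):
    """Repeat full rounds until a round performs zero updates.

    Returns the number of rounds executed, including the final quiet round.
    """
    rounds = 0
    while True:
        update_count = 0  # reset to 0 when the round starts
        for item in items:
            if try_update(item):
                update_count += 1
        rounds += 1
        if update_count == 0:  # nothing changed: terminate the cycle
            return rounds


# Hypothetical usage: parameters decrease by 1 per round down to a floor of 1.
values = {"x": 3, "y": 1}


def lower(key):
    if values[key] > 1:
        values[key] -= 1
        return True
    return False


rounds = run_until_stable(["x", "y"], lower)
```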
In some optional embodiments, on the basis of the above steps 100 to 500, the method may further include the following: acquiring office assistance content corresponding to the interactive object service signature; acquiring office assistance execution behaviors among the interactive object service signatures according to the office assistance content; and generating a business assistance interaction data set according to the office assistance execution behavior.
For example, the online business interaction data of the business assistance interaction data set represents an interaction object business signature, and the incidence relation between two online business interaction data in the business assistance interaction data set represents that an office assistance interaction behavior exists between two corresponding interaction object business signatures.
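Building the business assistance interaction data set from observed assistance behaviors can be sketched as follows. The input format (a list of signature pairs) and the name `build_assistance_graph` are assumptions made for illustration; the original does not specify how office assistance execution behaviors are encoded.

```python
from collections import defaultdict


def build_assistance_graph(behaviours):
    """Map each signature to the set of signatures it exchanged assistance with."""
    adjacency = defaultdict(set)
    for first, second in behaviours:
        if first == second:
            continue  # assumption: self-assistance creates no association
        adjacency[first].add(second)
        adjacency[second].add(first)
    return dict(adjacency)


# Hypothetical behaviours; repeated pairs collapse into one association.
graph = build_assistance_graph([
    ("sig_a", "sig_b"),
    ("sig_b", "sig_c"),
    ("sig_a", "sig_b"),
])
```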
It can be understood that the assistance behavior can be taken into account by determining the business assistance interaction data set, so that the portrait mining and positioning are carried out based on the assistance requirement in the subsequent portrait analysis, and the comprehensiveness of the portrait information mining is improved.
In some examples, the online business interaction data set may be a business assistance interaction data set, the online business interaction data in the business assistance interaction data set may represent an interaction object business signature, and an association relationship between two online business interaction data in the business assistance interaction data set may represent that an office assistance interaction behavior exists between two corresponding interaction object business signatures.
Based on this, the method may further comprise the following: generating service interaction portrait information corresponding to an interaction object service signature represented by the online service interaction data according to the target service parameters of the online service interaction data in the service assistance interaction data set; and mining office assistance portrait characteristics corresponding to the interactive object service signature based on the service interactive portrait information through a pre-trained portrait information recognition model.
In this embodiment, the pre-trained portrait information recognition model may be a deep neural network model or a classifier, and is used for performing portrait feature mining on the business interaction portrait information, so that the office assistance portrait features can be mined as accurately as possible, thereby determining the related office assistance requirements through the office assistance portrait features, and performing optimization and upgrade on related office service products based on the office assistance requirements.
In summary, based on the above scheme, the interactive data correlation and the target service parameters of various types of service interactive data are analyzed, and the static service interactive data can be further determined by screening and determining the hot service interactive data subset and the dynamic service interactive data, so that the cold service interactive data subset is determined by the association relationship between the static service interactive data. Therefore, the related target service parameters can be determined by combining the cold service interaction data subsets and the dynamic service interaction data, the determined target service parameters can be used for generating corresponding service interaction portrait information, the cold service interaction data subsets are respectively associated with the static service interaction data, the dynamic service interaction data and the hot service interaction data subsets, high correlation between the service interaction portrait information corresponding to the cold service interaction data subsets and the online service is ensured as far as possible, and meanwhile, more accurate potential portrait information can be mined as far as possible by combining different types of service interaction data. Therefore, the problem of low efficiency of the related art in business analysis processing of cold data can be solved.
In some alternative embodiments, when performing the portrait processing on the online service interaction data set, it may be necessary to protect the private portrait information of the user, and based on this, before the above step 100, the following may be further included: acquiring service big data to be subjected to anonymization processing and corresponding anonymization indicating data, wherein the anonymization indicating data comprises fragments to be anonymized and anonymization identifiers; carrying out graph node processing on the service big data to be subjected to anonymization processing to obtain a service big data graph node, and carrying out graph node processing on the fragment to be anonymized to obtain a fragment graph node to be anonymized; performing dynamic feature recognition and static feature recognition based on the service big data graph node to obtain a dynamic feature graph node and a static feature graph node, and determining a target dynamic feature graph node and a target static feature graph node from the dynamic feature graph node and the static feature graph node based on the anonymization identification; performing associated feature recognition based on the target dynamic feature map node and the target static feature map node to obtain an associated feature map node of the anonymization identifier; and performing anonymization analysis on the basis of the associated feature graph nodes of the anonymization identifier and the graph nodes of the fragments to be anonymized to obtain anonymizable weights, wherein the anonymizable weights are used for representing the analysis result of anonymization of the large data fragments of the services corresponding to the anonymization identifier in the large data of the services to be anonymized by the fragments to be anonymized.
Further, the service big data to be anonymized and the anonymization indicating data may be input into a service big data processing network, the service big data processing network performs graph node transformation on the service big data to be anonymized to obtain a service big data graph node, performs graph node transformation on the fragment to be anonymized to obtain a fragment graph node to be anonymized, performs dynamic feature identification and static feature identification on the basis of the service big data graph node to obtain a dynamic feature graph node and a static feature graph node, and determines a target dynamic feature graph node and a target static feature graph node from the dynamic feature graph node and the static feature graph node on the basis of the anonymization identifier.
Then, performing associated feature recognition based on the target dynamic feature map node and the target static feature map node to obtain an associated feature map node of the anonymization identifier, and performing anonymization analysis based on the associated feature map node of the anonymization identifier and the graph node of the fragment to be anonymized to obtain an anonymization weight; the service big data processing network is obtained by training a service big data sample set and a corresponding anonymization indication data sample set based on a machine learning model.
And then anonymizing the service big data fragment corresponding to the anonymization identification in the service big data to be anonymized by using the fragment to be anonymized based on the anonymization weight to obtain the anonymized service big data.
In this embodiment, the anonymized service big data can be understood as the online service interaction data set in step 100.
The above alternative embodiments are described in further detail below.
Step S110, acquiring service big data to be subjected to anonymization processing and corresponding anonymization indication data.
In a related embodiment, the service big data to be anonymized may be service big data uploaded to the artificial intelligence server by the service user equipment and stored in the artificial intelligence server, and the service big data to be anonymized includes portrait information and privacy information corresponding to the service user equipment. Generally, in order to ensure the security of part of the portrait information and privacy information in the service big data and to prevent such information from being illegally accessed or snooped, anonymization processing needs to be performed on the service big data.
Further, the anonymization indication data may comprise the fragment to be anonymized and the anonymization identifier. The fragment to be anonymized may be understood as a data fragment used for performing overlay processing or scrambling processing on the service big data; for example, the fragment to be anonymized "xxx" may be used for performing overlay processing or scrambling processing on service big data 1. The anonymization identifier can be understood as the position information of the data fragment or data block of the service big data that needs anonymization processing.
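The overlay processing mentioned above can be illustrated with a short sketch. It assumes, purely for illustration, that the service big data is a string and that the position information is encoded as a start offset and a length; the function name `overlay` and its parameters are hypothetical.

```python
def overlay(data, fragment, start, length):
    """Cover data[start:start + length] with the fragment, repeated as needed."""
    if not fragment:
        raise ValueError("fragment must not be empty")
    # Repeat the fragment until it is long enough, then cut it to size.
    mask = (fragment * (length // len(fragment) + 1))[:length]
    return data[:start] + mask + data[start + length:]


# Hypothetical record: cover the five characters of the user name.
masked = overlay("user=alice;id=42", "xxx", start=5, length=5)
```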
Generally, the data volume of the service big data to be anonymized is huge, and directly performing anonymization analysis and processing on it may increase the data processing load of the artificial intelligence server.
Step S120, performing graph node processing on the service big data to be anonymized to obtain a service big data graph node, and performing graph node processing on the fragment to be anonymized to obtain a fragment graph node to be anonymized.
In a related embodiment, in order to improve the efficiency of anonymization analysis and processing, the related data can be optimized by using a graph node technique. The graph node technique can be understood as a data conversion technique based on graph data: a large amount of data can be converted into graph nodes, which reduces the data volume while ensuring that the graph nodes still express the features of the corresponding data sets or data blocks and reflect the association relations and transfer relations among different data sets or data blocks, thereby improving the efficiency of anonymization analysis and processing.
Therefore, in this embodiment, graph node processing may be performed on the service big data to be anonymized and the fragment to be anonymized, respectively, to obtain a service big data graph node and a fragment graph node to be anonymized.
Step S130, performing dynamic feature recognition and static feature recognition based on the service big data graph nodes to obtain dynamic feature graph nodes and static feature graph nodes, and determining target dynamic feature graph nodes and target static feature graph nodes from the dynamic feature graph nodes and the static feature graph nodes based on the anonymization identifiers.
In a related embodiment, the dynamic feature identification is used to analyze the dynamic features of the corresponding graph nodes, and the static feature identification is used to analyze the static features of the corresponding graph nodes. In general, dynamic features may be understood as features that change over time or a change in a business scenario, such as features of business interaction objects, business interaction matters, and the like. The static features may be understood as features that do not change with time or change of service scenarios, and may be understood as features inherent to the graph nodes, such as data type features, data format features, and the like.
Correspondingly, dynamic feature recognition and static feature recognition are carried out on the basis of the service big data graph node, and feature classification of the graph nodes can be realized through different feature recognition, so that the dynamic feature graph node and the static feature graph node are obtained.
Further, determining a target dynamic feature map node and a target static feature map node from the dynamic feature map node and the static feature map node based on the anonymization identifier may be understood as selecting, from the dynamic feature map node and the static feature map node, the target dynamic feature map node and the target static feature map node that match the anonymization identifier.
In some possible embodiments, the anonymization identification may include a fragment distribution identification to be anonymized and a fragment length to be anonymized. The distribution identification of the fragments to be anonymized can be understood as position distribution information of the fragments to be anonymized, and the length of the fragments to be anonymized can be understood as the data length or the data size of the fragments to be anonymized. Based on this, the "performing dynamic feature recognition and static feature recognition based on the service big data graph node to obtain a dynamic feature graph node and a static feature graph node, and determining a target dynamic feature graph node and a target static feature graph node from the dynamic feature graph node and the static feature graph node based on the anonymization identifier" described in the above step S130 may include the following contents described in steps S131 to S133.
Step S131, performing dynamic feature recognition based on the service big data graph nodes to obtain the dynamic feature graph nodes, and determining feature graph nodes before the distribution identifier of the fragment to be anonymized from the dynamic feature graph nodes to obtain a first dynamic feature graph node.
In practical implementation, the feature graph nodes determined from the dynamic feature graph nodes before the fragment distribution identifier to be anonymized may be understood as that the fragment distribution identifier of the selected feature graph node is before the fragment distribution identifier to be anonymized, in other words, the fragment distribution identifier of the first dynamic feature graph node is before the fragment distribution identifier to be anonymized.
In some possible embodiments, the "performing dynamic feature recognition based on the service big data graph node to obtain the dynamic feature graph node, and determining the feature graph node before the distribution identifier of the fragment to be anonymized from the dynamic feature graph node to obtain the first dynamic feature graph node" described in the above step S131 may be implemented by the following steps S1311 to S1314.
Step S1311, obtaining a preset initial graph node, and determining a current attribute graph node from the service big data graph nodes in sequence from the graph node starting point to the graph node end point.
In this embodiment, different graph nodes are connected to form a graph node network, and an initial graph node may be selected according to actual situations. The current attribute graph nodes can be understood as graph nodes with private user attributes or specific user attributes and higher feature discrimination and portrait discrimination.
Step S1312, performing dynamic risk identification based on the preset initial graph nodes and the current attribute graph nodes to obtain current attribute dynamic feature graph nodes corresponding to the current attribute graph nodes.
In this embodiment, the dynamic risk identification may be understood as performing information stealing simulation based on time sequence change and service scene change on the current attribute graph node, so as to obtain the current attribute dynamic feature graph node. Correspondingly, the current attribute dynamic feature graph node can be understood as a graph node with variable privacy user attributes, specific user attributes, feature differentiation degrees and portrait differentiation degrees in time sequence or service scene level.
Step S1313, taking the current attribute dynamic feature map node as the preset initial graph node, returning to the step of determining the current attribute graph node from the service big data graph nodes in sequence from the graph node starting point to the graph node end point, and repeating the identification until the dynamic feature map node corresponding to each attribute graph node in the service big data graph nodes is obtained.
It can be understood that, through step S1313, iterative identification of the dynamic feature map nodes corresponding to each attribute map node can be implemented, so as to ensure local independence and global relevance between the dynamic feature map nodes corresponding to each attribute map node.
Step S1314, determining a target attribute dynamic graph node before the anonymization identification from the attribute graph nodes, and using a dynamic feature graph node corresponding to the target attribute dynamic graph node as a first dynamic feature graph node.
In an actual implementation process, after the dynamic feature map nodes corresponding to the attribute map nodes are determined, the corresponding target attribute dynamic map nodes can be determined according to the anonymization identifiers, so that the position information corresponding to the anonymization identifiers is taken into account, and the first dynamic feature map nodes are further accurately determined.
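The iteration in steps S1311 to S1314 has the shape of a sequential scan in which each result becomes the starting state for the next node. The sketch below shows only that control flow. The `step` callback stands in for the dynamic risk identification, whose internals the description does not specify, and representing nodes as numbers and the position as a list index are assumptions made for illustration.

```python
def dynamic_features(nodes, initial, step):
    """Scan nodes from start to end, feeding each result into the next step."""
    state = initial                # S1311: preset initial graph node
    features = []
    for node in nodes:             # S1311: start point to end point order
        state = step(state, node)  # S1312: placeholder identification
        features.append(state)     # S1313: result is the next initial node
    return features


def first_dynamic_feature_nodes(features, position):
    """S1314: keep the feature nodes located before the given position."""
    return features[:position]


# Hypothetical placeholder step: blend the previous state with the node value.
features = dynamic_features([1.0, 2.0, 3.0], 0.0, lambda s, x: 0.5 * s + x)
first = first_dynamic_feature_nodes(features, 2)
```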
Step S132, determining a target identifier based on the distribution identifier of the fragment to be anonymized and the length of the fragment to be anonymized, and determining a feature graph node corresponding to the target identifier from the dynamic feature graph nodes to obtain a second dynamic feature graph node.
In this embodiment, the target identifier may be used to characterize an identifier corresponding to a graph node having an anonymization requirement, in other words, the second dynamic feature graph node may be understood as a graph node having an anonymization requirement.
Through the steps S131 and S132, the first dynamic feature map node of the fragment distribution identifier before the fragment distribution identifier to be anonymized and the second dynamic feature map node having the anonymization requirement can be respectively determined, so that a complete identification basis is provided for subsequent association feature identification, and the accuracy of association feature identification is ensured.
Step S133, performing static feature recognition based on the service big data graph node to obtain a static feature graph node, and determining a feature graph node corresponding to the fragment distribution identifier to be anonymized from the static feature graph node to obtain a first static feature graph node; and determining the feature map nodes behind the target identifier from the static feature map nodes to obtain second static feature map nodes.
It can be understood that, through the above steps S131 to S133, a first dynamic feature map node of the fragment distribution identifier before the fragment distribution identifier to be anonymized and a second dynamic feature map node having an anonymization requirement may be determined, and a first static feature map node and a second static feature map node of the node identifier after the target identifier corresponding to the fragment distribution identifier to be anonymized are determined, so that integrity and comprehensiveness of map nodes with different state features may be ensured, a complete identification basis is provided for subsequent associated feature identification, and accuracy of associated feature identification is ensured.
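The selection in steps S131 to S133 can be sketched as index arithmetic over two aligned node sequences. This sketch rests on several assumptions that the description does not confirm: the feature map nodes are held in lists ordered by position, the fragment distribution identifier is a list index, and the target identifier is computed as position plus length. The name `select_target_nodes` is hypothetical.

```python
def select_target_nodes(dynamic, static, position, length):
    """Pick the four groups of feature nodes used for association analysis."""
    target = position + length  # assumption: one way to derive the target identifier
    return {
        "first_dynamic": dynamic[:position],   # S131: nodes before the position
        "second_dynamic": dynamic[target],     # S132: node at the target identifier
        "first_static": static[position],      # S133: node at the position
        "second_static": static[target + 1:],  # S133: nodes after the target
    }


# Hypothetical sequences of six feature nodes each.
dynamic_nodes = ["d0", "d1", "d2", "d3", "d4", "d5"]
static_nodes = ["s0", "s1", "s2", "s3", "s4", "s5"]
selected = select_target_nodes(dynamic_nodes, static_nodes, position=2, length=2)
```

Under these assumptions the dynamic nodes cover the context before the fragment and the static nodes cover the context after it, so the two groups together surround the span to be anonymized.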
Step S140, performing associated feature recognition based on the target dynamic feature map node and the target static feature map node to obtain an associated feature map node of the anonymization identifier.
In related embodiments, by performing the association feature recognition on the target dynamic feature map node and the target static feature map node, an association feature map node reflecting the overall situation of the service big data can be obtained, so that an analysis basis and a judgment basis of a global level are provided for subsequent data anonymization processing.
Correspondingly, on the basis of steps S131 to S133, the above-described "performing associated feature recognition based on the target dynamic feature map node and the target static feature map node to obtain the associated feature map node of the anonymized identifier" in step S140 may be implemented by the following embodiments: and analyzing node association information based on the first dynamic feature map node, the second dynamic feature map node, the first static feature map node and the second static feature map node to obtain an association feature map node of the anonymization identifier.
It can be understood that by analyzing the node association information of the first dynamic feature map node, the second dynamic feature map node, the first static feature map node and the second static feature map node, the association between the dynamic feature map nodes, the association between the static feature map nodes and the association between the dynamic feature map nodes and the static feature map nodes can be considered, so that the obtained associated feature map node of the anonymization identifier can reflect the overall situation of the service big data, and a global analysis basis and a judgment basis are provided for subsequent data anonymization processing.
Step S150, performing anonymization analysis based on the associated feature graph node of the anonymization identifier and the graph node of the fragment to be anonymized to obtain an anonymity weight.
In this embodiment, the anonymity weight may be used to represent an analysis result of anonymizing, with the fragment to be anonymized, the service big data fragment that corresponds to the anonymization identifier in the service big data to be anonymized. Further, the analysis result may record the information content expression accuracy of the service big data fragments that are not anonymized, after the service big data fragment corresponding to the anonymization identifier has been anonymized by the fragment to be anonymized.
For example, the service big data includes a service big data fragment p1, a service big data fragment p2, a service big data fragment p3, a service big data fragment p4, and a service big data fragment p5. If the service big data fragment corresponding to the anonymization identifier is the service big data fragment p4, then after the service big data fragment p4 is anonymized by the fragment to be anonymized xxx, the information content expression accuracy of the remaining service big data fragments p1, p2, p3 and p5 can be expressed through the analysis result.
In some examples, a larger anonymity weight indicates that, after the service big data fragment corresponding to the anonymization identifier is anonymized by the fragment to be anonymized, the information content expression accuracy of the related service big data fragments that are not anonymized is higher; a smaller anonymity weight indicates that this accuracy is lower.
By the design, the anonymization processing can be performed on the service big data fragment corresponding to the anonymization identification based on the anonymization weight and the fragment to be anonymized, so that on the premise of realizing data anonymization, the accuracy of other content expression of the anonymized service big data is ensured as much as possible, the damage of the anonymization processing on the data structure and the data content expression of the service big data is avoided, and the reliability of the data anonymization processing is improved.
In some optional embodiments, the "performing anonymization analysis based on the associated feature graph nodes of the anonymization identifier and the graph nodes of the fragment to be anonymized to obtain anonymity weights" in the above step S150 may include the following: performing anonymization risk identification based on the associated feature graph nodes of the anonymization identifier and the graph nodes of the fragment to be anonymized to obtain potential graph nodes to be anonymized, and performing linear identification based on the potential graph nodes to be anonymized to obtain the anonymity weights.
In this embodiment, the anonymization risk identification may be understood as data deviation risk or data damage risk identification after anonymization, and the potential graph nodes to be anonymized can be obtained by the anonymization risk identification, and may include graph nodes with different anonymization requirements and different anonymization degrees. In this way, by performing linear identification on the potential graph nodes to be anonymized, the mutual influence of different potential graph nodes in the anonymization process can be considered, and the reliability of the anonymization weight is ensured.
In some other embodiments, the above-mentioned steps of "performing anonymization risk identification based on the associated feature graph node of the anonymization identifier and the graph node of the fragment to be anonymized to obtain the potential graph node to be anonymized, and performing linear identification based on the potential graph node to be anonymized to obtain the anonymity weight" may include the following steps S151 to S155.
Step S151, acquiring a preset target potential graph node, and determining the current attribute graph node to be anonymized from the graph nodes of the fragment to be anonymized in order from the graph node starting point to the graph node end point.
In an actual implementation process, the preset target potential graph node can be selected according to actual requirements. The current attribute graph node to be anonymized can be understood as a graph node that may be anonymized.
Step S152, identifying the current attribute potential graph node to be anonymized corresponding to the current attribute graph node to be anonymized based on the preset target potential graph node, the associated feature graph node of the anonymization identifier and the current attribute graph node to be anonymized.
In this embodiment, the current attribute potential graph node to be anonymized may be understood as a graph node which may have an associated influence on other graph nodes in the anonymization process.
Step S153, linear identification is carried out on the basis of the current attribute potential graph nodes to be anonymized, and current anonymization weight values corresponding to the current attribute graph nodes to be anonymized are obtained.
It can be understood that the current anonymization weight value corresponding to the current attribute graph node to be anonymized can be accurately determined by performing linear identification on the current attribute potential graph node to be anonymized.
Step S154, the current attribute potential graph node to be anonymized is used as a preset target potential graph node, and the step of determining the current attribute graph node to be anonymized from the fragment graph node to be anonymized according to the sequence from the graph node starting point to the graph node end point is repeated until the anonymization weight value corresponding to each attribute graph node to be anonymized is obtained.
In this step, by repeatedly determining the anonymization weight value corresponding to each attribute map node to be anonymized, the degree of distinction between the anonymization weight values corresponding to the attribute map nodes to be anonymized can be ensured, thereby facilitating the subsequent determination of the anonymity weight.
Step S155, performing iterative identification based on the anonymization weight values corresponding to the attribute graph nodes to be anonymized to obtain the anonymity weight.
In this embodiment, performing iterative identification based on the anonymization weight values corresponding to the attribute map nodes to be anonymized may be understood as: performing anonymization simulation multiple times using the anonymization weight values corresponding to the attribute graph nodes to be anonymized, and then performing a weighted calculation over the different simulation results to obtain the anonymity weight. For example, if the anonymization weight value corresponding to the attribute map node to be anonymized is i, where i is a positive integer, the iterative identification may be performed i times.
It can be understood that, by implementing the above steps S151 to S155 and repeatedly determining the anonymization weight value corresponding to each attribute map node to be anonymized, the degree of distinction between the anonymization weight values corresponding to the attribute map nodes to be anonymized can be ensured. Further, when the anonymity weight is determined, it can be closely matched with the actual service interaction scenario, which helps ensure the reliability of the anonymity weight.
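The loop formed by steps S151 to S155 can be sketched as follows, purely as an illustration under stated assumptions. The description does not specify how a potential graph node is computed, what form the linear identification takes, or how the simulation results are weighted. In this sketch the potential node is a hypothetical element-wise average of the previous potential node, the associated feature node and the current node; the linear identification is a dot product with hypothetical coefficients `coeffs`; and the final aggregation is a plain mean standing in for the weighted calculation.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def estimate_anonymity_weight(assoc_node: Vector,
                              fragment_nodes: Sequence[Vector],
                              preset_latent: Vector,
                              coeffs: Vector) -> Tuple[List[float], float]:
    """Return per-node anonymization weight values and one aggregated weight."""
    if not fragment_nodes:
        raise ValueError("at least one fragment graph node is required")
    latent = list(preset_latent)            # S151: preset target potential node
    values: List[float] = []
    for node in fragment_nodes:             # S151: start point to end point
        # S152 (ASSUMPTION): the potential node is an element-wise average.
        latent = [(p + a + c) / 3.0
                  for p, a, c in zip(latent, assoc_node, node)]
        # S153 (ASSUMPTION): linear identification is a dot product.
        values.append(sum(w * x for w, x in zip(coeffs, latent)))
        # S154: `latent` is carried into the next iteration as the new preset.
    # S155 (ASSUMPTION): a plain mean stands in for the weighted calculation.
    return values, sum(values) / len(values)
```

Because each iteration feeds its potential node into the next one, later weight values depend on earlier nodes, which is one way to read the statement that the mutual influence of different potential graph nodes is taken into account.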
In some possible embodiments, the method may further include the following steps S161 to S163.
Step S161, inputting the service big data to be anonymized and the anonymization indicating data into a service big data processing network, where the service big data processing network performs graph node transformation on the service big data to be anonymized to obtain a service big data graph node, performs graph node transformation on the fragment to be anonymized to obtain a fragment graph node to be anonymized, performs dynamic feature identification and static feature identification on the basis of the service big data graph node to obtain a dynamic feature graph node and a static feature graph node, and determines a target dynamic feature graph node and a target static feature graph node from the dynamic feature graph node and the static feature graph node on the basis of the anonymization identifier.
Step S162, performing associated feature recognition based on the target dynamic feature map node and the target static feature map node to obtain an associated feature map node of the anonymization identifier, and performing anonymization analysis based on the associated feature map node of the anonymization identifier and the graph node of the fragment to be anonymized to obtain an anonymity weight.
In the above steps S161 and S162, the business big data processing network is obtained by training using the business big data sample set and the corresponding anonymization indication data sample set based on the machine learning model.
Step S163, anonymizing, based on the anonymity weight, the service big data fragment corresponding to the anonymization identifier in the service big data to be anonymized by using the fragment to be anonymized, so as to obtain anonymized service big data.
It can be understood that, in combination with the service big data processing network, on the premise of implementing data anonymization, the accuracy of other content expression of the service big data after anonymization can be ensured as much as possible, so that the damage of the anonymization processing on the data structure and the data content expression of the service big data can be avoided, and the reliability of data anonymization processing can be further improved.
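As a minimal sketch of step S163, the replacement of the identified fragment can be illustrated with the p1 to p5 example used earlier. The sketch assumes that the anonymization identifier is a list index and that the anonymity weight gates the replacement through a threshold `min_weight`. Both the gating rule and the default value 0.5 are hypothetical and are not taken from the description.

```python
from typing import List, Sequence


def anonymize_segment(segments: Sequence[str],
                      anonymization_id: int,
                      fragment: str,
                      anonymity_weight: float,
                      min_weight: float = 0.5) -> List[str]:
    """Replace the identified segment with the fragment when the weight allows it."""
    if not 0 <= anonymization_id < len(segments):
        raise IndexError("anonymization identifier is out of range")
    result = list(segments)                 # never mutate the caller's data
    # ASSUMPTION: a low weight means the replacement would damage the
    # expression accuracy of the remaining segments, so it is skipped.
    if anonymity_weight >= min_weight:
        result[anonymization_id] = fragment
    return result
```

With segments `["p1", "p2", "p3", "p4", "p5"]`, identifier 3, fragment `"xxx"` and a weight of 0.8, the sketch yields `["p1", "p2", "p3", "xxx", "p5"]`; with a weight of 0.2 the segments are returned unchanged.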
In some other embodiments, the business big data processing network may include a graph node layer and an anonymization analysis layer. The graph node layer and the anonymization analysis layer may be understood as related functional network layers of the business big data processing network, and when the business big data processing network is trained, the related functional network layers may be trained together. Based on this, the "inputting the service big data to be anonymized and the anonymization indicating data into the service big data processing network" described in the above step S161 may include the following contents: inputting the service big data to be anonymized and the anonymization indicating data into the graph node layer, where the graph node layer performs graph node processing on the service big data to be anonymized to obtain a service big data graph node, performs graph node identification on the fragment to be anonymized to obtain a graph node of the fragment to be anonymized, performs dynamic feature identification and static feature identification based on the service big data graph node to obtain dynamic feature graph nodes and static feature graph nodes, determines a target dynamic feature graph node and a target static feature graph node from the dynamic feature graph nodes and the static feature graph nodes based on the anonymization identifier, and performs associated feature identification based on the target dynamic feature graph node and the target static feature graph node to obtain an associated feature graph node of the anonymization identifier; and inputting the associated feature graph node of the anonymization identifier and the graph node of the fragment to be anonymized into the anonymization analysis layer, where the anonymization analysis layer performs anonymization analysis based on the associated feature graph node of the anonymization identifier and the graph node of the fragment to be anonymized to obtain the anonymity weight.
It can be understood that by training the service big data processing network and the related function network layer thereof, the service big data processing network can be applied to different service scenes, so that data anonymization processing is performed in different service scenes, the accuracy of other content expression of the anonymized service big data is ensured as much as possible on the premise of realizing data anonymization, the damage of the anonymization processing on the data structure and the data content expression of the service big data is avoided, and the reliability of data anonymization processing is further improved.
In some optional embodiments, after the "anonymizing, based on the anonymity weight, the service big data fragment corresponding to the anonymization identifier in the service big data to be anonymized by using the fragment to be anonymized, so as to obtain anonymized service big data" described in step S163, the method may further include the following step S170.
Step S170, user portrait mining is carried out on the anonymization service big data to obtain portrait mining results, privacy information identification is carried out on the portrait mining results to obtain privacy information identification results, and anonymization protection scores of the anonymization service big data are determined according to the privacy information identification results.
In practical implementation, the anonymization protection score may take a value from 0 to 1; the higher the value, the stronger the privacy protection capability of the anonymized service big data. Further, the privacy information identification of the portrait mining result can be carried out by using predetermined privacy information labels. Correspondingly, the privacy information identification result can be the identification matching degree of the privacy information, and the anonymization protection score of the anonymized service big data can be calculated from the identification matching degree and related evaluation factors. The related evaluation factors can be increased or decreased according to the actual situation, and are not described in detail herein.
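One possible way to compute such a score is sketched below. The description does not give a formula, so this is an assumption: the score is taken as one minus the factor-weighted mean of the identification matching degrees, clamped to the range 0 to 1, so that a higher matching degree (more private information still recognizable) lowers the score. The names `protection_score`, `match_degrees` and `factors` are hypothetical.

```python
from typing import Optional, Sequence


def protection_score(match_degrees: Sequence[float],
                     factors: Optional[Sequence[float]] = None) -> float:
    """Map matching degrees in [0, 1] to a protection score in [0, 1]."""
    if not match_degrees:
        raise ValueError("at least one matching degree is required")
    if factors is None:
        factors = [1.0] * len(match_degrees)    # equal evaluation factors
    if len(factors) != len(match_degrees):
        raise ValueError("one evaluation factor per matching degree is required")
    total = sum(factors)
    if total <= 0:
        raise ValueError("evaluation factors must sum to a positive value")
    # ASSUMPTION: exposure is the factor-weighted mean of the matching degrees.
    exposure = sum(m * f for m, f in zip(match_degrees, factors)) / total
    return min(1.0, max(0.0, 1.0 - exposure))
```

For example, matching degrees of 0.2 and 0.4 with equal factors give a score of about 0.7, while weighting the second label three times as heavily gives about 0.65.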
In some alternative embodiments, the "user portrait mining on the anonymized business big data to obtain portrait mining results" described in the above step S170 may include the following steps S171 to S175.
Step S171, acquiring a group user portrait information set from the anonymized service big data.
In this embodiment, the group user portrait information set includes j consecutive pieces of group user portrait information, j being an integer greater than 1.
Step S172, acquiring an individual user portrait information set according to the group user portrait information set.
In this embodiment, the individual user portrait information set includes j consecutive pieces of individual user portrait information.
Step S173, based on the group user portrait information set, obtaining a group user keyword information set through a first keyword extraction network included in a group user identification model, and based on the individual user portrait information set, obtaining an individual user keyword information set through a second keyword extraction network included in the group user identification model.
In this embodiment, the group user keyword information set includes j pieces of group user keyword information, and the individual user keyword information set includes j pieces of individual user keyword information.
Step S174, acquiring, based on the group user keyword information set and the individual user keyword information set, a portrait label positioning result corresponding to the group user portrait information set through a portrait label positioning layer included in the group user identification model.
Step S175, determining the portrait mining result of the group user portrait information set according to the portrait label positioning result.
It can be understood that, by analyzing the individual user portraits and the group user portraits, the keyword information of the individual users and the keyword information of the group users can be obtained, so that privacy anonymization detection can be carried out and a portrait label positioning result representing the distribution of the individual portraits and the group portraits can be obtained. In this way, the portrait mining result of the group user portrait information set can be determined based on the portrait label positioning result, targeted mining of the group portraits and individual portraits of the anonymized service big data is realized, and the integrity of the portrait mining result is ensured.
In some alternative embodiments, the "obtaining the portrait label positioning result corresponding to the portrait information set of the group user through the portrait label positioning layer included in the group user identification model based on the group user keyword information set and the individual user keyword information set" in the step S174 may be implemented by the following two implementations.
In a first implementation, j first keyword features are obtained through a first feature classification layer included in the group user identification model based on the group user keyword information set, wherein each first keyword feature corresponds to one piece of group user keyword information; j second keyword features are obtained through a second feature classification layer included in the group user identification model based on the individual user keyword information set, wherein each second keyword feature corresponds to one piece of individual user keyword information; feature integration processing is performed on the j first keyword features and the j second keyword features to obtain j target keyword features, wherein each target keyword feature comprises one first keyword feature and one second keyword feature; a global keyword feature is acquired through an anonymization positioning network included in the group user identification model based on the j target keyword features, wherein the global keyword feature is determined according to the j target keyword features and j anonymization positioning heat degrees, and each target keyword feature corresponds to one anonymization positioning heat degree; and the portrait label positioning result corresponding to the group user portrait information set is acquired through the portrait label positioning layer included in the group user identification model based on the global keyword feature.
In a second implementation, j first keyword features are obtained through a first non-anonymization positioning network included in the group user identification model based on the group user keyword information set, wherein each first keyword feature corresponds to one piece of group user keyword information; j second keyword features are obtained through a second non-anonymization positioning network included in the group user identification model based on the individual user keyword information set, wherein each second keyword feature corresponds to one piece of individual user keyword information; feature integration processing is performed on the j first keyword features and the j second keyword features to obtain j target keyword features, wherein each target keyword feature comprises one first keyword feature and one second keyword feature; and the portrait label positioning result corresponding to the group user portrait information set is acquired through the portrait label positioning layer included in the group user identification model based on the j target keyword features.
In this way, by alternatively implementing the above-described embodiment of "acquiring a portrait label positioning result corresponding to the group user portrait information set through the portrait label positioning layer included in the group user identification model based on the group user keyword information set and the individual user keyword information set", the portrait label positioning result can be accurately and reliably determined, and the accuracy of expression of the portrait label positioning result on the individual portrait and the group portrait distribution is ensured.
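The feature integration shared by both implementations, and the heat-weighted global keyword feature of the first implementation, can be sketched as follows. This is a hedged illustration only: integration is assumed to be pairwise concatenation, the anonymization positioning heat degrees are assumed to be raw scores normalized with a softmax, and the global keyword feature is assumed to be the heat-weighted sum of the target keyword features. None of these choices is stated in the description, and the function names are hypothetical.

```python
import math
from typing import List, Sequence

Vector = List[float]


def integrate_features(first: Sequence[Vector],
                       second: Sequence[Vector]) -> List[Vector]:
    """Build j target keyword features, each holding one first and one second feature."""
    if len(first) != len(second):
        raise ValueError("both feature lists must contain j entries")
    # ASSUMPTION: integration is concatenation of each aligned pair.
    return [list(f) + list(s) for f, s in zip(first, second)]


def softmax(scores: Sequence[float]) -> List[float]:
    """Normalize raw scores into positive weights that sum to one."""
    peak = max(scores)                      # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def global_keyword_feature(targets: Sequence[Vector],
                           heat_scores: Sequence[float]) -> Vector:
    """Combine j target features using one heat degree per feature."""
    if len(targets) != len(heat_scores):
        raise ValueError("one heat degree per target keyword feature is required")
    heats = softmax(heat_scores)
    dim = len(targets[0])
    # ASSUMPTION: the global feature is the heat-weighted sum of the targets.
    return [sum(h * t[k] for h, t in zip(heats, targets)) for k in range(dim)]
```

Under the second implementation, the output of `integrate_features` would be passed directly to the portrait label positioning layer, without the pooling step.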
In some alternative embodiments, the "obtaining j first keyword features through the first non-anonymized positioning network included in the group user identification model based on the group user keyword information set" described in the above steps may include the following: for each group user keyword information in the group user keyword information set, obtaining first dynamic individual keyword information through a dynamic privacy classification layer included in the first non-anonymization positioning network, wherein the first non-anonymization positioning network belongs to the group user identification model; for each group user keyword information in the group user keyword information set, acquiring first potential individual keyword information through a feature classification layer included in the first non-anonymization positioning network; for each group user keyword information in the group user keyword information set, acquiring first global keyword information through a user classification layer included in the first non-anonymization positioning network based on the first dynamic individual keyword information and the first potential individual keyword information; and for each group user keyword information in the group user keyword information set, acquiring a first keyword feature through a first feature classification layer included in the first non-anonymization positioning network based on the first global keyword information and the group user keyword information.
In some alternative embodiments, the "obtaining j second keyword features through the second non-anonymized positioning network included in the group user identification model based on the individual user keyword information sets" described in the above steps may include the following: for each individual user keyword information in the individual user keyword information set, obtaining second dynamic individual keyword information through a dynamic privacy classification layer included in the second non-anonymization positioning network, wherein the second non-anonymization positioning network belongs to the group user identification model; for each individual user keyword information in the individual user keyword information set, obtaining second potential individual keyword information through a feature classification layer included in the second non-anonymization positioning network; for each individual user keyword information in the individual user keyword information set, obtaining second global keyword information through a user classification layer included in the second non-anonymization positioning network based on the second dynamic individual keyword information and the second potential individual keyword information; and for each individual user keyword information in the individual user keyword information set, acquiring a second keyword feature through a second feature classification layer included in the second non-anonymization positioning network based on the second global keyword information and the individual user keyword information.
It should be understood that the related definitions of group portraits, individual portraits, group keywords, and individual keywords may be found in the related art and are not listed here. By implementing the above scheme, the first keyword features and the second keyword features can be completely determined, which facilitates accurate division of the group portrait distribution and the individual portrait distribution.
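The data flow through one non-anonymization positioning network can be sketched as a composition of four layers. The layer internals are not disclosed, so the sketch takes each layer as a caller-supplied function; the four `toy_*` functions are placeholder stand-ins introduced only so that the example runs, and all names are hypothetical.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def keyword_features(keyword_infos: Sequence[Vector],
                     dynamic_privacy_layer: Callable[[Vector], Vector],
                     feature_classification_layer: Callable[[Vector], Vector],
                     user_classification_layer: Callable[[Vector, Vector], Vector],
                     output_feature_layer: Callable[[Vector, Vector], Vector],
                     ) -> List[Vector]:
    """Produce one keyword feature per piece of keyword information."""
    features: List[Vector] = []
    for info in keyword_infos:
        dynamic = dynamic_privacy_layer(info)            # dynamic individual keyword info
        potential = feature_classification_layer(info)   # potential individual keyword info
        global_info = user_classification_layer(dynamic, potential)  # global keyword info
        features.append(output_feature_layer(global_info, info))     # keyword feature
    return features


# Toy stand-ins (ASSUMPTIONS, not the disclosed layers).
def toy_dynamic(x: Vector) -> Vector:
    return [2.0 * v for v in x]


def toy_potential(x: Vector) -> Vector:
    return [v + 1.0 for v in x]


def toy_user(d: Vector, p: Vector) -> Vector:
    return [a + b for a, b in zip(d, p)]


def toy_output(g: Vector, x: Vector) -> Vector:
    return list(g) + list(x)
```

The same function can serve the first and the second non-anonymization positioning network by passing group user keyword information or individual user keyword information, together with the corresponding layers.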
Secondly, for the above big data service processing method based on artificial intelligence, an embodiment of the present invention further provides an exemplary big data service processing apparatus based on artificial intelligence, and as shown in fig. 2, the big data service processing apparatus 200 based on artificial intelligence may include the following functional modules.
The data obtaining module 210 is configured to obtain an interaction data correlation of each online service interaction data in the online service interaction data set.
And the data screening module 220 is configured to screen out a hot service interaction data subset from the online service interaction data set according to a set decision value and the interaction data correlation of each online service interaction data.
A data determining module 230, configured to determine, based on the hot service interaction data subset, dynamic service interaction data in the online service interaction data set and a target service parameter of the dynamic service interaction data, where the target service parameter of the dynamic service interaction data is greater than the set determination value.
A data obtaining module 240, configured to obtain a cold business interaction data subset in the online business interaction data set according to static business interaction data in the online business interaction data set except for the dynamic business interaction data and an association relationship between the static business interaction data.
A parameter determining module 250, configured to determine, based on the cold business interaction data subset and the dynamic business interaction data, a target business parameter of each online business interaction data in the cold business interaction data subset; and the determined target service parameters are used for generating service interaction portrait information corresponding to the corresponding online service interaction data.
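The cooperation of modules 210 to 250 can be sketched as a single function, under several assumptions that the description leaves open: the interaction data correlation is taken as the number of associated records; the hot subset contains records whose correlation exceeds the set decision value; the dynamic data coincide with the hot subset and use the correlation as the target service parameter; the cold subset contains static records associated with at least one other static record; and the target service parameter of a cold record is a damped mean over its associated dynamic records. The name `process_interactions` and the `damping` parameter are hypothetical.

```python
from typing import Dict, Set


def process_interactions(associations: Dict[str, Set[str]],
                         decision_value: float,
                         damping: float = 0.5) -> Dict[str, object]:
    """Run a toy version of modules 210 to 250 on an association map."""
    # Module 210 (ASSUMPTION): correlation = number of associated records.
    correlation = {k: len(v) for k, v in associations.items()}
    # Module 220: screen the hot subset with the set decision value.
    hot = {k for k, c in correlation.items() if c > decision_value}
    # Module 230 (ASSUMPTION): dynamic data are the hot records, and each
    # parameter equals the correlation, which exceeds the decision value.
    dynamic_params = {k: float(correlation[k]) for k in hot}
    # Module 240 (ASSUMPTION): cold records are static records that are
    # associated with at least one other static record.
    static = set(associations) - hot
    cold = {k for k in static if associations[k] & static}
    # Module 250 (ASSUMPTION): damped mean over associated dynamic records.
    cold_params: Dict[str, float] = {}
    for k in cold:
        linked = [dynamic_params[n] for n in associations[k] if n in dynamic_params]
        mean = sum(linked) / len(linked) if linked else 0.0
        cold_params[k] = damping * mean
    return {"hot": hot, "dynamic_params": dynamic_params,
            "cold": cold, "cold_params": cold_params}
```

In this reading, every record in the hot subset receives a parameter above the decision value, while records in the cold subset inherit a reduced parameter from the dynamic records they are associated with.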
Then, based on the above method embodiment and apparatus embodiment, the embodiment of the present invention further provides a system embodiment, that is, an artificial intelligence based big data service processing system, please refer to fig. 3, where the artificial intelligence based big data service processing system 30 may include an artificial intelligence server 10 and a service user device 20. Wherein the artificial intelligence server 10 and the business user device 20 communicate to implement the above method, and further, the functionality of the big data business processing system 30 based on artificial intelligence is described as follows.
The artificial intelligence server 10 obtains the interactive data correlation of each online service interactive data in the online service interactive data set of the service user equipment 20; screening out hot service interaction data subsets from the online service interaction data sets according to the set judgment value and the interaction data correlation of each online service interaction data; determining dynamic service interaction data in the online service interaction data set and target service parameters of the dynamic service interaction data based on the hot service interaction data subset, wherein the target service parameters of the dynamic service interaction data are larger than the set judgment value; acquiring a cold business interaction data subset in the online business interaction data set according to static business interaction data in the online business interaction data set except for the dynamic business interaction data and the incidence relation between the static business interaction data; determining target service parameters of each online service interaction data in the cold service interaction data subset based on the cold service interaction data subset and the dynamic service interaction data; wherein the determined target service parameter is used for generating service interaction portrait information of the service user equipment 20 corresponding to the corresponding online service interaction data.
Further, referring to fig. 4 in conjunction, the artificial intelligence server 10 may include a processing engine 110, a network module 120, and a memory 130, the processing engine 110 and the memory 130 communicating through the network module 120.
The processing engine 110 may process the relevant information and/or data to perform one or more of the functions described herein. For example, in some embodiments, the processing engine 110 may include at least one processing engine (e.g., a single-core processor or a multi-core processor). By way of example only, the processing engine 110 may include a Central Processing Unit (CPU), an Application-Specific Integrated Circuit (ASIC), an Application-Specific Instruction Set Processor (ASIP), a Graphics Processing Unit (GPU), a Physics Processing Unit (PPU), a Digital Signal Processor (DSP), a Field Programmable Gate Array (FPGA), a Programmable Logic Device (PLD), a controller, a microcontroller unit, a Reduced Instruction Set Computer (RISC), a microprocessor, or the like, or any combination thereof.
The network module 120 may facilitate the exchange of information and/or data. In some embodiments, the network module 120 may be any type of wired or wireless network or combination thereof. Merely by way of example, the network module 120 may include a cable network, a wired network, a fiber optic network, a telecommunications network, an intranet, the Internet, a Local Area Network (LAN), a Wide Area Network (WAN), a Wireless Local Area Network (WLAN), a Metropolitan Area Network (MAN), a Public Switched Telephone Network (PSTN), a Bluetooth network, a wireless personal area network, a Near Field Communication (NFC) network, and the like, or any combination thereof. In some embodiments, the network module 120 may include at least one network access point. For example, the network module 120 may include wired or wireless network access points, such as base stations and/or network access points.
The Memory 130 may be, but is not limited to, a Random Access Memory (RAM), a Read Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Read-Only Memory (EPROM), an electrically Erasable Read-Only Memory (EEPROM), and the like. The memory 130 is used for storing a program, and the processing engine 110 executes the program after receiving the execution instruction.
It will be appreciated that the configuration shown in FIG. 4 is merely illustrative and that the artificial intelligence server 10 may include more or fewer components than shown in FIG. 4 or may have a different configuration than shown in FIG. 4. The components shown in fig. 4 may be implemented in hardware, software, or a combination thereof.
It should be understood that a person skilled in the art can determine the meaning of the related technical terms without doubt from the above disclosure. For example, for terms such as values, coefficients, weights, indexes, and factors, a person skilled in the art can deduce and determine their meaning from the logical relationship between the preceding and following text. The value range of these values can be selected according to the actual situation, for example 0 to 1, 1 to 10, or 50 to 100, and is not limited herein.
The skilled person can unambiguously determine some preset, reference, predetermined, set and target technical features/terms, such as threshold values, threshold intervals, threshold ranges, etc., from the above disclosure. For technical feature terms that are not explained, those skilled in the art can reasonably and unambiguously derive their meaning from the logical relations in the preceding and following paragraphs, so that the technical solution can be implemented clearly and completely. Prefixes of unexplained technical feature terms, such as "first", "second", "previous", "next", "current", "history", "latest", "best", "target", "specified", and "real-time", can be unambiguously derived and determined from the context. Suffixes of unexplained technical feature terms, such as "list", "feature", "sequence", "set", "matrix", "unit", "element", and "track", can also be derived and determined unambiguously from the context.
The foregoing disclosure of embodiments of the present invention will be apparent to those skilled in the art. It should be understood that the process of deriving and analyzing technical terms, which are not explained, by those skilled in the art based on the above disclosure is based on the contents described in the present application, and thus the above contents are not an inventive judgment of the overall scheme.
It should be appreciated that the system and its modules shown above may be implemented in a variety of ways. For example, in some embodiments, the system and its modules may be implemented in hardware, software, or a combination of software and hardware. The hardware portion may be implemented using dedicated logic; the software portion may be stored in a memory for execution by a suitable instruction execution system, such as a microprocessor or specially designed hardware. Those skilled in the art will appreciate that the methods and systems described above may be implemented using computer-executable instructions and/or embodied in processor control code, such code being provided, for example, on a carrier medium such as a diskette, CD-ROM, or DVD-ROM, a programmable memory such as read-only memory (firmware), or a data carrier such as an optical or electronic signal carrier. The system and its modules of the present application may be implemented not only by hardware circuits such as very large scale integrated circuits or gate arrays, semiconductors such as logic chips and transistors, or programmable hardware devices such as field programmable gate arrays and programmable logic devices, but also by software executed by various types of processors, or by a combination of the above hardware circuits and software (e.g., firmware).
It is to be noted that different embodiments may produce different advantages, and in different embodiments, any one or combination of the above advantages may be produced, or any other advantages may be obtained.
Having thus described the basic concept, it will be apparent to those skilled in the art that the foregoing detailed disclosure is to be considered merely illustrative and not restrictive of the broad application. Various modifications, improvements and adaptations to the present application may occur to those skilled in the art, although not explicitly described herein. Such modifications, improvements and adaptations are proposed in the present application and thus fall within the spirit and scope of the exemplary embodiments of the present application.
Also, this application uses specific language to describe embodiments of the application. Reference throughout this specification to "one embodiment," "an embodiment," and/or "some embodiments" means that a particular feature, structure, or characteristic described in connection with at least one embodiment of the present application is included in at least one embodiment of the present application. Therefore, it is emphasized and should be appreciated that two or more references to "an embodiment" or "one embodiment" or "an alternative embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, some features, structures, or characteristics of one or more embodiments of the present application may be combined as appropriate.
Moreover, those skilled in the art will appreciate that aspects of the present application may be illustrated and described in terms of several patentable species or situations, including any new and useful combination of processes, machines, manufacture, or materials, or any new and useful improvement thereon. Accordingly, various aspects of the present application may be embodied entirely in hardware, entirely in software (including firmware, resident software, micro-code, etc.) or in a combination of hardware and software. The above hardware or software may be referred to as a "data block," "module," "engine," "unit," "component," or "system." Furthermore, aspects of the present application may be represented as a computer product, including computer readable program code, embodied in one or more computer readable media.
The computer storage medium may comprise a propagated data signal with the computer program code embodied therewith, for example, on baseband or as part of a carrier wave. The propagated signal may take any of a variety of forms, including electromagnetic, optical, etc., or any suitable combination. A computer storage medium may be any computer-readable medium that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code located on a computer storage medium may be propagated over any suitable medium, including radio, cable, fiber optic cable, RF, or the like, or any combination of the preceding.
Computer program code required for the operation of various portions of the present application may be written in any one or more programming languages, including an object-oriented programming language such as Java, Scala, Smalltalk, Eiffel, JADE, Emerald, C++, C#, VB.NET, and the like, a conventional programming language such as C, Visual Basic, Fortran 2003, Perl, COBOL 2002, PHP, or ABAP, a dynamic programming language such as Python, Ruby, or Groovy, or other programming languages. The program code may execute entirely on the user's computer, partly on the user's computer as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, such as a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet), or in a cloud computing environment, or as a service, such as software as a service (SaaS).
Additionally, the order in which elements and sequences of the processes described herein are processed, the use of alphanumeric characters, or the use of other designations, is not intended to limit the order of the processes and methods described herein, unless explicitly claimed. While various presently contemplated embodiments of the invention have been discussed in the foregoing disclosure by way of example, it is to be understood that such detail is solely for that purpose and that the appended claims are not limited to the disclosed embodiments, but, on the contrary, are intended to cover all modifications and equivalent arrangements that are within the spirit and scope of the embodiments herein. For example, although the system components described above may be implemented by hardware devices, they may also be implemented by software-only solutions, such as installing the described system on an existing server or mobile device.
Similarly, it should be noted that in the preceding description of embodiments of the application, various features are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the embodiments. This method of disclosure, however, does not mean that the claimed subject matter requires more features than are expressly recited in the claims. Indeed, the embodiments may be characterized as having fewer than all of the features of a single embodiment disclosed above.
In some embodiments, numerals describing the number of components, attributes, and the like are used. It should be understood that such numerals used in the description of the embodiments are, in some instances, modified by the modifier "about", "approximately" or "substantially". Unless otherwise indicated, "about", "approximately" or "substantially" indicates that the stated number allows for adaptive variation. Accordingly, in some embodiments, the numerical parameters used in the specification and claims are approximations that may vary depending upon the desired properties of the individual embodiments. In some embodiments, the numerical parameters should take into account the specified significant digits and employ a general digit-preserving (rounding) approach. Although the numerical ranges and parameters used to define the broad scope in some embodiments of the application are approximations, in the specific examples such numerical values are set forth as precisely as practicable.
The entire contents of each patent, patent application, patent application publication, and other material cited in this application, such as articles, books, specifications, publications, documents, and the like, are hereby incorporated by reference into this application. Excluded are application history documents that are inconsistent with or contrary to the contents of this application, as well as documents (currently or later attached to this application) that limit the broadest scope of the claims of this application. It is noted that if the descriptions, definitions, and/or use of terms in the material attached to this application are inconsistent with or contrary to those stated in this application, the descriptions, definitions, and/or use of terms in this application shall control.
Finally, it should be understood that the embodiments described herein are merely illustrative of the principles of the embodiments of the present application. Other variations are also possible within the scope of the present application. Thus, by way of example, and not limitation, alternative configurations of the embodiments of the present application can be viewed as being consistent with the teachings of the present application. Accordingly, the embodiments of the present application are not limited to only those embodiments explicitly described and depicted herein.