Method, device and server for determining a target object
1. A method for determining a target object, comprising:
acquiring individual data and relationship data of a plurality of first objects;
constructing a corresponding relation knowledge graph according to the relation data of the plurality of first objects;
screening a plurality of second objects which meet a preset first requirement from the plurality of first objects according to the relation knowledge graph;
calling a preset potential object prediction model to process the individual data of the plurality of second objects so as to obtain a prediction result of the second objects;
and screening out target objects meeting a preset second requirement from the plurality of second objects according to the prediction results of the second objects.
2. The method of claim 1, wherein the first object comprises: a natural person object and/or a legal person object.
3. The method of claim 2, wherein the relationship data comprises at least one of: natural person relationship, equity relationship, job relationship, business relationship, and capital relationship.
4. The method of claim 3, wherein constructing a corresponding relation knowledge graph according to the relation data of the plurality of first objects comprises:
establishing a node according to the object identification of the first object;
connecting corresponding nodes by using edges according to the relation data between the first objects; and marking the attribute value of the edge to obtain the relation knowledge graph.
5. The method of claim 4, wherein screening a plurality of second objects from the plurality of first objects according to the relation knowledge graph comprises:
searching the relation knowledge graph, and finding a node whose object identification matches the client list as a starting node;
starting from the starting node, searching for nodes connected with the starting node through edges as candidate nodes;
and screening out, according to the attribute value of the edge between the candidate node and the starting node, a first object corresponding to a candidate node whose attribute value meets a preset second requirement as a second object.
6. The method of claim 2, wherein the individual data comprises at least one of:
loan data of the first object, income data of the first object, asset data of the first object, payment records of the first object, insurance data of the first object, and registration information of the first object.
7. The method of claim 6, wherein invoking a preset potential object prediction model to process the individual data of the plurality of second objects to obtain a prediction result of the second object comprises:
preprocessing the individual data of the second object to extract individual features of multiple dimensions of the second object;
and calling a preset potential object prediction model to process the individual characteristics of the multiple dimensions of the second object so as to obtain a prediction result of the second object.
8. The method of claim 7, wherein the individual features of the plurality of dimensions comprise: occupation features, income features, housing features, vehicle features, consumption features, and natural person relationship features.
9. The method according to claim 2, wherein after selecting a target object meeting a preset second requirement from the plurality of second objects according to the predicted result of the second object, the method further comprises:
acquiring a service label of a target object;
determining a target pushing strategy aiming at a target object according to the service label;
and pushing link data related to the target business product to the target object according to the target pushing strategy.
10. The method of claim 1, further comprising:
acquiring individual characteristics of a plurality of sample objects as sample data;
carrying out category marking on the sample data to obtain marked sample data;
constructing an initial model and a preset loss function; wherein the preset loss function is a FocalLoss loss function;
and training and learning the initial model by using a preset loss function and the labeled sample data to obtain a preset potential object prediction model.
11. An apparatus for determining a target object, comprising:
the acquisition module is used for acquiring individual data and relationship data of a plurality of first objects;
the construction module is used for constructing a corresponding relation knowledge graph according to the relation data of the plurality of first objects;
the first screening module is used for screening a plurality of second objects which meet a preset first requirement from the plurality of first objects according to the relation knowledge graph;
the calling module is used for calling a preset potential object prediction model to process the individual data of the plurality of second objects so as to obtain a prediction result of the second objects;
and the second screening module is used for screening the target object which meets a preset second requirement from the plurality of second objects according to the prediction result of the second object.
12. A server comprising a processor and a memory for storing processor-executable instructions which, when executed by the processor, implement the steps of the method of any one of claims 1 to 10.
13. A computer storage medium having stored thereon computer instructions which, when executed, implement the steps of the method of any one of claims 1 to 10.
Background
In scenarios where business products are promoted (for example, a bank promoting financial products), staff responsible for promotion under existing methods often cannot fully and effectively use the massive information data available about a client. In most cases only part of the data is used, and whether the client is a potential client of the business product is then judged subjectively on the basis of personal experience. As a result, existing methods suffer from large errors and low accuracy when determining potential clients, which in turn impairs the subsequent promotion of the business product.
In view of the above problems, no effective solution has been proposed.
Disclosure of Invention
The specification provides a method, a device and a server for determining a target object, so that a potential client object with a high acceptance probability to a target business product can be screened out efficiently and accurately.
An embodiment of the present specification provides a method for determining a target object, including:
acquiring individual data and relationship data of a plurality of first objects;
constructing a corresponding relation knowledge graph according to the relation data of the plurality of first objects;
screening a plurality of second objects which meet a preset first requirement from the plurality of first objects according to the relation knowledge graph;
calling a preset potential object prediction model to process the individual data of the plurality of second objects so as to obtain a prediction result of the second objects;
and screening out target objects meeting a preset second requirement from the plurality of second objects according to the prediction results of the second objects.
In some embodiments, the first object comprises: a natural person object and/or a legal person object.
In some embodiments, the relationship data includes at least one of: natural person relationship, equity relationship, job relationship, business relationship, and capital relationship.
In some embodiments, constructing a corresponding relation knowledge graph according to the relation data of the plurality of first objects comprises:
establishing a node according to the object identification of the first object;
connecting corresponding nodes by using edges according to the relation data between the first objects; and marking the attribute value of the edge to obtain the relation knowledge graph.
In some embodiments, screening a plurality of second objects meeting a preset first requirement from the plurality of first objects according to the relationship knowledge graph comprises:
searching the relation knowledge graph, and finding a node whose object identification matches the client list as a starting node;
starting from the starting node, searching for nodes connected with the starting node through edges as candidate nodes;
and screening out, according to the attribute value of the edge between the candidate node and the starting node, a first object corresponding to a candidate node whose attribute value meets a preset second requirement as a second object.
In some embodiments, the individual data comprises at least one of:
loan data of the first object, income data of the first object, asset data of the first object, payment records of the first object, insurance data of the first object, and registration information of the first object.
In some embodiments, invoking a preset potential object prediction model to process the individual data of the plurality of second objects to obtain a prediction result of the second objects includes:
preprocessing the individual data of the second object to extract individual features of multiple dimensions of the second object;
and calling a preset potential object prediction model to process the individual characteristics of the multiple dimensions of the second object so as to obtain a prediction result of the second object.
In some embodiments, the individual features of the plurality of dimensions comprise: occupation features, income features, housing features, vehicle features, consumption features, and natural person relationship features.
In some embodiments, after the target object meeting a preset second requirement is selected from the plurality of second objects according to the prediction result of the second object, the method further includes:
acquiring a service label of a target object;
determining a target pushing strategy aiming at a target object according to the service label;
and pushing link data related to the target business product to the target object according to the target pushing strategy.
In some embodiments, the method further comprises:
acquiring individual characteristics of a plurality of sample objects as sample data;
carrying out category marking on the sample data according to the service labels of the sample objects to obtain marked sample data;
constructing an initial model and a preset loss function; wherein the preset loss function is a FocalLoss loss function;
and training and learning the initial model by using a preset loss function and the labeled sample data to obtain a preset potential object prediction model.
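The focal loss named in the training steps above can be illustrated with a minimal sketch. For a binary classifier that outputs a probability p for the positive class, focal loss is commonly written as FL = -alpha_t * (1 - p_t)^gamma * log(p_t), where p_t is the probability assigned to the true class. The function names and the default values alpha = 0.25 and gamma = 2.0 below are assumptions chosen for illustration; the specification does not state which hyperparameters or which model framework are used.

```python
import math


def focal_loss(p: float, y: int, alpha: float = 0.25, gamma: float = 2.0) -> float:
    """Binary focal loss for a single sample.

    p: predicted probability of the positive class
    y: ground-truth label, 1 (positive) or 0 (negative)
    alpha, gamma: assumed hyperparameters, not given in the specification
    """
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)        # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p          # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t) ** gamma down-weights samples that are already classified well
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def mean_focal_loss(probs, labels, alpha: float = 0.25, gamma: float = 2.0) -> float:
    """Average focal loss over a batch of (probability, label) pairs."""
    total = sum(focal_loss(p, y, alpha, gamma) for p, y in zip(probs, labels))
    return total / len(labels)
```

With gamma set to 0 the expression reduces to an alpha-weighted cross-entropy. Larger gamma values shift the training signal towards hard samples, which is typically useful when actual potential customers are a small minority of the labeled sample data.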
An embodiment of the present specification further provides a target object determination apparatus, including:
the acquisition module is used for acquiring individual data and relationship data of a plurality of first objects;
the construction module is used for constructing a corresponding relation knowledge graph according to the relation data of the plurality of first objects;
the first screening module is used for screening a plurality of second objects which meet a preset first requirement from the plurality of first objects according to the relation knowledge graph;
the calling module is used for calling a preset potential object prediction model to process the individual data of the plurality of second objects so as to obtain a prediction result of the second objects;
and the second screening module is used for screening the target object which meets a preset second requirement from the plurality of second objects according to the prediction result of the second object.
Embodiments of the present specification also provide a server, including a processor and a memory for storing processor-executable instructions, where the processor executes the instructions to implement the following: acquiring individual data and relationship data of a plurality of first objects; constructing a corresponding relation knowledge graph according to the relation data of the plurality of first objects; screening a plurality of second objects which meet a preset first requirement from the plurality of first objects according to the relation knowledge graph; calling a preset potential object prediction model to process the individual data of the plurality of second objects so as to obtain a prediction result of the second objects; and screening out target objects meeting a preset second requirement from the plurality of second objects according to the prediction results of the second objects.
Embodiments of the present specification also provide a computer storage medium having stored thereon computer instructions that, when executed, implement: acquiring individual data and relationship data of a plurality of first objects; constructing a corresponding relation knowledge graph according to the relation data of the plurality of first objects; screening a plurality of second objects which meet a preset first requirement from the plurality of first objects according to the relation knowledge graph; calling a preset potential object prediction model to process the individual data of the plurality of second objects so as to obtain a prediction result of the second objects; and screening out target objects meeting a preset second requirement from the plurality of second objects according to the prediction results of the second objects.
The specification provides a method, a device and a server for determining a target object. Based on the method, when potential customer objects need to be found for pushing a target business product, a corresponding relation knowledge graph can first be constructed according to the relation data of the first objects; the first objects are screened based on the relation knowledge graph to find second objects meeting a preset first requirement; a preset potential object prediction model can then be called to process the individual data of the second objects so as to obtain prediction results of the second objects; and the second objects are screened according to the prediction results to find target objects meeting a preset second requirement. In this way, by comprehensively using the relationship data and the individual data of the first objects, potential customer objects with a higher probability of accepting the target business product can be screened out from the first objects efficiently and accurately, and the error in determining the target object is reduced. Subsequently, information related to the target business product can be pushed to the target object according to a matched strategy, which improves the promotion effect of the target business product.
Drawings
In order to more clearly illustrate the embodiments of the present specification, the drawings needed to be used in the embodiments will be briefly described below, and the drawings in the following description are only some of the embodiments described in the specification, and it is obvious to those skilled in the art that other drawings can be obtained based on the drawings without any inventive work.
FIG. 1 is a schematic diagram of an embodiment of a structural component of a system to which a target object determination method provided by an embodiment of the present specification is applied;
FIG. 2 is a flow chart illustrating a method for determining a target object provided by an embodiment of the present description;
FIG. 3 is a schematic diagram of a server according to an embodiment of the present disclosure;
FIG. 4 is a schematic structural component diagram of a target object determination apparatus provided in an embodiment of the present specification;
FIG. 5 is a diagram illustrating an embodiment of a method for determining a target object according to an embodiment of the present disclosure;
FIG. 6 is a diagram illustrating an embodiment of a method for determining a target object according to an embodiment of the present disclosure;
FIG. 7 is a schematic diagram of an embodiment of a method for determining a target object provided by an embodiment of the present specification, in an example scenario.
Detailed Description
In order to make those skilled in the art better understand the technical solutions in the present specification, the technical solutions in the embodiments of the present specification will be clearly and completely described below with reference to the drawings in the embodiments of the present specification, and it is obvious that the described embodiments are only a part of the embodiments of the present specification, and not all of the embodiments. All other embodiments obtained by a person skilled in the art based on the embodiments in the present specification without any inventive step should fall within the scope of protection of the present specification.
The embodiment of the specification provides a method for determining a target object, which can be applied to a system comprising a server and a terminal device. In particular, reference may be made to fig. 1. The server and the terminal equipment can be connected in a wired or wireless mode to carry out specific data interaction.
In this embodiment, the server may specifically include a background server that is applied to a network platform side of a data center of a certain bank and is capable of implementing functions such as data transmission and data processing. Specifically, the server may be, for example, an electronic device having data operation, storage function and network interaction function. Alternatively, the server may be a software program running in the electronic device and providing support for data processing, storage and network interaction. In this embodiment, the number of servers included in the server is not particularly limited. The server may specifically be one server, or may also be several servers, or a server cluster formed by several servers.
In this embodiment, the terminal device may specifically include a front-end electronic device that is applied to one side of the network platform of the bank data center and is capable of implementing functions such as data acquisition, data monitoring, and data transmission on mass service data accessed by the network platform. Alternatively, the terminal device may be a software application capable of running in the electronic device. For example, it may be some APP running on a server, etc.
In this embodiment, the network platform of a bank data center accesses massive business data every day. Examples include the mortgage loan data of user A at the bank, the transfer transaction data sent by user B to user C through the bank's electronic banking service, and the registration information filled in by user D when opening an associated account at the bank.
Currently, the bank plans to promote a newly launched financial product (which may be denoted as the target business product). To improve the promotion effect, the server may apply the method for determining a target object provided in the embodiments of the present specification to make full and effective use of the massive business data accessed by the network platform, so as to screen out, from a plurality of candidate client objects (which may be denoted as first objects, including natural person objects and legal person objects), the client objects having a high acceptance probability and a high purchase intention for the financial product as potential client objects (which may be denoted as target objects), and then promote the product to the target objects in a targeted manner.
In a specific implementation, the server may first collect, through the terminal device, the massive business data accessed by the bank data center network platform.
The server may then extract the object identifications of the first objects, individual data of the first objects (e.g., mortgage loan data, car loan data, fund flow records, insurance data, etc. of a first object), and relationship data between the first objects (e.g., natural person relationship, equity relationship, job relationship, business relationship, fund relationship, etc.) by performing semantic recognition and relationship inference on the business data.
Further, the server may construct a corresponding relation knowledge graph using the relationship data of the plurality of first objects. Specifically, the server may determine the object identification of each first object involved in the relationship data (for example, the name of the first object, the registered mobile phone number of the first object, the account name of the first object, and the like), and then establish, according to the object identification, a node corresponding to the first object, drawn as a circle. The server may then, according to the relationship data between the first objects, connect the nodes corresponding to the different first objects involved in the same relationship data using edges that characterize the relationship. Meanwhile, the attribute values of the edges (for example, the type of the relationship represented by the edge, the closeness of the relationship, the duration of the relationship, and the like) can be marked based on the specific relationship data according to a preset relationship marking rule. In this way, the server can establish a relation knowledge graph that meets the requirements.
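The graph construction described above can be sketched as follows. This is only an illustration under stated assumptions: the `Relation` record, its field names, the 0-to-1 closeness score and the nested-dictionary graph layout are hypothetical choices, since the specification does not prescribe a data structure or a relationship marking rule.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    """One piece of relationship data between two first objects (hypothetical layout)."""
    source: str            # object identification of one first object
    target: str            # object identification of the other first object
    rel_type: str          # e.g. "natural_person", "equity", "job", "business", "fund"
    closeness: float       # assumed closeness score in [0, 1]
    duration_years: float  # assumed duration of the relationship


def build_relation_graph(relations):
    """Build an undirected graph: node -> neighbor -> list of edge attribute dicts."""
    graph = defaultdict(lambda: defaultdict(list))
    for r in relations:
        attrs = {
            "type": r.rel_type,
            "closeness": r.closeness,
            "duration_years": r.duration_years,
        }
        # Nodes are created implicitly from the object identifications;
        # the edge is stored in both directions and carries its attribute values.
        graph[r.source][r.target].append(attrs)
        graph[r.target][r.source].append(attrs)
    return graph


relations = [
    Relation("A", "B", "natural_person", 0.9, 10.0),
    Relation("B", "C", "fund", 0.4, 1.5),
]
graph = build_relation_graph(relations)
```

Storing a list of attribute dictionaries per node pair allows two first objects to be linked by several relationship types at once, for example an equity relationship and a fund relationship.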
The server can perform first screening by using the relation data between the first objects according to the relation knowledge graph, so as to screen out the first objects which are relatively more likely to become potential client objects as second objects which meet a preset first requirement based on the relation between different first objects.
Specifically, the server may first use an existing client list recording potential client objects (or high-quality client objects) to search the relation knowledge graph by object identification, so as to find the nodes whose object identification matches the client list as starting nodes. Then, starting from a starting node, the server may search for other nodes directly or indirectly connected to it through edges as candidate nodes. The server may then acquire the attribute values of the edges connecting the candidate nodes and the starting node and, according to these attribute values, screen out the first objects corresponding to the candidate nodes that have a closer association with the starting node (i.e. candidate nodes meeting the preset first requirement) as second objects meeting the preset first requirement.
The server may then call a pre-trained preset potential object prediction model and perform a second screening using the individual data of the second objects, so as to further screen out, based on the individual features of the second objects, the second objects having a relatively high probability of accepting and purchasing the target business product to be promoted as target objects (i.e., potential customer objects) meeting a preset second requirement.
Specifically, the server may first preprocess the individual data of a second object to extract individual features of multiple dimensions (e.g., occupation features, income features, housing features, vehicle features, consumption features, natural person relationship features, etc.). The combination of the individual features of the multiple dimensions is then input, as model input, into the pre-trained preset potential object prediction model, and the model is run to output the prediction result of the second object. According to the prediction results, the server can screen out, from the second objects, the target objects that meet the preset second requirement and have a higher probability of accepting and purchasing the target business product.
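The second screening can be sketched as a three-step flow: extract features, score with the model, keep objects above a threshold. Everything concrete below is an assumption made for illustration: the field names, the scaling constants, the logistic function with made-up weights standing in for the pre-trained model, and the 0.5 threshold standing in for the preset second requirement.

```python
import math


def extract_features(individual_data: dict) -> list:
    """Map raw individual data to a fixed-order numeric feature vector.

    The field names and scaling constants are hypothetical.
    """
    return [
        individual_data.get("annual_income", 0.0) / 100_000.0,  # income feature
        1.0 if individual_data.get("owns_house") else 0.0,      # housing feature
        1.0 if individual_data.get("owns_vehicle") else 0.0,    # vehicle feature
        individual_data.get("monthly_spend", 0.0) / 10_000.0,   # consumption feature
    ]


# Stand-in for the pre-trained prediction model: a logistic function
# with made-up weights. A real deployment would load a trained model instead.
WEIGHTS = [0.8, 0.6, 0.3, 0.5]
BIAS = -1.5


def predict_probability(features) -> float:
    """Return the assumed probability that the object accepts the product."""
    z = BIAS + sum(w * x for w, x in zip(WEIGHTS, features))
    return 1.0 / (1.0 + math.exp(-z))


def select_targets(second_objects: dict, threshold: float = 0.5) -> list:
    """second_objects maps object identification -> individual data."""
    scores = {
        oid: predict_probability(extract_features(data))
        for oid, data in second_objects.items()
    }
    # The threshold plays the role of the preset second requirement.
    return sorted(oid for oid, score in scores.items() if score >= threshold)


second_objects = {
    "B": {"annual_income": 200_000, "owns_house": True,
          "owns_vehicle": True, "monthly_spend": 8_000},
    "C": {"annual_income": 30_000, "owns_house": False,
          "owns_vehicle": False, "monthly_spend": 1_000},
}
targets = select_targets(second_objects)
```

In this sample only object B passes the threshold. Because the model is applied only to second objects that survived the graph-based screening, the more expensive feature extraction and prediction run on a smaller population.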
Finally, the server can promote the target business product to the target objects in a targeted manner.
Specifically, the server may first search the client database according to the object identifier of the target object to obtain a business label (e.g., a label indicating investment and financing tendency, behavior habit, etc. of the client) of the bank for the target object, and other relevant client data of the target object. And further, a matched target pushing strategy can be generated for the target object according to the service label of the target object and in combination with other related client data. And then pushing the link data about the target business product to the target object according to the target pushing strategy.
For example, according to the service label of the target object, the server may find that the target object habitually uses a mobile phone and, in investment and financial management, pays more attention to stability and value preservation. Based on this information, the server can generate a target pushing strategy matched with the target object. Furthermore, according to the target pushing strategy, the server can retrieve from a preset promotional text library the promotional text for the target business product that highlights stability and value preservation, and use it as the target promotion text. The target promotion text is combined with the download link data of the target business product to obtain target promotion data about the target business product for the target object, which is then sent to the mobile phone of the target object as a short message. The target object can thus receive the target promotion data in time through the mobile phone, is more likely to become interested in the target business product on reading the target promotion text, and is in turn more likely to download and purchase the target business product by triggering the download link data in the target promotion data.
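The mapping from service labels to a pushing strategy in this example can be sketched as a small rule table. The label names (`prefers_mobile`, `risk_averse`), the channel names, the promotional texts and the placeholder link are all hypothetical; the specification does not define a label vocabulary or a text library format.

```python
from dataclasses import dataclass


@dataclass
class PushStrategy:
    channel: str   # how the promotion data is delivered
    emphasis: str  # which product characteristic the text should highlight


def choose_strategy(service_labels: set) -> PushStrategy:
    """Derive a pushing strategy from the service labels (hypothetical rules)."""
    channel = "sms" if "prefers_mobile" in service_labels else "email"
    emphasis = "stability" if "risk_averse" in service_labels else "yield"
    return PushStrategy(channel, emphasis)


# Stand-in for the preset promotional text library, keyed by emphasis.
PROMO_TEXTS = {
    "stability": "A steady product designed to preserve value.",
    "yield": "A product aimed at higher returns.",
}


def build_push_message(service_labels: set, product_link: str) -> dict:
    """Combine the matching promotional text with the product link data."""
    strategy = choose_strategy(service_labels)
    body = f"{PROMO_TEXTS[strategy.emphasis]} {product_link}"
    return {"channel": strategy.channel, "body": body}


message = build_push_message(
    {"prefers_mobile", "risk_averse"},
    "https://bank.example/product",  # placeholder link, not a real address
)
```

Keeping the strategy as a separate object from the message makes it possible to reuse one strategy for several target business products.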
Through this embodiment, potential client objects with a higher probability of accepting the target business product can be screened out from the massive first objects more efficiently and accurately, and errors in determining the target object are reduced. Furthermore, target promotion data related to the target business product can be generated and pushed to the target object according to the matched target pushing strategy, which improves the promotion effect of the target business product.
Referring to fig. 2, an embodiment of the present disclosure provides a method for determining a target object. The method is particularly applied to the server side. In specific implementation, the method may include the following:
S201: acquiring individual data and relationship data of a plurality of first objects;
S202: constructing a corresponding relation knowledge graph according to the relation data of the plurality of first objects;
S203: screening a plurality of second objects which meet a preset first requirement from the plurality of first objects according to the relation knowledge graph;
S204: calling a preset potential object prediction model to process the individual data of the plurality of second objects so as to obtain a prediction result of the second objects;
S205: and screening out target objects meeting a preset second requirement from the plurality of second objects according to the prediction results of the second objects.
Through the embodiment, the two different types of data, namely the relationship data and the individual data of the first object, can be effectively and fully utilized, the potential customer object with higher acceptance probability of the target business product is efficiently and accurately screened out from the first object to serve as the target object, and the processing error in the process of determining the target object is reduced.
In some embodiments, the first object may be specifically understood as a customer object to be determined as a potential customer object of the target business product. The target business product may be a physical product (e.g., a mobile phone, a computer, a book, etc.), or may be a virtual service (e.g., a financial service, an insurance service, a cleaning service, etc.).
In some embodiments, for a financial service promotion scenario (for example, a bank promoting a newly released financial product), the first object may specifically include: natural person objects and/or legal person objects, and the like.
Of course, the first objects listed above are only illustrative. The first object may also include other types of objects for different application scenarios. For example, for a mobile phone promotion scenario, the first object may specifically include: student objects, adult objects, etc.
Through the embodiment, the method for determining the target object provided by the embodiment of the specification can be applied to the financial service promotion scene, and the specific judgment and determination of various first objects in the application scene can be effectively covered.
In some embodiments, the individual data may specifically include data reflecting individual characteristics of the first object. The relationship data may specifically include data reflecting an association relationship between different first objects.
In some embodiments, for a financial service promotion scenario, the relationship data may specifically include at least one of: natural person relationship, equity relationship, job relationship, business relationship, fund relationship, etc.
Through the embodiment, most of incidence relations in the financial service promotion scene can be better covered by comprehensively distinguishing and utilizing the relation data of various different types, so that the relation data can be effectively utilized to determine potential customer objects.
In some embodiments, for a financial service promotion scenario, the individual data may specifically include at least one of: loan data of the first object, income data of the first object, asset data of the first object, payment records of the first object, insurance data of the first object, registration information of the first object, and the like.
By the embodiment, most of the relevant individual characteristics in the financial service promotion scene can be better covered by comprehensively acquiring and utilizing various types of individual data, so that potential customer objects can be determined by effectively utilizing the individual data.
In some embodiments, the obtaining of the individual data and the relationship data of the plurality of first objects may include the following steps:
S1: acquiring business data related to a first object; the business data may specifically include: registration information of the first object, user data of the first object, product holding records of the first object, transaction data between the first objects, asset registration data relating to the first object, and the like;
S2: carrying out semantic recognition and association reasoning on the business data to extract individual data of the first object and relationship data between the first objects.
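Steps S1 and S2 above can be illustrated with a deliberately simplified sketch. It assumes the business data has already been parsed into structured records, and it replaces semantic recognition and association reasoning with plain rules keyed on a record type. The record layout and field names (`kind`, `customer`, `payer`, `payee`) are hypothetical.

```python
def extract_data(business_records):
    """Split business records into individual data and relationship data.

    Returns (individual, relationships), where individual maps an object
    identification to its fields and relationships is a list of
    (source id, target id, relationship type) tuples.
    """
    individual = {}
    relationships = []
    for rec in business_records:
        kind = rec["kind"]
        if kind == "loan":
            # A loan record describes a single first object: individual data.
            individual.setdefault(rec["customer"], {})["loan_amount"] = rec["amount"]
        elif kind == "registration":
            individual.setdefault(rec["customer"], {})["occupation"] = rec["occupation"]
        elif kind == "transfer":
            # A transfer links two first objects: a fund relationship.
            relationships.append((rec["payer"], rec["payee"], "fund"))
    return individual, relationships


records = [
    {"kind": "loan", "customer": "A", "amount": 500_000},
    {"kind": "transfer", "payer": "B", "payee": "C", "amount": 2_000},
    {"kind": "registration", "customer": "D", "occupation": "engineer"},
]
individual, relationships = extract_data(records)
```

The sample records mirror the kinds of business data mentioned earlier in the description: a loan handled for one user, a transfer between two users, and registration information filled in by another user.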
In some embodiments, the above building a corresponding relationship knowledge graph according to the relationship data of the plurality of first objects may include the following steps:
s1: establishing a node according to the object identification of the first object;
s2: connecting corresponding nodes by using edges according to the relation data between the first objects; and marking the attribute value of the edge to obtain the relation knowledge graph.
Through this embodiment, the relationship data between different first objects can be used effectively to construct a relationship knowledge graph that covers the association relationships more comprehensively and with higher accuracy.
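The two construction steps above can be sketched as follows. This is a minimal illustration only: the function name `build_relation_graph`, the tuple layout of the relationship data, and the `strength` edge attribute are assumptions made for the example and are not specified in this description.

```python
from collections import defaultdict


def build_relation_graph(object_ids, relations):
    """Return an adjacency map: node -> {neighbor: edge attributes}."""
    # S1: one node per object identification, even if it has no relations.
    graph = defaultdict(dict)
    for object_id in object_ids:
        graph[object_id]  # touching the key creates an empty neighbor map

    # S2: connect nodes with edges and mark each edge's attribute values.
    for a, b, relation_type, strength in relations:
        attrs = {"type": relation_type, "strength": strength}
        graph[a][b] = attrs
        graph[b][a] = attrs  # relationships are treated as undirected here
    return dict(graph)


# Hypothetical sample data: IDs, relation types and strengths are made up.
object_ids = ["P001", "P002", "P003", "C100", "P004"]
relations = [
    ("P001", "P002", "natural_person", 0.9),
    ("P001", "C100", "equity", 0.6),
    ("P003", "C100", "job", 0.4),
]
graph = build_relation_graph(object_ids, relations)
print(graph["P001"])
```

An adjacency map is used here only because it keeps the example dependency-free; a graph database or graph library could hold the same nodes, edges, and edge attributes.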
In some embodiments, the selecting, according to the relationship knowledge graph, a plurality of second objects that meet a preset first requirement from the plurality of first objects may include the following steps:
s1: searching the relation knowledge graph, and finding out a node whose object identification matches the client list as a starting node;
s2: starting from the starting node, searching nodes connected with the starting node through edges as candidate nodes;
s3: and screening out, according to the attribute value of the edge between the candidate node and the starting node, a first object corresponding to a candidate node whose attribute value meets a preset second requirement as a second object.
Through the embodiment, the first screening can be completed by effectively utilizing the relation data between different first objects according to the relation knowledge graph, so that the second object which is possibly a potential customer object can be screened from the massive first objects.
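The first screening described above might look like the sketch below. The adjacency-map layout, the `strength` attribute, the threshold test standing in for the preset requirement on the edge attribute, and the exclusion of objects already on the client list are all assumptions made for illustration.

```python
def screen_second_objects(graph, client_list, min_strength=0.5):
    """Return neighbors of client-list nodes whose edge passes the threshold."""
    # S1: nodes whose object identification matches the client list.
    starting_nodes = [node for node in graph if node in client_list]

    selected = set()
    for start in starting_nodes:
        # S2: nodes connected to the starting node by an edge are candidates.
        for candidate, attrs in graph[start].items():
            if candidate in client_list:
                continue  # assumption: known clients are not re-selected
            # S3: keep candidates whose edge attribute meets the requirement.
            if attrs["strength"] >= min_strength:
                selected.add(candidate)
    return selected


# Hypothetical graph: node -> {neighbor: edge attributes}.
graph = {
    "P001": {"P002": {"strength": 0.9}, "C100": {"strength": 0.6}},
    "P002": {"P001": {"strength": 0.9}},
    "C100": {"P001": {"strength": 0.6}, "P003": {"strength": 0.4}},
    "P003": {"C100": {"strength": 0.4}},
}
second_objects = screen_second_objects(graph, client_list={"P001"})
print(sorted(second_objects))
```

Only direct neighbors of a starting node are examined, which mirrors step s2; multi-hop traversal would be a different design choice.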
In some embodiments, the client list may specifically be list data recorded with object identifications that have been determined to be potential client objects or high potential client objects.
In some embodiments, for a financial service promotion scenario, the screening, according to the relationship knowledge graph, of a plurality of second objects meeting a preset first requirement from the plurality of first objects may further include: screening out, according to the individual data of the first objects, a first object whose personal asset data is larger than preset data as a candidate object; searching the relationship knowledge graph, and finding out the node whose object identification is the same as that of the candidate object as a starting node; then, starting from the starting node, searching nodes connected with the starting node through edges as candidate nodes; and then, according to the attribute values of the edges between the candidate nodes and the starting node, screening out the first object corresponding to a candidate node whose attribute value meets the preset second requirement as a second object.
In some embodiments, the invoking of the preset potential object prediction model to process the individual data of the plurality of second objects to obtain the prediction result of the second object may include the following steps: preprocessing the individual data of the second object to extract individual features of multiple dimensions of the second object; and calling a preset potential object prediction model to process the individual characteristics of the multiple dimensions of the second object so as to obtain a prediction result of the second object.
By the embodiment, the individual data of the second object can be preprocessed to extract and obtain corresponding individual features; then, a pre-trained preset potential object prediction model is called to process the individual characteristics so as to accurately determine a prediction result for indicating whether the second object is a potential customer object; and then, according to the prediction result of the second object, the individual data of the second object is fully utilized to complete the second screening, so that the potential customer object is further determined and screened out from the plurality of second objects to serve as a final target object.
In some embodiments, the individual features of the plurality of dimensions may specifically include: occupation characteristics, income characteristics, housing characteristics, vehicle characteristics, consumption characteristics, natural human relationship characteristics, and the like.
Through this embodiment, by comprehensively distinguishing and using individual features of multiple dimensions, most individual features in the financial service promotion scenario can be covered, and potential customer objects can then be identified at a finer granularity based on the individual features of the second object.
In some embodiments, the preprocessing of the individual data of the second object may be implemented by: performing semantic recognition on the individual data of the second object to extract a plurality of feature data of the second object; and calling a pre-trained feature classification model, and processing the plurality of feature data to obtain the individual features of the second object in a plurality of dimensions.
In some embodiments, the preset potential object prediction model may specifically be an XGBoost algorithm model.
In some embodiments, XGBoost may be understood as an engineering implementation based on GBDT, and GBDT is an algorithm model based on the boosting ensemble idea. When a model is trained with the XGBoost algorithm, greedy learning can be carried out using a forward stagewise algorithm: in each iteration, one CART tree is learned to fit the residual between the prediction result of the previous t-1 trees and the true value of the training sample. XGBoost supports parallel computing.
In some embodiments, when a preset potential object prediction model is specifically called to process the individual features of multiple dimensions of the second object, the individual features of each dimension may be converted into corresponding feature codes according to a preset feature coding rule corresponding to the feature dimension; and then, inputting a plurality of feature code combinations as model input into a preset potential object prediction model for specific processing.
In some embodiments, for a financial service promotion scenario, the above-mentioned occupation features can reflect the asset acquisition capability of the client object to some extent; for example, occupation A generally has a higher and more stable asset acquisition capability than other occupations. Before specific implementation, the following preset feature coding rules can be designed for the occupation features: 1 for occupation A, 2 for occupation B, 3 for occupation C, 4 for occupation D, and so on.
For a financial service promotion scenario, the income features can help estimate the asset condition of a client object, and include, for example: payroll (agency salary payment) information, party fee payment information, government affairs information (social security, housing provident fund), and the like.
For a financial service promotion scenario, the housing features can reflect the level and wealth of the client to a certain extent, and include, for example: house loan information, reliable contact addresses, decoration loan information, and the like.
For a financial service promotion scenario, the vehicle features also reflect important assets and the consumption capability of the client object, and include, for example: ETC (electronic toll collection) registration information, driving license scan information, automobile consumption loan information, and the like.
For a financial service promotion scenario, the consumption features are often related to the purchasing power of the customer, and the customer's assets are the main support of that purchasing power, so these features can also reflect the asset condition of the client object. They include, for example: average monthly credit card spending, credit card consumption (luxury consumption, overseas consumption), children's school payments (tuition payments), investment in physical precious metals, and the like.
For the financial service promotion scenario, the above-mentioned natural human relationship features, such as the natural human relationship between the client object and the determined potential client object (or high potential client object, ultra-high net value client object, etc.), can also reflect the asset status of the client object from the side. Before specific implementation, the following preset feature coding rules can be designed corresponding to the natural human relationship features: 11-class 1 relationships, 12-class 2 relationships, 13-class 3 relationships, and the like.
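The feature coding rules above can be illustrated with a small sketch. The occupation codes (1 to 4) and the relationship codes (11 to 13) follow the examples in the text; the record field names, the `monthly_income` field, and the code 0 for unknown values are assumptions added for the example.

```python
# Coding rules taken from the examples above.
OCCUPATION_CODES = {"A": 1, "B": 2, "C": 3, "D": 4}
RELATION_CODES = {"class_1": 11, "class_2": 12, "class_3": 13}


def encode_features(record, unknown=0):
    """Convert one object's individual features into a numeric feature vector."""
    return [
        # Categorical dimensions are mapped through their own coding rule.
        OCCUPATION_CODES.get(record.get("occupation"), unknown),
        RELATION_CODES.get(record.get("relation_class"), unknown),
        # Numeric dimensions are passed through as floats (hypothetical field).
        float(record.get("monthly_income", 0.0)),
    ]


vector = encode_features(
    {"occupation": "B", "relation_class": "class_3", "monthly_income": 12000}
)
print(vector)
```

The resulting vector is the kind of combined feature code that could be fed to the prediction model as input.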
In some embodiments, after the target object meeting the preset second requirement is screened out from the plurality of second objects according to the prediction result of the second object, when the method is implemented, the following may be further included:
s1: acquiring a service label of a target object;
s2: determining a target pushing strategy aiming at a target object according to the service label;
s3: and pushing link data related to the target business product to the target object according to the target pushing strategy.
By the embodiment, after the target object with higher acceptance probability of the target service product is determined from the massive first objects, the link data of the target service product can be pushed to the target object in a targeted manner according to the target pushing strategy matched with the target object, so that a better popularization effect can be obtained.
In some embodiments, the method, when implemented, may further include:
s1: acquiring individual characteristics of a plurality of sample objects as sample data;
s2: carrying out category marking on the sample data to obtain marked sample data;
s3: constructing an initial model and a preset loss function; wherein the preset loss function is a FocalLoss loss function;
s4: and training and learning the initial model by using a preset loss function and the labeled sample data to obtain a preset potential object prediction model.
Through the embodiment, the preset potential object prediction model with high accuracy and good effect can be obtained through efficient training by using the sample data and combining the preset loss function based on the FocalLoss loss function.
In some embodiments, the performing the category marking on the sample data to obtain the marked sample data may include, in specific implementation: according to the type of the sample data (for example, whether the sample data is a high potential client object or a non-high potential client object, or whether the sample data is a client object which is successfully popularized or a client object which is failed in popularization), a corresponding type label is set on the sample data and marked, so that the marked sample data can be obtained. For example, a category label representing a positive sample is set on sample data of a high-potential client object or a client object which succeeds in popularization, and a category label representing a negative sample is set on sample data of a non-high-potential client object or a client object which fails in popularization, so that labeled sample data is obtained.
After the labeled sample data is obtained, the labeled sample data can be further split into a training set and a test set according to a preset training proportion (for example, 9:1), so that model training can be performed by using the training set first, and then model testing can be performed by using the test set, and a preset potential object prediction model meeting requirements can be obtained.
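A 9:1 split of the labeled sample data could be sketched as follows. The shuffle before splitting and the fixed random seed are assumptions made so that the example is reproducible; the text only specifies the training proportion.

```python
import random


def split_samples(samples, train_ratio=0.9, seed=42):
    """Shuffle labeled samples and split them into training and test sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # local RNG, global state untouched
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical labeled samples: (feature vector, category label).
samples = [([i, i % 4 + 1], i % 2) for i in range(100)]
train_set, test_set = split_samples(samples)
print(len(train_set), len(test_set))
```

When positive and negative samples are strongly imbalanced, splitting each category separately (a stratified split) may keep the proportions comparable in both sets.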
In some embodiments, before model training is performed with the labeled sample data, numerical type detection may be performed on the individual features of the sample objects to find the individual features that are continuous numerical values (e.g., payroll information, house loan information, average monthly credit card spending, etc.); a histogram algorithm can then be used to discretize each continuous feature into k discrete values, and a histogram of width k (containing k bins) is constructed to accumulate statistics. During subsequent model training, the statistics in the histogram can be used directly: instead of traversing all feature values, only the k bins of the histogram need to be traversed to find the optimal split point. This greatly improves the computational efficiency of node splitting, reduces the variance of the model to some extent, enhances the robustness of the model, and improves training efficiency.
In some embodiments, the preset loss function may be a FocalLoss loss function. With this preset loss function, different weight coefficients can be set for different categories.
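The histogram idea can be sketched as below. Equal-width binning, the per-sample gradient and hessian statistics, and the second-order gain formula are assumptions borrowed from common gradient boosting practice; the text itself only states that k bins are built and traversed to find the split point.

```python
def build_histogram(values, grads, hess, k):
    """Discretize a continuous feature into k bins and sum statistics per bin."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k or 1.0  # avoid zero width when all values are equal
    hist = [[0.0, 0.0] for _ in range(k)]  # [gradient sum, hessian sum]
    for v, g, h in zip(values, grads, hess):
        b = min(int((v - lo) / width), k - 1)  # the maximum falls in the last bin
        hist[b][0] += g
        hist[b][1] += h
    return hist, lo, width


def best_split(hist, lo, width, lam=1.0):
    """Scan only the bin boundaries (not every sample) for the best split."""
    g_total = sum(g for g, _ in hist)
    h_total = sum(h for _, h in hist)
    parent = g_total * g_total / (h_total + lam)
    best_gain, best_threshold = 0.0, None
    g_left = h_left = 0.0
    for b in range(len(hist) - 1):
        g_left += hist[b][0]
        h_left += hist[b][1]
        g_right, h_right = g_total - g_left, h_total - h_left
        gain = (g_left ** 2 / (h_left + lam)
                + g_right ** 2 / (h_right + lam) - parent)
        if gain > best_gain:
            best_gain, best_threshold = gain, lo + (b + 1) * width
    return best_threshold, best_gain


# Hypothetical feature values with made-up gradient statistics.
values = [1.0, 2.0, 3.0, 4.0, 17.0, 18.0, 19.0, 20.0]
grads = [-1.0, -1.0, -1.0, -1.0, 1.0, 1.0, 1.0, 1.0]
hess = [1.0] * 8
hist, lo, width = build_histogram(values, grads, hess, k=4)
threshold, gain = best_split(hist, lo, width)
print(threshold, gain)
```

The scan visits k - 1 candidate boundaries regardless of how many samples there are, which is the source of the efficiency gain described above.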
In some embodiments, when the preset potential object prediction model is trained, based on the concept of the XGBoost algorithm, an objective function of model training (comprising two parts, namely a preset loss function and a regularization term) may be constructed in the following manner:
$$Obj = \sum_{i} l(y_i, \hat{y}_i) + \sum_{t=1}^{k} \Omega(f_t)$$
wherein Obj represents the function value of the objective function, l represents the preset loss function, Ω represents the regularization term, and f represents the model function ($f_t$ being the t-th of the k trees). $\hat{y}_i$ is the predicted value of the ith sample object $x_i$, and $y_i$ is the true value of the ith sample object $x_i$.
The preset loss function may be specifically expressed as the following form:
wherein $\alpha_i$ represents the weight coefficient set based on the category, and its value ranges from 0 to 1.
Because the objective function is constructed with a preset loss function that contains category-based weight coefficients, different weights can be given to different categories during model training. This addresses the problem that the loss of the sample category with a high proportion dominates the training process, causing the trained model to favor that category, and thereby improves the accuracy of model training.
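As an illustration of a category-weighted loss of this kind, the sketch below uses the commonly cited binary focal loss form, -α (1 - p_t)^γ log(p_t), where p_t is the predicted probability of the true category. The exact formula of this embodiment is not reproduced here, so the focusing parameter `gamma`, the default weights, and the function signature are assumptions.

```python
import math


def focal_loss(y_true, p_positive, class_weights=None, gamma=2.0, eps=1e-12):
    """Binary focal loss for one sample with a per-category weight alpha."""
    if class_weights is None:
        class_weights = {1: 0.75, 0: 0.25}  # assumed weights, each in (0, 1)
    # p_t is the probability the model assigns to the true category.
    p_t = p_positive if y_true == 1 else 1.0 - p_positive
    p_t = min(max(p_t, eps), 1.0 - eps)  # keep the logarithm finite
    alpha = class_weights[y_true]
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


# A confident correct prediction is penalized far less than a wrong one.
print(focal_loss(1, 0.9), focal_loss(1, 0.1))
```

With `gamma=0` and both weights set to 1, the function reduces to ordinary cross-entropy, which makes the effect of the category weights easy to isolate.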
Further, since XGBoost is an additive model, the predicted value (also called the prediction score) may specifically be the cumulative sum of the scores of each tree, as follows:
$$\hat{y}_i = \sum_{t=1}^{k} f_t(x_i)$$
The complexity of all k trees is summed and added to the objective function as a regularization term:
$$\sum_{t=1}^{k} \Omega(f_t), \quad \Omega(f_t) = \gamma J + \frac{1}{2} \lambda \sum_{j=1}^{J} w_j^2$$
wherein J represents the number of leaf nodes, $w_j$ represents the optimal solution (weight) of the jth leaf node, and γ and λ are regularization coefficients.
By introducing and utilizing the regularization term to construct the objective function so as to train the model, overfitting can be effectively prevented, and the training effect of the model is further improved.
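The additive prediction and the complexity penalty can be sketched as follows. The penalty uses the conventional XGBoost form (a per-leaf cost plus half the squared leaf weights); the coefficient names `gamma` and `lam`, the representation of a tree as a plain function, and the sample trees are assumptions made for the example.

```python
def predict_score(trees, x):
    """Additive model: the prediction score is the sum of every tree's score."""
    return sum(tree(x) for tree in trees)


def tree_complexity(leaf_weights, gamma=1.0, lam=1.0):
    """Complexity of one tree: gamma * J + 0.5 * lam * sum of squared weights."""
    return gamma * len(leaf_weights) + 0.5 * lam * sum(w * w for w in leaf_weights)


def regularization(all_leaf_weights, gamma=1.0, lam=1.0):
    """Sum the complexity of all trees to form the regularization term."""
    return sum(tree_complexity(w, gamma, lam) for w in all_leaf_weights)


# Two hypothetical trees: a one-split stump and a single-leaf tree.
trees = [
    lambda x: 0.5 if x["income"] > 10000 else -0.5,
    lambda x: 0.2,
]
all_leaf_weights = [[0.5, -0.5], [0.2]]

score = predict_score(trees, {"income": 20000})
penalty = regularization(all_leaf_weights)
print(score, penalty)
```

Larger trees and larger leaf weights both raise the penalty, which is how the term discourages overfitting.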
As can be seen from the above, based on the method for determining a target object provided in the embodiments of the present specification, when a potential customer object needs to be searched for pushing a target service product, a corresponding relationship knowledge graph may be first obtained and constructed according to relationship data of a first object; screening the first object based on the relation knowledge graph to find out a second object which meets a preset first requirement; then, a preset potential object prediction model can be called to process the individual data of the second object so as to obtain a prediction result of the second object; and screening the second object according to the prediction result of the second object to find out the target object meeting the preset second requirement. Therefore, the potential customer object with high acceptance probability to the target business product can be screened out from the first object efficiently and accurately by comprehensively utilizing the relationship data and the individual data of the first object. And then, information related to the target business product can be pushed according to the matched strategy aiming at the target object, so that the popularization effect of the target business product is improved.
Embodiments of the present specification further provide a server, including a processor and a memory for storing processor-executable instructions, where the processor, when implemented, may perform the following steps according to the instructions: acquiring individual data and relationship data of a plurality of first objects; constructing a corresponding relation knowledge graph according to the relation data of the plurality of first objects; screening a plurality of second objects which meet a preset first requirement from the plurality of first objects according to the relation knowledge graph; calling a preset potential object prediction model to process the individual data of the plurality of second objects so as to obtain a prediction result of the second objects; and screening out target objects meeting a preset second requirement from the plurality of second objects according to the prediction results of the second objects.
In order to more accurately complete the above instructions, referring to fig. 3, another specific server is provided in the embodiments of the present specification, wherein the server includes a network communication port 301, a processor 302, and a memory 303, and the above structures are connected by an internal cable, so that the structures may perform specific data interaction.
The network communication port 301 may be specifically configured to obtain individual data and relationship data of a plurality of first objects.
The processor 302 may be specifically configured to construct a corresponding relationship knowledge graph according to the relationship data of the plurality of first objects; screening a plurality of second objects which meet a preset first requirement from the plurality of first objects according to the relation knowledge graph; calling a preset potential object prediction model to process the individual data of the plurality of second objects so as to obtain a prediction result of the second objects; and screening out target objects meeting a preset second requirement from the plurality of second objects according to the prediction results of the second objects.
The memory 303 may be specifically configured to store a corresponding instruction program.
In this embodiment, the network communication port 301 may be a virtual port that is bound to different communication protocols, so that different data can be sent or received. For example, the network communication port may be a port responsible for web data communication, a port responsible for FTP data communication, or a port responsible for mail data communication. In addition, the network communication port may also be a physical communication interface or communication chip. For example, it may be a wireless mobile network communication chip, such as GSM, CDMA, etc.; it may also be a Wi-Fi chip; it may also be a Bluetooth chip.
In this embodiment, the processor 302 may be implemented in any suitable manner. For example, the processor may take the form of, for example, a microprocessor or processor and a computer-readable medium that stores computer-readable program code (e.g., software or firmware) executable by the (micro) processor, logic gates, switches, an Application Specific Integrated Circuit (ASIC), a programmable logic controller, an embedded microcontroller, and so forth. The description is not intended to be limiting.
In this embodiment, the memory 303 may include multiple layers, and in a digital system, the memory may be any memory as long as binary data can be stored; in an integrated circuit, a circuit without a physical form and with a storage function is also called a memory, such as a RAM, a FIFO and the like; in the system, the storage device in physical form is also called a memory, such as a memory bank, a TF card and the like.
The present specification further provides a computer storage medium based on the above determination method for a target object, where the computer storage medium stores computer program instructions, and when the computer program instructions are executed, the computer storage medium implements: acquiring individual data and relationship data of a plurality of first objects; constructing a corresponding relation knowledge graph according to the relation data of the plurality of first objects; screening a plurality of second objects which meet a preset first requirement from the plurality of first objects according to the relation knowledge graph; calling a preset potential object prediction model to process the individual data of the plurality of second objects so as to obtain a prediction result of the second objects; and screening out target objects meeting a preset second requirement from the plurality of second objects according to the prediction results of the second objects.
In this embodiment, the storage medium includes, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Cache (Cache), a Hard Disk Drive (HDD), or a Memory Card (Memory Card). The memory may be used to store computer program instructions. The network communication unit may be an interface for performing network connection communication, which is set in accordance with a standard prescribed by a communication protocol.
In this embodiment, the functions and effects specifically realized by the program instructions stored in the computer storage medium can be explained by comparing with other embodiments, and are not described herein again.
Referring to fig. 4, in a software level, an embodiment of the present specification further provides a device for determining a target object, where the device may specifically include the following structural modules:
the obtaining module 401 may be specifically configured to obtain individual data and relationship data of a plurality of first objects;
a building module 402, which may be specifically configured to build a corresponding relationship knowledge graph according to the relationship data of the plurality of first objects;
the first screening module 403 is specifically configured to screen, according to the relationship knowledge graph, a plurality of second objects that meet a preset first requirement from the plurality of first objects;
the invoking module 404 may be specifically configured to invoke a preset potential object prediction model to process the individual data of the plurality of second objects, so as to obtain a prediction result of the second object;
the second screening module 405 may be specifically configured to screen out a target object meeting a preset second requirement from the plurality of second objects according to the prediction result of the second object.
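How the modules could be wired together at the software level is sketched below. The class name, the injected callables, and the probability threshold standing in for the preset second requirement are assumptions; the stub implementations exist only to make the example runnable.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class TargetObjectDeterminer:
    build_graph: Callable[[list], dict]   # role of building module 402
    first_screen: Callable[[dict], set]   # role of first screening module 403
    predict: Callable[[dict], float]      # role of invoking module 404
    threshold: float = 0.5                # assumed preset second requirement

    def run(self, individual_data: Dict[str, dict], relation_data: list) -> List[str]:
        graph = self.build_graph(relation_data)
        second_objects = self.first_screen(graph)
        predictions = {
            oid: self.predict(individual_data[oid])
            for oid in second_objects
            if oid in individual_data
        }
        # Role of second screening module 405: keep objects above the threshold.
        return sorted(oid for oid, p in predictions.items() if p >= self.threshold)


# Stub components and made-up data, for illustration only.
determiner = TargetObjectDeterminer(
    build_graph=lambda rels: {a: {b} for a, b in rels},
    first_screen=lambda g: set().union(*g.values()) if g else set(),
    predict=lambda data: 0.8 if data["income"] > 10000 else 0.2,
)
individual_data = {"P002": {"income": 20000}, "P003": {"income": 5000}}
relation_data = [("P001", "P002"), ("P004", "P003")]
targets = determiner.run(individual_data, relation_data)
print(targets)
```

Passing each stage in as a callable keeps the division into modules a purely logical one, so any stage can be replaced without touching the others.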
It should be noted that, the units, devices, modules, etc. illustrated in the above embodiments may be implemented by a computer chip or an entity, or implemented by a product with certain functions. For convenience of description, the above devices are described as being divided into various modules by functions, and are described separately. It is to be understood that, in implementing the present specification, functions of each module may be implemented in one or more pieces of software and/or hardware, or a module that implements the same function may be implemented by a combination of a plurality of sub-modules or sub-units, or the like. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
As can be seen from the above, based on the determination apparatus for a target object provided in the embodiments of the present specification, by comprehensively utilizing the relationship data and the individual data of the first object, a potential customer object with a high acceptance probability for a target business product is efficiently and accurately screened out from the first object.
In a specific scenario example, the target client may be probed based on the association relationship and then marketed accordingly by applying the method for determining a target object provided by the embodiment of the present specification. The specific implementation process can be seen in fig. 5, and includes the following steps.
In step S1, a plurality of implicit relationships (e.g., relationship data of the first objects) are obtained and determined from the initial business data, and a knowledge graph (e.g., a relationship knowledge graph) is constructed using association relationships such as natural person relationships, equity relationships, job relationships, business transaction relationships, and fund relationships. Using the knowledge graph, various implicit relationships can be explored to mine a list of customers who currently appear to be low-value, low-asset customers (for example, total personal assets below 200,000) but whose actual assets are estimated to be high based on the associated data (for example, second objects meeting a preset first requirement are screened out from a large number of first objects).
In step S2, based on the low-asset potential customer list, an XGBoost algorithm model (e.g., a preset potential object prediction model) is used to predict potential target customers from features (e.g., individual features) of the six dimensions most closely related to funds: occupation, income, housing, automobile, consumption, and natural person relationships.
And step S3, according to the model prediction result, the comprehensive condition of the client can be judged by combining the traditional client star label, and powerful decision support is provided for the accurate marketing of the client manager (for example, the target object is determined, and the target business product is popularized on the target object).
In this scenario example, in a specific implementation, a knowledge graph may be constructed from company-to-person and person-to-person relationship data, with customer entities as "points" and the "relationships" between entities as "edges". To mine low-asset potential customers and accurately identify potential customers, company information (company settlers, listed company information) may be combined with business registration information (equity relationships, senior-management relationships) or fund flow association relationships; alternatively, private bank ultra-high-net-worth customers may be used as starting points, and target customers mined based on graph association relationships such as family relationships, primary and supplementary credit card relationships, and enterprise association relationships. The relationships may be determined from personal registration information, joint house loan repayment, loan guarantee information, credit card contacts, identical credit card addresses, identical contact addresses, insurance beneficiaries, fixed transfer accounts, frequent past dealings, government affairs information (marital status), and other information.
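One way such relationships could be derived from shared information, for example an identical contact address, is sketched below. The record layout, the field names, and the relation type label are hypothetical and serve only to illustrate the idea.

```python
from collections import defaultdict
from itertools import combinations


def relations_from_shared_field(records, field, relation_type):
    """Link every pair of customers that share the same value for a field."""
    groups = defaultdict(list)
    for record in records:
        value = record.get(field)
        if value:  # customers without the field cannot be linked through it
            groups[value].append(record["id"])

    edges = []
    for ids in groups.values():
        for a, b in combinations(sorted(ids), 2):
            edges.append((a, b, relation_type))
    return sorted(edges)


# Made-up customer records.
records = [
    {"id": "P1", "contact_address": "1 Example Road"},
    {"id": "P2", "contact_address": "1 Example Road"},
    {"id": "P3", "contact_address": "9 Sample Street"},
    {"id": "P4"},
]
edges = relations_from_shared_field(records, "contact_address", "same_contact_address")
print(edges)
```

The same helper could be called once per field (credit card address, fixed transfer account, and so on), and the resulting edges merged into the knowledge graph.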
In the scenario example, potential target customer predictions can be performed on the target customer groups by using the XGBoost algorithm model based on basic customer information such as occupation, income, housing, automobile, consumption, natural human relations and the like.
The features of each dimension are as follows.
Occupation: the occupation of a customer, particularly that of an important manager of an enterprise, determines the customer's asset acquisition capability. Common occupations are encoded, for example 1 for A, 2 for B, 3 for C, 4 for D, and the code is used as a feature for XGBoost model learning.
Income: the income of a customer can be used to estimate the customer's asset condition directly, and includes payroll information, party fee payment information, and government affairs information (social security and housing provident fund).
Housing: the different housing conditions of domestic customers reflect the customer's level and wealth, and include house loan information, reliable contact addresses, and decoration loan information.
Automobile: an automobile reflects important assets and the consumption capability of a customer, and includes ETC (electronic toll collection) registration information, driving license scan information, and automobile consumption loan information.
Consumption: the consumption situation reflects the purchasing power of the customer, and the customer's financial strength is the support behind that purchasing power. It includes average monthly credit card spending, credit card consumption (luxury consumption, overseas consumption), children's school payments (tuition payments), and investment in physical precious metals.
Natural person relationships: according to the association relationships obtained in step S1, the relationship between the customer to be predicted and private bank ultra-high-net-worth customers is encoded, for example 11 for a class 1 relationship, 12 for a class 2 relationship, 13 for a class 3 relationship, and so on.
Other: customers who once held high assets but were lost for various reasons can also be used as a basis for tracking.
In the present scenario example, referring to fig. 6, a corresponding system may be utilized to perform XGBoost algorithm-based training of a prediction model of a potential target customer. The system may specifically include the following structure:
Training sample unit 601 and test sample unit 602: in this scenario example, known private bank customers and high-net-worth customers can be used as positive samples, customer cases in which earlier marketing by grassroots-level customer managers failed can be used as negative samples, and basic customer information such as occupation, income, housing, automobile, consumption, and natural person relationships can be used as sample features. The sample data is divided into training samples and test samples at a ratio of 9:1.
The data preprocessing unit 603: for continuous feature values (such as payroll information, house loan information, average monthly credit card spending, and the like), a histogram algorithm is used to discretize them into k discrete values, and a histogram of width k (containing k bins) is constructed to accumulate statistics. With this preprocessing, the optimal split point can be found during model training by traversing only the k bins rather than all the data, which greatly improves the computational efficiency of node splitting, reduces the variance of the model to some extent, and enhances the robustness of the model.
Model training unit 604: in this scenario example, the XGBoost algorithm can be used to train the potential target customer prediction model. XGBoost is an engineering implementation based on GBDT, and GBDT is an algorithm model based on the boosting ensemble idea. During training, greedy learning is carried out using a forward stagewise algorithm, and in each iteration one CART tree is learned to fit the residual between the prediction result of the previous t-1 trees and the true value of the training sample. XGBoost shares the basic idea of GBDT but adds optimizations such as default missing value handling, second-order derivative information, regularization terms, and column sampling, and supports parallel computation.
The objective function of XGBoost consists of a loss function and a regularization term, and is defined as follows:
$$Obj = \sum_{i=1}^{n} l(y_i, \hat{y}_i) + \sum_{k=1}^{K} \Omega(f_k)$$
where l represents the loss function, y_i is the true label of the i-th sample, and \Omega(f_k) is the complexity of the k-th tree. The logistic-regression-based loss function can be expressed as:
$$l(y_i, \hat{y}_i) = -\left[ y_i \ln p_i + (1 - y_i) \ln (1 - p_i) \right], \qquad p_i = \frac{1}{1 + e^{-\hat{y}_i}}$$
where \hat{y}_i is the prediction score for the i-th sample x_i. When the sample data set is prepared, known private-banking customers and high-net-worth customers are used as positive samples and customers whom front-line customer managers failed to market to are used as negative samples, so the proportions of positive and negative samples differ greatly. During training, the loss from the majority class dominates, and the trained model favors that majority class, which reduces the accuracy of the model.
Thus, in this scenario example, focal loss may be employed as the loss function to assign a different weight to each class, where α_i is the class weight and takes values in the range 0 to 1.
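How a class-weighted focal loss differs from plain cross-entropy can be sketched as below. The code uses the commonly published alpha-balanced focal loss; the focusing parameter `gamma`, the default values, and the function names are assumptions for illustration, since the text here only specifies the class weight α_i.

```python
import math

def cross_entropy(y, p):
    """Logistic loss for one sample; y in {0, 1}, p = predicted P(y = 1)."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def focal_loss(y, p, alpha=0.25, gamma=2.0):
    """Alpha-balanced focal loss for one sample.

    alpha weights the positive class (1 - alpha weights the negative class);
    gamma (an assumed parameter) down-weights samples that are already
    classified with high confidence.
    """
    p_t = p if y == 1 else 1 - p             # probability of the true class
    alpha_t = alpha if y == 1 else 1 - alpha
    return -alpha_t * (1 - p_t) ** gamma * math.log(p_t)

# A well-classified positive contributes far less loss than a misclassified one.
easy = focal_loss(1, 0.9)
hard = focal_loss(1, 0.1)
```

Setting `gamma=0` and `alpha=0.5` reduces the function to half of the ordinary cross-entropy, which makes the effect of the two parameters easy to check.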
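Since XGBoost is an additive model, the prediction score is the cumulative sum of the scores of all trees:
$$\hat{y}_i = \sum_{k=1}^{K} f_k(x_i)$$
where f_k(x_i) is the score that the k-th tree assigns to sample x_i. The complexities of all K trees are summed and added to the objective function as a regularization term to prevent overfitting:
$$\sum_{k=1}^{K} \Omega(f_k), \qquad \Omega(f) = \gamma J + \frac{1}{2} \lambda \sum_{j=1}^{J} w_j^2$$
where J is the number of leaf nodes, w_j is the optimal weight of the j-th leaf node, and \gamma and \lambda are regularization coefficients.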
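The additive prediction and the regularization term can be illustrated numerically. The sketch assumes the usual XGBoost complexity Ω(f) = γJ + ½λΣw_j², where J is the number of leaves and w_j the leaf weights; the coefficient values, the leaf weights, and the helper names are hypothetical.

```python
def predict_score(trees, x):
    """Additive model: the score is the sum of the scores of all trees."""
    return sum(tree(x) for tree in trees)

def tree_complexity(leaf_weights, gamma=1.0, lam=1.0):
    """Complexity of one tree: gamma * J + 0.5 * lam * sum of squared weights."""
    return gamma * len(leaf_weights) + 0.5 * lam * sum(w * w for w in leaf_weights)

def objective(sample_losses, all_leaf_weights, gamma=1.0, lam=1.0):
    """Total objective: summed sample losses plus the complexity of every tree."""
    penalty = sum(tree_complexity(w, gamma, lam) for w in all_leaf_weights)
    return sum(sample_losses) + penalty

# Two hypothetical trees, each represented by a scoring function and its leaf weights.
trees = [lambda x: 0.5 if x > 3 else -0.5,
         lambda x: 0.2 if x > 5 else -0.3]
leaf_weights = [[0.5, -0.5], [0.2, -0.3]]

score = predict_score(trees, 4)              # 0.5 + (-0.3)
total = objective([0.4, 0.6], leaf_weights)
```

Larger trees (more leaves) and larger leaf weights both raise the penalty, which is how the regularization term discourages overfitting.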
Model evaluation unit 605: the trained model is evaluated using the test samples, and training iterates until the model evaluation result meets the prediction requirement, at which point training is finished.
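One way to check whether a trained model meets a prediction requirement on the test samples is sketched below. The choice of precision and recall as metrics, the 0.5 decision threshold, and the minimum values are assumptions for illustration; the specification does not name a particular metric.

```python
def evaluate(labels, scores, threshold=0.5):
    """Return (precision, recall) of thresholded scores against true labels."""
    tp = fp = fn = 0
    for y, s in zip(labels, scores):
        predicted = 1 if s >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 1:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def meets_requirement(labels, scores, min_precision=0.6, min_recall=0.6):
    """Hypothetical stopping rule: both metrics must reach their minimum."""
    precision, recall = evaluate(labels, scores)
    return precision >= min_precision and recall >= min_recall

# Hypothetical test labels and model scores.
test_labels = [1, 0, 1, 1, 0]
test_scores = [0.9, 0.6, 0.4, 0.8, 0.1]
precision, recall = evaluate(test_labels, test_scores)
```

With imbalanced classes, recall on the positive class is often more informative than overall accuracy.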
In this scenario example, as the business continues to use the system, the potential target customer prediction model can be retrained using the accumulated marketing result data together with the original information data in the database, thereby improving the accuracy of the model.
Further, referring to fig. 7, a marketing system for exploring target customers based on the association relationship may be constructed. The system may specifically include the following structure:
701 legal-person customer list unit: acquires lists of corporate settlement customers and listed companies;
702 legal-person-associated individual exploration unit (which may include an equity relationship unit, a job relationship unit, a capital relationship unit, etc.): abstracts entity objects such as corporate customers and individual customers into 'points', abstracts relationships between entities, such as business registration information (equity relationships, senior-management relationships) or the direction of capital flows, into 'edges', generates a relationship network, and explores potential individual customers along the 'relationship' dimension;
703 high-value individual customer list unit: acquires lists of private-banking customers and high-net-worth customers;
704 individual-associated individual exploration unit (including a natural person relationship unit, a capital relationship unit, an applicant relationship unit, etc.): abstracts individual customers into 'points', abstracts relationships between entities into 'edges', such as relatives (drawing on personal registration information, joint mortgage repayers, loan guarantee information, credit card contacts, identical credit card addresses, identical contact addresses, insurance beneficiaries, regular transfer accounts, counterparties with close fund exchanges, and government-affairs information (marriage)), primary and supplementary credit card relationships, and enterprise association relationships, generates a relationship network, and explores potential individual customers along the 'relationship' dimension;
705 potential customer list unit: queries the personal assets of the potential individual customers listed by the legal-person-associated individual exploration unit and the individual-associated individual exploration unit, and performs an implicit star-level evaluation if a customer's net assets are less than 200,000 yuan;
706 individual customer basic information unit: includes an occupation unit, an income unit, a housing unit, an automobile unit, a consumption unit, a natural person relationship unit, etc. related to individual customers;
707 potential target customer prediction model: an XGBoost model that predicts potential target customers within the target customer group based on the basic information features of individual customers;
708 marketing individual customer list unit: assesses each customer's overall situation by combining the model prediction result with the traditional customer star level, to obtain the list of customers to be marketed to.
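The flow through the units above (seed lists, relationship exploration, asset check, model score) can be sketched end to end as follows. All identifiers, relationship labels, asset figures, and scores are hypothetical, and the final selection uses a simple score threshold as a stand-in for the combined judgment with the star level, which the specification does not define in detail.

```python
from collections import defaultdict

def build_graph(relations):
    """Nodes are customer identifiers; each edge carries a relationship type."""
    graph = defaultdict(list)
    for a, b, kind in relations:
        graph[a].append((b, kind))
        graph[b].append((a, kind))
    return graph

def explore(graph, seeds, allowed_kinds):
    """Collect nodes directly connected to a seed through an allowed edge type."""
    seeds = set(seeds)
    found = set()
    for seed in seeds:
        for neighbor, kind in graph.get(seed, []):
            if kind in allowed_kinds and neighbor not in seeds:
                found.add(neighbor)
    return found

def marketing_list(candidates, net_assets, scores,
                   asset_limit=200_000, min_score=0.5):
    """Keep low-asset candidates whose predicted score is high enough."""
    return sorted(c for c in candidates
                  if net_assets.get(c, 0) < asset_limit
                  and scores.get(c, 0.0) >= min_score)

relations = [
    ("CompanyA", "p1", "equity"),
    ("CompanyA", "p2", "job"),
    ("vip1", "p3", "relative"),
    ("vip1", "p4", "card"),
    ("p4", "p5", "relative"),
]
graph = build_graph(relations)
candidates = explore(graph, ["CompanyA", "vip1"], {"equity", "job", "relative"})
net_assets = {"p1": 50_000, "p2": 500_000, "p3": 120_000}
scores = {"p1": 0.8, "p2": 0.9, "p3": 0.3}     # hypothetical model outputs
to_market = marketing_list(candidates, net_assets, scores)
```

Here `p2` is excluded by the asset check and `p3` by the score threshold, leaving `p1` as the only customer to be marketed to.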
This scenario example verifies that the method for determining a target object provided by the embodiments of this specification can overcome the limitations of existing relational databases and mine low-asset potential customers from massive data more efficiently, accurately, and quickly. At the same time, association relationships such as natural person relationships, equity relationships, job relationships, business relationships, and capital relationships are used to construct a knowledge graph and explore various implicit relationships. Customers who currently appear low-value and low-asset to the bank, but who may in fact be valuable, are identified from the association data, which narrows the range of customers to be analyzed. Further, based on six major potential-analysis dimensions of basic customer information (occupation, income, housing, automobiles, consumption, and natural person relationships), a potential target customer prediction model can be constructed with the improved XGBoost algorithm to predict the target customer group, and the prediction result can serve as the main entry point for marketing. This approach replaces the disorder of traditionally reviewing accumulated information dimension by dimension and strengthens the summary of the deeper circumstances behind each customer. Combining customer assets with the model prediction results to assess each customer's overall situation allows customer managers to identify customers and market services accurately.
Although the present specification provides method steps as described in the embodiments or flowcharts, more or fewer steps may be included based on conventional or non-inventive means. The order of steps recited in the embodiments is merely one of many possible orders of execution and does not represent the only one. When an actual apparatus or client product executes the method, the steps may be executed sequentially or in parallel according to the embodiments or the methods shown in the figures (e.g., in a parallel-processor or multithreaded processing environment, or even in a distributed data processing environment). The terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, the presence of additional identical or equivalent elements in a process, method, article, or apparatus that comprises the recited elements is not excluded. Terms such as first and second are used to denote names and do not indicate any particular order.
Those skilled in the art will also appreciate that, in addition to implementing the controller as pure computer readable program code, the same functionality can be implemented by logically programming method steps such that the controller is in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Such a controller may therefore be considered as a hardware component, and the means included therein for performing the various functions may also be considered as a structure within the hardware component. Or even means for performing the functions may be regarded as being both a software module for performing the method and a structure within a hardware component.
This description may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, classes, etc. that perform particular tasks or implement particular abstract data types. The specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
From the above description of the embodiments, it is clear to those skilled in the art that the present specification can be implemented by software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solutions in the present specification may be embodied essentially in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, or an optical disk, and which includes several instructions for enabling a computer device (which may be a personal computer, a mobile terminal, a server, a network device, etc.) to execute the methods described in the embodiments, or in some parts of the embodiments, of the present specification.
The embodiments in the present specification are described in a progressive manner, and the same or similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. The description is operational with numerous general purpose or special purpose computing system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet-type devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable electronic devices, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
While the specification has been described with examples, those skilled in the art will appreciate that there are numerous variations and permutations of the specification that do not depart from the spirit of the specification, and it is intended that the appended claims include such variations and modifications that do not depart from the spirit of the specification.