Method and device for identifying electricity stealing users by combining a cloud end and an edge end

Document No.: 8678 Publication date: 2021-09-17

1. A method for identifying electricity stealing users by combining a cloud end and an edge end, characterized by comprising the following steps:

in response to acquiring historical user electricity consumption data of the edge end and electricity stealing records of terminal equipment, respectively extracting electricity stealing identification evaluation indexes and electricity stealing labels to form a training data set;

training a combined classification model based on the training data set, wherein the combined classification model is a combination of a LightGBM submodel and a neural network submodel, and the LightGBM submodel is constructed as follows:

pre-sorting the continuous features in the data set and converting the continuous floating-point data into discrete data;

generating a decision tree based on the feature data, taking into account both the accuracy and the complexity of the decision tree, and defining the objective function of the decision tree as:

$$\mathrm{Obj} = \sum_{i} l(\hat{y}_i, y_i) + \Omega(f), \qquad \Omega(f) = \gamma T + \frac{1}{2}\lambda \lVert w \rVert^2$$

where $l(\hat{y}_i, y_i)$ measures the accuracy of the decision tree, $\hat{y}_i$ and $y_i$ are respectively the predicted and actual label values of the data set; $\Omega(f)$ measures the complexity of the decision tree, $T$ is the number of leaf nodes, $w$ is the weight vector of the leaf nodes, and $\gamma$ and $\lambda$ are regularization coefficients;

fitting a new decision tree using the negative gradient of the loss function as an approximation of the residual of the current decision tree, wherein the fitting targets of successive decision trees are related by:

$$\hat{y}_i^{(t)} = \hat{y}_i^{(t-1)} + f_t(x_i)$$

where $\hat{y}_i^{(t)}$ is the prediction of the first $t$ trees on sample $x_i$ of the data set, $\hat{y}_i^{(t-1)}$ is the prediction of the first $t-1$ trees, and $f_t(x_i)$ is the prediction of the current $t$-th tree;

when generating the $t$-th decision tree, defining the objective function as:

$$\mathrm{Obj}^{(t)} = \sum_{i} l\big(\hat{y}_i^{(t-1)} + f_t(x_i),\, y_i\big) + \Omega(f_t)$$

performing a second-order Taylor expansion of $l$ at $\hat{y}_i^{(t-1)}$, defining the first-order partial derivative $g_i = \partial l / \partial \hat{y}_i^{(t-1)}$ and the second-order partial derivative $h_i = \partial^2 l / \partial (\hat{y}_i^{(t-1)})^2$, and rewriting the objective function, up to a constant term, as:

$$\mathrm{Obj}^{(t)} \approx \sum_{i} \left[ g_i f_t(x_i) + \frac{1}{2} h_i f_t^2(x_i) \right] + \Omega(f_t)$$

defining $G_j = \sum_{i \in I_j} g_i$ and $H_j = \sum_{i \in I_j} h_i$, where $I_j$ is the set of samples falling in leaf node $j$, and solving the loss function to obtain the optimal weight $w_j^*$ of leaf node $j$ and the simplified tree score function, as follows:

$$w_j^* = -\frac{G_j}{H_j + \lambda}, \qquad \mathrm{Obj}^* = -\frac{1}{2} \sum_{j=1}^{T} \frac{G_j^2}{H_j + \lambda} + \gamma T$$

calculating the split gain of each current leaf node and splitting the node with the largest gain, until the overall objective function value of the decision tree meets the set requirement, thereby completing the $t$-th decision tree, wherein the split gain is calculated as:

$$\mathrm{Gain} = \frac{1}{2} \left[ \frac{G_L^2}{H_L + \lambda} + \frac{G_R^2}{H_R + \lambda} - \frac{(G_L + G_R)^2}{H_L + H_R + \lambda} \right] - \gamma$$

where $G_L^2/(H_L+\lambda)$ is the score of the left leaf node after the current node is split, $G_R^2/(H_R+\lambda)$ is the score of the right leaf node after the current node is split, $(G_L+G_R)^2/(H_L+H_R+\lambda)$ is the score when the node is not split, and $\gamma$ is the complexity cost introduced by adding a new leaf node;

based on the existing set of decision trees, predicting on the feature values to obtain the prediction $\hat{y}_i^{(t)}$ of the current $t$ decision trees, calculating the difference between $\hat{y}_i^{(t)}$ and the true value $y_i$, and using this difference as the fitting target of the next decision tree, until the number of generated decision trees reaches a set value or the prediction accuracy of the whole decision tree set meets the requirement;

and inputting real-time electricity consumption data of a user into the combined classification model and outputting an electricity stealing suspicion coefficient of the user, so as to determine users suspected of stealing electricity.

2. The method according to claim 1, wherein the input quantity of the combined classification model is the electricity stealing identification evaluation index, and the output quantity is the electricity stealing tag.

3. The method according to claim 1, wherein the electricity stealing identification evaluation index comprises a load curve slope index, a line loss index and an alarm index.

4. The method for identifying electricity stealing users by combining the cloud terminal with the edge terminal as claimed in claim 1, wherein before the step of extracting the electricity stealing identification evaluation index and the electricity stealing label respectively to form a training data set in response to the user historical electricity consumption data of the edge terminal and the electricity stealing record of the terminal device, the method further comprises:

in response to acquiring the historical user electricity consumption data of the edge end and the electricity stealing records of the terminal equipment, preprocessing the historical user electricity consumption data and the electricity stealing records, wherein the preprocessing comprises data cleaning and missing value processing.

5. The method according to claim 4, wherein the missing value processing specifically includes:

determining a dependent variable and an independent variable from the original data set, and taking out at least two data points on each side of a missing value;

processing the at least four data points by Lagrange polynomial interpolation, and interpolating all missing data in sequence until no missing value remains, wherein the Lagrange polynomial interpolation is expressed as:

$$L(x) = \sum_{i=1}^{N} y_i \prod_{j=1,\, j \neq i}^{N} \frac{x - x_j}{x_i - x_j}$$

where $x$ is the index corresponding to the missing value, $L(x)$ is the interpolation result for the missing value, $x_i$ is the index of the non-missing value $y_i$, and $N$ is the total number of data samples.

6. A device for identifying electricity stealing users by combining a cloud end and an edge end, characterized by comprising:

an acquisition module configured to, in response to acquiring historical user electricity consumption data of the edge end and electricity stealing records of terminal equipment, respectively extract electricity stealing identification evaluation indexes and electricity stealing labels to form a training data set;

a training module configured to train a combined classification model based on the training data set, wherein the combined classification model is a combination of a LightGBM submodel and a neural network submodel, and the LightGBM submodel is constructed as follows:

pre-sorting the continuous features in the data set and converting the continuous floating-point data into discrete data;

generating a decision tree based on the feature data, taking into account both the accuracy and the complexity of the decision tree, and defining the objective function of the decision tree as:

$$\mathrm{Obj} = \sum_{i} l(\hat{y}_i, y_i) + \Omega(f), \qquad \Omega(f) = \gamma T + \frac{1}{2}\lambda \lVert w \rVert^2$$

where $l(\hat{y}_i, y_i)$ measures the accuracy of the decision tree, $\hat{y}_i$ and $y_i$ are respectively the predicted and actual label values of the data set; $\Omega(f)$ measures the complexity of the decision tree, $T$ is the number of leaf nodes, $w$ is the weight vector of the leaf nodes, and $\gamma$ and $\lambda$ are regularization coefficients;

fitting a new decision tree using the negative gradient of the loss function as an approximation of the residual of the current decision tree, wherein the fitting targets of successive decision trees are related by:

$$\hat{y}_i^{(t)} = \hat{y}_i^{(t-1)} + f_t(x_i)$$

where $\hat{y}_i^{(t)}$ is the prediction of the first $t$ trees on sample $x_i$ of the data set, $\hat{y}_i^{(t-1)}$ is the prediction of the first $t-1$ trees, and $f_t(x_i)$ is the prediction of the current $t$-th tree;

when generating the $t$-th decision tree, defining the objective function as:

$$\mathrm{Obj}^{(t)} = \sum_{i} l\big(\hat{y}_i^{(t-1)} + f_t(x_i),\, y_i\big) + \Omega(f_t)$$

performing a second-order Taylor expansion of $l$ at $\hat{y}_i^{(t-1)}$, defining the first-order partial derivative $g_i = \partial l / \partial \hat{y}_i^{(t-1)}$ and the second-order partial derivative $h_i = \partial^2 l / \partial (\hat{y}_i^{(t-1)})^2$, and rewriting the objective function, up to a constant term, as:

$$\mathrm{Obj}^{(t)} \approx \sum_{i} \left[ g_i f_t(x_i) + \frac{1}{2} h_i f_t^2(x_i) \right] + \Omega(f_t)$$

defining $G_j = \sum_{i \in I_j} g_i$ and $H_j = \sum_{i \in I_j} h_i$, where $I_j$ is the set of samples falling in leaf node $j$, and solving the loss function to obtain the optimal weight $w_j^*$ of leaf node $j$ and the simplified tree score function, as follows:

$$w_j^* = -\frac{G_j}{H_j + \lambda}, \qquad \mathrm{Obj}^* = -\frac{1}{2} \sum_{j=1}^{T} \frac{G_j^2}{H_j + \lambda} + \gamma T$$

calculating the split gain of each current leaf node and splitting the node with the largest gain, until the overall objective function value of the decision tree meets the set requirement, thereby completing the $t$-th decision tree, wherein the split gain is calculated as:

$$\mathrm{Gain} = \frac{1}{2} \left[ \frac{G_L^2}{H_L + \lambda} + \frac{G_R^2}{H_R + \lambda} - \frac{(G_L + G_R)^2}{H_L + H_R + \lambda} \right] - \gamma$$

where $G_L^2/(H_L+\lambda)$ is the score of the left leaf node after the current node is split, $G_R^2/(H_R+\lambda)$ is the score of the right leaf node after the current node is split, $(G_L+G_R)^2/(H_L+H_R+\lambda)$ is the score when the node is not split, and $\gamma$ is the complexity cost introduced by adding a new leaf node;

based on the existing set of decision trees, predicting on the feature values to obtain the prediction $\hat{y}_i^{(t)}$ of the current $t$ decision trees, calculating the difference between $\hat{y}_i^{(t)}$ and the true value $y_i$, and using this difference as the fitting target of the next decision tree, until the number of generated decision trees reaches a set value or the prediction accuracy of the whole decision tree set meets the requirement;

and an output module configured to input real-time electricity consumption data of a user into the combined classification model and output an electricity stealing suspicion coefficient of the user, so as to determine users suspected of stealing electricity.

7. An electronic device, comprising: at least one processor, and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any of claims 1 to 5.

8. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the method of any one of claims 1 to 5.

Background

As the country's modernization accelerates, national energy consumption keeps increasing, and the demand for electric power in particular rises year by year. Against this background, some lawbreakers steal electricity by tampering with metering devices, making unauthorized line connections and similar means in order to reduce their electricity costs. Electricity stealing not only seriously disrupts the normal order of power supply and use and causes great economic losses to power grid enterprises, but can also cause electric shock and electrical fire accidents, endangering personal safety and the safety of the power grid.

At present, users suspected of stealing electricity are mostly identified by manual inspection, with workers periodically checking users one by one. This approach is inefficient, consumes a large amount of manpower and material resources, and has difficulty accurately identifying concealed forms of electricity stealing.

Disclosure of Invention

The invention provides a method for identifying electricity stealing users by combining a cloud end and an edge end, which is used for solving at least one of the above technical problems.

In a first aspect, the present invention provides a method for identifying electricity stealing users by combining a cloud end and an edge end, including: in response to acquiring historical user electricity consumption data of the edge end and electricity stealing records of terminal equipment, respectively extracting electricity stealing identification evaluation indexes and electricity stealing labels to form a training data set; training a combined classification model based on the training data set, wherein the combined classification model is a combination of a LightGBM submodel and a neural network submodel, and the LightGBM submodel is constructed as follows: pre-sorting the continuous features in the data set and converting the continuous floating-point data into discrete data; generating a decision tree based on the feature data, taking into account both the accuracy and the complexity of the decision tree, and defining the objective function of the decision tree as $\mathrm{Obj} = \sum_{i} l(\hat{y}_i, y_i) + \Omega(f)$ with $\Omega(f) = \gamma T + \frac{1}{2}\lambda \lVert w \rVert^2$, where $l(\hat{y}_i, y_i)$ measures the accuracy of the decision tree, $\hat{y}_i$ and $y_i$ are respectively the predicted and actual label values of the data set, $\Omega(f)$ measures the complexity of the decision tree, $T$ is the number of leaf nodes, $w$ is the weight vector of the leaf nodes, and $\gamma$ and $\lambda$ are regularization coefficients; fitting a new decision tree using the negative gradient of the loss function as an approximation of the residual of the current decision tree, wherein the fitting targets of successive decision trees are related by $\hat{y}_i^{(t)} = \hat{y}_i^{(t-1)} + f_t(x_i)$, where $\hat{y}_i^{(t)}$ is the prediction of the first $t$ trees on sample $x_i$ of the data set, $\hat{y}_i^{(t-1)}$ is the prediction of the first $t-1$ trees, and $f_t(x_i)$ is the prediction of the current $t$-th tree; when generating the $t$-th decision tree, defining the objective function as $\mathrm{Obj}^{(t)} = \sum_{i} l(\hat{y}_i^{(t-1)} + f_t(x_i), y_i) + \Omega(f_t)$; performing a second-order Taylor expansion of $l$ at $\hat{y}_i^{(t-1)}$, defining the first-order partial derivative $g_i = \partial l / \partial \hat{y}_i^{(t-1)}$ and the second-order partial derivative $h_i = \partial^2 l / \partial (\hat{y}_i^{(t-1)})^2$, and rewriting the objective function, up to a constant term, as $\mathrm{Obj}^{(t)} \approx \sum_{i} [ g_i f_t(x_i) + \frac{1}{2} h_i f_t^2(x_i) ] + \Omega(f_t)$; defining $G_j = \sum_{i \in I_j} g_i$ and $H_j = \sum_{i \in I_j} h_i$, where $I_j$ is the set of samples falling in leaf node $j$, and solving the loss function to obtain the optimal weight $w_j^* = -G_j / (H_j + \lambda)$ of leaf node $j$ and the simplified tree score function $\mathrm{Obj}^* = -\frac{1}{2} \sum_{j=1}^{T} G_j^2 / (H_j + \lambda) + \gamma T$; calculating the split gain of each current leaf node and splitting the node with the largest gain, until the overall objective function value of the decision tree meets the set requirement, thereby completing the $t$-th decision tree, wherein the split gain is calculated as $\mathrm{Gain} = \frac{1}{2} [ G_L^2/(H_L+\lambda) + G_R^2/(H_R+\lambda) - (G_L+G_R)^2/(H_L+H_R+\lambda) ] - \gamma$, where $G_L^2/(H_L+\lambda)$ is the score of the left leaf node after the current node is split, $G_R^2/(H_R+\lambda)$ is the score of the right leaf node after the current node is split, $(G_L+G_R)^2/(H_L+H_R+\lambda)$ is the score when the node is not split, and $\gamma$ is the complexity cost introduced by adding a new leaf node; based on the existing set of decision trees, predicting on the feature values to obtain the prediction $\hat{y}_i^{(t)}$ of the current $t$ decision trees, calculating the difference between $\hat{y}_i^{(t)}$ and the true value $y_i$, and using this difference as the fitting target of the next decision tree, until the number of generated decision trees reaches a set value or the prediction accuracy of the whole decision tree set meets the requirement; and inputting real-time electricity consumption data of a user into the combined classification model and outputting an electricity stealing suspicion coefficient of the user, so as to determine users suspected of stealing electricity.

In a second aspect, the present invention provides a device for identifying electricity stealing users by combining a cloud end and an edge end, including: an acquisition module configured to, in response to acquiring historical user electricity consumption data of the edge end and electricity stealing records of terminal equipment, respectively extract electricity stealing identification evaluation indexes and electricity stealing labels to form a training data set; a training module configured to train a combined classification model based on the training data set, wherein the combined classification model is a combination of a LightGBM submodel and a neural network submodel, and the LightGBM submodel is constructed as follows: pre-sorting the continuous features in the data set and converting the continuous floating-point data into discrete data; generating a decision tree based on the feature data, taking into account both the accuracy and the complexity of the decision tree, and defining the objective function of the decision tree as $\mathrm{Obj} = \sum_{i} l(\hat{y}_i, y_i) + \Omega(f)$ with $\Omega(f) = \gamma T + \frac{1}{2}\lambda \lVert w \rVert^2$, where $l(\hat{y}_i, y_i)$ measures the accuracy of the decision tree, $\hat{y}_i$ and $y_i$ are respectively the predicted and actual label values of the data set, $\Omega(f)$ measures the complexity of the decision tree, $T$ is the number of leaf nodes, $w$ is the weight vector of the leaf nodes, and $\gamma$ and $\lambda$ are regularization coefficients; fitting a new decision tree using the negative gradient of the loss function as an approximation of the residual of the current decision tree, wherein the fitting targets of successive decision trees are related by $\hat{y}_i^{(t)} = \hat{y}_i^{(t-1)} + f_t(x_i)$, where $\hat{y}_i^{(t)}$ is the prediction of the first $t$ trees on sample $x_i$ of the data set, $\hat{y}_i^{(t-1)}$ is the prediction of the first $t-1$ trees, and $f_t(x_i)$ is the prediction of the current $t$-th tree; when generating the $t$-th decision tree, defining the objective function as $\mathrm{Obj}^{(t)} = \sum_{i} l(\hat{y}_i^{(t-1)} + f_t(x_i), y_i) + \Omega(f_t)$; performing a second-order Taylor expansion of $l$ at $\hat{y}_i^{(t-1)}$, defining the first-order partial derivative $g_i = \partial l / \partial \hat{y}_i^{(t-1)}$ and the second-order partial derivative $h_i = \partial^2 l / \partial (\hat{y}_i^{(t-1)})^2$, and rewriting the objective function, up to a constant term, as $\mathrm{Obj}^{(t)} \approx \sum_{i} [ g_i f_t(x_i) + \frac{1}{2} h_i f_t^2(x_i) ] + \Omega(f_t)$; defining $G_j = \sum_{i \in I_j} g_i$ and $H_j = \sum_{i \in I_j} h_i$, where $I_j$ is the set of samples falling in leaf node $j$, and solving the loss function to obtain the optimal weight $w_j^* = -G_j / (H_j + \lambda)$ of leaf node $j$ and the simplified tree score function $\mathrm{Obj}^* = -\frac{1}{2} \sum_{j=1}^{T} G_j^2 / (H_j + \lambda) + \gamma T$; calculating the split gain of each current leaf node and splitting the node with the largest gain, until the overall objective function value of the decision tree meets the set requirement, thereby completing the $t$-th decision tree, wherein the split gain is calculated as $\mathrm{Gain} = \frac{1}{2} [ G_L^2/(H_L+\lambda) + G_R^2/(H_R+\lambda) - (G_L+G_R)^2/(H_L+H_R+\lambda) ] - \gamma$, where $G_L^2/(H_L+\lambda)$ is the score of the left leaf node after the current node is split, $G_R^2/(H_R+\lambda)$ is the score of the right leaf node after the current node is split, $(G_L+G_R)^2/(H_L+H_R+\lambda)$ is the score when the node is not split, and $\gamma$ is the complexity cost introduced by adding a new leaf node; based on the existing set of decision trees, predicting on the feature values to obtain the prediction $\hat{y}_i^{(t)}$ of the current $t$ decision trees, calculating the difference between $\hat{y}_i^{(t)}$ and the true value $y_i$, and using this difference as the fitting target of the next decision tree, until the number of generated decision trees reaches a set value or the prediction accuracy of the whole decision tree set meets the requirement; and an output module configured to input real-time electricity consumption data of a user into the combined classification model and output an electricity stealing suspicion coefficient of the user, so as to determine users suspected of stealing electricity.

In a third aspect, an electronic device is provided, comprising: at least one processor and a memory communicatively connected to the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to perform the steps of the method for identifying electricity stealing users by combining a cloud end and an edge end according to any embodiment of the invention.

In a fourth aspect, the present invention further provides a computer-readable storage medium, on which a computer program is stored, the computer program including program instructions, which, when executed by a computer, cause the computer to perform the steps of the electricity stealing user identification method combining a cloud terminal and an edge terminal according to any embodiment of the present invention.

According to the method and device for identifying electricity stealing users by combining a cloud end and an edge end, the edge server preprocesses the data and generates the electricity stealing identification labels, which reduces the computational burden on the cloud server and improves computing and detection efficiency. In addition, the combined model of the LightGBM model and the BP neural network speeds up computation and improves classification accuracy.

Drawings

To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.

Fig. 1 is a flowchart of a method for identifying electricity stealing users by combining a cloud end and an edge end according to an embodiment of the present invention;

Fig. 2 is a flowchart of another method for identifying electricity stealing users by combining a cloud end and an edge end according to an embodiment of the present invention;

Fig. 3 is a block diagram of a device for identifying electricity stealing users by combining a cloud end and an edge end according to an embodiment of the present invention;

Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Referring to fig. 1, a flowchart of a method for identifying a power stealing subscriber by combining a cloud terminal and an edge terminal according to the present application is shown.

As shown in fig. 1, the method for identifying a power stealing user by combining a cloud terminal and an edge terminal specifically includes:

Step S101, in response to acquiring the historical user electricity consumption data of the edge end and the electricity stealing records of the terminal equipment, respectively extracting electricity stealing identification evaluation indexes and electricity stealing labels to form a training data set.

In this embodiment, the edge server collects all relevant data affecting the identification of the electricity stealing users, including electricity consumption data of users, line loss data of lines, alarm data of terminals, and electricity stealing records of users in corresponding areas, and the electricity stealing user identification device extracts electricity stealing identification evaluation indexes and electricity stealing tags from the relevant data, so as to form a training data set.

Step S102, training a combined classification model based on the training data set.

In this embodiment, the combined classification model is a combination of a LightGBM submodel and a neural network submodel, and the LightGBM submodel is constructed as follows:

pre-sorting the continuous features in the data set and converting the continuous floating-point data into discrete data;

generating a decision tree based on the feature data, taking into account both the accuracy and the complexity of the decision tree, and defining the objective function of the decision tree as:

$$\mathrm{Obj} = \sum_{i} l(\hat{y}_i, y_i) + \Omega(f), \qquad \Omega(f) = \gamma T + \frac{1}{2}\lambda \lVert w \rVert^2$$

where $l(\hat{y}_i, y_i)$ measures the accuracy of the decision tree, $\hat{y}_i$ and $y_i$ are respectively the predicted and actual label values of the data set; $\Omega(f)$ measures the complexity of the decision tree, $T$ is the number of leaf nodes, $w$ is the weight vector of the leaf nodes, and $\gamma$ and $\lambda$ are regularization coefficients;

fitting a new decision tree using the negative gradient of the loss function as an approximation of the residual of the current decision tree, wherein the fitting targets of successive decision trees are related by:

$$\hat{y}_i^{(t)} = \hat{y}_i^{(t-1)} + f_t(x_i)$$

where $\hat{y}_i^{(t)}$ is the prediction of the first $t$ trees on sample $x_i$ of the data set, $\hat{y}_i^{(t-1)}$ is the prediction of the first $t-1$ trees, and $f_t(x_i)$ is the prediction of the current $t$-th tree;

when generating the $t$-th decision tree, defining the objective function as:

$$\mathrm{Obj}^{(t)} = \sum_{i} l\big(\hat{y}_i^{(t-1)} + f_t(x_i),\, y_i\big) + \Omega(f_t)$$

performing a second-order Taylor expansion of $l$ at $\hat{y}_i^{(t-1)}$, defining the first-order partial derivative $g_i = \partial l / \partial \hat{y}_i^{(t-1)}$ and the second-order partial derivative $h_i = \partial^2 l / \partial (\hat{y}_i^{(t-1)})^2$, and rewriting the objective function, up to a constant term, as:

$$\mathrm{Obj}^{(t)} \approx \sum_{i} \left[ g_i f_t(x_i) + \frac{1}{2} h_i f_t^2(x_i) \right] + \Omega(f_t)$$

defining $G_j = \sum_{i \in I_j} g_i$ and $H_j = \sum_{i \in I_j} h_i$, where $I_j$ is the set of samples falling in leaf node $j$, and solving the loss function to obtain the optimal weight $w_j^*$ of leaf node $j$ and the simplified tree score function, as follows:

$$w_j^* = -\frac{G_j}{H_j + \lambda}, \qquad \mathrm{Obj}^* = -\frac{1}{2} \sum_{j=1}^{T} \frac{G_j^2}{H_j + \lambda} + \gamma T$$

calculating the split gain of each current leaf node and splitting the node with the largest gain, until the overall objective function value of the decision tree meets the set requirement, thereby completing the $t$-th decision tree, wherein the split gain is calculated as:

$$\mathrm{Gain} = \frac{1}{2} \left[ \frac{G_L^2}{H_L + \lambda} + \frac{G_R^2}{H_R + \lambda} - \frac{(G_L + G_R)^2}{H_L + H_R + \lambda} \right] - \gamma$$

where $G_L^2/(H_L+\lambda)$ is the score of the left leaf node after the current node is split, $G_R^2/(H_R+\lambda)$ is the score of the right leaf node after the current node is split, $(G_L+G_R)^2/(H_L+H_R+\lambda)$ is the score when the node is not split, and $\gamma$ is the complexity cost introduced by adding a new leaf node;

based on the existing set of decision trees, predicting on the feature values to obtain the prediction $\hat{y}_i^{(t)}$ of the current $t$ decision trees, calculating the difference between $\hat{y}_i^{(t)}$ and the true value $y_i$, and using this difference as the fitting target of the next decision tree, until the number of generated decision trees reaches a set value or the prediction accuracy of the whole decision tree set meets the requirement.
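The split-gain step of the tree construction can be illustrated with a minimal Python sketch. It is not the LightGBM implementation: the function names, the regularization arguments `lam` and `gamma`, and the choice of squared-error loss in the usage note (for which the gradient is the prediction minus the label and the second derivative is 1) are assumptions made only for illustration.

```python
def leaf_weight(G, H, lam):
    """Optimal leaf weight for summed gradients G and second derivatives H."""
    return -G / (H + lam)


def leaf_score(G, H, lam):
    """Score contribution G^2 / (H + lam) of a single leaf."""
    return G * G / (H + lam)


def split_gain(GL, HL, GR, HR, lam, gamma):
    """Gain of splitting a node into a left and a right leaf."""
    unsplit = leaf_score(GL + GR, HL + HR, lam)
    return 0.5 * (leaf_score(GL, HL, lam) + leaf_score(GR, HR, lam) - unsplit) - gamma


def best_split(grads, hess, lam=1.0, gamma=0.0):
    """Scan all split positions of samples already sorted by one feature.

    Returns (position, gain); samples [0, position) go to the left leaf.
    """
    G_total, H_total = sum(grads), sum(hess)
    best_gain, best_pos = float("-inf"), None
    GL = HL = 0.0
    for pos in range(1, len(grads)):
        GL += grads[pos - 1]
        HL += hess[pos - 1]
        gain = split_gain(GL, HL, G_total - GL, H_total - HL, lam, gamma)
        if gain > best_gain:
            best_gain, best_pos = gain, pos
    return best_pos, best_gain
```

For example, with labels `[0, 0, 1, 1]`, an initial prediction of 0 and squared-error loss, the gradients are `[0.0, 0.0, -1.0, -1.0]` and the second derivatives are all 1; calling `best_split` on these values with `lam=0.0` selects the split between the second and third sample.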

The neural network submodel is trained as follows:

1) Determine the input vector

The comprehensive evaluation index calculated in the electricity stealing identification evaluation index system is taken as the input vector of the BP neural network, and the electricity stealing label is taken as the output vector.

2) Determine the number of hidden-layer nodes

One hidden layer is designed, and its number of nodes $h$ is determined by the formula $h = \sqrt{n + m} + a$.

Here, $n$ is the number of nodes of the input layer, $m$ is the number of nodes of the output layer, and the constant $a$ lies between 1 and 10.

3) Determine the activation functions

The Sigmoid function $f(x) = 1/(1 + e^{-x})$ is selected as the activation function of the hidden-layer nodes; the rectified linear unit (ReLU) function $f(x) = \max(0, x)$ is selected as the activation function of the output-layer nodes.

4) Train the BP neural network model based on the input vector and the output vector, so as to judge and identify whether a user is an electricity stealing user.
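The hidden-layer sizing rule and the two activation functions named in steps 2) and 3) can be sketched in a few lines of Python. The sketch assumes the common empirical rule of the square root of the sum of input and output node counts plus a constant between 1 and 10; rounding the result to the nearest integer and the function names are illustrative assumptions, not details given in the text.

```python
import math


def hidden_nodes(n_in, n_out, a):
    """Empirical hidden-layer size sqrt(n_in + n_out) + a, with 1 <= a <= 10."""
    if not 1 <= a <= 10:
        raise ValueError("the constant a is expected to lie between 1 and 10")
    # Rounding to an integer node count is an assumption of this sketch.
    return round(math.sqrt(n_in + n_out) + a)


def sigmoid(x):
    """Sigmoid activation, used here for the hidden-layer nodes."""
    return 1.0 / (1.0 + math.exp(-x))


def relu(x):
    """ReLU activation, used here for the output-layer nodes."""
    return max(0.0, x)
```

With three input nodes, one output node and `a = 4`, `hidden_nodes(3, 1, 4)` gives six hidden nodes; the input and output sizes in this example are hypothetical.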

The combination weights of the LightGBM model and the neural network are solved by an equal-weight recursion method. The basic principle is as follows:

Suppose there are $n$ classification methods, whose classification values at time $t$ are recorded as:

$$f_1(t),\ f_2(t),\ \ldots,\ f_n(t)$$

The first round of averaging can be expressed as:

$$\bar{f}^{(1)}(t) = \frac{1}{n} \sum_{i=1}^{n} f_i(t)$$

where $f_i(t)$ is the classification value of the $i$-th single classification method at time $t$, and $\bar{f}^{(1)}(t)$ is the classification value at time $t$ after the first arithmetic averaging.

Assuming that the $i$-th single classification model has the largest sum of squared errors among the $n$ classification methods, $\bar{f}^{(1)}(t)$ replaces the classification value of the $i$-th method, and the $n$ classification values required for the second round of averaging are:

$$f_1(t),\ \ldots,\ f_{i-1}(t),\ \bar{f}^{(1)}(t),\ f_{i+1}(t),\ \ldots,\ f_n(t)$$

Repeating these steps, the combined classification model obtained after $k$ rounds of averaging is:

$$\bar{f}^{(k)}(t) = \sum_{i=1}^{n} w_i f_i(t)$$

where $w_i$ is the weight of the $i$-th single classification method. If the relative error percentage of $\bar{f}^{(k)}(t)$ has reached an acceptable level, the iteration stops; otherwise the iteration continues until the relative error percentage meets the requirement.
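The equal-weight recursion can be sketched in Python by tracking, for each current method, its weight vector over the original methods. For simplicity the sketch runs a fixed number of rounds instead of testing the relative error percentage, which is an assumption; the function and argument names are likewise hypothetical.

```python
def equal_weight_recursion(preds, truth, rounds):
    """Return the weights of the original methods after `rounds` averaging rounds.

    preds[i][t] is the output of original method i at time t; truth[t] is the
    actual value. In every round the current methods are averaged with equal
    weights, and the method with the largest sum of squared errors is replaced
    by that average.
    """
    n = len(preds)
    # components[j] holds the weights of current method j over the original methods.
    components = [[1.0 if i == j else 0.0 for i in range(n)] for j in range(n)]

    def sse(weights):
        total = 0.0
        for t, y in enumerate(truth):
            blended = sum(w * p[t] for w, p in zip(weights, preds))
            total += (blended - y) ** 2
        return total

    average = None
    for _ in range(rounds):
        average = [sum(c[i] for c in components) / n for i in range(n)]
        worst = max(range(n), key=lambda j: sse(components[j]))
        components[worst] = average
    return average
```

With two submodels where the first matches the truth exactly, three rounds shift the weights from an even split towards the better model, giving `[0.875, 0.125]`.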

Step S103, inputting the real-time electricity utilization data of a certain user into the combined classification model, and outputting the electricity stealing suspicion coefficient of the certain user so as to determine the electricity stealing suspicion user.

In this embodiment, real-time power consumption data of a certain user is input into the combined classification model, and a suspected power stealing coefficient of the certain user is output, so that the suspected power stealing user is determined, and if the result is that the user steals power, actions such as alarming and stopping power supply are executed.

In this method, the edge server preprocesses the data and generates the electricity stealing identification labels, which reduces the computational burden on the cloud server and improves computing and detection efficiency; the combined model of the LightGBM submodel and the neural network submodel speeds up computation and improves classification accuracy.

In some optional embodiments, the electricity stealing identification evaluation index comprises a load curve slope index, a line loss index and an alarm type index.

The expression of the load curve slope index is as follows:

The quantities in this expression are: the load curve slope index; an indicator of whether the load curve slope has changed; the load curve slope on day $t$; the load curve slope on day $t-1$; $s$, the number of changes in the load curve slope; and $m$, the number of days in the statistical period;

the load curve slope is calculated by the following formula:

The quantities in this formula are: the load curve slope on day $i$; the load on day $t$; the statistical period of $m$ days; the average load over the days of the statistical period; the day indices within the statistical period; and the average of those day indices;

the expression of the line loss index is as follows:

The quantities in this expression are: the line loss index; the reference value of the line loss index; and the average line loss rates over the $m$ days before and the $m$ days after the day in question;

the line loss rate is calculated by the following formula:

$$l_t = \frac{s_t - \sum_{u \in U} f_t^{(u)}}{s_t} \times 100\%$$

where $l_t$ is the line loss rate on day $t$, $s_t$ is the amount of power supplied through the line on day $t$, $\sum_{u \in U} f_t^{(u)}$ is the total electricity consumption of all users on the line, $U$ is the set of users, and $u$ is a user;
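As a small illustration of the line loss rate, the following Python sketch uses the standard definition: power supplied through the line minus the total consumption metered at its users, divided by the power supplied. The function and argument names are hypothetical.

```python
def line_loss_rate(supplied_kwh, user_consumption_kwh):
    """Line loss rate for one day, in percent.

    supplied_kwh: energy supplied through the line on that day.
    user_consumption_kwh: metered consumption of each user on the line.
    """
    if supplied_kwh <= 0:
        raise ValueError("supplied energy must be positive")
    return (supplied_kwh - sum(user_consumption_kwh)) / supplied_kwh * 100.0
```

A line that supplies 100 kWh while its three users are metered at 30, 40 and 20 kWh has a line loss rate of about 10 percent; a sustained rise in this rate is one of the signals the line loss index is meant to capture.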

the expression of the alarm index is as follows:

The quantities in this expression are: the total number of alarms; the alarm signal state, which indicates whether alarm information exists; the alarm index; and the reference value of the number of alarms;

and carrying out weighted summation on the load curve slope index, the line loss index and the alarm index to obtain a comprehensive evaluation index, wherein the expression of the comprehensive evaluation index is as follows:

where the quantities are, in order: the comprehensive evaluation index; the load curve slope index; the line loss index; the alarm class index; and the weights of the load curve slope index, the line loss index, and the alarm index, respectively.
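The expressions for the individual indexes are not reproduced in this text, so the following Python sketch only illustrates the kind of computation the definitions describe. It assumes a least-squares slope for the load curve, the conventional line loss rate (energy supplied minus energy metered at users, divided by energy supplied), and hypothetical weights for the weighted sum. The function names and weight values are illustrative and are not taken from the specification.

```python
from typing import Sequence


def load_curve_slope(loads: Sequence[float]) -> float:
    """Least-squares slope of daily load against day number 1..m (assumed form)."""
    m = len(loads)
    if m < 2:
        raise ValueError("need at least two days of load data")
    days = range(1, m + 1)
    mean_day = sum(days) / m
    mean_load = sum(loads) / m
    num = sum((d - mean_day) * (p - mean_load) for d, p in zip(days, loads))
    den = sum((d - mean_day) ** 2 for d in days)
    return num / den


def line_loss_rate(supplied: float, user_consumption: Sequence[float]) -> float:
    """Conventional line loss rate in percent (assumed form)."""
    if supplied <= 0:
        raise ValueError("supplied energy must be positive")
    return (supplied - sum(user_consumption)) / supplied * 100.0


def comprehensive_index(slope_idx, loss_idx, alarm_idx, weights=(0.4, 0.4, 0.2)):
    """Weighted sum of the three indexes; the default weights are hypothetical."""
    w1, w2, w3 = weights
    return w1 * slope_idx + w2 * loss_idx + w3 * alarm_idx
```

For example, `load_curve_slope([1, 2, 3, 4])` gives a slope of 1 load unit per day, and `line_loss_rate(100, [40, 50])` gives a 10 percent loss rate.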

In some optional embodiments, the method for identifying electricity stealing users by combining a cloud end and an edge end further includes: in response to acquiring the user's historical electricity consumption data at the edge end and the electricity stealing record of the terminal equipment, preprocessing the historical electricity consumption data and the electricity stealing record, wherein the preprocessing comprises data cleaning and missing value processing.

(1) Data cleansing

The purpose of data cleansing is to filter out data that is not relevant to electricity stealing behavior. Public utility users such as banks, schools, and industrial and commercial businesses generally do not steal electricity, so their electricity consumption data are removed from the total data set used for the electricity stealing identification evaluation index system and the electricity stealing tags. For residential users, electricity consumption on holidays differs greatly from that on working days, so the holiday data are also removed to obtain a better recognition effect.
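The cleansing rule can be sketched in Python as a simple filter. The record layout, the category labels, and the holiday calendar below are hypothetical and chosen only for illustration; a real deployment would take them from the utility's own customer and calendar data.

```python
from datetime import date

# Hypothetical category labels and holiday calendar, for illustration only.
EXCLUDED_CATEGORIES = {"bank", "school", "industrial", "commercial"}
HOLIDAYS = {date(2021, 1, 1), date(2021, 5, 1)}


def clean_records(records):
    """Drop records of excluded user categories and records dated on holidays.

    Each record is a dict with keys 'user_id', 'category', 'day', 'kwh'.
    """
    kept = []
    for rec in records:
        if rec["category"] in EXCLUDED_CATEGORIES:
            continue  # user types assumed not to steal electricity
        if rec["day"] in HOLIDAYS:
            continue  # holiday consumption differs from working days
        kept.append(rec)
    return kept
```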

(2) Missing value handling

During data acquisition by the edge server, data loss occurs for various reasons, such as packet loss and equipment failure. If missing values are not processed, the calculated line loss data has larger errors, so a Lagrange interpolation method is adopted to process the missing values and obtain a better identification effect. The specific method is as follows: first, the dependent and independent variables are determined from the original data set, and the 5 data points before and the 5 data points after a missing value are taken out to form a group of 10 (if any of these neighboring data points does not exist or is empty, it is discarded and only the remaining data form the group). Then a Lagrange polynomial interpolation formula is applied, and all missing data are interpolated in turn until no missing value remains. The expression for processing at least four data based on the Lagrange polynomial interpolation method is as follows:

where the quantities are, in order: the subscript number corresponding to the missing value; the interpolation result for the missing value; the non-missing values; and N, the total number of data samples.
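A minimal Python sketch of this missing value step follows, using the standard Lagrange interpolating polynomial. It assumes that missing readings are marked as `None` and that the sample position in the series serves as the independent variable; the function names and the `window` parameter are illustrative.

```python
def lagrange_interpolate(xs, ys, x):
    """Evaluate the Lagrange polynomial through the points (xs[i], ys[i]) at x."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def fill_missing(series, window=5):
    """Fill None entries from up to `window` valid neighbours on each side.

    Neighbours that do not exist or are themselves missing are discarded,
    and only the remaining points form the interpolation group.
    """
    filled = list(series)
    for k, value in enumerate(series):
        if value is not None:
            continue
        lo, hi = max(0, k - window), min(len(series), k + window + 1)
        idx = [i for i in range(lo, hi) if series[i] is not None]
        if not idx:
            continue  # no usable neighbours; leave the gap as it is
        filled[k] = lagrange_interpolate(idx, [series[i] for i in idx], k)
    return filled
```

This sketch interpolates each gap from the originally observed values only. High-degree polynomials can oscillate near the ends of the window, which is one reason to keep the group small.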

Referring to fig. 2, a flowchart of another method for identifying electricity stealing users by combining a cloud end and an edge end according to the present application is shown.

As shown in fig. 2, a cloud-edge combined electricity stealing user identification framework is first established: the edge server preprocesses the acquired data, generates an electricity stealing identification tag, and uploads it to the cloud server, and the cloud server trains an electricity stealing identification model based on the tag. Next, influence factors that reflect electricity stealing behavior, such as electricity load, line loss, and alarm information, are considered together, and three electricity stealing identification evaluation indexes (the load curve slope index, the line loss index, and the alarm information index) are established to describe electricity stealing behavior in multiple dimensions. A combined classification model is then obtained from a LightGBM model and a BP neural network to identify electricity stealing users, which improves the accuracy and real-time performance of identification. Because identification relies on existing electrical data, no additional monitoring elements are required, which reduces the cost of identifying electricity stealing users.

Referring to fig. 3, a block diagram of a device for identifying electricity stealing users by combining a cloud end and an edge end according to the present application is shown.

As shown in fig. 3, the electricity stealing subscriber identifying apparatus 200 includes an obtaining module 210, a training module 220, and an output module 230.

The obtaining module 210 is configured to extract an electricity stealing identification evaluation index and an electricity stealing tag, respectively, in response to acquiring the user's historical electricity consumption data at the edge end and the electricity stealing record of the terminal device, so as to form a training data set;

a training module 220 configured to train a combined classification model based on the training dataset, where the combined classification model is a combined model based on a LightGBM submodel and a neural network submodel, and a construction process of the LightGBM submodel is specifically as follows:

pre-ordering the continuous features in the data set, and converting continuous floating point data into discrete data;

generating a decision tree based on the characteristic data, comprehensively considering the accuracy degree of the decision tree and the complexity degree of the decision tree, and defining the objective function of the decision tree as follows:

$$Obj = \sum_{i} l(\hat{y}_i, y_i) + \sum_{k} \Omega(f_k), \qquad \Omega(f) = \gamma T + \frac{1}{2}\lambda \lVert w \rVert^2$$

where $l(\hat{y}_i, y_i)$ measures the accuracy of the decision tree, and $\hat{y}_i$ and $y_i$ respectively represent the predicted value and the actual value of the data set label; $\Omega(f)$ measures the complexity of the decision tree, wherein $T$ is the number of leaf nodes, $w$ is the weight vector of the different leaf nodes, and $\gamma$ and $\lambda$ are both regular term coefficients;

and fitting a new decision tree by using the negative gradient of the loss function as a residual approximation of the current decision tree, wherein the relation between the fitting targets of the decision trees is as follows:

$$\hat{y}_i^{(t)} = \hat{y}_i^{(t-1)} + f_t(x_i)$$

where $\hat{y}_i^{(t)}$ is the prediction result on data $x_i$ after the $t$-th tree, $\hat{y}_i^{(t-1)}$ is the prediction result of the first $t-1$ trees, and $f_t(x_i)$ is the prediction result of the current $t$-th tree;

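for the generation of the $t$-th decision tree, the objective function is defined as follows:

$$Obj^{(t)} = \sum_{i} l\left(\hat{y}_i^{(t-1)} + f_t(x_i),\, y_i\right) + \Omega(f_t)$$

performing a Taylor expansion of $l$, and defining $g_i$ and $h_i$ as the first-order and second-order partial derivatives of $l$ with respect to $\hat{y}_i^{(t-1)}$, the objective function is rewritten (omitting the constant term) as:

$$Obj^{(t)} \approx \sum_{i} \left[ g_i f_t(x_i) + \frac{1}{2} h_i f_t(x_i)^2 \right] + \gamma T + \frac{1}{2}\lambda \sum_{j=1}^{T} w_j^2$$

where $T$ is the number of leaf nodes, $w_j$ is the weight of leaf node $j$, and $\gamma$ and $\lambda$ are the regular term coefficients; defining $G_j = \sum_{i \in I_j} g_i$ and $H_j = \sum_{i \in I_j} h_i$, where $I_j$ is the set of samples assigned to leaf node $j$, and solving the loss function yields the optimal weight $w_j^*$ of leaf node $j$ and a simplified tree score function, as follows:

$$w_j^* = -\frac{G_j}{H_j + \lambda}, \qquad Obj^* = -\frac{1}{2} \sum_{j=1}^{T} \frac{G_j^2}{H_j + \lambda} + \gamma T$$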

calculating a segmentation gain for each current leaf node, and selecting the node with the current maximum gain for segmentation until the overall objective function value of the decision tree meets the set requirement, which completes the generation of the t-th decision tree, wherein the expression for calculating the segmentation gain is as follows:

$$Gain = \frac{1}{2}\left[\frac{G_L^2}{H_L+\lambda} + \frac{G_R^2}{H_R+\lambda} - \frac{(G_L+G_R)^2}{H_L+H_R+\lambda}\right] - \gamma$$

where $G_L, H_L$ and $G_R, H_R$ are the sums of the first-order and second-order derivatives over the samples in the left and right leaf nodes, $\frac{G_L^2}{H_L+\lambda}$ is the score of the left leaf node after the current node is divided, $\frac{G_R^2}{H_R+\lambda}$ is the score of the right leaf node after the current node is divided, $\frac{(G_L+G_R)^2}{H_L+H_R+\lambda}$ is the score of the decision tree when the node is not divided, and $\gamma$ is the complexity cost introduced by adding a new leaf node;

based on the existing decision tree set, predicting on the characteristic values to obtain the predicted value $\hat{y}^{(t)}$ of the current $t$ decision trees, calculating the difference between $\hat{y}^{(t)}$ and the true value $y$, and using this difference as the fitting target of the next decision tree, until the number of generated decision trees reaches a set value or the prediction precision of the whole decision tree set meets the requirement;

the output module 230 is configured to input the real-time electricity consumption data of a user into the combined classification model and output the electricity stealing suspicion coefficient of that user, so as to determine suspected electricity stealing users.
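The discretization and split-selection steps of the training module can be sketched in plain Python. This is an illustration under stated assumptions, not the LightGBM implementation: it uses equal-width bins (LightGBM's own bin construction is more elaborate), the standard second-order expressions for the optimal leaf weight and the split gain, and hypothetical function names. It also assumes a regularization coefficient `lam` greater than zero so that empty bins do not cause a division by zero.

```python
def bin_feature(values, n_bins=4):
    """Map continuous values to equal-width bin indices 0..n_bins-1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def leaf_weight(G, H, lam):
    """Optimal leaf weight w* = -G / (H + lam)."""
    return -G / (H + lam)


def split_gain(GL, HL, GR, HR, lam, gamma):
    """0.5 * [GL^2/(HL+lam) + GR^2/(HR+lam) - (GL+GR)^2/(HL+HR+lam)] - gamma."""
    def score(G, H):
        return G * G / (H + lam)
    return 0.5 * (score(GL, HL) + score(GR, HR) - score(GL + GR, HL + HR)) - gamma


def best_split(values, grads, hess, n_bins=4, lam=1.0, gamma=0.0):
    """Scan bin boundaries and return (bin_index, gain) with the largest gain.

    Samples with bin <= bin_index go to the left child, the rest to the right.
    """
    bins = bin_feature(values, n_bins)
    G_hist = [0.0] * n_bins
    H_hist = [0.0] * n_bins
    for b, g, h in zip(bins, grads, hess):
        G_hist[b] += g  # accumulate gradient statistics per bin
        H_hist[b] += h
    G_total, H_total = sum(G_hist), sum(H_hist)
    best = (None, float("-inf"))
    GL = HL = 0.0
    for b in range(n_bins - 1):
        GL += G_hist[b]
        HL += H_hist[b]
        gain = split_gain(GL, HL, G_total - GL, H_total - HL, lam, gamma)
        if gain > best[1]:
            best = (b, gain)
    return best
```

With squared loss and an initial prediction of 0.5, the gradients are the prediction minus the label and the second derivatives are all 1, so the statistics for a toy feature can be passed directly to `best_split`.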

It should be understood that the modules depicted in fig. 3 correspond to various steps in the method described with reference to fig. 1. Thus, the operations and features described above for the method and the corresponding technical effects are also applicable to the modules in fig. 3, and are not described again here.

In other embodiments, the present invention further provides a computer-readable storage medium storing computer-executable instructions that can perform the electricity stealing user identification method in any of the above method embodiments;

as one embodiment, the computer-readable storage medium of the present invention stores computer-executable instructions configured to:

extracting electricity stealing identification evaluation indexes and electricity stealing labels, respectively, in response to acquiring the user's historical electricity consumption data at the edge end and the electricity stealing record of the terminal equipment, so as to form a training data set;

training a combined classification model based on the training data set, wherein the combined classification model is a combined model based on a LightGBM submodel and a neural network submodel;

and inputting the real-time electricity utilization data of a certain user into the combined classification model, and outputting the electricity stealing suspicion coefficient of the certain user so as to determine the electricity stealing suspicion user.

The computer-readable storage medium may include a storage program area and a storage data area, wherein the storage program area may store an operating system and an application program required for at least one function, and the storage data area may store data created according to use of the electricity stealing user identification apparatus, and the like. Further, the computer-readable storage medium may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some embodiments, the computer-readable storage medium optionally includes memory located remotely from the processor, which may be connected to the electricity stealing user identification device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention. As shown in fig. 4, the electronic device includes a processor 310 and a memory 320, and may further include an input device 330 and an output device 340. The processor 310, the memory 320, the input device 330, and the output device 340 may be connected by a bus or by other means; fig. 4 shows a bus connection as an example. The memory 320 is the computer-readable storage medium described above. By running the nonvolatile software programs, instructions, and modules stored in the memory 320, the processor 310 executes the various functional applications and data processing of the server, that is, implements the electricity stealing user identification method of the above method embodiments. The input device 330 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electricity stealing user identification apparatus. The output device 340 may include a display device such as a display screen.

The electronic device can execute the method provided by the embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method. For technical details that are not described in detail in this embodiment, reference may be made to the method provided by the embodiment of the present invention.

As an embodiment, the electronic device is applied to a device for identifying a power-stealing user, and is used for a client, and the device comprises: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to cause the at least one processor to:

extracting electricity stealing identification evaluation indexes and electricity stealing labels, respectively, in response to acquiring the user's historical electricity consumption data at the edge end and the electricity stealing record of the terminal equipment, so as to form a training data set;

training a combined classification model based on the training data set, wherein the combined classification model is a combined model based on a LightGBM submodel and a neural network submodel;

and inputting the real-time electricity utilization data of a certain user into the combined classification model, and outputting the electricity stealing suspicion coefficient of the certain user so as to determine the electricity stealing suspicion user.

The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.

Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.
