Abnormal feature information extraction method, system, electronic device and medium
1. An abnormal feature information extraction method is characterized by comprising the following steps:
return information characteristic obtaining step: after return information of a user visiting contact is obtained through an advertisement flow detection system, analyzing the return information to obtain return information characteristics;
and a step of obtaining important characteristics of return information: screening the returned information characteristics by an average influence value method to obtain the important characteristics of the returned information;
and a learning model construction step, namely after a mapping relation learning model is constructed, training the important characteristics of the returned information in the mapping relation learning model to obtain the characteristic information of abnormal advertisement traffic.
2. The method according to claim 1, wherein the return information characteristic obtaining step comprises: after the return information of the user visiting contact is obtained by the advertisement traffic detection system, analyzing the return information in multiple dimensions to obtain the return information characteristics, wherein the return information comprises normal advertisement traffic and abnormal advertisement traffic.
3. The method according to claim 1, wherein the step of obtaining important features of the returned information comprises: analyzing the importance of the returned information features by the average influence value method, sorting the returned information features according to importance, and obtaining the important features of the returned information after removing redundant features from the returned information features.
4. The abnormal feature information extraction method according to claim 1, wherein the learning model construction step includes:
training important characteristics of returned information: after a mapping relation learning model is built, training the important features of the returned information in the encoder, and acquiring the important features of the returned information after the encoder is trained;
an advertisement flow abnormal characteristic obtaining step: and after the decoder trains the important features of the returned information trained by the encoder, acquiring the advertisement flow abnormal feature information.
5. An abnormal feature information extraction system applied to the abnormal feature information extraction method according to any one of claims 1 to 4, the abnormal feature information extraction system comprising:
return information characteristic acquisition unit: after return information of a user visiting contact is obtained through an advertisement flow detection system, analyzing the return information to obtain return information characteristics;
return information important characteristic obtaining unit: screening the returned information characteristics by an average influence value method to obtain the important characteristics of the returned information;
and the learning model construction unit is used for training the important characteristics of the returned information in the mapping relation learning model after the mapping relation learning model is constructed, and acquiring the abnormal characteristic information of the advertisement flow.
6. The system according to claim 5, wherein after the return information of the user visiting contact is obtained by the advertisement traffic detection system, the return information is analyzed in multiple dimensions and the return information characteristics are obtained by the return information characteristic acquisition unit, wherein the return information includes normal advertisement traffic and abnormal advertisement traffic.
7. The abnormal feature information extraction system according to claim 6, wherein the importance of the returned information features is analyzed by the average influence value method, the returned information features are sorted according to the importance, and the returned information important features are obtained by the returned information important feature obtaining unit after redundant features in the returned information features are removed.
8. The abnormal feature information extraction system according to claim 7, wherein the learning model construction unit comprises:
a pass-back information important characteristic training module: after a mapping relation learning model is built, training the important features of the returned information in the encoder, and acquiring the important features of the returned information after the encoder is trained;
an advertisement traffic anomaly feature acquisition module: and after the decoder trains the important features of the returned information trained by the encoder, acquiring the advertisement flow abnormal feature information.
9. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the abnormal feature information extraction method according to any one of claims 1 to 4 when executing the computer program.
10. An electronic device-readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the abnormal feature information extraction method according to any one of claims 1 to 4.
Background
Advertising is a major revenue source for many developers and allows them to provide services to users free of charge, which makes it an important part of the mobile application ecosystem. However, some developers profit from illegal advertising, which poses a serious threat to users' information security. Illegal advertisements behave differently from legitimate advertisements at the traffic level, so they can be found by detecting abnormal traffic. At present, the main approach to advertisement traffic detection, both domestically and abroad, is the filter list. This approach is prone to missed and false identifications, cannot update itself, and requires a large amount of manual maintenance.
Illegal traffic not only directly damages the vital interests of advertisers, but also distorts the formulation of marketing strategies and thereby hinders the healthy development of the industry. According to the definitions and classifications of the China Advertising Association, abnormal advertisement traffic is divided into GIVT (General Invalid Traffic) and SIVT (Sophisticated Invalid Traffic). An invalid traffic filter list is generally built from rules, and its main data contents include: an IP address blacklist, an IP address grey list, a Device ID blacklist, and a Device ID grey list. Identifying SIVT is comparatively complex, technically demanding, and in need of continuous improvement and optimization, so how to identify SIVT-type abnormal advertisement traffic with machine learning has become a focus of current research.
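The filter-list approach described above can be illustrated with a minimal Python sketch. The class name `FilterList`, the `classify` method, and the three result labels are hypothetical and chosen only for illustration; the source names the four lists but does not specify how a match is handled.

```python
from dataclasses import dataclass, field


@dataclass
class FilterList:
    """Hypothetical container for the four lists named in the text."""
    ip_blacklist: set = field(default_factory=set)
    ip_greylist: set = field(default_factory=set)
    device_blacklist: set = field(default_factory=set)
    device_greylist: set = field(default_factory=set)

    def classify(self, ip: str, device_id: str) -> str:
        """Label one traffic record as 'invalid', 'suspect', or 'unknown'.

        Assumption: a blacklist hit takes priority over a grey-list hit.
        """
        if ip in self.ip_blacklist or device_id in self.device_blacklist:
            return "invalid"
        if ip in self.ip_greylist or device_id in self.device_greylist:
            return "suspect"
        # Not on any list: a static list cannot say anything about this record.
        return "unknown"
```

A record that appears on no list falls through to "unknown", which is where the identification failures and manual maintenance cost mentioned above arise: new abnormal sources stay undetected until someone adds them to a list.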
Disclosure of Invention
The embodiments of the application provide an abnormal feature information extraction method, system, electronic device and medium, which at least solve the problem that abnormal advertisement traffic cannot be identified by a machine learning method.
The invention provides an abnormal feature information extraction method, which comprises the following steps:
return information characteristic obtaining step: after return information of a user visiting contact is obtained through an advertisement flow detection system, analyzing the return information to obtain return information characteristics;
and a step of obtaining important characteristics of return information: screening the returned information characteristics by an average influence value method to obtain the important characteristics of the returned information;
and a learning model construction step, namely after a mapping relation learning model is constructed, training the important characteristics of the returned information in the mapping relation learning model to obtain the characteristic information of abnormal advertisement traffic.
In the above abnormal feature information extraction method, the return information characteristic obtaining step includes: after the return information of the user visiting contact is obtained by the advertisement traffic detection system, performing multidimensional analysis on the return information to obtain the return information characteristics, where the return information includes normal advertisement traffic and abnormal advertisement traffic.
In the above abnormal feature information extraction method, the step of obtaining the important features of the returned information includes analyzing the importance of the characteristics of the returned information by the average influence value method, sorting the characteristics of the returned information according to the importance, and obtaining the important features of the returned information after removing redundant features in the characteristics of the returned information.
In the above method for extracting abnormal feature information, the step of constructing the learning model includes:
training important characteristics of returned information: after a mapping relation learning model is built, training the important features of the returned information in the encoder, and acquiring the important features of the returned information after the encoder is trained;
an advertisement flow abnormal characteristic obtaining step: and after the decoder trains the important features of the returned information trained by the encoder, acquiring the advertisement flow abnormal feature information.
The present invention also provides an abnormal feature information extraction system, which is applicable to the above abnormal feature information extraction method, and the abnormal feature information extraction system includes:
return information characteristic acquisition unit: after return information of a user visiting contact is obtained through an advertisement flow detection system, analyzing the return information to obtain return information characteristics;
return information important characteristic obtaining unit: screening the returned information characteristics by an average influence value method to obtain the important characteristics of the returned information;
and the learning model construction unit is used for training the important characteristics of the returned information in the mapping relation learning model after the mapping relation learning model is constructed, and acquiring the abnormal characteristic information of the advertisement flow.
In the above abnormal feature information extraction system, after the return information of the user visiting contact is acquired by the advertisement traffic detection system, the return information is analyzed in multiple dimensions, and the return information characteristics are acquired by the return information characteristic acquisition unit, wherein the return information includes normal advertisement traffic and abnormal advertisement traffic.
In the above abnormal feature information extraction system, the importance of the returned information features is analyzed by the average influence value method, the returned information features are sorted according to the importance, and after redundant features in the returned information features are removed, the returned information important features are obtained by the returned information important feature obtaining unit.
In the above abnormal feature information extraction system, the learning model construction unit includes:
a pass-back information important characteristic training module: after a mapping relation learning model is built, training the important features of the returned information in the encoder, and acquiring the important features of the returned information after the encoder is trained;
an advertisement traffic anomaly feature acquisition module: and after the decoder trains the important features of the returned information trained by the encoder, acquiring the advertisement flow abnormal feature information.
The present invention also provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements any one of the above-mentioned abnormal feature information extraction methods when executing the computer program.
The present invention also provides an electronic device readable storage medium, on which computer program instructions are stored, which, when executed by a processor, implement any one of the above-described abnormal feature information extraction methods.
Compared with the related art, the abnormal feature information extraction method, system, electronic device and medium provided by the invention combine deep learning and statistical learning methods to extract features of the user's historical visiting behavior. The obtained features reflect the characteristics shared by multiple advertisement return feature time sequences, while improving prediction and optimization capabilities.
The details of one or more embodiments of the application are set forth in the accompanying drawings and the description below to provide a more thorough understanding of the application.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
FIG. 1 is a flowchart of an abnormal feature information extraction method according to an embodiment of the present application;
FIG. 2 is a block diagram of an abnormal feature information extraction implementation procedure according to an embodiment of the present application;
FIG. 3 is a schematic structural diagram of an abnormal feature information extraction system according to the present invention;
FIG. 4 is a frame diagram of an electronic device according to an embodiment of the present application.
Wherein the reference numerals are:
return information characteristic acquisition unit: 51;
return information important characteristic obtaining unit: 52;
a learning model construction unit: 53;
a pass-back information important characteristic training module: 531;
an advertisement traffic anomaly feature acquisition module: 532;
a bus: 80;
a processor: 81;
a memory: 82;
a communication interface: 83.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be described and illustrated below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments provided in the present application without any inventive step are within the scope of protection of the present application.
It is obvious that the drawings in the following description are only examples or embodiments of the present application, and that it is also possible for a person skilled in the art to apply the present application to other similar contexts on the basis of these drawings without inventive effort. Moreover, it should be appreciated that such a development effort might be complex and tedious, but would nevertheless be a routine undertaking of design, fabrication, and manufacture for those of ordinary skill having the benefit of this disclosure, and thus should not be construed as a limitation of this disclosure.
Reference in the specification to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the specification. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of ordinary skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments without conflict.
Unless defined otherwise, technical or scientific terms referred to herein shall have the ordinary meaning as understood by those of ordinary skill in the art to which this application belongs. The words "a," "an," "the," and similar words used throughout this application are not to be construed as limiting in number, and may refer to the singular or the plural. The terms "including," "comprising," "having," and any variations thereof, as used in this application, are intended to cover non-exclusive inclusion; for example, a process, method, system, article, or apparatus that comprises a list of steps or modules (elements) is not limited to the listed steps or elements, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus. The terms "connected," "coupled," and the like in this application are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. The term "plurality" as referred to herein means two or more. "And/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship. The terms "first," "second," "third," and the like herein merely distinguish similar objects and do not denote a particular ordering of the objects.
The invention obtains the return information (field information such as IP address, uuid, os, and imei) of each user visiting contact through the advertisement traffic monitoring system. Based on the physical meaning of the advertisement return characteristics, the return information is analyzed along multiple dimensions such as IP, uuid, os, and imei to obtain the return information characteristics. To avoid introducing non-key inputs into the mapping relation, the average influence value method (commonly called the mean impact value, MIV, method) is applied to a large number of samples to perform nonlinear independent-variable screening on the return information characteristics. A mapping relation learning model based on an attention mechanism and a long short-term memory (LSTM) network is then designed. The model consists of an encoder and a decoder, both of which use an LSTM network to learn the association relationships among the time-series characteristics of the advertisement parameters, and use the attention mechanism to improve data utilization efficiency and reduce training difficulty.
The present invention will be described with reference to specific examples.
Example one
The embodiment provides an abnormal feature information extraction method. Referring to FIG. 1 and FIG. 2, FIG. 1 is a flowchart of an abnormal feature information extraction method according to an embodiment of the present application, and FIG. 2 is a frame diagram of the abnormal feature information extraction implementation steps according to an embodiment of the present application. As shown in FIG. 1 and FIG. 2, the abnormal feature information extraction method includes the following steps:
return information characteristic obtaining step S1: after return information of a user visiting contact is obtained through an advertisement flow detection system, analyzing the return information to obtain return information characteristics;
return information important characteristic obtaining step S2: screening the returned information characteristics by an average influence value method to obtain the important characteristics of the returned information;
learning model construction step S3: after a mapping relation learning model is constructed, training the important characteristics of the returned information in the mapping relation learning model to obtain the characteristic information of abnormal advertisement traffic.
In an embodiment, the return information characteristic obtaining step S1 includes: after the return information of the user visiting contact is obtained by the advertisement traffic detection system, performing multidimensional analysis on the return information to obtain the return information characteristics, where the return information includes normal advertisement traffic and abnormal advertisement traffic.
In a specific embodiment, the advertisement traffic detection system obtains the return information of the user visiting contact, that is, the information (fields such as IP address, uuid, os, and imei) returned each time a user visits a contact medium. This information includes both normal and abnormal advertisement traffic; abnormal traffic typically comes from traffic-inflation tools, emulator-generated traffic, and the like.
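As a rough illustration of the multidimensional analysis in step S1, the following Python sketch groups return records along one dimension and counts the distinct values seen in the others. The function name `extract_features` and the chosen statistics (record count and distinct-value counts) are assumptions made for this example; the source names the fields but does not state which statistics are computed.

```python
from collections import defaultdict

# Field names taken from the text; everything else here is illustrative.
FIELDS = ("ip", "uuid", "os", "imei")


def extract_features(records, key="ip"):
    """Group return records by one dimension and summarise the others.

    records: list of dicts, each holding the four fields in FIELDS.
    key: the dimension to group by (one of FIELDS).
    Returns a dict mapping each key value to its feature row.
    """
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]].append(rec)

    features = {}
    for value, recs in groups.items():
        row = {"count": len(recs)}
        for name in FIELDS:
            if name != key:
                # Number of distinct values of this field within the group.
                row[f"distinct_{name}"] = len({r[name] for r in recs})
        features[value] = row
    return features
```

Calling `extract_features(records, key="imei")` or `key="uuid"` gives the same kind of summary along another dimension. Under this sketch, a single IP address associated with many distinct uuid or imei values is the sort of pattern that emulator-generated traffic might produce.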
In an embodiment, the step S2 of obtaining important characteristics of the returned information includes analyzing the importance of the returned information characteristics by the average influence value method, sorting the returned information characteristics according to the importance, and obtaining the important characteristics of the returned information after removing redundant characteristics from the returned information characteristics.
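The screening in step S2 can be sketched as follows, assuming the commonly described form of the mean impact value procedure: each feature is raised and lowered by a fixed fraction, the change in model output is averaged over all samples, and features are ranked by the magnitude of that average. The `predict` callable (standing in for an already trained model), the 10% perturbation, and the `keep` cut-off are assumptions; the source does not specify them.

```python
def mean_impact_values(predict, samples, delta=0.1):
    """Compute one impact value per feature.

    predict: callable taking a list of feature values and returning a float
             (assumed to wrap a trained model).
    samples: list of equal-length lists of numeric feature values.
    delta:   relative perturbation applied up and down (assumed 10%).
    """
    n_features = len(samples[0])
    mivs = []
    for j in range(n_features):
        total = 0.0
        for x in samples:
            up = list(x)
            up[j] = x[j] * (1 + delta)
            down = list(x)
            down[j] = x[j] * (1 - delta)
            # Impact of feature j on this sample's output.
            total += predict(up) - predict(down)
        mivs.append(total / len(samples))
    return mivs


def select_features(mivs, keep):
    """Return the indices of the `keep` features with the largest |MIV|."""
    ranked = sorted(range(len(mivs)), key=lambda j: abs(mivs[j]), reverse=True)
    return ranked[:keep]
```

Ranking by absolute value treats strongly negative and strongly positive influences alike; features near zero are the candidates for removal as redundant.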
In an embodiment, the learning model building step S3 includes:
pass-back information important feature training step S31: after a mapping relation learning model is built, training the important features of the returned information in the encoder, and acquiring the important features of the returned information after the encoder is trained;
advertisement traffic anomaly feature acquisition step S32: and after the decoder trains the important features of the returned information trained by the encoder, acquiring the advertisement flow abnormal feature information.
In a specific embodiment, a mapping relation learning model based on an attention mechanism and a long short-term memory (LSTM) network is designed. The model consists of an encoder and a decoder, both of which use an LSTM network to learn the association relationships among the returned information characteristics and use the attention mechanism to improve data utilization efficiency and reduce training difficulty. The input of the encoder is the important characteristics of the returned information; the association relationships among the different input parameters are learned through the attention layer, the softmax layer, the inner product layer, and the LSTM layer of the encoder. The input of the decoder is the output of the encoder; the association relationships among the sets of important returned information characteristics at different moments are learned through the attention layer, the softmax layer, the context vector layer, and the LSTM layer of the decoder.
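The interplay of the attention, softmax, and context vector layers can be illustrated with a small Python sketch of dot-product attention. This is not the patented model: the LSTM layers are omitted and their hidden states are simply taken as given inputs, and the use of a plain inner product as the scoring function is an assumption. The names `softmax` and `attention_context` are chosen for this example.

```python
import math


def softmax(scores):
    """Normalise scores into weights that are positive and sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_context(query, hidden_states):
    """Weight hidden states by their inner product with a query vector.

    query:         list of floats, e.g. the current decoder state (assumed).
    hidden_states: list of equal-length float lists, one per time step,
                   e.g. encoder LSTM outputs (assumed to be given).
    Returns (weights, context), where context is the weighted sum of states.
    """
    # Inner product between the query and each hidden state.
    scores = [sum(q * h for q, h in zip(query, hs)) for hs in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [
        sum(w * hs[d] for w, hs in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return weights, context
```

The weights show which time steps the model attends to for a given query, and the context vector is the summary passed on to the next layer. In a full model these quantities would be recomputed at every decoding step with learned parameters.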
Example two
Referring to FIG. 3, FIG. 3 is a schematic structural diagram of an abnormal feature information extraction system according to the present invention. As shown in FIG. 3, the abnormal feature information extraction system of the invention is applicable to the above-described abnormal feature information extraction method and includes:
return information feature acquisition unit 51: after return information of a user visiting contact is obtained through an advertisement flow detection system, analyzing the return information to obtain return information characteristics;
return information important feature acquisition unit 52: screening the returned information characteristics by an average influence value method to obtain the important characteristics of the returned information;
and the learning model constructing unit 53 is used for training the important features of the returned information in the mapping relation learning model after constructing the mapping relation learning model, and acquiring the abnormal feature information of the advertisement flow.
In an embodiment, after the return information of the user visiting contact is obtained by the advertisement traffic detection system, the return information is analyzed in multiple dimensions, and the return information characteristics are obtained by the return information characteristic acquisition unit 51, where the return information includes normal advertisement traffic and abnormal advertisement traffic.
In an embodiment, the importance of the returned information features is analyzed by the average influence value method, the returned information features are sorted according to the importance, and after redundant features in the returned information features are removed, the returned information important features are obtained by the returned information important feature obtaining unit 52.
In an embodiment, the learning model construction unit 53 includes:
pass back information important feature training module 531: after a mapping relation learning model is built, training the important features of the returned information in the encoder, and acquiring the important features of the returned information after the encoder is trained;
the advertisement traffic anomaly feature obtaining module 532: and after the decoder trains the important features of the returned information trained by the encoder, acquiring the advertisement flow abnormal feature information.
Example three
Referring to fig. 4, this embodiment discloses a specific implementation of an electronic device. The electronic device may include a processor 81 and a memory 82 storing computer program instructions.
Specifically, the processor 81 may include a Central Processing Unit (CPU) or an Application-Specific Integrated Circuit (ASIC), or may be configured as one or more integrated circuits that implement the embodiments of the present application.
Memory 82 may include, among other things, mass storage for data or instructions. By way of example, and not limitation, memory 82 may include a Hard Disk Drive (HDD), a floppy disk drive, a Solid State Drive (SSD), flash memory, an optical disk, a magneto-optical disk, tape, or a Universal Serial Bus (USB) drive, or a combination of two or more of these. Memory 82 may include removable or non-removable (or fixed) media, where appropriate. The memory 82 may be internal or external to the anomaly data monitoring device, where appropriate. In a particular embodiment, the memory 82 is a non-volatile memory. In particular embodiments, memory 82 includes Read-Only Memory (ROM) and Random Access Memory (RAM). The ROM may be mask-programmed ROM, Programmable ROM (PROM), Erasable PROM (EPROM), Electrically Erasable PROM (EEPROM), Electrically Alterable ROM (EAROM), or flash memory, or a combination of two or more of these, where appropriate. The RAM may be a Static Random-Access Memory (SRAM) or a Dynamic Random-Access Memory (DRAM), where the DRAM may be a Fast Page Mode Dynamic Random-Access Memory (FPM DRAM), an Extended Data Out Dynamic Random-Access Memory (EDO DRAM), a Synchronous Dynamic Random-Access Memory (SDRAM), and the like.
The memory 82 may be used to store or cache various data files for processing and/or communication use, as well as possible computer program instructions executed by the processor 81.
The processor 81 realizes any of the abnormal feature information extracting methods in the above-described embodiments by reading and executing computer program instructions stored in the memory 82.
In some of these embodiments, the electronic device may also include a communication interface 83 and a bus 80. As shown in fig. 4, the processor 81, the memory 82, and the communication interface 83 are connected via the bus 80 to complete communication therebetween.
The communication interface 83 is used for implementing communication between modules, devices, units and/or equipment in the embodiments of the present application. The communication interface 83 may also carry out data communication with other components, such as external equipment, image/abnormal data monitoring equipment, a database, external storage, and an image/abnormal data monitoring workstation.
The bus 80 includes hardware, software, or both, and couples the components of the electronic device to one another. Bus 80 includes, but is not limited to, at least one of the following: a data bus, an address bus, a control bus, an expansion bus, and a local bus. By way of example, and not limitation, bus 80 may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a Front-Side Bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCIe) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association Local Bus (VLB), or other suitable bus, or a combination of two or more of these. Bus 80 may include one or more buses, where appropriate. Although specific buses are described and shown in the embodiments of the application, any suitable buses or interconnects are contemplated by the application.
The electronic device may be connected to an anomaly data monitoring system to implement the methods described in connection with fig. 1-3.
The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
In summary, the invention combines an attention mechanism with a long short-term memory (LSTM) network to extract abnormal feature information from advertisement return information characteristics. The mapping relation learning model, which is based on the attention mechanism and the LSTM network, consists of an encoder and a decoder; both use the LSTM network to learn the association relationships among the important characteristics of the advertisement return information, and use the attention mechanism to improve data utilization efficiency and reduce training difficulty.
The above embodiments express only several implementations of the present application; their description is relatively specific and detailed, but should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the protection scope of the present patent application shall be subject to the appended claims.