Dynamic IP positioning clustering method based on scene characteristics
1. A dynamic IP positioning clustering method based on scene characteristics, characterized by comprising:
Step 1, obtaining the IP location distribution from a plurality of data sources over a period of time to form IP historical reference point data;
Step 2, screening out the historical reference point data of dynamic IPs from the obtained IP historical reference point data according to the distribution characteristics of static IPs and dynamic IPs;
Step 3, clustering the historical reference point data of the dynamic IPs by using a clustering algorithm to realize the positioning of the dynamic IPs.
2. The dynamic IP positioning clustering method based on scene characteristics as claimed in claim 1, wherein: in step 1, the geographic location distribution of IPs over a period of time is obtained through distributed web crawler technology and manual collection, so as to form the IP historical reference point data.
3. The dynamic IP positioning clustering method based on scene characteristics as claimed in claim 1, wherein: in step 2, firstly, the distribution characteristics of dynamic IPs and static IPs are analyzed, wherein the geographic distribution characteristics of dynamic IP reference point data are that the historical locations of a single IP are dispersed, the historical location distributions of different IPs under the same C block are similar, and the historical location distributions of adjacent C blocks are similar; the geographic distribution characteristics of static IP reference point data are that the historical locations of a single IP are concentrated, and the historical locations of different IPs under the same C block are far apart; and secondly, based on the obtained IP historical reference point data, the data whose geographic distribution matches the characteristics of dynamic IP reference point data is screened out.
4. The dynamic IP positioning clustering method based on scene characteristics as claimed in claim 1, wherein: in step 3, for one IP block, or for several IP blocks whose reference points have similar geographic distributions, the historical reference point data is clustered by using the density-based DBSCAN clustering method, and each clustering result is expressed as the longitude and latitude of a center position and a corresponding radius, thereby obtaining the center point and the clustering boundary of the dynamic IP and realizing the positioning of the dynamic IP.
Background
IP positioning technology is a technical means of determining the geographic location of a device from its IP address. Ultra-high-precision IP positioning has a wide range of applications: government departments can use it to perform community-level public opinion analysis of online behavior, so as to better understand public needs and formulate policies that benefit the country and its people; security departments can use it to obtain the source location of a network attack and improve network security defense capability; online payment businesses can use it to raise an early warning when a user logs in from a different location, improving transaction security.
According to their regional distribution characteristics, IPs can be divided into static IPs and dynamic IPs. A static IP is used at a fixed place over a fixed period of time; for example, an IP used by a school is used within the school for a long time. A dynamic IP is dynamically allocated and shared within one or more adjacent areas over a period of time; for example, a residential IP is shared among several adjacent residential communities.
Most existing IP positioning products position an IP to one specific geographic location and bind the IP to a single longitude and latitude. The positioning accuracy is typically at the level of the country, province, city, or even street. However, this single-point positioning method cannot objectively reflect the real geographic location of some IPs. From the perspective of how dynamic IPs are allocated and used, for IPs translated at typical NAT egress points such as mobile networks, residential users, and WLANs, the traditional single-point positioning mode easily causes large positioning deviations and cannot objectively reflect the real location of a dynamic IP.
Disclosure of Invention
In order to solve the problems in the background art, the invention provides a dynamic IP positioning clustering method based on scene characteristics.
A dynamic IP positioning clustering method based on scene characteristics comprises:
Step 1, obtaining the IP location distribution from a plurality of data sources over a period of time to form IP historical reference point data;
Step 2, screening out the historical reference point data of dynamic IPs from the obtained IP historical reference point data according to the distribution characteristics of static IPs and dynamic IPs;
Step 3, clustering the historical reference point data of the dynamic IPs by using a clustering algorithm to realize the positioning of the dynamic IPs.
Based on the above, in step 1, the geographic location distribution of IPs over a period of time is obtained through distributed web crawler technology and manual collection, so as to form the IP historical reference point data.
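As an illustration of step 1, the following is a minimal sketch of how observations from several sources might be merged into per-IP historical reference point data. The source does not specify a record format; the `ReferencePoint` fields, the `source` labels, and the `build_history` helper are all hypothetical names introduced here for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ReferencePoint:
    """One observed location of an IP (hypothetical record layout)."""
    ip: str
    lat: float
    lon: float
    observed_at: datetime
    source: str  # hypothetical label, e.g. "crawler" or "manual"


def build_history(*sources):
    """Merge observations from several sources into a per-IP history.

    Observations with the same IP, coordinates, and timestamp are treated
    as duplicates (an assumption) and kept only once.
    """
    history = defaultdict(list)
    seen = set()
    for source in sources:
        for point in source:
            key = (point.ip, point.lat, point.lon, point.observed_at)
            if key in seen:
                continue
            seen.add(key)
            history[point.ip].append(point)
    # Keep each IP's reference points in chronological order.
    for points in history.values():
        points.sort(key=lambda p: p.observed_at)
    return dict(history)
```

Calling `build_history(crawler_points, manual_points)` would return a dictionary that maps each IP to its time-ordered reference points, which is the shape of input the later steps need.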
Based on the above, in step 2, firstly, the distribution characteristics of dynamic IPs and static IPs are analyzed. The geographic distribution characteristics of dynamic IP reference point data are that the historical locations of a single IP are dispersed, the historical location distributions of different IPs under the same C block are similar, and the historical location distributions of adjacent C blocks are similar. The geographic distribution characteristics of static IP reference point data are that the historical locations of a single IP are concentrated, and the historical locations of different IPs under the same C block are far apart. Secondly, based on the obtained IP historical reference point data, the data whose geographic distribution matches the characteristics of dynamic IP reference point data is screened out.
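The screening in step 2 can be sketched as follows. This is only one possible reading of the characteristics described above, not the method's actual criteria: it assumes IPv4, takes a C block to be the first three octets of the address, measures the dispersion of a single IP as the mean great-circle distance of its points to their centroid, and measures similarity between IPs as the distance between their centroids. The two thresholds and all function names are hypothetical, and the comparison between adjacent C blocks is omitted for brevity.

```python
import math
from collections import defaultdict

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(h)))


def centroid(points):
    """Arithmetic mean of the coordinates (adequate for city-scale data)."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def dispersion_km(points):
    """Mean distance of one IP's historical points to their centroid."""
    c = centroid(points)
    return sum(haversine_km(p, c) for p in points) / len(points)


def c_block(ip):
    """First three octets of an IPv4 address (assumed meaning of 'C block')."""
    return ip.rsplit(".", 1)[0]


def screen_dynamic_blocks(history, min_dispersion_km=1.0, max_centroid_gap_km=2.0):
    """Return the C blocks whose IPs look dynamic.

    history maps an IP string to a list of (lat, lon) reference points.
    Both thresholds are hypothetical values chosen for illustration.
    """
    blocks = defaultdict(dict)
    for ip, points in history.items():
        if points:
            blocks[c_block(ip)][ip] = points
    dynamic = set()
    for prefix, ips in blocks.items():
        # Dynamic characteristic 1: each single IP is dispersed.
        dispersed = all(dispersion_km(p) >= min_dispersion_km for p in ips.values())
        # Dynamic characteristic 2: different IPs in the block are distributed alike.
        cents = [centroid(p) for p in ips.values()]
        similar = all(haversine_km(a, b) <= max_centroid_gap_km
                      for i, a in enumerate(cents) for b in cents[i + 1:])
        if dispersed and similar:
            dynamic.add(prefix)
    return dynamic
```

A block whose IPs each stay at one point, with the IPs far from one another, fails both checks and is treated as static, which matches the static IP characteristics described above.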
Based on the above, in step 3, for one IP block, or for several IP blocks whose reference points have similar geographic distributions, the historical reference point data is clustered by using the density-based DBSCAN clustering method, and each clustering result is expressed as the longitude and latitude of a center position and a corresponding radius, so as to obtain the center point and the clustering boundary of the dynamic IP, thereby realizing the positioning of the dynamic IP.
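The clustering in step 3 can be illustrated with a small, dependency-free DBSCAN over great-circle distances. This is a sketch under stated assumptions rather than the implementation used by the method: the source does not give the neighborhood radius or the minimum point count, so `eps_km` and `min_pts` are hypothetical parameters, and the choice of the cluster centroid as the center and the largest member distance as the radius is likewise an assumption.

```python
import math

EARTH_RADIUS_KM = 6371.0
NOISE = -1


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(h)))


def dbscan(points, eps_km, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE (-1)."""
    labels = [None] * len(points)  # None means "not visited yet"

    def neighbors(i):
        # Indices within eps_km of point i; the point itself is included.
        return [j for j, q in enumerate(points)
                if haversine_km(points[i], q) <= eps_km]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:  # core point: keep expanding
                queue.extend(reachable)
    return labels


def summarize(points, labels):
    """Return one (center_lat, center_lon, radius_km) tuple per cluster."""
    circles = []
    for cluster in sorted(set(labels) - {NOISE}):
        members = [p for p, label in zip(points, labels) if label == cluster]
        center = (sum(p[0] for p in members) / len(members),
                  sum(p[1] for p in members) / len(members))
        radius = max(haversine_km(center, p) for p in members)
        circles.append((center[0], center[1], radius))
    return circles
```

For the historical reference points of a dynamic IP block, `summarize(points, dbscan(points, eps_km, min_pts))` would yield one circle per dense area, while isolated points are labeled as noise and do not affect any center or radius.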
Compared with the prior art, the present invention has outstanding substantive features and represents notable progress. Specifically, based on the historical distribution of a dynamic IP over a period of time, the method applies a clustering algorithm to obtain the range over which the real location of the dynamic IP is distributed, expresses that range as the longitude and latitude of a center position and a corresponding radius, and thereby realizes the positioning of the dynamic IP. This solves the problem that existing IP positioning products cannot accurately position dynamic IPs.
Drawings
FIG. 1 is a diagram of the historical reference point distribution and the clustering results for a dynamic IP block.
In FIG. 1, 1) the inverted drop-shaped markers represent historical reference point data; 2) the circles represent the clustering results.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without inventive effort based on the embodiments of the present invention, are within the scope of the present invention.
A dynamic IP positioning clustering method based on scene characteristics is explained by taking the processing of a dynamic IP block as an example.
Step 1, obtaining the IP location distribution from a plurality of data sources over a period of time to form IP historical reference point data.
Through distributed web crawler technology and manual collection, the geographic location distribution of the IPs over a period of time is obtained, giving the distribution shown by the inverted drop-shaped markers in FIG. 1.
Step 2, screening out the historical reference point data of dynamic IPs from the obtained IP historical reference point data according to the distribution characteristics of static IPs and dynamic IPs.
Based on the distribution characteristics of the historical reference point data under the IP block in FIG. 1, the IPs in this block are identified as dynamic IPs.
Step 3, clustering the historical reference point data of the dynamic IPs by using a clustering algorithm to realize the positioning of the dynamic IPs.
For the dynamic IP block shown in FIG. 1, its historical reference point data is clustered by using the density-based DBSCAN clustering method, and each clustering result is expressed as the longitude and latitude of a center position and a corresponding radius, as shown by the circles in FIG. 1, so as to realize the positioning of the dynamic IP.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from its spirit or essential attributes. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein.