Performance measurement method and device for metric space partition boundaries, and related equipment
1. A performance measurement method for metric space partition boundaries, characterized by comprising the following steps:
acquiring weight vectors corresponding to different partition boundaries of a metric space according to a linear partition boundary rule;
counting the weight vectors corresponding to different partition boundaries to obtain a weight sequence corresponding to each partition boundary;
calculating the performance of each partition boundary based on the weight sequence to obtain a performance coefficient;
and comparing the performance coefficients, and determining that the partition boundary with the minimum performance coefficient has the optimal performance.
2. The method of claim 1, wherein before obtaining weight vectors corresponding to different partition boundaries of the metric space according to the linear partition boundary rule, the method comprises:
dividing a metric space of data in a database to obtain a plurality of weight vectors;
selecting a plurality of support points by using a point selection algorithm;
and mapping the data in the metric space into multi-dimensional vector data by taking the distances from the data to the support points as coordinates, to obtain a support point space.
3. The method according to claim 1, wherein the calculating the performance of each partition boundary based on the weight sequence to obtain the performance coefficient comprises:
calculating the standard deviation and the mean of the weight sequence;
and taking the ratio of the standard deviation to the mean as the performance coefficient of the partition boundary corresponding to the weight sequence.
4. The method of claim 1, wherein the metric space is an ordered pair (M, d), where M is a finite non-empty data set and d is a distance function defined over M.
5. The method of claim 4, wherein the distance function satisfies the following condition:
for any $x \in M$, $y \in M$: $d(x, y) \geq 0$, and $d(x, y) = 0$ if and only if $x = y$;
for any $x \in M$, $y \in M$: $d(x, y) = d(y, x)$;
for any $x \in M$, $y \in M$, $z \in M$: $d(x, y) + d(y, z) \geq d(x, z)$.
6. The method for performance measurement of a metric space partition boundary according to claim 5, characterized in that the metric space (M, d) satisfies:
for data $S = \{s_i \mid s_i \in M,\ i = 1, 2, \ldots, m\}$, there are $n$ support points $P = \{p_1, p_2, \ldots, p_n\}$ in $S$; for any $s \in S$, the distances $d(s, p_i)$ from the data to the support points are taken as coordinates, defining a mapping from $M$ to $n$-dimensional space; with $s^P$ denoting the image of $s$ in the $n$-dimensional space, the mapping function $F_{P,d}$ is as follows:
$F_{P,d}(s) = (f_1(s), f_2(s), \ldots, f_n(s)) = (d(s, p_1), d(s, p_2), \ldots, d(s, p_n)) \in F_{P,d}(M)$,
and the support point space $F_{P,d}(S)$ is the image of $S$ in $\mathbb{R}^n$:
$F_{P,d}(S) = \{s^P \mid s^P = (d(s, p_1), d(s, p_2), \ldots, d(s, p_n)),\ s \in S\}$.
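7. The method of claim 6, wherein the linear partition boundary rule satisfies:
for the metric space (M, d), $n$ support points $p_1, p_2, \ldots, p_n$ are selected in $S$, and for any $s \in S$ the following linear relationship exists:
$a_1 \cdot d(s, p_1) + a_2 \cdot d(s, p_2) + \ldots + a_n \cdot d(s, p_n) = c$, $c \in \mathbb{R}$, $a_i \in \mathbb{R}$, $i = 1, 2, \ldots, n$,
which corresponds, in the support point space, to the hyperplane
$a_1 \cdot x_1 + a_2 \cdot x_2 + \ldots + a_n \cdot x_n = c$, $c \in \mathbb{R}$, $a_i \in \mathbb{R}$, $i = 1, 2, \ldots, n$.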
8. A performance measurement apparatus for metric space partition boundaries, comprising:
the acquisition module is used for acquiring weight vectors corresponding to different partition boundaries of a metric space according to a linear partition boundary rule;
the statistical module is used for counting the weight vectors corresponding to different partition boundaries to obtain a weight sequence corresponding to each partition boundary;
the calculation module is used for calculating the performance of each division boundary based on the weight sequence to obtain a performance coefficient;
and the comparison module is used for comparing the performance coefficients and determining that the partition boundary with the minimum performance coefficient has the optimal performance.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the performance measurement method for measuring a space partitioning boundary according to any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when executed by a processor, causes the processor to perform the method of performance measurement of a metric space partitioning boundary as claimed in any one of claims 1 to 7.
Background
Existing partition-based metric space indexes are built mainly on spherical partitioning or hyperplane partitioning, and can accordingly be classified into two types by the logical form of the subspaces that the partition produces. Performance comparisons between different indexes are carried out with each index's own method. The indexing conditions of different indexing methods often differ, and performance depends on many factors: the choice of support points, the partitioning mode, the balance of the index and so on all strongly affect index performance.
At present, performance comparisons between different metric space indexes are made experimentally. No systematic, theoretical method exists for objectively evaluating the advantages and disadvantages of different methods, so the inherent differences between partitioning methods cannot be objectively reflected. Because different experiments use different data sets, the objectivity of the conclusions is greatly reduced, and the performance measurement itself is inefficient.
Disclosure of Invention
The embodiment of the invention provides a performance measurement method and device for metric space partition boundaries, and related equipment, and aims to solve the problem in the prior art that measuring the performance of metric space partition boundaries is inefficient.
In a first aspect, an embodiment of the present invention provides a performance measurement method for metric space partition boundaries, where the method includes:
acquiring weight vectors corresponding to different partition boundaries of a metric space according to a linear partition boundary rule;
counting the weight vectors corresponding to different partition boundaries to obtain a weight sequence corresponding to each partition boundary;
calculating the performance of each partition boundary based on the weight sequence to obtain a performance coefficient;
and comparing the performance coefficients, and determining that the partition boundary with the minimum performance coefficient has the optimal performance.
In a second aspect, an embodiment of the present invention provides a performance measurement apparatus for metric space partition boundaries, where the apparatus includes:
the acquisition module is used for acquiring weight vectors corresponding to different partition boundaries of a metric space according to a linear partition boundary rule;
the statistical module is used for counting the weight vectors corresponding to different partition boundaries to obtain a weight sequence corresponding to each partition boundary;
the calculation module is used for calculating the performance of each division boundary based on the weight sequence to obtain a performance coefficient;
and the comparison module is used for comparing the performance coefficients and determining that the partition boundary with the minimum performance coefficient has the optimal performance.
In a third aspect, an embodiment of the present invention further provides a computer device, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements the performance measurement method for metric space partition boundaries according to the first aspect when executing the computer program.
In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium, where the computer-readable storage medium stores a computer program, and the computer program, when executed by a processor, causes the processor to execute the performance measurement method for metric space partition boundaries according to the first aspect.
The embodiment of the invention provides a performance measurement method and device for metric space partition boundaries, and related equipment. The method comprises: obtaining weight vectors corresponding to different partition boundaries of a metric space according to a linear partition boundary rule; counting the weight vectors corresponding to different partition boundaries to obtain a weight sequence corresponding to each partition boundary; calculating the performance of each partition boundary based on the weight sequence to obtain a performance coefficient; and comparing the performance coefficients, and determining that the partition boundary with the minimum performance coefficient has the optimal performance. Because the method measures performance mathematically, from the weight vectors of the metric space partition boundaries, the measurement result is more objective, and the efficiency of measuring the performance of different metric space partition boundaries is improved without complicated experiments.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 is a schematic flowchart of a performance measurement method for metric space partition boundaries according to an embodiment of the present invention;
Fig. 2 is a schematic flowchart of the steps performed before step S110 of the performance measurement method for metric space partition boundaries according to the embodiment of the present invention;
Fig. 3 is a sub-flowchart of step S130 of the performance measurement method for metric space partition boundaries according to the embodiment of the present invention;
Fig. 4 is a schematic block diagram of a performance measurement apparatus for metric space partition boundaries according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the specification of the present invention and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
Referring to Fig. 1, Fig. 1 is a flowchart illustrating a performance measurement method for metric space partition boundaries according to an embodiment of the present invention, where the method includes steps S110 to S140.
Step S110, obtaining weight vectors corresponding to different partition boundaries of a metric space according to a linear partition boundary rule;
In this embodiment, the weight vectors corresponding to different partition boundaries of the metric space are obtained according to the linear partition boundary rule of the metric space. The partitioning mode may be hyperplane partitioning or spherical partitioning. The weight vector is the vector representation of a partition boundary after the boundary is mapped into the support point space.
In one embodiment, the metric space is an ordered pair (M, d), where M is a finite non-empty data set and d is a distance function defined over M.
In an embodiment, the distance function satisfies:
for any $x \in M$, $y \in M$: $d(x, y) \geq 0$, and $d(x, y) = 0$ if and only if $x = y$;
for any $x \in M$, $y \in M$: $d(x, y) = d(y, x)$;
for any $x \in M$, $y \in M$, $z \in M$: $d(x, y) + d(y, z) \geq d(x, z)$.
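The three conditions can be checked mechanically on a small finite data set. The following is a minimal sketch, not part of the described method; the function name `is_metric`, the tolerance `tol` and the two sample distance functions are assumptions made only for illustration.

```python
from itertools import product


def is_metric(points, d, tol=1e-12):
    """Return True if d satisfies the three conditions on all of the given points."""
    for x, y in product(points, repeat=2):
        # Non-negativity
        if d(x, y) < -tol:
            return False
        # d(x, y) = 0 exactly when x and y are the same element
        if (abs(d(x, y)) <= tol) != (x == y):
            return False
        # Symmetry
        if abs(d(x, y) - d(y, x)) > tol:
            return False
    for x, y, z in product(points, repeat=3):
        # Triangle inequality
        if d(x, y) + d(y, z) < d(x, z) - tol:
            return False
    return True


def abs_diff(x, y):
    """Absolute difference on real numbers (satisfies all three conditions)."""
    return abs(x - y)


def squared_diff(x, y):
    """Squared difference (violates the triangle inequality)."""
    return (x - y) ** 2


sample = [0.0, 1.0, 2.5, 4.0]
print(is_metric(sample, abs_diff))
print(is_metric(sample, squared_diff))
```

A brute-force check of this kind only covers the sampled points; it can expose a violation but cannot prove that a function is a metric on all of M.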
In one embodiment, the linear partition boundary rule satisfies:
for the metric space (M, d), $n$ support points $p_1, p_2, \ldots, p_n$ are selected in $S$. For any $s \in S$, partitioning the data with the linear relationship
$a_1 \cdot d(s, p_1) + a_2 \cdot d(s, p_2) + \ldots + a_n \cdot d(s, p_n) = c$, $c \in \mathbb{R}$, $a_i \in \mathbb{R}$, $i = 1, 2, \ldots, n$
as the boundary is called linear partitioning, where $a_i$ is the weight of the distance from the data to the $i$-th support point. In the support point space, the partition boundary of a linear partition appears as the hyperplane $a_1 x_1 + a_2 x_2 + \ldots + a_n x_n = c$, $c \in \mathbb{R}$, $a_i \in \mathbb{R}$, $i = 1, 2, \ldots, n$, with $(a_1, a_2, \ldots, a_n)$ as its weight vector. Each boundary of a linear partition can be described by a corresponding linear equation, and each linear partitioning mode can be represented by a corresponding system of linear equations. Unless otherwise stated, the partitioning methods mentioned in the embodiments of the present invention are all linear partitions.
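To illustrate how such a boundary splits data, the sketch below evaluates the left-hand side of the linear relationship for one data item and reports which side of the hyperplane it falls on. The function name `linear_side` and the sample distance values are hypothetical and are not taken from the embodiment.

```python
def linear_side(weights, c, distances):
    """Return -1, 0 or 1 according to whether sum(a_i * d(s, p_i)) is below, on or above c.

    weights   -- the weight vector (a_1, ..., a_n)
    c         -- the constant on the right-hand side of the boundary
    distances -- the coordinates (d(s, p_1), ..., d(s, p_n)) of one data item
    """
    if len(weights) != len(distances):
        raise ValueError("weights and distances must have the same length")
    value = sum(a * dist for a, dist in zip(weights, distances))
    if value < c:
        return -1
    if value > c:
        return 1
    return 0


# Hyperplane-style boundary x1 - x2 = 0 with weight vector (1, -1)
print(linear_side((1, -1), 0, (12, 23)))  # 12 - 23 is below 0
print(linear_side((1, -1), 0, (13, 0)))   # 13 - 0 is above 0

# Spherical-style boundary x1 = r with weight vector (1, 0) and r = 12
print(linear_side((1, 0), 12, (12, 23)))  # lies exactly on the boundary
```

Both partitioning modes are handled by the same function, which reflects the point made above: each boundary is described by one linear equation in the support point space.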
In one embodiment, as shown in fig. 2, before step S110, the method includes:
Step S01, dividing a metric space of the data in the database to obtain a plurality of weight vectors;
Step S02, selecting a plurality of support points by using a point selection algorithm;
and Step S03, mapping the data in the metric space into multi-dimensional vector data by taking the distances from the data to the support points as coordinates, so as to obtain a support point space.
In this embodiment, the metric space of the data in the database is divided to obtain a plurality of weight vectors; a plurality of support points is selected by using a point selection algorithm; and the data in the metric space is mapped into multi-dimensional vector data by taking the distances from the data to the support points as coordinates, so as to obtain the support point space. For example, if the data is to be completely partitioned by two partition boundaries defined on the expressions $d(x, p_1) - d(x, p_2)$ and $d(x, p_1) + d(x, p_2)$, the weight vectors input for the hyperplane partition are (1, -1) and (1, 1), together with the partition radius r of the spherical partition (for example, r = 1).
Further, since the vectors in the vector group (1, -1) and (1, 1) have dimension 2, two support points $p_1, p_2$ can be selected by using a support point selection algorithm (such as the FFT (farthest-first traversal) algorithm, an incremental model algorithm and the like). Then, taking the distances from the data to the support points $p_1, p_2$ as coordinates, the data in the metric space is mapped into two-dimensional vector data. It should be noted that the dimension of the vector data obtained by mapping is equal to the number of selected support points.
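The text names the point selection algorithm only as "FFT". Assuming this refers to farthest-first traversal, a common support point selection heuristic in metric space indexing, the selection could be sketched as follows. The function name, the choice of the first support point and the sample data are assumptions for illustration only.

```python
def farthest_first_traversal(data, d, k, first=0):
    """Select k support points by farthest-first traversal (assumed meaning of 'FFT').

    data  -- list of data items
    d     -- distance function d(x, y)
    k     -- number of support points to select
    first -- index of the arbitrarily chosen first support point (an assumption)
    """
    if not 1 <= k <= len(data):
        raise ValueError("k must be between 1 and the number of data items")
    chosen = [first]
    # min_dist[i] is the distance from data[i] to its nearest chosen support point
    min_dist = [d(data[first], s) for s in data]
    while len(chosen) < k:
        # Next support point: the item farthest from all support points chosen so far
        nxt = max(range(len(data)), key=lambda i: min_dist[i])
        chosen.append(nxt)
        min_dist = [min(m, d(data[nxt], s)) for m, s in zip(min_dist, data)]
    return [data[i] for i in chosen]


def abs_diff(x, y):
    return abs(x - y)


points = [0.0, 1.0, 5.0, 9.0]
print(farthest_first_traversal(points, abs_diff, 2))
print(farthest_first_traversal(points, abs_diff, 3))
```

Each round costs one pass over the data, so selecting k support points takes on the order of k times the number of data items distance computations.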
In one embodiment, for the metric space (M, d) with data $S = \{s_i \mid s_i \in M,\ i = 1, 2, \ldots, m\}$, $n$ support points $P = \{p_1, p_2, \ldots, p_n\}$ are selected in $S$. For any $s \in S$, taking the distances $d(s, p_i)$ from the data to the support points as coordinates defines a mapping from $M$ to $n$-dimensional space. With $s^P$ denoting the image of $s$ in the $n$-dimensional space, the mapping function $F_{P,d}$ is as follows:
$F_{P,d}(s) = (f_1(s), f_2(s), \ldots, f_n(s)) = (d(s, p_1), d(s, p_2), \ldots, d(s, p_n)) \in F_{P,d}(M)$;
the support point space $F_{P,d}(S)$ is the image of $S$ in $\mathbb{R}^n$:
$F_{P,d}(S) = \{s^P \mid s^P = (d(s, p_1), d(s, p_2), \ldots, d(s, p_n)),\ s \in S\}$.
For example, for three data $s_1, s_2, s_3$ in the metric space with $d(s_2, s_1) = 12$, $d(s_2, s_3) = 23$ and $d(s_1, s_3) = 13$, when $s_1$ and $s_3$ are selected as the two support points, the dimension of the resulting support point space is 2, and the images of $s_1, s_2, s_3$ in the support point space are respectively $s_1^P = (d(s_1, s_1), d(s_1, s_3)) = (0, 13)$, $s_2^P = (d(s_2, s_1), d(s_2, s_3)) = (12, 23)$ and $s_3^P = (d(s_3, s_1), d(s_3, s_3)) = (13, 0)$.
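The three-point example can be reproduced with a few lines of code. In this sketch the data items are represented by the labels "s1", "s2" and "s3", and the distance function is a lookup table holding the three distances given above; the names `pivot_map` and `table` are hypothetical.

```python
def pivot_map(s, support_points, d):
    """Map a data item s to its coordinates (d(s, p_1), ..., d(s, p_n))."""
    return tuple(d(s, p) for p in support_points)


# Pairwise distances from the example; the distance is symmetric,
# so each unordered pair is stored once.
table = {
    frozenset(("s1", "s2")): 12,
    frozenset(("s2", "s3")): 23,
    frozenset(("s1", "s3")): 13,
}


def d(x, y):
    """Distance lookup: 0 for identical items, otherwise the tabulated value."""
    return 0 if x == y else table[frozenset((x, y))]


support_points = ("s1", "s3")
images = {s: pivot_map(s, support_points, d) for s in ("s1", "s2", "s3")}
print(images)
```

The number of coordinates in each image equals the number of support points, which is why the support point space here is two-dimensional.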
Step S120, counting the weight vectors corresponding to different partition boundaries to obtain a weight sequence corresponding to each partition boundary;
In this embodiment, the weight vectors corresponding to different partition boundaries are counted to obtain a weight sequence corresponding to each partition boundary. Specifically, for the metric space (M, d) with data $S = \{s_i \mid s_i \in M,\ i = 1, 2, \ldots, m\}$, $n$ support points $P = \{p_1, p_2, \ldots, p_n\}$ are selected in $S$. For any $s \in S$, when the linear relationship $a_1 \cdot d(s, p_1) + a_2 \cdot d(s, p_2) + \ldots + a_n \cdot d(s, p_n) = c$ (where $c$ and $a_i$ are constants, $i = 1, 2, \ldots, n$, and $d(s, p_i)$ is the distance from the data to the $i$-th support point) is used as the boundary, the values of the weight vector $(a_1, a_2, \ldots, a_n)$ determine the partitioning performance. The weight vectors $(a_1, a_2, \ldots, a_n)$ corresponding to the different partition boundaries are counted to obtain the weight sequences $\{|a_1|, |a_2|, \ldots, |a_n|\}$ corresponding to the different partition boundaries.
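As a minimal sketch of this step, the weight sequence of each boundary can be obtained by taking the absolute value of every component of its weight vector. The function name `weight_sequences` and the boundary labels used as dictionary keys are hypothetical.

```python
def weight_sequences(weight_vectors):
    """Map each partition boundary to the absolute values of its weight vector."""
    return {name: [abs(a) for a in vector] for name, vector in weight_vectors.items()}


# Weight vectors of three boundaries, keyed by an illustrative label
boundaries = {
    "x1 - x2 = c": (1, -1),
    "x1 + x2 = c": (1, 1),
    "x1 = r": (1, 0),
}
sequences = weight_sequences(boundaries)
print(sequences)
```

Taking absolute values means that boundaries such as (1, -1) and (1, 1) yield the same weight sequence, so they receive the same performance coefficient in the following step.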
Step S130, calculating the performance of each partition boundary based on the weight sequence to obtain a performance coefficient;
in this embodiment, the performance of each partition boundary is calculated based on the weight sequence of each partition boundary, and the performance coefficients of different partition boundaries are obtained.
In one embodiment, as shown in fig. 3, step S130 includes:
Step S131, calculating the standard deviation and the mean of the weight sequence;
and Step S132, taking the ratio of the standard deviation to the mean as the performance coefficient of the partition boundary corresponding to the weight sequence.
In this embodiment, the closeness of the components of a weight vector is characterized by the ratio of the standard deviation to the mean of their absolute values. The standard deviation reflects the overall dispersion of the data, and the mean reflects the overall magnitude of the weight vector, so their ratio expresses how far a group of data deviates relative to its mean. The smaller the performance coefficient, the closer the values in the weight sequence are to one another, and the better the performance of the partition boundary. Accordingly, the standard deviation and the mean of the weight sequence are calculated, and their ratio is taken as the performance coefficient of the partition boundary corresponding to the weight sequence.
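Written out, the coefficient takes the following form. The symbols $\mu$, $\sigma$ and $C$ are introduced here only for illustration, and the use of the population standard deviation (dividing by $n$) is an assumption, since the text does not say which variant is intended.

```latex
\mu = \frac{1}{n}\sum_{i=1}^{n} |a_i|, \qquad
\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\bigl(|a_i| - \mu\bigr)^{2}}, \qquad
C = \frac{\sigma}{\mu}
```

This ratio is the coefficient of variation of the weight sequence. It is undefined when $\mu = 0$, that is, when every weight is zero.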
For example, if hyperplane partitioning is used with two support points selected, and one of its partition boundaries is mapped to $x_1 - x_2 = 0$ in the support point space, then the weights are $a_1 = 1$, $a_2 = -1$ and the weight sequence is {1, 1}; if spherical partitioning is used with two support points selected, and one of its partition boundaries is mapped to $x_1 = r$ in the two-dimensional support point space, where $r$ is a constant, then $a_1 = 1$, $a_2 = 0$ and the weight sequence is {1, 0}. The standard deviation and the mean of the weight sequences corresponding to the two partition boundaries are then calculated, and the ratio of the standard deviation to the mean is taken as the performance coefficient of each of the two partition boundaries.
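The two weight sequences in this example can be evaluated with the sketch below. It assumes the population standard deviation (`statistics.pstdev`), because the text does not specify which variant is used; with the sample standard deviation the second value would differ. The function name `performance_coefficient` is hypothetical.

```python
from statistics import mean, pstdev


def performance_coefficient(weight_sequence):
    """Ratio of the standard deviation to the mean of a weight sequence.

    Assumption: the population standard deviation is used.
    """
    mu = mean(weight_sequence)
    if mu == 0:
        raise ValueError("the coefficient is undefined when all weights are zero")
    return pstdev(weight_sequence) / mu


hyperplane = performance_coefficient([1, 1])  # boundary x1 - x2 = 0
spherical = performance_coefficient([1, 0])   # boundary x1 = r
print(hyperplane)  # 0.0
print(spherical)   # 1.0
```

Under this assumption the hyperplane boundary has the smaller coefficient, so by the rule described above it would be judged the better of the two.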
And step S140, comparing the performance coefficients and determining that the partition boundary with the minimum performance coefficient has the optimal performance.
In this embodiment, the performance coefficients of the different partition boundaries are compared, and the partition boundary corresponding to the minimum performance coefficient is determined to have the optimal performance.
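Steps S120 to S140 can be combined into one short routine. This is a sketch under the same assumption as before (population standard deviation), and the function name `rank_boundaries` and the boundary labels are hypothetical.

```python
from statistics import mean, pstdev


def rank_boundaries(weight_vectors):
    """Return the boundary with the smallest performance coefficient and all coefficients.

    weight_vectors -- dict mapping a boundary label to its weight vector
    """
    coefficients = {}
    for name, vector in weight_vectors.items():
        sequence = [abs(a) for a in vector]                       # step S120: weight sequence
        coefficients[name] = pstdev(sequence) / mean(sequence)    # step S130: coefficient
    best = min(coefficients, key=coefficients.get)                # step S140: minimum wins
    return best, coefficients


best, coefficients = rank_boundaries({
    "hyperplane: x1 - x2 = 0": (1, -1),
    "spherical: x1 = r": (1, 0),
})
print(best)
print(coefficients)
```

The routine needs only the weight vectors of the boundaries and no data set, which is the basis of the claim that the comparison avoids experiments.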
Because the method measures performance mathematically, from the weight vectors of the metric space partition boundaries, the measurement result is more objective, and the efficiency of measuring the performance of different metric space partition boundaries is improved without complicated experiments.
The embodiment of the invention also provides a performance measurement apparatus for metric space partition boundaries, which is used for executing any embodiment of the foregoing performance measurement method for metric space partition boundaries. Specifically, referring to Fig. 4, Fig. 4 is a schematic block diagram of a performance measurement apparatus for metric space partition boundaries according to an embodiment of the present invention. The performance measurement apparatus 100 for metric space partition boundaries may be configured in a server.
As shown in Fig. 4, the performance measurement apparatus 100 for metric space partition boundaries includes an obtaining module 110, a counting module 120, a calculating module 130, and a comparing module 140.
An obtaining module 110, configured to obtain weight vectors corresponding to different partition boundaries of a metric space according to a linear partition boundary rule;
a counting module 120, configured to count the weight vectors corresponding to different partition boundaries to obtain a weight sequence corresponding to each partition boundary;
a calculating module 130, configured to calculate performance of each partition boundary based on the weight sequence to obtain a performance coefficient;
and the comparison module 140 is configured to compare the performance coefficients and determine that the partition boundary with the minimum performance coefficient has the optimal performance.
In one embodiment, the performance measurement apparatus 100 for metric space partition boundaries further includes:
a dividing module, configured to divide a metric space of the data in the database to obtain a plurality of weight vectors;
a selecting module, configured to select a plurality of support points by using a point selection algorithm;
and a mapping module, configured to map the data in the metric space into multi-dimensional vector data by taking the distances from the data to the support points as coordinates, to obtain a support point space.
In one embodiment, the calculation module 130 includes:
a calculating unit, configured to calculate the standard deviation and the mean of the weight sequence, and to take the ratio of the standard deviation to the mean as the performance coefficient of the partition boundary corresponding to the weight sequence.
The embodiment of the present invention further provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the performance measurement method for metric space partition boundaries as described above when executing the computer program.
In another embodiment of the invention, a computer-readable storage medium is provided. The computer-readable storage medium may be a non-volatile computer-readable storage medium. The computer-readable storage medium stores a computer program which, when executed by a processor, causes the processor to perform the performance measurement method for metric space partition boundaries as described above.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described apparatuses, devices and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again. Those of ordinary skill in the art will appreciate that the elements and algorithm steps of the examples described in connection with the embodiments disclosed herein may be embodied in electronic hardware, computer software, or combinations of both, and that the components and steps of the examples have been described above generally in terms of their functions in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the embodiments provided by the present invention, it should be understood that the disclosed apparatus, device and method can be implemented in other ways. For example, the above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only a logical division, and there may be other divisions when the actual implementation is performed, or units having the same function may be grouped into one unit, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may also be an electric, mechanical or other form of connection.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment of the present invention.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a storage medium. Based on such understanding, the technical solution of the present invention in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product that is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a magnetic disk, or an optical disk.
While the invention has been described with reference to specific embodiments, the invention is not limited thereto, and various equivalent modifications and substitutions can be easily made by those skilled in the art within the technical scope of the invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.