Method for measuring multi-boundary search performance of metric space partitioning, and related components

Document No.: 7800  Publication date: 2021-09-17

1. A method for measuring multi-boundary search performance of metric space partitioning, characterized by comprising the following steps:

obtaining a plurality of partition boundary groups in a metric space, wherein each partition boundary group comprises a first partition boundary and a second partition boundary;

for each partition boundary group, calculating normal vectors of the first partition boundary and the second partition boundary to obtain a first normal vector and a second normal vector, respectively;

for each partition boundary group, calculating the cosine of the included angle between the first normal vector and the second normal vector, and taking it as the cosine value of that partition boundary group;

and comparing the cosine values of the partition boundary groups, and determining the search performance of each partition boundary group according to the comparison result.

2. The method of claim 1, wherein obtaining the plurality of partition boundary groups in the metric space, each partition boundary group comprising a first partition boundary and a second partition boundary, comprises:

selecting different support points, and partitioning the same data range in the metric space twice to obtain a partition boundary group comprising a first partition boundary and a second partition boundary;

and continuing to partition the data differently according to different weighted distances from the data to the support points to obtain the next partition boundary group, and so on, to obtain the plurality of partition boundary groups.

3. The method of claim 2, wherein calculating, for each partition boundary group, the normal vectors of the first partition boundary and the second partition boundary to obtain the first normal vector and the second normal vector, respectively, comprises:

mapping the first partition boundary and the second partition boundary to a support point space to obtain a corresponding first partition hyperplane and a corresponding second partition hyperplane;

and calculating the weights of the first partition hyperplane and the second partition hyperplane, respectively, taking the weights of the first partition hyperplane as the first normal vector of the first partition boundary, and taking the weights of the second partition hyperplane as the second normal vector of the second partition boundary.

4. The method of claim 1, wherein calculating the cosine of the included angle between the first normal vector and the second normal vector comprises:

calculating the cosine of the included angle between the first normal vector and the second normal vector according to the following formula:

$$\cos\theta = \frac{\sum_{i=1}^{n} a_i b_i}{\sqrt{\sum_{i=1}^{n} a_i^2}\,\sqrt{\sum_{i=1}^{n} b_i^2}}$$

where $(a_1, a_2, \ldots, a_n)$ are the coordinates of the first normal vector and $(b_1, b_2, \ldots, b_n)$ are the coordinates of the second normal vector.

5. The method of claim 1, wherein the metric space is a pair (M, d), where M is a finite non-empty data set and d is a distance function defined on M.

6. The method of claim 5, wherein the metric space (M, d) satisfies:

given data $S = \{s_i \mid s_i \in M,\ i = 1, 2, \ldots, m\}$, n support points $P = \{p_1, p_2, \ldots, p_n\}$ are selected in S; for any $s \in S$, the distances $d(s, p_i)$ from the data to the support points are taken as coordinates, defining a mapping from M to n-dimensional space; with $s^P$ denoting the image of s in the n-dimensional space, the mapping function $F_{P,d}$ is as follows:

$$F_{P,d}(s) = (f_1(s), f_2(s), \ldots, f_n(s)) = (d(s, p_1), d(s, p_2), \ldots, d(s, p_n)) \in F_{P,d}(M),$$

the support point space is the image of S in $\mathbb{R}^n$:

$$F_{P,d}(S) = \{ s^P \mid s^P = (d(s, p_1), d(s, p_2), \ldots, d(s, p_n)),\ s \in S \}.$$

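7. The method of claim 6, wherein the linear partition boundary rule satisfies:

for the metric space (M, d), n support points $p_1, p_2, \ldots, p_n$ are selected in S, and the following linear relationship exists:

$$a_1 \cdot d(s, p_1) + a_2 \cdot d(s, p_2) + \cdots + a_n \cdot d(s, p_n) = c, \quad c \in \mathbb{R},\ a_i \in \mathbb{R},$$

which, with $x_i = d(s, p_i)$ as the coordinates in the support point space, takes the form

$$a_1 \cdot x_1 + a_2 \cdot x_2 + \cdots + a_n \cdot x_n = c, \quad c \in \mathbb{R},\ a_i \in \mathbb{R}.$$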

8. An apparatus for measuring multi-boundary search performance of metric space partitioning, comprising:

an obtaining unit, configured to obtain a plurality of partition boundary groups in a metric space, wherein each partition boundary group comprises a first partition boundary and a second partition boundary;

a first calculation unit, configured to calculate, for each partition boundary group, normal vectors of the first partition boundary and the second partition boundary to obtain a first normal vector and a second normal vector, respectively;

a second calculation unit, configured to calculate, for each partition boundary group, the cosine of the included angle between the first normal vector and the second normal vector, and to take it as the cosine value of that partition boundary group;

and a comparison unit, configured to compare the cosine values of the partition boundary groups and determine the search performance of each partition boundary group according to the comparison result.

9. A computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the method for measuring multi-boundary search performance of metric space partitioning according to any one of claims 1 to 7.

10. A computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the method for measuring multi-boundary search performance of metric space partitioning according to any one of claims 1 to 7.

Background

Existing partition-based metric space indexes, such as those built on hyperplane partitioning, have good geometric properties: the regions produced by the partition do not overlap one another.

Existing optimizations of the classical indexes start from further processing of the partitioned data and from the distribution of the data; few attempts optimize an index through the form of its partition boundaries. There are three reasons. First, different indexes are compared through experiments, and there is no systematic, theoretical method for objectively evaluating the strengths and weaknesses of different methods, so the intrinsic differences between partitioning methods that lie behind the experiments are not objectively reflected. Second, dedicated index and range-search code must be written for each partition, so the experiment cost is high. Third, a range search must be run over all data in the database with each index, and the range-search time or the number of distance calculations must be recorded for each index, which is costly in time and inefficient.

Disclosure of Invention

The invention aims to provide a method for measuring multi-boundary search performance of metric space partitioning, and related components, in order to solve the problems of high experiment cost, high time cost and low efficiency that arise when the performance of several different groups of partition boundaries in a metric space is analyzed.

To solve the above technical problem, the invention adopts the following technical solution: a method for measuring multi-boundary search performance of metric space partitioning, comprising:

obtaining a plurality of partition boundary groups in a metric space, wherein each partition boundary group comprises a first partition boundary and a second partition boundary;

for each partition boundary group, calculating normal vectors of the first partition boundary and the second partition boundary to obtain a first normal vector and a second normal vector, respectively;

for each partition boundary group, calculating the cosine of the included angle between the first normal vector and the second normal vector, and taking it as the cosine value of that partition boundary group;

and comparing the cosine values of the partition boundary groups, and determining the search performance of each partition boundary group according to the comparison result.

Another object of the present invention is to provide an apparatus for measuring multi-boundary search performance of metric space partitioning, which includes:

an obtaining unit, configured to obtain a plurality of partition boundary groups in a metric space, wherein each partition boundary group comprises a first partition boundary and a second partition boundary;

a first calculation unit, configured to calculate, for each partition boundary group, normal vectors of the first partition boundary and the second partition boundary to obtain a first normal vector and a second normal vector, respectively;

a second calculation unit, configured to calculate, for each partition boundary group, the cosine of the included angle between the first normal vector and the second normal vector, and to take it as the cosine value of that partition boundary group;

and a comparison unit, configured to compare the cosine values of the partition boundary groups and determine the search performance of each partition boundary group according to the comparison result.

In addition, an embodiment of the present invention further provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the method for measuring multi-boundary search performance of metric space partitioning described above.

In addition, an embodiment of the present invention further provides a computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to execute the method for measuring multi-boundary search performance of metric space partitioning described above.

The embodiments of the invention disclose a method for measuring multi-boundary search performance of metric space partitioning, and related components. The method comprises: obtaining a plurality of partition boundary groups in a metric space, wherein each partition boundary group comprises a first partition boundary and a second partition boundary; for each partition boundary group, calculating normal vectors of the first partition boundary and the second partition boundary to obtain a first normal vector and a second normal vector, respectively; for each partition boundary group, calculating the cosine of the included angle between the first normal vector and the second normal vector, and taking it as the cosine value of that partition boundary group; and comparing the cosine values of the partition boundary groups, and determining the search performance of each partition boundary group according to the comparison result. By calculating the cosine value of each partition boundary group, the embodiments can determine the partition form of the first and second partition boundaries in each group and analyze that form, so as to select the partition boundary group with the best search performance, with the advantages of low experiment cost, low time cost and high comparison efficiency.

Drawings

In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.

Fig. 1 is a schematic flowchart of a method for measuring multi-boundary search performance of metric space partitioning according to an embodiment of the present invention;

Fig. 2 is a schematic sub-flowchart of step S101 according to an embodiment of the present invention;

Fig. 3 is a schematic sub-flowchart of step S102 according to an embodiment of the present invention;

Fig. 4 is a schematic diagram of the partition performance of a partition boundary group according to an embodiment of the present invention;

Fig. 5 is a schematic diagram of the partition performance of another partition boundary group according to an embodiment of the present invention;

Fig. 6 is a schematic block diagram of an apparatus for measuring multi-boundary search performance of metric space partitioning according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.

It is also to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the specification of the present invention and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.

It should be further understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.

Referring to Fig. 1, Fig. 1 is a schematic flowchart of a method for measuring multi-boundary search performance of metric space partitioning according to an embodiment of the present invention.

As shown in Fig. 1, the method includes steps S101 to S104.

S101, obtaining a plurality of partition boundary groups in a metric space, wherein each partition boundary group comprises a first partition boundary and a second partition boundary.

Specifically, as shown in Fig. 2, step S101 includes:

S201, selecting different support points, and partitioning the same data range in the metric space twice to obtain a partition boundary group comprising a first partition boundary and a second partition boundary;

S202, continuing to partition the data differently according to different weighted distances from the data to the support points to obtain the next partition boundary group, and so on, to obtain a plurality of partition boundary groups.

The metric space is a pair (M, d), where M is a finite non-empty data set and d is a distance function defined on M.

The metric space (M, d) satisfies:

Given data $S = \{s_i \mid s_i \in M,\ i = 1, 2, \ldots, m\}$, n support points $P = \{p_1, p_2, \ldots, p_n\}$ are selected in S. For any $s \in S$, the distances $d(s, p_i)$ from the data to the support points are taken as coordinates, which defines a mapping from M to n-dimensional space. With $s^P$ denoting the image of s in the n-dimensional space, the mapping function $F_{P,d}$ is as follows:

$$F_{P,d}(s) = (f_1(s), f_2(s), \ldots, f_n(s)) = (d(s, p_1), d(s, p_2), \ldots, d(s, p_n)) \in F_{P,d}(M),$$

The support point space is the image of S in $\mathbb{R}^n$:

$$F_{P,d}(S) = \{ s^P \mid s^P = (d(s, p_1), d(s, p_2), \ldots, d(s, p_n)),\ s \in S \}.$$

For example, assume three data points $s_1, s_2, s_3$ in the metric space, where $d(s_2, s_1) = 12$, $d(s_2, s_3) = 23$ and $d(s_1, s_3) = 13$. When $s_1$ and $s_3$ are selected as the two support points, the resulting support point space has dimension 2, and the images of $s_1, s_2, s_3$ in the support point space are, respectively, $s_1^P = (d(s_1, s_1), d(s_1, s_3)) = (0, 13)$, $s_2^P = (d(s_2, s_1), d(s_2, s_3)) = (12, 23)$ and $s_3^P = (d(s_3, s_1), d(s_3, s_3)) = (13, 0)$.
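The mapping into the support point space can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `map_to_support_point_space`, the lookup table `PAIR_DISTANCES` and the helper `example_distance` are hypothetical names introduced here, and the three pairwise distances are the ones from the example above.

```python
from typing import Callable, Dict, Hashable, Sequence, Tuple


def map_to_support_point_space(
    data: Sequence[Hashable],
    support_points: Sequence[Hashable],
    dist: Callable[[Hashable, Hashable], float],
) -> Dict[Hashable, Tuple[float, ...]]:
    """Return the image s^P = (d(s, p_1), ..., d(s, p_n)) of every s in data."""
    return {s: tuple(dist(s, p) for p in support_points) for s in data}


# Pairwise distances taken from the example (hypothetical lookup table).
PAIR_DISTANCES = {
    frozenset(("s1", "s2")): 12,
    frozenset(("s2", "s3")): 23,
    frozenset(("s1", "s3")): 13,
}


def example_distance(a: str, b: str) -> float:
    """Symmetric distance with d(x, x) = 0, read from the table above."""
    return 0 if a == b else PAIR_DISTANCES[frozenset((a, b))]


# s1 and s3 act as the two support points, so the images are 2-dimensional.
images = map_to_support_point_space(
    ["s1", "s2", "s3"], ["s1", "s3"], example_distance
)
print(images)  # {'s1': (0, 13), 's2': (12, 23), 's3': (13, 0)}
```

Any distance function that satisfies the metric axioms could be passed as `dist`; the lookup table is used here only so that the output matches the worked example.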

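The linear partition boundary rule satisfies:

For the metric space (M, d), n support points $p_1, p_2, \ldots, p_n$ are selected in S, and the following linear relationship exists:

$$a_1 \cdot d(s, p_1) + a_2 \cdot d(s, p_2) + \cdots + a_n \cdot d(s, p_n) = c, \quad c \in \mathbb{R},\ a_i \in \mathbb{R},$$

which, with $x_i = d(s, p_i)$ as the coordinates in the support point space, takes the form

$$a_1 \cdot x_1 + a_2 \cdot x_2 + \cdots + a_n \cdot x_n = c, \quad c \in \mathbb{R},\ a_i \in \mathbb{R}.$$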

This embodiment builds on the metric space, the support point space and the linear partition boundary rule described above. The data may be partitioned using GH partitioning: for the same data range in the metric space, k support points (k ≥ 2) are selected, the distances from the data to the k support points are taken as coordinates, and each data point is assigned to its nearest support point by linear partitioning, which yields a partition boundary. For example, let k = 2. Under the linear partition boundary rule above, $d(s, p_1) = d(s, p_2)$ can serve as the first partition boundary, where s is a data point, $p_1$ and $p_2$ are the support points and $d(s, p)$ is the distance function, and the second partition boundary can be $2 \cdot d(s, p_1) = d(s, p_2)$. This yields a partition boundary group whose first and second partition boundaries take the form of two intersecting hyperplanes.

The data continue to be partitioned in the same manner, using different weighted distances from the data to each support point, to obtain the next partition boundary group. That is, once two support points are selected, a family of linear partition boundaries is obtained: $a_1 \cdot d(s, p_1) + a_2 \cdot d(s, p_2) = c$, where $a_1$, $a_2$ and $c$ can in general be any real numbers. Innumerable partition boundary groups can be drawn from this family, and different partition boundary groups are then selected and compared for partition performance, so that the characteristics of partition boundary groups with better partition performance can be analyzed.
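How a partition boundary group splits the data into blocks can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the helper names `side_of_boundary` and `block_label` are hypothetical, each boundary is represented as a pair (weights, c) for the linear relation $a_1 x_1 + \cdots + a_n x_n = c$, and the images are those of the three-point example given earlier.

```python
from typing import Dict, Sequence, Tuple

Boundary = Tuple[Sequence[float], float]  # (weights a_1..a_n, constant c)


def side_of_boundary(weights: Sequence[float], c: float,
                     image: Sequence[float]) -> int:
    """Sign of a_1*x_1 + ... + a_n*x_n - c: -1, 0 (on the boundary) or +1."""
    value = sum(a * x for a, x in zip(weights, image)) - c
    return (value > 0) - (value < 0)


def block_label(group: Tuple[Boundary, Boundary],
                image: Sequence[float]) -> Tuple[int, int]:
    """Label a point by the side it lies on for each of the two boundaries."""
    (w1, c1), (w2, c2) = group
    return side_of_boundary(w1, c1, image), side_of_boundary(w2, c2, image)


# First boundary:  d(s,p1) = d(s,p2)    ->  1*x - 1*y = 0
# Second boundary: 2*d(s,p1) = d(s,p2)  ->  2*x - 1*y = 0
group: Tuple[Boundary, Boundary] = (((1, -1), 0), ((2, -1), 0))

# Images of s1, s2, s3 in the support point space (from the earlier example).
example_images: Dict[str, Tuple[float, float]] = {
    "s1": (0, 13), "s2": (12, 23), "s3": (13, 0),
}

labels = {name: block_label(group, img) for name, img in example_images.items()}
print(labels)  # {'s1': (-1, -1), 's2': (-1, 1), 's3': (1, 1)}
```

Two boundaries produce up to four sign combinations, which corresponds to the data being divided into at most four blocks by one partition boundary group.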

S102, for each partition boundary group, calculating normal vectors of the first partition boundary and the second partition boundary to obtain a first normal vector and a second normal vector, respectively.

Specifically, as shown in Fig. 3, step S102 includes:

S301, mapping the first partition boundary and the second partition boundary to the support point space to obtain a corresponding first partition hyperplane and a corresponding second partition hyperplane;

S302, calculating the weights of the first partition hyperplane and the second partition hyperplane, respectively, taking the weights of the first partition hyperplane as the first normal vector of the first partition boundary, and taking the weights of the second partition hyperplane as the second normal vector of the second partition boundary.

In this embodiment, when the data are mapped to the support point space, the first partition boundary and the second partition boundary can also be mapped to the support point space, where they are expressed as the first partition hyperplane and the second partition hyperplane. Take GH partitioning as an example: assuming two support points $p_1$ and $p_2$ are selected, GH partitioning in essence assigns each data point to the nearer support point, so the partition boundary is $d(s, p_1) = d(s, p_2)$, where s is any data point. If the data are mapped to the support point space with $p_1$ and $p_2$ as support points, with the x axis representing $d(s, p_1)$ and the y axis representing $d(s, p_2)$, the partition boundary is expressed in the support point space as $x = y$.

The weights of the first and second partition hyperplanes are then calculated. Taking $d(s, p_1) = d(s, p_2)$ as the first partition boundary, that is, $d(s, p_1) - d(s, p_2) = 0$, the weights of the first partition hyperplane are $a_1 = 1$, $a_2 = -1$.

The weights of the first partition hyperplane are taken as the first normal vector of the first partition boundary, and the weights of the second partition hyperplane are taken as the second normal vector of the second partition boundary.
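As a worked illustration, the two example boundaries given earlier for k = 2 can be written in the support point space with $x = d(s, p_1)$ and $y = d(s, p_2)$. The second normal vector below is read off from the example second boundary $2 \cdot d(s, p_1) = d(s, p_2)$; the description does not state it explicitly, so it is shown here only as a sketch.

```latex
\begin{aligned}
\text{first boundary:}\quad  & d(s,p_1) - d(s,p_2) = 0 \;\Rightarrow\; x - y = 0,
  & (a_1, a_2) &= (1, -1),\\
\text{second boundary:}\quad & 2\,d(s,p_1) - d(s,p_2) = 0 \;\Rightarrow\; 2x - y = 0,
  & (b_1, b_2) &= (2, -1).
\end{aligned}
```

In both cases the normal vector is simply the coefficient vector of the hyperplane equation, which is why no separate geometric computation is needed for this step.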

S103, for each partition boundary group, calculating the cosine of the included angle between the first normal vector and the second normal vector, and taking it as the cosine value of that partition boundary group.

Specifically, step S103 includes:

calculating the cosine of the included angle between the first normal vector and the second normal vector according to the following formula:

$$\cos\theta = \frac{\sum_{i=1}^{n} a_i b_i}{\sqrt{\sum_{i=1}^{n} a_i^2}\,\sqrt{\sum_{i=1}^{n} b_i^2}}$$

where $(a_1, a_2, \ldots, a_n)$ are the coordinates of the first normal vector, $(b_1, b_2, \ldots, b_n)$ are the coordinates of the second normal vector, and n is the number of support points.

In this embodiment, the coordinates of the first normal vector and of the second normal vector are substituted into the above formula to obtain the cosine value of the corresponding partition boundary group.
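The cosine computation of step S103 can be sketched directly from the standard formula for the angle between two vectors. The function name `cosine_of_included_angle` is a hypothetical name, and the two normal vectors are the ones that follow from the example boundaries $d(s, p_1) = d(s, p_2)$ and $2 \cdot d(s, p_1) = d(s, p_2)$.

```python
import math
from typing import Sequence


def cosine_of_included_angle(a: Sequence[float], b: Sequence[float]) -> float:
    """cos(theta) = sum(a_i * b_i) / (||a|| * ||b||) for two normal vectors."""
    if len(a) != len(b):
        raise ValueError("normal vectors must have the same dimension")
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("a normal vector must be non-zero")
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


first_normal = (1, -1)   # from x - y = 0
second_normal = (2, -1)  # from 2x - y = 0
cos_value = cosine_of_included_angle(first_normal, second_normal)
print(round(cos_value, 4))  # 0.9487, i.e. 3 / sqrt(10)
```

The zero-vector check is an addition made for robustness in this sketch; a hyperplane with all-zero weights does not define a partition boundary in the first place.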

S104, comparing the cosine values of the partition boundary groups, and determining the search performance of each partition boundary group according to the comparison result.

In this embodiment, the smaller the cosine value of a partition boundary group, the smaller the superset of the search hypercube, and hence the smaller the set that must be scanned linearly during a range search. By the properties of the cosine function, the closer the included angle between the first normal vector and the second normal vector is to π/2, the smaller the cosine value. The cosine values of the partition boundary groups are therefore compared, and the partition boundary group with the smallest cosine value is determined to have the best search performance.
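The comparison in step S104 can be sketched as a simple ranking. This sketch makes one assumption that the text does not state: groups are ranked by the absolute value of the cosine, so that an included angle slightly above π/2 (negative cosine) is treated the same as one slightly below it. The names `rank_groups`, `group_A`, `group_B` and `group_C`, and the normal vectors assigned to them, are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Normal = Sequence[float]


def cosine(a: Normal, b: Normal) -> float:
    """Cosine of the included angle between two non-zero normal vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def rank_groups(groups: Dict[str, Tuple[Normal, Normal]]) -> List[Tuple[str, float]]:
    """Sort groups by |cos| ascending; the first entry is the preferred group.

    Using the absolute value is an assumption of this sketch.
    """
    scores = {name: abs(cosine(a, b)) for name, (a, b) in groups.items()}
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical partition boundary groups, each given as (first, second) normal.
groups = {
    "group_A": ((1, -1), (2, -1)),  # |cos| = 3/sqrt(10), about 0.949
    "group_B": ((1, -1), (1, 1)),   # perpendicular boundaries, cos = 0
    "group_C": ((1, -1), (1, -3)),  # |cos| = 4/sqrt(20), about 0.894
}

ranking = rank_groups(groups)
best_name = ranking[0][0]
print(best_name)  # group_B
```

No index is built and no range search is executed here: the ranking uses only the boundary coefficients, which is the source of the cost savings the method claims.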

Specifically, as shown in Fig. 4 and Fig. 5, the search performance of two partition boundary groups can be compared visually. Suppose the first and second partition boundaries of one partition boundary group are $H_1$ and $H_2$, and the first and second partition boundaries of another partition boundary group are $K_1$ and $K_2$.

The r-neighborhoods (the non-blank portions) formed by the two partition boundary groups differ because the included angle θ between $H_1$ and $H_2$ differs from the included angle between $K_1$ and $K_2$. Note that the r-neighborhood is the region near a partition boundary: when the center q of a range query R(q, r) falls into this region, the regions on both sides of the partition boundary cannot be excluded during the range search.

It can be seen that the smaller the r-neighborhood, the less likely a query point is to fall into it. As Fig. 4 and Fig. 5 show, the data are divided into 4 blocks. If the query point falls in a blank region, the other three blocks can be excluded without further querying. If the query point falls in one of the other shaded regions (the light shading in the figures), the two blocks that do not adjoin it can be excluded. If the query point falls in the black shaded region, none of the 4 blocks can be excluded and every block must be searched. Calculating the overall exclusion ratio mathematically therefore leads to the conclusion that when θ = π/2 the area of the intersection region is smallest and, in probabilistic terms, the most data can be excluded, that is, the expected exclusion ratio over query points is the largest. Hence the partition boundary group with the smallest cosine value has the best search performance.
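The dependence on θ can be made concrete under a simplifying assumption that is not part of the description: treat the r-neighborhood of each partition boundary as a Euclidean strip of half-width r in a two-dimensional support point space, and ignore the limits of the data range. The region where no block can be excluded (the black region) is then the overlap of two strips of width 2r crossing at angle θ, which is a parallelogram with area

```latex
A(\theta) = \frac{(2r)(2r)}{\sin\theta} = \frac{4r^{2}}{\sin\theta},
\qquad 0 < \theta < \pi,
\qquad
\frac{dA}{d\theta} = -\frac{4r^{2}\cos\theta}{\sin^{2}\theta} = 0
\iff \theta = \frac{\pi}{2}.
```

Since $\sin\theta \le 1$, $A(\theta) \ge 4r^2$ with equality exactly at θ = π/2, which is consistent with the conclusion that the intersection region is smallest when the two boundaries are perpendicular. The actual shape of the r-neighborhood in the support point space may differ from this idealized strip.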

An embodiment of the invention also provides an apparatus for measuring multi-boundary search performance of metric space partitioning, which is used to execute any embodiment of the method described above. Specifically, referring to Fig. 6, Fig. 6 is a schematic block diagram of an apparatus for measuring multi-boundary search performance of metric space partitioning according to an embodiment of the present invention.

As shown in Fig. 6, the apparatus 600 for measuring multi-boundary search performance of metric space partitioning includes: an obtaining unit 601, a first calculation unit 602, a second calculation unit 603 and a comparison unit 604.

The obtaining unit 601 is configured to obtain a plurality of partition boundary groups in a metric space, wherein each partition boundary group comprises a first partition boundary and a second partition boundary;

the first calculation unit 602 is configured to calculate, for each partition boundary group, normal vectors of the first partition boundary and the second partition boundary to obtain a first normal vector and a second normal vector, respectively;

the second calculation unit 603 is configured to calculate, for each partition boundary group, the cosine of the included angle between the first normal vector and the second normal vector, and to take it as the cosine value of that partition boundary group;

the comparison unit 604 is configured to compare the cosine values of the partition boundary groups and determine the search performance of each partition boundary group according to the comparison result.

The apparatus analyzes and judges the range-search performance of different groups of partition boundaries on the same basis, so the merits of two or more partition boundary groups can be compared directly. Because the feasibility of this comparison and analysis is supported by mathematical theory, the result is more objective.

The apparatus can analyze performance by directly comparing the partition forms of the partition boundary groups produced by different partitioning methods, without building an index and running search operations for each method, which saves the space and time cost of the comparative analysis.

In an embodiment, the obtaining unit 601 includes:

a first partitioning unit, configured to select different support points and partition the same data range in the metric space twice to obtain a partition boundary group comprising a first partition boundary and a second partition boundary, wherein a single partitioning process comprises: selecting two support points in the metric space, taking the distances from the data to the two support points as coordinates, and assigning each data point to its nearest support point;

and a second partitioning unit, configured to continue partitioning the data differently according to different weighted distances from the data to the support points to obtain the next partition boundary group, and so on, to obtain a plurality of partition boundary groups.

In an embodiment, the first calculation unit 602 includes:

a mapping unit, configured to map the first partition boundary and the second partition boundary to the support point space to obtain a corresponding first partition hyperplane and a corresponding second partition hyperplane;

and a normal vector calculation unit, configured to calculate the weights of the first partition hyperplane and the second partition hyperplane, respectively, to take the weights of the first partition hyperplane as the first normal vector of the first partition boundary, and to take the weights of the second partition hyperplane as the second normal vector of the second partition boundary.

An embodiment of the present invention further provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the method for measuring multi-boundary search performance of metric space partitioning described above.

In another embodiment of the invention, a computer-readable storage medium is provided. The computer-readable storage medium may be a non-volatile computer-readable storage medium. The computer-readable storage medium stores a computer program which, when executed by a processor, causes the processor to perform the method for measuring multi-boundary search performance of metric space partitioning described above.

It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described apparatuses, devices and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again. Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein may be implemented in electronic hardware, computer software, or a combination of both, and that the components and steps of the examples have been described above generally in terms of their functions in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and the design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.

In the embodiments provided by the present invention, it should be understood that the disclosed apparatus, device and method can be implemented in other ways. For example, the above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only a logical division, and there may be other divisions when the actual implementation is performed, or units having the same function may be grouped into one unit, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may also be an electric, mechanical or other form of connection.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment of the present invention.

In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.

The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a storage medium. Based on such understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, or all or part of the technical solution, can be embodied in the form of a software product that is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a magnetic disk, or an optical disk.

While the invention has been described with reference to specific embodiments, the invention is not limited thereto, and various equivalent modifications and substitutions can be easily made by those skilled in the art within the technical scope of the invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
