Visual positioning method and related device, equipment and storage medium
1. A visual positioning method, comprising:
adjusting the position of a target space point to a reference plane to adjust a preset plane of a world coordinate system to the reference plane, wherein the target space point corresponds to a first feature point in a first image frame, and the target space point is used for defining the preset plane of the world coordinate system;
and obtaining a second pose of a second image frame in the adjusted world coordinate system based on the adjusted target space point and image information in the first image frame and the second image frame, wherein the first image frame and the second image frame are obtained by sequentially shooting a target plane by a shooting device of the equipment.
2. The method of claim 1, wherein prior to said adjusting the position of the target spatial point onto the reference plane, the method further comprises:
acquiring a first pose of the first image frame in a world coordinate system;
the adjusting the position of the target spatial point onto the reference plane includes:
updating the orientation in the first pose to a reference orientation to obtain an updated first pose, wherein the reference orientation is the orientation of the photographing device relative to the reference plane;
determining a position of the target space point on the reference plane based on the updated first pose and the first feature point of the first image frame, as the adjusted position of the target space point.
3. The method of claim 1, wherein prior to said adjusting the position of the target spatial point onto the reference plane, the method further comprises:
detecting whether a preset plane of a world coordinate system is the reference plane or not based on a first pose of a first image frame in the world coordinate system;
in response to the preset plane not being the reference plane, performing the adjusting of the position of the target spatial point onto the reference plane.
4. The method of claim 3, wherein the detecting whether a preset plane of a world coordinate system is a reference plane based on a first pose of the first image frame in the world coordinate system comprises:
detecting whether a difference between an orientation in the first pose and a reference orientation is within a preset range, wherein the reference orientation is an orientation relative to the reference plane, and the orientation in the first pose is an orientation relative to the preset plane;
and in response to the difference not being within the preset range, determining that the preset plane is not the reference plane.
5. The method according to claim 2 or 4, wherein the method further comprises the following step to obtain the reference orientation:
and acquiring the reference orientation detected by a sensing device of the equipment at a reference moment, wherein the difference between the reference moment and the shooting moment of the first image frame does not exceed a preset time difference.
6. The method according to any one of claims 2 to 5, wherein the target plane is a plane in which a positioning assistance image is located, and the first pose is determined based on the positioning assistance image.
7. The method according to claim 6, wherein the method further comprises the following steps to obtain the first pose:
determining a first transformation parameter between the first image frame and the positioning assistance image based on a first matching point pair between the first image frame and the positioning assistance image, and obtaining the first pose by using the first transformation parameter; or,
determining a second transformation parameter between the first image frame and a third image frame based on a second matching point pair between the first image frame and the third image frame, and obtaining the first pose by using the second transformation parameter and a third transformation parameter between the third image frame and the positioning assistance image, wherein the third image frame is obtained by shooting by the shooting device before the first image frame.
8. The method according to any one of claims 1 to 7, wherein the obtaining a second pose of the second image frame in the adjusted world coordinate system based on the adjusted target space point and image information in the first image frame and the second image frame comprises:
determining the second pose based on a pixel value difference between a second feature point projected by the adjusted target space point on the second image frame and the first feature point.
9. The method of claim 8, wherein the determining the second pose based on a pixel value difference between a second feature point projected by the adjusted target space point on the second image frame and the first feature point comprises:
and acquiring at least one candidate pose, determining the second feature point corresponding to the candidate pose based on each candidate pose and the adjusted target space point, and selecting one candidate pose as the second pose based on the pixel value difference between the second feature point and the first feature point.
10. The method of claim 9, wherein the at least one candidate pose is determined based on an updated first pose, the updated first pose being obtained by updating, with a reference orientation, the first pose of the first image frame in the world coordinate system prior to adjustment, and the reference orientation being an orientation of the shooting device relative to the reference plane;
and/or selecting the candidate pose as the second pose based on the pixel value difference between the second feature point and the first feature point comprises:
and selecting the candidate pose corresponding to the second feature point with the pixel value difference meeting the preset requirement as the second pose.
11. The method according to any one of claims 1 to 10, wherein the preset plane is a horizontal plane in the world coordinate system and the reference plane is a reference horizontal plane.
12. A visual positioning device, comprising:
an adjusting module, configured to adjust a position of a target space point onto a reference plane so as to adjust a preset plane of a world coordinate system to the reference plane, wherein the target space point corresponds to a first feature point in a first image frame, and the target space point is used to define the preset plane of the world coordinate system;
and a pose determining module, configured to obtain a second pose of a second image frame in the adjusted world coordinate system based on the adjusted target space point and image information in the first image frame and the second image frame, wherein the first image frame and the second image frame are obtained by sequentially shooting a target plane by a shooting device of equipment.
13. An electronic device comprising a memory and a processor coupled to each other, the processor being configured to execute program instructions stored in the memory to implement the visual positioning method of any one of claims 1 to 11.
14. A computer readable storage medium having stored thereon program instructions which, when executed by a processor, implement the visual positioning method of any of claims 1 to 11.
Background
Computer vision technologies such as Augmented Reality (AR) and Virtual Reality (VR) are current research hotspots. With a camera as the input device and image algorithms for processing, the surrounding environment can be digitized, and an interactive experience with the real environment can be provided. Visual positioning is an important application of AR and VR technology: the pose of a device can be obtained from images shot by the device.
In existing visual positioning technology, the pose in a world coordinate system constructed from a target plane, such as a horizontal plane, can be obtained using images shot of that plane. However, owing to the actual placement of the target plane, or to factors such as calculation errors in the pose of the device, the world coordinate system may be inconsistent with the real situation, which often results in inaccurate poses.
Therefore, improving the accuracy of the pose of the device is very important for the further application of visual positioning technology.
Disclosure of Invention
The application provides a visual positioning method, a related device, equipment and a storage medium.
A first aspect of the present application provides a visual positioning method, including: adjusting the position of a target space point onto a reference plane so as to adjust a preset plane of a world coordinate system to the reference plane, wherein the target space point corresponds to a first feature point in a first image frame, and the target space point is used for defining the preset plane of the world coordinate system; and obtaining a second pose of a second image frame in the adjusted world coordinate system based on the adjusted target space point and the image information in the first image frame and the second image frame, wherein the first image frame and the second image frame are obtained by shooting the target plane in sequence by a shooting device of the equipment.
Therefore, by adjusting the position of the target space point onto the reference plane, the preset plane of the world coordinate system defined based on the adjusted target space point coincides with the reference plane, and the adjusted world coordinate system conforms to the actual spatial conditions. The world coordinate system is thereby corrected, an accurate second pose is obtained, and accurate positioning of the equipment is achieved.
Wherein, before performing the adjusting of the position of the target space point onto the reference plane, the visual positioning method further comprises: acquiring a first pose of the first image frame in the world coordinate system; the adjusting the position of the target space point onto the reference plane includes: updating the orientation in the first pose to a reference orientation to obtain an updated first pose, wherein the reference orientation is the orientation of the shooting device relative to the reference plane; and determining the position of the target space point on the reference plane based on the updated first pose and the first feature point of the first image frame, as the adjusted position of the target space point.
Therefore, the position of the target space point can be adjusted onto the reference plane by updating the orientation in the first pose to the reference orientation, so that the device can be accurately positioned by using the adjusted target space point and subsequent image frames.
Wherein, before performing the adjusting of the position of the target space point onto the reference plane, the visual positioning method further comprises: detecting whether the preset plane of the world coordinate system is the reference plane based on the first pose of the first image frame in the world coordinate system; and in response to the preset plane not being the reference plane, adjusting the position of the target space point onto the reference plane.
Therefore, by detecting whether a preset plane of the world coordinate system is a reference plane by using the first pose of the first image frame in the world coordinate system, it can be determined whether the established world coordinate system needs to be corrected.
The detecting whether the preset plane of the world coordinate system is the reference plane based on the first pose of the first image frame in the world coordinate system includes: detecting whether a difference between the orientation in the first pose and a reference orientation is within a preset range, wherein the reference orientation is an orientation with respect to the reference plane, and the orientation in the first pose is an orientation with respect to the preset plane; and determining that the preset plane is not the reference plane in response to the difference not being within the preset range.
Therefore, by judging the difference between the reference orientation and the orientation in the first pose, whether the preset plane is the reference plane can be determined, and thus whether the current world coordinate system needs to be corrected.
Wherein, the above method further comprises the following step to obtain the reference orientation: acquiring the reference orientation detected by a sensing device of the equipment at a reference moment, wherein the difference between the reference moment and the shooting moment of the first image frame does not exceed a preset time difference.
Therefore, by acquiring the reference orientation with the sensing device, the reference orientation can be obtained quickly, which speeds up the visual positioning method.
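As an illustrative sketch only, selecting the sensor reading whose timestamp is closest to the shooting moment of the first image frame, subject to the preset time difference, may look as follows; the function name, the `(timestamp, orientation)` tuple layout, and the `None` fallback are assumptions, not part of the claimed method:

```python
def pick_reference_orientation(readings, frame_time, max_diff):
    """Pick, from (timestamp, orientation) sensor readings, the reading
    closest in time to the frame's shooting moment, provided the time gap
    does not exceed the preset time difference; otherwise return None."""
    t, orientation = min(readings, key=lambda r: abs(r[0] - frame_time))
    return orientation if abs(t - frame_time) <= max_diff else None
```

In practice the orientation payload would be a rotation (e.g. a quaternion from the IMU); strings are used below only to keep the illustration short.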
The target plane is a plane where the positioning assistance image is located, and the first pose is determined based on the positioning assistance image.
Therefore, by using the positioning assistance image to perform registration with the first image frame, the first pose of the first image frame can be obtained.
Wherein, the method further comprises the following steps to obtain the first pose: determining a first transformation parameter between the first image frame and the positioning assistance image based on a first matching point pair between the first image frame and the positioning assistance image, and obtaining the first pose by using the first transformation parameter; or determining a second transformation parameter between the first image frame and a third image frame based on a second matching point pair between the first image frame and the third image frame, and obtaining the first pose by using the second transformation parameter and a third transformation parameter between the third image frame and the positioning assistance image, wherein the third image frame is obtained by shooting by the shooting device before the first image frame.
Therefore, obtaining the first pose can be achieved by obtaining a first transformation parameter between the first image frame and the positioning assistance image, or by obtaining a second transformation parameter and a third transformation parameter.
The adjusted target space point is obtained based on a first feature point of a first image frame; obtaining a second pose of the second image frame in the world coordinate system based on the adjusted target space point and the image information in the first image frame and the second image frame, wherein the second pose comprises: and determining a second pose based on the pixel value difference between a second feature point projected by the adjusted target space point on the second image frame and the first feature point.
Therefore, the second pose of the second image frame is obtained by calculating the difference of the pixel values between the second characteristic point and the first characteristic point.
Wherein the determining a second pose based on the pixel value difference between the first feature point and the second feature point projected by the adjusted target space point on the second image frame includes: and acquiring at least one candidate pose, determining a second feature point corresponding to the candidate pose based on each candidate pose and the adjusted target space point, and selecting one candidate pose as the second pose based on the pixel value difference between the second feature point and the first feature point.
Therefore, by acquiring at least one candidate pose, determining the pixel value difference corresponding to each candidate pose, and comparing these differences, the second pose of the second image frame is selected from the candidate poses, so that a more accurate second pose of the second image frame can be obtained.
The at least one candidate pose is determined based on an updated first pose, the updated first pose being obtained by updating the first pose of the first image frame in the world coordinate system before adjustment with a reference orientation, and the reference orientation being an orientation of the shooting device relative to the reference plane; and/or, selecting the candidate pose as the second pose based on the pixel value difference between the second feature point and the first feature point includes: selecting the candidate pose corresponding to the second feature point whose pixel value difference meets a preset requirement as the second pose.
Therefore, a relatively accurate second pose can be obtained by screening candidate poses meeting preset requirements.
The preset plane is a horizontal plane in the world coordinate system, and the reference plane is a reference horizontal plane.
Therefore, by determining the horizontal plane of the world coordinate system using the reference horizontal plane, the obtained second pose can be made more accurate.
A second aspect of the present application provides a visual positioning apparatus comprising: an adjusting module and a pose determining module; the adjusting module is used for adjusting the position of a target space point to a reference plane so as to adjust a preset plane of a world coordinate system to the reference plane, wherein the target space point corresponds to a first feature point in a first image frame, and the target space point is used for defining the preset plane of the world coordinate system; the pose determining module is used for obtaining a second pose of the second image frame in the adjusted world coordinate system based on the adjusted target space point and the image information in the first image frame and the second image frame, wherein the first image frame and the second image frame are obtained by shooting the target plane by a shooting device of the equipment in sequence.
A third aspect of the present application provides an electronic device, which includes a memory and a processor coupled to each other, wherein the processor is configured to execute program instructions stored in the memory to implement the visual positioning method in the first aspect.
A fourth aspect of the present application provides a computer-readable storage medium having stored thereon program instructions, which when executed by a processor, implement the visual positioning method of the first aspect.
According to the above scheme, the position of the target space point is adjusted onto the reference plane so that the preset plane of the world coordinate system defined by the adjusted target space point coincides with the reference plane, and the adjusted world coordinate system conforms to the actual spatial conditions; the world coordinate system is thereby corrected, an accurate second pose is obtained, and accurate positioning of the equipment is achieved.
Drawings
FIG. 1 is a schematic flow chart of a first embodiment of a visual positioning method of the present application;
FIG. 2 is a schematic diagram of the present application in which a misalignment between a reference plane and a preset plane of a world coordinate system is detected;
FIG. 3 is a schematic flow chart of a second embodiment of the visual positioning method of the present application;
FIG. 4 is a flowchart illustrating a third embodiment of the visual positioning method of the present application;
FIG. 5 is a schematic diagram of a frame of an embodiment of the visual positioning apparatus of the present application;
FIG. 6 is a block diagram of an embodiment of an electronic device of the present application;
FIG. 7 is a block diagram of an embodiment of a computer-readable storage medium of the present application.
Detailed Description
The following describes in detail the embodiments of the present application with reference to the drawings attached hereto.
In the following description, for purposes of explanation and not limitation, specific details such as particular system structures, interfaces, and techniques are set forth in order to provide a thorough understanding of the present application.
The terms "system" and "network" are often used interchangeably herein. The term "and/or" herein merely describes an association between associated objects, meaning that three relationships may exist; for example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" herein generally indicates that the former and latter related objects are in an "or" relationship. Further, the term "plurality" herein means two or more than two.
The image frames referred to in this application may be images captured by a camera to be positioned. For example, in an application scenario such as AR, VR, etc., the image frame may be an image captured by a capturing device in an electronic device such as a mobile phone, a tablet computer, smart glasses, etc., and is not limited herein. Other scenarios may be analogized, and are not exemplified here.
Referring to fig. 1, fig. 1 is a schematic flow chart of a visual positioning method according to a first embodiment of the present application. Specifically, the method may include the steps of:
step S11: and adjusting the position of the target space point to a reference plane so as to adjust a preset plane of the world coordinate system to be the reference plane.
In this application, the photographing device may photograph the target plane according to a time sequence to obtain at least one image frame; for example, the photographing device photographs the target plane sequentially to obtain a first image frame and a second image frame. It is understood that the target plane may be regarded as being photographed as long as the target plane appears in the first image frame and the second image frame. In some embodiments, a region corresponding to the target plane may be determined in an image frame captured by the photographing device, so that the spatial points corresponding to the feature points of that region are points on objects existing on the target plane; or all the content of the image frame may be directly determined as content on the target plane, i.e. the whole image frame is the region corresponding to the target plane, so that the spatial points corresponding to the feature points of the image frame are all points on objects existing on the target plane.
The target space point corresponding to the first feature point in the first image frame is defined as a point on a preset plane of the world coordinate system. The position of the target space point can be determined by using the pose of the first image frame and the first feature point on the first image frame. Since the target space point is defined as a point on the preset plane of the world coordinate system, the preset plane is defined by the target space point; that is, once the position of the target space point is determined, the preset plane of the world coordinate system can be determined, and the world coordinate system can be constructed based on the determined preset plane. The preset plane may be a plane formed by two coordinate axes of the world coordinate system, such as the XOY plane (which may also be referred to as the horizontal plane of the world coordinate system) or the YOZ plane (which may also be referred to as the vertical plane of the world coordinate system).
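To illustrate how space points can define a plane, the following sketch fits a least-squares plane to several space points via SVD. The use of multiple points, the function name, and the n · x = d parameterization are illustrative assumptions; a preset plane may equally be fixed by a single target space point together with a known normal:

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane {x : n . x = d} through Nx3 space points.

    Returns a unit normal n and scalar offset d.
    """
    centroid = points.mean(axis=0)
    # The right singular vector with the smallest singular value is the
    # direction of least variance of the centred points, i.e. the normal.
    _, _, vt = np.linalg.svd(points - centroid)
    n = vt[-1]
    return n, float(n @ centroid)
```

For points lying exactly on a plane the residual variance along the normal is zero, so the recovered plane is exact up to the sign of the normal.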
In this application, the target plane is the plane photographed by the shooting device to be positioned, and is used for positioning the device. In some embodiments, the first feature point may be regarded as a feature point in the region of the first image frame corresponding to the target plane, so the target space point corresponding to the first feature point may be regarded as a point on the surface of an object existing on the target plane, and the preset plane of the original world coordinate system constructed from the target space point may be regarded as coinciding with the target plane; the original world coordinate system is thus constructed based on the target plane. A particular plane of real space may therefore be selected as the target plane, e.g., a real plane such as a real horizontal plane or a real vertical plane. In practical applications, however, the target plane may have a placement error, so that it is not exactly the real plane but only close to it, in which case the constructed preset plane of the original world coordinate system is not the real plane. Alternatively, the target plane may be the real plane, but calculation errors of the positioning algorithm that uses the image frames, noise in the first shot image frame, and similar factors cause the constructed preset plane of the original world coordinate system not to be the real plane. Either case may result in inaccurate second poses for subsequent image frames. Based on this, in order to ensure the pose accuracy of subsequent image frames, the world coordinate system can be corrected by using the reference plane.
The reference plane may be regarded as a specific plane in real space, used to define a preset plane of the world coordinate system that matches the real space, so that the world coordinate system constructed during positioning is adjusted when it deviates from the real world coordinate system. For example, the reference plane may be a real horizontal plane (i.e., a reference horizontal plane) used to correct the horizontal plane of the world coordinate system (also referred to as the XOY plane) to the real horizontal plane; or a real vertical plane used to correct a vertical plane of the world coordinate system (which may also be referred to as the YOZ plane) to the real vertical plane, and so on.
In the present application, the target space point corresponding to the first feature point of the first image frame is used to define a preset plane of the world coordinate system, so that when it is detected that the preset plane is not the reference plane, the position of the target space point can be adjusted to the reference plane, so that the preset plane defined by the adjusted target space point coincides with the reference plane, that is, the preset plane of the world coordinate system is adjusted to the reference plane. In this case, the world coordinate system has been adjusted to meet the real space conditions, for example, the preset plane of the world coordinate system is adjusted to coincide with the real horizontal plane, so that the subsequent image frames are positioned in the adjusted world coordinate system, and the positioning accuracy of the subsequent image frames can be improved.
In this embodiment, the target space point before adjustment and the target space point after adjustment are projected on the same first feature point in the first image frame. The target space point before adjustment can be regarded as the target space point determined by the first pose and the first feature point before adjustment.
The following exemplarily lists two specific implementations of adjusting the position of the target space point to the reference plane, but it should be understood that the present application is not limited to these two.
In one implementation scenario, the first pose of the first image frame may be adjusted using a reference orientation of the device relative to the reference plane at the shooting time corresponding to the first image frame, resulting in an updated first pose of the first image frame, which may be understood as the pose of the first image frame relative to the reference plane. The reference orientation is the orientation of the photographing device with respect to the reference plane. On this basis, the position of the target space point can be determined using the updated first pose and the first feature point, so as to obtain the adjusted target space point.
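A minimal sketch of this implementation, assuming a pinhole camera with intrinsics K, a world-to-camera pose (R, t) whose orientation has already been replaced by the reference orientation, and a reference plane z = 0; all names and the z = 0 convention are illustrative assumptions:

```python
import numpy as np

def backproject_to_plane(u, v, K, R, t):
    """Back-project pixel (u, v) onto the reference plane z = 0.

    K: 3x3 camera intrinsics; (R, t): world-to-camera pose, i.e.
    x_cam = R @ X_world + t.
    """
    ray_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])  # viewing ray, camera frame
    d = R.T @ ray_cam                  # ray direction in the world frame
    C = -R.T @ t                       # camera centre in the world frame
    s = -C[2] / d[2]                   # scale at which the z-component vanishes
    return C + s * d                   # adjusted target space point, on the plane
```

Because the ray passes through the first feature point's pixel, the adjusted target space point still projects onto the same first feature point in the first image frame, as required above.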
In one implementation scenario, the first pose and the first feature point of the first image frame may be used to determine a target space point, and then the position of the target space point may be adjusted to the reference plane, for example, a straight line passing through the target space point and the position of the shooting device is determined, and the position of the intersection point of the straight line and the reference plane is used as the adjusted position of the target space point.
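The ray-intersection variant can be sketched as follows, assuming the reference plane is given as {x : n · x = d} with unit normal n; the names are illustrative assumptions:

```python
import numpy as np

def move_point_to_plane(X, C, n, d):
    """Slide a space point X along the ray from the camera centre C through X
    until it lies on the reference plane {x : n . x = d}.

    X, C: 3-vectors; n: plane unit normal; d: plane offset.
    """
    direction = X - C                    # ray from camera centre through the point
    s = (d - n @ C) / (n @ direction)    # ray/plane intersection parameter
    return C + s * direction             # adjusted point, guaranteed on the plane
```

The construction degenerates only when the ray is parallel to the reference plane (n · direction = 0), in which case no intersection exists.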
Since the position of the target spatial point has been adjusted to the reference plane, it means that the preset plane of the world coordinate system is also adjusted to the reference plane, i.e. the preset plane of the world coordinate system at this time is coincident with the reference plane.
If not specifically stated, after the step of adjusting the position of the target space point to the reference plane, the preset plane of the world coordinate system mentioned in the subsequent step is coincident with the reference plane, that is, the world coordinate system mentioned in the subsequent step is the adjusted world coordinate system.
Step S12: and obtaining a second pose of the second image frame in the adjusted world coordinate system based on the adjusted target space point and the image information in the first image frame and the second image frame.
After the adjustment in step S11, the preset plane of the world coordinate system is the reference plane, and the target space points photographed by the photographing device are also on the reference plane, which is equivalent to the photographing device photographing the reference plane. Therefore, the second pose of the second image frame in the adjusted world coordinate system can be determined by acquiring, in the first image frame and the second image frame, the image information of the feature points corresponding to the adjusted target space points. The second pose of the second image frame in the world coordinate system is the second pose of the device when the second image frame is shot.
In a specific implementation scenario, a photometric error equation may be used to obtain a second pose of the second image frame in the world coordinate system based on the adjusted target space point and the image information in the first image frame and the second image frame.
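A highly simplified sketch of such a photometric selection follows, assuming grayscale images as 2-D arrays, candidate poses as (R, t) pairs, and nearest-pixel sampling instead of the sub-pixel interpolation and iterative optimization a real system would use; all names are illustrative:

```python
import numpy as np

def project(points, K, R, t):
    """Project Nx3 world points with world-to-camera pose (R, t) and intrinsics K."""
    cam = points @ R.T + t
    uv = cam @ K.T
    return uv[:, :2] / uv[:, 2:3]

def photometric_error(img1, img2, pts1, pts2):
    """Sum of squared intensity differences at (rounded-down) pixel locations."""
    a = img1[pts1[:, 1].astype(int), pts1[:, 0].astype(int)]
    b = img2[pts2[:, 1].astype(int), pts2[:, 0].astype(int)]
    return float(np.sum((a - b) ** 2))

def select_second_pose(candidates, space_pts, pts1, img1, img2, K):
    """Return the candidate pose whose projections on the second image frame
    (the second feature points) best match the first feature points."""
    errors = [photometric_error(img1, img2, pts1,
                                project(space_pts, K, R, t))
              for R, t in candidates]
    return candidates[int(np.argmin(errors))]
```

When the second frame actually shows the adjusted space points, the candidate closest to the true second pose yields the smallest photometric error and is selected.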
Therefore, in the case where the target space point is used to define the preset plane of the world coordinate system, the position of the target space point is adjusted onto the reference plane, so that the preset plane of the world coordinate system defined based on the adjusted target space point coincides with the reference plane, and the adjusted world coordinate system conforms to the actual spatial conditions. The world coordinate system is thereby corrected, an accurate second pose is obtained, and accurate positioning of the equipment is achieved.
In one embodiment, before performing the above step of "adjusting the position of the target spatial point onto the reference plane", the following step S13 may also be performed.
Step S13: and detecting whether a preset plane of the world coordinate system is a reference plane or not based on a first pose of the first image frame in the world coordinate system.
In this embodiment, whether the world coordinate system needs to be adjusted may be determined by detecting whether a preset plane of world coordinates is a reference plane based on the first pose of the first image frame in the world coordinate system.
Referring to fig. 2, fig. 2 is a schematic diagram illustrating that a preset plane of a world coordinate system is detected to be misaligned with a reference plane in the visual positioning method of the present application. In fig. 2, C denotes a first position of the photographing apparatus, 201 denotes a preset plane of a world coordinate system, and 202 denotes a reference plane. It can be seen that, at this time, the preset plane of the world coordinate system does not coincide with the reference plane, i.e. the preset plane of the world coordinate system is not the reference plane.
When it is detected that the predetermined plane of the world coordinate system is not the reference plane, the apparatus executing the method described herein may execute the step S11, namely, adjusting the position of the target spatial point to the reference plane, in response to the predetermined plane not being the reference plane. If it is detected that the predetermined plane of the world coordinate system is the reference plane, the apparatus performing the method described herein may stop performing the subsequent steps in response to the predetermined plane being the reference plane.
Therefore, by detecting whether the preset plane of the world coordinate system is the reference plane, it can be determined whether the correction of the world coordinate system is needed.
In one embodiment, before performing the step of adjusting the position of the target spatial point onto the reference plane, a step of acquiring a first pose of the first image frame in a world coordinate system may also be performed to obtain the first pose.
In some implementation scenarios, the image information of the first image frame may be used to determine a first pose of the first image frame in the constructed world coordinate system, and specifically, the first pose may be determined using an image registration method or an image tracking method.
For example, the target plane is a plane where the positioning assistance image is located, and the photographing device photographs the positioning assistance image to obtain a first image frame and a second image frame. The positioning assistance image may be image registered with the first image frame to obtain a transformation relationship between the positioning assistance image and the first image frame, and the transformation relationship is further used to obtain a first pose, i.e. the first pose is determined based on the positioning assistance image. It can be understood that, since the positioning assistance image is located on the target plane, the transformation relationship can be regarded as a transformation relationship between the target plane and the first image frame (i.e. the camera or the device to be positioned when the first image frame is shot), so that the first pose can be obtained according to the transformation relationship. Wherein the positioning assistance image may be an arbitrary two-dimensional pattern. Thus, by registering with the first image frame using the positioning assistance image, the first pose of the first image frame can be obtained.
For another example, after the pose of the image frame before the first image frame in the world coordinate system is obtained (which can be obtained by the image registration method described above), the first pose is determined by using the image information between the previous image frame and the first image frame.
It should be noted that the image registration or image tracking described above can be implemented by using an existing general registration or tracking method, which is not described in detail herein.
In some embodiments, the following step S21 or step S22 may also be performed to obtain the first pose.
Step S21: and determining a first transformation parameter between the first image frame and the positioning auxiliary image based on a first matching point pair between the first image frame and the positioning auxiliary image, and obtaining the first pose by using the first transformation parameter.
In the embodiment of the present disclosure, by performing feature extraction on the first image frame and the positioning auxiliary image, a first feature point corresponding to the first image frame and a third feature point corresponding to the positioning auxiliary image may be obtained. In the present application, the feature points extracted from the image frames may include feature points obtained by feature extraction of a series of image frames in an image pyramid established based on the image frames. The number of feature points is not particularly limited. The feature extraction algorithm is, for example, the FAST (Features from Accelerated Segment Test) algorithm, the SIFT (Scale-Invariant Feature Transform) algorithm, the ORB (Oriented FAST and Rotated BRIEF) algorithm, and the like. In one implementation scenario, the feature extraction algorithm is the ORB algorithm. After the feature points are obtained, a feature representation corresponding to each feature point is also obtained, the feature representation being, for example, a feature vector. Therefore, each feature point has a feature representation corresponding to it. In the present embodiment, the feature points extracted from the image frame can be considered to lie in the same plane as the positioning assistance image.
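Since each feature point carries a feature representation, the matching described below can be sketched as a nearest-neighbour search over those representations. The following minimal numpy sketch is illustrative only; the function name `match_features` and the distance threshold are assumptions of this example, not part of the disclosure:

```python
import numpy as np

def match_features(desc_first, desc_third, max_dist=0.5):
    """Pair each first feature point with the closest third feature point,
    keeping only pairs whose representation distance is below max_dist."""
    matches = []
    for i, d in enumerate(desc_first):
        dists = np.linalg.norm(desc_third - d, axis=1)  # distance to every candidate
        j = int(np.argmin(dists))
        if dists[j] <= max_dist:
            matches.append((i, j))
    return matches
```

A closer distance between feature representations is taken as a better match, mirroring the matching-degree criterion in the text.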
In one implementation scenario, the matching degree of each first feature point and each third feature point may be calculated, and then point pairs with a high matching degree may be selected as a series of first matching point pairs. In a first matching point pair, the first feature point serves as the first matching point, and the third feature point serves as the second matching point. The matching degree between the first feature point and the third feature point may be calculated as the distance between the feature representations of the two feature points, a closer distance indicating a better match. Then, a first transformation parameter between the first image frame and the positioning assistance image may be determined using an image registration algorithm based on the obtained series of first matching point pairs, and the first pose may be obtained using the first transformation parameter. The image registration algorithm is, for example, a grayscale- and template-based algorithm or a feature-based matching method. For example, with respect to the feature-based matching method, a certain number of first matching point pairs between the first image frame and the positioning auxiliary image may be obtained, and then a random sample consensus (RANSAC) algorithm is used to calculate a first transformation parameter (for example, a homography matrix) between the image to be registered and the target image, so as to achieve registration of the images, and further obtain the first pose according to the first transformation parameter. The first pose may be derived based on the first transformation parameter, for example by using a PnP (Perspective-n-Point) algorithm.
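The step of deriving a pose from a homography can be illustrated with the standard plane-homography decomposition H = K[r1 r2 t], under the assumptions that the positioning assistance image lies on the world plane Z = 0 and that the camera intrinsic matrix K is known. This is a hedged sketch of one common technique, not necessarily the exact PnP procedure the disclosure has in mind:

```python
import numpy as np

def pose_from_plane_homography(H, K):
    """Recover (R, t) from a homography between a plane at Z = 0 and the image.
    Assumes H maps homogeneous plane coordinates (X, Y, 1) to pixel coordinates."""
    A = np.linalg.inv(K) @ H
    s = 1.0 / np.linalg.norm(A[:, 0])   # normalize so column 1 is a unit rotation axis
    r1 = A[:, 0] * s
    r2 = A[:, 1] * s
    t = A[:, 2] * s
    r3 = np.cross(r1, r2)               # third axis completes the rotation
    R = np.column_stack([r1, r2, r3])
    U, _, Vt = np.linalg.svd(R)         # re-orthonormalize against noise
    return U @ Vt, t
```

In practice the recovered rotation is only approximately orthonormal when H is estimated from noisy matches, which is why the SVD projection step is included.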
In another implementation scenario, after obtaining at least one set of first matching point pairs, direction information for each set of first matching point pairs may be calculated. The direction information of the first matching point pair may be obtained from the directions of the first matching point and the second matching point in the first matching point pair.
In one implementation scenario, the direction information of the first matching point pair may be a difference value of directions of the first matching point and the second matching point. For example, when the feature points are extracted by the ORB algorithm, the direction of the first matching point is a corner point direction angle, and the direction of the second matching point is also a corner point direction angle, and the direction information of the first matching point pair may be a difference between the corner point direction angle of the first matching point and the corner point direction angle of the second matching point. Thus, by calculating the direction information of a set of first matching point pairs, the rotation angle of the first image frame relative to the positioning assistance image can be found. After obtaining the direction information of a set of first matching point pairs, the image registration may be performed subsequently by using the rotation angle of the first image frame relative to the positioning assistance image, which is represented by the direction information of the set of first matching point pairs, to finally obtain a first transformation parameter between the first image frame and the positioning assistance image.
In one implementation scenario, a first image region centered on a first matching point may be extracted in a first image frame, and a second image region centered on a second matching point may be extracted in a positioning assistance image. Then, a first deflection angle of the first image area and a second deflection angle of the second image area are determined. Finally, a first transformation parameter is obtained based on the first deflection angle and the second deflection angle, and specifically, the first transformation parameter may be obtained based on the direction information of the first matching point pair and the pixel coordinate information of the first matching point and the second matching point in the first matching point pair.
In one implementation scenario, the first deflection angle is a directional angle between a line connecting the centroid of the first image region and the center of the first image region and a predetermined direction (e.g., an X-axis of a world coordinate system). The second deflection angle is a directed included angle between a connecting line of the centroid of the second image area and the center of the second image area and the preset direction.
In another implementation scenario, the first deflection angle θ can be directly obtained by the following equation:
θ=arctan(∑yI(x,y),∑xI(x,y)) (1)
in the above formula (1), (x, y) represents the offset of a pixel point in the first image region with respect to the center of the first image region, I(x, y) represents the pixel value of the pixel point, and Σ represents summation, the summation ranging over the pixel points in the first image region. The second deflection angle can be calculated in the same way.
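Equation (1) can be sketched for a small image patch as follows. The centering convention (offsets measured from the geometric patch center) is an assumption of this illustration:

```python
import numpy as np

def deflection_angle(patch):
    """Deflection angle per equation (1): arctan of the intensity moments
    sum(y * I(x, y)) and sum(x * I(x, y)), offsets taken from the patch center."""
    h, w = patch.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w]
    m01 = float(np.sum((ys - cy) * patch))  # sum of y * I(x, y)
    m10 = float(np.sum((xs - cx) * patch))  # sum of x * I(x, y)
    return np.arctan2(m01, m10)
```

A patch whose intensity mass lies to the right of its center yields an angle near 0, while mass below the center yields an angle near π/2, matching the intensity-centroid intuition.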
In one implementation scenario, the orientation information of the first matching point pair and the coordinate information, e.g. pixel coordinate information, of the first matching point and the second matching point of the first matching point pair may be utilized to arrive at a first transformation parameter between the first image frame and the positioning assistance image. This enables the calculation of the first transformation parameters using a set of pairs of first matching points.
In a specific embodiment, the transformation parameters between the first image frame and the positioning assistance image may be obtained by the following steps a and b.
Step a: an angular difference between the first deflection angle and the second deflection angle is obtained.
The angular difference is, for example, the difference between the first deflection angle and the second deflection angle.
In one implementation scenario, equation (2) for calculating the angular difference is as follows:
θ = θ_F − θ_T (2)
wherein θ is the angle difference, θ_F is the first deflection angle (F denotes the first image frame), and θ_T is the second deflection angle (T denotes the positioning assistance image).
Step b: and obtaining a first candidate transformation parameter based on the angle difference and the scale corresponding to the first matching point pair.
The first candidate transformation parameter is, for example, a homography matrix of the correspondence between the first image frame and the positioning assistance image. The homography matrix is calculated as follows:
H = H_l · H_s · H_R · H_r (3)
wherein H is the homography matrix corresponding between the positioning auxiliary image and the first image frame, namely the first candidate transformation parameter; H_r represents the amount of translation of the first image frame relative to the positioning assistance image; H_s represents the scale corresponding to the first matching point pair, namely the scale information when the positioning auxiliary image is zoomed; H_R represents the amount of rotation of the first image frame relative to the positioning assistance image; and H_l represents the amount by which the translation is reset after the transformation.
In order to apply the above equation (3) to the pixel coordinates of a matching point pair, it may be converted to obtain equation (4):
(x_F, y_F, 1)^T = H · (x_T, y_T, 1)^T (4)
wherein (x_T, y_T) are the pixel coordinates of the matching point on the positioning assistance image; (x_F, y_F) are the pixel coordinates of the matching point on the first image frame; s (appearing in H_s) is the scale corresponding to the first matching point pair, namely the scale corresponding to the point (x_T, y_T); and θ (appearing in H_R) is the angle difference.
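Under the interpretation that H_r translates the matching point on the positioning assistance image to the origin, H_R rotates by the angle difference θ, H_s applies the scale s, and H_l translates onto the matching point in the first image frame (an assumption consistent with equations (3) and (4), though the disclosure does not spell the individual factors out), the composition can be sketched as:

```python
import numpy as np

def similarity_homography(s, theta, p_src, p_dst):
    """Compose H = Hl @ Hs @ HR @ Hr from a single matching point pair:
    translate p_src to the origin, rotate by theta, scale by s,
    then translate onto p_dst."""
    Hr = np.array([[1, 0, -p_src[0]], [0, 1, -p_src[1]], [0, 0, 1]], float)
    HR = np.array([[np.cos(theta), -np.sin(theta), 0],
                   [np.sin(theta),  np.cos(theta), 0],
                   [0, 0, 1]], float)
    Hs = np.diag([s, s, 1.0])
    Hl = np.array([[1, 0, p_dst[0]], [0, 1, p_dst[1]], [0, 0, 1]], float)
    return Hl @ Hs @ HR @ Hr
```

By construction, applying the resulting H to the source matching point reproduces the destination matching point, which is how a single matching point pair with direction and scale information suffices to fix the transformation.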
Step S22: and determining a second transformation parameter between the first image frame and the third image frame based on a second matching point pair between the first image frame and the third image frame, and obtaining the first pose by using the second transformation parameter and a third transformation parameter between the third image frame and the positioning auxiliary image.
In the embodiment of the present disclosure, the third image frame is captured by the capturing device before the first image frame. The third image frame may be taken of the object plane.
The method for obtaining the second matching point pair between the first image frame and the third image frame and obtaining the second transformation parameter may refer to the above method for obtaining the first matching point pair and the first transformation parameter, and is not described herein again. Similarly, the third transformation parameter between the third image frame and the positioning auxiliary image may also be obtained by referring to the above method for obtaining the first transformation parameter, which is not described herein again.
After obtaining the third transformation parameter and the second transformation parameter, a first transformation parameter may be obtained based on the two transformation parameters, and the first pose may be obtained based on the first transformation parameter.
In one implementation scenario, the first transformation parameter may be obtained by the following equation (5).
H1=H2·H3 (5)
Wherein H1Is a first transformation parameter, H2As a second transformation parameter, H3Is the third transformation parameter.
Thus, the first pose can be obtained by obtaining a first transformation parameter between the first image frame and the positioning assistance image, or by obtaining a second transformation parameter and a third transformation parameter.
Referring to fig. 3, fig. 3 is a flowchart illustrating a visual positioning method according to a second embodiment of the present application. The embodiment of the present disclosure is a further extension of the above-mentioned "detecting whether the preset plane of the world coordinate system is the reference plane based on the first pose of the first image frame in the world coordinate system", and specifically may include the following steps S31 to S33.
Step S31: detecting whether a difference between the pose in the first pose and the reference pose is within a preset range.
In the embodiment of the present disclosure, the reference pose is the pose of the photographing device relative to the reference plane; in one implementation scenario, the reference pose may be understood as the rotation angle of the terminal relative to the reference plane. The pose in the first pose is the pose relative to the preset plane.
The reference pose may be obtained based on a measurement by the device. For example, the reference pose may be detected by a sensing device of the equipment at a reference time, wherein the difference between the reference time and the capturing time of the first image frame does not exceed a preset time difference. The preset time difference may be set according to the implementation and is not limited herein. The reference pose can be quickly acquired by using the sensing device, thereby increasing the running speed of the visual positioning method.
If the difference between the pose of the first pose and the reference pose is within a preset range, the preset plane of the world coordinate system can be considered to be coincident with the reference plane. If the difference between the pose of the first pose and the reference pose is not within the preset range, the preset plane of the world coordinate system is considered not to coincide with the reference plane.
In one implementation scenario, the preset range is 0, i.e. the preset plane of the world coordinate system and the reference plane are considered to coincide only if the pose in the first pose is the same as the reference pose. In another implementation scenario, the preset range may be between 0° and 5°. The specific value can be adjusted as needed and is not limited herein.
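The check in steps S31 to S33 can be sketched as a simple angular comparison. The wrap-around handling and degree units here are assumptions of this illustration:

```python
def plane_is_reference(first_pose_angle_deg, reference_angle_deg, preset_range_deg=5.0):
    """Return True when the attitude difference lies within the preset range,
    i.e. when the preset plane may be treated as the reference plane (step S33)."""
    # fold the difference into [-180, 180] before taking the magnitude
    diff = abs((first_pose_angle_deg - reference_angle_deg + 180.0) % 360.0 - 180.0)
    return diff <= preset_range_deg
```

When this returns False, the caller would proceed to adjust the position of the target spatial point onto the reference plane, as in step S32.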
Step S32: and in response to the difference not being within the preset range, determining that the preset plane is not the reference plane.
If the difference between the pose in the first pose and the reference pose is not within the preset range, then the preset plane of the world coordinate system may be considered not to coincide with the reference plane. At this time, the apparatus performing the method described herein may determine that the preset plane is not the reference plane in response to the difference not being within the preset range.
Step S33: and determining the preset plane as a reference plane in response to the difference being within the preset range.
If the difference between the pose in the first pose and the reference pose is within the preset range, the preset plane of the world coordinate system and the reference plane may be considered to coincide. At this time, the apparatus performing the method described herein may determine the preset plane as the reference plane in response to the difference being within the preset range.
Therefore, by judging the difference between the reference posture and the posture of the first posture, whether the preset plane is the reference plane can be determined, and whether the current world coordinate system needs to be corrected can be determined.
Referring to fig. 4, fig. 4 is a flowchart illustrating a visual positioning method according to a third embodiment of the present application. The embodiment of the present disclosure is a further extension of the above-mentioned "adjusting the position of the target spatial point onto the reference plane", and specifically may include the following steps S41 and S42.
Step S41: and updating the pose in the first pose to the reference pose to obtain an updated first pose.
In the disclosed embodiment, the reference pose is the pose of the device relative to the reference plane, and the pose in the first pose is the pose of the device relative to the preset plane.
After the preset plane is determined not to be the reference plane, the preset plane needs to be corrected. At this time, the pose in the first pose may be replaced with the reference pose, so that the pose in the first pose is based on the reference plane, resulting in an updated first pose. Replacing the pose in the first pose with the reference pose can be understood as rotating the preset plane so that it coincides with the reference plane. Since the preset plane of the world coordinate system is rotated, that is, the world coordinate system is rotated, the world coordinate system can be considered to be corrected.
Step S42: and determining the position of the target space point on the reference plane based on the updated first pose and the first feature point of the first image frame to serve as the position of the adjusted target space point.
After the updated first pose is obtained, the position of the target space point on the reference plane may be determined, based on the updated first pose and the first feature point of the first image frame, as the adjusted position of the target space point. Since the first image frame is obtained by shooting the target plane, the first feature point may be regarded as the projection of the target space point on the target plane onto the first image frame. Meanwhile, in step S41, the first pose has been updated, which in this case means that the target space point lies on the reference plane. Therefore, the position of the target space point on the reference plane can be determined based on the updated first pose and the first feature point of the first image frame, obtaining the adjusted target space point.
In an implementation scenario, the adjusted target space point may be obtained based on the updated image information of the first pose and the first feature point based on existing general computer vision knowledge, which is not described in detail herein.
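One hedged sketch of that computation, assuming a pinhole camera with intrinsic matrix K, the updated first pose given as (R, t) mapping world points to camera coordinates, and the reference plane taken as the world plane Z = 0:

```python
import numpy as np

def backproject_to_plane(u, v, K, R, t):
    """Intersect the viewing ray of pixel (u, v) with the world plane Z = 0.
    (R, t) map world points to camera coordinates: Xc = R @ Xw + t."""
    d_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])  # ray direction, camera frame
    d_w = R.T @ d_cam                                  # ray direction, world frame
    o_w = -R.T @ t                                     # camera center, world frame
    lam = -o_w[2] / d_w[2]                             # solve o_w.z + lam * d_w.z = 0
    return o_w + lam * d_w
```

Applied to the first feature point with the updated first pose, this yields the adjusted target space point lying on the reference plane.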
Therefore, the position of the target space point can be adjusted to the reference plane through the reference attitude, so that the adjusted target space point and the subsequent image frame can be utilized to accurately position the terminal.
In one embodiment, the above-mentioned step of obtaining a second pose of the second image frame in the adjusted world coordinate system based on the adjusted target space point and the image information in the first image frame and the second image frame specifically comprises: determining the second pose based on the pixel value difference between the first feature point and a second feature point projected by the adjusted target space point onto the second image frame.
In this embodiment, the adjusted target space point is obtained based on the first feature point; reference may be made to the foregoing description of the adjusted target space point, which is not repeated herein.
Since the second image frame is obtained by shooting the target plane, the adjusted target space point also has a projection point on the second image frame, and thus, a second feature point of the adjusted target space point projected on the second image frame can be obtained.
The method of determining the second feature point may be a general feature point tracking method, for example, an optical flow method. Or determining the relative pose change between the first image frame and the second image frame, and determining the second characteristic point based on the adjusted target space point.
The difference in pixel values between the second feature point and the first feature point may specifically be a photometric error, that is, the photometric error may be solved to optimize and reduce the photometric error as much as possible, so as to determine the second pose of the second image frame.
Therefore, by calculating the pixel value difference between the second feature point and the first feature point, the second pose of the second image frame can be calculated.
In a specific implementation scenario, the relative pose change between the first image frame and the second image frame may be obtained first, and the photometric error between the first image frame and the second image frame may be calculated, and the second pose of the second image frame may be obtained by optimizing and reducing the photometric error as much as possible.
In a specific implementation scenario, an initial second pose of the second image frame in the world coordinate system may be obtained first, then the initial second pose and the updated first pose of the first image frame are used to obtain a luminosity error between the first image frame and the second image frame, and then the luminosity error is optimized and reduced as much as possible to obtain a second pose of the second image frame in the world coordinate system.
In one implementation scenario, the determining a second pose based on a pixel value difference between a second feature point projected by the adjusted target space point on the second image frame and the first feature point includes: and acquiring at least one candidate pose, determining a second feature point corresponding to the candidate pose based on each candidate pose and the adjusted target space point, and selecting one candidate pose as the second pose based on the pixel value difference between the second feature point and the first feature point.
In one implementation scenario, the candidate pose is determined based on an updated first pose, the updated first pose being obtained by updating the first pose of the first image frame in the world coordinate system before adjustment with a reference pose, the reference pose being a pose of the camera with respect to a reference plane. In a specific embodiment, a series of candidate poses can be obtained by an iterative optimization method based on updating the first pose.
In one implementation scenario, where the pixel value difference is a photometric error, the second pose of the second image frame may be obtained by equation (6) below.
C = Σ_p | I(X_p) − I′( K( R · X_p + t ) ) | (6)
wherein C is the pixel value difference; I(X_p) is the pixel value corresponding to the first feature point; I′(·) is the pixel value corresponding to the second feature point, i.e. the projection of the target space point onto the second image frame under the candidate pose, the homogeneous coordinates K(R · X_p + t) being divided by their third component to obtain the pixel position; K is the internal parameter matrix of the shooting device of the terminal; R is the pose (which may also be referred to as the rotation amount or orientation) in the candidate pose; t is the translation amount; and Σ_p represents the sum, over each target space point X_p, of the pixel value differences between the corresponding first feature point and second feature point. The candidate pose corresponding to the minimum pixel value difference C is selected from the candidate poses by an iterative optimization method as the second pose of the second image frame.
In a specific implementation scenario, after the pixel value differences are obtained, the candidate pose corresponding to a second feature point whose pixel value difference meets a preset requirement may be selected as the second pose of the second image frame. The preset requirement can be set as needed and is not limited herein. That is, among the pixel value differences C calculated by equation (6), the candidate pose whose C meets the preset requirement is selected as the second pose of the second image frame. Therefore, by screening the candidate poses that meet the preset requirement, a relatively accurate second pose can be obtained.
Therefore, by acquiring at least one candidate pose, determining the pixel value difference corresponding to each candidate pose, and comparing the pixel value differences, the second pose of the second image frame is selected from the candidate poses, so that a more accurate second pose of the second image frame can be obtained.
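The selection among candidate poses by pixel value difference can be illustrated as follows, with a pure pixel shift standing in for the full pose-and-projection chain of equation (6) (a simplifying assumption of this sketch; the function names are illustrative):

```python
import numpy as np

def photometric_cost(I1, I2, pts1, pts2):
    """Sum of absolute pixel value differences between corresponding feature
    points, sampled with nearest-neighbour lookup (illustrating equation (6))."""
    c = 0.0
    for (x1, y1), (x2, y2) in zip(pts1, pts2):
        c += abs(float(I1[int(y1), int(x1)]) - float(I2[int(y2), int(x2)]))
    return c

def select_pose(I1, I2, pts1, candidate_shifts):
    """Pick the candidate (here a pixel shift standing in for a pose)
    whose projected points minimise the photometric cost."""
    return min(candidate_shifts,
               key=lambda d: photometric_cost(
                   I1, I2, pts1, [(x + d[0], y + d[1]) for x, y in pts1]))
```

In the full method the candidate shift would be replaced by a candidate pose (R, t), with the second feature points obtained by projecting the adjusted target space points through that pose.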
Referring to fig. 5, fig. 5 is a schematic diagram of a frame of an embodiment of a visual positioning apparatus according to the present application. The visual positioning apparatus 50 includes an adjustment module 51 and a pose determination module 52. The adjusting module 51 is configured to adjust the position of a target space point to a reference plane, so as to adjust a preset plane of a world coordinate system to the reference plane, where the target space point corresponds to a first feature point in the first image frame, and the target space point is used to define the preset plane of the world coordinate system; the pose determining module 52 is configured to obtain a second pose of the second image frame in the adjusted world coordinate system based on the adjusted target space point and the image information in the first image frame and the second image frame, where the first image frame and the second image frame are obtained by sequentially shooting the target plane by a shooting device of the device.
The visual positioning apparatus 50 further includes a pose acquiring module; before the adjusting module 51 adjusts the position of the target space point to the reference plane, the pose acquiring module is configured to acquire a first pose of the first image frame in the world coordinate system. The adjusting module 51 is configured to adjust the position of the target spatial point onto the reference plane, including: updating the pose in the first pose to the reference pose to obtain an updated first pose, wherein the reference pose is the pose of the shooting device relative to the reference plane, and the pose in the first pose is the pose of the equipment relative to the preset plane; and determining the position of the target space point on the reference plane based on the updated first pose and the first feature point of the first image frame, as the adjusted position of the target space point.
The visual positioning apparatus 50 further includes a detection module, before the adjustment module 51 adjusts the position of the target space point to the reference plane, configured to detect whether a preset plane of the world coordinate system is the reference plane based on a first pose of the first image frame in the world coordinate system; the adjusting module 51 may perform the adjustment of the position of the target spatial point onto the reference plane in response to the preset plane not being the reference plane.
The above-mentioned detection module detects whether the preset plane of the world coordinate system is the reference plane based on the first pose of the first image frame in the world coordinate system, including: detecting whether the difference between the pose in the first pose and the reference pose is within a preset range, wherein the reference pose is a pose relative to the reference plane, and the pose in the first pose is a pose relative to the preset plane; the detection module may determine that the preset plane is not the reference plane in response to the difference not being within the preset range, and determine that the preset plane is the reference plane in response to the difference being within the preset range.
The above-mentioned visual positioning apparatus 50 further includes a reference pose acquiring module configured to acquire the reference pose; specifically, the reference pose acquiring module is configured to acquire the reference pose detected by a sensing device of the equipment at a reference time, wherein the difference between the reference time and the shooting time of the first image frame does not exceed a preset time difference.
The target plane is a plane where the positioning auxiliary image is located, and the first pose is determined based on the positioning auxiliary image.
The visual positioning apparatus 50 further includes a first pose acquisition module configured to acquire the first pose, wherein the first pose acquisition module is configured to determine a first transformation parameter between the first image frame and the positioning auxiliary image based on a first matching point pair between the first image frame and the positioning auxiliary image, and obtain the first pose by using the first transformation parameter; or determine a second transformation parameter between the first image frame and a third image frame based on a second matching point pair between the first image frame and the third image frame, and obtain the first pose by using the second transformation parameter and a third transformation parameter between the third image frame and the positioning auxiliary image, wherein the third image frame is captured by the shooting device before the first image frame.
The adjusted target space point is obtained based on the first feature point of the first image frame. The pose determination module 52 is configured to obtain a second pose of the second image frame in the adjusted world coordinate system based on the adjusted target space point and the image information in the first image frame and the second image frame, and includes: and determining a second pose based on the pixel value difference between a second feature point projected by the adjusted target space point on the second image frame and the first feature point.
The pose determination module 52, configured to determine the second pose based on the pixel value difference between the second feature point projected by the adjusted target space point onto the second image frame and the first feature point, is specifically configured to: acquire at least one candidate pose; determine, for each candidate pose, a second feature point corresponding to the candidate pose based on the candidate pose and the adjusted target space point; and select one candidate pose as the second pose based on the pixel value difference between the second feature point and the first feature point.
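The candidate-pose selection described above can be sketched as follows. This is an illustrative sketch under assumed pinhole-camera conventions — the names `project` and `select_pose` and the sum-of-squared-differences cost are hypothetical choices, not mandated by the embodiment. Each candidate pose projects the adjusted target space points into the second image frame, and the candidate whose projected ("second") feature points best match the pixel values of the first feature points is selected:

```python
import numpy as np

def project(K, R, t, points_3d):
    """Pinhole projection of Nx3 world points under pose x_cam = R x_w + t."""
    cam = (R @ points_3d.T).T + t
    return (K @ cam.T).T[:, :2] / cam[:, 2:3]

def select_pose(candidates, points_3d, ref_values, image, K):
    """Pick the candidate pose (R, t) whose projected points best match
    the reference pixel values, by sum of squared pixel differences.
    image: single-channel (grayscale) array."""
    best, best_cost = None, np.inf
    h, w = image.shape
    for R, t in candidates:
        uv = np.rint(project(K, R, t, points_3d)).astype(int)
        # Discard candidates that project outside the image.
        if not ((0 <= uv[:, 0]).all() and (uv[:, 0] < w).all()
                and (0 <= uv[:, 1]).all() and (uv[:, 1] < h).all()):
            continue
        cost = np.sum((image[uv[:, 1], uv[:, 0]] - ref_values) ** 2)
        if cost < best_cost:
            best, best_cost = (R, t), cost
    return best, best_cost
```

A real implementation would typically interpolate pixel values at sub-pixel projections and compare patches rather than single pixels; the nearest-pixel lookup here is a simplification.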
The at least one candidate pose is determined based on an updated first pose, where the updated first pose is obtained by updating, with a reference pose, the first pose of the first image frame in the world coordinate system before adjustment, the reference pose being the pose of the shooting device relative to the reference plane. The pose determination module 52, configured to select a candidate pose as the second pose based on the pixel value difference between the second feature point and the first feature point, is specifically configured to: select, as the second pose, the candidate pose corresponding to the second feature point whose pixel value difference meets a preset requirement.
The preset plane of the world coordinate system is a preset horizontal plane, and the reference plane is a reference horizontal plane.
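The adjustment of the target space point onto the reference horizontal plane (see claim 2) can be sketched by back-projecting the first feature point through the updated first pose and intersecting the viewing ray with that plane. The following is a minimal sketch, assuming the convention x_cam = R x_world + t and taking the reference horizontal plane as z = 0 in the world coordinate system; the function name and these conventions are assumptions, not the embodiment's specification:

```python
import numpy as np

def adjust_to_reference_plane(K, R, t, feature_uv, plane_z=0.0):
    """Back-project a first feature point (u, v) through the updated
    first pose (R, t) and intersect the viewing ray with the reference
    plane z = plane_z, yielding the adjusted target space point."""
    # Camera center in world coordinates: C = -R^T t.
    C = -R.T @ t
    # Viewing-ray direction in world coordinates through pixel (u, v).
    uv1 = np.array([feature_uv[0], feature_uv[1], 1.0])
    d = R.T @ (np.linalg.inv(K) @ uv1)
    # Solve C + s * d for the ray point with z == plane_z.
    s = (plane_z - C[2]) / d[2]
    return C + s * d
```

The points so obtained lie on the reference plane by construction, so the preset plane they define coincides with the reference plane after adjustment.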
Referring to fig. 6, fig. 6 is a schematic frame diagram of an embodiment of an electronic device according to the present application. The electronic device 60 includes a memory 61 and a processor 62 coupled to each other, and the processor 62 is configured to execute program instructions stored in the memory 61 to implement the steps of any of the above-described visual positioning method embodiments. In one specific implementation scenario, the electronic device 60 may include, but is not limited to, a microcomputer and a server; in addition, the electronic device 60 may also be a mobile device such as a notebook computer or a tablet computer, which is not limited herein.
Specifically, the processor 62 is configured to control itself and the memory 61 to implement the steps of any of the above-described visual positioning method embodiments. The processor 62 may also be referred to as a CPU (Central Processing Unit), and may be an integrated circuit chip with signal processing capability. The processor 62 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. In addition, the processor 62 may be implemented jointly by a plurality of integrated circuit chips.
Referring to fig. 7, fig. 7 is a block diagram illustrating an embodiment of a computer-readable storage medium according to the present application. The computer-readable storage medium 70 stores program instructions 701 executable by a processor, the program instructions 701 being for implementing the steps of any of the visual positioning method embodiments described above.
According to the above scheme, the target space point is used to define the preset plane of the world coordinate system. Therefore, if it is detected that the preset plane of the world coordinate system is not the reference plane, the position of the target space point is adjusted onto the reference plane, so that the preset plane defined based on the adjusted target space point coincides with the reference plane. The adjusted world coordinate system thus conforms to the actual spatial situation; that is, the world coordinate system is corrected, an accurate second pose is obtained, and accurate positioning of the device is achieved.
In some embodiments, functions of or modules included in the apparatus provided in the embodiments of the present disclosure may be used to execute the method described in the above method embodiments, and specific implementation thereof may refer to the description of the above method embodiments, and for brevity, will not be described again here.
The foregoing description of the various embodiments is intended to highlight various differences between the embodiments, and the same or similar parts may be referred to each other, and for brevity, will not be described again herein.
In the several embodiments provided in the present application, it should be understood that the disclosed method and apparatus may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into modules or units is merely a logical function division, and other divisions are possible in actual implementation; for example, units or components may be combined or integrated into another system, or some features may be omitted or not implemented. In addition, the mutual couplings or direct couplings or communication connections shown or discussed may be indirect couplings or communication connections between devices or units through some interfaces, and may be electrical, mechanical, or in other forms.
Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units; they may be located in one place or distributed over a plurality of network elements. Some or all of the units may be selected according to actual needs to achieve the purpose of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as an independent product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the methods according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.