Mobile robot position re-identification method based on laser radar information
1. A mobile robot position re-identification method based on laser radar information is characterized by comprising the following steps:
(1) the laser radar collects laser radar data of the mobile robot in real time, and, according to the odometer information of the mobile robot, the laser radar data are processed every set travel distance to form a multi-channel laser radar polar-coordinate bird's-eye view;
(2) inputting the multi-channel laser radar polar-coordinate bird's-eye view obtained in the step (1) into a feature extraction module to generate a position descriptor;
(3) retrieving, in a map database, a plurality of candidate places similar to the current position descriptor, and estimating the relative orientation between the current position and each candidate place;
(4) adjusting the laser point cloud of each candidate place according to the relative orientation obtained in the step (3) so that its orientation is consistent with that of the current position, and then performing the subsequent pose estimation;
(5) storing the position descriptor and its corresponding frequency spectrum in the map database for the next candidate-place retrieval and relative-orientation estimation.
2. The lidar-information-based mobile robot position re-identification method according to claim 1, wherein in the step (1), the specific steps of the lidar data processing are as follows:
(1-1) dividing laser radar data into a plurality of layers according to height;
(1-2) performing a polar-coordinate transformation within each height layer, dividing the layer into grids, and counting point cloud occupancy information in each grid;
and (1-3) taking the grid values as pixel values to convert each height layer into a bird's-eye view, and stacking the bird's-eye views of the plurality of layers to form a multi-channel laser radar polar-coordinate bird's-eye view.
3. The lidar-information-based mobile robot position re-identification method according to claim 2, wherein the step (1) performs the calculation using the multi-core parallel structure of a GPU: under the GPU framework, each point in the point cloud independently computes its corresponding coordinate on the polar-coordinate bird's-eye view after the coordinate transformation, and the points are then reorganized into the laser radar polar-coordinate bird's-eye view according to their coordinates.
4. The lidar information based mobile robot position re-identification method of claim 1, wherein the feature extraction module is constructed using an encoder-decoder network.
5. The lidar-information-based mobile robot position re-identification method according to claim 4, wherein in the step (2), the specific steps of generating the position descriptor are as follows:
(2-1) extracting robust position features from the multi-channel laser radar polar-coordinate bird's-eye view through an encoder-decoder network to obtain a feature map of the same size as the bird's-eye view, the feature map being used both for forming the position descriptor and for calculating the relative orientation;
and (2-2) converting the feature map into a frequency-domain representation through a fast Fourier transform, and taking the magnitude spectrum of the resulting frequency spectrum as the position descriptor.
6. The lidar information based mobile robot position re-identification method of claim 4, wherein the encoder-decoder network is a deep learning network, and the model training steps are as follows:
(2-1-1) constructing a data set for training the encoder-decoder network, the data set comprising laser radar data collected along the same trajectory in different time periods and the GPS coordinates corresponding to the laser radar data;
(2-1-2) sampling the laser radar data in one trajectory at set distance intervals and retrieving, by distance, neighboring positions in the other trajectories; the neighboring positions serve as neighbors for position re-identification, and the remaining positions serve as negative examples; after sampling, outputting for each sampled place the laser data of a triplet, namely the current position, a neighbor and a negative example, together with the absolute orientation of the triplet;
(2-1-3) inputting the triplet of each sampled place output in the step (2-1-2) into the encoder-decoder network for a single training pass to learn robust features;
(2-1-4) converting the robust features learned by the encoder-decoder network into the frequency domain through a fast Fourier transform, and taking the magnitude spectrum of the resulting current frequency spectrum as the current position descriptor; likewise converting the laser data of the neighbor and the negative example in the triplet into the frequency domain to obtain a neighbor frequency spectrum and a negative-example frequency spectrum, whose magnitude spectra serve as the neighbor position descriptor and the negative-example position descriptor respectively;
(2-1-5) computing a position re-identification loss function from the neighbor, negative-example and current position descriptors, and performing phase cross-correlation between the neighbor frequency spectrum and the current frequency spectrum to calculate the relative orientation, yielding a relative-orientation loss function;
(2-1-6) combining the position re-identification loss function and the relative-orientation loss function into an overall network loss, through which the encoder-decoder network is trained.
7. The lidar information-based mobile robot position re-identification method according to claim 6, wherein the phase cross-correlation between the neighbor frequency spectrum and the current frequency spectrum adopts a differentiable phase correlation algorithm, specifically:
the traditional phase cross-correlation algorithm is rewritten and a gradient-calculation part is added to it, so that the network can back-propagate gradients during training and update its parameters; and the maximum-value calculation over the cross-correlation spectrum, originally a non-differentiable argmax operation, is rewritten as a softargmax operation that computes the expectation of the coordinates, so that gradients propagate normally.
8. The lidar-information-based mobile robot position re-identification method according to claim 1, wherein in the step (3), the relative orientation between the current position and a candidate place is estimated as follows: performing cross-correlation between the current frequency spectrum obtained by the fast Fourier transform and the frequency spectrum of the candidate place to obtain a correlation spectrum, wherein the abscissa of the maximum value in the correlation spectrum corresponds to the relative orientation between the current position and the candidate place.
9. The lidar-information-based position re-identification method for the mobile robot of claim 1, wherein in the step (4), if the pose estimation result converges during the subsequent pose estimation, data association is performed between the current position and the candidate position in the map to assist the subsequent pose optimization and correction; and if the pose estimation result does not converge, the current position of the mobile robot is judged not to be in the map.
Background
Position re-identification is a very important part of global positioning for mobile robots. Global positioning estimates the pose of the robot without any a priori information. The common practice is to divide global positioning into two parts: position re-identification, i.e., matching the current robot observation with observations of previously visited places to obtain possible candidate places; and pose estimation, i.e., performing more accurate pose estimation from the current observation and the observations of the candidate places, thereby calculating the current pose of the robot.
Many existing position re-identification methods are vision-based. They rely on images captured by a camera, are not robust to environmental changes, and are easily affected by illumination as well as seasonal changes over time. With the recent popularization of laser radars, position re-identification methods based on laser radars have gradually emerged, given the reliability of laser measurements under environmental changes. Such methods can achieve certain effects, but most of them do not take the final purpose of global positioning into account, namely giving the pose of the robot. Most methods only give candidate places; once the candidate places are given, errors may still occur when the pose is further estimated from the two frames of point clouds, because conventional point cloud pose estimation methods are highly sensitive to the initial value. For example, at an intersection, a vehicle may enter from the opposite direction: although the position re-identification module can indicate that the vehicle has passed through this intersection before, the 180° difference between the two frames of point clouds may cause the subsequent pose estimation to wrongly conclude that the vehicle is currently oriented in the same direction as on its first pass, leading to an erroneous positioning result.
The invention with publication number CN111914815A discloses a machine-vision intelligent recognition system for garbage targets, comprising an odometer module, a target recognition module and a map building module, together with a corresponding recognition method. Its main aim is to use the real-time pose of a robot during autonomous walking to map the position of a garbage target recognized in an image into a real-time two-dimensional map through a series of coordinate transformations, so as to monitor changes of domestic garbage across a whole community.
The invention with publication number CN111708042A discloses a robot method and system for pedestrian trajectory prediction and following, comprising a ZED camera, a GPU embedded platform, an industrial personal computer, an MCU controller, a laser radar sensor and a wheel odometer. The method fuses the laser radar and the camera, combines a pedestrian trajectory prediction network with a pedestrian re-identification framework, performs prediction with a pedestrian interaction trajectory network, and actively selects an optimal viewing angle for following a target pedestrian.
Disclosure of Invention
The invention aims to provide a position re-identification method based on a laser radar that also takes the subsequent pose estimation into account, realizing global positioning that is robust to environmental changes.
A mobile robot position re-identification method based on laser radar information comprises the following steps:
(1) the laser radar collects laser radar data of the mobile robot in real time, and, according to the odometer information of the mobile robot, the laser radar data are processed every set travel distance to form a multi-channel laser radar polar-coordinate bird's-eye view;
(2) inputting the multi-channel laser radar polar-coordinate bird's-eye view obtained in the step (1) into a feature extraction module to generate a position descriptor;
(3) retrieving, in a map database, a plurality of candidate places similar to the current position descriptor, and estimating the relative orientation between the current position and each candidate place;
(4) adjusting the laser point cloud of each candidate place according to the relative orientation obtained in the step (3) so that its orientation is consistent with that of the current position, and then performing the subsequent pose estimation;
(5) storing the position descriptor and its corresponding frequency spectrum in the map database for the next candidate-place retrieval and relative-orientation estimation.
The invention converts rotation into translation through the polar-coordinate transformation and, based on the invariance of the magnitude spectrum to translation, performs spectrum cross-correlation on two images to solve for their translation. The translation invariance is used to generate position descriptors for candidate matching in position re-identification, while the cross-correlation, together with the polar-coordinate transformation, solves the relative rotation. Considering environments that change over time, a deep learning network is adopted to extract the features.
In the step (1), the laser radar data processing comprises the following specific steps:
(1-1) dividing laser radar data into a plurality of layers according to height;
(1-2) performing a polar-coordinate transformation within each height layer, dividing the layer into grids, and counting point cloud occupancy information in each grid;
and (1-3) taking the grid values as pixel values to convert each height layer into a bird's-eye view, and stacking the bird's-eye views of the plurality of layers to form a multi-channel laser radar polar-coordinate bird's-eye view.
The step (1) performs the calculation using the multi-core parallel structure of a GPU (graphics processing unit): under the GPU framework, each point in the point cloud independently computes its corresponding coordinate on the polar-coordinate bird's-eye view after the coordinate transformation, and the points are then reorganized into the laser radar polar-coordinate bird's-eye view according to their coordinates. Compared with the traditional serial computing mode of a CPU (central processing unit), the larger the point cloud, the more obvious the acceleration effect of the GPU's multi-core parallel structure.
Through the calculation in the step (1), a rotation of the mobile robot at the same place is expressed as a horizontal translation in the multi-channel laser radar polar-coordinate bird's-eye view.
The feature extraction module is constructed using an encoder-decoder network.
In the step (2), the specific steps of generating the position descriptor are as follows:
(2-1) extracting robust position features from the multi-channel laser radar polar-coordinate bird's-eye view through the encoder-decoder network to obtain a feature map of the same size as the bird's-eye view, the feature map being used both for forming the position descriptor and for calculating the relative orientation;
and (2-2) converting the feature map into a frequency-domain representation through a fast Fourier transform, and taking the magnitude spectrum of the resulting frequency spectrum as the position descriptor.
Due to the properties of the fast Fourier transform, a translation of the feature map is not reflected in the magnitude spectrum, so the position descriptor is rotation-invariant.
The encoder-decoder network is a deep learning network; the model training steps are as follows:
(2-1-1) constructing a data set for training the encoder-decoder network, the data set comprising laser radar data collected along the same trajectory in different time periods and the GPS coordinates corresponding to the laser radar data;
(2-1-2) sampling the laser radar data in one trajectory at set distance intervals and retrieving, by distance, neighboring positions in the other trajectories; the neighboring positions serve as neighbors for position re-identification, and the remaining positions serve as negative examples; after sampling, outputting for each sampled place the laser data of a triplet, namely the current position, a neighbor and a negative example, together with the absolute orientation of the triplet;
(2-1-3) inputting the triplet of each sampled place output in the step (2-1-2) into the encoder-decoder network for a single training pass to learn robust features;
(2-1-4) converting the robust features learned by the encoder-decoder network into the frequency domain through a fast Fourier transform, and taking the magnitude spectrum of the resulting current frequency spectrum as the current position descriptor; likewise converting the laser data of the neighbor and the negative example in the triplet into the frequency domain to obtain a neighbor frequency spectrum and a negative-example frequency spectrum, whose magnitude spectra serve as the neighbor position descriptor and the negative-example position descriptor respectively;
(2-1-5) computing a position re-identification loss function from the neighbor, negative-example and current position descriptors, and performing phase cross-correlation between the neighbor frequency spectrum and the current frequency spectrum to calculate the relative orientation, yielding a relative-orientation loss function;
(2-1-6) combining the position re-identification loss function and the relative-orientation loss function into an overall network loss, through which the encoder-decoder network is trained.
Preferably, the phase cross-correlation between the neighbor frequency spectrum and the current frequency spectrum adopts a differentiable phase correlation algorithm, specifically:
the traditional phase cross-correlation algorithm is rewritten and a gradient-calculation part is added to it, so that the network can back-propagate gradients during training and update its parameters; and the maximum-value calculation over the cross-correlation spectrum, originally a non-differentiable argmax operation, is rewritten as a softargmax operation that computes the expectation of the coordinates, so that gradients propagate normally.
The differentiable phase correlation algorithm allows gradients to propagate through the network and supervises the model in learning the relevant features.
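To make the idea concrete, the following numpy sketch shows the core of such a differentiable phase correlation: the argmax over the correlation spectrum is replaced by a softargmax, i.e., an expectation over column coordinates. This is an illustrative stand-in, not the patented implementation: a real version would be written in an autodiff framework (e.g. PyTorch) so gradients actually flow, and the temperature `beta` is an assumed parameter.

```python
import numpy as np

def soft_phase_correlation_shift(a, b, beta=100.0):
    """Estimate the circular column shift of b relative to a via phase
    correlation, with the non-differentiable argmax replaced by a
    softargmax (expectation over column coordinates)."""
    Fa, Fb = np.fft.fft2(a), np.fft.fft2(b)
    cross = Fb * np.conj(Fa)
    cross /= np.abs(cross) + 1e-12           # normalized cross-power spectrum
    corr = np.fft.ifft2(cross).real          # correlation surface, peak at the shift
    col_score = corr.sum(axis=0)             # marginalize over the radius axis
    w = np.exp(beta * (col_score - col_score.max()))
    w /= w.sum()                             # softmax weights over columns
    # softargmax = expected column index (for shifts near the wrap-around,
    # a circular expectation would be needed instead)
    return float((w * np.arange(len(col_score))).sum())
```

With `beta` large, the softargmax approaches the hard argmax while remaining differentiable with respect to the correlation values, which is what lets the relative-orientation loss supervise the feature extractor.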
In the step (3), the relative orientation between the current position and a candidate place is estimated as follows: performing cross-correlation between the current frequency spectrum obtained by the fast Fourier transform and the frequency spectrum of the candidate place to obtain a correlation spectrum, wherein the abscissa of the maximum value in the correlation spectrum corresponds to the relative orientation between the current position and the candidate place.
In the step (4), if the pose estimation result converges during the subsequent pose estimation, data association is performed between the current position and the candidate position in the map to assist the subsequent pose optimization and correction; and if the pose estimation result does not converge, the current position of the mobile robot is judged not to be in the map.
Compared with the prior art, the invention has the advantages that:
1. The invention does not depend on images acquired by a camera, is robust to environmental changes, and is not easily affected by illumination or seasonal changes over time.
2. The invention takes the subsequent pose estimation into account, preventing the errors that may arise when the pose is further estimated from two frames of point clouds.
Drawings
Fig. 1 is a flowchart of a method for re-identifying a position of a mobile robot based on laser radar information according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of converting an original laser point cloud into a multi-channel image according to an embodiment of the present invention.
FIG. 3 is a visualization of a rotation-invariant position descriptor in accordance with an embodiment of the present invention.
Fig. 4 is a flow chart of a solution of the differentiable phase correlation method shown in fig. 1.
FIG. 5 is a visualization of a correlation spectrum and its relationship to relative orientation according to an embodiment of the present invention.
Detailed Description
As shown in fig. 1, the method for re-identifying the position of the mobile robot based on the laser radar information includes the following steps:
(1) the mobile robot moves in an outdoor scene while the laser radar collects laser radar data in real time; according to the odometer information of the mobile robot, the currently obtained laser radar data are processed every 3 m to form a multi-channel laser radar polar-coordinate bird's-eye view, as shown in fig. 2; the specific steps of processing the laser radar data are as follows:
(1-1) dividing laser radar data into a plurality of layers according to height;
(1-2) performing a polar-coordinate transformation within each height layer, dividing sector-shaped grids by angle and radius, and counting point cloud occupancy information in each grid;
and (1-3) taking the grid values as pixel values to convert each height layer into a bird's-eye view, in which the image width represents the grid angle (ranging over 0-2π) and the image height represents the grid radius (from 0 at the top to the maximum radius at the bottom); each channel represents one height layer, and the bird's-eye views of the plurality of layers are stacked to form the multi-channel laser radar polar-coordinate bird's-eye view.
The calculation uses the multi-core parallel structure of the GPU: under the GPU framework, each point in the point cloud independently computes its corresponding coordinate on the polar-coordinate bird's-eye view after the coordinate transformation, and the points are then reorganized into the laser radar polar-coordinate bird's-eye view according to their coordinates. Compared with the traditional serial computing mode of the CPU, the larger the point cloud, the more obvious the acceleration effect.
Through this calculation, a rotation of the mobile robot at the same place is expressed as a horizontal translation in the multi-channel laser radar polar-coordinate bird's-eye view.
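The binning of step (1) can be sketched in numpy as follows. The grid resolutions, maximum radius, and height-layer boundaries below are illustrative assumptions rather than values fixed by the embodiment, and occupancy is recorded as a simple 0/1 flag per cell:

```python
import numpy as np

def polar_bev(points, num_angles=120, num_radii=40, max_radius=80.0,
              height_bins=(-2.0, 0.0, 2.0, 4.0)):
    """Convert an (N, 3) lidar point cloud into a multi-channel polar
    bird's-eye view: channel = height layer, rows = radius, cols = angle."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.hypot(x, y)
    theta = np.arctan2(y, x) % (2 * np.pi)        # angle in [0, 2*pi)
    bev = np.zeros((len(height_bins) - 1, num_radii, num_angles), dtype=np.float32)
    # Bin indices: angle -> image column, radius -> image row, height -> channel
    a_idx = np.minimum((theta / (2 * np.pi) * num_angles).astype(int), num_angles - 1)
    r_idx = np.minimum((r / max_radius * num_radii).astype(int), num_radii - 1)
    h_idx = np.digitize(z, height_bins) - 1
    valid = (r < max_radius) & (h_idx >= 0) & (h_idx < bev.shape[0])
    bev[h_idx[valid], r_idx[valid], a_idx[valid]] = 1.0   # occupancy flag
    return bev
```

Because the angle axis is the image width, rotating the robot in place moves points along the columns only, which is exactly the rotation-to-translation conversion the method relies on. A GPU implementation would perform the same per-point index computation in parallel.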
(2) Inputting the multi-channel laser radar polar-coordinate bird's-eye view obtained in the step (1) into the encoder-decoder network to generate a position descriptor, specifically:
(2-1) extracting robust position features from the multi-channel laser radar polar-coordinate bird's-eye view through the encoder-decoder network to obtain a feature map of the same size as the bird's-eye view, the feature map being used both for forming the position descriptor and for calculating the relative orientation;
(2-2) converting the feature map into a frequency-domain representation through a fast Fourier transform, and taking the magnitude spectrum of the frequency spectrum as the position descriptor, as shown in FIG. 3.
Due to the properties of the fast Fourier transform, a translation of the feature map is not reflected in the magnitude spectrum, so the position descriptor is rotation-invariant.
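The rotation invariance can be checked numerically: a circular shift along the angle axis (which is what an in-place rotation becomes after the polar transform) changes only the phase of the 2-D FFT, never the magnitude. A small self-contained check, with a random array standing in for a real feature map:

```python
import numpy as np

rng = np.random.default_rng(0)
feat = rng.random((40, 120))            # stand-in for the encoder-decoder feature map
rotated = np.roll(feat, 17, axis=1)     # rotation at the same place -> column shift

desc = np.abs(np.fft.fft2(feat))        # magnitude spectrum = position descriptor
desc_rot = np.abs(np.fft.fft2(rotated))

# Magnitudes match to machine precision: the descriptor is rotation-invariant
assert np.allclose(desc, desc_rot)
```

The discarded phase is not wasted: it is exactly the quantity the phase cross-correlation later uses to recover the relative orientation.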
The encoder-decoder network is a deep learning network; the model training steps are as follows:
(2-1-1) constructing a data set for training the encoder-decoder network, the data set comprising laser radar data collected along the same trajectory in different time periods and the GPS coordinates corresponding to the laser radar data;
(2-1-2) sampling the laser radar data in one trajectory at set distance intervals and retrieving, by distance, neighboring positions in the other trajectories; the neighboring positions serve as neighbors for position re-identification, and the remaining positions serve as negative examples; after sampling, outputting for each sampled place the laser data of a triplet, namely the current position, a neighbor and a negative example, together with the absolute orientation of the triplet;
(2-1-3) inputting the triplet of each sampled place output in the step (2-1-2) into the encoder-decoder network for a single training pass to learn robust features;
(2-1-4) converting the robust features learned by the encoder-decoder network into the frequency domain through a fast Fourier transform, and taking the magnitude spectrum of the resulting current frequency spectrum as the current position descriptor; likewise converting the laser data of the neighbor and the negative example in the triplet into the frequency domain to obtain a neighbor frequency spectrum and a negative-example frequency spectrum, whose magnitude spectra serve as the neighbor position descriptor and the negative-example position descriptor respectively;
(2-1-5) computing a position re-identification loss function from the neighbor, negative-example and current position descriptors; the relative orientation is calculated by performing phase cross-correlation between the neighbor frequency spectrum and the current frequency spectrum using a differentiable phase correlation algorithm, yielding a relative-orientation loss function, as shown in fig. 4.
The differentiable phase correlation algorithm rewrites the traditional phase cross-correlation algorithm and adds a gradient-calculation part, so that the network can back-propagate gradients during training and update its parameters; the maximum-value calculation over the cross-correlation spectrum, originally a non-differentiable argmax operation, is rewritten as a softargmax operation that computes the expectation of the coordinates, so that gradients propagate normally.
Because the ordinary phase cross-correlation algorithm cannot be used directly in network learning, the differentiable phase correlation algorithm is used to propagate gradients through the network and supervise the model in learning the relevant features.
(2-1-6) combining the position re-identification loss function and the relative-orientation loss function into an overall network loss, through which the encoder-decoder network is trained.
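The combined loss of (2-1-6) might be sketched as below. The patent does not specify the functional forms, so the margin-based triplet term, the absolute-error orientation term, and the weight `lam` are assumptions for illustration only:

```python
import numpy as np

def total_loss(d_cur, d_pos, d_neg, yaw_pred, yaw_gt, margin=0.5, lam=1.0):
    """Hedged sketch of the overall training loss: a triplet-style
    re-identification term on descriptor distances plus a weighted
    relative-orientation error term (forms and weights assumed)."""
    pos = np.linalg.norm(d_cur - d_pos)        # pull the neighbor descriptor close
    neg = np.linalg.norm(d_cur - d_neg)        # push the negative-example descriptor away
    reid_loss = max(0.0, pos - neg + margin)   # triplet margin loss
    orient_loss = abs(yaw_pred - yaw_gt)       # error of the phase-correlation yaw
    return reid_loss + lam * orient_loss
```

Minimizing the first term makes descriptors discriminative for place retrieval, while the second term, flowing back through the differentiable phase correlation, makes the features suitable for orientation estimation at the same time.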
(3) Retrieving, in the map database, a plurality of candidate places similar to the current position descriptor, and estimating the relative orientation between the current position and each candidate place. The relative orientation is estimated as follows: the current frequency spectrum obtained by the fast Fourier transform and the frequency spectrum of the candidate place are cross-correlated to obtain a correlation spectrum, and the abscissa of the maximum value in the correlation spectrum corresponds to the relative orientation between the current position and the candidate place, as shown in FIG. 5.
(4) Adjusting the laser point cloud of the candidate place according to the relative orientation obtained in the step (3) so that its orientation is consistent with that of the current position, and then performing the subsequent pose estimation. During the subsequent pose estimation, if the pose estimation result converges, data association is performed between the current position and the candidate position in the map to assist the subsequent pose optimization and correction; if the pose estimation result does not converge, the current position of the mobile robot is judged not to be in the map.
(5) Whether or not the current position of the mobile robot is in the map, the position descriptor and its corresponding frequency spectrum are stored in the map database for the next candidate-place retrieval and relative-orientation estimation.
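The orientation adjustment in step (4) amounts to rotating the candidate place's point cloud about the vertical axis by the estimated relative yaw before handing both clouds to the pose estimator, so that the estimator (for example ICP, which the patent does not name explicitly) starts from a good initial value. A minimal sketch:

```python
import numpy as np

def align_candidate_cloud(points, yaw):
    """Rotate an (N, 3) candidate point cloud about the z-axis by the
    estimated relative yaw (radians) so its heading matches the current
    scan, giving the subsequent pose estimation a good initial value."""
    c, s = np.cos(yaw), np.sin(yaw)
    R = np.array([[c, -s, 0.0],
                  [s,  c, 0.0],
                  [0.0, 0.0, 1.0]])
    return points @ R.T
```

This pre-rotation is what removes the 180° ambiguity described in the Background's intersection example: with the heading already aligned, the pose estimator no longer has to recover a large rotation from a poor initial guess.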