Parking information acquisition method and device, and parking method and device
1. A parking information acquisition method characterized by comprising:
acquiring a bird's-eye view of the position of a vehicle to be parked;
inputting the bird's-eye view into a pre-constructed information acquisition model to obtain parking information; the information acquisition model is pre-established based on a multitask neural network, the multitask neural network comprises a shared coding layer and a plurality of task neural networks, and the shared coding layer is shared by the plurality of task neural networks.
2. The method of claim 1, wherein the information acquisition model is constructed by:
acquiring a training image set; the training image set comprises a plurality of training images, each training image is a bird's-eye view image carrying annotation information, and the annotation information comprises: a road element category corresponding to each pixel point in the bird's-eye view, slot corner annotation information of each parking slot, and a slot attribute of each parking slot;
performing data enhancement processing on each training image in the training image set to obtain a first training image of each training image;
forming a target image set by each training image and each first training image;
and training the multitask neural network according to each image in the target image set to obtain an information acquisition model.
3. The method of claim 2, wherein the training the multitask neural network according to each image in the target image set to obtain an information acquisition model comprises:
determining a plurality of training sets according to each image in the target image set; wherein each training set comprises at least one image in the target image set;
selecting a training set from a plurality of training sets as a target training set;
inputting each image in the target training set into a shared coding layer of a multitask neural network to obtain first result data of the target training set;
inputting the first result data into a decoding layer of each task neural network respectively, and processing the first result data through each network layer in the decoding layer of each task neural network to obtain second result data of the target training set output by the decoding layer of each task neural network; for the decoding layer of each task neural network, the input data of the N-th network layer of the decoding layer is the result of fusing the output result of the (N-1)-th network layer of the same decoding layer with the calculation result of the (N-1)-th network layer in the decoding layer of each other task neural network; the calculation result of the (N-1)-th network layer in the decoding layer of each other task neural network is the result of performing a first calculation on the output result of that (N-1)-th network layer; N is a positive integer greater than 1;
calculating a loss function value of the multitask neural network in the current iteration according to each second result data and the labeling information of each image in the target training set;
and updating network parameters of the multitask neural network according to the loss function value of the multitask neural network at the current iteration to obtain a new multitask neural network; when a preset iteration condition is not met, returning to the step of selecting one training set from the plurality of training sets as the target training set, until the iteration condition is met, and taking the current multitask neural network as the information acquisition model.
4. The method of claim 3, wherein the calculating the loss function value of the multitask neural network at the current iteration according to each second result data and the annotation information of each image in the target training set comprises:
calculating a loss function value of each task neural network at the current iteration according to each second result data and the annotation information of each image in the target training set;
obtaining a loss reduction rate of each task neural network at the last iteration;
for each task neural network, calculating a weight of the task neural network at the current iteration according to the loss reduction rates of all the task neural networks at the last iteration;
and calculating the loss function value of the multitask neural network at the current iteration according to the loss function value and the weight of each task neural network at the current iteration.
5. The method of claim 1, wherein the acquiring a bird's-eye view of the position of a vehicle to be parked comprises:
acquiring images acquired by a plurality of image acquisition devices respectively installed at different parts of a vehicle to be parked;
and integrating the plurality of images acquired by the plurality of image acquisition devices to obtain the bird's-eye view of the vehicle to be parked.
6. The method of claim 5, wherein the image acquisition device is a fisheye camera.
7. The method according to claim 5, wherein before the integrating the plurality of images acquired by the plurality of image acquisition devices, the method further comprises:
and carrying out image preprocessing on each image acquired by the plurality of image acquisition devices.
8. The method of claim 7, wherein the performing image preprocessing on each image acquired by the plurality of image acquisition devices comprises:
carrying out distortion removal processing on each image acquired by the plurality of image acquisition devices;
acquiring device parameters of each image acquisition device;
and for each image subjected to the distortion removal processing, carrying out coordinate transformation on the image by using the device parameters corresponding to the image.
9. The method of claim 1, wherein the parking information comprises: a road element category corresponding to each pixel point in the bird's-eye view, slot corner coordinates of each parking slot, and a slot attribute of each parking slot; and after the inputting the bird's-eye view into the pre-constructed information acquisition model to obtain the parking information, the method further comprises:
carrying out coordinate transformation on the slot corner coordinates of each parking slot in the parking information to obtain first slot corner coordinates of each parking slot;
filtering the first slot corner coordinates of each parking slot to obtain second slot corner coordinates of each parking slot;
and planning a parking path according to the second slot corner coordinates of each parking slot, the slot attribute of each parking slot in the parking information, and the road element category corresponding to each pixel point in the bird's-eye view, to obtain the parking path.
10. A method of parking a vehicle, comprising:
acquiring parking information of a vehicle to be parked according to the parking information acquisition method of any one of claims 1 to 8;
determining a parking path according to the parking information;
and controlling the vehicle to be parked to park according to the parking path.
11. The method of claim 10, wherein the parking information comprises: a road element category corresponding to each pixel point in the bird's-eye view, slot corner coordinates of each parking slot, and a slot attribute of each parking slot; and the determining a parking path according to the parking information comprises:
carrying out coordinate transformation on the slot corner coordinates of each parking slot in the parking information to obtain first slot corner coordinates of each parking slot;
filtering the first slot corner coordinates of each parking slot to obtain second slot corner coordinates of each parking slot;
and planning a parking path according to the second slot corner coordinates of each parking slot, the slot attribute of each parking slot in the parking information, and the road element category corresponding to each pixel point in the bird's-eye view, to obtain the parking path.
12. A parking information acquisition apparatus characterized by comprising:
a first obtaining unit, configured to obtain a bird's-eye view of the position of a vehicle to be parked;
an input unit, configured to input the bird's-eye view into a pre-constructed information acquisition model to obtain parking information; the information acquisition model is pre-established based on a multitask neural network, the multitask neural network comprises a shared coding layer and a plurality of task neural networks, and the shared coding layer is shared by the plurality of task neural networks.
13. A parking apparatus, comprising:
a second obtaining unit, configured to obtain parking information of a vehicle to be parked according to the parking information acquisition method of any one of claims 1 to 8;
a determining unit, configured to determine a parking path according to the parking information;
and a control unit, configured to control the vehicle to be parked to park according to the parking path.
Background
In the field of automatic driving, autonomous parking is increasingly becoming a standard feature of intelligent vehicles. An autonomous vehicle plans a parking path based on acquired parking information, controls the vehicle to move, and finally parks the vehicle in a free parking slot.
Acquiring parking information usually requires the visual perception capability of the cameras of an autonomous vehicle to complete multiple tasks, including parking slot detection, vacant slot determination, and obstacle recognition. This means that multiple neural network models need to run simultaneously on the in-vehicle computing platform, which places high demands on computing resources.
Disclosure of Invention
The present application provides a parking information acquisition method and device, and a parking method and device, aiming to solve the problem that the accuracy of acquired parking information is reduced because neural network models have to be heavily pruned.
In order to achieve the above object, the present application provides the following technical solutions:
a parking information acquisition method comprising:
acquiring a bird's-eye view of the position of a vehicle to be parked;
inputting the bird's-eye view into a pre-constructed information acquisition model to obtain parking information; the information acquisition model is pre-established based on a multitask neural network, the multitask neural network comprises a shared coding layer and a plurality of task neural networks, and the shared coding layer is shared by the plurality of task neural networks.
Optionally, the above method includes a process of constructing the information acquisition model, including:
acquiring a training image set; the training image set comprises a plurality of training images, each training image is a bird's-eye view image carrying annotation information, and the annotation information comprises: a road element category corresponding to each pixel point in the bird's-eye view, slot corner annotation information of each parking slot, and a slot attribute of each parking slot;
performing data enhancement processing on each training image in the training image set to obtain a first training image of each training image;
forming a target image set by each training image and each first training image;
and training the multitask neural network according to each image in the target image set to obtain an information acquisition model.
Optionally, in the method, the training the multitask neural network according to each image in the target image set to obtain an information acquisition model includes:
determining a plurality of training sets according to each image in the target image set; wherein each training set comprises at least one image in the target image set;
selecting a training set from a plurality of training sets as a target training set;
inputting each image in the target training set into a shared coding layer of a multitask neural network to obtain first result data of the target training set;
inputting the first result data into a decoding layer of each task neural network respectively, and processing the first result data through each network layer in the decoding layer of each task neural network to obtain second result data of the target training set output by the decoding layer of each task neural network; for the decoding layer of each task neural network, the input data of the N-th network layer of the decoding layer is the result of fusing the output result of the (N-1)-th network layer of the same decoding layer with the calculation result of the (N-1)-th network layer in the decoding layer of each other task neural network; the calculation result of the (N-1)-th network layer in the decoding layer of each other task neural network is the result of performing a first calculation on the output result of that (N-1)-th network layer; N is a positive integer greater than 1;
calculating a loss function value of the multitask neural network in the current iteration according to each second result data and the labeling information of each image in the target training set;
and updating network parameters of the multitask neural network according to the loss function value of the multitask neural network at the current iteration to obtain a new multitask neural network; when a preset iteration condition is not met, returning to the step of selecting one training set from the plurality of training sets as the target training set, until the iteration condition is met, and taking the current multitask neural network as the information acquisition model.
Optionally, the calculating a loss function value of the multitask neural network at the current iteration according to each second result data and the annotation information of each image in the target training set includes:
calculating a loss function value of each task neural network at the current iteration according to each second result data and the annotation information of each image in the target training set;
obtaining a loss reduction rate of each task neural network at the last iteration;
for each task neural network, calculating a weight of the task neural network at the current iteration according to the loss reduction rates of all the task neural networks at the last iteration;
and calculating the loss function value of the multitask neural network at the current iteration according to the loss function value and the weight of each task neural network at the current iteration.
Optionally, in the above method, the acquiring a bird's-eye view of the position of a vehicle to be parked includes:
acquiring images acquired by a plurality of image acquisition devices respectively installed at different parts of a vehicle to be parked;
and integrating the plurality of images acquired by the plurality of image acquisition devices to obtain the bird's-eye view of the vehicle to be parked.
Optionally, in the above method, the image acquisition device is a fisheye camera.
Optionally, in the above method, before the integration processing is performed on the plurality of images acquired by the plurality of image acquisition devices, the method further includes:
and carrying out image preprocessing on each image acquired by the plurality of image acquisition devices.
Optionally, in the above method, the performing image preprocessing on each image acquired by the plurality of image acquisition devices includes:
carrying out distortion removal processing on each image acquired by the plurality of image acquisition devices;
acquiring device parameters of each image acquisition device;
and for each image subjected to the distortion removal processing, carrying out coordinate transformation on the image by using the device parameters corresponding to the image.
Optionally, in the above method, the parking information comprises: a road element category corresponding to each pixel point in the bird's-eye view, slot corner coordinates of each parking slot, and a slot attribute of each parking slot; and after the inputting the bird's-eye view into the pre-constructed information acquisition model to obtain the parking information, the method further includes:
carrying out coordinate transformation on the slot corner coordinates of each parking slot in the parking information to obtain first slot corner coordinates of each parking slot;
filtering the first slot corner coordinates of each parking slot to obtain second slot corner coordinates of each parking slot;
and planning a parking path according to the second slot corner coordinates of each parking slot, the slot attribute of each parking slot in the parking information, and the road element category corresponding to each pixel point in the bird's-eye view, to obtain the parking path.
A method of parking a vehicle comprising:
acquiring parking information of a vehicle to be parked by using the parking information acquisition method described above;
determining a parking path according to the parking information;
and controlling the vehicle to be parked to park according to the parking path.
Optionally, in the above method, the parking information includes: a road element category corresponding to each pixel point in the bird's-eye view, slot corner coordinates of each parking slot, and a slot attribute of each parking slot; and the determining a parking path according to the parking information includes:
carrying out coordinate transformation on the slot corner coordinates of each parking slot in the parking information to obtain first slot corner coordinates of each parking slot;
filtering the first slot corner coordinates of each parking slot to obtain second slot corner coordinates of each parking slot;
and planning a parking path according to the second slot corner coordinates of each parking slot, the slot attribute of each parking slot in the parking information, and the road element category corresponding to each pixel point in the bird's-eye view, to obtain the parking path.
A parking information acquisition apparatus comprising:
a first obtaining unit, configured to obtain a bird's-eye view of the position of a vehicle to be parked;
an input unit, configured to input the bird's-eye view into a pre-constructed information acquisition model to obtain parking information; the information acquisition model is pre-established based on a multitask neural network, the multitask neural network comprises a shared coding layer and a plurality of task neural networks, and the shared coding layer is shared by the plurality of task neural networks.
A parking apparatus comprising:
a second obtaining unit, configured to obtain parking information of a vehicle to be parked by using the parking information acquisition method described above;
a determining unit, configured to determine a parking path according to the parking information;
and a control unit, configured to control the vehicle to be parked to park according to the parking path.
Compared with the prior art, the method has the following advantages:
the application provides a parking information acquisition method and device and a parking method and device, wherein the parking information acquisition method comprises the following steps: acquiring a bird view of the position of a vehicle to be parked; inputting the aerial view into a pre-constructed information acquisition model to obtain parking information; the information acquisition model is pre-established based on a multitask neural network, the multitask neural network comprises a shared coding layer and a plurality of task neural networks, and the shared coding layer is shared by the plurality of task neural networks. Therefore, in the scheme of the application, the information acquisition model is constructed in advance based on the multitask neural network, the aerial view is input into the information acquisition model, the parking information can be directly obtained, and the information acquisition model only needs to be operated, so that the occupied computing resources are less, and the information acquisition model does not need to be cut in a large amount, so that the accuracy of the parking information and the acquisition efficiency of the parking information are improved.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art will be briefly introduced below. It is obvious that the drawings in the following description are only embodiments of the present application, and for those skilled in the art, other drawings can be obtained from the provided drawings without creative effort.
Fig. 1 is a flowchart of a parking information obtaining method according to the present application;
fig. 2 is a flowchart of another method of the parking information obtaining method according to the present application;
fig. 3 is an exemplary diagram of a parking information obtaining method provided by the present application;
fig. 4 is a flowchart of another method of the parking information obtaining method according to the present application;
fig. 5 is a diagram illustrating another example of a parking information obtaining method according to the present application;
fig. 6 is a flowchart of another method of a parking information obtaining method according to the present application;
fig. 7 is a diagram illustrating another example of a parking information obtaining method according to the present application;
fig. 8 is another exemplary diagram of a parking information obtaining method provided by the present application;
fig. 9 is a flowchart of another method of a parking information obtaining method according to the present application;
FIG. 10 is a method flow diagram of a parking method provided herein;
FIG. 11 is an exemplary illustration of a method of parking a vehicle according to the present application;
fig. 12 is a schematic structural diagram of a parking information obtaining method according to the present application;
fig. 13 is a schematic structural diagram of a parking method according to the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The term "include" and variations thereof as used herein are open-ended, i.e., "including but not limited to". The term "based on" is "based, at least in part, on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions for other terms will be given in the following description.
It should be noted that the terms "first", "second", and the like in the disclosure of the present application are only used for distinguishing different devices, modules or units, and are not used for limiting the order or interdependence relationship of the functions performed by the devices, modules or units.
It is noted that references to "a", "an", and "the" modifications in the disclosure herein are exemplary rather than limiting, and those skilled in the art will understand that "one or more" will be understood unless the context clearly dictates otherwise.
An embodiment of the present application provides a parking information acquisition method, which can be applied to various system platforms. The execution subject of the method may be a processor of an in-vehicle computing platform. A method flowchart is shown in fig. 1, and the method specifically includes the following steps:
S101, acquiring a bird's-eye view of the position of the vehicle to be parked.
In this embodiment, after the vehicle to be parked arrives at the parking lot, a bird's-eye view of the position where the vehicle to be parked is located is acquired.
Referring to fig. 2, the process of acquiring the bird's-eye view of the position of the vehicle to be parked includes:
s201, images acquired by a plurality of image acquisition devices respectively installed at different parts of a vehicle to be parked are acquired.
In this embodiment, image acquisition devices are installed at different positions of the vehicle to be parked, and each image acquisition device is used to acquire an image of the area it covers. Optionally, the image acquisition device may be a fisheye camera.
In this embodiment, an image acquired by each image acquisition device on a vehicle to be parked is acquired.
S202, integrating the plurality of images acquired by the plurality of image acquisition devices to obtain the bird's-eye view of the vehicle to be parked.
In this embodiment, the plurality of images acquired by the image acquisition devices are integrated to obtain the bird's-eye view of the vehicle to be parked. Specifically, the images acquired by the image acquisition devices may be stitched together to obtain the bird's-eye view of the vehicle to be parked.
In this embodiment, the process of stitching the plurality of images acquired by each image acquisition device specifically includes the following steps:
and determining a matching point of each image, wherein the matching point is an effective characteristic point in a visual field range overlapped in each image, and splicing each image according to the matching point of each image so as to obtain a bird's-eye view covering 360 degrees around the vehicle to be parked. In this embodiment, referring to fig. 3, fig. 3 shows a bird's eye view of a vehicle to be parked, which is obtained by integrating a plurality of images acquired by a plurality of image acquisition devices.
Optionally, before the integration processing is performed on the images acquired by the plurality of image acquisition devices, image preprocessing may be performed on each image. Optionally, the image preprocessing includes distortion removal processing and coordinate transformation, so that the preprocessed images are integrated to obtain the bird's-eye view of the vehicle to be parked.
In this embodiment, the process of performing image preprocessing on each image acquired by the plurality of image acquisition devices specifically includes the following steps:
carrying out distortion removal processing on each image acquired by the plurality of image acquisition devices;
acquiring device parameters of each image acquisition device;
and for each image subjected to the distortion removal processing, carrying out coordinate transformation on the image by using the device parameters corresponding to the image.
In this embodiment, since the images captured by the image acquisition devices may exhibit distortion, including but not limited to radial distortion and tangential distortion, distortion removal processing needs to be performed on each acquired image to improve the accuracy of subsequent stitching.
In this embodiment, the image acquisition devices are installed at different positions of the vehicle to be parked, so different devices image from different angles; that is, images captured by different image acquisition devices correspond to different image coordinate systems.
In this embodiment, the device parameters of each image acquisition device installed at the different positions of the vehicle to be parked are acquired, and for each image after the distortion removal processing, the device parameters of the corresponding image acquisition device are used to perform coordinate transformation on the image, so that all images are transformed from their different image coordinate systems into one common image coordinate system.
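A minimal sketch of this preprocessing chain, assuming OpenCV's fisheye model is an adequate stand-in for the device parameters described here; the names K, D, and H_bev are hypothetical placeholders for the calibrated intrinsics, distortion coefficients, and image-to-common-frame homography.

```python
import cv2

def preprocess(img, K, D, H_bev, bev_size):
    """Remove fisheye distortion, then transform the frame into the
    common bird's-eye-view image coordinate system."""
    undistorted = cv2.fisheye.undistortImage(img, K, D, Knew=K)  # distortion removal
    return cv2.warpPerspective(undistorted, H_bev, bev_size)     # coordinate transform
```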
Optionally, after image preprocessing is performed on each image acquired by the plurality of image acquisition devices, brightness equalization processing and color equalization processing may be performed on each image after image preprocessing.
S102, inputting the bird's-eye view into a pre-constructed information acquisition model to obtain parking information; the information acquisition model is pre-established based on a multitask neural network, the multitask neural network comprises a shared coding layer and a plurality of task neural networks, and the shared coding layer is shared by the plurality of task neural networks.
In this embodiment, an information acquisition model is pre-established, and the information acquisition model is pre-established based on a multitask neural network, where the multitask neural network includes a shared coding layer and a plurality of task neural networks, and the shared coding layer is shared by the plurality of task neural networks.
In this embodiment, the bird's-eye view is input into the pre-constructed information acquisition model and processed by the model to obtain the parking information it outputs, where the parking information includes: the road element category corresponding to each pixel point in the bird's-eye view, the slot corner coordinates of each parking slot, and the slot attribute of each parking slot.
Referring to fig. 4, the process of constructing the information acquisition model specifically includes the following steps:
s401, obtaining a training image set.
In this embodiment, a training image set is acquired, where the training image set includes a plurality of training images, each training image is a bird's-eye view image carrying annotation information, and the annotation information includes: the road element category corresponding to each pixel point in the bird's-eye view, the slot corner annotation information of each parking slot in the bird's-eye view, and the slot attribute of each parking slot in the bird's-eye view.
In this embodiment, the road element categories include, but are not limited to: road, curb, lane line, wheel block, and obstacle. The slot corner annotation information of each parking slot comprises annotation information for the four corner points of the slot. The slot attributes include, but are not limited to: a slot occupancy attribute, a slot line shape attribute, a slot material attribute, a slot marking wear attribute, and a slot corner attribute. The slot occupancy attribute indicates the occupancy state of the slot, distinguishing empty slots from occupied slots; the slot line shape attribute indicates the shape of the slot lines, including but not limited to T-shaped, L-shaped, I-shaped, and U-shaped; the slot material attribute indicates the surface material of the slot, including but not limited to cement, stone brick, grass brick, grass, asphalt, paint, and metal; and the slot corner attribute indicates whether a slot corner point is occluded, distinguishing unoccluded from occluded corner points.
Optionally, the slot corner annotation information of each parking slot is labeled in counterclockwise order and may be represented by numbers. Referring to fig. 5, fig. 5 shows the slot corner annotation information and the slot occupancy attribute of a parking slot: the front-right corner point is labeled 1, the front-left corner point is labeled 2, the rear-left corner point is labeled 3, the rear-right corner point is labeled 4, and the slot occupancy attribute is an empty slot.
S402, performing data enhancement processing on each training image in the training image set to obtain a first training image of each training image.
In this embodiment, a data enhancement processing method is used to perform data enhancement processing on each training image in the training image set to obtain a first training image of each training image. Specifically, the data enhancement processing method may be random rotation, random translation, random noise, random brightness, random contrast, or random color processing, so as to enrich the images used for model training.
The data enhancement processing may be performed multiple times on each training image in the training image set, so as to obtain a plurality of first training images for each training image.
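For illustration, the listed enhancement operations could be composed with torchvision as sketched below. This assumes training images are PIL images; training_images is a hypothetical name, and note that geometric operations (rotation, translation) would also have to be applied consistently to the corner and segmentation annotations.

```python
import torchvision.transforms as T

# Illustrative composition of the enhancement operations named above.
augment = T.Compose([
    T.RandomRotation(degrees=10),                       # random rotation
    T.RandomAffine(degrees=0, translate=(0.05, 0.05)),  # random translation
    T.ColorJitter(brightness=0.3, contrast=0.3,         # random brightness /
                  saturation=0.3, hue=0.05),            # contrast / color
])

# Applying the pipeline several times yields several first training images
# per training image (annotations must be transformed alongside the pixels).
first_training_images = [augment(img) for img in training_images for _ in range(3)]
```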
S403, forming a target image set from the training images and the first training images.
In this embodiment, each training image and each first training image are combined into a target image set.
S404, training the multitask neural network according to each image in the target image set to obtain an information acquisition model.
In this embodiment, a multitask neural network is pre-constructed, where the multitask neural network includes a shared coding layer and a plurality of task neural networks, each task neural network includes a coding layer and a decoding layer, the plurality of task neural networks share the shared coding layer, and the decoding layer of each task neural network includes a plurality of network layers.
In this embodiment, the multitask neural network is trained according to each image in the target image set, so as to obtain an information acquisition model.
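The following PyTorch skeleton sketches the shared-encoder, per-task-decoder layout described above; the layer sizes, channel counts, and head shapes are invented for illustration and are not the concrete architecture of the present application.

```python
import torch
import torch.nn as nn

class MultiTaskNet(nn.Module):
    """Toy multitask network: one shared coding layer, two task heads."""
    def __init__(self, num_road_classes=5):
        super().__init__()
        self.encoder = nn.Sequential(                  # shared coding layer
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        # detection decoder: 8 channels for 4 corners x (x, y) + 1 slot attribute
        self.detect_head = nn.Conv2d(64, 9, 1)
        # segmentation decoder: per-pixel road element logits
        self.segment_head = nn.Conv2d(64, num_road_classes, 1)

    def forward(self, bev):
        feat = self.encoder(bev)                       # "first result data"
        return self.detect_head(feat), self.segment_head(feat)
```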
Referring to fig. 6, the process of training the multitask neural network to obtain the information acquisition model according to each image in the target image set specifically includes the following steps:
s601, determining a plurality of training sets according to each image in the target image set.
In this embodiment, each image in the target image set is divided according to a preset rule to form a plurality of training sets, and each training set includes at least one image in the target image set.
S602, selecting one training set from the plurality of training sets as a target training set.
In this embodiment, one training set is selected from the plurality of training sets as the target training set. For the first selection of the target training set, one training set may be randomly selected from the plurality of training sets; for subsequent selections, the target training set is selected from the remaining training sets that have not yet been selected.
S603, inputting each image in the target training set into a shared coding layer of the multitask neural network to obtain first result data of the target training set.
In this embodiment, each image in the target training set is input to a shared coding layer of the multitask neural network, and is processed by the shared coding layer of the multitask neural network to obtain first result data of the target training set output by the shared coding layer, where the shared coding layer is a coding layer shared by each task neural network in the multitask neural network.
The network structure of the multitask neural network mentioned in this embodiment is shown in fig. 7, where the shared base network model is the shared coding layer of the multitask neural network, the task A sub-module is the decoding layer of task neural network A, the task B sub-module is the decoding layer of task neural network B, and the task C sub-module is the decoding layer of task neural network C. Learnable parameters (denoted by P) exist between different task sub-modules and are used to determine whether, and to what extent, knowledge needs to be passed between different tasks. P_AB is the preset value by which an output result is first multiplied (the first calculation) when it is passed from the task A sub-module to the task B sub-module; similarly, P_BA is the preset value by which an output result is first multiplied when it is passed from the task B sub-module to the task A sub-module, and so on.
S604, inputting the first result data into the decoding layer of each task neural network respectively, and processing the first result data through each network layer in the decoding layer of each task neural network to obtain second result data of the target training set output by the decoding layer of each task neural network.
In this embodiment, the decoding layer of each task neural network includes a plurality of network layers, and it should be noted that the number of network layers included in the decoding layer of each task neural network in the multitask neural network is the same.
In this embodiment, the first result data output by the shared coding layer is input into the decoding layer of each task neural network respectively and is processed by each network layer in the decoding layer of each task neural network to obtain the second result data of the target training set output by the decoding layer of each task neural network. For the decoding layer of each task neural network, the input data of the N-th network layer of the decoding layer is the result of fusing the output result of the (N-1)-th network layer of the same decoding layer with the calculation result of the (N-1)-th network layer in the decoding layer of each other task neural network; the calculation result of the (N-1)-th network layer in the decoding layer of each other task neural network is the result of performing a first calculation on the output result of that (N-1)-th network layer; N is a positive integer greater than 1. That is, the input of each network layer of a decoding layer is related to the output result of the previous network layer of that decoding layer and to the output results of the corresponding network layers of the other decoding layers.
Optionally, the first calculation on the output result of the (N-1)-th network layer in the decoding layer of another task neural network may be multiplying the output result by a preset value. It should be noted that the preset value depends on both the task neural network producing the output result and the task neural network receiving it; that is, preset values differ between different pairs of task neural networks, and the output result of one task neural network corresponds to different preset values for different receiving task neural networks.
The above process, in which the first result data is input to the decoding layer of each task neural network and processed by each network layer in the decoding layer to obtain the second result data of the target training set, is illustrated as follows:
the multitask neural network comprises a task neural network A, a task neural network B and a task neural network C, wherein 2 network layers are arranged in a decoding layer of the task neural network A, 2 network layers are arranged in a decoding layer of the task neural network B, 2 network layers are arranged in a decoding layer of the task neural network C, first result data output by a shared coding layer of the multitask neural network are respectively input into the decoding layers, a first network layer in the decoding layers of the task neural network A, a first network layer of the task neural network B and a first network layer of the task neural network C are respectively input into the decoding layers, and the first results are processed and output by the first network layers of the decoding layers of the task neural network A, B and the decoding layers of the task neural network C.
Aiming at the second network layer of the task neural network A, multiplying the result output by the first network layer of the task neural network B by a preset first value PBAObtaining a first sub-result, and multiplying the result output by the first network layer of the task neural network C by a preset second value PCAObtaining a second sub-result, pairAnd the result output by the first network layer of the task neural network A, the first sub-result and the second sub-result are subjected to fusion processing to obtain a first fusion result, and the first fusion result is input to the second network layer of the task neural network A.
Aiming at the second network layer of the task neural network B, multiplying the result output by the first network layer of the task neural network A by a preset third numerical value PABObtaining a third sub-result, and multiplying the result output by the first network layer of the task neural network C by a preset fourth value PCBAnd obtaining a fourth sub-result, performing fusion processing on the result output by the first network layer of the task neural network B, the third sub-result and the fourth sub-result to obtain a second fusion result, and inputting the second fusion result to the second network layer of the task neural network B.
Aiming at the second network layer of the task neural network C, multiplying the result output by the first network layer of the task neural network A by a preset fifth numerical value PACObtaining a fifth sub-result, and multiplying the result output by the first network layer of the task neural network B by a preset sixth numerical value PBCAnd obtaining a sixth sub-result, performing fusion processing on the result output by the first network layer of the task neural network C, the fifth sub-result and the sixth sub-result to obtain a third fusion result, and inputting the third fusion result to the second network layer of the task neural network C.
And the second network layer of the task neural network A processes the input first fusion result and outputs second result data of the target training set.
And the second network layer of the task neural network B processes the input second fusion result and outputs second result data of the target training set.
And the second network layer of the task neural network C processes the input third fusion result and outputs second result data of the target training set.
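A compact sketch of this cross-task fusion for two decoders, assuming addition as the fusion operation and scalar learnable transfer parameters; in the application the fusion operation and the form of the P parameters are left more general.

```python
import torch
import torch.nn as nn

class FusedDecoders(nn.Module):
    """Two 2-layer task decoders whose second-layer input fuses the other
    decoder's first-layer output scaled by a learnable transfer parameter."""
    def __init__(self, ch=64):
        super().__init__()
        self.a1 = nn.Conv2d(ch, ch, 3, padding=1)   # task A, network layer 1
        self.a2 = nn.Conv2d(ch, ch, 3, padding=1)   # task A, network layer 2
        self.b1 = nn.Conv2d(ch, ch, 3, padding=1)   # task B, network layer 1
        self.b2 = nn.Conv2d(ch, ch, 3, padding=1)   # task B, network layer 2
        self.P_ba = nn.Parameter(torch.zeros(1))    # transfer weight B -> A
        self.P_ab = nn.Parameter(torch.zeros(1))    # transfer weight A -> B

    def forward(self, shared_feat):
        out_a1 = self.a1(shared_feat)
        out_b1 = self.b1(shared_feat)
        # "first calculation": scale the other decoder's layer-1 output by P,
        # then fuse (here, by addition) with this decoder's own layer-1 output
        in_a2 = out_a1 + self.P_ba * out_b1
        in_b2 = out_b1 + self.P_ab * out_a1
        return self.a2(in_a2), self.b2(in_b2)
```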
S605, obtaining a loss function value of the multitask neural network at the current iteration according to the second result data and the labeling information of each image in the target training set.
In this embodiment, the loss function value is related to the second result data output by the decoding layer of each task neural network and the label information of each image in the target training set, and the loss function value of the multi-task neural network in the current iteration is calculated according to the second result data and the label information of each image in the target training set through a preset loss function calculation formula.
In this embodiment, the process of calculating the loss function value of the multitask neural network at the current iteration according to each piece of second result data and the label information of each image in the target training set specifically includes:
calculating a loss function value of each task neural network in the current iteration according to each second result data and the labeling information of each image in the target training set;
obtaining the loss reduction rate of each task neural network in the last iteration;
for each task neural network, calculating the weight of the task neural network at the current iteration according to the loss reduction rates of all the task neural networks at the last iteration;
and calculating the loss function value of the multi-task neural network in the current iteration according to the loss function value and the weight of each task neural network in the current iteration.
In this embodiment, according to the currently output second result data and the label information of each image in the target training set, the loss function value of each task neural network in the current iteration is calculated through a preset task neural network loss function calculation formula.
In this embodiment, the loss function value of each task neural network at the last iteration and at the iteration before the last are obtained, and the loss reduction rate of each task neural network at the last iteration is calculated through a preset loss reduction rate formula:

λ_i(t-1) = L_i(t-1) / L_i(t-2)

where λ_i(t-1) represents the loss reduction rate of task neural network i at the (t-1)-th iteration, L_i(t-1) represents the loss function value of task neural network i at the (t-1)-th iteration, and L_i(t-2) represents the loss function value of task neural network i at the (t-2)-th iteration.
In this embodiment, for each task neural network, the weight of the task neural network is calculated from the loss reduction rate of that task neural network at the last iteration and the loss reduction rates of the other task neural networks in the multitask neural network at the last iteration, through a preset weight calculation formula:

ω_i(t) = n · exp(λ_i(t-1) / T) / Σ_j exp(λ_j(t-1) / T)

where ω_i(t) represents the weight of task neural network i at the t-th iteration, λ_i(t-1) represents the loss reduction rate of task neural network i at the (t-1)-th iteration, n represents the number of task neural networks, and T represents a preset threshold value.
In this embodiment, the loss function value of the multitask neural network at the current iteration is calculated according to the loss function value and the weight of each task neural network at the current iteration. Specifically, the product of the loss function value and the weight of each task neural network at the current iteration is calculated, and these products are summed to obtain the loss function value of the multitask neural network at the current iteration. For example, suppose the multitask neural network comprises task neural network A, task neural network B, and task neural network C; the loss function value of task neural network A at the current iteration is L_A with weight ω_A(t), that of task neural network B is L_B with weight ω_B(t), and that of task neural network C is L_C with weight ω_C(t). The loss function value of the multitask neural network at the current iteration is then L = ω_A(t)·L_A + ω_B(t)·L_B + ω_C(t)·L_C.
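The weight and total-loss computation can be sketched in a few lines of Python; the temperature value T and the example inputs below are illustrative assumptions, and the softmax form follows the weight formula reconstructed above.

```python
import math

def task_weights(prev_losses, prev_prev_losses, T=2.0):
    """Weights from loss reduction rates lambda_i(t-1) = L_i(t-1) / L_i(t-2)."""
    lambdas = [l1 / l2 for l1, l2 in zip(prev_losses, prev_prev_losses)]
    exps = [math.exp(lam / T) for lam in lambdas]
    n = len(lambdas)
    return [n * e / sum(exps) for e in exps]

def multitask_loss(losses, weights):
    """Weighted sum of the per-task loss function values."""
    return sum(w * l for w, l in zip(weights, losses))

# Example: three tasks A, B, C
w = task_weights([0.8, 0.5, 0.9], [1.0, 1.0, 1.0])
L = multitask_loss([0.8, 0.5, 0.9], w)
```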
S606, updating network parameters of the multitask neural network according to the loss function value of the multitask neural network at the current iteration to obtain a new multitask neural network.
In this embodiment, the network parameters of the multitask neural network are updated according to the loss function value of the multitask neural network in the current iteration, so as to obtain a new multitask neural network.
S607, determining whether a preset iteration condition is satisfied, if not, returning to perform step S602 according to the new multitask neural network, and if so, performing step S608.
Whether a preset iteration condition is met is judged; the iteration condition may be that the number of iterations reaches a set threshold, or that the loss function value of the model is smaller than a preset value.
That is, it is judged whether the number of iterations of the multitask neural network has reached the set threshold, or whether the loss function value of the multitask neural network is smaller than the preset value. If the number of iterations has not reached the set threshold and the loss function value is not smaller than the preset value, the method returns to step S602 with the new multitask neural network; otherwise, step S608 is executed.
S608, taking the current multitask neural network as the information acquisition model.
In this embodiment, if the preset iteration condition is satisfied, the current multitask neural network is used as the information acquisition model.
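Putting steps S601 to S608 together, a minimal training-loop sketch might look as follows; it reuses the MultiTaskNet and task_weights sketches given earlier, and the loss criteria, synthetic batch, and iteration threshold are stand-ins rather than the actual choices of the present application.

```python
import torch

model = MultiTaskNet()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
detect_criterion = torch.nn.MSELoss()            # stand-in detection loss
segment_criterion = torch.nn.CrossEntropyLoss()  # stand-in segmentation loss
prev_losses = prev_prev_losses = None
max_iterations = 100                             # preset iteration condition

for t in range(max_iterations):
    bev = torch.rand(4, 3, 128, 128)             # stand-in target training set
    det_target = torch.rand(4, 9, 32, 32)
    seg_target = torch.randint(0, 5, (4, 32, 32))
    det_out, seg_out = model(bev)
    losses = [detect_criterion(det_out, det_target),
              segment_criterion(seg_out, seg_target)]
    if prev_losses and prev_prev_losses:         # DWA-style task weights
        weights = task_weights(prev_losses, prev_prev_losses)
    else:
        weights = [1.0] * len(losses)            # first iterations: equal weights
    loss = sum(w * l for w, l in zip(weights, losses))  # multitask loss value
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    prev_prev_losses, prev_losses = prev_losses, [l.item() for l in losses]
```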
Optionally, in this embodiment, model evaluation may be performed on the information acquisition model, and the hyper-parameter of the information acquisition model is adjusted according to an evaluation result, so as to optimize the information acquisition model.
Optionally, the multitask neural network mentioned in this embodiment may include two task neural networks: a detection task neural network and a segmentation task neural network. The detection task neural network is used to output the four slot corner coordinates of each parking slot on the bird's-eye view and the slot attribute of each parking slot, and the segmentation task neural network is used to output the road element category corresponding to each pixel point in the bird's-eye view.
Optionally, the network structure of the detection task neural network may be a CenterNet structure, and the network structure of the segmentation task neural network may be a DeepLabV3+ structure. The coding layer of the detection task neural network and the coding layer of the segmentation task neural network serve as the shared coding layer of the multitask neural network; the structure unique to the detection task neural network (including the heatmap generation, center point regression, and corner point regression branches) serves as its decoding layer, and the structure unique to the segmentation task neural network (including the upsampling portions at different scales) serves as its decoding layer.
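To make the decoder split concrete, a CenterNet-style set of detection branches might look like the following sketch; the channel counts and the attribute head are assumptions for illustration, not the patent's concrete design.

```python
import torch.nn as nn

class DetectionDecoder(nn.Module):
    """Hypothetical CenterNet-style branches for the detection task."""
    def __init__(self, ch=64, num_attrs=2):
        super().__init__()
        self.heatmap = nn.Conv2d(ch, 1, 1)        # slot center heatmap branch
        self.center_reg = nn.Conv2d(ch, 2, 1)     # center point regression branch
        self.corner_reg = nn.Conv2d(ch, 8, 1)     # 4 corner points x (dx, dy)
        self.attrs = nn.Conv2d(ch, num_attrs, 1)  # slot attribute logits

    def forward(self, feat):
        return (self.heatmap(feat), self.center_reg(feat),
                self.corner_reg(feat), self.attrs(feat))
```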
Referring to fig. 8, the process of constructing the information acquisition model is illustrated as follows:
firstly, data acquisition is carried out, camera calibration is carried out on fisheye cameras arranged at different positions of a vehicle to be parked in advance to obtain internal and external parameters of the cameras, images of a garage are obtained through the fisheye cameras arranged at the different positions of the vehicle to be parked, image distortion removal, coordinate transformation and splicing processing are carried out on the images according to the internal and external parameters of the cameras to obtain a bird's-eye view image, a target image is obtained after data annotation is carried out on the bird's-eye view image, then a single-task neural network is constructed, namely a detection task neural network and a segmentation task neural network are constructed, each task neural network is of a coding-decoding (Encoder-Decoder) structure, angular point detection offset, library position classification accuracy, parking position detection rate and detection rate index of the detection task neural network are counted, and average cross-over ratio (mIoU) and pixel accuracy index of the segmentation task neural network are counted, and counting floating point operand and inference speed indexes of the two networks. The statistical indexes are used as baselines for comparison with a multitask network model, the multitask neural network is constructed based on a detection task neural network and a segmentation task neural network, a shared coding layer of the multitask neural network is a coding layer shared by the detection task neural network and the segmentation task neural network, and a decoding layer of each task neural network is used as a task submodule of the multitask neural network. And then, performing data enhancement, namely performing data enhancement on the target image to obtain a training set, performing model training according to the training set, namely training the multitask neural network by using the training set, performing model evaluation on the trained model to obtain a loss function value of the multitask neural network, adjusting parameters of the multitask neural network according to the loss function value, stopping training the model when the iteration number reaches a threshold value, and finally deploying the trained multitask neural network on a vehicle-mounted computing platform so as to utilize the trained multitask neural network to process the bird's-eye view image and obtain parking information.
According to the parking information acquisition method provided by this embodiment, the information acquisition model is constructed in advance based on the multitask neural network, and the acquired bird's-eye view is input into the information acquisition model to directly obtain the road element category corresponding to each pixel point in the bird's-eye view, the slot corner coordinates of each parking slot, and the slot attribute of each parking slot. Because multiple task neural network models are fused into one multitask neural network, only the information acquisition model needs to run on the in-vehicle computing platform, occupying few computing resources; the model therefore does not need to be heavily pruned, which improves the accuracy of the parking information as well as the efficiency of acquiring it.
Referring to fig. 9, after step S102, the parking information obtaining method according to the embodiment of the present application may further include the following steps:
S901, carrying out coordinate transformation on the slot corner coordinates of each parking slot in the parking information to obtain first slot corner coordinates of each parking slot.
S902, filtering the first slot corner coordinates of each parking slot to obtain second slot corner coordinates of each parking slot.
S903, planning a parking path according to the second slot corner coordinates of each parking slot, the slot attribute of each parking slot in the parking information, and the road element category corresponding to each pixel point in the bird's-eye view, to obtain the parking path.
In this embodiment, the slot corner coordinates of each parking slot in the parking information are expressed in the image coordinate system.
Therefore, coordinate transformation is performed on the slot corner coordinates of each parking slot in the parking information to obtain the first slot corner coordinates of each parking slot, transforming them from the image coordinate system into the vehicle coordinate system. The first slot corner coordinates of each parking slot are then filtered to obtain the second slot corner coordinates of each parking slot, yielding a stable output result in the vehicle coordinate system.
In this embodiment, a parking path is planned according to the second slot corner coordinates of each parking slot, the slot attribute of each parking slot in the parking information, and the road element category corresponding to each pixel point in the bird's-eye view, to obtain the parking path. Specifically, the parking path is planned through a preset path planning strategy based on these inputs.
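These two post-processing steps can be sketched as below; the homography name H_img2veh and the exponential filter are illustrative assumptions (the application does not fix a particular filtering method).

```python
import numpy as np

def image_to_vehicle(corners_px, H_img2veh):
    """Map slot corner pixels (N x 2) into vehicle-frame coordinates."""
    pts = np.hstack([corners_px, np.ones((len(corners_px), 1))])
    veh = (H_img2veh @ pts.T).T
    return veh[:, :2] / veh[:, 2:3]              # first slot corner coordinates

def filter_corners(prev, current, alpha=0.8):
    """Exponential smoothing across frames for stable corner outputs."""
    return alpha * prev + (1 - alpha) * current  # second slot corner coordinates
```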
In this embodiment, the parking path may also be sent to a display of the vehicle-mounted computing platform so that the parking path is displayed on the display.
Referring to fig. 10, an embodiment of the present application further provides a parking method, which specifically includes the following steps:
S1001, acquiring parking information of the vehicle to be parked.
In this embodiment, the parking information of the vehicle to be parked is acquired by the parking information acquisition method described above; for the specific process, reference is made to the steps shown in fig. 1 to fig. 8, which are not described herein again.
S1002, determining a parking path according to the parking information.
In this embodiment, the parking information includes: and determining a parking path according to parking information by using the road element category corresponding to each pixel point in the aerial view, the library corner point coordinates of each library position and the library position attributes of each library position.
In this embodiment, the process of determining the parking route according to the parking information is as follows:
carrying out coordinate transformation on the library corner point coordinates of each library position in the parking information to obtain first library corner point coordinates of each library position;
filtering the first library corner point coordinates of each library position to obtain second library corner point coordinates of each library position;
planning a parking path according to the second library corner point coordinates of each library position, the library position attributes of each library position in the parking information and the road element categories corresponding to each pixel point in the aerial view to obtain the parking path.
In this embodiment, coordinate transformation is performed on the library corner point coordinates to transform them from the image coordinate system into the vehicle coordinate system; the first library corner point coordinates obtained after the coordinate transformation are filtered to obtain the second library corner point coordinates; and parking path planning is performed according to the second library corner point coordinates of each library position, the library position attribute of each library position in the parking information, and the road element category corresponding to each pixel point in the bird's-eye view, so that the parking path is obtained.
S1003, controlling the vehicle to be parked to park according to the parking path.
In this embodiment, the vehicle to be parked is controlled to park according to the determined parking path, so that the vehicle to be parked is parked in the parking space.
According to the parking method provided by this embodiment, the acquired parking information is both accurate and efficiently obtained; path planning performed based on this parking information is therefore more accurate and efficient, which in turn improves the efficiency and accuracy of parking.
Referring to fig. 11, a parking process is illustrated as follows:
1. Acquire the images collected by the panoramic fisheye cameras, where the panoramic fisheye cameras are fisheye cameras arranged around the vehicle to be parked.
2. Perform camera occlusion detection and classification on the images, and filter out the images that do not meet the requirements.
3. Preprocess the images, including distortion removal processing and coordinate transformation processing.
4. Splice the preprocessed images to generate a bird's-eye view (BEV).
5. Input the bird's-eye view into the trained multitask neural network: image features are extracted through the backbone network (the BEV road information detection and segmentation backbone), and the image features are processed by the two subtask network modules (the library position detection and library position classification task, and the road segmentation task) to obtain, in the image coordinate system, the library corner point coordinates of each library position, the library position attribute of each library position and the road element category corresponding to each pixel point in the aerial view.
6. Perform coordinate transformation and time-sequence tracking on the library corner point coordinates of each library position in the image coordinate system to obtain library corner point coordinates in the vehicle coordinate system, and perform tracking filtering by combining continuous frames to obtain a stable output in the vehicle coordinate system (a filtering sketch follows this list).
7. Output the result in the vehicle coordinate system to the fusion, decision and planning control module, which fuses it with the ultrasonic radar sensing result and performs path planning and control; the result is also output to the cockpit domain controller in the vehicle to complete display and interaction functions such as human-machine interaction.
8. During execution, the scene screening and data return module monitors the whole process; when a trigger signal is received, it records the data before and after the trigger and uploads the data to the server through the network for analysis and learning.
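As a stand-in for the tracking filtering of steps 6 and 7 above (the embodiment does not name a specific filter), the sketch below applies a scalar constant-position Kalman filter to one corner coordinate over consecutive frames; the noise parameters are assumptions.

```python
# Illustrative per-coordinate tracking filter for continuous frames.
# A constant-position Kalman filter with assumed noise parameters.
class ScalarKalman:
    def __init__(self, q=1e-3, r=1e-2):
        self.q, self.r = q, r        # process / measurement noise (assumed)
        self.x, self.p = None, 1.0   # state estimate and its variance
    def update(self, z):
        if self.x is None:           # initialise on the first measurement
            self.x = z
            return self.x
        self.p += self.q             # predict (state assumed constant)
        k = self.p / (self.p + self.r)   # Kalman gain
        self.x += k * (z - self.x)       # correct with the new measurement
        self.p *= (1 - k)
        return self.x

kf = ScalarKalman()
for z in [1.02, 0.98, 1.05, 1.01]:   # noisy corner x-coordinate per frame
    smoothed = kf.update(z)
```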
It should be noted that while operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. Under certain circumstances, multitasking and parallel processing may be advantageous.
It should be understood that the various steps recited in the method embodiments disclosed herein may be performed in a different order and/or performed in parallel. Moreover, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the disclosure is not limited in this respect.
Corresponding to the parking information acquisition method shown in fig. 1, an embodiment of the present application further provides a parking information acquisition apparatus configured to implement the method shown in fig. 1. A schematic structural diagram of the parking information acquisition apparatus is shown in fig. 12, and the apparatus specifically includes:
a first obtaining unit 1201, configured to obtain a bird's eye view of a position where a vehicle to be parked is located;
the input unit 1202 is configured to input the aerial view into a pre-constructed information acquisition model to obtain parking information; the information acquisition model is pre-established based on a multitask neural network, the multitask neural network comprises a shared coding layer and a plurality of task neural networks, and the shared coding layer is shared by the plurality of task neural networks.
The parking information obtaining device provided by this embodiment builds an information obtaining model based on a multitask neural network in advance, inputs the obtained aerial view into the information obtaining model, and can directly obtain parking information.
In an embodiment of the present application, based on the foregoing scheme, the apparatus may further include:
a third obtaining unit, configured to obtain a training image set; the training image set comprises a plurality of training images, each training image is a bird's-eye view image carrying annotation information, and the annotation information comprises: the road element category corresponding to each pixel point in the aerial view, the library corner point coordinates of each library position and the library position attribute of each library position;
the data enhancement unit is used for carrying out data enhancement processing on each training image in the training image set to obtain a first training image of each training image;
the composition unit is used for composing each training image and each first training image into a target image set;
and the training unit is used for training the multitask neural network according to each image in the target image set to obtain an information acquisition model.
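As one concrete (assumed) example of what the data enhancement unit might do, the sketch below horizontally flips a BEV training image and remaps its library corner point annotations so that the image and its labels stay consistent; the segmentation labels would be flipped in the same way. The flip is only one possible enhancement, chosen here for illustration.

```python
# Minimal sketch of one data enhancement: horizontal flip of a BEV training
# image with consistent remapping of the corner point annotations.
import numpy as np

def hflip_with_labels(bev, corners):
    """bev: (H, W, 3) image; corners: (N, 2) pixel coords as (x, y)."""
    flipped = bev[:, ::-1].copy()              # mirror the image left-right
    w = bev.shape[1]
    new_corners = corners.copy()
    new_corners[:, 0] = w - 1 - corners[:, 0]  # mirror x coordinates to match
    return flipped, new_corners                # the first training image + labels
```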
In an embodiment of the application, based on the foregoing scheme, the training unit, when training the multitask neural network according to each image in the target image set to obtain the information acquisition model, is specifically configured to:
determining a plurality of training sets according to each image in the target image set; wherein each training set comprises at least one image in the target image set;
selecting a training set from a plurality of training sets as a target training set;
inputting each image in the target training set into a shared coding layer of a multitask neural network to obtain first result data of the target training set; the multitask neural network comprises a plurality of task neural networks, and the shared coding layer is a coding layer shared by each task neural network in the multitask neural network;
inputting the first result data into the decoding layer of each task neural network respectively, and processing the first result data through each network layer in the decoding layer of each task neural network to obtain the second result data of the target training set output by the decoding layer of each task neural network; for the decoding layer of each task neural network, the input data of the Nth network layer of the decoding layer is the result of fusing the output result of the (N-1)th network layer of this decoding layer with the calculation result of the (N-1)th network layer in the decoding layer of each other task neural network, where that calculation result is the result of performing a first calculation on the output result of the (N-1)th network layer in the decoding layer of the other task neural network, and N is a positive integer greater than 1 (a sketch of this exchange follows this description);
calculating a loss function value of the multitask neural network in the current iteration according to each second result data and the labeling information of each image in the target training set;
and updating the network parameters of the multitask neural network according to the loss function value of the multitask neural network in the current iteration to obtain a new multitask neural network; when a preset iteration condition is not met, returning to the step of selecting one training set from the plurality of training sets as the target training set, until the iteration condition is met, and taking the current multitask neural network as the information acquisition model.
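The decoder exchange described above can be sketched as follows. The use of a 1×1 convolution as the "first calculation" and element-wise addition as the fusion, along with all layer counts and channel sizes, are assumptions, since the embodiment does not fix these operations.

```python
# Sketch of two task decoders that exchange features layer by layer: the
# input of layer N is a fusion of this decoder's layer N-1 output with a
# transformed (1x1 conv, assumed) copy of the other decoder's layer N-1
# output. Layer counts and channel sizes are illustrative.
import torch
import torch.nn as nn

class FusedDecoders(nn.Module):
    def __init__(self, ch=64, num_layers=3):
        super().__init__()
        mk = lambda: nn.ModuleList(
            [nn.Sequential(nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True))
             for _ in range(num_layers)])
        self.dec_a, self.dec_b = mk(), mk()   # the two task decoders
        # "first calculation" applied to the other task's previous output
        self.xfer_a = nn.ModuleList([nn.Conv2d(ch, ch, 1) for _ in range(num_layers - 1)])
        self.xfer_b = nn.ModuleList([nn.Conv2d(ch, ch, 1) for _ in range(num_layers - 1)])

    def forward(self, shared_feat):
        a = self.dec_a[0](shared_feat)        # layer 1 sees the shared encoding
        b = self.dec_b[0](shared_feat)
        for n in range(1, len(self.dec_a)):
            # fuse own N-1 output with the transformed other-task N-1 output
            a_in = a + self.xfer_b[n - 1](b)
            b_in = b + self.xfer_a[n - 1](a)
            a, b = self.dec_a[n](a_in), self.dec_b[n](b_in)
        return a, b                           # per-task second result data

feats = torch.randn(1, 64, 64, 64)            # first result data (shared encoding)
out_a, out_b = FusedDecoders()(feats)
```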
In an embodiment of the application, based on the foregoing solution, the training unit, when calculating the loss function value of the multitask neural network at the current iteration according to each piece of second result data and the annotation information of each image in the target training set, is specifically configured to:
calculating a loss function value of each task neural network in the current iteration according to each second result data and the labeling information of each image in the target training set;
obtaining the loss reduction rate of each task neural network in the last iteration;
for each task neural network, calculating the weight of the task neural network in the current iteration according to the loss reduction rates of all the task neural networks in the last iteration;
and calculating the loss function value of the multi-task neural network in the current iteration according to the loss function value and the weight of each task neural network in the current iteration.
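This weighting scheme is in the spirit of dynamic weight averaging: the sketch below derives each task's weight for the current iteration from its loss reduction rate over the last two iterations. The softmax-with-temperature form and the temperature value are assumptions; the embodiment only states that the weights are computed from the reduction rates.

```python
# Sketch of loss-descent-rate task weighting (dynamic-weight-averaging
# style). Temperature and normalisation are assumptions.
import math

def task_weights(prev_losses, prev_prev_losses, temperature=2.0):
    """One weight per task; tasks whose loss descends more slowly get
    larger weights, so no task dominates the shared encoder's training."""
    rates = [lp / lpp for lp, lpp in zip(prev_losses, prev_prev_losses)]
    exps = [math.exp(r / temperature) for r in rates]
    n = len(rates)
    return [n * e / sum(exps) for e in exps]   # weights sum to the task count

w_det, w_seg = task_weights(prev_losses=[0.40, 0.90],
                            prev_prev_losses=[0.50, 0.95])
total_loss = lambda l_det, l_seg: w_det * l_det + w_seg * l_seg
```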
In an embodiment of the present application, based on the foregoing solution, the first obtaining unit 1201, when obtaining the bird's-eye view of the location where the vehicle to be parked is located, is specifically configured to:
acquiring images acquired by a plurality of image acquisition devices respectively installed at different parts of a vehicle to be parked;
and integrating the multiple images acquired by the multiple image acquisition devices to obtain the aerial view of the vehicle to be parked.
In an embodiment of the application, based on the foregoing scheme, the image acquisition device is a fisheye camera.
In an embodiment of the present application, based on the foregoing scheme, the first obtaining unit 1201 is further configured to:
and carrying out image preprocessing on each image acquired by the plurality of image acquisition devices.
In an embodiment of the present application, based on the foregoing scheme, the first obtaining unit 1201, when performing image preprocessing on each image collected by the plurality of image acquisition devices, is specifically configured to:
carrying out distortion removal processing on each image acquired by a plurality of image acquisition devices;
acquiring equipment parameters of each image acquisition device;
and for each image subjected to distortion removal processing, carrying out coordinate transformation on the image by using the equipment parameters corresponding to the image.
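As an illustration of this preprocessing, the sketch below uses OpenCV's fisheye module to remove distortion with the device parameters and then applies a ground-plane homography as the coordinate transformation. The intrinsics, distortion coefficients and homography shown are placeholders; in practice they come from the camera calibration.

```python
# Illustrative fisheye preprocessing: distortion removal with the device
# parameters, then a homography onto the ground plane for BEV stitching.
# K, D and H below are placeholder values, not calibrated parameters.
import cv2
import numpy as np

K = np.array([[400.0, 0.0, 640.0],
              [0.0, 400.0, 360.0],
              [0.0, 0.0, 1.0]])            # assumed camera intrinsics
D = np.array([0.1, -0.05, 0.01, 0.0])      # assumed fisheye distortion coeffs

def preprocess(img, H, bev_size=(512, 512)):
    h, w = img.shape[:2]
    # distortion removal using the device parameters
    map1, map2 = cv2.fisheye.initUndistortRectifyMap(
        K, D, np.eye(3), K, (w, h), cv2.CV_16SC2)
    undistorted = cv2.remap(img, map1, map2, interpolation=cv2.INTER_LINEAR)
    # coordinate transformation onto the ground plane
    return cv2.warpPerspective(undistorted, H, bev_size)
```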
In one embodiment of the present application, the parking information includes: the road element category corresponding to each pixel point in the aerial view, the library corner point coordinates of each library position, and the library position attribute of each library position can be further configured as follows based on the scheme:
the coordinate transformation unit is used for carrying out coordinate transformation on the library corner point coordinates of each library position in the parking information to obtain first library corner point coordinates of each library position;
the filtering unit is used for filtering the first library corner point coordinates of each library position to obtain second library corner point coordinates of each library position;
and the planning unit is used for planning a parking path according to the second library corner point coordinates of each library position, the library position attributes of each library position in the parking information and the road element types corresponding to each pixel point in the aerial view to obtain the parking path.
Corresponding to the parking method described in fig. 10, an embodiment of the present application further provides a parking apparatus configured to implement the method in fig. 10. A schematic structural diagram of the parking apparatus is shown in fig. 13, and the apparatus specifically includes:
a second obtaining unit 1301, configured to obtain parking information of a vehicle to be parked; an embodiment of the method for obtaining the parking information is described with reference to fig. 1 to fig. 8;
a determining unit 1302, configured to determine a parking path according to the parking information;
and the control unit 1303 is used for controlling the vehicle to be parked to park according to the parking path.
According to the parking device provided by this embodiment, the acquired parking information is both accurate and efficiently obtained, so that path planning performed based on the parking information is more accurate and efficient, which in turn improves the efficiency and accuracy of parking.
In one embodiment of the present application, the parking information includes: the road element category corresponding to each pixel point in the aerial view, the library corner point coordinates of each library position, and the library position attribute of each library position; based on the above scheme, the determining unit 1302, when determining the parking path according to the parking information, is specifically configured to:
carrying out coordinate transformation on the library corner point coordinates of each library position in the parking information to obtain first library corner point coordinates of each library position;
filtering the first library corner point coordinates of each library position to obtain second library corner point coordinates of each library position;
planning a parking path according to the second library corner point coordinates of each library position, the library position attributes of each library position in the parking information and the road element categories corresponding to each pixel point in the aerial view to obtain the parking path.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.
While several specific implementation details are included in the above discussion, these should not be construed as limitations on the scope of the disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.
The foregoing description covers only the preferred embodiments of the present disclosure and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the disclosure is not limited to the particular combinations of features described above, but also encompasses other technical solutions formed by any combination of the above features or their equivalents without departing from the spirit of the disclosure, for example, technical solutions formed by replacing the above features with (but not limited to) technical features having similar functions disclosed in the present disclosure.