Three-dimensional reconstruction method and device, electronic equipment and storage medium

Document No. 9446 · Published: 2021-09-17

1. A method of three-dimensional reconstruction, comprising:

acquiring an image sequence of an object to be reconstructed, wherein the image sequence is a sequence of consecutive image frames obtained by a monocular image capture device capturing images of the object to be reconstructed;

for an image to be processed in the image sequence, extracting depth information of the image to be processed;

estimating translation pose information of the image to be processed according to world coordinate information of each feature point in a reference image, image coordinate information of each feature point in the image to be processed, and rotation pose information of the image to be processed, to obtain the translation pose information of the image to be processed, wherein the reference image is the adjacent image whose acquisition time point in the image sequence precedes that of the image to be processed;

generating a point cloud map according to the depth information, the rotation pose information, and the translation pose information of each frame of image in the image sequence;

and performing three-dimensional reconstruction on the object to be reconstructed according to the point cloud map.

2. The three-dimensional reconstruction method according to claim 1, wherein before the estimating of the translation pose information of the image to be processed according to the world coordinate information of each feature point in the reference image, the image coordinate information of each feature point in the image to be processed, and the rotation pose information of the image to be processed to obtain the translation pose information of the image to be processed, the method further comprises:

acquiring inertial measurement information of the image capture device at the time the image to be processed is captured, wherein the inertial measurement information comprises the rotation pose information.

3. The three-dimensional reconstruction method according to claim 1, wherein the estimating of the translation pose information of the image to be processed according to the world coordinate information of each feature point in the reference image, the image coordinate information of each feature point in the image to be processed, and the rotation pose information of the image to be processed to obtain the translation pose information of the image to be processed comprises:

acquiring world coordinate information of each feature point in the reference image;

performing optical flow tracking on each feature point in the reference image, and determining image coordinate information of each feature point in the image to be processed;

and constructing a system of equations in which the translation pose information of the image to be processed is the variable, the world coordinate information of each feature point in the reference image, the image coordinate information of each feature point in the image to be processed, and the rotation pose information of the image to be processed are the parameters, and a six-degree-of-freedom pose constraint is the condition, and solving the system to obtain the translation pose information of the image to be processed.

4. The three-dimensional reconstruction method according to claim 1, wherein the generating of a point cloud map according to the depth information, the rotation pose information, and the translation pose information of each frame of image in the image sequence comprises:

for each image to be processed in the image sequence, determining image capture device position information corresponding to the image to be processed according to the rotation pose information of the image to be processed, the translation pose information of the image to be processed, and the image capture device position information corresponding to the first frame of image in the image sequence;

determining world coordinate information of each pixel in the image to be processed according to the image capture device position information corresponding to the image to be processed and the depth information of the image to be processed;

and generating the point cloud map according to the world coordinate information of each pixel in each frame of image.

5. The three-dimensional reconstruction method according to claim 1, wherein the performing of the three-dimensional reconstruction on the object to be reconstructed according to the point cloud map comprises:

performing spatial grid division on the point cloud map to obtain voxel blocks;

for each pixel of each frame of image in the image sequence, performing ray casting on the point cloud map with the pixel as a starting point, and determining the voxel blocks through which the ray passes;

determining each isosurface and its corresponding position information according to the voxel blocks traversed by the rays cast from the pixels, wherein the truncated signed distance function (TSDF) value of each voxel block within an isosurface is the same, and the TSDF value of a voxel block is determined according to the length of the ray from the pixel to the voxel block;

and rendering a three-dimensional model of the object to be reconstructed according to each isosurface and its corresponding position information.

6. The three-dimensional reconstruction method according to claim 5, wherein after the performing of ray casting on the point cloud map with each pixel of each frame of image in the image sequence as a starting point and the determining of the voxel blocks through which the ray passes, the method further comprises:

for each voxel block traversed by a ray cast from a pixel, determining a hash value corresponding to the voxel block according to spatial position information of the voxel block;

searching a hash table according to the hash value corresponding to the voxel block to determine a target storage area of the voxel block, wherein the hash table stores a mapping between hash values and storage areas;

and searching for the voxel block in the target storage area.

7. A three-dimensional reconstruction apparatus, comprising:

an acquisition module configured to acquire an image sequence of an object to be reconstructed, wherein the image sequence is a sequence of consecutive image frames obtained by a monocular image capture device capturing images of the object to be reconstructed;

an extraction module configured to extract depth information of an image to be processed in the image sequence;

a determining module configured to estimate translation pose information of the image to be processed according to world coordinate information of each feature point in a reference image, image coordinate information of each feature point in the image to be processed, and rotation pose information of the image to be processed, to obtain the translation pose information of the image to be processed, wherein the reference image is the adjacent image whose acquisition time point in the image sequence precedes that of the image to be processed;

a generation module configured to generate a point cloud map according to the depth information, the rotation pose information, and the translation pose information of each frame of image in the image sequence;

and a reconstruction module configured to perform three-dimensional reconstruction on the object to be reconstructed according to the point cloud map.

8. An electronic device, comprising:

a processor;

a memory for storing instructions executable by the processor;

wherein the processor is configured to execute the instructions to implement the three-dimensional reconstruction method of any one of claims 1 to 6.

9. A computer-readable storage medium having stored therein instructions which, when executed by a processor of an electronic device, enable the electronic device to perform the three-dimensional reconstruction method of any one of claims 1 to 6.

10. A computer program product comprising a computer program which, when executed by a processor of an electronic device, enables the electronic device to perform the three-dimensional reconstruction method of any one of claims 1 to 6.

Background

Currently, when three-dimensional reconstruction is performed on a mobile terminal, a depth sensor such as a depth camera must be installed on the mobile terminal to obtain depth information of an image; pose information of the mobile terminal is then determined by combining the image with its depth information; and three-dimensional reconstruction of the object in the image is finally performed by combining the pose information of the mobile terminal, the image, and the depth information of the image.

In this approach, a depth sensor must be installed on the mobile terminal, which results in high cost and poor adaptability; moreover, the measuring range of the depth sensor is limited, which results in poor scalability.

Disclosure of Invention

The present disclosure provides a three-dimensional reconstruction method, an apparatus, an electronic device, and a storage medium, to at least solve the problems in the related art of high cost, poor adaptability, and poor scalability caused by the need to install a depth sensor on a mobile terminal.

The technical solution of the present disclosure is as follows:

acquiring an image sequence of an object to be reconstructed, wherein the image sequence is a sequence of consecutive image frames obtained by a monocular image capture device capturing images of the object to be reconstructed; for an image to be processed in the image sequence, extracting depth information of the image to be processed; estimating translation pose information of the image to be processed according to world coordinate information of each feature point in a reference image, image coordinate information of each feature point in the image to be processed, and rotation pose information of the image to be processed, to obtain the translation pose information of the image to be processed, wherein the reference image is the adjacent image whose acquisition time point in the image sequence precedes that of the image to be processed; generating a point cloud map according to the depth information, the rotation pose information, and the translation pose information of each frame of image in the image sequence; and performing three-dimensional reconstruction on the object to be reconstructed according to the point cloud map.

As a first possible implementation of the embodiments of the present disclosure, before the estimating of the translation pose information of the image to be processed according to the world coordinate information of each feature point in the reference image, the image coordinate information of each feature point in the image to be processed, and the rotation pose information of the image to be processed, the method further includes: acquiring inertial measurement information of the image capture device at the time the image to be processed is captured, wherein the inertial measurement information comprises the rotation pose information.

As a second possible implementation of the embodiments of the present disclosure, the estimating of the translation pose information of the image to be processed includes: acquiring world coordinate information of each feature point in the reference image; performing optical flow tracking on each feature point in the reference image to determine image coordinate information of each feature point in the image to be processed; and constructing a system of equations in which the translation pose information of the image to be processed is the variable, the world coordinate information of each feature point in the reference image, the image coordinate information of each feature point in the image to be processed, and the rotation pose information of the image to be processed are the parameters, and a six-degree-of-freedom pose constraint is the condition, and solving the system to obtain the translation pose information of the image to be processed.

As a third possible implementation of the embodiments of the present disclosure, the generating of a point cloud map according to the depth information, the rotation pose information, and the translation pose information of each frame of image in the image sequence includes: for each image to be processed in the image sequence, determining image capture device position information corresponding to the image to be processed according to the rotation pose information of the image to be processed, the translation pose information of the image to be processed, and the image capture device position information corresponding to the first frame of image in the image sequence; determining world coordinate information of each pixel in the image to be processed according to the image capture device position information corresponding to the image to be processed and the depth information of the image to be processed; and generating the point cloud map according to the world coordinate information of each pixel in each frame of image.
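The point cloud generation described above amounts to back-projecting each frame's pixels into world coordinates. Below is a minimal illustrative sketch, not the disclosed implementation: it assumes a pinhole intrinsic matrix `K`, a camera-to-world rotation `R` and translation `t` derived from the frame's pose information, and a per-pixel depth map; all names are hypothetical.

```python
import numpy as np

def frame_to_world_points(depth, K, R, t):
    """Back-project one frame into world-coordinate points.

    depth: (H, W) per-pixel depth map; K: 3x3 intrinsics;
    R, t: camera-to-world rotation and translation.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    # Homogeneous pixel coordinates, one column per pixel (3 x N).
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T
    rays = np.linalg.inv(K) @ pix            # unit-depth camera rays
    cam = rays * depth.reshape(1, -1)        # scale each ray by its depth
    world = R @ cam + t.reshape(3, 1)        # camera frame -> world frame
    return world.T                           # N x 3 world points
```

Concatenating the points from all frames would then yield the point cloud map.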

As a fourth possible implementation of the embodiments of the present disclosure, the three-dimensional reconstruction of the object to be reconstructed according to the point cloud map includes: performing spatial grid division on the point cloud map to obtain voxel blocks; for each pixel of each frame of image in the image sequence, performing ray casting on the point cloud map with the pixel as a starting point, and determining the voxel blocks through which the ray passes; determining each isosurface and its corresponding position information according to the voxel blocks traversed by the rays cast from the pixels, wherein the truncated signed distance function (TSDF) value of each voxel block within an isosurface is the same, and the TSDF value of a voxel block is determined according to the length of the ray from the pixel to the voxel block; and rendering a three-dimensional model of the object to be reconstructed according to each isosurface and its corresponding position information.
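The TSDF values mentioned above can be illustrated with a minimal sketch. The clamped signed distance and the running-average fusion below are standard TSDF conventions assumed for illustration; the truncation distance and the weighting scheme are not specified by the source.

```python
import numpy as np

def tsdf_value(voxel_depth, surface_depth, trunc=0.1):
    """Signed distance from a voxel to the observed surface along a ray,
    truncated to [-trunc, trunc]. Positive means in front of the surface."""
    return float(np.clip(surface_depth - voxel_depth, -trunc, trunc))

def fuse(old_value, old_weight, new_value, new_weight=1.0):
    """Weighted running average fusing a new observation into a voxel."""
    w = old_weight + new_weight
    return (old_value * old_weight + new_value * new_weight) / w, w
```

An isosurface (typically the zero-level set) is then extracted from the fused voxel grid, e.g. by marching cubes.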

As a fifth possible implementation of the embodiments of the present disclosure, after the ray casting and the determining of the voxel blocks through which the rays pass, the method further includes: for each voxel block traversed by a ray cast from a pixel, determining a hash value corresponding to the voxel block according to spatial position information of the voxel block; searching a hash table according to the hash value corresponding to the voxel block to determine a target storage area of the voxel block, wherein the hash table stores a mapping between hash values and storage areas; and searching for the voxel block in the target storage area.
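The hash-table lookup described above can be sketched as follows, in the spirit of common voxel-hashing schemes; the prime constants, the table size, and the bucket layout are assumptions for illustration, not values from the source.

```python
# Hypothetical spatial hash for voxel blocks: three large primes XOR-mixed
# over the integer block coordinates, modulo the table size.
P1, P2, P3, TABLE_SIZE = 73856093, 19349669, 83492791, 1 << 20

def block_hash(bx, by, bz):
    return ((bx * P1) ^ (by * P2) ^ (bz * P3)) % TABLE_SIZE

# hash value -> "storage area": a bucket of (block coordinates, block data),
# so colliding blocks can still be told apart by their coordinates.
table = {}

def find_block(bx, by, bz):
    for coords, data in table.get(block_hash(bx, by, bz), []):
        if coords == (bx, by, bz):
            return data
    return None
```

Only blocks actually traversed by rays are ever allocated, which keeps memory proportional to the observed surface rather than the full volume.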

According to a second aspect of the embodiments of the present disclosure, there is provided a three-dimensional reconstruction apparatus including: an acquisition module configured to acquire an image sequence of an object to be reconstructed, wherein the image sequence is a sequence of consecutive image frames obtained by a monocular image capture device capturing images of the object to be reconstructed; an extraction module configured to extract depth information of an image to be processed in the image sequence; a determining module configured to estimate translation pose information of the image to be processed according to world coordinate information of each feature point in a reference image, image coordinate information of each feature point in the image to be processed, and rotation pose information of the image to be processed, to obtain the translation pose information of the image to be processed, wherein the reference image is the adjacent image whose acquisition time point in the image sequence precedes that of the image to be processed; a generation module configured to generate a point cloud map according to the depth information, the rotation pose information, and the translation pose information of each frame of image in the image sequence; and a reconstruction module configured to perform three-dimensional reconstruction on the object to be reconstructed according to the point cloud map.

As a first possible implementation of the embodiments of the present disclosure, the acquisition module is further configured to acquire inertial measurement information of the image capture device at the time the image to be processed is captured, wherein the inertial measurement information includes the rotation pose information.

As a second possible implementation of the embodiments of the present disclosure, the determining module is specifically configured to: acquire world coordinate information of each feature point in the reference image; perform optical flow tracking on each feature point in the reference image to determine image coordinate information of each feature point in the image to be processed; and construct a system of equations in which the translation pose information of the image to be processed is the variable, the world coordinate information of each feature point in the reference image, the image coordinate information of each feature point in the image to be processed, and the rotation pose information of the image to be processed are the parameters, and a six-degree-of-freedom pose constraint is the condition, and solve the system to obtain the translation pose information of the image to be processed.

As a third possible implementation of the embodiments of the present disclosure, the generation module is specifically configured to: for each image to be processed in the image sequence, determine image capture device position information corresponding to the image to be processed according to the rotation pose information of the image to be processed, the translation pose information of the image to be processed, and the image capture device position information corresponding to the first frame of image in the image sequence; determine world coordinate information of each pixel in the image to be processed according to the image capture device position information corresponding to the image to be processed and the depth information of the image to be processed; and generate the point cloud map according to the world coordinate information of each pixel in each frame of image.

As a fourth possible implementation of the embodiments of the present disclosure, the reconstruction module is specifically configured to: perform spatial grid division on the point cloud map to obtain voxel blocks; for each pixel of each frame of image in the image sequence, perform ray casting on the point cloud map with the pixel as a starting point, and determine the voxel blocks through which the ray passes; determine each isosurface and its corresponding position information according to the voxel blocks traversed by the rays cast from the pixels, wherein the truncated signed distance function (TSDF) value of each voxel block within an isosurface is the same, and the TSDF value of a voxel block is determined according to the length of the ray from the pixel to the voxel block; and render a three-dimensional model of the object to be reconstructed according to each isosurface and its corresponding position information.

As a fifth possible implementation of the embodiments of the present disclosure, the reconstruction module is further configured to: for each voxel block traversed by a ray cast from a pixel, determine a hash value corresponding to the voxel block according to spatial position information of the voxel block; search a hash table according to the hash value corresponding to the voxel block to determine a target storage area of the voxel block, wherein the hash table stores a mapping between hash values and storage areas; and search for the voxel block in the target storage area.

According to a third aspect of the embodiments of the present disclosure, there is provided an electronic device including: a processor; and a memory for storing instructions executable by the processor; wherein the processor is configured to execute the instructions to implement the three-dimensional reconstruction method set forth in the embodiments of the first aspect of the present disclosure.

According to a fourth aspect of the embodiments of the present disclosure, there is provided a computer-readable storage medium having stored therein instructions which, when executed by a processor of an electronic device, enable the electronic device to perform the three-dimensional reconstruction method set forth in the first aspect of the present disclosure.

According to a fifth aspect of the embodiments of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor of an electronic device, enables the electronic device to perform the three-dimensional reconstruction method set forth in the embodiments of the first aspect of the present disclosure.

The technical solutions provided by the embodiments of the present disclosure produce at least the following beneficial effects:

An image sequence of an object to be reconstructed is acquired, wherein the image sequence is a sequence of consecutive image frames obtained by a monocular image capture device capturing images of the object to be reconstructed; for an image to be processed in the image sequence, depth information of the image to be processed is extracted; translation pose information of the image to be processed is estimated according to world coordinate information of each feature point in a reference image, image coordinate information of each feature point in the image to be processed, and rotation pose information of the image to be processed, wherein the reference image is the adjacent image whose acquisition time point in the image sequence precedes that of the image to be processed; a point cloud map is generated according to the depth information, the rotation pose information, and the translation pose information of each frame of image in the image sequence; and three-dimensional reconstruction of the object to be reconstructed is performed according to the point cloud map. Three-dimensional reconstruction is thereby achieved with only a monocular image capture device and without a depth sensor, which overcomes the limited measuring range of depth sensors, reduces cost, and provides good adaptability and scalability.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure and are not to be construed as limiting the disclosure.

FIG. 1 is a flow chart of a three-dimensional reconstruction method shown in accordance with an exemplary embodiment;

FIG. 2 is a schematic diagram of the structure of a depth network model;

FIG. 3 is a flow chart illustrating another method of three-dimensional reconstruction in accordance with an exemplary embodiment;

FIG. 4 is a schematic diagram of 15 basic modes;

FIG. 5 is a block diagram of a three-dimensional reconstruction apparatus shown in accordance with an exemplary embodiment;

FIG. 6 is a block diagram illustrating an electronic device for three-dimensional reconstruction in accordance with an exemplary embodiment.

Detailed Description

In order to make the technical solutions of the present disclosure better understood by those of ordinary skill in the art, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.

It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in sequences other than those illustrated or otherwise described herein. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.

Currently, when three-dimensional reconstruction is performed on a mobile terminal, a depth sensor such as a depth camera must be installed on the mobile terminal to obtain depth information of an image; pose information of the mobile terminal is then determined by combining the image with its depth information; and three-dimensional reconstruction of the object in the image is finally performed by combining the pose information of the mobile terminal, the image, and the depth information of the image. In this approach, a depth sensor must be installed on the mobile terminal, which results in high cost and poor adaptability; moreover, the measuring range of the depth sensor is limited, which results in poor scalability.

The three-dimensional reconstruction method of the present disclosure is mainly intended to solve the problems in the related art of high cost, poor adaptability, and poor scalability caused by the need to install a depth sensor on a mobile terminal. The three-dimensional reconstruction method of the embodiments of the present disclosure acquires an image sequence of an object to be reconstructed, wherein the image sequence is a sequence of consecutive image frames obtained by a monocular image capture device capturing images of the object to be reconstructed; extracts, for an image to be processed in the image sequence, depth information of the image to be processed; estimates translation pose information of the image to be processed according to world coordinate information of each feature point in a reference image, image coordinate information of each feature point in the image to be processed, and rotation pose information of the image to be processed, wherein the reference image is the adjacent image whose acquisition time point in the image sequence precedes that of the image to be processed; generates a point cloud map according to the depth information, the rotation pose information, and the translation pose information of each frame of image in the image sequence; and performs three-dimensional reconstruction on the object to be reconstructed according to the point cloud map. Three-dimensional reconstruction is thereby achieved with only a monocular image capture device and without a depth sensor, which overcomes the limited measuring range of depth sensors, reduces cost, and provides good adaptability and scalability.

The three-dimensional reconstruction method provided by the embodiment of the present disclosure is described in detail below with reference to the accompanying drawings.

Fig. 1 is a flow chart of a three-dimensional reconstruction method shown in accordance with an exemplary embodiment.

It should be noted that the execution subject of the three-dimensional reconstruction method according to the embodiments of the present disclosure may be a three-dimensional reconstruction apparatus, which may be configured in an electronic device that has no depth sensor installed, so as to perform three-dimensional reconstruction of an object to be reconstructed on the electronic device.

The electronic device may be any stationary or mobile computing device capable of data processing, for example, a mobile computing device such as a notebook computer or a wearable device, a stationary computing device such as a desktop computer, or another type of computing device, which is not limited in this disclosure.

As shown in fig. 1, the three-dimensional reconstruction method may include the following steps 101-105.

In step 101, an image sequence of an object to be reconstructed is acquired, where the image sequence is a sequence of consecutive image frames obtained by a monocular image capture device capturing images of the object to be reconstructed.

The object to be reconstructed may be, for example, any item or any spatial region. The image sequence consists of consecutive image frames obtained by the monocular image capture device capturing the object to be reconstructed from various angles. The monocular image capture device may be, for example, a single camera on a mobile computing device.

In step 102, for an image to be processed in the image sequence, depth information of the image to be processed is extracted.

The three-dimensional reconstruction apparatus may extract the depth information of the image to be processed by inputting the image to be processed into a preset depth network model that outputs the depth information of the image. The structure of the depth network model may be as shown in FIG. 2 and may include a feature extraction module, a feature fusion module, and a prediction module. The feature extraction module extracts features of the image from low level to high level; the feature fusion module gradually restores the image resolution while reducing the number of channels, fusing the high-level and low-level features extracted by the feature extraction module to obtain fused features; and the prediction module predicts a depth value for each pixel in the image based on that pixel's fused feature.
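The fusion step can be pictured with a toy sketch: a low-resolution high-level feature map is upsampled back to the low-level map's resolution and the two are concatenated along the channel axis. The shapes, the nearest-neighbour upsampling, and the channel-concatenation fusion are illustrative assumptions about the model in FIG. 2, not its disclosed architecture.

```python
import numpy as np

def fuse_features(low_level, high_level):
    """Fuse a (C1, H, W) low-level map with a (C2, H//2, W//2) high-level map.

    The high-level map is upsampled 2x by nearest-neighbour repetition and
    concatenated with the low-level map, giving a (C1 + C2, H, W) result.
    """
    up = high_level.repeat(2, axis=1).repeat(2, axis=2)  # nearest-neighbour 2x
    return np.concatenate([low_level, up], axis=0)
```

A real model would apply learned convolutions after such a concatenation; this sketch only shows the shape bookkeeping of the fusion.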

The depth network model may be trained by obtaining training data, where the training data includes sample images and their corresponding depth information, and training an initial depth network model with the sample images and corresponding depth information to obtain the trained depth network model.

In step 103, translation pose information of the image to be processed is estimated according to the world coordinate information of each feature point in a reference image, the image coordinate information of each feature point in the image to be processed, and the rotation pose information of the image to be processed. The reference image is the adjacent image whose acquisition time point in the image sequence precedes that of the image to be processed.

In an exemplary embodiment, an Inertial Measurement Unit (IMU) may be mounted on the monocular image collector to measure inertial measurement information of the image collector in real time, so that the inertial measurement information at the moment the image collector captures the image to be processed can be obtained. The inertial measurement information may include the rotation pose information. It should be noted that the rotation pose information is the angular offset of a first pose of the image collector relative to a second pose, where the first pose is the pose when the image collector captures the image to be processed, and the second pose is the pose when the image collector captures the first frame image in the image sequence.

In this embodiment, because the inertial measurement unit is mounted on the monocular image collector, the rotation pose information generated when the image collector captures the image to be processed can be acquired directly. Combining this rotation pose information makes the translation pose information easier to determine, improving both the speed and the accuracy of that determination.

In an exemplary embodiment, since the position of the object to be reconstructed is fixed and does not move as the image collector moves, the world coordinate information of the object to be reconstructed is the same across the images the collector captures in succession. In addition, during continuous shooting, the pose change of the image collector between two adjacent images is limited and does not change abruptly, so a six-degree-of-freedom pose constraint can be constructed on this basis; solving for the translation pose information under this constraint improves the accuracy of the solution. The six degrees of freedom are the translational degrees of freedom along the x, y, and z axes of the world coordinate system and the rotational degrees of freedom about those three axes. The three-dimensional reconstruction apparatus may therefore execute step 103 by, for example: obtaining world coordinate information of each feature point in the reference image; performing optical flow tracking on each feature point in the reference image to determine the image coordinate information of each feature point in the image to be processed; and constructing and solving a system of equations that takes the translation pose information of the image to be processed as the unknown, takes the world coordinate information of each feature point in the reference image, the image coordinate information of each feature point in the image to be processed, and the rotation pose information of the image to be processed as parameters, and takes the six-degree-of-freedom pose constraint as a condition, thereby obtaining the translation pose information of the image to be processed.
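For illustration only, the equation-solving step above can be sketched as follows. With the rotation known (e.g. from the IMU), the pinhole projection equation becomes linear in the translation, so the translation can be recovered by least squares over the tracked feature points. The pinhole model, parameter names, and the normal-equation solver are illustrative assumptions, not the patent's exact formulation.

```python
def matvec(M, v):
    # 3x3 matrix times 3-vector
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]

def solve3(A, b):
    # Gauss-Jordan elimination with partial pivoting for a 3x3 system
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(3):
        p = max(range(c, 3), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(3):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [M[i][3] / M[i][i] for i in range(3)]

def estimate_translation(K, R, world_pts, image_pts):
    """Least-squares translation t given a known rotation R.

    Each tracked feature gives two equations linear in t, derived from
    s*[u, v, 1]^T = K*(R*Xw + t).  Solve the normal equations A^T A t = A^T b.
    """
    AtA = [[0.0] * 3 for _ in range(3)]
    Atb = [0.0] * 3
    for Xw, (u, v) in zip(world_pts, image_pts):
        a = matvec(K, matvec(R, Xw))        # K * R * Xw, the known part
        for row, meas in ((0, u), (1, v)):
            # (K[row] - meas*K[2]) . t = meas*a[2] - a[row]
            coeff = [K[row][j] - meas * K[2][j] for j in range(3)]
            rhs = meas * a[2] - a[row]
            for i in range(3):
                Atb[i] += coeff[i] * rhs
                for j in range(3):
                    AtA[i][j] += coeff[i] * coeff[j]
    return solve3(AtA, Atb)
```

With noiseless correspondences the least-squares solution is exact; in practice the full PnP machinery with outlier rejection would be used instead.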

In the above embodiment, the algorithm used to solve for the translation pose information of the image to be processed may be, for example, the Perspective-n-Point (PnP) algorithm. The feature points in an image may be extracted as GFTT (Good Features to Track) feature points. It should be noted that the translation pose information is the positional offset of the first pose of the image collector relative to the second pose, where the first pose is the pose when the image collector captures the image to be processed, and the second pose is the pose when the image collector captures the first frame image in the image sequence.
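To illustrate the GFTT criterion mentioned above, here is a minimal pure-Python sketch of the Shi-Tomasi corner score, the smaller eigenvalue of the local gradient structure tensor; a real system would use an optimized implementation such as OpenCV's `goodFeaturesToTrack`. The 3x3 window and central-difference gradients are illustrative choices.

```python
def gftt_score(img, x, y):
    """Shi-Tomasi 'good features to track' score at pixel (x, y):
    the smaller eigenvalue of the 2x2 gradient structure tensor
    accumulated over a 3x3 window."""
    a = b = c = 0.0
    for j in range(y - 1, y + 2):
        for i in range(x - 1, x + 2):
            ix = (img[j][i + 1] - img[j][i - 1]) / 2.0  # central differences
            iy = (img[j + 1][i] - img[j - 1][i]) / 2.0
            a += ix * ix
            b += ix * iy
            c += iy * iy
    # min eigenvalue of [[a, b], [b, c]]
    return (a + c) / 2.0 - (((a - c) / 2.0) ** 2 + b * b) ** 0.5
```

A corner scores high because the gradient varies in two directions; a flat region or a straight edge scores near zero, which is why tracking such points with optical flow is stable.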

In step 104, a point cloud map is generated according to the depth information, the rotation pose information, and the translation pose information of each frame of image in the image sequence.

In an exemplary embodiment, the world coordinate information of each pixel point in each frame of image can be determined directly from the depth information, the rotation pose information, and the translation pose information of that frame, and the point cloud map is then generated by combining the world coordinate information of the pixel points across all frames, which improves both the speed and the accuracy of point cloud generation. The three-dimensional reconstruction apparatus may therefore execute step 104 by, for example: for each frame of image to be processed in the image sequence, determining the image collector pose corresponding to the image to be processed according to its rotation pose information, its translation pose information, and the image collector pose corresponding to the first frame image in the image sequence; determining the world coordinate information of each pixel point in the image to be processed according to that image collector pose and the depth information of the image to be processed; and generating the point cloud map from the world coordinate information of the pixel points in every frame.
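The per-pixel back-projection described above can be sketched as follows, assuming a pinhole camera and a world-to-camera pose (R, t); the parameter names are illustrative, not taken from the patent.

```python
def pixel_to_world(u, v, depth, fx, fy, cx, cy, R, t):
    """Back-project pixel (u, v) with its predicted depth into world coordinates.

    The camera pose (R, t) maps world to camera, Xc = R*Xw + t, so
    Xw = R^T * (Xc - t).  Depth is the z-coordinate in the camera frame.
    """
    # unproject through the pinhole intrinsics
    xc = [(u - cx) * depth / fx, (v - cy) * depth / fy, depth]
    d = [xc[i] - t[i] for i in range(3)]
    # multiply by R transpose (rows of R indexed as columns)
    return [sum(R[j][i] * d[j] for j in range(3)) for i in range(3)]
```

Running this over every pixel of every frame, with each frame's pose chained from the first frame as described, yields the point cloud map.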

In the above embodiment, the rotation pose information and the translation pose information of the image to be processed are the offsets between a first pose and a second pose, where the first pose is the pose when the image collector captures the image to be processed, and the second pose is the pose when the image collector captures the first frame image in the image sequence. The image collector pose corresponding to the image to be processed can therefore be determined by combining its rotation and translation pose information with the image collector pose corresponding to the first frame image.

In the above embodiment, the depth information of the image to be processed is the distance of each pixel point from the image collector along a fixed axis (or a fixed direction determined by multiple axes). Combining the image collector pose corresponding to the image to be processed with this depth information therefore yields the world coordinate information of each pixel point in the image to be processed.

In step 105, a three-dimensional reconstruction of the object to be reconstructed is performed according to the point cloud map.

The three-dimensional reconstruction method of the embodiment of the disclosure acquires an image sequence of the object to be reconstructed, the image sequence being consecutive image frames captured by a monocular image collector; extracts depth information for each image to be processed in the image sequence; estimates translation pose information for the image to be processed from the world coordinate information of each feature point in the reference image, the image coordinate information of each feature point in the image to be processed, and the rotation pose information of the image to be processed, the reference image being the adjacent image whose acquisition time point precedes that of the image to be processed; generates a point cloud map from the depth, rotation pose, and translation pose information of each frame; and performs three-dimensional reconstruction of the object according to the point cloud map. Three-dimensional reconstruction is thus achieved with only a monocular image collector and without a depth sensor, which overcomes the limited range of depth sensors, reduces cost, and provides good adaptability and extensibility.

Next, with reference to fig. 3, a process of performing three-dimensional reconstruction on an object to be reconstructed according to a point cloud diagram in the three-dimensional reconstruction method provided in the embodiment of the present disclosure is described.

FIG. 3 is a flow chart illustrating another method of three-dimensional reconstruction in accordance with an exemplary embodiment.

As shown in fig. 3, the step 105 shown in fig. 1 may specifically include the following steps 301-304.

In step 301, the point cloud map is spatially gridded to obtain individual voxel blocks.

In an exemplary embodiment, the number of voxel blocks may be set, and the point cloud map is spatially gridded to yield that number of voxel blocks; alternatively, the size of a voxel block may be set, and the point cloud map is spatially gridded into voxel blocks of that size. A voxel block may contain multiple voxels, a voxel being the smallest block structure within a voxel block.
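The fixed-size variant of this spatial gridding can be sketched as follows; the block edge length and the dictionary layout are illustrative assumptions.

```python
import math
from collections import defaultdict

def grid_points(points, block_size):
    """Spatial gridding: bucket point-cloud points into voxel blocks of a
    fixed edge length, returning {block index: [points in that block]}."""
    blocks = defaultdict(list)
    for p in points:
        idx = tuple(math.floor(c / block_size) for c in p)
        blocks[idx].append(p)
    return blocks
```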

In step 302, for each pixel point of each frame of image in the image sequence, ray projection is performed on the point cloud map with the pixel point as the starting point, and the voxel blocks through which the ray passes are determined.

In an exemplary embodiment, to facilitate storage and lookup of voxel blocks and improve lookup efficiency, after step 302 the three-dimensional reconstruction apparatus may further: for each voxel block traversed by a ray starting from a pixel point, determine the hash value corresponding to the voxel block according to the spatial position information of the voxel block; look up a hash table, which stores the mapping between hash values and storage areas, using that hash value to determine the target storage area of the voxel block; and search for the voxel block within the target storage area. Correspondingly, each voxel block is stored in the storage area corresponding to its hash value.

In the above embodiment, the hash value corresponding to a voxel block may be determined from its spatial position information as follows: determine the world coordinate information of the point at the lower-left corner of the voxel block, which includes an X-axis, a Y-axis, and a Z-axis coordinate; determine a preset code value for each axis and the number of storage areas; multiply each axis coordinate by its preset code value and sum the products; and take the remainder of that sum modulo the number of storage areas to obtain the hash value corresponding to the voxel block.
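A minimal sketch of this multiply-sum-modulo hashing: the patent only specifies the scheme, so the specific per-axis code values (large primes common in the voxel-hashing literature) and the bucket-list layout are illustrative assumptions.

```python
PRIMES = (73856093, 19349669, 83492791)  # per-axis code values (assumed choice)

def block_hash(block_index, num_buckets):
    """Multiply each axis coordinate by its code value, sum, then take the
    remainder modulo the number of storage areas (buckets)."""
    return sum(c * p for c, p in zip(block_index, PRIMES)) % num_buckets

# a minimal hash table of voxel blocks; collisions are chained in a list
table = {}

def insert_block(idx, data, num_buckets=1 << 20):
    table.setdefault(block_hash(idx, num_buckets), []).append((idx, data))

def find_block(idx, num_buckets=1 << 20):
    for stored_idx, data in table.get(block_hash(idx, num_buckets), []):
        if stored_idx == idx:
            return data
    return None
```

Storing the block's grid index alongside its data lets the lookup resolve hash collisions inside the target storage area, as described above.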

In step 303, each iso-surface and its corresponding position information are determined according to the voxel blocks traversed by the rays starting from each pixel point, where the TSDF values of the voxel blocks on one iso-surface are the same, and the TSDF value of a voxel block is determined according to the length of the ray from the voxel block to the pixel point.

In an exemplary embodiment, each voxel corresponds to a truncated signed distance function (TSDF) value and a weight, and the TSDF value of a voxel block is obtained by fusing the TSDF values and weights of the voxels within it.
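The standard TSDF computation and weighted fusion can be sketched as follows; the truncation distance and unit weight are illustrative assumptions.

```python
def truncated_sdf(depth_measured, depth_voxel, trunc):
    """TSDF of a voxel along a ray: the signed distance from the voxel to the
    observed surface, divided by the truncation distance and clamped to [-1, 1].
    Positive values lie in front of the surface, negative values behind it."""
    return max(-1.0, min(1.0, (depth_measured - depth_voxel) / trunc))

def fuse(tsdf_old, w_old, tsdf_new, w_new=1.0):
    """Weighted running average used to fuse a new observation into a voxel."""
    w = w_old + w_new
    return (tsdf_old * w_old + tsdf_new * w_new) / w, w
```

The zero level set of the fused TSDF is the reconstructed surface, which is what step 303 extracts as iso-surfaces.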

In the above embodiment, after the iso-surface is determined, the voxels within a voxel block are still needed to represent the iso-surface accurately. Specifically, the intersection points of the iso-surface with the voxel edges are connected according to the position of each voxel vertex relative to the iso-surface, giving an approximate representation of the iso-surface within the voxel. For each voxel, every vertex is in one of two states (greater than or less than the current iso-value), so the 8 vertices yield 256 configurations; accounting for rotational symmetry, these reduce to the 15 base patterns shown in fig. 4. These configurations are encoded in a voxel state table, so the positions where the iso-surface crosses the voxel edges can be computed quickly from the current voxel's vertex-state index. Finally, the normal at each iso-surface vertex is obtained by vector cross products, and the position information of the iso-surface is thereby determined.
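The two core computations above, the 8-bit vertex-state index and the edge crossing, can be sketched as follows (the full 256-entry state table is omitted; the linear interpolation of the crossing point is the standard marching-cubes choice).

```python
def cube_index(vertex_values, iso):
    """8-bit state index for one voxel: bit i is set when vertex i lies
    below the iso-value, giving one of the 256 configurations."""
    idx = 0
    for i, v in enumerate(vertex_values):
        if v < iso:
            idx |= 1 << i
    return idx

def edge_intersection(p0, p1, v0, v1, iso):
    """Linearly interpolate where the iso-surface crosses one voxel edge
    whose endpoints p0, p1 carry values v0, v1 on opposite sides of iso."""
    t = (iso - v0) / (v1 - v0)
    return tuple(a + t * (b - a) for a, b in zip(p0, p1))
```

In a full implementation, `cube_index` selects the triangle layout from the precomputed voxel state table, and `edge_intersection` supplies the triangle vertex positions.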

In step 304, a three-dimensional model of the object to be reconstructed is rendered based on the iso-surfaces and the corresponding position information.

The three-dimensional reconstruction method of the embodiment of the disclosure acquires an image sequence of the object to be reconstructed, the image sequence being consecutive image frames captured by a monocular image collector; extracts depth information for each image to be processed in the image sequence; estimates translation pose information for the image to be processed from the world coordinate information of each feature point in the reference image, the image coordinate information of each feature point in the image to be processed, and the rotation pose information of the image to be processed, the reference image being the adjacent image whose acquisition time point precedes that of the image to be processed; generates a point cloud map from the depth, rotation pose, and translation pose information of each frame; spatially grids the point cloud map into voxel blocks; for each pixel point of each frame, performs ray projection on the point cloud map with the pixel point as the starting point and determines the voxel blocks the ray passes through; determines each iso-surface and its position information from those voxel blocks, where the TSDF values of the voxel blocks on an iso-surface are the same and the TSDF value of a voxel block is determined by the length of the ray from the voxel block to the pixel point; and draws the three-dimensional model of the object to be reconstructed from the iso-surfaces and their position information. Three-dimensional reconstruction is thus achieved with only a monocular image collector and without a depth sensor, which overcomes the limited range of depth sensors, reduces cost, and provides good adaptability and extensibility; moreover, the iso-surfaces can be located accurately through ray projection, improving reconstruction efficiency.

In order to implement the foregoing embodiments, the present disclosure provides a three-dimensional reconstruction apparatus.

Fig. 5 is a block diagram of a three-dimensional reconstruction apparatus shown in accordance with an exemplary embodiment.

Referring to fig. 5, the three-dimensional reconstruction apparatus 500 may include: an acquisition module 510, an extraction module 520, a determination module 530, a generation module 540, and a reconstruction module 550.

The acquiring module 510 is configured to perform acquiring an image sequence of an object to be reconstructed, where the image sequence is a continuous image frame obtained by acquiring an image of the object to be reconstructed by a monocular image acquisition device;

an extracting module 520 configured to extract depth information of an image to be processed in the image sequence;

a determining module 530 configured to perform translation pose information estimation on the image to be processed according to the world coordinate information of each feature point in the reference image, the image coordinate information of each feature point in the image to be processed, and the rotation pose information of the image to be processed, so as to obtain translation pose information of the image to be processed; the reference image is an adjacent image of which the corresponding acquisition time point in the image sequence is positioned before the image to be processed;

a generating module 540 configured to execute generating a point cloud image according to the depth information, the rotation pose information and the translation pose information of each frame of image in the image sequence;

a reconstruction module 550 configured to perform a three-dimensional reconstruction of the object to be reconstructed according to the point cloud map.

In an exemplary embodiment, the obtaining module 510 is further configured to perform obtaining inertial measurement information when the image collector collects the image to be processed, where the inertial measurement information includes the rotation pose information.

In an exemplary embodiment, the determining module 530 is specifically configured to: obtain world coordinate information of the feature points in the reference image; perform optical flow tracking on each feature point in the reference image to determine the image coordinate information of each feature point in the image to be processed; and construct and solve a system of equations that takes the translation pose information of the image to be processed as the unknown, takes the world coordinate information of each feature point in the reference image, the image coordinate information of each feature point in the image to be processed, and the rotation pose information of the image to be processed as parameters, and takes the six-degree-of-freedom pose constraint as a condition, thereby obtaining the translation pose information of the image to be processed.

In an exemplary embodiment, the generating module 540 is specifically configured to perform, for each frame of to-be-processed image in the image sequence, determining image collector position information corresponding to the to-be-processed image according to the rotation pose information of the to-be-processed image, the translation pose information of the to-be-processed image, and image collector position information corresponding to a first frame of image in the image sequence; determining world coordinate information of each pixel point in the image to be processed according to the position information of the image collector corresponding to the image to be processed and the depth information of the image to be processed; and generating the point cloud picture according to the world coordinate information of each pixel point in each frame of image.

In an exemplary embodiment, the reconstruction module 550 is specifically configured to: spatially grid the point cloud map to obtain the voxel blocks; for each pixel point of each frame of image in the image sequence, perform ray projection on the point cloud map with the pixel point as the starting point and determine the voxel blocks the ray passes through; determine each iso-surface and its corresponding position information according to the voxel blocks traversed by the rays starting from each pixel point, where the TSDF values of the voxel blocks on one iso-surface are the same and the TSDF value of a voxel block is determined according to the length of the ray from the voxel block to the pixel point; and draw the three-dimensional model of the object to be reconstructed according to each iso-surface and its corresponding position information.

In an exemplary embodiment, the reconstruction module 550 is further configured to: for each voxel block traversed by a ray starting from a pixel point, determine the hash value corresponding to the voxel block according to its spatial position information; look up the hash table, which stores the mapping between hash values and storage areas, according to that hash value to determine the target storage area of the voxel block; and search for the voxel block in the target storage area.

It should be noted that the three-dimensional reconstruction apparatus according to the embodiment of the present disclosure may execute the three-dimensional reconstruction method in the foregoing embodiment, and the three-dimensional reconstruction apparatus may be an electronic device, and may also be configured in the electronic device to perform three-dimensional reconstruction in the electronic device.

The electronic device may be any stationary or mobile computing device capable of performing data processing, for example, a mobile computing device such as a notebook computer and a wearable device, or a stationary computing device such as a desktop computer, or other types of computing devices, which is not limited in this disclosure.

It should be noted that, regarding the apparatus in the above embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated herein.

The three-dimensional reconstruction device of the embodiment of the disclosure acquires an image sequence of the object to be reconstructed, the image sequence being consecutive image frames captured by a monocular image collector; extracts depth information for each image to be processed in the image sequence; estimates translation pose information for the image to be processed from the world coordinate information of each feature point in the reference image, the image coordinate information of each feature point in the image to be processed, and the rotation pose information of the image to be processed, the reference image being the adjacent image whose acquisition time point precedes that of the image to be processed; generates a point cloud map from the depth, rotation pose, and translation pose information of each frame; and performs three-dimensional reconstruction of the object according to the point cloud map. Three-dimensional reconstruction is thus achieved with only a monocular image collector and without a depth sensor, which overcomes the limited range of depth sensors, reduces cost, and provides good adaptability and extensibility.

In order to implement the above embodiments, the embodiment of the present disclosure further provides an electronic device.

Wherein, the electronic device 200 includes:

a processor 220;

a memory 210 for storing instructions executable by processor 220;

wherein the processor 220 is configured to execute the instructions to implement the three-dimensional reconstruction method as previously described.

As an example, fig. 6 is a block diagram illustrating an electronic device 200 for three-dimensional reconstruction according to an exemplary embodiment, and as shown in fig. 6, the electronic device 200 may further include:

a memory 210 and a processor 220, a bus 230 connecting different components (including the memory 210 and the processor 220), wherein the memory 210 stores a computer program, and when the processor 220 executes the program, the three-dimensional reconstruction method according to the embodiment of the present disclosure is implemented.

Bus 230 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA (EISA) bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.

Electronic device 200 typically includes a variety of computer-readable media. Such media may be any available media that is accessible by electronic device 200 and includes both volatile and nonvolatile media, removable and non-removable media.

Memory 210 may also include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 240 and/or cache memory 250. The electronic device 200 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 260 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 6, commonly referred to as a "hard drive"). Although not shown in FIG. 6, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to bus 230 by one or more data media interfaces. Memory 210 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the disclosure.

A program/utility 280 having a set (at least one) of program modules 270, including but not limited to an operating system, one or more application programs, other program modules, and program data, each of which or some combination thereof may comprise an implementation of a network environment, may be stored in, for example, the memory 210. The program modules 270 generally perform the functions and/or methodologies of the embodiments described in this disclosure.

Electronic device 200 may also communicate with one or more external devices 290 (e.g., keyboard, pointing device, display 291, etc.), with one or more devices that enable a user to interact with electronic device 200, and/or with any devices (e.g., network card, modem, etc.) that enable electronic device 200 to communicate with one or more other computing devices. Such communication may occur via input/output (I/O) interfaces 292. Also, the electronic device 200 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the Internet) via the network adapter 293. As shown in FIG. 6, the network adapter 293 communicates with the other modules of the electronic device 200 via the bus 230. It should be appreciated that although not shown in FIG. 6, other hardware and/or software modules may be used in conjunction with electronic device 200, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.

The processor 220 executes various functional applications and data processing by executing programs stored in the memory 210.

It should be noted that, for the implementation process and the technical principle of the electronic device of the embodiment, reference is made to the foregoing explanation of the three-dimensional reconstruction method of the embodiment of the present disclosure, and details are not described here again.

The electronic equipment provided by the embodiment of the disclosure acquires an image sequence of the object to be reconstructed, the image sequence being consecutive image frames captured by a monocular image collector; extracts depth information for each image to be processed in the image sequence; estimates translation pose information for the image to be processed from the world coordinate information of each feature point in the reference image, the image coordinate information of each feature point in the image to be processed, and the rotation pose information of the image to be processed, the reference image being the adjacent image whose acquisition time point precedes that of the image to be processed; generates a point cloud map from the depth, rotation pose, and translation pose information of each frame; and performs three-dimensional reconstruction of the object according to the point cloud map. Three-dimensional reconstruction is thus achieved with only a monocular image collector and without a depth sensor, which overcomes the limited range of depth sensors, reduces cost, and provides good adaptability and extensibility.

In order to implement the above embodiments, the embodiments of the present disclosure also provide a computer-readable storage medium.

Wherein the instructions in the computer readable storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the three-dimensional reconstruction method as previously described.

To achieve the above embodiments, the present disclosure also provides a computer program product, which, when executed by a processor of an electronic device, enables the electronic device to perform the three-dimensional reconstruction method as described above.

Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.
