Real-time display method and system for an automobile A-pillar blind area based on three-dimensional coordinates of human eyes
1. A method for displaying an automobile A-pillar blind area in real time based on three-dimensional coordinates of human eyes, characterized by comprising the following steps:
acquiring a head image of a driver, and detecting a face image from the head image;
performing facial landmark detection on the face image to obtain facial feature points;
performing three-dimensional pose estimation on the face image to obtain a rotation matrix and a translation vector for converting a standard face coordinate system into a camera coordinate system;
solving three-dimensional coordinates of the face in the face image according to the facial feature points, the rotation matrix and the translation vector, and acquiring three-dimensional coordinates of the two eyes from the three-dimensional coordinates of the face according to the positions of the eyes among the facial feature points;
and acquiring an environment image outside the A-pillar, cutting out a blind area image from the environment image through a three-dimensional reconstruction blind area algorithm according to the three-dimensional coordinates of the two eyes, and displaying the blind area image in real time.
2. The method for displaying an automobile A-pillar blind area in real time based on three-dimensional coordinates of human eyes as claimed in claim 1, wherein, in detecting the face image from the head image, if only one face is detected in the head image, that face is taken directly as the face image; and if a plurality of faces are detected in the head image, the image corresponding to the face with the largest area is taken as the face image.
3. The method for displaying an automobile A-pillar blind area in real time based on three-dimensional coordinates of human eyes as claimed in claim 1, wherein the process of cutting out the blind area image from the environment image through the three-dimensional reconstruction blind area algorithm according to the three-dimensional coordinates of the two eyes specifically comprises:
determining the positional relationship between the eyes and the A-pillar display screen according to the three-dimensional coordinates of the two eyes and the position of the A-pillar display screen;
drawing the eyes, the camera, the top-view area of the A-pillar display screen and the camera shooting area in the same coordinate system, drawing the lines of sight of the eyes, and determining the blind area;
calculating the starting angle and the angular range of the blind area in the environment image, acquiring the wide-angle field of view (in degrees) and the resolution of the camera, expanding outwards from the center of the image in the horizontal direction to obtain the pixel position of the blind area in the environment image, and cutting to obtain the blind area image.
4. The method for displaying an automobile A-pillar blind area in real time based on three-dimensional coordinates of human eyes as claimed in claim 3, wherein the relationship between the eyes and the A-pillar display screen comprises: the eyes are within the extent of the A-pillar display screen, the eyes are outside the A-pillar display screen on its left, and the eyes are outside the A-pillar display screen on its right.
5. A system for displaying an automobile A-pillar blind area in real time based on three-dimensional coordinates of human eyes, characterized by comprising a first image acquisition device, a second image acquisition device, a processor and a display screen, wherein the first image acquisition device, the second image acquisition device and the display screen are all electrically connected with the processor;
the first image acquisition device is arranged at the exact center of the A-pillar on the inner side of the automobile and is used for acquiring a head image of a driver in real time, and the second image acquisition device is arranged on the A-pillar on the outer side of the automobile and is used for acquiring an environment image outside the A-pillar in real time;
the processor is used for detecting a face image from the head image; performing facial landmark detection on the face image to obtain facial feature points; performing three-dimensional pose estimation on the face image to obtain a rotation matrix and a translation vector for converting a standard face coordinate system into a camera coordinate system; solving three-dimensional coordinates of the face in the face image according to the facial feature points, the rotation matrix and the translation vector, and acquiring three-dimensional coordinates of the two eyes from the three-dimensional coordinates of the face according to the positions of the eyes among the facial feature points; and cutting out a blind area image from the environment image through a three-dimensional reconstruction blind area algorithm according to the three-dimensional coordinates of the two eyes;
the display screen is arranged on the A-pillar on the inner side of the automobile and is used for displaying the blind area image in real time.
6. The system for displaying an automobile A-pillar blind area in real time based on three-dimensional coordinates of human eyes as claimed in claim 5, wherein the second image acquisition device is a wide-angle camera.
Background
As automobiles become ever more widespread, driving safety is receiving increasing attention. The A-, B- and C-pillars of an automobile protect the internal structure of the body as a whole, and protect the occupants of the cabin when the body is crushed or rolls over. However, the A-pillar blocks part of the driver's line of sight and thereby creates the A-pillar blind area. This blind area has long been an unavoidable problem in automobile production, and many traffic accidents are caused by the driver's view being blocked by the A-pillar. In recent years, researchers have therefore carried out a great deal of work on displaying the A-pillar blind area.
As computer vision has been applied to automobile active safety, vision-based approaches to blind area elimination have continued to develop. For eliminating the A-pillar blind area and compensating the field of view, however, existing methods have many problems and limitations. For example, although many studies can display the A-pillar blind area, the displayed content consists of still pictures or projections, which cannot meet the need to obtain the A-pillar blind area in real time while the automobile is actually being driven; in addition, the equipment used is complex and varied and is not convenient enough for practical application.
Therefore, how to provide a method for displaying the automobile A-pillar blind area that shows the blind area image in real time and is more convenient is a problem urgently needing to be solved by those skilled in the art.
Disclosure of Invention
In view of the above, the invention provides a method and a system for displaying an automobile A-pillar blind area in real time based on three-dimensional coordinates of human eyes, which effectively solve the technical problems that existing A-pillar blind area display methods cannot display the blind area in real time and are not convenient enough.
In order to achieve the purpose, the invention adopts the following technical scheme:
In one aspect, the invention provides a method for displaying an automobile A-pillar blind area in real time based on three-dimensional coordinates of human eyes, which comprises the following steps:
acquiring a head image of a driver, and detecting a face image from the head image;
performing facial landmark detection on the face image to obtain facial feature points;
performing three-dimensional pose estimation on the face image to obtain a rotation matrix and a translation vector for converting a standard face coordinate system into a camera coordinate system;
solving three-dimensional coordinates of the face in the face image according to the facial feature points, the rotation matrix and the translation vector, and acquiring three-dimensional coordinates of the two eyes from the three-dimensional coordinates of the face according to the positions of the eyes among the facial feature points;
and acquiring an environment image outside the A-pillar, cutting out a blind area image from the environment image through a three-dimensional reconstruction blind area algorithm according to the three-dimensional coordinates of the two eyes, and displaying the blind area image in real time.
Further, in detecting the face image from the head image, if only one face is detected in the head image, that face is taken directly as the face image; and if a plurality of faces are detected in the head image, the image corresponding to the face with the largest area is taken as the face image.
The invention assumes in advance that the image contains only one face. If the detector returns several face proposals, the largest bounding box is taken, and any image in which the detector cannot find a face is discarded.
The invention performs three-dimensional pose estimation based on a generic average face shape model F, which is the average shape over the participants in the MPIIGaze data set.
Further, the process of cutting out the blind area image from the environment image through the three-dimensional reconstruction blind area algorithm according to the three-dimensional coordinates of the two eyes specifically includes:
determining the positional relationship between the eyes and the A-pillar display screen according to the three-dimensional coordinates of the two eyes and the position of the A-pillar display screen;
drawing the eyes, the camera, the top-view area of the A-pillar display screen and the camera shooting area in the same coordinate system, drawing the lines of sight of the eyes, and determining the blind area;
calculating the starting angle and the angular range of the blind area in the environment image, acquiring the wide-angle field of view (in degrees) and the resolution of the camera, expanding outwards from the center of the image in the horizontal direction to obtain the pixel position of the blind area in the environment image, and cutting to obtain the blind area image.
Still further, the relationship between the eyes and the A-pillar display screen includes: the eyes are within the extent of the A-pillar display screen, the eyes are outside the A-pillar display screen on its left, and the eyes are outside the A-pillar display screen on its right.
In another aspect, the invention also provides a system for displaying an automobile A-pillar blind area in real time based on three-dimensional coordinates of human eyes, which comprises a first image acquisition device, a second image acquisition device, a processor and a display screen, wherein the first image acquisition device, the second image acquisition device and the display screen are all electrically connected with the processor;
the first image acquisition device is arranged at the exact center of the A-pillar on the inner side of the automobile and is used for acquiring a head image of a driver in real time, and the second image acquisition device is arranged on the A-pillar on the outer side of the automobile and is used for acquiring an environment image outside the A-pillar in real time;
the processor is used for detecting a face image from the head image; performing facial landmark detection on the face image to obtain facial feature points; performing three-dimensional pose estimation on the face image to obtain a rotation matrix and a translation vector for converting a standard face coordinate system into a camera coordinate system; solving three-dimensional coordinates of the face in the face image according to the facial feature points, the rotation matrix and the translation vector, and acquiring three-dimensional coordinates of the two eyes from the three-dimensional coordinates of the face according to the positions of the eyes among the facial feature points; and cutting out a blind area image from the environment image through a three-dimensional reconstruction blind area algorithm according to the three-dimensional coordinates of the two eyes;
the display screen is arranged on the A-pillar on the inner side of the automobile and is used for displaying the blind area image in real time.
Further, the second image acquisition device is a wide-angle camera.
In this system, the first image acquisition device, which may be a monocular camera, is installed in the middle of the automobile A-pillar and is used to acquire the three-dimensional coordinates of the human eyes in real time. The second image acquisition device, which may be a monocular camera or a wide-angle camera, is installed on the left side of the automobile A-pillar and is used to monitor the outside of the automobile in real time and capture the area occluded by the A-pillar. The two cameras run simultaneously. As soon as a face is detected, the processor obtains the three-dimensional eye data; the occluded blind area is then extracted by image cropping according to the three-dimensional reconstruction blind area calculation method, and that area is displayed in real time on the display screen on the A-pillar.
As can be seen from the above technical scheme, compared with the prior art, the invention provides a method and a system for displaying an automobile A-pillar blind area in real time based on three-dimensional coordinates of human eyes. The system displays blind area images in real time, has a simple structure, is quick and convenient to set up, and can meet practical requirements.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic flow chart of the method for displaying an automobile A-pillar blind area in real time based on three-dimensional coordinates of human eyes according to the present invention;
FIG. 2 is a schematic diagram of the facial feature points of a human face;
FIG. 3 is a schematic perspective view of the three-dimensional A-pillar blind area;
FIG. 4 is a top view for the case where the eye coordinates do not extend beyond the A-pillar display screen;
FIG. 5 is a top view for the case where the eye coordinates extend beyond the A-pillar display screen on its left;
FIG. 6 is a top view for the case where the eye coordinates extend beyond the A-pillar display screen on its right;
FIG. 7 is a schematic structural diagram of the system for displaying an automobile A-pillar blind area in real time based on three-dimensional coordinates of human eyes according to the present invention;
FIG. 8 is a schematic diagram of a face image detected in an embodiment of the present invention;
FIG. 9 is a schematic diagram of the obtained facial landmarks and pixel coordinates;
FIG. 10 is a schematic diagram of the three-dimensional coordinate data obtained by 3D face detection;
FIG. 11 is a schematic diagram of the blind area display image when the face is in the middle;
FIG. 12 is a schematic diagram of the blind area display image when the face is on the left side;
FIG. 13 is a schematic diagram of the blind area display image when the face is on the right side.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In one aspect, referring to FIG. 1, an embodiment of the invention discloses a method for displaying an automobile A-pillar blind area in real time based on three-dimensional coordinates of human eyes, which comprises the following steps:
S1: acquiring a head image of a driver, and detecting a face image from the head image.
In this embodiment, the face of the user in the image is detected with a HOG-based (Histogram of Oriented Gradients) method. It is assumed in advance that the image contains only one face; if the detector returns several face proposals, the largest bounding box is taken, that is, the image corresponding to the face with the largest area is used as the face image. Any image in which the detector cannot find a face is discarded.
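The selection rule described above can be sketched as follows. This is a minimal illustration only: the bounding-box layout (x, y, width, height) and the function name `select_face` are assumptions made for the example and are not specified in this description.

```python
from typing import List, Optional, Tuple

# Hypothetical bounding-box layout: (x, y, width, height) in pixels.
Box = Tuple[int, int, int, int]


def select_face(boxes: List[Box]) -> Optional[Box]:
    """Return the box with the largest area, or None if no face was found."""
    if not boxes:
        # No face detected: the caller is expected to discard this frame.
        return None
    return max(boxes, key=lambda b: b[2] * b[3])


detections = [(10, 20, 50, 60), (200, 40, 120, 130), (90, 90, 30, 30)]
print(select_face(detections))  # (200, 40, 120, 130)
print(select_face([]))          # None
```

Returning None for an empty detection list lets the caller skip the frame, which matches the rule that images without a detectable face are discarded.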
S2: performing facial landmark detection on the face image to obtain facial feature points.
The CLNF (Constrained Local Neural Field) model proposed by Tadas Baltrušaitis et al. is used to acquire the facial landmarks in real time. The landmarks are detected within the Continuous Conditional Neural Fields (CCNF) model framework, which yields accurate facial feature points, namely the facial landmarks. The standard facial feature points are shown in FIG. 2.
S3: performing three-dimensional pose estimation on the face image to obtain a rotation matrix and a translation vector for converting a standard face coordinate system into a camera coordinate system.
This embodiment performs three-dimensional pose estimation based on a generic average face shape model F, which is the average shape over the participants in the MPIIGaze data set. Given this three-dimensional model and the corresponding two-dimensional projections, the EPnP algorithm is used to estimate an initial solution, yielding the rotation matrix and translation vector required to convert from the standard face coordinate system to the camera coordinate system.
S4: solving three-dimensional coordinates of the face in the face image according to the facial feature points, the rotation matrix and the translation vector, and acquiring three-dimensional coordinates of the two eyes from the three-dimensional coordinates of the face according to the positions of the eyes among the facial feature points.
The three-dimensional coordinates of the face are solved with the EPnP algorithm from the obtained rotation matrix and translation vector and the two-dimensional standard facial feature points. Then, according to the positions of the eyes among the standard facial feature points, the entries for points 36 to 47 of the three-dimensional face coordinates are taken as the three-dimensional eye data, giving the final three-dimensional coordinates of the two eyes.
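As a rough sketch of step S4, the following example maps face-model points into the camera coordinate system with p_cam = R · p_model + t and then reads out points 36 to 47. Averaging each group of six points into a single eye centre is an assumption of this sketch (the description only states that points 36 to 47 are taken as the eye data), and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def to_camera(R: Sequence[Sequence[float]], t: Sequence[float],
              p: Sequence[float]) -> Vec3:
    """Apply p_cam = R * p_model + t to a single 3D point."""
    return tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))


def centroid(points: List[Vec3]) -> Vec3:
    """Arithmetic mean of a non-empty list of 3D points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def eye_centers(model_points: List[Vec3], R, t) -> Tuple[Vec3, Vec3]:
    """Return one centre per eye from landmark indices 36-41 and 42-47.

    Assumption: reducing each eye to the mean of its six landmarks.
    """
    cam = [to_camera(R, t, p) for p in model_points]
    return centroid(cam[36:42]), centroid(cam[42:48])


# Toy data: 68 model points on the x axis, identity rotation, 600 units away.
model = [(float(i), 0.0, 0.0) for i in range(68)]
R_identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t_vec = [0.0, 0.0, 600.0]
first_eye, second_eye = eye_centers(model, R_identity, t_vec)
print(first_eye, second_eye)
```

In a real pipeline R and t would come from the pose estimation of step S3 rather than from the toy values used here.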
S5: acquiring an environment image outside the A-pillar, cutting out a blind area image from the environment image through a three-dimensional reconstruction blind area algorithm according to the three-dimensional coordinates of the two eyes, and displaying the blind area image in real time.
Specifically, the process of cutting out the blind area image from the environment image through the three-dimensional reconstruction blind area algorithm according to the three-dimensional coordinates of the two eyes comprises:
determining the positional relationship between the eyes and the A-pillar display screen according to the three-dimensional coordinates of the two eyes and the position of the A-pillar display screen;
drawing the eyes, the camera, the top-view area of the A-pillar display screen and the camera shooting area in the same coordinate system, drawing the lines of sight of the eyes, and determining the blind area;
calculating the starting angle and the angular range of the blind area in the environment image, acquiring the wide-angle field of view (in degrees) and the resolution of the camera, expanding outwards from the center of the image in the horizontal direction to obtain the pixel position of the blind area in the environment image, and cutting to obtain the blind area image.
In this embodiment, the blind area is cut out and displayed according to the three-dimensional reconstruction blind area algorithm, the details of which are as follows:
A perspective view of the three-dimensional A-pillar blind area is shown in FIG. 3. According to the three-dimensional structure, the eye position falls into one of three cases: eyes within the extent of the screen, eyes outside the screen on its left, and eyes outside the screen on its right. The size and relative position of the A-pillar blind area are then obtained from the top views.
FIG. 4 is a top view for the case where the eye coordinates do not extend beyond the A-pillar screen, FIG. 5 is a top view for the case where the eye coordinates extend beyond the A-pillar screen on its left, and FIG. 6 is a top view for the case where the eye coordinates extend beyond the A-pillar screen on its right.
In FIGS. 4, 5 and 6, A represents the position of the eyes, C represents the position where the camera is placed, and BC represents the top-view extent of the A-pillar display screen. AD represents the leftmost line of sight that remains visible once the view is blocked by the A-pillar display screen, and AE represents the rightmost such line of sight. DE represents the area occluded by the A-pillar, and FG represents the area that the wide-angle camera C can capture. I indicates whether the position of the eye A relative to the screen lies within the screen, and K is the intersection of the eye with the screen.
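The three cases can be distinguished with a simple comparison, sketched below. The sketch assumes that the eye position and the two screen edges are expressed along one shared horizontal axis of the top view; the parameter names are hypothetical and the description does not prescribe this particular test.

```python
def eye_screen_relation(eye_x: float, screen_left_x: float,
                        screen_right_x: float) -> str:
    """Classify the eye position relative to the A-pillar screen extent.

    All three values are assumed to lie on the same horizontal axis
    of the top view, with screen_left_x <= screen_right_x.
    """
    if eye_x < screen_left_x:
        return "left"    # eyes outside the screen on its left (FIG. 5 case)
    if eye_x > screen_right_x:
        return "right"   # eyes outside the screen on its right (FIG. 6 case)
    return "within"      # eyes within the screen extent (FIG. 4 case)


print(eye_screen_relation(0.05, 0.0, 0.2))
print(eye_screen_relation(-0.10, 0.0, 0.2))
print(eye_screen_relation(0.35, 0.0, 0.2))
```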
From the three top views, the occluded area can be obtained from the three-dimensional coordinates of the two eyes and the coordinates of the A-pillar screen. Taking the first case, shown in FIG. 4, as an example, the calculation formulas are as follows:
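$\angle KAC = \arccos(KA/KC)$;
$\angle KAB = \arccos(KA/KB)$;
$AE = AI/\cos(\angle KAC)$; $AD = AI/\cos(\angle KAB)$;
$IE = AI \cdot \tan(\angle KAC)$; $ID = AI \cdot \tan(\angle KAB)$.
The length CD is then obtained from the law of cosines:
$CD^2 = AD^2 + AC^2 - 2 \cdot AD \cdot AC \cdot \cos(\angle BAC)$;
$\angle DCE = \arccos\left(\dfrac{CD^2 + CE^2 - DE^2}{2 \cdot DC \cdot CE}\right)$;
$\angle ECG = \angle BCA - (180^\circ - \angle FCG)/2$, where $\angle FCG$ is the wide-angle field of view of the camera;
$\angle FCD = \angle FCG - \angle DCE - \angle ECG$.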
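The law-of-cosines steps can be checked numerically with a short sketch such as the one below. It uses the standard form of the law of cosines (including the factor 2) and works in degrees; the function names are hypothetical, and the clamping of the cosine value is an added safeguard against rounding error rather than part of the described method.

```python
import math


def side_from_law_of_cosines(a: float, b: float, angle_deg: float) -> float:
    """Length of the side opposite angle_deg, given the two adjacent sides.

    For CD this is called with a = AD, b = AC and the angle BAC.
    """
    return math.sqrt(a * a + b * b
                     - 2.0 * a * b * math.cos(math.radians(angle_deg)))


def angle_from_sides(adj1: float, adj2: float, opposite: float) -> float:
    """Angle in degrees between adj1 and adj2, e.g. DCE from CD, CE and DE."""
    cos_val = (adj1 ** 2 + adj2 ** 2 - opposite ** 2) / (2.0 * adj1 * adj2)
    cos_val = max(-1.0, min(1.0, cos_val))  # guard against rounding error
    return math.degrees(math.acos(cos_val))


def blind_area_start_angle(fcg_deg: float, dce_deg: float,
                           ecg_deg: float) -> float:
    """Starting angle FCD = FCG - DCE - ECG, all in degrees."""
    return fcg_deg - dce_deg - ecg_deg


# Toy values only: a 3-4-5 right triangle and a 120 degree camera.
cd = side_from_law_of_cosines(3.0, 4.0, 90.0)
dce = angle_from_sides(3.0, 4.0, 5.0)
fcd = blind_area_start_angle(120.0, 30.0, 20.0)
print(cd, dce, fcd)
```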
Among the above angles, ∠FCD is the starting angle of the blind area in the image and ∠DCE is the angular range of the blind area. Given the wide-angle field of view T (in degrees) of the A-pillar camera and its resolution H × W, expanding outwards from the center of the image in the horizontal direction gives the pixel position of the blind area range in the image, and the rectangular area Rect(x, y, width, height) is cut out.
where:
Rect.x = FCD / T * W
Rect.width = DCE / T * W
Rect.height = A-pillar resolution H_A
Rect.y = W - Rect.height
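The horizontal part of this mapping can be sketched as follows, converting the two angles into a pixel column and a pixel width by their proportion of the camera field of view T. Rounding to whole pixels and clamping to the image bounds are assumptions added for the example; the function name is hypothetical, and the vertical extent of the rectangle is left out of the sketch.

```python
from typing import Tuple


def blind_area_columns(fcd_deg: float, dce_deg: float,
                       fov_deg: float, image_width: int) -> Tuple[int, int]:
    """Return (x, width) in pixels for the blind area.

    x     = FCD / T * W
    width = DCE / T * W
    Rounding and clamping are additions of this sketch.
    """
    x = int(round(fcd_deg / fov_deg * image_width))
    width = int(round(dce_deg / fov_deg * image_width))
    x = max(0, min(x, image_width))
    width = max(0, min(width, image_width - x))
    return x, width


# Toy values: 120 degree camera, 1920 pixel wide image.
x, width = blind_area_columns(30.0, 20.0, 120.0, 1920)
print(x, width)  # 480 320
```

With an image held as a row-major array, the resulting columns could then be sliced as `image[:, x:x + width]`, with the vertical bounds chosen separately.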
Finally, once the three-dimensional coordinates of the two eyes have been obtained, the coordinate data are passed to the blind area cutting method based on the three-dimensional reconstruction blind area estimation. The environment image of the area behind the A-pillar, captured in real time, is cut accordingly and output directly to the A-pillar display screen, realizing real-time display.
In another aspect, referring to FIG. 7, the invention further provides a system for displaying an automobile A-pillar blind area in real time based on three-dimensional coordinates of human eyes. The system comprises a first image acquisition device 1, a second image acquisition device 2, a processor 3 and a display screen 4, and the first image acquisition device 1, the second image acquisition device 2 and the display screen 4 are all electrically connected with the processor 3;
the first image acquisition device 1 is arranged at the exact center of the A-pillar on the inner side of the automobile and is used for acquiring a head image of a driver in real time, and the second image acquisition device 2 is arranged on the A-pillar on the outer side of the automobile and is used for acquiring an environment image outside the A-pillar in real time;
the processor 3 is used for detecting a face image from the head image; performing facial landmark detection on the face image to obtain facial feature points; performing three-dimensional pose estimation on the face image to obtain a rotation matrix and a translation vector for converting a standard face coordinate system into a camera coordinate system; solving three-dimensional coordinates of the face in the face image according to the facial feature points, the rotation matrix and the translation vector, and acquiring three-dimensional coordinates of the two eyes from the three-dimensional coordinates of the face according to the positions of the eyes among the facial feature points; and cutting out a blind area image from the environment image through a three-dimensional reconstruction blind area algorithm according to the three-dimensional coordinates of the two eyes;
the display screen 4 is arranged on the A-pillar on the inner side of the automobile and is used for displaying the blind area image in real time.
Preferably, the second image acquisition device 2 in this embodiment may be a wide-angle camera.
When the system of this embodiment is set up, a monocular camera (the first image acquisition device) is installed at the exact center of the automobile A-pillar to acquire the three-dimensional coordinates of the human eyes in real time, and a monocular camera (the second image acquisition device) is installed on the left side of the A-pillar to monitor the outside of the automobile in real time and capture the area occluded by the A-pillar. The two cameras operate simultaneously. As soon as a face is detected, the three-dimensional eye data are obtained; the occluded blind area is then extracted by image cropping according to the three-dimensional reconstruction blind area calculation method, and that area is displayed on the A-pillar display screen in real time.
In this embodiment, camera C1 (the first image acquisition device) is fixed in the middle of the A-pillar display screen and camera C2 (the second image acquisition device) is fixed on the left side of the A-pillar display screen. The middle monocular camera C1 is calibrated in advance with MATLAB to obtain the intrinsic parameter matrix of the camera. Camera C1 is turned on to acquire the three-dimensional coordinates of the human eyes in real time; the detected face image is shown in FIG. 8. The processor then acquires the facial landmarks and their pixel coordinates, as shown in FIG. 9. Next, eye normalization is performed and the pixel coordinates of the eyes are obtained. 3D detection is then performed on the face to obtain three-dimensional coordinate data; the image obtained by the 3D detection is shown in FIG. 10. The blind area image displayed when the face is simulated in the middle is shown in FIG. 11, the image displayed when the face is simulated on the left side is shown in FIG. 12, and the image displayed when the face is simulated on the right side is shown in FIG. 13.
In the A-pillar blind area display scheme disclosed by the invention, the eye coordinates are first obtained through an appearance-based eye detection method and the MPIIGaze data set, and the three-dimensional coordinates of the eyes are obtained through MATLAB camera calibration and three-dimensional face reconstruction with a monocular camera. Based on the three-dimensional eye coordinates obtained by the front monocular camera, the content captured by the rear camera is cut in real time by the three-dimensional blind area reconstruction algorithm and output to the display screen, thereby displaying the blind area. Blind area display and eye detection can both run at 30 fps, the maximum frame rate of the webcam, which meets the real-time requirement when the environment changes.
The embodiments in this description are described in a progressive manner: each embodiment focuses on its differences from the others, and the same or similar parts of the embodiments can be referred to one another. Because the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is brief, and the relevant points can be found in the description of the method.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.