Fall detection system and fall detection method based on a depth image
1. A fall detection system, comprising:
an image reading device that reads and preprocesses a depth image; and
a coordinate processing unit that receives the depth image information from the image reading device, calculates and processes joint point coordinate information, calculates the velocity and acceleration of each joint point from the coordinate information, and judges from this characteristic information whether a fall has occurred.
2. The fall detection system according to claim 1, wherein the image reading device comprises:
a filtering module that performs median filtering on the depth image information to remove noise;
a cutting module that cuts the depth image according to an adaptive depth threshold to coarsely remove the background region, further remove noise interference, and set the background to black so as to obtain a foreground region;
a morphological filtering module that removes small holes in the foreground region by morphological filtering; and
a region setting module that searches the foreground region for a moving human body region according to a motion detection algorithm and takes the human body region as a new foreground region.
3. The fall detection system according to claim 2, wherein the coordinate processing unit comprises:
a random forest classifier that classifies the pixel points of the human body region obtained by the region setting module and determines which joint part of the human body each pixel point belongs to;
a screening module that screens out the pixel points of each joint part;
a calculation module that acquires the pixel points of each joint and calculates the pixel coordinates of the joint points;
a conversion module that converts the joint point pixel coordinates into three-dimensional space coordinates according to the intrinsic parameters of the depth camera of the image reading device, and records the time stamp corresponding to each image frame;
a recording module that records the three-dimensional space coordinates of each joint point;
a velocity calculation module that calculates the movement velocity of each joint point and the overall average velocity from the coordinate differences and time differences of the joint points;
an acceleration calculation module that calculates velocity differences from the calculated inter-frame velocities so as to obtain the acceleration; and
an SVM classifier that judges whether a fall has occurred by combining the coordinate, velocity and acceleration information of each joint point.
4. A fall detection method based on a depth image, characterized by comprising the following steps:
(a) reading a depth image and preprocessing it to obtain a depth image containing only the human body region;
(b) calculating the coordinates of the human body joint points from the depth information, and returning and recording the coordinate information and the time stamp;
(c) calculating the movement velocity and acceleration of the joint points from the joint point coordinates and time stamps obtained in step (b);
(d) judging whether a fall event has occurred according to the joint point coordinates, velocity and acceleration obtained in steps (b) and (c).
5. The fall detection method according to claim 4, wherein in step (a), the depth image preprocessing comprises the following specific steps:
(a1) performing median filtering on the depth image to remove noise;
(a2) cutting the depth image according to an adaptive depth threshold to coarsely remove the background region, further remove noise interference, and set the background to black;
(a3) removing small holes in the foreground region by morphological filtering;
(a4) searching the foreground region for a moving human body region according to a motion detection algorithm, and taking the human body region as the new foreground region.
6. The fall detection method according to claim 5, wherein in step (b) the calculation of the coordinates of the human body joint points comprises the following specific steps:
(b1) classifying the pixel points of the human body region obtained in step (a4) with a random forest classifier, and determining which joint part of the human body each pixel point belongs to;
(b2) screening the points whose prediction probability in step (b1) is greater than a set threshold, obtaining pixel points of each joint part with different densities;
(b3) clustering the pixel points belonging to the same part with the Mean Shift algorithm to obtain the pixel coordinates of each joint point;
(b4) converting the pixel coordinates obtained in step (b3) into the three-dimensional space coordinates of the joint points according to the intrinsic parameters of the depth camera of the image reading device, and recording the time stamp corresponding to the image frame.
7. The fall detection method according to claim 6, wherein in step (c) the calculation of the joint point movement information comprises the following specific steps:
(c1) selecting the coordinates of three skeletal points of the human body obtained in step (b4), namely the neck, the base of the spine and the left ankle, maintaining a queue, and recording the three-dimensional space coordinates and time stamps of these joint points over the latest 10 frames;
(c2) calculating coordinate differences and time differences from the information recorded in step (c1), thereby obtaining the frame-to-frame movement velocity of the joint points within the latest 10 frames and the average velocity over the whole 10 frames;
(c3) calculating velocity differences from the inter-frame velocities obtained in step (c2), thereby obtaining the acceleration.
8. The fall detection method according to claim 7, wherein in step (d) the determination comprises the following specific steps:
(d1) taking the joint point coordinate positions obtained in step (b) as height features, and taking the joint point velocity and acceleration features calculated in step (c);
(d2) fusing these three features of the joint points over the latest 10 consecutive frames as the classification basis, classifying with an SVM classifier, and judging whether a fall event has occurred.
Background
In recent years, the aging of China's population has become increasingly severe, and the health and everyday safety of the elderly have drawn wide attention from society. As people age, the risk of accidental falls rises year by year, and the social and economic costs incurred after a fall are also high, so fall detection technology is essential.
Currently, there are three main types of fall detection systems: those based on wearable devices, those based on equipment deployed in the environment, and those based on vision technology. A wearable-device-based fall detection system detects parameters such as the velocity and acceleration of human motion through wearable sensors and judges whether a fall has occurred against preset thresholds. Because the monitoring device is carried on the body, the monitoring range is not limited; however, power supply is a problem, and wearing comfort must be considered.
Environment-deployed fall detection systems generally detect the impact generated during a fall with pressure-sensitive sensors, for example pressure-sensitive floor tiles. Their advantage is that no extra equipment needs to be carried and the user's activities are barely affected; their disadvantage is that the monitoring area is limited to rooms where the sensors are laid, and many sensors are usually required, so the cost is high.
A vision-based fall detection system monitors human activity through a camera and detects whether a fall has occurred. However, an RGB camera risks privacy disclosure in private places such as bathrooms and bedrooms. Many elderly people are unwilling to install a camera in such places, yet these are precisely the scenes where accidents such as falls are most likely, especially since a bathroom floor may be slippery. Monitoring with a depth camera can reasonably resolve this contradiction: only the contour features of the monitored person are obtained, and no color image information can be recovered from the depth image (except for binocular depth cameras, which are in essence two RGB cameras), so personal privacy is protected.
Disclosure of Invention
The object of the invention is to provide a fall detection system and a fall detection method based on a depth image, which use the depth image for motion detection, behavior recognition and fall detection, have the unique advantage of privacy protection, can perform round-the-clock indoor monitoring in privacy-sensitive settings, and are not affected by illumination.
In order to solve the above technical problems, the technical solution of the invention is as follows:
A fall detection system, comprising:
an image reading device that reads and preprocesses a depth image; and
a coordinate processing unit that receives the depth image information from the image reading device, calculates and processes joint point coordinate information, calculates the velocity and acceleration of each joint point from the coordinate information, and judges from this characteristic information whether a fall has occurred.
Preferably, the image reading device comprises:
a filtering module that performs median filtering on the depth image information to remove noise;
a cutting module that cuts the depth image according to an adaptive depth threshold to coarsely remove the background region, further remove noise interference, and set the background to black so as to obtain a foreground region;
a morphological filtering module that removes small holes in the foreground region by morphological filtering; and
a region setting module that searches the foreground region for a moving human body region according to a motion detection algorithm and takes the human body region as a new foreground region.
Preferably, the coordinate processing unit comprises:
a random forest classifier that classifies the pixel points of the human body region obtained by the region setting module and determines which joint part of the human body each pixel point belongs to;
a screening module that screens out the pixel points of each joint part;
a calculation module that acquires the pixel points of each joint and calculates the pixel coordinates of the joint points;
a conversion module that converts the joint point pixel coordinates into three-dimensional space coordinates according to the intrinsic parameters of the depth camera of the image reading device, and records the time stamp corresponding to each image frame;
a recording module that records the three-dimensional space coordinates of each joint point;
a velocity calculation module that calculates the movement velocity of each joint point and the overall average velocity from the coordinate differences and time differences of the joint points;
an acceleration calculation module that calculates velocity differences from the calculated inter-frame velocities so as to obtain the acceleration; and
an SVM classifier that judges whether a fall has occurred by combining the coordinate, velocity and acceleration information of each joint point.
Further, the invention also provides a fall detection method based on depth image motion detection and behavior recognition, which comprises the following steps:
(a) reading a depth image and preprocessing it to obtain a depth image containing only the human body region;
(b) calculating the coordinates of the human body joint points from the depth information, and returning and recording the coordinate information and the time stamp;
(c) calculating the movement velocity and acceleration of the joint points from the joint point coordinates and time stamps obtained in step (b);
(d) judging whether a fall event has occurred according to the joint point coordinates, velocity and acceleration obtained in steps (b) and (c).
Preferably, in step (a), the depth image preprocessing comprises the following specific steps:
(a1) performing median filtering on the depth image to remove noise;
(a2) cutting the depth image according to an adaptive depth threshold to coarsely remove the background region, further remove noise interference, and set the background to black;
(a3) removing small holes in the foreground region by morphological filtering;
(a4) searching the foreground region for a moving human body region according to a motion detection algorithm, and taking the human body region as the new foreground region.
Preferably, in step (b), the calculation of the coordinates of the human body joint points comprises the following specific steps:
(b1) classifying the pixel points of the human body region obtained in step (a4) with a random forest classifier, and determining which joint part of the human body each pixel point belongs to;
(b2) screening the points whose prediction probability in step (b1) is greater than a set threshold, obtaining pixel points of each joint part with different densities;
(b3) clustering the pixel points belonging to the same part with the Mean Shift algorithm to obtain the pixel coordinates of each joint point;
(b4) converting the pixel coordinates obtained in step (b3) into the three-dimensional space coordinates of the joint points according to the intrinsic parameters of the depth camera of the image reading device, and recording the time stamp corresponding to the image frame.
Preferably, in step (c), the joint point movement information is calculated by the following specific steps:
(c1) selecting the coordinates of three skeletal points of the human body obtained in step (b4), namely the neck, the base of the spine and the left ankle, maintaining a queue, and recording the three-dimensional space coordinates and time stamps of these joint points over the latest 10 frames;
(c2) calculating coordinate differences and time differences from the information recorded in step (c1), thereby obtaining the frame-to-frame movement velocity of the joint points within the latest 10 frames and the average velocity over the whole 10 frames;
(c3) calculating velocity differences from the inter-frame velocities obtained in step (c2), thereby obtaining the acceleration.
Preferably, in step (d), the determination comprises the following specific steps:
(d1) taking the joint point coordinate positions obtained in step (b) as height features, and taking the joint point velocity and acceleration features calculated in step (c);
(d2) fusing these three features of the joint points over the latest 10 consecutive frames as the classification basis, classifying with an SVM classifier, and judging whether a fall event has occurred.
Drawings
Fig. 1 is a block diagram of a fall detection system according to the invention;
Fig. 2 is a basic flowchart of the fall detection method according to the invention;
Fig. 3 is a basic flowchart of the specific steps of depth image preprocessing in the fall detection method according to the present invention;
Fig. 4 is a basic flowchart of the specific steps of human body joint point coordinate calculation in the fall detection method according to the present invention;
Fig. 5 is a basic flowchart of the specific steps of determining whether a fall occurs in the fall detection method according to the present invention.
Detailed Description
The following further describes embodiments of the present invention with reference to the drawings. It should be noted that the description of the embodiments is provided to help understanding of the present invention, but the present invention is not limited thereto. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
As shown in Fig. 1, the present invention provides a fall detection system that includes an image reading device 10. The image reading device 10 reads the current depth image and preprocesses it, the preprocessing yielding a depth image that contains only the human body region.
A depth image can be thought of as an ordinary RGB three-channel color image plus a depth map; in 3D computer graphics, a depth map is an image or image channel containing information about the distance from the viewpoint to the surfaces of scene objects. A depth map resembles a grayscale image, except that each pixel value is the actual distance from the sensor to the object. Usually the RGB image and the depth image are registered, so their pixel points correspond one to one.
Household monitoring devices generally use a camera, but an RGB camera carries a risk of privacy disclosure and depends heavily on ambient light when working, making it difficult to function normally in dim light or at night. The image reading device 10 of the present invention reads and processes the current depth image, imposes few constraints on lighting and time of day, and is therefore better suited to different environments and conditions.
The fall detection system further comprises a coordinate processing unit 20. The coordinate processing unit 20 receives the depth image from the image reading device 10, performs coordinate calculation on the current depth image, computes the coordinates of the human body joint points, and returns and records the coordinate information and the time stamps. It then calculates the velocity and acceleration of the current joint points from the computed joint point coordinates and time stamps.
It will be understood that the velocities and accelerations of the joint points differ between normal walking and falling, so whether a fall event has occurred can be judged from the coordinates, velocities and accelerations of the joint points.
In the above process, the depth image preprocessing specifically consists of median filtering, depth cutting, morphological filtering, motion detection and human body region search.
Specifically, the image reading device 10 includes a filtering module 11 that performs median filtering on the depth image information to remove noise;
the image reading device 10 further includes a cutting module 12 that cuts the depth image according to an adaptive depth threshold to coarsely remove the background region, further remove noise interference, and set the background to black so as to obtain a foreground region;
the image reading device 10 further includes a morphological filtering module 13 that removes small holes in the foreground region by morphological filtering;
the image reading device 10 further includes a region setting module 14 that finds the moving human body region in the foreground region according to a motion detection algorithm and takes the human body region as the new foreground region.
Through these operations, a depth image containing only the human body is obtained.
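For illustration, the following is a minimal sketch of such a preprocessing pipeline in Python with OpenCV. The embodiment does not specify the filter kernel sizes, the depth band used as the adaptive threshold, or the motion detection algorithm, so the values below and the use of a MOG2 background subtractor as the motion detector are assumptions rather than the claimed implementation.

```python
import cv2
import numpy as np

def preprocess_depth_frame(depth, bg_subtractor, depth_band_mm=1500):
    """Keep only the moving human body region of a 16-bit depth frame (mm).

    depth_band_mm is an illustrative adaptive threshold: a band of depths
    starting at the nearest measured object.
    """
    # (a1) median filtering to suppress speckle noise
    filtered = cv2.medianBlur(depth, 5)

    # (a2) adaptive depth threshold: keep a depth band around the nearest
    # object and set the rest of the background to black (zero)
    nonzero = filtered[filtered > 0]
    if nonzero.size == 0:
        return np.zeros_like(depth)
    near = int(nonzero.min())
    foreground = np.where(
        (filtered >= near) & (filtered <= near + depth_band_mm), filtered, 0
    ).astype(depth.dtype)

    # (a3) morphological closing to remove small holes in the foreground
    mask = (foreground > 0).astype(np.uint8) * 255
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)

    # (a4) motion detection: a background subtractor stands in for the
    # unspecified motion detection algorithm; keep the largest moving blob
    gray = cv2.convertScaleAbs(filtered, alpha=255.0 / 8000)  # assume <= 8 m range
    motion = bg_subtractor.apply(gray)
    _, motion = cv2.threshold(motion, 200, 255, cv2.THRESH_BINARY)
    motion = cv2.bitwise_and(motion, mask)
    contours, _ = cv2.findContours(motion, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return np.zeros_like(depth)
    body = max(contours, key=cv2.contourArea)
    body_mask = np.zeros_like(mask)
    cv2.drawContours(body_mask, [body], -1, 255, thickness=cv2.FILLED)
    return np.where(body_mask > 0, foreground, 0).astype(depth.dtype)

# The background subtractor is created once and reused across frames:
# bg_subtractor = cv2.createBackgroundSubtractorMOG2()
# body_depth = preprocess_depth_frame(depth_frame, bg_subtractor)
```

The returned frame keeps the measured depth values inside the detected human body region and sets every other pixel to black (zero), corresponding to steps (a1) to (a4).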
In the above process, the coordinate processing of the depth image specifically consists of calculating the coordinates of the human body joint points and calculating the movement information of the joint points.
Specifically, the coordinate processing unit 20 includes a random forest classifier 21 that classifies the pixel points of the human body region obtained by the region setting module 14 and determines which joint part of the human body each pixel point belongs to.
It should be noted that a random forest classifier trains and predicts samples with a plurality of decision trees; by combining the decision trees, a random forest achieves considerably better accuracy than a single decision tree.
The coordinate processing unit 20 further includes a screening module 22 that screens out the pixel points of each joint part; specifically, it retains the points whose prediction probability is greater than a set threshold, obtaining pixel points of each joint part with different densities.
The coordinate processing unit 20 further comprises a calculation module 23 that acquires the pixel points of each joint and calculates the pixel coordinates of the joint points; specifically, it clusters the pixel points belonging to the same part with the Mean Shift algorithm to obtain the pixel coordinates of the joint points.
It is worth mentioning that the Mean Shift algorithm, also called the mean-shift algorithm, is widely applied to clustering, image smoothing, segmentation and tracking. Mean Shift is an iterative procedure: it first computes the mean shift vector within the region of interest, moves the center of the region to the computed centroid, and then continues from that centroid as the new starting point until a termination condition is met. During the iterations the window keeps shifting towards positions of higher density and stops when it reaches the position where the density of pixel points is highest.
The coordinate processing unit 20 further includes a conversion module 24 that converts the joint point pixel coordinates into three-dimensional space coordinates according to the intrinsic parameters of the depth camera of the image reading device 10, and records the time stamp corresponding to the image frame.
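This pixel-to-space conversion can be sketched with the standard pinhole camera model; the intrinsic values below are placeholders for illustration only, since the actual values come from the calibration of the depth camera used in the embodiment.

```python
import time

def pixel_to_camera_coords(u, v, depth_mm, fx, fy, cx, cy):
    """Back-project a joint's pixel coordinate (u, v) and its depth value
    (in millimeters) into 3D camera-space coordinates (in meters) using the
    pinhole model with intrinsics fx, fy, cx, cy."""
    z = depth_mm / 1000.0
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return (x, y, z)

# Placeholder intrinsics (real values come from camera calibration):
FX, FY, CX, CY = 365.0, 365.0, 256.0, 212.0

# joint_xyz = pixel_to_camera_coords(u, v, depth[int(v), int(u)], FX, FY, CX, CY)
# timestamp = time.time()  # time stamp recorded for this image frame
```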
More specifically, the coordinate processing unit 20 includes a recording module 25 that records the three-dimensional space coordinates of the joint points, for example the skeletal points of the neck, the base of the spine and the left ankle, maintains a queue, and records the three-dimensional space coordinates and time stamps of these joint points; the number of frames recorded is, by way of example, the latest 10 frames.
The coordinate processing unit 20 further comprises a velocity calculation module 26 that calculates the frame-to-frame movement velocity of the joint points within the latest 10 frames and the average velocity over those 10 frames from the coordinate differences and time differences of the joint points;
the coordinate processing unit 20 further includes an acceleration calculation module 27 that calculates velocity differences from the computed inter-frame velocities so as to obtain the acceleration.
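A minimal sketch of the queue and the velocity/acceleration computation follows. The class name JointMotionTracker is illustrative, and taking the norm of the displacement as the speed and differencing adjacent inter-frame speeds for the acceleration are assumptions consistent with, but not dictated by, the description above.

```python
from collections import deque
import numpy as np

WINDOW = 10  # the embodiment records the latest 10 frames as an example

class JointMotionTracker:
    """Maintain a queue of (xyz, timestamp) samples for one joint and derive
    inter-frame velocities, the average velocity and accelerations."""

    def __init__(self):
        self.samples = deque(maxlen=WINDOW)

    def add(self, xyz, t):
        self.samples.append((np.asarray(xyz, dtype=float), t))

    def velocities(self):
        # speed between consecutive frames: |p_(i+1) - p_i| / (t_(i+1) - t_i)
        s = list(self.samples)
        return [np.linalg.norm(p2 - p1) / (t2 - t1)
                for (p1, t1), (p2, t2) in zip(s, s[1:]) if t2 > t1]

    def average_velocity(self):
        v = self.velocities()
        return sum(v) / len(v) if v else 0.0

    def accelerations(self):
        # acceleration from the difference of consecutive inter-frame speeds
        s = list(self.samples)
        v = self.velocities()
        dts = [t2 - t1 for (_, t1), (_, t2) in zip(s, s[1:]) if t2 > t1]
        return [(v2 - v1) / dt for v1, v2, dt in zip(v, v[1:], dts[1:])]

# One tracker per monitored joint (neck, spine base, left ankle):
# trackers = {name: JointMotionTracker() for name in ("neck", "spine_base", "left_ankle")}
# trackers["neck"].add(joint_xyz, timestamp)
```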
the coordinate processing unit 20 further includes an SVM classifier 28, which is used to determine whether a fall occurs by combining the information of the velocity and the acceleration of the coordinates of each joint point, and the specific determination process is as follows: calculating to obtain joint point speed characteristics and acceleration characteristics according to the obtained joint point coordinate position as height characteristics; three features of continuous 10 frames of joint points are fused to be used as a classification basis, and an SVM classifier 28 is used for classification to judge whether a falling event occurs.
It should be noted that the SVM classifier 28, also called a support vector machine, is a two-class model whose basic model is a linear classifier with maximum spacing defined in the feature space, and whose basic idea is to solve the separating hyperplane with maximum geometric spacing that can correctly divide the training data set.
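A hedged sketch of the feature fusion and SVM classification is given below with scikit-learn, reusing the JointMotionTracker sketched above. The embodiment does not state which coordinate axis serves as the height feature, how the fused vector is laid out, or which kernel is used, so the camera-space y coordinate as height, the zero-padded fixed-length vector and the RBF kernel are assumptions; train_vectors and train_labels stand for hypothetical labeled fall / non-fall windows.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

JOINTS = ("neck", "spine_base", "left_ankle")

def fuse_features(trackers):
    """Concatenate height (here: camera-space y coordinate), inter-frame
    velocity and acceleration of each tracked joint over the latest 10
    frames into one fixed-length vector, zero-padded while the queue fills."""
    parts = []
    for name in JOINTS:
        tr = trackers[name]
        heights = [p[1] for p, _ in tr.samples]
        vels = tr.velocities()
        accs = tr.accelerations()
        for seq, length in ((heights, 10), (vels, 9), (accs, 8)):
            parts.append(np.pad(np.asarray(seq, dtype=float), (length - len(seq), 0)))
    return np.concatenate(parts)

# Trained offline on hypothetical labeled windows (1 = fall, 0 = no fall):
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
# clf.fit(np.stack(train_vectors), train_labels)

# At run time, classify the fused features of the latest 10 frames:
# is_fall = clf.predict(fuse_features(trackers).reshape(1, -1))[0] == 1
```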
As shown in Figs. 2 to 5, in accordance with the operations described above, the present invention further provides a fall detection method based on depth image motion detection and behavior recognition, comprising the following steps:
(a) reading a depth image and preprocessing it to obtain a depth image containing only the human body region;
(b) calculating the coordinates of the human body joint points from the depth information, and returning and recording the coordinate information and the time stamp;
(c) calculating the movement velocity and acceleration of the joint points from the joint point coordinates and time stamps obtained in step (b);
(d) judging whether a fall event has occurred according to the joint point coordinates, velocity and acceleration obtained in steps (b) and (c).
Specifically, in step (a), the depth image preprocessing comprises the following specific steps:
(a1) performing median filtering on the depth image to remove noise;
(a2) cutting the depth image according to an adaptive depth threshold to coarsely remove the background region, further remove noise interference, and set the background to black;
(a3) removing small holes in the foreground region by morphological filtering;
(a4) searching the foreground region for a moving human body region according to a motion detection algorithm, and taking the human body region as the new foreground region.
Specifically, in step (b), the calculation of the coordinates of the human body joint points comprises the following specific steps:
(b1) classifying the pixel points of the human body region obtained in step (a4) with a random forest classifier, and determining which joint part of the human body each pixel point belongs to;
(b2) screening the points whose prediction probability in step (b1) is greater than a set threshold, obtaining pixel points of each joint part with different densities;
(b3) clustering the pixel points belonging to the same part with the Mean Shift algorithm to obtain the pixel coordinates of each joint point;
(b4) converting the pixel coordinates obtained in step (b3) into the three-dimensional space coordinates of the joint points according to the intrinsic parameters of the depth camera, and recording the time stamp corresponding to the image frame.
Specifically, in step (c), the joint point movement information is calculated as follows:
(c1) selecting the coordinates of three skeletal points of the human body obtained in step (b4), namely the neck, the base of the spine and the left ankle, maintaining a queue, and recording the three-dimensional space coordinates and time stamps of these joint points over the latest 10 frames;
(c2) calculating coordinate differences and time differences from the information recorded in step (c1), thereby obtaining the frame-to-frame movement velocity of the joint points within the latest 10 frames and the average velocity over the whole 10 frames;
(c3) calculating velocity differences from the inter-frame velocities obtained in step (c2), thereby obtaining the acceleration.
specifically, in step (d), the specific determination steps are as follows:
(d1) calculating to obtain joint point speed characteristics and acceleration characteristics according to the joint point coordinate position obtained in the step (b) as height characteristics in the step (c);
(d2) and fusing three characteristics of the latest continuous 10 frames of joint points as a classification basis, classifying by using an SVM classifier, and judging whether a falling event occurs.
The embodiments of the present invention have been described in detail with reference to the accompanying drawings, but the present invention is not limited to the described embodiments. Those skilled in the art may make various changes, modifications, substitutions and alterations to these embodiments without departing from the principles and spirit of the invention, and such variations still fall within the scope of protection of the invention.