Passenger flow detection method, system, device and medium based on convolutional neural network
1. A passenger flow detection method based on a convolutional neural network is characterized by comprising the following steps:
acquiring video data acquired in real time;
according to the video data, adopting a YOLOv4-tiny network model to carry out passenger flow detection to obtain a passenger flow detection result;
the YOLOv4-tiny network model comprises an input layer, a backbone framework, a network layer and an output layer, wherein the input layer, the backbone framework, the network layer and the output layer are sequentially connected; the backbone framework comprises a convolution calculation module and a ResBlock-D module; the convolution calculation module is used for carrying out preliminary feature processing on the video data; and the ResBlock-D module is used for carrying out feature extraction and calculation on the video data after the preliminary feature processing.
2. The convolutional neural network-based passenger flow detection method of claim 1, wherein the backbone framework further comprises a residual module, and the residual module is used for assisting the ResBlock-D module in feature extraction and calculation of the video data after preliminary feature processing.
3. The convolutional neural network-based passenger flow detection method of claim 1, wherein the input layer is configured to perform angle rotation, color adjustment, size adjustment and mosaic data enhancement on the image corresponding to the video data.
4. The convolutional neural network-based passenger flow detection method of claim 1, wherein the network layer comprises an up-sampling module and a down-sampling module; the up-sampling module is used for sampling the image processed by the backbone framework and transmitting strong semantic features from top to bottom; and the down-sampling module is used for sampling the image processed by the up-sampling module and transmitting localization features from bottom to top.
5. The convolutional neural network-based passenger flow detection method as claimed in any one of claims 1 to 4, further comprising a training step of the YOLOv4-tiny network model before the step of passenger flow detection using the YOLOv4-tiny network model, wherein the training step comprises:
acquiring image data to be processed;
marking the head part of the image data to obtain label data;
training the YOLOv4-tiny network model using the image data and the label data;
acquiring test data;
testing the trained YOLOv4-tiny network model by using the test data;
when the accuracy of the test result is less than or equal to a preset value, continuing to execute the training process of the YOLOv4-tiny network model and adjusting the parameters of the YOLOv4-tiny network model;
and when the accuracy of the test result is greater than a preset value, stopping the training process of the YOLOv4-tiny network model.
6. A passenger flow detection system based on a convolutional neural network, comprising:
the data acquisition unit is used for acquiring video data in a preset acquisition area in real time;
a data processing unit for performing the convolutional neural network-based passenger flow detection method of any one of claims 1-5;
and the data display unit is used for counting passenger flow detection results output by the data processing unit and displaying the counting results.
7. The convolutional neural network-based passenger flow detection system of claim 6, wherein the data acquisition unit comprises a plurality of cameras, and the shooting areas of the cameras cover the preset acquisition area.
8. The convolutional neural network-based passenger flow detection system of claim 6, wherein the data processing unit comprises an ARM embedded platform, and the ARM embedded platform is in wireless communication with the plurality of cameras and the data display unit, respectively.
9. A passenger flow detection device based on a convolutional neural network, comprising:
at least one memory for storing a program;
at least one processor configured to load the program to perform the convolutional neural network-based passenger flow detection method of any one of claims 1-5.
10. A computer-readable storage medium in which a processor-executable program is stored, wherein the processor-executable program, when executed by a processor, is configured to perform the convolutional neural network-based passenger flow detection method of any one of claims 1-5.
Background
Passenger flow is the purposeful movement of people who use various vehicles to carry out travel activities. As traffic networks continue to develop, passenger flow keeps growing; for example, the ticket hall of a railway station sees a large passenger flow when services such as ticket purchase, ticket collection, ticket changes and consultation are handled on site. As passenger flow grows, an emergency can easily cause casualties and property loss. Passenger flow control is therefore very important in such high-traffic places, and one important means of control is monitoring the number of people in a specified place. At present, the number of people in a specified place is determined in one of two ways. In the first, monitoring personnel enlarge the surveillance video and count the people manually, which imposes a very heavy workload on them. In the second, a detection model such as Fast-RCNN, SSD or YOLOv1 to YOLOv3 is used for monitoring. Model-based detection reduces the workload of the monitoring personnel to some extent, but because these models take a long time to process the data during detection, they cannot quickly provide the monitoring personnel with a valid passenger flow figure for the monitored area.
Disclosure of Invention
The present invention is directed to solving at least one of the problems of the prior art. Therefore, the invention provides a passenger flow detection method, a system, a device and a medium based on a convolutional neural network, which can rapidly provide effective passenger flow of a monitoring area for monitoring personnel.
In a first aspect, an embodiment of the present invention provides a passenger flow detection method based on a convolutional neural network, including the following steps:
acquiring video data acquired in real time;
according to the video data, adopting a YOLOv4-tiny network model to carry out passenger flow detection to obtain a passenger flow detection result;
the YOLOv4-tiny network model comprises an input layer, a backbone framework, a network layer and an output layer, wherein the input layer, the backbone framework, the network layer and the output layer are sequentially connected; the backbone framework comprises a convolution calculation module and a ResBlock-D module; the convolution calculation module is used for carrying out preliminary feature processing on the video data; and the ResBlock-D module is used for carrying out feature extraction and calculation on the video data after the preliminary feature processing.
The passenger flow detection method based on the convolutional neural network provided by the embodiment of the invention has the following beneficial effects:
In this embodiment, passenger flow detection is performed by feeding the video data collected in real time into a YOLOv4-tiny network model composed of the input layer, the backbone framework (comprising the convolution calculation module and the ResBlock-D module), the network layer and the output layer. Monitoring personnel therefore no longer need to count the people in the surveillance video manually, which reduces their workload. In addition, the ResBlock-D module arranged in the backbone framework reduces the complexity of processing the video data and increases the data processing speed, so that a real-time, valid passenger flow figure for the monitored area is provided to the monitoring personnel quickly.
Optionally, the backbone framework further includes a residual module, where the residual module is used to assist the ResBlock-D module in performing feature extraction and calculation on the video data after the preliminary feature processing.
Optionally, the input layer is configured to perform angle rotation, color adjustment, resizing, and mosaic data enhancement on an image corresponding to the video data.
Optionally, the network layer comprises an up-sampling module and a down-sampling module; the up-sampling module is used for sampling the image processed by the backbone framework and transmitting strong semantic features from top to bottom; and the down-sampling module is used for sampling the image processed by the up-sampling module and transmitting localization features from bottom to top.
Optionally, before the step of performing passenger flow detection by using the YOLOv4-tiny network model, the method further includes a training step of the YOLOv4-tiny network model, where the training step includes:
acquiring image data to be processed;
marking the head part of the image data to obtain label data;
training the YOLOv4-tiny network model using the image data and the label data;
acquiring test data;
testing the trained YOLOv4-tiny network model by using the test data;
when the accuracy of the test result is less than or equal to a preset value, continuing to execute the training process of the YOLOv4-tiny network model and adjusting the parameters of the YOLOv4-tiny network model;
and when the accuracy of the test result is greater than a preset value, stopping the training process of the YOLOv4-tiny network model.
In a second aspect, an embodiment of the present invention provides a passenger flow detection system based on a convolutional neural network, including:
the data acquisition unit is used for acquiring video data in a preset acquisition area in real time;
the data processing unit is used for executing the convolutional neural network-based passenger flow detection method provided by the embodiment of the first aspect;
and the data display unit is used for counting passenger flow detection results output by the data processing unit and displaying the counting results.
Optionally, the data acquisition unit includes a plurality of cameras, and the shooting area of the plurality of cameras covers the preset acquisition area.
Optionally, the data processing unit includes an ARM embedded platform, and the ARM embedded platform is in wireless communication with the plurality of cameras and the data display unit, respectively.
In a third aspect, an embodiment of the present invention provides a passenger flow detection apparatus based on a convolutional neural network, including:
at least one memory for storing a program;
and the at least one processor is used for loading the program to execute the convolutional neural network-based passenger flow detection method provided by the embodiment of the first aspect.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium, in which a processor-executable program is stored, where the processor-executable program is used to execute the convolutional neural network-based passenger flow detection method provided in the first aspect.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The invention is further described with reference to the following figures and examples, in which:
FIG. 1 is a flowchart of a method for detecting passenger flow based on a convolutional neural network according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a YOLOv4-tiny network model according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of the backbone framework (Backbone) according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of the network layer (Neck) according to an embodiment of the present invention;
FIG. 5 is a block diagram of the units of a convolutional neural network-based passenger flow detection system according to an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention.
In the description of the present invention, it should be understood that the orientation or positional relationship referred to in the description of the orientation, such as the upper, lower, front, rear, left, right, etc., is based on the orientation or positional relationship shown in the drawings, and is only for convenience of description and simplification of description, and does not indicate or imply that the device or element referred to must have a specific orientation, be constructed and operated in a specific orientation, and thus, should not be construed as limiting the present invention.
In the description of the present invention, "several" means one or more and "a plurality" means two or more; "greater than", "less than", "exceeding" and the like are understood as excluding the stated number, and "above", "below", "within" and the like are understood as including the stated number. Where "first" and "second" are used, they serve only to distinguish technical features and are not to be understood as indicating or implying relative importance, the number of the technical features indicated, or the order of precedence of the technical features indicated.
In the description of the present invention, unless otherwise explicitly limited, terms such as arrangement, installation, connection and the like should be understood in a broad sense, and those skilled in the art can reasonably determine the specific meanings of the above terms in the present invention in combination with the specific contents of the technical solutions.
In the description of the present invention, reference to the description of the terms "one embodiment," "some embodiments," "an illustrative embodiment," "an example," "a specific example," or "some examples," etc., means that a particular feature or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Referring to fig. 1, an embodiment of the present invention provides a passenger flow detection method based on a convolutional neural network, and this embodiment may be applied to a server or a controller corresponding to a monitoring processing platform, where the server or the controller interacts with each terminal device, and the terminal device includes a video device and a display device.
In the application process, the present embodiment includes steps S11 and S12:
S11: acquiring video data collected in real time.
In the embodiment of the application, the video data is real-time video data in a preset acquisition area, and is acquired in real time through a plurality of preset cameras. The preset acquisition area can be areas such as a ticket selling hall of a railway station, a ticket selling hall of a bus station, a shopping mall and an amusement park.
S12: carrying out passenger flow detection by adopting a YOLOv4-tiny network model according to the video data to obtain a passenger flow detection result.
In the embodiment of the application, the video data acquired in real time is input into the YOLOv4-tiny network model, which identifies the number of people in the video data and outputs a passenger flow detection result. The passenger flow detection result can be displayed on the terminal device, and the displayed content includes information such as time, number of people and position.
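Steps S11 and S12 can be sketched as a per-frame loop. The sketch below is illustrative only: `FlowResult`, `detect_passenger_flow`, the `detector` callable and the confidence threshold are hypothetical names and parameters introduced here, and the detector stands in for the trained YOLOv4-tiny model, whose actual interface is not specified in this description.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Iterable, Iterator, List, Tuple

# One detection: (x1, y1, x2, y2, confidence) in pixel coordinates (assumed layout).
Box = Tuple[float, float, float, float, float]


@dataclass
class FlowResult:
    time: datetime                        # when the frame was processed
    count: int                            # number of heads detected
    positions: List[Tuple[float, float]]  # centers of the detected head boxes


def detect_passenger_flow(
    frames: Iterable[object],
    detector: Callable[[object], List[Box]],
    conf_threshold: float = 0.5,          # assumed value, not from the description
    clock: Callable[[], datetime] = datetime.now,
) -> Iterator[FlowResult]:
    """Yield time, head count and head positions for every incoming frame."""
    for frame in frames:
        # Keep only detections the model is reasonably confident about.
        boxes = [b for b in detector(frame) if b[4] >= conf_threshold]
        centers = [((x1 + x2) / 2, (y1 + y2) / 2) for x1, y1, x2, y2, _ in boxes]
        yield FlowResult(time=clock(), count=len(boxes), positions=centers)
```

Because the detector is passed in as a callable, the same loop can be exercised with a stub during testing and with the real model on the monitoring platform.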
Specifically, as shown in FIG. 2, the YOLOv4-tiny network model of this embodiment includes an input layer, a backbone framework, a network layer and an output layer, where an input end of the input layer is used to input the video data obtained in real time, an output end of the input layer is connected to an input end of the backbone framework, an output end of the backbone framework is connected to an input end of the network layer, and an output end of the network layer is connected to an input end of the output layer. The backbone framework comprises a convolution calculation module and a ResBlock-D module, wherein the convolution calculation module is used for carrying out preliminary feature processing on the video data; and the ResBlock-D module is used for carrying out feature extraction and calculation on the video data after the preliminary feature processing.
In this embodiment, the input layer is used to perform angle rotation, color adjustment, resizing and mosaic data enhancement on the image corresponding to the video data. Mosaic data enhancement in particular addresses occlusion of the target object, for example a human head. To suit these scene characteristics, a large number of images of dense crowds, in which each head may be occluded or incomplete, are selected before the model is applied and are used as training data, so that the trained model is more robust and detects occluded objects more accurately.
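Mosaic data enhancement can be illustrated with a simplified sketch, assuming NumPy. Four images are shrunk and tiled into one 2 × 2 training image, and their head boxes are rescaled and shifted accordingly. The fixed center split and nearest-neighbor resizing are simplifications chosen for this example; the function names `resize_nearest` and `mosaic` are hypothetical, and the description does not specify how the mosaic is actually built.

```python
import numpy as np


def resize_nearest(img: np.ndarray, out_h: int, out_w: int) -> np.ndarray:
    """Nearest-neighbor resize of an H x W x C image using index arrays."""
    h, w = img.shape[:2]
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return img[rows][:, cols]


def mosaic(images, boxes, out_size=416):
    """Tile four images into one canvas and remap their boxes.

    images: four H x W x 3 arrays.
    boxes:  four lists of (x1, y1, x2, y2) head boxes in pixels, one list per image.
    """
    assert len(images) == 4 and len(boxes) == 4
    half = out_size // 2
    canvas = np.zeros((out_size, out_size, 3), dtype=images[0].dtype)
    # (row offset, column offset) of each tile: top-left, top-right,
    # bottom-left, bottom-right.
    offsets = [(0, 0), (0, half), (half, 0), (half, half)]
    merged = []
    for img, img_boxes, (oy, ox) in zip(images, boxes, offsets):
        h, w = img.shape[:2]
        canvas[oy:oy + half, ox:ox + half] = resize_nearest(img, half, half)
        sx, sy = half / w, half / h  # scale factors applied to box coordinates
        for x1, y1, x2, y2 in img_boxes:
            merged.append((x1 * sx + ox, y1 * sy + oy, x2 * sx + ox, y2 * sy + oy))
    return canvas, merged
```

Each head in the mosaic is a quarter of its original area and sits next to unrelated backgrounds, which is one reason this kind of enhancement is commonly used for small or partly hidden targets.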
In this embodiment, the backbone framework (Backbone) is an improvement on the Backbone of the existing YOLOv4 model. The Backbone of the YOLOv4 model uses CSPDarknet53 instead of the earlier Darknet53 as the feature extraction network; CSPNet reduces the amount of calculation while maintaining accuracy, and the Mish activation function is used in place of the Leaky ReLU function so that information propagates deeper into the neural network, giving better accuracy and generalization. In the Backbone of the YOLOv4-tiny model of this embodiment, ResBlock-D modules replace some of the CSPBlock modules of the YOLOv4 model to reduce computational complexity; the Backbone of this embodiment is shown in FIG. 3. As can be seen from FIG. 3, the Backbone of this embodiment is further provided with an auxiliary residual module, which assists the ResBlock-D module in feature extraction and calculation on the video data after the preliminary feature processing, so that more human head feature information is extracted and the detection error is reduced. For example, take an input feature map of size 104 × 104 with 64 channels:
the number of floating-point operations (FLOPs) performed by CSPBlock is:
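$$\mathrm{FLOPs} = 104^2 \times 3^2 \times 64^2 + 104^2 \times 3^2 \times 64 \times 32 + 104^2 \times 3^2 \times 32^2 + 104^2 \times 1^2 \times 64^2 = 7.421 \times 10^8$$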
the number of floating-point operations (FLOPs) performed by ResBlock-D is:
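$$\mathrm{FLOPs} = 104^2 \times 1^2 \times 64 \times 32 + 52^2 \times 3^2 \times 32^2 + 52^2 \times 1^2 \times 32 \times 64 + 64 \times 52^2 \times 2^2 + 52^2 \times 1^2 \times 64^2 = 6.438 \times 10^7$$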
From the above calculations, the ratio of the computational complexity of CSPBlock to that of ResBlock-D is about 10:1; that is, the computational complexity of ResBlock-D is far lower than that of CSPBlock. The ResBlock-D module therefore increases the calculation speed, so that a real-time, valid passenger flow figure for the monitored area can be provided to the monitoring personnel quickly.
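The two operation counts above can be checked with a few lines of Python. The sketch reads each expression as a sum of per-layer terms of the form size² × kernel² × input channels × output channels; the helper name `conv_flops` is hypothetical and is introduced only for this check.

```python
def conv_flops(size: int, kernel: int, c_in: int, c_out: int) -> int:
    """Operation count of one convolution on a size x size feature map."""
    return size ** 2 * kernel ** 2 * c_in * c_out


# CSPBlock on a 104 x 104 feature map with 64 channels.
csp_block_flops = (
    conv_flops(104, 3, 64, 64)
    + conv_flops(104, 3, 64, 32)
    + conv_flops(104, 3, 32, 32)
    + conv_flops(104, 1, 64, 64)
)

# ResBlock-D on the same input; the fourth term (64 x 52^2 x 2^2) is written
# out directly because it does not have the conv_flops form.
res_block_d_flops = (
    conv_flops(104, 1, 64, 32)
    + conv_flops(52, 3, 32, 32)
    + conv_flops(52, 1, 32, 64)
    + 64 * 52 ** 2 * 2 ** 2
    + conv_flops(52, 1, 64, 64)
)

ratio = csp_block_flops / res_block_d_flops

print(csp_block_flops)    # 742064128, i.e. about 7.421e8
print(res_block_d_flops)  # 64376832, i.e. about 6.438e7
print(round(ratio, 1))    # 11.5
```

The computed ratio of roughly 11.5:1 is consistent with the stated figure of about 10:1.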
In this embodiment, the network layer (Neck) adopts multi-scale feature fusion, so that the feature map information is richer. Specifically, as shown in FIG. 4, the Neck includes an up-sampling module FPN and a down-sampling module PAN, where the FPN is configured to sample the image processed by the backbone framework and transmit strong semantic features from top to bottom, and the PAN is used for sampling the image processed by the up-sampling module and transmitting localization features from bottom to top.
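The top-down and bottom-up paths can be illustrated with a shape-level sketch, assuming NumPy and two feature maps whose resolutions differ by a factor of two. The function names and the use of plain channel concatenation are assumptions made for this example; a real Neck would also apply convolutions to mix and reduce channels, which are omitted here.

```python
import numpy as np


def upsample2x(x: np.ndarray) -> np.ndarray:
    """Nearest-neighbor 2x up-sampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)


def downsample2x(x: np.ndarray) -> np.ndarray:
    """2 x 2 max pooling with stride 2 on a (C, H, W) feature map."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).max(axis=(2, 4))


def neck(shallow: np.ndarray, deep: np.ndarray):
    """Fuse a high-resolution (shallow) and a low-resolution (deep) feature map."""
    # Top-down path (FPN): carry strong semantic features from the deep map
    # to the shallow map.
    fused_shallow = np.concatenate([shallow, upsample2x(deep)], axis=0)
    # Bottom-up path (PAN): carry localization features from the fused
    # shallow map back to the deep map.
    fused_deep = np.concatenate([deep, downsample2x(fused_shallow)], axis=0)
    return fused_shallow, fused_deep
```

For example, a shallow map of shape (4, 26, 26) and a deep map of shape (8, 13, 13) produce fused maps of shape (12, 26, 26) and (20, 13, 13).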
In this embodiment, the output layer uses CIOU_Loss and DIOU_nms. CIOU_Loss takes into account the difference in shape between the prediction box and the target box, which improves the convergence speed and regression accuracy of the model. Using DIOU as the criterion for non-maximum suppression (nms) makes the suppression decision more consistent with the actual situation, so that a better output result is obtained.
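A minimal sketch of the two output-layer operations follows, using the commonly published definitions of the CIoU loss and DIoU-based non-maximum suppression. Boxes are (x1, y1, x2, y2) tuples with positive width and height. The function names and the suppression threshold of 0.45 are assumptions made for this example, not values taken from the embodiment.

```python
import math


def iou_and_center_term(a, b):
    """Return (IoU, d^2 / c^2) for two boxes.

    d is the distance between the box centers and c is the diagonal of the
    smallest box enclosing both.
    """
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    iou = inter / union if union > 0 else 0.0
    d2 = ((a[0] + a[2]) - (b[0] + b[2])) ** 2 / 4 + ((a[1] + a[3]) - (b[1] + b[3])) ** 2 / 4
    c2 = (max(a[2], b[2]) - min(a[0], b[0])) ** 2 + (max(a[3], b[3]) - min(a[1], b[1])) ** 2
    return iou, (d2 / c2 if c2 > 0 else 0.0)


def ciou_loss(pred, target):
    """CIoU loss: 1 - IoU + center-distance term + aspect-ratio (shape) term."""
    iou, dist = iou_and_center_term(pred, target)
    wp, hp = pred[2] - pred[0], pred[3] - pred[1]
    wt, ht = target[2] - target[0], target[3] - target[1]
    # v measures how much the aspect ratios of the two boxes differ.
    v = (4 / math.pi ** 2) * (math.atan(wt / ht) - math.atan(wp / hp)) ** 2
    denom = 1 - iou + v
    alpha = v / denom if denom > 0 else 0.0
    return 1 - iou + dist + alpha * v


def diou_nms(boxes, scores, threshold=0.45):
    """Keep the highest-scoring boxes; suppress a box when IoU - d^2/c^2
    with an already kept box exceeds the threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        suppressed = False
        for j in keep:
            iou, dist = iou_and_center_term(boxes[i], boxes[j])
            if iou - dist > threshold:
                suppressed = True
                break
        if not suppressed:
            keep.append(i)
    return keep
```

Because the center-distance term is subtracted from the IoU, two overlapping boxes whose centers are far apart are less likely to be merged into one detection, which matters when heads partially occlude each other in a dense crowd.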
In some embodiments, the training step of the YOLOv4-tiny network model comprises:
Image data to be processed is acquired, and the head portions in the image data are labeled to obtain label data. The image data to be processed may be image data extracted from historical monitoring video data, and may consist of 10,000 pictures. After the image data is obtained, it is annotated with labeling software to produce the label data. The YOLOv4-tiny network model is then trained with the image data and the label data, that is, the image data and the label data are input into the YOLOv4-tiny network model. Test data is also acquired, and the trained YOLOv4-tiny network model is tested with the test data. When the accuracy of the test result is less than or equal to the preset value, the training process of the YOLOv4-tiny network model continues and the parameters of the YOLOv4-tiny network model are adjusted; when the accuracy of the test result is greater than the preset value, the training process of the YOLOv4-tiny network model stops. This ensures that the trained YOLOv4-tiny network model outputs more accurate detection results when applied.
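The stopping rule of the training step can be sketched as a loop. The callables `train_one_round`, `evaluate` and `adjust_parameters` are hypothetical placeholders for the actual training, testing and parameter adjustment code, and the `max_rounds` safety limit is an addition made for this example so that the loop always terminates.

```python
from typing import Callable, Tuple


def train_until_accurate(
    train_one_round: Callable[[], None],
    evaluate: Callable[[], float],
    adjust_parameters: Callable[[], None],
    preset_accuracy: float,
    max_rounds: int = 100,  # assumed safety limit, not part of the description
) -> Tuple[int, float, bool]:
    """Train, test, and stop once test accuracy exceeds the preset value.

    Returns (rounds run, last accuracy, whether the preset value was exceeded).
    """
    accuracy = 0.0
    for round_index in range(1, max_rounds + 1):
        train_one_round()
        accuracy = evaluate()
        if accuracy > preset_accuracy:
            # Accuracy is greater than the preset value: stop training.
            return round_index, accuracy, True
        # Accuracy is less than or equal to the preset value: adjust the
        # model parameters and continue training.
        adjust_parameters()
    return max_rounds, accuracy, False
```

Note that an accuracy exactly equal to the preset value does not stop training, matching the "less than or equal to" condition in the training step.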
Referring to FIG. 5, an embodiment of the present invention provides a passenger flow detection system based on a convolutional neural network, including a data acquisition unit, a data processing unit and a data display unit, where both the data acquisition unit and the data display unit interact with the data processing unit. Specifically, the data acquisition unit includes a plurality of cameras, all of which are used to collect video data in the preset acquisition area in real time. The shooting areas of the plurality of cameras cover the preset acquisition area, so that multi-angle video capture reduces the influence of occlusion on the people-counting result. The data processing unit comprises an ARM embedded platform and is used for executing the passenger flow detection method based on the convolutional neural network shown in FIG. 1 on the data acquired by the data acquisition unit. The data display unit is used for counting the passenger flow detection results output by the data processing unit and displaying the counting results.
When this embodiment was applied to detecting people in the ticket hall of a railway station, the monitoring personnel manually counted a passenger flow of 85 persons in the specified time period, while the detection method provided by this embodiment detected 78 persons.
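The gap between the two counts can be expressed as a relative counting accuracy. The metric below (one minus the relative counting error) and the function name `counting_accuracy` are assumptions chosen for illustration; the description itself reports only the two counts.

```python
def counting_accuracy(manual_count: int, detected_count: int) -> float:
    """One minus the relative counting error, with the manual count as reference."""
    return 1 - abs(manual_count - detected_count) / manual_count


accuracy = counting_accuracy(manual_count=85, detected_count=78)
print(round(accuracy, 3))  # 0.918
```

Under this assumed metric, the detected count differs from the manual count by 7 persons, or about 8.2% of the manual count.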
The embodiment of the invention provides a passenger flow detection device based on a convolutional neural network, which comprises:
at least one memory for storing a program;
at least one processor for loading a program to perform the convolutional neural network-based passenger flow detection method shown in fig. 1.
The content of the method embodiment of the present invention is applicable to the apparatus embodiment, the functions specifically implemented by the apparatus embodiment are the same as those of the method embodiment, and the beneficial effects achieved by the apparatus embodiment are also the same as those achieved by the method.
An embodiment of the present invention provides a computer-readable storage medium, in which a program executable by a processor is stored, and the program executable by the processor is used for executing the convolutional neural network-based passenger flow detection method shown in fig. 1 when executed by the processor.
The embodiment of the invention also discloses a computer program product or a computer program, which comprises computer instructions, and the computer instructions are stored in a computer readable storage medium. The computer instructions may be read by a processor of a computer device from a computer-readable storage medium, and executed by the processor to cause the computer device to perform the method illustrated in fig. 1.
The embodiments of the present invention have been described in detail with reference to the accompanying drawings, but the present invention is not limited to the above embodiments, and various changes can be made within the knowledge of those skilled in the art without departing from the gist of the present invention. Furthermore, the embodiments of the present invention and the features of the embodiments may be combined with each other without conflict.