Task-oriented image quality evaluation method based on transfer learning
1. A task-oriented image quality evaluation method based on transfer learning is characterized by comprising the following steps:
step S1, acquiring an underwater fish detection data set, and labeling and sorting it to obtain a fish target detection data set;
step S2, training a yolov4 network model on the fish target detection data set to obtain a fish detection model fish-yolov4;
step S3, constructing an image quality database oriented to the underwater fish detection task;
and step S4, performing transfer learning and fine-tuning on the fish detection model fish-yolov4 based on the image quality database oriented to the underwater fish detection task to obtain a final detection model.
2. The task-oriented image quality assessment method based on transfer learning according to claim 1, wherein the step S1 specifically comprises:
step S11, extracting frames from the underwater fish video data set published by Fish4Knowledge to obtain an initial data set;
and step S12, preprocessing the initial data set, correcting the labels, and converting the labels into corresponding xml files to obtain the fish target detection data set.
3. The task-oriented image quality assessment method based on transfer learning according to claim 2, wherein said preprocessing comprises deleting the images of fish species whose number of images is less than a preset number and performing data augmentation on the remaining images.
4. The task-oriented image quality assessment method based on transfer learning according to claim 1, wherein said yolov4 structure comprises CSPDarkNet53, SPP, PAN and FPN.
5. The task-oriented image quality assessment method based on transfer learning according to claim 1, wherein the step S3 specifically comprises:
step S31, selecting N images with clear fish features, each image containing m fish;
step S32, applying distortion and degradation processing to the selected images;
step S33, composing a task-oriented image quality database from the original images containing targets, their degraded images, and the original images containing no target.
6. The task-oriented image quality assessment method based on transfer learning according to claim 1, wherein the step S4 specifically comprises: freezing the parameters of the backbone and neck parts of fish-yolov4, attaching an output layer for quality prediction on top of them, and fine-tuning the model with the task-oriented image quality database established in step S3, wherein only the final output layer is fine-tuned, to obtain the final detection model.
Background
In the information age, images are important carriers of information and play an important role in daily life and work. Images are often degraded during acquisition, transmission, storage and display. An image quality evaluation method is therefore required to determine the degree of distortion of an image and whether it can be used for subsequent processing. Since the recipient of an image is mainly a human, the most reliable image quality evaluation method is subjective quality evaluation, in which viewers are organized to rate image quality according to their own experience and perception. However, as the number of images grows, subjective quality evaluation consumes a great deal of time, manpower and material resources, and it cannot be applied in real-time image processing systems. Researchers have therefore focused more on objective quality evaluation algorithms.
Objective quality evaluation algorithms can be divided into three categories according to how much reference information they use: full-reference image quality evaluation (FR-IQA), reduced-reference image quality evaluation (RR-IQA), and no-reference image quality evaluation (NR-IQA). FR-IQA and RR-IQA often perform poorly in practical tasks, where reference information is difficult to obtain. NR-IQA, which evaluates images without any reference information, is therefore important and has become the main focus of research.
Whether the evaluation is subjective or objective, image quality is assessed and the resulting score serves as a label. The behavior of the quality score falls into two cases. In the first case, as shooting equipment and transmission channels keep improving, image content becomes richer, details become more realistic, and image quality keeps rising; here the quality score does not converge. In the second case, the quality score converges to a maximum. For example, the quality of a compressed image decreases as the degree of compression increases, so the highest quality score converges to the score of the uncompressed image.
Both cases aim at satisfying the user experience: the higher the user satisfaction, the higher the image quality is considered to be. However, as science and technology advance, mankind is exploring more fields. Many images today are not only viewed for enjoyment but are also used for professional analysis, understanding and processing. For example, underwater images are mostly used for exploring marine resources, and medical images provide information about internal tissue non-invasively. When image quality is tied to a particular task, a third case arises: once the image is sufficient to complete the task, the quality score saturates and further quality improvement is of no use. The quality of the image then depends on its utility for accomplishing the task, and researchers can set different quality thresholds according to the needs of the task. However, current research on image quality focuses mainly on perceived quality and does not address image quality evaluation under specific tasks.
Disclosure of Invention
In view of the above, the present invention provides a task-oriented image quality evaluation method based on transfer learning, so as to solve the problems in the prior art.
To achieve this purpose, the invention adopts the following technical solution:
A task-oriented image quality evaluation method based on transfer learning comprises the following steps:
step S1, acquiring an underwater fish detection data set, and labeling and sorting it to obtain a fish target detection data set;
step S2, training a yolov4 network model on the fish target detection data set to obtain a fish detection model fish-yolov4;
step S3, constructing an image quality database oriented to the underwater fish detection task;
and step S4, performing transfer learning and fine-tuning on the fish detection model fish-yolov4 based on the image quality database oriented to the underwater fish detection task to obtain a final detection model.
Further, the step S1 is specifically:
step S11, extracting frames from the underwater fish video data set published by Fish4Knowledge to obtain an initial data set;
and step S12, preprocessing the initial data set, correcting the labels, and converting the labels into corresponding xml files to obtain the fish target detection data set.
Further, the preprocessing comprises deleting the images of fish species whose number of images is less than a preset number and performing data augmentation on the remaining images.
Further, the yolov4 structure comprises CSPDarkNet53, SPP, PAN, FPN.
Further, the step S3 is specifically:
step S31, selecting N images with clear fish features, each image containing m fish;
step S32, applying distortion and degradation processing to the selected images;
step S33, composing a task-oriented image quality database from the original images containing targets, their degraded images, and the original images containing no target.
Further, the step S4 specifically comprises: freezing the parameters of the backbone and neck parts of fish-yolov4, attaching an output layer for quality prediction on top of them, and fine-tuning the model with the task-oriented image quality database established in step S3, wherein only the final output layer is fine-tuned, to obtain the final detection model.
Compared with the prior art, the invention has the following beneficial effects:
The invention can effectively improve the efficiency of image quality evaluation as well as the reliability and authenticity of the evaluation.
Drawings
FIG. 1 is a general technical roadmap for the present invention;
FIG. 2 is a diagram illustrating picture distortion types for task-oriented image quality datasets established in accordance with an embodiment of the present invention;
FIG. 3 is an image quality evaluation platform interface constructed in an embodiment of the present invention;
FIG. 4 is the framework of the image quality evaluation model for underwater fish detection based on transfer learning according to an embodiment of the present invention.
Detailed Description
The invention is further explained below with reference to the drawings and the embodiments.
Referring to fig. 1, the present invention provides a task-oriented image quality evaluation method based on transfer learning, which includes the following steps:
step S1, acquiring an underwater fish detection data set, and labeling and sorting it to obtain a fish target detection data set;
step S2, training a yolov4 network model on the fish target detection data set to obtain a fish detection model fish-yolov4;
step S3, constructing an image quality database oriented to the underwater fish detection task;
and step S4, performing transfer learning and fine-tuning on the fish detection model fish-yolov4 based on the image quality database oriented to the underwater fish detection task to obtain a final detection model.
In this embodiment, the step S1 specifically includes:
Frames are extracted from the underwater fish detection data set published by Fish4Knowledge, which consists of underwater fish videos. First, the data set is severely class-imbalanced, with too few images for some species, so the classes are processed over the whole data set: species with too few images are deleted outright, and data augmentation is applied to part of the images. Second, the labels of the data set contain errors: the coordinates and species names of some fish are clearly wrong and must be corrected or re-annotated. Each label comprises the target category and the horizontal and vertical coordinates of the upper-left and lower-right corners of the bounding box. Finally, the labels are converted into the format read by the target detection algorithm, that is, into corresponding xml files. After extensive sorting and correction, a Fish Object Detection Database (FODD) is obtained.
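The class filtering and label conversion described above can be sketched as follows. This is a minimal illustration, not the procedure actually used: the function names, the `species` key, and the Pascal VOC style element layout (`annotation`, `object`, `bndbox`) are assumptions, since the text only states that labels hold a category plus upper-left and lower-right corner coordinates and are stored as xml files.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def drop_rare_classes(samples, min_count):
    """Keep only samples whose species has at least min_count images.

    `samples` is a list of dicts with a hypothetical "species" key.
    """
    counts = Counter(sample["species"] for sample in samples)
    return [s for s in samples if counts[s["species"]] >= min_count]


def to_voc_xml(filename, width, height, objects):
    """Build an xml annotation string (assumed Pascal VOC style layout).

    `objects` is a list of (species, xmin, ymin, xmax, ymax) tuples, where
    (xmin, ymin) is the upper-left and (xmax, ymax) the lower-right corner.
    """
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename
    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(width)
    ET.SubElement(size, "height").text = str(height)
    for species, xmin, ymin, xmax, ymax in objects:
        # Reject boxes that are empty, inverted, or outside the image,
        # which is one simple way to catch obviously wrong coordinates.
        if not (0 <= xmin < xmax <= width and 0 <= ymin < ymax <= height):
            raise ValueError(f"invalid box for {species}: {(xmin, ymin, xmax, ymax)}")
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = species
        box = ET.SubElement(obj, "bndbox")
        for tag, value in zip(("xmin", "ymin", "xmax", "ymax"), (xmin, ymin, xmax, ymax)):
            ET.SubElement(box, tag).text = str(value)
    return ET.tostring(root, encoding="unicode")
```

The bounds check only flags geometrically impossible boxes; wrong species names or misplaced but valid boxes still need manual correction, as the embodiment describes.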
In this embodiment, yolov4 structure includes CSPDarkNet53, SPP, PAN, FPN.
In this embodiment, 145 images with clear fish features are selected, each containing 1 to 6 fish from 15 species. In addition, 90 images containing no target are added. As shown in FIG. 2, distortion and degradation processing is applied to the 145 images. To reproduce the real underwater environment of the task as comprehensively as possible, the following degradation types are used: because plankton and tiny particles in the ocean cause the water body to strongly attenuate and scatter light, underwater illumination distortion, contrast distortion and marine snow distortion are simulated; because of seawater disturbance, the shooting equipment and target movement, underwater motion blur distortion is simulated; and because the target detection task depends on whether the underwater target is clear, foreground blur and background blur distortions are introduced. After degradation, a total of 3340 images, comprising the original images with targets, their degraded images, and the original images without targets, form the task-oriented image quality database.
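The degradation types listed above can be illustrated with simple NumPy operations. The following is only a sketch under stated assumptions: the embodiment does not disclose its degradation formulas or parameters, so the gain, contrast factor, kernel sizes, snow density, and all function names here are hypothetical, and images are assumed to be float arrays of shape (height, width, 3) with values in [0, 1].

```python
import numpy as np


def adjust_illumination(img, gain):
    """Illumination distortion: scale brightness by `gain`."""
    return np.clip(img * gain, 0.0, 1.0)


def adjust_contrast(img, factor):
    """Contrast distortion: stretch or compress values around the mean."""
    mean = img.mean()
    return np.clip((img - mean) * factor + mean, 0.0, 1.0)


def add_marine_snow(img, density, rng):
    """Marine snow distortion: set a random fraction of pixels to white."""
    mask = rng.random(img.shape[:2]) < density
    out = img.copy()
    out[mask] = 1.0
    return out


def _smooth(img, size, axis):
    """Average `size` neighbouring pixels along one axis (zero padded)."""
    kernel = np.ones(size) / size
    return np.apply_along_axis(lambda v: np.convolve(v, kernel, mode="same"), axis, img)


def motion_blur(img, length):
    """Motion blur distortion: horizontal averaging over `length` pixels."""
    return _smooth(img, length, axis=1)


def box_blur(img, size):
    """Isotropic blur built from a vertical and a horizontal pass."""
    return _smooth(_smooth(img, size, axis=0), size, axis=1)


def blur_foreground(img, box, size):
    """Foreground blur: blur only inside the target box (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    out = img.copy()
    out[y1:y2, x1:x2] = box_blur(img, size)[y1:y2, x1:x2]
    return out


def blur_background(img, box, size):
    """Background blur: blur everything except the target box."""
    x1, y1, x2, y2 = box
    out = box_blur(img, size)
    out[y1:y2, x1:x2] = img[y1:y2, x1:x2]
    return out
```

Applying each function at several strength levels to every source image is one way a database of this kind could be generated; the target box would come from the corrected detection labels.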
In this embodiment, a subjective experiment is also designed. The subjective quality evaluation criterion is guided by the task background, that is, by how effectively the image supports completing the task. In the fish detection task, an image is of high quality if the fish targets in it are clearly distinguishable, and of low quality otherwise. The evaluation takes place in a laboratory with mixed fluorescent and natural lighting; the ambient illuminance is about 400 lux, within the recommended range of 250 to 650 lux. During the experiment, the environment can be finely adjusted to the viewing habits of each volunteer, including choosing the viewing distance the volunteer normally uses for a display and wearing glasses if needed. The resolution, brightness and contrast of the displays used by all volunteers are set to the same values before the experiment. Each volunteer randomly selects image groups to evaluate; to prevent fatigue from prolonged viewing from affecting the subjective scores, a volunteer evaluates at least 1 and at most 3 groups per session, with an interval of 15 minutes between groups.
An outlier coefficient (OC) is introduced to quantify the subjective consistency of the database:
$$\mathrm{OC} = \frac{N_{\mathrm{outlier}}}{N_{\mathrm{total}}} \times 100\%$$
where $N_{\mathrm{total}}$ is the total number of images and $N_{\mathrm{outlier}}$ is the number of images for which the difference between the 25th percentile and the 75th percentile of the subjective scores given to that image is greater than 1. For the subjective database constructed here, OC = 5.32%, so the judgments of different subjects can be considered to differ little over the database as a whole. The subjective scores given by each viewer are arranged into a vector, and the normalized cross-correlation (NCC) and Euclidean distance (EUD) between the vectors are computed. The final values of NCC and EUD are 0.91 and 0.08, respectively. A higher NCC and a lower EUD indicate a higher correlation between two subjective score vectors.
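The consistency measures above can be sketched in NumPy as follows. Several details are assumptions, because the text does not fix them: scores are taken to be a matrix of shape (images, viewers), NCC is computed without mean subtraction, and EUD is computed as a root-mean-square distance on scores rescaled to [0, 1] over an assumed rating range of 1 to 5. The function names are illustrative.

```python
from itertools import combinations

import numpy as np


def outlier_coefficient(scores, threshold=1.0):
    """Fraction of images whose 25th-to-75th percentile spread exceeds threshold.

    `scores` has shape (n_images, n_viewers).
    """
    q25, q75 = np.percentile(scores, [25, 75], axis=1)
    n_outlier = int(np.sum((q75 - q25) > threshold))
    return n_outlier / scores.shape[0]


def ncc(a, b):
    """Normalized cross-correlation of two score vectors (no mean removal)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def eud(a, b, low=1.0, high=5.0):
    """RMS Euclidean distance after rescaling scores to [0, 1] (assumed range)."""
    a01 = (np.asarray(a, dtype=float) - low) / (high - low)
    b01 = (np.asarray(b, dtype=float) - low) / (high - low)
    return float(np.sqrt(np.mean((a01 - b01) ** 2)))


def mean_pairwise(scores, metric):
    """Average a metric over every pair of viewers (columns of `scores`)."""
    pairs = combinations(range(scores.shape[1]), 2)
    return float(np.mean([metric(scores[:, i], scores[:, j]) for i, j in pairs]))
```

With a score matrix `scores`, `outlier_coefficient(scores) * 100` gives OC as a percentage, and `mean_pairwise(scores, ncc)` and `mean_pairwise(scores, eud)` give database-level summaries comparable in spirit to the reported values.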
Preferably, in this embodiment, the step S4 specifically includes redesigning the output layer: the three feature maps output by the second part of the network are fed into a fully connected layer, which outputs the final score. Specifically, the parameters of the backbone and neck parts of fish-yolov4 are frozen, and an output layer for quality prediction is placed on top of them. The model is then fine-tuned with the Task-oriented Image Quality Database (TIQD) established in step S3, and only the final output layer is fine-tuned.
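The idea of training only a new output layer on top of frozen features can be illustrated with a small NumPy sketch. It is not the fish-yolov4 model: `FrozenExtractor` is a hypothetical stand-in that produces three feature maps from fixed random projections, global average pooling before the fully connected layer is an assumption, and the layer is trained by plain gradient descent on a mean squared error. A practical implementation would load the trained detector in a deep-learning framework and disable gradients for its backbone and neck (the part called "rock" above).

```python
import numpy as np


class FrozenExtractor:
    """Stand-in for the frozen backbone and neck: three fixed feature maps."""

    def __init__(self, rng, channels=(8, 16, 32)):
        self.weights = [rng.normal(size=(3, c)) for c in channels]
        for w in self.weights:
            w.setflags(write=False)  # frozen: in-place updates are rejected

    def __call__(self, images):
        # images: (n, h, w, 3); strides 1, 2, 4 mimic three output scales
        return [np.maximum(images[:, ::s, ::s] @ w, 0.0)
                for s, w in zip((1, 2, 4), self.weights)]


def pooled_features(maps):
    """Global-average-pool each map and concatenate (assumed design)."""
    return np.concatenate([m.mean(axis=(1, 2)) for m in maps], axis=1)


def _with_bias(x):
    return np.hstack([x, np.ones((len(x), 1))])


def fine_tune_head(x, scores, epochs=200):
    """Train only the fully connected output layer by gradient descent."""
    xb = _with_bias(x)
    n = len(xb)
    # Step 1/L, with L the largest eigenvalue of xb.T @ xb / n, guarantees
    # that the squared-error loss never increases.
    step = n / np.linalg.norm(xb, 2) ** 2
    theta = np.zeros(xb.shape[1])
    losses = []
    for _ in range(epochs):
        err = xb @ theta - scores
        losses.append(0.5 * float(np.mean(err ** 2)))
        theta -= step * (xb.T @ err) / n
    return theta, losses


def predict_quality(theta, x):
    """Quality score predicted by the trained output layer."""
    return _with_bias(x) @ theta
```

Because the extractor weights are never updated, the pooled features can be computed once for the whole database and reused in every epoch, which is the main practical benefit of fine-tuning only the output layer.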
The above description is only a preferred embodiment of the present invention, and all equivalent changes and modifications made in accordance with the claims of the present invention should be covered by the present invention.