Diabetic retinopathy image classification method based on improved ResNeSt convolutional neural network model
1. A method for classifying images of diabetic retinopathy based on an improved ResNeSt convolutional neural network model, characterized by comprising the following steps:
step 1, acquiring medical images of diabetic retinopathy from a hospital;
step 2, preprocessing the collected medical images, then asking a professional ophthalmologist to manually label the lesion areas in the medical images, so as to form the labeled data set required for training the ResNeSt convolutional neural network classification model, and dividing the labeled data set into a training set, a validation set and a test set;
step 3, building the deep learning server platform required for the experiment, then writing Python code in preparation for building the model;
step 4, introducing two lightweight and efficient convolution operations, OctConv and SPConv, into the ResNeSt convolutional neural network, and introducing a learning rate adjustment mechanism that combines Warm Restart with cosine annealing, thereby improving the classification accuracy and the training speed;
step 5, pre-training the improved ResNeSt network on the ILSVRC2012 data set, and transferring the resulting initial model to the preprocessed diabetic retinopathy data set for fine-tuning, thereby obtaining the final diabetic retinopathy image classification model;
step 6, loading the test set to test the trained ResNeSt convolutional neural network classification model, obtaining the classification results, and checking whether each classification metric meets the requirements.
2. The method for classifying images of diabetic retinopathy based on the improved ResNeSt convolutional neural network model as claimed in claim 1, wherein in step 2 the collected medical images are first preprocessed and a professional ophthalmologist then manually labels the lesion areas in the medical images, so as to form the labeled data set required for training the ResNeSt convolutional neural network classification model, and the labeled data set is divided into a training set, a validation set and a test set, specifically as follows:
step 2.1, removing meaningless retinal images from the collected retinopathy image data set;
step 2.2, cropping the black borders of each retinal image, which can interfere with classification;
step 2.3, uniformly stretching the histograms of the remaining retinal images;
step 2.4, applying color normalization to all retinal images in the data set;
step 2.5, performing image enhancement on the images in the data set;
step 2.6, asking a professional pathologist to label the data set;
step 2.7, performing data augmentation on the basis of the preceding six steps, the main augmentation operations including scaling the retinal images by a factor of 0.8, rotating them by 90, 180 or 270 degrees, applying affine transformations, and mirroring them in the vertical and horizontal directions, among others;
step 2.8, dividing the processed data set into a training set, a validation set and a test set; the training set comprises 17392 images, the validation set comprises 2319 images, and the test set comprises 3478 images. The training set is used to build the model, the validation set to validate it, and the test set to test it.
3. The method for classifying images of diabetic retinopathy based on the improved ResNeSt convolutional neural network model as claimed in claim 1, wherein in step 3 the deep learning server platform required for the experiment is built and Python code is then written in preparation for building the model, specifically as follows:
step 3.1, installing the Ubuntu 20.10 operating system on a desktop computer and downloading CUDA 10.2;
step 3.2, downloading the Anaconda 3 distribution and creating a Python programming environment;
step 3.3, downloading and installing the environment required by PyTorch 1.8.0 in Anaconda 3;
step 3.4, finally writing Python code based on the PyTorch 1.8.0 framework.
4. The method for classifying images of diabetic retinopathy based on the improved ResNeSt convolutional neural network model as claimed in claim 1, characterized in that in step 4 two lightweight and efficient convolution operations, OctConv and SPConv, are introduced into the ResNeSt convolutional neural network, together with a learning rate adjustment mechanism combining Warm Restart and cosine annealing, thereby improving the classification accuracy and the training speed, specifically as follows:
step 4.1, building a ResNeSt network model, replacing the convolution layers whose kernel size is 3×3 with OctConv, and using SPConv for the basic convolution layer in the OctConv initial layer, so as to reduce the number of trainable parameters, thereby reducing the training time and improving the classification accuracy.
step 4.2, at the end of each period, reinitializing the learning rate to a preset value and then letting it decay gradually again. After each restart the model parameters are not optimized from scratch; optimization continues from the parameters reached before the restart.
step 4.3, combining the restarts with cosine annealing, which helps improve the classification accuracy. Combining the two improves both the accuracy and the training speed of the model.
5. The method for classifying images of diabetic retinopathy based on the improved ResNeSt convolutional neural network model as claimed in claim 1, wherein in step 5 the improved ResNeSt network is pre-trained on the ILSVRC2012 data set, and the resulting initial model is then transferred to the preprocessed diabetic retinopathy data set for fine-tuning, so as to obtain the final diabetic retinopathy image classification model, specifically as follows:
step 5.1, first pre-training the built ResNeSt model on the ILSVRC2012 data set, and saving the model after 10000 training iterations;
step 5.2, transferring the saved model to the collected and preprocessed training and validation sets for retraining and validation, thereby extracting the deep features of the diabetic retinopathy images.
6. The method for classifying images of diabetic retinopathy based on the improved ResNeSt convolutional neural network model as claimed in claim 1, wherein in step 6 the test set is loaded to test the trained ResNeSt convolutional neural network classification model, the classification results are obtained, and each classification metric is checked against the requirements, specifically as follows:
step 6.1, taking the preprocessed test set of 3478 diabetic retinopathy images as input to obtain the classification outputs;
step 6.2, calculating each classification evaluation metric (including AUC, Accuracy, Sensitivity, Specificity and F1-score) to check whether the model performs as expected.
Background
Diabetes mellitus is a metabolic disease with a high incidence rate that occurs most often in older people. It not only seriously disrupts normal blood glucose metabolism but also damages other parts of the body; diabetic retinopathy (DR) is a typical complication of diabetes mellitus. DR not only greatly affects a person's vision but also damages the nerves of the brain. DR generally has a latent period of 10-15 years from the early stage to blindness, and because its early symptoms are not prominent, it tends not to attract attention.
Diabetic retinopathy not only causes blindness in patients, but also imposes a serious mental and economic burden on patients and family members. If the diabetic retinopathy can be diagnosed in time in the early stage and effective treatment means are adopted, the health of the patient can be improved to the maximum extent, and even the vision of the patient can be recovered.
Therefore, accurate diagnosis and identification of diabetic retinopathy are of great significance to patients. Traditional detection of diabetic retinopathy relies mainly on manual identification by professional ophthalmologists, but this process is time-consuming, labor-intensive, inefficient and even prone to misdiagnosis. In addition, China currently has a serious shortage of ophthalmologists, and diagnostic skill levels are uneven. Artificial-intelligence-aided diagnostic systems can address these problems well.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a diabetic retinopathy image classification method based on an improved ResNeSt convolutional neural network model, which uses an improved ResNeSt convolutional network classification algorithm to classify diabetic retinopathy images more accurately and in less time. To solve this technical problem, the invention adopts the following technical scheme: a diabetic retinopathy image classification method based on the improved ResNeSt convolutional neural network model, which comprises the following steps, with the corresponding flow chart shown in FIG. 1:
step 1, acquiring medical images of diabetic retinopathy from a hospital;
step 2, preprocessing the collected medical images, then asking a professional ophthalmologist to manually label the lesion areas in the medical images, so as to form the labeled data set required for training the ResNeSt convolutional neural network classification model, and dividing the labeled data set into a training set, a validation set and a test set;
step 3, building the deep learning server platform required for the experiment, then writing Python code in preparation for building the model;
step 4, introducing two lightweight and efficient convolution operations, OctConv and SPConv, into the ResNeSt convolutional neural network, and introducing a learning rate adjustment mechanism that combines Warm Restart with cosine annealing, thereby improving the classification accuracy and the training speed;
step 5, pre-training the improved ResNeSt network on the ILSVRC2012 data set, and transferring the resulting initial model to the preprocessed diabetic retinopathy data set for fine-tuning, thereby obtaining the final diabetic retinopathy image classification model;
step 6, loading the test set to test the trained ResNeSt convolutional neural network classification model, obtaining the classification results, and checking whether each classification metric meets the requirements.
Further, in step 2, the collected medical images are preprocessed and a professional ophthalmologist then manually labels the lesion areas in the medical images, so as to form the labeled data set required for training the ResNeSt convolutional neural network classification model, and the labeled data set is divided into a training set, a validation set and a test set, specifically:
step 2.1, removing meaningless retinal images from the collected retinopathy image data set;
step 2.2, cropping the black borders of each retinal image, which can interfere with classification;
step 2.3, uniformly stretching the histograms of the remaining retinal images;
step 2.4, applying color normalization to all retinal images in the data set;
step 2.5, performing image enhancement on the images in the data set;
step 2.6, asking a professional pathologist to label the data set;
step 2.7, performing data augmentation on the basis of the preceding six steps, the main augmentation operations including scaling the retinal images by a factor of 0.8, rotating them by 90, 180 or 270 degrees, applying affine transformations, and mirroring them in the vertical and horizontal directions, among others;
step 2.8, dividing the processed data set into a training set, a validation set and a test set; the training set comprises 17392 images, the validation set comprises 2319 images, and the test set comprises 3478 images. The training set is used to build the model, the validation set to validate it, and the test set to test it.
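As a quick check on the figures in step 2.8, the three subsets sum to 23189 images, which works out to roughly a 75% / 10% / 15% split. The short Python sketch below reproduces that arithmetic; the dictionary keys are illustrative names only, not identifiers from the invention.

```python
# Image counts stated in step 2.8.
counts = {"train": 17392, "validation": 2319, "test": 3478}

total = sum(counts.values())
fractions = {name: n / total for name, n in counts.items()}

for name, frac in fractions.items():
    # Report each subset's size and its share of the whole data set.
    print(f"{name}: {counts[name]} images, {frac:.1%} of {total}")
```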
Further, in step 3, the deep learning server platform required for the experiment is built and Python code is then written in preparation for building the model, specifically:
step 3.1, installing the Ubuntu 20.10 operating system on a desktop computer and downloading CUDA 10.2;
step 3.2, downloading the Anaconda 3 distribution and creating a Python programming environment;
step 3.3, downloading and installing the environment required by PyTorch 1.8.0 in Anaconda 3;
step 3.4, finally writing Python code based on the PyTorch 1.8.0 framework.
Further, in step 4, two lightweight and efficient convolution operations, OctConv and SPConv, are introduced into the ResNeSt convolutional neural network, together with a learning rate adjustment mechanism combining Warm Restart and cosine annealing, thereby improving the classification accuracy and the training speed, specifically:
step 4.1, building a ResNeSt network model, replacing the convolution layers whose kernel size is 3×3 with OctConv, and using SPConv for the basic convolution layer in the OctConv initial layer, so as to reduce the number of trainable parameters, thereby reducing the training time and improving the classification accuracy.
step 4.2, at the end of each period, reinitializing the learning rate to a preset value and then letting it decay gradually again. After each restart the model parameters are not optimized from scratch; optimization continues from the parameters reached before the restart.
step 4.3, combining the restarts with cosine annealing, which helps improve the classification accuracy. Combining the two improves both the accuracy and the training speed of the model.
Further, in step 5, the improved ResNeSt network is pre-trained on the ILSVRC2012 data set, and the resulting initial model is transferred to the preprocessed diabetic retinopathy data set for fine-tuning, so as to obtain the final diabetic retinopathy image classification model, specifically:
step 5.1, first pre-training the built ResNeSt model on the ILSVRC2012 data set, and saving the model after 10000 training iterations;
step 5.2, transferring the saved model to the collected and preprocessed training and validation sets for retraining and validation, thereby extracting the deep features of the diabetic retinopathy images.
Further, in step 6, the test set is loaded to test the trained ResNeSt convolutional neural network classification model, the classification results are obtained, and each classification metric is checked against the requirements, specifically:
step 6.1, taking the preprocessed test set of 3478 diabetic retinopathy images as input to obtain the classification outputs;
step 6.2, calculating each classification evaluation metric (including AUC, Accuracy, Sensitivity, Specificity and F1-score) to check whether the model performs as expected.
Compared with the prior art, the invention has the beneficial effects that:
(1) The invention improves the structure of ResNeSt: by adding two lightweight and effective convolution operations, OctConv and SPConv, it greatly reduces the number of model parameters, improves the classification accuracy and shortens the training time.
(2) The classification method provided by the invention is pre-trained on the ILSVRC2012 data set, which effectively improves accuracy.
(3) The invention preprocesses the target data set, removing factors that degrade the classification performance and avoiding problems such as class imbalance and large color differences, so that the classification results obtained in testing are good.
(4) The invention adopts a learning rate adjustment mechanism combining Warm Restart and cosine annealing, which can effectively shorten the training time while improving the classification accuracy.
Drawings
FIG. 1 is a schematic flow chart of the method for classifying images of diabetic retinopathy based on the improved ResNeSt convolutional neural network model in the present invention;
FIG. 2 is a diagram of a ResNeSt network model structure according to the present invention;
FIG. 3 is a flow chart of the improved ResNeSt network algorithm of the present invention;
FIG. 4 is a schematic structural diagram of an OctConv transition layer according to the present invention;
FIG. 5 is a schematic diagram of the cosine annealing learning rate adjustment mechanism with Warm Restart in the present invention.
Detailed Description
The invention is described in further detail below with reference to the following figures and examples:
A diabetic retinopathy image classification method based on an improved ResNeSt convolutional neural network model specifically comprises the following steps:
step 1, acquiring medical images of diabetic retinopathy from a hospital;
step 2, preprocessing the collected medical images, then asking a professional ophthalmologist to manually label the lesion areas in the medical images, so as to form the labeled data set required for training the ResNeSt convolutional neural network classification model, and dividing the labeled data set into a training set, a validation set and a test set;
step 3, building the deep learning server platform required for the experiment, then writing Python code in preparation for building the model;
step 4, introducing two lightweight and efficient convolution operations, OctConv and SPConv, into the ResNeSt convolutional neural network, and introducing a learning rate adjustment mechanism that combines Warm Restart with cosine annealing, thereby improving the classification accuracy and the training speed;
step 5, pre-training the improved ResNeSt network on the ILSVRC2012 data set, and transferring the resulting initial model to the preprocessed diabetic retinopathy data set for fine-tuning, thereby obtaining the final diabetic retinopathy image classification model;
step 6, loading the test set to test the trained ResNeSt convolutional neural network classification model, obtaining the classification results, and checking whether each classification metric meets the requirements.
Further, in step 2, the collected medical images are preprocessed and a professional ophthalmologist then manually labels the lesion areas in the medical images, so as to form the labeled data set required for training the ResNeSt convolutional neural network classification model, and the labeled data set is divided into a training set, a validation set and a test set, specifically:
step 2.1, removing meaningless retinal images from the collected retinopathy image data set;
step 2.2, cropping the black borders of each retinal image, which can interfere with classification;
step 2.3, uniformly stretching the histograms of the remaining retinal images;
step 2.4, applying color normalization to all retinal images in the data set;
step 2.5, performing image enhancement on the images in the data set;
step 2.6, asking a professional pathologist to label the data set;
step 2.7, performing data augmentation on the basis of the preceding six steps, the main augmentation operations including scaling the retinal images by a factor of 0.8, rotating them by 90, 180 or 270 degrees, applying affine transformations, and mirroring them in the vertical and horizontal directions, among others;
step 2.8, dividing the processed data set into a training set, a validation set and a test set; the training set comprises 17392 images, the validation set comprises 2319 images, and the test set comprises 3478 images. The training set is used to build the model, the validation set to validate it, and the test set to test it.
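Some of the operations in steps 2.2, 2.3 and 2.7 can be illustrated with a few lines of NumPy. The following is only a minimal sketch under stated assumptions: the function names and the darkness threshold of 10 are hypothetical, histogram stretching is interpreted here as a linear contrast stretch (histogram equalization would be another reading), and the 0.8 scaling and affine transformations are omitted for brevity.

```python
import numpy as np

def crop_black_border(img, threshold=10):
    """Crop rows and columns whose brightest pixel is at or below `threshold` (assumed value)."""
    gray = img.max(axis=2) if img.ndim == 3 else img
    mask = gray > threshold
    rows = np.where(mask.any(axis=1))[0]
    cols = np.where(mask.any(axis=0))[0]
    if rows.size == 0 or cols.size == 0:
        return img  # fully dark image: nothing sensible to crop
    return img[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]

def stretch_histogram(img):
    """Linearly stretch intensities so they span the full 0..255 range."""
    lo, hi = float(img.min()), float(img.max())
    if hi == lo:
        return img.copy()  # flat image: no contrast to stretch
    out = (img.astype(np.float32) - lo) * (255.0 / (hi - lo))
    return out.round().astype(np.uint8)

def augment(img):
    """Return copies rotated by 90/180/270 degrees plus vertical and horizontal mirrors."""
    rotations = [np.rot90(img, k) for k in (1, 2, 3)]
    mirrors = [np.flipud(img), np.fliplr(img)]
    return rotations + mirrors

# Synthetic stand-in for a fundus photograph: a bright patch on a black frame.
image = np.zeros((8, 10, 3), dtype=np.uint8)
image[2:6, 3:8] = 100
image[3, 4] = 200

cropped = crop_black_border(image)
stretched = stretch_histogram(cropped)
variants = augment(stretched)
```

Each source image yields five extra variants in this sketch, which is one way a data set can be enlarged before it is split in step 2.8.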
Further, in step 3, the deep learning server platform required for the experiment is built and Python code is then written in preparation for building the model, specifically:
step 3.1, installing the Ubuntu 20.10 operating system on a desktop computer and downloading CUDA 10.2;
step 3.2, downloading the Anaconda 3 distribution and creating a Python programming environment;
step 3.3, downloading and installing the environment required by PyTorch 1.8.0 in Anaconda 3;
step 3.4, finally writing Python code based on the PyTorch 1.8.0 framework.
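Once the environment from steps 3.1 to 3.4 is in place, a short script can confirm which versions are actually active. The following is a minimal sketch; the helper name `describe_environment` is hypothetical, and it assumes only that PyTorch, if installed, is importable as `torch`.

```python
import sys

try:
    import torch
except ImportError:  # PyTorch is not installed in this interpreter
    torch = None

def describe_environment():
    """Collect the Python, PyTorch and CUDA versions visible to this interpreter."""
    info = {"python": sys.version.split()[0], "torch": None}
    if torch is not None:
        info["torch"] = torch.__version__
        info["cuda_available"] = torch.cuda.is_available()
        info["cuda_version"] = torch.version.cuda  # None for CPU-only builds
    return info

env = describe_environment()
for key, value in env.items():
    print(f"{key}: {value}")
```

On the platform described above, the `torch` entry would be expected to start with 1.8.0 and the `cuda_version` entry to read 10.2.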
Further, in step 4, two lightweight and efficient convolution operations, OctConv and SPConv, are introduced into the ResNeSt convolutional neural network, together with a learning rate adjustment mechanism combining Warm Restart and cosine annealing, thereby improving the classification accuracy and the training speed, specifically:
step 4.1, building a ResNeSt network model, replacing the convolution layers whose kernel size is 3×3 with OctConv, and using SPConv for the basic convolution layer in the OctConv initial layer, so as to reduce the number of trainable parameters, thereby reducing the training time and improving the classification accuracy. The reasons are as follows: (1) The OctConv structure comprises an initial layer, a transition layer and an output layer. The initial layer has a single input and two outputs and is responsible for receiving the input feature map. The original image passes through a convolution layer with a 3×3 kernel to output a high-frequency feature map $A_H$; the original image is also average-pooled and passed through the same convolution layer to output a low-frequency feature map $A_L$. A high-frequency to high-frequency convolution is then applied to the high-frequency part to obtain the feature map $B_{H \to H}$. The high-frequency part is also average-pooled and then convolved to obtain the feature map $B_{H \to L}$. A low-frequency to low-frequency convolution is applied to the low-frequency part to obtain the feature map $B_{L \to L}$; the low-frequency part is also up-sampled and convolved to obtain the feature map $B_{L \to H}$. Adding $B_{H \to H}$ to $B_{L \to H}$, and $B_{L \to L}$ to $B_{H \to L}$, gives the high-frequency and low-frequency feature maps $B_H$ and $B_L$, which can be expressed as:
$$B_H = B_{H \to H} + B_{L \to H}$$
$$B_L = B_{L \to L} + B_{H \to L}$$
(2) SPConv is designed around feature redundancy: the input features are split into two groups that are processed separately. Compared with conventional convolution, SPConv offers higher accuracy, faster inference and fewer parameters.
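The four-branch data flow described in reason (1) can be sketched as a small PyTorch module. This is an illustrative sketch rather than the invention's implementation: the class name `OctConvTransition`, the channel split ratio `alpha=0.5` and the nearest-neighbour up-sampling are assumptions, and the low-to-high branch follows the order given in the text (up-sample, then convolve).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class OctConvTransition(nn.Module):
    """Maps (A_H, A_L) to (B_H, B_L); A_L has half the spatial size of A_H."""

    def __init__(self, in_ch, out_ch, alpha=0.5, kernel_size=3):
        super().__init__()
        # `alpha` is the assumed fraction of channels kept at low frequency.
        in_l, out_l = int(alpha * in_ch), int(alpha * out_ch)
        in_h, out_h = in_ch - in_l, out_ch - out_l
        pad = kernel_size // 2
        self.h2h = nn.Conv2d(in_h, out_h, kernel_size, padding=pad, bias=False)
        self.h2l = nn.Conv2d(in_h, out_l, kernel_size, padding=pad, bias=False)
        self.l2l = nn.Conv2d(in_l, out_l, kernel_size, padding=pad, bias=False)
        self.l2h = nn.Conv2d(in_l, out_h, kernel_size, padding=pad, bias=False)

    def forward(self, a_h, a_l):
        b_hh = self.h2h(a_h)                              # high -> high
        b_hl = self.h2l(F.avg_pool2d(a_h, kernel_size=2))  # pool, then high -> low
        b_ll = self.l2l(a_l)                              # low -> low
        # Up-sample the low-frequency map to full resolution, then convolve.
        b_lh = self.l2h(F.interpolate(a_l, scale_factor=2, mode="nearest"))
        # B_H = B_{H->H} + B_{L->H};  B_L = B_{L->L} + B_{H->L}
        return b_hh + b_lh, b_ll + b_hl

layer = OctConvTransition(in_ch=16, out_ch=32)
a_h = torch.randn(2, 8, 32, 32)   # high-frequency input, full resolution
a_l = torch.randn(2, 8, 16, 16)   # low-frequency input, half resolution
b_h, b_l = layer(a_h, a_l)
```

Keeping the low-frequency branch at half resolution is what reduces computation: its convolutions operate on a quarter of the spatial positions of the high-frequency branch.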
step 4.2, replacing the learning rate decay method with Warm Restart. Specifically: suppose there are X restarts during gradient descent, and the x-th restart takes place $M_x$ epochs after the previous restart; $M_x$ is called the restart period. Before the x-th restart, the learning rate is reduced by cosine annealing. The learning rate of the i-th iteration is:
$$\eta_i = \eta_{\min}^{x} + \frac{1}{2}\left(\eta_{\max}^{x} - \eta_{\min}^{x}\right)\left(1 + \cos\left(\frac{M_{rep}}{M_x}\pi\right)\right)$$
where $\eta_{\max}^{x}$ and $\eta_{\min}^{x}$ are the upper limit and the lower limit of the learning rate in the x-th period, respectively, and may gradually decrease as x increases; $M_{rep}$ is the number of epochs since the last restart. The restart period $M_x$ may be gradually increased with the number of restarts.
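The schedule in step 4.2 can be illustrated with a small pure-Python function. This is a sketch under assumptions: it uses the standard cosine form of annealing with warm restarts, the function name and all default values (`eta_max`, `eta_min`, `first_period`, `period_mult`) are hypothetical, and for simplicity the upper and lower limits are held constant across periods even though the text allows them to decrease.

```python
import math

def warm_restart_lr(epoch, eta_max=0.1, eta_min=0.001, first_period=10, period_mult=2):
    """Learning rate at a 0-based epoch under cosine annealing with warm restarts."""
    period, start = first_period, 0
    # Walk forward to the restart period M_x that contains `epoch`;
    # each period is `period_mult` times longer than the previous one.
    while epoch >= start + period:
        start += period
        period *= period_mult
    m_rep = epoch - start  # epochs since the last restart
    cosine = 1 + math.cos(math.pi * m_rep / period)
    return eta_min + 0.5 * (eta_max - eta_min) * cosine

# The rate starts at eta_max, decays along a cosine, and jumps back at each restart
# (epochs 0 and 10 within this range).
schedule = [warm_restart_lr(e) for e in range(30)]
```

PyTorch ships a comparable scheduler, `torch.optim.lr_scheduler.CosineAnnealingWarmRestarts`, whose `T_0`, `T_mult` and `eta_min` arguments play the roles of the first period, the period multiplier and the lower limit here.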
Further, in step 5, the improved ResNeSt network is pre-trained on the ILSVRC2012 data set, and the resulting initial model is transferred to the preprocessed diabetic retinopathy data set for fine-tuning, so as to obtain the final diabetic retinopathy image classification model, specifically:
step 5.1, first pre-training the built ResNeSt model on the ILSVRC2012 data set, and saving the model after 10000 training iterations;
step 5.2, transferring the saved model to the collected and preprocessed training and validation sets for retraining and validation, thereby extracting the deep features of the diabetic retinopathy images.
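The transfer in steps 5.1 and 5.2 usually amounts to saving the pre-trained weights, rebuilding the network with a new classification head, and loading every weight except the head. The sketch below shows that pattern in PyTorch with a tiny stand-in network; `TinyClassifier`, the `transfer` helper and the choice of five output classes are assumptions for illustration, since the source does not give the class count or the checkpoint layout.

```python
import io
import torch
import torch.nn as nn

class TinyClassifier(nn.Module):
    """Hypothetical stand-in for the improved ResNeSt: a backbone plus an `fc` head."""

    def __init__(self, num_classes):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        self.fc = nn.Linear(8, num_classes)

    def forward(self, x):
        return self.fc(self.backbone(x))

def transfer(pretrained_state, num_classes):
    """Build a model for the target task and copy every weight except the head."""
    model = TinyClassifier(num_classes)
    backbone_only = {k: v for k, v in pretrained_state.items() if not k.startswith("fc.")}
    # strict=False tolerates the missing `fc.*` keys, which stay freshly initialized.
    model.load_state_dict(backbone_only, strict=False)
    return model

# Stand-in for the checkpoint saved after ILSVRC2012 pre-training (1000 classes).
pretrained = TinyClassifier(num_classes=1000)
buffer = io.BytesIO()
torch.save(pretrained.state_dict(), buffer)
buffer.seek(0)

# Five output classes is an assumption; adjust to the labels in the target data set.
model = transfer(torch.load(buffer), num_classes=5)
logits = model(torch.randn(2, 3, 64, 64))
```

After this step the model would be fine-tuned on the training set and monitored on the validation set, as described in step 5.2.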
Further, in step 6, the test set is loaded to test the trained ResNeSt convolutional neural network classification model, the classification results are obtained, and each classification metric is checked against the requirements, specifically:
step 6.1, taking the preprocessed test set of 3478 diabetic retinopathy images as input to obtain the classification outputs;
step 6.2, calculating each classification evaluation metric (including AUC, Accuracy, Sensitivity, Specificity and F1-score) to check whether the model performs as expected.
In step 6, the performance of the image classification model is evaluated by Accuracy, Sensitivity, Specificity, AUC and F1-score, with the evaluation equations:
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad \mathrm{Sensitivity} = \frac{TP}{TP + FN}, \qquad \mathrm{Specificity} = \frac{TN}{TN + FP}$$
where TP is the number of positive samples correctly identified; TN is the number of negative samples correctly identified; FP is the number of negative samples misidentified as positive samples; and FN is the number of positive samples misidentified as negative samples.
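The count-based metrics can be computed directly from TP, TN, FP and FN. The sketch below uses the standard definitions of accuracy, sensitivity, specificity and F1-score and assumes binary labels (1 for positive, 0 for negative); the function names are hypothetical, and AUC is left out because it needs predicted scores rather than hard labels.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP and FN for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

def safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0

def classification_metrics(tp, tn, fp, fn):
    """Standard metrics derived from the four confusion-matrix counts."""
    return {
        "accuracy": safe_div(tp + tn, tp + tn + fp + fn),
        "sensitivity": safe_div(tp, tp + fn),   # true positive rate
        "specificity": safe_div(tn, tn + fp),   # true negative rate
        "f1": safe_div(2 * tp, 2 * tp + fp + fn),
    }

# Toy labels for illustration only; not results from the invention.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
counts = confusion_counts(y_true, y_pred)
metrics = classification_metrics(*counts)
```

For a multi-class grading task, the same counts can be taken per class in a one-versus-rest fashion and then averaged.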
The above description is only one embodiment of the present invention, and is not intended to limit the present invention, and it is apparent to those skilled in the art that various modifications and variations can be made in the present invention. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present invention shall be included in the protection scope of the present invention.