Deep neural network model robustness optimization method
1. A robustness optimization method for a deep neural network model, characterized by comprising the following steps:
step one: selecting a deep neural network image classification model as a basic model, and adding additional prediction branches P1, P2, ..., Pn at the shallow layers of the basic model to form a multi-prediction-branch structure, wherein n is the number of residual block groups of the basic model minus 1;
step two: expanding the feature map of the Pn layer by upsampling until it is consistent with the feature map of the Pn-1 layer;
step three: repeating step two until n = 2, and finally averaging the prediction results of P1, P2, ..., Pn as the output, thereby transforming the basic model into a feature pyramid structure;
step four: selecting the feature pyramid structure model as the target model M, and training the target model M until it accurately maps images sampled from the original data set to their corresponding labels t, wherein t is the manually annotated image category information in the original image data set;
step five: taking the last convolutional layer of the target model M as a feature extractor f, and constructing a generator G and a discriminator D, wherein the generator G is built from a convolutional neural network and the discriminator D is used to distinguish real images from generated adversarial samples;
step six: inputting the attack class a for which adversarial samples are required;
step seven: sampling a minibatch of m noise samples {z1, ..., zm} from the noise prior pg(z), and sampling a minibatch of m samples {x1, ..., xm} from the data-generating distribution pdata(x);
step eight: passing {x1, ..., xm} through the feature extractor f to extract latent features and output the feature vector f(x), and sampling {z1, ..., zm} from a normal distribution to output the noise vector z;
step nine: concatenating the feature vector f(x) and the noise vector z and feeding the concatenated vector into the generator G to generate samples {xadv1, xadv2, ..., xadvm};
step ten: minimizing the L2 loss between xadvi and xi, wherein 1 ≤ i ≤ m, to limit the perturbation size; updating the discriminator D by ascending its stochastic gradient so that D maximizes the difference between real images and generated samples; and updating the generator G by descending its stochastic gradient so that G minimizes the probability that its generated samples belong to the class t;
step eleven: repeating step ten until {xadv1, xadv2, ..., xadvm} approaches the original data distribution and the output adversarial samples are classified as label a;
step twelve: repeating steps six to eleven until adversarial samples have been generated for every attack class, that is, for every image class present in the original image data set; manually labeling all adversarial samples with their correct class labels, adding them back into the training set, and retraining the target model M.
Background
Deep neural networks have developed rapidly in recent years and have been applied in a large number of fields. Although deep neural network models currently achieve high accuracy, research has shown that a neural network classifier can be made to misclassify by adding a small perturbation to a normal sample, a perturbation that does not affect human judgment. Samples that cause such misjudgments are called adversarial samples; they alter the latent features on which the neural network relies and thereby change its output. Improving the ability of a deep neural network model to defend against adversarial sample attacks therefore improves the robustness of the model and plays a key role in the development of deep neural network technology.
In order to make deep neural network models safer to use in adversarial environments, many researchers have aimed to improve model robustness. When a white-box attack is carried out on a deep neural network model under the same perturbation, higher classification accuracy indicates better robustness. Existing methods for improving model robustness fall into three categories: modifying the model's input data, modifying the network structure, and adding external modules. Previous research on improving model robustness generally suffers from problems such as a limited defense range and high cost.
Disclosure of Invention
The invention provides a robustness optimization method for a deep neural network model. First, the basic model is transformed into a feature pyramid structure that balances speed and accuracy; second, the feature pyramid structure model is taken as the target model, adversarial samples are generated using the latent features of the original images as a prior, and the adversarial samples are added to the training set for adversarial training. The aim is to enable the improved model to defend against a variety of white-box attacks with high classification accuracy, thereby optimizing the robustness of the model.
The invention relates to a robustness optimization method for a deep neural network model, characterized by comprising the following steps:
step one: selecting a deep neural network image classification model as a basic model, and adding additional prediction branches P1, P2, ..., Pn at the shallow layers of the basic model to form a multi-prediction-branch structure, wherein n is the number of residual block groups of the basic model minus 1;
step two: expanding the feature map of the Pn layer by upsampling until it is consistent with the feature map of the Pn-1 layer;
step three: repeating step two until n = 2, and finally averaging the prediction results of P1, P2, ..., Pn as the output, thereby transforming the basic model into the feature pyramid structure shown in FIG. 2 (a minimal implementation sketch is given after step twelve below);
step four: selecting the feature pyramid structure model as the target model M, and training the target model M until it accurately maps images sampled from the original data set to their corresponding labels t, wherein t is the manually annotated image category information in the original image data set;
step five: taking the last convolutional layer of the target model M as a feature extractor f, and constructing a generator G and a discriminator D, wherein the generator G is built from a convolutional neural network and the discriminator D is used to distinguish real images from generated adversarial samples; the architecture is shown in FIG. 3;
step six: inputting the attack class a for which adversarial samples are required;
step seven: sampling a minibatch of m noise samples {z1, ..., zm} from the noise prior pg(z), and sampling a minibatch of m samples {x1, ..., xm} from the data-generating distribution pdata(x);
step eight: passing {x1, ..., xm} through the feature extractor f to extract latent features and output the feature vector f(x), and sampling {z1, ..., zm} from a normal distribution to output the noise vector z;
step nine: concatenating the feature vector f(x) and the noise vector z and feeding the concatenated vector into the generator G to generate samples {xadv1, xadv2, ..., xadvm};
step ten: minimizing the L2 loss between xadvi and xi, wherein 1 ≤ i ≤ m, to limit the perturbation size; updating the discriminator D by ascending its stochastic gradient so that D maximizes the difference between real images and generated samples; and updating the generator G by descending its stochastic gradient so that G minimizes the probability that its generated samples belong to the class t;
step eleven: repeating step ten until {xadv1, xadv2, ..., xadvm} approaches the original data distribution and the output adversarial samples are classified as label a;
step twelve: repeating steps six to eleven until adversarial samples have been generated for every attack class, that is, for every image class present in the original image data set; manually labeling all adversarial samples with their correct class labels, adding them back into the training set, and retraining the target model M.
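As a reading aid, the transformation described in steps one to three can be sketched in code. The following PyTorch sketch is illustrative only and is not the claimed implementation: the backbone attribute names (stem, layer_groups), the 1x1 lateral convolutions, and the fusion of upsampled feature maps by elementwise addition are assumptions made here for concreteness, since the text only specifies upsampling the deeper branch feature maps to match the shallower ones and averaging the branch predictions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FeaturePyramidClassifier(nn.Module):
    """Wraps a ResNet-style backbone whose residual block groups are exposed as a
    list (assumed here to be `backbone.layer_groups`, preceded by `backbone.stem`).
    Prediction branches P1..Pn are attached after the first n groups, where
    n = number of groups - 1; deeper feature maps are upsampled to the resolution
    of the next shallower branch (step two repeated) and the branch predictions
    are averaged to give the output (step three)."""

    def __init__(self, backbone, num_classes, branch_channels):
        super().__init__()
        self.stem = backbone.stem
        self.groups = nn.ModuleList(backbone.layer_groups)
        n = len(self.groups) - 1  # number of additional branches
        # one lightweight prediction head per branch: global average pooling + linear
        self.heads = nn.ModuleList(
            [nn.Linear(branch_channels[i], num_classes) for i in range(n)]
        )
        # 1x1 convolutions to align channel counts before top-down fusion (assumption)
        self.laterals = nn.ModuleList(
            [nn.Conv2d(branch_channels[i + 1], branch_channels[i], kernel_size=1)
             for i in range(n - 1)]
        )

    def forward(self, x):
        x = self.stem(x)
        feats = []
        for group in self.groups[:-1]:  # the first n groups feed the branches P1..Pn
            x = group(x)
            feats.append(x)
        # top-down pass: upsample Pn to match Pn-1, then Pn-1 to Pn-2, and so on
        for i in range(len(feats) - 1, 0, -1):
            up = F.interpolate(feats[i], size=feats[i - 1].shape[-2:], mode="nearest")
            feats[i - 1] = feats[i - 1] + self.laterals[i - 1](up)  # fusion by addition (assumption)
        # average the branch predictions as the final output
        logits = [head(f.mean(dim=(2, 3))) for head, f in zip(self.heads, feats)]
        return torch.stack(logits, dim=0).mean(dim=0)
```

For the Resnet32 embodiment described below, `backbone.layer_groups` would hold the three residual block groups, giving the two branches P1 and P2.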
Drawings
FIG. 1 is a flow chart of a robustness optimization method of a deep neural network model.
FIG. 2 is a schematic diagram of the basic model transformed into a feature pyramid structure.
FIG. 3 is an architectural diagram of adversarial sample generation using latent features.
Detailed Description
Taking the classical deep neural network model Resnet32 as an example, a specific implementation of the deep neural network model robustness optimization method provided by the invention is described below.
step one: selecting the Resnet32 model as the basic model, the Resnet32 model having 3 groups of residual blocks, each group consisting of several identical residual blocks; adding branches P1, P2 after the first two groups of residual blocks of the Resnet32 model to form a multi-prediction-branch structure;
step two: expanding the feature map of the P2 layer by upsampling until it is consistent with the feature map of the P1 layer;
step three: averaging the prediction results of P1 and P2 as the output, thereby transforming the basic model Resnet32 into a feature pyramid structure;
step four: selecting the feature pyramid structure model transformed from the Resnet32 model as the target model M, and training the target model M until it accurately maps images sampled from the original data set to their corresponding labels t, wherein t is the manually annotated image category information in the original image data set;
step five: taking the last convolutional layer of the target model M as a feature extractor f, and constructing a generator G and a discriminator D, wherein the generator G is built from a convolutional neural network and the discriminator D is used to distinguish real images from generated samples;
step six: for the experiment on the MNIST data set, inputting the attack class 0 for which adversarial samples are required;
step seven: sampling a minibatch of 10000 random high-dimensional noise samples {z1, ..., z10000}, and sampling a minibatch of 10000 samples {x1, ..., x10000} from the raw data;
step eight: passing {x1, ..., x10000} through the feature extractor f to extract latent features and output the feature vector f(x), and sampling {z1, ..., z10000} from a normal distribution to output the noise vector z;
step nine: concatenating the image feature vector f(x) and the noise vector z and feeding the concatenated vector into the generator G to generate samples {xadv1, xadv2, ..., xadv10000};
step ten: minimizing the L2 loss between each generated image and the corresponding original image to limit the perturbation size; updating the discriminator D by ascending its stochastic gradient so that D maximizes the degree of difference between real images and generated samples; and updating the generator G by descending its stochastic gradient so that G minimizes the probability that each generated sample still belongs to its original class (a training-loop sketch is given after step twelve below);
step eleven: repeating step ten until the samples generated from all 10000 inputs are classified as label 0;
step twelve: repeating steps six to eleven with the attack class set to each digit from 0 to 9 in turn, until 10000 adversarial samples have been generated for each attack class; labeling the generated adversarial samples with their correct labels, adding them to the MNIST training set, and retraining the target model M.
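The following PyTorch sketch illustrates one way to implement steps seven to eleven. It is a sketch under stated assumptions rather than the claimed implementation: the architectures of G and D, the optimizer settings, the loss weights lam_l2, lam_gan and lam_cls, and the inclusion of a standard GAN term that pushes the generated samples toward the real data distribution (implied by step eleven) are choices made here for concreteness.

```python
import torch
import torch.nn.functional as F


def train_adversarial_generator(G, D, target_model, feature_extractor, loader,
                                z_dim=100, epochs=10, lam_l2=1.0, lam_gan=1.0,
                                lam_cls=1.0, device="cuda"):
    """Sketch of steps seven to eleven: G maps the concatenation of the latent
    feature vector f(x) and a Gaussian noise vector z to a generated sample,
    D separates real images from generated ones, and the frozen target model M
    scores the generated samples."""
    opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
    opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
    bce = torch.nn.BCEWithLogitsLoss()

    for _ in range(epochs):
        for x, t in loader:                                   # x: real images, t: true labels
            x, t = x.to(device), t.to(device)
            z = torch.randn(x.size(0), z_dim, device=device)  # step seven: noise samples
            with torch.no_grad():
                fx = feature_extractor(x).flatten(1)          # step eight: latent features f(x)
            x_adv = G(torch.cat([fx, z], dim=1))              # step nine: concatenated input to G

            # step ten (discriminator): update D to maximize the separation
            # between real images and generated samples
            d_real, d_fake = D(x), D(x_adv.detach())
            loss_d = bce(d_real, torch.ones_like(d_real)) + \
                     bce(d_fake, torch.zeros_like(d_fake))
            opt_d.zero_grad(); loss_d.backward(); opt_d.step()

            # step ten (generator): limit the perturbation with an L2 loss, push the
            # generated samples toward the real distribution, and minimize the
            # probability that each generated sample keeps its original class t
            logits = target_model(x_adv)
            p_true = F.softmax(logits, dim=1).gather(1, t.unsqueeze(1)).mean()
            d_fake_for_g = D(x_adv)
            loss_g = (lam_l2 * F.mse_loss(x_adv, x)
                      + lam_gan * bce(d_fake_for_g, torch.ones_like(d_fake_for_g))
                      + lam_cls * p_true)
            opt_g.zero_grad(); loss_g.backward(); opt_g.step()
            # step eleven: in the method, training is repeated until the target
            # model labels the generated samples as the requested attack class a
    return G
```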
Through the above process, the robustness optimization method of the deep neural network model is realized. Its flow chart is shown in FIG. 1 and comprises two stages: modifying the network structure and modifying the model input data. The network-structure modification stage adds prediction branches to the target model and transforms it into a feature pyramid structure; the input-data modification stage uses the feature extractor to extract latent features for generating adversarial samples, and the generated adversarial samples are added to the training set to retrain the model.
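A compact sketch of the second stage (step twelve), again under illustrative assumptions: the relabeled adversarial samples are merged with the original training set and the target model M is retrained on the union; the dataset utilities, optimizer, and hyperparameters shown are not specified in the text and are chosen here only for concreteness.

```python
import torch
from torch.utils.data import ConcatDataset, DataLoader, TensorDataset


def retrain_with_adversarial(M, train_set, adv_images, adv_labels,
                             epochs=20, lr=0.01, device="cuda"):
    """Step twelve: merge the manually relabeled adversarial samples with the
    original training set and retrain the target model M on the union."""
    adv_set = TensorDataset(adv_images, adv_labels)  # adversarial samples with correct labels
    loader = DataLoader(ConcatDataset([train_set, adv_set]),
                        batch_size=128, shuffle=True)
    opt = torch.optim.SGD(M.parameters(), lr=lr, momentum=0.9)
    criterion = torch.nn.CrossEntropyLoss()
    M.to(device).train()
    for _ in range(epochs):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            loss = criterion(M(x), y)
            opt.zero_grad(); loss.backward(); opt.step()
    return M
```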
For the Resnet34 model, experiments were carried out on the MNIST data set and the CIFAR-10 data set using five white-box attacks. Table 1 compares the adversarial sample classification accuracy of the optimized Resnet34 model under the five white-box attacks with high perturbation; the first part gives the experimental results on the MNIST data set and the second part those on the CIFAR-10 data set.
As can be seen from Table 1, the classification accuracy of the proposed method under the five high-perturbation white-box attacks is improved by at least a factor of 4 compared with the basic model; the optimized model has stronger defense capability and a wider defense range than the original model, and the approach can therefore serve as a robustness optimization method for deep neural network models.
TABLE 1 Comparison of adversarial sample classification accuracy on the MNIST and CIFAR-10 data sets