Semi-supervised small sample image classification method based on graph collaborative training
1. A semi-supervised small sample image classification method based on graph collaborative training is characterized by comprising the following steps:
extracting image features by using a convolutional neural network;
Encoding relationships between edges and points in the graph learning using an adjacency matrix, wherein the elements of the adjacency matrix are:
wherein G denotes the graph to be learned, V and ε respectively denote the point set and the edge set, the point set consisting of image samples; A denotes the adjacency matrix; x_i (i = 1, 2, …, N) is the embedded feature of v_i, v_i being the i-th vertex of V; e denotes an edge; dis(x_i, x_j) denotes the operator that computes the distance between the embedded features of v_i and v_j.
Carrying out isolated graph learning by using an objective function, converting samples in the feature space into samples in the graph space, and classifying different test samples by learning a regularized projection, wherein the objective function is as follows:
wherein P denotes the regularized projection and C denotes the total number of classes; X denotes the embedded features of the samples; L̃ denotes the normalized graph Laplacian; D denotes the degree matrix, a diagonal matrix whose (i, i)-th element equals the sum of the i-th row elements of A; Y denotes the initial label embedding matrix.
Expanding the isolated graph learning to a graph collaborative training framework with features of two modalities, namely a rotation modality and a mirror modality, carrying out graph collaborative training with unlabeled data to obtain the optimal base learners, and predicting the categories of the query labels with the optimal base learners, wherein the categories of the query labels are:
wherein X^r denotes the new data embedded features corresponding to the rotation-modality feature extractor ω_r(·), where X_s^r, X_u^r and X_q^r respectively denote the embedded features of the support-set, unlabeled-set and query-set data in the rotation modality; X^m denotes the new data embedded features corresponding to the mirror-modality feature extractor ω_m(·), where X_s^m, X_u^m and X_q^m respectively denote the embedded features of the support-set, unlabeled-set and query-set data in the mirror modality; P_r* and P_m* denote the two optimal base learners.
2. The image classification method according to claim 1, wherein the isolated graph learning is performed by using an objective function, samples in a feature space are converted into samples in a graph space, and different test samples are classified by learning a regularized projection, specifically:
obtaining a relaxed function by relaxing the objective function, and alternately updating P and B until the relaxed function converges, so as to obtain the optimal projection P*;
wherein P denotes the regularized projection and C denotes the total number of classes; X denotes the embedded features of the samples; L̃ denotes the normalized graph Laplacian; D denotes the degree matrix, a diagonal matrix whose (i, i)-th element equals the sum of the i-th row elements of A; B denotes a diagonal matrix (B_(ij) denotes the element in the i-th row and j-th column of B, and P_(i·) denotes the i-th row of P); Y denotes the initial label embedding matrix;
the method of predicting the class of test samples is as follows:
wherein x_t denotes a test sample; max denotes the operator that returns the index of the maximum value in a vector.
3. The image classification method according to claim 1 or 2, wherein the isolated graph learning is extended to a graph collaborative training framework with features of two modalities, namely a rotation modality and a mirror modality, and graph collaborative training is performed with unlabeled data to obtain the optimal base learners, specifically:
The first step: two different base learners are respectively constructed by using the support-set features of the two modalities.
The second step: soft pseudo labels of the unlabeled data are predicted from the features of the two modalities.
The third step: the values in the soft pseudo-label matrix are sorted, the unlabeled samples with the highest confidence are then selected on the embedded features of each modality, and the pseudo-labeled instances and their corresponding labels are added, cross-wise, to the support set of the other modality.
The fourth step: the preceding three steps are repeated until the performance of the base learners no longer improves, so as to obtain the two optimal base learners.
Background
In the past few years, visual recognition methods based on deep learning have in some cases reached or even exceeded human-level performance, and an indispensable factor in this success is the large amount of labeled data. In practice, however, the burden of data collection and maintenance can be significant. Therefore, small sample learning, in which only a few labeled samples are available in each category, is attracting increasing attention.
At present, the main small sample image classification methods include the following:
(1) Optimization-based small sample image classification methods: when supervision information is abundant, a model can be learned by gradient descent and validated by cross-validation, but the number of samples in small sample learning is too small to support this, so providing a good initialization parameter θ can greatly reduce the training cost and the error rate. Singh et al. in 2019 combined domain adaptation with MAML so that the model fits quickly on limited samples; Rajeswaran, Finn et al. in 2019 addressed the need to compute a Hessian matrix when solving the gradients of MAML and proposed an improved version, iMAML; Lee et al. in 2019 learned based on convex optimization by exploiting implicit differentiability of the optimization and the low-rank property of the classifier; in 2019, Luca et al. used the idea of dimensionality reduction to alleviate the computational difficulty in small sample learning and improved the running speed.
(2) Metric-based small sample image classification methods: metric-based small sample learning combines metric learning with small sample learning, and its basic principle is to autonomously learn a task-specific metric (distance) function for different tasks. The prototype network proposed by Snell et al. in 2017 is a simple and efficient small sample learning method that learns by computing the prototype center of each class in an embedding space; in 2018, Oreshkin et al. improved the prototype network by means of metric scaling, reducing the gap between cosine similarity and Euclidean distance in small sample learning, and proposed auxiliary-task co-training so that task-dependent feature extraction is easier to train and generalizes well; Ren et al. successfully applied the prototype network method to this field in 2018; Wang et al. in 2018 expanded sample diversity with virtual data produced by a generative model and trained together with the prototype network. These metric-based small sample learning methods achieve good performance on small sample image classification tasks, but because the support set contains only a few samples at test time, the class prototype computed for each class in the prototype network cannot represent the overall distribution of the test samples well; this feature mismatch problem limits small sample image classification performance to a certain extent.
Disclosure of Invention
In order to solve the problems existing in prior-art small sample image classification methods, embodiments of the present invention provide a semi-supervised small sample image classification method based on graph collaborative training. The technical solution is as follows:
the invention provides a semi-supervised small sample image classification method based on graph collaborative training, which comprises the following steps:
extracting image features by adopting a convolutional neural network;
encoding relationships between edges and points in the graph learning using an adjacency matrix, wherein the elements of the adjacency matrix are:
wherein G denotes the graph to be learned, V and ε respectively denote the point set and the edge set, the point set being composed of image samples; A denotes the adjacency matrix; x_i (i = 1, 2, …, N) is the embedded feature of v_i, v_i being the i-th vertex of V; e denotes an edge; dis(x_i, x_j) denotes the operator that computes the distance between the embedded features of v_i and v_j;
carrying out isolated graph learning by using an objective function, converting samples in the feature space into samples in the graph space, and classifying different test samples by learning a regularized projection, wherein the objective function is as follows:
wherein P denotes the regularized projection and C denotes the total number of classes; X denotes the embedded features of the samples; L̃ denotes the normalized graph Laplacian; D denotes the degree matrix, a diagonal matrix whose (i, i)-th element equals the sum of the i-th row elements of A; Y denotes the initial label embedding matrix;
expanding the isolated graph learning to a graph collaborative training framework with features of two modalities, namely a rotation modality and a mirror modality, carrying out graph collaborative training with unlabeled data to obtain the optimal base learners, and predicting the categories of the query labels with the optimal base learners, wherein the categories of the query labels are:
wherein X^r denotes the new data embedded features corresponding to the rotation-modality feature extractor ω_r(·), where X_s^r, X_u^r and X_q^r respectively denote the embedded features of the support-set, unlabeled-set and query-set data in the rotation modality; X^m denotes the new data embedded features corresponding to the mirror-modality feature extractor ω_m(·), where X_s^m, X_u^m and X_q^m respectively denote the embedded features of the support-set, unlabeled-set and query-set data in the mirror modality; P_r* and P_m* denote the two optimal base learners.
Optionally, the isolated graph learning is performed by using an objective function, samples in the feature space are converted into samples in the graph space, and different test samples are classified by learning a regularized projection, which specifically includes:
obtaining a relaxed function by relaxing the objective function, and alternately updating P and B until the relaxed function converges, so as to obtain the optimal projection P*;
wherein P denotes the regularized projection and C denotes the total number of classes; X denotes the embedded features of the samples; L̃ denotes the normalized graph Laplacian; D denotes the degree matrix, a diagonal matrix whose (i, i)-th element equals the sum of the i-th row elements of A; B denotes a diagonal matrix (B_(ij) denotes the element in the i-th row and j-th column of B, and P_(i·) denotes the i-th row of P); Y denotes the initial label embedding matrix;
the method of predicting the class of test samples is as follows:
wherein x_t denotes a test sample; max denotes the operator that returns the index of the maximum value in a vector.
Optionally, the isolated graph learning is extended to a graph collaborative training framework with features of two modalities, namely a rotation modality and a mirror modality, and graph collaborative training is performed with unlabeled data to obtain the optimal base learners, specifically:
The first step: constructing two different base learners respectively by using the support-set features of the two modalities;
The second step: predicting soft pseudo labels of the unlabeled data from the features of the two modalities;
The third step: sorting the values in the soft pseudo-label matrix, then selecting the unlabeled samples with the highest confidence on the embedded features of each modality, and adding the pseudo-labeled instances and their corresponding labels, cross-wise, to the support set of the other modality;
The fourth step: repeating the preceding three steps until the performance of the base learners no longer improves, so as to obtain the two optimal base learners.
The technical solution provided by the embodiments of the present invention has the following beneficial effects:
the embodiment of the invention provides a semi-supervised small sample image classification method based on graph collaborative training, which provides a new label prediction method, namely isolated graph learning. Secondly, a semi-supervised graph collaborative training method is provided, isolated graph learning is extended to a graph collaborative training framework with characteristics of two modes, namely a rotation mode and a mirror mode, the problem of characteristic mismatching in small sample learning is solved from the perspective of multi-mode fusion, and the classification performance of small sample images is greatly improved.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 is a schematic flowchart of a semi-supervised small sample image classification method based on graph collaborative training according to an embodiment of the present invention;
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail with reference to the accompanying drawings.
The semi-supervised small sample image classification method based on graph collaborative training according to the embodiment of the present invention will be described in detail below with reference to fig. 1.
Referring to fig. 1, a semi-supervised small sample image classification method based on graph collaborative training according to an embodiment of the present invention includes:
step 110: and extracting image features by adopting a convolutional neural network.
Image features are extracted with the Resnet-12 convolutional neural network model. Specifically, the image is first resized to 84 × 84, and the Resnet-12 model is then called to obtain the features of the image to be processed. Extracting image features with a convolutional neural network is not the protected content of the present invention; it belongs to the prior art and is a common image feature extraction method.
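As an illustration only, a minimal sketch of this feature extraction step in PyTorch is given below. The embodiment names a Resnet-12 backbone; torchvision does not provide Resnet-12, so resnet18 with its classification head removed is used here purely as a stand-in, and the 84 × 84 resize follows the description above.

```python
# Illustrative sketch of Step 110 (feature extraction). Assumes PyTorch/torchvision;
# resnet18 is only a stand-in for the Resnet-12 backbone named in the embodiment.
import torch
from torchvision import models, transforms
from PIL import Image

preprocess = transforms.Compose([
    transforms.Resize((84, 84)),   # the embodiment resizes images to 84 x 84
    transforms.ToTensor(),
])

backbone = models.resnet18(weights=None)
backbone.fc = torch.nn.Identity()  # drop the classifier head, keep the embedding
backbone.eval()

def extract_feature(image_path: str) -> torch.Tensor:
    """Return the embedded feature x_i of a single image."""
    img = Image.open(image_path).convert("RGB")
    with torch.no_grad():
        return backbone(preprocess(img).unsqueeze(0)).squeeze(0)
```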
Step 120: the adjacency matrix is used to encode the relationships between edges and points in the graph learning.
Define G = (V, ε) as the graph to be learned, where V and ε respectively denote the point set and the edge set, the point set consisting of image samples.
The adjacency matrix A encodes the relationships between edges and points in the graph learning, where x_i (i = 1, 2, …, N) is the embedded feature of v_i and v_i is the i-th vertex of V. The elements of the adjacency matrix A are represented as:
where e denotes an edge, and dis(x_i, x_j) denotes the operator that computes the distance between the embedded features of v_i and v_j;
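The sketch below builds an adjacency matrix from the embedded features for illustration; the Gaussian kernel on Euclidean distances and the k-nearest-neighbour edges are assumed instantiations of dis(x_i, x_j) and of the edge set, not the embodiment's exact formula.

```python
# Illustrative sketch of Step 120 (adjacency matrix construction). The Gaussian
# kernel, Euclidean dis(x_i, x_j) and k-nearest-neighbour edges are assumptions.
import torch

def build_adjacency(x: torch.Tensor, k: int = 10, sigma: float = 1.0) -> torch.Tensor:
    """x: (N, d) embedded features; returns a symmetric (N, N) adjacency matrix A."""
    dist = torch.cdist(x, x)                       # pairwise Euclidean distances
    sim = torch.exp(-dist.pow(2) / (2 * sigma ** 2))
    sim.fill_diagonal_(0)                          # no self-loops
    topk = sim.topk(k, dim=1).indices              # keep an edge only to k nearest neighbours
    mask = torch.zeros_like(sim)
    mask.scatter_(1, topk, 1.0)
    a = sim * mask
    return torch.maximum(a, a.t())                 # symmetrize A
```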
step 130: and (3) carrying out isolated graph learning by adopting an objective function, converting samples in the characteristic space into samples in the graph space, and classifying different test samples by learning regularized projection.
Unlike graph learning, which requires both labeled and unlabeled data to construct a graph, isolated graph learning is a new label prediction method: it converts samples in the feature space into samples in the graph space and learns a regularized projection P (where C denotes the total number of classes) to classify different test samples, thereby completing the training and testing process more flexibly and independently.
The loss function is defined as:
wherein f_1(P) denotes the graph Laplacian regularizer; λ and μ denote the parameters balancing the terms; f_2(P) denotes the empirical loss term; f_3(P) denotes the constraint term.
The representative function of f_1(P) is as follows:
wherein X denotes the embedded features of the samples; (·)_(i·) denotes the i-th row of (·); D denotes the degree matrix, a diagonal matrix whose (i, i)-th element equals the sum of the i-th row elements of A; L̃ denotes the normalized graph Laplacian.
The representative function of f_2(P) is as follows:
wherein Y denotes the initial label embedding matrix; Y_(ij) is 1 when the i-th sample belongs to the j-th class, and 0 otherwise.
The L_{2,1} norm is introduced to select the essential features of P and to avoid overfitting; the representative function of f_3(P) is as follows:
wherein ||·||_{2,1} denotes the L_{2,1} regularization, which is a sparse learning method.
The objective function for isolated graph learning is then formed from these three terms:
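For illustration, one objective consistent with the three terms described above is sketched below, assuming the standard normalized-Laplacian, least-squares and L_{2,1} forms; these concrete expressions are an assumption, not necessarily the exact formulas of the embodiment.

```latex
% Illustrative reconstruction (assumed forms, not the embodiment's exact formulas)
\begin{aligned}
f_1(P) &= \operatorname{tr}\!\left(P^{\top} X^{\top} \tilde{L} X P\right),
  \qquad \tilde{L} = I - D^{-1/2} A D^{-1/2},\\
f_2(P) &= \lVert X P - Y \rVert_F^2,\\
f_3(P) &= \lVert P \rVert_{2,1} = \sum_{i} \lVert P_{(i\cdot)} \rVert_2,\\
\min_{P}\; F(P) &= f_1(P) + \lambda\, f_2(P) + \mu\, f_3(P).
\end{aligned}
```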
A relaxed function is obtained by relaxing the objective function of isolated graph learning, and P and B are alternately updated until the relaxed function converges, so as to obtain the optimal projection P*.
wherein B denotes a diagonal matrix (B_(ij) denotes the element in the i-th row and j-th column of B, and P_(i·) denotes the i-th row of P);
given a test specimenThe method of predicting the class of test samples is as follows: :
where max denotes the operator that returns the index of the maximum value in a vector.
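A minimal sketch of this alternating update is given below. It assumes the relaxed objective takes the common form tr(P^T X^T L̃ X P) + λ·||XP − Y||_F^2 + μ·tr(P^T B P) with B_ii = 1/(2·||P_(i·)||_2); the embodiment's exact relaxation is not reproduced above, so the closed-form P-step shown here is an assumption.

```python
# Illustrative sketch of isolated graph learning with the alternating P/B update.
# Assumes the relaxed objective tr(P^T X^T L X P) + lam*||X P - Y||_F^2 + mu*tr(P^T B P)
# with B diagonal, B_ii = 1 / (2 * ||P_(i.)||_2); this concrete form is an assumption.
import torch

def isolated_graph_learning(x, y, a, lam=1.0, mu=0.1, n_iter=50, eps=1e-8):
    """x: (N, d) embedded features, y: (N, C) initial label embedding, a: (N, N) adjacency.
    Returns the learned projection P of shape (d, C)."""
    n, d = x.shape
    deg = a.sum(dim=1).clamp_min(eps)
    d_inv_sqrt = torch.diag(deg.pow(-0.5))
    lap = torch.eye(n) - d_inv_sqrt @ a @ d_inv_sqrt    # normalized graph Laplacian
    b = torch.eye(d)                                    # B initialised to the identity
    xtlx, xtx, xty = x.t() @ lap @ x, x.t() @ x, x.t() @ y
    p = torch.zeros(d, y.shape[1])
    for _ in range(n_iter):
        # P-step: closed-form solution of the relaxed objective with B fixed
        p = torch.linalg.solve(xtlx + lam * xtx + mu * b, lam * xty)
        # B-step: diagonal reweighting from the current row norms of P
        b = torch.diag(1.0 / (2.0 * p.norm(dim=1).clamp_min(eps)))
    return p

def predict_class(p, x_test):
    """Predicted class index: arg max over the projected label scores."""
    return torch.argmax(x_test @ p, dim=-1)
```

The predict_class helper corresponds to the max operator above: the test sample is projected into the label space and the index of its largest score is returned.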
Step 140: expanding the isolated graph learning to a graph collaborative training framework with features of two modalities, namely a rotation modality and a mirror modality, carrying out graph collaborative training with unlabeled data to obtain the optimal base learners, and predicting the categories of the query labels with the optimal base learners.
Define X^r as the new data embedded features corresponding to the rotation-modality feature extractor ω_r(·), where X_s^r, X_u^r and X_q^r respectively denote the embedded features of the support-set, unlabeled-set and query-set data in the rotation modality; define X^m as the new data embedded features corresponding to the mirror-modality feature extractor ω_m(·), where X_s^m, X_u^m and X_q^m respectively denote the embedded features of the support-set, unlabeled-set and query-set data in the mirror modality.
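How the two feature extractors ω_r(·) and ω_m(·) are obtained is not detailed above; one plausible reading, assumed here only for illustration, is that each modality embeds rotation- or mirror-augmented images with its own backbone:

```python
# Illustrative sketch of producing the two modality embeddings X^r and X^m.
# Treating omega_r / omega_m as backbones applied to rotated / mirrored images is
# an assumption about the construction, not a statement of the embodiment.
import torch
import torchvision.transforms.functional as tf

def embed(backbone, images, view):
    """images: (B, 3, 84, 84) batch; returns (B, d) embedded features for one modality."""
    with torch.no_grad():
        return backbone(view(images))

rotation_view = lambda imgs: torch.rot90(imgs, k=1, dims=(2, 3))  # 90-degree rotation
mirror_view = lambda imgs: tf.hflip(imgs)                         # horizontal mirror

# e.g. xs_r = embed(backbone_r, support_images, rotation_view)    # X_s^r
#      xs_m = embed(backbone_m, support_images, mirror_view)      # X_s^m
```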
The first step: according to the formula in step 130, two different base learners P_r and P_m are respectively constructed from the support-set features X_s^r and X_s^m of the two modalities. The two base learners P_r and P_m are as follows:
the second step is that: predicting soft pseudo labels of unlabeled data from two modal features, soft pseudo label matrix of unlabeled data predicted on rotation modalitySoft pseudo-label matrix predicted on mirror mode with unlabeled dataAs follows:
the third step: sorting the values in the soft pseudo-label matrix and then selecting the unlabeled exemplars with the highest confidence on the embedded features of each modalityAndextending pseudo-tag instances and corresponding labels across support sets on different schemas;
wherein Y_s^r and Y_s^m denote the label matrices of the support-set samples of the two modalities, and the remaining matrices denote the corresponding pseudo-label matrices of the two modalities.
The fourth step: the preceding three steps are repeated until the performance of the base learners no longer improves, so as to obtain the two optimal base learners P_r* and P_m*.
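Putting the four steps together, a compact sketch of the graph co-training loop is given below. It reuses the isolated_graph_learning and soft_pseudo_labels sketches above; the per-round selection size, the confidence measure, the adjacency helper adj_fn and the validation-based stopping test are illustrative assumptions rather than the embodiment's exact procedure.

```python
# Illustrative sketch of graph collaborative training (steps one to four).
import torch

def graph_co_training(xs_r, ys_r, xs_m, ys_m, xu_r, xu_m, adj_fn,
                      n_select=5, max_rounds=10, val_fn=None):
    """Returns the two (approximately) optimal base learners P_r*, P_m*."""
    best_pr, best_pm, best_score = None, None, float("-inf")
    for _ in range(max_rounds):
        # step 1: one base learner per modality from its current support set
        p_r = isolated_graph_learning(xs_r, ys_r, adj_fn(xs_r))
        p_m = isolated_graph_learning(xs_m, ys_m, adj_fn(xs_m))
        if val_fn is not None:
            score = val_fn(p_r, p_m)
            if score <= best_score:        # step 4: stop once performance no longer rises
                break
            best_score = score
        best_pr, best_pm = p_r, p_m
        if xu_r.shape[0] == 0:
            break
        # step 2: soft pseudo labels of the unlabeled data on both modalities
        s_r, s_m = soft_pseudo_labels(xu_r, p_r), soft_pseudo_labels(xu_m, p_m)
        # step 3: most confident unlabeled samples per modality, added cross-wise
        conf_r, lab_r = s_r.max(dim=1)
        conf_m, lab_m = s_m.max(dim=1)
        k = min(n_select, conf_r.numel())
        idx_r, idx_m = conf_r.topk(k).indices, conf_m.topk(k).indices
        c = ys_r.shape[1]
        xs_m = torch.cat([xs_m, xu_m[idx_r]])
        ys_m = torch.cat([ys_m, torch.eye(c)[lab_r[idx_r]]])
        xs_r = torch.cat([xs_r, xu_r[idx_m]])
        ys_r = torch.cat([ys_r, torch.eye(c)[lab_m[idx_m]]])
        keep = torch.ones(xu_r.shape[0], dtype=torch.bool)
        keep[idx_r] = False
        keep[idx_m] = False
        xu_r, xu_m = xu_r[keep], xu_m[keep]
    return best_pr, best_pm
```

The query labels can then be predicted by fusing the two optimal base learners, for example by summing the score matrices X_q^r @ P_r* and X_q^m @ P_m* before taking the arg max for each query sample; this summation is only one plausible fusion rule.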
The predicted categories of the query labels are:
the embodiment of the invention provides a semi-supervised small sample image classification method based on graph collaborative training, which provides a new label prediction method, namely isolated graph learning. Secondly, a semi-supervised graph collaborative training method is provided, isolated graph learning is extended to a graph collaborative training framework with characteristics of two modes, namely a rotation mode and a mirror mode, the problem of characteristic mismatching in small sample learning is solved from the perspective of multi-mode fusion, and the classification performance of small sample images is greatly improved.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.