Hyperspectral image semi-supervised classification method based on small sample learning
1. The hyperspectral image semi-supervised classification method based on small sample learning is characterized by comprising the following steps of:
(1) selecting five data sets of Indian Pines, KSC, Salinas, Pavia University and Botswana from a hyperspectral database, and reading each to obtain its three-dimensional matrix data domain of size m×n×h and its label domain, a two-dimensional matrix of size m×n, where h denotes the spectral dimension of the hyperspectral image and (m, n) denotes the position of a pixel in a given spectral band;
(2) respectively carrying out data normalization processing, background class removal and dimension reduction operation preprocessing on the five data sets obtained in the step (1) by utilizing a three-dimensional matrix data domain in each data set so as to eliminate the influence caused by noise and redundant information;
(3) dividing a training set and a test set: randomly selecting one of the five preprocessed data sets as the test set; randomly extracting ζ classes from the remaining four data sets as the training set, where the value of ζ is set separately for each training set;
(4) constructing a hyperspectral image prototype classification network consisting, in order, of input data → first convolution layer → first linear rectification function ReLU operation → second convolution layer → second ReLU operation → fully connected layer;
(5) training the hyperspectral image prototype classification network, namely taking the negative logarithm form of the Euclidean distances from the unlabeled samples in the training set to each prototype as the loss function, and iteratively updating it with stochastic gradient descent to optimize the network until the loss function value is minimized, obtaining the trained classification network;
(6) predicting the test set to finish classification:
(6a) selecting 3 samples from each of the K categories in the test set as the test support set S2, with the remaining samples as the test query set Q2;
(6b) calculating the center of gravity c_k of each class of the test support set after network mapping, and using it as the initial value of each class's test support set prototype c_k;
(6c) performing a softmax logistic regression operation on the distances from each sample in the test query set to all test support set prototypes c_k, obtaining the class probabilities of all test query set data;
(6d) screening out high-confidence test query set data by comparing the class probabilities against a set threshold, and calculating the center of gravity of the screened test query set together with the test support set after network mapping, as the corrected test support set prototype;
(6e) for each sample in the unscreened test query set, separately calculating its distance to all corrected prototypes, then sequentially performing softmax logistic regression and argmax operations to obtain the predicted class label, completing the classification.
2. The method of claim 1, wherein: the data normalization processing in (2) is realized as follows:
(2a) transforming the three-dimensional matrix data field m × n × h obtained in (1) into a two-dimensional matrix [ (m × n), h ];
(2b) traversing the h spectral bands with a Max-Min normalization operation, mapping the m×n data of each band into the range [0, 1] to obtain the normalized pixel value x′_ijs:

x′_ijs = (x_ijs − x_s^min) / (x_s^max − x_s^min)

where s denotes a spectral band of the hyperspectral image, (i, j) denotes the coordinates of a pixel in band s, x_ijs denotes the value of that pixel in band s, and x_s^max and x_s^min denote the maximum and minimum values over all pixels in band s.
3. The method of claim 1, wherein: the background-removal operation in (2) removes the samples and labels whose class is 0 in the data set.
4. The method of claim 1, wherein: the dimension reduction operation in (2) performs principal component analysis on the five data sets with the background class removed, obtaining a three-dimensional matrix m×n×p_n, where p_n is set to 50.
5. The method of claim 1, wherein: the parameters of each layer of the hyperspectral image prototype classification network in (4) are set as follows:
the first convolution layer has 50 feature maps in total, a 3×3 convolution kernel, a convolution stride of 1, and a padding value of 1;
the second convolution layer has 100 feature maps in total, a 3×3 convolution kernel, a convolution stride of 1, and a padding value of 0;
the input width of the fully connected layer is 200 and the output width is 9.
6. The method according to claim 1, wherein the network is optimized in (5) by iteratively updating the loss function using a stochastic gradient descent method as follows:
(5a) in the training set, selecting 3 samples from each of the ζ classes as the training support set S1, with the remaining samples as the training query set Q1;
(5b) using the training support set S1, calculating the training support set prototypes c_ξ:

c_ξ = (1/N_ξ) Σ_{x_i ∈ S1, y_i = ξ} f_φ(x_i)

where x_i denotes the i-th training support set sample, f_φ(x_i) denotes that sample after network mapping, N_ξ is the number of class-ξ training support samples, and f_φ is the mapping function of the network;
(5c) using the training support set prototypes c_ξ, predicting in turn the class probabilities p_φ(y = ξ | x_j) of the training query set Q1:

p_φ(y = ξ | x_j) = exp(−d(f_φ(x_j), c_ξ)) / Σ_{ξ′} exp(−d(f_φ(x_j), c_ξ′))

where x_j denotes the j-th training query set sample, f_φ(x_j) denotes that sample after network mapping, and d(f_φ(x_j), c_ξ) denotes the distance from the network-mapped training query sample x_j to the prototype c_ξ;
(5d) setting the threshold p1 = 0.9, and comparing the class probability p_φ(y = ξ | x_j) with the threshold p1 to determine the weight w_{j,ξ} of each training query sample in the prototype update: when p_φ(y = ξ | x_j) is greater than p1, w_{j,ξ} is set to that probability value; otherwise, w_{j,ξ} is set to 0;
(5e) sequentially updating the ζ training support set prototypes:

c_ξ ← (Σ_{x_i ∈ S1, y_i = ξ} f_φ(x_i) + Σ_{x_j ∈ Q1} w_{j,ξ} f_φ(x_j)) / (N_ξ + Σ_{x_j ∈ Q1} w_{j,ξ})
(5f) predicting the label ŷ_j of each sample in the unscreened training query set:

ŷ_j = argmax_ξ p_φ(y = ξ | x_j)

where p_φ(y = ξ | x_j) denotes the probability that sample x_j is assigned the label ξ;
(5g) calculating the objective function J_q(Φ) of the current network, where q = 1, 2, …, Q indexes the q-th round of training of the classification network, Q denotes the total number of training rounds, J_{q−1}(Φ) denotes the objective function obtained in the previous round of training, and N denotes the total number of samples in the training query set;
(5h) looping over (5a) to (5g) until the total number of training rounds reaches Q = 1000, then ending training and taking the network model with the minimum objective function value J(Φ) as the trained classification network.
7. The method of claim 1, wherein: the center of gravity c_k of each class of the test support set after network mapping in (6b) is calculated as:

c_k = (1/N_k) Σ_{z_i ∈ S2, y_i = k} f_φ(z_i)

where z_i denotes the i-th test support set sample, f_φ(z_i) denotes that sample after network mapping, and N_k is the number of class-k test support samples.
8. The method of claim 1, wherein: the class probability obtained in (6c) is given by:

p_φ(y = k | z_j) = exp(−d(f_φ(z_j), c_k)) / Σ_{k′} exp(−d(f_φ(z_j), c_k′))

where z_j denotes the j-th test query set sample, f_φ(z_j) denotes that sample after network mapping, and d(f_φ(z_j), c_k) denotes the distance from the network-mapped test query sample z_j to the prototype c_k.
9. The method of claim 1, wherein: the corrected test support set prototype in (6d), i.e., the center of gravity of the screened test query set and the test support set after network mapping, is calculated as follows:
(6d1) setting the threshold p2 = 0.9, and comparing the class probability p_φ(y = k | z_j) with the threshold p2 to determine the weight w_{j,k} of each test query sample in the prototype update: when p_φ(y = k | z_j) is greater than p2, w_{j,k} is set to that probability value; otherwise, w_{j,k} is set to 0;
(6d2) using the test query set weights w_{j,k}, sequentially updating the K test support set prototypes ĉ_k by the following formula:

ĉ_k = (Σ_{z_i ∈ S2, y_i = k} f_φ(z_i) + Σ_{z_j ∈ Q2} w_{j,k} f_φ(z_j)) / (N_k + Σ_{z_j ∈ Q2} w_{j,k})

where f_φ(z_i) denotes a test support set sample after network mapping, f_φ(z_j) denotes a test query set sample after network mapping, and N_k is the number of class-k test support samples.
10. The method of claim 1, wherein: in (6e), the distances from the test query set samples to all corrected prototypes ĉ_k are computed and softmax logistic regression and argmax operations are performed in sequence, as follows:

(6e1) computing the probability value p(y = k | z_j) that test query set sample z_j is assigned the label k:

p(y = k | z_j) = exp(−d(f_φ(z_j), ĉ_k)) / Σ_{k′} exp(−d(f_φ(z_j), ĉ_k′))

where d(f_φ(z_j), ĉ_k) denotes the distance from the network-mapped test query sample z_j to the updated prototype ĉ_k, and d(f_φ(z_j), ĉ_k′) denotes its distance to the k′-th updated test support set prototype;

(6e2) according to the label probability values p(y = k | z_j), obtaining the predicted label ŷ_j of each sample in the unscreened test query set:

ŷ_j = argmax_k p(y = k | z_j)

where argmax returns the label with the maximum probability.
Background
Hyperspectral image classification is a key research focus in the field of image processing. Hyperspectral images are characterized by large data volume, many spectral bands, and strong correlation among bands. Although these characteristics provide rich information for classification, the scarcity of labeled samples makes models prone to overfitting, so hyperspectral images face many challenges in practical classification and recognition applications.
Existing hyperspectral image classification methods are divided into unsupervised, semi-supervised, and supervised methods according to whether unlabeled samples participate in training. Semi-supervised classification methods can be further divided into five types according to how unlabeled samples participate in training: graph-based methods, generative methods, co-training methods, semi-supervised clustering methods, and self-training methods.
Graph-based methods model the connection relations between individuals using a graph data structure, and mainly include two approaches: graph convolutional neural networks and label propagation. Graph-based hyperspectral image classification represents the similarity relations between samples with a graph model and can achieve high classification accuracy, but it suffers from a large computational burden.
Generative models, as the name implies, are models that generate observable data; typical examples include generative adversarial networks (GAN) and variational autoencoders. Generative methods can link unlabeled samples to the learning target through the parameters of a latent model, but they assume the sample data obey a latent distribution, require sufficiently reliable prior knowledge for modeling, and thus have a high barrier to use.
The co-training method expands the training set by having two mutually independent classifiers select reliable samples for each other, so as to improve classification accuracy. The method is simple, but the assumption of mutually independent classifiers is hard to satisfy in practical applications, and incorrect labels are easily passed to the other classifier early in training.
The most typical algorithm among semi-supervised clustering methods is the transductive support vector machine, which is essentially a generalization of the support vector machine; its goal is to find a hyperplane that separates the unlabeled samples. The method can handle large, high-dimensional data sets and is easy to operate, but it easily falls into a local rather than a global optimum.
The self-training method is another efficient learning approach. First, a classifier is trained with the labeled sample data; second, the trained classifier generates "pseudo labels" for the unlabeled samples; third, the pseudo-labeled data are combined with the labeled samples and the classifier is retrained; finally, the trained classifier predicts the class labels of the test samples, completing the classification. This learning method is simple, effective, and requires no specific assumptions, so it is widely used; however, when enough pseudo labels are incorrect, poor classification decisions are reinforced and the classifier's performance actually degrades. Researchers have proposed different solutions to this problem.
Lu et al. propose a self-training method in a novel synthetic classification for hyperspectral and panchromatic images based on self-training, which combines standard active learning with an active-learning-based image segmentation model; during learning it automatically selects unlabeled samples according to spatial-spectral features and the prediction information of a spectral classifier, lets them participate in training, and completes the classification. Although the method is simple to operate, the spectral similarity threshold and the number of unlabeled samples are determined manually, so an optimal solution is difficult to reach during learning.
Li et al. propose a soft-label-based sparse multinomial logistic regression model in semisupervised hyperspectral image classification using soft sparse multinomial logistic regression, which assigns a hard label and several soft labels to each unlabeled sample in turn and determines the final label type of the unlabeled sample after multiple iterations. The use of several soft labels can alleviate the problem of incorrect "pseudo labels", but the experimental results can be unstable.
Fang et al. propose a self-learning method based on a multiscale convolutional neural network ensemble in Multiscale CNNs Ensemble Based Self-Learning for Hyperspectral Image Classification. The method first extracts spatial information at different scales from the limited labeled training samples, then trains multiple CNN models, and finally classifies the unlabeled samples with the trained multiscale networks. Using multiple classifiers mitigates the problem of partially incorrect pseudo labels, but the time and memory costs are large.
Disclosure of Invention
The invention aims to provide a hyperspectral image semi-supervised classification method based on small sample learning that overcomes the deficiencies of existing self-training methods: it reduces the influence of low-confidence "pseudo-labeled" samples on the model during training, lets the model better represent the class distribution of the data, alleviates the overfitting that easily occurs in small-sample scenarios, and improves the classification performance of the network.
The technical idea of the invention is as follows: using a prototype network as the base model, the most reliable unlabeled samples, selected via a set threshold, are repeatedly added together with their predicted labels to the training set, the class prototypes are updated, and the classification is completed. The implementation scheme includes the following steps:
acquiring five public hyperspectral datasets; respectively preprocessing the data sets; obtaining a training set and a test set by adopting a non-repeated sampling method; constructing a hyperspectral image prototype classification network and setting parameters of each layer; training a hyperspectral image prototype classification network; inputting the test set into a trained hyperspectral image prototype classification network, correcting a category prototype by using a query set, and predicting the category of the query set by using the prototype, wherein the implementation comprises the following steps:
(1) selecting five data sets of Indian Pines, KSC, Salinas, Pavia University and Botswana from a hyperspectral database, and reading each to obtain its three-dimensional matrix data domain of size m×n×h and its label domain, a two-dimensional matrix of size m×n, where h denotes the spectral dimension of the hyperspectral image and (m, n) denotes the position of a pixel in a given spectral band;
(2) respectively carrying out data normalization processing, background class removal and dimension reduction operation preprocessing on the five data sets obtained in the step (1) by utilizing a three-dimensional matrix data domain in each data set so as to eliminate the influence caused by noise and redundant information;
(3) dividing a training set and a test set: randomly selecting one of the five preprocessed data sets as the test set; randomly extracting ζ classes from the remaining four data sets as the training set, where the value of ζ is set separately for each training set;
(4) constructing a hyperspectral image prototype classification network consisting, in order, of input data → first convolution layer → first linear rectification function ReLU operation → second convolution layer → second ReLU operation → fully connected layer;
(5) training the hyperspectral image prototype classification network, namely taking the negative logarithm form of the Euclidean distances from the unlabeled samples in the training set to each prototype as the loss function, and iteratively updating it with stochastic gradient descent to optimize the network until the loss function value is minimized, obtaining the trained classification network;
(6) predicting the test set to finish classification:
(6a) selecting 3 samples from each of the K categories in the test set as the test support set S2, with the remaining samples as the test query set Q2;
(6b) calculating the center of gravity c_k of each class of the test support set after network mapping, and using it as the initial value of each class's test support set prototype c_k;
(6c) performing a softmax logistic regression operation on the distances from each sample in the test query set to all test support set prototypes c_k, obtaining the class probabilities of all test query set data;
(6d) screening out high-confidence test query set data by comparing the class probabilities against a set threshold, and calculating the center of gravity of the screened test query set together with the test support set after network mapping, as the corrected test support set prototype;
(6e) for each sample in the unscreened test query set, separately calculating its distance to all corrected prototypes, then sequentially performing softmax logistic regression and argmax operations to obtain the predicted class label, completing the classification.
Compared with the prior art, the invention has the following advantages:
1. on the basis of the existing prototype network hyperspectral image classification model, the invention adopts the closed-loop classification network based on self-training, and can fully utilize the posterior information of the unlabeled sample generated by the classification network, so that the classification network can better represent data distribution, the overfitting problem of the classification network model is relieved, and the classification precision is effectively improved.
2. According to the invention, the pseudo-label samples with high confidence level are screened out by setting the threshold value and participate in the updating process of each class prototype, so that the prototype calculation process is more reasonable, the adverse effect of unreliable pseudo-label samples on the classification network training process when the classification network training is insufficient in the initial training stage is reduced, and the classification precision is further improved.
Drawings
FIG. 1 is a flow chart of an implementation of the present invention.
FIG. 2 is a sub-flowchart of the present invention for training a hyperspectral image prototype classification network.
Detailed Description
Embodiments and effects of the present invention will be described in further detail below with reference to the accompanying drawings.
Referring to fig. 1, the implementation steps of the present invention include the following:
step 1, five public hyperspectral data sets are obtained.
Five data sets, Indian Pines, KSC, Salinas, Pavia University and Botswana, are selected from a hyperspectral database and read to obtain, for each data set, a three-dimensional matrix data domain of size m×n×h and a label domain that is a two-dimensional matrix of size m×n, where h denotes the spectral dimension of the hyperspectral image and (m, n) denotes the position of a pixel in a given spectral band.
And 2, respectively carrying out data preprocessing on three-dimensional matrix data fields in the acquired five data sets so as to eliminate the influence brought by noise and redundant information.
(2.1) transforming the three-dimensional matrix data field m × n × h into a two-dimensional matrix [ (m × n), h ];
(2.2) traversing the h spectral bands with a Max-Min normalization operation, mapping the m×n data of each band into the range [0, 1] to obtain the normalized pixel value x′_ijs:

x′_ijs = (x_ijs − x_s^min) / (x_s^max − x_s^min)

where s denotes a spectral band of the hyperspectral image, (i, j) denotes the coordinates of a pixel in band s, x_ijs denotes the value of that pixel in band s, and x_s^max and x_s^min denote the maximum and minimum values over all pixels in band s;
(2.3) removing background class operation, namely removing samples and labels with the class of 0 in the data set;
(2.4) performing the dimension reduction operation, namely performing principal component analysis on the five data sets with the background class removed, taking the first p_n principal components as the spectral information, and reducing the original h-dimensional data to p_n dimensions to obtain the preprocessed three-dimensional matrix m×n×p_n; in this example p_n is set to 50, though the method is not limited to this value.
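As a concrete illustration of steps (2.1)-(2.4), the sketch below chains the Max-Min normalization, background removal, and PCA reduction. It is a minimal sketch assuming NumPy/scikit-learn and a data cube `cube` of shape (m, n, h) with an (m, n) label map, not the authors' original implementation.

```python
import numpy as np
from sklearn.decomposition import PCA

def preprocess(cube, labels, p_n=50):
    """Steps (2.1)-(2.4): flatten, Max-Min normalize per band, drop background, PCA."""
    m, n, h = cube.shape
    x = cube.reshape(m * n, h).astype(np.float64)        # (2.1) [(m*n), h]

    # (2.2) Max-Min normalization of each spectral band s into [0, 1]
    x_min, x_max = x.min(axis=0), x.max(axis=0)
    x = (x - x_min) / (x_max - x_min + 1e-12)

    # (2.3) remove the background class: keep pixels whose label is nonzero
    y = labels.reshape(m * n)
    keep = y != 0
    x, y = x[keep], y[keep] - 1                          # shift labels to start at 0

    # (2.4) keep the first p_n principal components as the spectral information
    x = PCA(n_components=p_n).fit_transform(x)
    return x, y
```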
And 3, dividing a training set and a testing set.
(3.1) randomly selecting one data set as a to-be-tested set from the five preprocessed data sets, and using the remaining four data sets as a to-be-trained set, wherein the to-be-tested set comprises K categories, the to-be-trained set comprises Z categories, and Z is larger than K;
(3.2) randomly extracting, from the Z categories of the to-be-trained set, ζ categories that each contain more than 200 samples, and taking all samples of these ζ categories as the training set;
(3.3) taking all samples of the K categories of the to-be-tested set as the test set.
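The class selection of step (3.2) can be sketched as follows; `train_pools`, a list of (features, labels) pairs for the four to-be-trained data sets, is a hypothetical name, and the >200-sample filter follows the text.

```python
import numpy as np

def draw_training_classes(train_pools, zeta, min_count=200, seed=0):
    """Step (3.2): draw zeta classes with more than min_count samples each."""
    rng = np.random.default_rng(seed)
    candidates = []                                  # one feature array per eligible class
    for x, y in train_pools:
        for c in np.unique(y):
            if np.sum(y == c) > min_count:
                candidates.append(x[y == c])
    chosen = rng.choice(len(candidates), size=zeta, replace=False)
    # relabel the chosen classes 0..zeta-1 for episodic training
    xs = np.concatenate([candidates[i] for i in chosen])
    ys = np.concatenate([np.full(len(candidates[i]), t) for t, i in enumerate(chosen)])
    return xs, ys
```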
And 4, constructing a hyperspectral image prototype classification network.
(4.1) network architecture:
the structure of the hyperspectral image prototype classification network is, in order: input data → first convolution layer → first linear rectification function ReLU operation → second convolution layer → second ReLU operation → fully connected layer;
and (4.2) setting parameters of each layer of the network:
in the first convolution layer, the total number of feature maps is 50, the convolution kernel size is 3×3, and the convolution stride is 1; to keep the output size unchanged after the first convolution layer, the input data are padded, with the padding value set to 1;
in the second convolution layer, the total number of feature maps is set to 100, the convolution kernel size is 3×3, and the convolution stride is 1;
in the fully-connected layer, the input width is set to 200 and the output width is set to 9.
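The layer settings of step (4.2) map directly onto a PyTorch module. The text does not state the input channel count or spatial patch size; the sketch below assumes 50 input channels (the p_n PCA bands) and a 3×4 patch, chosen so that the flattened output of the second convolution matches the stated fully connected input width of 200. Treat both assumptions as illustrative only.

```python
import torch
import torch.nn as nn

class PrototypeEmbedding(nn.Module):
    """Embedding network: conv(50) -> ReLU -> conv(100) -> ReLU -> fc(200 -> 9)."""
    def __init__(self, in_channels=50):    # assumption: p_n = 50 PCA bands as channels
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 50, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.Conv2d(50, 100, kernel_size=3, stride=1, padding=0),
            nn.ReLU(),
        )
        self.fc = nn.Linear(200, 9)        # widths as stated in step (4.2)

    def forward(self, x):
        z = self.features(x)
        return self.fc(z.flatten(start_dim=1))

# With a 3x4 patch the second conv outputs 100 x 1 x 2 = 200 features, matching the
# stated fc input width; any other patch size would need a different fc width.
net = PrototypeEmbedding()
out = net(torch.randn(8, 50, 3, 4))        # -> torch.Size([8, 9])
```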
And 5, carrying out self-training learning on the hyperspectral image prototype classification network to obtain a trained classification network.
Referring to fig. 2, the specific implementation of this step is as follows:
(5.1) in the training set, selecting 3 samples from each of the ζ classes as the training support set S1, with the remaining samples used as the training query set Q1;
(5.2) using the training support set S1, calculating the training support set prototypes c_ξ:

c_ξ = (1/N_ξ) Σ_{x_i ∈ S1, y_i = ξ} f_φ(x_i)

where x_i denotes the i-th training support set sample, f_φ(x_i) denotes that sample after network mapping, N_ξ is the number of class-ξ training support samples, and f_φ is the mapping function of the network;
(5.3) using the training support set prototypes c_ξ, predicting in turn the class probabilities p_φ(y = ξ | x_j) of the training query set Q1:

p_φ(y = ξ | x_j) = exp(−d(f_φ(x_j), c_ξ)) / Σ_{ξ′} exp(−d(f_φ(x_j), c_ξ′))

where x_j denotes the j-th training query set sample, f_φ(x_j) denotes that sample after network mapping, and d(f_φ(x_j), c_ξ) denotes the distance from the network-mapped training query sample x_j to the prototype c_ξ;
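Steps (5.2) and (5.3) reduce to class-wise means in the embedding space followed by a softmax over negative distances. A minimal PyTorch sketch, where `emb_s`, `ys` are the embedded support samples with their labels and `emb_q` the embedded query samples (all hypothetical names):

```python
import torch

def class_prototypes(emb_s, ys, num_classes):
    """Step (5.2): c_xi is the mean of the embedded class-xi support samples."""
    return torch.stack([emb_s[ys == c].mean(dim=0) for c in range(num_classes)])

def class_probabilities(emb_q, prototypes):
    """Step (5.3): softmax over negative Euclidean distances to the prototypes."""
    d = torch.cdist(emb_q, prototypes)     # (N_q, num_classes) pairwise distances
    return torch.softmax(-d, dim=1)
```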
(5.4) setting the threshold p1 = 0.9, and comparing the class probability p_φ(y = ξ | x_j) with the threshold p1 to determine the weight w_{j,ξ} of each training query sample in the prototype update:

if p_φ(y = ξ | x_j) is greater than p1, then w_{j,ξ} is set to that probability value;

otherwise, w_{j,ξ} is set to 0;
(5.5) sequentially updating the ζ training support set prototypes according to the following formula:

c_ξ ← (Σ_{x_i ∈ S1, y_i = ξ} f_φ(x_i) + Σ_{x_j ∈ Q1} w_{j,ξ} f_φ(x_j)) / (N_ξ + Σ_{x_j ∈ Q1} w_{j,ξ})
(5.6) predicting the label ŷ_j of each sample in the unscreened training query set:

ŷ_j = argmax_ξ p_φ(y = ξ | x_j)

where p_φ(y = ξ | x_j) denotes the probability that sample x_j is assigned the label ξ;
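Steps (5.4)-(5.6) then become a thresholded, weighted prototype update plus an argmax over the class probabilities. In this sketch the retained weight is taken to be the class probability itself, an assumption consistent with the weighted center-of-gravity reading of (5.5):

```python
import torch

def update_prototypes(emb_s, ys, emb_q, probs, num_classes, p1=0.9):
    """Steps (5.4)-(5.5): weighted prototype update; step (5.6): pseudo labels."""
    # w_{j,xi}: the class probability where it exceeds p1, else 0 (assumed weighting)
    w = torch.where(probs > p1, probs, torch.zeros_like(probs))
    protos = []
    for c in range(num_classes):
        support = emb_s[ys == c]
        num = support.sum(dim=0) + (w[:, c:c + 1] * emb_q).sum(dim=0)
        den = support.shape[0] + w[:, c].sum()
        protos.append(num / den)
    pseudo = probs.argmax(dim=1)           # predicted labels of the unscreened queries
    return torch.stack(protos), pseudo
```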
(5.7) calculating the objective function J_q(Φ) of the current network, where q = 1, 2, …, Q indexes the q-th round of training of the classification network, Q denotes the total number of training rounds, J_{q−1}(Φ) denotes the objective function obtained in the previous round of training, and N denotes the total number of samples in the training query set;
(5.8) looping over steps (5.1) to (5.7) until the total number of training rounds reaches Q = 1000, then ending training and taking the network model with the minimum objective function value J(Φ) as the trained classification network.
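Putting steps (5.1)-(5.8) together, the loop below optimizes the network with stochastic gradient descent on the negative log-probability of the pseudo labels, reusing the helpers sketched above. The exact recurrence of J_q(Φ) is not reproduced in the text, so treat this as a sketch of the optimization pattern; `sample_episode` is a hypothetical sampler returning 3 support samples per class plus the remaining queries.

```python
import torch

def train(net, sample_episode, zeta, rounds=1000, lr=1e-2):
    """Step 5: self-training loop; keeps the model with the smallest objective."""
    opt = torch.optim.SGD(net.parameters(), lr=lr)
    best, best_state = float("inf"), None
    for _ in range(rounds):                                       # (5.8): Q = 1000
        xs, ys, xq = sample_episode()                             # (5.1)
        emb_s, emb_q = net(xs), net(xq)
        protos = class_prototypes(emb_s, ys, zeta)                # (5.2)
        probs = class_probabilities(emb_q, protos)                # (5.3)
        protos, pseudo = update_prototypes(emb_s, ys, emb_q, probs, zeta)  # (5.4)-(5.6)
        probs = class_probabilities(emb_q, protos)
        # negative log of the pseudo-label probability, per the description in (5)
        loss = -torch.log(probs[torch.arange(len(pseudo)), pseudo] + 1e-12).mean()
        opt.zero_grad(); loss.backward(); opt.step()
        if loss.item() < best:
            best = loss.item()
            best_state = {k: v.detach().clone() for k, v in net.state_dict().items()}
    net.load_state_dict(best_state)
    return net
```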
And 6, classifying the test set and outputting a classification result.
(6.1) selecting 3 samples from each of the K categories in the test set as the test support set S2, with the remaining samples as the test query set Q2;
(6.2) calculating the center of gravity c_k of each class of the test support set after network mapping, and using it as the initial value of each class's test support set prototype c_k:

c_k = (1/N_k) Σ_{z_i ∈ S2, y_i = k} f_φ(z_i)

where z_i denotes the i-th test support set sample, f_φ(z_i) denotes that sample after network mapping, and N_k is the number of class-k test support samples.
(6.3) performing a softmax logistic regression operation on the distances from each sample in the test query set to all test support set prototypes c_k, obtaining the class probabilities of all test query set data:

p_φ(y = k | z_j) = exp(−d(f_φ(z_j), c_k)) / Σ_{k′} exp(−d(f_φ(z_j), c_k′))

where z_j denotes the j-th test query set sample, f_φ(z_j) denotes that sample after network mapping, and d(f_φ(z_j), c_k) denotes the distance from the network-mapped test query sample z_j to the prototype c_k;
(6.4) screening out high-confidence test query set data via the set threshold: setting the threshold p2 = 0.9, and comparing the class probability p_φ(y = k | z_j) with the threshold p2 to determine the weight w_{j,k} of each test query sample in the prototype update:

if p_φ(y = k | z_j) is greater than p2, then w_{j,k} is set to that probability value;

otherwise, w_{j,k} is set to 0;
(6.5) calculating the center of gravity of the screened test query set and the test support set after network mapping, used as the corrected test support set prototype ĉ_k:

ĉ_k = (Σ_{z_i ∈ S2, y_i = k} f_φ(z_i) + Σ_{z_j ∈ Q2} w_{j,k} f_φ(z_j)) / (N_k + Σ_{z_j ∈ Q2} w_{j,k})
(6.6) for each sample in the unscreened test query set, separately calculating its distance to all corrected prototypes ĉ_k and performing a softmax logistic regression operation on these distances, computing the probability value p(y = k | z_j) that the unscreened test query sample z_j is assigned the label k:

p(y = k | z_j) = exp(−d(f_φ(z_j), ĉ_k)) / Σ_{k′} exp(−d(f_φ(z_j), ĉ_k′))

where d(f_φ(z_j), ĉ_k) denotes the distance from the network-mapped test query sample z_j to the updated prototype ĉ_k, and d(f_φ(z_j), ĉ_k′) denotes its distance to the k′-th updated test support set prototype;
(6.7) performing an argmax operation on the label probability values p(y = k | z_j) to obtain the predicted label ŷ_j of each sample in the unscreened test query set:

ŷ_j = argmax_k p(y = k | z_j)
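At test time the same helpers yield the full procedure of step 6: build initial prototypes from the 3-shot support set, correct them once with the high-confidence queries (threshold p2 = 0.9), then classify by softmax plus argmax. A sketch under the same assumptions as above:

```python
import torch

@torch.no_grad()
def predict(net, xs, ys, xq, num_classes, p2=0.9):
    """Steps (6.1)-(6.7): one prototype-correction pass, then label the queries."""
    emb_s, emb_q = net(xs), net(xq)
    protos = class_prototypes(emb_s, ys, num_classes)   # (6.2) initial prototypes c_k
    probs = class_probabilities(emb_q, protos)          # (6.3)
    protos, _ = update_prototypes(emb_s, ys, emb_q, probs, num_classes, p1=p2)  # (6.4)-(6.5)
    probs = class_probabilities(emb_q, protos)          # (6.6) softmax over corrected prototypes
    return probs.argmax(dim=1)                          # (6.7) predicted labels
```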
The effects of the present invention can be illustrated by the following test results:
Test 1: Salinas is selected as the to-be-tested set, with training classes ζ = 25 and test classes K = 16. Under the scenario where only three labeled samples per class are available, the method of the invention and the six existing methods SVM, EMP, CNN, SVM-CK, EPF and PN are tested, giving the classification accuracies shown in Table 1.
TABLE 1 Classification accuracy (%) based on test set Salinas

Method     SVM     EMP     CNN     SVM-CK   EPF     PN      The invention
Salinas    71.02   72.78   75.43   72.03    72.75   75.60   77.93
In table 1, SVM represents an existing hyperspectral image classification model based on a support vector machine, EMP represents an existing hyperspectral image classification model based on an extended morphological profile, CNN represents a hyperspectral image classification model of an existing two-dimensional convolutional neural network, SVM-CK represents a hyperspectral image classification model of an existing support vector machine based on a composite kernel, EPF represents an existing hyperspectral image classification model based on edge preserving filtering, and PN represents an existing hyperspectral image supervision classification model based on a prototype network.
As can be seen from table 1: based on a test set Salinas, the overall classification precision of the hyperspectral images classified by using an SVM classification model is 71.02%, the overall classification precision of the hyperspectral images classified by using an EMP classification model is 72.78%, the overall classification precision of the hyperspectral images classified by using a CNN classification model is 75.43%, the overall classification precision of the hyperspectral images classified by using an SVM-CK classification model is 72.03%, the overall classification precision of the hyperspectral images classified by using an EPF classification model is 72.75%, the overall classification precision of the hyperspectral images classified by using a PN classification model is 75.60%, and the overall classification precision of the hyperspectral images by using the hyperspectral image classification method is 77.93%. It is shown that the present invention is more advantageous than the prior art in dealing with small sample problems.
It can also be seen from Table 1: compared with the existing prototype-network-based supervised classification model PN, although both use the prototype network as the base model, the invention uses a closed-loop self-training method and incorporates posterior information from the unlabeled samples into the training of the classification network, so the network represents the actual data distribution more accurately and the classification accuracy is higher.
Test 2: Pavia University is selected as the to-be-tested set, with training classes ζ = 30 and test classes K = 9. Under the scenario where only three labeled samples per class are available, the method of the invention and the six existing methods SVM, EMP, CNN, SVM-CK, EPF and PN are tested; the classification accuracies are shown in Table 2.
TABLE 2 Classification accuracy (%) based on test set Pavia University

Method            SVM     EMP     CNN     SVM-CK   EPF     PN      The invention
Pavia University  46.99   60.64   67.22   49.21    48.93   67.12   67.36
As can be seen from Table 2: based on the test set Pavia University, the overall classification accuracy is 46.99% with the SVM classification model, 60.64% with the EMP model, 67.22% with the CNN model, 49.21% with the SVM-CK model, 48.93% with the EPF model, and 67.12% with the PN model, while the overall classification accuracy of the invention is 67.36%. This shows that the invention is more advantageous than the prior art in dealing with small-sample problems.
It can also be seen from Table 2: compared with the existing prototype-network-based supervised classification model PN, although both use the prototype network as the base model, the invention uses a closed-loop self-training method and incorporates posterior information from the unlabeled samples into the training of the classification network, so the network represents the actual data distribution more accurately and the classification accuracy is higher.
Test 3: Botswana is selected as the to-be-tested set, with training classes ζ = 50 and test classes K = 11. Under the scenario where only three labeled samples per class are available, the SPN of the invention and the existing SVM, EMP, CNN, SVM-CK, EPF and PN are tested; the classification accuracies are shown in Table 3.
TABLE 3 Classification accuracy (%) based on test set Botswana

Method     SVM     EMP     CNN     SVM-CK   EPF     PN      The invention
Botswana   68.93   69.23   72.07   70.56    77.85   80.78   81.37
As can be seen from table 3: based on a test set Botswana, the overall classification precision of the hyperspectral images classified by using an SVM classification model is 68.93%, the overall classification precision of the hyperspectral images classified by using an EMP classification model is 69.23%, the overall classification precision of the hyperspectral images classified by using a CNN classification model is 72.07%, the overall classification precision of the hyperspectral images classified by using an SVM-CK classification model is 70.56%, the overall classification precision of the hyperspectral images classified by using an EPF classification model is 77.85%, the overall classification precision of the hyperspectral images classified by using a PN classification model is 80.78%, and the overall classification precision of the hyperspectral images by using an SPN classification model is 81.37%. It is shown that the present invention is more advantageous than the prior art in dealing with small sample problems.
It can also be seen from Table 3: compared with the existing prototype-network-based supervised classification model PN, although both use the prototype network as the base model, the invention uses a closed-loop self-training method and incorporates posterior information from the unlabeled samples into the training of the classification network, so the network represents the actual data distribution more accurately and the classification accuracy is higher.
In conclusion, taking overall classification accuracy as the evaluation index, the invention verifies that a semi-supervised learning method which selects high-confidence pseudo-labeled samples for training via a set threshold, applied to hyperspectral image classification in small-sample scenarios, lets the classification network model better represent the class distribution of the data, alleviates the overfitting that easily occurs in small-sample scenarios, and improves the classification performance of the network; meanwhile, the classification model used by the invention also generalizes well, achieving good classification results on the Pavia University, Salinas and Botswana data sets.