Time-series image reconstruction method based on a gated convolution-long short-term memory network

Document No.: 9415    Published: 2021-09-17

1. A time-series image reconstruction method based on a gated convolution-long short-term memory network, characterized in that a time-series image I with partially missing regions and a mask M identifying the missing regions are input, and the method comprises the following key steps:

step 1: training a generator;

step 2: training a classifier;

and step 3: reconstructing the missing regions.

2. The time-series image reconstruction method based on a gated convolution-long short-term memory network according to claim 1, wherein step 1 comprises:

substep 1: GatedConvLSTM-based time-series image feature extraction

Inputting an image mask M and a time-series image I, and extracting the time-series features F_1^T and the hidden features of the images through the convolutional neural network GatedConvLSTM_1,

where the subscript 1 of GatedConvLSTM_1 indicates the level (layer) of the network.

Substep 2: GatedConv3d-based image downsampling

The extracted features F_1^T are downsampled in spatial resolution through three-dimensional gated convolution (GatedConv3d) to obtain the downsampled features F_2^C.

Applying substeps 1 and 2 again in the same manner gives:

substep 3: multi-scale hole feature fusion based on GatedConv3d

Extracting the featureThe depth fusion feature F was obtained by GatedConv3d fusion of four void ratios (scaled) of 2, 4, 6, 8, respectivelyB

Substeps 2 and 3 correspond to the encoding process, i.e. encoding the time-series images into features.

substep 4: image feature concatenation

The depth-fused feature F_B is concatenated with the feature F_3^T obtained by the previous GatedConvLSTM and input to a gated convolution to obtain the pre-upsampling feature F_4^C.

Substep 5: image feature upsampling

The pre-upsampling feature F_4^C is processed with a GatedConvLSTM of the same scale to obtain hidden features, which are input to the current GatedConvLSTM in order to maintain the temporal association across the network features.

The feature F_4^T is then enlarged by a factor of two by the upsampling gated convolution.

Applying substeps 4 and 5 again in the same manner gives:

substep 6: time-series image generation

After the above downsampling and upsampling structure and steps, the repaired time-series image I_out is output using a gated convolution.

3. The time-series image reconstruction method based on a gated convolution-long short-term memory network as claimed in claim 1 or 2, wherein step 2 comprises: inputting the real time-series images and the generated images into a classifier C, which judges whether an image is real or synthesized by the generator.

4. The time-series image reconstruction method based on the gated convolution-long short-term memory network as claimed in claim 3, wherein the judgment is performed by a multi-temporal-spectral local discriminator composed of six 3D convolutions with convolution kernels of 3 × 5 × 5 and strides of 1 × 2 × 2. Each convolutional layer of the discriminator uses spectral normalization to stabilize training. In addition, the least-squares form of generative adversarial network training is used, and the optimization functions of the discriminator and the generator are as follows (standard least-squares GAN form):

min_D V(D) = 1/2 E_x[(D(x) - a)^2] + 1/2 E_z[(D(G(z)) - b)^2]
min_G V(G) = 1/2 E_z[(D(G(z)) - c)^2]

wherein G represents the generator; D represents the discriminator; z represents the input missing sequence; a and c both represent the real label, denoted by 1, and b represents the fake label.

5. The time-series image reconstruction method based on a gated convolution-long short-term memory network as claimed in claim 1 or 2, wherein step 3 comprises: inputting the partially missing image I_j into the generator G trained in step 1 to obtain the time-series reconstructed image I_out.

6. The method according to claim 2, wherein in substep 1 the gated convolution and the long short-term memory network are used to extract the spatial and temporal features of the images, respectively.

7. The time-series image reconstruction method based on the gated convolution-long short-term memory network as claimed in claim 2, wherein substeps 2 and 5 of step 1 adopt downsampling and upsampling, respectively, to fuse image information at multiple scales.

Background

A time-series image is formed by arranging multiple images of a study area in temporal order, and describes how the surface features of the area change over time. However, on some dates the time-series image is affected by cloud, cloud shadow and other factors, so that the data of the corresponding areas are lost. The features extracted from the time series are then missing in some dimensions, the features cannot be aligned accurately, and subsequent image classification and information extraction become difficult. Missing-region filling completes the missing parts of an image according to certain rules to obtain a gap-free image. According to the dimensions of the information used, missing-region filling methods for remote sensing data can be divided into spatial, temporal, spectral and combined spatio-temporal methods.

Since the missing regions of a remote sensing image usually cover only part of the scene, the valid regions of the image can be used to predict the missing ones and obtain a gap-free image. Depending on whether a reference image is used, such methods divide into single-image filling and multi-temporal filling. Single-image filling extracts structural information such as gradients from the image and propagates it from the valid area into the unknown area to fill the gap; multi-temporal filling maps the information of a reference image onto the target image according to certain rules. Because images acquired at different times differ in radiometric characteristics, directly pasting pixels from the reference image produces color inconsistency between the reconstructed blocks and the original blocks. Another approach uses only one image as reference and copies similar regions of that same image to fill the missing areas.

Missing-value reconstruction based on the time dimension treats each pixel as a feature curve that changes over time; the feature can be reflectance, DN value, NDVI and so on, and the missing values are reconstructed by describing the change of the feature along the time dimension according to certain rules. Common methods such as mean filling, previous-value filling and next-value filling work well for stationary time series, but only moderately for land covers with periodic or abrupt changes. Another approach selects a time-series feature as the condition for choosing similar pixels, i.e. change curves with the same or similar characteristics are selected according to the time-series feature, and the reference time-series curve is used to reconstruct the missing part.

Remote sensing images exhibit not only spatial correlation but also temporal correlation. The reconstruction result should reflect the texture details of the local area and blend in with the surrounding non-missing areas, and it should also reflect the temporal variation of the image. Methods that fuse the time domain and the space domain attempt to use both kinds of information for missing-region reconstruction. One family of methods divides the image into homogeneous blocks by multi-scale segmentation; because each block consists of similar pixels, it describes the spatial similarity of image pixels. Typical methods include: clustering pixels into superpixels with similar spectra (Zhou Ya'nan, Yang Xiaonzeng, Feng Li, Wu Wei, Wu Tianjun, Luo Jiancheng, Zhou Xiaocheng, Zhang Xin. Superpixel-based time-series reconstruction for optical images incorporating SAR data using autoencoder networks [J]. GIScience & Remote Sensing, 2020, 57(8): 1005-1025); partitioning by multi-temporal segmentation so that the partitions share similar variation laws (Wu Wei, Ge Luoqi, Luo Jiancheng, Huang Ruohong, Yang Yingpin. A Spectral-Temporal Patch-based Missing Area Reconstruction for Time-Series Images [J]. Remote Sensing, 2018, 10(10): 1560); on this basis, the overall reconstruction of the missing region is realized by establishing the temporal change relation between image blocks. Time-series similarity judgment and statistics-based methods are widely used in time-series reconstruction, where the Euclidean distance and the correlation coefficient are two common measures of the correlation between time series.
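The two similarity measures named above can be sketched in a few lines of numpy; the series below are hypothetical NDVI trajectories invented for illustration, not data from this document:

```python
import numpy as np

# Two hypothetical NDVI time series for two pixels (5 dates each).
s1 = np.array([0.2, 0.4, 0.7, 0.6, 0.3])
s2 = np.array([0.25, 0.45, 0.65, 0.55, 0.35])

# Euclidean distance: smaller values mean more similar trajectories.
dist = np.linalg.norm(s1 - s2)

# Pearson correlation coefficient: values near 1 mean similar temporal shape.
corr = np.corrcoef(s1, s2)[0, 1]

print(dist, corr)
```

A reference pixel with small distance or high correlation to the damaged pixel's valid dates would be a candidate donor curve for reconstruction.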

From the above process it can be seen that the spatial and temporal domains are processed separately in these methods: pixels are first grouped into superpixels by clustering, or into segments by image segmentation, and an LSTM is then used for time-series modeling to describe the temporal variation of the image. Clustering and temporal modeling are performed independently, and the methods assume that the interior of a superpixel or segment is homogeneous, which often departs from reality. A better strategy is to optimize the temporal and the spatial variation laws jointly, so that the model can describe both the spatial and the temporal changes of the land surface, while describing the spatio-temporal distribution of the surface with nonlinear models such as convolutions.

Disclosure of Invention

In order to make full use of the spatio-temporal characteristics of time-series images, the invention provides a time-series image reconstruction method based on a gated convolution-long short-term memory network.

Assume that a group of n remote sensing images I acquired at different times are arranged in temporal order to form a time-series image:

I = <I_1, I_2, ..., I_n>    (1)

where < > denotes an ordered set, i.e. the individual elements in the sequence are ordered.

The data state of each area of the remote sensing images is represented by a mask M. The mask identifies whether each pixel of an image is missing: 0 represents a missing value and 1 represents a valid value. The mask data are obtained with a cloud masking algorithm or by manual visual interpretation. An image I_j covering the same area is input; the image comes from the time series or from another date, part of its area is missing, and the missing area needs to be filled.
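The mask convention above can be illustrated with a minimal numpy sketch; the pixel values and the 4-pixel shape are hypothetical, chosen only for illustration:

```python
import numpy as np

# Hypothetical reflectance values for 4 pixels of one image in the sequence.
image = np.array([0.12, 0.34, 0.56, 0.78])

# Mask M: 0 marks a missing pixel, 1 a valid pixel.
M = np.array([1, 0, 1, 1])

# Zero out the missing region before feeding the pair (I, M) to the network.
masked_input = image * M

print(masked_input)
```

The network receives both the masked image and M, so it can distinguish "value is zero" from "value is missing".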

The invention adopts a generative adversarial network structure: the time-series images I are used to train a model G that describes the temporal and spatial variation of the time series. Meanwhile, the artificially generated results and the real images are input into a classifier D, which judges whether its input is a generated result or a real image; the two networks are trained adversarially to achieve their joint optimization. Then the image I_j to be reconstructed is input into the generator G, which predicts the missing part of the image from I_j and generates a gap-free image.

Comprises the following key steps:

step 1: generator training

Substep 1: GatedConvLSTM-based time-series image feature extraction

Inputting the image mask M and the time-series image I, and extracting the time-series features F_1^T and the hidden features of the images through the convolutional neural network GatedConvLSTM_1,

where the subscript 1 of GatedConvLSTM_1 indicates the level (layer) of the network.

Substep 2: GatedConv3d-based image downsampling

The extracted features F_1^T are downsampled in spatial resolution through three-dimensional gated convolution (GatedConv3d) to obtain the downsampled features F_2^C:

F_2^C = GatedConv3d_1(F_1^T)
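The text does not spell out the gating computation inside a gated convolution. In the common formulation from the image-inpainting literature, the output is a feature branch modulated elementwise by a learned sigmoid gate; reduced to hypothetical 1×1 "kernels" (so that plain matrix products stand in for the 3-D convolutions), the mechanism can be sketched as:

```python
import numpy as np

rng = np.random.default_rng(0)

def gated_conv_1x1(x, w_f, w_g):
    """Gating mechanism of a gated convolution, reduced to 1x1 kernels:
    a feature branch modulated elementwise by a soft gate in (0, 1).
    The gate can learn to suppress responses over missing pixels."""
    feature = np.tanh(x @ w_f)                 # feature branch
    gate = 1.0 / (1.0 + np.exp(-(x @ w_g)))    # sigmoid gating branch
    return feature * gate

# Hypothetical input: 5 pixels, 4 channels; random 4 -> 8 channel weights.
x = rng.normal(size=(5, 4))
out = gated_conv_1x1(x, rng.normal(size=(4, 8)), rng.normal(size=(4, 8)))
print(out.shape)
```

In the actual network the two branches are full 3-D (strided) convolutions rather than 1×1 maps; only the elementwise gating is shown here.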

Applying substeps 1 and 2 again in the same manner gives:

substep 3: multi-scale hole feature fusion based on GatedConv3d

Extracting the feature F3 TThe depth fusion feature F was obtained by GatedConv3d fusion of four void ratios (scaled) of 2, 4, 6, 8, respectivelyB

F_B = DilatedGatedConv3ds(F_3^T)
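The point of the four dilation rates is that the parallel branches see progressively larger spatial contexts before fusion. Assuming 3×3 spatial kernels (the kernel size is not stated in the text), the one-layer receptive field of each branch follows the standard formula rf = d·(k − 1) + 1:

```python
# Receptive field (one spatial axis) of a single dilated convolution layer
# with kernel size k and dilation rate d: rf = d * (k - 1) + 1.
def receptive_field(k, d):
    return d * (k - 1) + 1

# The four parallel branches of substep 3, assuming hypothetical 3x3 kernels,
# cover increasingly large contexts before their outputs are fused:
fields = {d: receptive_field(3, d) for d in (2, 4, 6, 8)}
print(fields)
```

Fusing the four branches therefore combines fine local texture with context wide enough to bridge large missing regions.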

Substeps 2 and 3 correspond to the encoding process, i.e. encoding the time-series images into features.

Substep 4: image feature concatenation

The depth-fused feature F_B is concatenated with the feature F_3^T obtained by the previous GatedConvLSTM and input to a gated convolution to obtain the pre-upsampling feature F_4^C:

F_4^C = GatedConv3d_3(Concat(F_3^T, F_B))

Substep 5: image feature upsampling

The pre-upsampling feature F_4^C is processed with a GatedConvLSTM of the same scale to obtain hidden features, which are input to the current GatedConvLSTM in order to maintain the temporal association across the network features.

The feature F_4^T is then enlarged by a factor of two by the upsampling gated convolution.

Applying substeps 4 and 5 again in the same manner gives:

F_6 = UpsampledGatedConv3d_2(F_5^T)

F_6^C = GatedConv3d_6(Concat(F_1^T, F_6))
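The two-fold spatial enlargement performed by the upsampling gated convolutions is learned; a non-learned nearest-neighbour stand-in shows the intended shape change of the decoder stages:

```python
import numpy as np

def upsample2x(x):
    """Double the spatial resolution by nearest-neighbour duplication.
    In the network this role is played by the learned upsampling gated
    convolution; only the shape change is reproduced here."""
    return np.repeat(np.repeat(x, 2, axis=-2), 2, axis=-1)

# Hypothetical 2x2 feature map -> 4x4 after one upsampling stage.
feat = np.arange(4.0).reshape(2, 2)
up = upsample2x(feat)
print(up.shape)
```

Each decoder stage undoes one stride-2 downsampling of the encoder, so after all stages the output matches the input image resolution.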

substep 6: time-series image generation

After the above downsampling and upsampling structure and steps, the repaired time-series image I_out is output using a gated convolution.

Step 2: classifier training

The real time-series images and the generated images are input into a classifier C, which judges whether an image is real or synthesized by the generator.

The invention uses a multi-temporal-spectral local discriminator, which makes full use of the spatio-temporal features and the information of the time dimension. It consists of six 3D convolutions with convolution kernels of 3 × 5 × 5 and strides of 1 × 2 × 2.

Each convolutional layer of the discriminator uses spectral normalization to stabilize the training.

In addition, we use the least-squares form of generative adversarial network training; the optimization functions of the discriminator and the generator are as follows (standard least-squares GAN form):

min_D V(D) = 1/2 E_x[(D(x) - a)^2] + 1/2 E_z[(D(G(z)) - b)^2]    (17)
min_G V(G) = 1/2 E_z[(D(G(z)) - c)^2]    (18)

wherein G represents the generator; D represents the discriminator; z represents the input missing sequence; a and c both represent the real label, denoted by 1, and b represents the fake label.
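With a = c = 1 as stated, and taking b = 0 (the usual least-squares GAN choice; the text does not give b's value explicitly), the two losses can be sketched in numpy:

```python
import numpy as np

def d_loss(d_real, d_fake, a=1.0, b=0.0):
    """Least-squares discriminator loss: push D(x) toward the real label a
    and D(G(z)) toward the fake label b."""
    return 0.5 * np.mean((d_real - a) ** 2) + 0.5 * np.mean((d_fake - b) ** 2)

def g_loss(d_fake, c=1.0):
    """Least-squares generator loss: push D(G(z)) toward the real label c."""
    return 0.5 * np.mean((d_fake - c) ** 2)

# Hypothetical discriminator scores on a small batch.
real_scores = np.array([0.9, 0.8])
fake_scores = np.array([0.1, 0.2])
print(d_loss(real_scores, fake_scores), g_loss(fake_scores))
```

Compared with the cross-entropy GAN loss, the quadratic penalty keeps gradients informative even for samples the discriminator classifies confidently.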

And step 3: reconstruction of a region of absence

The partially missing image I_j is input into the generator G trained in step 1 to obtain the time-series reconstructed image I_out.

The invention has the following advantages: 1. The time-series images are modeled with an LSTM network, so that the reconstruction result can describe the temporal variation of the land surface.

2. In contrast to conventional convolution, which treats all pixels as valid, the gated convolution explicitly models the missing pixels and can therefore extract features from the missing regions more effectively.

3. The invention fuses LSTM and gated convolution, so that temporal and spatial characteristics are modeled simultaneously, giving better reconstruction accuracy.

4. The invention adopts generative adversarial training, and the joint optimization of the generator network and the adversarial network improves the reconstruction effect.

Drawings

Fig. 1 is a flow chart of the time-series image reconstruction method.

Fig. 2 is the network structure diagram of the time-series image reconstruction method.

Fig. 3 time series images before reconstruction.

Fig. 4 time series images after reconstruction.

Detailed Description

This section describes an embodiment of the invention with reference to the flowchart of Fig. 1 and the network structure diagram of Fig. 2. A time-series image I is constructed from 18 Sentinel-2 scenes acquired between March and August 2019; the study area is located in Shou County, Anhui, China.

First, the 18 scenes are preprocessed with atmospheric correction and cloud/shadow removal. Atmospheric correction uses the Sen2Cor software; for cloud and shadow removal, the Fmask4 software generates a cloud/shadow mask M, and clear pixels are extracted according to the mask. Fig. 3 shows the time-series image I with partially missing regions.

The data set is the gap-free part of the time series, divided into 32 × 32 blocks with a time length of 10, 8811 blocks in total. Training and test sets are randomly divided at a ratio of 9:1. The ground-truth target is the originally divided block, while the input sequence has regions removed at random times and random positions.

The training data are augmented by random rotation. The generator and the classifier are trained adversarially for 200 epochs with a batch size of 8. Both networks use the Adam optimizer; the initial learning rate of the generator is set to 0.001 and that of the classifier to 0.0005, and both are decayed by a factor of 0.5 every 20 epochs.
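The stated schedule (decay by 0.5 every 20 epochs) can be written as a step function; the helper name below is ours, not from the text:

```python
def lr_at_epoch(initial_lr, epoch, decay=0.5, step=20):
    """Step learning-rate schedule: multiply by `decay` every `step` epochs."""
    return initial_lr * decay ** (epoch // step)

# Generator starts at 0.001, classifier at 0.0005, both halved every 20 epochs.
for epoch in (0, 20, 40):
    print(epoch, lr_at_epoch(0.001, epoch), lr_at_epoch(0.0005, epoch))
```

Over the 200 training epochs this halves each rate ten times, ending roughly three orders of magnitude below the initial values.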

Step 1: generator training

Substep 1: GatedConvLSTM-based time-series image feature extraction

The time-series image I and the cloud/shadow mask M are input according to equation (2), and the time-series features F_1^T and the hidden features of the images are extracted through the convolutional neural network GatedConvLSTM.

Substep 2: GatedConv3d-based image downsampling

The extracted features F_1^T are downsampled in spatial resolution through three-dimensional gated convolution (GatedConv3d) to obtain the downsampled features F_2^C; the input of this step is the convolutional feature F_1^T extracted in the previous step, as in equation (3).

Applying substeps 1 and 2 again in the same manner gives equations (4), (5) and (6).

Substep 3: multi-scale hole feature fusion based on GatedConv3d

Equation (7) willExtracting the feature F3 TThe deep fusion characteristic F is obtained by four GatedConv3d fusions with the voidage rates of 2, 4, 6 and 8 respectivelyB

Substep 4: image feature connection

Depth fusion feature FBFeature F obtained by comparison with the previous GatedConvLSTM3 TCombining and inputting gated convolution to obtain pre-upsampled features F4 CAs in equation (8).

Substep 5: image feature upsampling

Pre-upsampling feature F as in equation (9)4 CObtaining hidden features after processing with GatedConvLSTM of the same scaleInput to the current GatedConvLSTM to maintain the temporal dimension of the association over the network features.

As shown in equation (10), the feature F_4^T is enlarged by a factor of two by the upsampling gated convolution.

Applying substeps 4 and 5 again in the same manner gives equations (11), (12), (13), (14) and (15).

Substep 6: time series image generation

After the structure and steps similar to encoding-decoding, the repaired time series image I is finally output by using gated convolutionoutAs in equation (16).

Step 2: classifier training

The real images and the images generated by the algorithm are input into the classifier D, which judges whether an image is real or synthesized by the generator. For this we use a multi-temporal-spectral local discriminator that fully exploits the spatio-temporal features and the information of the time dimension. It consists of six 3D convolutions with convolution kernels of 3 × 5 × 5 and strides of 1 × 2 × 2. Each convolutional layer of the discriminator uses spectral normalization to stabilize training. The final output indicates whether the image is real or generator-synthesized.

In addition, we use the least-squares GAN training form; the optimization functions of the discriminator and the generator are given by equations (17) and (18).

And step 3: reconstruction of a region of absence

The image I_j with partially missing areas is input into the generator trained in step 1 to obtain the gap-free image.

The missing regions in the time-series image I are filled in turn by the above method; the resulting gap-free time-series image is shown in Fig. 4. The filled regions clearly reproduce the spatial texture of the land cover and also follow the temporal changes of the sequence, which shows that the method accurately captures both the temporal and the spatial variation of the land surface.

The foregoing is merely a description of embodiments of the invention and is not intended to limit the scope of the invention to the particular forms set forth, but on the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the invention as defined by the appended claims.
