Video super-resolution method based on multi-frame grouping and feedback network
1. A video super-resolution method based on multi-frame grouping and a feedback network is characterized by comprising the following steps:
S1: acquiring video data, and preprocessing the video data to obtain training video data that form a training video data set;
S2: determining a target frame to be super-resolved, and up-sampling the target frame to obtain a preliminary super-resolution video frame;
S3: grouping the video frame sequence contained in each piece of training video data on the time axis;
S4: inputting the grouped video frame sequences into a constructed initial super-resolution network model, extracting the feature maps of each group of video frame sequences, and performing alignment and fusion operations on the feature maps of each group of video frame sequences to obtain the LR feature maps of each group of video frame sequences;
S5: performing stepwise feedback super-resolution on the LR feature maps of each group of video frame sequences to obtain a super-resolution feature map sequence of the target frame;
S6: reconstructing the super-resolution feature map sequence of the target frame to obtain a reconstructed super-resolution residual information sequence of the target frame, and adding it to the preliminary super-resolution video frame of the target frame from S2 to obtain the final super-resolution video frame sequence of the target frame;
S7: setting a loss function and training the initial super-resolution network model to obtain a trained super-resolution network model;
S8: performing super-resolution reconstruction on the video to be super-resolved by using the trained super-resolution network model.
2. The video super-resolution method based on multi-frame grouping and a feedback network of claim 1, wherein preprocessing the video data comprises:
S1.1: cropping a high-resolution video frame at the same position from all video data;
S1.2: down-sampling the high-resolution video frame to obtain a low-resolution video frame;
S1.3: normalizing all low-resolution video frames;
S1.4: performing random data enhancement operations on the normalized low-resolution video frames, wherein the data enhancement operations comprise flipping and mirroring.
3. The video super-resolution method based on multi-frame grouping and a feedback network according to claim 2, wherein in step S1.2, the high-resolution video frame is down-sampled by a Gaussian blur downsampling method.
4. The video super-resolution method based on multi-frame grouping and a feedback network of claim 1, wherein in step S2, a bicubic interpolation up-sampling method is used to up-sample the target frame of the training video data to obtain the preliminary super-resolution video frame.
5. The video super-resolution method based on multi-frame grouping and a feedback network of claim 1, wherein in step S3, the video frame sequence contained in each piece of training video data is divided into n groups on the time axis, yielding n video frame sequence subsets, each of which contains the target frame.
6. The video super-resolution method based on multi-frame grouping and a feedback network of claim 5, wherein the initial super-resolution network model comprises a deformable convolution alignment module and a fusion module;
the deformable convolution alignment module is the PCD feature alignment module at the front end of the existing EDVR model and comprises a multi-scale feature extraction unit and a feature alignment unit;
the n video frame sequence subsets obtained by grouping are input into the multi-scale feature extraction unit to obtain feature maps of n sizes for each video frame, and the feature maps of each size are input into the feature alignment unit for deformable convolution alignment to obtain the aligned feature maps of each group of video frame sequences;
the fusion module is the TSA fusion module of the existing EDVR network model;
the aligned feature maps of each group of video frame sequences are fused step by step from the smallest size to the largest to obtain the LR feature maps (F_g1, F_g2, …, F_gn) of the groups, where n denotes the number of groups of the video frame sequence and F_gn denotes the LR feature map of the n-th video frame sequence subset.
7. The video super-resolution method based on multi-frame grouping and a feedback network of claim 6, wherein the initial super-resolution network model further comprises a feedback module;
the LR feature maps (F_g1, F_g2, …, F_gn) of the groups of video frame sequences are input into the feedback module for n iterations in grouping order; the inputs of the stepwise feedback super-resolution in each iteration are the LR feature map of the video frame sequence subset corresponding to that iteration and the super-resolution feature map of the target frame output by the previous iteration, and each iteration outputs a super-resolution feature map of the target frame, namely:
F_(out,n) = f_FB(F_(out,n-1), F_gn)
where F_(out,n) denotes the super-resolution feature map of the target frame output by the n-th iteration, f_FB(·) denotes the feedback super-resolution operation, and F_(out,n-1) denotes the super-resolution feature map of the target frame output by the (n-1)-th iteration; for the first iteration, F_(out,0) = F_g1;
the super-resolution feature maps of the target frame output by the iterations form the super-resolution feature map sequence (F_(out,1), F_(out,2), …, F_(out,n)) of the target frame.
8. The video super-resolution method based on multi-frame grouping and a feedback network of claim 7, wherein the initial super-resolution network model further comprises a reconstruction super-resolution module;
the super-resolution feature map sequence (F_(out,1), F_(out,2), …, F_(out,n)) of the target frame is input into the reconstruction super-resolution module for reconstruction to obtain the reconstructed super-resolution residual information sequence (I_(Res,1), I_(Res,2), …, I_(Res,n)) of the target frame, namely:
I_(Res,n) = f_RB(F_(out,n))
where I_(Res,n) denotes the reconstructed super-resolution residual information of the target frame of the video frame sequence subset of the n-th iteration, and f_RB(·) denotes the reconstruction operation;
the reconstructed super-resolution residual information of the target frame is added to the preliminary super-resolution video frame of the target frame to obtain the final super-resolution video frame of the target frame, namely:
I_(SR,n) = I_(Res,n) + f_UP(I_t)
where I_(SR,n) denotes the final super-resolution video frame of the target frame of the video frame sequence subset of the n-th iteration, f_UP(I_t) denotes the preliminary super-resolution video frame of the target frame, and I_t denotes the target frame;
the final super-resolution video frames of the target frame form the final super-resolution video frame sequence (I_(SR,1), I_(SR,2), …, I_(SR,n)) of the target frame.
9. The video super-resolution method based on multi-frame grouping and a feedback network of claim 8, wherein the loss function is an L1-norm loss function:
L = Σ_n W_n·||I_(SR,n) - I_(HR,t)||_1
where W_n denotes the weight of I_(SR,n) in the loss function and I_(HR,t) denotes the ground truth of the target frame;
and steps S3-S6 are repeated to iteratively train the initial super-resolution network model with the training video data in the training video data set.
10. The video super-resolution method based on multi-frame grouping and a feedback network of claim 1, wherein the video data are obtained from the existing high-resolution data set Vimeo-90K.
Background
Video super-resolution generates a high-resolution video from a low-resolution video and, as a typical computer vision problem, has been widely studied for decades. In recent years, the proliferation of high-definition display devices and the arrival of ultra-high-definition resolutions have further driven the development of video super-resolution. The technique has broad application prospects in satellite imagery, video surveillance, medical imaging, and military technology, and has become one of the hot research problems in the field of computer vision.
Compared with single-frame super-resolution, the video super-resolution task adds temporal information. Deep-learning-based video super-resolution techniques can be roughly classified, by how they use temporal information, into methods based on multi-frame concatenation, methods based on 3D convolution, and methods based on recurrent structures. Methods based on multi-frame concatenation can be regarded as converting single-frame super-resolution into multi-frame input; to exploit temporal information well, such methods must first align adjacent frames, and alignment approaches divide into optical-flow alignment and deformable convolution alignment. The EDVR network proposed by Wang et al. uses deformable convolution alignment: EDVR aligns the features of adjacent frames to the current frame through multi-scale deformable convolution and then performs feature fusion. The RBPN network uses optical-flow alignment and exploits adjacent-frame information by combining the ideas of SISR and MISR; however, the optical-flow approach often degrades the accuracy of the final reconstruction because excessive noise is introduced in the alignment step. Although methods based on multi-frame concatenation exploit multi-frame features, the features are merely concatenated together and cannot represent the motion information between frames. Methods based on 3D convolution process the temporal information in a video by exploiting the ability of 3D convolution to learn temporal features; Caballero et al. first proposed this idea, and 3D convolution can be regarded as a slow inter-frame information fusion process. Huang et al. proposed the BRCN model by combining 3D convolution with an RNN, but their work still uses a shallow network and learns only limited information. The FSTRN proposed by Li et al. therefore employs a deep 3D convolutional network with skip connections, in which separable 3D convolutions reduce the computational load of 3D convolution. Recurrent neural networks are good at processing sequential structures, so methods based on recurrent structures perform multi-frame super-resolution with RNNs, LSTMs, and the like. The earliest method of this class is the bidirectional RNN, which has small network capacity and no explicit inter-frame alignment step; Guo et al. improved the bidirectional RNN by adding a motion compensation module and a convolutional LSTM layer. Recent advances in video super-resolution (VSR) have shown the strength of deep learning in achieving better reconstruction performance. However, existing deep-learning-based video SR methods basically fuse the temporal information of the input frames and obtain the final result in a single reconstruction pass; they do not fully exploit the feedback mechanism common in the human visual system to perform grouped feedback super-resolution on multi-frame video.
Chinese patent CN110969577A, published on April 7, 2020, provides a video super-resolution reconstruction method based on a deep dual-attention network, which achieves accurate video super-resolution reconstruction by fully exploiting spatio-temporal information through a cascaded motion compensation network model and a reconstruction network model; the motion compensation network model gradually learns coarse-to-fine optical flow to synthesize multi-scale motion information of adjacent frames, while the reconstruction network model uses a dual-attention mechanism, forming residual attention units that concentrate on the intermediate information features. That method is based on multi-frame concatenation and aligns adjacent frames with an optical-flow method, which introduces excessive noise that affects the accuracy of the final reconstruction; moreover, multi-frame concatenation merely concatenates features together and cannot represent the motion information between frames, so its video super-resolution effect is poor.
Disclosure of Invention
The invention provides a video super-resolution method based on multi-frame grouping and a feedback network, which aims to overcome the poor video super-resolution effect of the prior art; it applies the feedback mechanism of the human visual system to video super-resolution, obtains strong high-level representations, and improves the video super-resolution effect.
In order to solve the technical problems, the technical scheme of the invention is as follows:
the invention provides a video super-resolution method based on multi-frame grouping and a feedback network, which comprises the following steps:
S1: acquiring video data, and preprocessing the video data to obtain training video data that form a training video data set;
S2: determining a target frame to be super-resolved, and up-sampling the target frame to obtain a preliminary super-resolution video frame;
S3: grouping the video frame sequence contained in each piece of training video data on the time axis;
S4: inputting the grouped video frame sequences into a constructed initial super-resolution network model, extracting the feature maps of each group of video frame sequences, and performing alignment and fusion operations on the feature maps of each group of video frame sequences to obtain the LR feature maps of each group of video frame sequences;
S5: performing stepwise feedback super-resolution on the LR feature maps of each group of video frame sequences to obtain a super-resolution feature map sequence of the target frame;
S6: reconstructing the super-resolution feature map sequence of the target frame to obtain a reconstructed super-resolution residual information sequence of the target frame, and adding it to the preliminary super-resolution video frame of the target frame from S2 to obtain the final super-resolution video frame sequence of the target frame;
S7: setting a loss function and training the initial super-resolution network model to obtain a trained super-resolution network model;
S8: performing super-resolution reconstruction on the video to be super-resolved by using the trained super-resolution network model.
Preferably, preprocessing the video data comprises:
S1.1: cropping a high-resolution video frame at the same position from all video data;
S1.2: down-sampling the high-resolution video frame to obtain a low-resolution video frame;
S1.3: normalizing all low-resolution video frames;
S1.4: performing random data enhancement operations on the normalized low-resolution video frames, wherein the data enhancement operations comprise flipping and mirroring.
Preferably, in step S1.2, the high-resolution video frame is down-sampled by a Gaussian blur downsampling method.
Preferably, in step S2, a bicubic interpolation up-sampling method is used to up-sample the target frame of the training video data to obtain the preliminary super-resolution video frame.
Preferably, in step S3, the video frame sequence contained in each piece of training video data is divided into n groups on the time axis, yielding n video frame sequence subsets, each of which contains the target frame.
Preferably, the initial super-resolution network model comprises a deformable convolution alignment module and a fusion module;
the deformable convolution alignment module is the PCD feature alignment module at the front end of the existing EDVR model and comprises a multi-scale feature extraction unit and a feature alignment unit;
the n video frame sequence subsets obtained by grouping are input into the multi-scale feature extraction unit to obtain feature maps of n sizes for each video frame, and the feature maps of each size are input into the feature alignment unit for deformable convolution alignment to obtain the aligned feature maps of each group of video frame sequences;
the fusion module is the TSA fusion module of the existing EDVR network model;
the aligned feature maps of each group of video frame sequences are fused step by step from the smallest size to the largest to obtain the LR feature maps (F_g1, F_g2, …, F_gn) of the groups, where n denotes the number of groups of the video frame sequence and F_gn denotes the LR feature map of the n-th video frame sequence subset.
Preferably, the initial super-resolution network model further comprises a feedback module;
the LR feature maps (F_g1, F_g2, …, F_gn) of the groups of video frame sequences are input into the feedback module for n iterations in grouping order; the inputs of the stepwise feedback super-resolution in each iteration are the LR feature map of the video frame sequence subset corresponding to that iteration and the super-resolution feature map of the target frame output by the previous iteration, and each iteration outputs a super-resolution feature map of the target frame, namely:
F_(out,n) = f_FB(F_(out,n-1), F_gn)
where F_(out,n) denotes the super-resolution feature map of the target frame output by the n-th iteration, f_FB(·) denotes the feedback super-resolution operation, and F_(out,n-1) denotes the super-resolution feature map of the target frame output by the (n-1)-th iteration; for the first iteration, F_(out,0) = F_g1;
the super-resolution feature maps of the target frame output by the iterations form the super-resolution feature map sequence (F_(out,1), F_(out,2), …, F_(out,n)) of the target frame.
Preferably, the initial super-resolution network model further comprises a reconstruction super-resolution module;
the super-resolution feature map sequence (F_(out,1), F_(out,2), …, F_(out,n)) of the target frame is input into the reconstruction super-resolution module for reconstruction to obtain the reconstructed super-resolution residual information sequence (I_(Res,1), I_(Res,2), …, I_(Res,n)) of the target frame, namely:
I_(Res,n) = f_RB(F_(out,n))
where I_(Res,n) denotes the reconstructed super-resolution residual information of the target frame of the video frame sequence subset of the n-th iteration, and f_RB(·) denotes the reconstruction operation;
the reconstructed super-resolution residual information of the target frame is added to the preliminary super-resolution video frame of the target frame to obtain the final super-resolution video frame of the target frame, namely:
I_(SR,n) = I_(Res,n) + f_UP(I_t)
where I_(SR,n) denotes the final super-resolution video frame of the target frame of the video frame sequence subset of the n-th iteration, f_UP(I_t) denotes the preliminary super-resolution video frame of the target frame, and I_t denotes the target frame;
the final super-resolution video frames of the target frame form the final super-resolution video frame sequence (I_(SR,1), I_(SR,2), …, I_(SR,n)) of the target frame.
Preferably, the loss function is an L1-norm loss function:
L = Σ_n W_n·||I_(SR,n) - I_(HR,t)||_1
where W_n denotes the weight of I_(SR,n) in the loss function and I_(HR,t) denotes the ground truth of the target frame;
steps S3-S6 are repeated, and the initial super-resolution network model is iteratively trained with the training video data in the training video data set.
Preferably, the video data are obtained from the existing high-resolution data set Vimeo-90K.
Compared with the prior art, the technical scheme of the invention has the beneficial effects that:
First, the target frame of the preprocessed training video data that is to be super-resolved is up-sampled to obtain a preliminary super-resolution video frame, and the video frame sequence contained in the training video data is grouped on the time axis; the grouped video frame sequences are input into the initial super-resolution network model for feature map extraction, alignment, and fusion, yielding the LR feature maps of each group of video frame sequences; stepwise feedback super-resolution is then applied to the LR feature maps of each group of video frame sequences to obtain a super-resolution feature map sequence of the target frame with strong high-level representation capability; finally, the super-resolution feature map sequence of the target frame is reconstructed to obtain the reconstructed super-resolution residual information sequence of the target frame, which is added to the preliminary super-resolution video frame of the target frame to obtain the final super-resolution video frame sequence of the target frame. A loss function is set to train the initial super-resolution network model, and the trained super-resolution network model is used to perform super-resolution reconstruction on the video to be super-resolved. The method improves the video super-resolution effect and markedly improves the detail preservation of the reconstructed video frames.
Drawings
FIG. 1 is a flowchart of the video super-resolution method based on multi-frame grouping and a feedback network according to the embodiment;
FIG. 2 is a data flow diagram of the video super-resolution method based on multi-frame grouping and a feedback network according to the embodiment;
FIG. 3 is a diagram illustrating the data flow in the feedback module according to the embodiment.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the patent;
for the purpose of better illustrating the embodiments, certain features of the drawings may be omitted, enlarged or reduced, and do not represent the size of an actual product;
it will be understood by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted.
The technical solution of the present invention is further described below with reference to the accompanying drawings and examples.
Embodiment
The embodiment provides a video super-resolution method based on multi-frame grouping and a feedback network, as shown in FIG. 1, comprising the following steps:
S1: acquiring video data, and preprocessing the video data to obtain training video data that form a training video data set;
In this embodiment, videos from the existing public high-resolution data set Vimeo-90K are selected as video data, and the video data are preprocessed as follows:
S1.1: crop a 256×256 high-resolution video frame at the same position of the video data;
S1.2: down-sample the high-resolution video frame by a factor of 4 using a Gaussian blur downsampling method to obtain a 64×64 low-resolution video frame;
S1.3: normalize all low-resolution video frames;
S1.4: perform random data enhancement operations on the normalized low-resolution video frames, wherein the data enhancement operations comprise flipping and mirroring.
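For illustration, the following is a minimal sketch of steps S1.1-S1.4, assuming a PyTorch/torchvision environment; the crop position, blur kernel size, and sigma are illustrative assumptions, as the embodiment does not fix them.

```python
# Sketch of the preprocessing pipeline (S1.1-S1.4); blur parameters are assumed.
import torch
import torchvision.transforms.functional as TF

def preprocess_frame(hr_frame: torch.Tensor, scale: int = 4) -> torch.Tensor:
    """hr_frame: (C, H, W) uint8 tensor of one HR video frame -> normalized LR tensor."""
    # S1.1: crop a 256x256 HR patch (the same position is used across the clip)
    hr_patch = hr_frame[:, :256, :256].float()
    # S1.2: Gaussian blur, then downsample by the scale factor (kernel/sigma assumed)
    blurred = TF.gaussian_blur(hr_patch.unsqueeze(0), kernel_size=7, sigma=1.6)
    lr = blurred[0, :, ::scale, ::scale]            # 64x64 LR patch
    # S1.3: normalize to [0, 1]
    lr = lr / 255.0
    # S1.4: random flip/mirror augmentation (in practice the same random choice
    # would be applied to every frame of the clip and to the HR target)
    if torch.rand(1).item() < 0.5:
        lr = torch.flip(lr, dims=[1])               # vertical flip
    if torch.rand(1).item() < 0.5:
        lr = torch.flip(lr, dims=[2])               # horizontal mirror
    return lr
```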
S2: determining a target frame needing to be subjected to super-resolution, and performing up-sampling on the target frame to obtain a primary super-resolution video frame;
in this embodiment, the training video data is 7 frames, an intermediate frame is selected as a target frame to be subjected to super-segmentation, and a bicubic interpolation up-sampling operation is performed on the target frame to obtain a preliminary super-segmentation video frame.
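This preliminary super-resolution is the f_UP operation used in S6; assuming PyTorch, it reduces to a single interpolation call:

```python
# Bicubic up-sampling of the LR target frame, i.e. f_UP(I_t) in the formulas below.
import torch.nn.functional as F

def preliminary_sr(target_frame, scale: int = 4):
    """target_frame: (B, C, H, W) LR tensor -> (B, C, scale*H, scale*W) tensor."""
    return F.interpolate(target_frame, scale_factor=scale,
                         mode="bicubic", align_corners=False)
```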
S3: grouping a video frame sequence contained in each piece of training video data on a time axis;
as shown in FIG. 2, the target frame to be over-divided is denoted as ItAnd 7 frames of training video data are respectively marked as It-3、It-2、It-1、It、It+1、It+2、It+ 3; in this embodiment, the sequence of video frames is divided into 3 groups, the first group (I)t-3、It、It+3), second packet (I)t-2、It、It+2), third packet (I)t-1、It、It+1), each set being a subset of the sequence of video frames, and each set containing a target frame.
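The grouping rule can be sketched as follows for a 7-frame clip indexed from t-3 to t+3; the groups run from the largest temporal distance to the smallest, and every group contains the target frame I_t:

```python
# Temporal grouping of S3 for a 7-frame clip [I_(t-3), ..., I_(t+3)].
def group_frames(frames):
    """frames: list of 7 frames; returns the 3 video frame sequence subsets."""
    t = 3  # index of the target frame I_t within the clip
    return [
        [frames[t - 3], frames[t], frames[t + 3]],  # group 1: largest temporal distance
        [frames[t - 2], frames[t], frames[t + 2]],  # group 2
        [frames[t - 1], frames[t], frames[t + 1]],  # group 3: smallest temporal distance
    ]
```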
S4: inputting the grouped video frame sequences into the constructed initial super-resolution network model, extracting the feature maps of each group of video frame sequences, and carrying out alignment and fusion operations on the feature maps of each group of video frame sequences to obtain LR feature maps of each group of video frame sequences;
In this embodiment, the initial super-resolution network model comprises a deformable convolution alignment module, a fusion module, a feedback module, and a reconstruction super-resolution module; the grouped video frame sequences are input into the deformable convolution alignment module of the initial super-resolution network model; the deformable convolution alignment module is the PCD feature alignment module at the front end of the existing EDVR model and comprises a multi-scale feature extraction unit and a feature alignment unit, the multi-scale feature extraction unit consisting of 5 basic residual blocks;
the 3 video frame sequence subsets obtained by grouping are input into the multi-scale feature extraction unit, and feature maps of 3 sizes, from large to small, are obtained for each video frame;
the feature maps of each size are input into the feature alignment unit for deformable convolution alignment to obtain the aligned feature maps of each group of video frame sequences;
the aligned feature maps of each group of video frame sequences are input into the fusion module for fusion; the fusion module is the TSA fusion module of the existing EDVR network model;
the aligned feature maps of each group of video frame sequences are fused step by step from the smallest size to the largest to obtain the LR feature maps (F_g1, F_g2, F_g3), each with 64 channels, where F_g1, F_g2, and F_g3 denote the LR feature maps of the first, second, and third video frame sequence subsets, respectively.
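Structurally, S4 can be sketched as below; feature_extractor, pcd_align, and tsa_fuse are hypothetical callables standing in for the residual-block extractor, the EDVR PCD alignment module, and the EDVR TSA fusion module, and their signatures here are placeholders rather than the EDVR reference API:

```python
# Structural sketch of S4 with hypothetical module callables.
def extract_group_features(groups, feature_extractor, pcd_align, tsa_fuse):
    """groups: 3 frame subsets from S3 -> LR feature maps [F_g1, F_g2, F_g3]."""
    lr_feats = []
    for group in groups:
        # multi-scale (3-level) feature pyramid for every frame in the group
        pyramids = [feature_extractor(frame) for frame in group]
        # align each frame's pyramid to the target frame (middle of each group)
        target_pyramid = pyramids[len(pyramids) // 2]
        aligned = [pcd_align(p, target_pyramid) for p in pyramids]
        # TSA fusion of the aligned features -> one 64-channel LR feature map
        lr_feats.append(tsa_fuse(aligned))
    return lr_feats
```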
S5: gradually feeding back the LR characteristic graph of each group of video frame sequences for super-division to obtain a super-division characteristic graph sequence of the target frame;
as shown in FIG. 3, in this embodiment, 3 sets of LR feature maps (F)g1、Fg2、Fg3) And (3) iterating for 3 times in the iteration input feedback module, wherein the input for gradually feeding back the super-score in each iteration is the LR characteristic graph of the video frame sequence subset corresponding to the iteration and the super-score characteristic graph of the target frame output in the last iteration:
iteration 1, n is 1:
F(out,1)=fFB(F(out,0),Fg1)
wherein, F(out,1)A hyper-resolution feature map representing the target frame output from iteration 1, fFB() represents a feedback hyper-divide operation; at iteration 1, F(out,0)=Fg1;
Iteration 2, n is 2:
F(out,2)=fFB(F(out,1),Fg2)
wherein, F(out,2)A hyper-resolution feature map representing the target frame output by the 2 nd iteration;
iteration 3, n is 3:
F(out,3)=fFB(F(out,2),Fg3)
wherein, F(out,3)A hyper-resolution feature map representing the target frame output by the 3 rd iteration;
forming the hyper-resolution feature map of the target frame of 3 iterations into a hyper-resolution feature map sequence (F) of the target frame(out,1)、F(out,2)、F(out,3))。
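The three iterations reduce to a simple loop; the sketch below assumes feedback_block is a module implementing f_FB:

```python
# Feedback iterations of S5: F_(out,n) = f_FB(F_(out,n-1), F_gn), F_(out,0) = F_g1.
def feedback_superresolve(lr_feats, feedback_block):
    """lr_feats: [F_g1, F_g2, F_g3] -> [F_(out,1), F_(out,2), F_(out,3)]."""
    out_feats = []
    f_prev = lr_feats[0]                      # F_(out,0) = F_g1
    for f_g in lr_feats:                      # iterate in grouping order
        f_prev = feedback_block(f_prev, f_g)  # one feedback super-resolution step
        out_feats.append(f_prev)
    return out_feats
```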
S6: reconstructing the hyper-resolution feature map sequence of the target frame to obtain a reconstructed hyper-resolution residual error information sequence of the target frame, and adding the reconstructed hyper-resolution residual error information sequence with the preliminary hyper-resolution video frame of the target frame in S2 to obtain a final hyper-resolution video frame sequence of the target frame;
in this example, the sequence of the hyper-resolution feature map (F)(out,1)、F(out,2)、F(out,3)) Inputting a reconstruction super-resolution module for reconstruction:
I(Res,1)=fRB(F(out,1))
I(Res,2)=fRB(F(out,2))
I(Res,3)=fRB(F(out,3))
wherein, I(Res,1)、I(Res,2)、I(Res,3)The reconstructed hyper-resolution residual information of the target frames of the video frame sequence subsets of the 1 st iteration, the 2 nd iteration and the 3 rd iteration respectively form a reconstructed hyper-resolution residual information sequence (I) of the target frames(Res,1)、I(Res,2)、I(Res,3));
Adding the reconstructed super-resolution residual error information sequence of the target frame with the preliminary super-resolution video frame of the target frame:
I(SR,1)=I(Res,1)+fUP(It)
I(SR,2)=I(Res,2)+fUP(It)
I(SR,3)=I(Res,3)+fUP(It)
wherein, I(SR,1)、I(SR,2)、I(SR,3)Final hyper-resolution video frame of target frames of subsets of the sequence of video frames of 1 st, 2 nd and 3 rd iterations, respectively, fUP(It) A preliminary hyper-divided video frame representing a target frame;
the final super-divided video frame of the target frame is formed into a final super-divided video frame sequence (I) of the target frame(SR,1)、I(SR,2)、I(SR,3))。
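S6 then reduces to the following sketch, where reconstruct stands for f_RB (the embodiment does not specify its internals; a stack of residual blocks with pixel-shuffle up-sampling would be a typical choice, stated here as an assumption) and upsampled_target is f_UP(I_t) from S2:

```python
# Reconstruction of S6: I_(SR,n) = f_RB(F_(out,n)) + f_UP(I_t).
def reconstruct_sr_frames(out_feats, reconstruct, upsampled_target):
    """out_feats: [F_(out,1..3)] -> final SR frames [I_(SR,1), I_(SR,2), I_(SR,3)]."""
    sr_frames = []
    for f_out in out_feats:
        residual = reconstruct(f_out)                  # I_(Res,n)
        sr_frames.append(residual + upsampled_target)  # add back f_UP(I_t)
    return sr_frames
```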
S7: setting a loss function, and training the initial super-resolution network model to obtain a trained super-resolution network model;
In this embodiment, the loss function is the L1-norm loss function:
L = Σ_(n=1)^3 W_n·||I_(SR,n) - I_(HR,t)||_1
where W_n denotes the weight of I_(SR,n) in the loss function, n = 1, 2, 3, and I_(HR,t) denotes the ground truth of the target frame; in this embodiment, W_1, W_2, and W_3 are all set to 1;
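In code, this weighted L1 loss can be sketched as follows; mean-reduced L1, a common implementation choice, is assumed:

```python
# Weighted L1 loss of S7 with W_1 = W_2 = W_3 = 1 as in this embodiment.
import torch

def sr_loss(sr_frames, hr_target, weights=(1.0, 1.0, 1.0)):
    """sr_frames: [I_(SR,1..3)]; hr_target: ground-truth HR target frame I_(HR,t)."""
    return sum(w * torch.mean(torch.abs(sr - hr_target))
               for w, sr in zip(weights, sr_frames))
```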
Steps S3-S6 are repeated, and the initial super-resolution network model is iteratively trained with the training video data in the training video data set;
in this embodiment, the final super-resolution video frames I_(SR,1) and I_(SR,2) of the target frame from the first two iterations are used only for computing the loss function, while the final super-resolution video frame I_(SR,3) from the last iteration is taken as the super-resolution result of the target frame I_t.
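A high-level training loop consistent with this procedure might look as follows; model is assumed to bundle the modules sketched above (grouping, alignment/fusion, feedback, reconstruction), train_loader is a hypothetical data loader yielding (lr_clip, hr_target) pairs, and the epoch count and learning rate are illustrative:

```python
# Illustrative training loop for S7; `model` and `train_loader` are assumed,
# and `sr_loss` is the weighted L1 loss sketched above.
import torch

def train(model, train_loader, epochs=100, lr=1e-4):
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for lr_clip, hr_target in train_loader:   # steps S3-S6 run inside model
            sr_frames = model(lr_clip)            # [I_(SR,1), I_(SR,2), I_(SR,3)]
            loss = sr_loss(sr_frames, hr_target)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```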
S8: and performing super-resolution reconstruction on the to-be-super-resolution video by using the trained super-resolution network model.
Performing super-resolution reconstruction on a video with the method provided by this embodiment greatly improves the video super-resolution effect and preserves the details of the reconstructed video frames well, providing strong support for the technical fields of satellite imagery, video surveillance, medical imaging, and military technology.
It should be understood that the above-described embodiment of the present invention is merely an example for clearly illustrating the invention and is not intended to limit its embodiments. Other variations and modifications will be apparent to persons skilled in the art in light of the above description; it is neither necessary nor possible to list all embodiments exhaustively here. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention shall fall within the protection scope of the claims of the present invention.