Western Sichuan wildfire risk early warning method based on time-dimension feature enhancement
1. A wildfire risk early warning method for western Sichuan based on time-dimension feature enhancement, comprising the following steps:
step 1: acquiring a remote sensing image of a target area, and acquiring, for each pixel of the remote sensing image, the combustible moisture content, biomass, human factors, elevation, slope, aspect, wind speed, temperature, rainfall, relative humidity and land cover features, together with the fire data corresponding to these features;
step 2: performing time-dimension feature enhancement on the combustible moisture content, wind speed, temperature, rainfall and relative humidity in the data obtained in step 1; identifying separately the pixels belonging to forest and the pixels belonging to grassland in the remote sensing image;
I: performing the following time-dimension feature enhancement on the pixels belonging to forest;
firstly, enhancing the time dimension characteristics of the water content of the combustible;
(1) extracting nonlinear time-lag characteristics;
wherein constant is a constant, X_i is the value of the time series at time step i, N is the length of the time series, and C is the calculated nonlinear time-lag feature;
the water content of combustible materials in a certain target pixel within a period of time is taken as X, the X is input into a formula 1 to obtain a nonlinear time-lag characteristic, and the characteristic is marked as CFMC;
(2) Extracting absolute entropy characteristics;
wherein AE is the calculated absolute entropy feature, and X_i is the value of the time series at time step i;
taking the water content value of combustible materials in a certain target pixel within a period of time as X, inputting the X into a formula 2 to obtain absolute entropy characteristics, and marking as AEFMC;
(3) Extracting CWT characteristics;
wherein, CWT is the result of Ricker continuous wavelet transform for time series, D is the width parameter of wavelet function, and X is the time series characteristic of input characteristic;
taking the water content value of combustible materials in a certain target pixel within a period of time as X, inputting the X into a formula 3 to obtain CWT characteristics, and marking the CWT characteristics as CWTFMC;
(4) Extracting the low-value count;
CB = count(X < constant)    (4)
wherein CB is the number of values in the time series below the threshold, constant is a constant, and X is the time series of the input feature;
the combustible moisture content values of a target pixel over a period of time are taken as X and input into formula 4 to obtain the low-value count, which is recorded as CBFMC;
(5) Extracting quantile features;
Q = Quantile(X, constant)    (5)
wherein Q is the constant% quantile of the time series X, constant is a constant, and X is the time series of the input feature;
taking the water content value of combustible materials in a certain target pixel within a period of time as X, inputting the X into a formula 5 to obtain quantile characteristics and recording the quantile characteristics as QFMC;
Secondly, time dimension characteristic enhancement is carried out according to the wind speed;
(1) CWT characteristics of the wind speed are extracted by adopting a method of a formula 3 and are recorded as CWTWS;
(2) Extracting the high-value count;
CA = count(X > constant)    (6)
wherein CA is the number of values in the time series above the threshold, constant is a constant, and X is the time series of the input feature;
the wind speed values of a target pixel over a period of time are taken as X and input into formula 6 to obtain the high-value count, which is recorded as CAWS;
(3) Extracting the nonlinear time lag characteristic of the wind speed by adopting the method of formula 1 to obtain CWS;
(4) Extracting the value count;
VC = count(X = constant)    (7)
wherein VC is the number of values in the time series equal to constant, constant is a constant, and X is the time series of the input feature;
the wind speed value of a certain target pixel within a period of time is taken as X, and the X is input into a formula 7 to obtain a value statistic which is recorded as VCWS;
(5) Extracting the range count;
RC = count(constant_low < X < constant_up)    (8)
wherein RC is the number of values in the time series within the specified range, constant_low and constant_up are constants, and X is the time series of the input feature;
the wind speed value of a certain target pixel in a period of time is taken as X, and the X is input into a formula 8 to obtain a range statistic which is recorded as RCWS;
Thirdly, time dimension characteristic enhancement is carried out aiming at the temperature;
(1) extracting CWT characteristics of temperature by adopting a method of formula 3 to obtain CWTT;
(2) The temperature is subjected to quantile processing by adopting a method of a formula 5 to obtain QT;
(3) Extracting grouping entropy characteristics;
wherein BE is the calculated grouping entropy characteristics, P is the sample percentage, nums is the grouping number, and X is the time sequence value of the input characteristics;
the temperature value of a certain target pixel in a period of time is taken as X, and the temperature value is input into a formula 9 to obtain a grouping entropy characteristic which is recorded as BET;
(4) Carrying out absolute entropy processing on the temperature by adopting a method of formula 2 to obtain AET;
(5) Performing low value statistical processing on the temperature by adopting a method of a formula 4 to obtain CBT;
Fourthly, feature extraction is carried out aiming at rainfall;
(1) quantile processing is carried out on the rainfall by adopting a method of a formula 5 to obtain QRainfall;
(2) Performing low value statistical processing on the rainfall by adopting a method of a formula 4 to obtain CBRainfall;
(3) The method of formula 8 is adopted to carry out range statistic processing on rainfall to obtain RCRainfall;
(4) Adopting the method of formula 1 to carry out nonlinear time-lag processing on rainfall to obtain CRainfall;
(5) The method of formula 9 is adopted to carry out grouping entropy processing on the rainfall to obtain BERainfall;
Fifthly, feature extraction is carried out on the relative humidity:
(1) performing wavelet transformation processing on the relative humidity by adopting a method of a formula 3 to obtain CWTRH;
(2) Carrying out high-value counting treatment on the relative humidity by adopting a method of a formula 6 to obtain CARH;
(3) Quantile processing is carried out on the relative humidity by adopting a method of a formula 5 to obtain QRH;
(4) Adopting the method of formula 2 to carry out absolute entropy processing on the relative humidity to obtain AERH;
(5) Performing grouping entropy processing on the relative humidity by adopting a method of a formula 9 to obtain BERH;
II: carrying out the following time dimension characteristic enhancement on the pixels belonging to the grassland;
firstly, extracting characteristics aiming at the moisture content of combustible materials:
(1) performing quantile processing on the water content of the combustible by adopting a method of a formula 5 to obtain QFMC;
(2) The method of formula 1 is adopted to carry out nonlinear time-lag treatment on the water content of the combustible to obtain CFMC;
(3) Performing range statistics on the water content of the combustible by adopting a method of a formula 8 to obtain RCFMC;
(4) Performing grouped entropy processing on the water content of the combustible by adopting a method of a formula 9 to obtain BEFMC;
(5) Carrying out absolute entropy treatment on the water content of the combustible by adopting a method of a formula 2 to obtain AEFMC;
Secondly, feature extraction is carried out on WS:
(1) adopting a method of a formula 4 to carry out low value statistical processing on WS to obtain CBWS;
(2) Quantile processing is carried out on WS by adopting a method of formula 5 to obtain QWS;
(3) The WS is processed by the grouping entropy by adopting the method of formula 9 to obtain BEWS;
(4) Adopting the method of formula 1 to carry out nonlinear time-lag processing on WS to obtain CWS;
(5) Adopting the method of formula 6 to carry out high-value statistical processing on WS to obtain CAWS;
Thirdly, extracting features of the T:
(1) adopting the method of formula 1 to carry out nonlinear time-lag processing on T to obtain CT;
(2) Adopting a method of formula 3 to perform wavelet transformation processing on T to obtain CWTT;
(3) Performing value statistics processing on T by adopting a method of a formula 7 to obtain VCT;
(4) Carrying out high-value statistical processing on T by adopting a method of a formula 6 to obtain CAT;
(5) Performing low value statistical processing on T by adopting a method of a formula 4 to obtain CBT;
Fourthly, feature extraction is carried out on Rainfall:
(1) performing grouping entropy processing on Rainfall by adopting a method of a formula 9 to obtain BERainfall;
(2) Performing value statistics processing on Rainfall by adopting a method of a formula 7 to obtain VCRainfall;
(3) Performing wavelet transformation processing on Rainfall by adopting a method of a formula 3 to obtain CWTRainfall;
(4) Quantile processing is carried out on Rainfall by adopting a method of a formula 5 to obtain QRainfall;
(5) Adopting a method of a formula 2 to carry out absolute entropy processing on Rainfall to obtain AERainfall;
Fifthly, extracting features of RH:
(1) adopting a method of a formula 2 to carry out absolute entropy processing on RH to obtain AERH;
(2) Adopting a method of a formula 6 to carry out high-value statistical processing on RH to obtain CARH;
(3) Performing value statistics processing on RH by adopting a method of a formula 7 to obtain VCRH;
(4) Adopting a method of a formula 4 to carry out low value statistical processing on RH to obtain CBRH;
(5) Adopting the method of formula 1 to carry out nonlinear time-lag processing on RH to obtain CRH;
step 3: training one XGBoost model for forest pixels and one for grassland pixels; the inputs to each XGBoost model are the data obtained in step 2 after time-dimension feature enhancement of the combustible moisture content, wind speed, temperature, rainfall and relative humidity, together with the remaining features from step 1 (biomass, human factors, elevation, slope, aspect and land cover); the output is whether the pixel corresponding to these features catches fire;
step 4: during real-time monitoring, first determining whether a pixel is a forest pixel or a grassland pixel, obtaining the corresponding features, and then using the corresponding XGBoost model for fire early warning.
Background
Wildfires play an important role in ecosystem development. They can benefit an ecosystem, for example by promoting vegetation succession and improving its resistance to pests. However, they also cause negative effects such as soil erosion and degradation and greenhouse gas emissions. Firefighting is dangerous: on 30 March 2019 a fire broke out suddenly in Liangshan, Sichuan Province; the total burned area was about 20 hectares and 31 people were killed. The Australian bushfires began to rage on 7/8 of 2019; more than 1200 hectares of land were burned, about 10 million wild animals were killed, 33 people died and 2500 houses were destroyed. On 30 March 2020 a forest fire also broke out in Xichang City, Sichuan Province; the burned area exceeded 1000 hectares, the destroyed area exceeded 80 hectares, and 19 people were killed. In the face of such heavy losses, wildfire risk early warning is urgently needed.
In recent years, with the rapid development and spread of remote sensing, computing and machine learning, methods that combine remote sensing with machine learning algorithms have advanced quickly in the field of fire early warning. Given the economic losses and casualties caused by the large fires of recent years, a methodology capable of early warning over large areas with high accuracy and high resolution is urgently needed. Building wildfire early warning models with wide coverage, high accuracy and high spatio-temporal resolution can, on the one hand, provide scientific decision support to the relevant authorities, helping to prevent fires before they start, control fire occurrence at the source and protect the ecosystem; on the other hand, it can provide a basis for research on wildfire early warning and related fields, and so continue to improve early warning performance.
Existing research on fire risk early warning focuses mainly on the choice of model and the choice of factors, and does not consider the importance of the factors' time-dimension features when constructing the fire data set. That way of constructing the data set is simple and efficient, but it has several problems. Because forest and grassland fires require a long period of drought to bring the combustible material into a flammable state, the factor values on the day of the fire alone cannot represent the full fire scenario. Without long time-series information on the inducing factors before the fire, a model has difficulty learning the specific spatio-temporal scenario of a fire and cannot give accurate wildfire early warning.
Disclosure of Invention
The invention aims to provide a model for wildfire risk early warning in western Sichuan that enhances the time-dimension features of the wildfire inducing factors and takes into account the heterogeneity of these features between fire pixels and non-fire pixels.
The technical scheme of the invention is a western Sichuan wildfire risk early warning method based on time-dimension feature enhancement, comprising the following steps:
step 1: acquiring a remote sensing image of a target area, and acquiring, for each pixel of the remote sensing image, the combustible moisture content, biomass, human factors, elevation, slope, aspect, wind speed, temperature, rainfall, relative humidity and land cover features, together with the fire data corresponding to these features;
step 2: performing time-dimension feature enhancement on the combustible moisture content, wind speed, temperature, rainfall and relative humidity in the data obtained in step 1; identifying separately the pixels belonging to forest and the pixels belonging to grassland in the remote sensing image;
I: performing the following time-dimension feature enhancement on the pixels belonging to forest;
firstly, enhancing the time dimension characteristics of the water content of the combustible;
(1) extracting nonlinear time-lag characteristics;
wherein constant is a constant, X_i is the value of the time series at time step i, N is the length of the time series, and C is the calculated nonlinear time-lag feature;
the water content of combustible materials in a certain target pixel within a period of time is taken as X, the X is input into a formula 1 to obtain a nonlinear time-lag characteristic, and the characteristic is marked as CFMC;
(2) Extracting absolute entropy characteristics;
wherein AE is the calculated absolute entropy feature, and X_i is the value of the time series at time step i;
taking the water content value of combustible materials in a certain target pixel within a period of time as X, inputting the X into a formula 2 to obtain absolute entropy characteristics, and marking as AEFMC;
(3) Extracting CWT characteristics;
wherein, CWT is the result of Ricker continuous wavelet transform for time series, D is the width parameter of wavelet function, and X is the time series characteristic of input characteristic;
taking the water content value of combustible materials in a certain target pixel within a period of time as X, inputting the X into a formula 3 to obtain CWT characteristics, and marking the CWT characteristics as CWTFMC;
(4) Extracting the low-value count;
CB = count(X < constant)    (4)
wherein CB is the number of values in the time series below the threshold, constant is a constant, and X is the time series of the input feature;
the combustible moisture content values of a target pixel over a period of time are taken as X and input into formula 4 to obtain the low-value count, which is recorded as CBFMC;
(5) Extracting quantile features;
Q = Quantile(X, constant)    (5)
wherein Q is the constant% quantile of the time series X, constant is a constant, and X is the time series of the input feature;
taking the water content value of combustible materials in a certain target pixel within a period of time as X, inputting the X into a formula 5 to obtain quantile characteristics and recording the quantile characteristics as QFMC;
Secondly, time dimension characteristic enhancement is carried out according to the wind speed;
(1) CWT characteristics of the wind speed are extracted by adopting a method of a formula 3 and are recorded as CWTWS;
(2) Extracting the high-value count;
CA = count(X > constant)    (6)
wherein CA is the number of values in the time series above the threshold, constant is a constant, and X is the time series of the input feature;
the wind speed value of a certain target pixel within a period of time is taken as X, and the X is input into a formula 6 to obtain a high value statistic, and the high value statistic is marked as CAWS;
(3) Extracting the nonlinear time lag characteristic of the wind speed by adopting the method of formula 1 to obtain CWS;
(4) Extracting the value count;
VC = count(X = constant)    (7)
wherein VC is the number of values in the time series equal to constant, constant is a constant, and X is the time series of the input feature;
the wind speed value of a certain target pixel within a period of time is taken as X, and the X is input into a formula 7 to obtain a value statistic which is recorded as VCWS;
(5) Extracting the range count;
RC = count(constant_low < X < constant_up)    (8)
wherein RC is the number of values in the time series within the specified range, constant_low and constant_up are constants, and X is the time series of the input feature;
the wind speed value of a certain target pixel in a period of time is taken as X, and the X is input into a formula 8 to obtain a range statistic which is recorded as RCWS;
Thirdly, time dimension characteristic enhancement is carried out aiming at the temperature;
(1) extracting CWT characteristics of temperature by adopting a method of formula 3 to obtain CWTT;
(2) The temperature is subjected to quantile processing by adopting a method of a formula 5 to obtain QT;
(3) Extracting grouping entropy characteristics;
wherein BE is the calculated grouping entropy characteristics, P is the sample percentage, nums is the grouping number, and X is the time sequence value of the input characteristics;
the temperature value of a certain target pixel in a period of time is taken as X, and the temperature value is input into a formula 9 to obtain a grouping entropy characteristic which is recorded as BET;
(4) Carrying out absolute entropy processing on the temperature by adopting a method of formula 2 to obtain AET;
(5) Performing low value statistical processing on the temperature by adopting a method of a formula 4 to obtain CBT;
Fourthly, feature extraction is carried out aiming at rainfall;
(1) quantile processing is carried out on the rainfall by adopting a method of a formula 5 to obtain QRainfall;
(2) Performing low value statistical processing on the rainfall by adopting a method of a formula 4 to obtain CBRainfall;
(3) The method of formula 8 is adopted to carry out range statistic processing on rainfall to obtain RCRainfall;
(4) Adopting the method of formula 1 to carry out nonlinear time-lag processing on rainfall to obtain CRainfall;
(5) The method of formula 9 is adopted to carry out grouping entropy processing on the rainfall to obtain BERainfall;
Fifthly, feature extraction is carried out on the relative humidity:
(1) performing wavelet transformation processing on the relative humidity by adopting a method of a formula 3 to obtain CWTRH;
(2) Carrying out high-value counting treatment on the relative humidity by adopting a method of a formula 6 to obtain CARH;
(3) Quantile processing is carried out on the relative humidity by adopting a method of a formula 5 to obtain QRH;
(4) Adopting the method of formula 2 to carry out absolute entropy processing on the relative humidity to obtain AERH;
(5) Performing grouping entropy processing on the relative humidity by adopting a method of a formula 9 to obtain BERH;
II: carrying out the following time dimension characteristic enhancement on the pixels belonging to the grassland;
firstly, extracting characteristics aiming at the moisture content of combustible materials:
(1) performing quantile processing on the water content of the combustible by adopting a method of a formula 5 to obtain QFMC;
(2) Carrying out nonlinear time-lag treatment on the water content of the combustible by adopting the method of formula 1 to obtain CFMC;
(3) Performing range statistics on the water content of the combustible by adopting a method of a formula 8 to obtain RCFMC;
(4) Performing grouped entropy processing on the water content of the combustible by adopting a method of a formula 9 to obtain BEFMC;
(5) Carrying out absolute entropy treatment on the water content of the combustible by adopting a method of a formula 2 to obtain AEFMC;
Secondly, feature extraction is carried out on WS:
(1) adopting a method of a formula 4 to carry out low value statistical processing on WS to obtain CBWS;
(2) Quantile processing is carried out on WS by adopting a method of formula 5 to obtain QWS;
(3) The WS is processed by the grouping entropy by adopting the method of formula 9 to obtain BEWS;
(4) Adopting the method of formula 1 to carry out nonlinear time-lag processing on WS to obtain CWS;
(5) Adopting the method of formula 6 to carry out high-value statistical processing on WS to obtain CAWS;
Thirdly, extracting features of the T:
(1) adopting the method of formula 1 to carry out nonlinear time-lag processing on T to obtain CT;
(2) Adopting a method of formula 3 to perform wavelet transformation processing on T to obtain CWTT;
(3) Performing value statistics processing on T by adopting a method of a formula 7 to obtain VCT;
(4) Carrying out high-value statistical processing on T by adopting a method of a formula 6 to obtain CAT;
(5) Performing low value statistical processing on T by adopting a method of a formula 4 to obtain CBT;
Fourthly, feature extraction is carried out on Rainfall:
(1) performing grouping entropy processing on Rainfall by adopting a method of a formula 9 to obtain BERainfall;
(2) Performing value statistics processing on Rainfall by adopting a method of a formula 7 to obtain VCRainfall;
(3) Performing wavelet transformation processing on Rainfall by adopting a method of a formula 3 to obtain CWTRainfall;
(4) Quantile processing is carried out on Rainfall by adopting a method of a formula 5 to obtain QRainfall;
(5) Adopting a method of a formula 2 to carry out absolute entropy processing on Rainfall to obtain AERainfall;
Fifthly, extracting features of RH:
(1) adopting a method of a formula 2 to carry out absolute entropy processing on RH to obtain AERH;
(2) Adopting a method of a formula 6 to carry out high-value statistical processing on RH to obtain CARH;
(3) Performing value statistics processing on RH by adopting a method of a formula 7 to obtain VCRH;
(4) Adopting a method of a formula 4 to carry out low value statistical processing on RH to obtain CBRH;
(5) Adopting the method of formula 1 to carry out nonlinear time-lag processing on RH to obtain CRH;
step 3: training one XGBoost model for forest pixels and one for grassland pixels; the inputs to each XGBoost model are the data obtained in step 2 after time-dimension feature enhancement of the combustible moisture content, wind speed, temperature, rainfall and relative humidity, together with the remaining features from step 1 (biomass, human factors, elevation, slope, aspect and land cover); the output is whether the pixel corresponding to these features catches fire;
step 4: during real-time monitoring, first determining whether a pixel is a forest pixel or a grassland pixel, obtaining the corresponding features, and then using the corresponding XGBoost model for fire early warning.
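The per-class feature sets listed in step 2 can be written down as a lookup table, so that a pixel is first routed by land cover and then described by the matching columns. The sketch below is one hypothetical way to encode that table in Python. The column-name pattern (`STAT_FACTOR`), the static feature names and the function `feature_names` are assumptions made for illustration; only the statistic-to-factor pairings are taken from the steps above.

```python
from typing import Dict, List

# Statistic codes per factor, in the order given in step 2 (I: forest, II: grassland).
FOREST: Dict[str, List[str]] = {
    "FMC": ["C", "AE", "CWT", "CB", "Q"],
    "WS": ["CWT", "CA", "C", "VC", "RC"],
    "T": ["CWT", "Q", "BE", "AE", "CB"],
    "Rainfall": ["Q", "CB", "RC", "C", "BE"],
    "RH": ["CWT", "CA", "Q", "AE", "BE"],
}
GRASSLAND: Dict[str, List[str]] = {
    "FMC": ["Q", "C", "RC", "BE", "AE"],
    "WS": ["CB", "Q", "BE", "C", "CA"],
    "T": ["C", "CWT", "VC", "CA", "CB"],
    "Rainfall": ["BE", "VC", "CWT", "Q", "AE"],
    "RH": ["AE", "CA", "VC", "CB", "C"],
}
# Features from step 1 that are passed through unchanged (names are hypothetical).
STATIC: List[str] = ["biomass", "human_factors", "elevation", "slope", "aspect", "land_cover"]

RECIPES = {"forest": FOREST, "grassland": GRASSLAND}


def feature_names(pixel_class: str) -> List[str]:
    """Return the model input columns for a 'forest' or 'grassland' pixel."""
    recipe = RECIPES[pixel_class]  # raises KeyError for any other class
    enhanced = [f"{stat}_{factor}" for factor, stats in recipe.items() for stat in stats]
    return enhanced + STATIC


forest_columns = feature_names("forest")
grass_columns = feature_names("grassland")
```

Each class ends up with 25 enhanced columns plus the six static ones, and a separately trained model per class would consume the matching column list.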
The beneficial effects of the invention are as follows: the invention provides a western Sichuan wildfire risk early warning model based on time-dimension feature enhancement. The time-dimension features of the wildfire inducing factors are enhanced; on this basis, the five most relevant features are selected for each factor using the Pearson correlation coefficient and added to the wildfire history database. This increases the heterogeneity of the factors in the database on the time scale and improves the wildfire early warning accuracy of the model.
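As an illustration of the selection step, the sketch below ranks candidate time-dimension features of one factor by the absolute Pearson correlation with the fire / non-fire label and keeps the top five. Ranking against the label, using the absolute value, and the names `pearson` and `top_k_features` are assumptions; the text only states that the Pearson correlation coefficient is used to pick five features per factor.

```python
import math
from typing import Dict, List, Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def top_k_features(candidates: Dict[str, Sequence[float]],
                   labels: Sequence[float], k: int = 5) -> List[str]:
    """Names of the k candidate features most correlated (in absolute value) with labels."""
    ranked = sorted(candidates,
                    key=lambda name: abs(pearson(candidates[name], labels)),
                    reverse=True)
    return ranked[:k]


# Hypothetical toy data: four samples, 1 = fire pixel, 0 = non-fire pixel.
labels = [0, 0, 1, 1]
candidates = {"a": [0, 0, 1, 1], "b": [1, 1, 1, 1], "c": [1, 0, 1, 0]}
best = top_k_features(candidates, labels, k=1)
```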
Drawings
FIG. 1 shows the factors in the western Sichuan study area, Sichuan Province. Panel (a) shows, from left to right and top to bottom, aspect, elevation, slope, FFL, land cover and combustible moisture content; panel (b) shows rainfall, relative humidity, temperature, wind speed, distance to roads and distance to residential points.
FIG. 2 is a schematic diagram of the western Sichuan study area, Sichuan Province. The figure marks the historical fire points, the terrain of western Sichuan and the distribution of forest and grassland. It also shows the administrative units that make up western Sichuan: Aba Tibetan and Qiang Autonomous Prefecture, Ganzi Tibetan Autonomous Prefecture, Liangshan Yi Autonomous Prefecture and Panzhihua City.
FIG. 3 is a schematic diagram of the ROC curves of the XGBoost model. The left panel shows the ROC curves on the test set with and without the time-dimension features and with and without separate land-cover types; the right panel shows the same comparison on the training set.
FIG. 4 is a schematic diagram of wildfire risk in western Sichuan. (a) shows fire risk in the fire season with time-dimension feature enhancement of the factors; (b) shows fire risk in the fire season without it; (c) shows fire risk in the non-fire season with time-dimension feature enhancement; (d) shows fire risk in the non-fire season without it.
FIG. 5 shows time series of wildfire risk for Lushan and Qiaowa. (a) is the long time series of fire risk at Lushan, Xichang City, with time-dimension feature enhancement; (b) is the same series without time-dimension feature enhancement; (c) is the long time series of fire risk at Qiaowa Town, Muli County, with time-dimension feature enhancement; (d) is the same series without time-dimension feature enhancement.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
(1) Data preparation and database construction
A fire has many inducing factors, such as combustible moisture content and FFL (biomass). The method first acquires a remote sensing image of the target area and obtains, for each pixel, the combustible moisture content, FFL, human factors, elevation, slope, aspect, wind speed, temperature, rainfall, relative humidity and land cover features; the human factors are obtained by applying a Euclidean distance buffer transformation to images of the road network and residential points. FIG. 1(a) shows six factors for western Sichuan (aspect, elevation, slope, FFL, land cover and combustible moisture content), and FIG. 1(b) shows rainfall, relative humidity, temperature, wind speed, Euclidean distance to roads and Euclidean distance to residential points. Fire pixels and non-fire pixels are extracted from the historical fire product MCD64A1 to build a wildfire history database. The values of all inducing factors for every fire pixel and non-fire pixel in the database, at the corresponding historical time and location, are then acquired and added to the database. Western Sichuan is chosen as the study area mainly because the historical fire records show that 90% of the fire points in Sichuan are concentrated there (see FIG. 2), and because its variable climate and complex terrain make wildfire risk early warning difficult, so the practical feasibility and performance of the model can be tested;
(2) time dimension feature enhancement
Using the constructed wildfire history database and a 16-day time window, time-dimension feature enhancement is performed on five factors: combustible moisture content, wind speed, temperature, rainfall and relative humidity;
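A minimal sketch of how a 16-value window X might be cut from a longer record for one pixel is given below. It assumes one observation per day and a window that ends on, and includes, the sample date; neither assumption is stated in the text, and the function name `trailing_window` and the sample data are hypothetical.

```python
from typing import List, Sequence


def trailing_window(series: Sequence[float], end_index: int, length: int = 16) -> List[float]:
    """Return the `length` consecutive values ending at `end_index` (inclusive)."""
    start = end_index - length + 1
    if start < 0 or end_index >= len(series):
        raise ValueError("not enough history for the requested window")
    return list(series[start:end_index + 1])


# Hypothetical daily relative-humidity record; index 29 stands for the sample date.
daily_rh = [40.0 + (day % 7) for day in range(30)]
X = trailing_window(daily_rh, end_index=29)
```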
the pixels belonging to forest and the pixels belonging to grassland in the remote sensing image are identified separately, and features are extracted from the combustible moisture content, wind speed, temperature, rainfall and relative humidity of the forest pixels as follows:
firstly, extracting characteristics aiming at the moisture content of combustible materials:
(1) C feature processing is performed on the combustible moisture content: the combustible moisture content values of a target pixel over 16 days are taken as X and input into formula 1 (where N is 16 and constant is 1) to obtain C_FMC.
Here C (nonlinear time lag) is a measure of the nonlinearity of the time series, constant is a constant, and X is the value of the factor time series.
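Formula 1 is not reproduced in this passage. One widely used nonlinearity statistic that takes exactly these inputs (a series X of length N and an integer constant acting as a lag) is the c3 statistic, the mean of the products X[i+2*lag] * X[i+lag] * X[i]. The sketch below assumes that form; it is an illustration under that assumption and not necessarily the formula of the method. The function name `c3` and the sample values are hypothetical.

```python
def c3(x, lag):
    """Assumed form of the nonlinear time-lag feature: mean of x[i+2*lag] * x[i+lag] * x[i]."""
    n = len(x)
    if 2 * lag >= n:
        # The series is too short for this lag, so no product can be formed.
        return 0.0
    terms = [x[i + 2 * lag] * x[i + lag] * x[i] for i in range(n - 2 * lag)]
    return sum(terms) / len(terms)


# Hypothetical 16-day moisture content window for one pixel.
fmc_window = [118, 121, 125, 119, 117, 115, 112, 110, 111, 109, 108, 104, 101, 99, 98, 97]
c_fmc = c3(fmc_window, lag=1)
```

With lag set to 0 the statistic reduces to the mean of the cubed values, which is how a constant of 0 would be interpreted under this assumption.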
(2) AE feature processing is performed on the combustible moisture content: the combustible moisture content values of a target pixel over 16 days are taken as X and input into formula 2 (where N is 16) to obtain AE_FMC.
Here AE (absolute entropy) is the absolute energy of the time series, that is, the sum of the squares of the factor values within the time window, and X is the value of the factor time series.
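Since AE is described as the sum of the squares of the values in the window, it can be sketched in one line. The function name `absolute_energy` and the sample window are hypothetical; formula 2 itself is not reproduced here, so any normalization it may contain is not modelled.

```python
def absolute_energy(x):
    """Sum of the squared values of the series within the time window."""
    return sum(v * v for v in x)


# Hypothetical 16-day relative humidity window (percent) for one pixel.
rh_window = [48, 52, 55, 60, 58, 49, 47, 45, 50, 53, 61, 66, 70, 68, 59, 54]
ae_rh = absolute_energy(rh_window)
```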
(3) CWT feature processing is performed on the combustible moisture content: the combustible moisture content values of a target pixel over 16 days are taken as X and input into formula 3 (where D is 5) to obtain CWT_FMC.
Here CWT (wavelet transform) is the result of the Ricker continuous wavelet transform of the time series, D is the width parameter of the wavelet function, and X is the value of the factor time series.
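A Ricker continuous wavelet transform at a single width can be sketched as a centred, zero-padded convolution of the series with a sampled Ricker ("Mexican hat") wavelet. Formula 3 is not reproduced in this passage, so the conventions below are assumptions: the wavelet is sampled on min(10 * D, N) points, D is an integer, and the text does not say which coefficient of the result is kept as the feature. The function names `ricker` and `cwt_ricker` are hypothetical.

```python
import math


def ricker(points, width):
    """Sample the Ricker wavelet of the given width on `points` samples centred on zero."""
    amplitude = 2.0 / (math.sqrt(3.0 * width) * math.pi ** 0.25)
    centre = (points - 1) / 2.0
    samples = []
    for i in range(points):
        u = ((i - centre) / width) ** 2
        samples.append(amplitude * (1.0 - u) * math.exp(-u / 2.0))
    return samples


def cwt_ricker(x, width):
    """Centred convolution of x with the Ricker wavelet; the output has the length of x."""
    n = len(x)
    kernel = ricker(min(10 * width, n), width)  # assumed wavelet support
    m = len(kernel)
    full = [0.0] * (n + m - 1)
    for i, xi in enumerate(x):
        for j, kj in enumerate(kernel):
            full[i + j] += xi * kj
    start = (m - 1) // 2  # keep the central n samples of the full convolution
    return full[start:start + n]


# Hypothetical 16-day series transformed with width D = 5.
series = [float(v) for v in range(16)]
coefficients = cwt_ricker(series, 5)
```

The result is one coefficient per day of the window; selecting a single coefficient from it as CWT_FMC would be a further design choice that the text leaves open.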
(4) CB feature processing is performed on the combustible moisture content: the combustible moisture content values of a target pixel over 16 days are taken as X and input into formula 4 (where constant is 120) to obtain CB_FMC.
CB=count(X<constant) (4)
Here CB (low-value statistic) is the number of values in the time series that are below a given value, constant is that value, and X is the value of the factor time series.
(5) Q feature processing is performed on the combustible moisture content: the combustible moisture content values of a target pixel over 16 days are taken as X and input into formula 5 (where constant is 72) to obtain Q_FMC.
Q=Quantile(X,constant) (5)
Wherein Q (quantile) is the constant% quantile of time series X, constant is a constant, and X is the value of the factor time series.
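The quantile feature of formula 5 can be sketched as follows. The text does not state how values between two samples are interpolated, so linear interpolation between the neighbouring sorted values is assumed here; the function name `quantile` and the sample window are hypothetical.

```python
import math


def quantile(x, percent):
    """Return the percent% quantile of x, assuming linear interpolation between sorted values."""
    if not x:
        raise ValueError("empty series")
    ordered = sorted(x)
    position = (len(ordered) - 1) * percent / 100.0
    lower = int(math.floor(position))
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


# Hypothetical 16-day moisture content window; constant = 72 as used for forest pixels.
fmc_window = [118, 121, 125, 119, 117, 115, 112, 110, 111, 109, 108, 104, 101, 99, 98, 97]
q_fmc = quantile(fmc_window, 72)
```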
Second, features are extracted from the wind speed (WS):
(1) Wavelet transform processing is performed on WS using formula 3 (where D is 9) to obtain CWT_WS.
(2) CA feature processing is performed on WS: the WS values of a target pixel over 16 days are taken as X and input into formula 6 (where constant is 3.2) to obtain CA_WS.
CA=count(X>constant) (6)
Here CA (high-value statistic) is the number of values in the time series that are above a given value, constant is that value, and X is the value of the factor time series.
(3) Nonlinear time-lag processing is performed on WS using formula 1 (where N is 16 and constant is 3) to obtain C_WS.
(4) VC feature processing is performed on WS: the WS values of a target pixel over 16 days are taken as X and input into formula 7 (where constant is 0) to obtain VC_WS.
VC=count(X==constant) (7)
Here VC (value statistic) is the number of values in the time series that are equal to a given value, constant is that value, and X is the value of the factor time series.
(5) RC feature processing is performed on WS: the WS values of a target pixel over 16 days are taken as X and input into formula 8 (where constant_low is -1 and constant_up is 1) to obtain RC_WS.
RC=count(constant_low<X<constant_up) (8)
Here RC (range statistic) is the number of values in the time series that lie strictly between constant_low and constant_up, constant_low and constant_up are constants, and X is the value of the factor time series.
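The four counting features of formulas 4, 6, 7 and 8 translate directly into code. The sketch below follows the comparisons exactly as written in those formulas (strict inequalities for CB, CA and RC, equality for VC); the function names and the sample wind speed values are hypothetical.

```python
def count_below(x, constant):
    """CB, formula 4: number of values strictly below the constant."""
    return sum(1 for v in x if v < constant)


def count_above(x, constant):
    """CA, formula 6: number of values strictly above the constant."""
    return sum(1 for v in x if v > constant)


def value_count(x, constant):
    """VC, formula 7: number of values equal to the constant."""
    return sum(1 for v in x if v == constant)


def range_count(x, constant_low, constant_up):
    """RC, formula 8: number of values strictly between the two constants."""
    return sum(1 for v in x if constant_low < v < constant_up)


# Hypothetical wind speed samples for one pixel.
ws = [0.0, 0.5, 1.0, 2.0, 3.2, 3.5, 4.0, 0.0]
ca_ws = count_above(ws, 3.2)    # forest setting: constant = 3.2
vc_ws = value_count(ws, 0)      # forest setting: constant = 0
rc_ws = range_count(ws, -1, 1)  # forest setting: constant_low = -1, constant_up = 1
cb_ws = count_below(ws, 1.8)    # grassland setting: constant = 1.8
```

Because the inequalities are strict, a value exactly equal to a threshold (3.2 or 1.0 in the sample) is not counted by CA or RC.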
Third, features are extracted from the temperature (T):
(1) Wavelet transform processing is performed on T using formula 3 (where D is 4) to obtain CWT_T.
(2) Quantile processing is performed on T using formula 5 (where constant is 27) to obtain Q_T.
(3) BE feature processing is performed on T: the T values of a target pixel over 16 days are taken as X and input into formula 9 (where nums is 16) to obtain BE_T.
Here BE (grouping entropy) is obtained by dividing the value range of the whole time series into nums groups, assigning each value to its group, and then computing the entropy of the resulting distribution; P is the percentage of samples in each group, nums is the number of groups, and X is the value of the factor time series.
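The grouping entropy described above can be sketched as follows. Formula 9 is not reproduced in this passage, so two details are assumptions: the groups are taken as equal-width bins between the minimum and maximum of the series, and the entropy uses the natural logarithm. The function name `binned_entropy` and the sample window are hypothetical.

```python
import math


def binned_entropy(x, nums):
    """Entropy of the share of samples P falling into each of nums equal-width value groups."""
    low, high = min(x), max(x)
    if high == low:
        # All values fall into one group, so the entropy is zero.
        return 0.0
    width = (high - low) / nums
    counts = [0] * nums
    for v in x:
        index = min(int((v - low) / width), nums - 1)  # the maximum goes into the last group
        counts[index] += 1
    total = len(x)
    return -sum((c / total) * math.log(c / total) for c in counts if c)


# Hypothetical 16-day temperature window in kelvin, grouped with nums = 16.
t_window = [289, 290, 290, 291, 293, 295, 296, 296, 297, 299, 300, 301, 301, 302, 303, 303]
be_t = binned_entropy(t_window, 16)
```

The entropy is largest when the values spread evenly over all groups (ln 16 for 16 groups) and zero when they all fall into one group.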
(4) Absolute entropy processing is performed on T using formula 2 (where N is 16) to obtain AE_T.
(5) Low-value statistical processing is performed on T using formula 4 (where constant is 300) to obtain CB_T.
Fourth, features are extracted from the rainfall (Rainfall):
(1) Quantile processing is performed on Rainfall using formula 5 (where constant is 62) to obtain Q_Rainfall.
(2) Low-value statistical processing is performed on Rainfall using formula 4 (where constant is 1) to obtain CB_Rainfall.
(3) Range statistical processing is performed on Rainfall using formula 8 (where constant_low is 0 and constant_up is 1) to obtain RC_Rainfall.
(4) Nonlinear time-lag processing is performed on Rainfall using formula 1 (where N is 16 and constant is 3) to obtain C_Rainfall.
(5) Grouping entropy processing is performed on Rainfall using formula 9 (where nums is 16) to obtain BE_Rainfall.
Fifth, features are extracted from the relative humidity (RH):
(1) Wavelet transform processing is performed on RH using formula 3 (where D is 2) to obtain CWT_RH.
(2) High-value statistical processing is performed on RH using formula 6 (where constant is 68) to obtain CA_RH.
(3) Quantile processing is performed on RH using formula 5 (where constant is 48) to obtain Q_RH.
(4) Absolute entropy processing is performed on RH using formula 2 (where N is 16) to obtain AE_RH.
(5) Grouping entropy processing is performed on RH using formula 9 (where nums is 16) to obtain BE_RH.
For pixels belonging to grassland, features are extracted from the combustible moisture content, wind speed, temperature, rainfall and relative humidity as follows:
First, features are extracted from the combustible moisture content:
(1) Quantile processing is performed on the combustible moisture content using formula 5 (where constant is 37) to obtain Q_FMC.
(2) Nonlinear time-lag processing is performed on the combustible moisture content using formula 1 (where N is 16 and constant is 2) to obtain C_FMC.
(3) Range statistical processing is performed on the combustible moisture content using formula 8 (where constant_low is 200 and constant_up is 240) to obtain RC_FMC.
(4) Grouping entropy processing is performed on the combustible moisture content using formula 9 (where nums is 16) to obtain BE_FMC.
(5) Absolute entropy processing is performed on the combustible moisture content using formula 2 (where N is 16) to obtain AE_FMC.
Second, features are extracted from WS:
(1) Low-value statistical processing is performed on WS using formula 4 (where constant is 1.8) to obtain CB_WS.
(2) Quantile processing is performed on WS using formula 5 (where constant is 81) to obtain Q_WS.
(3) Grouping entropy processing is performed on WS using formula 9 (where nums is 16) to obtain BE_WS.
(4) Nonlinear time-lag processing is performed on WS using formula 1 (where N is 16 and constant is 2) to obtain C_WS.
(5) High-value statistical processing is performed on WS using formula 6 (where constant is 3.8) to obtain CA_WS.
Third, features are extracted from T:
(1) Nonlinear time-lag processing is performed on T using formula 1 (where N is 16 and constant is 0) to obtain C_T.
(2) Wavelet transform processing is performed on T using formula 3 (where D is 4) to obtain CWT_T.
(3) Value statistical processing is performed on T using formula 7 (where constant is 287) to obtain VC_T.
(4) High-value statistical processing is performed on T using formula 6 (where constant is 302) to obtain CA_T.
(5) Low-value statistical processing is performed on T using formula 4 (where constant is 290) to obtain CB_T.
Fourth, features are extracted from Rainfall:
(1) Grouping entropy processing is performed on Rainfall using formula 9 (where nums is 16) to obtain BE_Rainfall.
(2) Value statistical processing is performed on Rainfall using formula 7 (where constant is 0) to obtain VC_Rainfall.
(3) Wavelet transform processing is performed on Rainfall using formula 3 (where D is 1) to obtain CWT_Rainfall.
(4) Quantile processing is performed on Rainfall using formula 5 (where constant is 44) to obtain Q_Rainfall.
(5) Absolute entropy processing is performed on Rainfall using formula 2 (where N is 16) to obtain AE_Rainfall.
Fifth, features are extracted from RH:
(1) Absolute entropy processing is performed on RH using formula 2 (where N is 16) to obtain AE_RH.
(2) High-value statistical processing is performed on RH using formula 6 (where constant is 53) to obtain CA_RH.
(3) Value statistical processing is performed on RH using formula 7 (where constant is 47) to obtain VC_RH.
(4) Low-value statistical processing is performed on RH using formula 4 (where constant is 50) to obtain CB_RH.
(5) Nonlinear time-lag processing is performed on RH using formula 1 (where N is 16 and constant is 4) to obtain C_RH.
(3) Model training and parameter tuning
The XGBoost model was finally selected through comparison of a series of models, and the wildfire historical database was divided into a database with time dimension feature enhancement and a database without time dimension feature enhancement. The data are input into the model, and the optimal parameters are selected using grid search and cross-validation. The optimal model parameters are as follows: scale is 0, learning_rate is 0.3, min_child_weight is 1, max_depth is 6, gamma is 0, subsample is 1, max_delta_step is 0, colsample_bytree is 1, reg_lambda is 1, n_estimators is 200, and seed is 1000.
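The reported optimum and the idea of a grid search can be written down as plain data. In the sketch below, `xgb_params` repeats the values listed above using the standard XGBoost spelling `colsample_bytree`; the entry given as "scale" is left out because the text does not make clear which parameter it refers to. The candidate grid `search_space` and the helper `iter_grid` are hypothetical, since the grid actually searched is not stated.

```python
from itertools import product

# Reported optimal parameters (the unclear "scale" entry is intentionally omitted).
xgb_params = {
    "learning_rate": 0.3,
    "min_child_weight": 1,
    "max_depth": 6,
    "gamma": 0,
    "subsample": 1,
    "max_delta_step": 0,
    "colsample_bytree": 1,
    "reg_lambda": 1,
    "n_estimators": 200,
    "seed": 1000,
}

# Hypothetical candidate grid; the real search space is not given in the text.
search_space = {
    "max_depth": [4, 6, 8],
    "learning_rate": [0.1, 0.3],
    "n_estimators": [100, 200],
}


def iter_grid(space):
    """Yield every combination of candidate values as a parameter dictionary."""
    keys = list(space)
    for values in product(*(space[key] for key in keys)):
        yield dict(zip(keys, values))


candidates = list(iter_grid(search_space))
```

In a grid search, each candidate would be scored by cross-validation on the training data and the best-scoring combination kept; the same procedure would be run once on the enhanced database and once on the unenhanced one.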
(4) Comparative analysis of model results
Finally, the models were compared and analyzed. The ROC curves and AUC values of the XGBoost models were compared first (see fig. 3). Graph (a) shows that the training set AUC values of the grassland model and the forest model with time dimension feature enhancement reach 0.99 and 0.98, while those of the grassland model and the forest model without time dimension feature enhancement reach 0.99 and 0.95. With or without time dimension feature enhancement, the training set accuracy therefore remains relatively close. Graph (b), however, shows that the test set AUC values of the grassland model and the forest model with time dimension feature enhancement reach 0.96 and 0.94, while those of the grassland model and the forest model without time dimension feature enhancement reach only 0.92 and 0.88. The grassland and forest models with time dimension feature enhancement still maintain high AUC values, whereas the models without it change markedly. The models using time dimension feature enhancement are therefore more robust.
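For reference, the AUC used in this comparison is the probability that a randomly chosen fire pixel receives a higher predicted risk than a randomly chosen non-fire pixel. A minimal sketch of that definition follows; the function name `roc_auc` and the labels and scores are hypothetical, and the pairwise loop is meant for illustration rather than for large datasets.

```python
def roc_auc(labels, scores):
    """AUC as the share of (fire, non-fire) pairs ranked correctly; ties count as one half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes are required")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical labels (1 = fire pixel, 0 = non-fire pixel) and predicted risk scores.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
```

Computing this value separately on the training set and on the test set gives the pairs of numbers compared in fig. 3.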
Next, the model results for the western Sichuan region were analyzed (see fig. 4). Map (a) shows the western Sichuan risk map for March 30, 2020 with time dimension feature enhancement, and map (b) shows the western Sichuan risk map for March 30, 2020 without time dimension feature enhancement. Comparing map (a) with map (b) shows that the risk map with time dimension feature enhancement displays the risk more accurately, with fewer fragmented high-risk areas and fewer false-alarm areas; fires occurred in Xichang and Muli on that day. Map (c) shows the western Sichuan risk map for June 1, 2020 with time dimension feature enhancement, and map (d) shows the western Sichuan risk map for June 1, 2020 without time dimension feature enhancement. June marks the start of the rainy season in Sichuan, when the fire risk should fall sharply or approach 0. Comparing map (c) with map (d) shows that the risk map with time dimension feature enhancement is closer to reality, with most areas at low risk, whereas the risk map without time dimension feature enhancement shows higher risk, with Panzhihua City even at medium to high risk, which leads to a wrong judgment of the fire risk.
Finally, the model with time dimension feature enhancement and the model without it were compared for the 2020 fire events in Muli and Xichang (see FIG. 5). Graph (a) shows the long time series of fire risk in the Lushan area of Xichang City (the area burned in the 2020 Xichang fire) with time dimension feature enhancement, and graph (b) shows the same series without time dimension feature enhancement. Graph (c) shows the long time series of fire risk in Qiaowa Town, Muli County (the area burned in the 2020 Muli fire) with time dimension feature enhancement, and graph (d) shows the same series without time dimension feature enhancement. According to the mechanism of fire occurrence, the fire risk during the fire season should fluctuate while rising. Graphs (a) and (c) show exactly such a fluctuating, steadily rising trend: the fire season in western Sichuan is concentrated mainly in January to March, and March has the largest number of fires, so the fire risk is naturally higher then. Graphs (b) and (d), by contrast, stay at a high value with almost no fluctuation, so the fire early warning is distorted and can easily lead to wrong fire forecasts.
The result analysis shows that the model with time dimension feature enhancement is more consistent with the facts, better suited to the western Sichuan region, and more robust than the model without time dimension feature enhancement.