Encoder-Decoder-based long-term traffic flow prediction method
1. A long-term traffic flow prediction method based on Encoder-Decoder is characterized in that: the method comprises the following steps:
step 1) acquiring traffic flow data, calculating a Pearson correlation coefficient between the traffic flow data of different road sections and the traffic flow data of a target road section, and selecting the traffic flow data of a plurality of road sections with the correlation coefficient higher than a threshold value as input data;
step 2) carrying out standardization processing on the input data, and constructing a training set, a verification set and a test set by using the processed input data set;
step 3) constructing a long-term traffic flow prediction model based on the Encoder-Decoder, and determining basic structure parameters of the long-term traffic flow prediction model;
step 4) training the Encoder-Decoder-based long-term traffic flow prediction model by using the training set constructed in step 2), and predicting the traffic flow data of the next day of the target road section.
2. The Encoder-Decoder-based long-term traffic flow prediction method according to claim 1, characterized in that: in the step 1), the Pearson correlation coefficient R_i between the traffic flow data of road section i and the traffic flow data of the target road section O is calculated by the following formula:
R_i = Σ_{j=1}^{n} (T_ij − T̄_i)(T_Oj − T̄_O) / √( Σ_{j=1}^{n} (T_ij − T̄_i)² · Σ_{j=1}^{n} (T_Oj − T̄_O)² )
where n denotes the total length of the traffic flow data sequence, T_ij and T_Oj respectively denote the traffic flow data of road section i and of the target road section O in the j-th time interval, and T̄_i and T̄_O are respectively the means of the traffic flow data of road section i and of the target road section O.
3. The Encoder-Decoder-based long-term traffic flow prediction method according to claim 1, wherein: in the step 2), the traffic flow data is standardized by the Z-score standardization method, and T* is calculated by the following formula:
T_j* = (T_j − T̄) / T_σ
where T_j is the traffic flow data of the j-th time interval of the original traffic flow sequence, T̄ is the mean of the original traffic flow sequence, and T_σ is the standard deviation of the original traffic flow sequence, calculated by the following formula:
T_σ = √( (1/n) Σ_{j=1}^{n} (T_j − T̄)² )
The standardized data set is:
T* = { (T_1*, T_2*, …, T_i*)_1, (T_1*, T_2*, …, T_i*)_2, …, (T_1*, T_2*, …, T_i*)_n }
where (T_1*, T_2*, …, T_i*)_j denotes the Z-score-standardized traffic flow data of road sections 1 to i at the j-th time interval.
4. The Encoder-Decoder-based long-term traffic flow prediction method according to claim 1, wherein: in the step 3), the Encoder part of the constructed Encoder-Decoder-based long-term traffic flow prediction model consists of GRUs, and the calculation in each GRU structure is as follows:
reset gate: r_t = σ(Y_r · [h_{t-1}, X_t] + a_r);
update gate: u_t = σ(Y_u · [h_{t-1}, X_t] + a_u);
cell state: h̃_t = tanh(Y_h · [r_t ⊙ h_{t-1}, X_t] + a_h);
current state: h_t = (1 − u_t) ⊙ h_{t-1} + u_t ⊙ h̃_t;
output: y_t = σ(Y_y * h_t + a_y);
where Y_r, Y_u, Y_h, Y_y are the parameter matrices of the GRU, whose initial values are random matrices in the range [−0.1, 0.1]; a_r, a_u, a_h, a_y are the biases of the GRU, whose initial values are zero vectors; ⊙ denotes the Hadamard product; r_t and u_t are respectively the reset gate and the update gate of the GRU, each of which outputs a number between 0 and 1 through the sigmoid function to control how far the gate is opened or closed, thereby controlling the input quantity, the retention of the previous state and the output quantity of the GRU state sequence h;
the input of the Encoder part is the standardized training set T_train, and its output is the system state sequence h.
5. The Encoder-Decoder-based long-term traffic flow prediction method according to claim 1, wherein: in the step 3), the coding vector sequence C of the constructed Encoder-Decoder-based long-term traffic flow prediction model is calculated by a soft attention mechanism in the following way:
α_j = softmax(relu(Y_2 * (relu(Y_1 * h + a_1)) + a_2))
C_j = α_j * h, j ∈ [1, n]
where α_j is the attention coefficient of the coding vector C_j at the j-th time instant, h is the system state sequence output by the Encoder part, and the attention coefficients are calculated by a multilayer fully connected network; Y_2 and Y_1 are the weights of this fully connected network, a_1 and a_2 are its biases, and the attention coefficients are normalized by the softmax function so that they sum to 1 (Σ α = 1).
6. The Encoder-Decoder-based long-term traffic flow prediction method according to claim 5, wherein: in the step 3), the Decoder part of the constructed Encoder-Decoder-based long-term traffic flow prediction model consists of LSTMs, and the calculation in each LSTM structure is as follows:
cell state: C̃_t = tanh(Y_C · [h_{t-1}, X_t] + a_C);
input gate: i_t = σ(Y_i · [h_{t-1}, X_t] + a_i);
forget gate: f_t = σ(Y_f · [h_{t-1}, X_t] + a_f);
output gate: o_t = σ(Y_o · [h_{t-1}, X_t] + a_o);
current state: S_t = f_t ⊙ S_{t-1} + i_t ⊙ C̃_t;
output: p_t = tanh(S_t) ⊙ o_t;
where Y_C, Y_i, Y_f, Y_o are the parameter matrices of the LSTM, whose initial values are random matrices in the range [−0.1, 0.1]; a_C, a_i, a_f, a_o are the biases of the LSTM, whose initial values are zero vectors; ⊙ denotes the Hadamard product; i_t, f_t, o_t are respectively the input gate, the forget gate and the output gate, the 3 gates of the LSTM, each of which outputs a number between 0 and 1 through the sigmoid function to control how far the gate is opened or closed, thereby controlling the input quantity, the retention of the previous state and the output quantity of the LSTM state H;
the input of the Decoder part is the coding vector sequence C, which is the output of the soft attention mechanism, and its output is the prediction result obtained by mapping the LSTM output through a single-layer fully connected network, whose expression is as follows:
result = relu(Y_result * p + a_result)
where p is the output of the LSTM, Y_result is the parameter matrix of the fully connected layer, and a_result is the bias of the fully connected layer.
7. The Encoder-Decoder-based long-term traffic flow prediction method according to claim 6, wherein: in the step 4), the traffic flow data of the next day of the target road section is predicted by the Encoder-Decoder-based long-term traffic flow prediction model trained on the training set, and the specific steps are as follows:
step 4-1: the traffic flow data of the previous day of the relevant road sections in the training set T_train is used as the input of the Encoder-Decoder-based long-term traffic flow prediction model, and the corresponding actual output is obtained;
step 4-2: the error between the actual output value and the expected output value of the traffic flow is calculated by the mean square error E = (1/m) Σ_{k=1}^{m} (ŷ_k − y_k)², where ŷ_k and y_k denote the actual and expected outputs and m is the number of data in the training set; the error of each run is propagated to each neuron of the model by the back-propagation algorithm, and each connection weight is then updated by the Adam algorithm;
step 4-3: the number of iterations is set, the connection weights of the network are continuously updated during the iterations, and in the iteration process the network is used to predict the data of the verification set T_verify;
step 4-4: the parameters that perform best on the verification set T_verify during the iteration process are saved as the final parameters of the model;
step 4-5: the traffic flow data of the test set T_test is input into the Encoder-Decoder-based long-term traffic flow prediction model, and the predicted traffic flow data output by the model is used to test the model.
Background
In recent years, with the rapid development of the national economy, per-capita vehicle ownership has kept increasing, the pressure on urban road networks has grown, and congestion problems have appeared one after another in every large city in China, seriously affecting the daily life and travel of urban residents. Reasonable road planning, reasonable arrangement of traffic facilities and timely traffic guidance have therefore become primary tasks of road traffic departments and transportation control and management centers. High-precision real-time traffic flow prediction is the key: it not only provides a theoretical basis and data support for traffic flow guidance and diversion, but also helps travelers make better trip decisions, relieves traffic congestion, reduces carbon emissions and improves traffic operation efficiency.
According to the prediction period, traffic flow prediction is divided into short-term and long-term traffic flow prediction. Short-term traffic flow prediction refers to prediction with a period of less than 1 hour, and is mainly applied to real-time traffic control, such as automatic timing of traffic lights and vehicle navigation. Long-term traffic flow prediction refers to prediction with a period of hours, a day or even longer; it helps managers make decisions, take measures and plan ahead as early as possible, and plays a positive role in improving traffic management and service quality. Compared with short-term prediction, long-term prediction provides a useful reference for making better use of limited traffic management resources and helps travelers plan in advance to avoid congestion. High-precision long-term prediction therefore provides more data support for decision making and is of great significance for scientific management. However, most current traffic flow prediction schemes target short-term prediction, and an effective long-term traffic flow prediction scheme is lacking.
The Encoder-Decoder is a typical sequence-to-sequence model framework that was initially used mainly in fields such as machine translation and natural language processing. Because its prediction result is itself a sequence, in sequence-output problems such as long-term traffic flow prediction it does not suffer from the rapid degradation of prediction accuracy with increasing prediction horizon that afflicts LSTM multi-step prediction models.
Disclosure of Invention
The purpose of the invention is as follows: aiming at the above problems, the invention provides a long-term traffic flow prediction method based on an Encoder-Decoder. On the basis of the Encoder-Decoder model, the method introduces a soft attention mechanism to dynamically adjust the value of the coding vector C, thereby improving the long-term memory capability of the Encoder-Decoder model and improving the prediction accuracy.
The technical solution is as follows: a long-term traffic flow prediction method based on an Encoder-Decoder comprises the following steps:
step 1) acquiring traffic flow data, calculating a Pearson correlation coefficient between the traffic flow data of different road sections and the traffic flow data of a target road section, and selecting the traffic flow data of a plurality of road sections with the correlation coefficient higher than a threshold value as input data;
step 2) carrying out standardization processing on the input data, and constructing a training set, a verification set and a test set by using the processed input data set;
step 3) constructing a long-term traffic flow prediction model based on the Encoder-Decoder, and determining basic structure parameters of the long-term traffic flow prediction model;
step 4) training the Encoder-Decoder-based long-term traffic flow prediction model by using the training set constructed in step 2), and predicting the traffic flow data of the next day of the target road section.
Further, in the step 1, the Pearson correlation coefficient R_i between the traffic flow data of road section i and the traffic flow data of the target road section O is calculated by the following formula:
R_i = Σ_{j=1}^{n} (T_ij − T̄_i)(T_Oj − T̄_O) / √( Σ_{j=1}^{n} (T_ij − T̄_i)² · Σ_{j=1}^{n} (T_Oj − T̄_O)² )
where n denotes the total length of the traffic flow data sequence, T_ij and T_Oj respectively denote the traffic flow data of road section i and of the target road section O in the j-th time interval, and T̄_i and T̄_O are respectively the means of the traffic flow data of road section i and of the target road section O.
Further, in the step 2, the traffic flow data is standardized by the Z-score standardization method, and T* is calculated by the following formula:
T_j* = (T_j − T̄) / T_σ
where T_j is the traffic flow data of the j-th time interval of the original traffic flow sequence, T̄ is the mean of the original traffic flow sequence, and T_σ is the standard deviation of the original traffic flow sequence, calculated by the following formula:
T_σ = √( (1/n) Σ_{j=1}^{n} (T_j − T̄)² )
The standardized data set is:
T* = { (T_1*, T_2*, …, T_i*)_1, (T_1*, T_2*, …, T_i*)_2, …, (T_1*, T_2*, …, T_i*)_n }
where (T_1*, T_2*, …, T_i*)_j denotes the Z-score-standardized traffic flow data of road sections 1 to i at the j-th time interval.
Further, in the step 3, the Encoder part of the constructed Encoder-Decoder-based long-term traffic flow prediction model is composed of GRUs, and the calculation in each GRU structure is as follows:
reset gate: r_t = σ(Y_r · [h_{t-1}, X_t] + a_r);
update gate: u_t = σ(Y_u · [h_{t-1}, X_t] + a_u);
cell state: h̃_t = tanh(Y_h · [r_t ⊙ h_{t-1}, X_t] + a_h);
current state: h_t = (1 − u_t) ⊙ h_{t-1} + u_t ⊙ h̃_t;
output: y_t = σ(Y_y * h_t + a_y);
where Y_r, Y_u, Y_h, Y_y are the parameter matrices of the GRU, whose initial values are random matrices in the range [−0.1, 0.1]; a_r, a_u, a_h, a_y are the biases of the GRU, whose initial values are zero vectors; ⊙ denotes the Hadamard product; r_t and u_t are respectively the reset gate and the update gate of the GRU, each of which outputs a number between 0 and 1 through the sigmoid function to control how far the gate is opened or closed, thereby controlling the input quantity, the retention of the previous state and the output quantity of the system state h.
The input of the Encoder part is the standardized training set T_train, and its output is the system state sequence h.
Further, in the step 3, the coding vector sequence C of the constructed Encoder-Decoder-based long-term traffic flow prediction model is calculated by a soft attention mechanism in the following way:
α_j = softmax(relu(Y_2 * (relu(Y_1 * h + a_1)) + a_2))
C_j = α_j * h, j ∈ [1, n]
where α_j is the attention coefficient of the coding vector C_j at the j-th time instant, h is the system state sequence output by the Encoder part, and the attention coefficients are calculated by a multilayer fully connected network; Y_2 and Y_1 are the weights of this fully connected network, a_1 and a_2 are its biases, and the attention coefficients are normalized by the softmax function so that they sum to 1 (Σ α = 1).
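For illustration, a minimal PyTorch sketch of this soft attention mechanism is given below; it follows one reading of C_j = α_j * h (each state scaled by its own coefficient), and the layer sizes are assumptions rather than part of the technical solution.

```python
import torch
import torch.nn as nn

class SoftAttention(nn.Module):
    """Two-layer fully connected scoring network with relu activations, followed by a
    softmax over time; the coding vectors are obtained as C_j = alpha_j * h_j."""
    def __init__(self, hidden_size: int = 64, attn_size: int = 32):
        super().__init__()
        self.score = nn.Sequential(
            nn.Linear(hidden_size, attn_size), nn.ReLU(),
            nn.Linear(attn_size, 1), nn.ReLU(),
        )

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (batch, seq_len, hidden_size)
        alpha = torch.softmax(self.score(h), dim=1)  # (batch, seq_len, 1), sums to 1 over time
        return alpha * h                             # coding vector sequence C
```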
Further, in the step 3, the Decoder part of the constructed Encoder-Decoder-based long-term traffic flow prediction model is composed of LSTMs, and the calculation in each LSTM structure is as follows:
cell state: C̃_t = tanh(Y_C · [h_{t-1}, X_t] + a_C);
input gate: i_t = σ(Y_i · [h_{t-1}, X_t] + a_i);
forget gate: f_t = σ(Y_f · [h_{t-1}, X_t] + a_f);
output gate: o_t = σ(Y_o · [h_{t-1}, X_t] + a_o);
current state: S_t = f_t ⊙ S_{t-1} + i_t ⊙ C̃_t;
output: p_t = tanh(S_t) ⊙ o_t;
where Y_C, Y_i, Y_f, Y_o are the parameter matrices of the LSTM, whose initial values are random matrices in the range [−0.1, 0.1]; a_C, a_i, a_f, a_o are the biases of the LSTM, whose initial values are zero vectors; ⊙ denotes the Hadamard product; i_t, f_t, o_t are respectively the input gate, the forget gate and the output gate, the 3 gates of the LSTM, each of which outputs a number between 0 and 1 through the sigmoid function to control how far the gate is opened or closed, thereby controlling the input quantity, the retention of the previous state and the output quantity of the LSTM state H.
The input of the Decoder part is the coding vector sequence C, which is the output of the soft attention mechanism, and its output is the prediction result obtained by mapping the LSTM output through a single-layer fully connected network, whose expression is as follows:
result = relu(Y_result * p + a_result)
where p is the output of the LSTM, Y_result is the parameter matrix of the fully connected layer, and a_result is the bias of the fully connected layer.
Further, in the step 4, the Encoder-Decoder-based long-term traffic flow prediction model trained on the training set is used to predict the traffic flow data of the next day of the target road section, and the specific steps are as follows:
step 4-1: the traffic flow data of the previous day of the relevant road sections in the training set T_train is used as input to the Encoder-Decoder-based long-term traffic flow prediction model, and the corresponding actual output is obtained from the model;
step 4-2: the error between the actual output value and the expected output value of the traffic flow is calculated by the mean square error E = (1/m) Σ_{k=1}^{m} (ŷ_k − y_k)², where ŷ_k and y_k denote the actual and expected outputs and m is the number of data in the training set; the error of each run is propagated to each neuron of the model by the back-propagation algorithm, and each connection weight is then updated by the Adam algorithm;
step 4-3: the number of iterations is set, the connection weights of the network are continuously updated during the iterations, and in the iteration process the network is used to predict the data of the verification set T_verify;
step 4-4: the parameters that perform best on the verification set T_verify during the iteration process are saved as the final parameters of the model.
Step 4-5: the traffic flow data of the test set T_test is input into the Encoder-Decoder-based long-term traffic flow prediction model, and the predicted traffic flow data output by the model is used to test the model.
Advantageous effects: the invention provides a long-term traffic flow prediction method based on an Encoder-Decoder, in which a soft attention mechanism is introduced into the Encoder-Decoder to dynamically adjust the value of the coding vector C, thereby improving the long-term memory capability of the Encoder-Decoder model and the prediction accuracy, so that long-term traffic flow prediction achieves higher accuracy and reliability.
Drawings
FIG. 1 is a schematic diagram illustrating the steps of the Encoder-Decoder-based long-term traffic flow prediction method of the present invention;
FIG. 2 is a training flow chart of the Encoder-Decoder-based long-term traffic flow prediction method of the present invention;
FIG. 3 is a schematic structural diagram of the Encoder-Decoder-based long-term traffic flow prediction method of the present invention;
FIG. 4 shows the test set data fitting results of one case of the Encoder-Decoder-based long-term traffic flow prediction model of the present invention.
Detailed Description
The technical solution of the invention is further explained in detail below with reference to the drawings of the specification.
As shown in fig. 1 to 4, a long-term traffic flow prediction method based on an Encoder-Decoder includes the following steps:
step 1) acquiring traffic flow data, calculating Pearson correlation coefficients between the traffic flow data of different road sections and the traffic flow data of the target road section, and selecting the traffic flow data of a plurality of road sections with correlation coefficients higher than a threshold value as input data;
In the step 1, the Pearson correlation coefficient R_i between the traffic flow data of road section i and the traffic flow data of the target road section O is calculated by the following formula:
R_i = Σ_{j=1}^{n} (T_ij − T̄_i)(T_Oj − T̄_O) / √( Σ_{j=1}^{n} (T_ij − T̄_i)² · Σ_{j=1}^{n} (T_Oj − T̄_O)² )
where n denotes the total length of the traffic flow data sequence, T_ij and T_Oj respectively denote the traffic flow data of road section i and of the target road section O in the j-th time interval, and T̄_i and T̄_O are respectively the means of the traffic flow data of road section i and of the target road section O.
The open-source data set adopted in this embodiment consists of traffic flow data at 5-minute intervals from the Digital Roadway Interactive Visualization and Evaluation Network (DRIVENET, http://uwdrive.net/STARLab) for Interstate 5 (I-5), covering December 1, 2015 to December 31, 2016. The road segment monitored by the sensor located 163.02 miles from Winghua (sensor 42) is taken as the target road segment, the correlation threshold is 0.95, and the correlation coefficients between the previous-day traffic flow data of the road segments monitored by the 6 sensors 39-44 and the traffic flow data of the target road segment are greater than the threshold.
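As an illustration only (not part of the claims), the correlation-based selection of step 1 can be sketched in Python as follows; the layout of the `flows` array (one row per monitored road section, aligned in time) and the helper names are assumptions of this sketch.

```python
import numpy as np

def pearson_r(x: np.ndarray, y: np.ndarray) -> float:
    """Pearson correlation coefficient between two traffic flow sequences."""
    xm, ym = x - x.mean(), y - y.mean()
    return float((xm * ym).sum() / np.sqrt((xm ** 2).sum() * (ym ** 2).sum()))

def select_sections(flows: np.ndarray, target_idx: int, threshold: float = 0.95) -> list:
    """flows: (num_sections, n) array, one row of flow data per monitored road section.
    Returns the indices of sections whose correlation with the target exceeds the threshold."""
    target = flows[target_idx]
    return [i for i in range(flows.shape[0])
            if i != target_idx and pearson_r(flows[i], target) > threshold]
```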
Step 2) carrying out standardization processing on the input data, and constructing a training set, a verification set and a test set from the processed input data;
In the step 2, the traffic flow data is standardized by the Z-score standardization method, and T* is calculated by the following formula:
T_j* = (T_j − T̄) / T_σ
where T_j is the traffic flow data of the j-th time interval of the original traffic flow sequence, T̄ is the mean of the original traffic flow sequence, and T_σ is the standard deviation of the original traffic flow sequence, calculated by the following formula:
T_σ = √( (1/n) Σ_{j=1}^{n} (T_j − T̄)² )
The standardized data set is:
T* = { (T_1*, T_2*, …, T_i*)_1, (T_1*, T_2*, …, T_i*)_2, …, (T_1*, T_2*, …, T_i*)_n }
where n represents the number of records of the raw data set T after Z-score standardization, and (T_1*, T_2*, …, T_i*)_j denotes the Z-score-standardized traffic flow data of road sections 1 to i at the j-th time interval. After standardization, the data set is divided into a training set, a verification set and a test set in proportion.
The standardization scheme adopted in the embodiment is Z-score standardization, and the standardized data set is divided into a training set, a verification set and a test set according to the proportion of 60%, 20% and 20%.
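A minimal Python sketch of this Z-score standardization and 60%/20%/20% chronological split is given below; the column-per-road-section layout is an assumption of the sketch.

```python
import numpy as np

def zscore(t: np.ndarray) -> np.ndarray:
    """Z-score standardization per road section (column): (T_j - mean) / std."""
    return (t - t.mean(axis=0)) / t.std(axis=0)

def split_dataset(t_star: np.ndarray, ratios=(0.6, 0.2, 0.2)):
    """Split the standardized records chronologically into training, verification and test sets."""
    n = len(t_star)
    n_train = int(n * ratios[0])
    n_verify = int(n * ratios[1])
    return (t_star[:n_train],                    # T_train
            t_star[n_train:n_train + n_verify],  # T_verify
            t_star[n_train + n_verify:])         # T_test
```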
Step 3) constructing a long-term traffic flow prediction model based on the Encoder-Decoder, and determining basic structure parameters of the long-term traffic flow prediction model;
In the step 3, the constructed Encoder-Decoder-based long-term traffic flow prediction model is divided into an Encoder part, a Decoder part and a coding-vector calculation part. The Encoder part comprises GRUs whose number is consistent with the length of the input sequence; since this embodiment predicts daily traffic flow from a 5-minute-interval data set, the input sequence length is 288, and it can be adjusted as required in actual use. The calculation in each GRU structure is as follows:
reset gate: r_t = σ(Y_r · [h_{t-1}, X_t] + a_r);
update gate: u_t = σ(Y_u · [h_{t-1}, X_t] + a_u);
cell state: h̃_t = tanh(Y_h · [r_t ⊙ h_{t-1}, X_t] + a_h);
current state: h_t = (1 − u_t) ⊙ h_{t-1} + u_t ⊙ h̃_t;
output: y_t = σ(Y_y * h_t + a_y);
where Y_r, Y_u, Y_h, Y_y are the parameter matrices of the GRU, whose initial values are random matrices in the range [−0.1, 0.1]; a_r, a_u, a_h, a_y are the biases of the GRU, whose initial values are zero vectors; ⊙ denotes the Hadamard product; r_t and u_t are respectively the reset gate and the update gate of the GRU, each of which outputs a number between 0 and 1 through the sigmoid function to control how far the gate is opened or closed, thereby controlling the input quantity, the retention of the previous state and the output quantity of the system state h.
The input of the Encoder part is the standardized training set T_train, and its output is the system state sequence h.
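As a non-authoritative sketch of the Encoder part, the following uses PyTorch's built-in GRU in place of the hand-written gate equations above; the hidden size of 64 and the batch-first tensor layout are assumptions.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Minimal GRU encoder: consumes the standardized input sequence (e.g. 288 five-minute
    intervals of the selected road sections) and returns the state sequence h."""
    def __init__(self, n_sections: int, hidden_size: int = 64):
        super().__init__()
        self.gru = nn.GRU(input_size=n_sections, hidden_size=hidden_size, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 288, n_sections) -> h: (batch, 288, hidden_size)
        h, _ = self.gru(x)
        return h
```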
The coding vector sequence C of the model is calculated by the soft attention mechanism in the following way:
α_j = softmax(relu(Y_2 * (relu(Y_1 * h + a_1)) + a_2))
C_j = α_j * h, j ∈ [1, n]
where α_j is the attention coefficient of the coding vector C_j at the j-th time instant, h is the system state sequence output by the Encoder part, and the attention coefficients are calculated by a multilayer fully connected network; Y_2 and Y_1 are the weights of this fully connected network, a_1 and a_2 are its biases, and the attention coefficients are normalized by the softmax function so that they sum to 1 (Σ α = 1). The basic unit of the Decoder part is the LSTM, and the calculation in each LSTM structure is as follows:
cell state: C̃_t = tanh(Y_C · [h_{t-1}, X_t] + a_C);
input gate: i_t = σ(Y_i · [h_{t-1}, X_t] + a_i);
forget gate: f_t = σ(Y_f · [h_{t-1}, X_t] + a_f);
output gate: o_t = σ(Y_o · [h_{t-1}, X_t] + a_o);
current state: S_t = f_t ⊙ S_{t-1} + i_t ⊙ C̃_t;
output: p_t = tanh(S_t) ⊙ o_t;
where Y_C, Y_i, Y_f, Y_o are the parameter matrices of the LSTM, whose initial values are random matrices in the range [−0.1, 0.1]; a_C, a_i, a_f, a_o are the biases of the LSTM, whose initial values are zero vectors; ⊙ denotes the Hadamard product; i_t, f_t, o_t are respectively the input gate, the forget gate and the output gate, the 3 gates of the LSTM, each of which outputs a number between 0 and 1 through the sigmoid function to control how far the gate is opened or closed, thereby controlling the input quantity, the retention of the previous state and the output quantity of the LSTM state H.
The input of the Decoder part is the coding vector sequence C, which is the output of the soft attention mechanism, and its output is the prediction result obtained by mapping the LSTM output through a single-layer fully connected network, whose expression is as follows:
result = relu(Y_result * p + a_result)
where p is the output of the LSTM, Y_result is the parameter matrix of the fully connected layer, and a_result is the bias of the fully connected layer.
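A corresponding non-authoritative sketch of the Decoder part, using PyTorch's built-in LSTM instead of the explicit gate equations and keeping the relu-mapped single-layer fully connected output, is shown below; the hidden size and the single-value output are assumptions that mirror the Encoder sketch.

```python
import torch
import torch.nn as nn

class Decoder(nn.Module):
    """Minimal LSTM decoder: reads the coding vector sequence C and maps the LSTM output p
    through a single fully connected layer, i.e. result = relu(Y_result * p + a_result)."""
    def __init__(self, hidden_size: int = 64, out_size: int = 1):
        super().__init__()
        self.lstm = nn.LSTM(input_size=hidden_size, hidden_size=hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, out_size)

    def forward(self, c: torch.Tensor) -> torch.Tensor:
        # c: (batch, seq_len, hidden_size) -> result: (batch, seq_len, out_size)
        p, _ = self.lstm(c)
        return torch.relu(self.fc(p))
```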
Step 4) predicting the traffic flow data of the next day of the target road section by using the constructed Encoder-Decoder-based long-term traffic flow prediction model trained on the constructed training set.
In the step 4, the Encoder-Decoder-based long-term traffic flow prediction model trained on the training set is used to predict the traffic flow data of the next day of the target road section; the specific steps are as follows (a training-loop sketch is given after step 4-5):
Step 4-1: the traffic flow data of the previous day of the relevant road sections in the training set T_train is used as input to the Encoder-Decoder-based long-term traffic flow prediction model, and the corresponding actual output is obtained from the model.
Step 4-2: the error between the actual output value and the expected output value of the traffic flow is calculated by the mean square error E = (1/m) Σ_{k=1}^{m} (ŷ_k − y_k)², where ŷ_k and y_k denote the actual and expected outputs and m is the number of samples in the training set; the error of each run is propagated to each neuron of the model by the back-propagation algorithm, and each connection weight is then updated by the Adam algorithm.
Step 4-3: the number of iterations is set, the connection weights of the network are continuously updated during the iterations, and in the iteration process the network is used to predict the data of the verification set T_verify.
The number of iterations is set to 1000 in the example.
Step 4-4: the parameters that perform best on the verification set T_verify during the iteration process are saved as the final parameters of the model.
In this embodiment, the final parameters are those obtained after the 978th training update.
Step 4-5: the traffic flow data of the test set T_test is input into the Encoder-Decoder-based long-term traffic flow prediction model, and the predicted traffic flow data output by the model is used to test the model.
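The training loop of steps 4-1 to 4-4 can be sketched as follows, assuming `model` composes the Encoder, soft attention and Decoder described above and that the tensors are already shaped (batch, sequence length, features); the learning rate and the in-memory checkpointing are assumptions of the sketch, not part of the embodiment.

```python
import torch
import torch.nn as nn

def train(model, train_x, train_y, verify_x, verify_y, n_iters: int = 1000, lr: float = 1e-3):
    """Steps 4-1 to 4-4: MSE loss, back-propagation, Adam updates, and keeping the
    parameters that perform best on the verification set."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.MSELoss()
    best_err, best_state = float("inf"), None
    for _ in range(n_iters):
        model.train()
        optimizer.zero_grad()
        loss = criterion(model(train_x), train_y)  # error between actual and expected output
        loss.backward()                            # back-propagate the error to the neurons
        optimizer.step()                           # update the connection weights with Adam
        model.eval()
        with torch.no_grad():                      # predict the verification set in each iteration
            verify_err = criterion(model(verify_x), verify_y).item()
        if verify_err < best_err:                  # keep the best-performing parameters
            best_err = verify_err
            best_state = {k: v.clone() for k, v in model.state_dict().items()}
    model.load_state_dict(best_state)
    return model
```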
In the test results of this embodiment, the prediction results for November 25, 2016 (a working day) and November 27, 2016 (a weekend day) are shown in Fig. 4, where the mean absolute error (MAE) of the test results is 25 and the mean absolute percentage error (MAPE) is 9.13%, indicating relatively accurate predictions.
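For reference, MAE and MAPE over a test set can be computed as in the generic sketch below (not the evaluation code of this embodiment); time intervals with zero flow would need special handling in MAPE.

```python
import numpy as np

def mae(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Mean absolute error of the test-set predictions."""
    return float(np.mean(np.abs(y_true - y_pred)))

def mape(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Mean absolute percentage error, in percent."""
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)
```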
The invention provides a long-term traffic flow prediction method based on an Encoder-Decoder, wherein a soft attention mechanism is introduced into the Encoder-Decoder to dynamically adjust the numerical value of a coding vector C, so that the long-term memory capability of an Encoder-Decoder model is improved, the prediction precision is improved, and the long-term traffic flow prediction has higher accuracy and reliability.
The above description merely illustrates the present invention and should not be construed as limiting its scope, which is defined by the appended claims.