Data optimization method and system for ADCP
1. A method of data optimization for ADCP, the method comprising:
measuring flow velocity and flow data of a water body by using an acoustic Doppler current profiler (ADCP), calibrating the measured data, and storing the calibrated data in a database;
performing abnormal value identification on the stored measurement data by using an abnormal value identification algorithm, and removing the abnormal values;
performing abnormal value interpolation by using an interpolation algorithm combined with a random forest to obtain a measurement data sequence after abnormal value processing;
and smoothing the measurement data sequence by using a moving average method and a median filtering method, and taking the smoothed measurement data sequence as the ADCP test data of the water body.
2. The data optimization method for ADCP of claim 1, wherein the measuring flow velocity and flow data of the water body by using an acoustic Doppler current profiler comprises:
1) the ADCP transmits an acoustic signal of frequency $f_0$ into the water body; as the signal propagates in the water, part of its energy is backscattered by scatterers moving with the water, forming an echo signal of frequency $f_1$;
2) the ADCP receives the echo signal of frequency $f_1$ and calculates the water flow velocity $v$:
$$v = \frac{c\,(f_1 - f_0)}{2 f_0}$$
wherein:
$c$ represents the propagation speed of sound waves in water;
3) according to the time $t$ at which the ADCP receives the echo signal, calculating the depth $H$ of the water body:
$$H = h + \frac{c\,t}{2}\cos\theta$$
wherein:
$h$ represents the depth at which the ADCP is placed under the water;
$t$ represents the time interval from the ADCP sending the acoustic signal to receiving the echo signal;
$c$ represents the propagation speed of sound waves in water;
$\theta$ represents the beam angle of the ADCP;
4) measuring the width $W$ of the water body and calculating the cross-sectional area of the water body $S = H \times W$; calculating the flow $Q$ of the water body:
$$Q = v \cdot S$$
wherein:
$v$ represents the flow velocity of the water body.
3. The data optimization method for ADCP of claim 2, wherein the calibrating the measured data comprises:
receiving the acoustic signal of frequency $f_0$ transmitted by the ADCP, converting the acoustic signal into an electrical signal and acquiring it by A/D conversion; resampling and interpolating the acquired signal, and truncating, extending in the time domain, and attenuating it; after a set time delay, performing D/A conversion and transmitting the signal back at the same rate as the A/D acquisition, thereby obtaining an acoustic signal with a given frequency offset and time delay; the relationship between the frequency $f'_0$ of the frequency-offset, delayed acoustic signal and the original frequency $f_0$ is:
$$f'_0 = f_0 \cdot \frac{f_p}{f_q}$$
wherein:
$f_p$ is the signal acquisition frequency;
$f_q$ is the signal resampling frequency;
and transmitting the frequency-offset, delayed acoustic signal so that it serves as the actual echo signal of the ADCP, the ADCP receiving this echo signal and measuring the flow velocity and flow data of the water body by the ADCP measurement flow, to obtain calibrated water body measurement data.
4. The data optimization method for ADCP of claim 3, wherein the performing abnormal value identification on the stored measurement data by using an abnormal value identification algorithm comprises:
for the measured values $x_i$ ($i = 1, 2, \dots, N$), calculating their mean value $\bar{x}$ and the residuals $v_i$:
$$\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad v_i = x_i - \bar{x}$$
if the measurements contain only random errors, the probability that a residual falls outside $(-3\sigma, 3\sigma)$ is only 0.27%; for the residual $v_i$ of any measured value $x_i$, if $|v_i| > 3\sigma$, the measured value $x_i$ is considered an abnormal value and is rejected; $\sigma$ is replaced by the estimate $S$ of the standard deviation of the finite number of measurements:
$$S = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N} v_i^2}$$
wherein:
$v_i$ represents the residual of the measured value $x_i$.
5. The data optimization method for ADCP of claim 4, wherein the performing abnormal value interpolation by using an interpolation algorithm combined with a random forest comprises:
1) extracting b sample sequence sets from the measurement data sequence by using the Bootstrap sampling algorithm, wherein each sample sequence set contains the missing values left after abnormal value removal;
2) for each sample sequence set, letting the sample sequence within the window of length n before the missing value be $R = \{x_1, x_2, \dots, x_n\}$; dividing R into m subspaces, wherein each subspace $R_m$ has a fixed regression output value $y_m$, and constructing a decision tree model:
$$f(x) = \sum_{m} y_m\, s(x \in R_m)$$
wherein:
$s$ is a pulse (indicator) function that equals 1 when $x \in R_m$ and 0 otherwise;
for the constructed decision tree model, letting the subspace corresponding to the current parent node be $R_m$, and dividing $R_m$ according to a threshold $T$ into two parts $R_l$ and $R_r$: $R_l = \{x_i < T\}$, $R_r = \{x_i \ge T\}$;
taking $R_l$ and $R_r$ respectively as parent nodes and splitting recursively until the variance of the samples in the current parent node is smaller than a given variance threshold; when this condition is met, stopping the recursion and setting the current parent node as a leaf node;
3) generating a random forest prediction interpolation model from the M constructed decision trees:
$$y_c = \operatorname{ave}\big(y(X_c, T_i)\big), \quad i = 1, 2, \dots, M$$
wherein:
$X_c$ is the sample sequence before the abnormal value c;
$T_i$ represents the i-th decision tree;
$y(X_c, T_i)$ represents the predicted value of the i-th decision tree for the abnormal value c;
$\operatorname{ave}(y(X_c, T_i))$ represents the mean of the predicted values of all decision trees for the abnormal value c;
and performing abnormal value interpolation on all abnormal values by using the random forest prediction interpolation model to obtain the measurement data sequence after abnormal value processing.
6. The data optimization method for ADCP of claim 5, wherein the flow of the moving average method is as follows:
sliding a window point by point over the N values of the measurement data sequence, and taking the weighted average of the $m = 2i + 1$ adjacent values in each window to obtain the smoothed measurement data sequence:
$$y'_c = \sum_{k=-i}^{i} w(k)\, y_{c+k}$$
wherein:
$y_c$ represents the c-th value in the measurement data sequence;
$y'_c$ represents the c-th value in the smoothed measurement data sequence;
$\sum_{k=-i}^{i}$ denotes summation over the window centered on $y_c$, that is, over $y_c$ and its $2i$ adjacent values;
$w(k)$ represents a weighting coefficient.
7. The data optimization method for ADCP of claim 6, wherein the smoothing of the measurement data sequence by the median filtering method comprises:
from the smoothed measurement data sequence $\{x_1, x_2, \dots, x_n\}$, successively extracting m values $\{x_{i-v}, \dots, x_{i-1}, x_i, \dots, x_{i+v}\}$, wherein $v = (m-1)/2$; sorting the m values by magnitude and selecting the middle value as the filtered data, according to the formula:
$$y_i = \operatorname{median}\{x_{i-v}, \dots, x_{i-1}, x_i, x_{i+1}, \dots, x_{i+v}\}$$
wherein:
$\operatorname{median}\{\cdot\}$ represents the median of the set $\{\cdot\}$;
the length m of the median filtering window is taken as 5 to 10;
and taking the smoothed measurement data sequence as the ADCP test data of the water body.
8. A data optimization system for ADCP, the system comprising:
a data acquisition device, configured to measure flow velocity and flow data of a water body by using an acoustic Doppler current profiler;
a data processor, configured to calibrate the measured data and store the calibrated data in a database;
an ADCP data optimization device, configured to perform abnormal value identification on the stored measurement data by using an abnormal value identification algorithm, remove the abnormal values, and perform abnormal value interpolation by using an interpolation algorithm combined with a random forest to obtain a measurement data sequence after abnormal value processing; and to smooth the measurement data sequence by using a moving average method and a median filtering method, and take the smoothed measurement data sequence as the ADCP test data of the water body.
9. A computer readable storage medium having ADCP data optimization program instructions stored thereon, the ADCP data optimization program instructions being executable by one or more processors to implement the steps of the data optimization method for ADCP described above.
Background
An Acoustic Doppler Current Profiler (ADCP) is a sonar device for measuring water flow velocity and flow rate; it can also measure the velocity of its carrier platform relative to the bottom. ADCPs are therefore used in river and ocean current surveying. ADCP flow measurement is efficient and covers a large measurement range, and because it works on the acoustic principle it does not disturb the water body, so it reflects the true flow field and protects the water flow environment.
The bottom-relative velocity measured by the ADCP can be used for underwater acoustic positioning and navigation; typical applications include surface vessels, autonomous underwater platforms, towing systems, diver handheld systems, submarines, and remotely operated platforms. Ocean current velocity information measured by the ADCP is indispensable in many ocean applications such as oil and gas development, transportation, and biological environment observation.
However, various errors may arise during measurement because of the external environment, improper operation, limited sensor accuracy, and similar factors.
In view of this, how to optimize ADCP measurement data so as to obtain high-precision flow velocity and flow data has become a problem to be solved by those skilled in the art.
Disclosure of Invention
The invention provides a data optimization method for ADCP, which comprises measuring the flow velocity and flow data of a water body by using an acoustic Doppler current profiler, calibrating the measured flow velocity, and storing the calibrated data in a database; performing abnormal value processing on the stored data by using an abnormal value identification and rejection algorithm; applying moving-average smoothing and filter-based noise reduction to the processed data; and taking the finally processed data as the test data of the water body.
In order to achieve the above object, the present invention provides a data optimization method for ADCP, including:
measuring the flow velocity and flow data of the water body by using an acoustic Doppler current profiler, calibrating the measured data, and storing the calibrated data in a database;
performing abnormal value identification on the stored measurement data by using an abnormal value identification algorithm, and removing the abnormal values;
performing abnormal value interpolation by using an interpolation algorithm combined with a random forest to obtain a measurement data sequence after abnormal value processing;
and smoothing the measurement data sequence by using a moving average method and a median filtering method, and taking the smoothed measurement data sequence as the test data of the water body.
Optionally, the measuring the flow velocity and flow data of the water body by using the acoustic Doppler current profiler includes:
the water body includes oceans and rivers; in one specific embodiment of the invention, the acoustic Doppler current profiler is mounted on a platform such as a surface vessel, an autonomous underwater platform, a towing system, a diver handheld system, or a submarine;
the flow for measuring the flow velocity and flow data of the water body by using the ADCP is as follows:
1) the ADCP transmits an acoustic signal of frequency $f_0$ into the water body; as the signal propagates in the water, part of its energy is backscattered by scatterers moving with the water, forming an echo signal of frequency $f_1$;
2) the ADCP receives the echo signal of frequency $f_1$ and calculates the water flow velocity $v$:
$$v = \frac{c\,(f_1 - f_0)}{2 f_0}$$
wherein:
$c$ represents the propagation speed of sound waves in water;
3) according to the time $t$ at which the ADCP receives the echo signal, calculating the depth $H$ of the water body:
$$H = h + \frac{c\,t}{2}\cos\theta$$
wherein:
$h$ represents the depth at which the ADCP is placed under the water;
$t$ represents the time interval from the ADCP sending the acoustic signal to receiving the echo signal;
$c$ represents the propagation speed of sound waves in water;
$\theta$ represents the beam angle of the ADCP;
4) measuring the width $W$ of the water body and calculating the cross-sectional area of the water body $S = H \times W$; calculating the flow $Q$ of the water body:
$$Q = v \cdot S$$
wherein:
$v$ represents the flow velocity of the water body.
Optionally, the calibrating the measured data includes:
receiving the acoustic signal of frequency $f_0$ transmitted by the ADCP, converting the acoustic signal into an electrical signal and acquiring it by A/D conversion; resampling and interpolating the acquired signal, and truncating, extending in the time domain, and attenuating it; after a set time delay, performing D/A conversion and transmitting the signal back at the same rate as the A/D acquisition, thereby obtaining an acoustic signal with a given frequency offset and time delay; the relationship between the frequency $f'_0$ of the frequency-offset, delayed acoustic signal and the original frequency $f_0$ is:
$$f'_0 = f_0 \cdot \frac{f_p}{f_q}$$
wherein:
$f_p$ is the signal acquisition frequency;
$f_q$ is the signal resampling frequency;
and transmitting the frequency-offset, delayed acoustic signal so that it serves as the actual echo signal of the ADCP, the ADCP receiving this echo signal and measuring the flow velocity and flow data of the water body by the ADCP measurement flow, to obtain calibrated water body measurement data.
Optionally, the performing abnormal value identification on the stored measurement data by using an abnormal value identification algorithm includes:
an abnormal value is a measurement caused by an abnormal factor that arises suddenly during the measurement process (such as operator error or external interference); it seriously distorts the true situation, so the measurement loses its reliability and usefulness;
the abnormal value identification algorithm flow is as follows:
for the measured values $x_i$ ($i = 1, 2, \dots, N$), calculating their mean value $\bar{x}$ and the residuals $v_i$:
$$\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad v_i = x_i - \bar{x}$$
If the measurements contain only random errors, the probability that a residual falls outside $(-3\sigma, 3\sigma)$ is only 0.27%; for the residual $v_i$ of any measured value $x_i$, if $|v_i| > 3\sigma$, the measured value $x_i$ is considered an abnormal value and is rejected. In one embodiment of the invention, since the value of $\sigma$ is generally unknown, the invention replaces $\sigma$ with the estimate $S$ of the standard deviation of the finite number of measurements:
$$S = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N} v_i^2}$$
wherein:
$v_i$ represents the residual of the measured value $x_i$.
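The 0.27% figure quoted for the 3σ criterion can be checked numerically. The short sketch below assumes normally distributed random errors; the function name `prob_outside_k_sigma` is illustrative and is not taken from the invention.

```python
import math


def prob_outside_k_sigma(k: float) -> float:
    """Two-sided tail probability P(|Z| > k) for a standard normal variable Z."""
    # For a standard normal variable, P(|Z| > k) = erfc(k / sqrt(2)).
    return math.erfc(k / math.sqrt(2.0))


if __name__ == "__main__":
    p = prob_outside_k_sigma(3.0)
    print(f"P(|v| > 3 sigma) = {p:.4%}")
```

Under the normality assumption the value rounds to 0.27%, in agreement with the criterion stated above.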
Optionally, the performing abnormal value interpolation by using an interpolation algorithm combined with a random forest includes:
for the measurement data sequence $\{x_0, x_1, \dots, x_n\}$ there is a corresponding time series $t_0, t_1, \dots, t_n$, in which the value $x_k$ has been identified as an abnormal value and removed, so the data corresponding to time $t_k$ must be obtained again; in one embodiment of the invention, the invention performs abnormal value interpolation by using an interpolation algorithm combined with a random forest, and the flow of the interpolation algorithm combined with the random forest is as follows:
1) extracting b sample sequence sets from the measurement data sequence by using the Bootstrap sampling algorithm, wherein each sample sequence set contains the missing values left after abnormal value removal;
2) for each sample sequence set, letting the sample sequence within the window of length n before the missing value be $R = \{x_1, x_2, \dots, x_n\}$; dividing R into m subspaces, wherein each subspace $R_m$ has a fixed regression output value $y_m$, and constructing a decision tree model:
$$f(x) = \sum_{m} y_m\, s(x \in R_m)$$
wherein:
$s$ is a pulse (indicator) function that equals 1 when $x \in R_m$ and 0 otherwise;
for the constructed decision tree model, letting the subspace corresponding to the current parent node be $R_m$, and dividing $R_m$ according to a threshold $T$ into two parts $R_l$ and $R_r$: $R_l = \{x_i < T\}$, $R_r = \{x_i \ge T\}$;
taking $R_l$ and $R_r$ respectively as parent nodes and splitting recursively until the variance of the samples in the current parent node is smaller than a given variance threshold; when this condition is met, stopping the recursion and setting the current parent node as a leaf node;
3) generating a random forest prediction interpolation model from the M constructed decision trees:
$$y_c = \operatorname{ave}\big(y(X_c, T_i)\big), \quad i = 1, 2, \dots, M$$
wherein:
$X_c$ is the sample sequence before the abnormal value c;
$T_i$ represents the i-th decision tree;
$y(X_c, T_i)$ represents the predicted value of the i-th decision tree for the abnormal value c;
$\operatorname{ave}(y(X_c, T_i))$ represents the mean of the predicted values of all decision trees for the abnormal value c;
and performing abnormal value interpolation on all abnormal values by using the random forest prediction interpolation model to obtain the measurement data sequence after abnormal value processing.
Optionally, the flow of the moving average method is as follows:
sliding a window point by point over the N values of the measurement data sequence, and taking the weighted average of the $m = 2i + 1$ adjacent values in each window to obtain the smoothed measurement data sequence:
$$y'_c = \sum_{k=-i}^{i} w(k)\, y_{c+k}$$
wherein:
$y_c$ represents the c-th value in the measurement data sequence;
$y'_c$ represents the c-th value in the smoothed measurement data sequence;
$\sum_{k=-i}^{i}$ denotes summation over the window centered on $y_c$, that is, over $y_c$ and its $2i$ adjacent values;
$w(k)$ represents a weighting coefficient.
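The moving average flow can be sketched as follows. This is a minimal illustration rather than the implementation of the invention: it assumes an odd window length m = 2i + 1, divides by the sum of the weights so that unnormalized weights still yield an average, and leaves the first and last i points unchanged because the text does not say how the edges are handled. The function name `weighted_moving_average` is illustrative.

```python
def weighted_moving_average(y, weights):
    """Smooth y with a centered weighted window; edge points are copied unchanged."""
    m = len(weights)
    if m % 2 == 0:
        raise ValueError("window length m must be odd (m = 2i + 1)")
    i = m // 2                      # number of neighbours on each side of y_c
    total = sum(weights)            # normalization (assumption, see lead-in)
    out = list(y)
    for c in range(i, len(y) - i):
        window = y[c - i:c + i + 1]  # y_{c-i}, ..., y_c, ..., y_{c+i}
        out[c] = sum(w * v for w, v in zip(weights, window)) / total
    return out


if __name__ == "__main__":
    print(weighted_moving_average([0, 0, 3, 0, 0], [1, 2, 1]))
```

When the weights already sum to 1 the division has no effect, so the sketch reduces to a plain weighted sum over the window.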
Optionally, the smoothing of the measurement data sequence by the median filtering method includes:
from the smoothed measurement data sequence $\{x_1, x_2, \dots, x_n\}$, successively extracting m values $\{x_{i-v}, \dots, x_{i-1}, x_i, \dots, x_{i+v}\}$, wherein $v = (m-1)/2$; sorting the m values by magnitude and selecting the middle value as the filtered data, according to the formula:
$$y_i = \operatorname{median}\{x_{i-v}, \dots, x_{i-1}, x_i, x_{i+1}, \dots, x_{i+v}\}$$
wherein:
$\operatorname{median}\{\cdot\}$ represents the median of the set $\{\cdot\}$;
in a specific embodiment of the invention, the length m of the median filtering window is taken as 5 to 10;
and taking the smoothed measurement data sequence as the ADCP test data of the water body.
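A one-dimensional median filter following the formula above might look like the sketch below. Since v = (m − 1)/2 must be an integer, the sketch assumes an odd window length (for example 5, 7, or 9 within the stated range) and copies the first and last v points unchanged; both choices are assumptions, and the function name `median_filter` is illustrative.

```python
from statistics import median


def median_filter(x, m=5):
    """Replace each interior point by the median of the m values centered on it."""
    if m % 2 == 0:
        raise ValueError("window length m must be odd so that v = (m - 1) / 2 is an integer")
    v = (m - 1) // 2
    out = list(x)
    for i in range(v, len(x) - v):
        out[i] = median(x[i - v:i + v + 1])  # x_{i-v}, ..., x_i, ..., x_{i+v}
    return out


if __name__ == "__main__":
    print(median_filter([1, 1, 9, 1, 1, 1, 1], m=3))
```

An isolated spike narrower than half the window is removed entirely, which is why the median filter complements the moving average for impulsive noise.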
In addition, to achieve the above object, the present invention also provides a data optimization system for ADCP, the system comprising:
a data acquisition device, configured to measure flow velocity and flow data of a water body by using an acoustic Doppler current profiler;
a data processor, configured to calibrate the measured data and store the calibrated data in a database;
an ADCP data optimization device, configured to perform abnormal value identification on the stored measurement data by using an abnormal value identification algorithm, remove the abnormal values, and perform abnormal value interpolation by using an interpolation algorithm combined with a random forest to obtain a measurement data sequence after abnormal value processing; and to smooth the measurement data sequence by using a moving average method and a median filtering method, and take the smoothed measurement data sequence as the ADCP test data of the water body.
Furthermore, to achieve the above object, the present invention also provides a computer readable storage medium having ADCP data optimization program instructions stored thereon, which are executable by one or more processors to implement the steps of the data optimization method for ADCP described above.
Compared with the prior art, the invention provides a data optimization method for ADCP, which has the following advantages:
First, to address the difficulty of calibrating an ADCP on site, the invention provides an ADCP data calibration method based on signal retransmission. The ADCP transmits an acoustic signal of frequency $f_0$ into the water body; as the signal propagates in the water, part of its energy is backscattered by scatterers moving with the water, forming an echo signal of frequency $f_1$. The ADCP receives the echo signal and calculates the water flow velocity $v$:
$$v = \frac{c\,(f_1 - f_0)}{2 f_0}$$
wherein: $c$ represents the propagation speed of sound waves in water. According to the time $t$ at which the ADCP receives the echo signal, the depth $H$ of the water body is calculated:
$$H = h + \frac{c\,t}{2}\cos\theta$$
wherein: $h$ represents the depth at which the ADCP is placed under the water; $t$ represents the time interval from the ADCP sending the acoustic signal to receiving the echo signal; $c$ represents the propagation speed of sound waves in water; $\theta$ represents the beam angle of the ADCP. The width $W$ of the water body is measured, the cross-sectional area of the water body is calculated as $S = H \times W$, and the flow $Q$ of the water body is calculated:
$$Q = v \cdot S$$
wherein: $v$ represents the flow velocity of the water body. Furthermore, the invention uses a signal receiving device to receive the acoustic signal of frequency $f_0$ transmitted by the ADCP, converts the acoustic signal into an electrical signal and acquires it by A/D conversion, resamples and interpolates the acquired signal, and truncates, extends in the time domain, and attenuates it; after a set time delay, the signal undergoes D/A conversion and is transmitted back at the same rate as the A/D acquisition, giving an acoustic signal with a given frequency offset and time delay. The relationship between the frequency $f'_0$ of the frequency-offset, delayed acoustic signal and the original frequency $f_0$ is:
$$f'_0 = f_0 \cdot \frac{f_p}{f_q}$$
wherein: $f_p$ is the signal acquisition frequency; $f_q$ is the signal resampling frequency. The frequency-offset, delayed acoustic signal is transmitted so that it serves as the actual echo signal of the ADCP; the ADCP receives this echo signal and measures the flow velocity and flow data of the water body by the ADCP measurement flow, giving the calibrated water body measurement data. Compared with the initial ADCP measurement data, setting the echo time delay and the resampling yields a more accurate echo signal with a known time delay and frequency offset, and calibrating the calculation against this echo signal yields more accurate water body measurement data.
Meanwhile, the invention performs abnormal value interpolation by using an interpolation algorithm combined with a random forest to obtain the measurement data sequence after abnormal value processing. For the measurement data sequence $\{x_0, x_1, \dots, x_n\}$ there is a corresponding time series $t_0, t_1, \dots, t_n$, in which the value $x_k$ has been identified as an abnormal value and removed, so the data corresponding to time $t_k$ must be obtained again. In a specific embodiment of the present invention, the flow of the interpolation algorithm combined with the random forest is as follows: b sample sequence sets are extracted from the measurement data sequence by using the Bootstrap sampling algorithm, wherein each sample sequence set contains the missing values left after abnormal value removal; for each sample sequence set, the sample sequence within the window of length n before the missing value is $R = \{x_1, x_2, \dots, x_n\}$; R is divided into m subspaces, wherein each subspace $R_m$ has a fixed regression output value $y_m$, and a decision tree model is constructed:
$$f(x) = \sum_{m} y_m\, s(x \in R_m)$$
wherein: $s$ is a pulse (indicator) function that equals 1 when $x \in R_m$ and 0 otherwise. For the constructed decision tree model, the subspace corresponding to the current parent node is $R_m$, which is divided according to a threshold $T$ into two parts $R_l$ and $R_r$: $R_l = \{x_i < T\}$, $R_r = \{x_i \ge T\}$. $R_l$ and $R_r$ are each taken as parent nodes and split recursively until the variance of the samples in the current parent node is smaller than a given variance threshold; when this condition is met, the recursion stops and the current parent node is set as a leaf node. A random forest prediction interpolation model is generated from the M constructed decision trees:
$$y_c = \operatorname{ave}\big(y(X_c, T_i)\big), \quad i = 1, 2, \dots, M$$
wherein: $X_c$ is the sample sequence before the abnormal value c; $T_i$ represents the i-th decision tree; $y(X_c, T_i)$ represents the predicted value of the i-th decision tree for the abnormal value c; $\operatorname{ave}(y(X_c, T_i))$ represents the mean of the predicted values of all decision trees for the abnormal value c. Abnormal value interpolation is performed on all abnormal values by using the random forest prediction interpolation model, and runs of consecutive abnormal values are filled recursively, giving the ADCP measurement data sequence after abnormal value processing.
In summary, the invention provides a data optimization method for ADCP that identifies, removes, and interpolates abnormal values in ADCP data and then applies moving-average smoothing and median filtering. MATLAB simulation analysis shows that the algorithm is feasible: abnormal values are effectively removed and the ADCP data are smoothed, which improves the precision available to subsequent flow velocity processing, and the method is simple to implement.
Drawings
Fig. 1 is a schematic flowchart of a data optimization method for ADCP according to an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of a data optimization system for ADCP according to an embodiment of the present invention.
The implementation, functional features, and advantages of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Measuring the flow velocity and flow data of the water body by using an acoustic Doppler flow velocity profiler, calibrating the measured flow velocity, and storing the calibrated data into a database; and carrying out abnormal value processing on the stored data by using an abnormal value identification and rejection algorithm, simultaneously carrying out moving average and filtering noise reduction processing on the processed data, and taking the finally processed data as test data of the water body. Fig. 1 is a schematic diagram illustrating a data optimization method for ADCP according to an embodiment of the present invention.
In this embodiment, the data optimization method for ADCP includes:
S1: measuring the flow velocity and flow data of the water body by using the acoustic Doppler current profiler, calibrating the measured data, and storing the calibrated data in a database.
First, the invention uses an acoustic Doppler current profiler (ADCP) to measure the flow velocity and flow data of a water body, wherein the water body includes oceans and rivers; in a specific embodiment of the invention, the acoustic Doppler current profiler is mounted on a platform such as a surface vessel, an autonomous underwater platform, a towing system, a diver handheld system, or a submarine;
the flow for measuring the flow velocity and flow data of the water body by using the ADCP is as follows:
1) the ADCP transmits an acoustic signal of frequency $f_0$ into the water body; as the signal propagates in the water, part of its energy is backscattered by scatterers moving with the water, forming an echo signal of frequency $f_1$;
2) the ADCP receives the echo signal of frequency $f_1$ and calculates the water flow velocity $v$:
$$v = \frac{c\,(f_1 - f_0)}{2 f_0}$$
wherein:
$c$ represents the propagation speed of sound waves in water;
3) according to the time $t$ at which the ADCP receives the echo signal, calculating the depth $H$ of the water body:
$$H = h + \frac{c\,t}{2}\cos\theta$$
wherein:
$h$ represents the depth at which the ADCP is placed under the water;
$t$ represents the time interval from the ADCP sending the acoustic signal to receiving the echo signal;
$c$ represents the propagation speed of sound waves in water;
$\theta$ represents the beam angle of the ADCP;
4) measuring the width $W$ of the water body and calculating the cross-sectional area of the water body $S = H \times W$; calculating the flow $Q$ of the water body:
$$Q = v \cdot S$$
wherein:
$v$ represents the flow velocity of the water body.
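The measurement steps 1) to 4) can be sketched in a few lines. The sketch assumes the usual two-way Doppler relation v = c(f1 − f0)/(2 f0) and the depth relation H = h + (c t / 2) cos θ; these forms, the default sound speed of 1500 m/s, and all function names are assumptions made for illustration rather than values given by the invention.

```python
import math


def doppler_velocity(f0: float, f1: float, c: float = 1500.0) -> float:
    """Flow velocity from transmitted frequency f0 and echo frequency f1 (two-way shift)."""
    return c * (f1 - f0) / (2.0 * f0)


def water_depth(h: float, t: float, theta_deg: float, c: float = 1500.0) -> float:
    """Depth H: ADCP immersion depth h plus the vertical projection of the one-way range."""
    one_way_range = 0.5 * c * t  # t is the round-trip time, so halve the path length
    return h + one_way_range * math.cos(math.radians(theta_deg))


def discharge(v: float, depth: float, width: float) -> float:
    """Flow Q = v * S with a rectangular cross-section S = H * W."""
    return v * depth * width


if __name__ == "__main__":
    v = doppler_velocity(600_000.0, 600_400.0)       # 400 Hz shift at 600 kHz
    H = water_depth(h=1.0, t=0.02, theta_deg=0.0)    # vertical beam for simplicity
    print(v, H, discharge(v, H, width=10.0))
```

The rectangular cross-section S = H × W follows step 4); a real channel profile would need the depth integrated across the width.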
Further, the invention calibrates the measured flow velocity, and the calibration flow is as follows:
receiving the acoustic signal of frequency $f_0$ transmitted by the ADCP, converting the acoustic signal into an electrical signal and acquiring it by A/D conversion; resampling and interpolating the acquired signal, and truncating, extending in the time domain, and attenuating it; after a set time delay, performing D/A conversion and transmitting the signal back at the same rate as the A/D acquisition, thereby obtaining an acoustic signal with a given frequency offset and time delay; the relationship between the frequency $f'_0$ of the frequency-offset, delayed acoustic signal and the original frequency $f_0$ is:
$$f'_0 = f_0 \cdot \frac{f_p}{f_q}$$
wherein:
$f_p$ is the signal acquisition frequency;
$f_q$ is the signal resampling frequency;
and transmitting the frequency-offset, delayed acoustic signal so that it serves as the actual echo signal of the ADCP, the ADCP receiving this echo signal and measuring the flow velocity and flow data of the water body by the ADCP measurement flow, to obtain calibrated water body measurement data.
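The effect of the resampling step on the retransmitted frequency can be illustrated with a short sketch. It rests on one assumption: a signal acquired at rate f_p, resampled to rate f_q, and played back at rate f_p has its frequency scaled by f_p / f_q. The two-way Doppler relation, the default sound speed, and the function names are likewise illustrative assumptions, not details confirmed by the invention.

```python
def retransmitted_frequency(f0: float, fp: float, fq: float) -> float:
    """Frequency after resampling to fq and playing back at the acquisition rate fp."""
    return f0 * fp / fq


def simulated_velocity(f0: float, fp: float, fq: float, c: float = 1500.0) -> float:
    """Velocity the ADCP would report if it reads the offset as a two-way Doppler shift."""
    shift = retransmitted_frequency(f0, fp, fq) - f0
    return c * shift / (2.0 * f0)


def resample_rate_for_velocity(v: float, fp: float, c: float = 1500.0) -> float:
    """Resampling rate fq that makes the retransmitted signal mimic flow velocity v."""
    return fp / (1.0 + 2.0 * v / c)


if __name__ == "__main__":
    fp = 2_000_000.0
    fq = resample_rate_for_velocity(0.5, fp)
    print(fq, simulated_velocity(600_000.0, fp, fq))
```

Choosing f_q from a target velocity in this way lets a known reference velocity be presented to the instrument, against which its reading can be compared.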
S2: performing abnormal value identification on the stored measurement data by using an abnormal value identification algorithm, and removing the abnormal values.
Further, the invention performs abnormal value identification on the stored measurement data by using an abnormal value identification algorithm. An abnormal value is a measurement caused by an abnormal factor that arises suddenly during the measurement process (such as operator error or external interference); it seriously distorts the true situation, so the measurement loses its reliability and usefulness;
the abnormal value identification algorithm flow is as follows:
for the measured values $x_i$ ($i = 1, 2, \dots, N$), calculating their mean value $\bar{x}$ and the residuals $v_i$:
$$\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad v_i = x_i - \bar{x}$$
If the measurements contain only random errors, the probability that a residual falls outside $(-3\sigma, 3\sigma)$ is only 0.27%; for the residual $v_i$ of any measured value $x_i$, if $|v_i| > 3\sigma$, the measured value $x_i$ is considered an abnormal value and is rejected. In one embodiment of the invention, since the value of $\sigma$ is generally unknown, the invention replaces $\sigma$ with the estimate $S$ of the standard deviation of the finite number of measurements:
$$S = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N} v_i^2}$$
wherein:
$v_i$ represents the residual of the measured value $x_i$.
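Step S2 can be sketched as a single-pass 3σ test. The sketch assumes the Bessel estimate S = sqrt(Σ v_i² / (N − 1)) for the standard deviation and marks rejected values with `None` so that a later interpolation step can find them; the function names and the `None` convention are illustrative choices.

```python
import math


def three_sigma_outliers(values):
    """Return the indices i whose residual satisfies |v_i| > 3 S."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two measurements are needed to estimate S")
    mean = sum(values) / n
    residuals = [x - mean for x in values]                 # v_i = x_i - mean
    s = math.sqrt(sum(v * v for v in residuals) / (n - 1))  # Bessel estimate of sigma
    return [i for i, v in enumerate(residuals) if abs(v) > 3.0 * s]


def reject_outliers(values):
    """Copy of values with the abnormal values replaced by None (missing)."""
    bad = set(three_sigma_outliers(values))
    return [None if i in bad else x for i, x in enumerate(values)]


if __name__ == "__main__":
    data = [1.0] * 20 + [10.0]
    print(three_sigma_outliers(data))
```

Because the abnormal value itself inflates S, the largest possible ratio |v_i| / S in a sample of N values is (N − 1)/√N, so a single pass cannot flag anything when N is 10 or fewer; the test is only meaningful on longer sequences.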
And S3, performing abnormal value interpolation by combining an interpolation algorithm of a random forest to obtain a measurement data sequence after abnormal value processing.
Further, for the measured data sequenceThere is a corresponding time series t0,t1,...,tnTherein ofThe wave is determined as an abnormal value and is eliminated, so t needs to be obtained againkIn a specific embodiment of the present invention, the present invention performs outlier interpolation by using an interpolation algorithm combined with a random forest, where the flow of the interpolation algorithm combined with the random forest is as follows:
1) b sample sequence sets are extracted from the measured data sequence by adopting a Bootstrap sampling algorithm, wherein missing value data after abnormal value elimination exists in each sample sequence set;
2) for each sample sequence set, let the sample sequence set with the window n before the missing value be R ═ x1,x2 ,...,xn}; dividing R into m subspaces, wherein each subspace RmThere is a fixed regression output value ymAnd constructing a decision tree model:
wherein:
s is a pulse function;
for the constructed decision tree model, let the subspace corresponding to the current parent node be $R_m$; $R_m$ is divided by a threshold T into two parts $R_l$ and $R_r$: $R_l = \{x_i < T\}$, $R_r = \{x_i \ge T\}$;
taking $R_l$ and $R_r$ respectively as parent nodes, the segmentation is performed recursively until the variance of the samples in the current parent node is smaller than a given variance threshold; when this condition is satisfied, the recursion stops and the current parent node is set as a leaf node;
3) generating a random forest prediction interpolation model according to the M constructed decision trees:
$$y_c = \mathrm{ave}\big(y(X_c, T_i)\big), \quad i = 1, 2, \ldots, M$$
wherein:
$X_c$ is the sequence of samples before the abnormal value c;
$T_i$ represents the i-th decision tree;
$y(X_c, T_i)$ represents the predicted value of the i-th decision tree for the abnormal value c;
$\mathrm{ave}(y(X_c, T_i))$ represents the mean of the predicted values of all decision trees for the abnormal value c;
and abnormal value interpolation is carried out on all abnormal values by using the random forest prediction interpolation model, yielding the measurement data sequence after abnormal value processing.
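Steps 1) to 3) can be illustrated with a small, standard-library-only Python sketch. It is not the implementation of the invention, and several choices are assumptions because the text does not fix them: each tree regresses a value on the value immediately before it, the threshold T is chosen to minimize the squared error of the two parts, and a removed abnormal value is marked with `None`. The names `build_tree`, `predict`, and `forest_interpolate` and all default parameters are hypothetical.

```python
import random
from statistics import mean, pvariance

def sse(pairs):
    """Sum of squared deviations of the targets from their mean."""
    targets = [t for _, t in pairs]
    mu = mean(targets)
    return sum((t - mu) ** 2 for t in targets)

def build_tree(pairs, var_threshold):
    """Grow a one-feature regression tree on (previous value, value) pairs."""
    targets = [t for _, t in pairs]
    feats = sorted({f for f, _ in pairs})
    # Leaf: target variance below the threshold, or no split is possible.
    if len(feats) < 2 or pvariance(targets) < var_threshold:
        return mean(targets)                     # fixed output of this subspace
    best = None
    for T in feats[1:]:                          # both parts are non-empty
        left = [p for p in pairs if p[0] < T]    # R_l = {x_i < T}
        right = [p for p in pairs if p[0] >= T]  # R_r = {x_i >= T}
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            best = (cost, T, left, right)
    _, T, left, right = best
    return (T, build_tree(left, var_threshold), build_tree(right, var_threshold))

def predict(tree, x):
    """Descend from the root to a leaf and return the leaf output."""
    while isinstance(tree, tuple):
        T, left, right = tree
        tree = left if x < T else right
    return tree

def forest_interpolate(series, k, n=20, M=25, var_threshold=1e-4, seed=0):
    """Estimate series[k] (a removed abnormal value) from the n values before it."""
    rng = random.Random(seed)
    # Other gaps inside the window are simply skipped in this sketch.
    window = [v for v in series[max(0, k - n):k] if v is not None]
    pairs = list(zip(window, window[1:]))
    if not pairs:
        raise ValueError("need at least two valid values before position k")
    preds = []
    for _ in range(M):
        sample = [rng.choice(pairs) for _ in pairs]       # Bootstrap resampling
        preds.append(predict(build_tree(sample, var_threshold), window[-1]))
    return mean(preds)                                    # average over M trees

series = [0.1 * i for i in range(30)]
series[20] = None                                         # removed abnormal value
print(forest_interpolate(series, 20, n=10))
```

Since every leaf outputs a mean of values seen in the window, the interpolated value always lies within the range of the window; such a forest smooths over a gap but cannot extrapolate a trend beyond the values it was trained on.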
S4, smoothing the measurement data sequence by using a moving average method and a median filtering method, and taking the smoothed measurement data sequence as the test data of the water body.
Further, during ADCP flow measurement, owing to factors such as the water flow environment, the measurement data may contain random errors even when a high-precision sensor is used; the measurement data sequence is therefore smoothed by using a moving average method and a median filtering method;
the flow of the moving average method is as follows:
a window is slid one position at a time over the N values of the measurement data sequence, and at each position the m adjacent values are combined by weighted averaging to obtain the smoothed measurement data sequence:
$$y'_c = \sum_{k=-i}^{i} w(k)\, y_{c+k}$$
wherein:
$y_c$ represents the c-th value in the measurement data sequence;
$y'_c$ represents the c-th value in the smoothed measurement data sequence;
$\sum_{k=-i}^{i}$ denotes summation over $y_c$ and the 2i values adjacent to it, with $y_c$ at the center;
$w(k)$ represents a weighting coefficient;
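A weighted moving average of this kind might be sketched as follows. The sketch assumes a centered window of odd length m = 2i + 1 and divides by the sum of the weights that actually fall inside the sequence, so that the first and last points are still averaged over a shortened window; this edge handling and the function name `weighted_moving_average` are assumptions, not details given in the text.

```python
def weighted_moving_average(y, weights):
    """Smooth y with a centered window; weights[k + i] plays the role of w(k)."""
    m = len(weights)
    if m % 2 == 0:
        raise ValueError("the window length m must be odd")
    half = m // 2                                # i in m = 2i + 1
    smoothed = []
    for c in range(len(y)):
        acc = 0.0
        wsum = 0.0
        for k in range(-half, half + 1):
            j = c + k
            if 0 <= j < len(y):                  # skip positions outside the data
                acc += weights[k + half] * y[j]
                wsum += weights[k + half]
        smoothed.append(acc / wsum)              # normalize by the weights used
    return smoothed

print(weighted_moving_average([0.0, 0.0, 9.0, 0.0, 0.0], [1, 1, 1]))
# → [0.0, 3.0, 3.0, 3.0, 0.0]
```

With equal weights this reduces to the plain moving average; weights that decrease away from the center, such as `[1, 2, 3, 2, 1]`, keep the smoothed value closer to the original point.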
Median filtering is a nonlinear signal processing technique that effectively suppresses noise. A window is set and traversed over all points of the sequence, and the value at the center point of the window is replaced by the median of the original values in the window. Since the radial flow velocity along the beam direction and the data acquired by the various sensors are all one-dimensional, one-dimensional median filtering is adopted; the flow of the median filtering algorithm used by the method is as follows:
from the smoothed measurement data sequence $x_1, x_2, \ldots, x_n$, m consecutive numbers $\{x_{i-v}, \ldots, x_{i-1}, x_i, x_{i+1}, \ldots, x_{i+v}\}$ are extracted in turn, where $v = (m-1)/2$; the m points are sorted by magnitude and the middle value is taken as the filtered data, according to the formula:
$$y_i = \mathrm{median}\{x_{i-v}, \ldots, x_{i-1}, x_i, x_{i+1}, \ldots, x_{i+v}\}$$
wherein:
$\mathrm{median}\{\cdot\}$ represents the median of the set $\{\cdot\}$;
in a specific embodiment of the invention, the length of the median filtering window is taken as m = 5 to 10;
further, the invention takes the measurement data sequence after smoothing as ADCP test data of the water body.
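The one-dimensional median filter described above can be sketched as follows. The sketch assumes an odd window length m, so that v = (m - 1)/2 is an integer, and leaves the first and last v points unchanged because no full window fits there; both choices and the function name `median_filter` are assumptions rather than details stated in the text.

```python
from statistics import median

def median_filter(x, m=5):
    """Replace each interior point by the median of the m values centered on it."""
    if m % 2 == 0:
        raise ValueError("the window length m must be odd")
    v = (m - 1) // 2
    filtered = list(x)                           # edge points keep their values
    for i in range(v, len(x) - v):
        filtered[i] = median(x[i - v:i + v + 1]) # median of the original values
    return filtered

print(median_filter([1, 1, 9, 1, 1, 1, 1], m=3))
# → [1, 1, 1, 1, 1, 1, 1]
```

Unlike the moving average, the median filter removes an isolated spike completely instead of spreading it over neighboring points, which is why the two methods are used together.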
The following describes an embodiment of the invention through an algorithm experiment that tests the processing method of the invention. The hardware test environment of the algorithm is an Intel(R) Core(TM) i7-6700K CPU, and the software is Matlab 2018b; the comparison methods are a Bayesian-based data optimization method for ADCP and a neural-network-based data optimization method for ADCP.
In the algorithm experiment, the data set is 10 GB of ADCP data. The acquired ADCP data are input into the algorithm model, and the effectiveness of the optimized ADCP data is used as the evaluation index of the algorithm's feasibility: the higher the effectiveness of the optimized ADCP data, the higher the effectiveness and feasibility of the algorithm.
According to the experimental results, the effectiveness of the optimized ADCP data is 76.32 for the Bayesian-based data optimization method, 81.69 for the neural-network-based data optimization method, and 88.92 for the method of the invention. Compared with the comparison algorithms, the data optimization method for ADCP provided by the invention therefore achieves a higher effectiveness of the optimized ADCP data.
The invention also provides a data optimization system for ADCP. Fig. 2 is a schematic diagram illustrating an internal structure of a data optimization system for ADCP according to an embodiment of the present invention.
In the present embodiment, the data optimization system 1 for ADCP includes at least a data acquisition device 11, a data processor 12, an ADCP data optimization device 13, a communication bus 14, and a network interface 15.
The data acquisition device 11 may be a PC (Personal Computer), a terminal device such as a smart phone, a tablet Computer, or a mobile Computer, or may be a server.
The data processor 12 includes at least one type of readable storage medium including flash memory, hard disks, multi-media cards, card-type memory (e.g., SD or DX memory, etc.), magnetic memory, magnetic disks, optical disks, and the like. The data processor 12 may in some embodiments be an internal storage unit of the data optimization system for ADCP 1, for example a hard disk of the data optimization system for ADCP 1. The data processor 12 may also be an external storage device of the data optimization system 1 for ADCP in other embodiments, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like provided on the data optimization system 1 for ADCP. Further, the data processor 12 may also comprise both an internal storage unit and an external storage device of the data optimization system 1 for ADCP. The data processor 12 can be used not only to store application software installed in the data optimization system 1 for ADCP and various kinds of data, but also to temporarily store data that has been output or is to be output.
The ADCP data optimization device 13 may be, in some embodiments, a Central Processing Unit (CPU), controller, microcontroller, microprocessor or other data Processing chip including a monitoring Unit for running program code stored in the data processor 12 or Processing data, such as ADCP data optimization program instructions 16.
The communication bus 14 is used to enable connection communication between these components.
The network interface 15 may optionally include a standard wired interface and a wireless interface (e.g., a Wi-Fi interface), and is typically used to establish a communication link between the system 1 and other electronic devices.
Optionally, the data optimization system 1 for ADCP may further include a user interface, the user interface may include a Display (Display), an input unit such as a Keyboard (Keyboard), and the optional user interface may further include a standard wired interface, a wireless interface. Alternatively, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display, which may also be referred to as a display screen or display unit, is suitable for displaying information processed in the data optimization system 1 for ADCP and for displaying a visualized user interface.
Fig. 2 only shows the data optimization system 1 for ADCP with components 11-15, and it will be understood by those skilled in the art that the structure shown in Fig. 2 does not constitute a limitation of the data optimization system 1 for ADCP, which may include fewer or more components than shown, combine certain components, or have a different arrangement of components.
In the embodiment of the data optimization system 1 for ADCP shown in Fig. 2, ADCP data optimization program instructions 16 are stored in the data processor 12; the steps performed when the ADCP data optimization device 13 executes the ADCP data optimization program instructions 16 stored in the data processor 12 are the same as those of the data optimization method for ADCP and are not described again here.
Furthermore, an embodiment of the present invention also provides a computer-readable storage medium having ADCP data optimization program instructions stored thereon, where the ADCP data optimization program instructions are executable by one or more processors to implement the following operations:
measuring the flow velocity and flow data of the water body by using an acoustic Doppler flow velocity profiler, calibrating the measured data, and storing the calibrated data into a database;
carrying out abnormal value identification on the stored measurement data by using an abnormal value identification algorithm, and removing the abnormal value;
performing abnormal value interpolation by using an interpolation algorithm combined with a random forest to obtain a measurement data sequence after abnormal value processing;
and smoothing the measurement data sequence by using a moving average method and a median filtering method, and taking the smoothed measurement data sequence as test data of the water body.
It should be noted that the above-mentioned numbers of the embodiments of the present invention are merely for description, and do not represent the merits of the embodiments. And the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, apparatus, article, or method that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, apparatus, article, or method. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, apparatus, article, or method that includes the element.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) as described above and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.