Data anomaly monitoring method and device, computer equipment and storage medium
1. A method for monitoring data anomalies, the method comprising:
acquiring data to be analyzed;
according to the historical data corresponding to the data to be analyzed, carrying out local weighted regression processing on the data to be analyzed to obtain a fitting slope of the data to be analyzed;
updating slope data contained in a preset slope set based on the fitting slope to obtain an updated slope set, wherein the slope set stores a fixed amount of slope data according to a time sequence;
obtaining the number of abnormal slope data exceeding the allowable fluctuation interval in the updated slope set according to the allowable fluctuation interval predetermined based on the historical data;
and judging whether the data to be analyzed is abnormal or not according to the quantity of the abnormal slope data and a preset abnormal threshold.
2. The method according to claim 1, wherein the performing local weighted regression processing on the data to be analyzed according to the historical data corresponding to the data to be analyzed to obtain the fitting slope of the data to be analyzed comprises:
determining the position of the real-time fitting window in the historical data corresponding to the data to be analyzed by taking the time of the data to be analyzed as the last time of the real-time fitting window;
according to the position of the real-time fitting window, acquiring fitting data corresponding to the data to be analyzed from the historical data;
and performing local weighted regression processing on the data to be analyzed based on the position of the fitting data in a real-time fitting window and the weighting parameters corresponding to the positions of the real-time fitting window to obtain the fitting slope of the data to be analyzed.
3. The method according to claim 1, wherein before obtaining the number of abnormal slope data exceeding the allowable fluctuation interval in the updated slope set according to the allowable fluctuation interval predetermined based on the historical data, the method further comprises:
carrying out local weighted regression processing on the historical data point by point according to time to obtain a historical slope corresponding to each time of the historical data;
and determining the allowable fluctuation interval according to the historical slope corresponding to each moment in the historical data.
4. The method according to claim 3, wherein the performing local weighted regression processing on the historical data point by point according to time to obtain the historical slope corresponding to each time of the historical data comprises:
executing the following steps on target data corresponding to any moment in the historical data:
determining the position of the history fitting window in the history data by taking the time of the target data as the last time of the history fitting window;
acquiring historical fitting data corresponding to the target data from the historical data according to the position of the historical fitting window;
and performing local weighted regression processing on the target data based on the positions of the historical fitting data in the historical fitting window and the weighting parameters corresponding to the positions of the historical fitting window to obtain a historical slope corresponding to the target data.
5. The method according to claim 3, wherein the determining the allowable fluctuation interval according to the historical slope corresponding to each time in the historical data comprises:
obtaining a median and a median absolute deviation in the historical slope according to the historical slope corresponding to each moment in the historical data;
and determining the allowable fluctuation interval according to the median and the median absolute deviation.
6. The method of claim 1, wherein the data to be analyzed is periodically changed;
before obtaining the number of abnormal slope data exceeding the allowable fluctuation interval in the updated slope set according to the allowable fluctuation interval predetermined based on the historical data, the method further includes:
acquiring historical data in at least three updating periods, and determining a historical slope corresponding to the historical data;
determining baseline data of the historical data in a period and a baseline slope corresponding to the baseline data to obtain a slope difference value of the historical slope and the baseline slope at each moment;
determining the allowable fluctuation interval according to the median and the median absolute deviation of the slope difference;
the updating slope data contained in a preset slope set based on the fitting slope to obtain an updated slope set comprises:
and updating the slope increment of the fitting slope relative to the baseline slope to a preset slope set to obtain an updated slope set.
7. The method of claim 6, wherein the determining baseline data of the historical data in a cycle and a baseline slope corresponding to the baseline data, and obtaining a slope difference between the historical slope and the baseline slope at each time comprises:
determining baseline data of the relative time according to historical data corresponding to the same relative time in different periods based on the relative time relationship between the time and the periods;
carrying out local weighted regression processing on the baseline data point by point according to the relative time to obtain the baseline slope of the relative time;
and performing difference processing on the historical slope corresponding to the historical data and the baseline slope corresponding to the relative moment to obtain a slope difference value.
8. A data anomaly monitoring device, the device comprising:
the data to be analyzed acquisition module is used for acquiring data to be analyzed;
the fitting slope processing module is used for carrying out local weighted regression processing on the data to be analyzed according to the historical data corresponding to the data to be analyzed to obtain the fitting slope of the data to be analyzed;
a slope set updating module, configured to update slope data included in a preset slope set based on the fitting slope to obtain an updated slope set, where the slope set stores a fixed amount of slope data in a time sequence;
the fluctuation analysis module is used for obtaining the number of abnormal slope data exceeding the allowable fluctuation interval in the updated slope set according to the allowable fluctuation interval predetermined based on the historical data;
and the abnormity determining module is used for determining whether the data to be analyzed is abnormal or not according to the quantity of the abnormal slope data and a preset abnormity threshold value.
9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any of claims 1 to 7.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 7.
Background
With the development of internet technology, large internet companies provide a large number of applications and web services to the outside through servers. However, service failures have many causes and are difficult to avoid, such as network delays, server failures, and malicious network attacks. When such failures occur, service-related indexes, which are typically time series data, show abnormal fluctuations, such as a sudden increase in the number of failed requests, a sudden drop in success rate, or a sudden increase in response delay. Therefore, by monitoring abnormal conditions such as sudden changes in service indexes, operation and maintenance personnel can handle and resolve service failures in a more timely and accurate manner.
In the traditional anomaly monitoring process, anomalies are detected with a probability distribution function matched to the data distribution characteristics of the monitored object; this monitoring approach has low accuracy in detecting abnormal data.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a data anomaly monitoring method, apparatus, computer device and storage medium capable of improving anomaly monitoring accuracy.
A data anomaly monitoring method comprises the following steps:
acquiring data to be analyzed;
according to historical data corresponding to the data to be analyzed, carrying out local weighted regression processing on the data to be analyzed to obtain a fitting slope of the data to be analyzed;
updating slope data contained in a preset slope set based on the fitting slope to obtain an updated slope set, wherein the slope set stores a fixed amount of slope data in a time sequence;
obtaining the number of abnormal slope data exceeding the allowable fluctuation interval in the updated slope set according to the allowable fluctuation interval predetermined based on the historical data;
and judging whether the data to be analyzed is abnormal or not according to the quantity of the abnormal slope data and a preset abnormal threshold.
A data anomaly monitoring device, the device comprising:
the data to be analyzed acquisition module is used for acquiring data to be analyzed;
the fitting slope processing module is used for carrying out local weighted regression processing on the data to be analyzed according to the historical data corresponding to the data to be analyzed to obtain the fitting slope of the data to be analyzed;
the slope set updating module is used for updating slope data contained in a preset slope set based on the fitting slope to obtain an updated slope set, and the slope set stores a fixed amount of slope data according to a time sequence;
the fluctuation analysis module is used for obtaining the number of abnormal slope data exceeding an allowable fluctuation interval in the updated slope set according to the allowable fluctuation interval predetermined based on the historical data;
and the abnormity determining module is used for determining whether the data to be analyzed is abnormal or not according to the quantity of the abnormal slope data and a preset abnormity threshold value.
A computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program:
acquiring data to be analyzed;
according to historical data corresponding to the data to be analyzed, carrying out local weighted regression processing on the data to be analyzed to obtain a fitting slope of the data to be analyzed;
updating slope data contained in a preset slope set based on the fitting slope to obtain an updated slope set, wherein the slope set stores a fixed amount of slope data in a time sequence;
obtaining the number of abnormal slope data exceeding the allowable fluctuation interval in the updated slope set according to the allowable fluctuation interval predetermined based on the historical data;
and judging whether the data to be analyzed is abnormal or not according to the quantity of the abnormal slope data and a preset abnormal threshold.
A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of:
acquiring data to be analyzed;
according to historical data corresponding to the data to be analyzed, carrying out local weighted regression processing on the data to be analyzed to obtain a fitting slope of the data to be analyzed;
updating slope data contained in a preset slope set based on the fitting slope to obtain an updated slope set, wherein the slope set stores a fixed amount of slope data in a time sequence;
obtaining the number of abnormal slope data exceeding the allowable fluctuation interval in the updated slope set according to the allowable fluctuation interval predetermined based on the historical data;
and judging whether the data to be analyzed is abnormal or not according to the quantity of the abnormal slope data and a preset abnormal threshold.
According to the data anomaly monitoring method, the data anomaly monitoring device, the computer equipment and the storage medium, the data to be analyzed is obtained, and local weighted regression processing is carried out on it according to the historical data corresponding to the data to be analyzed. This smooths the data to be analyzed and yields its fitting slope, avoiding interference caused by accidental jitter of the data. The allowable fluctuation interval is determined in advance based on the historical data and therefore provides a good reference for the fluctuation range. Whether abnormal data exists can then be judged accurately according to the number of abnormal slope data exceeding the allowable fluctuation interval in the updated slope set and the preset abnormal threshold, which improves the monitoring accuracy of abnormal data.
Drawings
FIG. 1 is a diagram of an application environment of a data anomaly monitoring method in one embodiment;
FIG. 2 is a schematic flow chart diagram illustrating a data anomaly monitoring method in one embodiment;
FIG. 3 is a schematic flow chart diagram illustrating a data anomaly monitoring method in accordance with another embodiment;
FIG. 4 is a schematic diagram of the LOWESS algorithm in the data anomaly monitoring method in one embodiment;
FIG. 5 is a schematic flow chart diagram illustrating a data anomaly monitoring method according to yet another embodiment;
FIG. 6 is a flowchart illustrating the step of determining a historical slope in a data anomaly monitoring method according to one embodiment;
FIG. 7(a) is a schematic distribution diagram of raw data and smoothed data of a data anomaly monitoring method in one embodiment;
FIG. 7(b) is a diagram illustrating a distribution of a fitting slope and an allowable fluctuation interval of the data anomaly monitoring method in one embodiment;
FIG. 8 is a schematic diagram illustrating a slope and a corresponding decision boundary when fluctuation of a data curve is significant in an embodiment of a data anomaly monitoring method;
FIG. 9 is a schematic flow chart diagram illustrating a data anomaly monitoring method in accordance with another embodiment;
FIG. 10 is a schematic flow chart diagram illustrating a method for data anomaly monitoring in accordance with yet another embodiment;
FIG. 11 is a schematic flow chart diagram illustrating a method for monitoring data anomalies in yet another embodiment;
FIG. 12 is a schematic flow chart diagram illustrating a method for monitoring data anomalies in yet another embodiment;
FIG. 13 is a data flow diagram illustrating a data anomaly monitoring method in an embodiment in which data values of a monitored object remain stable;
FIG. 14 is a data flow diagram illustrating a data anomaly monitoring method in an embodiment under a scenario where a data value of a monitored object varies periodically;
FIG. 15 is a block diagram of a data anomaly monitoring device in one embodiment;
FIG. 16 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The data anomaly monitoring method provided by the application can be applied to the application environment shown in fig. 1, in which the terminal 102 communicates with the server 104 via a network. The server 104 provides services for the terminal 102 and monitors the services it provides. The server 104 obtains data to be analyzed and performs local weighted regression processing on the data to be analyzed according to historical data corresponding to the data to be analyzed to obtain a fitting slope of the data to be analyzed. It then updates slope data included in a preset slope set based on the fitting slope to obtain an updated slope set, where the slope set stores a fixed amount of slope data in a time sequence. Next, it obtains the number of abnormal slope data exceeding an allowable fluctuation interval in the updated slope set according to the allowable fluctuation interval predetermined based on the historical data, and judges whether the data to be analyzed is abnormal according to the number of abnormal slope data and a preset abnormal threshold. The terminal 102 may be, but is not limited to, a personal computer, notebook computer, smart phone, tablet computer, or portable wearable device, and the server 104 may be implemented by an independent server or a server cluster formed by a plurality of servers.
In one embodiment, as shown in fig. 2, a data anomaly monitoring method is provided, which is described by taking the method as an example applied to the server in fig. 1, and includes the following steps 202 to 210.
Step 202, data to be analyzed is obtained.
The data to be analyzed refers to data of a monitoring index related to the services provided by the server; the monitoring data corresponding to such an index is generally typical time series data. Time series data refers to data in which an index is recorded in chronological order. Many kinds of data in daily life are time series data: the values of a quantity at different times of a day, such as a stock price, laptop CPU usage, or indoor temperature, form time series data.
In practical applications, the reasons for the service failure are many and are always difficult to avoid, such as network delay, server failure, malicious network attacks, etc., which may cause abnormal fluctuation changes of the time series data, for example, a sudden increase of traffic failure amount, a sudden decrease of success rate, a sudden increase of response delay, etc. Therefore, the data to be analyzed is obtained by monitoring the service index, and whether abnormal conditions such as mutation and the like occur in the monitored data is determined by analyzing the data to be analyzed. The success rate is a ratio of a success amount to a request amount obtained by counting the request amount, the success amount and the failure amount of a certain service in a certain time.
Step 204, performing local weighted regression processing on the data to be analyzed according to the historical data corresponding to the data to be analyzed to obtain the fitting slope of the data to be analyzed.
The historical data refers to data that has already been obtained and for which monitoring analysis has been completed, and the historical data corresponding to the data to be analyzed refers to historical data of the same monitored service index. For example, if the data to be analyzed is the success rate of a certain service, the historical data is the existing historical data of the success rate of that service.
The local weighted regression refers to a process of performing point-by-point local linear (or nonlinear) fitting on time series data, and then calculating a smoothed value of a target point in the time series data according to a fitted curve. Local weighted regression (Lowess) is a widely used data fitting means in regression analysis. The basic idea is that different weights are given to data points near the points to be predicted, and then linear fitting or nonlinear fitting is carried out on the points, so that the problem of under-fitting of linear regression can be solved. In an embodiment, the fitted slope of the data to be analyzed may be calculated by a first order local weighted regression (LOWESS) algorithm.
Step 206, updating the slope data contained in the preset slope set based on the fitting slope to obtain an updated slope set.
Wherein, the slope set stores a fixed amount of slope data in time sequence. For example, 10 slope data may be stored in the slope set, and when a fitting slope corresponding to data to be analyzed is obtained, first, slope data that needs to be updated to the slope set is determined, where the slope data may be the fitting slope, or other slope data corresponding to the fitting slope, such as a slope difference with other fitting slopes, and the like. And then updating the determined slope data to a slope set, wherein the slope set takes the updated slope data as the latest data, and discards the slope data with the longest storage time, so that the slope set always stores only a fixed amount of slope data, such as 10 slope data.
Specifically, the slope data updated to the slope set must be of the same nature as the slope data currently stored in the slope set. If the slope set stores fitting slopes, the slope data updated to the slope set is also a fitting slope; if the slope set stores slope differences, the slope data updated to the slope set is also a slope difference.
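The fixed-size slope set described above can be sketched as follows. This is a minimal illustration, not the implementation of the application: it assumes the example size of 10 from the text, and the names `SLOPE_SET_SIZE` and `update_slope_set` are hypothetical. Python's `collections.deque` with `maxlen` discards the oldest element automatically when a new one is appended to a full set.

```python
from collections import deque

SLOPE_SET_SIZE = 10  # example size taken from the text; an assumption, not a required value


def update_slope_set(slope_set, new_slope):
    """Append the newest slope datum; a full deque drops its oldest entry."""
    slope_set.append(new_slope)
    return slope_set


# Usage: after 12 updates only the 10 most recent slope data remain, in time order.
slope_set = deque(maxlen=SLOPE_SET_SIZE)
for i in range(12):
    update_slope_set(slope_set, float(i))
```

The same container works whether the stored values are fitting slopes or slope differences, as long as one kind of value is used consistently.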
Step 208, obtaining the quantity of abnormal slope data exceeding the allowable fluctuation interval in the updated slope set according to the allowable fluctuation interval predetermined based on the historical data.
The allowable fluctuation interval refers to the fluctuation range within which the monitoring data is considered to be in a normal condition. It should be noted that the allowable fluctuation interval has the same data type as the slope data in the slope set, such as a fitting slope or a slope difference.
The slope data in the updated slope set is compared with the allowable fluctuation interval, abnormal slope data whose value exceeds the allowable fluctuation interval is determined, and the number of abnormal slope data is counted.
Step 210, judging whether the data to be analyzed is abnormal or not according to the quantity of the abnormal slope data and a preset abnormal threshold.
The preset abnormal threshold represents the maximum number of abnormal slope data allowed in the slope set. It can be set according to actual needs: the allowed number of abnormal slope data in the slope set can be set directly, or it can be determined indirectly by setting the allowed proportion of abnormal slope data relative to the total number of slope data in the slope set.
Abnormal data represents an abnormal phenomenon in the data of the monitored object. Taking the case where the preset abnormal threshold is the allowed number of abnormal slope data in the slope set as an example, if the quantity of abnormal slope data is greater than the preset abnormal threshold, it is judged that abnormal data exists.
In the case of abnormal monitoring in real-time calculation, occasionally one or two abnormal values may correspond to accidental jitter of data rather than a true system failure due to the fluctuation of the data. Therefore, the number of abnormal points in a certain time window is usually determined during actual calculation, and only when the number of abnormal points is larger than a set threshold value, the system corresponding to the index is considered to be abnormal, and an alarm is given.
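The counting and decision steps can be sketched as below. This is a hedged illustration under stated assumptions: the function names `count_abnormal` and `is_abnormal` are hypothetical, the interval bounds are treated as inclusive, and an alarm is raised only when the count is strictly larger than the threshold, following the description of the time-window check above.

```python
def count_abnormal(slope_set, lower, upper):
    """Count slope data lying outside the allowable fluctuation interval [lower, upper]."""
    return sum(1 for s in slope_set if s < lower or s > upper)


def is_abnormal(slope_set, lower, upper, abnormal_threshold):
    """Report an anomaly only when the number of abnormal slope data exceeds the threshold."""
    return count_abnormal(slope_set, lower, upper) > abnormal_threshold


# Usage: three of the ten slopes fall outside [-1.0, 1.0].
slopes = [0.0, 0.1, -0.2, 3.0, 2.5, 0.05, -4.0, 0.0, 0.1, 0.0]
alarm = is_abnormal(slopes, -1.0, 1.0, abnormal_threshold=2)
```

With a threshold of 2 the three outliers trigger an alarm; with a threshold of 3 they would be treated as occasional jitter.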
According to the data anomaly monitoring method, the data to be analyzed is obtained, and local weighted regression processing is carried out on it according to the historical data corresponding to the data to be analyzed. This smooths the data to be analyzed and yields its fitting slope, avoiding interference caused by accidental jitter of the data. The allowable fluctuation interval, predetermined based on the historical data, provides a good reference for the fluctuation range, so whether abnormal data exists can be judged accurately according to the number of abnormal slope data exceeding the allowable fluctuation interval in the updated slope set and the preset abnormal threshold, which improves the monitoring accuracy of abnormal data. In addition, during online analysis an abnormal value can be judged by performing only one local weighted regression calculation and retrieving the pre-generated allowable fluctuation interval, so the amount of calculation is very small and the method responds quickly.
In one embodiment, as shown in fig. 3, step 204 of performing local weighted regression processing on the data to be analyzed according to the historical data corresponding to the data to be analyzed to obtain the fitting slope of the data to be analyzed includes steps 302 to 306.
Step 302, determining the position of the real-time fitting window in the historical data corresponding to the data to be analyzed by taking the time of the data to be analyzed as the last time of the real-time fitting window.
Step 304, acquiring fitting data corresponding to the data to be analyzed from the historical data according to the position of the real-time fitting window.
Step 306, performing local weighted regression processing on the data to be analyzed based on the position of the fitting data in the real-time fitting window and the weighting parameters corresponding to the positions of the real-time fitting window to obtain a fitting slope of the data to be analyzed.
The real-time fitting window is a fitting window used when a fitting slope corresponding to newly generated data is calculated in real time, and the fitting window is a sliding calculation window used for realizing local weighted regression. Corresponding to the real-time fitting window, there is also a historical fitting window used in calculating the fitting slope corresponding to the historical data.
Usually, the local weighted regression is implemented by the LOWESS algorithm, which selects the data in [t - w_0, t + w_0] on the left and right sides of time t as fitting data, where w_0 is half the length of the fitting window. The implementation principle is shown in fig. 4, where the discrete points in fig. 4 correspond to the raw data in the history data, and the curve is the smoothed data after fitting. Taking the time t as an example of the target time, in order to obtain the fitting slope at the time t, a fitting window needs to be selected, and weighted linear fitting is performed on the data in the fitting window, that is, the farther a point is from the target point, the lower its weight. The straight line in the graph is the fitted straight line; substituting the time t into the equation of the straight line and calculating the corresponding ordinate gives the fitted point. At the same time, the slope of the straight line corresponds to the rate of change of the time series data at time t. Sliding the calculation window yields the smoothed values at different moments, namely the curve in the graph.
However, in real-time calculation, since data later than time t cannot be obtained, [t - 2w_0, t] is used as the fitting window, and the data at time t is fitted to obtain the slope at time t. To ensure that the slopes are comparable, the real-time fitting window and the historical fitting window both use the window [t - 2w_0, t].
In the embodiment, the first-order LOWESS algorithm is adopted to perform local weighted regression processing on the data to be analyzed, and the slope of the curve is calculated. LOWESS is a data analysis technique commonly used to smooth time series containing random noise. The method is derived on the basis of linear (or nonlinear) fitting, can effectively reduce the influence of random noise, and improves the under-fitting problem generated by linear fitting. The goal of linear fitting is to find a globally optimal linear function that describes the linear relationship between two variables. The basic idea of LOWESS, which does not pursue such a global optimum function, is to perform a point-by-point local linear (or non-linear) fit to the time series data and then calculate a smoothed value of the point according to the fitted curve. By the means, the nonlinear fitting effect can be realized even by adopting simple linear fitting, and different weights are given to different points according to the distances between the different points and the target point when fitting is carried out so as to further reflect the local characteristics of the data. For example, to obtain the fitting value at time t, a real-time fitting window is selected, and weighted linear fitting is performed on the data in the real-time fitting window, that is, the farther from the target point, the lower the weight is. And according to the fitted straight line, substituting t, calculating a corresponding vertical coordinate to obtain a fitted point, wherein the slope of the fitted straight line corresponds to the slope of the t moment. Sliding the calculation window can calculate the smoothed values at different times.
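The trailing-window weighted linear fit can be sketched as below. This is a minimal sketch under assumptions, not the implementation of the application: the text only requires that weights decrease with distance from the target point, so the tricube kernel used here (a common LOWESS choice) is an assumption, as are the function name `trailing_weighted_slope` and the unit spacing between samples. The target point is the last sample of the window, matching the [t - 2*w0, t] window, so `window_len` would be 2*w0 + 1.

```python
def trailing_weighted_slope(values, window_len):
    """Fit a weighted straight line to the last `window_len` samples.

    `values` is a list ordered by time. Returns (slope, smoothed_value),
    both evaluated at the last sample, which is the target point.
    """
    window = [float(v) for v in values[-window_len:]]
    n = len(window)
    if n < 2:
        raise ValueError("need at least two samples to fit a line")
    target = n - 1
    # Tricube weights (assumed kernel): 1 at the target, smaller further away.
    weights = [(1.0 - ((target - x) / n) ** 3) ** 3 for x in range(n)]
    w_sum = sum(weights)
    x_mean = sum(w * x for x, w in enumerate(weights)) / w_sum
    y_mean = sum(w * y for w, y in zip(weights, window)) / w_sum
    sxx = sum(w * (x - x_mean) ** 2 for x, w in enumerate(weights))
    sxy = sum(w * (x - x_mean) * (y - y_mean)
              for x, (w, y) in enumerate(zip(weights, window)))
    slope = sxy / sxx                                # rate of change at the target
    smoothed = y_mean + slope * (target - x_mean)    # fitted value at the target
    return slope, smoothed


# Usage: a perfectly linear series y = 2*i + 1 has slope 2 at every point.
slope, smoothed = trailing_weighted_slope([2 * i + 1 for i in range(20)], window_len=11)
```

Because the slope comes from a fit over the whole window, a single noisy sample changes it only slightly, which is the property the method relies on.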
The fitting data is selected through the real-time fitting window, and the slope of the data to be analyzed is obtained by local weighted regression processing. Because a time window of a certain width is used, the method is insensitive to occasional noise and naturally avoids false alarms caused by accidental data jitter that does not reflect a real system failure, which improves the accuracy of anomaly monitoring.
In one embodiment, as shown in fig. 5, before obtaining the number of abnormal slope data exceeding the allowable fluctuation interval in the updated slope set according to the allowable fluctuation interval predetermined based on the historical data, steps 502 to 504 are further included.
Step 502, local weighted regression processing is carried out on the historical data point by point according to the time, and historical slopes corresponding to all the time of the historical data are obtained.
Step 504, determining an allowable fluctuation interval according to the historical slope corresponding to each time in the historical data.
For an index whose change is relatively smooth and whose value stays within a stable range, such as a success rate index, the fitting slope is used as the comparison object for anomaly evaluation. With the fitting slope as the comparison object, the data stored in the slope set is the fitting slope, and the allowable fluctuation interval is determined based on the fitting slopes of the historical data.
The local weighted regression processing is carried out on the historical data point by point according to the time, the historical data can be smoothed, the fitting slope corresponding to each time of the historical data is obtained, and in order to distinguish the fitting slope obtained by calculating the data to be analyzed and the fitting slope obtained by calculating the historical data, the fitting slope corresponding to the historical data is called as the historical slope.
When abnormal data is judged against the allowable fluctuation interval, the choice of the normal fluctuation interval directly determines the effect of anomaly detection. Specifically, in one embodiment, the mean μ and standard deviation σ of the historical slopes of the index may be calculated, and a reasonable fluctuation interval [μ - 3σ, μ + 3σ] may then be obtained with the statistical 3-sigma method. Data that follows a normal distribution falls within this interval with a probability of 99.73%, so data falling outside the interval is a small-probability event and can be considered abnormal data. The 3-sigma rule states that observations of a normally distributed random variable fall within given intervals around the mean with fixed probabilities: the probability of falling within plus or minus three standard deviations is 99.73%, and falling outside that interval can be regarded as a small-probability event. Therefore, in anomaly detection, data falling more than 3 standard deviations from the mean is generally regarded as abnormal data. Even if the random variable does not follow a normal distribution, the probability of falling within 3 standard deviations of the mean is at least 88.89% according to the Chebyshev inequality.
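The 3-sigma interval can be computed from the historical slopes as sketched below. The function name `three_sigma_interval` is hypothetical, and using the population standard deviation (`statistics.pstdev`) rather than the sample standard deviation is an assumption, since the text does not specify which one is meant.

```python
import statistics


def three_sigma_interval(historical_slopes):
    """Return (mu - 3*sigma, mu + 3*sigma) for the given historical slopes."""
    mu = statistics.mean(historical_slopes)
    sigma = statistics.pstdev(historical_slopes)  # population std; an assumption
    return mu - 3 * sigma, mu + 3 * sigma


# Usage: slopes alternating between -1 and 1 have mean 0 and standard deviation 1.
lower, upper = three_sigma_interval([-1.0, 1.0, -1.0, 1.0])
```

The Chebyshev bound quoted above follows from 1 - 1/3^2 = 8/9, which is about 88.89%.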
In another embodiment, the fluctuation interval may also be calculated from the median and the median absolute deviation (MAD) of the historical slopes. The median and MAD are more robust to outliers in the data set than the mean and standard deviation. Specifically, the allowable fluctuation interval may be [median − 6·MAD, median + 6·MAD]. Setting the allowable fluctuation interval by either of the two methods makes it possible to obtain a more accurate anomaly monitoring result.
In one embodiment, performing local weighted regression on the historical data point by point in time order to obtain the historical slope corresponding to each time of the historical data includes the following processing steps:
As shown in fig. 6, the following steps 602 to 606 are performed on the target data corresponding to any time in the historical data to determine the historical slope.
Step 602, determining the position of the history fitting window in the historical data by taking the time of the target data as the last time of the history fitting window.
Step 604, acquiring historical fitting data corresponding to the target data from the historical data according to the position of the history fitting window.
Step 606, performing local weighted regression processing on the target data based on the positions of the historical fitting data in the history fitting window and the weighting parameters corresponding to the positions of the history fitting window to obtain the historical slope corresponding to the target data.
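Steps 602 to 606 can be sketched as a trailing-window weighted linear fit. The sketch below is illustrative only: the function names are hypothetical, the tricube kernel is an assumed choice of weighting parameters (the text only says that each window position has a weight), and window positions 0, 1, 2, ... are used as the time axis.

```python
def tricube_weights(window_size):
    """Weights for window positions 0..window_size-1.

    Assumption: a tricube kernel on the distance to the last position,
    so the target point (the last one in the window) gets weight 1.
    """
    last = window_size - 1
    span = float(window_size)  # farthest point keeps a small positive weight
    return [(1 - ((last - i) / span) ** 3) ** 3 for i in range(window_size)]


def weighted_slope(values, weights):
    """Slope of the weighted least-squares line through (i, values[i])."""
    total = sum(weights)
    mean_x = sum(w * i for i, w in enumerate(weights)) / total
    mean_y = sum(w * y for w, y in zip(weights, values)) / total
    sxx = sum(w * (i - mean_x) ** 2 for i, w in enumerate(weights))
    sxy = sum(w * (i - mean_x) * (y - mean_y)
              for i, (w, y) in enumerate(zip(weights, values)))
    return sxy / sxx


def historical_slopes(history, window_size):
    """One slope per target point that has a full trailing window."""
    if window_size < 2:
        raise ValueError("window_size must be at least 2")
    weights = tricube_weights(window_size)
    slopes = []
    for end in range(window_size, len(history) + 1):
        window = history[end - window_size:end]  # target point is window[-1]
        slopes.append(weighted_slope(window, weights))
    return slopes
```

Because the target point is always the last position of its window, the same routine could in principle serve both the history fitting window and the real-time fitting window.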
The history fitting window is the window used to fit the historical data and obtain a fitting slope. The history fitting window and the real-time fitting window are both sliding windows in nature: with the same window size, they fit the data in the historical data successively in time order to obtain fitting slopes.
In one embodiment, to make the fitting slopes of the historical data more comparable to those of the real-time data, the target data in the history fitting window and the data to be analyzed in the real-time fitting window each correspond to the last moment of their fitting window.
For some abnormal system conditions, the index curve may change abruptly. For example, the curve with the larger variation in fig. 7(a) shows the success rate of a certain service, whose value normally fluctuates around 80%. At time $T_0$, the system fails, causing the success rate to drop suddenly to around 65%. The success-rate curve can be smoothed by a first-order LOWESS algorithm, and the result is shown by the more gradually changing curve in fig. 7(a). From this smoothed curve, it can be seen that the most obvious feature of the curve at time $T_0$ is that it becomes much steeper, that is, the slope of the curve suddenly changes to a relatively large negative value. With the LOWESS algorithm, the slope of the success rate at each time can be calculated, as shown by the curve in fig. 7(b). The upper and lower horizontal lines in fig. 7(b) show the normal fluctuation interval of the slope calculated from the historical data before time $T_0$. Near time $T_0$, the success-rate slope drops sharply, and by comparing it with the normal fluctuation interval of the historical slopes, the service can be judged to be abnormal at that time.
In actual real-time anomaly detection, because the data fluctuate, an occasional one or two abnormal values may reflect accidental jitter of the data rather than a true system failure. Therefore, the number of abnormal points within a certain time window is usually counted in actual calculation, and only when that number exceeds a set threshold is the system corresponding to the index considered abnormal and an alarm given. It should be noted that the LOWESS algorithm calculates the slope over a time window of a certain width, so it is insensitive to occasional noise and naturally mitigates this situation: as shown in fig. 7(a), occasional jitter of the data has little influence on the slope curve. In addition, the algorithm still performs well when the success rate fluctuates widely. As shown in fig. 8, even if the historical success rate fluctuates greatly, an alarm can still be generated when the success rate drops steeply.
In one embodiment, as shown in fig. 9, step 502 of determining the allowable fluctuation interval according to the historical slope corresponding to each time in the historical data includes steps 902 to 904.
Step 902, obtaining the median and the median absolute deviation of the historical slopes according to the historical slope corresponding to each time in the historical data.
Step 904, determining the allowable fluctuation interval according to the median and the median absolute deviation.
In statistics, the median absolute deviation (MAD) is a robust measure of the variability of a sample of univariate numerical data. It may also denote the population parameter estimated by the MAD calculated from a sample. For a univariate data set $X_1, X_2, \ldots, X_n$, the MAD is defined as the median of the absolute deviations of the data points from the median of the data set: $\mathrm{MAD} = \operatorname{median}\left(\lvert X_i - \operatorname{median}(X) \rvert\right)$. That is, the residuals (deviations) between the data and their median are calculated first, and the MAD is the median of the absolute values of these deviations. For example, consider the data set (1, 1, 2, 2, 4, 6, 9), whose median is 2. The absolute deviations of the data points from 2 are (1, 1, 0, 0, 2, 4, 7), and the median of these deviations is 1, since the sorted absolute deviations are (0, 0, 1, 1, 2, 4, 7). The median absolute deviation of the data is therefore 1.
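The worked example can be checked with a short Python sketch that follows the definition directly. The function name is hypothetical and only the standard library is used.

```python
import statistics


def median_absolute_deviation(data):
    """MAD = median(|x_i - median(data)|)."""
    center = statistics.median(data)
    return statistics.median([abs(x - center) for x in data])


sample = [1, 1, 2, 2, 4, 6, 9]
center = statistics.median(sample)        # median of the data set
mad = median_absolute_deviation(sample)   # median of the absolute deviations
```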
The allowable fluctuation interval is calculated from the median and the median absolute deviation (MAD). The median and MAD are more robust to outliers in the data set than the mean and standard deviation. The upper and lower horizontal lines in fig. 7(b) are the upper and lower bounds of the fluctuation interval calculated from the slope values of the index before time $T_0$: [median − 6·MAD, median + 6·MAD]. In fact, since the median and MAD are relatively stable, to keep the computation within what real-time calculation allows, these two values can be calculated from off-line data and then updated periodically.
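A minimal sketch of the median-and-MAD interval follows. The function names and the parameter `k` are hypothetical. The default of 6 mirrors the multiplier given in the text.

```python
import statistics


def allowable_interval(slopes, k=6.0):
    """Return (median - k*MAD, median + k*MAD) for the given slopes."""
    center = statistics.median(slopes)
    mad = statistics.median([abs(s - center) for s in slopes])
    return center - k * mad, center + k * mad


def count_outside(slopes, interval):
    """Number of slopes lying outside the allowable interval."""
    lower, upper = interval
    return sum(1 for s in slopes if s < lower or s > upper)
```

Because both the median and the MAD are order statistics, a single extreme slope in the history barely moves the interval, whereas it would inflate a mean-and-standard-deviation interval.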
In actual application scenarios, monitoring data have different characteristics: some data indexes change relatively smoothly, while others change periodically, and different approaches can be adopted for the two cases. Specifically, for a scenario in which the data index changes relatively smoothly and its value stays within a stable interval, anomaly analysis is performed based on the fitting slope. For a scenario in which the index changes markedly but with a certain periodicity, anomalies can be analyzed based on the slope difference. In one embodiment, as shown in fig. 10, an implementation of anomaly analysis based on the slope difference is provided. Before obtaining the number of abnormal slope data exceeding the allowable fluctuation interval in the updated slope set according to the allowable fluctuation interval predetermined based on the historical data, the method further includes steps 1002 to 1006.
Step 1002, obtaining historical data in at least three updating periods, and determining a historical slope corresponding to the historical data.
Step 1004, determining baseline data of the historical data in the period and a baseline slope corresponding to the baseline data to obtain a slope difference between the historical slope and the baseline slope at each moment.
Because abnormal data may occur in a certain period of the historical data, historical data of at least three periods are used so that at least three data points are available at each relative time within the period. This effectively reduces or avoids the interference of abnormal data with the determination of the baseline data and improves the reliability of the baseline data.
In particular, the slope difference may be a signed numerical difference. For example, the slope difference may be embodied as a slope increment or a slope decrement.
In one embodiment, determining the baseline data of the historical data within the period and the baseline slope corresponding to the baseline data, and obtaining the slope difference between the historical slope and the baseline slope at each time, includes: determining the baseline data at each relative time from the historical data corresponding to the same relative time in different periods, based on the relative time relationship between the time and the period; performing local weighted regression processing on the baseline data point by point according to the relative time to obtain the baseline slope at each relative time; and taking the difference between the historical slope corresponding to the historical data and the baseline slope at the corresponding relative time to obtain the slope difference.
Step 1006, determining the allowable fluctuation interval according to the median and the median absolute deviation of the slope differences.
Updating the slope data contained in the preset slope set based on the fitting slope to obtain the updated slope set includes step 1008.
Step 1008, updating the slope increment of the fitting slope relative to the baseline slope to a preset slope set to obtain an updated slope set.
The data to be analyzed changes periodically, which means that the service indexes corresponding to the target to be analyzed change regularly in a period, that is, the service indexes at the same relative moment in different periods have the same or similar data characteristics. The relative time refers to the relative time relationship between the time and the period. For example, the first time in the first period and the first time in the second period are the same relative time in different periods.
The baseline data refers to data that is obtained by analyzing the historical data period by period and that describes the periodic characteristics. The baseline data may be obtained by taking the mean or the median of the data corresponding to the same relative time in different periods or, more accurately, may be the periodic component calculated by a time-series decomposition algorithm. Its data amount is the same as that of one period.
For data that change periodically, the slope difference is used as the comparison object for anomaly analysis. With the slope difference as the comparison object, the data stored in the slope set are the differences between the fitting slope and the baseline slope, and the allowable fluctuation interval is determined from the differences between the historical slopes of the historical data and the baseline slope. Comparing slope differences in this way makes the method applicable to anomaly monitoring of periodically changing data and yields an accurate and effective anomaly monitoring result.
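The baseline and slope-difference calculation can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the median is used to build the baseline (the text also allows the mean or a time-series decomposition), the history is assumed to start at relative time 0, and the baseline slopes are assumed to have been obtained separately by local weighted regression on the baseline data.

```python
import statistics


def baseline_from_periods(history, period_length):
    """Median across periods at each relative time within the period."""
    n_periods = len(history) // period_length
    if n_periods < 3:
        raise ValueError("at least three full periods of history are required")
    return [
        statistics.median([history[p * period_length + r] for p in range(n_periods)])
        for r in range(period_length)
    ]


def slope_differences(historical_slopes, baseline_slopes):
    """Subtract the baseline slope at the same relative time from each slope.

    Assumption: historical_slopes[0] corresponds to relative time 0.
    """
    period_length = len(baseline_slopes)
    return [s - baseline_slopes[i % period_length]
            for i, s in enumerate(historical_slopes)]
```

In the first test case below, the abnormal value 10 in the third period does not affect the baseline, which illustrates why at least three periods are used.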
In one embodiment, as shown in fig. 11, a data anomaly monitoring method is provided, which can be applied in a scenario where a data index changes relatively gently and a value is maintained within a stable interval range, and includes the following steps 1102 to 1122.
Step 1102, for the target data corresponding to any time in the historical data, determining the position of the history fitting window in the historical data by taking the time of the target data as the last time of the history fitting window.
Step 1104, acquiring history fitting data corresponding to the target data from the historical data according to the position of the history fitting window.
Step 1106, performing local weighted regression on the target data based on the positions of the historical fitting data in the history fitting window and the weighting parameters corresponding to the positions of the history fitting window to obtain the historical slope corresponding to the target data.
Step 1108, obtaining the median and the median absolute deviation of the historical slopes according to the historical slope corresponding to each time in the historical data.
Step 1110, determining the allowable fluctuation interval according to the median and the median absolute deviation of the historical slopes.
Step 1112, acquiring data to be analyzed, and determining a position of the real-time fitting window in the historical data corresponding to the data to be analyzed by taking a time of the data to be analyzed as a last time of the real-time fitting window.
Step 1114, obtaining fitting data corresponding to the data to be analyzed from the historical data according to the position of the real-time fitting window.
Step 1116, performing local weighted regression on the data to be analyzed based on the position of the fitting data in the real-time fitting window and the weighting parameters corresponding to the positions of the real-time fitting window to obtain the fitting slope of the data to be analyzed.
Step 1118, based on the fitting slope, the slope data included in the preset slope set is updated, so as to obtain an updated slope set.
Step 1120, obtaining the number of abnormal slope data exceeding the allowable fluctuation interval in the updated slope set according to the allowable fluctuation interval.
Step 1122, determining whether the data to be analyzed is abnormal or not according to the number of the abnormal slope data and a preset abnormal threshold.
In one embodiment, as shown in fig. 12, a data anomaly monitoring method is provided, which can be applied to a scenario where the index changes significantly but has a certain periodicity, and includes the following steps 1202 to 1222.
Step 1202, obtaining historical data in at least three updating periods, and determining a historical slope corresponding to the historical data.
Step 1204, determining the baseline data at each relative time from the historical data corresponding to the same relative time in different periods, based on the relative time relationship between the time and the period.
Step 1206, performing local weighted regression on the baseline data point by point according to the relative time to obtain the baseline slope at each relative time.
Step 1208, taking the difference between the historical slope corresponding to the historical data and the baseline slope at the corresponding relative time to obtain the slope difference.
Step 1210, determining the allowable fluctuation interval according to the median and the median absolute deviation of the slope differences.
Step 1212, acquiring the data to be analyzed, and determining the position of the real-time fitting window in the historical data corresponding to the data to be analyzed by taking the time of the data to be analyzed as the last time of the real-time fitting window.
Step 1214, acquiring fitting data corresponding to the data to be analyzed from the historical data according to the position of the real-time fitting window.
Step 1216, performing local weighted regression on the data to be analyzed based on the position of the fitting data in the real-time fitting window and the weighting parameters corresponding to the positions of the real-time fitting window to obtain the fitting slope of the data to be analyzed.
Step 1218, updating the slope increment of the fitting slope relative to the baseline slope to the preset slope set to obtain the updated slope set.
Step 1220, obtaining the number of abnormal slope data exceeding the allowable fluctuation interval in the updated slope set according to the allowable fluctuation interval.
Step 1222, determining whether the data to be analyzed is abnormal or not according to the number of the abnormal slope data and a preset abnormal threshold.
The application also provides an application scenario to which the data anomaly monitoring method is applied. Specifically, the application of the method in a scenario in which the index changes relatively gently is shown in scenario one below:
Scenario one: the data value of the monitored object stays within a stable interval, for example when a rate index is monitored. The flow in this scenario is shown in fig. 13. The calculation is divided into two parts, strategy calculation and real-time calculation, whose update frequencies are per day and per minute, respectively.
Referring to fig. 13, the dashed box in the upper part is the strategy calculation part, whose purpose is to calculate the allowable fluctuation interval. Since the allowable fluctuation interval is relatively stable, the update frequency can be low, with a period of one day or half a day, where Tupdate is the update time.
The dashed box in the lower part is the real-time calculation part. To reduce the influence of noise, the slope data of the 10 min before the current time are cached in memory, and whether an anomaly exists is judged from the number of points outside the fluctuation interval within the 10 min window. The slope data are updated in real time, and their length is always kept at 10 min.
Specifically, the server acquires monitoring time-series data, with each APP as a monitoring object, at a data acquisition granularity of 1 min, and completes the online processing of the allowable fluctuation interval with a period of one day. The processing is as follows: the historical data of one period before the update time are acquired, slopes are calculated point by point on the historical data with the LOWESS algorithm to obtain 1440 slopes, and the allowable fluctuation interval [median − 6·MAD, median + 6·MAD] is calculated from the median and the median absolute deviation (MAD) of the 1440 slopes.
Taking the moment corresponding to the most recently acquired data to be analyzed as $t_0$, for example, the server calculates the slope at time $t_0$ with the LOWESS algorithm from the historical data in the range $2w_0$ before $t_0$. At time $t_{-1}$, the memory caches the slopes corresponding to $[k_{-10}, k_{-9}, \ldots, k_{-1}]$. At time $t_0$, the memory caches the slopes corresponding to $[k_{-9}, k_{-8}, \ldots, k_0]$. The cached slopes $[k_{-9}, k_{-8}, \ldots, k_0]$ are then compared with the allowable fluctuation interval [median − 6·MAD, median + 6·MAD]. When the number of slopes outside the allowable fluctuation interval is greater than 7, an anomaly is judged to have occurred. Otherwise, no anomaly is judged to have occurred.
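The real-time part of scenario one can be sketched with a fixed-length queue. The class and attribute names below are hypothetical. The window of 10 slopes and the threshold of 7 are the values given in the text, and the interval is assumed to have been produced beforehand by the strategy calculation.

```python
from collections import deque


class SlopeSet:
    """Fixed-size, time-ordered cache of the most recent slopes."""

    def __init__(self, lower, upper, size=10, abnormal_threshold=7):
        self.lower = lower
        self.upper = upper
        self.abnormal_threshold = abnormal_threshold
        # deque(maxlen=...) drops the oldest slope when a new one arrives.
        self.slopes = deque(maxlen=size)

    def update(self, slope):
        """Cache the newest slope and report whether an anomaly is judged."""
        self.slopes.append(slope)
        abnormal = sum(1 for k in self.slopes
                       if k < self.lower or k > self.upper)
        return abnormal > self.abnormal_threshold
```

A caller would invoke `update` once per minute with the slope at the current moment and raise an alarm when it returns `True`.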
The application further provides an application scenario to which the data anomaly monitoring method is applied. Specifically, the application of the method in a scenario in which the index changes markedly but with a certain periodicity is shown in scenario two below:
Scenario two: the periodic component is extracted by a time-series decomposition algorithm and used as the baseline, and the slope of the baseline, $\mathrm{Slope}_{periodic}(t)$, is then calculated. Because $\mathrm{Slope}_{periodic}(t) = \mathrm{Slope}_{periodic}(t + T)$, where $T$ is the length of one period, this slope can be generalized to any time. In this case, $\mathrm{Slope}(t) - \mathrm{Slope}_{periodic}(t)$ can be used as the statistic in place of the slope index $\mathrm{Slope}(t)$ of scenario one. The calculation flow in this scenario is shown in fig. 14, in which the data are assumed to have a period of one day.
Referring to fig. 14, the dashed box in the upper part is the strategy calculation part. Compared with scenario one, since the index data are periodic, a baseline can be extracted from the historical data, and the degree to which the slope of the historical data deviates from that of the baseline is then calculated to obtain the fluctuation interval.
The lower dashed box is the real-time calculation part, which also calculates the slope of the index data within 10min before the current time, but the index of the anomaly measure at this time is the increment of the slope relative to the slope of the baseline data. In this figure, it is assumed that the index data is in cycles of days, and 7 cycles of history data are used in calculating the baseline data.
Specifically, the server acquires monitoring time-series data, with each APP as a monitoring object, at a data acquisition granularity of 1 min. To calculate the allowable fluctuation interval, one day is taken as one period, data of 7 periods are acquired, and the online processing is completed as follows. The historical data of the 7 periods before the update time are acquired, and slopes are calculated point by point on them with the LOWESS algorithm to obtain 7 × 1440 slopes. The median of the 7 data points at the same time in different periods is taken to obtain 1440 baseline data, and slopes are calculated point by point on the baseline data with the LOWESS algorithm to obtain 1440 baseline slopes. Based on the periodicity, the differences between the 7 × 1440 slope values of the 7 periods and the 1440 baseline slope values are taken to obtain 7 × 1440 slope differences, and the allowable fluctuation interval [median − 6·MAD, median + 6·MAD] is calculated from the median and the median absolute deviation (MAD) of these slope differences.
Taking the moment corresponding to the most recently acquired data to be analyzed as $t_0$, for example, the server calculates the slope at time $t_0$ with the LOWESS algorithm from the historical data in the range $2w_0$ before $t_0$. At time $t_{-1}$, the memory caches the slopes corresponding to $[k_{-10}, k_{-9}, \ldots, k_{-1}]$. At time $t_0$, the memory caches the slopes corresponding to $[k_{-9}, k_{-8}, \ldots, k_0]$. The baseline slope at the corresponding moment is then subtracted from each cached slope, so that at time $t_0$ the memory holds the slope differences $[\Delta k_{-9}, \Delta k_{-8}, \ldots, \Delta k_0]$. The cached slope differences are compared with the allowable fluctuation interval [median − 6·MAD, median + 6·MAD]. When the number of slope differences outside the allowable fluctuation interval is greater than 7, an anomaly is judged to have occurred. Otherwise, no anomaly is judged to have occurred.
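The real-time part of scenario two differs from scenario one only in the statistic that is cached. A minimal sketch follows, in which the function and parameter names are hypothetical, the baseline slopes and the interval are assumed to come from the strategy calculation, and `minute_index` is assumed to count minutes from relative time 0 of a period.

```python
from collections import deque


def update_and_judge(delta_set, slope, minute_index, baseline_slopes,
                     interval, abnormal_threshold=7):
    """Cache Slope(t) - Slope_periodic(t) and judge the cached window."""
    period_length = len(baseline_slopes)
    # Slope_periodic(t) = Slope_periodic(t + T): wrap the index into one period.
    delta = slope - baseline_slopes[minute_index % period_length]
    delta_set.append(delta)
    lower, upper = interval
    abnormal = sum(1 for d in delta_set if d < lower or d > upper)
    return abnormal > abnormal_threshold


# A 10-slot cache, matching the 10 min window described in the text.
delta_set = deque(maxlen=10)
```

For data with a period of one day at 1 min granularity, `baseline_slopes` would hold 1440 values.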
It can be seen that the strategy calculation of both scenario one and scenario two is performed online at a low update frequency, and an anomaly can be judged by carrying out a single online LOWESS calculation and then pulling the generated strategy data, so the amount of computation is very small.
It should be understood that, although the steps in the flowcharts of the above embodiments are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least some of the steps in the flowcharts of the above embodiments may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 15, a data anomaly monitoring apparatus 1500 is provided, which may be a part of a computer device using a software module or a hardware module, or a combination of the two, and specifically includes: a data to be analyzed obtaining module 1502, a fitting slope processing module 1504, a slope set updating module 1506, a fluctuation analysis module 1508, and an anomaly determination module 1510, wherein:
The data to be analyzed obtaining module 1502 is configured to obtain data to be analyzed.
The fitting slope processing module 1504 is configured to perform local weighted regression processing on the data to be analyzed according to the historical data corresponding to the data to be analyzed to obtain the fitting slope of the data to be analyzed.
The slope set updating module 1506 is configured to update slope data included in a preset slope set based on the fitting slope to obtain an updated slope set, where the slope set stores a fixed amount of slope data in a time sequence.
The fluctuation analysis module 1508 is configured to obtain the number of abnormal slope data in the updated slope set that exceed the allowable fluctuation interval, according to the allowable fluctuation interval predetermined based on the historical data.
The anomaly determination module 1510 is configured to determine whether the data to be analyzed is abnormal according to the number of the abnormal slope data and a preset anomaly threshold.
In one embodiment, the fitting slope processing module is further configured to determine a position of the real-time fitting window in the historical data corresponding to the data to be analyzed, with a time at which the data to be analyzed is located as a last time of the real-time fitting window; according to the position of the real-time fitting window, obtaining fitting data corresponding to the data to be analyzed from the historical data; and performing local weighted regression processing on the data to be analyzed based on the position of the fitting data in the real-time fitting window and the weighting parameters corresponding to the positions of the real-time fitting window to obtain the fitting slope of the data to be analyzed.
In one embodiment, the data anomaly monitoring device further comprises an allowable fluctuation interval determining module, which is used for performing local weighted regression processing on the historical data point by point according to time to obtain a historical slope corresponding to each time of the historical data; and determining an allowable fluctuation interval according to the historical slope corresponding to each moment in the historical data.
In one embodiment, the allowable fluctuation interval determining module is further configured to determine, for target data corresponding to any time in the historical data, the position of the history fitting window in the historical data by taking the time of the target data as the last time of the history fitting window; acquire historical fitting data corresponding to the target data from the historical data according to the position of the history fitting window; and perform local weighted regression processing on the target data based on the positions of the historical fitting data in the history fitting window and the weighting parameters corresponding to the positions of the history fitting window to obtain the historical slope corresponding to the target data.
In one embodiment, the allowable fluctuation interval determination module is further configured to obtain a median and a median absolute deviation in the historical slope according to the historical slope corresponding to each time in the historical data; and determining an allowable fluctuation interval according to the median and the absolute deviation of the median.
In one embodiment, the data to be analyzed is periodically changed; the allowable fluctuation interval determining module is further used for acquiring historical data in at least three updating periods and determining a historical slope corresponding to the historical data; determining baseline data of the historical data in a period and a baseline slope corresponding to the baseline data to obtain slope difference values of the historical slope and the baseline slope at various moments; determining an allowable fluctuation interval according to the median of the slope difference and the absolute deviation of the median; the slope set updating module is further used for updating the slope increment of the fitting slope relative to the baseline slope to a preset slope set to obtain an updated slope set.
In one embodiment, the allowable fluctuation interval determining module is further configured to determine, based on a relative time relationship between a time and a period, baseline data of a relative time according to historical data corresponding to the same relative time in different periods; carrying out local weighted regression processing on the baseline data point by point according to the relative time to obtain the baseline slope of the relative time; and performing difference processing on the historical slope corresponding to the historical data and the baseline slope corresponding to the relative moment to obtain a slope difference value.
For specific limitations of the data anomaly monitoring device, reference may be made to the above limitations of the data anomaly monitoring method, which are not repeated here. All or part of the modules in the data anomaly monitoring device can be implemented by software, hardware, or a combination thereof. The modules can be embedded in, or independent of, a processor in the computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can call them and execute the operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a server, and its internal structure diagram may be as shown in fig. 16. The computer device includes a processor, a memory, and a network interface connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the computer device is used for storing anomaly monitoring data. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a data anomaly monitoring method.
Those skilled in the art will appreciate that the architecture shown in fig. 16 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is further provided, which includes a memory and a processor, the memory stores a computer program, and the processor implements the steps of the above method embodiments when executing the computer program.
In an embodiment, a computer-readable storage medium is provided, in which a computer program is stored which, when being executed by a processor, carries out the steps of the above-mentioned method embodiments.
In one embodiment, a computer program product or computer program is provided that includes computer instructions stored in a computer-readable storage medium. The computer instructions are read by a processor of a computer device from a computer-readable storage medium, and the computer instructions are executed by the processor to cause the computer device to perform the steps in the above-mentioned method embodiments.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by hardware related to instructions of a computer program, which can be stored in a non-volatile computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database or other medium used in the embodiments provided herein can include at least one of non-volatile and volatile memory. Non-volatile Memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash Memory, optical storage, or the like. Volatile Memory can include Random Access Memory (RAM) or external cache Memory. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM), among others.
The technical features of the above embodiments can be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as a combination of these technical features involves no contradiction, it should be considered to be within the scope of this specification.
The above embodiments express only several implementations of the present application, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and improvements without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the scope of protection of this patent shall be subject to the appended claims.