CN103747477B - Network traffic analysis and prediction method and device
Abstract
The invention discloses a network traffic analysis and prediction method and device. First, the global features of the traffic time series of each base station under test are extracted; clustering is then performed on the extracted global features; based on the clustering result, attribute features of the traffic data are collected; finally, traffic is predicted from the attribute features of the traffic data and the traffic at the previous moment. The invention extracts global features of the time series and uses the similarity of those features to reflect the similarity of the time series, capturing the dynamic behaviour of a series as it changes over time and yielding more reasonable results. Describing a large-scale time series with a small number of features also improves the robustness of the similarity judgment and reduces the complexity of the clustering computation. Attribute features related to the traffic data are collected according to the clustering result, and traffic is predicted jointly from the traffic and the attribute features; because the prediction draws on more information, prediction accuracy improves accordingly and network resources can be allocated rationally.
    Description
Technical field
      The present invention relates to the field of communication technology, and in particular to a network traffic analysis and prediction method and device.
    Background
      In communication network optimization, network traffic analysis and prediction are very important steps and are significant for the optimal allocation of network resources. Whether the traffic prediction is accurate, whether the prediction result is interpretable, and whether the prediction result agrees with the actual traffic data all directly affect the investment in and construction scale of the network. Preliminary analysis of the traffic is the key to traffic prediction and directly affects its accuracy.
      In the prior art, traffic is analysed using the raw time series: the similarity between time series is measured by Euclidean distance, and clustering is performed according to that similarity. When predicting traffic, historical traffic data is used to predict unknown traffic data, with traditional methods such as regression prediction and time series analysis.
      Existing methods attend only to the differences between time series values at corresponding time points. Measuring the similarity between time series by Euclidean distance makes the result vulnerable to the values at individual time points, so the robustness of the result is lost; and using only traffic data leads to poor prediction performance.
    Summary of the invention
      In view of the above, the present invention proposes a network traffic analysis and prediction method that can improve prediction accuracy and allow network resources to be allocated rationally.
      To achieve this object, the technical solution of the invention is as follows.
      A network traffic analysis and prediction method, comprising the following steps:
      extracting the global features of the traffic time series of each base station under test;
      clustering according to the extracted global features;
      collecting attribute features of the traffic data according to the clustering result;
      predicting traffic according to the attribute features of the traffic data and the traffic at the previous moment.
      To address the problems of the prior art, the invention also proposes a network traffic analysis and prediction device that remedies the poor robustness of existing traffic analysis and the low accuracy of existing traffic prediction, and is suitable for practical application.
      Specifically, a network traffic analysis and prediction device comprises:
      an extraction module, for extracting the global features of the traffic time series of each base station under test;
      a clustering module, for clustering according to the extracted global features;
      a collection module, for collecting attribute features of the traffic data according to the clustering result;
      a prediction module, for predicting traffic according to the attribute features of the traffic data and the traffic at the previous moment.
      Compared with the prior art, the beneficial effects of the invention are as follows. The network traffic analysis and prediction method and device first extract the global features of the traffic time series of each base station under test, then cluster according to the extracted global features, then collect attribute features of the traffic data according to the clustering result, and finally predict traffic according to the attribute features of the traffic data and the traffic at the previous moment. With this technique, global features of the time series are extracted and their similarity is used to reflect the similarity of the time series, capturing the dynamic behaviour of a series as it changes over time and yielding more reasonable results. Describing a large-scale time series with a small number of features also improves the robustness of the similarity judgment and reduces the complexity of the clustering computation. Attribute features related to the traffic data are collected according to the clustering result, and traffic is predicted jointly from the traffic and the attribute features; because the prediction draws on more information, prediction accuracy improves accordingly and network resources can be allocated rationally.
    Brief description of the drawings
      Fig. 1 is a flow chart of the network traffic analysis and prediction method in one embodiment;
      Fig. 2 is a structural diagram of the network traffic analysis and prediction device in one embodiment.
    Detailed description
      To make the objects, technical solutions, and advantages of the invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the embodiments described here only explain the invention and do not limit its scope of protection.
      The network traffic analysis and prediction method in one embodiment is shown in Fig. 1. The method includes:
      Step S101: extract the global features of the traffic time series of each base station under test;
      Step S102: cluster according to the extracted global features;
      Step S103: collect attribute features of the traffic data according to the clustering result;
      Step S104: predict traffic according to the attribute features of the traffic data and the traffic at the previous moment.
      As the above description shows, the method predicts traffic data jointly from the traffic and the attribute features, which improves the accuracy of network traffic prediction and allows network resources to be allocated rationally.
      As one embodiment, the global features include any one or more of a trend feature, a seasonality feature, a kurtosis feature, a skewness feature, an autocorrelation coefficient feature, a nonlinearity feature, and a spectral feature.
      As one embodiment, the traffic time series is obtained by collecting the traffic data of each base station under test once a day, continuously for half a year.
      As one embodiment, the trend feature is measured by the Z statistic: a Z statistic greater than zero indicates a rising trend, and a Z statistic less than zero indicates a falling trend. The Z statistic is calculated as $Z = S / \sqrt{\mathrm{Var}(S)}$, where $S$ is a statistic that follows a normal distribution and $\mathrm{Var}(S)$ is the variance of $S$. $S$ is calculated as $S = \sum_{k=1}^{T-1} \sum_{j=k+1}^{T} \mathrm{sgn}(x_j - x_k)$, and $\mathrm{Var}(S)$ is calculated as $\mathrm{Var}(S) = T(T-1)(2T+5)/18$. Here $x_t$, $t = 1, 2, \ldots, T$, is the traffic time series, $T$ is its length, $x_j$ is its value at time $j$, and $x_k$ is its value at time $k$. The sign function is $\mathrm{sgn}(x_j - x_k) = \begin{cases} 1, & x_j - x_k > 0 \\ 0, & x_j - x_k = 0 \\ -1, & x_j - x_k < 0 \end{cases}$
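      The trend statistic described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the uncorrected form $Z = S / \sqrt{\mathrm{Var}(S)}$ with $S$ the sum of signs over all ordered pairs; the function names `sgn` and `mann_kendall_z` and the toy series are hypothetical, not taken from the patent.

```python
import math

def sgn(d):
    """Sign function: 1 if d > 0, 0 if d == 0, -1 if d < 0."""
    return (d > 0) - (d < 0)

def mann_kendall_z(x):
    """Return (S, Z) for the series x, using Var(S) = T(T-1)(2T+5)/18.

    Assumption: Z = S / sqrt(Var(S)), without any continuity correction.
    """
    T = len(x)
    # S sums the sign of every later value minus every earlier value.
    s = sum(sgn(x[j] - x[k]) for k in range(T - 1) for j in range(k + 1, T))
    var_s = T * (T - 1) * (2 * T + 5) / 18
    return s, s / math.sqrt(var_s)

# Toy data (made up for illustration): a strictly rising and a strictly falling series.
rising = [1, 2, 3, 4, 5, 6]
s_up, z_up = mann_kendall_z(rising)
s_down, z_down = mann_kendall_z(rising[::-1])
```

      For the rising series every one of the 15 pairs contributes +1, so `z_up` is positive; reversing the series flips the sign.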
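      The seasonality feature is reflected by the average period, which is calculated as follows. A fast Fourier transform (FFT) is applied to the traffic time series $x_t$, $t = 1, 2, \ldots, T$, where $T$ is the length of the series, giving $X(f_k) = \sum_{t=1}^{T} x_t e^{-i 2\pi f_k t}$. The frequencies used are $f_k = k/T$, $k = 1, 2, \ldots, \lfloor T/2 \rfloor$. The average frequency is then calculated as $\bar{f} = \sum_k f_k |X(f_k)|^2 / \sum_k |X(f_k)|^2$, and the average period as $\bar{P} = 1/\bar{f}$.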
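      The average-period computation can be sketched as follows. The sketch makes several assumptions that the text does not confirm: the average frequency is taken as a power-weighted mean over the frequencies $k/T$ for $k = 1, \ldots, \lfloor T/2 \rfloor$, the series mean is removed first, and a plain discrete Fourier transform stands in for the FFT (the two give the same coefficients). The name `average_period` and the test signal are hypothetical.

```python
import cmath
import math

def average_period(x):
    """Average period = 1 / (power-weighted mean frequency) of the series x."""
    T = len(x)
    mean = sum(x) / T
    centered = [v - mean for v in x]  # remove the zero-frequency component
    num = den = 0.0
    for k in range(1, T // 2 + 1):
        # Direct DFT coefficient at frequency k / T (an FFT would give the same value).
        coeff = sum(centered[t] * cmath.exp(-2j * math.pi * k * t / T) for t in range(T))
        power = abs(coeff) ** 2
        num += (k / T) * power
        den += power
    if den == 0.0:
        return math.inf  # a constant series has no periodic component
    return den / num

# Toy signal (made up): four full cycles of a sine with a 7-sample period.
weekly = [math.sin(2 * math.pi * t / 7) for t in range(28)]
period = average_period(weekly)
```

      For this signal nearly all the power falls on the single frequency 4/28, so the computed average period is approximately 7 samples.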
      The kurtosis in the kurtosis feature is calculated as $K = \frac{1}{T\sigma^4} \sum_{t=1}^{T} (x_t - \bar{x})^4$, where $x_t$, $t = 1, 2, \ldots, T$, is the traffic time series, $T$ is its length, $\bar{x}$ is its mean, and $\sigma$ is its sample standard deviation;
      The skewness in the skewness feature is calculated as $SK = \frac{1}{T\sigma^3} \sum_{t=1}^{T} (x_t - \bar{x})^3$, where $x_t$, $t = 1, 2, \ldots, T$, is the traffic time series, $T$ is its length, $\bar{x}$ is its mean, and $\sigma$ is its sample standard deviation;
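      The kurtosis and skewness features can be sketched together. This is an illustration under assumptions: the moments are divided by $T$, $\sigma$ is the sample standard deviation with an $n-1$ denominator, and no constant is subtracted from the kurtosis. The function name `skewness_kurtosis` and the toy data are hypothetical.

```python
import math

def skewness_kurtosis(x):
    """Return (skewness, kurtosis) of x from standardized third and fourth moments."""
    T = len(x)
    mean = sum(x) / T
    # Sample standard deviation (n - 1 in the denominator), an assumption of this sketch.
    sigma = math.sqrt(sum((v - mean) ** 2 for v in x) / (T - 1))
    skew = sum((v - mean) ** 3 for v in x) / (T * sigma ** 3)
    kurt = sum((v - mean) ** 4 for v in x) / (T * sigma ** 4)
    return skew, kurt

# Toy data (made up): a symmetric series and a series with one large spike.
sym_skew, sym_kurt = skewness_kurtosis([1, 2, 3, 4, 5])
spike_skew, spike_kurt = skewness_kurtosis([1, 1, 1, 1, 10])
```

      The symmetric series has zero skewness, while the spike pulls the skewness positive.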
      The autocorrelation coefficient feature is measured by the Ljung-Box Q statistic, which tests whether the traffic time series is a white-noise process. It is calculated as $Q = T(T+2) \sum_{\tau=1}^{p} \frac{r_\tau^2}{T-\tau}$, where $T$ is the length of the traffic time series, $p$ is the maximum lag order considered, $\tau$ is the lag, and $r_\tau$ is the autocorrelation coefficient of the traffic time series, calculated as $r_\tau = \frac{\sum_{t=\tau+1}^{T} (x_t - \bar{x})(x_{t-\tau} - \bar{x})}{\sum_{t=1}^{T} (x_t - \bar{x})^2}$, where $x_t$, $t = 1, 2, \ldots, T$, is the traffic time series and $\bar{x}$ is its mean;
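      A short sketch of the Ljung-Box Q statistic, assuming the standard definition of the sample autocorrelation $r_\tau$ (lagged cross-products about the mean divided by the total sum of squares). The names `autocorr` and `ljung_box_q` and the alternating toy series are hypothetical.

```python
def autocorr(x, lag):
    """Sample autocorrelation coefficient r_lag of the series x."""
    T = len(x)
    mean = sum(x) / T
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, T))
    return num / denom

def ljung_box_q(x, p):
    """Ljung-Box Q = T(T+2) * sum_{tau=1..p} r_tau^2 / (T - tau)."""
    T = len(x)
    return T * (T + 2) * sum(autocorr(x, tau) ** 2 / (T - tau) for tau in range(1, p + 1))

# Toy data (made up): a strongly alternating series is far from white noise.
alternating = [1, -1] * 10
r1 = autocorr(alternating, 1)
q1 = ljung_box_q(alternating, 1)
```

      A large Q relative to the chi-square distribution with p degrees of freedom argues against the white-noise hypothesis; the alternating series gives a lag-1 autocorrelation of -0.95 and a correspondingly large Q.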
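      The nonlinearity feature is reflected by the BDS test statistic, which tests whether the traffic time series is independent and identically distributed. For the traffic time series $x_t$, $t = 1, 2, \ldots, T$, let the observations at times $s$ and $w$ be $x_s$ and $x_w$; the observation pairs $(x_s, x_w)$ are arranged as
      $\{(x_s, x_w), (x_{s+1}, x_{w+1}), (x_{s+2}, x_{w+2}), \ldots, (x_{s+m-1}, x_{w+m-1})\}$, where $m$ is the embedding dimension. The BDS statistic is calculated as $W(N, m, r) = \sqrt{N}\, \frac{C(N, m, r) - C(N, 1, r)^m}{\sigma'(N, m, r)}$, where $r$ is the distance threshold, $C(N, m, r)$ is the correlation integral, and $\sigma'(N, m, r)$ is the estimate of the asymptotic standard deviation of $C(N, m, r) - C(N, 1, r)^m$. $C(N, m, r)$ is calculated as $C(N, m, r) = \frac{2}{N(N-1)} \sum_{1 \le s < w \le N} I_r(x_s^m, x_w^m)$, where $x_t^m = (x_t, x_{t+1}, \ldots, x_{t+m-1})$ is an $m$-dimensional vector and $I_r(x_s^m, x_w^m) = \begin{cases} 1, & \lVert x_s^m - x_w^m \rVert < r \\ 0, & \text{otherwise} \end{cases}$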
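      The correlation integral at the heart of the BDS statistic can be sketched as below. This covers only $C(N, m, r)$ and the numerator $C(N, m, r) - C(N, 1, r)^m$; the asymptotic standard deviation $\sigma'$ is omitted. The sketch assumes the maximum norm for the distance between embedded vectors and takes $N$ as the number of embedded vectors available for each $m$; `correlation_integral` and the toy series are hypothetical.

```python
def correlation_integral(x, m, r):
    """Fraction of pairs of m-dimensional embedded vectors closer than r (max norm)."""
    n = len(x) - m + 1  # number of m-dimensional vectors that fit in the series
    vectors = [x[i:i + m] for i in range(n)]
    close = 0
    for s in range(n - 1):
        for w in range(s + 1, n):
            # Indicator I_r: 1 when every coordinate differs by less than r.
            if max(abs(a - b) for a, b in zip(vectors[s], vectors[w])) < r:
                close += 1
    return 2 * close / (n * (n - 1))

# Toy data (made up): a perfectly alternating, hence strongly dependent, series.
series = [0, 1, 0, 1, 0, 1]
c_2 = correlation_integral(series, 2, 0.5)
c_1 = correlation_integral(series, 1, 0.5)
bds_numerator = c_2 - c_1 ** 2
```

      Under independence the numerator should be close to zero; here it is clearly positive because neighbouring values determine each other.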
      The spectral feature is the first two coefficients of the discrete Fourier transform. The spectral feature is extracted from the discrete Fourier transform coefficients, and the first $n$ coefficients may be used as the spectral feature: because the high-frequency part of a signal is relatively unimportant, most of the energy in the frequency domain is concentrated in the first few coefficients.
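      Extracting the leading discrete Fourier transform coefficients can be sketched as follows. Whether the zero-frequency coefficient counts as the first coefficient, and whether magnitudes or complex values are used as features, is not stated in the text; this sketch starts at index 0 and returns complex values. The name `first_dft_coefficients` is hypothetical.

```python
import cmath
import math

def first_dft_coefficients(x, n):
    """Return the first n DFT coefficients X_0 .. X_{n-1} of the series x."""
    T = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / T) for t in range(T))
        for k in range(n)
    ]

# Toy data (made up for illustration).
coeffs = first_dft_coefficients([1, 2, 3, 4], 2)
magnitudes = [abs(c) for c in coeffs]  # one possible real-valued spectral feature
```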
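      As one embodiment, the trend feature is obtained by the linear trend method, which separates out the trend component of the time series and uses the slope of the fitted linear function as the trend feature of the series. That is, a regression model of the time series $x_t$, $t = 1, 2, \ldots, T$, on time $t$ is built: $x_t = \alpha + \beta t + \varepsilon_t$, where $\alpha$ is the intercept, $\beta$ is the slope, and $\varepsilon_t$ is the error. The least-squares estimate of $\beta$ is $\hat{\beta} = \frac{\sum_{t=1}^{T} (t - \bar{t})(x_t - \bar{x})}{\sum_{t=1}^{T} (t - \bar{t})^2}$, where $\bar{t} = \frac{1}{T}\sum_{t=1}^{T} t$, $\bar{x} = \frac{1}{T}\sum_{t=1}^{T} x_t$, and $T$ is the length of the traffic time series;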
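      The least-squares slope used as the trend feature can be computed directly from its closed form. A minimal sketch, with time indexed $t = 1, \ldots, T$ as in the regression model; `trend_slope` and the toy series are hypothetical.

```python
def trend_slope(x):
    """Ordinary least-squares slope of x_t regressed on t = 1..T."""
    T = len(x)
    t_mean = (T + 1) / 2  # mean of 1..T
    x_mean = sum(x) / T
    num = sum((t - t_mean) * (x[t - 1] - x_mean) for t in range(1, T + 1))
    den = sum((t - t_mean) ** 2 for t in range(1, T + 1))
    return num / den

# Toy data (made up): an exact line x_t = 3 + 2t, so the slope should be 2.
slope = trend_slope([3 + 2 * t for t in range(1, 11)])
```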
      The seasonality feature is obtained by the H-P (Hodrick-Prescott) filter method, which estimates the trend component by minimizing the difference between the time series $x_t$ and the trend value $y_t$: $\min_{\{y_t\}} \left\{ \sum_{t=1}^{T} (x_t - y_t)^2 + \lambda \sum_{t=2}^{T-1} \left[ (y_{t+1} - y_t) - (y_t - y_{t-1}) \right]^2 \right\}$, where $T$ is the length of the traffic time series and $\lambda$ is the penalty factor on fluctuations of the trend component. The cyclical component then follows as $C_t = x_t - y_t = \frac{\lambda (1-L)^2 (1-L^{-1})^2}{1 + \lambda (1-L)^2 (1-L^{-1})^2}\, x_t$, where $L$ is the lag operator. When $C_t$ shows an obvious peak, the time series $x_t$ can be judged to contain a cyclical component, and the period corresponding to the peak is the cycle length of the series;
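      The H-P trend can be obtained by solving the linear system $(I + \lambda D^\top D)\, y = x$, where $D$ is the second-difference operator; this is the first-order condition of the minimization described above. The sketch below is a dense pure-Python illustration suitable only for short series; the names `hp_filter` and `solve`, and the value of `lam` in the example, are hypothetical choices, not values from the patent.

```python
def solve(A, b):
    """Solve A y = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    y = [0.0] * n
    for i in range(n - 1, -1, -1):
        y[i] = (M[i][n] - sum(M[i][j] * y[j] for j in range(i + 1, n))) / M[i][i]
    return y

def hp_filter(x, lam):
    """Return (trend, cycle) of x for penalty factor lam."""
    T = len(x)
    A = [[1.0 if i == j else 0.0 for j in range(T)] for i in range(T)]
    d = (1.0, -2.0, 1.0)  # one row of the second-difference operator D
    for r in range(T - 2):
        for a in range(3):
            for b in range(3):
                A[r + a][r + b] += lam * d[a] * d[b]  # accumulate lam * D^T D
    trend = solve(A, [float(v) for v in x])
    cycle = [xi - yi for xi, yi in zip(x, trend)]
    return trend, cycle

# Toy data (made up): a pure line has zero second differences, so the cycle vanishes.
line = [2 * t + 1 for t in range(12)]
trend, cycle = hp_filter(line, lam=100.0)
```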
      The nonlinearity feature may be reflected by the McLeod-Li test, the bispectral test, the RESET test, the F test, or a neural-network-based nonlinearity test statistic.
      Other methods of obtaining the above global features are not excluded.
      As one embodiment, the clustering includes K-means clustering: the extracted global features are taken as a new feature vector, so that the traffic time series of each base station under test corresponds to one new feature vector, and K-means clustering is performed on the new feature vectors.
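      The clustering step can be sketched with a plain implementation of Lloyd's K-means iteration over per-station feature vectors. The feature values, the number of clusters, and the function name `kmeans` below are hypothetical; in practice the features would usually be standardized first, which this sketch omits.

```python
import random

def kmeans(points, k, iters=50, seed=0):
    """Cluster feature vectors with Lloyd's algorithm; returns (labels, centers)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialise centres from the data
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centre by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: each centre moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers

# Hypothetical two-feature vectors (e.g. trend, seasonality) for six base stations.
features = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels, centers = kmeans(features, k=2)
```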
      As one embodiment, the clustering includes FCM (fuzzy C-means) clustering: the extracted global features are taken as a new feature vector, so that the traffic time series of each base station under test corresponds to one new feature vector, and FCM clustering is performed on the new feature vectors.
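      A compact sketch of fuzzy C-means over the same kind of feature vectors, assuming the standard update rules with fuzzifier $m = 2$ and Euclidean distance. The function name `fcm`, the parameter defaults, and the feature values are hypothetical.

```python
import math
import random

def fcm(points, c, m=2.0, iters=100, seed=0):
    """Fuzzy C-means; returns (memberships, centers). memberships[i][j] is in [0, 1]."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(iters):
        # Centres are membership-weighted means (weights u^m).
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            centers.append([sum(w[i] * points[i][d] for i in range(n)) / sum(w) for d in range(dim)])
        # Memberships follow from relative distances to the centres.
        for i in range(n):
            dist = [max(math.dist(points[i], centers[j]), 1e-12) for j in range(c)]
            u[i] = [1.0 / sum((dist[j] / dist[k]) ** (2 / (m - 1)) for k in range(c)) for j in range(c)]
    return u, centers

# Hypothetical two-feature vectors for six base stations.
features = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
memberships, centers = fcm(features, c=2)
hard_labels = [row.index(max(row)) for row in memberships]
```

      Unlike K-means, each station receives a degree of membership in every cluster; taking the largest membership recovers a hard assignment when one is needed.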
      Other clustering methods are not excluded.
      For a better understanding of the method, an application example is described in detail below:
      A. Collect the traffic time series $\{x_t,\ t = 1, 2, \ldots, T\}$ of each base station to be predicted once a day, continuously for half a year;
      B. Extract the global features of the traffic time series of each base station, including the trend feature, seasonality feature, kurtosis feature, skewness feature, autocorrelation coefficient feature, nonlinearity feature, and spectral feature;
      C. Take the extracted global features of each base station as a new feature vector, so that the traffic time series of each base station corresponds to one new feature vector, and cluster the new feature vectors with the K-means clustering method;
      D. For each class of base station data after clustering, select appropriate attributes according to its features: if the traffic data shows a trend feature, collect the ARPU value and 3G penetration rate related to the traffic data; if the traffic data shows periodicity, collect the ARPU value, 3G penetration rate, and total number of users related to the traffic data;
      E. Build a BP neural network with a three-layer structure and the tansig transfer function, and train it;
      F. Feed the collected attribute features of the traffic data to be predicted, together with the traffic at the previous moment, into the model trained in the previous step to calculate the traffic to be predicted. For example, inputting the collected attribute features of today's traffic data and yesterday's traffic yields a prediction of today's traffic.
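      Steps E and F can be sketched as a small three-layer back-propagation network with a tanh hidden layer (tansig is mathematically the hyperbolic tangent) and a linear output. Everything concrete here is an assumption for illustration: the hidden size, learning rate, epoch count, the function name `train_bp`, and above all the synthetic training data, which is made up and stands in for real scaled attribute features and previous-day traffic.

```python
import math
import random

def train_bp(samples, targets, hidden=4, lr=0.05, epochs=2000, seed=0):
    """Train a 3-layer network (input, tanh hidden, linear output); return a predictor."""
    rng = random.Random(seed)
    n_in = len(samples[0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(hidden)]  # +1 bias
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden + 1)]  # +1 bias

    def forward(x):
        h = [math.tanh(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w1]
        return h, sum(w * v for w, v in zip(w2, h + [1.0]))

    for _ in range(epochs):
        for x, y in zip(samples, targets):
            h, out = forward(x)
            err = out - y  # gradient of 0.5 * (out - y)^2 with respect to out
            # Back-propagate through the hidden layer before touching w2.
            for j in range(hidden):
                grad_h = err * w2[j] * (1.0 - h[j] ** 2)
                for i, v in enumerate(x + [1.0]):
                    w1[j][i] -= lr * grad_h * v
            for j, v in enumerate(h + [1.0]):
                w2[j] -= lr * err * v
    return lambda x: forward(x)[1]

def mse(model, samples, targets):
    return sum((model(x) - y) ** 2 for x, y in zip(samples, targets)) / len(samples)

# Synthetic, made-up data: inputs are [scaled attribute, scaled previous-day traffic].
grid = [0.0, 0.5, 1.0]
samples = [[a, f] for a in grid for f in grid]
targets = [0.3 * a + 0.5 * f for a, f in samples]
untrained = train_bp(samples, targets, epochs=0)
trained = train_bp(samples, targets)
today = trained([0.5, 0.5])  # predict from today's attribute and yesterday's traffic
```

      In a real deployment one network would presumably be trained per cluster, with the attribute set chosen in step D as its inputs.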
      The global features of the traffic time series in step B are extracted as follows:
      B1. The trend feature is measured by the Z statistic, calculated as $Z = S / \sqrt{\mathrm{Var}(S)}$, where $S$ is a statistic that follows a normal distribution and $\mathrm{Var}(S)$ is the variance of $S$. $S$ is calculated as $S = \sum_{k=1}^{T-1} \sum_{j=k+1}^{T} \mathrm{sgn}(x_j - x_k)$, and $\mathrm{Var}(S)$ is calculated as $\mathrm{Var}(S) = T(T-1)(2T+5)/18$. Here $x_t$, $t = 1, 2, \ldots, T$, is the traffic time series, $T$ is its length, $x_j$ is its value at time $j$, and $x_k$ is its value at time $k$. The sign function is $\mathrm{sgn}(x_j - x_k) = \begin{cases} 1, & x_j - x_k > 0 \\ 0, & x_j - x_k = 0 \\ -1, & x_j - x_k < 0 \end{cases}$
      B2. The seasonality feature is reflected by the average period, which is calculated as follows. A fast Fourier transform (FFT) is applied to the traffic time series $x_t$, $t = 1, 2, \ldots, T$, where $T$ is the length of the series, giving $X(f_k) = \sum_{t=1}^{T} x_t e^{-i 2\pi f_k t}$. The frequencies used are $f_k = k/T$, $k = 1, 2, \ldots, \lfloor T/2 \rfloor$. The average frequency is then calculated as $\bar{f} = \sum_k f_k |X(f_k)|^2 / \sum_k |X(f_k)|^2$, and the average period as $\bar{P} = 1/\bar{f}$.
      B3. The kurtosis in the kurtosis feature is calculated as $K = \frac{1}{T\sigma^4} \sum_{t=1}^{T} (x_t - \bar{x})^4$, where $x_t$, $t = 1, 2, \ldots, T$, is the traffic time series, $T$ is its length, $\bar{x}$ is its mean, and $\sigma$ is its sample standard deviation;
      B4. The skewness in the skewness feature is calculated as $SK = \frac{1}{T\sigma^3} \sum_{t=1}^{T} (x_t - \bar{x})^3$, where $x_t$, $t = 1, 2, \ldots, T$, is the traffic time series, $T$ is its length, $\bar{x}$ is its mean, and $\sigma$ is its sample standard deviation;
      B5. The autocorrelation coefficient feature is measured by the Ljung-Box Q statistic, calculated as $Q = T(T+2) \sum_{\tau=1}^{p} \frac{r_\tau^2}{T-\tau}$, where $T$ is the length of the traffic time series, $p$ is the maximum lag order considered, $\tau$ is the lag, and $r_\tau$ is the autocorrelation coefficient of the traffic time series, calculated as $r_\tau = \frac{\sum_{t=\tau+1}^{T} (x_t - \bar{x})(x_{t-\tau} - \bar{x})}{\sum_{t=1}^{T} (x_t - \bar{x})^2}$, where $x_t$, $t = 1, 2, \ldots, T$, is the traffic time series and $\bar{x}$ is its mean;
      B6. The nonlinearity feature is reflected by the BDS test statistic. For the traffic time series $x_t$, $t = 1, 2, \ldots, T$, let the observations at times $s$ and $w$ be $x_s$ and $x_w$; the observation pairs $(x_s, x_w)$ are arranged as $\{(x_s, x_w), (x_{s+1}, x_{w+1}), (x_{s+2}, x_{w+2}), \ldots, (x_{s+m-1}, x_{w+m-1})\}$, where $m$ is the embedding dimension. The BDS statistic is calculated as $W(N, m, r) = \sqrt{N}\, \frac{C(N, m, r) - C(N, 1, r)^m}{\sigma'(N, m, r)}$, where $r$ is the distance threshold, $C(N, m, r)$ is the correlation integral, and $\sigma'(N, m, r)$ is the estimate of the asymptotic standard deviation of $C(N, m, r) - C(N, 1, r)^m$. $C(N, m, r)$ is calculated as $C(N, m, r) = \frac{2}{N(N-1)} \sum_{1 \le s < w \le N} I_r(x_s^m, x_w^m)$, where $x_t^m = (x_t, x_{t+1}, \ldots, x_{t+m-1})$ is an $m$-dimensional vector and $I_r(x_s^m, x_w^m) = \begin{cases} 1, & \lVert x_s^m - x_w^m \rVert < r \\ 0, & \text{otherwise} \end{cases}$
      B7. The spectral feature is the first two coefficients of the discrete Fourier transform.
      The network traffic analysis and prediction device in one embodiment is shown in Fig. 2. The device includes:
      an extraction module, for extracting the global features of the traffic time series of each base station under test;
      a clustering module, for clustering according to the extracted global features;
      a collection module, for collecting attribute features of the traffic data according to the clustering result;
      a prediction module, for predicting traffic according to the attribute features of the traffic data and the traffic at the previous moment.
      As shown in Fig. 2, in a preferred embodiment the modules of the device are connected in sequence: extraction module, clustering module, collection module, and prediction module.
      The extraction module first extracts the global features of the traffic time series of each base station under test; the clustering module then clusters according to the extracted global features; the collection module then collects attribute features of the traffic data according to the clustering result; finally, the prediction module feeds the attribute features of the traffic data and the traffic at the previous moment into the neural network structure to predict traffic. The device's traffic analysis is more reasonable, its prediction draws on more information, and its accuracy is high, making it suitable for practical application.
      As one embodiment, the global features include any one or more of a trend feature, a seasonality feature, a kurtosis feature, a skewness feature, an autocorrelation coefficient feature, a nonlinearity feature, and a spectral feature.
      As one embodiment, the traffic time series is obtained by collecting the traffic data of each base station under test once a day, continuously for half a year.
      As one embodiment, the trend feature is measured by the Z statistic: a Z statistic greater than zero indicates a rising trend, and a Z statistic less than zero indicates a falling trend. The Z statistic is calculated as $Z = S / \sqrt{\mathrm{Var}(S)}$, where $S$ is a statistic that follows a normal distribution and $\mathrm{Var}(S)$ is the variance of $S$. $S$ is calculated as $S = \sum_{k=1}^{T-1} \sum_{j=k+1}^{T} \mathrm{sgn}(x_j - x_k)$, and $\mathrm{Var}(S)$ is calculated as $\mathrm{Var}(S) = T(T-1)(2T+5)/18$. Here $x_t$, $t = 1, 2, \ldots, T$, is the traffic time series, $T$ is its length, $x_j$ is its value at time $j$, and $x_k$ is its value at time $k$. The sign function is $\mathrm{sgn}(x_j - x_k) = \begin{cases} 1, & x_j - x_k > 0 \\ 0, & x_j - x_k = 0 \\ -1, & x_j - x_k < 0 \end{cases}$
      The seasonality feature is reflected by the average period, which is calculated as follows. A fast Fourier transform (FFT) is applied to the traffic time series $x_t$, $t = 1, 2, \ldots, T$, where $T$ is the length of the series, giving $X(f_k) = \sum_{t=1}^{T} x_t e^{-i 2\pi f_k t}$. The frequencies used are $f_k = k/T$, $k = 1, 2, \ldots, \lfloor T/2 \rfloor$. The average frequency is then calculated as $\bar{f} = \sum_k f_k |X(f_k)|^2 / \sum_k |X(f_k)|^2$, and the average period as $\bar{P} = 1/\bar{f}$.
      The kurtosis in the kurtosis feature is calculated as $K = \frac{1}{T\sigma^4} \sum_{t=1}^{T} (x_t - \bar{x})^4$, where $x_t$, $t = 1, 2, \ldots, T$, is the traffic time series, $T$ is its length, $\bar{x}$ is its mean, and $\sigma$ is its sample standard deviation;
      The skewness in the skewness feature is calculated as $SK = \frac{1}{T\sigma^3} \sum_{t=1}^{T} (x_t - \bar{x})^3$, where $x_t$, $t = 1, 2, \ldots, T$, is the traffic time series, $T$ is its length, $\bar{x}$ is its mean, and $\sigma$ is its sample standard deviation;
      The autocorrelation coefficient feature is measured by the Ljung-Box Q statistic, which tests whether the traffic time series is a white-noise process. It is calculated as $Q = T(T+2) \sum_{\tau=1}^{p} \frac{r_\tau^2}{T-\tau}$, where $T$ is the length of the traffic time series, $p$ is the maximum lag order considered, $\tau$ is the lag, and $r_\tau$ is the autocorrelation coefficient of the traffic time series, calculated as $r_\tau = \frac{\sum_{t=\tau+1}^{T} (x_t - \bar{x})(x_{t-\tau} - \bar{x})}{\sum_{t=1}^{T} (x_t - \bar{x})^2}$, where $x_t$, $t = 1, 2, \ldots, T$, is the traffic time series and $\bar{x}$ is its mean;
      The nonlinearity feature is reflected by the BDS test statistic, which tests whether the traffic time series is independent and identically distributed. For the traffic time series $x_t$, $t = 1, 2, \ldots, T$, let the observations at times $s$ and $w$ be $x_s$ and $x_w$; the observation pairs $(x_s, x_w)$ are arranged as
      $\{(x_s, x_w), (x_{s+1}, x_{w+1}), (x_{s+2}, x_{w+2}), \ldots, (x_{s+m-1}, x_{w+m-1})\}$, where $m$ is the embedding dimension. The BDS statistic is calculated as $W(N, m, r) = \sqrt{N}\, \frac{C(N, m, r) - C(N, 1, r)^m}{\sigma'(N, m, r)}$, where $r$ is the distance threshold, $C(N, m, r)$ is the correlation integral, and $\sigma'(N, m, r)$ is the estimate of the asymptotic standard deviation of $C(N, m, r) - C(N, 1, r)^m$. $C(N, m, r)$ is calculated as $C(N, m, r) = \frac{2}{N(N-1)} \sum_{1 \le s < w \le N} I_r(x_s^m, x_w^m)$, where $x_t^m = (x_t, x_{t+1}, \ldots, x_{t+m-1})$ is an $m$-dimensional vector and $I_r(x_s^m, x_w^m) = \begin{cases} 1, & \lVert x_s^m - x_w^m \rVert < r \\ 0, & \text{otherwise} \end{cases}$
      The spectral feature is the first two coefficients of the discrete Fourier transform. The spectral feature is extracted from the discrete Fourier transform coefficients, and the first $n$ coefficients may be used as the spectral feature: because the high-frequency part of a signal is relatively unimportant, most of the energy in the frequency domain is concentrated in the first few coefficients.
      As one embodiment, the trend feature is obtained by the linear trend method, which separates out the trend component of the time series and uses the slope of the fitted linear function as the trend feature of the series. That is, a regression model of the time series $x_t$, $t = 1, 2, \ldots, T$, on time $t$ is built: $x_t = \alpha + \beta t + \varepsilon_t$, where $\alpha$ is the intercept, $\beta$ is the slope, and $\varepsilon_t$ is the error. The least-squares estimate of $\beta$ is $\hat{\beta} = \frac{\sum_{t=1}^{T} (t - \bar{t})(x_t - \bar{x})}{\sum_{t=1}^{T} (t - \bar{t})^2}$, where $\bar{t} = \frac{1}{T}\sum_{t=1}^{T} t$, $\bar{x} = \frac{1}{T}\sum_{t=1}^{T} x_t$, and $T$ is the length of the traffic time series;
      The seasonality feature is obtained by the H-P (Hodrick-Prescott) filter method, which estimates the trend component by minimizing the difference between the time series $x_t$ and the trend value $y_t$: $\min_{\{y_t\}} \left\{ \sum_{t=1}^{T} (x_t - y_t)^2 + \lambda \sum_{t=2}^{T-1} \left[ (y_{t+1} - y_t) - (y_t - y_{t-1}) \right]^2 \right\}$, where $T$ is the length of the traffic time series and $\lambda$ is the penalty factor on fluctuations of the trend component. The cyclical component then follows as $C_t = x_t - y_t = \frac{\lambda (1-L)^2 (1-L^{-1})^2}{1 + \lambda (1-L)^2 (1-L^{-1})^2}\, x_t$, where $L$ is the lag operator. When $C_t$ shows an obvious peak, the time series $x_t$ can be judged to contain a cyclical component, and the period corresponding to the peak is the cycle length of the series;
      The nonlinearity feature may be reflected by the McLeod-Li test, the bispectral test, the RESET test, the F test, or a neural-network-based nonlinearity test statistic.
      Other methods of obtaining the above global features are not excluded.
      As one embodiment, the clustering includes K-means clustering: the extracted global features are taken as a new feature vector, so that the traffic time series of each base station under test corresponds to one new feature vector, and K-means clustering is performed on the new feature vectors.
      As one embodiment, the clustering includes FCM (fuzzy C-means) clustering: the extracted global features are taken as a new feature vector, so that the traffic time series of each base station under test corresponds to one new feature vector, and FCM clustering is performed on the new feature vectors.
      Other clustering methods are not excluded.
      The embodiments described above express only several implementations of the invention. Their description is relatively specific and detailed, but they should not therefore be construed as limiting the scope of the patent. It should be noted that a person of ordinary skill in the art may make various modifications and improvements without departing from the inventive concept, and these all fall within the protection scope of the invention. The protection scope of this patent is therefore determined by the appended claims.
    Claims (8)
    1. A network traffic analysis and prediction method, characterized by comprising the following steps:
      extracting the global features of the traffic time series of each base station under test, the global features including any one or more of a trend feature, a seasonality feature, a kurtosis feature, a skewness feature, an autocorrelation coefficient feature, a nonlinearity feature, and a spectral feature;
      clustering according to the extracted global features;
      collecting attribute features of the traffic data according to the clustering result;
      predicting traffic according to the attribute features of the traffic data and the traffic at the previous moment;
      wherein collecting attribute features of the traffic data according to the clustering result includes:
      for each class of base station data after clustering, selecting, according to its features, attribute features related to the traffic data, the attribute features including the ARPU value, the 3G penetration rate, and/or the total number of users.
    2. The network traffic analysis and prediction method according to claim 1, characterized in that the traffic time series is obtained by collecting the traffic data of each base station under test once a day, continuously for half a year.
    3. The network traffic analysis and prediction method according to claim 1, characterized in that the trend feature is measured by the Z statistic, calculated as $Z = S / \sqrt{\mathrm{Var}(S)}$, where $S$ is a statistic that follows a normal distribution and $\mathrm{Var}(S)$ is the variance of $S$; $S$ is calculated as $S = \sum_{k=1}^{T-1} \sum_{j=k+1}^{T} \mathrm{sgn}(x_j - x_k)$, and $\mathrm{Var}(S)$ is calculated as $\mathrm{Var}(S) = T(T-1)(2T+5)/18$; $x_t$, $t = 1, 2, \ldots, T$, is the traffic time series, $T$ is its length, $x_j$ is its value at time $j$, and $x_k$ is its value at time $k$; the sign function is $\mathrm{sgn}(x_j - x_k) = \begin{cases} 1, & x_j - x_k > 0 \\ 0, & x_j - x_k = 0 \\ -1, & x_j - x_k < 0 \end{cases}$
      The seasonality feature is reflected by the average period, which is calculated as follows: a fast Fourier transform (FFT) is applied to the traffic time series $x_t$, $t = 1, 2, \ldots, T$, where $T$ is the length of the series, giving $X(f_k) = \sum_{t=1}^{T} x_t e^{-i 2\pi f_k t}$; the frequencies used are $f_k = k/T$, $k = 1, 2, \ldots, \lfloor T/2 \rfloor$; the average frequency is then calculated as $\bar{f} = \sum_k f_k |X(f_k)|^2 / \sum_k |X(f_k)|^2$, and the average period as $\bar{P} = 1/\bar{f}$;
      The kurtosis in the kurtosis feature is calculated as $K = \frac{1}{T\sigma^4} \sum_{t=1}^{T} (x_t - \bar{x})^4$, where $x_t$, $t = 1, 2, \ldots, T$, is the traffic time series, $T$ is its length, $\bar{x}$ is its mean, and $\sigma$ is its sample standard deviation;
      The skewness in the skewness feature is calculated as $SK = \frac{1}{T\sigma^3} \sum_{t=1}^{T} (x_t - \bar{x})^3$, where $x_t$, $t = 1, 2, \ldots, T$, is the traffic time series, $T$ is its length, $\bar{x}$ is its mean, and $\sigma$ is its sample standard deviation;
      The autocorrelation coefficient feature is measured by the Ljung-Box Q statistic, calculated as $Q = T(T+2) \sum_{\tau=1}^{p} \frac{r_\tau^2}{T-\tau}$, where $T$ is the length of the traffic time series, $p$ is the maximum lag order considered, $\tau$ is the lag, and $r_\tau$ is the autocorrelation coefficient of the traffic time series, calculated as $r_\tau = \frac{\sum_{t=\tau+1}^{T} (x_t - \bar{x})(x_{t-\tau} - \bar{x})}{\sum_{t=1}^{T} (x_t - \bar{x})^2}$, where $x_t$, $t = 1, 2, \ldots, T$, is the traffic time series and $\bar{x}$ is its mean;
      The nonlinearity feature is reflected by the BDS test statistic: for the traffic time series $x_t$, $t = 1, 2, \ldots, T$, let the observations at times $s$ and $w$ be $x_s$ and $x_w$; the observation pairs $(x_s, x_w)$ are arranged as $\{(x_s, x_w), (x_{s+1}, x_{w+1}), (x_{s+2}, x_{w+2}), \ldots, (x_{s+m-1}, x_{w+m-1})\}$, where $m$ is the embedding dimension; the BDS statistic is calculated as $W(N, m, r) = \sqrt{N}\, \frac{C(N, m, r) - C(N, 1, r)^m}{\sigma'(N, m, r)}$, where $r$ is the distance threshold, $C(N, m, r)$ is the correlation integral, and $\sigma'(N, m, r)$ is the estimate of the asymptotic standard deviation of $C(N, m, r) - C(N, 1, r)^m$; $C(N, m, r)$ is calculated as $C(N, m, r) = \frac{2}{N(N-1)} \sum_{1 \le s < w \le N} I_r(x_s^m, x_w^m)$, where $x_t^m = (x_t, x_{t+1}, \ldots, x_{t+m-1})$ is an $m$-dimensional vector and $I_r(x_s^m, x_w^m) = \begin{cases} 1, & \lVert x_s^m - x_w^m \rVert < r \\ 0, & \text{otherwise} \end{cases}$
      The spectral feature is the first two coefficients of the discrete Fourier transform.
    4. The network traffic analysis and prediction method according to claim 1, characterized in that the clustering includes K-means clustering: the extracted global features are taken as a new feature vector, so that the traffic time series of each base station under test corresponds to one new feature vector, and K-means clustering is performed on the new feature vectors.
    5. A network traffic analysis and prediction device, characterized by comprising:
      an extraction module, for extracting the global features of the traffic time series of each base station under test, the global features including any one or more of a trend feature, a seasonality feature, a kurtosis feature, a skewness feature, an autocorrelation coefficient feature, a nonlinearity feature, and a spectral feature;
      a clustering module, for clustering according to the extracted global features;
      a collection module, for collecting attribute features of the traffic data according to the clustering result;
      a prediction module, for predicting traffic according to the attribute features of the traffic data and the traffic at the previous moment;
      wherein collecting attribute features of the traffic data according to the clustering result includes:
      for each class of base station data after clustering, selecting, according to its features, attribute features related to the traffic data, the attribute features including the ARPU value, the 3G penetration rate, and/or the total number of users.
    6. The network traffic analysis and prediction device according to claim 5, characterized in that the traffic time series is obtained by collecting the traffic data of each base station under test once a day, continuously for half a year.
    7. network traffic analysis according to claim 5 and prediction meanss, it is characterised in that the tendency feature passes through
Z statistics are weighed, and the calculation formula of Z statistics is:Wherein S is the system of Normal Distribution
Metering, Var (S) is S variance, and S calculation formula is:Var (S) calculation formula is:Var
(S)=T (T-1) (2T+5)/18;Flow-time sequence xt, t=1,2 ... T, T is the length of flow-time sequence, xjFor flow
Time series is in the value at j moment, xkValue for flow-time sequence at the k moment, sign function sgn (xj-xk) calculation formula
For: 
      The seasonal feature is reflected by the average period, which is calculated as follows: a fast Fourier transform (FFT) is applied to the traffic time series $x_t$, $t = 1, 2, \ldots, T$, where T is the length of the traffic time series, to obtain: [formula omitted in source]; the frequencies used are: [formula omitted in source]; the average frequency is then calculated as: [formula omitted in source]; and the average period is calculated as: [formula omitted in source];
      The kurtosis in the kurtosis feature is calculated as: [formula omitted in source], where $x_t$ is the traffic time series, $t = 1, 2, \ldots, T$, T is the length of the traffic time series, $\bar{x}$ is the mean of the traffic time series, and $\sigma$ is the sample standard deviation of the traffic time series;
      The skewness in the skewness feature is calculated as: [formula omitted in source], where $x_t$ is the traffic time series, $t = 1, 2, \ldots, T$, T is the length of the traffic time series, $\bar{x}$ is the mean of the traffic time series, and $\sigma$ is the sample standard deviation of the traffic time series;
      The autocorrelation coefficient feature is measured by the Ljung-Box Q statistic, calculated as $Q = T(T+2) \sum_{\tau=1}^{p} \frac{r_\tau^2}{T-\tau}$, where T is the length of the traffic time series, p is the maximum lag order considered, $\tau$ is the lag, and $r_\tau$ is the autocorrelation coefficient of the traffic time series; $r_\tau$ is calculated as $r_\tau = \frac{\sum_{t=\tau+1}^{T} (x_t - \bar{x})(x_{t-\tau} - \bar{x})}{\sum_{t=1}^{T} (x_t - \bar{x})^2}$, where $x_t$ is the traffic time series, $t = 1, 2, \ldots, T$, and $\bar{x}$ is the mean of the traffic time series;
      The nonlinear feature is reflected by the BDS test statistic: for the traffic time series $x_t$, $t = 1, 2, \ldots, T$, the observed values at times s and w are $x_s$ and $x_w$, and the set of observation pairs is constructed as $\{(x_s, x_w), (x_{s+1}, x_{w+1}), (x_{s+2}, x_{w+2}), \ldots, (x_{s+m-1}, x_{w+m-1})\}$, where m is the embedding dimension; the BDS statistic is calculated as: [formula omitted in source], where r is the interval size, $C(N, m, r)$ is the correlation integral, and $\sigma'(N, m, r)$ is an estimate of the asymptotic standard deviation of $C(N, m, r) - C(N, 1, r)^m$; $C(N, m, r)$ is calculated as: [formula omitted in source], where [symbol omitted in source] is an m-dimensional vector, [formula omitted in source];
      The spectral feature is the first two coefficients of the extracted DFT.
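The global features listed in claim 7 can be sketched in plain Python as below. This is an illustration under stated assumptions rather than the patent's exact formulas, several of which are not reproduced in this text: the Z statistic is taken as S / sqrt(Var(S)) with no continuity or tie correction, skewness and kurtosis are the third and fourth standardized moments divided by T (kurtosis without subtracting 3), the average frequency is power-weighted over the positive frequencies k/T, and the spectral feature uses the magnitudes of DFT coefficients 0 and 1. The BDS statistic is left out. All function names are hypothetical.

```python
import cmath
import math


def _sgn(d):
    """Sign function: 1, 0, or -1."""
    return (d > 0) - (d < 0)


def mann_kendall_z(x):
    """Trend feature: S / sqrt(Var(S)), Var(S) = T(T-1)(2T+5)/18 (assumed form)."""
    T = len(x)
    s = sum(_sgn(x[j] - x[k]) for k in range(T - 1) for j in range(k + 1, T))
    return s / math.sqrt(T * (T - 1) * (2 * T + 5) / 18)


def standardized_moment(x, order):
    """order=3 gives skewness, order=4 gives kurtosis (assumed normalization).

    A constant series has zero standard deviation and raises ZeroDivisionError.
    """
    T = len(x)
    mean = sum(x) / T
    sigma = math.sqrt(sum((v - mean) ** 2 for v in x) / (T - 1))  # sample std
    return sum((v - mean) ** order for v in x) / (T * sigma ** order)


def ljung_box_q(x, p):
    """Ljung-Box Q = T(T+2) * sum_{tau=1..p} r_tau^2 / (T - tau)."""
    T = len(x)
    mean = sum(x) / T
    denom = sum((v - mean) ** 2 for v in x)
    total = 0.0
    for tau in range(1, p + 1):
        r = sum((x[t] - mean) * (x[t - tau] - mean) for t in range(tau, T)) / denom
        total += r * r / (T - tau)
    return T * (T + 2) * total


def dft(x):
    """Naive O(T^2) discrete Fourier transform (an FFT gives the same values)."""
    T = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / T) for t in range(T))
        for k in range(T)
    ]


def mean_period(x):
    """Seasonal feature: 1 / power-weighted mean frequency (assumed weighting)."""
    T = len(x)
    X = dft(x)
    ks = range(1, T // 2 + 1)  # positive frequencies k/T, DC term skipped
    power = [abs(X[k]) ** 2 for k in ks]
    mean_freq = sum((k / T) * pw for k, pw in zip(ks, power)) / sum(power)
    return 1 / mean_freq


def global_features(x, p=5):
    """Feature vector for one base station's traffic time series."""
    X = dft(x)
    return [
        mann_kendall_z(x),
        mean_period(x),
        standardized_moment(x, 4),  # kurtosis
        standardized_moment(x, 3),  # skewness
        ljung_box_q(x, p),
        abs(X[0]),  # spectral feature: first two DFT coefficients (assumed indices)
        abs(X[1]),
    ]
```

Each base station's series is reduced to one short vector, which keeps the later clustering step cheap even when the series themselves are long.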
    8. The network traffic analysis and prediction device according to claim 5, characterized in that the clustering comprises K-means clustering: the extracted global features are taken as a new feature vector, the traffic time series of each base station under test corresponds to one new feature vector, and K-means clustering is performed on the new feature vectors.
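The clustering step in claim 8 can be sketched with a plain implementation of Lloyd's K-means algorithm. This is a minimal sketch: the random initialization, the seed, the iteration cap, and the names `kmeans` and `squared_distance` are assumptions, not details from the patent.

```python
import random


def squared_distance(u, v):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeans(vectors, k, max_iter=100, seed=0):
    """Cluster feature vectors with Lloyd's algorithm.

    Returns (labels, centroids), where labels[i] is the cluster index
    assigned to vectors[i].
    """
    rng = random.Random(seed)
    # Assumed initialization: k distinct input vectors chosen at random.
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest centroid for every vector.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:  # assignments stable, converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Usage: one feature vector per base station (toy two-dimensional values).
features = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels, centroids = kmeans(features, k=2)
```

Because the global features sit on very different scales, rescaling each feature before clustering is a common choice, although the claim does not state whether this is done.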
    Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title | 
|---|---|---|---|
| CN201410019136.7A CN103747477B (en) | 2014-01-15 | 2014-01-15 | Network traffic analysis and Forecasting Methodology and device | 
Publications (2)
| Publication Number | Publication Date | 
|---|---|
| CN103747477A (en) | 2014-04-23 | 
| CN103747477B (en) | 2017-08-25 | 
Family
ID=50504455
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date | 
|---|---|---|---|
| CN201410019136.7A Active CN103747477B (en) | 2014-01-15 | 2014-01-15 | Network traffic analysis and Forecasting Methodology and device | 
Country Status (1)
| Country | Link | 
|---|---|
| CN (1) | CN103747477B (en) | 
Families Citing this family (14)
| Publication number | Priority date | Publication date | Assignee | Title | 
|---|---|---|---|---|
| US10417226B2 (en) * | 2015-05-29 | 2019-09-17 | International Business Machines Corporation | Estimating the cost of data-mining services | 
| CN107517166A (en) * | 2016-06-16 | 2017-12-26 | 中兴通讯股份有限公司 | Flow control methods, device and access device | 
| TWI641251B (en) | 2016-11-18 | 2018-11-11 | 財團法人工業技術研究院 | Method and system for monitoring network flow | 
| CN107135126B (en) * | 2017-05-22 | 2020-03-24 | 安徽师范大学 | Flow online identification method based on sub-flow fractal index | 
| CN110098944B (en) * | 2018-01-29 | 2020-09-08 | 中国科学院声学研究所 | A Method for Predicting Protocol Data Traffic Based on FP-Growth and RNN | 
| CN108770002B (en) * | 2018-04-27 | 2021-08-10 | 广州杰赛科技股份有限公司 | Base station flow analysis method, device, equipment and storage medium | 
| CN108960537B (en) * | 2018-08-17 | 2020-10-13 | 安吉汽车物流股份有限公司 | Logistics order prediction method and device and readable medium | 
| CN113037577B (en) * | 2019-12-09 | 2023-03-24 | 中国电信股份有限公司 | Network traffic prediction method, device and computer readable storage medium | 
| CN112235152B (en) * | 2020-09-04 | 2022-05-10 | 北京邮电大学 | Flow size estimation method and device | 
| CN111935766B (en) * | 2020-09-15 | 2021-01-12 | 之江实验室 | A Wireless Network Traffic Prediction Method Based on Global Spatial Dependency | 
| CN113225824A (en) * | 2021-04-28 | 2021-08-06 | 辽宁邮电规划设计院有限公司 | Device and method for automatically allocating bandwidths with different service requirements based on 5G technology | 
| CN114330145B (en) * | 2022-03-01 | 2022-07-12 | 北京蚂蚁云金融信息服务有限公司 | Method and device for analyzing sequence based on probabilistic graphical model | 
| CN114793197B (en) * | 2022-03-29 | 2023-09-19 | 广州杰赛科技股份有限公司 | NFV-based network resource configuration method, device, equipment and storage medium | 
| CN115949891B (en) * | 2023-03-09 | 2023-05-23 | 天津佰焰科技股份有限公司 | Intelligent control system and control method for LNG (liquefied Natural gas) station | 
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title | 
|---|---|---|---|---|
| US20050213514A1 (en) * | 2004-03-23 | 2005-09-29 | Ching-Fong Su | Estimating and managing network traffic | 
| CN101252541A (en) * | 2008-04-09 | 2008-08-27 | 中国科学院计算技术研究所 | A method for establishing a network traffic classification model and a corresponding system | 
| CN103227999A (en) * | 2013-05-02 | 2013-07-31 | 中国联合网络通信集团有限公司 | Network traffic prediction method and device | 
| CN103368811A (en) * | 2012-04-06 | 2013-10-23 | 华为终端有限公司 | Bandwidth distribution method and equipment | 
Non-Patent Citations (1)
| Title | 
|---|
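| Network Traffic Analysis and Prediction Based on Time Series; He Jian; 《中国科技信息》 (China Science and Technology Information); 2005-12-31 (No. 22); full text * | 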
Also Published As
| Publication number | Publication date | 
|---|---|
| CN103747477A (en) | 2014-04-23 | 
Legal Events
| Date | Code | Title | Description | 
|---|---|---|---|
| | C06 | Publication | |
| | PB01 | Publication | |
| | C10 | Entry into substantive examination | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |