Moving statistics for fast data streams

Learn how to get started with analyzing fast data streams using moving statistics

To make this tutorial easy to understand, we use very few data points and a simple statistic. The Time Door API can process significantly more data points per second with more computationally intensive algorithms.

Introduction

Let's assume that we have a JavaScript-based server monitoring application that receives a fast data stream of 20 new data points per second. We want to detect potential early warning signals of critical transitions and update our statistic every second.

For input data streams with a high update frequency (fast data), moving statistics are particularly useful. Moving statistics compute local statistics of an input data stream within a moving window of a specified window size. The term moving window is synonymous with the terms rolling, sliding and running window.
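To make the idea concrete before we involve the API, here is a minimal client-side sketch of a moving statistic, assuming a simple lag-1 autocorrelation estimator. The acf1 and movingAcf1 helpers are illustrative only and are not part of the Time Door API, whose exact estimator may differ in detail:

// Illustrative only: a moving lag-1 autocorrelation computed client-side
function acf1(values) {
  const n = values.length
  const mean = values.reduce((sum, v) => sum + v, 0) / n
  let numerator = 0
  let denominator = 0
  for (let i = 0; i < n; i++) {
    denominator += (values[i] - mean) ** 2
    if (i > 0) numerator += (values[i] - mean) * (values[i - 1] - mean)
  }
  return denominator === 0 ? null : numerator / denominator
}

function movingAcf1(data, windowSize) {
  // the first windowSize - 1 positions have no complete window yet
  return data.map((_, i) =>
    i < windowSize - 1 ? null : acf1(data.slice(i - windowSize + 1, i + 1))
  )
}

console.log(movingAcf1([1.0137, 0.8591, 0.6281, 0.0703, -1.2546, 0.0635], 5))
// [ null, null, null, null, <acf1 of window 1-5>, <acf1 of window 2-6> ]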

Endpoints that are particularly well suited for fast data are indicated by a fast data tag on the API Docs page.

We will look at two points in time, t0 and t1:

[Chart: input data and ACF1 over t from 2020-01-01 00:00:00.000 to 2020-01-01 00:00:00.950]

Moving autocorrelation at lag 1 (ACF1) at t0.

[Chart: input data and ACF1 over t from 2020-01-01 00:00:00.000 to 2020-01-01 00:00:01.950]

Moving autocorrelation at lag 1 (ACF1) at t0 and t1.

The initial gap in the ACF1 statistic is caused by the moving window (and by first-differencing). We will cover this in more depth later in the tutorial.

t0

At t0 we have 20 data points:

const time = [1577836800000, 1577836800050, 1577836800100, 1577836800150, 1577836800200, 1577836800250, 1577836800300, 1577836800350, 1577836800400, 1577836800450, 1577836800500, 1577836800550, 1577836800600, 1577836800650, 1577836800700, 1577836800750, 1577836800800, 1577836800850, 1577836800900, 1577836800950]

const data = [1.0137, 0.8591, 0.6281, 0.0703, -1.2546, 0.0635, 0.9731, 1.1618, 2.5978, 3.9978, 4.0237, 4.6535, 5.0346, 4.8492, 5.3419, 5.0351, 3.7687, 3.374, 1.1942, -1.0547]

We need to prepare our data for the Time Door API and make our request.

Let's get the data in the correct format for the Time Door API:

// ...
const requestData = {
  time_series: [
    {
      data: {},
      // ...
    }
  ],
  // ...
}

// transform the time series data and add it to the request data
for (let t = 0; t < data.length; t++) {
  Object.assign(requestData.time_series[0].data, {
    [new Date(time[t]).toISOString()]: data[t]
  })
}

This produces the following data structure:

{
  '2020-01-01T00:00:00.000Z': 1.0137,
  '2020-01-01T00:00:00.050Z': 0.8591,
  '2020-01-01T00:00:00.100Z': 0.6281,
  '2020-01-01T00:00:00.150Z': 0.0703,
  '2020-01-01T00:00:00.200Z': -1.2546,
  '2020-01-01T00:00:00.250Z': 0.0635,
  '2020-01-01T00:00:00.300Z': 0.9731,
  '2020-01-01T00:00:00.350Z': 1.1618,
  '2020-01-01T00:00:00.400Z': 2.5978,
  '2020-01-01T00:00:00.450Z': 3.9978,
  '2020-01-01T00:00:00.500Z': 4.0237,
  '2020-01-01T00:00:00.550Z': 4.6535,
  '2020-01-01T00:00:00.600Z': 5.0346,
  '2020-01-01T00:00:00.650Z': 4.8492,
  '2020-01-01T00:00:00.700Z': 5.3419,
  '2020-01-01T00:00:00.750Z': 5.0351,
  '2020-01-01T00:00:00.800Z': 3.7687,
  '2020-01-01T00:00:00.850Z': 3.374,
  '2020-01-01T00:00:00.900Z': 1.1942,
  '2020-01-01T00:00:00.950Z': -1.0547
}

We are interested in detecting potential early warning signals of critical transitions, so we can use the acf1 method of the Early Warning Signals endpoint.

We can use axios, a promise-based HTTP client, for making our first request to the Time Door API with the prepared data:

import axios from 'axios'

const time = [1577836800000, 1577836800050, 1577836800100, 1577836800150, 1577836800200, 1577836800250, 1577836800300, 1577836800350, 1577836800400, 1577836800450, 1577836800500, 1577836800550, 1577836800600, 1577836800650, 1577836800700, 1577836800750, 1577836800800, 1577836800850, 1577836800900, 1577836800950]

const data = [1.0137, 0.8591, 0.6281, 0.0703, -1.2546, 0.0635, 0.9731, 1.1618, 2.5978, 3.9978, 4.0237, 4.6535, 5.0346, 4.8492, 5.3419, 5.0351, 3.7687, 3.374, 1.1942, -1.0547]

const requestData = {
  time_series: [
    {
      data: {},
      transformations: {
        first_diff: {
          apply: true,
          n_diffs: 1
        }
      }
    }
  ],
  method: "acf1",
  window_size: 10
}

// transform the time series data and add it to the request data
for (let t = 0; t < data.length; t++) {
  Object.assign(requestData.time_series[0].data, {
    [new Date(time[t]).toISOString()]: data[t]
  })
}

const key = "" // add your Time Door API key here
const url = "https://api.timedoor.io/invocation/early-warning-signals"

axios
  .post(url, requestData, {
    headers: {
      "X-Time-Door-Key": key
    }
  })
  .then((response) => {
    console.log(response.data)
  })
  .catch((error) => {
    console.log(error)
  })

The Time Door API response:

{
  "report": {
    "computation_time": {
      "actual": 345.82,
      "billed": 400
    }
  },
  "result": {
    "reproduction": {},
    "data": [
      {
        "t": "2020-01-01T00:00:00.000Z",
        "acf1": null
      },
      {
        "t": "2020-01-01T00:00:00.050Z",
        "acf1": null
      },
      {
        "t": "2020-01-01T00:00:00.100Z",
        "acf1": null
      },
      {
        "t": "2020-01-01T00:00:00.150Z",
        "acf1": null
      },
      {
        "t": "2020-01-01T00:00:00.200Z",
        "acf1": null
      },
      {
        "t": "2020-01-01T00:00:00.250Z",
        "acf1": null
      },
      {
        "t": "2020-01-01T00:00:00.300Z",
        "acf1": null
      },
      {
        "t": "2020-01-01T00:00:00.350Z",
        "acf1": null
      },
      {
        "t": "2020-01-01T00:00:00.400Z",
        "acf1": null
      },
      {
        "t": "2020-01-01T00:00:00.450Z",
        "acf1": 0.273
      },
      {
        "t": "2020-01-01T00:00:00.500Z",
        "acf1": 0.1967
      },
      {
        "t": "2020-01-01T00:00:00.550Z",
        "acf1": 0.1
      },
      {
        "t": "2020-01-01T00:00:00.600Z",
        "acf1": -0.2208
      },
      {
        "t": "2020-01-01T00:00:00.650Z",
        "acf1": 0.0139
      },
      {
        "t": "2020-01-01T00:00:00.700Z",
        "acf1": -0.0129
      },
      {
        "t": "2020-01-01T00:00:00.750Z",
        "acf1": 0.05432
      },
      {
        "t": "2020-01-01T00:00:00.800Z",
        "acf1": 0.2746
      },
      {
        "t": "2020-01-01T00:00:00.850Z",
        "acf1": 0.1927
      },
      {
        "t": "2020-01-01T00:00:00.900Z",
        "acf1": 0.2125
      },
      {
        "t": "2020-01-01T00:00:00.950Z",
        "acf1": 0.4479
      }
    ]
  }
}

We can visualize the response data:

[Chart: input data and ACF1 over t from 2020-01-01 00:00:00.000 to 2020-01-01 00:00:00.950]

Moving autocorrelation at lag 1 (ACF1) at t0.

We have an initial sequence of null values from t = 2020-01-01T00:00:00.000Z to t = 2020-01-01T00:00:00.450Z. The length of this sequence is given by null_seq_len = first_differences + seasonal_differences * period + window_size - 1 = 1 + 0 * 0 + 10 - 1 = 10.
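Since we will need null_seq_len again when preparing every subsequent request, it can be convenient to compute it from the request parameters. Below is a minimal sketch based on the formula above; the getNullSeqLen helper and its options object are ours, not part of any SDK:

// Sketch: compute null_seq_len from the formula above
function getNullSeqLen({ firstDifferences = 0, seasonalDifferences = 0, period = 0, windowSize }) {
  return firstDifferences + seasonalDifferences * period + windowSize - 1
}

console.log(getNullSeqLen({ firstDifferences: 1, windowSize: 10 })) // 10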

As recommended for fast streaming data, we don't use automatic data transformations. Specifying the transformations explicitly keeps them consistent and keeps null_seq_len constant across requests. In our case, we apply simple first-differencing, which is often sufficient to make a non-stationary time series stationary:

{
  "transformations": {
    "first_diff": {
      "apply": true,
      "n_diffs": 1
    }
  }
}

The transformation arguments of our request.
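For intuition, first-differencing replaces each value with its change from the previous value. The Time Door API applies this server-side when first_diff.apply is true, so the small sketch below is purely illustrative:

// Illustrative only: what first-differencing does to a series
function firstDiff(values, nDiffs = 1) {
  let result = values
  for (let d = 0; d < nDiffs; d++) {
    const previous = result
    result = previous.slice(1).map((value, i) => value - previous[i])
  }
  return result
}

console.log(firstDiff([1, 3, 6, 10])) // [ 2, 3, 4 ]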

t1

At t1 (1s later) we have 20 new data points (total = 40):

const time = [1577836800000, 1577836800050, 1577836800100, 1577836800150, 1577836800200, 1577836800250, 1577836800300, 1577836800350, 1577836800400, 1577836800450, 1577836800500, 1577836800550, 1577836800600, 1577836800650, 1577836800700, 1577836800750, 1577836800800, 1577836800850, 1577836800900, 1577836800950, 1577836801000, 1577836801050, 1577836801100, 1577836801150, 1577836801200, 1577836801250, 1577836801300, 1577836801350, 1577836801400, 1577836801450, 1577836801500, 1577836801550, 1577836801600, 1577836801650, 1577836801700, 1577836801750, 1577836801800, 1577836801850, 1577836801900, 1577836801950]

const data = [1.0137, 0.8591, 0.6281, 0.0703, -1.2546, 0.0635, 0.9731, 1.1618, 2.5978, 3.9978, 4.0237, 4.6535, 5.0346, 4.8492, 5.3419, 5.0351, 3.7687, 3.374, 1.1942, -1.0547, -3.3157, -5.057, -6.691, -7.6573, -7.6522, -7.2028, -5.7701, -4.5187, -3.7872, -4.2919, -3.8214, -4.195, -3.5719, -2.4975, -2.787, -3.6928, -3.2771, -3.8647, -2.9803, -3.6646]

Every dataset after the first one must be prepared based on null_seq_len so that its results line up exactly with the previous ones.

We achieve this by sending the 20 new data points at t1 together with the previous null_seq_len = 10 data points in our request:

import axios from 'axios'

const time = [1577836800000, 1577836800050, 1577836800100, 1577836800150, 1577836800200, 1577836800250, 1577836800300, 1577836800350, 1577836800400, 1577836800450, 1577836800500, 1577836800550, 1577836800600, 1577836800650, 1577836800700, 1577836800750, 1577836800800, 1577836800850, 1577836800900, 1577836800950, 1577836801000, 1577836801050, 1577836801100, 1577836801150, 1577836801200, 1577836801250, 1577836801300, 1577836801350, 1577836801400, 1577836801450, 1577836801500, 1577836801550, 1577836801600, 1577836801650, 1577836801700, 1577836801750, 1577836801800, 1577836801850, 1577836801900, 1577836801950]

const data = [1.0137, 0.8591, 0.6281, 0.0703, -1.2546, 0.0635, 0.9731, 1.1618, 2.5978, 3.9978, 4.0237, 4.6535, 5.0346, 4.8492, 5.3419, 5.0351, 3.7687, 3.374, 1.1942, -1.0547, -3.3157, -5.057, -6.691, -7.6573, -7.6522, -7.2028, -5.7701, -4.5187, -3.7872, -4.2919, -3.8214, -4.195, -3.5719, -2.4975, -2.787, -3.6928, -3.2771, -3.8647, -2.9803, -3.6646]

const requestData = {
  time_series: [
    {
      data: {},
      transformations: {
        first_diff: {
          apply: true,
          n_diffs: 1
        }
      }
    }
  ],
  method: "acf1",
  window_size: 10
}

const transformations = requestData.time_series[0].transformations
const nullSeqLen = transformations.first_diff.n_diffs + requestData.window_size - 1 // 1 + 10 - 1 = 10

const totalDataPoints = data.length // 40
const newDataPoints = 20

const startIndex = totalDataPoints - newDataPoints - nullSeqLen // 10

// transform the time series data and add it to the request data
// we use startIndex here:
for (let t = startIndex; t < data.length; t++) {
  Object.assign(requestData.time_series[0].data, {
    [new Date(time[t]).toISOString()]: data[t]
  })
}

const key = "" // add your Time Door API key here
const url = "https://api.timedoor.io/invocation/early-warning-signals"

axios
  .post(url, requestData, {
    headers: {
      "X-Time-Door-Key": key
    },
  })
  .then((response) => {
    console.log(response.data)
  })
  .catch((error) => {
    console.log(error)
  })

The Time Door API response:

{
  "report": {
    "computation_time": {
      "actual": 344.69,
      "billed": 400
    }
  },
  "result": {
    "reproduction": {},
    "data": [
      {
        "t": "2020-01-01T00:00:00.500Z",
        "acf1": null
      },
      {
        "t": "2020-01-01T00:00:00.550Z",
        "acf1": null
      },
      {
        "t": "2020-01-01T00:00:00.600Z",
        "acf1": null
      },
      {
        "t": "2020-01-01T00:00:00.650Z",
        "acf1": null
      },
      {
        "t": "2020-01-01T00:00:00.700Z",
        "acf1": null
      },
      {
        "t": "2020-01-01T00:00:00.750Z",
        "acf1": null
      },
      {
        "t": "2020-01-01T00:00:00.800Z",
        "acf1": null
      },
      {
        "t": "2020-01-01T00:00:00.850Z",
        "acf1": null
      },
      {
        "t": "2020-01-01T00:00:00.900Z",
        "acf1": null
      },
      {
        "t": "2020-01-01T00:00:00.950Z",
        "acf1": null
      },
      {
        "t": "2020-01-01T00:00:01.000Z",
        "acf1": 0.5681
      },
      {
        "t": "2020-01-01T00:00:01.050Z",
        "acf1": 0.5841
      },
      {
        "t": "2020-01-01T00:00:01.100Z",
        "acf1": 0.5869
      },
      {
        "t": "2020-01-01T00:00:01.150Z",
        "acf1": 0.416
      },
      {
        "t": "2020-01-01T00:00:01.200Z",
        "acf1": 0.2928
      },
      {
        "t": "2020-01-01T00:00:01.250Z",
        "acf1": 0.4954
      },
      {
        "t": "2020-01-01T00:00:01.300Z",
        "acf1": 0.5953
      },
      {
        "t": "2020-01-01T00:00:01.350Z",
        "acf1": 0.7657
      },
      {
        "t": "2020-01-01T00:00:01.400Z",
        "acf1": 0.7935
      },
      {
        "t": "2020-01-01T00:00:01.450Z",
        "acf1": 0.724
      },
      {
        "t": "2020-01-01T00:00:01.500Z",
        "acf1": 0.6295
      },
      {
        "t": "2020-01-01T00:00:01.550Z",
        "acf1": 0.4577
      },
      {
        "t": "2020-01-01T00:00:01.600Z",
        "acf1": 0.2175
      },
      {
        "t": "2020-01-01T00:00:01.650Z",
        "acf1": 0.1654
      },
      {
        "t": "2020-01-01T00:00:01.700Z",
        "acf1": 0.04004
      },
      {
        "t": "2020-01-01T00:00:01.750Z",
        "acf1": 0.1911
      },
      {
        "t": "2020-01-01T00:00:01.800Z",
        "acf1": -0.02625
      },
      {
        "t": "2020-01-01T00:00:01.850Z",
        "acf1": -0.2585
      },
      {
        "t": "2020-01-01T00:00:01.900Z",
        "acf1": -0.2808
      },
      {
        "t": "2020-01-01T00:00:01.950Z",
        "acf1": -0.3505
      }
    ]
  }
}

The first non-null value at t = 2020-01-01T00:00:01.000Z follows the last value we received in the previous response at t0, so we can append the non-null values of the new t1 response to the values of the previous t0 response and visualize the result:

[Chart: input data and ACF1 over t from 2020-01-01 00:00:00.000 to 2020-01-01 00:00:01.950]

Moving autocorrelation at lag 1 (ACF1) at t0 and t1.
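As a sketch of the appending step described above, assuming we keep the collected results in a plain array (the appendResults helper is ours, not part of axios or the Time Door API):

// Sketch: keep all non-null values received so far in one array
// `responseBody` is what axios exposes as response.data for this endpoint
function appendResults(results, responseBody) {
  const newValues = responseBody.result.data.filter((point) => point.acf1 !== null)
  return results.concat(newValues)
}

// usage inside the .then() handlers shown above:
// results = appendResults(results, response.data)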

Conclusion

For all data points arriving after t1, we simply repeat the steps we performed for t1. We now have a moving statistic that updates every second. Because it's a moving statistic, we don't need to send the entire dataset with every request, just the newest 30 data points (the 20 new data points plus the previous null_seq_len = 10 data points). For a real-world application, we might want to look at multiple statistics to get more comprehensive insights into the statistical properties of the time series data.
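A sketch of that bookkeeping, assuming the same constants as in the examples above (20 new data points per second and null_seq_len = 10); the buffer variables and the updateBuffers helper are illustrative:

// Sketch: keep only the newest data points needed for the next request
const newDataPoints = 20
const nullSeqLen = 10
const maxBufferLength = newDataPoints + nullSeqLen // 30

let timeBuffer = []
let dataBuffer = []

// call once per second with the 20 newest timestamps and values,
// then build requestData from the buffers exactly as shown above
function updateBuffers(newTime, newData) {
  timeBuffer = timeBuffer.concat(newTime).slice(-maxBufferLength)
  dataBuffer = dataBuffer.concat(newData).slice(-maxBufferLength)
}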

If you want to see the basic concepts of this tutorial in action, have a look at our live demo: Launch live demo