RedisTimeSeries

Ingest and query time series data with Redis

RedisTimeSeries is a Redis module that adds a time series data structure to Redis.

Features

  • High volume inserts, low latency reads
  • Query by start time and end time
  • Aggregated queries (min, max, avg, sum, range, count, first, last, std.p, std.s, var.p, var.s, twa) for any time bucket
  • Configurable maximum retention period
  • Downsampling / compaction for automatically updated aggregated timeseries
  • Secondary indexing for time series entries. Each time series has labels (field-value pairs), which allow querying by label

Client libraries

Official and community client libraries in Python, Java, JavaScript, Ruby, Go, C#, Rust, and PHP.

See the clients page for the full list.

Using with other metrics tools

In the RedisTimeSeries organization you can find projects that help you integrate RedisTimeSeries with other tools, including:

  1. Prometheus - read/write adapter to use RedisTimeSeries as a backend database.
  2. Grafana 7.1+ - using the Redis Data Source.
  3. Telegraf
  4. StatsD and Graphite exports, using the Graphite protocol.

Memory model

A time series is a linked list of memory chunks. Each chunk holds a predefined number of samples. Each sample is a 128-bit tuple: 64 bits for the timestamp and 64 bits for the value.
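
As a quick way to see this layout in practice, you can inspect a series with the TS.INFO command, which reports chunk-related fields such as chunkCount, chunkSize, and chunkType (the exact field set varies by version; sensor1 is an illustrative key):

TS.INFO sensor1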

Forum

Got questions? Feel free to ask on the RedisTimeSeries mailing list.

License

Redis Source Available License Agreement. See LICENSE

1 - Commands

Commands Overview

RedisTimeSeries API

Details on the module's commands can be filtered for a specific module or command, e.g., [TS.CREATE](/commands/?group=timeseries&name=ts.create). The details also include the syntax for the commands, where:

  • Command and subcommand names are in uppercase, for example TS.ADD
  • Optional arguments are enclosed in square brackets, for example [index]
  • Additional optional arguments are indicated by three period characters, for example ...

Commands usually require a key's name as their first argument.
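
For example, a deliberately abbreviated syntax line for TS.ADD combines these conventions; only a couple of its optional arguments are shown here:

TS.ADD key timestamp value [RETENTION retentionPeriod] [LABELS label value ...]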

2 - Quickstart

Quick Start Guide to RedisTimeSeries

Setup

You can get RedisTimeSeries set up in the cloud, in a Docker container, or on your own machine.

Redis Cloud

RedisTimeSeries is available on all Redis Cloud managed services, including a completely free managed database up to 30MB.

Get started here

Docker

To quickly try out RedisTimeSeries, launch an instance using docker:

docker run -p 6379:6379 -it --rm redislabs/redistimeseries

Download and run binaries

First download the pre-compiled version from the Redis download center.

Next, run Redis with RedisTimeSeries:

$ redis-server --loadmodule /path/to/module/redistimeseries.so

Build and Run it yourself

You can also build and run RedisTimeSeries on your own machine.

Major Linux distributions as well as macOS are supported.

Requirements

First, clone the RedisTimeSeries repository from git:

git clone --recursive https://github.com/RedisTimeSeries/RedisTimeSeries.git

Then, to install required build artifacts, invoke the following:

cd RedisTimeSeries
make setup

Alternatively, you can manually install the required dependencies listed in system-setup.py.

If make is not yet available, the following commands are equivalent:

./deps/readies/bin/getpy3
./system-setup.py

Note that system-setup.py will install various packages on your system using the native package manager and pip. This requires root permissions (i.e. sudo) on Linux.

If you prefer to avoid that, you can:

  • Review system-setup.py and install packages manually,
  • Utilize a Python virtual environment,
  • Use Docker with the --volume option to create an isolated build environment, as in the sketch below.
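
A minimal sketch of such a Docker-based build environment (the Debian base image is an assumption; any supported distribution works):

ts=$(docker run -d -it -v $PWD:/build debian:bullseye bash)
docker exec -it $ts bash
# inside the container:
cd /build
make setup
make build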

Build

make build

Binary artifacts are placed under the bin directory.

Run

In your redis-server, run: loadmodule bin/redistimeseries.so

For more information about modules, see the official Redis documentation.

Give it a try with redis-cli

After you set up RedisTimeSeries, you can interact with it using redis-cli.

$ redis-cli
127.0.0.1:6379> TS.CREATE sensor1
OK

Creating a timeseries

A new timeseries can be created with the TS.CREATE command; for example, to create a timeseries named sensor1 run the following:

TS.CREATE sensor1

You can prevent your timeseries from growing indefinitely by setting a maximum age for samples, compared to the last event time (in milliseconds), with the RETENTION option. The default retention is 0, meaning the series is never trimmed.

TS.CREATE sensor1 RETENTION 2678400000

This will create a timeseries called sensor1 and trim it to samples from the last month (31 days = 2,678,400,000 milliseconds).

Adding data points

To add new data points to a timeseries, use the TS.ADD command:

TS.ADD key timestamp value

The timestamp argument is the UNIX timestamp of the sample in milliseconds and value is the numeric data value of the sample.

Example:

TS.ADD sensor1 1626434637914 26

To add a datapoint with the current timestamp you can use a * instead of a specific timestamp:

TS.ADD sensor1 * 26

You can append data points to multiple timeseries at the same time with the TS.MADD command:

TS.MADD key timestamp value [key timestamp value ...]
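
For instance, assuming two series sensor1 and sensor2 already exist, a single TS.MADD call can append a sample to each:

TS.MADD sensor1 1626434637914 26 sensor2 1626434637914 19.7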

Deleting data points

Data points between two timestamps (inclusive) can be deleted with the TS.DEL command:

TS.DEL key fromTimestamp toTimestamp

Example:

TS.DEL sensor1 1000 2000

To delete a single timestamp, use it as both the "from" and "to" timestamp:

TS.DEL sensor1 1000 1000

Note: When a sample is deleted, the data in all downsampled timeseries is recalculated for the affected bucket. However, if part of the bucket has already been removed because it is outside the retention period, the full bucket cannot be recalculated, so in those cases the delete operation is refused.

Labels

Labels are key-value metadata attached to a timeseries, allowing you to group and filter series. Label values can be either strings or numbers, and labels are added to a timeseries on creation:

TS.CREATE sensor1 LABELS region east
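
A series can carry several labels, and the TS.QUERYINDEX command lists the keys whose labels match a filter (sensor2 and the area_id label here are illustrative):

TS.CREATE sensor2 LABELS region east area_id 32
TS.QUERYINDEX region=east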

Downsampling

Another useful feature of RedisTimeSeries is compacting data by creating a rule for downsampling (TS.CREATERULE). For example, if you have collected more than one billion data points in a day, you could aggregate the data by every minute in order to downsample it, thereby reducing the dataset size to 24 * 60 = 1,440 data points. You can choose one of the many available aggregation types in order to aggregate multiple data points from a certain minute into a single one. The currently supported aggregation types are: avg, sum, min, max, range, count, first, last, std.p, std.s, var.p, var.s and twa.

It's important to point out that there is no data rewriting on the original timeseries; the compaction happens in a new series, while the original one stays the same. In order to prevent the original timeseries from growing indefinitely, you can use the retention option, which will trim it down to a certain period of time.

NOTE: You need to create the destination (the compacted) timeseries before creating the rule.

TS.CREATERULE sourceKey destKey AGGREGATION aggregationType bucketDuration

Example:

TS.CREATE sensor1_compacted  # Create the destination timeseries first
TS.CREATERULE sensor1 sensor1_compacted AGGREGATION avg 60000   # Create the rule

With this creation rule, datapoints added to the sensor1 timeseries will be grouped into buckets of 60 seconds (60000ms), averaged, and saved in the sensor1_compacted timeseries.
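
As a quick sanity check, you can add samples to the source series and read the aggregated buckets back from the compacted one (note that a bucket is generally written to the destination only once it closes, that is, when a sample belonging to a later bucket arrives):

TS.ADD sensor1 * 26
TS.RANGE sensor1_compacted - +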

Filtering

RedisTimeSeries allows filtering by value, by timestamp, and by label:

Filtering by label

You can retrieve datapoints from multiple timeseries in the same query, and the way to do this is by using label filters. For example:

TS.MRANGE - + FILTER area_id=32

This query will show data from all sensors (timeseries) that have a label of area_id with a value of 32. The results will be grouped by timeseries.

We can also use the TS.MGET command to get the last sample that matches a specific filter:

TS.MGET FILTER area_id=32

Filtering by value

We can filter by value across a single or multiple timeseries:

TS.RANGE sensor1 - + FILTER_BY_VALUE 25 30

This command will return all data points whose value sits between 25 and 30, inclusive.

To achieve the same filtering on multiple series we have to combine the filtering by value with filtering by label:

TS.MRANGE - +  FILTER_BY_VALUE 20 30 FILTER region=east

Filtering by timestamp

To retrieve the datapoints for specific timestamps on one or multiple timeseries we can use the FILTER_BY_TS argument:

Filter on one timeseries:

TS.RANGE sensor1 - + FILTER_BY_TS 1626435230501 1626443276598

Filter on multiple timeseries:

TS.MRANGE - +  FILTER_BY_TS 1626435230501 1626443276598 FILTER region=east

Aggregation

It's possible to combine values of one or more timeseries by leveraging aggregation functions:

TS.RANGE ... AGGREGATION aggType bucketDuration...

For example, to find the average temperature per hour in our sensor1 series we could run:

TS.RANGE sensor1 - + AGGREGATION avg 3600000

To achieve the same across multiple sensors from the area with id of 32 we would run:

TS.MRANGE - + AGGREGATION avg 3600000 FILTER area_id=32

Aggregation bucket alignment

When doing aggregations, the aggregation buckets are aligned to timestamp 0, as follows:

TS.RANGE sensor3 10 70 AGGREGATION min 25
Value:        |      (1000)     (2000)     (3000)     (4000)     (5000)     (6000)     (7000)
Timestamp:    |-------|10|-------|20|-------|30|-------|40|-------|50|-------|60|-------|70|--->  

Bucket(25ms): |_________________________||_________________________||___________________________|
                           V                          V                           V
                  min(1000, 2000)=1000      min(3000, 4000)=3000     min(5000, 6000, 7000)=5000                

And we will get the following datapoints: 1000, 3000, 5000.

You can choose to align the buckets to the start or end of the queried interval, as follows:

TS.RANGE sensor3 10 70 ALIGN start AGGREGATION min 25
Value:        |      (1000)     (2000)     (3000)     (4000)     (5000)     (6000)     (7000)
Timestamp:    |-------|10|-------|20|-------|30|-------|40|-------|50|-------|60|-------|70|--->  

Bucket(25ms):          |__________________________||_________________________||___________________________|
                                    V                          V                           V
                        min(1000, 2000, 3000)=1000      min(4000, 5000)=4000     min(6000, 7000)=6000                

The result array will contain the following datapoints: 1000, 4000, and 6000.

Aggregation across timeseries

By default, results of multiple timeseries will be grouped by timeseries, but (since v1.6) you can use the GROUPBY and REDUCE options to group them by label and apply an additional aggregation.

To find minimum temperature per region, for example, we can run:

TS.MRANGE - + FILTER region=(east,west) GROUPBY region REDUCE min

3 - Configuration

Run-time configuration

RedisTimeSeries supports a few run-time configuration options that are set when loading the module. In time, more options will be added.

Passing Configuration Options During Loading

In general, passing configuration options is done by appending arguments after the --loadmodule argument on the command line, after the loadmodule directive in a Redis config file, or after the MODULE LOAD command. For example:

In redis.conf:

loadmodule redistimeseries.so OPT1 OPT2

From redis-cli:

127.0.0.1:6379> MODULE LOAD redistimeseries.so OPT1 OPT2

From command line:

$ redis-server --loadmodule ./redistimeseries.so OPT1 OPT2

RedisTimeSeries configuration options

COMPACTION_POLICY {policy}

Default compaction/downsampling rules for keys that are created automatically by TS.ADD.

Rules are separated by semicolons (;); each rule consists of several fields separated by colons (:):

  • aggregation function - avg, sum, min, max, count, first, last

  • time bucket duration - a number and a time unit (example for 1 minute: 1M)

    • m - millisecond
    • s - second
    • M - minute
    • h - hour
    • d - day
  • retention time - a number and a time unit (for example, 1h to retain one hour)

Example:

max:1M:1h - Aggregate using max over 1 minute and retain the last 1 hour

Default

No compaction rules.

Example

$ redis-server --loadmodule ./redistimeseries.so COMPACTION_POLICY "max:1m:1h;min:10s:5d:10d;last:5M:10ms;avg:2h:10d;avg:3d:100d"

RETENTION_POLICY

Maximum age for samples compared to the last event time (in milliseconds), per key. This configuration sets the default retention for newly created keys that do not have an override.

Default

0

Example

$ redis-server --loadmodule ./redistimeseries.so RETENTION_POLICY 20

CHUNK_TYPE

Default chunk type for automatically created keys when COMPACTION_POLICY is configured. Possible values: COMPRESSED, UNCOMPRESSED.

Default

COMPRESSED

Example

$ redis-server --loadmodule ./redistimeseries.so COMPACTION_POLICY max:1m:1h CHUNK_TYPE COMPRESSED

NUM_THREADS

The maximum number of per-shard threads for cross-key queries when using cluster mode (TS.MRANGE, TS.MGET, and TS.QUERYINDEX). The value must be equal to or greater than 1. Note that increasing this value may either improve or degrade performance.

Default

3

Example

$ redis-server --loadmodule ./redistimeseries.so NUM_THREADS 3

DUPLICATE_POLICY

Policy for handling duplicate samples (multiple samples with the same timestamp). The possible policies are:

  • BLOCK - an error will occur for any out-of-order sample
  • FIRST - ignore the new value
  • LAST - override with the latest value
  • MIN - only override if the value is lower than the existing value
  • MAX - only override if the value is higher than the existing value
  • SUM - if a previous sample exists, add the new sample to it so that the updated value is equal to (previous + new); if no previous sample exists, set the updated value equal to the new value

Precedence order

Since the duplicate policy can be provided at different levels, the effective policy is resolved in the following order (see the sketch after this list):

  1. TS.ADD input
  2. Key level policy
  3. Module configuration (AKA database-wide)
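
A sketch of all three levels together (key name and values are illustrative): the module-level default can be overridden per key with the DUPLICATE_POLICY option of TS.CREATE, and per insertion with the ON_DUPLICATE option of TS.ADD:

$ redis-server --loadmodule ./redistimeseries.so DUPLICATE_POLICY BLOCK
TS.CREATE sensor1 DUPLICATE_POLICY MAX   # key-level policy overrides the module default
TS.ADD sensor1 1000 26 ON_DUPLICATE LAST # per-call policy wins for this insertion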

Default configuration

The database-wide default policy is BLOCK; new and pre-existing keys without a key-level policy conform to it.

Example

$ redis-server --loadmodule ./redistimeseries.so DUPLICATE_POLICY LAST

4 - Development

Developing RedisTimeSeries

Developing RedisTimeSeries involves setting up the development environment (which can be either Linux-based or macOS-based), building RedisTimeSeries, running tests and benchmarks, and debugging both the RedisTimeSeries module and its tests.

Cloning the git repository

Invoke the following command to clone the RedisTimeSeries module and its submodules:

git clone --recursive https://github.com/RedisTimeSeries/RedisTimeSeries.git

Working in an isolated environment

There are several reasons to develop in an isolated environment, like keeping your workstation clean and developing for a different Linux distribution. The most general option for an isolated environment is a virtual machine (it's very easy to set one up using Vagrant). Docker is an even more agile solution, as it offers an almost instant setup:

ts=$(docker run -d -it -v $PWD:/build debian:bullseye bash)
docker exec -it $ts bash

Then, from within the container, cd /build and go on as usual. In this mode, all installations remain in the scope of the Docker container. Upon exiting the container, you can either re-invoke it with the above docker exec, or commit the state of the container to an image and re-invoke it at a later stage:

docker commit $ts ts1
docker stop $ts
ts=$(docker run -d -it -v $PWD:/build ts1 bash)
docker exec -it $ts bash

Installing prerequisites

To build and test RedisTimeSeries, one needs to install several packages, depending on the underlying OS. Currently, Ubuntu/Debian, CentOS, Fedora, and macOS are supported.

If you have gnu make installed, you can execute

cd RedisTimeSeries
make setup

Alternatively, just invoke the following:

cd RedisTimeSeries
git submodule update --init --recursive    
./deps/readies/bin/getpy3
./system-setup.py

Note that system-setup.py will install various packages on your system using the native package manager and pip. This requires root permissions (i.e. sudo) on Linux.

If you prefer to avoid that, you can:

  • Review system-setup.py and install packages manually,
  • Use an isolated environment like explained above,
  • Utilize a Python virtual environment, as Python installations are known to be sensitive when not used in isolation.

Installing Redis

As a rule of thumb, you're better off running the latest Redis version.

If your OS has a Redis package, you can install it using the OS package manager.

Otherwise, you can invoke ./deps/readies/bin/getredis.

Getting help

make help provides a quick summary of the development features.

Building from source

make will build RedisTimeSeries.

Build artifacts are placed into bin/linux-x64-release (or similar, according to your platform and build options).

Use make clean to remove built artifacts. make clean ALL=1 will remove the entire binary artifacts directory.

Running Redis with RedisTimeSeries

The following will run Redis and load the RedisTimeSeries module:

make run

You can open redis-cli in another terminal to interact with it.

Running tests

The module includes a basic set of unit tests and integration tests:

  • C unit tests, located in src/tests, run by make unit_tests.
  • Python integration tests (enabled by RLTest), located in tests/flow, run by make flow_tests.

One can run all tests by invoking make test. A single test can be run using the TEST parameter, e.g. make flow_test TEST=file:name.

Debugging

To build for debugging (enabling symbolic information and disabling optimization), run make DEBUG=1. One can then use make run DEBUG=1 to invoke gdb. In addition to the usual way to set breakpoints in gdb, it is possible to use the BB macro to set a breakpoint inside RedisTimeSeries code. It will only have an effect when running under gdb.

Similarly, when running Python tests in single-test mode, one can set a breakpoint by using the BB() function inside a test. This will invoke pudb.

The two methods can be combined: one can set a breakpoint within a flow test, and when reached, connect gdb to a redis-server process to debug the module.

5 - Clients

RedisTimeSeries Client Libraries

RedisTimeSeries has several client libraries, written by the module authors and community members, that abstract its API in different programming languages.

While it is possible and simple to use the raw Redis commands API, in most cases it's more convenient to use a client library abstracting it.

Currently available Libraries

Some languages have client libraries that provide support for RedisTimeSeries commands:

| Project             | Language   | License      | Author             |
| ------------------- | ---------- | ------------ | ------------------ |
| Jedis               | Java       | MIT          | Redis              |
| JRedisTimeSeries    | Java       | BSD-3        | RedisLabs          |
| redis-modules-java  | Java       | Apache-2     | dengliming         |
| redistimeseries-go  | Go         | Apache-2     | RedisLabs          |
| rueidis             | Go         | Apache-2     | Rueian             |
| redis-py (examples) | Python     | MIT          | RedisLabs          |
| NRedisTimeSeries    | .NET       | BSD-3        | RedisLabs          |
| phpRedisTimeSeries  | PHP        | MIT          | Alessandro Balasco |
| node-redis          | JavaScript | MIT          | Redis              |
| redis-time-series   | JavaScript | MIT          | Rafa Campoy        |
| redistimeseries-js  | JavaScript | MIT          | Milos Nikolovski   |
| redis-modules-sdk   | TypeScript | BSD-3-Clause | Dani Tseitlin      |
| redis_ts            | Rust       | BSD-3        | Thomas Profelt     |
| redistimeseries     | Ruby       | MIT          | Eaden McKee        |
| redis-time-series   | Ruby       | MIT          | Matt Duszynski     |

6 - Reference

Reference

6.1 - Out-of-order / backfilled ingestion performance considerations

Out-of-order / backfilled ingestion performance considerations

When an older timestamp is inserted into a time series, the chunk of memory corresponding to the new sample’s time frame will potentially have to be retrieved from the main memory (you can read more about these chunks here). When this chunk is a compressed chunk, it will also have to be decoded before we can insert/update to it. These are memory-intensive—and in the case of decoding, compute-intensive—operations that will influence the overall achievable ingestion rate.
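
For example, appending a sample whose timestamp is older than the series' last timestamp exercises this backfill path (the key and timestamps are illustrative):

TS.ADD sensor1 1626434637914 26   # in-order append
TS.ADD sensor1 1626434630000 25   # out-of-order: targets an earlier chunk, which may have to be decoded first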

Ingest performance is critical for us, which pushed us to assess and be transparent about the impact of the out-of-order backfilled ratio on our overall high-performance TSDB.

To do so, we created a Go benchmark client that enabled us to control key factors that dictate overall system performance, like the out-of-order ratio, the compression of the series, the number of concurrent clients used, and command pipelining. For the full benchmark-driver configuration details and parameters, please refer to this GitHub link.

Furthermore, all benchmark variations were run on Amazon Web Services instances, provisioned through our benchmark-testing infrastructure. Both the benchmarking client and database servers were running on separate c5.9xlarge instances. The tests were executed on a single-shard setup, with RedisTimeSeries version 1.4.

Below you can see the correlation between achievable ops/sec and out-of-order ratio for both compressed and uncompressed chunks.

Compressed chunks out-of-order/backfilled impact analysis

With compressed chunks, given that a single out-of-order datapoint implies the full decompression from double delta of the entire chunk, you should expect higher overheads in out-of-order writes.

As a rule of thumb, to increase out-of-order compressed performance, reduce the chunk size as much as possible. Smaller chunks imply less computation on double-delta decompression and thus less overall impact, with the drawback of a lower compression ratio.
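
A sketch of creating a series with a smaller chunk size using the CHUNK_SIZE option (the value is in bytes; 1024 is an illustrative choice, not a recommendation):

TS.CREATE sensor1 CHUNK_SIZE 1024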

The graphs and tables below make these key points:

  • If the database receives 1% of out-of-order samples with our current default chunk size in bytes (4096), the overall impact on the ingestion rate should be 10%.

  • At larger out-of-order percentages, like 5%, 10%, or even 25%, the overall impact should be between 35% and 75% fewer ops/sec. At this level of out-of-order percentages, you should really consider reducing the chunk size.

  • We've observed a maximum 95% drop in the achievable ops/sec even at 99% out-of-order ingestion. (Again, reducing the chunk size can cut the impact in half.)

[Charts: compressed chunks - overall ops/sec vs. out-of-order percentage; overall p50 latency vs. out-of-order percentage; out-of-order overhead table]

Uncompressed chunks out-of-order/backfilled impact analysis

As visible in the charts and tables below, the chunk size does not affect the overall out-of-order impact on ingestion (meaning that with a chunk size of 256 bytes or a chunk size of 4096 bytes, the expected impact of out-of-order ingestion is the same, as it should be). Apart from that, we can observe the following key takeaways:

  • If the database receives 1% of out-of-order samples, the overall impact on the ingestion rate should be low or even unmeasurable.

  • At higher out-of-order percentages, like 5%, 10%, or even 25%, the overall impact should be 5% to 19% fewer ops/sec.

  • We've observed a maximum 45% drop in the achievable ops/sec, even at 99% out-of-order ingestion.

[Charts: uncompressed chunks - overall ops/sec vs. out-of-order percentage; overall p50 latency vs. out-of-order percentage; out-of-order overhead table]