Wikipedia
"A time series database is a specialized database that efficiently stores and retrieves time-stamped data. Each time series is stored individually as an optimized list of values, enabling fast data retrieval and cost-effective storage." --Honeycomb
"Time series databases are specialized databases designed to manage data that is organized and indexed by time. Unlike traditional databases, which are optimized for general-purpose data storage, TSDBs focus on efficiently storing, querying, and analyzing sequences of time-stamped data points." -- Datacamp
Characteristics of Time Series Databases: There are a few things that TSDBs do differently than traditional databases.
Optimized for time-stamped data. At their core, TSDBs are built to handle data with timestamps as a fundamental attribute. Every data point in a TSDB includes a timestamp, which serves as its primary index. This allows these databases to efficiently store and retrieve time-ordered sequences and provide quick access to historical trends or recent events. Most TSDBs use time-based partitioning, meaning the data is stored in partitions based on time intervals (e.g., hourly, daily). This enables efficient pruning, where queries ignore irrelevant partitions altogether. They can also implement time buckets, grouping data into predefined time windows (e.g., 1 minute, 1 hour) for faster aggregations.
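As a rough illustration of the partitioning and bucketing ideas above, the following sketch uses only the Python standard library. The daily partition naming scheme and the function names (`partition_key`, `partitions_for_range`, `time_bucket`) are assumptions made for this example and do not correspond to any particular TSDB.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def partition_key(ts: datetime) -> str:
    """Name of the daily partition holding ts (hypothetical naming scheme)."""
    return ts.strftime("%Y-%m-%d")


def partitions_for_range(start: datetime, end: datetime) -> list[str]:
    """Partitions a query over [start, end] has to read; all others are pruned."""
    day = start.replace(hour=0, minute=0, second=0, microsecond=0)
    keys = []
    while day <= end:
        keys.append(partition_key(day))
        day += timedelta(days=1)
    return keys


def time_bucket(ts: datetime, width: timedelta) -> datetime:
    """Floor a timezone-aware timestamp to the start of its fixed-width bucket."""
    return EPOCH + ((ts - EPOCH) // width) * width
```

A query for a two-day range would touch only the partitions returned by `partitions_for_range`, and `time_bucket(ts, timedelta(hours=1))` gives the key under which an hourly aggregate for `ts` could be stored.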
High ingestion rates. Time series data is often generated at a rapid pace—think of IoT devices sending thousands of data points per second or a server monitoring tool capturing system metrics in real time. TSDBs are optimized for these high write rates and can ingest vast amounts of data without slowing down or losing information. This is usually achieved using append-only data storage models and in-memory buffers to prevent locks or transactional bottlenecks.
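The append-only model with an in-memory buffer can be sketched as below. This is a deliberately minimal, single-threaded illustration: the class name `AppendOnlyWriter`, the `flush_every` parameter and the JSON-lines file layout are all assumptions for the example. Points still in the buffer would be lost on a crash, which a real system would have to guard against separately (for example with a write-ahead log).

```python
import json


class AppendOnlyWriter:
    """Buffer points in memory and append them to a log file in batches."""

    def __init__(self, path, flush_every=1000):
        self.path = path
        self.flush_every = flush_every  # hypothetical batch size
        self.buffer = []

    def write(self, ts, value):
        # Writes only touch the in-memory list; existing data is never updated.
        self.buffer.append((ts, value))
        if len(self.buffer) >= self.flush_every:
            self.flush()

    def flush(self):
        if not self.buffer:
            return
        # Mode "a" appends at the end of the file, one JSON array per line.
        with open(self.path, "a", encoding="utf-8") as f:
            f.writelines(json.dumps(point) + "\n" for point in self.buffer)
        self.buffer.clear()
```

Because each write is a list append and each flush is a sequential file append, no existing record is ever locked or rewritten.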
Efficient queries for time ranges. Analyzing time series data often involves querying specific time intervals or windows, such as “last 24 hours” or “this year compared to last year.” TSDBs are built with this in mind, offering specialized query capabilities that allow users to quickly retrieve data over defined time ranges. They also support aggregations like averages, sums, or trends to offer valuable analytics without complex query logic.
The query optimization techniques include:
- Pre-aggregated data: TSDBs often pre-calculate summaries for common time intervals (e.g., hourly or daily averages).
- Sliding window algorithms: These help efficiently compute metrics over moving time windows, such as rolling averages.
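A sliding-window rolling average can be sketched as follows. The sketch assumes points arrive sorted by timestamp as `(epoch_seconds, value)` pairs, and the function name `rolling_mean` is made up for this example. It keeps a running sum so each point is added and removed once rather than re-summing the whole window.

```python
from collections import deque


def rolling_mean(points, window_seconds):
    """Yield (ts, mean) over the trailing window (ts - window_seconds, ts]."""
    window = deque()
    total = 0.0
    for ts, value in points:
        window.append((ts, value))
        total += value
        # Evict points that have fallen out of the trailing window.
        while window[0][0] <= ts - window_seconds:
            _, old_value = window.popleft()
            total -= old_value
        yield ts, total / len(window)
```

For long streams of non-integer values a production implementation might periodically recompute the sum to limit floating-point drift; the sketch ignores that.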
Data compression and retention policies. To manage the volume of time series data that accumulates over time, TSDBs use specialized compression techniques that reduce storage requirements while preserving query performance. TSDBs usually also include retention policies that let users define how long data is kept. For example, a system might retain detailed data for the past month and keep only downsampled data beyond that. Downsampling is the process of reducing the granularity of data as it ages: raw temperature readings might be kept at 10-second resolution for the most recent 7 days, while older data is downsampled to hourly averages to save space while still retaining historical trends.
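The retention-plus-downsampling idea can be sketched like this. The function names (`downsample`, `apply_retention`) and parameters are assumptions for the example, timestamps are plain epoch seconds, and the average is just one possible aggregate.

```python
from collections import defaultdict


def downsample(points, bucket_seconds):
    """Average (epoch_seconds, value) points into fixed-width time buckets."""
    sums = defaultdict(lambda: [0.0, 0])  # bucket start -> [sum, count]
    for ts, value in points:
        bucket = ts - ts % bucket_seconds
        sums[bucket][0] += value
        sums[bucket][1] += 1
    return [(bucket, s / n) for bucket, (s, n) in sorted(sums.items())]


def apply_retention(points, now, raw_seconds, bucket_seconds):
    """Keep raw points newer than the cutoff and downsample everything older."""
    cutoff = now - raw_seconds
    old = [p for p in points if p[0] < cutoff]
    recent = [p for p in points if p[0] >= cutoff]
    return downsample(old, bucket_seconds) + recent
```

With `raw_seconds=7 * 24 * 3600` and `bucket_seconds=3600`, this corresponds to the "raw for 7 days, hourly averages afterwards" policy described above. The sketch does not align the cutoff to a bucket boundary, so the bucket that straddles the cutoff is averaged only over its older points.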
Examples of advanced compression techniques include:
- Delta encoding: Storing the difference between consecutive values instead of the full value.
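- Gorilla compression: A method for compressing floating-point time series data by XORing each value with the previous one and storing only the meaningful bits of the result; timestamps are compressed separately with delta-of-delta encoding.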
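Both techniques can be illustrated in a few lines. The sketch below shows delta encoding on integer timestamps and the XOR step that Gorilla-style compression applies to consecutive floating-point values. It stops short of the variable-length bit packing that produces the actual space savings, and the function names are made up for this example.

```python
import struct


def delta_encode(values):
    """Store the first value, then the difference between consecutive values."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    """Rebuild the original values with a running sum."""
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total)
    return out


def float_bits(x):
    """Reinterpret a float as its 64-bit IEEE 754 bit pattern."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def xor_with_previous(values):
    """XOR each value's bit pattern with its predecessor's (first kept as-is)."""
    bits = [float_bits(v) for v in values]
    if not bits:
        return []
    return [bits[0]] + [b ^ a for a, b in zip(bits, bits[1:])]
```

Regularly spaced timestamps turn into a run of small, repeated deltas, and a repeated float value XORs to zero; both are much cheaper to store than the full 64-bit originals.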
Writes dominate. Our primary requirement for a TSDB is that it should always be available to take writes. As we have hundreds of systems exposing multiple data items, the write rate might easily exceed tens of millions of data points each second. In contrast, the read rate is usually a couple orders of magnitude lower as it is primarily from automated systems watching 'important' time series, data visualization systems presenting dashboards for human consumption, or from human operators wishing to diagnose an observed problem.
State transitions. We wish to identify issues that emerge from a new software release, an unexpected side effect of a configuration change, a network cut and other issues that result in a significant state transition. Thus, we wish for our TSDB to support fine-grained aggregations over short time windows. The ability to display state transitions within tens of seconds is particularly prized as it allows automation to quickly remediate problems before they become widespread.
High availability. Even if a network partition or other failure leads to disconnection between different datacenters, systems operating within any given datacenter ought to be able to write data to local TSDB machines and be able to retrieve this data on demand.
Fault tolerance. We wish to replicate all writes to multiple regions so we can survive the loss of any given datacenter or geographic region due to a disaster.
Lack of ACID. These systems do not store any user data so traditional ACID guarantees are not a core requirement for TSDBs. However, a high percentage of writes must succeed at all times, even in the face of disasters that might render entire datacenters unreachable. Additionally, recent data points are of higher value than older points given the intuition that knowing if a particular system or service is broken right now is more valuable to an operations engineer than knowing if it was broken an hour ago.
Databases
- atlas In-memory dimensional time series database from Netflix.
- Cassandra Apache Cassandra is an open source NoSQL distributed database trusted by thousands of companies for scalability and high availability without compromising performance.
- ClickHouse An open-source, high performance columnar OLAP database management system for real-time analytics using SQL.
- cratedb The SQL database for complex, large scale time series workloads in industrial IoT.
- druid A high performance real-time analytics database.
- fauna Fauna is a flexible, developer-friendly, transactional database delivered as a secure and scalable cloud API with native GraphQL.
- InfluxDB The essential time series toolkit: dashboards, queries, tasks and agents all in one place.
- KairosDB Fast Time Series Database on Cassandra.
- OpenTSDB The Scalable Time Series Database.
- prometheus An open-source systems monitoring and alerting toolkit originally built at SoundCloud.
- QuestDB An open source SQL database designed to process time series data, faster.
- SiriDB A highly scalable, robust and super fast time series database.
- TimescaleDB TimescaleDB is the leading open-source relational database with support for time-series data.
- TDengine An open-source time-series database with high-performance, scalability and SQL support.
- Whisper A file-based time-series database format for Graphite.
Managed database services
Packages
Python
- adtk A Python toolkit for rule-based/unsupervised anomaly detection in time series.
- aeon A unified framework for machine learning with time series.
- alibi-detect Algorithms for outlier, adversarial and drift detection.
- AutoTS A time series package for Python designed for rapidly deploying high-accuracy forecasts at scale.
- Auto_TS Automatically build ARIMA, SARIMAX, VAR, FB Prophet and XGBoost Models on Time Series data sets with a Single Line of Code. Now updated with Dask to handle millions of rows.
- cesium Open-Source Platform for Time Series Inference.
- darts Time Series Made Easy in Python. A python library for easy manipulation and forecasting of time series.
- deeptime Python library for analysis of time series data including dimensionality reduction, clustering, and Markov model estimation.
- dtw-python Python port of R's Comprehensive Dynamic Time Warp algorithm package.
- etna ETNA is an easy-to-use time series forecasting framework.
- fost Forecasting open source tool aims to provide an easy-use tool for spatial-temporal forecasting.
- functime Time-series machine learning and embeddings at scale.
- gluon-ts Probabilistic time series modeling in Python from AWS.
- gordo Building thousands of models with time series data to monitor systems.
- greykite A flexible, intuitive and fast forecasting library from LinkedIn.
- hmmlearn Hidden Markov Models in Python, with scikit-learn like API.
- HyperTS A Full-Pipeline Automated Time Series (AutoTS) Analysis Toolkit.
- kats A kit to analyze time series data, a lightweight, easy-to-use, generalizable, and extendable framework to perform time series analysis, from understanding the key statistics and characteristics, detecting change points and anomalies, to forecasting future trends.
- libmaxdiv Implementation of the Maximally Divergent Intervals algorithm for Anomaly Detection in multivariate spatio-temporal time-series.
- lifelines Survival analysis in Python.
- luminaire A python package that provides ML driven solutions for monitoring time series data. Luminaire provides several anomaly detection and forecasting capabilities that incorporate correlational and seasonal patterns in the data over time as well as uncontrollable variations.
- mass-ts Mueen's Algorithm for Similarity Search, a library used for searching time series subsequences under z-normalized Euclidean distance for similarity.
- matrixprofile A Python library making time series data mining tasks, utilizing matrix profile algorithms, accessible to everyone.
- Merlion A Python library for time series intelligence. It provides an end-to-end machine learning framework that includes loading and transforming data, building and training models, post-processing model outputs, and evaluating model performance.
- neuralforecast Scalable and user friendly neural forecasting algorithms.
- nixtla Automated time series processing and forecasting.
- orbit A package for Bayesian forecasting with object-oriented design and probabilistic models under the hood from Uber.
- pastas An open-source Python framework for the analysis of hydrological time series.
- pmdarima A statistical library designed to fill the void in Python's time series analysis capabilities, including the equivalent of R's auto.arima function.
- prophet Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.
- pyaf PyAF is an Open Source Python library for Automatic Time Series Forecasting built on top of popular pydata modules.
- PyDLM Bayesian time series modeling package. Based on the Bayesian dynamic linear model (Harrison and West, 1999) and optimized for fast model fitting and inference.
- PyFlux Open source time series library for Python.
- pyFTS An open source library for Fuzzy Time Series in Python.
- PyOD A Python toolbox for scalable outlier detection (Anomaly Detection).
- PyPOTS A python toolbox/library for data mining on partially-observed time series (A.K.A. irregularly-sampled time series), supporting tasks of forecasting/imputation/classification/clustering on incomplete multivariate time series with missing values.
- pyspi Comparative analysis of pairwise interactions in multivariate time series.
- pytimetk The time series toolkit for python.
- rrcf Implementation of the Robust Random Cut Forest algorithm for anomaly detection on streams.
- scalecast A scalable forecasting approach for time series in Python.
- scikit-hts Hierarchical Time Series Forecasting with a familiar API.
- seglearn A python package for machine learning time series or sequences.
- shyft Time series for Python and C++, including distributed storage and calculations; a hydrologic forecasting toolbox with high-performance flexible stacks, including calibration; and energy-market models and microservices.
- similarity_measures Quantify the difference between two arbitrary curves.
- skforecast Time series forecasting with scikit-learn models.
- sktime A scikit-learn compatible Python toolbox for learning with time series.
- statsforecast Lightning fast forecasting with statistical and econometric models.
- statsmodels.tsa Time Series Analysis (tsa). statsmodels.tsa contains model classes and functions that are useful for time series analysis.
- stumpy A powerful and scalable Python library that can be used for a variety of time series data mining tasks.
- TICC A python solver for efficiently segmenting and clustering a multivariate time series.
- tick Module for statistical learning, with a particular emphasis on time-dependent modelling.
- TimeCopilot An open-source forecasting agent that combines the power of large language models with state-of-the-art time series foundation models.
- timemachines Continuously evaluated, functional, incremental, time-series forecasting.
- TimeSeers A hierarchical Bayesian Time Series model based on Prophet, written in PyMC3.
- TimesFM TimesFM (Time Series Foundation Model) is a pretrained time-series foundation model developed by Google Research for time-series forecasting.
- Time Series Generator Provides a solution for the direct multi-step outputs limitation in Keras.
- tods An Automated Time-series Outlier Detection System.
- torchtime Time series data sets for PyTorch.
- TSDB Time-Series DataBase: A Python toolbox helping load time-series datasets easily.
- tsai State-of-the-art Deep Learning library for Time Series and Sequences.
- tscv Time Series Cross-Validation - an extension for scikit-learn.
- tsflex Flexible time series feature extraction & processing.
- tslearn The machine learning toolkit for time series analysis in Python.
- tslumen A library for Time Series Exploratory Data Analysis (EDA).
- tsmoothie A python library for time-series smoothing and outlier detection in a vectorized way.
Date and Time
Libraries for working with dates and times.
- astral Python calculations for the position of the sun and moon.
- Arrow - A Python library that offers a sensible and human-friendly approach to creating, manipulating, formatting and converting dates, times and timestamps.
- Chronyk - A Python 3 library for parsing human-written times and dates.
- dateutil - Extensions to the standard Python datetime module.
- delorean - A library for clearing up the inconvenient truths that arise dealing with datetimes.
- maya - Datetimes for Humans.
- moment - A Python library for dealing with dates/times. Inspired by Moment.js.
- Pendulum - Python datetimes made easy.
- PyTime - An easy-to-use Python module which aims to operate date/time/datetime by string.
- pytz - World timezone definitions, modern and historical. Brings the tz database into Python.
- when.py - Providing user-friendly functions to help perform common date and time actions.
Feature Engineering
- AntroPy Time-efficient algorithms for computing the entropy and complexity of time-series.
- catch22 CAnonical Time-series CHaracteristics, 22 high-performing time-series features in C, Python and Julia.
- featuretools An open source python library for automated feature engineering.
- temporian Temporian is an open-source Python library for preprocessing ⚡ and feature engineering 🛠 temporal data 📈 for machine learning applications 🤖
- tsfeatures Calculates various features from time series data. Python implementation of the R package tsfeatures.
- tsfel An intuitive library to extract features from time series.
- tsflex Flexible & efficient time series feature extraction & processing package.
- tsfresh The package contains many feature extraction methods and a robust feature selection algorithm.
Time Series Segmentation & Change Point Detection
- bayesian_changepoint_detection Methods to get the probability of a change point in a time series. Both online and offline methods are available.
- changepy Change point detection in time series in pure python.
- RBEAST Bayesian Change-Point Detection and Time Series Decomposition.
- ruptures A Python library for off-line change point detection. This package provides methods for the analysis and segmentation of non-stationary signals.
- TCPDBench Turing Change Point Detection Benchmark, a benchmark evaluation of change point detection algorithms.
Time Series Generation and Augmentation
- DeepEcho Synthetic Data Generation for mixed-type, multivariate time series.
- deltapy Tabular Data Augmentation & Feature Engineering.
- time_series_augmentation An example of time series augmentation methods with Keras.
- TimeSynth A multipurpose library for synthetic time series in Python.
- tsaug A Python package for time series augmentation.
- tsgm Synthetic time series generation and time series augmentations.
Visualization
- altair Declarative statistical visualization library for Python.
- matplotlib A comprehensive library for creating static, animated, and interactive visualizations in Python.
- plotly A graphing library that makes interactive, publication-quality graphs.
- plotly-resampler Wrapper for Plotly figures, making large sequential plots scalable.
- seaborn A data visualization library based on matplotlib that provides a high-level interface for drawing attractive and informative statistical graphics.
- tsdownsample Extremely fast time series downsampling for visualisation.
Benchmarking & Contests
R
- bcp Bayesian Analysis of Change Point Problems.
- CausalImpact An R package for causal inference using Bayesian structural time-series models.
- changepoint Implements various mainstream and specialised changepoint methods for finding single and multiple changepoints within data.
- cpm Sequential and Batch Change Detection Using Parametric and Nonparametric Methods.
- EnvCpt Detection of Structural Changes in Climate and Environment Time Series.
- fable A tidyverts package for tidy time series forecasting.
- fasster A tidyverts package for forecasting with additive switching of seasonality, trend and exogenous regressors.
- feasts A tidyverts package for feature extraction and statistics for time series.
- fpop Segmentation using Optimal Partitioning and Function Pruning.
- greybox Regression model building and forecasting in R.
- modeltime Modeltime unlocks time series forecast models and machine learning in one framework.
- penaltyLearning Algorithms for supervised learning of penalty functions for change detection.
- Rcatch22 R package for calculation of 22 CAnonical Time-series CHaracteristics.
- smooth The set of smoothing functions used for time series analysis and in forecasting.
- theft R package for Tools for Handling Extraction of Features from Time series.
- timetk A tidyverse toolkit to visualize, wrangle, and transform time series data.
- tsibble A tidyverts package with tidy temporal data frames and tools.
- tsrepr TSrepr: R package for time series representations.
Java
- SFA Scalable Time Series Data Analytics.
- tsml Java time series machine learning tools in a Weka compatible toolkit.
JavaScript
Visualization
- cubism A D3 plugin for visualizing time series. Use Cubism to construct better realtime dashboards, pulling data from Graphite, Cube and other sources.
- echarts A free, powerful charting and visualization library offering an easy way of adding intuitive, interactive, and highly customizable charts to your commercial products.
- fusiontime Helps you visualize time-series and stock data in JavaScript, with just a few lines of code.
- highcharts A JavaScript charting library based on SVG, with fallbacks to VML and canvas for old browsers.
- synchro-charts A front-end component library that provides a collection of components to visualize time-series data.
Spark
- flint A Time Series Library for Apache Spark.
MATLAB
- hctsa Highly comparative time-series analysis.
Annotation and Labeling
- AnnotateChange - A simple flask application to collect annotations for the Turing Change Point Dataset, a benchmark dataset for change point detection algorithms.
- Curve - An open-source tool to help label anomalies on time-series data.
- TagAnomaly - Anomaly detection analysis and labeling tool, specifically for multiple time series (one time series per category).
- time-series-annotator - Time Series Annotation Library implements classification tasks for time series.
- WDK - The Wearables Development Toolkit (WDK) is a set of tools to facilitate the development of activity recognition applications with wearable devices.
Reading
Blogs
Papers
- Dive into Time-Series Anomaly Detection: A Decade Review, Paul Boniol, Qinghua Liu, Mingyi Huang, Themis Palpanas, John Paparrizos, 2024
- TS2Vec: Towards Universal Representation of Time Series, Zhihan Yue, Yujing Wang, Juanyong Duan, Tianmeng Yang, Congrui Huang, Yunhai Tong, Bixiong Xu, 2022 - code
- Conformal prediction interval for dynamic time-series, Chen Xu, Yao Xie, International Conference on Machine Learning 2021 (long presentation) - code
- Deep learning for time series classification: a review, H. I. Fawaz, G. Forestier, J. Weber, L. Idoumghar, P-A. Muller, Data Mining and Knowledge Discovery 2019 - code
- Greedy Gaussian Segmentation of Multivariate Time Series, D. Hallac, P. Nystrup, and S. Boyd, Advances in Data Analysis and Classification, 13(3), 727–751, 2019. - code
- U-Time: A Fully Convolutional Network for Time Series Segmentation Applied to Sleep Staging, Mathias Perslev, Michael Jensen, Sune Darkner, Poul Jørgen Jennum, Christian Igel, NeurIPS, 2019. - code
- A Better Alternative to Piecewise Linear Time Series Segmentation, Daniel Lemire, SIAM Data Mining, 2007. - code
- Time-series Generative Adversarial Networks, Jinsung Yoon, Daniel Jarrett, Mihaela van der Schaar, NeurIPS, 2019. - code
- Learning to Diagnose with LSTM Recurrent Neural Networks, Zachary C. Lipton, David C. Kale, Charles Elkan, Randall Wetzel, arXiv:1511.03677, 2015. - code
- Coherence-based Label Propagation over Time Series for Accelerated Active Learning, Yooju Shin, Susik Yoon, Sundong Kim, Hwanjun Song, Jae-Gil Lee, Byung Suk Lee, ICLR, 2022. - code
- Facebook Gorilla: "Large-scale internet services aim to remain highly-available and responsive for their users even in the presence of unexpected failures. As these services have grown to support a global audience, they have scaled beyond a few systems running on hundreds of machines to thousands of individual systems running on many thousands of machines, often across multiple geo-replicated datacenters. An important requirement to operating these large scale services is to accurately monitor the health and performance of the underlying system and quickly identify and diagnose problems as they arise. Facebook uses a time series database (TSDB) to store system measuring data points and provides quick query functionalities on top. We next specify some of the constraints that we need to satisfy for monitoring and operating Facebook and then describe Gorilla, our new in-memory TSDB that can store tens of millions of datapoints (e.g., CPU load, error rate, latency etc.) every second and respond queries over this data within milliseconds."
- Goku: A Schemaless Time Series Database for Large Scale Monitoring at Pinterest
Books
- Bayesian Time Series Models 💲 David Barber, A. Taylan Cemgil, Silvia Chiappa, Cambridge University Press 2011
- Codeless Time Series Analysis with KNIME 💲 Corey Weisinger, Maarit Widmann, and Daniele Tonini, Packt Publishing 2022
- Forecasting: Principles and Practice (3rd ed) 🆓 Rob J Hyndman and George Athanasopoulos 2021
- Practical Time Series Analysis 💲 Avishek Pal, PKS Prakash, Packt 2017
- repo with code
- Practical Time Series Analysis: Prediction with Statistics and Machine Learning 💲 Aileen Nielsen, O’Reilly 2019
- Machine Learning for Time-Series with Python 💲 Ben Auffarth, Packt Publishing 2021
- repo with code
- Time Series Analysis Handbook 🆓 Students of PhD in Data Science Batch 2023 at the Asian Institute of Management.
- Visualization of Time-Oriented Data 💲 Wolfgang Aigner, Silvia Miksch, Heidrun Schumann, Christian Tominski, Springer-Verlag 2011
Courses
Tutorials
Repos with Models
Applications
- binjr A Time Series Data Browser.
- CompEngine A self-organizing database of time-series data, that allows you to upload time-series data and interactively visualize similar data that have been measured by others.
Awesome lists
Tags: storage, model, temporal
Last modified 15 September 2025