STUMPY - A Renewed Approach to Time Series Analysis

Thanks to the support of TD Ameritrade, I recently open-sourced (BSD-3-Clause) a new, powerful, and scalable Python library called STUMPY that can be used for a variety of time series data mining tasks. At its core, the library takes any time series or sequential data and efficiently computes something called the matrix profile, which, with only a few extra lines of code, enables you to perform:

pattern/motif (approximately repeated subsequences within a longer time series) discovery
anomaly/novelty (discord) discovery
shapelet discovery
semantic segmentation
density estimation
time series chains (temporally ordered set of subsequence patterns)
and more…
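For intuition about what the matrix profile holds: for every window of length m in the series, it records the z-normalized Euclidean distance to that window's nearest neighbor elsewhere in the series, along with the neighbor's position. The sketch below is a brute-force illustration in plain Python, not how STUMPY computes it. The function names `znorm` and `naive_matrix_profile` are made up for this example, and the exclusion zone of ceil(m / 4) around each window is an assumption chosen to skip trivial matches with overlapping neighbors.

```python
import math


def znorm(window):
    """Rescale a window to zero mean and unit standard deviation."""
    mean = sum(window) / len(window)
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / len(window))
    # Assumes the window is not constant (std > 0)
    return [(x - mean) / std for x in window]


def naive_matrix_profile(ts, m):
    """Brute-force O(n^2 * m) matrix profile, for illustration only."""
    n_windows = len(ts) - m + 1
    windows = [znorm(ts[i:i + m]) for i in range(n_windows)]
    exclusion = math.ceil(m / 4)  # assumed exclusion zone for trivial matches
    profile = [math.inf] * n_windows  # distance to the nearest neighbor
    indices = [-1] * n_windows        # position of that nearest neighbor
    for i in range(n_windows):
        for j in range(n_windows):
            if abs(i - j) <= exclusion:
                continue  # skip the window itself and its close overlaps
            dist = math.dist(windows[i], windows[j])
            if dist < profile[i]:
                profile[i], indices[i] = dist, j
    return profile, indices


# A toy series in which the shape [0, 1, 3, 1] occurs twice (at 0 and 8)
ts = [0, 1, 3, 1, 5, 2, 8, 6, 0, 1, 3, 1, 7, 4]
profile, indices = naive_matrix_profile(ts, m=4)

# The smallest profile value marks the best-matching pair of windows
best = min(range(len(profile)), key=profile.__getitem__)
print(best, indices[best], profile[best])
```

Low values in the profile point to repeated patterns (motifs) and high values point to windows unlike anything else in the series (discords), which is why so many of the tasks listed above can be read off this one array.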

First, let’s install stumpy with Conda (preferred):

conda install -c conda-forge stumpy

or, alternatively, you can install stumpy with pip:

pip install stumpy

Once stumpy is installed, typical usage would be to take your time series and compute the matrix profile:

import stumpy
import numpy as np

your_time_series = np.random.rand(10000)
window_size = 50  # Approximately, how many data points might be found in a pattern

matrix_profile = stumpy.stump(your_time_series, m=window_size)

For a more detailed example, check out our tutorials and documentation, or feel free to file a GitHub issue. We welcome contributions in any form!

I’d love to hear from you, so let me know what you think!

Published

May 13, 2019