Advanced Time Series
Technology & Use Cases
Mathias Herberts - CTO
Mathias.Herberts@senx.io
@herberts
Introduction
Time Series are universal and ubiquitous
■ Time Series are all about capturing change, not simply state
■ Time Series help understand the past and predict the future
■ Time Series are the bridges between the physical world and its digital twin
■ Time Series are the memory of the universe we live in
■ Time Series are eating the world
What are Time Series?
■ Time Series are sequences of values indexed by time
■ Time is an illusion, any sequence can be seen as a Time Series
Where can Time Series be found?
■ Time Series are present in many if not all verticals
Why do Time Series require specific tools?
■ Time Series data are different by nature
■ Their production rate is massive and continuous
■ The historical datasets that need to be retained are gigantic
■ The access pattern to Time Series data is unique
■ The type of analysis performed on Time Series data is uncommon
■ Traditional tools MUST be adapted if they are to be used
for Machine Data
Storage Analytics Visualization
Data Model
A universal data model
Geo Time Series™ data containers
Architecture
Warp 10™ standalone version
Single jar, no external dependencies
in-memory / disk based persistence
HDD / SSD
Standalone with datalog replication
[Diagram: three instances, each labeled Standalone Warp 10™]
Standalone with datalog sharding
[Diagram: three instances, each labeled Standalone Warp 10™]
Warp 10™ distributed version
[Diagram labels: Metadata index, WarpScript™ analytics engine, Ingestion endpoint, Persistence daemon]
Storage
A high performance Geo TSDB
■ Simple interaction via HTTP and text format for easy integration
■ Ability to ingest and fetch very long streams of data points
■ Support for WebSocket input and output
■ Fine grained access control via cryptographic tokens
■ Proven scalability with no cardinality problems
■ Support for Univariate and Multivariate data points
■ Distributed throttling mechanisms for number of series and data points rate
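As a minimal sketch of the "HTTP and text format" interaction, the Python snippet below builds a POST request that carries data points to an ingestion endpoint, without sending it. It assumes a local instance at `http://localhost:8080`, the `/api/v0/update` path and the `X-Warp10-Token` header; the host, the token value and the helper name `build_update_request` are placeholders chosen for illustration.

```python
import urllib.request
from typing import Sequence


def build_update_request(base_url: str, token: str,
                         lines: Sequence[str]) -> urllib.request.Request:
    """Build (but do not send) a POST request with one data point per line."""
    body = "\n".join(lines).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v0/update",
        data=body,
        method="POST",
        headers={"X-Warp10-Token": token, "Content-Type": "text/plain"},
    )


req = build_update_request(
    "http://localhost:8080",  # hypothetical local instance
    "WRITE_TOKEN",            # placeholder, not a real token
    ["1569400000000000// sensor.temperature{room=lab} 21.5"],
)
# urllib.request.urlopen(req) would actually send the data points.
```

Keeping the request construction separate from the network call makes the payload easy to inspect before anything reaches a server.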
Anatomy of storage engine input
TIMESTAMP/LATITUDE:LONGITUDE/ELEVATION CLASS{LABELS} VALUE
■ Support for time precisions from ns to ms
■ Class and labels support UTF-8 in both names and values
■ Support for 5 types LONG, DOUBLE, BOOLEAN, STRING, BINARY
64 -Infinity NaN 4E-05 F 'foo' b64:UmVmbHV4Cg==
■ Support for nested Multivariate values - each MV is a GTS (Geo Time Series™)
[ 2/42 64/48.0:-4.5/'hello' 128/[ 1 2 3 ] 256/hex:12345 ]
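The line layout above can be sketched as a small Python formatter. This is an illustration under stated assumptions, not a reference encoder: labels are assumed to be written as comma-separated `name=value` pairs, and class names, label names, label values and STRING values are assumed to be percent-encoded. The helper names `format_value` and `format_datapoint` are hypothetical, and nested multivariate values are not covered.

```python
import base64
import math
from urllib.parse import quote


def format_value(value) -> str:
    """Render one value in the text format (BOOLEAN, LONG, DOUBLE, BINARY, STRING)."""
    if isinstance(value, bool):  # bool first: it is a subclass of int
        return "T" if value else "F"
    if isinstance(value, int):
        return str(value)
    if isinstance(value, float):
        if math.isnan(value):
            return "NaN"
        if math.isinf(value):
            return "Infinity" if value > 0 else "-Infinity"
        return repr(value)
    if isinstance(value, bytes):
        return "b64:" + base64.b64encode(value).decode("ascii")
    return "'" + quote(str(value), safe="") + "'"


def format_datapoint(ts, cls, labels, value, lat=None, lon=None, elev=None) -> str:
    """Build TIMESTAMP/LATITUDE:LONGITUDE/ELEVATION CLASS{LABELS} VALUE."""
    geo = f"{lat}:{lon}" if lat is not None and lon is not None else ""
    elevation = "" if elev is None else str(elev)
    label_text = ",".join(
        f"{quote(k, safe='')}={quote(v, safe='')}" for k, v in sorted(labels.items())
    )
    return f"{ts}/{geo}/{elevation} {quote(cls, safe='')}{{{label_text}}} {format_value(value)}"


line = format_datapoint(1569400000000000, "sensor.temperature", {"room": "lab"}, 21.5)
```

Location and elevation are optional, so a data point without them keeps the two slashes and leaves the fields empty.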
Real world scalability and performance figures
■ Known deployments of over 500M series
■ Ingestion performance of 120M data points per second on a single in-memory instance
■ Historical datasets of several hundreds of trillions of data points
■ Sustained ingestion of several million data points per second per ingress
■ Ingestion of over 300k data points per second on a single thread on a RPi 4
■ Random deletions at several million data points per second
Analytics
Built around a data processing language
Full featured language dedicated to Time Series
■ Fully functional concatenative language
■ Turing complete with loops, conditionals, asynchronous transfer of control
■ Supports Geo Time Series as first class citizens
■ Over 980 functions available - from summary statistics to signal processing
■ 6 frameworks - BUCKETIZE, MAP, REDUCE, FILL, APPLY, FILTER
■ Fully extensible and embeddable
■ Ability to call external programs
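To give a feel for what "concatenative" means, here is a toy stack evaluator in Python. It is not WarpScript and supports only a handful of words (`+ - * /`, `DUP`, `SWAP`, `DROP`); it merely illustrates the evaluation model in which each token either pushes a value or transforms the stack. The names `run`, `WORDS` and `binary` are illustrative.

```python
import operator


def binary(fn):
    """Wrap a two-argument function as a stack word: a b -> fn(a, b)."""
    def word(stack):
        b = stack.pop()
        a = stack.pop()
        stack.append(fn(a, b))
    return word


WORDS = {
    "+": binary(operator.add),
    "-": binary(operator.sub),
    "*": binary(operator.mul),
    "/": binary(operator.truediv),
    "DUP": lambda s: s.append(s[-1]),
    "SWAP": lambda s: s.extend([s.pop(), s.pop()]),
    "DROP": lambda s: s.pop(),
}


def run(program: str) -> list:
    """Evaluate a whitespace-separated program and return the final stack."""
    stack = []
    for token in program.split():
        if token in WORDS:
            WORDS[token](stack)
        else:
            try:
                stack.append(int(token))
            except ValueError:
                stack.append(float(token))
    return stack


# Same shape as the "10 m * NOW SWAP -" idiom used in the slides:
# multiply, push a reference value, swap, subtract.
result = run("5 10 * 1000 SWAP -")
```

Programs compose by simple juxtaposition: the output stack of one fragment is the input of the next.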
Web IDE and Visual Studio Code Plugin
Powerful expressiveness
[ 'TOKEN' 'class' {} NOW 24 h ] FETCH 'gts' STORE // Fetch the last 24 hours
[ $gts bucketizer.mean NOW 1 m 0 ] BUCKETIZE 'mean' STORE // Mean over 1-minute buckets
[ $gts mapper.rate 1 0 0 ] MAP 'rate' STORE // Compute rate of change
NEWGTS 'randomwalk' RENAME 0.0 'v' STORE 42 PRNG 1 1000
<% 10 m * NOW SWAP - NaN DUP DUP $v SRAND 0.5 - + 'v' STORE $v ADDVALUE %> FOR
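The bucketizing and rate steps above can be pictured with plain Python lists of `(timestamp, value)` pairs. This is a simplified sketch, not the engine's implementation: buckets are assumed to be aligned on a last bucket end and to cover the interval `(end - span, end]`, and the rate is expressed per time unit between consecutive points. The function names are illustrative.

```python
from collections import defaultdict


def bucketize_mean(points, last_bucket, span):
    """Average values per bucket; each bucket is identified by its end tick."""
    sums = defaultdict(lambda: [0.0, 0])
    for ts, value in points:
        if ts > last_bucket:
            continue  # points after the last bucket are ignored in this sketch
        end = last_bucket - ((last_bucket - ts) // span) * span
        sums[end][0] += value
        sums[end][1] += 1
    return [(end, total / count) for end, (total, count) in sorted(sums.items())]


def rate(points):
    """Rate of change between each point and the previous one."""
    pts = sorted(points)
    return [
        (t1, (v1 - v0) / (t1 - t0))
        for (t0, v0), (t1, v1) in zip(pts, pts[1:])
        if t1 != t0
    ]


raw = [(10, 1.0), (50, 3.0), (70, 5.0), (120, 7.0)]
means = bucketize_mean(raw, last_bucket=120, span=60)
rates = rate([(0, 0.0), (10, 5.0), (30, 5.0)])
```

Bucketizing first gives regularly spaced ticks, which is what makes later steps such as rates or cross-series reductions straightforward.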
Complex algorithms available as simple functions
NEWGTS 'randomwalk' RENAME 0.0 'v' STORE 42 PRNG 1 1000
<% 10 m * NOW SWAP - NaN DUP DUP $v SRAND 0.5 - + 'v' STORE $v ADDVALUE %> FOR
DUP 100 LTTB
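For readers unfamiliar with LTTB (Largest Triangle Three Buckets), the following Python sketch shows the downsampling idea on a random walk similar to the one built above. It follows the commonly published description of the algorithm and is not taken from the Warp 10 source; the function name `lttb` and the random walk parameters are illustrative.

```python
import random


def lttb(points, threshold):
    """Downsample (x, y) points to `threshold` points, keeping first and last."""
    n = len(points)
    if threshold >= n:
        return list(points)
    if threshold < 3:
        raise ValueError("threshold must be at least 3")
    every = (n - 2) / (threshold - 2)  # average bucket width, always > 1 here
    sampled = [points[0]]
    a = 0  # index of the previously selected point
    for i in range(threshold - 2):
        # Average of the next bucket, used as the third triangle vertex.
        nxt_start = int((i + 1) * every) + 1
        nxt_end = min(int((i + 2) * every) + 1, n)
        nxt = points[nxt_start:nxt_end]
        avg_x = sum(p[0] for p in nxt) / len(nxt)
        avg_y = sum(p[1] for p in nxt) / len(nxt)
        # Pick the point of the current bucket forming the largest triangle.
        start = int(i * every) + 1
        end = int((i + 1) * every) + 1
        ax, ay = points[a]
        best, best_area = start, -1.0
        for j in range(start, end):
            px, py = points[j]
            area = abs((ax - avg_x) * (py - ay) - (ax - px) * (avg_y - ay)) / 2
            if area > best_area:
                best, best_area = j, area
        sampled.append(points[best])
        a = best
    sampled.append(points[-1])
    return sampled


rng = random.Random(42)
value, walk = 0.0, []
for i in range(1000):
    value += rng.random() - 0.5
    walk.append((i * 600, value))  # one point every 10 minutes, in seconds

downsampled = lttb(walk, 100)
```

The selected points are actual points of the input series, which is why the downsampled curve keeps the visual shape of the original.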
Hiding WarpScript complexity in macros
NEWGTS 'randomwalk' RENAME 0.0 'v' STORE 42 PRNG 1 1000
<% 10 m * NOW SWAP - NaN DUP DUP $v SRAND 0.5 - + 'v' STORE $v ADDVALUE %> FOR
'UTC' @senx/cal/byday
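The slide does not show what the macro returns. Assuming it splits a series into one series per calendar day in the given time zone, the equivalent logic could be sketched in Python as follows. The function name `split_by_day_utc` is hypothetical, timestamps are assumed to be in microseconds, and only UTC is handled.

```python
from collections import defaultdict
from datetime import datetime, timezone


def split_by_day_utc(points):
    """Group (timestamp_in_microseconds, value) pairs by UTC calendar day."""
    days = defaultdict(list)
    for ts_us, value in points:
        seconds = ts_us // 1_000_000  # integer division avoids float rounding
        day = datetime.fromtimestamp(seconds, tz=timezone.utc).date().isoformat()
        days[day].append((ts_us, value))
    return dict(days)


by_day = split_by_day_utc([
    (0, 1.0),
    (86_399_999_999, 2.0),  # last microsecond of 1970-01-01
    (86_400_000_000, 3.0),  # first microsecond of 1970-01-02
])
```

Wrapping this kind of calendar logic in a named macro is what keeps the calling script down to a single line.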
Complete documentation online at warp10.io
Visualization
Flexible visualization options
Full support for Processing in WarpScript
800 'width' STORE 800 'height' STORE
400.0 'maxspeed' STORE 40000.0 'maxalt' STORE
3.0 2.0 2.0 @orbit/heatmap/kernel/triangular 'kernel' STORE
@orbit/heatmap/palette/classic 'palette' STORE
'TOKEN' 'token' STORE
$width $height '2D' PGraphics
'MULTIPLY' PblendMode 'CENTER' PimageMode
[ $token '~(ALT|CAS)' {} NOW -2000000 ] FETCH
DUP 0 GET LASTTICK 'now' STORE
[ SWAP bucketizer.last $now STU 0 ] BUCKETIZE
// Create heatmap
<%
7 GET LIST-> DROP 'CAS' STORE 'ALT' STORE
<% $CAS ISNULL NOT $ALT ISNULL NOT && %>
<% $kernel $CAS $maxspeed / $width * $ALT $maxalt / 1.0 SWAP - $height * Pimage
%>
IFT
0 NaN NaN NaN NULL
%> MACROREDUCER 'GRAPHER' STORE
[ SWAP [] $GRAPHER ] REDUCE DROP
// Colorize
Ppixels <% DROP Palpha $palette SWAP GET %> LMAP
PupdatePixels Pencode Pdecode
$width $height '2D' PGraphics
// Do the grid
PnoFill 0 0 $width 1 - $height 1 - Prect
2.0 PstrokeWeight 200.0 Pcolor Pstroke
250.0 $maxspeed / $width * DUP 0 SWAP $height Pline
0 10000 $maxalt / 1.0 SWAP - $height * DUP $width SWAP Pline
SWAP 0 0 Pimage Pencode
Extensibility
Macros
Factorizing WarpScript code to separate responsibilities and encourage reusability
<%
// This is a macro body
%>
■ Macros can be deployed on the server side
■ Macros can be packaged in a jar
■ Macros can access some config elements (MACROCONFIG)
■ Macros can be deployed on a remote server
WarpFleet™
Resolver
Enable hosting of macros on remote servers
■ Macros can be hosted on any HTTP server including GitHub
■ Resolution is performed at runtime
■ Support for multiple macro repositories
■ Script execution can modify repositories
■ WarpFleet™ resolver can be disabled altogether
■ Support for versioning via the IMPORT function
■ SenX provides a growing set of macros via its own repo
■ Warp 10 does intelligent caching of fetched macros
■ Support for runtime injection of elements (MACROCONFIG)
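The resolution behavior listed above (several repositories, lookup at runtime, caching of fetched macros) can be sketched in Python. This is a conceptual model only, not the WarpFleet™ implementation: the class `MacroResolver`, the `.mc2` file suffix convention used to build URLs, the repository URLs and the injected `fetch` function are all assumptions made for the example, and the cache is a plain dictionary.

```python
from typing import Callable, Dict, List, Optional


class MacroResolver:
    """Try each repository in order and remember what was found."""

    def __init__(self, repositories: List[str],
                 fetch: Callable[[str], Optional[str]]):
        self.repositories = list(repositories)
        self.fetch = fetch  # returns the macro source, or None if not found
        self.cache: Dict[str, str] = {}

    def resolve(self, name: str) -> str:
        if name in self.cache:
            return self.cache[name]
        for repo in self.repositories:
            source = self.fetch(f"{repo.rstrip('/')}/{name}.mc2")
            if source is not None:
                self.cache[name] = source
                return source
        raise KeyError(f"macro not found: {name}")


# A fake fetcher stands in for HTTP so the sketch runs offline.
FAKE_FILES = {"https://repo-b.example/senx/cal/byday.mc2": "<% 'body' %>"}
calls = []


def fake_fetch(url: str) -> Optional[str]:
    calls.append(url)
    return FAKE_FILES.get(url)


resolver = MacroResolver(
    ["https://repo-a.example", "https://repo-b.example"], fake_fetch
)
first = resolver.resolve("senx/cal/byday")
second = resolver.resolve("senx/cal/byday")  # served from the cache
```

Injecting the fetch function keeps the lookup order and caching logic testable without any network access.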
Extensions
Add, remove or modify WarpScript functions
■ Write new functions in Java (JVM), Go, Rust, C++, C (JNA)
■ Simple API to interact with the WarpScript execution runtime
■ Freedom of licensing for extensions
■ Growing list of existing extensions, contributions welcome!
Barcode, GeoTransforms, Grok, InfluxDB, JDBC, PCap, PMML,
Polyglot, Redis, S3, Swift, TensorFlow, EGADS, Elastic,
GCode, H2O, Keras, memcached, Parquet, ORC, Neo4J,
OpenTSDB, LAS, Pig, Spark
Some commercial ones by SenX
LevelDB, MapMatching, Forecasting, WarpScript Compiler
Plugins
Extend Warp 10 by adding new features
■ Plugins are run in the Warp 10 process
■ Plugins can be written in Java (JVM) or in Go, Rust, C, C++ (via JNA)
■ Very diverse things can be done using plugins
■ Authentication plugins add new types of credentials
■ No license constraints
Kafka, MQTT, WarpStudio, Zeppelin, HTTP, UDP, TCP, Py4J,
InfluxDB Line Protocol
OVH is considering open sourcing plugins to support
PromQL, Graphite, OpenTSDB, InfluxQL query languages
Poke them to make it happen!
WarpFleet™
Community site for finding extensions, macro packages and plugins
■ CLI tool on NPM - npm install -g @senx/warpfleet
■ Modules are hosted on maven repositories
■ Benefit from dependency resolution mechanisms
■ Modules can be fetched by Spark for example
■ Again, contributions more than welcome!
Integrations
Augment existing tools and frameworks
Use Cases
Flight data analysis for fleet reliability
Pressure Altitude vs TAS
Weather data
1,000,000 cells
400 parameters
208 time steps
86 B data points every 6 hours
in 400 M series
Using rank 2 tensors as multi values, Warp 10 can store all of GFS in just
1,000,000 Geo Time Series
Helping racing sailboats fly
Automatic phase extraction
by TWA analysis
chemtrails-locator.com
200,000 aircraft
15 B positions
Spatio-temporal indexing
150 km / 5 minutes cells
Served entirely by Warp 10
sandbox.senx.io
@SenXHQ - @Warp10io - @WarpScript
senx.io - warp10.io

2019-09-25 Paris Time Series Meetup - Warp 10 - Advanced Time Series Technology & Use Cases
