Some of the most popular metric visualization tools work well for smaller deployments but have trouble handling large amounts of data. Timely started after an integration of OpenTSDB with Accumulo failed to meet our needs. Timely can be used to gain visibility into the performance of your networks and hardware, your Hadoop cluster and Accumulo database, and your applications. In this talk we will cover the current implementation and APIs, the security model, deployment models, and our roadmap.
– Speakers –
Dave Marion
Principal Software Engineer, Vistronix
Dave Marion is a Principal Software Engineer for Vistronix. He has been working on big data projects since 2010 and prior to that worked as a database engineer on large relational database projects. He is a veteran of the U.S. Navy and has a BS in Computer Information Systems from University of Maryland University College. He is a PMC member of Apache Accumulo and contributes to several other Apache projects.
Jim Klucar
Principal Software Engineer, Praxis Engineering
Jim Klucar is a Principal Software Engineer for Praxis Engineering. He has a BS in Electrical Engineering from Pennsylvania State University and a MS in Applied and Computational Mathematics from Johns Hopkins University. After a dozen years of developing high performance radar processing techniques, in 2010 he switched to developing Hadoop-based data warehouse and analysis systems. He has contributed to many open source projects including Apache Accumulo, Mesos and Myriad.
Drew Farris
Senior Associate, Booz Allen Hamilton
Drew Farris is a technology consultant at Booz Allen Hamilton who specializes in distributed computing, information retrieval and machine learning. He's a voting member of the Apache Software Foundation, on the Accumulo PMC and works with the Apache Incubator as a mentor for several projects.
— More Information —
For more information see http://www.accumulosummit.com/
Users trust Accumulo to properly enforce access control over their data, as specified by the visibility field. This trust can be broken by a malicious administrator or malfunctioning server, revealing sensitive information to unauthorized individuals. Our prior work encrypts data in Accumulo to protect its confidentiality from a malicious server, but does not protect against this attack. To address this threat, we have implemented a client-side tool that cryptographically enforces visibility labels in Accumulo.
Our solution is called Cryptographically Enforced Attribute-Based Access Control (CEABAC), and consists of two components: an encryption protocol and a key management system. CEABAC generates a fresh encryption key for each cell, then encrypts this key based on the cell’s visibility field. To accomplish this, the user must be able to create, store, retrieve, and revoke keys associated with each attribute that can appear in the system. The protocol guarantees that, if keys are distributed appropriately, a client without the appropriate permissions to view a cell cannot decrypt it, even if they receive its ciphertext. In the talk we will discuss the CEABAC protocol, our key management solution, how we implemented it in Accumulo, and future directions for this work.
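As a rough, non-authoritative illustration of the idea (not the actual CEABAC protocol), the sketch below generates a fresh key per cell and wraps it under an attribute key derived from the visibility field, simplified here to a single attribute; real visibility expressions with AND/OR terms require more elaborate key wrapping, and the key names are hypothetical.

```python
from cryptography.fernet import Fernet

# attribute_keys maps each visibility attribute to a long-lived wrapping key;
# in CEABAC these would come from the key management system. Hypothetical layout.
def encrypt_cell(value: bytes, visibility: str, attribute_keys: dict) -> dict:
    cell_key = Fernet.generate_key()                  # fresh key for this cell
    ciphertext = Fernet(cell_key).encrypt(value)      # encrypt the cell value
    wrapped = Fernet(attribute_keys[visibility]).encrypt(cell_key)  # wrap key per visibility
    return {"visibility": visibility, "wrapped_key": wrapped, "ciphertext": ciphertext}

def decrypt_cell(cell: dict, attribute_keys: dict) -> bytes:
    # Fails unless the client holds the wrapping key for this visibility attribute.
    cell_key = Fernet(attribute_keys[cell["visibility"]]).decrypt(cell["wrapped_key"])
    return Fernet(cell_key).decrypt(cell["ciphertext"])

keys = {"private": Fernet.generate_key()}
cell = encrypt_cell(b"42", "private", keys)
assert decrypt_cell(cell, keys) == b"42"
```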
– Speaker –
Dr. Scott Ruoti
Technical Staff, MIT Lincoln Laboratory
Scott Ruoti is a researcher at MIT Lincoln Laboratory. He graduated from Brigham Young University in 2016 with a Ph.D. in computer science, focusing on security and HCI. Recently, he has been working on cryptographic enforcement of access control in Accumulo.
— More Information —
For more information see http://www.accumulosummit.com/
The rapidly increasing amount of semantic network data today provides a wealth of insight into how entities interact and relate with one another. In order to tap into this valuable source of information, organizations require a secure and scalable repository in which to store and explore these interactions and relationships. In this talk we will discuss Apache Rya, an Accumulo-based graph store capable of storing billions of Resource Description Framework (RDF) triples and providing a rich SPARQL (SPARQL Protocol and RDF Query Language) query endpoint for exploring complex subgraph relationships. We will talk about two indexing strategies that Rya uses to address some of the challenges associated with storing and querying large graph datasets. In particular, we will discuss how our SPARQL query caching framework allows users to greatly improve query performance by storing and incrementally maintaining query results using Apache Fluo. We will also discuss our Accumulo-based entity-centric index. Inspired by Facebook’s horizontally partitioned graph index, Unicorn, Apache Rya’s entity-centric index is a novel way of storing graphs in Accumulo that draws on document-partitioned indexing techniques. This graph partitioning and indexing strategy limits network traffic and enables distributed join processing by utilizing a variation of Accumulo’s IntersectingIterator framework to perform joins server-side.
The work presented herein was funded by the Office of Naval Research under contract #N00014-12-C-0365.
– Speaker –
Dr. Caleb Meier
Software Engineer, Parsons
Caleb Meier has been a Software Engineer at Parsons Government Services for the last two years. Since joining Parsons, he has investigated and implemented a number of features to improve the query performance of Apache Rya. Caleb earned his Ph.D. in Mathematics from the University of California, San Diego and a B.A. in Mathematics from Yale University. In his spare time he enjoys climbing, biking, playing soccer and spending time with his delightful wife Leslie.
— More Information —
For more information see http://www.accumulosummit.com/
Accumulo Summit 2016: Embedding Authenticated Data Structures in Accumulo – Accumulo Summit
Accumulo requires its users to trust each Accumulo installation with their data — a malicious server or user could easily compromise critical data or learn secrets they are not authorized to access. One particular threat is a malicious Accumulo server compromising data’s integrity, by tampering with query results and returning forged, modified, or incomplete results to a user. In prior work, we implemented a lightweight client-side tool to protect against this kind of threat. We now present improvements to this tool that handle a wider range of attacks by a malicious server and reduce overhead for the client.
In our solution, Accumulo clients use Authenticated Data Structures (ADSs) to verify their range queries’ integrity. ADS metadata is stored in Accumulo, so that after each query, the server must construct a proof that the query has not been tampered with. We use Accumulo iterators to compute these proofs on the server without imposing an unnecessary computational burden on the client. We will present our approach to adding ADSs to Accumulo, our schema for storing the ADS metadata, and opportunities for future work in efficiency and expressiveness.
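For readers unfamiliar with authenticated data structures, a Merkle-tree-style inclusion proof gives the flavor of how a client can check a server's answer against a small trusted digest; this is only a generic sketch, not the specific ADS or schema used in the talk.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def verify_inclusion(leaf: bytes, proof, trusted_root: bytes) -> bool:
    """proof is a list of (sibling_hash, sibling_is_left) pairs from leaf to root."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    # The client trusts only the root digest; a forged, modified, or omitted
    # result cannot produce a proof that hashes back to this root.
    return node == trusted_root
```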
– Speaker –
Leo St. Amour
Military Fellow, MIT Lincoln Laboratory
Leo St. Amour is a master’s student at Northeastern University and a military fellow at MIT Lincoln Laboratory. He graduated from the United States Military Academy in May 2015, where he worked on a TLS library with enhanced usability and security. In addition to his work on TLS and Accumulo, he is currently working on binary analysis, with a focus on discovering and hardening security properties.
— More Information —
For more information see http://www.accumulosummit.com/
Improving Organizational Knowledge with Natural Language Processing Enriched ... – DataWorks Summit
The information age has allowed everyone to tap into the exponential production of data. Unfortunately, much actionable insight is the result of unexpected or anomalous behavior that can only be recognized through experience. A collection of NLP microservices was crafted to complement an organization’s existing technology infrastructure and bring additional meaning to its existing, real-time collection of unstructured text.
In this session, in collaboration with Partners & Co., a Chicago-based real estate firm, we will demonstrate how to leverage an organization’s collective knowledge and turn unstructured text generated across various communication media into real-time, actionable insight. We will demonstrate how to use a combination of open source tools such as Apache NiFi, Kafka, OpenNLP, and Superset to build a full streaming NLP pipeline that consumes unstructured text, detects the language and sentences within the text, deconstructs the grammatical makeup, and derives meaning from the entities identified within the text.
Lightning Fast Analytics with Hive LLAP and Druid – DataWorks Summit
Cox Communications, one of the largest network providers in the U.S., is primarily focused on ensuring network security and providing better service to customers, including:
• Real-time monitoring of IP security traffic to identify and alert on unusual network activity across interfaces within the organization
• Enriching the security team with capabilities to determine the source and destination of traffic, class of service, and the causes of congestion on NetFlow data
Challenges:
Data related to network security includes highly granular streaming data. The major challenge lies in having a unified platform to perform data cleansing, transformation, analytics, and reporting on these huge streaming datasets. As network traffic grows, the associated data grows exponentially. A scalable framework is needed to handle these datasets and derive useful information from them. Along with data processing, data retrieval also plays a major role in better analysis. Previously, data processing was done in a daily batch using manual Python scripts and custom data structures specific to each use case. A more generic, unified framework was needed to provide an automated, real-time, end-to-end solution that delivers high-performing, more granular business results.
Solution:
Automation of this process has opportunities on several fronts, notably providing consistency, repeatability, and modernization of OLAP analytics on an enterprise big data platform. Reports can be generated more easily and quickly with the underlying OLAP engine.
• A modern big data platform provides the necessary tools and infrastructure to land, cleanse, and process real-time streaming data and enrich it using ecosystem components like Spark, Kafka, and Hive
• Impressively faster OLAP analytics using the Hive LLAP and Druid integration
• Simple and faster reporting using Superset
All of the necessary components are available under one roof in the Hortonworks Hadoop Platform.
An end-to-end solution on the big data platform produced faster and repeatable results with sub-second query performance.
Value added by the above solution:
• Deliver ultra-fast SQL analytics that can be consumed from the BI tool by the security engineering team to accelerate business results
• Opportunity for business users to explore and visualize real-time streaming datasets, integrate various data sources, and build dashboards for different slices
• Capability to run BI queries in just milliseconds over a 1 TB dataset
• Highly granular permission model on security datasets that allows intricate rules on accessibility
Integrating Apache Phoenix with Distributed Query Engines – DataWorks Summit
This talk will describe the work being done to create connectors for Presto and Apache Spark to read and write data in Phoenix tables. We will describe the new Phoenix connector that implements Spark’s DataSource v2 API, which enables customizing and optimizing reads and writes to Phoenix tables.
We will also demo the Presto-Phoenix connector, showing how it can be used to federate multiple Phoenix clusters and join Phoenix data with different types of data sources.
We will also describe some in-progress work to integrate more tightly with the query optimizers of these frameworks in order to provide table statistics and push down filters, limits, and aggregates into Phoenix whenever possible to speed up query execution.
Another area being worked on is providing a way to support bulk loading using HFiles.
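To give a feel for what reading a Phoenix table from Spark might look like, here is a hedged PySpark sketch. The format name, the option keys ("table", "zkUrl"), the table name, and the ZooKeeper quorum are assumptions for illustration; the exact keys depend on the connector version and require the connector jar on the classpath.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("phoenix-read-example").getOrCreate()

# Assumed option names and values; check the phoenix-spark connector docs
# for the exact keys used by the DataSource v2 implementation.
df = (spark.read
      .format("phoenix")
      .option("table", "WEB_STAT")       # hypothetical Phoenix table
      .option("zkUrl", "zk1:2181")       # hypothetical ZooKeeper quorum
      .load())

# Filters and column pruning can be pushed down into Phoenix by the connector.
df.select("HOST", "ACTIVE_VISITOR").filter(df.ACTIVE_VISITOR > 100).show()
```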
Big Data security: Facing the challenge by Carlos Gómez at Big Data Spain 2017 – Big Data Spain
This talk gives a technical and innovative overview of how companies can face the challenge of protecting the data and services that are in their data-centric platform, focusing on three main aspects: implementing network segmentation, managing AAA and securing data processing.
https://www.bigdataspain.org/2017/talk/big-data-security-facing-the-challenge
Big Data Spain 2017
16th - 17th November Kinépolis Madrid
CERN’s Next Generation Data Analysis Platform with Apache Spark with Enric Te... – Databricks
The CERN experiments and their particle accelerator, the Large Hadron Collider (LHC), will soon have collected a total of one exabyte of data. Moreover, the next upgrade of the accelerator, the high-luminosity LHC, will dramatically increase the rate of particle collisions, thus boosting the potential for discoveries but also generating unprecedented data challenges.
In order to process and analyse all those data, CERN is investigating complementary ways to the traditional approaches, which mainly rely on Grid and batch jobs for data reconstruction, calibration and skimming combined with a phase of local analysis of reduced data. The new techniques should allow for interactive analysis on much bigger datasets by transparently exploiting dynamically pluggable resources.
In that sense, Spark is being used at CERN to process large physics datasets in a distributed fashion. The most widely used tool for high-energy physics analysis, ROOT, implements a layer on top of Spark in order to distribute computations across a cluster of machines. This makes it possible for physics analysis written in either C++ or Python to be parallelised on Spark clusters, while reading the input data from CERN’s mass storage system: EOS. On the other hand, another important use case of Spark at CERN has recently emerged.
The LHC logging service, which collects data from the accelerator to get information on how to improve the performance of the machine, is currently migrating its architecture to leverage Spark for its analytics workflows. This talk will discuss the unique challenges of the aforementioned use cases and how SWAN, the CERN service for interactive web-based analysis, now supports them thanks to a new feature: the possibility for users to dynamically plug Spark clusters into their sessions in order to offload computations to those resources.
In this session we'll look at a number of different organisations that are on their big data cybersecurity journey with Apache Metron. We'll look at the different use cases they are investigating, the data sources they used, the analytics they performed, and in some cases the results they were able to find.
We'll also spend some time talking about the common themes in these projects. There are some common approaches to adopting Apache Metron as a phased project; we'll review some of the common pitfalls and give some concrete suggestions about the things you should (and shouldn't) do when you're getting started.
Finally, we'll try to tackle some of the key FAQs that come up when people first investigate the potential use of Apache Metron in the real world, based on over a year of interacting with customers and prospects as they look deeper into Apache Metron to see how it fits into their cybersecurity portfolio.
Speaker
Dave Russell, Principal Solutions Engineer, Hortonworks
Building Enterprise Grade Applications in Yarn with Apache Twill – Cask Data
Speaker: Poorna Chandra, from Cask
Big Data Applications Meetup, 07/27/2016
Palo Alto, CA
More info here: http://www.meetup.com/BigDataApps/
Link to talk: https://www.youtube.com/watch?v=I1GLRXyQlx8
About the talk:
Twill is an Apache incubator project that provides a higher-level abstraction for building distributed applications on YARN. Developing distributed applications directly on YARN is challenging because it does not provide higher-level APIs, and a lot of boilerplate code must be duplicated to deploy applications. Developing YARN applications is typically done by framework developers, like those familiar with Apache Flink or Apache Spark, who need to deploy the framework in a distributed way.
By using Twill, application developers need only be familiar with the basics of the Java programming model when using the Twill APIs, so they can focus on solving business problems. In this talk I present how Twill can be leveraged, using the Cask Data Application Platform (CDAP), which heavily uses Twill for resource management, as an example.
Add Horsepower to AI/ML streaming Pipeline - Pulsar Summit NA 2021 – StreamNative
The more time data science teams spend on model training, the less business value is added, because no value is created until that model is deployed in production. Traditional HDD-based systems are not suitable for training, which is very I/O-intensive due to the complex transformations involved during data preparation. Moreover, training is not a one-time process. Trends and patterns in the data keep changing rapidly, so models need to be retrained to address drift issues and continually improve performance in production. Data scientists often experiment with thousands of models, and speeding up the process has significant business implications.
In this talk, we will cover how you can accelerate an AI/ML pipeline by speeding up data loads using the Aerospike database, which leverages its hybrid memory architecture to achieve sub-millisecond reads and writes. In a hybrid memory architecture the index is stored in memory (not persisted), and data is stored on persistent storage (SSD) and read directly from the disk. Disk I/O is not required to access the index. For time-sensitive and high-throughput use cases such as fraud detection, you need a transactional database at the edge that can handle high-velocity ingestion and support millions of IOPS. The events are then streamed downstream to your AI/ML platform for training or your inference server for predictions. We will share the reference architecture of highly performant AI/ML training and inference pipelines consisting of Apache Pulsar, Apache Spark 3.0, the Aerospike database, and its Spark and Pulsar connectors. This architecture can be extended to other use cases that demand low latency and high throughput without blowing your budget.
The Pursuit of Happiness: Building a Scalable Pipeline Using Apache Spark and... – Databricks
How do we get better than good enough? Leveraging NLP techniques, we can determine the general sentiment of a sentence, phrase, or a paragraph of text. We can mine the world of social data to get a sense of what is being said. But, how do you get control of the factors that create happiness? How do you become proactive in making end-users happy? Chatbots, human chats, and conversations are the means we are using to express our ideas to each other. NLP is great for helping us process and understand this data but can fall short. In our session, we will explore how to expand NLP/sentiment analysis to investigate the intense interactions that can occur between humans and humans or humans and robots. We will show how to pinpoint the things that work to improve quality and how to use those data points to measure the effectiveness of chatbots. Learn how we have applied popular NLP frameworks such as NLTK, Stanford CoreNLP and John Snow Labs NLP to financial customer service data. Explore techniques to analyze conversations for actionable insights. Leave with an understanding of how to influence your customers' happiness.
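As a tiny, self-contained example of the kind of sentiment scoring these frameworks provide, the snippet below uses NLTK's VADER analyzer on a single customer-service sentence; the frameworks and data used in the actual talk go well beyond this, and the sample text is made up.

```python
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon")  # one-time download of the VADER lexicon

sia = SentimentIntensityAnalyzer()
text = "The agent resolved my billing issue quickly, thank you!"
print(sia.polarity_scores(text))
# prints a dict of neg/neu/pos scores plus a compound score in [-1, 1]
```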
Data scientists spend too much of their time collecting, cleaning and wrangling data as well as curating and enriching it. Some of this work is inevitable due to the variety of data sources, but there are tools and frameworks that help automate many of these non-creative tasks. A unifying feature of these tools is support for rich metadata for data sets, jobs, and data policies. In this talk, I will introduce state-of-the-art tools for automating data science and I will show how you can use metadata to help automate common tasks in Data Science. I will also introduce a new architecture for extensible, distributed metadata in Hadoop, called Hops (Hadoop Open Platform-as-a-Service), and show how tinker-friendly metadata (for jobs, files, users, and projects) opens up new ways to build smarter applications.
Sumo Logic QuickStart Webinar - Jan 2016 – Sumo Logic
QuickStart your Sumo Logic service with this exclusive webinar. At these monthly live events you will learn how to capitalize on critical capabilities that can amplify your log analytics and monitoring experience while providing you with meaningful business and IT insights
Scenic City Summit (2021): Real-Time Streaming in any and all clouds, hybrid... – Timothy Spann
Scenic city summit real-time streaming in any and all clouds, hybrid and beyond
24-September-2021. Scenic City Summit. Virtual. Real-Time Streaming in Any and All Clouds, Hybrid and Beyond
Apache Pulsar, Apache NiFi, Apache Flink
StreamNative
Tim Spann
https://sceniccitysummit.com/
Big Data Conference Europe: Real-time streaming in any and all clouds, hybri... – Timothy Spann
Biography
Tim Spann is a Principal DataFlow Field Engineer at Cloudera where he works with Apache NiFi, MiniFi, Pulsar, Apache Flink, Apache MXNet, TensorFlow, Apache Spark, big data, the IoT, machine learning, and deep learning. Tim has over a decade of experience with the IoT, big data, distributed computing, streaming technologies, and Java programming. Previously, he was a senior solutions architect at AirisData and a senior field engineer at Pivotal. He blogs for DZone, where he is the Big Data Zone leader, and runs a popular meetup in Princeton on big data, the IoT, deep learning, streaming, NiFi, the blockchain, and Spark. Tim is a frequent speaker at conferences such as IoT Fusion, Strata, ApacheCon, DataWorks Summit Berlin, DataWorks Summit Sydney, and Oracle Code NYC. He holds a BS and MS in computer science.
Talk
Real-Time Streaming in Any and All Clouds, Hybrid and Beyond
Today, data is being generated from devices and containers living at the edge of networks, clouds and data centers. We need to run business logic, analytics and deep learning at scale and as events arrive.
Tools:
Apache Flink, Apache Pulsar, Apache NiFi, MiNiFi, DJL.ai, Apache MXNet.
References:
https://www.datainmotion.dev/2019/11/introducing-mm-flank-apache-flink-stack.html
https://www.datainmotion.dev/2019/08/rapid-iot-development-with-cloudera.html
https://www.datainmotion.dev/2019/09/powering-edge-ai-for-sensor-reading.html
https://www.datainmotion.dev/2019/05/dataworks-summit-dc-2019-report.html
https://www.datainmotion.dev/2019/03/using-raspberry-pi-3b-with-apache-nifi.html
Source Code: https://github.com/tspannhw/MmFLaNK
FLiP Stack
StreamNative
Timely Year Two: Lessons Learned Building a Scalable Metrics Analytic System – Accumulo Summit
Timely was born to visualize and analyze metric data at a scale untenable for existing solutions. We're returning to talk about what we've achieved over the past year, provide a detailed look into the production architecture, and discuss features added during that time, including alerting and support for external analytics.
– Speakers –
Drew Farris
Chief Technologist, Booz Allen Hamilton
Drew Farris is a software developer and technology consultant at Booz Allen Hamilton where he helps his client solve problems related to large scale analytics, distributed computing and machine learning. He is a member of the Apache Software Foundation and a contributing author to Manning Publications’ “Taming Text” and the Booz Allen Hamilton “Field Guide to Data Science”.
Bill Oley
Senior Lead Engineer, Booz Allen Hamilton
Bill Oley is a senior lead software engineer at Booz Allen Hamilton where he helps his clients analyze and solve problems related to large scale data ingest, storage, retrieval, and analysis. He is particularly interested in improving visibility into large scale systems by making actionable metrics scalable and usable. He has 16 years of experience designing and developing fault-tolerant distributed systems that operate on continuous streams of data. He holds a bachelor's degree in computer science from the United States Naval Academy and a master's degree in computer science from The Johns Hopkins University.
— More Information —
For more information see http://www.accumulosummit.com/
Sumo Logic Quickstart Training 10/14/2015 – Sumo Logic
QuickStart your Sumo Logic service with this exclusive webinar. At these monthly live events you will learn how to capitalize on critical capabilities that can amplify your log analytics and monitoring experience while providing you with meaningful business and IT insights
Instrumenting and Scaling Databases with Envoy – Daniel Hochman
Every request to a database at Lyft is proxied by Envoy, providing complete visibility into the L3/L4 aspects of database interactions. This allows engineers to easily visualize changes to a database's load profile and pinpoint the root cause if necessary. Lyft has also open-sourced codecs for MongoDB, DynamoDB, and Redis. Protocol codecs in combination with custom filters yield benefits ranging from operation-level observability to horizontal scalability via sharding. Using Envoy for this purpose means that enhancements are implemented once and usable across a polyglot stack. The talk demonstrates Envoy's utility beyond traditional RPC service interactions in the network.
Real-time cloud-native open source streaming of any data to Apache Solr – Timothy Spann
Real-time cloud-native open source streaming of any data to Apache Solr
Utilizing Apache Pulsar and Apache NiFi, we can parse any document in real time at scale. We receive a lot of documents via cloud storage, email, social channels and internal document stores. We want to make all of the content and metadata available to Apache Solr for categorization, full-text search, optimization and combination with other datastores. We will not only stream documents, but all REST feeds, logs and IoT data. Once data is produced to Pulsar topics it can instantly be ingested into Solr through the Pulsar Solr Sink.
Utilizing a number of open source tools, we have created a real-time, scalable, any-document-parsing data flow. We use Apache Tika for document processing with real-time language detection, natural language processing with Apache OpenNLP, and sentiment analysis with Stanford CoreNLP, spaCy and TextBlob. We will walk everyone through creating an open source flow of documents utilizing Apache NiFi as our integration engine. We can convert PDF, Excel and Word to HTML and/or text. We can also extract the text to apply sentiment analysis and NLP categorization to generate additional metadata about our documents. We will also extract and parse images; if they contain text, we can extract it with TensorFlow and Tesseract.
Hail Hydrate! From Stream to Lake Using Open Source – Timothy Spann
(VIRTUAL) Hail Hydrate! From Stream to Lake Using Open Source - Timothy J Spann, StreamNative
https://osselc21.sched.com/event/lAPi?iframe=no
A cloud data lake that is empty is not useful to anyone. How can you quickly, scalably and reliably fill your cloud data lake with the diverse sources of data you already have and the new ones you never imagined you needed? Utilizing open source tools from Apache, the FLiP stack enables any data engineer, programmer or analyst to build reusable modules with low or no code. FLiP utilizes Apache NiFi, Apache Pulsar, Apache Flink and MiNiFi agents to load CDC, logs, REST, XML, images, PDFs, documents, text, semi-structured data, unstructured data, structured data and a hundred data sources you could never dream of streaming before. I will teach you how to fish in the deep end of the lake and return a data engineering hero. Let's hope everyone is ready to go from 0 to petabyte hero.
https://osselc21.sched.com/event/lAPi/virtual-hail-hydrate-from-stream-to-lake-using-open-source-timothy-j-spann-streamnative
DevFest UK & Ireland: Using Apache NiFi with Apache Pulsar for fast data on-r... – Timothy Spann
DevFest UK & Ireland: Using Apache NiFi with Apache Pulsar for fast data on-ramp, 2022
As the Pulsar communities grows, more and more connectors will be added. To enhance the availability of sources and sinks and to make use of the greater Apache Streaming community, joining forces between Apache NiFi and Apache Pulsar is a perfect fit. Apache NiFi also adds the benefits of ELT, ETL, data crunching, transformation, validation and batch data processing. Once data is ready to be an event, NiFi can launch it into Pulsar at light speed.
I will walk through how to get started, some use cases and demos and answer questions.
https://www.devfest-uki.com/schedule
https://linktr.ee/tspannhw
Empowering the Data Analytics Ecosystem: A Laser Focus on Value
The data analytics ecosystem thrives when every component functions at its peak, unlocking the true potential of data. Here's a laser focus on key areas for an empowered ecosystem:
1. Democratize Access, Not Data:
Granular Access Controls: Provide users with self-service tools tailored to their specific needs, preventing data overload and misuse.
Data Catalogs: Implement robust data catalogs for easy discovery and understanding of available data sources.
2. Foster Collaboration with Clear Roles:
Data Mesh Architecture: Break down data silos by creating a distributed data ownership model with clear ownership and responsibilities.
Collaborative Workspaces: Utilize interactive platforms where data scientists, analysts, and domain experts can work seamlessly together.
3. Leverage Advanced Analytics Strategically:
AI-powered Automation: Automate repetitive tasks like data cleaning and feature engineering, freeing up data talent for higher-level analysis.
Right-Tool Selection: Strategically choose the most effective advanced analytics techniques (e.g., AI, ML) based on specific business problems.
4. Prioritize Data Quality with Automation:
Automated Data Validation: Implement automated data quality checks to identify and rectify errors at the source, minimizing downstream issues.
Data Lineage Tracking: Track the flow of data throughout the ecosystem, ensuring transparency and facilitating root cause analysis for errors.
5. Cultivate a Data-Driven Mindset:
Metrics-Driven Performance Management: Align KPIs and performance metrics with data-driven insights to ensure actionable decision making.
Data Storytelling Workshops: Equip stakeholders with the skills to translate complex data findings into compelling narratives that drive action.
Benefits of a Precise Ecosystem:
Sharpened Focus: Precise access and clear roles ensure everyone works with the most relevant data, maximizing efficiency.
Actionable Insights: Strategic analytics and automated quality checks lead to more reliable and actionable data insights.
Continuous Improvement: Data-driven performance management fosters a culture of learning and continuous improvement.
Sustainable Growth: Empowered by data, organizations can make informed decisions to drive sustainable growth and innovation.
By focusing on these precise actions, organizations can create an empowered data analytics ecosystem that delivers real value by driving data-driven decisions and maximizing the return on their data investment.
Levelwise PageRank with Loop-Based Dead End Handling Strategy: SHORT REPORT ... – Subhajit Sahu
Abstract — Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components and processes them in topological order, one level at a time. This enables ranks to be calculated in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It does, however, come with a precondition: the absence of dead ends in the input graph. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. The slowdown on the GPU is likely caused by a large submission of small workloads, and is expected to be a non-issue when the computation is performed on massive graphs.
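A compact sketch of the levelwise idea (condense into strongly connected components, then finalize ranks one component at a time in topological order) is shown below using networkx. It is a simplified reading of the abstract, assumes the graph has no dead ends as the report's precondition states, and is not the authors' implementation.

```python
import networkx as nx

def levelwise_pagerank(G: nx.DiGraph, d: float = 0.85, iters: int = 50) -> dict:
    # Condense the graph into its DAG of strongly connected components.
    C = nx.condensation(G)               # C.nodes[i]["members"] = vertices in SCC i
    n = G.number_of_nodes()
    rank = {v: 1.0 / n for v in G}
    # Process components in topological order: every in-neighbor outside the
    # current component already has its final rank, so each block converges
    # locally without global per-iteration communication.
    for comp in nx.topological_sort(C):
        members = C.nodes[comp]["members"]
        for _ in range(iters):
            new = {}
            for v in members:
                incoming = sum(rank[u] / G.out_degree(u) for u in G.predecessors(v))
                new[v] = (1 - d) / n + d * incoming   # assumes no dead ends
            rank.update(new)
    return rank
```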
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23... – John Andrews
SlideShare Description for "Chatty Kathy - UNC Bootcamp Final Project Presentation"
Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults
Description:
Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project.
Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas
Accumulo Summit 2016: Timely - Scalable Secure Time Series Database
1. Scalable Secure Time Series Database
https://NationalSecurityAgency.github.io/timely
2. Overview
• Built on Apache Accumulo
– Proven Security, Scale & Reliability
• Uses Netty for communication protocols
– Widely adopted, easy to integrate
• Provides secure access to labeled data
– Easily customized to meet unique architectures
3. History
• Integrated OpenTSDB with Apache Accumulo
– Using Eric Newton's shim code
– Seemed to have issues with scale
– FAIL: could not get past StackOverflowError (OpenTSDB issue #334)
• Decided to write it from scratch
– Keep Grafana
– Use Grafana OpenTSDB datasource plugin
• Had something working in 2 weeks
4. Simple Architecture
• Insert data points
• Subscribe to data points
• Query for aggregated data points
[Diagram: Timely server with time series ingest, subscribe, and query paths]
5. Application Interfaces
• Supports multiple protocols
– udp, tcp, https, websocket
• Operations for storing data
– All protocols, security tag optional
• Operations for working with time series data
– https and websocket
• Operations for subscribing to data
– websocket only
6. Timely Input Format (Text)
• Simple text based on OpenTSDB put format:
put <metric> <timestamp> <value> <tag>[,<tag>...]
• Example
put sys.cpu.idle 1469735914000 25.0 host=s01n04 rack=s01 instance=0
• Supported in all protocols
• viz tag used to label data
– viz=private
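As a rough illustration of the text protocol above, the snippet below sends a single put line to a Timely server over TCP. The host, port, and tag values are placeholders, not values from the talk; adjust them to match your own deployment.

```python
import socket
import time

# Hypothetical Timely endpoint; substitute your server's host and TCP port.
TIMELY_HOST, TIMELY_PORT = "timely.example.com", 54321

# One metric in the text "put" format shown above, with a viz label.
line = "put sys.cpu.idle {ts} 25.0 host=s01n04 rack=s01 viz=private\n".format(
    ts=int(time.time() * 1000))  # timestamps are in milliseconds

with socket.create_connection((TIMELY_HOST, TIMELY_PORT)) as sock:
    sock.sendall(line.encode("utf-8"))
```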
7. Timely Input Format (Binary)
• Binary format uses Google FlatBuffers encoding
• IDL file located in the source code
• Generate client code in multiple languages
• Currently supported in UDP and TCP protocols
8. Sending Data to Timely
• Send data directly from your application
• Can use existing collection agents:
– OpenTSDB Tcollector
– CollectD
• Can leverage StatsD servers also
– HADOOP-12360 (StatsD Metrics2 sink)
9. Storage Format
• Meta Table
– Stores unique metric and tag information
• Metrics Table
– Stores individual metric data
– Each data point stored N ways, N = # tags
• Several bytes to store each key
– Run Length Encoding
– Compression
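To make the "stored N ways" idea concrete, here is a small sketch that fans one data point out into one key per tag so a scan on any tag can locate the point. The row and column layout here is purely illustrative and is not Timely's actual metrics-table schema.

```python
# Illustrative only: fan a single data point out into N keys, one per tag.
def metric_keys(metric, timestamp_ms, value, tags, viz):
    all_tags = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    keys = []
    for tag_key, tag_value in sorted(tags.items()):
        row = f"{metric}\x00{timestamp_ms}"
        col_family = f"{tag_key}={tag_value}"   # the tag this copy indexes
        col_qualifier = all_tags                 # full tag set for the point
        keys.append((row, col_family, col_qualifier, viz, value))
    return keys

print(metric_keys("sys.cpu.idle", 1469735914000, 25.0,
                  {"host": "s01n04", "rack": "s01"}, "private"))
```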
10. Visualizing Time Series Data
• Timely built to work with Grafana
• Timely App for Grafana
– Drop it into the Grafana plugins directory
– Provides Timely data source
– Integrates security features into Grafana
– Example dashboards provided
11. Timely App – Data Sources
• Define Timely Data Sources
• Test Connectivity
12. Timely App – Menu Items
• Login to defined data source
• View Metric Names / Tags
13. Timely App – Login
• Top – Login using client certificates
• Bottom – Login using username / password
14. Sample Dashboards
• Timely App included dashboards:
– Timely Status
– System Overview
– Hadoop Overview
– Accumulo Overview
22. Subscribing to Data
• Subscription API over WebSocket protocol
– WebSocket is a bi-directional protocol
– Timely uses secure WebSockets (wss)
• Create connection and subscribe to:
– Data for specific metric names
– Data for a specific time window
– Optionally, data that matches tag names and values
• Can register multiple subscriptions
• Remove subscriptions when appropriate
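A minimal sketch of what a subscription client could look like with the Python websockets library is shown below. The endpoint URL and the message fields (operation, subscriptionId, and so on) are assumptions made for illustration; the actual subscription message schema is defined by the Timely API.

```python
import asyncio
import json
import ssl

import websockets  # pip install websockets

async def subscribe():
    # Hypothetical secure WebSocket endpoint for a Timely server.
    ssl_ctx = ssl.create_default_context()
    async with websockets.connect("wss://timely.example.com:54322/websocket",
                                  ssl=ssl_ctx) as ws:
        # Register a subscription, then ask for one metric filtered by tag.
        # Field names here are illustrative, not the authoritative schema.
        await ws.send(json.dumps({"operation": "create",
                                  "subscriptionId": "sub-1"}))
        await ws.send(json.dumps({"operation": "add",
                                  "subscriptionId": "sub-1",
                                  "metric": "sys.cpu.idle",
                                  "tags": {"rack": "s01"}}))
        async for message in ws:   # stream matching data points as they arrive
            print(message)

asyncio.run(subscribe())
```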
23. Security - Implementation
• Timely stores the labels provided in the viz tag
– Timely only calls flatten() on the CV for consistent ordering
• Spring Security enables users to plug in their authentication mechanism and role provider
• Workflow:
– User logs into Timely via /login HTTPS endpoint
– User authenticated via Spring Security
– HTTP secure session cookie returned for future API calls
25. Transport Security
• HTTP Strict Transport Security (HSTS)
– Accessing via http will redirect to HTTPS
– Rule stored in browser for configured time
• HTTPS
• WSS
26. Modes of Operation
• Anonymous access enabled
– Unauthenticated users only see unlabeled data
– Authenticated users see what they are allowed
• Anonymous access disabled
– Unauthenticated users receive an error message
– Authenticated users see what they are allowed
27. Roadmap
• Summarization of historical data
• New Time Series API
– Move away from OpenTSDB API
– Add additional features
• Timely Client
– Make subscribing to data easier
– Enable analytics to be easily written
• Enrichment
– Allow for user-supplied information about time series
• Support Grafana annotations
28. Deploying Timely
• Java 8 required for Accumulo and Timely
• Tested with Accumulo 1.7.x and Hadoop 2.6
• Standalone Mode
– Uses Mini Accumulo Cluster
– Useful for development and testing
– Data lost across restarts
• Non-Standalone Mode
– 1+ Timely Servers
29. Deployment #1
• Setup:
– 1 Timely Server
– Accumulo 1.7.1, 26 TServers on single-disk hosts
• Timely server receiving 2.75M metrics/min
• Inserting 20.3M keys/min (338K / sec)
– @10:1 ratio inserted to received
• 2.2T keys in the metrics table
– 8.75TB unreplicated
– @ 4.3 bytes per key, ~ 40 bytes per metric
30. Deployment #2
• Setup:
– 2 Timely servers
– Accumulo 1.7.1, 31 TabletServers on single-disk hosts
• Timely servers receiving 10M metrics/minute
• Inserting 71M keys/minute (1.18M / sec)
– @ 7:1 ratio inserted to received
• 1.91T keys in the metrics table
– 7.47TB unreplicated
– @4.3 bytes per key, ~ 30 bytes per metric