IoT: what about data storage?
- 1. 1 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
IoT: what about data storage?
Vladimir Rodionov
Staff Software Engineer
- 2.–9. © Hortonworks Inc. 2011 – 2016. All Rights Reserved
IoT data stream (progressive build; final slide shown)
Sequence of data points
Triplet: [ID][TIME][VALUE] – basic time series
Multiplet: [ID][TIME][TAG1][…][TAGN][VALUE] – time series with tags
Sometimes with location – spatial data
But, strictly, time series
Do we have a good time-series data store?
Open source?
But commercially supported?
- 12.–14. © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Apache HBase (progressive build; final slide shown)
Open Source
Scalable
Distributed
NoSQL Data Store
Commercially supported
Temporal? Sure, you can do temporal stuff!
Out of the box?
- 17. 17 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Time Series DB requirements
Data Store MUST preserve temporal locality of data for better in-memory caching
Data Store MUST provide efficient compression
– Time series are highly compressible (less than 2 bytes per data point in some cases)
– Facebook's custom compression codec produces less than 1.4 bytes per data point
Data Store MUST provide automatic time-based rollup aggregations: sum, count, avg, min, max, etc., by minute, hour, day and so on – configurable. Most of the time it is the aggregated data we are interested in.
Efficient caching policy (RAM/SSD)
SQL API (nice to have, but optional)
Support for IoT use cases (write/read ratio up to 99/1, millions of ops/sec)
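The sub-2-bytes-per-point figure comes from delta encoding: timestamps in a series are nearly evenly spaced, so their delta-of-delta is almost always zero, and values change slowly. A minimal sketch of the idea (not Facebook's actual Gorilla codec, which packs at the bit level; this byte-level variant and its fixed-point `scale` parameter are illustrative):

```python
def zigzag(n):
    # Map signed ints to unsigned so small magnitudes encode in few bytes.
    return (n << 1) ^ (n >> 63)

def unzigzag(n):
    return (n >> 1) ^ -(n & 1)

def varint(n):
    # Variable-length encoding: 7 bits per byte, high bit = continuation.
    out = bytearray()
    while True:
        b = n & 0x7F
        n >>= 7
        out.append((b | 0x80) if n else b)
        if not n:
            return bytes(out)

def unvarint(buf, i):
    n = shift = 0
    while True:
        b = buf[i]
        i += 1
        n |= (b & 0x7F) << shift
        shift += 7
        if not b & 0x80:
            return n, i

def compress(points, scale=10):
    """points: [(epoch_seconds, float_value)] for one series."""
    out = bytearray()
    prev_ts = prev_delta = prev_val = None
    for ts, val in points:
        v = int(round(val * scale))          # fixed-point value
        if prev_ts is None:                  # first point stored verbatim
            out += varint(zigzag(ts)) + varint(zigzag(v))
            prev_delta = 0
        else:
            delta = ts - prev_ts             # delta-of-delta is 0 for a
            out += varint(zigzag(delta - prev_delta))  # regularly spaced series
            out += varint(zigzag(v - prev_val))
            prev_delta = delta
        prev_ts, prev_val = ts, v
    return bytes(out)

def decompress(blob, n_points, scale=10):
    pts, i, ts, delta, v = [], 0, 0, 0, 0
    for k in range(n_points):
        a, i = unvarint(blob, i)
        b, i = unvarint(blob, i)
        if k == 0:
            ts, v = unzigzag(a), unzigzag(b)
        else:
            delta += unzigzag(a)
            ts += delta
            v += unzigzag(b)
        pts.append((ts, v / scale))
    return pts
```

For a regular 10-second series this lands around 2 bytes per point; bit-level packing, as in Gorilla, is what pushes it below 1.4.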
- 18. 18 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Ideal HBase Time Series DB
Keeps raw data for hours
Does not compact raw data at all
Preserves raw data in the memory cache for periodic compactions and time-based rollup aggregations
Stores full-resolution data only in compressed form
Has different TTLs for different aggregation resolutions:
– Days for by_min, by_10min, etc.
– Months or years for by_hour
Compaction should preserve temporal locality of both full-resolution data and aggregated data
Integration with Phoenix (SQL)
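The time-based rollups called for above reduce, per bucket, to a handful of running statistics. A sketch of the aggregation step such a periodic job would perform (function and field names are illustrative):

```python
from collections import defaultdict

def rollup(points, resolution_s):
    """Time-based rollup for one series.
    points: iterable of (epoch_seconds, value); resolution_s: bucket width.
    Returns {bucket_start: {'count', 'sum', 'min', 'max', 'avg'}}."""
    agg = defaultdict(lambda: {'count': 0, 'sum': 0.0,
                               'min': float('inf'), 'max': float('-inf')})
    for ts, v in points:
        b = agg[ts - ts % resolution_s]   # e.g. 60 for by_min, 3600 for by_hour
        b['count'] += 1
        b['sum'] += v
        b['min'] = min(b['min'], v)
        b['max'] = max(b['max'], v)
    for b in agg.values():
        b['avg'] = b['sum'] / b['count']
    return dict(agg)
```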
- 20. 20 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Time Series DB on HBase (architecture diagram)
Raw Events → Region Server → HDFS
Compressor Coprocessor (C) and Aggregator Coprocessor (A) run inside the Region Server
CF:Raw – TTL hours
CF:Compressed – TTL days/months
CF:Aggregates – TTL months/years (one CF per resolution)
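The column-family layout above maps to per-CF TTLs, caching and compaction settings. A sketch of what the table definition might look like in the HBase shell (table and family names are illustrative; check the exact attribute names against your HBase version):

```
create 'tsdb',
  {NAME => 'r', TTL => 86400,            # raw: hours-to-a-day retention
   BLOCKCACHE => false,                  # do not cache raw data
   CONFIGURATION => {'hbase.hstore.defaultengine.compactionpolicy.class' =>
     'org.apache.hadoop.hbase.regionserver.compactions.FIFOCompactionPolicy'}},
  {NAME => 'c', TTL => 7776000,          # compressed: ~90 days
   COMPRESSION => 'GZ', CACHE_DATA_ON_WRITE => true},
  {NAME => 'a', TTL => 31536000,         # aggregates: ~1 year (one CF per resolution)
   CACHE_DATA_ON_WRITE => true}
```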
- 21. 21 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
HBASE-14468 FIFO compaction
First-In-First-Out
No compaction at all
TTL-expired data just gets archived
Ideal for raw data storage
No compaction – no block cache thrashing
Raw data can be cached on write or on read
Sustains 100s of MB/s write throughput per RS
Available in 0.98.17, 1.1+, 1.2+, HDP-2.4+
Can be easily back-ported to 1.0 (do we need this?)
- 22. 22 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Exploring (Size-Tiered) Compaction
Does not preserve temporal locality of data
Compaction thrashes the block cache
No efficient caching of data is possible
It hurts the most-recent-most-valuable data access pattern
Compression/aggregation is very heavy:
to read back recent raw data and run it through the compressor, many IO operations are required, because we can't guarantee recent data is in the block cache
- 23. 23 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
HBASE-15181 Date Tiered Compaction
DateTieredCompactionPolicy
CASSANDRA-6602
Works better for time series than ExploringCompactionPolicy
Better temporal locality helps with reads
Good choice for compressed full resolution and aggregated data.
Available in 0.98.17, 1.2+, HDP-2.4 has it as well
- 24. 24 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Exploring Compaction + Max Size
Set hbase.hstore.compaction.max.size
This emulates Date-Tiered Compaction
Preserves temporal locality of data – data points that are close in time are stored in the same file, distant ones in separate files
Compaction works better with the block cache
More efficient caching of recent data is possible
Good for the most-recent-most-valuable data access pattern
Use it for compressed and aggregated data
Helps to keep recent data in the block cache
ECPM = Exploring Compaction Policy + Max size
- 25. 25 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
HBASE-14496 Delayed compaction
Files are eligible for minor compaction only if their age > delay
Good for applications where the most recent data is the most valuable
Prevents block cache thrashing for recent data caused by frequent minor compactions of fresh store files
Will enable this feature for the Exploring Compaction Policy
Improves read latency for the most recent data
ECP + Max + Delay (1–2 days) is a good option for compressed full-resolution and aggregated data: ECPMD
Patch available
HBase 1.0+ (can be back-ported to 0.98)
- 26. 26 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Time Series DB on HBase (architecture diagram, revisited with compaction policies)
Raw Events → Region Server → HDFS
Compressor Coprocessor (C) and Aggregator Coprocessor (A) run inside the Region Server
CF:Raw – TTL hours – FIFO compaction
CF:Compressed – TTL days/months – ECPM or DTCP
CF:Aggregates – TTL months/years (one CF per resolution) – ECPM or DTCP
- 27. 27 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
HBase Block Cache and Time Series
The current policy (LRU) is not optimal for time-series applications
We need something similar to FIFO (both in RAM and on SSD)
We need support for TB-sized RAM/SSD-based caches
The current off-heap bucket cache does not scale well (it keeps keys in the Java heap)
For an SSD cache we could mirror the most recent store files, providing FIFO semantics without the complexity of disk-based cache management
All of the above are future work items, but today:
– Disable caching for raw data (prevents extreme cache churn)
– Enable cache-on-write/read for compressed data and aggregations
- 28. 28 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Flexible Retention Policies (chart summary)
Raw – hours; Compressed – months; Aggregates – years
- 29.–31. © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Read/Write IO Reduction (estimate for 250K data points/sec; chart summary)
Base: 100 (50–100 MB/s)
FIFO + ECPM: ~50 (25–50 MB/s)
+ Compaction: ~10 (5–10 MB/s)
- 32. 32 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Summary
Disable major compaction
Do not run the HDFS balancer
Disable HBase automatic region balancing: balance_switch false
Disable region splits (DisabledRegionSplitPolicy)
Presplit the table in advance
Have separate column families for raw, compressed and aggregated data (each aggregate resolution gets its own family)
Increase hbase.hstore.blockingStoreFiles for all column families
FIFO for raw data; ECPM(D) or DTCP (next session) for compressed and aggregated data
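Several of the summary items are table- or cluster-level switches. Hedged HBase-shell equivalents (table/family names, split points and values are illustrative):

```
# disable HBase auto region balancing
balance_switch false

# presplit, disable region splits, raise the blocking-store-files limit
create 'tsdb', {NAME => 'r'}, {NAME => 'c'}, {NAME => 'a'},
  SPLITS => ['1000', '2000', '3000'],
  CONFIGURATION => {
    'hbase.regionserver.region.split.policy' =>
      'org.apache.hadoop.hbase.regionserver.DisabledRegionSplitPolicy',
    'hbase.hstore.blockingStoreFiles' => '100' }
```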
- 33. 33 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Summary (continued)
Run a periodic internal job (coprocessor) to compress data and produce time-based rollup aggregations
Do not cache raw data; enable write/read cache for the others (if using ECPM(D))
Enable WAL compression – decreases write IO
Use maximum compression for raw data (GZ) – decreases write IO
- 35. 35 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
SQL (Phoenix) integration
Each time series has a set of named attributes, which we call meta (tags in OpenTSDB)
Keep time-series meta in Phoenix table(s)
Adding, deleting or updating a time series is a DML/DDL operation on a Phoenix table
Meta is (mostly) static
Define the set of attributes in meta that forms the PK
Have a PK translation to a unique ID
Store ID, RTS (reversed timestamp), VALUE in HBase
Now you can index time series by any attribute(s) in Phoenix
A query is a two-step process: Phoenix first to select the list of IDs, then HBase to run the query on the ID list
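Storing [ID][RTS][VALUE] means the row key sorts newest-first within a series. A sketch of the key encoding (the 4-byte/8-byte field widths are illustrative):

```python
import struct

LONG_MAX = 2**63 - 1  # Java Long.MAX_VALUE

def row_key(series_id, ts):
    # [4-byte series ID][8-byte reversed timestamp], big-endian:
    # rows for one series sort newest-first, so a scan starting at the
    # series prefix returns the most recent points immediately.
    return struct.pack('>IQ', series_id, LONG_MAX - ts)
```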
- 36. 36 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Query Flow
Time-Series Definition – META (Phoenix SQL table):
ID  | Active | Version | … | MFG
11  | true   | 1.1     | … | SA
12  | true   | 1.3     | … | SA
15  | true   | 1.4     | … | GE
17  | true   | 1.1     | … | GE
…   | …      | …       | … | …
345 | false  | 1.0     | … | SA

Time-Series Data (HBase Time Series DB):
ID  | Timestamp | Value
11  | 143897653 | 10.0
12  | 143897753 | 11.3
15  | 143897953 | 11.6
17  | 143897853 | 11.9
…   | …         | …
345 | 143897753 | 11.0

Step 1: SELECT ID FROM META WHERE MFG = 'SA' AND Version = '1.1' → ID set
Step 2: GetAvgByIdSet(ID set, now(), now() - 24h) against the HBase Time Series DB
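The two-step flow above, sketched with the slide's sample tables as in-memory data (function names such as `select_ids` are illustrative; step 1 would really be a Phoenix query and step 2 a coprocessor call):

```python
# Sample tables from the slide
META = [
    {'id': 11,  'active': True,  'version': '1.1', 'mfg': 'SA'},
    {'id': 12,  'active': True,  'version': '1.3', 'mfg': 'SA'},
    {'id': 15,  'active': True,  'version': '1.4', 'mfg': 'GE'},
    {'id': 17,  'active': True,  'version': '1.1', 'mfg': 'GE'},
    {'id': 345, 'active': False, 'version': '1.0', 'mfg': 'SA'},
]
DATA = [
    (11, 143897653, 10.0), (12, 143897753, 11.3), (15, 143897953, 11.6),
    (17, 143897853, 11.9), (345, 143897753, 11.0),
]

def select_ids(mfg, version):
    # Step 1 – Phoenix: SELECT ID FROM META WHERE MFG = ? AND Version = ?
    return {m['id'] for m in META if m['mfg'] == mfg and m['version'] == version}

def get_avg_by_id_set(ids, t_from, t_to):
    # Step 2 – HBase side: scan the ID set over the time range and aggregate
    vals = [v for (i, ts, v) in DATA if i in ids and t_from <= ts <= t_to]
    return sum(vals) / len(vals) if vals else None
```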
- 37.–38. © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Time-Series DB API (progressive build; final slide shown)
Group operations on ID sets by time range
– Min, Max, Avg, Count, Sum, other aggregations
Pluggable aggregation functions
Support for different time resolutions
With different approximations (linear, cubic, bi-cubic)
Batch-load support (for writes)
Can be implemented in an HBase coprocessor layer
Can work much, much faster than a regular SQL DBMS, because we already have aggregated data
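"Different time resolutions with linear approximation" amounts to resampling raw points onto a regular grid. A sketch, assuming sorted input (cubic/bi-cubic would swap in a different interpolant):

```python
import bisect

def resample_linear(points, t0, t1, step):
    """points: sorted [(ts, value)] for one series.
    Returns [(t, v)] on a regular grid from t0 to t1, linearly
    interpolating between the two surrounding raw samples."""
    ts = [p[0] for p in points]
    out = []
    t = t0
    while t <= t1:
        i = bisect.bisect_left(ts, t)
        if i == 0:                       # before first sample: clamp
            out.append((t, points[0][1]))
        elif i == len(points):           # after last sample: clamp
            out.append((t, points[-1][1]))
        else:
            (ta, va), (tb, vb) = points[i - 1], points[i]
            out.append((t, va + (vb - va) * (t - ta) / (tb - ta)))
        t += step
    return out
```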