Lessons Learned from Leveraging Real-Time Power Consumption Data with Apache Kudu

© Hitachi, Ltd. 2019. All rights reserved.
ApacheCon North America 2019
Masahiro Ito
OSS Solution Center
Hitachi, Ltd. September 11, 2019

Who am I?
• Masahiro Ito
➢ Software Engineer at Hitachi, Ltd.
• Developing big data and AI solutions
– E-mail: masahiro.ito.ph@hitachi.com
➢ Web article writer (in Japanese)
• https://thinkit.co.jp/author/10002

Outline
1. Introduction
2. Apache Kudu Overview
3. Performance Evaluations
I. Bulk Data Loading Performance
II. Near Real-time Processing Performance
4. Summary

1. Introduction

Hitachi Corporate Profile
Hitachi, Ltd.
President & CEO: Toshiaki Higashihara
Established: February 1, 1920
Capital: 458.7 billion yen (as of end of Mar. 2019)
Number of Employees: 295,941 (as of end of Mar. 2019)
Revenues: 9,480.6 billion yen (FY2018 Consolidated)
Operating Income: 754.9 billion yen (FY2018 Consolidated)

Share of Revenues (FY2018*)
[Figure: share of revenues by segment; total revenues 9,480.6 billion yen. Segments: IT, Energy, Industry, Mobility, Smart Life, Hitachi High-Technologies, Hitachi Construction Machinery, Hitachi Metals, Hitachi Chemical, Others. Segment shares range from 4% to 20%.]
* The figures are based on the new segment classifications effective from FY2019

Motivation of Real-time IoT Data Analysis
• Utilization of IoT and AI in various industries
➢ Various IoT devices generate large amounts of data in real time
➢ Sensor data is leveraged for monitoring, BI, and machine learning
• Real-time IoT data analysis
➢ Requires strong performance for both streaming and analytic workloads

2. Apache Kudu Overview

Apache Kudu Overview
• Apache Kudu is a storage engine for Apache Hadoop
➢ A top-level project in the Apache Software Foundation
• Apache Hadoop ecosystem integration
➢ Reduces query latency for Apache Impala and Apache Spark
➢ Enables transparent joins of Kudu tables with data stored in HDFS or HBase
• Kudu enables real-time analytics on rapidly changing data
➢ Provides both fast inserts/updates and efficient scans

Performance Comparison for Kudu/HBase/HDFS
[Figure: Kudu, HBase, and HDFS compared on four axes: high-throughput read, real-time read, high-throughput write, and real-time write. Annotations: "Suitable for data analysis" and "Suitable for streaming data store".]
Kudu covers different workloads by itself
➢ Enables real-time analytics on rapidly changing data

Traditional Hadoop and Kudu: Analytics on rapidly changing data
• Traditional Hadoop: Streaming data -> (Inserts/Updates) -> HBase -> (Batch copy) -> HDFS -> (Scans) -> Analysis system (Dashboard, BI, Machine Learning)
• Kudu: Streaming data -> (Inserts/Updates) -> Kudu -> (Scans) -> Analysis system (Dashboard, BI, Machine Learning)

Data Model: Table
• Strongly-typed columns
• Primary Key consists of one or more columns
• Operations: Insert / Update / Delete / Upsert / Scan
date        id  usage   cost     complete
2018-01-01  01  20.86   22,360   True
2018-01-01  02  124.23  182,345  True
2018-01-02  01  22.53   736      False
2018-01-02  02  30.01   5,842    True
Primary key: date, id (rows are sorted by the primary key columns)

Data Management: Table and Tablet
• A table is partitioned into tablets that are distributed across tablet servers (Kudu TServers)
➢ Partitioning strategy: Range partitioning, Hash partitioning
➢ All rows within a tablet are sorted by primary key
[Figure: a table with columns (date, id, …) and eight rows (dates 2018-01-01 and 2018-01-02, ids 01 to 04) is split by range partitioning on date and hash partitioning on id into four tablets of two rows each:
- 2018-01-01: ids 01, 03
- 2018-01-01: ids 02, 04
- 2018-01-02: ids 01, 03
- 2018-01-02: ids 02, 04
The tablets are placed on Kudu TServers and replicated.]
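
The combination of range partitioning by date and hash partitioning by id can be sketched in a few lines of Python. This is only an illustration: the function name `tablet_for`, the choice of two hash buckets, and the use of `zlib.crc32` as the hash are assumptions made for the example, not Kudu's actual hash function or API.

```python
import zlib
from collections import defaultdict

def tablet_for(date: str, device_id: str, hash_buckets: int = 2) -> tuple:
    """Return an illustrative tablet key: (range partition, hash bucket)."""
    range_part = date  # one range partition per day
    bucket = zlib.crc32(device_id.encode()) % hash_buckets  # stand-in hash
    return (range_part, bucket)

# Eight rows: two dates times four ids, as in the slide.
rows = [(d, i) for d in ("2018-01-01", "2018-01-02")
        for i in ("01", "02", "03", "04")]

tablets = defaultdict(list)
for date, device_id in rows:
    tablets[tablet_for(date, device_id)].append((date, device_id))

# Rows inside each tablet are kept sorted by primary key (date, id).
for key in tablets:
    tablets[key].sort()

for key, tablet_rows in sorted(tablets.items()):
    print(key, tablet_rows)
```

Each row lands in exactly one tablet, and all rows in a tablet share the same date range, so a query restricted to one day only needs to touch that day's tablets.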

Insert Operation Flow in each TServer
[Figure: Kudu Client -> Kudu TServer (on a worker node) -> Tablet 1 (on a data disk)]
1. Insert records: the Kudu client sends records (e.g. keys 05, 01, 04), which are recorded in the write-ahead log
2. Sort in the in-memory buffer: the MemRowSet keeps the records sorted by primary key (01, 04, 05)
3. Flush: the MemRowSet is written to disk as a DiskRowSet (01, 04, 05)
4. Compaction: DiskRowSets (e.g. 01, 04, 05 and 02, 03, 06) are merged and sorted by primary key into one DiskRowSet (01 to 06)
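
The four steps of the insert flow can be mimicked with a toy model. The sketch below is a simplification under stated assumptions: the class `TabletSketch`, its attribute names, and the flush threshold of three records are invented for illustration and do not reflect Kudu's internal implementation.

```python
import heapq
from bisect import insort

class TabletSketch:
    """Toy model of one tablet: an in-memory buffer plus on-disk row sets."""

    def __init__(self, flush_threshold: int = 3):
        self.wal = []             # append-only log of incoming keys
        self.mem_row_set = []     # kept sorted by primary key
        self.disk_row_sets = []   # each flushed run is a sorted list
        self.flush_threshold = flush_threshold  # hypothetical value

    def insert(self, key):
        self.wal.append(key)              # 1. record the write
        insort(self.mem_row_set, key)     # 2. keep the buffer sorted
        if len(self.mem_row_set) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # 3. write the sorted buffer out as a new DiskRowSet
        self.disk_row_sets.append(self.mem_row_set)
        self.mem_row_set = []

    def compact(self):
        # 4. merge all sorted DiskRowSets into one sorted DiskRowSet
        merged = list(heapq.merge(*self.disk_row_sets))
        self.disk_row_sets = [merged]

tablet = TabletSketch()
for key in ["05", "01", "04", "02", "06", "03"]:
    tablet.insert(key)
print(tablet.disk_row_sets)   # two sorted runs before compaction
tablet.compact()
print(tablet.disk_row_sets)   # one sorted run after compaction
```

With the keys from the slide, the two flushed runs are 01, 04, 05 and 02, 03, 06, and compaction merges them into a single run from 01 to 06.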

3. Performance Evaluations

Evaluation Scenario: Real-time Power Consumption Data Analysis
• What is power disaggregation?
➢ Estimates the power consumption of individual appliances from a single meter only
• Appliances: TV, air conditioner, refrigerator, microwave, etc.
➢ Enables energy monitoring of individual appliances
• For energy efficiency improvement, user behavior analysis, etc.
[Figure: appliance load monitoring. Disaggregation splits the total electrical signal (measured with a single meter) into the electrical signals of each appliance.]

Evaluation Outline
i. Bulk Data Loading Performance
➢ Migrate existing data to the new system with Kudu
ii. Near Real-time Processing Performance
➢ Simultaneous data insertion and scanning
• Insert power consumption data every second
• Scan inserted data every minute for aggregation
• Scan aggregated data every 5 seconds for interactive data analysis
[Figure: evaluation system. Components: meters, electric power disaggregation system, Kudu, analysis system, analyst. Flows: insert every second, minutely aggregation, analytic query.]

Evaluation Environment: 6 Physical machines and 10 Gbps network
Physical machine Spec
- CPU: 20 cores (40 threads)
- Memory: 384 GB
- Disk: SAS HDD 1,200 GB * 10 disks
1 master node
- Impala Catalog Server
- Impala StateStore
- HDFS NameNode
- Kudu Master
- Hive Metastore Server
1 client node
- Kudu Java client
4 worker nodes
- Impala Daemon
- HDFS DataNode
- Kudu TServer
10 Gbps switch / 10 Gbps LAN
Software version
- OS: CentOS 7.6
- CDH 6.2, Kudu 1.9.0
Software Configurations
- TServer memory: 32GB
- Impala memory: 256GB

I. Bulk Data Loading Performance

Evaluation Overview
• Load CSV files in HDFS into a Kudu table using Impala
• Compared two sets of optimizer hints in Impala
1. +SHUFFLE,CLUSTERED (default):
• SHUFFLE: Exchanges data between nodes to partition it before insert
• CLUSTERED: Sorts data by the partition columns before insert
2. +NOSHUFFLE,NOCLUSTERED:
• Does not partition or sort data before insert
Table schema:
#  Column       Primary key  Type
1  time_stamp   ✔            unixtime_micros
2  building_id  ✔            int32
3  floor_id     ✔            int32
4  device_id    ✔            int32
5  device_load               int64
6  device_type               int32
Table design:
- Record size: 32 bytes
- Range partition: 24 hours (time_stamp)
- Hash partitions: 16 (building_id, floor_id)
- Replication factor: 3
Data size:
- 1,440 million records, 43 GB
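
As a quick sanity check, the stated record size and data size can be reproduced from the column types. The sketch assumes fixed widths of 8 bytes for unixtime_micros and int64 and 4 bytes for int32, and it ignores encoding, compression, and replication; the slides do not say how the 43 GB figure was measured.

```python
# Fixed column widths in bytes for the types listed in the schema.
TYPE_BYTES = {"unixtime_micros": 8, "int32": 4, "int64": 8}

schema = [
    ("time_stamp", "unixtime_micros"),
    ("building_id", "int32"),
    ("floor_id", "int32"),
    ("device_id", "int32"),
    ("device_load", "int64"),
    ("device_type", "int32"),
]

record_bytes = sum(TYPE_BYTES[col_type] for _, col_type in schema)
records = 1_440_000_000
raw_bytes = record_bytes * records

print(record_bytes)                  # 32
print(raw_bytes / 1e9)               # 46.08 (decimal GB)
print(round(raw_bytes / 2**30, 1))   # 42.9 (GiB)
```

The raw column data comes to about 42.9 GiB, which is close to the 43 GB on the slide if that figure is expressed in binary units.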

Bulk Data Loading Performance: Throughput and Compaction Load
[Figure: insertion throughput and compaction duration over time for each set of Impala optimizer hints]
• +SHUFFLE,CLUSTERED (default): avg. 0.57M records/sec; almost no compaction time after insertion finishes
• +NOSHUFFLE,NOCLUSTERED: avg. 1.57M records/sec; compaction continues after insertion finishes

Evaluation Summary
[Figure: Impala bulk insert throughput (records/sec) by Impala query hints: CLUSTERED,SHUFFLE (default) = 0.57 M; NOCLUSTERED,NOSHUFFLE = 1.57 M]
+NOSHUFFLE,NOCLUSTERED hints:
• Uses Impala memory only for data insertion
• Impala completes data loading quickly
• Kudu continues heavy compaction in the background
+SHUFFLE,CLUSTERED hints (default):
• Leverages Impala memory for partitioning and sorting
• Impala takes more time to complete data loading
• Kudu has less compaction load
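
To put the two throughput figures in perspective, the approximate insertion time for the 1,440 million records can be derived from the averages. This is back-of-the-envelope arithmetic, not a measured result from the talk, and it excludes the background compaction that continues after the NOSHUFFLE,NOCLUSTERED load finishes.

```python
RECORDS = 1_440_000_000

# Average insertion throughput in records/sec, as reported in the evaluation.
throughput = {
    "+SHUFFLE,CLUSTERED (default)": 0.57e6,
    "+NOSHUFFLE,NOCLUSTERED": 1.57e6,
}

load_minutes = {hints: RECORDS / rate / 60 for hints, rate in throughput.items()}

for hints, minutes in load_minutes.items():
    print(f"{hints}: about {minutes:.1f} min")

speedup = throughput["+NOSHUFFLE,NOCLUSTERED"] / throughput["+SHUFFLE,CLUSTERED (default)"]
print(f"insertion speedup: {speedup:.2f}x")
```

Under these averages the default hints need roughly 42 minutes of insertion time and the NOSHUFFLE,NOCLUSTERED hints roughly 15 minutes, a ratio of about 2.75.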

II. Near Real-time Processing Performance

Evaluation Overview
• Concurrent data insertion and scanning for 4 hours
➢ Insert every second with Kudu Java clients (× 20)
• Number of records inserted per second (one per appliance): from 100,000 upward
• A run fails if insertion time continues to exceed 1 second
➢ Scan with two types of Impala queries
A) Minutely aggregation query: from seconds to minutes for all appliances (every minute)
B) Per appliance aggregation query: gets the daily total load of 1 appliance (every 5 seconds)
• Kudu tables: load_per_sec_table, minutely_load_table
• Records for 1 day (1,440 minutes) are pre-stored to save measurement time
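
The minutely aggregation (query A) rolls per-second records up to per-minute records for every appliance. A minimal in-memory sketch of that roll-up follows; the function name `aggregate_by_minute`, the record layout, and the choice of summation as the aggregate are assumptions, since the slides do not show the actual Impala query.

```python
from collections import defaultdict

def aggregate_by_minute(records):
    """Sum per-second loads into per-minute loads for each appliance.

    records: iterable of (epoch_seconds, appliance_id, load).
    Summation is an assumption; the slides do not name the aggregate.
    """
    totals = defaultdict(int)
    for epoch_seconds, appliance_id, load in records:
        minute_start = epoch_seconds - epoch_seconds % 60
        totals[(minute_start, appliance_id)] += load
    return dict(totals)

per_second = [
    (0, "a", 10), (1, "a", 20), (59, "a", 30),   # appliance a, minute 0
    (60, "a", 5),                                # appliance a, minute 1
    (0, "b", 7),                                 # appliance b, minute 0
]

per_minute = aggregate_by_minute(per_second)
print(per_minute)
```

Query B would then read only the per-minute rows of a single appliance, which is why the key order of the aggregated table matters for scan latency.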

Table Designs and Scan Workloads
• Evaluates two types of tables with different primary key order
➢ Affects scan performance

1) First primary key column = time_stamp
Efficient access to the loads in a time range
➢ e.g. A) Minutely aggregation query
time_stamp        building  floor  device  watt  type
2019-09-01 00:00  00001     01     01      209   3
2019-09-01 00:00  00001     01     02      102   5
2019-09-01 00:00  00001     01     03      42    11
2019-09-01 00:00  00001     02     01      462   4
2019-09-01 00:00  00001     02     02      3     22
2019-09-01 00:00  00001     03     01      0     4

2) First primary key columns = building, floor, device IDs
Efficient access to the load of a specific appliance
➢ e.g. B) Per appliance aggregation query
building  floor  device  time_stamp        watt  type
00001     01     01      2019-09-01 00:00  209   3
00001     01     01      2019-09-01 00:01  102   5
00001     01     01      2019-09-01 00:02  42    11
00001     01     02      2019-09-01 00:00  462   4
00001     01     02      2019-09-01 00:01  3     22
00001     01     02      2019-09-01 00:02  0     4
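
The effect of primary key order on scans can be illustrated by checking where the rows of each query fall in the sorted key space. The sketch below uses plain sorted lists as a stand-in for a tablet; the sample appliances and the helper names `positions` and `is_contiguous` are made up for the example.

```python
from itertools import product

appliances = [("00001", "01", "01"), ("00001", "01", "02"), ("00001", "02", "01")]
minutes = ["2019-09-01 00:00", "2019-09-01 00:01", "2019-09-01 00:02"]

# Design 1 puts time_stamp first; design 2 puts the IDs first.
# Both lists are sorted by their primary key, as rows are within a tablet.
time_first = sorted((t, *a) for t, a in product(minutes, appliances))
ids_first = sorted((*a, t) for t, a in product(minutes, appliances))

def positions(rows, predicate):
    """Indexes of the rows that a query has to read."""
    return [i for i, row in enumerate(rows) if predicate(row)]

def is_contiguous(idx):
    """True if the indexes form one unbroken range."""
    return idx == list(range(idx[0], idx[0] + len(idx)))

target = ("00001", "01", "02")

# Query B: all rows of one appliance.
b_time_first = positions(time_first, lambda r: r[1:] == target)
b_ids_first = positions(ids_first, lambda r: r[:3] == target)

# Query A: all rows of one minute.
a_time_first = positions(time_first, lambda r: r[0] == minutes[1])
a_ids_first = positions(ids_first, lambda r: r[3] == minutes[1])

print("B, time first:", b_time_first, is_contiguous(b_time_first))
print("B, IDs first: ", b_ids_first, is_contiguous(b_ids_first))
print("A, time first:", a_time_first, is_contiguous(a_time_first))
print("A, IDs first: ", a_ids_first, is_contiguous(a_ids_first))
```

Each query reads one contiguous key range under the design that matches it and scattered rows under the other design, which is the trade-off the two table designs are meant to expose.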

Insertion Performance: First primary key column = time_stamp
Insert 1.9 M records/sec (Succeeded)
• Avg. insertion time: 480 msec
• Insertion time sometimes exceeded 1 second, but recovered quickly
Insert 2.0 M records/sec (Failed)
• Insertion time exceeded 1 second continuously
• "Memory pressure rejection" occurred
- Soft memory limit exceeded (at 93.59% of capacity).
[Figure: average insertion time (msec) vs. number of records inserted per second (0.1 M to 2.0 M); labeled values: 480 msec and 1,031 msec]

Insertion Performance: First primary key columns = IDs
• "The service queue is full (50 items)" occurred
- From 01:20
[Figure: average insertion time (msec) vs. number of records inserted per second (0.1 M to 0.6 M); labeled values: 185 msec and 440 msec]

Why does insertion performance differ with primary key order?
• The RowSet compaction load changes according to the primary key order
➢ Since the records are inserted in timestamp order
[Figure: merging a new RowSet into an existing RowSet
- First primary key column = time_stamp (inserted 1.9M records/sec): the new RowSet (2019-09-01 00:02) follows the existing RowSet (00:00 to 00:01) in key order, so they merge without sorting
- First primary key columns = IDs: the new RowSet (buildings 00001 and 00002 at 00:02) interleaves with the existing RowSet (buildings 00001 and 00002 at 00:00 to 00:01), so they merge with sorting]
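
The difference between the two merges comes down to whether the key ranges of the existing and new RowSets overlap. The following sketch checks this with small in-memory lists; the helper names and sample keys are illustrative assumptions, and Kudu's real compaction policy is more involved than this comparison.

```python
def key_range(row_set):
    """Smallest and largest primary key in a RowSet."""
    return min(row_set), max(row_set)

def ranges_overlap(a, b):
    (a_lo, a_hi), (b_lo, b_hi) = key_range(a), key_range(b)
    return a_lo <= b_hi and b_lo <= a_hi

buildings = ["00001", "00002"]
existing_minutes = ["2019-09-01 00:00", "2019-09-01 00:01"]
new_minute = "2019-09-01 00:02"

# Key order 1: time_stamp first.
existing_t = [(t, b) for t in existing_minutes for b in buildings]
new_t = [(new_minute, b) for b in buildings]

# Key order 2: IDs first.
existing_i = [(b, t) for t in existing_minutes for b in buildings]
new_i = [(b, new_minute) for b in buildings]

overlap_time_first = ranges_overlap(existing_t, new_t)
overlap_ids_first = ranges_overlap(existing_i, new_i)
print(overlap_time_first)   # False: new rows append after the existing ones
print(overlap_ids_first)    # True: new rows interleave with the existing ones

# With no overlap, concatenating the two sorted RowSets is already sorted.
append_only_t = sorted(existing_t + new_t) == sorted(existing_t) + sorted(new_t)
append_only_i = sorted(existing_i + new_i) == sorted(existing_i) + sorted(new_i)
print(append_only_t, append_only_i)
```

When records arrive in timestamp order and the timestamp leads the key, every new RowSet sits entirely after the existing data, so there is nothing to re-sort during the merge.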

Can we reduce the compaction load in another way?
[Figure: the same RowSet merge comparison: merge without sorting (first primary key column = time_stamp) vs. merge with sorting (first primary key columns = IDs)]
• Can we reduce the compaction load by reducing the maximum size of each tablet?

Insertion Performance: First primary key column = IDs, Partition range = 24h->1h
• Reduced the maximum size of each tablet by changing the range partition from 24h to 1h
• The increase in insertion time was reset by the hourly tablet change
- Around the end of every hour
• "Memory pressure rejection" and "The service queue is full (50 items)" occurred
[Figure: insertion time (msec) vs. number of records inserted per second (0.1 M to 0.9 M); labeled values: 231 msec and 358 msec]
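
A rough calculation shows how much the change from a 24h to a 1h range partition shrinks each tablet. The sketch assumes 16 hash partitions (the value given for the bulk-loading table; the slides do not restate it for this test), uniform hashing, and an insertion rate of 0.8 M records/sec, and it ignores replication.

```python
def rows_per_tablet(records_per_sec: int, range_hours: int, hash_partitions: int) -> int:
    """Rows that accumulate in one tablet over one full range partition."""
    return records_per_sec * range_hours * 3600 // hash_partitions

RATE = 800_000          # records/sec, illustrative insertion rate
HASH_PARTITIONS = 16    # assumption, taken from the bulk-loading table design

rows_24h = rows_per_tablet(RATE, 24, HASH_PARTITIONS)
rows_1h = rows_per_tablet(RATE, 1, HASH_PARTITIONS)

print(f"24h range partition: {rows_24h:,} rows per tablet")
print(f"1h range partition:  {rows_1h:,} rows per tablet")
print(f"reduction factor:    {rows_24h // rows_1h}")
```

Under these assumptions a tablet holds 24 times fewer rows at its maximum, which bounds the amount of existing data that each compaction has to merge against.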

Insertion Performance Summary
Insertion throughput by first primary key columns (partition range):
First primary key columns (Partition range)  Insertion throughput
Timestamp (24h)                              1.9 M records/sec
IDs (24h)                                    0.5 M records/sec
IDs (1h)                                     0.8 M records/sec
Tuning point: Reduce the compaction load
• Use Timestamp for the first primary key column
• If you want IDs as the first key, reduce the maximum size of each tablet
➢ Increase the number of partitions

Scan Performance Summary
[Figure: A) Minutely aggregation query time (95th percentile, sec) vs. number of records inserted per second (0.1 M to 2.0 M). Labeled values: 8.0 sec, 7.4 sec, 10.6 sec. First primary key column = timestamp was the lowest latency.]
[Figure: B) Per appliance aggregation query time (95th percentile, sec) vs. number of records inserted per second (0.1 M to 2.0 M). Series include "First key: Timestamp" and "First key: IDs, Partition range: 1h". Labeled values: 1.2 sec, 0.2 sec, 0.2 sec. First primary key columns = IDs were the lowest latency.]
• Primary key order should be defined according to the data scan patterns
➢ Scan request latencies differed by a factor of 3 to 6
• Trade-off with insertion performance

4. Summary

Summary
• A 4-TServer Kudu cluster enables real-time analysis of 1-second power consumption data for 1.9 million appliances
➢ Inserts every second, aggregates every minute, aggregates by any appliance
• Lessons from performance evaluation:
➢ Insertion performance tuning:
• Reduce the compaction load by
✓ Using timestamp for the first primary key column to reduce the cost of sorting during the merge
✓ Reducing the tablet size to reduce the number of records compacted
➢ Scan performance tuning:
• Define primary key order according to the data scan patterns

Trademarks
• Apache Kudu, Apache Impala, Apache Spark, Apache HBase and Apache Hadoop are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries.
• Other company and product names mentioned in this document may be the trademarks of their respective owners.
