Copyright©2019 NTT Corp. All Rights Reserved.
Real-time spatiotemporal data utilization
for future mobility services
Atsushi ISOMURA
NTT(Nippon Telegraph and Telephone) Corp., Researcher
About me
- Atsushi Isomura
- Tokyo, Japan
- Work
- NTT Software Innovation Center
- OSS, in-memory, data storage, distribution, etc.
- “Spatio-Temporal data processing”
- Free time
- Develop smart-phone apps
- Nintendo Switch (Splatoon2 : rank X, SSB Ultimate : Elite)
- Baseball
- Drive cars (Desire : Drive without traffic jam)
INDEX
1. Motivation
2. Spatio-temporal data utilization in redis
3. Proposal
4. Performance
5. ST-code generation tips / Sample codes
#Links to codes and slides are at the end.
1. Motivation
1-1. Background
These IoT devices keep INCREASING!
ref1 : 2016 estimation of Fuji Keizai Marketing Research & Consulting Group
ref2 : 2016 estimation of Yano Research Institute
[Chart 1: 2.6 hundred million in 2016 → 5.4 hundred million in 2020]
[Chart 2: 1.1 hundred million in 2016 → 3.2 hundred million in 2020]
1-1. Background
IoT sensors / IoT devices
What’s the difference?
1-1. Background
1. They MOVE every moment!
Latitude Longitude Time Value
37.800 -122.402 2019/4/2 12:30:15 ID:1234, 30km/h
37.798 -122.400 2019/4/2 12:30:16 ID:1234, 31km/h
… … … …
ST(Spatio-Temporal) Data
1-1. Background
2. The density CHANGES by location & time
[Figure: two panels comparing density. Panel 1: Metropolis High, Suburb Low. Panel 2: Metropolis Low, Suburb High.]
1-2. Future mobility services
Example1 : Nearby car crash alert
[Figure labels: car, crash, alert, real-time view]
Crash! → Alert ! “Broken car on abc Street”
1-2. Future mobility services
Example2 : Optimal routing for taxis
[Figure labels: taxi, waiting, events (“redisconf19 ended”, “train arrived”, “party closed”)]
Input
- location of waiting people
- event information
- traffic jam
- etc.
Calculate optimal route automatically
1-2. Future mobility services
Example3 : Drone package delivery
“I need to send this package NOW!”
Nearest drone available
[Figure labels: drone, package]
1-3. IoT devices’ features
IoT devices Important features
1. MOVE
2. Density CHANGES
Related services require
- real-time response
- ST-data insert
- ST-data search
1-4. Requirements and current technology
1. Insert a large volume of ST-data in real time (<10 ms) : over 20M rec/s[1]
2. Search by ST-range query in real time (<100 ms) : lng:x1~x2 lat:y1~y2 time:t1~t2
3. Distribute data equally regardless of density changes
- All requirements must be satisfied
[Figure: Cars send Values to the Data store (1.); Apps send ST-range queries to the Data store (2.); the Data store distributes the data (3.)]
No mature technology satisfies all requirements.
[1] : Fuji Keizai Marketing Research “Connected car related markets and telematics strategy 2017” (Estimation only in Japan)
1-5. Which data store to use?
We searched for…
- blazingly fast performance
- geo features
- secondary indexing
- data distribution
Of course we selected “redis”
We studied from RedisConf… (redisconf17)
Using “Geohash-encoding” & “Sorted-set” enables ST-data management in redis
2. ST-data utilization in redis
2-1. Related commands
https://redis.io/commands
Geo related commands / Sorted-set related commands
Utilize “Geohash[1]” encoding algorithm
[1] : http://geohash.org/
2-2. What’s “Geohash”?
2-dimensional
longitude(x), latitude(y)
1-dimensional (Geohash)
x1y1x2y2x3y3 … xnyn
length : short=wide long=narrow
n : length of each dimension
[Figure: longitude(x) and latitude(y) are bisected recursively into cells labelled at level-1 (0, 1) and level-2 (00, 01, 10, 11), traversed along a Morton-curve[1]; San Francisco(x, y) is marked as an example point]
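The bit interleaving x1y1x2y2…xnyn can be sketched in a few lines. This is a minimal illustration, assuming longitude in [-180, 180] and latitude in [-90, 90]; the function name `geohash_bits` is hypothetical and the output is the raw bit string, not the base32 text form of a Geohash.

```python
def geohash_bits(lon, lat, n):
    """Return the 2n-bit string x1y1x2y2...xnyn for one point (Morton order)."""
    lon_lo, lon_hi = -180.0, 180.0
    lat_lo, lat_hi = -90.0, 90.0
    bits = []
    for _ in range(n):
        # One longitude bit: which half of the current longitude interval?
        mid = (lon_lo + lon_hi) / 2
        if lon >= mid:
            bits.append("1")
            lon_lo = mid
        else:
            bits.append("0")
            lon_hi = mid
        # One latitude bit: which half of the current latitude interval?
        mid = (lat_lo + lat_hi) / 2
        if lat >= mid:
            bits.append("1")
            lat_lo = mid
        else:
            bits.append("0")
            lat_hi = mid
    return "".join(bits)


# A longer code always extends the shorter one: short = wide, long = narrow.
san_francisco = (-122.402, 37.798)
print(geohash_bits(*san_francisco, n=2))
print(geohash_bits(*san_francisco, n=4))
```

Each extra level appends two bits, so every code of level n is a prefix of the code of level n+1 for the same point.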
2-2. What’s “Geohash”?
★Useful feature
Prefix match = Range query of longitude & latitude
[Figure: nested cells whose codes share a prefix: 10…, 1001…, 100110…]
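Why a prefix match becomes a range query: every code that starts with a given prefix lies between "prefix followed by all 0s" and "prefix followed by all 1s". A small sketch, assuming codes are fixed-length bit strings compared as integers; `prefix_to_range` is a hypothetical helper name.

```python
def prefix_to_range(prefix, total_bits):
    """Return (lowest, highest) integer code that starts with the bit prefix."""
    free = total_bits - len(prefix)      # bits not fixed by the prefix
    if free < 0:
        raise ValueError("prefix is longer than the code length")
    low = int(prefix, 2) << free         # prefix followed by 00...0
    high = low | ((1 << free) - 1)       # prefix followed by 11...1
    return low, high


low, high = prefix_to_range("10", total_bits=6)
print(low, high)                         # range covered by the prefix "10"
print(low <= int("100110", 2) <= high)   # a code inside the cell
print(low <= int("110000", 2) <= high)   # a code outside the cell
```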
2-3. Insert/Search requirements
- Insert : longitude(x), latitude(y), time(t), and value
- Search : range query of location and time
x y t value
-122.402° 37.798° April 2nd 2019 14:10:15 30 km/h
… … … …
Query : Search all values of…
- GEOHASH with prefix of ‘x1y1…xqyq ’
- TIMESTAMP between t1 and t2
q : length of each dimension for prefix search
2-4. Possible Key-Value design
Pattern 1. Time key sorted by Geohash
-Key : Timestamp (string)
-Score : Geohash (int)
-Value
Key      Score       Value
time_a   geohash_a   ID, …
         …           …
time_b   …           …
…        …           …
>ZADD time_a geohash_a “ID, …”
(integer) 1
>GEOADD time_a longitude latitude “ID, …”
(integer) 1
Pattern 2. Geohash key sorted by Time
-Key : Geohash (string)
-Score : Timestamp (int)
-Value
Key         Score    Value
geohash_a   time_a   ID, …
            …        …
geohash_b   …        …
…           …        …
>ZADD geohash_a time_a “ID, …”
(integer) 1
Either of them works fine
2-5. How to search by range
Query : Search all values of…
- GEOHASH with prefix of ‘x1y1…xqyq ’
- TIMESTAMP between t1 and t2
(q : length of each dimension for query)
Pattern 1 (-Key : Timestamp (string), -Score : Geohash (int), -Value : ID, …)
>ZRANGEBYSCORE t1 x1y1…xqyq…00 x1y1…xqyq…11
>ZRANGEBYSCORE t1+1 x1y1…xqyq…00 x1y1…xqyq…11
>ZRANGEBYSCORE t1+2 x1y1…xqyq…00 x1y1…xqyq…11
>ZRANGEBYSCORE t1+3 x1y1…xqyq…00 x1y1…xqyq…11
…
>ZRANGEBYSCORE t2 x1y1…xqyq…00 x1y1…xqyq…11
Pattern 2 (-Key : Geohash (string), -Score : Timestamp (int), -Value : ID, …)
>KEYS x1y1…xqyq*
(return list[i] of all keys that start with x1y1…xqyq )
>ZRANGEBYSCORE list[0] t1 t2
>ZRANGEBYSCORE list[1] t1 t2
…
>ZRANGEBYSCORE list[i] t1 t2
query by circle : GEORADIUS instead of ZRANGEBYSCORE
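To see how many commands each pattern needs, the command lists can be generated without a server. This is only a sketch: it assumes one key per integer timestamp (Pattern 1) and a given list of matching Geohash keys (Pattern 2); both function names are hypothetical.

```python
def pattern1_commands(t1, t2, geo_prefix, geo_bits):
    """One ZRANGEBYSCORE per timestamp key in [t1, t2] (Pattern 1)."""
    free = geo_bits - len(geo_prefix)    # bits not fixed by the prefix
    low = int(geo_prefix, 2) << free     # prefix followed by 00...0
    high = low | ((1 << free) - 1)       # prefix followed by 11...1
    return [f"ZRANGEBYSCORE {t} {low} {high}" for t in range(t1, t2 + 1)]


def pattern2_commands(geo_prefix, matching_keys, t1, t2):
    """One KEYS call plus one ZRANGEBYSCORE per matching Geohash key (Pattern 2)."""
    commands = [f"KEYS {geo_prefix}*"]
    commands += [f"ZRANGEBYSCORE {key} {t1} {t2}" for key in matching_keys]
    return commands


# A 15-minute window with one key per second needs 900 commands in Pattern 1.
print(len(pattern1_commands(0, 899, "10", geo_bits=6)))
print(pattern2_commands("10", ["100110", "101100"], 0, 899))
```

In both patterns the number of commands grows with the size of the queried range, either in time (Pattern 1) or in space (Pattern 2).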
2-6. Range query takes time
Pattern 1
>ZRANGEBYSCORE t1 x1y1…xqyq…00 x1y1…xqyq…11
>ZRANGEBYSCORE t1+1 x1y1…xqyq…00 x1y1…xqyq…11
>ZRANGEBYSCORE t1+2 x1y1…xqyq…00 x1y1…xqyq…11
>ZRANGEBYSCORE t1+3 x1y1…xqyq…00 x1y1…xqyq…11
…
>ZRANGEBYSCORE t2 x1y1…xqyq…00 x1y1…xqyq…11
Search too many Keys. Slow!
Pattern 2
>KEYS x1y1…xqyq*
(return list[i] of all keys that start with x1y1…xqyq )
>ZRANGEBYSCORE list[0] t1 t2
>ZRANGEBYSCORE list[1] t1 t2
…
>ZRANGEBYSCORE list[i] t1 t2
Danger![1] Too slow!
                         Pattern 1   Pattern 2
Turn around time/Query   1.3 s       535 s
Simple test by using 5 redis-servers
(concurrent connections : 256, number of values : 10 million, search only)
[1] : https://redis.io/commands/KEYS
2-7. Range query takes time
Pattern 1
>ZRANGEBYSCORE t1 x1y1…xqyq…00 x1y1…xqyq…11
>ZRANGEBYSCORE t1+1 x1y1…xqyq…00 x1y1…xqyq…11
>ZRANGEBYSCORE t1+2 x1y1…xqyq…00 x1y1…xqyq…11
>ZRANGEBYSCORE t1+3 x1y1…xqyq…00 x1y1…xqyq…11
…
>ZRANGEBYSCORE t2 x1y1…xqyq…00 x1y1…xqyq…11
Turn around time/Query : 1.3 s
Simple test by using 5 redis-servers
(concurrent connections : 256, number of values : 10 million, search only)
Search too many Keys. Slow!
It takes more than 1s.
Let’s reduce the Keys
Wait!
A problem remains!
2-8. Another problem?
Suppose that…
- Tons of cars send data continuously
- Applications require current data
- Multiple Redis-servers are available
[Figure: Cars → redis1, redis2, redis3, …, redisN → Apps]
Pattern 1
-Key : Timestamp (string)
-Score : Geohash (int)
-Value : ID, …
What will happen?
2-8. Load concentration (intensive access)
Cars : “We send current data!”
Apps : “We need current data!”
[Figure: among redis1, redis2, redis3, …, redisN, the server holding the current timestamp key is busy while the others are idle]
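The concentration on the current timestamp key can be reproduced in a few lines. This sketch assumes keys are assigned to servers by hashing the key; CRC32 modulo the node count is only a stand-in for whatever placement the deployment really uses (Redis Cluster, for instance, uses CRC16 modulo 16384 hash slots). The function names are hypothetical.

```python
import zlib
from collections import Counter


def node_for_key(key, n_nodes):
    """Map a key to a node index by hashing the key (stand-in placement rule)."""
    return zlib.crc32(key.encode("utf-8")) % n_nodes


def load_per_node(keys, n_nodes):
    """Count how many writes each node receives for the given sequence of keys."""
    counts = Counter(node_for_key(k, n_nodes) for k in keys)
    return [counts.get(i, 0) for i in range(n_nodes)]


# 1000 cars report within the same second, so all writes use the same time key.
writes = ["1554208215"] * 1000
loads = load_per_node(writes, n_nodes=24)
print(max(loads), loads.count(0))  # one node takes every write, the rest stay idle
```

Because the key depends only on the timestamp, every record of the current second lands on the same server no matter how many servers are available.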
2-8. Load concentration
Simple test by using 24 redis-servers
( concurrent connections : 256 (data insertion only) )
[Figure: CPU usage (%) of redis-servers 1 to 24, showing User/System usage (%) and Idle (%); the usage is labelled “spike”]
Cannot use CPU resource efficiently
2-9. Problems we need to solve
Problem 1.
- ST-range query is slow due to
- searching too many Keys
- using the “KEYS” command
Problem 2.
- ST-data insert is inefficient due to
- load concentration
3. Proposal
3-1. Applying “ST-code”
Morton-curve transform for longitude, latitude, and time
ST-code[1] : x1y1t1 x2y2t2 x3y3t3 … xnyntn
prefix match = range query
[Figure: space is bisected along longitude, latitude and timestamp (from Min. to Max., with the current time marked); cells are labelled 0, 1, 00, 01, 10, 11, 100, 101, …]
[1] Jan Jezek, “STCode : The Text Encoding Algorithm for Latitude/Longitude/Time”, Springer International Publishing Switzerland 2014
3-1. Applying “ST-code”
ST-code : x1y1t1 x2y2t2 x3y3t3 … xnyntn
split (s : where you split)
PRE-code : x1y1t1 … xsysts (express WIDE st-range)
SUF-code : xs+1ys+1ts+1 … xnyntn (express NARROW st-range)
-Key : PRE-code_a
-Score : SUF-code_a
-Value : ID, …
>ZADD PRE-code_a SUF-code_a “ID5, …”
(integer) 1
Don’t make me use the KEYS command!
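The split into PRE-code and SUF-code can be sketched as below. One point worth checking when choosing s: Redis sorted-set scores are double-precision floats, which represent integers exactly only up to 2^53, so this sketch (with the hypothetical helper `split_st_code`) refuses a SUF-code longer than 53 bits.

```python
MAX_EXACT_SCORE_BITS = 53  # doubles hold integers exactly only up to 2**53


def split_st_code(st_code, s):
    """Split a bit-string ST-code after s triples (3*s bits).

    Returns (pre_code, suf_code): the key as a bit string and the score as an int.
    """
    cut = 3 * s
    if not 0 < cut < len(st_code):
        raise ValueError("s must leave both a PRE-code and a SUF-code")
    pre_code, suf_bits = st_code[:cut], st_code[cut:]
    if len(suf_bits) > MAX_EXACT_SCORE_BITS:
        raise ValueError("SUF-code is too long to be stored exactly as a score")
    return pre_code, int(suf_bits, 2)


pre_code, suf_code = split_st_code("110100101", s=1)
print(pre_code, suf_code)  # key "110", score int("100101", 2)
```

With a 96-bit ST-code, for example, s = 15 leaves a 51-bit SUF-code, which still fits.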
3-1. Applying “ST-code”
Query : Search all values of…
- GEOHASH with prefix of ‘x1y1…xqyq ’
- TIMESTAMP between t1 and t2
-Key : PRE-code_a
-Score : SUF-code_a
-Value : ID5, …
>ZRANGEBYSCORE PRE-code_a xs+1ys+1ts+1…xqyqtq…000 xs+1ys+1ts+1…xqyqtq…111
s : where you split
q : length of each dimension for prefix search
(restriction : s < q)
ST-range query only in one command!
Very Fast! Problem solved!?
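The arguments of that single ZRANGEBYSCORE can be derived from the query prefix as sketched here. It assumes the query is given as a 3·q-bit ST-code prefix and that stored codes have n triples; `st_range_query_args` is a hypothetical name.

```python
def st_range_query_args(query_code, s, n):
    """Turn a 3*q-bit query prefix into (key, min_score, max_score).

    s : number of triples in the PRE-code, n : number of triples in a full ST-code.
    """
    q_bits = len(query_code)
    if q_bits % 3 != 0:
        raise ValueError("query prefix must consist of whole x, y, t triples")
    if not (3 * s < q_bits <= 3 * n):
        raise ValueError("restriction: s < q <= n")
    pre_code = query_code[:3 * s]        # selects the key
    fixed = query_code[3 * s:]           # leading bits of the score
    free = 3 * n - q_bits                # remaining bits may take any value
    low = int(fixed, 2) << free          # fixed bits followed by 00...0
    high = low | ((1 << free) - 1)       # fixed bits followed by 11...1
    return pre_code, low, high


key, low, high = st_range_query_args("110100", s=1, n=3)
print(f"ZRANGEBYSCORE {key} {low} {high}")
```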
Problems we need to solve
Problem 1. (Solved by ST-code!)
- ST-range query is slow due to
- searching too many Keys → search only 1 key
- using the “KEYS” command → “ZRANGEBYSCORE”
Problem 2. (not yet)
- ST-data insert is inefficient due to
- load concentration
Don’t forget about this…
3-2. Limited node distribution
• Select multiple nodes based on the hashed value of ST-code(PRE-code).
• Insert to “one” of the selected nodes.
• Search from “all” of the selected nodes.
insert : San Francisco, 7:00, 7:01, 7:02, 7:03, … → avoid load concentration
search : ST-range query (San Francisco, 7:00~7:01) → efficient search
[Figure: selected nodes plotted against time]
#works as above when applying ST-code(PRE-code) as Key
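A minimal sketch of the three rules above. The slides do not say how the multiple nodes are derived from the hash, so this example assumes consecutive node indices starting at CRC32(PRE-code) modulo the node count; that rule and all function names are assumptions for illustration.

```python
import random
import zlib


def selected_nodes(pre_code, n_nodes, n_selected):
    """Pick n_selected node indices from the hash of the PRE-code (assumed rule)."""
    start = zlib.crc32(pre_code.encode("ascii")) % n_nodes
    return [(start + i) % n_nodes for i in range(n_selected)]


def insert_node(pre_code, n_nodes, n_selected, rng=random):
    """Insert goes to ONE of the selected nodes, chosen at random."""
    return rng.choice(selected_nodes(pre_code, n_nodes, n_selected))


def search_nodes(pre_code, n_nodes, n_selected):
    """Search has to ask ALL of the selected nodes."""
    return selected_nodes(pre_code, n_nodes, n_selected)


# 24 redis-servers with 8 selected nodes, as in the experimental conditions.
print(search_nodes("110100", n_nodes=24, n_selected=8))
print(insert_node("110100", n_nodes=24, n_selected=8))
```

Writes for one PRE-code are spread over several servers, while a search still touches only the selected subset instead of every server.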
Problems we need to solve
Problem 1. (Solved by ST-code!)
- ST-range query is slow due to
- searching too many Keys → search only 1 key
- using the “KEYS” command → “ZRANGEBYSCORE”
Problem 2. (Solved by Limited node distribution)
- ST-data insert is inefficient due to
- load concentration → load distribution
3-3. Architecture Overview
(A)ST-code & (B)Limited node distribution are applied.
Cars (insert) : time, lat, lng, value
Application (search) : time, lat, lng
(A)
- calculate ST-code
- split ST-code into PRE-code & SUF-code
(B)
- calculate hashed value of PRE-code
- calculate insert/search node number
Redis (nodes 1 2 3 4 5)
- insert : PRE-code ⇒ “Key”, SUF-code ⇒ “Score”
- search : PRE-code ⇒ “Key”, SUF-code ⇒ range query of “Score”
4. Performance
4-1. Compared methods
1. ST-key method (ST-code & Limited node distribution)
-Key : PRE-code_a
-Score : SUF-code_a
-Value : …
2. Time-key method
-Key : time_a
-Score : geohash_a
-Value : …
4-2. Experimental conditions
         Concurrency (max)   Data size (KB)   Redis server nodes   “selected nodes” for proposed method
insert   640                 10               24                   8
search   320                 10               24                   8
Data inserted (10 million data)
- time : Current timestamp
- longitude, latitude : NY Taxi open data(1), dense/sparse depending on area(2)
- value : ID, speed, etc.
Data searched (100,000 query)
- time range : 15min
- area : 3 km²
(1) : http://www.nyc.gov/html/tlc/html/about/trip_record_data.shtml
(2) : referred from https://toddwschneider.com/posts/analyzing-1-1-billion-nyc-taxi-and-uber-trips-with-a-vengeance/
4-3. System configuration
Client : 8 physical machines, Server : 4 physical machines (24 redis processes)
Software                    version
Client : Jedis              2.9.0
Client : Test Program(TP)   -
MW : Java                   1.8.0
Server : redis              5.0.3
OS : Ubuntu                 16.04.1 LTS
Specification
server (all in common)
- Intel Xeon E5-2618Lv4 10 core 2.2GHz 25M cache x1
- 256GB DDR4 1.2v ECC REG DIMM (32GBx8)
- SSD : 2.5inch 480GB SATA3 ×2
- HDD : 2.5inch 1TB 7200rpm SATA3 ×2
NW : Infiniband SW
- 【Mellanox IB Switch】 MSB7800-ES2F Switch-IB™-2 based EDR InfiniBand 1U Switch, 36 QSFP28 ports
[Figure: Client#1 … Client#8 (each : TP / Jedis / MW / OS) connect through the Infiniband SW to Server#1 … Server#4 (each : OS with six redis processes, Redis1 … Redis24 in total)]
4-4. Insert performance
• ST-key method is 13 times better in throughput, 12 times better in turn around time(TAT).
(concurrency : 256)
                       ST-key   Time-key
Throughput (rec/s)     76000    5000       (×13)
Average TAT (ms/rec)   3        40         (×12)
4-4. Insert performance (CPU resource)
                                   ST-key   Time-key
Average CPU usage of all servers   81%      5%
• ST-key method can fully use CPU resource of servers.
• ST-key method distributed processing load to servers equally.
[Figure: per-server User/System usage (%) and Idle (%) for each method; ST-key is labelled “fully used”, Time-key is labelled “spike”]
4-5. Search performance
• ST-key method is 5 times better in throughput and TAT.
(concurrency : 256)
                         ST-key   Time-key
Throughput (query/s)     3500     680        (×5)
Average TAT (ms/query)   70       360        (×5)
4-5. Search performance (CPU resource)
• ST-key method enables better performance with less CPU usage.
                                   ST-key   Time-key
Average CPU usage of all servers   51%      65%
[Figure: per-server User/System usage (%) and Idle (%) for each method; ST-key is labelled “less CPU”]
4-6. Results in summary
ST-key method : All requirements are satisfied
1. Insert a large volume of ST-data in real time (<10 ms) → 3.3ms/insert
2. Search by ST-range query in real time (<100 ms) → 70ms/query
3. Distribute data equally regardless of density changes → Distribution
[Figure: Cars send Values to redis; Apps send ST-range queries to redis]
5. ST-code generation tips /
Demo(console)
5-1. ST-code generation ( stencode/stencode_naive.py)
def st_encode(lon_input, lat_input, time_input, precision=96):
    # (min, max) of each dimension; longitude spans -180..180, latitude -90..90.
    lon_interval, lat_interval, time_interval = (-180.0, 180.0), (-90.0, 90.0), (0.0, 2018304000.0)
    st_code = ''
    loop = 0
    # Emit one bit per iteration, cycling through longitude, latitude, time.
    while len(st_code) < precision:
        if loop % 3 == 0:
            # Longitude bit: keep the half of the interval that contains the input.
            mid = (lon_interval[0] + lon_interval[1]) / 2
            if lon_input > mid:
                lon_interval = (mid, lon_interval[1])
                st_code += '1'
            else:
                lon_interval = (lon_interval[0], mid)
                st_code += '0'
        elif loop % 3 == 1:
            # Latitude bit.
            mid = (lat_interval[0] + lat_interval[1]) / 2
            if lat_input > mid:
                lat_interval = (mid, lat_interval[1])
                st_code += '1'
            else:
                lat_interval = (lat_interval[0], mid)
                st_code += '0'
        else:
            # Time bit.
            mid = (time_interval[0] + time_interval[1]) / 2
            if time_input > mid:
                time_interval = (mid, time_interval[1])
                st_code += '1'
            else:
                time_interval = (time_interval[0], mid)
                st_code += '0'
        loop += 1
    return st_code
Too naïve!!!
Too slow!!!
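5-1. ST-code generation ( stencode/stencode_fast.py)
# (min, max) of each dimension, in the order longitude, latitude, time.
maxmin = [(-180.0, 180.0), (-90.0, 90.0), (0.0, 2018304000.0)]

def st_encode_FAST(inputs, maxmin, precision=96):
    # inputs : [lon_input, lat_input, time_input], in the same order as maxmin.
    bins = []
    precision = precision // 3  # bits per dimension
    for (i, m) in zip(inputs, maxmin):
        # Scale the value to an integer cell index of `precision` bits.
        tmp = int((i - m[0]) / (m[1] - m[0]) * (2 ** precision))
        # Clamp so that a value on or beyond a bound still fits in `precision` bits.
        tmp = min(max(tmp, 0), 2 ** precision - 1)
        # Zero-pad the binary form to exactly `precision` characters.
        bins.append(format(tmp, 'b').zfill(precision))
    # Interleave the bits of the three dimensions: x1y1t1 x2y2t2 ...
    st_code = ''.join(b1 + b2 + b3 for b1, b2, b3 in zip(bins[0], bins[1], bins[2]))
    return st_code

# Example call : st_encode_FAST([lon_input, lat_input, time_input], maxmin)
Much faster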
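For checking generated codes, the interleaving can be reversed. This is a sketch of a decoder, not part of the original scripts: `st_decode` is a hypothetical helper that returns, for each dimension, the lower bound of the cell and the cell size, assuming the bounds are ordered longitude, latitude, time with longitude in [-180, 180] and latitude in [-90, 90].

```python
def st_decode(st_code, maxmin):
    """De-interleave a bit-string ST-code into (lower_bound, cell_size) per dimension."""
    n_dims = len(maxmin)
    result = []
    for d, (lo, hi) in enumerate(maxmin):
        bits = st_code[d::n_dims]               # every n_dims-th bit belongs to dimension d
        cell = (hi - lo) / (2 ** len(bits))     # size of one cell at this precision
        result.append((lo + int(bits, 2) * cell, cell))
    return result


bounds = [(-180.0, 180.0), (-90.0, 90.0), (0.0, 2018304000.0)]
for lower, size in st_decode("010100", bounds):
    print(lower, size)
```

A shorter code yields larger cells, which is the "short = wide, long = narrow" property seen from the decoding side.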
5-2. Demo (console)
- Data insert (st_insert.py)
- Data search (st_search.py)
[Figure: redis client (st_insert.py, st_search.py / PyPI redis / MW / OS) ↔ redis server (redis / OS)]
- key : PRE CODE
- score : SUF CODE
- value : ID, lat, lng, time
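The insert and search sides of such a demo could look roughly like this. It is a sketch under stated assumptions, not the content of st_insert.py or st_search.py: the two functions are hypothetical, the ST-code is passed in as a bit string, and `client` is any object offering redis-py style `zadd(name, mapping)` and `zrangebyscore(name, min, max)`.

```python
def st_insert(client, st_code, s, value):
    """Store value under key = PRE-code with score = SUF-code (split after s triples)."""
    pre_code, suf_bits = st_code[:3 * s], st_code[3 * s:]
    return client.zadd(pre_code, {value: int(suf_bits, 2)})


def st_search(client, query_code, s, n):
    """Range query: query_code is a 3*q-bit prefix (s < q <= n) of n-triple ST-codes."""
    pre_code = query_code[:3 * s]
    free = 3 * n - len(query_code)               # bits left open by the query
    low = int(query_code[3 * s:], 2) << free     # fixed bits followed by 00...0
    high = low | ((1 << free) - 1)               # fixed bits followed by 11...1
    return client.zrangebyscore(pre_code, low, high)
```

With the PyPI redis package, `client` would typically be a `redis.Redis(...)` instance connected to the server; combining this with node selection means calling `st_insert` on one selected node and `st_search` on all of them.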
Thank you!


Redisconf19: Real-time spatiotemporal data utilization for future mobility services

  • 1. 1Copyright©2019 NTT Corp. All Rights Reserved. Real-time spatiotemporal data utilization for future mobility services Atsushi ISOMURA NTT(Nippon Telegraph and Telephone) Corp., Researcher
  • 2. About me - Atsushi Isomura - Tokyo, Japan - Work - NTT Software Innovation Center - OSS, in-memory, data storage, distribution, etc. - “Spatio-Temporal data processing” - Free time - Develop smartphone apps - Nintendo Switch (Splatoon2 : rank X, SSB Ultimate : Elite) - Baseball - Drive cars (Desire : Drive without traffic jams)
  • 3. INDEX 1. Motivation 2. Spatio-temporal data utilization in redis 3. Proposal 4. Performance 5. ST-code generation tips / Sample codes (Links to code and slides are at the end.)
  • 4. 1. Motivation
  • 5. 1-1. Background These IoT devices keep INCREASING! [Bar charts, in hundred million units: connected vehicles 2.6 (2016) → 5.4 (2020); wearable devices 1.1 (2016) → 3.2 (2020)] ref1 : 2016 estimation of Fuji Keizai Marketing Research & Consulting Group ref2 : 2016 estimation of Yano Research Institute
  • 6. 1-1. Background IoT sensors vs. IoT devices: What’s the difference?
  • 7. 1-1. Background 1. They MOVE every moment! ST (Spatio-Temporal) data:
        Latitude | Longitude | Time | Value
        37.800 | -122.402 | 2019/4/2 12:30:15 | ID:1234, 30km/h
        37.798 | -122.400 | 2019/4/2 12:30:16 | ID:1234, 31km/h
        … | … | … | …
  • 8. 1-1. Background 2. The density CHANGES by location & time [Figure: at one time of day the metropolis density is High and the suburb density is Low; at another time the metropolis is Low and the suburb is High]
  • 9. 1-2. Future mobility services Example 1 : Nearby car crash alert [Figure: a car crash triggers an alert (“Broken car on abc Street”, real-time view) to nearby cars]
  • 10. 1-2. Future mobility services Example 2 : Optimal routing for taxis Input - location of waiting people - event information - traffic jams - etc. Calculate optimal route automatically [Figure: taxis and waiting people around events: “redisconf19 ended”, “train arrived”, “party closed”]
  • 11. 1-2. Future mobility services Example 3 : Drone package delivery “I need to send this package NOW!” → Nearest drone available [Figure: drone and package]
  • 12. 1-3. IoT devices’ features Important features 1. MOVE 2. Density CHANGES Related services require - real-time response - ST-data insert - ST-data search
  • 13. 1-4. Requirements and current technology 1. Insert a bunch of ST-data in real-time (<10ms); cars send over 20M rec/s [1] 2. Search by ST-range query in real-time (<100ms); ST-range query = lng: x1~x2, lat: y1~y2, time: t1~t2 3. Distribute data equally regardless of density changes - All requirements must be satisfied. No mature technology could satisfy all requirements. [Figure: Cars → Data store → Apps] [1] : Fuji Keizai Marketing Research “Connected car related markets and telematics strategy 2017” (Estimation only in Japan)
  • 14. 1-5. Which data store to use? We searched for… - blazingly fast performance - geo features - secondary indexing - data distribution We studied from RedisConf (redisconf17): using “Geohash-encoding” & “Sorted-set” enables ST-data management in redis. Of course we selected “redis”.
  • 15. 2. ST-data utilization in redis
  • 16. 2-1. Related commands https://redis.io/commands Geo related commands Sorted-set related commands Utilize “Geohash[1]” encoding algorithm [1] : http://geohash.org/
  • 17. 2-2. What’s “Geohash”? 2-dimensional longitude(x), latitude(y) → 1-dimensional (Geohash) x1y1x2y2x3y3 … xnyn (n : length of each dimension) length : short=wide, long=narrow [Figure: Morton-curve [1] over the map at level-1 and level-2; the cell containing San Francisco (x, y) is encoded two bits per level]
  • 18. 2-2. What’s “Geohash”? ★Useful feature Prefix match = Range query of longitude & latitude [Figure: nested boxes on the map matched by the prefixes 10…, 1001…, 100110…]
  • 19. 2-3. Insert/Search requirements - Insert : longitude(x), latitude(y), time(t), and value - Search : range query of location and time
        x | y | t | value
        -122.402° | 37.798° | April 2nd 2019 14:10:15 | 30 km/h
        … | … | … | …
        Query : Search all values of… - GEOHASH with prefix of ‘x1y1…xqyq’ - TIMESTAMP between t1 and t2 (q : length of each dimension for prefix search)
  • 20. 2-4. Possible Key-Value design
        Pattern 1. Time key sorted by Geohash: Key = Timestamp (string), Score = Geohash (int), Value = “ID, …”
        >ZADD time_a geohash_a “ID, …”
        (integer) 1
        >GEOADD time_a lng_a lat_a “ID, …”
        (integer) 1
        Pattern 2. Geohash key sorted by Time: Key = Geohash (string), Score = Timestamp (int), Value = “ID, …”
        >ZADD geohash_a time_a “ID, …”
        (integer) 1
        Either of them works fine
  • 21. 2-5. How to search by range Query : Search all values of… - GEOHASH with prefix of ‘x1y1…xqyq’ - TIMESTAMP between t1 and t2 (q : length of each dimension for query)
        Pattern 1 (Key = Timestamp (string), Score = Geohash (int), Value = “ID, …”):
        >ZRANGEBYSCORE t1 x1y1…xqyq…00 x1y1…xqyq…11
        >ZRANGEBYSCORE t1+1 x1y1…xqyq…00 x1y1…xqyq…11
        >ZRANGEBYSCORE t1+2 x1y1…xqyq…00 x1y1…xqyq…11
        >ZRANGEBYSCORE t1+3 x1y1…xqyq…00 x1y1…xqyq…11
        …
        >ZRANGEBYSCORE t2 x1y1…xqyq…00 x1y1…xqyq…11
        Pattern 2 (Key = Geohash (string), Score = Timestamp (int), Value = “ID, …”):
        >KEYS x1y1…xqyq* (return list[i] of all keys that start with x1y1…xqyq)
        >ZRANGEBYSCORE list[0] t1 t2
        >ZRANGEBYSCORE list[1] t1 t2
        …
        >ZRANGEBYSCORE list[i] t1 t2
        query by circle : GEORADIUS instead of ZRANGEBYSCORE
  • 22. 2-6. Range query takes time (commands for Pattern 1 and Pattern 2 as on slide 21) Turn around time/Query: Pattern 1 = 1.3 s (Search too many Keys. Slow!), Pattern 2 = 535 s (Danger![1] Too slow!) Simple test by using 5 redis-servers (concurrent connections : 256, number of values : 10 million, search only) [1] : https://redis.io/commands/KEYS
  • 23. 2-7. Range query takes time Pattern 1: one >ZRANGEBYSCORE per timestamp key from t1 to t2 (as on slide 21) Turn around time/Query: 1.3 s Simple test by using 5 redis-servers (concurrent connections : 256, number of values : 10 million, search only) Search too many Keys. Slow! It takes more than 1s. Let’s reduce the Keys
  • 24. 2-7. Range query takes time (same content as slide 23) Let’s reduce the Keys. Wait! A problem is left!
  • 25. 2-8. Another problem? Suppose that… - Tons of cars send data continuously - Applications require current data - Multiple Redis-servers are available What will happen? [Figure: Cars → redis1, redis2, redis3 … redisN → Apps] Pattern 1: Key = Timestamp (string), Score = Geohash (int), Value = “ID, …”
  • 26. 2-8. Load concentration (intensive access) Cars: “We send current data!” Apps: “We need current data!” [Figure: among redis1, redis2, redis3 … redisN, the server holding the current timestamp key is busy and the others are idle]
  • 27. 2-8. Load concentration (intensive access) Cars: “We send current data!” Apps: “We need current data!” [Figure: among redis1, redis2, redis3 … redisN, the server holding the current timestamp key is busy and the others are idle]
  • 28. 2-8. Load concentration Simple test by using 24 redis-servers (concurrent connections : 256, data insertion only) Cannot use CPU resource efficiently [Graph: CPU usage (%) per redis-server (1 to 24) over time, User/System usage (%) above and Idle (%) below, with the load appearing as a spike]
  • 29. 2-9. Problems we need to solve Problem 1. - ST-range query is slow due to - searching too many Keys - using the “KEYS” command Problem 2. - ST-data insert is inefficient due to - load concentration
  • 30. 3. Proposal
  • 31. 3-1. Applying “ST-code” Morton-curve transform for longitude, latitude, and time ST-code[1] : x1y1t1 x2y2t2 x3y3t3 … xnyntn prefix match = range query [Figure: the timestamp axis from Min. timestamp to Max. timestamp is halved repeatedly around the current time, alongside the Morton-curve cells of the map] [1] Jan Jezek, “STCode : The Text Encoding Algorithm for Latitude/Longitude/Time”, Springer International Publishing Switzerland 2014
  • 32. 3-1. Applying “ST-code” Don’t make me use the KEYS command! ST-code : x1y1t1 x2y2t2 x3y3t3 … xnyntn, split into PRE-code : x1y1t1 … xsysts (express WIDE st-range) and SUF-code : xs+1ys+1ts+1 … xnyntn (express NARROW st-range) (s : where you split) Key = PRE-code_a, Score = SUF-code_a, Value = “ID, …”
        >ZADD PRE-code_a SUF-code_a “ID5, …”
        (integer) 1
  • 33. 3-1. Applying “ST-code” Key = PRE-code_a, Score = SUF-code_a, Value = “ID5, …” Query : Search all values of… - GEOHASH with prefix of ‘x1y1…xqyq’ - TIMESTAMP between t1 and t2
        >ZRANGEBYSCORE PRE-code_a xs+1ys+1ts+1…xqyqtq…000 xs+1ys+1ts+1…xqyqtq…111
        ST-range query only in one command! Very Fast! Problem solved!? (restriction : s < q) s : where you split q : length of each dimension for prefix search
  • 34. Problems we need to solve Problem 1. (Solved by ST-code! Search only 1 key with “ZRANGEBYSCORE”) - ST-range query is slow due to - searching too many Keys - using the “KEYS” command Problem 2. (not yet) - ST-data insert is inefficient due to - load concentration
  • 35. Don’t forget about this…
  • 36. 3-2. Limited node distribution • Select multiple nodes based on the hashed value of ST-code(PRE-code). • Insert to “one” of the selected nodes. • Search from “all” of the selected nodes. [Figure: inserts for San Francisco at 7:00, 7:01, 7:02, 7:03 … each go to one of the selected nodes for that time (avoid load concentration); an ST-range query for San Francisco, 7:00~7:01 searches only the selected nodes (efficient search)] #works as above when applying ST-code(PRE-code) as Key
  • 37. Problems we need to solve Problem 1. (Solved by ST-code! Search only 1 key with “ZRANGEBYSCORE”) - ST-range query is slow due to - searching too many Keys - using the “KEYS” command Problem 2. (Solved by Limited node distribution: load distribution) - ST-data insert is inefficient due to - load concentration
  • 38. 3-3. Architecture Overview (A)ST-code & (B)Limited node distribution are applied. Steps: calculate ST-code → split ST-code into PRE-code & SUF-code → calculate hashed value of PRE-code → calculate insert/search node number Cars (insert): time, lat, lng, value → PRE-code ⇒ “Key”, SUF-code ⇒ “Score” Application (search): time, lat, lng → PRE-code ⇒ “Key”, SUF-code ⇒ range query of “Score” [Figure: Redis nodes 1 to 5]
  • 39. 4. Performance
  • 40. 4-1. Compared methods 1. ST-key method (ST-code & Limited node distribution): Key = PRE-code_a, Score = SUF-code_a, Value = … 2. Time-key method: Key = time_a, Score = geohash_a, Value = …
  • 41. 4-2. Experimental conditions
        Concurrency (max) : insert 640, search 320
        Data size (KB) : 10
        Redis server nodes : 24
        “selected nodes” for proposed method : 8
        Data inserted (10 million data) : time = current timestamp; longitude, latitude = NY Taxi open data(1), dense/sparse depending on area(2); value = ID, speed, etc.
        Data searched (100,000 queries) : time range : 15min, area : 3km2
        (1) : http://www.nyc.gov/html/tlc/html/about/trip_record_data.shtml
        (2) : referred from https://toddwschneider.com/posts/analyzing-1-1-billion-nyc-taxi-and-uber-trips-with-a-vengeance/
  • 42. 4-3. System configuration Client : 8 physical machines Server : 4 physical machines (24 redis processes)
        Software : Client = Jedis 2.9.0; Test Program (TP) = -; MW = Java 1.8.0; Server = redis 5.0.3; OS = Ubuntu 16.04.1 LTS
        Specification (all servers in common) : Intel Xeon E5-2618Lv4 10 core 2.2GHz 25M cache x1; 256GB DDR4 1.2v ECC REG DIMM (32GBx8); SSD : 2.5inch 480GB SATA3 ×2; HDD : 2.5inch 1TB 7200rpm SATA3 ×2
        NW : Infiniband SW 【Mellanox IB Switch】 MSB7800-ES2F Switch-IB™-2 based EDR InfiniBand 1U Switch, 36 QSFP28 ports
        [Figure: Client#1 … Client#8 (each: Jedis, MW, OS, TP) connected through the Infiniband SW to Server#1 (Redis1 to Redis6) … Server#4 (Redis19 to Redis24)]
  • 43. 4-4. Insert performance • ST-key method is 13 times better in throughput, 12 times better in turn around time (TAT). (concurrency : 256) Throughput (rec/s) : ST-key 76000, Time-key 5000 (×13) Average TAT (ms/rec) : ST-key 3, Time-key 40 (×12)
  • 44. 4-4. Insert performance (CPU resource) Average CPU usage of all servers : ST-key 81%, Time-key 5% • ST-key method can fully use CPU resource of servers. • ST-key method distributed processing load to servers equally. [Graphs: User/System usage (%) and Idle (%) per server; ST-key is fully used, Time-key shows a spike]
  • 45. 4-5. Search performance • ST-key method is 5 times better in throughput and TAT. (concurrency : 256) Throughput (query/s) : ST-key 3500, Time-key 680 (×5) Average TAT (ms/query) : ST-key 70, Time-key 360 (×5)
  • 46. 4-5. Search performance (CPU resource) • ST-key method enables better performance with less CPU usage. Average CPU usage of all servers : ST-key 51%, Time-key 65% [Graphs: User/System usage (%) and Idle (%) per server; ST-key uses less CPU]
  • 47. 4-6. Results in summary All requirements are satisfied by the ST-key method 1. Insert bunch of ST-data in real-time (<10ms) : 3.3ms/insert 2. Search by ST-range query in real-time (<100ms) : 70ms/query 3. Distribute data equally regardless of density changes : Distribution [Figure: Cars → redis → Apps]
  • 48. 5. ST-code generation tips / Demo(console)
  • 50. 5-1. ST-code generation (stencode/stencode_naive.py) Too naïve!!! Too slow!!!
        def st_encode(lon_input, lat_input, time_input, precision=96):
            # Value ranges: longitude is -180..180 and latitude is -90..90
            # (the slide had these two swapped); time upper bound as on the slide.
            lon_interval, lat_interval, time_interval = (-180.0, 180.0), (-90.0, 90.0), (0.0, 2018304000.0)
            st_code = ''
            loop = 0
            # Emit one bit per step, cycling longitude -> latitude -> time,
            # and halve the matching interval each time.
            while len(st_code) < precision:
                if loop % 3 == 0:
                    mid = (lon_interval[0] + lon_interval[1]) / 2
                    if lon_input > mid:
                        lon_interval = (mid, lon_interval[1])
                        st_code += '1'
                    else:
                        lon_interval = (lon_interval[0], mid)
                        st_code += '0'
                elif loop % 3 == 1:
                    mid = (lat_interval[0] + lat_interval[1]) / 2
                    if lat_input > mid:
                        lat_interval = (mid, lat_interval[1])
                        st_code += '1'
                    else:
                        lat_interval = (lat_interval[0], mid)
                        st_code += '0'
                else:
                    mid = (time_interval[0] + time_interval[1]) / 2
                    if time_input > mid:
                        time_interval = (mid, time_interval[1])
                        st_code += '1'
                    else:
                        time_interval = (time_interval[0], mid)
                        st_code += '0'
                loop += 1
            return st_code
  • 51. 5-1. ST-code generation (stencode/stencode_fast.py) Much faster
        input = [lon_input, lat_input, time_input]
        # Ranges in the same order as input: longitude, latitude, time
        # (the slide had the longitude and latitude ranges swapped).
        maxmin = [(-180.0, 180.0), (-90.0, 90.0), (0.0, 2018304000.0)]

        def st_encode_FAST(input, maxmin, precision=96):
            bins = []
            precision = int(precision / 3)  # bits per dimension
            for (i, m) in zip(input, maxmin):
                # Scale the value into [0, 2**precision) and take its binary form
                tmp = (i - m[0]) / (m[1] - m[0]) * (2 ** precision)
                tmp = format(int(tmp), 'b')
                n_lost = precision - len(tmp)  # leading zeros dropped by format()
                bins.append('0' * n_lost + tmp)
            # Interleave the three bit strings: x1 y1 t1 x2 y2 t2 ...
            st_code = ''.join(b1 + b2 + b3 for b1, b2, b3 in zip(bins[0], bins[1], bins[2]))
            return st_code
  • 52. 5-2. Demo (console) - Data insert (st_insert.py) - Data search (st_search.py) redis client : st_insert.py and st_search.py on the PyPI redis package, MW, OS redis server : redis, OS Stored layout for both insert and search - key : PRE CODE - score : SUF CODE - value : ID, lat, lng, time
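
The key/score layout on slide 52 (PRE-code as key, SUF-code as score) and the one-command range query on slide 33 can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the code from the speaker's repository: the helper names `split_st_code` and `score_range` and the 12-bit example code are made up here. One caveat that is known for certain: Redis stores sorted-set scores as double-precision floats, so the SUF-code should fit in 53 bits to stay exact.

```python
from typing import Tuple


def split_st_code(st_code: str, split_at: int) -> Tuple[str, int]:
    """Split a bit-string ST-code into (PRE-code key, SUF-code score).

    Hypothetical helper: split_at plays the role of "s" on the slides.
    """
    if not 0 < split_at < len(st_code):
        raise ValueError("split_at must fall strictly inside the ST-code")
    return st_code[:split_at], int(st_code[split_at:], 2)


def score_range(query_prefix: str, split_at: int, total_bits: int) -> Tuple[str, int, int]:
    """Return (key, min score, max score) covering every ST-code that
    starts with query_prefix. The prefix length plays the role of "q"."""
    if not 0 < split_at < total_bits:
        raise ValueError("split_at must fall strictly inside the ST-code")
    if not split_at <= len(query_prefix) <= total_bits:
        raise ValueError("query prefix must be at least as long as the PRE-code")
    key = query_prefix[:split_at]
    rest = query_prefix[split_at:]
    pad = total_bits - len(query_prefix)
    low = int(rest + "0" * pad, 2)   # remaining bits all 0
    high = int(rest + "1" * pad, 2)  # remaining bits all 1
    return key, low, high


# Toy 12-bit ST-code split after 6 bits.
key, score = split_st_code("101100111010", 6)
q_key, low, high = score_range("10110011", 6, 12)
print(key, score)        # 101100 58
print(q_key, low, high)  # 101100 48 63
```

With redis-py 3.x or later, these values would feed `r.zadd(key, {member: score})` on insert and `r.zrangebyscore(q_key, low, high)` on search.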

Editor's Notes

  1. Thank you, Mr. Chairman, for your kind introduction, and thank you to all the organizers who worked on preparing the conference. It’s such an honor to be here. I hope this session will help developers who are struggling with data utilization for real-time services. So, the title of this session is real-time spatiotemporal data utilization for future mobility services.
  2. Allow me to briefly introduce myself. I’m Atsushi Isomura from Tokyo, Japan. I know it’s hard to pronounce, so just call me SUSHI. I work for NTT, a Japanese telecommunications company that manages most of the telephone lines in Japan, with 280 thousand workers all over the world. We also work on cloud technology and big data analysis. Right now, I’m working on in-memory storage technology that deals with Spatio-temporal data. In my free time, I make apps, play games, mainly Nintendo Switch, and play baseball. Also, I love to drive cars, and I wish I could drive without traffic jams. In Japan, the land area is very small, but there are too many cars on the road. This is what makes driving very boring, and it is one of the big problems that I want to solve with Spatio-Temporal data processing.
  3. So, today’s talk will proceed in this order. The links of the sample codes and slides will be shown at the end of my presentation.
  4. Let’s start off by talking about the motivation.
  5. As you know, recently, IoT devices such as connected vehicles, wearable devices, and drones are spreading everywhere. The number of these devices will keep increasing more and more. It is expected that connected vehicles will reach 5.4 hundred million in the world, and wearable devices will reach 3.2 hundred million in the world. Well, we all know that.
  6. However, compared to the ordinary IoT “sensors” inside factories and buildings, what do you think is the difference about these IoT devices? From the point of data management we think that there are two main differences.
  7. The first difference is that they MOVE every moment. Compared to the non-moving “sensors” inside factories and buildings, data collected from these devices differ in latitude, longitude, and time. This type of data is what we call Spatio-Temporal Data. You can see here that, even if the device ID is the same, latitude, longitude, and time change every moment. In this talk, Spatio-temporal will be abbreviated as ST since it’s too long for me to say.
  8. The second difference is that the density of the devices changes by location and time. In this image I’m showing how vehicles move from the suburbs to metropolis depending on time. For example, during the daytime, vehicles tend to gather in the metropolis for work and shopping and the density of metropolis becomes high. On the other hand, the density of suburb becomes low. However, during the night, these vehicles go home in the suburbs and the density becomes opposite compared to daytime.
  9. Let’s take a look at some of the future mobility services that will take place in the next few years. One example is “Nearby car crash alert”. Let’s assume that someone crashed near our conference center. The cars running nearby want to know what’s happening in real-time, but vehicles running far from this area don’t need to know it. So, we need to send an alert and information just to the nearby running cars. To realize this application, we need to find all cars that are running near the crash at current time.
  10. Let’s take a look at another example. People waiting for taxis will use their smart phones to send their location and time simultaneously. Also, information of events, traffic jams, etc will be acquired. In the future, taxis will pick up passengers by calculating the optimal route automatically based on the collected ST-data.
  11. How about drone package delivery? Many companies will start this service for a more flexible delivery. If the user wants to send a package to somewhere now, the system needs to find the nearest drone available with no packages. This service also deals with ST-data since we need to search drones that keep moving.
  12. So, the important features of these IoT devices that we need to consider are: they move, and the density changes by location and time. Also, the future services related to these devices require real-time response for ST-data insert and ST-data search.
  13. Well, what’s so difficult about realizing these services? As I said in the beginning, a bunch of ST-data is sent from a massive number of devices every second in real-time. It is estimated that over 20 million records will be sent from cars every second in Japan in the year 2024. At the same time, future mobility applications require ST-range queries, which means searching by the range of longitude, latitude, and time, in real-time. We defined the requirement for insertion as less than 10ms for each data item, and for search as less than 100ms for each query. And, in addition, the data store needs to consider the density changes of IoT devices. Here, we found a problem: there is no mature technology that could satisfy all requirements. So, we started off by choosing the appropriate data store.
  14. To satisfy our requirements, we searched for a data store that has “blazingly fast performance”, “geo features”, and “secondary indexing”. And, by watching the speaker from Lyft at RedisConf, we found out that “Geohash-encoding” & “Sorted-set” could realize ST-data management in redis. So, of course we selected redis as our data store.
  15. So now, let’s think how we can utilize ST-data inside redis.
  16. First, here are some of the Geo-related commands and Sorted-set-related commands that could be used for ST-data management. For instance, GEOADD could be used to apply the information of latitude and longitude as the score of sorted-set. Also, ZADD could be used to apply any value for the score of sorted-set. When searching data of sorted-set, we can use ZRANGEBYSCORE command. OK, so as I mentioned before, redis utilizes the Geohash coding algorithm for the geo-related commands.
  17. What’s geohash? We usually talk about location as 2-dimensional data of longitude and latitude. Well, geohash is a coding algorithm that can transform this 2-dimensional data into a 1-dimensional code by using the Morton-curve, which some people call the Z-curve. If we want to encode the x and y of San Francisco, we first split the earth into 4 blocks. The first two bits will be “10” in this case. The next two bits will be “01”, then the next two bits will be “10”. This bit interleaving process is repeated until it reaches the precision that you want to express. So, the shorter the code is, the wider the area becomes. Conversely, the longer the code is, the narrower the area becomes. This 1-dimensional code enables faster data insertion and it matches redis’s simple Key-value data structure.
  18. There’s another useful feature of Geohash. This Geohash code can be used for range queries of longitude and latitude by matching the prefix. As shown in this image, the range query of the red box can be expressed as matching the prefix “10”. In the same manner, the purple box and the orange box can be expressed by adding digits to “10”. As a result, a one-dimensional prefix match of geohash can substitute for a two-dimensional range query of latitude and longitude.
  19. So let’s review the requirements for insert data and search queries. We want to insert ST-data that consists of longitude, latitude, time, and value. For example, x is -122 degrees, y is 37 degrees, which is San Francisco, and time is … now, with a value of something like 30km/h. Then we want to submit an ST-range query that requests the data corresponding to a particular prefix match of GEOHASH and a particular range of TIMESTAMP. This query can find data that is inside a certain location and time period.
  20. In this situation, we came up with two possible key-value designs for redis. The first design pattern is applying Timestamp as the key, and Geohash as the score of the sorted set. The redis commands that can be used for inserting are ZADD and GEOADD. If we use ZADD, the Geohash score needs to be calculated beforehand. The second design is applying Geohash as the key, and Timestamp as the score of the sorted set. The command that can be used for inserting is ZADD. We thought that these are the two main key-value designs that we can generally prepare in Redis.
  21. Next, we needed to find out how we can search by range query for these two patterns. As I said in the requirements, a GEOHASH prefix and a range of TIMESTAMP must be searched. If we use the first pattern, we need to submit a ZRANGEBYSCORE command for each Timestamp-key that needs to be searched. For example, searching a Timestamp range of 10 minutes means searching 600 Timestamp-keys. For the second pattern, we first need to submit the KEYS command to acquire all GEOHASH keys that have a particular prefix. Afterwards, we need to submit a ZRANGEBYSCORE command for each Geohash-key that needs to be searched.
  22. As you can easily imagine, the second pattern will not help us out. It is too slow since it requires the dangerous “KEYS” command for searching the list of keys that match a particular prefix. In addition, both patterns require searching too many keys. This leads to slow performance, as shown in the table here. Pattern 1 takes 1.3 seconds and Pattern 2 takes more than 500 seconds for 1 range query.
  23. So we thought, “ok let’s reduce the keys to search” for design pattern 1.
  24. But wait! We have to think about a different problem that is still left.
  25. If we suppose that tons of vehicles send data continuously and applications require data of current time and multiple redis-servers are available, what will happen?
  26. Here is the answer. No matter where the cars are running, all of them will send data stamped with the “current time”. On the other hand, mobility applications search for data that relates to the “current time”. If redis1 has the current “timestamp key”, all of the workload concentrates on this machine. The other redis servers will be in an idle state, which is very inefficient for the entire system.
  27. Even if the time changes, the current timestamp key moves to a different redis server. This causes the same problem of “load concentration” again and again.
  28. Here is an example of the CPU usage when conducting ST-data insert using 24 redis-servers. The horizontal axis of this graph is time, and the vertical axis of the upper half represents the CPU usage of User and System. The lower half represents the Idle percentage of CPU. You can see that the workload concentrates temporarily on a particular redis-server as a spike. This graph demonstrates that the key-value design with “current timestamp as key” cannot use CPU resources efficiently.
  29. Let’s review the problems that we need to solve. ST-range query is slow because of searching too many keys, or using the “KEYS” command. ST-data insert is inefficient due to load concentration.
  30. So now, let me introduce our proposal to solve the problems.
  31. First of all, I have introduced “geohash” encoding in the beginning. In several technical papers, there is an another encoding algorithm proposed called ST-code. This is a 1-dimensional code that is calculated by applying morton-curve to 3-dimensional value of latitude, longitude, “and time”. First, we define the minimum timestamp and maximum timestamp that you will deal with. Then, depending on current time, we divide timestamp in half repeatedly just like geohash. The bit array of ST-code will be a repeat of x, y, t and so on. So, the shorter the code is, the wider the area and longer the time period becomes. In opposition, the longer the code is, the narrower the area and shorter the time period becomes. By using this code, a range query of longitude, latitude, and time can be replaced with prefix match.
  32. We asked ourselves, “how can we insert and search this ST-code without the need of using the dangerous KEYS command?” Here is our approach. First, we split the ST-code in two parts, “PRE-code” and “SUF-code”. This simply means the prefix part and the suffix part. The PRE-code that express the WIDE st-range is stored as the Key of a sorted set. Then the SUF-code that express the NARROW st-range is stored as the Score of a sorted set. We used ZADD to realize this key-value design.
  33. This design pattern enables a range query of GEOHASH and TIMESTAMP in only one command. When using the ZRANGEBYSCORE command, the Key can be defined by calculating the PRE-code that expresses the range of GEOHASH and TIMESTAMP. The minimum and the maximum of the Score range can also be calculated from GEOHASH and TIMESTAMP. The “s” and “q” should be decided beforehand depending on what kind of query you are planning to submit. So, the range query became very fast with one command, and the problem is solved!
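One way to derive the Key and the Score range for such a query is sketched below. It assumes the query region corresponds to a single ST-code prefix that is at least as long as the PRE-code; the names `score_range`, `st_search`, `pre_bits`, and `total_bits` are hypothetical, and `client` is expected to be a redis-py connection whose `zrangebyscore(name, min, max)` method issues ZRANGEBYSCORE.

```python
def score_range(query_prefix, pre_bits, total_bits):
    """Return (key, min_score, max_score) covering every code under the prefix."""
    if not pre_bits <= len(query_prefix) <= total_bits:
        raise ValueError("prefix must be between pre_bits and total_bits long")
    key = query_prefix[:pre_bits]
    fixed = query_prefix[pre_bits:]               # suffix bits fixed by the query
    free_bits = total_bits - len(query_prefix)    # suffix bits left open
    low = (int(fixed, 2) if fixed else 0) << free_bits
    high = low | ((1 << free_bits) - 1)
    return key, low, high


def st_search(client, query_prefix, pre_bits, total_bits):
    """Fetch all members in the queried ST-range with one ZRANGEBYSCORE."""
    key, low, high = score_range(query_prefix, pre_bits, total_bits)
    return client.zrangebyscore(key, low, high)


print(score_range("0110", pre_bits=3, total_bits=6))   # ('011', 0, 3)
```

The open suffix bits are filled with all zeros for the minimum and all ones for the maximum, so a single contiguous score interval covers the whole prefix.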
  34. Well, with ST-code we solved the first problem by searching only 1 key and by using the ZRANGEBYSCORE command. However, the second problem is not solved yet.
  35. We cannot forget about this load concentration even if we use ST-code as the key of a sorted set.
  36. So, this is our second proposal, “Limited Node Distribution”. The basic idea of limited node distribution is to select “multiple nodes” based on the hashed value of the ST-code, which expresses a particular location and time. The data insert is conducted on “one” of the selected nodes, and the data search is conducted on “all” of the selected nodes. Let’s say that the number of selected nodes is 2. In this example, ST-data obtained near San Francisco at 7:00 AM goes to either this node or this node. From these, the insert node is decided randomly; in this case, let’s choose this node. With this method, each insert goes to “one” of the selected nodes. The combination of the selected nodes differs every time, like this. So, the insertion can be conducted while avoiding load concentration. When searching data obtained near San Francisco from 7:00 AM to 7:01 AM, the search goes to this node and this node, which are “all” of the selected nodes. The search is also efficient because it is conducted not on all nodes but only on the selected nodes.
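A minimal sketch of this node selection follows. The talk does not spell out how the selected nodes are derived from the hash, so the choice of SHA-256 and of consecutive node numbers starting at `hash % num_nodes` is purely an assumption for illustration, as are the function names.

```python
import hashlib
import random


def selected_nodes(st_prefix, num_nodes, num_selected):
    """Pick `num_selected` node numbers from the hash of an ST-code prefix.

    Assumption: nodes are consecutive, starting at hash % num_nodes.
    """
    digest = hashlib.sha256(st_prefix.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % num_nodes
    return [(start + i) % num_nodes for i in range(num_selected)]


def insert_node(st_prefix, num_nodes, num_selected):
    """An insert goes to ONE randomly chosen node among the selected nodes."""
    return random.choice(selected_nodes(st_prefix, num_nodes, num_selected))


def search_nodes(st_prefix, num_nodes, num_selected):
    """A search goes to ALL of the selected nodes, and only to those."""
    return selected_nodes(st_prefix, num_nodes, num_selected)


nodes = search_nodes("011", num_nodes=24, num_selected=8)
print(insert_node("011", 24, 8) in nodes)   # True
```

A stable hash is used instead of Python's built-in `hash()`, which is randomized per process for strings and would send inserts and searches to different nodes.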
  37. So now, the two problems are solved by applying ST-code and Limited node distribution.
  38. The architecture overview including ST-code and Limited node distribution is shown here. In brief, insert and search both start by calculating the ST-code and splitting it into PRE-code and SUF-code. From the hashed value of the PRE-code, the insert or search node number is calculated. Insert is conducted only on “one” of the selected nodes, and search is conducted on “all” of the selected nodes. I understand that it’s a little complicated, so please ask me questions later if anything is unclear.
  39. Next I’ll introduce the performance.
  40. We conducted an experiment to compare these two methods that I’ve already introduced. One is the ST-key method which is the proposal, and the other is the Time-key method with the Timestamp as Key and Geohash as the Score.
  41. Here are the experimental conditions. Each data size is 10KB and we prepared 24 redis server nodes. The number of “selected nodes” for our proposed method is 8. We created the test data-set based on the longitude and latitude acquired from NY taxi open data. To make it realistic, we applied the “Current Timestamp” during the experiment as the time. Each range query covers a time range of 15 minutes and an area of 3 square kilometers. The data-set that we used is sparse or dense depending on the area. So, there is a lot of data near Manhattan, but less data in the suburbs. This data-set is very realistic, so it could simulate the concentration of cars running in New York.
  42. The system configuration in detail is listed here. A total of 8 physical machines were prepared for the redis-clients, and 4 physical machines were prepared for the redis-servers. The server machines had 256GB of memory, but the other specifications are not especially powerful. We used Jedis as the interface to redis, and the version of redis was 5.0.3.
  43. This is the result of the insert performance. The graphs here indicate that the insert performance was 13 times better in throughput and 12 times better in turnaround time (TAT) for each record. Throughput reached over 76,000 records per second, and the TAT was 3.3 ms per record.
  44. This improvement was mainly due to the efficient usage of CPU resources. Again, the horizontal axis of this graph is time, and the vertical axis of the upper half represents the CPU usage of User and System. The lower half represents the Idle percentage of CPU. The ST-key method proved that we can fully use CPU resources and that the processing load is distributed equally to all redis-servers. As mentioned before, the Time-key method could not use CPU resources efficiently due to load concentration.
  45. Next is the result of the search performance. The graphs here indicate that the search performance was 5 times better in throughput and 5 times better in turnaround time for each query. Throughput reached over 3,500 queries per second, and the TAT was 70 ms per query.
  46. This improvement was mainly due to the efficient search conducted with 1 command per query. The ST-key method proved that its search performance is better than that of the Time-key method, with less CPU usage.
  47. I will summarize the results. The combination of blazingly fast redis and the ST-key method satisfied all of our requirements at once. Insert took 3.3 ms per record, which satisfied the 10 ms turnaround-time requirement. Search took 70 ms per query, which satisfied the 100 ms turnaround-time requirement.
  48. So now, let’s move on to the sample code and Demo.
  49. The code that I will explain from here on is uploaded to GitHub under the account sushi-boy.
  50. This is how we first coded the ST-code generation. For each dimension of longitude, latitude, and time, you have to calculate the “0, 1” bit by dividing the range by 2 again and again. This algorithm worked correctly; however, you can clearly see that there are too many if statements, and of course it was very slow. So, we made a different one.
  51. The new, faster ST-code generator looks like this. The idea is basically based on bit operations. The bits of each dimension are calculated beforehand, and the bit interleaving process is conducted afterwards. By reducing the number of if statements, it became much faster than the naïve one. This code is also uploaded at the link I will show later.
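The idea of computing each dimension's bits first and interleaving them afterwards can be sketched as follows. This is not the code from the slides or the repository; the function names, the quantization formula, and the toy time window of 0 to 100 are assumptions chosen for illustration.

```python
def quantize(value, low, high, bits):
    """Map a value in [low, high] to an integer cell index of `bits` bits."""
    cells = 1 << bits
    index = int((value - low) / (high - low) * cells)
    return min(max(index, 0), cells - 1)   # clamp so value == high stays valid


def interleave3(x, y, t, bits):
    """Interleave bits as x, y, t, x, y, t, ... starting from the top bit."""
    code = 0
    for i in range(bits - 1, -1, -1):
        code = (code << 3) | (((x >> i) & 1) << 2) | (((y >> i) & 1) << 1) | ((t >> i) & 1)
    return code


def fast_st_code(lon, lat, t, t_min, t_max, bits):
    """Quantize each dimension once, then interleave using only bit operations."""
    x = quantize(lon, -180.0, 180.0, bits)
    y = quantize(lat, -90.0, 90.0, bits)
    z = quantize(t, t_min, t_max, bits)
    return interleave3(x, y, z, bits)


code = fast_st_code(-122.402, 37.800, 75, 0, 100, bits=2)
print(format(code, "06b"))   # 011001
```

Each dimension needs one multiplication instead of one comparison per bit, and the interleaving loop contains no branches on the data.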
  52. So here, let me explain how to do a simple insert and search using ST-code. The sample program on GitHub assumes that the system is a simple client-server architecture. Since I wrote it in Python, it uses the PyPI redis package for the connection. First, “data insert” can be conducted by using st_insert.py. Simply execute the Python file and it will insert a value by applying the PRE CODE as the key and the SUF CODE as the score with the ZADD command. The console will look like this. It shows the generated ST CODE, PRE CODE, and SUF CODE based on the latitude, longitude, and time input. The program submits the ZADD command to redis through the Python API. Next, “data search” can be conducted by using st_search.py. Again, simply execute the Python file and it will search for values by applying the PRE CODE as the key and the SUF CODE range based on the range of latitude, longitude, and time. The program submits the ZRANGEBYSCORE command to redis through the Python API. You will be able to acquire the values that match the query.
  53. Thank you for listening!