Basavaraj Soppannavar
Sr. Strategist, IoT
Toshiba America Research Inc.
Purpose-built In-Memory NoSQL Database for Internet of Things
5th Aug 2017
Los Angeles
Agenda
• Internet of Things
• IoT Data & its properties
• GridDB
• Real Use Cases
GridDB by Toshiba 2
Internet of Things
Internet of Things Predictions
Number of Connected Devices
By 2020 the number of connected devices will be
• 50 Billion – Cisco
• 28.1 Billion* – IDC
• 20.8 Billion* – Gartner
*not including smartphones & computers
Most IoT smart devices aren’t in your home or phone; they are in factories, businesses, and healthcare – Intel Infographics
• 40.2% in Business and Manufacturing
• 30.3% in Healthcare
IoT Revenue projections
• $300 Billion – Gartner
• $470 Billion – Bain
IoT Economics
Technology Stack of IoT
Layers and examples:
• Applications: Mobile, Web, Business Apps
• Analytics & AI: BI, Visualization, Data Mining, DPP* Analytics, Machine Learning
• Data Storage and Retrieval: GridDB, HBase, Cassandra, MongoDB, MS-SQL, Hadoop
• Data Aggregation / Processing: Storm, Kafka, Fluentd, RabbitMQ
• Session / Communication: CoAP, MQTT, DDS, XMPP, AMQP, HTTP
• Transport: IPv4, IPv6
• Link: Ethernet, WiFi, Bluetooth, BLE, Zigbee, Z-Wave, RFID, 2G, 3G, LTE
• Connectivity: Wireless, USB, RJ45 (Ethernet), DSL
• Device: Sensors, Embedded chips, Cameras, Wearables
Alongside the layers: Device and Data Management; Security and Privacy
*Descriptive, Predictive, Prescriptive
Toshiba’s Full Stack Solution for IoT & Big Data
GridDB NoSQL Database
IoT Data & Databases
Properties of IoT Data
• Periodic
• Large volume but small record size
• Structured
• Time-stamped

Timestamp            Voltage  Current  Temperature
2017/05/03 10:45:00  100      0.64     20.5
2017/05/03 10:45:30  101      0.63     20.4
2017/05/03 10:46:00  99       0.65     20.5
...
Single record: size less than 100 bytes
Total: millions of records
Database Requirements of IoT
• Highly available & fault tolerant
• Great read and write performance for millions of records
• Time series data & operations support
• Fast search and range queries
• Spatial and geo-location support
• Real-time streaming support
• Support for ever-increasing data (scale out)
Evolution of Database Management Systems
[Diagram: timeline of database categories across the 90s, the 2000s and today]
• 90s: RDBMS
• 2000s: RDBMS (operational / transactional database) and OLAP / DW (data warehouse for BI and analytics)
• Today:
  • RDBMS: MySQL, Postgres
  • NoSQL DBs
    • Key Value Store: Riak, Aerospike
    • Wide Column Store: Cassandra, HBase
    • Document Store: MongoDB, Couchbase
    • Graph Store: Neo4j
  • Hadoop: Cloudera, Hortonworks
  • OLAP / DW: Teradata, Vertica, GreenPlum
OLAP – Online Analytical Processing
DW – Data Warehouse
Inspired by: https://practicalanalytics.co/2015/06/02/the-maturing-nosql-ecoystem-a-c-level-guide/
GridDB
A Purpose-built In-Memory NoSQL Database for IoT
What is GridDB?
Highly Scalable
In Memory
Distributed
Key-Value
IoT Database
GridDB – Highly Scalable Database for IoT
Highly Scalable Distributed Key-Container Database
NoSQL Data Models
• GridDB has a unique Key-Container data model
• Container can be visualized as a table of a Relational Database
• Fixed schema
Key Container Data Model
• A container is a set of records with a schema
• GridDB supports 2 types of containers
  • Collection container – for generic records management
  • Time-series container – for time series records management
• The Key-Container model provides
  • Data consistency within the container (ACID is guaranteed within the container)
  • Faster data retrieval and search because of the schema
  • TQL, an SQL-like query language for reading data from the containers
Key Container Data Model - Example
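Schema definition:
import java.util.Date;
import com.toshiba.mwcloud.gs.RowKey;
import com.toshiba.mwcloud.gs.TimeSeries;

static class SMData {
    @RowKey Date timestamp;  // row key: time stamp of the reading
    int voltage;
    double current;
    int temp;
}
Creating a time-series (TS) container; the container name "SM101" is the “Key” and SMData.class supplies the schema (store is an already connected GridStore):
TimeSeries<SMData> ts = store.putTimeSeries("SM101", SMData.class);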
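The slide's snippet needs a running GridDB cluster and the GridDB client library. To show the key-container idea on its own, here is a minimal stand-in that uses only the JDK. It is a sketch, not the GridDB API: the class `KeyContainerSketch` and its methods `put` and `range` are hypothetical names, and the timestamps are assumed to be UTC.

```java
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Stand-in for the key-container model using only the JDK; NOT the GridDB client API.
final class KeyContainerSketch {
    // Fixed schema of one row, mirroring SMData from the slide.
    static final class SMData {
        final Instant timestamp; // row key
        final int voltage;
        final double current;
        final int temp;

        SMData(Instant timestamp, int voltage, double current, int temp) {
            this.timestamp = timestamp;
            this.voltage = voltage;
            this.current = current;
            this.temp = temp;
        }
    }

    // Container name (the "key") -> rows sorted by their timestamp row key.
    private final Map<String, NavigableMap<Instant, SMData>> containers = new HashMap<>();

    // Returns the named container, creating an empty one on first use.
    NavigableMap<Instant, SMData> putTimeSeries(String name) {
        return containers.computeIfAbsent(name, k -> new TreeMap<>());
    }

    // Adds a row; a row with the same timestamp is overwritten.
    void put(String name, SMData row) {
        putTimeSeries(name).put(row.timestamp, row);
    }

    // Rows with from <= timestamp < to.
    NavigableMap<Instant, SMData> range(String name, Instant from, Instant to) {
        return putTimeSeries(name).subMap(from, true, to, false);
    }

    public static void main(String[] args) {
        KeyContainerSketch store = new KeyContainerSketch();
        Instant t0 = Instant.parse("2017-05-03T10:45:00Z"); // UTC is an assumption
        store.put("SM101", new SMData(t0, 100, 0.64, 20));
        store.put("SM101", new SMData(t0.plusSeconds(30), 101, 0.63, 20));
        store.put("SM101", new SMData(t0.plusSeconds(60), 99, 0.65, 20));
        int rows = store.range("SM101", t0, t0.plusSeconds(60)).size();
        System.out.println(rows + " rows in the first minute");
    }
}
```

Keeping each container sorted by its row key is what makes the time-range read a cheap sub-map lookup rather than a scan.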
High Performance
GridDB’s hybrid composition of In-Memory and Disk architecture is optimized for maximum performance
[Diagram: GridDB 4-node cluster. Each node/server has its own SSD/HDD; labels read "Memory from multiple nodes" and "Add new nodes"]
In-Memory + Disk Hybrid: excess data from memory is saved to SSD/disk
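The idea of keeping hot data in memory and moving the excess to disk can be sketched with a bounded map that spills its oldest entry into a second store. The slides do not say how GridDB decides what leaves memory, so the least-recently-used policy below is an assumption, and `HybridStoreSketch` is a hypothetical class that uses a plain map to stand in for the SSD/HDD.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: models "excess data goes to disk" with an LRU policy.
// The policy is an assumption, not a description of GridDB internals.
final class HybridStoreSketch {
    private final Map<String, String> disk = new HashMap<>(); // stands in for SSD/HDD
    private final LinkedHashMap<String, String> memory;

    HybridStoreSketch(final int memoryCapacity) {
        // accessOrder = true keeps the least recently used entry first.
        this.memory = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                if (size() > memoryCapacity) {
                    disk.put(eldest.getKey(), eldest.getValue()); // spill, do not drop
                    return true;
                }
                return false;
            }
        };
    }

    void put(String key, String value) {
        disk.remove(key);        // the fresh copy lives in memory
        memory.put(key, value);
    }

    String get(String key) {
        String hit = memory.get(key);
        if (hit != null) {
            return hit;
        }
        String cold = disk.remove(key);
        if (cold != null) {
            memory.put(key, cold); // promote back into memory, possibly spilling another key
        }
        return cold;
    }

    boolean inMemory(String key) { return memory.containsKey(key); }

    boolean onDisk(String key) { return disk.containsKey(key); }

    public static void main(String[] args) {
        HybridStoreSketch store = new HybridStoreSketch(2);
        store.put("a", "1");
        store.put("b", "2");
        store.put("c", "3"); // memory holds 2 entries, so "a" is spilled
        System.out.println("a on disk: " + store.onDisk("a"));
    }
}
```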
YCSB Performance Results
• Both databases were tested on identical hardware (MS Azure Standard_D2: dual-core CPU, 7 GB RAM per node)
• 1 client per core; 128 threads per client
*Tests performed by Fixstars
[Chart: Throughput - 16 nodes. Avg. throughput ('000 ops/sec) by YCSB workload (A, B, C, D, F), GridDB vs. Cassandra]
[Chart: Throughput - 32 nodes. Avg. throughput ('000 ops/sec) by YCSB workload (A, B, C, D, F), GridDB vs. Cassandra]
[Chart: Read Latency - 16 nodes. Latency in microseconds by YCSB workload (A, B, C, D, F), GridDB vs. Cassandra]
The Yahoo! Cloud Serving Benchmark (YCSB) comparison of GridDB and Cassandra shows that*
• Average throughput of GridDB is 4x-5x higher than that of Cassandra
• Average latency of GridDB is 3x-4x lower than that of Cassandra
Superior Stability
[Chart: YCSB Workload A 24-hour stability test. Throughput (ops, 0 to 14,000) vs. elapsed time (seconds, 0 to 90,000; markers at 3, 9, 15, 21 and 25 hrs), GridDB vs. Cassandra]
Tests performed by Fixstars
High Availability
Advanced Master-Slave Model - Hybrid Cluster Management
• No Single Point of Failure (SPOF) – Master node is selected automatically
• No Split Brain – Quorum Policy is applied
Autonomous Data Distribution
• Data distribution and failover are taken care of automatically
[Diagram: Hybrid Cluster Management and Failover. Five nodes (Node 1 to Node 5), one marked Master, each holding Original and Replica data; Data Replication between nodes; a cached Data Distribution Table; three clients; "Add new nodes"]
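A quorum policy avoids split brain because two disjoint groups of nodes can never both hold a majority. The check can be sketched as below. The strict-majority rule is an assumption about what "Quorum Policy" means here, and `QuorumSketch.hasQuorum` is a hypothetical helper, not a GridDB function.

```java
// Sketch of a majority-quorum check; the strict-majority rule is an assumption.
final class QuorumSketch {
    // A group of nodes may keep operating only if it contains more than half the cluster.
    static boolean hasQuorum(int reachableNodes, int clusterSize) {
        if (clusterSize <= 0 || reachableNodes < 0 || reachableNodes > clusterSize) {
            throw new IllegalArgumentException("invalid node counts");
        }
        return reachableNodes > clusterSize / 2;
    }

    public static void main(String[] args) {
        int clusterSize = 5; // the slide's diagram shows Node 1 to Node 5
        for (int reachable = 0; reachable <= clusterSize; reachable++) {
            System.out.println(reachable + " of " + clusterSize
                    + " reachable -> quorum: " + hasQuorum(reachable, clusterSize));
        }
    }
}
```

With five nodes, a network split into groups of 3 and 2 leaves only the group of 3 with quorum, so at most one side continues to accept work.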
Time Series Features
• TDPA: GridDB implements a Time Series Data Placement Algorithm for high-frequency data to maximize memory utilization
• Expiry Release Function: a data retention period can be set so that old data is released and storage is freed
• Aggregate Functions: MIN, MAX, AVG, VARIANCE, STDDEV
• Sampling and Interpolation Functions: TIME_INTERPOLATED, TIME_SAMPLING, TIME_NEXT, TIME_PREVIOUS
• Trigger functions: JMS and REST notifications
GridDB is optimized for Time-Series operations
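To make the time-series operations concrete, the sketch below implements three of them over a sorted in-memory map: an average, a linear interpolation at an arbitrary time, and expiry of rows older than a retention period. It is written in the spirit of the functions listed above and is not GridDB's implementation; `TimeSeriesOpsSketch` and its method names are hypothetical, and timestamps are plain seconds for brevity.

```java
import java.util.Map;
import java.util.TreeMap;

// JDK-only sketch of time-series operations; not GridDB code.
final class TimeSeriesOpsSketch {
    // Mean of all values, or NaN for an empty series.
    static double avg(TreeMap<Long, Double> series) {
        return series.values().stream()
                .mapToDouble(Double::doubleValue)
                .average()
                .orElse(Double.NaN);
    }

    // Value at time t, linearly interpolated between the neighbouring rows.
    static double interpolate(TreeMap<Long, Double> series, long t) {
        Map.Entry<Long, Double> before = series.floorEntry(t);
        Map.Entry<Long, Double> after = series.ceilingEntry(t);
        if (before == null || after == null) {
            throw new IllegalArgumentException("t is outside the recorded range");
        }
        if (before.getKey().equals(after.getKey())) {
            return before.getValue(); // a row exists exactly at t
        }
        double fraction = (double) (t - before.getKey()) / (after.getKey() - before.getKey());
        return before.getValue() + fraction * (after.getValue() - before.getValue());
    }

    // Removes rows whose timestamp is older than now - retentionSeconds.
    static void expire(TreeMap<Long, Double> series, long now, long retentionSeconds) {
        series.headMap(now - retentionSeconds, false).clear();
    }

    public static void main(String[] args) {
        // Voltage readings every 30 seconds, as in the sample table on the earlier slide.
        TreeMap<Long, Double> voltage = new TreeMap<>();
        voltage.put(0L, 100.0);
        voltage.put(30L, 101.0);
        voltage.put(60L, 99.0);
        System.out.println("avg = " + avg(voltage));
        System.out.println("value at t=15 = " + interpolate(voltage, 15));
        expire(voltage, 90, 60);
        System.out.println("rows after expiry = " + voltage.size());
    }
}
```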
Real Use Cases
1. Building Energy Management Systems
2. Smart Meters – Electric Power Company
3. Smart City – Ishinomaki City
1. Building Energy Management Systems
• 100+ buildings are managed by the BEMS company in Kawasaki, Japan
• The BEMS company manages over 1 petabyte (a million gigabytes) of sensor data each year
• Each sensor produces 5 MB of data per day on average, or approximately 2 GB per year
• Each building has 100-1000 sensors depending on its floor area (sq ft), which comes to about 1 TB of collected sensor data per building per year
GridDB was used for its easy scalability, simple data model and Time Series querying & functions
2. Smart Meters – Electric Power Company
• One of Japan’s top electric power companies switched from a relational database to GridDB
• Throughput increased 2,250-fold over the old system
• Overall processing time went down considerably
• Data center costs were reduced significantly
GridDB was used for its high performance, large data handling and reduced cost
2. Smart Meters – Electric Power Company
• In production since April 2016
• Data from 3 million smart meters is collected every 30 minutes and stored for 3 months
• Data size is approximately 2.6 TB
• 13 billion records
• Record size of 200 bytes
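The record count and data size on this slide follow from the collection interval and the retention period. The sketch below redoes the arithmetic, assuming 48 readings per meter per day (one every 30 minutes), 90 days for "3 months", and decimal terabytes; `SmartMeterSizing` is a hypothetical helper.

```java
// Back-of-the-envelope sizing for the smart meter deployment.
// Assumptions: 48 readings per day, 3 months = 90 days, 1 TB = 10^12 bytes.
final class SmartMeterSizing {
    // Total number of stored records.
    static long records(long meters, int readingsPerDay, int retentionDays) {
        return meters * readingsPerDay * retentionDays;
    }

    // Total size in decimal terabytes.
    static double terabytes(long records, int bytesPerRecord) {
        return records * (double) bytesPerRecord / 1e12;
    }

    public static void main(String[] args) {
        long n = records(3_000_000L, 48, 90);
        System.out.printf("%d records, %.2f TB%n", n, terabytes(n, 200));
    }
}
```

Under these assumptions the result is about 12.96 billion records and about 2.59 TB, which agrees with the slide's rounded figures of 13 billion records and 2.6 TB.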
[Diagram: smart meter data flow through the MDMS. Labels: 3 million smart meters (SM), Data Input, GridDB (twice), RDB, MapReduce (twice), Charge Cal., Imbalance Cal., 30 Min. Balancing, Read Value App, AppServer, Preliminary Results, Usage, Power Retailers. Cluster sizes shown: 3 node cluster (three times), 4 node cluster, 5 node cluster, and an Active-Standby Cluster]
SM – Smart Meter
MDMS – Meter Data Management System
RDB – Relational Database
3. Smart City – Disaster-tolerant Ishinomaki City
GridDB was used for its high-speed processing of large data, long-term data retention, and ability to maintain consistency
Post-2011 disaster recovery plan of Ishinomaki city
PoC of Consignment Charge Calculation System
• Data from 30 million smart meters is collected every 30 minutes and stored for 1 month
• Data size is approximately 8.6 TB
• 43 billion records
• Record size of 200 bytes
• A 1-month charge calculation over the data of 30 million meters was executed in 96 minutes
[Diagram: PoC data flow through the MDMS. Labels: 30 million smart meters (SM), Data Input (30M data), GridDB, 6 node cluster, 8.6TB, MapReduce, 5 node cluster, Associating Contract Info. (30M data), Charge Calculation (43G records), Imbalance (43G records). Execution times shown for the stages: 1 min 47 secs, 9 mins, 30 mins, 55 mins]
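The four execution times on this slide can be added up to check the 96-minute total quoted for the charge calculation. The sketch assumes the stages run one after another with no overlap, which the slide does not state; `PocTimingSketch` is a hypothetical helper.

```java
import java.time.Duration;
import java.util.List;

// Sums the stage execution times from the slide.
// Assumption: the stages run sequentially, with no overlap.
final class PocTimingSketch {
    static Duration total(List<Duration> stages) {
        Duration sum = Duration.ZERO;
        for (Duration stage : stages) {
            sum = sum.plus(stage);
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Duration> stages = List.of(
                Duration.ofMinutes(1).plusSeconds(47),
                Duration.ofMinutes(9),
                Duration.ofMinutes(30),
                Duration.ofMinutes(55));
        Duration sum = total(stages);
        long roundedMinutes = Math.round(sum.getSeconds() / 60.0);
        System.out.println(sum.getSeconds() + " seconds, about " + roundedMinutes + " minutes");
    }
}
```

The stages add up to 95 minutes 47 seconds, which rounds to the 96 minutes stated in the bullet list.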
GridDB: Editions, Languages, Connectors
GridDB Editions
GridDB on Amazon AWS Marketplace
Languages and Connectors
• GridDB Community Edition is open source and is available on GitHub
  • https://github.com/griddb
• Currently supports Java, C/C++, REST, Python & Ruby interfaces
• Go, PHP, Perl and JavaScript drivers will be added in the coming months
• The MapReduce connector is available on GitHub
  • https://github.com/griddb/griddb_hadoop_mapreduce
• The KairosDB connector is available on GitHub
  • https://github.com/griddb/griddb_kairosdb
• The Spark connector was recently released on GitHub
  • https://github.com/griddb/griddb_spark
• A Kafka-GridDB integration blog post is available at www.griddb.net
GridDB feature set
Horizontal scaling is near-linear and works great on commodity hardware
• Tested on 100 nodes per cluster, can scale up to 1000 nodes
GridDB's advanced master-slave model eliminates SPOF and split brain
Autonomous data distribution prevents data loss
ACID transactions are guaranteed at the container level
TQL, an SQL-like language for fast querying and analytics
GridDB’s hybrid composition of In-Memory and Disk architecture is optimized for maximum performance
GridDB is custom designed for IoT and other use cases that involve Time Series operations
• TS data types, temporal based querying, geometry type and BLOB types are supported
• Vector sets data type support is in development
Useful Links
• Developers’ website - www.griddb.net
• Toshiba GridDB website - http://solutions.toshiba.com/overview.html
• GitHub repository - https://github.com/griddb
• Quick Start Guide - http://www.griddb.net/en/docs/GridDB_QuickStartGuide.html
• Technical Reference - http://www.griddb.net/en/docs/GridDB_TechnicalReference.pdf
• API Reference - http://www.griddb.net/en/docs/GridDB_API_Reference.html
Contact
Basavaraj Soppannavar
Sr. Strategist, IoT
Basavaraj.Soppannavar@toshiba.com
@griddbcommunity
Follow GridDB
Thank you
ADDITIONAL INFO
Yahoo! Cloud Serving Benchmark (YCSB)
YCSB
The Yahoo! Cloud Serving Benchmark is an open source benchmarking suite designed by Yahoo Labs for comparative performance evaluation of NoSQL database management systems
• YCSB is used by DBMS vendors for ‘Benchmark Comparison’
• Traditional benchmarks such as those from the TPC (Transaction Processing Performance Council) are used to compare RDBMSs
• YCSB measures/compares various attributes of the DBMS such as Latency, Throughput, Durability, Scalability, Availability, Read/Write optimization, Sync/Async replication etc.
YCSB has 2 main parts
• YCSB Client – an extensible workload generator
  • The standard workloads generated by the client can also be extended into user-defined workloads to be run against the system (the DBMS)
• YCSB Core Workloads – a set of scenarios generated by the client to run on the system under test
  • Core workloads give a well-rounded picture of the performance of the system under test
YCSB Workloads
YCSB has 6 core workloads
• Workload A - Update heavy: This workload has a mix of 50/50 reads and writes. An application example is a session store recording recent actions
• Workload B - Read mostly: This workload has a 95/5 reads/write mix. Application example: photo tagging; add a tag is an update, but most operations are to read tags
• Workload C - Read only: This workload is 100% read. Application example: user profile cache, where profiles are constructed elsewhere (e.g., Hadoop)
• Workload D - Read latest: In this workload, new records are inserted, and the most recently inserted records are the most popular. Application example: user status updates; people want to read the latest
• Workload E - Short Ranges: In this workload, short ranges of records are queried, instead of individual records. Application example: threaded conversations, where each scan is for the posts in a given thread (assumed to be clustered by thread id)
• Workload F - Read-modify-write: In this workload, the client will read a record, modify it, and write back the changes. Application example: user database, where user records are read and modified by the user or to record user activity
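A workload is essentially a set of operation proportions from which the client draws the next operation. The sketch below shows that mechanism only; it is not YCSB's code, and `YcsbMixSketch` is a hypothetical class. The ratios for workloads A to C come from the descriptions above, while the 95/5 and 50/50 splits used for D, E and F are taken from YCSB's published core workload definitions rather than from this slide.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;

// Sketch of choosing operations by proportion; not YCSB's implementation.
final class YcsbMixSketch {
    enum Op { READ, UPDATE, INSERT, SCAN, READ_MODIFY_WRITE }

    // Operation proportions per core workload. D, E and F ratios are not on the slide.
    static Map<Op, Double> mix(char workload) {
        Map<Op, Double> m = new LinkedHashMap<>();
        switch (workload) {
            case 'A': m.put(Op.READ, 0.50); m.put(Op.UPDATE, 0.50); break;
            case 'B': m.put(Op.READ, 0.95); m.put(Op.UPDATE, 0.05); break;
            case 'C': m.put(Op.READ, 1.00); break;
            case 'D': m.put(Op.READ, 0.95); m.put(Op.INSERT, 0.05); break;
            case 'E': m.put(Op.SCAN, 0.95); m.put(Op.INSERT, 0.05); break;
            case 'F': m.put(Op.READ, 0.50); m.put(Op.READ_MODIFY_WRITE, 0.50); break;
            default: throw new IllegalArgumentException("unknown workload " + workload);
        }
        return m;
    }

    // Draws the next operation according to the cumulative proportions.
    static Op next(Map<Op, Double> mix, Random rng) {
        double u = rng.nextDouble();
        double cumulative = 0.0;
        Op last = null;
        for (Map.Entry<Op, Double> e : mix.entrySet()) {
            cumulative += e.getValue();
            last = e.getKey();
            if (u < cumulative) {
                return last;
            }
        }
        return last; // guards against rounding when u is very close to 1
    }

    public static void main(String[] args) {
        Random rng = new Random(42);
        Map<Op, Double> workloadB = mix('B');
        int reads = 0;
        int draws = 10_000;
        for (int i = 0; i < draws; i++) {
            if (next(workloadB, rng) == Op.READ) {
                reads++;
            }
        }
        System.out.println("Workload B: " + reads + " reads out of " + draws + " operations");
    }
}
```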
