BASEL BERN BRUGG DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. GENEVA
HAMBURG COPENHAGEN LAUSANNE MUNICH STUTTGART VIENNA ZURICH
Big Data Solution Architectures
29.9.2016 – DOAG 2016 Big Data Days
Guido Schmutz
Trivadis
Guido Schmutz
Working for Trivadis for more than 19 years
Oracle ACE Director for Fusion Middleware and SOA
Co-author of several books
Consultant, Trainer, Software Architect for Java, SOA & Big Data / Fast Data
Member of Trivadis Architecture Board
Technology Manager @ Trivadis
More than 25 years of software development experience
Contact: guido.schmutz@trivadis.com
Blog: http://guidoschmutz.wordpress.com
Slideshare: http://www.slideshare.net/gschmutz
Twitter: gschmutz
29.9.2016 Big Data Solution Architectures
Agenda
1. Introduction
2. Big Data Reference Architectures
• Traditional Big Data
• Event / Stream-Processing
• Lambda Architecture
• Kappa Architecture
• Unified Architecture
3. Big Data Ecosystem – many choices sorted!
Introduction
Why Talk About Big Data Architectures?
Choosing the right architecture is key for any (big data) project
Big Data is still a rather young field and therefore a “moving target”
No standard architectures that have been used for years are available yet
In the past years, some architectures and best practices have evolved
Know your use cases before choosing your architecture / technologies
Having a reference architecture in place helps in choosing the right/matching technologies
How to do Big Data? Why is a structure / architecture important?
Big Data Ecosystem – many choices sorted!
Important Properties for Choosing a (Big) Data Architecture
Latency
Keep raw and uninterpreted data “forever”?
Volume, Velocity, Variety, Veracity
Ad-hoc query capabilities needed?
Robustness & Fault Tolerance
Scalability
…
Big Data Reference Architectures -
Traditional Big Data
“Traditional Architecture” for Big Data
[Figure: architecture diagram. Data Sources (Content, RDBMS, Social, ERP, Logfiles, Sensor, Machine); Data Ingestion (Pushing Ingestion, Pulling Ingestion, Channel); (Analytical) Data Processing (Raw Data (Reservoir), Batch compute, Computed Information, Result Store, Query Engine); Data Consumer (Reports, Service, Analytic Tools, Alerting Tools). Legend: Data in Motion / Data at Rest.]
“Traditional Architecture” for Big Data – Hadoop Technology Mapping
[Figure: the traditional architecture diagram (Data Sources, Data Ingestion, (Analytical) Data Processing, Result Store, Data Consumer) with Hadoop technologies mapped onto its building blocks.]
“Traditional Architecture” for Big Data – Spark Technology Mapping
[Figure: the traditional architecture diagram (Data Sources, Data Ingestion, (Analytical) Data Processing, Result Store, Data Consumer) with Spark technologies mapped onto its building blocks.]
“Traditional Architecture” for Big Data – Feeding in High-Volume Event Streams
[Figure: the traditional architecture diagram (Data Sources, Data Ingestion, (Analytical) Data Processing, Result Store, Data Consumer), with two question marks on how high-volume event streams are fed in.]
Traditional Architecture for Big Data
• Batch Processing - “Data at Rest”
• Not for low-latency use cases
• Responses are delivered “after the fact”
• Maximum value of the identified situation is lost
• Decisions are made on old and stale data
• Spark Core is a faster alternative to Hadoop MapReduce, but it is still batch processing
• The Spark ecosystem offers many additional advanced analytics capabilities (machine learning, graph processing, …)
Big Data Reference Architectures –
Event/Stream Processing
Event / Stream Processing – “Data in Motion”
“Data in motion”
Events are analyzed and processed in real time as they arrive
Decisions are timely, contextual and based on fresh data
Decision latency is eliminated
Event / Stream Processing Architecture
[Figure: architecture diagram. Data Sources (Content, Logfiles, Social, RDBMS, ERP, Sensor, Machine); Data Ingestion (Channel); (Analytical) Real-Time Data Processing (Messaging, Stream/Event Processing, Result Store); Data Consumer (Reports, Service, Analytic Tools, Alerting Tools). Legend: Data in Motion / Data at Rest.]
Continuous Ingestion
[Figure: continuous ingestion. Sources: DB Source (Log, CDC), IoT Sensor, File Source (Log), Social. Ingestion paths: CDC GW, Connect, Native, IoT GW, MQTT GW (Queue), REST, Dataflow GW / Dataflow. Target: Topics on an Event Hub, feeding Stream Processing and Big Data.]
Challenges for Ingesting Sensor Data
Multitude of sensors
Real-Time Streaming
Multiple Firmware versions
Bad Data from damaged sensors
Regulatory Constraints
Data Quality
Enabling Continuous Data Ingestion
• SQL Polling
• Change Data Capture (CDC)
• File Stream (File Tailing)
• File Stream (Streaming Appender)
• Sensor Stream
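Two of the ingestion styles listed here, SQL polling and file tailing, can be illustrated with a minimal Python sketch. It is not taken from the slides: the `events` table, the `app.log` file and both function names are assumptions made for illustration. Each poller remembers a high-water mark (last row id, last byte offset) so that every call returns only data that arrived since the previous call.

```python
import os
import sqlite3
import tempfile


def poll_new_rows(conn, last_id):
    """SQL polling: fetch rows whose id is above the last one already seen."""
    rows = conn.execute(
        "SELECT id, payload FROM events WHERE id > ? ORDER BY id", (last_id,)
    ).fetchall()
    new_last_id = rows[-1][0] if rows else last_id
    return rows, new_last_id


def tail_new_lines(path, offset):
    """File tailing: read complete lines appended after the given byte offset."""
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    # Consume only up to the last newline, so a half-written line
    # is left for the next poll.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8").splitlines()
    return lines, offset + end


def demo():
    # Hypothetical source table, kept in memory for the sketch.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
    conn.executemany("INSERT INTO events (payload) VALUES (?)", [("a",), ("b",)])
    first, last_id = poll_new_rows(conn, 0)
    conn.execute("INSERT INTO events (payload) VALUES (?)", ("c",))
    second, last_id = poll_new_rows(conn, last_id)

    # Hypothetical log file that is appended to between two polls.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "app.log")
        with open(path, "w", encoding="utf-8") as f:
            f.write("line1\nline2\npart")
        lines1, offset = tail_new_lines(path, 0)
        with open(path, "a", encoding="utf-8") as f:
            f.write("ial\n")
        lines2, offset = tail_new_lines(path, offset)
    return first, second, lines1, lines2
```

Polling is simple but adds latency of up to one polling interval and puts load on the source; log-based CDC avoids both, which is presumably why the slide lists it as a separate option.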
Event / Stream Processing Architecture – Open Source Technology Mapping
[Figure: the event / stream processing architecture diagram (Data Sources, Data Ingestion, (Analytical) Real-Time Data Processing, Result Store, Data Consumer) with open source technologies mapped onto its building blocks.]
Event / Stream Processing Architecture – Oracle Technology Mapping
[Figure: the event / stream processing architecture diagram (Data Sources, Data Ingestion, (Analytical) Real-Time Data Processing, Result Store, Data Consumer) with Oracle technologies mapped onto its building blocks.]
Event / Stream Processing Architecture
The solution for low-latency use cases
Process each event separately => low latency
Process events in micro-batches => higher latency, but better reliability
Previously known as “Complex Event Processing”
Keep the data moving / Data in Motion instead of Data at Rest => raw events are not stored
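The difference between per-event and micro-batch processing can be sketched in a few lines of Python. This is an illustration only: the threshold rule, the function names and the count-based batching are assumptions (real engines typically cut micro-batches by time interval rather than by event count).

```python
from typing import Iterable, Iterator, List


def process_per_event(events: Iterable[float], threshold: float) -> Iterator[str]:
    """Handle each event as soon as it arrives (lowest latency)."""
    for value in events:
        if value > threshold:
            yield f"ALERT {value}"


def micro_batches(events: Iterable[float], batch_size: int) -> Iterator[List[float]]:
    """Group the stream into small batches of at most batch_size events."""
    batch: List[float] = []
    for value in events:
        batch.append(value)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # flush the last, possibly incomplete batch
        yield batch


def process_micro_batch(
    events: Iterable[float], threshold: float, batch_size: int
) -> Iterator[str]:
    """Handle events batch by batch; an alert waits until its batch is complete."""
    for batch in micro_batches(events, batch_size):
        hot = [v for v in batch if v > threshold]
        if hot:
            yield f"ALERT {hot}"


readings = [20.0, 95.5, 21.0, 99.0, 22.0]  # hypothetical sensor values
per_event_alerts = list(process_per_event(readings, threshold=90.0))
batch_alerts = list(process_micro_batch(readings, threshold=90.0, batch_size=2))
```

In the per-event variant the alert for 95.5 is emitted immediately; in the micro-batch variant it is emitted only once its batch is full, which is the latency cost the slide refers to.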
Event / Stream Processing Architecture - Keep raw event data
[Figure: the event / stream processing architecture diagram, extended with an (Analytical) Batch Data Processing block that holds the Raw Data (Reservoir). Legend: Data in Motion / Data at Rest.]
Big Data Reference Architectures -
Lambda Architecture for Big Data
“Lambda Architecture” for Big Data
[Figure: architecture diagram. Data Sources (Content, RDBMS, Social, ERP, Logfiles, Sensor, Machine); Data Ingestion (Pulling Ingestion, Channel); (Analytical) Batch Data Processing (Raw Data (Reservoir), Batch compute, Computed Information, Result Store, Query Engine); (Analytical) Real-Time Data Processing (Messaging, Stream/Event Processing, Result Store); Data Consumer (Reports, Service, Analytic Tools, Alerting Tools). Legend: Data in Motion / Data at Rest.]
Lambda Architecture for Big Data
Combines (Big) Data at Rest with (Fast) Data in Motion
Closes the latency gap left by high-latency batch processing
Keeps the raw information forever
Makes it possible to rerun analytics operations on the whole data set if necessary
=> because the old run had an error or
=> because we have found a better algorithm we want to apply
Functionality has to be implemented twice
• Once for batch
• Once for real-time streaming
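A minimal Python sketch of the Lambda idea, under stated assumptions: the event shape (`{"sensor": ...}`), the counting logic and all names are hypothetical and not from the slides. It shows why the functionality exists twice: the same count is computed once over the full raw data set (batch) and once incrementally (real-time), and a query merges both results.

```python
from collections import Counter


def batch_view(raw_events):
    """Batch path: recompute the view from the complete raw data set."""
    return Counter(event["sensor"] for event in raw_events)


class SpeedView:
    """Real-time path: count only events that arrived since the last batch run."""

    def __init__(self):
        self.counts = Counter()

    def on_event(self, event):
        self.counts[event["sensor"]] += 1

    def reset(self):
        # Called after a batch run has absorbed the recent events.
        self.counts.clear()


def query(batch, speed, sensor):
    """Query side: merge the batch result with the real-time result."""
    return batch[sensor] + speed.counts[sensor]


# Raw events are kept "forever", so the batch view can always be rebuilt.
raw = [{"sensor": "s1"}, {"sensor": "s2"}, {"sensor": "s1"}]
batch = batch_view(raw)
speed = SpeedView()

# A new event arrives after the batch run: store it raw and count it in real time.
new_event = {"sensor": "s1"}
raw.append(new_event)
speed.on_event(new_event)
count_before_rerun = query(batch, speed, "s1")

# Rerunning the batch job over the whole data set gives the same answer.
batch = batch_view(raw)
speed.reset()
count_after_rerun = query(batch, speed, "s1")
```

Because the raw events are retained, a corrected or improved `batch_view` can be rerun over the whole history at any time.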
Big Data Reference Architectures -
“Kappa” Architecture
“Kappa Architecture” for Big Data
[Figure: architecture diagram. Data Sources (Content, RDBMS, Social, ERP, Logfiles, Sensor, Machine); Data Ingestion (Messaging, labelled “Raw Data Reservoir”); (Analytical) Real-Time Data Processing (Messaging, Stream/Event Processing, Result Store); Raw Data (Reservoir) and Computed Information; Data Consumer (Reports, Service, Analytic Tools, Alerting Tools). Legend: Data in Motion / Data at Rest.]
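In the Kappa diagram the messaging layer itself is labelled “Raw Data Reservoir”. One common reading of this, sketched below as an assumption rather than as the author's implementation, is that raw events stay in an append-only log and any result store is derived by replaying that log through a single stream job. The `EventLog` class, the event shape and the two transform functions are hypothetical.

```python
class EventLog:
    """Append-only log standing in for a messaging layer that retains raw events."""

    def __init__(self):
        self._events = []

    def append(self, event):
        self._events.append(event)
        return len(self._events) - 1  # offset of the appended event

    def read_from(self, offset=0):
        return iter(self._events[offset:])


def build_view(log, transform):
    """Derive a result store by replaying the whole log through one stream job."""
    view = {}
    for event in log.read_from(0):
        key, value = transform(event)
        view[key] = view.get(key, 0) + value
    return view


def count_events(event):
    """First version of the job: count events per sensor."""
    return event["sensor"], 1


def sum_values(event):
    """Improved version of the job: sum the measured values per sensor."""
    return event["sensor"], event["value"]


log = EventLog()
log.append({"sensor": "s1", "value": 2.0})
log.append({"sensor": "s2", "value": 1.5})
log.append({"sensor": "s1", "value": 3.0})

view_v1 = build_view(log, count_events)
# Changing the logic means replaying the same log with the new job,
# not maintaining a separate batch implementation.
view_v2 = build_view(log, sum_values)
```

Compared with the Lambda sketch style, there is only one code path; reprocessing is a replay of the log.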
Big Data Reference Architectures -
“Unified” Architecture
“Unified Architecture” for Big Data
[Figure: architecture diagram. Data Sources (Content, RDBMS, Social, ERP, Logfiles, Sensor, Machine); Data Ingestion (Channel); (Analytical) Batch Data Processing (Calculate Models of incoming data) (Raw Data (Reservoir), Batch compute, Computed Information, Result Store, Query Engine); Prediction Models; (Analytical) Real-Time Data Processing (Messaging, Stream/Event Processing, Result Store); Data Consumer (Reports, Service, Analytic Tools, Alerting Tools). Legend: Data in Motion / Data at Rest.]
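The diagram labels the batch side “Calculate Models of incoming data” and shows “Prediction Models” next to the real-time side. A minimal Python sketch of that split, assuming a deliberately simple model (mean and standard deviation) and a hypothetical three-sigma rule, neither of which comes from the slides:

```python
from statistics import mean, pstdev


def train_model(raw_values):
    """Batch side: derive a simple model from the stored raw data."""
    return {"mean": mean(raw_values), "std": pstdev(raw_values)}


def score_stream(events, model, max_sigma=3.0):
    """Stream side: flag each incoming value that deviates too far from the model."""
    for value in events:
        deviation = abs(value - model["mean"])
        yield value, deviation > max_sigma * model["std"]


history = [20.0, 21.0, 19.0, 20.0]   # hypothetical raw data at rest
model = train_model(history)          # recomputed periodically by the batch job
scored = list(score_stream([20.5, 30.0], model))
```

The model is computed on data at rest and then applied to data in motion, so the stream job stays cheap while the expensive computation runs in batch.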
Big Data Ecosystem – many choices sorted!
Building Blocks for (Big) Data Processing
• Data Acquisition
• Format
• File System
• Stream Processing
• Batch SQL
• Graph DBMS
• Document DBMS
• Relational DBMS
• Visualization
• IoT
• Messaging
• Analytics
• OLAP DBMS
• Query Federation
• Table-Style DBMS
• Key Value DBMS
• Batch Processing
• In-Memory
Big Data Ecosystem – many choices sorted!
NoSQL Datastores
Organizing NoSQL Datastores – Different Types
• Key Value Store
  K1 → V1
  K2 → V2
  K3 → V3
• Document Store
  { k1: v1, k2: v2, k3: [v1, v2, v3] }
• Wide-column Store
  Rowkey RK1: CK1 = V1, CK2 = V2, CK3 = V3, CK4 = V4, …
  Rowkey RK2: CK1 = V1, CK4 = V4, CK6 = V6, …
  Rowkey …: …, CK3 = V3
• Graph Store
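The datastore types on this slide can be mirrored with plain Python data structures. This is only a sketch of the data models, not of any product: the key value, document and wide-column samples reuse the placeholder keys from the slide, while the graph sample and the helper `get_column` are assumptions added for illustration.

```python
# Key value store: an opaque value looked up by key.
key_value_store = {"K1": "V1", "K2": "V2", "K3": "V3"}

# Document store: a nested, self-describing document.
document = {"k1": "v1", "k2": "v2", "k3": ["v1", "v2", "v3"]}

# Wide-column store: each row key owns its own, possibly different, set of columns.
wide_column_store = {
    "RK1": {"CK1": "V1", "CK2": "V2", "CK3": "V3", "CK4": "V4"},
    "RK2": {"CK1": "V1", "CK4": "V4", "CK6": "V6"},
}

# Graph store (hypothetical sample, not on the slide): nodes and labelled edges.
graph_store = {
    "nodes": {"n1": {"name": "A"}, "n2": {"name": "B"}},
    "edges": [("n1", "KNOWS", "n2")],
}


def get_column(store, row_key, column_key, default=None):
    """Read one cell of a wide-column row; missing columns are simply absent."""
    return store.get(row_key, {}).get(column_key, default)


present = get_column(wide_column_store, "RK2", "CK6")
absent = get_column(wide_column_store, "RK2", "CK2")
```

The wide-column rows show the sparse layout from the slide: RK2 has no CK2 or CK3 column at all, rather than holding empty values for them.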
Organizing NoSQL Datastores – and the Products
[Figure: products grouped by type: Key Value Store, Wide-column Store, Document Store, Graph Store.]
Guido Schmutz
Technology Manager
guido.schmutz@trivadis.com
