50 Shades of
Data - how,
when and why
Big, Fast,
Relational,
NoSQL, Elastic,
Event, CQRS
On the many types of
data, data stores and data
usages
Dutch Oracle Architects Platform | 6th February 2018
What is data?
• A solidified representation of
• An observation [of a fact]
• A concept
• Serialized in order to be
• Understood & processed by machines
• Reproduced for human consumption
When things were simple
RDBMS
SQL
ACID
Data files
Log Files
Backup
SAN
And then stuff happened
Middle Tier: Java EE (Stateful) application
Client Tier: Browser
Mobile App (offline)
Data Warehouse
OO, XML, JSON
Content Management
Big Data
Fast Data
API
µ λ
Explosion of Data Store technologies
RDBMS
SQL
ACID
V4
Data
Tagcloud
Business Areas
External Actors
• Customers
• Supplier
• Gov Agency
• Shipping
• Data providers
Inside the Enterprise
• Marketing & Campaigns
• Sales & Customer Service
• Customer Management
• Order Management
• Supplier & Product Management
• Inventory & Warehousing
• Finance (Accounts, Invoices)
• Security
• Output (print & mail, email, SMS, …)
• Data Department (Consolidation, MI, Reporting, Analysis and R&D)
Business Applications
• B2B Partner Portal
• Customer Service SaaS
• Mobile App
• Custom Application for Product Catalog
• IoT Gateways & Hub
• SaaS ERP
• Enterprise Content Management System
• Human Workflow Engine
• Mail Server
• Data Warehouse
• SaaS CRM
• Custom Order Management Application
• B2B APIs
• Open Data APIs
• DaaS Services
• SaaS CX (Campaigns, Social Media Monitor, 360 Customer View)
• LDAP for Users, Roles & Permissions
• WebShop Portal
• Recommendation Engine
• Enterprise Dashboard & BI & Reporting
• Security & Compliance Monitor
• Desktop Tools
• Communication & Collaboration tools
• Asset Tracker
Business Applications & IT Systems
• Big Data Lake
• Logging Collector & Monitor & Analyzer
• Monitor for Application & Infra metrics
• Source Code Control System
• API Gateway
• Service Bus
• Event Bus
• Rule Engine
• Desktop Browser
• Mobile Devices
• Email / Facebook / WhatsApp
• Corporate Database
• File Storage
• Job Scheduling
• Application Server
• Private Blockchain
• Docker Container Registry
• Microservices Platform
• Kubernetes Container Management
Business & IT - Data
• List of Products shown in UI
• Personal Profile, Order and Payments Details
• Smart Contracts with supply chain details
• Recent Consumer purchases information
• Footage from security cameras
• Readings from motion detectors
• Emails regarding customer complaints
• Spreadsheets with Sales records
• Log-files from IT systems (infra & platform)
• WebShop activity, Social Media discussions, …
• ML Models
• In Flight Messages
• Events
• Job Schedules
• Application & Infrastructure source history
• Offers, invoices, rewards messages
• Shopping Cart with selected items
• Order Details
• API usage, billing, policies
• Running & Past workflow instances
• Sales Aggregates by Day, Region, Product Category
• Invoices & Payments
• Product Manuals
• Digital Twin
• KPIs & Alerts
• Customer Interaction records
• Case files (Complaints, Requests)
• Rules & Rule Execution metrics
• Weather, Demographics, Sports, Social, …
• Config data
• Customer Details
• Audit Trails, Security Incidents
• Programming in progress
• User Stories, Designs, Discussions
• Copy of Production Data in Acceptance
Volume
Data Volume
Big Data Lake
Machine Learning models
Long term history Data Warehouse
Big Lots of data
Small chunks of off line data
Piles of log-files
Fine grained events
Gathering – never purging?
Small payloads
Medium size – structured data
Rule meta-data (very small)
Compression
• Technical Compression
• Same data, fewer bits to store
• Same time – or even longer – to process
• Logical Compression
• Filter (older than, one in X)
• Reduce granularity (helicopter view)
• Average over geographical area
• Min/Max/Average per minute/hour/day
• Is typically done in data warehouse & digital twin
• Could be done for query stores and even for big data sets
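The two kinds of compression can be sketched in a few lines of Python. This is only an illustration: the sensor readings, the 60-second bucket size and the function names are assumptions made for the example, and zlib stands in for whatever codec a real data store would use.

```python
import zlib
from collections import defaultdict
from statistics import mean

# Hypothetical sensor readings: (seconds since start, temperature).
readings = [(0, 20.0), (15, 21.0), (45, 22.0), (60, 30.0), (90, 28.0), (119, 26.0)]


def technical_compression(payload: bytes) -> bytes:
    """Same data, fewer bits: lossless, but it costs CPU to pack and unpack."""
    return zlib.compress(payload)


def logical_compression(rows, bucket_seconds=60):
    """Reduce granularity: keep only min/max/average per time bucket (lossy)."""
    buckets = defaultdict(list)
    for ts, value in rows:
        buckets[ts // bucket_seconds].append(value)
    return {
        bucket: {"min": min(vals), "max": max(vals), "avg": mean(vals)}
        for bucket, vals in sorted(buckets.items())
    }


raw = repr(readings * 100).encode("utf-8")
packed = technical_compression(raw)      # smaller, and fully reversible
summary = logical_compression(readings)  # one summary row per minute
```

Technical compression can always be undone with `zlib.decompress`, whereas the per-minute summary discards the individual readings for good.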
80M Pictures of Road
Big Data => Small ML Models
Velocity
Fast Data – Fast Insight
Raw Data
Event Hub
Streaming with Hot (Alerting) and Cold
IoT Device Data
Digital Twin
Machine Learning Models to apply to digital twin to predict maintenance need
Data Volatility
Data types placed on a scale from high to low volatility
Location
Location of Data
Global Content Delivery Network
Offline Storage in Apps
Third party (SaaS) Git repo
Offsite Standby for Disaster Recovery
SaaS data store in Cloud
DaaS data store in Cloud
Application Server Memory (on site)
Excel Sheets on employee laptops
Local storage on “Things” & Edge devices
Cloud storage for Database backups
Local Database Instance for each region
Considerations around
Location
• Latency
• Latency experienced by end-user is sum of latencies in the chain
• Co-located – systems with chatty interaction
• Storage cost
• Network Transport costs
• Ease of distribution
• Background distribution may be acceptable – provided it happens
frequently enough
• Offline usage
• Security
• Data “en route”
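The latency points can be made concrete with a little arithmetic. In this sketch the hop names, the millisecond figures and the number of round trips are all invented for illustration.

```python
# Hypothetical chain for a single user request (milliseconds per hop).
chain_ms = {
    "browser -> CDN": 20,
    "CDN -> app server": 35,
    "app server -> database": 5,
}


def end_user_latency(hops):
    """Latency seen by the end user is the sum of the latencies in the chain."""
    return sum(hops.values())


def chatty_cost(round_trips, latency_ms):
    """A chatty interaction pays the network latency once per round trip."""
    return round_trips * latency_ms


total = end_user_latency(chain_ms)
remote = chatty_cost(round_trips=50, latency_ms=35)     # systems far apart
co_located = chatty_cost(round_trips=50, latency_ms=1)  # systems side by side
```

With these assumed numbers, 50 round trips cost 1750 ms across the network and 50 ms when co-located, which is why chatty systems are best placed together.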
On the move
Streaming
Synchronization of Devices coming online again
Upload of ML Models
Replaying transaction on standby database
Applications being deployed
Update of Datawarehouse
Laptops & USB sticks on the move
Raw IoT => Streaming Analysis => {alerts | digital twin | big data}
Customer sending complaint by email
Synchronization of SaaS from On Premises
Metrics from Apps | Platform | Infra to Log Stash & Monitor
Events moving to consumers
UI updates pushed to browser
Task notification sent to employee
Fresh Data pushed to Application Cache
Database Backup moved offsite
Cost
TC(D)O –
Total Cost of Data Ownership
• Business cost (missed opportunity, user dissatisfaction, …) of not having the
data available
• at all or fast enough or fresh enough
Speed
Freshness
Available
Compute
Storage
Network
• Direct cost of
• Acquiring data
• Storing Data
• Storage (cheap and slow, expensive and quick)
• Compression (less storage at expense of compute)
• Retrieving Data
• Compute resources
• Cleansing, Calculating & Deriving data (DWH, ML Model, CQRS)
• Compute resources
• Transporting Data
• Network traffic has a price tag (especially when out of local ‘area’)
• Operational costs
• Backup & Recovery
• Security
• Intellectual property
• Life cycle management – slower tier, archive, purge
• “Right to be forgotten”
• Regulatory periods to hang on to data
Open (APIs) & DaaS
• Governments and NGOs, scientific
and even commercial organizations
are publishing data
• Inviting anyone who wants to join in
to help make sense of the data
– understand driving factors,
identify categories, help predict
• Many areas
• Economy, health, public safety, sports,
traffic & transportation, games,
environment, maps, …
Live
Real Life
Background Batch
Process
(preparing letters for
customers)
Customers
BPM Engine
Processing cases
for Customers
Stale
• Data is a representation of the real world
• All data is inherently stale
• Except when it describes something that can not change – and whose
description can not change
• Staleness is probably not a problem
• Except in self driving cars…
• Run the end-of-year-report
• Consistency is much more important
Glimpses of the past
Session 1 Session 2
Flashback to the Past
Powered by Undo
UNDO
Consistent: Move entire session to point in time
Looking into the future…
OUR_PRODUCTS
NAME PRICE
select name, price
from our_products
Looking further into the future…
OUR_PRODUCTS
NAME PRICE
select name, price
from our_products
begin
DBMS_FLASHBACK_ARCHIVE.ENABLE_AT_VALID_TIME (
level => 'ASOF'
, query_time => TO_TIMESTAMP('01-10-2018', 'DD-MM-YYYY')
);
end;
Current situation …
OUR_PRODUCTS
NAME PRICE
select name, price
from our_products
begin
DBMS_FLASHBACK_ARCHIVE.ENABLE_AT_VALID_TIME (
level => 'CURRENT'
);
end;
OUR_PRODUCTS
NAME PRICE
select name, price
from our_products
begin
DBMS_FLASHBACK_ARCHIVE.ENABLE_AT_VALID_TIME (
level => 'ALL'
);
end;
All data in the table
(the default setting)
OUR_PRODUCTS
NAME PRICE
select name, price, start_date, end_date
from our_products
order
by start_date
START_DATE END_DATE
begin
DBMS_FLASHBACK_ARCHIVE.ENABLE_AT_VALID_TIME (
level => 'ALL'
);
end;
Part of the SQL:2011 standard:
Temporal Database
Make the database aware of the time based business validity
of records
• Add timestamp columns indicating start and end of valid time for a record
• Specify a PERIOD for the table
• Note:
• A table can have multiple sets of columns, describing multiple types of
temporal business validity
create table our_products
( name varchar2(100)
, price number(7,2)
, start_date timestamp
, end_date timestamp
, PERIOD FOR offer_time (start_date, end_date)
);
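To show what the three visibility levels (ALL, CURRENT, ASOF) mean for such a table, here is a minimal Python sketch. It is not how the database implements valid time: the sample rows, the `visible_rows` helper and the choice of an inclusive start and exclusive end are assumptions made for illustration.

```python
from datetime import datetime

# Hypothetical rows of OUR_PRODUCTS: (name, price, start_date, end_date).
# An end_date of None means "valid until further notice".
rows = [
    ("Widget", 10.00, datetime(2017, 1, 1), datetime(2018, 7, 1)),
    ("Widget", 12.50, datetime(2018, 7, 1), None),
    ("Gadget", 99.00, datetime(2019, 1, 1), None),
]


def visible_rows(rows, level="ALL", query_time=None, now=None):
    """Filter rows by valid time, mimicking the ALL / CURRENT / ASOF levels."""
    if level == "ALL":
        return list(rows)
    if level == "CURRENT":
        moment = now or datetime.now()
    elif level == "ASOF":
        if query_time is None:
            raise ValueError("ASOF requires query_time")
        moment = query_time
    else:
        raise ValueError(f"unknown level: {level}")
    # Assumption: start is inclusive, end is exclusive, None is open-ended.
    return [
        r for r in rows
        if (r[2] is None or r[2] <= moment) and (r[3] is None or moment < r[3])
    ]


future = visible_rows(rows, "ASOF", query_time=datetime(2018, 10, 1))
today = visible_rows(rows, "CURRENT", now=datetime(2018, 2, 6))
everything = visible_rows(rows, "ALL")
```

Looking ahead to 1 October 2018 returns only the 12.50 price, a "current" view on the day of the talk returns the 10.00 price, and ALL returns every row regardless of its validity period.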
Integrity
Data Constraints
to protect integrity
• Allowable values
• Mandatory attributes
• (Foreign Key) References
• NULL
• Constraints on
• type
• length
• format
• Spelling
• Character encoding
Data is a representation of
the known real world
• How useful is it to enforce data integrity?
Virtual Reality
Data Integrity
• Why?
• Is it about truth?
• About regulations and by-the-book?
• Allow IT systems to run smoothly and not get confused?
• About auditability and non-repudiation?
• What about the real world?
• Data in IT is just a representation;
if the world is not by the book – what should IT do?
Blockchain
• Distributed
• Across trusted business partners
• Across public, anonymous parties
• Immutable
• Secured
• Trusted
• Smart Contracts
• Operations on data (without human intervention)
Format &
Technology
Graph Database
• Natural fit during development
• Superior (10-1000 times better)
performance
Find People liked by anyone liked by Bob
From relational SQL to Graph query
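The "liked by anyone liked by Bob" question is a two-hop traversal, which is why it is awkward as relational self-joins and natural in a graph. The sketch below uses a plain Python dictionary as a stand-in for a graph database; the people and the `likes` edges are invented for the example.

```python
# Hypothetical "likes" graph as an adjacency map: person -> people they like.
likes = {
    "Bob": {"Alice", "Carol"},
    "Alice": {"Dave", "Bob"},
    "Carol": {"Dave", "Erin"},
    "Dave": set(),
    "Erin": {"Bob"},
}


def liked_by_anyone_liked_by(person, graph):
    """Two-hop traversal: follow 'likes' edges twice, starting at person."""
    first_hop = graph.get(person, set())
    second_hop = set()
    for friend in first_hop:
        second_hop |= graph.get(friend, set())
    return second_hop


result = liked_by_anyone_liked_by("Bob", likes)
```

Each hop is a direct lookup from a node to its neighbours, so the cost depends on the number of edges followed rather than on the size of the whole data set.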
SQL vs NoSQL
ACID vs BASE
Relational vs …
SQL is not good at anything
• But it sucks at nothing
Relational Databases
• Based on relational model of data (E.F. Codd), a mathematical foundation
• Uses SQL for query, DML and DDL
• Transactions are ACID (Atomicity, Consistency, Isolation, Durability)
• All or nothing
• Constraint Compliant
• Individual experience
[in a multi-session environment]
(aka concurrency)
• Down does not hurt
ACID comes at a cost
• Transaction results have to be persisted [before the transaction completes]
in order to guarantee D
• Concurrency requires some degree of locking (and multi-versioning) in order
to have I
• Constraint compliance (unique key, foreign key) means all data hangs
together (as do all transactions)
in order to have C
• Two-phase commit (across multiple participants)
introduces complexity, dependencies and delays,
yet required for A
The holy grail of Normalization
• Normalize to prevent
• data redundancy
• discrepancies (split brain)
• storage waste
• However: we should
recognize the fact that
some data is read far more
frequently than
it is created or modified
The Relational Model
in practice
• Traditional Relational Data Model has severe impact on physical disk
performance
• Transaction Log => Sequential Write (append to file)
• Data Blocks require much more expensive Random Access disk writes
• Indexes (B-Tree, Bitmap, …) are used to speed up query (read)
performance
• and slow down transactions
• Relational data does not [always] map naturally to the data format required
in the application (OO, JSON, XML)
• Capability to join and construct ad-hoc queries across the entire data model
is powerful
• Declarative integrity constraints allow for strict enforcement of data quality
rules
• “the data may be nonsensical, but at least it adheres to the rules”
Databases re-evaluated
• Not all use cases require ACID (or can afford it)
• Read only (product catalog for web shops)
• Inserts only and no (inter-record) constraints
• Big Data collected and “dumped” in Data Lake (Hadoop) for subsequent
processing
• High performance demands
• Not all data needs structured formats or structured querying and JOINs
• Entire documents are stored and retrieved based on a single key
• Sometimes – scalable availability and productivity is more important than
Consistency – and ACID is sacrificed
• CAP theorem states: Consistency [across nodes], Availability and
Partition tolerance cannot all three be guaranteed at the same time
NoSQL and BASE
• NoSQL arose because of performance and scalability
challenges with traditional/relational approach in Web Scale operations
• NoSQL is a label for a wide variety of databases that lack some aspect of a
true relational database
• ACID-ness, SQL, relational model, constraints
• The label has been used since 2009
• Perhaps NoREL would be more appropriate
• Some well known NoSQL products are
• Cassandra, MongoDB, Redis, CouchDB, …
• BASE as alternative to ACID:
• basically available, soft state, eventually consistent
(after a short duration)
Typical for NoSQL
• Focus on speed, availability and scalability
• Horizontal scale out – distributed with load balancing and fail-over
• No (predefined) Data Structure
• Integrity primarily protected by application logic
• Open Source (most offerings are, not all: MarkLogic)
• Close(r) attention for how the data is used
• Application oriented data format and search paths and specialized
database per application (microservice, capability)
• Similar to the switch from SOA to API/Microservice
• Reads (far) more relevant than writes
• Data redundancy & denormalization
• No data access through SQL – well, …
Types of NoSQL
(leading) NoSQL Database
products
• MongoDB is (one of) the most popular (by any measure)
• Cloud (only):
• Google BigTable,
• AWS DynamoDB
• Cache (in memory)
• ZooKeeper, Redis,
Coherence, Memcached,
Apache Ignite
(pka GridGain), …
• Hadoop/HDFS
• Oracle NoSQL
(built on Berkeley DB)
NoSQL means:
No Data Access through SQL
• However
• Data Professionals and
Developers speak SQL
• Reporting, Dashboarding,
ETL, BI tools speak SQL
• There is no common query
language across NoSQL
products
• Attempts from many vendors to create drivers that translate SQL statements
into NoSQL commands for the specific target database
• To protect existing investments in SQL – skills, tools, applications, reports, …
SQL vs NoSQL
• SQL != RDBMS
• SQL on top of
• Hadoop – Spark SQL, Hive, Drill, Impala
• “External Table” Text files, CSV, Excel
• XML, JSON
• KSQL on Kafka events
• Google Spanner, BigQuery
• NoSQL – Berkeley DB, HBase, Elasticsearch,
MongoDB, Cassandra
NoSQL (MongoDB) vs
SQL (Oracle)
db.emp.find
( {"JOB":"SALESMAN"}
, { ENAME:1
, SAL:1}
)
.sort
( {'SAL':-1})
.limit(2)
select ename
, sal
from emp
where job = 'SALESMAN'
order
by sal desc
FETCH FIRST 2 ROWS ONLY
NoSQL (MongoDB) vs
SQL (Oracle)
db.emp.find
( {"JOB":"SALESMAN"
, $where :
" this.SAL +
(this.COMM != null?
this.COMM: 0)
> 2000"
}
)
select *
from emp
where job = 'SALESMAN'
and sal + nvl(comm, 0)
> 2000
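Both comparisons can be tried without an Oracle or MongoDB installation. The sketch below runs the SQL side on SQLite from Python. This is a substitution: SQLite spells FETCH FIRST 2 ROWS ONLY as LIMIT 2 and NVL as COALESCE, and the EMP rows are sample data assumed for the example. The second query applies the same JOB filter as the MongoDB version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table emp (ename text, job text, sal real, comm real)")
conn.executemany(
    "insert into emp values (?, ?, ?, ?)",
    [
        ("ALLEN", "SALESMAN", 1600, 300),
        ("WARD", "SALESMAN", 1250, 500),
        ("MARTIN", "SALESMAN", 1250, 1400),
        ("TURNER", "SALESMAN", 1500, 0),
        ("KING", "PRESIDENT", 5000, None),
    ],
)

# The two best paid salesmen (LIMIT 2 stands in for FETCH FIRST 2 ROWS ONLY).
top_two = conn.execute(
    "select ename, sal from emp where job = 'SALESMAN' order by sal desc limit 2"
).fetchall()

# Salesmen whose salary plus commission exceeds 2000 (COALESCE stands in for NVL).
high_income = conn.execute(
    "select ename from emp "
    "where job = 'SALESMAN' and sal + coalesce(comm, 0) > 2000"
).fetchall()
```

The null handling is the part that differs most between the two worlds: SQL needs NVL or COALESCE, while the MongoDB `$where` expression tests `this.COMM != null` in JavaScript.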
Distributed
Why distributed?
• Because it is
• Business is physically spread out over multiple locations
• To achieve
• Scalability
• Performance (parallelism, latency)
• Resilience of the whole – availability (in the face of individual failure)
• (site) Disaster recovery
• Trust (e.g. blockchain)
• Applies to data & processes
Global Content Delivery Network
Offline Storage in Apps
Real Application Clusters
Distributed In Memory Cache: Hazelcast, MemCached, Redis, Coherence
Java EE Application Server Cluster
SETI
Local storage on “Things” & Edge devices
Active Standby Database
SAN
Cross Cloud/On Premises archive
Distributed Datastore: MongoDB, Cassandra, BigTable, HBase
Apache Spark Distributed Data Processing
Logical Data Shards in Oracle Database, MySQL, Elastic
HDFS Hadoop Distributed File System
Kubernetes Distributed Container Platform
Distributed Event Bus: Kafka
Vertically Distributed Data
Client Tier: Browser
DOM/UI
MVVM
Middle Tier:
Java EE (Stateful) application
API
API
API
Stateless
Availability
Global Content Delivery
Network
Webshop 24/7
on line
Relaxed availability (office
hours) for DWH
SaaS CRM less available
than desired
Fairly high availability for
[clusters of] things – not for
individual things
Active Standby
Database
SAN
Cross Cloud/On
Premises archive
Low availability demands
on Big Data
H/A for Oracle
Database
EventBus 24/7
on line
H/A for IoT
Hub
H/A for
LDAP
H/A during extended office
hours for human workflow
engine
Service Bus
24/7 on line
Some loss of service is acceptable for
recommendation engine
Availability of Data
• Availability:
• unplanned downtime (incident => disaster)
• planned (not desired) downtime (upgrade, patch to application, platform, infra)
• Chain is as strong as the weakest link
• Availability is determined by least available component
• Datastore can drive (and help improve) availability of many systems/applications/services
• Custom UI on top of SAP requires 99.95% up time – SAP only offers 98%
• Increase availability
• H/A architecture – multi-node cluster, hot standby and fail-over, disaster recovery
• Rolling upgrades
• Single node for command, multiple (independent) helpers for query
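The weakest-link point can be made concrete with a little arithmetic. The following is a minimal sketch, assuming components are chained in series and fail independently (an assumption the slide does not state). The 98% figure comes from the SAP bullet above; the 99.99% for the UI itself, the three replicas and all function names are illustrative assumptions.

```python
from math import prod

HOURS_PER_YEAR = 24 * 365


def chain_availability(components):
    """Availability of components in series, assuming independent failures."""
    return prod(components)


def redundant_availability(single, nodes):
    """Availability of `nodes` interchangeable replicas where one suffices."""
    return 1 - (1 - single) ** nodes


def downtime_hours_per_year(availability):
    """Expected hours of downtime per (non-leap) year."""
    return (1 - availability) * HOURS_PER_YEAR


# A custom UI (assumed 99.99% available by itself) on top of a 98% backend:
# the chain ends up below the weakest link.
ui_on_backend = chain_availability([0.9999, 0.98])

# Serving queries from three independent read replicas of 98% each
# (the "multiple helpers for query" idea) lifts read availability.
replicated_reads = redundant_availability(0.98, 3)
```

Under these assumptions the UI cannot reach 99.95% while every request depends on the 98% backend, which is the argument for letting a separate, replicated data store serve the reads.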
Case of Web Shop
• Webshop – 1M visitors per day
• Product catalog consists of 15+ million records
• The web shop presents: product description, images, reviews, pricing details, related offerings, stock status
• Some Products are added and updated and removed every day
• Although most products do not change very frequently
• Some vendors do bulk manipulation of product details
Products
Product updates
Webshop visits
- searches
- product details
- orders
Case of Web Shop –
Usage Patterns & Architecture
Products
Product updates
Webshop visits
- searches
- product details
- orders
firewall
Data manipulation
Data Quality (enforcement)
<10K transactions
Batch jobs next to online
Speed is nice
Read only
On line
Speed is crucial
XHTML & JSON
> 5M visits
Products
Products
Products
Webshop visits
- searches
- product details
- orders
firewall
Data manipulation
Data Quality (enforcement)
<10K transactions
Batch jobs next to online
Speed is nice
Read only
On line
Speed is crucial
XHTML & JSON
> 1M visits
DMZ
Read only
JSON documents
Images
Text Search
Scale Horizontally
Stale but consistent
Products
Nightly generation
Product updates
Case of Web Shop –
Usage Patterns & Architecture
CQRS
CQRS – Multi Data Store
How do you integrate applications and data?
Products
Data Manipulation
Data
Retrieval
CQRS – Multi Data Store
Special
Products
Product
Clusters
Products
Data Manipulation
Data Retrieval
Food
Stuff
Toys
Quick Product
Search Index
Product Store in
SaaS app
CQRS in Oracle Database
Active Data Guard Standby
SAN
Middleware Middleware Middleware
T T
MV
MV
idx idx
IMDB
RAC
RAC
Shard
(12c R2)
Shard
(12c R2)
SAN
SAN
dbf
SGA
Redo
Logs
CQRS - Command and Query
Responsibility Segregation
• Data manipulation and retrieval in separate places
• (physical data proliferation)
• Query store is optimized for consumers
• Level of detail, format, filters applied
• For performance and scalability, independence, productivity, lower license fees and lower TCO, security
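The separation of manipulation and retrieval can be sketched in a few lines. This is a hedged, in-memory illustration only, not how the deck's Oracle or Elastic setups work: `ProductCommandStore`, `ProductQueryStore` and the document shape are hypothetical names chosen for the example.

```python
import json


class ProductCommandStore:
    """Write side: validates and holds the authoritative product records."""

    def __init__(self):
        self._products = {}
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def upsert(self, product_id, name, price, stock):
        # Data quality is enforced on the command side only.
        if price < 0:
            raise ValueError("price must not be negative")
        record = {"id": product_id, "name": name, "price": price, "stock": stock}
        self._products[product_id] = record
        for notify in self._subscribers:
            notify(dict(record))


class ProductQueryStore:
    """Read side: pre-rendered JSON documents shaped for the web shop."""

    def __init__(self):
        self._documents = {}

    def apply(self, record):
        # Level of detail and format are chosen for the consumer.
        document = {
            "id": record["id"],
            "title": record["name"],
            "price": f"{record['price']:.2f}",
            "in_stock": record["stock"] > 0,
        }
        self._documents[record["id"]] = json.dumps(document)

    def get(self, product_id):
        return self._documents.get(product_id)


commands = ProductCommandStore()
queries = ProductQueryStore()
commands.subscribe(queries.apply)
commands.upsert("p-1", "Toy robot", 19.5, 3)
```

The query side never validates or mutates business data; it only stores what consumers read, which is what allows it to be scaled and secured independently.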
Synchronizing the Query Stores
Special
Products
Product Clusters
Products
Data Manipulation
Data Retrieval
Food Stuff
Toys
Quick Product Search
Index
Product Store in
SaaS app
Synchronizing the Query Stores
• Depends on
• Freshness requirements
• Authorization demands
• Cost of synchronizing the query store (full synchronize vs event based)
• Usage pattern for query store
• Facilities available in Command store (and in query stores)
• Relative locations (e.g. cloud & on premises)
• Mechanisms
• Importing Database dump-file (periodic, full or partial)
• Direct queries & DML
• Change Data Capture from transaction logs
• Event based
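The difference between a full synchronize and an event-based one comes down to whether the query store remembers how far it has read. A minimal sketch, assuming the command store exposes an ordered change log with sequence numbers; `ChangeLog`, `QueryStore` and the use of `None` to signal a deletion are assumptions made for this example.

```python
class ChangeLog:
    """Ordered list of change events published by the command store."""

    def __init__(self):
        self._events = []

    def append(self, key, value):
        seq = len(self._events) + 1
        self._events.append({"seq": seq, "key": key, "value": value})
        return seq

    def since(self, seq):
        return [e for e in self._events if e["seq"] > seq]


class QueryStore:
    def __init__(self):
        self.data = {}
        self.last_seq = 0  # highest change already applied

    def sync(self, log):
        """Event-based sync: apply only the changes not seen before."""
        pending = log.since(self.last_seq)
        for event in pending:
            if event["value"] is None:       # None marks a deletion
                self.data.pop(event["key"], None)
            else:
                self.data[event["key"]] = event["value"]
            self.last_seq = event["seq"]
        return len(pending)


log = ChangeLog()
store = QueryStore()
log.append("p-1", {"name": "Toy robot"})
log.append("p-2", {"name": "Pasta"})
first = store.sync(log)    # applies two changes
log.append("p-2", None)
second = store.sync(log)   # applies only the deletion
third = store.sync(log)    # nothing new to apply
```

The cost of each sync is proportional to the number of new changes rather than to the size of the catalog, which matters for the freshness and cost trade-offs listed above.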
Event Sourcing
State is sum of changes
Source: https://ookami86.github.io/event-sourcing-in-practice/#how-eventsourcing-works
Take the UD out of CRUD
• Introducing the Immu Table
• A ledger of entity changes
• With a timestamp or event sequence
• And the entity identifier
• And the new values of the added, changed, erased attributes
• Each event is an immutable record that is appended to the ledger – simply added to the end
• Atomic, very cheap compared to Update and Delete
– does not require a lock
– does not require random file access or rearranging blocks on disk
Bank Account Change Event
Event Type
Timestamp
Account Id
Amount
(New value for) Owner
Erased: some attribute
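The event fields listed above map naturally onto an immutable record appended to a ledger. A minimal sketch, assuming an in-memory list as the ledger; the event type names (`opened`, `deposited`, `withdrawn`, `changed`) and the folding rules in `current_state` are illustrative assumptions, not taken from the slides.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)  # frozen: an event cannot be changed once created
class AccountEvent:
    event_type: str                   # assumed names, e.g. "opened", "deposited"
    timestamp: int                    # event sequence or epoch seconds
    account_id: str
    amount: float = 0.0
    owner: Optional[str] = None       # new value for owner, if it changed
    erased: Tuple[str, ...] = ()      # names of attributes that were erased


class Ledger:
    """Append-only store: events are only ever added to the end."""

    def __init__(self):
        self._events = []

    def append(self, event):
        self._events.append(event)

    def events_for(self, account_id):
        return [e for e in self._events if e.account_id == account_id]


def current_state(events):
    """State is the sum of changes: fold the events in order."""
    state = {"balance": 0.0}
    for e in events:
        if e.event_type in ("opened", "deposited"):
            state["balance"] += e.amount
        elif e.event_type == "withdrawn":
            state["balance"] -= e.amount
        if e.owner is not None:
            state["owner"] = e.owner
        for attribute in e.erased:
            state.pop(attribute, None)
    return state


ledger = Ledger()
ledger.append(AccountEvent("opened", 1, "NL01", amount=100.0, owner="Alice"))
ledger.append(AccountEvent("deposited", 2, "NL01", amount=50.0))
ledger.append(AccountEvent("withdrawn", 3, "NL01", amount=30.0))
ledger.append(AccountEvent("changed", 4, "NL01", erased=("owner",)))
state = current_state(ledger.events_for("NL01"))
```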
Event Log in Event Sourcing
• Primary Data Source is ledger of change events
• Not a store of the current state
• However: optionally use snapshots of baseline (state up until time)
• Entity Event Store replaces Table
• Offers a simple API for creating and retrieving events
• ‘Entity Change Event’ Producer (to which consumers can subscribe)
• To correct a mistake:
• Do not remove the event! (it happened, it may already have been distributed)
• Instead, create a compensating event (and then it unhappened)
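Correcting a mistake with a compensating event might look as follows. This is a sketch under assumptions: the event type names `ItemsAdded` and `ItemsAddedReverted` and the order example are hypothetical and only serve to show that the erroneous event stays in the log.

```python
events = []


def record(event_type, order_id, quantity):
    """Append an event; existing events are never edited or removed."""
    events.append({"seq": len(events) + 1, "type": event_type,
                   "order_id": order_id, "quantity": quantity})


def ordered_quantity(order_id):
    """Derive the current quantity by summing all events for the order."""
    total = 0
    for e in events:
        if e["order_id"] != order_id:
            continue
        if e["type"] == "ItemsAdded":
            total += e["quantity"]
        elif e["type"] == "ItemsAddedReverted":
            total -= e["quantity"]
    return total


record("ItemsAdded", "o-1", 2)
record("ItemsAdded", "o-1", 20)          # mistake: should have been 2
record("ItemsAddedReverted", "o-1", 20)  # compensating event
record("ItemsAdded", "o-1", 2)           # the intended change
```

Consumers that already received the erroneous event also receive the compensation, so their derived state converges without anyone rewriting history.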
Event Log
• Audit Log
• Time travel
• Reconstruct system (application state)
• Distributed application state
• Support multiple (read) models
• Easily construct a debugging environment – of the exact situation and time
• What-if scenarios – take a copy, inject an event & play forward from there
• State = sum of change events
• State = snapshot plus sum of recent events
• To synch application state: new state = current state + sum of events after the event version number on which the current state is based
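Time travel, snapshots and what-if scenarios all reduce to replaying a slice of the log. A minimal sketch, assuming each event carries a version number and a dictionary of changed attributes; `replay`, `apply_event` and the event layout are assumptions made for illustration.

```python
def apply_event(state, event):
    """Return a new state with one change event applied."""
    new_state = dict(state)
    new_state.update(event["changes"])
    return new_state


def replay(events, baseline=None, up_to_version=None):
    """Rebuild state from a baseline snapshot (or nothing) plus later events."""
    state = dict(baseline["state"]) if baseline else {}
    start = baseline["version"] if baseline else 0
    for event in events:
        if event["version"] <= start:
            continue  # already contained in the baseline
        if up_to_version is not None and event["version"] > up_to_version:
            break
        state = apply_event(state, event)
    return state


events = [
    {"version": 1, "changes": {"name": "Toy robot", "price": 20}},
    {"version": 2, "changes": {"price": 18}},
    {"version": 3, "changes": {"stock": 5}},
    {"version": 4, "changes": {"price": 15}},
]

full = replay(events)                                  # state = sum of events
as_of_v2 = replay(events, up_to_version=2)             # time travel
snapshot = {"version": 3, "state": replay(events, up_to_version=3)}
from_snapshot = replay(events, baseline=snapshot)      # snapshot + recent events
what_if = replay(events + [{"version": 5, "changes": {"price": 0}}])
```

The what-if run works on a copy of the list, so the real log is left untouched.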
To implement
Event Sourcing
• Take a data store
• That is distributed, scalable, available
• For example Apache Cassandra
• Create an Event Log table [for each business entity]
• Create columns for timestamp, event id, change [event] type, entity identifier
• Create columns for all attributes or a single column to hold a document (e.g. JSON)
• A special change type can be ‘snapshot’ to specify a baseline
• Entries older than the snapshot are then no longer needed in the event log
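The table layout described above can be tried out locally. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in for a distributed store such as Apache Cassandra, so it illustrates the columns and the snapshot rule, not Cassandra's data modelling or API. The table name, column names and sample values are assumptions.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE product_event_log (
        event_id    INTEGER PRIMARY KEY AUTOINCREMENT,
        recorded_at TEXT NOT NULL,
        change_type TEXT NOT NULL,   -- 'created', 'updated', 'snapshot', ...
        entity_id   TEXT NOT NULL,
        payload     TEXT NOT NULL    -- JSON document with attribute values
    )
    """
)


def append_event(recorded_at, change_type, entity_id, payload):
    conn.execute(
        "INSERT INTO product_event_log"
        " (recorded_at, change_type, entity_id, payload) VALUES (?, ?, ?, ?)",
        (recorded_at, change_type, entity_id, json.dumps(payload)),
    )


def load_state(entity_id):
    """Start from the latest snapshot (if any) and apply the later events."""
    latest_snapshot = conn.execute(
        "SELECT COALESCE(MAX(event_id), 0) FROM product_event_log"
        " WHERE entity_id = ? AND change_type = 'snapshot'",
        (entity_id,),
    ).fetchone()[0]
    rows = conn.execute(
        "SELECT payload FROM product_event_log"
        " WHERE entity_id = ? AND event_id >= ? ORDER BY event_id",
        (entity_id, latest_snapshot),
    )
    state = {}
    for (payload,) in rows:
        state.update(json.loads(payload))
    return state


append_event("2018-01-01T09:00:00", "created", "p-1", {"name": "Toy robot", "price": 20})
append_event("2018-01-01T10:00:00", "updated", "p-1", {"price": 18})
append_event("2018-01-01T11:00:00", "snapshot", "p-1", {"name": "Toy robot", "price": 18})
append_event("2018-01-01T12:00:00", "updated", "p-1", {"stock": 5})
product = load_state("p-1")
```

Because `load_state` never reads rows before the latest snapshot, those older rows could be purged without changing the reconstructed state.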
Event Sourcing driving CQRS
Microservices
& Data
What is IT all about?
Application
Production Runtime
What is IT all about?
Application
Production Runtime
Platform
What is IT all about?
Application
Platform
Production Runtime
Operations
Monitoring &
Management
One team has Agile responsibility
through full lifecycle
Application
Platform
Production Runtime
Operations
Monitoring &
Management
Application
Preparation Runtime
Platform
Development
CD
Agile Design,
Build, Test
One team has Agile responsibility
through full lifecycle
Application
Platform
Application
Platform
DevOps team owns and runs
one (or more) products
Application
Platform
Generic Infrastructure Platform for running DevOps Products
Floorspace, Power,
Cooling, Storage,
Compute
Monitoring, Management,
Cache, Authentication,
RDBMS, Event Hub
Multiple products from multiple teams
run on a shared generic infrastructure
Generic Infrastructure Platform for running DevOps Products
Floorspace, Power,
Cooling, Storage,
Compute
Monitoring, Management,
Cache, Authentication,
RDBMS, Event Hub
Application
Platform
Application
Platform
Application
Platform
Application
Platform
Application
Platform
App plus platform under DevOps
== Microservice
Generic Infrastructure Platform for running DevOps Products
µ µ µ µ µ
App plus platform under DevOps
== Microservice
• Stateless
• Horizontally scalable
• Mutually Independent
• upgrade, patch, relocate
• Can expose Public API (HTTP/REST) and/or UI
• Communicate with each other through events
• Have their own bounded data context
• Do not rely on other microservices [for the data they need]
• Serverless – do not require allocated server, can be fired up
Generic Infrastructure Platform for running DevOps Products
µ µ µ µ µ
Microservices - objectives
• Minimize cost of change
• Maximize agility
• Isolate responsibility
• Reduce coupling by minimizing dependencies
• logical, technical and runtime
• only standardized communication/interaction
• Independent, scalable processes
• Choreography (broadcast) preferred over Orchestration (direct call)
• Efficient operations
• Comprehensible, controllable IT
How do we get
from a Monolith
to Microservices?
Data in microservices
• Microservices are stateless & horizontally scalable
• Microservices are isolated & independent
• Where is their data?
• What about lookup data?
• Data not owned by the microservice –
but still required by it to perform its role => bounded context
Microservices State
Cache
RDBMS
Document
Store
NoSQL
Generic Platform for running microservices
Event Hub
Big Data
Block
Storage
LDAP
Bounded context in microservices
• Microservice needs to be able to run independently
• It needs to contain & own all data required to run
• It cannot depend on other microservices
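One way to satisfy these rules is for a microservice to keep its own copy of the foreign data it needs and refresh it from events. A minimal in-process sketch, assuming a `CustomerModified` event as in the diagram; the `EventBus` class, the service classes and their methods are hypothetical stand-ins for a real event hub and real services.

```python
class EventBus:
    """Tiny in-process stand-in for an event hub."""

    def __init__(self):
        self._handlers = {}

    def subscribe(self, event_type, handler):
        self._handlers.setdefault(event_type, []).append(handler)

    def publish(self, event_type, payload):
        for handler in self._handlers.get(event_type, []):
            handler(payload)


class CustomerService:
    """Owns customer data; announces changes instead of being queried."""

    def __init__(self, bus):
        self._bus = bus
        self._customers = {}

    def modify(self, customer_id, name, address):
        self._customers[customer_id] = {"name": name, "address": address}
        self._bus.publish("CustomerModified",
                          {"id": customer_id, "name": name, "address": address})


class OrderService:
    """Keeps its own copy of the customer attributes it needs."""

    def __init__(self, bus):
        self._shipping_addresses = {}  # derived state in the bounded context
        self._orders = []
        bus.subscribe("CustomerModified", self._on_customer_modified)

    def _on_customer_modified(self, event):
        self._shipping_addresses[event["id"]] = event["address"]

    def place_order(self, customer_id, product_id):
        # No call to CustomerService: the local copy is enough to run.
        order = {"customer": customer_id, "product": product_id,
                 "ship_to": self._shipping_addresses[customer_id]}
        self._orders.append(order)
        return order


bus = EventBus()
customers = CustomerService(bus)
orders = OrderService(bus)
customers.modify("c-1", "Alice", "Main Street 1")
order = orders.place_order("c-1", "p-1")
```

The order service keeps working when the customer service is down, at the price of holding customer data that may be slightly stale.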
API
Customer
API
UI
Order
CustomerModified event
Wrap Up
Wrap Up
• Data used to be like the Ford Model T
• One model, one color
• And then:
Wrap Up
• Data comes in many shades (at least 50) – variations along many dimensions
usage
Total Cost of Data Ownership
authorization
distribution
format
volatility
volume
ACID demands
availability
freshness requirements (staleness allowance)
location
speed
ownership
required consistency
Wrap Up
• Some form of CQRS is plain common sense
• Use fitting technology for the query challenge at hand
• Graph, Document, Relational, Key/Value, Column, Elastic Index, …
• Every organization will (should) have multiple data stores in various technologies – and not just relational SQL
• Design & implement mechanism to synchronize the query stores
• Events are attractive: decoupled, fine grained and fast
• Devise a purging strategy
• Stop carrying around your data legacy
Wrap Up
• All data is stale
• Consistency should be your main concern
• Microservices are stateless
• They can own state – in their private data store
• And maintain derived state – bounded context
• Events are published to allow microservices to synch their context
• Event Sourcing reduces complexity
• CRUD => CR
• Keep a ledger of data changes (book keeping of DML transactions)
• Reconstruct state – current or historical – from events (into query store)
Wrap Up
• Data Integrity may be overrated
• Instead of enforcing constraints (reality may not be so clean) – identify
anomalies in data and act on them
• SQL sits on top of the world
• SQL [like query languages] run against a wide array of data stores,
including Streams, Big Data, NoSQL and CSV / Excel
• People and tools know SQL – make use of that
• Machine Learning and Artificial Intelligence are fueled by data
• They make the smallest, rawest, silliest piece of data potentially valuable
Thank you!
• Blog: technology.amis.nl
• Email: lucas.jellema@amis.nl
• @lucasjellema
• lucas-jellema
• www.amis.nl, info@amis.nl

Microservices, Apache Kafka, Node, Dapr and more - Part Two (Fontys Hogeschoo...
Lucas Jellema
 
Microservices, Node, Dapr and more - Part One (Fontys Hogeschool, Spring 2022)
Microservices, Node, Dapr and more - Part One (Fontys Hogeschool, Spring 2022)Microservices, Node, Dapr and more - Part One (Fontys Hogeschool, Spring 2022)
Microservices, Node, Dapr and more - Part One (Fontys Hogeschool, Spring 2022)
Lucas Jellema
 
6Reinventing Oracle Systems in a Cloudy World (RMOUG Trainingdays, February 2...
6Reinventing Oracle Systems in a Cloudy World (RMOUG Trainingdays, February 2...6Reinventing Oracle Systems in a Cloudy World (RMOUG Trainingdays, February 2...
6Reinventing Oracle Systems in a Cloudy World (RMOUG Trainingdays, February 2...
Lucas Jellema
 
Help me move away from Oracle! (RMOUG Training Days 2022, February 2022)
Help me move away from Oracle! (RMOUG Training Days 2022, February 2022)Help me move away from Oracle! (RMOUG Training Days 2022, February 2022)
Help me move away from Oracle! (RMOUG Training Days 2022, February 2022)
Lucas Jellema
 
Tech Talks 101 - DevOps (jan 2022)
Tech Talks 101 - DevOps (jan 2022)Tech Talks 101 - DevOps (jan 2022)
Tech Talks 101 - DevOps (jan 2022)
Lucas Jellema
 
Conclusion Code Cafe - Microcks for Mocking and Testing Async APIs (January 2...
Conclusion Code Cafe - Microcks for Mocking and Testing Async APIs (January 2...Conclusion Code Cafe - Microcks for Mocking and Testing Async APIs (January 2...
Conclusion Code Cafe - Microcks for Mocking and Testing Async APIs (January 2...
Lucas Jellema
 
Cloud Native Application Development - build fast, low TCO, scalable & agile ...
Cloud Native Application Development - build fast, low TCO, scalable & agile ...Cloud Native Application Development - build fast, low TCO, scalable & agile ...
Cloud Native Application Development - build fast, low TCO, scalable & agile ...
Lucas Jellema
 




50 Shades of Data - Dutch Oracle Architects Platform (February 2018)

  • 1. 50 Shades of Data - how, when and why: Big, Fast, Relational, NoSQL, Elastic, Event, CQRS. On the many types of data, data stores and data usages. Dutch Oracle Architects Platform | 6th February 2018
  • 3. What is data?
      • A solidified representation of
          • An observation [of a fact]
          • A concept
      • Serialized in order to be
          • Understood & processed by machines
          • Reproduced for human consumption
  • 4. When things were simple: RDBMS (SQL, ACID); Data files; Log Files; Backup; SAN
  • 5. And then stuff happened: Middle Tier: Java EE (Stateful) application; Client Tier: Browser; Mobile App (offline); Data Warehouse; OO, XML, JSON; Content Management; Big Data; Fast Data; API; µ; λ
  • 6. Explosion of Data Store technologies RDBMS SQL ACID
  • 7. V4
  • 9. Business Areas (Inside the Enterprise / External Actors): Marketing & Campaigns; Supplier; Gov Agency; Shipping; Security; Finance (Accounts, Invoices); Supplier & Product Management; Inventory & Warehousing; Output (print & mail, email, SMS, …); Sales & Customer Service; Data Department (Consolidation, MI, Reporting, Analysis and R&D); Customer Management; Order Management; Data providers; Customers
  • 10. Business Applications (added to the business-area landscape of slide 9, whose external side is now labelled Public Internet/External Actors): B2B Partner Portal; Customer Service SaaS; Mobile App; Custom Application for Product Catalog; IoT Gateways & Hub; SaaS ERP; Enterprise Content Management System; Human Workflow Engine; Mail Server; Data Warehouse; SaaS CRM; Custom Order Management Application; B2B APIs; Open Data APIs; DaaS Services; APIs; SaaS CX (Campaigns, Social Media Monitor, 360 Customer View); LDAP for Users, Roles & Permissions; WebShop Portal; Recommendation Engine; Enterprise Dashboard & BI & Reporting; Security & Compliance Monitor; Desktop Tools; Communication & Collaboration tools; Asset Tracker
  • 11. Business Applications & IT Systems (the landscape of slide 10, extended with IT systems): Big Data Lake; Logging Collector & Monitor & Analyzer; Monitor for Application & Infra metrics; Source Code Control System; API Gateway; Service Bus; Event Bus; Rule Engine; Desktop Browser; Mobile Devices; Email / Facebook / WhatsApp; Corporate Database; File Storage; Job Scheduling; Application Server; Private Blockchain; Docker Container Registry; Microservices Platform; Kubernetes Container Management
  • 12. Business & IT - Data (the landscape of slide 11, annotated with the data found in it): List of Products shown in UI; Personal Profile, Order and Payments Details; Smart Contracts with supply chain details; Recent Consumer purchases information; Footage from security cameras; Readings from motion detectors; Emails regarding customer complaints; Spreadsheets with Sales records; Log-files from IT systems (infra & platform); WebShop activity, Social Media discussions, …; ML Models; In Flight Messages; Events; Job Schedules; Application & Infrastructure source history; Offers, invoices, rewards messages; Shopping Cart with selected items; Order Details; API usage, billing, policies; Running & Past workflow instances; Sales Aggregates by Day, Region, Product Category; Invoices & Payments; Product Manuals; Digital Twin; KPIs & Alerts; Customer Interaction records; Case files (Complaints, Requests); Rules & Rule Execution metrics; Weather, Demographics, Sports, Social, …; Config data; Customer Details; Audit Trails, Security Incidents; Programming in progress; User Stories, Designs, Discussions; Copy of Production Data in Acceptance
  • 14. Data Volume (the landscape and data items of slide 12, annotated with volume labels): Big Data Lake; Machine Learning models; Long term history; Data Warehouse; Big; Lots of data; Small chunks of off line data; Piles of log-files; Fine grained events; Gathering – never purging?; Small payloads; Medium size – structured data; Rule meta-data (very small)
  • 15. Compression
      • Technical Compression
          • Same data, fewer bits to store
          • Same time, or even longer, to process
      • Logical Compression
          • Filter (older than, one in X)
          • Reduce fine-grainedness: the helicopter view
          • Average over geographical area
          • Min/Max/Average per minute/hour/day
          • Is typically done in the data warehouse & digital twin
          • Could be done for query stores and even for big data sets
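The two logical compression techniques named on this slide, "one in X" filtering and Min/Max/Average per minute, can be illustrated with a minimal Python sketch. The function names (`one_in_x`, `per_minute_summary`) and the sample readings are hypothetical and not taken from the presentation; readings are assumed to be `(timestamp_in_seconds, value)` pairs.

```python
from collections import defaultdict
from statistics import mean


def one_in_x(readings, x):
    """Filter: keep only every x-th reading (hypothetical helper)."""
    return readings[::x]


def per_minute_summary(readings):
    """Reduce fine-grained readings to min/max/avg per minute (hypothetical helper)."""
    buckets = defaultdict(list)
    for timestamp, value in readings:
        # Integer division maps a timestamp in seconds to its minute bucket.
        buckets[timestamp // 60].append(value)
    return {
        minute: {"min": min(values), "max": max(values), "avg": mean(values)}
        for minute, values in sorted(buckets.items())
    }


# Made-up sample data: (seconds since start, sensor value)
readings = [(0, 20.0), (15, 22.0), (45, 21.0), (60, 30.0), (90, 32.0)]

sampled = one_in_x(readings, 2)
summary = per_minute_summary(readings)
print(sampled)
print(summary)
```

Both approaches discard detail on purpose: the original readings cannot be reconstructed from the result, which is what distinguishes logical from technical compression.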
  • 17. Big Data => Small ML Models
  • 20. Fast Data – Fast Insight: IoT Device Data; Raw Data; Event Hub; Streaming with Hot (Alerting) and Cold; Digital Twin; Machine Learning Models to apply to digital twin to predict maintenance need
  • 21. Data Volatility (the landscape and data items of slide 12, with a scale labelled high and low). The transcript lists these data items as a separate group: List of Products shown in UI; Spreadsheets with Sales records; WebShop activity, Social Media discussions, …; In Flight Messages; Events; Application & Infrastructure source history; Shopping Cart with selected items; Audit Trails, Security Incidents; Readings from motion detectors; Sales Aggregates by Day, Region, Product Category
  • 25. Location (the landscape and data items of slide 12, annotated with storage locations): Global Content Delivery Network; Offline Storage in Apps; Third party (SaaS) Git repo; Offsite Standby for Disaster Recovery; SaaS data store in Cloud; DaaS data store in Cloud; Application Server Memory (on site); Excel Sheets on employee laptops; Local storage on “Things” & Edge devices; Cloud storage for Database backups; Local Database Instance for each region
  • 26. Considerations around Location • Latency • Latency experienced by end-user is sum of latencies in the chain • Co-located – systems with chatty interaction • Storage cost • Network Transport costs • Ease of distribution • Background distribution may be acceptable – provided it happens frequently enough • Off line usage • Security • Data “en route”
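The first two latency bullets can be made concrete with a small calculation. This is only a sketch: the hop names and millisecond values are invented for illustration, and `end_to_end_latency` is a hypothetical helper, not something from the slides.

```python
# Hypothetical per-hop latencies in milliseconds; the numbers are illustrative only.
CHAIN_MS = {
    "browser -> CDN edge": 15,
    "edge -> application server": 40,
    "application server -> database": 2,
}

DB_HOP = "application server -> database"


def end_to_end_latency(chain_ms, db_calls=1):
    """Sum the hops; a chatty service repeats the database hop db_calls times."""
    fixed = sum(ms for hop, ms in chain_ms.items() if hop != DB_HOP)
    return fixed + db_calls * chain_ms[DB_HOP]


# Chatty interaction (50 database calls per request) with a co-located database.
co_located = end_to_end_latency(CHAIN_MS, db_calls=50)

# Same chain, but with the database moved to a remote site (assumed 30 ms per call).
remote_chain = {**CHAIN_MS, DB_HOP: 30}
far_apart = end_to_end_latency(remote_chain, db_calls=50)

print(co_located, far_apart)
```

With these assumed numbers the chatty hop dominates as soon as the two systems are no longer co-located, which is the argument for keeping chatty systems close together.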
  • 28. Streaming [Diagram: the enterprise data landscape, annotated with data in motion] • Synchronization of Devices coming online again • Upload of ML Models • Replaying transaction on standby database • Applications being deployed • Update of Data Warehouse • Laptops & USB sticks on the move • Raw IoT => Streaming Analysis => {alerts | digital twin | big data} • Customer sending complaint by email • Synchronization of SaaS from On Premises • Metrics from Apps | Platform | Infra to Log Stash & Monitor • Events moving to consumers • UI updates pushed to browser • Task notification sent to employee • Fresh Data pushed to Application Cache • Database Backup moved offsite
  • 29. Cost
  • 30. TC(D)O – Total Cost of Data Ownership • Business cost (missed opportunity, user dissatisfaction, …) of not having the data available • at all or fast enough or fresh enough Speed Freshness Available Compute Storage Network
  • 31. TC(D)O – Total Cost of Data Ownership • Direct cost of • Acquiring data • Storing Data • Storage (cheap and slow, expensive and quick) • Compression (less storage at expense of compute) • Retrieving Data • Compute resources • Cleansing, Calculating & Deriving data (DWH, ML Model, CQRS) • Compute resources • Transporting Data • Network traffic has a price tag (especially when out of local ‘area’)
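The "compression (less storage at expense of compute)" trade-off can be observed directly. The following is a minimal sketch using Python's standard `zlib` module; the payload is made up and deliberately repetitive, so the ratio it shows is not representative of real data.

```python
import time
import zlib

# Illustrative payload: repetitive log-like data compresses well.
raw = b"2018-02-06 INFO order accepted for customer 42\n" * 10_000

start = time.perf_counter()
packed = zlib.compress(raw, level=9)          # highest compression, most compute
compress_seconds = time.perf_counter() - start

ratio = len(packed) / len(raw)
restored = zlib.decompress(packed)            # reading the data costs compute again

print(f"{len(raw)} bytes -> {len(packed)} bytes "
      f"(ratio {ratio:.4f}), compressed in {compress_seconds:.4f}s")
```

Storage shrinks, but every write and every read now pays a compute cost, so the cheaper option depends on how often the data is retrieved.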
  • 32. TC(D)O – Total Cost of Data Ownership • Operational costs • Backup & Recovery • Security • Intellectual property • Life cycle management – slower tier, archive, purge • “Right to be forgotten” • Regulatory periods to hang on to data
  • 33. Open (APIs) & DaaS • Governments and NGOs, scientific and even commercial organizations are publishing data • Inviting anyone who wants to join in to help make sense of the data – understand driving factors, identify categories, help predict • Many areas • Economy, health, public safety, sports, traffic & transportation, games, environment, maps, …
  • 34. Live
  • 35. Real Life Background Batch Process (preparing letters for customers) Customers BPM Engine Processing cases for Customers
  • 36. Stale
  • 37. Stale • Data is a representation of the real world • All data is inherently stale • Except when it describes something that can not change – and whose description can not change • Staleness is probably not a problem • Except in self driving cars… • Run the end-of-year-report • Consistency is much more important
  • 38. Glimpses of the past Session 1 Session 2
  • 39. Glimpses of the past Session 1 Session 2
  • 42. Consistent: Move entire session to point in time
  • 43. Looking into the future… • Table OUR_PRODUCTS (NAME, PRICE) • select name, price from our_products
  • 44. Looking further into the future… • Table OUR_PRODUCTS (NAME, PRICE) • select name, price from our_products • begin DBMS_FLASHBACK_ARCHIVE.ENABLE_AT_VALID_TIME ( level => 'ASOF' , query_time => TO_TIMESTAMP('01-10-2018', 'DD-MM-YYYY') ); end;
  • 45. Current situation … • Table OUR_PRODUCTS (NAME, PRICE) • select name, price from our_products • begin DBMS_FLASHBACK_ARCHIVE.ENABLE_AT_VALID_TIME ( level => 'CURRENT' ); end;
  • 46. All data in the table (the default setting) • Table OUR_PRODUCTS (NAME, PRICE) • select name, price from our_products • begin DBMS_FLASHBACK_ARCHIVE.ENABLE_AT_VALID_TIME ( level => 'ALL' ); end;
  • 47. All data in the table (the default setting) • Table OUR_PRODUCTS (NAME, PRICE, START_DATE, END_DATE) • select name, price, start_date, end_date from our_products order by start_date • begin DBMS_FLASHBACK_ARCHIVE.ENABLE_AT_VALID_TIME ( level => 'ALL' ); end;
  • 48. Part of SQL 2011 standard: Temporal Database
  • 49. Make the database aware of the time based business validity of records • Add timestamp columns indicating start and end of valid time for a record • Specify a PERIOD for the table • Note: a table can have multiple sets of columns, describing multiple types of temporal business validity • create table our_products ( name varchar2(100) , price number(7,2) , start_date timestamp , end_date timestamp , PERIOD FOR offer_time (start_date, end_date) );
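To see what an "as of" valid-time query returns, the same idea can be emulated outside Oracle. This sketch uses Python's built-in `sqlite3`; SQLite has no `PERIOD FOR` clause, so the period is expressed as an explicit predicate. The rows, the half-open interval convention and the helper `products_as_of` are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table our_products (name text, price real, start_date text, end_date text)"
)
conn.executemany(
    "insert into our_products values (?, ?, ?, ?)",
    [
        ("Widget", 9.95, "2017-01-01", "2018-07-01"),  # offer valid today
        ("Widget", 11.50, "2018-07-01", None),         # future offer, open-ended
    ],
)


def products_as_of(conn, day):
    """Rows whose valid-time period [start_date, end_date) contains `day`."""
    return conn.execute(
        "select name, price from our_products "
        "where start_date <= ? and (end_date is null or end_date > ?) "
        "order by name",
        (day, day),
    ).fetchall()


today_view = products_as_of(conn, "2018-02-06")
future_view = products_as_of(conn, "2018-10-01")
print(today_view, future_view)
```

The query text stays the same for both calls; only the point in time changes, which is what the session-level valid-time setting achieves in the slides.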
  • 51. Data Constraints to protect integrity • Allowable values • Mandatory attributes • (Foreign Key) References • NULL • Constraints on • type • length • format • Spelling • Character encoding
  • 52. Data is a representation of the known real world • How useful is it to enforce data integrity?
  • 54. Data Integrity • Why? • Is it about truth? • About regulations and by-the-book? • Allow IT systems to run smoothly and not get confused? • About auditability and non-repudiation? • What about the real world? • Data in IT is just a representation; if the world is not by the book – what should IT do?
  • 55. Blockchain • Distributed • Across trusted business partners • Across public, anonymous parties • Immutable • Secured • Trusted • Smart Contracts • Operations on data (without human intervention)
  • 57. Graph Database • Natural fit during development • Superior (10-1000 times better) performance for queries that traverse relationships [Diagram: Person nodes connected by "liked by" edges; example query: find people liked by anyone liked by Bob]
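The example query on this slide is a two-hop traversal. A minimal sketch with a plain adjacency map shows its shape; the people other than Bob and the function name are invented, and this says nothing about how a real graph database executes the query.

```python
# Directed "likes" edges; names other than Bob are made up for the example.
LIKES = {
    "Bob": {"Alice", "Carol"},
    "Alice": {"Dave"},
    "Carol": {"Dave", "Erin"},
    "Dave": {"Bob"},
    "Erin": set(),
}


def liked_by_anyone_liked_by(person, likes):
    """Two-hop traversal: everyone liked by someone that `person` likes."""
    result = set()
    for friend in likes.get(person, set()):
        result |= likes.get(friend, set())
    return result


print(sorted(liked_by_anyone_liked_by("Bob", LIKES)))
```

Each hop is a direct lookup from a node to its neighbours, whereas a relational model would typically express each hop as a self-join.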
  • 58. From relational SQL to Graph query
  • 60. SQL vs NoSQL ACID vs BASE Relational vs …
  • 61. SQL is not good at anything • But it sucks at nothing
  • 62. Relational Databases • Based on relational model of data (E.F. Codd), a mathematical foundation • Uses SQL for query, DML and DDL • Transactions are ACID (Atomicity, Consistency, Isolation, Durability) • All or nothing • Constraint Compliant • Individual experience [in a multi-session environment] (aka concurrency) • Down does not hurt
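"All or nothing" and "constraint compliant" can be demonstrated in a few lines. This is a sketch using Python's built-in `sqlite3` rather than the Oracle database the deck refers to; the `account` table and the `transfer` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table account (id integer primary key, "
    "balance integer check (balance >= 0))"
)
conn.execute("insert into account values (1, 100), (2, 50)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Credit then debit inside one transaction; any failure rolls back both."""
    try:
        with conn:  # commits on success, rolls back on exception
            conn.execute(
                "update account set balance = balance + ? where id = ?", (amount, dst)
            )
            conn.execute(
                "update account set balance = balance - ? where id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, 1, 2, 30)          # succeeds
rejected = transfer(conn, 1, 2, 500)   # debit would violate the check constraint
balances = dict(conn.execute("select id, balance from account"))
print(ok, rejected, balances)
```

In the rejected transfer the credit had already been applied when the debit failed; the rollback removes it again, so no half-finished transfer is ever visible.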
  • 63. ACID comes at a cost • Transaction results have to be persisted [before the transaction completes] in order to guarantee D • Concurrency requires some degree of locking (and multi-versioning) in order to have I • Constraint compliance (unique key, foreign key) means all data hangs together (as do all transactions) in order to have C • Two-phase commit (across multiple participants) introduces complexity, dependencies and delays, yet required for A
  • 64. The holy grail of Normalization • Normalize to prevent • data redundancy • discrepancies (split brain) • storage waste • However: we should recognize the fact that some data is read far more frequently than it is created and modified
  • 65. The Relational Model in practice • Traditional Relational Data Model has severe impact on physical disk performance • Transaction Log => Sequential Write (append to file) • Data Blocks require much more expensive Random Access disk writes • Indexes (B-Tree, Bitmap, …) are used to speed up query (read) performance • and slow down transactions • Relational data does not [always] map naturally to the data format required in the application (OO, JSON, XML) • Capability to join and construct ad-hoc queries across the entire data model is powerful • Declarative integrity constraints allow for strict enforcement of data quality rules • “the data may be nonsensical, but at least it adheres to the rules”
  • 66. Databases re-evaluated • Not all use cases require ACID (or can afford it) • Read only (product catalog for web shops) • Inserts only and no (inter-record) constraints • Big Data collected and “dumped” in Data Lake (Hadoop) for subsequent processing • High performance demands • Not all data needs structured formats or structured querying and JOINs • Entire documents are stored and retrieved based on a single key • Sometimes – scalable availability and productivity are more important than Consistency – and ACID is sacrificed • CAP-theorem states: Consistency [across nodes], Availability and Partition tolerance cannot all three be guaranteed at the same time
  • 67. NoSQL and BASE • NoSQL arose because of performance and scalability challenges with traditional/relational approach in Web Scale operations • NoSQL is a label for a wide variety of databases that lack some aspect of a true relational database • ACID-ness, SQL, relational model, constraints • The label has been used since 2009 • Perhaps NoREL would be more appropriate • Some well known NoSQL products are • Cassandra, MongoDB, Redis, CouchDB, … • BASE as alternative to ACID: • basically available, soft state, eventually consistent (after a short duration)
  • 68. Typical for NoSQL • Focus on speed, availability and scalability • Horizontal scale out – distributed with load balancing and fail-over • No (predefined) Data Structure • Integrity primarily protected by application logic • Open Source (most offerings are, not all: MarkLogic) • Close(r) attention for how the data is used • Application oriented data format and search paths and specialized database per application (microservice, capability) • Similar to the switch from SOA to API/Microservice • Reads (far) more relevant than writes • Data redundancy & denormalization • No data access through SQL – well, …
  • 70. (leading) NoSQL Database products • MongoDB is (one of) the most popular (by any measure) • Cloud (only): • Google BigTable, • AWS DynamoDB • Cache (in memory) • ZooKeeper, Redis, Coherence, Memcached, Apache Ignite (pka GridGain), … • Hadoop/HDFS • Oracle NoSQL (built on Berkeley DB)
  • 71. NoSQL means: No Data Access through SQL • However • Data Professionals and Developers speak SQL • Reporting, Dashboarding, ETL, BI tools speak SQL • There is no common query language across NoSQL products
  • 72. No Data Access through SQL • However • Data Professionals and Developers speak SQL • Reporting, Dashboarding, ETL, BI tools speak SQL • There is no common query language across NoSQL products • Attempts from many vendors to create drivers that translate SQL statements into NoSQL commands for the specific target database • To protect existing investments in SQL – skills, tools, applications, reports, ..
  • 73. SQL vs NoSQL • SQL != RDBMS • SQL on top of • Hadoop – Spark SQL, Hive, Drill, Impala • “External Table” Text files, CSV, Excel • XML, JSON • KSQL on Kafka events • Google Spanner, BigQuery • NoSQL – Berkeley DB, HBase, Elasticsearch, MongoDB, Cassandra
  • 74. NoSQL (MongoDB) vs SQL (Oracle) • MongoDB: db.emp.find ( {"JOB":"SALESMAN"} , { ENAME:1 , SAL:1} ) .sort ( {'SAL':-1}) .limit(2) • Oracle SQL: select ename , sal from emp where job = 'SALESMAN' order by sal desc FETCH FIRST 2 ROWS ONLY
  • 75. NoSQL (MongoDB) vs SQL (Oracle) • MongoDB: db.emp.find ( {"JOB":"SALESMAN" , $where : " this.SAL + (this.COMM != null? this.COMM: 0) > 2000" } ) • Oracle SQL: select * from emp where job = 'SALESMAN' and sal + nvl(comm, 0) > 2000
  • 77. Why distributed? • Because it is • Business is physically spread out over multiple locations • To achieve • Scalability • Performance (parallelism, latency) • Resilience of the whole – availability (in the face of individual failure) • (site) Disaster recovery • Trust (e.g. blockchain) • Applies to data & processes
  • 78. Distributed [Diagram: the enterprise data landscape, annotated with distributed data stores and platforms] • Global Content Delivery Network • Offline Storage in Apps • Real Application Clusters • Distributed In Memory Cache: Hazelcast, MemCached, Redis, Coherence • Java EE Application Server Cluster • SETI • Local storage on “Things” & Edge devices • Active Standby Database • SAN • Cross Cloud/On Premises archive • Distributed Datastore: MongoDB, Cassandra, BigTable, HBase • Apache Spark Distributed Data Processing • Logical Data Shards in Oracle Database, MySQL, Elastic • HDFS Hadoop Distributed File System • Kubernetes Distributed Container Platform • Distributed Event Bus: Kafka
  • 79. Vertically Distributed Data Client Tier: Browser DOM/UI MVVM Middle Tier: Java EE (Stateful) application API API API Stateless
  • 80. Vertically Distributed Data Client Tier: Browser DOM/UI MVVM Middle Tier: Java EE (Stateful) application API API API Stateless
  • 82. Availability [Diagram: the enterprise data landscape, annotated with availability demands] • Global Content Delivery Network • Webshop 24/7 on line • Relaxed availability (office hours) for DWH • SaaS CRM less available than desired • Fairly high availability for [clusters of] things – not for individual things • Active Standby Database • SAN • Cross Cloud/On Premises archive • Low availability demands on Big Data • H/A for Oracle Database • EventBus 24/7 on line • H/A for IoT Hub • H/A for LDAP • H/A during extended office hours for human workflow engine • Service Bus 24/7 on line • Some loss of service is acceptable for recommendation engine
  • 83. Availability of Data • Availability: • unplanned downtime (incident => disaster) • planned (not desired) downtime (upgrade, patch to application, platform, infra) • Chain is as strong as the weakest link • Availability is determined by least available component • Datastore can drive (and help improve) availability of many systems/applications/services • Custom UI on top of SAP requires 99.95% up time – SAP only offers 98% • Increase availability • H/A architecture – multi-node cluster, hot standby and fail-over, disaster recovery • Rolling upgrades • Single node for command, multiple (independent) helpers for query
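The "weakest link" rule and the SAP example can be put into numbers. The sketch below assumes independent failures, so availabilities of components in a chain multiply; the 98% figure comes from the slide, while the UI and query-store availabilities and both helper functions are assumptions for illustration.

```python
from math import prod


def serial_availability(*components):
    """All components are needed, so (independent) availabilities multiply."""
    return prod(components)


def redundant_availability(single, nodes):
    """Service is up while at least one of `nodes` independent replicas is up."""
    return 1 - (1 - single) ** nodes


# Custom UI (assumed 99.99% itself) that calls a 98% back end on every request.
ui_on_backend = serial_availability(0.9999, 0.98)

# The same UI served from a replicated read-only query store (assumed 99.9% per node).
query_store = redundant_availability(0.999, nodes=2)
ui_on_query_store = serial_availability(0.9999, query_store)

print(f"{ui_on_backend:.4%}  {ui_on_query_store:.4%}")
```

Under these assumptions the chain can never be more available than its least available component, and putting a separate, replicated data store in front of that component is what lifts the UI above the 99.95% target.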
  • 84. Case of Web Shop • Webshop – 1M visitors per day • Product catalog consists of 15+ million records • The web shop presents: product description, images, reviews, pricing details, related offerings, stock status • Some products are added, updated and removed every day • Although most products do not change very frequently • Some vendors do bulk manipulation of product details [Diagram: Product updates flow into the Products store; Webshop visits (searches, product details, orders) read from it]
  • 85. Case of Web Shop – Usage Patterns & Architecture [Diagram: Product updates and Webshop visits (searches, product details, orders) both reach the Products store through the firewall] • Product updates: Data manipulation; Data Quality (enforcement); <10K transactions; Batch jobs next to online; Speed is nice • Webshop visits: Read only; On line; Speed is crucial; XHTML & JSON; > 5M visits
  • 86. Case of Web Shop – Usage Patterns & Architecture [Diagram: the Products store behind the firewall feeds read-only Products copies in the DMZ] • Product updates: Data manipulation; Data Quality (enforcement); <10K transactions; Batch jobs next to online; Speed is nice • Webshop visits: Read only; On line; Speed is crucial; XHTML & JSON; > 1M visits • Products copies in the DMZ: Read only; JSON documents; Images; Text Search; Scale Horizontally; Stale but consistent; Nightly generation
  • 87. CQRS
  • 88. CQRS – Multi Data Store [Diagram: a single Products store serves both Data Manipulation and Data Retrieval]
  • 89. CQRS – Multi Data Store [Diagram: Data Manipulation targets the Products store; Data Retrieval uses derived stores: Special Products, Product Clusters, Food Stuff, Toys, Quick Product Search Index, Product Store in SaaS app]
  • 90. CQRS in Oracle Database [Diagram: three Middleware instances on top of RAC nodes and Shards (12c R2), each with SAN storage, plus an Active Data Guard Standby; further labels: T, MV, idx, IMDB, dbf, SGA, Redo Logs]
  • 91. CQRS - Command and Query Responsibility Segregation • Data manipulation and retrieval in separate places • (physical data proliferation) • Query store is optimized for consumers • Level of detail, format, filters applied • For performance and scalability, independence, productivity, lower license fees and lower TCO, security
  • 92. Synchronizing the Query Stores [Diagram: Data Manipulation targets the Products store; Data Retrieval uses derived stores: Special Products, Product Clusters, Food Stuff, Toys, Quick Product Search Index, Product Store in SaaS app]
  • 93. Synchronizing the Query Stores • Depends on • Freshness requirements • Authorization demands • Cost of synchronizing the query store (full synchronize vs event based) • Usage pattern for query store • Facilities available in Command store (and in query stores) • Relative locations (e.g. cloud & on premises) • Mechanisms • Importing Database dump-file (periodic, full or partial) • Direct queries & DML • Change Data Capture from transaction logs • Event based
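Of the mechanisms listed, the event-based one is easy to sketch in memory. Everything below is hypothetical (class names, the `ProductChanged` event, the product data); it only illustrates a command store that enforces data quality and a query store that is shaped for one kind of read.

```python
from collections import defaultdict


class CommandStore:
    """Write side: validates changes and publishes an event per change."""

    def __init__(self):
        self.products = {}
        self.subscribers = []

    def upsert(self, product_id, name, category, price):
        if price < 0:
            raise ValueError("price must not be negative")
        self.products[product_id] = {"name": name, "category": category, "price": price}
        event = {"type": "ProductChanged", "id": product_id, **self.products[product_id]}
        for notify in self.subscribers:
            notify(event)


class SearchIndex:
    """Read side: denormalized word -> product ids lookup for quick search."""

    def __init__(self):
        self.names = {}
        self.by_word = defaultdict(set)

    def apply(self, event):
        # Remove the words of the previous name before indexing the new one.
        for word in self.names.get(event["id"], "").lower().split():
            self.by_word[word].discard(event["id"])
        self.names[event["id"]] = event["name"]
        for word in event["name"].lower().split():
            self.by_word[word].add(event["id"])

    def search(self, word):
        return sorted(self.by_word.get(word.lower(), set()))


commands = CommandStore()
index = SearchIndex()
commands.subscribers.append(index.apply)

commands.upsert(1, "Wooden Train", "Toys", 19.95)
commands.upsert(2, "Train Set Deluxe", "Toys", 49.00)
commands.upsert(1, "Wooden Truck", "Toys", 19.95)  # rename: the index must follow
print(index.search("train"), index.search("truck"))
```

Here the index is updated synchronously, so it is always fresh; with a real event bus the query store would lag behind, which is where the freshness requirements in the list above come in.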
  • 95. State is sum of changes Source: https://ookami86.github.io/event-sourcing-in-practice/#how-eventsourcing-works
  • 96. Take the UD out of CRUD • Introducing the Immu Table • A ledger of entity changes • With a timestamp or event sequence • And the entity identifier • And the new values of the added, changed, erased attributes • Each event is an immutable record that is appended to the ledger – just simply added to the end • Atomic, very cheap compared to Update and Delete – does not require a lock, and does not require random file access or rearranging blocks on disk • Example: Bank Account Change Event with fields Event Type, Timestamp, Account Id, Amount, (New value for) Owner, Erased: some attribute
  • 97. Event Log in Event Sourcing • Primary Data Source is ledger of change events • Not a store of the current state • However: optionally use snapshots of baseline (state up until time) • Entity Event Store replaces Table • Offers a simple API for creating and retrieving events • ‘Entity Change Event’ Producer (to which consumers can subscribe) • To correct a mistake: • Do not remove the event! (it happened, it may already have been distributed) • Instead, create a compensating event (and then it unhappened)
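The points on this slide, an append-only ledger, state derived from events, and correction by compensating event, can be sketched as follows. The event types, the account identifier and the class names are made up for the example.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Event:
    seq: int
    account_id: str
    event_type: str  # "Opened", "Deposited" or "Withdrawn"
    amount: int


class Ledger:
    """Append-only event store: events are added at the end, never changed."""

    def __init__(self):
        self._events = []
        self._seq = count(1)

    def append(self, account_id, event_type, amount=0):
        event = Event(next(self._seq), account_id, event_type, amount)
        self._events.append(event)
        return event

    def events_for(self, account_id):
        return [e for e in self._events if e.account_id == account_id]


def balance(events):
    """Current state is the sum of the changes."""
    total = 0
    for e in events:
        if e.event_type == "Deposited":
            total += e.amount
        elif e.event_type == "Withdrawn":
            total -= e.amount
    return total


ledger = Ledger()
ledger.append("NL01", "Opened")
ledger.append("NL01", "Deposited", 100)
ledger.append("NL01", "Deposited", 250)  # mistake: should have been 25
ledger.append("NL01", "Withdrawn", 250)  # compensating event, the original stays
ledger.append("NL01", "Deposited", 25)

history = ledger.events_for("NL01")
print(balance(history), len(history))
```

The mistaken deposit is still in the history, so any consumer that already received it can process the compensating event and arrive at the same balance.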
  • 98. Event Log • Audit Log • Time travel • Reconstruct system (application state) • Distributed application state • Support multiple (read) models • Easily construct a debugging environment – of the exact situation and time • What-if scenarios – take a copy, inject an event & play forward from there • State = sum of change events • State = snapshot plus sum of recent events • To synch application state: new state = current state + sum of events after the event version number on which the current state is based
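"State = snapshot plus sum of recent events" and time travel both reduce to replaying a slice of the log. A small sketch, with an invented event shape (`version`, `changes`, `erased`) and hypothetical helper names:

```python
def apply(state, event):
    """Fold one change event into a copy of the state (a dict of attributes)."""
    new_state = dict(state)
    new_state.update(event["changes"])
    for attr in event.get("erased", []):
        new_state.pop(attr, None)
    return new_state


def rebuild(events, snapshot=None):
    """State = snapshot (if any) plus all events after the snapshot's version."""
    if snapshot is None:
        state, version = {}, 0
    else:
        state, version = snapshot["state"], snapshot["version"]
    for event in events:
        if event["version"] > version:
            state = apply(state, event)
            version = event["version"]
    return {"state": state, "version": version}


events = [
    {"version": 1, "changes": {"owner": "Ann", "amount": 100}},
    {"version": 2, "changes": {"amount": 80, "note": "temp"}},
    {"version": 3, "changes": {"owner": "Bob"}, "erased": ["note"]},
    {"version": 4, "changes": {"amount": 95}},
]

full = rebuild(events)
snapshot = rebuild(events[:2])             # baseline taken after version 2
from_snapshot = rebuild(events, snapshot)  # only versions 3 and 4 are replayed
historical = rebuild(events[:3])           # "time travel" to version 3
print(full, historical)
```

Replaying from the snapshot gives the same result as replaying everything, which is why entries older than a snapshot are no longer needed for rebuilding current state.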
  • 99. To implement Event Sourcing • Take a data store • That is distributed, scalable, available • For example Apache Cassandra • Create an Event Log table [for each business entity] • Create columns for timestamp, event id, change [event] type, entity identifier • Create columns for all attributes or a single column to hold a document (e.g. JSON) • A special change type can be ‘snapshot’ to specify a baseline • Entries older than the snapshot are then no longer needed in the event log
  • 102. What is IT all about? Application Production Runtime
  • 103. What is IT all about? Application Production Runtime Platform
  • 104. What is IT all about? Application Platform Production Runtime Operations Monitoring & Management
  • 105. One team has Agile responsibility through full lifecycle Application Platform Production Runtime Operations Monitoring & Management Application Preparation Runtime Platform Development CD Agile Design, Build, Test
  • 106. One team has Agile responsibility through full lifecycle Application Platform Production Runtime Operations Monitoring & Management Application Preparation Runtime Platform Development CD Agile Design, Build, Test
  • 107. One team has Agile responsibility through full lifecycle Application Platform Application Platform
  • 108. DevOps team owns and runs one (or more) products Application Platform Generic Infrastructure Platform for running DevOps Products Floorspace, Power, Cooling, Storage, Compute Monitoring, Management, Cache, Authentication, RDBMS, Event Hub
  • 109. Multiple products from multiple teams run on a shared generic infrastructure Generic Infrastructure Platform for running DevOps Products Floorspace, Power, Cooling, Storage, Compute Monitoring, Management, Cache, Authentication, RDBMS, Event Hub Application Platform Application Platform Application Platform Application Platform Application Platform
  • 110. App plus platform under DevOps == Microservice Generic Infrastructure Platform for running DevOps Products µ µ µ µ µ
  • 111. App plus platform under DevOps == Microservice • Stateless • Horizontally scalable • Mutually Independent • upgrade, patch, relocate • Can expose Public API (HTTP/REST) and/or UI • Communicate with each other through events • Have their own bounded data context • Do not rely on other microservices [for the data they need] • Serverless – do not require allocated server, can be fired up [Diagram: microservices (µ) running on the Generic Infrastructure Platform for running DevOps Products]
  • 112. Microservices - objectives • Minimize cost of change • Maximize agility • Isolate responsibility • Reduce coupling by minimizing dependencies • logical, technical and runtime • only standardized communication/interaction • Independent, scalable processes • Choreography (broadcast) preferred over Orchestration (direct call) • Efficient operations • Comprehensible, controllable IT • How do we get from a Monolith to Microservices?
  • 113. Data in microservices • Microservices are stateless & horizontally scalable • Microservices are isolated & independent • Where is their data? • What about lookup data? • Data not owned by the microservice – but still required by it to perform its role => bounded context
  • 114. Microservices State Cache RDBMS Document Store NoSQL Generic Platform for running microservices Event Hub Big Data Block Storage LDAP
  • 115. Bounded context in microservices • Microservice needs to be able to run independently • It needs to contain & own all data required to run • It cannot depend on other microservices [Diagram: the Customer microservice (API) publishes a CustomerModified event that the Order microservice (API, UI) consumes]
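The Customer and Order example on this slide can be sketched with an in-process event bus. `EventBus` is a toy stand-in for a real event hub, and the attribute names and sample customer are assumptions; only the `CustomerModified` event name comes from the slide.

```python
class EventBus:
    """Tiny in-process stand-in for an event hub."""

    def __init__(self):
        self.handlers = {}

    def subscribe(self, event_type, handler):
        self.handlers.setdefault(event_type, []).append(handler)

    def publish(self, event_type, payload):
        for handler in self.handlers.get(event_type, []):
            handler(payload)


class CustomerService:
    """Owns customer data and announces every change."""

    def __init__(self, bus):
        self.bus = bus
        self.customers = {}

    def modify(self, customer_id, name, city):
        self.customers[customer_id] = {"name": name, "city": city}
        self.bus.publish("CustomerModified", {"id": customer_id, "name": name, "city": city})


class OrderService:
    """Keeps only the customer attributes it needs, in its own store."""

    def __init__(self, bus):
        self.customer_names = {}  # derived state: the bounded context
        self.orders = []
        bus.subscribe("CustomerModified", self.on_customer_modified)

    def on_customer_modified(self, event):
        self.customer_names[event["id"]] = event["name"]

    def place_order(self, customer_id, item):
        # No call to CustomerService: the local copy is enough.
        order = {"customer": self.customer_names[customer_id], "item": item}
        self.orders.append(order)
        return order


bus = EventBus()
customers = CustomerService(bus)
orders = OrderService(bus)
customers.modify(7, "Jane Doe", "Utrecht")
customers.modify(7, "Jane Smith", "Utrecht")
order = orders.place_order(7, "Toy train")
print(order)
```

The Order service keeps working when the Customer service is down, at the price of holding a copy of the customer name that may briefly be stale.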
  • 117. Wrap Up • Data used to be like the Ford Model T • One model, one color • And then:
  • 118. Wrap Up • Data comes in many shades (at least 50) – variations along many dimensions: • usage • Total Cost of Data Ownership • authorization • distribution • format • volatility • volume • ACID demands • availability • freshness requirements (staleness allowance) • location • speed • ownership • required consistency
  • 119. Wrap Up • Some form of CQRS is plain common sense • Use fitting technology for the query challenge at hand • Graph, Document, Relational, Key/Value, Column, Elastic Index, … • Every organization will (should) have multiple data stores in various technologies – and not just relational SQL • Design & implement mechanism to synchronize the query stores • Events are attractive: decoupled, fine grained and fast • Devise a purging strategy • Stop carrying around your data legacy
  • 120. Wrap Up • All data is stale • Consistency should be your main concern • Microservices are stateless • They can own state – in their private data store • And maintain derived state – bounded context • Events are published to allow microservices to synch their context • Event Sourcing reduces complexity • CRUD => CR • Keep a ledger of data changes (book keeping of DML transactions) • Reconstruct state – current or historical – from events (into query store)
  • 121. Wrap Up • Data Integrity may be overrated • Instead of enforcing constraints (reality may not be so clean) – identify anomalies in data and act on them • SQL sits on top of the world • SQL [like query languages] run against a wide array of data stores, including Streams, Big Data, NoSQL and CSV / Excel • People and tools know SQL – make use of that • Machine Learning and Artificial Intelligence are fueled by data • They make the smallest, rawest, silliest piece of data potentially valuable
  • 124. Thank you! • Blog: technology.amis.nl • Email: lucas.jellema@amis.nl • @lucasjellema • lucas-jellema • www.amis.nl, info@amis.nl

Editor's Notes

  1. Fast data arrives in real time and potentially in high volume. Rapid processing, filtering and aggregation are required to ensure timely reaction and up-to-date information in user interfaces. Doing so is a challenge; making it happen in a scalable and reliable fashion is even more interesting. This session introduces Apache Kafka as the scalable event bus that takes care of the events as they flow in, and Kafka Streams and KSQL for the streaming analytics. Both Java and Node applications are demonstrated that interact with Kafka and leverage Server-Sent Events and WebSocket channels to update the Web UI in real time. User activity performed by the audience in the Web UI is processed by the Kafka-powered back end and results in live updates on all clients.
     Outline:
     - Introducing the challenge: fast data, scalable and decoupled event handling, streaming analytics
     - Introduction of Kafka
     - Demo of producing to and consuming from Kafka in Java and Node.js clients
     - Intro to the Kafka Streams API for streaming analytics
     - Demo of streaming analytics from a Java client
     - Intro to the Web UI: HTML5, WebSocket channel and SSE listener
     - Demo of push from server to Web UI (in general)
     End-to-end flow:
     - IFTTT picks up Tweets and pushes them to an API that hands them to a Kafka topic.
     - The Java application consumes these events, performs streaming analytics (grouped by hashtag, author and time window) and counts them; the aggregation results are produced to Kafka.
     - The Node.js application consumes these aggregation results and pushes them to the Web UI.
     - The Web UI displays the selected Tweets along with the aggregation results.
     - In the Web UI, users can like and rate the Tweets; each like or rating is sent to the server and produced to Kafka. These events are also processed through streaming analytics and result in updated like counts and average ratings, which are then pushed to all clients. This means that the audience can Tweet, see the Tweet appear in the Web UI on their own device, rate and like it, and see the ratings and like counts update in real time.
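The aggregation step in the flow above (counting Tweets per hashtag, author and time window, and keeping like counts and average ratings up to date) can be sketched in plain Python. This is only an illustration of the windowed-aggregation idea, not the Kafka Streams API and not the code used in the demo. The event field names (`ts`, `author`, `hashtags`, `tweet_id`, `type`, `value`) and the 60-second window are assumptions made for the example.

```python
from collections import defaultdict

WINDOW_SECONDS = 60  # assumed window size; the notes do not state one


def window_start(timestamp, size=WINDOW_SECONDS):
    """Return the start of the tumbling window that contains timestamp."""
    return timestamp - (timestamp % size)


def count_by_hashtag(tweets, size=WINDOW_SECONDS):
    """Count tweets per (window start, hashtag, author)."""
    counts = defaultdict(int)
    for tweet in tweets:
        for tag in tweet["hashtags"]:
            key = (window_start(tweet["ts"], size), tag, tweet["author"])
            counts[key] += 1
    return dict(counts)


class FeedbackAggregate:
    """Running like count and average rating for a single tweet."""

    def __init__(self):
        self.likes = 0
        self.rating_sum = 0
        self.rating_count = 0

    def apply(self, event):
        # Hypothetical event shapes: {"type": "like"} or {"type": "rate", "value": n}
        if event["type"] == "like":
            self.likes += 1
        elif event["type"] == "rate":
            self.rating_sum += event["value"]
            self.rating_count += 1
        return self

    @property
    def average_rating(self):
        if self.rating_count == 0:
            return None
        return self.rating_sum / self.rating_count


def aggregate_feedback(events):
    """Fold like and rate events into one aggregate per tweet id."""
    aggregates = defaultdict(FeedbackAggregate)
    for event in events:
        aggregates[event["tweet_id"]].apply(event)
    return dict(aggregates)
```

In a real deployment the input lists would be replaced by records consumed from Kafka topics, and each updated aggregate would be produced to an output topic for the Node.js application to push to the browsers.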
  2. https://specify.io/concepts/microservices
  3. https://specify.io/concepts/microservices
  4. https://specify.io/concepts/microservices
  5. https://specify.io/concepts/microservices
  6. https://specify.io/concepts/microservices
  7. https://specify.io/concepts/microservices
  8. http://morocco.opendataforafrica.org/
  9. https://specify.io/concepts/microservices
  10. https://specify.io/concepts/microservices
  11. https://specify.io/concepts/microservices
  12. https://specify.io/concepts/microservices
  13. https://specify.io/concepts/microservices
  14. https://specify.io/concepts/microservices
  15. https://specify.io/concepts/microservices
  16. https://specify.io/concepts/microservices
  17. https://specify.io/concepts/microservices
  18. https://specify.io/concepts/microservices
  19. https://specify.io/concepts/microservices
  20. https://specify.io/concepts/microservices
  21. https://specify.io/concepts/microservices
  22. https://specify.io/concepts/microservices
  23. https://specify.io/concepts/microservices
  24. https://specify.io/concepts/microservices
  25. https://specify.io/concepts/microservices
  26. https://specify.io/concepts/microservices
  27. https://specify.io/concepts/microservices
  28. https://specify.io/concepts/microservices
  29. https://specify.io/concepts/microservices
  30. https://specify.io/concepts/microservices
  31. https://specify.io/concepts/microservices
  32. https://specify.io/concepts/microservices
  33. https://specify.io/concepts/microservices
  34. https://specify.io/concepts/microservices
  35. https://specify.io/concepts/microservices
  36. https://specify.io/concepts/microservices
  37. https://specify.io/concepts/microservices
  38. https://specify.io/concepts/microservices
  39. https://specify.io/concepts/microservices
  40. Sources:
     - https://www.infoq.com/articles/microservices-aggregates-events-cqrs-part-1-richardson
     - http://microservices.io/patterns/data/event-sourcing.html
     - http://blog.kontena.io/event-sourcing-microservices-with-kafka/
     - https://specify.io/concepts/microservices
     - CQRS and Event Sourcing Applications with Cassandra - https://www.youtube.com/watch?v=3t8EUDiPfMQ
     - Martin Fowler - https://www.youtube.com/watch?v=aweV9FLTZkU
     - https://youtu.be/9a1PqwFrMP0?t=14m28s – Hospital – admit, transfer, transfer, discharge
     - Greg Young - https://www.youtube.com/watch?v=kZL41SMXWdM
  41. https://ookami86.github.io/event-sourcing-in-practice/#how-eventsourcing-works
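The hospital example mentioned in note 40 (admit, transfer, transfer, discharge) lends itself to a small event-sourcing sketch: the event log is the source of truth, and the current state is derived by replaying it. The code below is a minimal illustration under assumptions, not taken from any of the linked sources; the event fields and ward names are hypothetical.

```python
INITIAL_STATE = {"admitted": False, "ward": None}


def apply_event(state, event):
    """Return the new patient state after applying one event."""
    kind = event["type"]
    if kind == "admit":
        return {"admitted": True, "ward": event["ward"]}
    if kind == "transfer":
        return {**state, "ward": event["ward"]}
    if kind == "discharge":
        return {"admitted": False, "ward": None}
    raise ValueError(f"unknown event type: {kind}")


def replay(events, upto=None):
    """Rebuild state from the first `upto` events (all events if None)."""
    state = dict(INITIAL_STATE)
    for event in events[:upto]:
        state = apply_event(state, event)
    return state


# Hypothetical event log for one patient; it is only ever appended to.
patient_log = [
    {"type": "admit", "ward": "ER"},
    {"type": "transfer", "ward": "ICU"},
    {"type": "transfer", "ward": "Cardiology"},
    {"type": "discharge"},
]
```

Calling `replay(patient_log)` yields the state after discharge, while `replay(patient_log, upto=2)` reconstructs the state as it was after the first transfer. Being able to answer such "what was the state at that point" questions is one of the usual arguments for keeping the events rather than only the latest state.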
  42. https://specify.io/concepts/microservices
  43. All data stores are distributed, or at least available in a distributed fashion. They can be local or in the cloud (latency is important). Data in a generic data store is still owned by only one microservice; no one else can touch it. Only in the DWH and Big Data stores do we deliberately take copies of data and disown them.