Put machine data to work

August 20, 2018
Anatomy of the
Crate Machine Data Platform
Crate.io
Logistics…
• Submit questions at any time via the questions panel

• Slides & recording will be shared via email after the event
Agenda
–
• The Era of Machine Data & AI…

• New Data Management Challenges

• Anatomy of a (the) Machine Data Platform

• Questions & answers
Business Visionaries Build “Smart” with AI & Machine Learning
—
• Smart Factories that course correct every
minute, not once a month

• Smart Cities infrastructure that improves the
quality of living and the environment

• Smart Vehicles that decrease fleet costs and
increase public safety

• Smart Products that enrich the customer
experience
Smart Systems Go Beyond Visualization. They Put Data to Work
–
Machine Data (sensory stimuli)
Analyze (real-time, AI, data science)
Immediate Action (control, alert, predict)
Sensors, Health, Mobile, Security, Logistics, Manufacturing, Automotive
ALPLA - Smart Factory

–
• $4B global plastic packaging manufacturer
• Centralized “mission control”
- Informed by connected machines
• 1+ million sensors, across 1500 product lines
- Predictive maintenance & alerting
- Augmented reality connects mission control & factory floor
• Rolled out to 30 plants within 12 months:
- Reduced workforce turnover & on-boarding cost
- Lower raw materials waste
- Increased overall equipment effectiveness (OEE)
“It’s incredibly powerful.
Continuous production data
guides decision-making on the
floor in the moment.”

Philipp Lehner, CEO Alpla, USA
Smart Systems, AI = Huge Appetite for Data
–
• Data Variety
- 950 different sensor types
- Operator logs (natural language)
- Material suppliers
- Operator (HR) data
• Data Volume
- 100s of data points per bottle
- Millions of bottles per day
Smart Systems Machine Data Use Case
Firehose of complex data in real time at the edge + cloud
Data Layer Choice is Critical to Success
–
The right choice will…
• Scale easily
• Handle diverse data
• Perform fast
• Run in the cloud or on premises
• Produce value sooner
• Lower costs and risk
The wrong choice will…
• Delay time to market/value
• Cost more to build and operate
• Not integrate easily
• Be hard to staff/hire
• Create lock-in risk
• Not handle new requirements
Machine Data (sensory stimuli)
Analyze (real-time, AI, data science)
Immediate Action (control, alert, predict)
Sensors, Health, Mobile, Security, Logistics, Manufacturing, Automotive
?
Machine Data Layer Misconceptions…
–
Requirements
• Million data points/min
• Complex data (JSON)
• Complex analyses
• Sub-second queries
Myth: It’s not a new data problem
Reality: Traditional SQL DBs are too slow to build & run
Myth: It’s a time series problem
Reality: Time series is a need, but TSDBs are too specialized and hard to extend
Myth: It’s a NoSQL problem
Reality: It used to be, but now SQL DBs can handle it
Myth: We can’t start until our data is cleansed
Reality: Start now; cleanse as you go
Put data to work (even faster)
CrateDB & Crate.io Machine Data Platform
–
Put Machine Data to Work - CrateDB & MDP (Machine Data Platform)
–
- Millions of data points in realtime

- Complex data & queries

• JSON, Relational, Geospatial, Fulltext

• Aggregate/time series

• Enable AI / machine learning / Predictive

- Real-time performance

- Edge & cloud deployment
Machine Data
Analyze
Immediate Action
Crate.io Advantages
–
• Ease of use, staffing, integration (SQL!)

• Easy to meet new requirements

• Fast performance, especially multi-user

• Lower operational and development cost

• Faster time to value
CrateDB vs. Crate.io Machine Data Platform
—
CrateDB
• Just a [fabulous] database
• On-prem, or hosted cloud service
• For teams with distributed/cloud-native systems experience
Crate.io Machine Data Platform
• Entire data layer (built on CrateDB)
• Hosted cloud service
• For teams…
- With less distributed systems experience
- In a hurry to get to market
CrateDB - Built for Smart Systems
–
• Distributed SQL with search, time series, geospatial, aggregations
• Cloud-native architecture
• On Prem, Edge, Cloud
• NoSQL storage & clustering and Machine Data specific features
• Columnar caches for real-time, in-memory SQL query performance
• Shared-nothing architecture
100+ Global Production CrateDB Customers
–
Anatomy of an IoT Platform
–
Use-case specific:
• Industrial
• Energy
• Vehicular
• IT Ops
Commodity:
• Yet, difficult & costly
• Undifferentiated (unless you build it wrong)
• Hosting infrastructure

• Stack architecture design

• Database design

• Container platform

• Security and identity
management

• Logging & monitoring

• Backup & restore

• DevOps tooling

• Scaling

• 24x7 operations

• Host & scale ingestion

• Host & scale analytics

• Host & scale enrichment

• Host & scale alerting
Logic/Code
Visualization
Analysis
Action/Alerting
Data Enrichment
Data Management
Crate.io Machine Data Platform
–
• Eliminates delay, cost, risk of machine data layer
• Operated 24x7 by Crate.io
• Powered by CrateDB, plus:
- Data ingestion service
- Data aging/archival
- Backup & restore
- Identity management & security
- Systems monitoring & logging
- Kubernetes environment scales user code with DBMS
• Available on Microsoft Azure (AWS, on-premises available)
[Diagram: IoT Platform stack]
DEVELOP: Dashboards, Data Science (AI/ML), Enrichment, Notification & Control, Device Mgmt.
DEPLOY, HOST, SCALE, SECURE (Machine Data Platform):
- Ingest (MQTT, SQL, Bulk)
- Database
- Data Aging & Restore
- Enrichment, analysis, visualization, alerting microservices (developed by user, hosted & scaled by Crate.io)
- ID Management
- Container Deployment, Scaling & Security
- Monitoring & Logging
Infrastructure: servers, networking
Crate.io Machine Data Platform: Ingest
–
• MQTT standard based (Timestamp, Topic, Payload)
• Payload is JSON based; nesting is possible
• Queuing included
• Create multiple MQTT endpoints, secured with username/password via HTTP
• Ingest data into raw data storage based on CrateDB
• Enrichment services consume the raw data storage and move data into the metrics data store
MQTT Raw
Ingest
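
The (Timestamp, Topic, Payload) envelope described above can be sketched in a few lines of Python. This is a minimal illustration only: the field names, the topic string, and the helper `make_raw_record` are assumptions made for the example, not the platform's documented wire format.

```python
import json
from datetime import datetime, timezone


def make_raw_record(topic, payload, ts=None):
    """Wrap a payload in a (timestamp, topic, payload) envelope.

    Hypothetical helper: the key names are illustrative assumptions.
    """
    if ts is None:
        ts = datetime.now(timezone.utc)
    return {
        "timestamp": ts.isoformat(),
        "topic": topic,
        "payload": payload,  # JSON-compatible; nesting is allowed
    }


record = make_raw_record(
    "plant1/line7/filler",  # hypothetical topic name
    {"sensor": {"type": "temperature", "value": 71.5}, "tags": ["hot"]},
    ts=datetime(2018, 8, 20, 12, 0, tzinfo=timezone.utc),
)

# Serialized form, as it might be handed to an MQTT client for publishing.
message = json.dumps(record)
```

The payload here nests an object and an array to mirror the "nesting possible" point; publishing the message would require an MQTT client library, which is left out of this sketch.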
Crate.io Machine Data Platform: Enrichment
–
• Service transforms raw input data into a customized/normalized format
- Same sensor, different types
- Events data
- Create/update configuration items
- etc.
• Transformed/enriched data is stored in the metric table or the configuration service; each can be set up with an individual schema that best fits the use case
• Use-case specific, configurable with Python and standard SQL
• Integrated user management
MQTT Raw Enriched
Ingest Enrichment
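
As a rough illustration of the "same sensor, different types" case, the following Python sketch normalizes a raw temperature record into one metric format. Everything here is hypothetical: the record layout, the `unit` field, the topic structure, and the `enrich` function are assumptions for the example, not the platform's actual enrichment API.

```python
def enrich(raw):
    """Normalize a raw temperature record into a flat metric row.

    Assumes (hypothetically) that some sensor types report Fahrenheit
    and others Celsius, flagged by a "unit" field in the payload.
    """
    payload = raw["payload"]
    value = float(payload["value"])
    if payload.get("unit", "C") == "F":
        value = (value - 32.0) * 5.0 / 9.0  # Fahrenheit to Celsius
    # Hypothetical topic layout: <plant>/<line>/<machine>
    line = raw["topic"].split("/")[1]
    return {
        "ts": raw["timestamp"],
        "line": line,
        "metric": "temperature_c",
        "value": round(value, 2),
    }


raw_record = {
    "timestamp": "2018-08-20T12:00:00+00:00",
    "topic": "plant1/line7/filler",
    "payload": {"value": 212, "unit": "F"},
}
metric_row = enrich(raw_record)
```

The resulting row has a fixed shape regardless of which sensor type produced it, which is what makes it suitable for a normalized metrics table.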
Crate.io Machine Data Platform: Metrics Data Store/Config
–
• Dedicated storage based on CrateDB
• Configuration Service: holds contextual information on products, product variants, and production line configuration for interpreting sensor values
• Metrics: normalized store of sensor data, events, and planning data for consumption by applications or the configuration service API
• Integrated user management
MQTT Raw Enriched
Ingest Enrichment
Crate.io Machine Data Platform: Alerting
–
• Configurable Alerting/Notes service, developed in Python and standard SQL
• Considers Configuration Service context and sensor data
• Notifies the centralized unit / shopfloor agents based on decision rules
MQTT Raw Enriched
Ingest Enrichment Analytics &

Alerting
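
A decision rule that combines configuration context with sensor data might look like the sketch below. The configuration dictionary, the threshold key `max_temperature_c`, and the `check_alerts` function are invented for illustration; the real service's rule format is not described in the slides.

```python
# Hypothetical configuration context: one temperature limit per production line.
CONFIG = {"line7": {"max_temperature_c": 80.0}}


def check_alerts(metric, config):
    """Return alert messages for a metric row that violates its line's limit."""
    limit = config.get(metric["line"], {}).get("max_temperature_c")
    if limit is not None and metric["value"] > limit:
        return [
            f"{metric['line']}: temperature {metric['value']} C exceeds {limit} C"
        ]
    return []  # no rule configured, or value within limits


alerts = check_alerts({"line": "line7", "value": 85.0}, CONFIG)
no_alerts = check_alerts({"line": "line7", "value": 70.0}, CONFIG)
```

Keeping the thresholds in configuration rather than in code means the same rule can serve lines with different limits; delivering the messages to mission control or shopfloor agents is outside this sketch.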
Crate.io Machine Data Platform: ID Management/Logging
–
• Integrated logging and monitoring of all services
• Visible within the Crate.io Management UI dashboard for each Crate.io MDP cluster
• User management and authentication for services are included within this framework
[Diagram: Machine Data Platform stack]
DEPLOY, HOST, SCALE, SECURE (Machine Data Platform):
- Ingest (MQTT, SQL, Bulk)
- Database
- Data Aging & Restore
- Enrichment, analysis, visualization, alerting microservices (developed by user, hosted & scaled by Crate.io)
- ID Management
- Container Deployment, Scaling & Security
- Monitoring & Logging
Machine Data Ingestion
–
Simple raw machine data
ingestion format:

• Timestamp

• Topic

• Payload
Crate Machine Data Platform
MQTT, SQL, Bulk
Raw Enriched
Data Aging Rules
S3/Cold Storage
Payload = simple or complex structures: objects, nested objects, arrays, arrays of objects, …
Enrichment…

• Cleansing

• Calculations /
aggregations

- Bottles per day

- Downtime
Ingest Analytics &

Alerting
Enrichment
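
The two example calculations named above, bottles per day and downtime, can be sketched in plain Python. The event layout (a `bottles` count per timestamped event) and the rule that a gap longer than `max_gap` seconds counts as downtime are assumptions made for this illustration only.

```python
from collections import Counter
from datetime import datetime


def bottles_per_day(events):
    """Sum the hypothetical 'bottles' count of each event per calendar day."""
    counts = Counter()
    for event in events:
        day = datetime.fromisoformat(event["timestamp"]).date().isoformat()
        counts[day] += event["bottles"]
    return dict(counts)


def downtime_seconds(timestamps, max_gap=60):
    """Total the gaps between consecutive readings that exceed max_gap seconds.

    Assumption: a machine that stops reporting for longer than max_gap
    is treated as down for the whole gap.
    """
    times = sorted(datetime.fromisoformat(t) for t in timestamps)
    total = 0.0
    for earlier, later in zip(times, times[1:]):
        gap = (later - earlier).total_seconds()
        if gap > max_gap:
            total += gap
    return total


events = [
    {"timestamp": "2018-08-20T08:00:00", "bottles": 100},
    {"timestamp": "2018-08-20T09:00:00", "bottles": 50},
    {"timestamp": "2018-08-21T08:00:00", "bottles": 70},
]
daily = bottles_per_day(events)
down = downtime_seconds(
    ["2018-08-20T08:00:00", "2018-08-20T08:00:30", "2018-08-20T08:05:30"]
)
```

In a deployed system such aggregations would more likely run as SQL over the metrics store; the Python version is only meant to make the logic explicit.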
Next Steps…
–
CrateDB
Enterprise
CrateDB
Community
CrateDB only
CrateDB + Ingestion
CrateDB + Ingestion + Data science + Action
Crate.io
Machine Data Platform
CrateDB
downloads
• Crate for Smart Systems

- Fastest time to value

- Easiest to use

- Most versatile

- Simple to scale 

• Download CrateDB, try it
for free

• Get a Machine Data
Platform account
https://crate.io
Thank you

CrateDB Machine Data Platform Webinar
