© 2015 MapR Technologies, confidential
IoT—a superset of the Internet
What is the IoT?
“The idea of an all-encompassing and ubiquitous network of devices to
facilitate co-ordination and communication between the devices
themselves as well as between the devices and human end-users. The
involved devices are typically constrained devices such as RFID
sensors, but may also be more sophisticated ones like smartphones.”
Devices and their deployment: thin vs. fat, stationary vs. mobile
The IoT landscape: infrastructure, data, apps
Orientation (iot.eclipse.org): new and existing devices, IoT gateways, network/wireless services, backend systems
Categorization & use cases
Categorization & use cases: Personal IoT
The scope is a single person: for example, a smartphone equipped
with a GPS sensor, or a fitness device that measures the heart and
shares this data with the wearer's GP. One of the fastest-growing,
largely consumer-oriented areas of IoT.
Use cases and apps
• Quantified self
• Smart jackets
• Personal digital assistant
Categorization & use cases: Group IoT
Focuses on a small group of people, for example a family in the
context of a smart home where the deployed sensors capture
temperature and lighting conditions for optimal comfort. One of the
most challenging areas, and still in its early days.
Use cases and apps
• Smart homes
• Proactive/predictive car maintenance
• Interactive tourism
Categorization & use cases: Community IoT
Considers a large group of people, potentially tens of thousands,
usually in the context of public infrastructure, such as smart cities.
A somewhat immature IoT area from a commercial point of view, but
a potentially promising one.
Use cases and apps
• Smart cities
• Health care (monitoring, trackers)
Categorization & use cases: Industrial IoT
Scope can be either within an organization or between
organizations and/or individuals. This is arguably the most
established and mature part of IoT; see also M2M.
Use cases and apps
• Smart factory
• Retailer supply chain
• Agriculture
• Waste management
IoT lends itself to a ‘Big Data’ approach
Scaling out on commodity hardware,
in a schema-on-read fashion,
over community-defined interfaces
• Volume: store all incoming sensor data for historical references
• Variety: dozens of data formats in use in the IoT world, none of them relational
• Velocity: many devices generate data at a high rate; usually data streams
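Schema-on-read can be illustrated with a minimal sketch: raw sensor records are stored exactly as they arrive, and a schema is applied only when a query reads them. The record layouts, field names, and the `read_with_schema` helper are hypothetical and not taken from any particular IoT product.

```python
import json

# Raw records are appended as-is; formats differ per device (hypothetical examples).
raw_store = [
    '{"device": "thermo-1", "ts": 1, "temp_c": 21.5}',
    '{"device": "gps-7", "ts": 2, "lat": 53.27, "lng": -9.05}',
    '{"device": "thermo-1", "ts": 3, "temp_c": 22.0, "battery": 0.8}',
]

def read_with_schema(lines, fields):
    """Apply a schema at read time: keep only records that have every requested field."""
    for line in lines:
        record = json.loads(line)
        if all(f in record for f in fields):
            yield {f: record[f] for f in fields}

# The "schema" is chosen by the reader, not enforced when the data was written.
temperatures = list(read_with_schema(raw_store, ["device", "ts", "temp_c"]))
print(temperatures)
```

A different reader can project the same raw store onto `["lat", "lng"]` without any change to how the data was written.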
My IoT toolbox
Apache Kafka
• A high-throughput, distributed,
persistent publish-subscribe
messaging system
• Originates from LinkedIn
• Typically used as a buffer/decoupling
layer in online stream processing
kafka.apache.org
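The buffer/decoupling role can be sketched with a toy in-memory log. This is not the Kafka client API: `TopicLog`, `append`, and `read` are hypothetical names that only illustrate how a persistent, offset-addressed log lets producers and consumers run at different speeds.

```python
class TopicLog:
    """Toy append-only log: messages are kept, and each consumer tracks its own offset."""

    def __init__(self):
        self._messages = []
        self._offsets = {}  # consumer name -> next offset to read

    def append(self, message):
        """Producer side: append and return the offset assigned to the message."""
        self._messages.append(message)
        return len(self._messages) - 1

    def read(self, consumer, max_messages=10):
        """Consumer side: return unread messages and advance this consumer's offset."""
        start = self._offsets.get(consumer, 0)
        batch = self._messages[start:start + max_messages]
        self._offsets[consumer] = start + len(batch)
        return batch


log = TopicLog()
for reading in (20.1, 20.4, 20.9):
    log.append({"sensor": "t1", "value": reading})

fast = log.read("alerting")                   # a fast consumer catches up fully
slow = log.read("archiver", max_messages=1)   # a slow consumer reads at its own pace
```

Because messages stay in the log after being read, the slow consumer can pick up where it left off without affecting the fast one.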
Fluentd
Data collector for unified logging layer
www.fluentd.org
Apache Storm
• Distributed, fault-tolerant stream-
processing platform
• Guaranteed message processing;
takes care of replaying messages
on failure
• Concepts: tuples, streams,
spouts, bolts, topologies
storm.apache.org
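As a rough mental model of these concepts, the following sketch uses plain Python generators. It is not the Storm API: the `sensor_spout`, `to_fahrenheit_bolt`, and `threshold_bolt` names are hypothetical, and the sketch leaves out distribution, acknowledgement, and replay entirely.

```python
def sensor_spout():
    """Spout: the source of a stream, emitting tuples."""
    for device, celsius in [("t1", 20.0), ("t2", 35.0), ("t1", 40.0)]:
        yield (device, celsius)

def to_fahrenheit_bolt(stream):
    """Bolt: consumes a stream of tuples and emits a transformed stream."""
    for device, celsius in stream:
        yield (device, celsius * 9 / 5 + 32)

def threshold_bolt(stream, limit):
    """Bolt: filters the stream, passing on only tuples above the limit."""
    for device, fahrenheit in stream:
        if fahrenheit > limit:
            yield (device, fahrenheit)

# Topology: the wiring of spouts and bolts into a processing graph.
topology = threshold_bolt(to_fahrenheit_bolt(sensor_spout()), limit=90.0)
alerts = list(topology)
print(alerts)
```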
Apache Spark
spark.apache.org
Continued innovation bringing new functionality, such as:
• Tachyon (Shared RDDs, off-heap solution)
• BlinkDB (approximate queries)
• SparkR (R wrapper for Spark)
• Libraries: Spark SQL (SQL/HQL), Spark Streaming (stream processing), MLlib (machine learning), GraphX (graph processing)
• Core: Spark (core execution engine, RDDs)
• Cluster managers: Standalone, YARN, Mesos
• Storage: file system (local, MapR-FS, HDFS, S3) or data store (HBase, Elasticsearch, etc.)
Apache HBase
• Distributed, column-oriented NoSQL database built on top of HDFS
• Based on Google’s Bigtable design; a CP system in CAP terms
• Scales to 1,000s of commodity servers, billions of rows, petabytes of data
• Low-latency get/put operations
hbase.apache.org
drill.apache.org
Apache Drill
• Interactive analysis at scale,
with global and local schema
• Support evolving NoSQL data structures
• Self-service BI; use with or without
Hadoop
mapr.com/blog/how-use-sql-hadoop-drill-rest-json-nosql-and-hbase-simple-rest-client
Time series data
OpenTSDB
OpenTSDB is a distributed time series database on top of HBase,
enabling you …
• to store & index, as well as
• to query & plot
… metrics at scale.
opentsdb.net
OpenTSDB: key concepts
data point: (timestamp, value) + metric + tag: key=value → time series
(00:38, 56) mysql.com_delete schema=userdb
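The data model can be sketched in a few lines. The `DataPoint` class and its `series_key` helper are hypothetical names; the `put` line follows the format of OpenTSDB's telnet-style interface, and the Unix timestamps are illustrative stand-ins for the slide's 00:38.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DataPoint:
    metric: str
    timestamp: int          # Unix epoch seconds (illustrative values below)
    value: float
    tags: tuple = ()        # (key, value) pairs, e.g. (("schema", "userdb"),)

    def series_key(self):
        """A time series is identified by metric plus tag set, not by timestamp or value."""
        return (self.metric, self.tags)

    def to_put_line(self):
        """Render in the telnet-style 'put' format: put <metric> <ts> <value> <k=v ...>."""
        tag_str = " ".join(f"{k}={v}" for k, v in self.tags)
        return f"put {self.metric} {self.timestamp} {self.value} {tag_str}"


p1 = DataPoint("mysql.com_delete", 1420070400, 56, (("schema", "userdb"),))
p2 = DataPoint("mysql.com_delete", 1420070460, 61, (("schema", "userdb"),))
print(p1.to_put_line())
```

Both points share the same metric and tags, so they belong to the same time series even though their timestamps and values differ.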
OpenTSDB: high-level architecture
opentsdb.net/overview.html
• Write path: tcollector instances and apps/metrics → TSD RPC → TSDs
• Read path: shell scripts (alerts, etc.) ← TSD RPC or HTTP ← TSDs
• Storage: TSDs → HBase RPC (PUT, SCAN) → MapR-DB
OpenTSDB with MapR
https://github.com/mapr-demos/opentsdb
• Components: data points, message queue, tcollector, MapR-DB, web app, users
• Buffering data for 1 hour in the collector allows a 1000x decrease in insertion rate
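To see why buffering reduces the insertion rate, here is a minimal sketch that counts writes with and without an hourly buffer. The `HourlyBuffer` class and the rate of one sample per second are assumptions for illustration; the actual reduction factor depends on the sample rate and on how the collector packs buffered points.

```python
from collections import defaultdict

class HourlyBuffer:
    """Collects points per (series, hour) and writes each bucket as a single insert."""

    def __init__(self):
        self._buckets = defaultdict(list)
        self.inserts = 0

    def add(self, series, timestamp, value):
        hour = timestamp - (timestamp % 3600)   # align to the start of the hour
        self._buckets[(series, hour)].append((timestamp - hour, value))

    def flush(self):
        """One insert per (series, hour) bucket instead of one per data point."""
        rows = dict(self._buckets)
        self.inserts += len(rows)
        self._buckets.clear()
        return rows


buf = HourlyBuffer()
unbuffered_inserts = 0
for second in range(3600):                      # assumed rate: 1 sample per second
    buf.add("sensor-1", 7200 + second, 20.0)
    unbuffered_inserts += 1                     # what a write-per-point design would cost
rows = buf.flush()
print(unbuffered_inserts, buf.inserts)
```

Under these assumptions an hour of samples for one series collapses into a single insert; lower sample rates give a proportionally smaller reduction.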
OpenTSDB: interfacing
• HTTP API
• CLI (tsd, query, mkmetric, etc.)
• Java lib: asynchbase
• Improved collectors: scollector
• Dashboard: Grafana
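As an illustration of the HTTP API, the sketch below builds the JSON bodies for a write and a query without sending them. The field names are assumed to follow the OpenTSDB 2.x `/api/put` and `/api/query` endpoints and should be checked against the version in use; the metric, tag, and time range are example values, and the helper function names are hypothetical.

```python
import json

def build_put_body(metric, timestamp, value, tags):
    """Body for POST /api/put: one data point with its tags."""
    return {"metric": metric, "timestamp": timestamp, "value": value, "tags": tags}

def build_query_body(metric, start, aggregator="sum", tags=None):
    """Body for POST /api/query: one sub-query over a relative or absolute time range."""
    return {
        "start": start,
        "queries": [{"aggregator": aggregator, "metric": metric, "tags": tags or {}}],
    }

put_body = build_put_body("mysql.com_delete", 1420070400, 56, {"schema": "userdb"})
query_body = build_query_body("mysql.com_delete", "1h-ago", tags={"schema": "userdb"})
print(json.dumps(put_body))
print(json.dumps(query_body))
```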
The Internet of Things architecture (iot-a)
Key Requirements for an IoT Data Platform
• Deal with raw data natively
• Support a range of
workloads, with streaming as a
first-class citizen
• Ensure business continuity
• Provide secure and
privacy-aware operation
mapr.com/blog/key-requirements-iot-data-platform
The IoT architecture (iot-a)
iot-a.info
• Building blocks: MQ/SP, DFS, DB
• Input, with three kinds of output: as-it-happens, interactive, batch
Example iot-a
• Building blocks: HDFS, HBase, batch jobs
• Input, with three kinds of output: as-it-happens, interactive, batch
A proof of concept from the automotive sector
Largest Biometric Database in the World
1.2B people
uidai.gov.in/images/AadhaarTechnologyArchitecture_March2014.pdf
Financial Services
• Fraud detection
• Personalized offers
• Fraud investigation tool
• Fraud investigator
• Fraud model
• Recommendations table
• Clickstream analysis
• Online transactions
• MapR Distribution for Hadoop
• Analytics
• Interactive marketer
Waste & Recycling Leader—Architecture
• Input: trucks reporting lat/lng to MapR
• Processing: real-time stream processing (Apache Storm), batch processing (MapReduce), shortest path graph algorithm (Titan)
• Outputs: online alerts, tax reduction reporting, route optimization
$50M in Free Training
Q&A
@mhausenblas maprtech
mhausenblas@mapr.com
Engage with us!
MapR
maprtech
mapr-technologies

My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort ServiceHot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 

A modern IoT data processing toolbox

  • 1. © 2015 MapR Technologies, confidential
  • 2.
  • 3. IoT: a superset of the Internet
  • 4. IoT: a superset of the Internet. What is the IoT? “The idea of an all-encompassing and ubiquitous network of devices to facilitate co-ordination and communication between the devices themselves as well as between the devices and human end-users. The involved devices are typically constrained devices such as RFID sensors, but may also [be] more sophisticated ones like smartphones.”
  • 5. IoT: a superset of the Internet. Devices and their deployment (figure; axes: thin to fat, stationary to mobile)
  • 6. The IoT landscape: infrastructure, data, apps
  • 7. Orientation (iot.eclipse.org): New and Existing Devices, IoT Gateways, Network/Wireless Services, Backend Systems
  • 8. Categorization & use cases
  • 9. Categorization & use cases: Personal IoT. The scope is a single person, for example a smartphone equipped with a GPS sensor, or a fitness device that monitors the heart and shares this data with the person's GP. One of the fastest-growing, rather consumer-oriented areas of IoT. Use cases and apps • Quantified self • Smart jackets • Personal digital assistant
  • 10. Categorization & use cases: Group IoT. Focuses on a small group of people, for example a family in the context of a smart home, where the deployed sensors capture temperature and lighting conditions for optimal comfort. One of the most challenging areas, and still in its early days. Use cases and apps • Smart homes • Proactive/predictive car maintenance • Interactive tourism
  • 11. Categorization & use cases: Community IoT. Considers a large group of people, potentially tens of thousands, usually in the context of public infrastructure, such as smart cities. Somewhat immature from a commercial point of view, but a potentially promising IoT area. Use cases and apps • Smart cities • Health care (monitoring, trackers)
  • 12. Categorization & use cases: Industrial IoT. The scope can be either within an organization or between organizations and/or individuals. This is arguably the most established and mature part of IoT; see also M2M. Use cases and apps • Smart factory • Retailer supply chain • Agriculture • Waste management
  • 13. IoT lends itself to a ‘Big Data’ approach: scaling out on commodity hardware, in a schema-on-read fashion, over community-defined interfaces • Volume: store all incoming sensor data for historical reference • Variety: dozens of data formats are in use in the IoT world, and none is relational • Velocity: many devices generate data at a high rate, usually as data streams
  • 14. My IoT toolbox
  • 15. Apache Kafka (kafka.apache.org) • A high-throughput, distributed, persistent publish-subscribe messaging system • Originates from LinkedIn • Typically used as a buffer/decoupling layer in online stream processing
  • 16. Fluentd (www.fluentd.org): data collector for a unified logging layer
  • 17. Apache Storm (storm.apache.org) • Distributed, fault-tolerant stream-processing platform • Guaranteed message processing; takes care of replaying messages on failure • Concepts: tuples, streams, spouts, bolts, topologies
  • 18. Apache Spark (spark.apache.org), stack diagram. Libraries: Spark SQL (SQL/HQL), Spark Streaming (stream processing), MLlib (machine learning), GraphX (graph processing). Core: Spark (core execution engine, RDDs). Cluster managers: Mesos, YARN, Standalone. Storage: file system (local, MapR-FS, HDFS, S3) or data store (HBase, Elasticsearch, etc.). Continued innovation bringing new functionality, such as: • Tachyon (shared RDDs, off-heap solution) • BlinkDB (approximate queries) • SparkR (R wrapper for Spark)
  • 19. Apache HBase (hbase.apache.org) • Distributed, column-oriented NoSQL database built on top of HDFS • Based on Google’s BigTable technology; a CP system • Scales to 1,000s of commodity servers, billions of rows, and PBs of data • Low-latency get/put operations
  • 20. Apache Drill (drill.apache.org) • Interactive analysis at scale, with global and local schema • Supports evolving NoSQL data structures • Self-service BI; use with or without Hadoop. See mapr.com/blog/how-use-sql-hadoop-drill-rest-json-nosql-and-hbase-simple-rest-client
  • 21. Time series data
  • 22. OpenTSDB (opentsdb.net): OpenTSDB is a distributed time series database on top of HBase, enabling you • to store & index, as well as • to query & plot metrics at scale.
  • 23. OpenTSDB: key concepts. data point: (timestamp, value) + metric + tag: key=value → time series. Example: (00:38, 56), metric mysql.com_delete, tag schema=userdb
  • 24. OpenTSDB: high-level architecture (diagram, after opentsdb.net/overview.html). Components: tcollector instances and app/metric sources, multiple TSDs, shell scripts (alert, etc.), and MapR-DB as the store. Labels: write path, read path; TSD RPC; TSD RPC or HTTP; HBase RPC (PUT, SCAN)
  • 25. OpenTSDB with MapR (diagram; https://github.com/mapr-demos/opentsdb). Components: data points, message queue, tcollector, MapR-DB, web app, users. Buffering data for 1 hour in the collector allows a 1000x decrease in insertion rate
  • 26. OpenTSDB: interfacing • HTTP API • CLI (tsd, query, mkmetric, etc.) • Java lib: asynchbase • Improved collectors: scollector • Dashboard: Grafana
  • 27. The Internet of Things architecture (iot-a)
  • 28. Key Requirements for an IoT Data Platform • Deal with raw data natively • Support a range of workloads; streaming as first-class citizen • Ensure business continuity • Provide secure and privacy-aware operation. See mapr.com/blog/key-requirements-iot-data-platform
  • 29. The IoT architecture (iot-a), iot-a.info (diagram). Input; components: MQ/SP, DFS, DB; output: as-it-happens, interactive, batch
  • 30. Example iot-a (diagram). Input; components: HDFS, HBase, batch jobs; output: as-it-happens, interactive, batch
  • 31. A proof of concept from the automotive sector
  • 32. A proof of concept from the automotive sector
  • 33. Largest Biometric Database in the World: 1.2B people. uidai.gov.in/images/AadhaarTechnologyArchitecture_March2014.pdf
  • 34. Financial Services (diagram, MapR Distribution for Hadoop). Labels: Fraud detection, Personalized offers, Fraud investigation tool, Fraud investigator, Fraud model, Recommendations table, Clickstream analysis, Online transactions, Analytics, Interactive marketer
  • 35. Waste & Recycling Leader: Architecture (diagram). Trucks report lat/lng to MapR. Labels: Real-time stream processing (Apache Storm), Online alerts, Batch processing (MapReduce), Tax reduction reporting, Shortest path graph algorithm (Titan), Route optimization
  • 36. $50M in Free Training
  • 37. Q&A: @mhausenblas, maprtech, mhausenblas@mapr.com. Engage with us! MapR, maprtech, mapr-technologies
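
The OpenTSDB key concepts on slide 23 (a data point is a timestamp and a value, attached to a metric and to key=value tags, and points sharing the same metric and tags form one time series) can be illustrated with a small sketch. This is not code from the talk or from OpenTSDB itself: the names `DataPoint`, `make_point`, `group_into_series`, and `to_put_line` are hypothetical, and the epoch timestamps are made-up stand-ins for the slide's "00:38". The only OpenTSDB-specific detail assumed here is the line format of its telnet-style `put` command, `put <metric> <timestamp> <value> <tagk=tagv ...>`.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple

TagSet = Tuple[Tuple[str, str], ...]


@dataclass(frozen=True)
class DataPoint:
    """One measurement: (timestamp, value) plus a metric name and tags."""
    metric: str
    timestamp: int   # seconds since the Unix epoch
    value: float
    tags: TagSet     # sorted (key, value) pairs, so equal tag sets compare equal


def make_point(metric: str, timestamp: int, value: float, **tags: str) -> DataPoint:
    """Build a data point; tags are passed as keyword arguments."""
    return DataPoint(metric, timestamp, value, tuple(sorted(tags.items())))


def group_into_series(points: List[DataPoint]) -> Dict[Tuple[str, TagSet], List[Tuple[int, float]]]:
    """Group points into time series keyed by (metric, tags), sorted by time."""
    series = defaultdict(list)
    for p in points:
        series[(p.metric, p.tags)].append((p.timestamp, p.value))
    for samples in series.values():
        samples.sort()
    return dict(series)


def to_put_line(p: DataPoint) -> str:
    """Format a point as a telnet-style `put` line."""
    tag_str = " ".join(f"{k}={v}" for k, v in p.tags)
    return f"put {p.metric} {p.timestamp} {p.value} {tag_str}"


if __name__ == "__main__":
    # Hypothetical sample data, modeled on the slide's example.
    points = [
        make_point("mysql.com_delete", 1420072680, 56, schema="userdb"),
        make_point("mysql.com_delete", 1420072620, 41, schema="userdb"),
        make_point("mysql.com_delete", 1420072680, 7, schema="otherdb"),
    ]
    for key, samples in group_into_series(points).items():
        print(key, samples)
    print(to_put_line(points[0]))
```

Storing tags as a sorted tuple is a design choice of this sketch: it makes the tag set hashable, so `(metric, tags)` can serve directly as the dictionary key that identifies a time series.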
