IoT meets Big Data
รัฐศิลป์ รานอกภานุวัชร์, D.ENG
Keywords
• Big Data
• Internet of Things
• Streaming data processing
• IoT Big Data analytics
• Advanced machine learning
Big Data technology
Credit: https://www.xenonstack.com/blog/big-data-engineering/ingestion-processing-big-data-iot-stream/
Internet of Things (IoT)
Credit: https://orzota.com/industrial-iot/
[Figure labels: Things, Sensors & Actuators, Software and platform, Visualization]
IoT data characteristics
• Large-scale streaming data
• Heterogeneity
• Time and space correlation
• High-noise data
Analytics requirement
• IoT applications must support high-speed data streams that require real-time or near real-time actions.
• Fast computing and advanced machine learning techniques are required for IoT streaming data processing and IoT big data analytics.
Reference: M. Chen, S. Mao, Y. Zhang, and V. C. Leung, Big data: related technologies, challenges and future prospects. Springer, 2014
Things are Producing Streaming Data
“6V” for IoT Big Data
• Volume: size of data
• Velocity: speed at which data is generated
• Variety: different types of data
• Veracity: data accuracy
• Variability: dynamic behavior in the data source, caused by the data flow rate
• Value: useful data
New class of analytics: “Fast and streaming data analytics”
• IoT data “6V”
• Streaming processing
• Advanced machine learning
• Fast distributed computing
IoT Big Data Architecture
Filtering
Analytics
Ingestion Data
Source: https://mapr.com/blog/ml-iot-connected-medical-devices/
Use Case – Truck Sensors
How to design a Streaming Analytics Solution?
How to design a Streaming Analytics System?
It usually starts very simple … just one data pipeline
New Event Stream sources are added…
New Processors are interested in the events …
… and the solution becomes the problem
Decouple event streams from consumers
data pipeline
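The growth problem in the preceding slides can be put into numbers. The sketch below is only an illustration and assumes the worst case, in which every processor wants every event stream; the function names are made up for this example.

```python
def point_to_point_links(sources: int, processors: int) -> int:
    """Each source is wired directly to each interested processor."""
    return sources * processors


def decoupled_links(sources: int, processors: int) -> int:
    """Each source and each processor connects once to a shared pipeline."""
    return sources + processors


# Assumed worst case: every processor consumes every source.
for sources, processors in [(1, 1), (3, 3), (10, 10)]:
    print(
        f"{sources} sources, {processors} processors:",
        point_to_point_links(sources, processors), "direct links vs",
        decoupled_links(sources, processors), "links through one pipeline",
    )
```

With direct wiring the number of integrations grows with the product of sources and processors; with a shared pipeline it grows with their sum.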
Apache Kafka
A distributed streaming platform
Messaging Systems: Publish/Subscribe
[Figure: producers call publish(topic, msg) on a publish/subscribe system holding Topic 1, Topic 2, and Topic 3; consumers subscribe to topics and receive each msg.]
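A minimal in-memory sketch of the publish/subscribe idea follows. It is a toy for illustration, not Kafka: the PubSub class, the handler lists, and the sample messages are all assumptions made for this example, and delivery here is a direct function call, whereas Kafka consumers pull messages from brokers.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[str], None]


class PubSub:
    """Toy in-memory publish/subscribe hub (illustrative only)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register a consumer callback for one topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, msg: str) -> int:
        """Deliver msg to every subscriber of topic; return how many received it."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(msg)
        return len(handlers)


hub = PubSub()
dashboard: List[str] = []  # hypothetical consumer 1
storage: List[str] = []    # hypothetical consumer 2

hub.subscribe("Topic 1", dashboard.append)
hub.subscribe("Topic 1", storage.append)
hub.subscribe("Topic 2", storage.append)

# Producers only name a topic; they never reference a consumer.
hub.publish("Topic 1", "truck-7 speed=82")
hub.publish("Topic 2", "truck-7 brake=on")

print(dashboard)
print(storage)
```

The producer side never changes when a new consumer subscribes, which is the decoupling the diagram describes.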
Before: How to integrate this variety of data and make it available to all products?
▪ LinkedIn grew to have dozens of data systems and data repositories.
▪ LinkedIn described their point-to-point data pipelines like this:
Source: the first presentation for the Kafka Meetup @ LinkedIn (Bangalore), held on 2015/12/5
After
▪ Kafka was created to serve as a centralized online data pipeline system:
▪ Elastically scalable
▪ Durable
▪ High-throughput
▪ Fast
Why we should care
▪ Over 1,300,000,000,000 messages are transported via Kafka every day at LinkedIn
▪ 300 terabytes of inbound and 900 terabytes of outbound traffic
▪ 4.5 million messages per second on a single cluster
▪ Kafka runs on around 1,300 servers at LinkedIn
[Figure labels: Newsfeed, Recommendation, Metrics and Monitoring]
A few important characteristics
Fast
◦ Kafka can handle hundreds of megabytes of reads and writes per second from a large number of clients.
◦ Designed for real-time activity streaming.
Distributed and highly scalable
◦ Kafka has a cluster-centric design that offers strong durability and fault-tolerance guarantees.
◦ Message partitions are spread over a cluster of machines.
Durable
◦ Messages are persisted to disk and replicated within the cluster to prevent data loss.
◦ Each broker can handle terabytes of messages without performance impact.
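The partitioning and replication ideas above can be sketched in a few lines. This is an illustration under stated assumptions, not Kafka's implementation: CRC32 stands in for the hash (Kafka's default partitioner for keyed records uses murmur2), the replica placement rule is a simple round-robin chosen for this example, and the broker names are hypothetical.

```python
import zlib
from typing import List


def partition_for(key: str, num_partitions: int) -> int:
    """Map a message key to a partition with a stable hash (CRC32 stand-in)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def replica_brokers(partition: int, brokers: List[str], replication_factor: int) -> List[str]:
    """Place replicas on consecutive brokers (assumed round-robin rule)."""
    if replication_factor > len(brokers):
        raise ValueError("replication factor cannot exceed the number of brokers")
    start = partition % len(brokers)
    return [brokers[(start + i) % len(brokers)] for i in range(replication_factor)]


brokers = ["broker-0", "broker-1", "broker-2", "broker-3"]  # hypothetical cluster
for key in ["truck-1", "truck-2", "truck-3"]:
    partition = partition_for(key, num_partitions=6)
    print(key, "-> partition", partition, "on", replica_brokers(partition, brokers, 3))
```

The same key always maps to the same partition, so all readings from one device stay together, and each partition lives on several brokers so that losing one machine does not lose the data.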
Kafka architecture: Broker, Topics, Producers, and Consumers
Kafka Cluster is made up of multiple Kafka Brokers
Kafka ZooKeeper Coordination
[Figure: producers and consumers connected to four brokers, with ZooKeeper (ZK) alongside]
Apache Kafka - Architecture
Producer
Consumer
Use Case – Truck Sensors
Kafka Single Node Example
Download the latest version from https://kafka.apache.org/downloads
Run ZooKeeper
Wait about 30 seconds for ZooKeeper to start up.
Run Kafka Server (Broker)
Wait about 30 seconds for Kafka to start up.
Create Kafka Topic
• We create a topic called my-topic with a replication factor of 1, since we only have one server.
• We use 13 partitions for my-topic, which means up to 13 Kafka consumers in the same consumer group can read it in parallel.
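Why the partition count caps the number of useful consumers can be shown with a small sketch. It assumes a simple round-robin assignment within one consumer group; the function and consumer names are hypothetical, and real Kafka assignment strategies differ in detail.

```python
from typing import Dict, List


def assign_partitions(num_partitions: int, consumers: List[str]) -> Dict[str, List[int]]:
    """Deal partitions out round-robin; each partition gets exactly one owner."""
    if not consumers:
        raise ValueError("at least one consumer is required")
    assignment: Dict[str, List[int]] = {name: [] for name in consumers}
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# One consumer owns all 13 partitions of my-topic.
one = assign_partitions(13, ["c0"])
print(len(one["c0"]), "partitions for the single consumer")

# With 15 consumers, two of them have nothing to read.
many = assign_partitions(13, [f"c{i}" for i in range(15)])
idle = [name for name, owned in many.items() if not owned]
print("idle consumers:", idle)
```

Because a partition is read by only one consumer of a group at a time, consumers beyond the thirteenth would sit idle.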
Run Kafka Producer
• Notice that we specify the Kafka node, which is running at localhost:9092.
• Next, run start-producer-console.sh and send at least four messages.
Run Kafka Consumer
Notice that we specify the Kafka node, which is running at localhost:9092, as we did before. We also specify --from-beginning to read all of the messages in my-topic from the beginning.
Running Kafka Producer and Consumer
• Notice that the messages do not arrive in the order they were sent.
• This is because the single consumer reads from all 13 partitions, and the messages were spread across those partitions.
• Order is only guaranteed within a partition.
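The ordering behavior can be reproduced with a small simulation. It is a simplification with two assumptions: keyless messages are spread round-robin over the partitions, and the consumer happens to drain one partition after another. Three partitions are used instead of 13 to keep the output short.

```python
from typing import Dict, List


def produce_round_robin(messages: List[str], num_partitions: int) -> Dict[int, List[str]]:
    """Append each message to the next partition in turn (assumed strategy)."""
    partitions: Dict[int, List[str]] = {p: [] for p in range(num_partitions)}
    for i, msg in enumerate(messages):
        partitions[i % num_partitions].append(msg)
    return partitions


def consume_all(partitions: Dict[int, List[str]]) -> List[str]:
    """One consumer draining the partitions one after another."""
    received: List[str] = []
    for p in sorted(partitions):
        received.extend(partitions[p])
    return received


sent = ["m0", "m1", "m2", "m3", "m4", "m5"]
partitions = produce_round_robin(sent, num_partitions=3)
received = consume_all(partitions)

print("sent:    ", sent)
print("received:", received)
print("partition 0 keeps its order:", partitions[0])
```

The received list differs from the sent list, but inside each partition the messages keep the order in which they were written.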
IoT Big Data Streaming processing patterns
[Figure: events flow to real-time applications, long-term storage, and real-time dashboards]
Source: Streaming Big Data on Azure with HDInsight Kafka, Storm and Spark by Raghav Mohan, Program Manager, Azure HDInsight
Example
Source: https://www.scnsoft.com/blog/salesforce-iot-cloud-benefits-and-limitations
IoT Big Data Analytics
IoT Big Data Architecture
Filtering
Analytics
Ingestion Data
Source: https://mapr.com/blog/ml-iot-connected-medical-devices/
What is Machine Learning?
Source: https://cybrml.com/2017/01/23/ml-in-cs-4-machine-learning-technical-review/
Machine Learning in IoT Applications
Source: https://medium.com/iotforall/using-deep-learning-processors-for-intelligent-iot-devices-1a7ed9d2226d
Dataset
Reference: Deep Learning for IoT Big Data and Streaming Analytics: A Survey
Disadvantages of Pure Cloud Service Model
o Unpredictable response time from cloud server to endpoints
o Unreliable cloud connections can bring down the service
o Excessive data can overburden infrastructure
o Privacy issues when sensitive customer data are stored in the cloud
o Difficulties in scaling to ever increasing number of sensors and actuators
Fog computing for IoT
• Fog computing brings computing and analytics closer to the end users and devices, which removes unnecessary and prohibitive communication delays and saves on transmission costs.
• It can receive, process, and react to incoming data in real time.
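As a rough illustration of processing near the device, the sketch below reduces a batch of raw sensor readings to one summary record and raises a local alarm without waiting for the cloud. The function name, the fields of the summary, and the 30.0 threshold are assumptions made for this example.

```python
from statistics import mean
from typing import Dict, List


def summarize_at_edge(readings: List[float], alarm_threshold: float) -> Dict[str, object]:
    """Reduce a batch of raw readings to one small record to send upstream."""
    highest = max(readings)
    return {
        "count": len(readings),
        "mean": mean(readings),
        "max": highest,
        "alarm": highest > alarm_threshold,  # decided locally, no round trip
    }


batch = [21.0, 21.5, 22.0, 35.5]  # hypothetical temperature readings
summary = summarize_at_edge(batch, alarm_threshold=30.0)

if summary["alarm"]:
    print("react locally: reading above threshold")
print("send upstream:", summary)
```

Four raw values become one record, which is where the saving in transmission cost comes from, and the alarm decision does not depend on the cloud connection.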
Example: Fog computing + Kafka
https://www.cisco.com/c/en/us/td/docs/unified_computing/ucs/UCS_CVDs/Cisco_UCS_Integrated_Infrastructure_for_Big_Data_with_Cloudera_and_Apache_Spark.html
Case study #1
Reference: https://mapr.com/blog/ml-iot-connected-medical-devices/
Streaming machine-learning application to detect anomalies in data from a heart monitor
◦ Cheaper sensors that can monitor vital signs, combined with machine learning, are making it possible for doctors to rapidly apply smart medicine to their patients’ cases.
[Figure: electrocardiogram (ECG)]
Building the Model with Clustering
Heartbeat activity: normal EKG pattern
We use this repeating pattern to train a model on previous heartbeat activity, and then compare subsequent observations to this model in order to detect anomalous behavior.
To build a model of typical heartbeat activity, we process an EKG (from a specific patient or a group of many patients), break it into overlapping pieces that are about 1/3 second long, and then apply a clustering algorithm to group similar shapes.
[Figure: the k-means algorithm]
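The windowing and clustering step can be sketched in plain Python. This is a simplified stand-in, not the Spark job from the case study: the signal is synthetic, the window size of 4 samples and step of 2 are arbitrary (the real sampling rate is not given here), and the k-means seeds are simply the first k windows so that the run is deterministic.

```python
from typing import List, Sequence

Vector = List[float]


def overlapping_windows(signal: Sequence[float], size: int, step: int) -> List[Vector]:
    """Cut the signal into fixed-size windows that overlap when step < size."""
    return [list(signal[i:i + size]) for i in range(0, len(signal) - size + 1, step)]


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(centroids: List[Vector], window: Sequence[float]) -> int:
    """Index of the centroid closest to the window."""
    return min(range(len(centroids)), key=lambda i: squared_distance(centroids[i], window))


def kmeans(windows: List[Vector], k: int, iterations: int = 20) -> List[Vector]:
    """Plain Lloyd's k-means, seeded with the first k windows (an assumption)."""
    centroids = [list(w) for w in windows[:k]]
    for _ in range(iterations):
        groups: List[List[Vector]] = [[] for _ in range(k)]
        for w in windows:
            groups[nearest(centroids, w)].append(w)
        for i, group in enumerate(groups):
            if group:  # keep the old centroid if a cluster is empty
                centroids[i] = [sum(col) / len(group) for col in zip(*group)]
    return centroids


# Synthetic stand-in for an EKG: one spike repeated every 8 samples.
beat = [0.0, 0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 0.0]
signal = beat * 10
windows = overlapping_windows(signal, size=4, step=2)
catalog = kmeans(windows, k=4)

for shape in catalog:
    print(shape)
```

On this perfectly periodic signal the catalog ends up holding the four window shapes that occur in it; a real EKG would need more clusters and a better seeding strategy.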
Apache Spark processing with k-means
Results in a catalog of shapes
It can be used for reconstructing what an EKG should look like.
Using the Model of Normal with Streaming Data
Detecting Anomalies
The difference between the observed and expected EKG (the green minus the red) is
the reconstruction error, or residual (shown in yellow). If the residual is high, then
there could be an anomaly.
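The residual check described above can be sketched as follows. It is a simplified illustration: the catalog is hard-coded with hypothetical shapes, each window is reconstructed from its single nearest shape, and the threshold of 1.0 is arbitrary.

```python
from typing import List, Sequence

# Hypothetical catalog of "normal" shapes, e.g. centroids from a clustering step.
CATALOG: List[List[float]] = [
    [0.0, 0.0, 1.0, 3.0],
    [1.0, 3.0, 1.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def reconstruct(window: Sequence[float]) -> List[float]:
    """Expected shape: the catalog entry closest to the observed window."""
    return min(CATALOG, key=lambda shape: squared_distance(shape, window))


def residual(window: Sequence[float]) -> List[float]:
    """Observed minus expected, sample by sample."""
    expected = reconstruct(window)
    return [o - e for o, e in zip(window, expected)]


def is_anomaly(window: Sequence[float], threshold: float) -> bool:
    """Flag the window when the residual energy exceeds the threshold."""
    return sum(r * r for r in residual(window)) > threshold


normal = [0.1, 0.0, 0.9, 3.1]  # close to a known shape
odd = [2.0, 2.0, 2.0, 2.0]     # matches nothing in the catalog

print("normal flagged:", is_anomaly(normal, threshold=1.0))
print("odd flagged:   ", is_anomaly(odd, threshold=1.0))
```

In a streaming setting the same check would run on each incoming window, with the threshold tuned on data known to be normal.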
Case study #2
REFERENCE: The 10th National Conference on Information Technology (NCIT), 24-25 October 2018 (B.E. 2561)
Automated hydroponic vegetable greenhouse using IoT technology and machine learning
[Figure labels: Camera, Internet, Amazon S3, Small class, Medium class, Large class]
Vegetable growth analysis, divided into 3 classes
Small, Medium, Large
✓ To build the model, we train on a dataset of 300 images per class.
✓ The model is developed with the Caffe framework, a CNN model, and the SDK of the Intel deep learning training tool, installed on AWS Cloud.
Workflow
[Figure labels: Camera Module; dataset of 300 images per class; CNNs; Predict Class]
CNNs = Convolutional Neural Networks
Model test results
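Vegetable profiles for automatic control, 3 classes
Variable  Value    Meaning
Temp      °C       Temperature inside the greenhouse
Hum       %        Air humidity inside the greenhouse
Lux       Lux      Light intensity inside the greenhouse
Fan       On/Off   Switches the fan on or off
Silent    On/Off   Switches the shade curtain on or off
Water     On/Off   Switches the water pump on or off
Cool      On/Off   Switches the pump that runs water over the honeycomb cooling pad on or off
Foggy     On/Off   Switches the fog nozzle on or off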
Challenges and Future Directions
o Lack of large IoT datasets
  o More data is needed to achieve higher accuracy.
o Preprocessing
  o More complex, since the system deals with data from different sources that may have various formats.
o Secure and privacy-preserving machine learning
  o Further techniques to defend models against attacks, and to limit the effect of those attacks, are necessary for reliable IoT applications.
o Machine learning for IoT devices
  o Consider the requirements of running machine learning on resource-constrained devices.
Thank you

IoT meets Big Data

  • 1. IoT meets Big Data รัฐศิลป์ รานอกภานุวัชร์, D.ENG
  • 2. Keywords • Big Data • Internet of Things • Streaming data processing • IoT Big Data analytics • Advanced machine learning 2
  • 4. Big Data technology. Credit: https://www.xenonstack.com/blog/big-data-engineering/ingestion-processing-big-data-iot-stream/
  • 5. Internet of Things (IoT): Things; Sensors & Actuators; Software and platform; Visualization. Credit: https://orzota.com/industrial-iot/
  • 6. IoT data characteristics: large-scale streaming data; heterogeneity; time and space correlation; high-noise data. IoT application support: high-speed data streams that require real-time or near real-time actions. Analytics requirement: IoT streaming data processing and IoT big data analytics require fast computing and advanced machine learning techniques. Reference: M. Chen, S. Mao, Y. Zhang, and V. C. Leung, Big Data: Related Technologies, Challenges and Future Prospects. Springer, 2014
  • 7. Things are Producing Streaming Data
  • 8. “6V” for IoT Big Data: Volume (size of data); Velocity (speed at which data is generated); Variety (different types of data); Veracity (data accuracy); Variability (dynamic behavior in the data source, caused by the data flow rate); Value (useful data)
  • 9. New class of analytics, “fast and streaming data analytics”: IoT data ‘6V’; streaming processing; advanced machine learning; fast distributed computing
  • 10. IoT Big Data Architecture: Data, Ingestion, Filtering, Analytics. Source: https://mapr.com/blog/ml-iot-connected-medical-devices/
  • 11. Use Case – Truck Sensors
  • 12. How to design a Streaming Analytics Solution?
  • 13. How to design a Streaming Analytics System? It usually starts very simple … just one data pipeline
  • 14. New Event Stream sources are added …
  • 15. New Processors are interested in the events …
  • 16. … and the solution becomes the problem
  • 17. … and the solution becomes the problem
  • 18. Decouple event streams from consumers: data pipeline
  • 19. Apache Kafka: A distributed streaming platform
  • 20. Messaging Systems: Publish/Subscribe. Diagram: producers call publish(topic, msg) on a publish/subscribe system that holds Topic 1, Topic 2, and Topic 3; consumers subscribe to topics and receive the messages (msg).
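The publish/subscribe pattern on slide 20 can be sketched in a few lines of Python. This is a toy in-memory hub, not Kafka: the `PubSub` class, the topic names, and the truck messages are all hypothetical and only illustrate how `publish(topic, msg)` decouples producers from consumers.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class PubSub:
    """Toy in-memory publish/subscribe hub (illustration only, not Kafka)."""

    def __init__(self) -> None:
        # topic name -> callbacks of the consumers subscribed to it
        self._subscribers: DefaultDict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[str], None]) -> None:
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, msg: str) -> int:
        """Deliver msg to every subscriber of topic; return the delivery count."""
        for callback in self._subscribers[topic]:
            callback(msg)
        return len(self._subscribers[topic])


hub = PubSub()
dashboard_inbox: List[str] = []  # hypothetical consumer 1
storage_inbox: List[str] = []    # hypothetical consumer 2

hub.subscribe("topic-1", dashboard_inbox.append)
hub.subscribe("topic-1", storage_inbox.append)
hub.subscribe("topic-2", storage_inbox.append)

# Producers only name a topic; they never reference the consumers directly.
hub.publish("topic-1", "truck-7 speed=82")
hub.publish("topic-2", "truck-7 engine_temp=96")
```

Adding a new consumer only requires another `subscribe` call; no producer has to change, which is the decoupling that slide 18 asks for.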
  • 21. Before: How to integrate this variety of data and make it available to all products?
      ▪ LinkedIn grew to have dozens of data systems and data repositories.
      ▪ LinkedIn depicted its point-to-point data pipelines in a diagram.
      Source: the first presentation for the Kafka Meetup @ LinkedIn (Bangalore), held on 2015/12/5
  • 22. After
      ▪ Kafka was created to serve as a centralized online data pipeline system:
      ▪ Elastically scalable
      ▪ Durable
      ▪ High-throughput
      ▪ Fast
  • 23. Why it matters
      ▪ Over 1,300,000,000,000 messages are transported via Kafka every day at LinkedIn
      ▪ 300 terabytes of inbound and 900 terabytes of outbound traffic
      ▪ 4.5 million messages per second on a single cluster
      ▪ Kafka runs on around 1,300 servers at LinkedIn
      Use cases shown: Newsfeed; Recommendation; Metrics and Monitoring
  • 24. A few important characteristics
      Fast
      ◦ Kafka can handle hundreds of megabytes of reads and writes per second from a large number of clients.
      ◦ Designed for real-time activity streaming.
      Distributed and highly scalable
      ◦ Kafka has a cluster-centric design that offers strong durability and fault-tolerance guarantees.
      ◦ Message partitions are spread over a cluster of machines.
      Durable
      ◦ Messages are persisted to disk and replicated within the cluster to prevent data loss.
      ◦ Each broker can handle terabytes of messages without performance impact.
  • 25. Kafka architecture: Brokers, Topics, Producers, and Consumers. A Kafka cluster is made up of multiple Kafka brokers.
  • 27. Apache Kafka - Architecture: Producer, Consumer
  • 28. Apache Kafka - Architecture: Producer, Consumer
  • 30. Use Case – Truck Sensors
  • 31. Kafka Single Node Example. Download the latest version from https://kafka.apache.org/downloads
  • 32. Run ZooKeeper. Wait about 30 seconds for ZooKeeper to start up.
  • 33. Run Kafka Server (Broker). Wait about 30 seconds for Kafka to start up.
  • 34. Create Kafka Topic
      • We create a topic called my-topic with a replication factor of 1, since we only have one server.
      • We will use 13 partitions for my-topic, which means up to 13 Kafka consumers in the same consumer group could read from it in parallel.
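Why 13 partitions cap the useful number of consumers at 13 can be seen in a small simulation. The sketch below assumes a plain round-robin assignment and hypothetical consumer names; Kafka's real partition assignors are more elaborate, so treat this only as an illustration of the counting argument.

```python
from typing import Dict, List


def assign_partitions(num_partitions: int, consumers: List[str]) -> Dict[str, List[int]]:
    """Spread partitions over consumers round-robin (a simplification, not Kafka's assignor)."""
    assignment: Dict[str, List[int]] = {name: [] for name in consumers}
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# 4 consumers share the 13 partitions of my-topic.
four = assign_partitions(13, [f"consumer-{i}" for i in range(4)])

# With 15 consumers, each partition still has exactly one owner,
# so the two extra consumers receive nothing.
fifteen = assign_partitions(13, [f"consumer-{i}" for i in range(15)])
idle = [name for name, parts in fifteen.items() if not parts]
```

Each partition is owned by exactly one consumer in the group, so consumers beyond the partition count stay idle.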
  • 35. Run Kafka Producer
      • Notice that we specify the Kafka node, which is running at localhost:9092.
      • Next, run start-producer-console.sh and send at least four messages.
  • 36. Run Kafka Consumer. Notice that we specify the Kafka node running at localhost:9092, as before, but we also pass --from-beginning to read all of the messages in my-topic from the beginning.
  • 37. Running Kafka Producer and Consumer
      • Notice that the messages do not arrive in order.
      • This is because our single consumer reads the messages from all 13 partitions.
      • Order is only guaranteed within a partition.
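The ordering behavior on slide 37 can be reproduced without a broker. The sketch below assumes keyless messages are spread round-robin over the 13 partitions and that one consumer drains the partitions in an arbitrary order; both are simplifications chosen for illustration, and the message names are hypothetical.

```python
from typing import Dict, List

NUM_PARTITIONS = 13


def send_round_robin(messages: List[str], num_partitions: int) -> Dict[int, List[str]]:
    """Append each message to a partition log, cycling through the partitions."""
    partitions: Dict[int, List[str]] = {p: [] for p in range(num_partitions)}
    for i, msg in enumerate(messages):
        partitions[i % num_partitions].append(msg)
    return partitions


# Zero-padded names so that string order equals send order.
messages = [f"msg-{i:02d}" for i in range(30)]
partitions = send_round_robin(messages, NUM_PARTITIONS)

# One consumer reads every partition; the order across partitions is arbitrary
# (reversed here, purely for illustration).
consumed = [msg for p in reversed(range(NUM_PARTITIONS)) for msg in partitions[p]]

globally_ordered = consumed == sorted(consumed)
ordered_within_partition = all(log == sorted(log) for log in partitions.values())
```

Here `globally_ordered` is false while `ordered_within_partition` is true. When order matters for a device, a common approach is to send all of its messages with the same key so that they land in the same partition.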
  • 38. IoT Big Data streaming processing patterns: events feed real-time applications, long-term storage, and real-time dashboards. Source: “Streaming Big Data on Azure with HDInsight Kafka, Storm and Spark” by Raghav Mohan, Program Manager, Azure HDInsight
  • 40. IoT Big Data Analytics
  • 41. IoT Big Data Architecture: Data, Ingestion, Filtering, Analytics. Source: https://mapr.com/blog/ml-iot-connected-medical-devices/
  • 42. What is Machine Learning?
  • 44. Machine Learning in IoT Applications. Source: https://medium.com/iotforall/using-deep-learning-processors-for-intelligent-iot-devices-1a7ed9d2226d
  • 45. Dataset. Reference: Deep Learning for IoT Big Data and Streaming Analytics: A Survey
  • 46. Disadvantages of Pure Cloud Service Model
      o Unpredictable response time from cloud server to endpoints
      o Unreliable cloud connections can bring down the service
      o Excessive data can overburden infrastructure
      o Privacy issues when sensitive customer data is stored in the cloud
      o Difficulties in scaling to an ever-increasing number of sensors and actuators
  • 47. Fog computing for IoT
      • Brings computing and analytics closer to the end users and devices to remove unnecessary and prohibitive communication delays (and saves on transmission costs).
      • It can receive, process, and react to incoming data in real time.
  • 48. Example: Fog computing + Kafka. Source: https://www.cisco.com/c/en/us/td/docs/unified_computing/ucs/UCS_CVDs/Cisco_UCS_Integrated_Infrastructure_for_Big_Data_with_Cloudera_and_Apache_Spark.html
  • 49. Case study #1. Reference: https://mapr.com/blog/ml-iot-connected-medical-devices/
  • 50. Streaming machine-learning application to detect anomalies in electrocardiogram (ECG) data from a heart monitor
      ◦ Cheaper sensors that can monitor vital signs, combined with machine learning, are making it possible for doctors to rapidly apply smart medicine to their patients’ cases.
  • 51. Building the Model with Clustering. Heartbeat activity: a normal EKG pattern repeats. We use this repeating pattern to train a model on previous heartbeat activity and then compare subsequent observations to this model to evaluate anomalous behavior. To build a model of typical heartbeat activity, we process an EKG (from a specific patient or a group of many patients), break it into overlapping pieces about 1/3 second long, and then apply a clustering algorithm, k-means, to group similar shapes.
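The windowing and clustering steps on slide 51 can be sketched in plain Python. This is not the Spark implementation the slides refer to: the synthetic spike signal, the 96 Hz sample rate, the 50% overlap, and k = 4 are all assumptions made for illustration, and `kmeans` is a minimal Lloyd's algorithm.

```python
import random
from typing import List

Vector = List[float]


def overlapping_windows(signal: Vector, size: int, step: int) -> List[Vector]:
    """Cut the signal into windows of `size` samples, advancing by `step` (< size)."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, step)]


def sq_dist(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: List[Vector], k: int, iterations: int = 20, seed: int = 0) -> List[Vector]:
    """Minimal Lloyd's algorithm; returns k centroids (the catalog of shapes)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        clusters: List[List[Vector]] = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            clusters[nearest].append(p)
        for c, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


# Synthetic stand-in for an EKG: one short spike per second on a flat baseline.
SAMPLE_RATE_HZ = 96  # hypothetical sample rate
signal = [1.0 if i % SAMPLE_RATE_HZ < 4 else 0.0 for i in range(SAMPLE_RATE_HZ * 10)]

window = SAMPLE_RATE_HZ // 3  # about 1/3 second of samples
windows = overlapping_windows(signal, size=window, step=window // 2)
catalog = kmeans(windows, k=4)
```

Each centroid in `catalog` is an average shape for one group of similar windows, which corresponds to the catalog of shapes mentioned on slide 53.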
  • 52. Apache Spark processing with k-means
  • 53. Results in a catalog of shapes. It can be used for reconstructing what an EKG should look like.
  • 54. Using the Model of Normal with Streaming Data
  • 55. Detecting Anomalies. The difference between the observed and expected EKG (the green minus the red) is the reconstruction error, or residual (shown in yellow). If the residual is high, then there could be an anomaly.
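The residual check on slide 55 can be sketched as follows. It assumes a simplified reconstruction in which the expected signal for a window is just the nearest shape in the catalog; the three-shape catalog, the sample windows, and the threshold of 1.0 are hypothetical values chosen for illustration.

```python
import math
from typing import List, Tuple

Vector = List[float]


def reconstruct(window: Vector, catalog: List[Vector]) -> Vector:
    """Expected signal: the catalog shape closest to the observed window."""
    return min(catalog, key=lambda shape: sum((w - s) ** 2 for w, s in zip(window, shape)))


def residual_norm(observed: Vector, expected: Vector) -> float:
    """Size of the reconstruction error (observed minus expected)."""
    return math.sqrt(sum((o - e) ** 2 for o, e in zip(observed, expected)))


def check_window(window: Vector, catalog: List[Vector], threshold: float) -> Tuple[bool, float]:
    """Return (is_anomalous, residual) for one observed window."""
    error = residual_norm(window, reconstruct(window, catalog))
    return error > threshold, error


# Hypothetical catalog of normal shapes, e.g. centroids learned by k-means.
catalog = [[0.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0]]
THRESHOLD = 1.0  # hypothetical; in practice tuned on known-normal data

normal_flag, normal_error = check_window([0.0, 0.9, 0.1, 0.0], catalog, THRESHOLD)
anomaly_flag, anomaly_error = check_window([0.0, 3.0, 3.0, 0.0], catalog, THRESHOLD)
```

The first window sits close to a known shape, so its residual stays small; the second matches no shape in the catalog, so its residual exceeds the threshold and it is flagged.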
  • 56. Case study #2. Reference: The 10th National Conference on Information Technology (NCIT), 24-25 October 2018 (B.E. 2561)
  • 58. Vegetable growth analysis, divided into 3 classes: Small, Medium, Large
      ✓ To build the model, we train on a dataset of 300 images per class.
      ✓ The model is developed with the Caffe framework, a CNN model, and the SDK of the Intel Deep Learning Training Tool, installed on AWS Cloud.
  • 59. Workflow: camera module; dataset of 300 images per class; CNNs; predicted class. CNNs = Convolutional Neural Networks
  • 61. Vegetable profile for automatic control, 3 classes. Variable (value): meaning
      Temp (°C): temperature inside the greenhouse
      Hum (%): air humidity inside the greenhouse
      Lux (Lux): light intensity inside the greenhouse
      Fan (On/Off): fan on/off
      Silent (On/Off): shading curtain open/closed
      Water (On/Off): water pump on/off
      Cool (On/Off): pump for water flowing through the honeycomb cooling pad on/off
      Foggy (On/Off): misting (fog) system on/off
  • 62. Challenges and Future Directions
      o Lack of large IoT datasets: more data is needed to achieve higher accuracy.
      o Preprocessing: more complex, since the system deals with data from different sources that may have various formats.
      o Secure and privacy-preserving machine learning: developing further techniques to defend models against attacks and limit their effects is necessary for reliable IoT applications.
      o Machine learning for IoT devices: consider the requirements of running machine learning on resource-constrained devices.