SlideShare a Scribd company logo
Big Data and Internet of things(IOT)
Project Morpheus
(Beaconstac Analytics)
Garima Batra
Core Platform Engineer | MobStac
May 2015
A quick intro about Beaconstac 1
Beaconstac is a proximity marketing and analytics platform for
beacons
Several beacon specific events are defined to aid proximity marketing
The events include Camp on event, beacon exit event, region enter,
region exit etc.
Beaconstac analytics platform makes it easy for
managers/marketers/developers to analyze event data
Components include Beaconstac iOS/Android sdk, beaconstac portal
Why Hadoop? 1
Collect event logs generated from Beaconstac SDK usage
Needed a system to answer queries like
o Heat map of beacons by the number of visits received in a specified time
interval.
o Heat map of beacons by the amount of time spent in a specified time
interval.
o Average time spent by users near different beacons
o Last seen per user
o Last seen per beacon
o Analyzing data with custom attributes filters
o Traversed path in an area by individual users
Leveraging Amazon's EMR for Beaconstac
Analytics
1
 Amazon's Streaming API for writing mapper and reducer functions in Python
 Input - Copy programs to Amazon S3
 Output – Copy the processed/output data to S3
 Initial tests were run using Amazon's EMR console. Here you can
define the following -
1) Cluster configuration – Name, Termination protection, Logging,
logs location on S3 etc.
2) Software configuration – Hadoop AMI version, applications to be
installed on startup etc.
3) Hardware configuration – Types of nodes – master, Core and
Task
4) Security keys, allowed users
5) Bootstrap actions – Configure Hadoop, Custom actions etc.
6) Steps – Streaming program, Hive program, Pig program
Integrating EMR in production 1
Batch processing for Morpheus 1
AWS Data pipeline
Deep dive into EMR startup and job
submission
1
How Does AWS Data Pipeline Work? 1
Pipeline definition - specifies the business logic of your data management
AWS Data pipeline web service - interprets the pipeline definition and assigns
tasks to workers to move and transform data.
Task runner - polls the AWS Data Pipeline web service for tasks and then
performs those tasks.
Morpheus version of Data pipeline 1
Runs every hour
Requires a Kafka
consumer script
Copy the
output to
Elastic
Search
Run EMR
jobs
Copy logs
from Kafka
to S3
Runs once every
day
Processes each
job and produces
output
Each job
comprises of
mapper and
reducer scripts
Runs once every
day
Inserts output in
Elastic search
Settings file in each job 1
Source: Lorem Ipsum
Questions??
1

More Related Content

What's hot

From MapReduce to Apache Spark
From MapReduce to Apache SparkFrom MapReduce to Apache Spark
From MapReduce to Apache Spark
Jen Aman
 
Microservices meetup April 2017
Microservices meetup April 2017Microservices meetup April 2017
Microservices meetup April 2017
SignalFx
 
Location based push notifications
Location based push notificationsLocation based push notifications
Location based push notifications
jamesrichards
 
Confluent On Azure: Why you should add Confluent to your Azure toolkit | Alic...
Confluent On Azure: Why you should add Confluent to your Azure toolkit | Alic...Confluent On Azure: Why you should add Confluent to your Azure toolkit | Alic...
Confluent On Azure: Why you should add Confluent to your Azure toolkit | Alic...
HostedbyConfluent
 
Spark Summit EU talk by John Musser
Spark Summit EU talk by John MusserSpark Summit EU talk by John Musser
Spark Summit EU talk by John Musser
Spark Summit
 
AI on Spark for Malware Analysis and Anomalous Threat Detection
AI on Spark for Malware Analysis and Anomalous Threat DetectionAI on Spark for Malware Analysis and Anomalous Threat Detection
AI on Spark for Malware Analysis and Anomalous Threat Detection
Databricks
 
How we Auto Scale applications based on CPU with Kubernetes at M6Web?
 How we Auto Scale applications based on CPU with Kubernetes at M6Web? How we Auto Scale applications based on CPU with Kubernetes at M6Web?
How we Auto Scale applications based on CPU with Kubernetes at M6Web?
Vincent Gallissot
 
Ml sprint16 thesis_intro
Ml sprint16 thesis_introMl sprint16 thesis_intro
Ml sprint16 thesis_intro
ThanhNguyen3805
 
Advanced Spark Meetup - Jan 12, 2016
Advanced Spark Meetup - Jan 12, 2016Advanced Spark Meetup - Jan 12, 2016
Advanced Spark Meetup - Jan 12, 2016
Michelle Casbon
 
How to Define and Share your Event APIs using AsyncAPI and Event API Products...
How to Define and Share your Event APIs using AsyncAPI and Event API Products...How to Define and Share your Event APIs using AsyncAPI and Event API Products...
How to Define and Share your Event APIs using AsyncAPI and Event API Products...
HostedbyConfluent
 
Monitoring
MonitoringMonitoring
Monitoring
strikr .
 
Spark at Airbnb
Spark at AirbnbSpark at Airbnb
Spark at Airbnb
Hao Wang
 
Simplifying Big Data Applications with Apache Spark 2.0
Simplifying Big Data Applications with Apache Spark 2.0Simplifying Big Data Applications with Apache Spark 2.0
Simplifying Big Data Applications with Apache Spark 2.0
Spark Summit
 
Sumo Logic QuickStart Webinar Oct 2016
Sumo Logic QuickStart Webinar Oct 2016Sumo Logic QuickStart Webinar Oct 2016
Sumo Logic QuickStart Webinar Oct 2016
Sumo Logic
 
One Click Streaming Data Pipelines & Flows | Leveraging Kafka & Spark | Ido F...
One Click Streaming Data Pipelines & Flows | Leveraging Kafka & Spark | Ido F...One Click Streaming Data Pipelines & Flows | Leveraging Kafka & Spark | Ido F...
One Click Streaming Data Pipelines & Flows | Leveraging Kafka & Spark | Ido F...
HostedbyConfluent
 
Digital Transformation & Solvency II Simulations for L&G: Optimizing, Acceler...
Digital Transformation & Solvency II Simulations for L&G: Optimizing, Acceler...Digital Transformation & Solvency II Simulations for L&G: Optimizing, Acceler...
Digital Transformation & Solvency II Simulations for L&G: Optimizing, Acceler...
OW2
 
DBOps
DBOpsDBOps
DBOps
strikr .
 
Building adaptive user experiences using Contextual Multi-Armed Bandits with...
Building adaptive user experiences using Contextual Multi-Armed Bandits  with...Building adaptive user experiences using Contextual Multi-Armed Bandits  with...
Building adaptive user experiences using Contextual Multi-Armed Bandits with...
HostedbyConfluent
 
Asynchronous Hyperparameter Optimization with Apache Spark
Asynchronous Hyperparameter Optimization with Apache SparkAsynchronous Hyperparameter Optimization with Apache Spark
Asynchronous Hyperparameter Optimization with Apache Spark
Databricks
 
Stream processing IoT time series data with Kafka & InfluxDB | Al Sargent, In...
Stream processing IoT time series data with Kafka & InfluxDB | Al Sargent, In...Stream processing IoT time series data with Kafka & InfluxDB | Al Sargent, In...
Stream processing IoT time series data with Kafka & InfluxDB | Al Sargent, In...
HostedbyConfluent
 

What's hot (20)

From MapReduce to Apache Spark
From MapReduce to Apache SparkFrom MapReduce to Apache Spark
From MapReduce to Apache Spark
 
Microservices meetup April 2017
Microservices meetup April 2017Microservices meetup April 2017
Microservices meetup April 2017
 
Location based push notifications
Location based push notificationsLocation based push notifications
Location based push notifications
 
Confluent On Azure: Why you should add Confluent to your Azure toolkit | Alic...
Confluent On Azure: Why you should add Confluent to your Azure toolkit | Alic...Confluent On Azure: Why you should add Confluent to your Azure toolkit | Alic...
Confluent On Azure: Why you should add Confluent to your Azure toolkit | Alic...
 
Spark Summit EU talk by John Musser
Spark Summit EU talk by John MusserSpark Summit EU talk by John Musser
Spark Summit EU talk by John Musser
 
AI on Spark for Malware Analysis and Anomalous Threat Detection
AI on Spark for Malware Analysis and Anomalous Threat DetectionAI on Spark for Malware Analysis and Anomalous Threat Detection
AI on Spark for Malware Analysis and Anomalous Threat Detection
 
How we Auto Scale applications based on CPU with Kubernetes at M6Web?
 How we Auto Scale applications based on CPU with Kubernetes at M6Web? How we Auto Scale applications based on CPU with Kubernetes at M6Web?
How we Auto Scale applications based on CPU with Kubernetes at M6Web?
 
Ml sprint16 thesis_intro
Ml sprint16 thesis_introMl sprint16 thesis_intro
Ml sprint16 thesis_intro
 
Advanced Spark Meetup - Jan 12, 2016
Advanced Spark Meetup - Jan 12, 2016Advanced Spark Meetup - Jan 12, 2016
Advanced Spark Meetup - Jan 12, 2016
 
How to Define and Share your Event APIs using AsyncAPI and Event API Products...
How to Define and Share your Event APIs using AsyncAPI and Event API Products...How to Define and Share your Event APIs using AsyncAPI and Event API Products...
How to Define and Share your Event APIs using AsyncAPI and Event API Products...
 
Monitoring
MonitoringMonitoring
Monitoring
 
Spark at Airbnb
Spark at AirbnbSpark at Airbnb
Spark at Airbnb
 
Simplifying Big Data Applications with Apache Spark 2.0
Simplifying Big Data Applications with Apache Spark 2.0Simplifying Big Data Applications with Apache Spark 2.0
Simplifying Big Data Applications with Apache Spark 2.0
 
Sumo Logic QuickStart Webinar Oct 2016
Sumo Logic QuickStart Webinar Oct 2016Sumo Logic QuickStart Webinar Oct 2016
Sumo Logic QuickStart Webinar Oct 2016
 
One Click Streaming Data Pipelines & Flows | Leveraging Kafka & Spark | Ido F...
One Click Streaming Data Pipelines & Flows | Leveraging Kafka & Spark | Ido F...One Click Streaming Data Pipelines & Flows | Leveraging Kafka & Spark | Ido F...
One Click Streaming Data Pipelines & Flows | Leveraging Kafka & Spark | Ido F...
 
Digital Transformation & Solvency II Simulations for L&G: Optimizing, Acceler...
Digital Transformation & Solvency II Simulations for L&G: Optimizing, Acceler...Digital Transformation & Solvency II Simulations for L&G: Optimizing, Acceler...
Digital Transformation & Solvency II Simulations for L&G: Optimizing, Acceler...
 
DBOps
DBOpsDBOps
DBOps
 
Building adaptive user experiences using Contextual Multi-Armed Bandits with...
Building adaptive user experiences using Contextual Multi-Armed Bandits  with...Building adaptive user experiences using Contextual Multi-Armed Bandits  with...
Building adaptive user experiences using Contextual Multi-Armed Bandits with...
 
Asynchronous Hyperparameter Optimization with Apache Spark
Asynchronous Hyperparameter Optimization with Apache SparkAsynchronous Hyperparameter Optimization with Apache Spark
Asynchronous Hyperparameter Optimization with Apache Spark
 
Stream processing IoT time series data with Kafka & InfluxDB | Al Sargent, In...
Stream processing IoT time series data with Kafka & InfluxDB | Al Sargent, In...Stream processing IoT time series data with Kafka & InfluxDB | Al Sargent, In...
Stream processing IoT time series data with Kafka & InfluxDB | Al Sargent, In...
 

Similar to Internet of Things

Spark + AI Summit 2019: Apache Spark Listeners: A Crash Course in Fast, Easy ...
Spark + AI Summit 2019: Apache Spark Listeners: A Crash Course in Fast, Easy ...Spark + AI Summit 2019: Apache Spark Listeners: A Crash Course in Fast, Easy ...
Spark + AI Summit 2019: Apache Spark Listeners: A Crash Course in Fast, Easy ...
Landon Robinson
 
Spark Development Lifecycle at Workday - ApacheCon 2020
Spark Development Lifecycle at Workday - ApacheCon 2020Spark Development Lifecycle at Workday - ApacheCon 2020
Spark Development Lifecycle at Workday - ApacheCon 2020
Pavel Hardak
 
Apache Spark Development Lifecycle @ Workday - ApacheCon 2020
Apache Spark Development Lifecycle @ Workday - ApacheCon 2020Apache Spark Development Lifecycle @ Workday - ApacheCon 2020
Apache Spark Development Lifecycle @ Workday - ApacheCon 2020
Eren Avşaroğulları
 
Apache Spark Listeners: A Crash Course in Fast, Easy Monitoring
Apache Spark Listeners: A Crash Course in Fast, Easy MonitoringApache Spark Listeners: A Crash Course in Fast, Easy Monitoring
Apache Spark Listeners: A Crash Course in Fast, Easy Monitoring
Databricks
 
vinay-mittal-new
vinay-mittal-newvinay-mittal-new
vinay-mittal-new
Vinay Mittal
 
Broadcast Music Inc - Release Automation Rockstars!
Broadcast Music Inc - Release Automation Rockstars!Broadcast Music Inc - Release Automation Rockstars!
Broadcast Music Inc - Release Automation Rockstars!
ghodgkinson
 
IDC 1도 모르는 개발자가 DevOps를 만났을때
IDC 1도 모르는 개발자가 DevOps를 만났을때IDC 1도 모르는 개발자가 DevOps를 만났을때
IDC 1도 모르는 개발자가 DevOps를 만났을때
주은 안
 
Spring Boot & Spring Cloud Apps on Pivotal Application Service - Daniel Lavoie
Spring Boot & Spring Cloud Apps on Pivotal Application Service - Daniel LavoieSpring Boot & Spring Cloud Apps on Pivotal Application Service - Daniel Lavoie
Spring Boot & Spring Cloud Apps on Pivotal Application Service - Daniel Lavoie
VMware Tanzu
 
Aleksandr_Savelyev_Resume_Mar_2016
Aleksandr_Savelyev_Resume_Mar_2016Aleksandr_Savelyev_Resume_Mar_2016
Aleksandr_Savelyev_Resume_Mar_2016
Aleksandr Savelyev
 
Big Data in the Cloud
Big Data in the CloudBig Data in the Cloud
Big Data in the Cloud
Amazon Web Services
 
kumarResume
kumarResumekumarResume
kumarResume
Kumar RAMASWAMY
 
"The Suitcase" Project Cloud QTR meeting presentation @ Disney/ABC
"The Suitcase"  Project Cloud QTR meeting presentation @ Disney/ABC"The Suitcase"  Project Cloud QTR meeting presentation @ Disney/ABC
"The Suitcase" Project Cloud QTR meeting presentation @ Disney/ABC
ETCenter
 
batbern43 Events - Lessons learnt building an Enterprise Data Bus
batbern43 Events - Lessons learnt building an Enterprise Data Busbatbern43 Events - Lessons learnt building an Enterprise Data Bus
batbern43 Events - Lessons learnt building an Enterprise Data Bus
BATbern
 
IoT Analytics from Edge to Cloud - using IBM Informix
IoT Analytics from Edge to Cloud - using IBM InformixIoT Analytics from Edge to Cloud - using IBM Informix
IoT Analytics from Edge to Cloud - using IBM Informix
Pradeep Muthalpuredathe
 
Data Streaming with Apache Kafka & MongoDB
Data Streaming with Apache Kafka & MongoDBData Streaming with Apache Kafka & MongoDB
Data Streaming with Apache Kafka & MongoDB
confluent
 
SpringOne Tour Denver - Spring Boot & Spring Cloud on Pivotal Application Ser...
SpringOne Tour Denver - Spring Boot & Spring Cloud on Pivotal Application Ser...SpringOne Tour Denver - Spring Boot & Spring Cloud on Pivotal Application Ser...
SpringOne Tour Denver - Spring Boot & Spring Cloud on Pivotal Application Ser...
VMware Tanzu
 
AI Scalability for the Next Decade
AI Scalability for the Next DecadeAI Scalability for the Next Decade
AI Scalability for the Next Decade
Paula Koziol
 
6. DISZ - Webalkalmazások skálázhatósága a Google Cloud Platformon
6. DISZ - Webalkalmazások skálázhatósága  a Google Cloud Platformon6. DISZ - Webalkalmazások skálázhatósága  a Google Cloud Platformon
6. DISZ - Webalkalmazások skálázhatósága a Google Cloud Platformon
Márton Kodok
 
DevOps Spain 2019. Beatriz Martínez-IBM
DevOps Spain 2019. Beatriz Martínez-IBMDevOps Spain 2019. Beatriz Martínez-IBM
DevOps Spain 2019. Beatriz Martínez-IBM
atSistemas
 
Streaming data analytics (Kinesis, EMR/Spark) - Pop-up Loft Tel Aviv
Streaming data analytics (Kinesis, EMR/Spark) - Pop-up Loft Tel Aviv Streaming data analytics (Kinesis, EMR/Spark) - Pop-up Loft Tel Aviv
Streaming data analytics (Kinesis, EMR/Spark) - Pop-up Loft Tel Aviv
Amazon Web Services
 

Similar to Internet of Things (20)

Spark + AI Summit 2019: Apache Spark Listeners: A Crash Course in Fast, Easy ...
Spark + AI Summit 2019: Apache Spark Listeners: A Crash Course in Fast, Easy ...Spark + AI Summit 2019: Apache Spark Listeners: A Crash Course in Fast, Easy ...
Spark + AI Summit 2019: Apache Spark Listeners: A Crash Course in Fast, Easy ...
 
Spark Development Lifecycle at Workday - ApacheCon 2020
Spark Development Lifecycle at Workday - ApacheCon 2020Spark Development Lifecycle at Workday - ApacheCon 2020
Spark Development Lifecycle at Workday - ApacheCon 2020
 
Apache Spark Development Lifecycle @ Workday - ApacheCon 2020
Apache Spark Development Lifecycle @ Workday - ApacheCon 2020Apache Spark Development Lifecycle @ Workday - ApacheCon 2020
Apache Spark Development Lifecycle @ Workday - ApacheCon 2020
 
Apache Spark Listeners: A Crash Course in Fast, Easy Monitoring
Apache Spark Listeners: A Crash Course in Fast, Easy MonitoringApache Spark Listeners: A Crash Course in Fast, Easy Monitoring
Apache Spark Listeners: A Crash Course in Fast, Easy Monitoring
 
vinay-mittal-new
vinay-mittal-newvinay-mittal-new
vinay-mittal-new
 
Broadcast Music Inc - Release Automation Rockstars!
Broadcast Music Inc - Release Automation Rockstars!Broadcast Music Inc - Release Automation Rockstars!
Broadcast Music Inc - Release Automation Rockstars!
 
IDC 1도 모르는 개발자가 DevOps를 만났을때
IDC 1도 모르는 개발자가 DevOps를 만났을때IDC 1도 모르는 개발자가 DevOps를 만났을때
IDC 1도 모르는 개발자가 DevOps를 만났을때
 
Spring Boot & Spring Cloud Apps on Pivotal Application Service - Daniel Lavoie
Spring Boot & Spring Cloud Apps on Pivotal Application Service - Daniel LavoieSpring Boot & Spring Cloud Apps on Pivotal Application Service - Daniel Lavoie
Spring Boot & Spring Cloud Apps on Pivotal Application Service - Daniel Lavoie
 
Aleksandr_Savelyev_Resume_Mar_2016
Aleksandr_Savelyev_Resume_Mar_2016Aleksandr_Savelyev_Resume_Mar_2016
Aleksandr_Savelyev_Resume_Mar_2016
 
Big Data in the Cloud
Big Data in the CloudBig Data in the Cloud
Big Data in the Cloud
 
kumarResume
kumarResumekumarResume
kumarResume
 
"The Suitcase" Project Cloud QTR meeting presentation @ Disney/ABC
"The Suitcase"  Project Cloud QTR meeting presentation @ Disney/ABC"The Suitcase"  Project Cloud QTR meeting presentation @ Disney/ABC
"The Suitcase" Project Cloud QTR meeting presentation @ Disney/ABC
 
batbern43 Events - Lessons learnt building an Enterprise Data Bus
batbern43 Events - Lessons learnt building an Enterprise Data Busbatbern43 Events - Lessons learnt building an Enterprise Data Bus
batbern43 Events - Lessons learnt building an Enterprise Data Bus
 
IoT Analytics from Edge to Cloud - using IBM Informix
IoT Analytics from Edge to Cloud - using IBM InformixIoT Analytics from Edge to Cloud - using IBM Informix
IoT Analytics from Edge to Cloud - using IBM Informix
 
Data Streaming with Apache Kafka & MongoDB
Data Streaming with Apache Kafka & MongoDBData Streaming with Apache Kafka & MongoDB
Data Streaming with Apache Kafka & MongoDB
 
SpringOne Tour Denver - Spring Boot & Spring Cloud on Pivotal Application Ser...
SpringOne Tour Denver - Spring Boot & Spring Cloud on Pivotal Application Ser...SpringOne Tour Denver - Spring Boot & Spring Cloud on Pivotal Application Ser...
SpringOne Tour Denver - Spring Boot & Spring Cloud on Pivotal Application Ser...
 
AI Scalability for the Next Decade
AI Scalability for the Next DecadeAI Scalability for the Next Decade
AI Scalability for the Next Decade
 
6. DISZ - Webalkalmazások skálázhatósága a Google Cloud Platformon
6. DISZ - Webalkalmazások skálázhatósága  a Google Cloud Platformon6. DISZ - Webalkalmazások skálázhatósága  a Google Cloud Platformon
6. DISZ - Webalkalmazások skálázhatósága a Google Cloud Platformon
 
DevOps Spain 2019. Beatriz Martínez-IBM
DevOps Spain 2019. Beatriz Martínez-IBMDevOps Spain 2019. Beatriz Martínez-IBM
DevOps Spain 2019. Beatriz Martínez-IBM
 
Streaming data analytics (Kinesis, EMR/Spark) - Pop-up Loft Tel Aviv
Streaming data analytics (Kinesis, EMR/Spark) - Pop-up Loft Tel Aviv Streaming data analytics (Kinesis, EMR/Spark) - Pop-up Loft Tel Aviv
Streaming data analytics (Kinesis, EMR/Spark) - Pop-up Loft Tel Aviv
 

More from DeZyre

Top 10 Data Visualization Tools
Top 10 Data Visualization ToolsTop 10 Data Visualization Tools
Top 10 Data Visualization Tools
DeZyre
 
Data Scientist Skills
Data Scientist SkillsData Scientist Skills
Data Scientist Skills
DeZyre
 
What companies hiring data scientists and hadoop developers are looking for?
What companies hiring data scientists and hadoop developers are looking for?What companies hiring data scientists and hadoop developers are looking for?
What companies hiring data scientists and hadoop developers are looking for?
DeZyre
 
Big Data Timeline
Big Data TimelineBig Data Timeline
Big Data Timeline
DeZyre
 
How to program your way into data science?
How to program your way into data science?How to program your way into data science?
How to program your way into data science?
DeZyre
 
Big Data Use Cases
Big Data Use CasesBig Data Use Cases
Big Data Use Cases
DeZyre
 
Big data hadoop salary trends
Big data hadoop salary trendsBig data hadoop salary trends
Big data hadoop salary trends
DeZyre
 
Stay updated through online hackathons
Stay updated through online hackathonsStay updated through online hackathons
Stay updated through online hackathons
DeZyre
 
How to become a data scientist
How to become a data scientistHow to become a data scientist
How to become a data scientist
DeZyre
 
How big data is transforming BI
How big data is transforming BIHow big data is transforming BI
How big data is transforming BI
DeZyre
 
Sports and Big data
Sports and Big dataSports and Big data
Sports and Big data
DeZyre
 
Big data in healthcare
Big data in healthcareBig data in healthcare
Big data in healthcare
DeZyre
 
What is big data
What is big data What is big data
What is big data
DeZyre
 
25 things that make Amazons Jeff Bezos, Jeff Bezos
25 things that make Amazons Jeff Bezos, Jeff Bezos25 things that make Amazons Jeff Bezos, Jeff Bezos
25 things that make Amazons Jeff Bezos, Jeff Bezos
DeZyre
 

More from DeZyre (14)

Top 10 Data Visualization Tools
Top 10 Data Visualization ToolsTop 10 Data Visualization Tools
Top 10 Data Visualization Tools
 
Data Scientist Skills
Data Scientist SkillsData Scientist Skills
Data Scientist Skills
 
What companies hiring data scientists and hadoop developers are looking for?
What companies hiring data scientists and hadoop developers are looking for?What companies hiring data scientists and hadoop developers are looking for?
What companies hiring data scientists and hadoop developers are looking for?
 
Big Data Timeline
Big Data TimelineBig Data Timeline
Big Data Timeline
 
How to program your way into data science?
How to program your way into data science?How to program your way into data science?
How to program your way into data science?
 
Big Data Use Cases
Big Data Use CasesBig Data Use Cases
Big Data Use Cases
 
Big data hadoop salary trends
Big data hadoop salary trendsBig data hadoop salary trends
Big data hadoop salary trends
 
Stay updated through online hackathons
Stay updated through online hackathonsStay updated through online hackathons
Stay updated through online hackathons
 
How to become a data scientist
How to become a data scientistHow to become a data scientist
How to become a data scientist
 
How big data is transforming BI
How big data is transforming BIHow big data is transforming BI
How big data is transforming BI
 
Sports and Big data
Sports and Big dataSports and Big data
Sports and Big data
 
Big data in healthcare
Big data in healthcareBig data in healthcare
Big data in healthcare
 
What is big data
What is big data What is big data
What is big data
 
25 things that make Amazons Jeff Bezos, Jeff Bezos
25 things that make Amazons Jeff Bezos, Jeff Bezos25 things that make Amazons Jeff Bezos, Jeff Bezos
25 things that make Amazons Jeff Bezos, Jeff Bezos
 

Recently uploaded

STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
sameer shah
 
一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理
zsjl4mimo
 
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
Social Samosa
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
Timothy Spann
 
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
nuttdpt
 
Analysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performanceAnalysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performance
roli9797
 
The Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series DatabaseThe Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series Database
javier ramirez
 
Experts live - Improving user adoption with AI
Experts live - Improving user adoption with AIExperts live - Improving user adoption with AI
Experts live - Improving user adoption with AI
jitskeb
 
Influence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business PlanInfluence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business Plan
jerlynmaetalle
 
A presentation that explain the Power BI Licensing
A presentation that explain the Power BI LicensingA presentation that explain the Power BI Licensing
A presentation that explain the Power BI Licensing
AlessioFois2
 
一比一原版(Chester毕业证书)切斯特大学毕业证如何办理
一比一原版(Chester毕业证书)切斯特大学毕业证如何办理一比一原版(Chester毕业证书)切斯特大学毕业证如何办理
一比一原版(Chester毕业证书)切斯特大学毕业证如何办理
74nqk8xf
 
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
bopyb
 
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Aggregage
 
一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理
aqzctr7x
 
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
g4dpvqap0
 
End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024
Lars Albertsson
 
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
v7oacc3l
 
University of New South Wales degree offer diploma Transcript
University of New South Wales degree offer diploma TranscriptUniversity of New South Wales degree offer diploma Transcript
University of New South Wales degree offer diploma Transcript
soxrziqu
 
State of Artificial intelligence Report 2023
State of Artificial intelligence Report 2023State of Artificial intelligence Report 2023
State of Artificial intelligence Report 2023
kuntobimo2016
 
DSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelinesDSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelines
Timothy Spann
 

Recently uploaded (20)

STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
 
一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理
 
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
 
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
 
Analysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performanceAnalysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performance
 
The Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series DatabaseThe Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series Database
 
Experts live - Improving user adoption with AI
Experts live - Improving user adoption with AIExperts live - Improving user adoption with AI
Experts live - Improving user adoption with AI
 
Influence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business PlanInfluence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business Plan
 
A presentation that explain the Power BI Licensing
A presentation that explain the Power BI LicensingA presentation that explain the Power BI Licensing
A presentation that explain the Power BI Licensing
 
一比一原版(Chester毕业证书)切斯特大学毕业证如何办理
一比一原版(Chester毕业证书)切斯特大学毕业证如何办理一比一原版(Chester毕业证书)切斯特大学毕业证如何办理
一比一原版(Chester毕业证书)切斯特大学毕业证如何办理
 
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
 
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
 
一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理
 
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
 
End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024
 
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
 
University of New South Wales degree offer diploma Transcript
University of New South Wales degree offer diploma TranscriptUniversity of New South Wales degree offer diploma Transcript
University of New South Wales degree offer diploma Transcript
 
State of Artificial intelligence Report 2023
State of Artificial intelligence Report 2023State of Artificial intelligence Report 2023
State of Artificial intelligence Report 2023
 
DSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelinesDSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelines
 

Internet of Things

  • 1. Big Data and Internet of things(IOT)
  • 2. Project Morpheus (Beaconstac Analytics) Garima Batra Core Platform Engineer | MobStac May 2015
  • 3. A quick intro about Beaconstac 1 Beaconstac is a proximity marketing and analytics platform for beacons Several beacon specific events are defined to aid proximity marketing The events include Camp on event, beacon exit event, region enter, region exit etc. Beaconstac analytics platform makes it easy for managers/marketers/developers to analyze event data Components include Beaconstac iOS/Android sdk, beaconstac portal
  • 4. Why Hadoop? 1 Collect event logs generated from Beaconstac SDK usage Needed a system to answer queries like o Heat map of beacons by the number of visits received in a specified time interval. o Heat map of beacons by the amount of time spent in a specified time interval. o Average time spent by users near different beacons o Last seen per user o Last seen per beacon o Analyzing data with custom attributes filters o Traversed path in an area by individual users
  • 5. Leveraging Amazon's EMR for Beaconstac Analytics 1  Amazon's Streaming API for writing mapper and reducer functions in Python  Input - Copy programs to Amazon S3  Output – Copy the processed/output data to S3  Initial tests were run using Amazon's EMR console. Here you can define the following - 1) Cluster configuration – Name, Termination protection, Logging, logs location on S3 etc. 2) Software configuration – Hadoop AMI version, applications to be installed on startup etc. 3) Hardware configuration – Types of nodes – master, Core and Task 4) Security keys, allowed users 5) Bootstrap actions – Configure Hadoop, Custom actions etc. 6) Steps – Streaming program, Hive program, Pig program
  • 6. Integrating EMR in production 1
  • 7. Batch processing for Morpheus 1 AWS Data pipeline
  • 8. Deep dive into EMR startup and job submission 1
  • 9. How Does AWS Data Pipeline Work? 1 Pipeline definition - specifies the business logic of your data management AWS Data pipeline web service - interprets the pipeline definition and assigns tasks to workers to move and transform data. Task runner - polls the AWS Data Pipeline web service for tasks and then performs those tasks.
  • 10. Morpheus version of Data pipeline 1 Runs every hour Requires a Kafka consumer script Copy the output to Elastic Search Run EMR jobs Copy logs from Kafka to S3 Runs once every day Processes each job and produces output Each job comprises of mapper and reducer scripts Runs once every day Inserts output in Elastic search
  • 11. Settings file in each job 1 Source: Lorem Ipsum Questions?? 1