RaDEn: A Scalable and Efficient Platform
for Engineering Radiation Data
Hadi Fadlallah, Yehia Taher, Ali Jaber
Plan
• Introduction
• Objective
• Proposed system
• Implementation
• Experiments
• Conclusion
• Limitations
• Future work
Radiation Pollution
Introduction 3 … 5
Rise of Internet of Things
New Challenges
Huge Volume
High Speed
Wide variety
Traditional
Solutions
Objective
• Scalable solution for engineering radiation data
• Processing big data (huge volume, high speed)
• Real-time monitoring
Proposed system
• RaDEn: Radiation Data Engineering system
• Scalability and fault-tolerance
• Handles big data
• Monitors radiation data in real time and in batch style
Proposed system
Data ingestion
Data storage
Data processing
Data visualization
• Works together with the data processing engine
• Real-time graphs
• Matplotlib Python library
Implementation
Alarm System
Experiments
• Dataset provided by the Lebanese Atomic
Energy Commission
• Confidentiality issues in accessing sensors, web
server
• Data: Beirut, from 2015-08-01 to 2016-08-01
• Radiation level, temperature, rain level, sensor
battery power, data collection time and external
battery power
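
The fields listed above map naturally onto a small record type. The sketch below is only an illustration: the field names, their order, the CSV layout and the timestamp format are all assumptions, since the slides do not show the actual file format.

```python
import csv
import io
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Reading:
    """One sensor record; every field name here is hypothetical."""
    collected_at: datetime
    dose_rate: float          # radiation level
    temperature: float
    rain_level: float
    sensor_battery: float
    external_battery: float


def parse_readings(text):
    """Parse CSV text into Reading objects, skipping malformed rows."""
    readings = []
    for row in csv.reader(io.StringIO(text)):
        try:
            # Assumed timestamp format; the real one is not given in the deck.
            collected_at = datetime.strptime(row[0], "%Y-%m-%d %H:%M:%S")
            values = [float(v) for v in row[1:6]]
            if len(values) != 5:
                raise ValueError("wrong column count")
        except (ValueError, IndexError):
            continue  # messy row: ignore it
        readings.append(Reading(collected_at, *values))
    return readings


sample = (
    "2015-08-01 00:00:00,42.0,27.5,0.0,3.6,12.1\n"
    "not-a-date,oops\n"
    "2015-08-01 00:10:00,55.3,27.1,0.0,3.6,12.0\n"
)
readings = parse_readings(sample)
```

Skipping rows that fail to parse is one simple way to tolerate messy input; the deck itself handles messy rows later, through a Hive view.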
Experiments
• Start required services
• Sensor simulation, folder listener
• Import to HDFS
• Processing and visualization
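
The "sensor simulation, folder listener" step can be pictured as one routine that drops files into a directory and another that notices them. This is a minimal sketch under assumptions: the function names, file names and polling approach are hypothetical, and the hand-off of each new file to the ingestion layer is left out.

```python
import os
import tempfile


def poll_new_files(folder, seen):
    """Return files in `folder` not yet in `seen`, and mark them as seen."""
    new_files = sorted(
        name for name in os.listdir(folder)
        if name not in seen and os.path.isfile(os.path.join(folder, name))
    )
    seen.update(new_files)
    return new_files


def simulate_sensor(folder, name, lines):
    """Write one batch of readings to a file, as a simulated sensor might."""
    path = os.path.join(folder, name)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(lines) + "\n")
    return path


with tempfile.TemporaryDirectory() as watched:
    seen = set()
    simulate_sensor(watched, "batch_001.csv", ["2015-08-01 00:00:00,42.0"])
    first = poll_new_files(watched, seen)
    simulate_sensor(watched, "batch_002.csv", ["2015-08-01 00:10:00,55.3"])
    second = poll_new_files(watched, seen)
    third = poll_new_files(watched, seen)   # nothing new since last poll
```

Polling keeps the sketch dependency-free; a production listener would more likely rely on file-system notifications.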
Experiments
Experiments
• Alerts are raised in the form of
message boxes
• Level -> title
• Description -> body
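
The mapping of level to title and description to body could look like the sketch below. The threshold values, the level names and the `build_alert` function are placeholders chosen for illustration; the deck does not state the real alert levels.

```python
def build_alert(dose_rate, warning=50.0, danger=100.0):
    """Return a {'title', 'body'} dict for a reading, or None if it is normal.

    The thresholds are placeholders, not values taken from the deck.
    """
    if dose_rate >= danger:
        level = "Danger"
    elif dose_rate >= warning:
        level = "Warning"
    else:
        return None
    body = f"Dose rate {dose_rate:.1f} exceeds the {level.lower()} threshold."
    return {"title": level, "body": body}   # level -> title, description -> body


alerts = [build_alert(value) for value in (12.0, 55.3, 140.0)]
```

The resulting title and body could then be passed to any message-box call, for example `tkinter.messagebox.showwarning(title, message)`; which toolkit the authors used is not stated.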
Experiments
• Created Hive external table (table: radiation)
• Filtered out messy data rows (view:
vw_radiation)
• Spark-SQL, HiveQL queries
SELECT * FROM vw_radiation
WHERE dose_rate > 50;
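
The table-plus-cleaning-view pattern can be tried locally without a Hadoop cluster. The sketch below uses SQLite purely as a stand-in for Hive: the `collected_at` column and the rule "drop rows with missing values" are assumptions, and only the names `radiation`, `vw_radiation` and `dose_rate` come from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Stand-in for the Hive external table; the column list is hypothetical.
conn.execute("CREATE TABLE radiation (collected_at TEXT, dose_rate REAL)")
conn.executemany(
    "INSERT INTO radiation VALUES (?, ?)",
    [
        ("2015-08-01 00:00:00", 42.0),
        ("2015-08-01 00:10:00", 55.3),
        (None, 70.0),                    # messy: missing timestamp
        ("2015-08-01 00:20:00", None),   # messy: missing dose rate
    ],
)

# The view hides messy rows so that queries never see them.
conn.execute(
    "CREATE VIEW vw_radiation AS "
    "SELECT * FROM radiation "
    "WHERE collected_at IS NOT NULL AND dose_rate IS NOT NULL"
)

high = conn.execute(
    "SELECT * FROM vw_radiation WHERE dose_rate > 50"
).fetchall()
conn.close()
```

Putting the cleaning rule in a view leaves the raw table untouched, so the rule can be changed later without re-importing data.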
Experiments
Conclusion
• Implemented a radiation data engineering system
• Relies on Apache Hadoop, Kafka, Sqoop, Flume
and Spark
• Ensures scalability and fault tolerance
• Radiation monitoring
• Data retrieval
Limitations
• Small data set
• No sensors or web server access
• Lack of documentation
• Time limit
Future work
• Improve visualization (Bokeh, Kibana)
• User-friendly interface
• Use ORC (Optimized Row Columnar) format
• Distributed search engines
Thank you

Editor's Notes

  1. Radiation pollution is a critical concern because of the severe damage it can cause to humans and the environment. Controlling and monitoring it is therefore essential to minimize that damage.
  2. In the past century, it was hard to build a centralized radiation monitoring system because of the limitations of traditional networks. With the rise of the Internet of Things, radiation measurement units were integrated into wireless sensors that transmit data over communication networks.
  3. As a result, new challenges appeared: (1) sensors that collect data in real time can produce a massive amount of data, transferred at high speed; (2) using different types of sensors means dealing with different data formats. Traditional data technologies can no longer handle this type of data, and existing solutions are conventional and mostly handle data in batch style.
  4. In this experimental research, our objective is to build a scalable radiation data engineering platform that can process and monitor, in real time, huge amounts of high-speed radiation data in different formats.
  5. Our proposed system is called RaDEn, short for Radiation Data Engineering system. It guarantees high scalability and fault tolerance, handles big data, and can monitor data both in real time and in batch style.
  6. The system architecture is composed of six layers:
     • Data sources: radiation sensors installed in different places, flat files, and archival relational databases.
     • Data ingestion layer: collects data and sends it to the data processing engine and the data storage layer.
     • Data storage layer: stores huge volumes of data and lets the end user search the stored data.
     • Data processing engine: processes radiation data in real time and raises alerts when a high radiation level is detected.
     • Visualization layer: shows real-time graphs.
     • Coordination layer: guarantees communication between the technologies used in the different layers. This task is handled by Apache ZooKeeper, which these data technologies require.
     Next, we describe the technologies used in each layer.
  7. First, the data ingestion layer. To read data in different formats from sensors and flat files we used Apache Kafka, which is a distributed, scalable and fault-tolerant technology. We created two Kafka topics: one for real-time processing and one for batch style. Data are sent from the data sources to Kafka producers, then distributed in parallel across the Kafka pipelines, where they stay until they are consumed. Data are sent to the data storage layer via Apache Flume agents (one for each Kafka topic) and, at the same time, to the processing engine. The system can also import archival data from relational databases using Apache Sqoop import, where we only have to specify the connection string of the relational database and the target location in HDFS.
  8. The data storage layer has two components. The data repository consists of the Hadoop Distributed File System (HDFS), which allows parallel computing and provides high scalability and fault tolerance: the data comes from the ingestion layer to the Hadoop master node and is then replicated over the slave nodes in text file format. The metadata component relies mainly on Apache Hive: it allows creating tables on top of HDFS directories and lets the user retrieve data from the repository using SQL-like languages (Spark SQL, HiveQL).
  9. The data processing layer relies mainly on Apache Spark, which is a scalable, fault-tolerant, distributed data processing technology. The Spark master receives the data from the data ingestion layer and sends it to the Spark workers to be processed, then visualized in the data visualization layer. Besides Spark, we used the pandas Python library, which contains many functions for manipulating data.
  10. The data visualization layer relies mainly on a Python library called Matplotlib. It is a very simple library that allows the user to draw real-time graphs.
  11. To implement this system, we configured three Linux-based virtual machines. One machine acts as the Hadoop master node and contains the Apache Kafka, Flume, Hive, Sqoop and Spark installations. The other machines act as Hadoop data nodes. We used only one Kafka node and one Spark node because of the small dataset we received, but more nodes can be added when required.
  12. We wrote a Python script that implements the following alarm system (based on the LAEC requirements). The alarm system works as follows: ….
  13. We ran the experiments with a dataset provided by the LAEC. For confidentiality reasons, they gave us the data in the form of flat files instead of giving access to the sensors or the web server. The data was collected from one sensor located in Beirut from 1 August 2015 to 1 August 2016. The dataset contains information such as ….
  14. First, we have to start the required services (Hadoop cluster, Spark, Kafka, Flume agent and the Python script). To simulate reading data from a sensor, we created a directory and a listener on top of it: when a file is added to the folder, the listener starts sending it line by line to the Kafka broker. Each row is processed and visualized by the Python script.
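The folder listener described in this note could look roughly like the sketch below. It is only an illustration under stated assumptions, not the script used in the experiments: the function name `poll_directory`, the `send` callback (a stand-in for whatever Kafka producer call the real script makes) and the sample file are all hypothetical, and polling is just one simple way to detect new files.

```python
import os
import tempfile
from typing import Callable, List, Set


def poll_directory(directory: str, seen: Set[str], send: Callable[[str], None]) -> int:
    """Forward every line of each not-yet-seen file to `send`.

    `send` is a hypothetical stand-in for the call that publishes one row
    to the Kafka broker. Returns the number of lines forwarded.
    """
    sent = 0
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if name in seen or not os.path.isfile(path):
            continue  # skip files already forwarded and sub-directories
        with open(path, encoding="utf-8") as handle:
            for line in handle:
                row = line.rstrip("\r\n")
                if row:  # ignore blank lines
                    send(row)
                    sent += 1
        seen.add(name)
    return sent


def demo() -> List[str]:
    """Simulate a sensor dropping one file into the watched folder."""
    received: List[str] = []
    with tempfile.TemporaryDirectory() as folder:
        seen: Set[str] = set()
        poll_directory(folder, seen, received.append)  # folder is still empty
        with open(os.path.join(folder, "readings.txt"), "w", encoding="utf-8") as out:
            out.write("row-1\nrow-2\n")  # placeholder rows, not real data
        poll_directory(folder, seen, received.append)  # forwards both rows
        poll_directory(folder, seen, received.append)  # nothing is re-sent
    return received
```

In a long-running listener, `poll_directory` would be called in a loop with a short sleep between passes; the `seen` set is what keeps a file from being sent twice.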
  15. The following figure shows some sequential screenshots of the real-time graph; we can see the evolution of the radiation level as a function of date and time.
  16. When there is an alert, it is raised in the form of a message box, as shown in the figure: the alarm level is written in the title and the description in the body.
  17. On top of the HDFS directory we created a Hive external table, and we created a view that reads from this table to ignore messy data rows and convert data types. Then we can retrieve data using SQL-like languages such as Spark SQL and HiveQL.
  18. The figure shows a screenshot of the results of the previous query.
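The idea of a raw table plus a cleaning view can be tried locally with the sketch below. It deliberately swaps in SQLite (from the Python standard library) for Hive, so the SQL dialect is not HiveQL. Only the names `radiation`, `vw_radiation` and `dose_rate` come from the slides; the `collected_at` column, the cleaning rules and the sample rows are made-up assumptions for illustration.

```python
import sqlite3
from typing import List, Tuple


def build_demo_db() -> sqlite3.Connection:
    """Create a raw text table and a view that drops messy rows and casts types."""
    conn = sqlite3.connect(":memory:")
    # Raw table: everything is stored as text, as in a flat file.
    conn.execute("CREATE TABLE radiation (collected_at TEXT, dose_rate TEXT)")
    conn.executemany(
        "INSERT INTO radiation VALUES (?, ?)",
        [
            ("2015-08-01 00:00", "42.5"),  # made-up sample rows
            ("2015-08-01 00:10", "61.0"),
            ("2015-08-01 00:20", "n/a"),   # messy: non-numeric reading
            ("", "75.2"),                  # messy: missing timestamp
        ],
    )
    # View: keep rows with a timestamp and a purely numeric reading,
    # and expose the reading as a real number.
    conn.execute(
        """
        CREATE VIEW vw_radiation AS
        SELECT r.collected_at AS collected_at,
               CAST(r.dose_rate AS REAL) AS dose_rate
        FROM radiation AS r
        WHERE r.collected_at <> ''
          AND r.dose_rate GLOB '[0-9]*'
          AND r.dose_rate NOT GLOB '*[^0-9.]*'
        """
    )
    return conn


def high_readings(conn: sqlite3.Connection, threshold: float) -> List[Tuple[str, float]]:
    """Return cleaned rows whose dose rate exceeds `threshold`."""
    cursor = conn.execute(
        "SELECT collected_at, dose_rate FROM vw_radiation WHERE dose_rate > ?",
        (threshold,),
    )
    return cursor.fetchall()
```

Calling `high_readings(build_demo_db(), 50)` mirrors the query shown on the slide: the two messy rows never reach the result because the view filters them out before the threshold is applied.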
  19. In conclusion, we have designed and implemented a radiation data engineering system that: can handle massive amounts of data in real time and at rest; relies on scalable, fault-tolerant and distributed technologies such as Hadoop; and allows users to retrieve stored data using SQL-like languages. We have also implemented an alarm system that monitors the radiation data and raises an alert when a high radiation level is detected.
  20. This research has some limitations, for the following reasons: the system was not evaluated with big data, because the dataset we received was small; we did not get access to the sensors or the web server; documentation for big data technologies is lacking; and we worked under a time constraint.
  21. In the future, many improvements can be made: improving the visualization layer with more powerful tools such as the Bokeh Python library and Kibana, which is part of the Elastic Stack; designing and implementing user-friendly interfaces; creating a data warehousing job that runs every day and converts the newly stored files into ORC format, which offers higher performance; and using distributed search engines such as Solr and Elasticsearch.