SlideShare a Scribd company logo
1 of 23
Tao Zhong
 Kshitij A. Doshi
 Xi Tang
 Ting Lou
 Zhongyan Lu
 Hong Li
Presented by: Raminder Kaur
Wayne State University
 Introduction
 Motivation and Background
 Architecture
 Framework
 Result
 Future work
 Conclusion
 Index term
 References
Wayne State University
This paper describes:
 a few key additional requirements that result from having to
support in-memory processing of data while updates proceed
concurrently.
 RAF
 Two RAF based solutions (discussed further)
Wayne State University
A few examples of information in motion that may just be seconds old, and
not yet well categorized or linked to other data:
- GPS-based navigation : to reduce wasted energy, accidents, delays and
emergencies.
- A credit card company : to detect and intercept suspicious transactions
- A metropolitan or regional power grid : to modulate power generation,
perform load-balancing, direct repair actions, and take policy enforcement
steps
 An essential feature in the above examples is the need to integrate new
transactions into analysis results within a very short time—sometimes as
short as a few tens of milliseconds.
Wayne State University
RDD makes in-memory solutions less failure prone. So RAF enhances RDD
approach so that resiliency is blended with a few additional characteristics as
listed below:
• Efficient allocation and control of memory resources
• Resilient update of information at much finer resolution
• Flexible and highly efficient concurrency control
• Replication and partitioning of data transparent to clients
Architecturally RAF elevates memory across an entire cluster to a first class
storage entity and defines high level mechanisms by which applications on RAF
can orchestrate distributed actions upon objects stored in cluster memory.
To promote responsible and transparent use of memory, RAF opts to use a
programming language such as C, C++, over mixed language environments in
which garbage allocation is opaque.
Wayne State University
Data has a lot of value when mined. As data continues to compound at brisk
rates, institutions need to grapple with two broad demands –
 accumulating, processing, synopsizing and utilizing information in a
timely manner
 storing the refined data resiliently
 keeping the data accessible at high speed.
The term Big Data itself is elastic and serves well as a description of the scale
or volume of these solutions, but does not define a constraining principle for
organizing storage .
Wayne State University
Requirements for low-latency and high throughput analytics on
datasets:
 In-memory structures and storage
 Resiliency
 Sharing data through memory
 Uniform interaction with storage
 Minimizing memory recycling
 Efficient integration of CRUD
 Synchronizing efficiently
 Searching Efficiently
Wayne State University
Translation of eight requirements into five design elements:
 C and C++ based programming for efficient sharing of data
through memory
 Resilient storing of new content
 Efficient concurrency
 Processing information in motion
 Fast, general, ad-hoc searches
Wayne State University
 This framework targets the execution of complex queries at
very low latency.
 Information upon which queries operate may be available on
some storage medium, or generated dynamically as a result of
ongoing transactional activities.
 RAF provides distributed computing environment which is
integrated with memory-centric, distributed storage system
where one application can pass the data to another in order to
share data in memory
Wayne State University
 RDD: used to store information in memory of one or more machines to
assure that in case of failure of one or more machines, the RDD can be
reconstructed.
 Transformations: operation on RDD to generate new data sets. RAF
transformations are join, map, union, etc.
 Filter: a particular type of transformation. Produces a dataset whose
contents satisfy a specified condition.
 Delegate: It is a bridged module. Purpose of delegate is to create a version
of datastore at a particular time and present it as memory resident RDD.
Wayne State
University
 Efficient storage sharing using DELEGATE
 Memory-centric storage operation
-Reliability
 Data and storage types
-Structured data
-Storage types (Replicated store and Partitioned store)
 Distributed Execution of Analytics tasks
-Analytics tasks interface
Wayne State University
 Unit Testing:
-Scalability testing results (how well update operations scale)
-Latency relative to Hive/HDFS (how long does it take to
complete a query)
NOTE: These unit test results show advantage of in-memory
distributed processing oriented design of RAF.
 Solution-level implementation and testing
-Telecommunications subscriber Management
-Safe City Solution
Wayne State University
Wayne State University
 Motivated by the high degree of familiarity that many developers have
with database interfaces, we are incrementally introducing SQL-
92/JDBC/ODBC like interfaces on top of RAF. A number of optimizations
are also being added.
These optimizations include:
 application requested indexing, to accelerate searches
 blending in column-store capabilities where appropriate (for example, for
rarely-written data)
 compression, in order to reduce data transported between nodes.
Wayne State
University
 Discussed RAF, an architectural approach that meshes memory-centric
non-relational query processing for low latency analytics with memory-
centric update processing to accommodate high volumes of updates.
 Delegate, which participates as a special type of content transformer in a
hierarchy of RDD transformations.
 In RAF, protocol buffers are used to obtain data abstraction and efficient
conveyance among applications, providing applications with a high degree
of independence in location, representation, and transmission of data.
 A light-weight but expressive interface for RAF
 Using unit tests we show high cluster scaling capability for transactions, an
order of magnitude latency improvement for query processing.
 Discussed two real-world usage scenarios in which RAF is being used.
Wayne State University
 RDD: Resilient distributed dataset
 RAF: Real-time Analytics Foundation
 CRUD : Create/Retrieve/Update/Delete
 HDFS: Hadoop Distributed File System
 Apache Hadoop: http://hadoop.apache.org/
 Apache HBase: http://hbase.apache.org/
 Memcached: http://www.memcached.org/
 Oracle Coherence: http://www.oracle.com/technetwork/ middle ware/ coherence/
 H. Plattner, A. Zeier, In-Memory Data Management.
 Protobuf: http://code.google.com/p/protobuf/
 Redis: http://www.redis.io/
 SQLStream: http://www.sqlstream.com/
 Vertica: http://www.vertica.com/
 VoltDB: http://www.voltdb.com
Wayne State University
Thanks !!!

More Related Content

What's hot

Platform for Data Scientists
Platform for Data ScientistsPlatform for Data Scientists
Platform for Data Scientistsdatamantra
 
Big Data Business Transformation - Big Picture and Blueprints
Big Data Business Transformation - Big Picture and BlueprintsBig Data Business Transformation - Big Picture and Blueprints
Big Data Business Transformation - Big Picture and BlueprintsAshnikbiz
 
Enabling Next Gen Analytics with Azure Data Lake and StreamSets
Enabling Next Gen Analytics with Azure Data Lake and StreamSetsEnabling Next Gen Analytics with Azure Data Lake and StreamSets
Enabling Next Gen Analytics with Azure Data Lake and StreamSetsStreamsets Inc.
 
Building Custom Big Data Integrations
Building Custom Big Data IntegrationsBuilding Custom Big Data Integrations
Building Custom Big Data IntegrationsPat Patterson
 
Billions of Rows, Millions of Insights, Right Now
Billions of Rows, Millions of Insights, Right NowBillions of Rows, Millions of Insights, Right Now
Billions of Rows, Millions of Insights, Right NowRob Winters
 
The "Big Data" Ecosystem at LinkedIn
The "Big Data" Ecosystem at LinkedInThe "Big Data" Ecosystem at LinkedIn
The "Big Data" Ecosystem at LinkedInSam Shah
 
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...Hortonworks
 
Quantopix analytics system (qas)
Quantopix analytics system (qas)Quantopix analytics system (qas)
Quantopix analytics system (qas)Al Sabawi
 
Dealing with Drift: Building an Enterprise Data Lake
Dealing with Drift: Building an Enterprise Data LakeDealing with Drift: Building an Enterprise Data Lake
Dealing with Drift: Building an Enterprise Data LakePat Patterson
 
The Future of Data Science and Machine Learning at Scale: A Look at MLflow, D...
The Future of Data Science and Machine Learning at Scale: A Look at MLflow, D...The Future of Data Science and Machine Learning at Scale: A Look at MLflow, D...
The Future of Data Science and Machine Learning at Scale: A Look at MLflow, D...Databricks
 
Real time data ingestion and Hybrid Cloud
Real time data ingestion and Hybrid CloudReal time data ingestion and Hybrid Cloud
Real time data ingestion and Hybrid CloudNeeraj Sabharwal
 
Cloud Computing for Small & Medium Businesses
Cloud Computing for Small & Medium BusinessesCloud Computing for Small & Medium Businesses
Cloud Computing for Small & Medium BusinessesAl Sabawi
 
Spark Summit Keynote by Suren Nathan
Spark Summit Keynote by Suren NathanSpark Summit Keynote by Suren Nathan
Spark Summit Keynote by Suren NathanSpark Summit
 
Optimize Data for the Logical Data Warehouse
Optimize Data for the Logical Data WarehouseOptimize Data for the Logical Data Warehouse
Optimize Data for the Logical Data WarehouseAttunity
 
Data Management on Hadoop at Yahoo!
Data Management on Hadoop at Yahoo!Data Management on Hadoop at Yahoo!
Data Management on Hadoop at Yahoo!Seetharam Venkatesh
 
Which data should you move to Hadoop?
Which data should you move to Hadoop?Which data should you move to Hadoop?
Which data should you move to Hadoop?Attunity
 
O'Reilly ebook: Operationalizing the Data Lake
O'Reilly ebook: Operationalizing the Data LakeO'Reilly ebook: Operationalizing the Data Lake
O'Reilly ebook: Operationalizing the Data LakeVasu S
 
Hadoop mapreduce and yarn frame work- unit5
Hadoop mapreduce and yarn frame work-  unit5Hadoop mapreduce and yarn frame work-  unit5
Hadoop mapreduce and yarn frame work- unit5RojaT4
 
Tableau @ Spil Games
Tableau @ Spil GamesTableau @ Spil Games
Tableau @ Spil GamesRob Winters
 
Data Mover for Hadoop | Diyotta
Data Mover for Hadoop | DiyottaData Mover for Hadoop | Diyotta
Data Mover for Hadoop | Diyottadiyotta
 

What's hot (20)

Platform for Data Scientists
Platform for Data ScientistsPlatform for Data Scientists
Platform for Data Scientists
 
Big Data Business Transformation - Big Picture and Blueprints
Big Data Business Transformation - Big Picture and BlueprintsBig Data Business Transformation - Big Picture and Blueprints
Big Data Business Transformation - Big Picture and Blueprints
 
Enabling Next Gen Analytics with Azure Data Lake and StreamSets
Enabling Next Gen Analytics with Azure Data Lake and StreamSetsEnabling Next Gen Analytics with Azure Data Lake and StreamSets
Enabling Next Gen Analytics with Azure Data Lake and StreamSets
 
Building Custom Big Data Integrations
Building Custom Big Data IntegrationsBuilding Custom Big Data Integrations
Building Custom Big Data Integrations
 
Billions of Rows, Millions of Insights, Right Now
Billions of Rows, Millions of Insights, Right NowBillions of Rows, Millions of Insights, Right Now
Billions of Rows, Millions of Insights, Right Now
 
The "Big Data" Ecosystem at LinkedIn
The "Big Data" Ecosystem at LinkedInThe "Big Data" Ecosystem at LinkedIn
The "Big Data" Ecosystem at LinkedIn
 
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
 
Quantopix analytics system (qas)
Quantopix analytics system (qas)Quantopix analytics system (qas)
Quantopix analytics system (qas)
 
Dealing with Drift: Building an Enterprise Data Lake
Dealing with Drift: Building an Enterprise Data LakeDealing with Drift: Building an Enterprise Data Lake
Dealing with Drift: Building an Enterprise Data Lake
 
The Future of Data Science and Machine Learning at Scale: A Look at MLflow, D...
The Future of Data Science and Machine Learning at Scale: A Look at MLflow, D...The Future of Data Science and Machine Learning at Scale: A Look at MLflow, D...
The Future of Data Science and Machine Learning at Scale: A Look at MLflow, D...
 
Real time data ingestion and Hybrid Cloud
Real time data ingestion and Hybrid CloudReal time data ingestion and Hybrid Cloud
Real time data ingestion and Hybrid Cloud
 
Cloud Computing for Small & Medium Businesses
Cloud Computing for Small & Medium BusinessesCloud Computing for Small & Medium Businesses
Cloud Computing for Small & Medium Businesses
 
Spark Summit Keynote by Suren Nathan
Spark Summit Keynote by Suren NathanSpark Summit Keynote by Suren Nathan
Spark Summit Keynote by Suren Nathan
 
Optimize Data for the Logical Data Warehouse
Optimize Data for the Logical Data WarehouseOptimize Data for the Logical Data Warehouse
Optimize Data for the Logical Data Warehouse
 
Data Management on Hadoop at Yahoo!
Data Management on Hadoop at Yahoo!Data Management on Hadoop at Yahoo!
Data Management on Hadoop at Yahoo!
 
Which data should you move to Hadoop?
Which data should you move to Hadoop?Which data should you move to Hadoop?
Which data should you move to Hadoop?
 
O'Reilly ebook: Operationalizing the Data Lake
O'Reilly ebook: Operationalizing the Data LakeO'Reilly ebook: Operationalizing the Data Lake
O'Reilly ebook: Operationalizing the Data Lake
 
Hadoop mapreduce and yarn frame work- unit5
Hadoop mapreduce and yarn frame work-  unit5Hadoop mapreduce and yarn frame work-  unit5
Hadoop mapreduce and yarn frame work- unit5
 
Tableau @ Spil Games
Tableau @ Spil GamesTableau @ Spil Games
Tableau @ Spil Games
 
Data Mover for Hadoop | Diyotta
Data Mover for Hadoop | DiyottaData Mover for Hadoop | Diyotta
Data Mover for Hadoop | Diyotta
 

Viewers also liked

Top Agile Metrics
Top Agile MetricsTop Agile Metrics
Top Agile MetricsXBOSoft
 
Ast 0060878 wayne-eckerson_research_report_big_data_analytics
Ast 0060878 wayne-eckerson_research_report_big_data_analyticsAst 0060878 wayne-eckerson_research_report_big_data_analytics
Ast 0060878 wayne-eckerson_research_report_big_data_analyticsAccenture
 
Innovation Diffusion: a (Big) Data-driven approach to the study of the geogra...
Innovation Diffusion: a (Big) Data-driven approach to the study of the geogra...Innovation Diffusion: a (Big) Data-driven approach to the study of the geogra...
Innovation Diffusion: a (Big) Data-driven approach to the study of the geogra...Enrico Palumbo
 
Big Data Analytics: Architectural Perspective
Big Data Analytics: Architectural PerspectiveBig Data Analytics: Architectural Perspective
Big Data Analytics: Architectural PerspectiveSumit Kalra
 
Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...
Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...
Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...Brian O'Neill
 
Distilling Hadoop Patterns of Use and How You Can Use Them for Your Big Data ...
Distilling Hadoop Patterns of Use and How You Can Use Them for Your Big Data ...Distilling Hadoop Patterns of Use and How You Can Use Them for Your Big Data ...
Distilling Hadoop Patterns of Use and How You Can Use Them for Your Big Data ...Hortonworks
 
Real time data analytics - part 1 - backend infrastructure
Real time data analytics - part 1 - backend infrastructureReal time data analytics - part 1 - backend infrastructure
Real time data analytics - part 1 - backend infrastructureAmazon Web Services
 
PARTNERS 2013 - Dr. Stefan Schwarz - Big Data Analytics as a Service
PARTNERS 2013 - Dr. Stefan Schwarz - Big Data Analytics as a Service PARTNERS 2013 - Dr. Stefan Schwarz - Big Data Analytics as a Service
PARTNERS 2013 - Dr. Stefan Schwarz - Big Data Analytics as a Service Stefan Schwarz
 
Speed layer : Real time views in LAMBDA architecture
Speed layer : Real time views in LAMBDA architecture Speed layer : Real time views in LAMBDA architecture
Speed layer : Real time views in LAMBDA architecture Tin Ho
 
Real time analytics @ netflix
Real time analytics @ netflixReal time analytics @ netflix
Real time analytics @ netflixCody Rioux
 
Big Data Analytics for Real Time Systems
Big Data Analytics for Real Time SystemsBig Data Analytics for Real Time Systems
Big Data Analytics for Real Time SystemsKamalika Dutta
 
Agile data science
Agile data scienceAgile data science
Agile data scienceJoel Horwitz
 
A technical Introduction to Big Data Analytics
A technical Introduction to Big Data AnalyticsA technical Introduction to Big Data Analytics
A technical Introduction to Big Data AnalyticsPethuru Raj PhD
 
Agile Big Data Analytics Development: An Architecture-Centric Approach
Agile Big Data Analytics Development: An Architecture-Centric ApproachAgile Big Data Analytics Development: An Architecture-Centric Approach
Agile Big Data Analytics Development: An Architecture-Centric ApproachSoftServe
 
The Modern Data Architecture for Predictive Analytics with Hortonworks and Re...
The Modern Data Architecture for Predictive Analytics with Hortonworks and Re...The Modern Data Architecture for Predictive Analytics with Hortonworks and Re...
The Modern Data Architecture for Predictive Analytics with Hortonworks and Re...Revolution Analytics
 
Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtwo...
Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtwo...Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtwo...
Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtwo...Thoughtworks
 
Building Big Data Analytics Center Of Excellence
Building Big Data Analytics Center Of Excellence Building Big Data Analytics Center Of Excellence
Building Big Data Analytics Center Of Excellence Dr. Mohan K. Bavirisetty
 
Business Process Maturity and Centers of Excellence
Business Process Maturity and Centers of ExcellenceBusiness Process Maturity and Centers of Excellence
Business Process Maturity and Centers of ExcellenceSandy Kemsley
 
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...SoftServe
 

Viewers also liked (20)

Top Agile Metrics
Top Agile MetricsTop Agile Metrics
Top Agile Metrics
 
Ast 0060878 wayne-eckerson_research_report_big_data_analytics
Ast 0060878 wayne-eckerson_research_report_big_data_analyticsAst 0060878 wayne-eckerson_research_report_big_data_analytics
Ast 0060878 wayne-eckerson_research_report_big_data_analytics
 
Innovation Diffusion: a (Big) Data-driven approach to the study of the geogra...
Innovation Diffusion: a (Big) Data-driven approach to the study of the geogra...Innovation Diffusion: a (Big) Data-driven approach to the study of the geogra...
Innovation Diffusion: a (Big) Data-driven approach to the study of the geogra...
 
Big Data Analytics: Architectural Perspective
Big Data Analytics: Architectural PerspectiveBig Data Analytics: Architectural Perspective
Big Data Analytics: Architectural Perspective
 
Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...
Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...
Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...
 
Distilling Hadoop Patterns of Use and How You Can Use Them for Your Big Data ...
Distilling Hadoop Patterns of Use and How You Can Use Them for Your Big Data ...Distilling Hadoop Patterns of Use and How You Can Use Them for Your Big Data ...
Distilling Hadoop Patterns of Use and How You Can Use Them for Your Big Data ...
 
Real time data analytics - part 1 - backend infrastructure
Real time data analytics - part 1 - backend infrastructureReal time data analytics - part 1 - backend infrastructure
Real time data analytics - part 1 - backend infrastructure
 
PARTNERS 2013 - Dr. Stefan Schwarz - Big Data Analytics as a Service
PARTNERS 2013 - Dr. Stefan Schwarz - Big Data Analytics as a Service PARTNERS 2013 - Dr. Stefan Schwarz - Big Data Analytics as a Service
PARTNERS 2013 - Dr. Stefan Schwarz - Big Data Analytics as a Service
 
Speed layer : Real time views in LAMBDA architecture
Speed layer : Real time views in LAMBDA architecture Speed layer : Real time views in LAMBDA architecture
Speed layer : Real time views in LAMBDA architecture
 
Real time analytics @ netflix
Real time analytics @ netflixReal time analytics @ netflix
Real time analytics @ netflix
 
Big Data Analytics for Real Time Systems
Big Data Analytics for Real Time SystemsBig Data Analytics for Real Time Systems
Big Data Analytics for Real Time Systems
 
Agile data science
Agile data scienceAgile data science
Agile data science
 
A technical Introduction to Big Data Analytics
A technical Introduction to Big Data AnalyticsA technical Introduction to Big Data Analytics
A technical Introduction to Big Data Analytics
 
Agile Big Data Analytics Development: An Architecture-Centric Approach
Agile Big Data Analytics Development: An Architecture-Centric ApproachAgile Big Data Analytics Development: An Architecture-Centric Approach
Agile Big Data Analytics Development: An Architecture-Centric Approach
 
The Modern Data Architecture for Predictive Analytics with Hortonworks and Re...
The Modern Data Architecture for Predictive Analytics with Hortonworks and Re...The Modern Data Architecture for Predictive Analytics with Hortonworks and Re...
The Modern Data Architecture for Predictive Analytics with Hortonworks and Re...
 
Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtwo...
Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtwo...Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtwo...
Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtwo...
 
Building Big Data Analytics Center Of Excellence
Building Big Data Analytics Center Of Excellence Building Big Data Analytics Center Of Excellence
Building Big Data Analytics Center Of Excellence
 
Business Process Maturity and Centers of Excellence
Business Process Maturity and Centers of ExcellenceBusiness Process Maturity and Centers of Excellence
Business Process Maturity and Centers of Excellence
 
Big Data Architectural Patterns
Big Data Architectural PatternsBig Data Architectural Patterns
Big Data Architectural Patterns
 
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...
 

Similar to A big-data architecture for real-time analytics

Analysis of SOFTWARE DEFINED STORAGE (SDS)
Analysis of SOFTWARE DEFINED STORAGE (SDS)Analysis of SOFTWARE DEFINED STORAGE (SDS)
Analysis of SOFTWARE DEFINED STORAGE (SDS)Kaushik Rajan
 
Cloud Computing Ambiance using Secluded Access Control Method
Cloud Computing Ambiance using Secluded Access Control MethodCloud Computing Ambiance using Secluded Access Control Method
Cloud Computing Ambiance using Secluded Access Control MethodIRJET Journal
 
Resilient Distributed DataSets - Apache SPARK
Resilient Distributed DataSets - Apache SPARKResilient Distributed DataSets - Apache SPARK
Resilient Distributed DataSets - Apache SPARKTaposh Roy
 
Data Partitioning in Mongo DB with Cloud
Data Partitioning in Mongo DB with CloudData Partitioning in Mongo DB with Cloud
Data Partitioning in Mongo DB with CloudIJAAS Team
 
A DDS-Based Scalable and Reconfigurable Framework for Cyber-Physical Systems
A DDS-Based Scalable and Reconfigurable Framework for Cyber-Physical SystemsA DDS-Based Scalable and Reconfigurable Framework for Cyber-Physical Systems
A DDS-Based Scalable and Reconfigurable Framework for Cyber-Physical Systemsijseajournal
 
access.2021.3077680.pdf
access.2021.3077680.pdfaccess.2021.3077680.pdf
access.2021.3077680.pdfneju3
 
An Efficient and Fault Tolerant Data Replica Placement Technique for Cloud ba...
An Efficient and Fault Tolerant Data Replica Placement Technique for Cloud ba...An Efficient and Fault Tolerant Data Replica Placement Technique for Cloud ba...
An Efficient and Fault Tolerant Data Replica Placement Technique for Cloud ba...IJCSIS Research Publications
 
What is Scalability and How can affect on overall system performance of database
What is Scalability and How can affect on overall system performance of databaseWhat is Scalability and How can affect on overall system performance of database
What is Scalability and How can affect on overall system performance of databaseAlireza Kamrani
 
Storage Virtualization: Towards an Efficient and Scalable Framework
Storage Virtualization: Towards an Efficient and Scalable FrameworkStorage Virtualization: Towards an Efficient and Scalable Framework
Storage Virtualization: Towards an Efficient and Scalable FrameworkCSCJournals
 
An Energy Efficient Data Transmission and Aggregation of WSN using Data Proce...
An Energy Efficient Data Transmission and Aggregation of WSN using Data Proce...An Energy Efficient Data Transmission and Aggregation of WSN using Data Proce...
An Energy Efficient Data Transmission and Aggregation of WSN using Data Proce...IRJET Journal
 
A Study on Replication and Failover Cluster to Maximize System Uptime
A Study on Replication and Failover Cluster to Maximize System UptimeA Study on Replication and Failover Cluster to Maximize System Uptime
A Study on Replication and Failover Cluster to Maximize System UptimeYogeshIJTSRD
 
Iaetsd mapreduce streaming over cassandra datasets
Iaetsd mapreduce streaming over cassandra datasetsIaetsd mapreduce streaming over cassandra datasets
Iaetsd mapreduce streaming over cassandra datasetsIaetsd Iaetsd
 
Big Data: RDBMS vs. Hadoop vs. Spark
Big Data: RDBMS vs. Hadoop vs. SparkBig Data: RDBMS vs. Hadoop vs. Spark
Big Data: RDBMS vs. Hadoop vs. SparkGraisy Biswal
 
Dataintensive
DataintensiveDataintensive
Dataintensivesulfath
 

Similar to A big-data architecture for real-time analytics (20)

Analysis of SOFTWARE DEFINED STORAGE (SDS)
Analysis of SOFTWARE DEFINED STORAGE (SDS)Analysis of SOFTWARE DEFINED STORAGE (SDS)
Analysis of SOFTWARE DEFINED STORAGE (SDS)
 
HadoopDB in Action
HadoopDB in ActionHadoopDB in Action
HadoopDB in Action
 
Cloud Computing Ambiance using Secluded Access Control Method
Cloud Computing Ambiance using Secluded Access Control MethodCloud Computing Ambiance using Secluded Access Control Method
Cloud Computing Ambiance using Secluded Access Control Method
 
Resilient Distributed DataSets - Apache SPARK
Resilient Distributed DataSets - Apache SPARKResilient Distributed DataSets - Apache SPARK
Resilient Distributed DataSets - Apache SPARK
 
Facade
FacadeFacade
Facade
 
Data Partitioning in Mongo DB with Cloud
Data Partitioning in Mongo DB with CloudData Partitioning in Mongo DB with Cloud
Data Partitioning in Mongo DB with Cloud
 
p1365-fernandes
p1365-fernandesp1365-fernandes
p1365-fernandes
 
A DDS-Based Scalable and Reconfigurable Framework for Cyber-Physical Systems
A DDS-Based Scalable and Reconfigurable Framework for Cyber-Physical SystemsA DDS-Based Scalable and Reconfigurable Framework for Cyber-Physical Systems
A DDS-Based Scalable and Reconfigurable Framework for Cyber-Physical Systems
 
access.2021.3077680.pdf
access.2021.3077680.pdfaccess.2021.3077680.pdf
access.2021.3077680.pdf
 
An Efficient and Fault Tolerant Data Replica Placement Technique for Cloud ba...
An Efficient and Fault Tolerant Data Replica Placement Technique for Cloud ba...An Efficient and Fault Tolerant Data Replica Placement Technique for Cloud ba...
An Efficient and Fault Tolerant Data Replica Placement Technique for Cloud ba...
 
What is Scalability and How can affect on overall system performance of database
What is Scalability and How can affect on overall system performance of databaseWhat is Scalability and How can affect on overall system performance of database
What is Scalability and How can affect on overall system performance of database
 
Storage Virtualization: Towards an Efficient and Scalable Framework
Storage Virtualization: Towards an Efficient and Scalable FrameworkStorage Virtualization: Towards an Efficient and Scalable Framework
Storage Virtualization: Towards an Efficient and Scalable Framework
 
An Energy Efficient Data Transmission and Aggregation of WSN using Data Proce...
An Energy Efficient Data Transmission and Aggregation of WSN using Data Proce...An Energy Efficient Data Transmission and Aggregation of WSN using Data Proce...
An Energy Efficient Data Transmission and Aggregation of WSN using Data Proce...
 
A Study on Replication and Failover Cluster to Maximize System Uptime
A Study on Replication and Failover Cluster to Maximize System UptimeA Study on Replication and Failover Cluster to Maximize System Uptime
A Study on Replication and Failover Cluster to Maximize System Uptime
 
Iaetsd mapreduce streaming over cassandra datasets
Iaetsd mapreduce streaming over cassandra datasetsIaetsd mapreduce streaming over cassandra datasets
Iaetsd mapreduce streaming over cassandra datasets
 
Big Data: RDBMS vs. Hadoop vs. Spark
Big Data: RDBMS vs. Hadoop vs. SparkBig Data: RDBMS vs. Hadoop vs. Spark
Big Data: RDBMS vs. Hadoop vs. Spark
 
Sdn in big data
Sdn in big dataSdn in big data
Sdn in big data
 
Dataintensive
DataintensiveDataintensive
Dataintensive
 
S18 das
S18 dasS18 das
S18 das
 
YugabyteDB_TVA-Datastax.pdf
YugabyteDB_TVA-Datastax.pdfYugabyteDB_TVA-Datastax.pdf
YugabyteDB_TVA-Datastax.pdf
 

Recently uploaded

call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPCeline George
 
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...Postal Advocate Inc.
 
Science 7 Quarter 4 Module 2: Natural Resources.pptx
Science 7 Quarter 4 Module 2: Natural Resources.pptxScience 7 Quarter 4 Module 2: Natural Resources.pptx
Science 7 Quarter 4 Module 2: Natural Resources.pptxMaryGraceBautista27
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfTechSoup
 
Gas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxGas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxDr.Ibrahim Hassaan
 
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONTHEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONHumphrey A Beña
 
Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Celine George
 
How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17Celine George
 
Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Celine George
 
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdfAMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdfphamnguyenenglishnb
 
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSGRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSJoshuaGantuangco2
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxiammrhaywood
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersSabitha Banu
 
ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfSpandanaRallapalli
 
ENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choomENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choomnelietumpap1
 

Recently uploaded (20)

call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
 
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptxLEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERP
 
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
 
Science 7 Quarter 4 Module 2: Natural Resources.pptx
Science 7 Quarter 4 Module 2: Natural Resources.pptxScience 7 Quarter 4 Module 2: Natural Resources.pptx
Science 7 Quarter 4 Module 2: Natural Resources.pptx
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
 
Gas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxGas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptx
 
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONTHEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
 
Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17
 
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
 
How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17
 
Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17
 
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdfAMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
 
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSGRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginners
 
OS-operating systems- ch04 (Threads) ...
OS-operating systems- ch04 (Threads) ...OS-operating systems- ch04 (Threads) ...
OS-operating systems- ch04 (Threads) ...
 
ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdf
 
ENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choomENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choom
 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 

A big-data architecture for real-time analytics

  • 1. Tao Zhong  Kshitij A. Doshi  Xi Tang  Ting Lou  Zhongyan Lu  Hong Li Presented by: Raminder Kaur Wayne State University
  • 2.  Introduction  Motivation and Background  Architecture  Framework  Result  Future work  Conclusion  Index term  References Wayne State University
  • 3. This paper describes:  a few key additional requirements that result from having to support in-memory processing of data while updates proceed concurrently.  RAF  Two RAF based solutions (discussed further) Wayne State University
  • 4. A few examples of information in motion that may just be seconds old, and not yet well categorized or linked to other data: - GPS-based navigation : to reduce wasted energy, accidents, delays and emergencies. - A credit card company : to detect and intercept suspicious transactions - A metropolitan or regional power grid : to modulate power generation, perform load-balancing, direct repair actions, and take policy enforcement steps  An essential feature in the above examples is the need to integrate new transactions into analysis results within a very short time—sometimes as short as a few tens of milliseconds. Wayne State University
  • 5. RDD makes in-memory solutions less failure prone. So RAF enhances RDD approach so that resiliency is blended with a few additional characteristics as listed below: • Efficient allocation and control of memory resources • Resilient update of information at much finer resolution • Flexible and highly efficient concurrency control • Replication and partitioning of data transparent to clients Architecturally RAF elevates memory across an entire cluster to a first class storage entity and defines high level mechanisms by which applications on RAF can orchestrate distributed actions upon objects stored in cluster memory. To promote responsible and transparent use of memory, RAF opts to use a programming language such as C, C++, over mixed language environments in which garbage allocation is opaque. Wayne State University
  • 6. Data has a lot of value when mined. As data continues to compound at brisk rates, institutions need to grapple with two broad demands –  accumulating, processing, synopsizing and utilizing information in a timely manner  storing the refined data resiliently  keeping the data accessible at high speed. The term Big Data itself is elastic and serves well as a description of the scale or volume of these solutions, but does not define a constraining principle for organizing storage . Wayne State University
  • 7. Requirements for low-latency and high throughput analytics on datasets:  In-memory structures and storage  Resiliency  Sharing data through memory  Uniform interaction with storage  Minimizing memory recycling  Efficient integration of CRUD  Synchronizing efficiently  Searching Efficiently Wayne State University
  • 8.
  • 9. Translation of eight requirements into five design elements:  C and C++ based programming for efficient sharing of data through memory  Resilient storing of new content  Efficient concurrency  Processing information in motion  Fast, general, ad-hoc searches Wayne State University
  • 10.  This framework targets the execution of complex queries at very low latency.  Information upon which queries operate may be available on some storage medium, or generated dynamically as a result of ongoing transactional activities.  RAF provides distributed computing environment which is integrated with memory-centric, distributed storage system where one application can pass the data to another in order to share data in memory Wayne State University
  • 11.  RDD: used to store information in memory of one or more machines to assure that in case of failure of one or more machines, the RDD can be reconstructed.  Transformations: operation on RDD to generate new data sets. RAF transformations are join, map, union, etc.  Filter: a particular type of transformation. Produces a dataset whose contents satisfy a specified condition.  Delegate: It is a bridged module. Purpose of delegate is to create a version of datastore at a particular time and present it as memory resident RDD. Wayne State University
  • 12.
  • 13.  Efficient storage sharing using DELEGATE  Memory-centric storage operation -Reliability  Data and storage types -Structured data -Storage types (Replicated store and Partitioned store)  Distributed Execution of Analytics tasks -Analytics tasks interface Wayne State University
  • 14.
  • 15.
  • 16.  Unit Testing: -Scalability testing results (how well update operations scale) -Latency relative to Hive/HDFS (how long does it take to complete a query) NOTE: These unit test results show advantage of in-memory distributed processing oriented design of RAF.  Solution-level implementation and testing -Telecommunications subscriber Management -Safe City Solution
  • 19.  Motivated by the high degree of familiarity that many developers have with database interfaces, we are incrementally introducing SQL- 92/JDBC/ODBC like interfaces on top of RAF. A number of optimizations are also being added. These optimizations include:  application requested indexing, to accelerate searches  blending in column-store capabilities where appropriate (for example, for rarely-written data)  compression, in order to reduce data transported between nodes. Wayne State University
  • 20.  Discussed RAF, an architectural approach that meshes memory-centric non-relational query processing for low latency analytics with memory- centric update processing to accommodate high volumes of updates.  Delegate, which participates as a special type of content transformer in a hierarchy of RDD transformations.  In RAF, protocol buffers are used to obtain data abstraction and efficient conveyance among applications, providing applications with a high degree of independence in location, representation, and transmission of data.  A light-weight but expressive interface for RAF  Using unit tests we show high cluster scaling capability for transactions, an order of magnitude latency improvement for query processing.  Discussed two real-world usage scenarios in which RAF is being used. Wayne State University
  • 21.  RDD: Resilient distributed dataset  RAF: Real-time Analytics Foundation  CRUD : Create/Retrieve/Update/Delete  HDFS: Hadoop Distributed File System
  • 22.  Apache Hadoop: http://hadoop.apache.org/  Apache HBase: http://hbase.apache.org/  Memcached: http://www.memcached.org/  Oracle Coherence: http://www.oracle.com/technetwork/ middle ware/ coherence/  H. Plattner, A. Zeier, In-Memory Data Management.  Protobuf: http://code.google.com/p/protobuf/  Redis: http://www.redis.io/  SQLStream: http://www.sqlstream.com/  Vertica: http://www.vertica.com/  VoltDB: http://www.voltdb.com Wayne State University