Tao Zhong
 Kshitij A. Doshi
 Xi Tang
 Ting Lou
 Zhongyan Lu
 Hong Li
Presented by: Raminder Kaur
Wayne State University
 Introduction
 Motivation and Background
 Architecture
 Framework
 Result
 Future work
 Conclusion
 Index term
 References
This paper describes:
 a few key additional requirements that arise from having to support
in-memory processing of data while updates proceed concurrently
 RAF (Real-time Analytics Foundation), an architectural approach that
addresses those requirements
 two RAF-based solutions (discussed further)
A few examples of information in motion that may just be seconds old, and
not yet well categorized or linked to other data:
- GPS-based navigation: to reduce wasted energy, accidents, delays and
emergencies.
- A credit card company: to detect and intercept suspicious transactions.
- A metropolitan or regional power grid: to modulate power generation,
perform load-balancing, direct repair actions, and take policy enforcement
steps.
 An essential feature in the above examples is the need to integrate new
transactions into analysis results within a very short time—sometimes as
short as a few tens of milliseconds.
RDDs (resilient distributed datasets) make in-memory solutions less failure
prone. RAF extends the RDD approach so that resiliency is blended with a few
additional characteristics, listed below:
• Efficient allocation and control of memory resources
• Resilient update of information at much finer resolution
• Flexible and highly efficient concurrency control
• Replication and partitioning of data transparent to clients
Architecturally, RAF elevates memory across an entire cluster to a first-class
storage entity and defines high-level mechanisms by which applications on RAF
can orchestrate distributed actions upon objects stored in cluster memory.
To promote responsible and transparent use of memory, RAF opts for a
programming language such as C or C++ over mixed-language environments in
which memory allocation and garbage collection are opaque.
Data has a lot of value when mined. As data continues to compound at brisk
rates, institutions need to grapple with two broad demands:
 accumulating, processing, synopsizing and utilizing information in a
timely manner
 storing the refined data resiliently while keeping it accessible at high
speed
The term Big Data itself is elastic and serves well as a description of the scale
or volume of these solutions, but it does not define a constraining principle for
organizing storage.
Requirements for low-latency, high-throughput analytics on
datasets:
 In-memory structures and storage
 Resiliency
 Sharing data through memory
 Uniform interaction with storage
 Minimizing memory recycling
 Efficient integration of CRUD
 Synchronizing efficiently
 Searching efficiently
The eight requirements translate into five design elements:
 C and C++ based programming for efficient sharing of data
through memory
 Resilient storing of new content
 Efficient concurrency
 Processing information in motion
 Fast, general, ad-hoc searches
 This framework targets the execution of complex queries at
very low latency.
 The information upon which queries operate may already be available
on some storage medium, or may be generated dynamically as a result
of ongoing transactional activities.
 RAF provides a distributed computing environment that is
integrated with a memory-centric, distributed storage system,
in which one application can pass data to another by sharing
it in memory.
 RDD: stores information in the memory of one or more machines in such a
way that, if one or more machines fail, the RDD can be reconstructed.
 Transformations: operations on an RDD that generate new datasets. RAF
transformations include join, map, and union.
 Filter: a particular type of transformation that produces a dataset whose
contents satisfy a specified condition.
 Delegate: a bridging module whose purpose is to create a version of the
datastore as of a particular time and present it as a memory-resident RDD.
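The four terms above can be illustrated with a minimal Python sketch. It is not RAF's API (RAF itself targets C and C++); the names SketchRDD and SketchDelegate, the lineage list, and the sample records are all hypothetical and exist only to show the idea of an immutable dataset that can be rebuilt from its derivation chain, and of a delegate that freezes a changing datastore at one point in time.

```python
from copy import deepcopy


class SketchRDD:
    """Toy immutable dataset that remembers how it was derived (its lineage)."""

    def __init__(self, compute, lineage):
        self._compute = compute      # zero-argument function that rebuilds the rows
        self.lineage = lineage       # human-readable derivation chain

    def collect(self):
        # Rows are recomputed from the lineage on demand, which is what lets
        # a lost in-memory copy be reconstructed after a failure.
        return list(self._compute())

    def map(self, fn):
        return SketchRDD(lambda: (fn(r) for r in self._compute()),
                         self.lineage + ["map"])

    def filter(self, pred):
        return SketchRDD(lambda: (r for r in self._compute() if pred(r)),
                         self.lineage + ["filter"])

    def union(self, other):
        return SketchRDD(lambda: list(self._compute()) + list(other._compute()),
                         self.lineage + ["union"])


class SketchDelegate:
    """Hypothetical bridge: freezes a mutable datastore at one point in time."""

    def __init__(self, datastore):
        self._datastore = datastore  # dict that transactions keep updating

    def snapshot(self):
        frozen = deepcopy(list(self._datastore.values()))
        return SketchRDD(lambda: frozen, ["delegate-snapshot"])


store = {1: {"user": "a", "amount": 20}, 2: {"user": "b", "amount": 950}}
rdd = SketchDelegate(store).snapshot()
store[3] = {"user": "c", "amount": 5000}   # later update, not seen by the snapshot

suspicious = rdd.filter(lambda r: r["amount"] > 500).map(lambda r: r["user"])
result = suspicious.collect()
```

In this sketch the query runs against the frozen snapshot, so the update made after snapshot() does not affect the result, while the datastore itself stays free to accept new transactions.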
 Efficient storage sharing using Delegate
 Memory-centric storage operation
- Reliability
 Data and storage types
- Structured data
- Storage types (replicated store and partitioned store)
 Distributed execution of analytics tasks
- Analytics tasks interface
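The difference between the two storage types can be sketched as follows. This is an illustration under stated assumptions, not RAF code: the class names, the use of in-process dicts to stand in for cluster nodes, and the CRC32-based key routing are all hypothetical choices made for the example.

```python
import zlib


class PartitionedStore:
    """Each key lives on exactly one node, chosen by hashing the key."""

    def __init__(self, node_count):
        self.nodes = [dict() for _ in range(node_count)]

    def _node_for(self, key):
        # crc32 gives a hash that is stable across runs, unlike built-in hash().
        return zlib.crc32(str(key).encode("utf-8")) % len(self.nodes)

    def put(self, key, value):
        self.nodes[self._node_for(key)][key] = value

    def get(self, key):
        return self.nodes[self._node_for(key)].get(key)


class ReplicatedStore:
    """Every node holds a full copy, so any node can serve a read."""

    def __init__(self, node_count):
        self.nodes = [dict() for _ in range(node_count)]

    def put(self, key, value):
        for node in self.nodes:
            node[key] = value

    def get(self, key, node_index=0):
        return self.nodes[node_index].get(key)


partitioned = PartitionedStore(3)
for i in range(100):
    partitioned.put(f"subscriber-{i}", i)

replicated = ReplicatedStore(3)
replicated.put("tariff-table", {"peak": 0.2})
```

In both cases the caller only uses put and get; where the data physically lives is hidden behind the store, which is the sense in which replication and partitioning can be kept transparent to clients.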
 Unit testing:
- Scalability testing results (how well update operations scale)
- Latency relative to Hive/HDFS (how long it takes to
complete a query)
NOTE: These unit test results show the advantage of RAF's design,
which is oriented toward in-memory distributed processing.
 Solution-level implementation and testing
- Telecommunications subscriber management
- Safe City solution
 Motivated by the high degree of familiarity that many developers have
with database interfaces, we are incrementally introducing
SQL-92/JDBC/ODBC-like interfaces on top of RAF. A number of optimizations
are also being added.
These optimizations include:
 application requested indexing, to accelerate searches
 blending in column-store capabilities where appropriate (for example, for
rarely-written data)
 compression, in order to reduce data transported between nodes.
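Two of these optimizations, application-requested indexing and compression before transport, can be illustrated with a short Python sketch. The function names, the JSON encoding, and the sample rows are assumptions made for the example; the slides do not say how RAF implements either optimization.

```python
import json
import zlib
from collections import defaultdict


def build_index(rows, field):
    """Map each value of `field` to the rows that carry it."""
    index = defaultdict(list)
    for row in rows:
        index[row[field]].append(row)
    return index


def pack_for_transport(rows):
    """Serialize and compress rows before sending them to another node."""
    return zlib.compress(json.dumps(rows).encode("utf-8"))


def unpack_from_transport(payload):
    """Reverse pack_for_transport on the receiving node."""
    return json.loads(zlib.decompress(payload).decode("utf-8"))


rows = [{"cell": "A", "subscriber": i} for i in range(500)] + \
       [{"cell": "B", "subscriber": i} for i in range(500, 520)]

by_cell = build_index(rows, "cell")
cell_b = by_cell["B"]            # direct lookup instead of scanning all rows
payload = pack_for_transport(rows)
```

The index trades extra memory for lookups that avoid a full scan, and the compressed payload trades CPU time for fewer bytes sent between nodes; whether either trade pays off depends on the workload.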
 Discussed RAF, an architectural approach that meshes memory-centric
non-relational query processing for low-latency analytics with
memory-centric update processing to accommodate high volumes of updates.
 Introduced Delegate, which participates as a special type of content
transformer in a hierarchy of RDD transformations.
 In RAF, protocol buffers are used for data abstraction and efficient
conveyance among applications, giving applications a high degree
of independence in the location, representation, and transmission of data.
 Presented a lightweight but expressive interface for RAF.
 Unit tests show high cluster scaling capability for transactions and an
order-of-magnitude latency improvement for query processing.
 Discussed two real-world usage scenarios in which RAF is being used.
 RDD: Resilient distributed dataset
 RAF: Real-time Analytics Foundation
 CRUD: Create/Retrieve/Update/Delete
 HDFS: Hadoop Distributed File System
 Apache Hadoop: http://hadoop.apache.org/
 Apache HBase: http://hbase.apache.org/
 Memcached: http://www.memcached.org/
 Oracle Coherence: http://www.oracle.com/technetwork/middleware/coherence/
 H. Plattner, A. Zeier, In-Memory Data Management.
 Protobuf: http://code.google.com/p/protobuf/
 Redis: http://www.redis.io/
 SQLStream: http://www.sqlstream.com/
 Vertica: http://www.vertica.com/
 VoltDB: http://www.voltdb.com
Thanks !!!

Overview on Edible Vaccine: Pros & Cons with Mechanism
DeeptiGupta154
 
678020731-Sumas-y-Restas-Para-Colorear.pdf
678020731-Sumas-y-Restas-Para-Colorear.pdf678020731-Sumas-y-Restas-Para-Colorear.pdf
678020731-Sumas-y-Restas-Para-Colorear.pdf
CarlosHernanMontoyab2
 
Chapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptxChapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptx
Mohd Adib Abd Muin, Senior Lecturer at Universiti Utara Malaysia
 
How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...
Jisc
 
Additional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdfAdditional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdf
joachimlavalley1
 
Digital Tools and AI for Teaching Learning and Research
Digital Tools and AI for Teaching Learning and ResearchDigital Tools and AI for Teaching Learning and Research
Digital Tools and AI for Teaching Learning and Research
Vikramjit Singh
 
Embracing GenAI - A Strategic Imperative
Embracing GenAI - A Strategic ImperativeEmbracing GenAI - A Strategic Imperative
Embracing GenAI - A Strategic Imperative
Peter Windle
 
The Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdfThe Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdf
kaushalkr1407
 

Recently uploaded (20)

June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
 
special B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdfspecial B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdf
 
Operation Blue Star - Saka Neela Tara
Operation Blue Star   -  Saka Neela TaraOperation Blue Star   -  Saka Neela Tara
Operation Blue Star - Saka Neela Tara
 
The basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptxThe basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptx
 
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXXPhrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
 
Model Attribute Check Company Auto Property
Model Attribute  Check Company Auto PropertyModel Attribute  Check Company Auto Property
Model Attribute Check Company Auto Property
 
Polish students' mobility in the Czech Republic
Polish students' mobility in the Czech RepublicPolish students' mobility in the Czech Republic
Polish students' mobility in the Czech Republic
 
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup   New Member Orientation and Q&A (May 2024).pdfWelcome to TechSoup   New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
 
2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...
 
Home assignment II on Spectroscopy 2024 Answers.pdf
Home assignment II on Spectroscopy 2024 Answers.pdfHome assignment II on Spectroscopy 2024 Answers.pdf
Home assignment II on Spectroscopy 2024 Answers.pdf
 
1.4 modern child centered education - mahatma gandhi-2.pptx
1.4 modern child centered education - mahatma gandhi-2.pptx1.4 modern child centered education - mahatma gandhi-2.pptx
1.4 modern child centered education - mahatma gandhi-2.pptx
 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
 
Overview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with MechanismOverview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with Mechanism
 
678020731-Sumas-y-Restas-Para-Colorear.pdf
678020731-Sumas-y-Restas-Para-Colorear.pdf678020731-Sumas-y-Restas-Para-Colorear.pdf
678020731-Sumas-y-Restas-Para-Colorear.pdf
 
Chapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptxChapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptx
 
How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...
 
Additional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdfAdditional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdf
 
Digital Tools and AI for Teaching Learning and Research
Digital Tools and AI for Teaching Learning and ResearchDigital Tools and AI for Teaching Learning and Research
Digital Tools and AI for Teaching Learning and Research
 
Embracing GenAI - A Strategic Imperative
Embracing GenAI - A Strategic ImperativeEmbracing GenAI - A Strategic Imperative
Embracing GenAI - A Strategic Imperative
 
The Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdfThe Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdf
 

A big-data architecture for real-time analytics

  • 1. Tao Zhong, Kshitij A. Doshi, Xi Tang, Ting Lou, Zhongyan Lu, Hong Li. Presented by: Raminder Kaur, Wayne State University
  • 2. Outline:
      - Introduction
      - Motivation and Background
      - Architecture
      - Framework
      - Results
      - Future work
      - Conclusion
      - Index terms
      - References
  • 3. This paper describes:
      - a few key additional requirements that result from having to support in-memory processing of data while updates proceed concurrently
      - RAF (Real-time Analytics Foundation)
      - two RAF-based solutions (discussed further)
  • 4. A few examples of information in motion that may be just seconds old, and not yet well categorized or linked to other data:
      - GPS-based navigation: to reduce wasted energy, accidents, delays and emergencies
      - A credit card company: to detect and intercept suspicious transactions
      - A metropolitan or regional power grid: to modulate power generation, perform load-balancing, direct repair actions, and take policy enforcement steps
    An essential feature of the above examples is the need to integrate new transactions into analysis results within a very short time, sometimes as short as a few tens of milliseconds.
  • 5. RDDs make in-memory solutions less failure-prone. RAF enhances the RDD approach so that resiliency is blended with a few additional characteristics:
      - Efficient allocation and control of memory resources
      - Resilient update of information at much finer resolution
      - Flexible and highly efficient concurrency control
      - Replication and partitioning of data transparent to clients
    Architecturally, RAF elevates memory across an entire cluster to a first-class storage entity and defines high-level mechanisms by which applications on RAF can orchestrate distributed actions upon objects stored in cluster memory. To promote responsible and transparent use of memory, RAF opts for a programming language such as C or C++ over mixed-language environments in which garbage collection is opaque.
  • 6. Data has a lot of value when mined. As data continues to compound at brisk rates, institutions need to grapple with two broad demands:
      - accumulating, processing, synopsizing and utilizing information in a timely manner
      - storing the refined data resiliently while keeping it accessible at high speed
    The term Big Data itself is elastic and serves well as a description of the scale or volume of these solutions, but it does not define a constraining principle for organizing storage.
  • 7. Requirements for low-latency, high-throughput analytics on datasets:
      - In-memory structures and storage
      - Resiliency
      - Sharing data through memory
      - Uniform interaction with storage
      - Minimizing memory recycling
      - Efficient integration of CRUD
      - Synchronizing efficiently
      - Searching efficiently
  • 8.
  • 9. Translation of the eight requirements into five design elements:
      - C and C++ based programming for efficient sharing of data through memory
      - Resilient storing of new content
      - Efficient concurrency
      - Processing information in motion
      - Fast, general, ad-hoc searches
  • 10. Framework:
      - This framework targets the execution of complex queries at very low latency.
      - The information upon which queries operate may be available on some storage medium, or generated dynamically as a result of ongoing transactional activities.
      - RAF provides a distributed computing environment integrated with a memory-centric, distributed storage system, in which one application can pass data to another so that the data is shared in memory.
  • 11. Key terms:
      - RDD: stores information in the memory of one or more machines in such a way that, if one or more machines fail, the RDD can be reconstructed.
      - Transformations: operations on an RDD that generate new datasets. RAF transformations include join, map, union, etc.
      - Filter: a particular type of transformation that produces a dataset whose contents satisfy a specified condition.
      - Delegate: a bridging module. Its purpose is to create a version of the datastore as of a particular time and present it as a memory-resident RDD.
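The terms above can be illustrated with a small single-process sketch. It is not RAF's interface, which the slides do not show: the class `ToyRDD`, the function `delegate`, and the method `lose_memory` are hypothetical names invented for this illustration, and a plain Python dict stands in for the datastore. The sketch only shows the general idea that each dataset keeps a recipe (its lineage) from which lost contents can be rebuilt, and that a delegate freezes the store at one point in time.

```python
from typing import Callable, Iterable, List, Optional


class ToyRDD:
    """Toy, single-process stand-in for a resilient dataset (hypothetical)."""

    def __init__(self, source: Callable[[], Iterable],
                 parent: Optional["ToyRDD"] = None):
        self._source = source          # recipe that can rebuild the contents
        self.parent = parent           # lineage link to the originating dataset
        self._cache: Optional[List] = None

    def collect(self) -> List:
        # Materialize lazily; recompute from the recipe if the cache was lost.
        if self._cache is None:
            self._cache = list(self._source())
        return list(self._cache)

    def map(self, fn) -> "ToyRDD":
        return ToyRDD(lambda: (fn(x) for x in self.collect()), parent=self)

    def filter(self, pred) -> "ToyRDD":
        # Filter is a transformation that keeps only elements satisfying pred.
        return ToyRDD(lambda: (x for x in self.collect() if pred(x)), parent=self)

    def union(self, other: "ToyRDD") -> "ToyRDD":
        return ToyRDD(lambda: self.collect() + other.collect(), parent=self)

    def lose_memory(self) -> None:
        # Simulate a failure that drops the in-memory copy of this dataset.
        self._cache = None


def delegate(datastore: dict) -> ToyRDD:
    """Freeze the datastore as of the call and expose it as a ToyRDD."""
    snapshot = sorted(datastore.items())   # copy taken at call time
    return ToyRDD(lambda: snapshot)


if __name__ == "__main__":
    store = {"a": 5, "b": 12, "c": 30}
    snap = delegate(store)
    store["d"] = 99                         # later update; not in the snapshot
    big = snap.filter(lambda kv: kv[1] > 10).map(lambda kv: kv[0])
    print(big.collect())
    big.lose_memory()                       # contents are rebuilt on next use
    print(big.collect())
```

In this sketch, an update made after `delegate` is called does not appear in the derived datasets, which mirrors the idea of queries running against a versioned view while updates proceed concurrently.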
  • 12.
  • 13. Architecture elements:
      - Efficient storage sharing using Delegate
      - Memory-centric storage operation
          - Reliability
      - Data and storage types
          - Structured data
          - Storage types (replicated store and partitioned store)
      - Distributed execution of analytics tasks
          - Analytics tasks interface
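The difference between the two storage types named above can be sketched as follows. The slides do not say how RAF places data, so everything here is an assumption made for illustration: the class names `ReplicatedStore` and `PartitionedStore`, the use of plain dicts as "nodes", and the hash-based choice of owner node.

```python
import zlib
from typing import Any, Dict, List


class ReplicatedStore:
    """Every node holds a full copy of every key (hypothetical sketch)."""

    def __init__(self, num_nodes: int):
        self.nodes: List[Dict[str, Any]] = [{} for _ in range(num_nodes)]

    def put(self, key: str, value: Any) -> None:
        for node in self.nodes:            # write goes to all replicas
            node[key] = value

    def get(self, key: str, node_id: int = 0) -> Any:
        return self.nodes[node_id][key]    # any replica can serve the read


class PartitionedStore:
    """Each key lives on exactly one node, chosen by hashing (assumption)."""

    def __init__(self, num_nodes: int):
        self.nodes: List[Dict[str, Any]] = [{} for _ in range(num_nodes)]

    def owner(self, key: str) -> int:
        # crc32 is stable across runs, unlike Python's built-in hash() for str.
        return zlib.crc32(key.encode("utf-8")) % len(self.nodes)

    def put(self, key: str, value: Any) -> None:
        self.nodes[self.owner(key)][key] = value

    def get(self, key: str) -> Any:
        return self.nodes[self.owner(key)][key]
```

A replicated store trades memory for read locality and resilience, while a partitioned store spreads capacity and update load across nodes; the client-facing `put`/`get` calls look the same in both cases, which is one way to read "replication and partitioning of data transparent to clients".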
  • 14.
  • 15.
  • 16. Results:
      - Unit testing
          - Scalability testing results (how well update operations scale)
          - Latency relative to Hive/HDFS (how long it takes to complete a query)
        NOTE: These unit test results show the advantage of RAF's design, which is oriented toward in-memory distributed processing.
      - Solution-level implementation and testing
          - Telecommunications subscriber management
          - Safe City solution
  • 19. Future work: motivated by the high degree of familiarity that many developers have with database interfaces, we are incrementally introducing SQL-92/JDBC/ODBC-like interfaces on top of RAF. A number of optimizations are also being added, including:
      - application-requested indexing, to accelerate searches
      - blending in column-store capabilities where appropriate (for example, for rarely written data)
      - compression, to reduce the data transported between nodes
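As a rough illustration of what application-requested indexing buys, the sketch below compares a full scan with a lookup through an index that the application asks for explicitly. It says nothing about how RAF implements indexing; `IndexedTable`, `create_index`, and `find` are hypothetical names, and rows are plain Python dicts.

```python
from collections import defaultdict
from typing import Any, Dict, List


class IndexedTable:
    """In-memory rows with optional per-column indexes (hypothetical sketch)."""

    def __init__(self, rows: List[Dict[str, Any]]):
        self.rows = list(rows)
        self.indexes: Dict[str, Dict[Any, List[int]]] = {}

    def create_index(self, column: str) -> None:
        # Built only when the application requests it: value -> row positions.
        index: Dict[Any, List[int]] = defaultdict(list)
        for pos, row in enumerate(self.rows):
            index[row[column]].append(pos)
        self.indexes[column] = index

    def insert(self, row: Dict[str, Any]) -> None:
        self.rows.append(row)
        pos = len(self.rows) - 1
        for column, index in self.indexes.items():
            index[row[column]].append(pos)   # keep existing indexes current

    def find(self, column: str, value: Any) -> List[Dict[str, Any]]:
        if column in self.indexes:
            # Indexed path: touch only the matching rows.
            return [self.rows[p] for p in self.indexes[column].get(value, [])]
        # Fallback path: scan every row.
        return [r for r in self.rows if r[column] == value]
```

The index costs extra memory and a little work on every insert, which is why leaving the decision to the application, rather than indexing every column, can make sense for update-heavy data.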
  • 20. Conclusion:
      - Discussed RAF, an architectural approach that meshes memory-centric non-relational query processing for low-latency analytics with memory-centric update processing to accommodate high volumes of updates.
      - Introduced Delegate, which participates as a special type of content transformer in a hierarchy of RDD transformations.
      - In RAF, protocol buffers are used to obtain data abstraction and efficient conveyance among applications, giving applications a high degree of independence in the location, representation, and transmission of data.
      - Presented a light-weight but expressive interface for RAF.
      - Using unit tests, showed high cluster scaling capability for transactions and an order-of-magnitude latency improvement for query processing.
      - Discussed two real-world usage scenarios in which RAF is being used.
  • 21. Index terms:
      - RDD: Resilient Distributed Dataset
      - RAF: Real-time Analytics Foundation
      - CRUD: Create/Retrieve/Update/Delete
      - HDFS: Hadoop Distributed File System
  • 22. References:
      - Apache Hadoop: http://hadoop.apache.org/
      - Apache HBase: http://hbase.apache.org/
      - Memcached: http://www.memcached.org/
      - Oracle Coherence: http://www.oracle.com/technetwork/middleware/coherence/
      - H. Plattner, A. Zeier, In-Memory Data Management.
      - Protobuf: http://code.google.com/p/protobuf/
      - Redis: http://www.redis.io/
      - SQLStream: http://www.sqlstream.com/
      - Vertica: http://www.vertica.com/
      - VoltDB: http://www.voltdb.com