Big Data: Its Characteristics and Architecture Capabilities

By Ashraf Uddin
South Asian University
(http://ashrafsau.blogspot.in/)
What is Big Data?
Big data refers to large datasets that are challenging to store, search, share, visualize, and analyze.
“Big Data” is data whose scale, diversity, and complexity require new architecture, techniques, algorithms, and analytics to manage it and extract value and hidden knowledge from it…
The Model of Generating/Consuming Data has Changed
Old Model: A few companies generate data; all others consume it.

New Model: All of us generate data, and all of us consume it.
Do we really need Big Data?
For consumers:
• Better understanding of own behavior
• Integration of activities
• Influence: involvement and recognition

For companies:
• Real behavior: what do people do, and what do they value?
• Faster interaction
• Better targeted offers
• Customer understanding
Characteristics of Big Data

1. Volume (Scale)
2. Velocity (Speed)
3. Variety (Complexity)
Volume
Velocity
• Data is being generated fast and needs to be processed fast
• Online data analytics
• A late decision leads to missed opportunities
Variety
• Various formats, types, and structures
• Text, numerical, images, audio, video, sequences, time series, social media data, multi-dimensional arrays, etc.
• Static data vs. streaming data
• A single application can generate or collect many types of data
• To extract knowledge, all these types of data need to be linked together
Generation of Big Data

Scientific instruments (collecting all sorts of data)

Social media and networks (all of us are generating data)

Sensor technology and networks (measuring all kinds of data)
Why is Big Data Different?
For example, an airline jet collects 10 terabytes of sensor data for every 30 minutes of flying time.
Compare that with conventional high-performance computing, where the New York Stock Exchange collects 1 terabyte of structured trading data per day.
Conventional corporate structured data is sized in terabytes and petabytes.
Big Data is sized in peta-, exa-, and soon perhaps, zettabytes!
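To put the two quoted figures on the same scale, the short calculation below converts both to terabytes per 24 hours. It is only a back-of-the-envelope sketch: it assumes, hypothetically, that the jet's sensors run continuously for a full day, and the function and variable names are illustrative.

```python
MINUTES_PER_DAY = 24 * 60


def tb_per_day(terabytes: float, per_minutes: float) -> float:
    """Scale a 'terabytes per N minutes' figure to terabytes per 24 hours."""
    return terabytes * MINUTES_PER_DAY / per_minutes


# Figures quoted on the slide.
jet_tb_per_day = tb_per_day(10, 30)                # 10 TB every 30 minutes
nyse_tb_per_day = tb_per_day(1, MINUTES_PER_DAY)   # 1 TB per day

# Assumption: the jet collects data nonstop for 24 hours (hypothetical).
ratio = jet_tb_per_day / nyse_tb_per_day

print(f"Jet:  {jet_tb_per_day:.0f} TB/day")
print(f"NYSE: {nyse_tb_per_day:.0f} TB/day")
print(f"Ratio: {ratio:.0f}x")
```

Under that assumption, a single aircraft would generate several hundred times the daily volume of the trading data.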
Why is Big Data Different?
The unique characteristic of Big Data is the manner in which value is discovered.
In conventional BI, the simple summing of a known value reveals a result.
In Big Data, the value is discovered through a refining modeling process:
• make a hypothesis
• create statistical, visual, or semantic models
• validate, then make a new hypothesis
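The hypothesize, model, validate loop above can be sketched in a few lines. This is a minimal illustration, not a prescribed method: the data is made up, each "hypothesis" is simply the guess that one column explains the target, and the model is an ordinary least-squares line checked against held-out rows.

```python
from statistics import mean

# Hypothetical observations: two candidate explanatory columns and a target.
ROWS = [
    {"a": 5, "b": 1, "y": 3},
    {"a": 1, "b": 2, "y": 5},
    {"a": 4, "b": 3, "y": 7},
    {"a": 2, "b": 4, "y": 9},
    {"a": 3, "b": 5, "y": 11},
    {"a": 6, "b": 6, "y": 13},
]
TRAIN, HOLDOUT = ROWS[:4], ROWS[4:]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def holdout_error(column):
    """Build a model for one hypothesis and validate it on unseen rows."""
    slope, intercept = fit_line(
        [r[column] for r in TRAIN], [r["y"] for r in TRAIN]
    )
    return mean((slope * r[column] + intercept - r["y"]) ** 2 for r in HOLDOUT)


# Make a hypothesis, model it, validate, then move on to the next hypothesis.
errors = {}
for hypothesis in ("a", "b"):
    errors[hypothesis] = holdout_error(hypothesis)

best = min(errors, key=errors.get)
print("validation errors:", errors)
print("best hypothesis:", best)
```

In practice each round of validation would suggest the next hypothesis rather than drawing from a fixed list.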
Use cases for Big Data Analytics
A Big Data Use Case:
Personalized Insurance Premium

An insurance company wants to offer a personalized premium to those who are unlikely to make a claim, thereby optimizing its profits.
One way to approach this problem is to collect more detailed data about an individual's driving habits and then assess their risk.
To collect data on driving habits, the company uses sensors in its customers' cars to capture driving data, such as routes driven, miles driven, time of day, and braking abruptness.
A Big Data Use Case:
Personalized Insurance Premium

This data is used to assess driver risk: the company compares individual driving patterns with other statistical information, such as average miles driven in the same state and peak hours of drivers on the road.
Driver risk plus actuarial information is then correlated with policy and profile information to offer a rate that is competitive and more profitable for the company.
The result: a personalized insurance plan.
These unique capabilities, delivered by big data analytics, are revolutionizing the insurance industry.
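As a rough illustration of comparing an individual's driving pattern with population statistics, the sketch below turns three sensor-derived summaries into a risk score and a premium. Everything numeric here is an assumption made for the example (the weights, the caps, the premium formula, and the sample drivers); a real insurer would rely on actuarial models.

```python
from dataclasses import dataclass


@dataclass
class DrivingSummary:
    miles_driven: float               # annual miles from the in-car sensor
    peak_hour_share: float            # fraction of trips in peak hours, 0..1
    hard_brakes_per_100_miles: float  # abrupt braking events

# Hypothetical weights for mileage, peak-hour driving, and braking.
WEIGHTS = (0.4, 0.3, 0.3)


def risk_score(driver: DrivingSummary, state_avg_miles: float) -> float:
    """Return a score in [0, 1]; higher means riskier (illustrative only)."""
    mileage = min(driver.miles_driven / state_avg_miles, 2.0) / 2.0
    peak = min(max(driver.peak_hour_share, 0.0), 1.0)
    braking = min(driver.hard_brakes_per_100_miles / 10.0, 1.0)
    w_mileage, w_peak, w_braking = WEIGHTS
    return w_mileage * mileage + w_peak * peak + w_braking * braking


def premium(base_rate: float, score: float) -> float:
    """Scale a base rate between 80% and 120% according to the score."""
    return base_rate * (0.8 + 0.4 * score)


careful = DrivingSummary(6_000, 0.1, 1.0)
risky = DrivingSummary(24_000, 0.6, 8.0)
STATE_AVG_MILES = 12_000  # hypothetical state average

for name, driver in (("careful", careful), ("risky", risky)):
    score = risk_score(driver, STATE_AVG_MILES)
    print(f"{name}: score={score:.2f}, premium={premium(1000.0, score):.2f}")
```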
A Big Data Use Case:
Personalized Insurance Premium

To accomplish this task:
• A great amount of continuous data must be collected, stored, and correlated.
• Hadoop is an excellent choice for acquisition and reduction of the automobile sensor data.
• Master data and certain reference data, including customer profile information, are likely to be stored in the existing DBMS.
• A NoSQL database can be used to capture and store reference data that are more dynamic, diverse in format, and change frequently.
Data Realm Characteristics
Big Data Architecture Capabilities
Storage and Management Capability
Database Capability
Processing Capability
Data Integration Capability
Statistical Analysis Capability
Storage and Management Capability
Hadoop Distributed File System (HDFS)
• Highly scalable storage and automatic data replication across three nodes for fault tolerance

Cloudera Manager
• Gives a cluster-wide, real-time view of nodes and services running; provides a single, central place to enact configuration changes across the cluster
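The idea of keeping three copies of each block can be illustrated with a toy simulation. This sketch is not HDFS's actual placement policy; the round-robin placement, node names, and block count are all assumptions chosen to show why losing one node does not lose data.

```python
REPLICATION = 3  # three copies per block, as stated on the slide


def place_blocks(num_blocks, nodes, replication=REPLICATION):
    """Assign each block to `replication` distinct nodes, round-robin."""
    if len(nodes) < replication:
        raise ValueError("need at least as many nodes as replicas")
    return {
        block: [nodes[(block + i) % len(nodes)] for i in range(replication)]
        for block in range(num_blocks)
    }


def surviving_replicas(placement, failed_nodes):
    """Return the replicas of each block that are still on healthy nodes."""
    return {
        block: [n for n in replicas if n not in failed_nodes]
        for block, replicas in placement.items()
    }


# Hypothetical five-node cluster storing six blocks.
nodes = ["node-a", "node-b", "node-c", "node-d", "node-e"]
placement = place_blocks(6, nodes)
after_failure = surviving_replicas(placement, {"node-b"})

for block, replicas in after_failure.items():
    print(f"block {block}: {len(replicas)} replica(s) left on {replicas}")
```

With three replicas on distinct nodes, any single node failure leaves at least two copies of every block available.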
Database Capability
Oracle NoSQL
• Dynamic and flexible schema design
• High-performance key-value pair database

Apache HBase
• Strictly consistent reads and writes
• Allows random, real-time read/write access

Apache Cassandra
• Fault tolerance capability is designed for every node
• Data model offers column indexes with the performance of log-structured updates, materialized views, and built-in caching

Apache Hive
• Tools to enable easy data extract/transform/load (ETL)
• Query execution via MapReduce
Processing Capability
MapReduce
• Breaks a problem up into smaller sub-problems
• Able to distribute data workloads across thousands of nodes

Apache Hadoop
• Leading MapReduce implementation
• Highly scalable parallel batch processing
• Writes multiple copies across the cluster for fault tolerance
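The MapReduce pattern can be shown in miniature with plain Python. This is a single-process sketch of the map, shuffle, and reduce phases, not Hadoop's API; the input chunks stand in for data splits that a real cluster would process on separate nodes, and the driver IDs and mileage values are made up.

```python
from collections import defaultdict


def map_phase(record):
    """Emit (key, value) pairs for one input record."""
    driver_id, miles = record
    yield driver_id, miles


def shuffle(pairs):
    """Group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def run_job(chunks):
    # Each chunk is a sub-problem; on a cluster these would run in parallel.
    mapped = [
        pair
        for chunk in chunks
        for record in chunk
        for pair in map_phase(record)
    ]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical trip records (driver_id, miles), split into three chunks.
chunks = [
    [("d1", 12), ("d2", 30)],
    [("d1", 8), ("d3", 5)],
    [("d2", 10)],
]
totals = run_job(chunks)
print(totals)
```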
Data Integration Capability
• Exports MapReduce results to RDBMS, Hadoop, and other targets
• Connects Hadoop to relational databases for SQL processing
• Optimized processing with parallel data import/export
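Exporting aggregated results into a relational database for SQL processing might look like the sketch below. It uses Python's built-in sqlite3 module as a stand-in for the RDBMS; the table name, columns, and sample rows are hypothetical, and no actual Hadoop connector is involved.

```python
import sqlite3

# Hypothetical output of a batch job: (driver_id, total_miles).
results = [("d1", 20), ("d2", 40), ("d3", 5)]


def export_results(conn, rows):
    """Load aggregated rows into a relational table (illustrative schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS driver_miles ("
        "driver_id TEXT PRIMARY KEY, total_miles INTEGER)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO driver_miles VALUES (?, ?)", rows
    )
    conn.commit()


conn = sqlite3.connect(":memory:")  # in-memory database for the example
export_results(conn, results)

# Once the data is relational, ordinary SQL can be used to query it.
high_mileage = conn.execute(
    "SELECT driver_id, total_miles FROM driver_miles "
    "WHERE total_miles >= ? ORDER BY total_miles DESC",
    (20,),
).fetchall()
print(high_mileage)
```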
Statistical Analysis Capability
• Programming language for statistical analysis
• Oracle R Enterprise allows reuse of pre-existing R scripts with no modification
Big Data Architecture

Traditional Information Architecture Capability

Big Data Information Architecture Capability
Conclusion
• Today’s economic environment demands that business be driven by useful, accurate, and timely information.
• The world of Big Data is a solution to the problem.
• There are always business and IT tradeoffs in getting to data and information in the most cost-effective way.
