Big Data Basics
AUTHOR : MITHUN BANERJEE
DATE: 05-OCTOBER-2016
COPYRIGHT PROTECTED BY ECLIPSE TECHNO CONSULTING GLOBAL (P) LTD.
What is Big data?
Big data is the term for a collection of data sets so large and complex
that it becomes difficult to process using on-hand database
management tools or traditional data processing applications.
--Wikipedia
Is the above definition fully comprehensive?
Let's dig deeper in the next slides.
Data units to measure the exponential growth of data over the years
VOLUME of DATA
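To make the data units concrete, here is a minimal sketch that expresses a byte count in the largest fitting unit. It assumes decimal (SI) units, where each step is a factor of 1000; the function name `human_readable` and the sample values are illustrative only and do not come from the slides.

```python
# Decimal (SI) data units; each step up is a factor of 1000 (an assumption,
# binary units with a factor of 1024 are also in common use).
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]


def human_readable(num_bytes: float) -> str:
    """Express a byte count in the largest unit that keeps the value below 1000."""
    value = float(num_bytes)
    for unit in UNITS:
        # Stop at the first unit where the value fits, or at the last unit.
        if value < 1000 or unit == UNITS[-1]:
            return f"{value:.1f} {unit}"
        value /= 1000


print(human_readable(1_500))       # a small file
print(human_readable(2 * 10**15))  # a petabyte-scale data set
```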
Variety (complexities) of data
Types of data
• Relational Data (Tables/Transaction/Legacy Data)
• Text Data (Web)
• Semi-structured Data (XML)
• Graph Data
  Social Network, Semantic Web (RDF), …
• Streaming Data
  You can only scan the data once
• A single application can be generating/collecting many types of data
• Big Public Data (online, weather, finance, etc.)
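The "scan the data once" constraint on streaming data can be illustrated with a one-pass statistic. The sketch below uses Welford's online algorithm to compute the mean and sample variance without storing the stream; the function name and the sample readings are hypothetical and chosen only for illustration.

```python
from typing import Iterable, Tuple


def running_mean_variance(stream: Iterable[float]) -> Tuple[int, float, float]:
    """Return (count, mean, sample variance) after a single pass over the stream."""
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in stream:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)
    variance = m2 / (count - 1) if count > 1 else 0.0
    return count, mean, variance


# A generator can be consumed only once, much like a real data stream.
readings = (float(v) for v in [2, 4, 4, 4, 5, 5, 7, 9])
print(running_mean_variance(readings))
```

Only three numbers are kept in memory regardless of how long the stream is, which is the property that matters when the data cannot be rescanned.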
Velocity of data
Late decisions mean missed opportunities
Example: Healthcare monitoring: sensors monitor your activities and body, and any abnormal measurement requires an immediate reaction
REAL-TIME / FAST DATA
• Social media and networks (all of us are generating data)
• Scientific instruments (collecting all sorts of data)
• Sensor technology and networks (measuring all kinds of data)
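The healthcare monitoring example can be sketched as a check that reacts to each reading as it arrives instead of waiting for a batch job. Everything specific here is an assumption made for illustration: the heart-rate range, the `(timestamp, bpm)` record shape, and the function name are not from the slides.

```python
from typing import Iterable, Iterator, Tuple

# Hypothetical safe range for a heart-rate sensor, in beats per minute.
LOW_BPM, HIGH_BPM = 50, 120


def abnormal_readings(stream: Iterable[Tuple[int, int]]) -> Iterator[Tuple[int, int]]:
    """Yield (timestamp, bpm) as soon as a reading leaves the safe range."""
    for timestamp, bpm in stream:
        if bpm < LOW_BPM or bpm > HIGH_BPM:
            yield timestamp, bpm


# Made-up sensor feed: (seconds since start, beats per minute).
sensor_feed = [(0, 72), (1, 75), (2, 131), (3, 80), (4, 44)]
for timestamp, bpm in abnormal_readings(sensor_feed):
    print(f"t={timestamp}s: abnormal heart rate {bpm} bpm")
```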
3Vs
4Vs
Generation and Consumption of Data
In past
In present
OLTP: ONLINE TRANSACTION PROCESSING (DBMS)
OLAP: ONLINE ANALYTICAL PROCESSING (DATA WAREHOUSING)
RTAP: REAL-TIME ANALYTICS PROCESSING (BIG DATA ARCHITECTURE & TECHNOLOGY)
Driver of Data
- Optimizations and predictive analytics
- Complex statistical analysis
- All types of data, and many sources
- Very large datasets
- More real-time in nature
- Ad-hoc querying and reporting
- Data mining techniques
- Structured data, typical sources
- Small to mid-size datasets
The Evolution of Business Intelligence
Timeline: 1990's, 2000's, 2010's (axes: Speed, Scale)
• BI Reporting, OLAP & Data Warehouse
  Business Objects, SAS, Informatica, Cognos, other SQL Reporting Tools
• Interactive Business Intelligence & In-memory RDBMS
  QlikView, Tableau, HANA
• Big Data: Batch Processing & Distributed Data Store
  Hadoop/Spark; HBase/Cassandra
• Big Data: Real Time & Single View
  Graph Databases
Topic 1: Data Analytics & Data Mining
• EXPLORATORY DATA ANALYSIS
• LINEAR CLASSIFICATION (PERCEPTRON & LOGISTIC REGRESSION)
• LINEAR REGRESSION
• C4.5 DECISION TREE
• APRIORI
• K-MEANS CLUSTERING
• EM ALGORITHM
• PAGERANK & HITS
• COLLABORATIVE FILTERING
Topic 2: Hadoop/MapReduce Programming & Data Processing
ARCHITECTURE OF HADOOP, HDFS, AND YARN
PROGRAMMING ON HADOOP
BASIC DATA PROCESSING: SORT AND JOIN
INFORMATION RETRIEVAL USING HADOOP
DATA MINING USING HADOOP (KMEANS+HISTOGRAMS)
MACHINE LEARNING ON HADOOP (EM)
HIVE/PIG
HBASE AND CASSANDRA
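As a rough illustration of the MapReduce programming model named in this topic, the sketch below runs a word count through explicit map, shuffle, and reduce steps in plain Python. It does not use the Hadoop API and runs in a single process; the function names are hypothetical, and a real Hadoop job would distribute these phases across a cluster.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    mapped = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(mapped)
    # Keys are processed in sorted order, mirroring the sort step before reduce.
    return dict(reduce_phase(k, v) for k, v in sorted(grouped.items()))


print(word_count(["big data", "Big Data basics"]))
```

The same three-phase shape carries over to the sort and join tasks listed above: only the key chosen in the map step and the logic in the reduce step change.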
Topic 3: Graph Database and Graph Analytics
GRAPH DATABASE (HTTP://EN.WIKIPEDIA.ORG/WIKI/GRAPH_DATABASE)
Native Graph Database (Neo4j)
Pregel/Giraph (Distributed Graph Processing Engine)
NEO4J/TITAN/GRAPHLAB/GRAPHSQL
References to read for in-depth homework
• Hadoop: The Definitive Guide, Tom White, O’Reilly
• Data Mining: Concepts and Techniques, Third Edition, by Jiawei Han et al.
• https://www.mongodb.com/collateral/big-data-examples-and-guidelines-enterprise-decision-maker
• http://www.aptude.com/blog/entry/hadoop-vs-mongodb-which-platform-is-better-for-handling-big-data
• http://www.slideshare.net/wlaforest/an-introduction-to-big-data-nosql-and-mongodb
• http://www.infoworld.com/article/2608460/application-development/the-10-worst-big-data-practices.html