BIG DATA
By,
Sowmiya.R
CONTENTS
Big Data Overview
Why Big Data?
Big Data Solutions
Hadoop Overview
Hadoop Core components
HDFS Overview
MapReduce Overview
Future of Big Data & Hadoop
BIG DATA OVERVIEW
What is Big data?
 Types:
 Structured Data
 Semi-Structured Data
 Unstructured Data
 Characteristics:
 Volume
 Velocity
 Variety
 Challenges:
 Capturing data
 Storage
 Searching
 Sharing
 Transfer
 Analysis
 Presentation
WHY BIG DATA?
 Key enablers for the growth of “Big Data” are:
 Increase in storage capacities
 Increase of processing power
 Availability of data
BIG DATA SOLUTIONS
 Traditional Enterprise Approach
 Centralized system, RDBMS
 Google’s Solution
 MapReduce
 Hadoop
 HDFS, MapReduce
HADOOP OVERVIEW
What is Hadoop?
 Working:
 Data divided into directories, files
 Distributed across clusters
 Stored in HDFS, processed using MapReduce
 Blocks replicated
 Code checking
 Sorting in Map and Reduce stages
 Obtaining output
 Writing logs.
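The "divided", "distributed" and "blocks replicated" steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `split_into_blocks` and `place_replicas` are hypothetical, the block size is tiny for readability, and the round-robin placement stands in for the rack-aware policy a real HDFS NameNode applies.

```python
def split_into_blocks(data: bytes, block_size: int) -> list[bytes]:
    """Cut a file's bytes into fixed-size blocks; the last block may be shorter."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_replicas(num_blocks: int, nodes: list[str], replication: int = 3) -> dict[int, list[str]]:
    """Assign each block to `replication` distinct nodes.

    Assumption: simple round-robin placement, not HDFS's real policy.
    """
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    return {
        block_id: [nodes[(block_id + r) % len(nodes)] for r in range(replication)]
        for block_id in range(num_blocks)
    }


if __name__ == "__main__":
    blocks = split_into_blocks(b"abcdefghij", block_size=4)
    placement = place_replicas(len(blocks), ["node1", "node2", "node3", "node4"])
    for block_id, holders in placement.items():
        print(block_id, blocks[block_id], holders)
```

Because every block lives on several nodes, losing one node still leaves other copies to read from, which is what the "Availability and resilient nature" advantage refers to.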
 Advantages:
 Scalability
 Cost-effective solution
 Flexibility
 Fast
 Security and Authentication
 Parallel processing
 Availability and resilient nature
 Simple model of programming
HADOOP CORE COMPONENTS
 Hadoop 1.0:
 MapReduce
 HDFS
 Hadoop 2.0:
 MapReduce
 HDFS
 YARN
HDFS OVERVIEW
 Architecture:
 Objectives:
 Fault detection and recovery
 Huge datasets
 Hardware at data (computation runs near the data)
 Goals:
 Hardware Failure
 Streaming Data access
 Large Data sets
 Simple Coherency Model
 Easy portability
MAPREDUCE OVERVIEW
 Work Flow:
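A minimal single-process sketch of the MapReduce work flow, using the usual word-count example. It does not use the Hadoop API; the function names `map_phase`, `shuffle_and_sort`, `reduce_phase` and `word_count` are hypothetical and only mirror the three stages (map, shuffle/sort, reduce).

```python
from collections import defaultdict


def map_phase(line: str):
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle_and_sort(pairs):
    """Shuffle/sort: group all values by key and order the keys."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reduce_phase(key: str, values: list[int]):
    """Reduce: collapse the values of one key into a single total."""
    return key, sum(values)


def word_count(lines: list[str]) -> dict[str, int]:
    mapped = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle_and_sort(mapped))


if __name__ == "__main__":
    print(word_count(["big data", "big hadoop"]))
```

In a real cluster the map calls run in parallel on the nodes that hold the input blocks, and the grouped pairs are sent over the network to the reducers.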
WORKING OF HADOOP
 Store and Compute Operations:
 Master Node, Slave Node -> Network of Clusters
 Cluster->Racks
 Racks->Nodes
 Store: Large data file divided and stored
 Compute: Retrieve data from near-by nodes.
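The "retrieve data from near-by nodes" idea can be sketched as picking the closest replica in the cluster -> rack -> node hierarchy. This is a simplified illustration: the `distance` and `pick_replica` helpers are hypothetical, and the distance values (0 for the same node, 2 for the same rack, 4 for another rack) are assumed here for the example.

```python
Location = tuple[str, str]  # (rack name, node name)


def distance(a: Location, b: Location) -> int:
    """Illustrative network distance between two nodes in one cluster."""
    if a == b:
        return 0  # same node: data is local
    if a[0] == b[0]:
        return 2  # same rack: one switch away
    return 4      # different rack: traffic crosses racks


def pick_replica(reader: Location, replicas: list[Location]) -> Location:
    """Choose the replica with the smallest distance to the reading node."""
    return min(replicas, key=lambda replica: distance(reader, replica))


if __name__ == "__main__":
    reader = ("rack1", "node1")
    replicas = [("rack2", "node5"), ("rack1", "node2"), ("rack2", "node6")]
    print(pick_replica(reader, replicas))
```

Preferring a local or same-rack copy keeps most reads off the links between racks, which is why compute tasks are scheduled close to the blocks they process.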
FUTURE OF BIG DATA & HADOOP
 Soaring demand for Analytics Professionals
 Huge Job Opportunities
 Adoption of Big Data Analytics is growing
 Analytics: A key factor in decision making
 Rise of unstructured and semi-structured data
 Surpassing Market Forecast/Predictions
THANK YOU

Big data with Hadoop
