2. Big Data - Hadoop
About 90% of the world’s data has been generated in the last few years
Big Data: datasets so large that they cannot be
processed using traditional computing techniques.
What comes under Big Data:
• Social media data
• Stock exchange data
• Search engine data
4. HADOOP
• Developed by Doug Cutting, Mike Cafarella and team
• Open-source project built around the MapReduce algorithm
• Apache Hadoop is a registered trademark of the Apache
Software Foundation
5. HADOOP Framework
• Hadoop Common: Java libraries used by the other Hadoop modules
• Hadoop YARN: job scheduling and cluster management framework
• Hadoop HDFS: distributed file system that provides high-throughput access to application data
• Hadoop MapReduce: software framework for parallel processing of large data sets
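The MapReduce model behind the framework can be illustrated with a plain-Python word-count simulation (this is not Hadoop code; the function names are illustrative stand-ins for the map, shuffle, and reduce phases):

```python
from collections import defaultdict

def map_phase(line):
    # Map: emit a (word, 1) pair for every word in the input line
    return [(word, 1) for word in line.split()]

def shuffle(pairs):
    # Shuffle: group all values by key, as the framework does between phases
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    # Reduce: sum the counts emitted for one word
    return (key, sum(values))

lines = ["big data hadoop", "hadoop mapreduce", "big hadoop"]
pairs = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
print(counts)  # {'big': 2, 'data': 1, 'hadoop': 3, 'mapreduce': 1}
```

In real Hadoop, the map and reduce functions run in parallel across the cluster and the shuffle is handled by the framework itself.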
6. How Does HADOOP Work?
Stage 1
The user submits a job to the Hadoop job client
for the required process, specifying:
• the input and output file locations in the DFS
• the job configuration, by setting parameters
specific to the job
Stage 2
• The Hadoop job client then submits the job
and its configuration to the JobTracker
• The JobTracker distributes the configuration
to the slave nodes, schedules the tasks,
monitors them, and provides status back to
the job client
7. How Does HADOOP Work?
Stage 3
The TaskTrackers execute the tasks as per the
MapReduce implementation, and the output is
stored in output files on the file system.
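The three stages above can be sketched as a toy simulation (plain Python, with hypothetical names standing in for the JobTracker and TaskTrackers; real Hadoop distributes this work across machines in the cluster):

```python
def job_client_submit(job_config, tracker):
    # Stages 1-2: the job client hands the job and its configuration to the JobTracker
    return tracker(job_config)

def job_tracker(job_config):
    # Stage 2: the JobTracker schedules one task per input split
    results = [task_tracker(job_config["task"], split)
               for split in job_config["input"]]
    # Collect results into the configured output "file" (a list here)
    job_config["output"].extend(results)
    return job_config["output"]

def task_tracker(task, split):
    # Stage 3: a TaskTracker executes the task on its input split
    return task(split)

config = {
    "input": ["alpha beta", "beta gamma"],    # input splits (HDFS files in reality)
    "output": [],                             # output location
    "task": lambda split: len(split.split())  # the task: count words in a split
}
print(job_client_submit(config, job_tracker))  # [2, 2]
```

Each simulated TaskTracker here just counts words in its split; in Hadoop the task would be the user's Mapper or Reducer code, and the output would land in files in HDFS.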