This document discusses Hadoop, including its key features, advantages, and versions. The main features of Hadoop are tooling, code generation, modeling, scheduling, and integration capabilities. It has advantages such as being scalable, cost effective, flexible, fast, and resilient to failure. There are two main versions of Hadoop: Hadoop 1.0 uses MapReduce for data processing and HDFS for storage, while Hadoop 2.0 introduces YARN as a separate resource manager and allows various data processing frameworks beyond just MapReduce.
1. Tooling:
Developers can create, design, and deploy big data services on any platform or development environment of their choice.
2. Code generation:
With a Hadoop big data suite, there is no need to write, debug, analyze, and optimize MapReduce code by hand; the complete code is auto-generated.
3. Modeling:
Every Hadoop distribution provides the infrastructure to integrate Hadoop clusters. Without modeling support, developers have to write complex code to develop MapReduce programs. With a suite, they can write such logic in plain Java, or use higher-level languages such as Pig Latin, HQL, etc.
4. Scheduling:
Big data job execution needs to be monitored and scheduled. Instead of writing scheduling jobs themselves, developers can use the big data suite to define and manage execution tasks in the most efficient way.
5. Integration:
Hadoop needs to integrate data from all types of products and technologies. Along with files and SQL databases, developers want to integrate data from NoSQL databases, social media, B2B products, etc.
Key Advantages
There are many advantages associated with Hadoop. This presentation covers some of the major ones.
Scalable:
Hadoop is highly scalable: it can store and distribute very large data sets across hundreds of inexpensive servers.
Cost effective:
Owing to its scale-out architecture, Hadoop offers a cost-effective solution for both storage and processing.
Flexible:
Hadoop can work with all kinds of data: structured, semi-structured, and unstructured. It can be used for a wide variety of purposes, such as log processing, recommendation systems, data warehousing, data mining, and more.
Fast:
Processing is extremely fast compared to conventional systems, owing to the "move code to data" paradigm.
Resilient to failure:
Hadoop is fault tolerant. It diligently replicates data across nodes, ensuring that in the event of a node failure, the data remains available from another copy.
There are two versions of Hadoop available:
1. Hadoop 1.0
2. Hadoop 2.0
Hadoop 1.0
It has two main parts:
1. Data storage framework
2. Data processing framework

1. Data storage framework:
It is a general-purpose file system called the Hadoop Distributed File System (HDFS). HDFS is schema-less; it can store data files in just about any format.
2. Data processing framework:
MapReduce is a simple functional programming model. It essentially uses two functions:
1. MAP
2. REDUCE
The "mappers" take a set of key-value pairs and generate intermediate data. The "reducers" then act on this input to produce the output data.
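The map/reduce flow above can be sketched in plain Python. This is a minimal single-machine illustration of the programming model, not Hadoop's actual Java API; the word-count example and function names are purely illustrative:

```python
from collections import defaultdict

def mapper(_, line):
    # MAP: emit an intermediate (word, 1) pair for each word in the line.
    for word in line.split():
        yield word, 1

def reducer(word, counts):
    # REDUCE: combine all intermediate values for one key into a result.
    return word, sum(counts)

def run_mapreduce(records):
    # Shuffle phase: group intermediate pairs by key before reducing.
    groups = defaultdict(list)
    for key, value in records:
        for k, v in mapper(key, value):
            groups[k].append(v)
    return dict(reducer(k, vs) for k, vs in groups.items())

result = run_mapreduce([(0, "big data big deal"), (1, "big data")])
print(result)  # {'big': 3, 'data': 2, 'deal': 1}
```

In real Hadoop, the mappers and reducers run on different nodes and the framework performs the shuffle across the network; the structure of the computation is the same.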
Hadoop 2.0
HDFS continues to be the data storage framework. A new, separate resource management framework called YARN (Yet Another Resource Negotiator) has been added. Any application capable of dividing itself into parallel tasks is supported by YARN, which coordinates the allocation of subtasks of the submitted applications.
This further enhances the flexibility, scalability, and efficiency of applications. An ApplicationMaster can run any application, not just MapReduce. Hadoop 2.0 supports not only batch processing but also real-time processing; MapReduce is no longer the only data processing option.
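The core idea YARN supports, an application dividing itself into independent parallel subtasks, can be illustrated with a small single-machine Python sketch. This is only an analogy: YARN itself schedules containers across a cluster, while here a thread pool stands in for the parallel workers, and the chunking scheme is an assumption for the example:

```python
from concurrent.futures import ThreadPoolExecutor

def subtask(chunk):
    # One parallel subtask: process an independent slice of the input.
    return sum(chunk)

def run_application(data, workers=4):
    # Divide the work into independent parallel tasks, the way a
    # YARN application splits itself into subtasks for the cluster.
    chunks = [data[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(subtask, chunks)
    # Combine the partial results into the final answer.
    return sum(partials)

print(run_application(list(range(100))))  # 4950
```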