Introduction to MapReduce – S. Jency Jayastina, II M.Sc. Computer Science, Bon Secours College for Women, Thanjavur
1. SUBMITTED BY
Name : S.JENCY JAYASTINA
Class : II-MSC CS
Batch : 2017 – 2019
Incharge Staff : Ms. M. Florence Dayana
2. MapReduce is a programming model for processing large data sets with a parallel, distributed algorithm on a cluster.
When coupled with HDFS, MapReduce can be used to handle big data, and it has extensive capability to handle unstructured data as well.
3. MapReduce is a programming model that Google has used successfully in processing its “big-data” sets (~20 petabytes per day).
• Users specify the computation in terms of a map
and a reduce function,
• Underlying runtime system automatically
parallelizes the computation across large-scale
clusters of machines, and
• Underlying system also handles machine failures,
efficient communications, and performance issues.
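In the original model, the user-supplied functions have the following type signatures, where the map output key/value types (k2, v2) may differ from the input types (k1, v1):

    map    (k1, v1)        -> list(k2, v2)
    reduce (k2, list(v2))  -> list(v2)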
4. MapReduce is the processing engine of Apache Hadoop and was directly derived from Google's MapReduce.
MapReduce applications are typically written in Java. They conveniently process huge amounts of data by applying mapping and reducing steps to arrive at a solution to the required problem.
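As a concrete sketch, the mapping step of the classic word-count example can be written in Java against the Hadoop MapReduce API (the class name is illustrative):

    import java.io.IOException;
    import java.util.StringTokenizer;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Mapping step: break each input line into (word, 1) key/value pairs.
    public class TokenizerMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, ONE);   // emit (word, 1)
            }
        }
    }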
5. The mapping step takes a set of data and converts it into another set of data by breaking the individual elements into key/value pairs called tuples.
The second step, reducing, takes the output derived from the mapping process and combines those data tuples into a smaller set of tuples.
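Continuing the word-count sketch above, a matching reducer combines the many (word, 1) tuples emitted by the mapper into one (word, count) tuple per word:

    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;

    // Reducing step: combine the mapper's tuples into a smaller set, one per key.
    public class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();           // add up the 1s emitted for this word
            }
            result.set(sum);
            context.write(key, result);     // emit (word, total count)
        }
    }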
7. MapReduce is mainly used for parallel processing of large sets of data stored in a Hadoop cluster.
It is a paradigm specially designed by Google to provide parallelism, data distribution and fault tolerance.
MapReduce processes data in the form of key-value pairs. A key-value (KV) pair is a mapping between two linked data items: a key and its value.
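For example, in a word count the word is the key and its count is the value (the counts below are illustrative):

    ("hadoop", 1)     an intermediate pair emitted by a mapper
    ("hadoop", 42)    the final pair for that key after reduction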
8. The entire MapReduce process is a massively parallel processing setup where the computation is moved to the place of the data instead of moving the data to the place of the computation.
The entire computation is broken down into the mapping, shuffling and reducing stages.
A MapReduce program executes in three stages, namely:
map stage
shuffle stage
reduce stage
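A small word-count trace (the input lines are illustrative) shows what each stage produces:

    Input:    "deer bear river"  and  "car car river"
    Map:      (deer,1) (bear,1) (river,1)  and  (car,1) (car,1) (river,1)
    Shuffle:  (bear,[1]) (car,[1,1]) (deer,[1]) (river,[1,1])
    Reduce:   (bear,1) (car,2) (deer,1) (river,2)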
9. Viewed more coarsely, there are two stages (the shuffle is folded into the reducing stage):
Mapping Stage
Reducing Stage
Mapping Stage: This is the first step of MapReduce; it includes reading the input data from the Hadoop Distributed File System (HDFS).
Reducing Stage: The reducer phase can consist of multiple processes. In the shuffling process, the data is transferred from the mappers to the reducers.
10. MasterNode – Place where JobTracker runs and which
accepts job requests from clients
SlaveNode – It is the place where the mapping and
reducing programs are run
JobTracker – It is the entity that schedules the jobs and tracks the assigned jobs using the TaskTracker
TaskTracker – It is the entity that actually tracks the tasks and reports status to the JobTracker
Job – A MapReduce job is the execution of the Mapper &
Reducer program across a dataset
Task – The execution of the Mapper & Reducer program on a specific data section
TaskAttempt – A particular task execution attempt on a
SlaveNode
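As a sketch, a job such as the word-count example is assembled and submitted by a small driver program; it reuses the mapper and reducer sketched earlier, and the class name, job name and argument handling are illustrative:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCountDriver {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            Job job = Job.getInstance(conf, "word count");            // the MapReduce job
            job.setJarByClass(WordCountDriver.class);
            job.setMapperClass(TokenizerMapper.class);                // mapper sketched earlier
            job.setReducerClass(IntSumReducer.class);                 // reducer sketched earlier
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));     // HDFS input path
            FileOutputFormat.setOutputPath(job, new Path(args[1]));   // HDFS output path
            System.exit(job.waitForCompletion(true) ? 0 : 1);         // submit and wait
        }
    }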
11. At Google, MapReduce operations are run on a special file system called the Google File System (GFS) that is highly optimized for this purpose.
GFS is not open source.
Doug Cutting and Yahoo!, working from the published GFS design, built an open-source counterpart called the Hadoop Distributed File System (HDFS).
The software framework that supports HDFS, MapReduce and other related entities is called the Hadoop project, or simply Hadoop.
It is open source and distributed by Apache.