This presentation is a short slideshow describing the problem of ever-growing digital data and its technical solution, Hadoop. Embedded effects and animations make it more engaging and presentable.
2. Big data really means big: it is a collection of data sets so large and complex that they become difficult to process using traditional data-processing applications.
5. TYPES OF BIG DATA
Structured data: relational data
Semi-structured data: XML data
Unstructured data: PDF, Word, text, media logs, etc.
6. Every day, about 0.5 PB of data is added to FACEBOOK, including 40 million photos.
Every day, enough video is uploaded to YOUTUBE to be watched continuously for a year.
Big data also affects INTERNET SEARCH, FINANCE, and BUSINESS INFORMATION.
Challenges include the CAPTURE, SEARCHING, SHARING, ANALYSIS, STORAGE, and VISUALIZATION of data.
9. A software framework for distributed processing of large datasets across large clusters of computers.
Large datasets: terabytes or petabytes of data.
Large clusters: hundreds or thousands of nodes.
An open-source implementation of Google's MAPREDUCE.
Based on a simple data model: any data will fit.
10. 2005: Doug Cutting, Michael J. Cafarella, and their team developed Hadoop to support distribution for the Nutch search engine project.
Doug named it after his son's toy elephant.
The project was funded by YAHOO.
2006: Yahoo gave the project to the APACHE SOFTWARE FOUNDATION.
13. A software framework for distributing the computation of huge data.
Consists of two main phases:
◦ Map
◦ Reduce
The Map task: breaks the input down into individual elements (key-value pairs).
The Reduce task: takes the output of the map tasks as input and combines it.
14. How MapReduce Works?
Input:
We Love India
We Play Tennis
MAP:
We 1, Love 1, India 1
We 1, Play 1, Tennis 1
REDUCE:
We 2, Love 1, India 1, Play 1, Tennis 1
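The word-count flow above can be sketched in plain Python. This is a minimal simulation of the Map, shuffle, and Reduce steps, not the actual Hadoop API:

```python
from collections import defaultdict

# Input lines, as in the example above
lines = ["We Love India", "We Play Tennis"]

# MAP phase: emit a (word, 1) pair for every word in every line
mapped = []
for line in lines:
    for word in line.split():
        mapped.append((word, 1))

# SHUFFLE: group the pairs by key (the word)
grouped = defaultdict(list)
for word, count in mapped:
    grouped[word].append(count)

# REDUCE phase: sum the counts for each word
reduced = {word: sum(counts) for word, counts in grouped.items()}

print(reduced)  # {'We': 2, 'Love': 1, 'India': 1, 'Play': 1, 'Tennis': 1}
```

In real Hadoop, the map and reduce functions run in parallel on many nodes, and the framework performs the shuffle step between them.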
15. HDFS
The distributed file system used by Hadoop is HDFS (the Hadoop Distributed File System).
Based on the Google File System (GFS).
Designed to run on clusters of thousands of small computers.
HDFS uses a MASTER-SLAVE ARCHITECTURE.
16. The master node is called the NameNode.
The slave nodes are called DataNodes.
The master (NameNode) manages the file system metadata.
The slaves (DataNodes) store the actual data.
A file in HDFS is split into several blocks.
The blocks are stored in a set of DataNodes.
The NameNode maps the blocks to the DataNodes.
The DataNodes take care of read, write, creation, and deletion operations based on instructions from the NameNode.
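The split-and-map idea above can be sketched as a toy simulation: a file is cut into fixed-size blocks, and a NameNode-style table records which DataNodes hold each block. The block size, replication factor, and node names here are illustrative, not real HDFS defaults (HDFS defaults to 128 MB blocks and 3 replicas):

```python
BLOCK_SIZE = 4   # bytes; tiny for demonstration (real HDFS: 128 MB)
REPLICATION = 2  # replicas per block (real HDFS default: 3)
datanodes = ["datanode1", "datanode2", "datanode3"]

def split_into_blocks(data, block_size):
    """Split a file's bytes into fixed-size blocks, as an HDFS client would."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def place_blocks(blocks, nodes, replication):
    """NameNode-style metadata: map each block id to the DataNodes holding it."""
    placement = {}
    for block_id in range(len(blocks)):
        # simple round-robin placement; real HDFS placement is rack-aware
        placement[block_id] = [nodes[(block_id + r) % len(nodes)]
                               for r in range(replication)]
    return placement

blocks = split_into_blocks(b"hello hdfs!", BLOCK_SIZE)
metadata = place_blocks(blocks, datanodes, REPLICATION)
print(blocks)    # [b'hell', b'o hd', b'fs!']
print(metadata)  # {0: ['datanode1', 'datanode2'], 1: ['datanode2', 'datanode3'], 2: ['datanode3', 'datanode1']}
```

Note that only the metadata (the block-to-node table) lives on the NameNode; the block contents themselves live on the DataNodes.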
17. HADOOP COMMON
Provides access to HDFS.
Contains Java libraries and utilities.
Contains the necessary Java files and scripts to start HADOOP.
18. ADVANTAGES OF HADOOP
• Designed to detect and handle failures.
• Automatic distribution of data across the machines.
• Doesn't rely on hardware for fault tolerance.
• Servers can be added or removed dynamically.
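The first advantage can be sketched with a hypothetical simulation: when a DataNode dies, the NameNode drops it from every block's replica list and copies under-replicated blocks to the remaining live nodes. This illustrates the idea of replication-based fault tolerance, not the real HDFS recovery protocol:

```python
REPLICATION = 2  # target replicas per block (real HDFS default: 3)

# NameNode-style view: block id -> DataNodes holding a replica
placement = {0: ["node1", "node2"], 1: ["node2", "node3"], 2: ["node3", "node1"]}
live_nodes = {"node1", "node2", "node3"}

def handle_failure(placement, live_nodes, dead_node, target):
    """Remove the dead node and re-replicate under-replicated blocks."""
    live_nodes.discard(dead_node)
    for block_id, replicas in placement.items():
        # forget replicas that were on the dead node
        replicas[:] = [n for n in replicas if n != dead_node]
        # copy the block to live nodes until the target count is restored
        for node in sorted(live_nodes):
            if len(replicas) >= target:
                break
            if node not in replicas:
                replicas.append(node)

handle_failure(placement, live_nodes, "node2", REPLICATION)
print(placement)  # every block again has 2 replicas, none on node2
```

Because failures are handled in software this way, Hadoop can run on commodity hardware instead of relying on expensive fault-tolerant machines.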