Hadoop
Presented by:
Aarti Bedre
BTech final year (IT)
Introduction
• Big Data refers to data sets that are very large in size, typically on the order of petabytes (1 PB = 10^15 bytes).
• By a widely quoted estimate, almost 90% of the world’s data was generated in the last few years.
• In the Big Data world, the sheer volume, velocity and variety of data render many ordinary technologies ineffective.
• This data comes from many sources, such as:
  ◦ Social networking and web sites: Facebook, Google, etc.
  ◦ E-commerce sites: Amazon, Flipkart, etc.
  ◦ Telecom companies: Airtel, Vodafone, etc.
  ◦ And many more.
• Hadoop is a solution for this Big Data: it lets organizations manage all the data their servers gather in an efficient, cost-effective way.
Hadoop
• This huge amount of unstructured data needs to be stored, processed and analyzed.
• For storage, Hadoop uses HDFS (the Hadoop Distributed File System).
• Hadoop was created by Doug Cutting and Mike Cafarella, and much of its early development took place at Yahoo.
• Hadoop is an open source framework from Apache.
• It is used to store, process and analyze data that is very large in volume.
• It is written in Java and is designed for batch processing, not for online analytical processing (OLAP).
Architecture
• The Hadoop framework includes the following four modules:
1. Hadoop Common: the Java libraries and utilities shared by the other modules, including those needed to start Hadoop.
2. YARN: Yet Another Resource Negotiator, used for job scheduling and cluster resource management.
3. HDFS: Hadoop Distributed File System, the storage layer.
4. MapReduce: a YARN-based system for parallel processing of large data sets.
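The MapReduce module can be hard to picture from a one-line description, so here is a minimal single-process sketch of the idea in plain Python. It does not use Hadoop or any Hadoop API; the function names (`map_phase`, `shuffle_and_sort`, `reduce_phase`, `word_count`) and the word-count task are assumptions chosen for illustration. On a real cluster the map and reduce calls would run in parallel on different nodes.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_and_sort(pairs):
    """Shuffle/sort: group values by key and return the groups in key order."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reduce_phase(key, values):
    """Reduce: combine all values seen for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle/sort and reduce in sequence (illustrative driver)."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    return dict(reduce_phase(key, values) for key, values in shuffle_and_sort(mapped))


if __name__ == "__main__":
    sample = ["big data needs hadoop", "hadoop stores big data"]
    print(word_count(sample))
    # {'big': 2, 'data': 2, 'hadoop': 2, 'needs': 1, 'stores': 1}
```

Each phase only sees its own input, which is what would let a framework spread the work over many machines.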
Architecture (continued)
Fig 1: Hadoop Architecture
Advantages
• Fast
• Scalable
• Cost-effective
• Resilient to failure
Working
• Hadoop runs code across a cluster of computers. This process includes the following core tasks that Hadoop performs:
  ◦ Data is initially divided into directories and files. Files are divided into uniform-sized blocks of 128 MB or 64 MB (preferably 128 MB).
  ◦ These blocks are then distributed across various cluster nodes for further processing.
  ◦ HDFS, which sits on top of the local file system, supervises the processing.
  ◦ Blocks are replicated to handle hardware failure.
  ◦ Hadoop checks that the code was executed successfully.
  ◦ It performs the sort that takes place between the map and reduce stages.
  ◦ It sends the sorted data to a particular computer.
  ◦ It writes the debugging logs for each job.
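The block-splitting and replication steps in the list above can be sketched with a little arithmetic. This is an illustration only, not HDFS code: the replication factor of 3 is assumed as the usual HDFS default, the node names are made up, and the round-robin placement in `place_replicas` is a hypothetical stand-in for HDFS's real rack-aware policy.

```python
BLOCK_SIZE = 128 * 1024 * 1024  # 128 MB, the preferred block size named above
REPLICATION = 3                 # assumption: the usual HDFS default replication factor


def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return the byte length of each block for a file of file_size bytes.

    Every block is block_size bytes except possibly the last, which holds
    whatever data remains.
    """
    if file_size <= 0:
        return []
    full_blocks, remainder = divmod(file_size, block_size)
    return [block_size] * full_blocks + ([remainder] if remainder else [])


def place_replicas(num_blocks, nodes, replication=REPLICATION):
    """Assign each block to `replication` distinct nodes, round-robin.

    Hypothetical placement for illustration; real HDFS placement is rack-aware.
    """
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    return {
        block: [nodes[(block + r) % len(nodes)] for r in range(replication)]
        for block in range(num_blocks)
    }


if __name__ == "__main__":
    blocks = split_into_blocks(300 * 1024 * 1024)  # a 300 MB file
    print(len(blocks), "blocks; last block is", blocks[-1] // (1024 * 1024), "MB")
    for block, holders in place_replicas(len(blocks), ["node1", "node2", "node3", "node4"]).items():
        print("block", block, "->", holders)
```

With a 300 MB file this gives two full 128 MB blocks and one 44 MB block. Because every block lives on three different nodes, losing any single node still leaves two copies of each block.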
Application
• Nowadays Hadoop technology is used in healthcare systems, for example in cancer treatment and genomics, in monitoring patient vitals, in hospital networks, and in fraud prevention and detection.
• It is used by Facebook, Yahoo, Google, Twitter, LinkedIn and many more.
Hadoop Installation
• Environment required for Hadoop:
  ◦ To install Hadoop in a UNIX environment you need:
    ◦ Java installation
    ◦ SSH installation
    ◦ Hadoop installation and file configuration
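Before configuring Hadoop it can help to confirm that the three prerequisites listed above are actually available. The sketch below assumes the executables are named `java`, `ssh` and `hadoop` and are expected on the `PATH`; your installation may use different names or locations, so treat the tool list as an assumption to adjust.

```python
import shutil

# Assumption: these executable names match the prerequisites on this machine.
REQUIRED_TOOLS = ("java", "ssh", "hadoop")


def check_prerequisites(tools=REQUIRED_TOOLS):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


if __name__ == "__main__":
    for tool, path in check_prerequisites().items():
        print(f"{tool:8s} {path if path else 'NOT FOUND'}")
```

A tool reported as not found only means it is not on the `PATH` of the current shell; it may still be installed elsewhere.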