Introduction to Big Data and Hadoop
Agenda
• What is Big Data?
• Facts about Big Data
• Need for Big Data
• Hadoop Overview
• Big Data Course at GreyCampus
• Understanding HDFS and MapReduce
• Enterprise Hadoop
• Hadoop Career Opportunities
• Pre-Requisites to learn Big Data
What is Big Data?
• Big data is a buzzword describing a volume of data so massive that it is
difficult to process using traditional database and software
techniques
• In most enterprise scenarios, the volume of data is too big, it moves
too fast, or it exceeds current processing capacity
• Despite these problems, big data has the potential to help companies
improve operations and make faster, more intelligent decisions
• Big Data has 3 attributes – Volume, Velocity and Variety
• Applications generating massive data in terabytes and petabytes
• Stock market generates 1 terabyte of data per day
Facts about Big Data
Need for Big Data
• The digital world currently holds 4 zettabytes of data
• Predictions suggest it might reach 40 zettabytes by 2020
• The volume of data doubles every 2 years
• We need mechanisms to store this data
• We need to process this data in near real time to help businesses
make informed decisions
• Fact: We are generating big data and we need faster information
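The growth figures above can be checked with a little arithmetic. The sketch below assumes steady exponential growth, taking only the 4 and 40 zettabyte figures and the 2-year doubling period from the slide. The function names `years_to_reach` and `projected_size` are made up for this illustration.

```python
import math


def years_to_reach(current_zb: float, target_zb: float,
                   doubling_years: float = 2.0) -> float:
    """Years needed to grow from current_zb to target_zb when the
    volume doubles every doubling_years (steady growth is assumed)."""
    if current_zb <= 0 or target_zb <= 0 or doubling_years <= 0:
        raise ValueError("sizes and doubling period must be positive")
    doublings = math.log2(target_zb / current_zb)
    return doublings * doubling_years


def projected_size(current_zb: float, years: float,
                   doubling_years: float = 2.0) -> float:
    """Projected volume after the given number of years."""
    return current_zb * 2 ** (years / doubling_years)


if __name__ == "__main__":
    # Figures from the slide: 4 ZB now, 40 ZB predicted, doubling every 2 years.
    print(f"Years from 4 ZB to 40 ZB: {years_to_reach(4, 40):.2f}")
    print(f"Projected size after 6 years: {projected_size(4, 6)} ZB")
```

Under these assumptions a tenfold increase takes a little over three doublings, that is, between six and seven years.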
Overview of Hadoop
Big Data Course at GreyCampus
Course Topics:
• Module 1
– Introduction to Big Data
• Module 2
– HDFS Architecture
• Module 3
– MapReduce
• Module 4
– Advanced MapReduce
• Module 5
– Hive
• Module 6
– Pig
• Module 7
– HBase and ZooKeeper
• Module 8
– Sqoop and Flume – Moving Data to and from HDFS
• Module 9
– Hadoop Ecosystem and Components – Introduction
• Module 10
– Commercial Distributions of Hadoop
Features of HDFS
• When a dataset outgrows the storage capacity of a single physical machine, it
becomes necessary to partition it across a number of separate machines
• File systems that manage the storage across a network of machines are called
distributed filesystems
• HDFS is a filesystem designed for storing very large files with streaming data
access patterns, running on clusters of commodity hardware
• There are Hadoop clusters running today that store petabytes of data
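To make the idea of partitioning a large file across machines concrete, here is a minimal sketch of how a distributed filesystem might split a file into fixed-size blocks and keep several copies of each block on different machines. It is not HDFS code. The 128 MB block size and the replication factor of 3 are assumptions chosen for illustration (check your cluster's configuration for the real values), and the round-robin placement is a simplification of the placement policy a real cluster uses. The names `Block` and `place_blocks` are hypothetical.

```python
from dataclasses import dataclass
from typing import List

BLOCK_SIZE = 128 * 1024 * 1024  # assumed block size in bytes (illustrative)
REPLICATION = 3                 # assumed number of copies per block (illustrative)


@dataclass
class Block:
    index: int            # position of the block within the file
    length: int           # number of bytes stored in this block
    replicas: List[str]   # names of the machines holding a copy


def place_blocks(file_size: int, nodes: List[str],
                 block_size: int = BLOCK_SIZE,
                 replication: int = REPLICATION) -> List[Block]:
    """Split a file of file_size bytes into blocks and assign each block
    to `replication` distinct nodes using simple round-robin placement."""
    if replication > len(nodes):
        raise ValueError("not enough nodes for the requested replication")
    blocks = []
    offset = 0
    index = 0
    while offset < file_size:
        # The last block may be shorter than block_size.
        length = min(block_size, file_size - offset)
        replicas = [nodes[(index + r) % len(nodes)] for r in range(replication)]
        blocks.append(Block(index, length, replicas))
        offset += length
        index += 1
    return blocks


if __name__ == "__main__":
    cluster = ["node1", "node2", "node3", "node4"]
    for block in place_blocks(300 * 1024 * 1024, cluster):
        print(block)
```

Because every block lives on more than one machine, losing a single commodity machine does not lose data, which is why such a filesystem can run on inexpensive hardware.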
Features of MapReduce
• Developers don’t have to worry about the plumbing for their jobs
• No threads, inter-process communication, or semaphores to program
• Just write programs that process part of your input files and produce the output
• The mappers and reducers share nothing: each mapper is independent of what the
other mappers do, and each reducer is independent of the other reducers
• So the mappers and reducers can run massively in parallel
• The MapReduce system is built to handle failure
• The system is robust: it handles failures automatically, so users don’t have to
take any action
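The division of labour described above can be sketched in plain Python with a word count. This is a single-process simulation, not the Hadoop API: the developer writes only `map_words` and `reduce_counts`, while `run_job` and `run_with_retry` stand in for the framework's plumbing (grouping values by key and re-running failed tasks). All function names here are hypothetical, and the retry limit of 3 is an assumption.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple


def map_words(chunk: str) -> List[Tuple[str, int]]:
    """Mapper: sees only its own chunk and emits (word, 1) pairs."""
    return [(word.lower(), 1) for word in chunk.split()]


def reduce_counts(word: str, counts: Iterable[int]) -> Tuple[str, int]:
    """Reducer: sees only the values for one key and sums them."""
    return word, sum(counts)


def run_with_retry(task: Callable, arg, max_attempts: int = 3):
    """Framework stand-in: re-run a failed task so the user takes no action."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return task(arg)
        except Exception as err:  # a real framework would also reschedule elsewhere
            last_error = err
    raise RuntimeError("task failed after retries") from last_error


def run_job(chunks: List[str]) -> Dict[str, int]:
    """Framework stand-in: map each chunk, group by key, then reduce each key."""
    # Map phase: each chunk is processed independently, so in a real system
    # these calls could run in parallel on different machines.
    mapped = [run_with_retry(map_words, chunk) for chunk in chunks]

    # Shuffle phase: collect all values emitted for the same key.
    groups = defaultdict(list)
    for pairs in mapped:
        for key, value in pairs:
            groups[key].append(value)

    # Reduce phase: each key is reduced independently of the others.
    return dict(
        run_with_retry(lambda item: reduce_counts(*item), item)
        for item in groups.items()
    )


if __name__ == "__main__":
    print(run_job(["big data big", "data hadoop"]))
```

Because no mapper or reducer reads another's state, the sequential loops in `run_job` could be replaced by parallel execution without changing the two user-written functions.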
Enterprise Hadoop
Hadoop Career Advantages
More Job Opportunities!
Look who is hiring!
Hadoop means high on salary!
Transform your career
Future of Big Data
Careers in Big Data
• “By 2015, 4.4 million IT jobs globally will be created to support big
data, generating 1.9 million IT jobs in the United States,” said Peter
Sondergaard, senior vice president at Gartner and global head of
Research. “In addition, every big data-related role in the U.S. will
create employment for three people outside of IT, so over the next
four years a total of 6 million jobs in the U.S. will be generated by
the information economy.”
Careers in Big Data
• “But there is a challenge. There is not enough talent in the industry. Our
public and private education systems are failing us. Therefore, only one-
third of the IT jobs will be filled. Data experts will be a scarce, valuable
commodity,” Mr. Sondergaard said. “IT leaders will need immediate focus
on how their organization develops and attracts the skills required. These
jobs will be needed to grow your business. These jobs are the future of
the new information economy.”
Pre-Requisites
• Good programming skills
• Basic understanding of database management systems
• Knowledge of core Java (added advantage)

Editor's Notes

  1. Please change the image look – Downloaded from HP website
  2. Please change the image – copied from radar.oreilly.com
  3. Source: http://www.edureka.co/blog/5-reasons-to-learn-hadoop/
  4. http://www.slideshare.net/innotech_conference/psl-hadoop-032812