Hadoop Big Data Introduction and Training

Hadoop Big Data Introduction and Training Details

Published in: Education, Technology

  • 1. Hadoop – BIG DATA
  • 2. Do you know?
      • The total volume of electronic data stored is approximately 2 zettabytes (1 zettabyte = 1 billion TB)
      • Do you know how many photos Facebook hosts? 10 billion photos, nearly 1 PB
      • Do you know how much data the Internet Archive stores every day? Around 2 PB of data, and it is growing at a rate of 20 PB
      • 15 million smart meters (US) generate data at a rate of 3 GB per second
      • Events collected through user interaction on sites are generated at a rate of 1.5 GB per second
  • 3. What is BIG DATA?
      • Volume
      • Velocity
      • Variety
  • 4. Challenges?
      • Complex
      • Near-real-time analytics
      • Storage
      • Computation
  • 5. Who is using Hadoop?
      • Amazon
      • Facebook
      • Google
      • IBM
      • Yahoo!
      • New York Times
      • PowerSet
      • Veoh
  • 6. What makes Hadoop special?
      • No high-end or expensive systems are required
          – Built on commodity hardware
      • Can run on Linux, Mac OS X, Windows, Solaris
      • Fault-tolerant system
          – Execution of the job continues even if nodes fail
      • Highly reliable and efficient storage system
      • Built-in intelligence to speed up the application
          – Speculative execution
      • Fits many applications:
          – Web log processing
          – Page indexing, page ranking
          – Complex event processing
  • 7. Overview of HDFS architecture
  • 8. Overview of MapReduce Programming Model
  • 9. Does Hadoop solve everyone's problem?
      • I am a DB guy. I am proficient in writing SQL and try very hard to optimize my queries, but I am still not able to do so. Moreover, I am not a Java geek. Will this solve my problem?
          – Use Hive/HBase
      • Hadoop is written in Java, and I come purely from a C++ background. How can I use Hadoop for my big data problems?
          – Use Hadoop Pipes
      • I am a statistician and I know only R. How can I write MR jobs in R?
          – Use the RHIPE package
      • What about Python, Scala, Ruby, and other programmers? Does Hadoop support all of these?
          – Use Hadoop Streaming
  • 10. Training Links
      • Course Details:
      • Sample Session:
      • Hadoop Installation Lab: (3000+ YouTube hits)
      • Hadoop HDFS File System Lab:
      • Case Study:
      • LinkedIn Group (real-time discussion)
      • Please join the LinkedIn group for regular updates on my learning in Hadoop / Big Data real-time work.
  • 11. Course Material
      • Recordings – all sessions, 40 hours
      • Exercises – 30+, fully solved
      • Certification questions – 2 sets
      • Resumes – 2 sets
      • Online case study – insurance domain
      • Virtual machine – Red Hat OS (Oracle VirtualBox Manager)
      • LinkedIn group discussion – Online Hadoop Learning
  • 12. Training Details
      • GoToMeeting
      • 40 – 45 hours
      • 1 hour weekdays / weekends
      • Contact:
  • 13. Thank You
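
Slide 8 names the MapReduce programming model, and slide 9 points Python programmers to Hadoop Streaming, but neither shows what a job looks like. The following is a minimal single-process sketch of the three phases (map, shuffle, reduce) applied to word counting. It is plain Python and does not call Hadoop at all; the function names `mapper`, `shuffle`, `reducer`, and `word_count` are illustrative assumptions, not Hadoop APIs, and the sample input is made up.

```python
from itertools import groupby
from operator import itemgetter


def mapper(lines):
    """Map phase: emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle(pairs):
    """Shuffle phase: sort pairs by key so that equal keys become adjacent.

    In a real cluster the framework does this between map and reduce;
    here it is simulated with an in-memory sort.
    """
    return sorted(pairs, key=itemgetter(0))


def reducer(sorted_pairs):
    """Reduce phase: sum the counts for each run of identical keys."""
    for word, group in groupby(sorted_pairs, key=itemgetter(0)):
        yield word, sum(count for _, count in group)


def word_count(lines):
    """Chain the three phases and collect the result into a dict."""
    return dict(reducer(shuffle(mapper(lines))))


if __name__ == "__main__":
    # Hypothetical sample input, standing in for lines of a large file.
    sample = ["big data big clusters", "hadoop stores big data"]
    for word, total in sorted(word_count(sample).items()):
        print(f"{word}\t{total}")
```

With Hadoop Streaming, the mapper and reducer would instead be separate scripts that read lines from standard input and write tab-separated key/value lines to standard output, while the framework handles the sort between them. The sketch keeps everything in one process only to make the data flow visible.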