Big time: Introducing Hadoop on Azure

Introduction to HDInsight service (aka Hadoop on Azure)


Transcript

  1. Big Data
  2. The problem is simple
     • While the storage capacities of hard drives have increased massively over the years, access speeds (the rate at which data can be read from drives) have not kept up.
     • One typical drive from 1990 could store 1,370 MB of data and had a transfer speed of 4.4 MB/s,
  3. • so you could read all the data from a full drive in around five minutes.
     • Over 20 years later, one-terabyte drives are the norm, but the transfer speed is around 100 MB/s, so it takes more than two and a half hours to read all the data off the disk.
  4. Go Parallel
  5. Cloud computing changes the way applications grow
     http://journals.worldnomads.com/davidsgibson/photo/22804/664941/USA/Elephant-shaped-cloud!
  6. BIG-TIME: Introducing Hadoop on Azure
     Yaniv Rodenski, Senior Consultant, Sela Group
     http://blogs.microsoft.co.il/blogs/roadan
     Twitter: @YRodenski
     yanivr@sela.co.il
     David Ginzburg, Big Data infrastructure consultant
     Twitter: @David_Ginzburg
     davidginzburg@gmail.com
  7. AGENDA
  8. Apache™ Hadoop™
  9. Apache™ Hadoop™
  10. Hadoop Distributed File System (HDFS) HDFS Client
  11. Hadoop Distributed File System (HDFS) HDFS Client
  12. Hadoop Distributed File System (HDFS) HDFS Client
  13. MapReduce via WordCount (diagram; input lines "Hello World", "Hello Azure", "Goodbye Cruel World", with per-word counts)
  14. DEMO: A new way to MapReduce
  15. Hadoop MapReduce Processing (diagram labels: Input, Split, Merge)
  16. Hadoop MapReduce Processing Job Client
  17. MapReduce TMI (diagram labels: Input Split, Buffer, "Partition, Sort, and spill to disk", Fetch)
  18. MapReduce TMI (diagram labels: MapOutput, Merge, Sort, result, Output)
  19. Partitioners
  20. Combiners
  21. The TeraSort Use case
  22. The TeraSort Use case
  23. Beginners' Pitfalls
  24. Beginners' Pitfalls
  25. Distinct Values Problem Statement
      http://highlyscalable.wordpress.com/2012/02/01/mapreduce-patterns/
  26. Distinct Values Problem Statement
      http://highlyscalable.wordpress.com/2012/02/01/mapreduce-patterns/
  27. Distinct Values Problem Statement
      http://highlyscalable.wordpress.com/2012/02/01/mapreduce-patterns/
  28. Distinct Values Problem Statement
      http://highlyscalable.wordpress.com/2012/02/01/mapreduce-patterns/
  29. DEMO: Administering Hadoop in the real world
  30. Why did Microsoft choose Hadoop?
  31. Hadoop on Azure
  32. DEMO: Using hadooponazure.com
  33. Windows Azure Compute Supporting service Application Configuration
  34. Hadoop on Azure Roles Monitoring service (RdAdmin) Hadoop services Configuration
  35. Hadoop MapReduce Processing Fabric Controller
  36. Hadoop MapReduce Processing Fabric Controller
  37. Hadoop MapReduce Processing Fabric Controller
  38. The Head Node Template
  39. The Worker Node Template
  40. Node VM Templates
  41. Cloud Storage
  42. High Availability on Azure Azure Storage Fabric Controller
  43. Elastic MapReduce
  44. Elastic MapReduce Storage Client Azure Amazon Storage S3
  45. Elastic MapReduce Storage Client Azure Amazon Storage S3
  46. Elastic MapReduce Storage Client Azure Amazon Storage S3 $ $ $ $ $ $ $ $
  47. DEMO: Using Elastic MapReduce
  48. Azure Blob Considerations
  49. Storage Size Limitations
  50. IsotopeJS
  51. DEMO: Using the JavaScript interactive console
  52. DEMO: Using Hive
  53. Summary
  54. Q&A
  55. Resources
      • My Blog: http://bit.ly/roadan
      • Windows Azure Developer center: http://www.windowsazure.com/en-us/develop/overview
      • Apache™ Hadoop™: http://hadoop.apache.org
      • Hadoop on Azure: http://www.hadooponazure.com
      • Hadoop: The Definitive Guide, Tom White: http://shop.oreilly.com/product/9780596521981.do
      Thanks! Yaniv Rodenski, Twitter: @YRodenski
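
The drive figures on slides 2 and 3 can be checked with a few lines of arithmetic. The sketch below is not from the deck. It assumes "one terabyte" means 1,000,000 MB (decimal units), and the 100-drive parallel case is a hypothetical illustration of the "go parallel" idea on slide 4, not a number quoted by the speakers.

```python
def full_read_seconds(capacity_mb: float, rate_mb_per_s: float) -> float:
    """Seconds needed to read a full drive sequentially at a constant rate."""
    return capacity_mb / rate_mb_per_s


# Figures quoted on slides 2 and 3.
drive_1990 = full_read_seconds(1_370, 4.4)

# Assumption: one terabyte is taken as 1,000,000 MB (decimal units).
drive_1tb = full_read_seconds(1_000_000, 100)

# Hypothetical: the same terabyte spread evenly over 100 drives that are
# read in parallel, so each drive only has to deliver 1/100 of the data.
drive_count = 100
parallel = full_read_seconds(1_000_000 / drive_count, 100)

print(f"1990 drive:       {drive_1990 / 60:.1f} minutes")
print(f"1 TB drive:       {drive_1tb / 3600:.1f} hours")
print(f"{drive_count} drives at once: {parallel / 60:.1f} minutes")
```

Under these assumptions the single-drive results match the slides (about five minutes, and a little over two and a half hours), while the parallel read finishes in under two minutes.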

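
The WordCount example on slide 13 can be sketched in plain Python to show the three stages the diagram walks through: map, shuffle and sort, then reduce. This is an in-memory illustration only, not Hadoop code and not the demo from the talk. The function names `map_phase`, `shuffle` and `reduce_phase` are hypothetical, and the three input lines are the ones visible on the slide.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group values by key and sort the keys, as the framework would
    do between the map and reduce stages."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(sorted(groups.items()))


def reduce_phase(word: str, counts: List[int]) -> Tuple[str, int]:
    """Sum all the counts collected for one word."""
    return word, sum(counts)


# Input lines as they appear on slide 13.
lines = ["Hello World", "Hello Azure", "Goodbye Cruel World"]

mapped = [pair for line in lines for pair in map_phase(line)]
grouped = shuffle(mapped)
word_counts = dict(reduce_phase(word, counts) for word, counts in grouped.items())

print(word_counts)
# {'Azure': 1, 'Cruel': 1, 'Goodbye': 1, 'Hello': 2, 'World': 2}
```

In a real cluster each input split would be mapped on a different node and the grouped keys would be divided among several reducers. Here everything runs in one process so the data flow stays easy to follow.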