HDInsight Hadoop on Windows Azure


Introduction to HDInsight Hadoop on Windows Azure services, including using the interactive console with JavaScript and running WordCount via other methods (Streaming, Hive, etc.).

Published in: Technology
  • http://hadoop.apache.org/ and http://en.wikipedia.org/wiki/Apache_Hadoop
  • http://www.infosysblogs.com/microsoft/2011/12/isotope_hadoop_on_windows_and.html
  • Original reference: Tom White’s Hadoop: The Definitive Guide (I made some modifications based on my experience)
  • https://www.hadooponazure.com/AccountDemo - http://www.youtube.com/watch?v=XcHz8aUDDN8 and http://www.youtube.com/watch?v=c7oHntP8HBI
  • http://www.windowsazure.com/en-us/manage/services/hdinsight/howto-blob-store/
  • http://blog.gopivotal.com/products/hadoop-101-programming-mapreduce-with-native-libraries-hive-pig-and-cascading
  • http://www.windowsazure.com/en-us/manage/services/hdinsight/using-pig-with-hdinsight/
  • http://www.microsoft.com/en-us/bi/default.aspx and http://dennyglee.com/Demos - http://www.youtube.com/watch?v=djfpPsGwm6A and http://www.youtube.com/watch?v=uh9bKWO1K7U
  • Detailed info - http://dennyglee.com/2012/01/21/connecting-powerpivot-to-hadoop-on-azure-self-service-bi-to-big-data-in-the-cloud/
  • http://www.youtube.com/watch?v=eRXEA9-l2eQ and http://thinknook.com/architectures-for-running-sql-server-analysis-service-ssas-on-data-in-hadoop-hive-2013-02-25/#!prettyPhoto

    1. Hadoop on Azure @LynnLangit
    2. Data Expertise / Lynn Langit
        • Practicing Architect
        • Cloud Deployments (Azure, AWS, Google)
        • Technical author / trainer
        • Google Cloud Developer Series
        • SQL Server 2012 Developer Series
        • Cloudera Certified Developer
        • 2 books on SQL Server BI
        • Industry awards
        • Microsoft – MVP for SQL Server
        • Google – GDE for Cloud Platform
        • 10Gen – Master for MongoDB
        • Former MSFT FTE, 4 years
    3. What is Hadoop?
        • HUGE hype factor in 2011 / 2012
        • Apache Hadoop is a software framework that supports data-intensive distributed applications under a free license
        • Uses HDFS storage to enable applications to work with thousands of nodes and petabytes of data
        • Uses MapReduce to process the data
        • Inspired by Google
        • MapReduce
        • Google File System
    4. What is HDInsight?
        • Hadoop on Windows
        • Azure
        • On-premise
        • Microsoft worked with Hortonworks to port Hadoop to Windows (from Linux)
    5. Working with HDInsight
    6. RDBMS vs. Hadoop
        Aspect              | RDBMS                   | Hadoop
        Data Size           | Gigabytes (Terabytes)   | Petabytes (Exabytes)
        Access              | Interactive and Batch   | Batch – NOT Interactive
        Updates             | Read / Write many times | Write once, Read many times
        Structure           | Static Schema           | Dynamic Schema
        Integrity           | High (ACID)             | Low
        Scaling             | Nonlinear               | Linear
        Query Response Time | Can be near immediate   | Has latency (due to batch processing)
    7. Setting Up Your Cluster
    8. Configuration 1
    9. Configuration 2
    10. Pricing (during Preview)
    11. Demo
    12. Basic Administration: Connect via RDP
    13. NameNode Utility – Top Level
    14. NameNode Utility – Drill Down
    15. Understanding Storage
    16. Using the Azure Storage Viewer
    17. What is MapReduce?
    18. MapReduce using Java: WordCount example
    19. MapReduce using C# Streaming: WordCount example
    20. MapReduce using JavaScript: WordCount example
    21. Simple Output Graphing: WordCount example
    22. Using HIVE
    23. Understanding Pig: Load > Transform > Dump or Store
    24. Monitoring Job Results
        • In the portal
        • Main Console
        • Job icon (button) status summary
        • Job History
        • Interactive Console
        • JS quick feedback
        • JS detailed feedback (log)
        • Using RDP
        • Map/Reduce tool
        • Hadoop command prompt
    25. Monitoring Job Status
    26. Download – ODBC for HIVE
        • Includes add-in for Excel
    27. Hadoop Connector to Excel
    28. Connecting to PowerPivot
        • Create an ODBC connection to HIVE
        • Connect to ‘other data source’ in PowerPivot
    29. Connecting with PowerQuery
    30. Pulling it Together - Klout
    31. Hadoop To-Do List
        BigData = Hadoop
        • Use Hadoop when business needs designate
        • Use other NoSQL if a better fit
        Hadoop on the cloud
        • Quick and cheap
        • Specialized use cases
        • Behavioral data
        • dev, test, training environments
        Hadoop access technologies
        • Learn Map/Reduce
        • Use HIVE via Excel
        • Pay attention to Impala
    32. www.TeachingKidsProgramming.org
    33. VOTE CONFIRM SHARE
    34. Keep Learning
        • @LynnLangit
        • YouTube – SoCalDevGal
        • Hire Me
        • Architecture
        • Best Practices
        • Performance Tuning
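
Slides 17 to 20 use WordCount as the running MapReduce example, but the code shown on those slides is not part of this transcript. As a rough illustration of the idea only, here is a minimal in-memory sketch in Python. It does not use the Hadoop API, Hadoop Streaming, or anything from the deck; the function names (map_words, shuffle, reduce_counts, word_count) are hypothetical and chosen for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_words(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit (word, 1) for every whitespace-separated word."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group all emitted values by key.

    On a real cluster the framework does this between map and reduce;
    here it is simulated with a dictionary (an assumption of this sketch).
    """
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(word: str, counts: List[int]) -> Tuple[str, int]:
    """Reduce step: sum the values collected for one word."""
    return word, sum(counts)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce over an iterable of text lines."""
    pairs = (pair for line in lines for pair in map_words(line))
    grouped = shuffle(pairs)
    return dict(reduce_counts(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    sample = ["the quick brown fox", "the lazy dog", "The end"]
    for word, count in sorted(word_count(sample).items()):
        print(f"{word}\t{count}")
```

The map and reduce functions never share state, which is what lets a framework such as Hadoop run many copies of each across the nodes of a cluster. The Java, C# Streaming, and JavaScript variants named in the slides express the same two steps in their own languages.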
