HDInsight Hadoop on Windows Azure

Introduction to HDInsight Hadoop on Windows Azure services, including using the interactive console with JavaScript and running WordCount via other methods (Streaming, Hive, etc.)

Published in: Technology

  • http://hadoop.apache.org/ and http://en.wikipedia.org/wiki/Apache_Hadoop
  • http://www.infosysblogs.com/microsoft/2011/12/isotope_hadoop_on_windows_and.html
  • Original reference: Tom White’s Hadoop: The Definitive Guide (I made some modifications based on my experience)
  • https://www.hadooponazure.com/AccountDemo - http://www.youtube.com/watch?v=XcHz8aUDDN8 and http://www.youtube.com/watch?v=c7oHntP8HBI
  • http://www.windowsazure.com/en-us/manage/services/hdinsight/howto-blob-store/
  • http://blog.gopivotal.com/products/hadoop-101-programming-mapreduce-with-native-libraries-hive-pig-and-cascading
  • http://www.windowsazure.com/en-us/manage/services/hdinsight/using-pig-with-hdinsight/
  • http://www.microsoft.com/en-us/bi/default.aspx and http://dennyglee.com/Demos - http://www.youtube.com/watch?v=djfpPsGwm6A and http://www.youtube.com/watch?v=uh9bKWO1K7U
  • Detailed info - http://dennyglee.com/2012/01/21/connecting-powerpivot-to-hadoop-on-azure-self-service-bi-to-big-data-in-the-cloud/
  • http://www.youtube.com/watch?v=eRXEA9-l2eQ and http://thinknook.com/architectures-for-running-sql-server-analysis-service-ssas-on-data-in-hadoop-hive-2013-02-25/#!prettyPhoto
Transcript of "HDInsight Hadoop on Windows Azure"

    1. Hadoop on Azure (@LynnLangit)
    2. Data Expertise / Lynn Langit
         • Practicing Architect
         • Cloud Deployments (Azure, AWS, Google)
         • Technical author / trainer: Google Cloud Developer Series, SQL Server 2012 Developer Series
         • Cloudera Certified Developer
         • 2 books on SQL Server BI
         • Industry awards: Microsoft MVP for SQL Server, Google GDE for Cloud Platform, 10Gen Master for MongoDB
         • Former MSFT FTE (4 years)
    3. What is Hadoop?
         • HUGE hype factor in 2011 / 2012
         • Apache Hadoop is a software framework that supports data-intensive distributed applications under a free license
         • Uses HDFS storage to enable applications to work with thousands of nodes and petabytes of data
         • Uses MapReduce to process the data
         • Inspired by Google: MapReduce and the Google File System
    4. What is HDInsight?
         • Hadoop on Windows
         • Azure
         • On-premises
         • Microsoft worked with Hortonworks to port Hadoop to Windows (from Linux)
    5. Working with HDInsight
    6. RDBMS vs. Hadoop

                                RDBMS                       Hadoop
         Data Size              Gigabytes (Terabytes)       Petabytes (Exabytes)
         Access                 Interactive and Batch       Batch – NOT Interactive
         Updates                Read / Write many times     Write once, Read many times
         Structure              Static Schema               Dynamic Schema
         Integrity              High (ACID)                 Low
         Scaling                Nonlinear                   Linear
         Query Response Time    Can be near immediate       Has latency (due to batch processing)

    7. Setting Up Your Cluster
    8. Configuration 1
    9. Configuration 2
    10. Pricing (during Preview)
    11. Demo
    12. Basic Administration: Connect via RDP
    13. NameNode Utility – Top Level
    14. NameNode Utility – Drill Down
    15. Understanding Storage
    16. Using the Azure Storage Viewer
    17. What is MapReduce?
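
Slide 17 is a title only in this transcript. As a rough illustration of the idea it names, the sketch below runs the three conceptual phases of a MapReduce word count (map, shuffle, reduce) in memory on a single machine. The function names and the sample sentences are illustrative assumptions and are not taken from the deck; a real Hadoop job distributes these phases across the nodes of the cluster.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group emitted values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single total."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over an iterable of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, not from the slides.
    print(word_count(["the quick brown fox", "the lazy dog", "The end"]))
```

Because each call to `map_phase` depends only on its own line, and each call to `reduce_phase` only on its own key, both phases can be spread over many machines, which is where the linear scaling mentioned on slide 6 comes from.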
    18. MapReduce using Java: WordCount example
    19. MapReduce using C# Streaming: WordCount example
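
The deck's streaming WordCount is written in C#, and its code is not part of this transcript. The sketch below uses Python instead to show the general Hadoop Streaming contract: the mapper reads text lines and writes tab-separated key/value records, the framework sorts those records by key, and the reducer reads the sorted records and sums the counts. The function names and the `map` / `reduce` command-line arguments are assumptions made for this example.

```python
import sys
from itertools import groupby


def stream_mapper(lines):
    """Mapper: yield one 'word<TAB>1' record per word in the input lines."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def stream_reducer(sorted_records):
    """Reducer: records arrive sorted by key, so equal words are adjacent and can be summed."""
    parsed = (record.rstrip("\n").split("\t", 1) for record in sorted_records)
    for word, group in groupby(parsed, key=lambda pair: pair[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


if __name__ == "__main__":
    role = sys.argv[1] if len(sys.argv) > 1 else "local"
    if role == "map":
        # Used as the streaming mapper: stdin -> stdout.
        for record in stream_mapper(sys.stdin):
            print(record)
    elif role == "reduce":
        # Used as the streaming reducer: sorted stdin -> stdout.
        for record in stream_reducer(sys.stdin):
            print(record)
    else:
        # Local simulation of mapper | sort | reducer on hypothetical input.
        for record in stream_reducer(sorted(stream_mapper(["a b a", "B c"]))):
            print(record)
```

The same two roles can be played by any executable that reads standard input and writes standard output, which is what makes a C# console program usable as a streaming mapper or reducer.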
    20. MapReduce using JavaScript: WordCount example
    21. Simple Output Graphing: WordCount example
    22. Using HIVE
    23. Understanding Pig: Load > Transform > Dump or Store
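
The Load > Transform > Dump or Store flow on slide 23 can be pictured with a small Python analogy. This is not Pig Latin and does not use any Pig API; it only mirrors the shape of a Pig script, where data is loaded into a relation, transformed step by step (for example with a filter and a grouping), and finally either dumped to the screen or stored to a file. The record layout (user, action) and all function names are hypothetical.

```python
from collections import Counter


def load(lines):
    """Load: parse tab-separated text lines into (user, action) records."""
    return [tuple(line.rstrip("\n").split("\t")) for line in lines if line.strip()]


def transform(records):
    """Transform: keep only 'click' actions, then count them per user."""
    clicks = (user for user, action in records if action == "click")
    return sorted(Counter(clicks).items())


def dump(rows):
    """Dump: print the result for interactive inspection."""
    for row in rows:
        print(row)


def store(rows, path):
    """Store: write the result to a tab-separated file instead of the screen."""
    with open(path, "w", encoding="utf-8") as out:
        for user, count in rows:
            out.write(f"{user}\t{count}\n")


if __name__ == "__main__":
    # Hypothetical sample input, not from the slides.
    raw = ["ann\tclick", "bob\tview", "ann\tclick", "bob\tclick"]
    dump(transform(load(raw)))
```

Dump suits exploratory work on small results, while Store is the usual choice once the output is large or feeds another job.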
    24. Monitoring Job Results
         • In the portal
              • Main Console
              • Job icon (button) status summary
              • Job History
         • Interactive Console
              • JS quick feedback
              • JS detailed feedback (log)
         • Using RDP
              • Map/Reduce tool
              • Hadoop command prompt
    25. Monitoring Job Status
    26. Download – ODBC for HIVE
         • Includes add-in for Excel
    27. Hadoop Connector to Excel
    28. Connecting to PowerPivot
         • Create an ODBC connection to HIVE
         • Connect to ‘other data source’ in PowerPivot
    29. Connecting with PowerQuery
    30. Pulling it Together - Klout
    31. Hadoop To-Do List
         • BigData = Hadoop
              • Use Hadoop when business needs designate
              • Use other NoSQL if a better fit
         • Hadoop on the cloud
              • Quick and cheap
              • Specialized use cases
              • Behavioral data
              • Dev, test, training environments
         • Hadoop access technologies
              • Learn Map/Reduce
              • Use HIVE via Excel
              • Pay attention to Impala
    32. www.TeachingKidsProgramming.org
    33. VOTE, CONFIRM, SHARE
    34. Keep Learning
         • @LynnLangit
         • YouTube – SoCalDevGal
         • Hire Me: Architecture, Best Practices, Performance Tuning