Hadoop on Azure: BigData on the Azure platform (@LynnLangit)
60 minute webcast for DevelopMentor - Hadoop on Azure

Published in: Technology
  • http://hortonworks.com/blog/7-key-drivers-for-the-big-data-market/
  • http://hadoop.apache.org/ and http://en.wikipedia.org/wiki/Apache_Hadoop
  • http://www.oracle.com/technetwork/bdc/hadoop-loader/overview/index.html and http://www.microsoft.com/download/en/details.aspx?id=27584
  • http://bigdatanerd.wordpress.com/2012/01/04/why-nosql-part-2-overview-of-data-modelrelational-nosql/ and http://docs.jboss.org/hibernate/ogm/3.0/reference/en-US/html_single/
  • http://stage.hypertable.com/index.php/documentation/architecture/ and http://code.google.com/appengine/ and http://code.google.com/appengine/articles/datastore/overview.html
  • Original reference: Tom White’s Hadoop: The Definitive Guide (I made some modifications based on my experience)
  • But…is it cheaper? It is, right now on AWS (i.e. MapReduce vs. RDS). However, pricing has not been announced for Hadoop on Azure.
  • https://www.hadooponazure.com/Account ; Demo - http://www.youtube.com/watch?v=ugi9C6s_sH4
  • http://www.infosysblogs.com/microsoft/2011/12/isotope_hadoop_on_windows_and.html
  • DBA tasks originally from SQL PASS Summit 2011, by Steve Jones, Editor, SQLServerCentral / Red Gate Software
  • https://www.hadooponazure.com/Account ; Demo - http://www.youtube.com/watch?v=XcHz8aUDDN8 and http://www.youtube.com/watch?v=c7oHntP8HBI
  • When the volume of data is too much for simple human interpretation -> Man PLUS Machine (Data Mining / Statistics)
  • http://www.microsoft.com/en-us/bi/default.aspx and http://dennyglee.com/ ; Demos - http://www.youtube.com/watch?v=djfpPsGwm6A and http://www.youtube.com/watch?v=uh9bKWO1K7U
  • Detailed info - http://dennyglee.com/2012/01/21/connecting-powerpivot-to-hadoop-on-azure-self-service-bi-to-big-data-in-the-cloud/
  • http://en.wikipedia.org/wiki/Apache_Hadoop
  • http://www.slideshare.net/mattaslett/mysql-vs-nosql-and-newsql-survey-results-13073043
  • http://www.monafoundation.org/project/Teaching-Kids-Programming/22

    1. Hadoop on Azure: BigData on the Azure platform (@LynnLangit)
    2. Hadoop = BigData?
       • HUGE hype factor in 2011 / 2012
       Apache Hadoop is a software framework that supports data-intensive distributed applications under a free license. It
       • enables applications to work with thousands of nodes and petabytes of data
       • was inspired by Google's MapReduce and Google File System (GFS) papers
    3. Oracle Loader for Hadoop; SQL Server Connector for Hadoop
    4. Flavors of NoSQL
    5. Column Database: wide, sparse column sets
    6. RDBMS vs. Hadoop

                              Traditional RDBMS          Hadoop
       Data Size              Gigabytes (Terabytes)      Petabytes (Exabytes)
       Access                 Interactive and Batch      Batch – NOT Interactive
       Updates                Read / Write many times    Write once, Read many times
       Structure              Static Schema              Dynamic Schema
       Integrity              High (ACID)                Low
       Scaling                Nonlinear                  Linear
       Query Response Time    Can be near immediate      Has latency (due to batch processing)

    7. What about the cloud?
    8. The reality…two pivots
    9. Demo - Setting up Your Cluster
    10. Cluster Allocation Process
    11. Working with Hadoop on Azure: Tools / Languages
        • MapReduce
          • Map (query/format)
          • Reduce (aggregate)
          • plug-in for Eclipse (Java)
          • JavaScript
          • C# Streaming
        • Pig (ETL -- Java)
        • Hive (HQL Query)
        • HBase tables
        • Others
          • Mahout (analyze)
          • R (analyze)
    12. Tasks – DBA vs. Hadoop on Azure

        RDBMS                            Hadoop on Azure
        Import Data                      Upload Data using FTP or import via Sqoop
        Set Up Security                  Set Up Security
        Scale Compute (up or out)        Add child nodes to the cluster
        Perform a Backup                 Monitor and replace failed nodes
        Restore a Database               n/a
        Clean up data via ETL            Execute a PIG job
        Create an Index – query tune     Write a HIVE query (HQL)
        Join Tables Together             Run MapReduce
        n/a                              Monitor and manage running MapReduce jobs
        Schedule a Job                   Schedule a (Cron) Job
        Run Database Maintenance         Monitor space and resources used
        Send an Email from SQL Server    Set up resource threshold alerts
        Manage License costs             Manage usage time charges

    13. Demo - Basic Administration: Open Ports
    14. Demo - Basic Administration: Connect via RDP
    15. NameNode Utility – Top Level
    16. NameNode Utility – Drill Down
    17. Demo - Basic Administration: Configure connections to remote storage
    18. Configuring Upload from AWS S3
    19. Configuring Upload from Azure
    20. Using the Azure Storage Viewer
    21. Configuring Upload from DataMarket
    22. Asking Questions = MapReduce
    23. Samples
    24. Demo - MapReduce using Java
        • WordCount example using AWS S3 data
    25. Demo - MapReduce using C# Streaming
        • WordCount example
    26. Demo - MapReduce using JavaScript
        • WordCount example
    27. Demo - Using HIVE
        • WordCount example
    28. Demo - Using HIVE
    29. Monitoring Job Results
        • In the portal
          – Main Console
            • Job icon (button) status summary
            • Job History
          – Interactive Console
            • JS quick feedback
            • JS detailed feedback (log)
        • Using RDP
          – Map/Reduce tool
    30. Demo – Monitoring Job Status
    31. Download – ODBC for HIVE
        • Includes add-in for Excel
    32. Demo - Hadoop Connector to Excel
    33. Connecting to PowerPivot
        • Create an ODBC connection to HIVE
        • Connect to ‘other data source’ in PowerPivot
    34. Real-World – Hadoop and…
    35. Hadoop To-Do List
        BigData = Hadoop
        • Use Hadoop when business needs designate
        • Specialized use cases
        • Behavioral data
        Hadoop on the cloud
        • Quick and cheap
        • dev, test, training environments
        Hadoop access technologies
        • Learn Map/Reduce
        • Use HIVE via Excel
    36. The Changing Data Landscape: RDBMS, Hadoop, Other Services
    37. TeachingKidsProgramming.org
        • Do a Recipe
        • Teach a Kid (Ages 10 ++)
        • SmallBasic or Java
        • Free Courseware (recipes)
    38. Toward Data Craftsmanship…
        • Follow me @LynnLangit
        • RSS my blog www.LynnLangit.com
        • Hire me
          – To help build your BI/Big Data solution
          – To teach your team next gen BI
          – To learn more about using NoSQL solutions
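
The WordCount demos on slides 24 to 26 all follow the Map (query/format) and Reduce (aggregate) pattern listed on slide 11. The sketch below illustrates that pattern in a single Python process. It is not the Java, C# Streaming, or JavaScript code shown in the webcast, and the function names (map_words, shuffle, reduce_counts, word_count) and the sample lines are assumptions made for this example. On a real cluster, Hadoop runs the map and reduce steps on many nodes and handles the sort between them.

```python
from itertools import groupby
from operator import itemgetter


def map_words(lines):
    """Map step: emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield (word, 1)


def shuffle(pairs):
    """Shuffle step: sort pairs by key so identical words sit next to each other."""
    return sorted(pairs, key=itemgetter(0))


def reduce_counts(sorted_pairs):
    """Reduce step: sum the counts for each run of identical words."""
    for word, group in groupby(sorted_pairs, key=itemgetter(0)):
        yield (word, sum(count for _, count in group))


def word_count(lines):
    """Run map, shuffle, and reduce in one process and return a dict of counts."""
    return dict(reduce_counts(shuffle(map_words(lines))))


if __name__ == "__main__":
    # Hypothetical sample input; a real job would read files from HDFS or remote storage.
    sample = ["Hadoop on Azure", "BigData on the Azure platform"]
    for word, count in sorted(word_count(sample).items()):
        print(f"{word}\t{count}")
```

The tab-separated word and count printed at the end mirrors the key/value line format that streaming-style jobs typically exchange, which is why the map and reduce steps are kept as separate functions here.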
