Hadoop on azure_july_2012
60 minute webcast for DevelopMentor - Hadoop on Azure
Published in: Technology
  • http://hortonworks.com/blog/7-key-drivers-for-the-big-data-market/
  • http://hadoop.apache.org/ and http://en.wikipedia.org/wiki/Apache_Hadoop
  • http://www.oracle.com/technetwork/bdc/hadoop-loader/overview/index.html and http://www.microsoft.com/download/en/details.aspx?id=27584
  • http://bigdatanerd.wordpress.com/2012/01/04/why-nosql-part-2-overview-of-data-modelrelational-nosql/ and http://docs.jboss.org/hibernate/ogm/3.0/reference/en-US/html_single/
  • http://stage.hypertable.com/index.php/documentation/architecture/ and http://code.google.com/appengine/ and http://code.google.com/appengine/articles/datastore/overview.html
  • Original reference: Tom White’s Hadoop: The Definitive Guide (I made some modifications based on my experience)
  • But…is it cheaper? It is, right now on AWS (i.e. MapReduce vs. RDS). However, pricing has not been announced for Hadoop on Azure.
  • https://www.hadooponazure.com/AccountDemo - http://www.youtube.com/watch?v=ugi9C6s_sH4
  • http://www.infosysblogs.com/microsoft/2011/12/isotope_hadoop_on_windows_and.html
  • DBA tasks originally from SQL PASS Summit 2011, by Steve Jones, Editor, SQLServerCentral / Red Gate Software
  • https://www.hadooponazure.com/AccountDemo - http://www.youtube.com/watch?v=XcHz8aUDDN8 and http://www.youtube.com/watch?v=c7oHntP8HBI
  • When the volume of data is too much for simple human interpretation -> Man PLUS Machine (Data Mining / Statistics)
  • http://www.microsoft.com/en-us/bi/default.aspx and http://dennyglee.com/Demos - http://www.youtube.com/watch?v=djfpPsGwm6A and http://www.youtube.com/watch?v=uh9bKWO1K7U
  • Detailed info - http://dennyglee.com/2012/01/21/connecting-powerpivot-to-hadoop-on-azure-self-service-bi-to-big-data-in-the-cloud/
  • http://en.wikipedia.org/wiki/Apache_Hadoop
  • http://www.slideshare.net/mattaslett/mysql-vs-nosql-and-newsql-survey-results-13073043
  • http://www.monafoundation.org/project/Teaching-Kids-Programming/22
  • Transcript

    • 1. Hadoop on Azure: BigData on the Azure platform (@LynnLangit)
    • 2. Hadoop = BigData?
        • HUGE hype factor in 2011 / 2012
        • Apache Hadoop is a software framework that supports data-intensive distributed applications under a free license
        • Enables applications to work with thousands of nodes and petabytes of data
        • Was inspired by Google's MapReduce and Google File System (GFS) papers
    • 3. Connectors
        • Oracle Loader for Hadoop
        • SQL Server Connector for Hadoop
    • 4. Flavors of NoSQL
    • 5. Column Database: wide, sparse column sets
    • 6. RDBMS vs. Hadoop
        Attribute | Traditional RDBMS | Hadoop
        Data Size | Gigabytes (Terabytes) | Petabytes (Exabytes)
        Access | Interactive and Batch | Batch – NOT Interactive
        Updates | Read / Write many times | Write once, Read many times
        Structure | Static Schema | Dynamic Schema
        Integrity | High (ACID) | Low
        Scaling | Nonlinear | Linear
        Query Response Time | Can be near immediate | Has latency (due to batch processing)
    • 7. What about the cloud?
    • 8. The reality…two pivots
    • 9. Demo - Setting up Your Cluster
    • 10. Cluster Allocation Process
    • 11. Working with Hadoop on Azure: Tools / Languages
        • MapReduce
            • Map (query/format)
            • Reduce (aggregate)
            • plug-in for Eclipse (Java)
            • JavaScript
            • C# Streaming
        • Pig (ETL -- Java)
        • Hive (HQL Query)
        • HBase tables
        • Others
            • Mahout (analyze)
            • R (analyze)
    • 12. Tasks – DBA vs. Hadoop on Azure
        RDBMS | Hadoop on Azure
        Import Data | Upload Data using FTP or import via Sqoop
        Setup Security | Setup Security
        Scale Compute (up or out) | Add child nodes to the cluster
        Perform a Backup | Monitor and replace failed nodes
        Restore a Database | n/a
        Clean up data via ETL | Execute a PIG job
        Create an Index – query tune | Write a HIVE query (HQL)
        Join Tables Together | Run MapReduce
        n/a | Monitor and manage running MapReduce jobs
        Schedule a Job | Schedule a (Cron) Job
        Run Database Maintenance | Monitor space and resources used
        Send an Email from SQL Server | Set up resource threshold alerts
        Manage License costs | Manage usage time charges
    • 13. Demo - Basic Administration: Open Ports
    • 14. Demo - Basic Administration: Connect via RDP
    • 15. NameNode Utility – Top Level
    • 16. NameNode Utility – Drill Down
    • 17. Demo - Basic Administration: Configure connections to remote storage
    • 18. Configuring Upload from AWS S3
    • 19. Configuring Upload from Azure
    • 20. Using the Azure Storage Viewer
    • 21. Configuring Upload from DataMarket
    • 22. Asking Questions = MapReduce
    • 23. Samples
    • 24. Demo - MapReduce using Java
        • WordCount example using AWS S3 data
    • 25. Demo - MapReduce using C# Streaming
        • WordCount example
    • 26. Demo - MapReduce using JavaScript
        • WordCount example
    • 27. Demo - Using HIVE
        • WordCount example
    • 28. Demo - Using HIVE
    • 29. Monitoring Job Results
        • In the portal
            • Main Console: Job icon (button) status summary, Job History
            • Interactive Console: JS quick feedback, JS detailed feedback (log)
        • Using RDP
            • Map/Reduce tool
    • 30. Demo – Monitoring Job Status
    • 31. Download – ODBC for HIVE
        • Includes add-in for Excel
    • 32. Demo - Hadoop Connector to Excel
    • 33. Connecting to PowerPivot
        • Create an ODBC connection to HIVE
        • Connect to ‘other data source’ in PowerPivot
    • 34. Real-World – Hadoop and…
    • 35. Hadoop To-Do List
        • BigData = Hadoop
            • Use Hadoop when business needs designate
            • Behavioral data
        • Hadoop on the cloud
            • Quick and cheap
            • Specialized use cases
            • dev, test, training environments
        • Hadoop access technologies
            • Learn Map/Reduce
            • Use HIVE via Excel
    • 36. The Changing Data Landscape: RDBMS, Hadoop, Other Services
    • 37. TeachingKidsProgramming.org
        • Do a Recipe, Teach a Kid (Ages 10+)
        • SmallBasic or Java, Free Courseware (recipes)
    • 38. Toward Data Craftsmanship…
        • Follow me @LynnLangit
        • RSS my blog www.LynnLangit.com
        • Hire me
            • To help build your BI/Big Data solution
            • To teach your team next gen BI
            • To learn more about using NoSQL solutions
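
The WordCount demos in slides 24 to 27 all follow the Map (query/format) and Reduce (aggregate) split listed in slide 11. The slides do not include the demo code, so what follows is only a minimal in-memory sketch of that pattern in Python, not the code shown in the webcast. The function names (map_phase, shuffle, reduce_phase, word_count) and the sample input are assumptions made for illustration; a real Hadoop job distributes these phases across the cluster nodes.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word on the line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group emitted values by key (done by the framework in Hadoop)."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(word: str, counts: List[int]) -> Tuple[str, int]:
    """Reduce: aggregate all values for one key into a single total."""
    return word, sum(counts)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence over the input lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(w, c) for w, c in shuffle(pairs).items())


if __name__ == "__main__":
    # Hypothetical sample input; the demos read their data from remote storage.
    sample = ["Hadoop on Azure", "Hadoop on the cloud"]
    for word, total in sorted(word_count(sample).items()):
        print(f"{word}\t{total}")
```

Keeping the map and reduce steps as separate functions mirrors how the same logic would be split into a mapper and a reducer when submitted as a job.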
