Getting your Big Data on with HDInsight

Introduction to HDInsight and its capabilities, including Azure Storage, Hive, MapReduce, Mahout, and HBase. See also some of the tools mentioned, at http://bigdata.red-gate.com/, and the source code at https://github.com/simonellistonball/GettingYourBigDataOnMapReduce

Published in: Technology

  1. Simon Elliston Ball, Head of Big Data, @sireb. Getting your Big Data on with HDInsight. http://bit.ly/GettingHDInsight #gettingHDInsight
  2. HDInsight: Hadoop on Azure.
  3. HDInsight: Hadoop
  4. HDInsight: Hadoop on Azure. wasb://
  5. HDInsight: Hadoop on Azure. wasb://, YARN
  6. wasb://, YARN
  7. Big Data: What can I do with it? Data warehousing, Machine Learning, Batch Analytics, ETL
  8. HDInsight (c. 2013)
  9. All grown up
  10. Creating a cluster: Portal, PowerShell
  11. Getting data in: http://www.cerebrata.com/products/azure-explorer/ and http://bigdata.red-gate.com/hdfs-explorer
  12. Sqoop up that SQL: Import/Export tool for RDBMS; command line based; generates MapReduce jobs; doing it with PowerShell
  13. Demo! Sqoop up that SQL
  14. Hive: like SQL. SELECT * FROM hivesampletable; support for window functions; rollups, aggregates
  15. Hive: like SQL, but… Limited support for some SQL features; works on arbitrary data; Schema on Read
  16. Demo! Hive
  17. MapReduce: Java based; simple algorithm. Map (key: value): a:1 a:1 b:1 c:1. Sort / Shuffle (key: value): a:1,1 b:1 c:1. Reduce (key: value): a:2 b:1 c:1
  18. MapReduce .NET: Streaming Interface. http://hadoopsdk.codeplex.com/ PM> Install-Package Microsoft.Hadoop.MapReduce
  19. Demo! MapReduce .NET
  20. Mahout: Machine learning library for Hadoop; just another Hadoop job; all packaged in a jar
  21. Demo! Excel and HDInsight
  22. HBase: High performance Key-Value store; different cluster type in the portal; can link to MapReduce and Hive
  23. Quick plug: HDFS Explorer, Hadoop Import/Export. http://bigdata.red-gate.com/
  24. Questions? Simon Elliston Ball, simon@simonellistonball.com, @sireb. http://bit.ly/GettingHDInsight #gettingHDInsight
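
The MapReduce slide (slide 17) walks a small word count through the Map, Sort / Shuffle and Reduce stages. The sketch below reproduces that flow in plain Python on a single machine, purely as an illustration: it is not the Java or .NET code from the talk, it does not use Hadoop, and the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are made up for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_phase(words: Iterable[str]) -> List[Tuple[str, int]]:
    """Map: emit a (key, 1) pair for every input word."""
    return [(word, 1) for word in words]


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Sort / Shuffle: sort pairs by key and group the values per key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in sorted(pairs):
        groups[key].append(value)
    return dict(groups)


def reduce_phase(groups: Dict[str, List[int]]) -> Dict[str, int]:
    """Reduce: collapse each key's list of values into a single sum."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(words: Iterable[str]) -> Dict[str, int]:
    """Run the three stages in order (hypothetical helper for this sketch)."""
    return reduce_phase(shuffle_phase(map_phase(words)))


if __name__ == "__main__":
    # Same toy input as the slide: a, a, b, c
    mapped = map_phase(["a", "a", "b", "c"])
    print(mapped)                               # pairs emitted by Map
    print(shuffle_phase(mapped))                # values grouped per key
    print(reduce_phase(shuffle_phase(mapped)))  # final counts per key
```

On a real cluster the Map and Reduce steps run in parallel across nodes and the framework performs the sort and shuffle between them; here all three stages are ordinary function calls so the data movement is easy to follow.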
