2012-01-10-data tuesday

Additional information on #datatuesday: http://data-tuesday.com/

Additional information on Hadoop on Azure: http://www.hadooponazure.com, http://aka.ms/benjguinhadoop

Published in: Technology
  • Isotope is the all-up effort around Microsoft and Hadoop. It includes several components:
      – A full distribution of Apache Hadoop that runs on standard Windows hardware.
      – A full version of Apache Hadoop that runs on the Azure cloud.
      – Connectors from Hadoop (any Hadoop, not just Microsoft’s) to Microsoft’s key products – SQL, Excel, PDW, etc.
      – JScript shell for live scripting of Hadoop from the browser.
      – Admin, monitoring, and authoring tools to make Microsoft Hadoop best-in-class.
    1. Data Tuesday – January 10, 2012. Pierre Lagarde (DPE) – pierlag@microsoft.com; Benjamin Guinebertière (DPE) – www.benjguin.com
    2. Microsoft Distribution of Hadoop [MDH]
         • Code name: Isotope
         • Leveraging the Hadoop data-driven community
             – On-premises
             – Cloud
             – Windows Server integration [AD – Secure HDFS]
             – Connection with SQL Server / Excel
             – Developer Framework [JavaScript, .NET, F#, …]
             – Hadoop as a Service through Azure [eMDH]
    3. Structural Overview (diagram): ISOTOPE [Azure and Enterprise]. Diagram labels: Java, JavaScript, Streaming OM, HiveQL, PigLatin, .NET/C#/F#, (T)SQL, NOSQL; OCEAN OF DATA [unstructured, semi-structured, structured]; ETL; HDFS; A SEAMLESS OCEAN OF INFORMATION PROCESSING AND ANALYTICS; EIS / ERP, RDBMS, File System, OData [RSS], Azure Storage.
    4. Creating a cluster on demand
    5. Map/Reduce – Java
    6. Map/Reduce – C#
    7. Map/Reduce – JavaScript
    8. Demo – JavaScript (diagram). Diagram labels: Azure Storage, distcp, HDFS, JavaScript M/R, Sort/filter, HDFS File, Graph.bar(data), Excel, ODBC, Reporting, Hive Connector, SQLServer. Query shown: from("books").mapReduce("file.js", "word, count:long").orderBy("count DESC").take(10).to("top10")
    9. from("books").mapReduce("bin/WordCountLong.js", "word, count:long").orderBy("count DESC").take(10).to("demo-top10")
    10. #get top10
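
The query on slide 9 chains a word-count map/reduce with a descending sort on the count and a cut to the first 10 rows. The sketch below reproduces that logic locally, in plain JavaScript and in memory, to show what each stage computes. It is an illustration under assumptions: the names `map`, `reduce`, `mapReduce`, `topN` and the sample `books` lines are hypothetical, and it does not use the Hadoop on Azure JavaScript API or reproduce bin/WordCountLong.js, whose contents the slides do not show.

```javascript
// Local, in-memory sketch of word count + orderBy("count DESC") + take(n).
// All names here are hypothetical; this is not the Hadoop on Azure API.

// Map step: emit (word, 1) for every word found in one line of text.
function map(line, emit) {
  const words = line.toLowerCase().split(/[^a-z]+/);
  for (const word of words) {
    if (word !== "") emit(word, 1);
  }
}

// Reduce step: sum all the counts emitted for one word.
function reduce(word, counts, emit) {
  let sum = 0;
  for (const c of counts) sum += c;
  emit(word, sum);
}

// Stand-in for the cluster: run map on every line, group the emitted
// values by key (the shuffle), then run reduce once per key.
function mapReduce(lines, mapFn, reduceFn) {
  const groups = new Map();
  for (const line of lines) {
    mapFn(line, (key, value) => {
      if (!groups.has(key)) groups.set(key, []);
      groups.get(key).push(value);
    });
  }
  const rows = [];
  for (const [key, values] of groups) {
    reduceFn(key, values, (word, count) => rows.push({ word, count }));
  }
  return rows;
}

// Sort by count, highest first, and keep the first n rows.
// Ties are broken alphabetically; that rule is an assumption of this sketch.
function topN(rows, n) {
  return [...rows]
    .sort((a, b) => b.count - a.count || a.word.localeCompare(b.word))
    .slice(0, n);
}

// Hypothetical sample input standing in for the "books" data set.
const books = ["the quick brown fox", "the lazy dog", "The fox"];
const top = topN(mapReduce(books, map, reduce), 2);
console.log(top); // "the" counted 3 times, "fox" counted 2 times
```

On a real cluster the grouping step is done by Hadoop across nodes; only the map and reduce functions are supplied by the script.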