2012-01-10-data tuesday

Additional information on #datatuesday: http://data-tuesday.com/

Additional information on Hadoop on Azure: http://www.hadooponazure.com, http://aka.ms/benjguinhadoop

Published in Technology
  • Isotope is the all-up effort around Microsoft and Hadoop. It includes several components:
      • A full distribution of Apache Hadoop that runs on standard Windows hardware.
      • A full version of Apache Hadoop that runs on the Azure cloud.
      • Connectors from Hadoop (any Hadoop, not just Microsoft’s) to Microsoft’s key products: SQL, Excel, PDW, etc.
      • JScript shell for live scripting of Hadoop from the browser.
      • Admin, monitoring, and authoring tools to make Microsoft Hadoop best-in-class.

Transcript

  • 1. Data Tuesday, January 10, 2012. Pierre Lagarde (DPE), pierlag@microsoft.com. Benjamin Guinebertière (DPE), www.benjguin.com
  • 2. Microsoft Distribution of Hadoop [MDH]
      • Code name: Isotope
      • Leveraging the Hadoop data-driven community
          • On-premises
          • Cloud
          • Windows Server integration [AD, Secure HDFS]
          • Connection with SQL Server / Excel
          • Developer Framework [JavaScript, .NET, F#, …]
          • Hadoop as a Service through Azure [eMDH]
  • 3. Structural Overview: ISOTOPE [Azure and Enterprise]
      • Programming interfaces: Java, JavaScript, Streaming OM, HiveQL, PigLatin, .NET/C#/F#, (T)SQL, NoSQL
      • Core: an ocean of data [unstructured, semi-structured, structured] in HDFS, fed through ETL: "a seamless ocean of information processing and analytics"
      • Data sources: EIS / ERP, RDBMS, File System, OData [RSS], Azure Storage
  • 4. Creating a cluster on demand
  • 5. Map/Reduce - Java
  • 6. Map/Reduce – C#
  • 7. Map/Reduce - JavaScript
  • 8. Demo - JavaScript
      • Diagram labels: Azure Storage, distcp, HDFS, JavaScript M/R, Sort/filter, HDFS File, Graph.bar(data), Hive Connector, ODBC, Excel, SQL Server, Reporting
      • Query: from("books").mapReduce("file.js", "word, count:long").orderBy("count DESC").take(10).to("top10")
  • 9. from("books").mapReduce("bin/WordCountLong.js", "word, count:long").orderBy("count DESC").take(10).to("demo-top10")
  • 10. #get top10
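
The query on slide 9 chains a word-count map/reduce with a descending sort on count and a cut to ten rows. The transcript does not include the contents of the map/reduce script, so the sketch below only approximates what such a pipeline computes, using plain JavaScript that runs locally on an in-memory array. Everything in it is an assumption made for illustration: `map`, `reduce`, `topWords`, and the sample `books` lines are hypothetical names and data, not the Hadoop on Azure API and not the presenters' code.

```javascript
// Hypothetical local sketch: these helpers are NOT the Hadoop on Azure API.

// Map step: emit a (word, 1) pair for every word found in one line of text.
function map(line, emit) {
  const words = line.toLowerCase().split(/[^a-z]+/);
  for (const word of words) {
    if (word !== "") emit(word, 1);
  }
}

// Reduce step: sum all the counts collected for a single word.
function reduce(word, counts) {
  return counts.reduce((sum, n) => sum + n, 0);
}

// Local driver standing in for the cluster: group the map output by word,
// reduce each group, sort by count descending (ties broken alphabetically),
// and keep the first `limit` rows.
function topWords(lines, limit) {
  const groups = new Map();
  for (const line of lines) {
    map(line, (word, one) => {
      if (!groups.has(word)) groups.set(word, []);
      groups.get(word).push(one);
    });
  }
  const rows = [];
  for (const [word, counts] of groups) {
    rows.push({ word: word, count: reduce(word, counts) });
  }
  rows.sort((a, b) => b.count - a.count || a.word.localeCompare(b.word));
  return rows.slice(0, limit);
}

// Sample input, invented for this example.
const books = ["the cat and the hat", "the dog and the cat"];
console.log(topWords(books, 3));
```

On a real cluster the grouping between the map and reduce steps is done by the Hadoop shuffle phase across machines. The in-memory `Map` here only mimics that behaviour for a small input.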