Building Tools for the Hadoop Developer
Matt Winkler (@mwinkle)
Operated By: Mike Flasko (@mflasko)
Building Tools for the Hadoop Developer

In this session we’ll first describe our experience extending Hadoop development to new platforms and languages, then cover our experiments building supporting developer tools and plugins for those platforms. First, we’ll take a hands-on approach to showing our experiments and successes extending Hadoop to languages such as JavaScript and .NET with LINQ. Second, we’ll walk through some of the developer and DevOps tools and plugins we’ve experimented with in an effort to simplify life for the Hadoop developer across both on-premises and cloud-based projects.

Published in: Technology
  • View from Camp Muir looking to Mount Adams, Mount Rainier National Park, Washington 2011, © matt winkler
  • Innovate across the stack
  • Building Tools for the Hadoop Developer

    1. BUILDING TOOLS FOR THE HADOOP DEVELOPER
       matt winkler @mwinkle
       Operated By: mike flasko @mflasko
    2. C#, F# Map/Reduce, LINQ to Hive, .NET management clients
       Node.js management SDK
       Hive, Pig, Mahout, Cascading, Scalding, Scoobi, Pegasus…
       PowerShell, Cross Platform CLI tools
    3. Existing Ecosystem
       • Actively contributing to:
         • Core
         • Pig
         • Hive
         • HCatalog
       • Branching to other projects
       • Streamlined, Simple Deploy
         • Simple one-box developer install on Windows
         • Simple scale up/out to the cloud
    4. .NET
       • Map/Reduce
       • LINQ to Hive
       • Client APIs
         • WebHCat
         • Ambari
         • WebHDFS
         • Azure / Cloud
       • Visual Studio Tooling
         • Local debugging support
    5. JavaScript
       • Node.js client APIs
         • WebHCat
         • WebHDFS
         • Ambari
         • Azure / Cloud
    6. Management
       • UI Tooling
         • Cluster usage
         • Job authoring
         • Result consumption in common tools
       • PowerShell & Cross platform scripting
       • API Surface
         • RDFE – Azure provisioning
         • Ambari – Cluster monitoring
         • WebHCatalog – Metadata and job submission
         • WebHDFS, Blob Storage – Storage
    7. Sources
       • http://hadoopsdk.codeplex.com
       • http://www.github.com/windowsazure
       • NuGet packages
         • Microsoft.Hadoop.MapReduce
         • Microsoft.Hadoop.Hive
         • Microsoft.Hadoop.WebClient
       • NPM packages
         • Azure
         • Azure-cli open
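
The slides list WebHDFS and WebHCat as REST surfaces that the .NET and Node.js clients wrap, but show no calls. The following is a minimal Node.js sketch of what such requests look like at the HTTP level. It only builds the requests and does not send them. It is not the SDK from the talk: the function names `webHdfsUrl` and `webHCatHiveRequest`, the host names, the user name, and the paths are all hypothetical. The `/webhdfs/v1` and `/templeton/v1/hive` paths and the `op`, `user.name`, `execute`, and `statusdir` parameters follow the public Apache REST documentation, and the ports shown are common defaults that a given cluster may not use.

```javascript
// Sketch only: builds WebHDFS and WebHCat (Templeton) requests, sends nothing.
// Hosts, ports, user names, and paths below are placeholders, not from the talk.

// WebHDFS exposes file system operations as /webhdfs/v1/<path>?op=<OPERATION>.
// Assumes hdfsPath is an absolute path without "?" or "#" characters.
function webHdfsUrl(baseUrl, hdfsPath, op, params = {}) {
  const url = new URL("/webhdfs/v1" + hdfsPath, baseUrl);
  url.searchParams.set("op", op);
  for (const [key, value] of Object.entries(params)) {
    url.searchParams.set(key, String(value));
  }
  return url.toString();
}

// WebHCat accepts a Hive query as a form-encoded POST to /templeton/v1/hive.
// "execute" carries the query and "statusdir" names an HDFS directory for job status.
function webHCatHiveRequest(baseUrl, user, query, statusDir) {
  const url = new URL("/templeton/v1/hive", baseUrl);
  url.searchParams.set("user.name", user);
  const body = new URLSearchParams({ execute: query, statusdir: statusDir });
  return {
    method: "POST",
    url: url.toString(),
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: body.toString(),
  };
}

// List a directory (hypothetical name node and user).
const listingUrl = webHdfsUrl(
  "http://namenode.example.com:50070",
  "/user/demo",
  "LISTSTATUS",
  { "user.name": "demo" }
);
console.log(listingUrl);
// prints http://namenode.example.com:50070/webhdfs/v1/user/demo?op=LISTSTATUS&user.name=demo

// Describe a Hive job submission (hypothetical head node, table, and status directory).
const hiveJob = webHCatHiveRequest(
  "http://headnode.example.com:50111",
  "demo",
  "SELECT * FROM logs LIMIT 10",
  "/tmp/hive-status"
);
console.log(hiveJob.method, hiveJob.url);
```

The returned object from `webHCatHiveRequest` can be handed to any HTTP client. Keeping request construction separate from sending makes the shape of each call easy to inspect and test without a running cluster.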
