Microsoft Dryad
Tools and Services for Data Intensive Research

  • Language Integrated Query (LINQ) is an extension of .NET that allows one to write declarative computations on collections.
  • Dryad is a generalization of the Unix piping mechanism: instead of uni-dimensional (chain) pipelines, it provides two-dimensional pipelines. The unit is still a process connected by a point-to-point channel, but the processes are replicated.
  • This is the basic Dryad terminology.
  • The brain of a Dryad job is a centralized Job Manager (JM), which maintains the complete state of the job. The JM controls the processes running on the cluster but never exchanges data with them. (The data plane is completely separated from the control plane.)
  • Computation Staging

Microsoft Dryad: Presentation Transcript

  • Tools and Services for Data Intensive Research
    An Elephant Through the Eye of a Needle
    Roger Barga, Architect
    eXtreme Computing Group, Microsoft Research
  • Select eXtreme Computing Group (XCG) Initiatives
    Cloud Computing Futures
    ab initio R&D on cloud hardware/software infrastructure
    Multicore academic engagement
    Universal Parallel Computing Research Centers (UPCRCs)
    Software incubations
    Multicore applications, power management, scheduling
    Quantum computing
    Topological quantum computing investigations
    Security and cryptography
    Theoretical explorations and software tools
    • Research cloud engagement
    • Worldwide government and academic research partnerships
    • Inform next generation cloud computing infrastructure
  • Data Intensive Research
    The nature of scientific computing is changing
    It is about the data…
    Hypothesis-driven research
    “I have an idea, let me verify it...”
    Exploratory
    “What correlations can I glean from everyone’s data?”
    Requires different tools and techniques
    Exploratory analysis relies on data mining, viz analytics
    “grep” is not a data mining tool, and neither is a DBMS…
    Massive, multidisciplinary data
    Rising rapidly and at unprecedented scale
  • Why Commercial Clouds are Important*
    Research
    Have good idea
    Write proposal
    Wait 6 months
    If successful, wait 3 months
    Install Computers
    Start Work
    Science Start-ups
    Have good idea
    Write Business Plan
    Ask VCs to fund
    If successful..
    Install Computers
    Start Work
    Cloud Computing Model
    Have good idea
    Grab nodes from Cloud provider
    Start Work
    Pay for what you used
    also scalability, cost, sustainability
    * Slide used with permission of Paul Watson, University of Newcastle (UK)
  • The Pull of Economics (follow the money)
    Moore’s “Law” favored consumer commodities
    Economics drove enormous improvements
    Specialized processors and mainframes faltered
    The commodity software industry was born
    [Chip diagram: LPIA x86 cores, DRAM controllers, and an out-of-order (OoO) x86 core]
    Today’s economics
    Unprecedented economies of scale
    Enterprise moving to PaaS, SaaS, cloud computing
    Opportunities for Analysis as a Service, multi-disciplinary data sets,…
    [Chip diagram: LPIA x86 cores with 1 MB caches, GPUs, PCIe, a network-on-chip (NoC), and memory controllers]
    This will drive changes in research computing and cloud infrastructure
    Just as did “killer micros” and inexpensive clusters
  • Drinking from the Twitter Fire Hose
    On the “input” end
    • Start with the ‘Twitter fire hose’: messages flowing inbound at a specific rate.
    • Enrich each element with significantly more metadata, e.g. geolocation.
    Assume the Twitter user base is currently in the 10-50MM range; let’s crank this up to the 500M range.
    The average Twitter user generates a relatively low incoming message rate right now. Assume a user’s devices (phone, car, PC) are enhanced to auto-generate periodic Twitter messages on their behalf, e.g. location ‘pings’ and the other tasks that twitterbots are emerging to address. So let’s say the input rate grows again to 10x-100x what it was in the previous step.
  • Drinking from the Twitter Fire Hose
    On the “input” end
    On the “output” end: three different usage modalities
    Each user has one or more ‘agents’ running on their behalf, monitoring this input stream. This might just be a client that displays the incoming stream from @friends or #topics or the #interesting&@queries (user standing queries).
    A user can also issue more general queries from a search page. Such a query may have more unstructured search terms than the above, and it is expected to run not just against the incoming stream but against a much larger corpus of messages from the entire input stream, persisted for days, weeks, months, years…
    Finally, analytical tools or bots do trend analysis on the knowledge popping out of the stream, in real time. Whether seeded with an interest (“let me know when a problem pops up with <product> that will damage my company’s reputation”) or discovering a topic from the noise (“let me know when a new hot news item emerges”), both must be possible.
  • Pause for a Moment…
    Defining representative challenges or quests to focus group attention is an excellent way to proceed as a community
    Publishing a whitepaper articulating these challenges is a great way to allow others to contribute to a shared research agenda
    Make simulated and reference data sets available to ground such a distributed research effort
  • Drinking from the Twitter Fire Hose
    On the “input” end
    On the “output” end: three different usage modalities
    A combination of live data, including streaming, and historical data
    Lots of necessary technology, but no single technology is sufficient
    If this is going to be successful it must be accessible to the masses
     Simple to use and highly scalable, which is extremely difficult because in actuality it is not simple…
  • This Talk is About
    Effort to build & port tools for data intensive research in the cloud
    • None have run in the cloud to date, or at the scale we are targeting…
    Able to handle torrential streams of live and historical data
    • Goal is simplicity and ease-of-use combined with scalability
    Intersection of four fundamental strategies
    Distribute Data and perform Parallel Processing
    Parallel operations to take advantage of multiple cores;
    Reduce the size of the data accessed
    Data compression
    Data structures that limit the amount of data required for queries;
    Stream data processing to extract information before storage
  • Microsoft’s Dryad
    Continuously deployed since 2006
    Running on >> 10^4 machines
    Sifting through > 10 PB of data daily
    Runs on clusters of > 3,000 machines
    Handles jobs with > 10^5 processes each
    Used by >> 100 developers
    Rich platform for data analysis
    Microsoft Research, Silicon Valley
    Michael Isard, Mihai Budiu, Yuan Yu, Andrew Birrell, Dennis Fetterly
  • Pause for a Moment…
    Data-Intensive Computing Symposium, 2007
    Dryad is now freely available
    http://research.microsoft.com/en-us/collaboration/tools/dryad.aspx
    Thanks to Geoffrey Fox (Indiana) and Magda Balazinska (UW) as early adopters
    Commitment by External Research (MSR) to support research community use
  • Simple Programming Model
    Terasort, a well-known benchmark: time to sort 1 TB of data [J. Gray 1985]
    • Sequential scan/disk = 4.6 hours
    • DryadLINQ provides a simple but powerful programming model
    • Only a few lines of code are needed to implement Terasort (benchmarked May 2008)
    • DryadLINQ result: 349 seconds (5.8 min)
    • Cluster of 240 AMD64 (quad) machines, 920 disks
    • Code: 17 lines of LINQ
    DryadDataContext ddc = new DryadDataContext(fileDir);
    DryadTable<TeraRecord> records =
        ddc.GetPartitionedTable<TeraRecord>(file);
    var q = records.OrderBy(x => x);
    q.ToDryadPartitionedTable(output);
  • LINQ
    Microsoft’s Language INtegrated Query
    Available in Visual Studio 2008
    A set of operators to manipulate datasets in .NET
    Support traditional relational operators
    Select, Join, GroupBy, Aggregate, etc.
    Data model
    Data elements are strongly typed .NET objects
    Much more expressive than SQL tables
    Extremely extensible
    Add new custom operators
    Add new execution providers
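A minimal LINQ-to-Objects sketch of the operators listed above, applied to strongly typed .NET objects (the `Order` type and sample data here are hypothetical):

```csharp
using System;
using System.Linq;

class Order
{
    public string Customer;
    public decimal Amount;
}

class Program
{
    static void Main()
    {
        // Hypothetical sample data: data elements are ordinary .NET objects.
        var orders = new[]
        {
            new Order { Customer = "alice", Amount = 10m },
            new Order { Customer = "bob",   Amount = 25m },
            new Order { Customer = "alice", Amount = 5m  },
        };

        // Where / GroupBy / aggregate expressed declaratively over typed objects
        var totals =
            from o in orders
            where o.Amount > 0m
            group o by o.Customer into g
            select new { Customer = g.Key, Total = g.Sum(o => o.Amount) };

        foreach (var t in totals.OrderBy(t => t.Customer))
            Console.WriteLine($"{t.Customer}: {t.Total}");  // alice: 15, bob: 25
    }
}
```

Because the elements are ordinary objects rather than SQL rows, the same query shape can be retargeted at another execution provider (PLINQ, DryadLINQ) without rewriting it.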
  • Dryad Generalizes Unix Pipes
    Unix Pipes: 1-D
    grep | sed | sort | awk | perl
    Dryad: 2-D, multi-machine, virtualized
    grep^1000 | sed^500 | sort^1000 | awk^500 | perl^50
  • Dryad Job Structure
    [Diagram: vertices (processes) running grep, sed, sort, awk, and perl, grouped into stages and connected by channels, flowing from input files to output files]
    A channel is a finite stream of items
    • NTFS files (temporary)
    • TCP pipes (inter-machine)
    • Memory FIFOs (intra-machine)
  • Dryad System Architecture
    [Diagram: the Job Manager (JM) holds the job schedule on the control plane, directing vertices (V) on cluster machines through per-machine process daemons (PD) and a name server (NS); on the data plane, data moves between vertices via files, TCP, and FIFOs, never through the JM]
  • Dryad Job Staging
    1. Build job (JM code and vertex code)
    2. Send .exe to the cluster
    3. Start JM
    4. Query cluster resources (cluster services)
    5. Generate graph
    6. Initialize vertices
    7. Serialize vertices
    8. Monitor vertex execution
  • Dryad Scheduler is a State Machine
    Static optimizer builds execution graph
    A vertex can run anywhere once all its inputs are ready.
    Dynamic optimizer mutates running graph
    Distributes code, routes data;
    Schedules processes on machines near data;
    Adjusts available compute resources at each stage;
    Automatically recovers computation, adjusts for overload
    • If A fails, run it again;
    • If A’s inputs are gone, run upstream vertices again (recursively);
    • If A is slow, run a copy elsewhere and use output from one that finishes first.
    Masks failures in the cluster and network.
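A tiny sketch of those three recovery rules as code; `Vertex`, `Scheduler`, `OnFailure`, and `OnSlow` are hypothetical names for illustration, not Dryad's actual API:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of Dryad's three recovery rules, not the real scheduler.
enum VertexState { Waiting, Ready, Running, Completed, Failed }

class Vertex
{
    public string Name;
    public VertexState State = VertexState.Waiting;
    public List<Vertex> Inputs = new List<Vertex>();
    public bool InputsAvailable = true;   // false if an upstream output was lost
}

class Scheduler
{
    // Rules 1 & 2: rerun a failed vertex; if its inputs are gone,
    // recursively rerun the upstream vertices that produced them.
    public void OnFailure(Vertex v)
    {
        if (!v.InputsAvailable)
            foreach (var up in v.Inputs)
                OnFailure(up);            // recursive upstream re-execution
        v.InputsAvailable = true;
        v.State = VertexState.Ready;      // eligible to be scheduled again
    }

    // Rule 3: duplicate a slow vertex; the first copy to finish wins.
    public Vertex OnSlow(Vertex v)
    {
        return new Vertex { Name = v.Name + "-copy", Inputs = v.Inputs,
                            State = VertexState.Ready };
    }
}
```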
  • Combining Query Providers
    [Diagram: a .NET program (C#, VB, F#, etc.) issues a query through the LINQ provider interface; execution engines in order of scalability: LINQ-to-Objects (single-core), PLINQ (multi-core), LINQ-to-IMDB and LINQ-to-CEP (local machine), DryadLINQ (cluster)]
  • LINQ == Tree of Operators
    A query is composed of a tree of operators
    As with a program AST, these trees can be analyzed and rewritten
    This is why PLINQ can safely introduce parallelism
    q = from x in A where p(x) select x*3;
    Parallelism can be introduced intra-operator, inter-operator, or with both composed
    • Nesting queries inside of others is common
    PLINQ can fuse partitions
    var q1 = from x in A select x*2;
    var q2 = q1.Sum();
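The fused pair can be parallelized just by inserting `AsParallel()`; this is a generic PLINQ sketch (the input array is made up), not code from the talk:

```csharp
using System;
using System.Linq;

class Program
{
    static void Main()
    {
        int[] A = Enumerable.Range(1, 1000).ToArray();

        // PLINQ partitions A across cores; because Sum consumes q1 directly,
        // the runtime can fuse Select and Sum over each partition instead of
        // materializing the doubled sequence.
        var q1 = from x in A.AsParallel() select x * 2;
        var q2 = q1.Sum();

        Console.WriteLine(q2);  // 2 * (1000 * 1001 / 2) = 1001000
    }
}
```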
  • Combining with PLINQ
    [Diagram: DryadLINQ runs the query across the cluster; each per-node subquery executes under PLINQ]
  • Combining with LINQ-to-IMDB
    [Diagram: DryadLINQ fans the query out into subqueries, each joining against historical reference data through LINQ-to-IMDB]
  • Combining with LINQ-to-CEP
    [Diagram: DryadLINQ subqueries combine ‘live’ streaming data via LINQ-to-CEP with historical data via LINQ-to-IMDB]
  • Cost of storing data – few cents/month/MB
    Cost of acquiring data – negligible
    Extracting insight while acquiring data - priceless
    Mining historical data for ways to extract insight – precious
    CEDR CEP – the engine that makes it possible
    Consistent Streaming Through Time: A Vision for Event Stream Processing
    Roger S. Barga, Jonathan Goldstein, Mohamed H. Ali, Mingsheng Hong
    In the proceedings of CIDR 2007
  • Complex Event Processing
    Complex Event Processing (CEP) is the continuous and incremental processing of event (data) streams from multiple sources based on declarative query and pattern specifications with near-zero latency.
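To make "continuous and incremental" concrete, here is a toy standing query in plain C#: a count over a sliding time window, updated per event rather than recomputed in batches. `SlidingCount` is a hypothetical illustration that assumes in-order arrival (CEDR's algebra explicitly handles disorder) and uses none of CEDR's actual API:

```csharp
using System;
using System.Collections.Generic;

// Toy standing query: count of events in the last `window` ticks,
// maintained incrementally as each event arrives (no batch recompute).
class SlidingCount
{
    private readonly Queue<long> _times = new Queue<long>();
    private readonly long _window;

    public SlidingCount(long windowTicks) { _window = windowTicks; }

    public int OnEvent(long timestamp)
    {
        _times.Enqueue(timestamp);
        // Evict events that have fallen out of the window.
        while (_times.Peek() <= timestamp - _window)
            _times.Dequeue();
        return _times.Count;   // current answer of the standing query
    }
}
```

The per-event update is O(1) amortized, which is what makes near-zero latency over a high-rate stream plausible.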
  • The CEDR (Orinoco) Algebra
    Leverages existing SQL understanding
    • Streaming extensions to relational algebra
    • Query integration with host languages (LINQ)
    Semantics are independent of order of arrival
    • Specify a standing event query
    • Separately specify desired disorder handling strategy
    • Many interesting repercussions
  • CEDR (Orinoco) Overview
    Currently processing over 400M events per day for internal application (5000 events/sec)
  • Reference Data on Azure
    Ocean Science data on Azure SDS-relational
    • Two terabytes of coastal and model data
    • Collaboration with Bill Howe (Univ of Washington)
    Computational finance data on Azure SDS-relational
    • BATS, daily tick data for stocks (10 years)
    • XBRL call report for banks (10,000 banks)
    Working with IRIS to store select seismic data on Azure; the IRIS consortium, based in Seattle (NSF-funded), collects and distributes global seismological data.
    • Data sets requested by researchers worldwide
    • Includes HD videos, seismograms, images, data from major seismic events.
  • Summary
    Data growing exponentially: big data, with big implications…
    Implications for research environments and cloud infrastructure
    Building cloud analysis & storage tools for data intensive research
    Implementing key services for science (PhyloD for HIV researchers)
    Host select data sets for multidisciplinary data analysis
    Ongoing discussions for research access to Azure
    Many PB of storage and hundreds of thousands of core-hours
    Internet2/ESnet connections, w/ service peering at high bandwidth
    Drive negotiations with ISVs for pay-as-you-go licensing (MATLAB)
    Academic access to Azure through our MSDN program
    Technical engagement team to onboard research groups
    Tools for data analysis, data storage services, and visual analytics
  • Questions