The Network-Aware Data
       Management Workshop

           held in conjunction with
the IEEE/ACM International Conference for High
  Performance Computing, Networking, Storage
             and Analysis (SC'11)

https://sites.google.com/a/lbl.gov/ndm2011/
WELCOME
 The Network-Aware Data Management
              Workshop


                 Mehmet Balman
       Lawrence Berkeley National Laboratory
                mbalman@lbl.gov


                  Surendra Byna
       Lawrence Berkeley National Laboratory
                  sbyna@lbl.gov


The Network-Aware Data Management
              Workshop

                 Opening
                 Mehmet Balman
       Lawrence Berkeley National Laboratory

Scope
The amount of data is continuously growing. Managing
large amounts of data in a collaborative environment is a
real challenge.

Questions:
  – In order to deliver true exascale performance to the
    application layer:

     • what additional methodologies are necessary for
       high-bandwidth networks?

  – In order to ease the burden on users to configure
    and tune their applications for high-performance
    data access over the network:

     • how do we enhance data management services
       and make them network-aware?
Scientific Data Management


    Scientific Databases, indexing and data analysis

    Storage Resource Management (SRM implementation)

    End-to-end resource provisioning

    High-speed, high-bandwidth networks connecting
    institutions



    Middleware tools to manage scientific data across the
    world
       
           (fault tolerant and high-performance transfers)
Recent research



      Advance network reservation, flexible bandwidth
    allocation, scheduling data movement between
    distributed systems

      Efficient use of next-generation high-bandwidth
    networks for climate sciences

    - Use of 100 Gbps WAN
           (Climate 100 demo at booth 512)
Goals of the Workshop
    
            Discuss emerging trends in use of networking for
            data management
    
            Address novel abstraction techniques for data
            representation
    
            New methodologies to simplify end-to-end data flow
            by providing:
        
              Transparent data management services
        
              End-to-end resource coordination
        
              Network-aware tools for the scientific community

    Create new collaborations between the network and data
    management communities
Agenda
9:00 - 9:15     Welcome

9:15 - 10:15    Keynote Speech

10:15 - 10:30   Break

10:30 - 12:00   Paper Presentations (session I)

12:00 - 13:00   Lunch Break

13:00 - 14:50   Paper Presentations (session II)

14:50 - 15:15   Break

15:15 - 16:30   Panel Discussion

16:30 - 17:00   Best Paper Announcement
Keynote

     Accelerating Data-driven Discovery by
           Outsourcing the Mundane

                         Ian Foster
Dr. Ian Foster is Director of the Computation Institute and Professor
in the Department of Computer Science at the University of Chicago.
He is also a Senior Scientist and Distinguished Fellow at Argonne
National Laboratory. He is well known for the Globus Project, a
research and development effort addressing computational and
communications problems for collaborative computing.
Papers
dFtree - A Fat-tree Routing Algorithm using Dynamic
  Allocation of Virtual Lanes to Alleviate Congestion in
  InfiniBand Networks
(best paper candidate)
Presenter: Wei Lin Guay, Simula Research Laboratory, Norway

Predicting Network Throughput for Grid Applications on
  Network Virtualization Areas
(best paper candidate)
Presenter: Chunghan Lee, Toyohashi University of Technology,
  Japan

Network-Aware End-to-End Data Throughput Optimization
Presenter: Tevfik Kosar, State University of New York at Buffalo

Network-aware Data Movement Advisor
Presenter: Patrick Brown, Southern Illinois University
Papers
Scientific Data Movement enabled by the DYNES Instrument
Presenter: Jason Zurawski, Internet2

CernVM-FS: Delivering Scientific Software to Globally Distributed
  Computing Resources
Presenter: Jakob Blomer, CERN, Switzerland

A Peer-to-Peer Architecture for Data-Intensive Cycle Sharing
Presenter: Ian Kelley, Cardiff University, UK

An Architecture for a Data-Intensive Computer
Presenter: Edward Givelberg, Johns Hopkins University
Panel Discussion
    Data management in the exa-scale computing
           and terabit networking era

 Richard Carlson
   DoE Office of Advanced Scientific Computing Research

 Ann Chervenak
   USC Information Sciences Institute

 Daniel S. Katz
   University of Chicago and Argonne National Laboratory

 Dhabaleswar Panda
   Ohio State University

 Brian Tierney
   ESnet/Lawrence Berkeley National Laboratory

* What research challenges do you expect in exa-scale data
  management on terabit networks?

* What applications need exa-scale computing and terabit
  networks?

* What is the scope of data management research on faster
  networks?

* What are the performance problems in next-generation
  high-bandwidth networks?

* How do applications accommodate next-generation networks?

* What are the challenges for middleware developers in adapting
  exa-scale data management to terabit networks?

* What are the major challenges in network management in terms
  of capacity and path provisioning, performance monitoring, and
  diagnosis tools?
BEST PAPER AWARD

dFtree - A Fat-tree Routing Algorithm using Dynamic
  Allocation of Virtual Lanes to Alleviate Congestion in
  InfiniBand Networks

Presenter: Wei Lin Guay, Simula Research
  Laboratory, Norway

BEST PAPER HONORABLE MENTION
 Predicting Network Throughput for Grid Applications
 on Network Virtualization Areas

Presenter: Chunghan Lee, Toyohashi University of
  Technology, Japan
THANK YOU
 The Network-Aware Data Management
              Workshop


                 Mehmet Balman
       Lawrence Berkeley National Laboratory
                mbalman@lbl.gov


                  Surendra Byna
       Lawrence Berkeley National Laboratory
                  sbyna@lbl.gov

