Plank: Presentation Transcript

  • The Storage Fabric of the Grid: The Network Storage Stack. James S. Plank, Director, Logistical Computing and Internetworking (LoCI) Laboratory, Department of Computer Science, University of Tennessee. Cluster and Computational Grids for Scientific Computing, September 12, 2002, Le Chateau de Faverges de la Tour, France
  • Grid Research & The Fabric Layer [diagram of the Application / Middleware / Resources layering, with the resource level labeled as the “Fabric” layer]
  • What is the Fabric Layer?
    • Networking: TCP/IP
    • Storage: Files in a file system
    • Computation: Processes managed by an OS
    Most Grid research accepts these as givens. (Examples: MPI, GridFTP)
  • LoCI’s Research Agenda: redefine the fabric layer based on end-to-end principles [diagram of three analogous stacks: Communication (Data/Link/Physical, Network, Transport, Application), Storage (Access/Physical, IBP Depot, exNode, LoRS, Application), and Computation (Access/Physical, IBP NFU, exProc, LoRS, Application)]
  • What Should This Get You?
    • Scalability
    • Flexibility
    • Fault-tolerance
    • Composability
    i.e., better Grids.
  • LoCI Lab Personnel
    • Directors :
      • Jim Plank
      • Micah Beck
    • Exec Director :
      • Terry Moore
    • Grad Students :
      • Erika Fuentes
      • Sharmila Kancherla
      • Xiang Li
      • Linzhen Xuan
    • Research Staff :
      • Scott Atchley
      • Alexander Bassi
      • Ying Ding
      • Hunter Hagewood
      • Jeremy Millar
      • Stephen Soltesz
      • Yong Zheng
    • Undergrad Students :
      • Isaac Charles
      • Rebecca Collins
      • Kent Galbraith
      • Dustin Parr
  • Collaborators
    • Jack Dongarra (UT - NetSolve, Linear Algebra)
    • Rich Wolski (UCSB - Network Weather Service)
    • Fran Berman (UCSD/NPACI - Scheduling)
    • Henri Casanova (UCSD/NPACI - Scheduling)
    • Laurent LeFevre (INRIA/ENS - Multicast, Active Networking)
  • The Network Storage Stack [stack diagram, top to bottom: Applications, Logistical File System, Logistical Tools, exNode / L-Bone, IBP, Local Access, Physical]
    • A Fundamental Organizing Principle
    • Like the IP Stack
    • Each level encapsulates details from the lower levels, while still exposing details to higher levels
  • The Network Storage Stack
    • The L-Bone: resource discovery and proximity queries
    • IBP (Internet Backplane Protocol): allocating and managing network storage
    • The exNode: a data structure for aggregation
    • LoRS, the Logistical Runtime System: aggregation tools and methodologies
  • IBP: The Internet Backplane Protocol
    Low-level primitives and software for:
    • Managing and using state in the network.
    • Inserting storage in the network so that:
      • Applications may use it advantageously.
      • Storage owners do not lose control of their resources.
      • The whole system is truly scalable and fault-tolerant.
  • The Byte Array: IBP’s Unit of Storage
    • You can think of it as a “buffer”.
    • You can think of it as a “file”.
    • Append-only semantics.
    • Transience built in.
  • The IBP Client API
    • Can be used by anyone * who can talk to the server.
    • Seven procedure calls in three categories:
      • Allocation (1)
      • Data transfer (4)
      • Management (2)
    • * not really, but close...
  • Client API: Allocation
    • IBP_allocate (char *host, int maxsize, IBP_attributes attr)
    • Like a network malloc()
    • Returns a trio of capabilities (see the code sketch after the next slide).
      • Read / Write / Manage
      • ASCII Strings (obfuscated)
    • No user-defined file names:
      • Big flat name space.
      • No registration required to pass capabilities.
  • Allocation Attributes
    • Time-Limited or Permanent
    • Soft or Hard
    • Read/Write semantics:
      • Byte Array
      • Pipe
      • Circular Queue
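    To make the last two slides concrete, here is a minimal sketch of the “network malloc” call. It follows the abridged signature shown on the slide; the attribute fields, the capability-set struct, and the depot host name are illustrative assumptions rather than the real ibp.h declarations (which differ in detail, e.g. they also take timeouts), and the program is meant to be linked against the IBP client library.

      /* Minimal sketch of a "network malloc", following the slide's abridged
       * signature.  The IBP_attributes fields, the capset struct, and the depot
       * host name are assumptions for illustration; the real ibp.h differs. */
      #include <stdio.h>
      #include <stdlib.h>

      typedef struct {
          int duration_sec;   /* time limit in seconds (0 = permanent)        */
          int hard;           /* nonzero = hard allocation, zero = soft       */
          int type;           /* 0 = byte array, 1 = pipe, 2 = circular queue */
      } IBP_attributes;

      typedef struct {
          char *read_cap, *write_cap, *manage_cap;  /* opaque ASCII capability strings */
      } *IBP_capset;

      /* Prototype per the slide; the implementation comes from the client library. */
      IBP_capset IBP_allocate(char *host, int maxsize, IBP_attributes attr);

      int main(void)
      {
          /* A soft, time-limited, byte-array allocation of 10 MB for one day. */
          IBP_attributes attr = { 24 * 60 * 60, 0, 0 };
          IBP_capset caps = IBP_allocate("a-depot.example.edu", 10 * 1024 * 1024, attr);

          if (caps == NULL) {
              fprintf(stderr, "depot refused the allocation or is unreachable\n");
              return EXIT_FAILURE;
          }

          /* There is no user-chosen file name: these three strings are the name,
             and they can be handed to anyone without any registration step.     */
          printf("read:   %s\nwrite:  %s\nmanage: %s\n",
                 caps->read_cap, caps->write_cap, caps->manage_cap);
          return EXIT_SUCCESS;
      }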
  • Client API: Data Transfer (sketched below)
    • 2-party: IBP_store (write-cap, bytes, size, ...) and IBP_deliver (read-cap, pointer, size, ...)
    • 3-party: IBP_copy (read-cap, write-cap, size, ...)
    • N-party/other things: IBP_mcopy (...)
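    Again sketching against the abridged argument lists from the slide (the trailing “...” stands for timeout and offset parameters that the real prototypes add):

      /* Abridged prototypes as shown on the slide; the real calls take extra
       * timeout/offset arguments, which the "..." stands in for. */
      int IBP_store  (char *write_cap, char *bytes, int size);             /* client -> depot */
      int IBP_deliver(char *read_cap,  char *ptr,   int size);             /* depot -> client */
      int IBP_copy   (char *src_read_cap, char *dst_write_cap, int size);  /* depot -> depot  */

      void transfer_sketch(char *read_cap, char *write_cap, char *remote_write_cap)
      {
          char out[64] = "some bytes worth staging in the network";
          char in[64];

          /* 2-party: the client pushes bytes into its byte array, then pulls them back. */
          IBP_store(write_cap, out, sizeof(out));
          IBP_deliver(read_cap, in, sizeof(in));

          /* 3-party: one depot forwards the bytes straight into another depot's
             allocation; the data never passes back through the client.          */
          IBP_copy(read_cap, remote_write_cap, sizeof(out));
      }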
  • IBP Client API: Management
    • IBP_manage() / IBP_status()
    • Allows for resizing byte arrays.
    • Allows for extending/shortening the time limit on time-limited allocations.
    • Manages reference counts on the read/write capabilities.
    • State probing.
  • IBP Servers
    • Daemons that serve local disk or memory.
    • Root access not required.
    • Can specify sliding time limits or revokability.
    • Encourages resource sharing.
  • Typical IBP usage scenario
  • Logistical Networking Strategies [diagram of four sender-to-receiver scenarios (#1-#4), each staging data through one or more IBP depots in the network]
  • XSufferage on MCell/APST [diagram: an APST client and daemon coordinating the Tokyo Institute of Technology (NetSolve + NFS, NetSolve + IBP), the University of Tennessee, Knoxville (NetSolve + IBP), and the University of California, San Diego (GRAM + GASS)]
    • [ (NetSolve+IBP) + (GRAM+GASS) + (NetSolve+NFS) ] + NWS
  • MCell/APST Experimental Results
    • Experimental setting: an MCell simulation with 1,200 tasks, composed of 6 Monte Carlo simulations, with input files of 1, 20, and 100 MB.
    • Four scenarios for where the input files initially reside:
      • (a) all input files only in Japan
      • (b) the 100 MB files staged in California
      • (c) in addition, one 100 MB file staged in Tennessee
      • (d) all input files replicated everywhere
    • Scheduling heuristics compared: workqueue vs. XSufferage, which matches data and tasks to appropriate locations.
    • Results: automatic staging with IBP is effective and improves overall performance.
  • The Network Storage Stack
    • The L-Bone: resource discovery and proximity queries
    • IBP: allocating and managing network storage (like a network malloc)
    • The exNode: a data structure for aggregation
    • LoRS, the Logistical Runtime System: aggregation tools and methodologies
  • The Logistical Backbone (L-Bone)
    • LDAP-based storage resource discovery.
    • Query by capacity, network proximity, geographical proximity, stability, etc.
    • Periodic monitoring of depots.
    • Uses the Network Weather Service (NWS) for live measurements and forecasting.
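    As a purely hypothetical sketch of the shape of such a query (none of these names are the real L-Bone client interface): ask for depots by capacity, proximity and stability, and get back a ranked list of hosts to hand to IBP_allocate().

      /* Hypothetical discovery call -- names are illustrative only, not the
       * real L-Bone client API. */
      typedef struct {
          long  min_free_mb;   /* capacity: minimum free space per depot           */
          char *near_host;     /* proximity: rank depots by closeness to this host */
          int   want_stable;   /* prefer depots with a good monitored track record */
      } DepotQuery;

      /* Returns a NULL-terminated list of depot host names, ranked using NWS
       * measurements and forecasts. */
      char **lbone_find_depots(char *lbone_server, DepotQuery *query, int max_results);

    A LoRS tool would then walk such a list in order, allocating on the nearest depots first and falling back to later candidates on failure.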
  • Snapshot: August, 2002 Approximately 1.6 TB of publicly accessible storage (Scaling to a petabyte someday…)
  • The Network Storage Stack
    • The L-Bone: resource discovery and proximity queries
    • IBP: allocating and managing network storage (like a network malloc)
    • The exNode: a data structure for aggregation
    • LoRS, the Logistical Runtime System: aggregation tools and methodologies
  • The exNode
    • The Network “File” Pointer.
    • Analogous to the Unix inode .
    • Map byte-extents to IBP buffers (or other allocations).
    • XML-based data structure/serialization.
    • Allows for replication, flexible decomposition of data.
    • Also allows for “end-to-end services.”
    • Arbitrary metadata.
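    A conceptual sketch (not the actual libexnode interface) of what one exNode mapping carries; the field names are illustrative, and the whole structure is serialized as XML.

      /* Conceptual sketch of an exNode: a list of mappings from byte extents of
       * a logical "file" to IBP allocations.  Field names are illustrative. */
      typedef struct mapping {
          long   logical_offset;   /* where this extent begins in the logical file  */
          long   logical_length;   /* how many bytes it covers                      */
          char  *ibp_read_cap;     /* the IBP capability that holds those bytes     */
          char  *md5_hex;          /* optional end-to-end metadata (checksum, ...)  */
          struct mapping *next;    /* extents may overlap: overlaps are replicas    */
      } Mapping;

      typedef struct {
          long     logical_length; /* total length of the network file              */
          Mapping *mappings;       /* flexible decomposition: any number of extents */
          /* ...plus arbitrary user metadata; the whole thing serializes to XML.    */
      } ExnodeSketch;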
  • The exNode (XML-based) [diagram: an exNode mapping byte extents (offsets 0, 100, 200, 300) of a logical file onto allocations at IBP depots A, B, and C across the network]
  • The Network Storage Stack
    • The L-Bone: resource discovery and proximity queries
    • IBP: allocating and managing network storage (like a network malloc)
    • The exNode: a data structure for aggregation
    • LoRS, the Logistical Runtime System: aggregation tools and methodologies
  • Logistical Runtime System
    • Aggregation for:
      • Capacity
      • Performance (striping)
      • More performance (caching)
      • Reliability (replication)
      • More reliability (ECC)
      • Logistical purposes (routing)
  • Logistical Runtime System
    • Basic Primitives:
      • Upload : Create a network file from local data
      • Download : Get bytes from a network file.
      • Augment : Add more replicas to a network file.
      • Trim : Remove replicas from a network file.
      • Stat : Get information about the network file.
      • Refresh : Alter the time limits of the IBP buffers.
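    A conceptual sketch of what Upload composes out of the layers below it. Every helper here (lbone_find_depots, exnode_create, exnode_add_mapping, read_chunk, default_attributes, and the types from the earlier sketches) is illustrative, not the real LoRS or libexnode interface, and error handling is omitted.

      /* Conceptual composition of "Upload" from the lower layers; helper names
       * are illustrative and their declarations (plus <stdlib.h>) are assumed
       * to be in scope from the earlier sketches. */
      void upload_sketch(const char *local_path, long chunk_size, int replicas)
      {
          DepotQuery q      = { chunk_size / (1024 * 1024), "my-host.example.edu", 1 };
          char     **depots = lbone_find_depots("an-lbone-server.example.edu", &q, 16);
          void      *xnd    = exnode_create();            /* start an empty exNode */
          char      *buf    = malloc(chunk_size);
          long       offset = 0, n;

          /* Cut the local file into extents and place each extent on several depots. */
          while ((n = read_chunk(local_path, offset, buf, chunk_size)) > 0) {
              for (int r = 0; r < replicas; r++) {
                  IBP_capset caps = IBP_allocate(depots[r], (int)n, default_attributes());
                  IBP_store(caps->write_cap, buf, (int)n);
                  exnode_add_mapping(xnd, offset, n, caps->read_cap);
              }
              offset += n;
          }

          /* The serialized exNode (an XML file) is the "network file" the user keeps;
             Download, Augment, Trim, Stat and Refresh all start from it.             */
          exnode_serialize(xnd, "uploaded.xnd");
          free(buf);
      }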
  • Upload
  • Augment to Tennessee
  • Augment to Santa Barbara
  • Stat (ls)
  • Failures do happen.
  • Download
  • Trimming (dead capability removal)
  • End-To-End Services:
    • MD5 Checksums stored per exNode block to detect corruption.
    • Encryption is a per-block option.
    • Compression is a per-block option.
    • Parity/Coding is in the design.
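    On the download side, the per-block check could look like the sketch below. It uses OpenSSL’s MD5() purely for illustration; how LoRS actually encodes and stores the digest in the exNode metadata is not shown on the slide, so the hex-string comparison here is an assumption.

      /* Recompute a downloaded block's MD5 and compare it to the digest recorded
       * in the exNode metadata.  Uses OpenSSL's MD5(); the stored hex-string
       * format is an assumption for illustration. */
      #include <stdio.h>
      #include <string.h>
      #include <openssl/md5.h>

      int block_is_intact(const unsigned char *block, size_t len, const char *stored_hex)
      {
          unsigned char digest[MD5_DIGEST_LENGTH];
          char hex[2 * MD5_DIGEST_LENGTH + 1];

          MD5(block, len, digest);                      /* recompute the checksum        */
          for (int i = 0; i < MD5_DIGEST_LENGTH; i++)
              sprintf(hex + 2 * i, "%02x", digest[i]);  /* render as lowercase hex       */

          return strcmp(hex, stored_hex) == 0;          /* mismatch: try another replica */
      }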
  • Parity / Coding [diagram: an exNode with coding, where extra IBP buffers in the network hold parity/coded blocks computed as linear combinations of the data blocks]
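    The simplest instance of the coding idea in that diagram is a single XOR parity block: any one lost data block can be rebuilt from the survivors. The weighted combination on the slide (blocks added with coefficients 2 and 3) points at Reed-Solomon-style coding, which tolerates more losses; plain XOR is shown here only to make the principle concrete.

      /* One parity block equal to the XOR of the data blocks, stored on yet
       * another depot.  Any single missing data block is then the XOR of the
       * parity block with the surviving data blocks. */
      #include <stddef.h>

      void compute_xor_parity(unsigned char **blocks, int nblocks, size_t block_len,
                              unsigned char *parity)
      {
          for (size_t i = 0; i < block_len; i++) {
              unsigned char p = 0;
              for (int b = 0; b < nblocks; b++)
                  p ^= blocks[b][i];
              parity[i] = p;
          }
      }

      /* Recovery reuses the same routine: XOR the parity block together with
       * the surviving data blocks, byte by byte, to rebuild the lost one. */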
  • Scalability
    • No bottlenecks
    • Really hard problems are left unsolved, but for the most part, solving them shouldn’t require changing the lower levels:
      • Naming
      • Good scheduling
      • Consistency / File System semantics
      • Computation
  • Status [stack diagram, top to bottom: Applications, Logistical File System, Logistical Tools, exNode / L-Bone, IBP, Local Access, Physical]
    • IBP/L-Bone/exNode/Tools all supported.
    • Apps: Mail, IBP-ster, Video IBP-ster, IBPvo -- demo at SC-02
    • Other institutions (see L-Bone)
  • What’s Coming Up?
    • More nodes on the L-Bone
    • More collaboration with applications groups
    • Research on performance and scheduling
    • Logistical File System
    • A Computation Stack
    • Code / Information at loci.cs.utk.edu
  • The Storage Fabric of the Grid: The Network Storage Stack. James S. Plank, Director, Logistical Computing and Internetworking (LoCI) Laboratory, Department of Computer Science, University of Tennessee. Cluster and Computational Grids for Scientific Computing, September 12, 2002, Le Chateau de Faverges de la Tour, France
  • Replication: Experiment #1 [diagram: a 3 MB file whose exNode maps replicated fragments onto depots at UTK, UCSB, UCSD, Harvard, UNC, TAMU, Turin (IT), and Stuttgart (DE)]
  • Replication: Experiment #1 [charts of depot and fragment availability as measured from three download sites]
    • From UTK: 860 download attempts, 100% success; fragment availability: UTK 99.85%, UCSD 99.71%, UCSB 95.31%, Harvard 59.77%, UNC 99.88%
    • From UCSD: 857 download attempts, 100% success; fragment availability: UTK 96.27%, UCSD 98.60%, UCSB 88.60%, Harvard 57.29%, UNC 97.20%
    • From Harvard: 751 download attempts, 100% success; fragment availability: UTK 99.87%, UCSD 99.80%, UCSB 90.47%, Harvard 57.45%, UNC 99.87%
  • Most Frequent Download Path [maps of the most frequent download paths as seen from UTK, from Harvard, and from UCSD]
  • Replication: Experiment #2
    • Deleted 12 of the 21 IBP allocations
    • Downloaded from UTK
    3 MB file, 1,225 download attempts, 93.88% success [diagram: per-fragment download success rates, ranging from roughly 48% to 100%, across the surviving allocations]