Transcript

  • 1. The Storage Fabric of the Grid: The Network Storage Stack. James S. Plank, Director, Logistical Computing and Internetworking (LoCI) Laboratory, Department of Computer Science, University of Tennessee. Cluster and Computational Grids for Scientific Computing, September 12, 2002, Le Chateau de Faverges de la Tour, France
  • 2. Grid Research & The Fabric Layer [layer diagram: Applications and Middleware sit on top of Resources; the Resources form the “Fabric” layer]
  • 3. What is the Fabric Layer?
    • Networking : TCP/IP
    • Storage : Files in a file system
    • Computation : Processes managed by an OS
  • 4. What is the Fabric Layer?
    • Networking : TCP/IP
    • Storage : Files in a file system
    • Computation : Processes managed by an OS
    Most Grid research accepts these as givens. (Examples: MPI, GridFTP)
  • 5. LoCI’s Research Agenda: Redefine the fabric layer based on End-to-End Principles [diagram of three stacks. Communication: Data/Link/Physical, Network, Transport, Application. Storage: Access/Physical, IBP Depot, exNode, LoRS, Application. Computation: Access/Physical, IBP NFU, exProc, LoRS, Application]
  • 6. What Should This Get You?
    • Scalability
    • Flexibility
    • Fault-tolerance
    • Composability
    i.e., better Grids
  • 7. LoCI Lab Personnel
    • Directors :
      • Jim Plank
      • Micah Beck
    • Exec Director :
      • Terry Moore
    • Grad Students :
      • Erika Fuentes
      • Sharmila Kancherla
      • Xiang Li
      • Linzhen Xuan
    • Research Staff :
      • Scott Atchley
      • Alexander Bassi
      • Ying Ding
      • Hunter Hagewood
      • Jeremy Millar
      • Stephen Soltesz
      • Yong Zheng
    • Undergrad Students :
      • Isaac Charles
      • Rebecca Collins
      • Kent Galbraith
      • Dustin Parr
  • 8. Collaborators
    • Jack Dongarra (UT - NetSolve, Linear Algebra)
    • Rich Wolski (UCSB - Network Weather Service)
    • Fran Berman (UCSD/NPACI - Scheduling)
    • Henri Casanova (UCSD/NPACI - Scheduling)
    • Laurent LeFevre (INRIA/ENS - Multicast, Active Networking)
  • 9. The Network Storage Stack [stack diagram, top to bottom: Applications; Logistical File System; Logistical Tools; exNode and L-Bone; IBP; Local Access; Physical]
    • A Fundamental Organizing Principle
    • Like the IP Stack
    • Each level encapsulates details from the lower levels, while still exposing details to higher levels
  • 10. The Network Storage Stack [stack diagram, top to bottom: Applications; Logistical File System; Logistical Tools; exNode and L-Bone; IBP; Local Access; Physical]
    • A Fundamental Organizing Principle
    • Like the IP Stack
    • Each level encapsulates details from the lower levels, while still exposing details to higher levels
  • 11. The Network Storage Stack
    • The L-Bone: Resource discovery & proximity queries
    • IBP (Internet Backplane Protocol): Allocating and managing network storage
    • The exNode: A data structure for aggregation
    • LoRS, the Logistical Runtime System: Aggregation tools and methodologies
  • 12. IBP: The Internet Backplane Protocol
    • Low-level primitives and software for:
      • Managing and using state in the network.
      • Inserting storage in the network so that:
        • Applications may use it advantageously.
        • Storage owners do not lose control of their resources.
        • The whole system is truly scalable and fault-tolerant.
  • 13. The Byte Array: IBP’s Unit of Storage
    • You can think of it as a “ buffer ”.
    • You can think of it as a “ file ”.
    • Append-only semantics.
    • Transience built in.
  • 14. The IBP Client API
    • Can be used by anyone * who can talk to the server.
    • Seven procedure calls in three categories:
      • Allocation (1)
      • Data transfer (4)
      • Management (2)
    • * not really, but close...
  • 15. Client API: Allocation
    • IBP_allocate (char *host, int maxsize, IBP_attributes attr)
    • Like a network malloc()
    • Returns a trio of capabilities .
      • Read / Write / Manage
      • ASCII Strings (obfuscated)
    • No user-defined file names:
      • Big flat name space.
      • No registration required to pass capabilities.
  • 16. Allocation Attributes
    • Time-Limited or Permanent
    • Soft or Hard
    • Read/Write semantics:
      • Byte Array
      • Pipe
      • Circular Queue
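    Taken together, slides 15 and 16 suggest the call pattern sketched below. This is a minimal C sketch based on the simplified prototype shown on slide 15; the header name, the attribute fields (duration, reliability, type), the constants (IBP_SOFT, IBP_HARD, IBP_BYTEARRAY), the capability-set fields (readCap, writeCap, manageCap) and the depot host name are assumptions for illustration, not necessarily the exact IBP client API.

      /* Sketch: a "network malloc" of 10 MB on an IBP depot. */
      #include <stdio.h>
      #include "ibp.h"                         /* assumed IBP client header */

      int main(void)
      {
          IBP_attributes attr;                 /* allocation attributes (slide 16) */
          attr.duration    = 24 * 60 * 60;     /* time-limited: a one-day lease */
          attr.reliability = IBP_SOFT;         /* soft allocation (vs. IBP_HARD) */
          attr.type        = IBP_BYTEARRAY;    /* byte array (vs. pipe / circular queue) */

          /* Ask the depot for up to 10 MB; a trio of capabilities comes back. */
          IBP_set_of_caps caps = IBP_allocate("depot.example.edu", 10 * 1024 * 1024, attr);
          if (caps == NULL) {
              fprintf(stderr, "IBP_allocate failed\n");
              return 1;
          }

          /* Capabilities are opaque ASCII strings: no file names, no registration;
           * whoever holds the string holds the right it names. */
          printf("read cap:   %s\n", caps->readCap);
          printf("write cap:  %s\n", caps->writeCap);
          printf("manage cap: %s\n", caps->manageCap);
          return 0;
      }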
  • 17. Client API: Data Transfer
    • 2-party:
      • IBP_store (write-cap, bytes, size, ...)
      • IBP_deliver (read-cap, pointer, size, ...)
    • 3-party:
      • IBP_copy (read-cap, write-cap, size, ...)
    • N-party / other things:
      • IBP_mcopy (...)
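    A sketch of how these calls compose, again following the simplified prototypes on this slide (trailing timeout and offset arguments elided; the capability-set type repeats the assumptions from the sketch after slide 16).

      /* Sketch: the three transfer styles, between two existing allocations. */
      #include "ibp.h"                         /* assumed IBP client header */

      void move_data(IBP_set_of_caps src, IBP_set_of_caps dst)
      {
          char data[1024] = "simulation input ...";
          char back[1024];

          /* 2-party: push local bytes into the source allocation. */
          IBP_store(src->writeCap, data, sizeof(data));

          /* 2-party: pull them back into local memory. */
          IBP_deliver(src->readCap, back, sizeof(back));

          /* 3-party: depot-to-depot copy; the bytes never pass through the
           * client, which is what makes wide-area staging and replication cheap. */
          IBP_copy(src->readCap, dst->writeCap, sizeof(data));
      }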
  • 18. IBP Client API: Management
    • IBP_manage ()/ IBP_status ()
    • Allows for resizing byte arrays.
    • Allows for extending/shortening the time limit on time-limited allocations.
    • Manages reference counts on the read/write capabilities.
    • State probing.
  • 19. IBP Servers
    • Daemons that serve local disk or memory.
    • Root access not required.
    • Can specify sliding time limits or revocability.
    • Encourages resource sharing.
  • 20. Typical IBP usage scenario
  • 21. Logistical Networking Strategies [diagram: four sender-to-receiver scenarios (#1-#4), moving data through one or more intermediate IBP depots in the network]
  • 22. XSufferage on MCell/APST [deployment diagram: an APST client and daemon driving resources at Tokyo Institute of Technology (NetSolve + NFS), University of Tennessee, Knoxville (NetSolve + IBP), and University of California, San Diego (GRAM + GASS); overall: (NetSolve+IBP) + (GRAM+GASS) + (NetSolve+NFS), monitored with NWS]
  • 23. MCell/APST Experimental Results
    • Experimental setting: an MCell simulation with 1,200 tasks, composed of 6 Monte Carlo simulations, with input files of 1, 20, and 100 MB.
    • Four scenarios, differing in where the input files start:
      • (a) all input files only in Japan
      • (b) the 100 MB files staged in California
      • (c) in addition, one 100 MB file staged in Tennessee
      • (d) all input files replicated everywhere
    • [Chart: completion times for the workqueue and XSufferage scheduling heuristics across the four scenarios]
    • XSufferage matches data and tasks to appropriate locations; automatic staging with IBP is effective and improves overall performance.
  • 24. The Network Storage Stack. The L-Bone: Resource discovery & proximity queries; IBP: Allocating and managing network storage (like a network malloc); The exNode: A data structure for aggregation; LoRS, the Logistical Runtime System: Aggregation tools and methodologies
  • 25. The Logistical Backbone (L-Bone)
    • LDAP-based storage resource discovery.
    • Query by capacity, network proximity, geographical proximity, stability, etc.
    • Periodic monitoring of depots.
    • Uses the Network Weather Service (NWS) for live measurements and forecasting.
  • 26. Snapshot, August 2002: approximately 1.6 TB of publicly accessible storage (scaling to a petabyte someday…)
  • 27. The Network Storage Stack. The L-Bone: Resource discovery & proximity queries; IBP: Allocating and managing network storage (like a network malloc); The exNode: A data structure for aggregation; LoRS, the Logistical Runtime System: Aggregation tools and methodologies
  • 28. The exNode
    • The Network “File” Pointer.
    • Analogous to the Unix inode .
    • Maps byte extents to IBP buffers (or other allocations).
    • XML-based data structure/serialization.
    • Allows for replication, flexible decomposition of data.
    • Also allows for “end-to-end services.”
    • Arbitrary metadata.
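    To make the aggregation idea concrete, here is a rough C sketch of the information an exNode records. It is purely illustrative: the real exNode is an XML-serialized structure with its own schema, and these type and field names are assumptions, not that schema. Because several mappings may cover the same byte range on different depots, replication, striping and flexible decomposition all come out of the same structure.

      /* Illustrative only: roughly what an exNode records per mapping. */
      #include <stddef.h>

      struct exnode_mapping {
          size_t  logical_offset;   /* where this extent begins in the "file"   */
          size_t  length;           /* how many bytes the extent covers         */
          char   *read_cap;         /* IBP read capability holding those bytes  */
          char   *manage_cap;       /* for refreshing / trimming the allocation */
          char   *md5;              /* optional per-block checksum (slide 40)   */
      };

      struct exnode {
          size_t                  file_length;  /* logical length of the file    */
          size_t                  nmappings;    /* extents may overlap: replicas */
          struct exnode_mapping  *mappings;
          char                   *metadata;     /* arbitrary user metadata       */
      };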
  • 29. The exNode (XML-based) [diagram: an exNode mapping the byte extents (offsets 0 to 300) of a logical file onto allocations on IBP depots A, B, and C across the network]
  • 30. The Network Storage Stack. The L-Bone: Resource discovery & proximity queries; IBP: Allocating and managing network storage (like a network malloc); The exNode: A data structure for aggregation; LoRS, the Logistical Runtime System: Aggregation tools and methodologies
  • 31. Logistical Runtime System
    • Aggregation for:
      • Capacity
      • Performance (striping)
      • More performance (caching)
      • Reliability (replication)
      • More reliability (ECC)
      • Logistical purposes (routing)
  • 32. Logistical Runtime System
    • Basic Primitives:
      • Upload : Create a network file from local data
      • Download : Get bytes from a network file.
      • Augment : Add more replicas to a network file.
      • Trim : Remove replicas from a network file.
      • Stat : Get information about the network file.
      • Refresh : Alter the time limits of the IBP buffers.
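    As a sketch of what a primitive like Augment reduces to at the IBP level: allocate on a new depot, third-party copy the bytes from an existing replica, and record the new read capability as another mapping in the exNode. The prototypes and field names repeat the assumptions of the earlier sketches; the real LoRS tools add depot selection via the L-Bone, retries, and the end-to-end checks of slide 40.

      /* Sketch: the core of an "augment" -- add one more replica of an extent. */
      #include "ibp.h"                         /* assumed IBP client header */

      /* Returns the read capability of the new replica, or NULL on failure. */
      char *augment_extent(char *src_read_cap, int length,
                           char *new_depot, IBP_attributes attr)
      {
          /* 1. Allocate space on the chosen depot. */
          IBP_set_of_caps caps = IBP_allocate(new_depot, length, attr);
          if (caps == NULL) return NULL;

          /* 2. Third-party copy: the depots move the bytes between themselves. */
          IBP_copy(src_read_cap, caps->writeCap, length);

          /* 3. The caller records caps->readCap as a new mapping in the exNode,
           *    covering the same byte range as the source extent. */
          return caps->readCap;
      }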
  • 33. Upload
  • 34. Augment to Tennessee
  • 35. Augment to Santa Barbara
  • 36. Stat (ls)
  • 37. Failures do happen.
  • 38. Download
  • 39. Trimming (dead capability removal)
  • 40. End-To-End Services:
    • MD5 Checksums stored per exNode block to detect corruption.
    • Encryption is a per-block option.
    • Compression is a per-block option.
    • Parity/Coding is in the design.
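    A sketch of the per-block checksum idea using OpenSSL's MD5 routine: the uploader computes a digest for each block and stores it in the exNode, and the downloader recomputes it on the bytes it gets back, rejecting a corrupted block. This shows only the end-to-end pattern, not the actual LoRS implementation.

      /* Sketch: detect corruption of one exNode block, end to end. */
      #include <string.h>
      #include <openssl/md5.h>

      /* Returns 1 if the downloaded block matches the digest recorded at upload. */
      int block_is_intact(const unsigned char *block, size_t len,
                          const unsigned char expected[MD5_DIGEST_LENGTH])
      {
          unsigned char digest[MD5_DIGEST_LENGTH];
          MD5(block, len, digest);               /* recompute on receipt */
          return memcmp(digest, expected, MD5_DIGEST_LENGTH) == 0;
      }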
  • 41. Parity / Coding [diagram: an exNode with coding; extra IBP buffers in the network hold coded blocks computed as combinations of the data blocks, from which lost blocks can be rebuilt]
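    The simplest instance of the coding in the diagram above is a single XOR parity block, sketched below: any one lost data block can be rebuilt from the surviving blocks plus the parity. (The combinations on the slide point at more general erasure codes; this sketch covers only the one-failure case.)

      /* Sketch: single-parity coding over equal-sized blocks.
       * parity = b0 ^ b1 ^ ...; a missing block = parity ^ all survivors. */
      #include <stddef.h>

      void xor_parity(unsigned char *parity, unsigned char *const blocks[],
                      size_t nblocks, size_t blocklen)
      {
          for (size_t j = 0; j < blocklen; j++) parity[j] = 0;
          for (size_t i = 0; i < nblocks; i++)
              for (size_t j = 0; j < blocklen; j++)
                  parity[j] ^= blocks[i][j];
      }

      /* Rebuild block `missing` in place, given the parity and the other blocks. */
      void recover_block(unsigned char *const blocks[], size_t nblocks,
                         size_t missing, const unsigned char *parity, size_t blocklen)
      {
          for (size_t j = 0; j < blocklen; j++) blocks[missing][j] = parity[j];
          for (size_t i = 0; i < nblocks; i++)
              if (i != missing)
                  for (size_t j = 0; j < blocklen; j++)
                      blocks[missing][j] ^= blocks[i][j];
      }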
  • 42. Scalability
    • No bottlenecks
    • Really hard problems left unsolved, but for the most part, the lower levels shouldn’t need changing.
      • Naming
      • Good scheduling
      • Consistency / File System semantics
      • Computation
  • 43. Status [stack diagram, top to bottom: Applications; Logistical File System; Logistical Tools; exNode and L-Bone; IBP; Local Access; Physical]
    • IBP/L-Bone/exNode/Tools all supported.
    • Apps: Mail, IBP-ster, Video IBP-ster, IBPvo -- demo at SC-02
    • Other institutions (see L-Bone)
  • 44. What’s Coming Up?
    • More nodes on the L-Bone
    • More collaboration with applications groups
    • Research on performance and scheduling
    • Logistical File System
    • A Computation Stack
    • Code / Information at loci.cs.utk.edu
  • 45. The Storage Fabric of the Grid: The Network Storage Stack. James S. Plank, Director, Logistical Computing and Internetworking (LoCI) Laboratory, Department of Computer Science, University of Tennessee. Cluster and Computational Grids for Scientific Computing, September 12, 2002, Le Chateau de Faverges de la Tour, France
  • 46. Replication: Experiment #1 [diagram: a 3 MB file striped and replicated across IBP allocations on depots at UTK, UCSB, UCSD, Harvard, and UNC; sites shown include UCSB, UCSD, TAMU, UTK, UNC, Harvard, Turin (IT), and Stuttgart (DE)]
  • 47. Replication: Experiment #1 [three panels: availability as observed by download clients at UTK, UCSD, and Harvard]
    Download site   Attempts   Success   Fragment availability (%) by depot: UTK / UCSD / UCSB / Harvard / UNC
    UTK             860        100%      99.85 / 99.71 / 95.31 / 59.77 / 99.88
    UCSD            857        100%      96.27 / 98.60 / 88.60 / 57.29 / 97.20
    Harvard         751        100%      99.87 / 99.80 / 90.47 / 57.45 / 99.87
  • 48. Most Frequent Download Path [three maps: the most frequently used download paths from UTK, from Harvard, and from UCSD]
  • 49. Replication: Experiment #2
    • Deleted 12 of the 21 IBP allocations
    • Downloaded from UTK
    [Diagram: the 3 MB file after the deletions, downloaded from UTK; 1,225 attempts, 93.88% overall success, with per-segment download success rates ranging from 48.24% to 100%]