gLite Data Management System


Data Management architecture and commands supported by the gLite Grid middleware

  1. 1. Architecture of the gLite Data Management System Leandro Neumann Ciuffo INFN-Catania (Italy) EELA-2 Tutorial Montevideo, 22.07.2009
  2. 2. Outline <ul><li>Challenges of data management in a Grid infrastructure </li></ul><ul><li>Initial definitions </li></ul><ul><li>Types of Storage Elements </li></ul><ul><li>File naming conventions </li></ul><ul><li>File catalogue </li></ul><ul><li>Practical exercises (hands on) </li></ul><ul><li>Be prepared for a bunch of acronyms! </li></ul>gLite DMS – EELA-2 Tutorial, 22.07.2009
  3. 3. Challenges <ul><li>Heterogeneity </li></ul><ul><ul><li>Data are stored on different storage systems using different access technologies </li></ul></ul><ul><li>Distribution </li></ul><ul><ul><li>Data are stored in different locations (in most cases there is no shared file system or common namespace) </li></ul></ul><ul><ul><li>Data need to be moved between different locations </li></ul></ul><ul><li>Data description </li></ul><ul><ul><li>Data are stored as files, which need to be described and located according to their content </li></ul></ul><ul><li>Services that address these challenges: Storage Resource Manager interface, File Catalogue, File Transfer Service, Metadata Service </li></ul>
  4. 4. Getting started <ul><li>The Storage Element (SE) is the service that allows users and applications (programs) to store and retrieve data (files) </li></ul><ul><li>The DMS provides services for locating, accessing and transferring files </li></ul><ul><ul><li>Users do not need to know a file's location, just its logical name. </li></ul></ul><ul><ul><li>Files can be replicated or transferred to several locations (SEs) as needed. </li></ul></ul><ul><ul><li>Files are shared within a VO </li></ul></ul><ul><li>Files are write-once, read-many </li></ul><ul><ul><li>Files cannot be modified in place; they can only be removed or replaced </li></ul></ul><ul><ul><li>There is no intention of providing a global file management system </li></ul></ul>
  5. 5. Getting started <ul><li>Files located in the Storage Elements (SEs)… </li></ul><ul><ul><li>Are mostly write-once, read-many. </li></ul></ul><ul><ul><li>Are accessible by users and applications from “anywhere” in the Grid. </li></ul></ul><ul><ul><li>Can have several replicas stored at different sites. </li></ul></ul><ul><ul><li>Cannot be modified in place; they can only be removed or replaced. </li></ul></ul><ul><li>Storage Elements (SEs)… </li></ul><ul><ul><li>Provide storage space for files. </li></ul></ul><ul><ul><li>Provide a transfer protocol (GSIFTP), i.e. a GSI-based FTP server </li></ul></ul><ul><ul><li>Provide an interface for the management of disk and tape storage resources: the Storage Resource Manager (SRM) </li></ul></ul>
  6. 6. Types of Storage Elements <ul><li>dCache </li></ul><ul><ul><li>Consists of a server and one or more pool nodes. </li></ul></ul><ul><ul><li>Centralized administration: single point of access to the SE. </li></ul></ul><ul><ul><li>Files in the disk pools are presented under a single virtual filesystem tree. </li></ul></ul><ul><ul><li>Uses the GSI dCache Access Protocol (gsidcap). </li></ul></ul><ul><li>CERN Advanced STORage manager (CASTOR) </li></ul><ul><ul><li>Files are migrated from a disk buffer frontend to tape mass storage </li></ul></ul><ul><ul><li>Uses the insecure Remote File I/O protocol (RFIO) </li></ul></ul><ul><li>Disk Pool Manager (DPM) </li></ul><ul><ul><li>Used for fairly small SEs (max 10 TB of total space) with disk-based storage only. </li></ul></ul><ul><ul><li>Uses the secure RFIO protocol </li></ul></ul>
  7. 7. Storage Resource Manager (SRM) [Diagram: the User Interface submits myJOB to Worker Nodes A, B and C; the Worker Nodes read input from and store output to the Storage Elements (SE - CASTOR, SE - DPM, dCache).]
  8. 8. Storage Resource Manager (SRM) <ul><li>Without the SRM, you as a user would need to know all the storage systems (CASTOR, DPM, dCache) </li></ul><ul><li>The SRM talks to them on your behalf, allocates space for your files, and uses transfer protocols to send your files there </li></ul><ul><li>The SRM is a single interface that takes care of local storage interaction and provides a Grid interface to the outside world. </li></ul>
  9. 9. File Naming conventions (1) <ul><li>Grid Unique IDentifier (GUID) </li></ul><ul><ul><li>Every file has a GUID </li></ul></ul><ul><ul><li>A non-human-readable unique identifier, e.g.: guid:38ed3f60-c402-11d7-a6b0-f53ee5a37e1d </li></ul></ul><ul><ul><li>Note: all replicas of a file share the same GUID </li></ul></ul><ul><li>Logical File Name (LFN) </li></ul><ul><ul><li>An alias that can be used to refer to a file, e.g.: lfn:/grid/gilda/users/mario/myfile.dat </li></ul></ul>[Diagram: Logical File Names 1 to N all map to a single GUID.]
  10. 10. File Naming conventions (2) <ul><li>Storage URL (SURL) or Physical File Name (PFN) </li></ul><ul><ul><li>The location of an actual file on a storage system, e.g.: srm:// </li></ul></ul><ul><ul><li>Note: used by the system to find where the replica is physically stored </li></ul></ul><ul><li>Transport URL (TURL) </li></ul><ul><ul><li>Complete URI with the information needed to access a file in an SE (including the access protocol), e.g.: rfio:// </li></ul></ul>[Diagram: Logical File Names 1 to N map to a single GUID; the GUID maps to Physical Files with SURLs 1 to N; each SURL is resolved to a TURL.]
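
The four kinds of names can be told apart by their prefix. The following is a minimal sketch, not a gLite tool: `classify_grid_name` is a hypothetical helper, and it assumes that any URL with a scheme other than `srm` is a TURL.

```shell
#!/bin/sh
# Classify a grid file name by its prefix.
# Illustrative helper only; assumes any non-srm URL scheme denotes a TURL.
classify_grid_name() {
    case "$1" in
        guid:*)  echo "GUID" ;;
        lfn:*)   echo "LFN" ;;
        srm://*) echo "SURL" ;;
        *://*)   echo "TURL (${1%%://*})" ;;   # keep the access protocol
        *)       echo "unknown" ;;
    esac
}

classify_grid_name "guid:38ed3f60-c402-11d7-a6b0-f53ee5a37e1d"
classify_grid_name "lfn:/grid/gilda/users/mario/myfile.dat"
```

The host names used to exercise the SURL and TURL branches would be placeholders such as `se.example.org`.
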
  11. 11. SRM interactions <ul><li>1. The client asks the SRM for the file, providing a SURL </li></ul><ul><li>2. The SRM asks the Storage Element to provide the file </li></ul><ul><li>3. The Storage Element notifies the SRM that the file is available and where it is located </li></ul><ul><li>4. The SRM returns a TURL (Transfer URL), i.e. the location from which the file can be accessed </li></ul><ul><li>5. The client interacts with the storage using the protocol specified in the TURL </li></ul>[Diagram: Client, SRM and SE connected by arrows numbered 1 to 5.]
  12. 12. Needles in a haystack <ul><li>How do I keep track of all the files I have on the Grid? </li></ul><ul><li>Even if I remember all the LFNs of my files, what about someone else's files? </li></ul><ul><li>How does the Grid keep track of the mapping between LFN(s), GUID and SURL(s)? </li></ul><ul><li>Answer: the File Catalogue. LFC = LCG File Catalogue </li></ul><ul><ul><li>LCG = LHC Computing Grid </li></ul></ul><ul><ul><li>LHC = Large Hadron Collider </li></ul></ul>
  13. 13. File Catalogue <ul><li>The service that maintains the mappings between LFN(s), GUID and SURL(s) </li></ul><ul><ul><li>It keeps track of the location of copies (replicas) of files </li></ul></ul><ul><li>It consists of a single catalogue, in which the LFN is the main key </li></ul><ul><li>It looks like a “top-level” directory in the Grid </li></ul><ul><li>For each supported VO, a separate subdirectory exists under the "/grid" directory. </li></ul><ul><li>All members of a given VO have read-write permissions in that VO's directory </li></ul>
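
Because every VO has its own subdirectory under `/grid`, an LFN can be assembled from, or split back into, a VO name and a path. A minimal sketch, assuming the `lfn:/grid/<VO>/...` layout used in this tutorial; `make_lfn` and `vo_of` are hypothetical helper names, not LFC commands.

```shell
#!/bin/sh
# Build an LFN under /grid/<VO>/ from a VO name and a path relative to it.
make_lfn() {
    vo=$1
    rel=${2#/}                      # drop one leading slash, if present
    printf 'lfn:/grid/%s/%s\n' "$vo" "$rel"
}

# Print the VO that owns an LFN or catalogue path; fail if it is not under /grid/.
vo_of() {
    path=${1#lfn:}                  # accept both lfn:/grid/... and /grid/...
    case "$path" in
        /grid/?*) ;;
        *) echo "not an LFC path: $1" >&2; return 1 ;;
    esac
    path=${path#/grid/}
    printf '%s\n' "${path%%/*}"
}

make_lfn gilda tutorials/yourname/yourfile.txt
vo_of lfn:/grid/gilda/tutorials/yourname/yourfile.txt
```
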
  14. 14. The LFC Service [Diagram: User Interface, File Catalogue and Storage Elements SE A, SE B and SE C; example LFN: lfn:/grid/gilda/tcaland/mpi.txt]
  15. 15. The LFC Service <ul><li>LFN (the LFC key): /grid/dteam/dir1/dir2/file1.root </li></ul><ul><li>GUID: 38ed3f60-c402-11d7-a6b0… </li></ul><ul><li>Replicas (SURLs): srm:// </li></ul><ul><li>Symlink: /grid/dteam/mydir/mylink </li></ul><ul><ul><li>Further LFNs can be added as symlinks to the main LFN. </li></ul></ul><ul><li>Each entry also carries System Metadata and User Metadata </li></ul>
  16. 16. Job submission – example 1 <ul><li>Small files: InputSandbox / OutputSandbox </li></ul>[Diagram: User Interface, WMS, CE and Worker Nodes.]
  17. 17. Data Management – example 2 [Diagram: User Interface, WMS, CE and Worker Nodes, together with the LFC and two SEs.]
  18. 18. LFC commands <ul><li>Interact with the catalogue only </li></ul><ul><ul><li>lfc-chmod: Change access mode of the LFC file/directory </li></ul></ul><ul><ul><li>lfc-chown: Change owner and group of the LFC file/directory </li></ul></ul><ul><ul><li>lfc-delcomment: Delete the comment associated with the file/directory </li></ul></ul><ul><ul><li>lfc-getacl: Get file/directory access control lists </li></ul></ul><ul><ul><li>lfc-ln: Make a symbolic link to a file/directory </li></ul></ul><ul><ul><li>lfc-ls: List file/directory entries in a directory </li></ul></ul><ul><ul><li>lfc-mkdir: Create a directory </li></ul></ul><ul><ul><li>lfc-rename: Rename a file/directory </li></ul></ul><ul><ul><li>lfc-rm: Remove a file/directory </li></ul></ul><ul><ul><li>lfc-setacl: Set file/directory access control lists </li></ul></ul><ul><ul><li>lfc-setcomment: Add/replace a comment </li></ul></ul>
  19. 19. lcg-utils commands <ul><li>Copy files to/from/between SEs. </li></ul><ul><li>Keep the SEs and the Catalogue up to date. </li></ul><ul><li>The RPM containing these tools (lcg_util) is installed on the WNs and UIs. </li></ul><ul><ul><li>lcg-cp: Copies a grid file to a local destination </li></ul></ul><ul><ul><li>lcg-cr: Copies a file to an SE and registers the file in the catalogue </li></ul></ul><ul><ul><li>lcg-del: Deletes one file </li></ul></ul><ul><ul><li>lcg-rep: Replicates a file between SEs and registers the replica </li></ul></ul><ul><ul><li>lcg-gt: Gets the TURL for a given SURL and transfer protocol </li></ul></ul><ul><ul><li>lcg-sd: Sets file status to “Done” for a given SURL in an SRM request </li></ul></ul>
  20. 20. Environment Variables <ul><li>Make sure to use the correct BDII and LFC </li></ul><ul><li>BDII - LCG_GFAL_INFOSYS </li></ul><ul><ul><li>export </li></ul></ul><ul><li>LFC - LFC_HOST </li></ul><ul><ul><li>export </li></ul></ul>
  21. 21. Let’s practice! Reference:
  22. 22. Environment Variables <ul><li>Pointing to the right BDII: check with echo $LCG_GFAL_INFOSYS, set with export LCG_GFAL_INFOSYS=<BDII host> </li></ul><ul><li>Pointing to the right LFC: check with echo $LFC_HOST, set with export LFC_HOST=<LFC host> </li></ul>
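
A quick way to confirm that both variables are set before running any data management command is sketched below. `check_grid_env` is a hypothetical helper; it only checks that the variables are non-empty, not that the hosts they name are reachable or correct.

```shell
#!/bin/sh
# Report the BDII and LFC settings; return non-zero if either is missing.
check_grid_env() {
    missing=0
    for name in LCG_GFAL_INFOSYS LFC_HOST; do
        eval "value=\${$name:-}"            # indirect lookup of the variable
        if [ -z "$value" ]; then
            echo "$name is not set" >&2
            missing=1
        else
            echo "$name=$value"
        fi
    done
    return "$missing"
}

check_grid_env || echo "Set the missing variables before using lcg-* or lfc-* commands"
```
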
  23. 23. Before starting… <ul><li>Make sure you have created a proxy: voms-proxy-init --voms gilda </li></ul>
  24. 24. LFC: Listing files and directories <ul><li>Try it: lfc-ls -l /grid/gilda </li></ul><ul><li>Remember that the LFC has a directory tree structure </li></ul><ul><ul><li>/grid/<VO_name>/<user directory>, where /grid/<VO_name> is the LFC namespace and <user directory> is defined by the user </li></ul></ul><ul><li>You can set the LFC_HOME variable to use relative paths: export LFC_HOME=/grid/gilda/tutorials, then lfc-ls </li></ul>
  25. 25. LFC: creating a directory <ul><li>Create your own personal directory inside: </li></ul><ul><ul><li>/grid/gilda/tutorials/<your dir> </li></ul></ul><ul><li>Try it: lfc-mkdir /grid/gilda/tutorials/yourname </li></ul><ul><li>You can check that it was created by typing: lfc-ls /grid/gilda/tutorials </li></ul>
  26. 26. Downloading a file <ul><li>First of all, let's download a file from an SE to start “playing” with it. </li></ul><ul><li>Basic Usage: lcg-cp --vo <vo name> <LFN origin> <local destination> </li></ul><ul><li>Try it: lcg-cp --vo gilda lfn:/grid/gilda/users/example/alien.txt file://$HOME/alien.txt </li></ul>
  27. 27. Copying and registering a file <ul><li>lcg-cr </li></ul><ul><ul><li>Copies a file to an SE and registers the file in the catalogue </li></ul></ul><ul><li>Basic Usage: lcg-cr --vo <vo name> -l <LFN destination> -d <SE> <local file> </li></ul><ul><li>This command returns the GUID of your file </li></ul><ul><li>Make sure you have a directory in the LFC ( /grid/gilda/users/sagrid/yourname/ ) </li></ul><ul><li>Try it: lcg-cr --vo gilda -l lfn:/grid/gilda/tutorials/yourname/yourfile.txt -d <SE> file://$HOME/alien.txt </li></ul><ul><li>To find the available SEs, use the lcg-info or lcg-infosites commands, e.g. lcg-infosites --vo gilda se, which prints: </li></ul>
      Avail Space(Kb)   Used Space(Kb)   Type   SEs
      ---------------------------------------------
      1100000000        1145007          n.a
      1030000000        32               n.a
      295250000         75945624         n.a
      n.a               999999           n.a
      60440000          3280565          n.a
      1008437           8844236          n.a
      53160000          440416           n.a
      2430000000        440450           n.a
      97890000          440423           n.a
  28. 28. Replicate a file between SEs <ul><li>Basic Usage: lcg-rep --vo <vo name> -d <destination SE> <LFN of your file> </li></ul><ul><li>Try it: lcg-rep --vo gilda -d <destination SE> lfn:/grid/gilda/tutorials/yourname/yourfile.txt </li></ul>
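
The upload-then-replicate sequence can be previewed without touching the Grid. The sketch below only prints the `lcg-cr` and `lcg-rep` command lines, using the options from the slides; `show_lcg_cr` and `show_lcg_rep` are hypothetical helpers, and `se1.example.org`, `se2.example.org` and `/home/yourname/alien.txt` are placeholders for real SE host names (taken from `lcg-infosites`) and a real local file.

```shell
#!/bin/sh
# Print (do not run) the command that copies a local file to an SE
# and registers it in the catalogue.
# $1 = VO, $2 = destination LFN, $3 = destination SE, $4 = absolute local path
show_lcg_cr() {
    printf 'lcg-cr --vo %s -l %s -d %s file://%s\n' "$1" "$2" "$3" "$4"
}

# Print (do not run) the command that replicates a registered file to another SE.
# $1 = VO, $2 = destination SE, $3 = LFN
show_lcg_rep() {
    printf 'lcg-rep --vo %s -d %s %s\n' "$1" "$2" "$3"
}

lfn=lfn:/grid/gilda/tutorials/yourname/yourfile.txt
show_lcg_cr  gilda "$lfn" se1.example.org /home/yourname/alien.txt
show_lcg_rep gilda se2.example.org "$lfn"
```

Printing the commands first makes it easy to spot a missing SE name after `-d` before anything is transferred.
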
  29. 29. Listing the replicas <ul><li>Use the lcg-lr command: lcg-lr --vo gilda lfn:/grid/gilda/tutorials/yourname/yourfile.txt </li></ul><ul><li>The command returns the SURLs of all replicas </li></ul><ul><li>A file can be stored on multiple SEs so that a job can download it from the closest SE while it is running. </li></ul>
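
To see which SEs hold a replica, the host part of each SURL can be extracted. This is a sketch under the assumption that `lcg-lr` prints one `srm://` SURL per line; `se_host_of` is a hypothetical helper and the SURLs below are made-up placeholders standing in for real `lcg-lr` output.

```shell
#!/bin/sh
# Print the SE host name contained in a SURL (srm://host[:port]/path).
se_host_of() {
    rest=${1#srm://}            # strip the scheme
    rest=${rest%%/*}            # keep host[:port]
    printf '%s\n' "${rest%%:*}" # drop the optional port
}

# Placeholder SURLs standing in for the output of lcg-lr.
printf '%s\n' \
    'srm://se1.example.org/dpm/example.org/home/gilda/file1' \
    'srm://se2.example.org:8446/dpm/example.org/home/gilda/file1' |
while read -r surl; do
    se_host_of "$surl"
done
```
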
  30. 30. Adding metadata information <ul><li>This is the only user-defined metadata that can be associated with catalogue entries. </li></ul><ul><li>Basic Usage: lfc-setcomment <LFC file path> "Your comments" </li></ul><ul><li>Try it: lfc-setcomment /grid/gilda/tutorials/yourname/yourfile.txt "This is my comment" </li></ul>
  31. 31. Listing with comments <ul><li>Try it: lfc-ls --comment /grid/gilda/tutorials/yourname/ </li></ul>
  32. 32. Creating a symbolic link <ul><li>Two different LFNs will point to the same file. </li></ul><ul><li>Basic Usage: lfc-ln -s <your symbolic link> <original file> </li></ul><ul><li>Try it: lfc-ln -s /grid/gilda/tutorials/yourname/yourlink.txt /grid/gilda/tutorials/yourname/yourfile.txt </li></ul><ul><li>Check your link by typing: lfc-ls -l /grid/gilda/tutorials/yourname/ </li></ul>
  33. 33. Downloading a file <ul><li>Basic Usage: lcg-cp --vo <vo name> <LFN origin> <local destination> </li></ul><ul><li>Try it: lcg-cp --vo gilda lfn:/grid/gilda/tutorials/yourname/yourfile.txt file://$HOME/yourfile.txt </li></ul>
  34. 34. Deleting a file <ul><li>Basic Usage: lcg-del -a --vo <vo name> <LFN> </li></ul><ul><li>Try it: lcg-del -a --vo gilda lfn:/grid/gilda/tutorials/yourname/yourfile.txt </li></ul>
  35. 35. Removing an LFC directory <ul><li>Basic Usage: lfc-rm -r <LFC file path> </li></ul><ul><li>Try it: lfc-rm -r /grid/gilda/tutorials/yourname </li></ul>
  36. 36. Get the file SURL <ul><li>Basic Usage: lcg-lr --vo <vo name> <LFN> </li></ul><ul><li>Try it: lcg-lr --vo gilda lfn:/grid/gilda/tutorials/yourname/yourfile.txt </li></ul><ul><li>Some advanced Data Management commands (the File Transfer Service, for instance) require the SURL of a file </li></ul>
  37. 37. Get the file TURL <ul><li>Basic Usage: lcg-gt <file SURL> <protocol supported by the SE> </li></ul><ul><li>Try it: lcg-gt <paste the file SURL: srm://…> gsiftp </li></ul>
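
The last two exercises can be chained: list the SURLs of an LFN, then request a TURL for each. This is a sketch only. `turls_for_lfn` is a hypothetical wrapper, it assumes `lcg-lr` prints one SURL per line, and it passes through whatever `lcg-gt` prints without parsing it. It requires a valid proxy and the lcg_util tools on the PATH.

```shell
#!/bin/sh
# For every replica of an LFN, print its SURL followed by the lcg-gt output.
# $1 = VO name, $2 = LFN, $3 = protocol supported by the SE (e.g. gsiftp)
turls_for_lfn() {
    vo=$1
    lfn=$2
    proto=$3
    lcg-lr --vo "$vo" "$lfn" | while read -r surl; do
        [ -n "$surl" ] || continue           # skip blank lines
        echo "SURL: $surl"
        # Keep lcg-gt from consuming the remaining SURLs on stdin.
        lcg-gt "$surl" "$proto" </dev/null
    done
}

# Example call (needs a Grid UI, so it is left commented out):
# turls_for_lfn gilda lfn:/grid/gilda/tutorials/yourname/yourfile.txt gsiftp
```
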