Mon Acc Ccr Workshop


  1. Fabric Monitoring, Accounting, Storage and Reports: experience at the INFN Tier1. Felice Rosso, on behalf of INFN Tier1, [email_address]. INFN Workshop on Computing and Networks (Workshop sul calcolo e reti INFN), Otranto, 8-6-2006
  2. Outline <ul><li>CNAF-INFN Tier1 </li></ul><ul><li>FARM and GRID Monitoring </li></ul><ul><li>Local Queues Monitoring </li></ul><ul><ul><li>Local and GRID accounting </li></ul></ul><ul><li>Storage Monitoring and accounting </li></ul><ul><li>Summary </li></ul>
  3. Introduction <ul><li>Location: INFN-CNAF, Bologna (Italy) </li></ul><ul><ul><li>one of the main nodes of the GARR network </li></ul></ul><ul><li>Computing facility for the INFN HENP community </li></ul><ul><ul><li>Participating in the LCG, EGEE and INFNGRID projects </li></ul></ul><ul><li>Multi-experiment Tier1 </li></ul><ul><ul><li>LHC experiments (Alice, Atlas, CMS, LHCb) </li></ul></ul><ul><ul><li>CDF, BABAR </li></ul></ul><ul><ul><li>VIRGO, MAGIC, ARGO, Bio, TheoPhys, Pamela ... </li></ul></ul><ul><li>Resources assigned to experiments on a yearly plan. </li></ul>
  4. The Farm in a Nutshell <ul><li>SLC 3.0.6, LCG 2.7, LSF 6.1 </li></ul><ul><li>~720 WNs in the LSF pool (~1580 KSI2K) </li></ul><ul><ul><li>Common LSF pool: 1 job per logical CPU (slot) </li></ul></ul><ul><ul><ul><li>At most 1 process running at the same time per job </li></ul></ul></ul><ul><ul><li>GRID and local submission allowed </li></ul></ul><ul><ul><ul><li>GRID and non-GRID jobs can run on the same WN </li></ul></ul></ul><ul><ul><ul><li>GRID and non-GRID jobs can be submitted to the same queue </li></ul></ul></ul><ul><ul><li>One or more queues for each VO/experiment </li></ul></ul><ul><ul><li>Since 24 April 2005, 2,700,000 jobs have been executed on our LSF pool (~1,600,000 GRID) </li></ul></ul><ul><ul><li>3 CEs (main CE: 4 dual-core Opterons, 24 GB RAM) + 1 gLite CE </li></ul></ul>
  5. Access to the Batch System [diagram: GRID access through UIs and a CE; “legacy” non-GRID access through UIs with an LSF client; both reach the LSF pool of WNs (WN1 … WNn) and the SE]
  6. Farm Monitoring Goals <ul><li>Scalability to the Tier1 full size </li></ul><ul><li>Many parameters for each WN/server </li></ul><ul><li>Database and plots on web pages </li></ul><ul><li>Data analysis </li></ul><ul><li>Report problems on web page(s) </li></ul><ul><li>Share data with GRID tools </li></ul><ul><li>RedEye: the INFN-T1 monitoring tool </li></ul><ul><li>RedEye runs as a simple local user. No root! </li></ul>
  7. Tier1 Fabric Monitoring <ul><li>What do we get? </li></ul><ul><li>CPU load, status and jiffies </li></ul><ul><li>Ethernet I/O (MRTG, by the network team) </li></ul><ul><li>Temperatures, fan RPM (IPMI) </li></ul><ul><li>Total number and type of active TCP connections </li></ul><ul><li>Processes created, running, zombie, etc. </li></ul><ul><li>RAM and swap memory </li></ul><ul><li>Users logged in </li></ul><ul><li>SLC3 and SLC4 compatible </li></ul>
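Several of these per-WN metrics can be read directly from the /proc filesystem on Linux. The actual RedEye sensors are not shown in the slides; the following sketch is an illustrative assumption of how two of them (CPU jiffy counters and TCP-connection states) could be extracted:

```python
def parse_cpu_jiffies(stat_text):
    """Extract the aggregate jiffy counters (user, nice, system, idle)
    from the content of /proc/stat."""
    for line in stat_text.splitlines():
        if line.startswith("cpu "):          # aggregate line, not cpu0/cpu1...
            fields = line.split()
            return tuple(int(v) for v in fields[1:5])
    raise ValueError("no aggregate 'cpu' line found")

def count_tcp_states(tcp_text):
    """Count entries per connection state in the content of /proc/net/tcp.
    The state is the hex code in the 4th column: 01=ESTABLISHED, 0A=LISTEN..."""
    counts = {}
    for line in tcp_text.splitlines()[1:]:   # skip the header line
        fields = line.split()
        if len(fields) > 3:
            counts[fields[3]] = counts.get(fields[3], 0) + 1
    return counts
```

On a WN the sensor would feed these functions with `open("/proc/stat").read()` and `open("/proc/net/tcp").read()` from the 5-minute crontab.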
  8. Tier1 Fabric Monitor
  9. Local WN Monitoring <ul><li>On each WN, every 5 min (local crontab) the info is saved locally (<3 KBytes --> 2-3 TCP packets) </li></ul><ul><li>1 minute later a collector “gets” the info via socket </li></ul><ul><ul><li>“gets”: tidy parallel fork with timeout control </li></ul></ul><ul><li>Getting and saving the data from 750 WNs takes ~6 sec in the best case, 20 sec in the worst case (timeout cut-off) </li></ul><ul><li>Update the database (last day, week, month) </li></ul><ul><li>For each WN --> 1 file (possibility of cumulative plots) </li></ul><ul><li>Analysis of the monitoring data </li></ul><ul><li>Local thumbnail cache creation (web clickable) </li></ul><ul><li>http://collector.cnaf.infn.it/davide/rack.php </li></ul><ul><li>http://collector.cnaf.infn.it/davide/analyzer.html </li></ul>
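The "tidy parallel fork with timeout control" can be sketched as a thread pool where every per-host fetch has its own timeout, so one dead WN costs at most one timeout instead of stalling the whole sweep. The port number and the read-until-close protocol below are illustrative assumptions, not the real collector's:

```python
import socket
from concurrent.futures import ThreadPoolExecutor, as_completed

def fetch_report(host, port=9999, timeout=5.0):
    """Fetch one WN's small monitoring report (<3 KB) over plain TCP.
    Port 9999 and read-until-close are assumptions for this sketch."""
    with socket.create_connection((host, port), timeout=timeout) as s:
        s.settimeout(timeout)
        chunks = []
        while True:
            data = s.recv(4096)
            if not data:
                return b"".join(chunks)
            chunks.append(data)

def collect(hosts, fetch=fetch_report, workers=50, timeout=5.0):
    """Poll all worker nodes in parallel; hosts that fail or time out are
    reported separately instead of blocking the collection round."""
    reports, failed = {}, []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(fetch, h): h for h in hosts}
        for fut in as_completed(futures):
            host = futures[fut]
            try:
                reports[host] = fut.result()
            except Exception:
                failed.append(host)
    return reports, failed
```

With ~50 workers and a 5 s cap per host, a 750-WN sweep stays within the few-seconds-to-20-seconds window the slide describes.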
  10. Web Snapshot CPU-RAM
  11. Web Snapshot TCP connections
  12. Web Snapshot users logged in
  13. Analyzer.html
  14. Fabric --> GRID Monitoring <ul><li>Effort on exporting relevant fabric metrics to the Grid level, e.g.: </li></ul><ul><ul><li># of active WNs, </li></ul></ul><ul><ul><li># of free slots, </li></ul></ul><ul><ul><li>etc… </li></ul></ul><ul><li>GridICE integration </li></ul><ul><ul><li>Configuration based on Quattor </li></ul></ul><ul><li>Avoid duplication of sensors on the farm </li></ul>
  15. Local Queues Monitoring <ul><li>Every 5 minutes a snapshot of the queue status is saved on the batch manager </li></ul><ul><li>A collector gets the info and updates the local database (same logic as the farm monitoring) </li></ul><ul><ul><li>Daily / Weekly / Monthly / Yearly DB </li></ul></ul><ul><ul><li>DB: total and single queues </li></ul></ul><ul><li>3 classes of users for each queue </li></ul><ul><li>Plot generator: Gnuplot 4.0 </li></ul><ul><li>http://tier1.cnaf.infn.it/monitor/LSF/ </li></ul>
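A queue snapshot of this kind can be taken by parsing the output of LSF's `bqueues` command. The column layout assumed below (QUEUE_NAME … NJOBS PEND RUN SUSP) is the LSF 6.x default; the real collector's parser is not shown in the slides:

```python
def parse_bqueues(text):
    """Turn one `bqueues` snapshot into a dict keyed by queue name.
    Assumes the default LSF 6.x header columns; adjust if yours differ."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    snapshot = {}
    for line in lines[1:]:
        row = dict(zip(header, line.split()))   # pair each value with its column
        snapshot[row["QUEUE_NAME"]] = {
            "njobs": int(row["NJOBS"]),
            "pend": int(row["PEND"]),
            "run": int(row["RUN"]),
        }
    return snapshot
```

Each 5-minute snapshot is then appended to the daily DB, and the daily files are rolled up into the weekly, monthly and yearly ones.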
  16. Web Snapshot LSF Status
  17. UGRID: general GRID user (lhcb001, lhcb030…); SGM: Software GRID Manager (lhcbsgm); OTHER: local user
  18. UGRID: general GRID user (babar001, babar030…); SGM: Software GRID Manager (babarsgm); OTHER: local user
  19. RedEye - LSF Monitoring <ul><li>Real-time slot usage </li></ul><ul><li>Fast, needs little CPU power, stable, works over the WAN </li></ul><ul><li>RedEye runs as a simple user, not root </li></ul><ul><li>BUT… </li></ul><ul><li>all slots have the same weight (future: Jeep solution) </li></ul><ul><li>Jobs shorter than 5 minutes can be missed </li></ul><ul><li>SO: </li></ul><ul><li>We need something that works for ALL jobs. </li></ul><ul><li>We need to know who uses our FARM, and how. </li></ul><ul><li>Solution: </li></ul><ul><li>Offline parsing of the LSF log files once per day (Jeep integration) </li></ul>
  20. Job-related metrics <ul><li>From the LSF log file we get the following non-GRID info: </li></ul><ul><li>LSF JobID, local UID of the job owner </li></ul><ul><li>“any kind of time” (submission, WCT, etc.) </li></ul><ul><li>Max RSS and virtual memory usage </li></ul><ul><li>From which computer (hostname) the job was submitted (GRID CE / locally) </li></ul><ul><li>Where the job was executed (WN hostname) </li></ul><ul><li>We complete this set with KSI2K & GRID info (Jeep) </li></ul><ul><li>DGAS interface http://www.to.infn.it/grid/accounting/main.html </li></ul><ul><li>http://tier1.cnaf.infn.it/monitor/LSF/plots/acct/ </li></ul>
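Once these fields are extracted from a JOB_FINISH record, the derived quantities (wall-clock time, queue wait, efficiency, Grid vs. local origin) follow directly. A minimal sketch of such a record; the field names and the example hostnames are ours, not LSF's:

```python
from dataclasses import dataclass

@dataclass
class JobRecord:
    # Fields extracted from an LSF accounting (lsb.acct) JOB_FINISH event.
    job_id: int
    user: str          # local UID owner of the job
    queue: str
    submit_time: int   # epoch seconds
    start_time: int
    end_time: int
    cpu_time: float    # user + system CPU seconds
    max_rss_kb: int
    from_host: str     # submission host (GRID CE or local machine)
    exec_host: str     # WN where the job ran

    @property
    def wct(self):
        """Wall-clock time the job actually ran."""
        return self.end_time - self.start_time

    @property
    def wait(self):
        """Queue wait between submission and dispatch."""
        return self.start_time - self.submit_time

    @property
    def efficiency(self):
        """CPU time over WCT; near 1.0 for CPU-bound single-slot jobs."""
        return self.cpu_time / self.wct if self.wct > 0 else 0.0

    def is_grid(self, ce_hosts):
        """A job counts as Grid-submitted if it came from one of the CEs."""
        return self.from_host in ce_hosts
```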
  21. Queues accounting report
  22. Queues accounting report
  23. Queues accounting report
  24. Queues accounting report <ul><li>KSI2K [WCT] May 2006, all jobs </li></ul>
  25. Queues accounting report <ul><li>CPUTime [hours] May 2006, GRID jobs </li></ul>
  26. How do we use KSpecInt2K? <ul><li>1 slot --> 1 job </li></ul><ul><li>http://tier1.cnaf.infn.it/monitor/LSF/plots/ksi/ </li></ul><ul><li>For each job: </li></ul>
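The per-job formula itself is not readable from the slide, but the "KSI2K [WCT]" plots suggest normalizing each job's wall-clock time by the SpecInt2000 rating of the WN that ran it. A sketch under that assumption (the exact normalization used at the Tier1 may differ):

```python
def ksi2k_hours(wct_seconds, host_si2k):
    """Normalized work delivered by one single-slot job, in kSI2K*hours:
    wall-clock hours scaled by the worker node's SpecInt2000 rating.
    Assumed WCT-based normalization, matching the 'KSI2K [WCT]' plots."""
    return (wct_seconds / 3600.0) * (host_si2k / 1000.0)
```

For example, a 2-hour job on a 1500-SI2K node delivers 3.0 kSI2K*hours; summing this over all jobs in a month gives the per-queue accounting shown in the plots.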
  27. KSI2K T1-INFN Story
  28. KSI2K T1-INFN Story
  29. Job Check and Report <ul><li>lsb.acct had a big bug! </li></ul><ul><ul><li>Randomly: CPU-user-time = 0.00 sec </li></ul></ul><ul><ul><li>The correct CPUtime comes from bjobs -l <JOBID> </li></ul></ul><ul><ul><li>Fixed by Platform on 25 July 2005 </li></ul></ul><ul><li>CPUtime > WCT? --> Possible spawn </li></ul><ul><li>RAM memory: is the job on the right WN? </li></ul><ul><li>Is the WorkerNode a “black hole”? </li></ul><ul><li>We have a daily report (web page) </li></ul>
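Two of these sanity checks are easy to state precisely: a single-slot job cannot honestly accumulate more CPU time than wall-clock time (a spawned process is likely), and a "black hole" WN is one where an unusual number of jobs all die within seconds. The thresholds below are illustrative assumptions, not the Tier1's actual values:

```python
from statistics import median

def check_job(cpu_time, wct, slots=1):
    """Flag a possible spawned process: CPU time above WCT * slots means
    more processes ran than the job's slot allocation allows."""
    return "possible-spawn" if cpu_time > wct * slots else "ok"

def find_black_holes(jobs, min_jobs=20, max_median_wct=60):
    """Spot 'black hole' WNs from (exec_host, wct_seconds) pairs: hosts that
    ran at least `min_jobs` jobs whose median runtime is suspiciously short.
    Both thresholds are illustrative."""
    by_host = {}
    for host, wct in jobs:
        by_host.setdefault(host, []).append(wct)
    return sorted(h for h, wcts in by_host.items()
                  if len(wcts) >= min_jobs and median(wcts) <= max_median_wct)
```

Running checks like these over the parsed daily log is what feeds the daily report page.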
  30. Fabric and GRID monitoring <ul><li>Effort on exporting relevant queue and job metrics to the Grid level. </li></ul><ul><ul><li>Integration with GridICE </li></ul></ul><ul><ul><li>Integration with DGAS (done!) </li></ul></ul><ul><ul><li>Grid (VO) level view of resource usage </li></ul></ul><ul><li>Integration of local job information with Grid-related metrics, e.g.: </li></ul><ul><ul><li>DN of the user proxy </li></ul></ul><ul><ul><li>VOMS extensions to the user proxy </li></ul></ul><ul><ul><li>Grid Job ID </li></ul></ul>
  31. GRID ICE <ul><li>Dissemination http://grid.infn.it/gridice </li></ul><ul><li>GridICE server (development with upcoming features) </li></ul><ul><li>http://gridice3.cnaf.infn.it:50080/gridice </li></ul><ul><li>GridICE server for the EGEE Grid </li></ul><ul><li>http://gridice2.cnaf.infn.it:50080/gridice </li></ul><ul><li>GridICE server for INFN-Grid </li></ul><ul><li>http://gridice4.cnaf.infn.it:50080/gridice </li></ul>
  32. GRID ICE <ul><li>For each site, check the GRID services (RB, BDII, CE, SE…) </li></ul><ul><li>Service check --> does the PID exist? </li></ul><ul><li>Summary and/or notification </li></ul><ul><li>From the GRID servers: summary of the CPU and storage resources available per site and/or per VO </li></ul><ul><li>Storage available on the SE per VO, from the BDII </li></ul><ul><li>Downtimes </li></ul>
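The "does the PID exist?" probe is a classic Unix idiom: read the daemon's pidfile and send signal 0, which tests for existence without affecting the process. A sketch of that idiom (GridICE's own sensor code is not shown in the slides, and the pidfile path is an assumption):

```python
import os

def pid_alive(pid):
    """Check whether a PID exists: signal 0 probes without killing."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False           # no such process
    except PermissionError:
        return True            # exists, but owned by another user
    return True

def check_service(pidfile):
    """Liveness probe for one service: read its pidfile, test the PID.
    A missing or unreadable pidfile counts as the service being down."""
    try:
        with open(pidfile) as f:
            pid = int(f.read().strip())
    except (OSError, ValueError):
        return False
    return pid_alive(pid)
```

A site summary is then just `check_service(p)` over one pidfile per monitored service (RB, BDII, CE, SE, …).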
  33. GRID ICE <ul><li>GridICE as fabric monitor for “small” sites </li></ul><ul><li>Based on LeMon (server and sensors) </li></ul><ul><li>Parsing of the LeMon flat-file logs </li></ul><ul><li>Plots based on RRDtool </li></ul><ul><li>Legnaro: ~70 WorkerNodes </li></ul>
  34. GridICE screenshots
  35. Jeep <ul><li>General-purpose data collector (push technology) </li></ul><ul><li>DB-WNINFO: historical hardware DB (MySQL on the HLR node). </li></ul><ul><li>KSI2K used by each single job (DGAS) </li></ul><ul><li>Job monitoring (check RAM usage in real time, efficiency history) </li></ul><ul><li>FS-INFO: is enough space available on the volumes? </li></ul><ul><li>AutoFS: are all the dynamic mount points working? </li></ul><ul><li>Matchmaking UID/GID --> VO </li></ul>
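The UID --> VO matchmaking can exploit the pool-account naming convention visible on the LHCb and BaBar slides: generic Grid users are named like lhcb001, lhcb030, and the software manager is lhcbsgm. A sketch of that mapping (the real Jeep rules, e.g. GID-based matching, are not shown in the slides):

```python
import re

# Pool-account convention from the earlier slides: <vo><NNN> for a generic
# Grid user, <vo>sgm for the VO software manager; anything else is local.
POOL_RE = re.compile(r"^([a-z]+?)(\d{3}|sgm)$")

def classify_user(username, known_vos):
    """Map a local account name to (VO, class) for per-VO accounting.
    Returns (None, "OTHER") for local users that match no VO."""
    m = POOL_RE.match(username)
    if m and m.group(1) in known_vos:
        return m.group(1), "SGM" if m.group(2) == "sgm" else "UGRID"
    return None, "OTHER"
```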
  36. The Storage in a Nutshell <ul><li>Different hardware (NAS, SAN, tapes) </li></ul><ul><ul><li>More than 300 TB of disk, 130 TB of tape </li></ul></ul><ul><li>Different access methods (NFS/RFIO/Xrootd/gridftp) </li></ul><ul><li>Volume filesystems: EXT3, XFS and GPFS </li></ul><ul><li>Volumes bigger than 2 TBytes: RAID 50 (EXT3/XFS), direct (GPFS) </li></ul><ul><li>Tape access: CASTOR (50 TB of disk as stage area) </li></ul><ul><li>Volume management via a PostgreSQL DB </li></ul><ul><li>60 servers export the filesystems to the WNs </li></ul>
  37. Storage at T1-INFN <ul><li>Hierarchical Nagios servers to check the service status </li></ul><ul><ul><li>gridftp, srm, rfio, castor, ssh </li></ul></ul><ul><li>Local tool to sum the space used by the VOs </li></ul><ul><li>RRD for the plots (total and used volume space) </li></ul><ul><li>Binary, proprietary (IBM/STEK) software to check some of the hardware status. </li></ul><ul><li>Very difficult to interface the proprietary software to the T1 framework </li></ul><ul><li>For now: only e-mail reports for bad blocks, disk failures and filesystem failures </li></ul><ul><li>Plots: intranet & on demand by VO </li></ul>
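The "sum space used by the VOs" tool amounts to probing each volume a VO owns and aggregating. A sketch using `statvfs` for the probe; the mapping of VOs to mount points is an assumption here, since in reality it comes from the PostgreSQL volume DB:

```python
import os

def volume_usage(path):
    """Total and used bytes on the filesystem holding `path` (via statvfs)."""
    st = os.statvfs(path)
    total = st.f_blocks * st.f_frsize
    free = st.f_bfree * st.f_frsize
    return total, total - free

def vo_report(vo_volumes, probe=volume_usage):
    """Sum usage per VO over its mount points; one VO may own several
    volumes spread across the ~60 storage servers."""
    report = {}
    for vo, paths in vo_volumes.items():
        usage = [probe(p) for p in paths]
        report[vo] = (sum(t for t, _ in usage), sum(u for _, u in usage))
    return report
```

The per-VO (total, used) pairs are what gets fed to RRD for the intranet plots.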
  38. Tape/Storage usage report
  39. Summary <ul><li>Fabric-level monitoring with smart reports is needed to ease management </li></ul><ul><li>T1 already has a solution for the next 2 years! </li></ul><ul><li>Not exportable due to man-power (no support) </li></ul><ul><li>Future at INFN? What is the T2s' man-power? </li></ul><ul><li>LeMon & Oracle? What is the T2s' man-power? </li></ul><ul><li>RedEye? What is the T2s' man-power? </li></ul><ul><li>Real collaboration takes more than mailing lists and phone conferences </li></ul>
