EGEE 3 Project


EGEE 3 Project Presentation

Published in: Technology, Education

EGEE 3 Project

  1. Enabling Grids for E-sciencE: EGEE-III. A presentation for EU officials. Status: May 2008. EGEE-III INFSO-RI-222667
  2. EGEE: flagship Grid infrastructure project co-funded by the European Commission
     Main objectives:
     • Operate a large-scale, production quality Grid infrastructure for e-Science
     • Attract new resources and users from sciences as well as business
     Pie chart: Service & Networking support 50%; Application support 20%; Integration and testing 9%; Training 8%; Dissemination & International Cooperation 6%; Middleware 5%; Management 2%
  3. EGEE-III
     • EGEE-III
       – Third phase of the EGEE programme (EGEE: April 2004 – March 2006; EGEE-II: April 2006 – April 2008)
       – Co-funded by the European Commission under call INFRA-2007-1.2.3
       – 9010 person months/375 FTEs
       – 2 year period: 1 May 2008 to 30 April 2010
       – EC Requested Contribution: €32M, representing less than 1/3 of total project costs
     • Key objectives
       – Expand/optimise existing EGEE infrastructure, include more resources and user communities
       – Prepare migration from a project-based model to a sustainable federated infrastructure based on National Grid Initiatives
     • Consortium
       – Structured on a national basis (National Grid Initiatives/Joint Research Units)
       – 42 beneficiaries (+ 100 JRU members)
  4. EGEE-III activities and leaders
     • NA1: Management of the project (Bob Jones, CERN)
     • NA2: Dissemination, Communication and Outreach (Catherine Gater, CERN)
     • NA3: User Training and support (Robin McConnell, UEDIN)
     • NA4: User Community Support and Expansion (Cal Loomis, CNRS)
     • NA5: International Cooperation & Policy (Panos Louridas, GRNET)
     • SA1: Grid Operations (Maite Barroso Lopez, CERN)
     • SA2: Networking Support (Xavier Jeannin, CNRS)
     • SA3: Integration, testing & certification (Oliver Keeble, CERN)
     • JRA1: Middleware engineering (Francesco Giacomini, INFN)
  5. EGEE – What do we deliver?
     • Infrastructure operation
       – Sites distributed across many countries
       – Large quantity of CPUs and storage
       – Continuous monitoring of Grid services & automated site configuration/management
       – Support multiple Virtual Organisations from diverse research disciplines
     • Middleware
       – Production quality middleware distributed under business friendly open source licence
       – Implements a service-oriented architecture that virtualises resources
       – Adheres to recommendations on web service interoperability and evolving towards emerging standards
     • User Support: managed process from first contact through to production usage
       – Training
       – Expertise in Grid-enabling applications
       – Online helpdesk
       – Networking events (User Forum, Conferences etc.)
  6. EGEE – Infrastructure
     • Scale: >250 sites; 48 countries; >50,000 CPUs; >20 PetaBytes; >10,000 users; >150 VOs; >150,000 jobs/day
     • Application areas include: Archeology, Astronomy, Astrophysics, Civil Protection, Comp. Chemistry, Earth Sciences, Finance, Fusion, Geophysics, High Energy Physics, Life Sciences, Multimedia, Material Sciences, …
  7. Users and resources distribution (February 2008 figures)
  8. gLite Grid Middleware Services
     • Access: CLI, API
     • Security: Authorization, Auditing, Authentication
     • Information & Monitoring: Information & Monitoring, Application Monitoring
     • Data Management: Metadata Catalog, File & Replica Catalog, Storage Element, Data Movement
     • Workload Management: Accounting, Job Provenance, Package Manager, Site Proxy, Computing Element, Workload Management
     Overview paper
  9. Disciplines and user communities
     • Disciplines: Astrophysics and astroparticle physics; Biomedical and bioinformatics information; Computational chemistry; Earth sciences; Finance; Fusion; Geophysics; High Energy Physics; Infrastructure; Others (digital libraries, disaster recovery, computational sciences, etc.)
     • VOs shown: argo, libi, aegis, inaf, bio, trgrida, apesci, pamela, biomed, compchem, astron, embrace, gaussian, cesga, planck, enea, virgo, grid-it, magic, calice, edteam, auger, hone, euindia, ific, ops, ncf, ildg, pvier, trgridc, rdteam, esr, pheno, rgstest, swetest, webcom, geant4, egeode, infngrid, proactive, cosmo, egrid, hermes, eela, eumed, diligent, alice, dteam, cyclops, fusion, atlas, geclipse, babar, balticgrid, gridcc, belle, dech, cdf, cms, see, seegrid, dzero, gridpp, twgrid, trgrida/b/c/d/e, ilc, lhcb, voce, na48, zeus, ghep, desy
     • ~9000 users listed in registered VOs
     • All user communities are required to contribute resources to the infrastructure
  10. Why users choose the EGEE Grid
      • Share more than information
      • Efficient use of resources at many institutes
      • Leverage other sources of funding
      • Data, computing power, applications
      • Join local communities
      Challenges:
      • share data between thousands of scientists with multiple interests
      • link major and minor computer centres
      • ensure all data accessible anywhere, anytime
      • grow rapidly, yet remain reliable for more than a decade
      • cope with different management policies of different centres
      • ensure data security
      • continuous, production service
  11. Why do particle physicists need the Grid? 1/2
      CERN Large Hadron Collider: the world’s most powerful particle accelerator. 4 Large Experiments (pictured: ATLAS)
  12. Why do particle physicists need the Grid? 2/2
      Example from LHC: starting from this event, we are looking for this “signature”
      • ~100,000,000 electronic channels
      • 0.0002 Higgs per second
      • 15 PBytes of data a year (10 Million GBytes = 14 Million CDs)
      One year’s data from LHC would fill a stack of CDs 20 km high (for comparison: Concorde, 15 km; Mt. Blanc, 4.8 km)
      Selectivity: 1 in 10^13. Like looking for 1 person in a thousand world populations; or for a needle in 20 million haystacks!
  13. A question of scale
  14. Recent Grid activity
      • In 2007, the Worldwide LHC Computing Grid ran ~44 M jobs on different infrastructures (EGEE, NDGF, OSG), with the large majority of them served by EGEE
      • The workload has continued to increase: 29 M jobs in the 1st quarter of 2008, now at >300k jobs/day
      • The distribution of work across Tier0/Tier1/Tier2 illustrates the importance of the Grid system: the Tier 2 contribution is around 50%, and > 85% is external to CERN
      • These workloads (reported across all WLCG centres) are at the level anticipated for 2008 data taking
      Chart labels: 230k/day, 300k/day
  15. In silico drug discovery
      • Diseases such as HIV/AIDS, SARS, Bird Flu etc. are a threat to public health due to worldwide exchanges and circulation of persons
      • Grids open new perspectives to in silico drug discovery
        – Reduced cost, adding an accelerating factor in the search for new drugs
      International collaboration is required for:
      • Early detection
      • Epidemiological watch
      • Prevention
      • Search for new drugs
      • Search for vaccines
      Figure: Avian influenza, bird casualties
  16. WISDOM
  17. Computational Chemistry
      • Researchers from more than 30 universities across Europe use EGEE for their work
      • Ported chemical software includes commercial packages (Gaussian03, Turbomole, Wien2k) and several freely available packages (GAMESS, DL_POLY, CPMD, DALTON, Columbus etc.)
      • Virtual Organisations:
        – CompChem
        – Gaussian
        – Turbomole
      • ~ 3 million jobs executed during year 2007
      • 90+ users actively using EGEE infrastructure
  18. Computational chemistry example
      • Cytochrome c Oxidase (CcO) consists of approximately 10,000 atoms, and the dynamics calculations are unfeasible on ordinary clusters (2.4 years needed for a simulation of 5.2 ns).
      • Grid computations
        – Three structures studied
        – Total time: 93 days
        – Nearly 6000 jobs
        – 3043 days of CPU time
  19. Grid added value
      • Grid can help satisfy computational chemistry demands:
        – both CPU power and intermediate data storage for future restarts
        – easy management for large numbers of jobs (e.g. GANGA)
        – automation of common tasks during job execution via workflows
        – possibility of direct cooperation between computational chemistry and other scientific disciplines: some ligand properties such as geometry, charges etc. can be stored on the Grid, and these data can be accessed by others to study, for example, the interaction between ligand and protein
        – possibility to execute many parallel jobs at the same time
        – for some commercial software packages, Grid is the only way to allow users access to these programs
  20. Expanding Geosciences-On-Demand (EGEODE) services to SMEs
      • Modern seismic data processing and geophysical simulations require greater amounts of computing power, data storage and sophisticated software
      • Difficult for oil & gas small & medium size enterprises (SMEs) to exploit innovative algorithms
      • SME Market: small O&G structures
        – 1035 O&G companies in EU
        – 93% are SMEs; 63% < 10 employees
      Figure labels: CGGVeritas market; High Tech.; SMEs market; Conventional; Research labs; Very small projects of large firms
  21. EGEE workload in 2007
      • Data: 25 PB stored, 11 PB transferred
      • CPU: 114 Million hours
      • Estimated cost if performed with Amazon’s EC2 and S3: $58,690,000 = €37M (17/05/08)
      Chart labels: CPU, Xfer, Storage
      Paper on Clouds and Grids, May 2008
  22. gLite Business Use Cases
      • Adopted gLite on own infrastructure
        – BEinGRID: Earth Sciences; Finance
        – EU-IndiaGrid: Financial Stock Analysis application using gLite
        – Health-e-Child: Biomedical information platform for Paediatrics
        – Imense Ltd: gLite-based Grid computing for large scale image indexing and retrieval
        – Philips Research: Using gLite for medical imaging, bio-informatics and simulation
      • Proof of Concept
        – GridVideo: gLite-based multimedia application
        – TOTAL, UK: Application to assess the usefulness of External Grids using GILDA testbed
      • Application and Development
        – CERN Openlab: CERN and industrial partners to develop data-intensive Grid solutions
        – WISDOM: Using EGEE infrastructure for drug discovery
  23. Business and EGEE-III
      • Technology Transfer and potential commercial exploitation
        – Further develop the Business Forum as a means of dialog with business actors
        – More attention to SMEs: start-ups (innovative applications and portals); collaborative projects (partner grids)
        – Develop a network of companies to prepare the future commercial exploitation of EGEE technology: EGEE Business Associates; ISVs; Software integrators and IT Services providers
      • Provide solutions to challenges for Business adoption
        – MoUs signed with related projects and interested partners to develop identified higher-level services and solutions (e.g. SLA; Windows porting, ...)
        – Further develop EGEE technology to simplify the interaction between grids and commercial cloud services
        – Explain the advantages and limitations of grids & cloud computing to businesses
  24. Collaborating e-Infrastructures
  25. Evolution: National, European e-Infrastructure, Global; Testbeds, Routine Usage, Utility Service
  26. European Grid Initiative
      • Need to prepare permanent, common Grid infrastructure
      • Ensure the long-term sustainability of the European e-Infrastructure independent of short project funding cycles
      • Coordinate the integration and interaction between National Grid Infrastructures (NGIs)
      • Operate the production Grid infrastructure on a European level for a wide range of user communities
      Must be no gap in the support of the production Grid
  27. Summary
      • EGEE operates the world’s largest multi-disciplinary Grid infrastructure for scientific research
        – In constant and significant production use
        – Constantly growing in scale of resources and breadth of user communities supported
      • A third phase of EGEE has now started
        – EGEE-III 2008-2010
      • Need to prepare the long-term
        – EGEE, collaborating projects, National Grid Initiatives and user communities are working to define a model for a sustainable Grid infrastructure that is independent of short project cycles
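
Several figures quoted in the slides (slides 3, 12, 14, 18 and 21) can be cross-checked with simple arithmetic. The sketch below is not part of the presentation: the function and key names are illustrative, every input is copied from the slides, and it assumes that the "total time" of 93 days on slide 18 means elapsed wall-clock time.

```python
# Rough consistency checks on figures quoted in the slides.
# Inputs are copied from the slides; function and key names are illustrative.

def slide_figure_checks():
    checks = {}

    # Slide 3: 9010 person-months over a 2-year (24-month) project.
    checks["ftes"] = 9010 / 24

    # Slide 12: "10 Million GBytes = 14 Million CDs", stack of CDs 20 km high.
    checks["gb_per_cd"] = 10e6 / 14e6
    checks["mm_per_cd"] = 20 * 1e6 / 14e6  # 20 km expressed in millimetres

    # Slide 14: ~44 M jobs in 2007; 29 M jobs in the first quarter of 2008.
    checks["jobs_per_day_2007"] = 44e6 / 365
    checks["jobs_per_day_q1_2008"] = 29e6 / (31 + 29 + 31)  # 2008 is a leap year

    # Slide 18: 3043 days of CPU time, nearly 6000 jobs.
    # Assumption: the 93 days "total time" is elapsed wall-clock time.
    checks["avg_busy_cpus"] = 3043 / 93
    checks["cpu_hours_per_job"] = 3043 * 24 / 6000

    # Slide 21: $58,690,000 = EUR 37M gives the implied exchange rate.
    checks["usd_per_eur"] = 58_690_000 / 37_000_000
    return checks


if __name__ == "__main__":
    for name, value in slide_figure_checks().items():
        print(f"{name}: {value:,.2f}")
```

With these inputs, the person-month total works out to about 375 FTEs, matching slide 3, and the first-quarter 2008 job count averages a little under 320k jobs/day, consistent with the ">300k jobs/day" on slide 14.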