Neil Chue Hong Director, OMII-UK [email_address] UK e-Infrastructure:  increasing access,  widening participation ICHEC 2007, 13-14 December 2007, Dublin
but first… I’ll give an overview of this… and my thoughts on this…
OMII-UK: Software Solutions for e-Research OMII-UK provides software and support to enable a sustained future for the UK e-Science community and its international collaborators. Core support and development: £7.8 million Commissioned Software Programme: £1.4 million ENGAGE: improving access to e-Infrastructure: £0.9 million Phase II: 2006 - 2009
OMII-UK: Adding benefit to e-Science. More than just the middleware: go above the components to provide added value. Skilled team to help the community: putting the right things together, integrating components; providing consultancy and support to improve take-up; developing, commissioning and improving software.
What is infrastructure?
What is UK e-Infrastructure? A shared resource that enables science, research, engineering, medicine, industry, … It will improve UK / European / … productivity. Lisbon Accord 2000. E-Science Vision SR2000 (John Taylor). Commitment by UK government, Sections 2.23-2.25. Always there, cf. telephones, transport, power. OSI report: www.nesc.ac.uk/documents/OSI/index.html
e-Infrastructure the use of computing to support research but more than just the hardware
Slide: Neil Geddes
e-Science Centres in the UK Oxford Edinburgh Belfast Cambridge STFC Daresbury Manchester LeSC Newcastle Southampton Cardiff STFC Harwell Glasgow Leicester UCL Birmingham White Rose Grid Bristol Lancaster Reading Access Grid Support Centre Digital Curation Centre National Grid Service National Centre for e-Social Science National Centre for Text Mining National Institute for Environmental e-Science OMII-UK Sheffield York Leeds Coordinated by: Directors’ Forum & NeSC
e-Science is me-Science. Why share unless you gain a benefit? To share you need credit; credit implies trust, and trust needs provenance.
The Four Levels of e-Science Enlightenment 1)  Resources:  Providing access to a larger and wider diversity
Toolkits and Middleware The plumbing of “the Grid” Globus Toolkit, UNICORE, gLite, OMII but also .Net/CCS, Websphere … Providing standardised interfaces to resources
GridPP: the UK Grid for particle physics. UK’s largest e-science project: 19 UK universities + STFC. GridPP1 2001-2004 "From Web to Grid" [£16m+]; GridPP2+ 2004-2008 "From Prototype to Production" [£17m+]; GridPP3 2008-2011 "From Production to Exploitation" [£30m]
GridPP: the UK Grid for particle physics. Grid to analyse data from the Large Hadron Collider (LHC) at CERN. Operations: Tier-1 centre at Rutherford Appleton Laboratory, 16 other sites. Middleware: uses gLite. Applications for particle physics experiments. > 5,000 CPUs and > 1/2 petabyte of disk storage. Part of the EGEE Grid: the UK/Ireland region contributed 30 million kSI2k-hours in 2006, 25% of the total; UK CPU used by biomedics, fusion, industry… Worldwide LHC Computing Grid by 2008 (full year’s data taking): CPU ~100 MSI2k (100,000 CPUs), storage ~80 PB, involving >100 institutes worldwide.
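To put the GridPP figures in proportion, here is a rough back-of-the-envelope calculation. It assumes that "25% of the total" refers to the whole grid's 2006 total, and that the Worldwide LHC Computing Grid figures (~100 MSI2k over 100,000 CPUs) give a representative per-CPU rating. Neither assumption is stated on the slide, and the variable names are illustrative only.

```python
# Figures quoted on the slide.
uk_ireland_ksi2k_hours = 30e6   # UK/Ireland contribution in 2006
uk_ireland_share = 0.25         # "25% of the total" (assumed: of the whole grid)
wlcg_cpu_ksi2k = 100_000        # ~100 MSI2k expressed in kSI2k
wlcg_cpu_count = 100_000        # "100,000 CPUs"

# Implied 2006 total, if 30 million kSI2k-hours is a quarter of it.
implied_total_ksi2k_hours = uk_ireland_ksi2k_hours / uk_ireland_share

# Average rating of one CPU implied by the Worldwide LHC Computing Grid figures.
ksi2k_per_cpu = wlcg_cpu_ksi2k / wlcg_cpu_count

# Average sustained capacity behind the UK/Ireland contribution,
# spreading the kSI2k-hours evenly over the 8,760 hours of 2006.
hours_in_2006 = 365 * 24
avg_sustained_ksi2k = uk_ireland_ksi2k_hours / hours_in_2006
avg_cpu_equivalents = avg_sustained_ksi2k / ksi2k_per_cpu

print(f"Implied total: {implied_total_ksi2k_hours / 1e6:.0f} million kSI2k-hours")
print(f"Per-CPU rating: {ksi2k_per_cpu:.1f} kSI2k")
print(f"UK/Ireland average: about {avg_cpu_equivalents:.0f} CPU-equivalents busy all year")
```

Under these assumptions the UK/Ireland contribution corresponds to a few thousand CPUs running continuously, which is in line with the "> 5,000 CPUs" quoted for GridPP.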
Don’t be a banana, be a potato!
What do we need to share resources? Security Data Integration Registries Metadata is the key
What do we need to share resources? Networking Sharing Annotation Reuse Search Getting people involved in a community
Interoperability through standards? Each infrastructure runs different middleware; most of it works. Standards needed for: security, data transport, job submission. Standardisation is more important than standards: documentation, APIs, tools. “The great thing about standards is there are so many of them to choose from!”
Uniform access to computing resources. Client only needs to know about applications. “Super-users” allow standard configurations to be set up. Software used to provide several abstraction layers. Campus Grid Toolkit: easy to install grid for job submission. GridSAM/AHE. Courtesy: Stefan Zasada
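The layering described on this slide can be sketched roughly as follows. This is an illustration only: every class, method and path name below is hypothetical, and it does not reproduce the GridSAM, AHE or Campus Grid Toolkit interfaces. It shows one way a "super-user" might register a standard configuration once, so that the client only has to name an application.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass


class ComputeBackend(ABC):
    """Hypothetical bottom layer: one adapter per middleware stack."""

    @abstractmethod
    def run(self, executable: str, args: list[str]) -> str:
        """Submit a job and return an identifier (here, a description)."""


class CampusGridBackend(ComputeBackend):
    def run(self, executable: str, args: list[str]) -> str:
        return f"campus-grid:{executable} {' '.join(args)}"


class NationalGridBackend(ComputeBackend):
    def run(self, executable: str, args: list[str]) -> str:
        return f"national-grid:{executable} {' '.join(args)}"


@dataclass
class AppConfig:
    """Standard configuration prepared once by a 'super-user'."""
    executable: str
    default_args: list[str]
    backend: ComputeBackend


class ApplicationService:
    """Top layer: clients refer to applications by name only."""

    def __init__(self) -> None:
        self._apps: dict[str, AppConfig] = {}

    def register(self, name: str, config: AppConfig) -> None:
        # Done by the super-user, who knows the resource details.
        self._apps[name] = config

    def launch(self, name: str, extra_args: list[str] | None = None) -> str:
        # Done by the end user, who knows only the application name.
        cfg = self._apps[name]
        return cfg.backend.run(cfg.executable, cfg.default_args + (extra_args or []))


service = ApplicationService()
service.register(
    "md-sim",  # hypothetical application name
    AppConfig("/opt/md/bin/simulate", ["--threads", "8"], CampusGridBackend()),
)
job = service.launch("md-sim", ["input.dat"])
print(job)
```

Moving "md-sim" to a different resource would mean re-registering it with `NationalGridBackend()`; the client call to `launch` would stay the same.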
Uniform access to data resources OGSA-DAI: data integration for service providers Image courtesy SEEGEO/MoSeS
The Four Levels of e-Science Enlightenment 1)  Resources:  Providing access to a larger and wider diversity 2)  Automation:  Repeatability and management of experiments
Taking control of the research  Taverna: effortless workflows  for scientists
Statistical variability. The simulation paradigm now: a 22 nm MOSFET (in production 2008); a 4.2 nm MOSFET (in production 2023). Slide: Asen Asenov
Delivering new results. Simple concept: Integrated Hierarchical Statistical Design. Complex data and workflows: Data and Compute Intensive, Security Sensitive. Slide: Richard Sinnott
The Four Levels of e-Science Enlightenment 1)  Resources:  Providing access to a larger and wider diversity 2)  Automation:  Repeatability and management of experiments 3)  Collaboration:  Intra + cross disciplinary networks
Building better and bigger communities Virtual Research Environments bridge gap between infrastructure and users integrate functionality and facilities Harness interest in communities - make it easy to contribute and easy to benefit infrastructure annotation tools graphical environment Silchester Roman Town Project
Slide: David De Roure
Friends in the Community: OMII-UK PALs Open Source GIS Standards Data Mining Data Integration BioMoby Virtual Labs Alexander Woehrer Isao Kojima Chris Higgins Stephen McGough Mark Wilkinson Marco Roos Matthew Pocock
The Four Levels of e-Science Enlightenment 1)  Resources:  Providing access to a larger and wider diversity 2)  Automation:  Repeatability and management of experiments 3)  Collaboration:  Intra + cross disciplinary networks 4)  Participation:  Increasing access to a wider set of users; increasing knowledge in a domain
The Rise of Web 2.0 New sites allow non-technical users to share information and interact in programmable environments Social Networking: MySpace, Bebo, Facebook GIS: Google Maps, Google Earth Preference Matching: Amazon Meta-clustering: digg, del.icio.us Information Publishing: Flickr
An army of curators, a world of information
Galaxy Zoo
climateprediction.net: Users Worldwide. >300,000 users total (90% MS Windows); >60,000 active. ~17 million model-years simulated (as of September '06). ~180,000 completed simulations. The world's largest climate modelling supercomputer! (NB: a black dot is one or more computers running climateprediction.net.) Slide: Robert Gurney
From Web 2.0 back to HEC again. Managing data is the challenge: computation is a valuable tool. Capturing and improving scholarly process is difficult. Modelling the spread of insect-borne disease.
Different Aspects for Different Users Applied  Technology Specialists  e-Infrastructure e-Researchers  (domain & generic)  Providers
Scientific flexibility not mechanistic complexity. High expectations of Grids and e-Science: not all of them met. Most users just want the familiar, but bigger, better, faster, more. End-users are interested in the fine-grained detail, but for quality choices, not mechanical choices: performance, reliability, politics, brand, … Developers care about the detail and need to manage it.
ENGAGE: developing new users of e-Infrastructure. JISC funded, OMII-UK and NGS. Work with e-IUS/e-Uptake, follow up on SUPER, target individual research groups: capture research scenarios; collaborate on e-Infrastructure designs; implementation and deployment. Aim to create specific examples of research benefit from e-Infrastructure. Get “non e-Science” groups to participate. [Diagram: Use and Deployment; Development and Integration; Interventions; Training; Support; Design; Document and Disseminate; Study Practice, Barriers, Enablers and Requirements.] ENGAGE: www.engage.ac.uk
What do we mean by integration? Integration is more than just joining software: sharing resources dynamically at many levels, person to person, published workflows. Metadata is still the key, but it is not a purely technical problem.
Increasing access, widening participation People Infrastructure Tools Standards Research Output
OMII-UK Team
UK e-Infrastructure:  increasing access,  widening participation Neil Chue Hong Director, OMII-UK [email_address]
OMII-UK: For all kinds of users Taverna: effortless workflows  for scientists OGSA-DAI: data integration for service providers PAG: AG videoconferencing for anyone Campus Grid Toolkit: easy to install grid for job submission
Evolution National Global European  e-Infrastructure Slide: Neil Geddes Testbeds Utility Service Routine Usage

UK e-Infrastructure: Widening Access, Increasing Participation
