Infrastructures Supporting Inter-disciplinary Research - Exemplars from the UK


Talk given by Richard Sinnott at Urban Research Infrastructure Network Workshops, Melbourne, Brisbane, Sydney, September 2010.

Published in: Technology

  1. Infrastructures Supporting Inter-disciplinary Research - Exemplars from the UK. Professor Richard O. Sinnott, eResearch Director, University of Melbourne. 9th September 2010
  2. Me: job description in a nutshell… make “e-” happen (Project Consultant on AURIN)
  3. Track Record (slide legend: Completed / On-Going)
       • National e-Science Centre (I, II, III)
       • Dynamic Virtual Organisations for e-Science Education
       • Biomedical Research Informatics Delivered by Grid Enabled Services
       • Grid Enabled Microarray Expression Profile Search
       • GridNet
       • Glasgow early adoption of Shibboleth
       • Joint Data Standards Survey
       • ESP-Grid
       • GridNet-2
       • HPC Compute cluster award
       • Sun industrial sponsorship
       • OGC Collision
       • OMII-Security Portlets
       • OMII-RAVE
       • Integrating VOMS and PERMIS for Superior Grid Authorization
       • NCeSS Technical Management
       • CESSDA PPP
       • Pharming of Therapeutic RNA
       • Grid Enabled Occupational Data Environment
       • Towards an e-Infrastructure for e-Science Digital Repositories
       • Grid enabled Biochemical Pathway Simulator
       • Virtual Organisations for Trials and Epidemiological Studies
       • A European e-Infrastructure for e-Science Repositories
       • Modelling, Inference and Analysis for Biological Systems up to the Cellular Level
       • Drug Discovery Portal
       • Advanced Grid Authorisation through Semantic Technologies
       • ShinTau (Supporting Multiple Shibboleth Attribute Authorities)
       • Grid-enabled Virtual Safe Settings
       • Scottish Bioinformatics Research Network (SBRN)
       • Generation Scotland Scottish Family Health Study
       • Meeting the Design Challenges of nanoCMOS Electronics (nanoCMOS)
       • EU FW7 EuroDSD
       • EU FW7 AvertIT
       • Breast Cancer Tissue Biobank
       • Data Management through e-Social Science (DAMES)
       • NeSC Research Platform (NRP)
       • NeSC Information Network (NIN)
       • ESF Network for Study of Adrenal Tumors
       • Scottish Health Informatics Platform for Research (SHIP)
       • National E-Infrastructure for Social Simulation (NeISS)
       • Enhancing Repositories for Language and Literature Researchers (ENROLLER)
       • Proxy Credential Auditing Infrastructure for the NGS
       • EU FW7 European Network for Study of Adrenal Tumors Cancer Research Platform
       • EU R4SME Diagnosis of Parkinson's Disease (DiPAR)
       • EU European Platform for Study of Wolfram, Alstrom, Bardet Biedl
  4. AURIN Investment Plan
       • Infrastructure
         • … to support ongoing data collection for urban research and the development of tools to interrogate those data
         • … problems of distributed data ownership, the diversity of formats
         • … considerations of privacy and the importance of maintaining granularity
         • … urban and built environment research community needs an infrastructure that provides better modelling and simulation tools that support integrated access to diverse datasets
         • … difficult to achieve in practice. Datasets created by organisations and researchers to meet their own specific needs
         • … propose to categorise principal methodologies, data types and types of sources, thus identifying key implementation issues
         • … strategic implementation streams used to define key data types and research approaches and to specify and develop infrastructure components necessary for realm of research
       • Many of these issues are addressed by e-Research and e-Infrastructures
  5. e-Research & e-Infrastructures
       • e=?
         • Electronic
         • Enabling
         • Empowering
       • Application of advanced ICT solutions to support research and researchers
         • Many software/middleware flavours
         • eResearch = High Performance Computing (HPC)?
           • Not always (I’d say mostly not!)
       • Successful Infrastructures should support
         • researchers and domain-specific research
           • often at the risk of being non-sexy!
         • seamless access to a heterogeneous variety of compute and data resources
           • Often domain/inter-domain specific, especially data!
         • single sign-on
           • Log in once and access resources without need for re-authentication and/or re-authorisation
  6. e-Health Vision
       • Scales of data: Nucleotide sequences, Nucleotide structures, Gene expressions, Protein Structures, Protein functions, Protein-protein interaction (pathways), Cell, Cell signalling, Tissues, Organs, Physiology, Organisms, Populations
       • + environmental, social, geographic …
       • SECURITY!!!
  7. e-Security
       • A A A A
         • A _ _ _ = authentication
           • = AAF?
       • Log-in once and roam:
         1. User points browser at Grid resource/portal (or non-Grid resource)
         2. Shibboleth redirects user to W.A.Y.F. service
         3. User selects their home institution
         4. Home site authenticates user
         5. User accesses resource
       • Diagram labels: User, Service provider (Web site/e-Journal), W.A.Y.F., Federation, Identity Provider (Home Institution), AuthN, LDAP
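The five log-in steps on this slide can be sketched as a toy simulation. This is not Shibboleth's API or wire protocol: the class names, the in-memory `directory` standing in for LDAP, the `active` set standing in for a browser session with the home site, and the sample users are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class IdentityProvider:
    """Home institution that authenticates its own users (step 4)."""
    name: str
    directory: dict  # hypothetical stand-in for LDAP: username -> password
    active: set = field(default_factory=set)  # users already logged in

    def authenticate(self, username, password=None):
        if username not in self.active:
            if password is None or self.directory.get(username) != password:
                return None
            self.active.add(username)
        # A real federation would return a signed assertion; a dict suffices here.
        return {"issuer": self.name, "subject": username}


@dataclass
class Federation:
    """Lists member institutions for the W.A.Y.F. ("Where Are You From") step."""
    providers: dict = field(default_factory=dict)

    def register(self, idp):
        self.providers[idp.name] = idp

    def wayf(self, institution):
        # Steps 2-3: the user is redirected here and selects a home institution.
        return self.providers.get(institution)


@dataclass
class ServiceProvider:
    """Portal or web site that trusts assertions from federation members."""
    name: str
    federation: Federation

    def access(self, username, institution, password=None):
        # Step 1: the user points a browser at this resource.
        idp = self.federation.wayf(institution)
        if idp is None:
            return False
        assertion = idp.authenticate(username, password)  # step 4
        # Step 5: the resource is served only if the home site vouches for the user.
        return assertion is not None


glasgow = IdentityProvider("Glasgow", {"alice": "correct-horse"})
federation = Federation()
federation.register(glasgow)
portal = ServiceProvider("grid-portal", federation)
journal = ServiceProvider("e-journal", federation)

first = portal.access("alice", "Glasgow", "correct-horse")  # logs in once
second = journal.access("alice", "Glasgow")                 # roams, no password
stranger = journal.access("bob", "Glasgow", "guess")        # unknown user
```

The point of the sketch is that neither service provider ever sees or stores a password; both rely on the home institution's answer, which is what makes "log in once and roam" possible.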
  8. _ A _ _
       • Authorisation
         • Defining and enforcing rules on who can do what with what and when etc
           • Sites will have different rules/regulations
         • Virtual Organisations (VO)
           • Collection of distributed resources shared by collection of users from one or more organizations typically to work on common research goal
             • Provides conceptual framework for rules/regulations for resources to be offered/shared between VO-institutions and members
             • Different domains place greater/lesser emphasis on expression and enforcement of rules and regulations (policies)
  9. AURIN Work Streams = e-VOs
       • Diagram: Org 1 … Org n, each with {Resources} and {Users}, joined in a VO under VO specific agreements
  10. _ A _ _
       • Role Based Access Control (attribute/identity/process…)
         • Basic idea is to define:
           • roles applicable to specific VO
             • roles often hierarchical
               • Role X ≥ Role Y ≥ Role Z
               • Manager can do everything an employee can do (and more); an employee can do everything a trainee can do (and more)
           • actions allowed/not allowed for VO members
           • resources comprising VO infrastructure (computers, data etc)
         • A policy then consists of sets of these rules
           • { Role x Action x Target }
             • Can user with VO role X invoke service Y on resource Z?
           • Policy itself can be represented in many ways,
             • e.g. XML document, SAML, XACML, …
           • Standards on where these decisions are made (PDP, policy decision point) and enforced (PEP, policy enforcement point)
         • Should all be transparent to end users!
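The { Role x Action x Target } idea with a hierarchical role ordering can be sketched in a few lines. This is a minimal illustration only, not PERMIS, XACML or any other real policy engine: the role names follow the manager/employee/trainee example on the slide, while the actions, targets and the `is_permitted` function are hypothetical.

```python
# Hypothetical role hierarchy: a higher rank inherits every lower rank's rights.
ROLE_RANK = {"trainee": 0, "employee": 1, "manager": 2}

# A policy as a set of rules: (minimum role, action, target).
POLICY = {
    ("trainee", "read", "census_tables"),
    ("employee", "run", "simulation_service"),
    ("manager", "delete", "census_tables"),
}


def is_permitted(role, action, target, policy=POLICY, rank=ROLE_RANK):
    """Decision function: may a user holding `role` perform `action` on `target`?"""
    if role not in rank:
        return False  # unknown roles are denied by default
    return any(
        rank[role] >= rank[min_role]
        for (min_role, rule_action, rule_target) in policy
        if rule_action == action and rule_target == target
    )


trainee_can_read = is_permitted("trainee", "read", "census_tables")
trainee_can_run = is_permitted("trainee", "run", "simulation_service")
manager_can_run = is_permitted("manager", "run", "simulation_service")
```

In this sketch `is_permitted` plays the decision role; the portal or service that calls it and then serves or refuses the request plays the enforcement role, which keeps the rules invisible to end users.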
  11. Privileges, Resources, Access Control and Trust
       1. User points browser at Grid resource/portal
       2. Shibboleth redirects user to W.A.Y.F. service
       3. User selects their home institution
       4. Home site authenticates user and pushes attributes to the service provider
       5. Pass authentication info and attributes to AuthZ function
       6. Make final AuthZ decision
       • Diagram labels: User, Service provider (Shib Frontend, Grid Portal), W.A.Y.F., Federation, Identity Provider (Home Institution), AuthN, AuthZ, LDAP (twice), uid followed by seven "?" placeholders
  12. Proof of the Pudding
       • Inter-disciplinary research infrastructures
         • Data Management through e-Social Science (DAMES)
           • 3 year ESRC funded project
           • Ends January 2011
             • Stirling (social science) and Glasgow (e-Support)
         • National e-Infrastructure for Social Simulation (NeISS)
           • 3 year JISC funded project
           • Ends April 2012
             • Centre for Spatial Analysis and Policy, University of Leeds
             • Manchester e-Research Centre, University of Manchester
             • STFC, Daresbury Laboratory
             • National e-Science Centre, University of Glasgow
             • Information Management Group, University of Manchester
             • OMII-UK, University of Southampton
             • Department of Applied Social Science, University of Stirling
             • Centre for Advanced Spatial Analysis, University College London
  13. DAMES
       • Data research environments for social scientists
         • Occupational research environment (GEODE)
         • Educational research environment (GEEDE)
         • Ethnicity/minority research environment (GEMDE)
         • E-Health research environments (GEHDE)
           • Research into depression, self harm and suicide across Scotland
             • Does the number of people in a household have any effect on suicide rates?
             • Is there a correlation between suicide and age, sex, marital status or history of drug use (including prescribed drugs / anti-depressants)?
             • What is the relation between access to parkland/green fields and depression?
             • What is the optimal way to treat different forms of depression, e.g. drug treatments, therapists, …?
             • Does intervention-based cognitive behaviour therapy work?
  14. DAMES GEHDE Data Space
       • Clinical data
         • Scottish Morbidity Records (SMR)
           • SMR01A: General acute inpatient and day case discharges
           • SMR02: Maternity data
           • SMR04A: Psychiatric/mental handicap admissions, residents, discharges
           • SMR06A: Scottish cancer registrations
           • SMR99A: Deaths
       • Social science/survey data
         • UK Census
           • 1971 … 2001
       • Geospatial data
         • UK Boundary data
           • Changes over time to local authorities
         • License protected mapping data sets and tools
  15. Demonstration (video)
  16. National e-Infrastructure for Social Simulation
       • Goals
         • Provisioning of variety of data sets/tools for social science research purposes
       • Key scenarios
         • Population demographics
           • changing population and impact on society
         • Health
           • smoking, obesity, …
         • Traffic
           • congestion charging, road pricing, promotion of non-vehicular alternatives to commuting
       • Sim City but with real data!
         • “What-If” scenarios for users/policy-makers
  17. NeISS Service Portfolio
       • Census 2001: aggregate level data from ONS (via MIMAS)
       • PRM: creation of synthetic micro-populations for a city or region
       • DSM: projects population of a city/region forward in time using representations of key demographic processes, e.g. births, deaths, migration, marriage etc
       • TSM: spatial interaction service which assigns traffic behaviour (mode/route choice) on basis of residential location/activity patterns in relation to fixed transport infrastructure (from MOSES/GENESIS)
       • Aggregator: links individuals in a simulated population to individual level data resources, e.g. BHPS
       • SurveyMapper: crowd sourcing solution from wider communities (from CASA/UCL)
       • FusionTool: service for auto-generation of missing data (from DAMES)
       • MapTube: allows users to visualise spatial distributions using GIS shapefiles or 3rd party sources such as GoogleMaps (from CASA/UCL)
  18. NeISS Tools Workflows
  19. Demonstration (video)
  20. NeISS Infrastructure – Exemplars
       • Data: Census tables (e.g. specific wards/output areas) used with PRM service to create synthetic population of individuals and households with basic demographic attributes
         • e.g. age, gender, marital status, ethnicity, quality of health, social group, housing type, tenure…
       • Figure: Linking BHPS data to a synthetic population to estimate number of smokers in Leeds
  21. NeISS Infrastructure – Exemplars
       • Each synthetically generated member of a population in a city, region or neighbourhood can be linked, e.g. to BHPS, by matching their demographic characteristics. From this process, additional characteristics can be assigned to the synthetic database using the profiles in the BHPS. The power of this lies in the fact that while the PRM generates in the order of 10 demographic and household attributes, the BHPS has over a thousand attribute fields. In this way, many spatial patterns can be newly inferred.
       • Figure: Distribution of elderly people around Newcastle
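The linking step described on this slide, matching synthetic individuals to survey respondents on shared demographic characteristics and then borrowing the survey's extra attributes, might look roughly like the sketch below. It is not the NeISS Aggregator's actual method: the `link_population` function, the field names, the round-robin donor choice and the tiny made-up records are all assumptions for illustration, and no real BHPS data is used.

```python
from collections import defaultdict


def link_population(synthetic, survey, keys):
    """Copy extra survey attributes onto synthetic people with matching demographics."""
    by_profile = defaultdict(list)
    for record in survey:
        by_profile[tuple(record[k] for k in keys)].append(record)

    linked = []
    for i, person in enumerate(synthetic):
        matches = by_profile.get(tuple(person[k] for k in keys))
        if not matches:
            linked.append(dict(person))  # no donor found: keep the person unchanged
            continue
        # Round-robin donor choice keeps the sketch deterministic;
        # a real tool would more plausibly sample donors at random.
        donor = matches[i % len(matches)]
        extra = {k: v for k, v in donor.items() if k not in person}
        linked.append({**person, **extra})
    return linked


# Made-up records, standing in for PRM output and survey respondents.
synthetic_people = [
    {"age_band": "16-34", "sex": "F"},
    {"age_band": "16-34", "sex": "F"},
    {"age_band": "65+", "sex": "M"},
    {"age_band": "35-64", "sex": "M"},
]
survey_records = [
    {"age_band": "16-34", "sex": "F", "smoker": True},
    {"age_band": "16-34", "sex": "F", "smoker": False},
    {"age_band": "65+", "sex": "M", "smoker": False},
]

linked_people = link_population(synthetic_people, survey_records, ["age_band", "sex"])
estimated_smokers = sum(1 for p in linked_people if p.get("smoker"))
```

Counting an attribute such as `smoker` over the linked population is the kind of estimate the "smokers in Leeds" exemplar refers to; people with no matching survey profile simply carry no value for it in this sketch.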
  22. NeISS Infrastructure – Exemplars
       • DSM uses a series of sub-models representing key demographic transitions, e.g. marriage, migration, pro-creation, death and so forth
       • Figure: Dynamic simulation showing proportional population change between 2001 and 2031 in Birmingham
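A year-by-year projection built from separate sub-models can be sketched as follows. This is a toy illustration of the general idea only, not the NeISS DSM: it covers just death, ageing and birth (marriage and migration are left out), and the function names, the rates, and the assumed child-bearing age range are invented for the example.

```python
import random


def step_year(population, rng, birth_rate, death_rate):
    """Advance a list of people (dicts with 'age' and 'sex') by one year."""
    survivors = []
    births = 0
    for person in population:
        # Death sub-model: probability depends on current age.
        if rng.random() < death_rate(person["age"]):
            continue
        # Birth sub-model: assumed to apply to women aged 15 to 44.
        if person["sex"] == "F" and 15 <= person["age"] <= 44:
            if rng.random() < birth_rate:
                births += 1
        # Ageing: everyone who survives gets one year older.
        survivors.append({**person, "age": person["age"] + 1})
    newborns = [{"age": 0, "sex": rng.choice("FM")} for _ in range(births)]
    return survivors + newborns


def project(population, years, birth_rate=0.06,
            death_rate=lambda age: 0.001 if age < 60 else 0.05, seed=0):
    """Apply the yearly step repeatedly; all rates here are illustrative only."""
    rng = random.Random(seed)
    for _ in range(years):
        population = step_year(population, rng, birth_rate, death_rate)
    return population


start = [{"age": a, "sex": s} for a in range(0, 90, 5) for s in "FM"]
after_30_years = project(start, years=30)  # e.g. a 2001 to 2031 horizon
change = len(after_30_years) / len(start)  # proportional population change
```

Each transition is a separate, replaceable piece of `step_year`, so a migration or marriage sub-model could be added in the same style without touching the others.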
  23. NeISS Infrastructure – Exemplars
       • Three scenarios: congestion charging, road pricing and promotion of non-vehicular alternatives to commuting. For example, would policies offering financial incentives to leave the car at home and travel to work by bicycle actually work? Float the idea to the general public via tools like SurveyMapper, then use the results to gauge public response, inform policy makers, and calibrate the simulation tools.
       • Figure: Traffic scenarios
  24. The Wider Context
       • Major (complementary) initiatives
         • $47m EIF Super Science National eResearch Collaboration Tools and Resources (NeCTAR)
           • $23m 2009/2010; $12m 2011/2012; $12m 2012/2013
           • Four key strands
             • National Servers Program
               • started
             • eResearch Tools Program
               • Sub-projects over 2-years from $500k-$1m
               • 50% total costs
             • Research Cloud Program
               • Consultation exercise starting
             • Virtual Laboratories Program
               • Consultation exercise starting
         • Interim Director
           • Prof. Geoff Taylor, UoM Physics
  25. The Wider Context
       • Super Science Research Data Storage Infrastructure
         • $50m, to be decided
       • Key data resources and data provider organisations
         • See spreadsheet
       • MANY other on-going efforts
         • AAF/ANDS/ASSDA/AuSCOPE/INSPIRE/SISS/…
  26. Questions …?