
The proposed application of Project Moonshot to Diamond User Access - Dr Bill Pulford, Diamond Light Source

The work was initiated as part of the PANData project, whose main objective is to provide a high-quality data acquisition and analysis infrastructure for most large scientific research facilities in Europe. Its purpose is to implement a system that lets scientific users access resources across the physically distributed repositories at these facilities. A typical use case is a user who has performed experiments at several facilities and needs to run the same analysis on data sets distributed across them. This process uses remote computing resources and software packages, which implies a system whereby a user logged in at a local site can be authenticated and authorised to use remote facilities. In addition to web-site access with these single sign-on credentials, it is of key importance to enable direct login; the technology of choice is that of Project Moonshot.
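
The separation described above, where a user is authenticated once and each remote facility then decides what that user may do, can be illustrated with a minimal Python sketch. It is not the Moonshot or Umbrella API: the class names `CentralAuthenticator`, `Facility` and `Identity`, the shared-secret check, and the resource names are all hypothetical and chosen only for illustration.

```python
import hmac
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass(frozen=True)
class Identity:
    """Result of a successful login: only an identifier, no facility data."""
    user_id: str


class CentralAuthenticator:
    """Toy central login service (hypothetical; a real one would issue tokens)."""

    def __init__(self, secrets: Dict[str, str]):
        self._secrets = secrets  # username -> shared secret, for illustration only

    def authenticate(self, username: str, secret: str) -> Optional[Identity]:
        expected = self._secrets.get(username)
        if expected is None:
            return None
        # Constant-time comparison avoids leaking information through timing.
        if not hmac.compare_digest(expected.encode(), secret.encode()):
            return None
        return Identity(user_id=username)


@dataclass
class Facility:
    """Each facility keeps its own authorisation table."""
    name: str
    grants: Dict[str, Set[str]] = field(default_factory=dict)

    def authorize(self, identity: Identity, resource: str) -> bool:
        return resource in self.grants.get(identity.user_id, set())


central = CentralAuthenticator({"alice": "pw"})
alice = central.authenticate("alice", "pw")
facility_a = Facility("A", {"alice": {"cluster"}})
facility_b = Facility("B")
print(facility_a.authorize(alice, "cluster"), facility_b.authorize(alice, "cluster"))
```

The same identity is accepted at facility A and refused at facility B, because the authorisation decision stays local to each site.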

Published in: Technology
  • The three parts of the presentation: Diamond overview, PANData (particularly Umbrella) and the role of Moonshot.
  • The largest scientific facility in the UK for 40 years (~£263M + £120M + £66M). Part of the UK science infrastructure, funded by the UK Government (86%) and the Wellcome Trust (14%). A third-generation medium-energy synchrotron. Can be compared to a gigantic X-ray machine or a series of ‘super microscopes’. Free at point of use to UK academic users. Increasing direct contributions to UK industry.
  • We now
  • The strong case for pan-European single sign-on: users who perform their research at multiple research sites.
  • All of the above processes are authenticated by a single set of user credentials. It is envisaged to extend this to other collaborating facilities.
  • PaNdata: Photon and Neutron Data Infrastructure
    Aims
    PANdata brings together some of the major multidisciplinary Research Infrastructures in Europe to construct and operate a sustainable data infrastructure for the European Neutron and Photon laboratories. Such a unique infrastructure will enhance all research done in this community by making data accessible, preserving the data, allowing experiments to be carried out jointly in several laboratories, and providing powerful tools for scientists to interact with the data remotely.
    The current situation at many of the facilities, and in particular at the photon sources, leaves data management almost entirely to the individual users, who essentially have to carry away data on portable media. These media are notoriously unsuitable for guaranteeing the longevity and availability of precious and costly experimental data. Not only is this becoming unfeasible given the dramatic increase in size of some of the data sets, it is also counterproductive for the scientific workflow and in the end constitutes a dramatic loss, because the data is inaccessible to the scientific community. This is why the participating laboratories have decided to join forces, create synergy, and remedy a situation that has persisted for too long.
    Neutron and photon facilities are major scientific data producers, serving an expanding user community of 25,000 to 30,000 scientists across Europe. The experiments at these facilities are of increasing complexity; they are increasingly done by international research groups, and many of them will be done in more than one laboratory. The resulting data needs to be accessible over the Internet and remain online until the results are published, and in many cases much longer, to allow re-processing and the preservation of knowledge.
    PANdata will provide our user communities with data repositories and data management tools to:
    · deal with large data sets and large data rates from the experiments,
    · enable easy and standardised annotation of data,
    · allow transparent and secure remote access to data,
    · establish sustainable and compatible data catalogues,
    · allow long-term preservation of data, and
    · provide compatible open source data analysis software.
    This will have a major impact on our scientific user community because it will offer:
    · cross-facility and cross-discipline data analysis,
    · secure access to large data sets over the network instead of using portable media,
    · maintenance of the records of science through properly annotated data,
    · linking of publications to data,
    · efficient software development, and efficient scientific collaboration across Europe through compatible data formats and analysis software.
    At the heart of the vision is a series of federated data catalogues which allow scientists to perform cross-facility, cross-discipline interaction with experimental and derived data, with near real-time access to the data – a ‘Google Earth’ at the scale of atoms and molecules. Along with this, PANdata will strive for a unification of data management policies so that the common technology can be successfully adopted.
    Partners
    · ISIS Pulsed Neutron & Muon Source
    · ILL
    · PSI (Paul Scherrer Institut)
    · Elettra Synchrotron Light Source (Trieste, Italy)
    · Diamond Light Source
    · Synchrotron SOLEIL
    · Deutsches Elektronen-Synchrotron (DESY)
    · HZB (Helmholtz-Zentrum Berlin für Materialien und Energie)
    · ALBA synchrotron (Cerdanyola del Vallès, Barcelona)
  • Must include Linux, Windows and Macintosh. Should support Pluggable Authentication Modules.
  • Protein crystal to three-dimensional structure in 6 minutes. A protein molecule may have many millions of atoms.
  • Inside you see the beamlines, each specialised to an area of scientific research. Increasingly, work includes remote collaboration. Most users actually visit Diamond and log in to Linux workstations situated in the beamline cabins.
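
The notes above ask that interactive login support Pluggable Authentication Modules (PAM). As a rough illustration of why a stack of modules is useful here, the following Python sketch evaluates an ordered list of checks using a simplified reading of the PAM control flags `requisite`, `required`, `sufficient` and `optional`. It is not the PAM library and omits much of its behaviour; `run_stack` and the two example modules (a local account check and a stand-in for a federated check) are hypothetical.

```python
from typing import Callable, List, Tuple

Module = Callable[[str], bool]  # takes a user name, returns success or failure


def run_stack(stack: List[Tuple[str, Module]], user: str) -> bool:
    """Evaluate modules in order with simplified PAM-style control flags."""
    passed = False   # at least one requisite/required module succeeded
    failed = False   # a required module failed (the stack continues, then fails)
    for control, module in stack:
        ok = module(user)
        if control == "requisite":
            if not ok:
                return False          # fail immediately
            passed = True
        elif control == "required":
            if ok:
                passed = True
            else:
                failed = True         # remember the failure, keep going
        elif control == "sufficient":
            if ok and not failed:
                return True           # succeed immediately
        elif control != "optional":
            raise ValueError(f"unknown control flag: {control}")
    return passed and not failed


# Hypothetical modules: local staff accounts first, then a federated check.
local_accounts = {"staff1"}
stack = [
    ("sufficient", lambda user: user in local_accounts),
    ("required", lambda user: user.endswith("@example.org")),
]

for name in ("staff1", "visitor@example.org", "nobody"):
    print(name, run_stack(stack, name))
```

With this ordering a local account is accepted without consulting the federated check, while visiting users fall through to it, which is one way a new login mechanism can be added without modifying the applications that call the stack.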

    1. 1. FAM12 - Broadening Horizons: The proposed application of Project Moonshot to Diamond User Access. Birmingham, 6 November 2012. Bill Pulford, Head of Data Acquisition and Scientific Computing, Diamond Light Source.
    2. 2. Three Linked Contributions. Diamond Light Source: brief overview. PANData: brief description of aims. MOONSHOT: how the above come together with Janet and Moonshot.
    3. 3. Diamond Light Source in reality
    4. 4. What is Diamond Light Source? • The largest scientific investment in the UK for 45 years, ~(£263M + £120M + £66M) in three phases; • A source of extremely intense light, particularly X-rays, that can be thought of as a huge microscope; • Funded by our shareholders: the UK Government through STFC (86%) and the Wellcome Trust (14%); • Free at point of use to UK academic users; • Increasing direct contributions to UK industry.
    5. 5. What is Diamond Used For?
    6. 6. Health and disease: structure of the histamine H1 receptor; understanding rejection in hip implants; new drug target for Candida infection; sugar binding in gut flora; improving nutritional quality in wheat.
    7. 7. Engineering and manufacturing: understanding the corrosion process; pharmaceutical manufacture and processing; casting aluminium; tunable polymers; organic photovoltaics.
    8. 8. New materials: MOFs for hydrogen storage; Harry’s wheel – complex templates for new materials; bio-mimetics; multiferroics – electronic storage and memory.
    9. 9. How is light produced?
    10. 10. Science Beamlines
    11. 11. Overall Mission for Data Acquisition and Analysis. The mission of our software developers is that users in all of our supported scientific disciplines be able to acquire data using the most flexible and capable tools that we can provide, and then leave Diamond having at least evaluated the data quality but ideally also having been able to perform the appropriate analysis. Most automated processes require single sign-on and increasingly transparent access to remote resources. In the year between 1-Jan-2011 and 1-Jan-2012 we had 1726 experimental visits and 4976 external experimenters, of whom 1976 were unique. Current data volume and number of files (ICAT): ~228 TB / 95,000,000.
    12. 12. Common users between facilities
    13. 13. Unique set of user credentials
    14. 14. An example overall future project hierarchy
    15. 15. PaN-data Europe – building a sustainable data infrastructure for Neutron and Photon laboratories. PaN-data Standardisation: PaN-data Europe is undertaking 5 standardisation activities: 1. Development of a common data policy framework; 2. Agreement on protocols for shared user information exchange; 3. Definition of standards for common scientific data formats; 4. Strategy for the interoperation of data analysis software, enabling the most appropriate software to be used independently of where the data is collected; 5. Integration and cross-linking of research outputs, completing the lifecycle of research, linking all information underpinning publications, and supporting the long-term preservation of the research outputs; 6. ORCID and OpenAire+ (Publishers) and Biostruct also noteworthy.
    16. 16. The Umbrella Project: Authentication. The minimum possible user information is stored centrally to avoid data protection issues. Authentication is done by the user logging into the Umbrella central site (currently umbrella.psi.ch) to generate a Shibboleth token. Authorization is delegated to the facility site. Minimum information necessary.
    17. 17. Recommendations from Pandata Europe WP4: 1. An authentication mechanism based on Umbrella should be implemented as widely as possible across participating facilities. 2. A technical solution needs to be provided to allow interactive sessions on computing resources at the facilities’ sites. Initial coverage must include Linux and Windows (optionally Macintosh) systems in order for any complete adoption of the authentication system. Project Moonshot provides this functionality. 1. An important architectural advantage is offered by using a system such as the Jasig Central Authentication Service (CAS), where most of the internal authentication and authorization issues are covered by this one system, thereby obviating the need to modify dependent systems.
    18. 18. Potential Future Authentication Strategy Moonshot
    19. 19. User Session Authentication with Moonshot
    20. 20. Issues for consideration: • With current systems it can be difficult to define the duration for which users are authenticated (X509 certificates may help with this problem); • A mechanism of persistence or caching is necessary to avoid loss of service due to network unavailability; • PAM stacks, user X509 certificates; • Must support Windows, Macintosh and Linux; • Should be very simple to install and configure (consider rpm or msi); • The overall network infrastructure should provide the necessary support. Interesting: • Compatible with any Microsoft initiative if possible; • EduRoam; • Mobile devices.
    21. 21. High-speed automated data processing for MX. Mounted macromolecular crystal → automated acquisition (3 mins) → raw diffraction data → automated analysis (2 mins) → electron density map of protein/DNA (the 3-D structure). Biological macromolecule to structure in less than 5 minutes! In more detail: raw diffraction data were recorded in less than 3 minutes on Diamond MX beamline I02, as part of ongoing research into DNA/ligand structures, a joint Ph.D. place between Diamond and Reading University (James Hall). The map was calculated automatically, derived purely from experimental information, requiring a search over a number of parameters. Results were obtained less than two minutes after the data collection was completed, a step which would once have taken hours! Both the processing and the phase calculation made extensive use of parallel processing, using one to two hours of processing time in under two minutes.
    22. 22. From absorption image to 3D structure. Reconstruction takes ~30 minutes. 16GB. 1 per 6 hours to 2 per hour.
    23. 23. The Spider catches a Hair
    24. 24. Futures (diagram: industrial and academic department sites, Facility 1 and Facility 2, data repository, high-power cluster, cloud site, publisher, ORCHID/Openaire).
    25. 25. Thank you. Acknowledgements: Brian Abram (Diamond/Janet UK), Rhys Smith (Cardiff/Janet UK), Bjoern Abt (PSI), Heinz Weyer (PSI), Mirjam van Daalen (PSI), Roland Hedburg, Josh Howlett (Janet UK), John Chapman (Janet UK) and many others.
    26. 26. Diamond and how you do the science
    27. 27. Diamond – a look inside
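
Slide 20 asks for persistence or caching so that a network outage does not lock users out, and notes that the duration of an authentication is hard to define. One possible shape for this, sketched in Python under stated assumptions: the remote check is tried first, and only if it is unreachable is a previously successful login honoured, for a bounded time. `CachedAuthenticator`, the `remote_check` callable and the use of `ConnectionError` to signal an outage are all hypothetical; this is not how Moonshot itself implements caching.

```python
import hashlib
import time


class CachedAuthenticator:
    """Fall back to a recent successful login when the remote service is down."""

    def __init__(self, remote_check, ttl_seconds, clock=time.monotonic):
        self._remote_check = remote_check  # callable(user, credential) -> bool
        self._ttl = ttl_seconds            # how long a cached success stays valid
        self._clock = clock
        self._expiry = {}                  # (user, credential digest) -> expiry time

    @staticmethod
    def _key(user, credential):
        # Store a digest rather than the credential itself.
        digest = hashlib.sha256(credential.encode("utf-8")).hexdigest()
        return (user, digest)

    def is_authenticated(self, user, credential):
        key = self._key(user, credential)
        now = self._clock()
        try:
            ok = self._remote_check(user, credential)
        except ConnectionError:
            # Network unavailable: accept only an unexpired cached success.
            expiry = self._expiry.get(key)
            return expiry is not None and now < expiry
        if ok:
            self._expiry[key] = now + self._ttl
        else:
            self._expiry.pop(key, None)    # a remote refusal invalidates the cache
        return ok
```

The time-to-live makes the authentication duration explicit: a shorter value limits how long a revoked account could still log in during an outage, at the cost of less tolerance for long network failures.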
