The business of research is becoming ever more data intensive as digital technology spreads into every discipline. At Jisc we provide shared services such as the Janet network, eduroam wireless roaming and shared data centres – reaching around 18 million people in the research, education and skills sectors. In this session at the HPC and Big Data 2017 conference we heard from the Francis Crick Institute and the Wellcome Trust Sanger Institute about their vision for using digital technology to accelerate science and innovation. We heard how they are using Jisc services to reduce the “time to science”, enable groundbreaking research collaborations, and achieve significant operating efficiencies.
Accelerating Science and Innovation - It's Good to Share (HPC & Big Data 2017)
1. Accelerating Science and Innovation – It’s Good to Share
Martin Hamilton, Jisc
Alison Davis, Francis Crick Institute
Tim Cutts, Wellcome Trust Sanger Institute
HPC & Big Data 2017: Accelerating Science and Innovation - It's Good to Share, 01/02/2017
2. Accelerating Science & Innovation – It’s Good to Share
1. About Jisc
2. R&D on new services for researchers
› Research Data Shared Service
› Research Data Discovery Service
› What’s next?
3. Personal perspectives:
› Alison Davis, CIO, Francis Crick Institute
› Tim Cutts, Head of Scientific Computing,
Wellcome Trust Sanger Institute
4. Panel discussion and Q&A
3. 1. About Jisc
4. About Jisc
Jisc is the UK higher education, further education and
skills sectors’ not-for-profit organisation for digital
services and solutions. This is what we do:
› Operate shared digital infrastructure and services
for universities and colleges
› Negotiate sector-wide deals, e.g. with IT vendors
and commercial publishers
› Provide trusted advice and practical assistance
5. About Jisc
In the UK there are…
› 470 colleges providing further education
› 160 higher education institutions
› 2.3m students in HE
› 4.9m learners in FE
› 23% postgraduate
› 77% undergraduate
› Funding for FE and skills: £7.7bn
› Income of HEIs: £30.7bn
› 1,085 providers of further education and skills
6. About Jisc
Janet network
7. About Jisc
Janet network
8. About Jisc
[Network map: Janet external connectivity. Total external connectivity ≈ 1Tbit/s]
Link speeds shown: 1Gbit/s, 10Gbit/s, 100Gbit/s
Sites: Leeds; Manchester; Glasgow & Edinburgh; Telecity Harbour Exch.; Telehouse North & West
Research networks and exchanges: GÉANT, GÉANT+, LINX, LINX multicast, IXManchester, IXLeeds, HEAnet
Global Transit: Tata, Level3
Other connected networks: Netflix, Voicenet, Akamai, Virgin Radio, Bogons, Logicalis UK, Pipex / GXN, BBC, BBC (HD 4K pilots), BBC (Pacific Quay), Datahop, InTechnology, INUK, Simplecall, Gamma, Google, Redstone, Updata, aql, Limelight, BTnet, Init7, Amazon, Microsoft EU (via TN), Microsoft EU (via TW), Telekom Malaysia, Globelynx, VM for LGfL, RM for Schools, NHS N3, Exa Networks, Synetrix, One Connect, SWAN (Glas), SWAN (Edin)
9. About Jisc
Support for researchers, including:
› Technology platform, building on the Janet network:
– Shared data centres with Virtus & aql
– Cloud deals and agreements, e.g. AWS & Azure
– Archiving framework with Arkivum
– Access and identity for higher assurance use cases (Assent, Safe Share)
› Open access and open data:
– SHERPA services – tracking funder/journal Open Access policies
– Monitor – track Open Access costs and compliance
– CORE – Open Access publications search engine
› Agreements with publishers
– e.g. Elsevier, Taylor & Francis, Wiley, Springer
– Progress with subscription offsets for Article Processing Charges
10. About Jisc
Support for researchers, including:
› Technology platform, building on the Janet network:
– Shared data centres with Virtus & aql
– Cloud deals and agreements, e.g. AWS & Azure
– Archiving framework with Arkivum
› Open access and open data:
– SHERPA services – tracking Open Access policies
– Monitor – track Open Access costs and compliance
– CORE – Open Access publications search engine
› Agreements with publishers
– e.g. Elsevier, Taylor & Francis, Wiley, Springer
– Progress with subscription offsets for Article Processing Charges
11. About Jisc
Shared Data centre (North):
› Run in partnership with aql
› Tier 3 data centre, designed with HPC in mind
› Able to offer air and water cooling
› Will be connected to the core of Janet at
2x100G initially
› Racks available in 4/10/20/30kW
configurations
› Anchor tenants are the Universities of Liverpool,
Leeds and Sheffield, and Sheffield Hallam
University
› Expected to be available for service April 2017
12. About Jisc
Shared Data centre (South):
› Tier 3 data centre, designed with HPC in
mind. Able to offer air and water cooling
› Connected to the core of Janet at 2x400G
› Racks available in 4/10/20/30kW configurations
› Anchor tenants are UCL, LSE, QMUL, King’s
College, Sanger Institute, Francis Crick Institute
› Other tenants include Imperial College, Brunel
University, University of Bristol, University of Surrey,
University of the Arts, HEFCE, University of
Sussex, Institute of Cancer Research
› Currently seeing a 60:40 split in favour of HPC
› The SDC (South) framework is now at ~220 racks
and ~2.3MW of committed power from tenants
– all in just over 2 years
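The rack and power figures above imply an average committed power per rack, which can be compared with the 4/10/20/30kW rack configurations on offer. The short calculation below is only a rough illustration: it uses the approximate figures from the slide and assumes committed power is spread evenly across racks, which the slide does not claim.

```python
# Approximate figures quoted on the slide (both are marked "~" there).
racks = 220
committed_power_mw = 2.3

# Average committed power per rack in kW.
# Assumption: power is spread evenly across racks, purely for illustration.
avg_kw_per_rack = committed_power_mw * 1000 / racks

print(f"Average committed power per rack: {avg_kw_per_rack:.1f} kW")
```

On these numbers the average comes out at roughly 10.5 kW per rack, close to the 10kW configuration.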
13. 2. R&D on new services for researchers
14. R&D on new services for researchers
Research Data
Shared Service
› Procurement concluded and
suppliers selected
› Now building the service to the
community’s requirements
› 13 pilot institutions
› Research Data Network
› Find out more:
researchdata.jiscinvolve.org
18. R&D on new services for researchers
Milestones 2015-18
Phases: Apr 2015 - Dec 2015; Jan 2016 - Jul 2016; Aug 2016 - Jun 2017; Jul 2017 - Sep 2017; Oct 2017 - Apr 2018

- Requirements
- HEI pilots selected
- Procurement commences
- Support consultancy work begins
- Supplier framework selected
- Alpha development
- Alpha service tested and reviewed
- Beta development
- Feedback on beta service
- Business case decision
- If go then begin transition to production service

- Institutional survey
- HEI and supplier workshops
- Pilot HEI selection process
- Detailed HEI requirements and technical architecture
- Contracting commences
- Development phase
- Contact additional early adopter HEIs and promote beta service
- Business planning and begin business case
- Market research and consultation
- Promote service to institutions
- Start on next phases (service enhancement/modular)
19. R&D on new services for researchers
Research Data Discovery
Service
› Alpha - feedback sought!
› Uses CKAN to aggregate
research data from institutions
› Test system has 14K datasets
from 22 organisations so far
› Find out more:
rdds.jiscinvolve.org
› Try it:
staging.ckan.data.alpha.jisc.ac.uk
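Because the discovery service is built on CKAN, its aggregated catalogue should in principle be searchable through CKAN's standard Action API (`package_search`). The sketch below is a minimal illustration under that assumption: the host name comes from the slide, but the `https` scheme, the helper names (`search_url`, `dataset_titles`, `search`) and the availability of the alpha endpoint are assumptions, not details confirmed by the presentation.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Host taken from the slide; the https scheme is an assumption.
BASE_URL = "https://staging.ckan.data.alpha.jisc.ac.uk"


def search_url(query, rows=5, base_url=BASE_URL):
    """Build a CKAN Action API package_search URL (hypothetical helper)."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url}/api/3/action/package_search?{params}"


def dataset_titles(response):
    """Pull dataset titles out of a decoded package_search response."""
    if not response.get("success"):
        return []
    return [pkg.get("title", "") for pkg in response["result"]["results"]]


def search(query, rows=5):
    """Query the live service. Needs network access, and the alpha
    test system may no longer be reachable."""
    with urlopen(search_url(query, rows), timeout=10) as resp:
        return dataset_titles(json.load(resp))
```

Calling `search("genomics")` would return up to five matching dataset titles if the test system is reachable; `dataset_titles` can also be applied to a saved response for offline experimentation.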
21. Personal perspectives
› Alison Davis
› Chief Information Officer
› Francis Crick Institute
22. Big Data & HPC Collaboration Challenges
Alison Davis, CIO
The Francis Crick Institute
24. The Crick – key facts
• Funded by 6 founding partners
• Construction = £650M
• The building is 170 m long & 50 m high.
• Total floor space of 93,000 m²
(17.5 football fields)
• Capacity for ca. 1,250 scientists &
250 operational staff
• Migration August – December 2016
27. Target state scientific computing platform
[Diagram]
• Data sources: instruments, ext data sets, other
• Storage: high availability tier, middle working tier, near line archive, long term archive, backup
• Compute: batch compute (GPU/CPU), interactive compute (GPU/CPU), cloud services
• Applications
• Operational support and processes
30. Personal perspectives
› Tim Cutts
› Head of Scientific Computing
› Wellcome Trust
Sanger Institute
34. 2013 Vision: A Meeting of Minds
Scientific Collaboration
• Francis Crick Institute
• WTSI DR and Capacity
• MRC Collaborative Bioinformatics
• JISC
• UCL
• KCL
• Many others
36. WTSI Use of JSDC Slough
• WTSI share of eMedLab
• Business continuity and disaster recovery
• iRODS replica including all our primary sequence data (6.5 PB)
• Transferred all services from our previous DR site
• Replicas of critical RDBMS
• Replicas of enterprise NAS system (~2 PB)
• BCP for critical web services
• 10 Gbit dark fibre
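To put the replica size and link speed above in perspective, the sketch below estimates how long 6.5 PB would take to move over a single 10 Gbit/s link. It is a back-of-the-envelope illustration only: the slide does not say how the replica was actually seeded, and the decimal petabyte definition and the `efficiency` parameter are assumptions introduced here.

```python
def transfer_days(data_pb, link_gbit_s, efficiency=1.0):
    """Estimate days needed to move data_pb petabytes over one link.

    Assumes decimal units (1 PB = 1e15 bytes, 1 Gbit = 1e9 bits).
    efficiency is the fraction of line rate actually achieved
    (a hypothetical parameter for illustration).
    """
    bits = data_pb * 1e15 * 8
    seconds = bits / (link_gbit_s * 1e9 * efficiency)
    return seconds / 86400  # seconds per day


full_rate = transfer_days(6.5, 10)
at_80_percent = transfer_days(6.5, 10, efficiency=0.8)

print(f"At full line rate: about {full_rate:.0f} days")
print(f"At 80% of line rate: about {at_80_percent:.0f} days")
```

Even at full line rate the initial copy would take around two months, which gives a sense of why network bandwidth shows up as a cost when data is aggregated in one place.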
37. Today’s Science Drives IT Strategy
[Diagram labels: Large Scale Scientific Computing; Collaboration; Reproducible Science; Reliability; Rapid Delivery; Performance; Scalability; Cost Effectiveness; Data Security and Governance]
• Scale
• Data security
• Reliability
38. Limitations
• Ever-increasing data acquisition rates
• Aggregating data is not sustainable
• Scale (== cost)
• Governance
• Duplication (== cost)
• Network bandwidth (== cost)
39. 2017 Vision: Evolution and Federation
Open Standard APIs
Federated AAAI
MRC Bioinformatics Centres
Public clouds
WTSI Flexible Compute
EBI Embassy Cloud
40. WTSI Flexible Compute
• 5,996 cores
• 50 TB RAM
• 3 PB storage
• 100 Gbit software-defined network
• Red Hat OpenStack Platform with CloudForms and Ceph
• Automated deployments
• Continuous integration tests
• Reproducible
• Images deployable to any cloud environment
• Enables Service Desk to make safe changes
41. Ongoing challenges
• Open standard APIs
• Sufficient resources to develop and operate services
• Major change in application development strategies
42. Conclusion
• Data science needs federated analysis
• Keep the data at its source
• Move computational work to the data
• Standards and infrastructure software need investment
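One simple way to picture "keep the data at its source, move computational work to the data" is a federated mean: each site computes a small summary locally and only the summaries travel to a coordinator. The sketch below is a toy illustration of that idea, not a description of any system named in the talk; the `PartialStats` type, the function names and the sample values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PartialStats:
    """Summary that leaves a site: a count and a sum, never raw records."""
    count: int
    total: float


def local_summary(values):
    """Runs at the site holding the data."""
    return PartialStats(count=len(values), total=float(sum(values)))


def combine(summaries):
    """Runs at the coordinator; it only ever sees the summaries."""
    count = sum(s.count for s in summaries)
    total = sum(s.total for s in summaries)
    if count == 0:
        raise ValueError("no data at any site")
    return total / count


# Hypothetical measurements held at two separate sites.
site_a = [2.0, 4.0, 6.0]
site_b = [10.0]

federated_mean = combine([local_summary(site_a), local_summary(site_b)])

# Reference value: what pooling all the raw data in one place would give.
pooled_mean = sum(site_a + site_b) / len(site_a + site_b)

print(federated_mean, pooled_mean)
```

Both approaches give the same mean, but the federated version moves two numbers per site rather than the underlying data. Real analyses are rarely this easy to decompose, which is part of why the slide calls for investment in standards and infrastructure software.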
43. 4. Panel discussion and Q&A
44. Contact details
Martin Hamilton
Futurist, Jisc
@martin_hamilton
martin.hamilton@jisc.ac.uk
Except where otherwise noted, this
work is licensed under CC-BY