Drivers for Virtualization in Research Computing


In this presentation from the Dell booth at SC13, Dr Paul Calleja from the University of Cambridge describes how the university's HPC Service is using virtualization to meet user needs.

Published in: Technology, Education

Drivers for Virtualization in Research Computing

  1. Drivers for Virtualization in Research Computing. Dr Paul Calleja, Director, HPC Service, University of Cambridge. Dell SC 2013
  2. What do we do?
     • HPC service provision & grant-funded HPC research
     • Cambridge HPC Service
     • Dell HPC Solution Centre
     • Commercial HPC as a service
     • Promote uptake of HPC by industry
     • HPC Centre of excellence
  3. The University and surrounding technology hub
     • The University of Cambridge is a world-leading teaching & research institution, consistently ranked within the top 3 universities worldwide
     • Annual income of £1200M, of which 40% is research related: one of the largest R&D budgets within the UK HE sector
     • 17,000 students, 9,000 staff
     • Cambridge is a major technology centre
         • 1535 technology companies in surrounding science parks
         • £12B annual revenue
         • 53,000 staff
     • HPC is recognised as an important enabling technology for University research and the wider Cambridge technology community
     • We are tasked with providing HPC services to the University and surrounding technology companies
  4. Our business model
     • The HPCS is run as a charge-at-point-of-use cost centre
     • We receive no central funding from the University
     • We pay for all costs: staff, power, data centre operations, university services (HR, accountancy, legal), coffee...
     • The only subsidy we enjoy is the capital cost of machine room infrastructure
     • We charge our internal and external customers for services under contract to recover costs: internal use at cost, external use at a margin to subsidise internal access
     • We started this model 7 years ago with a one-off capital injection of £2M and 6 months of opex!
     • We are now fully self-sustaining, having increased our capital turnover by a factor of 2 and our operational turnover by a factor of 4
  5. Research Computing Services
     • We provide access to large-scale central shared HPC & data storage systems
     • We provide a full range of consultancy services on the design, procurement, implementation and support of 3rd party (customer) owned research computing infrastructure
     • Hosting of 3rd party research computing infrastructure as a managed service
     • Traditional strengths in large-scale HPC & visualisation
     • Emerging push into data analytics platforms and methods, remote visualisation and virtualised platforms
  6. Cambridge HPC facts
     • 700 registered users from 30 departments
     • 36 external industrial engagements over the last 18 months
     • 856 Dell servers, 450 TF sustained DP performance
     • 600-node (9,600-core) 2.6 GHz Sandy Bridge cluster with full non-blocking Mellanox FDR IB (185 TF): fastest Intel cluster in the UK when installed, entered the TOP500 at number 93
     • 128-node, 256-card NVIDIA K20 GPU cluster (250 TF) with full non-blocking dual-rail Mellanox FDR Connect-IB: fastest GPU cluster in the UK, install date October 2013
     • 128-node Westmere cluster (1536 cores, 15 TF)
     • 2.5 PB storage: high-performance parallel file system, 30 GB/s
  7. Drivers for virtualization
     • Because of our business model we need to:
         • Increase accessibility to HPC and data analytics platforms
         • Increase customers' productivity
         • Provide secure multi-tenant solutions
     • The key driver for adoption and development of virtualisation technologies within research computing (RC) is that they help with these issues by providing:
         • More flexible access to remote compute resources
         • Customised environments
         • Sandboxing of the user environment, protecting users from others and others from them, i.e. more secure multi-tenancy
         • The ability for users to take advantage of wider 3rd party cloud infrastructure
         • A dynamic environment for throughput of workload: the ability to checkpoint out low-priority workload for increased throughput of higher-SLA work
         • A move of the HPC / data analytics system stack out of an HPC niche and into the more mainstream enterprise computing domain
  8. Three virtualization use cases
     • Remote visualisation / virtual workstation via 3D-accelerated virtual remote sessions
         • Recent advances from NVIDIA and virtualisation technologies from VMware and Citrix allow 3D-accelerated remote access to the virtual machine
     • Virtualised HPC platform for large-scale simulation and throughput workloads
     • Virtualised data analytics platform: allows data analytics as a service, bringing the users to the data, not the data to the users
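
The "Drivers for virtualization" slide mentions checkpointing out low-priority workload to raise the throughput of higher-SLA work. The slides do not say how the Cambridge HPC Service implements this, so the following is only a toy sketch of the general idea, not their method. All names (`Job`, `ToyScheduler`, `slots`) are hypothetical, the "checkpoint" is just a counter rather than a real VM snapshot, and a lower `priority` number is assumed to mean a higher SLA.

```python
import heapq
import itertools
from dataclasses import dataclass


@dataclass(eq=False)  # identity-based equality, so list.remove() hits the exact job
class Job:
    name: str
    priority: int          # assumption: lower number = higher SLA
    checkpoints: int = 0   # how many times this job has been checkpointed out


class ToyScheduler:
    """Hypothetical scheduler with a fixed number of VM slots."""

    def __init__(self, slots):
        self.slots = slots
        self.running = []                # jobs currently occupying a slot
        self._waiting = []               # heap of (priority, seq, job)
        self._seq = itertools.count()    # tie-breaker keeps FIFO order per priority

    def _enqueue(self, job):
        heapq.heappush(self._waiting, (job.priority, next(self._seq), job))

    def submit(self, job):
        # Free slot: start the job immediately.
        if len(self.running) < self.slots:
            self.running.append(job)
            return
        # Otherwise look at the lowest-SLA job that is currently running.
        victim = max(self.running, key=lambda j: j.priority)
        if victim.priority > job.priority:
            # Checkpoint the victim out and hand its slot to the new job.
            self.running.remove(victim)
            victim.checkpoints += 1
            self._enqueue(victim)
            self.running.append(job)
        else:
            # Nothing less important is running, so the new job waits.
            self._enqueue(job)

    def finish(self, job):
        # Release the slot and resume the most important waiting job, if any.
        self.running.remove(job)
        if self._waiting:
            _, _, resumed = heapq.heappop(self._waiting)
            self.running.append(resumed)


sched = ToyScheduler(slots=2)
low = Job("batch-sweep", priority=9)
mid = Job("analysis", priority=5)
high = Job("urgent-sim", priority=1)
for job in (low, mid, high):
    sched.submit(job)
print(sorted(j.name for j in sched.running))
```

In this sketch the third submission finds both slots busy, so the lowest-SLA running job is checkpointed out and resumed only when a slot is released by `finish`. A production batch system would also have to handle checkpoint cost, fairness and starvation, which are deliberately left out here.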