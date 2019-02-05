
FOSDEM 2019 – Feb 03, 2019 – Brussels. The convergence of HPC and BigData: What does it mean for HPC sysadmins?
The convergence of HPC and BigData: What does it mean for HPC sysadmins?

In this deck from FOSDEM'19, Damien Francois from the Université catholique de Louvain presents: The convergence of HPC and BigData: What does it mean for HPC sysadmins?

"There are mainly two types of people in the scientific computing world: those who produce data and those who consume it. Those who have models and generate data from those models, a process known as 'simulation', and those who have data and infer models from the data ('analytics'). The former often originate from disciplines such as Engineering, Physics, or Climatology, while the latter are most often active in Remote sensing, Bioinformatics, Sociology, or Management.

Simulations often require large amounts of computation, so they are often run on generic High-Performance Computing (HPC) infrastructures built on a cluster of powerful high-end machines linked together with high-bandwidth, low-latency networks. The cluster is often augmented with hardware accelerators (co-processors such as GPUs or FPGAs) and a large and fast parallel filesystem, all set up and tuned by systems administrators. By contrast, in analytics, the focus is on the storage and access of the data, so analytics is often performed on a BigData infrastructure suited for the problem at hand. Those infrastructures offer specific data stores and are often installed in a more or less self-service way on a public or private 'Cloud' typically built on top of 'commodity' hardware.

Those two worlds, the world of HPC and the world of BigData, are slowly, but surely, converging. The HPC world realizes that there is more to data storage than just files and that 'self-service' ideas are tempting. In the meantime, the BigData world realizes that co-processors and fast networks can really speed up analytics. And indeed, all major public Cloud services now have an HPC offering, and many academic HPC centres are starting to offer Cloud infrastructures and BigData-related tools.

This talk will focus on the latter point of view and review the tools originating from BigData and the ideas from the Cloud that can be implemented in an HPC context to enlarge the offer for scientific computing in universities and research centres."
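
To make the abstract's distinction between 'simulation' (model in, data out) and 'analytics' (data in, model out) concrete, here is a minimal sketch in Python. The straight-line model, the function names `simulate` and `fit_line`, and the toy inputs are illustrative assumptions, not something taken from the talk.

```python
import random


def simulate(slope, intercept, xs, noise=0.0, seed=0):
    """Simulation: start from a known model and generate data from it."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    return [slope * x + intercept + rng.gauss(0.0, noise) for x in xs]


def fit_line(xs, ys):
    """Analytics: start from data and infer a model (least-squares line)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0, 3.0, 4.0]
    ys = simulate(2.0, 1.0, xs, noise=0.1)  # model -> data
    print(fit_line(xs, ys))                 # data -> model
```

At real scale the first half tends to be compute-bound and the second half tends to be bound by data storage and access, which is the contrast the abstract draws between HPC and BigData infrastructures.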

Published in: Technology
License: CC Attribution License
The convergence of HPC and BigData: What does it mean for HPC sysadmins?

  2. Scientists are never happy
  3. Some have models but they want data. Please do not ask me to explain the equations. Thanks. Pictures courtesy of NASA and Wikipedia.
  4. Others have data but they want models. Please do not ask me to explain the equations. Thanks. Pictures courtesy of NASA and Wikipedia.
  5. Compute intensive (HPC Dwarfs): Dense and Sparse Linear Algebra, Spectral Methods, N-Body Methods, Structured and Unstructured Grids, Monte Carlo. Data intensive (BigData Ogres): PageRank, Collaborative Filtering, Linear Classifiers, Outlier Detection, Clustering, Latent Dirichlet Allocation, Probabilistic Latent Semantic Indexing, Singular Value Decomposition, Multidimensional Scaling, Graph Algorithms, Neural Networks, Global Optimisation, Agents, Geographical Information Systems. I did not invent that. Sources: Krste Asanović et al., The Landscape of Parallel Computing Research: A View from Berkeley, EECS Department, University of California, Berkeley, Technical Report No. UCB/EECS-2006-183, December 18, 2006; Fox, G. et al., Towards a comprehensive set of big data benchmarks, in: BigData and High Performance Computing, vol. 26, p. 47, February 2015. Pictures courtesy of Disney and DreamWorks.
  6. Compute intensive (HPC): Clusters. Data intensive (BigData): Cloud. This is caricatural (a little inaccurate), but it saves me tons of explanation. Pics (c) Disney and Dreamworks.
  7. Compute intensive (HPC), Clusters: close to the metal; high-end/dedicated hardware; exclusive access to resources. Data intensive (BigData), Cloud: instant availability; self-service or ready-made; elasticity, fault tolerance. This is caricatural (a little inaccurate), but it saves me tons of explanation. Pics (c) Disney and Dreamworks.
  8. Cloudster(?): Compute intensive (HPC) and Data intensive (BigData). The word ‘cloudster’ does not exist. I made it up. Not related to shoes. Pics (c) Disney and Dreamworks.
  9. Now all Cloud providers offer HPC services
  10. What should Academic HPC centers do? Answer on next slide. Please be patient.
  11. They should add Cloud-related technologies to their offering.
  12. Cloud stack. Hardware: commodity entry-level procs, 10 Gbps net, hard disks, medium-size RAM, etc. System: OS, hypervisor; block storage, VMs + VNets; MapReduce/Spark, NoSQL + DFS, resource manager; BigData user ecosystem; Web, Mobile; Infra., Platform, Soft. Cluster stack. Hardware: high-end costly procs, 100 Gbps net, SSDs, hardware accelerators, etc. System: OS (with RDMA, perf monitoring); MPI, resource manager, //FS; HPC user ecosystem.
  13. Nikolay Malitsky, Bringing the HPC reconstruction algorithms to Big Data Platforms, New York Data Summit, 2016
  14. 5 paths to follow
  15. 1 Virtualization: More user control, more isolation. 1.a Private Cloud on HPC; 1.b HPC On Demand & HPC as a Service; 1.c Containers. Deploy a cloud and install the HPC stack inside virtual machines allocated for each project/user with, for instance, TrinityX. Deploy virtual machines inside a job allocation with, for instance, pcocc. Run jobs in containers with, for instance, Singularity, Shifter, or CharlieCloud.
  16. 2 Cloud bursting: Elasticity for the cluster. Provision virtual machines in a cloud and append them to the cluster resources. Example with the Slurm resource manager:
  17. 3 Additional storage paradigms: Solve the ZOT files problem and increase external share-ability. 3.a Object storage; 3.b Hadoop connectors; 3.c NoSQL. Deploy an object store, e.g. HDFS, but also Swift or Ceph, either on a dedicated set of machines close to the cluster and with external connectivity or on the hard drives of the compute nodes. Deploy an ElasticSearch, a MongoDB, a Cassandra, an InfluxDB, and a Neo4j cluster on separate hardware close to the cluster. There are many other options for NoSQL databases. Install a ‘connector’ on top of BeeGFS, Gluster, Lustre, etc. to offer an HDFS interface.
  18. 4 Additional programming paradigms: Offer new libraries, mid-way between MPI and job arrays: HPDA. 4.a Standalone MapReduce or Spark; 4.b Deploy a Hadoop framework inside allocation; 4.c Disguise the scheduler as a Hadoop platform. ... Using, for instance, MyHadoop, a “Framework for deploying Hadoop clusters on traditional HPC from userland”. Using a tool that deploys a Hadoop framework by submitting jobs, then reports back to the user and allows them to submit MapReduce jobs, for instance HanythingOnDemand, HAM, or Magpie.
  19. 4 Additional programming paradigms: Offer new libraries, mid-way between MPI and job arrays: HPDA. 4.d HPC and BigData scheduler colocation; 4.e Unified BigData/HPC stack. Take advantage of the elasticity and resilience of the Hadoop framework to deploy Yarn on the idle nodes of a cluster and update the Yarn node list upon job start or termination. Or dedicate a portion of the cluster to Yarn/Mesos. <Spoiler> Probably not. But generates a lot of fuss. </Spoiler> One day? Intel, IBM working on that. Will it be FOSS?
  20. 5 Web and Apps: Going beyond SSH and the command line, adding interactivity. 5.a Web-HPC; 5.b Ubiquitous access to data. Allow users to submit jobs through web interfaces, but also to use Web-based interactive scientific interpreters such as RStudioServer and JupyterLab, and notebooks, etc. I personally prefer my terminal. Let the user access data and results from the Web, an App, or a Desktop client, with for instance NextCloud.
  21. The “Cloudster”: The Ultimate Machine. Fast interconnect. High-memory compute nodes. Accelerated compute nodes. RAID SSDs compute nodes. Parallel filesystem. Management nodes. Database nodes. Data transfer nodes. Login nodes. Web nodes. Outbound. Submit job scripts or containers or VMs or MapReduce or Spark jobs. Run baremetal or container or VM. With a Hadoop connector. GridFTP, Sqoop. NextCloud. RStudio et al.
  22. Scientists will be happy. Well, I hope. Thank you for your attention.
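
Path 4.a in the transcript names standalone MapReduce as a paradigm mid-way between MPI and job arrays. As a rough illustration of what that pattern looks like, here is a single-process word count in plain Python. The function names and the toy input are hypothetical and not taken from the talk; a real framework such as Hadoop or Spark would distribute the map and reduce phases across nodes.

```python
from collections import defaultdict
from functools import reduce


def map_phase(record):
    """Map: emit one (word, 1) pair per word of an input line."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Shuffle: group values by key, as a framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values emitted for one key into a single count."""
    return key, reduce(lambda a, b: a + b, values)


def word_count(records):
    """Run map, shuffle, and reduce sequentially over an iterable of lines."""
    pairs = [pair for record in records for pair in map_phase(record)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    lines = ["HPC meets BigData", "BigData meets HPC on the cluster"]
    print(word_count(lines))
```

Each `map_phase` call is independent, much like the tasks of a job array, while the shuffle step is the only point where data moves between workers. That is one way to read the "mid-way" positioning on the slide.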

