OpenHPC: Project Overview and
Updates
Karl W. Schulz, Ph.D.
Software and Services Group, Intel
Technical Project Lead, OpenHPC
5th Annual MVAPICH User Group (MUG) Meeting
August 16, 2017 w Columbus, Ohio
http://openhpc.community
Outline
• Brief project overview
• New items/updates from last year
• Highlights from latest release
2
OpenHPC: Mission and Vision
• Mission: to provide a reference collection of open-source HPC
software components and best practices, lowering barriers to
deployment, advancement, and use of modern HPC methods
and tools.
• Vision: OpenHPC components and best practices will enable
and accelerate innovation and discoveries by broadening
access to state-of-the-art, open-source HPC methods and tools
in a consistent environment, supported by a collaborative,
worldwide community of HPC users, developers, researchers,
administrators, and vendors.
3
OpenHPC: a brief History…
4
• June 2015, ISC’15: BoF on the merits of / interest in a community effort
• Nov 2015, SC’15: seed initial 1.0 release, gather interested parties to work with the Linux Foundation
• June 2016, ISC’16: v1.1.1 release; Linux Foundation announces technical leadership, founding members, and formal governance structure
• MUG’16: OpenHPC intro
• Nov 2016, SC’16: v1.2 release, BoF
• June 2017, ISC’17: v1.3.1 release, BoF
OpenHPC Project Members
5
Project member participation interest? Please contact Jeff ErnstFriedman: jernstfriedman@linuxfoundation.org
Argonne
National
Laboratory
Indiana
University
University of
Cambridge
Mixture of Academics, Labs,
OEMs, and ISVs/OSVs
6
Governance: Technical Steering Committee (TSC)
Role Overview
OpenHPC
Technical  Steering  Committee  (TSC)
Maintainers
Integration  
Testing  
Coordinator(s)
Upstream  Component  
Development  
Representative(s)
Project  
Leader
End-­User  /  Site  
Representative(s)
Note:  We  just  completed  election  of  TSC  
members  for  the  2017-­2018  term.
• terms  are  for  1-­year  
7
OpenHPC TSC – Individual Members
• Reese Baird, Intel (Maintainer)
• David Brayford, LRZ (Maintainer)
• Eric Coulter, Indiana University (End-User/Site Representative)
• Leonardo Fialho, ATOS (Maintainer)
• Todd Gamblin, LLNL (Component Development Representative)
• Craig Gardner, SUSE (Maintainer)
• Renato Golin, Linaro (Testing Coordinator)
• Jennifer Green, Los Alamos National Laboratory (Maintainer)
• Douglas Jacobsen, NERSC (End-User/Site Representative)
• Chulho Kim, Lenovo (Maintainer)
• Janet Lebens, Cray (Maintainer)
• Thomas Moschny, ParTec (Maintainer)
• Nam Pho, New York Langone Medical Center (Maintainer)
• Cyrus Proctor, Texas Advanced Computing Center (Maintainer)
• Adrian Reber, Red Hat (Maintainer)
• Joseph Stanfield, Dell (Maintainer)
• Karl W. Schulz, Intel (Project Lead, Testing Coordinator)
• Jeff Schutkoske, Cray (Component Development Representative)
• Derek Simmel, Pittsburgh Supercomputing Center (End-User/Site Representative)
• Scott Suchyta, Altair (Maintainer)
• Nirmala Sundararajan, Dell (Maintainer)
https://github.com/openhpc/ohpc/wiki/Governance-Overview
New for 2017-2018
8
Target system architecture
• Basic cluster architecture:
head node (SMS) + computes
• Ethernet fabric for mgmt.
network
• Shared or separate out-of-band
(BMC) network
• High speed fabric (InfiniBand,
Omni-Path)
[Diagram: SMS head node with eth0/eth1 interfaces to the compute eth and BMC interfaces; compute nodes on the high-speed network with a parallel file system; tcp networking]
[ Key takeaway ]
• OpenHPC provides a collection of pre-built ingredients common in
HPC environments; fundamentally it is a package repository
• The repository is published for use with Linux distro package
managers:
• yum (CentOS/RHEL)
• zypper (SLES)
• You can pick relevant bits of interest for your site:
• if you prefer a resource manager that is not included, you can build
that locally and still leverage the scientific libraries and development
environment
• similarly, you might prefer to utilize a different provisioning system
OpenHPC: a building block repository
9
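To make the “pick relevant bits of interest” idea above concrete, here is a minimal sketch of cherry-picking packages once the repository is enabled (the repo setup itself is shown on the following slides). The ohpc-base and ohpc-slurm-server meta-package names appear later in this deck; gnu7-compilers-ohpc is an assumed name for the packaged GCC 7 toolchain.

# Pull in the cluster-wide base packages plus a resource manager server...
[sms]# yum -y install ohpc-base ohpc-slurm-server
# ...or cherry-pick only the development-environment pieces of interest,
# e.g. Lmod and the OpenHPC-packaged GNU 7 compiler toolchain
[sms]# yum -y install lmod-ohpc gnu7-compilers-ohpc
# (SLES sites would use zypper for the equivalent installs)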
10
Newish Items/Updates
changes and new items since we were last together at
MUG’16
11
• For those who prefer to mirror a repo locally, we have historically provided an ISO that contained all the packages/repodata
• Beginning with the v1.2 release, we switched to a tarball-based distribution
• Distribution tarballs available at: http://build.openhpc.community/dist
• A “make_repo.sh” script is provided that will set up a locally hosted OpenHPC repository using the contents of the downloaded tarball
Switched from ISOs -> distribution tarballs
Index of /dist/1.2.1
Name Last modified Size Description
OpenHPC-1.2.1.CentOS_7.2_aarch64.tar 2017-01-24 12:43 1.3G
OpenHPC-1.2.1.CentOS_7.2_src.tar 2017-01-24 12:45 6.8G
OpenHPC-1.2.1.CentOS_7.2_x86_64.tar 2017-01-24 12:43 2.2G
OpenHPC-1.2.1.SLE_12_SP1_aarch64.tar 2017-01-24 12:40 1.1G
OpenHPC-1.2.1.SLE_12_SP1_src.tar 2017-01-24 12:42 6.2G
OpenHPC-1.2.1.SLE_12_SP1_x86_64.tar 2017-01-24 12:41 1.9G
OpenHPC-1.2.1.md5s 2017-01-24 12:50 416
# tar xf OpenHPC-1.2.1.CentOS_7.2_x86_64.tar
# ./make_repo.sh
--> Creating OpenHPC.local.repo file in /etc/yum.repos.d
--> Local repodata stored in /root/repo
# yum repolist | grep OpenHPC
OpenHPC-local OpenHPC-1.2 - Base
OpenHPC-local-updates OpenHPC-1.2.1 - Updates
More Generic Repo Paths
• Starting with the v1.3 release, we adopted more generic paths
for underlying distros
12
[OpenHPC]
name=OpenHPC-1.3 - Base
baseurl=http://build.openhpc.community/OpenHPC:/1.3/CentOS_7
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-OpenHPC-1
[OpenHPC-updates]
name=OpenHPC-1.3 - Updates
baseurl=http://build.openhpc.community/OpenHPC:/1.3/updates/CentOS_7
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-OpenHPC-1
Similar approach for SLES12 repo config -> SLE_12
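For SLES, a minimal zypper sketch of the same setup is shown below; the repository aliases are illustrative, and the paths simply substitute SLE_12 for CentOS_7 as noted above.

[sms]# zypper ar -f http://build.openhpc.community/OpenHPC:/1.3/SLE_12 OpenHPC
[sms]# zypper ar -f http://build.openhpc.community/OpenHPC:/1.3/updates/SLE_12 OpenHPC-updates
[sms]# zypper --gpg-auto-import-keys refresh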
Release Roadmap Published
13
Roadmap
The table below highlights expected future releases that are currently on the near term
OpenHPC roadmap. Note that these are dynamic in nature and will be updated based on
directions from the OpenHPC TSC. Approximate release date(s) are provided to aid in
planning, but please be aware that these dates and releases are subject to change, YMMV,
etc.
Release Target Release Date Expectations
1.3.2 August 2017 New component additions and version upgrades.
1.3.3 November 2017 New component additions and version upgrades.
Previous Releases
A history of previous OpenHPC releases is highlighted below. Clicking on the version string will
take you to the Release Notes for more detailed information on the changes in a particular
release.
Release Date
1.3.1 June 16, 2017
1.3 March 31, 2017
1.2.1 January 24, 2017
1.2 November 12, 2016
1.1.1 June 21, 2016
1.1 April 18, 2016
1.0.1 February 05, 2016
1.0 November 12, 2015
• Have had some requests for a roadmap for future
releases
• High-level roadmap now maintained on the GitHub wiki:
https://github.com/openhpc/ohpc/wiki/Release-History-and-Roadmap
Component Submission Site
• A common question posed to the project has been how to request new software components
• We now have a simple submission site for new requests:
- https://github.com/openhpc/submissions
- requests reviewed on a rolling basis at roughly a quarterly cadence
• Example software/recipes that have been added via this process:
- xCAT
- BeeGFS
- PLASMA, SLEPc
- clang/LLVM
- MPICH
- PBS Professional
- Singularity
- Scalasca
Subset of information requested during the submission process:
- Software Name
- Public URL
- Technical Overview
- Latest stable version number
- Open-source license type
- Relationship to component? (contributing developer / user / other; if other, please describe)
- Build system (autotools-based / CMake / other; if other, please describe)
- Does the current build system support staged path installations? For example: make install DESTDIR=/tmp/foo (or equivalent)
14
Opt-in System Registry Now Available
• Interested users can
now register their
usage on a public
system registry
• Helpful for us to have
an idea as to who is
potentially benefitting
from this community
effort
• Accessible from top-level GitHub page
15
OpenHPC System Registry
This opt-in form can be used to register your system to let us (and the community) know that you are using elements of OpenHPC.
Fields requested (* = required): Name of Site/Organization*, OS distribution in use* (CentOS/RHEL, SLES, Other), Site or System URL
Multiple Architecture Builds
• Starting with v1.2 release, we also include builds for aarch64
- both SUSE and RHEL/CentOS now have aarch64 variants available for
latest versions (SLES 12 SP2, CentOS 7.3)
• Recipes/packages being made available as a Tech Preview
- some additional work required for provisioning
- significant majority of development packages testing OK, but there are a
few known caveats
- please see https://github.com/openhpc/ohpc/wiki/ARM-Tech-Preview for
latest info
v1.3.1 RPM counts
16
OpenHPC: Community building blocks for HPC systems. (v1.3.1)
components available: 67 | new additions: 3 | updates: 44%
Introduction
This stack provides a variety of common, pre-built ingredients required to deploy and manage an HPC Linux cluster, including provisioning tools, resource management, I/O clients, runtimes, development tools, and a variety of scientific libraries.
The compatible OS version(s) for this release and the total number of pre-packaged binary RPMs available per architecture type are summarized as follows:
Base OS x86_64 aarch64 noarch
CentOS 7.3 424 265 32
SLES 12 SP2 429 274 32
A detailed list of all available components is available in the "Package Manifest" appendix located in each of the companion install guide documents, and a list of updated packages can be found in the release notes. The release notes also contain important information for those upgrading from previous versions of OpenHPC.
Dev Environment Consistency
17
[Side-by-side module listings on x86_64 and aarch64]
OpenHPC provides a consistent development environment to the end user across multiple architectures.
18
End-user software addition tools
OpenHPC repositories include two additional tools that can be used to
further extend a user’s development environment
- EasyBuild and Spack
- these leverage other community efforts for build reproducibility and best practices for configuration
- modules available for both after install
# module load gcc7
# module load spack
# spack compiler find
==> Added 1 new compiler to /root/.spack/linux/compilers.yaml gcc@7.1.0
# spack install darshan-runtime
...
# . /opt/ohpc/admin/spack/0.10.0/share/spack/setup-env.sh
# module avail
----- /opt/ohpc/admin/spack/0.10.0/share/spack/modules/linux-sles12-x86_64 -----
darshan-runtime-3.1.0-gcc-7.1.0-vhd5hhg m4-1.4.17-gcc-7.1.0-7jd575i
hwloc-1.11.4-gcc-7.1.0-u3k6dok ncurses-6.0-gcc-7.1.0-l3mdumo
libelf-0.8.13-gcc-7.1.0-tsgwr7j openmpi-2.0.1-gcc-7.1.0-5imqlfb
libpciaccess-0.13.4-gcc-7.1.0-33gbduz pkg-config-0.29.1-gcc-7.1.0-dhbpa2i
libsigsegv-2.10-gcc-7.1.0-lj5rntg util-macros-1.19.0-gcc-7.1.0-vkdpa3t
libtool-2.4.6-gcc-7.1.0-ulicbkz zlib-1.2.10-gcc-7.1.0-gy4dtna
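EasyBuild can be exercised in a similar way after install; a minimal sketch is shown below (the zlib easyconfig name is illustrative only, and --robot asks EasyBuild to resolve and build dependencies automatically):

# module load EasyBuild
# eb --version
# eb zlib-1.2.8.eb --robot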
Variety of recipes now available
Choose your own adventure…
Initially, we started off with a single recipe with the intent to expand:
OpenHPC (v1.0.1)
Cluster Building Recipes
CentOS7.1 Base OS
Base Linux* Edition
Document Last Update: 2016-02-05
Document Revision: 7e93115
Latest v1.3.1 release continues to expand with multiple
resource managers, OSes, provisioners, and architectures:
• Install_guide-CentOS7-Warewulf-PBSPro-1.3.1-x86_64.pdf
• Install_guide-CentOS7-Warewulf-SLURM-1.3.1-aarch64.pdf
• Install_guide-CentOS7-Warewulf-SLURM-1.3.1-x86_64.pdf
• Install_guide-CentOS7-xCAT-SLURM-1.3.1-x86_64.pdf
• Install_guide-SLE_12-Warewulf-PBSPro-1.3.1-x86_64.pdf
• Install_guide-SLE_12-Warewulf-SLURM-1.3.1-aarch64.pdf
• Install_guide-SLE_12-Warewulf-SLURM-1.3.1-x86_64.pdf
• Additional resource manager (PBS Professional) with v1.2
• Additional provisioner (xCAT) with v1.3.1
19
20
Template scripts
Template recipe scripts are provided that encapsulate the commands presented in
the guides:
# yum/zypper install docs-ohpc
# ls /opt/ohpc/pub/doc/recipes/*/*/*/*/recipe.sh
/opt/ohpc/pub/doc/recipes/centos7/aarch64/warewulf/slurm/recipe.sh
/opt/ohpc/pub/doc/recipes/centos7/x86_64/warewulf/pbspro/recipe.sh
/opt/ohpc/pub/doc/recipes/centos7/x86_64/warewulf/slurm/recipe.sh
/opt/ohpc/pub/doc/recipes/centos7/x86_64/xcat/slurm/recipe.sh
/opt/ohpc/pub/doc/recipes/sles12/aarch64/warewulf/slurm/recipe.sh
/opt/ohpc/pub/doc/recipes/sles12/x86_64/warewulf/pbspro/recipe.sh
/opt/ohpc/pub/doc/recipes/sles12/x86_64/warewulf/slurm/recipe.sh
# ls /opt/ohpc/pub/doc/recipes/*/input.local
/opt/ohpc/pub/doc/recipes/centos7/input.local
/opt/ohpc/pub/doc/recipes/sles12/input.local
input.local + recipe.sh == installed system
# compute hostnames
c_name[0]=c1
c_name[1]=c2
…
# compute node MAC addresses
c_mac[0]=00:1a:2b:3c:4f:56
c_mac[1]=00:1a:2b:3c:4f:56
…
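A minimal sketch of that workflow is shown below; note that the OHPC_INPUT_LOCAL override variable is an assumption here, so consult the header of recipe.sh in your release for the exact mechanism:

# copy the example input file and edit it for your site (hostnames, MACs, interfaces, ...)
[sms]# cp /opt/ohpc/pub/doc/recipes/centos7/input.local ./input.local
[sms]# vi input.local
# point the recipe at the customized input and run it end to end
[sms]# export OHPC_INPUT_LOCAL=./input.local
[sms]# /opt/ohpc/pub/doc/recipes/centos7/x86_64/warewulf/slurm/recipe.sh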
Test Suite
• Initiated from discussion/requests at SC’16 BoF, the OpenHPC test
suite is now available as an installable RPM (introduced with v1.3
release)
• # yum/zypper install test-suite-ohpc
- creates/relies on an “ohpc-test” user to perform user-level testing (with the ability to submit jobs through the resource manager)
- related discussion added to recipes in Appendix C
but is instead, devised to confirm component builds are functional and interoperable within the modular
OpenHPC environment. The test suite is generally organized by components and the OpenHPC CI workflow
relies on running the full suite using Jenkins to test multiple OS configurations and installation recipes. To
facilitate customization and running of the test suite locally, we provide these tests in a standalone RPM.
[sms]# yum -y install test-suite-ohpc
The RPM installation creates a user named ohpc-test to house the test suite and provide an isolated
environment for execution. Configuration of the test suite is done using standard GNU autotools semantics
and the BATS shell-testing framework is used to execute and log a number of individual unit tests. Some
tests require privileged execution, so a different combination of tests will be enabled depending on which user
executes the top-level configure script. Non-privileged tests requiring execution on one or more compute
nodes are submitted as jobs through the SLURM resource manager. The tests are further divided into
“short” and “long” run categories. The short run configuration is a subset of approximately 180 tests to
demonstrate basic functionality of key components (e.g. MPI stacks) and should complete in 10-20 minutes.
The long run (around 1000 tests) is comprehensive and can take an hour or more to complete.
Most components can be tested individually, but a default configuration is set up to enable collective
testing. To test an isolated component, use the configure option to disable all tests, then re-enable the
desired test to run. The --help option to configure will display all possible tests. Example output is
shown below (some output is omitted for the sake of brevity).
[sms]# su - ohpc-test
[test@sms ~]$ cd tests
[test@sms ~]$ ./configure --disable-all --enable-fftw
checking for a BSD-compatible install... /bin/install -c
checking whether build environment is sane... yes
...
---------------------------------------------- SUMMARY ---------------------------------------------
Package version............... : test-suite-1.3.0
Build user.................... : ohpc-test
Build host.................... : sms001
Configure date................ : 2017-03-24 15:41
Build architecture............ : x86_64
Compiler Families............. : gnu
MPI Families.................. : mpich mvapich2 openmpi
Resource manager ............. : SLURM
Test suite configuration...... : short
...
Libraries:
21
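Once configured, the enabled tests are typically driven through the generated Makefile; a minimal sketch, assuming the short-run configuration above and a working SLURM partition for the job-based tests:

# run everything enabled at configure time (BATS output is logged per component)
[test@sms tests]$ make check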
Project CI infrastructure
• TACC is kindly hosting some CI infrastructure for the project (Austin, TX)
• Used for build servers and a continuous integration (CI) testbed
http://test.openhpc.community:8080
Many thanks to TACC and the vendors for hardware donations: Intel, Cavium, Dell
x86_64
aarch64
22
23
Community Test System for CI in use
http://test.openhpc.community:8080
All recipes exercised in CI system (start w/ bare-metal installs + integration test suite)
[Jenkins dashboard: OpenHPC CI Infrastructure. Thanks to the Texas Advanced Computing Center (TACC) for hosting support and to Intel, Cavium, and Dell for hardware donations.]
v1.2 jobs: (centos7.2,x86_64) and (sles12sp1,x86_64) combinations of Warewulf with SLURM or PBS Pro (short and long cycles, plus a PSXE variant), and aarch64 Warewulf+SLURM jobs on centos7.2 and sles12sp1.
v1.3.x jobs:
(1.3 to 1.3.1) - (centos7.3,x86_64) - (warewulf+slurm)
(1.3.1) - (centos7.3,x86_64) - (warewulf+pbspro) - UEFI
(1.3.1) - (centos7.3,x86_64) - (warewulf+slurm) - long cycle
(1.3.1) - (centos7.3,x86_64) - (warewulf+slurm) - tarball REPO
(1.3.1) - (centos7.3,x86_64) - (warewulf+slurm+PSXE)
(1.3.1) - (centos7.3,x86_64) - (warewulf+slurm+PSXE+OPA)
(1.3.1) - (centos7.3,x86_64) - (xcat+slurm)
(1.3.1) - (sles12sp2,x86_64) - (warewulf+pbspro) - tarball REPO
(1.3.1) - (sles12sp2,x86_64) - (warewulf+pbspro) - UEFI
(1.3.1) - (sles12sp2,x86_64) - (warewulf+slurm)
(1.3.1) - (sles12sp2,x86_64) - (warewulf+slurm+PSXE) - long cycle
(1.3.1) - (sles12sp2,x86_64) - (warewulf+slurm+PSXE+OPA)
24
OpenHPC v1.3.1 Release
June 16, 2017
25
OpenHPC v1.3.1 - Current S/W components
Functional Areas: Components
Base OS: CentOS 7.3, SLES12 SP2
Architecture: x86_64, aarch64 (Tech Preview)
Administrative Tools: Conman, Ganglia, Lmod, LosF, Nagios, pdsh, pdsh-mod-slurm, prun, EasyBuild, ClusterShell, mrsh, Genders, Shine, Spack, test-suite
Provisioning: Warewulf, xCAT
Resource Mgmt.: SLURM, Munge, PBS Professional
Runtimes: OpenMP, OCR, Singularity
I/O Services: Lustre client (community version), BeeGFS client
Numerical/Scientific Libraries: Boost, GSL, FFTW, Metis, PETSc, Trilinos, Hypre, SuperLU, SuperLU_Dist, Mumps, OpenBLAS, Scalapack
I/O Libraries: HDF5 (pHDF5), NetCDF (including C++ and Fortran interfaces), Adios
Compiler Families: GNU (gcc, g++, gfortran)
MPI Families: MVAPICH2, OpenMPI, MPICH
Development Tools: Autotools (autoconf, automake, libtool), Valgrind, R, SciPy/NumPy, hwloc
Performance Tools: PAPI, IMB, mpiP, pdtoolkit, TAU, Scalasca, ScoreP, SIONLib
Notes:
• Additional dependencies that are not provided by the BaseOS or community repos (e.g. EPEL) are also included
• 3rd-party libraries are built for each compiler/MPI family
• Resulting repositories currently comprise ~450 RPMs
new with v1.3.1
Future additions approved for inclusion in the v1.3.2 release:
• PLASMA
• SLEPc
• pNetCDF
• Scotch
• Clang/LLVM
OpenHPC Component Updates
• Part of the motivation for a community effort like OpenHPC is the rapidity of S/W updates in our space
• Rolling history of updates/additions:
[Pie charts showing, for each release, the fraction of components updated, newly added, and unchanged]
v1.1 Release (Apr 2016), v1.2 Release (Nov 2016), v1.3 Release (Mar 2017), v1.3.1 Release (June 2017)
Updated fractions shown across the charts: 38.6%, 63.3%, 29.1%, 44%
27
Other new items for v1.3.1 Release
Meta RPM packages
introduced and adopted in
recipes:
• these replace previous use
of groups/patterns
• general convention
remains
- names that begin with “ohpc-*” are typically metapackages
- intended to group
related collections of
RPMs by functionality
• some names have been
updated for consistency
during the switch over
• updated list available in
Appendix E
E Package Manifest
This appendix provides a summary of available meta-package groupings and all of the individual RPM
packages that are available as part of this OpenHPC release. The meta-packages provide a mechanism to
group related collections of RPMs by functionality and provide a convenience mechanism for installation. A
list of the available meta-packages and a brief description is presented in Table 2.
Table 2: Available OpenHPC Meta-packages
Group Name Description
ohpc-autotools Collection of GNU autotools packages.
ohpc-base Collection of base packages.
ohpc-base-compute Collection of compute node base packages.
ohpc-ganglia Collection of Ganglia monitoring and metrics packages.
ohpc-gnu7-io-libs Collection of IO library builds for use with GNU compiler toolchain.
ohpc-gnu7-mpich-parallel-libs Collection of parallel library builds for use with GNU compiler toolchain and
the MPICH runtime.
ohpc-gnu7-mvapich2-parallel-libs Collection of parallel library builds for use with GNU compiler toolchain and
the MVAPICH2 runtime.
ohpc-gnu7-openmpi-parallel-libs Collection of parallel library builds for use with GNU compiler toolchain and
the OpenMPI runtime.
ohpc-gnu7-parallel-libs Collection of parallel library builds for use with GNU compiler toolchain.
ohpc-gnu7-perf-tools Collection of performance tool builds for use with GNU compiler toolchain.
ohpc-gnu7-python-libs Collection of python related library builds for use with GNU compiler
toolchain.
ohpc-gnu7-runtimes Collection of runtimes for use with GNU compiler toolchain.
ohpc-gnu7-serial-libs Collection of serial library builds for use with GNU compiler toolchain.
ohpc-intel-impi-parallel-libs Collection of parallel library builds for use with Intel(R) Parallel Studio XE
toolchain and the Intel(R) MPI Library.
ohpc-intel-io-libs Collection of IO library builds for use with Intel(R) Parallel Studio XE software
suite.
ohpc-intel-mpich-parallel-libs Collection of parallel library builds for use with Intel(R) Parallel Studio XE
toolchain and the MPICH runtime.
ohpc-intel-mvapich2-parallel-libs Collection of parallel library builds for use with Intel(R) Parallel Studio XE
toolchain and the MVAPICH2 runtime.
ohpc-intel-openmpi-parallel-libs Collection of parallel library builds for use with Intel(R) Parallel Studio XE
toolchain and the OpenMPI runtime.
ohpc-intel-perf-tools Collection of performance tool builds for use with Intel(R) Parallel Studio XE
toolchain.
ohpc-intel-python-libs Collection of python related library builds for use with Intel(R) Parallel Studio
XE toolchain.
ohpc-intel-runtimes Collection of runtimes for use with Intel(R) Parallel Studio XE toolchain.
ohpc-intel-serial-libs Collection of serial library builds for use with Intel(R) Parallel Studio XE
toolchain.
ohpc-nagios Collection of Nagios monitoring and metrics packages.
ohpc-slurm-client Collection of client packages for SLURM.
ohpc-slurm-server Collection of server packages for SLURM.
ohpc-warewulf Collection of base packages for Warewulf provisioning.
28
Other new items for v1.3.1 Release
• A new compiler variant (gnu7) was introduced
- in the case of a fresh install, recipes default to installing the new variant along with
matching runtimes and libraries
- if upgrading a previously installed system, administrators can opt-in to enable
the gnu7 variant
• The meta-packages for “gnu7” provide a convenient mechanism to add on:
- upgrade discussion in recipes (Appendix B) amended to highlight this workflow
Install Guide (v1.3.1): CentOS7.3/x86_64 + xCAT + SLURM
# Update default environment
[sms]# yum -y remove lmod-defaults-gnu-mvapich2-ohpc
[sms]# yum -y install lmod-defaults-gnu7-mvapich2-ohpc
# Install GCC 7.x-compiled meta-packages with dependencies
[sms]# yum -y install ohpc-gnu7-perf-tools \
                      ohpc-gnu7-serial-libs \
                      ohpc-gnu7-io-libs \
                      ohpc-gnu7-python-libs \
                      ohpc-gnu7-runtimes \
                      ohpc-gnu7-mpich-parallel-libs \
                      ohpc-gnu7-openmpi-parallel-libs \
                      ohpc-gnu7-mvapich2-parallel-libs
parallel libs for gnu7/mpich
note: could skip this to leave the previous gnu toolchain as default
29
Coexistence of multiple variants
$ module list
Currently Loaded Modules:
1) autotools 2) prun/1.1 3) gnu7/7.1.0 4) mvapich2/2.2 5) ohpc
$ module avail
--------------------------- /opt/ohpc/pub/moduledeps/gnu7-mvapich2 -----------------------------
adios/1.11.0 imb/4.1 netcdf-cxx/4.3.0 scalapack/2.0.2 sionlib/1.7.1
boost/1.63.0 mpiP/3.4.1 netcdf-fortran/4.4.4 scalasca/2.3.1 superlu_dist/4.2
fftw/3.3.6 mumps/5.1.1 petsc/3.7.6 scipy/0.19.0 tau/2.26.1
hypre/2.11.1 netcdf/4.4.1.1 phdf5/1.10.0 scorep/3.0 trilinos/12.10.1
------------------------------- /opt/ohpc/pub/moduledeps/gnu7 ----------------------------------
R_base/3.3.3 metis/5.1.0 numpy/1.12.1 openmpi/1.10.7
gsl/2.3 mpich/3.2 ocr/1.0.1 pdtoolkit/3.23
hdf5/1.10.0 mvapich2/2.2 (L) openblas/0.2.19 superlu/5.2.1
--------------------------------- /opt/ohpc/pub/modulefiles ------------------------------------
EasyBuild/3.2.1 gnu/5.4.0 ohpc (L) singularity/2.3
autotools (L) gnu7/7.1.0 (L) papi/5.5.1 valgrind/3.12.0
clustershell/1.7.3 hwloc/1.11.6 prun/1.1 (L)
previously installed from 1.3 release
everything else from 1.3.1 updates (add-on)
Consider an example of a system originally installed from the 1.3 base release, with the gnu7 variant then added using the commands from the previous slide.
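For an interactive user, moving between the coexisting toolchains is then just a module swap; a minimal sketch (Lmod attempts to reload the dependent MPI and library modules where matching builds exist):

$ module swap gnu7 gnu/5.4.0
$ module list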
Summary
• Technical Steering Committee just completed its first year of operation; new membership selection for 2017-2018 is in place
• Provided a highlight of the changes/evolution that have occurred since MUG’16
- 4 releases since last MUG
- architecture addition (ARM)
- multiple provisioner/resource manager recipes
• “Getting Started” Tutorial held at PEARC’17 last month
- more in-depth overview: https://goo.gl/NyiDmr
30
http://openhpc.community (general info)
https://github.com/openhpc/ohpc (main GitHub site)
https://github.com/openhpc/submissions (new submissions)
https://build.openhpc.community (build system/repos)
http://www.openhpc.community/support/mail-lists/ (mailing lists)
Community Resources
  • 8. 8 Target system architecture • Basic cluster architecture: head node (SMS) + computes • Ethernet fabric for mgmt. network • Shared or separate out-of-band (BMC) network • High speed fabric (InfiniBand, Omni-Path) to  compute  eth  interface to  compute  BMC  interface compute   nodes high  speed  network   Parallel  File  System tcp  networking eth0 eth1
  • 9. [  Key  takeaway  ] • OpenHPC provides a collection of pre-built ingredients common in HPC environments; fundamentally it is a package repository • The repository is published for use with Linux distro package managers: • yum (CentOS/RHEL) • zypper (SLES) • You can pick relevant bits of interest for your site: • if you prefer a resource manager that is not included, you can build that locally and still leverage the scientific libraries and development environment • similarly, you might prefer to utilize a different provisioning system OpenHPC: a building block repository 9
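As a hedged illustration of picking only the relevant bits (a minimal sketch; the specific package selections below are illustrative and assume the OpenHPC repository is already enabled on a CentOS head node, with names following the component-compiler-mpi-ohpc convention used by the project):
# Skip the packaged provisioner/resource manager; pull in just a development toolchain,
# an MPI stack, and a couple of scientific library builds:
[sms]# yum -y install ohpc-autotools lmod-ohpc gnu7-compilers-ohpc
[sms]# yum -y install mvapich2-gnu7-ohpc petsc-gnu7-mvapich2-ohpc fftw-gnu7-mvapich2-ohpc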
  • 10. 10 Newish Items/Updates: changes and new items since we were last together at MUG’16
  • 11. 11 • For  those  who  prefer  to  mirror  a  repo   locally,  we  have  historically  provided   an  ISO  that  contained  all  the   packages/repodata • Beginning  with  v1.2  release,   switched  to  tarball  based  distribution • Distribution  tarballs   available  at:   http://build.openhpc.community/dist • A  “make_repo.sh”  script  is  provided   that  will  setup  a  locally  hosted   OpenHPC  repository  using  the   contents  from  downloaded  tarball Switched from ISOs -> distribution tarballs Index of /dist/1.2.1 Name Last modified Size Description Parent Directory - OpenHPC-1.2.1.CentOS_7.2_aarch64.tar 2017-01-24 12:43 1.3G OpenHPC-1.2.1.CentOS_7.2_src.tar 2017-01-24 12:45 6.8G OpenHPC-1.2.1.CentOS_7.2_x86_64.tar 2017-01-24 12:43 2.2G OpenHPC-1.2.1.SLE_12_SP1_aarch64.tar 2017-01-24 12:40 1.1G OpenHPC-1.2.1.SLE_12_SP1_src.tar 2017-01-24 12:42 6.2G OpenHPC-1.2.1.SLE_12_SP1_x86_64.tar 2017-01-24 12:41 1.9G OpenHPC-1.2.1.md5s 2017-01-24 12:50 416 # tar xf OpenHPC-1.2.1.CentOS_7.2_x86_64.tar # ./make_repo.sh --> Creating OpenHPC.local.repo file in /etc/yum.repos.d --> Local repodata stored in /root/repo # yum repolist | grep OpenHPC OpenHPC-local OpenHPC-1.2 - Base OpenHPC-local-updates OpenHPC-1.2.1 - Updates
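A hedged sketch of fetching and verifying a tarball before the extraction step shown above (URL and filenames follow the directory index shown; treating the .md5s file as standard md5sum-format output is an assumption):
# wget http://build.openhpc.community/dist/1.2.1/OpenHPC-1.2.1.CentOS_7.2_x86_64.tar
# wget http://build.openhpc.community/dist/1.2.1/OpenHPC-1.2.1.md5s
# grep CentOS_7.2_x86_64 OpenHPC-1.2.1.md5s | md5sum -c -   # verify only the tarball that was downloaded
# tar xf OpenHPC-1.2.1.CentOS_7.2_x86_64.tar && ./make_repo.sh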
  • 12. More Generic Repo Paths • Starting with the v1.3 release, we adopted more generic paths for underlying distros 12 [OpenHPC] name=OpenHPC-1.3 - Base baseurl=http://build.openhpc.community/OpenHPC:/1.3/CentOS_7 gpgcheck=1 gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-OpenHPC-1 [OpenHPC-updates] name=OpenHPC-1.3 - Updates baseurl=http://build.openhpc.community/OpenHPC:/1.3/updates/CentOS_7 gpgcheck=1 gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-OpenHPC-1 Similar approach for SLES12 repo config -> SLE_12
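For SLES12, a hedged equivalent using zypper (the baseurls simply substitute SLE_12 into the generic-path pattern above; the repo aliases are illustrative):
# zypper ar http://build.openhpc.community/OpenHPC:/1.3/SLE_12 OpenHPC
# zypper ar http://build.openhpc.community/OpenHPC:/1.3/updates/SLE_12 OpenHPC-updates
# zypper refresh   # import/accept the OpenHPC GPG signing key when prompted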
  • 13. Release Roadmap Published 13 • Have had some requests for a roadmap for future releases • High-level roadmap now maintained on the GitHub wiki: https://github.com/openhpc/ohpc/wiki/Release-History-and-Roadmap
Roadmap: The table below highlights expected future releases that are currently on the near-term OpenHPC roadmap. Note that these are dynamic in nature and will be updated based on directions from the OpenHPC TSC. Approximate release date(s) are provided to aid in planning, but please be aware that these dates and releases are subject to change, YMMV, etc.
Release | Target Release Date | Expectations
1.3.2 | August 2017 | New component additions and version upgrades.
1.3.3 | November 2017 | New component additions and version upgrades.
Previous Releases: A history of previous OpenHPC releases is highlighted below. Clicking on the version string will take you to the Release Notes for more detailed information on the changes in a particular release.
Release | Date
1.3.1 | June 16, 2017
1.3 | March 31, 2017
1.2.1 | January 24, 2017
1.2 | November 12, 2016
1.1.1 | June 21, 2016
1.1 | April 18, 2016
1.0.1 | February 05, 2016
1.0 | November 12, 2015
  • 14. Component Submission Site 14 • A common question posed to the project has been how to request new software components • We now have a simple submission site for new requests: - https://github.com/openhpc/submissions - requests reviewed on a rolling basis at roughly a quarterly cadence • Example software/recipes that have been added via this process: - xCAT - BeeGFS - PLASMA, SLEPc - clang/LLVM - MPICH - PBS Professional - Singularity - Scalasca • Subset of information requested during the submission process: software name, public URL, technical overview, latest stable version number, open-source license type, relationship to the component (contributing developer / user / other), build system (autotools-based / CMake / other), and whether the current build system supports staged path installation (for example: make install DESTDIR=/tmp/foo, or equivalent)
  • 15. Opt-in System Registry Now Available 15 • Interested users can now register their usage on a public system registry • Helpful for us to have an idea as to who is potentially benefitting from this community effort • Accessible from the top-level GitHub page • The opt-in registration form collects the name of the site/organization, the OS distribution in use (CentOS/RHEL, SLES, or other), and a site or system URL, letting us (and the community) know that you are using elements of OpenHPC
  • 16. Multiple Architecture Builds 16 • Starting with the v1.2 release, we also include builds for aarch64 - both SUSE and RHEL/CentOS now have aarch64 variants available for the latest versions (SLES 12 SP2, CentOS 7.3) • Recipes/packages being made available as a Tech Preview - some additional work required for provisioning - significant majority of development packages testing OK, but there are a few known caveats - please see https://github.com/openhpc/ohpc/wiki/ARM-Tech-Preview for the latest info
v1.3.1 RPM counts (from the project README, "OpenHPC: Community building blocks for HPC systems (v1.3.1)": 67 components available, 3 new additions, 44% updates): This stack provides a variety of common, pre-built ingredients required to deploy and manage an HPC Linux cluster, including provisioning tools, resource management, I/O clients, runtimes, development tools, and a variety of libraries. The compatible OS version(s) for this release and the total number of pre-packaged binary RPMs available per architecture type are summarized as follows:
Base OS | x86_64 | aarch64 | noarch
CentOS 7.3 | 424 | 265 | 32
SLES 12 SP2 | 429 | 274 | 32
A detailed list of all available components is available in the "Package Manifest" appendix located in each of the companion install guide documents, and a list of updated packages can be found in the release notes. The release notes also contain important information for those upgrading from previous versions of OpenHPC.
  • 17. Dev Environment Consistency 17 [side-by-side x86_64 and aarch64 module environments] OpenHPC providing a consistent development environment to the end user across multiple architectures
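As a hedged illustration of that consistency (module names match those shown elsewhere in this deck; hello.c is a placeholder source file), the same user-level workflow applies unchanged whether the node is x86_64 or aarch64:
$ module load gnu7 mvapich2      # identical module names on either architecture
$ mpicc -O2 hello.c -o hello     # MPI compiler wrapper from the loaded toolchain
$ prun ./hello                   # prun job-launch wrapper provided by OpenHPC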
  • 18. 18 End-user software addition tools OpenHPC repositories include two additional tools that can be used to further extend a user’s development environment - EasyBuild and Spack - leverages other community efforts for build reproducibility and best practices for configuration - modules available for both after install # module load gcc7 # module load spack # spack compiler find ==> Added 1 new compiler to /root/.spack/linux/compilers.yaml gcc@7.1.0 # spack install darshan-runtime ... # . /opt/ohpc/admin/spack/0.10.0/share/spack/setup-env.sh # module avail ----- /opt/ohpc/admin/spack/0.10.0/share/spack/modules/linux-sles12-x86_64 ----- darshan-runtime-3.1.0-gcc-7.1.0-vhd5hhg m4-1.4.17-gcc-7.1.0-7jd575i hwloc-1.11.4-gcc-7.1.0-u3k6dok ncurses-6.0-gcc-7.1.0-l3mdumo libelf-0.8.13-gcc-7.1.0-tsgwr7j openmpi-2.0.1-gcc-7.1.0-5imqlfb libpciaccess-0.13.4-gcc-7.1.0-33gbduz pkg-config-0.29.1-gcc-7.1.0-dhbpa2i libsigsegv-2.10-gcc-7.1.0-lj5rntg util-macros-1.19.0-gcc-7.1.0-vkdpa3t libtool-2.4.6-gcc-7.1.0-ulicbkz zlib-1.2.10-gcc-7.1.0-gy4dtna
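The example above walks the Spack path; a corresponding EasyBuild session might look like the following hedged sketch (the EasyBuild module is provided by OpenHPC as noted above; the particular easyconfig name and the default per-user install prefix are assumptions):
# module load EasyBuild
# eb --version                              # confirm the framework is available
# eb HPL-2.2-foss-2016b.eb -r               # build an easyconfig plus any missing dependencies (robot mode)
# module use $HOME/.local/easybuild/modules/all
# module avail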
  • 19. Variety of recipes now available Choose your own adventure… Initially, we started off with a single recipe with the intent to expand: OpenHPC (v1.0.1) Cluster Building Recipes, CentOS7.1 Base OS, Base Linux* Edition (Document Last Update: 2016-02-05, Document Revision: 7e93115). Latest v1.3.1 release continues to expand with multiple resource managers, OSes, provisioners, and architectures: • Install_guide-CentOS7-Warewulf-PBSPro-1.3.1-x86_64.pdf • Install_guide-CentOS7-Warewulf-SLURM-1.3.1-aarch64.pdf • Install_guide-CentOS7-Warewulf-SLURM-1.3.1-x86_64.pdf • Install_guide-CentOS7-xCAT-SLURM-1.3.1-x86_64.pdf • Install_guide-SLE_12-Warewulf-PBSPro-1.3.1-x86_64.pdf • Install_guide-SLE_12-Warewulf-SLURM-1.3.1-aarch64.pdf • Install_guide-SLE_12-Warewulf-SLURM-1.3.1-x86_64.pdf • Additional resource manager (PBS Professional) with v1.2 • Additional provisioner (xCAT) with v1.3.1 19
  • 20. 20 Template scripts Template recipe scripts are provided that encapsulate the commands presented in the guides (a brief usage sketch follows below): # yum/zypper install docs-ohpc # ls /opt/ohpc/pub/doc/recipes/*/*/*/*/recipe.sh /opt/ohpc/pub/doc/recipes/centos7/aarch64/warewulf/slurm/recipe.sh /opt/ohpc/pub/doc/recipes/centos7/x86_64/warewulf/pbspro/recipe.sh /opt/ohpc/pub/doc/recipes/centos7/x86_64/warewulf/slurm/recipe.sh /opt/ohpc/pub/doc/recipes/centos7/x86_64/xcat/slurm/recipe.sh /opt/ohpc/pub/doc/recipes/sles12/aarch64/warewulf/slurm/recipe.sh /opt/ohpc/pub/doc/recipes/sles12/x86_64/warewulf/pbspro/recipe.sh /opt/ohpc/pub/doc/recipes/sles12/x86_64/warewulf/slurm/recipe.sh # ls /opt/ohpc/pub/doc/recipes/*/input.local /opt/ohpc/pub/doc/recipes/centos7/input.local /opt/ohpc/pub/doc/recipes/sles12/input.local input.local + recipe.sh == installed system # compute hostnames c_name[0]=c1 c_name[1]=c2 … # compute node MAC addresses c_mac[0]=00:1a:2b:3c:4f:56 c_mac[1]=00:1a:2b:3c:4f:56 …
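A hedged sketch of the intended workflow (the OHPC_INPUT_LOCAL variable reflects how the install guides typically point recipe.sh at a customized input file; treat the variable name as an assumption and consult the guide for your release):
[sms]# yum -y install docs-ohpc
[sms]# cp /opt/ohpc/pub/doc/recipes/centos7/input.local ./input.local
[sms]# vi input.local            # set sms_name, compute node names/MACs, BMC credentials, etc.
[sms]# cp /opt/ohpc/pub/doc/recipes/centos7/x86_64/warewulf/slurm/recipe.sh .
[sms]# export OHPC_INPUT_LOCAL=./input.local
[sms]# ./recipe.sh               # replays the commands presented in the install guide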
  • 21. Test Suite • Initiated from discussion/requests at the SC’16 BoF, the OpenHPC test suite is now available as an installable RPM (introduced with the v1.3 release) • # yum/zypper install test-suite-ohpc - creates/relies on an “ohpc-test” user to perform user testing (with the ability to run jobs through the resource manager) - related discussion added to the recipes in Appendix C • Excerpt from the guide (beginning mid-sentence): ...but is instead devised to confirm component builds are functional and interoperable within the modular OpenHPC environment. The test suite is generally organized by components and the OpenHPC CI workflow relies on running the full suite using Jenkins to test multiple OS configurations and installation recipes. To facilitate customization and running of the test suite locally, we provide these tests in a standalone RPM. [sms]# yum -y install test-suite-ohpc The RPM installation creates a user named ohpc-test to house the test suite and provide an isolated environment for execution. Configuration of the test suite is done using standard GNU autotools semantics and the BATS shell-testing framework is used to execute and log a number of individual unit tests. Some tests require privileged execution, so a different combination of tests will be enabled depending on which user executes the top-level configure script. Non-privileged tests requiring execution on one or more compute nodes are submitted as jobs through the SLURM resource manager. The tests are further divided into “short” and “long” run categories. The short run configuration is a subset of approximately 180 tests to demonstrate basic functionality of key components (e.g. MPI stacks) and should complete in 10-20 minutes. The long run (around 1000 tests) is comprehensive and can take an hour or more to complete. Most components can be tested individually, but a default configuration is set up to enable collective testing. To test an isolated component, use the configure option to disable all tests, then re-enable the desired test to run. The --help option to configure will display all possible tests. Example output is shown below (some output is omitted for the sake of brevity). [sms]# su - ohpc-test [test@sms ~]$ cd tests [test@sms ~]$ ./configure --disable-all --enable-fftw checking for a BSD-compatible install... /bin/install -c checking whether build environment is sane... yes ... ---------------------------------------------- SUMMARY --------------------------------------------- Package version............... : test-suite-1.3.0 Build user.................... : ohpc-test Build host.................... : sms001 Configure date................ : 2017-03-24 15:41 Build architecture............ : x86_64 Compiler Families............. : gnu MPI Families.................. : mpich mvapich2 openmpi Resource manager ............. : SLURM Test suite configuration...... : short ... Libraries: 21
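Continuing the excerpt above, a hedged sketch of actually driving the short run after configuration (make check is the usual autotools/BATS entry point; exact output will vary by site):
[sms]# su - ohpc-test
[test@sms ~]$ cd tests
[test@sms tests]$ ./configure            # default configuration enables collective (short-run) testing
[test@sms tests]$ make check             # executes the BATS unit tests; non-privileged jobs are submitted through SLURM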
  • 22. Project CI infrastructure 22 • TACC is kindly hosting some CI infrastructure for the project (Austin, TX) • Used for build servers and a continuous integration (CI) testbed: http://test.openhpc.community:8080 • Many thanks to TACC and to the vendors who donated x86_64 and aarch64 hardware: Intel, Cavium, Dell
  • 23. 23 Community Test System for CI in use http://test.openhpc.community:8080 All recipes exercised in CI system (start w/ bare-metal installs + integration test suite) [Screenshots of the OpenHPC Jenkins CI dashboards: "OpenHPC CI Infrastructure - Thanks to the Texas Advanced Computing Center (TACC) for hosting support and to Intel, Cavium, and Dell for hardware donations." The job views list short- and long-cycle runs for each supported combination of OS (CentOS 7.x, SLES 12 SPx), architecture (x86_64, aarch64), provisioner (Warewulf, xCAT), and resource manager (SLURM, PBS Pro), plus variants covering PSXE, Omni-Path, UEFI, tarball-based repos, and 1.3 to 1.3.1 upgrade testing, with typical durations ranging from roughly 15 minutes for short cycles to a couple of hours for long cycles.]
  • 25. 25 OpenHPC v1.3.1 - Current S/W components
Functional Areas | Components
Base OS | CentOS 7.3, SLES12 SP2
Architecture | x86_64, aarch64 (Tech Preview)
Administrative Tools | Conman, Ganglia, Lmod, LosF, Nagios, pdsh, pdsh-mod-slurm, prun, EasyBuild, ClusterShell, mrsh, Genders, Shine, Spack, test-suite
Provisioning | Warewulf, xCAT
Resource Mgmt. | SLURM, Munge, PBS Professional
Runtimes | OpenMP, OCR, Singularity
I/O Services | Lustre client (community version), BeeGFS client
Numerical/Scientific Libraries | Boost, GSL, FFTW, Metis, PETSc, Trilinos, Hypre, SuperLU, SuperLU_Dist, Mumps, OpenBLAS, Scalapack
I/O Libraries | HDF5 (pHDF5), NetCDF (including C++ and Fortran interfaces), Adios
Compiler Families | GNU (gcc, g++, gfortran),
MPI Families | MVAPICH2, OpenMPI, MPICH
Development Tools | Autotools (autoconf, automake, libtool), Valgrind, R, SciPy/NumPy, hwloc
Performance Tools | PAPI, IMB, mpiP, pdtoolkit, TAU, Scalasca, ScoreP, SIONLib
Notes: • Additional dependencies that are not provided by the BaseOS or community repos (e.g. EPEL) are also included • 3rd party libraries are built for each compiler/MPI family • Resulting repositories currently comprise ~450 RPMs (items new with v1.3.1 are highlighted on the slide)
Future additions approved for inclusion in v1.3.2 release: • PLASMA • SLEPc • pNetCDF • Scotch • Clang/LLVM
  • 26. OpenHPC Component Updates • Part of the motivation for a community effort like OpenHPC is the rapidity of S/W updates in our space • Rolling history of updates/additions: [pie charts showing, for each release, the share of packages updated, newly added, or unchanged - v1.1 (Apr 2016), v1.2 (Nov 2016), v1.3 (Mar 2017), and v1.3.1 (June 2017); the updated/added fractions shown are 38.6%, 63.3%, 29.1%, and 44%]
  • 27. 27 Other new items for v1.3.1 Release Meta RPM packages introduced and adopted in recipes: • these replace previous use of groups/patterns • general convention remains - names that begin with “ohpc-*” are typically metapackages - intended to group related collections of RPMs by functionality • some names have been updated for consistency during the switch over • updated list available in Appendix E
Appendix E Package Manifest (excerpt): This appendix provides a summary of available meta-package groupings and all of the individual RPM packages that are available as part of this OpenHPC release. The meta-packages provide a mechanism to group related collections of RPMs by functionality and provide a convenience mechanism for installation. A list of the available meta-packages and a brief description is presented in Table 2.
Table 2: Available OpenHPC Meta-packages (Group Name - Description)
ohpc-autotools - Collection of GNU autotools packages.
ohpc-base - Collection of base packages.
ohpc-base-compute - Collection of compute node base packages.
ohpc-ganglia - Collection of Ganglia monitoring and metrics packages.
ohpc-gnu7-io-libs - Collection of IO library builds for use with GNU compiler toolchain.
ohpc-gnu7-mpich-parallel-libs - Collection of parallel library builds for use with GNU compiler toolchain and the MPICH runtime.
ohpc-gnu7-mvapich2-parallel-libs - Collection of parallel library builds for use with GNU compiler toolchain and the MVAPICH2 runtime.
ohpc-gnu7-openmpi-parallel-libs - Collection of parallel library builds for use with GNU compiler toolchain and the OpenMPI runtime.
ohpc-gnu7-parallel-libs - Collection of parallel library builds for use with GNU compiler toolchain.
ohpc-gnu7-perf-tools - Collection of performance tool builds for use with GNU compiler toolchain.
ohpc-gnu7-python-libs - Collection of python related library builds for use with GNU compiler toolchain.
ohpc-gnu7-runtimes - Collection of runtimes for use with GNU compiler toolchain.
ohpc-gnu7-serial-libs - Collection of serial library builds for use with GNU compiler toolchain.
ohpc-intel-impi-parallel-libs - Collection of parallel library builds for use with Intel(R) Parallel Studio XE toolchain and the Intel(R) MPI Library.
ohpc-intel-io-libs - Collection of IO library builds for use with Intel(R) Parallel Studio XE software suite.
ohpc-intel-mpich-parallel-libs - Collection of parallel library builds for use with Intel(R) Parallel Studio XE toolchain and the MPICH runtime.
ohpc-intel-mvapich2-parallel-libs - Collection of parallel library builds for use with Intel(R) Parallel Studio XE toolchain and the MVAPICH2 runtime.
ohpc-intel-openmpi-parallel-libs - Collection of parallel library builds for use with Intel(R) Parallel Studio XE toolchain and the OpenMPI runtime.
ohpc-intel-perf-tools - Collection of performance tool builds for use with Intel(R) Parallel Studio XE toolchain.
ohpc-intel-python-libs - Collection of python related library builds for use with Intel(R) Parallel Studio XE toolchain.
ohpc-intel-runtimes - Collection of runtimes for use with Intel(R) Parallel Studio XE toolchain.
ohpc-intel-serial-libs - Collection of serial library builds for use with Intel(R) Parallel Studio XE toolchain.
ohpc-nagios - Collection of Nagios monitoring and metrics packages.
ohpc-slurm-client - Collection of client packages for SLURM.
ohpc-slurm-server - Collection of server packages for SLURM.
ohpc-warewulf - Collection of base packages for Warewulf provisioning.
  • 28. 28 Other new items for v1.3.1 Release • A new compiler variant (gnu7) was introduced - in the case of a fresh install, recipes default to installing the new variant along with matching runtimes and libraries - if upgrading a previously installed system, administrators can opt-in to enable the gnu7 variant • The meta-packages for “gnu7” provide a convenient mechanism to add on: - upgrade discussion in recipes (Appendix B) amended to highlight this workflow Install Guide (v1.3.1): CentOS7.3/x86 64 + xCAT + SLURM # Update default environment [sms]# yum -y remove lmod-defaults-gnu-mvapich2-ohpc [sms]# yum -y install lmod-defaults-gnu7-mvapich2-ohpc # Install GCC 7.x-compiled meta-packages with dependencies [sms]# yum -y install ohpc-gnu7-perf-tools ohpc-gnu7-serial-libs ohpc-gnu7-io-libs ohpc-gnu7-python-libs ohpc-gnu7-runtimes ohpc-gnu7-mpich-parallel-libs ohpc-gnu7-openmpi-parallel-libs ohpc-gnu7-mvapich2-parallel-libs parallel  libs  for   gnu7/mpich note:  could  skip  this  to  leave   previous  gnu  toolchain  as   default
  • 29. 29 Coexistence  of  multiple  variants $ module list Currently Loaded Modules: 1) autotools 2) prun/1.1 3) gnu7/7.1.0 4) mvapich2/2.2 5) ohpc $ module avail --------------------------- /opt/ohpc/pub/moduledeps/gnu7-mvapich2 ----------------------------- adios/1.11.0 imb/4.1 netcdf-cxx/4.3.0 scalapack/2.0.2 sionlib/1.7.1 boost/1.63.0 mpiP/3.4.1 netcdf-fortran/4.4.4 scalasca/2.3.1 superlu_dist/4.2 fftw/3.3.6 mumps/5.1.1 petsc/3.7.6 scipy/0.19.0 tau/2.26.1 hypre/2.11.1 netcdf/4.4.1.1 phdf5/1.10.0 scorep/3.0 trilinos/12.10.1 ------------------------------- /opt/ohpc/pub/moduledeps/gnu7 ---------------------------------- R_base/3.3.3 metis/5.1.0 numpy/1.12.1 openmpi/1.10.7 gsl/2.3 mpich/3.2 ocr/1.0.1 pdtoolkit/3.23 hdf5/1.10.0 mvapich2/2.2 (L) openblas/0.2.19 superlu/5.2.1 --------------------------------- /opt/ohpc/pub/modulefiles ------------------------------------ EasyBuild/3.2.1 gnu/5.4.0 ohpc (L) singularity/2.3 autotools (L) gnu7/7.1.0 (L) papi/5.5.1 valgrind/3.12.0 clustershell/1.7.3 hwloc/1.11.6 prun/1.1 (L) previously   installed  from   1.3  release everything   else  from   1.3.1   updates   (add-­on) Consider an example of system originally installed from 1.3 base release and then added gnu7 variant using commands from last slide
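Because both toolchains coexist, a user can move between them and let Lmod's module hierarchy swap the dependent stacks; a brief hedged sketch using the module names from the listing above:
$ module swap gnu7 gnu/5.4.0   # fall back to the older toolchain; Lmod reloads dependent modules (e.g. mvapich2) built against it
$ module list                  # confirm the gnu/5.4.0-based stack is now active
$ module swap gnu/5.4.0 gnu7   # return to the gnu7 variant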
  • 30. Summary • Technical Steering Committee just completed its first year of operation; new membership selection for 2017-2018 in place • Provided a highlight of changes/evolutions that have occurred since MUG’16 - 4 releases since last MUG - architecture addition (ARM) - multiple provisioner/resource manager recipes • “Getting Started” Tutorial held at PEARC’17 last month - more in-depth overview: https://goo.gl/NyiDmr 30 Community Resources: http://openhpc.community (general info) https://github.com/openhpc/ohpc (main GitHub site) https://github.com/openhpc/submissions (new submissions) https://build.openhpc.community (build system/repos) http://www.openhpc.community/support/mail-lists/ (mailing lists)