Feedback on Big Compute & HPC on Windows Azure


Is the cloud relevant for high-performance workloads? We answer by sharing our experience: HPC consultants at ANEO have ported and optimized a distributed scientific application developed at Supelec, moving it from its Linux cluster to Microsoft's new cloud technology, Big Compute (InfiniBand node interconnect).


  1. Feedback on Big Compute & HPC on Windows Azure
     Antoine Poliakov, HPC Consultant, ANEO
     apoliakov@aneo.fr | http://blog.aneo.eu
  2. Introduction: HPC, a challenge for the cloud
     • Cloud: on-demand access, through a telecommunications network, to shared and user-configurable IT resources
     • HPC (High Performance Computing): a branch of computer science concerned with maximizing software efficiency, in particular execution speed
       – Raw computing power doubles every 1.5 to 2 years
       – Network throughput doubles every 2 to 3 years
       – The compute/network gap doubles every 5 years
     • HPC in the cloud makes computing power accessible to all (SMEs, research labs, etc.) → fosters innovation
     • Our question: can the cloud offer sufficient performance for HPC workloads?
       – CPU: 100% of native speed
       – RAM: 99% of native speed
       – Network: ???
  3. Introduction: 3 ingredients yield an answer through experimentation
     • Technology: an HPC-oriented cloud
     • Use case: HPC software
     • Experiments: the state of the art of HPC in the cloud
  4. Introduction: experimenting on HPC in the cloud, our approach
     Identify technologies and partners
     • An HPC software use case
     • An efficient cloud computing service
     Port the HPC application code: cluster → cloud
     • Skills improvement
     • Feedback on the technologies
     Experiment and measure performance
     • Scaling
     • Data transfers
  5. Introduction: a collaborative project with 3 complementary actors
     • Consulting firm, organization and technologies: HPC Practice, fast/massive information processing for finance and industry.
       Goals: identify the most relevant use cases for our clients; estimate the complexity of porting and deploying an app; evaluate whether the solution is production-ready.
     • Established HPC research teams: distributed software & big data; machine learning and interactive systems.
       Goals: is the cloud ready for scientific computing? What is specific to deploying in the cloud? Performance.
     • Provider: Windows Azure offers a cloud solution aimed at HPC workloads, Azure Big Compute.
       Goals: pre-release feedback; inside view of an HPC cluster-to-cloud transition.
  6. Introduction: dedicated and competent teams, thank you all!
     • Consulting (ported and deployed the application in the cloud, led the benchmarks): Antoine Poliakov, HPC Consultant; Constantinos Makassikis, HPC Consultant; Wilfried Kirschenmann, HPC Consultant; Kévin Dehlinger, computer science intern, CNAM
     • Research (use case: distributed audio segmentation; experiment analysis): Stéphane Vialle, Professor, Computer Science; Stéphane Rossignol, Assistant Professor, Signal Processing
     • Provider (created the technical solution, made available notable computational power): Xavier Pillons, Principal Program Manager, Windows Azure CAT
  7. Presentation contents
     1. Technical context
     2. Feedback on porting the application
     3. Optimizations
     4. Results
  8. 1. TECHNICAL CONTEXT
     a. Azure Big Compute
     b. ParSon
  9. Azure Big Compute = new Azure nodes + HPC Pack
     • New nodes: A8 and A9
       – 2x8 Sandy Bridge cores (E5-2670 @ 2.6 GHz), 112 GB DDR3 @ 1.6 GHz
       – InfiniBand (NetworkDirect @ 40 Gbit/s): RDMA via MS-MPI @ 3.5 GB/s, 3 µs latency
       – IP over Ethernet @ 10 Gbit/s; HDD 2 TB @ 250 MB/s
       – Azure hypervisor
     • HPC Pack
       – Task scheduler middleware: Cluster Manager + SDK
       – Tested with 50k cores in Azure
       – Free extension pack: any Windows Server install can become a node
  10. HPC Pack: on-premise cluster
      • Active Directory, manager and nodes in a privately managed infrastructure
      • Cluster dimensioned w.r.t. maximal workload
      • Administration: hardware + software
      [Diagram: manager (M), Active Directory (AD) and compute nodes (N), all on premise]
  11. HPC Pack: in the Azure Big Compute cloud
      • Active Directory and manager in the cloud (VMs)
      • Node allocation and pricing on demand
      • Administration: software only
      [Diagram: manager (M) and AD as IaaS VMs, PaaS compute nodes (N), accessed via remote desktop/CLI]
  12. HPC Pack: hybrid deployment
      • Active Directory and manager on premise
      • Nodes both in the datacenter and in the cloud, connected through a VPN
      • Local dimensioning w.r.t. average load; dynamic cloud dimensioning absorbs peaks
      • Administration: software + hardware
      [Diagram: on-premise manager (M), AD and nodes (N), plus cloud nodes (N) reached over a VPN]
  13. ParSon: an audio segmentation scientific software
      • ParSon = audio segmentation algorithm: voice / music
        1. Supervised training on known audio samples to calibrate the classifier
        2. Classification based on spectral analysis (FFT) over sliding windows (a rough sketch follows this slide)
      [Diagram: digital audio → ParSon → segmentation and classification into voice and music]
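     The FFT front end is only described at this level of detail, so here is a rough sketch (our code, not ParSon's) of sliding-window spectral analysis with fftw, the FFT library listed among the dependencies on slide 17. The window size, hop and energy feature are illustrative assumptions.

     // Hedged sketch, not ParSon's code: sliding-window spectral analysis
     // with fftw. N, hop and the energy feature are illustrative assumptions.
     #include <fftw3.h>
     #include <cmath>
     #include <cstddef>
     #include <vector>

     std::vector<double> windowEnergies(const std::vector<double>& samples) {
         const int N = 1024, hop = 512;   // assumed window size and overlap
         const double PI = 3.14159265358979323846;

         std::vector<double> in(N);
         fftw_complex* out = fftw_alloc_complex(N / 2 + 1);
         fftw_plan plan = fftw_plan_dft_r2c_1d(N, in.data(), out, FFTW_ESTIMATE);

         std::vector<double> energies;
         for (std::size_t start = 0; start + N <= samples.size(); start += hop) {
             for (int i = 0; i < N; ++i)  // Hann window applied to each frame
                 in[i] = samples[start + i] * 0.5 * (1.0 - std::cos(2.0 * PI * i / (N - 1)));
             fftw_execute(plan);
             double e = 0.0;              // total spectral energy as a toy feature
             for (int k = 0; k <= N / 2; ++k)
                 e += out[k][0] * out[k][0] + out[k][1] * out[k][1];
             energies.push_back(e);
         }
         fftw_destroy_plan(plan);
         fftw_free(out);
         return energies;
     }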
  14. ParSon is distributed with OpenMP + MPI
      1. Upload input files to the NAS
      2. OAR reserves N computers
      3. Input deployment to the reserved computers
      4. MPI exec
      5. Tasks run with heavy inter-communication
      6. Get outputs
      [Diagram: Linux cluster; control flows through OAR, data through the NAS]
  15. ParSon performance is limited by data transfers
      [Log-log plot: best runtime (s, 8 to 2048) vs number of nodes (1 to 256). When nodes read from the NAS over the network with a cold cache, runtime becomes IO bound; when nodes read locally, runtime keeps decreasing.]
  16. 2. PORTING THE APPLICATION
      a. Porting C++ code: Linux → Windows
      b. Porting the distribution strategy: cluster → HPC Cluster Manager
      c. Porting and adapting deployment scripts
  17. Porting: standards conformance = easy Linux → Windows porting
      • ParSon and Visual C++ conform to the C++ standard → few code changes (a minimal example follows this slide)
      • Dependencies are the standard libraries plus cross-platform scientific libraries: libsnd, fftw
      • Thanks to MS-MPI, inter-process communication code doesn't change
      • Visual Studio natively supports OpenMP
      • The only task left was translating the build files: Makefiles → Visual C++ projects
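     As a minimal illustration of what standards conformance buys (our code, not ParSon's): the same hybrid MPI + OpenMP source builds unchanged with g++ and an MPI implementation on Linux, and with Visual C++ and MS-MPI on Windows.

     // Minimal hybrid MPI + OpenMP example (ours, not ParSon's): one source,
     // two platforms. Each MPI process spawns an OpenMP thread team.
     #include <mpi.h>
     #include <omp.h>
     #include <cstdio>

     int main(int argc, char** argv) {
         MPI_Init(&argc, &argv);
         int rank = 0, nprocs = 1;
         MPI_Comm_rank(MPI_COMM_WORLD, &rank);
         MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

     #pragma omp parallel                 // one thread team per MPI process
         {
             std::printf("process %d/%d, thread %d/%d\n",
                         rank, nprocs, omp_get_thread_num(), omp_get_num_threads());
         }

         MPI_Finalize();
         return 0;
     }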
  18. Porting: ParSon in the cluster
      1. Upload input file to the NAS
      2. OAR reserves N computers
      3. Input deployment to the reserved computers
      4. MPI exec
      5. Run and inter-communicate
      6. Get output
      [Diagram: Linux cluster; control flows through OAR, data through the NAS]
  19. Porting: ParSon in the Azure cloud
      1. Upload input file through the HPC Pack SDK to Azure Storage
      2. The HPC Cluster Manager reserves N nodes
      3. Input deployment to the provisioned A9 nodes
      4. MPI exec
      5. Run and inter-communicate
      6. Get output
      [Diagram: the manager and the AD domain controller run as IaaS VMs; the A9 compute nodes are PaaS (Big Compute); data flows through Azure Storage, control through the manager]
  20. Porting: deployment within Azure
      At every software update: package + send to the cloud
      1. Send to the manager, either through Azure Storage (Set-AzureStorageBlobContent → Get-AzureStorageBlobContent, or hpcpack create; hpcpack upload → hpcpack download), or through a normal transfer via an internet-accessible file server (FileZilla, etc.)
      2. Packaging script: mkdir, copy, etc.; hpcpack create
      3. Send to Azure Storage: hpcpack upload
      At every node provisioning: local copy
      1. Remotely execute on the nodes from the manager with clusrun
      2. hpcpack download
      3. powershell -command "Set-ExecutionPolicy RemoteSigned"
         Invoke-Command -FilePath … -Credential …
         Start-Process powershell -Verb runAs -ArgumentList …
      4. Installation: %deployedPath%\deployScript.ps1
  21. Porting: this first working setup has some limitations
      • Transferring the input file takes longer than the sequential computation on a single thread
      • On many cores, computation time is negligible compared to transfers
      • WAV format headers and the ParSon code limit input size to 4 GB
  22. 3. OPTIMIZATIONS
  23. Optimizations, methodology: suppress the bottleneck
      The identified bottleneck is the input file transfer:
      1. Disk write throughput (300 MB/s) → use a RAMFS
      2. Azure Storage access (QoS 1.6 GB/s) → download only once from the storage account, then broadcast through InfiniBand
      3. Large input files (60 GB) → FLAC lossless compression (level 8) halves the size and is not limited to 4 GB → declare all counters as 64-bit ints in the C++ code (an illustration follows this slide)
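     To make point 3 concrete, an illustrative overflow (our example, not ParSon's code): with 16-bit samples, a 32-bit byte counter wraps at 4 GB, far short of the 60 GB inputs above, hence 64-bit counters throughout.

     // Illustrative only: why counters must be 64-bit for inputs beyond 4 GB.
     #include <cstdint>
     #include <cstdio>

     int main() {
         // ~24 hours of stereo 16-bit 48 kHz audio: about 16.6 GB of samples.
         std::int64_t nSamples = 24LL * 3600 * 48000 * 2;
         std::int64_t bytes64  = nSamples * 2;            // 64-bit: correct
         std::uint32_t bytes32 = (std::uint32_t)bytes64;  // 32-bit counter wraps
         std::printf("64-bit count: %lld bytes, 32-bit count: %u bytes\n",
                     (long long)bytes64, bytes32);
         return 0;
     }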
  24. Optimizations: accelerating local data access with a RAM filesystem
      • RAMFS = filesystem stored in a RAM block
        – Very fast
        – Limited capacity, non-persistent
      • ImDisk
        – Lightweight: driver + service + command line
        – Open source, but signed for Win64
      • Scripted silent install, run at every node provisioning:
        hpcpack create …
        rundll32 setupapi.dll,InstallHinfSection DefaultInstall 128 disk.inf
        Start-Service -inputobject $(get-service -Name imdisk)
        imdisk.exe -a -t vm -s 30G -m F: -o rw
        format F: /fs:ntfs /x /q /Y
        $acl = Get-Acl F:
        $acl.AddAccessRule(…FileSystemAccessRule("Everyone","Write", …))
        Set-Acl F: $acl
  25. Optimizations: accelerating input file deployment
      • All standard transfer systems go through the Ethernet interface:
        – Azure Storage access via the Azure and HPC Pack SDKs
        – Windows shares or CIFS network drives
        – Standard file transfer protocols: FTP, NFS, etc.
      • The simplest way to leverage InfiniBand is through MPI (see the sketch after this slide):
        1. On one node, download the input file: Azure → RAMFS
        2. mpiexec broadcast.exe: 1 process per node
           – We developed a command-line utility in C++ / MPI
           – If id = 0, it reads the RAMFS in 4 MB blocks and sends them to the other nodes through InfiniBand: MPI_Bcast
           – If id ≠ 0, it receives the data blocks and saves them to the RAMFS
           – It uses the Win32 API: faster than the standard library abstractions
        3. The input data ends up in the RAM of all nodes, accessible as a file by the application
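     The broadcast utility itself was not published with the slides, so the following is our reconstruction of the scheme described above: rank 0 streams the file in 4 MB blocks with MPI_Bcast. It uses portable stdio rather than the Win32 calls the slide mentions, and the RAMFS path is hypothetical.

     // Our reconstruction of the broadcast scheme, not the original utility:
     // rank 0 reads the input from the RAMFS and streams it to every other
     // node in 4 MB blocks over MPI_Bcast (InfiniBand via MS-MPI).
     #include <mpi.h>
     #include <algorithm>
     #include <cstdint>
     #include <cstdio>
     #include <vector>

     int main(int argc, char** argv) {
         MPI_Init(&argc, &argv);
         int rank = 0;
         MPI_Comm_rank(MPI_COMM_WORLD, &rank);

         const char* path = (argc > 1) ? argv[1] : "F:\\input.flac";  // hypothetical RAMFS path
         const std::int64_t BLOCK = 4 * 1024 * 1024;                  // 4 MB blocks, as on the slide

         // Rank 0 measures the file; every rank learns the size first.
         // (A production build would use 64-bit offsets, e.g. _fseeki64/_ftelli64.)
         std::int64_t size = 0;
         std::FILE* f = nullptr;
         if (rank == 0) {
             f = std::fopen(path, "rb");
             std::fseek(f, 0, SEEK_END);
             size = std::ftell(f);
             std::fseek(f, 0, SEEK_SET);
         } else {
             f = std::fopen(path, "wb");
         }
         MPI_Bcast(&size, 1, MPI_INT64_T, 0, MPI_COMM_WORLD);

         std::vector<char> buf((std::size_t)BLOCK);
         for (std::int64_t done = 0; done < size; ) {
             int n = (int)std::min(BLOCK, size - done);
             if (rank == 0) n = (int)std::fread(buf.data(), 1, (std::size_t)n, f);
             MPI_Bcast(buf.data(), n, MPI_BYTE, 0, MPI_COMM_WORLD);   // over InfiniBand
             if (rank != 0) std::fwrite(buf.data(), 1, (std::size_t)n, f);
             done += n;
         }
         if (f) std::fclose(f);
         MPI_Finalize();
         return 0;
     }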
  26. 4. RESULTS
  27. Results: computations scale well, especially for bigger files
      [Two log-log plots vs number of cores, for different input sizes: computation time (s), and computation efficiency (real speedup / ideal speedup). Efficiency is best for the bigger inputs.]
  28. Results: input file transfers make global scaling worse
      [Two plots vs number of cores (log): efficiency (real speedup / ideal speedup) for compute only vs compute including transfers, and the time decomposition (s, log) for one hour of input audio, raw compute vs transfers.]
  29. Results: Azure Storage download and broadcast performance
      • Consistent storage throughput (220 MB/s), though latency may be high
      • Broadcast constant @ 700 MB/s
      [Plots: download time (min) vs file size (GB); broadcast time (s, log) vs number of machines.]
  30. 5. CONCLUSION
  31. Our feedback on the Big Compute technology
      • HPC standards conformance (C++, OpenMP, MPI): ported in 10 work days
      • Solid performance: compute (CPU, RAM) and network (InfiniBand between nodes)
      • Community, Microsoft: reactive support
      • Intuitive user interfaces: manage.windowsazure.com and HPC Cluster Manager
      • Everything is scriptable & programmable; the cloud is more flexible than a cluster; unified management of cloud and on-premise
      • Data transfers:
        – Azure Storage latency is sometimes high
        – Azure Storage QoS is limited → users must implement striping across multiple storage accounts
        – HDDs are slow (for HPC), even on A9
        – Node ↔ manager transfers must go through Azure Storage: less convenient than conventional remote file systems
      • Node administration: provisioning time must be taken into account (~7 min)
  32. Azure Big Compute for research and business
      • Predictable, pay-what-you-use cost model
      • Modern design, extensive documentation, efficient support
      • Decreased need for administration, though still needed on the software side
      For research:
      • Access to compute without any barrier: paperwork, finance, etc.
      • A supercomputer for all, without investment
      • Elastic scaling: on-demand sizing; start your workload in minutes
      • For squeezing in a few more runs before that conference's (extended) deadline
      • Well suited to researchers in distributed computing: parametric experiments
      For business:
      • Interoperable with Windows clusters: the cloud absorbs peaks, the best of both worlds
      • Datacenters in the EU: Ireland + Netherlands
  33. Thank you for your attention
      • Antoine Poliakov, apoliakov@aneo.fr
      • Stéphane Vialle, stephane.vialle@supelec.fr
      • ANEO: http://aneo.eu | http://blog.aneo.eu
      • Meet us at TechDays! ANEO booth, Thursday 11:30 to 13:00, "Au cœur du SI > Infrastructure moderne avec Azure"
      All our thanks to Microsoft for lending us the nodes.
      A question? Don't hesitate!
  34. Digital is business
