SlideShare a Scribd company logo
1 of 24
Download to read offline
High Performance Computing

        Adam DeConinck
       R Systems NA, Inc.




        1
Development of models begins at small scale.

Working on your laptop is convenient, simple.

Actual analysis, however, is slow.




    2
Development of models begins at small scale.

Working on your laptop is convenient, simple.

Actual analysis, however, is slow.


“Scaling up” typically means a small server or
fast multi-core desktop.

Speedup exists, but for very large models, not
significant.

Single machines don't scale up forever.


    3
For the largest models, a different approach is required.


                    4
High-Performance Computing involves many
  distinct computer processors working together on
  the same calculation.

Large problems are divided into smaller parts and
  distributed among the many computers.

Usually clusters of quasi-independent computers
  which are coordinated by a central scheduler.


                   5
Typical HPC Cluster

             Login
External
connection                 Ethernet network




                     Scheduler



                                               Computes

                     File Server
                                              High-speed network
                                              (10GigE / Infiniband)

                      6
Performance gains
    High-end
    workstation



                    Duration (s)




                                            Number of cores


     Performance test: stochastic finance model on R Systems cluster

     High-end workstation: 8 cores. Maximum speedup of 20x: 4.5 hrs → 14 minutes
      
          Scale-up heavily model-dependent: 5x – 100x in our tests, can be faster

     No more performance gain after ~500 cores: why? Some operations can't be parallelized.

     Additional cores? Run multiple models simultaneously


                                     7
Performance comes at a price: complexity.



    New paradigm: real-time analysis vs batch jobs.

    Applications must be written specifically to take
    advantage of distributed computing.

    Performance characteristics of applications change.

    Debugging becomes more of a challenge.



                     8
New paradigm: real-time analysis vs batch jobs.




Most small analyses are done in    Large jobs are typically done in a
  real time:                          batch model:

    “At-your-desk” analysis        
                                       Submit job to a queue

    Small models only              
                                       Much larger models

    Fast iterations                
                                       Slow iterations

    No waiting for resources       
                                       May need to wait



                               9
Applications must be written specifically to
  take advantage of distributed
  computing.

    Explicitly split your problem into smaller
    “chunks”

    “Message passing” between processes

    Entire computation can be slowed by one
    or two slow chunks

    Exception: “embarrassingly parallel”
    problems

    Easy-to-split, independent chunks of
    computation

    Thankfully, many useful models fall under
                                                   “Embarrassingly parallel” =
    this heading. (e.g. stochastic models)       No inter-process communication

                              10
Performance characteristics of applications change.


On a single machine:        On a cluster:

    CPU speed (compute)     
                                Single-machine metrics

    Cache                   
                                Network

    Memory                  
                                File server

    Disk                    
                                Scheduler contention
                            
                                Results from other nodes



                   11
Debugging becomes more of a challenge.


    More complexity = more pieces that can fail

    Race conditions: sequence of events no longer deterministic

    Single nodes can “stall” and slow the entire computation

    Scheduler, file server, login server all have their own challenges




                          12
External resources

    One solution to handling complexity: outsource it!

    Historical HPC facilities: universities, national labs
    
        Often have the most absolute compute capacity, and will sell
        excess capacity
    
        Competition with academic projects, typically do not include
        SLA or high-level support

    Dedicated commercial HPC facilities providing “on-demand”
    compute power.



                           13
External HPC                      Internal HPC

    Outsource HPC sysadmin        
                                      Requires in-house expertise

    No hardware investment        
                                      Major investment in hardware

    Pay-as-you-go                 
                                      Possible idle time

    Easy to migrate to new tech   
                                      Upgrades require new hardware




                          14
Internal HPC                          External HPC

    No external contention            
                                          No guaranteed access

    All internal—easy security        
                                          Security arrangements complex

    Full control over configuration   
                                          Limited control of configuration

    Simpler licensing control         
                                          Some licensing complex


    Requires in-house expertise       
                                          Outsource HPC sysadmin

    Major investment in hardware      
                                          No hardware investment

    Possible idle time                
                                          Pay-as-you-go

    Upgrades require new hardware     
                                          Easy to migrate to new tech



                             15
“The Cloud”

    “Cloud computing”: virtual machines, dynamic allocation of resources in
    an external resource

    Lower performance (virtualization), higher flexibility

    Usually no contracts necessary: pay with your credit card, get 16 nodes

    Often have to do all your own sysadmin

    Low support, high control



                              16
CASE STUDY:
Windows cluster for Actuarial
       Application




         17
Global insurance company


    Needed 500-1000 cores on a temporary basis

    Preferred a utility, “pay-as-you-go” model

    Experimenting with external resources for “burst”
    capacity during high-activity periods

    Commercially licensed and supported application

    Requested a proof of concept

                     18
Cluster configuration

    Application embarrassingly parallel, small-to-medium data files,
    computationally and memory-intensive
    
        Prioritize computation (processors), access to fileserver over
        inter-node communication, large storage
    
        Upgraded memory in compute nodes to 2 GB/core

    128-node cluster: 3.0 GHz Intel Xeon processors, 8 cores per node for
    1024 cores total

    Windows 2008 HPC R2 operating system

    Application and fileserver on login node


                            19
Stumbling blocks

    Application optimization
Customer had a wide variety of models which generated different usage
  patterns. (IO, compute, memory-intensive jobs) Required dynamic
  reconfiguration for different conditions.

    Technical issue
Iterative testing process. Application turned out to be generating massive
   fileserver contention. Had to make changes to both software and hardware.

    Human processes
    Users were accustomed to internal access model. Required changes both
    for providers (increase ease-of-use) and users (change workflow)

    Security
    Customer had never worked with an external provider before. Complex
    internal security policy had to be reconciled with remote access.
                            20
Lessons learned:


    Security was the biggest delaying factor. The initial security setup took over
    3 months from the first expression of interest, even though cluster setup
    was done in less than a week.
    
        Only mattered the first time though: subsequent runs started much
        more smoothly.

    A low-cost proof-of-concept run was important to demonstrate feasibility,
    and for working the bugs out.

    A good relationship with the application vendor was extremely important
    to solving problems and properly optimizing the model for performance.



                              21
Recent developments: GPUs




       22
Graphics processing units




    CPU: complex, general-purpose processor

    GPU: highly-specialized parallel processor, optimized for performing operations for
    common graphics routines

    Highly specialized → many more “cores” for same cost and space
     
         Intel Core i7: 4 cores @ 3.4 GHz: $300 = $75/core
     
         NVIDIA Tesla M2070: 448 cores @ 575 MHz: $4500 = $10/core

    Also higher bandwidth: 100+ GB/s for GPU vs 10-30 GB/s for CPU

    Same operations can be adapted for non-graphics applications: “GPGPU”

                                       23
                      Image from http://blogs.nvidia.com/2009/12/whats-the-difference-between-a-cpu-and-a-gpu/
HPC/Actuarial using GPUs
                                                   
                                                        Random-number generation
                                                   
                                                        Finite-difference modeling
                                                   
                                                        Image processing

                                                   
                                                        Numerical Algorithms Group:
                                                        GPU random-number generator
                                                   
                                                        MATLAB: operations on large arrays/matrices
                                                   
                                                        Wolfram Mathematica: symbolic math analysis


Data from
http://www.nvidia.com/object/computational_finan
ce.html




                                                   24

More Related Content

What's hot

High Performance Computing
High Performance ComputingHigh Performance Computing
High Performance ComputingDivyen Patel
 
High Performance Computing Presentation
High Performance Computing PresentationHigh Performance Computing Presentation
High Performance Computing Presentationomar altayyan
 
High performance computing
High performance computingHigh performance computing
High performance computingGuy Tel-Zur
 
Introduction to High Performance Computing
Introduction to High Performance ComputingIntroduction to High Performance Computing
Introduction to High Performance ComputingUmarudin Zaenuri
 
Accelerating TensorFlow with RDMA for high-performance deep learning
Accelerating TensorFlow with RDMA for high-performance deep learningAccelerating TensorFlow with RDMA for high-performance deep learning
Accelerating TensorFlow with RDMA for high-performance deep learningDataWorks Summit
 
High Performance Computing - The Future is Here
High Performance Computing - The Future is HereHigh Performance Computing - The Future is Here
High Performance Computing - The Future is HereMartin Hamilton
 
Introducing ucx unified communications x framework
Introducing ucx unified communications x frameworkIntroducing ucx unified communications x framework
Introducing ucx unified communications x frameworkinside-BigData.com
 
Stability Patterns for Microservices
Stability Patterns for MicroservicesStability Patterns for Microservices
Stability Patterns for Microservicespflueras
 
Overview of HPC.pptx
Overview of HPC.pptxOverview of HPC.pptx
Overview of HPC.pptxsundariprabhu
 
Apache Kafka Streams + Machine Learning / Deep Learning
Apache Kafka Streams + Machine Learning / Deep LearningApache Kafka Streams + Machine Learning / Deep Learning
Apache Kafka Streams + Machine Learning / Deep LearningKai Wähner
 
Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in HadoopBackup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadooplarsgeorge
 
SK hynix CXL Disaggregated Memory Solution
SK hynix CXL Disaggregated Memory SolutionSK hynix CXL Disaggregated Memory Solution
SK hynix CXL Disaggregated Memory SolutionMemory Fabric Forum
 
Introduction to Apache NiFi dws19 DWS - DC 2019
Introduction to Apache NiFi   dws19 DWS - DC 2019Introduction to Apache NiFi   dws19 DWS - DC 2019
Introduction to Apache NiFi dws19 DWS - DC 2019Timothy Spann
 
Modular by Design: Supermicro’s New Standards-Based Universal GPU Server
Modular by Design: Supermicro’s New Standards-Based Universal GPU ServerModular by Design: Supermicro’s New Standards-Based Universal GPU Server
Modular by Design: Supermicro’s New Standards-Based Universal GPU ServerRebekah Rodriguez
 

What's hot (20)

High Performance Computing
High Performance ComputingHigh Performance Computing
High Performance Computing
 
High Performance Computing Presentation
High Performance Computing PresentationHigh Performance Computing Presentation
High Performance Computing Presentation
 
High performance computing
High performance computingHigh performance computing
High performance computing
 
Introduction to High Performance Computing
Introduction to High Performance ComputingIntroduction to High Performance Computing
Introduction to High Performance Computing
 
Introduction to SLURM
Introduction to SLURMIntroduction to SLURM
Introduction to SLURM
 
Accelerating TensorFlow with RDMA for high-performance deep learning
Accelerating TensorFlow with RDMA for high-performance deep learningAccelerating TensorFlow with RDMA for high-performance deep learning
Accelerating TensorFlow with RDMA for high-performance deep learning
 
High Performance Computing - The Future is Here
High Performance Computing - The Future is HereHigh Performance Computing - The Future is Here
High Performance Computing - The Future is Here
 
Introducing ucx unified communications x framework
Introducing ucx unified communications x frameworkIntroducing ucx unified communications x framework
Introducing ucx unified communications x framework
 
Stability Patterns for Microservices
Stability Patterns for MicroservicesStability Patterns for Microservices
Stability Patterns for Microservices
 
Overview of HPC.pptx
Overview of HPC.pptxOverview of HPC.pptx
Overview of HPC.pptx
 
GPU Computing
GPU ComputingGPU Computing
GPU Computing
 
Tensor Processing Unit (TPU)
Tensor Processing Unit (TPU)Tensor Processing Unit (TPU)
Tensor Processing Unit (TPU)
 
Introduction to SLURM
Introduction to SLURMIntroduction to SLURM
Introduction to SLURM
 
Apache Kafka Streams + Machine Learning / Deep Learning
Apache Kafka Streams + Machine Learning / Deep LearningApache Kafka Streams + Machine Learning / Deep Learning
Apache Kafka Streams + Machine Learning / Deep Learning
 
Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in HadoopBackup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadoop
 
SK hynix CXL Disaggregated Memory Solution
SK hynix CXL Disaggregated Memory SolutionSK hynix CXL Disaggregated Memory Solution
SK hynix CXL Disaggregated Memory Solution
 
CXL Fabric Management Standards
CXL Fabric Management StandardsCXL Fabric Management Standards
CXL Fabric Management Standards
 
Introduction to Apache NiFi dws19 DWS - DC 2019
Introduction to Apache NiFi   dws19 DWS - DC 2019Introduction to Apache NiFi   dws19 DWS - DC 2019
Introduction to Apache NiFi dws19 DWS - DC 2019
 
Cluster Schedulers
Cluster SchedulersCluster Schedulers
Cluster Schedulers
 
Modular by Design: Supermicro’s New Standards-Based Universal GPU Server
Modular by Design: Supermicro’s New Standards-Based Universal GPU ServerModular by Design: Supermicro’s New Standards-Based Universal GPU Server
Modular by Design: Supermicro’s New Standards-Based Universal GPU Server
 

Viewers also liked

High Performance Computing using MPI
High Performance Computing using MPIHigh Performance Computing using MPI
High Performance Computing using MPIAnkit Mahato
 
High performance concrete ppt
High performance concrete pptHigh performance concrete ppt
High performance concrete pptGoogle
 
Intro to High Performance Computing in the AWS Cloud
Intro to High Performance Computing in the AWS CloudIntro to High Performance Computing in the AWS Cloud
Intro to High Performance Computing in the AWS CloudAmazon Web Services
 
INCITE - INtegrated Components for Interactive TEaching
INCITE - INtegrated Components for Interactive TEachingINCITE - INtegrated Components for Interactive TEaching
INCITE - INtegrated Components for Interactive TEachingDragos Sbîrlea
 
High Performance Statistical Computing
High Performance Statistical ComputingHigh Performance Statistical Computing
High Performance Statistical ComputingMicah Altman
 
High performance computing
High performance computingHigh performance computing
High performance computingMaher Alshammari
 
Kalray TURBOCARD2 @ ISC'14
Kalray TURBOCARD2 @ ISC'14Kalray TURBOCARD2 @ ISC'14
Kalray TURBOCARD2 @ ISC'14KALRAY
 
High Performance Computing in the Cloud?
High Performance Computing in the Cloud?High Performance Computing in the Cloud?
High Performance Computing in the Cloud?Ian Lumb
 
High Performance Computing: The Essential tool for a Knowledge Economy
High Performance Computing: The Essential tool for a Knowledge EconomyHigh Performance Computing: The Essential tool for a Knowledge Economy
High Performance Computing: The Essential tool for a Knowledge EconomyIntel IT Center
 
AWS Webcast - An Introduction to High Performance Computing on AWS
AWS Webcast - An Introduction to High Performance Computing on AWSAWS Webcast - An Introduction to High Performance Computing on AWS
AWS Webcast - An Introduction to High Performance Computing on AWSAmazon Web Services
 
Ibm spectrum scale fundamentals workshop for americas part 5 ess gnr-usecases...
Ibm spectrum scale fundamentals workshop for americas part 5 ess gnr-usecases...Ibm spectrum scale fundamentals workshop for americas part 5 ess gnr-usecases...
Ibm spectrum scale fundamentals workshop for americas part 5 ess gnr-usecases...xKinAnx
 
GPFS - graphical intro
GPFS - graphical introGPFS - graphical intro
GPFS - graphical introAlex Balk
 
Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...
Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...
Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...Geoffrey Fox
 
Unix _linux_fundamentals_for_hpc-_b
Unix  _linux_fundamentals_for_hpc-_bUnix  _linux_fundamentals_for_hpc-_b
Unix _linux_fundamentals_for_hpc-_bMohammad Reza Beygi
 
Parasitic Computing
Parasitic ComputingParasitic Computing
Parasitic Computingjojothish
 
Accelerating Hadoop, Spark, and Memcached with HPC Technologies
Accelerating Hadoop, Spark, and Memcached with HPC TechnologiesAccelerating Hadoop, Spark, and Memcached with HPC Technologies
Accelerating Hadoop, Spark, and Memcached with HPC Technologiesinside-BigData.com
 
Delivering Transformational Solutions to Industry by Dr. Frederick Streitz, D...
Delivering Transformational Solutions to Industry by Dr. Frederick Streitz, D...Delivering Transformational Solutions to Industry by Dr. Frederick Streitz, D...
Delivering Transformational Solutions to Industry by Dr. Frederick Streitz, D...Industrial Partnerships Office
 

Viewers also liked (20)

High Performance Computing using MPI
High Performance Computing using MPIHigh Performance Computing using MPI
High Performance Computing using MPI
 
High performance concrete ppt
High performance concrete pptHigh performance concrete ppt
High performance concrete ppt
 
Intro to High Performance Computing in the AWS Cloud
Intro to High Performance Computing in the AWS CloudIntro to High Performance Computing in the AWS Cloud
Intro to High Performance Computing in the AWS Cloud
 
INCITE - INtegrated Components for Interactive TEaching
INCITE - INtegrated Components for Interactive TEachingINCITE - INtegrated Components for Interactive TEaching
INCITE - INtegrated Components for Interactive TEaching
 
JAWS
JAWSJAWS
JAWS
 
High Performance Statistical Computing
High Performance Statistical ComputingHigh Performance Statistical Computing
High Performance Statistical Computing
 
High performance computing
High performance computingHigh performance computing
High performance computing
 
Kalray TURBOCARD2 @ ISC'14
Kalray TURBOCARD2 @ ISC'14Kalray TURBOCARD2 @ ISC'14
Kalray TURBOCARD2 @ ISC'14
 
High Performance Computing in the Cloud?
High Performance Computing in the Cloud?High Performance Computing in the Cloud?
High Performance Computing in the Cloud?
 
Current Trends in HPC
Current Trends in HPCCurrent Trends in HPC
Current Trends in HPC
 
High Performance Computing: The Essential tool for a Knowledge Economy
High Performance Computing: The Essential tool for a Knowledge EconomyHigh Performance Computing: The Essential tool for a Knowledge Economy
High Performance Computing: The Essential tool for a Knowledge Economy
 
AWS Webcast - An Introduction to High Performance Computing on AWS
AWS Webcast - An Introduction to High Performance Computing on AWSAWS Webcast - An Introduction to High Performance Computing on AWS
AWS Webcast - An Introduction to High Performance Computing on AWS
 
Ibm spectrum scale fundamentals workshop for americas part 5 ess gnr-usecases...
Ibm spectrum scale fundamentals workshop for americas part 5 ess gnr-usecases...Ibm spectrum scale fundamentals workshop for americas part 5 ess gnr-usecases...
Ibm spectrum scale fundamentals workshop for americas part 5 ess gnr-usecases...
 
GPFS - graphical intro
GPFS - graphical introGPFS - graphical intro
GPFS - graphical intro
 
Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...
Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...
Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...
 
Unix _linux_fundamentals_for_hpc-_b
Unix  _linux_fundamentals_for_hpc-_bUnix  _linux_fundamentals_for_hpc-_b
Unix _linux_fundamentals_for_hpc-_b
 
Parasitic Computing
Parasitic ComputingParasitic Computing
Parasitic Computing
 
Accelerating Hadoop, Spark, and Memcached with HPC Technologies
Accelerating Hadoop, Spark, and Memcached with HPC TechnologiesAccelerating Hadoop, Spark, and Memcached with HPC Technologies
Accelerating Hadoop, Spark, and Memcached with HPC Technologies
 
Delivering Transformational Solutions to Industry by Dr. Frederick Streitz, D...
Delivering Transformational Solutions to Industry by Dr. Frederick Streitz, D...Delivering Transformational Solutions to Industry by Dr. Frederick Streitz, D...
Delivering Transformational Solutions to Industry by Dr. Frederick Streitz, D...
 
Biometric technology
Biometric technologyBiometric technology
Biometric technology
 

Similar to High Performance Computing: an Introduction for the Society of Actuaries

Webinar: High Performance MongoDB Applications with IBM POWER8
Webinar: High Performance MongoDB Applications with IBM POWER8Webinar: High Performance MongoDB Applications with IBM POWER8
Webinar: High Performance MongoDB Applications with IBM POWER8MongoDB
 
Backend.AI Technical Introduction (19.09 / 2019 Autumn)
Backend.AI Technical Introduction (19.09 / 2019 Autumn)Backend.AI Technical Introduction (19.09 / 2019 Autumn)
Backend.AI Technical Introduction (19.09 / 2019 Autumn)Lablup Inc.
 
Matching Your Costs to Your DAU: Thin Client Back-End Infrastructure Made Easy
Matching Your Costs to Your DAU: Thin Client Back-End Infrastructure Made EasyMatching Your Costs to Your DAU: Thin Client Back-End Infrastructure Made Easy
Matching Your Costs to Your DAU: Thin Client Back-End Infrastructure Made EasyPete Johnson
 
Microsoft Azure in HPC scenarios
Microsoft Azure in HPC scenariosMicrosoft Azure in HPC scenarios
Microsoft Azure in HPC scenariosmictc
 
InTech Event | Cognitive Infrastructure for Enterprise AI
InTech Event | Cognitive Infrastructure for Enterprise AIInTech Event | Cognitive Infrastructure for Enterprise AI
InTech Event | Cognitive Infrastructure for Enterprise AIInTTrust S.A.
 
From Rack scale computers to Warehouse scale computers
From Rack scale computers to Warehouse scale computersFrom Rack scale computers to Warehouse scale computers
From Rack scale computers to Warehouse scale computersRyousei Takano
 
Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...
Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...
Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...confluent
 
Cassandra in Operation
Cassandra in OperationCassandra in Operation
Cassandra in Operationniallmilton
 
X13 Pre-Release Update featuring 4th Gen Intel® Xeon® Scalable Processors
X13 Pre-Release Update featuring 4th Gen Intel® Xeon® Scalable Processors X13 Pre-Release Update featuring 4th Gen Intel® Xeon® Scalable Processors
X13 Pre-Release Update featuring 4th Gen Intel® Xeon® Scalable Processors Rebekah Rodriguez
 
Performance testing virtualized systems v5
Performance testing virtualized systems v5Performance testing virtualized systems v5
Performance testing virtualized systems v5Mentora
 
A Survey on in-a-box parallel computing and its implications on system softwa...
A Survey on in-a-box parallel computing and its implications on system softwa...A Survey on in-a-box parallel computing and its implications on system softwa...
A Survey on in-a-box parallel computing and its implications on system softwa...ChangWoo Min
 
Computação de Alto Desempenho - Fator chave para a competitividade do País, d...
Computação de Alto Desempenho - Fator chave para a competitividade do País, d...Computação de Alto Desempenho - Fator chave para a competitividade do País, d...
Computação de Alto Desempenho - Fator chave para a competitividade do País, d...Igor José F. Freitas
 
Adaptive Computing Using PlateSpin Orchestrate
Adaptive Computing Using PlateSpin OrchestrateAdaptive Computing Using PlateSpin Orchestrate
Adaptive Computing Using PlateSpin OrchestrateNovell
 
Applying Cloud Techniques to Address Complexity in HPC System Integrations
Applying Cloud Techniques to Address Complexity in HPC System IntegrationsApplying Cloud Techniques to Address Complexity in HPC System Integrations
Applying Cloud Techniques to Address Complexity in HPC System Integrationsinside-BigData.com
 
Cloud Roundtable at Microsoft Switzerland
Cloud Roundtable at Microsoft Switzerland Cloud Roundtable at Microsoft Switzerland
Cloud Roundtable at Microsoft Switzerland mictc
 
Deep Dive on Delivering Amazon EC2 Instance Performance
Deep Dive on Delivering Amazon EC2 Instance PerformanceDeep Dive on Delivering Amazon EC2 Instance Performance
Deep Dive on Delivering Amazon EC2 Instance PerformanceAmazon Web Services
 
Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webi...
Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webi...Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webi...
Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webi...Amazon Web Services
 
Deep learning for FinTech
Deep learning for FinTechDeep learning for FinTech
Deep learning for FinTechgeetachauhan
 

Similar to High Performance Computing: an Introduction for the Society of Actuaries (20)

Webinar: High Performance MongoDB Applications with IBM POWER8
Webinar: High Performance MongoDB Applications with IBM POWER8Webinar: High Performance MongoDB Applications with IBM POWER8
Webinar: High Performance MongoDB Applications with IBM POWER8
 
B9 cmis
B9 cmisB9 cmis
B9 cmis
 
Backend.AI Technical Introduction (19.09 / 2019 Autumn)
Backend.AI Technical Introduction (19.09 / 2019 Autumn)Backend.AI Technical Introduction (19.09 / 2019 Autumn)
Backend.AI Technical Introduction (19.09 / 2019 Autumn)
 
Matching Your Costs to Your DAU: Thin Client Back-End Infrastructure Made Easy
Matching Your Costs to Your DAU: Thin Client Back-End Infrastructure Made EasyMatching Your Costs to Your DAU: Thin Client Back-End Infrastructure Made Easy
Matching Your Costs to Your DAU: Thin Client Back-End Infrastructure Made Easy
 
Microsoft Azure in HPC scenarios
Microsoft Azure in HPC scenariosMicrosoft Azure in HPC scenarios
Microsoft Azure in HPC scenarios
 
InTech Event | Cognitive Infrastructure for Enterprise AI
InTech Event | Cognitive Infrastructure for Enterprise AIInTech Event | Cognitive Infrastructure for Enterprise AI
InTech Event | Cognitive Infrastructure for Enterprise AI
 
From Rack scale computers to Warehouse scale computers
From Rack scale computers to Warehouse scale computersFrom Rack scale computers to Warehouse scale computers
From Rack scale computers to Warehouse scale computers
 
Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...
Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...
Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...
 
Cassandra in Operation
Cassandra in OperationCassandra in Operation
Cassandra in Operation
 
X13 Pre-Release Update featuring 4th Gen Intel® Xeon® Scalable Processors
X13 Pre-Release Update featuring 4th Gen Intel® Xeon® Scalable Processors X13 Pre-Release Update featuring 4th Gen Intel® Xeon® Scalable Processors
X13 Pre-Release Update featuring 4th Gen Intel® Xeon® Scalable Processors
 
Performance testing virtualized systems v5
Performance testing virtualized systems v5Performance testing virtualized systems v5
Performance testing virtualized systems v5
 
A Survey on in-a-box parallel computing and its implications on system softwa...
A Survey on in-a-box parallel computing and its implications on system softwa...A Survey on in-a-box parallel computing and its implications on system softwa...
A Survey on in-a-box parallel computing and its implications on system softwa...
 
Computação de Alto Desempenho - Fator chave para a competitividade do País, d...
Computação de Alto Desempenho - Fator chave para a competitividade do País, d...Computação de Alto Desempenho - Fator chave para a competitividade do País, d...
Computação de Alto Desempenho - Fator chave para a competitividade do País, d...
 
Adaptive Computing Using PlateSpin Orchestrate
Adaptive Computing Using PlateSpin OrchestrateAdaptive Computing Using PlateSpin Orchestrate
Adaptive Computing Using PlateSpin Orchestrate
 
Cluster Computing
Cluster ComputingCluster Computing
Cluster Computing
 
Applying Cloud Techniques to Address Complexity in HPC System Integrations
Applying Cloud Techniques to Address Complexity in HPC System IntegrationsApplying Cloud Techniques to Address Complexity in HPC System Integrations
Applying Cloud Techniques to Address Complexity in HPC System Integrations
 
Cloud Roundtable at Microsoft Switzerland
Cloud Roundtable at Microsoft Switzerland Cloud Roundtable at Microsoft Switzerland
Cloud Roundtable at Microsoft Switzerland
 
Deep Dive on Delivering Amazon EC2 Instance Performance
Deep Dive on Delivering Amazon EC2 Instance PerformanceDeep Dive on Delivering Amazon EC2 Instance Performance
Deep Dive on Delivering Amazon EC2 Instance Performance
 
Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webi...
Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webi...Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webi...
Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webi...
 
Deep learning for FinTech
Deep learning for FinTechDeep learning for FinTech
Deep learning for FinTech
 

Recently uploaded

Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native ApplicationsWSO2
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfOverkill Security
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu SubbuApidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbuapidays
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 

Recently uploaded (20)

Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu SubbuApidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 

High Performance Computing: an Introduction for the Society of Actuaries

  • 1. High Performance Computing Adam DeConinck R Systems NA, Inc. 1
  • 2. Development of models begins at small scale. Working on your laptop is convenient, simple. Actual analysis, however, is slow. 2
  • 3. Development of models begins at small scale. Working on your laptop is convenient, simple. Actual analysis, however, is slow. “Scaling up” typically means a small server or fast multi-core desktop. Speedup exists, but for very large models, not significant. Single machines don't scale up forever. 3
  • 4. For the largest models, a different approach is required. 4
  • 5. High-Performance Computing involves many distinct computer processors working together on the same calculation. Large problems are divided into smaller parts and distributed among the many computers. Usually clusters of quasi-independent computers which are coordinated by a central scheduler. 5
  • 6. Typical HPC Cluster Login External connection Ethernet network Scheduler Computes File Server High-speed network (10GigE / Infiniband) 6
  • 7. Performance gains High-end workstation Duration (s) Number of cores  Performance test: stochastic finance model on R Systems cluster  High-end workstation: 8 cores. Maximum speedup of 20x: 4.5 hrs → 14 minutes  Scale-up heavily model-dependent: 5x – 100x in our tests, can be faster  No more performance gain after ~500 cores: why? Some operations can't be parallelized.  Additional cores? Run multiple models simultaneously 7
  • 8. Performance comes at a price: complexity.  New paradigm: real-time analysis vs batch jobs.  Applications must be written specifically to take advantage of distributed computing.  Performance characteristics of applications change.  Debugging becomes more of a challenge. 8
  • 9. New paradigm: real-time analysis vs batch jobs. Most small analyses are done in Large jobs are typically done in a real time: batch model:  “At-your-desk” analysis  Submit job to a queue  Small models only  Much larger models  Fast iterations  Slow iterations  No waiting for resources  May need to wait 9
  • 10. Applications must be written specifically to take advantage of distributed computing.  Explicitly split your problem into smaller “chunks”  “Message passing” between processes  Entire computation can be slowed by one or two slow chunks  Exception: “embarrassingly parallel” problems  Easy-to-split, independent chunks of computation  Thankfully, many useful models fall under “Embarrassingly parallel” = this heading. (e.g. stochastic models) No inter-process communication 10
  • 11. Performance characteristics of applications change. On a single machine: On a cluster:  CPU speed (compute)  Single-machine metrics  Cache  Network  Memory  File server  Disk  Scheduler contention  Results from other nodes 11
  • 12. Debugging becomes more of a challenge.  More complexity = more pieces that can fail  Race conditions: sequence of events no longer deterministic  Single nodes can “stall” and slow the entire computation  Scheduler, file server, login server all have their own challenges 12
  • 13. External resources  One solution to handling complexity: outsource it!  Historical HPC facilities: universities, national labs  Often have the most absolute compute capacity, and will sell excess capacity  Competition with academic projects, typically do not include SLA or high-level support  Dedicated commercial HPC facilities providing “on-demand” compute power. 13
  • 14. External HPC Internal HPC  Outsource HPC sysadmin  Requires in-house expertise  No hardware investment  Major investment in hardware  Pay-as-you-go  Possible idle time  Easy to migrate to new tech  Upgrades require new hardware 14
  • 15. Internal HPC External HPC  No external contention  No guaranteed access  All internal—easy security  Security arrangements complex  Full control over configuration  Limited control of configuration  Simpler licensing control  Some licensing complex  Requires in-house expertise  Outsource HPC sysadmin  Major investment in hardware  No hardware investment  Possible idle time  Pay-as-you-go  Upgrades require new hardware  Easy to migrate to new tech 15
  • 16. “The Cloud”  “Cloud computing”: virtual machines, dynamic allocation of resources in an external resource  Lower performance (virtualization), higher flexibility  Usually no contracts necessary: pay with your credit card, get 16 nodes  Often have to do all your own sysadmin  Low support, high control 16
  • 17. CASE STUDY: Windows cluster for Actuarial Application 17
  • 18. Global insurance company  Needed 500-1000 cores on a temporary basis  Preferred a utility, “pay-as-you-go” model  Experimenting with external resources for “burst” capacity during high-activity periods  Commercially licensed and supported application  Requested a proof of concept 18
  • 19. Cluster configuration  Application embarrassingly parallel, small-to-medium data files, computationally and memory-intensive  Prioritize computation (processors), access to fileserver over inter-node communication, large storage  Upgraded memory in compute nodes to 2 GB/core  128-node cluster: 3.0 GHz Intel Xeon processors, 8 cores per node for 1024 cores total  Windows 2008 HPC R2 operating system  Application and fileserver on login node 19
  • 20. Stumbling blocks  Application optimization Customer had a wide variety of models which generated different usage patterns. (IO, compute, memory-intensive jobs) Required dynamic reconfiguration for different conditions.  Technical issue Iterative testing process. Application turned out to be generating massive fileserver contention. Had to make changes to both software and hardware.  Human processes Users were accustomed to internal access model. Required changes both for providers (increase ease-of-use) and users (change workflow)  Security Customer had never worked with an external provider before. Complex internal security policy had to be reconciled with remote access. 20
  • 21. Lessons learned:  Security was the biggest delaying factor. The initial security setup took over 3 months from the first expression of interest, even though cluster setup was done in less than a week.  Only mattered the first time though: subsequent runs started much more smoothly.  A low-cost proof-of-concept run was important to demonstrate feasibility, and for working the bugs out.  A good relationship with the application vendor was extremely important to solving problems and properly optimizing the model for performance. 21
  • 23. Graphics processing units  CPU: complex, general-purpose processor  GPU: highly-specialized parallel processor, optimized for performing operations for common graphics routines  Highly specialized → many more “cores” for same cost and space  Intel Core i7: 4 cores @ 3.4 GHz: $300 = $75/core  NVIDIA Tesla M2070: 448 cores @ 575 MHz: $4500 = $10/core  Also higher bandwidth: 100+ GB/s for GPU vs 10-30 GB/s for CPU  Same operations can be adapted for non-graphics applications: “GPGPU” 23 Image from http://blogs.nvidia.com/2009/12/whats-the-difference-between-a-cpu-and-a-gpu/
  • 24. HPC/Actuarial using GPUs  Random-number generation  Finite-difference modeling  Image processing  Numerical Algorithms Group: GPU random-number generator  MATLAB: operations on large arrays/matrices  Wolfram Mathematica: symbolic math analysis Data from http://www.nvidia.com/object/computational_finan ce.html 24