GPU-based Parallelization of
System Modeling

Stephan Pachnicke, 18.03.2013
Outline

• Motivation

• Numerical System Modeling

• GPU-Parallelization

• Comparison of Speedup and Accuracy

• Conclusion




2                       © 2013 ADVA Optical Networking. All rights reserved.
Acknowledgments

The author would like to acknowledge the help and
contributions of


Adam Chachaj – Krone Messtechnik

Heinrich Müller – TU Dortmund

Peter Krummrich – TU Dortmund

Markus Roppelt – ADVA Optical Networking

Michael Eiselt – ADVA Optical Networking




Motivation




In Short: Computational Performance




[Figure: CPU cluster vs. graphics processing unit (GPU)]




Increase in GFlop/s




• GPU performance is growing even faster than predicted by Moore's
  law and is significantly higher than CPU performance

• GPUs are also attractive for general-purpose computing
  (complex numerical simulations)



Optical System Modeling

• Simulation of (long-haul) optical transmission systems requires
  numerical solution of the nonlinear Schrödinger equation

→ High computational effort, because accurate simulation of nonlinear
  fiber effects requires small step sizes

• Precise estimation of the bit error ratio with Monte Carlo
  simulations for PMD and noise

→ Requires a high number of simulated bits




Split-Step Fourier Method (SSFM)
•   Splits the nonlinear Schrödinger equation into linear and nonlinear parts
•   Separate solution of linear and nonlinear parts




•   Solution of the linear part in the frequency domain and of the nonlinear
    part in the time domain (acceptable for small step sizes)




[Figure: chain of FFT and IFFT operations, with one split-step marked]
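
The split-step scheme described on this slide can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the implementation behind the presented results: it assumes the scalar equation in the form dA/dz = -(alpha/2) A - i (beta2/2) d²A/dt² + i gamma |A|² A, uses a plain first-order splitting (full linear step, then full nonlinear step), and runs on the CPU in double precision. The names ssfm_step and propagate and all parameter names are hypothetical.

```python
import numpy as np

def ssfm_step(field, dz, omega, beta2, gamma, alpha=0.0):
    """Advance the complex envelope `field` by one split-step of length dz."""
    # Linear part (dispersion and loss), solved in the frequency domain.
    # Assumed convention: d^2/dt^2 maps to -omega^2 under the Fourier transform.
    linear = np.exp((0.5j * beta2 * omega**2 - 0.5 * alpha) * dz)
    field = np.fft.ifft(np.fft.fft(field) * linear)
    # Nonlinear part (Kerr phase rotation), solved in the time domain.
    return field * np.exp(1j * gamma * np.abs(field)**2 * dz)

def propagate(field, length, n_steps, dt, beta2, gamma, alpha=0.0):
    """Propagate over `length` using n_steps equal split-steps."""
    # Angular frequency grid that matches NumPy's FFT bin ordering.
    omega = 2.0 * np.pi * np.fft.fftfreq(field.size, d=dt)
    dz = length / n_steps
    for _ in range(n_steps):
        field = ssfm_step(field, dz, omega, beta2, gamma, alpha)
    return field

# Example: Gaussian pulse over 80 km of lossless fiber (illustrative values).
t = (np.arange(1024) - 512) * 1e-12
pulse = np.exp(-(t / 20e-12) ** 2).astype(np.complex128)
out = propagate(pulse, length=80e3, n_steps=100, dt=1e-12,
                beta2=-21.7e-27, gamma=1.3e-3)
```

Each step costs one FFT and one IFFT, so a simulation with many thousands of steps is dominated by Fourier transforms of the same length.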
Speedup Factor (GPU vs. CPU)

[Figure: speedup factor of the GPU over the CPU, with curves for single precision (SP) and double precision (DP)]

Legend
DP: Nvidia CUDA FFT
SP: FFT using pre-calculated twiddle factors


•   Single precision arithmetic has much higher performance on GPUs
    (because the main target market is computer gaming)

•   Longer block lengths allow better parallelization

→ A single precision implementation is desirable

Accuracy (in single precision)

[Figure: accuracy of single precision FFT implementations, with the LUT-based FFT marked]

Legend
CUFFT: Nvidia CUDA FFT
FFTW:  Fastest Fourier Transform in the West
IPP:   Intel Integrated Performance Primitives
LUT:   Pre-calculate trigonometric functions in DP


 • Total accuracy of SSFM dominated by FFT accuracy

 • Backward error grows linearly with increasing number of FFTs

 • CUDA FFT shows considerably higher error than other FFT
   implementations

Analysis: Accuracy

 • Why is the accuracy of CUFFT in SP relatively low?

   - FFT accuracy depends crucially on the accuracy of the "twiddle factors"
     (or trigonometric functions)

   - The hardware implementation of trigonometric functions in SP on GPUs is
     optimized for peak performance, not accuracy


 • What can be done to increase accuracy in single precision?

   - Implementation of a Taylor series expansion (slow!)

   - Compute trigonometric functions in DP on the CPU and store them in
     a look-up table on the GPU
     (especially suited to the split-step Fourier method with thousands
     of FFTs of similar length)

                                                         J. C. Schatzman, SIAM J. Scientific Comput. (1996).
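
The look-up-table idea can be illustrated with a small radix-2 FFT whose butterflies run in single precision while the twiddle factors come from a table that was computed once in double precision. This is a CPU-side NumPy sketch under stated assumptions, not the GPU kernel behind the presented results; the function names make_twiddle_lut and fft_radix2_lut are hypothetical, and the input length is assumed to be a power of two.

```python
import numpy as np

def make_twiddle_lut(n):
    """Twiddle factors exp(-2*pi*i*k/n) for k < n/2, computed in DP, stored in SP."""
    k = np.arange(n // 2, dtype=np.float64)
    return np.exp(-2j * np.pi * k / n).astype(np.complex64)

def fft_radix2_lut(x, lut):
    """Iterative radix-2 FFT in single precision using a precomputed table."""
    n = x.size
    assert n > 0 and n & (n - 1) == 0 and lut.size == n // 2
    # Bit-reversal permutation of the input samples.
    bits = n.bit_length() - 1
    idx = np.arange(n)
    rev = np.zeros(n, dtype=np.int64)
    for b in range(bits):
        rev |= ((idx >> b) & 1) << (bits - 1 - b)
    a = x.astype(np.complex64)[rev]
    half = 1
    while half < n:
        # Twiddles of this stage are a strided view of the table;
        # no trigonometric function is evaluated inside the transform.
        w = lut[:: n // (2 * half)]
        a = a.reshape(-1, 2 * half)
        even = a[:, :half]
        odd = a[:, half:] * w
        a = np.concatenate((even + odd, even - odd), axis=1).reshape(-1)
        half *= 2
    return a

# Build the table once, then reuse it for every FFT of this length.
n = 1024
lut = make_twiddle_lut(n)
rng = np.random.default_rng(0)
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
y = fft_radix2_lut(x, lut)
rel_err = np.linalg.norm(y - np.fft.fft(x)) / np.linalg.norm(np.fft.fft(x))
```

Because the split-step Fourier method calls thousands of FFTs of the same length, the cost of building the table is paid once and amortized over the whole simulation.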

Illustrative Example
[Figure: two panels, CUDA FFT (SP) and LUT-based FFT (SP), each comparing GPU and CPU results]




     •   The look-up-table-based FFT provides significantly increased accuracy in
         single-precision arithmetic
     •   The look-up table holds pre-calculated "twiddle factor" values

                                                                                   Source: S. Pachnicke, et al, OFC 2011.

System Analysis (SSFM Simulation)

[Figure: required OSNR deviation for BER = 10^-3 in dB, GPU simulation (in SP or DP) vs. CPU simulation (in DP), 11 x 112 Gb/s CP-QPSK]




 •   GPU double precision results are (almost) identical to CPU results

 •   The OSNR penalty of our single precision implementation remains below
     0.1 dB up to a number of approx. 125,000 split-steps
                                                                                             Source: S. Pachnicke, IEEE ICTON, 2010.


Combined Simulation in SP & DP

 • Calculate an approximate division of the parameter space into strata by
   fast simulations with single precision.
 • The ellipses represent parameter combinations for which bit errors occur
   during transmission.
 • Execute simulations with double precision accuracy sparsely in the
   different strata to assess the BER.

 → Combined simulation with single and double precision and automatic
   (algorithmic) choice of the number of single precision simulations
                                                                               P. Serena, et al, IEEE JLT, 2009.
                                                                                 S. Pachnicke, et al, OFC 2011.
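
The way per-stratum results combine into one BER figure can be sketched with the generic textbook stratified estimator, BER ≈ sum over k of w_k * (errors_k / bits_k). This is an assumption for illustration only; the slides do not spell out the estimator or the stratum selection used in the cited papers, and the function name stratified_ber and the example numbers are hypothetical.

```python
import numpy as np

def stratified_ber(weights, errors, bits):
    """Combine per-stratum error counts into one BER estimate.

    weights[k]: probability that the random parameters fall into stratum k
                (assumed here to come from the fast single precision runs)
    errors[k], bits[k]: bit errors and simulated bits of the double
                precision runs executed in stratum k
    """
    weights = np.asarray(weights, dtype=np.float64)
    errors = np.asarray(errors, dtype=np.float64)
    bits = np.asarray(bits, dtype=np.float64)
    if not np.isclose(weights.sum(), 1.0):
        raise ValueError("stratum weights must sum to 1")
    per_stratum = errors / bits  # error ratio inside each stratum
    return float(np.sum(weights * per_stratum))

# Three strata: error-free region, transition region, error-prone region.
ber = stratified_ber(weights=[0.90, 0.09, 0.01],
                     errors=[0, 10, 500],
                     bits=[1000, 1000, 1000])
```

Strata that are rare but error-prone contribute through their weight, so few double precision runs are needed in regions where errors almost never occur.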

Discussion




 Robustness of the algorithm has been checked by deliberately selecting
 a high number of 880,000 split-steps.



 •   Results of combined (SP & DP) GPU simulations match well with results obtained
     from CPU simulations in DP
 •   Speedup of up to a factor of 180 possible compared to CPU
 → Stratified Monte Carlo sampling allows an algorithmic choice of the number of
   DP simulations required for a given accuracy


                                                                                    Source: S. Pachnicke, et al, OFC 2011.


Design Advantages
 •   GPU parallelization allows simulation of a long-distance, 80-channel WDM system
     on a PC in reasonable time




                                                             Source: C. Xia, D. van den Borne, OFC, 2011




 •   Result: The system performance can be estimated much more precisely than with
     CPU-based simulations (which typically model only 10-channel WDM systems)




Conclusion

 • GPUs offer a much higher computational peak performance
   than CPUs

 • Full benefit of GPU power only in single precision

 • Increase in single precision accuracy possible by pre-computing the
   trigonometric function values for FFTs

 • Speedup in simulation time of more than a factor of 100 possible
   compared to CPU




Further Reading

 •   N. K. Govindaraju, B. Lloyd, Y. Dotsenko, B. Smith, J. Manferdelli, “High
     Performance Discrete Fourier Transforms on Graphics Processors”, Proc. of
     IEEE conference on Supercomputing (SC), article no. 2 (2008).

 •   S. Pachnicke, “Fiber-Optic Transmission Networks: Efficient Design and
     Dynamic Operation”, Springer (2011).

 •   J. C. Schatzman, “Accuracy of the Discrete Fourier Transform and the Fast
     Fourier Transform”, SIAM J. Scientific Comput. 17, 1150-1166 (1996).

 •   G. Falcao, V. Silva, L. Sousa, “How GPUs can outperform ASICs for fast LDPC
     decoding”, Proc. of ACM International Conference on Supercomputing
     (ICS), 390-399 (2009).

 •   J. A. Stratton, S. S. Stone, W.-M. W. Hwu, “MCUDA: An Efficient
     Implementation of CUDA Kernels for Multi-core CPUs”, Lecture Notes in
     Computer Science 5335, 16-30 (2008).

 •   R. R. Exposito, G. L. Taboada, S. Ramos, J. Tourino, R. Doallo, “General-
     purpose computation on GPUs for high performance cloud computing”, Wiley J.
     Concurrency and Computation 24 (2012).




Thank you

spachnicke@advaoptical.com


IMPORTANT NOTICE

The content of this presentation is strictly confidential. ADVA Optical Networking is the exclusive owner or licensee of the
content, material, and information in this presentation. Any reproduction, publication or reprint, in whole or in part, is strictly
prohibited.

The information in this presentation may not be accurate, complete or up to date, and is provided without warranties or
representations of any kind, either express or implied. ADVA Optical Networking shall not be responsible for and disclaims any
liability for any loss or damages, including without limitation, direct, indirect, incidental, consequential and special damages,
alleged to have been caused by or in connection with using and/or relying on the information contained in this presentation.

Copyright © for the entire content of this presentation: ADVA Optical Networking.

More Related Content

What's hot

Emerging Trends and Applications for Cost Effective ROADMs
Emerging Trends and Applications for Cost Effective ROADMsEmerging Trends and Applications for Cost Effective ROADMs
Emerging Trends and Applications for Cost Effective ROADMsCPqD
 
ROADM Technologies for Flexible - Tbitsec Optical Networks
ROADM Technologies for Flexible - Tbitsec Optical NetworksROADM Technologies for Flexible - Tbitsec Optical Networks
ROADM Technologies for Flexible - Tbitsec Optical NetworksCPqD
 
Evaluation of Virtualization Models for Optical Connectivity Service Providers
Evaluation of Virtualization Models for Optical Connectivity Service ProvidersEvaluation of Virtualization Models for Optical Connectivity Service Providers
Evaluation of Virtualization Models for Optical Connectivity Service ProvidersADVA
 
Metro High-Speed Product Line Manager
Metro High-Speed Product Line ManagerMetro High-Speed Product Line Manager
Metro High-Speed Product Line ManagerCPqD
 
Ft tx presentation to telkom 25092013
Ft tx presentation to telkom 25092013Ft tx presentation to telkom 25092013
Ft tx presentation to telkom 25092013Wahyu Nasution
 
Basics of DWDM Technology
Basics of DWDM TechnologyBasics of DWDM Technology
Basics of DWDM TechnologyPankaj Lahariya
 
Performance Tradeoffs of 120 Gb/s DP-QPSK in ROADM Systems
Performance Tradeoffs of 120 Gb/s DP-QPSK in ROADM SystemsPerformance Tradeoffs of 120 Gb/s DP-QPSK in ROADM Systems
Performance Tradeoffs of 120 Gb/s DP-QPSK in ROADM SystemsADVA
 
Optical Transport Technologies and Trends
Optical Transport Technologies and TrendsOptical Transport Technologies and Trends
Optical Transport Technologies and TrendsMyNOG
 
DWDM & Packet Optical Fundamentals by Dion Leung [APRICOT 2015]
DWDM & Packet Optical Fundamentals by Dion Leung [APRICOT 2015]DWDM & Packet Optical Fundamentals by Dion Leung [APRICOT 2015]
DWDM & Packet Optical Fundamentals by Dion Leung [APRICOT 2015]APNIC
 
Optical Networks Infrastructure
Optical Networks InfrastructureOptical Networks Infrastructure
Optical Networks InfrastructureTal Lavian Ph.D.
 
Synchronization protection & redundancy in ng networks itsf 2015
Synchronization protection & redundancy in ng networks   itsf 2015Synchronization protection & redundancy in ng networks   itsf 2015
Synchronization protection & redundancy in ng networks itsf 2015Daniel Sproats
 
Introduction to dwdm technology
Introduction to dwdm technologyIntroduction to dwdm technology
Introduction to dwdm technologySayed Qaisar Shah
 
Optical Fibre & Introduction to TDM & DWDM
Optical Fibre & Introduction to TDM & DWDMOptical Fibre & Introduction to TDM & DWDM
Optical Fibre & Introduction to TDM & DWDMHasna Heng
 
DWDM 101 - BRKOPT-2016
DWDM 101 - BRKOPT-2016DWDM 101 - BRKOPT-2016
DWDM 101 - BRKOPT-2016Bruno Teixeira
 
Optical network evolution
Optical network evolutionOptical network evolution
Optical network evolutionCPqD
 
Implications of super channels on CDC ROADM architectures
Implications of super channels on CDC ROADM architecturesImplications of super channels on CDC ROADM architectures
Implications of super channels on CDC ROADM architecturesAnuj Malik
 
LTE introduction part1
LTE introduction part1LTE introduction part1
LTE introduction part1Pei-Che Chang
 

What's hot (20)

Emerging Trends and Applications for Cost Effective ROADMs
Emerging Trends and Applications for Cost Effective ROADMsEmerging Trends and Applications for Cost Effective ROADMs
Emerging Trends and Applications for Cost Effective ROADMs
 
ROADM Technologies for Flexible - Tbitsec Optical Networks
ROADM Technologies for Flexible - Tbitsec Optical NetworksROADM Technologies for Flexible - Tbitsec Optical Networks
ROADM Technologies for Flexible - Tbitsec Optical Networks
 
Evaluation of Virtualization Models for Optical Connectivity Service Providers
Evaluation of Virtualization Models for Optical Connectivity Service ProvidersEvaluation of Virtualization Models for Optical Connectivity Service Providers
Evaluation of Virtualization Models for Optical Connectivity Service Providers
 
Metro High-Speed Product Line Manager
Metro High-Speed Product Line ManagerMetro High-Speed Product Line Manager
Metro High-Speed Product Line Manager
 
Ft tx presentation to telkom 25092013
Ft tx presentation to telkom 25092013Ft tx presentation to telkom 25092013
Ft tx presentation to telkom 25092013
 
Basics of DWDM Technology
Basics of DWDM TechnologyBasics of DWDM Technology
Basics of DWDM Technology
 
Performance Tradeoffs of 120 Gb/s DP-QPSK in ROADM Systems
Performance Tradeoffs of 120 Gb/s DP-QPSK in ROADM SystemsPerformance Tradeoffs of 120 Gb/s DP-QPSK in ROADM Systems
Performance Tradeoffs of 120 Gb/s DP-QPSK in ROADM Systems
 
Optical Transport Technologies and Trends
Optical Transport Technologies and TrendsOptical Transport Technologies and Trends
Optical Transport Technologies and Trends
 
DWDM & Packet Optical Fundamentals by Dion Leung [APRICOT 2015]
DWDM & Packet Optical Fundamentals by Dion Leung [APRICOT 2015]DWDM & Packet Optical Fundamentals by Dion Leung [APRICOT 2015]
DWDM & Packet Optical Fundamentals by Dion Leung [APRICOT 2015]
 
WDM Basics
WDM BasicsWDM Basics
WDM Basics
 
Optical Networks Infrastructure
Optical Networks InfrastructureOptical Networks Infrastructure
Optical Networks Infrastructure
 
Synchronization protection & redundancy in ng networks itsf 2015
Synchronization protection & redundancy in ng networks   itsf 2015Synchronization protection & redundancy in ng networks   itsf 2015
Synchronization protection & redundancy in ng networks itsf 2015
 
Mobile Broadband
Mobile BroadbandMobile Broadband
Mobile Broadband
 
Introduction to dwdm technology
Introduction to dwdm technologyIntroduction to dwdm technology
Introduction to dwdm technology
 
Optical Fibre & Introduction to TDM & DWDM
Optical Fibre & Introduction to TDM & DWDMOptical Fibre & Introduction to TDM & DWDM
Optical Fibre & Introduction to TDM & DWDM
 
DWDM 101 - BRKOPT-2016
DWDM 101 - BRKOPT-2016DWDM 101 - BRKOPT-2016
DWDM 101 - BRKOPT-2016
 
Optical network evolution
Optical network evolutionOptical network evolution
Optical network evolution
 
Implications of super channels on CDC ROADM architectures
Implications of super channels on CDC ROADM architecturesImplications of super channels on CDC ROADM architectures
Implications of super channels on CDC ROADM architectures
 
LTE introduction part1
LTE introduction part1LTE introduction part1
LTE introduction part1
 
Guide otn ang
Guide otn angGuide otn ang
Guide otn ang
 

Viewers also liked

Deploying Virtualized Services Over Legacy Networks
Deploying Virtualized Services Over Legacy NetworksDeploying Virtualized Services Over Legacy Networks
Deploying Virtualized Services Over Legacy NetworksADVA
 
Statistical-Multiplexing Gain of C-RAN
Statistical-Multiplexing Gain of C-RANStatistical-Multiplexing Gain of C-RAN
Statistical-Multiplexing Gain of C-RANJingchu Liu
 
数据中心网络研究:机遇与挑战
数据中心网络研究:机遇与挑战数据中心网络研究:机遇与挑战
数据中心网络研究:机遇与挑战Weiwei Fang
 
Forget the Layers: NFV Is About Dynamism
Forget the Layers: NFV Is About DynamismForget the Layers: NFV Is About Dynamism
Forget the Layers: NFV Is About DynamismADVA
 
WDM PON Forum Workshop
WDM PON Forum WorkshopWDM PON Forum Workshop
WDM PON Forum WorkshopADVA
 
NGFI (Next Generation Fronthaul Interface) native RoE (Radio over Ethernet)
NGFI (Next Generation Fronthaul Interface) native RoE (Radio over Ethernet)NGFI (Next Generation Fronthaul Interface) native RoE (Radio over Ethernet)
NGFI (Next Generation Fronthaul Interface) native RoE (Radio over Ethernet)ITU
 
Tunable DWDM PON at WDM PON Forum Workshop
Tunable DWDM PON at WDM PON Forum WorkshopTunable DWDM PON at WDM PON Forum Workshop
Tunable DWDM PON at WDM PON Forum WorkshopADVA
 

Viewers also liked (7)

Deploying Virtualized Services Over Legacy Networks
Deploying Virtualized Services Over Legacy NetworksDeploying Virtualized Services Over Legacy Networks
Deploying Virtualized Services Over Legacy Networks
 
Statistical-Multiplexing Gain of C-RAN
Statistical-Multiplexing Gain of C-RANStatistical-Multiplexing Gain of C-RAN
Statistical-Multiplexing Gain of C-RAN
 
数据中心网络研究:机遇与挑战
数据中心网络研究:机遇与挑战数据中心网络研究:机遇与挑战
数据中心网络研究:机遇与挑战
 
Forget the Layers: NFV Is About Dynamism
Forget the Layers: NFV Is About DynamismForget the Layers: NFV Is About Dynamism
Forget the Layers: NFV Is About Dynamism
 
WDM PON Forum Workshop
WDM PON Forum WorkshopWDM PON Forum Workshop
WDM PON Forum Workshop
 
NGFI (Next Generation Fronthaul Interface) native RoE (Radio over Ethernet)
NGFI (Next Generation Fronthaul Interface) native RoE (Radio over Ethernet)NGFI (Next Generation Fronthaul Interface) native RoE (Radio over Ethernet)
NGFI (Next Generation Fronthaul Interface) native RoE (Radio over Ethernet)
 
Tunable DWDM PON at WDM PON Forum Workshop
Tunable DWDM PON at WDM PON Forum WorkshopTunable DWDM PON at WDM PON Forum Workshop
Tunable DWDM PON at WDM PON Forum Workshop
 

Similar to OFC/NFOEC: GPU-based Parallelization of System Modeling

Design and implementation of GPU-based SAR image processor
Design and implementation of GPU-based SAR image processorDesign and implementation of GPU-based SAR image processor
Design and implementation of GPU-based SAR image processorNajeeb Ahmad
 
improve deep learning training and inference performance
improve deep learning training and inference performanceimprove deep learning training and inference performance
improve deep learning training and inference performances.rohit
 
APSys Presentation Final copy2
APSys Presentation Final copy2APSys Presentation Final copy2
APSys Presentation Final copy2Junli Gu
 
Symposium on HPC Applications – IIT Kanpur
Symposium on HPC Applications – IIT KanpurSymposium on HPC Applications – IIT Kanpur
Symposium on HPC Applications – IIT KanpurRishi Pathak
 
N A G P A R I S280101
N A G P A R I S280101N A G P A R I S280101
N A G P A R I S280101John Holden
 
Optimizing the graphics pipeline with compute
Optimizing the graphics pipeline with computeOptimizing the graphics pipeline with compute
Optimizing the graphics pipeline with computeWuBinbo
 
Using Many-Core Processors to Improve the Performance of Space Computing Plat...
Using Many-Core Processors to Improve the Performance of Space Computing Plat...Using Many-Core Processors to Improve the Performance of Space Computing Plat...
Using Many-Core Processors to Improve the Performance of Space Computing Plat...Fisnik Kraja
 
graphics processing unit ppt
graphics processing unit pptgraphics processing unit ppt
graphics processing unit pptNitesh Dubey
 
Adv. FPGA Motor Control--EBV & Univ. of Koln: Embedded World 2010
Adv. FPGA Motor Control--EBV & Univ. of Koln: Embedded World 2010Adv. FPGA Motor Control--EBV & Univ. of Koln: Embedded World 2010
Adv. FPGA Motor Control--EBV & Univ. of Koln: Embedded World 2010Altera Corporation
 
PhD defense talk (portfolio of my expertise)
PhD defense talk (portfolio of my expertise)PhD defense talk (portfolio of my expertise)
PhD defense talk (portfolio of my expertise)Gernot Ziegler
 

Similar to OFC/NFOEC: GPU-based Parallelization of System Modeling (20)

PG-Strom
PG-StromPG-Strom
PG-Strom
 
Cuda tutorial
Cuda tutorialCuda tutorial
Cuda tutorial
 
Design and implementation of GPU-based SAR image processor
Design and implementation of GPU-based SAR image processorDesign and implementation of GPU-based SAR image processor
Design and implementation of GPU-based SAR image processor
 
Imaging on embedded GPUs
Imaging on embedded GPUsImaging on embedded GPUs
Imaging on embedded GPUs
 
Circuits eda
Circuits edaCircuits eda
Circuits eda
 
improve deep learning training and inference performance
improve deep learning training and inference performanceimprove deep learning training and inference performance
improve deep learning training and inference performance
 
Main (3)
Main (3)Main (3)
Main (3)
 
427 432
427 432427 432
427 432
 
APSys Presentation Final copy2
APSys Presentation Final copy2APSys Presentation Final copy2
APSys Presentation Final copy2
 
Symposium on HPC Applications – IIT Kanpur
Symposium on HPC Applications – IIT KanpurSymposium on HPC Applications – IIT Kanpur
Symposium on HPC Applications – IIT Kanpur
 
FIR filter on GPU
FIR filter on GPUFIR filter on GPU
FIR filter on GPU
 
N A G P A R I S280101
N A G P A R I S280101N A G P A R I S280101
N A G P A R I S280101
 
Optimizing the graphics pipeline with compute
Optimizing the graphics pipeline with computeOptimizing the graphics pipeline with compute
Optimizing the graphics pipeline with compute
 
Nvidia Cuda Apps Jun27 11
Nvidia Cuda Apps Jun27 11Nvidia Cuda Apps Jun27 11
Nvidia Cuda Apps Jun27 11
 
Using Many-Core Processors to Improve the Performance of Space Computing Plat...
Using Many-Core Processors to Improve the Performance of Space Computing Plat...Using Many-Core Processors to Improve the Performance of Space Computing Plat...
Using Many-Core Processors to Improve the Performance of Space Computing Plat...
 
graphics processing unit ppt
graphics processing unit pptgraphics processing unit ppt
graphics processing unit ppt
 
Jpeg dct
Jpeg dctJpeg dct
Jpeg dct
 
stdp_on_fpga.ppt
stdp_on_fpga.pptstdp_on_fpga.ppt
stdp_on_fpga.ppt
 
Adv. FPGA Motor Control--EBV & Univ. of Koln: Embedded World 2010
Adv. FPGA Motor Control--EBV & Univ. of Koln: Embedded World 2010Adv. FPGA Motor Control--EBV & Univ. of Koln: Embedded World 2010
Adv. FPGA Motor Control--EBV & Univ. of Koln: Embedded World 2010
 
PhD defense talk (portfolio of my expertise)
PhD defense talk (portfolio of my expertise)PhD defense talk (portfolio of my expertise)
PhD defense talk (portfolio of my expertise)
 

More from ADVA

Industrial optically pumped cesium beam clock
Industrial optically pumped cesium beam clockIndustrial optically pumped cesium beam clock
Industrial optically pumped cesium beam clockADVA
 
The need for GBaaS as GPS/GNSS is no longer a reliable source for critical PN...
The need for GBaaS as GPS/GNSS is no longer a reliable source for critical PN...The need for GBaaS as GPS/GNSS is no longer a reliable source for critical PN...
The need for GBaaS as GPS/GNSS is no longer a reliable source for critical PN...ADVA
 
Industry's longest holdover with the OSA 3350 SePRC™ optical cesium clock
Industry's longest holdover with the OSA 3350  SePRC™ optical cesium clockIndustry's longest holdover with the OSA 3350  SePRC™ optical cesium clock
Industry's longest holdover with the OSA 3350 SePRC™ optical cesium clockADVA
 
Addressing PNT threats in critical defense infrastructure
Addressing PNT threats in critical defense infrastructureAddressing PNT threats in critical defense infrastructure
Addressing PNT threats in critical defense infrastructureADVA
 
Precise and assured timing for enterprise networks
Precise and assured timing for enterprise networksPrecise and assured timing for enterprise networks
Precise and assured timing for enterprise networksADVA
 
Introducing Ensemble Cloudlet for on-premises cloud demand
Introducing Ensemble Cloudlet for on-premises cloud demandIntroducing Ensemble Cloudlet for on-premises cloud demand
Introducing Ensemble Cloudlet for on-premises cloud demandADVA
 
ePRTC in data centers - GNSS-backup-as-a-service (GBaaS)
ePRTC in data centers - GNSS-backup-as-a-service (GBaaS)ePRTC in data centers - GNSS-backup-as-a-service (GBaaS)
ePRTC in data centers - GNSS-backup-as-a-service (GBaaS)ADVA
 
Sync on TAP - Syncing infrastructure with software
Sync on TAP - Syncing infrastructure with softwareSync on TAP - Syncing infrastructure with software
Sync on TAP - Syncing infrastructure with softwareADVA
 
Meet stringent latency demands with time-sensitive networking
Meet stringent latency demands with time-sensitive networkingMeet stringent latency demands with time-sensitive networking
Meet stringent latency demands with time-sensitive networkingADVA
 
Making networks secure with multi-layer encryption
Making networks secure with multi-layer encryptionMaking networks secure with multi-layer encryption
Making networks secure with multi-layer encryptionADVA
 
Quantum threat: How to protect your optical network
Quantum threat: How to protect your optical networkQuantum threat: How to protect your optical network
Quantum threat: How to protect your optical networkADVA
 
Optical networks and the ecodesign tradeoff between climate change mitigation...
Optical networks and the ecodesign tradeoff between climate change mitigation...Optical networks and the ecodesign tradeoff between climate change mitigation...
Optical networks and the ecodesign tradeoff between climate change mitigation...ADVA
 
Trends in next-generation data center interconnects (DCI)
Trends in next-generation data center interconnects (DCI)Trends in next-generation data center interconnects (DCI)
Trends in next-generation data center interconnects (DCI)ADVA
 
Open optical edge connecting mobile access networks
Open optical edge connecting mobile access networksOpen optical edge connecting mobile access networks
Open optical edge connecting mobile access networksADVA
 
Introducing Adva Network Security – a trusted German anchor
Introducing Adva Network Security – a trusted German anchorIntroducing Adva Network Security – a trusted German anchor
Introducing Adva Network Security – a trusted German anchorADVA
 
Meet the industry's first pluggable 10G demarcation device
Meet the industry's first pluggable 10G demarcation deviceMeet the industry's first pluggable 10G demarcation device
Meet the industry's first pluggable 10G demarcation deviceADVA
 
Introducing ADVA AccessWave25™
Introducing ADVA AccessWave25™Introducing ADVA AccessWave25™
Introducing ADVA AccessWave25™ADVA
 
10G edge technology for outdoor environments
10G edge technology for outdoor environments10G edge technology for outdoor environments
10G edge technology for outdoor environmentsADVA
 
The quantum age - secure transport networks
The quantum age - secure transport networksThe quantum age - secure transport networks
The quantum age - secure transport networksADVA
 
From leased lines to optical spectrum services
From leased lines to optical spectrum servicesFrom leased lines to optical spectrum services
From leased lines to optical spectrum servicesADVA
 

OFC/NFOEC: GPU-based Parallelization of System Modeling

  • 1. GPU-based Parallelization of System Modeling
      Stephan Pachnicke, 18.03.2013
  • 2. Outline
      • Motivation
      • Numerical System Modeling
      • GPU-Parallelization
      • Comparison of Speedup and Accuracy
      • Conclusion
  • 3. Acknowledgments
      The author would like to acknowledge the help and contributions of:
      • Adam Chachaj, Krone Messtechnik
      • Heinrich Müller, TU Dortmund
      • Peter Krummrich, TU Dortmund
      • Markus Roppelt, ADVA Optical Networking
      • Michael Eiselt, ADVA Optical Networking
  • 4. Motivation
  • 5. In Short: Computational Performance
      Graphics Processing Unit (GPU) vs. CPU cluster
  • 6. Increase in GFlop/s
      • GPU performance is growing even faster than predicted by Moore's law and is significantly higher than CPU performance
      • GPUs are also attractive for general-purpose computing (e.g., complex numerical simulations)
  • 7. Optical System Modeling
      • Simulation of (long-haul) optical transmission systems requires numerical solution of the nonlinear Schrödinger equation
        → Accurate simulation of nonlinear fiber effects requires small step sizes and therefore high computational effort
      • Precise estimation of the bit error ratio requires Monte-Carlo simulations for PMD and noise
        → Requires a high number of simulated bits
  • 8. Split-Step Fourier Method (SSFM)
      • Splits the nonlinear Schrödinger equation into linear and nonlinear parts
      • Linear and nonlinear parts are solved separately
      • The linear part is solved in the frequency domain and the nonlinear part in the time domain (acceptable for small step sizes)
      [Figure: sequence of FFT and IFFT operations, with one split-step marked]
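The splitting described on this slide can be sketched in a few lines. The following is a minimal illustration, not the implementation behind the slides: it assumes a lossless scalar nonlinear Schrödinger equation with only second-order dispersion (`beta2`) and a Kerr coefficient (`gamma`), one common sign convention, and a plain recursive radix-2 FFT in double precision. The function and parameter names (`ssfm`, `dt`, `length`, `steps`) are hypothetical.

```python
import cmath
import math


def fft(x, inverse=False):
    """Recursive radix-2 FFT (length must be a power of two), unnormalized."""
    n = len(x)
    if n == 1:
        return list(x)
    sign = 1.0 if inverse else -1.0
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def ifft(x):
    """Inverse FFT with 1/n normalization."""
    n = len(x)
    return [v / n for v in fft(x, inverse=True)]


def ssfm(a, dt, length, steps, beta2, gamma):
    """Propagate the sampled field `a` over `length` in `steps` split-steps.

    Assumed model (illustrative only): dA/dz = -i*beta2/2 * d2A/dt2 + i*gamma*|A|^2*A
    """
    n = len(a)
    h = length / steps
    # Angular frequency of each FFT bin in standard FFT ordering.
    omega = [2 * math.pi * (k if k < n // 2 else k - n) / (n * dt) for k in range(n)]
    # Linear (dispersion) operator for one step, applied in the frequency domain.
    lin = [cmath.exp(0.5j * beta2 * w * w * h) for w in omega]
    for _ in range(steps):
        spectrum = fft(a)
        a = ifft([s * p for s, p in zip(spectrum, lin)])
        # Nonlinear (Kerr) operator for one step, applied in the time domain.
        a = [v * cmath.exp(1j * gamma * abs(v) ** 2 * h) for v in a]
    return a
```

Each split-step costs one FFT and one IFFT, which is why the FFT dominates both the run time and the accuracy of the method. Both operators in this sketch are pure phase rotations, so the pulse energy (the sum of |A|² over all samples) should be conserved up to rounding, which makes a convenient sanity check.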
  • 9. Speedup Factor (GPU vs CPU)
      [Figure: speedup factor in single precision (SP) and double precision (DP). Legend: DP = Nvidia CUDA FFT; SP = FFT using pre-calculated twiddle factors]
      • Single precision arithmetic has much higher performance on GPUs (because the main target group is computer gaming)
      • Longer block lengths allow better parallelization
        → Single precision implementation desirable
  • 10. Accuracy (in single precision)
      [Figure: accuracy of the compared FFT implementations. Legend: CUFFT = Nvidia CUDA FFT; FFTW = Fastest Fourier Transform in the West; IPP = Intel Integrated Performance Primitives; LUT = LUT-based FFT with trigonometric functions precalculated in DP]
      • Total accuracy of the SSFM is dominated by FFT accuracy
      • Backward error grows linearly with increasing number of FFTs
      • CUDA FFT shows considerably higher error than other FFT implementations
  • 11. Analysis: Accuracy
      Why is the accuracy of CUFFT in SP relatively low?
        → FFT performance depends crucially on the accuracy of the "twiddle factors" (i.e., the trigonometric functions)
        → The HW implementation of trigonometric functions in SP on GPUs is optimized for peak performance, not accuracy
      What can be done to increase accuracy in single precision?
        → Implementation of a Taylor series expansion (slow!)
        → Compute the trigonometric functions in DP on the CPU and store them in a look-up table on the GPU (especially suited to the split-step Fourier method with thousands of FFTs of similar length)
      J. C. Schatzman, SIAM J. Scientific Comput. (1996).
  • 12. Illustrative Example
      [Figure: CUDA FFT (SP) vs. LUT-based FFT (SP), each with GPU and CPU curves]
      • The look-up table based FFT provides significantly increased accuracy in single-precision arithmetic
      • The look-up table holds pre-calculated "twiddle factor" values
      Source: S. Pachnicke, et al, OFC 2011.
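The look-up table idea from the two slides above can be illustrated on the CPU. This is a hedged sketch, not the CUDA implementation from the talk: it computes the twiddle factors once in double precision, rounds each one a single time to IEEE-754 single precision, and then reads them from the table inside a radix-2 FFT. The helper names (`to_f32`, `make_twiddle_lut`, `fft_lut`) are hypothetical, and the butterfly arithmetic here still runs in Python's double precision; only the table contents are single precision.

```python
import cmath
import math
import struct


def to_f32(x):
    """Round a Python float (double) to the nearest IEEE-754 single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def make_twiddle_lut(n):
    """Twiddle factors exp(-2*pi*i*k/n) for k < n/2.

    Each value is computed in double precision and rounded once to single
    precision, so every table entry carries at most one rounding error.
    """
    lut = []
    for k in range(n // 2):
        w = cmath.exp(-2j * math.pi * k / n)
        lut.append(complex(to_f32(w.real), to_f32(w.imag)))
    return lut


def fft_lut(x, lut, stride=1):
    """Recursive radix-2 FFT that reads its twiddle factors from `lut`.

    `lut` must be built for the top-level length; a sub-transform of length
    n/stride reuses every `stride`-th entry, so no trigonometric function is
    evaluated during the transform itself.
    """
    n = len(x)
    if n == 1:
        return list(x)
    even = fft_lut(x[0::2], lut, stride * 2)
    odd = fft_lut(x[1::2], lut, stride * 2)
    out = [0j] * n
    for k in range(n // 2):
        t = lut[k * stride] * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out
```

Because a split-step simulation runs thousands of FFTs of the same length, the table would be built once (for example with `lut = make_twiddle_lut(1024)`) and reused for every transform, so the cost of the double-precision trigonometric calls is paid only at start-up.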
  • 13. System Analysis (SSFM Simulation)
      [Figure: required OSNR deviation for BER = 10^-3 [dB], GPU simulation (in SP or DP) vs. CPU simulation (in DP), 11 × 112 Gb/s CP-QPSK]
      • GPU double precision results are (almost) identical to CPU results
      • The OSNR penalty of our single precision implementation remains below 0.1 dB up to approx. 125,000 split-steps
      Source: S. Pachnicke, IEEE ICTON, 2010.
  • 14. Combined Simulation in SP & DP
      → Calculate an approximate division of the parameter space into strata by fast simulations in single precision.
      → The ellipses represent parameter combinations for which bit errors occur during transmission.
      → Execute simulations with double precision accuracy sparsely in the different strata to assess the BER.
      → Combined simulation in single and double precision with automatic (algorithmic) choice of the number of single precision simulations
      P. Serena, et al, IEEE JLT, 2009. S. Pachnicke, et al, OFC 2011.
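The combination of cheap and accurate runs on this slide follows the general pattern of stratified Monte-Carlo sampling. The sketch below shows that generic pattern only and is not the algorithm from the cited papers: a fast classifier (standing in for a single precision run) assigns every sample to a stratum, an expensive check (standing in for a double precision run) is executed on only a few samples per stratum, and the per-stratum error rates are combined using the stratum weights. All names (`stratified_estimate`, `cheap_label`, `expensive_error`, `n_per_stratum`) are hypothetical.

```python
import random


def stratified_estimate(samples, cheap_label, expensive_error, n_per_stratum, rng):
    """Estimate the error probability over `samples` by stratified sampling.

    cheap_label(x)     -> hashable stratum label from a fast run (e.g. SP)
    expensive_error(x) -> True if the accurate run (e.g. DP) shows an error
    rng                -> random.Random instance used to pick the sparse subset
    """
    # Step 1: partition the samples into strata using only the cheap run.
    strata = {}
    for x in samples:
        strata.setdefault(cheap_label(x), []).append(x)

    # Step 2: run the expensive check sparsely inside each stratum and
    # combine the per-stratum error rates, weighted by stratum size.
    estimate = 0.0
    for members in strata.values():
        weight = len(members) / len(samples)
        k = min(n_per_stratum, len(members))
        picked = rng.sample(members, k)
        errors = sum(1 for x in picked if expensive_error(x))
        estimate += weight * errors / k
    return estimate
```

A call might look like `stratified_estimate(samples, cheap, accurate, 20, random.Random(1))`. The cheap labels do not need to be exact: a mislabelled sample only makes a stratum less homogeneous, which raises the variance of the estimate but does not bias it, because the error rate inside each stratum always comes from the accurate run.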
  • 15. Discussion
      The robustness of the algorithm has been checked by deliberately selecting a high number of 880,000 split-steps
      • Results of combined (SP & DP) GPU simulations match well with results obtained from CPU simulations in DP
      • Speedup of up to a factor of 180 possible compared to CPU
        → Stratified Monte-Carlo sampling allows an algorithmic choice of the number of DP simulations required for a given accuracy
      Source: S. Pachnicke, et al, OFC 2011.
  • 16. Design Advantages
      • GPU parallelization allows simulation of a long-distance system with 80 WDM channels on a PC in reasonable time
      Source: C. Xia, D. van den Borne, OFC, 2011
      • Result: The system performance can be estimated much more precisely than with CPU-based simulations (which typically model systems with only 10 WDM channels)
  • 17. Conclusion
      • GPUs offer a much higher computational peak performance than CPUs
      • The full benefit of GPU power is available only in single precision
      • Single precision accuracy can be increased by pre-computing the trigonometric function values for the FFTs
      • Speedup in simulation time of more than a factor of 100 possible compared to CPU
  • 18. Further Reading
      • N. K. Govindaraju, B. Lloyd, Y. Dotsenko, B. Smith, J. Manferdelli, "High Performance Discrete Fourier Transforms on Graphics Processors", Proc. of IEEE conference on Supercomputing (SC), article no. 2 (2008).
      • S. Pachnicke, "Fiber-Optic Transmission Networks: Efficient Design and Dynamic Operation", Springer (2011).
      • J. C. Schatzman, "Accuracy of the Discrete Fourier Transform and the Fast Fourier Transform", SIAM J. Scientific Comput. 17, 1150-1166 (1996).
      • G. Falcao, V. Silva, L. Sousa, "How GPUs can outperform ASICs for fast LDPC decoding", Proc. of ACM International Conference on Supercomputing (ICS), 390-399 (2009).
      • J. A. Stratton, S. S. Stone, W.-M. W. Hwu, "MCUDA: An Efficient Implementation of CUDA Kernels for Multi-core CPUs", Lecture Notes in Computer Science 5335, 16-30 (2008).
      • R. R. Exposito, G. L. Taboada, S. Ramos, J. Tourino, R. Doallo, "General-purpose computation on GPUs for high performance cloud computing", Wiley J. Concurrency and Computation 24 (2012).
  • 19. Thank you
      spachnicke@advaoptical.com
      IMPORTANT NOTICE: The content of this presentation is strictly confidential. ADVA Optical Networking is the exclusive owner or licensee of the content, material, and information in this presentation. Any reproduction, publication or reprint, in whole or in part, is strictly prohibited. The information in this presentation may not be accurate, complete or up to date, and is provided without warranties or representations of any kind, either express or implied. ADVA Optical Networking shall not be responsible for and disclaims any liability for any loss or damages, including without limitation, direct, indirect, incidental, consequential and special damages, alleged to have been caused by or in connection with using and/or relying on the information contained in this presentation.
      Copyright © for the entire content of this presentation: ADVA Optical Networking.