byteLAKE's Alveo FPGA Solutions

byteLAKE
Alveo Products Marketplace
Expertise Software Services
Alveo Solutions
Expertise in Alveo FPGA programming
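[Diagram: an x86 CPU host connected to the FPGA over PCIe]
x86 CPU (host side): Host Application, Runtime and Drivers, Acceleration API. Written as C/C++ code with OpenCL API calls.
FPGA (device side): Accelerated Functions, DMA Engine, AXI Interfaces. Written in C/C++ or OpenCL C.
Layers shown: byteLAKE’s Solutions; Xilinx Acceleration Platform.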
More at: byteLAKE.com/en/Alveo
• Xilinx pioneered C to FPGA compilation technology (aka “HLS”) in 2011
Source code in C, C++ or OpenCL
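// Loop nest over simulation groups, simulations, and parallel RNG instances
loop_main: for (int j = 0; j < NUM_SIMGROUPS; j += 2) {
    loop_share: for (uint k = 0; k < NUM_SIMS; k++) {
        loop_parallel: for (int i = 0; i < NUM_RNGS; i++) {
            // Draw two samples with the RNG's Box-Muller routine
            mt_rng[i].BOX_MULLER(&num1[i][k], &num2[i][k], ratio4, ratio3);
            float payoff1 = expf(num1[i][k]) - 1.0f;
            float payoff2 = expf(num2[i][k]) - 1.0f;
            // Accumulate call and put payoffs separately
            if (num1[i][k] > 0.0f)
                pCall1[i][k] += payoff1;
            else
                pPut1[i][k] -= payoff1;
            if (num2[i][k] > 0.0f)
                pCall2[i][k] += payoff2;
            else
                pPut2[i][k] -= payoff2;
        }
    }
}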
[Diagram: source code → Compile → FPGA]
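The loop nest above is a fragment, and the routine behind `BOX_MULLER` is not shown on the slide. As a rough illustration only, the plain C++ sketch below shows the textbook Box-Muller transform and the same call/put accumulation pattern. The names `box_muller`, `Payoffs` and `accumulate` are assumptions made for this sketch, and the `ratio3`/`ratio4` arguments of the original are left out because their meaning is not given.

```cpp
#include <cassert>
#include <cmath>
#include <utility>

// Textbook Box-Muller transform (hypothetical helper, not the slide's
// mt_rng[i].BOX_MULLER): maps two uniform samples, u1 in (0, 1] and
// u2 in [0, 1), to two independent standard normal samples.
std::pair<double, double> box_muller(double u1, double u2) {
    assert(u1 > 0.0 && u1 <= 1.0);
    const double two_pi = 6.283185307179586;
    const double radius = std::sqrt(-2.0 * std::log(u1));
    return {radius * std::cos(two_pi * u2), radius * std::sin(two_pi * u2)};
}

// Running sums that mirror pCall*/pPut* in the fragment above.
struct Payoffs {
    double call = 0.0;
    double put = 0.0;
};

// Same branch as the fragment: positive samples add to the call sum,
// all others subtract their (non-positive) payoff from the put sum.
void accumulate(Payoffs& sums, double sample) {
    const double payoff = std::exp(sample) - 1.0;
    if (sample > 0.0) {
        sums.call += payoff;
    } else {
        sums.put -= payoff;
    }
}
```

A caller would feed uniform random numbers into `box_muller` and pass each returned sample to `accumulate`. In an HLS flow the innermost loop is the natural candidate for unrolling, since each RNG instance works on its own array row.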
Xilinx FPGAs highlights
• No predefined instruction set or underlying architecture
• Developers customize the architecture to their needs
– Custom data paths
– Custom bit-width
– Custom memory hierarchies
• Excels at all types of parallelism
– Deeply pipelined (e.g. Video codecs)
– Bit manipulations (e.g. AES, SHA)
– Wide data path (e.g. DNN)
– Custom memory hierarchy (e.g. Data analytics)
• Adapts to evolving algorithms and workload needs
FPGAs - the Ultimate Parallel Processing Device
• Compute domain divided into sub-domains
• Host sends data to the FPGA global memory
• Host calls kernels to execute them on FPGA (kernel is called many times)
• Each kernel call represents a single time step
• FPGA sends the output array back to the host
Typical Architecture
More at: byteLAKE.com/en/PPAM19
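The flow in the bullets above (upload once, one kernel call per time step, download once) can be sketched in plain C++. This is a CPU-only stand-in under stated assumptions: `kernel_step` and `run_simulation` are hypothetical names, the vectors only play the role of FPGA global memory, and the averaging stencil is a placeholder rather than a CFD kernel.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Stand-in for one kernel call (one time step). A real kernel would run on
// the FPGA; this placeholder averages each cell with its left neighbor.
void kernel_step(const std::vector<float>& in, std::vector<float>& out) {
    assert(in.size() == out.size() && !in.empty());
    const std::size_t n = in.size();
    for (std::size_t i = 0; i < n; ++i) {
        const std::size_t left = (i + n - 1) % n;  // periodic boundary
        out[i] = 0.5f * (in[i] + in[left]);
    }
}

// Host-side flow: upload once, call the kernel once per time step,
// download the result once.
std::vector<float> run_simulation(const std::vector<float>& host_input,
                                  int time_steps) {
    std::vector<float> device_in = host_input;  // host -> "global memory"
    std::vector<float> device_out(host_input.size());
    for (int step = 0; step < time_steps; ++step) {
        kernel_step(device_in, device_out);     // one call = one time step
        std::swap(device_in, device_out);       // output feeds the next step
    }
    return device_in;                           // "global memory" -> host
}
```

Swapping the two buffers between steps keeps the data on the device side for the whole run, so host transfers happen only at the start and at the end.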
• Kernel is distributed across 4 SLRs (Super Logic Regions)
• Each sub-domain is allocated in a different memory bank
• Data transfer occurs between neighboring memory banks
Example processing
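[Diagram: one kernel split across four SLRs, with four memory banks]
Kernel parts: SLR0 runs Kernel_A, SLR1 runs Kernel_B, SLR2 runs Kernel_C, SLR3 runs Kernel_D.
Memory banks: Bank0, Bank1, Bank2, Bank3, each holding one sub-domain.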
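The split into sub-domains with transfers between neighboring banks resembles a classic halo exchange. The sketch below illustrates that idea in 1D on the CPU. It is an assumption-laden illustration, not byteLAKE's layout: `SubDomain`, `split` and `exchange_halos` are hypothetical names, and a single halo cell per side is chosen arbitrarily.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One sub-domain as it might sit in one memory bank: interior cells plus one
// halo cell on each side (cells.front() and cells.back()).
struct SubDomain {
    std::vector<float> cells;
};

// Split a 1D domain into `parts` equal sub-domains (assumes divisibility).
std::vector<SubDomain> split(const std::vector<float>& domain,
                             std::size_t parts) {
    assert(parts > 0 && !domain.empty() && domain.size() % parts == 0);
    const std::size_t len = domain.size() / parts;
    std::vector<SubDomain> subs(parts);
    for (std::size_t p = 0; p < parts; ++p) {
        subs[p].cells.assign(len + 2, 0.0f);
        for (std::size_t i = 0; i < len; ++i) {
            subs[p].cells[i + 1] = domain[p * len + i];
        }
    }
    return subs;
}

// Copy boundary cells between neighboring sub-domains only; the outer halos
// of the first and last sub-domain are left untouched.
void exchange_halos(std::vector<SubDomain>& subs) {
    for (std::size_t p = 0; p + 1 < subs.size(); ++p) {
        std::vector<float>& left = subs[p].cells;
        std::vector<float>& right = subs[p + 1].cells;
        left.back() = right[1];                 // neighbor's first interior cell
        right.front() = left[left.size() - 2];  // neighbor's last interior cell
    }
}
```

With four parts, each sub-domain only ever reads from its immediate neighbors, which matches the bank-to-neighboring-bank transfer pattern described on the slide.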
Case study: CFD Kernels adaptation
Typical CFD workflow
From CAD to MESH…
(meshing)
Image source: https://www.openfoam.com/products/visualcfd.php
…to CFD simulation and visualization.
• MESH conversion (input)
• byteLAKE’s CFD Kernels
• Data output for visualization
up to 5% of simulation time
major workload
OPENFOAM® is a registered trademark of ESI Group. This offering is not approved or endorsed by ESI Group, the producer of the OpenFOAM software and owner of the OPENFOAM® and OpenCFD® trademarks.
byteLAKE created a set of highly optimized CFD kernels for Xilinx Alveo Data Center accelerator cards:
– Advection (movement of some material, dissolved or suspended in the fluid)
– Pseudo velocity (approximation of the relative velocity)
– Divergence (measures how much fluid flows into or out of a given point in a vector field)
– Thomas algorithm (simplified form of Gaussian elimination for tridiagonal systems of equations)
Download solution description: bytelake.com/en/download/2716/
CFD Kernels
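Of the kernels listed, the Thomas algorithm is the easiest to show in a few lines. The sketch below is the standard textbook form for a tridiagonal system, written for the CPU. It is not byteLAKE's FPGA kernel, and the function name `thomas_solve` and its argument layout are assumptions for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Thomas algorithm for a tridiagonal system A x = rhs, where
//   lower[i] is the sub-diagonal   (lower[0] is unused),
//   diag[i]  is the main diagonal,
//   upper[i] is the super-diagonal (upper[n-1] is unused).
// No pivoting is done, so this assumes a well-behaved system, for example
// a diagonally dominant one.
std::vector<double> thomas_solve(const std::vector<double>& lower,
                                 const std::vector<double>& diag,
                                 const std::vector<double>& upper,
                                 const std::vector<double>& rhs) {
    const std::size_t n = diag.size();
    assert(n > 0 && lower.size() == n && upper.size() == n && rhs.size() == n);

    std::vector<double> c(n), d(n), x(n);
    // Forward sweep: eliminate the sub-diagonal.
    c[0] = upper[0] / diag[0];
    d[0] = rhs[0] / diag[0];
    for (std::size_t i = 1; i < n; ++i) {
        const double denom = diag[i] - lower[i] * c[i - 1];
        c[i] = upper[i] / denom;
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom;
    }
    // Back substitution.
    x[n - 1] = d[n - 1];
    for (std::size_t i = n - 1; i-- > 0;) {
        x[i] = d[i] - c[i] * x[i + 1];
    }
    return x;
}
```

Both sweeps carry a dependency from one row to the next, so a single solve is sequential. Parallelism typically comes from solving many independent tridiagonal systems side by side.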
CFD acceleration with Alveo FPGA
More at: byteLAKE.com/en/FPGA
MPDATA Accelerated
CFD / Advection algorithm optimized for heterogeneous computing.
CFD
Computational Fluid Dynamics
• Numerical analysis and algorithms to solve fluid flow problems
– how liquids and gases flow and interact with surfaces
• Widely used across industries:
– automotive, chemical, aerospace, biomedical, power and energy, construction, etc.
• Typical applications
– weather simulations,
– aerodynamic characteristics modelling and optimization,
– petroleum mass flow rate assessment
• MPDATA (Multidimensional Positive Definite Advection Transport Algorithm)
– main part of the dynamic core of the Eulerian/semi-Lagrangian (EULAG) model
– EULAG (MPDATA + elliptic solver) is an established computational model, developed for simulating thermo-fluid flows across a wide range of scales and physical scenarios
– currently, this model is being implemented as the new dynamic core of the COSMO (Consortium for Small-scale Modeling) weather prediction framework
– advection (together with the elliptic solver) is a key part of many frameworks that allow users to implement their simulations
• Advection
– movement of some material (dissolved or suspended) in the fluid.
Algorithm: Advection (MPDATA)
General Information
• Easy to integrate
– Can work as a standalone application or be called as a function, with input and output arrays, via our dedicated interface
– Compatible with frameworks like TensorFlow for integrating deep learning with CFD codes
• Easy to visualize the results
– Results can be stored in a raw format as a binary file of the output arrays or converted via byteLAKE tools to a ParaView format
• See benefits already in 1-node HPC configurations
– Strongly adapted to Alveo U250, where a single card supports arrays of up to 2.1 Gcells (max compute domain: 1264 x 1264 x 1264), ~60 GB
• Scalable to many cards per node and many nodes
Algorithm: Advection (MPDATA)
byteLAKE’s implementation compatibility
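The quoted maximum compute domain can be checked with a little arithmetic: 1264 x 1264 x 1264 is about 2.02 x 10^9 cells, and spreading roughly 60 GB over them leaves close to 30 bytes per cell. The helper names below are hypothetical, and the sketch assumes decimal gigabytes (10^9 bytes). Which arrays share that budget is not stated on the slide.

```cpp
#include <cassert>
#include <cstdint>

// Number of cells in a cubic n x n x n domain.
std::uint64_t cube_cells(std::uint64_t n) {
    assert(n <= 2000000);  // keeps n^3 within 64 bits
    return n * n * n;
}

// Average memory budget per cell if `total_bytes` is spread over `cells`.
double bytes_per_cell(double total_bytes, std::uint64_t cells) {
    assert(cells > 0);
    return total_bytes / static_cast<double>(cells);
}
```

For example, `cube_cells(1264)` returns 2019487744, and `bytes_per_cell(60e9, cube_cells(1264))` comes out at roughly 29.7.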
• First-order-accurate step of the advection scheme. Second-order is an option.
• Input data
– Array X – non-diffusive quantity (e.g. temperature of water vapor, ice, precipitation, etc.)
– Arrays V1, V2, V3 – each of them stores the velocity vectors in one direction
– (optional) Arrays Fi, Fe – implosion and explosion forces acting on a structure of X
– (optional) Array D with density
– (optional) Array rho, which defines an interface for the coupling of COSMO and the EULAG dynamic core (used to provide the transformation of the X variable)
– DT – time step (scalar)
• Output data
– single X array that was updated in the given time step
Algorithm: Advection (MPDATA)
Technical Information
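In the published MPDATA scheme, the first-order pass is the donor-cell (upwind) scheme. The sketch below shows that pass in 1D on a periodic grid, taking a quantity array and returning the updated array for one time step. It is a simplified illustration, not byteLAKE's 3D kernel: `donor_cell_step` is a hypothetical name, and velocity and `DT` are folded into a per-face Courant number as an assumption to keep the example short.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// One first-order donor-cell (upwind) step on a periodic 1D grid.
//   x[i]       : transported quantity in cell i
//   courant[i] : v * dt / dx at the face between cell i and cell i + 1
// The scheme is stable only for |courant[i]| <= 1.
std::vector<double> donor_cell_step(const std::vector<double>& x,
                                    const std::vector<double>& courant) {
    const std::size_t n = x.size();
    assert(n > 0 && courant.size() == n);

    // Flux through the right face of each cell, taken from the upwind side.
    std::vector<double> flux(n);
    for (std::size_t i = 0; i < n; ++i) {
        const std::size_t right = (i + 1) % n;
        flux[i] = std::max(courant[i], 0.0) * x[i] +
                  std::min(courant[i], 0.0) * x[right];
    }
    // Conservative update: what leaves one cell enters its neighbor.
    std::vector<double> updated(n);
    for (std::size_t i = 0; i < n; ++i) {
        const std::size_t left = (i + n - 1) % n;
        updated[i] = x[i] - (flux[i] - flux[left]);
    }
    return updated;
}
```

With a Courant number of 1 everywhere, the field moves exactly one cell per step. Smaller values smear the profile, which is the numerical diffusion that the optional second-order correction is meant to reduce.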
• Applications include
– Characterizing the sub-grid scales effect in global numerical simulations of turbulent stellar interiors
– Comparing anelastic and compressible convection-permitting weather forecasts for the Alpine region
– Modeling the prediction of forest fire spread
– Flood simulations
– Biomechanical modeling of brain injuries within the Voigt model (a linear system of differential equations where the motion of the brain tissue depends merely on the balance between viscous and elastic forces)
– Simulation of gravity wave turbulence in the Earth's atmosphere
– Simulation of geophysical turbulence in the Earth's atmosphere
– Ocean modeling: simulation of three-dimensional solitary wave generation and propagation using EULAG coupled to the barotropic NCOM (Navy Coastal Ocean Model) tidal model
Applications of Advection (MPDATA)
• Applications include (cont.)
– Oil and Gas: provides a significant return on investment (ROI) in seismic analysis, reservoir modelling and basin modelling. Used also to monitor drilling and seismic data to optimize drilling trajectories and minimize environmental risk.
– AgriTech: models to track and predict various environmental impacts on crop yield such as weather changes. For example, daily weather predictions can be customized based on the needs of each client and range from hyperlocal to global.
• Example adopters
– Poznan Supercomputing and Networking Center, Poland: prognosis of air pollution
– European Centre for Medium-Range Weather Forecasts, UK: weather forecast
– Institute of Meteorology and Water Management, Poland: weather forecasts
– German Aerospace Center: aeronautics, transport and energy areas
– University of Cape Town, South Africa: weather simulation
– Montreal University: weather simulation
– Warsaw University: ocean simulation
Applications of Advection (MPDATA), cont.
Full list
Algorithm: Advection (MPDATA)
Alveo FPGA Benchmark
[Figure: three bar charts comparing the platforms below]
Metrics: Performance (the higher the better), Energy (the lower the better), Performance/W (the higher the better).
Platforms in each chart: Intel Xeon E5-2995 (listed twice), Intel Xeon Gold 6148, Intel Xeon Platinum 8168, Xilinx Alveo U250.
More at: byteLAKE.com/en/FPGA
Explore byteLAKE’s CFD Suite
www.byteLAKE.com/en/CFDSuite
byteLAKE’s CFD Suite
AI for CFD
AI
• Highly optimized AI engines to analyze text, image, video, sound and time series data.
• Detecting shapes & patterns.
• Complex tasks automation.
• IoT/edge, cloud, on-premise.
HPC
• Accelerating time to results and adapting complex algorithms to GPU, FPGA, many-CPU architectures.
• From single nodes to clusters.
Meet byteLAKE
AI and HPC Experts
Your software partner for AI & HPC projects
Experts in adapting & optimizing software for
Select Products
AI for CFD. Ultra fast results, radically lower TCO. New possibilities.
Objects Detection. Edge AI and real time computer vision.
56x faster AI training.
R&I • R&D • Licensing
HPC at byteLAKE
Accelerating time to results and adapting complex algorithms to GPU, FPGA, many-CPU architectures.
Unleashing the power:
• selecting the right programming model for a given problem (task parallelism, data parallelism, or a mixture of the two)
• providing the right balance between CPUs and GPUs/FPGAs
• optimizing data transfers between host memory and accelerators
• code adaptation to a variety of computing platforms
Bottom line: lowering TCO through various optimizations (performance, energy efficiency, accuracy of calculations)
More at: byteLAKE.com/en/HPC
Making the most of the hardware:
• Speedup: accelerating time to results for complex algorithms
• Green Computing: optimizing algorithms to reduce energy consumption
• Scalability: from single nodes to clusters
Products and Services
Cognitive Automation
Edge AI
Services
HPC
Products
CFD Suite
brainello
Ewa Guard
Federated Learning
Green Computing (FPGA, GPU)
Intelligent Restaurant
Incubation
byteLAKE among top AI companies in Poland!
"It contains information on practically all meaningful
companies operating in Poland which offer services or
products in the field of modern technologies. We believe this map
will be necessary to help both domestic and international
investors looking for interesting projects in Poland.",
Aleksander Kutela, President of Digital Poland Foundation
byteLAKE's Alveo FPGA Solutions

  • 1. Alveo Products Marketplace Expertise Software Services Alveo Solutions
  • 2. Expertise in Alveo FPGA programming PCIe x86 CPU Host Application Runtime and Drivers Acceleration API FPGA Accelerated Functions DMA Engine AXI Interfaces byteLAKE’s Solutions Xilinx Acceleration Platform C/C++ code with OpenCL API calls C/C++ or OpenCL C FPGACPU 2 More at: byteLAKE.com/en/Alveo
  • 3. Xilinx pioneered C-to-FPGA compilation technology (aka “HLS”) in 2011. Source code in C, C++ or OpenCL is compiled for the FPGA:
        loop_main: for (int j = 0; j < NUM_SIMGROUPS; j += 2) {
          loop_share: for (uint k = 0; k < NUM_SIMS; k++) {
            loop_parallel: for (int i = 0; i < NUM_RNGS; i++) {
              mt_rng[i].BOX_MULLER(&num1[i][k], &num2[i][k], ratio4, ratio3);
              float payoff1 = expf(num1[i][k]) - 1.0f;
              float payoff2 = expf(num2[i][k]) - 1.0f;
              if (num1[i][k] > 0.0f)
                pCall1[i][k] += payoff1;
              else
                pPut1[i][k] -= payoff1;
              if (num2[i][k] > 0.0f)
                pCall2[i][k] += payoff2;
              else
                pPut2[i][k] -= payoff2;
            }
          }
        }
  • 4. Xilinx FPGAs highlights (FPGAs - the Ultimate Parallel Processing Device) • No predefined instruction set or underlying architecture • Developers customize the architecture to their needs – Custom data paths – Custom bit-width – Custom memory hierarchies • Excels at all types of parallelism – Deeply pipelined (e.g. video codecs) – Bit manipulations (e.g. AES, SHA) – Wide data path (e.g. DNN) – Custom memory hierarchy (e.g. data analytics) • Adapts to evolving algorithms and workload needs
  • 5. Typical Architecture • Compute domain is divided into sub-domains • Host sends data to the FPGA global memory • Host calls kernels to execute them on the FPGA (a kernel is called many times) • Each kernel call represents a single time step • FPGA sends the output array back to the host. More at: byteLAKE.com/en/PPAM19
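
The workflow on this slide can be sketched in plain C++, with the FPGA emulated by ordinary functions. This is only an illustration under stated assumptions: `DeviceBuffer`, `copy_to_device`, `run_kernel`, `copy_to_host` and `simulate` are hypothetical names, the kernel body is a placeholder smoothing step, and real Alveo host code would go through the OpenCL or XRT runtime instead.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Stand-in for FPGA global memory (hypothetical; a real host would hold a device buffer).
struct DeviceBuffer {
    std::vector<float> data;
};

// Host -> device transfer, emulated by a plain copy.
DeviceBuffer copy_to_device(const std::vector<float>& host) {
    return DeviceBuffer{host};
}

// Device -> host transfer, emulated by a plain copy.
std::vector<float> copy_to_host(const DeviceBuffer& dev) {
    return dev.data;
}

// Placeholder kernel: one call advances the field by one time step.
// It averages each interior cell with its two neighbours.
void run_kernel(const DeviceBuffer& in, DeviceBuffer& out) {
    assert(in.data.size() == out.data.size());
    const std::size_t n = in.data.size();
    for (std::size_t i = 0; i < n; ++i) {
        if (i == 0 || i + 1 == n) {
            out.data[i] = in.data[i];  // boundary cells stay fixed
        } else {
            out.data[i] = (in.data[i - 1] + in.data[i] + in.data[i + 1]) / 3.0f;
        }
    }
}

// Upload once, call the kernel once per time step, download once.
std::vector<float> simulate(const std::vector<float>& input, int time_steps) {
    DeviceBuffer current = copy_to_device(input);
    DeviceBuffer next{std::vector<float>(input.size())};
    for (int step = 0; step < time_steps; ++step) {
        run_kernel(current, next);
        std::swap(current, next);  // this step's output feeds the next step
    }
    return copy_to_host(current);
}
```

Keeping the data in device memory between kernel calls, as `simulate` does, means the host pays for only one upload and one download regardless of the number of time steps.
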
  • 6. Example processing • Kernel is distributed across 4 SLRs (Super Logic Regions) • Each sub-domain is allocated in a different memory bank • Data transfer occurs between neighboring memory banks. Diagram: SLR0 Kernel_A, SLR1 Kernel_B, SLR2 Kernel_C, SLR3 Kernel_D; Bank0, Bank1, Bank2, Bank3, each holding one sub-domain
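
A minimal sketch of the layout described on this slide, assuming a 1-D domain, one halo cell per side and an evenly divisible size. `SubDomain`, `split_domain` and `exchange_halos` are hypothetical names; the actual bank assignment on the card is done by the toolchain and is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kNumBanks = 4;  // one sub-domain per memory bank / SLR

// One sub-domain: interior cells plus one halo cell on each side.
struct SubDomain {
    std::vector<float> cells;  // layout: [left halo | interior ... | right halo]
    std::size_t interior() const { return cells.size() - 2; }
};

// Split a 1-D domain evenly into kNumBanks sub-domains.
std::vector<SubDomain> split_domain(const std::vector<float>& domain) {
    assert(!domain.empty() && domain.size() % kNumBanks == 0);
    const std::size_t chunk = domain.size() / kNumBanks;
    std::vector<SubDomain> banks(kNumBanks);
    for (std::size_t b = 0; b < kNumBanks; ++b) {
        banks[b].cells.assign(chunk + 2, 0.0f);
        for (std::size_t i = 0; i < chunk; ++i) {
            banks[b].cells[i + 1] = domain[b * chunk + i];
        }
    }
    return banks;
}

// Copy edge cells into the halos of neighbouring banks only.
void exchange_halos(std::vector<SubDomain>& banks) {
    for (std::size_t b = 0; b + 1 < banks.size(); ++b) {
        SubDomain& left = banks[b];
        SubDomain& right = banks[b + 1];
        const float left_edge = left.cells[left.interior()];  // last interior cell
        const float right_edge = right.cells[1];               // first interior cell
        left.cells.back() = right_edge;   // fills the right halo of the left bank
        right.cells.front() = left_edge;  // fills the left halo of the right bank
    }
}
```

Because each bank only ever reads from its direct neighbours, the four kernels can work on their own sub-domains in parallel and synchronise just the halo cells between steps.
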
  • 7. Case study: CFD Kernels adaptation. Typical CFD workflow: from CAD to MESH (meshing), then to CFD simulation and visualization. • MESH conversion (input) • byteLAKE’s CFD Kernels • Data output for visualization. Diagram labels: up to 5% of simulation time; major workload. Image source: https://www.openfoam.com/products/visualcfd.php OPENFOAM® is a registered trademark of ESI Group. This offering is not approved or endorsed by ESI Group, the producer of the OpenFOAM software and owner of the OPENFOAM® and OpenCFD® trademarks.
  • 8. CFD Kernels: byteLAKE created a set of highly optimized CFD kernels for Xilinx Alveo Data Center accelerator cards – Advection (movement of some material, dissolved or suspended in the fluid) – Pseudo velocity (approximation of the relative velocity) – Divergence (measures how much fluid flows into or out of a given point in a vector field) – Thomas algorithm (simplified form of Gaussian elimination for tridiagonal systems of equations). Download solution description: bytelake.com/en/download/2716/
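
For reference, the Thomas algorithm named on this slide can be written in a few lines. The version below is the textbook CPU formulation, not byteLAKE's FPGA kernel; the function name `thomas_solve` and the array layout are assumptions made for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Solves a tridiagonal system A x = rhs, where
//   lower[i] is the sub-diagonal   (lower[0] is unused),
//   diag[i]  is the main diagonal,
//   upper[i] is the super-diagonal (upper[n-1] is unused).
// No pivoting is performed, so the matrix is assumed to be
// diagonally dominant or otherwise safe for plain elimination.
std::vector<double> thomas_solve(const std::vector<double>& lower,
                                 const std::vector<double>& diag,
                                 const std::vector<double>& upper,
                                 const std::vector<double>& rhs) {
    const std::size_t n = diag.size();
    assert(n > 0 && lower.size() == n && upper.size() == n && rhs.size() == n);

    std::vector<double> c(n), d(n), x(n);

    // Forward sweep: eliminate the sub-diagonal.
    c[0] = upper[0] / diag[0];
    d[0] = rhs[0] / diag[0];
    for (std::size_t i = 1; i < n; ++i) {
        const double denom = diag[i] - lower[i] * c[i - 1];
        c[i] = upper[i] / denom;
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom;
    }

    // Back substitution, from the last unknown to the first.
    x[n - 1] = d[n - 1];
    for (std::size_t i = n - 1; i-- > 0;) {
        x[i] = d[i] - c[i] * x[i + 1];
    }
    return x;
}
```

The solver needs one forward and one backward pass over the unknowns, so its cost grows linearly with the system size, compared with cubic growth for general Gaussian elimination.
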
  • 9. CFD acceleration with Alveo FPGA. More at: byteLAKE.com/en/FPGA
  • 10. MPDATA Accelerated CFD / Advection algorithm optimized for heterogeneous computing.
  • 11. CFD (Computational Fluid Dynamics) • Numerical analysis and algorithms to solve fluid flow problems – how liquids and gases flow and interact with surfaces • Widely used across industries – automotive, chemical, aerospace, biomedical, power and energy, construction, etc. • Typical applications – weather simulations – aerodynamic characteristics modelling and optimization – petroleum mass flow rate assessment
  • 12. Algorithm: Advection (MPDATA), General Information • MPDATA (Multidimensional Positive Definite Advection Transport Algorithm) – main part of the dynamic core of the Eulerian/semi-Lagrangian (EULAG) model – EULAG (MPDATA + elliptic solver) is the established computational model, developed for simulating thermo-fluid flows across a wide range of scales and physical scenarios – currently, this model is being implemented as the new dynamic core of the COSMO (Consortium for Small-scale Modeling) weather prediction framework – advection (together with the elliptic solver) is a key part of many frameworks that allow users to implement their simulations • Advection – movement of some material (dissolved or suspended) in the fluid
  • 13. Algorithm: Advection (MPDATA), byteLAKE’s implementation compatibility • Easy to integrate – Can work as a standalone application or be called as a function with input and output arrays via our dedicated interface – Compatible with frameworks like TensorFlow for integrating deep learning with CFD codes • Easy to visualize the results – Results can be stored in a raw format as a binary file of the output arrays or converted via byteLAKE tools to a ParaView format • See benefits already in 1-node HPC configurations – Strongly adapted to Alveo U250, where a single card supports arrays of up to 2.1 Gcells (max compute domain: 1264 x 1264 x 1264), ~60 GB • Scalable to many cards per node and many nodes
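
The domain-size figures on this slide can be sanity-checked with a little arithmetic. The helpers below are hypothetical, and the 4-byte single-precision cell size is an assumption: the slide states neither the precision nor how many arrays are resident on the card.

```cpp
#include <cassert>
#include <cstdint>

// Number of cells in a cubic compute domain with the given edge length.
constexpr std::uint64_t cube_cells(std::uint64_t edge) {
    return edge * edge * edge;
}

// Bytes needed to store one array over the domain.
// ASSUMPTION: one 4-byte single-precision value per cell by default.
constexpr std::uint64_t array_bytes(std::uint64_t cells,
                                    std::uint64_t bytes_per_cell = 4) {
    return cells * bytes_per_cell;
}

// How many whole arrays of a given size fit into a memory budget.
std::uint64_t arrays_that_fit(std::uint64_t budget_bytes,
                              std::uint64_t bytes_per_array) {
    assert(bytes_per_array > 0);
    return budget_bytes / bytes_per_array;
}
```

Under that assumption, `cube_cells(1264)` gives 2,019,487,744 cells, one array takes 8,077,950,976 bytes (about 8.1 GB), and `arrays_that_fit(60000000000ULL, array_bytes(cube_cells(1264)))` returns 7, so a ~60 GB footprint would correspond to roughly seven single-precision arrays of the maximum size.
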
  • 14. Algorithm: Advection (MPDATA), Technical Information • First-order-accurate step of the advection scheme. Second-order is an option. • Input data – Array X – non-diffusive quantity (e.g. temperature of water vapor, ice, precipitation, etc.) – Arrays V1, V2, V3 – each of them stores the velocity vectors in one direction – (optional) Arrays Fi, Fe – implosion and explosion forces acting on a structure of X – (optional) Array D with density – (optional) Array rho, which defines an interface for the coupling of COSMO and the EULAG dynamic core (used to provide the transformation of the X variable) – DT – time step (scalar) • Output data – single X array that was updated in the given time step
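
To make the inputs and output above concrete, here is a minimal 1-D sketch of a first-order upwind (donor-cell) advection step, which is the standard first pass of MPDATA. It is not byteLAKE's kernel: it uses a single velocity array instead of V1, V2, V3, omits the optional arrays, and assumes periodic boundaries. The names `donor_cell_flux` and `upwind_step` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Donor-cell flux through one cell face: the value is taken from the upwind side.
// `courant` is the face velocity scaled by dt/dx (dimensionless).
inline float donor_cell_flux(float left, float right, float courant) {
    return std::max(courant, 0.0f) * left + std::min(courant, 0.0f) * right;
}

// One first-order upwind time step on a periodic 1-D grid (an assumption of this sketch).
//   x      : transported quantity, one value per cell
//   v_face : velocity at the face between cell i and cell i+1
//   dt, dx : time step and grid spacing
// Returns the updated copy of x, mirroring the single output array on the slide.
std::vector<float> upwind_step(const std::vector<float>& x,
                               const std::vector<float>& v_face,
                               float dt, float dx) {
    const std::size_t n = x.size();
    assert(n > 0 && v_face.size() == n && dx > 0.0f);
    std::vector<float> out(n);
    for (std::size_t i = 0; i < n; ++i) {
        const std::size_t im1 = (i + n - 1) % n;  // periodic left neighbour
        const std::size_t ip1 = (i + 1) % n;      // periodic right neighbour
        const float flux_right = donor_cell_flux(x[i], x[ip1], v_face[i] * dt / dx);
        const float flux_left = donor_cell_flux(x[im1], x[i], v_face[im1] * dt / dx);
        out[i] = x[i] - (flux_right - flux_left);  // net outflow reduces the cell value
    }
    return out;
}
```

With a Courant number of exactly 1 the step shifts the field by one cell in the flow direction; for smaller values it also smears the profile, which is the numerical diffusion that the optional second-order MPDATA pass is designed to reduce.
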
  • 15. Applications of Advection (MPDATA) • Applications include – Characterizing the sub-grid scale effects in global numerical simulations of turbulent stellar interiors – Comparing anelastic and compressible convection-permitting weather forecasts for the Alpine region – Modeling and predicting forest fire spread – Flood simulations – Biomechanical modeling of brain injuries within the Voigt model (a linear system of differential equations where the motion of the brain tissue depends merely on the balance between viscous and elastic forces) – Simulation of gravity wave turbulence in the Earth's atmosphere – Simulation of geophysical turbulence in the Earth's atmosphere – Ocean modeling: simulation of three-dimensional solitary wave generation and propagation using EULAG coupled to the barotropic NCOM (Navy Coastal Ocean Model) tidal model
  • 16. Applications of Advection (MPDATA), cont. • Applications include, cont. – Oil and Gas: provides a significant return on investment (ROI) in seismic analysis, reservoir modelling and basin modelling. Also used to monitor drilling and seismic data to optimize drilling trajectories and minimize environmental risk. – AgriTech: models to track and predict various environmental impacts on crop yield, such as weather changes. For example, daily weather predictions can be customized based on the needs of each client and range from hyperlocal to global. • Example adopters – Poznan Supercomputing and Networking Center, Poland: prognosis of air pollution – European Centre for Medium-Range Weather Forecasts, UK: weather forecast – Institute of Meteorology and Water Management, Poland: weather forecasts – German Aerospace Center: aeronautics, transport and energy areas – University of Cape Town, South Africa: weather simulation – Montreal University: weather simulation – Warsaw University: ocean simulation
  • 17. Algorithm: Advection (MPDATA), Alveo FPGA Benchmark. [Three charts: Performance (the higher the better), Energy (the lower the better) and Performance/W (the higher the better), each comparing Intel Xeon E5-2995 (two entries), Intel Xeon Gold 6148, Intel Xeon Platinum 8168 and Xilinx Alveo U250.] More at: byteLAKE.com/en/FPGA
  • 18. Explore byteLAKE’s CFD Suite (AI for CFD): www.byteLAKE.com/en/CFDSuite
  • 19. Meet byteLAKE: AI and HPC Experts. Your software partner for AI & HPC projects. AI • Highly optimized AI engines to analyze text, image, video, sound and time series data • Detecting shapes & patterns • Complex tasks automation • IoT/edge, cloud, on-premise. HPC • Accelerating time to results and adapting complex algorithms to GPU, FPGA, many-CPU architectures • From single nodes to clusters. Experts in adapting & optimizing software for … Select Products: AI for CFD (ultra fast results, radically lower TCO, new possibilities); Objects Detection (Edge AI and real time computer vision); 56x faster AI training. R&I • R&D • Licensing
  • 20. HPC at byteLAKE: accelerating time to results and adapting complex algorithms to GPU, FPGA, many-CPU architectures. Unleashing the power: • selecting the right programming model for a given problem (task parallelism, data parallelism, or a mixture of the two) • providing the right balance between CPUs and GPUs/FPGAs • optimizing data transfers between host memory and accelerators • code adaptation to a variety of computing platforms. Making the most of the hardware: • Speedup: accelerating time to results for complex algorithms • Green Computing: optimizing algorithms to reduce energy consumption • Scalability: from single nodes to clusters. Bottom line: lowering TCO through various optimizations (performance, energy efficiency, accuracy of calculations). More at: byteLAKE.com/en/HPC
  • 21. Products and Services: Cognitive Automation, Edge AI, Services, HPC, Products, CFD Suite, brainello, Ewa Guard, Federated Learning, Green Computing (FPGA, GPU), Intelligent Restaurant, Incubation
  • 22. byteLAKE among top AI companies in Poland! "It contains information on practically all meaningful companies operating in Poland which offer services or products in the field of modern technologies. We believe this map will be necessary to help both domestic and international investors looking for interesting projects in Poland.", Aleksander Kutela, President of Digital Poland Foundation