NVIDIA Expertise
AI Solutions for Industries
• In-depth understanding of NVIDIA hardware and software ecosystems:
– Expertise in various NVIDIA architectures, including Fermi, Kepler (K80, GeForce GTX Titan,
Jetson), Maxwell (GeForce GTX 980), Pascal (P100), Volta (V100), Ampere (A100), Hopper (H100), and Turing (T4).
– Mastery of GPU programming models such as CUDA, OpenCL, and OpenACC.
– Extensive experience optimizing solutions for NVIDIA GPUs.
– Comprehensive knowledge spanning desktop, mobile, and server environments.
• Demonstrated success through multiple case studies:
– AI training (machine learning and deep learning)
– Edge AI inferencing
– Classic HPC simulations (Computational Fluid Dynamics, weather simulations)
• Active participation in cutting-edge research with numerous publications in prestigious
journals, including Concurrency and Computation: Practice and Experience, Parallel
Computing, and the Journal of Supercomputing.
Expertise Across NVIDIA Architectures
and Configurations
Performance & Scalability
AI
Always optimized for the latest hardware available.
AI training optimized for edge servers and multi-node HPC architectures.
Edge Servers and HPC
Cost-Efficient Scalable Solution
Maximum Performance
Quick deployment & no external dependencies.
On-premises
Example configurations.
Other options available.
Hardware
ThinkSystem SR650V2
ThinkEdge SE350 ThinkEdge SE450
Workstations ThinkSystem SR670V2
AI Predictions AI Training
Edge HPC Server Performance: AI training
➢ Lenovo’s ThinkEdge SE450
➢ 2 NVIDIA A100 80GB Tensor Core GPUs
➢ Intel Xeon Gold 6330N CPU
➢ Download the full report
HPC Simulations optimized by Machine Learning
automatic adaptation of algorithm to a specific hardware architecture
• Enables software portability between different architectures:
– CPU: different numbers of cores, memory hierarchies, cache sizes;
– GPU: register file reuse, shared memory utilization, GPUDirect support, fewer global memory transactions;
– HPC: selecting the right number of nodes, scalability estimation, overlapping data transfers & communication;
– Hybrid: load balancing
(i.e. assigning parts of the algorithm to devices that execute the code at different speeds)
• Helps build adaptable algorithms:
– automatically selecting the size of data blocks, number of threads, number of processes, precision of data
(e.g. depending on the algorithm or input data characteristics, some data can be stored in double, single, or half precision)
– selecting the optimization criterion: performance, energy consumption, accuracy of results, or a mix
(e.g. a mix of performance & energy: minimize energy while keeping execution time, or optimize performance within an energy budget)
• Can auto-configure the system and provide the most suitable compiler flags
byteLAKE’s Software Autotuning
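The selection loop described above can be sketched as follows. This is a minimal illustration, not byteLAKE's autotuner: the search space, the `toy_measure` cost model, and all names are assumptions introduced here, and a real tuner would run and time the actual kernel instead of evaluating a formula.

```python
from itertools import product


def autotune(search_space, measure, criterion):
    """Exhaustively try every configuration and keep the lowest-scoring one."""
    names = list(search_space)
    best_cfg, best_score = None, float("inf")
    for values in product(*(search_space[n] for n in names)):
        cfg = dict(zip(names, values))
        time_s, energy_j = measure(cfg)
        score = criterion(time_s, energy_j)
        if score < best_score:  # strict: the first of several ties wins
            best_cfg, best_score = cfg, score
    return best_cfg, best_score


def toy_measure(cfg):
    """Hypothetical stand-in for running the kernel and reading a power meter."""
    # Blocks larger than 64 give no further benefit in this made-up model.
    time_s = 1024.0 / (cfg["threads"] * min(cfg["block"], 64)) + 0.3 * cfg["threads"]
    power_w = 50.0 + 20.0 * cfg["threads"]
    return time_s, time_s * power_w


space = {"block": [16, 32, 64, 128], "threads": [1, 2, 4, 8]}

# Criterion 1: execution time only. Criterion 2: energy only.
fastest, _ = autotune(space, toy_measure, lambda t, e: t)
greenest, _ = autotune(space, toy_measure, lambda t, e: e)
print(fastest, greenest)
```

In this toy model the time-only criterion selects 8 threads while the energy-only criterion selects 4 threads, which is the kind of trade-off a mixed criterion has to balance.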
• Goal: Reducing the energy consumption of the MPDATA algorithm (an algorithm for numerical simulation of
geophysical fluid flows on micro-to-planetary scales, used especially in numerical weather prediction).
• Hardware: Piz Daint supercomputer (ranked 3rd on the TOP500 list), equipped with the most advanced Pascal-based
GPUs: NVIDIA Tesla P100.
• Idea: Applying mixed-precision arithmetic: perform part of the operations in single precision (32 bits)
and the rest in double precision (64 bits).
• Why do we use it? A single simulation of the weather phenomenon needed more than 10^13 operations. We
suspected that not all of them need double-precision arithmetic to preserve the same simulation accuracy. We
believe that controlling the precision and accuracy of numerical results can increase performance, decrease
energy consumption, and still provide highly accurate results.
• Solution: We used unsupervised learning to estimate the correlation between the precision of each matrix and
its influence on the criteria (energy, accuracy of results). During a short, dynamic training stage we identified
the set of operations that could be performed in single precision without loss of accuracy in the weather
simulation.
Research Case: (concluded)
CFD acceleration with GPU (MPDATA, weather forecast)
Results: We reduced energy consumption by 33% and increased
performance by a factor of 1.27 using 25% fewer GPUs,
while keeping the accuracy of the results at the same level
as with double-precision arithmetic.
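To make the idea of selective demotion concrete, the following sketch greedily tests which arrays can be stored in single precision while the result stays within a tolerance of the all-double reference. It is a simplified stand-in: the study used unsupervised learning, whereas this example uses a plain greedy trial, and the kernel, the array names, and the tolerances are hypothetical.

```python
import struct


def to_f32(x):
    """Round a Python float (IEEE-754 double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def kernel(arrays, single):
    """Toy computation: sum all arrays; arrays named in `single` are rounded to float32 first."""
    total = 0.0
    for name, data in arrays.items():
        if name in single:
            data = [to_f32(v) for v in data]
        total += sum(data)
    return total


def choose_single_precision(arrays, rel_tol):
    """Greedily demote arrays to single precision while the relative error stays within rel_tol."""
    reference = kernel(arrays, single=set())
    single = set()
    for name in arrays:
        trial = single | {name}
        if abs(kernel(arrays, trial) - reference) <= rel_tol * abs(reference):
            single = trial
    return single


# Hypothetical data: "velocity" values are exactly representable in float32,
# "pressure" values are not.
arrays = {"velocity": [0.5, 0.25, 1.5], "pressure": [0.1, 0.2, 0.3]}
strict = choose_single_precision(arrays, rel_tol=1e-10)
loose = choose_single_precision(arrays, rel_tol=1e-6)
print(strict, loose)
```

With the strict tolerance only the exactly representable array is demoted; with the looser tolerance both arrays can be stored in single precision.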
Research Case:
Reconfiguring HPC Simulation with AI to optimize performance and energy
Tunable parameters: node count, accelerators per node, memory alignment, streams count, buffering types, CPU cores, memory policy, …
Search space: 1 000 000 000 possible configurations, pruned by Artificial Intelligence to ca. 5000 possible configurations.
We developed a Machine Learning module to select the best-fitting configuration.
Among other methods, the module uses supervised learning with the random forest algorithm.
Its main function is to prune the search space by eliminating the worst configurations.
In this way we obtain a small set that contains the best configuration with 90% probability.
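The pruning idea can be illustrated with a small surrogate model: measure a few configurations, fit an ensemble of trees on them, and keep only the candidates with the lowest predicted cost. The sketch below is a heavily simplified stand-in for a random forest (bootstrap-sampled depth-1 regression trees, no feature subsampling), written from scratch so it has no dependencies. The parameter grid, `toy_cost`, and all function names are assumptions made for this example, not byteLAKE's module.

```python
import random
from itertools import product


def fit_stump(rows, costs):
    """Fit a depth-1 regression tree: the single threshold split with the lowest squared error."""
    best = None
    for f in range(len(rows[0])):
        for thr in sorted({r[f] for r in rows}):
            left = [c for r, c in zip(rows, costs) if r[f] <= thr]
            right = [c for r, c in zip(rows, costs) if r[f] > thr]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = sum((c - lm) ** 2 for c in left) + sum((c - rm) ** 2 for c in right)
            if best is None or err < best[0]:
                best = (err, f, thr, lm, rm)
    if best is None:  # no valid split (all rows identical): always predict the mean
        mean = sum(costs) / len(costs)
        return (0, float("inf"), mean, mean)
    return best[1:]


def predict_stump(stump, row):
    f, thr, left_mean, right_mean = stump
    return left_mean if row[f] <= thr else right_mean


def fit_forest(rows, costs, n_trees, rng):
    """Train each stump on a bootstrap sample of the measured configurations."""
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(fit_stump([rows[i] for i in idx], [costs[i] for i in idx]))
    return forest


def prune(candidates, forest, keep):
    """Keep the `keep` candidates with the lowest average predicted cost."""
    def predicted(row):
        return sum(predict_stump(s, row) for s in forest) / len(forest)
    return sorted(candidates, key=predicted)[:keep]


def toy_cost(cfg):
    """Hypothetical cost model standing in for a real measurement."""
    nodes, streams, block = cfg
    return 100.0 / nodes + 5.0 / streams + abs(block - 64) / 16.0


rng = random.Random(0)
# Hypothetical grid: (node count, streams count, block size)
candidates = list(product([1, 2, 4, 8], [1, 2, 4], [16, 32, 64, 128]))
measured = rng.sample(candidates, 16)  # only a subset is actually "run"
forest = fit_forest(measured, [toy_cost(c) for c in measured], n_trees=25, rng=rng)
shortlist = prune(candidates, forest, keep=10)
print(shortlist)
```

Only the shortlisted configurations would then be measured for real, which is what makes a search space of this kind tractable.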
MPDATA Accelerated
CFD / Advection algorithm optimized for heterogeneous
computing. For Nvidia GPUs and Xilinx Alveo FPGAs.
• MPDATA
(Multidimensional Positive Definite Advection Transport Algorithm)
– main part of the dynamic core of the Eulerian/
semi-Lagrangian (EULAG) model
– EULAG (MPDATA+elliptic solver) is the established computational model,
developed for simulating thermo-fluid flows across a wide range of scales
and physical scenarios
– currently, this model is being implemented as the new dynamic core of the COSMO
(Consortium for Small-scale Modeling) weather prediction framework
– advection (together with the elliptic solver) is a key part of many frameworks that allow
users to implement their simulations
• Advection
– movement of some material (dissolved or suspended) in the fluid.
Algorithm: Advection (MPDATA)
General Information
• Easy to integrate
– Can work as a standalone application or be called as a function via our dedicated interface
(e.g. can be called as a function with input and output arrays)
– Compatible with frameworks like TensorFlow for integrating deep learning with CFD codes
• Easy to visualize the results
– Results can be stored in a raw format as a binary file of the output arrays or converted via
byteLAKE tools to a ParaView format
• See benefits already in 1-node HPC configurations
– Strongly adapted to Alveo U250, where a single card supports arrays of up to 2.1 Gcells
(max compute domain: 1264 x 1264 x 1264), ~60 GB
• Scalable to many cards per node and many nodes
Algorithm: Advection (MPDATA)
byteLAKE’s implementation compatibility
• First-order-accurate step of the advection scheme.
Second-order is an option.
• Input data
– Array X – non-diffusive quantity
(e.g. temperature of water vapor, ice, precipitation, etc.)
– Arrays V1, V2, V3 - each of them stores the velocity vectors in one direction
– (optional) Arrays Fi, Fe - implosion and explosion forces acting on a structure of X
– (optional) Array D with density
– (optional) Array rho which defines an interface for the coupling of COSMO and EULAG dynamic core
(used to provide the transformation of the X variable)
– DT – time step (scalar)
• Output data
– single X array that was updated in the given time step
Algorithm: Advection (MPDATA)
Technical Information
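For orientation, the first-order step of MPDATA is the classic donor-cell (upwind) scheme. The sketch below applies one such step on a periodic 1D grid, using a quantity `x`, face velocities `v`, and a time step `dt` in the roles of the X, V1, and DT inputs listed above. It is an illustrative simplification (1D, uniform spacing `dx`, no density or forcing terms, no antidiffusive correction pass) and does not reflect byteLAKE's actual interface.

```python
def upwind_step(x, v, dt, dx=1.0):
    """One first-order donor-cell (upwind) advection step on a periodic 1D grid.

    x[i] is the advected quantity in cell i.
    v[i] is the velocity at the face between cell i-1 and cell i.
    """
    n = len(x)

    def flux(i):
        # Flux through face i, taken from the upstream ("donor") cell.
        left, right = x[(i - 1) % n], x[i % n]
        courant = v[i % n] * dt / dx
        return max(courant, 0.0) * left + min(courant, 0.0) * right

    # Conservative update: each cell changes by inflow minus outflow.
    return [x[i] - (flux(i + 1) - flux(i)) for i in range(n)]


x0 = [0.0, 0.0, 1.0, 0.0, 0.0]
v1 = [0.5] * 5  # uniform velocity to the right, Courant number 0.5
x1 = upwind_step(x0, v1, dt=1.0)
print(x1)
```

The scheme keeps non-negative input non-negative as long as the Courant number stays at or below 1, and on a periodic grid the total amount of the quantity is conserved.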
• Applications include
– To characterize the sub-grid scales effect in global numerical simulations
of turbulent stellar interiors
– To compare anelastic and compressible convection-permitting weather forecasts
for the Alpine region
– Modeling the prediction of forest fire spread
– Flood simulations
– Biomechanical modeling of brain injuries within the Voigt model
(a linear system of differential equations where the motion of the brain tissue depends
merely on the balance between viscous and elastic forces)
– Simulation of gravity wave turbulence in the Earth's atmosphere
– Simulation of geophysical turbulence in the Earth's atmosphere
– Ocean modeling: simulation of three-dimensional solitary wave generation and
propagation using EULAG coupled to the barotropic NCOM (Navy Coastal Ocean Model)
tidal model
Applications of Advection (MPDATA)
• Applications include (cont.)
– Oil and Gas: provides a significant return on investment (ROI) in seismic analysis,
reservoir modelling and basin modelling. Used also to monitor drilling and seismic data
to optimize drilling trajectories and minimize environmental risk.
– AgriTech: models to track and predict various environmental impacts on crop yield such
as weather changes. For example, daily weather predictions can be customized based on
the needs of each client and range from hyperlocal to global.
• Example adopters
– Poznan Supercomputing and Networking Center, Poland: prognosis of air pollution
– European Centre for Medium-Range Weather Forecasts, UK: weather forecast
– Institute of Meteorology and Water Management, Poland: weather forecasts
– German Aerospace Center: aeronautics, transport and energy areas
– University of Cape Town, South Africa: weather simulation
– Montreal University: weather simulation
– Warsaw University: ocean simulation
Applications of Advection (MPDATA), cont.
Full list
12x better performance
30% reduced energy consumption
• Our solution: machine learning managed, dynamic application of mixed precision
• Highlights:
– Dynamic estimation of the algorithm’s power consumption as a function of the
frequency of the processor and the number of cores.
– Energy-aware task management
– Auto-tuning procedure that takes algorithm-specific and GPU-specific parameters into account
for auto-configuration.
– Result: better performance, less energy consumed.
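A power estimate expressed as a function of processor frequency and core count can drive the choice of an operating point. The sketch below assumes a common textbook approximation (static power plus per-core dynamic power growing with the cube of frequency, and ideal scaling of run time); the coefficients, the function names, and the deadline constraint are illustrative assumptions, not the model used in the work described here.

```python
def energy_joules(work, freq_ghz, cores, p_static=40.0, k_dyn=2.0):
    """Estimated energy for `work` units, assuming ideal scaling with frequency and cores."""
    time_s = work / (freq_ghz * cores)
    # Assumed power model: static part plus per-core dynamic part ~ f^3.
    power_w = p_static + cores * k_dyn * freq_ghz ** 3
    return time_s * power_w


def best_setting(work, freqs, core_counts, deadline_s):
    """Return (energy, frequency, cores) with the lowest energy that meets the deadline, or None."""
    feasible = [
        (energy_joules(work, f, c), f, c)
        for f in freqs
        for c in core_counts
        if work / (f * c) <= deadline_s
    ]
    return min(feasible) if feasible else None


relaxed = best_setting(100.0, [1.0, 2.0, 3.0], [4, 8], deadline_s=10.0)
tight = best_setting(100.0, [1.0, 2.0, 3.0], [4, 8], deadline_s=5.0)
print(relaxed, tight)
```

Under these assumptions the relaxed deadline is met most cheaply at a medium frequency on all cores, while the tight deadline forces the highest frequency and a higher energy cost.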
Weather engine optimized for Europe’s fastest
supercomputer (Piz Daint)
Our mechanism provides energy savings of up to 1.43x
compared to the default Linux scaling governor.
Dynamic Mixed Precision, cont.
We reduced energy consumption by 33% and increased performance by a factor of 1.27 using 25% fewer GPUs.
We kept the accuracy of the results at the same level as when using double precision arithmetic.
Dynamic Mixed Precision
Optimize execution time
• Ported a geophysical model
(EULAG) to a parallel
supercomputer
architecture (Piz Daint)
• Used Machine Learning
(Random Forest) to
optimize various numerical
parameters such as data block
sizes, number of GPU
streams, and sizes of vector data
types
Optimize energy efficiency
• Developed a mechanism
(mixed precision) that
lowers the energy
consumption of
supercomputers while keeping
code performance at the
highest possible level
• Developed a framework
based on a software
autotuning approach
Results
✓ 10 times faster
✓ Then we improved it
even more, reaching an additional
speed-up of 1.27x
✓ Energy consumption
reduced by 33%
✓ Optimized GPU usage
while keeping the accuracy
of computations
Highlights:
• C++, CUDA, MPI, OpenMP
Advanced quality inspection and data insights
AI for Manufacturing, Automotive, Paper, Chemical, and Energy sectors.
AI self-checkout stations for Restaurants and object recognition for Retail businesses.
AI Solutions
for Industries
for Manufacturing for Automotive for Paper Industry Data Insights
Cognitive Services
Advanced quality inspection and data insights.
Cognitive Services for Restaurants
Self-checkout and object recognition.
CFD Suite
AI-accelerated Computational Fluid Dynamics.
Predictive Maintenance
Featured Products
Custom AI Development
AI Services
AI Workshop Edge AI Cognitive Automation HPC
Incubation
Intel Expertise Alveo Expertise
NVIDIA Expertise
brainello Ewa Guard
Document Processing Forestry & Agriculture
+48 508 091 885
+48 505 322 282
welcome@byteLAKE.com
What is AI?
AI
➢ LinkedIn.com/company/byteLAKE
➢ X.com/byteLAKEGlobal
➢ FB.com/byteLAKE/
➢ byteLAKE.com/en/YouTube
➢ Blog
Partners & Clients
“AI already plays a very important role in our daily lives. […]
The application of the Intel® Distribution of OpenVINO™ toolkit
in byteLAKE’s Cognitive Services shows that AI works efficiently
as an actual tool for optimizing company operations. Moreover,
such a combination reduces the barrier of necessary upgrades to IT
infrastructure [...],” said Krzysztof Jonak,
EMEA Territory Sales Director, Intel.
“We’re also working with a number of partners on AI initiatives
that will provide real world solutions for customers. […]
Our collaboration with partners such as Intel, NVIDIA, Mark III
systems, and byteLAKE greatly expands the resources and
expertise we’re able to provide,” said Dr. Bhushan Desam,
Lenovo’s AI Global Business Leader, HPC and AI Business.
Our Research Studies
GO PUBLIC!
More at: byteLAKE.com/en/research
AI Deployment Plan
Case Studies
Science + business + industry know-how
& Partners
Meet byteLAKE
AI Solutions for Industries |
Quality Inspection |
Data Insights |
AI-accelerated CFD |
Self-Checkout
Products:
CFD Suite Cognitive Services
www.byteLAKE.com
Headquartered in Poland
Empowering Industries with Artificial Intelligence
Solutions.
At byteLAKE, we harness cutting-edge technology to
provide advanced quality inspection and data insights
tailored for the Manufacturing, Automotive, Paper,
Chemical, and Energy sectors.
Additionally, we offer self-checkout stations for
Restaurants and object recognition solutions for Retail
businesses.

More Related Content

Similar to byteLAKE's expertise across NVIDIA architectures and configurations

The CAOS framework: democratize the acceleration of compute intensive applica...
The CAOS framework: democratize the acceleration of compute intensive applica...The CAOS framework: democratize the acceleration of compute intensive applica...
The CAOS framework: democratize the acceleration of compute intensive applica...NECST Lab @ Politecnico di Milano
 
A Container-based Sizing Framework for Apache Hadoop/Spark Clusters
A Container-based Sizing Framework for Apache Hadoop/Spark ClustersA Container-based Sizing Framework for Apache Hadoop/Spark Clusters
A Container-based Sizing Framework for Apache Hadoop/Spark ClustersDataWorks Summit/Hadoop Summit
 
Development and Applications of Distributed IoT Sensors for Intermittent Conn...
Development and Applications of Distributed IoT Sensors for Intermittent Conn...Development and Applications of Distributed IoT Sensors for Intermittent Conn...
Development and Applications of Distributed IoT Sensors for Intermittent Conn...InfluxData
 
Expectations for optical network from the viewpoint of system software research
Expectations for optical network from the viewpoint of system software researchExpectations for optical network from the viewpoint of system software research
Expectations for optical network from the viewpoint of system software researchRyousei Takano
 
OpenACC Monthly Highlights: October2020
OpenACC Monthly Highlights: October2020OpenACC Monthly Highlights: October2020
OpenACC Monthly Highlights: October2020OpenACC
 
Targeting GPUs using OpenMP Directives on Summit with GenASiS: A Simple and...
Targeting GPUs using OpenMP  Directives on Summit with  GenASiS: A Simple and...Targeting GPUs using OpenMP  Directives on Summit with  GenASiS: A Simple and...
Targeting GPUs using OpenMP Directives on Summit with GenASiS: A Simple and...Ganesan Narayanasamy
 
Performance Analysis of Lattice QCD on GPUs in APGAS Programming Model
Performance Analysis of Lattice QCD on GPUs in APGAS Programming ModelPerformance Analysis of Lattice QCD on GPUs in APGAS Programming Model
Performance Analysis of Lattice QCD on GPUs in APGAS Programming ModelKoichi Shirahata
 
Barcelona Supercomputing Center, Generador de Riqueza
Barcelona Supercomputing Center, Generador de RiquezaBarcelona Supercomputing Center, Generador de Riqueza
Barcelona Supercomputing Center, Generador de RiquezaFacultad de Informática UCM
 
Achitecture Aware Algorithms and Software for Peta and Exascale
Achitecture Aware Algorithms and Software for Peta and ExascaleAchitecture Aware Algorithms and Software for Peta and Exascale
Achitecture Aware Algorithms and Software for Peta and Exascaleinside-BigData.com
 
AI optimizing HPC simulations (presentation from 6th EULAG Workshop)
AI optimizing HPC simulations (presentation from  6th EULAG Workshop)AI optimizing HPC simulations (presentation from  6th EULAG Workshop)
AI optimizing HPC simulations (presentation from 6th EULAG Workshop)byteLAKE
 
40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility
40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility
40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facilityinside-BigData.com
 
Large-Scale Optimization Strategies for Typical HPC Workloads
Large-Scale Optimization Strategies for Typical HPC WorkloadsLarge-Scale Optimization Strategies for Typical HPC Workloads
Large-Scale Optimization Strategies for Typical HPC Workloadsinside-BigData.com
 
Talk on commercialising space data
Talk on commercialising space data Talk on commercialising space data
Talk on commercialising space data Alison B. Lowndes
 
Multiscale Dataflow Computing: Competitive Advantage at the Exascale Frontier
Multiscale Dataflow Computing: Competitive Advantage at the Exascale FrontierMultiscale Dataflow Computing: Competitive Advantage at the Exascale Frontier
Multiscale Dataflow Computing: Competitive Advantage at the Exascale Frontierinside-BigData.com
 
High Performance & High Throughput Computing - EUDAT Summer School (Giuseppe ...
High Performance & High Throughput Computing - EUDAT Summer School (Giuseppe ...High Performance & High Throughput Computing - EUDAT Summer School (Giuseppe ...
High Performance & High Throughput Computing - EUDAT Summer School (Giuseppe ...EUDAT
 
CloudLightning and the OPM-based Use Case
CloudLightning and the OPM-based Use CaseCloudLightning and the OPM-based Use Case
CloudLightning and the OPM-based Use CaseCloudLightning
 
AWS re:Invent 2016: High Performance Computing on AWS (CMP207)
AWS re:Invent 2016: High Performance Computing on AWS (CMP207)AWS re:Invent 2016: High Performance Computing on AWS (CMP207)
AWS re:Invent 2016: High Performance Computing on AWS (CMP207)Amazon Web Services
 

Similar to byteLAKE's expertise across NVIDIA architectures and configurations (20)

The CAOS framework: democratize the acceleration of compute intensive applica...
The CAOS framework: democratize the acceleration of compute intensive applica...The CAOS framework: democratize the acceleration of compute intensive applica...
The CAOS framework: democratize the acceleration of compute intensive applica...
 
Cliff sugerman
Cliff sugermanCliff sugerman
Cliff sugerman
 
A Container-based Sizing Framework for Apache Hadoop/Spark Clusters
A Container-based Sizing Framework for Apache Hadoop/Spark ClustersA Container-based Sizing Framework for Apache Hadoop/Spark Clusters
A Container-based Sizing Framework for Apache Hadoop/Spark Clusters
 
Development and Applications of Distributed IoT Sensors for Intermittent Conn...
Development and Applications of Distributed IoT Sensors for Intermittent Conn...Development and Applications of Distributed IoT Sensors for Intermittent Conn...
Development and Applications of Distributed IoT Sensors for Intermittent Conn...
 
Expectations for optical network from the viewpoint of system software research
Expectations for optical network from the viewpoint of system software researchExpectations for optical network from the viewpoint of system software research
Expectations for optical network from the viewpoint of system software research
 
OpenACC Monthly Highlights: October2020
OpenACC Monthly Highlights: October2020OpenACC Monthly Highlights: October2020
OpenACC Monthly Highlights: October2020
 
Targeting GPUs using OpenMP Directives on Summit with GenASiS: A Simple and...
Targeting GPUs using OpenMP  Directives on Summit with  GenASiS: A Simple and...Targeting GPUs using OpenMP  Directives on Summit with  GenASiS: A Simple and...
Targeting GPUs using OpenMP Directives on Summit with GenASiS: A Simple and...
 
Performance Analysis of Lattice QCD on GPUs in APGAS Programming Model
Performance Analysis of Lattice QCD on GPUs in APGAS Programming ModelPerformance Analysis of Lattice QCD on GPUs in APGAS Programming Model
Performance Analysis of Lattice QCD on GPUs in APGAS Programming Model
 
Barcelona Supercomputing Center, Generador de Riqueza
Barcelona Supercomputing Center, Generador de RiquezaBarcelona Supercomputing Center, Generador de Riqueza
Barcelona Supercomputing Center, Generador de Riqueza
 
Green scheduling
Green schedulingGreen scheduling
Green scheduling
 
Achitecture Aware Algorithms and Software for Peta and Exascale
Achitecture Aware Algorithms and Software for Peta and ExascaleAchitecture Aware Algorithms and Software for Peta and Exascale
Achitecture Aware Algorithms and Software for Peta and Exascale
 
AI optimizing HPC simulations (presentation from 6th EULAG Workshop)
AI optimizing HPC simulations (presentation from  6th EULAG Workshop)AI optimizing HPC simulations (presentation from  6th EULAG Workshop)
AI optimizing HPC simulations (presentation from 6th EULAG Workshop)
 
40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility
40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility
40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility
 
Large-Scale Optimization Strategies for Typical HPC Workloads
Large-Scale Optimization Strategies for Typical HPC WorkloadsLarge-Scale Optimization Strategies for Typical HPC Workloads
Large-Scale Optimization Strategies for Typical HPC Workloads
 
Talk on commercialising space data
Talk on commercialising space data Talk on commercialising space data
Talk on commercialising space data
 
Multiscale Dataflow Computing: Competitive Advantage at the Exascale Frontier
Multiscale Dataflow Computing: Competitive Advantage at the Exascale FrontierMultiscale Dataflow Computing: Competitive Advantage at the Exascale Frontier
Multiscale Dataflow Computing: Competitive Advantage at the Exascale Frontier
 
High Performance & High Throughput Computing - EUDAT Summer School (Giuseppe ...
High Performance & High Throughput Computing - EUDAT Summer School (Giuseppe ...High Performance & High Throughput Computing - EUDAT Summer School (Giuseppe ...
High Performance & High Throughput Computing - EUDAT Summer School (Giuseppe ...
 
CloudLightning and the OPM-based Use Case
CloudLightning and the OPM-based Use CaseCloudLightning and the OPM-based Use Case
CloudLightning and the OPM-based Use Case
 
System mldl meetup
System mldl meetupSystem mldl meetup
System mldl meetup
 
AWS re:Invent 2016: High Performance Computing on AWS (CMP207)
AWS re:Invent 2016: High Performance Computing on AWS (CMP207)AWS re:Invent 2016: High Performance Computing on AWS (CMP207)
AWS re:Invent 2016: High Performance Computing on AWS (CMP207)
 

More from byteLAKE

byteLAKE's AI Products (use cases) (short)
byteLAKE's AI Products (use cases) (short)byteLAKE's AI Products (use cases) (short)
byteLAKE's AI Products (use cases) (short)byteLAKE
 
byteLAKE's AI Products (use cases) - presentation
byteLAKE's AI Products (use cases) - presentationbyteLAKE's AI Products (use cases) - presentation
byteLAKE's AI Products (use cases) - presentationbyteLAKE
 
byteLAKE's AI Products for Industries (2024-02)
byteLAKE's AI Products for Industries (2024-02)byteLAKE's AI Products for Industries (2024-02)
byteLAKE's AI Products for Industries (2024-02)byteLAKE
 
byteLAKE's CFD Suite (AI-accelerated CFD) (2024-02)
byteLAKE's CFD Suite (AI-accelerated CFD) (2024-02)byteLAKE's CFD Suite (AI-accelerated CFD) (2024-02)
byteLAKE's CFD Suite (AI-accelerated CFD) (2024-02)byteLAKE
 
AI Solutions for Industries | Quality Inspection | Data Insights | Predictive...
AI Solutions for Industries | Quality Inspection | Data Insights | Predictive...AI Solutions for Industries | Quality Inspection | Data Insights | Predictive...
AI Solutions for Industries | Quality Inspection | Data Insights | Predictive...byteLAKE
 
Self-Checkout for Restaurants / AI Restaurants (2024-02)
Self-Checkout for Restaurants / AI Restaurants (2024-02)Self-Checkout for Restaurants / AI Restaurants (2024-02)
Self-Checkout for Restaurants / AI Restaurants (2024-02)byteLAKE
 
Self-Checkout (AI for Restautants) - case study by byteLAKE's partner: Simpra
Self-Checkout (AI for Restautants) - case study by byteLAKE's partner: SimpraSelf-Checkout (AI for Restautants) - case study by byteLAKE's partner: Simpra
Self-Checkout (AI for Restautants) - case study by byteLAKE's partner: SimprabyteLAKE
 
byteLAKE: Sztuczna Inteligencja dla Przemysłu i Usług
byteLAKE: Sztuczna Inteligencja dla Przemysłu i UsługbyteLAKE: Sztuczna Inteligencja dla Przemysłu i Usług
byteLAKE: Sztuczna Inteligencja dla Przemysłu i UsługbyteLAKE
 
Przegląd zastosowań sztucznej inteligencji (2024-01)
Przegląd zastosowań sztucznej inteligencji (2024-01)Przegląd zastosowań sztucznej inteligencji (2024-01)
Przegląd zastosowań sztucznej inteligencji (2024-01)byteLAKE
 
Przegląd zastosowań Sztucznej inteligencjI
Przegląd zastosowań Sztucznej inteligencjIPrzegląd zastosowań Sztucznej inteligencjI
Przegląd zastosowań Sztucznej inteligencjIbyteLAKE
 
AI Solutions for Industries
AI Solutions for IndustriesAI Solutions for Industries
AI Solutions for IndustriesbyteLAKE
 
AI-accelerated CFD (Computational Fluid Dynamics)
AI-accelerated CFD (Computational Fluid Dynamics)AI-accelerated CFD (Computational Fluid Dynamics)
AI-accelerated CFD (Computational Fluid Dynamics)byteLAKE
 
Advanced Quality Inspection and Data Insights (Artificial Intelligence)
Advanced Quality Inspection and Data Insights (Artificial Intelligence)Advanced Quality Inspection and Data Insights (Artificial Intelligence)
Advanced Quality Inspection and Data Insights (Artificial Intelligence)byteLAKE
 
AI Solutions for Industries (short)
AI Solutions for Industries (short)AI Solutions for Industries (short)
AI Solutions for Industries (short)byteLAKE
 
Self-Checkout (AI for Restautants)
Self-Checkout (AI for Restautants)Self-Checkout (AI for Restautants)
Self-Checkout (AI for Restautants)byteLAKE
 
Applying Industrial AI Models to Product Quality Inspection
Applying Industrial AI Models to Product Quality InspectionApplying Industrial AI Models to Product Quality Inspection
Applying Industrial AI Models to Product Quality InspectionbyteLAKE
 
byteLAKE and Intel Partnership
byteLAKE and Intel PartnershipbyteLAKE and Intel Partnership
byteLAKE and Intel PartnershipbyteLAKE
 
CFD Suite (AI-accelerated CFD) - Sztuczna Inteligencja Przyspiesza Symulacje ...
CFD Suite (AI-accelerated CFD) - Sztuczna Inteligencja Przyspiesza Symulacje ...CFD Suite (AI-accelerated CFD) - Sztuczna Inteligencja Przyspiesza Symulacje ...
CFD Suite (AI-accelerated CFD) - Sztuczna Inteligencja Przyspiesza Symulacje ...byteLAKE
 
byteLAKE's Scan&GO - Self-Check-Out Solution for Retail (EuroShop'23)
byteLAKE's Scan&GO - Self-Check-Out Solution for Retail (EuroShop'23)byteLAKE's Scan&GO - Self-Check-Out Solution for Retail (EuroShop'23)
byteLAKE's Scan&GO - Self-Check-Out Solution for Retail (EuroShop'23)byteLAKE
 
Empowering Industries with byteLAKE's High-Performance AI
Empowering Industries with byteLAKE's High-Performance AIEmpowering Industries with byteLAKE's High-Performance AI
Empowering Industries with byteLAKE's High-Performance AIbyteLAKE
 

More from byteLAKE (20)

byteLAKE's AI Products (use cases) (short)
byteLAKE's AI Products (use cases) (short)byteLAKE's AI Products (use cases) (short)
byteLAKE's AI Products (use cases) (short)
 
byteLAKE's AI Products (use cases) - presentation
byteLAKE's AI Products (use cases) - presentationbyteLAKE's AI Products (use cases) - presentation
byteLAKE's AI Products (use cases) - presentation
 
byteLAKE's AI Products for Industries (2024-02)
byteLAKE's AI Products for Industries (2024-02)byteLAKE's AI Products for Industries (2024-02)
byteLAKE's AI Products for Industries (2024-02)
 
byteLAKE's CFD Suite (AI-accelerated CFD) (2024-02)
byteLAKE's CFD Suite (AI-accelerated CFD) (2024-02)byteLAKE's CFD Suite (AI-accelerated CFD) (2024-02)
byteLAKE's CFD Suite (AI-accelerated CFD) (2024-02)
 
AI Solutions for Industries | Quality Inspection | Data Insights | Predictive...
AI Solutions for Industries | Quality Inspection | Data Insights | Predictive...AI Solutions for Industries | Quality Inspection | Data Insights | Predictive...
AI Solutions for Industries | Quality Inspection | Data Insights | Predictive...
 
Self-Checkout for Restaurants / AI Restaurants (2024-02)
Self-Checkout for Restaurants / AI Restaurants (2024-02)Self-Checkout for Restaurants / AI Restaurants (2024-02)
Self-Checkout for Restaurants / AI Restaurants (2024-02)
 
Self-Checkout (AI for Restautants) - case study by byteLAKE's partner: Simpra
Self-Checkout (AI for Restautants) - case study by byteLAKE's partner: SimpraSelf-Checkout (AI for Restautants) - case study by byteLAKE's partner: Simpra
Self-Checkout (AI for Restautants) - case study by byteLAKE's partner: Simpra
 
byteLAKE: Sztuczna Inteligencja dla Przemysłu i Usług
byteLAKE: Sztuczna Inteligencja dla Przemysłu i UsługbyteLAKE: Sztuczna Inteligencja dla Przemysłu i Usług
byteLAKE: Sztuczna Inteligencja dla Przemysłu i Usług
 
Przegląd zastosowań sztucznej inteligencji (2024-01)
Przegląd zastosowań sztucznej inteligencji (2024-01)Przegląd zastosowań sztucznej inteligencji (2024-01)
Przegląd zastosowań sztucznej inteligencji (2024-01)
 
Przegląd zastosowań Sztucznej inteligencjI
Przegląd zastosowań Sztucznej inteligencjIPrzegląd zastosowań Sztucznej inteligencjI
Przegląd zastosowań Sztucznej inteligencjI
 
AI Solutions for Industries
AI Solutions for IndustriesAI Solutions for Industries
AI Solutions for Industries
 
AI-accelerated CFD (Computational Fluid Dynamics)
AI-accelerated CFD (Computational Fluid Dynamics)AI-accelerated CFD (Computational Fluid Dynamics)
AI-accelerated CFD (Computational Fluid Dynamics)
 
Advanced Quality Inspection and Data Insights (Artificial Intelligence)
Advanced Quality Inspection and Data Insights (Artificial Intelligence)Advanced Quality Inspection and Data Insights (Artificial Intelligence)
Advanced Quality Inspection and Data Insights (Artificial Intelligence)
 
AI Solutions for Industries (short)
AI Solutions for Industries (short)AI Solutions for Industries (short)
AI Solutions for Industries (short)
 
Self-Checkout (AI for Restautants)
Self-Checkout (AI for Restautants)Self-Checkout (AI for Restautants)
Self-Checkout (AI for Restautants)
 
Applying Industrial AI Models to Product Quality Inspection
Applying Industrial AI Models to Product Quality InspectionApplying Industrial AI Models to Product Quality Inspection
Applying Industrial AI Models to Product Quality Inspection
 
byteLAKE and Intel Partnership
byteLAKE and Intel PartnershipbyteLAKE and Intel Partnership
byteLAKE and Intel Partnership
 
CFD Suite (AI-accelerated CFD) - Sztuczna Inteligencja Przyspiesza Symulacje ...
CFD Suite (AI-accelerated CFD) - Sztuczna Inteligencja Przyspiesza Symulacje ...CFD Suite (AI-accelerated CFD) - Sztuczna Inteligencja Przyspiesza Symulacje ...
CFD Suite (AI-accelerated CFD) - Sztuczna Inteligencja Przyspiesza Symulacje ...
 
byteLAKE's Scan&GO - Self-Check-Out Solution for Retail (EuroShop'23)
byteLAKE's Scan&GO - Self-Check-Out Solution for Retail (EuroShop'23)byteLAKE's Scan&GO - Self-Check-Out Solution for Retail (EuroShop'23)
byteLAKE's Scan&GO - Self-Check-Out Solution for Retail (EuroShop'23)
 
Empowering Industries with byteLAKE's High-Performance AI
Empowering Industries with byteLAKE's High-Performance AIEmpowering Industries with byteLAKE's High-Performance AI
Empowering Industries with byteLAKE's High-Performance AI
 



byteLAKE's expertise across NVIDIA architectures and configurations

  • 2. Expertise Across NVIDIA Architectures and Configurations
    • In-depth understanding of NVIDIA hardware and software ecosystems:
      – Expertise in various NVIDIA architectures, including Fermi, Kepler (K80, GeForce GTX Titan, Jetson), Maxwell (NVIDIA GeForce GTX 980), Pascal (P100), Tesla V100/A100/H100, and T4.
      – Mastery of GPU programming languages such as CUDA, OpenCL, and OpenACC.
      – Extensive experience optimizing solutions for NVIDIA GPUs.
      – Comprehensive knowledge spanning desktop, mobile, and server environments.
    • Demonstrated success through multiple case studies:
      – AI training (machine learning and deep learning)
      – Edge AI inferencing
      – Classic HPC simulations (Computational Fluid Dynamics, weather simulations)
    • Active participation in cutting-edge research with numerous publications in prestigious journals, including Concurrency and Computation: Practice and Experience, Parallel Computing, and the Journal of Supercomputing.
  • 3. Performance & Scalability
    • AI: always optimized for the latest hardware available. AI training optimized for edge servers and multi-node HPC architectures.
    • Edge Servers and HPC: Cost Efficient; Scalable Solution; Maximum Performance.
    • On-premises: quick deployment & no external dependencies.
    • Hardware (example configurations, other options available): ThinkEdge SE350, ThinkEdge SE450, ThinkSystem SR650V2, ThinkSystem SR670V2, workstations. Roles shown: AI Predictions, AI Training.
    • Edge HPC Server Performance: AI training
      ➢ Lenovo’s ThinkEdge SE450
      ➢ 2 NVIDIA A100 80GB Tensor Core GPUs
      ➢ Intel Xeon Gold 6330N CPU
      ➢ Download the full report
  • 4. byteLAKE’s Software Autotuning
    HPC simulations optimized by Machine Learning: automatic adaptation of an algorithm to a specific hardware architecture.
    • Enables software portability between different architectures:
      – CPU: different number of cores, memory hierarchy, cache sizes;
      – GPU: register file reuse, shared memory utilization, GPUDirect support, reduction of global memory transactions;
      – HPC: selecting the right number of nodes, scalability estimation, overlapping data transfers & communication;
      – Hybrid: load balancing (i.e. selecting appropriate parts of the algorithm for different devices that execute the code with different performance).
    • Helps build adaptable algorithms:
      – automatically selecting the size of data blocks, number of threads, number of processes, and precision of data (i.e. depending on the algorithm or input data characteristics, some data can be stored in double, single, or half precision format);
      – selecting the optimization criterion: performance, energy consumption, accuracy of results, or a mix (i.e. a mix of performance & energy, either to minimize energy while keeping the execution time or to optimize performance while staying within an energy budget).
    • Can auto-configure the system and provide the most suitable compiler flags.
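The parameter selection described on this slide can be illustrated with a minimal empirical autotuner. This is only a sketch under stated assumptions, not byteLAKE's implementation: the toy kernel `blocked_sum`, the helper names, and the exhaustive search strategy are all hypothetical, and the slide does not say how the actual tool explores its parameters. The `cost` callback stands in for the optimization criterion (time here, but it could equally return measured energy or a weighted mix).

```python
import itertools
import time

def blocked_sum(data, block_size):
    """Toy kernel (hypothetical): sum a list in blocks of `block_size` elements."""
    total = 0.0
    for start in range(0, len(data), block_size):
        total += sum(data[start:start + block_size])
    return total

def measure_time(kernel, data, config, repeats=3):
    """Return the best wall-clock time of `kernel(data, **config)` over a few repeats."""
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        kernel(data, **config)
        best = min(best, time.perf_counter() - t0)
    return best

def autotune(search_space, cost):
    """Evaluate every configuration in the search space and return (best_config, best_cost).

    `search_space` maps a parameter name to its candidate values;
    `cost` maps a configuration dict to a number to be minimized.
    """
    names = sorted(search_space)
    best_config, best_cost = None, float("inf")
    for values in itertools.product(*(search_space[n] for n in names)):
        config = dict(zip(names, values))
        c = cost(config)
        if c < best_cost:
            best_config, best_cost = config, c
    return best_config, best_cost

# Usage: pick the block size with the lowest measured run time on this machine.
data = [float(i) for i in range(50_000)]
space = {"block_size": [64, 1024, 16384]}
config, seconds = autotune(space, lambda cfg: measure_time(blocked_sum, data, cfg))
print("selected configuration:", config)
```

Exhaustive search is only practical for small spaces like this one; for the configuration counts quoted later in the deck, some form of pruning is needed before measuring.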
  • 5. Research Case (concluded): CFD acceleration with GPU (MPDATA, weather forecast)
    • Goal: reducing the energy consumption of the MPDATA algorithm (an algorithm for the numerical simulation of geophysical fluid flows on micro-to-planetary scales, used especially in numerical weather prediction).
    • Hardware: the Piz Daint supercomputer (ranked 3rd on the TOP500 list), equipped with the most advanced Pascal-based GPUs: NVIDIA Tesla P100.
    • Idea: applying mixed precision arithmetic, i.e. performing part of the operations in single precision (32 bits) and the rest in double precision (64 bits).
    • Why do we use it? A single simulation of a weather phenomenon needed more than 10^13 operations. We suspected that not all of them need double precision arithmetic to preserve the same simulation accuracy. We believe that controlling the precision and accuracy of numerical results can increase performance and decrease energy consumption while still providing highly accurate results.
    • Solution: we used unsupervised learning to estimate the correlation between the precision of each matrix and its influence on the criteria (energy, accuracy of results). During a short, dynamic training stage we identified the set of operations that could be performed in single precision without loss of accuracy in the weather simulation.
    • Results: we reduced energy consumption by 33% and increased performance by a factor of 1.27x using 25% fewer GPUs, while keeping the accuracy of the results at the same level as with double precision arithmetic.
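The idea of demoting only those arrays whose precision does not affect the result can be sketched as follows. This is an illustration under loud assumptions: the slide describes an unsupervised-learning approach, whereas the sketch swaps in a plain greedy tolerance check; the kernel, the array names, and the tolerance values are hypothetical; and only the storage precision is modeled (values are rounded to 32 bits, the arithmetic itself stays in double precision).

```python
import struct

def to_single(x):
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def weighted_sum(values, weights, single=frozenset()):
    """Toy kernel: dot product; arrays named in `single` are first rounded to 32 bits."""
    v = [to_single(x) for x in values] if "values" in single else values
    w = [to_single(x) for x in weights] if "weights" in single else weights
    return sum(a * b for a, b in zip(v, w))

def choose_single_precision(kernel, array_names, tolerance):
    """Greedily demote arrays to single precision while the relative error stays within tolerance.

    `kernel` takes the set of demoted array names and returns the result;
    the all-double run (empty set) is used as the reference.
    """
    reference = kernel(frozenset())
    chosen = set()
    for name in array_names:
        trial = frozenset(chosen | {name})
        error = abs(kernel(trial) - reference) / abs(reference)
        if error <= tolerance:
            chosen.add(name)
    return chosen

values = [1.0, 2.0, 3.0]    # exactly representable in 32 bits
weights = [0.1, 0.2, 0.3]   # not exactly representable in 32 bits
kernel = lambda single: weighted_sum(values, weights, single)

strict = choose_single_precision(kernel, ["values", "weights"], tolerance=0.0)
relaxed = choose_single_precision(kernel, ["values", "weights"], tolerance=1e-6)
print(sorted(strict), sorted(relaxed))
```

With a zero tolerance only the exactly representable array is demoted; with a looser tolerance both are. A real study would measure accuracy on the simulation output and energy on the hardware rather than on a single dot product.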
  • 6. Research Case: Reconfiguring HPC Simulation with AI to optimize performance and energy
    • Configuration parameters: node count, accelerators per node, memory alignment, streams count, buffering types, …, CPU cores, memory policy.
    • Search space: 1 000 000 000 possible configurations, narrowed by the Artificial Intelligence module to ca. 5000 possible configurations.
    • We developed a Machine Learning module to select the most fitting configuration. Among other methods, the module uses supervised learning with the random forest algorithm.
    • The main function of the module is to prune the search space in order to eliminate the worst configurations. In this way we obtain a small set that contains the best configuration with 90% probability.
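The pruning step can be sketched with a surrogate model that is fitted on a few measured configurations and then used to rank the whole search space. Note the substitution: the slide names a random forest, while this sketch uses a much simpler additive surrogate (the average measured cost per parameter value) so that it runs on the standard library alone. The parameter names, the sample costs, and all function names are hypothetical.

```python
import itertools
from statistics import mean

def all_configs(space):
    """Enumerate every configuration of a {parameter: candidate values} search space."""
    names = sorted(space)
    return [dict(zip(names, vals))
            for vals in itertools.product(*(space[n] for n in names))]

def fit_additive_surrogate(samples):
    """Fit a per-parameter-value average cost model.

    `samples` is a list of (config, measured_cost) pairs.
    Returns a function that predicts a score for any configuration.
    """
    overall = mean(cost for _, cost in samples)
    observed = {}
    for config, cost in samples:
        for item in config.items():
            observed.setdefault(item, []).append(cost)
    averages = {item: mean(costs) for item, costs in observed.items()}

    def predict(config):
        # Parameter values never measured fall back to the overall mean.
        return sum(averages.get(item, overall) for item in config.items())

    return predict

def prune(space, samples, keep):
    """Keep the `keep` configurations with the lowest predicted cost."""
    predict = fit_additive_surrogate(samples)
    return sorted(all_configs(space), key=predict)[:keep]

space = {"streams": [1, 2, 4], "block": [32, 64]}
# Four of the six configurations were "measured" (invented numbers).
samples = [
    ({"streams": 1, "block": 32}, 8),
    ({"streams": 2, "block": 64}, 5),
    ({"streams": 4, "block": 32}, 7),
    ({"streams": 1, "block": 64}, 7),
]
shortlist = prune(space, samples, keep=2)
print(shortlist)
```

Only the shortlisted configurations would then be benchmarked in full, which is the point of pruning: most of the search space is discarded on the basis of cheap predictions.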
  • 7. MPDATA: Accelerated CFD / Advection algorithm optimized for heterogeneous computing. For NVIDIA GPUs and Xilinx Alveo FPGAs.
  • 8. Algorithm: Advection (MPDATA). General Information
    • MPDATA (Multidimensional Positive Definite Advection Transport Algorithm)
      – main part of the dynamic core of the Eulerian/semi-Lagrangian (EULAG) model
      – EULAG (MPDATA + elliptic solver) is the established computational model, developed for simulating thermo-fluid flows across a wide range of scales and physical scenarios
      – currently, this model is being implemented as the new dynamic core of the COSMO (Consortium for Small-scale Modeling) weather prediction framework
      – advection (together with the elliptic solver) is a key part of many frameworks that allow users to implement their simulations
    • Advection: movement of some material (dissolved or suspended) in the fluid.
  • 9. Algorithm: Advection (MPDATA). byteLAKE’s implementation compatibility
    • Easy to integrate
      – Can work as a standalone application or be called as a function via our dedicated interface (e.g. with input and output arrays)
      – Compatible with frameworks like TensorFlow for integrating deep learning with CFD codes
    • Easy to visualize the results
      – Results can be stored in a raw format as a binary file of the output arrays or converted via byteLAKE tools to a ParaView format
    • See benefits already in 1-node HPC configurations
      – Strongly adapted to the Alveo U250, where a single card supports the maximum array size: 2.1 Gcells (max compute domain: 1264 x 1264 x 1264), ~ 60 GB
    • Scalable to many cards per node and many nodes
  • 10. Algorithm: Advection (MPDATA). Technical Information
    • First-order-accurate step of the advection scheme. Second-order is an option.
    • Input data
      – Array X: non-diffusive quantity (e.g. temperature of water vapor, ice, precipitation, etc.)
      – Arrays V1, V2, V3: each of them stores the velocity vectors in one direction
      – (optional) Arrays Fi, Fe: implosion and explosion forces acting on a structure of X
      – (optional) Array D with density
      – (optional) Array rho, which defines an interface for the coupling of COSMO and the EULAG dynamic core (used to provide the transformation of the X variable)
      – DT: time step (scalar)
    • Output data
      – single X array that was updated in the given time step
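A first-order advection step with this kind of interface (array X, a velocity array, scalar DT in; updated X out) can be sketched in one dimension. The sketch assumes that the first-order step corresponds to the donor-cell (upwind) pass that MPDATA starts from in the literature; the slides do not show byteLAKE's code. The function names, the periodic boundary, the unit grid spacing `dx`, and the reduction to a single velocity array are all assumptions made for brevity, and the optional arrays (Fi, Fe, D, rho) are left out.

```python
def donor_cell_flux(left, right, courant):
    """Upwind flux through a cell face: take the value from the side the flow comes from."""
    return max(courant, 0.0) * left + min(courant, 0.0) * right

def upwind_step(x, v, dt, dx=1.0):
    """Advance the advected quantity `x` by one first-order (donor-cell) step.

    x[i] is the value in cell i; v[i] is the velocity at the face between
    cell i and cell i+1 (periodic domain). Returns the updated array.
    The step is stable and sign-preserving for Courant numbers |v*dt/dx| <= 1.
    """
    n = len(x)
    courant = [vi * dt / dx for vi in v]
    flux = [donor_cell_flux(x[i], x[(i + 1) % n], courant[i]) for i in range(n)]
    # flux[i - 1] wraps around to the last face for i == 0 (periodic boundary).
    return [x[i] - (flux[i] - flux[i - 1]) for i in range(n)]

x0 = [0.0, 0.0, 1.0, 0.0, 0.0]
x1 = upwind_step(x0, v=[0.5] * 5, dt=1.0)
print(x1)  # → [0.0, 0.0, 0.5, 0.5, 0.0]
```

The pulse moves half a cell to the right and is smeared over two cells, while the total amount is conserved and no value becomes negative. That smearing is the numerical diffusion of the first-order scheme, which a second-order option is meant to reduce.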
  • 11. Applications of Advection (MPDATA)
    • Applications include
      – Characterizing the effect of sub-grid scales in global numerical simulations of turbulent stellar interiors
      – Comparing anelastic and compressible convection-permitting weather forecasts for the Alpine region
      – Modeling and predicting forest fire spread
      – Flood simulations
      – Biomechanical modeling of brain injuries within the Voigt model (a linear system of differential equations where the motion of the brain tissue depends merely on the balance between viscous and elastic forces)
      – Simulation of gravity wave turbulence in the Earth's atmosphere
      – Simulation of geophysical turbulence in the Earth's atmosphere
      – Ocean modeling: simulation of three-dimensional solitary wave generation and propagation using EULAG coupled to the barotropic NCOM (Navy Coastal Ocean Model) tidal model
  • 12. Applications of Advection (MPDATA), cont.
    • Applications include, cont.
      – Oil and Gas: provides a significant return on investment (ROI) in seismic analysis, reservoir modelling, and basin modelling. Also used to monitor drilling and seismic data to optimize drilling trajectories and minimize environmental risk.
      – AgriTech: models to track and predict various environmental impacts on crop yield, such as weather changes. For example, daily weather predictions can be customized based on the needs of each client and range from hyperlocal to global.
    • Example adopters
      – Poznan Supercomputing and Networking Center, Poland: prognosis of air pollution
      – European Centre for Medium-Range Weather Forecasts, UK: weather forecast
      – Institute of Meteorology and Water Management, Poland: weather forecasts
      – German Aerospace Center: aeronautics, transport and energy areas
      – University of Cape Town, South Africa: weather simulation
      – Montreal University: weather simulation
      – Warsaw University: ocean simulation
  • 13. Weather engine optimized for Europe’s fastest supercomputer (Piz Daint)
    12x better performance, 30% reduced energy consumption
    • Our solution: machine-learning-managed, dynamic application of mixed precision
    • Highlights:
      – Dynamic estimation of the algorithm’s power consumption as a function of the processor frequency and the number of cores.
      – Energy-aware task management.
      – Auto-tuning procedure that takes algorithm-specific and GPU-specific parameters into account for auto-configuration.
      – Result: better performance, less energy consumed.
    • Our mechanism provides energy savings of up to 1.43x compared to the default Linux scaling governor.
  • 14. Dynamic Mixed Precision, cont. We reduced energy consumption by 33% and increased performance by a factor of 1.27x using 25% fewer GPUs. We kept the accuracy of the results at the same level as with double precision arithmetic.
  • 15. Dynamic Mixed Precision
    • Optimize execution time
      – Ported a geophysical model (EULAG) to a parallel supercomputer architecture (Piz Daint)
      – Used Machine Learning (Random Forest) to optimize various numerical parameters such as data block sizes, number of GPU streams, and sizes of vector data types
    • Optimize energy efficiency
      – Developed a mechanism (mixed precision) that lowers the energy consumption of supercomputers while keeping code performance at the highest possible level
      – Developed a framework based on a software automatic tuning approach
    • Results
      ✓ 10 times faster
      ✓ Then we improved it even more, reaching a further speed-up of 1.27x
      ✓ Energy consumption reduced by 33%
      ✓ Optimized GPU usage while keeping the accuracy of computations
    • Highlights: C++, CUDA, MPI, OpenMP
  • 16. AI Solutions for Industries
    • Advanced quality inspection and data insights: AI for Manufacturing, Automotive, Paper, Chemical, and Energy sectors.
    • AI self-checkout stations for Restaurants and object recognition for Retail businesses.
  • 17. Featured Products
    • Cognitive Services: advanced quality inspection and data insights (for Manufacturing, for Automotive, for Paper Industry, Data Insights, Predictive Maintenance).
    • Cognitive Services for Restaurants: self-checkout and object recognition.
    • CFD Suite: AI-accelerated Computational Fluid Dynamics.
  • 18. Custom AI Development AI Services AI Workshop Edge AI Cognitive Automation HPC Incubation Intel Expertise Alveo Expertise NVIDIA Expertise brainello Ewa Guard Document Processing Forestry & Agriculture +48 508 091 885 +48 505 322 282 welcome@byteLAKE.com
  • 20. Partners & Clients
    ➢ LinkedIn.com/company/byteLAKE
    ➢ X.com/byteLAKEGlobal
    ➢ FB.com/byteLAKE/
    ➢ byteLAKE.com/en/YouTube
    ➢ Blog
    “AI already plays a very important role in our daily lives. […] The application of the Intel® Distribution of OpenVINO™ toolkit in byteLAKE’s Cognitive Services shows that AI works efficiently as an actual tool for optimizing company operations. Moreover, such a combination reduces the barrier of necessary upgrades to IT infrastructure [...],” said Krzysztof Jonak, EMEA Territory Sales Director, Intel.
    “We’re also working with a number of partners on AI initiatives that will provide real world solutions for customers. […] Our collaboration with partners such as Intel, NVIDIA, Mark III systems, and byteLAKE greatly expands the resources and expertise we’re able to provide,” said Dr. Bhushan Desam, Lenovo’s AI Global Business Leader, HPC and AI Business.
  • 21. Our Research Studies GO PUBLIC! More at: byteLAKE.com/en/research
  • 23. Meet byteLAKE
    AI Solutions for Industries | Quality Inspection | Data Insights | AI-accelerated CFD | Self-Checkout
    Products: CFD Suite, Cognitive Services
    Empowering Industries with Artificial Intelligence Solutions. At byteLAKE, we harness cutting-edge technology to provide advanced quality inspection and data insights tailored for the Manufacturing, Automotive, Paper, Chemical, and Energy sectors. Additionally, we offer self-checkout stations for Restaurants and object recognition solutions for Retail businesses.
    Headquartered in Poland. www.byteLAKE.com, +48 508 091 885, +48 505 322 282, welcome@byteLAKE.com