AI &
Sustainability
Dr. Tamar Eilam
IBM Fellow, Chief Scientist Sustainable Computing,
IBM Research
Generated by Dall-E
Approximate and partial list of contributors, in arbitrary order
Energy modeling and quantification: Marcelo Amaral, Huamin Chen, Tatsuhiro Chiba, Rina Nakazawa, Sunyanan Choochotkaew, Eun K Lee, Umamaheswari Devi, Aanchal Goyal
Workload Classification: Xi Yang, Rohan R Arora, Chandra Narayanaswami, Cheuk Lam, Jerrold Leichter, Yu Deng, Daby Sow
Energy Aware Optimization: Tatebeh Bahreini, Asser Tantawi, Alaa Youssef, Chen Wang
AI System: Jeffrey Burns, Leland Chang, Ankur Agrawal, Kailash Gopalakrishnan, Pradip Bose
AI Quantification and Metric: Pedro Bello-Maldonado, Bishwaranjan Bhattacharjee, Carlos Costa
AI Infrastructure Innovation: Seelam Seetharami
Model Architecture Innovation: David Cox, Rameswar Panda, Rogerio Feris, Leonid Karlinsky
The Climate Impact Chain
Human
activity
Increased
Green House
Gas (GHG) in
atmosphere
Global
warming
Global
climate
change
Physical
&
biological
impact
Human socio-
economic
impact
$150 billion
Average cost in damages per year
100M+
Increase in population facing hunger
IBM Research | © 2022 IBM Corporation
Mitigation
Carbon
Capture
Geo-engineering
Reduce Carbon Emission
Sustainable
Computing
Adaptation
AI
Harness the power of AI to fight climate change:
material discovery
climate and risk monitoring
We also have to mitigate its effect on the environment
Part 1: Sustainable
Computing
IBM Research / Doc ID / Month XX, 2020 / © 2020 IBM Corporation 7
What is Sustainable Computing?
The ability to measure, quantify, and ultimately reduce the carbon footprint at every layer of the computing stack, within and across data centers, and across the entire life cycle.
The Computer Energy Problem
We are at an inflection point:
1. Demand is growing at exponential scale
• Some predict that electricity consumed by data centers will increase to 8% of the world's electricity by 2030
"How to stop data centers from gobbling up the world's electricity", https://www.nature.com/articles/d41586-018-06610-y
2. The emergence of energy-demanding workloads (AI)
AI power consumption doubles every 3-4 months*
* Green AI, R. Schwartz, J. Dodge, N. A. Smith, O. Etzioni, 2019
3. The end of Dennard Scaling means we can't keep up
• Golden Era for Chip Design
Ever-rising energy demand for computing, set against global energy production, is creating new risks and new opportunities for radically different computing that drastically improves efficiency.
31% a year: the trend in energy consumption growth for hyperscalers in North America
>10% of the world's power will be consumed by hyperscalers by 2030
Sustainable Computing epochs
1. Making the Current State More Sustainable
• Understanding the As-Is
• Hot Spot Detection
• Remediation and Optimization
• Coupling Power and Cloud
• Cooling, Data Center Planning, etc.
• Storage AutoTiering
2. Introducing Accelerators (Digital) & Hardware and Software co-design / co-optimization
• HW and SW co-design (scalable approach)
• Reduced precision chips: 8-bit precision approximate computing
• Voltage scaling with error correction
• Runtime management of disaggregated & composable heterogeneous DC
3. New Computational Models (beyond digital)
• New computational models that completely break the relationship between energy & computation: neuromorphic, analog AI, data-centric, quantum, etc.
https://research.ibm.com/blog/telum-processor
https://www.esp.cs.columbia.edu
https://research.ibm.com/blog/the-hardware-behind-analog-ai
https://www.zurich.ibm.com/sto/memory/
Carbon Intensity
The emission rate: grams of carbon dioxide released per megajoule of energy produced.
With coal power stations, the carbon intensity is high, as CO2 is produced as part of the power generation process. Carbon intensity is >1 kg/kWh for coal.
Renewable energy sources such as hydro or solar produce almost no emissions, so their carbon intensity is very low. Carbon intensity is ~0 for solar/wind.
Modeling the Data Center Carbon Footprint
Carbon Footprint = IT Equipment Energy × Power Usage Effectiveness × Carbon Intensity
$CFP = E_{IT} \times PUE \times CI$
Power usage effectiveness (PUE)
A predominant metric used to measure the energy efficiency of a data center.
PUE = (Total Facility Energy) / (IT Equipment Energy)
Efficiency improves as the quotient decreases towards 1.
1 is optimal, 2 is very bad.
Total Carbon Footprint
The total amount of carbon dioxide (CO2) and equivalent greenhouse gas emissions associated with powering a data center.
CFP >= 0.
[Figure: An example DC Energy Breakdown, with the E_IT and PUE portions marked]
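The carbon footprint formula can be evaluated directly. The sketch below is only an illustration: the function name, the units (kWh and kg CO2e per kWh), and the sample inputs are assumptions chosen for the example, not values from the talk.

```python
def carbon_footprint_kg(it_energy_kwh: float, pue: float, ci_kg_per_kwh: float) -> float:
    """CFP = E_IT x PUE x CI, returned in kg CO2-equivalent."""
    if it_energy_kwh < 0 or ci_kg_per_kwh < 0:
        raise ValueError("energy and carbon intensity must be non-negative")
    if pue < 1.0:
        # Total facility energy includes IT energy, so PUE cannot drop below 1.
        raise ValueError("PUE cannot be below 1")
    return it_energy_kwh * pue * ci_kg_per_kwh


# Hypothetical inputs: 1,000 kWh of IT energy, PUE of 1.5, grid at 0.4 kg CO2e/kWh.
example_cfp = carbon_footprint_kg(1000.0, 1.5, 0.4)
# The same workload in a facility with PUE 1.1 on a near-zero-carbon grid.
greener_cfp = carbon_footprint_kg(1000.0, 1.1, 0.02)
```

Each factor is a separate lever: lowering E_IT, PUE, or CI reduces the footprint proportionally.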
Reducing the Data Center Carbon Footprint: Research Opportunities
Carbon Footprint = IT Equipment Energy × Power Usage Effectiveness × Carbon Intensity
$CFP = E_{IT} \times ERE \times CI$ (ERE, energy reuse effectiveness, takes the place of PUE when waste heat is reused)
• Data Center Design, Cooling and Heat-Reuse
• Rack Design to optimize power conversion, and direct liquid cooling
• Improving power conversion in the data center
• Energy Aware Scheduling, Vertical Scaling, Dispatching
• Power Management
• Chip Design
• Dispatching of batch workloads such as AI training jobs across time and space to maximize renewable energy use
• Forecasting of renewable energy (time series composition)
• Can the cloud sense renewable energy and adapt?
https://research.ibm.com/blog/ibm-artificial-intelligence-unit-aiu
https://www.zurich.ibm.com/st/energy_efficiency/zeroemission.html
https://research.ibm.com/blog/northpole-ibm-ai-chip
Carbon Assessment & Reduction Framework
An Approach for Sustainable Computing
Estimate: Energy and CFP per workload, tenant, VM, container, service, etc.
Assess: Identify hotspots and applicable strategies. Calculate potential savings.
Act: A set of controllers to dynamically optimize the carbon footprint at operation. Design efficient systems.
Report: Report CFP across your entire organization in a consistent fashion, factoring in requirements.
Energy Quantification Challenge
• How do you estimate the power consumption of applications running on shared servers?
=> ratio-based approach
• How do you do that when you do not have on-line power measurement at the server level?
=> power modeling
• How do you do that if you do not know what else is running on the machine?
=> dynamic power estimation only
• How do you scale the approach to developing power models (combinatorial explosion problem)?
The Kepler Project
https://github.com/sustainable-computing-io/kepler
Kepler Architecture
• eBPF metrics: hardware counters, CPU time and soft IRQ
• System power metrics from BMs and VMs
• Ratio Power Model for containers
• Trained Power Model to estimate the VM's component power consumption
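Kepler Deployment Approaches
- Ratio Power Model for Dynamic CPU Power
with hardware counters:
$\text{DynPower}_{\text{process } i} = \frac{\text{CPU cycles}_{\text{process } i}}{\sum \text{CPU cycles}} \times \text{DynPower}_{\text{host\_CPU}}$
without hardware counters:
$\text{DynPower}_{\text{process } i} = \frac{\text{BPF CPU time}_{\text{process } i}}{\sum \text{BPF CPU time}} \times \text{DynPower}_{\text{host\_CPU}}$
$\text{DynPower}_{\text{container } j} = \sum_{i \in j} \text{DynPower}_{\text{process } i}$
- Even distribution of Idle Power
$\text{Power}_{\text{container } j} = \text{IdlePower}_{\text{host\_CPU}} / \text{numContainers}$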
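The ratio model can be sketched in a few lines of Python. This is a minimal illustration of the formulas on this slide, not Kepler's actual code: the function name, argument shapes, and sample numbers are assumptions made for the example.

```python
from collections import defaultdict


def ratio_power(host_dyn_power_w, host_idle_power_w, usage_by_process, container_of):
    """Attribute host CPU power to containers with the ratio model.

    usage_by_process: pid -> CPU cycles (or BPF CPU time) in the sampling interval.
    container_of:     pid -> container id.
    Returns container id -> (dynamic watts, idle watts).
    """
    total_usage = sum(usage_by_process.values())
    dynamic = defaultdict(float)
    for pid, usage in usage_by_process.items():
        # Each process gets the host's dynamic power in proportion to its usage.
        share = usage / total_usage if total_usage > 0 else 0.0
        # A container's dynamic power is the sum over its processes.
        dynamic[container_of[pid]] += share * host_dyn_power_w
    containers = set(container_of.values())
    # Idle power is split evenly across containers.
    idle_each = host_idle_power_w / len(containers) if containers else 0.0
    return {c: (dynamic[c], idle_each) for c in containers}


# Hypothetical sample: three processes in two containers on one host.
sample = ratio_power(
    host_dyn_power_w=100.0,
    host_idle_power_w=40.0,
    usage_by_process={1: 300, 2: 100, 3: 600},
    container_of={1: "a", 2: "a", 3: "b"},
)
```

In this sample, container "a" receives about 40 W of dynamic power and "b" about 60 W, with 20 W of idle power each.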
Kepler Model Server Project
Facilitates training a power model for servers without a power meter
[Figure: two deployment paths. Server with power meter: on bare metal (BM), power sources (RAPL, ACPI/Sensors, Redfish/IPMI, GPU (nvml)) feed Kepler, whose Ratio Power Model yields process/container power consumption. Server without power meter: for a virtual machine (VM), a Trained Power Model supplies estimated system power metrics to Kepler, whose Ratio Power Model yields process/container power consumption.]
Kepler Model Server
Motivation:
• No power measurement is exposed or instrumented in some running systems
Challenges:
• No data, or not enough data, to train power models specific to all available metrics and to emerging system platforms and settings (e.g., variety of CPU architectures, frequency governors)
• Dynamicity of control plane processes
[Figure: Collect Data → Train Model → Export Model → Serve Model → Estimate Power, annotated "core of Kepler model server"]
Pipeline Framework (one extractor, one isolator, multiple trainers)
[Figure: Prometheus query result → Extract → extracted data (energy metric and energy-related metrics, with background power) → Isolate → isolated data (without background power); node-level and container-level trainers produce the power models]
https://www.cncf.io/blog/2023/10/11/exploring-keplers-potentials-unveiling-cloud-application-power-consumption/
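The isolate-then-train idea can be illustrated with a toy sketch. It is not the Kepler model server's implementation: the constant background power, the single usage feature, and the least-squares trainer are simplifying assumptions, and all names are hypothetical.

```python
def isolate(samples, background_power_w):
    """Subtract background (idle and control-plane) power from (feature, watts) samples."""
    return [(x, max(p - background_power_w, 0.0)) for x, p in samples]


def train_linear(samples):
    """Ordinary least-squares fit of watts ~= slope * feature + intercept."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical extracted data: (CPU usage feature, measured node power in watts).
extracted = [(0.0, 50.0), (10.0, 70.0), (20.0, 90.0)]
isolated = isolate(extracted, background_power_w=50.0)
slope, intercept = train_linear(isolated)
```

With these inputs, the fitted model attributes 2 W per unit of usage and no residual background power.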
The Issue with Third-Party Clouds
• No server power metric available
• No knowledge of what else is running on my machine
• How to split idle power?
• Limited knowledge of the architecture and configuration of the bare metal servers
• Challenge for applying separately trained power models…
• All cloud-native calculators are too coarse-grained to be useful for optimization.
Generated with Dall-E
https://adrianco.medium.com/proposal-for-a-realtime-carbon-footprint-standard-60b71c269948
Adrian Cockcroft
How can we get to real-time monitoring of application carbon emissions in third-party clouds?
Consistent
Trustworthy
Transparent
Explainable
Can Kepler help?
What else do we need?
WIP: Reference implementation to be open sourced.
Workload Classification: Motivation
Detect non-productive workloads
• Virtual Machines
• Cloud-native deployments
• Cloud services
Can schedules be drawn up for a few (if not all) productive workloads?
Methodology
• Non-productive: Remaining in the Inactive Phase
• Constantly Productive: Remaining in the Active Phase
• Alternating: Switching between the two Phases
[Figure: methodology pipeline from metrics through phase abstraction (inactive/active phases) and workload classification (non-productive, constantly productive, alternating; repeatable or non-repeatable) to recommendations (candidate for termination, candidate for parking, workload timetabling, no action), applied to VM1 … VMN over an observation window of 7/14/21 days ending at time T]
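The three classes can be sketched as a simple rule over an observation window. This is a minimal illustration under stated assumptions: the utilization series, the 5% activity threshold, and the function name are hypothetical, and the talk does not specify how phases are actually abstracted from metrics.

```python
def classify_workload(utilization, active_threshold=0.05):
    """Classify a workload from per-interval utilization samples (0.0 to 1.0).

    A sample above the threshold counts as the Active Phase, otherwise Inactive.
    """
    if not utilization:
        raise ValueError("need at least one sample in the observation window")
    phases = [u > active_threshold for u in utilization]
    if not any(phases):
        return "non-productive"         # remains in the Inactive Phase
    if all(phases):
        return "constantly productive"  # remains in the Active Phase
    return "alternating"                # switches between the two Phases


# Hypothetical VMs observed over the same window.
labels = {
    "VM1": classify_workload([0.01, 0.02, 0.00, 0.01]),
    "VM2": classify_workload([0.60, 0.55, 0.70, 0.65]),
    "VM3": classify_workload([0.50, 0.01, 0.45, 0.02]),
}
```

An alternating workload such as VM3 would then be examined for repeatability before any timetabling or parking recommendation is made.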
CARE: Carbon Quantification & Reduction
Coordinated set of controllers to dynamically quantify and optimize the carbon footprint at every level of the hybrid cloud stack, within and across on- and off-prem data centers
$CFP = E_{IT} \times PUE \times CI$
Controllers: Container Right-Sizing, Dynamic dispatching, Energy aware scheduler, VM placement, Power management
• Leverage renewable energy when and where it is available across datacenters.
• Efficiency with container resource consumption within a datacenter.
• Efficient infrastructure with VM and power management.
Part 2:
AI
Sustainability
Generated by Dall-E
The energy cost of AI
• Deep learning is computationally intensive
• Time consuming even with high-performance computing resources
Take for example: training an image recognition model
Dataset: ImageNet-22K
Network: ResNet-101
256 GPUs, 7 hours: ~450 kWh
4 GPUs, 16 days: ~385 kWh
1 model training run is ~2 weeks of home energy consumption
https://arxiv.org/abs/1708.02188
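As a sanity check on the two training configurations, dividing the reported energy by GPU-hours gives the average draw per GPU. The helper below is an illustrative calculation only; it assumes the kWh figures cover the whole run and are spread evenly across GPUs.

```python
def avg_power_per_gpu_kw(total_kwh: float, gpus: int, hours: float) -> float:
    """Average power per GPU in kW: total energy divided by GPU-hours."""
    return total_kwh / (gpus * hours)


large_cluster = avg_power_per_gpu_kw(450, 256, 7)       # 256 GPUs for 7 hours
small_cluster = avg_power_per_gpu_kw(385, 4, 16 * 24)   # 4 GPUs for 16 days
```

Both come out at roughly 0.25 kW per GPU. Under these assumptions, scaling out to 256 GPUs cuts the wall-clock time from days to hours while the total energy rises from ~385 kWh to ~450 kWh.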
AI demand keeps surging
Training requirements are doubling every 3.5 months
Source: Emma Strubell, Ananya Ganesh, and Andrew McCallum. 2019. Energy and
Policy Considerations for Deep Learning in NLP. CoRR abs/1906.02243 (2019).
arXiv:1906.02243
Source: Roy Schwartz, Jesse Dodge, Noah A. Smith, and Oren Etzioni. 2019. Green AI.
arXiv:1907.10597 [cs.CY]
The emergence of foundation models
Homogenization: a broad foundation model is adapted to perform specific tasks.
Almost all state-of-the-art NLP models are now adapted from one of a few foundation models, such as BERT, RoBERTa, BART, T5, etc.
Multi-modal and cross-domain models are next.
Source: Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, Erik Brynjolfsson, Shyamal Buch, Dallas Card, Rodrigo Castellon, Niladri Chatterji, Annie Chen, Kathleen Creel, Jared Quincy Davis, Dora Demszky, Chris Donahue, et al. 2022. On the Opportunities and Risks of Foundation Models. arXiv:2108.07258 [cs.LG]
Large language models are getting larger
[Figure: two panels, "Sizes of Language Models" and "Training Cost of Language Model"]
GPT-3 needs 1024 A100 GPUs for 34 days for training!
Some say that this is okay, because they are re-used for multiple tasks*
This claim is yet to be substantiated by a sound analysis
*E.g., David Patterson, Joseph Gonzalez, Quoc Le, Chen Liang, Lluis-Miquel Munguia, Daniel Rothchild, David So, Maud Texier, and Jeff Dean. 2021. Carbon Emissions and Large Neural Network Training.
Data Scientist Dilemma: to adapt or not to adapt
• To adapt from a broad model, or to train a smaller model on a more specific data set?
• How much data to use?
• Can I synthesize a few smaller models?
• Neural Architecture Search? Hyper Parameter Optimization? Is it worth the cost? Well, it depends…
• What is the optimal frequency of re-training? Daily? Weekly? What data shall I use for re-training? Incremental? Complete?
Sustainable AI platform principles
Transparency: dynamically track energy and carbon across the data and model life cycle
Traceability and Governance: track the ‘supply chain’ of models and data-sets and associated energy and carbon
Energy Efficiency: Innovation across all layers of the stack
Meaningful Metrics
11/3/2023
Meaningful Metrics categories
Products: data-set, model
Categories: Core Metric, Life-cycle, Efficiency
[Figure: life-cycle stages per product. Construction and Operation for the data-set; Construction (pre-training) and Operation (re-training, inference) for the model]
Life-cycle: factor in the provenance of models and data-sets and their associated energy and carbon footprint (Life-Cycle-Assessment principles)
[Figure: chain labelled D, FM, M]
Efficiency: $\text{efficiency} = \frac{\text{cost}}{\text{work produced}}$
What goes into ‘cost’?
• compute for inference
• + training
• + bill of material ‘tax’
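The efficiency metric can be made concrete with a small sketch. The choices below are assumptions for illustration only: cost is measured in kWh, work produced is counted in inferences served, and the sample numbers are hypothetical.

```python
def energy_per_inference_kwh(inference_kwh, training_kwh, bom_kwh, inferences_served):
    """Cost per unit of work: (inference + training + bill-of-material 'tax') / work."""
    if inferences_served <= 0:
        raise ValueError("work produced must be positive")
    return (inference_kwh + training_kwh + bom_kwh) / inferences_served


# Hypothetical model: 200 kWh spent serving, 500 kWh on training, and 300 kWh
# inherited from upstream models and data-sets, over one million inferences.
full_cost = energy_per_inference_kwh(200.0, 500.0, 300.0, 1_000_000)
# Counting inference compute alone understates the cost per inference.
inference_only = energy_per_inference_kwh(200.0, 0.0, 0.0, 1_000_000)
```

In this example, the full life-cycle cost per inference is five times the inference-only figure, which is the gap the ‘tax’ terms are meant to expose.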
Holistic approach to Sustainable AI
• Factor in the entire life cycle of models
• Sustainable strategy exploration and what-if analysis
• Provenance, governance, and reporting
• Holistic impact analysis and tradeoff-based planning
AI Sustainability Metrics
Transparency
The life-cycle of a model as a state machine
Each ‘state transition’ is associated with a significant energy/carbon cost and involves critical decisions that will affect the cost of this and downstream tasks.
• Tradeoffs between accuracy, time-to-value, and energy/carbon
• Cost of one phase may depend on decisions taken at a prior stage: save now, pay later…
• The particulars of the target task are important to factor in early on.
On-Line Fine-Grain Monitoring of Energy and Carbon with Kepler
• An open-source project pioneered by Red Hat and IBM Research to quantify the energy and carbon footprint of cloud-native applications.
• On the roadmap to be delivered in OCP and integrated in ROSA
• Adrian Cockcroft is advocating the use of Kepler across all cloud providers: “Real Time Energy and Carbon Standard for Cloud Providers”
SusQL: Context-aware aggregation and energy accounting
Infrastructure: Kubernetes controller with its own CRD that gets data from Kepler for aggregation
[Figure: susql-controller maintaining a map[labels]->energy table, with steps numbered 1 to 4]
apiVersion: …
kind: LabelGroup
metadata: …
spec:
  labels:
    - <label-1>
    - <label-2>
    - <label-3>
    - <label-4>
status:
  totalEnergy: <total energy>
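The label-based aggregation can be sketched as follows. This is only an illustration of the idea of a map from labels to energy, not SusQL's controller logic: the key=value label model, the matching rule (a pod counts if it carries every label in the group), and all names and numbers are assumptions.

```python
def total_energy_for_labels(pod_samples, label_group):
    """Sum energy (joules) over pods that carry every label in the group.

    pod_samples: iterable of (labels dict, energy in joules), e.g. per-pod energy
                 as reported by a monitor such as Kepler.
    label_group: dict of label key -> value that a pod must match.
    """
    wanted = set(label_group.items())
    return sum(energy for labels, energy in pod_samples if wanted <= set(labels.items()))


# Hypothetical per-pod energy readings.
samples = [
    ({"app": "train", "team": "nlp"}, 120.0),
    ({"app": "train", "team": "cv"}, 80.0),
    ({"app": "serve", "team": "nlp"}, 50.0),
]
train_total = total_energy_for_labels(samples, {"app": "train"})
nlp_train_total = total_energy_for_labels(samples, {"app": "train", "team": "nlp"})
```

Here the "train" group totals 200 J and the narrower "train" plus "nlp" group totals 120 J, mirroring the totalEnergy status field of a LabelGroup.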
Can we connect the dots? Kepler + Kubeflow
source: https://cloud.google.com/blog/topics/developers-practitioners/scalable-ml-workflows-using-pytorch-kubeflow-pipelines-and-vertex-pipelines
KubeFlow Pipeline Example Associated Meta-Data
Can we leverage Kepler
to add energy
data?
Governance
A ‘Supply Chain’ of models
Models are created (‘manufactured’), distilled, fine-tuned, and re-used (adapted) to create new models.
Deployment is just the beginning of the journey.
How do we reason about the Life-Cycle Cost of models?
Product Life Cycle Assessment Principles for Sustainable AI:
Products = data-set | model
We need to factor in the cost of the Bill of Material used in the creation of a new model.
If B (a product or a service) is used in the process of creating A1, A2, …, An, then the carbon cost of B is inherited by A1, A2, …, An in proportion to their use.
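The inheritance rule can be written as a proportional split. The sketch below is one possible reading: how "use" is measured (for example fine-tuning runs or calls against B) is an assumption, as are the function name and the numbers.

```python
def inherit_carbon(upstream_carbon_kg, usage_by_consumer):
    """Split the carbon cost of an upstream product B across its consumers
    A1..An in proportion to how much each one used B."""
    total_use = sum(usage_by_consumer.values())
    if total_use <= 0:
        raise ValueError("total use must be positive")
    return {
        name: upstream_carbon_kg * use / total_use
        for name, use in usage_by_consumer.items()
    }


# Hypothetical: foundation model B cost 1,000 kg CO2e to build; A1 used it
# three times as much as A2.
shares = inherit_carbon(1000.0, {"A1": 3, "A2": 1})
```

With these numbers A1 inherits 750 kg and A2 inherits 250 kg, and the shares always sum to the cost of B.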
The Governance Chain
Efficiency at Every Layer
Efficiency at every layer of the AI Stack
• Every layer of the FM stack offers opportunities for efficiency gains
Model: Quantization, architecture innovation
Tools: dynamic batching
Platform: Multiplexing, dispatching
Infrastructure: DVFS, power parameter optimization, caching
Systems: Approximate computing and other system innovations
• Empower the data scientist to make choices and explore tradeoffs between accuracy, performance, and energy
• Empower the data scientist to reason about life-cycle strategies: e.g., if/what/when to re-use, and how much to retrain
Systems innovation
IBM Research’s Artificial Intelligence Unit (AIU)
Chip architecture optimized for enterprise AI workloads
Enabled for Foundation Models
Enabled in the Red Hat software stack
Supports multi-precision inference (& training)
FP16, FP8, INT8, INT4, INT2
Implemented in leading edge 5nm technology
https://research.ibm.com/blog/ibm-artificial-intelligence-unit-aiu
SoC implements IBM’s leadership innovations
in low-precision AI arithmetic and algorithms
IBM Research AI Hardware Center / © 2023 IBM Corporation
Vision for AI Performance Scaling
• Applying Approximate Computing techniques to AI compute
• Critical requirement: maintain model accuracy
• Advantage: Quadratic improvement in performance
• IBM Research has been at the forefront of every major
technical advancement on bit-precision scaling
• 16-bit training (2015)
• 8-bit training (2018, 2019)
• 4-bit training (2020)
• 2/4-bit Inference (2018-2020)
• Complemented by
• Sparsity support
• Analog Computing
• 3D Stacking
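As a generic illustration of reduced precision, the sketch below applies a textbook symmetric 8-bit quantization to a list of floats. It is not the IBM Research training or inference technique cited in this talk; the function names and sample weights are hypothetical.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization of floats to signed 8-bit integers."""
    max_abs = max(abs(v) for v in values)
    # One scale for the whole tensor maps the largest magnitude to 127.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Map the integers back to approximate float values."""
    return [q * scale for q in quantized]


weights = [0.5, -1.0, 0.25]
q_weights, q_scale = quantize_int8(weights)
restored = dequantize(q_weights, q_scale)
```

Each restored value differs from the original by at most half a quantization step, which is the accuracy budget that lower-precision arithmetic has to stay within.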
Digital AI Cores
Scaling precision for quadratic gains in performance with iso-accuracy
[Figure: bit-precision roadmap from 2012 to 2024, vertical axis ticks from 0.1 to 100, with training and inference curves labelled 32-bit, 16-bit, 8-bit, 4-bit, and 2-bit]
16-bit Training: ICML 2015
8-bit Training: NeurIPS 2018, 2019
4-bit Training: X. Sun et al., NeurIPS 2020
4-bit Inference ASICs: J. Choi et al., https://arxiv.org/pdf/1805.06085.pdf; J. McKinstry et al., https://arxiv.org/abs/1809.04191
2-bit Inference ASICs: J. Choi et al., SysML 2019
https://research.ibm.com/blog/ibm-artificial-intelligence-unit-aiu
https://research.ibm.com/blog/ai-chip-precision-scaling
NorthPole: Neural-inspired memory-on-chip architecture to overcome the von Neumann bottleneck
NorthPole is 25 times more energy efficient, measured in frames interpreted per joule.
Infrastructure innovation
Vela: A Cloud Native Supercomputer for the Foundation Model Age (Kepler inside)
System specifications
– Nodes with 8 x A100 GPUs (80GB)
– GPUs interconnected with NVLink, NVSwitch
– Cascade Lake CPUs, 1.5TB of DRAM
– Four 3.2TB NVMe drives
– Redundant connections between nodes, TORs and spines
– 2 x 100G NICs from each node; NCCL benchmarks show we drive close to line rate
https://research.ibm.com/blog/AI-supercomputer-Vela-GPU-cluster
– Configure resources through software (APIs)
– Broad ecosystem of available cloud services
– Leverage data sets on Cloud Object Store
– Standard, flexible, scalable infrastructure design (vs traditional HPC)
– Near bare metal performance (within 5%, single node)
How do you evolve from a specialized (monolithic), costly, and inflexible HPC stack to a Cloud Native stack without compromising efficiency?
- Programmability
- Scalability
- Re-use
- Observability
- Agility
- Democratization
Platform Innovation
Dispatching of jobs based on renewable energy
Motivation:
• Carbon intensity of the energy mix of different regions of IBM data centers varies over time.
• Renewable energy is not available all the time and in all places.
Workload Optimization: Placement and scheduling of workloads based on carbon-free energy availability.
Ideal dispatching: High CPU utilization when carbon intensity is low and low CPU utilization when carbon intensity is high.
T. Bahreini, A. Tantawi and A. Youssef, "An Approximation Algorithm for Minimizing the Cloud Carbon Footprint through Workload Scheduling," 2022 IEEE 15th International Conference on Cloud Computing (CLOUD), 2022, pp. 522-531.
Challenge: Ideal dispatching might be practically infeasible.
• Short jobs may have short deadlines.
• Some jobs are not interruptible.
• Jobs have heterogeneous resource demands.
Obtaining the optimal packing is intractable.
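For a single non-interruptible job with a deadline, the time-shifting idea reduces to a window search over a carbon intensity forecast. The sketch below is a simple illustration, not the approximation algorithm from the cited paper; the hourly forecast, units, and function name are assumptions.

```python
def best_start_hour(ci_forecast, duration_h, deadline_h):
    """Pick the start hour that minimizes summed carbon intensity for a
    non-interruptible job that must finish within deadline_h hours from now.

    ci_forecast[h] is the forecast carbon intensity (e.g. gCO2/kWh) for hour h.
    """
    latest_start = min(deadline_h, len(ci_forecast)) - duration_h
    if duration_h <= 0 or latest_start < 0:
        raise ValueError("job cannot finish before the deadline within the forecast")
    # Cost of each feasible contiguous window of duration_h hours.
    costs = [sum(ci_forecast[s:s + duration_h]) for s in range(latest_start + 1)]
    return min(range(len(costs)), key=costs.__getitem__)


# Hypothetical six-hour forecast with a midday dip in carbon intensity.
forecast = [500, 450, 200, 150, 300, 600]
relaxed = best_start_hour(forecast, duration_h=2, deadline_h=6)
tight = best_start_hour(forecast, duration_h=2, deadline_h=3)
```

With a relaxed deadline the job lands in the low-carbon dip; a tight deadline forces it into a dirtier window. Packing many such jobs with heterogeneous resource demands is the part that becomes intractable.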
IEEE Cloud 2022: polynomial approximation algorithms for scheduling in a single data center to minimize carbon.
IEEE Cloud 2023: dispatching (placement & scheduling) across data centers to minimize carbon.
Dispatch Workloads onto Clusters
[Figure: a Hub Cluster running the MCAD Dispatcher, KubeStellar, and a placement engine, connected to Spoke Clusters 1, 2, and 3 running the MCAD Runner]
MCAD Dispatcher
• queue & dispatch jobs
• resource allocation
• quota management
• requeue & retry jobs
MCAD Runner
• run & monitor jobs
• monitor cluster
KubeStellar
• downsync job spec
• upsync job status
• upsync cluster status
KubeStellar: https://github.com/kubestellar/kubestellar
MCAD (Multi Cluster Application Dispatcher): https://github.com/project-codeflare/multi-cluster-app-dispatcher
Placement engine
• Hard constraints
  - filter clusters
  - filter workloads
• Soft constraints
  - score clusters
  - score workloads
• Space dimension
  - global vs. individual decisions
  - aggregates
• Time dimension
  - in-time vs. ahead-of-time
  - delayed start, suspend/resume
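The filter-then-score structure of a placement engine can be sketched as below. This does not use the MCAD or KubeStellar APIs; the dictionary fields, the GPU-capacity hard constraint, and the carbon-intensity soft constraint are hypothetical choices for illustration.

```python
def place(job, clusters):
    """Choose a cluster for a job: hard constraints filter, soft constraints score."""
    # Hard constraint: the cluster must have enough free GPUs for the job.
    feasible = [c for c in clusters if c["free_gpus"] >= job["gpus"]]
    if not feasible:
        return None  # nothing fits now; the caller may requeue and retry later
    # Soft constraint: prefer the cluster with the lowest carbon intensity.
    return min(feasible, key=lambda c: c["carbon_intensity"])["name"]


# Hypothetical spoke clusters and jobs.
spokes = [
    {"name": "spoke-1", "free_gpus": 8, "carbon_intensity": 400},
    {"name": "spoke-2", "free_gpus": 2, "carbon_intensity": 100},
    {"name": "spoke-3", "free_gpus": 4, "carbon_intensity": 250},
]
chosen = place({"gpus": 4}, spokes)
too_big = place({"gpus": 16}, spokes)
```

The greenest cluster, spoke-2, is filtered out for lack of capacity, so the job goes to spoke-3; a job that fits nowhere is left for a later retry.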
Models & Tools Innovation
Call to action:
AI Platform providers:
- Build in transparency and governance
- Incorporate platform and system innovation for efficiency
Academia & Industry: Focus your research on efficiency, not just accuracy
Data Scientists / Practitioners: Develop a sustainability mind-set
- Re-use where it makes sense
- Domain-specific, smaller models are better!
- Explore tradeoffs (accuracy vs. cost)
Tokyo
Shin-Kawasaki
Delhi
Bangalore
Singapore
Nairobi
Haifa
Zurich
Warrington
Dublin
Cambridge
Albany
Yorktown
Almaden
Rio de Janeiro
Sao Paulo
Johannesburg
6 Nobel Laureates 10 Medals of Technology 5 National Medals of Science 6 Turing Awards
IBM Research
Questions?
AI-Sustainability.pptx

More Related Content

What's hot

Introduction to LLMs
Introduction to LLMsIntroduction to LLMs
Introduction to LLMs
Loic Merckel
 
Generative AI
Generative AIGenerative AI
Generative AI
Carlos J. Costa
 
Artificial Intelligence = ML + DL with Tensor Flow
Artificial Intelligence = ML + DL with Tensor FlowArtificial Intelligence = ML + DL with Tensor Flow
Generative AI at the edge.pdf
Generative AI at the edge.pdfGenerative AI at the edge.pdf
Generative AI at the edge.pdf
Qualcomm Research
 
Introduction to AI Ethics
Introduction to AI EthicsIntroduction to AI Ethics
Introduction to AI Ethics
Gabriele Graffieti
 
Artificial Intelligence
Artificial IntelligenceArtificial Intelligence
Artificial Intelligence
SANKALPA CHOWDHURY
 
Harry Surden - Artificial Intelligence and Law Overview
Harry Surden - Artificial Intelligence and Law OverviewHarry Surden - Artificial Intelligence and Law Overview
Harry Surden - Artificial Intelligence and Law Overview
Harry Surden
 
The Future of AI is Generative not Discriminative 5/26/2021
The Future of AI is Generative not Discriminative 5/26/2021The Future of AI is Generative not Discriminative 5/26/2021
The Future of AI is Generative not Discriminative 5/26/2021
Steve Omohundro
 
Human and Artificial Intelligence
Human and Artificial IntelligenceHuman and Artificial Intelligence
Human and Artificial Intelligence
orengomoises
 
Explainable AI in Industry (KDD 2019 Tutorial)
Explainable AI in Industry (KDD 2019 Tutorial)Explainable AI in Industry (KDD 2019 Tutorial)
Explainable AI in Industry (KDD 2019 Tutorial)
Krishnaram Kenthapadi
 
AI and ML Series - Introduction to Generative AI and LLMs - Session 1
AI and ML Series - Introduction to Generative AI and LLMs - Session 1AI and ML Series - Introduction to Generative AI and LLMs - Session 1
AI and ML Series - Introduction to Generative AI and LLMs - Session 1
DianaGray10
 
Artificial intelligence tapan
Artificial intelligence tapanArtificial intelligence tapan
Artificial intelligence tapan
Tapan Khilar
 
AI in Higher Education – Challenges & Opportunities #edlw2019
AI in Higher Education – Challenges & Opportunities #edlw2019AI in Higher Education – Challenges & Opportunities #edlw2019
AI in Higher Education – Challenges & Opportunities #edlw2019
EDEN Digital Learning Europe
 
Artificial Intelligence
Artificial IntelligenceArtificial Intelligence
Artificial Intelligencefalepiz
 
How do we train AI to be Ethical and Unbiased?
How do we train AI to be Ethical and Unbiased?How do we train AI to be Ethical and Unbiased?
How do we train AI to be Ethical and Unbiased?
Mark Borg
 
The Ethics of AI
The Ethics of AIThe Ethics of AI
The Ethics of AI
Mark S. Steed
 
Responsible AI
Responsible AIResponsible AI
Responsible AI
Neo4j
 
Artificial Intelligence
Artificial IntelligenceArtificial Intelligence
Artificial Intelligence
Nadaraja Sarmilan
 
Artificial Intelligence
Artificial IntelligenceArtificial Intelligence
Artificial Intelligence
Prakhyath Rai
 
Responsible AI in Industry (ICML 2021 Tutorial)
Responsible AI in Industry (ICML 2021 Tutorial)Responsible AI in Industry (ICML 2021 Tutorial)
Responsible AI in Industry (ICML 2021 Tutorial)
Krishnaram Kenthapadi
 

What's hot (20)

Introduction to LLMs
Introduction to LLMsIntroduction to LLMs
Introduction to LLMs
 
Generative AI
Generative AIGenerative AI
Generative AI
 
Artificial Intelligence = ML + DL with Tensor Flow
Artificial Intelligence = ML + DL with Tensor FlowArtificial Intelligence = ML + DL with Tensor Flow
Artificial Intelligence = ML + DL with Tensor Flow
 
Generative AI at the edge.pdf
Generative AI at the edge.pdfGenerative AI at the edge.pdf
Generative AI at the edge.pdf
 
Introduction to AI Ethics
Introduction to AI EthicsIntroduction to AI Ethics
Introduction to AI Ethics
 
Artificial Intelligence
Artificial IntelligenceArtificial Intelligence
Artificial Intelligence
 
Harry Surden - Artificial Intelligence and Law Overview
Harry Surden - Artificial Intelligence and Law OverviewHarry Surden - Artificial Intelligence and Law Overview
Harry Surden - Artificial Intelligence and Law Overview
 
The Future of AI is Generative not Discriminative 5/26/2021
The Future of AI is Generative not Discriminative 5/26/2021The Future of AI is Generative not Discriminative 5/26/2021
The Future of AI is Generative not Discriminative 5/26/2021
 
Human and Artificial Intelligence
Human and Artificial IntelligenceHuman and Artificial Intelligence
Human and Artificial Intelligence
 
Explainable AI in Industry (KDD 2019 Tutorial)
Explainable AI in Industry (KDD 2019 Tutorial)Explainable AI in Industry (KDD 2019 Tutorial)
Explainable AI in Industry (KDD 2019 Tutorial)
 
AI and ML Series - Introduction to Generative AI and LLMs - Session 1
AI and ML Series - Introduction to Generative AI and LLMs - Session 1AI and ML Series - Introduction to Generative AI and LLMs - Session 1
AI and ML Series - Introduction to Generative AI and LLMs - Session 1
 
Artificial intelligence tapan
Artificial intelligence tapanArtificial intelligence tapan
Artificial intelligence tapan
 
AI in Higher Education – Challenges & Opportunities #edlw2019
AI in Higher Education – Challenges & Opportunities #edlw2019AI in Higher Education – Challenges & Opportunities #edlw2019
AI in Higher Education – Challenges & Opportunities #edlw2019
 
Artificial Intelligence
Artificial IntelligenceArtificial Intelligence
Artificial Intelligence
 
How do we train AI to be Ethical and Unbiased?
How do we train AI to be Ethical and Unbiased?How do we train AI to be Ethical and Unbiased?
How do we train AI to be Ethical and Unbiased?
 
The Ethics of AI
The Ethics of AIThe Ethics of AI
The Ethics of AI
 
Responsible AI
Responsible AIResponsible AI
Responsible AI
 
Artificial Intelligence
Artificial IntelligenceArtificial Intelligence
Artificial Intelligence
 
Artificial Intelligence
Artificial IntelligenceArtificial Intelligence
Artificial Intelligence
 
Responsible AI in Industry (ICML 2021 Tutorial)
Responsible AI in Industry (ICML 2021 Tutorial)Responsible AI in Industry (ICML 2021 Tutorial)
Responsible AI in Industry (ICML 2021 Tutorial)
 

Similar to AI-Sustainability.pptx

AI Sustainability Mascots 23-f.pptx
AI Sustainability Mascots 23-f.pptxAI Sustainability Mascots 23-f.pptx
AI Sustainability Mascots 23-f.pptx
Tamar Eilam
 
Green & Beyond: Data Center Actions to Increase Business Responsiveness and R...




AI-Sustainability.pptx

  • 1. AI & Sustainability Dr. Tamar Eilam IBM Fellow, Chief Scientist Sustainable Computing, IBM Research Generated by Dall-E
  • 3. Approximate and partial list of contributors, in arbitrary order. Energy modeling and quantification: Marcelo Amaral, Huamin Chen, Tatsuhiro Chiba, Rina Nakazawa, Sunyanan Choochotkaew, Eun K Lee, Umamaheswari Devi, Aanchal Goyal. Workload Classification: Xi Yang, Rohan R Arora, Chandra Narayanaswami, Cheuk Lam, Jerrold Leichter, Yu Deng, Daby Sow. Energy Aware Optimization: Tatebeh Bahreini, Asser Tantawi, Alaa Youssef, Chen Wang. AI System: Jeffrey Burns, Leland Chang, Ankur Agrawal, Kailash Gopalakrishnan, Pradip Bose. AI Quantification and Metric: Pedro Bello-Maldonado, Bishwaranjan Bhattacharjee, Carlos Costa. AI Infrastructure Innovation: Seelam Seetharami. Model Architecture Innovation: David Cox, Rameswar Panda, Rogerio Feris, Leonid Karlinsky.
  • 4. The Climate Impact Chain: Human activity → Increased greenhouse gas (GHG) in the atmosphere → Global warming → Global climate change → Physical & biological impact → Human socio-economic impact. $150 billion: average cost in damages per year. 100M+: increase in population facing hunger. IBM Research | © 2022 IBM Corporation
  • 6. Mitigation: Carbon Capture; Geo-engineering; Reduce Carbon Emission (Sustainable Computing). Adaptation. AI: harness the power of AI to fight climate change (material discovery, climate and risk monitoring); we also have to mitigate its effect on the environment.
  • 7. Part 1: Sustainable Computing. IBM Research / Doc ID / Month XX, 2020 / © 2020 IBM Corporation
  • 8. What is Sustainable Computing? The ability to measure, quantify, and ultimately reduce the carbon footprint at every layer of the computing stack, within and across data centers, and across the entire life cycle.
  • 9. The Computer Energy Problem. We are at an inflection point: 1. Demand is growing at exponential scale ("How to stop data centers from gobbling up the world's electricity", https://www.nature.com/articles/d41586-018-06610-y). 2. The emergence of energy-demanding workloads (AI): AI power consumption doubles every 3-4 months (Green AI, R. Schwartz, J. Dodge, N. A. Smith, O. Etzioni, 2019). 3. The end of Dennard scaling means we can't keep up. • Some predict that electricity consumed by data centers will increase to 8% by 2030 • Golden era for chip design
  • 10. Ever-rising energy demands for computing, set against global energy production, are creating new risks and new opportunities for radically different computing to drastically improve efficiency. 31% per year: the trend in energy consumption increase for hyperscalers in North America. >10% of the world's power will be consumed by hyperscalers by 2030.
  • 11. Sustainable Computing epochs. (1) Making the Current State More Sustainable: understanding the as-is; hot spot detection; remediation and optimization; coupling power and cloud; cooling, data center planning, etc.; storage auto-tiering. (2) Introducing Accelerators (Digital) & Hardware and Software co-design / co-optimization: HW and SW co-design (scalable approach); reduced-precision chips (8-bit precision, approximate computing); voltage scaling with error correction; runtime management of disaggregated & composable heterogeneous DC. (3) New Computational Models (beyond digital): new computational models that completely break the relationship between energy & computation: neuromorphic, analog AI, data-centric, quantum, etc. Links: https://research.ibm.com/blog/telum-processor https://www.esp.cs.columbia.edu https://research.ibm.com/blog/the-hardware-behind-analog-ai https://www.zurich.ibm.com/sto/memory/
  • 12. Modeling the Data Center Carbon Footprint. Carbon Footprint = IT Equipment Energy × Power Usage Effectiveness × Carbon Intensity, i.e. $\mathrm{CFP} = E_{IT} \times \mathrm{PUE} \times \mathrm{CI}$. Power usage effectiveness (PUE): a predominant metric used to measure the energy efficiency of a data center; PUE = (Total Facility Energy) / (IT Equipment Energy). Efficiency improves as the quotient decreases towards 1: 1 is optimal, 2 is very bad. Carbon Intensity (CI): the emission rate, in grams of carbon dioxide released per megajoule of energy produced. With coal power stations, the carbon intensity is high because CO2 is produced as part of the power generation process (>1 kg/kWh for coal). Renewable energy such as hydro or solar produces almost no emissions, so its carbon intensity is very low (~0 for solar/wind). Total Carbon Footprint (CFP): the total amount of carbon dioxide (CO2) and equivalent greenhouse gas emissions associated with powering a data center; CFP >= 0. [Figure: an example DC energy breakdown]
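The carbon footprint model on this slide can be illustrated with a small calculation. The following is a minimal sketch, assuming energy is given in kWh and carbon intensity in kg CO2e per kWh; the function name and all input values are hypothetical and chosen only for illustration.

```python
def carbon_footprint_kg(it_energy_kwh: float, pue: float, ci_kg_per_kwh: float) -> float:
    """Return CFP = E_IT x PUE x CI in kg CO2-equivalent."""
    if it_energy_kwh < 0 or ci_kg_per_kwh < 0:
        raise ValueError("energy and carbon intensity must be non-negative")
    if pue < 1.0:
        # Total facility energy includes IT energy, so PUE cannot fall below 1.
        raise ValueError("PUE must be at least 1")
    return it_energy_kwh * pue * ci_kg_per_kwh


# Hypothetical workload drawing 1000 kWh of IT energy.
coal_heavy = carbon_footprint_kg(1000.0, pue=2.0, ci_kg_per_kwh=1.0)   # inefficient DC, coal-like grid
mixed_grid = carbon_footprint_kg(1000.0, pue=1.5, ci_kg_per_kwh=0.5)   # illustrative middle case
renewable = carbon_footprint_kg(1000.0, pue=1.1, ci_kg_per_kwh=0.0)    # solar/wind-like grid

print(coal_heavy, mixed_grid, renewable)
```

Because the three factors multiply, halving any one of them halves the footprint, which is why the deck treats IT energy, PUE, and carbon intensity as separate levers.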
  • 13. Reducing the Data Center Carbon Footprint: Research Opportunities. Carbon Footprint = IT Equipment Energy × Power Usage Effectiveness × Carbon Intensity; $\mathrm{CFP} = E_{IT} \times \mathrm{ERE} \times \mathrm{CI}$. • Data center design, cooling, and heat reuse • Rack design to optimize power conversion, and direct liquid cooling • Improving power conversion in the data center • Energy-aware scheduling, vertical scaling, dispatching • Power management • Chip design • Dispatching of batch workloads such as AI training jobs across time and space to maximize renewable energy use • Forecasting of renewable energy (time series composition) • Can the cloud sense renewable energy and adapt? https://research.ibm.com/blog/ibm-artificial-intelligence-unit-aiu https://www.zurich.ibm.com/st/energy_efficiency/zeroemission.html https://research.ibm.com/blog/northpole-ibm-ai-chip
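The idea of dispatching batch jobs across time and space can be sketched as a search over (data center, hour) slots. This is only an illustration under simplifying assumptions: the job fits in a single hourly slot, PUE is constant per data center, and an hourly carbon intensity forecast is available. The data center names, forecast values, and the `best_slot` helper are hypothetical and are not taken from the deck.

```python
def best_slot(energy_kwh, pue_by_dc, ci_forecast):
    """Pick the (data center, hour) with the lowest estimated footprint.

    ci_forecast maps a data center name to a list of hourly carbon
    intensities in kg CO2e per kWh. Returns (footprint_kg, dc, hour).
    """
    candidates = [
        (energy_kwh * pue_by_dc[dc] * ci, dc, hour)
        for dc, series in ci_forecast.items()
        for hour, ci in enumerate(series)
    ]
    return min(candidates)


# Hypothetical inputs for a 100 kWh batch training job.
pue_by_dc = {"dc-east": 1.5, "dc-north": 1.2}
ci_forecast = {
    "dc-east": [0.5, 0.4, 0.1],     # solar peak expected at hour 2
    "dc-north": [0.3, 0.3, 0.25],
}

footprint, dc, hour = best_slot(100.0, pue_by_dc, ci_forecast)
print(dc, hour, round(footprint, 2))
```

In this made-up forecast the less efficient site wins at hour 2 because its grid is much cleaner then, which shows why PUE and carbon intensity have to be considered together.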
  • 14. Carbon Assessment & Reduction Framework: An Approach for Sustainable Computing. Estimate: energy and CFP per workload, tenant, VM, container, service, etc. Assess: identify hotspots and applicable strategies; calculate potential savings. Act: a set of controllers to dynamically optimize the carbon footprint at operation; design efficient systems. Report: report CFP across your entire organization in a consistent fashion, factoring in requirements.
  • 15. Energy Quantification Challenge • How do you estimate the power consumption of applications running on shared servers? • How do you do that when you do not have on-line power measurement at the server level? • How do you do that if you do not know what else is running on the machine? Generated with Dall-E
  • 16. Energy Quantification Challenge • How do you estimate the power consumption of applications running on shared servers? => ratio-based approach • How do you do that when you do not have on-line power measurement at the server level? => power modeling • How do you do that if you do not know what else is running on the machine? => dynamic power estimation only • How do you scale the approach to developing power models (combinatorial explosion problem)? The Kepler Project: https://github.com/sustainable-computing-io/kepler
  • 17. Kepler Architecture [1] • eBPF metrics: hardware counters, CPU time, and soft IRQ • System power metrics from BMs and VMs • Ratio Power Model for containers • Trained Power Model to estimate the VM's component power consumption. [1] https://github.com/sustainable-computing-io/kepler
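  • 18. Kepler Deployment Approaches: Ratio Power Model for Dynamic CPU Power. With hardware counters: $\mathrm{DynPower}_{\mathrm{process}\,i} = \frac{\mathrm{CPU\ cycles}_{\mathrm{process}\,i}}{\sum \mathrm{CPU\ cycles}} \times \mathrm{DynPower}_{\mathrm{host\_CPU}}$. Without hardware counters: $\mathrm{DynPower}_{\mathrm{process}\,i} = \frac{\mathrm{BPF\ CPU\ time}_{\mathrm{process}\,i}}{\sum \mathrm{BPF\ CPU\ time}} \times \mathrm{DynPower}_{\mathrm{host\_CPU}}$. Per container: $\mathrm{DynPower}_{\mathrm{container}\,j} = \sum_{i \in j} \mathrm{DynPower}_{\mathrm{process}\,i}$. Even distribution of idle power: $\mathrm{Power}_{\mathrm{container}\,j} = \mathrm{IdlePower}_{\mathrm{host\_CPU}} / \mathrm{numContainers}$. [Diagram label: GPU (nvml)]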
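The ratio power model above can be sketched in a few lines. This is not Kepler's implementation (Kepler is a separate open-source project with its own code); it is a minimal illustration of the formulas on the slide, with hypothetical process IDs, container names, and power readings. The `usage_by_process` values stand for either CPU cycles or BPF CPU time, depending on whether hardware counters are available.

```python
from collections import defaultdict


def ratio_power(host_dyn_power_w, host_idle_power_w, usage_by_process, container_of):
    """Attribute host CPU power to containers with the ratio model.

    Dynamic power is split in proportion to each process's share of the
    usage metric; idle power is divided evenly across containers.
    """
    containers = sorted(set(container_of.values()))
    if not containers:
        raise ValueError("at least one container is required")
    total_usage = sum(usage_by_process.values())
    dynamic = defaultdict(float)
    for pid, usage in usage_by_process.items():
        share = usage / total_usage if total_usage > 0 else 0.0
        dynamic[container_of[pid]] += share * host_dyn_power_w
    idle_each = host_idle_power_w / len(containers)
    return {c: {"dynamic_w": dynamic[c], "idle_w": idle_each} for c in containers}


# Hypothetical sample: three processes in two containers.
usage = {101: 600, 102: 200, 201: 200}          # e.g. CPU cycles in the sampling window
container_of = {101: "a", 102: "a", 201: "b"}
result = ratio_power(50.0, 20.0, usage, container_of)
print(result)
```

Here container "a" holds 80% of the observed cycles, so it is assigned 80% of the 50 W dynamic power, while the 20 W idle power is split evenly between the two containers.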
  • 19. Kepler Model Server Project: facilitates training power models for servers without a power meter. Server with power meter, bare metal (BM): RAPL, ACPI/sensors, Redfish/IPMI, GPU (nvml) → Kepler → Ratio Power Model → process/container power consumption. Server without power meter (bare metal or VM): Kepler Model Server → Trained Power Model → Kepler estimated system power metrics → Ratio Power Model → process/container power consumption. Motivation: • No power measurement is exposed or instrumented in some running systems. Challenges: • No data, or not enough data, to train power models specific to all available metrics and emerging system platforms and settings (e.g., variety of CPU architectures, frequency governors) • Dynamicity of control plane processes. Workflow: Collect Data → Train Model → Export Model → Serve Model → Estimate Power
  • 20. Core of the Kepler model server: Pipeline Framework (one extractor, one isolator, multiple trainers). Extract: Prometheus query result → extracted data (energy metric and energy-related metric(s), with background power). Isolate: extracted data → isolated data (without background power). Train: node-level and container-level trainers → power models. https://www.cncf.io/blog/2023/10/11/exploring-keplers-potentials-unveiling-cloud-application-power-consumption/
  • 21. The Issue with Third-Party Clouds • No server power metric available • No knowledge of what else is running on my machine: how to split idle power? • Limited knowledge of the architecture and configuration of the bare metal servers: a challenge for applying separately trained power models • All cloud-native calculators are too coarse-grained to be useful for optimization. Generated with Dall-E
  • 22. How can we get to real-time monitoring of application carbon consumption in third-party clouds? Consistent, Trustworthy, Transparent, Explainable. Can Kepler help? What else do we need? WIP: reference implementation to be open sourced. Adrian Cockcroft: https://adrianco.medium.com/proposal-for-a-realtime-carbon-footprint-standard-60b71c269948
  • 23. Carbon Assessment & Reduction Framework: An Approach for Sustainable Computing (framework slide repeated: Estimate, Assess, Act, Report).
  • 24. Workload Classification: Motivation. Detect non-productive workloads: • Virtual machines • Cloud-native deployments • Cloud services. Can schedules be drawn up for a few (if not all) productive workloads?
  • 25. Methodology. Metrics → Phase Abstraction (inactive/active phases) → Workload Classification → Recommendation. Workload classes: • Non-productive: remaining in the inactive phase • Constantly productive: remaining in the active phase • Alternating: switching between the two phases (repeatable or non-repeatable). Recommendations: Candidate for Termination, Candidate for Parking, Workload Timetabling, No Action. [Figure: timelines for VM1, VM2, …, VM_N; axis labels: T − wc, T; 7/14/21 days]
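The three workload classes defined on this slide can be illustrated with a small classifier. This is a minimal sketch, assuming phases are derived by thresholding a single utilization metric; the threshold value, the sample series, and the `classify_workload` helper are assumptions for illustration, not the authors' method.

```python
def classify_workload(utilization, active_threshold=0.05):
    """Classify a workload from a series of utilization samples (0..1).

    Each sample is abstracted into an active or inactive phase, and the
    workload is labelled by how it moves between the two phases.
    """
    if not utilization:
        raise ValueError("at least one sample is required")
    active = [u >= active_threshold for u in utilization]
    if not any(active):
        return "non-productive"          # remains in the inactive phase
    if all(active):
        return "constantly productive"   # remains in the active phase
    return "alternating"                 # switches between the two phases


# Hypothetical utilization traces for three VMs.
print(classify_workload([0.01, 0.00, 0.02, 0.01]))   # idle VM
print(classify_workload([0.60, 0.75, 0.55, 0.80]))   # busy VM
print(classify_workload([0.70, 0.01, 0.65, 0.02]))   # e.g. a nightly batch VM
```

Deciding whether an alternating workload is repeatable would need an additional periodicity check over the observation window, which this sketch leaves out.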
  • 27. Carbon Assessment & Reduction Framework: An Approach for Sustainable Computing (framework slide repeated: Estimate, Assess, Act, Report).
  • 28. CARE: Carbon Quantification & Reduction. A coordinated set of controllers to dynamically quantify and optimize the carbon footprint at every level of the hybrid cloud stack, in and across on- and off-prem data centers. $\mathrm{CFP} = E_{IT} \times \mathrm{PUE} \times \mathrm{CI}$. Across data centers (dynamic dispatching): leverage renewable energy when and where it is available. Within a data center (container right-sizing, energy-aware scheduler): efficiency with container resource consumption. Infrastructure (VM placement, power management): efficient infrastructure with VM and power management.
  • 29. Part 2: AI Sustainability. Generated by Dall-E
  • 31. The energy cost of AI • Deep learning is computationally intensive • Time-consuming even with high-performance computing resources. Take for example training an image recognition model. Dataset: ImageNet-22K. Network: ResNet-101. 256 GPUs, 7 hours: ~450 kWh. 4 GPUs, 16 days: ~385 kWh. One model training run is ~2 weeks of home energy consumption. https://arxiv.org/abs/1708.02188
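The two training configurations on this slide can be compared with simple arithmetic. The sketch below uses only the approximate figures quoted on the slide; the helper name is hypothetical, and the per-GPU values are averages over the whole run, not measured power.

```python
def avg_kw_per_gpu(energy_kwh, gpus, hours):
    """Average power per GPU (kW) implied by total energy and GPU-hours."""
    return energy_kwh / (gpus * hours)


large_run = avg_kw_per_gpu(450, gpus=256, hours=7)        # 1,792 GPU-hours
small_run = avg_kw_per_gpu(385, gpus=4, hours=16 * 24)    # 1,536 GPU-hours

print(round(large_run, 3), round(small_run, 3))   # both are close to 0.25 kW per GPU
print(round(450 / 385, 2))                        # energy ratio, large run vs small run
```

Under these numbers both runs average roughly 0.25 kW per GPU, and finishing in 7 hours instead of 16 days costs about 17% more total energy, which suggests the extra energy goes mostly into the additional GPU-hours needed at scale.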
  • 32. AI demand keeps surging Training requirements are doubling every 3.5 months Source: Emma Strubell, Ananya Ganesh, and Andrew McCallum. 2019. Energy and Policy Considerations for Deep Learning in NLP. CoRR abs/1906.02243 (2019). arXiv:1906.02243 Source: Roy Schwartz, Jesse Dodge, Noah A. Smith, and Oren Etzioni. 2019. Green AI. arXiv:1907.10597 [cs.CY]
  • 33. The emergence of foundation models. Homogenization: a broad foundation model is adapted to perform specific tasks. Almost all state-of-the-art NLP models are now adapted from one of a few foundation models, such as BERT, RoBERTa, BART, T5, etc. Multimodal and cross-domain models are next. Source: Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, Erik Brynjolfsson, Shyamal Buch, Dallas Card, Rodrigo Castellon, Niladri Chatterji, Annie Chen, Kathleen Creel, Jared Quincy Davis, Dora Demszky, Chris Donahue, et al. 2022. On the Opportunities and Risks of Foundation Models. arXiv:2108.07258 [cs.LG]
  • 34. Sizes of Language Models. Training Cost of Language Model. GPT-3 needs 1024 A100 GPUs for 34 days for training! Large language models are getting larger. Some say that this is okay, because they are re-used for multiple tasks*. This claim is yet to be substantiated by a sound analysis. *E.g., David Patterson, Joseph Gonzalez, Quoc Le, Chen Liang, Lluis-Miquel Munguia, Daniel Rothchild, David So, Maud Texier, and Jeff Dean. 2021. Carbon Emissions and Large Neural Network Training.
  • 35. Data Scientist Dilemma: to adapt or not to adapt • To adapt from a broad model, or to train a smaller model on a more specific data set? • How much data to use? • Can I synthesize a few smaller models? • Neural Architecture Search? Hyperparameter Optimization? Is it worth the cost? Well, it depends… • What is the optimal frequency of re-training? Daily? Weekly? What data shall I use for re-training? Incremental? Complete?
  • 36. Sustainable AI platform principles. Transparency: dynamically track energy and carbon across the data and model life cycle. Traceability and Governance: track the 'supply chain' of models and data-sets and the associated energy and carbon. Energy Efficiency: innovation across all layers of the stack. Meaningful Metrics. 11/3/2023
  • 37. Meaningful Metrics categories. Products: data-set, model. Categories: Core Metric, Life-cycle, Efficiency. Phases: Construction (pre-training), Operation (re-training, inference). Life-cycle: factor in the provenance of models and data-sets and their associated energy and carbon footprint (Life-Cycle-Assessment principles). Efficiency: $\mathrm{efficiency} = \frac{\mathrm{cost}}{\mathrm{work\ produced}}$. What goes into 'cost'? • compute for inference • + training • + bill of material 'tax'. [Diagram labels: D, FM, M]
  • 38. Holistic approach to Sustainable AI: AI sustainability metrics; factor in the entire life cycle of models; sustainable strategy exploration and what-if analysis; provenance, governance, and reporting; holistic impact analysis and tradeoff-based planning.
  • 40. The life cycle of a model as a state machine. Each 'state transition' is associated with a significant energy/carbon cost and involves critical decisions that will affect the cost of this and downstream tasks. • Tradeoffs between accuracy, time-to-value, and energy/carbon • The cost of one phase may depend on decisions taken at a prior stage: save now, pay later… • The particulars of the target task are important to factor in early on.
  • 41. Online Fine-Grained Monitoring of Energy and Carbon with Kepler • An open-source project pioneered by Red Hat and IBM Research to quantify the energy/carbon of cloud-native applications • On the roadmap to deliver in OCP and integrate in ROSA • Adrian Cockcroft is advocating the use of Kepler across all cloud providers: “Real Time Energy and Carbon Standard for Cloud Providers”
  • 42. SusQL: Context-aware aggregation and energy accounting. Infrastructure: a Kubernetes controller with its own CRD that gets data from Kepler for aggregation. susql-controller: map[labels] -> energy table. LabelGroup resource:
      apiVersion: …
      kind: LabelGroup
      metadata: …
      spec:
        labels:
          - <label-1>
          - <label-2>
          - <label-3>
          - <label-4>
      status:
        totalEnergy: <total energy>
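The label-to-energy aggregation described on this slide can be illustrated as follows. This is a conceptual sketch only and does not use or reproduce the SusQL controller's code or API: the matching rule (a workload counts toward a group when it carries all of the group's labels), the group names, and the sample data are all assumptions made for illustration.

```python
def aggregate_energy(samples, label_groups):
    """Sum per-workload energy into label groups.

    samples: list of (labels, joules) pairs, where labels is a set of
             label strings attached to a workload.
    label_groups: dict mapping a group name to the set of labels that
             a workload must carry to be counted in that group.
    """
    totals = {name: 0.0 for name in label_groups}
    for labels, joules in samples:
        for name, required in label_groups.items():
            if required <= labels:   # subset test: workload has every required label
                totals[name] += joules
    return totals


# Hypothetical per-pod energy readings, e.g. as exported by an energy monitor.
samples = [
    ({"team=nlp", "job=train"}, 1200.0),
    ({"team=nlp", "job=serve"}, 300.0),
    ({"team=vision", "job=train"}, 800.0),
]
groups = {
    "nlp-all": {"team=nlp"},
    "nlp-training": {"team=nlp", "job=train"},
}

totals = aggregate_energy(samples, groups)
print(totals)
```

Grouping by labels rather than by pod name lets energy be accounted to a logical unit, such as a training job or a team, even when the underlying pods come and go.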
  • 43. Can we connect the dots? Kepler + Kubeflow. Kubeflow Pipeline example with associated metadata: can we leverage Kepler to add energy data? Source: https://cloud.google.com/blog/topics/developers-practitioners/scalable-ml-workflows-using-pytorch-kubeflow-pipelines-and-vertex-pipelines
  • 45. A ‘Supply Chain’ of models. Models are created (‘manufactured’), distilled, fine-tuned, and re-used (adapted) to create new models. Deployment is just the beginning of the journey. How do we reason about the life-cycle cost of models?
  • 46. Product Life Cycle Assessment Principles for Sustainable AI. Products = data-set | model. We need to factor in the cost of the Bill of Material used in the creation of a new model. If B (a product or a service) is used in the process of creating A1, A2, …, An, then the carbon cost of B is inherited by A1, A2, …, An in proportion to their use.
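The inheritance rule on this slide amounts to a proportional allocation. The following is a minimal sketch, assuming "use" can be expressed as a single non-negative number per downstream product (for example, fine-tuning steps or inference calls); the product names, the usage measure, and the figures are hypothetical.

```python
def inherit_carbon(cost_of_b_kg, usage_by_product):
    """Split the carbon cost of B across products A1..An in proportion to use."""
    total_use = sum(usage_by_product.values())
    if total_use <= 0:
        raise ValueError("total usage must be positive")
    return {
        name: cost_of_b_kg * use / total_use
        for name, use in usage_by_product.items()
    }


# Hypothetical: a foundation model B costing 1000 kg CO2e to pre-train,
# re-used by three downstream models with different amounts of use.
shares = inherit_carbon(1000.0, {"A1": 6, "A2": 3, "A3": 1})
print(shares)
```

The allocated shares always sum to the original cost of B, so the cost is neither double counted nor lost as more downstream models are added; each existing share simply shrinks.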
  • 49. Efficiency at every layer of the AI Stack • Every layer of the FM stack offers opportunities for efficiency gains: Model: quantization, architecture innovation; Tools: dynamic batching; Platform: multiplexing, dispatching; Infrastructure: DVFS, power parameter optimization, caching; Systems: approximate computing and other system innovations • Empower the data scientist to make choices and explore tradeoffs between accuracy, performance, and energy • Empower the data scientist to reason about life-cycle strategies, e.g., if/what/when to re-use, and how much to retrain
  • 51. IBM Research’s Artificial Intelligence Unit (AIU) • Chip architecture optimized for enterprise AI workloads • Enabled for Foundation Models • Enabled in the Red Hat software stack • Supports multi-precision inference (& training): FP16, FP8, INT8, INT4, INT2 • Implemented in leading edge 5nm technology • SoC implements IBM’s leadership innovations in low-precision AI arithmetic and algorithms https://research.ibm.com/blog/ibm-artificial-intelligence-unit-aiu IBM Research AI Hardware Center / © 2023 IBM Corporation
  • 52. Vision for AI Performance Scaling: Scaling precision for quadratic gains in performance with iso-accuracy • Applying Approximate Computing techniques to AI compute • Critical requirement: maintain model accuracy • Advantage: Quadratic improvement in performance • IBM Research has been at the forefront of every major technical advancement on bit-precision scaling: 16-bit training (ICML 2015), 8-bit training (NeurIPS 2018, 2019), 4-bit training (X. Sun et al., NeurIPS 2020), 2/4-bit inference (2018-2020) • Complemented by sparsity support, analog computing, and 3D stacking • 4-bit inference ASICs: J. Choi et al., https://arxiv.org/pdf/1805.06085.pdf; J. McKinstry et al., https://arxiv.org/abs/1809.04191 • 2-bit inference ASICs: J. Choi et al., SysML 2019 • [Chart: Digital AI Cores, bit precision for training and inference by year, 2012 to 2024] • https://research.ibm.com/blog/ibm-artificial-intelligence-unit-aiu • https://research.ibm.com/blog/ai-chip-precision-scaling
  • 53. NorthPole: Neural-inspired memory-on-chip architecture to overcome the von Neumann bottleneck. NorthPole is 25 times more energy efficient, measured as the number of frames interpreted per joule of energy.
  • 55. Vela: A Cloud Native Supercomputer for the Foundation Model Age (Kepler inside) System specifications – Nodes with 8 x A100 GPUs (80GB) – GPUs interconnected with NVLink, NVSwitch – Cascade Lake CPUs, 1.5TB of DRAM – Four 3.2TB NVMe drives – Redundant connections between nodes, TORs and spines – 2 x 100G NICs from each node – NCCL benchmarks show we drive close to line rate https://research.ibm.com/blog/AI-supercomputer-Vela-GPU-cluster – Configure resources through software (APIs) – Broad ecosystem of available cloud services – Leverage data sets on Cloud Object Store – Standard, flexible, scalable infrastructure design (vs traditional HPC) – Near bare metal performance (within 5%, single node) How do you evolve from a specialized (monolithic), costly, and inflexible HPC stack to a Cloud Native stack without compromising efficiency? - Programmability - Scalability - Re-use - Observability - Agility - Democratization
  • 57. Dispatching of jobs based on renewable energy Motivation: • Carbon intensity of the energy mix of different regions of IBM data centers varies over time. • Renewable energy is not available all the time and in all places. Workload Optimization: Placement and scheduling of workloads based on carbon-free energy availability. Ideal dispatching: High CPU utilization when carbon intensity is low and low CPU utilization when carbon intensity is high. Challenge: Ideal dispatching might be practically infeasible. • Short jobs may have short deadlines. • Some jobs are not interruptible. • Jobs have heterogeneous resource demands. Obtaining the optimal packing is intractable. T. Bahreini, A. Tantawi and A. Youssef, "An Approximation Algorithm for Minimizing the Cloud Carbon Footprint through Workload Scheduling," 2022 IEEE 15th International Conference on Cloud Computing (CLOUD), 2022, pp. 522-531.
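To make the dispatching idea concrete, here is a minimal greedy sketch in Python. It is not the approximation algorithm from the cited paper; it only illustrates placing non-interruptible jobs with deadlines into the lowest-carbon time window that still has capacity. The job tuple layout, the slot granularity, and all numbers are assumptions.

```python
def schedule_jobs(intensity, jobs, capacity):
    """Greedy carbon-aware scheduling (illustrative heuristic only).

    intensity: carbon intensity per time slot (e.g. gCO2/kWh).
    jobs: list of (name, duration_slots, deadline_slot, cpus); each job runs
          without interruption and must finish before deadline_slot.
    capacity: CPUs available in every slot.
    Returns {name: start_slot}; jobs that cannot be placed are omitted.
    """
    free = [capacity] * len(intensity)
    plan = {}
    # Place the least flexible jobs first (smallest slack = deadline - duration).
    for name, dur, deadline, cpus in sorted(jobs, key=lambda j: j[2] - j[1]):
        best_start, best_cost = None, None
        last_start = min(deadline, len(intensity)) - dur
        for start in range(0, last_start + 1):
            window = range(start, start + dur)
            if any(free[t] < cpus for t in window):
                continue  # not enough CPUs somewhere in this window
            cost = cpus * sum(intensity[t] for t in window)
            if best_cost is None or cost < best_cost:
                best_start, best_cost = start, cost
        if best_start is not None:
            for t in range(best_start, best_start + dur):
                free[t] -= cpus
            plan[name] = best_start
    return plan


if __name__ == "__main__":
    forecast = [500, 400, 100, 100, 300, 600]  # hypothetical forecast
    jobs = [("short", 1, 2, 4), ("long", 2, 6, 4)]
    print(schedule_jobs(forecast, jobs, capacity=4))
```

The short job cannot wait for the clean slots because of its deadline, which is exactly the kind of constraint that makes the ideal dispatching infeasible in practice.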
  • 58. IEEE Cloud 2022: polynomial approximation algorithms for scheduling in a single data center to minimize carbon. IEEE Cloud 2023: dispatching (placement & scheduling) across data centers to minimize carbon.
  • 59. Dispatch Workloads onto Clusters Spoke Cluster 1 Spoke Cluster 2 Hub Cluster MCAD Dispatcher Spoke Cluster 3 MCAD Runner KubeStellar MCAD Dispatcher • queue & dispatch jobs • resource allocation • quota management • requeue & retry jobs MCAD Runner • run & monitor jobs • monitor cluster KubeStellar • downsync job spec • upsync job status • upsync cluster status placement engine
  • 61. MCAD – Multi Cluster Application Dispatcher https://github.com/project-codeflare/multi-cluster-app-dispatcher • Hard constraints • filter clusters • filter workloads • Soft constraints • score clusters • score workloads • Space dimension • global vs. individual decisions • aggregates • Time dimension • in-time vs. ahead-of-time • delayed start, suspend/resume
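The split between hard and soft constraints can be sketched as a filter-then-score step. The Python below is a hypothetical illustration and does not use or mirror the MCAD API: the cluster fields, the constraint functions, and the numbers are invented for the example.

```python
def pick_cluster(clusters, job, hard_constraints, soft_scores):
    """Filter clusters with hard constraints, then rank the survivors by the
    sum of their soft-constraint scores. Returns None if nothing is feasible."""
    feasible = [c for c in clusters if all(ok(c, job) for ok in hard_constraints)]
    if not feasible:
        return None
    return max(feasible, key=lambda c: sum(score(c, job) for score in soft_scores))


# Hypothetical hard constraint: the cluster must have enough free GPUs.
def has_enough_gpus(cluster, job):
    return cluster["free_gpus"] >= job["gpus"]


# Hypothetical soft constraint: prefer lower carbon intensity.
def low_carbon_score(cluster, job):
    return -cluster["carbon_intensity"]


CLUSTERS = [
    {"name": "spoke-1", "free_gpus": 8, "carbon_intensity": 450},
    {"name": "spoke-2", "free_gpus": 2, "carbon_intensity": 100},
    {"name": "spoke-3", "free_gpus": 16, "carbon_intensity": 200},
]

if __name__ == "__main__":
    choice = pick_cluster(CLUSTERS, {"gpus": 4}, [has_enough_gpus], [low_carbon_score])
    print(choice["name"] if choice else "no feasible cluster")
```

In this example the cleanest cluster is filtered out because it lacks GPUs, so the decision falls to the best-scoring cluster among those that remain feasible.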
  • 65. Call to action: AI Platform providers: - Build in transparency and governance - Incorporate platform and system innovation for efficiency. Academia & Industry: Focus your research on efficiency, not just accuracy. Data Scientists / Practitioners: Develop a sustainability mind-set. Re-use where it makes sense. Domain-specific, smaller models are better! Explore tradeoffs (accuracy vs cost).
  • 66. Tokyo Shin-Kawasaki Delhi Bangalore Singapore Nairobi Haifa Zurich Warrington Dublin Cambridge Albany Yorktown Almaden Rio de Janeiro Sao Paulo Johannesburg 6 Nobel Laureates 10 Medals of Technology 5 National Medals of Science 6 Turing Awards IBM Research

Editor's Notes

  1. energy-efficiency throughout the memory hierarchy. Since different tasks require a different system composition for best utilization, the data centers need to be rearchitected in the future using disaggregation and composability. This allows flexible composition and system configuration to optimally serve a particular task. Considering that the various technical components (CPUs, GPUs, memory, and storage) have different lifecycles, disaggregation additionally improves the system performance and reduces cost, as they can be replaced separately. Common memory systems for AI/ML applications include on-chip memory, high bandwidth memory (HBM), and GDDR, and all have different architectural implications. A universal goal is to realize memory technology with much higher bandwidth and lower latency, while consuming less energy. While HBM DRAMs are already very power-efficient, roughly 2/3 of the power budget is still spent moving data between an SoC and the DRAM (Figure 2.4). Reducing the volume of data moved provides an opportunity for large improvement, but this requires further research. Different concepts for disaggregation of memory and storage have already been proposed, but more research is needed to identify the best way to use disaggregation to achieve TCO benefits at scale and improve latency. To generate these benefits, a multi-tiered memory approach that includes the use of storage-class memories is needed. The new architectures can pose a challenge but can also provide an opportunity for application development. The impact on legacy code needs to be understood and mitigated.
  2. Foundation models have led to an unprecedented level of homogenization: Almost all state-of- the-art NLP models are now adapted from one of a few foundation models, such as BERT, RoBERTa, BART, T5, etc.
  3. Training GPT-3, which is a single general-purpose AI program that can generate language and has many different uses, took 1.287 gigawatt hours, according to a research paper published in 2021, or about as much electricity as 120 US homes would consume in a year. That training generated 502 tons of carbon emissions, according to the same paper, or about as much as 110 US cars emit in a year. That’s for just one program, or “model.” While training a model has a huge upfront power cost, researchers found in some cases it’s only about 40% of the power burned by the actual use of the model, with billions of requests pouring in for popular programs. Plus, the models are getting bigger. OpenAI’s GPT-3 uses 175 billion parameters, or variables, that the AI system has learned through its training and retraining. Its predecessor used just 1.5 billion. Another relative measure comes from Google, where researchers found that artificial intelligence made up 10 to 15% of the company’s total electricity consumption, which was 18.3 terawatt hours in 2021. That would mean that Google’s AI burns around 2.3 terawatt hours annually, about as much electricity each year as all the homes in a city the size of Atlanta. https://www.bloomberg.com/news/articles/2023-03-09/how-much-energy-do-ai-and-chatgpt-use-no-one-knows-for-sure
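The figures quoted in this note can be cross-checked with simple arithmetic. The sketch below only reuses the numbers stated above; the observation that the 2.3 terawatt hour figure matches the midpoint of the 10 to 15% range is an inference, not something the note states.

```python
# Figures quoted in the note above (from the cited Bloomberg article).
GPT3_TRAINING_GWH = 1.287
US_HOMES_EQUIVALENT = 120
GOOGLE_TOTAL_TWH_2021 = 18.3
AI_SHARE_LOW, AI_SHARE_HIGH = 0.10, 0.15


def implied_home_mwh_per_year():
    """Annual electricity per US home implied by the GPT-3 comparison (MWh)."""
    return GPT3_TRAINING_GWH * 1000 / US_HOMES_EQUIVALENT


def google_ai_twh_range():
    """Low, midpoint, and high estimates of Google's AI electricity use (TWh)."""
    low = GOOGLE_TOTAL_TWH_2021 * AI_SHARE_LOW
    high = GOOGLE_TOTAL_TWH_2021 * AI_SHARE_HIGH
    return low, (low + high) / 2, high


if __name__ == "__main__":
    print(f"Implied use per home: {implied_home_mwh_per_year():.1f} MWh/year")
    low, mid, high = google_ai_twh_range()
    print(f"Google AI estimate: {low:.2f} to {high:.2f} TWh, midpoint {mid:.1f} TWh")
```

The 10 to 15% share spans roughly 1.8 to 2.7 terawatt hours, so the quoted 2.3 terawatt hours appears to correspond to the middle of that range.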
  4. Packaged as a PCIe card, for ease of integration into virtually any on-premises or cloud system. Integration into the IBM Watson software stack is underway, to power the AI inference infrastructure of IBM Research’s Foundation Model Big Bet. Enabled in the Red Hat software stack, including PyTorch and TensorFlow integration.
  5. We can drop from 32-bit floating point arithmetic to bit-formats holding a quarter as much information. This simplified format dramatically cuts the amount of number crunching needed to train and run an AI model, without sacrificing accuracy. We leverage key IBM breakthroughs from the last five years to find the best tradeoff between speed and accuracy. This is not a chip we designed entirely from scratch. Rather, it’s the scaled version of an already proven AI accelerator built into our Telum chip that powers the IBM z16 system.
  6. So, we asked ourselves: how do we deliver bare-metal performance inside of a VM? Following a significant amount of research and discovery, we devised a way to expose all of the capabilities on the node (GPUs, CPUs, networking, and storage) into the VM so that the virtualization overhead is less than 5%, which is the lowest overhead in the industry that we’re aware of. This work includes configuring the bare-metal host for virtualization with support for Virtual Machine Extensions (VMX), single-root IO virtualization (SR-IOV), and huge pages. We also needed to faithfully represent all devices and their connectivity inside the VM, such as which network cards are connected to which CPUs and GPUs, how GPUs are connected to the CPU sockets, and how GPUs are connected to each other. These, along with other hardware and software configurations, enabled our system to achieve close to bare metal performance. Bare metal vs. VMs || Ethernet vs. InfiniBand || OpenShift scheduling (MCAD) vs. LSF. Enabling SR-IOV for our network interface cards on each node, thereby exposing each 100G link directly into the VMs via virtual functions. We can hide the communication time over the network behind compute time occurring on the GPUs. This approach is aided by our choice of GPUs with 80GB of memory (discussed above), which allows us to use bigger batch sizes (compared to the 40 GB model), and leverage the Fully Sharded Data Parallel (FSDP) training strategy more efficiently. Next we’ll be rolling out an implementation of remote direct memory access (RDMA) over converged ethernet (RoCE) at scale and GPU Direct RDMA (GDR), to deliver the performance benefits of RDMA and GDR while minimizing adverse impact to other traffic. Our lab measurements indicate that this will cut latency in half.
  7. One effective technique is known as model growth. Using the model growth method, researchers can increase the size of a transformer by copying neurons, or even entire layers of a previous version of the network, then stacking them on top. They can make a network wider by adding new neurons to a layer or make it deeper by adding additional layers of neurons. In contrast to previous approaches for model growth, parameters associated with the new neurons in the expanded transformer are not just copies of the smaller network’s parameters, Kim explains. Rather, they are learned combinations of the parameters of the smaller model. LiGO also expands width and depth simultaneously, which makes it more efficient than other methods. A user can tune how wide and deep they want the larger model to be when they input the smaller model and its parameters, Kim explains.