Deep Red
The Environmental Impact of Deep Learning
Paolo Caressa – GSE spa
About the speaker: a life pie…
Study & Leisure: 48%
Math Research: 20%
IT Consultant: 3%
Quant: 8%
IT PM & PgM: 21%
What is energy?
Energy is force (= ma) times displacement, or
Energy has the dimensions of mass times the square of velocity
Example: Kinetic Energy = ½ × mass × velocity²
Source: Wikipedia. By Ferdinand Schmutzer (1870-1928) - Edited version of Image:Einstein1921 by F Schmutzer 2.jpg.,
Public domain, https://commons.wikimedia.org/w/index.php?curid=5216482
Source: https://giphy.com/gifs/looneytunes-angry-mad-1wPC7WSiRq6pEpcBOo
Also Einstein’s famous
E = mc²
confirms it…
Power is energy per unit time
We use energy during a certain time interval, mainly to move around:
power is the amount of energy transferred in a unit of time, aka power =
force times velocity
For example, to climb a 700 m high mountain
in an hour, a man weighing 80 kg (≈ 800 N)
needs a power of
P = F × v = (m × g) × v
= 800 N × 700 m / 1 h
= 800 N × 700 m / 3600 s
≈ 156 W
Source: https://giphy.com/explore/mountain-climbing
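The arithmetic above can be checked with a minimal Python sketch. The function name `climbing_power` is an illustrative choice, not something from the slides; the numbers are the ones used in the example.

```python
def climbing_power(weight_newton: float, height_m: float, duration_s: float) -> float:
    """Average power in watts needed to lift a weight through a height in a given time."""
    return weight_newton * height_m / duration_s


# Values from the example: 800 N lifted by 700 m in one hour (3600 s).
power_w = climbing_power(800.0, 700.0, 3600.0)
print(f"{power_w:.1f} W")
```

This only accounts for the work done against gravity; a real climber dissipates considerably more.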
Why do we get tired?
However no one could keep walking or
climbing forever: indeed we say that we
consume energy, which is not correct since,
as Antoine de Lavoisier (1743-1794) put it:
Energy is neither created nor destroyed but
just transformed.
Rather...
Source: Wikipedia. By Louis Jean Desire Delaistre, after Boilly - Rev.
Superinteressante, n. 23, Public domain,
https://commons.wikimedia.org/w/index.php?curid=5507967
During transformation energy
dissipates!
Whenever we convert or transmit
energy it dissipates a bit.
Thus “conservative systems” are but a
(useful) fiction.
Dissipative effects are due to friction,
resistance, etc.
Example: Joule heating
Source: © Science Photo Library Limited 2019
https://www.sciencephoto.com/media/1064671/view/joule-s-heat-equivalence-experiment-1840s
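As a rough illustration of Joule heating, the power dissipated as heat in a resistor is P = I² × R. The sketch below uses made-up values (2 A through 10 Ω for one minute), chosen only to keep the arithmetic easy; the function name is hypothetical.

```python
def joule_heating_power(current_amperes: float, resistance_ohms: float) -> float:
    """Power in watts dissipated as heat by a resistor (Joule's first law)."""
    return current_amperes ** 2 * resistance_ohms


# Illustrative values, not taken from the slides.
power_w = joule_heating_power(2.0, 10.0)   # watts
heat_j = power_w * 60.0                    # joules dissipated over 60 seconds
print(f"{power_w:.0f} W, {heat_j:.0f} J in one minute")
```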
Human industry exploits
dissipation!!!
Source: Wikipedia. By Nijs, Jac de / Anefo - [1] Dutch National
Archives, The Hague, Fotocollectie Algemeen Nederlands
Persbureau (ANeFo), 1945-1989, Nummer toegang 2.24.01.03
Bestanddeelnummer 913-7320, CC BY-SA 3.0 nl,
https://commons.wikimedia.org/w/index.php?curid=31527984
Source: Wikipedia. By KMJ - de.wikipedia,
original upload 26 Jun 2004 by
de:Benutzer:KMJ, CC BY-SA 3.0,
https://commons.wikimedia.org/w/index.php?curid=242907
The dark side of dissipation…
The problem with dissipation is not just wasting resources…
Rather, dissipative effects and energy transformations produce
waste which may have a negative impact on the environment.
For instance the infamous CO2
Carbon dioxide!!!
Source: Wikipedia. By Nijs, Jac de / Anefo - [1] Dutch National Archives, The Hague,
Fotocollectie Algemeen Nederlands Persbureau (ANeFo), 1945-1989, Nummer toegang
2.24.01.03 Bestanddeelnummer 913-7320, CC BY-SA 3.0 nl,
https://commons.wikimedia.org/w/index.php?curid=31527984
No renewable sources
Following Lavoisier, energy sources are «stores» from which energy is
transformed: each time we «take» energy from such a source,
the source depletes, until it gets exhausted. If it does not, we say that the
source is renewable.
Thermodynamic principles imply that renewable sources do
not exist…
But some sources, at human timescale, may be approximated
as renewable (sun, wind, tides, geothermal, …)
https://ips-dc.org/crony-capitalism-cant-save-coal-country/
The moral is
We should avoid energy transformations that yield harmful
byproducts (such as CO2).
We should avoid relying on non-renewable resources
We should avoid wasting energy in general: it is a limited
resource…
Of course… we don’t!
We produce emissions
Source: IEA, "CO2 emissions by energy source, World 1990-2017", IEA, Paris https://www.iea.org/data-and-statistics?country=WORLD&fuel=CO2%20emissions&indicator=CO2%20emissions%20by%20energy%20source
We do not pursue sustainability
Source: IEA, "Electricity generation by fuel and scenario, 2018-2040", IEA, Paris https://www.iea.org/data-and-statistics/charts/electricity-generation-by-fuel-and-scenario-2018-2040
We keep on consuming
Source: IEA, "Total final consumption (TFC) by sector, World 1990-2017 ", IEA, Paris https://www.iea.org/data-and-statistics?country=WORLD&fuel=Energy%20consumption&indicator=Total%20final%20consumption%20(TFC)%20by%20sector
Carbon footprint
“The carbon footprint is a measure of the exclusive total amount
of carbon dioxide emissions that is directly and indirectly caused
by an activity or is accumulated over the life stages of a product”
So IT activities and products do have a carbon footprint, too.
Source: Wiedmann, T. and Minx, J. (2008). A Definition of 'Carbon Footprint'. In: C. C. Pertsova, Ecological Economics Research
Trends: Chapter 1, pp. 1-11, Nova Science Publishers, Hauppauge NY,
USA. https://www.novapublishers.com/catalog/product_info.php?products_id=5999.
IT activities dissipate and emit
waste
Computers (and all that: servers, tablets, mobile phones, etc.) do
consume energy and do dissipate it. This consumption usually
entails CO2 (and other) emissions, while dissipation is mainly
due to Joule's law
For example, electricity, which is needed to run electronic devices,
is transformed from other kinds of energy, which may be non-renewable
ones, such as fossil fuels...
Moreover, fans are attached to motherboards to cool them down,
and these in turn consume and dissipate electric energy…
Source: New York Times. https://www.nytimes.com/2012/09/23/technology/data-centers-waste-vast-amounts-of-energy-belying-industry-image.html
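To make the link between electricity use and emissions concrete, here is a minimal sketch: emissions are estimated as energy drawn (kWh) times the carbon intensity of the electricity supply. The function name and all three input values (a 300 W device, 24 hours, 0.4 kg CO2 per kWh) are illustrative assumptions, not measurements and not figures from the slides.

```python
def device_emissions_kg(power_watts: float, hours: float, kg_co2_per_kwh: float) -> float:
    """Estimated CO2 emissions in kg for a device drawing constant power."""
    energy_kwh = power_watts * hours / 1000.0
    return energy_kwh * kg_co2_per_kwh


# Assumed values for illustration only: one 300 W device running for a day
# on a grid that emits 0.4 kg of CO2 per kWh.
emissions_kg = device_emissions_kg(300.0, 24.0, 0.4)
print(f"{emissions_kg:.2f} kg CO2")
```

The carbon intensity depends heavily on the energy mix, so the same workload can have very different footprints in different places.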
IT accounts for 10% of electricity
consumption
Source: Guillaume Jacquart, “Digital Carbon Footprint — What can we do ?”. https://medium.com/@guillaumejacquart/digital-carbon-footprint-what-can-we-do-d676480a556d
Estimates on renewables %
Graph based on data found in Strubell, Ganesh, McCallum, Energy and Policy Considerations for Deep Learning in NLP, https://arxiv.org/abs/1906.02243
[Chart: share of energy sources (%, 0-100) for China, Germany, United States, Amazon AWS, Google, Microsoft; categories: Renewables, Gas, Coal, Nuclear, Other]
Deep Learning everywhere
Artificial Intelligence is as old as computer science: for
example, Alan Turing (1912-1954) helped found both!
In the 80s one could program expert systems on 8-bit CPUs
with a storage of 64K or so: I can confirm it!
Today, the state-of-the-art AI paradigm, deep learning, which is
widespread and universally adopted, requires massively parallel
computations and is aimed at processing terabytes of data.
Source: https://www.ibm.com/blogs/watson/2018/03/deep-learning-service-ibm-makes-advanced-ai-accessible-users-everywhere/
Some features of DL algorithms
Deep Learning algorithms are “just” neural networks
They display many layers connected via nonlinear
transformations, for a total of millions or even billions of
neurons
They work exceedingly well, but why their performance is so
astonishing is still poorly understood, at least from the
theoretical point of view.
More features of DL algorithms
Deep learning algorithms use different layers of the neural
networks to perform different tasks and to concentrate on
different “concepts”: e.g. the form of an object in an image, etc.
To work properly, deep learning algorithms need to be trained:
they have to be fed with huge amounts of data in an orgy of
iterated parallel computations
Deep learning algorithms depend on “hyper-parameters” which
have to be empirically fine-tuned by trial and error
Carbon footprint of DL Training
Recently the carbon footprint of training some NLP models (DL
algorithms aimed at text classification and translation)
has been estimated and compared with other sources of emissions:
Activity                                    CO2 emissions (metric tons)
Air travel, 1 passenger, NY->SF             0.9
Human life (average), 1 year                5
American life (average), 1 year             16.4
Car (average) including fuel, 1 lifetime    57.15
NLP Transformer training                    0.09
NLP BERT training                           0.65
NLP Neural Architecture Search training     284.02
Source: Strubell, Ganesh, McCallum, Energy and Policy Considerations for Deep Learning in NLP, https://arxiv.org/abs/1906.02243
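The figures in the table are easier to read as ratios. The sketch below only re-uses the numbers listed above; the dictionary keys and the helper name `ratio` are illustrative.

```python
# Emissions in tons of CO2, copied from the table above.
emissions_tons = {
    "flight_ny_sf": 0.9,
    "human_life_year": 5.0,
    "american_life_year": 16.4,
    "car_lifetime": 57.15,
    "transformer_training": 0.09,
    "bert_training": 0.65,
    "nas_training": 284.02,
}


def ratio(activity: str, reference: str) -> float:
    """How many times the reference activity fits into the given activity."""
    return emissions_tons[activity] / emissions_tons[reference]


nas_vs_car = ratio("nas_training", "car_lifetime")
bert_vs_flight = ratio("bert_training", "flight_ny_sf")
print(f"NAS training is about {nas_vs_car:.2f} car lifetimes")
print(f"BERT training is about {bert_vs_flight:.2f} NY->SF flights")
```

So the Neural Architecture Search run is close to five car lifetimes, while a single BERT training is less than one passenger's flight.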
Don’t panic
The analysis of Strubell, Ganesh, McCallum stresses that training
deep learning models is expensive in energetic terms (and therefore
also in terms of dissipation and waste).
On the other hand, inference is also very expensive, and it is
estimated to account for 80%-90% of the total computational cost (e.g.
https://www.forbes.com/sites/moorinsights/2019/05/09/google-cloud-doubles-down-on-nvidia-gpus-for-inference/#2cc458267926)
However, the most consuming model (Transformer with neural
architecture search, whatever it is) is an outlier in terms of computations
needed: the average is an order of magnitude less
Be aware, don’t beware!
Measuring and being aware of the energy
impact of deep learning matters because it lets us steer our use of it
toward a sustainable path
In the same paper by Strubell, Ganesh, McCallum, some policy
suggestions are provided: I simply quote them in the following
slides
Authors should report training time
and sensitivity to hyper-parameters
This will enable direct comparison across models, allowing
subsequent consumers of these models to accurately assess
whether the required computational resources are compatible
with their setting. Realizing this will require:
• a standard, hardware-independent measurement of training
time, such as gigaflops required to convergence
• a standard measurement of model sensitivity to data and
hyper-parameters, such as variance with respect to
hyper-parameters searched
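One simple way to report sensitivity along these lines is the variance of a validation metric across the hyper-parameter configurations that were tried. The sketch below is only an illustration: the configurations and accuracy values are invented, and the paper does not prescribe this exact statistic.

```python
import statistics

# Hypothetical validation accuracies, one per hyper-parameter configuration tried.
scores_by_config = {
    "lr=1e-4, dropout=0.1": 0.80,
    "lr=1e-3, dropout=0.1": 0.82,
    "lr=1e-4, dropout=0.3": 0.78,
    "lr=1e-3, dropout=0.3": 0.84,
}

scores = list(scores_by_config.values())
mean_score = statistics.mean(scores)
score_variance = statistics.variance(scores)  # sample variance across configurations
score_range = max(scores) - min(scores)

print(f"mean={mean_score:.3f} variance={score_variance:.6f} range={score_range:.2f}")
```

A small variance suggests the model is robust to the searched hyper-parameters, so fewer costly tuning runs are needed to reproduce the result.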
Academic researchers need equitable
access to computation resources
Most of the recent DL advances were developed outside
academia, since industry has access to large-scale compute
To make such access possible for academia too, it would be
more cost-effective to pool resources to build shared compute
centers at the level of funding agencies, such as the U.S.
National Science Foundation, instead of using cloud services
such as AWS
Researchers should prioritize
computationally efficient
hardware and algorithms
A concerted effort by industry and academia is desirable to
promote research into more computationally efficient algorithms,
as well as hardware that requires less energy
Also, it is desirable to provide easy-to-use APIs implementing
more efficient alternatives to brute-force grid search for
hyper-parameter tuning, e.g. random or Bayesian hyper-parameter
search techniques
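The difference between grid search and random search can be sketched on a toy problem. Everything here is an assumption made for illustration: `toy_validation_loss` is a cheap stand-in for a full training run, and the search ranges are arbitrary. Both strategies get the same budget of runs, which is the quantity that drives energy use.

```python
import itertools
import random


def toy_validation_loss(log10_lr: float, dropout: float) -> float:
    """Hypothetical stand-in for an expensive training run; lower is better."""
    return (log10_lr + 3.0) ** 2 + 4.0 * (dropout - 0.2) ** 2


def grid_search(lr_values, dropout_values):
    """Evaluate every combination of the given values."""
    trials = [((lr, d), toy_validation_loss(lr, d))
              for lr, d in itertools.product(lr_values, dropout_values)]
    return min(trials, key=lambda t: t[1]), len(trials)


def random_search(budget: int, seed: int = 0):
    """Evaluate `budget` configurations sampled uniformly from the same ranges."""
    rng = random.Random(seed)
    trials = []
    for _ in range(budget):
        params = (rng.uniform(-5.0, -1.0), rng.uniform(0.0, 0.5))
        trials.append((params, toy_validation_loss(*params)))
    return min(trials, key=lambda t: t[1]), len(trials)


grid_best, grid_runs = grid_search([-5.0, -4.0, -3.0, -2.0, -1.0], [0.0, 0.25, 0.5])
rand_best, rand_runs = random_search(budget=grid_runs)
print(f"grid:   {grid_runs} runs, best loss {grid_best[1]:.4f} at {grid_best[0]}")
print(f"random: {rand_runs} runs, best loss {rand_best[1]:.4f} at {rand_best[0]}")
```

Random search is not guaranteed to win on any single problem; its appeal is that the budget can be set directly instead of growing multiplicatively with each added hyper-parameter.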
A new hope
The debate on the energy consumption of DL is hot and interesting:
however, it should be stressed that those same computationally
intensive models may be used to help fight climate
and environmental issues.
For example, a collective effort (which includes Yoshua Bengio) aims
at proving that machine learning can be a powerful tool in
reducing greenhouse gas emissions and helping society adapt to a
changing climate https://arxiv.org/abs/1906.05433
Stay tuned for more information on the next IAML MeetUp!!!
Source: http://theconversation.com/star-wars-planet-with-two-suns-a-step-towards-luke-skywalkers-tatooine-3379
Thanks for your attention!!!
Q&A
Paolo Caressa
https://www.linkedin.com/in/paolocaressa/
https://twitter.com/www_caressa_it
http://www.caressa.it

Thijs Feryn
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
Alison B. Lowndes
 

Deep red - The environmental impact of deep learning (Paolo Caressa)

  • 1. Deep Red The Environmental Impact of Deep Learning Paolo Caressa – GSE spa
  • 2. About the speaker: a life pie… Study & Leisure; 48,00% Math Research; 20,00% IT Consultant; 3,00% Quant; 8,00% IT PM & PgM; 21,00%
  • 3. What is energy? Energy is force (= ma) times a displacement, or Energy is mass times the square of velocity. Example: Kinetic Energy = ½ mass × velocity². Also Einstein’s famous E = mc² confirms it… Source: Wikipedia. By Ferdinand Schmutzer (1870-1928) - Edited version of Image:Einstein1921 by F Schmutzer 2.jpg, Public domain, https://commons.wikimedia.org/w/index.php?curid=5216482 Source: https://giphy.com/gifs/looneytunes-angry-mad-1wPC7WSiRq6pEpcBOo
  • 4. Power is energy per unit time We use energy during a certain time interval, mainly to move around: power is the amount of energy transferred in a unit of time, i.e. power = force times velocity. For example, to climb a 700 m high mountain in an hour, a man weighing 80 kg (≈ 800 N) needs a power of P = F × v = (m × a) × v = 800 N × 700 m / 1 h = 800 N × 700 m / 3600 s ≈ 156 W Source: https://giphy.com/explore/mountain-climbing
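The climbing arithmetic on slide 4 can be checked with a short sketch. The function name `climbing_power_watts` and the rounded gravitational acceleration g = 10 m/s² (which gives the slide's 800 N for an 80 kg climber) are assumptions made for illustration only.

```python
def climbing_power_watts(mass_kg, height_m, duration_s, g=10.0):
    """Average power needed to lift mass_kg by height_m in duration_s.

    g = 10 m/s^2 is a rounded value, assumed here to match the slide's 800 N.
    """
    force_n = mass_kg * g                # weight: F = m × a
    velocity_ms = height_m / duration_s  # average vertical speed in m/s
    return force_n * velocity_ms         # P = F × v


power = climbing_power_watts(mass_kg=80, height_m=700, duration_s=3600)
print(f"{power:.1f} W")
```

The result is an average over the whole hour; it ignores horizontal motion and the body's own inefficiency, so the real metabolic cost would be higher.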
  • 5. Why do we get tired? However, no one could keep walking or climbing forever: indeed we say that we consume energy, which is not correct since, as Antoine de Lavoisier (1743-1794) put it: Energy is neither created nor destroyed but just transformed. Rather... Source: Wikipedia. By Louis Jean Desire Delaistre, after Boilly - Rev. Superinteressante, n. 23, Public domain, https://commons.wikimedia.org/w/index.php?curid=5507967
  • 6. During transformation energy dissipates! Whenever we convert or transmit energy it dissipates a bit. Thus “conservative systems” are but a (useful) fiction. Dissipative effects are due to friction, resistance, etc. Example: Joule heating Source: © Science Photo Library Limited 2019 https://www.sciencephoto.com/media/1064671/view/joule-s-heat-equivalence-experiment-1840s
  • 7. Human industry exploits dissipation!!! Source: Wikipedia. By Nijs, Jac de / Anefo - [1] Dutch National Archives, The Hague, Fotocollectie Algemeen Nederlands Persbureau (ANeFo), 1945-1989, Nummer toegang 2.24.01.03 Bestanddeelnummer 913-7320, CC BY-SA 3.0 nl, https://commons.wikimedia.org/w/index.php?curid=31527984 Source: Wikipedia. By KMJ - de.wikipedia, original upload 26 Jun 2004 by de:Benutzer:KMJ, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=242907
  • 8. The dark side of dissipation… The problem with dissipation is not just wasting resources… Rather, dissipative effects and energy transformations produce waste which may have a negative impact on the environment. For instance the infamous CO2: carbon dioxide!!! Source: Wikipedia. By Nijs, Jac de / Anefo - [1] Dutch National Archives, The Hague, Fotocollectie Algemeen Nederlands Persbureau (ANeFo), 1945-1989, Nummer toegang 2.24.01.03 Bestanddeelnummer 913-7320, CC BY-SA 3.0 nl, https://commons.wikimedia.org/w/index.php?curid=31527984
  • 9. No renewable sources Following Lavoisier, energy sources are «stores» from which energy is transformed: each time we «take» energy from such a source, the source depletes, until it gets exhausted. If it does not, we say that the source is renewable. Thermodynamic principles imply that renewable sources do not exist… but some sources, at human timescale, may be approximated as renewable (sun, wind, tides, geothermal, …) https://ips-dc.org/crony-capitalism-cant-save-coal-country/
  • 10. The moral is We should avoid energy transformations implying bad byproducts (such as CO2). We should avoid relying on non-renewable resources. We should avoid wasting energy in general: it is a limited resource… Of course… we don’t!
  • 11. We produce emissions Source: IEA, "CO2 emissions by energy source, World 1990-2017", IEA, Paris https://www.iea.org/data-and-statistics?country=WORLD&fuel=CO2%20emissions&indicator=CO2%20emissions%20by%20energy%20source
  • 12. We do not pursue sustainability Source: IEA, "Electricity generation by fuel and scenario, 2018-2040", IEA, Paris https://www.iea.org/data-and-statistics/charts/electricity-generation-by-fuel-and-scenario-2018-2040
  • 13. We keep on consuming Source: IEA, "Total final consumption (TFC) by sector, World 1990-2017", IEA, Paris https://www.iea.org/data-and-statistics?country=WORLD&fuel=Energy%20consumption&indicator=Total%20final%20consumption%20(TFC)%20by%20sector
  • 14. Carbon footprint “The carbon footprint is a measure of the exclusive total amount of carbon dioxide emissions that is directly and indirectly caused by an activity or is accumulated over the life stages of a product” So IT activities and products do have a carbon footprint, too. Source: Wiedmann, T. and Minx, J. (2008). A Definition of 'Carbon Footprint'. In: C. C. Pertsova, Ecological Economics Research Trends: Chapter 1, pp. 1-11, Nova Science Publishers, Hauppauge NY, USA. https://www.novapublishers.com/catalog/product_info.php?products_id=5999.
  • 15. IT activities dissipate and emit waste Computers (and all that: servers, tablets, mobile phones, etc.) do consume energy and do dissipate it. This consumption usually entails CO2 (and other) emissions, while dissipation is mainly due to Joule's law. For example, electricity, which is needed to run electronic devices, is transformed from other kinds of energy, which may be non-renewable, such as fossil fuels... Moreover, fans attached to motherboards to cool them down in turn consume and dissipate electric energy… Source: New York Times. https://www.nytimes.com/2012/09/23/technology/data-centers-waste-vast-amounts-of-energy-belying-industry-image.html
  • 16.
  • 17. IT accounts for 10% of electricity consumption Source: Guillaume Jacquart, “Digital Carbon Footprint — What can we do ?”. https://medium.com/@guillaumejacquart/digital-carbon-footprint-what-can-we-do-d676480a556d
  • 18. Estimates on renewables % Graph based on data found in Strubell, Ganesh, McCallum, Energy and Policy Considerations for Deep Learning in NLP, https://arxiv.org/abs/1906.02243 [Bar chart: percentage (0 to 100) of energy from Renewables, Gas, Coal, Nuclear and Other sources, for China, Germany, United States, Amazon AWS, Google and Microsoft]
  • 19. Deep Learning everywhere Artificial Intelligence is as old as computer science: for example, Alan Turing (1912-1954) helped found both! In the 80s one could program expert systems on 8-bit CPUs with a storage of 64K or so: I can confirm it! Today, the state-of-the-art AI paradigm, deep learning, which is widespread and universally adopted, requires massively parallel computations and is aimed at processing terabytes of data. Source: https://www.ibm.com/blogs/watson/2018/03/deep-learning-service-ibm-makes-advanced-ai-accessible-users-everywhere/
  • 20. Some features of DL algorithms Deep Learning algorithms are “just” neural networks. They display many layers connected via nonlinear transformations, for a total of millions or even billions of neurons. They work exceedingly well, but why their performance is so astonishing is still poorly understood, at least from the theoretical point of view.
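To make "many layers connected via nonlinear transformations" concrete, here is a minimal pure-Python sketch of a forward pass through a tiny network. The layer sizes, the tanh activation and the helper names `dense_layer` and `random_layer` are illustrative assumptions; real deep learning models have the same structure at vastly larger scale.

```python
import math
import random


def dense_layer(inputs, weights, biases):
    """One layer: weighted sums followed by a nonlinear activation (tanh)."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def random_layer(n_in, n_out, rng):
    """Hypothetical helper: random weights and zero biases for one layer."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


rng = random.Random(0)
sizes = [4, 8, 8, 2]  # input width, two hidden layers, output width
layers = [random_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]

# Forward pass: the output of each layer is the input of the next one.
activations = [0.5, -0.2, 0.1, 0.9]
for weights, biases in layers:
    activations = dense_layer(activations, weights, biases)

# Every weight and bias is a parameter that training has to adjust.
n_params = sum(len(w) * len(w[0]) + len(b) for w, b in layers)
print(len(activations), n_params)
```

Even this toy network has 130 parameters; the energy cost discussed in the talk comes from repeating such passes (and the corresponding weight updates) over huge datasets with millions or billions of parameters.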
  • 21. More features of DL algorithms Deep learning algorithms use different layers of the neural network to perform different tasks and to concentrate on different “concepts”: e.g. the form of an object in an image, etc. To work properly, deep learning algorithms need to be trained: they have to be fed huge amounts of data in an orgy of iterated parallel computations. Deep learning algorithms depend on “hyper-parameters” which have to be empirically fine-tuned by trial and error.
  • 22. Carbon footprint of DL Training Recently the carbon footprint of training some NLP models (DL algorithms aimed at text classification and translation) has been estimated and compared to other sources of emissions:
      Activity                                    CO2 emission (tons)
      Air travel, 1 passenger, NY->SF             0.9
      Human life (average), 1 year                5
      American life (average), 1 year             16.4
      Car (average) including fuel, 1 lifetime    57.15
      NLP Transformer training                    0.09
      NLP BERT training                           0.65
      NLP Neural Architecture Search training     284.02
    Source: Strubell, Ganesh, McCallum, Energy and Policy Considerations for Deep Learning in NLP, https://arxiv.org/abs/1906.02243
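The figures on slide 22 are easier to compare as ratios. The sketch below only re-uses the numbers quoted on the slide; the dictionary keys and the helper `ratio` are names chosen here for illustration.

```python
# CO2 emissions in tons, copied from the table on slide 22
# (figures attributed there to Strubell, Ganesh, McCallum).
emissions_tons = {
    "flight_ny_sf_one_passenger": 0.9,
    "human_life_one_year": 5.0,
    "american_life_one_year": 16.4,
    "car_lifetime_with_fuel": 57.15,
    "nlp_transformer_training": 0.09,
    "nlp_bert_training": 0.65,
    "nlp_nas_training": 284.02,
}


def ratio(activity, reference):
    """How many times `activity` emits compared to `reference`."""
    return emissions_tons[activity] / emissions_tons[reference]


nas_vs_car = ratio("nlp_nas_training", "car_lifetime_with_fuel")
bert_vs_flight = ratio("nlp_bert_training", "flight_ny_sf_one_passenger")
print(f"NAS training is about {nas_vs_car:.2f} car lifetimes")
print(f"BERT training is about {bert_vs_flight:.2f} NY->SF flights")
```

On these numbers the neural architecture search run is roughly five car lifetimes, while a single BERT training run stays below one passenger's NY->SF flight.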
  • 23. Don’t panic The analysis of Strubell, Ganesh, McCallum stresses that training deep learning models is expensive in energetic terms (and therefore also in dissipative and wasting terms). On the other hand, inference is also very expensive, and it is estimated to account for 80%-90% of the total computational cost (e.g. https://www.forbes.com/sites/moorinsights/2019/05/09/google-cloud-doubles-down-on-nvidia-gpus-for-inference/#2cc458267926). However, the most consuming model (the Transformer trained with neural architecture search) is an outlier in terms of computations needed: the average is an order of magnitude less.
  • 24. Be aware, don’t beware! The importance of measuring and being aware of the energy impact of deep learning is that we can steer our use of it toward a sustainable path. In the same paper by Strubell, Ganesh, McCallum, some policy suggestions are provided: I simply quote them in the following slides.
  • 25. Authors should report training time and sensitivity to hyper-parameters This will enable direct comparison across models, allowing subsequent consumers of these models to accurately assess whether the required computational resources are compatible with their setting. Realizing this will require: • a standard, hardware-independent measurement of training time, such as gigaflops required to convergence • a standard measurement of model sensitivity to data and hyper-parameters, such as variance with respect to hyper-parameters searched
  • 26. Academic researchers need equitable access to computation resources Most of the recent DL advances were developed outside academia, since industry has access to large-scale compute. To make such access possible for academia too, it would be more cost-effective to pool resources to build shared compute centers at the level of funding agencies, such as the U.S. National Science Foundation, instead of using cloud services such as AWS.
  • 27. Researchers should prioritize computationally efficient hardware and algorithms A concerted effort by industry and academia is desirable to promote research into more computationally efficient algorithms, as well as hardware that requires less energy. It is also desirable to provide easy-to-use APIs implementing more efficient alternatives to brute-force grid search for hyper-parameter tuning, e.g. random or Bayesian hyper-parameter search techniques.
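As a rough illustration of the alternative mentioned on slide 27, the sketch below runs a grid search and a random search with the same budget of 16 evaluations. The function `validation_loss` is a hypothetical stand-in for a full (and expensive) training run, and the search ranges are arbitrary assumptions; it is not taken from the paper.

```python
import itertools
import random


def validation_loss(learning_rate, dropout):
    """Hypothetical stand-in for an expensive training run (toy objective)."""
    return (learning_rate - 0.01) ** 2 * 1e4 + (dropout - 0.3) ** 2


def grid_search(learning_rates, dropouts):
    """Brute force: evaluate every combination on a fixed grid."""
    candidates = itertools.product(learning_rates, dropouts)
    return min(candidates, key=lambda c: validation_loss(*c))


def random_search(n_trials, rng):
    """Sample each hyper-parameter independently instead of using a grid."""
    candidates = [
        (10 ** rng.uniform(-4, -1), rng.uniform(0.0, 0.6))  # log-uniform rate
        for _ in range(n_trials)
    ]
    return min(candidates, key=lambda c: validation_loss(*c))


# Both strategies get the same budget: 16 "training runs".
grid_best = grid_search([1e-4, 1e-3, 1e-2, 1e-1], [0.0, 0.2, 0.4, 0.6])
random_best = random_search(16, random.Random(42))
print("grid:  ", grid_best)
print("random:", random_best)
```

With a fixed budget, the grid tries only four distinct values per hyper-parameter, whereas random sampling tries sixteen distinct values of each; that is the usual argument for preferring it when only a few hyper-parameters really matter. Which one wins on a given problem still has to be measured.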
  • 28. A new hope The debate on the energy consumption of DL is hot and interesting: however, it should be stressed that those same computationally expensive models may be used to help tackle climate and environmental issues. For example, a collective effort (which includes Yoshua Bengio) aims at proving that machine learning can be a powerful tool in reducing greenhouse gas emissions and helping society adapt to a changing climate https://arxiv.org/abs/1906.05433 Stay tuned for more information on the next IAML MeetUp!!! Source: http://theconversation.com/star-wars-planet-with-two-suns-a-step-towards-luke-skywalkers-tatooine-3379
  • 29. Thanks for your attention!!! Q&A Paolo Caressa https://www.linkedin.com/in/paolocaressa/ https://twitter.com/www_caressa_it http://www.caressa.it