2. About the speaker: a life pie…
Study & Leisure: 48%
Math Research: 20%
IT Consultant: 3%
Quant: 8%
IT PM & PgM: 21%
3. What is energy?
Energy is force (=ma) times a displacement, or
Energy is mass times the square of velocity
Example: Kinetic Energy = ½ × mass × velocity²
Source: Wikipedia. By Ferdinand Schmutzer
(1870-1928) - Edited version of
Image:Einstein1921 by F Schmutzer 2.jpg,
Public domain,
https://commons.wikimedia.org/w/index.php?curid=5216482
Source: https://giphy.com/gifs/looneytunes-angry-mad-1wPC7WSiRq6pEpcBOo
Also Einstein’s famous
E = mc²
confirms it…
4. Power is energy per unit time
We use energy during a certain time interval, mainly to move around:
power is the amount of energy transferred in a unit of time, aka power =
force times velocity
For example, to climb a 700 m high mountain
in one hour, a man weighing 80 kg (≈ 800 N)
needs a power of
P = F × v = (m × a) × v
= 800 N × 700 m / 1 h
= 800 N × 700 m / 3600 s
≈ 156 W
Source: https://giphy.com/explore/mountain-climbing
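The arithmetic of the climbing example can be checked with a few lines of Python. This is only a sketch: the function and variable names are illustrative and the 800 N weight is the rounded figure used on the slide.

```python
def climbing_power(weight_newton: float, height_m: float, duration_s: float) -> float:
    """Average power in watts needed to lift a weight through a height in a given time."""
    # Power = force x velocity, with velocity = height / duration
    return weight_newton * height_m / duration_s


# Figures from the slide: 800 N, 700 m, one hour (3600 s)
power_w = climbing_power(weight_newton=800.0, height_m=700.0, duration_s=3600.0)
print(f"{power_w:.1f} W")  # prints "155.6 W"
```

Halving the duration doubles the required power, which is why climbing faster feels so much harder.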
5. Why do we get tired?
However no one could keep walking or
climbing forever: indeed we say that we
consume energy, which is not correct since,
as Antoine de Lavoisier (1743-1794) put it:
Energy is neither created nor destroyed but
just transformed.
Rather...
Source: Wikipedia. By Louis Jean Desire Delaistre, after Boilly - Rev.
Superinteressante, n. 23, Public domain,
https://commons.wikimedia.org/w/index.php?curid=5507967
7. Human industry exploits
dissipation!!!
Source: Wikipedia. By Nijs, Jac de / Anefo - [1] Dutch National
Archives, The Hague, Fotocollectie Algemeen Nederlands
Persbureau (ANeFo), 1945-1989, Nummer toegang 2.24.01.03
Bestanddeelnummer 913-7320, CC BY-SA 3.0 nl,
https://commons.wikimedia.org/w/index.php?curid=31527984
Source: Wikipedia. By KMJ - de.wikipedia,
original upload 26 Jun 2004 by
de:Benutzer:KMJ, CC BY-SA 3.0,
https://commons.wikimedia.org/w/index.php?curid=242907
8. The dark side of dissipation…
The problem with dissipation is not just the waste of resources…
Rather, dissipative effects and energy transformations produce
waste that may have a negative impact on the environment.
For instance the infamous CO2
Carbon dioxide!!!
Source: Wikipedia. By Nijs, Jac de / Anefo - [1] Dutch National Archives, The Hague,
Fotocollectie Algemeen Nederlands Persbureau (ANeFo), 1945-1989, Nummer toegang
2.24.01.03 Bestanddeelnummer 913-7320, CC BY-SA 3.0 nl,
https://commons.wikimedia.org/w/index.php?curid=31527984
9. No renewable sources
Following Lavoisier, energy sources are «stores» whose energy is
transformed: each time we «take» energy from such a source,
the source depletes, until it is exhausted. If it does not deplete,
we say that the source is renewable.
Thermodynamic principles imply that renewable sources do
not exist…
but some sources, at human timescale, may be approximated
as renewable (sun, wind, tides, geothermal, …)
https://ips-dc.org/crony-capitalism-cant-save-coal-country/
10. The moral is
We should avoid energy transformations that produce harmful
byproducts (such as CO2).
We should avoid relying on non-renewable resources.
We should avoid wasting energy in general: it is a limited
resource…
Of course… we don’t!
11. We produce emissions
Source: IEA, "CO2 emissions by energy source, World 1990-2017", IEA, Paris https://www.iea.org/data-and-statistics?country=WORLD&fuel=CO2%20emissions&indicator=CO2%20emissions%20by%20energy%20source
12. We do not pursue sustainability
Source: IEA, "Electricity generation by fuel and scenario, 2018-2040", IEA, Paris https://www.iea.org/data-and-statistics/charts/electricity-generation-by-fuel-and-scenario-2018-2040
13. We keep on consuming
Source: IEA, "Total final consumption (TFC) by sector, World 1990-2017 ", IEA, Paris https://www.iea.org/data-and-statistics?country=WORLD&fuel=Energy%20consumption&indicator=Total%20final%20consumption%20(TFC)%20by%20sector
14. Carbon footprint
“The carbon footprint is a measure of the exclusive total amount
of carbon dioxide emissions that is directly and indirectly caused
by an activity or is accumulated over the life stages of a product”
So IT activities and products do have a carbon footprint, too.
Source: Wiedmann, T. and Minx, J. (2008). A Definition of 'Carbon Footprint'. In: C. C. Pertsova, Ecological Economics Research
Trends: Chapter 1, pp. 1-11, Nova Science Publishers, Hauppauge NY,
USA. https://www.novapublishers.com/catalog/product_info.php?products_id=5999.
15. IT activities dissipate and emit
waste
Computers (and all that: servers, tablets, mobile phones, etc.) do
consume energy and do dissipate it. This consumption usually
entails CO2 (and other) emissions, while dissipation is mainly
due to Joule's law (resistive heating).
For example, electricity, which is needed to run electronic devices,
is obtained by transforming other kinds of energy, which may be
non-renewable, such as fossil fuels...
Moreover, the fans attached to motherboards to cool them down
in turn consume and dissipate electric energy…
Source: New York Times. https://www.nytimes.com/2012/09/23/technology/data-centers-waste-vast-amounts-of-energy-belying-industry-image.html
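To make the Joule law concrete, here is a minimal sketch of the heat dissipated by a resistive component, P = I² × R. The current, resistance and running time are hypothetical values chosen only for illustration; they do not describe any real device.

```python
def joule_power_w(current_a: float, resistance_ohm: float) -> float:
    """Joule's law: power dissipated as heat by a resistive conductor, P = I^2 * R."""
    return current_a ** 2 * resistance_ohm


def energy_kwh(power_w: float, hours: float) -> float:
    """Energy in kilowatt-hours for a constant power drawn over a number of hours."""
    return power_w * hours / 1000.0


# Hypothetical figures, for illustration only
heat_w = joule_power_w(current_a=2.0, resistance_ohm=5.0)  # 20 W turned into heat
daily_kwh = energy_kwh(heat_w, hours=24.0)                  # heat produced in one day
print(f"{heat_w:.0f} W dissipated, {daily_kwh:.2f} kWh per day")
```

Since the dissipated power grows with the square of the current, doubling the current quadruples the heat that the fans then have to remove.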
16.
17. IT accounts for 10% of electricity
consumption
Source: Guillaume Jacquart, “Digital Carbon Footprint — What can we do ?”. https://medium.com/@guillaumejacquart/digital-carbon-footprint-what-can-we-do-d676480a556d
18. Estimates on renewables %
Graph based on data found in Strubell, Ganesh, McCallum, Energy and Policy Considerations for Deep Learning in NLP, https://arxiv.org/abs/1906.02243
[Chart: share of energy sources (%, from 0 to 100) for China, Germany, United States, Amazon AWS, Google, Microsoft. Series: Renewables, Gas, Coal, Nuclear, Other]
19. Deep Learning everywhere
Artificial Intelligence is as old as computer science: for
example, Alan Turing (1912-1954) helped found both!
In the 80s one could program expert systems on 8-bit CPUs
with a storage of 64K or so: I can confirm it!
Today, the state-of-the-art AI paradigm, deep learning, which is
widespread and universally adopted, requires massively parallel
computations and is aimed at processing terabytes of data.
Source: https://www.ibm.com/blogs/watson/2018/03/deep-learning-service-ibm-makes-advanced-ai-accessible-users-everywhere/
20. Some features of DL algorithms
Deep Learning algorithms are “just” neural networks
They display many layers connected via non-linear
transformations, for a total of millions or even billions of
neurons
They work exceedingly well, but why their performance is so
astonishing is still poorly understood, at least from the
theoretical point of view.
21. More features of DL algorithms
Deep learning algorithms use different layers of the neural
network to perform different tasks and to concentrate on
different “concepts”: e.g. the shape of an object in an image, etc.
To work properly, deep learning algorithms need to be trained:
they have to be fed with huge amounts of data in an orgy of
iterated parallel computations
Deep learning algorithms depend on “hyper-parameters” which
have to be empirically fine-tuned by trial and error
22. Carbon footprint of DL Training
Recently, the carbon footprint of training some NLP models (DL
algorithms aimed at text classification and translation)
has been estimated and compared with other sources of emissions:
Activity                                    CO2 emission (tons)
Air travel, 1 passenger, NY->SF             0.9
Human life (average), 1 year                5
American life (average), 1 year             16.4
Car (average) including fuel, 1 lifetime    57.15
NLP Transformer training                    0.09
NLP BERT training                           0.65
NLP Neural Architecture Search training     284.02
Source: Strubell, Ganesh, McCallum, Energy and Policy Considerations for Deep Learning in NLP, https://arxiv.org/abs/1906.02243
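The figures in the table are easier to grasp as ratios. The following sketch only re-uses the numbers quoted above; the dictionary keys and the helper name are illustrative shorthand, not terminology from the paper.

```python
# CO2 emissions in tons, copied from the table above
emissions_t = {
    "flight_ny_sf": 0.9,
    "human_year": 5.0,
    "american_year": 16.4,
    "car_lifetime": 57.15,
    "transformer_training": 0.09,
    "bert_training": 0.65,
    "nas_training": 284.02,
}


def ratio(activity: str, reference: str) -> float:
    """How many times the reference activity's emissions the given activity produces."""
    return emissions_t[activity] / emissions_t[reference]


nas_vs_car = ratio("nas_training", "car_lifetime")
bert_vs_flight = ratio("bert_training", "flight_ny_sf")
print(f"NAS training is about {nas_vs_car:.1f} car lifetimes")
print(f"BERT training is about {bert_vs_flight:.2f} NY->SF flights")
```

Seen this way, only the neural architecture search run exceeds the everyday references, while a single BERT training stays below one transcontinental flight per passenger.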
23. Don’t panic
The analysis by Strubell, Ganesh and McCallum stresses that training
deep learning models is expensive in terms of energy (and therefore
also in terms of dissipation and waste).
On the other hand, inference is also very expensive: it is
estimated to account for 80%-90% of the total computational cost (e.g.
https://www.forbes.com/sites/moorinsights/2019/05/09/google-cloud-doubles-down-on-nvidia-gpus-for-inference/#2cc458267926)
However, the most energy-hungry model (Transformer with neural
architecture search, whatever it is) is an outlier in terms of the
computations needed: the average is an order of magnitude lower
24. Be aware, don’t beware!
Measuring and being aware of the energy impact of deep
learning matters because it lets us steer our use of it
toward a sustainable path
In the same paper by Strubell, Ganesh and McCallum, some policy
suggestions are provided: I simply quote them in the following
slides
25. Authors should report training time
and sensitivity to hyper-parameters
This will enable direct comparison across models, allowing
subsequent consumers of these models to accurately assess
whether the required computational resources are compatible
with their setting. Realizing this will require:
• a standard, hardware-independent measurement of training
time, such as gigaflops required to convergence
• a standard measurement of model sensitivity to data and
hyper-parameters, such as variance with respect to hyper-
parameters searched
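As a rough illustration of the second point, sensitivity to hyper-parameters could be reported as the variance of a validation score across the configurations that were tried. This is a sketch under stated assumptions: the scores below are made-up numbers and the function name is hypothetical, not a standard proposed in the paper.

```python
import statistics

# Hypothetical validation accuracies from runs differing only in hyper-parameters
scores_model_a = [0.81, 0.80, 0.82, 0.81]
scores_model_b = [0.90, 0.62, 0.85, 0.71]


def sensitivity(scores: list[float]) -> float:
    """Population variance of the scores: higher means more sensitive to tuning."""
    return statistics.pvariance(scores)


for name, scores in [("A", scores_model_a), ("B", scores_model_b)]:
    print(f"model {name}: best {max(scores):.2f}, variance {sensitivity(scores):.4f}")
```

Model B reaches the higher best score but is far more sensitive, so reproducing it is likely to cost many more training runs than model A.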
26. Academic researchers need equitable
access to computation resources
Most of the recent DL advances were developed outside
academia, since industry has access to large-scale compute
To make such access possible for academia as well, it would be
more cost-effective to pool resources to build shared compute
centers at the level of funding agencies, such as the U.S.
National Science Foundation, instead of using cloud services
such as AWS
27. Researchers should prioritize
computationally efficient
hardware and algorithms
A concerted effort by industry and academia is desirable to
promote research into more computationally efficient algorithms,
as well as hardware that requires less energy
Also, it is desirable to provide easy-to-use APIs implementing
more efficient alternatives to brute-force grid search for hyper-
parameter tuning, e.g. random or Bayesian hyper-parameter
search techniques
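The difference in cost between grid search and random search can be sketched as follows. The loss function is a hypothetical stand-in for an expensive training run, and the parameter ranges, budget and names are assumptions made for illustration only.

```python
import itertools
import random


def validation_loss(learning_rate: float, dropout: float) -> float:
    """Hypothetical stand-in for an expensive training run (lower is better)."""
    return (learning_rate - 0.02) ** 2 * 1e4 + (dropout - 0.25) ** 2


def grid_search(learning_rates, dropouts):
    """Evaluate every combination: the cost multiplies with each added parameter."""
    trials = [((lr, dr), validation_loss(lr, dr))
              for lr, dr in itertools.product(learning_rates, dropouts)]
    return min(trials, key=lambda t: t[1]), len(trials)


def random_search(budget: int, seed: int = 0):
    """Evaluate a fixed number of configurations sampled from the same ranges."""
    rng = random.Random(seed)
    trials = []
    for _ in range(budget):
        lr = 10 ** rng.uniform(-4, -1)  # log-uniform learning rate
        dr = rng.uniform(0.0, 0.5)      # uniform dropout rate
        trials.append(((lr, dr), validation_loss(lr, dr)))
    return min(trials, key=lambda t: t[1]), len(trials)


grid_best, grid_runs = grid_search([1e-4, 1e-3, 1e-2, 1e-1],
                                   [0.0, 0.1, 0.2, 0.3, 0.4, 0.5])
rand_best, rand_runs = random_search(budget=10)
print(f"grid:   {grid_runs} runs, best loss {grid_best[1]:.4f}")
print(f"random: {rand_runs} runs, best loss {rand_best[1]:.4f}")
```

The grid needs 4 × 6 = 24 training runs, whereas the random search budget is fixed in advance and does not grow when more hyper-parameters are added.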
28. A new hope
The debate on the energy consumption of DL is hot and interesting:
however, it should be stressed that those same computationally
demanding models may be used to help fight climate
and environmental issues.
For example, a collective effort (which includes Yoshua Bengio) aims
at proving that machine learning can be a powerful tool in
reducing greenhouse gas emissions and helping society adapt to a
changing climate https://arxiv.org/abs/1906.05433
Stay tuned for more information on the next IAML MeetUp!!!
Source: http://theconversation.com/star-wars-planet-with-two-suns-a-step-towards-luke-skywalkers-tatooine-3379
29. Thanks for your attention!!!
Q&A
Paolo Caressa
https://www.linkedin.com/in/paolocaressa/
https://twitter.com/www_caressa_it
http://www.caressa.it