Performance and energy consumption with Python

Performance Monitoring and Energy Consumption
at Process Level in Multicore
Architectures
Performance and energy consumption with Python
Tomás L. López-Fragoso Rumeu - tlopezfr@ull.edu.es
Advisor: Vicente Blanco Pérez - vblanco@ull.edu.es
11 March 2017

Outline
● About me
● Introduction
● State of the art
● Objectives
● Perf
● Scripts
● PerfCall.py
● PerfCsvPlot.py
○ Pandas
○ Matplotlib
● Demo
○ BenchmarksGame
● Bibliography
● Questions / Discussion

About me
● Electronics Engineer, Universidad de La Laguna.
● Studio Manager and "Full Stack" developer at BoomBox.
● Previously:
○ Designed electronic circuits for an electric
motorcycle company.
○ Founded a web design and mobile app
studio.
○ Worked at the ULL, first at the SAII and later
exclusively for the Vicerrectorado de Posgrado.
● PhD candidate in the Industrial, Computer and
Environmental Engineering programme, with the thesis
"Performance Monitoring and Energy Consumption at
Process Level in Multicore Architectures".

tomas@gamescribes.com

Introduction (I)
● Energy consumption is one of the main concerns when designing
new high-performance computing systems.
● With the growth of high-performance computing and cloud computing,
systems need to be increasingly optimized for energy efficiency.
● Ideally, consumption data would be accessible at every level.
● In practice this is not the case: access to consumption data is very
limited.

Introduction (II)
● Only certain data are accessible, and in many cases they are limited or
mere approximations.
● They are sufficient for administering the system.
● Each vendor exposes these data in a different way.
● Current methods for measuring performance and energy consumption include:
○ Hardware: components that measure wattage or similar or
derived metrics.
○ Software: programs and computational models that estimate
wattage.

State of the art (I)
● Today's computing systems are complex heterogeneous platforms
capable of delivering high computing power → it is hard to take full
advantage of them.
● Tools → energy and performance optimization
● Drawbacks:
○ Difficulty turning the data they acquire into useful information.
○ Low reliability.
● Performance Monitoring Units (PMUs)
○ Count events at the microarchitecture level (clock cycles,
retired instructions, etc.).
● Model-Specific Registers (MSRs)
○ Count events.
○ Provide power consumption information at run time: Running
Average Power Limit (RAPL) on Intel.
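
As a rough illustration of how a RAPL energy counter turns into a power figure: on Linux, the powercap interface usually exposes the package counter in microjoules under /sys/class/powercap/intel-rapl:0. The sketch below assumes that path and uses hypothetical helper names (average_power_watts, sample_package_power); it is not part of the tools presented in this talk, and reading the counter may require elevated privileges.

```python
import time
from pathlib import Path

# Assumed location of the package-0 RAPL counter (Linux powercap sysfs).
RAPL_DIR = Path("/sys/class/powercap/intel-rapl:0")


def average_power_watts(start_uj, end_uj, seconds, max_range_uj):
    """Average power between two counter readings, allowing one wraparound."""
    delta_uj = end_uj - start_uj
    if delta_uj < 0:
        # The counter wrapped around its maximum range between the readings.
        delta_uj += max_range_uj
    return (delta_uj / 1_000_000) / seconds


def sample_package_power(interval_s=1.0):
    """Read the energy counter twice and return the average power in watts."""
    max_range = int((RAPL_DIR / "max_energy_range_uj").read_text())
    start = int((RAPL_DIR / "energy_uj").read_text())
    time.sleep(interval_s)
    end = int((RAPL_DIR / "energy_uj").read_text())
    return average_power_watts(start, end, interval_s, max_range)


try:
    print("%.2f W" % sample_package_power())
except OSError:
    # Missing sysfs files or insufficient permissions.
    print("RAPL counters are not readable on this machine")
```

The wraparound handling matters because the counter has a finite range and resets to zero once it is exceeded.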

State of the art (II) - Software measurement methods
The literature offers many options:
● Low-level interfaces:
○ perfctr and perf_events.
○ PAPI, OProfile, perf.
■ PAPI supports RAPL, NVML, Xeon Phi and IBM EMON as data sources
for energy consumption measurements.
● Interfaces focused specifically on power:
○ SchedMon: full control over the underlying hardware through a
command-line interface.
○ Energy Measurement Library (EML), Universidad de La Laguna:
a simple interface for acquiring energy consumption data
from the hardware through code instrumentation.
■ Supports RAPL, NVML, Xeon Phi and Schleifenbauer PDUs.

Objectives
● Study tools that enable monitoring of performance and energy
consumption at process level on modern architectures.
○ "SchedMon": full control of the underlying hardware from the command
line.
○ Obtains the hardware event counts of the target application.
○ Samples at run time (globally or per function call).
○ Allows the application to be evaluated without changes to its source code.
○ Supports multithreading.

PERF
● Also known as perf_events or perf tools; originally called Performance Counters for Linux (PCL).
● A performance analysis tool for Linux.
● Available in the Linux kernel since version 2.6.31.
● Accessed from the command line.
● Capable of statistical profiling of the whole system.
● Supports hardware performance counters, software counters,
tracepoints and dynamic probes. According to IBM, it is one of the
most widely used performance analysis tools.
● https://perf.wiki.kernel.org/index.php/Main_Page
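
To make the command-line usage concrete, here is a minimal sketch that assembles a `perf stat` invocation for a RAPL energy event. The helper name build_perf_command and its defaults are assumptions for illustration, and whether the power/energy-pkg/ event is available depends on the CPU and kernel.

```python
import shlex


def build_perf_command(target, events=("power/energy-pkg/",),
                       interval_ms=1000, separator=";"):
    """Assemble a system-wide `perf stat` invocation as an argument list."""
    return [
        "perf", "stat",
        "-a",                    # count on all CPUs (system-wide)
        "-e", ",".join(events),  # events to count
        "-I", str(interval_ms),  # print counts every interval_ms milliseconds
        "-x", separator,         # machine-readable output using this separator
        *target,                 # the command to run and measure
    ]


# Print a shell-quoted version of the command for inspection.
print(shlex.join(build_perf_command(["sleep", "5"])))
```

Building the command as a list rather than a single string avoids shell-quoting problems when it is later passed to a subprocess.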

Scripts
● Two programs are built on top of perf:
○ Perfcall (perfcall.py): runs "perf" and saves its output as CSV.
https://github.com/Anexo/perfcall
○ PerfCsvplot (perfcsvplot.py): plots, using Matplotlib and pandas, the
CSV output produced by "perfcall.py".
https://github.com/Anexo/perfcsvplot

PERFCALL.py
This script runs "perf" and saves its output as CSV.
The subprocess module makes it possible to spawn new processes, connect
to their input/output/error pipes, and obtain their return
codes.
The argparse module makes it easy to write command-line
interfaces.
The platform module gives access to the underlying platform's identifying data.
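
As an illustration of the platform module's role, the sketch below collects a few identifying fields that could be written as a CSV header. The function name platform_header and the choice of fields are assumptions; the actual header layout of perfcall.py is not shown on the slides.

```python
import platform
from datetime import datetime


def platform_header():
    """Return key/value rows describing when and where a measurement ran."""
    return [
        ["date", datetime.now().isoformat(timespec="seconds")],
        ["system", platform.system()],        # e.g. the operating system name
        ["release", platform.release()],      # kernel or OS release string
        ["machine", platform.machine()],      # hardware architecture
        ["processor", platform.processor()],  # may be empty on some systems
    ]


for key, value in platform_header():
    print("%s, %s" % (key, value))
```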

PERFCALL.py
We collect the command-line
arguments.
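
A minimal sketch of this step with argparse follows. The option names (-o/--output, -i/--interval) and defaults are hypothetical and are not taken from perfcall.py.

```python
import argparse


def parse_args(argv=None):
    """Parse the (hypothetical) options of a perf wrapper script."""
    parser = argparse.ArgumentParser(
        description="Run perf and save its output as CSV (illustrative sketch).")
    parser.add_argument("-o", "--output", default="output.csv",
                        help="path of the CSV file to write")
    parser.add_argument("-i", "--interval", type=int, default=1000,
                        help="sampling interval in milliseconds")
    parser.add_argument("command", nargs="+",
                        help="command to run and measure")
    return parser.parse_args(argv)


args = parse_args(["-i", "500", "sleep", "5"])
print(args.output, args.interval, args.command)
```

With this layout, a measured command that takes its own dash-prefixed options would need to be preceded by `--` on the command line.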

PERFCALL.py
We create a subprocess that runs
the command entered (perf).
We create the output file:
output.csv
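
The step above can be sketched with subprocess.run. The function name run_and_save is an assumption, and the demo uses a stand-in command that prints one made-up CSV line so that the example runs without perf installed.

```python
import os
import subprocess
import sys
import tempfile


def run_and_save(command, output_path="output.csv"):
    """Run `command`, write what it prints to `output_path`, return its exit code."""
    with open(output_path, "w") as outfile:
        # perf stat writes its counts to stderr by default, so both
        # streams are redirected into the same file.
        completed = subprocess.run(command, stdout=outfile,
                                   stderr=subprocess.STDOUT)
    return completed.returncode


# Stand-in for a perf invocation; the printed sample line is invented.
demo_command = [sys.executable, "-c",
                "print('1.000;CPU0;2.50;Joules;power/energy-pkg/')"]
demo_path = os.path.join(tempfile.gettempdir(), "output.csv")
print(run_and_save(demo_command, demo_path))
```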

PERFCSVPLOT
import csv
import json
import operator  # imported by the original script; not used in the code shown
import sys

import matplotlib.pyplot as plt
import pandas as pd

# Path to the CSV file produced by perfcall.py:
csv_path = sys.argv[1]

# Read the metadata from the 23-line header of the CSV file:
unit = 'Joules'
with open(csv_path, 'rt') as csvfile:
    query = csv.reader(csvfile, delimiter=',', skipinitialspace=True)
    for row in range(23):
        if row == 1:
            date = next(query)
        elif row == 2:
            uname = next(query)
        elif row == 6:
            title = next(query)
        elif row == 13:
            time_interval = next(query)
        elif row == 17:
            number_cores = next(query)
        else:
            next(query)

# Read the timestamp column (first column after the header):
data = pd.read_csv(csv_path, header=None, skipinitialspace=True, skiprows=23,
                   names=['time'], usecols=[0], delimiter=';')
# Unique timestamps, in order of appearance:
times = list(data['time'].unique())

# Read the per-sample columns; energy values use a decimal comma:
data = pd.read_csv(csv_path, header=None, skipinitialspace=True, skiprows=23,
                   names=['cpu', 'energy', 'unit', 'event'],
                   usecols=[1, 2, 3, 4], delimiter=';', decimal=',')

# Group the samples by CPU and collect the energy readings of each core,
# using whatever CPU labels appear in the file:
cpuList = data.groupby('cpu')
cpuEnergy = {name: list(group['energy']) for name, group in cpuList}

# Total energy (J) and average power (W) per core, plus one plot line per core:
total_joules_array = []
total_watt_array = []
total_time = times[-1] - times[0]
indent = 0
for k, v in cpuEnergy.items():
    total_joules = sum(v)
    total_joules_array.append(total_joules)
    total_watt = total_joules / total_time
    total_watt_array.append(total_watt)
    plt.plot(times, v, '-', linewidth=2)
    # Label each line with its core name, offsetting successive labels:
    plt.annotate(k, xy=(times[0], v[0]), xytext=(times[0], v[0] + indent))
    indent = indent + 0.3

PERFCSVPLOT
# Sampling interval from the header, divided by 1000
# (assumed to be milliseconds converted to seconds):
time_interval = float(time_interval[0]) / 1000

# Annotate the plot with the total energy and average power of each core:
indent = 0
bbox_args = dict(boxstyle="round", fc="0.2")
for k, v in cpuEnergy.items():
    total_joules = sum(v)
    print(total_joules)
    total_watt = total_joules / total_time
    print(total_watt)
    plt.text(times[-1], 0.5 + indent,
             '%s %.2f Watts - %.2f Joules' % (k, total_watt, total_joules),
             bbox=bbox_args)
    indent = indent + 0.5

# Configure axes (title is a list of CSV fields, so join it into one string):
plt.title(' '.join(title))
plt.xlabel('Time (s)')
plt.ylabel('Energy (Joules)')

# Show the plot:
plt.show()
print('')
print('Plot finished.')


# JSON export:
def json_dict(time, power, description):
    # total_joules and total_watt are globals holding the values of the
    # last core processed in the loop above; `power` is not used.
    js_dict = dict(zip(
        ["1title",
         "2description",
         "3date",
         "4cpu",
         "5Joules",
         "6Watt",
         "7time",
         "Cores"],
        ["Perfcall and Perfcsvplot JSON export",
         str(description),
         str(date),
         str(uname),
         str(total_joules),
         str(total_watt),
         time,
         cpuEnergy]
    ))
    return js_dict


j_dict = json_dict(times, cpuList, title)
with open('output.json', 'w') as outfile:
    outfile.write(json.dumps(j_dict, indent=4, sort_keys=True,
                             separators=(',', ': ')))
print('')
print("JSON export finished.")
print("Bye!")

PERFCSVPLOT
This script plots, using Matplotlib and pandas, the CSV output
produced by "perfcall.py".
We import them under an alias to make them easier to
handle.
The operator module exports a set of efficient functions
corresponding to Python's intrinsic
operators. For example, operator.add(x, y)
is equivalent to the expression x + y.
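
A small sketch of what the operator module offers, using invented per-core readings. The variable names are illustrative only.

```python
import operator
from functools import reduce

readings = [1.5, 2.0, 0.5]

# operator.add(x, y) behaves like x + y, so it can be passed to reduce().
total = reduce(operator.add, readings)

# operator.itemgetter(1) picks the second element of each (core, joules) pair,
# which makes it a convenient sort key.
per_core = {"CPU1": 2.5, "CPU2": 4.0}
ranked = sorted(per_core.items(), key=operator.itemgetter(1), reverse=True)

print(total, ranked)
```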

Pandas
pandas is a library written for Python for data analysis and
manipulation. In particular, it provides the data
structures and operations needed to manipulate
numerical tables and time series.
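
A minimal sketch of the kind of table manipulation pandas makes easy, using a few invented samples in the semicolon-separated, decimal-comma layout that perfcsvplot.py reads. The sample values and event name are made up for illustration.

```python
import io

import pandas as pd

# Invented samples: time; cpu; energy (decimal comma); unit; event
sample = io.StringIO(
    "1;CPU1;2,50;Joules;power/energy-cores/\n"
    "1;CPU2;1,25;Joules;power/energy-cores/\n"
    "2;CPU1;3,00;Joules;power/energy-cores/\n"
    "2;CPU2;1,75;Joules;power/energy-cores/\n"
)

data = pd.read_csv(sample, header=None, delimiter=";", decimal=",",
                   names=["time", "cpu", "energy", "unit", "event"])

# Total energy per core: group the rows by CPU label and sum the readings.
per_core = data.groupby("cpu")["energy"].sum()
print(per_core)
```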

Matplotlib
Matplotlib is a library for generating plots from data
held in lists or arrays in the Python programming
language and its numerical extension NumPy.
It provides an API, pylab, designed to resemble that of
MATLAB.
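
A minimal plotting sketch in the same spirit as perfcsvplot.py, with invented readings. It selects the off-screen Agg backend and saves to a temporary file (both assumptions made so that the example runs without a display), whereas the original script calls plt.show().

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # off-screen backend: render to a file, no window needed
import matplotlib.pyplot as plt

# Invented energy readings (Joules) per core at four sample times (seconds).
times = [1, 2, 3, 4]
energy = {"CPU1": [2.5, 3.0, 2.8, 3.1], "CPU2": [1.2, 1.7, 1.5, 1.6]}

fig, ax = plt.subplots()
for core, joules in energy.items():
    ax.plot(times, joules, "-", linewidth=2, label=core)
ax.set_xlabel("Time (s)")
ax.set_ylabel("Energy (Joules)")
ax.legend()

plot_path = os.path.join(tempfile.gettempdir(), "energy_per_core.png")
fig.savefig(plot_path)
print("Saved plot to", plot_path)
```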

benchmarksgame
http://benchmarksgame.alioth.debian.org/

fannkuch-redux
http://benchmarksgame.alioth.debian.org/u64q/fannkuchredux-description.html#fannkuchredux
The fannkuch benchmark is defined by programs in
"Performing Lisp Analysis of the FANNKUCH Benchmark".
Measured results:
● Python: 2275.31 J, 20.70 W
● C: 106.46 J, 21.76 W
https://es.slideshare.net/arcangelsombra/python-vs-c-presentation
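
Since perfcsvplot.py reports average power as total energy divided by elapsed time, the run time of each fannkuch-redux measurement can be recovered from the figures on the slide. The sketch below assumes that the reported watts are averages over the whole run and that the 2275.31 J figure belongs to Python and the 106.46 J figure to C, as the slide labels suggest.

```python
def runtime_seconds(energy_joules, average_power_watts):
    """Elapsed time implied by a total energy and an average power."""
    return energy_joules / average_power_watts


python_joules, python_watts = 2275.31, 20.70
c_joules, c_watts = 106.46, 21.76

print("Python: %.1f s" % runtime_seconds(python_joules, python_watts))
print("C: %.1f s" % runtime_seconds(c_joules, c_watts))
print("Energy ratio Python/C: %.1f" % (python_joules / c_joules))
```

The average power is nearly the same in both runs, so the difference in energy comes almost entirely from the difference in run time.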

Spectral Norm
http://benchmarksgame.alioth.debian.org/u64q/performance.php?test=spectralnorm
MathWorld: "Hundred-Dollar, Hundred-Digit Challenge Problems",
Challenge #3.
● Calculate the spectral norm of an infinite matrix A, with
entries $a_{11}=1$, $a_{12}=1/2$, $a_{21}=1/3$, $a_{13}=1/4$, $a_{22}=1/5$, $a_{31}=1/6$, etc.
Measured results (both runs are labelled Python on the slide):
● Python: 3169.27 J, 20.76 W
● Python: 2775.45 J, 20.88 W
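
To show what the benchmark computes, here is a plain-Python sketch of the usual power-iteration approach on an n×n truncation of the matrix. The closed form used for the entries is inferred from the listed values (a11 = 1, a12 = 1/2, a21 = 1/3, ...), and the function names are illustrative; this is not the code that was measured in the demo.

```python
import math


def a(i, j):
    """Entry a_ij of the matrix, with 1-based indices."""
    return 1.0 / ((i + j - 2) * (i + j - 1) // 2 + i)


def times_a(v):
    """Multiply the n x n truncation of A by the vector v."""
    n = len(v)
    return [sum(a(i, j) * v[j - 1] for j in range(1, n + 1))
            for i in range(1, n + 1)]


def times_at(v):
    """Multiply the transpose of the n x n truncation of A by v."""
    n = len(v)
    return [sum(a(j, i) * v[j - 1] for j in range(1, n + 1))
            for i in range(1, n + 1)]


def spectral_norm(n, iterations=10):
    """Estimate the spectral norm by power iteration on A^T A."""
    u = [1.0] * n
    v = u
    for _ in range(iterations):
        v = times_at(times_a(u))
        u = times_at(times_a(v))
    vbv = sum(x * y for x, y in zip(u, v))
    vv = sum(y * y for y in v)
    return math.sqrt(vbv / vv)


print("%.9f" % spectral_norm(100))
```

The nested loops over every matrix entry are what make this benchmark CPU-bound and therefore a convenient workload for energy measurements.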

Bibliography (I)
[1] Luis Taniça, Aleksandar Ilic, Pedro Tomás and Leonel Sousa. (August 2014). SchedMon: A Performance and Energy Monitoring Tool for
Modern Multi-cores, In 7th International Workshop on Multi/many-Core Computing Systems (MuCoCoS’2014), Porto, Portugal
[2] Browne, S. (2000). A Portable Programming Interface for Performance Evaluation on Modern Processors. International Journal of High
Performance Computing Applications, 14(3), pp.189-204.
[3] Ley de Moore. (2015, May 18). Wikipedia, La enciclopedia libre. Retrieved June 4, 2015, 19:14, from http://es.wikipedia.org/w/index.php?title=Ley_de_Moore&oldid=82521994.
[4] Intel (2013). Intel 64 and ia-32 architectures software developer’s manual. Volume 3: System Programming Guide.
[5] Pettersson, M. (2009). Perfctr: Linux performance monitoring counters driver.
[6] Weaver, V.M (2013). Linux perf event features and overhead. In: Proceedings of the International Workshop on Performance Analysis of
Workload Optimized Systems, FastPath 2013, p. 80.
[7] Cohen, W. (2004). Tuning programs with OProfile. Wide Open Magazine 1, 53–62.

Bibliography (II)
[8] Perf Wiki tutorial on perf (accessed: March 2015). https://perf.wiki.kernel.org/index.php
[9] http://sips.inesc-id.pt/tools/schedmon/
[10] Almeida, F. Arteaga J., Blanco V., Cabrera A. (2015) Energy Measurement Tools for Ultrascale Computing: A Survey.
[11] Chung-Hsing Hsu, & Poole, S. (2011). Power measurement for high performance computing: State of the art. 2011 International Green Computing Conference And Workshops.
doi:10.1109/igcc.2011.6008596
[12] Intel Corporation (2015) Intel 64 and IA-32 Architectures Software Developer's Manual. 253669-053US
[13] Intel 64 and IA-32 Architectures Software Developer's Manual (March 2012) Volume 3: System Programming Guide, 325384-042US, Section 34.1.
[14] Wikipedia, (2015) Model-specific register. Retrieved 30 June 2015, from https://en.wikipedia.org/wiki/Model-specific_register#References
[15] Advanced Micro Devices: AMD BIOS and Kernel Developer’s Guide (BKDG) for AMD Family 15th Models 00h-0Fh Processors. (2013)
[16] Hackenberg, D., Ilsche, T., Schuchart, J., Schone, R., Nagel, W., Simon, M., & Georgiou, Y. (2014). HDEEM: High Definition Energy Efficiency Monitoring. 2014 Energy Efficient Supercomputing
Workshop. doi:10.1109/e2sc.2014.13

Bibliography (III)
[17] Hemsoth, N. (2014) Are Supercomputing’s Elite Turning Backs on Accelerators?. URL http://www.hpcwire.com/2014/06/26/acceleratorshold/ Retrieved 28 June 2015
[18] McGraw, H., Ralph, J., Danalis, A., Dongarra, J., (2014) Power monitoring with PAPI for extreme scale architectures and dataflow-based programming models.
[19] Weaver, V., Johnson, M., Kasichayanula, K., Ralph, J., Luszczek, P., Terpstra, D., Moore, S. (2012) Measuring energy and power with papi. In International Workshop on Power-Aware Systems and
Architectures. Pittsburgh, PA.
[20] Weaver, V.M., Terpstra, D., McGraw, H., Johnson, M., Kasichayanula, K., Ralph, J., Nelson, J., Mucci, P., Moham, T., Moore, S. (2013) Papi 5: Measuring power, energy, and the cloud. In: Performance
Analysis of Systems and Software (ISPASS). International Symposium on, pp. 124-125. IEEE.
[21] Cabrera, A., Almeida, F. Arteaga J., Blanco V. (2014) Measuring energy consumption using EML (energy Measurement Library). Computer Science-Research and Development pp.1-9.
[22] Vince Weaver, The Unofficial Linux Perf Events Web-Page
[23] Linux perf event Features and Overhead. 2013 FastPath Workshop, Vince Weaver
[24] Jake Edge, Perfcounters added to the mainline, LWN July 1, 2009, "perfcounters being included into the mainline during the recently completed 2.6.31 merge window"
[25] Arnaldo Carvalho de Melo, The New Linux ’perf’ tools, presentation from Linux Kongress, September, 2010

Bibliography (IV)
[26] Roberto A. Vitillo (LBNL). PERFORMANCE TOOLS DEVELOPMENTS, 16 June 2011, presentation from "Future computing in particle physics" conference
[27] Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 3B: System Programming Guide, Part 2. Intel. June 2009. p. 19-2 vol. 3.
[28] NAS Parallel Benchmarks – Nasa Advanced Supercomputing Division http://www.nas.nasa.gov/publications/npb.html

Thank you very much
