Image: NASA
CLAW Code Manipulation for Performance Portability
PASC’16 - MS03 Code Generation Techniques for HPC Earth Science Applications
8th June 2016
Valentin Clement
valentin.clement@env.ethz.ch
Valentin	Clement 8th	of	June	2016	-	PASC’16 2
Summary
• Performance portability problem
• CLAW 1st approach: Low level transformation
• Column model abstraction
• CLAW 2nd approach: High level abstraction
• CLAW Compiler
• CLAW current status and next steps
Hybrid supercomputer (CPU/GPU)
COSMO to the Hybrid supercomputers
Initialization
Time loop (Δt):
• Boundary conditions → OpenACC port
• Physics → OpenACC port
• Dynamics → C++ / DSL rewrite
• Data assimilation → Mixed OpenACC / CPU
• Halo-update → Communication library (GCL)
• Diagnostics → OpenACC port
• Input / Output → Mixed OpenACC / CPU
Cleanup
Other diagram labels: Interface (twice), Copy to accelerator
COSMO radiation example
Iteration over 2 dimensions (block, column level)

GPU optimal:
!$acc parallel loop
DO j=1,nproma
  !$acc loop
  DO k=1,nz
    CALL fct()
    ! 1st loop body
    ! 2nd loop body
    ! 3rd loop body
  END DO
END DO
!$acc end parallel loop

CPU optimal:
DO k=1,nz
  CALL fct()
  DO j=1,nproma
    ! 1st loop body
  END DO
  DO j=1,nproma
    ! 2nd loop body
  END DO
  DO j=1,nproma
    ! 3rd loop body
  END DO
END DO
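The two loop orderings on this slide can be checked against each other with a small self-contained program. This is only a sketch under stated assumptions: the array names, the sizes and the three arithmetic statements standing in for the "loop bodies" are hypothetical, and the call to fct() is left out. Without OpenACC enabled, the directives are treated as plain comments.

```fortran
program loop_orders
  implicit none
  ! Hypothetical sizes, chosen only for this demonstration.
  integer, parameter :: nproma = 8, nz = 5
  real :: a_cpu(nproma, nz), a_gpu(nproma, nz)
  integer :: j, k

  a_cpu = 0.0
  a_gpu = 0.0

  ! CPU-style ordering: k outside, one short j loop per "loop body".
  do k = 1, nz
    do j = 1, nproma
      a_cpu(j, k) = real(j + k)          ! stand-in for the 1st loop body
    end do
    do j = 1, nproma
      a_cpu(j, k) = 2.0 * a_cpu(j, k)    ! stand-in for the 2nd loop body
    end do
    do j = 1, nproma
      a_cpu(j, k) = a_cpu(j, k) + 1.0    ! stand-in for the 3rd loop body
    end do
  end do

  ! GPU-style ordering: j outside, k inside, the three bodies fused.
  !$acc parallel loop
  do j = 1, nproma
    !$acc loop
    do k = 1, nz
      a_gpu(j, k) = real(j + k)
      a_gpu(j, k) = 2.0 * a_gpu(j, k)
      a_gpu(j, k) = a_gpu(j, k) + 1.0
    end do
  end do
  !$acc end parallel loop

  ! Each element sees the same operations in the same order in both versions.
  if (all(a_cpu == a_gpu)) then
    print *, 'orderings agree'
  else
    print *, 'orderings differ'
  end if
end program loop_orders
```

The interchange and fusion are valid here because each element (j, k) depends only on itself; real physics code with a dependency along k would need the k loop to stay sequential.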
Performance portability: COSMO Radiation
[Figure: bar chart of runtime in seconds (axis from 0 to 1.2) on CPU and GPU, comparing code restructured for CPU with code restructured for GPU; annotations: -30%, -35%]
Performance portability: past situation
#ifndef _OPENACC
! CPU code path
DO k=1,nz
  CALL fct()
  DO j=1,nproma
    ! 1st loop body
  END DO
  DO j=1,nproma
    ! 2nd loop body
  END DO
  DO j=1,nproma
    ! 3rd loop body
  END DO
END DO
#else
! GPU code path
!$acc parallel loop
DO j=1,nproma
  !$acc loop
  DO k=1,nz
    CALL fct()
    ! 1st loop body
    ! 2nd loop body
    ! 3rd loop body
  END DO
END DO
!$acc end parallel loop
#endif
• Multiple code paths
• Hard to maintain
• Error prone
• Domain scientists have to know each target architecture well
Performance portability: optimizations
• Loop fusion
• Loop reordering
• Loop extraction
• Loop hoisting
• Caching
• On the fly computation
• Array notation to do statement
First step: COSMO radiation/HAMMOZ
CPU optimal → GPU optimal
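As an illustration of the "array notation to do statement" item in the list above, the sketch below writes the same computation twice: once with whole-array statements and once as an explicit DO loop. All names and sizes are hypothetical and were picked only for this example; it is not taken from COSMO or HAMMOZ.

```fortran
program array_to_do
  implicit none
  integer, parameter :: n = 6      ! hypothetical size
  real :: a(n), b(n), c1(n), c2(n)
  integer :: i

  a = [(real(i), i = 1, n)]
  b = 0.5

  ! Array notation: two whole-array statements, i.e. two implicit loops.
  c1 = a + b
  c1 = c1 * 2.0

  ! The same computation as an explicit DO statement. With the loop written
  ! out, both statements share one loop and can carry a single directive.
  do i = 1, n
    c2(i) = a(i) + b(i)
    c2(i) = c2(i) * 2.0
  end do

  ! .true. when both forms produce identical values.
  print *, all(c1 == c2)
end program array_to_do
```

Once the loop is explicit, it becomes a candidate for the other transformations in the list, such as fusion with a neighbouring loop over the same range.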
CLAW: first steps → Low level transformations
• CLAW v0.2
• 8 low level transformations
• CPU optimal → GPU optimal

Original code with CLAW directives:
!$acc parallel loop
!$claw loop-interchange
DO k=1,nz
  !$claw loop-extract fusion
  CALL fct()
  !$claw loop-fusion
  !$acc loop
  DO j=1,nproma
    ! 1st loop body
  END DO
  !$claw loop-fusion
  !$acc loop
  DO j=1,nproma
    ! 2nd loop body
  END DO
  !$claw loop-fusion
  !$acc loop
  DO j=1,nproma
    ! 3rd loop body
  END DO
END DO
!$acc end parallel loop

After the CLAW compiler:
!$acc parallel loop
DO j=1,nproma
  !$acc loop
  DO k=1,nz
    CALL fct()
    ! 1st loop body
    ! 2nd loop body
    ! 3rd loop body
  END DO
END DO
!$acc end parallel loop
Hard to use - Too many directives…
CLAW higher-level abstraction for domain scientists
Separation of concerns
• Domain scientists focus on their problem (1 column, 1 box)
• Dependency only in k levels
• CLAW compiler produces code for each target and directive language
CLAW abstraction column/box model
• Column model code encapsulated in a subroutine
• Agnostic of any architecture
• Let CLAW manage the parallelization over the domain

Original code:
SUBROUTINE compute_col(q,t,nz)
  INTEGER, INTENT(IN) :: nz
  REAL, INTENT(INOUT) :: q(:),t(:)
  INTEGER             :: k
  REAL                :: c

  !$claw define dimension j(1:nproma) &
  !$claw parallelize
  c = 5.345
  DO k = 1, nz
    t(k) = c * k
    q(k) = t(k) * c
  END DO
END SUBROUTINE compute_col

GPU optimized:
clawfc -d=openacc -t=gpu -o mo_col_gpu.f90 mo_col.f90
CPU optimized:
clawfc -d=openmp -t=cpu -o mo_col_cpu.f90 mo_col.f90
MIC optimized:
clawfc -d=openmp -t=mic -o mo_col_mic.f90 mo_col.f90
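To make the idea of "parallelization over the domain" concrete, here is a hand-written sketch of what a GPU-oriented version of the column routine could look like once the horizontal dimension j(1:nproma) is added. It is not actual clawfc output: the module and routine names, the extra nproma argument, the array layout and the choice of directives are all assumptions made for illustration.

```fortran
module mo_col_sketch
  implicit none
contains
  ! Hypothetical promoted version of the column routine (not clawfc output).
  subroutine compute_col_all(q, t, nz, nproma)
    integer, intent(in) :: nz, nproma
    real, intent(inout) :: q(:, :), t(:, :)   ! promoted from (nz) to (nproma, nz)
    integer :: j, k
    real :: c

    c = 5.345
    ! Columns are independent, so the j loop is the parallel one.
    !$acc parallel loop
    do j = 1, nproma
      ! Dependencies are allowed along k, so the column loop stays sequential.
      !$acc loop seq
      do k = 1, nz
        t(j, k) = c * k
        q(j, k) = t(j, k) * c
      end do
    end do
    !$acc end parallel loop
  end subroutine compute_col_all
end module mo_col_sketch

program demo
  use mo_col_sketch
  implicit none
  integer, parameter :: nproma = 4, nz = 3   ! hypothetical sizes
  real :: q(nproma, nz), t(nproma, nz)

  q = 0.0
  t = 0.0
  call compute_col_all(q, t, nz, nproma)
  ! The body does not depend on j, so every column holds the same profile.
  print *, all(t(1, :) == t(nproma, :)), all(q(1, :) == q(nproma, :))
end program demo
```

The point of the abstraction is that the domain scientist only writes the single-column body; the promotion of q and t and the placement of the j loop are left to the tool and can differ per target.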
CLAW compiler
• Based on the OMNI Compiler
• Source-to-source translator
• Open source under the BSD license
CLAW compiler
OMNI Compiler
Set of programs/libraries to build source-to-source compilers for C and Fortran via an XcodeML intermediate representation
– Used to implement XcalableMP (abstract inter-node communication), XcalableACC (XMP + OpenACC), OpenMP (implementation for C and Fortran), OpenACC (C implementation only)
Development team
– Programming Environments Research Team from the RIKEN Advanced Institute for Computational Science, Kobe, Japan
– High Performance Computing System Lab, University of Tsukuba, Tsukuba
– http://www.omni-compiler.org
– https://github.com/omni-compiler
CLAW Current status and next steps
• Low-level transformations implemented
• COSMO Radiation using them
• Heading towards high-level abstractions and automatic transformations
• Working with examples from:
  • COSMO
  • ICON RRTMGP
  • ECHAM-HAMMOZ
Resources
https://github.com/C2SM-RCM
valentin.clement@env.ethz.ch
Involved persons: Jon Rood (C2SM), Sylvaine Ferrachat (ETH Zurich), Will Sawyer (CSCS), Oliver Fuhrer (MeteoSwiss), Xavier Lapillonne (MeteoSwiss)
