SystemML Architecture
Niketan Pansare, Berthold Reinwald
July 25th, 2016
Agenda
• High-level	Design	&	APIs
• Architecture	Overview
• Tooling
• Important	links
2
From	http://systemml.apache.org/
Agenda
• High-level	Design	&	APIs	
• Architecture	Overview
• Language
• Compiler
• Runtime
• Two	examples:
• Simple	DML	expression	with	an	example	dataset
• Linear	Regression	with	varying	datasizes
• Tooling
• Important	links
3
Agenda
• High-level	Design	&	APIs	
• Architecture	Overview
• Language
• Compiler
• Runtime
• Two	examples:
• Simple	DML	expression	with	an	example	dataset
• Linear	Regression	with	varying	datasizes
• Tooling
• Important	links
4
SystemML Design
5
DML (Declarative Machine
Learning Language)
Hadoop or Spark Cluster
(scale-out)
since 2010
In-Memory Single Node
(scale-up)
since 2012 since 2015
DML Scripts
Data
CP + b sb _mVar1
SPARK mapmm X _mvar1 _mVar2 RIGHT false NONE
CP * y _mVar2 _mVar3
Hybrid execution plans*
SystemML
1. On disk/HDFS
2. RDD/DataFrame
3. double[][]
SystemML Design
6
Hadoop or Spark Cluster
(scale-out)
since 2010
In-Memory Single Node
(scale-up)
since 2012
DML Scripts
Data
SystemML
1.	On	disk/HDFS
2.	RDD/DataFrame
3.	double	[]	[]
Command line API*
(also MLContext*)
-exec	hadoop
SystemML Design
7
Hadoop or Spark Cluster
(scale-out)
In-Memory Single Node
(scale-up)
since 2012
DML Scripts
Data
SystemML
1.	On	disk/HDFS
2.	RDD/DataFrame
3.	double	[]	[]
Two options:
1. -exec singlenode
2. Use standalone jar (preserves rewrites, but may spawn local MR jobs)
Command line API*
(also MLContext*)
SystemML Design
8
Spark Cluster
(scale-out)
In-Memory Single Node
(scale-up)
since 2012 since 2015
DML Scripts
Data
SystemML
1.	On	disk/HDFS
2.	RDD/DataFrame
3.	double	[]	[]
Command line API*
(also MLContext*)
SystemML Design
9
Spark Cluster
(scale-out)
In-Memory Single Node
(scale-up)
since 2012 since 2015
DML Scripts
Data
SystemML
1.	On	disk/HDFS
2.	RDD/DataFrame
3.	double	[]	[]
MLContext API
- Java/Python/Scala
https://apache.github.io/incubator-systemml/spark-mlcontext-programming-guide.html
SystemML Design
10
In-Memory Single Node
(scale-up)
since 2012
DML Scripts
Data
SystemML
1.	On	disk/HDFS
2.	RDD/DataFrame
3.	double	[]	[]
JMLC API
https://apache.github.io/incubator-systemml/jmlc.html
Agenda
• High-level	Design	&	APIs	
• Architecture	Overview
• Language
• Compiler
• Runtime
• Two	examples:
• Simple	DML	expression	with	an	example	dataset
• Linear	Regression	with	varying	datasizes
• Tooling
• Important	links
11
From DML to Execution Plan
12
Hadoop or Spark Cluster
(scale-out)
In-Memory Single Node
(scale-up)
DML Scripts DML (Declarative Machine
Learning Language)
since 2010 / since 2012 / since 2015
Data
CP + b sb _mVar1
SPARK mapmm X _mvar1 _mVar2 RIGHT false NONE
CP * y _mVar2 _mVar3
Hybrid execution plans*
SystemML
From DML to Execution Plan
13
Hadoop or Spark Cluster
(scale-out)
In-Memory Single Node
(scale-up)
Runtime
Compiler
Language
DML Scripts DML (Declarative Machine
Learning Language)
since 2010 / since 2012 / since 2015
Data
CP	+	b	sb	_mVar1
SPARK	mapmm	X	_mvar1	_mVar2	
RIGHT	false	NONE
CP	*	y	_mVar2	_mVar3
Hybrid	execution	
plans*
Assuming an example dataset:
X: 100M x 500, y: 100M x 1, b/sb: 500 x 1
SystemML Compilation Chain
14
SystemML Compilation Chain
15
• Parsing
• Parse input DML/PyDML using Antlr v4 (see Dml.g4 and Pydml.g4)
• Perform syntactic validation
• Construct DMLProgram (=> list of Statement and function blocks)
• Live Variable Analysis
• Classic dataflow analysis
• A variable is “live” if it holds a value that may be needed in the future
• Dead code elimination
• Semantic Validation
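The live variable analysis and dead code elimination described above can be sketched as a single backward pass over a straight-line program. This is an illustrative toy (not SystemML's actual implementation): each statement is a target variable plus the set of variables it reads, and the last statement's target is treated as the program output.

```python
# Toy backward live-variable analysis with dead code elimination.
# A statement is kept only if its target may still be needed later.

def live_variable_dce(stmts):
    """stmts: list of (target, set_of_read_vars). The last statement's
    target is treated as the program output and is always live."""
    live = {stmts[-1][0]}              # output variable is live at the end
    keep = [False] * len(stmts)
    for i in range(len(stmts) - 1, -1, -1):
        target, reads = stmts[i]
        if target in live:             # value may be needed later -> keep
            keep[i] = True
            live.discard(target)       # this definition kills the variable
            live |= reads              # its inputs become live
    return [s for s, k in zip(stmts, keep) if k]

# Example: 'tmp' is never read afterwards, so its assignment is dead code.
prog = [("a", {"X"}), ("tmp", {"a"}), ("b", {"a", "y"}), ("out", {"b"})]
optimized = live_variable_dce(prog)
print(optimized)   # the 'tmp' assignment is gone
```

The real analysis additionally handles branches and loops by iterating to a fixpoint over basic blocks; the straight-line case above shows the core idea.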
SystemML Compilation Chain
16
• Dataflow in DAGs of operations on matrices, frames, and scalars
• Choosing from alternative execution plans based on memory and cost estimates
• Operator ordering & selection; hybrid plans
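The memory-based choice between local (CP) and distributed (Spark) execution can be illustrated as follows. The 70% budget fraction, the 20 GB driver heap, and the function names are assumptions for this sketch, not SystemML internals; only the shape of the decision (worst-case dense estimate vs. a memory budget) follows the slide above.

```python
# Illustrative sketch of memory-based execution-type selection.
# Budget fraction and heap size are assumptions, not SystemML internals.

DRIVER_HEAP_MB = 20 * 1024
MEM_BUDGET_MB = 0.7 * DRIVER_HEAP_MB     # fraction of heap usable by operations

def dense_mb(rows, cols):
    """Worst-case dense estimate: 8 bytes per double."""
    return rows * cols * 8 / (1024 ** 2)

def choose_exec_type(op_mem_mb):
    """Run in the driver (CP) if the operation fits in memory, else on Spark."""
    return "CP" if op_mem_mb < MEM_BUDGET_MB else "SPARK"

# t(X) %*% y with X: 100M x 500, y: 100M x 1 -- the inputs alone blow the budget
xty_mb = dense_mb(100_000_000, 500) + dense_mb(100_000_000, 1) + dense_mb(500, 1)
print(choose_exec_type(xty_mb))               # SPARK
# the 500 x 500 result of t(X) %*% X easily fits in the driver afterwards
print(choose_exec_type(dense_mb(500, 500)))   # CP
```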
SystemML Compilation Chain
17
*	Discussed	later	in	Tooling
spark-submit --master yarn-client --driver-memory 20G --num-executors 4 --executor-memory 40G --executor-cores 24 SystemML.jar -f test.dml -explain hops
SystemML Compilation Chain
18
• Low-level physical execution plan (LOPDags)
• Over key-value pairs for MR
• Over RDDs for Spark
• “Piggybacking” operations into a minimal number of Map-Reduce jobs
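The piggybacking idea can be sketched as greedy packing of low-level operators (lops) into as few MR jobs as possible. The real algorithm also respects data dependencies and job types; the toy rule below (a job is a run of lops ending in at most one reduce-phase lop) is an assumption made only to show the packing effect.

```python
# Hypothetical sketch of "piggybacking": greedily pack lops into MR jobs.
# Simplified rule (an assumption): a job ends when it has taken a reduce lop.

def piggyback(lops):
    """lops: list of (name, phase) with phase in {'map', 'reduce'}."""
    jobs, current, closed = [], [], False
    for name, phase in lops:
        if closed:                     # previous job already has its reduce
            jobs.append(current)
            current = []
        current.append(name)
        closed = (phase == "reduce")
    if current:
        jobs.append(current)
    return jobs

lops = [("reblock", "map"), ("mmult", "reduce"),
        ("scale", "map"), ("agg", "reduce")]
print(piggyback(lops))   # 2 jobs instead of 4
```

Four operations share two MR jobs here; without piggybacking, each lop would pay the startup cost of its own job.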
SystemML Compilation Chain
19
Spark
CP + b sb _mVar1
SPARK mapmm X.MATRIX.DOUBLE _mvar1.MATRIX.DOUBLE
_mVar2.MATRIX.DOUBLE RIGHT false NONE
CP * y _mVar2 _mVar3
SystemML Runtime
• Hybrid Runtime
• CP: single-machine operations & job orchestration
• MR: generic Map-Reduce jobs & operations
• SP: Spark jobs
• Numerically stable operators
• Dense / sparse matrix representation
• Multi-level buffer pool (caching) to evict in-memory objects
• Dynamic recompilation for initial unknowns
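The buffer-pool eviction mentioned above can be sketched as a budgeted LRU cache that spills in-memory matrices once the budget is exceeded. Class names, the budget, and the pure-LRU policy are illustrative assumptions, not SystemML's actual classes.

```python
# Hypothetical sketch of buffer-pool eviction: budgeted LRU over matrix objects.
from collections import OrderedDict

class BufferPool:
    def __init__(self, budget_bytes):
        self.budget = budget_bytes
        self.mem = OrderedDict()    # name -> size; insertion order = LRU order
        self.evicted = []           # names "written" out to local FS / HDFS

    def pin(self, name, size):
        if name in self.mem:
            self.mem.move_to_end(name)                 # touched -> most recent
            return
        while self.mem and sum(self.mem.values()) + size > self.budget:
            victim, _ = self.mem.popitem(last=False)   # evict least recent
            self.evicted.append(victim)
        self.mem[name] = size

pool = BufferPool(budget_bytes=120)
pool.pin("X", 60); pool.pin("y", 30)
pool.pin("X", 60)              # touch X so y becomes least recently used
pool.pin("A", 50)              # needs space -> evicts y, keeps X
print(pool.evicted)            # ['y']
```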
Runtime components (diagram): Control Program containing the Runtime Program, Buffer Pool, ParFor Optimizer/Runtime, and Recompiler; CP, Spark, and MR instructions; Mem/FS IO and DFS IO; Generic MR Jobs; MatrixBlock Library (single/multi-threaded)
From DML to Execution Plan
21
Hadoop or Spark Cluster
(scale-out)
In-Memory Single Node
(scale-up)
Runtime
Compiler
Language
DML Scripts DML (Declarative Machine
Learning Language)
since 2010 / since 2012 / since 2015
Data
CP + b sb _mVar1
SPARK mapmm X _mvar1 _mVar2 RIGHT false NONE
CP * y _mVar2 _mVar3
Hybrid execution plans*
Varying	data	sizes
LinearRegression.dml
A Data Scientist – Linear Regression
22
Model: Xw ≈ y, where X holds the explanatory/independent variables, y the predicted/dependent variable, and w the model.
Optimization Problem:
w = argmin_w ||Xw - y||² + λ||w||²
(Figure: the CG loop: initialize, then iterate until convergence; each iteration computes accuracy measures, the step size, the update of w, and the next direction, starting from an initial direction.)
Conjugate Gradient Method:
• Start off with the (negative) gradient
• For each step:
1. Move to the optimal point along the chosen direction
2. Recompute the gradient
3. Project it onto the subspace conjugate* to all prior directions
4. Use this as the next direction
(* conjugate = orthogonal given A as the metric)
A = XᵀX + λI
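The four CG steps above can be written out directly. This is an illustrative pure-Python toy (not LinearRegression.dml) that solves Aw = b for a tiny symmetric positive-definite stand-in for (XᵀX + λI)w = Xᵀy.

```python
# Toy conjugate gradient for a small SPD system, mirroring the four steps above.

def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cg(A, b, iters=25, tol=1e-12):
    n = len(b)
    w = [0.0] * n
    r = b[:]                  # initial (negative) gradient: residual b - A*0
    p = r[:]                  # initial direction
    rr = dot(r, r)
    for _ in range(iters):
        if rr < tol:          # iterate until convergence
            break
        Ap = matvec(A, p)
        alpha = rr / dot(p, Ap)                          # 1. optimal step along p
        w = [wi + alpha * pi for wi, pi in zip(w, p)]    #    update w
        r = [ri - alpha * ai for ri, ai in zip(r, Ap)]   # 2. recompute gradient
        rr_new = dot(r, r)
        beta = rr_new / rr                               # 3. project onto conjugate subspace
        p = [ri + beta * pi for ri, pi in zip(r, p)]     # 4. next direction
        rr = rr_new
    return w

A = [[4.0, 1.0], [1.0, 3.0]]   # stands in for X^T X + lambda*I (SPD)
b = [1.0, 2.0]                 # stands in for X^T y
w = cg(A, b)
# CG on an SPD 2x2 system converges in at most 2 steps: w = [1/11, 7/11]
```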
SystemML	– Run	LinReg	CG	on	Spark
23
Varying data sizes (y: 100M × 1 throughout):
• X: 100M × 10,000 → 8 TB
• X: 100M × 1,000 → 800 GB
• X: 100M × 100 → 80 GB
• X: 100M × 10 → 8 GB
Cluster: 20 GB driver on 16 cores, 6 × 55 GB executors.
Execution plans chosen:
• 8 TB: hybrid plan with RDD out-of-core and different operators (two matrix-vector multiplies, Mv and tvM, with broadcast, mapToPair, and reduceByKey; RDD cache for X on the executors; driver cache)
• 800 GB: hybrid plan with RDD out-of-core and a fused operator (x.persist(); X.mapValues(tMMv).reduce(); spilling on the executors)
• 80 GB: hybrid plan with RDD caching and a fused operator (x.persist(); X.mapValues(tMMv).reduce())
• 8 GB: multithreaded single node (fused tMMp operator)
Agenda
• Architecture	Overview
• Language	&	APIs
• Compiler
• Runtime
• Two	examples:
• Simple	DML	expression	with	an	example	dataset
• Linear	Regression	with	varying	datasizes
• Tooling
• Important	links
24
SystemML’s	Compilation	Chain	/	Overview	Tools
25
EXPLAIN
hops
STATS
DEBUG
EXPLAIN
runtime
[Matthias	Boehm	et	al:
SystemML's	Optimizer:	Plan	
Generation	for	Large-Scale	
Machine	Learning	Programs.	IEEE	
Data	Eng.	Bull	2014]
HOP	(High-level	operator)
LOP	(Low-level	operator)	
EXPLAIN
*_recompile
Explain	(Understanding	Execution	Plans)
• Overview
• Shows	generated	execution	plan	(at	different	compilation	steps)	
• Introduced	05/2014	for	internal	usage
• Important	tool	for	understanding/debugging	 optimizer	choices!
• Usage
• hadoop jar SystemML.jar -f test.dml -explain [hops | runtime | hops_recompile | runtime_recompile]
• Hops
• Program	w/	hop	dags	after	optimization
• Runtime	(default)
• Program	w/	generated	runtime	instructions
• Hops_recompile:	
• See	hops	+	hop	dag	after	every	recompile
• Runtime_recompile:	
• See	runtime	+	generated	runtime	instructions	after	every	recompile
26
Explain:	Understanding	HOP	DAGs	(simple		DML)
27
Spark
• HOP	ID
• HOP	opcode
• HOP	input	data	dependencies	(via	HOP	IDs)
• HOP	output	matrix	characteristics	(rlen,	clen,	brlen,	bclen,	nnz)
• Hop memory estimates (all inputs, intermediates, output → operation mem)
• Hop	execution	type	(CP/SP/MR)
• Optional: indicators of reblock/checkpointing (caching) of hop outputs
-explain	hops
-explain hops_recompile
spark-submit --master yarn-client --driver-memory 20G --num-executors 4 --executor-memory 40G --executor-cores 24 SystemML.jar -f test.dml -explain hops
Broadcast	mem	
budget
Explain:	Understanding	HOP	DAGs	(entire	script)
• Example	DML	Script	(Simplified	LinregDS)
28
X = read($1);
y = read($2);
intercept = $3;
lambda = $4;
if( intercept == 1 ) {
ones = matrix(1, nrow(X), 1);
X = append(X, ones);
}
I = matrix(1, ncol(X), 1);
A = t(X) %*% X + diag(I*lambda);
b = t(X) %*% y;
beta = solve(A, b);
write(beta, $5);
Invocation:	
hadoop jar SystemML.jar -f
linregds.dml -args X y 0 0 beta
Scenario:
X: 100,000 x 1,000, sparsity 1.0
y: 100,000 x 1, sparsity 1.0
(800MB, 200+ GFLOP)
Explain:	Understanding	HOP	DAGs	(2)
• Explain	Hops
29
15/07/05 17:18:06 INFO api.DMLScript: EXPLAIN (HOPS):
# Memory Budget local/remote = 57344MB/1434MB/1434MB
# Degree of Parallelism (vcores) local/remote = 24/144/72
PROGRAM
--MAIN PROGRAM
----GENERIC (lines 1-4) [recompile=false]
------(10) PRead X [100000,1000,1000,1000,100000000] [0,0,763 -> 763MB], CP
------(11) TWrite X (10) [100000,1000,1000,1000,100000000] [763,0,0 -> 763MB], CP
------(21) PRead y [100000,1,1000,1000,100000] [0,0,1 -> 1MB], CP
------(22) TWrite y (21) [100000,1,1000,1000,100000] [1,0,0 -> 1MB], CP
------(24) TWrite intercept [0,0,-1,-1,-1] [0,0,0 -> 0MB], CP
------(26) TWrite lambda [0,0,-1,-1,-1] [0,0,0 -> 0MB], CP
----GENERIC (lines 11-16) [recompile=false]
------(42) TRead X [100000,1000,1000,1000,100000000] [0,0,763 -> 763MB], CP
------(52) r(t) (42) [1000,100000,1000,1000,100000000] [763,0,763 -> 1526MB]
------(53) ba(+*) (52,42) [1000,1000,1000,1000,-1] [1526,8,8 -> 1541MB], CP
------(43) TRead y [100000,1,1000,1000,100000] [0,0,1 -> 1MB], CP
------(59) ba(+*) (52,43) [1000,1,1000,1000,-1] [764,0,0 -> 764MB], CP
------(60) b(solve) (53,59) [1000,1,1000,1000,-1] [8,8,0 -> 15MB], CP
------(66) PWrite beta (60) [1000,1,-1,-1,-1] [0,0,0 -> 0MB], CP
Cluster
Characteristics
Program	Structure	
(incl	recompile)
Unrolled	
HOP	
DAG
Note: the if branch (lines 6-9) and the regularization term are removed by rewrites (intercept = 0, lambda = 0)
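The 763MB worst-case estimates for X in the output above follow directly from dense double-precision storage; a quick check (taking MB to mean 2^20 bytes, an assumption about the optimizer's units):

```python
# Check the 763MB worst-case estimate for X (100,000 x 1,000 dense doubles).
rows, cols = 100_000, 1_000
bytes_dense = rows * cols * 8        # 8 bytes per double
mb = bytes_dense / (1024 ** 2)       # assuming MB = 2^20 bytes
print(round(mb))                     # 763
```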
Explain:	Understanding	Runtime	Plans	(1)
• Explain	Runtime	(simplified	filenames,	removed	rmvar)
30 IBM	Research
15/07/05 17:18:53 INFO api.DMLScript: EXPLAIN (RUNTIME):
# Memory Budget local/remote = 57344MB/1434MB/1434MB
# Degree of Parallelism (vcores) local/remote = 24/144/72
PROGRAM ( size CP/MR = 25/0 )
--MAIN PROGRAM
----GENERIC (lines 1-4) [recompile=false]
------CP createvar pREADX X false binaryblock 100000 1000 1000 1000 100000000
------CP createvar pREADy y false binaryblock 100000 1 1000 1000 100000
------CP assignvar 0.SCALAR.INT.true intercept.SCALAR.INT
------CP assignvar 0.0.SCALAR.DOUBLE.true lambda.SCALAR.DOUBLE
------CP cpvar pREADX X
------CP cpvar pREADy y
----GENERIC (lines 11-16) [recompile=false]
------CP createvar _mVar2 .../_t0/temp1 true binaryblock 1000 1000 1000 1000 -1
------CP tsmm X.MATRIX.DOUBLE _mVar2.MATRIX.DOUBLE LEFT 24
------CP createvar _mVar3 .../_t0/temp2 true binaryblock 1 100000 1000 1000 100000
------CP r' y.MATRIX.DOUBLE _mVar3.MATRIX.DOUBLE
------CP createvar _mVar4 .../_t0/temp3 true binaryblock 1 1000 1000 1000 -1
------CP ba+* _mVar3.MATRIX.DOUBLE X.MATRIX.DOUBLE _mVar4.MATRIX.DOUBLE 24
------CP createvar _mVar5 .../_t0/temp4 true binaryblock 1000 1 1000 1000 -1
------CP r' _mVar4.MATRIX.DOUBLE _mVar5.MATRIX.DOUBLE
------CP createvar _mVar6 .../_t0/temp5 true binaryblock 1000 1 1000 1000 -1
------CP solve _mVar2.MATRIX.DOUBLE _mVar5.MATRIX.DOUBLE _mVar6.MATRIX.DOUBLE
------CP write _mVar6.MATRIX.DOUBLE .../beta.SCALAR.STRING.true textcell.SCALAR.STRING.true
Literally	a	string	
representation	of	
runtime	instructions
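Since the plan really is a string representation, individual instructions are easy to pick apart. The field labels below are my own, not SystemML's; the reading of the trailing 24 as a thread count is a guess based on the 24 local vcores reported in the header above.

```python
# Minimal parser for the CP instruction strings shown above;
# field labels are illustrative assumptions.
def parse_cp_inst(line):
    parts = line.split()
    assert parts[0] == "CP"
    return {"exec_type": parts[0], "opcode": parts[1], "operands": parts[2:]}

inst = parse_cp_inst("CP tsmm X.MATRIX.DOUBLE _mVar2.MATRIX.DOUBLE LEFT 24")
print(inst["opcode"])         # tsmm
print(inst["operands"][-1])   # 24
```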
Stats	(Profiling	Runtime	Statistics)
• Overview
• Profiles	and	shows	aggregated	runtime	statistics	of	potential	bottlenecks
• Introduced	01/2014	for	internal	usage,	extension	of	buffer	pool	stats	01/2013
• Important	tool	for	understanding	runtime	characteristics	and	profiling/tuning	
system	internals	by	developers
• Usage
• hadoop jar SystemML.jar -f test.dml -stats
31
SystemML Statistics
Total	exec	time
Buffer	pool	stats	
Dynamic	recompilation	stats
JVM	stats	(JIT,	GC)
Heavy	hitter	instructions
(incl.	buffer	pool	times)
optional:	parfor	stats	
(if	program	contains	parfors)
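The heavy-hitter report boils down to aggregating per-instruction wall-clock times and sorting; a sketch with made-up timings (the real stats also fold in buffer pool times, per the slide above):

```python
# Hypothetical sketch of heavy-hitter reporting: aggregate per-instruction
# times and list the top consumers. All timings below are made up.
from collections import Counter

timings = [("ba+*", 4.2), ("tsmm", 3.1), ("ba+*", 3.9), ("r'", 0.2), ("solve", 0.6)]

totals = Counter()
for opcode, secs in timings:
    totals[opcode] += secs

for opcode, secs in totals.most_common(3):    # ba+* tops the list with ~8.1s
    print(f"{opcode}\t{secs:.1f}s")
```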
Debug	(Script	Debugging)
• Overview
• Script-level	debugging	by	end-users	(and	developers)
• Introduced 09/2014 as the result of an intern project
• gdb-inspired	command-line	debugger	interface
• Usage
• hadoop jar SystemML.jar -f test.dml -debug
33
Agenda
• Architecture	Overview
• Language	&	APIs
• Compiler
• Runtime
• Two	examples:
• Simple	DML	expression	with	an	example	dataset
• Linear	Regression	with	varying	datasizes
• Tooling
• Important	links
34
Important Links
• Website:	http://systemml.apache.org/
35
Important Links
• Website:	http://systemml.apache.org/
• Interested	in	SystemML	?
• Go	to	https://github.com/apache/incubator-systemml and	“Star	it”
36
Important Links
• Website: http://systemml.apache.org/
• Interested in SystemML ?
• Go to https://github.com/apache/incubator-systemml and “Star it”
• Want to contribute to SystemML ?
• See http://apache.github.io/incubator-systemml/contributing-to-
systemml.html
• List of issues: https://issues.apache.org/jira/browse/SYSTEMML/
• Ask any of our PMC members for suggestions
• Want to try out SystemML ?
• Laptop: http://apache.github.io/incubator-systemml/quick-start-guide.html
(Does not require Hadoop/Spark installation)
• Spark Cluster: http://apache.github.io/incubator-systemml/spark-
mlcontext-programming-guide.html (Includes Jupyter/Zeppelin demo)
37
Thank	You