Kazuaki Ishizaki (石崎 一明)
IBM Research – Tokyo (日本アイ・ビー・エム(株)東京基礎研究所)
@kiszk
Looking back at Spark 2.x and forward to 3.0
1
About Me – Kazuaki Ishizaki
▪ Researcher at IBM Research - Tokyo https://ibm.biz/ishizaki
– Compiler optimization
– Language runtime
– Parallel processing
▪ Working for IBM Java virtual machine (now OpenJ9) from over 20 years
– In particular, just-in-time compiler
▪ Apache Spark committer for the SQL package (since 2018/9)
– My first PR was merged in 2015/12
▪ ACM Distinguished Member
▪ SNS
– @kiszk
– Slideshare: https://www.slideshare.net/ishizaki
2 Looking back at Spark 2.x and forward to 3.0 - Kazuaki Ishizaki
Today’s Talk
▪ I will not talk about the distributed framework
– You are more familiar with it than I am
▪ I will not talk about SQL, machine learning, and other libraries
– I expect @maropu will talk about SQL in the next session
▪ I will talk about how a program is executed on an executor at a node
4 Looking back at Spark 2.x and forward to 3.0 - Kazuaki Ishizaki
Outline
▪ How is a DataFrame/Dataset program executed?
▪ What are problems in Spark 2.x?
▪ What’s new in Spark 3.0?
▪ Why was I appointed as a committer?
5 Looking back at Spark 2.x and forward to 3.0 - Kazuaki Ishizaki
An Apache Spark Program Written by a User
▪ This DataFrame program is written in Scala
6 Looking back at Spark 2.x and forward to 3.0 - Kazuaki Ishizaki
df: DataFrame[int] = (1 to 100).toDF
df.selectExpr("value + 1")
.selectExpr("value + 2")
.show
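For readers who want to try this outside a slide, a minimal runnable sketch follows; the SparkSession setup, the spark.implicits._ import, and the AS value alias (added so the second selectExpr can still resolve the column by name) are my additions, not part of the original slide.

import org.apache.spark.sql.SparkSession

object SelectExprExample {
  def main(args: Array[String]): Unit = {
    // A local session only for illustration; any existing SparkSession works as well
    val spark = SparkSession.builder().appName("selectExpr-example").master("local[*]").getOrCreate()
    import spark.implicits._

    // (1 to 100).toDF creates a one-column DataFrame whose column is named "value"
    val df = (1 to 100).toDF
    df.selectExpr("value + 1 AS value")  // keep the column name so the next selectExpr can resolve it
      .selectExpr("value + 2")
      .show(3)

    spark.stop()
  }
}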
Java Code Is Actually Executed
▪ A DataFrame/Dataset program is translated into a Java program that is
actually executed
– An optimizer combines two arithmetic operations into one
– Whole-stage codegen puts multiple operations (read, selectExpr, and
projection) into one loop
7
while (itr.hasNext()) { // execute a row
// get a value from a row in DF
int value =((Row)itr.next()).getInt(0);
// compute a new value
int mapValue = value + 3;
// store a new value to a row in DF
outRow.write(0, mapValue);
append(outRow);
}
df: DataFrame[int] = …
df.selectExpr("value + 1")
.selectExpr("value + 2")
.show
Code
generation
Looking back at Spark 2.x and forward to 3.0 - Kazuaki Ishizaki
Input: 1, 2, 3, 4, … held as Unsafe data (on heap)
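One way to see this generated loop yourself is the debugCodegen helper that also appears on a later slide; a short sketch, assuming a spark-shell session where spark and its implicits are available (the AS value alias is my adjustment so the chained selectExpr resolves):

import spark.implicits._
import org.apache.spark.sql.execution.debug._  // adds debugCodegen to Dataset/DataFrame

val df = (1 to 100).toDF
// Prints the whole-stage generated Java code for this plan to stdout
df.selectExpr("value + 1 AS value").selectExpr("value + 2").debugCodegen()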
How a Program is Translated to Java Code
8 Looking back at Spark 2.x and forward to 3.0 - Kazuaki Ishizaki
From Structuring Apache Spark 2.0: SQL, DataFrames, Datasets And Streaming - by Michael Armbrust
Who is More Familiar with Each Module
▪ Four Japanese committers are in this room
9 Looking back at Spark 2.x and forward to 3.0 - Kazuaki Ishizaki
Project Tungsten
Major Items in Spark 2.x to Me
▪ Improve performance
– by improving data representation
– by eliminating serialization/deserialization (ser/de)
– by improving generated code
▪ Stable code generation
– No more Java exceptions when a program has a large number of columns (>1000)
10 Looking back at Spark 2.x and forward to 3.0 - Kazuaki Ishizaki
Array Internal Representation
▪ Before Spark 2.1, an array (UnsafeArrayData) was internally represented
using a sparse/indirect structure
– Good for small memory consumption if an array is sparse
▪ After Spark 2.1, the array representation is dense/contiguous
– Good for performance
11 Looking back at Spark 2.x and forward to 3.0 - Kazuaki Ishizaki
– Sparse/indirect layout (before 2.1): len = 2 | offset[0] | offset[1] | 7 | 8   (a[0] and a[1] are reached via offsets)
– Dense/contiguous layout (since 2.1): len = 2 | non-null | non-null | 7 | 8   (a[0] and a[1] are stored in place)
SPARK-15962 improves this representation
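To make the dense layout concrete, the internal helper used later in this deck (UnsafeArrayData.fromPrimitiveArray) can be called directly from a spark-shell; this is an internal Catalyst API, and the package name below is from my memory of the Spark 2.x source tree, so treat the sketch as illustrative only:

import org.apache.spark.sql.catalyst.expressions.UnsafeArrayData

// Builds the dense/contiguous representation introduced by SPARK-15962
val arr = UnsafeArrayData.fromPrimitiveArray(Array(7, 8))
println(arr.numElements())   // 2
println(arr.getInt(0))       // 7
println(arr.getSizeInBytes)  // header + null bits + two packed 4-byte values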
This Was the First Tough PR for Me
▪ It took three months and 270 conversations
12 Looking back at Spark 2.x and forward to 3.0 - Kazuaki Ishizaki
A Simple Dataset Program with Array
▪ Read an integer array in a row
▪ Create a new array from the first element
13 Looking back at Spark 2.x and forward to 3.0 - Kazuaki Ishizaki
ds: DataSet[Array[Int]] = Seq(Array(7, 8)).toDS
ds.map(a => Array(a(0)))
Weird Generated Pseudo Code with DataSet
▪ Data conversion is too slow
– Between internal representation (Tungsten) and Java object format (Object[])
▪ Element-wise data copy is too slow
14 Looking back at Spark 2.x and forward to 3.0 - Kazuaki Ishizaki
ArrayData inArray;
while (itr.hasNext()) {
inArray = ((Row)itr.next()).getArray(0);
append(outRow);
}
ds: DataSet[Array[Int]] =
Seq(Array(7, 8)).toDS
ds.map(a => Array(a(0)))
[Diagram annotations: code generation wraps the user lambda, which becomes
int[] mapArray = new int[] { a[0] };
De (deserialization) performs data conversion plus an element-wise copy of each element with a null check,
and Ser (serialization) performs data conversion (a copy with Java object creation) plus another element-wise data copy.]
Generated Source Java Code
▪ Data conversion is done by boxing or unboxing
▪ Element-wise data copy is done by a for-loop
15 Looking back at Spark 2.x and forward to 3.0 - Kazuaki Ishizaki
ds: DataSet[Array[Int]] =
Seq(Array(7, 8)).toDS
ds.map(a => Array(a(0)))
Code generation (the generated code below contains the data conversion and the element-wise data copy)
ArrayData inArray;
while (itr.hasNext()) {
inArray = ((Row)itr.next()).getArray(0);
Object[] tmp = new Object[inArray.numElements()];
for (int i = 0; i < tmp.length; i ++) {
tmp[i] = (inArray.isNullAt(i)) ?
null : inArray.getInt(i);
}
ArrayData array =
new GenericIntArrayData(tmp);
int[] javaArray = array.toIntArray();
int[] mapArray = (int[])map_func.apply(javaArray);
outArray = new GenericArrayData(mapArray);
for (int i = 0; i < outArray.numElements(); i++) {
if (outArray.isNullAt(i)) {
arrayWriter.setNullInt(i);
} else {
arrayWriter.write(i, outArray.getInt(i));
}
}
append(outRow);
}
(In the code above, everything before map_func.apply is the deserialization (De); everything after it is the serialization (Ser).)
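The Ser/De boundary discussed here is also visible in the query plan; a small sketch, assuming a spark-shell session (the node names DeserializeToObject, MapElements, and SerializeFromObject are what Spark 2.x prints, as far as I recall):

import spark.implicits._

val ds = Seq(Array(7, 8)).toDS
// explain(true) prints the analyzed and optimized plans; the map sits between a
// DeserializeToObject node (Tungsten -> Java objects) and a SerializeFromObject
// node (Java objects -> Tungsten), which is where the Ser/De cost comes from
ds.map(a => Array(a(0))).explain(true)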
Too Long Actually-Generated Java Code (Spark 2.0)
▪ Too long to read
16 Looking back at Spark 2.x and forward to 3.0 - Kazuaki Ishizaki
(Highlighted regions in the original slide mark the data conversion and the element-wise data copy.)
final int[] mapelements_value = mapelements_isNull ?
null : (int[]) mapelements_value1.apply(deserializetoobject_value);
mapelements_isNull = mapelements_value == null;
final boolean serializefromobject_isNull = mapelements_isNull;
final ArrayData serializefromobject_value = serializefromobject_isNull ?
null : new GenericArrayData(mapelements_value);
serializefromobject_holder.reset();
serializefromobject_rowWriter.zeroOutNullBytes();
if (serializefromobject_isNull) {
serializefromobject_rowWriter.setNullAt(0);
} else {
final int serializefromobject_tmpCursor = serializefromobject_holder.cursor;
if (serializefromobject_value instanceof UnsafeArrayData) {
final int serializefromobject_sizeInBytes = ((UnsafeArrayData) serializefromobject_value).getSizeInBytes();
serializefromobject_holder.grow(serializefromobject_sizeInBytes);
((UnsafeArrayData) serializefromobject_value).writeToMemory(
serializefromobject_holder.buffer, serializefromobject_holder.cursor);
serializefromobject_holder.cursor += serializefromobject_sizeInBytes;
} else {
final int serializefromobject_numElements = serializefromobject_value.numElements();
serializefromobject_arrayWriter.initialize(serializefromobject_holder, serializefromobject_numElements, 8);
for (int serializefromobject_index = 0; serializefromobject_index < serializefromobject_numElements;
serializefromobject_index++) {
if (serializefromobject_value.isNullAt(serializefromobject_index)) {
serializefromobject_arrayWriter.setNullAt(serializefromobject_index);
} else {
final int serializefromobject_element = serializefromobject_value.getInt(serializefromobject_index);
serializefromobject_arrayWriter.write(serializefromobject_index, serializefromobject_element);
}
}
}
serializefromobject_rowWriter.setOffsetAndSize(0, serializefromobject_tmpCursor,
serializefromobject_holder.cursor - serializefromobject_tmpCursor);
serializefromobject_rowWriter.alignToWords(serializefromobject_holder.cursor - serializefromobject_tmpCursor);
}
serializefromobject_result.setTotalSize(serializefromobject_holder.totalSize());
append(serializefromobject_result);
if (shouldStop()) return;
}
}
protected void processNext() throws java.io.IOException {
while (inputadapter_input.hasNext()) {
InternalRow inputadapter_row = (InternalRow) inputadapter_input.next();
boolean inputadapter_isNull = inputadapter_row.isNullAt(0);
ArrayData inputadapter_value = inputadapter_isNull ?
null : (inputadapter_row.getArray(0));
boolean deserializetoobject_isNull1 = inputadapter_isNull;
ArrayData deserializetoobject_value1 = null;
if (!inputadapter_isNull) {
final int deserializetoobject_n = inputadapter_value.numElements();
final Object[] deserializetoobject_values = new Object[deserializetoobject_n];
for (int deserializetoobject_j = 0;
deserializetoobject_j < deserializetoobject_n; deserializetoobject_j ++) {
if (inputadapter_value.isNullAt(deserializetoobject_j)) {
deserializetoobject_values[deserializetoobject_j] = null;
} else {
boolean deserializetoobject_feNull = false;
int deserializetoobject_fePrim =
inputadapter_value.getInt(deserializetoobject_j);
boolean deserializetoobject_teNull = deserializetoobject_feNull;
int deserializetoobject_tePrim = -1;
if (!deserializetoobject_feNull) {
deserializetoobject_tePrim = deserializetoobject_fePrim;
}
if (deserializetoobject_teNull) {
deserializetoobject_values[deserializetoobject_j] = null;
} else {
deserializetoobject_values[deserializetoobject_j] = deserializetoobject_tePrim;
}
}
}
deserializetoobject_value1 = new GenericArrayData(deserializetoobject_values);
}
boolean deserializetoobject_isNull = deserializetoobject_isNull1;
final int[] deserializetoobject_value = deserializetoobject_isNull ?
null : (int[]) deserializetoobject_value1.toIntArray();
deserializetoobject_isNull = deserializetoobject_value == null;
Object mapelements_obj = ((Expression) references[0]).eval(null);
scala.Function1 mapelements_value1 = (scala.Function1) mapelements_obj;
boolean mapelements_isNull = false || deserializetoobject_isNull;
ds.map(a => Array(a(0))).debugCodegen
Simple Generated Code for Array on Spark 2.2
▪ Data conversion and element-wise copy are not used
▪ Bulk copy is faster than element-wise data copy
17
ds: DataSet[Array[Int]] =
Seq(Array(7, 8)).toDS
ds.map(a => Array(a(0)))
Bulk data copy: copy the whole array using memcpy()
while (itr.hasNext()) {
inArray = ((Row)itr.next()).getArray(0);
int[] mapArray = (int[]) map.apply(javaArray);
append(outRow);
}
[Diagram annotations: the elided steps are bulk data copies, and the user lambda becomes int[] mapArray = new int[] { a[0] };]
SPARK-15985 and SPARK-17490 simplify Ser/De by using bulk data copy
Code
generation
Looking back at Spark 2.x and forward to 3.0 - Kazuaki Ishizaki
Simple Generated Java Code for Array
▪ Data conversion and element-wise copy are not used
▪ Bulk copy is faster than element-wise data copy
18
ds: DataSet[Array[Int]] =
Seq(Array(7, 8)).toDS
ds.map(a => Array(a(0)))
Bulk data copy: copy the whole array using memcpy()
SPARK-15985 and SPARK-17490 simplify Ser/De by using bulk data copy
while (itr.hasNext()) {
inArray =((Row)itr.next()).getArray(0);
int[] javaArray = inArray.toIntArray();
int[] mapArray = (int[])mapFunc.apply(javaArray);
outArray = UnsafeArrayData
.fromPrimitiveArray(mapArray);
outArray.writeToMemory(outRow);
append(outRow);
}
Code
generation
Looking back at Spark 2.x and forward to 3.0 - Kazuaki Ishizaki
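Spelled out as hand-written Scala, one iteration of the generated loop above performs the following round trip; the helper name mapOneRow is mine, and UnsafeArrayData/ArrayData are internal Catalyst APIs whose package names I give from memory:

import org.apache.spark.sql.catalyst.expressions.UnsafeArrayData
import org.apache.spark.sql.catalyst.util.ArrayData

// What one iteration of the generated loop does for ds.map(a => Array(a(0)))
def mapOneRow(inArray: ArrayData): UnsafeArrayData = {
  val javaArray = inArray.toIntArray()              // bulk copy: Tungsten -> int[]
  val mapArray  = Array(javaArray(0))               // the user lambda: a => Array(a(0))
  UnsafeArrayData.fromPrimitiveArray(mapArray)      // bulk copy: int[] -> Tungsten
}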
Dataset for Array Is Not Extremely Slow
▪ Good news: 4.5x faster than Spark 2.0
▪ Bad news: still 12x slower than DataFrame
19
[Chart: relative execution time over DataFrame, shorter is better]
– Dataset: ds = Seq(Array(…), Array(…), …).toDS.cache; ds.map(a => Array(a(0)))
4.5x faster than on Spark 2.0, but still 12x slower than the DataFrame version below
– DataFrame: df = Seq(Array(…), Array(…), …).toDF("a").cache; df.selectExpr("Array(a[0])")
Looking back at Spark 2.x and forward to 3.0 - Kazuaki Ishizaki
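The numbers above are the author's measurements; a rough sketch for reproducing the shape of the comparison is below. The data size, the caching strategy, and the use of spark.time are my choices, so absolute numbers will differ:

import spark.implicits._

val n = 100000
val ds = Seq.fill(n)(Array(7, 8)).toDS.cache()
val df = Seq.fill(n)(Array(7, 8)).toDF("a").cache()
ds.count(); df.count()   // materialize both caches before timing

// spark.time prints the wall-clock time of the enclosed action
spark.time { ds.map(a => Array(a(0))).count() }        // Dataset path: goes through Ser/De
spark.time { df.selectExpr("array(a[0])").count() }    // DataFrame path: stays in Tungsten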
Spark 2.4 Supports Array Built-in Functions
▪ These built-in functions operate on array elements without writing a loop
– Single array input: array_min, array_max, array_position, ...
– Two-array input: array_intersect, array_union, array_except, ...
▪ Before Spark 2.4, users had to write such a function using Dataset or a UDF
20
SPARK-23899 is an umbrella entry
ds: Dataset[Array[Int]] = Seq(Array(7, 8)).toDS
ds.map(a => a.min)
df: DataFrame = Seq(Array(7, 8)).toDF("a")
df.selectExpr("array_min(a)")
@ueshin co-wrote a blog entry at
https://databricks.com/blog/2018/11/16/introducing-new-built-in-functions-and-higher-order-functions-for-complex-data-types-in-apache-spark.html
Looking back at Spark 2.x and forward to 3.0 - Kazuaki Ishizaki
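For completeness, a small usage sketch of these Spark 2.4 functions; the column functions are imported from org.apache.spark.sql.functions, and a spark-shell session is assumed:

import spark.implicits._
import org.apache.spark.sql.functions.{array_min, array_max, array_position}

val df = Seq(Array(7, 8)).toDF("a")

// No Dataset map or UDF needed: the built-in functions work on the internal representation
df.select(array_min($"a"), array_max($"a"), array_position($"a", 8)).show()

// The SQL form works as well
df.selectExpr("array_min(a)", "array_max(a)").show()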
Pre-Spark 2.3 Throws a Java Exception with Many Columns
▪ A Java class file has multiple limitations
– The bytecode of a method must be smaller than 64KB
– The constant pool (e.g. for symbol names) can hold at most 64K entries
21
df.groupBy("id").agg(max("c1"), sum("c2"), …, min("c4000"))
01:11:11.123 ERROR org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator: failed to
compile: org.codehaus.janino.JaninoRuntimeException: Code of method
"apply(Lorg/apache/spark/sql/catalyst/InternalRow;)Lorg/apache/spark/sql/catalyst/expressions
/UnsafeRow;" of class
"org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection" grows
beyond 64 KB
...
Looking back at Spark 2.x and forward to 3.0 - Kazuaki Ishizaki
Generated a huge method whose bytecode size is
more than 64KB
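The failing query can be reproduced programmatically; a sketch under the assumption of a synthetic DataFrame with columns id and c1..c4000 (my construction, not from the slide). On a pre-2.3 build this kind of query triggers the 64KB error, while 2.3+ splits the generated methods instead:

import org.apache.spark.sql.functions.{col, max}

// A wide DataFrame: an "id" column plus c1..c4000 integer columns
val wide = spark.range(10).toDF("id")
  .select((col("id") +: (1 to 4000).map(i => (col("id") + i).as(s"c$i"))): _*)

// One aggregate per column, like the slide's agg(max("c1"), sum("c2"), …, min("c4000"))
val aggs = (1 to 4000).map(i => max(col(s"c$i")).as(s"m$i"))
wide.groupBy("id").agg(aggs.head, aggs.tail: _*).count()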
Spark 2.3 Fixes Java Exception with Large Columns
▪ Conservatively generate small methods when potentially large Java code
would be generated
– Apply this policy at multiple places in the code generators
22
SPARK-22150 is an umbrella entry that has 25 sub-tasks
Looking back at Spark 2.x and forward to 3.0 - Kazuaki Ishizaki
Major Items in Spark 3.0
▪ JDK11 and Scala 2.12 support (available in master branch)
– SPARK-24417
▪ Tungsten intermediate representation (IR)
– Easy to restructure generated code
▪ SPARK-25728 (under proposal)
▪ DataSource V2 API
– SPARK-25528
▪ …
24 Looking back at Spark 2.x and forward to 3.0 - Kazuaki Ishizaki
Motivation of Tungsten IR
▪ It is not easy to restructure Java code after generating the code
– Code generation is done by string concatenation
25
int i = ...
func1(i + 1, i * 2);
...
func1(i + 500, i * 2);
func1(i + 501, i * 2);
func1(i + 1000, i * 2);
Hard to split here into two parts
without parsing Java code
Looking back at Spark 2.x and forward to 3.0 - Kazuaki Ishizaki
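To make the problem concrete, here is a deliberately toy code generator in the string-concatenation style; nothing below is Spark's actual generator, it only illustrates why a flat String is hard to split once it has been produced:

// Toy string-based "code generator": once the calls are flattened into one
// String, splitting the body into two 64KB-safe methods requires re-parsing Java
def generateBody(n: Int): String = {
  val sb = new StringBuilder("int i = input;\n")
  for (k <- 1 to n) {
    sb.append(s"func1(i + $k, i * 2);\n")  // appended as opaque text
  }
  sb.toString  // a single flat string with no structure left to cut at statement 500
}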
Structured IR Allows Us to Restructure Generated Code
▪ Ease of code restructuring (shown in blue in the original diagram)
▪ Ease of rebuilding an expression (shown in green in the original diagram)
26
[Diagram: an IR tree in which a Method node holds Invoke nodes whose arguments are
expression trees such as Add(load i, 500) and Mul(load i, 2); a second Invoke holds
Add(load i, 501). Splitting the method into two parts between Invoke nodes is easy,
and an expression is rebuilt by replacing a node such as the constant 500 with 501.]
Looking back at Spark 2.x and forward to 3.0 - Kazuaki Ishizaki
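A correspondingly toy "structured IR" sketch is below: because the statements stay as a list of tree nodes until the very end, a splitter can cut between nodes and emit two methods without ever parsing Java. This only illustrates the idea behind SPARK-25728, not its actual design or API:

// Minimal illustrative IR: expressions and statements as case classes
sealed trait Expr
case class Load(name: String) extends Expr
case class Const(v: Int) extends Expr
case class Add(l: Expr, r: Expr) extends Expr
case class Mul(l: Expr, r: Expr) extends Expr
case class Invoke(method: String, args: Seq[Expr])

def emit(e: Expr): String = e match {
  case Load(n)   => n
  case Const(v)  => v.toString
  case Add(l, r) => s"(${emit(l)} + ${emit(r)})"
  case Mul(l, r) => s"(${emit(l)} * ${emit(r)})"
}

def emitMethod(name: String, body: Seq[Invoke]): String =
  s"private void $name(int i) {\n" +
    body.map(s => s"  ${s.method}(${s.args.map(emit).mkString(", ")});").mkString("\n") +
    "\n}"

// 1000 calls kept as structured nodes, not as one string
val stmts = (1 to 1000).map(k =>
  Invoke("func1", Seq(Add(Load("i"), Const(k)), Mul(Load("i"), Const(2)))))

// Restructuring is now a list operation: split into two methods of 500 calls each
val (first, second) = stmts.splitAt(500)
println(emitMethod("part1", first))
println(emitMethod("part2", second))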
Other Major Items in Spark 2.x
▪ PySpark Performance Improvement
– Using Pandas UDFs with Apache Arrow can drastically improve the performance
of PySpark
▪ https://databricks.com/blog/2017/10/30/introducing-vectorized-udfs-for-pyspark.html
▪ Project Hydrogen
– Barrier execution mode for integrating ML/DL frameworks with Spark
▪ https://databricks.com/blog/2018/07/25/bay-area-apache-spark-meetup-summary-databricks-hq.html
27 Looking back at Spark 2.x and forward to 3.0 - Kazuaki Ishizaki
What I am Interested in
▪ Tungsten IR in Spark
– ease of code restructuring
– (In the future) apply multiple optimizations
▪ Improvement of generated code in Spark
– for Parquet reader
– data representation for array table cache
▪ Integration of Spark with DL/ML frameworks (TensorFlow…) and others
28 Looking back at Spark 2.x and forward to 3.0 - Kazuaki Ishizaki
Possible Integration of Spark through Arrow (My View)
▪ Frameworks: DL/ML frameworks (TensorFlow…)
▪ Resource: GPU, …
– RAPIDS (by NVIDIA) may help integrate with GPU
29 Looking back at Spark 2.x and forward to 3.0 - Kazuaki Ishizaki
From rapids.ai: [diagram of in-memory columnar data shared between frameworks]
Why Was I Appointed as a Committer?
▪ Continued to make contributions to a certain component (SQL)
▪ Reviewed many pull requests
▪ Shared knowledge in the community based on my expertise
– Compiler and Java virtual machine
▪ Met committers and contributors in person
– Hadoop Source Code Reading, Hadoop Spark Conference Japan,
Spark Summit, other meetups
30 Looking back at Spark 2.x and forward to 3.0 - Kazuaki Ishizaki
31
Please contribute to open source!
"Let's jump into the Spark community!" (Sparkコミュニティに飛び込もう!)
by Apache Spark committer Sarutak-san (猿田さん)
https://www.slideshare.net/hadoopxnttdata/apache-spark-commnity-nttdata-sarutak