Hive Bucketing in Apache Spark
Tejas Patil
Facebook
Agenda
• Why bucketing?
• Why is shuffle bad?
• How to avoid shuffle?
• When to use bucketing?
• Spark’s bucketing support
• Bucketing semantics of Spark vs Hive
• Hive bucketing support in Spark
• SQL Planner improvements
Why bucketing?
Table A JOIN Table B
Broadcast hash join
- Ship smaller table to all nodes
- Stream the other table
Shuffle hash join
- Shuffle both tables
- Hash smaller one, stream the bigger one
Sort merge join
- Shuffle both tables
- Sort both tables, buffer one, stream the bigger one
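The "hash the smaller one, stream the bigger one" step shared by the two hash joins can be sketched with plain Scala collections. This is only an illustration of the idea, not Spark's implementation; the object and method names are hypothetical.

```scala
object JoinStrategies {
  // Build a hash map from the smaller side, then stream the larger side
  // and probe the map for each streamed row.
  def hashJoin[K, A, B](small: Seq[(K, A)], big: Seq[(K, B)]): Seq[(K, A, B)] = {
    val built: Map[K, Seq[A]] =
      small.groupBy(_._1).map { case (k, rows) => k -> rows.map(_._2) }
    for {
      (k, b) <- big
      a <- built.getOrElse(k, Seq.empty[A])
    } yield (k, a, b)
  }
}
```

In a broadcast hash join the built map would be shipped to every node; in a shuffle hash join both inputs are first repartitioned on the key so that each node builds a map only for its own partition.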
Sort Merge Join
[Diagram: partitions of Table A and Table B go through Shuffle, then Sort, then Join]
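The sort and merge phases for a single partition can be sketched as follows, assuming integer join keys and in-memory inputs. The names are hypothetical and the code is a simplification of what a real engine does.

```scala
object SortMergeJoinSketch {
  // Sorts both sides by key, then merges them with two cursors.
  // Runs of equal keys on the right side are buffered; the left side is streamed.
  def sortMergeJoin[A, B](left: Seq[(Int, A)], right: Seq[(Int, B)]): Seq[(Int, A, B)] = {
    val l = left.sortBy(_._1).toVector
    val r = right.sortBy(_._1).toVector
    val out = Vector.newBuilder[(Int, A, B)]
    var i = 0
    var j = 0
    while (i < l.length && j < r.length) {
      val lk = l(i)._1
      val rk = r(j)._1
      if (lk < rk) i += 1
      else if (lk > rk) j += 1
      else {
        // Buffer the run of right rows that share this key.
        var end = j
        while (end < r.length && r(end)._1 == lk) end += 1
        val buffered = r.slice(j, end)
        // Stream every left row with the same key against the buffer.
        while (i < l.length && l(i)._1 == lk) {
          val a = l(i)._2
          buffered.foreach { case (_, b) => out += ((lk, a, b)) }
          i += 1
        }
        j = end
      }
    }
    out.result()
  }
}
```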
Why is shuffle bad?
- Disk IO for shuffle outputs
- M * R network connections
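The M * R term is simple arithmetic: each of the M map-side tasks may hold output for each of the R reduce-side tasks. A tiny sketch, with hypothetical names and made-up task counts in the usage note:

```scala
object ShuffleCost {
  // Upper bound on fetch connections for one shuffle:
  // every reduce task may need to contact every map task.
  def connections(mapTasks: Long, reduceTasks: Long): Long = mapTasks * reduceTasks
}
```

For example, `ShuffleCost.connections(1000, 1000)` gives 1000000, so doubling both sides quadruples the connection count.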
How to avoid shuffle?
student s JOIN attendance a ON s.id = a.student_id
student s JOIN results r ON s.id = r.student_id
student s JOIN course_registration c ON s.id = c.student_id
……..
Pre-compute at table creation
Bucketing = pre-(shuffle + sort) inputs on join keys
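A minimal sketch of "pre-(shuffle + sort)" using plain Scala collections: rows are assigned to buckets once, each bucket is sorted, and a later join only pairs bucket i with bucket i. The identity hash and all names here are assumptions for illustration, not what Spark or Hive use.

```scala
object BucketingSketch {
  // Assign a key to one of numBuckets buckets (identity hash, non-negative modulo).
  def bucketOf(key: Int, numBuckets: Int): Int = Math.floorMod(key, numBuckets)

  // Done once at table creation: group rows by bucket and sort each bucket by key.
  def bucketize[V](rows: Seq[(Int, V)], numBuckets: Int): Map[Int, Seq[(Int, V)]] =
    rows.groupBy { case (k, _) => bucketOf(k, numBuckets) }
      .map { case (b, rs) => b -> rs.sortBy(_._1) }

  // Done at query time: with identical bucketing on both sides, bucket i is
  // joined only with bucket i. A nested loop is used inside a bucket for brevity.
  def bucketedJoin[A, B](a: Map[Int, Seq[(Int, A)]], b: Map[Int, Seq[(Int, B)]]): Seq[(Int, A, B)] =
    a.keys.toSeq.sorted.flatMap { bucket =>
      val right = b.getOrElse(bucket, Seq.empty[(Int, B)])
      for { (k, x) <- a(bucket); (k2, y) <- right if k == k2 } yield (k, x, y)
    }
}
```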
[Figure: Without bucketing vs. With bucketing, showing the job(s) populating the input table]
Performance comparison
[Figure: bar charts of Reserved CPU time (days) and Latency (hours), Before vs After Bucketing]
When to use bucketing?
• Tables used frequently in JOINs with same key
  • Student -> student_id
  • Employee -> employee_id
  • Users -> user_id
  …. => Dimension tables
• Loading daily cumulative tables
  • Both base and delta tables could be bucketed on a common column
• Indexing capability
Spark’s bucketing support
Creation of bucketed tables via DataFrame API
df.write
  .bucketBy(numBuckets, "col1", ...)
  .sortBy("col1", ...)
  .saveAsTable("bucketed_table")
Creation of bucketed tables via SQL statement
CREATE TABLE bucketed_table(
  column1 INT,
  ...
) USING parquet
CLUSTERED BY (column1, …)
SORTED BY (column1, …)
INTO `n` BUCKETS
Check bucketing spec
scala> spark.sql("DESC FORMATTED student").collect.foreach(println)
[# col_name,data_type,comment]
[student_id,int,null]
[name,int,null]
[# Detailed Table Information,,]
[Database,default,]
[Table,table1,]
[Owner,tejas,]
[Created,Fri May 12 08:06:33 PDT 2017,]
[Type,MANAGED,]
[Num Buckets,64,]
[Bucket Columns,[`student_id`],]
[Sort Columns,[`student_id`],]
[Properties,[serialization.format=1],]
[Serde Library,org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe,]
[InputFormat,org.apache.hadoop.mapred.SequenceFileInputFormat,]
Bucketing config
SET spark.sql.sources.bucketing.enabled=true
[SPARK-15453] Extract bucketing info in FileSourceScanExec
SELECT * FROM tableA JOIN tableB ON tableA.i = tableB.i AND tableA.j = tableB.j
Before:
Sort Merge Join      <- Join on [i,j], [i,j] respectively of relations
  Sort               <- Sort on columns (i, j)
    Exchange         <- Shuffle on columns (i, j)
      TableScan      <- Already bucketed on columns (i, j)
  Sort
    Exchange
      TableScan
After:
Sort Merge Join
  TableScan
  TableScan
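The decision the planner makes here can be sketched as a toy rule: compare what the scan already guarantees with what the join requires, and add operators only for the difference. `ScanInfo` and `operatorsNeeded` are hypothetical names, not Spark's planner classes.

```scala
object EnsureRequirementsSketch {
  // Toy description of what a table scan reports about its output.
  final case class ScanInfo(bucketCols: Seq[String], sortCols: Seq[String])

  // Operators to place between the scan and a sort merge join on joinKeys.
  def operatorsNeeded(scan: ScanInfo, joinKeys: Seq[String]): Seq[String] = {
    val needExchange = scan.bucketCols != joinKeys
    // A fresh shuffle destroys any existing order, so a sort must follow it.
    val needSort = needExchange || !scan.sortCols.startsWith(joinKeys)
    (if (needExchange) Seq("Exchange") else Seq.empty[String]) ++
      (if (needSort) Seq("Sort") else Seq.empty[String])
  }
}
```

A scan that is bucketed and sorted on (i, j) needs nothing extra for a join on (i, j); a scan with no bucketing information needs both Exchange and Sort.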
Bucketing semantics of Spark vs Hive
[Diagram: my_bucketed_table in Hive is written through M and R tasks as one file per bucket (bucket 0, bucket 1, ….., bucket (n-1)); in Spark it is written by M tasks only, with several files per bucket]
Model
- Hive: Optimizes reads, writes are costly
- Spark: Writes are cheaper, reads are costlier
Unit of bucketing
- Hive: A single file represents a single bucket
- Spark: A collection of files together comprise a single bucket
Sort ordering
- Hive: Each bucket is sorted globally. This makes it easy to optimize queries reading this data
- Spark: Each file is individually sorted but the “bucket” as a whole is not globally sorted
Hashing function
- Hive: Hive’s inbuilt hash
- Spark: Murmur3Hash
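Why a different hashing function alone makes the two layouts incompatible can be seen in a small sketch. Both functions below are stand-ins: the identity hash for an INT column and the seed 42 are assumptions made for illustration, and neither is claimed to match Hive's or Spark's exact code.

```scala
import scala.util.hashing.MurmurHash3

object BucketIdSketch {
  // Stand-in for a Hive-style bucket id of an INT key: the value itself,
  // made non-negative, modulo the bucket count (assumption).
  def hiveStyleBucket(key: Int, numBuckets: Int): Int =
    (key & Int.MaxValue) % numBuckets

  // Stand-in for a Murmur3-based bucket id (seed chosen for illustration).
  def murmurStyleBucket(key: Int, numBuckets: Int): Int = {
    val h = MurmurHash3.finalizeHash(MurmurHash3.mix(42, key), 4)
    Math.floorMod(h, numBuckets)
  }
}
```

The same key generally lands in different bucket numbers under the two schemes, so a reader that assumes one scheme cannot skip the shuffle for files written with the other.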
[SPARK-19256] Hive bucketing support
• Introduce Hive’s hashing function [SPARK-17495]
• Enable creating hive bucketed tables [SPARK-17729]
• Support Hive’s `Bucketed file format`
• Propagate Hive bucketing information to planner [SPARK-17654]
• Expressing outputPartitioning and requiredChildDistribution
• Creating empty bucket files when no data for a bucket
• Allow Sort Merge join over tables with number of buckets multiple of each other
• Support N-way Sort Merge Join
Status labels on the slides: "Merged in upstream" and "FB-prod (6 months)"
SQL Planner improvements
[SPARK-19122] Unnecessary shuffle+sort added if join predicates ordering differ from bucketing and sorting order
SELECT * FROM a JOIN b ON a.x=b.x AND a.y=b.y
Sort Merge Join
  TableScan
  TableScan
SELECT * FROM a JOIN b ON a.y=b.y AND a.x=b.x
Before:
Sort Merge Join
  Sort
    Exchange
      TableScan
  Sort
    Exchange
      TableScan
After:
Sort Merge Join
  TableScan
  TableScan
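One way to picture the fix is a key-reordering step: when the join keys are just a permutation of the bucketing columns, they are rearranged to follow the bucketing order before the plan is checked. This is a sketch of that idea with hypothetical names, not the actual patch.

```scala
object JoinKeyReorderSketch {
  // joinKeys: (left column, right column) pairs in the order written in the query.
  // bucketCols: bucketing columns of the left table, in bucketing order.
  // Returns the pairs in bucketing order when they are a permutation of it,
  // and the original pairs otherwise.
  def reorder(joinKeys: Seq[(String, String)], bucketCols: Seq[String]): Seq[(String, String)] = {
    val byLeft = joinKeys.toMap
    val isPermutation =
      joinKeys.length == bucketCols.length &&
        byLeft.size == joinKeys.length &&
        bucketCols.toSet == byLeft.keySet
    if (isPermutation) bucketCols.map(c => c -> byLeft(c)) else joinKeys
  }
}
```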
[SPARK-17271] Planner adds un-necessary Sort even if child ordering is semantically same as required ordering
SELECT * FROM table1 a JOIN table2 b ON a.col1=b.col1
Before:
Sort Merge Join      <- Join on [col1], [col1] respectively of relations
  Sort               <- Sort on columns (a.col1 and b.col1)
    TableScan        <- Already bucketed on column `col1`
  Sort
    TableScan
After:
Sort Merge Join
  TableScan
  TableScan
[SPARK-17698] Join predicates should not contain filter clauses
SELECT a.id, b.id FROM table1 a FULL OUTER JOIN table2 b ON a.id = b.id AND a.id='1' AND b.id='1'
Before:
Sort Merge Join      <- Join on [id, id, 1], [id, 1, id] respectively of relations
  Sort               <- Sort on columns [id, id, 1] and [id, 1, id]
    Exchange         <- Shuffle on columns [id, 1, id] and [id, id, 1]
      TableScan      <- Already bucketed on column [id]
  Sort
    Exchange
      TableScan
After:
Sort Merge Join
  TableScan
  TableScan
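The separation this fix relies on can be sketched as a split of the ON-clause conjuncts: only column-to-column equalities across the two tables become join keys, while comparisons against literals stay out of the key list. The tiny expression model below is hypothetical and far simpler than a real planner's.

```scala
object JoinPredicateSketch {
  sealed trait Expr
  final case class Col(table: String, name: String) extends Expr
  final case class Lit(value: String) extends Expr
  final case class Eq(left: Expr, right: Expr)

  // Returns (join keys, other conditions). A conjunct is a join key only if it
  // equates a column of one table with a column of a different table.
  def split(conjuncts: Seq[Eq]): (Seq[Eq], Seq[Eq]) =
    conjuncts.partition {
      case Eq(Col(t1, _), Col(t2, _)) => t1 != t2
      case _ => false
    }
}
```

For the query above, only `a.id = b.id` ends up as a join key, so the required distribution and ordering are on [id] alone, which matches the existing bucketing.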
[SPARK-13450] Introduce ExternalAppendOnlyUnsafeRowArray
[Diagram: buffered relation and streamed relation, matching rows with key=`foo`; in the second version the buffer spills to disk]
Before: if there are a lot of rows to be buffered (e.g. skew), the query would OOM
After: buffer would spill to disk after reaching a threshold
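The spill-after-threshold behaviour can be illustrated with a much simpler structure than the real ExternalAppendOnlyUnsafeRowArray: an append-only buffer of strings that keeps a fixed number of rows in memory and writes the rest to a temporary file. The class name, the string rows and the one-row-per-line file format are all assumptions for this sketch.

```scala
import java.io.{BufferedWriter, File, FileWriter}
import scala.collection.mutable.ArrayBuffer
import scala.io.Source

// Append-only buffer: the first `threshold` rows stay in memory,
// later rows are appended to a temp file (rows must not contain newlines).
final class SpillableBuffer(threshold: Int) {
  private val inMemory = ArrayBuffer.empty[String]
  private var spillFile: Option[File] = None
  private var spilledRows = 0

  def add(row: String): Unit = {
    if (inMemory.length < threshold) {
      inMemory += row
    } else {
      val file = spillFile.getOrElse {
        val f = File.createTempFile("spill", ".txt")
        f.deleteOnExit()
        spillFile = Some(f)
        f
      }
      // Reopened in append mode on every call for brevity, not efficiency.
      val w = new BufferedWriter(new FileWriter(file, true))
      try { w.write(row); w.newLine() } finally w.close()
      spilledRows += 1
    }
  }

  def spilled: Int = spilledRows

  // Rows in insertion order: in-memory rows first, then spilled rows.
  def rows: Seq[String] = {
    val fromDisk = spillFile.toSeq.flatMap { f =>
      val src = Source.fromFile(f)
      try src.getLines().toList finally src.close()
    }
    inMemory.toList ++ fromDisk
  }
}
```

With a threshold of 2, adding four rows keeps two in memory and spills two, so memory use stays bounded even when one join key has many matching rows.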
Summary
• Shuffle and sort are costly due to disk and network IO
• Bucketing will pre-(shuffle and sort) the inputs
• Expect at least 2-5x gains after bucketing input tables for joins
• Candidates for bucketing:
  • Tables used frequently in JOINs with same key
  • Loading of data in cumulative tables
• Spark’s bucketing is not compatible with Hive’s bucketing
Questions?

More Related Content

What's hot

Hive: Loading Data
Hive: Loading DataHive: Loading Data
Hive: Loading Data
Benjamin Leonhardi
 
Top 5 Mistakes to Avoid When Writing Apache Spark Applications
Top 5 Mistakes to Avoid When Writing Apache Spark ApplicationsTop 5 Mistakes to Avoid When Writing Apache Spark Applications
Top 5 Mistakes to Avoid When Writing Apache Spark Applications
Cloudera, Inc.
 
Change Data Feed in Delta
Change Data Feed in DeltaChange Data Feed in Delta
Change Data Feed in Delta
Databricks
 
Deep Dive into the New Features of Apache Spark 3.0
Deep Dive into the New Features of Apache Spark 3.0Deep Dive into the New Features of Apache Spark 3.0
Deep Dive into the New Features of Apache Spark 3.0
Databricks
 
How Adobe Does 2 Million Records Per Second Using Apache Spark!
How Adobe Does 2 Million Records Per Second Using Apache Spark!How Adobe Does 2 Million Records Per Second Using Apache Spark!
How Adobe Does 2 Million Records Per Second Using Apache Spark!
Databricks
 
Memory Management in Apache Spark
Memory Management in Apache SparkMemory Management in Apache Spark
Memory Management in Apache Spark
Databricks
 
Lessons from the Field: Applying Best Practices to Your Apache Spark Applicat...
Lessons from the Field: Applying Best Practices to Your Apache Spark Applicat...Lessons from the Field: Applying Best Practices to Your Apache Spark Applicat...
Lessons from the Field: Applying Best Practices to Your Apache Spark Applicat...
Databricks
 
The Rise of ZStandard: Apache Spark/Parquet/ORC/Avro
The Rise of ZStandard: Apache Spark/Parquet/ORC/AvroThe Rise of ZStandard: Apache Spark/Parquet/ORC/Avro
The Rise of ZStandard: Apache Spark/Parquet/ORC/Avro
Databricks
 
Fine Tuning and Enhancing Performance of Apache Spark Jobs
Fine Tuning and Enhancing Performance of Apache Spark JobsFine Tuning and Enhancing Performance of Apache Spark Jobs
Fine Tuning and Enhancing Performance of Apache Spark Jobs
Databricks
 
A Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and HudiA Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and Hudi
Databricks
 
Spark shuffle introduction
Spark shuffle introductionSpark shuffle introduction
Spark shuffle introduction
colorant
 
Hudi architecture, fundamentals and capabilities
Hudi architecture, fundamentals and capabilitiesHudi architecture, fundamentals and capabilities
Hudi architecture, fundamentals and capabilities
Nishith Agarwal
 
How to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and HudiHow to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and Hudi
Flink Forward
 
Bucketing 2.0: Improve Spark SQL Performance by Removing Shuffle
Bucketing 2.0: Improve Spark SQL Performance by Removing ShuffleBucketing 2.0: Improve Spark SQL Performance by Removing Shuffle
Bucketing 2.0: Improve Spark SQL Performance by Removing Shuffle
Databricks
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Databricks
 
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudAmazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Noritaka Sekiyama
 
Best Practice of Compression/Decompression Codes in Apache Spark with Sophia...
 Best Practice of Compression/Decompression Codes in Apache Spark with Sophia... Best Practice of Compression/Decompression Codes in Apache Spark with Sophia...
Best Practice of Compression/Decompression Codes in Apache Spark with Sophia...
Databricks
 
How We Optimize Spark SQL Jobs With parallel and sync IO
How We Optimize Spark SQL Jobs With parallel and sync IOHow We Optimize Spark SQL Jobs With parallel and sync IO
How We Optimize Spark SQL Jobs With parallel and sync IO
Databricks
 
Data Source API in Spark
Data Source API in SparkData Source API in Spark
Data Source API in Spark
Databricks
 
Simplify CDC Pipeline with Spark Streaming SQL and Delta Lake
Simplify CDC Pipeline with Spark Streaming SQL and Delta LakeSimplify CDC Pipeline with Spark Streaming SQL and Delta Lake
Simplify CDC Pipeline with Spark Streaming SQL and Delta Lake
Databricks
 

What's hot (20)

Hive: Loading Data
Hive: Loading DataHive: Loading Data
Hive: Loading Data
 
Top 5 Mistakes to Avoid When Writing Apache Spark Applications
Top 5 Mistakes to Avoid When Writing Apache Spark ApplicationsTop 5 Mistakes to Avoid When Writing Apache Spark Applications
Top 5 Mistakes to Avoid When Writing Apache Spark Applications
 
Change Data Feed in Delta
Change Data Feed in DeltaChange Data Feed in Delta
Change Data Feed in Delta
 
Deep Dive into the New Features of Apache Spark 3.0
Deep Dive into the New Features of Apache Spark 3.0Deep Dive into the New Features of Apache Spark 3.0
Deep Dive into the New Features of Apache Spark 3.0
 
How Adobe Does 2 Million Records Per Second Using Apache Spark!
How Adobe Does 2 Million Records Per Second Using Apache Spark!How Adobe Does 2 Million Records Per Second Using Apache Spark!
How Adobe Does 2 Million Records Per Second Using Apache Spark!
 
Memory Management in Apache Spark
Memory Management in Apache SparkMemory Management in Apache Spark
Memory Management in Apache Spark
 
Lessons from the Field: Applying Best Practices to Your Apache Spark Applicat...
Lessons from the Field: Applying Best Practices to Your Apache Spark Applicat...Lessons from the Field: Applying Best Practices to Your Apache Spark Applicat...
Lessons from the Field: Applying Best Practices to Your Apache Spark Applicat...
 
The Rise of ZStandard: Apache Spark/Parquet/ORC/Avro
The Rise of ZStandard: Apache Spark/Parquet/ORC/AvroThe Rise of ZStandard: Apache Spark/Parquet/ORC/Avro
The Rise of ZStandard: Apache Spark/Parquet/ORC/Avro
 
Fine Tuning and Enhancing Performance of Apache Spark Jobs
Fine Tuning and Enhancing Performance of Apache Spark JobsFine Tuning and Enhancing Performance of Apache Spark Jobs
Fine Tuning and Enhancing Performance of Apache Spark Jobs
 
A Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and HudiA Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and Hudi
 
Spark shuffle introduction
Spark shuffle introductionSpark shuffle introduction
Spark shuffle introduction
 
Hudi architecture, fundamentals and capabilities
Hudi architecture, fundamentals and capabilitiesHudi architecture, fundamentals and capabilities
Hudi architecture, fundamentals and capabilities
 
How to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and HudiHow to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and Hudi
 
Bucketing 2.0: Improve Spark SQL Performance by Removing Shuffle
Bucketing 2.0: Improve Spark SQL Performance by Removing ShuffleBucketing 2.0: Improve Spark SQL Performance by Removing Shuffle
Bucketing 2.0: Improve Spark SQL Performance by Removing Shuffle
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
 
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudAmazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
 
Best Practice of Compression/Decompression Codes in Apache Spark with Sophia...
 Best Practice of Compression/Decompression Codes in Apache Spark with Sophia... Best Practice of Compression/Decompression Codes in Apache Spark with Sophia...
Best Practice of Compression/Decompression Codes in Apache Spark with Sophia...
 
How We Optimize Spark SQL Jobs With parallel and sync IO
How We Optimize Spark SQL Jobs With parallel and sync IOHow We Optimize Spark SQL Jobs With parallel and sync IO
How We Optimize Spark SQL Jobs With parallel and sync IO
 
Data Source API in Spark
Data Source API in SparkData Source API in Spark
Data Source API in Spark
 
Simplify CDC Pipeline with Spark Streaming SQL and Delta Lake
Simplify CDC Pipeline with Spark Streaming SQL and Delta LakeSimplify CDC Pipeline with Spark Streaming SQL and Delta Lake
Simplify CDC Pipeline with Spark Streaming SQL and Delta Lake
 

More from Databricks

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
Databricks
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1
Databricks
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2
Databricks
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2
Databricks
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
Databricks
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
Databricks
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
Databricks
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data Science
Databricks
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML Monitoring
Databricks
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
Databricks
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
Databricks
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Databricks
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
Databricks
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Databricks
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature Aggregations
Databricks
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and Spark
Databricks
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
Databricks
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
Databricks
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta Lake
Databricks
 
Machine Learning CI/CD for Email Attack Detection
Machine Learning CI/CD for Email Attack DetectionMachine Learning CI/CD for Email Attack Detection
Machine Learning CI/CD for Email Attack Detection
Databricks
 

More from Databricks (20)

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data Science
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML Monitoring
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature Aggregations
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and Spark
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta Lake
 
Machine Learning CI/CD for Email Attack Detection
Machine Learning CI/CD for Email Attack DetectionMachine Learning CI/CD for Email Attack Detection
Machine Learning CI/CD for Email Attack Detection
 


Hive Bucketing in Apache Spark with Tejas Patil