Hivemall:	Machine	Learning	
Library	for	Apache	Hive/Spark
Research	Engineer
Makoto	YUI	(油井 誠) @myui
<myui@treasure-data.com>
2016/09/09 HadoopCon 16, Taipei
A little about me
• 2015.04~ Research Engineer at Treasure Data, Inc.
  • My mission is developing ML-as-a-Service in a Hadoop-as-a-Service company
• 2010.04-2015.03 Senior Researcher at National Institute of Advanced Industrial Science and Technology (AIST, 産業技術総合研究所), Japan
  • Developed Hivemall as a personal research project
• 2009.03 Ph.D. in Computer Science from NAIST
  • Majored in Parallel Data Processing, not ML, at that time
• Visiting scholar at CWI, Amsterdam, and Univ. Edinburgh
Treasure Data
• Hiro Yoshikawa, CEO: open source business veteran
• Kaz Ota, CTO: founder of the world's largest Hadoop group
• Sada Furuhashi, Chief Architect: invented Fluentd and MessagePack
History
• 2012: Founded in Mountain View, CA
• 2013: New office in Tokyo, Japan
• 2015: New office in Seoul, Korea
• Today: 100+ employees, 30M+ funding
Investors
• Jerry Yang: Yahoo! founder
• Bill Tai: angel investor
• Yukihiro Matsumoto: Ruby inventor
• Sierra Ventures (Tim Guleri): enterprise software
• Scale Ventures (Andy Vitus): B2B SaaS
We open-source! TD invented ..
• Streaming log collector
• Bulk data import/export
• Efficient binary serialization
• Streaming Query Processor
• Machine learning on Hadoop
• Workflow engine (Beta): digdag.io
Our technology users
Point: Microsoft Operations Management Suite and Google Cloud Platform (Kubernetes) use Fluentd for log collection
Treasure	Data’s	Solution
Big Data Stats in TD
Our Customers
• Ad-tech: Agency / Trading Desk, DMP / DSP, Ad-Network
• IoT: 三菱重工 (Mitsubishi Heavy Industries)
• Internet Service: EC, Media, Game/SNS, Gaming, e-Commerce
• Other domain: Retail, Finance, Technology, Telecommunication, Maker
Agenda
1. What is Hivemall (introduction)
2. Why Hivemall (motivations etc.)
3. Hivemall Internals
4. How to use Hivemall
5. Future roadmap
What is Hivemall
Scalable machine learning library built as a collection of Hive UDFs, licensed under the Apache License v2
https://github.com/myui/hivemall
Hivemall's Technology Stack
• Machine Learning: Hivemall, MLlib
• Query Processing: Hive, Pig, SparkSQL
• Parallel Data Processing Framework: MapReduce (MRv1), Apache Tez (DAG processing), Apache Spark
• Resource Management: Apache YARN, MESOS
• Distributed File System / Cloud Storage: Hadoop HDFS, Amazon S3
Hivemall's Vision: ML on SQL
Classification with Mahout
CREATE TABLE lr_model AS
SELECT
  feature, -- reducers perform model averaging in parallel
  avg(weight) as weight
FROM (
  SELECT logress(features,label,..) as (feature,weight)
  FROM train
) t -- map-only task
GROUP BY feature; -- shuffled to reducers
✓ Machine Learning made easy for SQL developers (ML for the rest of us)
✓ Interactive and Stable APIs w/ SQL abstraction
This SQL query automatically runs in parallel on Hadoop
List of supported Algorithms
Classification
✓ Perceptron
✓ Passive Aggressive (PA, PA1, PA2)
✓ Confidence Weighted (CW)
✓ Adaptive Regularization of Weight Vectors (AROW)
✓ Soft Confidence Weighted (SCW)
✓ AdaGrad+RDA
✓ Factorization Machines
✓ RandomForest Classification
Regression
✓ Logistic Regression (SGD)
✓ AdaGrad (logistic loss)
✓ AdaDELTA (logistic loss)
✓ PA Regression
✓ AROW Regression
✓ Factorization Machines
✓ RandomForest Regression
• SCW is a good first choice
• Try RandomForest if SCW does not work
• Logistic regression is good for getting the probability of the positive class
• Factorization Machines work well where features are sparse and categorical
List of Algorithms for Recommendation
K-Nearest Neighbor
✓ Minhash and b-Bit Minhash (LSH variant)
✓ Similarity Search on Vector Space (Euclid/Cosine/Jaccard/Angular)
Matrix Completion
✓ Matrix Factorization
✓ Factorization Machines (regression)
The each_top_k function of Hivemall is useful for recommending top-k items
Other Supported Algorithms
Anomaly Detection
✓ Local Outlier Factor (LoF)
Feature Engineering
✓ Feature Hashing
✓ Feature Scaling (normalization, z-score)
✓ TF-IDF vectorizer
✓ Polynomial Expansion (Feature Pairing)
✓ Amplifier
NLP
✓ Basic English text Tokenizer
✓ Japanese Tokenizer (Kuromoji)
Industry use cases of Hivemall
• CTR prediction of Ad click logs
  • Algorithm: Logistic regression
  • Freakout Inc., Smartnews, and more
• Gender prediction of Ad click logs
  • Algorithm: Classification
  • Scaleout Inc.
• Item/User recommendation
  • Algorithm: Recommendation
  • Wish.com, GMO pepabo
  • Problem (minne.com): recommendation using hot items is hard in a hand-crafted product market because each creator sells only a few one-off items, which soon go out of stock
• Value prediction of Real estates
  • Algorithm: Regression
  • Livesense
• User score calculation
  • Algorithm: Regression
  • Klout (klout.com, influencer marketing): bit.ly/klout-hivemall
Churn Detection of Monthly Payment Service
OISIX, a leading food delivery service company in Japan, used Hivemall's Logistic Regression to get churn probability
Churn rate dropped by almost half after giving gift points to customers predicted to leave :)
Agenda: 2. Why Hivemall (motivations etc.)
Motivation – Why a new ML framework?
Machine Learning frameworks out there that run with Hadoop
• Mahout?
• Vowpal Wabbit? (w/ Hadoop streaming)
• Spark MLlib?
• 0xdata H2O?
• Cloudera Oryx?
Quick Poll: How many people in this room are using them?
How I used to do ML projects before Hivemall
Given raw data stored on Hadoop HDFS
Flow: Raw Data (HDFS / S3) -> Extract-Transform-Load -> Feature Vector file -> Machine Learning
Feature Vector example: height:173cm, weight:60kg, age:34, gender: man, …
• Extract-Transform-Load: need to do expensive data preprocessing (Joins, Filtering, and Formatting of Data that does not fit in memory)
• Machine Learning: does not scale; have to learn R/Python APIs
Hivemall's Vision: ML on SQL (again)
Classification with Mahout
CREATE TABLE lr_model AS
SELECT
  feature, -- reducers perform model averaging in parallel
  avg(weight) as weight
FROM (
  SELECT logress(features,label,..) as (feature,weight)
  FROM train
) t -- map-only task
GROUP BY feature; -- shuffled to reducers
✓ Machine Learning made easy for SQL developers (ML for the rest of us)
✓ Interactive and Stable APIs w/ SQL abstraction
This SQL query automatically runs in parallel on Hadoop
Hivemall	on	Apache	Spark
Installation	is	very	easy	as	follows:
$	spark-shell	--packages	maropu:hivemall-spark:0.0.6	
Agenda: 3. Hivemall Internals
How Hivemall works in training
Implemented machine learning algorithms as User-Defined Table generating Functions (UDTFs)
UDTF is a function that returns a relation
Training table: tuples of <label, array<features>>
+1, <1,2>
..
+1, <1,7,9>
-1, <1,3,9>
..
+1, <3,8>
Flow: Training table -> train (UDTF, in parallel) -> tuple<feature, weights> -> Shuffle by feature -> param-mix (in parallel) -> Prediction model (Relation <feature, weights>)
● Resulting prediction model is a relation of feature and its weight
● # of mappers and reducers are configurable
Parallelism is Powerful
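The "shuffle by feature, then param-mix" step can be illustrated in plain Python. This is only a sketch of the data flow, not Hivemall code: the function name `average_by_feature` and the two mapper outputs are hypothetical, and the mix is assumed to be a simple per-feature average, as in the `avg(weight) ... GROUP BY feature` query shown earlier.

```python
from collections import defaultdict

def average_by_feature(mapper_outputs):
    """Average the weights emitted for each feature across all training tasks.

    mapper_outputs: iterable of (feature, weight) pairs, i.e. the rows
    emitted by every parallel train task, in any order.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for feature, weight in mapper_outputs:  # plays the role of "shuffle by feature"
        sums[feature] += weight
        counts[feature] += 1
    # "param-mix": one averaged weight per feature
    return {feature: sums[feature] / counts[feature] for feature in sums}

# Hypothetical (feature, weight) rows from two parallel training tasks
mapper_a = [(1, 0.5), (2, -0.25), (9, 1.0)]
mapper_b = [(1, 1.5), (3, 0.75), (9, 0.0)]

model = average_by_feature(mapper_a + mapper_b)
print(model)
```

Features seen by only one task keep that task's weight; features seen by several tasks get the mean of their weights.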
Parameter averaging (bagging)
Training table: tuples of <label, features>, split across parallel train tasks
+1, <1,2>
..
+1, <1,7,9>
-1, <1,3,9>
..
+1, <3,8>

-1, <2,7,9>
..
+1, <3,8>
Flow: train, train -> array<weight> -> MIX -> train, train -> array<weight>
Alternative Approach in Hivemall
Hivemall provides the amplify UDTF to emulate the effect of training iterations without running several MapReduce steps
SET hivevar:xtimes=3;
CREATE VIEW training_x3
as
SELECT
  *
FROM (
  SELECT
    amplify(${xtimes}, *) as (rowid, label, features)
  FROM
    training
) t
CLUSTER BY rand()
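What the view above produces can be sketched in plain Python: every training row is emitted `xtimes` times and the result is randomly reordered. The helper names and the sample rows are hypothetical, and `random.Random(seed).shuffle` only stands in for `CLUSTER BY rand()`; it is not how Hive distributes the rows.

```python
import random

def amplify(xtimes, rows):
    """Yield every input row `xtimes` times."""
    for row in rows:
        for _ in range(xtimes):
            yield row

def amplified_and_shuffled(xtimes, rows, seed=None):
    """Amplify the rows, then shuffle them (stand-in for CLUSTER BY rand())."""
    out = list(amplify(xtimes, rows))
    random.Random(seed).shuffle(out)
    return out

# Hypothetical training rows: (rowid, label, features)
training = [(1, +1, [1, 2]), (2, -1, [1, 3, 9]), (3, +1, [3, 8])]

training_x3 = amplified_and_shuffled(3, training, seed=42)
print(len(training_x3))
```

A single pass over `training_x3` then sees each example three times in mixed order, which is the effect the slide describes.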
Agenda: 4. How to use Hivemall
How to use Hivemall (workflow step: Data preparation)
Workflow: Training takes labels and feature vectors and produces a prediction model via machine learning; Prediction applies the prediction model to feature vectors to produce labels
How to use Hivemall - Data preparation
Define a Hive table for training/testing data
CREATE EXTERNAL TABLE e2006tfidf_train (
  rowid int,
  label float,
  features ARRAY<STRING>
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\t'
  COLLECTION ITEMS TERMINATED BY ","
STORED AS TEXTFILE LOCATION '/dataset/E2006-tfidf/train';
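The row layout this table definition expects can be sketched in Python: three tab-separated fields, with the last one a comma-separated list of strings. The function name `parse_row` and the sample line are hypothetical; in particular, the `index:value` shape of each feature string is an assumption, since the table only declares `ARRAY<STRING>`.

```python
def parse_row(line):
    """Split one text line into (rowid, label, features).

    Fields are separated by tabs; the items of the features array
    are separated by commas, matching the table definition.
    """
    rowid, label, features = line.rstrip("\n").split("\t")
    return int(rowid), float(label), features.split(",")

# Hypothetical input line; the "index:value" feature strings are an assumption
row = parse_row("1\t-3.58\t101:0.25,205:0.5\n")
print(row)
```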
How to use Hivemall (workflow step: Feature Engineering)
How to use Hivemall - Feature Engineering
Applying a Min-Max Feature Normalization: transforming a label value to a value between 0.0 and 1.0
create view e2006tfidf_train_scaled
as
select
  rowid,
  rescale(label,${min_label},${max_label}) as label,
  features
from
  e2006tfidf_train;
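Min-max normalization itself is a one-line formula, sketched here in Python. This illustrates the arithmetic only and is not Hivemall's `rescale` source; raising an error when the minimum equals the maximum is a choice made for this sketch.

```python
def rescale(value, min_value, max_value):
    """Min-max normalization: map [min_value, max_value] onto [0.0, 1.0]."""
    if max_value == min_value:
        # Assumption of this sketch: a degenerate range is treated as an error
        raise ValueError("min_value and max_value must differ")
    return (value - min_value) / (max_value - min_value)

# Hypothetical label range, playing the role of ${min_label} and ${max_label}
min_label, max_label = -10.0, 10.0
print(rescale(0.0, min_label, max_label))
```

In the query above, the minimum and maximum are supplied as Hive variables, so they would have to be computed from the training data beforehand.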
How to use Hivemall (workflow step: Training)
How to use Hivemall - Training
Training by logistic regression
CREATE TABLE lr_model AS
SELECT
  feature,
  avg(weight) as weight -- reducers perform model averaging in parallel
FROM (
  SELECT logress(features,label,..) as (feature,weight)
  FROM train
) t -- map-only task to learn a prediction model
GROUP BY feature -- shuffle map-outputs to reducers by feature
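To make the map-only step concrete, here is a minimal sketch of what one training task might do: run stochastic gradient descent for logistic regression over its share of the rows and emit (feature, weight) pairs. This is textbook SGD, not Hivemall's `logress` implementation; the function name, the learning rate `eta`, the 0/1 labels, and the implicit feature value of 1.0 are all assumptions of the sketch.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic_sgd(rows, eta=0.1, epochs=1):
    """Plain SGD for logistic regression over one partition of the data.

    rows: list of (features, label); features is a list of feature ids
    (each with an implicit value of 1.0) and label is 0 or 1.
    Returns a sorted list of (feature, weight) pairs.
    """
    weights = {}
    for _ in range(epochs):
        for features, label in rows:
            z = sum(weights.get(f, 0.0) for f in features)
            gradient = label - sigmoid(z)  # gradient of the log-likelihood
            for f in features:
                weights[f] = weights.get(f, 0.0) + eta * gradient
    return sorted(weights.items())

# Hypothetical partition of the training table
partition = [([1, 2], 1), ([1, 3], 0)]
emitted = train_logistic_sgd(partition)
print(emitted)
```

Each task would emit pairs like these, and the outer query then averages them per feature.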
How to use Hivemall - Training
Training of Confidence Weighted (CW) Classifier
CREATE TABLE news20b_cw_model1 AS
SELECT
  feature,
  voted_avg(weight) as weight -- vote to use negative or positive weights for avg
FROM (
  SELECT
    train_cw(features,label) as (feature,weight)
  FROM
    news20b_train
) t
GROUP BY feature
Example weights for one feature: +0.7, +0.3, +0.2, -0.1, +0.7
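One plausible reading of "vote to use negative or positive weights for avg" is sketched below in Python: count the positive and negative weights, let the majority sign win, and average only the winning side. This is an illustration under that assumption, not Hivemall's `voted_avg` source; ignoring zeros and resolving ties toward the negative side are arbitrary choices of the sketch.

```python
def voted_avg(weights):
    """Average only the weights whose sign wins the majority vote."""
    positives = [w for w in weights if w > 0]
    negatives = [w for w in weights if w < 0]
    if not positives and not negatives:
        return 0.0  # assumption: nothing to vote on
    # Assumption: ties go to the negative side
    winners = positives if len(positives) > len(negatives) else negatives
    return sum(winners) / len(winners)

# The weights shown on the slide: four positive votes, one negative
result = voted_avg([+0.7, +0.3, +0.2, -0.1, +0.7])
print(result)
```

Under this reading the single negative weight is outvoted and dropped, so the result is the mean of the four positive weights.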
How to use Hivemall (workflow step: Prediction)
How to use Hivemall - Prediction
Prediction is done by LEFT OUTER JOIN between test data and prediction model
No need to load the entire model into memory
CREATE TABLE lr_predict
as
SELECT
  t.rowid,
  sigmoid(sum(m.weight)) as prob
FROM
  testing_exploded t LEFT OUTER JOIN
  lr_model m ON (t.feature = m.feature)
GROUP BY
  t.rowid
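The join-then-aggregate logic of this query can be mimicked in plain Python. The sketch assumes `testing_exploded` holds one (rowid, feature) pair per row and that the model is a feature-to-weight mapping; the sample data is hypothetical. Features missing from the model are skipped, as SQL `sum` skips the NULL weights a LEFT OUTER JOIN produces, and a row with no matching feature at all yields `None`.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(testing_exploded, model):
    """Sum the model weights of each row's features, then apply sigmoid.

    testing_exploded: iterable of (rowid, feature) pairs.
    model: dict mapping feature -> weight.
    Returns a dict rowid -> probability (None if no feature matched).
    """
    totals = {}
    for rowid, feature in testing_exploded:
        totals.setdefault(rowid, None)
        weight = model.get(feature)  # None plays the role of SQL NULL
        if weight is not None:
            totals[rowid] = (totals[rowid] or 0.0) + weight
    return {rowid: None if total is None else sigmoid(total)
            for rowid, total in totals.items()}

# Hypothetical model and exploded test rows
lr_model = {1: 2.0, 2: -2.0, 3: 0.5}
testing = [(10, 1), (10, 2), (11, 3), (11, 99), (12, 99)]

lr_predict = predict(testing, lr_model)
print(lr_predict)
```

In Hive the lookup is a distributed join rather than an in-memory dictionary, which is why the model never has to fit in one machine's memory.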
Real-time prediction
Flow: Batch Training on Hadoop -> Export prediction model -> Online Prediction on RDBMS
bit.ly/hivemall-rtp
RandomForest in Hivemall: Ensemble of Decision Trees
Training of RandomForest
Prediction of RandomForest
Agenda: 5. Future roadmap
Future of Hivemall
Hivemall will become Apache Hivemall (?)
Now on voting though..
Apache Incubation status
Initial committers
• Makoto Yui <Treasure Data>
• Takeshi Yamamuro <NTT>
  • Hivemall on Apache Spark
• Daniel Dai <Hortonworks>
  • Hivemall on Apache Pig
  • Apache Pig PMC member
• Tsuyoshi Ozawa <NTT>
  • Apache Hadoop PMC member
• Kai Sasaki <Treasure Data>
Project mentors (Champion and Nominated Mentors)
• Reynold Xin <Databricks, ASF member>: Apache Spark PMC member
• Markus Weimer <Microsoft, ASF member>: Apache REEF PMC member
• Xiangrui Meng <Databricks, ASF member>: Apache Spark PMC member
• Roman Shaposhnik <Pivotal, ASF member>: Apache Bigtop/Incubator PMC member
Roadmap
• Possibly enter Apache Incubator soon
• IP clearance and project/repository site setup
  • Contribution guideline
  • Create a "who uses Hivemall" list
  • More documentation! Sept to Nov
• Initial Apache Release will be Dec (or late Nov?)
Coming New Features - already merged in Master
✓ Hivemall on Spark 2.0 w/ DataFrame support
✓ XGBoost support
  • Please refer to bit.ly/hivemall-xgboost for detail
✓ ChangeFinder
  • Efficient algorithm for finding change points and outliers from time-series data
  • J. Takeuchi and K. Yamanishi, “A Unifying Framework for Detecting Outliers and Change Points from Time Series,” IEEE Transactions on Knowledge and Data Engineering, pp.482-492, 2006.
✓ Various Evaluation Metrics
  • PR #326
Other undergoing new features
• v0.5-beta{1,2} release (Oct-Nov)
✓ One-hot encoding
✓ Field-aware Factorization Machines
✓ Kernelized Passive Aggressive
✓ Generalized Linear Model
✓ Optimizer framework including ADAM
✓ L1/L2 regularization
✓ Gradient Tree Boosting
✓ Online LDA
Conclusion and Takeaway
Hivemall provides a collection of machine learning algorithms as Hive UDFs/UDTFs
Hivemall's Positioning
• For SQL users that need ML
• For those already using Hive
• Ease-of-use and scalability in mind
Does not require coding, packaging, compiling, or introducing a new programming language or APIs.
We welcome your contributions to Apache Hivemall :)
Any	feature	request	or	questions?
#hivemall

More Related Content

What's hot

Writing Apache Spark and Apache Flink Applications Using Apache Bahir
Writing Apache Spark and Apache Flink Applications Using Apache BahirWriting Apache Spark and Apache Flink Applications Using Apache Bahir
Writing Apache Spark and Apache Flink Applications Using Apache Bahir
Luciano Resende
 
Fast, Scalable Graph Processing: Apache Giraph on YARN
Fast, Scalable Graph Processing: Apache Giraph on YARNFast, Scalable Graph Processing: Apache Giraph on YARN
Fast, Scalable Graph Processing: Apache Giraph on YARN
DataWorks Summit
 
The Apache Way - Building Open Source Community in China - Luke Han
The Apache Way - Building Open Source Community in China - Luke HanThe Apache Way - Building Open Source Community in China - Luke Han
The Apache Way - Building Open Source Community in China - Luke Han
Luke Han
 
The Evolution of Apache Kylin by Luke Han
The Evolution of Apache Kylin by Luke HanThe Evolution of Apache Kylin by Luke Han
The Evolution of Apache Kylin by Luke Han
Luke Han
 
Extending Pandas using Apache Arrow and Numba
Extending Pandas using Apache Arrow and NumbaExtending Pandas using Apache Arrow and Numba
Extending Pandas using Apache Arrow and Numba
Uwe Korn
 
SystemML - Declarative Machine Learning
SystemML - Declarative Machine LearningSystemML - Declarative Machine Learning
SystemML - Declarative Machine Learning
Luciano Resende
 
Spark Summit EU talk by Oscar Castaneda
Spark Summit EU talk by Oscar CastanedaSpark Summit EU talk by Oscar Castaneda
Spark Summit EU talk by Oscar Castaneda
Spark Summit
 
Hadoop Graph Processing with Apache Giraph
Hadoop Graph Processing with Apache GiraphHadoop Graph Processing with Apache Giraph
Hadoop Graph Processing with Apache Giraph
DataWorks Summit
 
Data Science Languages and Industry Analytics
Data Science Languages and Industry AnalyticsData Science Languages and Industry Analytics
Data Science Languages and Industry Analytics
Wes McKinney
 
SparkR + Zeppelin
SparkR + ZeppelinSparkR + Zeppelin
SparkR + Zeppelin
felixcss
 
1. Apache Kylin Deep Dive - Streaming and Plugin Architecture - Apache Kylin ...
1. Apache Kylin Deep Dive - Streaming and Plugin Architecture - Apache Kylin ...1. Apache Kylin Deep Dive - Streaming and Plugin Architecture - Apache Kylin ...
1. Apache Kylin Deep Dive - Streaming and Plugin Architecture - Apache Kylin ...
Luke Han
 
Hw09 Hadoop Applications At Yahoo!
Hw09   Hadoop Applications At Yahoo!Hw09   Hadoop Applications At Yahoo!
Hw09 Hadoop Applications At Yahoo!Cloudera, Inc.
 
PyCon Singapore 2013 Keynote
PyCon Singapore 2013 KeynotePyCon Singapore 2013 Keynote
PyCon Singapore 2013 Keynote
Wes McKinney
 
Luciano Resende's keynote at Apache big data conference
Luciano Resende's keynote at Apache big data conferenceLuciano Resende's keynote at Apache big data conference
Luciano Resende's keynote at Apache big data conference
Luciano Resende
 
Improving data interoperability in Python and R
Improving data interoperability in Python and RImproving data interoperability in Python and R
Improving data interoperability in Python and R
Wes McKinney
 
Apache Spark & MLlib
Apache Spark & MLlibApache Spark & MLlib
Apache Spark & MLlib
Grigory Sapunov
 
Adding Spark support to Kylin at Bay Area Spark Meetup
Adding Spark support to Kylin at Bay Area Spark MeetupAdding Spark support to Kylin at Bay Area Spark Meetup
Adding Spark support to Kylin at Bay Area Spark Meetup
Luke Han
 
Apache Arrow: Cross-language Development Platform for In-memory Data
Apache Arrow: Cross-language Development Platform for In-memory DataApache Arrow: Cross-language Development Platform for In-memory Data
Apache Arrow: Cross-language Development Platform for In-memory Data
Wes McKinney
 
Apache Kylin Introduction
Apache Kylin IntroductionApache Kylin Introduction
Apache Kylin Introduction
Luke Han
 

What's hot (20)

Writing Apache Spark and Apache Flink Applications Using Apache Bahir
Writing Apache Spark and Apache Flink Applications Using Apache BahirWriting Apache Spark and Apache Flink Applications Using Apache Bahir
Writing Apache Spark and Apache Flink Applications Using Apache Bahir
 
Fast, Scalable Graph Processing: Apache Giraph on YARN
Fast, Scalable Graph Processing: Apache Giraph on YARNFast, Scalable Graph Processing: Apache Giraph on YARN
Fast, Scalable Graph Processing: Apache Giraph on YARN
 
The Apache Way - Building Open Source Community in China - Luke Han
The Apache Way - Building Open Source Community in China - Luke HanThe Apache Way - Building Open Source Community in China - Luke Han
The Apache Way - Building Open Source Community in China - Luke Han
 
The Evolution of Apache Kylin by Luke Han
The Evolution of Apache Kylin by Luke HanThe Evolution of Apache Kylin by Luke Han
The Evolution of Apache Kylin by Luke Han
 
Extending Pandas using Apache Arrow and Numba
Extending Pandas using Apache Arrow and NumbaExtending Pandas using Apache Arrow and Numba
Extending Pandas using Apache Arrow and Numba
 
MahoutNew
MahoutNewMahoutNew
MahoutNew
 
SystemML - Declarative Machine Learning
SystemML - Declarative Machine LearningSystemML - Declarative Machine Learning
SystemML - Declarative Machine Learning
 
Spark Summit EU talk by Oscar Castaneda
Spark Summit EU talk by Oscar CastanedaSpark Summit EU talk by Oscar Castaneda
Spark Summit EU talk by Oscar Castaneda
 
Hadoop Graph Processing with Apache Giraph
Hadoop Graph Processing with Apache GiraphHadoop Graph Processing with Apache Giraph
Hadoop Graph Processing with Apache Giraph
 
Data Science Languages and Industry Analytics
Data Science Languages and Industry AnalyticsData Science Languages and Industry Analytics
Data Science Languages and Industry Analytics
 
SparkR + Zeppelin
SparkR + ZeppelinSparkR + Zeppelin
SparkR + Zeppelin
 
1. Apache Kylin Deep Dive - Streaming and Plugin Architecture - Apache Kylin ...
1. Apache Kylin Deep Dive - Streaming and Plugin Architecture - Apache Kylin ...1. Apache Kylin Deep Dive - Streaming and Plugin Architecture - Apache Kylin ...
1. Apache Kylin Deep Dive - Streaming and Plugin Architecture - Apache Kylin ...
 
Hw09 Hadoop Applications At Yahoo!
Hw09   Hadoop Applications At Yahoo!Hw09   Hadoop Applications At Yahoo!
Hw09 Hadoop Applications At Yahoo!
 
PyCon Singapore 2013 Keynote
PyCon Singapore 2013 KeynotePyCon Singapore 2013 Keynote
PyCon Singapore 2013 Keynote
 
Luciano Resende's keynote at Apache big data conference
Luciano Resende's keynote at Apache big data conferenceLuciano Resende's keynote at Apache big data conference
Luciano Resende's keynote at Apache big data conference
 
Improving data interoperability in Python and R
Improving data interoperability in Python and RImproving data interoperability in Python and R
Improving data interoperability in Python and R
 
Apache Spark & MLlib
Apache Spark & MLlibApache Spark & MLlib
Apache Spark & MLlib
 
Adding Spark support to Kylin at Bay Area Spark Meetup
Adding Spark support to Kylin at Bay Area Spark MeetupAdding Spark support to Kylin at Bay Area Spark Meetup
Adding Spark support to Kylin at Bay Area Spark Meetup
 
Apache Arrow: Cross-language Development Platform for In-memory Data
Apache Arrow: Cross-language Development Platform for In-memory DataApache Arrow: Cross-language Development Platform for In-memory Data
Apache Arrow: Cross-language Development Platform for In-memory Data
 
Apache Kylin Introduction
Apache Kylin IntroductionApache Kylin Introduction
Apache Kylin Introduction
 

Viewers also liked

Log Event Stream Processing In Flink Way
Log Event Stream Processing In Flink WayLog Event Stream Processing In Flink Way
Log Event Stream Processing In Flink Way
George T. C. Lai
 
機械学習のデータ並列処理@第7回BDI研究会
機械学習のデータ並列処理@第7回BDI研究会機械学習のデータ並列処理@第7回BDI研究会
機械学習のデータ並列処理@第7回BDI研究会
Makoto Yui
 
Stream Processing with Apache Flink
Stream Processing with Apache FlinkStream Processing with Apache Flink
Stream Processing with Apache Flink
C4Media
 
Hivemall v0.3の機能紹介@1st Hivemall meetup
Hivemall v0.3の機能紹介@1st Hivemall meetupHivemall v0.3の機能紹介@1st Hivemall meetup
Hivemall v0.3の機能紹介@1st Hivemall meetupMakoto Yui
 
Dots20161029 myui
Dots20161029 myuiDots20161029 myui
Dots20161029 myui
Makoto Yui
 
Hivemall meetup vol2 oisix
Hivemall meetup vol2 oisixHivemall meetup vol2 oisix
Hivemall meetup vol2 oisix
Taisuke Fukawa
 
Hadoop con 2016_9_10_王經篤(Jing-Doo Wang)
Hadoop con 2016_9_10_王經篤(Jing-Doo Wang)Hadoop con 2016_9_10_王經篤(Jing-Doo Wang)
Hadoop con 2016_9_10_王經篤(Jing-Doo Wang)
Jing-Doo Wang
 
Yarn Resource Management Using Machine Learning
Yarn Resource Management Using Machine LearningYarn Resource Management Using Machine Learning
Yarn Resource Management Using Machine Learning
ojavajava
 
How to plan a hadoop cluster for testing and production environment
How to plan a hadoop cluster for testing and production environmentHow to plan a hadoop cluster for testing and production environment
How to plan a hadoop cluster for testing and production environment
Anna Yen
 
Apache Flink & Graph Processing
Apache Flink & Graph ProcessingApache Flink & Graph Processing
Apache Flink & Graph Processing
Vasia Kalavri
 
Hivemallで始める不動産価格推定サービス
Hivemallで始める不動産価格推定サービスHivemallで始める不動産価格推定サービス
Hivemallで始める不動産価格推定サービス
Kentaro Yoshida
 
2016-07-12 Introduction to Big Data Platform Security
2016-07-12 Introduction to Big Data Platform Security2016-07-12 Introduction to Big Data Platform Security
2016-07-12 Introduction to Big Data Platform Security
Jazz Yao-Tsung Wang
 
Apache Flink Training Workshop @ HadoopCon2016 - #2 DataSet API Hands-On
Apache Flink Training Workshop @ HadoopCon2016 - #2 DataSet API Hands-OnApache Flink Training Workshop @ HadoopCon2016 - #2 DataSet API Hands-On
Apache Flink Training Workshop @ HadoopCon2016 - #2 DataSet API Hands-On
Apache Flink Taiwan User Group
 
Batch and Stream Graph Processing with Apache Flink
Batch and Stream Graph Processing with Apache FlinkBatch and Stream Graph Processing with Apache Flink
Batch and Stream Graph Processing with Apache Flink
Vasia Kalavri
 
2016 Hadoop Conf TW - 如何建置數據精靈
2016 Hadoop Conf TW - 如何建置數據精靈2016 Hadoop Conf TW - 如何建置數據精靈
2016 Hadoop Conf TW - 如何建置數據精靈
晨揚 施
 
Apache Software Foundation: How To Contribute, with Apache Flink as Example (...
Apache Software Foundation: How To Contribute, with Apache Flink as Example (...Apache Software Foundation: How To Contribute, with Apache Flink as Example (...
Apache Software Foundation: How To Contribute, with Apache Flink as Example (...
Apache Flink Taiwan User Group
 
HadoopCon 2016 - 用 Jupyter Notebook Hold 住一個上線 Spark Machine Learning 專案實戰
HadoopCon 2016  - 用 Jupyter Notebook Hold 住一個上線 Spark  Machine Learning 專案實戰HadoopCon 2016  - 用 Jupyter Notebook Hold 住一個上線 Spark  Machine Learning 專案實戰
HadoopCon 2016 - 用 Jupyter Notebook Hold 住一個上線 Spark Machine Learning 專案實戰
Wayne Chen
 
Streaming in the Wild with Apache Flink
Streaming in the Wild with Apache FlinkStreaming in the Wild with Apache Flink
Streaming in the Wild with Apache Flink
DataWorks Summit/Hadoop Summit
 
BI in Xuenn
BI in XuennBI in Xuenn
BI in Xuenn
Len Chang
 
Achieve big data analytic platform with lambda architecture on cloud
Achieve big data analytic platform with lambda architecture on cloudAchieve big data analytic platform with lambda architecture on cloud
Achieve big data analytic platform with lambda architecture on cloud
Scott Miao
 

Viewers also liked (20)

Log Event Stream Processing In Flink Way
Log Event Stream Processing In Flink WayLog Event Stream Processing In Flink Way
Log Event Stream Processing In Flink Way
 
機械学習のデータ並列処理@第7回BDI研究会
機械学習のデータ並列処理@第7回BDI研究会機械学習のデータ並列処理@第7回BDI研究会
機械学習のデータ並列処理@第7回BDI研究会
 
Stream Processing with Apache Flink
Stream Processing with Apache FlinkStream Processing with Apache Flink
Stream Processing with Apache Flink
 
Hivemall v0.3の機能紹介@1st Hivemall meetup
Hivemall v0.3の機能紹介@1st Hivemall meetupHivemall v0.3の機能紹介@1st Hivemall meetup
Hivemall v0.3の機能紹介@1st Hivemall meetup
 
Dots20161029 myui
Dots20161029 myuiDots20161029 myui
Dots20161029 myui
 
Hivemall meetup vol2 oisix
Hivemall meetup vol2 oisixHivemall meetup vol2 oisix
Hivemall meetup vol2 oisix
 
Hadoop con 2016_9_10_王經篤(Jing-Doo Wang)
Hadoop con 2016_9_10_王經篤(Jing-Doo Wang)Hadoop con 2016_9_10_王經篤(Jing-Doo Wang)
Hadoop con 2016_9_10_王經篤(Jing-Doo Wang)
 
Yarn Resource Management Using Machine Learning
Yarn Resource Management Using Machine LearningYarn Resource Management Using Machine Learning
Yarn Resource Management Using Machine Learning
 
How to plan a hadoop cluster for testing and production environment
How to plan a hadoop cluster for testing and production environmentHow to plan a hadoop cluster for testing and production environment
How to plan a hadoop cluster for testing and production environment
 
Apache Flink & Graph Processing
Apache Flink & Graph ProcessingApache Flink & Graph Processing
Apache Flink & Graph Processing
 
Hivemallで始める不動産価格推定サービス
Hivemallで始める不動産価格推定サービスHivemallで始める不動産価格推定サービス
Hivemallで始める不動産価格推定サービス
 
2016-07-12 Introduction to Big Data Platform Security
2016-07-12 Introduction to Big Data Platform Security2016-07-12 Introduction to Big Data Platform Security
2016-07-12 Introduction to Big Data Platform Security
 
Apache Flink Training Workshop @ HadoopCon2016 - #2 DataSet API Hands-On
Apache Flink Training Workshop @ HadoopCon2016 - #2 DataSet API Hands-OnApache Flink Training Workshop @ HadoopCon2016 - #2 DataSet API Hands-On
Apache Flink Training Workshop @ HadoopCon2016 - #2 DataSet API Hands-On
 
Batch and Stream Graph Processing with Apache Flink
Batch and Stream Graph Processing with Apache FlinkBatch and Stream Graph Processing with Apache Flink
Batch and Stream Graph Processing with Apache Flink
 
2016 Hadoop Conf TW - 如何建置數據精靈
2016 Hadoop Conf TW - 如何建置數據精靈2016 Hadoop Conf TW - 如何建置數據精靈
2016 Hadoop Conf TW - 如何建置數據精靈
 
Apache Software Foundation: How To Contribute, with Apache Flink as Example (...
Apache Software Foundation: How To Contribute, with Apache Flink as Example (...Apache Software Foundation: How To Contribute, with Apache Flink as Example (...
Apache Software Foundation: How To Contribute, with Apache Flink as Example (...
 
HadoopCon 2016 - 用 Jupyter Notebook Hold 住一個上線 Spark Machine Learning 專案實戰
HadoopCon 2016  - 用 Jupyter Notebook Hold 住一個上線 Spark  Machine Learning 專案實戰HadoopCon 2016  - 用 Jupyter Notebook Hold 住一個上線 Spark  Machine Learning 專案實戰
HadoopCon 2016 - 用 Jupyter Notebook Hold 住一個上線 Spark Machine Learning 專案實戰
 
Streaming in the Wild with Apache Flink
Streaming in the Wild with Apache FlinkStreaming in the Wild with Apache Flink
Streaming in the Wild with Apache Flink
 
BI in Xuenn
BI in XuennBI in Xuenn
BI in Xuenn
 
Achieve big data analytic platform with lambda architecture on cloud
Achieve big data analytic platform with lambda architecture on cloudAchieve big data analytic platform with lambda architecture on cloud
Achieve big data analytic platform with lambda architecture on cloud
 

HadoopCon'16, Taipei @myui