Personal Information
Organization / Workplace
San Francisco Bay Area United States
Occupation
Senior Research Engineer at Netflix
Industry
Education
Website
www.dbtsai.com
About
Big Data Machine Learning Engineer with a strong background in computer science, theoretical physics, and mathematics. I have a deep understanding of how to implement data mining algorithms in a scalable way, rather than just using them as a consumer.
I'm a big fan of Scala and have been using it to develop scalable, distributed data mining algorithms with Apache Spark. I have been involved in open source Apache Spark development as a contributor. Apache Spark is a fast and general engine for large-scale data processing, and it fits into the Hadoop open-source ecosystem.
Specialties:
• Machine Learning and Data Mining.
• Distributed/Parallel Computing and Big Data Processing.
• Expert in Apache Hadoop
Tags
machine learning
spark
mapreduce
hadoop
mllib
alpine data labs
big data
logistic regression
netflix
data mining
apache spark
multinomial
l-bfgs
recommendation
pipeline
kernel methods
linear models
polynomial mapping
feature engineering
linear regression
ml
spark summit
elastic-net
batch layer
serving layer
speed layer
spark streaming
pig
lambda architecture
real time
storm
stream
large scale
iot
internet of things
svd
k-means
unsupervised learning
Presentations
(9)
Likes
(4)
Distributed Time Travel for Feature Generation at Netflix
sfbiganalytics • 7 years ago
Introducing Windowing Functions (pgCon 2009)
PostgreSQL Experts, Inc. • 10 years ago
Multinomial Logistic Regression with Apache Spark
DB Tsai • 9 years ago