Building a Unified Data Pipeline in Spark
Aaron Davidson
Slides adapted from Matei Zaharia
spark.apache.org
What is Apache Spark? 
Fast and general cluster computing system, interoperable with Hadoop
Improves efficiency through: ...
Project History 
Started at UC Berkeley in 2009, open sourced in 2010
50+ companies now contributing 
»Databricks, Yahoo...