
- 1. Machine Learning with Apache Hama. Tommaso Teofili, tommaso [at] apache [dot] org
- 2. About me: ASF member having fun with Lucene / Solr, Hama, UIMA, Stanbol, … some others. SW engineer @ Adobe R&D.
- 3. Agenda: Apache Hama and BSP; why machine learning on BSP; some examples; benchmarks.
- 4. Apache Hama: a Bulk Synchronous Parallel computing framework on top of HDFS for massive scientific computations. TLP since May 2012; 0.6.0 release out soon; growing community.
- 5. BSP supersteps: a BSP algorithm is composed of a sequence of "supersteps".
- 6. BSP supersteps: in each superstep (1, 2, …, N), each task does some computation, communicates with other tasks, and then synchronizes.
- 7. Why BSP: simple programming model; the superstep semantics are easy; preserves data locality, which improves performance; well suited for iterative algorithms.
- 8. Apache Hama architecture: BSP program execution flow (diagram).
- 9. Apache Hama architecture (diagram).
- 10. Apache Hama features: BSP API; M/R-like I/O API; Graph API; job management / monitoring; checkpoint recovery; local & (pseudo-)distributed run modes; pluggable message transfer architecture; YARN supported; running in Apache Whirr.
- 11. Apache Hama BSP API: public abstract class BSP<K1, V1, K2, V2, M extends Writable>. K1, V1 are the key/value types for inputs; K2, V2 are the key/value types for outputs; M is the type of the messages used for task communication.
- 12. Apache Hama BSP API: public void bsp(BSPPeer<K1, V1, K2, V2, M> peer) throws ..; public void setup(BSPPeer<K1, V1, K2, V2, M> peer) throws ..; public void cleanup(BSPPeer<K1, V1, K2, V2, M> peer) throws ..
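As a minimal sketch of how the API above maps onto the superstep pattern (written against the Hama 0.x BSP API; the class name and message payload are made up for illustration):

```java
import java.io.IOException;

import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hama.bsp.BSP;
import org.apache.hama.bsp.BSPPeer;
import org.apache.hama.bsp.sync.SyncException;

// One superstep: compute, communicate, synchronize, then consume messages.
public class HelloBSP extends BSP<NullWritable, NullWritable, Text, Text, Text> {

  @Override
  public void bsp(BSPPeer<NullWritable, NullWritable, Text, Text, Text> peer)
      throws IOException, SyncException, InterruptedException {
    // computation phase: prepare a message
    Text msg = new Text("hello from " + peer.getPeerName());
    // communication phase: send it to every peer
    for (String other : peer.getAllPeerNames()) {
      peer.send(other, msg);
    }
    // synchronization: the barrier ends the superstep
    peer.sync();
    // next superstep: messages sent before the barrier are now available
    Text received;
    while ((received = peer.getCurrentMessage()) != null) {
      peer.write(new Text(peer.getPeerName()), received);
    }
  }
}
```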
- 13. Machine learning on BSP: lots (most?) of ML algorithms are inherently iterative. The Hama ML module currently includes collaborative filtering, clustering, and gradient descent.
- 14. Benchmarking architecture (diagram: a cluster of nodes running Hama and Mahout over HDFS, plus Solr / Lucene / a DBMS).
- 15. Collaborative filtering: given user preferences on movies, we want to find users "near" to some specific user, so that that user can "follow" them and/or see what they like (which he/she could like too).
- 16. Collaborative filtering BSP: given a specific user, iteratively (for each task):
  Superstep 1*i: read a new user preference row; find how near that user is to the current user, i.e. how near their preferences are. Since preferences are given as vectors, we may use vector distance measures such as Euclidean or cosine distance. Broadcast the measured distance to the other peers.
  Superstep 2*i: aggregate the distance outputs; update the most relevant users.
  Still to be committed (HAMA-612).
- 17. Collaborative filtering BSP: given user ratings about movies:
  "john" -> 0, 0, 0, 9.5, 4.5, 9.5, 8
  "paula" -> 7, 3, 8, 2, 8.5, 0, 0
  "jim" -> 4, 5, 0, 5, 8, 0, 1.5
  "tom" -> 9, 4, 9, 1, 5, 0, 8
  "timothy" -> 7, 3, 5.5, 0, 9.5, 6.5, 0
  We ask for the 2 nearest users to "paula" and get "timothy" and "tom" (user recommendation). We can then extract highly rated movies of "timothy" and "tom" that "paula" didn't see (item recommendation).
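The slide's nearest-user query can be reproduced with the dataset above in plain, single-process Java (in the BSP version each task would measure one row and broadcast the distance; here everything runs locally, and Euclidean distance is chosen as one of the measures the slides mention):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class NearestUsers {

  // Euclidean distance between two equally sized rating vectors
  static double euclidean(double[] a, double[] b) {
    double sum = 0;
    for (int i = 0; i < a.length; i++) {
      double d = a[i] - b[i];
      sum += d * d;
    }
    return Math.sqrt(sum);
  }

  // The k users whose rating vectors are closest to the given user's vector
  static List<String> nearest(Map<String, double[]> ratings, String user, int k) {
    double[] target = ratings.get(user);
    List<String> others = new ArrayList<>();
    for (String name : ratings.keySet()) {
      if (!name.equals(user)) others.add(name);
    }
    others.sort(Comparator.comparingDouble(n -> euclidean(ratings.get(n), target)));
    return others.subList(0, k);
  }

  public static void main(String[] args) {
    Map<String, double[]> ratings = new HashMap<>();
    ratings.put("john", new double[] {0, 0, 0, 9.5, 4.5, 9.5, 8});
    ratings.put("paula", new double[] {7, 3, 8, 2, 8.5, 0, 0});
    ratings.put("jim", new double[] {4, 5, 0, 5, 8, 0, 1.5});
    ratings.put("tom", new double[] {9, 4, 9, 1, 5, 0, 8});
    ratings.put("timothy", new double[] {7, 3, 5.5, 0, 9.5, 6.5, 0});
    System.out.println(nearest(ratings, "paula", 2)); // [timothy, tom]
  }
}
```

With Euclidean distance, "timothy" (~7.31) and "tom" (~9.12) are indeed paula's two nearest users, matching the slide.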
- 18. Benchmarks: a fairly simple, highly iterative algorithm. Compared to Apache Mahout, it behaves better than ALS-WR and similarly to RecommenderJob and ItemSimilarityJob.
- 19. K-Means clustering: we have a bunch of data (e.g. documents) and want to group those docs into k homogeneous clusters. Iteratively, for each cluster: calculate the new cluster center; add the docs nearest to the new center to the cluster.
- 20. K-Means clustering (illustration).
- 21. K-Means clustering BSP: iteratively:
  Superstep 1*i (assignment phase): read the vector splits; sum up temporary centers with the assigned vectors; broadcast the sum and the count of ingested vectors.
  Superstep 2*i (update phase): calculate the total sum and average over all received messages; replace the old centers with the new centers and check for convergence.
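One assignment + update iteration can be sketched in self-contained Java (1-D points for brevity; in the BSP version the per-split sums and counts computed in the assignment phase would be broadcast and aggregated across tasks, here they are just local arrays):

```java
import java.util.Arrays;

public class KMeansStep {

  // One k-means iteration: assign each point to its nearest center (assignment
  // phase), then move each center to the average of its assigned points (update
  // phase). Returns the updated centers.
  static double[] step(double[] points, double[] centers) {
    double[] sums = new double[centers.length];
    int[] counts = new int[centers.length];
    // assignment phase: accumulate each point into its nearest center's sum
    for (double p : points) {
      int best = 0;
      for (int c = 1; c < centers.length; c++) {
        if (Math.abs(p - centers[c]) < Math.abs(p - centers[best])) best = c;
      }
      sums[best] += p;
      counts[best]++;
    }
    // update phase: new center = average of assigned points (keep empty clusters)
    double[] updated = new double[centers.length];
    for (int c = 0; c < centers.length; c++) {
      updated[c] = counts[c] > 0 ? sums[c] / counts[c] : centers[c];
    }
    return updated;
  }

  public static void main(String[] args) {
    double[] points = {0, 1, 9, 10};
    double[] centers = {0, 10};
    System.out.println(Arrays.toString(step(points, centers))); // [0.5, 9.5]
  }
}
```

Convergence is then checked by comparing the old and new centers, exactly as the update phase above describes.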
- 22. Benchmarks: a one-rack (16 nodes, 256 cores) cluster on a 10G network; on average faster than Mahout's implementation.
- 23. Gradient descent: an optimization algorithm for finding a (local) minimum of some function. Used for solving linear systems, solving non-linear systems, and in machine learning tasks: linear regression, logistic regression, neural network backpropagation, …
- 24. Gradient descent: minimize a given (cost) function. Give the function a starting point (a set of parameters), iteratively change the parameters in order to minimize the function, and stop at the (local) minimum. There's some math, but intuitively: evaluate the derivatives at a given point in order to choose where to "go" next.
- 25. Gradient descent BSP: iteratively:
  Superstep 1*i: each task calculates and broadcasts its portion of the cost function with the current parameters.
  Superstep 2*i: aggregate and update the cost function; check the aggregated cost and the iteration count (the cost should always decrease).
  Superstep 3*i: each task calculates and broadcasts its portions of the (partial) derivatives.
  Superstep 4*i: aggregate the portions and update the parameters.
- 26. Gradient descent BSP: a simplistic example: linear regression. Given a real estate market dataset, estimate new houses' prices from known houses' sizes, geographic regions and prices. Expected output: the actual parameters of the (linear) prediction function.
- 27. Gradient descent BSP: generate a different model for each region. House item vectors map price -> size (e.g. 150k -> 80), so it's a 2-dimensional space; ~1.3M vectors in the dataset.
- 28. Gradient descent BSP: dataset and model fit (plot).
- 29. Gradient descent BSP: cost checking (plot).
- 30. Gradient descent BSP: classification: logistic regression with gradient descent on the real estate market dataset. We want to find which estate listings belong to agencies, to avoid buying from them. Same algorithm, with a different cost function and features: existing items are tagged or not as "belonging to agency", and vectors are created from the items' text. Sample vector: 1 -> 1 3 0 0 5 3 4 1
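The "different cost function" is the key change from linear regression: the hypothesis is squashed through a sigmoid and the cost becomes cross-entropy. A self-contained sketch (the theta values and the two-feature listing vector are invented for illustration):

```java
public class LogisticCost {

  // Logistic (sigmoid) function, mapping any real z into (0, 1)
  static double sigmoid(double z) {
    return 1.0 / (1.0 + Math.exp(-z));
  }

  // Hypothesis: probability that feature vector x belongs to class 1
  // ("belonging to agency"), given parameters theta
  static double hypothesis(double[] theta, double[] x) {
    double z = 0;
    for (int i = 0; i < theta.length; i++) z += theta[i] * x[i];
    return sigmoid(z);
  }

  // Cross-entropy cost for one labeled example (y is 0 or 1); this is what
  // replaces the squared-error cost of linear regression in gradient descent
  static double cost(double[] theta, double[] x, double y) {
    double h = hypothesis(theta, x);
    return -y * Math.log(h) - (1 - y) * Math.log(1 - h);
  }

  public static void main(String[] args) {
    double[] theta = {0.5, -0.25};
    double[] listing = {1, 3}; // bias term + one text feature count
    System.out.println(cost(theta, listing, 1.0));
  }
}
```

Gradient descent then runs exactly as on slide 25, just with this cost and its partial derivatives computed per task.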
- 31. Gradient descent BSP: classification (plot).
- 32. Benchmarks: not directly comparable to Mahout's regression algorithms, since both SGD and CGD are inherently better than plain GD; still, Hama GD had on average the same performance as Mahout's SGD / CGD. Next step: implementing SGD / CGD on top of Hama.
- 33. Wrap up: even if the ML module is still "young" / work in progress, and tools like Apache Mahout have better "coverage", Apache Hama can be particularly useful in certain "highly iterative" use cases. Interesting benchmarks.
- 34. Thanks!
