Apache Mahout Architecture Overview

Mahout is an open source machine learning Java library from the Apache Software Foundation. Because it is written in Java it is platform independent, and it provides a fertile framework and a collection of patterns and ready-made components for testing and deploying new large-scale algorithms.
These slides aim to provide a deeper understanding of its architecture.

Published in: Software

  1. APACHE MAHOUT: Open source Machine Learning Java library. Dalla Palma Stefano. Part I: Overview
  2. Search, Text mining, Information Retrieval
  3. Search, Text mining, Information Retrieval
  4. Search, Text mining, Information Retrieval, Collaborative filtering
  5. Search, Text mining, Information Retrieval, Collaborative filtering
  6. Mahout Machine Learning: main techniques and architecture overview
  7. Recommender: Data Store User Preference Item Preference Item Recommender Neighborhood Correlation Preference Inferrer Data Model User User
  8. Classification (Naïve Bayes): Training examples Training algorithm Model New examples Decisions Predictors and target variables Classification System Copy Estimated target variable Predicted variables only Model
  9. Clustering
  10. Clustering
  11. Clustering (k-means): Map output = <centroid_id, data_point> S = Shuffle and Sorting M1 Split 1 Split 2 Split n-1 Split n … Input M2 Mn R1 R2 Rk S New Centroids file on Distr. Cache Centroids file on Distr. Cache Map phase Reduce phase
  12. Samsara, Apache Flink
  13. Flink
  14. APACHE MAHOUT: Open source Machine Learning Java library. Part II: Architecture
  15. APACHE MAHOUT: Open source Machine Learning Java library. Part III: Conclusions
  16. Most stable components
  17. Most unstable components
  18. Questions? Dalla Palma Stefano. APACHE MAHOUT
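
The k-means slide (slide 11) describes a map phase that emits <centroid_id, data_point> pairs, a shuffle-and-sort step, and a reduce phase that writes a new centroids file. The sketch below simulates one such iteration in a single Python process to make that data flow concrete. It is not Mahout's API: the function names (`kmeans_map`, `shuffle`, `kmeans_reduce`, `kmeans_iteration`) are hypothetical, the distributed cache is stood in for by a plain list, and Euclidean distance is an assumption.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, Python 3.8+


def kmeans_map(point, centroids):
    """Map step: emit (centroid_id, data_point) for the nearest centroid."""
    centroid_id = min(range(len(centroids)),
                      key=lambda i: dist(point, centroids[i]))
    return centroid_id, point


def shuffle(pairs):
    """Shuffle and sort: group the mapped points by centroid id."""
    groups = defaultdict(list)
    for centroid_id, point in pairs:
        groups[centroid_id].append(point)
    return dict(sorted(groups.items()))


def kmeans_reduce(points):
    """Reduce step: the new centroid is the coordinate-wise mean of a group."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans_iteration(points, centroids):
    """One map/shuffle/reduce pass; returns the new list of centroids."""
    groups = shuffle(kmeans_map(p, centroids) for p in points)
    # Assumption: a centroid with no assigned points keeps its old position.
    return [kmeans_reduce(groups[i]) if i in groups else centroids[i]
            for i in range(len(centroids))]


# Toy data: two obvious groups and two deliberately poor starting centroids.
points = [(1.0, 1.0), (1.0, 2.0), (8.0, 8.0), (9.0, 8.0)]
centroids = [(0.0, 0.0), (10.0, 10.0)]
new_centroids = kmeans_iteration(points, centroids)
print(new_centroids)
```

In a distributed run, each mapper would read its own input split together with the current centroids, and the driver would repeat the pass until the centroids stop moving; here the loop over iterations is left out to keep the sketch to one pass.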