The GraphLab model is defined in four parts. The Data Graph is used to express sparse data dependencies in your computation, and the Shared Data Table is used to express global data as well as global computation. In addition, we have the scheduler, which determines the order of computation, and the scope system, which provides thread safety and consistency.
MapReduce is defined in 2 parts: a Map stage and a Reduce stage. The Map stage represents embarrassingly parallel computation. That is, each computation is independent and can be performed on a different machine without any communication.
For instance, we could use MapReduce to perform feature extraction on a large number of pictures. For instance, .. To compute an attractiveness score.
The Reduce stage is essentially a “fold”, or aggregation operation, over the results of the Map stage. It can be used, for instance, to compile summary statistics.
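The two stages described above can be sketched in a few lines of plain Python. This is only an illustration on one machine, not Hadoop code: `attractiveness_score` is a hypothetical stand-in for real feature extraction (here simply the mean pixel value), and the toy `images` data is made up for the example.

```python
from functools import reduce

def attractiveness_score(pixels):
    # Hypothetical feature extractor: mean pixel value of one image.
    return sum(pixels) / len(pixels)

def map_stage(images):
    # Map: every image is scored independently, so each call could run
    # on a different machine without any communication.
    return [attractiveness_score(img) for img in images]

def reduce_stage(scores):
    # Reduce: fold the per-image scores into summary statistics.
    def fold(acc, score):
        count, total, best = acc
        return (count + 1, total + score, max(best, score))
    count, total, best = reduce(fold, scores, (0, 0.0, float("-inf")))
    return {"count": count, "mean": total / count if count else 0.0, "max": best}

images = [[10, 20, 30], [40, 50, 60], [5, 5, 5]]  # toy "pictures"
scores = map_stage(images)
summary = reduce_stage(scores)
print(scores, summary)
```

Because the fold only needs a running `(count, total, max)` triple, partial results from several machines could be combined in the same way.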
Outline
• Introduction
• Objectives
• Scope
• Problems with existing system
• Purpose of new system
• Proposed architecture
• Technologies to be used
• Modules of system
• Integration of technologies
• Implementation issues to be solved
• Application
• Future enhancement
Objectives
• Information filtering system
• Recommendation engine that recommends: user based, item based, Slope One based
• Runs in a cloud environment
Introduction
• Recommendation engine: gives suggestions for movies, songs, videos, websites, books, images, and also social elements.
• Applicable to e-business.
• Useful for both customers and online retailers.
• Recommendation engines are used at Amazon, YouTube, Facebook, and Twitter.
Scope
• Our system will provide the recommendation service only.
• Recommendations will be generated based on the user's historical activity, such as purchase patterns, ratings, and likes.
• Recommendations will either be stored in a database or file, or be returned directly to the retailer's web application.
Problems with Existing System
• Takes more time to generate recommendations
• No real-time recommendations for large data
Purpose of New System
• Less time for generating recommendations
• Applicable to big data
• Recommendations by several algorithms: user based, item based, Slope One based, association rule mining
• Evaluation of recommendations
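Assuming the slope-based entry in the list above refers to the Slope One scheme (an assumption; the slides do not define it), a minimal weighted Slope One predictor might look like the sketch below. The `ratings` dictionary and the function name `slope_one_predict` are hypothetical and not taken from Mahout.

```python
# Toy ratings: user -> {item: rating}. Made-up data for illustration.
ratings = {
    "john": {"A": 5, "B": 3, "C": 2},
    "mark": {"A": 3, "B": 4},
    "lucy": {"B": 2, "C": 5},
}

def slope_one_predict(ratings, user, target):
    """Predict `user`'s rating for `target` with weighted Slope One."""
    numerator, weight = 0.0, 0
    for item, user_rating in ratings[user].items():
        if item == target:
            continue
        # Rating differences (target - item) over users who rated both.
        diffs = [r[target] - r[item]
                 for r in ratings.values()
                 if target in r and item in r]
        if not diffs:
            continue
        deviation = sum(diffs) / len(diffs)
        # Each item votes with weight = number of co-rating users.
        numerator += (deviation + user_rating) * len(diffs)
        weight += len(diffs)
    return numerator / weight if weight else None

prediction = slope_one_predict(ratings, "lucy", "A")
print(round(prediction, 2))
```

For this toy data the prediction works out to 13/3, about 4.33: item B contributes (0.5 + 2) with weight 2 and item C contributes (3 + 5) with weight 1.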
Recommendation Types: User-Based Recommendation
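User-based recommendation can be illustrated with a small sketch: find users whose ratings resemble the target user's, then average their ratings for the unseen item, weighted by similarity. The data, the choice of cosine similarity over co-rated items, and the function names are all assumptions made for this example; the slides do not specify them.

```python
from math import sqrt

# Toy ratings: user -> {item: rating}. Made-up data for illustration.
ratings = {
    "alice": {"A": 5, "B": 3, "C": 4},
    "bob":   {"A": 4, "B": 3, "C": 5, "D": 5},
    "carol": {"A": 1, "B": 5, "D": 2},
}

def cosine(u, v):
    # Cosine similarity restricted to the keys both vectors share.
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[k] * v[k] for k in common)
    norm_u = sqrt(sum(u[k] ** 2 for k in common))
    norm_v = sqrt(sum(v[k] ** 2 for k in common))
    return dot / (norm_u * norm_v)

def predict_user_based(ratings, user, item):
    # Similarity-weighted average of other users' ratings for `item`.
    numerator = denominator = 0.0
    for other, their_ratings in ratings.items():
        if other == user or item not in their_ratings:
            continue
        sim = cosine(ratings[user], their_ratings)
        numerator += sim * their_ratings[item]
        denominator += abs(sim)
    return numerator / denominator if denominator else None

prediction = predict_user_based(ratings, "alice", "D")
print(prediction)
```

Here bob rates items much like alice does, so his high rating for D pulls the prediction above the midpoint between bob's and carol's ratings.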
Recommendation Types: Item-Based Recommendation
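Item-based recommendation turns the comparison around: items are compared by the users who rated them, and a user's own ratings for similar items drive the prediction. The sketch below is a minimal illustration under the same assumptions as any toy example (made-up data, cosine similarity over co-rating users, hypothetical function names).

```python
from math import sqrt

# Toy ratings: user -> {item: rating}. Made-up data for illustration.
ratings = {
    "alice": {"A": 5, "B": 3, "C": 4},
    "bob":   {"A": 4, "B": 3, "C": 5, "D": 5},
    "carol": {"A": 1, "B": 5, "D": 2},
}

def item_vectors(ratings):
    # Transpose to item -> {user: rating}.
    items = {}
    for user, rated in ratings.items():
        for item, value in rated.items():
            items.setdefault(item, {})[user] = value
    return items

def cosine(a, b):
    # Cosine similarity restricted to users who rated both items.
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[k] * b[k] for k in common)
    norm_a = sqrt(sum(a[k] ** 2 for k in common))
    norm_b = sqrt(sum(b[k] ** 2 for k in common))
    return dot / (norm_a * norm_b)

def predict_item_based(ratings, user, target):
    # Average the user's own ratings, weighted by item-to-target similarity.
    items = item_vectors(ratings)
    if target not in items:
        return None
    numerator = denominator = 0.0
    for item, value in ratings[user].items():
        if item == target:
            continue
        sim = cosine(items[target], items[item])
        numerator += sim * value
        denominator += abs(sim)
    return numerator / denominator if denominator else None

prediction = predict_item_based(ratings, "alice", "D")
print(prediction)
```

Item-to-item similarities change more slowly than user-to-user ones, so in practice they are often precomputed; this sketch recomputes them on every call for brevity.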
Technologies to Be Used
• Hadoop
• Mahout
• GraphLab
• Google Prediction
• Google Cloud Storage
• Google App Engine
Modules of System
• User module
• Admin module
• Recommendation module
• File management module
• Search module
Integration of Technologies
• Mahout-based recommendation
• Graph-based recommendation
• Google Prediction-based recommendation
Technology: HADOOP
• Hadoop is a top-level Apache project being built and used by a global community of contributors.
• The Hadoop project develops open-source software for reliable, scalable, distributed computing.
• It enables applications to work with thousands of nodes and petabytes of data.
• Hadoop also supports the MapReduce programming model.
• It provides the HDFS file system, which stores data on the compute nodes.
GraphLab
• It is a new parallel framework for machine learning algorithms.
• Designing and implementing efficient and correct parallel machine learning (ML) algorithms can be very challenging.
• Designed specifically for ML needs.
• Automatic data synchronization.
• The Update Function plays a role similar to the Map phase.
• The Sync Operation plays a role similar to the Reduce phase.
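The analogy between update functions and the Map phase, and between the sync operation and the Reduce phase, can be made concrete with a toy single-threaded sketch. This is not the GraphLab API: the graph, the `update` and `sync_total` names, the scheduling queue, and the use of PageRank as the example computation are all assumptions chosen for illustration.

```python
from collections import deque

# Toy data graph: vertex -> out-neighbours (made-up example).
out_edges = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
in_edges = {v: [] for v in out_edges}
for src, dsts in out_edges.items():
    for dst in dsts:
        in_edges[dst].append(src)

rank = {v: 1.0 for v in out_edges}  # per-vertex data
DAMPING, TOLERANCE = 0.85, 1e-6

def update(v):
    # Update function: recompute v from the data in its scope
    # (its in-neighbours) and return the vertices to reschedule.
    new_rank = (1 - DAMPING) + DAMPING * sum(
        rank[u] / len(out_edges[u]) for u in in_edges[v])
    changed = abs(new_rank - rank[v]) > TOLERANCE
    rank[v] = new_rank
    return out_edges[v] if changed else []

def sync_total():
    # Sync operation: a fold over all vertex data (a global aggregate).
    return sum(rank.values())

# Scheduler: a FIFO queue of vertices that still need updating.
schedule = deque(out_edges)
queued = set(schedule)
while schedule:
    v = schedule.popleft()
    queued.discard(v)
    for w in update(v):
        if w not in queued:
            schedule.append(w)
            queued.add(w)

total = sync_total()
print(rank, total)
```

Unlike a Map phase, an update function reads neighbouring data and can schedule further work, which is why a scheduler and a scope (consistency) mechanism are part of the model.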
[Figure: GraphLab Model. Components: Data Graph, Shared Data Table, Scheduling, Update Functions and Scopes.]
[Figure: MapReduce – Map Phase. CPU 1 to CPU 4 each compute results independently (embarrassingly parallel); no communication needed.]
[Figure: MapReduce – Reduce Phase. CPU 1 and CPU 2 fold/aggregate the results of the Map phase.]
GraphLab in Recommendation
• GraphLab provides a better way to build a recommendation engine.
• It first loads a simple dataset file.
• In GraphLab we can also implement various algorithms, such as k-means clustering, fuzzy logic, and PageRank.
• It first translates the dataset into matrix form.
• Then, depending on the chosen algorithm, it generates the recommendation output.
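The step of translating a dataset into matrix form can be sketched as follows. The `user,item,rating` CSV layout, the sample rows, and the `to_matrix` helper are assumptions for illustration; the slides do not describe the actual file format GraphLab expects.

```python
import csv
import io

# Hypothetical dataset file contents: one rating per line.
raw = """user,item,rating
alice,A,5
alice,B,3
bob,A,4
bob,C,5
"""

def to_matrix(text):
    # Build a dense user x item matrix; 0.0 marks a missing rating.
    rows = list(csv.DictReader(io.StringIO(text)))
    users = sorted({r["user"] for r in rows})
    items = sorted({r["item"] for r in rows})
    user_index = {u: i for i, u in enumerate(users)}
    item_index = {it: j for j, it in enumerate(items)}
    matrix = [[0.0] * len(items) for _ in users]
    for r in rows:
        matrix[user_index[r["user"]]][item_index[r["item"]]] = float(r["rating"])
    return users, items, matrix

users, items, matrix = to_matrix(raw)
print(users, items, matrix)
```

A dense list of lists is only practical for small data; rating matrices are mostly empty, so a sparse representation would be the usual choice at scale.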
Google Prediction Service
• A Google cloud service used for building smart applications.
• Provides machine learning algorithms.
• Related to artificial intelligence.
Google Prediction Service
• Google Prediction API: a set of methods for data analysis. Libraries support multiple languages.
• Google App Engine: runs the application on a cloud application server.
• Google Cloud Storage: stores the data in Google's cloud.
Technology: MAHOUT
• Apache Mahout is an open-source project by the Apache Software Foundation (ASF).
• The primary goal of Mahout is creating scalable machine-learning algorithms.
• Mahout includes several MapReduce-enabled clustering implementations, including k-Means, fuzzy k-Means, Canopy, Dirichlet, and Mean-Shift.
• Mahout generally takes datasets in a fixed format as data input.
• Amazon EC2 works with Hadoop and Mahout.
Implementation Issues to Be Solved
• Lack of knowledge about Hadoop, Mahout, and Hive
• Memory issues
• Operating system support
• Load balancing
• Configuration
• Data normalization
• Developing a clustering algorithm
• Configuring Mahout with Hadoop
Application of Recommendation
• Yahoo!
• Facebook
• Twitter
• Baidu
• eBay
• LinkedIn
• New York Times
• Rackspace
• eHarmony
• Powerset
Future Enhancement
• Integration with web application technologies such as JSP and Servlets
• Integration with databases such as Hive, HBase, MongoDB, and CouchDB
• Cloud-based recommendation service
• Integration of Mahout, GraphLab, and Google Prediction based recommendation services
• Mobile application integration