Recommendation
Engine
Outlines
 Introduction
 Objectives
 Scope
 Problem with existing system
 Purpose of new system
 Proposed architecture
 Technologies to be used
 Modules of system
 Integration of technologies
 Implementation Issues to be solved
 Application
 Future Enhancement
Objectives
 Build an information filtering system
 The recommendation engine supports three approaches
- User based
- Item based
- Slope One based
 Run in a cloud environment
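
As an illustration of the Slope One approach named above, here is a minimal sketch of the weighted Slope One predictor. The nested-dictionary rating layout, the function name and the sample ratings are assumptions made for this example; the slides do not show the project's own implementation.

```python
from collections import defaultdict

def slope_one_predict(ratings, user, target):
    """Predict `user`'s rating for `target` with weighted Slope One.

    ratings: dict user -> dict item -> rating (hypothetical layout).
    Returns None when no co-rated item supports a prediction.
    """
    diff_sum = defaultdict(float)  # summed (target - other item) rating differences
    count = defaultdict(int)       # number of users who rated both items
    for prefs in ratings.values():
        if target not in prefs:
            continue
        for item, value in prefs.items():
            if item != target:
                diff_sum[item] += prefs[target] - value
                count[item] += 1

    numerator = 0.0
    denominator = 0
    for item, value in ratings[user].items():
        if item != target and count[item] > 0:
            average_diff = diff_sum[item] / count[item]
            # Each item's vote is weighted by how many users co-rated it.
            numerator += (value + average_diff) * count[item]
            denominator += count[item]
    return numerator / denominator if denominator else None

# Illustrative ratings only.
ratings = {
    "alice": {"A": 5.0, "B": 3.0, "C": 2.0},
    "bob": {"A": 3.0, "B": 4.0},
    "carol": {"B": 2.0, "C": 5.0},
}
prediction = slope_one_predict(ratings, "bob", "C")
```

The predictor only needs average rating differences between item pairs, which is why it is often chosen when a simple and fast baseline is wanted.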
Introduction
 A recommendation engine gives suggestions for
movies, songs, videos, websites, books, images and
social elements.
 Applicable to e-business.
 Useful for both customers and online retailers.
 Recommendation engines are used at
Amazon, YouTube, Facebook and Twitter.
Scope
 Our system will provide a recommendation service
only.
 Recommendations will be generated from the user's
historical activity, such as purchase patterns,
ratings and likes.
 Recommendations will be stored in a database or a
file, or returned directly to the retailer's web application.
Problems with existing System
 Takes more time to generate recommendations
 No real-time recommendations for large data
Purpose of new System
 Less time for generating recommendations
 Applicable to big data
 Recommendations by several algorithms
 User based
 Item based
 Slope One based
 Association rule mining
 Evaluation of recommendations
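
To make the association rule mining item concrete, the sketch below counts item pairs in purchase baskets and keeps rules that pass support and confidence thresholds. The function name, the thresholds and the sample baskets are hypothetical; a production system would more likely use a full frequent-itemset algorithm.

```python
from collections import Counter
from itertools import combinations

def pair_rules(baskets, min_support=0.5, min_confidence=0.6):
    """Return (lhs, rhs, support, confidence) rules between single items."""
    total = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), together in pair_counts.items():
        support = together / total  # fraction of baskets holding both items
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = together / item_counts[lhs]  # estimate of P(rhs | lhs)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return rules

# Illustrative purchase history only.
baskets = [
    ["bread", "milk"],
    ["bread", "butter"],
    ["bread", "milk", "butter"],
    ["milk"],
]
rules = pair_rules(baskets)
```

A rule such as butter -> bread can then be read as "customers who bought butter may also be shown bread".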
Recommendations-Type
 User Based Recommendation
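
A minimal sketch of user-based recommendation, assuming ratings are held as a dictionary of user to item to rating. It finds the most similar users by cosine similarity and ranks unseen items by the similarity-weighted sum of the neighbours' ratings. All names and data here are illustrative, not the project's code.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two sparse rating vectors (missing = 0)."""
    dot = sum(u[i] * v[i] for i in set(u) & set(v))
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

def recommend_user_based(ratings, user, k=2, top_n=3):
    """Rank items the user has not rated, using the k most similar users."""
    neighbours = sorted(
        ((cosine(ratings[user], prefs), other)
         for other, prefs in ratings.items() if other != user),
        reverse=True,
    )[:k]
    scores = {}
    for similarity, other in neighbours:
        if similarity <= 0:
            continue
        for item, value in ratings[other].items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + similarity * value
    ranked = sorted(scores.items(), key=lambda pair: pair[1], reverse=True)
    return [item for item, _ in ranked[:top_n]]

# Illustrative ratings only.
ratings = {
    "u1": {"m1": 5, "m2": 4},
    "u2": {"m1": 5, "m2": 5, "m3": 4},
    "u3": {"m1": 1, "m4": 5},
}
suggestions = recommend_user_based(ratings, "u1")
```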
Recommendations-Type
 Item Based Recommendation
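
A comparable sketch for item-based recommendation: the rating data is inverted into item vectors, and each unseen item is scored by its similarity to the items the user has already rated. The data layout, function names and sample ratings are assumptions for this example.

```python
from math import sqrt

def item_vectors(ratings):
    """Invert user -> item -> rating into item -> user -> rating."""
    items = {}
    for user, prefs in ratings.items():
        for item, value in prefs.items():
            items.setdefault(item, {})[user] = value
    return items

def cosine(a, b):
    """Cosine similarity between two sparse vectors (missing = 0)."""
    dot = sum(a[k] * b[k] for k in set(a) & set(b))
    norm = sqrt(sum(x * x for x in a.values())) * sqrt(sum(x * x for x in b.values()))
    return dot / norm if norm else 0.0

def recommend_item_based(ratings, user, top_n=3):
    """Score unseen items by similarity to the items the user has rated."""
    items = item_vectors(ratings)
    seen = ratings[user]
    scores = {}
    for candidate, vector in items.items():
        if candidate in seen:
            continue
        scores[candidate] = sum(
            cosine(vector, items[item]) * rating for item, rating in seen.items()
        )
    ranked = sorted(scores.items(), key=lambda pair: pair[1], reverse=True)
    return [item for item, score in ranked[:top_n] if score > 0]

# Illustrative ratings only.
ratings = {
    "u1": {"book": 5, "pen": 4},
    "u2": {"book": 4, "pen": 5, "lamp": 5},
    "u3": {"lamp": 2, "mug": 5},
    "u4": {"book": 5, "lamp": 4},
}
suggestions = recommend_item_based(ratings, "u1")
```

Item-to-item similarities change more slowly than user neighbourhoods, so in practice they can often be precomputed offline.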
Proposed System Architecture
Technologies to be used
 Hadoop
 Mahout
 GraphLab
 Google Prediction API
 Google Cloud Storage
 Google App Engine
Modules of System
 User Module
 Admin Module
 Recommendation Module
 File management Module
 Search Module
Integration of Technologies
 Mahout based Recommendation
 Graph based Recommendation
 Google prediction Based Recommendation
Technology: HADOOP
 Hadoop is a top-level Apache project built
and used by a global community of contributors.
 The Hadoop project develops open-source software for
reliable, scalable, distributed computing.
 It enables applications to work with thousands of
nodes and petabytes of data.
 Hadoop also supports the MapReduce programming model.
 It provides the HDFS file system, which stores data on
the compute nodes.
Hadoop
GraphLab
 It is a new parallel framework for machine
learning algorithms.
 Designing and implementing efficient
and correct parallel machine learning (ML)
algorithms can be very challenging.
 Designed specifically for ML needs.
 Automatic data synchronization.
 The update function plays a role like the map phase.
 The sync operation plays a role like the reduce phase.
GraphLab Model
 Data Graph
 Shared Data Table
 Scheduling
 Update Functions and Scopes
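
The four parts of the model can be imitated in plain Python to show how they fit together. This is not the GraphLab API: it is a single-threaded sketch in which a dictionary stands in for the data graph, another for the shared data table, a FIFO queue for the scheduler, and PageRank is chosen here only as a familiar update function.

```python
from collections import deque

def pagerank_update(vertex, graph, rank, damping=0.85):
    """Update function: recompute one vertex from its in-neighbours (its scope)."""
    incoming = sum(
        rank[src] / len(graph[src]) for src in graph if vertex in graph[src]
    )
    return (1 - damping) / len(graph) + damping * incoming

def run(graph, tolerance=1e-6):
    rank = {v: 1.0 / len(graph) for v in graph}  # data stored on the data graph
    shared = {}                                  # stands in for the shared data table
    schedule = deque(graph)                      # simple FIFO scheduler
    queued = set(graph)
    while schedule:
        vertex = schedule.popleft()
        queued.discard(vertex)
        new_value = pagerank_update(vertex, graph, rank)
        changed = abs(new_value - rank[vertex]) > tolerance
        rank[vertex] = new_value
        if changed:
            # Reschedule the vertices whose input just changed.
            for neighbour in graph[vertex]:
                if neighbour not in queued:
                    schedule.append(neighbour)
                    queued.add(neighbour)
    shared["total_rank"] = sum(rank.values())    # sync: a fold over all vertices
    return rank, shared

# Illustrative graph: vertex -> list of vertices it links to.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
rank, shared = run(graph)
```

Unlike a map phase, each update reads its neighbours' current values, which is why a real parallel framework needs scopes to keep concurrent updates consistent.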
MapReduce – Map Phase
[Figure: CPU 1 to CPU 4 each process their own columns of values.]
 Embarrassingly parallel, independent computation
 No communication needed
MapReduce – Reduce Phase
[Figure: CPU 1 and CPU 2 combine the mapped columns into aggregate values.]
 Fold/Aggregation
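
The two phases can be sketched in a few lines of Python. The worker pool mirrors the four CPUs of the map phase, and the fold mirrors the reduce phase. The feature function and the input values are invented for illustration, and a thread pool on one machine is only a stand-in for a Hadoop cluster.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

def extract_features(value):
    """Map step: each record is processed independently of the others."""
    return [value * 2, value + 1]

def combine(accumulator, features):
    """Reduce step: element-wise fold of the per-record feature vectors."""
    return [a + f for a, f in zip(accumulator, features)]

def map_reduce(records, workers=4):
    # Map phase: no communication is needed between workers.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(extract_features, records))
    # Reduce phase: aggregate the mapped results into summary values.
    return reduce(combine, mapped, [0, 0])

totals = map_reduce([1, 2, 3, 4])
```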
GraphLab in Recommendation
 GraphLab provides a convenient way to build a recommendation
engine.
 It first loads a simple dataset file.
 In GraphLab we can also implement various algorithms,
such as k-means clustering, fuzzy logic and PageRank.
 It then translates the dataset into matrix form.
 The chosen algorithm then generates the
recommendation output.
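
The translation of a dataset into matrix form might look like the sketch below, which assumes the dataset is a list of (user, item, rating) triples and uses 0.0 for missing ratings. This is plain Python for illustration and not a GraphLab call.

```python
def to_matrix(triples):
    """Build a dense user-by-item rating matrix from (user, item, rating) triples."""
    users = sorted({user for user, _, _ in triples})
    items = sorted({item for _, item, _ in triples})
    row = {user: index for index, user in enumerate(users)}
    col = {item: index for index, item in enumerate(items)}
    matrix = [[0.0] * len(items) for _ in users]
    for user, item, rating in triples:
        matrix[row[user]][col[item]] = rating
    return users, items, matrix

# Illustrative dataset only.
triples = [("u1", "book", 5.0), ("u1", "pen", 3.0), ("u2", "book", 4.0)]
users, items, matrix = to_matrix(triples)
```

Real rating data is mostly empty, so a sparse representation would normally replace the dense lists used here.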
Google Prediction Service
 A Google cloud service for building smart
applications.
 Provides machine learning algorithms.
 Related to artificial intelligence.
Google Prediction Service
 Google Prediction API:
 Set of methods for data analysis.
 Client libraries support multiple languages.
 Google App Engine:
 Lets the application run on a cloud-hosted application
server.
 Google Cloud Storage:
 Lets data be stored in the Google cloud.
Google Prediction Service
Technology : MAHOUT
• Apache Mahout is an open-source project of the Apache
Software Foundation (ASF).
• The primary goal of Mahout is to create scalable
machine-learning algorithms.
• Mahout includes several MapReduce-enabled clustering
implementations, including k-Means, fuzzy k-Means,
Canopy, Dirichlet, and Mean-Shift.
• Mahout generally takes datasets in a fixed format as
input.
• Hadoop and Mahout can run on Amazon EC2.
Implementation Issues to be solved
 Lack of knowledge about Hadoop, Mahout and Hive
 Memory issues
 Operating system support
 Load balancing
 Configuration
 Data normalization
 Developing a clustering algorithm
 Configuring Mahout with Hadoop
Application of recommendation
 Yahoo!
 Facebook
 Twitter
 Baidu
 eBay
 LinkedIn
 New York Times
 Rackspace
 eHarmony
 Powerset
Future enhancement
 Integration with web application technologies such as JSP and Servlets
 Integration with databases such as
Hive, HBase, MongoDB and CouchDB
 Cloud-based recommendation service
 Integration of Mahout, GraphLab and Google Prediction
based recommendation services
 Mobile application integration
Thank You

Editor's Notes

  • #18 The GraphLab model is defined in 4 parts: the Data Graph, which is used to express sparse data dependencies in your computation; the Shared Data Table, which is used to express global data as well as global computation; the scheduler, which determines the order of computation; and the scope system, which provides thread safety and consistency.
  • #19 MapReduce has 2 parts: a Map stage and a Reduce stage. The Map stage represents embarrassingly parallel computation. That is, each computation is independent and can be performed on different machines without any communication.
  • #20 For instance, we could use MapReduce to perform feature extraction on a large number of pictures. For instance, .. To compute an attractiveness score.
  • #21 The Reduce stage is essentially a “fold” or an aggregation operation over the results. It can be used, for instance, to compile summary statistics.