CS267_Graph_Lab
Published in: Technology, Education
Transcript

  1. By: Jaideep Katkar. Under the guidance of: Dr. Tran Thanh
  2. GraphLab Overview: A New Framework for Parallel Machine Learning
       – High-level abstractions for machine learning problems
       – Shared-memory multiprocessor
       – Assumes no fault tolerance is needed
       – Concurrent-access processing models with sequential-consistency guarantees
  3. How Does GraphLab Work?
       – The user's data is represented by a directed graph
       – Each block of data is represented by a vertex and a directed edge
       – Shared data table
       – User functions:
           Update: modifies the state of a vertex and its edges; has read-only access to the shared table
           Fold: sequentially aggregates into a key entry in the shared table; may modify vertex data
           Merge: parallelizes the Fold function
           Apply: finalizes the key entry in the shared table
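The Fold / Merge / Apply roles listed above can be sketched in plain C++. This is only an illustration of the idea, not GraphLab's API: `Vertex`, `fold_vertex`, `merge_partials`, `finalize_entry`, and `sync_mean` are hypothetical names, and the "shared table entry" is assumed to be a single mean value.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for vertex data; not a GraphLab type.
struct Vertex { double value; };

// Fold: sequentially add one vertex's value into a running accumulator.
double fold_vertex(double acc, const Vertex& v) { return acc + v.value; }

// Merge: combine two partial accumulators produced by independent folds.
double merge_partials(double left, double right) { return left + right; }

// Apply: finalize the accumulated value before it would be stored in the
// shared table. Here it turns a sum into a mean.
double finalize_entry(double acc, std::size_t count) {
  assert(count > 0);
  return acc / static_cast<double>(count);
}

// Fold each half of the vertex set separately (as two workers might),
// then merge the partial results and apply the finalizer.
double sync_mean(const std::vector<Vertex>& vertices) {
  assert(!vertices.empty());
  const std::size_t mid = vertices.size() / 2;
  double left = 0.0;
  double right = 0.0;
  for (std::size_t i = 0; i < mid; ++i) left = fold_vertex(left, vertices[i]);
  for (std::size_t i = mid; i < vertices.size(); ++i) right = fold_vertex(right, vertices[i]);
  return finalize_entry(merge_partials(left, right), vertices.size());
}
```

The two loops run one after the other here; Merge is what would let them run on separate workers, because the partial sums can be combined in any order.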
  4. GAS (Gather, Apply, Scatter) Decomposition
  5. GraphLab Toolkits
       • Topic Modeling contains applications like LDA, which can be used to cluster documents and extract topical representations.
       • Graph Analytics contains applications like PageRank and triangle counting, which can be applied to general graphs to estimate community structure.
       • Clustering contains standard data clustering tools such as K-means.
       • Collaborative Filtering contains a collection of applications used to make predictions about users' interests and to factorize large matrices.
       • Graphical Models contains tools for making joint predictions about collections of related random variables.
       • Computer Vision contains a collection of tools for reasoning about images.
  6. Running GraphLab on an EC2 Cluster
     Requirements:
       • You should have an Amazon EC2 account eligible to run in the us-east-1a zone.
       • From the Amazon AWS console, obtain your AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY (under your account name in the top right corner -> Security Credentials -> Access Keys).
       • You should have a keypair attached to the zone you are running in (in our example, us-east-1a).
       • Install boto, the AWS Python client. To install, run: 'sudo pip install boto'.
       • Download and install GraphLab as described on the next slides.
  7. Satisfying Dependencies on Ubuntu
     All the dependencies can be satisfied from the repository. The command below installs gcc and the JDK, which are needed to compile GraphLab programs:
     Downloading GraphLab version 2.2
     You can download GraphLab directly from our GitHub repository. GitHub also offers a zip download of the repository if you do not have git. The git command line for cloning the repository is:
  8. Compiling and Running GraphLab
     In the graphlabapi directory, running the configure script will create two sub-directories, release/ and debug/. Change into either of these directories and run make to build the release or the debug version, respectively. Note that this will compile all of GraphLab, including all toolkits.
  9. Running Stochastic Gradient Descent (SGD) in the Collaborative Filtering Toolkit
     The collaborative filtering toolkit contains tools for computing a linear model of the data and predicting missing values based on this linear model. This is useful when computing recommendations for users.
     http://docs.graphlab.org/collaborative_filtering.html
  10. Running SGD on Netflix Data to Predict User Ratings
      Input file (training) for Netflix data:
        [user] [item] [rating]
        1000   2      5.0
        3      7      12.0
        6      2      2.1
      Creating a directory to load the Netflix data
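As a rough illustration of the training-file layout shown above, the following sketch reads whitespace-separated user, item, and rating triples. It assumes one triple per line and is not the toolkit's own loader; `Rating`, `parse_ratings`, and `check_sample_input` are hypothetical names.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// One training record: user ID, item ID, and the observed rating.
struct Rating {
  long user;
  long item;
  double value;
};

// Parse whitespace-separated "[user] [item] [rating]" lines.
// Lines that do not contain three parseable fields are skipped.
std::vector<Rating> parse_ratings(std::istream& in) {
  std::vector<Rating> out;
  std::string line;
  while (std::getline(in, line)) {
    std::istringstream fields(line);
    Rating r{};
    if (fields >> r.user >> r.item >> r.value) out.push_back(r);
  }
  return out;
}

// Usage check against the three sample lines from the slide.
inline void check_sample_input() {
  std::istringstream sample("1000 2 5.0\n3 7 12.0\n6 2 2.1\n");
  const std::vector<Rating> ratings = parse_ratings(sample);
  assert(ratings.size() == 3);
  assert(ratings[0].user == 1000 && ratings[0].item == 2);
}
```

Skipping malformed lines silently is a choice made for brevity; a real loader would more likely report them.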
  11. Command-Line Arguments to Run SGD
        --gamma=XX        Gradient descent step size.
        --lambda=XX       Gradient descent regularization.
        --step_dec=XX     Multiplicative step decrease. Should be between 0.1 and 1. Default is 0.9.
        --D=X             Feature vector width. Common values are 20 - 150.
        --max_iter=XX     Maximum number of iterations.
        --maxval=XX       Maximum allowed rating.
        --minval=XX       Minimum allowed rating.
        --predictions=XX  File name to write predictions to. Note that you will need a user/item pair input file named something.predict to enable predictions (see section: ratings).
        --tol=XX          Stop computation when the absolute error of prediction is less than the tolerance. Default is 1e-3.
  12. Output File
      SGD is a simple gradient descent algorithm. Prediction in SGD is done as:
        r_ui = p_u * q_i
      where r_ui is the scalar rating of user u for item i, p_u is the user feature vector of size D, q_i is the item feature vector of size D, and the product is the vector dot product.
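The prediction rule above, together with one gradient step, can be sketched as follows. The update shown is the textbook regularized matrix-factorization step, assumed here for illustration and not taken from GraphLab's source; `predict` and `sgd_step` are hypothetical names, while `gamma` and `lambda` mirror the step-size and regularization flags.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Predicted rating r_ui: dot product of the user and item feature vectors,
// both of size D.
double predict(const std::vector<double>& p_u, const std::vector<double>& q_i) {
  assert(p_u.size() == q_i.size());
  double r = 0.0;
  for (std::size_t k = 0; k < p_u.size(); ++k) r += p_u[k] * q_i[k];
  return r;
}

// One SGD step on a single observed rating (assumed textbook update).
// gamma is the step size and lambda the regularization weight.
void sgd_step(std::vector<double>& p_u, std::vector<double>& q_i,
              double rating, double gamma, double lambda) {
  const double err = rating - predict(p_u, q_i);
  for (std::size_t k = 0; k < p_u.size(); ++k) {
    const double pu = p_u[k];  // keep the old value for the item update
    p_u[k] += gamma * (err * q_i[k] - lambda * pu);
    q_i[k] += gamma * (err * pu - lambda * q_i[k]);
  }
}
```

With p_u = (1, 2) and q_i = (3, 4), `predict` returns 1*3 + 2*4 = 11; a step toward an observed rating of 12 then moves both vectors so that the next prediction is closer to 12 for a sufficiently small gamma.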
  13. Creating a GraphLab Project
       • To create a GraphLab project, simply create a sub-directory in the graphlab/apps/ folder with your project name.
       • For instance, graphlab/apps/my_first_GraphLabProject.
       • Create a text file called CMakeLists.txt with the following contents:
           project(My_GraphLabProject)
           add_graphlab_executable(my_first_GraphLabProject <ProgramName>.cpp)
  14. Hello World in GraphLab

        #include <graphlab.hpp>
        using namespace graphlab;

        int main(int argc, char** argv) {
          // Initialize MPI before creating any GraphLab objects.
          graphlab::mpi_tools::init(argc, argv);
          // Distributed communication layer, required even on a single machine.
          graphlab::distributed_control dc;
          dc.cout() << "Hello World!\n";
          graphlab::mpi_tools::finalize();
          return 0;
        }

       • dc is the distributed communication layer, which is needed by a number of the core GraphLab objects, whether you are running distributed or not.
       • To build the program, run the configure script, then run "make" in the debug/ or release/ build folder. When executed, the program prints "Hello World!".
  15. Thank You
