Using AWS to Build a Graph-Based Product Recommendation System (BDT303) | AWS re:Invent 2013
 


Magazine Luiza, one of the largest retail chains in Brazil, developed an in-house product recommendation system built on top of a large knowledge graph. AWS resources such as Amazon EC2, Amazon SQS, Amazon ElastiCache and others made it possible for them to scale from a very small dataset to a huge Cassandra cluster. By improving the big data processing algorithms in this in-house solution on AWS, they improved their conversion rates on revenue by more than 25 percent compared to the market solutions they had used in the past.


    Presentation Transcript

    • Using AWS to Build a Graph-based Product Recommendation System Andre Fatala & Renato Pedigoni November 14, 2013 © 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc. Friday, November 15, 13
    • About Magazine Luiza: Magazine Luiza is one of the largest household appliance retail chains in Brazil, focused on providing durable goods for Brazil's middle and lower-to-middle income classes.
        • 731 stores
        • 8 distribution centers
        • More than 23,000 workers
        • 22.8 million customers
        • Multi-channel strategy
    • Recommendation systems
    • Graphs
    • Graph Stack
        • Distributed graph database
            • Used for OLTP queries
            • Native integration with Tinkerpop
        • Distributed database management system
            • Continuously available with no single point of failure
            • Elastic scalability
            • Caching layer
            • Built-in replication
    • Storing user data (architecture diagram): Elastic Load Balancing in front of Auto Scaling API instances (EC2), backed by a Cassandra cluster of six m2.xlarge instances
    • In graph words… (diagram, built up step by step)
        • Vertices: person, session, channel, item
        • Edge labels: created, visited, viewed, add_to_cart, bought
        • Weights shown as the edge into item changes: +1 (viewed), +13 (add_to_cart), +21 (bought)
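The weights on this slide can be read as interaction scores. The following is a minimal Python sketch under an assumption the deck does not state: that each weight is added to a per-session, per-item score whenever the corresponding event occurs. The names `EVENT_WEIGHTS` and `score_events`, and the sample ids, are hypothetical.

```python
from collections import defaultdict

# Weights shown on the slide; treating them as additive scores is an assumption.
EVENT_WEIGHTS = {"viewed": 1, "add_to_cart": 13, "bought": 21}


def score_events(events):
    """Sum event weights per (session, item) pair.

    `events` is an iterable of (session_id, event_type, item_id) tuples.
    """
    scores = defaultdict(int)
    for session_id, event_type, item_id in events:
        scores[(session_id, item_id)] += EVENT_WEIGHTS[event_type]
    return dict(scores)


# Hypothetical sample data for illustration only.
events = [
    ("s1", "viewed", "tv-42"),
    ("s1", "add_to_cart", "tv-42"),
    ("s1", "viewed", "tv-50"),
    ("s2", "bought", "tv-42"),
]
print(score_events(events))
```

With a scheme like this, a purchase would count for much more than a page view when edges are later ranked, which may be the intent behind the increasing weights.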
    • Base recommendations
        • Who viewed this item also viewed
        • Who bought this item also bought
        • Bought after viewing this item
        • Upselling
    • How to query the graph for recs?
    • Gremlin Graph Language
        • Groovy DSL for graph traversals
        • Easy to learn
        • Great community
        • Part of the Tinkerpop stack
        • Works with any Blueprints-enabled graph database
    • People who viewed a product (example graph: Renato and Fatala linked by "viewed" edges to LED TV 40", LED TV 42", LCD TV 42" and LED 50"): g.v(4).in('viewed')
    • Who viewed this product also viewed (same example graph): g.v(4).in('viewed').out('viewed')
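To make the two traversals concrete, here is a plain-Python sketch of what `.in('viewed')` and `.in('viewed').out('viewed')` compute over an edge list. It is an illustration, not the speakers' code: the transcript does not preserve which person viewed which product, so the edges in `VIEWED` are hypothetical, as are the helper names. Ranking by path count and excluding the starting item are also assumptions added here.

```python
from collections import Counter

# Hypothetical (person, item) "viewed" edges; the exact edges on the slide
# are not recoverable from the transcript.
VIEWED = [
    ("Renato", 'LED TV 40"'),
    ("Renato", 'LED TV 42"'),
    ("Renato", 'LCD TV 42"'),
    ("Fatala", 'LED TV 42"'),
    ("Fatala", 'LCD TV 42"'),
    ("Fatala", 'LED 50"'),
]


def viewers_of(item):
    """Rough equivalent of .in('viewed'): people with a 'viewed' edge into item."""
    return [person for person, viewed in VIEWED if viewed == item]


def also_viewed(item):
    """Rough equivalent of .in('viewed').out('viewed'), ranked by path count.

    The start item is filtered out here; the bare traversal on the slide
    would also return it.
    """
    counts = Counter(
        other
        for person in viewers_of(item)
        for p, other in VIEWED
        if p == person and other != item
    )
    return counts.most_common()


print(viewers_of('LED TV 42"'))
print(also_viewed('LED TV 42"'))
```

Counting how many paths reach each product gives a simple ranking: products co-viewed by more people come first.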
    • Processing data with Spot Instances
        • Bob dispatches a task containing the product id to Amazon Simple Queue Service (Amazon SQS)
        • Spot Instances (EC2, m1.large) consume the Amazon SQS tasks and process W*A* recommendations
        • Logs are synced to Amazon Simple Storage Service (Amazon S3)
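The dispatch-and-consume pattern on this slide can be sketched locally. The example below is a stand-in only: it uses Python's in-process `queue.Queue` and threads in place of Amazon SQS and Spot Instances, and `compute_recommendations` is a hypothetical placeholder for the real graph query. None of these names come from the talk.

```python
import queue
import threading


def compute_recommendations(product_id):
    """Hypothetical placeholder for the real graph traversal."""
    return [f"rec-for-{product_id}"]


def worker(tasks, results, lock):
    """Consume product ids until a None sentinel arrives."""
    while True:
        product_id = tasks.get()
        if product_id is None:
            break
        recs = compute_recommendations(product_id)
        with lock:
            results[product_id] = recs


def run(product_ids, n_workers=3):
    tasks = queue.Queue()  # stand-in for the Amazon SQS queue
    results, lock = {}, threading.Lock()
    workers = [
        threading.Thread(target=worker, args=(tasks, results, lock))
        for _ in range(n_workers)
    ]
    for w in workers:
        w.start()
    for pid in product_ids:  # the dispatcher enqueues one task per product id
        tasks.put(pid)
    for _ in workers:  # one sentinel per worker so each one shuts down
        tasks.put(None)
    for w in workers:
        w.join()
    return results


print(run([101, 102, 103, 104]))
```

Because workers only pull tasks from the queue, any of them can disappear and be replaced without losing the remaining work, which is what makes interruptible Spot capacity a reasonable fit for this kind of batch job.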
    • Personalized e-mails (examples shown: Abandoned cart, Price dropped)
    • Personalized e-mails: users receive e-mails when:
        • A product has a price drop
        • They abandon a product in the cart
        • They visit many similar products
    • Personalized e-mails (Bobby Mailer)
        • The Bob API notifies the Mailer Manager (m1.large) of a user interaction
        • The Mailer Manager dispatches a task containing the customer id to Amazon Simple Queue Service (Amazon SQS)
        • Spot Instances (EC2, m1.large) consume the Amazon SQS tasks and find the best recommendation for that user
        • Amazon Simple Email Service (Amazon SES) sends the e-mail
        • Logs are synced to Amazon Simple Storage Service (Amazon S3)
    • Analytics with Faunus
        • Graph analytics engine
            • Provides graph input/output formats and a traversal language for graphs
        • Distributed computing (Amazon EMR)
            • Distributed processing of large data sets across clusters
            • Designed to scale
            • Detects and handles failures at the application layer
    • Analytics in Graphs with AWS (on Amazon EMR): > g.V.has('element_type', 'person').age.mean() returns 34.683232
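The query on this slide filters vertices by type and averages one property. A plain-Python sketch of that computation follows; the property names `element_type` and `age` are taken from the slide, while the sample vertices and the `mean_age` helper are hypothetical.

```python
from statistics import mean

# Hypothetical vertices; property names follow the query on the slide.
vertices = [
    {"element_type": "person", "age": 30},
    {"element_type": "person", "age": 40},
    {"element_type": "item", "name": "LED TV"},
]


def mean_age(vertices):
    """Mean of `age` over vertices whose element_type is 'person'."""
    ages = [
        v["age"]
        for v in vertices
        if v.get("element_type") == "person" and "age" in v
    ]
    return mean(ages)


print(mean_age(vertices))
```

On the full graph this is a scan over every vertex, which is why the deck runs it as a batch job rather than as an OLTP query.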
    • Backup process (diagram labels: nodetool script, Amazon S3)
    • Infrastructure (diagram components): Internet Gateway, Amazon Route 53, Elastic Load Balancing, API instances (EC2, Auto Scaling), Amazon ElastiCache (cache), Amazon SQS (queues), Spot instances (EC2, Auto Scaling), Cassandra cluster (m2.xlarge), Amazon S3 (logs, backups), Amazon EMR, Simple Email Service (Amazon SES)
    • Metrics
        • 4.3 million identified Magazine Luiza customers
        • 50,000 “product” nodes
        • 90 million total nodes
        • 350 million total edges
        • 700 GB of data
        • Peaks of 20,000 reads/sec on the Cassandra cluster
    • Results matter… 10x faster, 60%
    • Results matter… (chart, time axis January 2013 to September 2013; annotations: Solution A alone, First Bob tests, Bob out for 2 weeks, Bob alone, 190%)
    • Next steps
        • Use Faunus to pre-process all W*A* recommendations
        • Algorithms to identify communities in the graph
        • Cassandra replication between regions
    • Please give us your feedback on this presentation (BDT303). As a thank you, we will select prize winners daily for completed surveys! Thank You