Snapguide - Amazon Cloudsearch

A user talk by Sam Kimbrel from Snapguide presented at the CloudSearch Meetup (http://www.meetup.com/Bay-Area-Amazon-CloudSearch-Group/)

Published in: Technology

Transcript

  • 1. Share what you know • Sam Kimbrel, Software Engineer, sam@snapguide.com • Monday, April 1, 13
  • 2. What is Snapguide?
      • 1.5 million uniques/month
      • ~2000 reqs/min across app and web
      • Python (Pyramid/uWSGI/nginx)
      • MySQL/Redis
      • Built primarily on AWS: EC2, RDS, S3, SQS, SNS, CloudSearch, CloudFront
    daniel@snapguide.com • confidential do not distribute • Monday, April 1, 13
  • 6. Snapguide on CloudSearch
      • Beta trial users after mentioning Solr on the phone (seriously!)
      • Primary data set: guides
      • Facets: guide topic, “featured” boolean, visibility/ACL flags
      • “Autocomplete” search (more later)
  • 7.
      {
        "lang": "en",
        "fields": {
          "step_count": "14",
          "author_external_id": "qS878yliQ4mxg_9uHt2AZg",
          "author": "Claire Hesseltine",
          "items": [ "Preheat oven to 325 degrees Fahrenheit.", ... ],
          "title": "Make Brown Butter Sea Salt Cookies",
          "featured": 1,
          "summary": "The brown butter adds a nutty, caramel-like taste to these delicious cookies.",
          "topic": [ "desserts" ],
          "main_image_uuid": "43d201c8fd4b4833b83d3f95d112f1c1",
          "like_count": 761,
          "public": "true"
        },
        "version": 1364333310,
        "type": "add",
        "id": "9eabff97e32c4244a8205da3fba442e9"
      }
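A minimal sketch of how a document shaped like the one on slide 7 might be assembled in Python. The helper names `build_add_document` and `build_batch` are hypothetical and do not come from the talk. Treating `version` as a Unix timestamp is an assumption based on the value shown, and wrapping operations in a JSON array is an assumption about the batch format.

```python
import json
import time


def build_add_document(doc_id, fields, version=None, lang="en"):
    """Return one "add" operation with the same keys as the slide's document."""
    if version is None:
        # Assumption: the slide's version value looks like a Unix timestamp.
        version = int(time.time())
    return {
        "type": "add",
        "id": doc_id,
        "version": version,
        "lang": lang,
        "fields": fields,
    }


def build_batch(operations):
    """Serialize a list of operations as a JSON array (assumed batch format)."""
    return json.dumps(operations)


# Field values copied from the slide; only a subset of fields is shown.
guide = build_add_document(
    "9eabff97e32c4244a8205da3fba442e9",
    {
        "title": "Make Brown Butter Sea Salt Cookies",
        "topic": ["desserts"],
        "featured": 1,
        "like_count": 761,
    },
    version=1364333310,
)
batch = build_batch([guide])
```

Keeping the builder separate from whatever code uploads the batch makes it easy to check the document shape without touching the network.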
  • 8. Queries
      • Guide text search: q=cookies
      • Guide search with topic: q=cookies&facet=topic&bq=topic:'desserts'
      • “Typeahead”/suggestion search: bq=(or 'paper flower' 'paper flower*')
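The three query shapes on slide 8 can be sketched as URL-encoded query strings. The function names are hypothetical, only the parameters shown on the slide (`q`, `facet`, `bq`) are used, and the sketch does not escape quote characters inside user input, which a real typeahead would need to handle.

```python
from urllib.parse import parse_qs, urlencode


def text_query(terms):
    """Plain guide text search, e.g. q=cookies."""
    return urlencode({"q": terms})


def topic_query(terms, topic):
    """Text search restricted to one topic, with topic facet counts requested."""
    return urlencode({"q": terms, "facet": "topic", "bq": "topic:'%s'" % topic})


def typeahead_query(prefix):
    """Match the exact phrase typed so far, or the phrase as a prefix."""
    bq = "(or '{0}' '{0}*')".format(prefix)
    return urlencode({"bq": bq})


# Decoding the encoded string recovers the expression from the slide.
decoded = parse_qs(typeahead_query("paper flower"))
print(decoded["bq"][0])
```

Building the string with `urlencode` rather than by hand keeps the quotes, parentheses, and spaces in the `bq` expression safely encoded.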
  • 9. Result Ranking
      • Use “Compare Rank Expressions”
      • text_relevance is your friend
      • Goals:
          • Boost popular/featured guides
          • Make title/summary matches worth more than item (supplies, step text) matches
  • 10.
      min(
        cs.text_relevance({
          "weights": {"title": 2.5, "author": 1.5, "items": 0.1, "summary": 1.5},
          "default_weight": 1
        }),
        1000
      )
      + min(200, like_count / 10)
      + 100 * featured
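To see how the three terms of the slide 10 expression trade off, the arithmetic can be mirrored in plain Python. This is only an illustration: `rank_score` is a hypothetical name, the `text_relevance` argument stands in for whatever `cs.text_relevance` returns, and CloudSearch's own numeric handling (for example rounding) may differ from Python floats.

```python
def rank_score(text_relevance, like_count, featured):
    """Mirror the slide's expression with ordinary Python arithmetic."""
    relevance_part = min(text_relevance, 1000)    # relevance capped at 1000
    popularity_part = min(200, like_count / 10)   # likes add at most 200
    featured_part = 100 * featured                # featured is 0 or 1
    return relevance_part + popularity_part + featured_part


# The guide from slide 7 (761 likes, featured) with an assumed relevance of 500.
score = rank_score(500, 761, 1)
```

With these caps, likes contribute at most 200 points and the featured flag a flat 100, so a strong text match (up to 1000) still dominates the ordering.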
  • 11. Offline index updates
      • Extracting guide data to update document is slow
      • Remove update from online web request process
      • Internal-only API endpoints
      • SQS
      • queue_consumer daemon
  • 12. Offline index updates (diagram labels: Web server; SQS; Queue consumer; Snapguide DB/Redis; Web server (dedicated to queues); CloudSearch)
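One reading of slides 11 and 12 is that the web request only enqueues a small message, and a separate consumer later does the slow extraction and the index update. The sketch below uses Python's in-memory `queue.Queue` as a stand-in for SQS. The names `enqueue_update`, `consume`, `load_guide`, and `post_batch` are hypothetical, and the internal-only API endpoints and dedicated web server from the slides are collapsed into two plain callables.

```python
import json
import queue


def enqueue_update(q, guide_id):
    """Called from the web request: record which guide changed, nothing more."""
    q.put(json.dumps({"guide_id": guide_id}))


def consume(q, load_guide, post_batch, max_batch=10):
    """Drain up to max_batch messages, build documents, and post them once.

    load_guide stands in for the slow data extraction; post_batch stands in
    for the upload to the search index. Returns the number of documents sent.
    """
    operations = []
    while len(operations) < max_batch:
        try:
            message = q.get_nowait()
        except queue.Empty:
            break
        guide_id = json.loads(message)["guide_id"]
        operations.append(load_guide(guide_id))
    if operations:
        post_batch(operations)
    return len(operations)


# Usage with fakes: two guides change, then the consumer runs once.
updates = queue.Queue()
enqueue_update(updates, "g1")
enqueue_update(updates, "g2")
posted = []
count = consume(updates, lambda gid: {"type": "add", "id": gid}, posted.append)
```

Passing the extraction and upload steps in as callables keeps the consumer loop testable without AWS credentials; a real daemon would also need retry and message-deletion handling, which this sketch omits.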
  • 13. Performance: SSL is painful
  • 14. Performance: but physical proximity (us-west-1) is awesome
  • 15. Future work
      • Add more domains (users, new features)
      • Search-based suggestion engine
      • Improved ranking/scoring: crawl our social graph
  • 16. Questions? www.snapguide.com