Snapguide - CloudSearch

  1. 1. Share what you know. Sam Kimbrel, sam@snapguide.com, Software Engineer. Monday, April 1, 13
  2. 2. What is Snapguide? • 1.5 million uniques/month • ~2000 reqs/min across app and web • Python (Pyramid/uWSGI/nginx) • MySQL/Redis • Built primarily on AWS: EC2, RDS, S3, SQS, SNS, CloudSearch, CloudFront (daniel@snapguide.com • confidential do not distribute • Monday, April 1, 13)
  6. 6. Snapguide on CloudSearch • Beta trial users after mentioning Solr on the phone (seriously!) • Primary data set: guides • Facets: guide topic, “featured” boolean, visibility/ACL flags • “autocomplete” search (more later)
  7. 7. {
          "lang": "en",
          "fields": {
            "step_count": "14",
            "author_external_id": "qS878yliQ4mxg_9uHt2AZg",
            "author": "Claire Hesseltine",
            "items": [ "Preheat oven to 325 degrees Fahrenheit.", ... ],
            "title": "Make Brown Butter Sea Salt Cookies",
            "featured": 1,
            "summary": "The brown butter adds a nutty, caramel-like taste to these delicious cookies.",
            "topic": [ "desserts" ],
            "main_image_uuid": "43d201c8fd4b4833b83d3f95d112f1c1",
            "like_count": 761,
            "public": "true"
          },
          "version": 1364333310,
          "type": "add",
          "id": "9eabff97e32c4244a8205da3fba442e9"
        }
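
A document shaped like the sample on slide 7 could be assembled in Python roughly as follows. This is only a sketch: `build_add_document`, `build_delete_document`, and `build_batch` are hypothetical helper names, and using a Unix timestamp as the version is an assumption suggested by the sample value, not something the slides state.

```python
import json
import time


def build_add_document(guide_id, fields, version=None, lang="en"):
    """Return one 'add' operation shaped like the sample document."""
    if version is None:
        # Assumption: a Unix timestamp serves as an increasing version number.
        version = int(time.time())
    return {
        "type": "add",
        "id": guide_id,
        "version": version,
        "lang": lang,
        "fields": fields,
    }


def build_delete_document(guide_id, version=None):
    """Return a 'delete' operation for a guide that should leave the index."""
    if version is None:
        version = int(time.time())
    return {"type": "delete", "id": guide_id, "version": version}


def build_batch(operations):
    """Serialize several operations as one JSON array."""
    return json.dumps(list(operations))
```

Passing an explicit `version` keeps the output deterministic, which is convenient when re-sending the same update.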
  8. 8. Queries • Guide text search: q=cookies • Guide search with topic: q=cookies&facet=topic&bq=topic:'desserts' • “Typeahead”/suggestion search: bq=(or 'paper flower' 'paper flower*')
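
The three query shapes on slide 8 can be built as plain parameter dictionaries. The sketch below uses only the standard library; `SEARCH_URL` is a placeholder rather than a real endpoint, the function names are hypothetical, and the backslash escaping in `quote_term` is an assumption about how quotes inside a term should be handled.

```python
from urllib.parse import urlencode

# Placeholder only; a real search endpoint would go here.
SEARCH_URL = "http://search-example.invalid/search"


def quote_term(term):
    """Wrap a term in single quotes, escaping backslashes and quotes (assumed rule)."""
    escaped = term.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def text_search(text):
    """Guide text search: q=cookies"""
    return {"q": text}


def topic_search(text, topic):
    """Guide search restricted to one topic, with topic facet counts."""
    return {"q": text, "facet": "topic", "bq": "topic:" + quote_term(topic)}


def typeahead_search(prefix):
    """Match the exact phrase or anything that starts with it."""
    exact = quote_term(prefix)
    wildcard = quote_term(prefix + "*")
    return {"bq": "(or %s %s)" % (exact, wildcard)}


def search_url(params):
    """Percent-encode the parameters onto the placeholder URL."""
    return SEARCH_URL + "?" + urlencode(params)
```

For example, `typeahead_search("paper flower")` yields the `bq` value shown on the slide, and `search_url` takes care of encoding the quotes and spaces.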
  9. 9. Result Ranking • Use “Compare Rank Expressions” • text_relevance is your friend • Goals: • Boost popular/featured guides • Make title/summary matches worth more than item (supplies, step text) matches
  10. 10. min(
            cs.text_relevance({
              "weights": {"title": 2.5, "author": 1.5, "items": 0.1, "summary": 1.5},
              "default_weight": 1
            }),
            1000)
          + min(200, like_count / 10)
          + 100 * featured
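
To see how the three terms of the rank expression on slide 10 trade off, the arithmetic can be mirrored locally. This sketch is not how CloudSearch evaluates the expression: `rank_score` is a hypothetical function, the text relevance value is passed in as a plain number, and any rounding CloudSearch applies is ignored.

```python
def rank_score(text_relevance, like_count, featured):
    """Mirror the slide's expression with ordinary Python arithmetic."""
    relevance_part = min(text_relevance, 1000)   # weighted relevance, capped at 1000
    popularity_part = min(200, like_count / 10)  # likes add at most 200
    featured_part = 100 * featured               # featured is 0 or 1
    return relevance_part + popularity_part + featured_part
```

With the sample guide's values (`like_count` 761, `featured` 1), the non-text terms add 76.1 + 100. The like bonus stops growing at 2,000 likes, so popularity can nudge the ordering but cannot outweigh a strong text match.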
  11. 11. Offline index updates • Extracting guide data to update document is slow • Remove update from online web request process • Internal-only API endpoints • SQS • queue_consumer daemon
  12. 12. Offline index updates (diagram; components shown: Web server, SQS, Queue consumer, Snapguide DB/Redis, Web server (dedicated to queues), CloudSearch)
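
The offline update path on slides 11 and 12 might look roughly like the sketch below. Several things here are assumptions: the data flow (web request enqueues a message, the consumer hands it to a handler that would call an internal-only endpoint) is inferred from the component list, the message format and function names are hypothetical, and an in-memory `queue.Queue` stands in for SQS so the example runs without AWS credentials.

```python
import json
import queue


def enqueue_index_update(q, guide_id):
    """Called from the web request: record that a guide needs reindexing."""
    q.put(json.dumps({"action": "index_guide", "guide_id": guide_id}))


def consume(q, handler):
    """Drain the queue, passing each decoded message to handler.

    In a real deployment the handler would call the internal-only API
    endpoint that does the slow extraction and posts to the search index.
    Returns the number of messages handled.
    """
    handled = 0
    while True:
        try:
            body = q.get_nowait()
        except queue.Empty:
            return handled
        handler(json.loads(body))
        handled += 1
```

Keeping the enqueue step this small is the point of the design: the web request only records a guide ID, and the expensive extraction happens later, outside the request.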
  13. 13. Performance: SSL is painful
  14. 14. Performance: but physical proximity (us-west-1) is awesome
  15. 15. Future work • Add more domains (users, new features) • Search-based suggestion engine • Improved ranking/scoring: crawl our social graph
  16. 16. Questions? www.snapguide.com
