Revolutionizing Search Advertising with ElasticSearch at Swoop

Search advertising is the only type of online advertising that consistently provides value to users. Swoop is a search advertising company that uses ElasticSearch at the core of its offering. This presentation is from a talk Swoop founder & CTO Sim Simeonov gave at the Boston ElasticSearch meetup.

Published in: Technology, Design



  • 1. Revolutionizing Search Advertising with ElasticSearch
  • 2. Hi, I’m @simeons. I build startups.
  • 3. This hangs at the Swoop office
  • 4. Super brief history of advertising on the Web
  • 5. October 27, 1994
  • 6. Traditional advertising makes the Web suck
  • 7. October, 2000
  • 8. Google AdWords
  • 9. Display Advertising: high volume; low quality; does not optimize for users; low engagement; 16% of users click; 1 in 1,200 ads clicked. Search Advertising: low volume; high quality; optimizes for users; high engagement; 80% of users click; 1 in 40 ads clicked.
  • 10. Search advertising is a real, useful Web service
  • 11. The Battle of the Web
  • 12. Display Advertising: $18 billion, 200 companies. Search Advertising: $20 billion.
  • 13. Display Advertising: $18 → $13 billion, 200 companies. Search Advertising: $20 → $25 billion.
  • 14. Join us: work with people who care; solve insanely hard problems; make the Web better.
  • 15. query → AdWords on SERP → ads
  • 16. What’s in the index?
  • 17. Data model: Advertisers, Campaigns, Ad Groups, Creatives (Ads), Keywords.
  • 18. Creatives don’t match queries. Keywords match queries.
  • 19. What’s in the index? Keyword documents.
  • 20. What is a keyword? A string, e.g., canon d70. A type, which specifies when a keyword matches, e.g., positive phrase; 9 types, each with own analysis pipeline. Inherited filtering criteria, e.g., US-only traffic; also negative keywords.
  • 21. Keyword Types
  • 22. Keyword doc schema: many possible schemas; query dependent; one type vs. many types; query depends on matching model.
  • 23. Matching models: two main approaches, Boolean matching and IR matching. No time to discuss this; gets very geeky/math-y very quickly.
  • 24. Boolean query pattern: for all keyword document fields i, AND together ( “does not have field i” OR ( “has field i” AND “field i satisfies the user query” ) )
  • 25. Keyword ranking: generalized second-price auctions with revenue ordering, minimum prices and user value feedback, tuned for locally envy-free equilibria. P.S. Tends to work best when the moon is full.
  • 26. Search relevance is not enough: “Terrorism: Pursue a certificate in terrorism 100% online. Enroll today. Ads by Google.”
  • 27. Custom ranking algorithm: balance expected “value” trade-offs. User: engagement w/o WTF moments. Advertiser: performance. Publisher/network: revenue. Need external data (CTRs, bounce rates, share of budget, …), with frequent updates to this data.
  • 28. Problem: Lucene not suited for external data access. Expensive to add data to indexes: update == delete + add.
  • 29. Superheroes to the rescue: @antirez, @imotov, elasticsearch-facet-script.
  • 30. General map/reduce with ES (elasticsearch-facet-script). On each shard node: init_script runs once; map_script runs per result; combine_script runs w/ shard results. On the aggregation node: reduce_script sees all results.
  • 31. Congrats! You built nano-AdW0rdz. Deploy to your search portal! What do you mean, you don’t have a search portal???
  • 32. Search advertising for content: Google AdWords for GDN, a.k.a. Google AdSense (GDN == Google Display Network); Bing ContentAds.
  • 33. Search ads Search ads
  • 34. Search ads Where is the query???
  • 35. Build a “query” from the page. Same two models as before: phrase extraction (boolean) and IR matching. Common tools: text analysis/summarization, language modeling. Often involves indexing the pages.
  • 36. There is a catch: AdWords on GDN performs 3-10x worse than AdWords on SERP.
  • 37. Problems: poor targeting accuracy; poor placement locality.
  • 38. Swoop solves these problems: unique real-time extraction & placement (browser/app, Web/mobile); 100+ patent claims; a single page can generate 50+ queries; pixel-perfect placement in content. If there is nothing to say we say nothing.
  • 39. Some metrics: 3 x 3 x 3 ES deployment (data, master, client nodes); 5,000+ rps; < 5ms query execution time. ElasticSearch, Lucene & Redis are fast!
  • 40. Rewards for solving problems: a big sense of accomplishment; business doubling Q-Q; users getting better content; bigger, harder, more important problems.
  • 41. Swoop’s future with ES: deeper into Lucene; more machine learning in ES map/reduce; better query rewriting engine; better content enhancement engine; probabilistic synchronized sharding; much bigger clusters.
  • 42. Thanks! Sim Simeonov @simeons Join us & make the Web better
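
Sketch for slide 24 (Boolean query pattern). The slide states the rule only in words, so what follows is a minimal in-memory illustration in plain Python, not Swoop's implementation and not an ElasticSearch query. The field names (phrase, countries, negative_terms), the request shape and the per-field checks are all hypothetical. The one thing taken from the slide is the pattern itself: a keyword document matches when every field is either absent or satisfied by the user query.

```python
from typing import Any, Callable, Dict

# Hypothetical per-field checks. Each receives the value stored on the keyword
# document and the incoming request, and reports whether the field is satisfied.
FieldCheck = Callable[[Any, Dict[str, Any]], bool]

FIELD_CHECKS: Dict[str, FieldCheck] = {
    # Phrase keyword: the stored phrase must occur in the user query
    # (naive substring test, not a real analysis pipeline).
    "phrase": lambda phrase, req: phrase in req["query"],
    # Inherited filtering criterion, e.g. US-only traffic.
    "countries": lambda countries, req: req["country"] in countries,
    # Negative keywords: none of them may appear as a query token.
    "negative_terms": lambda terms, req: not any(
        t in req["query"].split() for t in terms
    ),
}


def matches(keyword_doc: Dict[str, Any], request: Dict[str, Any]) -> bool:
    """AND over all fields of: (field missing) OR (field present AND satisfied)."""
    return all(
        field not in keyword_doc or check(keyword_doc[field], request)
        for field, check in FIELD_CHECKS.items()
    )
```

A document that carries only a phrase field matches traffic from any country, because the missing countries field short-circuits to true. That is the point of the "does not have field i" branch: optional criteria restrict matching only when they are present.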
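
Sketch for slide 25 (keyword ranking). The slide names a generalized second-price auction with revenue ordering and minimum prices but gives no details, so this is a textbook-style illustration under stated assumptions, not Swoop's ranking code. It assumes each bid carries a click-through-rate estimate, ranks by bid × CTR, and charges each winner the smallest price that would still keep its position, floored at the minimum price. The user value feedback and equilibrium tuning mentioned on the slide are not modeled. The names Bid and run_gsp are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Bid:
    advertiser: str
    bid: float  # maximum price per click the advertiser will pay
    ctr: float  # estimated click-through rate (assumed to be known)


def run_gsp(bids: List[Bid], slots: int, min_price: float) -> List[Tuple[str, float]]:
    """Return (advertiser, price per click) for each filled slot, best slot first."""
    # Bids below the minimum price do not take part in the auction.
    eligible = [b for b in bids if b.bid >= min_price]
    # Revenue ordering: rank by expected revenue per impression, bid * CTR.
    ranked = sorted(eligible, key=lambda b: b.bid * b.ctr, reverse=True)

    results = []
    for i, winner in enumerate(ranked[:slots]):
        if i + 1 < len(ranked):
            runner_up = ranked[i + 1]
            # Lowest price at which the winner's score still ties the runner-up's.
            price = runner_up.bid * runner_up.ctr / winner.ctr
        else:
            # Nobody ranked below: only the minimum price applies.
            price = min_price
        results.append((winner.advertiser, max(price, min_price)))
    return results
```

With revenue ordering, a lower bid can outrank a higher one when its CTR estimate is better, and the price a winner pays never exceeds its own bid.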
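
Sketch for slide 30 (general map/reduce with ES). The four phases named on the slide can be simulated in plain Python to show how data flows between them. This does not call ElasticSearch or the elasticsearch-facet-script plugin; the shards are ordinary lists, and the hit fields (advertiser, bid_cents) and the aggregation itself (summing bids per advertiser) are invented for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, List


def init_script() -> Counter:
    """Runs once per shard: create the shard-local state."""
    return Counter()


def map_script(state: Counter, hit: Dict) -> None:
    """Runs once per matching result on the shard: fold the hit into the state."""
    state[hit["advertiser"]] += hit["bid_cents"]


def combine_script(state: Counter) -> Dict[str, int]:
    """Runs once per shard after all hits: emit a compact partial result."""
    return dict(state)


def reduce_script(shard_results: Iterable[Dict[str, int]]) -> Dict[str, int]:
    """Runs on the aggregation node: merge the partial results from all shards."""
    total: Counter = Counter()
    for partial in shard_results:
        total.update(partial)  # Counter.update adds counts rather than replacing
    return dict(total)


def run(shards: List[List[Dict]]) -> Dict[str, int]:
    """Drive the four phases over a list of shards, each a list of hits."""
    partials = []
    for hits in shards:
        state = init_script()
        for hit in hits:
            map_script(state, hit)
        partials.append(combine_script(state))
    return reduce_script(partials)
```

Only the small per-shard dictionaries travel to the aggregation node, which is why the combine step is worth having even when the reduce step could in principle do all the work.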