
Nondeterministic Software for the Rest of Us


A talk given at GeeCON 2018 in Krakow, Poland.

Classically trained (if you can call it that) software engineers are used to clear problem statements and clear success and acceptance criteria. Need a mobile front-end for your blog? Sure! Support instant messaging for a million concurrent users? No problem! Store and serve 50TB of JSON blobs? Presto!

Unfortunately, it turns out modern software often includes challenges that we have a hard time with: no clear criteria for correctness, no easy way to measure performance, and success that depends on more than green dashboards. Your blog platform had better have a spam filter, your instant messaging service has to have search, and your blobs will inevitably be fed into some data scientist's crazy contraption.

In this talk I'll share my experience of learning to deal with nondeterministic problems, what made the process easier for me, and what I've learned along the way. With any luck, you'll have an easier time of it!



1. NONDETERMINISTIC SOFTWARE FOR THE REST OF US: An exercise in frustration, by Tomer Gabel @ GeeCON 2018, Krakow
2. Case Study #1 • Delver, circa 2007 • We built a search engine • What’s expected? – Performant (<1 sec) – Reliable – Useful
3. Let me take you back… • We applied good old-fashioned engineering • It was kind of great! – Reliability – Fast iteration – Built-in regression suite [Diagram: Spec → Tests → Code → Deployment]
4. Let me take you back… • So yeah, we coded it • And it worked… sort of – It was highly available – It responded within SLA – … but with crap results • Green tests aren’t everything!
5. Furthermore • Not all software can be acceptance-tested – Qualitative/subjective (e.g. search, social feed)
  6. Furthermore • Not all software can be acceptance-tested – Qualitative/subjective (e.g. search, social feed) – Huge input space (e.g. machine vision) Image: Cristian David
  7. Furthermore • Not all software can be acceptance-tested – Qualitative/subjective (e.g. search, social feed) – Huge input space (e.g. machine vision) – Resource-constrained (e.g. Lyft or Uber) Image: rideshareapps.com
8. Takeaway #1: “CORRECT” AND “GOOD” ARE SEPARATE DIMENSIONS
9. Getting Started • For any product of any scale, always ask: – What does success look like? Image: Hole in the Wall, FremantleMedia North America
  10. Getting Started • For any product of any scale, always ask: – What does success look like? – How can I measure success? Image: Hole in the Wall, FremantleMedia North America
  11. Getting Started • For any product of any scale, always ask: – What does success look like? – How can I measure success? • You’re an engineer! – Intuition can’t replace data – QA can’t save your butt Image: Hole in the Wall, FremantleMedia North America
12. What should you measure? • (Un-)fortunately, you have customers • Analyze their behavior – What do they want? – What influences your quality of service? • For a search engine… [Diagram: Query → Skim → Decide → Follow, with Paging and Refinement looping back]
13. Takeaway #2: USERS ARE PART OF YOUR SYSTEM
14. What should you measure? • (Un-)fortunately, you have customers • Analyze their behavior – What do they want? – What influences your quality of service? • For a search engine… [Diagram: the same funnel, with Paging, Refinement and Follow each marked as a signal]
15. What should you measure? Paging – “Not relevant enough”
  16. What should you measure? Paging – “Not relevant enough” Refinement – “Not what I meant”
  17. What should you measure? Paging – “Not relevant enough” Refinement – “Not what I meant” Clickthrough – “Bingo!”
  18. What should you measure? Paging – “Not relevant enough” Refinement – “Not what I meant” Clickthrough – “Bingo!” Bonus: Abandonment – “You suck”
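To make these signals concrete, here is a minimal sketch of how raw session events could be folded into the four signals above and aggregated into dashboard-ready rates. The event names, the session model, and the precedence order (a click outranks a refinement, which outranks paging) are assumptions for illustration, not Delver’s actual schema.

```python
from dataclasses import dataclass

@dataclass
class SearchEvent:
    session_id: str
    kind: str  # one of: "query", "page", "refine", "click" (hypothetical event names)

def classify_session(events: list[SearchEvent]) -> str:
    """Reduce one search session's events to a single quality signal."""
    kinds = {e.kind for e in events}
    if "click" in kinds:
        return "clickthrough"  # "Bingo!"
    if "refine" in kinds:
        return "refinement"    # "Not what I meant"
    if "page" in kinds:
        return "paging"        # "Not relevant enough"
    return "abandonment"       # "You suck"

def signal_rates(sessions: list[list[SearchEvent]]) -> dict[str, float]:
    """Aggregate per-session signals into rates fit for a dashboard."""
    counts: dict[str, int] = {}
    for session in sessions:
        signal = classify_session(session)
        counts[signal] = counts.get(signal, 0) + 1
    total = sum(counts.values()) or 1
    return {signal: n / total for signal, n in counts.items()}
```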
19. Is this starting to look familiar? It should.
20. Well now! • We’ve been having this conversation for years • Mostly with… – Product managers – Business analysts – Data engineers • Guess what? [Diagram: Product Changes → R&D → Deployment → Measurement → Analysis, in a cycle]
  21. Well now! • We’ve been having this conversation for years • Mostly with… – Product managers – Business analysts – Data engineers • Guess what? [Diagram: the same cycle, informed by BI]
22. What can we learn from BI? • Analysis • Experimentation • Iteration – Analysis: be mindful of your users, and talk to your analysts!
  23. What can we learn from BI? • Analysis • Experimentation • Iteration – Experimentation: invest in A/B tests, and prove your improvements!
  24. What can we learn from BI? • Analysis • Experimentation • Iteration – Iteration: establish your baseline, and invest in metric collection and dashboards
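“Prove your improvements” usually means a controlled experiment. As a sketch, assuming clickthrough rate is the success metric, a two-proportion z-test is one simple way to check whether a variant actually beat the control; the sample counts below are invented.

```python
import math

def two_proportion_z(success_a: int, n_a: int, success_b: int, n_b: int) -> float:
    """z-statistic for the difference between two conversion rates."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

def p_value(z: float) -> float:
    """Two-sided p-value under the standard normal distribution."""
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# Did the new ranker improve clickthrough? (made-up counts)
z = two_proportion_z(success_a=410, n_a=10_000, success_b=480, n_b=10_000)
print(f"z = {z:.2f}, p = {p_value(z):.4f}")  # ship only if p clears your threshold
```

Establishing the baseline first matters here: without it, you cannot tell whether the control itself drifted between experiments.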
25. Takeaway #3: SYSTEMS ARE NOT SNAPSHOTS. MEASURE CONTINUOUSLY
26. Hold on to your hats… this isn’t about search engines
27. Case Study #2 • newBrandAnalytics, circa 2011 • A social listening platform – Finds user-generated content (e.g. reviews) – Provides operational analytics
28. Social Listening Platform • A three-stage pipeline [Diagram: Acquisition (3rd-party ingestion, BizDev, web scraping) → Analysis (manual tagging/training, NLP/ML models) → Analytics (dashboards, ad-hoc query/drilldown, reporting)]
  29. Social Listening Platform • A three-stage pipeline • My team focused on data acquisition • Let’s discuss web scraping – Structured data extraction – At scale – Reliability is paramount [Diagram: the same three-stage pipeline]
30. Large-Scale Scraping • A two-pronged problem • Target sites… – Can change at the drop of a hat – Actively resist scraping! • Both are external constraints • Neither can be unit-tested
31. Optimizing for User Happiness • Users consume reviews • What do they want? – Completeness (no missed reviews) – Correctness (no duplicates/garbage) – Timeliness (near real-time) [Diagram: TripAdvisor, Twitter, Yelp, … → Data Acquisition → Data Lake → Reports, Notifications]
32. Putting It Together • How do we measure completeness? • Manually – Costly, time consuming – Sampled (by definition) Image: Keypunching at Texas A&M, Cushing Memorial Library and Archives, Texas A&M (CC-BY 2.0)
  33. Putting It Together • How do we measure completeness? • Manually – Costly, time consuming – Sampled (by definition) • Automatically – Re-scrape a known subset – Produce similarity score
  34. Putting It Together • How do we measure completeness? • Manually – Costly, time consuming – Sampled (by definition) • Automatically – Re-scrape a known subset – Produce similarity score • Same with correctness
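One way the automatic check could look, assuming each review is keyed by a source-qualified ID string; the scoring functions below are illustrative, not the actual newBrandAnalytics pipeline.

```python
def completeness(stored: set[str], rescraped: set[str]) -> float:
    """Share of freshly re-scraped reviews the pipeline had already captured."""
    if not rescraped:
        return 1.0
    return len(stored & rescraped) / len(rescraped)

def similarity_score(stored: set[str], rescraped: set[str]) -> float:
    """Jaccard similarity between the two sets: 1.0 means perfect agreement."""
    if not stored and not rescraped:
        return 1.0
    return len(stored & rescraped) / len(stored | rescraped)

# Hypothetical "source:id" keys for a known subset of one venue's reviews:
stored = {"yelp:123", "yelp:124", "yelp:125"}
rescraped = {"yelp:123", "yelp:124", "yelp:126"}
print(f"completeness = {completeness(stored, rescraped):.2f}")  # 0.67: yelp:126 was missed
print(f"similarity   = {similarity_score(stored, rescraped):.2f}")
```

The same re-scrape supports the correctness check: anything stored but absent from the fresh scrape is a duplicate or garbage candidate.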
35. Putting It Together • Targets do not want to be scraped • Major sites employ: – IP throttling – Traffic fingerprinting • 3rd-party proxies are expensive Image from the movie “UHF”, Metro-Goldwyn-Mayer
36. Putting It Together • What of timeliness? • It’s an optimization problem – Polling frequency determines latency – But polling has a cost – “Good” is a tradeoff
37. Putting It Together • So then, timeliness…? • First, build a cost model – Review acquisition cost – Break it down by source • Next, put together SLAs – Reflect cost in pricing! – Adjust scheduler by SLA
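A minimal sketch of what “adjust scheduler by SLA” could mean in practice: derive each source’s polling interval from its latency SLA, then use the cost model to price that interval. All names, costs, and SLA figures below are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Source:
    name: str
    cost_per_poll: float      # proxies, bandwidth, parsing compute (hypothetical)
    sla_latency_hours: float  # max acceptable delay from posting to ingestion

def polling_interval_hours(source: Source) -> float:
    # Worst-case acquisition latency of a polled source is one full interval,
    # so polling exactly at the SLA latency meets it at the lowest cost.
    return source.sla_latency_hours

def daily_cost(source: Source) -> float:
    polls_per_day = 24 / polling_interval_hours(source)
    return polls_per_day * source.cost_per_poll

for src in (Source("yelp", cost_per_poll=0.05, sla_latency_hours=1),
            Source("tripadvisor", cost_per_poll=0.25, sla_latency_hours=6)):
    print(f"{src.name}: poll every {polling_interval_hours(src)}h, "
          f"${daily_cost(src):.2f}/day")  # feed this number back into pricing
```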
38. Recap 1. “Correct” and “Good” are separate dimensions 2. Users are part of your system 3. Systems are not snapshots. Measure continuously Image: Confused Monkey, Michael Keen (CC BY-NC-ND 2.0)
39. QUESTIONS? Thank you for listening • tomer@tomergabel.com • @tomerg • http://www.tomergabel.com This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
