Nondeterministic Software for the Rest of Us

Tomer Gabel, Consulting Engineer at Substrate Software Services
NONDETERMINISTIC SOFTWARE
FOR THE REST OF US
An exercise in frustration by
Tomer Gabel @ GeeCON 2018, Krakow
Case Study #1
• Delver, circa 2007
• We built a search engine
• What’s expected?
– Performant (<1 sec)
– Reliable
– Useful
Let me take you back…
• We applied good old-fashioned engineering
• It was kind of great!
– Reliability
– Fast iteration
– Built-in regression suite
[Diagram: Spec → Tests → Code → Deployment]
Let me take you back…
• So yeah, we coded it
• And it worked… sort of
– It was highly available
– It responded within SLA
– … but with crap results
• Green tests aren’t everything!
Furthermore
• Not all software can be acceptance-tested
– Qualitative/subjective (e.g. search, social feed)
– Huge input space (e.g. machine vision)
– Resource-constrained (e.g. Lyft or Uber)
Image: Cristian David
Image: rideshareapps.com
Takeaway #1: “CORRECT” AND “GOOD” ARE SEPARATE DIMENSIONS
Getting Started
• For any product of any scale, always ask:
– What does success look like?
– How can I measure success?
• You’re an engineer!
– Intuition can’t replace data
– QA can’t save your butt
Image: Hole in the Wall, FremantleMedia North America
What should you measure?
• (Un)fortunately, you have customers
• Analyze their behavior
– What do they want?
– What influences your quality of service?
• For a search engine…
[Diagram: user flow of Query → Skim → Decide → Follow, with Refinement and Paging branches]
Takeaway #2: USERS ARE PART OF YOUR SYSTEM
[Diagram: the search flow above, now with three of its steps labeled “Signal”]
What should you measure?
Paging
– “Not relevant enough”
Refinement
– “Not what I meant”
Clickthrough
– “Bingo!”
Bonus: Abandonment
– “You suck”
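One way to turn these four behaviors into numbers is to label every logged query by what the user did next, then track the share of each label over time. The sketch below is illustrative only: the `QueryEvent` record, its field names and the rule that a click outranks a refinement, which outranks paging, are assumptions made for the example, not the schema Delver used.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Sequence

SIGNALS = ("clickthrough", "refinement", "paging", "abandonment")


@dataclass(frozen=True)
class QueryEvent:
    """One query and what the user did afterwards (hypothetical log record)."""
    clicked_result: bool  # the user followed a result
    refined: bool         # the user rewrote the query
    paged: bool           # the user asked for another page of results


def classify(event: QueryEvent) -> str:
    """Map a query to exactly one signal; the precedence order is an assumption."""
    if event.clicked_result:
        return "clickthrough"  # "Bingo!"
    if event.refined:
        return "refinement"    # "Not what I meant"
    if event.paged:
        return "paging"        # "Not relevant enough"
    return "abandonment"       # "You suck"


def signal_rates(events: Sequence[QueryEvent]) -> Dict[str, float]:
    """Share of queries per signal; an empty log yields all zeros."""
    counts = Counter(classify(e) for e in events)
    total = len(events)
    return {s: (counts[s] / total if total else 0.0) for s in SIGNALS}
```

Feeding a day of events through `signal_rates` gives four numbers that can go straight onto a dashboard; a falling clickthrough share or a rising abandonment share is the kind of change a green test suite will not show.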
Is this starting to look familiar?
It should.
Well now!
• We’ve been having this conversation for years
• Mostly with…
– Product managers
– Business analysis
– Data engineers
• Guess what?
[Diagram: a cycle of Product Changes → R&D → Deployment → Measurement → Analysis, informed by BI]
What can we learn from BI?
• Analysis
– Be mindful of your users
– Talk to your analysts!
• Experimentation
– Invest in A/B tests
– Prove your improvements!
• Iteration
– Establish your baseline
– Invest in metric collection and dashboards
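"Prove your improvements" can be made concrete with a significance test on an A/B experiment. The sketch below compares a success rate (for example clickthrough) between a control and a variant using a pooled two-proportion z-test. The choice of test, the function name and the parameters are assumptions made for illustration; the slides do not say which statistical method was used.

```python
import math
from typing import Tuple


def two_proportion_z_test(
    successes_a: int, trials_a: int, successes_b: int, trials_b: int
) -> Tuple[float, float]:
    """Return (z, two-sided p-value) for H0: A and B share the same success rate."""
    if trials_a <= 0 or trials_b <= 0:
        raise ValueError("both variants need at least one trial")
    rate_a = successes_a / trials_a
    rate_b = successes_b / trials_b
    # Pooled rate under the null hypothesis that there is no difference.
    pooled = (successes_a + successes_b) / (trials_a + trials_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / trials_a + 1 / trials_b))
    if std_err == 0:
        # Every trial succeeded (or failed) in both variants: no evidence either way.
        return 0.0, 1.0
    z = (rate_b - rate_a) / std_err
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

A positive `z` with a small p-value suggests the variant really does better than the baseline. The normal approximation is only reasonable with plenty of traffic in both arms, which is one more reason to establish a baseline before experimenting.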
Takeaway #3: SYSTEMS ARE NOT SNAPSHOTS. MEASURE CONTINUOUSLY
Hold on to your hats
… this isn’t about search engines
Case Study #2
• newBrandAnalytics, circa 2011
• A social listening platform
– Finds user-generated content (e.g. reviews)
– Provides operational analytics
Social Listening Platform
• A three-stage pipeline
– Acquisition: 3rd party ingestion, BizDev, web scraping
– Analysis: manual tagging/training, NLP/ML models
– Analytics: dashboards, ad-hoc query/drilldown, reporting
• My team focused on data acquisition
• Let’s discuss web scraping
– Structured data extraction
– At scale
– Reliability is paramount
Large-Scale Scraping
• A two-pronged problem
• Target sites…
– Can change at the drop of a hat
– Actively resist scraping!
• Both are external constraints
• Neither can be unit-tested
Optimizing for User Happiness
• Users consume reviews
• What do they want?
– Completeness (no missed reviews)
– Correctness (no duplicates/garbage)
– Timeliness (near real-time)
[Diagram: sources (TripAdvisor, Twitter, Yelp, …) feed Data Acquisition; also shown are the Data Lake, Reports and Notifications]
Putting It Together
• How do we measure completeness?
• Manually
– Costly, time-consuming
– Sampled (by definition)
• Automatically
– Re-scrape a known subset
– Produce similarity score
• Same with correctness
Image: Keypunching at Texas A&M, Cushing Memorial Library and Archives, Texas A&M (CC-BY 2.0)
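The automatic approach can be sketched as a comparison between a trusted reference set and a fresh scrape of the same pages. The example assumes every review carries a stable identifier, and it picks Jaccard similarity plus a duplicate rate as the scores; both are illustrative choices, since the slides do not say how the similarity score was computed.

```python
from typing import Dict, Iterable


def scrape_quality(known_ids: Iterable[str], scraped_ids: Iterable[str]) -> Dict[str, float]:
    """Score a re-scrape of a known subset of reviews.

    known_ids:   review IDs verified to exist on the target pages (hypothetical input)
    scraped_ids: review IDs the scraper returned for those pages, duplicates included
    """
    scraped = list(scraped_ids)
    known_set, scraped_set = set(known_ids), set(scraped)
    found = known_set & scraped_set
    union = known_set | scraped_set
    return {
        # Completeness: share of known reviews that the scraper found.
        "completeness": len(found) / len(known_set) if known_set else 1.0,
        # Similarity: Jaccard index, which also penalizes unexpected extras.
        "similarity": len(found) / len(union) if union else 1.0,
        # Correctness proxy: share of scraped rows that repeat an earlier row.
        "duplicate_rate": 1 - len(scraped_set) / len(scraped) if scraped else 0.0,
    }
```

Run on a schedule against the same known subset, these scores double as an early warning: a sudden drop in completeness usually means a target site changed its markup or started blocking the scraper.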
Putting It Together
• Targets do not want to be scraped
• Major sites employ:
– IP throttling
– Traffic fingerprinting
• 3rd party proxies are expensive
Image from the movie “UHF”, Metro-Goldwyn-Mayer
Putting It Together
• What of timeliness?
• It’s an optimization problem
– Polling frequency determines latency
– But polling has a cost
– But polling has a cost
– “Good” is a tradeoff
Putting It Together
• So then, timeliness…?
• First, build a cost model
– Review acquisition cost
– Break it down by source
• Next, put together SLAs
– Reflect cost in pricing!
– Adjust scheduler by SLA
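The two steps above can be sketched as a small calculation: derive the longest polling interval that still meets a latency SLA, then price it per source. Everything here is hypothetical, including the `Source` type, the per-poll costs and the simplification that worst-case latency equals one polling interval plus processing time, with failed or throttled polls ignored.

```python
from dataclasses import dataclass
from typing import Dict, Sequence, Tuple

SECONDS_PER_DAY = 86_400


@dataclass(frozen=True)
class Source:
    name: str
    cost_per_poll: float  # hypothetical unit cost: proxy traffic, compute, etc.


def polling_interval(sla_max_latency_s: float, processing_s: float = 0.0) -> float:
    """Longest interval that meets the SLA.

    A review posted just after a poll waits one full interval before the next
    poll sees it, then needs processing_s to reach the customer.
    """
    interval = sla_max_latency_s - processing_s
    if interval <= 0:
        raise ValueError("SLA is tighter than the processing time allows")
    return interval


def daily_cost(source: Source, interval_s: float) -> float:
    """Cost of polling one source around the clock at a fixed interval."""
    return source.cost_per_poll * SECONDS_PER_DAY / interval_s


def plan(
    sources: Sequence[Source], sla_max_latency_s: float, processing_s: float = 0.0
) -> Dict[str, Tuple[float, float]]:
    """Per source: (polling interval in seconds, cost per day) for one SLA tier."""
    interval = polling_interval(sla_max_latency_s, processing_s)
    return {s.name: (interval, daily_cost(s, interval)) for s in sources}
```

Comparing `plan(sources, 3600)` with `plan(sources, 300)` makes the tradeoff visible: a twelve-fold tighter SLA means twelve times the polling cost, which is the number that belongs in the pricing conversation.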
Recap
1. “Correct” and “Good” are separate dimensions
2. Users are part of your system
3. Systems are not snapshots. Measure continuously
Image: Confused Monkey, Michael Keen (CC BY-NC-ND 2.0)
QUESTIONS?
Thank you for listening
tomer@tomergabel.com
@tomerg
http://www.tomergabel.com
This work is licensed under a Creative
Commons Attribution-ShareAlike 4.0
International License.