What Counts as (Useful) Evidence? 
Lisa Farlow 
Oct 25, 2014
Overview 
• Some definitions 
• What makes for a good hypothesis? 
• What counts as useful evidence? 
• What makes for a good test?
What Counts as Knowledge? 
• Start with a belief/hunch/hypothesis/feeling 
• Gather evidence to justify that belief 
• Knowledge is justified true belief (Plato!)
Types of Evidence 
• Any observation about the world can be 
evidence 
• Before gathering evidence, your hypothesis 
has an initial likelihood of being true 
– Some evidence increases the likelihood 
– Some evidence decreases the likelihood 
– Some evidence is irrelevant to the hypothesis
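One way to make "increases/decreases the likelihood" concrete is Bayes' rule. The sketch below is an illustration only and is not from the talk: the function name `update` and every probability in it are made-up assumptions.

```python
def update(prior, p_e_given_h, p_e_given_not_h):
    """Return P(H | E) from P(H), P(E | H) and P(E | not H) via Bayes' rule."""
    numerator = p_e_given_h * prior
    evidence = numerator + p_e_given_not_h * (1.0 - prior)
    return numerator / evidence


prior = 0.30  # hypothetical initial likelihood of the hypothesis

# Evidence that is more likely when H is true raises the likelihood.
supporting = update(prior, p_e_given_h=0.8, p_e_given_not_h=0.2)

# Evidence that is less likely when H is true lowers it.
undermining = update(prior, p_e_given_h=0.1, p_e_given_not_h=0.6)

# Evidence that is equally likely either way leaves it unchanged.
irrelevant = update(prior, p_e_given_h=0.5, p_e_given_not_h=0.5)
```

The direction of the change depends only on whether the evidence is more or less likely under the hypothesis than without it.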
HYPOTHESES
Start with a Hypothesis 
• A good hypothesis is one in which… 
– You can find evidence to support/deny it 
(testable) 
– It is possible to find evidence that proves the 
hypothesis wrong (falsifiable)
Hypotheses that aren’t testable 
• I will have twice as many feelings in December 
as in November (not measurable) 
• The spell I cast caused my favourite sports 
team to win, but it can only be used once (not 
repeatable) 
• An afterlife exists (not observable)
Hypotheses that aren’t falsifiable 
• A teapot orbits the Sun somewhere in space 
between Earth and Mars 
• There is a ghost in the basement of my parents’ 
house 
• I know a lot of people think the new feature on 
our site is clunky, but some people love it and 
find it extremely useful 
“Absence of evidence is not evidence of absence.” 
(Carl Sagan)
Some Good Hypotheses 
• People located in Alberta spend more time on 
our site than people located in BC 
• Some people struggle to fill out the PDFs on 
our site 
• On average, people will be able to find our 
contact info more quickly if we title the tab 
“Give us a Ring-a-Ding” rather than “Contact 
Us” (though it is a bad idea, it is a good 
hypothesis)
WAYS OF TALKING ABOUT EVIDENCE
Ways of Talking About Evidence 
• All, Some, None 
• All, Only, All & Only 
• Necessary, Sufficient, Necessary & Sufficient
All or None 
• All Xs are Y 
• Ex: All swans are white 
– Confirm: Examine every single swan that has ever 
existed in all of time and space and observe that it 
is white 
– Deny: Find a single swan that is not white 
• Same for All Xs are not Y (i.e., No Xs are Y, a 
“none” statement)
Some (“there exists at least one”) 
• Some Xs are Y 
• Ex: Some dogs are able to talk 
– Confirm: Find a single dog that is able to talk 
– Deny: Examine every single dog and observe that 
none of them can talk
• Same for some Xs are not Y (Not all Xs are Y)
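The asymmetry between "All" and "Some" can be sketched in a few lines. This is an illustrative sketch, not part of the talk: the helper `find_first` and the sample data are hypothetical.

```python
def find_first(items, predicate):
    """Return the first item that satisfies predicate, or None if none does."""
    return next((item for item in items if predicate(item)), None)


swans_seen = ["white", "white", "black", "white"]  # hypothetical observations

# "All swans are white" is denied by a single counterexample.
counterexample = find_first(swans_seen, lambda colour: colour != "white")

dogs_seen = [
    {"name": "Rex", "talks": False},
    {"name": "Fido", "talks": False},
]  # hypothetical observations

# "Some dogs are able to talk" is confirmed by a single witness.
witness = find_first(dogs_seen, lambda dog: dog["talks"])
```

Here `counterexample` is the black swan, so the "All" statement is denied. `witness` is `None`, which only means no talking dog turned up in this sample; it does not deny the "Some" statement unless every dog has been examined.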
Open and Closed Worlds 
• Open world of data 
– You cannot confirm “All” statements or deny “Some” 
statements 
– Ex: Analytics search keywords. “Nobody found your 
site by searching for ‘worst burgers in Edmonton’” 
• Closed world of data 
– It’s possible to examine every piece of data 
– You can confirm “All” statements and deny “Some” 
statements 
– Include closed-world caveats 
• “Of those we interviewed….” 
• “Of those who responded to the survey…”
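The open/closed distinction changes what a check is allowed to report. A minimal sketch, assuming a hypothetical `check_all` helper and made-up survey responses:

```python
def check_all(items, predicate, closed_world):
    """Evaluate an "All Xs are Y" statement against the observed items."""
    if not all(predicate(item) for item in items):
        return "denied"  # one counterexample is enough in either world
    # Without a counterexample, only a closed world allows confirmation.
    return "confirmed" if closed_world else "not denied so far"


responses = [
    {"id": 1, "satisfied": True},
    {"id": 2, "satisfied": True},
]  # hypothetical data

# Closed world: these are all the survey responses there are.
survey_result = check_all(responses, lambda r: r["satisfied"], closed_world=True)

# Open world: the same records are only a slice of what exists.
analytics_result = check_all(responses, lambda r: r["satisfied"], closed_world=False)
```

Even the "confirmed" result needs its caveat when reported: "Of those who responded to the survey, all were satisfied."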
All, Only, All & Only 
• All Xs are Y 
– Confirm: Examine every X within the study and 
observe that it is Y 
– Deny: Find a single X that is not Y
All, Only, All & Only 
• Only Xs are Y 
– Confirm: Examine all things that are Y, observe 
that they are X 
– Deny: Find a non-X that is Y
All, Only, All & Only 
• All and Only Xs are Y 
– Confirm: Examine every X and observe that it is Y 
AND examine everything else within the study 
and observe that it is not Y 
– Deny: Find a single X that is not Y OR find a single 
non-X that is Y
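Within a closed study, the three statements map onto set comparisons. The sketch below is an illustration with hypothetical names and data: `xs` is the set of things that are X, and `ys` is the set of things that are Y.

```python
def all_are(xs, ys):
    """All Xs are Y: every X is also a Y."""
    return xs <= ys


def only_are(xs, ys):
    """Only Xs are Y: every Y is also an X."""
    return ys <= xs


def all_and_only(xs, ys):
    """All and only Xs are Y: the two sets are identical."""
    return xs == ys


xs = {"ann", "bob", "cy"}  # hypothetical: participants who are X
ys = {"ann", "bob"}        # hypothetical: participants who are Y

xs_that_are_not_y = xs - ys  # counterexamples to "All Xs are Y"
non_xs_that_are_y = ys - xs  # counterexamples to "Only Xs are Y"
```

With this data "Only Xs are Y" holds, while "All Xs are Y" and "All and only Xs are Y" are both denied by `cy`.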
Case Study: All and Only 
• You’re running a survey to find out student 
attitudes about the cafeteria serving more 
vegetables. 
• What counts as evidence for/against each 
hypothesis?
Case Study: All and Only 
• Hypothesis: Only female students are willing to 
spend more money on healthier 
options 
– Confirm: Look through all respondents who said 
they were willing to spend more. Determine 
whether 100% of them are girls 
– Deny: Look through all respondents who said 
they were willing to spend more. Find one boy.
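Against survey records, this check is a filter followed by a search for counterexamples. A minimal sketch; the field names and the four respondents are invented for illustration.

```python
respondents = [
    {"id": 1, "gender": "female", "would_spend_more": True},
    {"id": 2, "gender": "male", "would_spend_more": False},
    {"id": 3, "gender": "female", "would_spend_more": False},
    {"id": 4, "gender": "male", "would_spend_more": True},
]  # hypothetical survey data

# Step 1: keep only the respondents who said they would spend more.
willing = [r for r in respondents if r["would_spend_more"]]

# Step 2: any of them who is not female denies the "Only" hypothesis.
counterexamples = [r for r in willing if r["gender"] != "female"]

hypothesis_holds = not counterexamples
```

Respondent 3 (female, not willing to spend more) is not a counterexample: an "Only" statement says nothing about whether all female students are willing.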
Case Study: All and Only 
• Hypothesis: All and only the grade 6 students 
bring healthy snacks from home 
– Confirm: Look through all respondents… 
– Deny: Look through all respondents …
Case Study: All and Only 
• Hypothesis: None of the chemistry students 
think pizza sauce counts as a vegetable 
– Confirm: Look through all respondents… 
– Deny: Look through all respondents….
Necessary & Sufficient Evidence 
• Necessary (If H is true, then E must be 
observed) 
– Being a woman is a necessary condition to 
become a nun 
– Submitting your resume is a necessary condition 
to getting an interview 
– Spending 20+ minutes on the site is necessary for 
you to be classified as a happy, high-traffic user
Necessary & Sufficient Evidence 
• Sufficient (If E is observed, you know H is 
true) 
– If it’s been raining hard for a while, I know the 
ground will be wet 
– If you voted for Don Iveson in the last election, I 
know you are at least 18 years old 
– If you renewed your library books online, I know 
you have a library card
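Both definitions can be checked against observed cases by looking for the one combination each rules out. This is a sketch under assumptions: the helper names and the applicant records are hypothetical, with E as "submitted a resume" and H as "got an interview".

```python
def is_necessary(cases, e, h):
    """E is necessary for H if no observed case has H without E."""
    return all(case[e] for case in cases if case[h])


def is_sufficient(cases, e, h):
    """E is sufficient for H if no observed case has E without H."""
    return all(case[h] for case in cases if case[e])


applicants = [
    {"submitted_resume": True, "got_interview": True},
    {"submitted_resume": True, "got_interview": False},
    {"submitted_resume": False, "got_interview": False},
]  # hypothetical data

necessary = is_necessary(applicants, "submitted_resume", "got_interview")
sufficient = is_sufficient(applicants, "submitted_resume", "got_interview")
```

In this data, submitting a resume is necessary but not sufficient: nobody got an interview without one, but one applicant submitted and was not interviewed. A `True` result only means the observed cases are consistent with the claim, so it confirms the claim only in a closed world of data.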
Case Study: Necessary & Sufficient 
• Hypothesis: An online wizard is a better registration 
process for our car-share program than our mail-in PDF
What are the necessary conditions for success?
Necessary but Insufficient 
• You observe: “very few of the graduating class 
carry federal or provincial student loans” 
• You cannot conclude: “very few students take 
out loans to pay for school” 
• Nor can you conclude: “very few students 
take out federal/provincial loans”
Necessary but Insufficient 
• You observe: “page abc has more pageviews 
than any other page” 
• You cannot conclude: “page abc is the page 
with the most important content”
WHAT MAKES A TEST BAD?
What makes a good test? 
• Open to gathering evidence that falsifies your 
hypothesis 
• Repeatable 
• Minimal biases
Sample Bias 
• The sample does not represent the larger 
population 
– Ex: customer feedback box is likely to be full of 
negative comments, because angry people are 
more motivated to comment 
– Ex: your impression testing showed extremely 
high success, but you sent it to stakeholders who 
had already seen the wireframes
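The feedback-box example can be worked through with simple arithmetic. All of the numbers below are invented for illustration; the point is only that unequal motivation to comment skews the sample.

```python
# Hypothetical population: 1000 customers, 10% of them unhappy.
unhappy_customers = 100
happy_customers = 900

# Hypothetical motivation: unhappy customers comment ten times as often.
comment_rate_unhappy = 0.50
comment_rate_happy = 0.05

negative_comments = unhappy_customers * comment_rate_unhappy
positive_comments = happy_customers * comment_rate_happy

# Share of unhappy people in the population versus in the feedback box.
unhappy_share_population = unhappy_customers / (unhappy_customers + happy_customers)
negative_share_in_box = negative_comments / (negative_comments + positive_comments)
```

Under these assumptions, about half of the comments in the box are negative even though only one customer in ten is unhappy.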
Confirmation Bias 
• Your test is set up to lead people to answer a 
certain way 
– Ex: you ask people what they thought of the meal 
you cooked with the options, “tasty,” “very tasty,” 
and “extremely tasty”
Clever Hans Bias 
• Clever Hans was a horse who appeared to do math but 
was just responding to cues from his testers 
– Common in usability tasks 
– Ex: “Hmm, I would try downloading this form, I 
guess” “Yes, that’s the right one! Next task…”
Anchoring Bias 
• In analysis, anchoring is when the researcher 
clings to the first example or piece of 
information 
• Lessens the impact of gathering more data
Hawthorne Effect 
• People change their behaviour when they are observed 
• Factory workers were more productive when 
the lights were turned brighter. But they were 
also more productive when the lights were 
turned lower. 
– Ex: in usability tests, people are less likely to give 
up on a task than in real life
QUESTIONS?

What Counts as (Useful) Evidence?

  • 1. What Counts as (Useful) Evidence? Lisa Farlow Oct 25, 2014
  • 2. Overview • Some definitions • What makes for a good hypothesis? • What counts as useful evidence? • What makes for a good test?
  • 3. What Counts as Knowledge? • Start with a belief/hunch/hypothesis/feeling • Gather evidence to justify that belief • Knowledge is justified true belief (Plato!)
  • 4. Types of Evidence • Any observation about the world can be evidence • Before gathering evidence, your hypothesis has an initial likelihood of being true – Some evidence increases the likelihood – Some evidence decreases the likelihood – Some evidence is irrelevant to the hypothesis
  • 6. Start with a Hypothesis • A good hypothesis is one in which… – You can find evidence to support/deny it (testable) – It is possible to find evidence that proves the hypothesis wrong (falsifiable)
  • 7. Hypotheses that aren’t testable • I will have twice as many feelings in December as in November (not measurable) • The spell I cast caused my favourite sports team to win, but it can only be used once (not repeatable) • An afterlife exists (not observable)
  • 8. Hypotheses that aren’t falsifiable • A teapot orbits the Sun somewhere in space between Earth and Mars • There is a ghost in the basement of my parents’ house • I know a lot of people think the new feature on our site is clunky, but some people love it and find it extremely useful “Absence of evidence is not evidence of absence.” (Carl Sagan)
  • 9. Some Good Hypotheses • People located in Alberta spend more time on our site than people located in BC • Some people struggle to fill out the PDFs on our site • On average, people will be able to find our contact info more quickly if we title the tab “Give us a Ring-a-Ding” rather than “Contact Us” (though it is a bad idea, it is a good hypothesis)
  • 10. WAYS OF TALKING ABOUT EVIDENCE
  • 11. Ways of Talking About Evidence • All, Some, None • All, Only, All & Only • Necessary, Sufficient, Necessary &Sufficient
  • 12. All or None • All Xs are Y • Ex: All swans are white – Confirm: Examine every single swan that has ever existed in all of time and space and observe that it is white – Deny: Find a single swan that is not white • Same for All Xs are not Y (i.e, No Xs are Y, a “none” statements)
  • 13. Some (“there exists at least one”) • Some Xs are Y • Ex: Some dogs are able to talk – Confirm: Find a single dog that is able to talk – Deny: Examine every single dog and determine if he can talk • Same for some Xs are not Y (Not all Xs are Y)
  • 14. Open and Closed Worlds • Open world of data – You cannot confirm “All” statements or deny “Some” statements – Ex: Analytics search key words. “Nobody found your site by searching for ‘worst burgers in Edmonton’ • Closed world of data – It’s possible to examine every piece of data – You can confirm “All” statements and deny “Some” statements – Include closed-world caveats • “Of those we interviewed….” • “Of those who responded to the survey…”
  • 15. All, Only, All & Only • All Xs are Y – Confirm: Examine every X within the study and observe that it is Y – Deny: Find a single X that is not Y
  • 16. All, Only, All & Only • Only Xs are Y – Confirm: Examine all things that are Y, observe that they are X – Deny: Find a non-X that is Y
  • 17. All, Only, All & Only • All and Only Xs are Y – Confirm: Examine every X and observe that it is Y AND examine everything else within the study and observe that it is not Y – Deny: Find a single X that is not Y OR find a single non-X that is Y
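
One way to picture All, Only, and All & Only is with sets, assuming a closed world in which everything in the study can be listed. The sketch below is illustrative only: the two sets and the function names are hypothetical, and X and Y are represented as the set of things that are X and the set of things that are Y.

```python
def all_xs_are_y(xs, ys):
    """'All Xs are Y': every X is also in Y (X is a subset of Y)."""
    return xs <= ys


def only_xs_are_y(xs, ys):
    """'Only Xs are Y': everything in Y is also an X (Y is a subset of X)."""
    return ys <= xs


def all_and_only_xs_are_y(xs, ys):
    """'All and only Xs are Y': the two sets are identical."""
    return xs == ys


def counterexamples(xs, ys):
    """Evidence that denies the claims: Xs that are not Y, and non-Xs that are Y."""
    return {"x_not_y": xs - ys, "non_x_that_is_y": ys - xs}


# Hypothetical data: X = people with an account, Y = people who logged in.
has_account = {"ann", "bo"}
logged_in = {"ann", "bo", "cy"}

claim_all = all_xs_are_y(has_account, logged_in)            # True
claim_only = only_xs_are_y(has_account, logged_in)          # False: cy
claim_both = all_and_only_xs_are_y(has_account, logged_in)  # False
evidence = counterexamples(has_account, logged_in)
```

A single entry in either counterexample set is enough to deny the All & Only claim, while confirming it requires both sets to be empty.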
  • 18. Case Study: All and Only • You’re running a survey to find out student attitudes about the cafeteria serving more vegetables. • What counts as evidence for/against each hypothesis?
  • 19. Case Study: All and Only • Hypothesis: Only female students are willing to spend more money on healthier options – Confirm: Look through all respondents who said they were willing to spend more. Determine whether 100% of them are girls – Deny: Look through all respondents who said they were willing to spend more. Find one boy.
  • 20. Case Study: All and Only • Hypothesis: All and only the grade 6 students bring healthy snacks from home – Confirm: Look through all respondents… – Deny: Look through all respondents …
  • 21. Case Study: All and Only • Hypothesis: None of the chemistry students think pizza sauce counts as a vegetable – Confirm: Look through all respondents… – Deny: Look through all respondents….
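
The case-study hypotheses can be checked mechanically once the survey responses are treated as a closed world. The following is a sketch under invented assumptions: the four respondents and the field names (`gender`, `chemistry`, `spend_more`, `sauce_is_veg`) are hypothetical, not real survey data.

```python
# Hypothetical survey responses (a closed world: "of those who responded...").
respondents = [
    {"gender": "F", "chemistry": True,  "spend_more": True,  "sauce_is_veg": False},
    {"gender": "F", "chemistry": False, "spend_more": False, "sauce_is_veg": True},
    {"gender": "M", "chemistry": True,  "spend_more": False, "sauce_is_veg": False},
    {"gender": "M", "chemistry": False, "spend_more": False, "sauce_is_veg": True},
]


def only_females_spend_more(rows):
    """'Only female students are willing to spend more':
    look only at the willing respondents and check that each is female."""
    return all(r["gender"] == "F" for r in rows if r["spend_more"])


def no_chem_student_counts_sauce(rows):
    """'None of the chemistry students think pizza sauce counts as a vegetable':
    look only at chemistry students and check that none says it counts."""
    return not any(r["sauce_is_veg"] for r in rows if r["chemistry"])


only_result = only_females_spend_more(respondents)
none_result = no_chem_student_counts_sauce(respondents)

# One male respondent who is willing to spend more denies the "only" claim.
with_counterexample = respondents + [
    {"gender": "M", "chemistry": False, "spend_more": True, "sauce_is_veg": False}
]
only_result_after = only_females_spend_more(with_counterexample)
```

Note that the second respondent, a female student who is not willing to spend more, neither confirms nor denies the "only" hypothesis: she is filtered out before the check runs.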
  • 22. Necessary & Sufficient Evidence • Necessary (If H is true, then E must be observed) – Being a woman is a necessary condition to become a nun – Submitting your resume is a necessary condition to getting an interview – Spending 20+ minutes on the site is necessary for you to be classified as a happy, high-traffic user
  • 23. Necessary & Sufficient Evidence • Sufficient (If E is observed, you know H is true) – If it’s been raining hard for a while, I know the ground will be wet – If you voted for Don Iveson in the last election, I know you are at least 18 years old – If you renewed your library books online, I know you have a library card
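
The difference between necessary and sufficient can also be sketched as two checks over a closed set of cases. This is only an illustration: the patron records and the helper names `is_necessary` and `is_sufficient` are assumptions made for the example, loosely based on the library-card bullet above.

```python
def is_necessary(cases, condition, outcome):
    """condition is necessary for outcome:
    no case shows the outcome without the condition."""
    return all(case[condition] for case in cases if case[outcome])


def is_sufficient(cases, condition, outcome):
    """condition is sufficient for outcome:
    every case with the condition also shows the outcome."""
    return all(case[outcome] for case in cases if case[condition])


# Hypothetical library patrons.
patrons = [
    {"has_card": True,  "renewed_online": True},
    {"has_card": True,  "renewed_online": False},
    {"has_card": False, "renewed_online": False},
]

# Having a card is necessary for renewing online...
card_is_necessary = is_necessary(patrons, "has_card", "renewed_online")
# ...but not sufficient: the second patron has a card and did not renew.
card_is_sufficient = is_sufficient(patrons, "has_card", "renewed_online")
# Renewing online is sufficient for concluding the patron has a card.
renewal_is_sufficient = is_sufficient(patrons, "renewed_online", "has_card")
```

The two checks are mirror images: "E is sufficient for H" over these cases is the same test as "H is necessary for E".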
  • 24. Case Study: Necessary & Sufficient • Hypothesis: An online wizard is a better registration process for our car-share program than our mail-in PDF
  • 25. What are the necessary conditions for success?
  • 26. Necessary but Insufficient • You observe: “very few of the graduating class carry federal or provincial student loans” • You cannot conclude: “very few students take out loans to pay for school” • Nor can you conclude: “very few students take out federal/provincial loans”
  • 27. Necessary but Insufficient • You observe: “page abc has more pageviews than any other page” • You cannot conclude: “page abc is the page with the most important content”
  • 28. WHAT MAKES A TEST BAD?
  • 29. What makes a good test? • Open to gathering evidence that falsifies your hypothesis • Repeatable • Minimal biases
  • 30. Sample Bias • The sample does not represent the larger population – Ex: customer feedback box is likely to be full of negative comments, because angry people are more motivated to comment – Ex: your impression testing showed extremely high success, but you sent it to stakeholders who had already seen the wireframes
  • 31. Confirmation Bias • Your test is set up to lead people to answer a certain way – Ex: you ask people what they thought of the meal you cooked with the options, “tasty,” “very tasty,” and “extremely tasty”
  • 32. Clever Hans Bias • Named for a horse who appeared to do math, but who was just responding to cues from his testers – Common in usability tasks – Ex: “Hmm, I would try downloading this form, I guess” “Yes, that’s the right one! Next task…”
  • 33. Anchoring Bias • In analysis, anchoring is when the researcher clings to the first example or piece of information • Lessens the impact of gathering more data
  • 34. Hawthorne Effect • People change their behaviour when they are observed • Factory workers were more productive when the lights were turned brighter. But they were also more productive when the lights were turned lower. – Ex: in usability tests, people are less likely to give up on a task than in real life

Editor's Notes

  1. UX as a co-op student -- MA in logic, tutored the LSAT -- now I see lots of common errors in interpreting (tree, interview, GA). We have SO MUCH DATA yet have trouble coming to conclusions, and yet still overstate ourselves - Not stats or deductive formal arguments, just about thinking what sorts of evidence prove a hypothesis
  2. Example: Did John commit the murder?
  3. Must be practically testable. Not measurable, not repeatable, not observable
  4. Nothing could contradict it. Impossible exhaustive search would be necessary to prove them wrong. Even one piece of evidence could prove it, but a lack of evidence doesn’t mean it’s right
  5. Going forward, we’re going to assume closed worlds
  6. Test: okay, so I’m a female student, and I am not willing to spend more. Have I disproven your theory?
  7. Example: I’m a female, I’m not spending more. Confirm/deny…?? Irrelevant
  8. Confirm: every person who is gr 6 brings snacks, and every person who brings healthy snacks is gr 6 Deny: a single gr 6 who doesn’t or a single gr 7 who does
  9. Confirm: every chem student says it doesn’t count Deny: find one chem student who says it does
  10. Know about the online option Find the website, navigate to the wizard Complete all steps ACCURATELY Click submit.