Speed Dating Data Set
Vaibhav, Tejasvi, Ritesh, Foram, Mary
Outline
• Introduction & Business Problem
• Description of Data
• Pre-Processing Steps
• Exploratory Techniques & Interesting Observations
• BI Model
• Conclusions
Introduction & Business Problem
Current popular dating apps geared toward young adults do not take
preferences and interests into consideration.
Goal: To create a superior dating app that results in a higher
percentage of dates and relationships.
How: Use data from speed dating events to predict whether users
are compatible.
Description of Data
• Source: Kaggle
• 8,378 Observations from twenty-one speed dating events from
2002 to 2004
• Each observation represents a four-minute date between two
people
• Includes:
  • User demographics
  • User interests/preferences
  • Scorecard for each user
  • Whether each user desired a second date with their partner
Description of Data - Scorecard
Pre-Processing Steps
• Four of the speed dating events used a different ranking method for
their preferences
• For these observations, we used the following method to scale the data
$$\text{Rating}_{\text{scaled}} = \frac{100}{\sum \text{Attribute Ratings}} \times \text{Rating}_{\text{original}}$$
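The rescaling above can be sketched in Python. This is a minimal illustration, assuming each participant's preference ratings are held in a dict keyed by attribute name; the function name `rescale_to_100` and the attribute names are hypothetical and do not come from the slides.

```python
def rescale_to_100(ratings):
    """Scale one participant's attribute ratings so that they sum to 100.

    Implements rating_scaled = 100 / sum(attribute ratings) * rating_original.
    """
    total = sum(ratings.values())
    if total == 0:
        # Assumption: leave all-zero rows unchanged rather than divide by zero.
        return dict(ratings)
    return {attr: 100.0 / total * value for attr, value in ratings.items()}


# Hypothetical participant who rated preferences on a 1-10 scale
# instead of distributing 100 points.
original = {"attractive": 8, "sincere": 6, "intelligent": 6}
scaled = rescale_to_100(original)
```

After rescaling, every participant's preferences sum to 100, so observations from the four events with a different ranking method become comparable with the rest.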
We rejected the following variables:
• Match
• Dec_o
• Num_in_3
Pre-Processing Steps
• For certain models, the following nodes were applied:
  • Impute
    • Mean value replaced blank interval variables
    • Median value replaced blank ordinal variables
  • Replacement
    • Missing values replaced with a ‘.’
  • Variable Transformation
    • Skewed variables transformed using log
  • Variable Selection
    • Computed automatically by SAS
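The imputation and transformation steps above were applied as SAS nodes. As a rough Python equivalent (not the SAS nodes themselves), the sketch below fills missing values with the mean or median and applies a log transform. The helper names, the use of `None` for blanks, and the `log(1 + x)` offset are assumptions made for illustration.

```python
import math
import statistics


def impute(values, strategy):
    """Replace None entries with the mean or median of the observed values."""
    observed = [v for v in values if v is not None]
    if strategy == "mean":
        fill = statistics.mean(observed)
    elif strategy == "median":
        fill = statistics.median(observed)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if v is None else v for v in values]


def log_transform(values):
    """Apply log(1 + x) to reduce right skew.

    Assumption: the +1 offset is used so that zero values stay defined;
    the slides only say "log".
    """
    return [math.log1p(v) for v in values]


# Made-up values for illustration only.
ages = impute([24, None, 30, 27], "mean")      # interval variable
ranks = impute([1, 3, None, 5, 4], "median")   # ordinal variable
skewed = log_transform([0, 9, 99])
```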
Exploratory Techniques & Interesting
Observations
• Overall match rate: 16.5%
• Individual ‘Yes’ rate: 42%
• Age range: 18-55
  • Mean: 26.3
  • Standard deviation: 3.566
  • Skewness: 1.07
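The gap between the two rates is easier to see with a small sketch. It assumes that a match is recorded only when both participants on a date say yes; the data below is made up and the function names are hypothetical.

```python
# Each tuple holds the decisions (1 = yes, 0 = no) of the two people on one date.
# Made-up data for illustration only.
dates = [(1, 1), (1, 0), (0, 1), (0, 0), (1, 0), (0, 0)]


def individual_yes_rate(dates):
    """Share of individual decisions that were 'yes'."""
    return sum(a + b for a, b in dates) / (2 * len(dates))


def match_rate(dates):
    """Share of dates on which both people said 'yes'."""
    return sum(1 for a, b in dates if a == 1 and b == 1) / len(dates)


yes_rate = individual_yes_rate(dates)
matches = match_rate(dates)
```

Because a match needs two yes decisions, the match rate is always at or below the individual yes rate.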
Exploratory Techniques & Interesting
Observations
> Gender
Note:
‘0’ represents female
‘1’ represents male
Exploratory Techniques & Interesting
Observations
> Age
Exploratory Techniques & Interesting
Observations
> Season
BI Model
BI Model Comparison
Model                              Misclassification Rate    True Positive Rate
Replacement + Decision Tree        18.9%                     80.2%
Replacement + Gradient Boosting    18.1%                     75.2%
● A decision tree after replacement is the superior model
○ Although its misclassification rate is slightly higher than that of gradient boosting (18.9% vs. 18.1%), its true positive rate is five percentage points higher (80.2% vs. 75.2%)
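For reference, both metrics in the comparison can be computed from a confusion matrix. The sketch below uses the standard definitions; the counts are made up and are not the models' actual results.

```python
def misclassification_rate(tp, fp, tn, fn):
    """Share of all predictions that were wrong."""
    return (fp + fn) / (tp + fp + tn + fn)


def true_positive_rate(tp, fn):
    """Share of actual 'yes' decisions that the model predicted as 'yes'."""
    return tp / (tp + fn)


# Made-up confusion-matrix counts for illustration only.
tp, fp, tn, fn = 80, 15, 85, 20
error = misclassification_rate(tp, fp, tn, fn)
recall = true_positive_rate(tp, fn)
```

A model can have a slightly worse misclassification rate and still find more of the actual positives, which is the trade-off described above.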
Our BI Model Results
All ratings are on a scale of 1 to 10.
• If the user rates how much they like a person at 8 or higher, rates that person's attractiveness at 7.5 or higher, and rates the probability of a match at 3 or higher, there is an 86.28 percent chance that the user will say yes.
• If the user's liking rating is at least 5.5 and less than 6.5 and the user is from London, England, there is a 100 percent chance of a yes; if the user is from Alabama, Texas, or Argentina, there is a 68.12 percent chance of a no.
• If the user's liking rating is below 5.5 and the user is a lawyer, there is a 93.16 percent chance that the user will say no to the other person. Similarly, if the user is in the field of Informatics or Psychology, the user says no 100 percent of the time, and if the user is a journalist, there is an 83 percent chance of a yes.
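To show how such a rule path reads as code, the sketch below encodes only the first rule. It is an illustration of one branch, not the fitted tree; the function and parameter names are hypothetical, and the probability is the figure reported on the slide.

```python
from typing import Optional


def first_rule_yes_probability(
    like: float, attractiveness: float, match_probability: float
) -> Optional[float]:
    """Return the reported 'yes' probability if the first rule path applies.

    Returns None when the inputs fall outside this path, because the other
    branches of the tree are not encoded here.
    """
    if like >= 8 and attractiveness >= 7.5 and match_probability >= 3:
        return 0.8628  # 86.28 percent, as reported for this path
    return None


covered = first_rule_yes_probability(like=9, attractiveness=8, match_probability=5)
not_covered = first_rule_yes_probability(like=7, attractiveness=8, match_probability=5)
```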
Conclusion
We will use the BI model to build a dating application. An overview of the application:
• User profile
• Suggesting people to users based on their preferences
• User ratings for the suggested profiles
• BI model used to suggest potential partners based on the ratings
• Chat option
• Once there is a significant user base, implement a recommendation system

Editor's Notes

  1. Current apps don't take into consideration: personality; interests/hobbies; political/religious preferences; what users are looking for.
  2. Match: rejected because it is not independent of the decision variable (dec). Dec_o: rejected because it would be double counting. Num_in_3: rejected because more than 90% of observations are missing.