Student’s Alcohol Consumption Analysis
Group 9
Demin; Derrick; Gaurav; Jingya; Ramya; Si
Introduction
Some of the most important new data to emerge on young adult drinking were
collected through a recent nationwide survey, the National Epidemiologic Survey
on Alcohol and Related Conditions (NESARC). According to these data, about 70
percent of young adults, or about 19 million people, consumed alcohol in the year
preceding the survey.
This deck presents a short exploratory data analysis focusing on the alcohol variables in the
Portuguese school dataset. Our main goal is to use data mining to predict school
students’ alcohol consumption and to find the significant factors.
Objective/problem statement
•Build models to predict school students’ drinking behavior during weekdays and
weekends.
•Compare various models and choose the best.
•Find out which factors influence school students’ alcohol consumption and make
sensible recommendations.
Dataset
Data collected through a survey from two classes in two schools in Portugal
33 Variables
Personal e.g. school, sex, age, address, health status, romantic experience, going out with friends,
free time after school
Educational e.g. study time, class failures, intention for higher education, extra-curricular activities,
educational support, number of school absences, grades
Family e.g. mother’s/father’s education, mother’s/father’s job, family size, quality of family relationships,
parents’ cohabitation status
Alcohol Consumption e.g. workday alcohol consumption, weekend alcohol consumption
Data Types
Data preparation
No missing data
Overlapping
Some students take both the Math and the Portuguese class
649 students in the Portuguese class, 395 students in the Math class
Merging data
Matching criterion (columns that must agree in both files)
"school","sex","age","address","famsize","Pstatus","Medu","Fedu","Mjob","Fjob","reason","nursery","internet"
382 students identified in both classes
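The matching step above can be sketched as an inner join on the listed columns. This is a minimal illustration in plain Python, not the code the group ran: `match_students` and `make_row` are hypothetical helpers, and the rows are toy records rather than the survey data.

```python
# Columns listed on the slide as the matching criterion.
KEY_COLUMNS = ["school", "sex", "age", "address", "famsize", "Pstatus",
               "Medu", "Fedu", "Mjob", "Fjob", "reason", "nursery", "internet"]


def match_students(math_rows, por_rows, keys=KEY_COLUMNS):
    """Hypothetical helper: pair rows whose key columns all agree (inner join)."""
    index = {}
    for row in por_rows:
        index.setdefault(tuple(row[k] for k in keys), []).append(row)
    pairs = []
    for row in math_rows:
        for match in index.get(tuple(row[k] for k in keys), []):
            pairs.append((row, match))
    return pairs


def make_row(**overrides):
    """Hypothetical helper: build a toy record with every key column filled in."""
    row = {"school": "GP", "sex": "F", "age": 17, "address": "U",
           "famsize": "GT3", "Pstatus": "T", "Medu": 2, "Fedu": 2,
           "Mjob": "other", "Fjob": "other", "reason": "course",
           "nursery": "yes", "internet": "yes"}
    row.update(overrides)
    return row


# Toy data: only the 16-year-old appears in both class files.
math_rows = [make_row(age=16, Dalc=1), make_row(age=18, sex="M", Dalc=3)]
por_rows = [make_row(age=16, Dalc=1), make_row(age=15, Dalc=2)]
matched = match_students(math_rows, por_rows)
```

Because the join relies on attribute values rather than a student identifier, two different students with identical answers would be paired as well, so the count of matches is best read as an estimate.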
Approaches
The data is used to build two separate models (alcohol consumption on weekdays and on the
weekend)
Target variables: weekday alcohol consumption and weekend alcohol consumption
For weekdays (a more serious issue than the weekend),
Level 1 - acceptable alcohol consumption
Levels 2-5 - unacceptable
For the weekend,
Levels 1 and 2 - acceptable alcohol consumption
Levels 3, 4 and 5 - unacceptable
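The two cut-offs above can be written as a small labelling function. This is a sketch only: the variable names Dalc (weekday) and Walc (weekend) are the dataset's column names, while `label_consumption` and `MAX_ACCEPTABLE` are names made up for this illustration.

```python
# Highest level still counted as acceptable, per the slide.
MAX_ACCEPTABLE = {"Dalc": 1, "Walc": 2}


def label_consumption(level, variable):
    """Map a 1-5 consumption rating to 'acceptable' or 'unacceptable'."""
    if not 1 <= level <= 5:
        raise ValueError("consumption level must be between 1 and 5")
    if level <= MAX_ACCEPTABLE[variable]:
        return "acceptable"
    return "unacceptable"


# The same rating of 2 lands in different classes for the two targets.
weekday_label = label_consumption(2, "Dalc")
weekend_label = label_consumption(2, "Walc")
```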
Techniques Used
Decision Tree
Poor performance ☹
• Overall error rate 38%
• Tried improving the model with a cost matrix (0,25,80,0) →
32% error in predicting unacceptable behavior
• But this raised the error rate for the acceptable class to 44%
REJECTED DECISION TREE
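One common way a cost matrix changes a classifier's decisions is to pick the class with the lowest expected cost instead of the most probable class. The sketch below illustrates that idea only. It assumes that 80 is the cost of calling an unacceptable drinker acceptable and 25 is the cost of the opposite mistake; the slide does not state the layout, and the tree software used may apply the costs differently.

```python
# Assumed layout of the cost matrix (0, 25, 80, 0); this mapping is a guess.
COST = {
    # (actual class, predicted class): cost
    ("acceptable", "acceptable"): 0,
    ("acceptable", "unacceptable"): 25,
    ("unacceptable", "acceptable"): 80,
    ("unacceptable", "unacceptable"): 0,
}
CLASSES = ("acceptable", "unacceptable")


def expected_cost(predicted, p_unacceptable):
    """Average cost of predicting `predicted` given P(unacceptable)."""
    probs = {"acceptable": 1.0 - p_unacceptable,
             "unacceptable": p_unacceptable}
    return sum(probs[actual] * COST[(actual, predicted)] for actual in CLASSES)


def min_cost_class(p_unacceptable):
    """Choose the class whose expected cost is lowest."""
    return min(CLASSES, key=lambda c: expected_cost(c, p_unacceptable))


# Under these costs the decision flips at P(unacceptable) = 25 / (25 + 80).
threshold = 25 / (25 + 80)
```

Under these assumed costs, a student is flagged as unacceptable once the estimated probability exceeds roughly 0.24 rather than 0.5, which is consistent with the trade-off reported above: fewer missed unacceptable cases, more acceptable cases flagged by mistake.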
Neural Network
Poor performance ☹
• Neural network worked best with 15 nodes
• But the error rate is quite high → 53% for unacceptable
class
• Also the error rate for the acceptable class was 22%
REJECTED NEURAL NETWORK
Boosting
Poor performance ☹
• Overall error rate is 25%, which is quite low ☺
• However, 59% of the data is wrongly
classified into unacceptable
• Area under ROC curve is 0.6782
REJECTED BOOSTING
Naïve Bayes
Poor performance ☹
• Overall error rate was 38.46%
• Couldn’t properly classify unacceptable class
• Accuracy was also very low
REJECTED NAÏVE BAYES
Random Forest
Winner ☺
• Unacceptable class error rate was 29%
• The unacceptable class is the one that matters most for this
prediction task
ACCEPTED RANDOM FOREST
Weekday Alcohol Consumption
Input Variables: All variables were used as inputs for the weekday alcohol consumption model except
G1, G2 and weekend alcohol consumption.
Weekend alcohol consumption is excluded to avoid target leakage.
G1, G2 - Grades for the first and second periods. We include G3 (derived from G1 and G2) and drop G1 and G2 to
reduce dependence among the input variables.
Target:
Weekday alcohol consumption
We split the ordinal variable weekday alcohol consumption (Ratings 1 - 5) into two classes:
Acceptable (Rating 1) and
Unacceptable (Ratings 2 - 5)
Weekday Alcohol Consumption
Random Forest Model:
Partitioning:
Training: Validation: Test - 70:15:15
Per-tree sample sizes chosen as 85 and 100 to downsample the acceptable class
No. of trees: 5200
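The partitioning and downsampling settings above can be sketched as follows. This is an illustration under stated assumptions, not the group's code: `stratified_split` and `per_tree_sample` are hypothetical helpers, the class counts are toy numbers, and the assignment of 85 to the unacceptable class and 100 to the acceptable class is a guess, since the slide does not say which size belongs to which class.

```python
import random


def stratified_split(labels, fractions=(0.70, 0.15, 0.15), seed=0):
    """Split row indices into train/validation/test, class by class."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    train, valid, test = [], [], []
    for idx in by_class.values():
        rng.shuffle(idx)
        n_train = round(fractions[0] * len(idx))
        n_valid = round(fractions[1] * len(idx))
        train.extend(idx[:n_train])
        valid.extend(idx[n_train:n_train + n_valid])
        test.extend(idx[n_train + n_valid:])
    return train, valid, test


def per_tree_sample(train_idx, labels, sizes, rng):
    """Draw a bootstrap sample with a fixed number of rows per class."""
    sample = []
    for cls, size in sizes.items():
        pool = [i for i in train_idx if labels[i] == cls]
        sample.extend(rng.choices(pool, k=size))  # sampling with replacement
    return sample


# Toy labels: the acceptable class is the majority, as in the deck.
labels = ["acceptable"] * 600 + ["unacceptable"] * 200
train, valid, test = stratified_split(labels)

# Assumed mapping of the two sample sizes to the classes.
sizes = {"unacceptable": 85, "acceptable": 100}
sample = per_tree_sample(train, labels, sizes, random.Random(1))
```

Each tree would then be grown on its own such sample, so the majority class no longer dominates what any single tree sees.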
Weekday Alcohol Consumption
Random Forest Model:
Overall error 35%
For Unacc class
Precision: 52%
Recall : 70.5%
Weekday Alcohol Consumption
Important Factors:
● Sex being male
● Grades
● Mother’s education
● Going out
● Mother’s job
● Failures
Weekend Alcohol Consumption - Input & Balance
The best model is a balanced random forest:
Ignore the variables Dalc, G1 and G2
The target variable Walc: levels 1-2 “Low” and levels 3-5 “High”
High : Low = 262 : 412 ≈ 39 : 61
Train : Validation : Test = 70 : 15 : 15
Weekend Alcohol Consumption - Number of Trees
The number of trees is 5200
Weekend Alcohol Consumption - Validation
AUC=0.748
Overall error 32%
Precision: 58.5%
Recall : 73.8%
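Confusion matrix (proportions of all cases; rows are actual classes, columns are predicted classes)
Actual   Unac   Accp   Error
Unac     0.31   0.11   0.26
Accp     0.22   0.36   0.37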
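The reported precision, recall and error rates can be recomputed from the four cells of the table. The sketch below assumes the rows are actual classes and the columns predicted classes, with the unacceptable class treated as the positive class; `summarize` is a name made up for this illustration.

```python
# Cell values copied from the slide: proportions of all cases,
# outer key = actual class, inner key = predicted class.
confusion = {
    "unacceptable": {"unacceptable": 0.31, "acceptable": 0.11},
    "acceptable": {"unacceptable": 0.22, "acceptable": 0.36},
}


def summarize(cm, positive="unacceptable"):
    """Compute precision, recall and error rates from a 2x2 confusion table."""
    negative = next(c for c in cm if c != positive)
    tp = cm[positive][positive]   # unacceptable, predicted unacceptable
    fn = cm[positive][negative]   # unacceptable, predicted acceptable
    fp = cm[negative][positive]   # acceptable, predicted unacceptable
    tn = cm[negative][negative]   # acceptable, predicted acceptable
    total = tp + fn + fp + tn
    return {
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "overall_error": (fn + fp) / total,
        "positive_class_error": fn / (tp + fn),
        "negative_class_error": fp / (fp + tn),
    }


metrics = summarize(confusion)
```

With these cells the precision comes out at about 58.5% and the recall at about 73.8%, matching the slide. The overall error comes out at about 33% and the acceptable-class error at about 38%, slightly above the reported 32% and 0.37, which is plausible given that the cells are rounded to two decimals.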
Weekend Alcohol Consumption - Importance
Important Factors:
● Going Out with friends
● Sex
● Grades
● Family Size
● Absences
● Freetime
● Father’s Job
Compare two models
Random forest gave the best predictions for both targets.
For weekday alcohol consumption, the overall error rate is 35%, with an error rate of 29% in the
unacceptable group. However, the AUC is only about 0.69.
For weekend alcohol consumption, the overall error rate is 32%, with an error
rate of 26% in the high-consumption group. The AUC is 0.748.
The weekend model is the better one.
Insights of the models
1. Drinking is a daily behavior
Most of the drinkers drink on both weekends and weekdays. Students tend to drink more on weekends.
2. Mom and dad play important roles at different times
According to the weekday alcohol consumption model, mother’s education and mother’s job are related to
the child’s weekday drinking behavior.
During weekends, father’s job is related to the child’s drinking behavior.
Insights of the models
3. Common factors show up in both models
● Sex --boys tend to drink more than girls
● Grades --kids with lower grades drink more than those with higher grades
● Absences --kids with more absences tend to drink more
● Freetime --kids with more free time tend to drink more
4. Exclusive factors related to alcohol consumption
● Going out with friends --on weekends, peer behavior is related to alcohol consumption
● Family Size --kids from larger families tend to drink less on weekends
● Going out for more time --during weekdays, more free time is related to alcohol
consumption
Recommendation
Family and school are both important.
After running both models on school-related data only and on family-related data only, we found that the
prediction error rate got even higher, which indicates that alcohol consumption behaviour is
related to both aspects. Solving the alcohol consumption problem among high-school
students needs effort from both school and family.
● Educate the students. Reduce negative peer influence. Build their awareness of the harmful
effects of alcohol use.
● Educate the parents, and get parents to keep track of their kids’ after-school behavior.
● Keep tracking the data to build students’ behavior profiles for future prediction.
Recommendation
How to predict better.
Neither model predicts the drinkers group well. We could collect more data from a larger
sample to build a better model. There might be more relevant variables, such as the group the kids
hang out with, how much money they have, or other factors that are not included in the study.
THANK YOU
More support pages
Variables
Variables
Correlation
Weekday Alcohol Consumption
Decision Tree Model:
● Sex being male
● Lower final grade (G3 < 14)
● Going out more
● More absences from class
● Mother’s education level below 1.5 (Medu of 0 or 1)
● Mother’s job other than at home, health or teacher
are the factors that appear to be associated with unacceptable drinking behavior (Ratings 2 - 5)
Weekday Alcohol Consumption
Decision Tree with Loss Matrix:
Loss matrix used: 0,25,80,0
Training: Validation: Test - 70:15:15
Weekday Alcohol Consumption
Decision Tree Model with loss matrix:
Why we chose G3 as the indicator of grade

More Related Content

What's hot (14)

Applications of-linear-algebra-hill-cipher
Applications of-linear-algebra-hill-cipherApplications of-linear-algebra-hill-cipher
Applications of-linear-algebra-hill-cipher
 
Building Caregiver Resilience
Building Caregiver ResilienceBuilding Caregiver Resilience
Building Caregiver Resilience
 
Alcoholism - Sign, Effects, Threats, Treatment
Alcoholism - Sign, Effects, Threats, TreatmentAlcoholism - Sign, Effects, Threats, Treatment
Alcoholism - Sign, Effects, Threats, Treatment
 
Alcohol ppt
Alcohol pptAlcohol ppt
Alcohol ppt
 
Lsd
LsdLsd
Lsd
 
Alcohol & Drug Abuse
Alcohol &  Drug  AbuseAlcohol &  Drug  Abuse
Alcohol & Drug Abuse
 
Alcoholism
AlcoholismAlcoholism
Alcoholism
 
Designing in Context
Designing in ContextDesigning in Context
Designing in Context
 
Cyber security awareness for students
 Cyber security awareness for students Cyber security awareness for students
Cyber security awareness for students
 
Chapter 11 Authentication and Account Management
Chapter 11 Authentication and Account ManagementChapter 11 Authentication and Account Management
Chapter 11 Authentication and Account Management
 
Nicotine in psychiatry
Nicotine in psychiatryNicotine in psychiatry
Nicotine in psychiatry
 
Digital signatures
Digital signaturesDigital signatures
Digital signatures
 
Teen suicide
Teen suicideTeen suicide
Teen suicide
 
Hci activity#2
Hci activity#2Hci activity#2
Hci activity#2
 

Viewers also liked

Business Idea Competition: Miaoguide
Business Idea Competition: Miaoguide Business Idea Competition: Miaoguide
Business Idea Competition: Miaoguide Demin Wang
 
Alcohol : Industry facts, fun facts and other information.
Alcohol : Industry facts, fun facts and other information.Alcohol : Industry facts, fun facts and other information.
Alcohol : Industry facts, fun facts and other information.Cognac Lover
 
Alcohol And Adolescence What Every Educator Should Know
Alcohol And Adolescence   What Every Educator Should KnowAlcohol And Adolescence   What Every Educator Should Know
Alcohol And Adolescence What Every Educator Should KnowSarah Pahl
 
Database Marketing - Dominick's stores in Chicago distric
Database Marketing - Dominick's stores in Chicago districDatabase Marketing - Dominick's stores in Chicago distric
Database Marketing - Dominick's stores in Chicago districDemin Wang
 
Capacitación Ángel Protector 2013
Capacitación Ángel Protector 2013Capacitación Ángel Protector 2013
Capacitación Ángel Protector 2013Red PaPaz
 
Lowering The Drinking Age to 18
Lowering The Drinking Age to 18Lowering The Drinking Age to 18
Lowering The Drinking Age to 18Luko17667
 
Lowering the drinking age
Lowering the drinking ageLowering the drinking age
Lowering the drinking agevalka69
 
Alcohol trends and public attitudes in Ireland
Alcohol trends and public attitudes in IrelandAlcohol trends and public attitudes in Ireland
Alcohol trends and public attitudes in IrelandAlcoholActionIreland
 
Utility and cardinal utility analysis
Utility and cardinal utility analysisUtility and cardinal utility analysis
Utility and cardinal utility analysisSIASDEECONOMICA
 
Alcohol as public health problem
Alcohol as public health problem Alcohol as public health problem
Alcohol as public health problem Dr Praseeda BK
 
The Dangers of Alcohol - MAPEH 8 (Health 4th Quarter)
The Dangers of Alcohol - MAPEH 8 (Health 4th Quarter)The Dangers of Alcohol - MAPEH 8 (Health 4th Quarter)
The Dangers of Alcohol - MAPEH 8 (Health 4th Quarter)Carlo Luna
 
Oracle vs Salesforce.com Case Analysis: Competition on Hosted CRM
Oracle vs Salesforce.com Case Analysis: Competition on Hosted CRMOracle vs Salesforce.com Case Analysis: Competition on Hosted CRM
Oracle vs Salesforce.com Case Analysis: Competition on Hosted CRMDemin Wang
 
Alcohol drug awareness
Alcohol drug awarenessAlcohol drug awareness
Alcohol drug awarenessRCIPS
 
Social Studies SBA
Social Studies SBA Social Studies SBA
Social Studies SBA Quarrie
 

Viewers also liked (20)

Business Idea Competition: Miaoguide
Business Idea Competition: Miaoguide Business Idea Competition: Miaoguide
Business Idea Competition: Miaoguide
 
Alcohol : Industry facts, fun facts and other information.
Alcohol : Industry facts, fun facts and other information.Alcohol : Industry facts, fun facts and other information.
Alcohol : Industry facts, fun facts and other information.
 
Alcohol And Adolescence What Every Educator Should Know
Alcohol And Adolescence   What Every Educator Should KnowAlcohol And Adolescence   What Every Educator Should Know
Alcohol And Adolescence What Every Educator Should Know
 
Database Marketing - Dominick's stores in Chicago distric
Database Marketing - Dominick's stores in Chicago districDatabase Marketing - Dominick's stores in Chicago distric
Database Marketing - Dominick's stores in Chicago distric
 
MLDA to 18
MLDA to 18MLDA to 18
MLDA to 18
 
Capacitación Ángel Protector 2013
Capacitación Ángel Protector 2013Capacitación Ángel Protector 2013
Capacitación Ángel Protector 2013
 
Lowering The Drinking Age to 18
Lowering The Drinking Age to 18Lowering The Drinking Age to 18
Lowering The Drinking Age to 18
 
Lowering the drinking age
Lowering the drinking ageLowering the drinking age
Lowering the drinking age
 
Alcohol trends and public attitudes in Ireland
Alcohol trends and public attitudes in IrelandAlcohol trends and public attitudes in Ireland
Alcohol trends and public attitudes in Ireland
 
Alcoholism
AlcoholismAlcoholism
Alcoholism
 
Utility and cardinal utility analysis
Utility and cardinal utility analysisUtility and cardinal utility analysis
Utility and cardinal utility analysis
 
Alcohol as public health problem
Alcohol as public health problem Alcohol as public health problem
Alcohol as public health problem
 
Utility analysis ppt
Utility analysis pptUtility analysis ppt
Utility analysis ppt
 
20 facts about alcohol consumption
20 facts about alcohol consumption20 facts about alcohol consumption
20 facts about alcohol consumption
 
Pharmacology of Alcohol
Pharmacology of Alcohol Pharmacology of Alcohol
Pharmacology of Alcohol
 
The Dangers of Alcohol - MAPEH 8 (Health 4th Quarter)
The Dangers of Alcohol - MAPEH 8 (Health 4th Quarter)The Dangers of Alcohol - MAPEH 8 (Health 4th Quarter)
The Dangers of Alcohol - MAPEH 8 (Health 4th Quarter)
 
Oracle vs Salesforce.com Case Analysis: Competition on Hosted CRM
Oracle vs Salesforce.com Case Analysis: Competition on Hosted CRMOracle vs Salesforce.com Case Analysis: Competition on Hosted CRM
Oracle vs Salesforce.com Case Analysis: Competition on Hosted CRM
 
Earthsoft say no to alcohol- stop alcohol
Earthsoft say no to alcohol- stop alcoholEarthsoft say no to alcohol- stop alcohol
Earthsoft say no to alcohol- stop alcohol
 
Alcohol drug awareness
Alcohol drug awarenessAlcohol drug awareness
Alcohol drug awareness
 
Social Studies SBA
Social Studies SBA Social Studies SBA
Social Studies SBA
 

Similar to Student’s Alcohol Consumption Data Analysis

SAP BW Project
SAP BW ProjectSAP BW Project
SAP BW ProjectAli Asad
 
ADEPIS seminar - Effekt (Orebro Prevention Programme) - N. Koutakis
ADEPIS seminar - Effekt (Orebro Prevention Programme) - N. KoutakisADEPIS seminar - Effekt (Orebro Prevention Programme) - N. Koutakis
ADEPIS seminar - Effekt (Orebro Prevention Programme) - N. KoutakisMentor
 
AlcoholEdu Impact Report
AlcoholEdu Impact Report AlcoholEdu Impact Report
AlcoholEdu Impact Report Katie Mitchell
 
Perception Final plans book
Perception Final plans bookPerception Final plans book
Perception Final plans bookJocelyn Martinez
 
MarketingResearchFinalProject
MarketingResearchFinalProjectMarketingResearchFinalProject
MarketingResearchFinalProjectMarissa Garcia
 
HPEB300 Alcohol Abuse Powerpoint-FINAL
HPEB300 Alcohol Abuse Powerpoint-FINALHPEB300 Alcohol Abuse Powerpoint-FINAL
HPEB300 Alcohol Abuse Powerpoint-FINALAbby Hoke
 
U101 Alcohol Presentation - 50 Minute Class
U101 Alcohol Presentation - 50 Minute ClassU101 Alcohol Presentation - 50 Minute Class
U101 Alcohol Presentation - 50 Minute ClassMike Dial
 
To predict the academic performance of an elementary school using Linear Regr...
To predict the academic performance of an elementary school using Linear Regr...To predict the academic performance of an elementary school using Linear Regr...
To predict the academic performance of an elementary school using Linear Regr...Kamalika Some
 
Gasps orientation master ppt final-revised 12-7-11(1)
Gasps orientation master ppt final-revised 12-7-11(1)Gasps orientation master ppt final-revised 12-7-11(1)
Gasps orientation master ppt final-revised 12-7-11(1)progroup
 
Substance Abuse among Adolescents: 1. Prevalence and Patterns of Alcohol Use ...
Substance Abuse among Adolescents: 1. Prevalence and Patterns of Alcohol Use ...Substance Abuse among Adolescents: 1. Prevalence and Patterns of Alcohol Use ...
Substance Abuse among Adolescents: 1. Prevalence and Patterns of Alcohol Use ...iosrphr_editor
 
Asteriadis_Rivera_2014_NV_State_Gambling_Conference_4.9.14
Asteriadis_Rivera_2014_NV_State_Gambling_Conference_4.9.14Asteriadis_Rivera_2014_NV_State_Gambling_Conference_4.9.14
Asteriadis_Rivera_2014_NV_State_Gambling_Conference_4.9.14Carina Rivera
 
Substance Abuse Allegan, Michigan
Substance Abuse Allegan, MichiganSubstance Abuse Allegan, Michigan
Substance Abuse Allegan, Michiganrecoveryrestart2
 

Similar to Student’s Alcohol Consumption Data Analysis (20)

SAP BW Project
SAP BW ProjectSAP BW Project
SAP BW Project
 
ADEPIS seminar - Effekt (Orebro Prevention Programme) - N. Koutakis
ADEPIS seminar - Effekt (Orebro Prevention Programme) - N. KoutakisADEPIS seminar - Effekt (Orebro Prevention Programme) - N. Koutakis
ADEPIS seminar - Effekt (Orebro Prevention Programme) - N. Koutakis
 
AlcoholEdu Impact Report
AlcoholEdu Impact Report AlcoholEdu Impact Report
AlcoholEdu Impact Report
 
Perception Final plans book
Perception Final plans bookPerception Final plans book
Perception Final plans book
 
MarketingResearchFinalProject
MarketingResearchFinalProjectMarketingResearchFinalProject
MarketingResearchFinalProject
 
October 2019 Directors Meeting
October 2019 Directors MeetingOctober 2019 Directors Meeting
October 2019 Directors Meeting
 
Headstrong's "My World" survey
Headstrong's "My World" surveyHeadstrong's "My World" survey
Headstrong's "My World" survey
 
HPEB300 Alcohol Abuse Powerpoint-FINAL
HPEB300 Alcohol Abuse Powerpoint-FINALHPEB300 Alcohol Abuse Powerpoint-FINAL
HPEB300 Alcohol Abuse Powerpoint-FINAL
 
Coalition Orientation to Public
Coalition Orientation to PublicCoalition Orientation to Public
Coalition Orientation to Public
 
Legalization of Marijuana: Challenges Facing College Campuses
Legalization of Marijuana: Challenges Facing College CampusesLegalization of Marijuana: Challenges Facing College Campuses
Legalization of Marijuana: Challenges Facing College Campuses
 
U101 Alcohol Presentation - 50 Minute Class
U101 Alcohol Presentation - 50 Minute ClassU101 Alcohol Presentation - 50 Minute Class
U101 Alcohol Presentation - 50 Minute Class
 
To predict the academic performance of an elementary school using Linear Regr...
To predict the academic performance of an elementary school using Linear Regr...To predict the academic performance of an elementary school using Linear Regr...
To predict the academic performance of an elementary school using Linear Regr...
 
Alcohol # 1 concern march 16 2016
Alcohol # 1 concern march 16 2016Alcohol # 1 concern march 16 2016
Alcohol # 1 concern march 16 2016
 
March 2021 Directors Meeting
March 2021 Directors MeetingMarch 2021 Directors Meeting
March 2021 Directors Meeting
 
City on Science - Feb 2020
City on Science - Feb 2020City on Science - Feb 2020
City on Science - Feb 2020
 
Gasps orientation master ppt final-revised 12-7-11(1)
Gasps orientation master ppt final-revised 12-7-11(1)Gasps orientation master ppt final-revised 12-7-11(1)
Gasps orientation master ppt final-revised 12-7-11(1)
 
Substance Abuse among Adolescents: 1. Prevalence and Patterns of Alcohol Use ...
Substance Abuse among Adolescents: 1. Prevalence and Patterns of Alcohol Use ...Substance Abuse among Adolescents: 1. Prevalence and Patterns of Alcohol Use ...
Substance Abuse among Adolescents: 1. Prevalence and Patterns of Alcohol Use ...
 
Asteriadis_Rivera_2014_NV_State_Gambling_Conference_4.9.14
Asteriadis_Rivera_2014_NV_State_Gambling_Conference_4.9.14Asteriadis_Rivera_2014_NV_State_Gambling_Conference_4.9.14
Asteriadis_Rivera_2014_NV_State_Gambling_Conference_4.9.14
 
Slide Design
Slide DesignSlide Design
Slide Design
 
Substance Abuse Allegan, Michigan
Substance Abuse Allegan, MichiganSubstance Abuse Allegan, Michigan
Substance Abuse Allegan, Michigan
 

Recently uploaded

Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraDeakin University
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxnull - The Open Security Community
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAndikSusilo4
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?XfilesPro
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsHyundai Motor Group
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 

Recently uploaded (20)

Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning era
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 

Student’s Alcohol Consumption Data Analysis

  • 1. Student’s Alcohol Consumption Analysis Group 9 Demin; Derrick; Gaurav; Jingya; Ramya; Si
  • 2. Introduction Some of the most important new data to emerge on young adult drinking were collected through a recent nationwide survey, the National Epidemiologic Survey on Alcohol and Related Conditions (NESARC). According to these data, about 70 percent of young adults or about 19 million people, consumed alcohol in the year preceding the survey. Short exploratory data analysis focusing on the alcohol variables from the Portuguese school dataset. Our main goal is using Data Mining To Predict School Student Alcohol Consumption and finding the significant factors.
  • 3. Objective/problem statement •Build models to predict school students’ drinking behavior during weekdays and weekends. •Compare various models and choose the best. •Find out which factors are influential to school students’ alcohol consumption – sensible recommendations were made.
  • 4. Dataset Data collected through a survey from two classes in two schools in Portugal 33 Variables Personal e.g. school, sex, age, address, health status, romantic experience, going out with friends, free time after school Educational e.g. study time, class failures, intention for higher education, extra-curricular activities, educational support, number of school absences, grades Family e.g. mother/father’s education, mother/father’s job, family size, quality of family relationship, parent’s cohabitation status Alcohol Consumption e.g. workday alcohol consumption, weekend alcohol consumption Data Types
  • 5. Data preparation No missing data Overlapping Students taking both math and portuguese class 649 students in Portuguese class, 395 students in Math class Merging data Criterion "school","sex","age","address","famsize","Pstatus","Medu","Fedu","Mjob","Fjob","reason","nurs ery","internet" 382 students identified
  • 6. Approaches The data is used to build 2 different models (alcohol consumption for weekdays and for the weekend). Target variables: weekday alcohol consumption and weekend alcohol consumption. For weekdays (a more serious issue than the weekend): Level 1 - acceptable alcohol consumption; Levels 2-5 - unacceptable. For the weekend: Levels 1 and 2 - acceptable alcohol consumption; Levels 3, 4, 5 - unacceptable
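The two target definitions on slide 6 amount to different cut-offs on the same 1-5 scale. A minimal sketch, with hypothetical function names:

```python
def weekday_label(level):
    """Weekday rule from the slide: level 1 is acceptable, 2-5 unacceptable."""
    if level not in (1, 2, 3, 4, 5):
        raise ValueError("consumption level must be an integer from 1 to 5")
    return "acceptable" if level == 1 else "unacceptable"


def weekend_label(level):
    """Weekend rule from the slide: levels 1-2 are acceptable, 3-5 unacceptable."""
    if level not in (1, 2, 3, 4, 5):
        raise ValueError("consumption level must be an integer from 1 to 5")
    return "acceptable" if level <= 2 else "unacceptable"


# The same level 2 is judged differently on weekdays and at the weekend.
weekday_result = weekday_label(2)
weekend_result = weekend_label(2)
```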
  • 7. Techniques Used. Decision Tree: poor performance ☹ • Overall error rate 38% • Tried improving the model with a cost matrix (0,25,80,0), which gave a 32% error in predicting unacceptable behavior • But this increased the error rate for the acceptable class to 44%. REJECTED DECISION TREE. Neural Network: poor performance ☹ • The neural network worked best with 15 nodes • But the error rate is quite high: 53% for the unacceptable class • The error rate for the acceptable class was 22%. REJECTED NEURAL NETWORK. Boosting: poor performance ☹ • Overall error rate is 25%, which is fairly low ☺ • However, 59% of the data is wrongly classified as unacceptable • Area under ROC curve is 0.6782. REJECTED BOOSTING. Naïve Bayes: poor performance ☹ • Overall error rate was 38.46% • Couldn’t properly classify the unacceptable class • Accuracy was also very low. REJECTED NAÏVE BAYES
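The decision-tree cost matrix (0,25,80,0) can be read as a rule for trading one kind of error against the other. The slides do not say which error each cost belongs to, so the sketch below assumes 80 is the cost of calling an unacceptable drinker acceptable and 25 is the cost of the opposite mistake. It shows only the expected-cost decision at prediction time, not how a tree learner uses the costs while growing splits.

```python
# Assumed reading of the slide's cost matrix (0,25,80,0); the mapping of
# 25 and 80 to the two error types is an assumption, not stated in the deck.
COST_FALSE_ALARM = 25  # actual acceptable, predicted unacceptable
COST_MISS = 80         # actual unacceptable, predicted acceptable


def predict_with_costs(p_unacceptable,
                       cost_false_alarm=COST_FALSE_ALARM,
                       cost_miss=COST_MISS):
    """Pick the label with the lower expected cost.

    p_unacceptable is a model's estimated probability that the student
    belongs to the unacceptable class.
    """
    expected_cost_if_accept = p_unacceptable * cost_miss
    expected_cost_if_flag = (1.0 - p_unacceptable) * cost_false_alarm
    if expected_cost_if_flag < expected_cost_if_accept:
        return "unacceptable"
    return "acceptable"


# With these costs the decision threshold drops from 0.5 to 25 / (25 + 80).
threshold = COST_FALSE_ALARM / (COST_FALSE_ALARM + COST_MISS)
label_at_30_percent = predict_with_costs(0.30)
label_at_20_percent = predict_with_costs(0.20)
```

Under this reading, a student is flagged once the estimated probability exceeds roughly 0.24. That is consistent with the slide's observation that the unacceptable-class error fell while the acceptable-class error rose.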
  • 8. Random Forest Winner ☺ • Unacceptable class error rate was 29% • And the unacceptable class is very important for the prediction of the model ACCEPTED RANDOM FOREST
  • 9. Weekday Alcohol Consumption Input Variables: All variables were used as inputs for the weekday alcohol consumption model except G1, G2 and weekend alcohol consumption. Weekend alcohol consumption is excluded to avoid target leakage. G1, G2 - grades for the first and second periods. We include G3 (derived from G1 and G2) and exclude G1 and G2 to keep the input variables independent. Target: Weekday alcohol consumption. We split the ordinal variable weekday alcohol consumption (ratings 1 - 5) into Acceptable (rating 1) and Unacceptable (ratings 2 - 5)
  • 10. Weekday Alcohol Consumption Random Forest Model: Partitioning: Training : Validation : Test = 70:15:15. Sample size set to 85,100 to downsample the acceptable class. Number of trees: 5200
  • 11. Weekday Alcohol Consumption Random Forest Model: Overall error 35%. For the unacceptable class: Precision 52%, Recall 70.5%
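The 70:15:15 partition and the per-class sample sizes on slide 10 can be sketched with the standard library alone. The function names, the seed, the synthetic labels, and the assignment of 85 to the unacceptable class and 100 to the acceptable class are all assumptions: the slide gives only "85,100" and says the acceptable class is downsampled.

```python
import random


def split_70_15_15(rows, seed=42):
    """Shuffle the rows and cut them into train / validation / test parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = int(0.70 * len(shuffled))
    n_valid = int(0.15 * len(shuffled))
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]  # remainder, roughly 15%
    return train, valid, test


def sample_per_class(rows, label_of, sizes, rng):
    """Draw sizes[label] rows with replacement from each class.

    This mimics the per-tree bootstrap of a balanced random forest,
    where the majority class is drawn less often than its share.
    """
    by_class = {}
    for row in rows:
        by_class.setdefault(label_of(row), []).append(row)
    sample = []
    for label, size in sizes.items():
        sample.extend(rng.choices(by_class[label], k=size))
    return sample


# 382 stand-in row ids with synthetic labels (illustration only).
rows = list(range(382))
label_of = lambda i: "unacceptable" if i % 3 == 0 else "acceptable"

train, valid, test = split_70_15_15(rows)
# Assumed mapping of the slide's "85,100" to the two classes.
sizes = {"unacceptable": 85, "acceptable": 100}
one_tree_sample = sample_per_class(train, label_of, sizes, random.Random(0))
```

Each tree then sees 185 rows with a far more even class mix than the training set, which is what pushes the forest to pay attention to the minority class.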
  • 13. Weekday Alcohol Consumption Important Factors: ● Sex being male ● Grades ● Mother’s education ● Going out ● Mother’s job ● Failures
  • 14. Weekend Alcohol Consumption - Input & Balance The best model is Balanced Random Forest. Ignore the variables Dalc, G1 & G2. The target value walc: 1-2 “Low” & 3-5 “High”. High : Low = 262 : 412 = 38 : 62. Train : Validation : Test = 70 : 15 : 15
  • 15. Weekend Alcohol Consumption - Number of Trees The number of trees is 5200
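  • 16. Weekend Alcohol Consumption - Validation AUC = 0.748. Overall error 32%. Precision: 58.5%. Recall: 73.8%. Confusion matrix (proportions; rows = actual class, columns = predicted class): actual Unac: predicted Unac 0.31, predicted Accp 0.11, error 0.26; actual Accp: predicted Unac 0.22, predicted Accp 0.36, error 0.37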
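The precision and recall on slide 16 follow directly from the confusion-matrix proportions. The sketch below recomputes them, treating "unacceptable" as the positive class; the variable names are illustrative. Because the matrix entries are rounded to two decimals, the recomputed overall error (0.33) and acceptable-class error (about 0.38) differ slightly from the 32% and 0.37 shown on the slide.

```python
# Validation proportions from slide 16 (actual class -> predicted class).
unac_as_unac, unac_as_accp = 0.31, 0.11
accp_as_unac, accp_as_accp = 0.22, 0.36

# Precision: of everything predicted unacceptable, how much really was.
precision = unac_as_unac / (unac_as_unac + accp_as_unac)

# Recall: of everything actually unacceptable, how much was caught.
recall = unac_as_unac / (unac_as_unac + unac_as_accp)

# Per-class error rates and the overall error.
unac_error = unac_as_accp / (unac_as_unac + unac_as_accp)
accp_error = accp_as_unac / (accp_as_unac + accp_as_accp)
overall_error = unac_as_accp + accp_as_unac

print(f"precision={precision:.3f} recall={recall:.3f}")
print(f"unac_error={unac_error:.2f} accp_error={accp_error:.2f} "
      f"overall_error={overall_error:.2f}")
```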
  • 17. Weekend Alcohol Consumption - Importance Important Factors: ● Going out with friends ● Sex ● Grades ● Family Size ● Absences ● Freetime ● Father’s Job
  • 18. Compare two models Random forest predicts the data best in both models. For weekday alcohol consumption, the overall error rate is 35%, with an error rate of 29% in the unacceptable group. However, its AUC is only about 0.69. For weekend alcohol consumption, the overall error rate is 32%, with an error rate of 26% in the high-consumption group. Its AUC is 0.748. The weekend model is the better one.
  • 19. Insights of the models 1. Drinking is a daily behavior: most drinkers drink on both weekends and weekdays, and students tend to drink more on weekends. 2. Mom and dad play important roles at different times. According to the weekday alcohol consumption model, mother’s education and mother’s job are related to the child’s weekday drinking behavior, while father’s job is related to weekend drinking behavior.
  • 20. Insights of the models 3. Common factors show up in both models ● Sex -- boys tend to drink more than girls ● Grades -- kids with lower grades drink more than those with higher grades ● Absences -- kids with more absences tend to drink more ● Freetime -- kids with more free time tend to drink more 4. Factors exclusive to one model ● Going out with friends -- on weekends, peer behavior is related to alcohol consumption ● Family Size -- kids from larger families tend to drink less on weekends ● Going out for more time -- during weekdays, more free time is related to alcohol consumption
  • 21. Recommendation Family and school are both important. After running both models on school-related data only and on family-related data only, we found that the prediction error rate got even higher, which indicates that alcohol consumption behaviour is related to both aspects. Solving the alcohol consumption problem among high-school students needs effort from both school and family. ● Educate the students. Reduce negative peer impacts. Build their awareness of the harmful effects of alcohol use. ● Educate the parents, and get parents to keep track of their kids’ after-school behavior. ● Keep track of the data to build students’ behavior profiles for future prediction.
  • 22. Recommendation How to predict better. Neither model predicts the drinkers group well. We could collect more data from a larger sample to build a better model. There might be more relevant variables, such as the group the kids hang out with, how much money they have, or other factors not included in the study.
  • 28. Weekday Alcohol Consumption Decision Tree Model: ● Sex being male ● Lower final grade (G3 < 14) ● Going out more ● More absences from class ● Mother’s education level below 1.5 (Medu) ● Mother’s job other than at home, health or teacher are the factors that seemed to be associated with unacceptable drinking behavior (Ratings 2 - 5)
  • 29. Weekday Alcohol Consumption Decision Tree with Loss Matrix: Loss matrix used: 0,25,80,0 Training: Validation: Test - 70:15:15
  • 30. Weekday Alcohol Consumption Decision Tree Model with loss matrix:
  • 31. Why we chose G3 as the indicator of grade