SAP BW Project
1. DATA WAREHOUSE AND BI
Presented by
Ali Asad, Amaldas AS, Ankita Banerjee & Israa Tolson
To
Prof. Monica Luxemburg
Alcohol consumption among students
A3I data analyst
Business Consulting Master
2. AGENDA
Company profile & Objective
Problem statement & Business Case
KPI description
Star schema
Excel prototype
SAP BW & Data flows
Qlikview Analysis
Conclusion
Recommendation
Lessons learnt
3. COMPANY PROFILE
FOUNDED: 2012
INDUSTRY: IT/Data analytics
HEADQUARTERS: Stuttgart, Germany
BUSINESS PROCESS: Customer oriented
STRATEGY: Growth and innovation
CUSTOMERS: Open to all kinds of projects, with clients
ranging from banking to retail.
4. OBJECTIVE
Inspecting, cleansing, transforming, and modelling data with the goal of
discovering useful information, suggesting conclusions, and supporting
decision-making.
Emphasizing the generation of actionable insights that lead to tangible
improvements in constituent touch points across the organization.
Improve the member experience and deliver more value to customers.
5. PROBLEM STATEMENT
An NGO approached us to analyze data measuring alcohol consumption
among students and the main factors that play a role in it.
Statistical analysis of several attributes, such as behavioral elements,
that could lead students to consume alcohol.
7. KEY PERFORMANCE INDICATOR
Alcohol consumption (Weekend + Weekday consumption)
First Term grades
Second Term grades
Final grade (average of term 1 and 2)
• Dimensions:
Time
Student
Subject
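The two derived KPIs above can be sketched in a few lines of Python. This is only an illustration: the record layout and field names are hypothetical, and the 1 to 5 consumption scale is taken from the later slides.

```python
from dataclasses import dataclass


@dataclass
class StudentRecord:
    # Field names are hypothetical; the slides only name the KPIs.
    weekday_alcohol: int   # weekday consumption level, 1 (low) to 5 (high)
    weekend_alcohol: int   # weekend consumption level, 1 (low) to 5 (high)
    grade_term1: float
    grade_term2: float


def total_alcohol(rec: StudentRecord) -> int:
    """Alcohol consumption KPI: weekend plus weekday consumption."""
    return rec.weekday_alcohol + rec.weekend_alcohol


def final_grade(rec: StudentRecord) -> float:
    """Final grade KPI: average of the term 1 and term 2 grades."""
    return (rec.grade_term1 + rec.grade_term2) / 2


# Made-up example record, not project data.
example = StudentRecord(weekday_alcohol=1, weekend_alcohol=3,
                        grade_term1=12, grade_term2=14)
print(total_alcohol(example), final_grade(example))
```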
19. ALCOHOL CONSUMPTION WEEKEND
At a weekend alcohol consumption level of 5, the pass
percentage changes only slightly, to 53.57%. Weekend
consumption generally does not affect the grades
much.
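A pass percentage per consumption level could be computed roughly as follows. This is a sketch under stated assumptions: the slides do not say which grade counts as a pass, so the threshold of 10 is assumed, and the sample rows are made up for illustration.

```python
from collections import defaultdict

PASS_MARK = 10  # assumption: the slides do not state the pass threshold


def pass_percentage_by_level(rows):
    """rows: iterable of (alcohol_level, grade) pairs.

    Returns {alcohol_level: percentage of students with grade >= PASS_MARK}.
    """
    passed = defaultdict(int)
    total = defaultdict(int)
    for level, grade in rows:
        total[level] += 1
        if grade >= PASS_MARK:
            passed[level] += 1
    return {level: round(100 * passed[level] / total[level], 2)
            for level in sorted(total)}


# Made-up rows for illustration only, not the project data.
sample = [(1, 14), (1, 11), (1, 8), (5, 9), (5, 12)]
print(pass_percentage_by_level(sample))
```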
20. NAMING CONVENTION
Dimensions
Student
  Student_ID           A05CHSTDID
  Age                  A05CHAGE
  Sex                  A05CHSEX
  Family_status        A05CHFAMS
  Pstatus              A05CHPSTAT
  studytime            A05CHSTUDT
  failures             A05CHFAIL
  activities           A05CHACT
  absences             A05CHABS
Subject
  Subject_id           A05CHSID
  Subject_name         A05CHSNAM
Time
  Term                 A05CHTerm
  Term_Desc            A05CHTerm
Key Figures
  Grade_1              A05KFGRD
  Alcholo_cons_W       A05KFACWD
  Alchol_cons_weekend  A05KFACWE
  Avg_alc_cons         A05KFAVGC
  Average_grade        A05KFAVGG
  students             A05KFSTD
  pass/fail            A05KFPORF
Data Sources
  Student              A05_STD_NSP_05
  Subject              A05_SUB_NSP_05
  Time                 A05_TIM_NSP_05
  Fact Table           A05_FT_NSP_05
Info Packages
  Student              A05_STD_NSP_05
  Subject              A05_SUB_NSP_05
  Time                 A05_TIM_NSP_05
  Fact Table           A05_FT_NSP_05
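The technical names on this slide appear to follow three patterns: A05CH... for characteristics, A05KF... for key figures, and A05_..._NSP_05 for data sources and info packages. A small checker can make that visible. The patterns below are inferred from the names shown here and may not match the project's actual rules.

```python
import re

# Patterns inferred from the names on the slide; the real project rules may differ.
PATTERNS = {
    "characteristic": re.compile(r"A05CH[A-Za-z]+"),
    "key_figure": re.compile(r"A05KF[A-Z]+"),
    "data_source": re.compile(r"A05_[A-Z]{2,3}_NSP_05"),
}


def classify(name):
    """Return the object type whose pattern the name matches, or None."""
    for kind, pattern in PATTERNS.items():
        if pattern.fullmatch(name):
            return kind
    return None


for name in ["A05CHSTDID", "A05KFACWD", "A05_FT_NSP_05"]:
    print(name, classify(name))
```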
23. DATA FLOWS
• Data flow for transaction data (grades)
• Data flow for the students dimension
24. DATA FLOWS (CONTINUED)
• Data flow for subjects
• Data flow for term
25. ANALYSIS OF DATA IN QLIKVIEW
• We examine the following parameters in our analysis and try to
draw meaningful conclusions from them.
1. Grades vs weekday alcohol intake
2. Grades vs weekend alcohol intake
3. Family size vs alcohol consumption
4. Parents status vs alcohol consumption
5. Study time vs alcohol consumption
6. Student activities vs alcohol consumption
7. Past failures vs alcohol consumption
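Comparisons 3 to 7 in the list above all come down to counting students per attribute value and alcohol consumption level. The project did this in Qlikview; the Python sketch below only illustrates the idea, with hypothetical record keys and made-up records.

```python
from collections import Counter


def cross_tab(records, attribute, level_key="alcohol"):
    """Count students per (attribute value, alcohol level) pair."""
    return Counter((rec[attribute], rec[level_key]) for rec in records)


# Made-up records for illustration only; the keys are hypothetical.
records = [
    {"famsize": "GT3", "alcohol": 1},
    {"famsize": "GT3", "alcohol": 5},
    {"famsize": "LE3", "alcohol": 1},
    {"famsize": "GT3", "alcohol": 1},
]
counts = cross_tab(records, "famsize")
for (value, level), n in sorted(counts.items()):
    print(value, level, n)
```

The same function works for parent status, study time, activities, or past failures by passing a different attribute name.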
26. DATA IN QLIKVIEW
• For our analysis we use Term 1 data for
the subjects Portuguese and
Mathematics.
31. MORE INSIGHT INTO WEEKDAY CONSUMPTION
[Charts: alcohol consumption of scale 1; alcohol consumption of scale 5]
32. MORE INSIGHT INTO WEEKEND CONSUMPTION
[Charts: alcohol consumption of scale 1; alcohol consumption of scale 5]
33. FAMILY SIZE AND ALCOHOL CONSUMPTION
• Family size GT3 means a family of more than 3; LE3 means a family of 3 or fewer.
• From the report we find that alcohol consumption takes place mainly among
students whose family size is greater than 3.
• The conclusion drawn is that loneliness or feeling left out (which may
happen if the family size is 3 or fewer) is not a reason for the student to consume
alcohol.
Subject: Portuguese, Term 1
34. PARENT STATUS AND ALCOHOL CONSUMPTION
• Parent status T stands for together and A for separated.
• Looking at the highest alcohol consumption levels, 4 and 5, we find,
interestingly, that the parents of these students are mostly separated.
35. STUDY TIME AND RELATIONSHIP WITH ALCOHOL CONSUMPTION
• Study time is on a scale of 1 to 4: 1 is the lowest (< 2 hours) and 4 the highest (> 10 hours).
• From the report we find that alcohol consumption at every level from 1 to 5 occurs
mainly among students with the lowest study times, i.e. 1 and 2.
• The highest alcohol consumption levels, 4 and 5, occur only among students with a study
time of 1 or 2.
36. EXTRACURRICULAR ACTIVITIES VS ALCOHOL CONSUMPTION
• The difference is marginal, so drawing any conclusion from this attribute is not
feasible.
37. PAST FAILURES AND RELATIONSHIP WITH ALCOHOL CONSUMPTION
• Past failures are measured on a scale of 0 to 3 (0 lowest, 3 highest).
• Concentrating on the highest alcohol consumption levels, 4 and 5, we find a higher
count of students with 2 or 3 failures.
38. CONCLUSION
Average alcohol consumption data does not help us draw any suitable
conclusion, because weekend alcohol consumption does not affect the grades of the
students much.
Weekday alcohol consumption shows a clear effect: the pass percentage of
students with alcohol consumption level 5 is 42.46%, whereas at the lowest
consumption level of 1 it is 83.37%.
The family size of most of the students who consume alcohol is greater than 3.
The parents of the majority of students with high alcohol consumption are
separated.
Students with higher alcohol consumption have less study time.
Student activities show a marginal difference, but students with less activity
time still consume more alcohol.
Students with more past failures are prone to more alcohol consumption.
39. RECOMMENDATIONS
• We have provided the NGO with our analysis. We recommend that parents and
teachers look into the study time and activity time of the students, and that
parents also look into the family atmosphere.
“Because building a better world starts with raising healthy, happy and empowered
children.”
40. LESSONS LEARNT
• Data quality plays an important role in correct decision making.
Some rows were populated with names of "not assigned"; for data quality we changed the
tag to 0 to make more sense.
• Aggregated data is not always meaningful; segregation can be more useful in certain
situations.
E.g.: Average alcohol consumption vs. weekday consumption.
• Multiple analyses are possible with a multidimensional star schema.
E.g.: The initial scope of the project was to analyze alcohol consumption against grades, but
with the multidimensional data model we could identify more attributes for analysis.
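The data quality fix mentioned in the first lesson (replacing "not assigned" entries with 0) could look roughly like this. It is a minimal sketch: the exact placeholder spelling and the column names are assumptions, and the slides do not say at which stage of the load the replacement was done.

```python
PLACEHOLDER = "not assigned"  # assumed spelling of the placeholder value


def clean_value(value):
    """Replace a 'not assigned' placeholder with "0"; leave other values unchanged."""
    if isinstance(value, str) and value.strip().lower() == PLACEHOLDER:
        return "0"
    return value


def clean_row(row):
    """Apply clean_value to every field of one record (a dict)."""
    return {key: clean_value(val) for key, val in row.items()}


# Made-up row with hypothetical column names.
print(clean_row({"activities": "Not assigned", "age": 17}))
```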
41. PROJECT PLAN
Project phase / Task                                            Start date  End date    Progress
Preparation
  Setting topic and scope                                       18.10.2016  24.10.2016  100%
  Business case                                                 25.10.2016  07.11.2016  100%
  Defining KPIs                                                 25.10.2016  07.11.2016  100%
Modelling and Prototyping
  Identifying and collecting data sources                       08.11.2016  14.11.2016  100%
  Analysing the data sources                                    15.11.2016  21.11.2016  100%
  Defining the star schema                                      22.11.2016  28.11.2016  100%
  Creating Excel prototype                                      29.11.2016  05.12.2016  100%
Data warehouse Implementation
  Creating infocubes and dimensions                             06.12.2016  12.12.2016  100%
  Uploading of data                                             13.12.2016  19.12.2016  100%
  Creating connection with DWH                                  20.12.2016  25.12.2016  100%
BI Implementation
  Extracting the data query from SAP BW using the QV extractor  05.01.2017  06.01.2017  100%
  Reloading of query in Qlikview and generating reports         07.01.2017  10.01.2017  100%
Testing
  Testing for the project
  Testing reports in Qlikview
Editor's Notes
Email the PPT and Excel files to the professor after the final presentation.
PDF version in Felix after the presentation.
More details, more marks.
Add legend.
Include the fact table and dimensions from Excel as well.
Create a link on the slide.
Add legend.
SAP screenshots of the work (create InfoCube, ETL, upload data), one slide per step; one slide about the naming convention.
Data flow chart explaining the ETL process into the InfoCube.
No conclusion as such that separated parents mean more alcohol or lower grades.
Is study time 1 the highest or the lowest?
Again, marginal difference.
Females drinking more.
Family status: this leads us to the conclusion that alcohol consumption is not due to a feeling of loneliness among students.
Parents status: the conclusion drawn is that the family situation may lead to distress and alcohol consumption.
Study time: most of the students who consume alcohol have a study time of less than 4 hours, which gives them a lot of leisure time and leads to alcohol consumption and lower grades.
Student activities: students with no activities are prone to alcohol. Maybe they do not take part in extracurricular activities, which leads them to alcohol consumption.
Past failures: students with more failures in their past record tend to consume more alcohol than students with 0 or few failures, though 446 of the 1044 students have 0 failures and consume the least alcohol, which is about 42.7% of students.