Statistical Measurement, Analysis & Research
Zhang Kexin (kz2159)
Final Project Presentation
Self-Introduction
Hi, this is Roxie Zhang. I like rock music and stand-up
comedy, and I’m currently considering forming a rock
band. I would like to pursue a career in brand
marketing, where it is increasingly necessary to use data
analysis tools or to know how to work with data analysts.
It was such a pleasure to study with my
classmates this semester. Here is my
presentation reviewing this course, Statistical
Measurement, Analysis & Research.
GitHub repo link:
https://colab.research.google.com/github/kexinez/NYU_Integrated_Marketing
Kaggle Notebook link: https://www.kaggle.com/kexinezhang/women-clothing-ecommerce-analysis
LinkedIn URL: www.linkedin.com/kexin-zhang-972823166
Lessons
What I’ve learned:
• I think hypothesis testing will be useful for checking whether
two items tend to co-occur in a customer’s
basket. Regression models can be used to predict the
number of items sold or other figures. More importantly,
I’m happy to have learned how to find the most valuable customer
segment. I can even visualize the analysis results for others
now.
• I guess the most valuable thing I gained from this class
is that I’m no longer so unconfident about data analysis. I used
to freak out at the prospect of having to deal with
data processing. Now I would actually get
hands-on and try to make progress, or learn what I need by
searching the Internet.
Research Design
• Dataset: Individual medical costs billed by health insurance for over 1,330
beneficiaries in the US, together with their basic information.
• URL: https://www.kaggle.com/artaseyedian/predicting-health-insurance-charge-with-tidymodels
• Key variables: age of the insurant, sex, BMI, number of children,
smoker status, region, individual medical costs billed by health insurance
• Research Design: To explore whether medical costs are associated
with the number of children and with BMI, I will fit linear
regression models in a notebook hosted on GitHub. The research will help marketers identify
customers with lower medical costs, so that the insurance company can
target this group and keep its costs low.
Data Preparation
• Sample: 1,338 insurants of a health insurance company
• I conducted a preliminary data inspection that grouped the
customers by their region in the US. The
southeastern region not only accounts for the largest share of customers, it also has
the highest percentage of smokers relative to non-smokers.
• https://datastudio.google.com/reporting/4c2b0b69-782f-4690-ab78-42b944d1fa27
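The regional breakdown described above can be sketched in a few lines of Python. This is only an illustration: the rows below are hypothetical stand-ins rather than the actual insurance data, and the function name region_summary is an assumption introduced here.

```python
from collections import Counter

# Hypothetical (region, smoker) rows standing in for the insurance data.
rows = [
    ("southeast", "yes"), ("southeast", "no"), ("southeast", "yes"),
    ("southwest", "no"), ("southwest", "no"),
    ("northeast", "no"), ("northeast", "yes"),
    ("northwest", "no"),
]

def region_summary(rows):
    """Return {region: (share_of_customers, smoker_rate_within_region)}."""
    totals = Counter(region for region, _ in rows)
    smokers = Counter(region for region, smoker in rows if smoker == "yes")
    n = len(rows)
    return {
        region: (count / n, smokers[region] / count)
        for region, count in totals.items()
    }

summary = region_summary(rows)
for region, (share, smoker_rate) in sorted(summary.items()):
    print(f"{region:10s} share={share:.2f} smoker_rate={smoker_rate:.2f}")
```

Comparing the two numbers per region shows both which region holds the most customers and which has the highest proportion of smokers.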
Reproduce
Regression analysis
The first graph shows no clear
linear relationship between bmi and
charges.
The second graph shows no linear
relationship between age and bmi.
• According to the regression results, the p-value for children is
0.015 (< 0.05), so we can reject the null hypothesis that charges are not
associated with children.
• The p-value for bmi is approximately 0 (< 0.05), so we can reject the null
hypothesis that charges are not associated with bmi.
• Both children and bmi appear to be positively associated with charges.
• children has the larger estimated effect on charges, but it also has the higher standard error.
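As a rough illustration of how coefficients, standard errors and p-values like those above are obtained, here is a minimal sketch in Python using only NumPy. The data are synthetic stand-ins, not the insurance dataset: the true coefficients (600 and 350), the noise level and the sample size are assumptions chosen for illustration, and the p-values use a normal approximation rather than the exact t distribution.

```python
import math
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Synthetic stand-ins for the real columns (assumed values, for illustration only).
children = rng.integers(0, 6, size=n).astype(float)
bmi = rng.normal(30.0, 6.0, size=n)
charges = 1000.0 + 600.0 * children + 350.0 * bmi + rng.normal(0.0, 3000.0, size=n)

def ols(X, y):
    """Least-squares fit of y = X @ beta; returns beta, standard errors, p-values."""
    n_obs, n_par = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n_obs - n_par)      # unbiased error variance
    cov = sigma2 * np.linalg.inv(X.T @ X)         # covariance of the estimates
    se = np.sqrt(np.diag(cov))
    t_stats = beta / se
    # Two-sided p-values from the normal approximation to the t distribution.
    p = np.array([math.erfc(abs(t) / math.sqrt(2.0)) for t in t_stats])
    return beta, se, p

# The column of ones provides the intercept term.
X = np.column_stack([np.ones(n), children, bmi])
beta, se, p = ols(X, charges)

for name, b, s, pv in zip(["const", "children", "bmi"], beta, se, p):
    print(f"{name:9s} coef={b:10.2f} se={s:8.2f} p={pv:.4f}")
```

A p-value below 0.05 for a coefficient is what justifies rejecting the null hypothesis that the coefficient is zero; a larger standard error means the estimate is less precise even when it is significant.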
Insights
• At the 95% confidence level, we can say that the two
variables (children and bmi) are associated with
charges.
• This gives us clues about what insurants with high charges
look like in these two respects (more children and
a higher BMI). That is to say, for the
insurance company to make a profit, we should
attract insurants who have fewer children and a lower
BMI.
Assumptions Check
Assumptions:
• 1. Satisfied: The error term is
approximately normally distributed.
• 2. Satisfied: The means of the
normal distributions of Y, given X, lie
on a straight line with slope b.
• 3. Not satisfied: The mean of
the error term is not 0.
• 4. Not satisfied: The variance of the
error term is not constant.
• 5. Satisfied: The error terms are
uncorrelated. In other words, the
observations are drawn
independently.
• 6. Satisfied: The independent variables in X are not correlated, so there is no
multicollinearity issue.
The p-value is 0.631 (> 0.05), so at the 0.05 significance level we
cannot reject the null hypothesis that the independent variables are
uncorrelated.
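The checks above can be approximated numerically. The sketch below is a minimal illustration in Python on synthetic data whose noise is deliberately made to grow with bmi (an assumption, so that the non-constant variance shows up); the variable names and the split-in-half variance comparison are choices made here, not the method used in the slides. One point worth keeping in mind: for a least-squares fit that includes an intercept, the sample residuals average to zero by construction, so a clearly non-zero residual mean usually suggests the model was fitted without an intercept or the check was run on something other than the fitted residuals.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 400

# Synthetic stand-ins; the noise spread grows with bmi (heteroscedastic by design).
children = rng.integers(0, 6, size=n).astype(float)
bmi = rng.normal(30.0, 6.0, size=n)
noise = rng.normal(0.0, 1.0, size=n) * (5.0 * bmi ** 2)
charges = 1000.0 + 600.0 * children + 350.0 * bmi + noise

X = np.column_stack([np.ones(n), children, bmi])
beta, *_ = np.linalg.lstsq(X, charges, rcond=None)
fitted = X @ beta
resid = charges - fitted

# Assumption 3: mean of the residuals (zero by construction with an intercept).
resid_mean = resid.mean()

# Assumption 4: crude constant-variance check, comparing residual variance
# in the upper half of fitted values with the lower half.
order = np.argsort(fitted)
low, high = resid[order[: n // 2]], resid[order[n // 2:]]
var_ratio = high.var() / low.var()

# Assumption 5: Durbin-Watson statistic; values near 2 suggest uncorrelated errors.
durbin_watson = np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)

# Assumption 6: Pearson correlation between the two predictors.
predictor_corr = np.corrcoef(children, bmi)[0, 1]

print(f"residual mean        = {resid_mean:.6f}")
print(f"variance ratio       = {var_ratio:.2f}")
print(f"Durbin-Watson        = {durbin_watson:.2f}")
print(f"corr(children, bmi)  = {predictor_corr:.3f}")
```

A variance ratio far from 1 points to non-constant error variance, which matches the kind of violation reported for assumption 4.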
Further Research
• In the linear regression analysis I’ve done, no clear linear
relationship emerges from the plots, although children and bmi do have a positive
effect on the dependent variable (charges). My guess is therefore that
other variables relevant to the level of insurants’
medical bills have not yet been examined.
• The recommendation is to investigate more variables, such
as smoker or non-smoker, diet, sleeping habits and daily
workout, and also to analyze these aspects in combination,
because some diseases do not arise from a single cause.
Instead, they result from several interacting factors. We need to
obtain a more accurate statistical model using these variables in
order to predict customers’ medical bills.
Appendix
Recap on Previous Assignments
Milestone 2
Milestone 3
Limitations of the research:
Even though the p-value is below 0.05, the statistical power is low, so the estimates
may deviate substantially from the true values.
Limitation:
This research only implies that marital status is correlated with the duration of
calls; it did not quantify the relationship between them. Besides,
the relationship between duration and other customer attributes is also important
for predicting duration and targeting valuable customers, which needs
further research such as regression analysis.
Milestone 4: Regression
The plots show the relationship between total international charge and total
international minutes.
Result: There appears to be a positive linear relationship between x and y.
Milestone 5: Clustering
Kaggle Notebook URL: https://www.kaggle.com/kexinezhang/customer-segementation-kz2159
Milestone 5: Clustering
Lowest recency: Clusters 1 & 2
Highest frequency: Cluster 1
Highest amount: Cluster 1
By the RFM criteria, we should choose the customer cluster with low recency and
high frequency and amount. From the K-means clustering results, we can see that
customers with Cluster Labels = 1 best fit the criteria.
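The selection logic above can be sketched as follows. This is a minimal illustration in Python with NumPy, not the code from the Kaggle notebook: the RFM values are synthetic, the K-means routine is a plain Lloyd's algorithm with a deterministic farthest-point initialization (a simplification chosen here), and the scoring rule for the "best" cluster is a hypothetical one. Cluster numbers produced by this sketch are arbitrary and need not match the labels in the slides.

```python
import numpy as np

rng = np.random.default_rng(2)

def make_group(size, recency, frequency, amount):
    """Synthetic customers; each argument is a (mean, std) pair (assumed values)."""
    return np.column_stack([
        rng.normal(*recency, size=size),
        rng.normal(*frequency, size=size),
        rng.normal(*amount, size=size),
    ])

# Columns: recency (days since last purchase), frequency, amount.
rfm = np.vstack([
    make_group(40, (10, 3), (20, 1.5), (2000, 100)),   # recent, frequent, high spend
    make_group(40, (60, 5), (10, 1.5), (800, 100)),
    make_group(40, (150, 10), (2, 0.5), (100, 30)),    # lapsed, rare, low spend
])

# Standardize so that no single feature dominates the distance.
z = (rfm - rfm.mean(axis=0)) / rfm.std(axis=0)

def farthest_point_init(points, k):
    """Pick the first point, then repeatedly the point farthest from chosen centers."""
    centers = [points[0]]
    for _ in range(k - 1):
        dists = np.linalg.norm(points[:, None, :] - np.array(centers)[None, :, :], axis=2)
        centers.append(points[dists.min(axis=1).argmax()])
    return np.array(centers)

def kmeans(points, k, n_iter=100):
    """Lloyd's algorithm: alternate assignment and center update until stable."""
    centers = farthest_point_init(points, k)
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1), centers

labels, centers = kmeans(z, k=3)

# Hypothetical scoring rule: low recency is good, high frequency and amount are good.
scores = -centers[:, 0] + centers[:, 1] + centers[:, 2]
best = int(scores.argmax())

cluster_means = np.array([rfm[labels == j].mean(axis=0) for j in range(3)])
print("cluster means (recency, frequency, amount):")
print(np.round(cluster_means, 1))
print("best cluster by RFM score:", best)
```

Reading the per-cluster means side by side is the same comparison the slide makes: the target cluster is the one that combines the lowest recency with the highest frequency and amount.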