Part I: Your self-introduction
slides
Yi Xu
Global Economics, International Trade, Securities, Investment
Yi Xu comes from Shangrao, Jiangxi. He graduated from UC Santa Cruz with a major in
Global Economics. During his internship at Yusi Education Technology as a channel
manager assistant, where he proposed data-driven internet marketing strategies to
optimize business processes, he explored how data and analytics were transforming
critical areas of strategy, marketing, and operations. He also realized that data
analytics skills would be needed in future marketing. After that, he started a full-time job
at China Galaxy Securities as an investment consultant, responsible for maintaining
relationships with high-net-worth customers and providing professional investment
and financial advice to help them reach their asset allocation goals.
During one year and six months at the securities company, he found that he wanted to learn
more about analyzing data to support marketing decisions and to be
equipped with stronger analytical skills to collect, organize, analyze, and disseminate
significant amounts of information. Therefore, he decided to apply for the master's
program in Integrated Marketing.
B.A. in Global Economics | University of California, Santa Cruz
Email: yx2489@nyu.edu
LinkedIn: https://www.linkedin.com/in/yi-xu-65bb92118/
GitHub: https://github.com/yx2489/NYU_Integrated_Marketing
Kaggle Notebook: https://www.kaggle.com/yx2489/notebooks
Part II: Summary
Summary
In this class, I learned how to use different methods to analyze my
data, and I had the chance to get to know the cryptocurrency website
CoinMarketCap.com. As a cryptocurrency investor, I can use this website
to check my cryptocurrency prices in real time. Moreover, the website
provides data for almost every cryptocurrency, and it is
free, so I can follow cryptocurrency trends closely. As for
my professional growth, I learned how to analyze data using
hypothesis testing, logit regression, and clustering. In my future career, I
plan to combine what I learned in this class with the financial
skills from my previous major to achieve my career goals.
Part III: Your own market
research report
Session1
The data contains information from the 1990 California census. Although
it differs greatly from current housing price data such as the Zillow
Zestimate dataset, it provides an accessible introductory dataset for
getting to know California housing prices in the 1990s.
The data pertains to the houses found in a given California district, with each
district categorized as Inland, Near Bay, Island, Near Ocean, or less than 1 hour from the ocean. The
columns are as follows; their names are fairly self-explanatory: longitude,
latitude, housing_median_age, total_rooms, total_bedrooms, population,
households, median_income, median_house_value, ocean_proximity.
New Dataset
https://www.kaggle.com/camnugent/california-housing-prices
Part III: Your own market
research report
Session2
Capstone 2 California Housing
Abstract: I use the California housing data from the 1990s. I compare house prices in different
areas of California, for example island, near bay, near ocean, and inland. This graph shows how many
houses are in each of these areas and how their prices relate to location. We can conclude that
most people in California prefer to live less than one hour from the ocean.
Link: https://datastudio.google.com/reporting/a83b2abe-31ec-424c-8ac9-7103df25e64f
Part III: Your own market
research report
Session3
Capstone 4: Logit Regression
Executive Summary
The data is from Kaggle
https://www.kaggle.com/camnugent/california-housing-prices
In this research I use a logit model to estimate the influence of x on whether people lived on an
island in California during the 1990s.
X includes housing median age, median income, and population.
P is the probability of living on an island.
Result: Housing median age, median income, and population
do not have a significant influence on
whether people live on an island. For further
research, we need to test other
variables to see whether they have an influence
on living on an island.
Capstone 4
Logit Regression Result
Summary: The p-values of the three x variables are all greater than 0.05, so we cannot reject
the null hypothesis that they have no significant influence on whether people live
on an island in California.
Evaluate The Result
Summary: The accuracy rates are 0.9995 and 0.9997. The model's accuracy is high based on
the test result.
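Accuracy and precision are different measures, and accuracy alone can look very high when one outcome is rare. The sketch below is only an illustration: the confusion-matrix counts are hypothetical and are not taken from the slides. It shows a model that never predicts the rare class and still reaches a very high accuracy.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall) from confusion-matrix counts.

    Precision or recall is None when its denominator is zero.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) > 0 else None
    recall = tp / (tp + fn) if (tp + fn) > 0 else None
    return accuracy, precision, recall


# Hypothetical counts: the model never predicts the rare positive class.
acc, prec, rec = classification_metrics(tp=0, fp=0, fn=2, tn=3998)
print(acc, prec, rec)
```

With counts like these, accuracy is close to 1 even though no positive case is found, so it may help to report precision and recall next to accuracy.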
Interpret the Result
Summary: The odds ratio for housing median age is 1.08, so if housing median age
increases by 1, the odds of living on an island are multiplied by 1.08 (about 8% higher).
However, because the p-value is above 0.05, this influence is not statistically significant.
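An odds ratio is the factor by which the odds are multiplied when the predictor increases by one unit; in a logit model it equals the exponential of the coefficient. A minimal sketch of that reading follows. The helper names and the baseline probability are assumptions made for illustration, and only the value 1.08 comes from the slide.

```python
import math


def odds_ratio(coefficient):
    """Odds ratio implied by a logit coefficient."""
    return math.exp(coefficient)


def percent_change_in_odds(ratio):
    """Percent change in the odds for a one-unit increase in x."""
    return (ratio - 1.0) * 100.0


def updated_probability(probability, ratio):
    """Probability after the odds are multiplied by the odds ratio."""
    odds = probability / (1.0 - probability)
    new_odds = odds * ratio
    return new_odds / (1.0 + new_odds)


ratio = odds_ratio(math.log(1.08))       # coefficient chosen so the ratio is 1.08
change = percent_change_in_odds(ratio)   # about 8 percent
# Hypothetical baseline probability, for illustration only.
new_p = updated_probability(0.001, ratio)
print(ratio, change, new_p)
```

Note that the odds, not the probability, are multiplied by the ratio; for small probabilities the two changes are nearly the same.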
Part VI: Appendix
•Capstone Project Milestone 2: Research Design and The Data
•Capstone Project Milestone 3: Hypothesis Testing
•Capstone Project Milestone 4: Regression
•Capstone Project Milestone 5: Clustering
DOGE vs OMG | Yi Xu
Abstract: DOGE and OMG are two popular tokens with relatively small volume and market cap. In this study, we explore the future potential of these
two tokens by comparing their performance. OMG Network (first developed as OmiseGO) is a non-custodial, Layer 2 scaling solution for transferring
value on Ethereum. How the protocol processes transactions is centralized, but its Plasma-based design aims to decentralize network security.
Dogecoin (DOGE) is based on the popular "doge" Internet meme and features a Shiba Inu on its logo. The open-source digital currency was
created by Billy Markus from Portland, Oregon and Jackson Palmer from Sydney, Australia, and was forked from Litecoin in December 2013.
Dogecoin's creators envisaged it as a fun, light-hearted cryptocurrency that would have greater appeal beyond the core Bitcoin audience, since it
was based on a dog meme. After visualizing the data from CoinMarketCap, we find that the two tokens' volumes were similar before 2019, but
after 2020 OMG's volume increased dramatically. Therefore, we believe OMG can be a profitable investment option for those who
want to invest in cryptocurrencies with small market cap and volume.
Capstone 2
Capstone 3
Summary Name: Yi Xu yx2489
In this research, I conduct a paired t-test and a two-sample t-test. For the assumptions,
because the data are normally distributed, we can use the Pearson correlation.
https://data.world/data-society/bank-marketing-data
This data is related to the direct marketing campaigns of a Portuguese banking
institution.
https://stats.oecd.org/index.aspx?queryid=33940#
This data covers the economies of OECD countries before and after COVID-19.
Github repo URL: https://github.com/yx2489/NYU_Integrated_Marketing
Conclusion: Because we conclude that the data are normally distributed, we use the Pearson
correlation. Since the p-value is lower than 0.05, we can reject the null hypothesis.
Assumptions
Paired T-Test
Conclusion: Since the p-value is lower than 0.05, we can reject the null hypothesis. The countries'
economies were weaker in 2020 than in 2018, which suggests that COVID-19 had a negative impact on
the economies of OECD countries.
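A paired t-test works on the per-country differences between the two years. The sketch below computes only the t statistic and its degrees of freedom. The numbers are hypothetical and are not the OECD figures used in the slides, and turning the statistic into a p-value needs the Student's t distribution (for example via `scipy.stats.ttest_rel`), which is left out here.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(before, after):
    """Return (t, degrees_of_freedom) for paired samples."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    # Standard error of the mean difference uses the sample standard deviation.
    standard_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / standard_error, n - 1


# Hypothetical paired values for four units (not data from the slides).
before = [10.0, 12.0, 14.0, 11.0]
after = [9.0, 10.0, 11.0, 10.0]
t_stat, dof = paired_t_statistic(before, after)
print(t_stat, dof)
```

A negative t statistic here means the later values are lower on average than the earlier ones.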
Two Sample T-Test
Conclusion: We can reject the null hypothesis that the mean balance is equal between those
who have a loan and those who do not, at the 0.05 (or even 0.001) significance level.
Conclusion: For a Cohen's d effect size of 0.8, a power of 0.70, and a type I error rate of 0.05, we need a sample
size of 20 (for each group).
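The required sample size can be approximated from the effect size, the power, and the type I error rate. The sketch below uses the normal approximation for a two-sided, two-sample comparison, n = 2 * ((z_(1 - alpha/2) + z_power) / d)^2, rounded up. This is an assumption about the method: the slides do not say which tool produced the figure, and an exact t-based calculation can differ slightly.

```python
import math
from statistics import NormalDist


def sample_size_per_group(effect_size, power, alpha):
    """Approximate n per group for a two-sided two-sample test.

    Uses the normal approximation, so exact t-based results may differ.
    """
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2.0 * ((z_alpha + z_power) / effect_size) ** 2)


n_large_effect = sample_size_per_group(effect_size=0.8, power=0.70, alpha=0.05)
n_medium_effect = sample_size_per_group(effect_size=0.5, power=0.70, alpha=0.05)
print(n_large_effect, n_medium_effect)
```

Because the effect size enters as a squared denominator, a smaller Cohen's d raises the required sample size quickly.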
Limitations and future research:
For the assumptions, because we conclude that the data are normally distributed, we use the Pearson
correlation, and since the p-value is lower than 0.05, we can reject the null hypothesis. For the paired t-test,
since the p-value is lower than 0.05, we can likewise reject the null hypothesis. However, the sample size might not be
sufficient for other, more complicated tests.
If in the future we need to test for a Cohen's d of 0.5, the required sample size will be 50 instead of 20.
We can also expand the sample size if future clients ask for more detailed tests.
Executive Summary
The data is from Kaggle
https://www.kaggle.com/c/customer-churn-prediction-2020
Github URL: https://github.com/yx2489/NYU_Integrated_Marketing
In this research I use a logit model to estimate the probability of success.
X includes total night charge, total night calls, and total night minutes.
P is the probability of success.
Result: Total night
charge, total night calls, and total
night minutes do not have a
significant influence on churn. For further research, we
need to test other variables to see whether
they have an influence on churn.
Capstone 4
Logit Regression Result
Summary: The p-values of the three x variables are all greater than 0.05, so we cannot reject
the null hypothesis that they have no significant influence on churn.
Evaluate the Result
Summary: The accuracy rates are 0.86 and 0.85929. The model's accuracy is high based on the test result.
Interpret the Result
Summary: The odds ratio for total night minutes is 1.261855, so if total night minutes increase by 1, the odds of churn are multiplied by 1.261855.
However, because the p-value is above 0.05, total night minutes do not have a significant influence on churn.
Data Set: https://www.kaggle.com/hellbuoy/online-retail-customer-clustering
Kaggle Notebook URL: https://www.kaggle.com/yx2489/customer-segementation-yx2489
In this research, I use the online retail transnational dataset from France to build RFM
clusters and choose the best set of customers for the company to target. I use K-Means
clustering and hierarchical clustering to obtain my results. K-Means clustering
returns 18 target customers. Hierarchical clustering returns 2 target customers for
customer cluster 2, a much smaller group than the one K-Means clustering returns, and
it also returns 2 target customers for customer cluster 1.
K-Means Clustering: K-means clustering is an effective non-hierarchical clustering method. It
partitions the data into non-overlapping groups that have no hierarchical relationships
among themselves.
Hierarchical Clustering: Hierarchical clustering is an unsupervised clustering technique
that creates clusters in a predefined order. The clusters are ordered from top to
bottom.
Capstone 5
K-Means Clustering: Finding the best k
When the metric is the silhouette score, the
best k equals 3.
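The silhouette score compares, for each point, the mean distance to its own cluster (a) with the smallest mean distance to another cluster (b), using (b - a) / max(a, b); values near 1 indicate well-separated clusters. The sketch below is a plain-Python version with Euclidean distance. The points and labelings are hypothetical, and it is not the code from the author's notebook. To choose k, the same score would be computed on the K-means labels for each candidate k, keeping the k with the highest score.

```python
import math
from statistics import mean


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        # (the distance from the point to itself contributes zero)
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            mean(math.dist(point, q) for q in members)
            for other, members in clusters.items()
            if other != label
        )
        denominator = max(a, b)
        scores.append((b - a) / denominator if denominator > 0 else 0.0)
    return mean(scores)


# Hypothetical 2-D points forming two obvious groups.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_score(points, [0, 0, 1, 1])  # groups match the geometry
bad = silhouette_score(points, [0, 1, 0, 1])   # groups cut across it
print(good, bad)
```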
K-Means Clustering: Interpreting the Clustering
By the RFM criteria, we should choose the customer clusters with lower recency and higher
frequency and amount. From the K-means clustering results, we can see that
customers with Cluster_Id=0 best fit the criteria.
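One way to apply the RFM criteria to cluster summaries is to rank the clusters on each measure and pick the cluster with the best combined rank. The sketch below assumes a simple unweighted rank sum. Both that rule and the per-cluster averages are hypothetical and are not taken from the slides or the notebook.

```python
def best_rfm_cluster(cluster_means):
    """Pick the cluster with the best combined RFM rank.

    cluster_means maps cluster id -> (recency, frequency, amount).
    Lower recency is better; higher frequency and amount are better.
    """
    cluster_ids = list(cluster_means)

    def ranks(index, higher_is_better):
        ordered = sorted(
            cluster_ids,
            key=lambda c: cluster_means[c][index],
            reverse=higher_is_better,
        )
        return {c: position for position, c in enumerate(ordered)}

    recency_rank = ranks(0, higher_is_better=False)
    frequency_rank = ranks(1, higher_is_better=True)
    amount_rank = ranks(2, higher_is_better=True)
    # Rank 0 is best on each measure, so the smallest total wins.
    totals = {
        c: recency_rank[c] + frequency_rank[c] + amount_rank[c]
        for c in cluster_ids
    }
    return min(cluster_ids, key=lambda c: totals[c])


# Hypothetical per-cluster averages: (recency in days, frequency, amount).
example_means = {
    0: (20.0, 15.0, 4000.0),
    1: (60.0, 4.0, 900.0),
    2: (200.0, 1.5, 300.0),
}
target_cluster = best_rfm_cluster(example_means)
print(target_cluster)
```

When no single cluster wins on all three measures, a rule like this makes the trade-off explicit; weighting the three ranks differently is an equally reasonable choice.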
K-Means Clustering: Interpreting the Clustering
K-Means clustering returns 18 target
customers.
Hierarchical Clustering: Visualize the dendrogram (tree)
These dendrograms visualize the cluster tree for each linkage method:
Single Linkage, Complete Linkage, Average Linkage
Hierarchical Clustering: Visualize and Interpret the Result
By the RFM criteria, we should choose the customer clusters with lower recency and higher
frequency and amount. From the hierarchical clustering results, we can see that customers with
Cluster_Labels=2 best fit the criteria of low recency and high frequency, whereas cluster 1 fits the
high-amount criterion.
Hierarchical Clustering: Interpreting the Clustering
Hierarchical clustering returns 2 target
customers for customer cluster 2,
which is a much smaller group than
the one K-Means clustering returns.
Hierarchical clustering also returns 2 target
customers for customer cluster 1.

Yx2489 final presentation slides
