Group A
Ethan Jacobs
Apoorva Singu
Christopher Walker
Chun Wu
Mobile User Demographics Data Analysis
(using RStudio)
MIS 7190 Programming for Business
11/17/2016
Project Overview
TalkingData is a mobile app marketplace company based in
China that operates the country's largest independent big data
service platform.
Objective:
To explore and analyze data offered by TalkingData (such as
app usage, geolocation, and mobile device properties), guided
by our hypotheses about users' demographic characteristics,
and to provide insights that support the company's
decision-making in R&D and branding.
Hypotheses
Hypothesis 1. Users aged 25-45 are more likely to use phone apps during lunchtime
(noon-2pm), because most companies in China allow a two-hour lunch/nap break.
Hypothesis 2. Finance and business apps are used more frequently on Fridays,
because users are likely to check bank accounts and make payments then.
Hypothesis 3. Players of massively multiplayer online (MMO) games are likely to use
Tencent (a Chinese counterpart to Skype/Facebook) while using gaming apps, because
distant team players can voice chat with each other.
Hypothesis 4. A large majority of game players are under the age of 25, because
social pressures and personal preferences shift with age.
Data Source Mapping
Data Exploration - Data type
Interval Discrete:
- Count of Apps Used (derived)
- Hour of the Day (derived)
Interval Continuous:
- Age (given)
Categorical Nominal:
- device_id, gender, group, phone_brand, device_model
Data Exploration - Data Processing Methods
Data Import/Export: setwd, read.csv
Text Manipulation:
Problem: some data were collected in another language and did not display well in R
Solution: add an external translation table, p_b_d_model_trans.csv
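A translation lookup of this kind might be sketched as follows. The slides do not show the layout of p_b_d_model_trans.csv, so the column names and the tiny tables below are assumptions made for illustration only.

```r
# Hypothetical device table with brand names in Chinese
phones <- data.frame(
  device_id   = c("d1", "d2", "d3", "d4"),
  phone_brand = c("小米", "华为", "三星", "小米"),
  stringsAsFactors = FALSE
)

# Hypothetical translation table (stand-in for p_b_d_model_trans.csv)
brand_trans <- data.frame(
  phone_brand    = c("小米", "华为", "三星"),
  phone_brand_en = c("Xiaomi", "Huawei", "Samsung"),
  stringsAsFactors = FALSE
)

# match() keeps the original row order; brands missing from the
# translation table would come back as NA
phones$phone_brand_en <- brand_trans$phone_brand_en[
  match(phones$phone_brand, brand_trans$phone_brand)
]
```

Using match() rather than merge() here leaves the row order of the device table unchanged, which makes the result easier to check by eye.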
Dimensions, header names and classes: summary, names, class
6,218,496 observations of 15 variables
Problem: the dataset is too large to analyze as a whole
Solution: pull only the tables and columns needed to test each hypothesis
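One way to pull only the needed columns at import time is sketched below. The inline CSV text stands in for a real file, and its rows and device ids are made up for illustration; only the column names follow the variables listed on the data type slide.

```r
# Hypothetical miniature of a demographics table; the real files are far larger
csv_text <- "device_id,gender,age,group
-8076087639492063270,M,35,M32-38
-2897161552818060146,M,35,M32-38
-8260683887967679142,F,24,F24-26"

# Read device_id as character so the 19-digit ids are not rounded to doubles.
# A colClasses entry of "NULL" skips that column entirely at import time.
gender_age <- read.csv(
  text = csv_text,
  colClasses = c(device_id = "character", gender = "character",
                 age = "integer", group = "NULL")
)

dim(gender_age)            # number of rows and columns
names(gender_age)          # header names
sapply(gender_age, class)  # class of each column
```

With a file on disk, the same call would take a file path in place of the text argument.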
Data Exploration - Data Processing Methods
Handling missing values: is.na
Handling dates and time
Problem: how to extract the day of the week (Monday, Sunday, ...) from a timestamp
Timestamp example: 2016-05-02 00:46:51 → Monday
Solution: parse the timestamp with as.POSIXlt, then use weekdays(as.Date()) to convert
it to Monday, Tuesday, ...
finance.events <- events.data$event_id %in% finance.apps$event_id
finance.only <- events.data[finance.events,]
converted_time <- weekdays(as.Date(as.POSIXlt(finance.only$timestamp)))
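A self-contained version of the same conversion is sketched below. The first timestamp is the slide's example; the other two are hypothetical. The numeric day index and the hour are additions that line up with the derived "Hour of the Day" variable and the 1=Sunday coding used later in the deck.

```r
# Slide example plus two hypothetical timestamps
stamps <- c("2016-05-02 00:46:51", "2016-05-06 12:30:00", "2016-05-08 21:05:10")

# Set tz explicitly so the result does not depend on the machine's time zone
parsed <- as.POSIXlt(stamps, tz = "UTC", format = "%Y-%m-%d %H:%M:%S")

day_name  <- weekdays(as.Date(parsed))  # day names follow the session locale
day_index <- parsed$wday + 1            # 1 = Sunday ... 7 = Saturday
hour      <- parsed$hour                # hour of the day, 0-23
```

The numeric index avoids the locale dependence of weekdays(), which returns day names in the language of the R session.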
Indexing and subsetting rows and columns
Data aggregation by groups: data frames
Data merging/joining: merge
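The missing-value, joining and aggregation steps listed above might look roughly like this. The two small tables are hypothetical, and aggregate() is only one possible choice for the group-wise step, since the slides do not say which function was used.

```r
# Hypothetical miniature tables; column names follow the deck's variables
devices <- data.frame(
  device_id = c("d1", "d2", "d3", "d4"),
  age       = c(26, 31, NA, 45),
  gender    = c("M", "F", "M", "F"),
  stringsAsFactors = FALSE
)
events <- data.frame(
  event_id  = 1:6,
  device_id = c("d1", "d1", "d2", "d3", "d4", "d4"),
  stringsAsFactors = FALSE
)

# Missing values: keep only devices with a known age
devices_known <- devices[!is.na(devices$age), ]

# Joining: attach demographics to events on the shared key (inner join by default)
joined <- merge(events, devices_known, by = "device_id")

# Aggregation by group: number of events per gender
events_by_gender <- aggregate(event_id ~ gender, data = joined, FUN = length)
```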
Data Summary
Mean Age: 31.4
Median Age: 29
Std. Dev Age: 9.87
Mean Age (male): 31.05
Median Age (male): 29
Std. Dev Age (male): 9.45
Mean Age (female): 32.05
Median Age (female): 29
Std. Dev Age (female): 10.54
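Summary statistics of this form can be computed overall and per gender as sketched below. The ages are hypothetical and are not meant to reproduce the figures on the slide.

```r
# Hypothetical ages; not the TalkingData values
ages <- data.frame(
  gender = c("M", "M", "M", "F", "F", "F"),
  age    = c(22, 29, 41, 24, 29, 46),
  stringsAsFactors = FALSE
)

# Overall mean, median and standard deviation
overall <- c(mean = mean(ages$age), median = median(ages$age), sd = sd(ages$age))

# The same three statistics per gender; the result holds them in a matrix column
by_gender <- aggregate(
  age ~ gender, data = ages,
  FUN = function(x) c(mean = mean(x), median = median(x), sd = sd(x))
)
```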
Data Summary
Data Summary
Age and App Count
Analysis - Hypothesis 1
Users aged 25-45 are more likely to use phone apps during lunchtime (noon-2pm), because most companies in China
allow a two-hour lunch/nap break.
Findings
● This age group (25-45) is most likely
to use phone apps at 10am and 9pm.
● Of 1.2 million app usage observations,
these two time slots account for 11% of
the usage.
● Phone usage starts to decrease after
10am and increases again at around
6pm.
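The hourly comparison behind these findings could be computed along the following lines. The usage rows are hypothetical, and treating "noon-2pm" as hours 12 and 13 is an assumption about how the lunch window was defined.

```r
# Hypothetical usage records: one row per app usage event
usage <- data.frame(
  age  = c(23, 27, 27, 30, 33, 38, 41, 44, 52, 29),
  hour = c(12, 10, 21, 10, 13, 21, 10, 18, 12, 9)
)

# Restrict to the 25-45 age group
target <- usage[usage$age >= 25 & usage$age <= 45, ]

# Count events per hour, keeping hours with zero events in the table
target$hour <- factor(target$hour, levels = 0:23)
hour_counts <- table(target$hour)
hour_share  <- prop.table(hour_counts)

# Share of usage in the lunch window (hours 12 and 13) and the busiest hour(s)
lunch_share <- sum(hour_share[c("12", "13")])
peak_hours  <- names(hour_counts)[hour_counts == max(hour_counts)]
```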
Analysis - Hypothesis 2
Finance and business apps are used more frequently on Fridays, because users are likely to check bank
accounts and make payments then.
Findings
● Finance and business app usage
differs little across the days of the
week
● On average, about 7,800 app usages
per day
● Friday usage of finance and business
apps is slightly higher
*1=Sunday, 7=Saturday
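The day-of-week comparison can be reproduced from the per-day counts listed in the editor's notes at the end of the deck. The sketch below computes the daily average and each day's deviation from it. The chi-squared goodness-of-fit test is an addition that the slides do not mention; with counts this large it can flag even modest day-to-day differences as statistically significant, so the percentage deviations are the better guide to practical size.

```r
# Per-day usage counts as recorded in the deck's editor's notes
daily <- c(Sunday = 7241, Monday = 7791, Tuesday = 8128, Wednesday = 7473,
           Thursday = 8057, Friday = 8373, Saturday = 7710)

avg_per_day <- mean(daily)              # daily average across the week
busiest_day <- names(which.max(daily))  # day with the highest count

# Percentage above or below the daily average, one value per day
pct_vs_mean <- round(100 * (daily / avg_per_day - 1), 1)

# Added check, not from the slides: test against equal usage on every day
fit <- chisq.test(daily)
fit$p.value
```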
Analysis - Hypothesis 3
Players of massively multiplayer online (MMO) games are likely to use Tencent (a Chinese counterpart to
Skype/Facebook) while using gaming apps, because distant team players can voice chat with each other.
Findings
● “All MMO users use Tencent
while playing MMO games”?
● The MMO games are developed by
and used within Tencent
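A check of how many MMO users also use Tencent might be sketched as below. The device ids and the category labels are hypothetical, since the slides do not show how apps were labelled.

```r
# Hypothetical app usage records: one row per device and app category
app_use <- data.frame(
  device_id = c("d1", "d1", "d2", "d2", "d3", "d4", "d5"),
  category  = c("MMO", "Tencent", "MMO", "Tencent", "Tencent", "Finance", "MMO"),
  stringsAsFactors = FALSE
)

# Devices seen with each kind of app
mmo_devices     <- unique(app_use$device_id[app_use$category == "MMO"])
tencent_devices <- unique(app_use$device_id[app_use$category == "Tencent"])

# Share of MMO devices that also appear among Tencent devices
share_overlap <- mean(mmo_devices %in% tencent_devices)

# MMO devices never seen with a Tencent app
mmo_without_tencent <- setdiff(mmo_devices, tencent_devices)
```

A share of exactly 1 on real data would be worth a second look, since it can mean the two labels describe the same apps rather than two independent behaviours.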
Analysis - Hypothesis 4
A large majority of game players are under the age of 25, because social pressures and personal preferences
shift with age.
Max: Age 26 = 926
Majority: Age 25-30
● Max: Age 26 = 4540
● Majority: Age 25-30
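The peak age and the share of players in each age band can be read off an age table, as in this sketch. The ages are hypothetical and are not the counts shown on the slide.

```r
# Hypothetical ages of game players
gamer_ages <- c(19, 22, 24, 26, 26, 26, 27, 28, 29, 30, 33, 41)

# Players per age, and the age with the most players
age_counts <- table(gamer_ages)
modal_age  <- as.integer(names(age_counts)[which.max(age_counts)])

# Shares of players under 25 and in the 25-30 band
share_under_25 <- mean(gamer_ages < 25)
share_25_to_30 <- mean(gamer_ages >= 25 & gamer_ages <= 30)
```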
More findings - Are casual gamers mostly male?
Findings
Max: Age 26 = 575
Majority: Age 26-30
Max: Age 26 = 352
Majority: Age 26-29
Future analysis
Clean up and better organize the app_labels data for further research
Explore more data
The data set isn't diverse
For example, the MMO and Tencent data
Explore shopping application use
Explore location data
References
Kaggle.com Talkingdata Mobile User Demographics https://www.kaggle.com/c/talkingdata-mobile-user-demographics
Developer.apple.com Apple application category: https://developer.apple.com/app-store/categories/
MMO list https://en.wikipedia.org/wiki/List_of_massively_multiplayer_online_games

Talking Data: Mobile User Demographic Data Analysis

Editor's Notes

  1. Whoever presents this part should mention that the intuition behind the hypotheses comes from the objective and the data exploration. We want not just to find information, but to find information that is useful for TalkingData's R&D and branding business.
  2. Sunday 7241, Monday 7791, Tuesday 8128, Wednesday 7473, Thursday 8057, Friday 8373, Saturday 7710; 30 21