Our Data Journey discusses the goals and work of the data science team at a company that owns multiple online forums and shopping sites. The team aims to (1) understand the current user situation, (2) gain new insights from data, (3) test ideas through A/B testing, and (4) predict user activities. Examples provided include business dashboards, A/B tests of the user interface and email content, and using machine learning to predict if new users will visit a particular forum channel. The presentation concludes by noting data science is not new but is growing significantly due to rising data volumes and technology capabilities.
Our Data Journey: Understanding the Current Situation, Gaining Insights and Predicting the Future
1. Our Data Journey
Patrick Ng
Head of Data Science Team
2. Our vision is to create a better world by bringing technology to
life. To realize this mission, we provide reliable and enjoyable services
that enhance communication among people.
3. With more than 5.5 million registered members who come from
different classes and backgrounds, Discuss is the epitome of the
entire Hong Kong society.
No. 1 Forum in Hong Kong
4. Price aims to enhance and improve users’
shopping experience by providing the latest price
reviews and comprehensive information on
various products and services.
No. 1 Comparison
Shopping Site
5. Uwants is a place where users can share their lives and discuss
leisure topics. Gaming, ACG, love and football are the most popular
topics among Uwants users.
No. 1 Gaming Portal
6. Networld and our Data
• Generating 20GB of traffic data daily
• Set up the Data Science Team in 2016
• Goals: Leverage data to enable us to:
• Understand the current situation
• Get new insights
• Test our ideas
• Predict user activities
13. Introduction to AB Testing
• Typical approach:
• Implement new feature → Rollout → Measure
• Example:
• UI change on Price:
• To attract users to view more products
• Launched it on Dec 1st
• Monitored the daily click rate for two weeks
• Dec 15th: daily rate increased by 5%
• We declare victory!
• What may we miss?
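One thing the rollout-then-measure approach may miss is anything else that changed during the same two weeks. The sketch below is a toy model, not the team's data: the base rate, the seasonal drift and the zero UI effect are all assumed numbers chosen for illustration. It shows how a before/after comparison can report about a 5% rise even when the new UI has no effect at all, while a same-day comparison between two groups does not.

```python
# Hypothetical model: click rate = base + seasonal drift + UI effect.
BASE_RATE = 0.10               # assumed click rate on Dec 1st
SEASONAL_DRIFT = 0.005 / 14    # assumed pre-holiday rise per day
UI_EFFECT = 0.0                # assume the new UI does nothing at all


def click_rate(day, new_ui):
    """Click rate on a given day (0 = Dec 1st) for users on one UI."""
    return BASE_RATE + SEASONAL_DRIFT * day + (UI_EFFECT if new_ui else 0.0)


# Rollout-then-measure: compare Dec 15th against Dec 1st.
start = click_rate(0, True)
before_after_change = (click_rate(14, True) - start) / start

# Concurrent comparison: new UI against old UI on the same day.
ab_difference = click_rate(14, True) - click_rate(14, False)

print(f"before/after change: {before_after_change:.1%}")
print(f"same-day difference between UIs: {ab_difference:.1%}")
```

In this toy setting the whole before/after rise comes from the assumed seasonal drift, which is why comparing two groups over the same period is the safer measurement.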
14. Basics of AB Testing
[Figure: variants A and B]
Image: Kohavi, et al., 2009. Online Experimentation at Microsoft.
http://www.exp-platform.com/Documents/ExPThinkWeek2009Public.pdf
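Once users are split between A and B, the two click rates still need a significance check before declaring a winner. A minimal sketch of one common choice, the two-proportion z-test, follows. The slides do not say which test the team used, and the user and click counts are made-up numbers that only mirror a 5% relative increase.

```python
import math


def two_proportion_z_test(clicks_a, users_a, clicks_b, users_b):
    """Return (z, two-sided p-value) for the difference in click rates B - A."""
    rate_a = clicks_a / users_a
    rate_b = clicks_b / users_b
    # Pooled rate under the null hypothesis that A and B perform the same.
    pooled = (clicks_a + clicks_b) / (users_a + users_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / users_a + 1 / users_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 10.0% click rate in A, 10.5% in B (a 5% relative lift).
z, p_value = two_proportion_z_test(1000, 10_000, 1050, 10_000)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

With these assumed sample sizes the difference does not reach the usual 0.05 threshold, so a 5% lift alone is not enough to declare victory; a larger sample or a longer test would be needed.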
15. UI Change – Show price ranges on listing page
Original UI: single price (latest)
New UI: price ranges (from all sellers)
20. Scenario:
● Assume 女性頻道 (Women's Channel) is a new channel
● We launched it 2 weeks ago
● Now a user comes in
○ The user was absent in the last 2 weeks
○ Will the user visit 女性頻道?
21. Use Predictive Analytics
• Collect activities of other users in the past 2 weeks
• 5% of them visited 女性頻道
• Now a user comes in, who was absent in the last 2 weeks
• Predict whether the current user will visit 女性頻道
Approach              Precision
Random guess          5%
Machine Learning [1]  67%
[1] Algorithm: AdaBoost
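To make the precision column concrete: precision is the share of users flagged as likely visitors who actually visit. The sketch below does not train AdaBoost. It only computes precision for two hand-built prediction lists, and the user counts are hypothetical, chosen so that the results line up with the 5% and 67% figures above.

```python
def precision(y_true, y_pred):
    """Fraction of predicted visitors (1s in y_pred) who really visited."""
    outcomes_of_flagged = [t for t, p in zip(y_true, y_pred) if p == 1]
    if not outcomes_of_flagged:
        return 0.0
    return sum(outcomes_of_flagged) / len(outcomes_of_flagged)


# Hypothetical population: 1,000 users, 50 of whom (5%) visited the channel.
y_true = [1] * 50 + [0] * 950

# Baseline without a model: flag everyone. Precision equals the base rate,
# which is also what random guessing achieves on average.
baseline_precision = precision(y_true, [1] * 1000)

# Hypothetical classifier output: 60 users flagged, 40 of them real visitors.
model_pred = [1] * 40 + [0] * 10 + [1] * 20 + [0] * 930
model_precision = precision(y_true, model_pred)

print(f"baseline: {baseline_precision:.0%}, model: {model_precision:.0%}")
print(f"lift over baseline: {model_precision / baseline_precision:.1f}x")
```

Under these assumptions, flagged users are roughly 13 times more likely to visit than a user picked at random, which is the practical meaning of moving from 5% to 67% precision.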
23. Is Data Science a new thing?
• It is not new:
• 1920s: Modern statistics
• 1960s: Statistical learning was introduced
• 1980s: Commercial application of statistical learning
• E.g. credit card companies: custom offerings for each individual
• Since 2000s:
• Technology - much more affordable and accessible
• Data volume is rising exponentially
• Rising awareness (thanks to Google, Facebook, Amazon, etc.)
• Something is taking off!
25. Quotes of the day
“It is easy to lie with statistics.
It is hard to tell the truth without it.”
- Andrejs Dunkels (a Swedish mathematician)
"If any publisher wants to grow their programmatic ad revenue,
Acqua Media can upgrade them to AdX,
then buyers like Google, iClick, and Amnet will be able to buy their
ad space."
- Ben Chien (Founder/Director of Acqua Media)