1. Shared bicycle scheme in
China datathon
By Bart, Chira, Wei, Haimei, Jiamei and Shuo
2. What is the shared bicycle scheme?
• This is a service in which bicycles are made available for shared use to
individuals on a very short-term basis. The first 30 minutes to 1 hour are usually
free or very inexpensive compared to other modes of transport.
• Hangzhou's bike-sharing system, launched in 2008, was the first in China.
Numerous cities across China now have one or more competing bike-sharing
programs.
• In Shenzhen there are several brands to choose from, with some minor
differences between them. Most are accessible via a phone app.
8. What’s the problem?
• Users are finding that the app’s location services are slow and unreliable
• Users want to know where the nearest bike of their preferred type is (newer and cheaper)
• The market is cluttered and quite saturated
– How can Mobike improve their services?
– Do people speak more positively or negatively about Mobike?
– Does providing ‘free rides’ have a positive impact on sentiment?
9. How should we use the data?
The data we had
• 20 million Chinese newspaper text
files from 2016 to present
• Access to Baidu Tieba, Weibo,
Zhihu, blogs and search engines
• Real-time Mobike GPS data from GitHub
What we did with it
• Keyword search to narrow the
data set down to 15,000 articles
• Manual search for the term ‘free
rides’ to pinpoint offers in the past
year
• Tracked the real-time location of Mobikes
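The keyword narrowing step could look roughly like the sketch below. The slides do not list the search terms that were used, so the keyword tuple and the three sample articles are made up for illustration.

```python
# Hypothetical keywords; the actual search terms are not given in the slides.
KEYWORDS = ("共享单车", "摩拜", "Mobike")


def matches_keywords(text, keywords=KEYWORDS):
    """Return True if the article text mentions any keyword (case-insensitive)."""
    lowered = text.lower()
    return any(k.lower() in lowered for k in keywords)


def filter_articles(articles, keywords=KEYWORDS):
    """Keep only the articles that mention at least one keyword."""
    return [a for a in articles if matches_keywords(a, keywords)]


# Made-up sample articles, standing in for the newspaper text files.
sample = [
    "摩拜单车本周推出免费骑行活动",
    "Local weather report for Shenzhen",
    "mobike expands to a new city",
]
selected = filter_articles(sample)
```

A plain substring match like this is cheap enough to run over millions of files, at the cost of letting through articles that only mention a keyword in passing.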
15. How do our data representations help?
Our analysis
• Heat map of popular docking
locations
• Cluster map of the number of
available bikes by location
Solution
• Helps users know the location of
their nearest Mobike in real-time
(faster than the current app) and
how many are available
• Helps Mobike understand where to
place more bikes at particular times
of day
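As a rough illustration of the two ideas on this slide, the sketch below finds the bike nearest to a user and counts available bikes per grid cell. The coordinates, the 0.01 degree cell size and the function names are assumptions, not the team's actual implementation.

```python
import math
from collections import Counter


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def nearest_bike(user, bikes):
    """Return the (lat, lon) of the bike closest to the user."""
    return min(bikes, key=lambda b: haversine_km(user[0], user[1], b[0], b[1]))


def count_by_cell(bikes, cell_deg=0.01):
    """Count bikes per grid cell; each cell is cell_deg degrees on a side."""
    return Counter(
        (math.floor(lat / cell_deg), math.floor(lon / cell_deg)) for lat, lon in bikes
    )


# Made-up bike positions around Shenzhen, for illustration only.
bikes = [(22.543, 114.057), (22.5435, 114.0575), (22.605, 114.105)]
user = (22.5431, 114.0571)

closest = nearest_bike(user, bikes)
counts = count_by_cell(bikes)
```

The per-cell counts are the kind of aggregate a heat map or cluster map could be drawn from; a linear scan for the nearest bike is fine for small snapshots but would need a spatial index at city scale.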
16. How do our data representations help?
Our analysis
• Route maps of Mobikes
Solution
• Helps users know popular routes
taken by other users
– To avoid
– To plan better cycling routes
– To know if a bike is coming to a
location near them soon
17. How do our data representations help?
Our analysis
• Sentiment analysis of Mobike
Solution
• Helps Mobike understand if users
are favourable towards them
– With more data we could begin to draw
comparisons with other brands
• Helps Mobike see if ‘free rides’ make
users more favourable towards them
• Helps Mobike see if ‘free rides’ are
beneficial to their business model
(does it increase usage etc.)
18. Method
• First, we downloaded a sentiment dictionary from the web, which includes a stop
word list, a negative word list and a degree adverb list.
• Second, we segmented each sentence into words with a tool named jieba and
removed the stop words.
• Third, we recorded each sentiment word and its position, and from these
computed the score of the sentence.
• Finally, we summed the scores of all sentences in an article and used the total
as the article's final sentiment score.
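The four steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the lexicons are tiny made-up English lists, the +1/-1 weights and the rule that a degree adverb scales the sentiment word right after it are guesses at a typical dictionary-based scheme, and a whitespace tokenizer stands in for jieba. For Chinese text, `jieba.lcut` could be passed as `tokenize` instead.

```python
import re

# Tiny hypothetical lexicons; the real ones were downloaded dictionaries.
STOP_WORDS = {"the", "is", "a", "are"}
POSITIVE = {"good", "convenient"}
NEGATIVE = {"bad", "broken"}
DEGREE = {"very": 2.0, "slightly": 0.5}  # degree adverbs scale the next sentiment word


def score_sentence(sentence, tokenize=str.split):
    """Score one sentence: +1/-1 per sentiment word, scaled by a preceding degree adverb."""
    tokens = [t for t in tokenize(sentence.lower()) if t not in STOP_WORDS]
    score = 0.0
    for i, tok in enumerate(tokens):  # i is the position of the word in the sentence
        polarity = 1.0 if tok in POSITIVE else -1.0 if tok in NEGATIVE else 0.0
        if polarity and i > 0 and tokens[i - 1] in DEGREE:
            polarity *= DEGREE[tokens[i - 1]]
        score += polarity
    return score


def score_article(text, tokenize=str.split):
    """Split an article into sentences and sum the sentence scores."""
    sentences = [s for s in re.split(r"[.!?。！？]+", text) if s.strip()]
    return sum(score_sentence(s, tokenize) for s in sentences)


article = "The bikes are very convenient. The lock is broken."
total = score_article(article)
```

Because the article score is a plain sum, it grows with the number of sentences, which matters when comparing articles of very different lengths.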
19. Result
The number of news articles per week
• News coverage of shared bicycles grows
steadily over the period.
• At the left of the chart the count is almost
zero, because shared bicycles had only just
come into use.
• In the middle there is a dip in coverage of
shared bicycles. This coincides with the
Spring Festival, when people have a long
holiday and the news mainly covers the
holiday travel season and related topics.
[Chart: the number of news, 2016/10/1 to 2017/3/1; y axis from 0 to 1800]
20. Result
The sentiment score of news
per week
• The x axis is time; the y axis is
sentiment score.
• Almost all news articles are positive;
only a few are negative.
• News content tends to be official
in tone, so negative articles are
rare.
[Chart: positive vs negative, 2016/10/1 to 2017/3/1; series: positive, negative; y axis from 0 to 1600]
21. Result
The sentiment score of news per week
• The positive articles are usually very long and
contain only a few negative sentences, so when
we sum over all sentences the score becomes
large and positive.
• There is a sharp increase because a few very
long articles analyse the situation of shared
bicycles.
• We tried to normalize the score, but as Xin said,
that would lose some useful information, so we
need another method to solve this problem in
the future.
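To make the length effect concrete, the sketch below compares a raw summed score with one possible normalization, the mean score per sentence. The slides do not say which normalization was tried, so this choice and the sentence scores are illustrative assumptions.

```python
def normalized_score(sentence_scores):
    """Mean score per sentence, so long articles do not dominate by length alone."""
    if not sentence_scores:
        return 0.0
    return sum(sentence_scores) / len(sentence_scores)


# Made-up sentence scores for two hypothetical articles.
long_article = [1.0] * 40 + [-1.0] * 5   # 45 sentences, mostly mildly positive
short_article = [2.0, 1.0]               # 2 sentences, strongly positive

raw_long, raw_short = sum(long_article), sum(short_article)
norm_long = normalized_score(long_article)
norm_short = normalized_score(short_article)
```

Here the long article wins on the raw sum while the short one wins on the mean. The mean discards how much was written, which is the kind of information loss mentioned above.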
[Chart: Score of positive, 2016/10/1 to 2017/3/1; y axis from 0 to 700]
22. Result
The sentiment score of news
per week
• Almost all news articles are positive;
only a few are negative.
• We analysed a few negative cases
and found that they are usually short
and about bicycles being destroyed.
[Chart: Score of negative, 2016/10/1 to 2017/3/1; y axis from -16 to 0]
23. Further work
• Start collecting GPS data to build a long-term history and understand
frequency of usage by time of day and day of the week.
• Create a cluster map so users know the location of the nearest type 1 or type 2
Mobike
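Once historical GPS records exist, usage by day of the week and hour could be tallied along the lines of the sketch below. The timestamp format and the sample records are assumptions, since no historical data set had been collected yet.

```python
from collections import Counter
from datetime import datetime


def usage_by_weekday_hour(timestamps, fmt="%Y-%m-%d %H:%M:%S"):
    """Count records per (weekday, hour); weekday 0 is Monday, 6 is Sunday."""
    counts = Counter()
    for ts in timestamps:
        dt = datetime.strptime(ts, fmt)
        counts[(dt.weekday(), dt.hour)] += 1
    return counts


# Made-up timestamps: two on a Monday morning, one on a Saturday afternoon.
records = [
    "2017-03-06 08:15:00",
    "2017-03-06 08:40:00",
    "2017-03-11 14:05:00",
]
usage = usage_by_weekday_hour(records)
```

The resulting counts can be laid out as a 7 by 24 grid to show when demand peaks.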
24. Limitations
• How were we limited?
– We only had GPS data for one brand
– The newspaper data was very noisy and overly positive in tone
– Our Zhihu contact got back to us too late, and we didn’t have enough time to evaluate the data
• What did we learn?
– How to work in a group across cultural and language differences
– Which data is worth manipulating to tell a story and which to ignore
• What would we do differently?
– Create a web app to make the work accessible and more user-friendly