4. Your business is at a standstill
• Although the number of stations is unchanged,
the number of trips has stalled.
5. Separate annual users and daypass users
• Annual users are probably increasing
2015 may be a special case
• Daypass users are steadily leaving
6. Focus on daypass users
• The number of trips fell to about half
There may be a problem with the daypass service
7. Why focus on daypass users?
To expand your business,
it is essential to increase daypass trips
8. Why is it essential to increase daypass trips?
• The number of annual users is limited
Annual users can be considered residents or regular commuters
Their number is capped by the population of the city
• The number of daypass users has no such limit
Daypass users can be considered tourists
The number of tourists is not capped by the city's population
9. Focus on trips in San Francisco
• There are two reasons to focus on San Francisco
Most trips stay within one city
San Francisco has the majority of trips
10. Most trips stay within one city
• Only a few users ride to another city
It is enough to analyze trip data within one city
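The claim that few trips cross city lines can be checked with a simple share calculation. The sketch below is only an illustration: the `(start_city, end_city)` record layout and the sample rows are assumptions, not the actual trip data.

```python
# Hypothetical trip records as (start_city, end_city); the rows are made up.
trips = [
    ("San Francisco", "San Francisco"),
    ("San Francisco", "San Francisco"),
    ("San Jose", "San Jose"),
    ("Palo Alto", "Mountain View"),
    ("San Francisco", "San Francisco"),
]


def cross_city_share(trips):
    """Return the fraction of trips whose start and end cities differ."""
    if not trips:
        return 0.0
    crossing = sum(1 for start, end in trips if start != end)
    return crossing / len(trips)


print(f"Cross-city share: {cross_city_share(trips):.0%}")
```

A small share here would support analyzing each city on its own.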
11. San Francisco has the majority of trips
• Visualize where daypass users use this service
Analyzing trip data in San Francisco is likely to have a big impact
12. Which terminals do daypass users use most?
• Embarcadero at Sansome and Harry Bridges Plaza are
the most used terminals
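One way to rank terminals is to count how often each one appears in daypass trips. This is a minimal sketch under assumptions: counting both trip starts and trip ends as "use" is a guess at the definition, and the sample trips and the names "Terminal C" and "Terminal D" are placeholders.

```python
from collections import Counter

# Hypothetical daypass trips as (start_terminal, end_terminal); the rows are made up.
daypass_trips = [
    ("Embarcadero at Sansome", "Harry Bridges Plaza"),
    ("Harry Bridges Plaza", "Embarcadero at Sansome"),
    ("Embarcadero at Sansome", "Terminal C"),
    ("Terminal D", "Embarcadero at Sansome"),
    ("Harry Bridges Plaza", "Terminal D"),
]


def terminal_usage(trips):
    """Count how often each terminal appears as a trip start or a trip end."""
    usage = Counter()
    for start, end in trips:
        usage[start] += 1
        usage[end] += 1
    return usage


# Names of the two most used terminals in the sample data.
top_two = [name for name, _count in terminal_usage(daypass_trips).most_common(2)]
print(top_two)
```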
(Figure label: 2014)
13. Which terminals do daypass users use most?
• The size of each circle shows how many times the terminal is used
• Both terminals are along the coastline
(Figure label: Daypass user 2016)
14. Visualize daypass user trips
• Line colors show how many trips occur between two stations
Red: more than 200 trips
Orange: between 150 and 200 trips
Blue: between 100 and 150 trips
• Embarcadero at Sansome Terminal
seems to be a hub
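The color legend for the trip lines amounts to a small binning rule. The sketch below assumes how the boundaries are handled (exactly 200 counts as orange, exactly 150 as orange, exactly 100 as blue) and assumes that station pairs with fewer than 100 trips are not drawn; the slide does not state either point.

```python
def line_color(trip_count):
    """Map the number of trips between two stations to a line color.

    Returns None for pairs below 100 trips (assumed not to be drawn).
    """
    if trip_count > 200:
        return "red"
    if trip_count >= 150:
        return "orange"
    if trip_count >= 100:
        return "blue"
    return None


for count in (250, 180, 120, 40):
    print(count, line_color(count))
```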
(Figure label: 2016)
15. Decide what to analyze
• Use a random forest classifier to decide what to analyze
A trip has many features, so it is hard to decide which ones to analyze
e.g. where it starts, where it ends, when it happens, how long it lasts, etc.
• The random forest classifier is a powerful classification algorithm.
It is often more accurate than a single decision tree classifier or an SVM (SVC), though this depends on the data
• The algorithm reports the importance of each feature it used to separate the classes
We can use this to decide which variables to analyze
16. Classify annual users and daypass users
• Make dummy variables
Include one-hot vectors for start station, end station, hour, and day of week,
plus the trip duration in minutes
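The dummy-variable step can be sketched in plain Python. This is an illustration only: the field names (`start_station`, `end_station`, `hour`, `weekday`, `duration_min`) and the two-station list are assumptions, and the original analysis may have used a library helper instead.

```python
def one_hot(value, categories):
    """Return a list with 1 at the position of `value` and 0 everywhere else."""
    return [1 if value == category else 0 for category in categories]


def encode_trip(trip, stations):
    """Encode one trip as one-hot vectors plus the raw duration in minutes."""
    return (
        one_hot(trip["start_station"], stations)
        + one_hot(trip["end_station"], stations)
        + one_hot(trip["hour"], range(24))      # hour of day, 0..23
        + one_hot(trip["weekday"], range(7))    # day of week, 0..6
        + [trip["duration_min"]]                # kept as a numeric feature
    )


# Hypothetical station list and trip record.
stations = ["Embarcadero at Sansome", "Harry Bridges Plaza"]
trip = {
    "start_station": "Embarcadero at Sansome",
    "end_station": "Harry Bridges Plaza",
    "hour": 14,
    "weekday": 5,
    "duration_min": 12.5,
}
vector = encode_trip(trip, stations)
print(len(vector), vector[:4], vector[-1])
```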
17. Classify annual users and daypass users
• Random forest can classify the two user types very well
18. Get important features
• Duration is the most important feature for classifying the two user types
We can analyze the data using these features.
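Reading feature importances from a random forest could look like the sketch below. It assumes scikit-learn and NumPy are available and uses synthetic stand-in data in which daypass trips are deliberately longer, so the ranking it prints says nothing about the real trips. For brevity, hour and weekday are kept as plain integers here rather than one-hot vectors.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 500  # synthetic trips per user type

# Synthetic stand-in data (NOT the real trip data).
duration_min = np.concatenate([rng.normal(10, 3, n), rng.normal(45, 15, n)])
hour = rng.integers(0, 24, 2 * n)
weekday = rng.integers(0, 7, 2 * n)

X = np.column_stack([duration_min, hour, weekday])
y = np.array([0] * n + [1] * n)  # 0 = annual user, 1 = daypass user
feature_names = ["duration_min", "hour", "weekday"]

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X, y)

# Rank features by the importance the fitted forest assigns to them.
ranked = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, importance in ranked:
    print(f"{name}: {importance:.3f}")
```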
19. See important features
• There is a clear difference in how many minutes users ride
(mean over all annual user trips vs. all daypass user trips)
(Figure label: 2016)
21. See important features
• There is a clear difference in trip share at Embarcadero at Sansome,
but little difference in trip share at Harry Bridges Plaza
23. Why did customer trips greatly decrease?
• The business environment may be the cause
• A fault in your service may be the cause
• A competitor may be the cause
24. Why did customer trips greatly decrease?
• The business environment may be the cause
Did the number of tourists decrease?
• A fault in your service may be the cause
Are some terminals frequently out of service?
• A competitor may be the cause
Did a particular type of user disappear?
25. Why did daypass user trips greatly decrease?
• The business environment may be the cause
Did the number of tourists decrease?
If the total number of tourists decreases, daypass user trips will
decrease too,
because most daypass users can be considered tourists.
26. Why did daypass user trips greatly decrease?
• A fault in your service may be the cause
Are some terminals frequently out of service?
If some terminals are frequently out of service, they lose
ride opportunities and earn a bad reputation,
and that bad reputation may drive users away.
27. Why did daypass user trips greatly decrease?
• A competitor may be the cause
Did a particular type of user disappear?
If a particular type of user disappears, daypass user trips will decrease,
and a competitor would be the likely cause.
29. Did the number of tourists decrease?
• The number of tourists is rising
• The business environment is
probably not the cause of the decrease in daypass
user trips
• http://www.sftravel.com/article/san-francisco-travel-reports-record-breaking-tourism-2016
• http://www.sftravel.com/article/san-francisco-travel-reports-record-breaking-year-tourism
30. Are some terminals frequently out of service?
• Focus on the two main terminals and look at how they
perform
31. Are some terminals frequently out of service?
• Terminal availability is improving or unchanged,
but we can't draw a conclusion from the data of these two stations alone
32. Are some terminals frequently out of service?
• Availability at the top 5 most used terminals is improving
The total minutes with no bikes or no docks at the top 5 terminals
is decreasing.
Terminal availability is probably not
the cause of the decrease in daypass user trips
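The "minutes with no bikes or no docks" metric can be sketched as a count over per-minute status records. The record layout `(year, terminal, bikes_available, docks_available)` and the sample rows are assumptions made for illustration; the real status data may be shaped differently.

```python
from collections import defaultdict

# Hypothetical per-minute status records; the rows are made up.
status = [
    (2015, "Terminal A", 0, 15),   # no bikes
    (2015, "Terminal A", 3, 12),
    (2015, "Terminal A", 15, 0),   # no docks
    (2016, "Terminal A", 0, 15),   # no bikes
    (2016, "Terminal A", 5, 10),
    (2016, "Terminal A", 7, 8),
]


def unavailable_minutes(records):
    """Count, per year, the minutes in which a terminal had no bikes or no docks."""
    totals = defaultdict(int)
    for year, _terminal, bikes, docks in records:
        if bikes == 0 or docks == 0:
            totals[year] += 1
    return dict(totals)


print(unavailable_minutes(status))
```

A falling total from one year to the next would point away from terminal outages as the cause.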
33. Did a particular type of user disappear?
• Look at duration, the most important feature for classifying daypass users
3000 seconds (50 minutes) is longer than the free 30-minute ride
34. Did a particular type of user disappear?
• Separate daypass trips into two types: over 30 minutes and within 30 minutes
In 2016, the mean duration of rides over 30
minutes greatly decreased
This shows that long-duration users left
Your service didn't change,
so a competitor may be the cause
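The split at the 30-minute free-ride limit and the per-year mean can be sketched as follows. The `(year, duration_seconds)` layout and every sample value are made up for illustration; only the 30-minute threshold comes from the slides.

```python
from collections import defaultdict
from statistics import mean

FREE_RIDE_SECONDS = 30 * 60  # the free 30-minute ride, in seconds

# Hypothetical daypass trips as (year, duration_seconds); the rows are made up.
trips = [
    (2015, 900), (2015, 1500), (2015, 7200), (2015, 9000),
    (2016, 1200), (2016, 1500), (2016, 3600), (2016, 2400),
]


def mean_duration_by_group(trips):
    """Return the mean duration per (year, group), split at the free-ride limit."""
    groups = defaultdict(list)
    for year, seconds in trips:
        label = "over_30min" if seconds > FREE_RIDE_SECONDS else "within_30min"
        groups[(year, label)].append(seconds)
    return {key: mean(values) for key, values in groups.items()}


result = mean_duration_by_group(trips)
for key in sorted(result):
    print(key, result[key])
```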
35. Did a particular type of user disappear?
• For confirmation, look at station data, the second most important feature
The shape of the trip share across terminals didn't change
36. Did a particular type of user disappear?
• For confirmation, look at station data, the second most important feature
The shape of the trip share across terminals didn't change
Station data can't tell us whether a particular type of user disappeared
37. Analysis conclusion
• The business environment is probably not the cause of the decrease in daypass
user trips
• Terminal availability is probably not the cause of the decrease in daypass user trips
• A competitor may be the cause of the decrease in long-duration daypass users
39. Win back long-duration daypass users
• Who
Target long-duration daypass users
• What value
Cheaper trips than now
• Where
San Francisco, especially at the Embarcadero at Sansome terminal
• How
Offer a new price plan for long-duration users
40. Win back long-duration daypass users
• Who
Target long-duration daypass users,
because the decrease in daypass trips
is caused by long-duration daypass users
leaving.
41. Win back long-duration daypass users
• What value
Cheaper trips than now,
because your service is currently expensive for long-duration daypass users
42. Win back long-duration daypass users
• What value
Cheaper trips than now
Long-duration users currently pay about $28 on average
for their 2-hour trip.
43. Win back long-duration daypass users
• What value
Cheaper trips than now
$28 is far more expensive than
your competitor
Your competitor also won a TripAdvisor
award in 2015 (see the earlier slide for
what happened to the mean durations
in 2015)
https://www.bikeandview.com/?gclid=EAIaIQobChMIko-tq6aP1QIVmAoqCh1OiQl5EAAYASAAEgJFqPD_BwE
44. Win back long-duration daypass users
• Where
San Francisco, especially at the Embarcadero at Sansome terminal
The destination of long-duration users may be the
Golden Gate Bridge. This trip takes about
2 hours from the Embarcadero at Sansome terminal,
and 2 hours matches the mean duration of long-duration
users' trips.
This is only a hypothesis because I don't have data
to verify it.
45. Win back long-duration daypass users
• How
Offer a new price plan for long-duration users
Create a 2-hour plan at around $20,
because your competitor offers
the same service for about $20 or less.
This plan gives long-duration users a good alternative
when they ride to the Golden Gate Bridge.
50. Clustering daypass user trips
• Reduce the dummy variables to 2 variables with PCA
Include one-hot vectors for start station, end station, hour, and day of week,
plus the trip duration in minutes
51. Clustering daypass user trips
• Cluster daypass user trips with k-means++ on the PCA components
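The PCA and k-means++ steps might be chained as in the sketch below, assuming scikit-learn and NumPy are available. The input matrix is random stand-in data, not the encoded trips, and the choice of 3 clusters is an assumption, since the slides do not state how many clusters were used.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Synthetic stand-in for the encoded trips: 10 one-hot-like columns plus duration.
one_hot_part = rng.integers(0, 2, size=(200, 10)).astype(float)
duration_min = rng.normal(40, 15, size=(200, 1))
X = np.hstack([one_hot_part, duration_min])

# Reduce every trip to 2 principal components.
components = PCA(n_components=2).fit_transform(X)

# Cluster the 2-D points; init="k-means++" selects the k-means++ seeding.
kmeans = KMeans(n_clusters=3, init="k-means++", n_init=10, random_state=0)
labels = kmeans.fit_predict(components)

print(components.shape, np.bincount(labels))
```

Because duration has a much larger scale than the one-hot columns, it will dominate the components unless the features are scaled first; whether the original analysis scaled them is not stated.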
52. Estimate how many trips will occur from weather data
Cloud cover is the most important
feature for predicting the number of
daily trips.
53. Estimate how many trips will occur from weather data
• Decide which variables to use for the estimate by comparing their coefficients of variation
• To merge the weather data with the trip data, add year, month, and day columns
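Both steps can be sketched with the standard library. The coefficient of variation is the standard deviation divided by the mean; the population form is used here as an assumption. The weather field names and all sample values are hypothetical, and the join uses a `(year, month, day)` key on both sides.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Standard deviation divided by the mean (population form)."""
    return pstdev(values) / mean(values)


# Hypothetical daily weather rows and daily trip counts, keyed by (year, month, day).
weather = {
    (2016, 1, 1): {"cloud_cover": 2, "mean_temp_f": 50},
    (2016, 1, 2): {"cloud_cover": 6, "mean_temp_f": 52},
    (2016, 1, 3): {"cloud_cover": 4, "mean_temp_f": 51},
}
daily_trips = {(2016, 1, 1): 120, (2016, 1, 2): 80, (2016, 1, 3): 100}

# Compare how much each weather variable varies relative to its mean.
cv = {
    name: coefficient_of_variation([row[name] for row in weather.values()])
    for name in ("cloud_cover", "mean_temp_f")
}

# Join the two tables on the shared (year, month, day) key.
merged = [
    {"year": y, "month": m, "day": d, "trips": daily_trips[(y, m, d)], **weather[(y, m, d)]}
    for (y, m, d) in sorted(weather.keys() & daily_trips.keys())
]

print(cv)
print(merged[0])
```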