My second talk, given on behalf of my student Homayoun Hamed and postdoc Ingrida Steponavice, at the Urban Systems and Network Science (UrbanNets) satellite workshop at NetSci 2017 in Indianapolis, organized by Marta Gonzalez (MIT).
2. 2
§ Nodes are origins and destinations
§ Links are trips between origin and destination pairs
§ Mobility networks are large.
§ Examples:
§ ~14 million trips per day in Melbourne, Australia
§ ~22 million trips per day in Chicago, IL
§ Over 450,000 trips per day (~1 million taxi passengers) in New York City
§ Mobility networks are spatial and temporal.
§ Mobility networks are weighted and directed.
§ Traditionally difficult to observe; often inferred from household
travel survey data.
§ More recently observed using mobile phone data and GPS tracks.
Introduction
3. 3
Large-scale origin-destination trip matrix
OD matrix representation generated from New York taxi
trips on 14 Feb 2015, 8-9 AM scaled logarithmically
4. 4
§ Our analysis builds on a study by Louail et al. (2014, 2015)
proposing an OD matrix coarse graining method.
§ Locations are classified as hotspot or non-hotspot origins/
destinations based on the number of trips to/from each zone.
§ Reducing all commuting flows into four major flows:
- Integrated: from hotspots to hotspots (e.g. home to work)
- Convergent: from non-hotspots to hotspots
- Divergent: from hotspots to non-hotspots
- Random: from non-hotspots to non-hotspots
§ The outcome is a 2 x 2 matrix
Introduction
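The coarse graining above can be sketched in a few lines. This is only an illustration: the function name `icdr_matrix`, the boolean hotspot masks, and the row/column layout of the 2 x 2 result (rows: origin hotspot / non-hotspot, columns: destination hotspot / non-hotspot) are assumptions, not taken from Louail et al.

```python
import numpy as np

def icdr_matrix(od, origin_hot, dest_hot):
    """Collapse an OD matrix into 2 x 2 ICDR flow shares.

    od         : (n, n) array, od[i, j] = trips from zone i to zone j
    origin_hot : length-n booleans, True where the zone is an origin hotspot
    dest_hot   : length-n booleans, True where the zone is a destination hotspot
    Assumed layout of the result: [[Integrated, Divergent],
                                   [Convergent, Random]]
    """
    od = np.asarray(od, dtype=float)
    o = np.asarray(origin_hot, dtype=bool)
    d = np.asarray(dest_hot, dtype=bool)
    integrated = od[np.ix_(o, d)].sum()    # hotspot     -> hotspot
    divergent = od[np.ix_(o, ~d)].sum()    # hotspot     -> non-hotspot
    convergent = od[np.ix_(~o, d)].sum()   # non-hotspot -> hotspot
    random_ = od[np.ix_(~o, ~d)].sum()     # non-hotspot -> non-hotspot
    flows = np.array([[integrated, divergent], [convergent, random_]])
    return flows / od.sum()                # shares of all trips

# Toy example with three zones (made-up counts, not from the taxi data).
toy_od = [[0, 5, 1],
          [2, 0, 1],
          [1, 0, 0]]
shares = icdr_matrix(toy_od, [True, False, False], [False, True, False])
```

The four shares sum to 1, so each hourly network reduces to one small matrix that can be compared across hours.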
5. 5
§ Data description
- New York taxi data
- 14,025,351 taxi trip records in February 2015
- A trip record includes pick up and drop off timestamp and
location coordinates with trip distance
§ Data cleaning
- Average travel speed is calculated for each trip.
- Trips with average speed greater than 105 km/h are removed.
- Trips with duration shorter than 60 s are removed.
- Trips with distance shorter than 300 m are removed.
- Trips with distance greater than 3x the Manhattan distance
between pick-up and drop-off are removed.
Data
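The cleaning rules above can be written as a single predicate. This is a minimal sketch, not the authors' code: the function `keep_trip`, its argument names, and the assumption that coordinates are already projected to metres (so Manhattan distance is |dx| + |dy|) are all illustrative.

```python
from datetime import datetime, timedelta

MAX_SPEED_KMH = 105.0    # thresholds as stated on the slide
MIN_DURATION_S = 60.0
MIN_DISTANCE_M = 300.0
MAX_DETOUR_RATIO = 3.0   # trip distance vs. Manhattan distance

def keep_trip(pickup_time, dropoff_time, distance_m, pickup_xy, dropoff_xy):
    """Return True if a trip record passes all four cleaning rules.

    Assumes pickup_xy / dropoff_xy are (x, y) in metres (hypothetical
    projected coordinates, not raw longitude/latitude).
    """
    duration_s = (dropoff_time - pickup_time).total_seconds()
    if duration_s < MIN_DURATION_S:
        return False
    if distance_m < MIN_DISTANCE_M:
        return False
    speed_kmh = (distance_m / 1000.0) / (duration_s / 3600.0)
    if speed_kmh > MAX_SPEED_KMH:
        return False
    manhattan_m = (abs(dropoff_xy[0] - pickup_xy[0])
                   + abs(dropoff_xy[1] - pickup_xy[1]))
    if distance_m > MAX_DETOUR_RATIO * manhattan_m:
        return False
    return True

# Usage with a made-up record: 3 km in 10 minutes between nearby points.
t0 = datetime(2015, 2, 14, 8, 0, 0)
ok = keep_trip(t0, t0 + timedelta(minutes=10), 3000.0, (0, 0), (2000, 500))
```

Checking duration first keeps the speed calculation safe from division by zero.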
6. 6
Methodology
Zoning
- 1 km² square cells
- Each cell is associated with a node
Generating weighted directed network
- Trips from node i to node j determine the weight and average
distance attributes of the edge.
- Each edge has an ordered pair attribute showing the spatial
direction of the edge.
- One network is generated for each hour interval.
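The zoning and network-generation steps can be sketched as follows. The slides do not say how the ordered-pair direction attribute is computed; here it is assumed to be the displacement between the origin and destination cell indices. The trip tuple layout and all names are hypothetical.

```python
import math
from collections import defaultdict

CELL_SIZE_M = 1000.0  # 1 km x 1 km square cells

def cell_of(x, y):
    """Map projected coordinates in metres to a grid cell (node) id."""
    return (math.floor(x / CELL_SIZE_M), math.floor(y / CELL_SIZE_M))

def build_hourly_networks(trips):
    """Build one weighted directed network per hour.

    trips: iterable of (hour_key, (ox, oy), (dx, dy), distance_m).
    Returns {hour_key: {(node_i, node_j): edge_attributes}}.
    """
    acc = defaultdict(lambda: defaultdict(lambda: [0, 0.0]))
    for hour, origin, dest, dist in trips:
        i, j = cell_of(*origin), cell_of(*dest)
        entry = acc[hour][(i, j)]
        entry[0] += 1      # trip count becomes the edge weight
        entry[1] += dist   # running total for the average distance
    networks = {}
    for hour, edges in acc.items():
        networks[hour] = {
            (i, j): {
                "weight": count,
                "avg_distance": total / count,
                # Assumed direction: displacement in cell units.
                "direction": (j[0] - i[0], j[1] - i[1]),
            }
            for (i, j), (count, total) in edges.items()
        }
    return networks

# Made-up trips: two in the 8 AM hour on the same edge, one at 9 AM.
nets = build_hourly_networks([
    (8, (100, 100), (2500, 100), 2600.0),
    (8, (900, 900), (2100, 300), 2400.0),
    (9, (100, 100), (100, 1500), 1500.0),
])
```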
8. 8
Methodology: Classifying zones
Labeling flow types
- Each edge is labeled with a flow type according to the nodes
it connects.
- Flow types: Integrated, Convergent, Divergent, and Random.
Determining origin/destination hotspots
For a list of zones sorted by their flux-in/out, the division point
between hotspots and non-hotspots is the one that minimizes the sum
of differences between the flux-in/out values and their
corresponding class mean value.
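$$\operatorname*{arg\,min}_{k} \left[ \sum_{i=1}^{k} \left| f_i - \frac{1}{k} \sum_{j=1}^{k} f_j \right| + \sum_{i=k+1}^{n} \left| f_i - \frac{1}{n-k} \sum_{j=k+1}^{n} f_j \right| \right]$$
where f_i is the flux-in/out of the i-th zone in the sorted list, n is the number of zones, and k is the division point.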
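The division-point rule described above can be sketched as a brute-force search over all split positions. The slide's formula did not survive extraction cleanly, so the use of absolute differences here is an assumption (a signed sum around a class mean would always be zero). The names `division_point` and `hotspot_zones` are illustrative.

```python
def division_point(sorted_flux):
    """Return k such that the first k values form the hotspot class.

    sorted_flux: flux values sorted in descending order (at least two).
    The cost of a split is the sum of absolute deviations of each value
    from the mean of its own class (assumed form of the criterion).
    """
    n = len(sorted_flux)
    if n < 2:
        raise ValueError("need at least two zones to split")
    best_k, best_cost = None, float("inf")
    for k in range(1, n):
        high, low = sorted_flux[:k], sorted_flux[k:]
        mean_high = sum(high) / k
        mean_low = sum(low) / (n - k)
        cost = (sum(abs(v - mean_high) for v in high)
                + sum(abs(v - mean_low) for v in low))
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k

def hotspot_zones(flux_by_zone):
    """Return the set of zone ids classified as hotspots."""
    ranked = sorted(flux_by_zone, key=flux_by_zone.get, reverse=True)
    k = division_point([flux_by_zone[z] for z in ranked])
    return set(ranked[:k])

# Made-up fluxes: three busy zones and four quiet ones.
hot = hotspot_zones({"a": 5, "b": 100, "c": 90, "d": 3,
                     "e": 95, "f": 4, "g": 2})
```

The same function would be run twice per hourly network, once on flux-out to find origin hotspots and once on flux-in to find destination hotspots.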
9. 9
Methodology: Finding flow direction
Edge e has an attribute vector (t_e, w_e, d_e, (x_e, y_e)) where:
– t_e is the flow type
– w_e is the weight
– d_e is the average distance
– (x_e, y_e) is the spatial direction
Calculating overall direction for each flow type
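The slides do not give the formula for the overall direction, so the following is only one plausible sketch: each edge's direction is normalized to a unit vector, weighted by the edge weight, summed per flow type, and reported as an angle. The function name and this weighting scheme are assumptions.

```python
import math
from collections import defaultdict

def overall_direction(edges):
    """Return {flow_type: angle in degrees, 0 = +x axis, counter-clockwise}.

    edges: iterable of (t_e, w_e, d_e, (x_e, y_e)) attribute vectors.
    Assumed aggregation: weight-weighted sum of unit direction vectors.
    """
    sums = defaultdict(lambda: [0.0, 0.0])
    for flow_type, weight, _avg_distance, (x, y) in edges:
        norm = math.hypot(x, y)
        if norm == 0:
            continue  # edges within a single zone carry no direction
        sums[flow_type][0] += weight * x / norm
        sums[flow_type][1] += weight * y / norm
    return {t: math.degrees(math.atan2(sy, sx)) % 360.0
            for t, (sx, sy) in sums.items()}

# Made-up edges: equal integrated flow east and north, convergent flow west.
angles = overall_direction([
    ("I", 10, 2.0, (1, 0)),
    ("I", 10, 3.0, (0, 2)),
    ("C", 5, 1.0, (-3, 0)),
    ("C", 1, 1.0, (0, 0)),
])
```

Normalizing first means a long edge does not count more than a short one; only trip volume (the weight) does.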
10. 10
§ The probability that a zone is an origin/
destination hotspot is calculated across all
hourly networks generated from the data.
§ Hotspots (both origin and
destination) are mostly in the
Manhattan area with two or three
zones fairly stable at JFK airport.
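One way to read the hotspot probability above is as the fraction of hourly networks in which a zone is classified as a hotspot. A minimal sketch under that assumption, with a hypothetical function name and input format:

```python
from collections import Counter

def hotspot_probability(hourly_hotspots):
    """Return {zone: fraction of hourly networks where zone is a hotspot}.

    hourly_hotspots: list of sets of zone ids, one set per hourly network.
    Zones that are never hotspots are omitted (probability 0).
    """
    n_hours = len(hourly_hotspots)
    counts = Counter(zone for hotspots in hourly_hotspots for zone in hotspots)
    return {zone: c / n_hours for zone, c in counts.items()}

# Made-up example with four hourly networks.
probs = hotspot_probability([{"A", "B"}, {"A"}, {"A", "C"}, set()])
```

Running it separately on origin and destination hotspot sets gives the two maps shown on this slide.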
Results: Hotspots
Figure panels: Origin Hotspots; Destination Hotspots
11. 11
Results: ICDR flows
§ Time series of the hourly mean proportion of each ICDR (Integrated, Convergent, Divergent, Random) flow are obtained.
§ I and C flows have a positive correlation.
§ D and R flows have a positive correlation.
§ I and D flows have a negative correlation.
§ C and R flows have a negative correlation.
§ During the early morning peak hours, I and C flows dominate.
They then decrease gradually through the day, although I remains
the largest share of the flows for almost every hour.
12. 12
Results: Directionality analysis
Figure panels: Integrated; Convergent
§ Integrated flows show two opposite major
directions during a day.
§ Overall direction for convergent flows is
towards Manhattan from outside
throughout most of the day.
14. 14
Conclusions
§ We determined origin/destination hotspots with a simple
classification method, and accordingly determined the major
sub-flows in the taxi mobility network.
§ The flows are characterized by their overall direction and its
temporal changes, and weekly and daily patterns are shown
to exist.
§ The findings provide a better understanding of the
spatio-temporal structure of the mobility network in New York.
§ What’s next?
§ Apply the same method to Chicago taxi data
§ Further explore the spatial and temporal characteristics