2. Outline of the presentation
- Project goals
- Context
- What is car sharing
- Transport, mobility and Big data
- Project overview
- Data Structures
- Python and MongoDB
- Results
3. Project goals
- Collect, organize and process data about car sharing systems,
and integrate them with other data sources.
- Implement a Decision Support Tool (DST) to help planners,
service operators, public administrations and users make
decisions when interacting with transport systems.
4. What is car sharing - 1
- Model of car rental where people rent cars for short periods of time
- Booking done through website or app
- Available providers in Torino:
- Enjoy, Car2go, BlueTorino, IoGuido
- Slightly different operating areas
- e.g. Caselle airport link covered only by Car2go
- Focus on Enjoy and Car2go
- Time based billing policy
- Reservation free-of-charge intervals
6. Transport, mobility and Big Data - 1
- Heterogeneous framework
- Data sources
- Car2go: public API
- Enjoy: scraping from website
- BlueTorino: no API, and strong authentication mechanism on website
- Google directions API
- Geoportale: Torino’s shapefiles
- Feature extraction
- Create a time-indexed sequence of statuses for each car
8. Project Overview - Main tools
- The implemented software is written in Python
- Open source
- High-performing scientific libraries
- Data are stored in a MongoDB database
- Open source
- NoSQL
- JSON oriented
- Dynamic schema
- Python compatible
9. Project Overview - Data collection
- Provider data gathering
- For each provider, acquire the position of all available cars over time
- Sampling period: 1 minute
- The car is on the map: the car is parked and available for a booking
- The car is not on the map: the car is booked
- Car information
- Plate and Vehicle Identification Number
- GNSS location
- Fuel status
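As a minimal sketch of what one collected snapshot might look like before it is stored, the Python below builds a single document for one provider at one sampling instant. The field names (`provider`, `timestamp`, `cars`, `plate`, `vin`, `lat`, `lon`, `fuel`) and the sample values are assumptions made for illustration, not the project's actual schema.

```python
from datetime import datetime, timezone


def make_snapshot(provider, timestamp, cars):
    """Build one snapshot document: the cars visible on a provider's map at one instant."""
    return {
        "provider": provider,
        "timestamp": timestamp,
        # Only parked, bookable cars appear on the map, so only they are listed.
        "cars": [
            {
                "plate": car["plate"],
                "vin": car["vin"],
                "lat": car["lat"],
                "lon": car["lon"],
                "fuel": car["fuel"],
            }
            for car in cars
        ],
    }


# Hypothetical sample values, not real data.
sample = make_snapshot(
    "car2go",
    datetime(2016, 12, 10, 8, 0, tzinfo=timezone.utc),
    [{"plate": "AA000AA", "vin": "VIN0001", "lat": 45.0703, "lon": 7.6869, "fuel": 80}],
)
```

A dictionary of this kind maps directly onto a JSON-like document, which is the shape a document database such as MongoDB stores.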
10. Project Overview - Feature extraction
- Process snapshots in a given time interval:
- If car is present in the i-th snapshot and it is not in the (i+1)-th: booking detected
- If car is not present in the i-th snapshot and it is in the (i+1)-th: parking detected
- Shape bookings and parkings into a relational schema:
- Parkings attributes:
- Position
- Initial and final datetime
- Duration
- (...)
- Bookings attributes:
- Initial and final datetime
- Initial and final position
- Beeline distance
- Duration
- (…)
- Integrate with Google Directions API data
- Better estimation of distance and duration
- Comparison with public transport
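The snapshot comparison rule described on this slide can be sketched as follows. This is an illustrative version, not the project's custom algorithm: the function name `detect_events`, the representation of a snapshot as a `(timestamp, set_of_plates)` pair, and the choice to stamp each event with the later snapshot's timestamp are all assumptions.

```python
def detect_events(snapshots):
    """Compare consecutive snapshots and return (booking_starts, parking_starts).

    `snapshots` is a time-ordered list of (timestamp, set_of_plates) pairs.
    Each event is a (plate, timestamp) tuple.
    """
    booking_starts = []
    parking_starts = []
    for (_, prev), (t_next, nxt) in zip(snapshots, snapshots[1:]):
        # Present in snapshot i, missing in snapshot i+1: booking detected.
        for plate in sorted(prev - nxt):
            booking_starts.append((plate, t_next))
        # Missing in snapshot i, present in snapshot i+1: parking detected.
        for plate in sorted(nxt - prev):
            parking_starts.append((plate, t_next))
    return booking_starts, parking_starts


# Hypothetical example: timestamps are minute indexes, plates are placeholders.
snaps = [(0, {"A", "B"}), (1, {"B"}), (2, {"A", "B"})]
bookings, parkings = detect_events(snaps)
```

In this example car "A" disappears from the map at minute 1 and reappears at minute 2, so one booking start and one parking start are reported, while car "B" never changes state.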
11. Project Overview - Data analysis
- Time interval:
- December 10th 2016 - January 31st 2017: 52 days
- Number of analyzed records:
- 125867 snapshots
- 215014 bookings
- 215400 parkings
- Filtering criteria for bookings:
- Duration > 3 and < 80 minutes
- Distance > 0.02 kilometers
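A minimal sketch of these filtering criteria is shown below. The slide does not say which distance the threshold applies to or how the beeline distance is computed; this sketch assumes the beeline distance and uses the haversine formula with a mean Earth radius of 6371 km. The names `beeline_km` and `keep_booking` are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed value for this sketch)


def beeline_km(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in kilometers between two points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def keep_booking(duration_min, distance_km):
    """Apply the slide's criteria: 3 < duration < 80 minutes and distance > 0.02 km."""
    return 3 < duration_min < 80 and distance_km > 0.02
```

For instance, `keep_booking(10, 1.5)` accepts a ten-minute, 1.5 km trip, while `keep_booking(2, 1.5)` and `keep_booking(10, 0.01)` reject trips that are too short in time or distance.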
18. Faced Issues - 1
- Data collection
- Scrape websites
- Scraper failures
- Server failures
- Data management
- Speed up booking and parking extraction
- Custom algorithm
19. Faced Issues - 2
- Feature extraction
- Booking and parking coordinates outside Torino
- Providers deliver data in different formats
- Distinguishing between short bookings and GNSS errors
- Filtering Criteria
- Google Directions interaction
- Failures
- Different units of measurement
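One way the unit mismatch could be handled is sketched below. The Google Directions API reports each leg's distance in meters and duration in seconds in its `value` fields, whereas the filtering criteria in this deck are expressed in kilometers and minutes. The function name `leg_to_km_minutes` and the sample leg are hypothetical, and how the project actually converted units is not stated.

```python
def leg_to_km_minutes(leg):
    """Convert one Directions leg from meters/seconds to kilometers/minutes."""
    distance_km = leg["distance"]["value"] / 1000.0  # meters -> kilometers
    duration_min = leg["duration"]["value"] / 60.0   # seconds -> minutes
    return distance_km, duration_min


# Hypothetical leg with only the fields this sketch reads.
sample_leg = {"distance": {"value": 5200}, "duration": {"value": 900}}
km, minutes = leg_to_km_minutes(sample_leg)
```

Converting once, at the point where the response is parsed, keeps every later comparison in a single set of units.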
21. References
Mobility Polito - GitHub
MongoDB Access
Host name: turinmobility.tk
DB name: CSMS
DB collection: Torino
User: viewer
1. Open Terminal
2. Type "mongo turinmobility.tk"
3. Type "use CSMS"
4. Type "db.auth('viewer','pass')"