Predictive Analytics for
Transportation in a High Dimensional
Heterogeneous Data World
Dr. Chandra Bhat
Center for Transportation Research
The University of Texas at Austin
Acknowledgments: D-STOP, Humboldt Award, Dr. Ram Pendyala, Dr. Kostas Goulias,
all my graduate/undergraduate students
World of high dimensional
heterogeneous data
• Providing accurate traffic
information is becoming an
imperative
• Cameras, GPS, cell phone tracking,
and probe vehicles are used to
supplement the information
provided by conventional
measurement systems.
• Methodologies to combine and
aggregate high dimensional
heterogeneous data are needed
Connected/Automated Vehicles (CAVs)
and big data
• The car of the near future will be a part of a gigantic data-collection
engine.
• Vehicles have embedded
– computers,
– GPS receivers,
– short-range wireless network interfaces,
– in-car sensors,
– cameras, and
– internet.
• Vehicles interact with
– roadside wireless sensor networks,
– passengers’ wireless devices, and
– other cars.
Data required to keep a CAV safely on
the road
• Highly detailed map information:
– shape and elevation of roadways,
– lane lines,
– intersections,
– crosswalks,
– speed limits, and
– traffic signals.
• Position, speed, and intentions of other vehicles and pedestrians.
• Position, speed, and intentions of other road users, such as…
– jaywalking pedestrians,
– cars coming out of hidden driveways,
– a stop sign held up by a crossing guard, and
– cyclists signaling their riding intent.
What can be inferred from CAV and
smartphone data
• Where people drive,
• when people drive,
• what route people take,
• where people stop,
• what people put in their car,
• why, how, and when people make decisions on the fly and change
their activity plan or route, and
• detailed crash data (speed, position, and intention at the moment
of the accident).
Data Science
• Not enough humans to process the data
• Machine learning, visualization, and
advanced computation techniques
• Statistics, social sciences, and domain knowledge
• High-dimensional heterogeneous data
DOMAIN KNOWLEDGE IN THE CONTEXT
OF TRAVEL DEMAND MODELING
Exogenous Variable Vector X
Model
• Conceptual/Theoretical/Methods/Tools and Techniques
• Specification and Definition of Alternatives
Activity-based Paradigm
Trip-based Paradigm
Outputs
Five Pillars of ABM Design
• Based on sound behavioral theory/paradigm
• Computationally feasible and tractable
– Model estimation
– Model implementation
• Optimal use of available data (present and future)
• ABM should be both an Activity-Based Model and an Agent-Based
Model
• Sensitive to policy issues and planning applications of interest
Behavioral Basis of ABM
• Decision hierarchies and choice processes
– A variety of behavioral decision structures possible
– Virtually all models assume a sequential decision structure
similar to traditional four-step models for computational
convenience
• Considerable evidence of simultaneity in behavioral choice
mechanisms
– Several choices made simultaneously as a lifestyle package
Behavioral Basis of ABM
• Examples of simultaneous choice packages
– Residential location, vehicle ownership, mode to pre-
planned activities (e.g., work)
– Activity type, activity duration, and activity timing
(scheduling)
• Behavioral heterogeneity
– Differences in choice processes across market segments
– Identify market segments both exogenously and
endogenously (latent market segments)
Time-Space Interactions
[Figure: time-space diagram with axes Time and Urban Space, showing Home, Work, two fixed activities (Activity 1 and Activity 2), and an activity at Location A]
Agent Interactions
• “I have a client meeting today; so I will take the car.”
• “I have to pick up Jane from school and go shopping later; I need the
car.”
• “My meeting is in the morning. I can pick up Jane from school today.
And we can go shopping together in the evening.”
• “OK, that sounds good. I’ll go ahead and take light rail today to
work. See you later.”
• “Hey, Mom and Dad, don’t forget; you have to drop me off at Johnny’s
house in the evening today.”
• “Don’t worry Jane; we’ll drop you off on the way to the store and pick
you up later. Run along now, you’ll miss the bus.”
Definition of an Activity
• Disaggregate activity purpose definition
– Challenge traditional notion of mandatory and discretionary
activities/trips
– Movie, ball game, and child’s tennis lesson or soccer game often
have spatial and/or temporal fixity
– Characterize activities and trips by level of spatial and temporal
fixity/constraints (besides purpose)
– Can be accomplished using concepts of time-space
geography
– An automated method adds attributes describing degrees of
freedom to the activity records in the data set, according to a set of
spatial/temporal fixity criteria
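As a rough illustration of tagging activity records by fixity, the sketch below flags each record as spatially and/or temporally fixed. The record fields (`location_choices`, `start_window_min`) and the thresholds are hypothetical, chosen only for this example; the slides do not specify the actual criteria.

```python
from dataclasses import dataclass


@dataclass
class ActivityRecord:
    purpose: str
    location_choices: int    # hypothetical: number of feasible locations
    start_window_min: float  # hypothetical: width of feasible start-time window (minutes)


def tag_fixity(rec, max_locations=1, max_window_min=15.0):
    """Return (spatially_fixed, temporally_fixed) under illustrative thresholds."""
    spatially_fixed = rec.location_choices <= max_locations
    temporally_fixed = rec.start_window_min <= max_window_min
    return spatially_fixed, temporally_fixed


records = [
    ActivityRecord("child's soccer game", 1, 0.0),
    ActivityRecord("grocery shopping", 12, 480.0),
    ActivityRecord("movie", 3, 10.0),
]

# Map each activity purpose to its (spatial, temporal) fixity flags.
tags = {r.purpose: tag_fixity(r) for r in records}
for purpose, flags in tags.items():
    print(purpose, flags)
```

Under these assumed thresholds, a nominally discretionary activity such as a movie still comes out temporally fixed, which is the point the slide makes about challenging the mandatory/discretionary split.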
Central Role of Time Use
• Notion of time is central to activity-based modeling
– Explicit modeling of activity durations (daily activity time allocation and
individual episode duration)
– Treat time as “continuous” and not as “discrete choice” blocks
• Activity engagement is the focus of attention
– Travel patterns are inferred as an outcome of activity participation and
time use decisions
– Continuous treatment of time dimension allows explicit consideration of
time constraints on human activities
• Reconcile activity durations with network travel durations (feedback
processes)
In Summary
• ABM should…
– Capture the central role of activities, time, and space
in a continuum
– Explicitly recognize constraints and interactions
– Represent simultaneity in behavioral choice processes
– Account for heterogeneity in behavioral decision
hierarchies
– Incorporate feedback processes to facilitate
integration with land use and network models
• SimAGENT does it all and more…
SIMAGENT (SIMULATOR OF ACTIVITIES,
GREENHOUSE GAS EMISSIONS, ENERGY,
NETWORKS, AND TRAVEL):
AN OVERVIEW
SimAGENT
[Figure: SimAGENT system flowchart. Components shown:]
• Base Year Inputs
– Aggregate socio-demographics (base year)
– Activity-travel environment characteristics (base year)
• Synthetic population generator (PopGen)
• Detailed individual-level socio-demographics (base year)
• Socio-economics, land-use and transportation system characteristics
simulator (CEMSELTS)
• Socio-demographics and activity-travel environment
• Activity-travel simulator (CEMDAP)
• Individual activity-travel patterns
• Dynamic Traffic Assignment (DTA)
• Link volumes and speeds
• Forecast Year Outputs and GHG Emissions Prediction
CEMDAP
A COMPREHENSIVE ECONOMETRIC
MICROSIMULATOR OF DAILY
ACTIVITY-TRAVEL PATTERNS
CEMDAP – The Core ABM in SimAGENT
Socio-Economic Data
PopGen
CEMSELTS
CEMDAP
• Simulates activity schedule and
travel characteristics for each
individual of the region
• Core module of SimAGENT
• 52 sub-models.
• Developed by UT Austin
Features of CEMDAP (continued)
• Changes in the activity-travel pattern of one individual in a
household may bring about changes in activity-travel patterns
of other household members
• MDCEV approach facilitates modeling activity participation at a
household level with joint activity participation incorporated in
a simple fashion
– MDCEV – Multiple Discrete Continuous Extreme Value
econometric choice modeling method
• Includes a model of household vehicle ownership by type and
make/model, and primary driver assignment
INNOVATION:
COMPREHENSIVE REPRESENTATION OF
INTRA-HOUSEHOLD INTERACTIONS
Joint Activities and Household Interactions
MDCEV Model
• Most activity-based models accommodate activity type choice
as a series of models for each individual in the household
• These approaches do not explicitly recognize that activity
participation is a collective decision of household members
• MDCEV approach – simple and relatively inexpensive for
modeling activity participation at a household level
• SimAGENT now features MDCEV modeling methodology to
capture household-level activity participation
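For readers unfamiliar with MDCEV, the sketch below evaluates the additive, non-linear utility form commonly associated with Bhat's MDCEV formulation, U(x) = Σ_k (γ_k/α_k) ψ_k [(x_k/γ_k + 1)^α_k − 1]. It is an assumption that SimAGENT uses exactly this profile; the parameter values and the three activities are invented for illustration.

```python
def mdcev_utility(x, psi, gamma, alpha):
    """Total utility of allocation x across alternatives.

    x:     amounts consumed (e.g., minutes) per alternative, zeros allowed
    psi:   baseline marginal utilities (> 0)
    gamma: translation parameters (> 0), which allow corner solutions (x_k = 0)
    alpha: satiation parameters in (0, 1]
    """
    total = 0.0
    for xk, pk, gk, ak in zip(x, psi, gamma, alpha):
        total += (gk / ak) * pk * ((xk / gk + 1.0) ** ak - 1.0)
    return total


# Hypothetical household time allocation (minutes) over three activities;
# the first activity is not chosen at all (discrete part), the others
# receive positive durations (continuous part).
allocation = [0.0, 120.0, 60.0]
utility = mdcev_utility(
    allocation,
    psi=[1.0, 1.5, 0.8],
    gamma=[1.0, 1.0, 1.0],
    alpha=[0.5, 0.5, 0.5],
)
print(round(utility, 3))
```

An unchosen alternative contributes zero utility, and with α below 1 each additional minute is worth less than the previous one, which is how the form captures both participation and satiation in one model.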
Joint Activities and Interactions
MDCEV Model
• Conventional discrete choice frameworks need to generate
mutually exclusive alternatives, which results in an explosion in
the number of alternatives
• MDCEV allows us to tackle the problem by considering
activity participation as a household decision
• MDCEV offers substantial computational and behavioral
advantages
– Employ one model to generate activities
– Accommodate substitution/complementarity in activity participation
and household member dimensions
MDCEV Model
[Figure: grid enumerating the alternatives of the traditional discrete choice case for two persons (P1, P2) and two activities (A1, A2). Columns: P1, P2, and P1 P2; each column takes one of None, A1, A2, or A1 A2.]
Each box represents an alternative
MDCEV Model
Alternatives: A1 P1, A1 P2, A1 P1P2, A2 P1, A2 P2, A2 P1P2, plus None
Each box represents an alternative
Total of 7 alternatives versus 64 in the traditional case
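The 7-versus-64 comparison can be reproduced with a short enumeration. The sketch assumes (as the column labels suggest) that the traditional case lets each of three participation dimensions, P1 alone, P2 alone, and P1P2 jointly, take any subset of the two activities, whereas MDCEV needs only one alternative per activity-dimension pair plus a "None" alternative.

```python
from itertools import combinations, product

activities = ["A1", "A2"]
dimensions = ["P1", "P2", "P1P2"]  # P1 alone, P2 alone, joint


def all_subsets(items):
    """Every subset of items, including the empty set ("None")."""
    return [
        frozenset(combo)
        for size in range(len(items) + 1)
        for combo in combinations(items, size)
    ]


# Traditional case: one mutually exclusive alternative per combination of
# activity subsets across the three participation dimensions.
traditional = list(product(all_subsets(activities), repeat=len(dimensions)))

# MDCEV case: one alternative per (activity, dimension) pair, plus "None".
mdcev = [(a, d) for a in activities for d in dimensions] + ["None"]

print(len(traditional), len(mdcev))
```

The traditional count grows as 4 to the power of the number of dimensions (and faster with more activities), while the MDCEV count grows only linearly in activities times dimensions.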
INNOVATION:
HOUSEHOLD VEHICLE COMPOSITION
AND DRIVER ASSIGNMENT
Vehicle Type Choice Simulation Component
• Vehicle type choice determines vehicle fleet mix; critical to
energy and emissions analysis
• SimAGENT incorporates joint vehicle type choice and primary
driver allocation model which jointly determines:
– Multiple vehicle holdings
– Body type (Sub-compact, Compact car, Mid-sized car, Large car,
Small SUV, Mid-sized SUV, Large SUV, Van, and Pickup)
– Age (Less than 2 years old, 2 to 3 years old, 4 to 5 years old, 6 to
9 years old, 10 to 12 years old, Older than 12 years)
– Make/model and use (miles)
– Primary driver of each vehicle
Vehicle Holdings and Use
[Figure: nested structure of vehicle holdings. Upper level: Vehicle Type/Vintage; lower level: makes/models within each type/vintage category (between 5 and 33 makes/models per category).]
Vehicle Type/Vintage categories:
• Coupe (New, Old)
• Sedan Mini/Subcompact (New, Old)
• Sedan Compact (New, Old)
• Sedan Mid-size (New, Old)
• Sedan Large (New, Old)
• Hatchback/Station Wagon (New, Old)
• SUV (New, Old)
• Pickup Truck (New, Old)
• Minivan (New, Old)
• Van (New, Old)
• Non-motorized vehicles
COMPUTATIONAL TECHNIQUES AND
INTEGRATION POTENTIAL
Portable & Flexible Software Architecture
[Figure: CEMDAP software architecture. Components shown:]
• Input Database (accessed via ODBC)
• Data Queries
• Data Coordinator
• Run-Time Data Objects: Household, Person, Zone Data, LOS Data,
Zone to Zone, Pattern, Tour, Stop
• Modeling Modules: Decision to Work Model, Work Start/End Time
model, …
• Simulation Coordinator
• Application Driver
• Output Files
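The run-time data objects named in the architecture could be represented along the following lines. This is only a sketch: the nesting (household, person, pattern, tour, stop) and all field names are assumptions made for illustration and are not taken from the CEMDAP source code.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Stop:
    activity_type: str   # hypothetical field
    zone_id: int         # hypothetical field
    duration_min: float  # hypothetical field


@dataclass
class Tour:
    stops: List[Stop] = field(default_factory=list)


@dataclass
class Pattern:
    tours: List[Tour] = field(default_factory=list)

    def total_activity_minutes(self) -> float:
        """Sum of stop durations over all tours in the daily pattern."""
        return sum(s.duration_min for t in self.tours for s in t.stops)


@dataclass
class Person:
    person_id: int
    pattern: Pattern = field(default_factory=Pattern)


@dataclass
class Household:
    household_id: int
    persons: List[Person] = field(default_factory=list)


# Build one household with one person making a single two-stop tour.
commuter = Person(1, Pattern([Tour([Stop("work", 101, 480.0), Stop("shop", 205, 30.0)])]))
household = Household(10, [commuter])
print(household.persons[0].pattern.total_activity_minutes())
```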
Ability to Integrate and Enhance
• Successfully interfaced with
– Multi-period static assignment (the current four-step
approach of SCAG)
– TRANSIMS and MATSim (second-by-second assignment of
people and vehicles on networks)
• Continuous-time evolutionary framework facilitates real-time
dynamic integration of ABM and DTA models
• SimAGENT is successfully implemented in the LA region
• Existing SimAGENT code (CEMDAP, PopGen, CEMSELTS) is
open source
• Being implemented currently in the New York region; selected
based on behavioral realism and ability to accommodate CAVs
• Elements of system being used for long distance travel
modeling by CDOT; UT-Austin working with CDOT
HIGH DIMENSIONAL HETEROGENEOUS
DATA
Why is joint modeling of data important?
• Borrows information on other outcomes
• Able to answer intrinsically multivariate questions, such as the
effect of a covariate on a multidimensional outcome
• Obviates the need for multiple tests and facilitates global tests,
offering superior testing power and better control of Type 1 error
rates
• If some endogenous outcomes are used to explain other
endogenous outcomes, and if the outcomes are not modeled
jointly, the result can be inconsistent estimation of the effects of
one endogenous outcome on another.
• Problem? Mixed data, high-dimensional data
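The point about multiple tests and Type 1 error can be made concrete with a small calculation. The sketch assumes the separate tests are independent, which is a simplification; the function name is illustrative.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Probability of at least one false rejection across n_tests
    independent tests, each run at significance level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


# Testing a covariate separately on each outcome inflates the overall
# Type 1 error as the number of outcomes grows.
for n in (1, 5, 10):
    print(n, round(familywise_error_rate(0.05, n), 3))
```

With ten separate tests at the 5% level, the chance of at least one spurious finding is roughly 40% under this independence assumption, whereas a single global test on the joint model keeps the rate at the nominal level.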
A Way-Out
• The new Spatial Generalized Heterogeneous Data Model (GHDM);
Bhat (2015)
• Correlations across the various dimensions (of the dependent variables)
are captured using latent constructs.
• Accommodates all possible types of data (dependent variables).
• Dimension of integration is independent of number of latent
constructs.
• Bhat’s Maximum Approximate Composite Marginal Likelihood
(MACML) estimation approach is used for estimation of GHDM.
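To give a feel for the composite marginal likelihood idea behind MACML, the sketch below computes a pairwise composite log-likelihood for standardized continuous outcomes. It is not MACML itself: the published approach also relies on an analytic approximation of multivariate normal cumulative probabilities to handle mixed outcomes, which is omitted here. The function names and the toy data are assumptions for illustration.

```python
import math
from itertools import combinations


def bivariate_normal_logpdf(y1, y2, rho):
    """Log density of a standard bivariate normal with correlation rho."""
    one_minus_r2 = 1.0 - rho * rho
    q = (y1 * y1 - 2.0 * rho * y1 * y2 + y2 * y2) / one_minus_r2
    return -math.log(2.0 * math.pi) - 0.5 * math.log(one_minus_r2) - 0.5 * q


def pairwise_cml(data, corr):
    """Pairwise composite log-likelihood: the sum of bivariate marginal
    log densities over every pair of outcomes and every observation."""
    n_outcomes = len(corr)
    total = 0.0
    for i, j in combinations(range(n_outcomes), 2):
        for row in data:
            total += bivariate_normal_logpdf(row[i], row[j], corr[i][j])
    return total


# Toy data: three standardized outcomes observed for two units.
data = [[0.2, -0.1, 0.4], [1.0, 0.8, -0.3]]
corr = [[1.0, 0.5, 0.2],
        [0.5, 1.0, 0.1],
        [0.2, 0.1, 1.0]]
print(pairwise_cml(data, corr))
```

Each term involves only a two-dimensional density, so the cost of evaluation does not grow with the dimension of the full joint distribution, only with the number of pairs.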
Conceptual diagram of structural relationships in the empirical model

More Related Content

Similar to Predictive Analytics for Transportation in a High Dimensional Heterogeneous Data World

Chapter_6_Prescriptive_Analytics_Optimization_and_Simulation.pptx.pdf
Chapter_6_Prescriptive_Analytics_Optimization_and_Simulation.pptx.pdfChapter_6_Prescriptive_Analytics_Optimization_and_Simulation.pptx.pdf
Chapter_6_Prescriptive_Analytics_Optimization_and_Simulation.pptx.pdfAndresBelloAvila
 
Guest lecture at TU Delft: Travel demand models: from trip- to activity-based
Guest lecture at TU Delft: Travel demand models: from trip- to activity-basedGuest lecture at TU Delft: Travel demand models: from trip- to activity-based
Guest lecture at TU Delft: Travel demand models: from trip- to activity-basedLuuk Brederode
 
Scottish Urban Air Qualtiy Steering Group - Modelling & Monitoring Workshop -...
Scottish Urban Air Qualtiy Steering Group - Modelling & Monitoring Workshop -...Scottish Urban Air Qualtiy Steering Group - Modelling & Monitoring Workshop -...
Scottish Urban Air Qualtiy Steering Group - Modelling & Monitoring Workshop -...STEP_scotland
 
Assessment Model for Opportunistic Routing
Assessment Model for Opportunistic RoutingAssessment Model for Opportunistic Routing
Assessment Model for Opportunistic RoutingWaldir Moreira
 
Transport Modelling for managers 2014 willumsen
Transport Modelling for managers 2014 willumsenTransport Modelling for managers 2014 willumsen
Transport Modelling for managers 2014 willumsenLuis Willumsen
 
Local modeling in regression and time series prediction
Local modeling in regression and time series predictionLocal modeling in regression and time series prediction
Local modeling in regression and time series predictionGianluca Bontempi
 
e3-chap-09.ppt
e3-chap-09.ppte3-chap-09.ppt
e3-chap-09.pptKingSh2
 
Matrix Adjustments – How to build better matrices
Matrix Adjustments – How to build better matricesMatrix Adjustments – How to build better matrices
Matrix Adjustments – How to build better matricesJumpingJaq
 
Using PySpark to Scale Markov Decision Problems for Policy Exploration
Using PySpark to Scale Markov Decision Problems for Policy ExplorationUsing PySpark to Scale Markov Decision Problems for Policy Exploration
Using PySpark to Scale Markov Decision Problems for Policy ExplorationDatabricks
 
Autonomy Incubator Seminar Series: Tractable Robust Planning and Model Learni...
Autonomy Incubator Seminar Series: Tractable Robust Planning and Model Learni...Autonomy Incubator Seminar Series: Tractable Robust Planning and Model Learni...
Autonomy Incubator Seminar Series: Tractable Robust Planning and Model Learni...AutonomyIncubator
 
Strategic transport models and smart urban mobility
Strategic transport models and smart urban mobilityStrategic transport models and smart urban mobility
Strategic transport models and smart urban mobilityLuuk Brederode
 

Similar to Predictive Analytics for Transportation in a High Dimensional Heterogeneous Data World (20)

Chapter_6_Prescriptive_Analytics_Optimization_and_Simulation.pptx.pdf
Chapter_6_Prescriptive_Analytics_Optimization_and_Simulation.pptx.pdfChapter_6_Prescriptive_Analytics_Optimization_and_Simulation.pptx.pdf
Chapter_6_Prescriptive_Analytics_Optimization_and_Simulation.pptx.pdf
 
James Parrott
James ParrottJames Parrott
James Parrott
 
Guest lecture at TU Delft: Travel demand models: from trip- to activity-based
Guest lecture at TU Delft: Travel demand models: from trip- to activity-basedGuest lecture at TU Delft: Travel demand models: from trip- to activity-based
Guest lecture at TU Delft: Travel demand models: from trip- to activity-based
 
Scottish Urban Air Qualtiy Steering Group - Modelling & Monitoring Workshop -...
Scottish Urban Air Qualtiy Steering Group - Modelling & Monitoring Workshop -...Scottish Urban Air Qualtiy Steering Group - Modelling & Monitoring Workshop -...
Scottish Urban Air Qualtiy Steering Group - Modelling & Monitoring Workshop -...
 
Assessment Model for Opportunistic Routing
Assessment Model for Opportunistic RoutingAssessment Model for Opportunistic Routing
Assessment Model for Opportunistic Routing
 
Transport Modelling for managers 2014 willumsen
Transport Modelling for managers 2014 willumsenTransport Modelling for managers 2014 willumsen
Transport Modelling for managers 2014 willumsen
 
Local modeling in regression and time series prediction
Local modeling in regression and time series predictionLocal modeling in regression and time series prediction
Local modeling in regression and time series prediction
 
E3 chap-09
E3 chap-09E3 chap-09
E3 chap-09
 
Evaluation techniques
Evaluation techniquesEvaluation techniques
Evaluation techniques
 
e3-chap-09.ppt
e3-chap-09.ppte3-chap-09.ppt
e3-chap-09.ppt
 
E3 chap-09
E3 chap-09E3 chap-09
E3 chap-09
 
M 3 iot
M 3 iotM 3 iot
M 3 iot
 
Human Computer Interaction Evaluation
Human Computer Interaction EvaluationHuman Computer Interaction Evaluation
Human Computer Interaction Evaluation
 
Matrix Adjustments – How to build better matrices
Matrix Adjustments – How to build better matricesMatrix Adjustments – How to build better matrices
Matrix Adjustments – How to build better matrices
 
Mini datathon
Mini datathonMini datathon
Mini datathon
 
Using PySpark to Scale Markov Decision Problems for Policy Exploration
Using PySpark to Scale Markov Decision Problems for Policy ExplorationUsing PySpark to Scale Markov Decision Problems for Policy Exploration
Using PySpark to Scale Markov Decision Problems for Policy Exploration
 
Autonomy Incubator Seminar Series: Tractable Robust Planning and Model Learni...
Autonomy Incubator Seminar Series: Tractable Robust Planning and Model Learni...Autonomy Incubator Seminar Series: Tractable Robust Planning and Model Learni...
Autonomy Incubator Seminar Series: Tractable Robust Planning and Model Learni...
 
MUMS Opening Workshop - The Isaac Newton Institute Uncertainty Quantification...
MUMS Opening Workshop - The Isaac Newton Institute Uncertainty Quantification...MUMS Opening Workshop - The Isaac Newton Institute Uncertainty Quantification...
MUMS Opening Workshop - The Isaac Newton Institute Uncertainty Quantification...
 
Strategic transport models and smart urban mobility
Strategic transport models and smart urban mobilityStrategic transport models and smart urban mobility
Strategic transport models and smart urban mobility
 
A New Generalized Mixed Data Model with Applications to Transport Analysis
A New Generalized Mixed Data Model with Applications to Transport AnalysisA New Generalized Mixed Data Model with Applications to Transport Analysis
A New Generalized Mixed Data Model with Applications to Transport Analysis
 

More from Center for Transportation Research - UT Austin

More from Center for Transportation Research - UT Austin (20)

Flying with SAVES
Flying with SAVESFlying with SAVES
Flying with SAVES
 
Regret of Queueing Bandits
Regret of Queueing BanditsRegret of Queueing Bandits
Regret of Queueing Bandits
 
Advances in Millimeter Wave for V2X
Advances in Millimeter Wave for V2XAdvances in Millimeter Wave for V2X
Advances in Millimeter Wave for V2X
 
Collaborative Sensing and Heterogeneous Networking Leveraging Vehicular Fleets
Collaborative Sensing and Heterogeneous Networking Leveraging Vehicular FleetsCollaborative Sensing and Heterogeneous Networking Leveraging Vehicular Fleets
Collaborative Sensing and Heterogeneous Networking Leveraging Vehicular Fleets
 
Collaborative Sensing for Automated Vehicles
Collaborative Sensing for Automated VehiclesCollaborative Sensing for Automated Vehicles
Collaborative Sensing for Automated Vehicles
 
Statistical Inference Using Stochastic Gradient Descent
Statistical Inference Using Stochastic Gradient DescentStatistical Inference Using Stochastic Gradient Descent
Statistical Inference Using Stochastic Gradient Descent
 
CAV/Mixed Transportation Modeling
CAV/Mixed Transportation ModelingCAV/Mixed Transportation Modeling
CAV/Mixed Transportation Modeling
 
Real-time Signal Control and Traffic Stability / Improved Models for Managed ...
Real-time Signal Control and Traffic Stability / Improved Models for Managed ...Real-time Signal Control and Traffic Stability / Improved Models for Managed ...
Real-time Signal Control and Traffic Stability / Improved Models for Managed ...
 
Sharing Novel Data Sources to Promote Innovation Through Collaboration: Case ...
Sharing Novel Data Sources to Promote Innovation Through Collaboration: Case ...Sharing Novel Data Sources to Promote Innovation Through Collaboration: Case ...
Sharing Novel Data Sources to Promote Innovation Through Collaboration: Case ...
 
UT SAVES: Situation Aware Vehicular Engineering Systems
UT SAVES: Situation Aware Vehicular Engineering SystemsUT SAVES: Situation Aware Vehicular Engineering Systems
UT SAVES: Situation Aware Vehicular Engineering Systems
 
Regret of Queueing Bandits
Regret of Queueing BanditsRegret of Queueing Bandits
Regret of Queueing Bandits
 
Sharing Novel Data Sources to Promote Innovation through Collaboration: Case ...
Sharing Novel Data Sources to Promote Innovation through Collaboration: Case ...Sharing Novel Data Sources to Promote Innovation through Collaboration: Case ...
Sharing Novel Data Sources to Promote Innovation through Collaboration: Case ...
 
CAV/Mixed Transportation Modeling
CAV/Mixed Transportation ModelingCAV/Mixed Transportation Modeling
CAV/Mixed Transportation Modeling
 
Collaborative Sensing for Automated Vehicles
Collaborative Sensing for Automated VehiclesCollaborative Sensing for Automated Vehicles
Collaborative Sensing for Automated Vehicles
 
Advances in Millimeter Wave for V2X
Advances in Millimeter Wave for V2XAdvances in Millimeter Wave for V2X
Advances in Millimeter Wave for V2X
 
Statistical Inference Using Stochastic Gradient Descent
Statistical Inference Using Stochastic Gradient DescentStatistical Inference Using Stochastic Gradient Descent
Statistical Inference Using Stochastic Gradient Descent
 
Status of two projects: Real-time Signal Control and Traffic Stability; Impro...
Status of two projects: Real-time Signal Control and Traffic Stability; Impro...Status of two projects: Real-time Signal Control and Traffic Stability; Impro...
Status of two projects: Real-time Signal Control and Traffic Stability; Impro...
 
SAVES general overview
SAVES general overviewSAVES general overview
SAVES general overview
 
D-STOP Overview April 2018
D-STOP Overview April 2018D-STOP Overview April 2018
D-STOP Overview April 2018
 
Managing Mobility during Design-Build Highway Construction: Successes and Les...
Managing Mobility during Design-Build Highway Construction: Successes and Les...Managing Mobility during Design-Build Highway Construction: Successes and Les...
Managing Mobility during Design-Build Highway Construction: Successes and Les...
 

Recently uploaded

My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentationphoebematthew05
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 

Recently uploaded (20)

E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan

Predictive Analytics for Transportation in a High Dimensional Heterogeneous Data World

  • 1. Predictive Analytics for Transportation in a High Dimensional Heterogeneous Data World Dr. Chandra Bhat Center for Transportation Research The University of Texas at Austin Acknowledgments: D-STOP, Humboldt Award, Dr. Ram Pendyala, Dr. Kostas Goulias, all my graduate/undergraduate students
  • 2. World of high dimensional heterogeneous data • Providing accurate traffic information is becoming an imperative • Cameras, GPS, cell phone tracking, and probe vehicles are used to supplement the information provided by conventional measurement systems. • Methodologies to combine and aggregate high dimensional heterogeneous data are needed
  • 3. Connected/Automated Vehicles (CAVs) and big data • The car of the near future will be a part of a gigantic data-collection engine. • Vehicles have embedded – computers, – GPS receivers, – short-range wireless network interfaces, – in-car sensors, – cameras, and – internet. • Vehicles interact with – roadside wireless sensor networks, – passengers’ wireless devices, and – other cars.
  • 4. Data required to keep a CAV safely on the road • Highly detailed map information: – shape and elevation of roadways, – lane lines, – intersections, – crosswalks, – speed limits, and – traffic signals. • Position, speed and intentions of other vehicles and pedestrians. • Position, speed and intentions of other road users, such as… – jaywalking pedestrians, – cars coming out of hidden driveways, – a stop sign held up by a crossing guard, and – cyclists signaling their riding intent.
  • 5. What can be inferred from CAV and smartphone data • Where people drive, • when people drive, • what route people take, • where people stop, • what people put in their car, • why, how and when people take decisions on the fly and change their activity plan route, and • detailed crash data (speed, position, and intention at the moment of the accident).
  • 6. Data Science • Not enough humans to process • Machine learning, visualization, and advanced computation techniques • Statistics, social sciences, and domain knowledge • High-dimensional heterogeneous data
  • 7. DOMAIN KNOWLEDGE IN THE CONTEXT OF TRAVEL DEMAND MODELING
  • 8. [Diagram labels: Exogenous Variable Vector X; Model; Conceptual/Theoretical/Methods/Tools and Techniques; Specification and Definition of Alternatives; Activity-based Paradigm; Trip-based Paradigm; Outputs]
  • 9. Five Pillars of ABM Design • Based on sound behavioral theory/paradigm • Computationally feasible and tractable – Model estimation – Model implementation • Optimal use of available data (present and future) • ABM should be both an Activity-Based Model and an Agent-Based Model • Sensitive to policy issues and planning applications of interest
  • 10. Behavioral Basis of ABM • Decision hierarchies and choice processes – A variety of behavioral decision structures possible – Virtually all models assume a sequential decision structure similar to traditional four-step models for computational convenience • Considerable evidence of simultaneity in behavioral choice mechanisms – Several choices made simultaneously as a lifestyle package
  • 11. Behavioral Basis of ABM • Examples of simultaneous choice packages – Residential location, vehicle ownership, mode to pre-planned activities (e.g., work) – Activity type, activity duration, and activity timing (scheduling) • Behavioral heterogeneity – Differences in choice processes across market segments – Identify market segments both exogenously and endogenously (latent market segments)
  • 12. Time-Space Interactions [Figure: time-space diagram. Axes: Time, Urban Space. Labels: Home, Work, Activity 1 (Fixed), Activity 2 (Fixed), Activity at Location A]
  • 13. Agent Interactions “I have a client meeting today; so I will take the car.” “I have to pick up Jane from school and go shopping later; I need the car.” “My meeting is in the morning. I can pick up Jane from school today. And we can go shopping together in the evening.” “OK, that sounds good. I’ll go ahead and take light rail today to work. See you later.” “Hey, Mom and Dad, don’t forget; you have to drop me off at Johnny’s house in the evening today.” “Don’t worry Jane; we’ll drop you off on the way to the store and pick you up later. Run along now, you’ll miss the bus.”
  • 14. Definition of an Activity • Disaggregate activity purpose definition – Challenge traditional notion of mandatory and discretionary activities/trips – Movie, ball game, and child’s tennis lesson or soccer game often have spatial and/or temporal fixity – Characterize activities and trips by level of spatial and temporal fixity/constraints (besides purpose) – Can be accomplished using concepts of time-space geography – Automated method to add attributes describing degrees of freedom according to set of spatial/temporal fixity criteria to activity records in data set
  • 15. Central Role of Time Use • Notion of time is central to activity-based modeling – Explicit modeling of activity durations (daily activity time allocation and individual episode duration) – Treat time as “continuous” and not as “discrete choice” blocks • Activity engagement is the focus of attention – Travel patterns are inferred as an outcome of activity participation and time use decisions – Continuous treatment of time dimension allows explicit consideration of time constraints on human activities • Reconcile activity durations with network travel durations (feedback processes)
  • 16. In Summary • ABM should… – Capture the central role of activities, time, and space in a continuum – Explicitly recognize constraints and interactions – Represent simultaneity in behavioral choice processes – Account for heterogeneity in behavioral decision hierarchies – Incorporate feedback processes to facilitate integration with land use and network models • SimAGENT does it all and more…
  • 17. SIMAGENT (SIMULATOR OF ACTIVITIES, GREENHOUSE GAS EMISSIONS, ENERGY, NETWORKS, AND TRAVEL): AN OVERVIEW
  • 18. SimAGENT [Flow diagram, from Base Year Inputs to Forecast Year Outputs and GHG Emissions Prediction. Components: Synthetic population generator (PopGen); Socio-economics, land-use and transportation system characteristics simulator (CEMSELTS); Activity-travel simulator (CEMDAP); Dynamic Traffic Assignment (DTA). Data elements: Aggregate socio-demographics (base year); Activity-travel environment characteristics (base year); Detailed individual-level socio-demographics (base year); Socio-demographics and activity-travel environment; Individual activity-travel patterns; Link volumes and speeds]
  • 19. CEMDAP A COMPREHENSIVE ECONOMETRIC MICROSIMULATOR OF DAILY ACTIVITY-TRAVEL PATTERNS
  • 20. CEMDAP – The Core ABM in SimAGENT [Diagram labels: Socio-Economic Data, PopGen, CEMSELTS, CEMDAP] • Simulates the activity schedule and travel characteristics of each individual in the region • Core module of SimAGENT • 52 sub-models • Developed by UT Austin
  • 21. Features of CEMDAP (continued) • Changes in the activity-travel pattern of one individual in a household may bring about changes in activity-travel patterns of other household members • MDCEV approach facilitates modeling activity participation at a household level with joint activity participation incorporated in a simple fashion – MDCEV – Multiple Discrete Continuous Extreme Value econometric choice modeling method • Includes a model of household vehicle ownership by type and make/model, and primary driver assignment
  • 23. Joint Activities and Household Interactions MDCEV Model • Most activity-based models accommodate activity type choice as a series of models for each individual in the household • These approaches do not explicitly recognize that activity participation is a collective decision of household members • MDCEV approach – simple and relatively inexpensive for modeling activity participation at a household level • SimAGENT now features the MDCEV modeling methodology to capture household-level activity participation
  • 24. Joint Activities and Interactions MDCEV Model • Conventional discrete choice frameworks need to generate mutually exclusive alternatives; this results in an explosion in the number of alternatives • MDCEV allows us to tackle the problem by considering activity participation as a household decision • MDCEV offers substantial computational and behavioral advantages – Employ one model to generate activities – Accommodate substitution/complementarity in activity participation and household member dimensions
  • 25. MDCEV Model [Figure: grid of boxes enumerating the alternatives of the traditional discrete choice case for two persons (P1, P2) and two activities (A1, A2). Each box has the columns P1, P2, and P1 P2, and each column contains None, A1, A2, or A1 A2.] Each box represents an alternative
  • 26. MDCEV Model [Figure: boxes labeled A1 P1, A1 P2, A1 P1P2, A2 P1, A2 P2, A2 P1P2] Each box represents an alternative. None + Alternatives: 7 alternatives in total versus 64 in the traditional case
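
The counts on slides 25 and 26 can be reproduced by enumeration. The sketch below is one reading of the two figures, not code from CEMDAP: it assumes two household members (P1, P2), two activity types (A1, A2), and that each activity can be pursued by any non-empty group of members (P1 alone, P2 alone, or P1 and P2 jointly). All function names are hypothetical.

```python
from itertools import combinations, product


def participant_groups(persons):
    """All non-empty subsets of household members (solo and joint participation)."""
    return [g for r in range(1, len(persons) + 1) for g in combinations(persons, r)]


def elemental_alternatives(persons, activities):
    """One alternative per (activity, participant group) pair, as on slide 26."""
    return [(a, g) for a in activities for g in participant_groups(persons)]


def bundled_alternatives(persons, activities):
    """Mutually exclusive bundles: every on/off combination of the elemental
    alternatives, which is what a single discrete choice model would need."""
    elemental = elemental_alternatives(persons, activities)
    return [
        tuple(e for e, chosen in zip(elemental, flags) if chosen)
        for flags in product([False, True], repeat=len(elemental))
    ]


persons = ("P1", "P2")
activities = ("A1", "A2")

# Multiple discrete-continuous view: elemental alternatives plus "None".
n_mdcev = len(elemental_alternatives(persons, activities)) + 1
# Traditional view: the empty bundle plays the role of "None".
n_traditional = len(bundled_alternatives(persons, activities))

print(n_mdcev, n_traditional)  # 7 64
```

Under the same assumptions, adding a third household member raises the number of participant groups from 3 to 7, so the bundled count grows from 2^6 to 2^14 while the elemental count grows only from 7 to 15.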
  • 28. Vehicle Type Choice Simulation Component • Vehicle type choice determines vehicle fleet mix; critical to energy and emissions analysis • SimAGENT incorporates a joint vehicle type choice and primary driver allocation model that jointly determines: – Multiple vehicle holdings – Body type (Sub-compact, Compact car, Mid-sized car, Large car, Small SUV, Mid-sized SUV, Large SUV, Van, and Pickup) – Age (Less than 2 years old, 2 to 3 years old, 4 to 5 years old, 6 to 9 years old, 10 to 12 years old, Older than 12 years) – Make/model and use (miles) – Primary driver of each vehicle
  • 29. Vehicle Holdings and Use [Figure: vehicle type/vintage categories, each branching into makes/models. Categories: Coupe (Old, New); Sedan Mini/Subcompact (Old, New); Sedan Compact (Old, New); Sedan Mid-size (Old, New); Sedan Large (Old, New); Hatchback/Station Wagon (Old, New); SUV (Old, New); Pickup Truck (Old, New); Van (Old, New); Minivan (Old, New); Non-motorized vehicles. The 20 motorized categories contain between 5 and 33 makes/models each.]
  • 31. Portable & Flexible Software Architecture [Diagram labels: Input Database; ODBC; Data Queries; Data Coordinator; Run-Time Data Objects; Household; Person; Zone Data; LOS Data; Zone to Zone; Pattern; Tour; Stop; Simulation Coordinator; Modeling Modules; Decision to Work Model; Work Start/End Time Model; Application Driver; Output Files]
  • 32. Ability to Integrate and Enhance • Successfully interfaced with – Multi-period static assignment (the current four-step approach of SCAG) – TRANSIMS and MATSim (second-by-second assignment of people and vehicles on networks) • Continuous-time evolutionary framework facilitates real-time dynamic integration of ABM and DTA models
  • 33. • SimAGENT is successfully implemented in the LA region • Existing SimAGENT code (CEMDAP, PopGen, CEMSELTS) is open source • Currently being implemented in the New York region; selected based on behavioral realism and ability to accommodate CAVs • Elements of the system are being used for long-distance travel modeling by CDOT; UT-Austin is working with CDOT
  • 35. Why is joint modeling of data important? • Borrows information on other outcomes • Able to answer intrinsically multivariate questions, such as the effect of a covariate on a multidimensional outcome • Obviates the need for multiple tests and facilitates global tests, offering superior testing power and better control of Type I error rates • If some endogenous outcomes are used to explain other endogenous outcomes, and the outcomes are not modeled jointly, the effects of one endogenous outcome on another can be estimated inconsistently. • Problem? Mixed data, high-dimensional data
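
The point about multiple tests on slide 35 can be made concrete with simple arithmetic. As an illustration that is not part of the slides, assume m separate tests, each run at significance level alpha and statistically independent of the others; the probability of at least one false rejection is then 1 - (1 - alpha)^m. The function name below is hypothetical.

```python
def familywise_error_rate(alpha: float, m: int) -> float:
    """Probability of at least one Type I error across m independent tests,
    each conducted at significance level alpha."""
    return 1.0 - (1.0 - alpha) ** m


# How the chance of a spurious finding grows when each outcome is tested separately.
for m in (1, 3, 5, 10):
    print(m, round(familywise_error_rate(0.05, m), 3))
```

With alpha = 0.05, three separate tests already carry roughly a 14% chance of at least one false rejection, and ten tests carry about 40%. A single global test on a jointly modeled outcome keeps that chance at the nominal level.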
  • 36. A Way Out • The new Spatial Generalized Heterogeneous Data Model (GHDM); Bhat (2015) • Correlations across the various dimensions (of the dependent variables) are captured using latent constructs. • Accommodates all possible types of data (dependent variables). • The dimension of integration is independent of the number of latent constructs. • Bhat’s Maximum Approximate Composite Marginal Likelihood (MACML) approach is used to estimate the GHDM.
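
The idea that latent constructs capture correlation across mixed outcomes can be illustrated with a small simulation. This is a toy sketch, not the GHDM and not MACML estimation: it assumes a single standard-normal latent construct, one continuous outcome, one binary outcome, and arbitrary loadings chosen for illustration. All names are hypothetical.

```python
import math
import random


def simulate(loading_cont, loading_bin, n=20000, seed=0):
    """Draw a continuous and a binary outcome that share one latent construct."""
    rng = random.Random(seed)
    cont, binary = [], []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)  # latent construct common to both outcomes
        cont.append(loading_cont * z + rng.gauss(0.0, 1.0))
        propensity = loading_bin * z + rng.gauss(0.0, 1.0)
        binary.append(1.0 if propensity > 0.0 else 0.0)
    return cont, binary


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


# Both outcomes load on the latent construct: they end up correlated.
r_linked = pearson(*simulate(loading_cont=0.8, loading_bin=1.0))
# Neither outcome loads on it: the correlation is close to zero.
r_unlinked = pearson(*simulate(loading_cont=0.0, loading_bin=0.0))

print(round(r_linked, 3), round(r_unlinked, 3))
```

The two outcomes have independent error terms, so any dependence between them comes from the shared construct alone. That is the sense in which a few latent constructs can carry the correlation among many outcomes of different types.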
  • 37. Conceptual diagram of structural relationships in the empirical model