Predicting cancellation probabilities of airline bookings using a discrete hazard model
Dr. Heiko Schmitz, Dr. Jan Ulbricht, Stephan Würll,
Lufthansa Systems, Berlin, Germany
Motivation
► Large fraction of airline bookings are
cancelled before departure
► Knowing the number of empty seats is
crucial for revenue management
► Overbooking is an important measure
to increase revenue
► Cancellation probability depends on
flight-related and passenger-related
attributes
► Predictions are required in a dynamic
environment (schedule changes, etc.)
► Predictions shall be stable and robust
► Regression models are suitable for
this task
The Model
Goal
Predict the probability that a booking
made at time t will be cancelled at a
later time t'.
Assumptions and Preconditions
► Sufficient number of observations from
the past are available
► Discrete time
(Data collection points, DCP)
► Model is estimated per OD
(Origin/Destination)
► Sufficient variation in observation data
► GLM with logistic link function
$$F(\eta) = \frac{1}{1 + \exp(-\eta)} \in (0, 1), \qquad \eta = \gamma_0 + \sum_j \beta_j x_j$$
► Log-likelihood function
$$\ln L(\beta) = \sum_{i=1}^{n} \sum_{j=1}^{t_i} \left[ y_{ij} \ln \lambda(j \mid x_i) + (1 - y_{ij}) \ln\bigl(1 - \lambda(j \mid x_i)\bigr) \right]$$
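A minimal Python sketch of how this log-likelihood could be evaluated. It assumes one baseline term per period, in line with the poster's later remark that each lifetime value is a covariate. The list `gamma`, the tuple layout of `bookings` and both function names are hypothetical, not the authors' implementation.

```python
import math


def hazard(eta):
    """Logistic response F(eta) = 1 / (1 + exp(-eta))."""
    return 1.0 / (1.0 + math.exp(-eta))


def log_likelihood(gamma, beta, bookings):
    """Sum the Bernoulli log-likelihood over bookings i and periods j.

    gamma    -- assumed per-period baseline terms, gamma[j - 1] for period j
    beta     -- covariate coefficients
    bookings -- list of (x, t, cancelled): covariate vector, last observed
                period, and whether the booking was cancelled in that period
    """
    total = 0.0
    for x, t, cancelled in bookings:
        xb = sum(b * v for b, v in zip(beta, x))
        for j in range(1, t + 1):
            lam = hazard(gamma[j - 1] + xb)
            # y_ij is 1 only in the period in which the cancellation happens
            y = 1 if (cancelled and j == t) else 0
            total += y * math.log(lam) + (1 - y) * math.log(1.0 - lam)
    return total
```

A booking that survives to departure contributes only survival terms, which is how censoring enters the sum.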
► Maximum likelihood estimator
$$\hat{\beta} = \arg\max_{\beta} \ln L(\beta)$$
► Fisher scoring
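$$\beta^{(k+1)} = \left( Z^{\mathsf T} W_k Z \right)^{-1} Z^{\mathsf T} W_k \, \tilde{y}_k$$
with
weight matrix $W_k = W(\beta^{(k)})$,
grand design matrix $Z = Z(x)$,
working response $\tilde{y}_k = \tilde{y}(y, \beta^{(k)})$.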
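The Fisher scoring update can be sketched in plain Python as iteratively reweighted least squares. This sketch assumes the standard logistic weights $w = \lambda(1-\lambda)$ and working response $\tilde{y} = \eta + (y - \lambda)/w$. The helper names `solve` and `fisher_scoring` are illustrative only. The fixed iteration count and the absence of regularization are simplifications.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fisher_scoring(Z, y, iterations=25):
    """Iterate beta <- (Z^T W Z)^{-1} Z^T W y_tilde for a logistic model."""
    q = len(Z[0])
    beta = [0.0] * q
    for _ in range(iterations):
        eta = [sum(b * z for b, z in zip(beta, row)) for row in Z]
        lam = [1.0 / (1.0 + math.exp(-e)) for e in eta]
        w = [l * (1.0 - l) for l in lam]
        # working response of the current iteration
        y_tilde = [e + (yi - l) / wi for e, yi, l, wi in zip(eta, y, lam, w)]
        ztwz = [[sum(wi * row[a] * row[b] for wi, row in zip(w, Z))
                 for b in range(q)] for a in range(q)]
        ztwy = [sum(wi * row[a] * yt for wi, row, yt in zip(w, Z, y_tilde))
                for a in range(q)]
        beta = solve(ztwz, ztwy)
    return beta
```

With an intercept-only design the iteration converges to the logit of the observed cancellation share, which makes a convenient sanity check.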
• $P(T = t) = \lambda(t) \prod_{i=0}^{t-1} \bigl(1 - \lambda(i)\bigr), \quad t = 0, 1, \dots, k, k+1$
Practicalities
Regularization
► Tikhonov regularization
$M \to M - \varepsilon I$
avoids numerical problems in inversion
Coding of categorical covariates (e.g. day of week, DoW)
► Use effect coding
► No reference category necessary
► Easy extrapolation to unobserved values:
a new covariate value (e.g. a flight on a new DoW)
can be predicted as the “average” of all DoWs
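A small Python sketch of effect coding, assuming the usual sum-to-zero convention: the last observed category is coded as -1 in every column, so the category effects average to zero. An unobserved level is then coded as all zeros and receives exactly that average. The function name and the example categories are hypothetical.

```python
def effect_code(value, categories):
    """Return the effect-coded row for `value`.

    categories -- levels seen during estimation; the last one is the
                  level coded as -1 in every column
    """
    width = len(categories) - 1
    if value not in categories:
        # unseen level: all zeros, i.e. the mean effect of all seen levels
        return [0.0] * width
    idx = categories.index(value)
    if idx == width:
        return [-1.0] * width
    return [1.0 if c == idx else 0.0 for c in range(width)]


observed_dows = ["Mon", "Wed", "Fri"]
print(effect_code("Mon", observed_dows))  # coded level
print(effect_code("Fri", observed_dows))  # last level, all -1
print(effect_code("Sun", observed_dows))  # new DoW, all zeros
```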
Convergence
► Criterion
$$\frac{\left| \ln L(\beta^{(k+1)}) - \ln L(\beta^{(k)}) \right|}{\left| \ln L(\beta^{(k+1)}) \right| + 0.1} \le \varepsilon$$
► Usually very fast (10 to 20 iterations)
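As a sketch, the stopping rule can be written as a one-line Python check. It reads the criterion as a relative change in absolute value, which is an assumption about the original typesetting, and the default tolerance `eps` is a hypothetical value.

```python
def has_converged(loglik_new, loglik_old, eps=1e-6):
    """Relative change of the log-likelihood; 0.1 guards against a zero denominator."""
    return abs(loglik_new - loglik_old) / (abs(loglik_new) + 0.1) <= eps
```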
Hazard Rates and Lifetime
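► Discrete hazard rates (of cancellation)
$$\lambda(t) := P(T = t \mid T \ge t)$$
► Probability of lifetime (duration) $t$
$$P(T = t) = \lambda(t) \cdot \prod_{i=0}^{t-1} \bigl(1 - \lambda(i)\bigr)$$
► Lifetime if booking has survived until today ($= r$)
$$P(T = t \mid T \ge r) = \lambda(t) \cdot \prod_{i=r}^{t-1} \bigl(1 - \lambda(i)\bigr)$$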
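[Figure: timeline with the points Booking, Today, Cancellation and Departure and the intervals T1 and T2. Equation annotations: $\lambda(t)$ is the cancellation (“hazard”) at $t$; the product over $i = 0, \dots, t-1$ is the “survival”.]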
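The two lifetime formulas differ only in where the survival product starts, so a single Python function can cover both. This is a sketch; the function name and the list-based representation of the hazard rates are assumptions.

```python
def lifetime_probability(hazards, t, r=0):
    """P(T = t | T >= r) = hazards[t] * prod_{i=r}^{t-1} (1 - hazards[i]).

    hazards -- hazards[i] is the discrete hazard rate lambda(i)
    r       -- today; r = 0 gives the unconditional P(T = t)
    """
    if t < r:
        return 0.0
    survive = 1.0
    for i in range(r, t):
        survive *= 1.0 - hazards[i]
    return hazards[t] * survive


rates = [0.1, 0.2, 0.5]
print(lifetime_probability(rates, 2))       # seen from the booking date
print(lifetime_probability(rates, 2, r=1))  # booking already survived period 0
```

Conditioning on survival until today drops the survival factors of the periods already passed, so the conditional probability is never smaller than the unconditional one.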
Covariates
Itinerary-related
► Departure DoW
► Departure Time
► Travel Time
► Number of legs
► Number of flight
segments
Customer-related
► Customer segment
► Point of Sale
► Booking DCP
Outlook
► Incorporate tariff information
► Use smooth components
(Tensor B-Splines)
Updating Coefficients
► How do we learn from new
observations?
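► Estimate new coefficients $\beta_{\text{new}}$
► Mix with old coefficients $\beta_{\text{old}}$
$$\beta(\alpha) = (1 - \alpha)\, \beta_{\text{old}} + \alpha\, \beta_{\text{new}}$$
► Maximize likelihood on validation data
$$\alpha^{*} = \arg\max_{\alpha} \ln L\bigl(\beta(\alpha)\bigr)$$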
[Figure: time axis ending at today, split into estimation and validation periods for the previous update and for the current one; labels: “estimate $\beta_{\text{new}}$” and “maximize $\ln L$”.]
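One way to pick the mixing weight is a grid search over $\alpha$, sketched below in Python. The poster does not say how the maximization is carried out, so the grid, the restriction to $\alpha \in [0, 1]$, the plain logistic likelihood and the function names are all assumptions.

```python
import math


def loglik(beta, data):
    """Bernoulli log-likelihood of a logistic model on (x, y) rows."""
    total = 0.0
    for x, y in data:
        eta = sum(b * v for b, v in zip(beta, x))
        lam = 1.0 / (1.0 + math.exp(-eta))
        total += y * math.log(lam) + (1 - y) * math.log(1.0 - lam)
    return total


def best_mixing_weight(beta_old, beta_new, validation, steps=100):
    """Grid-search alpha in [0, 1] maximising the validation log-likelihood."""
    best_alpha, best_ll = 0.0, -math.inf
    for s in range(steps + 1):
        alpha = s / steps
        # beta(alpha) = (1 - alpha) * beta_old + alpha * beta_new
        beta = [(1 - alpha) * o + alpha * n for o, n in zip(beta_old, beta_new)]
        ll = loglik(beta, validation)
        if ll > best_ll:
            best_alpha, best_ll = alpha, ll
    return best_alpha
```

A value of $\alpha^{*}$ near 0 keeps the old coefficients, while a value near 1 follows the new estimate.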
Results and Summary
Comparison with old time series method
► Robust and reliable prediction of
cancellation rates
► Modest resource consumption
(memory and CPU)
► Works well in a dynamic environment
► Easy-to-interpret parameters
► Revenue effect: some M$
Market   PMAD old   PMAD new   Diff (pp)
1        0.372      0.350      2.2
2        0.372      0.349      2.3
3        0.622      0.549      7.3
4        0.654      0.555      9.8
5        0.422      0.389      3.3
Mean     0.530      0.471      5.9
Optimization of Fisher Scoring
Textbook approach
► Each lifetime value is a covariate
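► One observation (lifetime $t_i$) is duplicated into
  ► $t_i - 1$ observations of surviving booking
  ► 1 observation of cancellation/survival at $t_i$
► Design matrix $Z \in \mathbb{R}^{\sum_i t_i \times (k+p)}$
$$Z = \begin{pmatrix}
I_{t_1} & 0_{t_1 \times (k - t_1)} & \begin{matrix} x_1^{\mathsf T} \\ \vdots \\ x_1^{\mathsf T} \end{matrix} \\
\vdots & \vdots & \vdots \\
I_{t_n} & 0_{t_n \times (k - t_n)} & \begin{matrix} x_n^{\mathsf T} \\ \vdots \\ x_n^{\mathsf T} \end{matrix}
\end{pmatrix} = \begin{pmatrix} Z_1 & Z_2 \end{pmatrix}$$
► Weight matrix
$$W = \operatorname{diag}(w_1, \dots, w_n) \in \mathbb{R}^{\sum_i t_i \times \sum_i t_i}$$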
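The duplication step of the textbook approach can be sketched in Python as follows. Each booking becomes $t_i$ rows consisting of a one-hot period indicator followed by the covariates. The function name and argument layout are hypothetical.

```python
def expand_booking(x, t, cancelled, k):
    """Expand one booking into t rows of the design matrix Z.

    x         -- covariate vector of the booking (p values)
    t         -- observed lifetime in periods, 1 <= t <= k
    cancelled -- True if the booking was cancelled in period t
    k         -- number of lifetime columns
    """
    rows, responses = [], []
    for j in range(1, t + 1):
        period = [1.0 if c == j - 1 else 0.0 for c in range(k)]
        rows.append(period + list(x))
        # only the final row can carry a cancellation
        responses.append(1 if (cancelled and j == t) else 0)
    return rows, responses


rows, y = expand_booking([0.5, 2.0], t=3, cancelled=True, k=4)
for row, target in zip(rows, y):
    print(row, target)
```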
Matrix operations: Typical numbers
► $k = 100$ lifetimes, $p = 20$ covariates,
mean lifetime $\bar{t} = 50$, $n \sim 50{,}000$ observations
► $Z \in \mathbb{R}^{2{,}500{,}000 \times 120}$, $W \in \mathbb{R}^{2{,}500{,}000 \times 2{,}500{,}000}$
► Huge matrices with very regular structure
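The quoted dimensions follow from simple arithmetic, reproduced here in Python. The storage estimate assumes 8-byte floating-point entries, which the poster does not state.

```python
k, p = 100, 20                  # lifetimes and covariates
mean_lifetime, n = 50, 50_000   # mean lifetime and number of bookings

rows = n * mean_lifetime        # rows of Z after duplication
z_entries = rows * (k + p)      # entries of a dense Z
w_entries_dense = rows * rows   # entries of W if stored densely
block_entries = k + k * p + p * p  # diagonal of D, plus B, plus A

bytes_per_float = 8             # assumption: double precision
print("rows of Z:", rows)
print("dense Z in GB:", z_entries * bytes_per_float / 1e9)
print("dense W in TB:", w_entries_dense * bytes_per_float / 1e12)
print("numbers in D, B and A:", block_entries)
```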
Use block structure to reduce dimensions
► Matrix algebra yields
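$$Z^{\mathsf T} W Z = \begin{pmatrix} D & B \\ B^{\mathsf T} & A \end{pmatrix},$$
partitioned with $D \in \mathbb{R}^{k \times k}$ diagonal, $A \in \mathbb{R}^{p \times p}$, $B \in \mathbb{R}^{k \times p}$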
► Sub-matrices are easy to calculate
► Matrices are manageable
► Inversion is easy
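$$\left( Z^{\mathsf T} W Z \right)^{-1} = \begin{pmatrix} D^{-1} & 0 \\ 0 & 0 \end{pmatrix} + \begin{pmatrix} -D^{-1} B \\ I \end{pmatrix} \left( A - B^{\mathsf T} D^{-1} B \right)^{-1} \begin{pmatrix} -B^{\mathsf T} D^{-1} & I \end{pmatrix}$$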
► Similar reduction of dimensions is possible for $Z^{\mathsf T} W \tilde{y}_k$
► No loss of accuracy (algebraic transformation)
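The blocks $D$, $B$ and $A$ can be accumulated directly from the bookings without ever building $Z$ or $W$, as the following Python sketch shows. It assumes covariates that are constant over a booking's lifetime, as in the design matrix above. The function name and the data layout are hypothetical.

```python
def normal_blocks(bookings, weights, k):
    """Accumulate D (diagonal, length k), B (k x p) and A (p x p).

    bookings -- list of (x, t): covariate vector and lifetime of booking i
    weights  -- weights[i][j] is the Fisher weight of booking i in its
                (j + 1)-th period
    """
    p = len(bookings[0][0])
    D = [0.0] * k
    B = [[0.0] * p for _ in range(k)]
    A = [[0.0] * p for _ in range(p)]
    for (x, t), w_i in zip(bookings, weights):
        w_sum = 0.0
        for j in range(t):
            w = w_i[j]
            D[j] += w                    # period indicator times itself
            for a in range(p):
                B[j][a] += w * x[a]      # period indicator times covariate
            w_sum += w
        for a in range(p):               # x is constant over the periods,
            for b in range(p):           # so A only needs the summed weight
                A[a][b] += w_sum * x[a] * x[b]
    return D, B, A
```

The work per booking is proportional to $t_i \cdot p + p^2$, and the only matrix that needs a general inversion afterwards is the $p \times p$ Schur complement $A - B^{\mathsf T} D^{-1} B$.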
References
■ Dan C. Iliescu,
Customer Based Time-to-Event Models for
Cancellation Behavior: A Revenue Management
Integrated Approach, Proquest, 2011
Contact
heiko.schmitz@LHsystems.com
Regularize!
