Detection of outliers in Poisson regression models via overdispersion
Freedom Gumedze and Tinashe Chatora
Department of Statistical Sciences, University of Cape Town
http://www.stats.uct.ac.za Email: freedom.gumedze@uct.ac.za
Introduction
Both undispersed and overdispersed count data may contain outliers
We propose a variance shift outlier model (VSOM) for the detection and
accommodation of outliers in count data
Our proposed model is a form of a hierarchical generalized linear model (HGLM)
We consider both independent and longitudinal data settings
Hierarchical generalized linear model (HGLM)
An HGLM has the following properties (Lee and Nelder, 1996):
Let $Y_{ij}$ be the $j$th observation for the $i$th subject and $b_i$ be the unobserved random
effect for the $i$th subject, for $i = 1, \ldots, q$ and $j = 1, \ldots, n_i$. Conditional on $b_i$, $Y_{ij}$
follows an exponential family distribution with
$E(Y_{ij} \mid b_i) = \mu_{ij}$ and $\mathrm{var}(Y_{ij} \mid b_i) = \phi V(\mu_{ij})$,
where $V(\cdot)$ is a monotonic function of $\mu_{ij}$ and $\phi$ is the dispersion parameter. The
linear predictor for $\mu_{ij}$ takes the form
$g(E(Y_{ij} \mid b_i)) = g(\mu_{ij}) = \eta_{ij} = X_{ij}\beta + \nu_i$, (1)
where $\nu_i$ is a monotonic function of $b_i$, $X_{ij}$ is the $j$th row of the design matrix $X_i$,
and $X_i$ is an $n_i \times p$ design matrix for the fixed effects for the $i$th subject.
The random component $b_i$ follows a distribution conjugate to an exponential family
of distributions with parameter $\lambda_i$.
Negative binomial GLM and Poisson-gamma HGLM
The negative binomial GLM can be fitted as a Poisson-gamma HGLM with a
saturated random effect
$\log[E(Y_i \mid s_i)] = X_i\beta + \nu_{s_i}$, (2)
where $s_i$ is the random effect for the $i$th observation. Let $\nu_{s_i} = \log(s_i)$, with $s_i$
following a gamma distribution with a mean of one and variance of $\alpha$.
The model has the negative binomial variance
$\mathrm{var}(Y_i) = \mu_i + \alpha\mu_i^2$. (3)
The term $\alpha\mu_i^2$ measures the amount of overdispersion.
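Equation (3) follows from the law of total variance applied to model (2). The short derivation below is a sketch under two assumptions that the poster implies but does not write out: $\mu_i = \exp(X_i\beta)$, so that $E(Y_i \mid s_i) = \mu_i s_i$, and the conditional distribution is Poisson, so that $\mathrm{var}(Y_i \mid s_i) = \mu_i s_i$.

```latex
\begin{align*}
\mathrm{var}(Y_i)
  &= E\{\mathrm{var}(Y_i \mid s_i)\} + \mathrm{var}\{E(Y_i \mid s_i)\} \\
  &= E(\mu_i s_i) + \mathrm{var}(\mu_i s_i) \\
  &= \mu_i\,E(s_i) + \mu_i^2\,\mathrm{var}(s_i) \\
  &= \mu_i + \alpha\mu_i^2 .
\end{align*}
```

The last step uses $E(s_i) = 1$ and $\mathrm{var}(s_i) = \alpha$.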
Variance shift outlier model (VSOM) for Poisson count data
Independent count data: a VSOM for the ith observation
$\log[E(Y_i \mid \delta_i)] = X_i\beta + \nu_{\delta_i}$, (4)
where $\delta_i$ is a random effect for the $i$th count, $\nu_{\delta_i} = \log(\delta_i)$ and $\delta_i$ has a gamma
distribution with a mean of one and variance of $\lambda_i$.
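Under model (4) with a conditional Poisson response, the marginal variance of the ith count is µi + λi µi², by the same argument that gives the negative binomial variance (3), so a large λi inflates the variance of that observation. The following is a minimal simulation sketch of this effect, not the authors' fitting procedure. The function names, the value µ = 5 and the values of λ are illustrative assumptions, and only the Python standard library is used.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw one Poisson variate by Knuth's multiplication method (fine for modest rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_vsom_counts(mu, lam, n, seed=0):
    """Simulate n counts with Y | delta ~ Poisson(mu * delta), delta ~ gamma(mean 1, variance lam)."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n):
        # gammavariate(shape, scale): shape = 1/lam and scale = lam give mean 1 and variance lam
        delta = rng.gammavariate(1.0 / lam, lam) if lam > 0 else 1.0
        counts.append(sample_poisson(mu * delta, rng))
    return counts


def mean_and_variance(values):
    """Sample mean and unbiased sample variance."""
    m = sum(values) / len(values)
    v = sum((x - m) ** 2 for x in values) / (len(values) - 1)
    return m, v


mu = 5.0  # illustrative conditional mean, not taken from the poster
for lam in (0.0, 0.5, 3.0):
    m, v = mean_and_variance(simulate_vsom_counts(mu, lam, n=50_000))
    print(f"lambda={lam}: sample mean={m:.2f}, sample variance={v:.2f}, "
          f"mu + lambda*mu^2={mu + lam * mu ** 2:.2f}")
```

With λ = 0 the simulated counts are plain Poisson. As λ grows, the mean stays near µ while the variance grows, which is why a large estimated variance parameter flags an observation as a potential outlier.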
Longitudinal setting
VSOM for the ijth observation:
$\eta_{ij} = \log[E(Y_{ij} \mid b_i, \delta_{ij})] = X_{ij}\beta + \nu_{b_i} + \nu_{\delta_{ij}}$, (5)
where $b_i$ and $\delta_{ij}$ follow gamma distributions, each with a mean of one, and variances $\gamma$ and $\lambda_{ij}$,
respectively.
VSOM for the ith subject
$\eta_{ij} = \log[E(Y_{ij} \mid b_i, \zeta_i)] = X_{ij}\beta + \nu_{b_i} + \nu_{\zeta_i}$, (6)
where $b_i$ and $\zeta_i$ follow gamma distributions, each with a mean of one, and variances $\gamma$ and $\tau_i$,
respectively.
Large estimates of the variance parameters $\lambda_i$, $\lambda_{ij}$ or $\tau_i$ are indicative of potential
outliers
Likelihood ratio tests (LRTs) are used to test the variance parameters; the LRT
statistics have a $0.68\chi^2_0 + 0.32\chi^2_1$ mixture distribution.
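Percentiles of this mixture can be computed directly, because χ² with zero degrees of freedom is a point mass at zero and the χ² distribution with one degree of freedom has the closed-form CDF erf(√(x/2)). The sketch below is illustrative only: the function names are hypothetical, the weights 0.68 and 0.32 are taken from the text, and the quantile is found by simple bisection.

```python
import math


def chi2_1_cdf(x):
    """CDF of the chi-square distribution with one degree of freedom."""
    return math.erf(math.sqrt(x / 2.0)) if x > 0 else 0.0


def mixture_cdf(x, w0=0.68, w1=0.32):
    """CDF of w0 * chi2_0 + w1 * chi2_1, where chi2_0 is a point mass at zero."""
    return (w0 + w1 * chi2_1_cdf(x)) if x >= 0 else 0.0


def mixture_quantile(r, w0=0.68, w1=0.32, tol=1e-10):
    """Smallest x with mixture_cdf(x) >= r for 0 < r < 1, found by bisection."""
    if r <= w0:
        return 0.0  # the point mass at zero already covers this probability
    lo, hi = 0.0, 1.0
    while mixture_cdf(hi, w0, w1) < r:
        hi *= 2.0  # expand the bracket until it contains the quantile
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mixture_cdf(mid, w0, w1) < r:
            lo = mid
        else:
            hi = mid
    return hi


for r in (0.95, 0.975, 0.99):
    print(f"{100 * r:g}th percentile: {mixture_quantile(r):.3f}")
```

An observed likelihood ratio statistic larger than the chosen percentile would be flagged at the corresponding level. This sketch does not adjust for multiple testing.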
Application: Epilepsy data
Data description: The dataset is taken from Thall and Vail (1990) and contains
59 patients with epilepsy who were randomized to a new drug or a placebo. For each
patient the number of seizures was recorded at baseline and every fortnight
during an 8-week period.
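Initial model: Negative binomial-gamma HGLM (since the data are overdispersed):
$\log[E(Y_{ij} \mid b_i)] = (\beta_0 + b_i) + \beta_1 l_{ij} + \beta_2 t_{ij} + \beta_3 t_{ij} l_{ij} + \beta_4 a_{ij} + \beta_5 v_{ij} + \delta_{ij}$,
where $l_{ij} = \log(\text{baseline seizure count})$, $t_{ij}$ is the treatment indicator, $a_{ij} = \log(\text{age})$,
$v_{ij}$ is the linear trend for the visits, coded as $(-3, -1, 1, 3)/10$, and $b_i$ is the subject random effect.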
VSOM for the ijth observation:
$\log[E(Y_{ij} \mid b_i)] = (\beta_0 + b_i) + \beta_1 l_{ij} + \beta_2 t_{ij} + \beta_3 t_{ij} l_{ij} + \beta_4 a_{ij} + \beta_5 v_{ij} + \delta_{ij}$,
where $\delta_{ij}$ is the random effect for the $ij$th observation.
VSOM for the ith subject:
$\log[E(Y_{ij} \mid b_i)] = (\beta_0 + b_i) + \beta_1 l_{ij} + \beta_2 t_{ij} + \beta_3 t_{ij} l_{ij} + \beta_4 a_{ij} + \beta_5 v_{ij} + \zeta_i$,
where $\zeta_i$ is the random effect for the $i$th subject.
Application: continued
[Figure: panels (a) $\lambda_k$, (b) $\alpha_k$ and (c) $\mathrm{LRT}_k$, each plotted against observation number]
Negative binomial-gamma VSOM statistics plotted against observation number. (c)
Likelihood ratio statistics, $\mathrm{LRT}_k$, with $r$th percentiles from the $0.68\chi^2_0 + 0.32\chi^2_1$ mixture
distribution: $r = 95$ (solid line), $r = 97.5$ (dashed line) and $r = 99$ (dotted line).
$k = 1, \ldots, N = 236$.
Potential outliers: observations 40, 62, 63, 78, 99 and 221.
Application: continued
[Figure: panels (a) $\psi_i$, (b) $\alpha_i$ and (c) $\mathrm{LRT}_i$, each plotted against subject number; subjects 56 and 58 are labelled]
Only subject 58 is a potential outlier.
Application: continued
Parameter estimates of combined VSOMs fitted to the epilepsy data set.
Parameter M0 M1 M2 M3
Estimate (s.e.) Estimate (s.e.) Estimate (s.e.) Estimate (s.e.)
constant -1.326 (1.210) -1.015 (1.199) -1.558 (1.163) -1.273 (1.149)
lbase 0.881 (0.129) 0.834 (0.128) 0.880 (0.124) 0.834 (0.122)
treatment -0.887 (0.392) -0.932 (0.387) -0.799 (0.378) -0.846 (0.373)
treatment × lbase 0.337 (0.198) 0.372 (0.196) 0.308 (0.190) 0.343 (0.187)
log(age) 0.496 (0.360) 0.432 (0.357) 0.574 (0.345) 0.508 (0.342)
visit -0.264 (0.116) -0.312 (0.136) -0.264 (0.158) -0.312 (0.136)
γ 0.235 (0.051) 0.244 (0.051) 0.208 (0.046) 0.216 (0.047)
α 0.051 (0.011) 0.112 (0.018) 0.052 (0.011)
λ40 3.353 (4.091) 3.309 (4.037)
λ62 3.665 (4.435) 3.680 (4.453)
λ63 3.565 (4.314) 3.580 (4.332)
λ78 2.891 (3.614) 2.878 (3.598)
λ99 2.040 (2.693) 2.063 (2.723)
λ221 2.195 (2.766) 2.173 (2.738)
ψ58 4.012 (4.774) 4.097 (4.875)
deviance 1265.425 1217.112 1256.919 1208.506
The combined negative binomial-gamma VSOMs (denoted M1, M2 and M3) accommodate
outliers in the analysis, and each has a lower deviance than the null model
(denoted M0).
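The size of the improvement can be read off the deviance row of the table. The short sketch below only does that arithmetic. The dictionary layout and variable names are illustrative, the deviances are copied from the table, and no formal test is implied.

```python
# Deviances copied from the table of combined VSOM fits
deviance = {"M0": 1265.425, "M1": 1217.112, "M2": 1256.919, "M3": 1208.506}

# Reduction in deviance of each outlier-accommodating model relative to the null model M0
reduction = {name: round(deviance["M0"] - value, 3)
             for name, value in deviance.items() if name != "M0"}

for name, drop in reduction.items():
    print(f"{name}: deviance reduction vs M0 = {drop:.3f}")
```

The reductions are 48.313 for M1, 8.506 for M2 and 56.919 for M3, so M3 gives the largest drop in deviance.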
Conclusions and future work
The VSOM for count data can be used to identify outliers, and down-weight them
in the analysis if desired.
An advantage of the VSOM over case deletion methods is its ability to both identify
and down-weight outlying observations rather than delete them.
Future work: extend the parametric bootstrap procedure of Gumedze et al. (2010) to
obtain a sampling distribution for the likelihood ratio test statistics and to deal with the
problem of multiple testing.
Acknowledgements
Funding for this research was provided by the University of Cape Town and the National
Research Foundation.

How to Build a Module in Odoo 17 Using the Scaffold Method
 
Advantages and Disadvantages of CMS from an SEO Perspective
Advantages and Disadvantages of CMS from an SEO PerspectiveAdvantages and Disadvantages of CMS from an SEO Perspective
Advantages and Disadvantages of CMS from an SEO Perspective
 
A Survey of Techniques for Maximizing LLM Performance.pptx
A Survey of Techniques for Maximizing LLM Performance.pptxA Survey of Techniques for Maximizing LLM Performance.pptx
A Survey of Techniques for Maximizing LLM Performance.pptx
 
Lapbook sobre os Regimes Totalitários.pdf
Lapbook sobre os Regimes Totalitários.pdfLapbook sobre os Regimes Totalitários.pdf
Lapbook sobre os Regimes Totalitários.pdf
 
Group Presentation 2 Economics.Ariana Buscigliopptx
Group Presentation 2 Economics.Ariana BuscigliopptxGroup Presentation 2 Economics.Ariana Buscigliopptx
Group Presentation 2 Economics.Ariana Buscigliopptx
 
The simplified electron and muon model, Oscillating Spacetime: The Foundation...
The simplified electron and muon model, Oscillating Spacetime: The Foundation...The simplified electron and muon model, Oscillating Spacetime: The Foundation...
The simplified electron and muon model, Oscillating Spacetime: The Foundation...
 
Chapter 4 - Islamic Financial Institutions in Malaysia.pptx
Chapter 4 - Islamic Financial Institutions in Malaysia.pptxChapter 4 - Islamic Financial Institutions in Malaysia.pptx
Chapter 4 - Islamic Financial Institutions in Malaysia.pptx
 
Digital Artefact 1 - Tiny Home Environmental Design
Digital Artefact 1 - Tiny Home Environmental DesignDigital Artefact 1 - Tiny Home Environmental Design
Digital Artefact 1 - Tiny Home Environmental Design
 
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
 
How to Add Chatter in the odoo 17 ERP Module
How to Add Chatter in the odoo 17 ERP ModuleHow to Add Chatter in the odoo 17 ERP Module
How to Add Chatter in the odoo 17 ERP Module
 
Digital Artifact 1 - 10VCD Environments Unit
Digital Artifact 1 - 10VCD Environments UnitDigital Artifact 1 - 10VCD Environments Unit
Digital Artifact 1 - 10VCD Environments Unit
 
Exploiting Artificial Intelligence for Empowering Researchers and Faculty, In...
Exploiting Artificial Intelligence for Empowering Researchers and Faculty, In...Exploiting Artificial Intelligence for Empowering Researchers and Faculty, In...
Exploiting Artificial Intelligence for Empowering Researchers and Faculty, In...
 
MASS MEDIA STUDIES-835-CLASS XI Resource Material.pdf
MASS MEDIA STUDIES-835-CLASS XI Resource Material.pdfMASS MEDIA STUDIES-835-CLASS XI Resource Material.pdf
MASS MEDIA STUDIES-835-CLASS XI Resource Material.pdf
 
CACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdfCACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdf
 
Landownership in the Philippines under the Americans-2-pptx.pptx
Landownership in the Philippines under the Americans-2-pptx.pptxLandownership in the Philippines under the Americans-2-pptx.pptx
Landownership in the Philippines under the Americans-2-pptx.pptx
 

4thchannel conference poster_freedom_gumedze

Detection of outliers in Poisson regression models via overdispersion
Freedom Gumedze and Tinashe Chatora
Department of Statistical Sciences, University of Cape Town
http://www.stats.uct.ac.za Email: freedom.gumedze@uct.ac.za

Introduction
Both undispersed and overdispersed count data may contain outliers.
We propose a variance shift outlier model (VSOM) for the detection and accommodation of outliers in count data.
Our proposed model is a form of hierarchical generalized linear model (HGLM).
We consider both independent and longitudinal data settings.

Hierarchical generalized linear model (HGLM)
An HGLM has the following properties (Lee and Nelder, 1996):
Let $Y_{ij}$ be the jth observation for the ith subject and $b_i$ be the unobserved random effect for the ith subject, for $i = 1, \ldots, q$ and $j = 1, \ldots, n_i$. Conditional on $b_i$, $Y_{ij}$ follows an exponential family distribution with
$E(Y_{ij} \mid b_i) = \mu_{ij}$ and $\operatorname{var}(Y_{ij} \mid b_i) = \phi V(\mu_{ij})$,
where $V(\cdot)$ is a monotonic function of $\mu_{ij}$ and $\phi$ is the dispersion parameter. The linear predictor for $\mu_{ij}$ takes the form
$g(E(Y_{ij} \mid b_i)) = g(\mu_{ij}) = \eta_{ij} = X_{ij}\beta + \nu_i$, (1)
where $\nu_i$ is a monotonic function of $b_i$, $X_{ij}$ is the jth row of the design matrix $X_i$, and $X_i$ is an $n_i \times p$ design matrix for the fixed effects for the ith subject.
The random component $b_i$ follows a distribution conjugate to an exponential family of distributions with parameter $\lambda_i$.

Negative binomial GLM and Poisson-gamma HGLM
The negative binomial GLM can be fitted as a Poisson-gamma HGLM with a saturated random effect
$\log[E(Y_i \mid s_i)] = X_i\beta + \nu_{s_i}$, (2)
where $s_i$ is the random effect for the ith observation. Let $\nu_{s_i} = \log(s_i)$, with $s_i$ following a gamma distribution with a mean of one and a variance of $\alpha$.
The model has the negative binomial variance
$\operatorname{var}(Y_i) = \mu_i + \alpha\mu_i^2$. (3)
The term $\alpha\mu_i^2$ measures the amount of overdispersion.
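The variance formula (3) can be checked numerically. The following is a minimal simulation sketch, not the authors' code: it draws a gamma random effect with mean one and variance $\alpha$, then a Poisson count given that effect, and compares the sample variance with $\mu + \alpha\mu^2$. The helper names `rpois` and `simulate_poisson_gamma` and the values $\mu = 4$, $\alpha = 0.5$ are assumptions chosen for illustration only.

```python
import math
import random


def rpois(mu, rng):
    """Draw one Poisson(mu) variate with Knuth's multiplication method.

    Suitable for small mu only; used here to keep the sketch stdlib-only.
    """
    limit = math.exp(-mu)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def simulate_poisson_gamma(mu, alpha, n, seed=1):
    """Simulate n counts from the Poisson-gamma mixture; return (mean, variance)."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        # A gamma variate with shape 1/alpha and scale alpha has mean 1
        # and variance alpha, matching the random effect s_i in model (2).
        s = rng.gammavariate(1.0 / alpha, alpha)
        draws.append(rpois(mu * s, rng))
    mean = sum(draws) / n
    var = sum((y - mean) ** 2 for y in draws) / (n - 1)
    return mean, var


# Illustrative (hypothetical) values, not taken from the poster.
mu, alpha = 4.0, 0.5
mean, var = simulate_poisson_gamma(mu, alpha, n=200_000)
print(f"sample mean     {mean:.3f}  (theory {mu})")
print(f"sample variance {var:.3f}  (theory {mu + alpha * mu ** 2})")
```

Setting `alpha` close to zero makes the gamma effect nearly constant at one, so the simulated variance approaches the Poisson value $\mu$.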
Variance shift outlier model (VSOM) for Poisson count data
Independent count data: a VSOM for the ith observation
$\log[E(Y_i \mid \delta_i)] = X_i\beta + \nu_{\delta_i}$, (4)
where $\delta_i$ is a random effect for the ith count, $\nu_{\delta_i} = \log(\delta_i)$, and $\delta_i$ has a gamma distribution with a mean of one and a variance of $\lambda_i$.

Longitudinal setting
VSOM for the ijth observation:
$\eta_{ij} = \log[E(Y_{ij} \mid b_i, \delta_{ij})] = X_{ij}\beta + \nu_{b_i} + \nu_{\delta_{ij}}$, (5)
where $b_i$ and $\delta_{ij}$ follow gamma distributions, each with a mean of one, and with variances $\gamma$ and $\lambda_{ij}$, respectively.
VSOM for the ith subject:
$\eta_{ij} = \log[E(Y_{ij} \mid b_i, \zeta_i)] = X_{ij}\beta + \nu_{b_i} + \nu_{\zeta_i}$, (6)
where $b_i$ and $\zeta_i$ follow gamma distributions, each with a mean of one, and with variances $\gamma$ and $\tau_i$, respectively.
Large estimates of the variance parameters $\lambda_i$, $\lambda_{ij}$ or $\tau_i$ are indicative of potential outliers.
Likelihood ratio tests (LRTs) are used to test the variance parameters; the LRT statistics have $0.68\chi^2_0 + 0.32\chi^2_1$ mixture distributions.

Application: Epilepsy data
Data description: The dataset is taken from Thall and Vail (1990) and contains 59 patients with epilepsy who were randomized to a new drug or a placebo. For each patient, seizure counts were recorded at baseline and every fortnight during an 8-week period.
Initial model: Negative binomial-gamma HGLM (since the data are overdispersed):
$\log[E(Y_{ij} \mid b_i)] = (\beta_0 + b_i) + \beta_1 l_{ij} + \beta_2 t_{ij} + \beta_3 t_{ij} l_{ij} + \beta_4 a_{ij} + \beta_5 v_{ij} + \delta_{ij}$,
where $l_{ij}$ = log(baseline seizure count), $t_{ij}$ is the treatment indicator, $a_{ij}$ = log(age), $v_{ij}$ is the linear trend for the visits, coded as $(-3, -1, 1, 3)/10$, and $b_i$ is the subject random effect.
VSOM for the ijth observation:
$\log[E(Y_{ij} \mid b_i)] = (\beta_0 + b_i) + \beta_1 l_{ij} + \beta_2 t_{ij} + \beta_3 t_{ij} l_{ij} + \beta_4 a_{ij} + \beta_5 v_{ij} + \delta_{ij}$,
where $\delta_{ij}$ is the random effect for the ijth observation.
VSOM for the ith subject:
$\log[E(Y_{ij} \mid b_i)] = (\beta_0 + b_i) + \beta_1 l_{ij} + \beta_2 t_{ij} + \beta_3 t_{ij} l_{ij} + \beta_4 a_{ij} + \beta_5 v_{ij} + \zeta_i$,
where $\zeta_i$ is the random effect for the ith subject.
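A short worked derivation may help show why a large $\lambda_i$ in the independent-data VSOM (4) points to an outlier. It is a sketch that assumes $Y_i$ is Poisson conditional on $\delta_i$ (so the conditional variance equals the conditional mean) and writes $\mu_i = \exp(X_i\beta)$; it applies the law of total variance in the same way that leads to the negative binomial variance (3).

```latex
\begin{align*}
E(Y_i \mid \delta_i) &= \mu_i \delta_i, \qquad \mu_i = \exp(X_i\beta), \\
\operatorname{var}(Y_i)
  &= E\bigl[\operatorname{var}(Y_i \mid \delta_i)\bigr]
   + \operatorname{var}\bigl[E(Y_i \mid \delta_i)\bigr] \\
  &= \mu_i \, E(\delta_i) + \mu_i^2 \operatorname{var}(\delta_i) \\
  &= \mu_i + \lambda_i \mu_i^2 .
\end{align*}
```

Under these assumptions, $\lambda_i = 0$ recovers the Poisson variance $\mu_i$, while $\lambda_i > 0$ inflates the variance of the ith count only. This is the sense in which the model shifts the variance of a suspected outlier instead of deleting the observation.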
Application: continued
[Figure: three panels plotted against observation number: (a) $\lambda_k$, (b) $\alpha_k$, (c) $LRT_k$.]
Negative binomial-gamma VSOM statistics plotted against observation number. (c) Likelihood ratio statistics, $LRT_k$, with rth percentiles from the $0.68\chi^2_0 + 0.32\chi^2_1$ mixture distribution: r = 95 (solid line), r = 97.5 (dashed line) and r = 99 (dotted line). $k = 1, \ldots, N = 236$.
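The percentile lines in panel (c) come from the $0.68\chi^2_0 + 0.32\chi^2_1$ mixture. The sketch below shows one way such critical values could be computed; it is an illustration under stated assumptions, not the authors' procedure. It treats $\chi^2_0$ as a point mass at zero and uses the identity $P(\chi^2_1 > c) = \operatorname{erfc}(\sqrt{c/2})$. The function names `mixture_sf` and `mixture_quantile` are hypothetical.

```python
import math


def mixture_sf(c, p1=0.32):
    """P(LRT > c) under the (1 - p1) * chi2_0 + p1 * chi2_1 mixture, for c >= 0.

    chi2_0 is a point mass at zero, so only the chi2_1 component can exceed c.
    For one degree of freedom, P(chi2_1 > c) = erfc(sqrt(c / 2)).
    """
    return p1 * math.erfc(math.sqrt(c / 2.0))


def mixture_quantile(r, p1=0.32, tol=1e-10):
    """Return the r-th quantile (0 < r < 1) of the mixture by bisection."""
    if r <= 1.0 - p1:
        # The point mass at zero already covers this percentile.
        return 0.0
    lo, hi = 0.0, 100.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mixture_sf(mid, p1) > 1.0 - r:
            lo = mid  # tail probability still too large: move right
        else:
            hi = mid
    return 0.5 * (lo + hi)


for r in (0.95, 0.975, 0.99):
    print(f"r = {r}: critical value = {mixture_quantile(r):.3f}")
```

Because 68% of the mass sits at zero, these critical values are smaller than the corresponding $\chi^2_1$ quantiles, so a variance parameter is flagged at a lower LRT value than an unadjusted $\chi^2_1$ test would require.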
Potential outliers: observations 40, 62, 63, 78, 99 and 221.

Application: continued
[Figure: three panels plotted against subject number: (a) $\psi_i$, (b) $\alpha_i$, (c) $LRT_i$; subjects 56 and 58 are labelled.]
Only subject 58 is a potential outlier.

Application: continued
Parameter estimates of combined VSOMs fitted to the epilepsy data set. Entries are estimates with standard errors in parentheses.

Parameter            M0               M1               M2               M3
constant            -1.326 (1.210)   -1.015 (1.199)   -1.558 (1.163)   -1.273 (1.149)
lbase                0.881 (0.129)    0.834 (0.128)    0.880 (0.124)    0.834 (0.122)
treatment           -0.887 (0.392)   -0.932 (0.387)   -0.799 (0.378)   -0.846 (0.373)
treatment × lbase    0.337 (0.198)    0.372 (0.196)    0.308 (0.190)    0.343 (0.187)
log(age)             0.496 (0.360)    0.432 (0.357)    0.574 (0.345)    0.508 (0.342)
visit               -0.264 (0.116)   -0.312 (0.136)   -0.264 (0.158)   -0.312 (0.136)
γ                    0.235 (0.051)    0.244 (0.051)    0.208 (0.046)    0.216 (0.047)
λ40                                   3.353 (4.091)                     3.309 (4.037)
λ62                                   3.665 (4.435)                     3.680 (4.453)
λ63                                   3.565 (4.314)                     3.580 (4.332)
λ78                                   2.891 (3.614)                     2.878 (3.598)
λ99                                   2.040 (2.693)                     2.063 (2.723)
λ221                                  2.195 (2.766)                     2.173 (2.738)
ψ58                                                    4.012 (4.774)    4.097 (4.875)
deviance             1265.425         1217.112         1256.919         1208.506

α: 0.051 (0.011), 0.112 (0.018), 0.052 (0.011) (only three of the four column entries appear in the source).

Combined negative binomial-gamma VSOMs (denoted M1, M2 and M3) accommodate outliers in the analysis and perform better than the null model (denoted M0).

Conclusions and future work
The VSOM for count data can be used to identify outliers and to down-weight them in the analysis if desired.
An advantage of the VSOM over case deletion methods is its ability to both identify and down-weight outlying observations rather than deleting them.
Extension of the parametric bootstrap procedure of Gumedze et al. (2010) to obtain a sampling distribution for the likelihood ratio test statistics and to deal with the problem of multiple testing.

Acknowledgements
Funding for this research was provided by the University of Cape Town and the National Research Foundation.