A process used by companies to turn raw data into useful
The WHO has defined ‘signal’ as ‘‘Reported information on a
possible causal relationship between an adverse event and a drug,
the relationship being unknown or incompletely documented.’’
In a perfect signalling system:
a) All signals should be found
b) “False” signals should be ignored
c) Signals should be found early.
Signal detection and analysis
Data mining is refereed to as the computer-
assisted procedures, starting from processing of
dataset by data “cleaning” and culminating into
the application of statistical techniques, often
known as data mining algorithms (DMAs).
DMAs can be classified in
Bayesian theorem first proposed by English clergyman and mathematician
Bayes theorem is a branch of logic applied to decision making and
inferential statistics that deals with probability inference and is used to
predict the future events based upon the knowledge of previous events.
Health technology assessment Spigelhalter et al define Bayesian statistics
‘the explicit quantitative use of external evidence in the design,
analysis, interpretation, and reporting of a health technology
oBayesian statistics uses ‘probability’ to
express subjective ‘degree of belief’ in a
specific outcome of interest.
oBased on prior belief, and constantly updated
on addition of new data, Bayesian statistics
provides a formal statistical framework for
updating probabilities and incorporating prior
(even assumed) knowledge.
Benefits of using Bayesian statistics when data mining
a) External information based on for example the goals of the data
mining project can be incorporated into the statistical model.
b) The use of a conservative prior allows a dampening at low
values, stopping inappropriate fluctuations at low counter values
thus stopping unwanted spurious associations or patterns.
c) Flexibility of choice of prior, so an alternative attitude to external
information can be taken.
d) Formal approach for looking at how results differ with different
priors, including sensitivity testing of results.
e) Repeated testing is fundamental to the approach, rather than an
additional issue to be considered in the analysis.
f) Method can readily handle missing data, can make calculations
no data, and simply have more uncertainty (wider credibility
intervals) around precision estimate.
g) Credibility intervals easier to interpret than conventional
h) A probability distribution with the addition of data leads to a new
probability distribution, thus emphasising the immediacy of any
conclusion, as magnitude and certainty of the change is obvious.
i) Tractable to calculate estimate of parameter even without data
is the prior distribution), some classical estimates have problems
before data is collected.
1) Clinical trial Analysis
2) Causality assessment
Supervised pattern recognition
3) Testing of new drugs
5) Lawsuits and public policy decision making
The term 'artificial neural network' or simply 'neural network’ is used to
describe a wide range of different computational architectures.
These architectures are used for a diverse range of tasks, including
Classification of data sets
And data mining.
Neural networks are good for problems where many complex,
undefined constraints exist with possibly varying relationships between
Two main tracks of neural network based research have been pursued:
1. To further develop the neural network so it more and more effectively
mimics the behaviour of the human brain
2. The further development of original neural network based techniques
in whatever way necessary to optimise their use in specific data
A typical neural network is made up of many
simple interconnected units ("nodes"). Such units
can be either
Input nodes (representing input variables)
Output nodes (representing network output), or
In some cases hidden nodes.
Communication channels ("connections" or
"weights"), which carry numerical data, link the
nodes to each other, so that each node only
operates on local data and on inputs received via
connections from other nodes.
Neural Network models include:
Recurrent back propagation
Adaptive resonance theory (ART) networks
Radial basis function (rbf) networks
Combination of ART and rbf
Probabilistic neural networks and other types.
There are different types of neural network
architecture: as data can flow from input to output
nodes in different ways.
Feed forward models are where data flows directly
from input to output.
In contrast recurrent networks model information
about past inputs which is fed back in alongside
current data via recurrent or feedback connections
to provide an answer.
•This can be done by feeding the outputs directly
back into the input layer, or it could also be done
by feedback from the hidden layers to the input
layer. Computation continues until the network
stabilizes, when the output values can be read.
Neural networks are most commonly used in the following
types of task:
a) Classification tasks, where cases need to be classified
together into predefined classes, the definitions of the classes
of which are not precise, or secondly into classes which are
created from the data itself. The aim is to distinguish between
items; finding whether a
specified pattern or types of cases exist in the data set.
b) Modelling outcome prediction, where the probability of
outcomes for specific cases are predicted based on the input
data for that case, after learning from a test set of data.
c) Data mining/clustering where one searches for
unexpected patterns of variables or variable values in the
data. This may be either supervised pattern recognition
when one searches the data to see whether it contains a
specific defined pattern, or unsupervised pattern
recognition where previously unsuspected patterns are
searched for in the data, that is relating similar cases or
patterns in the data.
d) Time series forecasting prediction of future outcomes
BAYESIAN CONFIDENCE PROPAGATION
Initial testing of the BCPNN on the WHO database
began in 1995.
BCPNN is a semi-automated knowledge-detection
technique that helps to find new drug-ADR associations
likely to be potential signals by quantitative data filtration
The BCPNN was used as a hypothesis generating tool
in the field of pharmacovigilance.
IMPORTANCE OF BCPNN
BCPNN methodology can manage large databases
It is robust in handling very incomplete data
To find patterns of information within datasets with
It can identify the drug-ADR combinations occurring
with high frequency as against the background
frequency, unexpected patterns and variations in
It can be useful in
• detecting new syndromes
• Age group affected
• Population groups at high risk
• Dose relations
It also provides information about variations in ADR
patterns among different countries
It is highly objective, picks up the associations early
and also indicates the strength of associations based
on IC value.
The measure of disproportionality calculated by BCPNN.
The IC value indicates the strength of quantitative dependency between a
drug and ADR.
•A positive value is indicative of higher occurrence of drug-ADR
association than expected.
•Negative value indicates that the association is observed less frequently
than expected statistically
•Zero indicates that there is no quantitative dependency between drug
The IC value is based on:
The total number of case reports with drug
The total number of case reports with
The number of reports with the specific
The total number of reports in the database
No. Reports Target
Other AE Total
a b nTD
Other Drug C d nOD
Total nTA nOA n(ij)
2-dimensional projection of an
Both use ratio nij / Eij where
nij = no. of reports mentioning both drug i & event
Eij = expected no. of reports of drug i & event j
Both report features of posterior dist’n of
ICij = log2 nij / Eij
No. Reports Nephropathy Other AE Total
Target Drug X 20 100 120
Other Drug Y 100 980 1080
Total 120 1080 1200
BCPNN approach advantages
Bayesian approaches provide a natural method of
combining new data with this previous knowledge defined
from external information.
Has no problem with repeated trials which can make
calculation of appropriate confidence intervals
Bayesian precision estimates are easy to Comprehend
The use of a prior distribution gives flexibility
Bayesian methods handle missing data intuitively
Bate, a. (2003). The use of a bayesian confidence propagation neural network in
Grieve, A. (n.d.). Applied Bayesian Approaches in Safety and Pharmacovigilance Professor
Andy Grieve Department of Public Health Sciences King ‟ s College , London Safety is an
issue in all phases.
Hauben, M., Madigan, D., Gerrits, C. M., Walsh, L., & Puijenbroek, E. P. Van. (2005). The
role of data mining in pharmacovigilance rig y o t h s f e hl y b u lic a ib h t s u b s i n i t n i n o
i t ite d rig y p o l l i d Lt t i b d, 929–948.
Liu, M., Matheny, M., Hu, Y., & Xu, H. (2012). Data mining methodologies for
pharmacovigilance. ACM SIGKDD Explorations Newsletter, 14(1), 35–42.
Madigan, D., Ryan, P., Simpson, S., & Zorych, I. (2011). Bayesian methods in
pharmacovigilance. Bayesian Statistics. Retrieved from
Paper, C. W. (n.d.). Enhanced Signal Detection and Management Enables More Effective
Poluzzi, E., Raschi, E., Piccinni, C., & De Ponti, F. (2012). Data mining techniques in
pharmacovigilance: analysis of the publicly accessible FDA adverse event reporting system
(AERS). Data Mining Applications in Engineering and Medicine. Croatia: InTech, 267–301.