DATA MINING IN
PHARMACOVIGILANCE
Dr. Bhaswat S. Chakraborty
Sr. VP & Chair, R&D Core Committee
Cadila Pharmaceuticals Ltd....
  • Alosetron is indicated only for women with severe diarrhea-predominant irritable bowel syndrome (IBS). Grepafloxacin hydrochloride (trade name Raxar, Glaxo Wellcome) is an oral broad-spectrum fluoroquinolone antibiotic used to treat bacterial infections. Rofecoxib is a nonsteroidal anti-inflammatory drug (NSAID) that has been withdrawn over safety concerns.
  • Proportional Reporting Ratio (PRR); Reporting Odds Ratio (ROR)
  • Neural networks are self-organising, suited to parallel computation, computationally efficient and provide a simple probabilistic interpretation of network weights.[3] Computational efficiency may be particularly advantageous with this programme because the BCPNN starts by calculating cell counts for all potential drug-adverse event combinations in the database, not just those that appear together in at least one report. This is accomplished with two fully interconnected layers, one for all drugs and one for all adverse events.
  • Azapropazone is a non-steroidal anti-inflammatory drug used in acute gout, ankylosing spondylitis & rheumatoid arthritis
  • Data mining in pharmacovigilance

    1. DATA MINING IN PHARMACOVIGILANCE
       Dr. Bhaswat S. Chakraborty, Sr. VP & Chair, R&D Core Committee, Cadila Pharmaceuticals Ltd., Ahmedabad
       Presented at Indian Pharmacological Society Meeting, Ahmedabad, October 5, 2013
    2. CONTENTS
       • Pharmacovigilance (PV)
       • PV process
       • PV databases
       • Data mining in PV
       • Toxic signals & signal detection (SD)
       • Non-Bayesian SD
         • Disproportionality
       • Bayesian SD
         • Multi-item gamma Poisson shrinker (MGPS)
         • Bayesian confidence propagation neural network (BCPNN)
       • Examples
       • Concluding remarks
    3. PREMATURE APPROVAL, INCOMPLETE SAFETY PROFILE?
       • Many drugs whose complete safety profile is still unknown have been approved
       • In some cases, drugs are approved despite identification of serious adverse events (SAEs) in premarketing trials
         • Alosetron hydrochloride – ischemic colitis
         • Grepafloxacin hydrochloride – QT prolongation and deaths
         • Rofecoxib – heart attack and stroke (long-term, high-dosage use)
       • They were all subsequently withdrawn from the market because of these SAEs
       • In currently marketed drugs, black box warnings (for SAEs caused by prescription drugs) are very common
    4. CHANCES TO OBSERVE SAES THROUGH CTS

       Reaction rate   Sample size   Pr(at least 1)   Pr(at least 2)
       1%              500           0.993            0.960
       0.5%            500           0.918            0.713
       0.5%            1000          0.993            0.960
       0.1%            1500          0.777            0.442
       0.1%            3000          0.950            0.801
       0.01%           6000          0.451            0.122
       0.01%           10000         0.632            0.264
       0.01%           20000         0.865            0.594
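The probabilities on the slide above are consistent with a Poisson approximation in which the expected number of events is reaction rate × sample size. The slide does not say how the figures were computed, so the Poisson model is an assumption here. The function name `prob_at_least` is illustrative. A minimal sketch:

```python
import math

def prob_at_least(k: int, rate: float, n: int) -> float:
    """Pr(at least k events) among n subjects, assuming a Poisson count with mean rate * n."""
    lam = rate * n
    # Sum Pr(X = i) for i = 0 .. k-1, then take the complement.
    below = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return 1.0 - below

# (reaction rate, sample size) pairs as listed on the slide
rows = [(0.01, 500), (0.005, 500), (0.005, 1000), (0.001, 1500),
        (0.001, 3000), (0.0001, 6000), (0.0001, 10000), (0.0001, 20000)]

for rate, n in rows:
    p1 = prob_at_least(1, rate, n)
    p2 = prob_at_least(2, rate, n)
    print(f"{rate:.2%}  {n:>6}  {p1:.3f}  {p2:.3f}")
```

The practical reading is the same as the slide's: a reaction occurring in 1 of 10,000 patients has roughly a one-in-two chance of never appearing in a 6,000-subject programme.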
    5. PHARMACOVIGILANCE (PV)
       • Monitoring, evaluation and implementation of drug safety
       • Detection and quantitation of adverse drug reactions (ADRs) that are
         • novel (previously unknown), or
         • partially known (a known hazard with ↑frequency or ↑severity)
         in their clinical nature, severity or frequency
    6. THE PHARMACOVIGILANCE PROCESS
       [Figure: the pharmacovigilance process] Source: A.L. Gould, Internet PPT
    7. PHARMACOVIGILANCE DATABASES
       • PV is usually practiced by agencies and pharmaceutical companies by focusing on SD in large databases
       • These databases are very large, e.g.,
         • USFDA database, AERS: > 6.2 million records
         • WHO database, VigiBase: > 7.2 million records
         • GSK database, OCEANS: > 2 million records
       • Based on a study, the highest power for finding a true signal is achieved by combining those databases with the most drug-specific data
       • Early safety SD should also involve the use of multiple large global databases
       • Reliance on a single database may reduce statistical power and diversity of ADRs
       Hammond IW et al. (2007). Expert Opin Drug Saf. 6:713-21
    8. DESIRABLE ATTRIBUTES OF AE DATABASE SOFTWARE
       • Should be well integrated with clinical data management software
       • User friendly
       • Individual report management features
       • Easy to query
       • Line listing of the entire database, or part of it, is possible and easy
       • Data extraction is easy, with desirable filters
       • May also keep track of postmarketing Rx utility and complaints data
    9. DATA MINING
       • Getting something useful from lots and lots and lots of data
       • Although it might appear so, the methodology is not linear, as it involves building and assessing models and carrying out simultaneous as well as serial steps
    10. DRUG TOXIC SIGNALS
        • WHO: “reported information on a possible causal relationship between an adverse event and a drug, the relationship being unknown or incompletely documented previously.”
        • More than a single report is needed
        • Suggests a Drug-ADR (D-R) association (doesn't establish causality)
        • An alert from any available source
          • Pre- or post-marketing data generated
          • Data mining, especially of post-marketing safety databases
    11. SIGNAL DETECTION
        • Comes originally from electronics engineering
        • In signal detection theory
          • a receiver operating characteristic (ROC) curve plots the true positive rate (true positives out of all positives) against the false positive rate (false positives out of all negatives)
          • at various threshold settings
        • Sensitivity is the true positive rate; it is high when the false negative rate is low
        • Specificity is the true negative rate; it is high when the false positive rate is low
    12. Increasing the threshold would mean fewer false positives (and more false negatives). The actual shape of the curve is determined by the overlap of the two distributions.
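The threshold trade-off can be sketched with two overlapping normal distributions, one for scores of drug-event pairs with no real association and one for truly associated pairs. The means, standard deviations and threshold values below are hypothetical, chosen only for illustration. A minimal sketch using the standard library:

```python
from statistics import NormalDist

# Hypothetical score distributions (not taken from the slides)
noise = NormalDist(mu=0.0, sigma=1.0)    # pairs with no true association
signal = NormalDist(mu=2.0, sigma=1.0)   # pairs with a true association

def rates(threshold: float) -> tuple[float, float]:
    """Return (true positive rate, false positive rate) when scores above threshold are flagged."""
    tpr = 1.0 - signal.cdf(threshold)  # sensitivity
    fpr = 1.0 - noise.cdf(threshold)   # 1 - specificity
    return tpr, fpr

for t in (0.0, 1.0, 2.0, 3.0):
    tpr, fpr = rates(t)
    print(f"threshold={t:.1f}  TPR={tpr:.3f}  FPR={fpr:.3f}  FNR={1 - tpr:.3f}")
```

Plotting TPR against FPR over a fine grid of thresholds traces the ROC curve. Moving the two means closer together increases the overlap and flattens the curve toward the diagonal.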
    13. GOALS FOR ADR SIGNALS
        • Low false positive signals
          • Drug-ADR association should be real
        • Low false negative signals
          • Should not miss any Drug-ADR signal
        • Early detection of signals is desirable
        • False discovery rate → 0
        • Association
          • Bupropion – seizures
          • Olanzapine – thrombosis
          • Pergolide – increased libido
          • Risperidone – diabetes mellitus
          • Terbinafine – stomatitis
          • Rosiglitazone – liver function abnormalities
        • Dis-association
          • Isotretinoin – suicide
        Source: LAREB
    14. DATA MINING & SD PROTOCOL
        • Report collection
        • Database cleaning
        • Quantitative assessment
        • Qualitative assessment
        • Evaluation
        • Communication
        Gavali, Kulkarni, Kumar and Chakraborty (2009), Ind J Pharmacol, 41, 162-166
    15. DATA DISPLAY & MINING METHODS IN PV

        No. of reports   Target R   Other R   Total
        Target D         a          b         nTD
        Other D          c          d         nOD
        Total            nTA        nOA       n

        Methods for mining:
        • Reporting Ratio (RR): E(a) = nTD × nTA / n
        • Proportional Reporting Ratio (PRR): E(a) = nTD × c / nOD
        • Odds Ratio (OR): E(a) = b × c / d
        Need to accommodate uncertainty, especially if a is small
        • Bayesian approaches provide a way to do this
        Basic approach: possible signal when R = a/E(a) is “large”
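The three measures on this slide share one pattern: each compares the observed count a with a different expected count E(a). The sketch below computes all three from the 2×2 cell counts. The function name `disproportionality` and the example counts are hypothetical; only the formulas come from the slide.

```python
def disproportionality(a: int, b: int, c: int, d: int) -> dict[str, float]:
    """Return a / E(a) for the RR, PRR and OR definitions of E(a).

    a: target drug & target reaction    b: target drug & other reactions
    c: other drugs & target reaction    d: other drugs & other reactions
    Assumes c and d are non-zero.
    """
    n_td = a + b          # all reports for the target drug
    n_od = c + d          # all reports for other drugs
    n_ta = a + c          # all reports of the target reaction
    n = a + b + c + d     # all reports
    expected = {
        "RR": n_td * n_ta / n,
        "PRR": n_td * c / n_od,
        "OR": b * c / d,
    }
    return {name: a / e for name, e in expected.items()}

# Hypothetical counts, for illustration only
print(disproportionality(a=20, b=80, c=100, d=9800))
```

When the target drug accounts for a small share of the database, the three ratios tend to be close; they diverge as a becomes a larger fraction of its row or column.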
    16. CRITERIA FOR A TOXIC DISPROPORTIONAL ADR

        $$\mathrm{PRR} = \frac{a/(a+b)}{c/(c+d)}$$
        $$\mathrm{ROR} = \frac{a/b}{c/d}$$
        $$\chi^2 = \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$$

        A significant disproportional signal is detected when χ² is ≥ 4.0 and the rest (PRR, ROR) are ≥ 2.0
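The criteria on this slide can be checked mechanically once the 2×2 counts are known. The sketch below makes two assumptions: χ² is the ordinary Pearson statistic summed over the four cells with no continuity correction, and "the rest" refers to PRR and ROR. The function names and example counts are hypothetical.

```python
def chi_square(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square for a 2x2 table (assumed: summed over all four cells, no Yates correction)."""
    n = a + b + c + d
    observed = [a, b, c, d]
    row_totals = [a + b, a + b, c + d, c + d]
    col_totals = [a + c, b + d, a + c, b + d]
    total = 0.0
    for obs, row, col in zip(observed, row_totals, col_totals):
        exp = row * col / n  # expected count under independence
        total += (obs - exp) ** 2 / exp
    return total

def is_signal(a: int, b: int, c: int, d: int) -> bool:
    """Apply the slide's thresholds: chi-square >= 4.0, PRR >= 2.0 and ROR >= 2.0."""
    prr = (a / (a + b)) / (c / (c + d))
    ror = (a / b) / (c / d)
    return chi_square(a, b, c, d) >= 4.0 and prr >= 2.0 and ror >= 2.0

# Hypothetical counts, for illustration only
print(is_signal(20, 80, 100, 9800))    # strongly disproportionate
print(is_signal(10, 90, 1000, 9000))   # same reporting proportion in both rows
```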
    17. CASE STUDY EXAMPLE: PROPRANOLOL-BRADYCARDIA
        Gavali, Kulkarni, Kumar and Chakraborty (2009), Ind J Pharmacol, 41, 162-166
    18. BAYESIAN STATISTICS IN SD

        $$\frac{\Pr(R \mid D)}{\Pr(R)} = \frac{\Pr(R, D)}{\Pr(R)\,\Pr(D)}$$

        where Pr(R|D) is the posterior probability of observing a specific adverse event R given that a specific drug D is the suspect drug. Pr(R) and Pr(D) are the prior probabilities of observing R and D in the entire database. Pr(R,D) is the joint probability that R and D were both observed, i.e. co-occurred, in the same database.
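The identity on this slide follows in one step from the definition of conditional probability. A short derivation, using only the symbols already defined on the slide:

```latex
\begin{align*}
\Pr(R \mid D) &= \frac{\Pr(R, D)}{\Pr(D)}
  && \text{(definition of conditional probability)} \\
\frac{\Pr(R \mid D)}{\Pr(R)} &= \frac{\Pr(R, D)}{\Pr(R)\,\Pr(D)}
  && \text{(divide both sides by } \Pr(R)\text{)}
\end{align*}
```

The left-hand side is the ratio of posterior to prior probability of R. It equals 1 exactly when R and D are independent, since then Pr(R,D) = Pr(R) Pr(D).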
    19. MULTI-ITEM GAMMA POISSON SHRINKER (MGPS)
        • It ranks drug-event combinations
          • according to how ‘interestingly large’ the number of reports of that R-D combination is
          • compared with what would be expected if the drug and event were statistically independent
        • Unlike the Information Component (IC), the MGPS technique gives an overall ranking of R-D combinations
        • IC gives a kind of non-relative measure for each R-D combination
    20. MULTI-ITEM GAMMA POISSON SHRINKER (MGPS)
        • Reporting ratio → Modified reporting ratio → Modeled reporting ratio → Empirical Bayes Geometric Mean (EBGM)
        • Stratification (by gender, age, yr. etc.)
        • Bayesian shrinkage for cell sizes
        • If the lower bound of the 90% CI of EBGM (EB05) is ≥ 2, R-D combinations occur twice as often as expected
        • Also, for N > 20 or so, N/E = EBGM = PRR
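The effect of Bayesian shrinkage on small cells can be illustrated with a deliberately simplified model. The sketch below is not MGPS itself: it uses a single gamma prior with fixed, hypothetical hyperparameters (`alpha`, `beta`), whereas MGPS fits a gamma mixture prior to the whole database, and it returns the posterior mean rather than the geometric mean (EBGM). It assumes N ~ Poisson(λ × E) with λ ~ Gamma(alpha, rate = beta), so the posterior is Gamma(alpha + N, rate = beta + E).

```python
def shrunk_ratio(n_obs: int, expected: float, alpha: float = 0.5, beta: float = 0.5) -> float:
    """Posterior mean of the reporting ratio under a single gamma prior (simplified, not full MGPS)."""
    return (alpha + n_obs) / (beta + expected)

# Two hypothetical cells with the same raw ratio N/E = 10
for n_obs, expected in [(1, 0.1), (100, 10.0)]:
    raw = n_obs / expected
    print(f"N={n_obs:>3}  E={expected:>5}  raw N/E={raw:.2f}  shrunk={shrunk_ratio(n_obs, expected):.2f}")
```

With one report the estimate is pulled strongly toward the prior, while with a hundred reports it stays close to N/E. This matches the slide's remark that the shrunken and raw ratios agree once N exceeds about 20.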
    21. Hauben & Zhou. (2003) Drug Safety 26, 159-186
    22. BAYESIAN CONFIDENCE PROPAGATION NEURAL NETWORK (BCPNN)
        • The Uppsala Monitoring Centre (UMC) for WHO databases uses BCPNN architecture for SD
        • Neural networks are highly organized & efficient
        • Give simple probabilistic interpretation of network weights
        • Analogous to a living neuron with its multiple dendrites and single axon
        • BCPNN calculates cell counts for all potential R-D combinations in the database, not just those appearing in at least one report
          • Done with two fully interconnected layers
          • One for all drugs and one for all adverse events
    23. INFORMATION COMPONENT (IC)
        • IC is used to decide whether the joint probability of D & R differs from that expected if D & R were independent
        • This makes sense because if the events are independent
          • the knowledge of one of the variables contributes no new information about the other &
          • does not reduce the uncertainty about one variable (due to knowledge about the other)

        $$IC = \log_2 \frac{\Pr(R, D)}{\Pr(R)\,\Pr(D)}$$
    24. POSITIVE IC AND TIME SCANS
        • If the Pr of co-occurrence of R & D is the same as the product of the individual Pr of R & D, the Bayesian likelihood estimator Pr(R,D) / [Pr(R) × Pr(D)] will be equal to 1
          • This means equal prior and posterior probabilities
          • log2(1) = 0, therefore IC = 0
        • However, when the posterior probability Pr(R|D) exceeds the prior probability Pr(R), the IC becomes more positive
        • An IC whose lower 95% CI bound is > 0 and which increases with sequential time scans is a positive, stable signal
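The sign behaviour of the IC described on this slide can be checked with a plug-in estimate, in which each probability is replaced by its observed relative frequency. This is a simplification and an assumption: BCPNN estimates the probabilities with Bayesian priors and also yields a credibility interval, neither of which is reproduced here. The function name and the counts are hypothetical.

```python
import math

def information_component(n_rd: int, n_r: int, n_d: int, n_total: int) -> float:
    """Plug-in IC = log2( Pr(R,D) / (Pr(R) * Pr(D)) ) from raw report counts.

    n_rd: reports mentioning both R and D    n_r: reports mentioning R
    n_d: reports mentioning D                n_total: all reports
    """
    p_rd = n_rd / n_total
    p_r = n_r / n_total
    p_d = n_d / n_total
    return math.log2(p_rd / (p_r * p_d))

# Hypothetical counts: R in 1% of reports, D in 10% of reports
print(information_component(10, 100, 1000, 10000))  # co-occurrence as expected under independence
print(information_component(40, 100, 1000, 10000))  # four times the expected co-occurrence
```

The first call returns a value of essentially zero and the second a value of about 2, since log2(4) = 2.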
    25. CAPTOPRIL AND COUGH
        The diagram shows the IC for the drug-ADR association. Error bars: + 95% CI.
        R. Orre et al. (2000) Computational Statistics & Data Analysis 34, 473-493
    26. A well-known signal: suprofen and back pain. The diagram shows the IC for the drug-ADR association. Error bars: + 95% CI.
        R. Orre et al. (2000) Computational Statistics & Data Analysis 34, 473-493
    27. The development from 1973 to 1990 of the IC for the drug azapropazone vs. the photosensitivity reaction, with 95% CI.
        R. Orre et al. (2000) Computational Statistics & Data Analysis 34, 473-493
    28. CHARACTERISTICS OF IC
        • The preceding diagrams show how the IC for the D-R association (e.g., suprofen-back pain) varies over a span of time (e.g., 1983 – 1990)
        • The cumulative probability function for IC being greater than zero [Pr(IC>0)] develops over time. This association is seen with 80% certainty after Q1 1984.
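A time scan of this kind amounts to recomputing the IC on the cumulative report counts at the end of each period. The sketch below shows only the plug-in point estimate. The Pr(IC>0) curve and the confidence limits shown in the diagrams come from the Bayesian posterior used by BCPNN, which is not reproduced here. The per-period counts are invented for illustration and are not the suprofen data.

```python
import math
from itertools import accumulate

# Hypothetical new reports per period: (both R and D, R, D, all reports)
periods = ["1983", "1984", "1985"]
new_counts = [(1, 50, 40, 5000), (4, 60, 60, 6000), (6, 70, 80, 7000)]

def add(x, y):
    """Element-wise sum of two count tuples."""
    return tuple(p + q for p, q in zip(x, y))

def ic(n_rd: int, n_r: int, n_d: int, n_total: int) -> float:
    """Plug-in IC = log2(n_rd * n_total / (n_r * n_d)); assumes n_rd > 0."""
    return math.log2(n_rd * n_total / (n_r * n_d))

# Cumulative counts at the end of each period, then the IC on each cumulative table
cumulative = list(accumulate(new_counts, add))
scan = [ic(*counts) for counts in cumulative]

for period, value in zip(periods, scan):
    print(f"{period}: IC = {value:.2f}")
```

In this made-up series the IC rises with each scan, which is the pattern the slide describes for an emerging signal.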
    29. DIGOXIN & RASH: AN INTERESTING CASE
        Although the overall IC was negative, when examined across age groups, increasing age was associated with a positive IC.
        R. Orre et al. (2000) Computational Statistics & Data Analysis 34, 473-493
    30. PACLITAXEL-TACHYCARDIA
        Change of IC from 1970 to 2010 for the paclitaxel-tachycardia association, plotted at five-year intervals with 95% CI.
        Singhal & Chakraborty. Unpublished data
    31. DOCETAXEL - FLUSHING
        Change of IC from 1970 to 2010 for the docetaxel-flushing association.
        [Figure: E(IC) vs. time (year), in intervals from 1970-1975 to 2006-2010]
        Singhal & Chakraborty. Unpublished data
    32. CONCLUDING REMARKS
        • Statistical data mining for drug-adverse reaction associations offers a useful, non-invasive and sophisticated tool for detecting unknown or incompletely documented signals
        • Mainly proportional reporting ratios (PRR) and Bayesian data mining, including Empirical Bayesian Screening (EBS) & Bayesian Confidence Propagation Neural Network (BCPNN), are used
        • PRRs and EBS are comparable; EBS has an advantage with D-R combinations in very small numbers, but it is based on relative ranking
        • BCPNN provides an IC (a kind of threshold) for signaling that applies to any D-R cell irrespective of ranking
        • The signals do not establish causality; they only indicate a very strong association between D & R
        • With all methods of data mining (especially PRR, EBS & BCPNN), the quality & size of the database are very important (they can amplify or dilute a signal)
    33. THANK YOU VERY MUCH
        Acknowledgement: Ms. Raji Nair