Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Bayesian statistics


Published on

Discussion in brief about Bayesian stats, neural networks and BCPNN

Published in: Health & Medicine
  • Be the first to comment

Bayesian statistics

  2. 2. INTRODUCTION DATA MINING A process used by companies to turn raw data into useful information. SIGNAL The WHO has defined ‘signal’ as ‘‘Reported information on a possible causal relationship between an adverse event and a drug, the relationship being unknown or incompletely documented.’’ In a perfect signalling system: a) All signals should be found b) “False” signals should be ignored c) Signals should be found early.
  3. 3. Signal detection and analysis Associations Signals Signal document Drug safety data Combinations Quantitative threshold Clinical evaluation
  4. 4. Data mining is refereed to as the computer- assisted procedures, starting from processing of dataset by data “cleaning” and culminating into the application of statistical techniques, often known as data mining algorithms (DMAs). DMAs can be classified in Frequentist Bayesian approach.
  5. 5. BAYESIAN STATISTICS Bayesian theorem first proposed by English clergyman and mathematician Thomas Bayes. Bayes theorem is a branch of logic applied to decision making and inferential statistics that deals with probability inference and is used to predict the future events based upon the knowledge of previous events. Health technology assessment Spigelhalter et al define Bayesian statistics as: ‘the explicit quantitative use of external evidence in the design, monitoring, analysis, interpretation, and reporting of a health technology assessment.’
  6. 6. oBayesian statistics uses ‘probability’ to express subjective ‘degree of belief’ in a specific outcome of interest. oBased on prior belief, and constantly updated on addition of new data, Bayesian statistics provides a formal statistical framework for updating probabilities and incorporating prior (even assumed) knowledge.
  7. 7. Benefits of using Bayesian statistics when data mining a) External information based on for example the goals of the data mining project can be incorporated into the statistical model. b) The use of a conservative prior allows a dampening at low counter values, stopping inappropriate fluctuations at low counter values and thus stopping unwanted spurious associations or patterns. c) Flexibility of choice of prior, so an alternative attitude to external information can be taken. d) Formal approach for looking at how results differ with different priors, including sensitivity testing of results. e) Repeated testing is fundamental to the approach, rather than an additional issue to be considered in the analysis.
  8. 8. f) Method can readily handle missing data, can make calculations when no data, and simply have more uncertainty (wider credibility intervals) around precision estimate. g) Credibility intervals easier to interpret than conventional confidence intervals. h) A probability distribution with the addition of data leads to a new probability distribution, thus emphasising the immediacy of any conclusion, as magnitude and certainty of the change is obvious. i) Tractable to calculate estimate of parameter even without data (this is the prior distribution), some classical estimates have problems before data is collected.
  9. 9. Applications 1) Clinical trial Analysis 2) Causality assessment  Supervised pattern recognition  Classification tasks  Cluster analysis  Neural networks 3) Testing of new drugs 4) Astrophysics 5) Lawsuits and public policy decision making
  10. 10. Neural Networks The term 'artificial neural network' or simply 'neural network’ is used to describe a wide range of different computational architectures. These architectures are used for a diverse range of tasks, including Prediction Classification of data sets And data mining. Neural networks are good for problems where many complex, undefined constraints exist with possibly varying relationships between them. Two main tracks of neural network based research have been pursued: 1. To further develop the neural network so it more and more effectively mimics the behaviour of the human brain 2. The further development of original neural network based techniques in whatever way necessary to optimise their use in specific data analysis tasks.
  11. 11. A typical neural network is made up of many simple interconnected units ("nodes"). Such units can be either Input nodes (representing input variables) Output nodes (representing network output), or In some cases hidden nodes. Communication channels ("connections" or "weights"), which carry numerical data, link the nodes to each other, so that each node only operates on local data and on inputs received via connections from other nodes.
  12. 12. Neural Network models include: Kohonen Back propagation Recurrent back propagation Adaptive resonance theory (ART) networks Radial basis function (rbf) networks Combination of ART and rbf Probabilistic neural networks and other types.
  13. 13. There are different types of neural network architecture: as data can flow from input to output nodes in different ways. Feed forward models are where data flows directly from input to output. In contrast recurrent networks model information about past inputs which is fed back in alongside current data via recurrent or feedback connections to provide an answer. •This can be done by feeding the outputs directly back into the input layer, or it could also be done by feedback from the hidden layers to the input layer. Computation continues until the network stabilizes, when the output values can be read.
  14. 14. Neural networks are most commonly used in the following types of task: a) Classification tasks, where cases need to be classified together into predefined classes, the definitions of the classes of which are not precise, or secondly into classes which are created from the data itself. The aim is to distinguish between items; finding whether a specified pattern or types of cases exist in the data set. b) Modelling outcome prediction, where the probability of specific outcomes for specific cases are predicted based on the input data for that case, after learning from a test set of data.
  15. 15. c) Data mining/clustering where one searches for unexpected patterns of variables or variable values in the data. This may be either supervised pattern recognition when one searches the data to see whether it contains a specific defined pattern, or unsupervised pattern recognition where previously unsuspected patterns are searched for in the data, that is relating similar cases or patterns in the data. d) Time series forecasting prediction of future outcomes
  16. 16. BAYESIAN CONFIDENCE PROPAGATION NEURAL NETWORK Initial testing of the BCPNN on the WHO database began in 1995. BCPNN is a semi-automated knowledge-detection technique that helps to find new drug-ADR associations likely to be potential signals by quantitative data filtration in Vigibase. The BCPNN was used as a hypothesis generating tool in the field of pharmacovigilance.
  17. 17. IMPORTANCE OF BCPNN BCPNN methodology can manage large databases It is robust in handling very incomplete data To find patterns of information within datasets with complex dependencies It can identify the drug-ADR combinations occurring with high frequency as against the background frequency, unexpected patterns and variations in patterns overtime
  18. 18. It can be useful in • detecting new syndromes • Age group affected • Population groups at high risk • Dose relations It also provides information about variations in ADR patterns among different countries It is highly objective, picks up the associations early and also indicates the strength of associations based on IC value.
  19. 19. INFORMATION COMPONENT(IC) The measure of disproportionality calculated by BCPNN. The IC value indicates the strength of quantitative dependency between a drug and ADR. •A positive value is indicative of higher occurrence of drug-ADR association than expected. •Negative value indicates that the association is observed less frequently than expected statistically •Zero indicates that there is no quantitative dependency between drug and ADR.
  20. 20. The IC value is based on: The total number of case reports with drug The total number of case reports with adverse reaction The number of reports with the specific drug-ADR combination The total number of reports in the database
  21. 21. No. Reports Target AE(j) Other AE Total Target Drug(i) a b nTD Other Drug C d nOD Total nTA nOA n(ij) 2-dimensional projection of an SRS database.
  22. 22. Both use ratio nij / Eij where nij = no. of reports mentioning both drug i & event j Eij = expected no. of reports of drug i & event j Both report features of posterior dist’n of ‘information component’ ICij = log2 nij / Eij IC=log2 a(a+b+c+d)/(a+c)(a+b)
  23. 23. No. Reports Nephropathy Other AE Total Target Drug X 20 100 120 Other Drug Y 100 980 1080 Total 120 1080 1200 For Eg
  24. 24. BCPNN approach advantages Bayesian approaches provide a natural method of combining new data with this previous knowledge defined from external information. Has no problem with repeated trials which can make calculation of appropriate confidence intervals Bayesian precision estimates are easy to Comprehend The use of a prior distribution gives flexibility Bayesian methods handle missing data intuitively
  25. 25. REFERENCES Bate, a. (2003). The use of a bayesian confidence propagation neural network in pharmacovigilance. fda_ind_larryg. (n.d.). Grieve, A. (n.d.). Applied Bayesian Approaches in Safety and Pharmacovigilance Professor Andy Grieve Department of Public Health Sciences King ‟ s College , London Safety is an issue in all phases. Hauben, M., Madigan, D., Gerrits, C. M., Walsh, L., & Puijenbroek, E. P. Van. (2005). The role of data mining in pharmacovigilance rig y o t h s f e hl y b u lic a ib h t s u b s i n i t n i n o i t ite d rig y p o l l i d Lt t i b d, 929–948. Liu, M., Matheny, M., Hu, Y., & Xu, H. (2012). Data mining methodologies for pharmacovigilance. ACM SIGKDD Explorations Newsletter, 14(1), 35–42. doi:10.1145/2408736.2408742 Madigan, D., Ryan, P., Simpson, S., & Zorych, I. (2011). Bayesian methods in pharmacovigilance. Bayesian Statistics. Retrieved from Paper, C. W. (n.d.). Enhanced Signal Detection and Management Enables More Effective Pharmacovigilance. Poluzzi, E., Raschi, E., Piccinni, C., & De Ponti, F. (2012). Data mining techniques in pharmacovigilance: analysis of the publicly accessible FDA adverse event reporting system (AERS). Data Mining Applications in Engineering and Medicine. Croatia: InTech, 267–301. doi:10.5772/50095 presentation_Haijun5-15-2009-19-4-50. (n.d.).
  26. 26. QUESTIONS