Statistical SignificancePieceFinal



  1. Statistics for Systems Biology

Pathway analysis is an interdisciplinary approach that resulted from recent technological advances in next-generation sequencing of DNA, which revolutionized genomic research. Statistics is integral to the analysis and visualization of these data in order to better understand the underlying biological systems.

Pathway Analysis

Pathway analysis is a rapidly growing field that combines biology, computer science, and statistics to build a working computational model of the living cell. The completion of the Human Genome Project in 2003, which took over 13 years and nearly $3 billion, fueled the development of cheaper and faster sequencing technologies, known as next-generation sequencing (NGS). Every year since 2003, the cost to generate a whole human genome sequence has fallen, resulting in an exponential increase in data. These data make it possible to measure molecular interactions, understand biological functions, and predict how cells behave in response to outside stimuli. A common input for pathway analysis is gene expression profiling data, traditionally from microarrays and increasingly from NGS-based RNA sequencing. Statistical methods process these datasets for applications such as drug discovery. The genes within a specific pathway can be tested to determine whether they are mutated significantly more often than would be expected by chance. For example, if many of the genes altered in a cancer appear to affect a particular pathway, then drugs targeting this pathway could be effective for that cancer. As a result, pathway analysis could be used to develop personalized therapies that are effective and that reduce costs as well as side effects.

Visualizing Biological Data

Figure from the paper “Gaussian graphical modeling reconstructs pathway reactions from high-throughput metabolomics data” by Jan Krumsiek, Karsten Suhre, Thomas Illig, Jerzy Adamski and Fabian J Theis.
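
The pathway test described under Pathway Analysis (are a pathway's genes altered more often than expected by chance?) is often posed as a one-sided hypergeometric over-representation test. The sketch below works under that assumption and is not necessarily the method the author has in mind. The function name hypergeom_sf and every gene count are hypothetical values chosen for illustration.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """Return P(X >= k) for a hypergeometric draw.

    N: total genes measured (background)
    K: genes annotated to the pathway
    n: genes found altered in the sample
    k: altered genes that fall inside the pathway
    """
    upper = min(K, n)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / comb(N, n)


# Hypothetical numbers: 20,000 genes in total, a 100-gene pathway,
# 200 altered genes, 8 of which belong to the pathway.
# By chance we would expect about 200 * 100 / 20000 = 1 overlap.
p_value = hypergeom_sf(k=8, N=20000, K=100, n=200)
print(f"enrichment p-value: {p_value:.2e}")
```

A small p-value suggests that the pathway holds more altered genes than chance alone would explain. In practice many pathways are tested at once, so the p-values would also need a multiple-testing correction.
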
Networks are commonly used to visualize biological data: nodes represent genes, and edges represent the relationships between them. Gaussian graphical models (GGMs) are a popular method for inferring a network; they assume that the data are normally distributed. GGMs simplify the structure in the data and can predict pathways as well as discover novel ones. Bayesian nonparanormal graphical models relax the normality assumption by transforming non-normal data to normal data, and they place a distribution on the model parameters in order to construct the network. Current research involves improving the estimation of these parameters in order to better detect significant interactions and remove superfluous data. Combining robust computational algorithms with statistical analysis and visualization to describe biological data allows for effective communication and education among members of the scientific community and the public.
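
To make the GGM idea concrete: in a Gaussian graphical model, two nodes are joined by an edge when their partial correlation (their correlation after accounting for all other variables) is nonzero, and partial correlations can be read off the inverse covariance matrix. The sketch below is a plain, non-Bayesian illustration on simulated data, not the nonparanormal method mentioned above. The variable names A, B, C, the chain structure, the noise level, and the 0.2 edge cutoff are all assumptions made for the example.

```python
import random


def covariance(data):
    """Sample covariance matrix of rows-as-observations data."""
    n, p = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(p)]
    return [[sum((row[i] - means[i]) * (row[j] - means[j]) for row in data) / (n - 1)
             for j in range(p)] for i in range(p)]


def invert(m):
    """Matrix inverse by Gauss-Jordan elimination with partial pivoting."""
    p = len(m)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(p)]
           for i, row in enumerate(m)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [x / pv for x in aug[col]]
        for r in range(p):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[p:] for row in aug]


def partial_correlations(data):
    """Partial correlation matrix derived from the precision matrix."""
    prec = invert(covariance(data))
    p = len(prec)
    return [[1.0 if i == j else -prec[i][j] / (prec[i][i] * prec[j][j]) ** 0.5
             for j in range(p)] for i in range(p)]


# Simulate a chain A -> B -> C: A and C are correlated only through B.
rng = random.Random(0)
data = []
for _ in range(5000):
    a = rng.gauss(0, 1)
    b = a + rng.gauss(0, 0.5)
    c = b + rng.gauss(0, 0.5)
    data.append([a, b, c])

names = ["A", "B", "C"]
pcor = partial_correlations(data)

# Keep an edge when the partial correlation exceeds a hypothetical cutoff.
edges = [(names[i], names[j])
         for i in range(3) for j in range(i + 1, 3)
         if abs(pcor[i][j]) > 0.2]
print(edges)
```

Although A and C are strongly correlated in the raw data, their partial correlation is close to zero once B is taken into account, so the inferred network keeps only the direct links. With many more variables than samples, the sample covariance is not invertible and a regularized estimator would be needed instead.
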