Human SNPs in microRNA Target Sites
1. SNPs in the human genome that lie within predicted miRNA target sites. MiRkat Manor presents. 19 December, 2007. Yasin “Yossarian” Şenbabaoglu, Brian “Zaphod” Magnuson
13. Hypotheses
    1. If SNPs in human populations exist in miR target sites, there may be a real, functional effect in vivo.
    2. Given the importance of the 8mer seed, SNPs in target sites may show a bias toward being within the seed region.
14. Generating a set of SNPs in miR targets: data sources
    miR targets (miRBase)
      miRBase, Wellcome Trust Sanger Institute: http://microrna.sanger.ac.uk/
      algorithm: miRanda 3.0
      targets: v5 (12 November 2007)
      miR source: miRNA Registry release 9.0
      genome source: Ensembl 40 3'UTRs
    SNPs (HapMap)
      International HapMap Project: http://hapmap.org/
      Release 22
      genome source: NCBI build 36
      Populations:
        YRI: Yoruban in Ibadan, Nigeria
        CHB: Han Chinese in Beijing
        JPT: Japanese in Tokyo
        CEU: European descent in Utah
15. Generating a set of SNPs in miR targets: filters
    Data sources: miR targets (miRBase); SNPs (HapMap)
    Filters (in slide order): hsa targets; hsa miRs; unique targets; 3'UTR; frequency cutoff; split by chromosome (applied to both the targets and the SNPs)
    Outputs: filtered targets; filtered SNPs
16. Generating a set of SNPs in miR targets: merge
    Data sources: miR targets (miRBase); SNPs (HapMap)
    Filters (in slide order): hsa targets; hsa miRs; unique targets; 3'UTR; frequency cutoff; split by chromosome (applied to both the targets and the SNPs)
    Outputs: filtered targets; filtered SNPs
    Merge: filtered targets and filtered SNPs are merged via chromosomal coordinates to give the tarSNPs
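The merge step can be sketched as an interval lookup. This is only a minimal illustration, assuming that target sites and SNPs are already on the same genome assembly, that coordinates are 1-based and inclusive, and that each record is a plain tuple; the function name `merge_by_coordinates` and the record layouts are hypothetical and are not taken from the slides.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


def merge_by_coordinates(targets, snps):
    """Return (snp_id, target_id, pos) for every SNP inside a target site.

    targets: iterable of (target_id, chrom, start, end), 1-based inclusive
    snps:    iterable of (snp_id, chrom, pos), 1-based
    Both record layouts are assumptions made for this sketch.
    """
    # Group SNPs by chromosome, mirroring the "split by chromosome" step.
    by_chrom = defaultdict(list)
    for snp_id, chrom, pos in snps:
        by_chrom[chrom].append((pos, snp_id))

    # Sort each chromosome once so every target can be answered by bisection.
    index = {}
    for chrom, entries in by_chrom.items():
        entries.sort()
        index[chrom] = ([p for p, _ in entries], [s for _, s in entries])

    hits = []
    for target_id, chrom, start, end in targets:
        positions, ids = index.get(chrom, ([], []))
        lo = bisect_left(positions, start)   # first SNP with pos >= start
        hi = bisect_right(positions, end)    # one past last SNP with pos <= end
        for i in range(lo, hi):
            hits.append((ids[i], target_id, positions[i]))
    return hits


example_targets = [("t1", "chr1", 100, 121), ("t2", "chr2", 50, 70)]
example_snps = [("rs1", "chr1", 105), ("rs2", "chr1", 130),
                ("rs3", "chr2", 70), ("rs4", "chr3", 10)]
tar_snps = merge_by_coordinates(example_targets, example_snps)
```

Sorting once per chromosome keeps each target lookup logarithmic in the number of SNPs, which matters when the SNP set is much larger than the target set.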
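17. Generating a set of SNPs in miR targets: statistics
    Data sources: miR targets (miRBase); SNPs (HapMap)
    Filters (in slide order): hsa targets; hsa miRs; unique targets; 3'UTR; frequency cutoff; split by chromosome (applied to both the targets and the SNPs)
    Outputs: filtered targets; filtered SNPs
    Merge: filtered targets and filtered SNPs are merged via chromosomal coordinates to give the tarSNPs
    Statistics: each tarSNP is classified as seed or non-seed according to its position along the target site (5' to 3')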
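The seed versus non-seed split might look like the sketch below. It assumes that the seed-matching region is the 8 nucleotides at the 3' end of the target site (where the 5' seed of the miRNA pairs), that coordinates are 1-based and inclusive, and that strand is given as "+" or "-". These conventions, and the names `seed_interval` and `classify_snp`, are illustrative assumptions; the slides do not state how the seed region was located.

```python
SEED_LENGTH = 8  # the 8mer seed named in the hypotheses


def seed_interval(start, end, strand, seed_length=SEED_LENGTH):
    """Genomic interval (inclusive) of the assumed seed-matching region.

    Assumption: the seed match sits at the 3' end of the target site, so it
    is the high-coordinate end on the "+" strand and the low-coordinate end
    on the "-" strand.
    """
    if strand == "+":
        return max(start, end - seed_length + 1), end
    if strand == "-":
        return start, min(end, start + seed_length - 1)
    raise ValueError(f"unknown strand: {strand!r}")


def classify_snp(pos, start, end, strand):
    """Label a SNP inside a target site as 'seed' or 'non-seed'."""
    if not start <= pos <= end:
        raise ValueError("SNP position lies outside the target site")
    seed_start, seed_end = seed_interval(start, end, strand)
    return "seed" if seed_start <= pos <= seed_end else "non-seed"


# A 22-nt site on the plus strand: positions 114..121 form the assumed seed.
labels = [classify_snp(p, 100, 121, "+") for p in (100, 113, 114, 121)]
```

Counting the "seed" and "non-seed" labels per chromosome and per population would give the proportions used in the hypothesis test.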
18. Z-scores and Hypothesis Testing
    For each chromosome of each population, we need to determine whether the target-site SNPs are biased toward lying inside or outside the seed region (8-mer).
    We find the proportion of SNPs within seed regions and standardize these proportions using z-scores.
    But is the normality assumption valid?
    Z-score:
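One common way to standardize a proportion is the one-sample z-test, sketched below. This is an assumed reading, not necessarily the formula used on the slide: `p0` is a hypothetical null proportion (for example, the fraction of target-site positions that fall in the seed), and the `min_expected=5` rule of thumb is a conventional check on the normal approximation rather than anything stated in the source.

```python
import math


def proportion_z_score(seed_count, total_count, p0):
    """z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n) for an observed proportion."""
    if total_count <= 0:
        raise ValueError("total_count must be positive")
    if not 0 < p0 < 1:
        raise ValueError("p0 must lie strictly between 0 and 1")
    p_hat = seed_count / total_count
    standard_error = math.sqrt(p0 * (1 - p0) / total_count)
    return (p_hat - p0) / standard_error


def normal_approximation_ok(total_count, p0, min_expected=5):
    """Rule-of-thumb check that both expected counts are large enough."""
    return (total_count * p0 >= min_expected
            and total_count * (1 - p0) >= min_expected)


def two_sided_p_value(z):
    """Two-sided p-value under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical counts: 60 of 100 tarSNPs in the seed, null proportion 0.5.
z = proportion_z_score(60, 100, 0.5)
p = two_sided_p_value(z)
```

When `normal_approximation_ok` returns False for a chromosome with few tarSNPs, an exact binomial test would be a safer alternative than the z-score.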