rareAPA_website.pptx
1. Web portal for the human APA outlier-associated rare variants atlas
A step-by-step guide
2. Data and methods
• We collected RNA-seq and WGS data from the v8 release of the GTEx
project. The RNA-seq data contain 17,832 samples from 54 biological
tissues of 838 donors. In the current study, we used the 49 tissues
with at least 70 samples. The original RNA-seq reads were aligned to
the human genome (hg38/GRCh38) using STAR v2.5.2b. The resulting
sorted BAM files were converted to bedGraph format using BEDTools
v2.17.0.
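The tissue filter described above (keep tissues with at least 70 samples) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `tissues_with_enough_samples` and the input layout (one tissue label per RNA-seq sample) are assumptions, and the labels are toy values rather than GTEx data.

```python
from collections import Counter

MIN_SAMPLES = 70  # threshold stated in the text


def tissues_with_enough_samples(sample_tissues, min_samples=MIN_SAMPLES):
    """Return the sorted tissue names that have at least `min_samples` samples.

    `sample_tissues` is a hypothetical iterable holding one tissue label
    per RNA-seq sample.
    """
    counts = Counter(sample_tissues)
    return sorted(t for t, n in counts.items() if n >= min_samples)


# Toy labels for illustration only (not GTEx sample counts).
labels = ["Lung"] * 80 + ["Liver"] * 70 + ["Bladder"] * 21
kept = tissues_with_enough_samples(labels)
print(kept)
```

In this toy input, only the tissue with 21 samples falls below the threshold and is dropped.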
• We called APA outliers (aOutliers) in a single tissue (single-tissue
aOutliers) and across multiple tissues (multitissue aOutliers). In brief,
for multitissue aOutliers, we calculated the median Z score of
covariate-corrected APA usage for each APA event across all tissues for
which data were available, restricting to individuals with APA
measurements in at least five tissues. For each APA event, multitissue
aOutliers were defined as individuals whose median Z score had an
absolute value greater than 3. To account for situations where
widespread aberrant APA might occur in an individual due to non-genetic
influences, we removed 11 individuals for whom the proportion of tested
genes that were multitissue outliers exceeded 1.5 times the
interquartile range of the distribution of the proportion of outlier
genes across all individuals. These 11 individuals were marked as
global outliers. For single-tissue aOutlier calling, we calculated a
Z score for each APA event and defined the single-tissue aOutliers for
each event in a given tissue as the individuals whose Z score had an
absolute value greater than 3. The 11 global outliers were also
excluded from the single-tissue aOutliers.
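The two calling rules above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' code: Z scores are taken as already computed on covariate-corrected APA usage, the dictionary layout and the function names are hypothetical, and the identification of global outliers is not shown (they are simply passed in through `excluded`).

```python
from statistics import median

Z_THRESHOLD = 3.0  # |Z| cutoff stated in the text
MIN_TISSUES = 5    # minimum number of tissues for multitissue calls


def single_tissue_outliers(z_by_individual, threshold=Z_THRESHOLD,
                           excluded=frozenset()):
    """Individuals with |Z| > threshold for one APA event in one tissue."""
    return {
        ind for ind, z in z_by_individual.items()
        if ind not in excluded and abs(z) > threshold
    }


def multitissue_outliers(z_by_individual_tissue, threshold=Z_THRESHOLD,
                         min_tissues=MIN_TISSUES, excluded=frozenset()):
    """Map each multitissue outlier to its median Z for one APA event.

    Individuals measured in fewer than `min_tissues` tissues are skipped.
    """
    outliers = {}
    for ind, z_by_tissue in z_by_individual_tissue.items():
        if ind in excluded or len(z_by_tissue) < min_tissues:
            continue
        med = median(z_by_tissue.values())
        if abs(med) > threshold:
            outliers[ind] = med
    return outliers


# Toy Z scores for a single APA event (hypothetical individuals and tissues).
event_z = {
    "ind_A": {"t1": 3.5, "t2": 4.1, "t3": 3.2, "t4": 2.9, "t5": 3.8},
    "ind_B": {"t1": 0.2, "t2": -0.5, "t3": 5.0, "t4": 0.1, "t5": 0.3},
    "ind_C": {"t1": -4.0, "t2": -3.9, "t3": -4.2},  # only three tissues
}
multi = multitissue_outliers(event_z)
single_t3 = single_tissue_outliers({i: z["t3"] for i, z in event_z.items()})
```

Using the median rather than the mean keeps one extreme tissue (as for `ind_B`) from producing a multitissue call on its own, while the same value still counts as a single-tissue outlier in that tissue.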
3. Part 1: Querying aOutliers/ipaOutliers of interest
Part 2: Genome browser
Part 3: Downloading aOutlier data and the related watershed model data
4. Part 1: Querying aOutliers/ipaOutliers of interest
Step 1:
Search for a gene (e.g. SUGP1)
9. Part 2: Genome browser
The genome browser frame is built with IGV.js
Download SVG
Based on the hg38 genome
The browser frame may take some time to load.
Thank you for waiting.
10. The information of the column
Search by gene symbol (e.g. DDX18) or by genomic position (e.g.
chr1:32,774,256-32,776,257)
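For readers who want to prepare queries programmatically, the two accepted input forms can be told apart as in the sketch below. It only illustrates the input formats shown above; the function `parse_query` and the regular expression are assumptions and say nothing about how the portal itself parses the search box.

```python
import re

# Matches positions such as "chr1:32,774,256-32,776,257" (commas optional).
LOCUS_RE = re.compile(r"^(chr[0-9A-Za-z_]+):([\d,]+)-([\d,]+)$")


def parse_query(query):
    """Classify a search string as a genomic position or a gene symbol.

    Returns ("locus", chrom, start, end) or ("gene", symbol).
    """
    text = query.strip()
    match = LOCUS_RE.match(text)
    if match:
        chrom = match.group(1)
        start = int(match.group(2).replace(",", ""))
        end = int(match.group(3).replace(",", ""))
        if start > end:
            raise ValueError(f"start is greater than end in {text!r}")
        return ("locus", chrom, start, end)
    # Anything that is not a position is treated as a gene symbol.
    return ("gene", text)


locus = parse_query("chr1:32,774,256-32,776,257")
gene = parse_query("DDX18")
```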
Once the frame is fully loaded, the figure is displayed as follows: