1. Web portal for human APA outlier associated rare variants atlas
A step-by-step guide
2. Data and methods
• We collected RNA-seq data and WGS data from the v8 release of the
GTEx project. The RNA-seq data contains 17,832 samples of 54
biological tissues from 838 donors. In the current study, we used 49
tissues, each with at least 70 samples. Original RNA-seq reads were
aligned to the human genome (hg38/GRCh38) using STAR v.2.5.2b.
The resulting sorted BAM files were converted into bedGraph format
using BEDTools version 2.17.0.
• We called APA outliers (aOutliers) in a single tissue (single-tissue
aOutliers) and in multiple tissues (multitissue aOutliers). In brief, for
multitissue aOutliers, we calculated the median Z score of
covariate-corrected APA usage for each APA event across all tissues for which
data were available, restricting to individuals with APA measurements in
at least five tissues. For each APA event, the multitissue aOutliers were
defined as individuals with an absolute median Z score greater
than 3. To account for situations where widespread aberrant APA might
occur in an individual due to non-genetic influences, we removed 11
individuals where the proportion of tested genes that were multitissue
outliers exceeded 1.5 times the interquartile range of the distribution of
the proportion of outlier genes across all individuals. These 11 individuals were
marked as global outliers. For single-tissue aOutlier calling, we
calculated a Z score for each APA event and defined single-tissue
aOutliers for each event in a single tissue as the individuals with an
absolute Z score greater than 3. The 11 individuals marked as
global outliers were also excluded from the single-tissue aOutliers.
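
The outlier-calling rules described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the pipeline used to build the atlas: the function names, the nested-dictionary input layout, and the toy values are all hypothetical. The text does not spell out how the "1.5 times the interquartile range" cutoff is applied, so the sketch assumes the usual upper Tukey fence (third quartile plus 1.5 times the IQR), and it assumes the Z scores of covariate-corrected APA usage have already been computed.

```python
from statistics import median, quantiles

MIN_TISSUES = 5   # individuals need APA measurements in at least five tissues
Z_CUTOFF = 3.0    # |Z| threshold used for both single- and multitissue calls


def multitissue_outliers(z_scores):
    """Call multitissue aOutliers.

    z_scores is a hypothetical layout: {event: {individual: {tissue: z}}}.
    Returns {event: set of individuals whose |median Z| exceeds Z_CUTOFF}.
    """
    calls = {}
    for event, per_individual in z_scores.items():
        hits = set()
        for individual, per_tissue in per_individual.items():
            values = list(per_tissue.values())
            if len(values) < MIN_TISSUES:
                continue  # too few tissues to take a median
            if abs(median(values)) > Z_CUTOFF:
                hits.add(individual)
        calls[event] = hits
    return calls


def single_tissue_outliers(z_by_individual):
    """Call single-tissue aOutliers for one event in one tissue."""
    return {ind for ind, z in z_by_individual.items() if abs(z) > Z_CUTOFF}


def global_outliers(outlier_proportion):
    """Flag individuals with an unusually high proportion of outlier genes.

    Assumption: the cutoff is the upper Tukey fence, Q3 + 1.5 * IQR.
    outlier_proportion maps individual -> proportion of tested genes
    that were multitissue outliers.
    """
    q1, _, q3 = quantiles(list(outlier_proportion.values()), n=4,
                          method="inclusive")
    fence = q3 + 1.5 * (q3 - q1)
    return {ind for ind, p in outlier_proportion.items() if p > fence}
```

In such a sketch, individuals returned by global_outliers would be removed from both the multitissue and the single-tissue call sets, mirroring the exclusion of the 11 global outliers described above.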
3. Part 1: Querying aOutliers/ipaOutliers of interest
Part 2: Downloading data of aOutliers and the
related data of the watershed model