EpiMOLAS: An Intuitive Web-based Framework for Genome-Wide DNA Methylation Analysis

Presented By
Sheng-Yao Su
Bioinformatics Program, Taiwan International Graduate Program,
Institute of Information Science, Academia Sinica
Institute of Biomedical Informatics, National Yang-Ming University
TAIWAN
Sep 10, 2019

Outline
• Introduction
• Methods
• Implementations and Results
• EpiMOLAS consists of DocMethyl and EpiMOLAS_web
• Discussion
• Conclusion

Epigenomics
• Epi- (upon, above, beyond) genomics (DNA sequence)
• Waddington coined the term "epigenetics" in the 1940s.
• Epigenomics is the study of the complete set of epigenetic
modifications on the genetic material of a cell (Wikipedia)

Epigenomic Dynamics
• DNA methylation
• Histone modification
• Nucleosome remodelling
• Non-coding RNAs

DNA methylation: an epigenetic mark of cellular memory
Source: "DNA methylation: an epigenetic mark of cellular memory",
Experimental & Molecular Medicine volume 49, page e322 (2017)

Chemical Structure: 5-mC and 5-hmC

Sodium Bisulfite treatment
• Correct conversion: C -> U -> T
• Correct conversion: mC -> mC -> C
• Incorrect conversion: mC -> U -> T
• Steps: bisulfite treatment, then PCR amplification

                     Unmethylated DNA   Methylated DNA
Original sequence    CCGTCGACGT         CmCGTmCGAmCGT
Bisulfite converted  UUGTUGAUGT         UmCGTmCGAmCGT
PCR product          TTGTTGATGT         TCGTCGACGT
Incomplete conversion
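
The correct conversion path on this slide can be reproduced with a small sketch. It models only the ideal case (every unmethylated C converts, every mC is protected), and the function names are illustrative, not part of EpiMOLAS.

```python
import re

# Tokenise a sequence written in the slide's notation: "mC" is a
# methylated cytosine, every other symbol is a single base.
TOKEN = re.compile(r"mC|[ACGTU]")


def tokenize(seq):
    tokens = TOKEN.findall(seq)
    if "".join(tokens) != seq:
        raise ValueError(f"unexpected symbol in {seq!r}")
    return tokens


def bisulfite_convert(seq):
    """Ideal bisulfite treatment: unmethylated C becomes U, mC is kept."""
    return "".join("U" if t == "C" else t for t in tokenize(seq))


def pcr_amplify(seq):
    """After PCR, U is read as T and mC is copied as plain C."""
    replacement = {"U": "T", "mC": "C"}
    return "".join(replacement.get(t, t) for t in tokenize(seq))


for original in ("CCGTCGACGT", "CmCGTmCGAmCGT"):
    converted = bisulfite_convert(original)
    print(original, "->", converted, "->", pcr_amplify(converted))
```

Comparing the PCR product with the original sequence then tells the two states apart: a C that reads as T was unmethylated, a C that is still C was methylated.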

Detect DNA modification changes
• Bisulfite conversion treatment
  • Reduced Representation Bisulfite Sequencing (RRBS)
  • Whole Genome Bisulfite Sequencing (WGBS)
• Bisulfite-free
  • Anti-methylcytosine Antibody
  • Methyl-CpG binding domain (MBD)
  • Chemical labeling (MeFISH)
  • Methylation-sensitive restriction enzyme
  • Electrochemical oxidation
  • Third Generation (SMRT-seq, Nanopore)

Generic bioinformatic analysis workflow for bisulfite sequencing data
Seq Reads -> Quality Control -> Alignment -> Methylation Call ->
Visualization, Annotation, Diff. Methyl. Region -> Biomarker Candidates

Flowchart of EpiMOLAS
• DocMethyl: Seq Reads -> Trim Galore!, FastQC -> Bismark (Bowtie) -> Bismark Extract. -> mtable
• EpiMOLAS_web: mtable -> Visualization, Annotation -> Biomarker Candidates

Metric for Methylation Profiling - mtable
• Input: Bismark genome-wide cytosine report (chromosome, position, strand,
methylated count, unmethylated count, context, trinucleotide)
1 16425704 + 0 8 CHH CTC
1 16425710 + 6 6 CHG CAG
1 16425714 + 10 5 CHH CAA
1 16425717 + 6 0 CHG CTG
1 16425719 + 4 0 CG CGC
• Sequence depth: a cytosine qualifies with at least four counts of
methylated and unmethylated cytosine
• Region (gene, genome): at least five qualified observed cytosines
• Contexts: CG, CHG, CHH
• EpiMolas.jar turns the report (input) into the mtable (output)
Su et al. TEA: the epigenome platform for Arabidopsis methylome study. BMC Genomics 17(Suppl 13): 1027 (2016)
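
The two thresholds on this slide can be sketched as follows, using the five report rows shown above. Two points are assumptions, not something the slides confirm: the "four counts" threshold is read as methylated plus unmethylated reads combined, and the region level is taken as the mean of the per-cytosine levels. The actual aggregation inside EpiMolas.jar may differ.

```python
from statistics import mean

# Rows copied from the Bismark genome-wide cytosine report on the slide.
# Columns: chromosome, position, strand, methylated count,
# unmethylated count, context, trinucleotide.
REPORT = """\
1 16425704 + 0 8 CHH CTC
1 16425710 + 6 6 CHG CAG
1 16425714 + 10 5 CHH CAA
1 16425717 + 6 0 CHG CTG
1 16425719 + 4 0 CG CGC
"""

MIN_DEPTH = 4  # slide: at least four counts per cytosine (assumed combined)
MIN_SITES = 5  # slide: at least five qualified observed cytosines


def parse_report(text):
    sites = []
    for line in text.splitlines():
        chrom, pos, strand, meth, unmeth, context, _tri = line.split()
        sites.append({
            "chrom": chrom, "pos": int(pos), "strand": strand,
            "meth": int(meth), "unmeth": int(unmeth), "context": context,
        })
    return sites


def region_level(sites, context, min_depth=MIN_DEPTH, min_sites=MIN_SITES):
    """Mean per-cytosine methylation level for one context (assumed
    aggregation), or None when too few cytosines qualify."""
    levels = [
        s["meth"] / (s["meth"] + s["unmeth"])
        for s in sites
        if s["context"] == context and s["meth"] + s["unmeth"] >= min_depth
    ]
    if not levels or len(levels) < min_sites:
        return None
    return mean(levels)


sites = parse_report(REPORT)
for context in ("CG", "CHG", "CHH"):
    print(context, region_level(sites, context))
```

With only five rows, no context reaches five qualified cytosines, so every level is reported as missing. Lowering `min_sites` in the call shows the per-context averages for this toy input.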

An Example of mtable
• Rows: Ensembl Gene ID
• Columns: methylation level of gene body and promoter regions according to
three cytosine methylation contexts
• Marked entries: less than five qualified observed cytosines

Architecture of DocMethyl and EpiMOLAS_web

DocMethyl
• Docker
• Galaxy
• Layers: Infrastructure -> Operating System -> Docker Daemon -> Galaxy platform
• Tools in the Galaxy workflow: TrimGalore, FastQC, Bismark, EpiMolas.jar
• Input: Raw data, Reference Genome, Gene Annotation
• DocMethyl output: QC Report, Trimmed Data, Methylation Report, mtable

A Workflow In DocMethyl
• Trim Sequences (Trim Galore)
• Check QC of Trimmed reads (FastQC)
• Map Reads on Genome (Bismark)
• Extract Methylated Cytosines (Bismark)
• Generate Output for Submission to EpiMOLAS_web (EpiMolas.jar)
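
For readers who want to see what these steps look like outside Galaxy, the sketch below builds a rough command-line equivalent for single-end reads without running anything. It is not how DocMethyl itself invokes the tools. The output file names are assumptions based on the usual naming of Trim Galore and of Bismark with Bowtie 2, and the EpiMolas.jar step is left out because the slides do not give its invocation.

```python
from pathlib import Path


def build_commands(reads, genome_dir, out_dir):
    """Return the command lines for the first four steps as lists of
    arguments; nothing is executed here."""
    reads, out_dir = Path(reads), Path(out_dir)

    # Strip the usual FASTQ suffixes to get the sample name.
    name = reads.name
    for suffix in (".gz", ".fastq", ".fq"):
        if name.endswith(suffix):
            name = name[: -len(suffix)]

    # Assumed output names (single-end input, Bismark run with Bowtie 2).
    trimmed = out_dir / f"{name}_trimmed.fq.gz"
    bam = out_dir / f"{name}_trimmed_bismark_bt2.bam"

    return [
        # 1. Trim sequences
        ["trim_galore", "-o", str(out_dir), str(reads)],
        # 2. Check QC of trimmed reads
        ["fastqc", "-o", str(out_dir), str(trimmed)],
        # 3. Map reads on genome
        ["bismark", "--genome", str(genome_dir), "-o", str(out_dir),
         str(trimmed)],
        # 4. Extract methylated cytosines in all contexts and write the
        #    genome-wide cytosine report
        ["bismark_methylation_extractor", "--bedGraph", "--CX",
         "--cytosine_report", "--genome_folder", str(genome_dir),
         "-o", str(out_dir), str(bam)],
        # 5. EpiMolas.jar: invocation not given in the slides, omitted.
    ]


for command in build_commands("sample.fastq.gz", "genome", "results"):
    print(" ".join(command))
```

Building the commands as argument lists keeps the sketch testable and makes it easy to pass each one to `subprocess.run` later.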

Steps and Output Files of the Workflow

Modules Inside EpiMOLAS_web
• Full text Search
• DMGs (select differentially methylated genes)
• mC Threshold
• Import Genelist
• KEGG Global View
• Gene List Analysis
• Generate New Gene List for further Analysis in Built-in Approaches
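
A gene-centric DMG selection followed by a gene-list operation might look like the sketch below. The selection rule (absolute difference in gene-level methylation between two samples), the threshold value, and the gene IDs are all hypothetical. The slides do not describe the rule EpiMOLAS_web actually applies.

```python
def select_dmgs(levels_a, levels_b, min_delta=0.2):
    """Return the genes whose methylation level differs by at least
    min_delta between two samples. Genes with a missing level (too few
    qualified cytosines) in either sample are skipped."""
    selected = set()
    for gene in levels_a.keys() & levels_b.keys():
        a, b = levels_a[gene], levels_b[gene]
        if a is None or b is None:
            continue
        if abs(a - b) >= min_delta:
            selected.add(gene)
    return selected


# Hypothetical gene-level methylation values for one context.
control = {"GENE_A": 0.80, "GENE_B": 0.10, "GENE_C": None, "GENE_D": 0.55}
treated = {"GENE_A": 0.30, "GENE_B": 0.15, "GENE_C": 0.90, "GENE_D": 0.95}

dmgs = select_dmgs(control, treated)

# An imported gene list can be combined with the DMG list, which is the
# kind of operation a Venn analysis on gene lists visualises.
imported = {"GENE_D", "GENE_E"}
new_list = dmgs & imported

print(sorted(dmgs))
print(sorted(new_list))
```

Representing each gene list as a set keeps intersections, unions, and differences to one operator each, so a new list for further analysis is just another set.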

Visualization Modules
• Boxplot
• Circos plot
• Heatmap
• Protein network

Discussion
• Significant DMGs are hard to find with gene-level (DMG) approaches:
averaging over a long gene dilutes the effect of local changes in DNA
methylation.
• Approximately 80% of all CpGs are located in repetitive sequences
and centromeric repeat regions of chromosomes, and are heavily
methylated.
• We compare several platforms and tools for genome-wide DNA
methylation analysis.
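
The first point, on long genes, can be made concrete with a little arithmetic. The sketch assumes the gene-level value is a plain mean over the gene's cytosines, and all numbers are hypothetical.

```python
def gene_level_shift(n_cytosines, n_changed, before, after):
    """Change in the mean methylation level of a gene when only
    n_changed of its n_cytosines move from `before` to `after`."""
    return (after - before) * n_changed / n_cytosines


# The same local change (50 cytosines going from 0.1 to 0.9) seen
# through a short gene and through a long gene.
short_gene = gene_level_shift(100, 50, 0.1, 0.9)
long_gene = gene_level_shift(2000, 50, 0.1, 0.9)

print(f"short gene: {short_gene:.2f}")
print(f"long gene:  {long_gene:.2f}")
```

The identical local change moves the short gene's average by about 0.40 but the long gene's by only about 0.02, which is easily lost below any practical threshold.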

Comparison of each platform

| Feature | EpiMOLAS | BAT | ENCODE-WGBS | snakePipe | NGI-MethylSeq | Mint | RnBeads 2.0 | MethylPipe | MethylSig | Methylkit |
|---|---|---|---|---|---|---|---|---|---|---|
| Environment | Docker, Galaxy, Web server | Docker | Shell script | Bioconda, Snakemake | Docker, Nextflow | Galaxy | R package | R package | R package | R package |
| Sequence context | CG, CHG, CHH | CG | CG, CHG, CHH | CG, CHG, CHH | CG, CHG, CHH | CG | CG | CG, CHG, CHH | CG, CHG, CHH | CG, CHG, CHH |
| Start with | raw reads | raw reads | raw reads | raw reads | raw reads | raw reads | Methyl. Call file | Methyl. Call file | Methyl. Call file | Methyl. Call file |
| Docker Container | + | + | – | – | + | – | NA | NA | NA | NA |
| Web interface | + (Galaxy) | – | + | – | – | + (Galaxy) | NA | NA | NA | NA |
| Adapter and base quality trimming | + | – | + | + | + | + | – | NA | NA | NA |
| QC report | + | + | + | + | + | + | + | NA | NA | NA |
| Read mapping | + | + | + | + | + | + | – | NA | NA | NA |
| Methylation sites calling | + | + | + | + | + | + | – | NA | NA | NA |

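| Feature | EpiMOLAS | BAT | ENCODE-WGBS | snakePipe | NGI-MethylSeq | Mint | RnBeads 2.0 | MethylPipe | MethylSig | Methylkit |
|---|---|---|---|---|---|---|---|---|---|---|
| Descriptive statistics | + | + | – | + | + | + | + | + | + | + |
| Find DMRs | + (simple) | + (metilene) | – | + (metilene) | – | + (DSS) | + | + | + | + |
| Clustering analysis | + (heatmap) | + (heatmap) | – | + (heatmap) | – | – | + (heatmap) | – | + (heatmap) | – |
| GO term enrichment | + | – | – | – | – | + | + | + | – | – |
| KEGG pathway enrichment | + | – | – | – | – | – | – | – | – | – |
| TFBS enrichment | – | – | – | – | – | – | – | – | + | – |
| Genome-wide visualization | + (circos plot) | + (circos plot) | – | – | – | – | + | – | – | – |
| Interactive quantitative analysis | + | – | – | – | – | – | + (R Shiny) | NA | NA | NA |
| Data browsing and retrieval UI | + | – | – | – | – | – | + (R Shiny) | NA | NA | NA |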

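| Feature | EpiMOLAS | BAT | ENCODE-WGBS | snakePipe | NGI-MethylSeq | Mint | RnBeads 2.0 | MethylPipe | MethylSig | Methylkit |
|---|---|---|---|---|---|---|---|---|---|---|
| Gene list with tracking logs | + | – | – | – | – | – | NA | NA | NA | NA |
| Venn analysis on gene lists | + | – | – | – | – | – | NA | – | – | – |
| Interplay with other high throughput data | protein interactome, transcriptome | – | | RNA-seq, ChIP-seq, ATAC-seq, Hi-C etc. | – | 5-hmC | – | RNA-seq, ChIP-seq, DNase-seq | – | – |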

Conclusion
• We present an integrated two-phase web-based ‘gene-centric’
framework for WGBS data from raw data processing to downstream
analysis.
• EpiMOLAS helps users handle their WGBS data and eases the burden of
conducting reproducible analyses of public datasets.

https://hub.docker.com/r/lsbnb/docmethyl/

http://symbiosis.iis.sinica.edu.tw/epimolas/

Thank you for your attention!
Photo by KageHuang/Getty Images
