BiPday 2014 -- Tulipano Angelica

CNR – Istituto di Tecnologie Biomediche di Bari
nc-aReNA: an integrated resource for small non-coding RNA functional annotation
Angelica Tulipano

The RNA World
Small non-coding RNAs (sncRNAs) serve as regulatory molecules in a number of different organisms
• MicroRNA (miRNA): post-transcriptional regulators of gene expression
• PIWI-interacting RNA (piRNA): germline transposon silencing
• Small interfering RNA (siRNA): active molecules in RNA interference
• Small nuclear RNA (snRNA): includes spliceosomal RNAs
• Small nucleolar RNA (snoRNA): involved in rRNA modification
• Long non-coding RNA (lncRNA): little is known about them; involved in mRNA regulation
BiP-Day, 19 December 2014

The RNA World
Why non-coding RNA?
• How many ncRNA genes are there?
• How important are they?
• Which functions does a cell delegate to RNA instead of protein, and why?
To address these questions, new systematic gene-discovery approaches specifically aimed at ncRNA discovery need to be developed.
High-throughput sequencing technologies enable such research.
The new spectrum of NGS applications, together with the massive amount of data, requires focused investment and the development of bioinformatics tools for managing and analysing such large, complex datasets to infer biological meaning.

The ITB nc-aReNA Platform
A bioinformatics pipeline to classify and analyze small non-coding RNAs (sncRNAs) in deep RNA sequencing
• Identification and classification of reads in known functional ncRNA classes and dataset export
• Identification and filtering of reads mapping to ribosomal RNAs and mtDNA transcripts
• Quantification of ncRNA expression and differential expression analysis
• Graphical visualization of sample expression profiles in different conditions and at different time courses
• Creation of a collection of unclassified reads, useful for the prediction of novel ncRNAs

Bioinformatics workflow for ncRNA analyses
[Workflow diagram; box labels:]
• Raw Sequencing Data
• Adapter & Barcode identification and removal
• Quality control check
• Size filtering & General Statistics
• Cleaned Sequence Data (fastq, fasta)
• Reads Mapping
• isomiR identification
• Differential expression
• data-warehouse

Sequence Processing:
Adapter & Barcode identification and removal
• detect barcode sequence and separate multiplexed experiments
• trim barcode and 3' adapter fragment
[Diagram of read structure: 5' adapter; small RNA (18-30 bases); barcode (if multiplexed, 4-6 bases); 3' adapter (~20 bases)]
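The two steps above can be sketched in a few lines of Python. This is only an illustration, not the nc-aReNA implementation: it assumes the sequenced read is laid out as small RNA + barcode + 3' adapter, searches for an exact match to the start of the adapter, and uses a placeholder adapter sequence and barcode table (`ADAPTER_3P`, `BARCODES`) that do not come from the slides. The function names are hypothetical.

```python
from typing import Dict, Optional, Tuple

# Placeholder values for illustration only (assumptions, not from the slides).
ADAPTER_3P = "TGGAATTCTCGGGTGCCAAGG"
BARCODES: Dict[str, str] = {"ACGT": "sample_A", "TGCA": "sample_B"}


def trim_read(read: str,
              adapter: str = ADAPTER_3P,
              barcodes: Optional[Dict[str, str]] = None,
              min_adapter: int = 8) -> Optional[Tuple[Optional[str], str]]:
    """Return (sample, small_rna), or None if the read cannot be processed.

    Assumed layout: small RNA + barcode (if multiplexed) + 3' adapter.
    Only an exact match to the first `min_adapter` adapter bases is searched.
    """
    pos = read.find(adapter[:min_adapter])
    if pos == -1:
        return None                      # adapter not found
    insert = read[:pos]
    if not barcodes:
        return (None, insert)            # run was not multiplexed
    for barcode, sample in barcodes.items():
        if insert.endswith(barcode):
            return (sample, insert[:-len(barcode)])
    return None                          # barcode not recognised


def passes_size_filter(small_rna: str, min_len: int = 18, max_len: int = 30) -> bool:
    """Keep inserts in the small RNA size range given on the slide (18-30 bases)."""
    return min_len <= len(small_rna) <= max_len


# Usage: an arbitrary 22-base insert followed by a barcode and the adapter.
example = "TAGCTTATCAGACTGATGTTGA" + "ACGT" + ADAPTER_3P
sample, small_rna = trim_read(example, barcodes=BARCODES)
```

A production tool would also tolerate sequencing errors in the adapter and barcode; exact matching is used here only to keep the sketch short.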

Sequence Processing
QC check and general statistics
FastQC: A Quality Control tool for High Throughput Sequence Data
http://www.bioinformatics.babraham.ac.uk/projects/fastqc/ by S. Andrews

Reads Mapping:
Bowtie
Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient
alignment of short DNA sequences to the human genome. Genome Biol 10:R25.
[Diagram: reads mapping workflow]
• Mapping on ncRNADB reference: mapped (classified ncRNA) / unmapped
• Mapping on reference genome: mapped (annotated: classified ncRNA / not annotated) / unmapped
• Outputs: export of cleaned sequences, data-warehouse, miRNA precursors, isomiR identification
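One possible reading of the mapping diagram, written as a decision function. This is a sketch under assumptions: the slide does not state the exact order of the branches, and the function name, arguments and returned labels are hypothetical. The inputs stand for results that an aligner such as Bowtie would produce; no aligner is called here.

```python
from typing import Dict, Set


def classify_read(read_id: str,
                  ncrnadb_hits: Dict[str, str],
                  genome_hits: Set[str],
                  genome_annotation: Dict[str, str]) -> str:
    """Assign a read to one outcome of a two-stage mapping.

    ncrnadb_hits:      read id -> ncRNA class, for reads mapped on the ncRNA reference
    genome_hits:       ids of reads mapped on the reference genome
    genome_annotation: read id -> ncRNA class, for genome hits overlapping an annotation
    """
    if read_id in ncrnadb_hits:
        return "classified:" + ncrnadb_hits[read_id]
    if read_id not in genome_hits:
        return "unmapped"
    if read_id in genome_annotation:
        return "classified:" + genome_annotation[read_id]
    return "mapped_not_annotated"   # candidate for novel ncRNA prediction


# Usage with toy data.
ncrnadb = {"r1": "miRNA_primary_transcript"}
genome = {"r2", "r3"}
annotation = {"r2": "snoRNA"}
labels = {r: classify_read(r, ncrnadb, genome, annotation) for r in ("r1", "r2", "r3", "r4")}
```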

The ncRNA Reference Database
[Diagram: data sources (GENCODE, ribosomal RNAs, ncRNAs); redundancy identification & removal; reference database]
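A minimal sketch of a redundancy step, assuming that "redundancy" means entries with identical sequences arriving from different data sources. The slide does not say how redundancy is defined in nc-aReNA, so both the criterion and the function name are assumptions.

```python
from typing import Dict, Iterable, List, Tuple


def remove_redundancy(records: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Collapse identical sequences; map each unique sequence to all of its ids."""
    unique: Dict[str, List[str]] = {}
    for seq_id, seq in records:
        key = seq.upper().replace("U", "T")   # treat RNA and DNA spelling alike
        unique.setdefault(key, []).append(seq_id)
    return unique


# Usage: two sources carrying the same sequence under different ids (toy data).
records = [
    ("sourceA|x1", "AGCGACAUCUGG"),
    ("sourceB|y7", "agcgacatctgg"),
    ("sourceA|x2", "GGGUCUCUGAU"),
]
collapsed = remove_redundancy(records)
```

Keeping the list of ids per sequence preserves the link back to every source entry, which matters when the same sequence is annotated with different classes.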

Multiple mapping: Multireads
• Reference database contains several classes of ncRNAs
• Some reads map to more than one reference sequence in different classes
• Software dealing with multiple mapping: RSEM
Li and Dewey, RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC Bioinformatics 2011, 12:323
Example of multimapping:
processed_transcript | miRNA_primary_transcript
lincRNA | miRNA_primary_transcript
antisense | miRNA_primary_transcript
lncRNA | miRNA_primary_transcript
rRNA | piRNA
rRNA | lincRNA
lincRNA | snoRNA
snoRNA | piRNA
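Class pairs like the ones listed above can be tallied from a list of (read, class) hits. The sketch below only counts which class combinations occur; it does not resolve multireads statistically as RSEM does. The function name and the toy input are assumptions.

```python
from collections import Counter
from typing import Dict, Iterable, Set, Tuple


def multiread_class_pairs(hits: Iterable[Tuple[str, str]]) -> Counter:
    """Count, per read, each unordered pair of distinct classes the read maps to."""
    classes_by_read: Dict[str, Set[str]] = {}
    for read_id, ncrna_class in hits:
        classes_by_read.setdefault(read_id, set()).add(ncrna_class)

    pairs: Counter = Counter()
    for classes in classes_by_read.values():
        ordered = sorted(classes)
        for i, first in enumerate(ordered):
            for second in ordered[i + 1:]:
                pairs[(first, second)] += 1
    return pairs


# Usage with toy hits: r1 and r3 are multireads across classes, r2 is not.
hits = [
    ("r1", "rRNA"), ("r1", "piRNA"),
    ("r2", "snoRNA"),
    ("r3", "lincRNA"), ("r3", "snoRNA"), ("r3", "lincRNA"),
]
pair_counts = multiread_class_pairs(hits)
```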

IsomiRs
IsomiRs are defined as variations (isoforms) of a mature microRNA
These variants were originally dismissed as experimental artifacts
IsomiRs have been shown to be actively associated with the RISC and the mRNA translation machinery
IsomiRs are real physiological miRNA variants

IsomiR Variants
• "Incorrect" or alternative Dicer cleavage: length variations
• RNA editing: nucleotide additions, sequence variants
Pre-miR sequence:
AAUCAGCAGCGACAUCUGGCUACUGGGUCUCUGAU
Mature miRNA sequence (canonical miRNA):
AGCGACAUCUGGCUACUGGGU
5' IsomiRs:
GCGACAUCUGGCUACUGGGU
AAGCGACAUCUGGCUACUGGGU
GCAGCGACAUCUGGCUACUGGGU
3' IsomiRs:
AGCGACAUCUGGCUACUGGG
AGCGACAUCUGGCUACUGGGUU
AGCGACAUCUGGCUACUGGGUCU
Polymorphic IsomiRs:
AGCGACAGCUGGCUACUGGGU
AGCUACAUCUGGCUACUGGGU
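The three variant types on this slide can be told apart with simple string comparisons against the canonical sequence. The sketch below is a deliberately simplified illustration and is not how isomiRID works: it treats an equal-length variant with few mismatches as polymorphic, a variant sharing the 3' end as a 5' isomiR, and a variant sharing the 5' end as a 3' isomiR. Variants that change at both ends fall into "other". The function name and the `max_mismatches` threshold are assumptions.

```python
def classify_isomir(variant: str, canonical: str, max_mismatches: int = 2) -> str:
    """Classify a read relative to the canonical mature miRNA (simplified rules)."""
    if variant == canonical:
        return "canonical"
    if len(variant) == len(canonical):
        mismatches = sum(a != b for a, b in zip(variant, canonical))
        return "polymorphic" if mismatches <= max_mismatches else "other"
    shorter, longer = sorted((variant, canonical), key=len)
    if longer.endswith(shorter):
        return "5' isomiR"    # 3' end shared, 5' end trimmed or extended
    if longer.startswith(shorter):
        return "3' isomiR"    # 5' end shared, 3' end trimmed or extended
    return "other"


# Usage with the sequences shown on the slide.
CANONICAL = "AGCGACAUCUGGCUACUGGGU"
variants = [
    "GCGACAUCUGGCUACUGGGU",
    "GCAGCGACAUCUGGCUACUGGGU",
    "AGCGACAUCUGGCUACUGGG",
    "AGCGACAUCUGGCUACUGGGUCU",
    "AGCGACAGCUGGCUACUGGGU",
]
calls = [classify_isomir(v, CANONICAL) for v in variants]
```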

IsomiR identification: isomiRID
de Oliveira LF, Christoff AP, Margis R. isomiRID: a framework to identify microRNA isoforms, Bioinformatics. 2013 Oct 15;29(20):2521-3

Differential expression analysis
Comparison of ncRNA counts in different experimental
conditions:
- sample vs. control
- time course samples
Statistical test:
- Fisher test (no biological replicates)
- T-test or Wilcoxon test (with biological replicates)
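For the case without biological replicates, a two-sided Fisher's exact test on a 2x2 table of read counts can be written with the standard library alone. This is a sketch, not the nc-aReNA code: the table layout (reads of one ncRNA versus all other reads, in sample versus control) and the function names are assumptions. Log-gamma is used so that large read counts do not overflow.

```python
from math import exp, lgamma


def _log_table_prob(a: int, b: int, c: int, d: int) -> float:
    """Log hypergeometric probability of the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return (lgamma(a + b + 1) + lgamma(c + d + 1)
            + lgamma(a + c + 1) + lgamma(b + d + 1)
            - lgamma(n + 1)
            - lgamma(a + 1) - lgamma(b + 1) - lgamma(c + 1) - lgamma(d + 1))


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided p-value: sum over tables at most as probable as the observed one.

    a: reads of the ncRNA in the sample     b: all other reads in the sample
    c: reads of the ncRNA in the control    d: all other reads in the control
    """
    row1, row2, col1 = a + b, c + d, a + c
    log_observed = _log_table_prob(a, b, c, d)
    p_value = 0.0
    # Enumerate every table with the same row and column totals.
    for x in range(max(0, col1 - row2), min(row1, col1) + 1):
        log_p = _log_table_prob(x, row1 - x, col1 - x, row2 - col1 + x)
        if log_p <= log_observed + 1e-7:   # tolerance for floating-point ties
            p_value += exp(log_p)
    return min(p_value, 1.0)


# Usage on a small toy table.
p_small = fisher_exact_two_sided(3, 1, 1, 3)
```

The loop runs once per possible table, so its cost grows with the smaller of the first row and first column totals; with replicates, a t-test or Wilcoxon test on per-sample counts would be used instead, as the slide notes.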

nc-aReNA pipeline design

Test Case 1: ncRNA classification
ncRNA counts by category:
• rRNA: 47.1%
• miRNA_primary_transcript: 46.2%
• piRNA: 2.9%
• tRNA: 2.3%
• snoRNA: 0.7%
• lncRNA: 0.2%
• lincRNA: 0.2%

Test Case 2: DE analysis
Characterization of miRNA expression profiles
in Mus musculus time course dataset

Test Case 2
Metadata
• Mus musculus
• Muscle & Skin
• T0: Mock
• T1: 3 hrs
• T2: 24 hrs
[Diagram labels: small RNA-Seq; Bioinformatics Analyses; ncRNA expression profile; data-warehouse; TarBase; Gene Ontology]

ncRNA research activities
• Immune response in mouse
• Plant-viroid interactions in peach tree, grapevine and tobacco
• Multidrug resistance in dog cell lines
• miRNA-driven methylation profile in Arabidopsis
• miRNA expression profile in Amyotrophic Lateral Sclerosis and interaction with biomarkers of clinical features
Collaborations
CNR – IVV, Bari
• Francesco Di Serio
• Beatriz Navarro
• Livia Stavolone
• Fabrizio Cillo
University of Bari
Dept. of Pharmacy
• Nicola Colabufo
• Antonio Carrieri
CSIC – UPV, Valencia, Spain
• Ricardo Flores
Work in progress:
• Web portal
• Statistical analysis of isomiR variants
• Prediction of novel miRNAs

The Team
Consiglio Arianna
De Caro Giorgio
D’Elia Domenica
Gisel Andreas
Grillo Giorgio
Licciulli Flavio
Liuni Sabino
Losito Nicola
Tulipano Angelica

Thanks for your attention!
