Apolo Taller en BIOS

Editando anotaciones con Apollo 
Un taller para la comunidad científica reunida en BIOS
Monica Munoz-Torres, PhD | @monimunozto 
Berkeley Bioinformatics Open-Source Projects (BBOP) 
Lawrence Berkeley National Laboratory |  
University of California Berkeley | U.S. Department of Energy
 
BIOS, Manizales, Colombia | 21 Septiembre, 2015

APOLLO DEVELOPMENT
APOLLO DEVELOPERS 2
h" p://G e nom e Ar c hite c t. or g /

Nathan Dunn
Eric Yao
JBrowse, UC Berkeley
Christine Elsik’s Lab,
University of Missouri
Suzi Lewis
Principal Investigator
BBOP

Moni Munoz-Torres
Stephen Ficklin
GenSAS,
Washington State University
Colin DieshDeepak Unni

OUTLINE 
Web
Apollo
Collabora(ve
Cura(on
and

Interac(ve
Analysis
of
Genomes

3OUTLINE
•  Hoy
descubriremos
cómo

sortear
obstáculos
para

extraer
la
información
más

valiosa
en
un
proyectos
de

secuenciación
&
anotación

de
genomas.

4
BY THE END OF THIS TALK 
you will 
v BeAer
understand
genome
cura(on
in
the
context
of
annota(on:

assembled
genome
à
automated
annota=on
à
manual
annota=on

v Become
familiar
with
the
environment
and
func(onality
of
the
Apollo

genome
annota(on
edi(ng
tool.

v Learn
to
iden(fy
homologs
of
known
genes
of
interest
in
a
newly

sequenced
genome.

v Learn
about
corrobora(ng
and
modifying
automa(cally
annotated
gene

models
using
available
evidence
in
Apollo.

Introduction

¿Cómo
se
traza

el
mapa
de
un
genoma?

6
El mapa del genoma
Introduction
Diseño & muestreo
Análisis comparativos
Colección
consenso
de genes
Anotación
manual
Anotación
automatizada
Secuenciación Ensamblaje
Síntesis &
publicación

7
El mapa del genoma
Introduction
Diseño & muestreo
Análisis comparativos
Colección
consenso
de genes
Anotación
manual
Anotación
automatizada
Secuenciación Ensamblaje
Síntesis &
publicación
QC
QC
QC
QC
QCQC
QC

CURATING GENOMES 
steps involved
1  Genera=on
of
Gene
Models

calling
ORFs,
one
or
more

rounds
of
gene
predic(on,

etc.

2  Annota=on
of
gene
models

Describing
func(on,

expression
paAerns,

metabolic
network

memberships.

3  Manual
annota=on

CURATING GENOMES 8

ANOTACION DE GENOMAS 
requiere precisión y profundidad
Anotando Genomas 9
La
colección
de
genes
de
cada
organismo
informa
una
variedad
de
análisis:

•  Número
de
genes,
%
GC,
composición
de
TEs,
áreas
repe((vas

•  Asignar
función

•  Evolución
molecular,
conservación
de
secuencias

•  Familias
de
genes

•  Caminos
metabólicos

•  ¿Qué
hace
único
a
cada
organismo?

¿Qué
hace
“abeja”
a
una
abeja?

Marbach et al. 2011. Nature Methods | Shutterstock.com | Alexander Wild

Refresquemos
nuestra

memoria.

REVIEW ON YOUR OWN 
for manual annotation
To
remember…
Biological
concepts
to
beAer

understand
manual
annota(on

11FOOD FOR THOUGHT
•  GLOSSARY

from
con1g
to
splice
site

•  CENTRAL
DOGMA

in
molecular
biology

•  WHAT
IS
A
GENE?

deﬁning
your
goal

•  TRANSCRIPTION

mRNA
in
detail

•  TRANSLATION

and
other
deﬁni(ons

•  GENOME
CURATION

steps
involved

12BIO-REFRESHER
What is a gene?
v  The
deﬁni(on
of
a
gene
paints
a
very
complex
picture
of
molecular
ac(vity

and
it
is
a
con(nuously
evolving
concept.

•  From
the
Sequence
Ontology
(SO):

“A
gene
is
a
locatable
region
of
genomic
sequence,
corresponding
to
a
unit

of
inheritance,
which
is
associated
with
regulatory
regions,
transcribed

regions
and/or
other
func(onal
sequence
regions”.

“Evolving
Concept”
at
hAp://goo.gl/LpsajQ

13BIO-REFRESHER
What is a gene?
v  In
our
life(me,
the
Encyclopedia
of
DNA
Elements
(ENCODE)
project

updated
this
concept
yet
again.
Long
transcripts
&
dispersed
regula1on!

“A
gene
is
a
DNA
segment
that
contributes
phenotype/func(on.
In
the
absence
of

demonstrated
func(on,
a
gene
may
be
characterized
by
sequence,
transcrip(on
or

homology.”

https://www.encodeproject.org/

14BIO-REFRESHER
What is a gene? 
let’s think computationally!
v  Think
of
the
genome
as
an
operating system for
a
living
being

•  Considering
that
the
nucleo(des
of
the
genome
are
put
together
into
a

code
that
is
executed
through
the
process
of
transcription
and

translation…
•  …
think
of
genes
as
subroutines
that
are
repe((vely
called
in
the

process
of
transcription
Gerstein et al., 2007. Genome Res.

15BIO-REFRESHER
What is a gene? 
considerations
v  Also
consider
:

•  A
gene
is
a
genomic
sequence
(DNA
or
RNA)
directly
encoding

func(onal
product
molecules,
either
RNA
or
protein.

•  If
several
func(onal
products
share
overlapping
regions,
we
take
the

union
of
all
overlapping
genomics
sequences
coding
for
them.

•  This
union
must
be
coherent
–
i.e.,
processed
separately
for
ﬁnal

protein
and
RNA
products
–
but
does
not
require
that
all
products

necessarily
share
a
common
subsequence.
Gerstein et al., 2007. Genome Res.

16BIO-REFRESHER
“El
gen
es
la
unión
de

secuencias
genómicas

que
codiﬁcan
una

colección
coherente

de
productos

funcionales
que

pueden
o
no

superponerse.”

Gerstein et al., 2007. Genome Res
El
Gen:
un
blanco
en
movimiento.

¿QUÉ ES UN GEN?

17BIO-REFRESHER
TRANSLATION 
reading frame
v  Reading
frame
is
a
manner
of
dividing
the
sequence
of
nucleo(des
in
mRNA

(or
DNA)
into
a
set
of
consecu(ve,
non-‐overlapping
triplets
(codons).

v  Three
frames
can
be
read
in
the
5’
à
3’
direc(on.
Given
that
DNA
has
two

an(-‐parallel
strands,
an
addi(onal
three
frames
are
possible
to
be
read
on

the
an(-‐sense
strand.
Six
total
possible
reading
frames
exist.

v  In
eukaryotes,
only
one
reading
frame
per
sec(on
of
DNA
is
biologically

relevant
at
a
(me:
it
has
the
poten(al
to
be
transcribed
into
RNA
and

translated
into
protein.
This
is
called
the
OPEN
READING
FRAME
(ORF)

•  ORF
=
Start
signal
+
coding
sequence
(divisible
by
3)
+
Stop
signal

v  The
sec(ons
of
the
mature
mRNA
transcribed
with
the
coding
sequence
but

not
translated
are
called
UnTranslated
Regions
(UTR);
one
at
each
end.

18
"Reading Frame" by Hornung Ákos - Wikimedia Commons
BIO-REFRESHER
TRANSLATION 
reading frame

19
"ORF" by Thatsonginc - Wikimedia Commons
BIO-REFRESHER
TRANSLATION 
reading frame

20BIO-REFRESHER
TRANSLATION 
reading frame: splice sites
v  The
spliceosome
catalyzes
the
removal
of
introns
and
the
liga(on
of
ﬂanking

exons.

•  introns:
spaces
inside
the
gene,
not
part
of
the
coding
sequence

•  exons:
expression
units
(of
the
coding
sequence)

v  Splicing
“signals”
(from
the
point
of
view
of
an
intron):

•  There
is
a
5’
end
splice
“signal”
(site):
usually
GT
(less
common:
GC)

•  And
a
3’
end
splice
site:
usually
AG

•  …]5’-‐GT/AG-‐3’[…

v  It
is
possible
to
produce
more
than
one
protein
(polypep(de)
sequence
from

the
same
genic
region,
by
alterna(vely
bringing
exons
together=
alterna=ve

splicing.
For
example,
the
gene
Dscam
(Drosophila)
has
38,000
alterna(vely

spliced
mRNAs
=
isoforms

21
"Gene structure" by Daycd- Wikimedia Commons
BIO-REFRESHER
TRANSLATION 
now in your mind
•  Although
of
brief
existence,
understanding
mRNAs
is
crucial,

as
they
will
become
the
center
of
your
work.

22
Text for figures goes here
BIO-REFRESHER
TRANSLATION 
reading frame: phase
v  Introns
can
interrupt
the
reading
frame
of
a
gene
by
inser(ng
a
sequence

between
two
consecu(ve
codons

v  Between
the
ﬁrst
and
second
nucleo(de
of
a
codon

v  Or
between
the
second
and
third
nucleo(de
of
a
codon

"Exon and Intron classes”. Licensed under Fair use via Wikipedia

23
"Protein synthesis" by Kelvinsong - Wikimedia Commons
CURATING GENOMES
TRANSLATION 
in detail

24BIO-REFRESHER
HICCUPS 
in transcription and translation
v  The
presence
of
premature
Stop
codons
in
the
message
is
possible.
A

process
called
non-‐sense
mediated
decay
checks
for
them
and
corrects

them
to
avoid:
incomplete
splicing,
DNA
muta(ons,
transcrip(on
errors,
and

leaky
scanning
of
ribosome
–
causing
changes
in
the
reading
frame
(frame

shiYs).

v  Inser(ons
and
dele(ons
(indels)
can
cause
frame
shios,
when
indel
is
not

divisible
by
three
(3).
As
a
result,
the
pep(de
can
be
abnormally
long,
or

abnormally
short
–
depending
when
the
ﬁrst
in-‐frame
Stop
signal
is
located.

Predicción
&
Anotación

26Gene Prediction
GENE PREDICTION
v  The
iden(ﬁca(on
of
structural
features
of
the
genome:

•  Primarily
focused
on
protein-‐coding
genes.

•  Predicts
also
transfer
RNAs
(tRNA),
ribosomal
RNAs
(rRNA),

regulatory
mo(fs,
long
and
small
non-‐coding
RNAs
(ncRNA),

repe((ve
elements
(masked),
etc.

•  Two
methods
for
iden(ﬁca(on.

•  Some
are
self-‐trained
and
some
must
be
trained.

27Gene Prediction
GENE PREDICTION 
methods for discovery
1)
Ab
ini,o:

-‐
based
on
DNA
composi(on,

-‐
deals
strictly
with
genomic

sequences

-‐
makes
use
of
sta(s(cal

approaches
to
search
for
coding

regions
and
typical
gene
signals.

•  E.g.
Augustus,
GENSCAN,

geneid,
fgenesh,
etc.

3’

Nat Rev Genet. 2015 Jun;16(6):321-32. doi: 10.1038/nrg3920

28
Nucleic Acids 2003 vol. 31 no. 13 3738-3741
Gene Prediction
GENE PREDICTION 
methods for discovery (ctd)
2)
Homology-‐based:

-‐
evidence-‐based,

-‐
ﬁnds
genes
using
either
similarity
searches
in
the
main
databases
or

experimental
data
including
RNAseq,
expressed
sequence
tags
(ESTs),
full-‐length

complementary
DNAs
(cDNAs),
etc.

•  E.g:
fgenesh++,
Just
Annotate
My
genome
(JAMg),
SGP2

29
GENE ANNOTATION
Integra(on
of
data
from
computa(onal
&
experimental
evidence
with
data

from
predic(on
tools,
to
generate
a
reliable
set
of
structural
annota=ons.

Involves:

1)
ab
ini1o
predic(ons

2)
assessment
of
biological
evidence
to
drive
the
gene
predic(on
process

3)
synthesis
of
these
results
to
produce
a
set
of
consensus
gene
models

Gene Annotation

30
In
some
cases
algorithms
and
metrics
used
to
generate

consensus
sets
may
actually
reduce
the
accuracy
of
the
gene’s

representa(on.

GENE ANNOTATION
Gene
models
may
be
organized
into
“sets”
using:

v  automa(c
integra(on
of
predicted
sets
(combiners);
e.g:
GLEAN,

EvidenceModeler

or

v  tools
packaged
into
pipelines;
e.g:
MAKER,
PASA,
Gnomon,

Ensembl,
etc.

Gene Annotation

ANOTACION 
un arte imperfecto
No one is perfect, least of all automated annotation. 31
Nuevas
tecnologías
traen
nuevos
retos:

•  Errores
en
el
ensamblaje
pueden
causar

fragmentación
en
las
anotaciones

•  Cobertura
limitada
diﬁculta
la

iden(ﬁcación
con
certeza

Image: www.BroadInstitute.org

ANOTACION MANUAL 
mejorando predicciones
Schiex
et
al.
Nucleic
Acids
2003
(31)
13:
3738-‐3741

Predicciones Automatizadas
Evidencia Experimental
Manual Annotation – to the rescue. 32
cDNAs,
búsquedas
con
HMM,
RNAseq,

genes
de
otras
especies.

Entonces,
es
necesario
reﬁnar
las

predicciones
de
elementos
biológicos

codiﬁcados
en
el
genoma,
lo
que
requiere

una
cuidadosa
revisión.

33
BIOCURACION 
ajustes estructurales y funcionales
Iden(ﬁcar
los
elementos
del
genoma

que
mejor
representan
la
biología

subyacente
y
eliminar
los
elementos

que
reﬂejan
errores
sistémicos
de
los

análisis
automa(zados.

Asignar
funciones
a
través
de
análisis

compara(vos
entre
elementos

genómicos
similares
de
organismos

cercanamente
relacionados
usando

literatura,
bases
de
datos,
y
datos

experimentales.

BIOCURACION
hAp://GeneOntology.org

1

2

MANUAL ANNOTATION 34
PERO, EN CURACION 
no siempre era posible ampliar estos esfuerzos
Researchers
on
their
own;

may
or
may
not
publicize

results;
may
be
a
dead-‐end

with
very
few
people
ever

aware
of
these
results.

Elsik
et
al.
2006.
Genome
Res.
16(11):1329-‐33.

Too
many
sequences
and
not
enough
hands.

A
small
group
of
highly

trained
experts
(e.g.
GO).

1
Museum

A
few
very
good
biologists,
a

few
very
good
bioinforma(cians

camping
together
for
intense
but

short
periods
of
(me.

Jamboree
2

Co"age
3

ANOTACION 
un ejercicio en colaboración
COLABORANDO 35
Los
inves1gadores
usualmente
buscamos
las

opiniones
y
percepciones
de
colegas
con

experiencia
en
áreas
especíﬁcas
del

conocimiento.

Por
ejemplo,
dominios
conservados

o
familias
de
genes.

Apollo 
una herramienta para editar anotaciones
36
v  En
la
web,
integrado
con
JBrowse.

v  ¡Permite
la
colaboración
en
(empo
real!

v  Automá(camente
genera
datos
en

formatos
comunes
para
análisis.

v  Anotación
manual
de
genes,
pseudogenes,
tRNAs,

snRNAs,
snoRNAs,
ncRNAs,
miRNAs,
TEs,
y
fragmentos
repe((vos.

v  Funciones
intui(vas
y
menús
desplegables
crean
y
editan
estructuras

de
transcritos
y
exones,
insertan
comentarios
(CV,
texto
libre),
y

términos
de
GO,
etc.

INTRODUCING APOLLO
hAp://GenomeArchitect.org/

ARQUITECTURA 
simple, flexible
ARCHITECTURE 37
Cliente
de
web
+
Motor
de
edición
de
anotaciones
+
Servicio
de
datos
en
el
servidor

REST / JSON
Websockets
Motor de Anotación (Servidor)
Shiro
LDAP
OAuth
Annotations
Security
Preferences
Organisms
Tracks
BAM
BED
VCF
GFF3
BigWig
Curadores
Google Web Toolkit (GWT) /
Bootstrap
JBrowse DOJO / jQuery Datos a JBrowse
Organismo 1
Carga de datos con
evidencia genómica
para cada organismo
Servicio único de almacenamiento
PostgreSQL, MySQL, MongoDB,
ElasticSearch
Apollo v2.0
Datos a JBrowse
Organismo 2

CLIENTE DE WEB 
panel del curador
ARCHITECTURE 38
Curadores
Google Web Toolkit (GWT) / Bootstrap
JBrowse
DOJO /
jQuery
Apollo v2.0
BAM
BED
VCF
GFF3
BigWig
REST / JSON
Websockets
Usa GWT/Bootstrap en el
frente para proveerle
un comportamiento
versátil a la aplicación.
Panel Del Curador
¡NUEVO!
¡NUEVO!

MOTOR DE ANOTACION 
lógica de edición
ARCHITECTURE 39
Shiro
LDAP
OAuth
Datos a JBrowse
Organismo 2
Datos a JBrowse
Organismo 1
Apollo v2.0
Controladores Grails (J2EE servlet)
llevan las solicitudes al directorio
de datos apropiado para cada
organismo en JBrowse
Carga de datos con evidencia
genómica para cada organismo
¡NUEVO!
Cliente de web
REST / JSON
Websockets

SERVICIO DE DATOS EN EL SERVIDOR 
servicio único de almacenamiento
ARCHITECTURE 40
Anotaciones
Seguridad
Preferencias
Organismos
Pistas de datos
PostgreSQL, MySQL, MongoDB,
ElasticSearch
Un solo servicio de almacenamiento,
consultable, para guardar las
anotaciones. ¡NUEVO!
Apollo v2.0

¡COLABOREMOS! 
Apollo tiene código abierto y es expandible
HIGHLIGHTED IMPROVEMENTS 41
The Genome Sequence Annotation Server (GenSAS)
Annotate
Los
usuarios
pueden
adicionar
programas
para
permi=r
sus
propios
procesos
de

trabajo.

Ejemplos:

•  GenSAS:
plataforma
para

anotación
estructural
del

genoma.

•  i5K:

-‐
Espacio
en
NAL
para
compar(r

ensamblajes
y
conjuntos
de

genes,
y
para
anotación
manual.

-‐
Proyecto
Piloto
>40
genomas:

47
charlas,
9
posters
en

Simposio
de
Genómica
de

Artrópodos.

Annotate
National
Agricultural
Library

We
train
and
support
hundreds
of
geographically
dispersed
scien(sts
from

diverse
research
communi(es
to
conduct
manual
annota(ons,
to
recover

coding
sequences
in
agreement
with
all
available
biological
evidence
using

Apollo.

v  Gate
keeping
and
monitoring.

v  Tutorials,
training
workshops,
and
“geneborees”.

42
DISPERSED COMMUNITIES
collaborative manual annotation efforts
APOLLO

LESSONS LEARNED 
What
we
have
learned:

•  Collabora(ve
work
dis(lls
invaluable
knowledge

•  We
must
enforce
strict
rules
and
formats

•  We
must
evolve
with
the
data

•  A
liAle
training
goes
a
long
way

•  NGS
poses
addi(onal
challenges

LESSONS LEARNED 43

¿Cuál
es
la
tarea
del

curador?

Becoming Acquainted with Web Apollo
45 | 45
GENERAL PROCESS OF CURATION 
main steps to remember
1.  Select
or
find
a
region
of
interest,
e.g.
scaffold.

2.  Select
appropriate
evidence
tracks
to
review
the
gene
model.

3.  Determine
whether
a
feature
in
an
exis(ng
evidence
track

will
provide
a
reasonable
gene
model
to
start
working.

4.  If
necessary,
adjust
the
gene
model.

5.  Check
your
edited
gene
model
for
integrity
and
accuracy
by

comparing
it
with
available
homologs.

6.  Comment
and
finish.

46CURATING GENOMES
WHAT ANNOTATORS SHOULD LOOK FOR 
annotators: that’s you!
v  Annota=ng
a
simple
case:
WHEN
“The
oﬃcial
predic(on
is
correct,
or
nearly

correct,
assuming
that
no
aligned
data
extends
beyond
the
gene
model
and

if
so,
it
is
not
likely
to
be
coding
sequence,
and/or
the
gene
predic(on

matches
what
you
know
about
the
gene”:

a.  Can
you
add
UTRs?

b.  Check
exon
structures.

c.  Check
splice
sites:
…]5’-‐GT/AG-‐3’[…

d.  Check
‘start’
and
‘stop’
sites.

e.  Check
the
predicted
protein
product(s).

f.  If
the
protein
product
s(ll
does
not
look
correct,
go
on
to
“Annota(ng

more
complex
cases”.

47CURATING GENOMES
continued
v  Addi=onal
func=onality.
You
may
also
need
to
learn
how
to:

a.  Get
genomic
sequence

b.  Merge
exons

c.  Add/Delete
an
exon

d.  Create
an
exon
de
novo
(within
an
intron
or
outside
exis(ng

annota(ons).

e.  Right/apple-‐click
on
a
feature
to
get
feature
ID
and
addi(onal

informa(on

f.  Looking
up
homolog
descrip(ons
going
to
the
accession
web
page
at

UniProt/Swissprot

48CURATING GENOMES
continued
v  Annota=ng
more
complex
cases:

a.  Incomplete
annota(on:
protein
integrity
checks,
indicate
gaps,
missing
5’

sequences
or
missing
3’
sequences.

b.  Merge
of
2
gene
predic(ons
on
same
scaffold

c.  Merge
of
2
gene
predic(ons
on
different
scaffolds
(uh-‐oh!).

d.  Split
of
a
gene
predic(on

e.  Frameshios,
Selenocysteine,
single-‐base
errors,
and
other
inconvenient

phenomena

49CURATING GENOMES
continued
v  Adding
important
project
informa=on
in
the
form
of
Canned
and/or

Customized
Comments:

a.  NCBI
ID,
RefSeq
ID,
gene
symbol(s),
common
name(s),
synonyms,
top

BLAST
hits
(GenBank
IDs),
orthologs
with
species
names,
and
anything

else
you
can
think
of,
because
you
are
the
expert.

b.  Type
of
annota(on
(e.g.:
whether
or
not
the
gene
model
was
changed)

c.  Data
source
(for
example
if
the
Fgeneshpp
predicted
gene
was
the

star(ng
point
for
your
annota(on)

d.  The
kinds
of
changes
you
made
to
the
gene
model,
e.g.:
split,
merge

e.  Func(onal
descrip(on

f.  Whether
you
would
like
for
your
MOD
curator
to
check
the
annota(on

g.  Whether
part
of
your
gene
is
on
a
diﬀerent
scaﬀold.

50
TRAINING CURATORS 
a little training goes a long way!
Provided
with
adequate
tools,
wet
lab
scien(sts
make

excep(onal
curators
who
can
easily
learn
to
maximize
the

genera(on
of
accurate,
biologically
supported
gene
models.

APOLLO

Conozcamos
a
Apollo

hAp://genomearchitect.org/web_apollo_user_guide

52
Apollo 
ámbito de edición para anotaciones
BECOMING ACQUAINTED WITH APOLLO
Color
por
marco
de
lectura,

alternar
cadena,
cambiar

esquema
de
color,
resaltador

Cargar
evidencia
experimental

(GFF3,
BAM,
BigWig),
pistas
de

datos
de
combinación
y

búsqueda.

Interrogar
el
genoma

usando
BLAT.

Navegación
y
zoom.

Buscar
un
gen
o
un
grupo

Obtener
coordenadas,
y
hacer

zoom
con
“selección
elás(ca”

Login

Anotaciones

creadas
por
el

usuario

Panel
del

curador

Pistas
de

datos
de

evidencia

Datos

transcriptómicos
de

estadío
y

(po

celular

especíﬁcos.

Instrucciones
54 | 54
APOLLO EN LA WEB 
instrucciones
Email:

nombre.apellido@example.com

Contraseña:

nombreapellido

Email
Contraseña
Servidor
Empezar
en

user.one@example.com
userone
1
1

user.two@example.com
usertwo
2
1

user.three@example.com
userthree
3
1

user.four@example.com
userfour
4
1

user.five@example.com
userfive
5
1

user.six@example.com
usersix
1
7

user.seven@example.com
userseven
2
7

user.eight@example.com
usereight
3
7

user.nine@example.com
usernine
4
7

user.ten@example.com
userten
5
7

user.eleven@example.com
usereleven
1
1

user.twelve@example.com
usertwelve
2
1

user.thirteen@example.com
userthirteen
3
1

user.fourteen@example.com
userfourteen
4
1

user.fioeen@example.com
userfioeen
5
1

user.sixteen@example.com
usersixteen
1
7

user.seventeen@example.com
userseventeen
2
7

user.eighAeen@example.com
usereighteen
3
7

user.nineteen@example.com
usernineteen
4
7

user.twenty@example.com
usertwenty
5
7

user.twentyone@example.com
usertwentyone
1
1

user.twentytwo@example.com
usertwentytwo
2
1

user.twentythree@example.com
usertwentythree
3
1

user.twentyfour@example.com
usertwentyfour
4
1

user.twentyfive@example.com
usertwentyfive
5
1

user.twentysix@example.com
usertwentysix
1
7

user.twentyseven@example.com
usertwentyseven
2
7

user.twentyeight@example.com
usertwentyeight
3
7

user.twentynine@example.com
usertwentynine
4
7

Servidor
URL

1
hAp://54.94.132.228:8080/apollo/annotator/index

2

3

4

5

Funcionalidad
y
navegación.

56
Apollo 
funcionalidad
BECOMING ACQUAINTED WITH APOLLO
•  Agregar:

–  IDs
de
bases
de
datos
públicas
(e.g.

GenBank,
usando
DBXRef);
símbolo(s)

de
cada
gen,
nombre(s)
común(es),

sinónimos,
el
mejor
resultado
de
BLAST,

ortólogos
con
el
nombre
de
la
especie.

–  Asignaciones
de
función
apropiadas

(e.g.
via
datos
de
RNA-‐Seq,
búsquedas

de
literatura,
búsquedas
con
HMMs,

etc.)

–  Comentarios
acerca
de
las

modiﬁcaciones
que
se
realizaron,
o
si

ninguna
fue
necesaria.

–  Y
otras
notas
que
se
le
ocurran
al

biocurador.

•  Corregir
si(os
de
‘Inicio’
y
‘Parada’

•  Arreglar
si(os
de
ayuste
no
canónicos

•  Anotar
UTRs
(e.g.:
usando
RNA-‐Seq)

•  Obtener
&
corregir
predicciones
de

productos
de
proteínas

-‐
Alinearlos
con
genes
o
familias
de

genes
relevantes.

-‐
Usar
blastp
en
RefSeq
o
nr
de
NCBI

•  Revisar
la
falta
de
datos
en
el
ensamble

•  Unir
2
predicciones
de
genes
en
el

mismo
grupo

•  Dividir
una
predicción
de
gen

•  Corregir
desplazamientos
de
la
pauta

de
lectura,
y
otros
errores
en
el

ensamblaje

•  Anotar
selenocisteínas,
errores
de
una

sola
base,
etc.

REMOVABLE SIDE DOCK 
with customizable tabs
Annotations Organism Users Groups AdminTracks
Reference
Sequence

EDITS & EXPORTS 
annotation details, exon boundaries, data export
1 2
Annotations
1
2

Reference
Sequences
3
FASTA

GFF3

EDITS & EXPORTS 
annotation details, exon boundaries, data export
3

60 | 60
Becoming Acquainted with Web Apollo.
USER NAVIGATION
Annotator

panel.

•  Choose appropriate evidence tracks from list on annotator panel.
•  Select & drag elements from evidence track into the ‘User-created Annotations’ area.
•  Hovering over annotation in progress brings up an information pop-up.

61 | 61
USER NAVIGATION
•  Annotation right-click menu

62
Annota(ons,
annota(on
edits,
and
History:
stored
in
a
centralized
database.

62
USER NAVIGATION

63
The
Annota(on
Informa=on
Editor

DBXRefs
are
database
crossed
references:
if
you
have

reason
to
believe
that
this
gene
is
linked
to
a
gene
in
a

public
database
(including
your
own),
then
add
it
here.

63
USER NAVIGATION

64
The
Annota(on
Informa=on
Editor

•  Add
PubMed
IDs

•  Include
GO
terms
as
appropriate

from
any
of
the
three
ontologies

•  Write
comments
sta(ng
how
you

have
validated
each
model.

64
USER NAVIGATION

65 | 65
USER NAVIGATION
•  ‘Zoom
to
base
level’
op(on
reveals
the
DNA
Track.

66 | 66
USER NAVIGATION
•  Color
exons
by
CDS
from
the
‘View’
menu.

67 |
Zoom
in/out
with
keyboard:

shio
+
arrow
keys
up/down

67
USER NAVIGATION
•  Toggle
reference
DNA
sequence
and
transla=on
frames
in
forward

strand.
Toggle
models
in
either
direc(on.

“Simple
case”:

-‐
the
predicted
gene
model
is
correct
or
nearly
correct,
and

-‐
this
model
is
supported
by
evidence
that
completely
or
mostly

agrees
with
the
predic(on.

-‐
evidence
that
extends
beyond
the
predicted
model
is
assumed

to
be
non-‐coding
sequence.

The
following
are
simple
modiﬁca(ons.

70 | 70
ANNOTATING SIMPLE CASES
Becoming Acquainted with Web Apollo. SIMPLE CASES

71 |
•  A
conﬁrma(on
box
will
warn
you
if
the
receiving
transcript
is
not
on
the

same
strand
as
the
feature
where
the
new
exon
originated.

•  Check
‘Start’
and
‘Stop’
signals
aoer
each
edit.

71
ADDING EXONS

If
transcript
alignment
data
are
available
and
extend
beyond
your
original
annota(on,
you

may
extend
or
add
UTRs.

1.  Right
click
at
the
exon
edge
and
‘Zoom
to
base
level’.

2.  Place
the
cursor
over
the
edge
of
the
exon
un1l
it
becomes
a
black
arrow
then
click

and
drag
the
edge
of
the
exon
to
the
new
coordinate
posi(on
that
includes
the
UTR.

72 |
To
add
a
new
spliced
UTR
to
an
exis(ng
annota(on

follow
the
procedure
for
adding
an
exon.

72
ADDING UTRs

1.  Zoom
in
to
clearly
resolve
each
exon
as
a
dis(nct
rectangle.

2.  Two
exons
from
diﬀerent
tracks
sharing
the
same
start
and/or
end

coordinates
will
display
a
red
bar
to
indicate
matching
edges.

3.  Selec(ng
the
whole
annota(on
or
one
exon
at
a
(me,
use
this
‘edge-‐
matching’
func(on
and
scroll
along
the
length
of
the
annota(on,

verifying
exon
boundaries
against
available
data.
Use
square
[
]

brackets
to
scroll
from
exon
to
exon.

4.  Check
if
cDNA
/
RNAseq
reads
lack
one
or
more
of
the
annotated

exons
or
include
addi(onal
exons.

73 | 73
CHECK EXON INTEGRITY

To
modify
an
exon
boundary
and
match

data
in
the
evidence
tracks:
select

both
the
oﬀending
exon
and
the

feature
with
the
expected
boundary,

then
right
click
on
the
annota(on
to

select
‘Set
3’
end’
or
‘Set
5’
end’
as

appropriate.

74 |
In
some
cases
all
the
data
may
disagree
with
the
annota(on,
in

other
cases
some
data
support
the
annota(on
and
some
of
the

data
support
one
or
more
alterna(ve
transcripts.
Try
to
annotate

as
many
alterna(ve
transcripts
as
are
well
supported
by
the
data.

74
EXON STRUCTURE INTEGRITY

Flags
non-‐canonical

splice
sites.

Selec(on
of
features
and
sub-‐
features

Edge-‐matching

Evidence
Tracks
Area

‘User-‐created
Annota(ons’
Track

Apollo’s
edi(ng
logic
(brain):

§  selects
longest
ORF
as
CDS

§  ﬂags
non-‐canonical
splice
sites

75
ORFs AND SPLICE SITES

76 |
Exon/intron
junc(on
possible
error

Original
model

Curated
model

Non-‐canonical
splices
are
indicated
by
an

orange
circle
with
a
white
exclama(on
point

inside,
placed
over
the
edge
of
the
oﬀending

exon.

Canonical
splice
sites:

3’-‐…exon]GA
/
TG[exon…-‐5’

5’-‐…exon]GT
/
AG[exon…-‐3’

reverse
strand,
not
reverse-‐complemented:

forward
strand

76
SPLICE SITES
Zoom
to
review
non-‐canonical

splice
site
warnings.
Although

these
may
not
always
have
to
be

corrected
(e.g
GC
donor),
they

should
be
ﬂagged
with
the

appropriate
comment.

Web
Apollo
calculates
the
longest
possible
open

reading
frame
(ORF)
that
includes
canonical
‘Start’

and
‘Stop’
signals
within
the
predicted
exons.

If
‘Start’
appears
to
be
incorrect,
modify
it
by
selec(ng

an
in-‐frame
‘Start’
codon
further
up
or

downstream,
depending
on
evidence
(protein

database,
addi(onal
evidence
tracks).

It
may
be
present
outside
the
predicted
gene

model,
within
a
region
supported
by
another

evidence
track.

In
very
rare
cases,
the
actual
‘Start’
codon
may
be

non-‐canonical
(non-‐ATG).

77 | 77
‘START’ AND ‘STOP’ SITES

Evidence
may
support
joining
two
or
more
diﬀerent
gene
models.

Warning:
protein
alignments
may
have
incorrect
splice
sites
and
lack
non-‐conserved
regions!

1.  In
‘User-‐created
Annota=ons’
area
shio-‐click
to
select
an
intron
from
each
gene
model
and

right
click
to
select
the
‘Merge’
op(on
from
the
menu.

2.  Drag
suppor(ng
evidence
tracks
over
the
candidate
models
to
corroborate
overlap,
or

review
edge
matching
and
coverage
across
models.

3.  Check
the
resul(ng
transla(on
by
querying
a
protein
database
e.g.
UniProt.
Add
comments

to
record
that
this
annota(on
is
the
result
of
a
merge.

79 | 79
Red
lines
around
exons:

‘edge-‐matching’
allows
annotators
to
conﬁrm
whether
the

evidence
is
in
agreement
without
examining
each
exon
at
the

base
level.

COMPLEX CASES
merge two gene predictions on the same scaffold
Becoming Acquainted with Web Apollo. COMPLEX CASES

One
or
more
splits
may
be
recommended
when:

-‐
different
segments
of
the
predicted
protein
align
to
two
or
more

different
gene
families

-‐
predicted
protein
doesn’t
align
to
known
proteins
over
its
en(re
length

Transcript
data
may
support
a
split,
but
first
verify
whether
they
are

alterna(ve
transcripts.

80 | 80
COMPLEX CASES
split a gene prediction

DNA
Track

‘User-‐created
Annota=ons’
Track

81
COMPLEX CASES
correcting frameshifts and single-base errors
Always
remember:
when
annota(ng
gene
models
using
Apollo,
you
are
looking
at
a
‘frozen’
version
of

the
genome
assembly
and
you
will
not
be
able
to
modify
the
assembly
itself.

82
COMPLEX CASES
correcting selenocysteine containing proteins

83
COMPLEX CASES
correcting selenocysteine containing proteins

1.  Apollo
allows
annotators
to
make
single
base
modifica(ons
or
frameshios
that
are
reflected
in

the
sequence
and
structure
of
any
transcripts
overlapping
the
modifica(on.
These

manipula(ons
do
NOT
change
the
underlying
genomic
sequence.

2.  If
you
determine
that
you
need
to
make
one
of
these
changes,
zoom
in
to
the
nucleo(de
level

and
right
click
over
a
single
nucleo(de
on
the
genomic
sequence
to
access
a
menu
that

provides
op(ons
for
crea(ng
inser(ons,
dele(ons
or
subs(tu(ons.

3.  The
‘Create
Genomic
Inser=on’
feature
will
require
you
to
enter
the
necessary
string
of

nucleo(de
residues
that
will
be
inserted
to
the
right
of
the
cursor’s
current
loca(on.
The

‘Create
Genomic
Dele=on’
op(on
will
require
you
to
enter
the
length
of
the
dele(on,
star(ng

with
the
nucleo(de
where
the
cursor
is
posi(oned.
The
‘Create
Genomic
Subs=tu=on’
feature

asks
for
the
string
of
nucleo(de
residues
that
will
replace
the
ones
on
the
DNA
track.

4.  Once
you
have
entered
the
modifica(ons,
Apollo
will
recalculate
the
corrected
transcript
and

protein
sequences,
which
will
appear
when
you
use
the
right-‐click
menu
‘Get
Sequence’

op(on.
Since
the
underlying
genomic
sequence
is
reflected
in
all
annota(ons
that
include
the

modified
region
you
should
alert
the
curators
of
your
organisms
database
using
the

‘Comments’
sec(on
to
report
the
CDS
edits.

5.  In
special
cases
such
as
selenocysteine
containing
proteins
(read-‐throughs),
right-‐click
over
the

offending/premature
‘Stop’
signal
and
choose
the
‘Set
readthrough
stop
codon’
op(on
from

the
menu.

84 | 84
COMPLEX CASES
correcting frameshifts, single-base errors, and selenocysteines

Follow
the
checklist
un(l
you
are
happy
with
the
annota(on!

And
remember
to…

–  comment
to
validate
your
annota(on,
even
if
you
made
no
changes
to
an

exis(ng
model.
Think
of
comments
as
your
vote
of
conﬁdence.

–  or
add
a
comment
to
inform
the
community
of
unresolved
issues
you

think
this
model
may
have.

85 | 85
Always
Remember:
Web
Apollo
cura(on
is
a
community
eﬀort
so

please
use
comments
to
communicate
the
reasons
for
your

annota(on
(your
comments
will
be
visible
to
everyone).

COMPLETING THE ANNOTATION

1.  Can
you
add
UTRs
(e.g.:
via
RNA-‐Seq)?

2.  Check
exon
structures

3.  Check
splice
sites:
most
splice
sites
display
these

residues
…]5’-‐GT/AG-‐3’[…

4.  Check
‘Start’
and
‘Stop’
sites

5.  Check
the
predicted
protein
product(s)

–  Align
it
against
relevant
genes/gene
family.

–  blastp
against
NCBI’s
RefSeq
or
nr

6.  If
the
protein
product
s(ll
does
not
look
correct

then
check:

–  Are
there
gaps
in
the
genome?

–  Merge
of
2
gene
predic(ons
on
the
same

scaffold

–  Merge
of
2
gene
predic(ons
from
different

scaffolds

–  Split
a
gene
predic(on

–  FrameshiYs

–  error
in
the
genome
assembly?

–  Selenocysteines,
single-‐base
errors,
etc

87 | 87
7.  Finalize
annota(on
by
adding:

–  Important
project
informa(on
in
the
form
of

comments

–  IDs
from
public
databases
e.g.
GenBank
(via

DBXRef),
gene
symbol(s),
common
name(s),

synonyms,
top
BLAST
hits,
orthologs
with
species

names,
and
everything
else
you
can
think
of,

because
you
are
the
expert.

–  Whether
your
model
replaces
one
or
more
models

from
the
official
gene
set
(so
it
can
be
deleted).

–  The
kinds
of
changes
you
made
to
the
gene
model

of
interest,
if
any.

–  Any
appropriate
func(onal
assignments
of
interest

to
the
community
(e.g.
via
BLAST,
RNA-‐Seq
data,

literature
searches,
etc.)

THE CHECKLIST
for accuracy and integrity
MANUAL ANNOTATION CHECKLIST

Example
Example 89
A
public
Apollo
Demo
using
the
Honey
Bee
genome
is
available
at

hAp://genomearchitect.org/WebApolloDemo

-‐
Demonstra(on
using
the
Hyalella
azteca
genome

(amphipod
crustacean).

What do we know about this genome?
•  Currently
publicly
available
data
at
NCBI:

•  >37,000

nucleo(de
seqsà
scaffolds,
mitochondrial
genes

•  300

amino
acid
seqsà
mitochondrion

•  53

ESTs

•  0

conserved
domains
iden(fied

•  0

“gene”
entries
submiAed

•  Data
at
i5K
Workspace@NAL
(annota(on
hosted
at
USDA)

-‐
10,832
scaffolds:
23,288
transcripts:
12,906
proteins

Example 90

PubMed Search:  
what’s new?
Example 91

PubMed Search: what’s new?
Example 92
“Ten
popula(ons
(3
cultures,
7
from
California
water

bodies)
differed
by
at
least
550-‐fold
in
sensi=vity
to

pyrethroids.”

“By
sequencing
the
primary
pyrethroid
target
site,
the

voltage-‐gated
sodium
channel
(vgsc),
we
show
that

point
muta(ons
and
their
spread
in
natural
popula(ons

were
responsible
for
differences
in
pyrethroid

sensi(vity.”

“The
finding
that
a
non-‐target
aqua(c
species
has

acquired
resistance
to
pes(cides
used
only
on
terrestrial

pests
is
troubling
evidence
of
the
impact
of
chronic

pes=cide
transport
from
land-‐based
applica(ons
into

aqua(c
systems.”

How many sequences for our gene of
interest?
Example 93
•  Para,
(voltage-‐gated
sodium
channel
alpha

subunit;
Nasonia
vitripennis).

•  NaCP60E
(Sodium
channel
protein
60
E;
D.

melanogaster).

–  MF:
voltage-‐gated
ca(on
channel
ac(vity

(IDA,
GO:0022843).

–  BP:
olfactory
behavior
(IMP,
GO:
0042048),
sodium
ion
transmembrane

transport
(ISS,GO:0035725).

–  CC:
voltage-‐gated
sodium
channel

complex
(IEA,
GO:0001518).

And
what
do
we
know
about
them?

Retrieving sequences for  
sequence similarity searches.
Example 94
>vgsc-‐Segment3-‐DomainII

RVFKLAKSWPTLNLLISIMGKTVGALGNLTFVLCIIIFIFAVMGMQLFGKNYTEKVTKFKWSQDG
QMPRWNFVDFFHSFMIVFRVLCGEWIESMWDCMYVGDFSCVPFFLATVVIGNLVVSFMHR

BLAT search
Example 95
>vgsc-‐Segment3-‐DomainII

RVFKLAKSWPTLNLLISIMGKTVGALGNLTFVLCIIIFIFAVMGMQLFGKNYTEKVTKFKWSQDG
QMPRWNFVDFFHSFMIVFRVLCGEWIESMWDCMYVGDFSCVPFFLATVVIGNLVVSFMHR

Customizations:  
high-scoring segment pairs (hsp) in “BLAST+ Results” track
Example 97

Creating a new gene model: drag and drop
Example 98
•  Apollo automatically calculates ORF.
In this case, ORF includes the high-scoring segment pairs (hsp).

Get Sequence
Example 100
http://blast.ncbi.nlm.nih.gov/Blast.cgi

Also, flanking sequences (other gene models) vs. NCBI nr
Example 101
In
this
case,
two
gene

models
upstream,
at
5’

end.

BLAST
hsps

Review alignments
Example 102
HaztTmpM006234

HaztTmpM006233

HaztTmpM006232

Hypothesis for vgsc gene model
Example 103

Editing: merge the three models
Example 104
Merge
by
dropping
an

exon
or
gene
model

onto
another.

Merge
by
selec(ng

two
exons
(holding

down
“Shio”)
and

using
the
right
click

menu.

or…

Editing: correct boundaries, delete exons
Example 105
Modify
exon
/
intron

boundary:

-‐  Drag
the
end
of
the

exon
to
the
nearest

canonical
splice
site.

-‐  Use
right-‐click
menu.

Delete
ﬁrst
exon
from

HaztTmpM006233

Editing: set translation start
Example 106

Editing: modify boundaries
Example 107
Modify
intron
/

exon
boundary

also
at
coord.

78,999.

Finished model
Example 108
Corroborate
integrity
and
accuracy
of
the
model:

-‐
Start
and
Stop

-‐
Exon
structure
and
splice
sites
…]5’-‐GT/AG-‐3’[…

-‐
Check
the
predicted
protein
product
vs.
NCBI
nr

Information Editor
•  DBXRefs:
e.g.
NP_001128389.1,
N.

vitripennis,
RefSeq

•  PubMed
iden(ﬁer:
PMID:
24065824

•  Gene
Ontology
IDs:
GO:0022843,
GO:
0042048,
GO:0035725,
GO:0001518.

•  Comments.

•  Name,
Symbol.

•  Approve
/
Delete
radio
buAon.

Example 109
Comments

(if
applicable)

APOLLO 
demonstration
DEMO 111
Apollo
demo
video
available
at:

hAps://youtu.be/VgPtAP_fvxY

CONTENIDO 
Web
Apollo
Collabora(ve
Cura(on
and

Interac(ve
Analysis
of
Genomes

112OUTLINE
•  BIO-‐REFRESHER

conceptos
que
neceistamos

•  ANOTACION

predicciones
automá(cas

•  ANOTACION
MANUAL

necesaria,
en
colaboración

•  APOLLO

avanzando
la
curación
en
colaboración

•  EJEMPLO

demonstraciones

•  EJERCICIOS

Exercises
Live
Demonstra(on
using
the
Apis
mellifera
genome.

115
1.
Evidence
in
support
of
protein
coding
gene

models.

1.1
Consensus
Gene
Sets:

Oﬃcial
Gene
Set
v3.2

Oﬃcial
Gene
Set
v1.0

1.2
Consensus
Gene
Sets
comparison:

OGSv3.2
genes
that
merge
OGSv1.0
and

RefSeq
genes

OGSv3.2
genes
that
split
OGSv1.0
and
RefSeq

genes

1.3
Protein
Coding
Gene
Predic=ons
Supported
by

Biological
Evidence:

NCBI
Gnomon

Fgenesh++
with
RNASeq
training
data

Fgenesh++
without
RNASeq
training
data

NCBI
RefSeq
Protein
Coding
Genes
and
Low
Quality

Protein
Coding
Genes

1.4
Ab
ini,o
protein
coding
gene
predic=ons:

Augustus
Set
12,
Augustus
Set
9,
Fgenesh,
GeneID,

N-‐SCAN,
SGP2

1.5
Transcript
Sequence
Alignment:

NCBI
ESTs,
Apis
cerana
RNA-‐Seq,
Forager
Bee
Brain

Illumina
Con(gs,
Nurse
Bee
Brain
Illumina
Con(gs,

Forager
RNA-‐Seq
reads,
Nurse
RNA-‐Seq
reads,

Abdomen
454
Con(gs,
Brain
and
Ovary
454

Con(gs,
Embryo
454
Con(gs,
Larvae
454
Con(gs,

Mixed
Antennae
454
Con(gs,
Ovary
454
Con(gs

Testes
454
Con(gs,
Forager
RNA-‐Seq
HeatMap,

Forager
RNA-‐Seq
XY
Plot,
Nurse
RNA-‐Seq

HeatMap,
Nurse
RNA-‐Seq
XY
Plot


Exercises
Live
Demonstra(on
using
the
Apis
mellifera
genome.

116
1.
Evidence
in
support
of
protein
coding
gene

models
(Con=nued).

1.6
Protein
homolog
alignment:

Acep_OGSv1.2

Aech_OGSv3.8

Cﬂo_OGSv3.3

Dmel_r5.42

Hsal_OGSv3.3

Lhum_OGSv1.2

Nvit_OGSv1.2

Nvit_OGSv2.0

Pbar_OGSv1.2

Sinv_OGSv2.2.3

Znev_OGSv2.1

Metazoa_Swissprot

2.
Evidence
in
support
of
non
protein
coding
gene

models

2.1
Non-‐protein
coding
gene
predic=ons:

NCBI
RefSeq
Noncoding
RNA

NCBI
RefSeq
miRNA

2.2
Pseudogene
predic=ons:

NCBI
RefSeq
Pseudogene


Instrucciones
117 | 117
APOLLO EN LA WEB 
instrucciones
Servidor
URL

1

2

3

4

5

Email:

nombre.apellido@example.com

Contraseña:

nombreapellido

Email
Contraseña
Servidor
Empezar
en

user.one@example.com
userone
1
1

user.two@example.com
usertwo
2
1

user.three@example.com
userthree
3
1

user.four@example.com
userfour
4
1

user.five@example.com
userfive
5
1

user.six@example.com
usersix
1
7

user.seven@example.com
userseven
2
7

user.eight@example.com
usereight
3
7

user.nine@example.com
usernine
4
7

user.ten@example.com
userten
5
7

user.eleven@example.com
usereleven
1
1

user.twelve@example.com
usertwelve
2
1

user.thirteen@example.com
userthirteen
3
1

user.fourteen@example.com
userfourteen
4
1

user.fioeen@example.com
userfioeen
5
1

user.sixteen@example.com
usersixteen
1
7

user.seventeen@example.com
userseventeen
2
7

user.eighAeen@example.com
usereighteen
3
7

user.nineteen@example.com
usernineteen
4
7

user.twenty@example.com
usertwenty
5
7

user.twentyone@example.com
usertwentyone
1
1

user.twentytwo@example.com
usertwentytwo
2
1

user.twentythree@example.com
usertwentythree
3
1

user.twentyfour@example.com
usertwentyfour
4
1

user.twentyfive@example.com
usertwentyfive
5
1

user.twentysix@example.com
usertwentysix
1
7

user.twentyseven@example.com
usertwentyseven
2
7

user.twentyeight@example.com
usertwentyeight
3
7

user.twentynine@example.com
usertwentynine
4
7

Thank you. 118
•  Berkeley
Bioinforma=cs
Open-‐source
Projects
(BBOP),

Berkeley
Lab:
Apollo
and
Gene
Ontology
teams.
Suzanna

E.
Lewis
(PI).

•  §
Chris1ne
G.
Elsik
(PI).
University
of
Missouri.

•  *
Ian
Holmes
(PI).
University
of
California
Berkeley.

•  Arthropod
genomics
community:
i5K
Steering

CommiAee
(esp.
Sue
Brown
(Kansas
State)),
Alexie

Papanicolaou
(UWS),
and
the
Honey
Bee
Genome

Sequencing
Consor(um.

•  Stephen
Ficklin
GenSAS
Washington
State
University

•  Apollo
is
supported
by
NIH
grants
5R01GM080203
from

NIGMS,
and
5R01HG004483
from
NHGRI.
Both
projects

are
also
supported
by
the
Director,
Oﬃce
of
Science,

Oﬃce
of
Basic
Energy
Sciences,
of
the
U.S.
Department

of
Energy
under
Contract
No.
DE-‐AC02-‐05CH11231

• 

•  For
your
a"en=on,
thank
you!

Apollo

Nathan
Dunn

Colin
Diesh
§

Deepak
Unni
§

Gene
Ontology

Chris
Mungall

Seth
Carbon

Heiko
Dietze

BBOP

Apollo:
hAp://GenomeArchitect.org

GO:
hAp://GeneOntology.org

i5K:
hAp://arthropodgenomes.org/wiki/i5K

¡Gracias!

NAL
at
USDA

Monica
Poelchau

Christopher
Childers

Gary
Moore

HGSC
at
BCM

fringy
Richards

Kim
Worley

JBrowse

Eric
Yao
*

Apolo Taller en BIOS

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Viewers also liked

Viewers also liked (9)

Similar to Apolo Taller en BIOS

Similar to Apolo Taller en BIOS (20)

More from Monica Munoz-Torres

More from Monica Munoz-Torres (13)

Recently uploaded

Recently uploaded (20)

Apolo Taller en BIOS