DAS (Distributed Annotation System): Introduction
1. Introduction
DAS (Distributed Annotation System)
Programmatic Access To Biological Databases (Perl)
1 – 4 October 2012
Rafael C. Jimenez & Leyla J. Garcia
rafael@ebi.ac.uk and ljgarcia@ebi.ac.uk
2. The Distributed Annotation System (Dowell et al., 2001)
BMC Bioinformatics. 2001; 2: 7. Published online 2001 October 10.
DAS (Distributed Annotation System)
Communication protocol used to exchange sequence annotations
Diagram labels: Washington University; Ensembl; Sean Eddy Laboratory; John’s data; www
19. DAS, The Distributed Annotation System
The Distributed Annotation System is…
• A network of biological data sources
• A Service Oriented Architecture (SOA)
• A RESTful web service
• An example of federation
• Uniform access to multiple repositories of biological data
• Repositories distributed across different geographical locations
The DAS Protocol is…
• An integration platform
• A client-server protocol
• An agreed standard for web services
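The idea of uniform access to distributed repositories can be sketched as a client that sends the same request to several sources and merges the answers. This is an illustration only: the server names, `federated_features` and the injected `fetch` callable are hypothetical, and no network call is made.

```python
def features_url(server, source, segment):
    """Same URL shape for every source:
    http://server/das/source/features?segment=..."""
    return f"http://{server}/das/{source}/features?segment={segment}"


def federated_features(sources, segment, fetch):
    """Ask every (server, source) pair for the same segment and merge results.

    `fetch` is any callable that takes a URL and returns a list of features.
    It is injected so the sketch runs without network access.
    """
    merged = []
    for server, source in sources:
        url = features_url(server, source, segment)
        for feature in fetch(url):
            # Record which source each annotation came from.
            merged.append({"source": source, "feature": feature})
    return merged


# Hypothetical sources and a stand-in fetcher, for demonstration only.
SOURCES = [("server-a.example", "genes"), ("server-b.example", "variants")]


def fake_fetch(url):
    return [f"feature from {url}"]


if __name__ == "__main__":
    for row in federated_features(SOURCES, "X:1,500", fake_fetch):
        print(row)
```

Because every source answers the same kind of request, the client needs no per-repository code. Only the server and source names change.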
20. DAS Design Principles
DAS – Andy Jenkinson, 13.12.2018
• Data remains distributed
  • “live” data
  • data providers retain responsibility
  • good for changing data
  • spreads resources
• Easy for data providers to implement
  • simple protocol
  • lots of data providers
21. DAS Design Principles
Principally for display
• should be responsive (fast)
• region-targeted queries
• lightweight infrastructure
Downsides
• Rigid data model
• Weak semantics
22. DAS RNG
Diagram labels: DAS server; DAS spec 1.6; SO, MOD, BS; XML; coordinate system; sequence; information; definition, representation, access; DAS format
Sequence Types and Features (SO)
• Contains terms for both genomic and protein sequence annotations.
Protein Modifications (MOD)
• Contains terms for post-translational modifications.
BioSapiens Annotations (BS)
• Originally developed for the BioSapiens consortium, this ontology contains protein-focussed terms for nonpositional annotations (e.g. publications).
Diagram labels: schema; ontology; guideline; identifiers
23. Query model
Structured REST URL
• http://server/das/source/command?arguments
• servers, data sources, commands, parameters
Reference object
• e.g. “chromosome X”
Reference servers provide sequence
• http://server/das/source/sequence?segment=X:1,500
Annotation servers provide features
• http://server/das/source/features?segment=X:1,500
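The URL structure above can be sketched as a small helper. This is a minimal sketch, not an official client: `das_url` and `segment` are hypothetical names, and `server` and `source` are the same placeholders used on the slide.

```python
from urllib.parse import urlencode


def segment(reference, start=None, stop=None):
    """Format a segment argument, e.g. "X:1,500", or just "X" for a whole object."""
    if start is None or stop is None:
        return reference
    return f"{reference}:{start},{stop}"


def das_url(server, source, command, **arguments):
    """Build a request of the form http://server/das/source/command?arguments."""
    url = f"http://{server}/das/{source}/{command}"
    if arguments:
        # safe=":," keeps the segment syntax (X:1,500) unescaped.
        url += "?" + urlencode(arguments, safe=":,")
    return url


if __name__ == "__main__":
    # A reference server provides sequence ...
    print(das_url("server", "source", "sequence", segment=segment("X", 1, 500)))
    # ... and an annotation server provides features for the same region.
    print(das_url("server", "source", "features", segment=segment("X", 1, 500)))
```

Both requests share the same reference object and range. Only the command changes.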
24. Data model
Lightweight XML
http://server/das/source/features?segment=X:1,500
<SEGMENT id="X" start="1" stop="500">
  <FEATURE id="…">
    <TYPE id="…" category="…">…</TYPE>
    <METHOD id="…">…</METHOD>
    <START>…</START>
    <END>…</END>
  </FEATURE>
  <FEATURE id="…">
    …
  </FEATURE>
</SEGMENT>
25. DAS Annotation source - Protein Feature Request
Non-positional feature
Positional feature
http://www.ebi.ac.uk/das-srv/uniprot/das/uniprot/features?segment=Q12345
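A features response could be read with a standard XML parser, as in the sketch below. The sample document and all of its ids and values are made up for illustration and are not real UniProt output. The sketch also assumes the convention, as I understand DAS 1.6, that a non-positional feature reports START and END of 0. Check this against the specification before relying on it.

```python
import xml.etree.ElementTree as ET

# Hypothetical response fragment. Every id and value here is invented.
SAMPLE = """\
<SEGMENT id="Q12345" start="1" stop="500">
  <FEATURE id="example-1">
    <TYPE id="example-type-1" category="example-category">domain</TYPE>
    <METHOD id="example-method">example</METHOD>
    <START>10</START>
    <END>120</END>
  </FEATURE>
  <FEATURE id="example-2">
    <TYPE id="example-type-2" category="example-category">publication</TYPE>
    <METHOD id="example-method">example</METHOD>
    <START>0</START>
    <END>0</END>
  </FEATURE>
</SEGMENT>
"""


def parse_features(xml_text):
    """Return one dict per FEATURE element found anywhere in the document."""
    root = ET.fromstring(xml_text)
    features = []
    for feature in root.iter("FEATURE"):
        start = int(feature.findtext("START", default="0"))
        end = int(feature.findtext("END", default="0"))
        features.append({
            "id": feature.get("id"),
            "type": feature.findtext("TYPE"),
            "start": start,
            "end": end,
            # Assumption: 0/0 marks an annotation on the whole entry.
            "positional": not (start == 0 and end == 0),
        })
    return features


if __name__ == "__main__":
    for f in parse_features(SAMPLE):
        kind = "positional" if f["positional"] else "non-positional"
        print(f["id"], f["type"], kind)
```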
26. DAS Reference source - Protein Sequence Request
http://www.ebi.ac.uk/das-srv/uniprot/das/uniprot/sequence?segment=Q12345
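A sequence response could be handled in the same way. The sketch below assumes the response wraps the residues in a `SEQUENCE` element inside `DASSEQUENCE`, which is my reading of DAS 1.6. The sample sequence and attribute values are invented, and `parse_sequence` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Hypothetical response. The residues and attributes are illustrative only.
SAMPLE = """\
<DASSEQUENCE>
  <SEQUENCE id="Q12345" start="1" stop="10" version="example">
    MKTAYIAKQR
  </SEQUENCE>
</DASSEQUENCE>
"""


def parse_sequence(xml_text):
    """Return (segment id, sequence string) from a sequence response."""
    root = ET.fromstring(xml_text)
    element = root.find("SEQUENCE")
    if element is None:
        raise ValueError("no SEQUENCE element in response")
    # Drop the line breaks and indentation around the residues.
    residues = "".join((element.text or "").split())
    return element.get("id"), residues


if __name__ == "__main__":
    seq_id, residues = parse_sequence(SAMPLE)
    print(seq_id, len(residues), residues)
```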
27. More DAS Commands
• Alignment, Structure and Interaction
• More …
http://server/das/source/entry_points
• entry_points – lists the available reference objects (“chromosomes | contigs | proteins | …”).
http://server/das/source/types
• types – provides a summary of the feature types for a segment.
http://server/das/source/stylesheet
• stylesheet – gives the DAS client hints on how to display the feature types. Clients are free to ignore it.
http://server/das/sources
• sources – lists the sources available on one DAS server. Replaces the original, underspecified dsn command.
http://www.biodas.org/wiki/DAS1.6
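One detail in the list above is that `sources` is addressed to the server as a whole, while the other commands are addressed to one source. A minimal sketch of that difference follows. `command_url` and the `SOURCE_COMMANDS` set are hypothetical, and the set covers only commands named on these slides.

```python
from urllib.parse import urlencode

# Commands that are addressed to a single data source.
SOURCE_COMMANDS = {"entry_points", "types", "stylesheet", "features", "sequence"}


def command_url(server, command, source=None, **arguments):
    """Build the URL for a command, at server level or at source level."""
    if command == "sources":
        # Server-level: http://server/das/sources
        url = f"http://{server}/das/sources"
    elif command in SOURCE_COMMANDS:
        if source is None:
            raise ValueError(f"command {command!r} needs a source name")
        # Source-level: http://server/das/source/command
        url = f"http://{server}/das/{source}/{command}"
    else:
        raise ValueError(f"unknown command {command!r}")
    if arguments:
        url += "?" + urlencode(arguments, safe=":,")
    return url


if __name__ == "__main__":
    print(command_url("server", "sources"))
    print(command_url("server", "entry_points", source="source"))
    print(command_url("server", "types", source="source", segment="X:1,500"))
    print(command_url("server", "stylesheet", source="source"))
```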
28. DAS registry REST interface
http://www.dasregistry.org/DASCommandExamples.jsp
29. Split data and presentation
• Databases serve data as primitive datatypes defined by open standards
• Different front ends, or components of front ends, compete for users
Data Representation
43. Versions of DAS
Timeline figure, 2001 to 2011:
• DAS 1 line: DAS 1.01, DAS 1.53, DAS 1.53E, DAS 1.6
• DAS/2 line: DAS 2.0, DAS 2.1
• Source counts marked along the timeline: ~8, ~250, ~380, ~650, ~1300 sources