GEN2PHEN GAM8 meeting Leiden - Identifiers for LSDBs
G. A. Thorisson, A. J. Webb, R. Dalgleish ULEIC / J. Muilu FIMM
Identification of G2P databases - challenges and proposal for a solution
Gudmundur A. Thorisson <firstname.lastname@example.org> ULEIC
Adam J. Webb <email@example.com> ULEIC
Raymond Dalgleish <firstname.lastname@example.org> ULEIC
Juha Muilu <email@example.com> FIMM
-- Overview --
✴ Identification difficulties - the Knowledge Centre perspective
  ✴ Or, why we need persistent identifiers for database resources
✴ Proposal to collaborate with the BioDBCore initiative
  ✴ Standardizing registration & description of bio-databases
This work is published under the Creative Commons Attribution license (CC BY: http://creativecommons.org/licenses/by/3.0/), which means that it can be freely copied, redistributed and adapted, as long as proper attribution is given.
GEN2PHEN 8th General Assembly Meeting, Leiden, Jan 24-25 2012
Friday, 27 January 12

Linking resources
[Figure: external records / annotations linked to variants held in databases (c.301C>T, c.465A>G, c.555G>T, c.103C>T, c.321G>T); the actors shown are DB maintainers and submitters]

URLs are unstable
http://subdomain.example.com/path/to/resource
• Domain names / subdomains can change
  – hgvbaseg2p.org -> gwascentral.org
  – server1.example.com -> server2.example.com
• Paths can change
  – e.g. /LOVD2/ changes to /LOVD3/
• LSDB genes can move
  – e.g. gene ADAM19 moves from one LOVD install to another
• Databases can merge
  – e.g. gene ADAM19 databases on two different installs are reconciled into a single install

1:1 IDENTIFIER DATA RESOURCE
• Gene name not suitable
  – > 1 database for a given gene
• gene.lovd.nl -> returns list of databases (or redirects if only 1 is known)
  – 1 to many
• lovd.nl/gene -> redirects to *one* database
  – 1 to one, but many resources do not receive identifiers
• These are locators, not identifiers
• Non-gene based resources
• Ideally the identifier should also operate as the locator (like DOIs via a DOI resolution service)
  – http://dx.doi.org/10.19192 resolves DOI 10.19192

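The point that a gene name cannot act as a 1:1 identifier can be illustrated with a minimal sketch. The database URLs and the helper names below are hypothetical, made up purely for illustration; only the gene symbol ADAM19 comes from the slides.

```python
from collections import defaultdict

# Hypothetical (database URL, gene symbol) pairs; the URLs are illustrative only.
DATABASES = [
    ("http://lsdb-a.example.org/ADAM19", "ADAM19"),
    ("http://lsdb-b.example.org/ADAM19", "ADAM19"),
    ("http://lsdb-a.example.org/COL3A1", "COL3A1"),
]


def index_by_gene(databases):
    """Group database URLs by gene symbol (a 1-to-many mapping in general)."""
    by_gene = defaultdict(list)
    for url, gene in databases:
        by_gene[gene].append(url)
    return dict(by_gene)


def ambiguous_genes(databases):
    """Return gene symbols that point at more than one database."""
    index = index_by_gene(databases)
    return sorted(gene for gene, urls in index.items() if len(urls) > 1)
```

Here `ambiguous_genes(DATABASES)` flags ADAM19, because two separate installs hold a database for it. A gene symbol therefore works as a search key, not as an identifier for any one database.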
Proposal to collaborate with BioDBCore
• BioDBCore aims
  – annotation - organize the bio-database ‘resourceome’
  – discovery - e.g. which protein sequence databases are available?
• Who’s behind it?
  – International Society for Biocuration
  – Resource catalogues: Bioinformatics Links, BioSiteMaps, NAR db-issue etc.
  – Working group includes reps from NAR and DATABASE journals, MIBBI, model organism db’s, CASIMIR mouse informatics consortium, others

Persistent resource identifiers in BioDBCore
• They plan to use the MIRIAM registry / ID resolution service
  – unique, persistent and unambiguous identification of various kinds of concepts
    • http://identifiers.org/ec-code/22.214.171.124
    • http://identifiers.org/pubmed/16333295
    • http://identifiers.org/doi/10.1038/nbt1156
• Decouples identification from location
• Many resources are already registered with MIRIAM
• Operated by EBI <-- long-term sustainability prospect
• Adoption by players in the LS Semantic Web community
  – URIs for identifying entities in biological information represented in RDF
  – http://lsrn.org, Shared Names, Bio2RDF, others

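The example URIs above all follow the same visible pattern: resolver host, then a namespace, then the local identifier. A minimal sketch of composing and splitting such URIs follows. The function names are hypothetical, and the pattern is inferred only from the three examples on the slide, not from the MIRIAM specification.

```python
from urllib.parse import urlparse

# Resolver prefix as it appears in the slide examples.
RESOLVER = "http://identifiers.org"


def to_uri(namespace, local_id):
    """Compose a URI of the form <resolver>/<namespace>/<local id>."""
    return f"{RESOLVER}/{namespace}/{local_id}"


def from_uri(uri):
    """Split a URI back into (namespace, local id).

    Only the first path segment is treated as the namespace, so local
    identifiers that themselves contain "/" (such as DOIs) stay intact.
    """
    path = urlparse(uri).path.lstrip("/")
    namespace, _, local_id = path.partition("/")
    if not namespace or not local_id:
        raise ValueError(f"not a namespace/identifier URI: {uri!r}")
    return namespace, local_id
```

For instance, `from_uri("http://identifiers.org/doi/10.1038/nbt1156")` yields the namespace `doi` and the local identifier `10.1038/nbt1156`. Neither part says anything about where the record is currently hosted.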
How might this work?
• Using database URIs - plausible scenario
  – Persistent canonical URI: http://identifiers.org/biodbcore/10235900
  – Click URL, browser redirects to http://biodbcore.org/resource/10235900
  – BioDBCore metadata record for the database (akin to a “landing page” on an online journal site)
• BioDBCore “landing page” presents database metadata
  – Information *about* the “thing”
  – Name: Ehlers-Danlos Syndrome Variant Database
  – Main resource URL: https://eds.gene.le.ac.uk <-- the “thing” itself
  – [scope, data standards, other metadata]
• Location of database = the “thing” itself

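The scenario above can be mimicked with a small in-memory model. This is only a sketch under stated assumptions: the `Resolver` and `LandingRecord` classes are hypothetical stand-ins, not part of BioDBCore or identifiers.org, and the relocation target `https://eds.example.org` is invented for the example. The URI, database name and main URL are taken from the slide.

```python
from dataclasses import dataclass


@dataclass
class LandingRecord:
    """Metadata *about* a database, as a landing page might present it."""
    name: str
    main_url: str  # location of the "thing" itself


class Resolver:
    """Toy stand-in for an identifier resolution service (hypothetical)."""

    def __init__(self):
        self._records = {}

    def register(self, uri, record):
        """Associate a persistent URI with a metadata record."""
        self._records[uri] = record

    def describe(self, uri):
        """Return the metadata record (the 'landing page') for a URI."""
        return self._records[uri]

    def locate(self, uri):
        """Return the current location of the database itself."""
        return self._records[uri].main_url

    def relocate(self, uri, new_url):
        """Record that the database moved; the persistent URI is unchanged."""
        self._records[uri].main_url = new_url


EDS_URI = "http://identifiers.org/biodbcore/10235900"

resolver = Resolver()
resolver.register(
    EDS_URI,
    LandingRecord("Ehlers-Danlos Syndrome Variant Database",
                  "https://eds.gene.le.ac.uk"),
)
```

If the database later moved, a call such as `resolver.relocate(EDS_URI, "https://eds.example.org")` would update the record while every external link to `EDS_URI` stays valid, which is the practical benefit of decoupling identification from location.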
Mutual benefits
• To GEN2PHEN / G2P community
  – Identification - slot into a resource identifier scheme for bio-databases globally, build more detailed catalogues & annotation systems around this
  – Discovery - finding relevant LSDBs and other G2P resources via a range of search/query tools outside the KC or LSDB lists
  – BioDBCore could possibly evolve into a sort of live “database publishing platform”, instead of the static “snapshot” offered by conventional papers
• To BioDBCore initiative
  – Acquire an entire category’s worth of metadata records & a link to the community
  – Extra pairs of eyes on what they’re doing, alternative perspective
  – Potential for further collaboration on contribution-tracking tools & ORCID integration

Open questions, known unknowns etc.
• BioDBCore is quite new, many things remain in flux
  – e.g. the MIRIAM / identifiers.org technical details are vague
• DOIs for BioDBCore records - register database DOIs for fuller integration into the publishing process?
• How will this work with existing LSDB lists?

G. A. Thorisson, ULEIC
Acknowledgements
• GEN2PHEN Consortium: http://www.gen2phen.org/about-gen2phen/partners
• Prof Anthony J. Brookes, Bioinformatics Group, Leicester
This work has received funding from the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreement number 200754 - the GEN2PHEN project.
Contact me! <firstname.lastname@example.org> | <email@example.com>
http://www.linkedin.com/in/mummi
http://www.twitter.com/gthorisson
http://www.gthorisson.name
Published under the CC BY license (http://creativecommons.org/licenses/by/3.0/)