5. Recent BD2K-sponsored hackathons
• Palo Alto, November 2015, Network of BioThings 4
• La Jolla, January 2016, Web Apollo
• Boston, February 2016, Morphological profiling data analysis
• Buenos Aires, April 2016, UCLA Heart Bits
• Cambridge, MA, April 2016, Hacking Cancer: TCGA and 7 Bridges
6. Winner!
Palo Alto, November 2015,
(Network of BioThings 4)
https://github.com/NetworkofBioThings/conceptcentricview
http://ccv.bioontology.org/
Concept Centric View
“CCV”
Search for “heart”
7. La Jolla, January 2016, Web Apollo
No judging this time, just my favorite.
Genome Annotations grabbed from remote SPARQL endpoint, rendered in Web Apollo
15. To build Data Soups
• We need to integrate data from different sources
• We need the data to remain accessible
• We need people to:
• eat our soup (use our data..)
• pay for it
• help us improve our recipe
16. Knowledge Commons
A common place for:
• Storing identifier mappings
• Hosting knowledge for the long term
• Channeling collaborative spirit!
18. Wikidata is to data
as Wikipedia is to text
“Giving more people more access to more knowledge”
A free and open repository of knowledge
Managed by the Wikimedia Foundation,
which operates Wikipedia
Not a ‘project’…
25. Once the data is in, it joins a GGG
https://query.wikidata.org = SPARQL endpoint for Wikidata
Giant Global Graph
Everything is connected: Reelin, heart disease, Barack Obama, everything.
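A minimal sketch of how the endpoint on this slide could be queried from Python, using only the standard library. Assumptions not on the slide: the machine-readable endpoint path `/sparql`, the example query (items with an Entrez Gene ID, property P351), the `LIMIT`, and the User-Agent string. The helper names are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumed machine-readable path of the query service named on the slide.
ENDPOINT = "https://query.wikidata.org/sparql"

# Example query (an assumption, not from the slides): a few items that
# carry an Entrez Gene ID (property P351).
QUERY = """
SELECT ?item ?entrez WHERE {
  ?item wdt:P351 ?entrez .
}
LIMIT 5
"""


def build_request(sparql):
    """Build a GET request asking the endpoint for JSON results."""
    params = urllib.parse.urlencode({"query": sparql, "format": "json"})
    return urllib.request.Request(
        ENDPOINT + "?" + params,
        # Hypothetical User-Agent; replace with one that identifies your tool.
        headers={"User-Agent": "ggg-slide-example/0.1"},
    )


def rows_from_results(results):
    """Flatten SPARQL JSON results into a list of {variable: value} dicts."""
    return [
        {name: cell["value"] for name, cell in binding.items()}
        for binding in results["results"]["bindings"]
    ]


def run_query(sparql):
    """Send the query over the network and return the flattened rows."""
    with urllib.request.urlopen(build_request(sparql), timeout=30) as response:
        return rows_from_results(json.load(response))
```

Calling `run_query(QUERY)` needs network access; each row maps a variable name from the `SELECT` clause to its value as a string.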
26. We are seeding it with biomedical data
• All human and mouse genes and proteins
• All Gene Ontology terms
• All FDA-approved drugs
• 9,000+ human diseases
• 120 reference microbial genomes
Burgstaller et al. (2016) Database (preprint in bioRxiv)
Mitraka et al. (2015) Semantic Web Applications for the Life Sciences (best paper) (preprint in bioRxiv)
Putman et al. (2016) Database (preprint in bioRxiv)
27. Our seeds are largely concepts linked to many identifier systems
N identifiers per item
• Genes: 8
• Drugs: 18
• Diseases: 11
Facilitate integration with other key knowledge bases
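One way to picture why many identifiers per item help integration: each item acts as a hub, so any identifier can be translated into any other through it. The sketch below is illustrative only. The item keys, identifier systems, and values are placeholders, not real Wikidata records, and the function names are hypothetical.

```python
# Placeholder items: every key and identifier value here is made up.
ITEMS = {
    "ITEM_GENE_1": {
        "label": "example gene",
        "ids": {"entrez": "1001", "ensembl": "ENSG_EXAMPLE_1", "hgnc": "H1"},
    },
    "ITEM_DISEASE_1": {
        "label": "example disease",
        "ids": {"doid": "D1", "mesh": "M1"},
    },
}


def build_index(items):
    """Map every (identifier system, value) pair back to its item key."""
    index = {}
    for item_key, item in items.items():
        for system, value in item["ids"].items():
            index[(system, value)] = item_key
    return index


def translate(index, items, system, value, target_system):
    """Translate an identifier into another system via the shared item.

    Returns None when the identifier is unknown or the item has no
    identifier in the target system.
    """
    item_key = index.get((system, value))
    if item_key is None:
        return None
    return items[item_key]["ids"].get(target_system)


index = build_index(ITEMS)
ensembl_id = translate(index, ITEMS, "entrez", "1001", "ensembl")
```

With N identifiers on an item, one hub lookup replaces the pairwise mapping tables that would otherwise be maintained between each pair of identifier systems.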
34. Stone Soups Harness Collaborative Spirit
• Programmers
• Data
Wikidata Workshop, International Society for Biocuration
Geneva, Switzerland, April 10-14, 2016!!!
35. Thanks!
Wikidata Team
Andra Waagmeester (Micelio)
* Sebastian Burgstaller (Scripps)
* Tim Putman (Scripps)
* Elvira Mitraka (U Maryland)
Julia Turner (Scripps)
Justin Leong (UBC)
Lynn Schriml (U Maryland)
Paul Pavlidis (UBC)
Andrew Su (Scripps)
Ginger Tsueng (Scripps)
Contact
bgood@scripps.edu
* First author on manuscript cited in this presentation
Ben, Tim, Andra, Elvira, Sebastian
Some Gene Wiki team members enjoying their best paper award at SWAT4LS, Dec. 2015
Adapted logo
Su Laboratory at TSRI
37. Adding data automatically with a Bot
(1) Learn the API, perhaps using our library*
(2) Use your Wikidata account to make tests
(3) Describe your project, request approval for bot user account
(4) If you need new properties, request approval
(5) Execute bot
(6) Pay attention to community reactions; they can block you if it goes wrong.
*http://bitbucket.org/sulab/wikidatabots/
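As a rough illustration of steps (1) and (2), the sketch below builds the parameters for a single `wbcreateclaim` call against the Wikidata API without sending it. It does not use the library linked above. Assumptions: the target is the Wikidata Sandbox item (assumed here to be Q4115189; verify before use), the property is P351 (Entrez Gene ID), the value and edit summary are examples, and the CSRF token comes from a logged-in session that is not shown. The function name is hypothetical.

```python
import json

API_URL = "https://www.wikidata.org/w/api.php"

# Assumed id of the Wikidata Sandbox item, a safe place for test edits.
SANDBOX_ITEM = "Q4115189"


def claim_params(entity, prop, value, csrf_token, summary="test edit"):
    """Build POST parameters for adding one string-valued claim.

    The value must be JSON-encoded, so a plain string is sent wrapped
    in double quotes.
    """
    return {
        "action": "wbcreateclaim",
        "entity": entity,
        "property": prop,
        "snaktype": "value",
        "value": json.dumps(value),
        "summary": summary,
        "token": csrf_token,  # obtained after login; not shown here
        "bot": "1",           # only meaningful for an approved bot account
        "format": "json",
    }


# Example: parameters for a test claim on the sandbox item.
params = claim_params(SANDBOX_ITEM, "P351", "5649", "PLACEHOLDER_TOKEN")
```

These parameters would be sent as a POST to `API_URL` from an authenticated session. Running the same call against real items belongs to step (5), after the approvals in steps (3) and (4).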
Opportunities for a group of people to come together to make something new
Social gatherings usually lasting a few days
Competitions (usually but not necessarily)
Our group produces a lot of tools, as evidenced by the long string of presentations organized by Vincent. A hackathon provides the chance to see whether these tools and/or the scientists and engineers who build them can be combined in new and innovative ways.
The winning team developed a concept-centric view (CCV) for ontological terms. CCV offers a unique service to users by aggregating all the relevant information concerning a search term across the BioPortal repository of biomedical ontologies. The user interface presents definitions, synonyms, child and parent terms, concept identifiers and the ontologies that contain them. The service is highly useful because it dramatically reduces the time and effort required to collate such information into a cross-ontology view.
○ Repo: https://github.com/NetworkofBioThings/conceptcentricview
○ Website: http://ccv.bioontology.org/
● The runner-up developed PredicType. PredicType is a Google Docs plugin to incorporate structured annotations into documents. It taps into BioPortal to enable users to select concepts that partially match their text and add external links to their document. With further development, PredicType aims to remove the ambiguity in retrospective text mining of scientific documents, and make it possible to query these documents directly.
○ Presentation: https://goo.gl/WC0Afx
○ Repo: https://github.com/veleritas/predictype
Knowledge bases are not complete
Knowledge needs integration
Funded individually on a project basis
Lack of central authority results in constant identifier mapping problems
From mosaic to melting pot?
Common ground, not owned by anyone, not subject to annual grant review cycles (institution)
Support interlinking of resources in a meaningful way
Increase dissemination
Generate feedback
This is the first application of the work that we have done