
FAIRy Stories


Findable Accessible Interoperable Reusable < data | models | SOPs | samples | articles | * >. FAIR is a mantra; a meme; a myth; a mystery; a moan. For the past 15 years I have been working on FAIR in a bunch of Life Science projects and initiatives. Some are top-down, like the Life Science European Research Infrastructures ELIXIR and ISBE, and some are bottom-up, supporting research projects in Systems and Synthetic Biology (FAIRDOM), Biodiversity (BioVeL), and Pharmacology (Open PHACTS), for example. Some have become movements, like Bioschemas, the Common Workflow Language and Research Objects. Others focus on cross-cutting approaches in reproducibility, computational workflows, metadata representation and scholarly sharing & publication. In this talk I will relate a series of FAIRy tales. Some of them are Grimm. Some have happy endings. Who are the villains and who are the heroes? What are the morals we can draw from these stories?


  1. 1. FAIRy stories for Christmas Carole Goble The University of Manchester, UK carole.goble@manchester.ac.uk ELIXIR-UK, FAIRDOM, ISBE, BioExcel CoE, Software Sustainability Institute Open PHACTS SWAT4HCLS 2017, 5th Dec 2017, Rome
  2. 2. Once upon a time in a land far, far away lived a KinG … Who wanted all data to be FAIR….
  3. 3. Mark D. Wilkinson, Michel Dumontier, IJsbrand Jan Aalbersberg, Gabrielle Appleton, Myles Axton, Arie Baak, Niklas Blomberg, Jan-Willem Boiten, Luiz Bonino da Silva Santos, Philip E. Bourne, Jildau Bouwman, Anthony J. Brookes, Tim Clark, Mercè Crosas, Ingrid Dillo, Olivier Dumon, Scott Edmunds, Chris T. Evelo, Richard Finkers, Alejandra Gonzalez-Beltran, Alasdair J.G. Gray, Paul Groth, Carole Goble, Jeffrey S. Grethe, Jaap Heringa, Peter A.C ’t Hoen, Rob Hooft, Tobias Kuhn, Ruben Kok, Joost Kok, Scott J. Lusher, Maryann E. Martone, Albert Mons, Abel L. Packer, Bengt Persson, Philippe Rocca-Serra, Marco Roos, Rene van Schaik, Susanna-Assunta Sansone, Erik Schultes, Thierry Sengstag, Ted Slater, George Strawn, Morris A. Swertz, Mark Thompson, Johan van der Lei, Erik van Mulligen, Jan Velterop, Andra Waagmeester, Peter Wittenburg, Katherine Wolstencroft, Jun Zhao, Barend Mons Wilkinson Dumontier Schultes Scientific Data 3, 160018 (2016) doi:10.1038/sdata.2016.18
  4. 4. Queens… And FAIRY GODMOTHERS Scientific Data 3, 160018 (2016) doi:10.1038/sdata.2016.18
  5. 5. Machine Processable Metadata Scientific Data 3, 160018 (2016) doi:10.1038/sdata.2016.18 • Catalogues, Search, Stores • Metadata Standards • Standard Access protocols • Identifiers, Policies • Authorised Access • Licensing
  6. 6. FAIR spread across the lands …… VIVO/SciTS Conferences 6-8 August 2014, Austin, TX
  7. 7. FAIR spread across the lands ……
  8. 8. Stakeholder FAIR Awareness UK Institutional Research Data Management guidance* * Jisc: Final Report FAIR in Practice, Nov 2017 Government, Funder, Publisher, National & International Infrastructures… Institutional Researchers FAIR spread across the lands …… BUT not necessarily all the peoples
  9. 9. FAIR spread across the lands ……
  10. 10. Moral: Names are important Spinning (metadata) straw into gold Be careful what you promise…
  11. 11. Me Too! staking claims we { are | will be | always have been } FAIR a rallying flag
  12. 12. Hype Curve
  13. 13. http://dx.doi.org/10.1101/225490 http://blog.ukdataservice.ac.uk/fair-data-assessment-tool/ http://fairmetrics.org/
  14. 14. Beware… beauty is in the eye of the beholder What’s FAIR from a cataloguer’s perspective may be useless from a biologist’s viewpoint
  15. 15. My Semantic FAIRy Stories The Scientist and the FAIR Commons The MAGIC Research Object little semantics and the big Web
  16. 16. The Scientists and the FAIR Research Commons Supporting mixed types and many researchers FAIR
  17. 17. The Scientists and the FAIR Research Commons Find: ID resolution Faceted Navigation Search, RDF SPARQL endpoint, APIs A Commons for Workflows myexperiment.org A Commons for Systems Biology Projects fairdomhub.org investigation study assay/analysis data models SOPs
  18. 18. Community & Project Commons Structured organisation across standards and types Federation over autonomous resources Laissez-Faire Independent Users Ecosystem of types, stores and metadata
  19. 19. Own little houses: from straw to bricks Permission controls Staged sharing Licenses Negotiated access Embargos Open
  20. 20. Schema Dublin core Datacite, DCAT, Bioschemas Catalogue Level Investigation Studies Assay/Analysis Content level Persistent Identifiers Content level subject thematic standards Content level Stratified Linked Data
  21. 21. Getting the best FAIR metadata…. FAIR Access – myExperiment -> open – FAIRDOM -> friends and family – Hand over straw houses to FAIRDOMHub “The Tragedy of the Commons”* – Metadata quality and quantity – Identifier hygiene – Curation & contributions – Public good vs personal burden – Incorporation into processes – Community socialisation - obligations mismatches. Credit! *Mark Musen, https://ncip.nci.nih.gov/blog/face-new-tragedy-commons-remedy-better-metadata/
  22. 22. project PIs, funders time burden, distrust project PIs, funders PALs – juniors, advocates and Cinderellas templates, tools benefit
  23. 23. Moral: Incentives
  24. 24. Bake in “Semantic Nudging” Ontologies stealthily embedded in Excel spreadsheet templates Added value - Model execution Vanity, guilt, shaming Automation rightfield.org.uk
  25. 25. Cinderella? The Spreadsheet
  26. 26. “The Last Mile”* -> The First Mile FAIR from bench to cloud Last mile - Infrastructure view First mile - researcher / resource view * Dimitrios Koureas et al Community engagement: The ‘last mile’ challenge for European research e-infrastructures, Research Ideas and Outcomes 2: e9933 (20 Jul 2016) https://doi.org/10.3897/rio.2.e9933
  27. 27. the generic vs specific zig zag path
  28. 28. The MAGIC Research OBJECT GENERIC Framework For exchange, reproducibility, Preservation, active artefacts Universal Catering, bottomless content FAIR
  29. 29. The FAIR Research Object import, exchange, portability, maintenance ISA-TAB Bergman et al COMBINE archive and OMEX format: one file to share all information to reproduce a modeling project, BMC Bioinformatics 2014, 15:369
  30. 30. workflow engine Workflow Run Provenance Inputs Outputs Intermediates Parameters Configs Narrative Exchange between people & platforms Commons store, catalogue & archive Reproduce preserve, port, repair Activate re-compute, mix, compare, evolve The FAIR Workflow Research Object
  31. 31. researchobject.org Bechhofer et al (2013) Why linked data is not enough for scientists https://doi.org/10.1016/j.future.2011.08.004 Bechhofer et al (2010) Research Objects: Towards Exchange and Reuse of Digital Knowledge, https://eprints.soton.ac.uk/268555/ Standards-based generic metadata framework for bundling internal and external resources with context citable reproducible packaging Data used and results produced in study Methods employed to produce/analyse data Provenance and settings for the experiments People involved in the investigation Annotations about these resources:- understanding & interpretation
  32. 32. Linking across ROs and into the Linked Open Data Cloud • Recording & linking together the components of an experiment • Linking across experiments. • Linked ROs • A Semantic Web of Research Objects • Resource References – a bottomless pot
  33. 33. Technology Independent. The least possible. The simplest feasible. Low tech. Low user overhead and thin client Graceful degradation. FAIR ROs Desiderata
  34. 34. Construction Content Profile Types Identification to locate things Aggregates to link things together Annotations about things & their relationships Type Checklists what should be there Provenance where it came from Versioning its evolution Dependencies what else is needed Manifest checklist Type Checklists describing what should be there Container Metadata Objects
  35. 35. Construction http://www.researchobject.org/specifications/ RO Model Identifiers: URI, RRI, DOI, ORCID W3C Web Annotation Vocabulary Open Archives Initiative Object Exchange and Reuse Aggregation Annotation Container
  36. 36. Content Profiles. Progression Levels Container
  37. 37. Profile http://purl.org/minim/description W3C Shape Specs *Gamble, Zhao, Klyne, Goble. "MIM: A Minimum Information Model Vocabulary and Framework for Scientific Linked Data", IEEE eScience 2012, Chicago, USA (October 2012), http://dx.doi.org/10.1109/eScience.2012.6404489 validators / viewers Minim model for defining checklists* multiple profiles for different consumers Generic Specifics RO-SHOW Container
  38. 38. Linked Data Pharmacological Discovery Platform Data Releases Dataset “build” RO Library Earth Sciences Public Health Learning Systems Asthma Research e-Lab sharing and computing statistical cohort studies Happy Endings! ISA based Packaging, Systems Biology commons & publishing Managing distributed unmovable large datasets for Biomedical HTS analytic pipelines * * Chard et al I'll take that to go: Big data bags and minimal identifiers for exchange of large, complex datasets, https://doi.org/10.1109/BigData.2016.7840618
  39. 39. Happy Ending – Workflows Biomedical HTS analytic pipelines Manifest description of CWL workflows + rich context + provenance + other objects + snapshots Precision medicine NGS pipelines regulation* *Alterovitz, Dean II, Goble, Crusoe, Soiland-Reyes et al Enabling Precision Medicine via standard communication of NGS provenance, analysis, and results, biorxiv.org, 2017, https://doi.org/10.1101/191783 EDAM Biomolecular modelling Portable Workflows
  40. 40. BagIT, JSON(-LD), schema.org https://dokie.li/ https://linkedresearch.org/ Manifest: Schema.org, JSON-LD, RDF Archive: .tar.gz Reproducible Document Stack project eLife, Substance and Stencila BagIT data profile + schema.org JSON-LD annotations Many Roads
  41. 41. Morals Incremental, open frameworks hard work – Extensive reuse of standards is tricky – Too Generic vs Too Specific – Multi-element type & nesting challenges – ROs with a Purpose – Examples & templates Representational Beauty vs Tools – Easy to make, hard to consume – Be specific, be developer friendly – Profiles & tools critical Patience is a virtue
  42. 42. Bioschemas: Little Semantics and the big web Being and keeping light, small and viral FAIR
  43. 43. Structured data markup for web pages Schema.org adds simple structured metadata markup to web pages & sitemaps for harvesting, search and summary snippet making. Search engines often highlight websites containing Schema.org Widespread commercial and open source infrastructure creates a low barrier to adoption
  44. 44. Goldilocks & the 3 Use Cases Standardised metadata mark-up Metadata published & harvested without APIs or special feeds 3 Use Cases 1. Finding/Citing, 2. Summary snippets 3. Metadata exchange / ingest Goldilocks • Reuse ubiquitous commercial platform • The least possible change, the max possible reuse • Minimum properties – 6 • Reuse domain ontologies – we are not reinventing them! Commodity Off the Shelf tools App eco-system Repository Level Content type level
  45. 45. Standardised metadata mark-up Metadata published & harvested without APIs or special feeds Commodity Off the Shelf tools App eco-system Repository Level Content type level Goldilocks & the 3 Use Cases
  46. 46. Training materials Events Organizations Data Software Lab Protocols schema.org tailored to the Biosciences for FAIR simple structured metadata markup on web pages & sitemaps bio.tools
  47. 47. schema.org tailored to the Biosciences simple structured metadata markup on web pages & sitemaps • Specific for life sciences • Extends existing Schema.org types • Focused on few types and well defined relationships • Minimum properties for finding and accessing data • Best practices for selected properties • Managed by Bioschemas.org • Generic data model • Generous list of properties to describe data types • Managed by Schema.org
  48. 48. Tailored schema.org to improve Findability and Accessibility in Bioscience Layer of constraints + documentation + extensions Leyla Garcia. Poster & Flashtalk
  49. 49. 2-3 Oct 2017, Hinxton, ~50 people Ideally 6 concepts Reuse ontologies schema.org Real mark-up Tools Find, Cite, Snippets, Metadata exchange Community
  50. 50. http://www.france-bioinformatique.fr/en/training_material https://search.google.com/structured-data/testing-tool Applied Drupal 7 schema.org extension Took about 2 hours Included in TeSS in an hour [Niall Beard]
  51. 51. MORALs Community Buy-in Worth it • First specs & main mechanism for training • Google / Schema & ELIXIR support • Research Schemas for European Open Science Cloud pilot Goldilocks works but is hard work • Types & Profiles debates • Elegance vs best for tools • Reuse domain ontologies • Validation, mark-up & harvesting tools Trolls
  52. 52. How are we FAIRing? Different levels with different emphasis It’s an Ecosystem, not a single solution • Catalogues, Search, Stores • Metadata Standards • Standard Access protocols • Identifiers, Policies • Authorised Access • Licensing
  53. 53. smart rebrand launch Still hard, same stuff Rally big communities and grassroots initiatives Examine our capabilities There is no magic
  54. 54. FAIRy Land PEST Political Economic Social Technical
  55. 55. Platform & user buy-in from the get-go Passionate, dedicated leadership Seeding critical mass Community Tools Driver Bottom up initiatives fostered by big umbrellas infrastructures FAIR Semantic Village* Simple & Lightweight Ramps not revolutions FAIR with a PURPOSE & With PEOPLE FAIR Support typical developer – Familiarity – JSON, APIs *Deb McGuinness
  56. 56. Research for FAIR FAIR representation • The Semantic Web Automated metadata • Deep learning, machine learning, AI • Text Mining, Ontology mapping Social metadata • User Experience, Crowd Sourcing • Choice architecture FAIR action • Blockchain • Virtualised & remote execution • Image processing • Preservation & portability • Provenance tracking, object trajectories • Engineering & Design, Ethics, Social Sciences Research + Developer Practitioner practices
  57. 57. Mark Robinson Norman Morrison Paul Groth Tim Clark Alejandra Gonzalez-Beltran Philippe Rocca-Serra Ian Cottam Susanna Sansone Kristian Garza Daniel Garijo Catarina Martins Iain Buchan Caroline Jay David De Roure Oscar Corcho Steve Pettifer Khalid Belhajjame Jun Zhao Phil Crouch Lilian Gorea Oluwatomide Fasugba Stian Soiland-Reyes Michael Crusoe Rafael Jimenez Alasdair Gray Barend Mons Sean Bechhofer Michel Dumontier Mark Wilkinson Leyla Garcia Stuart Owen Katy Wolstencroft Finn Bacall Alan Williams Wolfgang Mueller Olga Krebs Jacky Snoep Matthew Gamble Raul Palma Mark Musen http://www.researchobject.org http://www.myexperiment.org http://wf4ever.org http://www.fair-dom.org http://www.fairdomhub.org http://seek4science.org http://rightfield.org.uk http://www.bioschemas.org http://www.commonwl.org http://www.bioexcel.eu http://www.openphacts.org
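
The Research Object construction on slides 34 and 35 (identification to locate things, aggregation to link them, annotations about them, and a checklist of what should be there) can be sketched in a few lines of Python. This is only an illustration, not the researchobject.org specification: the field names `aggregates` and `annotations`, the helpers `build_manifest` and `check_profile`, and the example.org URIs are all assumptions made for the example.

```python
import json

def build_manifest(ro_id, resources, annotations):
    """Bundle resource identifiers and annotations into one manifest dict.

    The field names are illustrative, not taken from the RO specifications.
    """
    return {
        "id": ro_id,
        "aggregates": [{"uri": uri, "type": rtype} for uri, rtype in resources],
        "annotations": [
            {"about": about, "content": content} for about, content in annotations
        ],
    }

def check_profile(manifest, required_types):
    """Return the resource types a checklist requires but the manifest lacks."""
    present = {entry["type"] for entry in manifest["aggregates"]}
    return sorted(set(required_types) - present)

# Hypothetical Research Object with a workflow and one input file.
manifest = build_manifest(
    "https://example.org/ro/1",
    resources=[
        ("https://example.org/ro/1/workflow.cwl", "workflow"),
        ("https://example.org/ro/1/inputs.csv", "data"),
    ],
    annotations=[
        ("https://example.org/ro/1/workflow.cwl", "Hypothetical analysis workflow"),
    ],
)

# A checklist (profile) that also expects a provenance record.
missing = check_profile(manifest, required_types=["workflow", "data", "provenance"])
print(json.dumps(manifest, indent=2))
print("Missing from checklist:", missing)
```

Keeping the checklist separate from the container mirrors the "multiple profiles for different consumers" idea on slide 37: the same manifest could be checked against several profiles without being changed.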

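Slides 43 to 47 describe Schema.org markup embedded in web pages, with Bioschemas layering a small set of minimum properties on top. A minimal sketch of that idea, assuming JSON-LD as the markup syntax, follows. The property list in `MINIMUM_PROPERTIES` is an assumption for the example (the slides say the minimum is ideally six but do not list them), and the dataset values and example.org URLs are made up.

```python
import json

# Assumed minimum property set for the example; the slides mention a small
# minimum (ideally six) but do not list the properties themselves.
MINIMUM_PROPERTIES = ["@type", "name", "description", "url", "identifier", "keywords"]

def to_jsonld(record):
    """Add the schema.org context to a flat metadata record."""
    return {"@context": "https://schema.org", **record}

def missing_properties(doc, required=MINIMUM_PROPERTIES):
    """List required properties that are absent or empty in the document."""
    return [prop for prop in required if not doc.get(prop)]

def as_script_tag(doc):
    """Serialise the document as a JSON-LD script element for a web page."""
    body = json.dumps(doc, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"

# Hypothetical dataset landing page metadata, deliberately missing keywords.
dataset = to_jsonld({
    "@type": "Dataset",
    "name": "Example expression dataset",
    "description": "Hypothetical dataset used only to illustrate the markup.",
    "url": "https://example.org/datasets/42",
    "identifier": "https://example.org/datasets/42",
})

print("Missing minimum properties:", missing_properties(dataset))
print(as_script_tag(dataset))
```

The script element is what a harvester or search engine would read from the page, which is why no API or special feed is needed.
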
Editor's Notes

  • The additions are hidden behind these … just as important and not the same….
  • Many Princes. Scientific Data 3, Article number: 160018 (2016), doi:10.1038/sdata.2016.18
    https://www.nature.com/articles/sdata201618
  • ELIXIR, RDA
  • Child as first payment
    Be careful what you promise
  • Slide from NLM CLA
    RIN?
    CERIF, CLARIN
    me too! the elephant & blind men
  • Who are the witches and the godmothers?
    What’s the get-out clause?
  • Three – open PHACTS?
    What did we learn – much harder than you think.
  • Windsor….what did we learn?
    Distributed commons

    Dig out user numbers
  • Cliques and complementarity
    Visibility is muted.
    Licensing…

    PI leadership
    Sticking to conventions
    Local responsibility
    Time and resource
    Curation recognition

    Trust
    Tribal trading behaviours
    Enclave sharing
    Not public donation
    Reciprocity & credit

    Drivers …
    External dominate
    Personal productivity


  • Stratified to hide the visible from the invisible.

    We also have APIs, RAILS
  • Consumer – producer obligations mismatches

    Wolves: Project PIs, funders, time
    Godmothers: Project PIs, “PALs”, templates, funders

    Deferred pain

    The ant and the grasshopper

    Automate or sneak

    From the IB 13 talk and the Group 09 talk

    Active enclave sharing
    Public sharing tricky even after publication, bribery and threats
    Data Hugging, Flirting and Voyeurism

    Playground rules apply
    Fluid, transient collaborations > membership mgt pain in a*se
    Shameless exploitation of PI competitiveness & vanity
    PI & Funder leadership

    Pan project spawned collaborations – YES!!!!
    But not necessarily visible to us.
  • PALs are also the cinderellas
    The scientists’ world does not revolve around your infrastructure or agenda.
  • Bullying doesn’t work
    Fame / Shame
    Money / Burden
    Love / Fear
    Side effect / special effort
  • Templates! Spreadsheets
    spreadsheets are your friend, not Cinderellas
    Similarly on myexperiment – metadata in CWL can be extracted…
    Choice
  • Don’t necessarily interleave
  • Across platforms
  • Bechhofer, Sean, De Roure, David, Gamble, Matthew, Goble, Carole and Buchan, Iain (2010) Research Objects: Towards Exchange and Reuse of Digital Knowledge At The Future of the Web for Collaborative Science (FWCS 2010), United States.

    Sean Bechhofer, Iain Buchan, David De Roure, Paolo Missier, John Ainsworth, Jiten Bhagat, Philip Couch, Don Cruickshank, Mark Delderfield, Ian Dunlop, Matthew Gamble, Danius Michaelides, Stuart Owen, David Newman, Shoaib Sufi, Carole Goble (2013) Why linked data is not enough for scientists. Future Generation Computer Systems 29(2), 599-611. North-Holland.
  • Recording & linking together the components of an experiment
    Linking across experiments.
    Linked ROs
    Bigger on the inside than the outside
  • Predated the FAIR Principles
    Element enumeration
    Identification & citation
    Description tracking attributes (metadata) and origins (provenance) of contents.
    Simplicity - low user overhead and thin (no) client
  • RO-bagit
  • Generic tools
    multiple bespoke profiles – RDA Data Provenance approach. One for CERIF, one for DataCite

    Typing
  • HIDDEN SLIDE
    Specific to the generic
  • HIDDEN SLIDE

    Keeps the context of data content together when it is scattered; transferring and archiving very large HTS datasets in a location-independent way
    These tools combine a simple and robust method for describing data collections (BDBags), data descriptions (Research Objects), and simple persistent identifiers (Minids) to create a powerful ecosystem of tools and services for big data analysis and sharing. We present these tools and use biomedical case studies to illustrate their use for the rapid assembly, sharing, and analysis of large datasets.

  • SEAD – Jim Myers
  • Too vague and too general – needed profile lock-down
    Can’t make profiles in the abstract
  • First specifications:

    Bio data infrastructure
    Data Catalog
    Datasets

    Bio data types
    Human beacons
    Samples
    Plant Phenotypes
    Proteins
    (Chemistry)

    Bio stuff
    Training materials
    Events
    Laboratory protocols
    Workflows and Tools
  • Of course this is relevant to ROs – dataset in particular is similar to collection. An RO is a structured collection.
  • Now the most popular mechanism for publishing and harvesting metadata, beating APIs and scraping.
  • HIDDEN SLIDE
    Usecases
    Biobanks should be able to crawl the BioSamples database to identify all the published (and searchable) datasets derived from samples they have provided
    Public archives should be able to crawl Biobank websites, in order to identify samples that are known to have public accessions in the BioSamples database AND that can be made publicly available, and thereby link public samples to a provider (“where can I get more of this sample?”).  
    In case of privacy or consent considerations, only the biobank should know what are the specific samples connected to publicly available datasets
    Public archives should be able to crawl Biobank websites, in order to identify ‘sanitised’ sample metadata descriptions (again, in case of confidentiality or consent considerations).  Biobanks remain responsible for ensuring only authorised metadata is visible, and can control access to restricted samples.
    Assumptions
    Each sample provided by a biobank has an opaque pseudo-anonymous identifier that is assigned by the biobank to identify a specific sample (referred to hereafter as the “sample name”)
    Each sample reported in a public archive or used to generate a public dataset has a public, BioSamples database accession (hereafter called “sample identifier”).
    In some cases, a biobank may issue different sample identifiers when providing the same sample to different projects. This may result in duplicated sample accessions in the BioSamples database
    Given these use cases and assumptions, we will use Bioschemas to describe sample links.  The main challenge is therefore the identification of links between sample identifiers (within Biobanks) and sample accessions (from the BioSamples database).  This is not always possible without considerable additional curation effort, but of the 5 million samples in the BioSamples database, over 4 million declare either a ‘synonym’, ‘sample source name’ or ‘source name’ attribute, frequently used to encode the original biobank sample name.  Exposing these in a structured manner through the BioSamples database would allow Biobanks to crawl and analyse this content, marrying samples that are recognised with their own internal identifiers.
    Once this mapping is done, Biobanks can then re-expose these links through structured content on their own websites, allowing public resources to reciprocate links from public records back to the sample provider.
    Implementation Study Outline
    Objectives
    Facilitate the ingestion of sample metadata from data repositories (eg. Biobank databases) into registries like the BioSamples, BBMRI Biobank directory or the UKCRC Tissue Directory via Bioschemas.
    Engage and help data providers and developers of BioBank LIMS to test and adopt the exposure of sample metadata via Bioschemas
    Contribute to contextualise information from data sample registries (eg. BioSamples) and biobank sample repositories (eg. NL Biobank) and Biobank Registries (eg. BBMRI Biobank directory)
    Make registries like BioSamples compliant with Bioschemas.




    Biobanks crawl BioSamples to discover sample accessions, markup etc if they have 'known' biobank name fields.
    Sample (study) catalogues provide findability for the individual samples
    - Aligning with MIABIS Sample Donor and Sample modules

    Work with repositories/Biobanks/LIMS to adopt Bioschemas
    • Develop general crawler: in collaboration with Bioschemas community

    F2Share (Federation framework for data Sharing): https://github.com/MIABIS/logstash-configuration-generator/wiki
  • More tools needed than thought!
    14+ repositories marked up
  • HIDDEN SLIDE
    Maintain common profiles across scientific domains focused on finding and accessing data
    Minimum properties
    General best practices
    Support different scientific domains to extend and develop domain specific profiles
  • Evidence for the funders and researchers
    Focused on the technical and social, but the economic and political are critical.
  • Ecosystem

    Grassroots community activities
    Fostered by Infrastructure Initiatives
    Don’t squash the start up!
    Open standards and lightweight
    Practical engineering
    Keeping it simple and real
    Ramps rather than Revolution

    Specialist, bespoke
    Rise of containers
    Too vague and too general – needed profile lock-down
    Can’t make profiles in the abstract
  • Added afterwards….
  • Successes
    Multiple apps developed
    500+ users
    20-30 million hits a month
    Used to answer real pharmaceutical research questions
    API documentation

    Lessons
    Support the typical app developer workflow (i.e. APIs, JSON)
    Support domain specific (non-RDF) services
    Identifier equivalence is non-trivial
    Free text search is important
    Staying up-to-date with dataset updates is a challenge
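
The biobank use case in the notes above (matching a biobank's own sample names against the ‘synonym’, ‘sample source name’ or ‘source name’ attributes of public sample records) can be sketched as below. This is a minimal illustration under stated assumptions: the records are taken to be already harvested into Python dictionaries, the record layout, the accession values and the function `link_samples` are hypothetical, and no real BioSamples API is called.

```python
# Attributes the notes say are frequently used to carry the biobank sample name.
NAME_ATTRIBUTES = ("synonym", "sample source name", "source name")

def link_samples(biobank_names, harvested_records):
    """Map each known biobank sample name to the public accessions that cite it.

    One name can map to several accessions, since the notes point out that the
    same sample may end up registered more than once.
    """
    known = set(biobank_names)
    links = {}
    for record in harvested_records:
        attributes = record.get("attributes", {})
        for attribute in NAME_ATTRIBUTES:
            value = attributes.get(attribute)
            if value in known:
                links.setdefault(value, []).append(record["accession"])
                break  # count each record once, even if several attributes match
    return links

# Hypothetical harvested records; accessions and names are invented.
records = [
    {"accession": "ACC-0001", "attributes": {"synonym": "BB-17"}},
    {"accession": "ACC-0002", "attributes": {"source name": "BB-17"}},
    {"accession": "ACC-0003", "attributes": {"sample source name": "other"}},
]

print(link_samples(["BB-17", "BB-18"], records))
```

Exact string matching is the simplest possible choice here; as the notes say, real links often need additional curation effort, so a production matcher would have to handle variant spellings and ambiguous names.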
