If we build it will they come?

Keynote presentation by Professor Carole Goble at BOSC (Bioinformatics Open Source Conference), Long Beach, California, USA, July 14 2012. Co-located with ISMB (Intelligent Systems for Molecular Biology).



  • If I build it will they come? What is it we are building? Who is "they"? Who are "we"? Over the years I have built a bunch of open source software and services for researchers: the Taverna workflow system, myExperiment for workflow sharing, BioCatalogue for services, SEEK for Systems Biology data and models, and most recently MethodBox for longitudinal data sets. As well as building software we built communities: development communities and user communities. So what drives or hinders adoption? What do I know now that I wish I had known before? How do we sustain communities on time-limited grants? How do we build it so they come, stay and join in?
  • Because we don’t make any content ourselves
  • Templates, controlled vocabularies, metadata collection, components, better descriptions….
  • Organised teams (partners): planned, strong connections with resource providers and each other; structured, cross-partner sharing, retained results. Distributed groups and independent lone rangers: long tail, disconnected from data providers and each other; emergent, fluid, personal stores, small science from big. Make workflows for the group; run workflows from platforms; store and find workflows; catalogue and find services; catalogue, store and find data, SOPs and models; link stuff; release and share stuff; curate stuff; cooperate / collaborate / coordinate / co-shape. They vary on coordination, collaboration, cooperation, contribution, integration, sustainability and longevity.
  • Make workflows for the group; run workflows from platforms; store and find workflows; catalogue and find services; catalogue, store and find data, SOPs and models; link stuff; release and share stuff; curate stuff; cooperate / collaborate / coordinate / co-shape.
  • Still some people missing!
  • Knowledge Transfer Three tracks Large Team.
  • Developer and user adoption Contributed collaborative content Collaborative development
  • What is the motivation…..
  • Learning curve
  • Maybe you don’t care…. Content and Promotion matter more than software, but harder to fund and different people to software developers.
  • Incidental – not really building for adoption or for others to take up. Familial – the producer and the consumer are the same; many at BOSC are like this.
  • CLAs for set up. Remember upgrade paths. Cooperate, network effects, amplify. Self-supporting, multi-level marketing. There are no green fields.
  • Please some of the people some of the time
  • They all start off like this…
  • Getting buy-in
  • Working the first time. User experience over smart. Cool interfaces (even for plumbing).
  • Primary Community Review. Facebook generation! Community participation, sharing, commons-based production, social curation. Voluntary contribution: 1. Primary content 2. Curation duties. GeneWiki, Rfam, myExperiment, PLoS, UsefulChem, OpenWetWare. Open Science vs the Long Tail. Social networks vs the Long Tail. Incentives and obstacles. Myths and miracles. Contribution. Curation. Volunteer science.
  • Limited focus. Social networking around content. Feedback loops.
  • PAL recruitment. Content contribution. Stick: community, journal and funder mandates – there is no stick. Credit for peer review.
  • Don’t forget to make more demands though!
  • User burn-out and over-familiarity. Over-friendly: Stockhausen syndrome, absence of friendly fire; keep enemies even closer. Unadjusted over-accommodation of users: fit in at first, get buy-in, move in, move on. Drifting apart and not keeping it fresh: keep jointly working on real, concrete cases. Don’t assume they will stay: users are fickle. Step back, observe and adapt/intervene! So relieved to get a community that you forget to see what they do (e.g. dubious workflow designs). Much easier with e-Laboratory services that are inherently social collaboration spaces. Complacency: especially dangerous outside funded collaborations. Measuring impact and getting feedback: downloads ≠ useful (or usable). Don’t be prescriptive, scientists control – but actually we need to be a bit prescriptive. Danger! Going native. Missing users. Fossilisation and complacency. User experience over smart. Cool interfaces (even for plumbing). *-athons. Embedded co-working. The total problem. Replying. Eating your own dog food. Examples! Working the first time.
  • Version 2 Syndrome: being too clever, forgetting about engagement. Technical bog down and operational burn-out: fire fighting, heads down not eyes up. Little simple things that are important but don’t seem that urgent… but are the ha’p’orth of tar that sinks the ship. Major project dominance: he who pays the piper calls the tune. Non-software innovations: seek and contribute content/components and contributing partners.
  • Activation energy argument. Balance against feature creep and short-termism. Keep planning the big stuff… Balance the cost against the benefit. But hacks survive – and don’t do the strategy.
  • 58% by students, 24% unmaintained: Schultheiss et al. (2010) PLoS Comp Bio. Content and Promotion matter more than software, but harder to fund and different people to software developers. What’s your plan? Maintaining content, software, services. Different groups, evolving practices, changing times, new patterns… Funding cycles, chasms and reinventions. Reward, not hinder, adoption. Foundations, Friends and Business Models… and the Open Source Community Silver Bullet!
  • Hard to Plan….
  • When the program’s Data Management Group chair claims it’s the only data system they have used that works. To your funders. Whoo-hoo!
  • Computer Supported Cooperative Work, Team Science, Knowledge Management, Social Science, Information Science, Library Science, Digital Scholarship, Collaboratories…

Presentation Transcript

  • If we build it will they come? Prof Carole Goble FREng FBCS CITP, The University of Manchester, UK, carole.goble@manchester.ac.uk. BOSC, Long Beach, CA, USA, July 14 2012. http://www.mygrid.org.uk
  • Est. 2001. Improving Knowledge Turning, Enabling Reuse and Reproducibility [Josh Sommer]. Keep the vision, modify the plan.
  • Computational Methods (LGPL): Scientific workflows. Distributed web/grid/cloud services. Third party, independent service reuse. Data pipelines and analytics. Volunteerist Human Computation. e-Laboratories (BSD): social collaboration and sharing environments for scientific artefacts. Libraries and Catalogues. Asset safe havens, sharing, reuse. Knowledge Acquisition Tools (Various): Semantic technology, semantic applications, research objects, executable papers. OWL. Data/Metadata curation & reuse. POPULOUS, SKOSEdit.
  • The Taverna Suite of Tools. [Diagram labels: Web Portals; Workflow Repository; GUI Workbench; Client User Interfaces; Virtual Machine; Service Catalogue; Third Party Tools; Workflow Engine; Provenance; Workflow Store; Command Line; Server; Activity and Service Plug-in Manager; Open Provenance Model; Programming and Secure Service Access APIs]
  • Community Haven, Sharing Resource, Social Collaboration. http://www.myexperiment.org – 5820 members, 304 groups, 2415 workflows, 604 files and 229 packs (research objects). http://wiki.myexperiment.org/index.php/Galaxy
  • BioCatalogue: crowd curation of web services. Contribute, find and understand Web Services. Curate, review and comment. Learning resource. Monitor services. Cloud registry. 2295 REST and SOAP services, 169 service providers. 674 members, 27 countries.
  • Find experts, colleagues and peers. Find, exchange and interlink, preserve, publish data, models, publications, SOPs & analyses. ISA Compliant. SysMO: 16 consortia, 110 institutes, 1600+ assets, 350+ members. Launch and validate models and analyses: JWS Online. Gateway to public tools and resources, e.g. BioModels. GerontoSys, livSYSiPS.
  • Public http://www.seek4science.org SEEK
  • Standards & Content Sharing Platform Governance & Policy & Trusted Service Software & Tools Open source Gateway Comp Sci Research Platform Knowledge Network Preservation & Skills & Community Building Publication Platforms
  • Laissez-faire Philosophy
    • Bottom Up
      – Emergent & scruffy (to a degree…)
    • Reliant on third party contributions
      – Non-prescriptive, non-interfering and flexible
      – We make no content ourselves…
    • Part of a wider ecosystem
      – Other services, data, tools, platforms, people…
    • Inspired by social environments
    • Scarred by top-down, dictated, tech-driven and unused monoliths
  • Never underestimate how scruffy third party stuff can be. How often metadata is missing and messy if left to its own devices… Liberty through Limitations: people say they want flexibility. They prefer the simplicity of order and will adapt to adopt. http://www.flickr.com/photos/hellaoakland/3137360455/
  • Who is they?
    • Jobbing Bioinformatician?
    • Expert Bioinformatician?
    • Sys admin?
    • Service provider?
    • Application developer?
    • Tool developer?
    • Biologist?
  • Who is THEY? Drug Toxicity (OpenTox Project); Pharmacogenomics GWAS; Trypanosomiasis in African Cattle: genetic differences between breeds of cattle; The Virtual Liver; Physiopathology of the human body; Systems Biology of Micro-Organisms; Metagenomics; Medical Imaging
  • Consortia: Organised, Planned, Strong connections with resource providers and each other. Independents…: Distributed Groups & Independent Lone rangers. Long tail, Disconnected from data providers and each other, emergent. [Diagram labels: Bovine Trypanosomiasis Consortium; Research Groups; Individuals]
  • Specialise or Diversify?
    • Flexibility and extensibility -> customised Services, Cookie cutter
    • Widen adoption
    • Spread risk, extend resourcing streams
    • Cross development alignment and coordination
    • More communities to build, nurture, support and sustain
    • Core Drift and Bashing
    [Diagram labels: Software and Document Preservation; Helio-Physics; BioDiversity; Astronomy; Social Science; Engineering: JPL, NASA; FLOSS]
  • BioDiversity Virtual e-Laboratory, http://www.biovel.eu. [Architecture diagram. Recoverable labels: Biodiversity Services (Phylogenetic, Taxonomic, Visualisation, Modelling/GeoProcessing); Catalogues / Repositories; Execution environment; Provenance; WebDAV Data Management; Search; Authentication / Authorisation; Taverna Workbench; Taverna Workflow Engine and Server; Grid, Cloud, etc. Platforms; tools and services including BLAST, Hmmer, MrBayes, PAML, EMBOSS, Synonyms, BioSTIF, Google Refine, R, openModeller, CSW, WPS / WCPS]
  • Who is We? The ego-system: biologists, bioinformaticians, biodiversity informaticians, astro-informaticians, social scientists, modellers, software engineers, computer scientists, systems administrators, resource providers
  • My World: CS Research, Methods & Practice, Production Science
  • http://www.wf4ever-project.org
    • Research Objects
    • Aggregation
    • Annotation
    • Provenance
    • Lifecycle
    • Preservation
    • Decay
    • Sharing
    • Stereotypical Profiles
    • Services and APIs
    • myExperiment 2.0
    Citation, Reproducibility, Integrated Publishing, Carriers of Research Context. Encodings: Semantic Web: LOD, VoID, OAI-ORE, AO/OAC, SIOC, OPM/PROV, Memento…
  • Applications Production Publishing Training Research Community Community
  • So if we build it will they come?
    • Be useful for something: immediately, continuously, responsively
    • Be usable by somebody: user experience, worth the effort, adoption path
    • Some of the time: as part of a big picture
    • Under promise and over deliver
    • Acquire Critical Mass
  • Four things that drive adoption of software or service.
    1. Added value – Do something that you couldn’t do before or now do faster, gain competitive advantage, improve productivity, scale up
    2. New asset – Get or retain access to something important (data, method, technique, skills, knowledge)
    3. Keep up with the field. A Community. – Future-proof my practice, new skills and capacity, there is a vibe about it and I’ll be left out
    4. Because there is no choice – Business depends on it, it’s mandated, it’s de facto mandated
  • Seven things that hinder adoption of software or service
    1. Not enough added value. It Sucks
      • It doesn’t solve a problem, or not as well or as cheaply as something else; no content, or not the right content
    2. Not fit for take-on. It doesn’t work!
      • No: help, guides, documentation, manuals, examples, content, templates, portability, migration / legacy support, easy installation, virtual machines, testing, stability, version control, release cycle, roadmap, sustainability prospect, way of introducing my favourite component/data/environment.
    3. No Time or Capacity to take on
      • To learn, migrate personal legacy code/data/applications, no pathway/ramp to adoption
      • Training and special system needs
  • Software practices. Zeeya Merali, Nature 467, 775-777 (2010) | doi:10.1038/467775a. Computational science: ...Error… why scientific programming does not compute. “As a general rule, researchers do not test or document their programs rigorously, and they rarely release their codes, making it almost impossible to reproduce and verify published results generated by scientific software”
  • Software Stewardship. “Better Science through Superior Software” – C Titus Brown
    • Software sustainability
    • Software practices
    • Software deposition
    • Long term access to software
    • Credit for software
    • Licensing advice
    • Open licenses
    Reproducible Research Standard, Victoria Stodden, Intl J Comm Law & Policy, 13 2009
  • Seven things that hinder adoption of software or service
    1. Cost – Of disruption, of long-term ownership – It’s too costly
    2. Exposure to Risk – First to take-up, support and sustainability dependencies, fear of scrutiny, misrepresentation or being scooped
    3. No Community – Support and comfort
    4. Changes to work practices – Obligations, unclear or unenforced reciprocity protocols.
  • Betamax vs VHS
    • It sucks but it’s the only thing around
    • It’s ace but it’s one of many, too late in the game and not enough to switch
    • Tipping point is likely not technical
  • Bonus Hinder: Never heard of it. We’ve built it but we haven’t told anyone.
    • Make noise… physically and virtually
    • Customer and Contributor Relationship Building
    • Self-supporting communities, multi-level marketing
    • Highly Resource Intensive
  • Bonus Hinder: Never heard of it. We’ve built it but we haven’t told anyone. [Diagram labels: Market; User Community; Development; Developer Community; It all kicks off]
  • Adoption Intentions. Be careful what you wish for
    • Incidental – “I built it for myself, and stuck it out there”
    • Familial – “I built it for people just like me”
    • Fundamental – “I built it for others, many who are not like me”
  • Open Innovation: Development and Content. You are not alone. You can’t do it all alone. Motivate & enable others to fill gaps. “App Store Style” software, services, content, examples…
    • Really Interoperate. Don’t tweak.
    • Be Simple and Standard.
    • Be Helpful. Be Set up. Be reusable. Be Smart.
    • Others will develop on top of you. But don’t assume they will re-contribute or tell you.
    • It’s much harder than you think.
    • It’s unequal.
    [Diagram labels: Family; Friends; Acquaintances; Strangers; Galaxy + Taverna/myExperiment]
  • Ladder Model of OSS Adoption (adapted from Carbone P., Value Derived from Open Source is a Function of Maturity Levels): Strangers, Acquaintances, Friends, Family. Moore’s technology adoption curve [FLOSS@Syracuse]
  • "it’s better, initially, to make a small number of users really love you than a large number kind of like you" Paul Buchheit, paulbuchheit.blogspot.com
  • PALS: Building Friendships. Intelligence, Guidance, Advocacy, Evangelism, Market Research
    • What’s in it for the PAL?
      – Long tail: Money, kudos, special support, special resources, skills, reputation building, influence, stuff they can’t do alone, CV building
      – Consortia: co-funded
    • Who is a PAL?
      – Post-docs, Post-grads, Administrators, Developers
      – PI: protector/champion
    • PAL handlers
      – Customer Relationship Manager, Nanny and Mediator, Scientist
  • Do not under-estimate… The power of the sprint / *-athon / fest / drinking. The power of a whizzy interface. Even for plumbing. The importance of supporting and propagating best practice.
  • Participatory, Embedded Design-Build-Run-Manage is Good. Act Local, Think Global. Reality Check. Eat your own Dog Food. The Bigger Picture.
  • Participatory Design: Work Together on a Real Problem. [Diagram columns: Funders; Project PIs; PALs. Recoverable labels: Data sharing; Data standards; A database; Long term preservation; Data control; Own databases; Visibility; Project dependence; Safe Haven; Project independence; Spreadsheets; Yellow Pages; Just enough exchange; SOPs; Understanding limitations; standards; Curating; Examples.] 3 years later, 15/16 consortia abandoned their own systems and went with the SEEK system.
  • If you build it will they come and contribute?
  • Participation: Cooperation? Coordination? Collaboration? Integration? Evolution and entropy models. [Diagram, based on an idea by Liz Lyon. Access: Closed, Controlled, Open. Participants: Lone scholars; Private Groups; Trusted Collaborators; Public scientists; Citizens]
  • Critical mass spiral: 90:9:1. Driven by needs of and benefits to the scientist, rather than top down policies. Content tipping point [Andrew Su]
  • Trust, Fame and Blame: Reciprocity, Competition, Contribution and Use
    • Scooping, Scrutiny and Misinterpretation
    • Curation Cost
    • Poor quality
    • Reputation / Asset Economics
    • Public Peer Pressure
    Reciprocity Sucks
    • Flirting
    • Hugging
    • Controlled Sharing
    • Voyeurism
    • Poor feedback / credit
    Nature 461, 145 (10 September 2009). Victoria Stodden, The Scientific Method in Practice: Reproducibility in the Computational Sciences, Feb 9, 2010, MIT Sloan Research Paper No. 4773-10, http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1550193
  • Harness Competitiveness: Carrots
    • Pride
      – Reputation: Cult, Credit & Attribution for all
    • Protection
      – Just enough Sharing, Licensing & Liability
      – Quality, Peer review, Metadata
    • Preservation
      – Safe havens and Sunsets (project churn)
    • Publishing / Release
      – Citability, Supporting Exchange
    • Productivity
      – Availability of assets, help, capability, ramps
  • Sticks? Community, Journal and Funder mandates. There are very few real sticks.
  • Adoption Ramps. http://www.rightfield.org.uk Instrument familiar, widely-used tools: Spreadsheets and Email
  • Adoption Stealth
    • Data at home promise with automated harvesting
    • Sharing creep, Incremental metadata, Low obligations
    • URL upload in BioCatalogue
    • Web Service “come as you are” take-on in Taverna
    • Metadata prompting, Right tools, right time, right place
    • Service collections & Packaged services
  • Be vigilant
    • PAL burn-out and over familiarity
    • Unadjusted over-user accommodation
    • Drifting apart and not keeping it fresh
    • Step back, observe and adapt/intervene!
    • So relieved to get a community…
    • Instrument adoption and observation
    Participatory Development is a mutual long term relationship. Not flirty speed dating, One night stand, Crush, Me Me Me
  • Urgent-Important
    • Technical bog down, operational burn-out
    • Little things that are important but don’t seem that urgent…
    • Dominant projects
    • Not-software content
    • It all takes way longer than you think
    • Simplicity drift
    Participatory Development is a mutual long term relationship. Not flirty speed dating, One night stand, Crush, Me Me Me
  • Beware Version 2 Syndrome!
  • The Jam-based Adoption Model, aka Added Value, Value Proposition, Return On Investment. http://delicious-cooks.com/photos/raspberry-jam/04/
  • What is the Special Jam? What is your Jam Value Chain and for Who?
    • What: SysMO: safe haven, spreadsheet tooling, linking SOPs, models and data, examples. Taverna: power, adaptability and myExperiment
    • Who: Focused on contributors and experts. Provider-consumer balance. Functionality-Simplicity Syndrome. Changing Who - Challenging baked-ins
  • Jam today and more, better Jam tomorrow. Just Enough Jam, Just in Time not Just in Case
    • Feature Creep Conundrum
    • Big Picture Paradox
    • Core vs Specifics Syndrome
    • Content Decay Dilemma
    • Working to working Stability Stress
  • Customised Specific Jam beats Generic
    • Flexibility/Functionality – Simplicity Conundrum
    • Diversification Dilemma
  • Where is my Jam? Jam for All
    • What are WE (platform providers, Software builders, Community builders and Service providers) getting out of it?
    • Need credit and interest too.
    • Altmetrics
    Howison and Herbsleb, Scientific Software Production: Incentives and Collaboration, CSCW 2011, March 19–23, 2011, Hangzhou, China. http://james.howison.name/pubs/HowisonHerbsleb2011SciSoftIncentives.pdf
    http://www.gettyimages.co.uk/detail/photo/empty-jam-jar-royalty-free-image/136976198
  • Jam forever. They came. Have the evidence. Have a plan. Did you wish for this? Do you want it?
    • Fragile Flux
      – Content, services, bits, communities
    • Funding Plan
      – Novelty over sustainability
      – Research-Production Falsehoods
      – Wave invention, Political lobbying
    • Securing the community
      – Leadership & Foundations
    • Business model???
    Software is Free like Puppies Are Free
  • Jam not forever
    • Acquire
    • Retain
    • Widen – More/Different
    • Reposition – Different/New Stage
    • Changing Community is Challenging… [Daron Green]
  • Adoption is a Merry-Go-Round. The Social and the Technical are Inseparable.
  • You know they came when…
    … you were useful and usable to someone some of the time, but they might not tell you.
    … people ask you to join their consortia or use it.
    … they gave up their own home grown stuff for yours.
    … someone you don’t know uses it and tells you all about your own stuff.
    … someone publishes papers about it. Without citing you.
    … someone else claims credit.
    … people you don’t know start bitching about it.
    … it’s just expected to be there and you are kind of expected to be there too.
    … your Head of School complains you don’t do enough CS research because you are doing too much Software Engineering and Support.
  • Acknowledgements (1): James Howison, Heather Piwowar, Victoria Stodden, Janet Vertesi, Christine Borgman, Nosh Contractor, Jay Liebowitz, Robert Kraut
  • Acknowledgements (2)
    • The myGrid family, friends and contributors
    • But especially: Katy Wolstencroft, David Withers, Marco Roos, Alan Williams, Jits Bhagat, Stuart Owen, Stian Soiland-Reyes, Shoab Sufi, Robert Stevens, Paul Fisher, Peter Li, Ian Dunlop, Finn Bacall, Mannie Tags, Niall Beard, Rob Haines, Christian Brenninkmeijer, Alasdair Gray, Tim Clark, Pinar Alper, Paolo Missier, Khalid Belhajjame, Duncan Hull, Sean Bechhofer, David De Roure, Don Cruickshank, Wolfgang Mueller, Olga Krebs, Franco Du Preez, Quyen Nguyen, Jacky Snoep.
    • The members of Wf4ever, SysMO, BioVeL, HELIO, SCAPE, OMII, SSI, NeiSS, Obesity e-Lab and anyone else I forgot
  • Further Information
    • myGrid – http://www.mygrid.org.uk
    • Taverna – http://www.taverna.org.uk
    • myExperiment – http://www.myexperiment.org
    • BioCatalogue – http://www.biocatalogue.org
    • SysMO-SEEK – http://www.sysmo-db.org
    • MethodBox – http://www.methodbox.org.uk
    • Rightfield – http://www.rightfield.org.uk
    • Wf4ever – http://www.wf4ever-project.org
    • BioVeL – http://www.biovel.eu
    • Software Sustainability Institute – http://www.software.ac.uk
    • Software Carpentry – http://software-carpentry.org/
  • SUMMARY (De Roure and Goble, IEEE Software 2009)
    • Keep your Friends Close
    • Fit in
    • Embed
    • Favours will Favour you
    • Jam Today, Jam Tomorrow
    • Act Local, Think Global
    • Just Enough, Just in Time
    • Design for Network Effects
    • Know your Users
    • Anticipate Change
    • Enable Users to Add Value
    • Keep Sight of the Bigger Picture
    [Diagram labels: Coalface users; Patrons; Skeptic; Champions; Friends and Family; End Users; Developers; Service Providers; System Administrators]