Keynote presentation by Professor Carole Goble at BOSC (Bioinformatics Open Source Conference), Long Beach, California, USA, July 14, 2012. Co-located with ISMB (Intelligent Systems in Molecular Biology).
myExperiment and the Rise of Social Machines (David De Roure)
Talk at hubbub 2012, Indianapolis, 25 September 2012. The talk introduces myExperiment and Wf4Ever, discusses the future of research communication including FORCE11, and introduces the SOCIAM project (Theory and Practice of Social Machines) which launches in October 2012.
The digital universe is booming, especially metadata and user-generated data. This raises strong challenges: identifying the portions of data relevant to a particular problem and dealing with the lifecycle of data. Finer-grained problems include data evolution and the potential impact of change on the applications relying on the data, causing decay. The management of scientific data is especially sensitive to this. We present the Research Objects concept as a means to identify and structure relevant data in scientific domains, treating data as first-class citizens. We also identify and formally represent the main causes of decay in this domain and propose methods and tools for their diagnosis and repair, based on provenance information. Finally, we discuss the application of these concepts to the broader domain of the Web of Data: Data with a Purpose.
This document discusses critical literacies and new technologies. It defines key concepts like literacy and new literacies. It describes characteristics of Web 2.0 technologies like user-generated content and social mediation. The document maps various digital skills frameworks to pedagogical approaches and proposes developing literacy skills through a reflective, design-based approach that encourages learning with and through others using visualization and new metaphors.
RDAP13 Mark Leggott: Stewarding research data using the Islandora framework (ASIS&T)
Mark Leggott, University of PEI/DiscoveryGarden
Islandora: Stewarding research data using the Islandora framework
Mark Leggott, Thornton Staples and Kathleen Van Ekris
Panel: Global scientific data infrastructure
Research Data Access & Preservation Summit 2013
Baltimore, MD April 4, 2013 #rdap13
Understanding Research 2.0 from a Socio-technical Perspective (Yuwei Lin)
This document discusses Research 2.0 from a socio-technical perspective. It outlines key concepts of Web 2.0 like blogging, social networking, and wikis. It also discusses O'Reilly's design patterns for Web 2.0 and De Roure and Goble's six principles for software design. The document examines challenges in developing Research 2.0 environments like involving users and addressing ethical and legal issues. It argues a socio-technical approach is needed to develop Research 2.0 that considers both technological and social aspects.
The document summarizes a presentation by the UC Curation Center on supporting UC research data management. The UC Curation Center helps ensure the long-term preservation of and access to UC's digital research outputs. They are developing tools and services to help researchers at all stages of the research lifecycle, from creating data management plans and collecting datasets to publishing, preserving, and sharing research outputs. Their goal is to engage researchers early, prioritize initiatives, provide simple evolving tools, deploy flexible infrastructure, and develop partnerships to support diverse research data management needs.
The document discusses the challenges of cataloging and metadata today and in the future, including changes in technology, user behavior, and the types of information objects that need to be described. It provides biographical information about Hendro Wicaksono and his experience working in libraries and developing cataloging systems. The document touches on the evolving nature of libraries, catalogs, metadata standards, and the tasks and skills needed for cataloging in the digital age.
This document summarizes key aspects of computational research methods and the myExperiment platform. It discusses how myExperiment allows researchers to automate, share, and reuse workflows and other methods. It also addresses challenges around reproducibility, provenance, collaboration, and incentives for sharing methods. MyExperiment provides social features and aims to build a community around openly exchanging and improving computational research techniques.
Do Libraries Meet Research 2.0 : collaborative tools and relevance for Resear... (Guus van den Brekel)
Presentation at the LIBER Conference 2009, Toulouse, June 30, 2009.
http://liber2009.biu-toulouse.fr/
Research Libraries & Web 2.0: scientists are engaging in science and Research 2.0, and libraries should follow by reaching out, engaging, exploring, and facilitating.
BioCatalogue talk by Carole Goble. In these slides she outlines the reasons behind the BioCatalogue project and presents the BioCatalogue and its goals.
myExperiment - Defining the Social Virtual Research Environment (David De Roure)
myExperiment is a social networking site for scientists to share workflows, data, and other research objects. It allows users to create profiles, join groups, and share content while maintaining control over privacy. The site aims to facilitate collaboration and reuse in scientific research. It was launched in 2007 and has over 1000 registered users sharing hundreds of workflows and other research objects. The open source software powering the site can also be downloaded and customized for specific communities or projects.
BioCatalogue DILS & Enfin 2009 by Jits (BioCatalogue)
Jiten Bhagat presented on the BioCatalogue, a public and curated catalogue of life science web services. The BioCatalogue allows anyone to register, discover, and curate web services. It provides rich metadata for over 3,000 publicly available life science web services. The BioCatalogue aims to make these services easier to find, advertise, understand, and use for various stakeholders like service providers, users, and curators. It utilizes a community-driven model where experts oversee annotation from both automated sources and crowdsourced user contributions.
The LIBER 2009 presentation repeated for a Dutch audience, delivered in Dutch but with the English slides (only the first slide is in Dutch).
Samenwerking Hogeschool Bibliotheken (SHB, the Dutch cooperative of university of applied sciences libraries), 5 November 2009.
MyExperiment.org is a social networking site and marketplace aimed at scientists who use workflows and services for their research. It allows users to publish, discover, share, and reuse experimental artifacts like workflows. The site aims to make these tools easy to use with a familiar social media-style interface. Key goals include crossing boundaries between individual experiments, disciplines, and systems to facilitate collaboration and intellectual fusion. Challenges include addressing issues around user incentives, metadata, provenance, intellectual property, and quality control as experiments are shared in an open yet curated environment.
CNI Fall 2011 Meeting Presentation, Margaret Hedstrom & Robert McDonald (Dec. ...) (SEAD)
SEAD is a new NSF-funded project that aims to provide sustainable data services for sustainability science research. It will integrate existing technologies and tools to address the needs of researchers working on "long tail" sustainability problems. SEAD is in its initial phase of developing prototypes and will not be ready to accept data until after October 2012. It is a collaboration between researchers at the University of Michigan, Indiana University, University of Illinois, and Rensselaer Polytechnic Institute.
These are the slides of Robert H. McDonald for the Future Trends Panel presentation at the Inter-institutional Approaches to Supporting Scholarly Communication Symposium held on August 16, 2012 at the Georgia Institute of Technology.
Where are we going and how are we going to get there? (David De Roure)
The document discusses the myExperiment virtual research environment for sharing workflows. Some key points:
1. myExperiment is a social network and repository for research workflows and methods. It currently has over 1800 users and hundreds of shared workflows.
2. The site allows fine-grained privacy controls, grouping of related content into "packs", and integration with other systems through federation.
3. Analysis found that most workflows and other content are shared publicly, and some users actively build upon other users' shared workflows. The most viewed workflow has over 1500 views.
4. The principles behind myExperiment's design focus on empowering scientists by enabling new forms of collaboration and sharing without forcing changes to workflows.
OSFair2017 Workshop | Building a global knowledge commons - ramping up reposi... (Open Science Fair)
Eloy Rodrigues, Petr Knoth & Kathleen Shearer showcase the conceptual model for this vision, as well as the role and functions of repositories within this model.
Workshop title: Building a global knowledge commons - ramping up repositories to support widespread change in the ecosystem
Workshop abstract:
The extensive international deployment of repository systems in higher education and research institutions, as well as scholarly communities, provides the foundation for a distributed, globally networked infrastructure for scholarly communication. This distributed network of repositories can and should be a powerful tool to promote the transformation of the scholarly communication ecosystem. However, repository platforms are still using technologies and protocols designed almost twenty years ago, before the boom of the web and the dominance of Google, social networking, semantic web and ubiquitous mobile devices. In April 2016, the Confederation of Open Access Repositories (COAR) launched a working group to help identify new functionalities and technologies for repositories and develop a road map for their adoption. For the past several months, the group has been working to define a vision for repositories and sketch out the priority user stories and scenarios that will help guide the development of new functionalities. The results of this work will be available in the summer of 2017.
This workshop will present the functionalities and technologies for the next generation of repositories and reflect on how these functionalities will be adopted into the existing software platforms. In addition, participants will discuss the important implications for the network layers, and how repositories will uniformly interact with the networks to provide value added services on top of their content.
DAY 3 - PARALLEL SESSION 6 & 7
http://www.opensciencefair.eu/workshops/parallel-day-3-1/building-a-global-knowledge-commons-ramping-up-repositories-to-support-widespread-change-in-the-ecosystem
This talk introduces Linked Data and the Semantic Web using two examples: a population sciences grid and semantAqua, a semantically enabled environmental monitoring system. It shows a few tools and the semantic methodology, and opens a discussion on LOD and team science.
Presentation at EMTACL10, http://www.ntnu.no/ub/emtacl/
Guus van den Brekel
Central medical library, UMCG
Virtual Research Networks: towards Research 2.0
In the next few years, the further development of social, educational and research networks – with their extensive collaborative possibilities – will dictate how users search for, manage and exchange information. The network, enabled by technology, is changing users' behaviour, and that will affect the future of information services. Many envision a possible leading role for libraries in collaboration and community-building services.
Users are not only heavily using new tools, but are also creating and shaping their own preferred tools.
Today's students are incorporating Web 2.0 skills in daily life, in their social and learning environments.
Tomorrow's research staff will expect to be able to use their preferred tools and resources within their work environment.
Today's and tomorrow's libraries should support students and staff in the learning and research process by integrating library services and resources into their environments.
This document discusses using linked open data and semantic technologies to support next generation science. It provides background on the increasing availability of open data and opportunities for citizen science contributions. Semantic technologies can help integrate and link diverse scientific data sources. Linked data principles allow disparate datasets to be connected through shared identifiers and relationships. Examples are provided of existing projects that use semantic approaches to enable scientific data discovery, analysis and collaboration across domains like population health, water quality monitoring and climate change. Overall, the document argues that semantic technologies are mature and can help scientists address large, distributed problems by facilitating data integration and knowledge sharing.
Lynch & Dirks - Platforms for Open Research - Charleston Conference 2011 (Lee Dirks)
The document summarizes Microsoft's efforts in collaborating with various organizations to promote innovation in scholarly communication. It discusses projects such as VIVO for connecting researchers, ORCID for unique researcher IDs, DataVerse for data sharing, DataCite for data citation, Total Impact for measuring research impact, DuraCloud for data storage and preservation, and Microsoft Academic Search for discovery. The goal is to help solve problems across the scholarly communication lifecycle from data collection and authoring to publication, discovery and preservation.
Knowledge Base+: a Cloud-Based Community Knowledge Base (sherif user group)
Knowledge Base+: A cloud-based community knowledge base by Ben Showers, JISC. Presentation at the JIBS User Group Workshop and AGM Back to the Future and Into the Cloud, 24 February 2012, School of Oriental and African Studies, London.
The ELIXIR FAIR Knowledge Ecosystem for practical know-how: RDMkit and FAIRCo... (Carole Goble)
Presented at the FAIR Data in Practice Symposium, 16 May 2023 at BioITWorld Boston. https://www.bio-itworldexpo.com/fair-data. The ELIXIR European research infrastructure for life science data is an inter-governmental organization coordinating, integrating and sustaining FAIR data and software resources across its 23 nations. To advise users, data stewards, project managers and service providers, ELIXIR has developed complementary community-driven, open knowledge resources for guiding FAIR Research Data Management (RDMkit) and providing FAIRification recipes (FAIRCookbook). More than 150 people have contributed content so far, including representatives of the pharmaceutical industry.
Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science, a Digital Research... (Carole Goble)
Invited talk, PHIL_OS, March 30-31 2023, Exeter
https://opensciencestudies.eu/whither-open-science. Includes hidden slides.
FAIR and Open Science need Digital Research Infrastructure, which is a federated system of systems and needs funding models that are fit for purpose.
Culture change is needed in paying for Open Science’s infrastructure, and funding support for data-driven research needs more reality and less rhetoric.
RO-Crate: packaging metadata love notes into FAIR Digital Objects (Carole Goble)
Abstract
slides available at: https://zenodo.org/record/7147703#.Y7agoxXP2F4
The Helmholtz Metadata Collaboration aims to make the research data [and software] produced by Helmholtz Centres FAIR for their own and the wider science community by means of metadata enrichment [1]. Why metadata enrichment and why FAIR? Because the whole scientific enterprise depends on a cycle of finding, exchanging, understanding, validating (reproducing), integrating and reusing research entities across a dispersed community of researchers.
Metadata is not just “a love note to the future” [2], it is a love note to today’s collaborators and peers. Moreover, a FAIR Commons must cater for the metadata of all the entities of research – data, software, workflows, protocols, instruments, geo-spatial locations, specimens, samples, people (as well as traditional articles) – and their interconnectivity. That is a lot of metadata love notes to manage, bundle up and move around. Notes written in different languages at different times by different folks, produced and hosted by different platforms, yet referring to each other, and building an integrated picture of a multi-part and multi-party investigation. We need a crate!
RO-Crate [3] is an open, community-driven, and lightweight approach to packaging research entities along with their metadata in a machine-readable manner. Following key principles - “just enough” and “developer and legacy friendliness” - RO-Crate simplifies the process of making research outputs FAIR while also enhancing research reproducibility and citability. As a self-describing and unbounded “metadata middleware” framework, RO-Crate shows that a little bit of packaging goes a long way to realise the goals of FAIR Digital Objects (FDO) [4], and to not just overcome platform diversity but celebrate it, while retaining the contextual integrity of the investigation.
In this talk I will present the why, and how Research Object packaging eases Metadata Collaboration using examples in big data and mixed object exchange, mixed object archiving and publishing, mass citation, and reproducibility. Some examples come from the HMC, others from EOSC, USA and Australia, and from different disciplines.
Metadata is a love note to the future, RO-Crate is the delivery package.
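To make the packaging idea concrete: an RO-Crate is simply a directory containing an ro-crate-metadata.json file, a JSON-LD graph describing the packaged entities and their relationships. The sketch below builds a hypothetical minimal crate by hand using only the Python standard library (the dataset name and file are illustrative examples, not from any real crate):

```python
import json

# A minimal RO-Crate 1.1 metadata file: a JSON-LD graph with a
# metadata descriptor, a root dataset, and one packaged file.
# Names and descriptions here are hypothetical examples.
crate = {
    "@context": "https://w3id.org/ro/crate/1.1/context",
    "@graph": [
        {   # the metadata descriptor points at the root dataset
            "@id": "ro-crate-metadata.json",
            "@type": "CreativeWork",
            "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
            "about": {"@id": "./"},
        },
        {   # the root dataset lists the crate's contents
            "@id": "./",
            "@type": "Dataset",
            "name": "Example analysis bundle",
            "hasPart": [{"@id": "results.csv"}],
        },
        {   # one data entity, carrying its own contextual metadata
            "@id": "results.csv",
            "@type": "File",
            "encodingFormat": "text/csv",
        },
    ],
}

with open("ro-crate-metadata.json", "w") as f:
    json.dump(crate, f, indent=2)
```

Because the crate is plain JSON-LD in a plain directory, any platform that can read files can consume it - which is the "legacy friendliness" the abstract above refers to.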
[1] https://helmholtz-metadaten.de/en
[2] Scott, Jason The Metadata Mania, http://ascii.textfiles.com/archives/3181, June 2011
[3] Soiland-Reyes, Stian et al. “Packaging Research Artefacts with RO-Crate”. Data Science, 2022; 5(2):97-138, DOI: 10.3233/DS-210053
[4] De Smedt K, Koureas D, Wittenburg P. “FAIR Digital Objects for Science: From Data Pieces to Actionable Knowledge Units”. Publications. 2020; 8(2):21. https://doi.org/10.3390/publications8020021
Research Software Sustainability takes a Village – Carole Goble
1. Research software sustainability requires communities to support development and maintenance over time.
2. Strong communities cultivate relationships between developers, users, and other stakeholders to establish trust and shared responsibility for software.
3. Maintaining communities requires ongoing efforts like change management, skills development, and cultivating relationships that span organizational boundaries. Funders can support these community efforts.
“Bioscience has emerged as a data-rich discipline, in a transformation that is spreading as widely now as molecular biology in the twentieth century. We look forward to supporting new research careers, where data are valued and shared widely, where new software is a natural part of Biology, and where re-analysis and modelling are as creative as experimentation in understanding the rules of life and their applications.” Prof Andrew Millar FRS, chair Expert Group UKRI-BBSRC Review of data-intensive bioscience 2020.
Indeed - biomedical science is knowledge work and knowledge turning - the turning of observation and hypothesis through experimentation, comparison, and analysis into new, pooled knowledge. Turns depend on the FAIR and Open flow and availability of data and methods for automated processing and reproducible results, and on a society of scientists coordinating and collaborating.
For the past 25 years I have worked on the social and technical challenges in digital infrastructure to support scientific collaboration, data and method sharing, and automated scientific processing. The big ideas I have been instrumental in – sharing and publishing high-quality computational workflows, semantic web technologies in bioscience, ecosystems of Research Objects as the currency of scholarly knowledge, the FAIR data principles – preached revolution to inspire, but need nudges* to get traction.
I’ll talk about making good on Andrew’s quote: what I’m doing to nudge and where we need to do more. I’ll also talk about my experiences as a woman in digital infrastructure and computer science over the past 40 years – and some nudging is needed there too.
*Thaler RH, Sunstein CR (2008) Nudge: Improving Decisions about Health, Wealth, and Happiness. Yale University Press. ISBN 978-0-14-311526-7. OCLC 791403664.
https://www.bsc.es/research-and-development/research-seminars/hybrid-bsc-rslife-sessionbioinfo4women-seminar-love-money-fame-nudge-enabling-data-intensive
This document discusses FAIR computational workflows and why they are important. It defines computational workflows as multi-step processes for data analysis and simulation that link computational steps and handle data and processing dependencies. Workflows improve reproducibility, enable automation, and allow for increased sharing and reuse of research. The document outlines how applying FAIR principles to workflows makes them findable, accessible, interoperable, and reusable. This includes using standardized metadata, identifiers, licensing, and formats to describe workflows and ensure their components and data are also FAIR. Adopting FAIR workflows requires support from workflow systems, tools, communities and services.
Open Research: Manchester leading and learning – Carole Goble
Open and FAIR science has international momentum. Large-scale communities are striving to build and manage the digital infrastructure needed for scientists to be as open as possible and as closed as necessary, as expected by the NIH, OECD, UNESCO and the EC. ELIXIR is such a research infrastructure for the Life Sciences in Europe. This talk will highlight two of ELIXIR's Open Science resources, built by Open Science communities to enable life science researchers to be open, and led by Manchester. How can we learn from these and bring their practices to Manchester?
Launch: Manchester Office for Open Research, 4th April 2022
https://www.openresearch.manchester.ac.uk/
RDMkit, a Research Data Management Toolkit. Built by the Community for the ... – Carole Goble
https://datascience.nih.gov/news/march-data-sharing-and-reuse-seminar 11 March 2022
Starting in 2023, the US National Institutes of Health (NIH) will require institutes and researchers receiving funding to include a Data Management Plan (DMP) in their grant applications, including making their data publicly available. Similar mandates are already in place in Europe; for example, a DMP is mandatory in Horizon Europe projects involving data.
Policy is one thing - practice is quite another. How do we provide the necessary information, guidance and advice for our bioscientists, researchers, data stewards and project managers? There are numerous repositories and standards. Which is best? What are the challenges at each step of the data lifecycle? How should different types of data be handled? What tools are available? Research Data Management advice is often too general to be useful, and specific information is fragmented and hard to find.
ELIXIR, the pan-national European Research Infrastructure for Life Science data, aims to enable research projects to operate “FAIR data first”. ELIXIR supports researchers across their whole RDM lifecycle, navigating the complexity of a data ecosystem that bridges from local cyberinfrastructures to pan-national archives and across bio-domains.
The ELIXIR RDMkit (https://rdmkit.elixir-europe.org (link is external)) is a toolkit built by the biosciences community, for the biosciences community to provide the RDM information they need. It is a framework for advice and best practice for RDM and acts as a hub of RDM information, with links to tool registries, training materials, standards, and databases, and to services that offer deeper knowledge for DMP planning and FAIR-ification practices.
Since its launch in March 2021, over 120 contributors have provided nearly 100 pages of content and links to more than 300 tools. Content covers the data lifecycle and specialized domains in biology, national considerations and examples of “tool assemblies” developed to support RDM. It has been accessed from over 123 countries, and at the top of the access list is … the United States.
The RDMkit is already a recommended resource of the European Commission. The platform, editorial, and contributor methods helped build a specialized sister toolkit for infectious diseases as part of the recently launched BY-COVID project. The toolkit’s platform is the simplest we could manage - built on plain GitHub - and the whole development and contribution approach tailored to be as lightweight and sustainable as possible.
In this talk, Carole and Frederik will present the RDMkit; aims and context, content, community management, how folks can contribute, and our future plans and potential prospects for trans-Atlantic cooperation.
Data policy must be partnered with data practice. Our researchers need to be the best informed in order to meet these new data management and data sharing mandates.
This document discusses computational workflows and FAIR principles. It begins by providing background on computational workflows and their increasing importance. It then discusses challenges around finding, accessing, and sharing workflows. Next, it explores how applying FAIR principles to workflows could help address these challenges by making workflows and their associated objects findable, accessible, interoperable, and reusable. This includes discussing applying metadata standards, using persistent identifiers, and developing principles for FAIR workflows and FAIR software. The document concludes by examining the roles and responsibilities of different stakeholders in working towards FAIR workflows.
presentation at https://researchsoft.github.io/FAIReScience/, FAIReScience 2021 online workshop
virtually co-located with the 17th IEEE International Conference on eScience (eScience 2021)
German Conference on Bioinformatics 2021
https://gcb2021.de/
FAIR Computational Workflows
Computational workflows capture precise descriptions of the steps and data dependencies needed to carry out computational data pipelines, analysis and simulations in many areas of Science, including the Life Sciences. The use of computational workflows to manage these multi-step computational processes has accelerated in the past few years driven by the need for scalable data processing, the exchange of processing know-how, and the desire for more reproducible (or at least transparent) and quality assured processing methods. The SARS-CoV-2 pandemic has significantly highlighted the value of workflows.
This increased interest in workflows has been matched by the number of workflow management systems available to scientists (Galaxy, Snakemake, Nextflow and 270+ more) and the number of workflow services like registries and monitors. There is also recognition that workflows are first-class, publishable Research Objects just as data are. They deserve their own FAIR (Findable, Accessible, Interoperable, Reusable) principles and services that cater for their dual roles as explicit method description and software method execution [1]. To promote long-term usability and uptake by the scientific community, workflows (as well as the tools that integrate them) should become FAIR+R(eproducible), and citable so that authors’ credit is attributed fairly and accurately.
The work on improving the FAIRness of workflows has already started and a whole ecosystem of tools, guidelines and best practices has been under development to reduce the time needed to adapt, reuse and extend existing scientific workflows. An example is the EOSC-Life Cluster of 13 European Biomedical Research Infrastructures which is developing a FAIR Workflow Collaboratory based on the ELIXIR Research Infrastructure for Life Science Data Tools ecosystem. While there are many tools for addressing different aspects of FAIR workflows, many challenges remain for describing, annotating, and exposing scientific workflows so that they can be found, understood and reused by other scientists.
This keynote will explore the FAIR principles for computational workflows in the Life Sciences, using the EOSC-Life Workflow Collaboratory as an example.
[1] Carole Goble, Sarah Cohen-Boulakia, Stian Soiland-Reyes, Daniel Garijo, Yolanda Gil, Michael R. Crusoe, Kristian Peters, and Daniel Schober, “FAIR Computational Workflows”, Data Intelligence 2020; 2:1-2, 108-121. https://doi.org/10.1162/dint_a_00033
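The abstract above describes workflows as multi-step processes whose steps are linked by data dependencies, run in an order the engine derives from those dependencies. A toy sketch of that idea (not any particular workflow system - step names and functions are illustrative only):

```python
# Toy workflow: steps declare the inputs they depend on, and the
# execution order is derived from the dependency graph, exactly as
# a workflow engine would resolve it.
from graphlib import TopologicalSorter

def fetch():        return "raw data"
def clean(raw):     return raw.upper()
def analyse(data):  return f"summary of {data}"

steps = {
    "fetch":   (fetch,   []),            # no dependencies
    "clean":   (clean,   ["fetch"]),     # consumes fetch's output
    "analyse": (analyse, ["clean"]),     # consumes clean's output
}

# Topologically sort the dependency graph, then run each step,
# feeding it the recorded outputs of the steps it depends on.
order = TopologicalSorter({name: deps for name, (_, deps) in steps.items()})
results = {}
for name in order.static_order():
    func, deps = steps[name]
    results[name] = func(*(results[d] for d in deps))

print(results["analyse"])  # -> summary of RAW DATA
```

Real systems like Galaxy, Snakemake or Nextflow add scheduling, containerised tool execution and provenance capture on top of this same dependency-graph core - which is what makes the explicit, machine-readable description of a workflow so valuable for FAIR sharing.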
FAIR Data Bridging from researcher data management to ELIXIR archives in the... – Carole Goble
ISMB-ECCB 2021, NIH/ODSS Session, 27 July 2021
ELIXIR is the pan-national European Research Infrastructure for Life Science data, whose 23 national nodes and the EBI coordinate the development and long-term sustainability of domain public databases. FAIR services, policies and curation approaches aim to build a FAIR connected data ecosystem of trusted domain repositories, from ENA, HPA and EGA to specialised resources like CorkOakDB and PIPPA for plant phenotypes. But this is only one part of the data landscape and often the end of data’s journey. The nodes support research projects to operate “FAIR data first”, working with institutional and national platforms that are often generic or designed for project-based data management. We need to bridge between project-based and community-based data management, and support researchers across their whole RDM lifecycle, navigating the complexity of this ecosystem. The ELIXIR-CONVERGE project and its flagship RDMkit toolkit (https://rdmkit.elixir-europe.org) aims to do just that.
FAIR Workflows and Research Objects get a Workout – Carole Goble
So, you want to build a pan-national digital space for bioscience data and methods? That works with a bunch of pre-existing data repositories and processing platforms? So you can share FAIR workflows and move them between services? Package them up with data and other stuff (or just package up data for that matter)? How? WorkflowHub (https://workflowhub.eu) and RO-Crate Research Objects (https://www.researchobject.org/ro-crate) that’s how! A step towards FAIR Digital Objects gets a workout.
Presented at DataVerse Community Meeting 2021
FAIRy stories: the FAIR Data principles in theory and in practice – Carole Goble
https://ucsb.zoom.us/meeting/register/tZYod-ippz4pHtaJ0d3ERPIFy2QIvKqjwpXR
FAIRy stories: the FAIR Data principles in theory and in practice
The ‘FAIR Guiding Principles for scientific data management and stewardship’ [1] launched a global dialogue within research and policy communities and started a journey to wider accessibility and reusability of data and preparedness for automation-readiness (I am one of the army of authors). Over the past 5 years FAIR has become a movement, a mantra and a methodology for scientific research and increasingly in the commercial and public sector. FAIR is now part of NIH, European Commission and OECD policy. But just figuring out what the FAIR principles really mean and how we implement them has proved more challenging than one might have guessed. To quote the novelist Rick Riordan “Fairness does not mean everyone gets the same. Fairness means everyone gets what they need”.
As a data infrastructure wrangler I lead and participate in projects implementing forms of FAIR in pan-national European biomedical Research Infrastructures. We apply web-based, industry-led approaches like Schema.org; work with big pharma on specialised FAIRification pipelines for legacy data; promote FAIR by Design methodologies and platforms in the research lab; and expand the principles of FAIR beyond data to computational workflows and digital objects. Many use Linked Data approaches.
In this talk I’ll use some of these projects to shine some light on the FAIR movement. Spoiler alert: although there are technical issues, the greatest challenges are social. FAIR is a team sport. Knowledge Graphs play a role – not just as consumers of FAIR data but as active contributors. To paraphrase another novelist, “It is a truth universally acknowledged that a Knowledge Graph must be in want of FAIR data.”
[1] Wilkinson, M., Dumontier, M., Aalbersberg, I. et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data 3, 160018 (2016). https://doi.org/10.1038/sdata.2016.18
RO-Crate: A framework for packaging research products into FAIR Research Objects – Carole Goble
RO-Crate: A framework for packaging research products into FAIR Research Objects presented to Research Data Alliance RDA Data Fabric/GEDE FAIR Digital Object meeting. 2021-02-25
The swings and roundabouts of a decade of fun and games with Research Objects – Carole Goble
Research Objects and their instantiation as RO-Crate: motivation, explanation, examples, history and lessons, and opportunities for scholarly communications, delivered virtually to 17th Italian Research Conference on Digital Libraries
How are we Faring with FAIR? (and what FAIR is not) – Carole Goble
Keynote presented at the workshop FAIRe Data Infrastructures, 15 October 2020
https://www.gmds.de/aktivitaeten/medizinische-informatik/projektgruppenseiten/faire-dateninfrastrukturen-fuer-die-biomedizinische-informatik/workshop-2020/
Remarkably it was only in 2016 that the ‘FAIR Guiding Principles for scientific data management and stewardship’ appeared in Scientific Data. The paper was intended to launch a dialogue within the research and policy communities: to start a journey to wider accessibility and reusability of data and prepare for automation-readiness by supporting findability, accessibility, interoperability and reusability for machines. Many of the authors (including myself) came from biomedical and associated communities. The paper succeeded in its aim, at least at the policy, enterprise and professional data infrastructure level. Whether FAIR has impacted the researcher at the bench or bedside is open to doubt. It certainly inspired a great deal of activity, many projects, a lot of positioning of interests and raised awareness. COVID has injected impetus and urgency to the FAIR cause (good) and also highlighted its politicisation (not so good).
In this talk I’ll make some personal reflections on how we are faring with FAIR: as one of the original principles authors; as a participant in many current FAIR initiatives (particularly in the biomedical sector and for research objects other than data) and as a veteran of FAIR before we had the principles.
What is Reproducibility? The R* brouhaha and how Research Objects can help – Carole Goble
This document discusses reproducibility in computational research. It defines several key terms related to reproducibility, including replicate, rerun, and repeat. It notes that computational papers should provide the full software and data used to generate results. The document outlines several rules for reproducible research, such as tracking how results were produced, version controlling scripts, and archiving intermediate results. It also discusses challenges to reproducibility like lack of funding and changing dependencies over time. Finally, it introduces Research Objects as a framework to bundle resources like data, software, and protocols to help address reproducibility issues.
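One of the rules summarised above - tracking how results were produced - can be sketched as a small provenance record written alongside each result. This is a hypothetical minimal illustration (the input bytes and parameters are invented examples), not the Research Object model itself:

```python
import hashlib
import json
import platform
from datetime import datetime, timezone

def provenance_record(input_bytes: bytes, params: dict) -> dict:
    """Record enough context to explain how a result was produced:
    a hash of the input data, the exact parameters used, and the
    environment the computation ran in."""
    return {
        "input_sha256": hashlib.sha256(input_bytes).hexdigest(),
        "parameters": params,
        "python_version": platform.python_version(),
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }

# Example: capture provenance for one analysis run.
record = provenance_record(b"raw measurements", {"threshold": 0.05})
print(json.dumps(record, indent=2))
```

Archiving such a record with the result (and version-controlling the script that generated it) addresses two of the reproducibility challenges the document names: knowing exactly what was run, and detecting when changed dependencies would produce a different answer.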
If we build it will they come? BOSC2012 Keynote Goble
1. If we build it will they come?
Prof Carole Goble FREng FBCS CITP
carole.goble@manchester.ac.uk
BOSC, Long Beach, July 14 2012
http://www.mygrid.org.uk
2. Est. 2001
Improving Knowledge Turning,
Enabling Reuse and Reproducibility
[Josh Sommer]
Keep the vision, modify the plan
3. Computational Methods (LGPL)
• Scientific workflows
• Distributed web/grid/cloud services
• Third party, independent service reuse
• Data pipelines and analytics
Volunteerist Human Computation (BSD)
• e-Laboratories: social collaboration and sharing environments for scientific artefacts
• Libraries and Catalogues
• Asset safe havens, sharing, reuse
Knowledge Acquisition Tools (various; OWL)
• Semantic technology, semantic applications, research objects, executable papers
• Data/Metadata curation & reuse: POPULOUS, SKOSEdit
4. The Taverna Suite of Tools
[Architecture diagram; components: Web Portals; Workflow Repository; GUI Workbench; Client User Interfaces; Virtual Machine; Service Catalogue; Third Party Tools; Workflow Engine; Provenance Store; Command Line; Server; Activity and Service Plug-in Manager; Open Provenance Model; Programming and Secure Service Access APIs]
5. Community Haven, Sharing Resource, Social Collaboration
http://www.myexperiment.org
5820 members, 304 groups, 2415 workflows, 604 files and 229 packs (research objects)
http://wiki.myexperiment.org/index.php/Galaxy
6. BioCatalogue: crowd curation of web services
• Contribute, find and understand web services
• Curate, review and comment
• Learning resource
• Monitor services; cloud registry
2295 REST and SOAP services, 169 service providers. 674 members, 27 countries
7. Find experts, colleagues and peers.
• Find, exchange and interlink, preserve, publish data, models, publications, SOPs & analyses. ISA compliant.
• Launch and validate models and analyses: JWS Online, BioModels.
• Gateway to GerontoSys public tools and resources, e.g. livSYSiPS.
SysMO: 16 consortia, 110 institutes, 1600+ assets, 350+ members
9. [Diagram: Standards, Governance & Policy; Content Sharing Platform & Trusted Service; Software & Tools, open source; Comp Sci Research; Gateway Platform; Knowledge Network, Skills & Community Building; Preservation & Publication Platforms]
10. Laissez-faire Philosophy
• Bottom Up
– Emergent & scruffy (to a degree…)
• Reliant on third party contributions
– Non-prescriptive, non-interfering and flexible
– We make no content ourselves….
• Part of a wider ecosystem
– Other services, data, tools, platforms, people…
• Inspired by social environments
• Scarred by top-down, dictated, tech-driven and unused monoliths
11. http://www.flickr.com/photos/hellaoakland/3137360455/
• Never underestimate how scruffy third party stuff can be.
• How often metadata is missing and messy if left to its own devices…
• Liberty through Limitations: people say they want flexibility. They prefer the simplicity of order and will adapt to adopt.
12. Who is they?
• Jobbing Bioinformatician?
• Expert Bioinformatician?
• Sys admin?
• Service provider?
• Application developer?
• Tool developer?
• Biologist?
13. Who is THEY?
• Drug Toxicity (OpenTox Project)
• Pharmacogenomics GWAS
• Trypanosomiasis in African Cattle
• The Virtual Liver
• Physiopathology of the human body
• Genetic differences between breeds of cattle
• Systems Biology of Micro-Organisms
• Metagenomics
• Medical Imaging
14. Consortia, Research Groups, Individuals
• Consortia: organised, planned, strong connections with resource providers and each other (e.g. Bovine Trypanosomiasis Consortium)
• Distributed Research Groups & Independent Lone Rangers: long tail, disconnected from data providers and each other, emergent
• Independents and Individuals….
15. Specialise or Diversify?
• Flexibility and extensibility -> customised services, cookie cutter
• Widen adoption
• Spread risk, extend resourcing streams
• Cross development alignment and coordination
• More communities to build, nurture, support and sustain
• Core Drift and Bashing
Domains: Software and Document Preservation; Helio-Physics; BioDiversity; Astronomy; Social Science; Engineering (JPL, NASA); FLOSS
16. BioDiversity Virtual e-Laboratory
http://www.biovel.eu
[Architecture diagram: Biodiversity Services Repositories (BLAST, Hmmer, MrBayes, PAML, EMBOSS, …); Catalogues / Search; Execution environment; Taverna Workbench; Taverna Workflow Engine and Server; Provenance; WebDAV Data Management; Open Taxonomic Synonyms; Visualisation (BioSTIF); Authentication / Authorisation; Google Refine; CSW; Modelling/GeoProcessing (R, openModeller, WPS / WCPS); Grid, Cloud, etc. platforms]
17. Who is We? The ego-system
biologists, bioinformaticians, biodiversity informaticians, astro-informaticians, social scientists, modellers, software engineers, computer scientists, systems administrators, resource providers
20. [Diagram: Research Community, Production, Applications, Publishing, Training, Community]
21. So if we build it will they come?
• Be useful for something: immediately, continuously, responsively
• Be usable by somebody: user experience, worth the effort, adoption path
• Some of the time: as part of a big picture
• Under promise and over deliver
• Acquire critical mass
22. Four things that drive adoption of software or service
1. Added value
– Do something you couldn't do before, or can now do faster; gain competitive advantage, improve productivity, scale up
2. New asset
– Get or retain access to something important (data, method, technique, skills, knowledge)
3. Keep up with the field. A community.
– Future-proof my practice; new skills and capacity; there is a vibe about it and I'll be left out
4. Because there is no choice
– Business depends on it; it's mandated, or de facto mandated
23. Seven things that hinder adoption of software or service
1. Not enough added value. It sucks.
• It doesn't solve a problem, or not as well or as cheaply as something else; no content, or not the right content
2. Not fit for take-on. It doesn't work!
• No: help, guides, documentation, manuals, examples, content, templates, portability, migration/legacy support, easy installation, virtual machines, testing, stability, version control, release cycle, roadmap, sustainability prospect, way of introducing my favourite component/data/environment
3. No time or capacity to take on
• To learn, or to migrate personal legacy code/data/applications; no pathway/ramp to adoption
• Training and special system needs
24. Software practices
Zeeya Merali, Nature 467, 775-777 (2010) | doi:10.1038/467775a
Computational science: …Error… why scientific programming does not compute.
"As a general rule, researchers do not test or document their programs rigorously, and they rarely release their codes, making it almost impossible to reproduce and verify published results generated by scientific software"
25. Software Stewardship
"Better Science through Superior Software" – C. Titus Brown
• Software sustainability
• Software practices
• Software deposition
• Long term access to software
• Credit for software
• Licensing advice
• Open licenses
Reproducible Research Standard, Victoria Stodden, Intl J Comm Law & Policy 13 (2009)
26. Seven things that hinder adoption of software or service (continued)
4. Cost. It's too costly.
– Of disruption, of long-term ownership
5. Exposure to risk
– First to take up; support and sustainability dependencies; fear of scrutiny, misrepresentation or being scooped
6. No community
– Support and comfort
7. Changes to work practices
– Obligations; unclear or unenforced reciprocity protocols
27. Betamax vs VHS
• It sucks, but it's the only thing around
• It's ace, but it's one of many, too late in the game and not enough to switch
• Tipping point is likely not technical
28. Bonus Hinder
Never heard of it. We've built it but we haven't told anyone.
• Make noise… physically and virtually
• Customer and contributor relationship building
• Self-supporting communities, multi-level marketing
• Highly resource intensive
29. Bonus Hinder
Never heard of it. We've built it but we haven't told anyone.
[Diagram: Market, User Community, Development, Developer Community: it all kicks off]
30. Adoption Intentions
Be careful what you wish for
• Incidental – "I built it for myself, and stuck it out there"
• Familial – "I built it for people just like me"
• Fundamental – "I built it for others, many of whom are not like me"
31. Open Innovation: Development and Content
You are not alone. You can't do it all alone. Motivate & enable others to fill gaps, "App Store style": software, services, content, examples….
• Really interoperate. Don't tweak. (e.g. Galaxy + Taverna/myExperiment)
• Be simple and standard.
• Be helpful. Be set up. Be reusable. Be smart.
• Others will develop on top of you. But don't assume they will re-contribute or tell you.
• It's much harder than you think.
• It's unequal.
[Ladder: Friends, Family, Acquaintances, Strangers]
32. Ladder Model of OSS Adoption
(adapted from Carbone P., Value Derived from Open Source is a Function of Maturity Levels)
[Ladder: Strangers, Acquaintances, Friends, Family]
Moore's technology adoption curve [FLOSS@Syracuse]
33. "It's better, initially, to make a small number of users really love you than a large number kind of like you"
Paul Buchheit, paulbuchheit.blogspot.com
34. PALS: Building Friendships
Intelligence, Guidance, Advocacy, Evangelism, Market Research
• What's in it for the PAL?
– Long tail: money, kudos, special support, special resources, skills, reputation building, influence, stuff they can't do alone, CV building
– Consortia: co-funded
• Who is a PAL?
– Post-docs, post-grads, administrators, developers
– PI: protector/champion
• PAL handlers
– Customer Relationship Manager, Nanny and Mediator, Scientist
35. Do not under-estimate…
• The power of the sprint / *-athon / fest / drinking
• The power of a whizzy interface. Even for plumbing.
• The importance of supporting and propagating best practice
37. Participatory Design
Work Together on a Real Problem
• Funders: data sharing, data standards, a database exchange, long term preservation
• Project PIs: data control, own databases, visibility limitations, project dependence
• PALs: spreadsheets, Yellow Pages, just enough SOPs, understanding standards limitations, curating, examples, safe haven, project independence
3 years later, 15/16 consortia abandoned their own systems and went with the SEEK system.
39. Participation: Cooperation? Coordination? Collaboration? Integration? Evolution and entropy models
[Spectrum from closed, through controlled, to open access: Lone scholars, Private Groups, Trusted Collaborators, Public scientists, Citizens]
[based on an idea by Liz Lyon]
40. Critical mass spiral: 90:9:1
Driven by needs of and benefits to the scientist, rather than top down policies.
Content tipping point
[Andrew Su]
41. Trust, Fame and Blame: Reciprocity, Competition, Contribution and Use
• Scooping, scrutiny and misinterpretation
• Curation cost
• Poor quality
• Reputation / asset economics
• Public peer pressure
Reciprocity Sucks
• Flirting
• Hugging
• Controlled sharing
• Voyeurism
• Poor feedback / credit
Nature 461, 145 (10 September 2009)
Victoria Stodden, The Scientific Method in Practice: Reproducibility in the Computational Sciences, Feb 9, 2010, MIT Sloan Research Paper No. 4773-10, http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1550193
42. Harness Competitiveness Carrots
• Pride – Reputation: cult, credit & attribution for all
• Protection – Just enough sharing, licensing & liability; quality, peer review, metadata
• Preservation – Safe havens and sunsets (project churn)
• Publishing / Release – Citability, supporting exchange
• Productivity – Availability of assets, help, capability, ramps
45. Adoption Stealth
• Data at home promise with automated harvesting
• Sharing creep, incremental metadata, low obligations
• URL upload in BioCatalogue
• Web Service "come as you are" take-on in Taverna
• Metadata prompting: right tools, right time, right place
• Service collections & packaged services
46. Be vigilant
• PAL burn-out and over familiarity
• Unadjusted over-user accommodation
• Drifting apart and not keeping it fresh
• Step back, observe and adapt/intervene!
• So relieved to get a community….
• Instrument adoption and observation
Participatory development is a mutual long term relationship. Not flirty speed dating, one night stand, crush, Me Me Me.
47. Urgent-Important
• Technical bog down, operational burn-out
• Little things that are important but don't seem that urgent…
• Dominant projects
• Non-software content
• It all takes way longer than you think
• Simplicity drift
Participatory development is a mutual long term relationship. Not flirty speed dating, one night stand, crush, Me Me Me.
49. The Jam-based Adoption Model
aka Added Value, Value Proposition, Return On Investment
http://delicious-cooks.com/photos/raspberry-jam/04/
50. What is the Special Jam? What is your Jam Value Chain, and for whom?
What:
• SysMO: safe haven, spreadsheet tooling, linking SOPs, models and data, examples
• Taverna: power, adaptability and myExperiment
Who:
• Focused on contributors and experts
• Provider-consumer balance
• Functionality-Simplicity Syndrome
• Changing Who – challenging baked-ins
51. Jam today and more, better Jam tomorrow
Just Enough Jam, Just in Time not Just in Case
• Feature Creep Conundrum
• Big Picture Paradox
• Core vs Specifics Syndrome
• Content Decay Dilemma
• Working-to-working Stability Stress
52. Customised Specific Jam beats Generic
• Flexibility/Functionality – Simplicity Conundrum
• Diversification Dilemma
53. Where is my Jam? Jam for All
http://www.gettyimages.co.uk/detail/photo/empty-jam-jar-royalty-free-image/136976198
• What are WE (platform providers, software builders, community builders and service providers) getting out of it?
• We need credit and interest too.
• Altmetrics
Howison and Herbsleb, Scientific Software Production: Incentives and Collaboration, CSCW 2011, March 19–23, 2011, Hangzhou, China. http://james.howison.name/pubs/HowisonHerbsleb2011SciSoftIncentives.pdf
54. Jam forever
They came. Have the evidence. Have a plan. Did you wish for this? Do you want it?
• Fragile flux – content, services, bits, communities
• Funding plan – novelty over sustainability; research-production falsehoods; wave invention, political lobbying
• Securing the community – leadership & foundations
• Business model???
Software is Free like Puppies Are Free.
55. Jam not forever
• Acquire
• Retain
• Widen – more/different
• Reposition – different/new stage
• Changing community is challenging… [Daron Green]
56. Adoption is a Merry-Go-Round. The Social and the Technical are Inseparable.
57. You know they came when…
… you were useful and usable to someone some of the time, but they might not tell you
… people ask you to join their consortia or use it
… they gave up their own home grown stuff for yours
… someone you don't know uses it and tells you all about your own stuff
… someone publishes papers about it. Without citing you.
… someone else claims credit
… people you don't know start bitching about it
… it's just expected to be there, and you are kind of expected to be there too
… your Head of School complains you don't do enough CS research because you are doing too much Software Engineering and Support
58. Acknowledgements (1)
James Howison, Heather Piwowar, Victoria Stodden, Janet Vertesi, Christine Borgman, Nosh Contractor, Jay Liebowitz, Robert Kraut
59. Acknowledgements (2)
• The myGrid family, friends and contributors
• But especially: Katy Wolstencroft, David Withers, Marco Roos, Alan Williams, Jits Bhagat, Stuart Owen, Stian Soiland-Reyes, Shoaib Sufi, Robert Stevens, Paul Fisher, Peter Li, Ian Dunlop, Finn Bacall, Mannie Tags, Niall Beard, Rob Haines, Christian Brenninkmeijer, Alasdair Gray, Tim Clark, Pinar Alper, Paolo Missier, Khalid Belhajjame, Duncan Hull, Sean Bechhofer, David De Roure, Don Cruickshank, Wolfgang Mueller, Olga Krebs, Franco Du Preez, Quyen Nguyen, Jacky Snoep
• The members of Wf4ever, SysMO, BioVeL, HELIO, SCAPE, OMII, SSI, NeISS, Obesity e-Lab and anyone else I forgot
61. SUMMARY (De Roure and Goble, IEEE Software 2009)
[Word cloud: Coalface users; Patrons; Skeptics; Champions; Keep your Friends Close; Friends and Family; Fit in; Favours will Favour you; Embed; Jam Today, Jam Tomorrow; Act Local, Think Global; End Users; Developers; Just Enough, Just in Time; Design for Network Effects; Know your Users; Anticipate Change; Service Providers; Enable Users to Add Value; System Administrators; Keep Sight of the Bigger Picture]
Editor's Notes
If I build it will they come? What is it we are building? Who is they? Who are we? Over the years I have built a bunch of open source software and services for researchers: the Taverna workflow system, myExperiment for workflow sharing, BioCatalogue for services, SEEK for Systems Biology data and models, and most recently MethodBox for longitudinal data sets. As well as building software we built communities: development communities and user communities. So what drives/hinders adoption? What do I know now that I wished I had known before? How do we sustain communities on time-limited grants? How do we build it so they come, stay and join in?
Distributed Groups, Independents and Partners. Organised teams: planned, strong connections with resource providers and each other; structured, cross-partner sharing, retained results. Distributed groups & independent lone rangers: long tail, disconnected from data providers and each other; emergent, fluid, personal stores, small science from big. Make workflows for group. Run workflows from platforms. Store and find workflows. Catalogue and find services. Catalogue, store and find data, SOPs, models. Link stuff. Release & share stuff. Curate stuff. Cooperate / Collaborate / Coordinate / CoShape. Vary on coordination, collaboration, cooperation, contribution, integration, sustainability, longevity.
Make workflows for group. Run workflows from platforms. Store and find workflows. Catalogue and find services. Catalogue, store and find data, SOPs, models. Link stuff. Release & share stuff. Curate stuff. Cooperate / Collaborate / Coordinate / CoShape.
Still some people missing!
Knowledge transfer. Three tracks. Large team.
Developer and user adoption; contributed collaborative content; collaborative development.
Maybe you don’t care…. Content and Promotion matter more than software, but harder to fund and different people to software developers.
Incidental: not really building for adoption or others to take up. Familial: the producer and the consumer are the same; many are like this in BOSC.
CLAs for set up. Remember upgrade paths. Cooperate, network effects, amplify. Self-supporting, multi-level marketing. There are no green fields.
Please some of the people some of the time
They all start off like this…
Working the first time. User experience over smart. Cool interfaces (even for plumbing).
Primary community review. Facebook generation! Community participation, sharing, commons-based production, social curation, voluntary contribution. 1. Primary content 2. Curation duties. GeneWiki, Rfam, myExperiment, PLoS, UsefulChem, OpenWetWare. Open Science vs the Long Tail. Social networks vs the Long Tail. Incentives and obstacles. Myths and miracles. Contribution. Curation. Volunteer science.
Limited focus. Social networking around content. Feedback loops.
PAL recruitment. Content contribution. Stick: community, journal and funder mandates (there is no stick). Credit for peer review.
Don’t forget to make more demands though!
User burn-out and over familiarity. Over-friendly Stockholm syndrome, absence of friendly fire; keep enemies even closer. Unadjusted over-user accommodation: fit in at first, get buy-in, move in, move on. Drifting apart and not keeping it fresh: keep jointly working on real, concrete cases. Don't assume they will stay: users are fickle. Step back, observe and adapt/intervene! So relieved to get a community, forget to see what they do (e.g. dubious workflow designs). Much easier with e-Laboratory services that are inherently social collaboration spaces. Complacency: especially dangerous outside funded collaborations. Measuring impact and getting feedback: downloads ≠ useful (or usable). Don't be prescriptive; scientists control (but actually we need to be a bit prescriptive). Danger! Going native. Missing users. Fossilisation and complacency. User experience over smart. Cool interfaces (even for plumbing). *-athons. Embedded co-working. The total problem. Replying. Eating your own dog food. Examples! Working the first time.
Version 2 Syndrome: being too clever, forgetting about engagement. Technical bog down and operational burn-out: fire fighting, heads down not eyes up. Little simple things that are important but don't seem that urgent… but are the ha'peth of tar that sinks the ship. Major project dominance: he who pays the piper calls the tune. Non-software innovations: seek and contribute content/components and contributing partners.
Activation energy argument. Balance against feature creep short-termism. Keep planning the big stuff… Balance the cost to the benefit. But hacks survive, and don't do the strategy.
58% by students, 24% unmaintained: Schultheiss et al. (2010) PLoS Comp Bio. Content and promotion matter more than software, but harder to fund, and different people to software developers. What's your plan? Maintaining content, software, services. Different groups, evolving practices, changing times, new patterns….. Funding cycles, chasms and reinventions. Reward, not hinder, adoption. Foundations, friends and business models… and the Open Source Community Silver Bullet!
Hard to Plan….
When the program’s Data Management Group chair claims it’s the only data system they have used that works. To your funders. Whoo-hoo!
Computer Supported Cooperative Work, Team Science, Knowledge Management, Social Science, Information Science, Library Science, Digital Scholarship, Collaboratories…