When we think about the software used in research and science, we might think of the commercial packages with thousands of users, or the millions of lines of code that support experiments such as the Large Hadron Collider, or indeed the millions of scripts written every day by researchers across the world to undertake simple tasks. What is clear is that modern research relies on software: a recent survey of UK researchers conducted by the Software Sustainability Institute reported that 92% of researchers used software, and 69% could not conduct their work without it. Millions of dollars are invested each year in supporting a quasi-industry of software production, covering the equivalent of the full spectrum from large multinationals to tiny cottage industries, but little is known about whether this is efficient or indeed appropriate. This talk will examine the similarities between the development of software in the research environment and the lifecycle of technology startup companies. It will also consider the driving factors behind adoption of software and the impact of software sustainability on the ability to conduct research.
Why developing research software is like a startup (and why this matters)
1. Software Sustainability Institute
www.software.ac.uk
Why developing research software is like a startup (and why this matters)
Neil Chue Hong, N.ChueHong@software.ac.uk
ISGC 2015, Taipei, 19th March 2015
Supported by Project funding
from
Where indicated
slides licensed under
3. Software Sustainability Institute
The Software Sustainability Institute
A national facility for cultivating better, more sustainable, research software to enable world-class research
• Software reaches boundaries in its development cycle that prevent improvement, growth and adoption
• Providing the expertise and services needed to negotiate to the next stage
• Developing the policy and tools to support the community developing and using research software
Supported by EPSRC
Grant EP/H043160/1
5. Software Sustainability Institute
Software isn’t special, it’s mainstream
92% of researchers use software; 69% could not conduct their work without it
Survey of researchers from 15 Russell Group unis conducted by SSI between Aug-Oct 2014.
406 respondents covering representative range of funders, discipline and seniority.
http://www.software.ac.uk/blog/2014-12-04-its-impossible-conduct-research-without-software-say-7-out-10-uk-researchers
6. Software Sustainability Institute
And everyone’s a developer
56% of researchers develop their own software
Survey of researchers from 15 Russell Group unis conducted by SSI between Aug-Oct 2014.
406 respondents covering representative range of funders, discipline and seniority.
http://www.software.ac.uk/blog/2014-12-04-its-impossible-conduct-research-without-software-say-7-out-10-uk-researchers
8. Software Sustainability Institute
Startup Survival Rules
1. Pick good co-founders
2. Launch fast
3. Let your idea evolve
4. Understand your users
5. Better to make a few users love you than a lot ambivalent
6. Offer surprisingly good customer service
7. You make what you measure
8. Spend little
9. Get ramen profitable
10. Avoid distractions
11. Don’t get demoralised
12. Don’t give up
13. Deals fall through
http://www.paulgraham.com/13sentences.html
9. Software Sustainability Institute
Software Survival Rules
1. Pick good collaborators
2. Release early
3. Let your idea evolve
4. Understand your users
5. Better to make a few users love you than a lot ambivalent
6. Offer surprisingly good support
7. You make what you measure
8. Spend little
9. Get paper profitable
10. Avoid distractions
11. Don’t get demoralised
12. Don’t give up
13. Funding often falls through
10. Software Sustainability Institute
Understand your users
[Figure: a rectangle whose area is "Wealth created / Impact enabled"; one side is "Number of users", the other "How much you improve their lives"]
“As in science, the hard part is not answering questions but asking them: the hard part is seeing something new that users lack. The better you understand them the better the odds of doing that. That's why so many successful startups make something the founders needed.”
Paul Graham, Y Combinator
Where can you make most difference?
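The rectangle on this slide can be read as a simple product: impact is the number of users multiplied by how much each user's life improves. A minimal Python sketch of that reading follows. The `impact` function, the units and both scenarios are invented for illustration and are not data from the talk.

```python
def impact(number_of_users: int, improvement_per_user: float) -> float:
    """Area of the rectangle: users times improvement per user (arbitrary units)."""
    return number_of_users * improvement_per_user


# Hypothetical scenarios with made-up numbers.
few_who_love_it = impact(number_of_users=20, improvement_per_user=5.0)
many_ambivalent = impact(number_of_users=100, improvement_per_user=1.0)

print(f"few users, large improvement: {few_who_love_it}")
print(f"many users, small improvement: {many_ambivalent}")
```

With these invented figures the two areas are equal, which is one way to see why the slide asks where you can make the most difference rather than only how many users you have.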
12. Software Sustainability Institute
Four stages of startups
Discovery: Are you solving a problem that others are interested in?
Validation: Have you implemented core features that users want?
Efficiency: Can you support new users by refining your processes?
Scale: Ready to drive growth. Back-end scalability refactoring
http://blog.startupcompass.co/pages/marmer-stages
14. Software Sustainability Institute
Stage 1: Discovery
• Are you solving a problem that others are interested in?
Many pieces of software created by researchers have small user bases – this is particularly true of scripts
Not a problem if you are writing the software for yourself only – but it affects how large the project to support the software can be
• Software development as research and prototyping – is the science interesting?
15. Software Sustainability Institute
Case Study: Ligand Binding
• Centre for Computational Chemistry, Bristol
New methods for rapid MC sampling of biomolecular systems modelled using QM/MM
Developed two codes ProtoMS (F77) + Sire (C++)
Water-Swap Reaction Coordinate method to calculate absolute protein-ligand binding free energies
• SSI’s work helped assess users + scale devs
Ran user observations with 4 different users
ASPIRE/ACQUIRE framework has multiple devs
• Split architecture between ASPIRE (adaptive multiresolution hybrid MD simulation) and ACQUIRE (WorkPacket scheduling system with optimisation for time to result vs “green-ness”)
• http://www.software.ac.uk/resources/case-studies/getting-grips-molecules
• http://www.siremol.org/adaptive_dynamics
17. Software Sustainability Institute
Stage 2: Validation
• Have you implemented core features that users want?
Do you know who’s using your software? Why are they using it?
If you asked them “how would you feel if you could no longer use this software”, how many would be disappointed?
These are your core users: what do they want and how can you give it to them?
• At this point, research software projects often start giving demonstrations, presentations, workshops
18. Software Sustainability Institute
Case Study: Climate Policy Modelling
• CIAS team at Tyndall Centre for Climate Change Research, University of East Anglia
Develop linked climate and economic models for detailed analysis
Their software was not ready to be used by other groups
• One researcher/developer at UEA, several users
• SSI’s work means the software is robust enough that it can be installed and used by others
Enabled use of the software by the WWFN’s Climascope project and James Cook University
• Documented software to allow extensions by contributors
• Made it easier to maintain and back up
• Added job scheduling to improve modelling throughput
• New modelling framework enables new models i.e. new science
• http://www.tyndall.ac.uk/research/cias
20. Software Sustainability Institute
Stage 3: Efficiency
• Can you support new users by refining your processes?
If you had conflicting requirements from users, how would you deal with them?
What infrastructure changes do you need to make to support new/additional users?
• At this point, research software projects often start designating specific community/product managers, user support staff
21. Software Sustainability Institute
Case Study: ICAT
• Science and Technology Facilities Council
Metadata catalogue, used by RAL UK (ISIS, DIAMOND, CLF), SNS US, ELETTRA Italy
ICAT operationally critical at sites
Huge projects looking to use ICAT (PaNdataODI, EuDAT)
Scalability issues and lack of proper processes
• SSI’s work provided 33 recommendations
15 interviews with different stakeholders
92 observations set out in report
“…we must focus on doing the right things, and this report will help us”
Alistair Mills, STFC
Governance and outreach changes to support additional users
• http://www.software.ac.uk/preparing-icat-thousands-new-users
• http://www.icatproject.org/
23. Software Sustainability Institute
Stage 4: Scale
• Are you ready to drive growth of users, to reengineer and refactor on an ongoing basis?
This is when software quality considerations become very important, as you have increased reputational risk
This is also the point where traditionally a PI would step aside to become Chief Technology Officer / Chief Scientist and enlist new management
• After this, next stage is sustain (then conserve)
Though this might be at a small scale if appropriate
25. Software Sustainability Institute
Open Source Software Projects
• “Every good work of software starts by scratching a developer's personal itch.”
Eric Raymond, The Cathedral and the Bazaar
• Producers start by having a direct interest in the success of the software
Just like in science
• OSS projects need to satisfy two aims:
Acquire users (a.k.a. researchers)
Acquire contributors (a.k.a. collaborating researchers)
• Producing Open Source Software: How to Run a Successful Free Software Project by Karl Fogel
26. Software Sustainability Institute
Why do you need users?
• Funding
Direct: fees, subscriptions, …
Indirect funding: letters of support, citations, collaborations
Advertising: recommendations and referrals
• Direction (indirection?)
Requirements, bug reports, change requests
• Community
Users supporting other users
Users becoming contributors
Sustainability and success
28. Software Sustainability Institute
The entrepreneur vs the researcher?
• Entrepreneurs have ideas, and don’t mind if some of them aren’t successful
“A person who never made a mistake never tried anything new”
“I have not failed. I’ve just found 10,000 ways that won’t work”
• Researchers find it difficult to get away from the questions they choose to focus on
• Yet successful researchers are able to switch from one area to another, because it’s interesting to them
30. Software Sustainability Institute
Find out more about the SSI
• Community Engagement (Lead: Shoaib Sufi)
Fellowship Programme
Events and Workshops
• Consultancy (Lead: Steve Crouch)
Open Call for Projects / Collaborations
Software Evaluation
• Policy and Publicity (Lead: Simon Hettrick)
Case Studies / Policy Campaigns
Software and Research Blog
• Training (Lead: Aleksandra Pawlik)
Software Carpentry (300+ students/year)
Guides and Top Tips
• Journal of Open Research Software (Editor: Neil Chue Hong)
• Collaboration between universities of Edinburgh, Manchester, Oxford and Southampton
Editor's Notes
My experience with the mushrooms made me realise that whenever you think it is a technical problem, you’re actually looking for a social solution.
The Software Sustainability Institute can help with: software reviews and refactoring, collaborations to develop your project, guidance and best practice on software development, project management, community building, publicity and more…
Drawing on pool of specialists to drive the continued improvement and impact of research software developed by and for researchers
Providing services for research software users and developers
Developing research community interactions and capacity
Promoting research software best practice and capability
KEY POINT: Get across what the Institute does
This is a conceptual breakdown to make the Institute’s activities easier to manage. In reality the Institute staff work across themes, contributing to many activities, as software sustainability cannot be addressed just through individual activities working alone.
Specific activities:
Software Carpentry courses teach researchers key computational skills
Helping to spin up ELIXIR-UK training activities
DIRAC Drivers License to get best use out of HPC resources
Advising Wellcome Trust on software policy
Guides on wide range of subjects
We understand the linkage between all of these requirements, issues and activities because we’ve investigated the research community’s reliance on software in depth.
Variation in use with seniority of respondent
The use of research software varies little with seniority.
It’s difficult to measure seniority, so we simply asked how many years the respondents had worked in research. There isn’t a great deal of variation: the percentage of use varies by 12 percentage points, with those having worked in research for 6-10 years reporting the most use (98%) and those having worked for more than 20 years in research reporting the lowest use (86%).
The first two categories – those who have worked for less than a year, and those who have worked for 1-5 years – report 91-92% use. Use peaks in the next ten years and then drops in the 15-20 year and more than 20 year groups.
There are different ways to explain this variation. Unfortunately, they cannot be confirmed by our data. It seems likely that low- and mid-seniority researchers are the workhorses of research and do the most generation of results – and hence are most likely to use software. Once a researcher gets more senior, there is the tendency to perform more management duties which makes them less likely to use research software.
What software are people using?
A lot of different software is being used: we recorded 566 different packages - some of them have only one user within our surveyed community, some with many. The most popular packages are Matlab (20% of respondents use it), R (16%), SPSS (15%), then Excel (12%). To show the use diagrammatically, we created the Wordle shown at the top of the page.
A lot of researchers are developing their own software – even though they lack training
It’s not just proprietary software, many researchers are developing their own code: 56% of them. This is great news, because the real power of software lies in developing it to allow you to do more in less time and make new research possible.
Many people in the research community are developing their own software, but is the development in safe hands?
55% of respondents have received some training in software development (15% self-taught and 40% through some form of taught course). Worryingly, 21% of respondents who develop their own software had no training in software development. That’s one in five researchers developing software blind.
Software that is developed without adequate training is unlikely to be reliable. Researchers are, by their very nature, intelligent people who learn new skills quickly, but there are many subtle pitfalls in developing good code (that is, code that won’t later lead to paper retractions). And that’s only the case for reliability! We want defensible results, which requires a whole swathe of skills related to producing reproducible code, and we want to protect the research investment, which requires yet more skills for writing reusable software.
Software development costs are not being included in bids
Many researchers believe that including costs for developing software in a proposal will weaken it. We’ve had a steer from the Research Councils that this is not the case - something we’re trying to persuade the research community to believe. But we may have our work cut out.
When we asked the people who are responsible for writing proposals whether they had included costs for software development, 22% said that they had, 57% said they had not, and 20% said that they had not even though they knew software development would make up part of the bid! (Note that rounding errors make these figures sum to 99%.)
Differences in software use with respect to gender
Women made up 36% of respondents to the survey, men made up 62% and the remainder went to “other”, “prefer not to say” or no response (the gender question was not mandatory).
There is no difference in the percentage of women and men who use research software: 92% each. This is heartening news!
Differences in software development with respect to gender
Although there is no difference in the use of research software, there is a huge difference when it comes to developing software: 70% of men develop their own research software, whereas only 30% of women do.
This preponderance of men in development is reflected, as one would expect, in training. Only 39% of women had received software development training of some form, relative to 63% of men who have received training.
Even those who create prototypes will assume they will rewrite the software
February 2009
One of the things I always tell startups is a principle I learned from Paul Buchheit: it's better to make a few people really happy than to make a lot of people semi-happy. I was saying recently to a reporter that if I could only tell startups 10 things, this would be one of them. Then I thought: what would the other 9 be?
When I made the list there turned out to be 13:
1. Pick good cofounders.
Cofounders are for a startup what location is for real estate. You can change anything about a house except where it is. In a startup you can change your idea easily, but changing your cofounders is hard. [1] And the success of a startup is almost always a function of its founders.
2. Launch fast.
The reason to launch fast is not so much that it's critical to get your product to market early, but that you haven't really started working on it till you've launched. Launching teaches you what you should have been building. Till you know that you're wasting your time. So the main value of whatever you launch with is as a pretext for engaging users.
3. Let your idea evolve.
This is the second half of launching fast. Launch fast and iterate. It's a big mistake to treat a startup as if it were merely a matter of implementing some brilliant initial idea. As in an essay, most of the ideas appear in the implementing.
4. Understand your users.
You can envision the wealth created by a startup as a rectangle, where one side is the number of users and the other is how much you improve their lives. [2] The second dimension is the one you have most control over. And indeed, the growth in the first will be driven by how well you do in the second. As in science, the hard part is not answering questions but asking them: the hard part is seeing something new that users lack. The better you understand them the better the odds of doing that. That's why so many successful startups make something the founders needed.
5. Better to make a few users love you than a lot ambivalent.
Ideally you want to make large numbers of users love you, but you can't expect to hit that right away. Initially you have to choose between satisfying all the needs of a subset of potential users, or satisfying a subset of the needs of all potential users. Take the first. It's easier to expand userwise than satisfactionwise. And perhaps more importantly, it's harder to lie to yourself. If you think you're 85% of the way to a great product, how do you know it's not 70%? Or 10%? Whereas it's easy to know how many users you have.
6. Offer surprisingly good customer service.
Customers are used to being maltreated. Most of the companies they deal with are quasi-monopolies that get away with atrocious customer service. Your own ideas about what's possible have been unconsciously lowered by such experiences. Try making your customer service not merely good, but surprisingly good. Go out of your way to make people happy. They'll be overwhelmed; you'll see. In the earliest stages of a startup, it pays to offer customer service on a level that wouldn't scale, because it's a way of learning about your users.
7. You make what you measure.
I learned this one from Joe Kraus. [3] Merely measuring something has an uncanny tendency to improve it. If you want to make your user numbers go up, put a big piece of paper on your wall and every day plot the number of users. You'll be delighted when it goes up and disappointed when it goes down. Pretty soon you'll start noticing what makes the number go up, and you'll start to do more of that. Corollary: be careful what you measure.
8. Spend little.
I can't emphasize enough how important it is for a startup to be cheap. Most startups fail before they make something people want, and the most common form of failure is running out of money. So being cheap is (almost) interchangeable with iterating rapidly. [4] But it's more than that. A culture of cheapness keeps companies young in something like the way exercise keeps people young.
9. Get ramen profitable.
"Ramen profitable" means a startup makes just enough to pay the founders' living expenses. It's not rapid prototyping for business models (though it can be), but more a way of hacking the investment process. Once you cross over into ramen profitable, it completely changes your relationship with investors. It's also great for morale.
10. Avoid distractions.
Nothing kills startups like distractions. The worst type are those that pay money: day jobs, consulting, profitable side-projects. The startup may have more long-term potential, but you'll always interrupt working on it to answer calls from people paying you now. Paradoxically, fundraising is this type of distraction, so try to minimize that too.
11. Don't get demoralized.
Though the immediate cause of death in a startup tends to be running out of money, the underlying cause is usually lack of focus. Either the company is run by stupid people (which can't be fixed with advice) or the people are smart but got demoralized. Starting a startup is a huge moral weight. Understand this and make a conscious effort not to be ground down by it, just as you'd be careful to bend at the knees when picking up a heavy box.
12. Don't give up.
Even if you get demoralized, don't give up. You can get surprisingly far by just not giving up. This isn't true in all fields. There are a lot of people who couldn't become good mathematicians no matter how long they persisted. But startups aren't like that. Sheer effort is usually enough, so long as you keep morphing your idea.
13. Deals fall through.
One of the most useful skills we learned from Viaweb was not getting our hopes up. We probably had 20 deals of various types fall through. After the first 10 or so we learned to treat deals as background processes that we should ignore till they terminated. It's very dangerous to morale to start to depend on deals closing, not just because they so often don't, but because it makes them less likely to.
Having gotten it down to 13 sentences, I asked myself which I'd choose if I could only keep one.
Understand your users. That's the key. The essential task in a startup is to create wealth; the dimension of wealth you have most control over is how much you improve users' lives; and the hardest part of that is knowing what to make for them. Once you know what to make, it's mere effort to make it, and most decent hackers are capable of that.
Understanding your users is part of half the principles in this list. That's the reason to launch early, to understand your users. Evolving your idea is the embodiment of understanding your users. Understanding your users well will tend to push you toward making something that makes a few people deeply happy. The most important reason for having surprisingly good customer service is that it helps you understand your users. And understanding your users will even ensure your morale, because when everything else is collapsing around you, having just ten users who love you will keep you going.
Notes
[1] Strictly speaking it's impossible without a time machine.
[2] In practice it's more like a ragged comb.
[3] Joe thinks one of the founders of Hewlett Packard said it first, but he doesn't remember which.
[4] They'd be interchangeable if markets stood still. Since they don't, working twice as fast is better than having twice as much time.
1. You’re looking for people who will make your software a success. It’s very hard to change collaborators.
2. Release early, because it teaches you about what you should have been building. You are using this to engage your first users.
5. Whilst it’s hard to grow the number of users directly, it’s easy to measure them.
6. Make people willing to use your software – their success will help champion your success
7. If you start measuring the right things, you’ll understand what makes those measures change
9. Get to the point when you’re getting enough out of the software to make it possible to get whatever credits you need (e.g. papers) more easily than without the software
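Notes 5 and 7 above both come down to keeping a simple record of user numbers and watching how it moves. A minimal sketch in Python, assuming a hypothetical log of daily user counts. The dates, the figures and the `daily_changes` helper are invented for illustration and are not taken from the talk.

```python
from datetime import date


def daily_changes(user_counts: dict[date, int]) -> dict[date, int]:
    """Return the change in user numbers between consecutive recorded days,
    keyed by the later day of each pair."""
    days = sorted(user_counts)
    return {
        today: user_counts[today] - user_counts[previous]
        for previous, today in zip(days, days[1:])
    }


# Hypothetical log, e.g. counted from downloads or mailing-list sign-ups.
user_counts = {
    date(2015, 3, 16): 12,
    date(2015, 3, 17): 15,
    date(2015, 3, 18): 14,
}

changes = daily_changes(user_counts)
for day, change in changes.items():
    print(f"{day.isoformat()}: {change:+d} users")
```

Looking at the changes rather than the raw totals makes it easier to ask what happened on the days the number moved, which is the point of note 7.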
Marmer stages: Max Marmer of Startup Compass has analysed thousands of startup companies to produce a product-oriented lifecycle model.
Simulation of tamiflu bound to a drug-resistant mutant of influenza neuraminidase. The red parts of the protein prefer to be bound to the drug than water, while the blue parts prefer the water. At the start of the movie there are two blue residues at the top, with a red residue between them. The blue residue on the left is the mutant residue that confers drug resistance to the virus.
Would more than 40% of your current user base say they were disappointed if your software disappeared?
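The 40% question can be checked against a small user survey. The sketch below is one possible way to do that in Python. The answer labels, the sample responses and the function names are assumptions made for illustration; only the 40% threshold comes from the note above.

```python
from collections import Counter

DISAPPOINTED_THRESHOLD = 0.40  # the 40% figure from the note above


def disappointed_share(responses: list[str]) -> float:
    """Fraction of respondents whose answer was exactly 'disappointed'."""
    if not responses:
        raise ValueError("no survey responses")
    counts = Counter(answer.strip().lower() for answer in responses)
    return counts["disappointed"] / len(responses)


def passes_validation(responses: list[str]) -> bool:
    """True when more than 40% of respondents would be disappointed."""
    return disappointed_share(responses) > DISAPPOINTED_THRESHOLD


# Invented answers to "how would you feel if you could no longer use this software?"
responses = [
    "disappointed",
    "indifferent",
    "disappointed",
    "not disappointed",
    "disappointed",
]

print(f"share disappointed: {disappointed_share(responses):.0%}")
print(f"passes the 40% test: {passes_validation(responses)}")
```

The respondents who answer "disappointed" are the core users described on the Stage 2 slide, so their answers are also the ones to read first when deciding what to build next.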
Collaboration helps sustainability
Scott McNealy coined the phrase
Pre-competitive versus post-competitive space
Promotion: e.g. a website
One way conduit of information from project to public
Communication: e.g. a mailing list, wiki
Also the “medium of record”
Collaboration: e.g. a code repository
Manage and increase visibility of changes
Management: e.g. an issue tracker
Keeping track of things so anyone can query
Comparing and contrasting the funding types in startups (Angel, VC etc) with the funding available to research software (Research grant, platform grant, etc)
Why do you take it, what does it give you, what do you lose in taking it?
Quotes from Albert Einstein and Thomas Edison
Some further resources:
http://www.software.ac.uk/blog/2012-11-09-craftsperson-and-scholar
http://software.ac.uk/blog/2012-08-16-what-research-software-community-and-why-should-you-care
http://www.software.ac.uk/blog/2011-05-02-publish-or-be-damned-alternative-impact-manifesto-research-software
http://www.software.ac.uk/software-evaluation-guide
http://www.software.ac.uk/software-carpentry
http://www.software.ac.uk/resources/guides
http://www.software.ac.uk/training
This question of minimal requirements ducks the issue of who is to enforce them.
Star graphic modified under CC-BY from Ssolbergj
C.f.
5 Stars of Linked Data (Berners-Lee):
Available w/ open license, machine-readable, non-proprietary format, open standards, linked to provide context
5 Stars of Online Journals (Shotton):
Peer Review, Open Access, Enriched Content, Available Datasets, Machine-readable metadata
What about community?
A metajournal which encourages the publication of information that encourages the reuse of software.
A way of using the current tools and practices to make software better recognised.