Research Software Engineering
at Stanford University
Vanessa Sochat, PhD
Research Software Engineer
Stanford Research Computing Center
The Missing Layer
facts.stanford.edu
news.stanford.edu
Stanford researchers show that mealworms can safely consume toxic additive-containing plastic
Fact or Fiction? The Science of Star Wars
Engineers develop a less invasive way to study the brain
Global carbon emissions growth slows, but hits record high
Tracking power plant emissions in real time
Reduced soil tilling helps both soils and yields, Stanford researchers find
Why Your Best Idea May Be Your Second Favorite
Alcohol, ‘Asian glow’ mutation may contribute to Alzheimer’s disease, study finds
Stanford researchers uncover the silent cost of school shootings
Stanford water expert discusses wildfire’s threat to water quality
Scientists find potential diagnostic tool, treatment for Parkinson’s disease
Getting a read on low literacy scores from a Stanford education scholar
Topic labels shown on the headlines: medicine (3), environment (5), earth (1), economics (3)
Who is doing the research?
Who is writing research software?
My students write their own code in the language of their choosing.
We use scripts (typically cobbled together ourselves, shared among the group, or accessed from previously published studies via GitHub).
Most people in the lab are using home grown, open source or else freely available research software.
For projects that require advanced coding, the RAs I’ve worked with have been engineering master’s students.
Everyone writes their own home grown code to access, process, and analyze data.
Is it reproducible and sustainable?
We do not share our code online (too ad hoc and un-documented).
A former postdoc has maintained code used by the group.
A lot of our work is "one off" analyses, which are scripted for reproducibility but not designed to be reused or flexible to multiple use cases.
Everyone's lab work is kept on a private GitHub group for our lab.
Do labs need extra help?
For one large project I've had to hire a consultant (finite period of time).
The GSB has some software engineers on staff who work with faculty on research.
All of the code development lumped together is ~0.05 to 0.2 FTE of a software engineer.
I think the group sometimes struggles with documentation for downstream use, since we are incentivized by producing research products, not by maintaining good computing practices.
What does this all mean?
Who needs research software?
Most labs use research software, home-grown and open source
Do labs need extra help?
Hiring consultants suggests yes
Who is writing research software?
Labs rely on students/staff to write software
Is it reproducible and sustainable?
No, there are largely no resources or plans to directly sustain it
What about training?
U. Nangia and D. S. Katz. Track 1 Paper: Surveying the US National Postdoctoral Association Regarding Software Use and Training in Research.
In Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE5.1), 2017.
Nangia and Katz https://arxiv.org/pdf/1706.06527.pdf
The Top 100 Cited-Papers of all Time (Nature) https://www.nature.com/news/the-top-100-papers-1.16224
https://www.software.ac.uk/blog/2014-12-04-its-impossible-conduct-research-without-software-say-7-out-10-uk-researchers
Publications
Projects with substantial software are the majority of publications
Methods and software are the most cited papers
Research
Over 90% of US and UK researchers use research software
65% wouldn’t be able to do research without it
50% develop software as part of their research
...my neck hurts...
We can help!
How do we start change?
https://tinyurl.com/stanford-rse
#community-stanford
Monthly Meetup
Stanford RSE Roster
github.com/rseng
What else awaits… ?
What is the long-term vision?
Stanford Research Software Engineers
1. Connected group of trained professionals
2. Labs meet at project onset to make a plan for software
3. An RSE supports the life cycle of development
4. Long-term maintenance is provided by Stanford RSE
Reproducible, sustainable research via
https://tinyurl.com/stanford-rse
@vsoch (Twitter and GitHub)
vsochat@stanford.edu

Editor's Notes

  • #2 Change.org http://chng.it/F5c8Lpwjyv
  • #3 Arguably, the foundation of Stanford is the research. So how does this happen? Well, you start with an interesting question, maybe you collect some data, and you analyze it, which means preprocessing, analysis, and statistics. All of these functions are wrapped up into software, and then you learn something interesting about the world and write about it. And guess what, open source software is also a player here, and further, someone is going to find your discovery and try to go back to the code, or back to the data, and reproduce it! So whether we like it or not, given how ingrained technology and software are in performing research, software is the foundation of sound research. So who is working on the software? ADD SLIDE ABOUT
  • #4 We are known for academic achievements
  • #5 academic achievements
  • #6 being rich
  • #7 and being close to tech companies also, with a lot of money.
  • #8 So let’s look closer. This is from facts.stanford.edu, and was updated in October. And if these are facts, what Stanford is known for is right on. We spend quite a bit on research, and the endowment is impressive. Let’s dig deeper. We’re going to assume here that the kind of academic achievement that we are interested in is research and discovery. So let’s look at some of the big discoveries for 2019.
  • #9 So let’s look a little closer. I went to the Stanford News page, looked at all stories, and filtered out only announcements of someone winning an award, or passing. Here we are looking at the subset of news items most proudly reported by Stanford for the month of December. So we have everything here from mealworms to environmental discoveries to work done at SLAC. It’s awesome, right? I wanted to better understand the work of these labs. So first I started by trying to look for code shared in the publications, or hints about the extent to which the lab uses software, and who was doing that. Guess what, that effort failed spectacularly. MAKE A SLIDE FOR THIS So you know what, I reached out to the lead authors of these studies, and I asked them directly. And I want to share with you what I learned.
  • #13 What does it mean for science? It means that I can find a paper, see that it’s branded as Stanford Open Source Software, know immediately where to find it, and have a team of engineers ready to help me if I have questions. This is the kind of standard that Stanford needs to be setting, not “available upon request, but probably not available.”
  • #14 But where is all the code? If I see a beautifully published study in any of these fields, it’s almost certain that the work involved some form of data analysis, or even writing simple scripts. So where is the Stanford Open Source GitHub organization, or website, where I can go and be fairly confident that I can find the code that is driving these discoveries?
  • #20 With a couple of exceptions for very large projects, my students write their own code for their own projects in the language of their choosing. That mostly works fine. For one large project I've had to hire a consultant for one big code production push (but for a finite period of time) and a former postdoc maintained some simple processing code used by a few members of the group. If I had to guess, I'd say all of the code development lumped together is probably 0.05 to 0.2 FTE of a software engineer depending on how much the students were willing to outsource. We do not use training or share our code online (too ad hoc and un-documented). However, in some of my other work / my lab's work we use scripts (typically cobbled together ourselves, shared among the research group, or accessed from previously published studies via GitHub) for sequencing analysis, in our case, typically of bacteria.
  • #21 I can’t tell you how many times I heard the term “cobbled together” or “home grown.” What are they trying to say? I think they are saying that they acknowledge that the work might be rudimentary or basic, but that they are doing their best.
  • #24 What does it all mean? It’s sometimes hard to put these points together and understand what they mean for Stanford. It means that we need to have a vision for the future, and compare how Stanford is now with how we want it to be. I want to quickly play for you a sound bite from an undergraduate student at Lewis and Clark College. I asked him what he wanted for the research software engineering community 10 years in the future, when he would be “an adult.” ADD SOUND HERE
  • #25 Here are the takeaways from this brief survey. And guess what, I probably don’t need to tell you most of this.
  • #29 But to play devil’s advocate, let’s say that, you know what, this is fine. Graduate students are being forced to be software engineers, and we are building character and useful skills for the job market. And maybe some of this is true. But I have a question for you.
  • #31 A recent survey of U.S. postdoctoral researchers found that 95% of respondents use research software [3]. Of those who use research software, 66% stated they could not continue their research without it (see Figure 1). Despite this reliance, over half (54%) of the respondents who were using this software had received no training.
  • #32 I hope that you see the conflict here - a majority of researchers at later stages of their career cannot continue their research without software, but over half have not had training for creating research software. Of those who have had training, it’s very likely not at the advanced level that you’d really need to make sustainable software.
  • #33 Guess what, guys, this is not fine. Even if you had the perfect graduate student programmer, he or she is going to graduate. And guess what, a fire is left behind for someone else to deal with. So what do we do? What does this actually mean for Stanford? Well, there are two options here as I see it. Either we force these researchers, who are already burdened with publish or perish and with being experts in their domain, through umpteen more years of training, OR we provide them with the support that they need, research software engineers, so they don’t have to.
  • #34 Okay, so maybe you don’t believe me that this small collection of data from Stanford researchers is representative. Let me throwwww some facts at you!
  • #39 So what do we do? What does this actually mean for Stanford? Well, there are two options here as I see it. Either we force these researchers, who are already burdened with publish or perish and with being experts in their domain, through umpteen more years of training, OR we provide them with the support that they need, research software engineers, so they don’t have to.
  • #44 The first group are the labs, students, researchers, all the folks that might really benefit from a research software engineer.
  • #45 The second group are the RSEs. I’ve shown this slide many times, but that’s because it’s a very good metaphor. We need to find all of the RSEs around Stanford and get them talking to one another.
  • #46 What I realized is that we need to grow our Research Software Engineer community first. If we grow this community first, we will have a much stronger base for being able to help the researchers. So that’s what I want to focus on for the rest of today - I want you to get excited, because over the next few months to a year I am putting a full-force effort into creating community for RSEs at Stanford.
  • #48 The first is easy and simple, and in fact, it’s already done! We have a community-stanford channel in the USRSE slack. The goal here is to have a place where we can talk to one another, in the context of an already growing national community, the USRSE.
  • #49 The next initiative is about making sure that there is opportunity for more face-to-face communication, whether in person or virtual. We need a regular, low-key way to get together, just to talk and share the problems that we are working on. Whether we talk about projects, current challenges, or something else, this kind of open space is really important for ideas to grow. This is also how we will learn to ask one another for help. For example, for the first meeting I want to talk about how we can work together. We might introduce ourselves, and brainstorm what kind of training or other materials we can work on for our community. Or maybe we might just realize that we already have a calling to help one another with current problems. (put picture of people sitting around table, etc.)
  • #50 If you notice the link in the bottom right, this will take you to a form where you can ask for more information about RSE Day. In anticipation of bringing everyone together, I am going to send you a follow-up email in advance with basic questions like: What do you work on? What are some of your challenges? This is totally optional, but if you fill it out and are a participant of RSE Day, you will be given a private “roster” sheet so that when you leave, you’ll have this entire inventory of people that you can contact if you have a question, want to ask for help, or just want to get coffee.
  • #51 Some of us work on open source software. So I’ve created the GitHub organization, rseng, as a place where we can come together as a community and work on this software. I want to note that I haven’t scoped this just to Stanford, so my hope is that we can bring in other institutional RSEs as well. If you go to the interest form at the URL shared in the bottom right, it asks for your GitHub username, and I’ll invite you to this organization right away to start interacting with other RSEs.
  • #52 And a subset of that is a smaller project called “needs love.” The idea here is that if you don’t want to contact someone directly, you can post an issue here. Members of the organization or those that are subscribed to the repository will receive a notification, and can see if they can help with the issue. I’d like to see this be a place where we can post questions, or ask for help, and also look for things that we can help with, and in turn, grow. And there are going to be other fun goodies at RSE Day, which I won’t reveal now.
  • #53 And there are going to be other fun goodies at RSE Day, which I won’t reveal now.
  • #54 But where is all the code? If I see a beautifully published study in any of these fields, it’s almost certain that the work involved some form of data analysis, or even writing simple scripts. So where is the Stanford Open Source GitHub organization, or website, where I can go and be fairly confident that I can find the code that is driving these discoveries?
  • #55 We need Stanford Research Software Engineering.
  • #56 This means a centralized group of trained professionals, likely with expertise in both software engineering and different domains of science. It means that labs meet with the group at the onset of projects to make a plan for research software. It means that a Research Software Engineer is there to support the entire lifecycle of development, from the initial creation of the software to testing, containerization, and even running on HPC. The researcher doesn’t need to worry about whether the software is properly versioned or reproducible; it’s ready to go. And then when the students or postdocs graduate, the software lives on as official Stanford Open Source Software. The team of RSEs acts as maintainers to respond to issues that come up, and to further grow community around the software.
  • #60 What does it mean for science? It means that I can find a paper, see that its software is branded as Stanford Open Source Software, and know immediately where to find it. It means that there is a team of engineers ready to help me if I have questions. This is the kind of standard that Stanford needs to be setting. Not “available upon request, but probably not available.”
  • #61 Imagine if I could read a paper, and be directed to a repository stamped as Stanford Open Source Software.
  • #62 Arguably, the foundation of Stanford is the research. So how does this happen? Well, you start with an interesting question, and maybe you collect some data. You analyze it, which means preprocessing, analysis, and statistics, and all of these functions are wrapped up into software. Then you learn something interesting about the world and write about it. And guess what, open source software is also a player here, and further, someone is going to find your discovery and try to go back to the code, or back to the data, and reproduce it! So whether we like it or not, given how ingrained technology and software are in performing research, software is the foundation of sound research. So who is working on the software? ADD SLIDE ABOUT
  • #63 Change.org http://chng.it/F5c8Lpwjyv