Open Source and Science
at NSF
Daniel S. Katz
Program Director,
Division of Advanced
Cyberinfrastructure (ACI)
Big Science and Infrastructure
• Hurricanes affect humans
• Multi-physics: atmosphere, ocean, coast, vegetation, soil
– Sensors and data as inputs
• Humans: what have they built, where are they, what will they do
– Data and models as inputs
• Infrastructure:
– Urgent/scheduled processing, workflows
– Software applications, workflows
– Networks
– Decision-support systems,
visualization
– Data storage,
interoperability
• CIPRES Science Gateway for Phylogenetics
– Study of diversification of life and relationships among
living things through time
• Highly used
– Cited in at least 400 publications, e.g., Nature, PNAS, Cell
– More than 5000 unique users in 3 years
– Used routinely in at least 68 undergraduate classes
– 45% US (including most states), 55% from 70 other countries
• Infrastructure
– Flexible web application
• A science gateway, uses software and lessons from XSEDE
gateways team, e.g., identity management, HPC job control
– Science software: tree inference and sequence alignment
• Parallel versions of MrBayes, RAxML, GARLI, BEAST, MAFFT
• PAUP*, Poy, ClustalW, Contralign, FSA, MUSCLE, ...
– Data
• Personal user space for storing
results
• Tools to transfer and view data
Credit: Mark Miller, SDSC
Long-tail Science and Infrastructure
Cyberinfrastructure (e-Research)
• “Cyberinfrastructure consists of computing systems,
data storage systems, advanced instruments and
data repositories, visualization environments, and
people, all linked together by software and high
performance networks to improve research
productivity and enable breakthroughs not otherwise
possible.”
-- Craig Stewart
• Infrastructure elements:
– parts of an infrastructure,
– developed by individuals and groups,
– international,
– developed for a purpose,
– used by a community
NSF Software Vision
NSF will take a leadership role in providing
software as enabling infrastructure for
science and engineering research and
education, and in promoting software as a
principal component of its comprehensive
CIF21 vision
• ...
• Reducing the complexity of software will be a
unifying theme across the CIF21 vision,
advancing both the use and development of
new software and promoting the ubiquitous
integration of scientific software across all
disciplines, in education, and in industry
– A Vision and Strategy for Software for Science,
Engineering, and Education – NSF 12-113
Create and maintain a
software ecosystem
providing new
capabilities that
advance and accelerate
scientific inquiry at
unprecedented
complexity and scale
Support the
foundational
research necessary
to continue to
efficiently advance
scientific software
Enable transformative,
interdisciplinary,
collaborative, science
and engineering
research and
education through the
use of advanced
software and services
Transform practice through new
policies for software addressing
challenges of academic culture, open
dissemination and use, reproducibility
and trust, curation, sustainability,
governance, citation, stewardship, and
attribution of software authorship
Develop a next generation diverse
workforce of scientists and
engineers equipped with essential
skills to use and develop software,
with software and services used in
both the research and education
process
Infrastructure Role & Lifecycle
Software Infrastructure Projects
• In SI2, currently ~50 Elements & Frameworks projects
& 13 potential Institutes planning projects
• See http://bit.ly/sw-ci for current SI2 projects
• NSF directorates have additional sw projects
SI2 Solicitation and Decision Process
• Cross-NSF software working group with members from all
directorates
• Determined how SI2 fits with other NSF programs that
support software
– See: Implementation of NSF Software Vision -
http://www.nsf.gov/funding/pgm_summ.jsp?pims_id=504817
• Discusses solicitations, determines who will participate in
each
• Discusses and participates in review process
• Work together to fund worthy proposals (matchmaking)
– Unidisciplinary projects (e.g. bioinformatics app)
– Multidisciplinary projects (e.g., molecular dynamics)
– Omnidisciplinary projects (e.g., http, math library)
• In all cases, need to forecast impact
– Past performance does predict future results
Measuring Impact – Scenarios
1. Developer of open source physics simulation
– Possible metrics
• How many downloads? (easiest to measure, least value)
• How many contributors?
• How many uses?
• How many papers cite it?
• How many papers that cite it are cited? (hardest to measure,
most value)
2. Developer of open source math library
– Possible metrics are similar, but citations are less
likely
– What if users don’t download it?
• It’s part of a distro
• It’s pre-installed (and optimized) on an HPC system
• It’s part of a cloud image
• It’s a service
Vision for Metrics & Citation
• Products (software, paper, data set) are registered
– Input: credit map (weighted list of contributors—people, products, etc.)
– Output: DOI
– Leads to transitive credit
• E.g., paper 1 provides 25% credit to software A, and software A provides 10%
credit to library X -> library X gets 2.5% credit for paper 1
• Helps developer – “my tools are widely used, give me tenure” or “NSF should
fund my tool maintenance”
– Social issue: need to trust person who registers product
• Works for papers today (w/out weights) for both author lists and for citations
– Technological issue: Registration system (where, interface, multiple)
• Product usage is recorded
– Where? Both the developer and user want to track usage
– Privacy issues? (legal, competitive, ...)
– What does “using” a data set mean? How to trigger usage record?
– Develop general code for this, add to multiple software?
• Ties to provenance
• With user input, tie usage to new research and development
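The transitive credit arithmetic on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the credit maps, the contributor names other than paper 1, software A, and library X, and the function transitive_credit are hypothetical, and the sketch assumes credit maps never form a cycle. It is not a description of any existing registration system.

```python
from collections import defaultdict

# Hypothetical credit maps: each registered product lists the fraction of
# credit it passes to each contributor (a person or another product).
# Only the 25% and 10% figures come from the slide; the rest is made up.
CREDIT_MAPS = {
    "paper 1": {"author": 0.75, "software A": 0.25},
    "software A": {"developer": 0.90, "library X": 0.10},
    "library X": {"maintainer": 1.00},
}


def transitive_credit(product, maps, weight=1.0, totals=None):
    """Return the share of `product`'s credit that reaches each contributor.

    Credit flows along the maps: a contributor that is itself a registered
    product passes its share on to its own contributors. Assumes no cycles.
    """
    if totals is None:
        totals = defaultdict(float)
    for contributor, fraction in maps.get(product, {}).items():
        share = weight * fraction
        totals[contributor] += share
        if contributor in maps:  # contributor is a product: recurse
            transitive_credit(contributor, maps, share, totals)
    return dict(totals)


credit = transitive_credit("paper 1", CREDIT_MAPS)
print(f"library X: {credit['library X']:.1%}")  # 25% of 10% -> 2.5%
```

A real system would also need to decide whether a product keeps the credit it passes on or gives it away; this sketch simply reports how much credit reaches each node.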
Vision for Metrics & Citation, thoughts
• Can this be done incrementally?
• Lack of credit is a larger problem than often
perceived
– Lack of credit is a disincentive for sharing software
and data
– Providing credit would both remove a disincentive
and add an incentive
– See Lewin’s principle of force field analysis (1943)
• For commercial tools, credit is tracked by $
– But this doesn’t help understand what tools were used
for what outcomes
– Does this encourage collaboration?
• Could a more economic model be used?
– NSF gives tokens as part of science grants, users
distribute tokens while/after using tools
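One way to picture the token idea is a toy ledger, sketched below in Python. Everything here is an assumption made for illustration: the TokenLedger class, its method names, and the token amounts are hypothetical, and the slide does not specify how such a scheme would actually work.

```python
from collections import Counter


class TokenLedger:
    """Toy ledger: a funder grants tokens to users, who pass them to tools."""

    def __init__(self):
        self.balances = Counter()  # tokens each user still holds
        self.received = Counter()  # tokens each tool has been given

    def grant(self, user, tokens):
        """Funder attaches tokens to a user's grant."""
        self.balances[user] += tokens

    def distribute(self, user, tool, tokens):
        """User hands some of their tokens to a tool they used."""
        if tokens <= 0 or tokens > self.balances[user]:
            raise ValueError("user does not hold that many tokens")
        self.balances[user] -= tokens
        self.received[tool] += tokens


ledger = TokenLedger()
ledger.grant("grant holder", 100)
ledger.distribute("grant holder", "simulation code", 60)
ledger.distribute("grant holder", "math library", 30)
print(ledger.received.most_common())
```

The tokens each tool receives would then serve as a usage signal tied to funded research, which is the information that tracking by dollars alone does not provide.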
Software Questions for Projects
• Sustainability to a program officer:
– How will you support your software without me
continuing to pay for it?
• What does support mean?
– Can I build and run it on my current/future system?
– Do I understand what it does?
– Does it do what it does correctly?
– Does it do what I want?
– Does it include newest science?
• Governance model?
– Tells users and contributors how the project makes
decisions, how they can be involved
– Community: Users? Developers? Both?
– Models: dictatorship (Linux kernel), meritocracy
(Apache), other?
– Tie to development models: cathedral, bazaar
General Software Questions
• Does the open source model work for all science?
– For some science? For underlying tools?
• How many users/developers are needed for success?
• Open Source for understanding (available) vs Open
Source for reuse/development (changeable)?
• Software that is intended to be infrastructure has
challenges
– Unlike in business, more users means more work
– The last 20% takes 80% of the effort
• What fraction of funds should be spent on support of
existing infrastructure vs. development of new
infrastructure?
• How do we decide when to stop supporting a software
element?
• How do we encourage reuse and discourage duplication?
• How do we more effectively support career paths for
software developers (with universities, labs, etc.)?

More Related Content

What's hot

Fall15Resume
Fall15ResumeFall15Resume
Fall15Resume
Deonesha Williams
 
SGCI HICSS50 Presentation
SGCI HICSS50 PresentationSGCI HICSS50 Presentation
SGCI HICSS50 Presentation
maytaldahan
 
SGCI - Science Gateways Community Institute: Subsidized Services and Consulta...
SGCI - Science Gateways Community Institute: Subsidized Services and Consulta...SGCI - Science Gateways Community Institute: Subsidized Services and Consulta...
SGCI - Science Gateways Community Institute: Subsidized Services and Consulta...
Sandra Gesing
 
SGCI - Science Gateways - Technology-Enhanced Research Under Consideration of...
SGCI - Science Gateways - Technology-Enhanced Research Under Consideration of...SGCI - Science Gateways - Technology-Enhanced Research Under Consideration of...
SGCI - Science Gateways - Technology-Enhanced Research Under Consideration of...
Sandra Gesing
 
EDUC 5101 3rd Adobe Connect Class Session Presentation
EDUC 5101 3rd Adobe Connect Class Session PresentationEDUC 5101 3rd Adobe Connect Class Session Presentation
EDUC 5101 3rd Adobe Connect Class Session Presentation
Robert Power
 
Community and Code: Lessons from NESCent Hackathons
Community and Code: Lessons from NESCent HackathonsCommunity and Code: Lessons from NESCent Hackathons
Community and Code: Lessons from NESCent Hackathons
Arlin Stoltzfus
 
Heartificial Intelligence: the intersect between Artificial Intelligence and ...
Heartificial Intelligence: the intersect between Artificial Intelligence and ...Heartificial Intelligence: the intersect between Artificial Intelligence and ...
Heartificial Intelligence: the intersect between Artificial Intelligence and ...
The Happiness Alliance - home of the Happiness Index
 
What’s Standard? Industry Application versus University Education of Engineer...
What’s Standard? Industry Application versus University Education of Engineer...What’s Standard? Industry Application versus University Education of Engineer...
What’s Standard? Industry Application versus University Education of Engineer...
Chelsea Leachman
 
SGCI - The Science Gateways Community Institute: International Collaboration ...
SGCI - The Science Gateways Community Institute: International Collaboration ...SGCI - The Science Gateways Community Institute: International Collaboration ...
SGCI - The Science Gateways Community Institute: International Collaboration ...
Sandra Gesing
 
Sgci nasa-esds-10-29-18
Sgci nasa-esds-10-29-18Sgci nasa-esds-10-29-18
Sgci nasa-esds-10-29-18
Nancy Wilkins-Diehr
 
Improving Integrity, Transparency, and Reproducibility Through Connection of ...
Improving Integrity, Transparency, and Reproducibility Through Connection of ...Improving Integrity, Transparency, and Reproducibility Through Connection of ...
Improving Integrity, Transparency, and Reproducibility Through Connection of ...
Andrew Sallans
 
Charleston Conference: VIVO, libraries, and users.
Charleston Conference: VIVO, libraries, and users.Charleston Conference: VIVO, libraries, and users.
Charleston Conference: VIVO, libraries, and users.
Ellen Cramer
 
Ucsd research-it-09-11-18
Ucsd research-it-09-11-18Ucsd research-it-09-11-18
Ucsd research-it-09-11-18
Nancy Wilkins-Diehr
 
ISLMA PD SurveyResults 5-29-11
ISLMA PD SurveyResults  5-29-11ISLMA PD SurveyResults  5-29-11
ISLMA PD SurveyResults 5-29-11
Lisa Perez
 
Sgci spf-2-23-17
Sgci spf-2-23-17Sgci spf-2-23-17
Sgci spf-2-23-17
Nancy Wilkins-Diehr
 
UVa Library Scientific Data Consulting Group (SciDaC): New Partnerships and...
UVa Library Scientific Data Consulting Group (SciDaC):  New Partnerships and...UVa Library Scientific Data Consulting Group (SciDaC):  New Partnerships and...
UVa Library Scientific Data Consulting Group (SciDaC): New Partnerships and...
Andrew Sallans
 
Introduction to SEMAT
Introduction to SEMATIntroduction to SEMAT
Zucca "Technology & Systems"
Zucca "Technology & Systems"Zucca "Technology & Systems"
Riding the Waves of the Education Ecosystem
Riding the Waves of the Education EcosystemRiding the Waves of the Education Ecosystem
Riding the Waves of the Education Ecosystem
AAP PreK-12 Learning Group
 
Collaboration Importance In Agile Software Development
Collaboration Importance In Agile Software DevelopmentCollaboration Importance In Agile Software Development
Collaboration Importance In Agile Software Development
Veselin Georgiev
 

What's hot (20)

Fall15Resume
Fall15ResumeFall15Resume
Fall15Resume
 
SGCI HICSS50 Presentation
SGCI HICSS50 PresentationSGCI HICSS50 Presentation
SGCI HICSS50 Presentation
 
SGCI - Science Gateways Community Institute: Subsidized Services and Consulta...
SGCI - Science Gateways Community Institute: Subsidized Services and Consulta...SGCI - Science Gateways Community Institute: Subsidized Services and Consulta...
SGCI - Science Gateways Community Institute: Subsidized Services and Consulta...
 
SGCI - Science Gateways - Technology-Enhanced Research Under Consideration of...
SGCI - Science Gateways - Technology-Enhanced Research Under Consideration of...SGCI - Science Gateways - Technology-Enhanced Research Under Consideration of...
SGCI - Science Gateways - Technology-Enhanced Research Under Consideration of...
 
EDUC 5101 3rd Adobe Connect Class Session Presentation
EDUC 5101 3rd Adobe Connect Class Session PresentationEDUC 5101 3rd Adobe Connect Class Session Presentation
EDUC 5101 3rd Adobe Connect Class Session Presentation
 
Community and Code: Lessons from NESCent Hackathons
Community and Code: Lessons from NESCent HackathonsCommunity and Code: Lessons from NESCent Hackathons
Community and Code: Lessons from NESCent Hackathons
 
Heartificial Intelligence: the intersect between Artificial Intelligence and ...
Heartificial Intelligence: the intersect between Artificial Intelligence and ...Heartificial Intelligence: the intersect between Artificial Intelligence and ...
Heartificial Intelligence: the intersect between Artificial Intelligence and ...
 
What’s Standard? Industry Application versus University Education of Engineer...
What’s Standard? Industry Application versus University Education of Engineer...What’s Standard? Industry Application versus University Education of Engineer...
What’s Standard? Industry Application versus University Education of Engineer...
 
SGCI - The Science Gateways Community Institute: International Collaboration ...
SGCI - The Science Gateways Community Institute: International Collaboration ...SGCI - The Science Gateways Community Institute: International Collaboration ...
SGCI - The Science Gateways Community Institute: International Collaboration ...
 
Sgci nasa-esds-10-29-18
Sgci nasa-esds-10-29-18Sgci nasa-esds-10-29-18
Sgci nasa-esds-10-29-18
 
Improving Integrity, Transparency, and Reproducibility Through Connection of ...
Improving Integrity, Transparency, and Reproducibility Through Connection of ...Improving Integrity, Transparency, and Reproducibility Through Connection of ...
Improving Integrity, Transparency, and Reproducibility Through Connection of ...
 
Charleston Conference: VIVO, libraries, and users.
Charleston Conference: VIVO, libraries, and users.Charleston Conference: VIVO, libraries, and users.
Charleston Conference: VIVO, libraries, and users.
 
Ucsd research-it-09-11-18
Ucsd research-it-09-11-18Ucsd research-it-09-11-18
Ucsd research-it-09-11-18
 
ISLMA PD SurveyResults 5-29-11
ISLMA PD SurveyResults  5-29-11ISLMA PD SurveyResults  5-29-11
ISLMA PD SurveyResults 5-29-11
 
Sgci spf-2-23-17
Sgci spf-2-23-17Sgci spf-2-23-17
Sgci spf-2-23-17
 
UVa Library Scientific Data Consulting Group (SciDaC): New Partnerships and...
UVa Library Scientific Data Consulting Group (SciDaC):  New Partnerships and...UVa Library Scientific Data Consulting Group (SciDaC):  New Partnerships and...
UVa Library Scientific Data Consulting Group (SciDaC): New Partnerships and...
 
Introduction to SEMAT
Introduction to SEMATIntroduction to SEMAT
Introduction to SEMAT
 
Zucca "Technology & Systems"
Zucca "Technology & Systems"Zucca "Technology & Systems"
Zucca "Technology & Systems"
 
Riding the Waves of the Education Ecosystem
Riding the Waves of the Education EcosystemRiding the Waves of the Education Ecosystem
Riding the Waves of the Education Ecosystem
 
Collaboration Importance In Agile Software Development
Collaboration Importance In Agile Software DevelopmentCollaboration Importance In Agile Software Development
Collaboration Importance In Agile Software Development
 

Viewers also liked

F:\Usi\6th Semester\Edu 373\Science Indicator Project
F:\Usi\6th Semester\Edu 373\Science Indicator ProjectF:\Usi\6th Semester\Edu 373\Science Indicator Project
F:\Usi\6th Semester\Edu 373\Science Indicator Project
guest95f4a3
 
RCUK Strategy
RCUK StrategyRCUK Strategy
RCUK Strategy
steven_hill
 
Overview of ESRC
Overview of ESRCOverview of ESRC
Overview of ESRC
ESRC Communications
 
The National Science Foundation Open Government Plan 3.0 June 2014
The National Science Foundation Open Government Plan 3.0 June 2014The National Science Foundation Open Government Plan 3.0 June 2014
The National Science Foundation Open Government Plan 3.0 June 2014
Ed Dodds
 
Lecture 5 - Indicators of innovation and technological change: R&D and patents
Lecture 5 - Indicators of innovation and technological change: R&D and patentsLecture 5 - Indicators of innovation and technological change: R&D and patents
Lecture 5 - Indicators of innovation and technological change: R&D and patents
UNU.MERIT
 
Benchmarking Study On Innovation Policy 29012010
Benchmarking Study On Innovation Policy 29012010Benchmarking Study On Innovation Policy 29012010
Benchmarking Study On Innovation Policy 29012010
guest4594e8
 
Indicators of Innovative Research (Klavans, Boyack, Small, Sorensen, Ioannidis)
Indicators of Innovative Research (Klavans, Boyack, Small, Sorensen, Ioannidis)Indicators of Innovative Research (Klavans, Boyack, Small, Sorensen, Ioannidis)
Indicators of Innovative Research (Klavans, Boyack, Small, Sorensen, Ioannidis)
Kevin Boyack
 
STI Scoreboard 2015
STI Scoreboard 2015STI Scoreboard 2015
STI Scoreboard 2015
innovationoecd
 
Scoreboard- 2015 Japan Launch
Scoreboard- 2015 Japan LaunchScoreboard- 2015 Japan Launch
Scoreboard- 2015 Japan Launch
innovationoecd
 

Viewers also liked (9)

F:\Usi\6th Semester\Edu 373\Science Indicator Project
F:\Usi\6th Semester\Edu 373\Science Indicator ProjectF:\Usi\6th Semester\Edu 373\Science Indicator Project
F:\Usi\6th Semester\Edu 373\Science Indicator Project
 
RCUK Strategy
RCUK StrategyRCUK Strategy
RCUK Strategy
 
Overview of ESRC
Overview of ESRCOverview of ESRC
Overview of ESRC
 
The National Science Foundation Open Government Plan 3.0 June 2014
The National Science Foundation Open Government Plan 3.0 June 2014The National Science Foundation Open Government Plan 3.0 June 2014
The National Science Foundation Open Government Plan 3.0 June 2014
 
Lecture 5 - Indicators of innovation and technological change: R&D and patents
Lecture 5 - Indicators of innovation and technological change: R&D and patentsLecture 5 - Indicators of innovation and technological change: R&D and patents
Lecture 5 - Indicators of innovation and technological change: R&D and patents
 
Benchmarking Study On Innovation Policy 29012010
Benchmarking Study On Innovation Policy 29012010Benchmarking Study On Innovation Policy 29012010
Benchmarking Study On Innovation Policy 29012010
 
Indicators of Innovative Research (Klavans, Boyack, Small, Sorensen, Ioannidis)
Indicators of Innovative Research (Klavans, Boyack, Small, Sorensen, Ioannidis)Indicators of Innovative Research (Klavans, Boyack, Small, Sorensen, Ioannidis)
Indicators of Innovative Research (Klavans, Boyack, Small, Sorensen, Ioannidis)
 
STI Scoreboard 2015
STI Scoreboard 2015STI Scoreboard 2015
STI Scoreboard 2015
 
Scoreboard- 2015 Japan Launch
Scoreboard- 2015 Japan LaunchScoreboard- 2015 Japan Launch
Scoreboard- 2015 Japan Launch
 

Similar to Open Source and Science at the National Science Foundation (NSF)

Working towards Sustainable Software for Science (an NSF and community view)
Working towards Sustainable Software for Science (an NSF and community view)Working towards Sustainable Software for Science (an NSF and community view)
Working towards Sustainable Software for Science (an NSF and community view)
Daniel S. Katz
 
NSF SI2 program discussion at 2014 SI2 PI meeting
NSF SI2 program discussion at 2014 SI2 PI meetingNSF SI2 program discussion at 2014 SI2 PI meeting
NSF SI2 program discussion at 2014 SI2 PI meeting
Daniel S. Katz
 
Scientific Software Challenges and Community Responses
Scientific Software Challenges and Community ResponsesScientific Software Challenges and Community Responses
Scientific Software Challenges and Community Responses
Daniel S. Katz
 
NSF SI2 program discussion at 2013 SI2 PI meeting
NSF SI2 program discussion at 2013 SI2 PI meetingNSF SI2 program discussion at 2013 SI2 PI meeting
NSF SI2 program discussion at 2013 SI2 PI meeting
Daniel S. Katz
 
NSF Software @ ApacheConNA
NSF Software @ ApacheConNANSF Software @ ApacheConNA
NSF Software @ ApacheConNA
Daniel S. Katz
 
A Method to Select e-Infrastructure Components to Sustain
A Method to Select e-Infrastructure Components to SustainA Method to Select e-Infrastructure Components to Sustain
A Method to Select e-Infrastructure Components to Sustain
Daniel S. Katz
 
Software and Education at NSF/ACI
Software and Education at NSF/ACISoftware and Education at NSF/ACI
Software and Education at NSF/ACI
Daniel S. Katz
 
Software: impact, metrics, and citation
Software: impact, metrics, and citationSoftware: impact, metrics, and citation
Software: impact, metrics, and citation
Daniel S. Katz
 
Scientific Software Innovation Institutes (S2I2s) as part of NSF’s SI2 program
Scientific Software Innovation Institutes (S2I2s) as part of NSF’s SI2 programScientific Software Innovation Institutes (S2I2s) as part of NSF’s SI2 program
Scientific Software Innovation Institutes (S2I2s) as part of NSF’s SI2 program
Daniel S. Katz
 
XSEDE Overview (March 2014)
XSEDE Overview (March 2014)XSEDE Overview (March 2014)
XSEDE Overview (March 2014)
John Towns
 
Sgci xsede-gateways-07-08-16
Sgci xsede-gateways-07-08-16Sgci xsede-gateways-07-08-16
Sgci xsede-gateways-07-08-16
Nancy Wilkins-Diehr
 
Xsede for-nlhpc
Xsede for-nlhpcXsede for-nlhpc
Xsede for-nlhpc
John Towns
 
XSEDE: an ecosystem of advanced digital services accelerating scientific disc...
XSEDE: an ecosystem of advanced digital services accelerating scientific disc...XSEDE: an ecosystem of advanced digital services accelerating scientific disc...
XSEDE: an ecosystem of advanced digital services accelerating scientific disc...
John Towns
 
Sgci esip-7-20-18
Sgci esip-7-20-18Sgci esip-7-20-18
Sgci esip-7-20-18
Nancy Wilkins-Diehr
 
Software Ecosystems = Big Data
Software Ecosystems = Big DataSoftware Ecosystems = Big Data
Software Ecosystems = Big Data
Tom Mens
 
Research software susainability
Research software susainabilityResearch software susainability
Research software susainability
Daniel S. Katz
 
Sgci nsf-si2-2-21-17
Sgci nsf-si2-2-21-17Sgci nsf-si2-2-21-17
Sgci nsf-si2-2-21-17
Nancy Wilkins-Diehr
 
ARCC National Perspective Panel: XSEDE (Towns)
ARCC National Perspective Panel: XSEDE (Towns)ARCC National Perspective Panel: XSEDE (Towns)
ARCC National Perspective Panel: XSEDE (Towns)
John Towns
 
Supporting Research Communities with XSEDE
Supporting Research Communities with XSEDESupporting Research Communities with XSEDE
Supporting Research Communities with XSEDE
John Towns
 
Supporting Research Communities with XSEDE
Supporting Research Communities with XSEDESupporting Research Communities with XSEDE
Supporting Research Communities with XSEDE
John Towns
 

Similar to Open Source and Science at the National Science Foundation (NSF) (20)

Working towards Sustainable Software for Science (an NSF and community view)
Working towards Sustainable Software for Science (an NSF and community view)Working towards Sustainable Software for Science (an NSF and community view)
Working towards Sustainable Software for Science (an NSF and community view)
 
NSF SI2 program discussion at 2014 SI2 PI meeting
NSF SI2 program discussion at 2014 SI2 PI meetingNSF SI2 program discussion at 2014 SI2 PI meeting
NSF SI2 program discussion at 2014 SI2 PI meeting
 
Scientific Software Challenges and Community Responses
Scientific Software Challenges and Community ResponsesScientific Software Challenges and Community Responses
Scientific Software Challenges and Community Responses
 
NSF SI2 program discussion at 2013 SI2 PI meeting
NSF SI2 program discussion at 2013 SI2 PI meetingNSF SI2 program discussion at 2013 SI2 PI meeting
NSF SI2 program discussion at 2013 SI2 PI meeting
 
NSF Software @ ApacheConNA
NSF Software @ ApacheConNANSF Software @ ApacheConNA
NSF Software @ ApacheConNA
 
A Method to Select e-Infrastructure Components to Sustain
A Method to Select e-Infrastructure Components to SustainA Method to Select e-Infrastructure Components to Sustain
A Method to Select e-Infrastructure Components to Sustain
 
Software and Education at NSF/ACI
Software and Education at NSF/ACISoftware and Education at NSF/ACI
Software and Education at NSF/ACI
 
Software: impact, metrics, and citation
Software: impact, metrics, and citationSoftware: impact, metrics, and citation
Software: impact, metrics, and citation
 
Scientific Software Innovation Institutes (S2I2s) as part of NSF’s SI2 program
Scientific Software Innovation Institutes (S2I2s) as part of NSF’s SI2 programScientific Software Innovation Institutes (S2I2s) as part of NSF’s SI2 program
Scientific Software Innovation Institutes (S2I2s) as part of NSF’s SI2 program
 
XSEDE Overview (March 2014)
XSEDE Overview (March 2014)XSEDE Overview (March 2014)
XSEDE Overview (March 2014)
 
Sgci xsede-gateways-07-08-16
Sgci xsede-gateways-07-08-16Sgci xsede-gateways-07-08-16
Sgci xsede-gateways-07-08-16
 
Xsede for-nlhpc
Xsede for-nlhpcXsede for-nlhpc
Xsede for-nlhpc
 
XSEDE: an ecosystem of advanced digital services accelerating scientific disc...
XSEDE: an ecosystem of advanced digital services accelerating scientific disc...XSEDE: an ecosystem of advanced digital services accelerating scientific disc...
XSEDE: an ecosystem of advanced digital services accelerating scientific disc...
 
Sgci esip-7-20-18
Sgci esip-7-20-18Sgci esip-7-20-18
Sgci esip-7-20-18
 
Software Ecosystems = Big Data
Software Ecosystems = Big DataSoftware Ecosystems = Big Data
Software Ecosystems = Big Data
 
Research software susainability
Research software susainabilityResearch software susainability
Research software susainability
 
Sgci nsf-si2-2-21-17
Sgci nsf-si2-2-21-17Sgci nsf-si2-2-21-17
Sgci nsf-si2-2-21-17
 
ARCC National Perspective Panel: XSEDE (Towns)
ARCC National Perspective Panel: XSEDE (Towns)ARCC National Perspective Panel: XSEDE (Towns)
ARCC National Perspective Panel: XSEDE (Towns)
 
Supporting Research Communities with XSEDE
Supporting Research Communities with XSEDESupporting Research Communities with XSEDE
Supporting Research Communities with XSEDE
 
Supporting Research Communities with XSEDE
Supporting Research Communities with XSEDESupporting Research Communities with XSEDE
Supporting Research Communities with XSEDE
 

More from Daniel S. Katz

Software Professionals (RSEs) at NCSA
Software Professionals (RSEs) at NCSASoftware Professionals (RSEs) at NCSA
Software Professionals (RSEs) at NCSA
Daniel S. Katz
 
Parsl: Pervasive Parallel Programming in Python
Parsl: Pervasive Parallel Programming in PythonParsl: Pervasive Parallel Programming in Python
Parsl: Pervasive Parallel Programming in Python
Daniel S. Katz
 
Requiring Publicly-Funded Software, Algorithms, and Workflows to be Made Publ...
Requiring Publicly-Funded Software, Algorithms, and Workflows to be Made Publ...Requiring Publicly-Funded Software, Algorithms, and Workflows to be Made Publ...
Requiring Publicly-Funded Software, Algorithms, and Workflows to be Made Publ...
Daniel S. Katz
 
What is eScience, and where does it go from here?
What is eScience, and where does it go from here?What is eScience, and where does it go from here?
What is eScience, and where does it go from here?
Daniel S. Katz
 
Citation and Research Objects: Toward Active Research Objects
Citation and Research Objects: Toward Active Research ObjectsCitation and Research Objects: Toward Active Research Objects
Citation and Research Objects: Toward Active Research Objects
Daniel S. Katz
 
FAIR is not Fair Enough, Particularly for Software Citation, Availability, or...
FAIR is not Fair Enough, Particularly for Software Citation, Availability, or...FAIR is not Fair Enough, Particularly for Software Citation, Availability, or...
FAIR is not Fair Enough, Particularly for Software Citation, Availability, or...
Daniel S. Katz
 
Fundamentals of software sustainability
Fundamentals of software sustainabilityFundamentals of software sustainability
Fundamentals of software sustainability
Daniel S. Katz
 
Software Citation in Theory and Practice
Software Citation in Theory and PracticeSoftware Citation in Theory and Practice
Software Citation in Theory and Practice
Daniel S. Katz
 
Software citation
Software citationSoftware citation
Software citation
Daniel S. Katz
 
Expressing and sharing workflows
Expressing and sharing workflowsExpressing and sharing workflows
Expressing and sharing workflows
Daniel S. Katz
 
Citation and reproducibility in software
Citation and reproducibility in softwareCitation and reproducibility in software
Citation and reproducibility in software
Daniel S. Katz
 
Software Citation: Principles, Implementation, and Impact
Software Citation:  Principles, Implementation, and ImpactSoftware Citation:  Principles, Implementation, and Impact
Software Citation: Principles, Implementation, and Impact
Daniel S. Katz
 
Summary of WSSSPE and its working groups
Summary of WSSSPE and its working groupsSummary of WSSSPE and its working groups
Summary of WSSSPE and its working groups
Daniel S. Katz
 
What do we need beyond a DOI?
What do we need beyond a DOI?What do we need beyond a DOI?
What do we need beyond a DOI?
Daniel S. Katz
 
Scientific research: What Anna Karenina teaches us about useful negative results
Scientific research: What Anna Karenina teaches us about useful negative resultsScientific research: What Anna Karenina teaches us about useful negative results
Scientific research: What Anna Karenina teaches us about useful negative results
Daniel S. Katz
 
Panel: Our Scholarly Recognition System Doesn’t Still Work
Panel: Our Scholarly Recognition System Doesn’t Still WorkPanel: Our Scholarly Recognition System Doesn’t Still Work
Panel: Our Scholarly Recognition System Doesn’t Still Work
Daniel S. Katz
 
US University Research Funding, Peer Reviews, and Metrics
US University Research Funding, Peer Reviews, and MetricsUS University Research Funding, Peer Reviews, and Metrics
US University Research Funding, Peer Reviews, and Metrics
Daniel S. Katz
 
Swift Parallel Scripting for High-Performance Workflow
Swift Parallel Scripting for High-Performance WorkflowSwift Parallel Scripting for High-Performance Workflow
Swift Parallel Scripting for High-Performance Workflow
Daniel S. Katz
 
Multi-component Modeling with Swift at Extreme Scale
Multi-component Modeling with Swift at Extreme ScaleMulti-component Modeling with Swift at Extreme Scale
Multi-component Modeling with Swift at Extreme Scale
Daniel S. Katz
 
Application Fault Tolerance (AFT)
Application Fault Tolerance (AFT)Application Fault Tolerance (AFT)
Application Fault Tolerance (AFT)
Daniel S. Katz
 

Open Source and Science at the National Science Foundation (NSF)

  • 1. Open Source and Science at NSF Daniel S. Katz Program Director, Division of Advanced Cyberinfrastructure (ACI)
  • 2. Big Science and Infrastructure • Hurricanes affect humans • Multi-physics: atmosphere, ocean, coast, vegetation, soil – Sensors and data as inputs • Humans: what have they built, where are they, what will they do – Data and models as inputs • Infrastructure: – Urgent/scheduled processing, workflows – Software applications, workflows – Networks – Decision-support systems, visualization – Data storage, interoperability
  • 3. Long-tail Science and Infrastructure • CIPRES Science Gateway for Phylogenetics – Study of diversification of life and relationships among living things through time • Highly used – Cited in at least 400 publications, e.g., Nature, PNAS, Cell – More than 5000 unique users in 3 years – Used routinely in at least 68 undergraduate classes – 45% US (including most states), 55% from 70 other countries • Infrastructure – Flexible web application • A science gateway, uses software and lessons from XSEDE gateways team, e.g., identity management, HPC job control – Science software: tree inference and sequence alignment • Parallel versions of MrBayes, RAxML, GARLI, BEAST, MAFFT • PAUP*, Poy, ClustalW, Contralign, FSA, MUSCLE, ... – Data • Personal user space for storing results • Tools to transfer and view data Credit: Mark Miller, SDSC
  • 4. Cyberinfrastructure (e-Research) • “Cyberinfrastructure consists of computing systems, data storage systems, advanced instruments and data repositories, visualization environments, and people, all linked together by software and high performance networks to improve research productivity and enable breakthroughs not otherwise possible.” -- Craig Stewart • Infrastructure elements: – parts of an infrastructure, – developed by individuals and groups, – international, – developed for a purpose, – used by a community
  • 5. NSF Software Vision NSF will take a leadership role in providing software as enabling infrastructure for science and engineering research and education, and in promoting software as a principal component of its comprehensive CIF21 vision • ... • Reducing the complexity of software will be a unifying theme across the CIF21 vision, advancing both the use and development of new software and promoting the ubiquitous integration of scientific software across all disciplines, in education, and in industry – A Vision and Strategy for Software for Science, Engineering, and Education – NSF 12-113
  • 6. Infrastructure Role & Lifecycle • Create and maintain a software ecosystem providing new capabilities that advance and accelerate scientific inquiry at unprecedented complexity and scale • Support the foundational research necessary to continue to efficiently advance scientific software • Enable transformative, interdisciplinary, collaborative, science and engineering research and education through the use of advanced software and services • Transform practice through new policies for software addressing challenges of academic culture, open dissemination and use, reproducibility and trust, curation, sustainability, governance, citation, stewardship, and attribution of software authorship • Develop a next generation diverse workforce of scientists and engineers equipped with essential skills to use and develop software, with software and services used in both the research and education process
  • 7. Software Infrastructure Projects • In SI2, currently ~50 Elements & Frameworks projects & 13 potential Institutes planning projects • See http://bit.ly/sw-ci for current SI2 projects • NSF directorates have additional sw projects
  • 8. SI2 Solicitation and Decision Process • Cross-NSF software working group with members from all directorates • Determined how SI2 fits with other NSF programs that support software – See: Implementation of NSF Software Vision - http://www.nsf.gov/funding/pgm_summ.jsp?pims_id=504817 • Discusses solicitations, determines who will participate in each • Discusses and participates in review process • Works together to fund worthy proposals (matchmaking) – Unidisciplinary projects (e.g., bioinformatics app) – Multidisciplinary projects (e.g., molecular dynamics) – Omnidisciplinary projects (e.g., http, math library) • In all cases, need to forecast impact – Past performance does predict future results
  • 9. Measuring Impact – Scenarios 1. Developer of open source physics simulation – Possible metrics • How many downloads? (easiest to measure, least value) • How many contributors? • How many uses? • How many papers cite it? • How many papers that cite it are cited? (hardest to measure, most value) 2. Developer of open source math library – Possible metrics are similar, but citations are less likely – What if users don’t download it? • It’s part of a distro • It’s pre-installed (and optimized) on an HPC system • It’s part of a cloud image • It’s a service
  • 10. Vision for Metrics & Citation • Products (software, paper, data set) are registered – Input: credit map (weighted list of contributors: people, products, etc.) – Output: DOI – Leads to transitive credit • E.g., paper 1 provides 25% credit to software A, and software A provides 10% credit to library X -> library X gets 2.5% credit for paper 1 • Helps developer – “my tools are widely used, give me tenure” or “NSF should fund my tool maintenance” – Social issue: need to trust person who registers product • Works for papers today (w/out weights) for both author lists and for citations – Technological issue: Registration system (where, interface, multiple) • Product usage is recorded – Where? Both the developer and user want to track usage – Privacy issues? (legal, competitive, ...) – What does “using” a data set mean? How to trigger usage record? – Develop general code for this, add to multiple software? • Ties to provenance • With user input, tie usage to new research and development
  • 11. Vision for Metrics & Citation, thoughts • Can this be done incrementally? • Lack of credit is a larger problem than often perceived – Lack of credit is a disincentive for sharing software and data – Providing credit would both remove a disincentive and add an incentive – See Lewin’s principle of force field analysis (1943) • For commercial tools, credit is tracked by $ – But this doesn’t help understand what tools were used for what outcomes – Does this encourage collaboration? • Could a more economic model be used? – NSF gives tokens as part of science grants, users distribute tokens while/after using tools
  • 12. Software Questions for Projects • Sustainability to a program officer: – How will you support your software without me continuing to pay for it? • What does support mean? – Can I build and run it on my current/future system? – Do I understand what it does? – Does it do what it does correctly? – Does it do what I want? – Does it include newest science? • Governance model? – Tells users and contributors how the project makes decisions, how they can be involved – Community: Users? Developers? Both? – Models: dictatorship (Linux kernel), meritocracy (Apache), other? – Tie to development models: cathedral, bazaar
  • 13. General Software Questions • Does the open source model work for all science? – For some science? For underlying tools? • How many users/developers are needed for success? • Open Source for understanding (available) vs Open Source for reuse/development (changeable)? • Software that is intended to be infrastructure has challenges – Unlike in business, more users means more work – The last 20% takes 80% of the effort • What fraction of funds should be spent on support of existing infrastructure vs. development of new infrastructure? • How do we decide when to stop supporting a software element? • How do we encourage reuse and discourage duplication? • How do we more effectively support career paths for software developers (with universities, labs, etc.)?
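
The transitive-credit idea on slide 10 can be illustrated with a small sketch. This is not a system described in the slides: the `CREDIT_MAPS` dictionary, the `transitive_credit` function, and the assumption that credit maps form an acyclic graph are all hypothetical, chosen only to reproduce the slide's arithmetic (25% of 10% is 2.5%). Whether software A's own share should be reduced by what it passes on is not specified in the slides, so the sketch simply reports what each product receives.

```python
from collections import defaultdict

# Hypothetical credit maps: each registered product lists the fraction of
# credit it assigns to the products it builds on (values from slide 10).
CREDIT_MAPS = {
    "paper 1": {"software A": 0.25},
    "software A": {"library X": 0.10},
    "library X": {},
}


def transitive_credit(product, credit_maps):
    """Return the credit each upstream product receives for `product`.

    Assumes the credit maps form an acyclic graph; a cycle would make
    this recursion run forever.
    """
    totals = defaultdict(float)

    def walk(node, weight):
        # Pass a share of the incoming weight on to every dependency,
        # then recurse so that dependencies of dependencies get credit too.
        for dependency, share in credit_maps.get(node, {}).items():
            passed = weight * share
            totals[dependency] += passed
            walk(dependency, passed)

    walk(product, 1.0)
    return dict(totals)


credit = transitive_credit("paper 1", CREDIT_MAPS)
for name, value in credit.items():
    print(f"{name}: {value:.1%}")
```

Multiplying the weights along each path is what makes the credit transitive: library X is never cited by paper 1 directly, yet it still receives a share through software A.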

Editor's Notes

  1. Research (in OCI, CISE, directorates) feeds capabilities into CI. Science drives these capabilities, and is the output of using them. Policy changes are needed to make the CI most effective. Education is connected in using the CI, and also in training the workforce that will develop future versions of it. This is a snapshot: things change over time, new science drivers, new capabilities, etc.
  2. Mapping in SI2, ABI, other...