Building communities around open-source scientific software
Karen Cranston
National Evolutionary Synthesis Center (NESCent)
@kcranstn
http://www.slideshare.net/kcranstn
NESCent
National Evolutionary Synthesis Center
www.nescent.org
fieldwork
labwork
method development
meta-analysis
data synthesis
Species               A (mm^2)      F (mm^2/mm^2)  N (mm^-2)  S (mm^4)
Abelia biflora        0.002375829   0.924197654    389.0      6.11E-06
Abelia dielsii        0.00115375    0.357418211    331.0      3.49E-06
Abelia integrifolia   0.001134115   0.240432369    212.0      5.35E-06
Abelia mosanensis     0.000855299   0.632065665    739.0      1.16E-06
Abelia serrata        0.000706858   0.206402637    292.0      2.42E-06
Abelia spathulata     0.000804248   0.230819095    287.0      2.80E-06
Abutilon fruticosum   0.001452201   0.137959114    95.0       1.53E-05
Abutilon pannosum     0.003117245   0.124689812    40.0       7.79E-05
Acacia albida         0.012271846   0.049087385    4.0        0.003067962
Acacia ataxacantha    0.013069811   0.169907541    13.0       0.00100537
Acacia borleae        0.004071504   0.061072561    15.0       0.000271434
Acacia burkei         0.008992024   0.053952141    6.0        0.001498671
Acacia caffra         0.010207035   0.214347725    21.0       0.000486049
trait data about species + evolutionary trees
Outcomes: Community
Brian O'Meara, Michael Alfaro, Charles Bell, Ben Bolker, Marguerite Butler, Peter Cowan, Damien de Vienne, Richard
Desper, Joe Felsenstein, Luke Harmon, Christoph Heibl, Andrew Hipp, Gene Hunt, Thibaut Jombart, Steve Kembel, Hilmar
Lapp, Scott Loarie, Wayne Maddison, Peter Midford, David Orme, Emmanuel Paradis, Sam Price, Dan Rabosky, Brian
Sidlauskas, Stacey Smith, Dave Swofford, Todd Vision, Peter Waddell, Amy Zanne, Derrick Zwickl [bold indicates organizer]
Comparative methods in R hackathon
Rationale
The R statistical analysis package has emerged as a popular platform for implementation of powerful comparative phylogenetic methods to understand the evolution of organismal traits and diversification. It includes methods such as independent contrasts, ancestral state estimation, various models of continuous and discrete trait evolution, lineage through time plots, diversification tests, generalized estimating equations, tree plotting, and more. This event was designed to bring together active R developers as well as end-users working on the integration of comparative phylogenetic methods within R to actively address issues of data exchange standards, code interoperability, usability, documentation quality, and the breadth of functionality for comparative methods available within R. The idea originated from a whitepaper submitted by NESCent postdocs Amy Zanne and Sam Price.
Work at hackathon (Dec. 10-14, 2007)
•30 developers and users worked on
programming & writing documentation
•Split into subgroups on diversification, divergence times, documentation, class
design, Mesquite-R interaction, input/output, and trait evolution
•Package source code stored on shared repository hosted at R-forge (“PhyloConductor”)
Hackathon participants (red were flown to NESCent, purple participated remotely). Map from Google Maps
•Designed and began implementing a new S4 class for data and trees
•Ran “bootcamps” for developers on numerical optimization and S4 coding
•Used the Nexus Class Library (Lewis & Holder) and Rcpp (Samperi) for reading and
interpreting Nexus tree and data files
•Began work on R tutorials
•Tested existing methods in R, identifying errors
•Developed ways for R to call Mesquite and Mesquite to call R
[Figure: commits per day during the hackathon, 12/10 to 12/15]
•R-Phylo Wiki (http://www.r-phylo.org): Tutorials and overview of
available analyses and packages from the hackathon
have been placed on a public website for all to use
and improve. It’s had >7,000 page visits from >30
countries and >600 edits since it went live in March
2008.
•R-sig-phylo mailing list (https://stat.ethz.ch/mailman/listinfo/r-sig-phylo):
A mailing list for users of R for comparative methods and
phylogenetics. Over 100 messages in its first four months.
•Comparative methods in R user tutorials planned for 2009 Society
for Integrative and Comparative Biology and Evolution meetings.
•Addition of R track to NESCent summer course in
phyloinformatics, featuring software developed at hackathon and
taught by hackathon participant Marguerite Butler.
•Proposal to NSF for summer course in R for phyloinformatics.
•Ongoing collaborations between hackathon participants.
•Two Google Summer of Code projects to sponsor student
NESCent informatics
Incompatible tree formats are used in different R packages
Package          Function
geiger 1.0-9.1   sim.char
ouch 1.2-4       brown.dev
picante          evolve.brownian
ape 2.01         evolve.phylo
Redundancy (at least four functions to evolve traits up the tree using simple
Brownian motion)
Can be intimidating to beginners
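The redundancy noted above concerns one small calculation. As a rough illustration only, here is a minimal Python sketch of evolving a continuous trait up a tree under simple Brownian motion. It is not the code of any of the R functions listed; the toy tree, the dictionary layout, and the names simulate_bm and tip_values are all assumptions made for this example.

```python
import random

# Hypothetical toy tree: node -> (parent, branch_length).
# The root has parent None; parents are listed before their children.
TREE = {
    "root": (None, 0.0),
    "n1": ("root", 1.0),
    "A": ("n1", 0.5),
    "B": ("n1", 0.5),
    "C": ("root", 1.5),
}


def simulate_bm(tree, sigma2=1.0, root_state=0.0, rng=None):
    """Simulate one trait under Brownian motion from the root to the tips.

    Each node's value is its parent's value plus a normal draw with
    mean 0 and variance sigma2 * branch_length.
    """
    rng = rng or random.Random()
    values = {}
    # Relies on the parent-before-child ordering of the dictionary.
    for node, (parent, length) in tree.items():
        if parent is None:
            values[node] = root_state
        else:
            sd = (sigma2 * length) ** 0.5
            values[node] = values[parent] + rng.gauss(0.0, sd)
    return values


def tip_values(tree, values):
    """Keep only the tips, i.e. nodes that are nobody's parent."""
    parents = {parent for parent, _ in tree.values()}
    return {node: v for node, v in values.items() if node not in parents}


traits = tip_values(TREE, simulate_bm(TREE, sigma2=0.1, rng=random.Random(42)))
print(sorted(traits))
```

Tips that share more branch length (A and B here) receive correlated values, which is the property comparative methods build on.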
Coding at hackathon
The US National Evolutionary Synthesis
Center (http://www.nescent.org)
encourages synthetic, interdisciplinary,
and transformative research in evolutionary
biology. NESCent, a collaborative effort of
Duke, NC State University and UNC Chapel
Hill, is located in Durham NC and is supported
by the National Science Foundation
(EF-0423641).
A major goal of NESCent's Informatics branch
is to promote community-driven, collaborative
open-source software development. This is
achieved through hackathons, internships
(such as the Google Summer of Code), summer
courses, conference workshops, and by
externally funded collaborations for
the development and support of
Outcomes: Software
•Phylobase (http://r-forge.r-project.org/projects/phylobase/): New
package for phylogenetic trees and data. Can load trees and data
from Nexus files, output to other tree formats, coordinate pruning of
taxa from data and tree, traverse tree, handle DNA, morphological,
and continuous data types. Work is ongoing (below) to enhance tree
plotting and other functions. As with all hackathon products, new
developers are welcome to join to further improve the code (one
already has).
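To make "coordinate pruning of taxa from data and tree" concrete, the following is a minimal Python sketch of the idea: dropping a taxon removes its tip, collapses any internal node left with a single child, and removes the matching data row in the same step. This is not Phylobase's API; the tree layout, the trait values, and the names children_of and prune_taxa are hypothetical.

```python
# Hypothetical toy tree: node -> (parent, branch_length); root has parent None.
TREE = {
    "root": (None, 0.0),
    "n1": ("root", 1.0),
    "A": ("n1", 0.5),
    "B": ("n1", 0.5),
    "C": ("root", 1.5),
}
# Hypothetical trait values, one per tip.
DATA = {"A": 1.2, "B": 0.7, "C": 3.4}


def children_of(tree):
    """Map every node to the list of its children."""
    kids = {node: [] for node in tree}
    for node, (parent, _) in tree.items():
        if parent is not None:
            kids[parent].append(node)
    return kids


def prune_taxa(tree, data, drop):
    """Return copies of tree and data without the taxa named in `drop`."""
    tree = dict(tree)
    tips = {node for node, kids in children_of(tree).items() if not kids}
    keep_tips = tips - set(drop)
    changed = True
    while changed:
        changed = False
        kids = children_of(tree)
        for node, (parent, length) in list(tree.items()):
            if node in keep_tips or parent is None:
                continue
            if len(kids[node]) == 0:
                # A dropped tip, or an internal node that lost all its tips.
                del tree[node]
            elif len(kids[node]) == 1:
                # Collapse a single-child node, summing branch lengths.
                child = kids[node][0]
                tree[child] = (parent, tree[child][1] + length)
                del tree[node]
            else:
                continue
            changed = True
            break  # recompute the child lists after every change
    pruned_data = {tip: v for tip, v in data.items() if tip in keep_tips}
    return tree, pruned_data


new_tree, new_data = prune_taxa(TREE, DATA, drop={"B"})
print(new_tree)
print(new_data)
```

Keeping both steps in one function is the design point: the tree and the data table cannot drift out of sync, which is what a shared class for trees and data is meant to guarantee.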
URL: http://hackathon.nescent.org/R_Hackathon_1 email: hackathon2@nescent.org
[Figure: commits over time after the hackathon, 12/16 to 6/1]
•Movement of existing packages to source code repositories
allowing more collaborative development (e.g., the Picante package has
new Google Summer of Code 2008 developer Matthew Helmus)
•R-Mesquite interaction: Code written to allow Mesquite (Maddison
& Maddison, 2007) to call R packages (such as OUCH (Butler &
Coding for PhyloBase
Nature Precedings: doi:10.1038/npre.2008.2126.1: Posted 28 Jul 2008
O’Meara et al. Nature Precedings. 2008 http://dx.doi.org/10.1038/npre.2008.2126.1
R-sig-phylo mailing list
32 R packages for comparative biology;
maintained by a hackathon participant
Informatics
team
Evolutionary
biologists
computational skills
domain knowledge
NESCent
National Evolutionary Synthesis Center
www.nescent.org
short bootcamps teaching
computational skills to
domain scientists
bringing students into open-source
programming communities
A grassroots approach to software sustainability. Karen Cranston, Todd Vision, Brian
O'Meara, Hilmar Lapp. http://dx.doi.org/10.6084/m9.figshare.790739
