BioCatalogue talk by Carole Goble. In these slides she outlines the reasons behind the BioCatalogue project and presents BioCatalogue and its goals.
Better software, better service, better research: The Software Sustainabilit... - Carole Goble
Ever spotted some great looking software only to discover you can’t get it, it doesn’t work, there is no documentation to help fix it and the developers don’t have the time or incentive to help? Ever produced some software that you want to be widely used or have folks contribute? What’s the sustainability of that key platform/library/tool /database your lab uses day in and day out? Are you helping the providers? The same issues stand for Data (or as we now say “FAIR” Findable, Accessible, Interoperable, Reusable Data) and its metadata. Is anyone looking out for Europe’s data services– the datasets and analysis systems you use and you make – the standards they use and the curators and developers who make them? Or is FAIR just a FAIRy story? I’ll tell how two organisations with quite different structures and approaches - the UK’s Software Sustainability Institute and the ELIXIR European Research Infrastructure for Life Science Data – are working for the common goal of better software, better service, and better research.
https://www.rothamsted.ac.uk/events/14th-international-symposium-integrative-bioinformatics
What is Reproducibility? The R* brouhaha (and how Research Objects can help) - Carole Goble
presented at the First International Workshop on Reproducible Open Science @ TPDL, 9 Sept 2016, Hannover, Germany
http://repscience2016.research-infrastructures.eu/
presentation at https://researchsoft.github.io/FAIReScience/, FAIReScience 2021 online workshop
virtually co-located with the 17th IEEE International Conference on eScience (eScience 2021)
Keynote on software sustainability given at the 2nd Annual Netherlands eScience Symposium, November 2014.
Based on the article
Carole Goble, "Better Software, Better Research," IEEE Internet Computing, vol. 18, no. 5 (Sept.-Oct. 2014), pp. 4-8, IEEE Computer Society.
http://www.computer.org/csdl/mags/ic/2014/05/mic2014050004.pdf
http://doi.ieeecomputersociety.org/10.1109/MIC.2014.88
http://www.software.ac.uk/resources/publications/better-software-better-research
FAIR Computational Workflows
Computational workflows capture precise descriptions of the steps and data dependencies needed to carry out computational data pipelines, analyses and simulations in many areas of science, including the Life Sciences. The use of computational workflows to manage these multi-step processes has accelerated in the past few years, driven by the need for scalable data processing, the exchange of processing know-how, and the desire for more reproducible (or at least transparent) and quality-assured processing methods. The SARS-CoV-2 pandemic has strikingly highlighted the value of workflows.
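The "steps plus data dependencies" idea can be made concrete with a small sketch (the step names and dependency graph below are invented for illustration, not taken from any particular workflow system): a workflow is a dependency graph from which the execution order is derived rather than hand-coded.

```python
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

# Hypothetical four-step analysis: each step names the steps whose
# outputs it depends on; the run order is computed, not hand-coded.
steps = {
    "clean_reads":   [],                          # no upstream dependencies
    "align":         ["clean_reads"],             # needs cleaned reads
    "call_variants": ["align"],                   # needs alignments
    "report":        ["align", "call_variants"],  # needs both
}

def run_order(dag):
    """Return an execution order that respects every data dependency."""
    return list(TopologicalSorter(dag).static_order())

order = run_order(steps)
print(order)  # clean_reads, then align, then call_variants, then report
```

Real workflow managers such as Galaxy, Snakemake and Nextflow add scheduling, caching and provenance tracking on top of exactly this kind of dependency resolution.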
This increased interest in workflows has been matched by the number of workflow management systems available to scientists (Galaxy, Snakemake, Nextflow and 270+ more) and the number of workflow services such as registries and monitors. There is also recognition that workflows are first-class, publishable Research Objects, just as data are. They deserve their own FAIR (Findable, Accessible, Interoperable, Reusable) principles and services that cater for their dual roles as explicit method descriptions and executable software methods [1]. To promote long-term usability and uptake by the scientific community, workflows (as well as the tools that integrate them) should become FAIR+R(eproducible) and citable, so that authors' credit is attributed fairly and accurately.
The work on improving the FAIRness of workflows has already started, and a whole ecosystem of tools, guidelines and best practices is under development to reduce the time needed to adapt, reuse and extend existing scientific workflows. An example is the EOSC-Life Cluster of 13 European Biomedical Research Infrastructures, which is developing a FAIR Workflow Collaboratory based on the tools ecosystem of ELIXIR, the European Research Infrastructure for Life Science Data. While there are many tools addressing different aspects of FAIR workflows, many challenges remain in describing, annotating and exposing scientific workflows so that they can be found, understood and reused by other scientists.
This keynote will explore the FAIR principles for computational workflows in the Life Sciences, using the EOSC-Life Workflow Collaboratory as an example.
[1] Carole Goble, Sarah Cohen-Boulakia, Stian Soiland-Reyes, Daniel Garijo, Yolanda Gil, Michael R. Crusoe, Kristian Peters, and Daniel Schober, "FAIR Computational Workflows," Data Intelligence 2(1-2), 108-121 (2020). https://doi.org/10.1162/dint_a_00033
This presentation was provided by Tim McGeary of Duke University during the NISO virtual conference, Open Data Projects, held on Wednesday, June 13, 2018.
It Takes a Village to Grow ORCIDs on Campus: Establishing and Integrating Uni... - Violeta Ilik
This presentation describes the integration of ORCID identifiers into the open source Vireo electronic theses and dissertations (ETD) workflow, the university's digital repository, and the internally-used VIVO profile system.
Presented at Texas Conference on Digital Libraries (TCDL) 2014:
https://conferences.tdl.org/tcdl/index.php/TCDL/TCDL2014/schedConf/program
This presentation describes the work of the Global Alliance for Genomics and Health, and its members, to develop standards and technologies to make genomics and clinical data more findable, accessible, and useful.
Being Reproducible: SSBSS Summer School 2017 - Carole Goble
Lecture 2:
Being Reproducible: Models, Research Objects and R* Brouhaha
Reproducibility is an R* minefield, depending on whether you are testing for robustness (rerun), defence (repeat), certification (replicate), comparison (reproduce) or transferring between researchers (reuse). Different forms of "R" make different demands on the completeness, depth and portability of research. Sharing is another minefield, raising concerns of credit and protection from sharp practices.
In practice the exchange, reuse and reproduction of scientific experiments is dependent on bundling and exchanging the experimental methods, computational codes, data, algorithms, workflows and so on along with the narrative. These "Research Objects" are not fixed, just as research is not “finished”: the codes fork, data is updated, algorithms are revised, workflows break, service updates are released. ResearchObject.org is an effort to systematically support more portable and reproducible research exchange.
In this talk I will explore these issues in more depth using the FAIRDOM Platform and its support for reproducible modelling. The talk will cover initiatives and technical issues, and raise social and cultural challenges.
Publishing your research: Research Data Management (Introduction) Jamie Bisset
Publishing your research: Research Data Management (Introduction) (November 2013) slides. Delivered as part of the Durham University Researcher Development Programme. Further Training available at https://www.dur.ac.uk/library/research/training/
A talk about the merger and refactoring of the eagle-i and VIVO ontologies, presented by me together with Brian Lowe, Janos Hajagos, and Erich Bremer at the VIVO 2013 conference in St. Louis.
A keynote given on experiences in curating workflows and web services.
3rd International Digital Curation Conference: "Curating our Digital Scientific Heritage: a Global Collaborative Challenge"
11-13 December 2007
Renaissance Hotel
Washington DC, USA
Keynote presentation by Professor Carole Goble at BOSC (Bioinformatics Open Source Conference) Long Beach, California, USA, July 14 2012. Co-located with ISMB, Intelligent Systems in Molecular Biology
Six Principles of Software Design to Empower Scientists - David De Roure
Keynote talk for Workshop on Managing for Usability:
Challenges and Opportunities for E-Science Project Management, 10-11 April 2008,
OeRC, University of Oxford, UK
NSF Workshop Data and Software Citation, 6-7 June 2016, Boston USA, Software Panel
Findable, Accessible, Interoperable, Reusable Software and Data Citation: Europe, Research Objects, and BioSchemas.org
Being FAIR: FAIR data and model management, SSBSS 2017 Summer School - Carole Goble
Lecture 1:
Being FAIR: FAIR data and model management
In recent years we have seen a change in expectations for the management of all the outcomes of research, that is, the "assets" of data, models, codes, SOPs and workflows. The "FAIR" (Findable, Accessible, Interoperable, Reusable) Guiding Principles for scientific data management and stewardship [1] have proved to be an effective rallying cry. Funding agencies expect data (and increasingly software) management, retention and access plans. Journals are raising their expectations of the availability of data and codes for pre- and post-publication. The multi-component, multi-disciplinary nature of Systems and Synthetic Biology demands the interlinking and exchange of assets and the systematic recording of metadata for their interpretation.
Our FAIRDOM project (http://www.fair-dom.org) supports Systems Biology research projects with their research data, methods and model management, with an emphasis on standards smuggled in by stealth and sensitivity to asset sharing and credit anxiety. The FAIRDOM Platform has been installed by over 30 labs or projects. Our public, centrally hosted Asset Commons, the FAIRDOMHub.org, supports the outcomes of 50+ projects.
Now established as a grassroots association, FAIRDOM has over 8 years of experience of practical asset sharing and data infrastructure at the researcher coal-face, ranging across European programmes (SysMO and ERASysAPP ERANets), national initiatives (Germany's de.NBI and Systems Medicine of the Liver; Norway's Digital Life) and European Research Infrastructures (ISBE), as well as in PIs' labs and centres such as the SynBioChem Centre at Manchester.
In this talk I will explore how FAIRDOM has been designed to support Systems Biology projects and show examples of its configuration and use. I will also explore the technical and social challenges we face.
I will also refer to European efforts to support public archives for the life sciences. ELIXIR (http://www.elixir-europe.org/) is the European Research Infrastructure of 21 national nodes and a hub, funded by national agreements to coordinate and sustain key data repositories and archives for the Life Science community, improve access to them and related tools, support training, and create a platform for dataset interoperability. As Head of the ELIXIR-UK Node and co-lead of the ELIXIR Interoperability Platform, I will show how this work relates to your projects.
[1] Wilkinson et al., "The FAIR Guiding Principles for scientific data management and stewardship," Scientific Data 3 (2016). doi:10.1038/sdata.2016.18
FAIRy stories: the FAIR Data principles in theory and in practice - Carole Goble
https://ucsb.zoom.us/meeting/register/tZYod-ippz4pHtaJ0d3ERPIFy2QIvKqjwpXR
FAIRy stories: the FAIR Data principles in theory and in practice
The ‘FAIR Guiding Principles for scientific data management and stewardship’ [1] launched a global dialogue within research and policy communities and started a journey to wider accessibility and reusability of data and preparedness for automation-readiness (I am one of the army of authors). Over the past 5 years FAIR has become a movement, a mantra and a methodology for scientific research and increasingly in the commercial and public sector. FAIR is now part of NIH, European Commission and OECD policy. But just figuring out what the FAIR principles really mean and how we implement them has proved more challenging than one might have guessed. To quote the novelist Rick Riordan “Fairness does not mean everyone gets the same. Fairness means everyone gets what they need”.
As a data infrastructure wrangler I lead and participate in projects implementing forms of FAIR in pan-national European biomedical Research Infrastructures. We apply web-based, industry-led approaches like Schema.org; work with big pharma on specialised FAIRification pipelines for legacy data; promote FAIR-by-Design methodologies and platforms in the researcher's lab; and expand the principles of FAIR beyond data to computational workflows and digital objects. Many use Linked Data approaches.
In this talk I’ll use some of these projects to shine some light on the FAIR movement. Spoiler alert: although there are technical issues, the greatest challenges are social. FAIR is a team sport. Knowledge Graphs play a role – not just as consumers of FAIR data but as active contributors. To paraphrase another novelist, “It is a truth universally acknowledged that a Knowledge Graph must be in want of FAIR data.”
[1] Wilkinson, M., Dumontier, M., Aalbersberg, I. et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data 3, 160018 (2016). https://doi.org/10.1038/sdata.2016.18
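As a hedged illustration of the web-based, Schema.org-style markup mentioned in the abstract above (the dataset name, DOI and creator below are invented placeholders), a machine-readable dataset description can be emitted as JSON-LD for embedding in a landing page:

```python
import json

# Hypothetical dataset described with Schema.org-style properties;
# embedding such JSON-LD in a web page makes the dataset findable
# by ordinary web crawlers and dataset search engines.
dataset = {
    "@context": "https://schema.org/",
    "@type": "Dataset",
    "name": "Example liver metabolomics dataset",     # invented example
    "identifier": "https://doi.org/10.xxxx/example",  # placeholder DOI
    "license": "https://creativecommons.org/licenses/by/4.0/",
    "creator": {"@type": "Person", "name": "A. Researcher"},  # placeholder
}

markup = json.dumps(dataset, indent=2)
print(markup)
```

The Bioschemas effort mentioned elsewhere in these talks extends exactly this pattern with life-science-specific types and properties.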
This presentation was provided by Violeta Ilik of Northwestern University during the NISO Virtual Conference held on Feb 15, 2017, entitled Institutional Repositories: Ensuring Yours is Populated, Useful and Thriving. The DOI for this presentation is http://dx.doi.org/10.18131/G3VP6R
Presentation of the BioCatalogue Web Services registry at the EMBL-EBI Small and Medium Size (SME) workshop in Munich in October 2010. The presentation was given by Eric Nzuobontane.
6. Service and Workflow analytics and network analysis: recommendations and co-use; social networks of third-party, externally hosted services; automated diagnostics, monitoring and metadata curation.
7. Finding and Curating Services (http://www.biocatalogue.org). Drawing on 6 years of experience in Taverna of semantic annotation of services using RDF and OWL ontologies, and on experience at the EBI in service provision. First pilot early November 2008; it will cover major providers (EBI, NCBI, DDBJ) at "bronze" quality and show some at platinum.
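A minimal sketch of what semantic annotation of a service with ontology terms looks like as RDF (the service URI, ontology namespace and property names below are invented for illustration; BioCatalogue's actual annotations used the myGrid service ontology):

```python
# Serialise hand-built RDF triples as N-Triples (stdlib only) to
# annotate a hypothetical service operation with a domain term.
SERVICE = "http://example.org/services/blast#run"             # invented URI
TASK    = "http://example.org/mygrid-like#SequenceAlignment"  # invented term

triples = [
    (SERVICE, "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
     "http://example.org/mygrid-like#Operation"),
    (SERVICE, "http://example.org/mygrid-like#performsTask", TASK),
]

def to_ntriples(ts):
    """Serialise (subject, predicate, object) URI triples as N-Triples."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in ts)

doc = to_ntriples(triples)
print(doc)
```

Annotations in this shape can be queried with SPARQL and checked against an OWL ontology, which is what makes ontology-based service discovery possible.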
11. Workflows and Services: four curation modes, each cycling through seed, refine and validate: Curation by Experts; Social Curation by the Crowd; Self-Curation by Contributors; Automated Curation.
12. Multiple Annotation Profiles: user profiles, service profiles and group profiles, each carrying its own annotations that feed ranking functions.
13. Service Profile Curation Model: six facets (Functional, Operational, Operational Metrics, Provenance, Conditions of Use, Social Standing), combining quantitative content (tags, metrics, QoS, usage, versioning) with semantic content (service model and ontologies).
14. Architecture sketch: service profiles (a quantitative service model plus a semantic content model) built from WSDL, WADL, SAWSDL and SA-REST descriptions and third-party execution hosts, feeding finding (browse/shop, search), analytics, ranking, monitoring, curation, customised services and workflows.
15. Service Profile Facets (interface neutral): Functional, Conditions of Use, Operational, Operational Metrics, Social Standing, Provenance.
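The facets listed on this slide can be sketched as a simple record type (the field names below are my paraphrase of the slide labels, not the actual BioCatalogue schema):

```python
from dataclasses import dataclass, field

@dataclass
class ServiceProfile:
    """One field per facet of the slide's service profile model (sketch)."""
    functional: list = field(default_factory=list)           # what it does
    conditions_of_use: str = ""                              # licence/terms
    operational: dict = field(default_factory=dict)          # endpoints, versions
    operational_metrics: dict = field(default_factory=dict)  # uptime, latency
    social_standing: dict = field(default_factory=dict)      # tags, ratings
    provenance: list = field(default_factory=list)           # who curated what

profile = ServiceProfile(functional=["sequence alignment"],
                         conditions_of_use="free for academic use")
print(profile.functional)
```

Keeping the facets as separate fields is what lets different curators (experts, the crowd, automated monitors) each fill in the part of the profile they know about.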
16. The facets are multiply described and dynamic: third-party aggregated feeds and monitoring supply multiple sources, multiple versions and multiple instances; they support discovery, interoperability, composition and reuse; descriptions range from trusted authorities, policies, ontologies and controlled vocabularies to tags, free text and folksonomies, using standards such as W*DL, Atom and schemas.
17. The same facets feed ranking.
18. Pay as you Go, Emergent Curation: just enough, just in time, not just in case. What is the return for the investment? On a gain/pain trade-off, folksonomy tagging alone is very bad, hard-core full-on ontology curation is good but unlikely, and "just right" is metadata rich enough for effective reuse.
31. Finding, curating and reusing workflows: connecting scientists in the wild. A supermarket for workflow users; a toolbox for workflow creators; social networking over commodities; different disciplines. 1,200+ members from 114 countries; 50,000+ workflow downloads; 1,500-2,000 unique visitors/month; 460+ workflows; 98 groups; 35+ packs. Running for just over a year. Joint Manchester and Southampton; project leader: Prof David De Roure.
35. BioCatalogue Team: Thomas Laurent, Hamish McWilliams, Franck Tanoh, Jiten Bhagat, Carole Goble, Rodrigo Lopez, Eric Nzuobontane.
The plan for this talk was to highlight what BioCatalogue is and to give a demo, but unfortunately the demo is not ready, so I will use some screenshots to show what is really going on and what to expect next from BioCatalogue. Background of the talk: there are lots of databases and data resources; Feta existed but could not annotate all the services; hence BioCatalogue.
Services are methods too.
Fix, File and Forget is curation, in a way. Assets are used, we hope: by applications and scientists who had anticipated using them, and by applications and scientists that had not, or in ways that were unanticipated.
Of course it isn't as clean as that, and the pieces are highly interrelated.
Workflows are combinations of services; they are external, not self-contained or isolated. Service and workflow analytics and network analysis; service diagnostics and monitoring; automated curation.
Get service providers involved; get the community involved. 3,500+ service operations, but only ~700 annotated in Feta. myGrid Service Ontology; annotation and curation pipeline; curation and discovery tools. Other registries: DAS Registry, BioMOBY Central, SeekDa …
Scientists are naughty, and reuse is hard. We have to try services to find out what they do (IVOA referred to this too). "I used it last time so it will work again the same way… damn!" Services change location, capabilities and signatures (BioMART changed its interface three times in 2006); new ones appear and existing ones disappear (SeqHound); they decay and become outdated or unreliable.
Services in the wild are frequently, er, disappointing and hard to use ("Rubbish™"). Writing reusable workflows is hard: local services, permissions, licences, and "what does it DO?". Writing reusable services is hard: what does it DO? Predicting the unknown required by the unknown. Finding workflows, services and tools is hard: where do you go, and what does it DO? Creating web services is still a bottleneck; for quick solutions it is still seen as too much extra trouble.
Ruin, not fix-file-forget: services are not deposited and preserved in software libraries. Rapid metadata heart-beat, especially on operational metadata. (Could use the previous slide from the DCC talk.) Shadows: method archives record what it was that can be used again. They are referred to, but there is no SLA to be stable or standard, so they constantly need tending or else they go stale (cf. IVOA service validation, DAS). Not software libraries. BioNanny, using Grid tools; versioning of workflows (Andrea); regular health checks. Use myExperiment to notify scientists of potential problems and to be smart about which services should be monitored. Workflows are deposited, but they are not self-contained: they link to external services in flux, depend on software, and incorporate services unavailable to others. Hence workflow fragility and decay: workflows become plans and provenance rather than working scientific objects unless tended and updated.
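The "regular health checks" idea in these notes can be sketched as a simple classifier over a service's recent check history (the thresholds and status labels below are invented for illustration, not BioCatalogue's actual monitoring policy):

```python
# Classify a service from its recent health-check history: services that
# repeatedly fail are flagged so owners of dependent workflows can be
# notified before their workflows silently decay.
def classify(history, window=3):
    """history: list of booleans, True = check passed, most recent last."""
    if not history:
        return "unmonitored"
    recent = history[-window:]
    if all(recent):
        return "healthy"
    if not any(recent):
        return "down"      # every recent check failed
    return "unstable"      # intermittent failures: likely decaying

print(classify([True, True, True]))           # healthy
print(classify([True, False, False, False]))  # down
print(classify([True, False, True]))          # unstable
```

Prioritising which services to check at all could then use registry data, for example checking most often the services that appear in the most downloaded workflows.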
In particular, a platform for research into curation practices, as in the panel today. Expert curation is library-like; suppliers and the crowd are the web side; automated is…
The group profile is the interrelationships between the services: co-reference, co-use, …
Curation includes versioning; analytics includes monitoring.
OAIS? From the model point of view. From the standoff annotation point of view. Metadata richness.
Skipped all but the core in talk. OAIS? From the model point of view. From the standoff annotation point of view. Metadata richness.
From the model point of view; from the standoff, model-neutral annotation point of view. Bronze, silver, gold and platinum compliance levels.
Frankly, is it worth it to do the detailed stuff?
Richness spectrum (spoke to it but probably should have skipped). The quality and completeness of metadata: graceful decay from platinum to bronze. Semantic Web services: the IVOA talk asked "why and when semantics?" Here is an answer; it leads to multiple pipelines and multiple… Scientist finding: simple classifications on a few properties; simple queries to reduce the search space, with the final decision left to the user; biological terms; heavy use of provenance, reputation, usage patterns, operational properties, example configurations and boring stuff like that. Think Amazon: the interface is the thing. Automation, validation and execution: rich metadata for automatic service configuration, invocation and fault management; rich descriptions for reasoning about mismatches, debugging, repair and automated composition. Hard and time-consuming.
A joint Manchester-EBI project.
Technical infrastructure, but it's still not all joined up! Feta keeps coming and going. Grid service descriptions are produced by annotating services with terms from the myGrid ontology and are stored in a central registry, GRIMOIRES. Services are found using the Feta discovery service [5]. We have piloted expert manual annotation tools augmented by automated tools using information extraction techniques.
These are not our scientists or our projects; we have none. It's just scientists in the wild, 50% USA and UK. Google Analytics says: 1,931 unique visitors for 3rd Sept to 3rd Oct; 1,698 unique visitors for 3rd Aug to 2nd Sept. myExperiment currently has 1,203 users, 98 groups, 460 workflows, 130 files and 36 packs. Extreme Web 2.0: 18 months old; built on Ruby on Rails; BSD licence; source code hosted on RubyForge; publicly available; 2 core developers, 50% in Southampton, 50% in Manchester; user-driven design and development. 959 active users; 1,429 unique IP visits in the last month; 82 groups; 248 group memberships; 296 workflow entries, 425 workflow versions; 101 files; 1,382 taggings; 46,427 downloads; 77,393 viewings; 408 creditations; 12 packs (with 237 total entries).
Towards repeatable, reproducible, comparable and reusable research