Keynote presentation by Professor Carole Goble at BOSC (Bioinformatics Open Source Conference), Long Beach, California, USA, July 14, 2012. Co-located with ISMB (Intelligent Systems in Molecular Biology).
myExperiment and the Rise of Social Machines (David De Roure)
Talk at hubbub 2012, Indianapolis, 25 September 2012. The talk introduces myExperiment and Wf4Ever, discusses the future of research communication including FORCE11, and introduces the SOCIAM project (Theory and Practice of Social Machines) which launches in October 2012.
The digital universe is booming, especially metadata and user-generated data. This raises strong challenges: identifying the portions of data relevant to a particular problem and dealing with the lifecycle of data. Finer-grained problems include data evolution and the potential impact of change on the applications relying on the data, causing decay. The management of scientific data is especially sensitive to this. We present the Research Objects concept as a means to identify and structure relevant data in scientific domains, treating data as first-class citizens. We also identify and formally represent the main causes of decay in this domain and propose methods and tools for their diagnosis and repair, based on provenance information. Finally, we discuss the application of these concepts to the broader domain of the Web of Data: Data with a Purpose.
This document discusses critical literacies and new technologies. It defines key concepts like literacy and new literacies. It describes characteristics of Web 2.0 technologies like user-generated content and social mediation. The document maps various digital skills frameworks to pedagogical approaches and proposes developing literacy skills through a reflective, design-based approach that encourages learning with and through others using visualization and new metaphors.
RDAP13 Mark Leggott: Stewarding research data using the Islandora framework (ASIS&T)
Mark Leggott, University of PEI/DiscoveryGarden
Islandora: Stewarding research data using the Islandora framework
Mark Leggott, Thornton Staples and Kathleen Van Ekris
Panel: Global scientific data infrastructure
Research Data Access & Preservation Summit 2013
Baltimore, MD April 4, 2013 #rdap13
Understanding Research 2.0 from a Socio-technical Perspective (Yuwei Lin)
This document discusses Research 2.0 from a socio-technical perspective. It outlines key concepts of Web 2.0 like blogging, social networking, and wikis. It also discusses O'Reilly's design patterns for Web 2.0 and De Roure and Goble's six principles for software design. The document examines challenges in developing Research 2.0 environments like involving users and addressing ethical and legal issues. It argues a socio-technical approach is needed to develop Research 2.0 that considers both technological and social aspects.
The document summarizes a presentation by the UC Curation Center on supporting UC research data management. The UC Curation Center helps ensure the long-term preservation of and access to UC's digital research outputs. They are developing tools and services to help researchers at all stages of the research lifecycle, from creating data management plans and collecting datasets to publishing, preserving, and sharing research outputs. Their goal is to engage researchers early, prioritize initiatives, provide simple evolving tools, deploy flexible infrastructure, and develop partnerships to support diverse research data management needs.
The document discusses the challenges of cataloging and metadata today and in the future, including changes in technology, user behavior, and the types of information objects that need to be described. It provides biographical information about Hendro Wicaksono and his experience working in libraries and developing cataloging systems. The document touches on the evolving nature of libraries, catalogs, metadata standards, and the tasks and skills needed for cataloging in the digital age.
This document summarizes key aspects of computational research methods and the myExperiment platform. It discusses how myExperiment allows researchers to automate, share, and reuse workflows and other methods. It also addresses challenges around reproducibility, provenance, collaboration, and incentives for sharing methods. MyExperiment provides social features and aims to build a community around openly exchanging and improving computational research techniques.
Do Libraries Meet Research 2.0 : collaborative tools and relevance for Resear... (Guus van den Brekel)
Presentation at the LIBER Conference 2009, Toulouse, June 30, 2009.
http://liber2009.biu-toulouse.fr/
Research Libraries & Web 2.0: scientists are engaging in science and Research 2.0, and libraries should follow by reaching out, engaging, exploring, and facilitating.
BioCatalogue talk by Carole Goble. In these slides she outlines the reasons behind the BioCatalogue project and presents the BioCatalogue and its goals.
myExperiment - Defining the Social Virtual Research Environment (David De Roure)
myExperiment is a social networking site for scientists to share workflows, data, and other research objects. It allows users to create profiles, join groups, and share content while maintaining control over privacy. The site aims to facilitate collaboration and reuse in scientific research. It was launched in 2007 and has over 1000 registered users sharing hundreds of workflows and other research objects. The open source software powering the site can also be downloaded and customized for specific communities or projects.
BioCatalogue DILS & Enfin 2009 by Jits (BioCatalogue)
Jiten Bhagat presented on the BioCatalogue, a public and curated catalogue of life science web services. The BioCatalogue allows anyone to register, discover, and curate web services. It provides rich metadata for over 3,000 publicly available life science web services. The BioCatalogue aims to make these services easier to find, advertise, understand, and use for various stakeholders like service providers, users, and curators. It utilizes a community-driven model where experts oversee annotation from both automated sources and crowdsourced user contributions.
The LIBER 2009 presentation repeated for a Dutch audience, delivered in Dutch but with the English slides (only the first slide is in Dutch).
Samenwerking Hogeschool Bibliotheken (SHB, the Dutch cooperative of university of applied sciences libraries), 5 November 2009.
MyExperiment.org is a social networking site and marketplace aimed at scientists who use workflows and services for their research. It allows users to publish, discover, share, and reuse experimental artifacts like workflows. The site aims to make these tools easy to use with a familiar social media-style interface. Key goals include crossing boundaries between individual experiments, disciplines, and systems to facilitate collaboration and intellectual fusion. Challenges include addressing issues around user incentives, metadata, provenance, intellectual property, and quality control as experiments are shared in an open yet curated environment.
CNI Fall 2011 Meeting Presentation, Margaret Hedstrom & Robert McDonald (Dec. ...) (SEAD)
SEAD is a new NSF-funded project that aims to provide sustainable data services for sustainability science research. It will integrate existing technologies and tools to address the needs of researchers working on "long tail" sustainability problems. SEAD is in its initial phase of developing prototypes and will not be ready to accept data until after October 2012. It is a collaboration between researchers at the University of Michigan, Indiana University, University of Illinois, and Rensselaer Polytechnic Institute.
These are the slides of Robert H. McDonald for the Future Trends Panel presentation at the Inter-institutional Approaches to Supporting Scholarly Communication Symposium held on August 16, 2012 at the Georgia Institute of Technology.
Where are we going and how are we going to get there? (David De Roure)
The document discusses the myExperiment virtual research environment for sharing workflows. Some key points:
1. myExperiment is a social network and repository for research workflows and methods. It currently has over 1800 users and hundreds of shared workflows.
2. The site allows fine-grained privacy controls, grouping of related content into "packs", and integration with other systems through federation.
3. Analysis found that most workflows and other content are shared publicly, and some users actively build upon other users' shared workflows. The most viewed workflow has over 1500 views.
4. The principles behind myExperiment's design focus on empowering scientists by enabling new forms of collaboration and sharing without forcing changes to workflows.
OSFair2017 Workshop | Building a global knowledge commons - ramping up reposi... (Open Science Fair)
Eloy Rodrigues, Petr Knoth & Kathleen Shearer showcase the conceptual model for this vision, as well as the role and functions of repositories within this model.
Workshop title: Building a global knowledge commons - ramping up repositories to support widespread change in the ecosystem
Workshop abstract:
The extensive international deployment of repository systems in higher education and research institutions, as well as scholarly communities, provides the foundation for a distributed, globally networked infrastructure for scholarly communication. This distributed network of repositories can and should be a powerful tool to promote the transformation of the scholarly communication ecosystem. However, repository platforms are still using technologies and protocols designed almost twenty years ago, before the boom of the web and the dominance of Google, social networking, semantic web and ubiquitous mobile devices. In April 2016, the Confederation of Open Access Repositories (COAR) launched a working group to help identify new functionalities and technologies for repositories and develop a road map for their adoption. For the past several months, the group has been working to define a vision for repositories and sketch out the priority user stories and scenarios that will help guide the development of new functionalities. The results of this work will be available in the summer of 2017.
This workshop will present the functionalities and technologies for the next generation of repositories and reflect on how these functionalities will be adopted into the existing software platforms. In addition, participants will discuss the important implications for the network layers, and how repositories will uniformly interact with the networks to provide value added services on top of their content.
DAY 3 - PARALLEL SESSION 6 & 7
http://www.opensciencefair.eu/workshops/parallel-day-3-1/building-a-global-knowledge-commons-ramping-up-repositories-to-support-widespread-change-in-the-ecosystem
This talk introduces Linked Data and the Semantic Web using two examples: a population sciences grid and semantAqua, a semantically enabled environmental monitoring system. It shows a few tools and the semantic methodology, and opens a discussion on LOD and team science.
Presentation at EMTACL10, http://www.ntnu.no/ub/emtacl/
Guus van den Brekel
Central medical library, UMCG
Virtual Research Networks: towards Research 2.0
In the next few years, the further development of social, educational and research networks – with their extensive collaborative possibilities – will dictate how users search for, manage and exchange information. The network, enabled by technology, is changing users' behaviour, and that will affect the future of information services. Many envision a possible leading role for libraries in collaboration and community-building services.
Users are not only heavily using new tools, but are also creating and shaping their own preferred tools.
Today's students are incorporating Web 2.0 skills in daily life, in their social and learning environments.
Tomorrow's research staff will expect to be able to use their preferred tools and resources within their work environment.
Today's and tomorrow's libraries should support students and staff in the learning and research process by integrating library services and resources into their environments.
This document discusses using linked open data and semantic technologies to support next generation science. It provides background on the increasing availability of open data and opportunities for citizen science contributions. Semantic technologies can help integrate and link diverse scientific data sources. Linked data principles allow disparate datasets to be connected through shared identifiers and relationships. Examples are provided of existing projects that use semantic approaches to enable scientific data discovery, analysis and collaboration across domains like population health, water quality monitoring and climate change. Overall, the document argues that semantic technologies are mature and can help scientists address large, distributed problems by facilitating data integration and knowledge sharing.
Lynch & Dirks - Platforms for Open Research - Charleston Conference 2011 (Lee Dirks)
The document summarizes Microsoft's efforts in collaborating with various organizations to promote innovation in scholarly communication. It discusses projects such as VIVO for connecting researchers, ORCID for unique researcher IDs, DataVerse for data sharing, DataCite for data citation, Total Impact for measuring research impact, DuraCloud for data storage and preservation, and Microsoft Academic Search for discovery. The goal is to help solve problems across the scholarly communication lifecycle from data collection and authoring to publication, discovery and preservation.
Knowledge Base+: a Cloud-Based Community Knowledge Base (sherif user group)
Knowledge Base+: A cloud-based community knowledge base by Ben Showers, JISC. Presentation at the JIBS User Group Workshop and AGM Back to the Future and Into the Cloud, 24 February 2012, School of Oriental and African Studies, London.
The ELIXIR FAIR Knowledge Ecosystem for practical know-how: RDMkit and FAIRCo... (Carole Goble)
Presented at the FAIR Data in Practice Symposium, 16 May 2023 at BioITWorld Boston. https://www.bio-itworldexpo.com/fair-data. The ELIXIR European research infrastructure for life science data is an inter-governmental organization coordinating, integrating and sustaining FAIR data and software resources across its 23 nations. To advise users, data stewards, project managers and service providers, ELIXIR has developed complementary community-driven, open knowledge resources for guiding FAIR Research Data Management (RDMkit) and providing FAIRification recipes (FAIRCookbook). More than 150 people have contributed content so far, including representatives of the pharmaceutical industry.
Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science, a Digital Research... (Carole Goble)
Invited talk, PHIL_OS, March 30-31 2023, Exeter
https://opensciencestudies.eu/whither-open-science. Includes hidden slides.
FAIR and Open Science need Digital Research Infrastructure, which is a federated system of systems and needs funding models that are fit for purpose.
Culture change is needed in paying for Open Science’s infrastructure, and funding support for data-driven research needs more reality and less rhetoric.
RO-Crate: packaging metadata love notes into FAIR Digital Objects (Carole Goble)
Abstract
slides available at: https://zenodo.org/record/7147703#.Y7agoxXP2F4
The Helmholtz Metadata Collaboration aims to make the research data [and software] produced by Helmholtz Centres FAIR for their own and the wider science community by means of metadata enrichment [1]. Why metadata enrichment and why FAIR? Because the whole scientific enterprise depends on a cycle of finding, exchanging, understanding, validating (reproducing), integrating and reusing research entities across a dispersed community of researchers.
Metadata is not just “a love note to the future” [2], it is a love note to today’s collaborators and peers. Moreover, a FAIR Commons must cater for the metadata of all the entities of research – data, software, workflows, protocols, instruments, geo-spatial locations, specimens, samples, people (as well as traditional articles) – and their interconnectivity. That is a lot of metadata love notes to manage, bundle up and move around. Notes written in different languages at different times by different folks, produced and hosted by different platforms, yet referring to each other, and building an integrated picture of a multi-part and multi-party investigation. We need a crate!
RO-Crate [3] is an open, community-driven, and lightweight approach to packaging research entities along with their metadata in a machine-readable manner. Following key principles - “just enough” and “developer and legacy friendliness” - RO-Crate simplifies the process of making research outputs FAIR while also enhancing research reproducibility and citability. As a self-describing and unbounded “metadata middleware” framework, RO-Crate shows that a little bit of packaging goes a long way to realise the goals of FAIR Digital Objects (FDO) [4], and to not just overcome platform diversity but celebrate it, while retaining the contextual integrity of the investigation.
In this talk I will present the why, and how Research Object packaging eases Metadata Collaboration using examples in big data and mixed object exchange, mixed object archiving and publishing, mass citation, and reproducibility. Some examples come from the HMC, others from EOSC, USA and Australia, and from different disciplines.
Metadata is a love note to the future, RO-Crate is the delivery package.
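To make the packaging idea concrete: an RO-Crate is simply a directory containing an ro-crate-metadata.json file, a JSON-LD graph describing the packaged entities and their relationships. The sketch below builds a hypothetical minimal crate by hand using only the Python standard library (the dataset name and file are illustrative examples, not from any real crate):

```python
import json

# A minimal RO-Crate 1.1 metadata file: a JSON-LD graph with a
# metadata descriptor, a root dataset, and one packaged file.
# Names and descriptions here are hypothetical examples.
crate = {
    "@context": "https://w3id.org/ro/crate/1.1/context",
    "@graph": [
        {   # the metadata descriptor points at the root dataset
            "@id": "ro-crate-metadata.json",
            "@type": "CreativeWork",
            "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
            "about": {"@id": "./"},
        },
        {   # the root dataset lists the crate's contents
            "@id": "./",
            "@type": "Dataset",
            "name": "Example analysis bundle",
            "hasPart": [{"@id": "results.csv"}],
        },
        {   # one data entity, carrying its own contextual metadata
            "@id": "results.csv",
            "@type": "File",
            "encodingFormat": "text/csv",
        },
    ],
}

with open("ro-crate-metadata.json", "w") as f:
    json.dump(crate, f, indent=2)
```

Because the crate is plain JSON-LD in a plain directory, any platform that can read files can consume it - which is the "legacy friendliness" the abstract above refers to.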
[1] https://helmholtz-metadaten.de/en
[2] Scott, Jason The Metadata Mania, http://ascii.textfiles.com/archives/3181, June 2011
[3] Soiland-Reyes, Stian et al. “Packaging Research Artefacts with RO-Crate”. Data Science, 2022; 5(2):97-138, DOI: 10.3233/DS-210053
[4] De Smedt K, Koureas D, Wittenburg P. “FAIR Digital Objects for Science: From Data Pieces to Actionable Knowledge Units”. Publications. 2020; 8(2):21. https://doi.org/10.3390/publications8020021
Research Software Sustainability takes a Village – Carole Goble
1. Research software sustainability requires communities to support development and maintenance over time.
2. Strong communities cultivate relationships between developers, users, and other stakeholders to establish trust and shared responsibility for software.
3. Maintaining communities requires ongoing efforts like change management, skills development, and cultivating relationships that span organizational boundaries. Funders can support these community efforts.
“Bioscience has emerged as a data-rich discipline, in a transformation that is spreading as widely now as molecular biology in the twentieth century. We look forward to supporting new research careers, where data are valued and shared widely, where new software is a natural part of Biology, and where re-analysis and modelling are as creative as experimentation in understanding the rules of life and their applications.” Prof Andrew Millar FRS, chair Expert Group UKRI-BBSRC Review of data-intensive bioscience 2020.
Indeed - biomedical science is knowledge work and knowledge turning - the turning of observation and hypothesis through experimentation, comparison, and analysis into new, pooled knowledge. Turns depend on the FAIR and Open flow and availability of data and methods for automated processing and reproducible results, and on a society of scientists coordinating and collaborating.
For the past 25 years I have worked on the social and technical challenges in digital infrastructure to support scientific collaboration, data and method sharing, and automated scientific processing. The big ideas I have been instrumental in – sharing and publishing high-quality computational workflows, semantic web technologies in bioscience, ecosystems of Research Objects as the currency of scholarly knowledge, the FAIR data principles – preached revolution to inspire, but need nudges* to get traction.
I’ll talk about making good on Andrew’s quote: what I’m doing to nudge and where we need to do more. I’ll also talk about my experiences as a woman in digital infrastructure and computer science over the past 40 years – and some nudging is needed there too.
*Thaler RH, Sunstein CR (2008) Nudge: Improving Decisions about Health, Wealth, and Happiness. Yale University Press. ISBN 978-0-14-311526-7. OCLC 791403664.
https://www.bsc.es/research-and-development/research-seminars/hybrid-bsc-rslife-sessionbioinfo4women-seminar-love-money-fame-nudge-enabling-data-intensive
This document discusses FAIR computational workflows and why they are important. It defines computational workflows as multi-step processes for data analysis and simulation that link computational steps and handle data and processing dependencies. Workflows improve reproducibility, enable automation, and allow for increased sharing and reuse of research. The document outlines how applying FAIR principles to workflows makes them findable, accessible, interoperable, and reusable. This includes using standardized metadata, identifiers, licensing, and formats to describe workflows and ensure their components and data are also FAIR. Adopting FAIR workflows requires support from workflow systems, tools, communities and services.
Open Research: Manchester leading and learning – Carole Goble
Open and FAIR science has international momentum. Large-scale communities are striving to build and manage the digital infrastructure needed for scientists to be as open as possible and as closed as necessary, as expected by the NIH, OECD, UNESCO and the EC. ELIXIR is such a research infrastructure for the Life Sciences in Europe. This talk will highlight two of ELIXIR's Open Science resources, built by Open Science communities to enable life science researchers to be open, and led by Manchester. How can we learn from these and bring their practices to Manchester?
Launch: Manchester Office for Open Research, 4th April 2022
https://www.openresearch.manchester.ac.uk/
RDMkit, a Research Data Management Toolkit. Built by the Community for the ... – Carole Goble
https://datascience.nih.gov/news/march-data-sharing-and-reuse-seminar 11 March 2022
Starting in 2023, the US National Institutes of Health (NIH) will require institutes and researchers receiving funding to include a Data Management Plan (DMP) in their grant applications, including making their data publicly available. Similar mandates are already in place in Europe; for example, a DMP is mandatory in Horizon Europe projects involving data.
Policy is one thing - practice is quite another. How do we provide the necessary information, guidance and advice for our bioscientists, researchers, data stewards and project managers? There are numerous repositories and standards. Which is best? What are the challenges at each step of the data lifecycle? How should different types of data be handled? What tools are available? Research Data Management advice is often too general to be useful, and specific information is fragmented and hard to find.
ELIXIR, the pan-national European Research Infrastructure for Life Science data, aims to enable research projects to operate “FAIR data first”. ELIXIR supports researchers across their whole RDM lifecycle, navigating the complexity of a data ecosystem that bridges from local cyberinfrastructures to pan-national archives and across bio-domains.
The ELIXIR RDMkit (https://rdmkit.elixir-europe.org (link is external)) is a toolkit built by the biosciences community, for the biosciences community to provide the RDM information they need. It is a framework for advice and best practice for RDM and acts as a hub of RDM information, with links to tool registries, training materials, standards, and databases, and to services that offer deeper knowledge for DMP planning and FAIR-ification practices.
Since its launch in March 2021, over 120 contributors have provided nearly 100 pages of content and links to more than 300 tools. Content covers the data lifecycle and specialized domains in biology, national considerations and examples of “tool assemblies” developed to support RDM. It has been accessed from over 123 countries, and at the top of the access list is … the United States.
The RDMkit is already a recommended resource of the European Commission. The platform, editorial, and contributor methods helped build a specialized sister toolkit for infectious diseases as part of the recently launched BY-COVID project. The toolkit’s platform is the simplest we could manage - built on plain GitHub - and the whole development and contribution approach tailored to be as lightweight and sustainable as possible.
In this talk, Carole and Frederik will present the RDMkit; aims and context, content, community management, how folks can contribute, and our future plans and potential prospects for trans-Atlantic cooperation.
Data policy must be partnered with data practice. Our researchers need to be the best informed in order to meet these new data management and data sharing mandates.
This document discusses computational workflows and FAIR principles. It begins by providing background on computational workflows and their increasing importance. It then discusses challenges around finding, accessing, and sharing workflows. Next, it explores how applying FAIR principles to workflows could help address these challenges by making workflows and their associated objects findable, accessible, interoperable, and reusable. This includes discussing applying metadata standards, using persistent identifiers, and developing principles for FAIR workflows and FAIR software. The document concludes by examining the roles and responsibilities of different stakeholders in working towards FAIR workflows.
presentation at https://researchsoft.github.io/FAIReScience/, FAIReScience 2021 online workshop
virtually co-located with the 17th IEEE International Conference on eScience (eScience 2021)
German Conference on Bioinformatics 2021
https://gcb2021.de/
FAIR Computational Workflows
Computational workflows capture precise descriptions of the steps and data dependencies needed to carry out computational data pipelines, analysis and simulations in many areas of Science, including the Life Sciences. The use of computational workflows to manage these multi-step computational processes has accelerated in the past few years driven by the need for scalable data processing, the exchange of processing know-how, and the desire for more reproducible (or at least transparent) and quality assured processing methods. The SARS-CoV-2 pandemic has significantly highlighted the value of workflows.
This increased interest in workflows has been matched by the number of workflow management systems available to scientists (Galaxy, Snakemake, Nextflow and 270+ more) and the number of workflow services like registries and monitors. There is also recognition that workflows are first-class, publishable Research Objects just as data are. They deserve their own FAIR (Findable, Accessible, Interoperable, Reusable) principles and services that cater for their dual roles as explicit method description and software method execution [1]. To promote long-term usability and uptake by the scientific community, workflows (as well as the tools that integrate them) should become FAIR+R(eproducible), and citable so that authors’ credit is attributed fairly and accurately.
The work on improving the FAIRness of workflows has already started and a whole ecosystem of tools, guidelines and best practices has been under development to reduce the time needed to adapt, reuse and extend existing scientific workflows. An example is the EOSC-Life Cluster of 13 European Biomedical Research Infrastructures which is developing a FAIR Workflow Collaboratory based on the ELIXIR Research Infrastructure for Life Science Data Tools ecosystem. While there are many tools for addressing different aspects of FAIR workflows, many challenges remain for describing, annotating, and exposing scientific workflows so that they can be found, understood and reused by other scientists.
This keynote will explore the FAIR principles for computational workflows in the Life Sciences, using the EOSC-Life Workflow Collaboratory as an example.
[1] Carole Goble, Sarah Cohen-Boulakia, Stian Soiland-Reyes, Daniel Garijo, Yolanda Gil, Michael R. Crusoe, Kristian Peters, and Daniel Schober, “FAIR Computational Workflows”, Data Intelligence 2020; 2:1-2, 108-121. https://doi.org/10.1162/dint_a_00033
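The abstract above describes workflows as multi-step processes whose steps are linked by data dependencies, run in an order the engine derives from those dependencies. A toy sketch of that idea (not any particular workflow system - step names and functions are illustrative only):

```python
# Toy workflow: steps declare the inputs they depend on, and the
# execution order is derived from the dependency graph, exactly as
# a workflow engine would resolve it.
from graphlib import TopologicalSorter

def fetch():        return "raw data"
def clean(raw):     return raw.upper()
def analyse(data):  return f"summary of {data}"

steps = {
    "fetch":   (fetch,   []),            # no dependencies
    "clean":   (clean,   ["fetch"]),     # consumes fetch's output
    "analyse": (analyse, ["clean"]),     # consumes clean's output
}

# Topologically sort the dependency graph, then run each step,
# feeding it the recorded outputs of the steps it depends on.
order = TopologicalSorter({name: deps for name, (_, deps) in steps.items()})
results = {}
for name in order.static_order():
    func, deps = steps[name]
    results[name] = func(*(results[d] for d in deps))

print(results["analyse"])  # -> summary of RAW DATA
```

Real systems like Galaxy, Snakemake or Nextflow add scheduling, containerised tool execution and provenance capture on top of this same dependency-graph core - which is what makes the explicit, machine-readable description of a workflow so valuable for FAIR sharing.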
FAIR Data Bridging from researcher data management to ELIXIR archives in the... – Carole Goble
ISMB-ECCB 2021, NIH/ODSS Session, 27 July 2021
ELIXIR is the pan-national European Research Infrastructure for Life Science data, whose 23 national nodes and the EBI coordinate the development and long-term sustainability of domain public databases. FAIR services, policies and curation approaches aim to build a FAIR connected data ecosystem of trusted domain repositories, from ENA, HPA and EGA to specialised resources like CorkOakDB and PIPPA for plant phenotypes. But this is only one part of the data landscape and often the end of data’s journey. The nodes support research projects to operate “FAIR data first”, working with institutional and national platforms that are often generic or designed for project-based data management. We need to bridge between project-based and community-based data management, and support researchers across their whole RDM lifecycle, navigating the complexity of this ecosystem. The ELIXIR-CONVERGE project and its flagship RDMkit toolkit (https://rdmkit.elixir-europe.org) aims to do just that.
FAIR Workflows and Research Objects get a Workout – Carole Goble
So, you want to build a pan-national digital space for bioscience data and methods? That works with a bunch of pre-existing data repositories and processing platforms? So you can share FAIR workflows and move them between services? Package them up with data and other stuff (or just package up data for that matter)? How? WorkflowHub (https://workflowhub.eu) and RO-Crate Research Objects (https://www.researchobject.org/ro-crate) that’s how! A step towards FAIR Digital Objects gets a workout.
Presented at DataVerse Community Meeting 2021
FAIRy stories: the FAIR Data principles in theory and in practice – Carole Goble
https://ucsb.zoom.us/meeting/register/tZYod-ippz4pHtaJ0d3ERPIFy2QIvKqjwpXR
FAIRy stories: the FAIR Data principles in theory and in practice
The ‘FAIR Guiding Principles for scientific data management and stewardship’ [1] launched a global dialogue within research and policy communities and started a journey to wider accessibility and reusability of data and preparedness for automation-readiness (I am one of the army of authors). Over the past 5 years FAIR has become a movement, a mantra and a methodology for scientific research and increasingly in the commercial and public sector. FAIR is now part of NIH, European Commission and OECD policy. But just figuring out what the FAIR principles really mean and how we implement them has proved more challenging than one might have guessed. To quote the novelist Rick Riordan “Fairness does not mean everyone gets the same. Fairness means everyone gets what they need”.
As a data infrastructure wrangler I lead and participate in projects implementing forms of FAIR in pan-national European biomedical Research Infrastructures. We apply web-based, industry-led approaches like Schema.org; work with big pharma on specialised FAIRification pipelines for legacy data; promote FAIR by Design methodologies and platforms in the research lab; and expand the principles of FAIR beyond data to computational workflows and digital objects. Many use Linked Data approaches.
In this talk I’ll use some of these projects to shine some light on the FAIR movement. Spoiler alert: although there are technical issues, the greatest challenges are social. FAIR is a team sport. Knowledge Graphs play a role – not just as consumers of FAIR data but as active contributors. To paraphrase another novelist, “It is a truth universally acknowledged that a Knowledge Graph must be in want of FAIR data.”
[1] Wilkinson, M., Dumontier, M., Aalbersberg, I. et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data 3, 160018 (2016). https://doi.org/10.1038/sdata.2016.18
RO-Crate: A framework for packaging research products into FAIR Research Objects – Carole Goble
RO-Crate: A framework for packaging research products into FAIR Research Objects presented to Research Data Alliance RDA Data Fabric/GEDE FAIR Digital Object meeting. 2021-02-25
The swings and roundabouts of a decade of fun and games with Research Objects – Carole Goble
Research Objects and their instantiation as RO-Crate: motivation, explanation, examples, history and lessons, and opportunities for scholarly communications, delivered virtually to 17th Italian Research Conference on Digital Libraries
How are we Faring with FAIR? (and what FAIR is not) – Carole Goble
Keynote presented at the workshop FAIRe Data Infrastructures, 15 October 2020
https://www.gmds.de/aktivitaeten/medizinische-informatik/projektgruppenseiten/faire-dateninfrastrukturen-fuer-die-biomedizinische-informatik/workshop-2020/
Remarkably it was only in 2016 that the ‘FAIR Guiding Principles for scientific data management and stewardship’ appeared in Scientific Data. The paper was intended to launch a dialogue within the research and policy communities: to start a journey to wider accessibility and reusability of data and prepare for automation-readiness by supporting findability, accessibility, interoperability and reusability for machines. Many of the authors (including myself) came from biomedical and associated communities. The paper succeeded in its aim, at least at the policy, enterprise and professional data infrastructure level. Whether FAIR has impacted the researcher at the bench or bedside is open to doubt. It certainly inspired a great deal of activity, many projects, a lot of positioning of interests and raised awareness. COVID has injected impetus and urgency to the FAIR cause (good) and also highlighted its politicisation (not so good).
In this talk I’ll make some personal reflections on how we are faring with FAIR: as one of the original principles authors; as a participant in many current FAIR initiatives (particularly in the biomedical sector and for research objects other than data) and as a veteran of FAIR before we had the principles.
What is Reproducibility? The R* brouhaha and how Research Objects can help – Carole Goble
This document discusses reproducibility in computational research. It defines several key terms related to reproducibility, including replicate, rerun, and repeat. It notes that computational papers should provide the full software and data used to generate results. The document outlines several rules for reproducible research, such as tracking how results were produced, version controlling scripts, and archiving intermediate results. It also discusses challenges to reproducibility like lack of funding and changing dependencies over time. Finally, it introduces Research Objects as a framework to bundle resources like data, software, and protocols to help address reproducibility issues.
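One of the rules summarised above - tracking how results were produced - can be sketched as a small provenance record written alongside each result. This is a hypothetical minimal illustration (the input bytes and parameters are invented examples), not the Research Object model itself:

```python
import hashlib
import json
import platform
from datetime import datetime, timezone

def provenance_record(input_bytes: bytes, params: dict) -> dict:
    """Record enough context to explain how a result was produced:
    a hash of the input data, the exact parameters used, and the
    environment the computation ran in."""
    return {
        "input_sha256": hashlib.sha256(input_bytes).hexdigest(),
        "parameters": params,
        "python_version": platform.python_version(),
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }

# Example: capture provenance for one analysis run.
record = provenance_record(b"raw measurements", {"threshold": 0.05})
print(json.dumps(record, indent=2))
```

Archiving such a record with the result (and version-controlling the script that generated it) addresses two of the reproducibility challenges the document names: knowing exactly what was run, and detecting when changed dependencies would produce a different answer.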
If we build it will they come? BOSC2012 Keynote Goble
1. If we build it will they come?
Prof Carole Goble FREng FBCS CITP
carole.goble@manchester.ac.uk
BOSC, Long Beach, July 14 2012
http://www.mygrid.org.uk
2. Est. 2001
Improving Knowledge Turning,
Enabling Reuse and Reproducibility
[Josh Sommer]
Keep the vision, modify the plan
3. Computational Methods (LGPL)
• Scientific workflows
• Distributed web/grid/cloud services
• Third party, independent service reuse
• Data pipelines and analytics
Volunteerist Human Computation (BSD)
• e-Laboratories: social collaboration and sharing environments for scientific artefacts
• Libraries and Catalogues
• Asset safe havens, sharing, reuse
Knowledge Acquisition Tools (various; OWL)
• Semantic technology, semantic applications, research objects, executable papers
• Data/Metadata curation & reuse: POPULOUS, SKOSEdit
4. The Taverna Suite of Tools
[Architecture diagram; components: Web Portals; Workflow Repository; GUI Workbench; Client User Interfaces; Virtual Machine; Service Catalogue; Third Party Tools; Workflow Engine; Provenance Store; Command Line; Server; Activity and Service Plug-in Manager; Open Provenance Model; Programming and Secure Service Access APIs]
5. Community Haven, Sharing Resource, Social Collaboration
http://www.myexperiment.org
5820 members, 304 groups, 2415 workflows, 604 files and 229 packs (research objects)
http://wiki.myexperiment.org/index.php/Galaxy
6. BioCatalogue: crowd curation of web services
• Contribute, find and understand web services
• Curate, review and comment
• Learning resource
• Monitor services; cloud registry
2295 REST and SOAP services, 169 service providers. 674 members, 27 countries
7. Find experts, colleagues and peers.
• Find, exchange and interlink, preserve, publish data, models, publications, SOPs & analyses. ISA compliant.
• Launch and validate models and analyses: JWS Online, BioModels.
• Gateway to GerontoSys public tools and resources, e.g. livSYSiPS.
SysMO: 16 consortia, 110 institutes, 1600+ assets, 350+ members
9. [Diagram: Standards, Governance & Policy; Content Sharing Platform & Trusted Service; Software & Tools, open source; Comp Sci Research; Gateway Platform; Knowledge Network, Skills & Community Building; Preservation & Publication Platforms]
10. Laissez-faire Philosophy
• Bottom Up
– Emergent & scruffy (to a degree…)
• Reliant on third party contributions
– Non-prescriptive, non-interfering and flexible
– We make no content ourselves….
• Part of a wider ecosystem
– Other services, data, tools, platforms, people…
• Inspired by social environments
• Scarred by top-down, dictated, tech-driven and unused monoliths
11. http://www.flickr.com/photos/hellaoakland/3137360455/
• Never underestimate how scruffy third party stuff can be.
• How often metadata is missing and messy if left to its own devices…
• Liberty through Limitations: people say they want flexibility. They prefer the simplicity of order and will adapt to adopt.
12. Who is they?
• Jobbing Bioinformatician?
• Expert Bioinformatician?
• Sys admin?
• Service provider?
• Application developer?
• Tool developer?
• Biologist?
13. Who is THEY?
• Drug Toxicity (OpenTox Project)
• Pharmacogenomics GWAS
• Trypanosomiasis in African Cattle
• The Virtual Liver
• Physiopathology of the human body
• Genetic differences between breeds of cattle
• Systems Biology of Micro-Organisms
• Metagenomics
• Medical Imaging
14. Consortia, Research Groups, Individuals
• Consortia: organised, planned, strong connections with resource providers and each other (e.g. Bovine Trypanosomiasis Consortium)
• Distributed Research Groups & Independent Lone Rangers: long tail, disconnected from data providers and each other, emergent
• Independents and Individuals….
15. Specialise or Diversify?
• Flexibility and extensibility -> customised services, cookie cutter
• Widen adoption
• Spread risk, extend resourcing streams
• Cross development alignment and coordination
• More communities to build, nurture, support and sustain
• Core Drift and Bashing
Domains: Software and Document Preservation; Helio-Physics; BioDiversity; Astronomy; Social Science; Engineering (JPL, NASA); FLOSS
16. BioDiversity Virtual e-Laboratory
http://www.biovel.eu
[Architecture diagram: Biodiversity Services Repositories (BLAST, Hmmer, MrBayes, PAML, EMBOSS, …); Catalogues / Search; Execution environment; Taverna Workbench; Taverna Workflow Engine and Server; Provenance; WebDAV Data Management; Open Taxonomic Synonyms; Visualisation (BioSTIF); Authentication / Authorisation; Google Refine; CSW; Modelling/GeoProcessing (R, openModeller, WPS / WCPS); Grid, Cloud, etc. platforms]
17. Who is We? The ego-system
biologists, bioinformaticians, biodiversity informaticians, astro-informaticians, social scientists, modellers, software engineers, computer scientists, systems administrators, resource providers
20. [Diagram: Research Community, Production, Applications, Publishing, Training, Community]
21. So if we build it will they come?
• Be useful for something: immediately, continuously, responsively
• Be usable by somebody: user experience, worth the effort, adoption path
• Some of the time: as part of a big picture
• Under promise and over deliver
• Acquire critical mass
22. Four things that drive adoption of software or service
1. Added value
– Do something you couldn't do before, or can now do faster; gain competitive advantage, improve productivity, scale up
2. New asset
– Get or retain access to something important (data, method, technique, skills, knowledge)
3. Keep up with the field. A community.
– Future-proof my practice; new skills and capacity; there is a vibe about it and I'll be left out
4. Because there is no choice
– Business depends on it; it's mandated, or de facto mandated
23. Seven things that hinder adoption of software or service
1. Not enough added value. It sucks.
• It doesn't solve a problem, or not as well or as cheaply as something else; no content, or not the right content
2. Not fit for take-on. It doesn't work!
• No: help, guides, documentation, manuals, examples, content, templates, portability, migration/legacy support, easy installation, virtual machines, testing, stability, version control, release cycle, roadmap, sustainability prospect, way of introducing my favourite component/data/environment
3. No time or capacity to take on
• To learn, or to migrate personal legacy code/data/applications; no pathway/ramp to adoption
• Training and special system needs
24. Software practices
Zeeya Merali, Nature 467, 775-777 (2010) | doi:10.1038/467775a
Computational science: …Error… why scientific programming does not compute.
"As a general rule, researchers do not test or document their programs rigorously, and they rarely release their codes, making it almost impossible to reproduce and verify published results generated by scientific software"
25. Software Stewardship
"Better Science through Superior Software" – C. Titus Brown
• Software sustainability
• Software practices
• Software deposition
• Long term access to software
• Credit for software
• Licensing advice
• Open licenses
Reproducible Research Standard, Victoria Stodden, Intl J Comm Law & Policy 13 (2009)
26. Seven things that hinder adoption of software or service (continued)
4. Cost. It's too costly.
– Of disruption, of long-term ownership
5. Exposure to risk
– First to take up; support and sustainability dependencies; fear of scrutiny, misrepresentation or being scooped
6. No community
– Support and comfort
7. Changes to work practices
– Obligations; unclear or unenforced reciprocity protocols
27. Betamax vs VHS
• It sucks, but it's the only thing around
• It's ace, but it's one of many, too late in the game and not enough to switch
• Tipping point is likely not technical
28. Bonus Hinder
Never heard of it. We've built it but we haven't told anyone.
• Make noise… physically and virtually
• Customer and contributor relationship building
• Self-supporting communities, multi-level marketing
• Highly resource intensive
29. Bonus Hinder
Never heard of it. We've built it but we haven't told anyone.
[Diagram: Market, User Community, Development, Developer Community: it all kicks off]
30. Adoption Intentions
Be careful what you wish for
• Incidental – "I built it for myself, and stuck it out there"
• Familial – "I built it for people just like me"
• Fundamental – "I built it for others, many of whom are not like me"
31. Open Innovation: Development and Content
You are not alone. You can't do it all alone. Motivate & enable others to fill gaps, "App Store style": software, services, content, examples….
• Really interoperate. Don't tweak. (e.g. Galaxy + Taverna/myExperiment)
• Be simple and standard.
• Be helpful. Be set up. Be reusable. Be smart.
• Others will develop on top of you. But don't assume they will re-contribute or tell you.
• It's much harder than you think.
• It's unequal.
[Ladder: Friends, Family, Acquaintances, Strangers]
32. Ladder Model of OSS Adoption
(adapted from Carbone P., Value Derived from Open Source is a Function of Maturity Levels)
[Ladder: Strangers, Acquaintances, Friends, Family]
Moore's technology adoption curve [FLOSS@Syracuse]
33. "It's better, initially, to make a small number of users really love you than a large number kind of like you"
Paul Buchheit, paulbuchheit.blogspot.com
34. PALS: Building Friendships
Intelligence, Guidance, Advocacy, Evangelism, Market Research
• What's in it for the PAL?
– Long tail: money, kudos, special support, special resources, skills, reputation building, influence, stuff they can't do alone, CV building
– Consortia: co-funded
• Who is a PAL?
– Post-docs, post-grads, administrators, developers
– PI: protector/champion
• PAL handlers
– Customer Relationship Manager, Nanny and Mediator, Scientist
35. Do not under-estimate…
• The power of the sprint / *-athon / fest / drinking
• The power of a whizzy interface. Even for plumbing.
• The importance of supporting and propagating best practice
37. Participatory Design
Work Together on a Real Problem
• Funders: data sharing, data standards, a database exchange, long term preservation
• Project PIs: data control, own databases, visibility limitations, project dependence
• PALs: spreadsheets, Yellow Pages, just enough SOPs, understanding standards limitations, curating, examples, safe haven, project independence
3 years later, 15/16 consortia abandoned their own systems and went with the SEEK system.
39. Participation: Cooperation? Coordination? Collaboration? Integration? Evolution and entropy models
[Spectrum from closed, through controlled, to open access: Lone scholars, Private Groups, Trusted Collaborators, Public scientists, Citizens]
[based on an idea by Liz Lyon]
40. Critical mass spiral: 90:9:1
Driven by needs of and benefits to the scientist, rather than top down policies.
Content tipping point
[Andrew Su]
41. Trust, Fame and Blame: Reciprocity, Competition, Contribution and Use
• Scooping, scrutiny and misinterpretation
• Curation cost
• Poor quality
• Reputation / asset economics
• Public peer pressure
Reciprocity Sucks
• Flirting
• Hugging
• Controlled sharing
• Voyeurism
• Poor feedback / credit
Nature 461, 145 (10 September 2009)
Victoria Stodden, The Scientific Method in Practice: Reproducibility in the Computational Sciences, Feb 9, 2010, MIT Sloan Research Paper No. 4773-10, http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1550193
42. Harness Competitiveness Carrots
• Pride – Reputation: cult, credit & attribution for all
• Protection – Just enough sharing, licensing & liability; quality, peer review, metadata
• Preservation – Safe havens and sunsets (project churn)
• Publishing / Release – Citability, supporting exchange
• Productivity – Availability of assets, help, capability, ramps
45. Adoption Stealth
• Data at home promise with automated harvesting
• Sharing creep, incremental metadata, low obligations
• URL upload in BioCatalogue
• Web Service "come as you are" take-on in Taverna
• Metadata prompting: right tools, right time, right place
• Service collections & packaged services
46. Be vigilant
• PAL burn-out and over familiarity
• Unadjusted over-user accommodation
• Drifting apart and not keeping it fresh
• Step back, observe and adapt/intervene!
• So relieved to get a community….
• Instrument adoption and observation
Participatory development is a mutual long term relationship. Not flirty speed dating, one night stand, crush, Me Me Me.
47. Urgent-Important
• Technical bog down, operational burn-out
• Little things that are important but don't seem that urgent…
• Dominant projects
• Non-software content
• It all takes way longer than you think
• Simplicity drift
Participatory development is a mutual long term relationship. Not flirty speed dating, one night stand, crush, Me Me Me.
49. The Jam-based Adoption Model
aka Added Value, Value Proposition, Return On Investment
http://delicious-cooks.com/photos/raspberry-jam/04/
50. What is the Special Jam? What is your Jam Value Chain, and for whom?
What:
• SysMO: safe haven, spreadsheet tooling, linking SOPs, models and data, examples
• Taverna: power, adaptability and myExperiment
Who:
• Focused on contributors and experts
• Provider-consumer balance
• Functionality-Simplicity Syndrome
• Changing Who – challenging baked-ins
51. Jam today and more, better Jam tomorrow
Just Enough Jam, Just in Time not Just in Case
• Feature Creep Conundrum
• Big Picture Paradox
• Core vs Specifics Syndrome
• Content Decay Dilemma
• Working-to-working Stability Stress
52. Customised Specific Jam beats Generic
• Flexibility/Functionality – Simplicity Conundrum
• Diversification Dilemma
53. Where is my Jam? Jam for All
http://www.gettyimages.co.uk/detail/photo/empty-jam-jar-royalty-free-image/136976198
• What are WE (platform providers, software builders, community builders and service providers) getting out of it?
• We need credit and interest too.
• Altmetrics
Howison and Herbsleb, Scientific Software Production: Incentives and Collaboration, CSCW 2011, March 19–23, 2011, Hangzhou, China. http://james.howison.name/pubs/HowisonHerbsleb2011SciSoftIncentives.pdf
54. Jam forever
They came. Have the evidence. Have a plan. Did you wish for this? Do you want it?
• Fragile flux – content, services, bits, communities
• Funding plan – novelty over sustainability; research-production falsehoods; wave invention, political lobbying
• Securing the community – leadership & foundations
• Business model???
Software is Free like Puppies Are Free.
55. Jam not forever
• Acquire
• Retain
• Widen – more/different
• Reposition – different/new stage
• Changing community is challenging… [Daron Green]
56. Adoption is a Merry-Go-Round. The Social and the Technical are Inseparable.
57. You know they came when…
… you were useful and usable to someone some of the time, but they might not tell you
… people ask you to join their consortia or use it
… they gave up their own home grown stuff for yours
… someone you don't know uses it and tells you all about your own stuff
… someone publishes papers about it. Without citing you.
… someone else claims credit
… people you don't know start bitching about it
… it's just expected to be there, and you are kind of expected to be there too
… your Head of School complains you don't do enough CS research because you are doing too much Software Engineering and Support
58. Acknowledgements (1)
James Howison, Heather Piwowar, Victoria Stodden, Janet Vertesi, Christine Borgman, Nosh Contractor, Jay Liebowitz, Robert Kraut
59. Acknowledgements (2)
• The myGrid family, friends and contributors
• But especially: Katy Wolstencroft, David Withers, Marco Roos, Alan Williams, Jits Bhagat, Stuart Owen, Stian Soiland-Reyes, Shoaib Sufi, Robert Stevens, Paul Fisher, Peter Li, Ian Dunlop, Finn Bacall, Mannie Tags, Niall Beard, Rob Haines, Christian Brenninkmeijer, Alasdair Gray, Tim Clark, Pinar Alper, Paolo Missier, Khalid Belhajjame, Duncan Hull, Sean Bechhofer, David De Roure, Don Cruickshank, Wolfgang Mueller, Olga Krebs, Franco Du Preez, Quyen Nguyen, Jacky Snoep
• The members of Wf4ever, SysMO, BioVeL, HELIO, SCAPE, OMII, SSI, NeISS, Obesity e-Lab and anyone else I forgot
61. SUMMARY (De Roure and Goble, IEEE Software 2009)
[Word cloud: Coalface users; Patrons; Skeptics; Champions; Keep your Friends Close; Friends and Family; Fit in; Favours will Favour you; Embed; Jam Today, Jam Tomorrow; Act Local, Think Global; End Users; Developers; Just Enough, Just in Time; Design for Network Effects; Know your Users; Anticipate Change; Service Providers; Enable Users to Add Value; System Administrators; Keep Sight of the Bigger Picture]
Editor's Notes
If I build it will they come? What is it we are building? Who is they? Who are we? Over the years I have built a bunch of open source software and services for researchers: the Taverna workflow system, myExperiment for workflow sharing, BioCatalogue for services, SEEK for Systems Biology data and models, and most recently MethodBox for longitudinal data sets. As well as building software we built communities: development communities and user communities. So what drives/hinders adoption? What do I know now that I wished I had known before? How do we sustain communities on time-limited grants? How do we build it so they come, stay and join in?
Distributed Groups, Independents and Partners. Organised teams: planned, strong connections with resource providers and each other; structured, cross-partner sharing, retained results. Distributed groups & independent lone rangers: long tail, disconnected from data providers and each other; emergent, fluid, personal stores, small science from big. Make workflows for group. Run workflows from platforms. Store and find workflows. Catalogue and find services. Catalogue, store and find data, SOPs, models. Link stuff. Release & share stuff. Curate stuff. Cooperate / Collaborate / Coordinate / CoShape. Vary on coordination, collaboration, cooperation, contribution, integration, sustainability, longevity.
Make workflows for group. Run workflows from platforms. Store and find workflows. Catalogue and find services. Catalogue, store and find data, SOPs, models. Link stuff. Release & share stuff. Curate stuff. Cooperate / Collaborate / Coordinate / CoShape.
Still some people missing!
Knowledge transfer. Three tracks. Large team.
Developer and user adoption; contributed collaborative content; collaborative development.
Maybe you don’t care…. Content and Promotion matter more than software, but harder to fund and different people to software developers.
Incidental: not really building for adoption or others to take up. Familial: the producer and the consumer are the same; many are like this in BOSC.
CLAs for set up. Remember upgrade paths. Cooperate, network effects, amplify. Self-supporting, multi-level marketing. There are no green fields.
Please some of the people some of the time
They all start off like this…
Working the first time. User experience over smart. Cool interfaces (even for plumbing).
Primary community review. Facebook generation! Community participation, sharing, commons-based production, social curation, voluntary contribution. 1. Primary content 2. Curation duties. GeneWiki, Rfam, myExperiment, PLoS, UsefulChem, OpenWetWare. Open Science vs the Long Tail. Social networks vs the Long Tail. Incentives and obstacles. Myths and miracles. Contribution. Curation. Volunteer science.
Limited focus. Social networking around content. Feedback loops.
PAL recruitment. Content contribution. Stick: community, journal and funder mandates (there is no stick). Credit for peer review.
Don’t forget to make more demands though!
User burn-out and over familiarity. Over-friendly Stockholm syndrome, absence of friendly fire; keep enemies even closer. Unadjusted over-user accommodation: fit in at first, get buy-in, move in, move on. Drifting apart and not keeping it fresh: keep jointly working on real, concrete cases. Don't assume they will stay: users are fickle. Step back, observe and adapt/intervene! So relieved to get a community, forget to see what they do (e.g. dubious workflow designs). Much easier with e-Laboratory services that are inherently social collaboration spaces. Complacency: especially dangerous outside funded collaborations. Measuring impact and getting feedback: downloads ≠ useful (or usable). Don't be prescriptive; scientists control (but actually we need to be a bit prescriptive). Danger! Going native. Missing users. Fossilisation and complacency. User experience over smart. Cool interfaces (even for plumbing). *-athons. Embedded co-working. The total problem. Replying. Eating your own dog food. Examples! Working the first time.
Version 2 Syndrome: being too clever, forgetting about engagement. Technical bog down and operational burn-out: fire fighting, heads down not eyes up. Little simple things that are important but don't seem that urgent… but are the ha'peth of tar that sinks the ship. Major project dominance: he who pays the piper calls the tune. Non-software innovations: seek and contribute content/components and contributing partners.
Activation energy argument. Balance against feature creep short-termism. Keep planning the big stuff… Balance the cost to the benefit. But hacks survive, and don't do the strategy.
58% by students, 24% unmaintained: Schultheiss et al. (2010) PLoS Comp Bio. Content and promotion matter more than software, but harder to fund, and different people to software developers. What's your plan? Maintaining content, software, services. Different groups, evolving practices, changing times, new patterns….. Funding cycles, chasms and reinventions. Reward, not hinder, adoption. Foundations, friends and business models… and the Open Source Community Silver Bullet!
Hard to Plan….
When the program’s Data Management Group chair claims it’s the only data system they have used that works. To your funders. Whoo-hoo!
Computer Supported Cooperative Work, Team Science, Knowledge Management, Social Science, Information Science, Library Science, Digital Scholarship, Collaboratories…