The Biodiversity Heritage Library has digitized over 45,000 publications containing over 32 million pages. In 2010, the site had over 800,000 visits. Recent developments include a new interface for the names index and upcoming enhancements to the home page and annotations of Charles Darwin's personal library. To promote the library through social media, the BHL uses channels like blogs, Flickr, Facebook, and Twitter since it has no budget for traditional print outreach. The BHL aims to increase its global partnerships to improve redundancy of content and enable data and application mirroring through its Global BHL initiative. It is also working with JSTOR and CiteBank to enable scholarly crowdsourcing of articles and share content with other platforms.
This document discusses best practices for supporting open science. It recommends adopting existing solutions where possible rather than developing new ones. It also suggests engaging with researchers, incentivizing open practices, allowing for innovation and failure, collaborating with peers, and keeping service delivery options open. The document concludes by inviting attendees to a workshop on delivering research data management services.
So, what's it all about then? Why we share research data - Danny Kingsley
This document summarizes a presentation about open research and data sharing. It discusses several drivers for data sharing, including funder requirements and cultural expectations among researchers. It also examines blockers to sharing such as concerns about data being stolen or reused without permission. The presentation argues that an overemphasis on high-impact publications and journal metrics is creating problems like hyperauthorship, reproducibility issues, and retractions. It advocates for increasing transparency through measures like preregistering trials, peer reviewing methodologies, and making data openly accessible. The goal is to overhaul how research is conducted, assessed and shared in a more open and collaborative manner.
This is a keynote presentation to "Open science, transparence et évaluation. Perspectives et enjeux pour les chercheurs", held at Urfist de Bordeaux, France, 4 April 2017.
https://sygefor.reseau-urfist.fr/#!/training/6701/7159/?from=true
ABSTRACT: The way research is disseminated has changed immeasurably since the advent of the internet, yet we still reward researchers in the same way - for publication of novel results in high impact journals. This talk will start with a brief discussion of some of the big challenges the research sector is facing as a result and describe how Open Science can address these. The talk will then focus on the difficulty of introducing and implementing Open Science solutions. Open Science questions the status quo, and potentially threatens the established reputation of both institutions and individuals. It is not an easy concept to implement. While the discipline of Scholarly Communication takes a 'meta' view of the whole research ecosystem, most players in that system are working within a narrow view. It is very rare for individuals to be able to see beyond their own experience. Challenges for people trying to implement Open Science initiatives range from practical issues in implementing change, through to the people skills and negotiations required to convince individuals and institutions that this change is necessary.
The Calatrava cross originated in the 12th century, when the abbot Raimundo de Fitero founded the Order of Calatrava to protect the town of Calatrava in Ciudad Real. It consists of a red Greek cross with fleurs-de-lis at the ends. Despite minor variations over the years, the Calatrava cross has always retained its distinctive shape and represents the identity of the order.
This document describes an online ecology course for bachelor students in technical and economic specialties. The 16-week course is taught by Dr. Tetyana Tykhomyrova and covers topics like ecosystem functioning, decreasing greenhouse gas emissions, and global green technology implementation. Students will read lectures, answer questions, do presentations, and take two tests. Communication will occur via forums, chats, Skype meetings, and email. Upon completing the course, students will be able to assess environmental impacts, identify hazardous substances, propose pollution control methods, determine ecosystem productivity changes, and offer sustainable development elements.
Informatics Transform: Re-engineering Libraries for the Data Decade - Liz Lyon
Libraries need to re-engineer to support the data decade by providing research data management services and developing data informatics capacity. This includes offering data management plans, metadata support, data storage, and tools for data tracking and citation. Libraries also need to work with researchers and partners to understand data requirements, provide advocacy and training, and help acquire skills in areas like data preservation, analysis, and visualization. As data becomes more important, libraries are on a journey to develop these research data management capabilities.
This document summarizes a presentation given at IASSIST in Cologne, Germany in May 2013 about the rise of data journals. It discusses the benefits of publishing data in journals, such as increased citations and reuse of data. However, it also notes challenges including linking data to publications and validating data. Several projects are working to address these challenges and facilitate data publication, such as establishing standards for peer reviewing datasets and enabling automatic exchange of metadata between repositories and publishers. Data journals are increasing in various fields and aim to give academic credit to data creators and provide long-term access to datasets.
Why does research data matter to libraries? - Jisc RDM
- Research data matters to libraries because it is increasingly being produced and collected by researchers, and there are growing requirements to manage and preserve it.
- A survey found that while most researchers currently manage their own data, there is a trend toward using institutional repositories and libraries more for long-term preservation.
- Libraries are well-suited to help with research data management because of their experience organizing and describing information over long periods of time, but there are also challenges due to differences across disciplines in how data is defined and treated.
- As funders and journals require better data sharing practices, libraries have an opportunity to take a more active role in helping researchers and institutions capture, describe, and manage research data over the long term.
Evolution or revolution? The changing data landscape - Liz Lyon
This document summarizes a presentation on the changing data landscape and challenges of digital information management. It discusses how data sets are becoming core research instruments and potentially the new special collections. It covers perspectives on the increasing scale and complexity of data, as well as challenges regarding storage, incentives, costs and sustainability. It also examines gaps between data policies and practices in areas like data sharing, licensing, ethics and engagement with citizen science.
The document discusses the role of academic libraries in research data management (RDM). It begins by describing the variety of research data types and the large scale of data being produced. It then discusses funders' mandates for good RDM practices and potential areas where libraries can contribute, such as policy development, training, and advisory services. UK libraries are currently offering some basic RDM services but see it as a high priority going forward. Challenges include the need for skills development and concerns about capacity. Librarians need support to develop confidence and competencies in operating in this complex domain.
This document discusses the issue of reproducibility in research. It begins by noting that 47/53 "landmark" publications could not be replicated, and lists some common causes of irreproducibility like cherry-picking data and improper statistical analysis. It then looks at reproducibility from the perspective of different stakeholders like researchers, funders, and the public. Next, it distinguishes between different levels of reproducibility like replicating a study with the same versus different data or software. The document advocates making data and code "first class citizens" in research and describes emerging tools and systems that can help improve reproducibility. It ends by asking questions about what more can be done by individual labs and the research community as a whole to enhance reproducibility.
Reproducibility: A Funder and Data Science Perspective - Philip Bourne
The document discusses the NIH's efforts to improve reproducibility in biomedical research. It describes how the NIH is working to incentivize researchers to make their work reproducible through funding policies, tools, and a proposed "Commons" platform. The Commons would be a virtual platform located in public clouds that would make large NIH-funded datasets and tools FAIR (Findable, Accessible, Interoperable, and Reusable). Several pilots are exploring using the Commons approach to facilitate collaboration and reproducibility. The document raises questions about evaluating the success of the pilots and balancing various metrics in a potential larger-scale implementation of the Commons.
Open Data - strategies for research data management & impact of best practices - Martin Donnelly
This document summarizes a presentation on open data strategies and research data management best practices. It discusses the importance of open data as part of the broader open science movement. The presenter outlines good practices for research data management, including planning, documentation, storage, and deposition. Benefits of good research data management include increased impact, accessibility, transparency, efficiency and data durability. Risks of poor management include legal issues, financial penalties, lost scientific opportunities and reputational harm. The presentation provides a step-by-step approach to research data management and discusses roles and responsibilities of different stakeholders.
Building a collaborative RDM community, research data network - Jisc RDM
This document summarizes Dr. Marta Teperek's presentation on building a collaborative research data management (RDM) community. The presentation covered how not to start RDM services by mandating data sharing, and instead focusing on the benefits of sharing. It discussed Cambridge University's democratic approach to developing RDM services by empowering researchers, and the positive feedback received. Collaboration, open communication, and shaping services and policies with researchers were emphasized as key to success.
Data management: The new frontier for libraries - LEARN Project
Presentation at 3rd LEARN workshop on Research Data Management, “Make research data management policies work”, by Kathleen Shearer, COAR, CARL/ABCR, RDC/DCR, ARL, SSHRC/CSRH.
This document summarizes Susanna-Assunta Sansone's presentation on open access and open data at Nature Publishing Group. Some key points discussed include:
- The benefits of open data including reducing errors/fraud and increasing return on investment in research. However, barriers also exist such as lack of incentives and standards.
- Recent initiatives at NPG to improve data/reproducibility such as requiring data behind figures and expanding methods sections.
- The role of data journals in increasing credit/visibility for shared data and promoting standards/best practices.
- Market research found researchers want increased visibility, usability, and credit for sharing their data.
How structural biology can influence data science and vice versa. Based on a forthcoming paper in Current Opinion in Structural Biology (https://arxiv.org/abs/1807.09247) and presented as part of the University of Virginia Data Science Institute Lunch and Learn Series, August 31, 2018.
About the Webinar
Big data is being collected at a rate that is surpassing traditional analytical methods due to the constantly expanding ways in which data can be created and mined. Faculty in all disciplines are increasingly creating and/or incorporating big data into their research, and institutions are creating repositories and other tools to manage it all. There are many challenges to effectively managing and curating this data, challenges that are both similar to and different from those of managing document archives. Libraries can assume, and are assuming, a key role in making this information more useful, visible, and accessible, for example by creating taxonomies, designing metadata schemes, and systematizing retrieval methods.
Our panelists will talk about their experience with big data curation, best practices for research data management, and the tools used by libraries as they take on this evolving role.
Implications of Big Data & Data Science on Publishing - Philip Bourne
This document discusses the implications of big data and data science for publishing. It defines big data and data science as using large, complex datasets to answer questions and make statistically significant conclusions. Data science creates new types of interdisciplinary research and content that crosses traditional academic silos. Examples provided include using text mining to study censorship, sensor data to study air pollution and forests, and text analysis to study normativity and ethics. The format of publications may also change as datasets and the process of exploring data become more important than standalone findings. Publishers can leverage the new research environments and types of content emerging from data science.
Recommendations for infrastructure and incentives for open science, presented at the Research Data Alliance 6th Plenary. Presenter: William Gunn, Director of Scholarly Communications for Mendeley.
In order to be reused, research data must be discoverable.
The EPSRC Research Data Expectations require research organisations to maintain a data catalogue recording metadata about research data generated by EPSRC-funded research projects.
Universities are increasingly making research data assets available through repositories or other data portals.
The requirement for a UK research data discovery service has grown as universities become more involved in RDM and capacity develops.
The document summarizes a pilot project at the University of Edinburgh to support the development of a UK Research Data Discovery Service. PhD interns engaged with researchers from various schools to describe and deposit research datasets in the university's systems to be harvested by the discovery service. Observations found mixed results across schools, with humanities researchers less comfortable sharing data due to copyright and reluctance to share interpretations. Other schools had established data repositories causing less interest in the university's system. Building research data management practices will require tailored approaches and more training over time.
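The catalogue requirement described above reduces to a simple pattern: each dataset gets a metadata record with a persistent identifier and access terms, serialised so a discovery service can harvest it. A minimal sketch follows; the field names are illustrative, not the EPSRC's or Edinburgh's actual schema.

```python
import json
from dataclasses import dataclass, field, asdict

@dataclass
class CatalogueRecord:
    """One metadata entry in an institutional research data catalogue.

    Field names are illustrative, not taken from any specific standard.
    """
    title: str
    creators: list = field(default_factory=list)
    publication_year: int = 0
    doi: str = ""            # persistent identifier for the dataset
    access_terms: str = ""   # how the data may be obtained and reused

def to_catalogue_json(record: CatalogueRecord) -> str:
    """Serialise a record for harvesting by a discovery service."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)

record = CatalogueRecord(
    title="River sensor readings, 2015-2016",
    creators=["Smith, J."],
    publication_year=2017,
    doi="10.0000/example-dataset",
    access_terms="Openly available under CC-BY 4.0",
)
print(to_catalogue_json(record))
```

In practice a harvester would pull many such records over a protocol such as OAI-PMH; the point here is only that the catalogue is structured metadata about data, not the data itself.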
Provenance in Support of the ANDS Four Transformations - Andrew Treloar
The document discusses the Australian National Data Service (ANDS) and how it uses provenance information to support its four transformations of research data. ANDS aims to make Australian research data more discoverable, accessible, and reusable. It focuses on adding value to data through re-use rather than storing data itself. Provenance capture is important for managing data, connecting related data, improving discoverability, and enabling re-analysis. ANDS has funded projects involving provenance services and integration. Future work includes developing domain-specific extensions to the PROV-O standard and strengthening connections with the Research Data Alliance Interest Group on Provenance.
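The provenance capture described above rests on the W3C PROV model: entities, activities, and agents linked by relations such as prov:wasDerivedFrom and prov:wasGeneratedBy. The following sketch shows, under the simplifying assumption of an in-memory triple list (real systems would use PROV-O as RDF), how those links enable the lineage tracing that supports re-analysis.

```python
# A minimal sketch of PROV-style provenance capture. The store and
# method names are illustrative; only the relation names come from
# the W3C PROV vocabulary.

class ProvenanceStore:
    """Records PROV-style statements as (subject, relation, object) triples."""

    def __init__(self):
        self.triples = []

    def was_derived_from(self, derived, source):
        # prov:wasDerivedFrom links a derived entity to its source entity
        self.triples.append((derived, "prov:wasDerivedFrom", source))

    def was_generated_by(self, entity, activity):
        # prov:wasGeneratedBy links an entity to the activity that produced it
        self.triples.append((entity, "prov:wasGeneratedBy", activity))

    def lineage(self, entity):
        """Walk wasDerivedFrom links back to the original source data."""
        chain = [entity]
        while True:
            sources = [o for s, r, o in self.triples
                       if s == chain[-1] and r == "prov:wasDerivedFrom"]
            if not sources:
                return chain
            chain.append(sources[0])

store = ProvenanceStore()
store.was_derived_from("cleaned-dataset", "raw-sensor-data")
store.was_derived_from("published-figures", "cleaned-dataset")
store.was_generated_by("published-figures", "analysis-run-42")
print(store.lineage("published-figures"))
# → ['published-figures', 'cleaned-dataset', 'raw-sensor-data']
```

Tracing from a published output back to raw data is exactly what makes re-analysis and improved discoverability possible; domain-specific PROV-O extensions add richer attributes to these same links.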
ANDS Applications Program: Building Tools to Facilitate Data Reuse - Andrew Treloar
Presentation accompanying a talk on the ANDS Applications program at IDCC 2016. Discusses the outputs of the program, but also focusses on issues of sustainability for such eResearch tools.
Introductory talk for ANDS workshop on Institutional Repositories and data. The talk situates the topic within the field of scholarly communication before comparing the relative technical simplicity of running repositories of publications with the complexities that accompany a shift to data. The most-retweeted slide is the one viewing the response of repository managers to data through the lens of Elizabeth Kübler-Ross' stages of grieving.
Closing comments at the #iPres 2014 conference (Andrew Treloar)
This document summarizes the author's observations from attending the iPres 2014 conference. The author notes that some presentations focused on recreating existing infrastructure instead of building on what is there. Several talks emphasized the importance of preserving data and processes to maintain the scholarly record. Overall the conference provided useful reflections on digital preservation practice and experience, though some theoretical papers seemed detached from real-world challenges.
The document discusses changes in scholarly research and outputs that have implications for data archives. It notes that the research process itself is becoming more visible and dynamic as scholars use websites and tools to record and share their work. This results in a more extensive and heterogeneous scholarly record. However, many of these recording platforms are not designed for long-term archiving. Therefore, archives will need to develop new approaches to account for the characteristics of research recorded on the web, and to trigger the transfer of outputs from recording platforms to long-term archives.
The universe of identifiers and how ANDS is using them (Andrew Treloar)
Presentation on identifiers in general, and ANDS' approach to identifiers for objects and people in particular. Given at ODIP 3rd Workshop on August 7, 2014.
The document discusses adding value to researchers' data through the Australian National Data Service (ANDS). ANDS aims to transform unmanaged, disconnected data into structured collections that are managed, connected, findable, and reusable. It does this through nationally coordinated engagement with institutions and disciplines. The goal is to help researchers easily publish, discover, access and reuse research data.
The life-sciences as a pathfinder in data-intensive research practice (Andrew Treloar)
Presentation given at UQ Winterschool 2014. The advent of the Internet is bringing about fundamental changes in the ways that research is performed and communicated. These have been particularly driven by the growing importance of data, as well as the tools available to work with this data. This presentation will examine this shift, drawing on examples from the life-sciences, and try to make some predictions about the next five years.
Past, present, and future of scholarly technology and practices (Andrew Treloar)
Thoughts about past, the present and the future of scholarly technologies and scholarly practices. Based on work done with @hvdsomp at #DANS, as well as discussions with @scharnhorsta
Talk given by @atreloar and @hvdsomp at workshop sponsored by http://dans.knaw.nl/ with title "Riding the Wave and the Scholarly Archive of the Future". NOTE: This reflects thinking in progress which may well change in the future.
Data Infrastructure and the Scholarly Ecosystem of the Future (Andrew Treloar)
Talk delivered at forum at SURF in the Netherlands with the hashtag #disef. Talk deals with an overview of some thinking being done about elements of the ecosystem for scholarship, as well as some slides dealing with the Australian National Data Service (ands.org.au) and the Research Data Alliance (rd-alliance.org). These latter slides were used during a Q&A session as part of the talk.
Research data and the ANDS agenda in Australia (Andrew Treloar)
This document discusses research data and the agenda of the Australian National Data Service (ANDS) in Australia. ANDS was established in 2009 to enable Australian researchers to more easily publish, discover, access and reuse research data. It provides several national services and has funded over 200 projects. The document also outlines relevant national policies and ANDS's involvement in international organizations like the Research Data Alliance.
This document discusses how data is driving decisions in research. It notes that the amount of data being generated is growing exponentially and researchers are now in the data business. It outlines four transformations needed - from unmanaged to managed data, disconnected to connected data, invisible to findable data, and single-use to reusable data. National strategies in Australia are aiming to support these transformations through initiatives like the Australian National Data Service which provides resources and expertise to help researchers manage, connect, and enable reuse of research data.
Building on the Atlas (of Living Australia) (Andrew Treloar)
Presentation given at Atlas of Living Australia Science Symposium 2013. Discusses Australian National Data Service Applications program and two specific projects: Soils to Satellites (also involving TERN), and Edgar Bird Species distribution.
Journal literature size in the context of the LHC data (Andrew Treloar)
Single slide Powerpoint animation showing the total size of the journal literature in the context of the data produced by the LHC from 2009-2013. Need to view in Slideshow mode.
The document discusses ANDS' efforts to augment data discovery through repurposing DataCite metadata. It describes ANDS' goals of making data more findable, accessible, and reusable. It outlines a three stage plan to provide "See Also" suggestions for datasets: 1) internal suggestions, 2) suggestions from searching DataCite metadata, and 3) potentially integrating additional sources like the National Library of Australia. The "See Also" feature aims to support serendipity in discovery. Future work may include ranking searches and expanding the types of related results provided.
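The "See Also" idea described above can be sketched with a simple relevance heuristic: given one dataset's keywords, rank other DataCite-style metadata records by keyword overlap. The records below are hypothetical, and a production service would query the DataCite search API rather than an in-memory list — this is only a minimal illustration of the suggestion logic:

```python
# Rank candidate records by how many keywords they share with the seed
# dataset; records with no overlap are excluded. All data is invented.
def see_also(seed, candidates, limit=3):
    seed_kw = {k.lower() for k in seed["keywords"]}
    scored = []
    for rec in candidates:
        overlap = seed_kw & {k.lower() for k in rec["keywords"]}
        if overlap:
            scored.append((len(overlap), rec["title"]))
    scored.sort(reverse=True)  # most shared keywords first
    return [title for _, title in scored[:limit]]

seed = {"title": "Reef fish survey 2012", "keywords": ["coral", "fish", "GBR"]}
candidates = [
    {"title": "Coral bleaching time series", "keywords": ["coral", "GBR"]},
    {"title": "Soil carbon measurements", "keywords": ["soil", "carbon"]},
    {"title": "Fish larvae counts", "keywords": ["fish"]},
]
print(see_also(seed, candidates))
# → ['Coral bleaching time series', 'Fish larvae counts']
```

The ranking stage mentioned as future work would replace the raw overlap count with a proper relevance score.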
From Data to Data: One version of a History of Scholarly Communication (Andrew Treloar)
1) Scholarly communication has evolved from early written works and data to modern digital scholarship that generates vast amounts of data.
2) Issues with data preservation, accessibility, and selective publication have impacted the completeness of the evidence base over time.
3) As data-intensive research increases, standardization and data federation are needed to aggregate data from multiple sources and answer new questions.
4) Initiatives like institutional repositories, researcher workflows, and national programs aim to improve data sharing, access, and reuse to support new discoveries.
Data management: international challenges, national infrastructure, and insti... (Andrew Treloar)
This document discusses challenges and responses to data management from an Australian perspective. It outlines international challenges around inconvenient, imprisoned, invisible, and inaccessible data. It then discusses the importance of data reuse for efficiency, validation, and value. Two case studies on astronomy and cancer research demonstrate increased citations when data is publicly shared. The document also outlines Australia's national data service, ANDS, which aims to make data managed, connected, findable, and reusable. ANDS is building national data services and working with institutions to improve data management policies, capture, and metadata. Ongoing issues include balancing local vs national needs, sustainability, and encouraging data sharing cultures.
Research data ecology
1. Conceptualising Collaboration
and Competition in the Changing
Ecology of Research Data
Dr Andrew Treloar
Director of Technology
Australian National Data Service
18/06/2012 1
2. Why me?
• Information management
• Scholarly communication
• Institutional repositories
• Research data management
• 'Adjunct librarian'
• andrew.treloar.net/research
3. ANDS enables transformation of:
Data that are:        To Structured Collections that are:
Unmanaged       →     Managed
Disconnected    →     Connected
Invisible       →     Findable
Single use      →     Reusable
so that Australian researchers can easily
publish, discover, access and use research data.
ands.org.au
5. Jungle
BY http://upload.wikimedia.org/wikipedia/commons/thumb/4/47/Jungle.jpg/1280px-Jungle.jpg
6. Why an ecological approach?
• Information ecology:
o people
o practices
o values
o technologies
• Way of thinking about the space that offers
richer insights
7. Ecology elements
• Systems that evolve over time
• Environmental factors (constraints, forcing)
• Selection pressures
• Biodiversity
• Species and individuals
• Niches for colonisation/exploitation
• Resources
• Interactions
• Species co-evolution/co-adaptation
8. Research data ecology elements
• Researchers
• Institutions
• Research funders
• Data centres (institutional, disciplinary,
national, international)
• Disciplines
• Research facilities
• Libraries
• Publishers
13. Co-evolution isn't necessarily good
• Systems co-evolve
• But can also get stuck in a new stable (not
necessarily more desirable) state
• Example: p-journals → e-journals
o form and access arrangements largely
unchanged
• #openaccess is now gaining momentum
• But form changing more slowly
14. New niches allow for new possibilities
• Internet was new niche for journals
CC-BY http://www.flickr.com/photos/stone-imaginings/3504148642/
15. Research data can be new niche for librarians
• New roles within institutions
• New way to engage with wider range of
clients
• New application of existing skills
• New partnerships with Research Office, IT
Services, e-Research folks
16. Selection pressures in research data driving change
• Increasing:
  o volume
  o variety
  o velocity
  (Gartner, 2001)
• Increasing importance of data relative to
publications
• Mixed messages from journal publishers
• Outcomes currently unclear
17. Role of Publishers
• Is the relationship between the publishers
of research and the producers of research
symbiotic or parasitic?
• And how will rise of data-intensive
research change this?
o Protein Data Bank
o Human Genome Project
o International Virtual Observatory
18. Collaboration or competition?
• Symbiotic relationships are often better for
both parties than either competition or
predator/prey
CC-BY http://www.flickr.com/photos/peternijenhuis/2979063336/
19. Conclusions
• Ecology provides a richer way of thinking
about scholarly communication than
mechanics
• Research data is a new niche for (some)
librarians
o but it's a niche undergoing great change
• Look for symbiotic relationships
• Critically examine the roles of other
players in the ecosystem
20. Further reading
• B. A. Nardi & V. L. O'Day, "Information ecologies: using technology with heart. Chapter Four: Information ecologies", First Monday, Vol 4 No 5, May 3, 1999. http://www.firstmonday.org/issues/issue4_5/nardi_chapter4.html
• R. J. Robertson, M. Mahey, J. Allinson, "An ecological approach to repository and service interactions", v. 1.5. http://ie-repository.jisc.ac.uk/272/1/Introductoryecologyreport.pdf
I've been working on/for ANDS for over four years. Prior to that, e-Research and Institutional Repositories. Now, on with the talk. I'm going to look at one way of thinking about research data within scholarly communication. When you think of the current system of scholarly communication, do you think of this?
or this?
An alternative, and probably more realistic, view. So, why take an ecological approach?
Building here on the work of Nardi and O’Day (as well as Kaufer and Carley). Homework at end.
So, what does this mean in the context of research data?
I'd now like to think about relationships between 'species' in the research data ecology. Four basic kinds of relationship are possible:
Predator-Prey
Competitor
Parasitism: Eucalyptus mistletoe
Symbiosis: Potato cod on the GBR. So, how does this framework help us think about scholarly communication and the role of research data? Here are some thoughts.
Volume – the SKA producing 10 petabytes per hour (1 PB = one thousand terabytes), roughly 2,000,000 DVDs per hour.
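The note's back-of-the-envelope conversion can be checked directly, assuming decimal units and single-layer 4.7 GB DVDs:

```python
# 10 PB/hour expressed as DVDs, using decimal prefixes (1 PB = 1,000,000 GB).
pb_per_hour = 10
gb_per_pb = 1_000_000
dvd_gb = 4.7  # single-layer DVD capacity

dvds_per_hour = pb_per_hour * gb_per_pb / dvd_gb
print(round(dvds_per_hour))
# → 2127660, i.e. just over 2 million DVDs every hour
```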
Symbiosis between coral (a sedentary filter-feeding animal) and the green algae within its tissues benefits both partners (and opens up opportunities for huge diversity).