The document discusses research objects (ROs), which bundle together primary research results, metadata, software, and other materials. It describes the roles of data creators, curators, and data scientists in working with ROs from initial research through publication and later reuse. The SEAD Virtual Archive (VA) implements a model in which ROs transition between states as they move through the research lifecycle from creation to publication and reuse.
8. Components of a Research Object
• Identity: unique ID
• Entities: the core data or software objects themselves
• Properties: Aggregation: “belongs to” relationship, used to aggregate items within a Research Object
• Properties: Relationships: “related to” relationship
• Properties: Descriptive/Annotative: metadata
• Properties: Provenance: “derived from”, “versioned from”, and other relationships
• Properties: Agents: data creator (author list), curator, data scientist
• State: external to the RO
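To make the list above concrete, here is a minimal sketch of how these components could be held in code. The field names are illustrative assumptions, not the SEAD VA data model; note that State is deliberately left out because the slide places it outside the RO.

from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class ResearchObject:
    # Identity: unique ID
    identity: str
    # Entities: the core data or software objects themselves (here, just paths/IDs)
    entities: List[str] = field(default_factory=list)
    # Properties: Aggregation ("belongs to") and Relationships ("related to")
    aggregated_in: List[str] = field(default_factory=list)
    related_to: List[str] = field(default_factory=list)
    # Properties: Descriptive/Annotative metadata
    metadata: Dict[str, str] = field(default_factory=dict)
    # Properties: Provenance ("derived from", "versioned from", ...)
    derived_from: List[str] = field(default_factory=list)
    versioned_from: List[str] = field(default_factory=list)
    # Properties: Agents (data creators, curator, data scientist)
    agents: Dict[str, List[str]] = field(default_factory=dict)
    # State is tracked externally to the RO, so it is not a field here.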
26. Packaging and Mapping (BagIt / ORE)
• BagIt format
• standardized “envelopes” (bags)
• no requirement to “know” the internal semantics
• 3 elements: a bag declaration (bagit.txt), a manifest file (manifest-<algorithm>.txt), and a folder with the content (data)
• Tools available for bagging
• SEAD BagIt service
• LOC Bagger tool (http://sourceforge.net/projects/loc-xferutils/files/loc-bagger/2.1.2/)
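As a rough illustration of the three elements listed above, here is a standard-library-only sketch that writes a tiny bag by hand. The LOC Bagger tool or the SEAD BagIt service would normally do this; the directory and file names below are illustrative, while the declaration and manifest contents follow the BagIt spec.

import hashlib
from pathlib import Path

def make_bag(bag_dir: Path, payload: dict) -> None:
    """Write a minimal bag: bagit.txt, manifest-md5.txt, and a data/ folder."""
    data_dir = bag_dir / "data"
    data_dir.mkdir(parents=True, exist_ok=True)
    manifest_lines = []
    for name, content in payload.items():
        (data_dir / name).write_bytes(content)
        manifest_lines.append(f"{hashlib.md5(content).hexdigest()}  data/{name}")
    # Bag declaration
    (bag_dir / "bagit.txt").write_text("BagIt-Version: 0.97\nTag-File-Character-Encoding: UTF-8\n")
    # Payload manifest, named after the checksum algorithm used
    (bag_dir / "manifest-md5.txt").write_text("\n".join(manifest_lines) + "\n")

make_bag(Path("test_bag"), {"Vortex_Mining.xlsx": b"example bytes"})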
27. Resource Maps
• OAI/ORE standard
• Exposes rich content
• Captures the semantics of relationships among RO items
• Identifies aggregations
• SEAD VA OAI/ORE relationship classes:
• Aggregation
• Description
• Authorship
• Copyright / rights
• Modification
• Derivation
• Citation
• Processing (calculation, computation, etc.)
29. OAI/ORE Map Example
<rdf:RDF
…
  <rdf:Description rdf:about="URI"> <!-- data item -->
    <ore:isAggregatedBy>ID</ore:isAggregatedBy>
    <dcterms:identifier rdf:datatype="URI">ID</dcterms:identifier>
    <dcterms:title rdf:datatype="URI">Vortex_Mining.xlsx</dcterms:title>
    <dcterms:source rdf:datatype="URI">test_bag/data/Vortex_Mining.xlsx</dcterms:source>
    <!-- A related resource from which the described resource is derived. -->
  </rdf:Description>
…
</rdf:RDF>
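For readers who want to work with such a map programmatically, the following sketch parses a small, self-contained resource map with the rdflib Python library and lists each item and the aggregation it belongs to. The namespaces and identifiers in the embedded map are assumed examples, not actual SEAD VA output.

from rdflib import Graph, Namespace
from rdflib.namespace import DCTERMS

ORE = Namespace("http://www.openarchives.org/ore/terms/")

resource_map = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ore="http://www.openarchives.org/ore/terms/"
         xmlns:dcterms="http://purl.org/dc/terms/">
  <rdf:Description rdf:about="http://example.org/item/1">
    <ore:isAggregatedBy rdf:resource="http://example.org/aggregation/1"/>
    <dcterms:title>Vortex_Mining.xlsx</dcterms:title>
    <dcterms:source>test_bag/data/Vortex_Mining.xlsx</dcterms:source>
  </rdf:Description>
</rdf:RDF>"""

g = Graph()
g.parse(data=resource_map, format="xml")

# List every aggregated item, its title, and the aggregation (research object) it belongs to.
for item, aggregation in g.subject_objects(ORE.isAggregatedBy):
    print(g.value(item, DCTERMS.title), item, "is aggregated by", aggregation)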
45. Service Level Agreement – Requirements and Privileges (summary)
• RO properties – Requirements
• Data contributor Institutional Affiliation
• Scientific Domain
• Data Organization (e.g.: BagIt or SWORD)
• Size
• Versioning
• Minimal Metadata
• Licensing (e.g., open, embargoed)
• Repository privileges
• The repository is free to redistribute the RO received from SEAD VA, except in the case of an embargo.
• The repository can migrate the RO into other formats and redistribute the migrated ROs.
• Repository curators can annotate data collections to comply with standards or policy upgrades.
47. Excerpt from the SLA for IU ScholarWorks
• Institutional Affiliation
• At least one author, at the time of deposit, belongs to the same institution as our repository.
• RO Size
• 150 MB for items uploaded directly to IUScholarWorks, 10 GB total
• 5 TB for items hosted on the SDA
• Versioning
• Only the final PO is accepted; subsequent versions will replace the version of record.
• Scientific Domain – Curator review might be needed
• ROs are associated with research in the domains of ANY (identify specific domains, or put “sustainability science” for a broader match)
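A minimal sketch, assuming invented field names rather than the actual SEAD VA matchmaking code, of how the RO requirements above could be checked against a repository profile such as the IU ScholarWorks excerpt. The minimal-metadata fields and license values are illustrative; the 150 MB limit comes from the excerpt.

def unmet_requirements(ro: dict, sla: dict) -> list:
    """Return the SLA requirements this RO fails to meet (empty list = depositable)."""
    problems = []
    if sla["institution"] not in ro.get("author_institutions", []):
        problems.append("no author affiliated with the repository's institution")
    if ro.get("size_bytes", 0) > sla["max_size_bytes"]:
        problems.append("RO exceeds the repository's direct-upload size limit")
    if ro.get("organization") not in sla["accepted_organizations"]:
        problems.append("data organization not accepted (e.g. not BagIt)")
    missing = set(sla["minimal_metadata"]) - set(ro.get("metadata", {}))
    if missing:
        problems.append("missing minimal metadata: " + ", ".join(sorted(missing)))
    if ro.get("license") not in sla["accepted_licenses"]:
        problems.append("license not accepted")
    return problems

iu_scholarworks_sla = {
    "institution": "Indiana University",
    "max_size_bytes": 150 * 1024**2,                      # 150 MB direct-upload limit from the excerpt
    "accepted_organizations": ["BagIt"],
    "minimal_metadata": ["title", "creator", "abstract"], # assumed minimal set
    "accepted_licenses": ["open", "embargoed"],
}

print(unmet_requirements(
    {"author_institutions": ["Indiana University"],
     "size_bytes": 10 * 1024**2,
     "organization": "BagIt",
     "metadata": {"title": "Vortex mining data", "creator": "X", "abstract": "..."},
     "license": "open"},
    iu_scholarworks_sla))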
61. Overview: The Data Scientist
The data scientist uses research objects that were created by someone else for his/her own purposes, and creates new research objects by modifying existing ones.
Super simple example: putting the images in a given RO (RO 3) into a single presentation and creating a new RO (sketched in code below).
Data scientist can:
• Search
• Download (bags)
• Modify
• Re-upload
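A minimal sketch of the "super simple example" above, assuming the source bag has already been downloaded; the paths are illustrative, and the search and re-upload steps against SEAD VA's own API are not shown.

from pathlib import Path
import shutil

downloaded = Path("ro_3_bag/data")            # payload of a bag downloaded from the VA (illustrative path)
new_ro = Path("presentation_ro/data")
new_ro.mkdir(parents=True, exist_ok=True)

# Gather the images from the existing RO into a new research object's payload;
# the new RO would then be bagged (as in the BagIt sketch earlier) and re-uploaded.
for image in downloaded.glob("*.png"):
    shutil.copy(image, new_ro / image.name)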
65. Provenance Capture in SEAD VA
• Uses Komadu provenance system
• Captures activity in real time and assembles it into an internal representation as provenance graphs
• W3C PROV spec compliant
• Terminology
• Activity: Some Processing Event in SEAD VA
• Entity: A Research Object (in CO or PO state)
• Agent: Data Creator, Curator, Data Scientist
72. Curation Time Provenance Capture
• Curation Activities
• Curation-Edit-Event
• Publish-Event
• Provenance relationships captured in Komadu
• Agent-Activity: when some Agent triggers one of the above Activities
• Activity-Entity: when an Activity generates (updates) a Research Object
• Example scenario
• Curator X edits metadata on research object Y
• Agent-Activity relationship (association) between X and the Curation-Edit-Event
• Activity-Entity relationship (generation) between the Curation-Edit-Event and Y
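A minimal sketch of the example scenario above expressed in the W3C PROV data model, written with the generic prov Python package rather than Komadu's own ingest API; the namespace and identifiers are illustrative.

from prov.model import ProvDocument

doc = ProvDocument()
doc.add_namespace("sead", "http://example.org/sead-va/")

curator_x = doc.agent("sead:curator-X")
edit_event = doc.activity("sead:Curation-Edit-Event-1")
ro_y = doc.entity("sead:research-object-Y")

# Agent-Activity: Curator X triggers the Curation-Edit-Event (association)
doc.wasAssociatedWith(edit_event, curator_x)
# Activity-Entity: the Curation-Edit-Event generates (updates) research object Y
doc.wasGeneratedBy(ro_y, edit_event)

print(doc.get_provn())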