Presentation to the UM Library Emergent Research Series - SEAD
SEAD is a 5-year project funded by the NSF to develop cyberinfrastructure for sustainable data preservation and access. It is a partnership between the universities of Michigan, Indiana, and Illinois. SEAD aims to serve researchers in sustainability science who work in small teams and have diverse data needs. It provides active curation tools, collaboration spaces, and interfaces that integrate data, publications, and people. Data can be deposited to university repositories through the SEAD Virtual Archive for long-term preservation and discovery. Lessons learned so far show that more support is needed to bridge data production and long-term infrastructure. Future plans include expanding the user community and repository options.
Preservation, Publishing, and People: A SEAD View - Inna Kouper
The document discusses research objects (ROs), which bundle together primary research results, metadata, software, and other materials. It describes the roles of data creators, curators, and data scientists in working with ROs as they move from initial research to publication and later reuse. The SEAD Virtual Archive (VA) implements a model for ROs that allows them to transition between different states as they move through the research lifecycle from creation to publication and reuse.
Data Sets, Ensemble Cloud Computing, and the University Library: Getting the ... - SEAD
This document discusses research data management and the role of university libraries. It describes the SEAD (Sustainable Environment Actionable Data) project, which provides data services like curation, preservation, and a social community network to support research data across its lifecycle. SEAD aims to support interdisciplinary research by allowing researchers to define and manage related collections of data and metadata called Research Objects in a scalable way. The document argues that research organizations are best positioned to provide comprehensive long-term data services that integrate across the entire research process.
This document discusses the Sustainable Environment - Actionable Data (SEAD) project. SEAD aims to provide data services to sustainability researchers by developing tools that address challenges like heterogeneous and small datasets. It plans to move data curation upstream, involve domain scientists, and leverage social media and metadata. SEAD will integrate these active curation services into a federated infrastructure to preserve datasets long-term. The project is led by researchers from multiple institutions and funded by the National Science Foundation.
Practical and Conceptual Considerations of Research Object Preservation - SEAD
This document discusses research object (RO) frameworks for preserving digital research data. It addresses the challenges of research spanning long periods of time and involving complex, heterogeneous data that changes states. The research object framework aims to capture agents, states, relationships, and content to enable automation, reproducibility, and reuse of research. The framework defines three states for research objects: live, curated, and published. Live objects are works in progress, curated objects are packaged for preservation, and published objects are immutable and citable. The framework allows documentation of research processes and outputs to build trust and facilitate reuse.
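The three states described above can be read as a small state machine. The following is a minimal sketch of that idea, not SEAD's implementation: the names `ROState`, `ResearchObject`, `add_file`, and `advance` are hypothetical, and the forward-only ordering (live, then curated, then published) and the rule that curated objects may still receive files are assumptions made for illustration. Only the immutability of published objects comes from the summary.

```python
from dataclasses import dataclass, field
from enum import Enum


class ROState(Enum):
    """The three research object states named in the summary."""
    LIVE = "live"            # work in progress
    CURATED = "curated"      # packaged for preservation
    PUBLISHED = "published"  # immutable and citable


# Assumption: objects only move forward, one state at a time.
_NEXT_STATE = {
    ROState.LIVE: ROState.CURATED,
    ROState.CURATED: ROState.PUBLISHED,
}


@dataclass
class ResearchObject:
    """Hypothetical container for a research object's content and state."""
    title: str
    files: list = field(default_factory=list)
    state: ROState = ROState.LIVE

    def add_file(self, name: str) -> None:
        # Published objects are immutable, so content changes are refused.
        if self.state is ROState.PUBLISHED:
            raise PermissionError("published research objects are immutable")
        self.files.append(name)

    def advance(self) -> ROState:
        # Move to the next state; a published object has nowhere to go.
        if self.state not in _NEXT_STATE:
            raise ValueError("a published research object has no further state")
        self.state = _NEXT_STATE[self.state]
        return self.state
```

Calling `advance()` twice on a new object takes it from live to curated to published, after which `add_file` raises `PermissionError`.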
Digital Library Federation - DataNets Panel presentation (Nov. 1st, 2011) - SEAD
This document summarizes a panel discussion on the NSF-funded DataNet partnerships program. It introduces the panelists from various DataNet projects, including SEAD, TerraPop, DataNet Federation Consortium, and DataONE. It then provides more detail on the goals and strategies of the SEAD project, which aims to develop tools and services to address the needs of long-tail sustainability research by leveraging social curation and active metadata. SEAD works to move data curation upstream and engage researchers throughout the project using automated metadata and volunteered contributions.
The NSF DataNet Program aims to create exemplar data infrastructure organizations called DataNet Partners to provide researchers with access to data and advance research. SEAD is one such DataNet Partner that provides lightweight data services for sustainability science. It acts as an active content repository and curation service, and is developing tools for community exploration of data. The current focus is on an end-user workshop, conference demonstrations, and interface redesign to refine models for supporting the full lifecycle of research data objects.
Changing the Curation Equation: A Data Lifecycle Approach to Lowering Costs a... - SEAD
This document discusses the Sustainable Environment Actionable Data (SEAD) project, which aims to lower the costs and increase the value of data curation through a data lifecycle approach. SEAD provides lightweight data services to support sustainability research, including secure project workspaces, active and social curation tools, and integrated lifecycle support for data from ingest to long-term preservation. By leveraging technologies like Web 2.0 and standards, SEAD simplifies and automates curation processes using metadata captured from data producers and users. This allows curation activities to begin earlier in the data lifecycle and be distributed across researchers and curators.
ESA14 Workshop on SEAD's Data Services and Tools - SEAD
This document provides an overview of the SEAD (Sustainable Environment Actionable Data) services and tools for data curation, preservation, and sharing. It outlines the SEAD workshop agenda, which demonstrates how to use project spaces to manage research data, metadata, and social features. It also describes how to publish and preserve data, connect with other researchers through profiles and a research network, and find data within a project space. The goal of SEAD is to provide secure, team-controlled spaces to manage research data throughout the data lifecycle and promote sharing and discovery.
SEAD: Lightweight Data Services for Sustainability Research - SEAD
This document describes SEAD, a set of lightweight data services to help sustainability researchers manage, share, and preserve their data. SEAD offers a secure project space to work privately with data, services to publish data and get DOIs to ensure the longevity of data, and tools to connect researchers and help them get credit for their work including profiles, networking visualizations, and metrics of research impact. It also provides data discovery resources to find relevant published data through faceted search and geospatial tools to view and interact with location-based data on maps.
Research Data Access and Preservation Summit, 2014
San Diego, CA
March 26-28, 2014
Jared Lyle, ICPSR
Jennifer Doty, Emory University
Joel Herndon, Duke University
Libbie Stephenson, University of California, Los Angeles
The document discusses data management plan requirements for proposals submitted to the U.S. Department of Energy Office of Science for research funding. It provides context on the history of data management policies, outlines the four main requirements for inclusion of a data management plan, and suggests elements that should be included in the plan such as data types/sources, content/format, sharing/preservation, and protection. It also discusses tools like the Public Access Gateway for Energy and Science that can help manage access to research publications and data.
RDAP 15 Local ICPSR Data Curation Workshop Pilot Project - ASIS&T
Research Data Access and Preservation Summit, 2015
Minneapolis, MN
April 22-23, 2015
Linda Detterman, Jennifer Doty, Jared Lyle, Amy Pienta, Lizzy Rolando and Mandy Swygart-Hobaugh
These are the slides for Robert H. McDonald for the Future Trends Panel Presentation at the Inter-institutional Approaches to Supporting Scholarly Communication Symposium held on August 16, 2012 at the Georgia Institute of Technology.
February 18 2015 NISO Virtual Conference Scientific Data Management: Caring for Your Institution and its Intellectual Wealth
Using data management plans as a research tool: an introduction to the DART Project
Amanda L. Whitmire, Ph.D., Assistant Professor, Data Management Specialist, Oregon State University Libraries & Press
RDAP 15 EarthCollab: Connecting Scientific Information Sources using the Sema... - ASIS&T
Research Data Access and Preservation Summit, 2015
Minneapolis, MN
April 22-23, 2015
Erica M. Johns, Jon Corson-Rikert, Huda J. Khan, Dean B. Krafft and Matthew S. Mayernik
RDAP14: Maryann Martone, Keynote, The Neuroscience Information Framework - ASIS&T
The Neuroscience Information Framework (NIF) is an initiative of the NIH Blueprint to maximize access to and utility of worldwide neuroscience research resources. NIF catalogs over 10,000 resources including databases, literature, and materials. It provides search capabilities across these resources and develops ontologies and semantic frameworks to integrate diverse data types and scales. NIF aims to make dispersed neuroscience information more findable, accessible, interoperable, and reusable to enable new insights.
Poster RDAP13: Data information literacy multiple paths to a single goal - ASIS&T
Jake Carlson, Jon Jeffryes, Brian Westra and Sarah Wright
Data Information Literacy: Multiple Paths to a Single Goal
Research Data Access & Preservation Summit 2013
Baltimore, MD April 4, 2013 #rdap13
Information technology and resources are an integral and indispensable part of the contemporary academic enterprise. In particular, technological advances have nurtured a new paradigm of data-intensive research. However, far too much of this activity still takes place in silos, to the detriment of open scholarly inquiry, integrity, and advancement. To counteract this tendency, the University of California Curation Center (UC3) has been developing and deploying a comprehensive suite of curation services that facilitate widespread data management, preservation, publication, sharing, and reuse. Through these services UC3 is engaging with new communities of use: in addition to its traditional stakeholders in cultural heritage memory organizations, e.g., libraries, museums, and archives, the UC3 service suite is now attracting significant adoption by research projects, laboratories, and individual faculty researchers. This webinar will present an introduction to five specific services – DMPTool, DataUp, EZID, Merritt, Web Archiving Service (WAS) – applicable to data curation throughout the scholarly lifecycle, two recent initiatives in collaboration with UC campuses, UC Berkeley Research Hub and UC San Francisco DataShare, and the ways in which they encourage and promote new communities of practice and greater transparency in scholarly research.
Poster RDAP13: Research Data in eCommons @ Cornell: Present and Future - ASIS&T
Wendy A. Kozlowski, Dianne Dietrich, Gail Steinhart and Sarah Wright
Cornell University Library, Ithaca, NY
Research Data in eCommons @ Cornell: Present and Future
Research Data Access & Preservation Summit 2013
Baltimore, MD April 4, 2013 #rdap13
Poster RDAP13: A Workflow for Depositing to a Research Data Repository: A Cas... - ASIS&T
Betsy Gunia, David Fearon, Benjamin Brosius, Tim DiLauro
JHU Data Management Services
Johns Hopkins University Sheridan Libraries
A Workflow for Depositing to a Research Data Repository: A Case Study for Archiving Publication Data
Research Data Access & Preservation Summit 2013
Baltimore, MD April 4, 2013 #rdap13
This presentation was provided by Joe Zucca of the University of Pennsylvania, during Session Five of the NISO event "Assessment Practices and Metrics for the 21st Century," held on November 22, 2019.
Jeff Haywood - Research Integrity: Institutional Responsibility - Jisc
1) The document discusses challenges and solutions related to research data management (RDM) at the University of Edinburgh. It outlines the university's RDM policy and implementation plan to provide training, support, and services for storing, backing up, and sharing research data.
2) The RDM working group at the university recommended establishing a research data service strategy to provide archiving of data, globally accessible storage, and support for mobile access and collaboration.
3) Key challenges going forward include securing sustainable funding, integrating new services with existing practices, developing support staff skills, and encouraging researcher engagement with new RDM practices.
Feb 26 NISO Training Thursday
Crafting a Scientific Data Management Plan
About the Training
Addressing a data management plan for the first time can be an intimidating exercise. Join NISO for a hands-on workshop that will guide you through the elements of creating a data management plan, including gathering necessary information, identifying needed resources, and navigating potential pitfalls. Participants explore the important components of a data management plan and critique excerpts of sample plans provided by the instructors.
This session is meant to be a guided, step-by-step session that will follow the February 18 NISO Virtual Conference, Scientific Data Management: Caring for Your Institution and its Intellectual Wealth.
About the Instructors
Kiyomi D. Deards, MSLIS, Assistant Professor, University of Nebraska-Lincoln Libraries
Jennifer Thoegersen, Data Curation Librarian, University of Nebraska-Lincoln Libraries
This presentation was provided by Carly Strasser of the Chan Zuckerberg Initiative during the NISO hot topic virtual conference "Effective Data Management," which was held on September 29, 2021.
SEAD: Sustainable Environment-Actionable Data - Robert McDonald - RDAP12 ASIS&T
SEAD: Sustainable Environment-Actionable Data
Leveraging Existing Cyberinfrastructure for Long-Term Sustainability
Margaret Hedstrom-University of Michigan
James Myers-Rensselaer Polytechnic Institute
Robert H. McDonald-Indiana University
Presentation at Research Data Access & Preservation Summit
22 March 2012
SEAD Prototype: Data Curation and Preservation for Sustainability Science - SEAD
The SEAD prototype aims to enable data curation and preservation for sustainability science research by providing tools for ingesting, annotating, visualizing, and preserving heterogeneous research data. It integrates three components: Active Curation and Research (ACR) for data management and curation, VIVO for networking and analytics of research outputs, and Virtual Archive (VA) for long-term data publication, preservation, and discovery. The prototype is being tested by curating a 1.6 terabyte dataset from the National Center for Earth Surface Dynamics involving transfer of data and metadata between the three SEAD components.
CNI Fall 2011 Meeting Presentation Margaret Hedstrom & Robert McDonald (Dec. ... - SEAD
SEAD is a new NSF-funded project that aims to provide sustainable data services for sustainability science research. It will integrate existing technologies and tools to address the needs of researchers working on "long tail" sustainability problems. SEAD is in its initial phase of developing prototypes and will not be ready to accept data until after October 2012. It is a collaboration between researchers at the University of Michigan, Indiana University, University of Illinois, and Rensselaer Polytechnic Institute.
Meeting Federal Research Requirements for Data Management Plans, Public Acces... - ICPSR
These slides cover evolving federal research requirements for sharing scientific data. Provided are updates on federal agency responses to the 2013 OSTP memo, guidance on data management plans, resources for data management and curation training for staff/researchers, and tips for evaluating public data-sharing services. ICPSR's public data-sharing service, openICPSR, is also presented. Recording of this presentation is here: https://www.youtube.com/watch?v=2_erMkASSv4&feature=youtu.be
Enhancing Our Capacity for Large Health Dataset Analysis - CTSI at UCSF
Overview of UCSF-CTSI Comparative Effectiveness Large Dataset Analysis Core, which offers resources for the analysis of large, public data sets on health and health care.
A description of software as infrastructure at NSF, and how Apache projects may be similar. What lessons can be shared from one organization to the other? How does science software compare with more general software?
The document provides an overview of the CISER Data Archive at Cornell University and introduces key concepts of research data management (RDM).
The CISER Data Archive is a collection of over 27,000 numeric datasets to support quantitative research in various social science fields. It provides consulting services to help users find, access, and use data. It also maintains the Cornell research data repository.
The document defines research data and outlines the research data lifecycle. It discusses best practices for organizing, documenting, storing, and securing research data. Key aspects of RDM include developing data management plans, using appropriate file formats, and ensuring long-term preservation and sharing of research data.
Stuart Phinn_Many kinds of infrastructure: resolving and advancing ecosystem ... - TERN Australia
This document discusses infrastructure for ecosystem science in Australia. It begins by outlining the multi-disciplinary nature of ecosystem science and challenges in funding infrastructure to support data collection, storage, analysis and sharing across disciplines. It promotes a collaborative approach through the TERN network to establish shared infrastructure and standards. Examples are given of coordinated data collection, processing, storage and analysis projects enabled by TERN. The document argues that infrastructure like TERN improves the efficiency and effectiveness of ecosystem science in Australia.
Green Shoots:Research Data Management Pilot at Imperial College LondonTorsten Reimer
The document summarizes the results of a research data management (RDM) pilot project at Imperial College London. It describes how £100k in funding was provided for six academic projects to develop exemplars of best practices in RDM. The funded projects developed various tools and frameworks to improve data curation, sharing, and citation. Overall, the pilot demonstrated that innovative RDM is possible but also difficult and expensive to develop sustainably. It helped establish an initial RDM community at Imperial.
This document summarizes a presentation about meeting federal data sharing requirements. It discusses the history of these requirements and defines good practices for data sharing and stewardship. It also reviews some public data sharing services and provides tips for evaluating them. Key aspects of good data sharing include maximizing access, protecting privacy, ensuring proper attribution, and having long-term preservation and sustainability plans. The presenter emphasizes that restricted-use or sensitive data can be effectively shared through secure virtual environments.
California Ocean Science Trust " Building a Sustainable Knowledge Base for ...Tom Moritz
"Building a Sustainable Knowledge Base for the Marine Protected Areas Monitoring Enterprise" a presentation to the California Ocean Science Trust, Oakland, California March 16, 2010
Supporting Libraries in Leading the Way in Research Data ManagementMarieke Guy
Marieke Guy, Institutional Support Officer, Digital Curation Centre, UKOLN, University of Bath, UK presents on Supporting Libraries in Leading the Way in Research Data Management at Online Information, London 20th -21st November 2012
The document discusses solutions to overcoming the tragedy of the data commons through shared metadata. It describes how large scientific projects can share data at low cost by starting from overlapping common metadata terms and having their metadata teams work together. Reusing shared metadata leads to increased reusability of data across projects. The document advocates for developing metadata as evolving, linked resources rather than predefined standards, and provides examples of how this approach has helped scientific collaborations and government data sharing initiatives succeed.
Agencies such as the NSF and NIH require data management plans as part of research proposals and the Office of Science and Technology Policy (OSTP) is requiring federal agencies to develop plans to increase public access to results of federally funded scientific research. These slides explore sustainable data sharing models, including models for sharing restricted-use data. Demos of these models and tips for accessing public data access services are provided as well as resources for creating data management plans for grant applications.
1) The University of Edinburgh drafted an 18-month Research Data Management Roadmap in August 2012 to address institutional research data management and comply with their RDM policy.
2) The Roadmap outlines governance, data management planning support, development of an active data infrastructure including a data store, and data stewardship services such as a data repository and registry.
3) Services under the Roadmap include tailored data management plan assistance, customizing an online DMP tool, infrastructure for storing and accessing research data, and a data repository for depositing and long-term management of completed research outputs.
This presentation was provided by Andrew K. Pace of OCLC, during the 13th Annual NISO-BISG forum "Interoperability: From Silos to An Ecosystem," held on June 24, 2020.
If Big Data is data that exceeds the processing capacity of conventional systems, thereby necessitating alternative processing measures, we are looking at an essentially technological challenge that IT managers are best equipped to address.
The DCC is currently working with 18 HEIs to support and develop their capabilities in the management of research data and, whilst the aforementioned challenge is not usually core to their expressed concerns, are there particular issues of curation inherent to Big Data that might force a different perspective?
We have some understanding of Big Data from our contacts in the Astronomy and High Energy Physics domains, and the scale and speed of development in Genomics data generation is well known, but the inability to provide sufficient processing capacity is not one of their more frequent complaints.
That’s not to say that Big Science and its Big Data are free of challenges in data curation; only that they are shared with their lesser cousins, where one might say that the real challenge is less one of size than diversity and complexity.
This brief presentation explores those aspects of data curation that go beyond the challenges of processing power but which may lend a broader perspective to the technology selection process.
Data 2012 -- Presentation by Margaret Hedstrom (Jan 2012)
1. SEAD
Sustainable Environment – Actionable Data
Margaret Hedstrom, PI (Michigan)
Praveen Kumar, co-PI (Illinois)
Jim Myers, co-PI (RPI)
Beth Plale, co-PI (Indiana)
Ann Zimmerman, co-PI (Michigan)
2. SEAD’s Goals
• Provide data services that address the needs of researchers in sustainability science
• Integrate these services into a generalizable “Active and Social Curation” infrastructure suited to data in the “long tail”
• Develop capabilities to package and migrate the most valuable datasets to a federated repository infrastructure for long-term preservation
4. Data challenges
• Small and derived data sets
• Heterogeneous data
• Multiple sources of data
• Short-lived data with long-term value
• Value of data grows when combined & integrated
5. SEAD’s Strategy
• Leverage social media for discovery of data, interest, and expertise
• Move data curation upstream in the data life cycle
• Involve domain scientists in setting priorities for evolution of data and services
• Take advantage of existing infrastructures (Institutional Repositories, ICPSR) for long-term preservation
7. SEAD: Leveraging Existing Resources
• Cyberinfrastructure
– IU Data Capacitor/HPC Capabilities
– UIUC/NCSA HPC Capabilities
– Rensselaer CCNI Capabilities
• Repositories
– UM Deep Blue
– IU ScholarWorks
– ICPSR Repository
– UIUC IDEALS
8. SEAD 18-Month Prototype Targets for Cyberinfrastructure
• Domain Engagement
– Requirements derived from researchers
– Use Cases
• Active and Social Content Curation
– Pilot Active Content Repository, VIVO deployments
– Exemplar services for Data Ingest, Discovery, Re-use, Curation
• CI for Long-term Access
– Data model, protocol design/development
– Pilot Federated Repository infrastructure
9. SEAD TEAM
University of Michigan: Margaret Hedstrom (UM PI), Ann Zimmerman (Co-PI and Project Manager), George Alter, Bryan Beecher, Charles Severance, Karen Woollams, Jude Yew.
Indiana University: Beth Plale (IU PI), Katy Borner, Robert H. McDonald, Kavitha Chandrasekar, Robert Ping, Stacy Kowalczyk, Robert Light.
University of Illinois: Praveen Kumar (UIUC PI), Rob Kooper, Luigi Marini, Terry McLaren, Zaman Aktaruzzaman.
Rensselaer Polytechnic Institute: Jim Myers (RPI PI), Ram Prasanna Govind Krishnan, Lindsay Todd, Adam Wilson.
10. Acknowledgments
SEAD is funded by the National Science Foundation under cooperative agreement #OCI0940824
http://sead-data.net
Editor's Notes
We will build usable and useful tools that scientists can take advantage of as they collect, generate, and organize data in their active projects. This Active Curation approach will be designed with a great deal of user input to make sure that the tools are lightweight, easy to learn, easy to use, and more effective than the painstaking, hand-crafted approach that many sustainability scientists use today. The Active Curation approach will make data management easier for data producers and lower the curation costs to SEAD.
Another part of our strategy is to deploy a variety of social networking and social-media-inspired tools to engage the community of data producers and users. These include tools for annotation, rating, and commentary on data sets; visualizations of publication and citation networks that map the invisible college of sustainability science researchers; and social networking tools that help build network effects.
We have designed our program with multiple mechanisms to encourage participation in SEAD and adoption of its approach. These include holding domain engagement workshops to surface needs and requirements, ensuring usability of tools, and enlisting key leaders in sustainability as early adopters and promoters of SEAD. These strategies, along with support for centralized curation services, education, outreach, and training, will create a model for sustainable access and preservation of heterogeneous data for sustainability science and other small science disciplines in the long tail.