This document summarizes a thesis defense presentation on a conceptual model for a public commons for geospatial data. The presentation introduces the objectives of making it easier for creators to share spatial data by providing mechanisms for metadata creation, attribution and credit recognition, liability protection, and non-monetary benefits. It then describes the conceptual design of the public commons, which would use an open access licensing approach, enhanced metadata searchability, and techniques for embedding identifying information in datasets. The presentation demonstrates features of the public commons and concludes that it could incentivize thousands of individuals to share datasets by addressing impediments to sharing.
Data Sharing and the Polar Information Commons (Kaitlin Thaney)
1. The document discusses challenges with applying traditional copyright licenses to data and proposes establishing norms rather than licenses to govern data sharing.
2. It suggests norms that promote open, accessible, and interoperable data by waiving all rights necessary for data extraction and reuse, while encouraging attribution and quality standards through terms of use rather than legal requirements.
3. The document argues that data should flow freely in an open infrastructure to support new uses and insights, and that data has more value when structured and annotated within such a system than when treated as private property.
The document proposes the creation of a federated cloud computing platform called "The Commons" to support biomedical data sharing and analysis across multiple cloud providers. Key points:
- The Commons would index metadata and digital objects across conformant public and private cloud providers.
- It would be funded by providing credits to investigators for storage and computing, creating competition among providers to offer better services at lower costs.
- A phased implementation is outlined to initially involve experienced users and later expand to all NIH grantees.
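The indexing idea in the points above can be sketched in a few lines. This is a hypothetical illustration, not the actual Commons design: the class, the metadata fields, and the use of a content hash so that copies held by different providers resolve to one digital object are all assumptions made for the example.

```python
import hashlib

# Hypothetical sketch: a federated index that records which conformant
# providers hold a digital object, keyed by a hash of its content.
class CommonsIndex:
    def __init__(self):
        self._index = {}  # object id -> {"metadata": ..., "locations": set()}

    def register(self, provider, metadata, content):
        # Hashing the content means the same object registered by two
        # providers resolves to a single index entry.
        oid = hashlib.sha256(content).hexdigest()
        entry = self._index.setdefault(oid, {"metadata": metadata,
                                             "locations": set()})
        entry["locations"].add(provider)
        return oid

    def find(self, **criteria):
        # Return ids of objects whose metadata matches every criterion.
        return [oid for oid, e in self._index.items()
                if all(e["metadata"].get(k) == v for k, v in criteria.items())]

index = CommonsIndex()
oid_a = index.register("cloud-a", {"organism": "human", "assay": "RNA-seq"},
                       b"reads...")
oid_b = index.register("cloud-b", {"organism": "human", "assay": "RNA-seq"},
                       b"reads...")
hits = index.find(assay="RNA-seq")
```

Because both registrations carry the same content, they collapse to one entry with two locations, which is the property that lets providers compete on storage and compute while the index stays authoritative.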
The document discusses establishing an Open Knowledge Foundation (OKF) chapter in Korea. It provides background on OKF and its goals of promoting open data and building communities. It outlines OKF Korea's proposed activities, including launching an open data portal, organizing events, and translating datasets. The presentation requests confirmation from OKF to move forward and use the OKF Korea name for upcoming events. It aims to improve access to and use of open data in Korea through community building and open source projects.
Embracing Social Software And Semantic Web In Digital Libraries (Akhmad Riza Faizal)
1) The document discusses social software and semantic web technologies in digital libraries based on a literature review. It describes various social software tools and their usage in research libraries.
2) It also discusses recommendations and challenges regarding personalization in digital libraries, including modeling users, balancing personal and community needs, and evaluating social effects.
3) The use of open source systems like WordPress to customize digital library interfaces is presented, along with issues in managing library data and skills.
4) Examples of mobile social software and semantic digital libraries are provided, with definitions and differences between conventional and semantic digital libraries.
http://wiki.knoesis.org/index.php/MaterialWays
http://www.knoesis.org/?q=research/semMat
Abstract
The sharing, discovery, and application of materials science and engineering data and documents are possible only if domain scientists are able and willing to share them. We need to overcome technological challenges, such as the development of convenient computational tools and repositories conducive to easy exchange, curation, attribution, and analysis of data, and cultural challenges, such as proper protection, control, and credit for sharing data. Our thesis and value proposition is that associating machine-processable semantics with materials science and engineering data and documents can provide a solid foundation for overcoming challenges in data discovery, integration, and interoperability caused by data heterogeneity. Specifically, easy-to-use, low-upfront-cost lightweight semantics in the form of file-level annotation can enable document discovery and sharing, while deeper data-level annotation using standardized ontologies can benefit semantic search and summarization. Machine processability achieved through fine-grained semantic annotation, extraction, and translation can enable data integration, interoperability, and reasoning, ultimately leading to Linked Open Materials Science Data. Thus, varying the granularity of semantics provides a continuum along the cost/ease-of-use versus expressiveness trade-off. In this presentation, we also show the application of semantic techniques for content extraction from materials and process specifications, which are semi-structured and table-rich, and the application of semantic web techniques and technologies for materials vocabulary integration and curation (via Semantic MediaWiki), semantic web visualization, efficient representation of provenance metadata and access control (via the singleton property), and biomaterials information extraction.
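The "lightweight file-level annotation" end of the continuum described in the abstract can be illustrated with a minimal sketch. The file names, vocabulary terms, and tag structure below are invented for the example; the point is only that a small set of machine-processable tags per document is already enough to support discovery.

```python
# Hypothetical file-level annotations: each document carries a small
# dictionary of semantic tags (vocabulary terms invented for this sketch).
documents = {
    "spec_4911.pdf": {"material": "Ti-6Al-4V",
                      "process": "heat treatment",
                      "doc_type": "process specification"},
    "alloy_notes.docx": {"material": "Inconel 718",
                         "doc_type": "lab notes"},
}

def discover(annotations, **query):
    """Return the names of documents whose tags match every query term."""
    return sorted(name for name, tags in annotations.items()
                  if all(tags.get(k) == v for k, v in query.items()))

matches = discover(documents, material="Ti-6Al-4V")
```

Deeper data-level annotation would replace the free-form strings with terms from standardized ontologies, trading extra upfront cost for the integration and reasoning capabilities the abstract describes.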
The document discusses spatiotemporal data management challenges faced by organizations like DEFRA and IBM's architectural approach. Specifically, it addresses the need for consolidated, high quality spatiotemporal data access across stakeholders. It also examines technical constraints encountered in large projects and potential next steps for integrated spatiotemporal enterprises.
This proposal outlines the development of a comprehensive information retrieval portal for Canadian scientific researchers. The portal would aggregate content from various sources and use techniques like collaborative filtering and content analysis to provide personalized search and recommendations. It would include features for user profiling, concept discovery, and interactive visualization of results. The proposal discusses forming partnerships with organizations to incorporate additional content and conducting a pilot program to evaluate the portal's usability and ability to improve search satisfaction and reuse.
Are you a researcher, citizen scientist, institution or community looking for data storage and value-added services? Do you want access to tools to make your research data more FAIR (findable, accessible, interoperable, and reusable)? Interested in seeing how the future European Open Science Cloud could support research data and practically foster cross-border, cross-disciplinary collaboration? Then this webinar is for you!
The presentation gives an overview of what metadata is and why it is important. It also addresses the benefits that metadata can bring and offers advice and tips on how to produce good quality metadata and, to close, how EUDAT uses metadata in the B2FIND service.
November 2016
This document discusses metadata, including what it is, why it is important, common components of metadata records, examples of metadata standards, and tips for writing good metadata. Metadata captures key details about data, such as who created it, when, how, and why, to facilitate discovering, understanding, and reusing the data. Standards provide consistency for computer interpretation and searching. Good metadata includes specific, accurate, and complete details to fully document data.
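The who/when/how/why components described above can be made concrete with a minimal sketch of a metadata record. The field names loosely echo Dublin Core but are illustrative only, not a formal binding to any standard, and the completeness check is a deliberately crude stand-in for real metadata quality review.

```python
from dataclasses import dataclass, field

# Sketch of a minimal metadata record capturing who created the data,
# when, how, and why. Field names are illustrative, not a standard.
@dataclass
class MetadataRecord:
    title: str
    creator: str
    date: str                 # ISO 8601 keeps records machine-sortable
    description: str          # how and why the data were produced
    keywords: list = field(default_factory=list)

    def is_complete(self):
        # "Good metadata includes specific, accurate, and complete
        # details": a crude completeness check over the required fields.
        return all([self.title, self.creator, self.date, self.description])

rec = MetadataRecord(
    title="River temperature survey",
    creator="J. Smith",
    date="2016-11-03",
    description="Hourly logger readings, calibrated against manual spot checks.",
    keywords=["hydrology", "temperature"],
)
```

A record like this is what a standard then makes consistent across producers, so that catalog software can interpret and search every record the same way.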
The document discusses using a mediator-based architecture and web services to provide uniform access to distributed heterogeneous air quality data sources. Key points include:
- A mediator server can homogenize data coding/formatting from various sources, allowing users to access data through a simple universal interface while minimizing changes to data providers.
- Services like Dvoy use mediators and wrappers to resolve technical and logical heterogeneity across sources and provide multidimensional querying of spatial-temporal data cubes.
- This approach facilitates data sharing and integration for improved analysis to address challenges from secondary pollutants and more participatory management of air quality.
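The mediator/wrapper pattern in the points above can be sketched as follows. The provider formats, field names, and the `Mediator` class are all invented for illustration; they are not the Dvoy implementation, only the general shape of the architecture: one wrapper per source translates native records into a common shape, and the mediator exposes a single query interface over all of them.

```python
# Wrapper 1: a hypothetical station feed that ships semicolon-separated rows.
def wrap_station_csv(row):
    # e.g. "PM2.5;12.4;2016-11-01T10:00" -> common record
    pollutant, value, ts = row.split(";")
    return {"pollutant": pollutant, "value": float(value), "time": ts}

# Wrapper 2: a hypothetical agency feed that ships JSON-like dicts.
def wrap_agency_json(obj):
    return {"pollutant": obj["species"], "value": obj["conc_ugm3"],
            "time": obj["t"]}

class Mediator:
    """Single query interface over heterogeneous, wrapped sources."""
    def __init__(self, sources):
        self.sources = sources  # list of (wrapper, raw_records)

    def query(self, pollutant):
        out = []
        for wrapper, records in self.sources:
            out.extend(r for r in map(wrapper, records)
                       if r["pollutant"] == pollutant)
        return out

mediator = Mediator([
    (wrap_station_csv, ["PM2.5;12.4;2016-11-01T10:00",
                        "O3;61.0;2016-11-01T10:00"]),
    (wrap_agency_json, [{"species": "PM2.5", "conc_ugm3": 9.8,
                         "t": "2016-11-01T10:00"}]),
])
pm25 = mediator.query("PM2.5")
```

The key property is that adding a new provider means writing one wrapper, with no change to the user-facing interface and minimal change for existing providers.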
Merritt’s micro-services-based architecture provides a number of options for easy integration with diverse external discovery services that have a specific disciplinary focus on scientific data sharing. By removing many of the barriers faced by researchers interested in data publication, the integrations of Merritt with DataShare and Research Hub exemplify a new service model for cooperative and distributed data sharing. The widespread adoption of such sharing is critical to open scientific inquiry and advancement.
2005-02-14 C2ISR COI Brief (A. K. Maitra / Amit Maitra)
The document discusses the Air Force's strategy to move from privately stored data to an enterprise information environment where authorized users can access and share data. It outlines barriers to identifying, accessing, and understanding data across different organizations. The strategy involves using communities of interest, metadata registries, and web services to make data visible, accessible, and understandable across the Department of Defense.
Presentation given by Alex Ball of DCC/UKOLN, University of Bath, at the Digital Curation 101 Lite workshop held during the DCC Roadshow, 1-3 March 2011.
Master URL: http://opus.bath.ac.uk/22983
Abstract: This talk is intended to help workshop participants decide how to apply a licence to their research data, and which licence would be most suitable. It covers why licensing data is important, the impact licences have on future research, and the potential pitfalls to avoid.
How Data Commons are Changing the Way that Large Datasets Are Analyzed and Sh... (Robert Grossman)
Data commons are emerging as a solution to challenges in analyzing and sharing large biomedical datasets. A data commons co-locates data with cloud computing infrastructure and software tools to create an interoperable resource for the research community. Examples include the NCI Genomic Data Commons and the Open Commons Consortium. The open source Gen3 platform supports building disease- or project-specific data commons to facilitate open data sharing while protecting patient privacy. Developing interoperable data commons can accelerate research through increased access to data.
The document discusses the SHARE Notification Service, which aims to maximize research impact by keeping interested parties informed of research activities and outputs. It provides an overview of SHARE's mission and funding, describes the notification service which alerts subscribers about new research releases, and discusses early lessons learned. It also outlines plans for a Phase II that would provide a more comprehensive linked dataset to improve discovery and assessment of research impacts.
A Social Content Delivery Network for Scientific Cooperation: Vision, Design... (Simon Caton)
Data volumes have increased so significantly that we need to carefully consider how we interact with, share, and analyze data to avoid bottlenecks. In contexts such as eScience and scientific computing, a large emphasis is placed on collaboration, resulting in many well-known challenges in ensuring that data is in the right place at the right time and accessible by the right users. Yet these simple requirements create substantial challenges for the distribution, analysis, storage, and replication of potentially "large" datasets. Additional complexity is added through constraints such as budget, data locality, usage, and available local storage. In this paper, we propose a "socially driven" approach to address some of the challenges within (academic) research contexts by defining a Social Data Cloud and underpinning Content Delivery Network: a Social CDN (S-CDN). Our approach leverages digitally encoded social constructs via social network platforms that we use to represent (virtual) research communities. Ultimately, the S-CDN builds upon the intrinsic incentives of members of a given scientific community to address their data challenges collaboratively and in proven trusted settings. We define the design and architecture of a S-CDN and investigate its feasibility via a coauthorship case study as first steps to illustrate its usefulness.
NIH Data Initiatives: Harnessing Big (and small) Data to Improve Health
Presentation at the internet2 Global Forum, April 28, 2015
Session NIH Perspectives
Doing for data what PubMed did for literature: DATS, a model for dataset description, dataset indexing, and data discovery.
Google Slides [https://goo.gl/cd5KKa] or SlideShare [https://goo.gl/c8DH5N]
This presentation provides a top-level introduction to semantics and Web 3.0. It discusses key concepts like semantic architectures, knowledge representations, and semantic applications. Semantic technologies add meaning to data so machines can better support users by doing more of the work. While early adoption was in enterprises, semantic applications are now emerging on the public web as part of the vision of Web 3.0 as a read-write-execute web.
The document proposes developing the Global Biodiversity Resources Discovery System (GBRDS) to address the challenge of discovering distributed biodiversity data and information resources. The GBRDS would consist of a registry to inventory publishers, institutions, datasets and services, and discovery services to search these resources. It would provide a "map" of all biodiversity information to enable discovery. The GBRDS is envisioned as the core of next generation biodiversity informatics infrastructure and aims to become a unified global entry point for discovering biodiversity resources by December 2010.
1) The document discusses Linked Data as a service and the Information Workbench platform for providing data as a service.
2) The Information Workbench enables semantic integration and federation of private and public data sources through a virtualization layer and provides self-service data discovery, exploration and analytics tools.
3) It describes a cloud-based architecture where the Information Workbench is deployed as a semantic data integration and analytics platform as a service (PaaS).
Everything Self-Service: Linked Data Applications with the Information Workbench (Peter Haase)
The document discusses an information workbench platform that enables self-service linked data applications. It addresses challenges in building linked data applications like data integration and quality. The platform allows for discovery and integration of internal and external data sources. It provides intelligent data access, analytics, and collaboration tools through a semantic wiki interface with customizable widgets. Example application areas discussed are knowledge management, digital libraries, and intelligent data center management.
Semantic Text Processing Powered by Wikipedia (Maxim Grinev)
The document discusses using Wikipedia as a resource for semantic text processing and natural language processing techniques. It describes using Wikipedia's comprehensive coverage of terms, rich structure of links and categories, and ability to be continuously updated to power text analysis algorithms. These include word sense disambiguation, keyword extraction, topic inference, ontology management, semantic search, and improved recommendations. The techniques analyze Wikipedia's link structure and build semantic graphs of documents to discover related concepts and group keywords.
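The link-structure analysis described above can be illustrated with a toy sketch: concepts that share many incoming Wikipedia links are treated as related, which is the intuition behind published measures such as Milne and Witten's. The tiny link table below is fabricated for the example; a real system would use the full Wikipedia link graph.

```python
# Fabricated incoming-link sets for three Wikipedia-style concepts.
inlinks = {
    "Jaguar (animal)": {"Big cat", "Panthera", "South America"},
    "Leopard":         {"Big cat", "Panthera", "Africa"},
    "Jaguar Cars":     {"Automotive industry", "Coventry"},
}

def relatedness(a, b, links):
    """Jaccard overlap of incoming links as a crude relatedness score."""
    la, lb = links[a], links[b]
    return len(la & lb) / len(la | lb)

animal_vs_leopard = relatedness("Jaguar (animal)", "Leopard", inlinks)
animal_vs_car = relatedness("Jaguar (animal)", "Jaguar Cars", inlinks)
```

Scores like these are what let a disambiguation algorithm prefer the sense of "jaguar" that is most related to the other concepts already found in a document.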
Dataset description: DCAT and other vocabularies (Valeria Pesce)
This document discusses metadata needed to describe datasets for applications to find and understand them when stored in data catalogs or repositories. It examines existing dataset description vocabularies like DCAT and their limitations in fully capturing necessary metadata.
Key points made:
- Machine-readable metadata is important for datasets to be discoverable and usable by applications when stored across repositories.
- Metadata should describe the dataset, distributions, dimensions, semantics, protocols/APIs, subsets etc.
- Vocabularies like DCAT provide some metadata but don't fully cover dimensions, semantics, protocols/APIs or subsets.
- No single vocabulary or data catalog solution currently provides all necessary metadata for full semantic interoperability.
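What DCAT does cover can be shown with a small sketch. The `dcat:` and `dct:` term names below are real W3C/Dublin Core vocabulary URIs, but the dataset itself is a made-up example, and the plain-triple representation is a stand-in for a proper RDF serialization.

```python
# Real vocabulary namespaces; the dataset described is fictional.
DCAT = "http://www.w3.org/ns/dcat#"
DCT = "http://purl.org/dc/terms/"

# A DCAT-style description as subject-predicate-object triples: a dataset
# with one distribution carrying its media type and download location.
triples = [
    ("ex:rainfall2016", DCT + "title", "Monthly rainfall 2016"),
    ("ex:rainfall2016", DCAT + "keyword", "precipitation"),
    ("ex:rainfall2016", DCAT + "distribution", "ex:rainfall2016-csv"),
    ("ex:rainfall2016-csv", DCAT + "mediaType", "text/csv"),
    ("ex:rainfall2016-csv", DCAT + "downloadURL", "http://example.org/rainfall.csv"),
]

def objects(triples, subject, predicate):
    """All object values for a given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]

dists = objects(triples, "ex:rainfall2016", DCAT + "distribution")
media = objects(triples, dists[0], DCAT + "mediaType")
```

Note what is absent: nothing here describes the dataset's dimensions, the semantics of its columns, or the query API behind the download URL, which is exactly the gap the document identifies in DCAT and its peers.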
#Walking and #trekking self guided tour from 1 person all, year at Costa da M... (Enrique Pérez Sampedro)
#Walking and #trekking self-guided tour, from 1 person, all year, at Costa da Morte, #Galicia: the Lighthouse Way, 200 km along the coast from Malpica to Finisterre.
The presentation gives an overview of what metadata is and why it is important. It also addresses the benefits that metadata can bring and offers advice and tips on how to produce good quality metadata and, to close, how EUDAT uses metadata in the B2FIND service.
November 2016
This document discusses metadata, including what it is, why it is important, common components of metadata records, examples of metadata standards, and tips for writing good metadata. Metadata captures key details about data, such as who created it, when, how, and why, to facilitate discovering, understanding, and reusing the data. Standards provide consistency for computer interpretation and searching. Good metadata includes specific, accurate, and complete details to fully document data.
The document discusses using a mediator-based architecture and web services to provide uniform access to distributed heterogeneous air quality data sources. Key points include:
- A mediator server can homogenize data coding/formatting from various sources, allowing users to access data through a simple universal interface while minimizing changes to data providers.
- Services like Dvoy use mediators and wrappers to resolve technical and logical heterogeneity across sources and provide multidimensional querying of spatial-temporal data cubes.
- This approach facilitates data sharing and integration for improved analysis to address challenges from secondary pollutants and more participatory management of air quality.
Merritt’s micro-services-based architecture provides a number of options for easy integration with diverse external discovery services with specific disciplinary focus on scientific data sharing. By removing many of the barriers faced by researchers interested in data publication, the integrations of Merritt with DataShare and Research Hub exemplify a new service model for cooperative and distributed data sharing. The widespread adoption of such sharing is critical to open scientific inquiry and advancement.
2005 02 14 C2 I S R C O I Brief A K MaitraAmit Maitra
The document discusses the Air Force's strategy to move from privately stored data to an enterprise information environment where authorized users can access and share data. It outlines barriers to identifying, accessing, and understanding data across different organizations. The strategy involves using communities of interest, metadata registries, and web services to make data visible, accessible, and understandable across the Department of Defense.
Presentation given by Alex Ball of DCC/UKOLN, University of Bath, at the Digital Curation 101 Lite workshop held during the DCC Roadshow, 1-3 March 2011.
Master URL: http://opus.bath.ac.uk/22983
Abstract: This talk is intended to help workshop participants decide how to apply a licence to their research data, and which licence would be most suitable. It covers why licensing data is important, the impact licences have on future research, and the potential pitfalls to avoid.
How Data Commons are Changing the Way that Large Datasets Are Analyzed and Sh...Robert Grossman
Data commons are emerging as a solution to challenges in analyzing and sharing large biomedical datasets. A data commons co-locates data with cloud computing infrastructure and software tools to create an interoperable resource for the research community. Examples include the NCI Genomic Data Commons and the Open Commons Consortium. The open source Gen3 platform supports building disease- or project-specific data commons to facilitate open data sharing while protecting patient privacy. Developing interoperable data commons can accelerate research through increased access to data.
The document discusses the SHARE Notification Service, which aims to maximize research impact by keeping interested parties informed of research activities and outputs. It provides an overview of SHARE's mission and funding, describes the notification service which alerts subscribers about new research releases, and discusses early lessons learned. It also outlines plans for a Phase II that would provide a more comprehensive linked dataset to improve discovery and assessment of research impacts.
A Social Content Delivery Network for Scientific Cooperation: Vision, Design...Simon Caton
Data volumes have increased so significantly that we need to carefully consider how we interact with, share, and analyze data to avoid bottlenecks. In contexts such as eScience and scientific computing, a large emphasis is placed on collaboration, resulting in many well-known challenges in ensuring that data is in the right place at the right time and accessible by the right users. Yet these simple requirements create substantial challenges for the distribution, analysis, storage, and replication of potentially "large" datasets. Additional complexity is added through constraints such as budget, data locality, usage, and available local storage. In this paper, we propose a "socially driven" approach to address some of the challenges within (academic) research contexts by defining a Social Data Cloud and underpinning Content Delivery Network: a Social CDN (S-CDN). Our approach leverages digitally encoded social constructs via social network platforms that we use to represent (virtual) research communities. Ultimately, the S-CDN builds upon the intrinsic incentives of members of a given scientific community to address their data challenges collaboratively and in proven trusted settings. We define the design and architecture of a S-CDN and investigate its feasibility via a coauthorship case study as first steps to illustrate its usefulness.
NIH Data Initiatives: Harnessing Big (and small) Data to Improve Health
Presentation at the internet2 Global Forum, April 28, 2015
Session NIH Perspectives
Doing for Data what Pubmed did for literature: DATS a model for dataset description datasets indexing and data discovery.
Googleslides [https://goo.gl/cd5KKa] or Slideshare [https://goo.gl/c8DH5N]
This presentation provides a top-level introduction to semantics and Web 3.0. It discusses key concepts like semantic architectures, knowledge representations, and semantic applications. Semantic technologies add meaning to data so machines can better support users by doing more of the work. While early adoption was in enterprises, semantic applications are now emerging on the public web as part of the vision of Web 3.0 as a read-write-execute web.
The document proposes developing the Global Biodiversity Resources Discovery System (GBRDS) to address the challenge of discovering distributed biodiversity data and information resources. The GBRDS would consist of a registry to inventory publishers, institutions, datasets and services, and discovery services to search these resources. It would provide a "map" of all biodiversity information to enable discovery. The GBRDS is envisioned as the core of next generation biodiversity informatics infrastructure and aims to become a unified global entry point for discovering biodiversity resources by December 2010.
1) The document discusses Linked Data as a service and the Information Workbench platform for providing data as a service.
2) The Information Workbench enables semantic integration and federation of private and public data sources through a virtualization layer and provides self-service data discovery, exploration and analytics tools.
3) It describes a cloud-based architecture where the Information Workbench is deployed as a semantic data integration and analytics platform as a service (PaaS).
Everything Self-Service:Linked Data Applications with the Information WorkbenchPeter Haase
The document discusses an information workbench platform that enables self-service linked data applications. It addresses challenges in building linked data applications like data integration and quality. The platform allows for discovery and integration of internal and external data sources. It provides intelligent data access, analytics, and collaboration tools through a semantic wiki interface with customizable widgets. Example application areas discussed are knowledge management, digital libraries, and intelligent data center management.
Semantic Text Processing Powered by WikipediaMaxim Grinev
The document discusses using Wikipedia as a resource for semantic text processing and natural language processing techniques. It describes using Wikipedia's comprehensive coverage of terms, rich structure of links and categories, and ability to be continuously updated to power text analysis algorithms. These include word sense disambiguation, keyword extraction, topic inference, ontology management, semantic search, and improved recommendations. The techniques analyze Wikipedia's link structure and build semantic graphs of documents to discover related concepts and group keywords.
Dataset description: DCAT and other vocabulariesValeria Pesce
This document discusses metadata needed to describe datasets for applications to find and understand them when stored in data catalogs or repositories. It examines existing dataset description vocabularies like DCAT and their limitations in fully capturing necessary metadata.
Key points made:
- Machine-readable metadata is important for datasets to be discoverable and usable by applications when stored across repositories.
- Metadata should describe the dataset, distributions, dimensions, semantics, protocols/APIs, subsets etc.
- Vocabularies like DCAT provide some metadata but don't fully cover dimensions, semantics, protocols/APIs or subsets.
- No single vocabulary or data catalog solution currently provides all necessary metadata for full semantic interoperability.
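For concreteness, a minimal dataset description using core DCAT and Dublin Core terms can be written as JSON-LD. This is an illustrative sketch only; the dataset, its values, and the download URL are invented.

```python
import json

# Minimal DCAT dataset description expressed as JSON-LD.
# Only core DCAT / Dublin Core terms are used; the content is invented.
dataset = {
    "@context": {"dcat": "http://www.w3.org/ns/dcat#",
                 "dct": "http://purl.org/dc/terms/"},
    "@type": "dcat:Dataset",
    "dct:title": "Maine hydrography",
    "dct:description": "Statewide water features",
    "dcat:keyword": ["water", "Maine"],
    "dcat:distribution": {
        "@type": "dcat:Distribution",
        "dcat:downloadURL": "https://example.org/maine-water.zip",
        "dct:format": "application/zip",
    },
}

print(json.dumps(dataset, indent=2))
```

Note that even this small record covers only discovery-level metadata; as the points above say, dimensions, semantics, and protocols/APIs need additional vocabularies.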
#Walking and #trekking self guided tour from 1 person all, year at Costa da M... (Enrique Pérez Sampedro)
#Walking and #trekking self-guided tour, from 1 person, all year, at Costa da Morte, #Galicia: the 200 km Lighthouse Way from Malpica to Finisterre along the coast
Classroom pedagogical project with ICT in the arts (Ana Reyes)
This six-week pedagogical project aims to develop artistic skills in 2nd-grade students through the use of the Paint program. It explains how ICT can strengthen creativity and artistic expression. The project includes objectives, competencies, activities such as using Paint and exhibiting student work, and a qualitative evaluation of student learning and participation.
Esri Scotland Conf 2016 Norfolk County Council (Esri UK)
Norfolk County Council was spending £10 million annually transporting vulnerable children to school via an inefficient manual process. They developed an interactive GIS application to categorize children into transport scenarios and identify cost savings opportunities. This helped reduce costs by £500,000 while improving pupil wellbeing. The successful application could also help with transporting adults and with school planning.
General Motors and Chrysler must present restructuring plans to the US government by the end of the day to show how they will repay loans and become viable, or face potential bankruptcy. GM is making progress in concession talks but will likely close plants, cut workforce, and end some production. Workers in Canada are worried about the impact on their jobs as the Ontario economy is already suffering. Union leaders say people are losing faith in the auto industry's future as bad news keeps coming.
Eastern Panhandle GIS Users Group Meeting held on 14 September 2016 in Martinsburg, WV. Presenters Kathryn Wesson & Margaret Markham, Chesapeake Conservancy
Esri Scotland Conf 2016 Forestry Commission (Esri UK)
This document outlines the challenges faced by the Forestry Commission in facilitating collaboration among staff. It describes the evolution of their platforms from individual PCs in district offices in the 1990s to a modern "Forester" platform that allows all staff access to content from any device. The vision is for Forester to be a centralized hub containing all spatial data, enabling easy search, sharing and updating of content to support collaboration. It provides configurable collaboration tools to streamline the process and aims to make data accessible everywhere by everyone on the Forestry Commission team.
Web AppBuilder for ArcGIS is a tool for building configurable web apps without coding. It has over 40 widgets and allows creating 2D and 3D web maps. Web AppBuilder can be used within ArcGIS Online or Portal or with the separate developer edition. Examples of Web AppBuilder apps are publicly shared and filterable online. A demo was given of a Tour o' the Borders Cycle Challenge app built with Web AppBuilder to provide information to different user groups. Resources for learning more about Web AppBuilder include documentation, video tutorials, forums and training courses.
Presented at the 2016 Eastern Panhandle GIS Users Group Meeting held on September 14 in Martinsburg, WV. Contributors Kurt Donaldson, Todd Fagan, & Aaron Cox.
The document discusses the need for an NIH Data Commons to address challenges with data sharing and storage. It describes how factors like increasing data volumes, availability of cloud technologies, and emphasis on FAIR data principles are driving the need for a centralized data platform. The proposed NIH Data Commons would provide findable, accessible, interoperable and reusable data through cloud-based services and tools. It would enable data-driven science by facilitating discovery, access and analysis of biomedical data across different sources. Plans are outlined to develop and test an initial Data Commons pilot using existing genomic and other biomedical datasets.
A Framework for Geospatial Web Services for Public Health by Dr. Leslie Lenert (Wansoo Im)
A Framework for Geospatial Web Services for Public Health
by Leslie Lenert, MD, MS, FACMI, Director
National Center for Public Health Informatics, CCHIS, CDC
June 8, 2009, URISA Public Health Conference
uploaded by Wansoo Im, Ph.D.
URISA Membership Committee Chair
http://www.gisinpublichealth.org
Users, Applications and the Community of Practice for the Air Quality Scenario (Rudolf Husar)
The document discusses the GEOSS (Global Earth Observation System of Systems) architecture for the air quality community. It proposes an architecture where air quality services could register with the GEOSS registry and be discovered and invoked by users. This would allow data analysts to compose and visualize air quality data workflows to inform decision makers. It also discusses establishing an air quality community of practice to facilitate collaboration.
2008-05-05 GEOSS UIC-ADC AQ Scenario Workshop, Toronto (Rudolf Husar)
The document discusses the GEOSS (Global Earth Observation System of Systems) architecture for the air quality community. It proposes an architecture where air quality services register with the GEOSS registry and are discoverable through the GEOSS clearinghouse. This would allow users to find, select, and link to relevant air quality services. The architecture envisions community air quality catalogs that aggregate catalog listings and allow users to access data and models through composed workflows.
Presentation given at Supercomputing 2007 on the progress of data sharing models, specifically highlighting the collision of data grid / data service and Web 2.0 worlds.
2005-03-17 Air Quality Cluster TechTrack (Rudolf Husar)
The document discusses a federated information system called Dvoy that aims to integrate heterogeneous air quality data from different sources and provide uniform access. It does this through the use of wrappers that encapsulate data sources and mediators implemented as web services that resolve logical heterogeneity and allow for standardized querying of multidimensional data cubes. The system uses mediators and wrappers based on previous research to overcome issues of data access, translation and merging across different source schemas and formats.
The document provides an introduction to big data, including definitions and characteristics. It discusses how big data can be described by its volume, variety, and velocity. It notes that big data is large and complex data that is difficult to process using traditional data management tools. Common sources of big data include social media, sensors, and scientific instruments. Challenges in big data include capturing, storing, analyzing, and visualizing large and diverse datasets that are generated quickly. Distributed file systems and technologies like Hadoop are well-suited for processing big data.
1. The document discusses developing an "air emissions cyberinfrastructure" using web services to provide access to distributed emissions inventory data and analysis tools through standardized interfaces.
2. This would allow emissions data to remain controlled by their owners while still being accessible over the web. Users could find, access, and analyze data through a single portal without needing specialized software.
3. The system is being built following principles of distributed, non-intrusive, transparent, and interoperable designs in order to allow new datasets and tools to be easily incorporated.
Debbie Wilson: Deliver More Efficient, Joined-Up Services through Improved Ma... (AGI Geocommunity)
Improved data management and sharing through the use of harmonized data specifications and open standards can enable organizations to deliver services more efficiently with reduced costs. Specifications like INSPIRE define common modeling approaches for environmental data that allow data to be joined from different sources. Case studies show how the Met Office and Land Registry leveraged such standards to build new data services quickly and transform legacy systems. Adopting modular, model-driven approaches facilitates the rapid development and deployment of applications to meet new business and user needs.
FAIRy stories: the FAIR Data principles in theory and in practice (Carole Goble)
https://ucsb.zoom.us/meeting/register/tZYod-ippz4pHtaJ0d3ERPIFy2QIvKqjwpXR
FAIRy stories: the FAIR Data principles in theory and in practice
The ‘FAIR Guiding Principles for scientific data management and stewardship’ [1] launched a global dialogue within research and policy communities and started a journey to wider accessibility and reusability of data and preparedness for automation-readiness (I am one of the army of authors). Over the past 5 years FAIR has become a movement, a mantra and a methodology for scientific research and increasingly in the commercial and public sector. FAIR is now part of NIH, European Commission and OECD policy. But just figuring out what the FAIR principles really mean and how we implement them has proved more challenging than one might have guessed. To quote the novelist Rick Riordan “Fairness does not mean everyone gets the same. Fairness means everyone gets what they need”.
As a data infrastructure wrangler I lead and participate in projects implementing forms of FAIR in pan-national European biomedical Research Infrastructures. We apply web-based, industry-led approaches like Schema.org; work with big pharma on specialised FAIRification pipelines for legacy data; promote FAIR by Design methodologies and platforms in the research lab; and expand the principles of FAIR beyond data to computational workflows and digital objects. Many use Linked Data approaches.
In this talk I’ll use some of these projects to shine some light on the FAIR movement. Spoiler alert: although there are technical issues, the greatest challenges are social. FAIR is a team sport. Knowledge Graphs play a role – not just as consumers of FAIR data but as active contributors. To paraphrase another novelist, “It is a truth universally acknowledged that a Knowledge Graph must be in want of FAIR data.”
[1] Wilkinson, M., Dumontier, M., Aalbersberg, I. et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data 3, 160018 (2016). https://doi.org/10.1038/sdata.2016.18
The NIH Data Commons - BD2K All Hands Meeting 2015 (Vivien Bonazzi)
Presentation given at the BD2K All Hands meeting in Bethesda, MD, USA in November 2015
https://datascience.nih.gov/bd2k/events/NOV2015-AllHands
Video cast of this presentation:
http://videocast.nih.gov/summary.asp?Live=17480&bhcp=1
talk starts at 2hrs 40min (it's about 55 mins long) - includes video!
Document describing the Commons : https://datascience.nih.gov/commons
Data Mesh is a decentralized architecture in which the unit of architecture is a domain-driven dataset treated as a product. Each data product is owned by the domain or team that knows the data most intimately, whether it creates the data or consumes and re-shares it, with specific roles given the accountability and responsibility to provide that data as a product. Complexity is abstracted away into a self-serve infrastructure layer so that teams can create these products much more easily.
T.E.R:R.A.I.N. is a system that allows community groups to digitize their resources and store spatial data in an online repository. It was created to help community groups: 1) map local data that is unique to the community and not available elsewhere, and 2) enter data online and link to other applications. A pilot project involved developing features for an open-source digital repository software used to map plants in Pukekura Park with data entered by students and the public. The system is designed to be free, open source software that is customizable, easy to install and use, with support from a large developer community.
Data Commons - Bonazzi - BD2K Fundamentals of Science, Feb 2017 (Vivien Bonazzi)
Vivien Bonazzi leads the Data Commons efforts within NIH. She discussed how big data is characterized by volume, velocity, variety and veracity. She explained that data is becoming the central currency of a new digital economy and organizations must leverage their digital assets through platforms like the Data Commons to transform into digital enterprises. The Data Commons platform fosters development of a digital ecosystem by enabling interactions between producers and consumers of FAIR digital objects like data, software and publications.
Standard Safeguarding Dataset - overview for CSCDUG.pptx (RocioMendez59)
13 July, 2023 - CSCDUG Online Event
Presenting the Sector-led Standard Safeguarding Dataset
Colleagues from Data to Insight, the LA-led service for children’s safeguarding data professionals, are delivering a DfE-funded project in partnership with LAs to define a new “standard safeguarding dataset” which all LAs will be able to produce from their safeguarding information systems.
At this session, they shared what they’ve learned so far from user research with LA colleagues and discussed their early thinking about what a better standard dataset might look like. Participants shared their own thoughts about how to improve these systems and processes.
Presenters
Alistair Herbert
Alistair is the lead officer for Data to Insight, the LA-led service for children’s safeguarding data professionals. With a career focused on local authority children’s services data work, he knows about safeguarding data, information systems, and cross-organisation collaboration.
John Foster
John is a Data Manager for Data to Insight. He has supported a range of children’s services data work, most recently at Shropshire Council. He led Data to Insight’s project to introduce the first national benchmarking dataset for Early Help, and is the user research lead for Data to Insight’s Standard Safeguarding Dataset project.
Rob Harrison and Joe Cornford-Hutchings
Rob and Joe are new Data Managers joining Data to Insight from the private and public sector respectively. They bring between them a wealth of experience and technical expertise, and will be working together to support design and implementation of the new Standard Safeguarding Dataset through 2023-24.
This document provides an overview and introduction to creating and managing digital collections, including:
- Defining digital libraries and their components
- The importance of selection criteria, intellectual property rights, and other legal considerations for digitization
- Cost factors and examples for digitizing different types of materials
- Standards for image processing, file formats, and quality control
- The role and types of metadata, content standards, and ensuring interoperability
- Database software options and technical considerations for storage, access, and user interfaces
Smith RDAP11 NSF Data Management Plan Case Studies (ASIS&T)
MacKenzie Smith, MIT; NSF Data Management Plan Case Studies; RDAP11 Summit
The 2nd Research Data Access and Preservation (RDAP) Summit
An ASIS&T Summit
March 31-April 1, 2011 Denver, CO
In cooperation with the Coalition for Networked Information
http://asist.org/Conferences/RDAP11/index.html
Wide access to spatial Citizen Science data - ECSA Berlin 2016 (COBWEB Project)
Authors: Paul van Genuchten, Lieke Verhelst, Clemens Portele
Presented at the European Citizen Science Association conference Berlin, May 2016.
One of the objectives of COBWEB is to publish citizen science data to GEOSS, the Global Earth Observation System of Systems. GEOSS has a focus on spatial standards (CSW, SensorWeb, WMS/WFS). However, a major part of the citizen science community is not aware of these standards, and typical users rely on search engines to discover data and on common formats to analyse it. So how do we bridge the gap between services in GEOSS and search engines?
The document discusses spatial data quality and neogeography. It notes that the world of spatial data is exploding in popularity due to increased accessibility and availability of tools for collecting, analyzing, and presenting spatial data. However, this growth could influence the quality of spatial data. It focuses on the role of neogeographers as data collectors and the implications for data quality. Neogeographers contribute user-generated content, which provides benefits like many eyes finding errors, but also has criticisms like a lack of quality control.
The document summarizes a presentation by Mark A. Parsons on opportunities and challenges for data sharing and citation. The presentation discusses how all of society's grand challenges require diverse data shared across boundaries, and the vision of the Research Data Alliance (RDA) to openly share data. RDA builds social and technical bridges to enable open data sharing through developing infrastructure, standards, and best practices. The presentation also covers specific RDA activities like developing data citation recommendations and engaging members globally.
1. Public Commons for
Geospatial Data:
A Conceptual Model
A Thesis Defense by
Narnindi Sharad
Advisory Committee:
Dr. Harlan J. Onsrud
Dr. Kate Beard
Dr. Anthony Stefanidis
2. Overview
Introduction
Objectives of Public Commons Model
Conceptual Design of Public Commons
Operational Aspects of Public Commons
Demo
Conclusions
Future Work
3. Introduction
National-level spatial data collection efforts are in
many cases similar or duplicative.
FGDC, NSDI, Geospatial One-Stop – facilitate availability
of and access to spatial data for all levels of
government and the private and public sectors.
Key Premise – National governments are unable to
gather and maintain geographic data.
5. Introduction
Common Wisdom – Intellectual Property laws and the markets
they protect create the environment for producing and sharing.
Profit Motivations
Credit and recognition
As individuals, most of our conduct in daily life is not driven by
profit motives.
What are the impediments to widespread data-sharing?
6. Objectives
Many creators have indicated they would be more than
willing to share their spatial data sets with SDIs or
geo-libraries, if:
it was easier to do,
Efficient Search and Data access mechanisms
Interactive Web Interfaces
Minimized Metadata Transcripts
Upload Mechanisms
7. Objectives
Many creators have indicated they would be more than
willing to share their spatial data sets with SDIs or
geo-libraries, if:
they can retain credit and recognition for their
contributions,
Visible Credit in their works and also derivatives
Linking Author information to the datasets
Multiple contributions – maintaining hierarchy of contributors
8. Objectives
Many creators have indicated they would be more than
willing to share their spatial data sets with SDIs or
geo-libraries, if:
they get increased liability protection from use of the data
they make available to the public, and
Open Access Licensing
Display Liability information and Disclaimer upfront
9. Objectives
Many creators have indicated they would be more than
willing to share their spatial data sets with SDIs or
geo-libraries, if:
they could obtain other non-monetary benefits.
Permanent Archival services
Tagging and Identification services
Increased search and retrieval capabilities
Increased visibility for contributions
10. Objectives
Many creators have indicated they would be more than
willing to share their spatial data sets with SDIs or
geo-libraries, if:
it was easier to do,
they can retain credit and recognition for their
contributions,
they get increased liability protection from use of the data
they make available to the public, and
they could obtain other non-monetary benefits.
11. Conceptual Design of Public Commons
A combined technological and legal model for at least
partially accommodating these impediments.
Enable and entice Non-Expert GIS user contributions
(University researchers, students, professionals in other fields).
Archiving services – indexing, access and search mechanisms.
One-Stop approach (upload and download of datasets at single
location)
Develop technical methods which can support previously
discussed objectives.
12. Conceptual Design of Public Commons – Workflow
A data producer creates a GIS dataset.
The producer enters metadata for the dataset at the
Public Commons website and submits the GIS data to SFIPCA.
SFIPCA uses steganography to embed an identifying number
and generates machine-readable open access licenses.
A centralized GIS data server stores the data, distributed
by bounding coordinates.
Data indexing and search mechanisms make the dataset
discoverable.
13. Advanced User-Friendly Web-Interface
- metadata creation and data upload mechanisms
Open Access/ Copyleft licensing approach
- enable credit recognition and free distribution
Enhanced Metadata Model
- allow indexing, rapid access and search of data
Embedding Copyright Information into the data
- enable identification and documenting contributor lineage
Conceptual Design of Public Commons
14. Public Commons for Geo-Spatial data
What is Public Commons?
Online digital geospatial library-like data repository:
a Napster-like data-sharing facility that automatically
supports user-friendly metadata creation, open access
licenses, and documents the parent lineage of any newly
submitted data set.
15. Advanced User-Friendly Web-Interface
- metadata creation and data upload mechanisms
Pull down Menus for Metadata fields
Intelligent defaults drawn from previous responses and saved profiles
Upload datasets directly from folders
Minimized web transcripts
Conceptual Design of Public Commons
16. Metadata Elements
File reference ID
Details of the originator
Title of the content
Presentation form
Abstract or Extensive information
Time period of the content
Status of the work
Information about maintenance work.
Spatial Extent Info [ North, East, West, South bounding Coordinates or
interactive map ]
Data Theme Info
Keywords for the content and place of work
Spatial Data Info: (1) Data type: Raster / Vector; (2) Data format
Access Constraints:
Use Constraints:
Open Access Licensing
Liability Information
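The element list above can be sketched as a simple record structure. This is an illustrative sketch, not the actual Public Commons schema; all field names and default values are assumptions.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class CommonsMetadata:
    """Illustrative minimal metadata record based on the element
    list above; field names are assumptions, not the actual
    Public Commons schema."""
    file_reference_id: str
    originator: str
    title: str
    presentation_form: str
    abstract: str
    time_period: str
    status: str
    north: float   # bounding coordinates in decimal degrees
    east: float
    west: float
    south: float
    data_theme: str
    keywords: List[str]
    data_type: str     # "raster" or "vector"
    data_format: str   # e.g. "GeoTIFF", "Shapefile"
    access_constraints: str = "none"
    use_constraints: str = "open access license"
    liability_notice: str = "distributed WITHOUT ANY WARRANTY"

# Example record (invented values for illustration)
record = CommonsMetadata(
    file_reference_id="AA101234",
    originator="Harlan Onsrud",
    title="Maine water",
    presentation_form="map",
    abstract="Hydrography layer covering the state of Maine",
    time_period="2004",
    status="complete",
    north=47.5, east=-66.9, west=-71.1, south=43.0,
    data_theme="hydrography",
    keywords=["water", "Maine"],
    data_type="vector",
    data_format="Shapefile",
)
print(record.title)
```

Keeping only these fields mandatory is what brings the element count down to roughly 23, as the comparison on the next slide shows.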
17. Comparison of Metadata Templates

Metadata Template            Mandatory Elements (approx.)
FGDC CSDGM                   165
NOAA                         86
FGDC Metadata Lite           41
Geography Network            35
Public Commons               23
18. Open Access/ Copyleft licensing approach
- enable credit recognition and free distribution
Guard against liability exposure
Linking Liability information to the
datasets
as a part of the metadata creation process.
Conceptual Design of Public Commons
19. Open Access Licensing (OAL)
PUBLIC COMMONS OPEN ACCESS COPYRIGHT NOTICE
This copyrighted work permits unrestricted redistribution and
modification of a work, provided that all copies and derivatives retain
the same permission and the author is properly acknowledged and
cited.
Failure to conform to any of these conditions will be considered
a violation of this Copyright and is punishable by law.
This work is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
View Full License
20. Advantages of OAL
liability exposure may be substantially reduced through
the license provisions,
the originator and all value-adders have a legally
enforceable right to credit for their work,
the license can prevent the efforts of the originator and
value-adders from being captured by a company with a
large market share or otherwise being removed from an
open sharing arrangement, and
Commons Identification software - can provide instant
access to the detailed licensing language through an
Internet link.
21. Enhanced Metadata Model
- allows rapid indexing, access and search of data
Metadata search models
Alexandria Digital Library Approach
Data Clearinghouse Approach
Public Commons Hierarchical approach
Public Commons for Geo-Spatial data
22. Alexandria Digital Library Approach
Centralized: a GIS web user queries by spatial location
through the Digital Library web client, which searches a
single centralized metadata database.
23. FGDC Data Clearinghouse Approach
Decentralized and distributed: a GIS web user submits a
query (e.g. "Maine") to the FGDC entry point, which relays
it through a Z39.50 gateway to distributed clearinghouse
nodes (Alaska GDC, ESRI Clearinghouse, NOAA Clearinghouse,
NRCS Clearinghouse), each with its own metadata database.
25. Disadvantages
FGDC places all distributed clearinghouse nodes on the
same level, without any classification.
The results of a metadata query are returned by
individual clearinghouse servers rather than as an
integrated list ranked by suitability of content.
Duplicate metadata records arise because data suppliers
register with many clearinghouses.
Too many results – too much to evaluate.
Too many clearinghouses – too much confusion.
26. Public Commons Metadata Model
Centralized by themes and distributed by location:
metadata is grouped into theme repositories (Roads, Land
Parcel, Census), each partitioned by boundary info
(N, E, W, S bounding coordinates). A GIS web user queries
the Digital Library web client by spatial location and
theme, and receives prioritized search results.
27. Hierarchical Metadata Search
[Map figure: nested bounding boxes on a grid spanning
30°–66° N and 60°–120° W, with subdivisions at 54°, 57°,
63° and 66°.]
28. Advantages of Public Commons
Metadata Search Approach
Meaningful metadata archive structure.
Enhanced search mechanism.
Duplicate metadata registrations across multiple server
locations can be eliminated.
Each lower-level metadata repository can function
independently while sharing the same database with the
upper level.
Results returned for a query are listed by rank.
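The theme-centralized, location-distributed search approach can be sketched in a few lines. Everything here (the toy index, the overlap-based ranking) is an illustrative assumption, not the thesis's implementation:

```python
def overlap_area(a, b):
    """Area of intersection of two (west, south, east, north) boxes."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return w * h if w > 0 and h > 0 else 0.0

# Theme-centralized metadata index: theme -> [(dataset id, bbox)].
# In the full model each theme repository is further distributed
# by bounding coordinates; here everything sits in one dict.
index = {
    "roads": [("roads-maine", (-71.1, 43.0, -66.9, 47.5))],
    "water": [("maine-water", (-71.1, 43.0, -66.9, 47.5)),
              ("penobscot-water", (-69.4, 44.4, -68.2, 45.9))],
}

def search(theme, query_bbox):
    """Return dataset ids under a theme, ranked by spatial overlap
    with the query box (the 'prioritized search results')."""
    hits = []
    for ds, bbox in index.get(theme, []):
        area = overlap_area(bbox, query_bbox)
        if area > 0:
            hits.append((area, ds))
    return [ds for _, ds in sorted(hits, reverse=True)]

print(search("water", (-69.0, 44.5, -68.5, 45.5)))
```

Because every theme repository is keyed the same way, a dataset registered once under its theme and bounding box needs no duplicate registrations elsewhere.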
29. Embedding Copyright Information into the data
- enable identification and documenting contributor lineage
Attaching an Identification Number to Standard GIS format files
Using Steganography for raster and vector datasets
Embedding an ID in polygonal sides of vector datasets
Spatial File Identification System (SFIPCA)
Link author information and open access licenses
Link Metadata
Document parent lineage
Permanently mark information directly into the dataset
Conceptual Design of Public Commons
30. Raster Images - Steganography
Encodes extra information into the least significant bits
of raster images.
Can hide text as well as small images in raster datasets
(JPG, GIF, DRGs, TIFF, etc.).
Combined with cryptography, it becomes even tougher for
code breakers.
Limited solutions exist for raster datasets (e.g. Invisible
Secrets, DigiMarc).
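The least-significant-bit technique described above can be sketched for a raster band. This is an illustrative sketch, not SFIPCA code; the fake band and identifier are invented, and each pixel value changes by at most one.

```python
def embed_id(pixels, ident):
    """Hide an ASCII identifier in the least significant bits of
    a sequence of 8-bit pixel values (one bit per pixel)."""
    bits = [(byte >> i) & 1 for byte in ident.encode("ascii")
            for i in range(7, -1, -1)]
    out = bytearray(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit  # overwrite only the LSB
    return bytes(out)

def extract_id(pixels, length):
    """Recover a `length`-character identifier from the LSBs."""
    chars = []
    for c in range(length):
        byte = 0
        for i in range(8):
            byte = (byte << 1) | (pixels[c * 8 + i] & 1)
        chars.append(byte)
    return bytes(chars).decode("ascii")

import random
random.seed(0)
band = bytes(random.randrange(256) for _ in range(1024))  # fake raster band
stamped = embed_id(band, "AA101234")
print(extract_id(stamped, 8))
```

Since only the lowest bit of each pixel changes, the visual difference is imperceptible, which is what makes the embedded identifier unobtrusive.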
34. Identification & Verification
Identification numbers link datasets to their contributors:
e.g. AA101234 for the Maine water dataset (Harlan Onsrud)
and AA101235 for the derived Penobscot water dataset
(Sharad).
When a spatial dataset is contributed to SFIPCA, the
identification number is extracted from the dataset and
checked for a match in a database placed at a remote
location on the Internet.
SFIPCA-controlled databases:
a database of identifier numbers,
a database of metadata placed at a centralized location, and
a database of linked machine-readable licenses for patrons.
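The identification and verification step can be sketched as a lookup that also walks parent links to assemble the hierarchy of contributors. All records and field names here are invented for illustration:

```python
# Illustrative sketch of SFIPCA-style verification: an identifier
# extracted from a dataset is checked against an identifier database
# whose records link to a parent dataset and its contributors.
identifiers = {
    "AA101234": {"dataset": "Maine water",     "parent": None,
                 "contributors": ["Harlan Onsrud"]},
    "AA101235": {"dataset": "Penobscot water", "parent": "AA101234",
                 "contributors": ["Sharad"]},
}

def lineage(ident):
    """Walk parent links to list all contributors, oldest first."""
    chain = []
    while ident is not None:
        rec = identifiers[ident]
        chain = rec["contributors"] + chain
        ident = rec["parent"]
    return chain

def verify(ident):
    """Return the match for an extracted identifier, or None."""
    rec = identifiers.get(ident)
    if rec is None:
        return None
    return {"dataset": rec["dataset"], "credit": lineage(ident)}

print(verify("AA101235"))
```

Walking the parent chain is what lets a derivative dataset carry visible credit for the originator as well as every value-adder.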
40. Conclusions
The identifier system need not be foolproof, since the goal
is to provide evidence that a file is in the public commons
rather than in private ownership.
There is little incentive to strip unobtrusive IDs, since
everyone can use the file for free anyway.
The only potential thief tempted to strip IDs might be a
business trying to capture the past contributions of
others.... Yet similar earlier files would exist in
archives.... And 90% credit is good enough.
The greatest challenge is to counter unintentional
stripping of IDs.
41. Conclusions
A continuous loop of growth in GIS data: public domain GIS
data (fed by federal government GIS data, and by commercial
GIS data upon copyright expiration) combines with
share-alike Public Commons datasets to form a growing and
evolving resource of public domain and public commons
licensed spatial datasets, which in turn feeds value-added
products and services.
43. Conclusions
Would the tens of thousands of individuals creating GIS
datasets make use of such capabilities to share their
datasets with others?
Our hypothesis: YES
44. Future Work
Integrated search mechanisms based on spatial location
and ontologies?
Provide further software tools in one place so that
people can produce maps on their own.
Investigate the Geospatial One-Stop Internet portal
architecture relative to the Public Commons.
How can we accommodate people who would like to
share databases?
Alternatives to steganographic techniques for embedding
extra information.
Alternative search and access mechanisms.