This document proposes a process-centric ontological approach to integrate geo-sensor data by relating observed properties to underlying geo-processes. It aligns the DOLCE foundational ontology with concepts from surface hydrology to develop an ontology of hydrological processes and their participants. This allows semantic integration of sensor data by resolving naming ambiguities and enabling process-based retrieval of observations through participation and property relations. Further work is needed to clarify the bearers of qualities and specify participant roles.
Terra-i is a system that uses neural networks and MODIS data to monitor habitat change in near real-time. It maps habitat loss every 16 days at 250m resolution. Its goals are to monitor natural habitat conversion, have continental coverage, support government decision making, and quantify habitat change rates. Terra-i predicts vegetation greenness using past NDVI and precipitation data, compares this to MODIS measurements to detect anomalies, and calibrates results with Landsat images. Comparisons show it correlates well with other systems like PRODES. Terra-i is a tool for rapid habitat monitoring at continental to regional scales to inform conservation policy.
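The core detection idea, predicting expected greenness from recent history and flagging large negative deviations, can be sketched very simply. This is an illustrative sketch only: Terra-i itself uses a neural network trained on NDVI and precipitation, not the moving-average predictor and threshold assumed below.

```python
# Sketch of Terra-i's detection idea (not its actual model): predict the
# expected vegetation index from the recent history, then flag observations
# that fall well below the prediction as possible habitat loss.

def detect_anomalies(ndvi_series, window=6, threshold=0.15):
    """Flag time steps where observed NDVI drops far below the
    mean of the preceding `window` observations."""
    flags = []
    for t in range(window, len(ndvi_series)):
        expected = sum(ndvi_series[t - window:t]) / window
        observed = ndvi_series[t]
        flags.append(expected - observed > threshold)
    return flags

# A stable series followed by a sharp greenness drop:
series = [0.8, 0.81, 0.79, 0.8, 0.82, 0.8, 0.3]
print(detect_anomalies(series))  # the final drop is flagged
```

In the real system the prediction step is learned and precipitation-aware, and flagged pixels are further calibrated against Landsat imagery before being reported as habitat change.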
Web-enabled Physical Samples: Curating and Publishing Physical Samples in CSIRO (Anusuriya Devaraju)
The document discusses CSIRO's implementation of the International Geo-Sample Number (IGSN) system to provide unique identifiers and metadata for physical samples. It describes how CSIRO established IGSN allocation rules and implemented registration services and schemas. Assigning IGSNs to samples in various CSIRO collections helps make them discoverable online and allows cross-referencing between related resources like sub-samples, datasets, and publications. The use of IGSNs is expected to expand across CSIRO facilities and collections.
Combining Process and Sensor Ontologies to Support Geo-Sensor Data Retrieval (Anusuriya Devaraju)
This document discusses combining a Sensor Network Ontology (SNO) with a Process-centric Hydrology Domain Ontology (HDO) to provide an integrated view of the Semantic Sensor Web. SNO describes sensors and observations at different levels of abstraction. HDO specifies relations between geo-processes, participants, and properties, and handles naming issues in the hydrology domain. Together they allow complex observation requests involving sensors, features, and properties. Both ontologies are under ongoing improvement: SNO is being aligned with W3C standards, and HDO's process descriptions are being refined.
Representing and Reasoning about Geographic Occurrences in the Sensor Web (Anusuriya Devaraju)
Observations are fed into the Sensor Web through a growing number of environmental sensors, including technical and human observers. While a wealth of observations is now accessible, there is still a gap between low-level observations and the high-level descriptive information they reflect. For example, we may ask what the measurements mean when a weather buoy provides a temperature time series. The challenge is not to gather a vast number of observations, but rather to make sense of them in environmental monitoring and decision making.
In order to infer meaningful information about occurrences from observations, a description of how one gets from the former to information about the latter must be expressed. This thesis develops an ontology to formally capture the relationships between geographic occurrences and the properties observed by in situ sensors. Building upon existing positions on experiential and historical perspectives, stimulus-centric sensing, event-process algebra, and thematic roles, the ontology elucidates the key concepts associated with geographic occurrences that are particularly significant from a sensing point of view. A use case for reasoning about blizzards and their temporal parts from real time series supplied by Environment Canada illustrates the ontological approach. The thesis evaluates its findings through a comparison with an alternative approach in the Sensor Web, a verification of the use-case results against an official event report published by the weather agency, and an analytical assessment from a system-development perspective.
The theoretical contribution of the thesis lies in the development of a formal model that constitutes common building blocks for constructing application ontologies that account for inferences of geographic events from observations. With regard to its practical contribution, the thesis demonstrates how ontological vocabularies can be exploited with reasoning mechanisms to infer information about events and to formulate symbolic spatio-temporal queries.
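The blizzard use case above, inferring an occurrence and its temporal parts from a station time series, can be sketched as threshold rules over consecutive observations. The thresholds and minimum duration below are assumptions for the example, not the official Environment Canada criteria used in the thesis.

```python
# Illustrative sketch of inferring a blizzard occurrence from station
# observations. Thresholds and duration are assumptions for the example,
# not the official Environment Canada criteria.

WIND_KMH = 40        # sustained wind at or above this speed
VISIBILITY_KM = 0.4  # visibility at or below this distance
MIN_STEPS = 4        # consecutive observations required (e.g. hourly)

def blizzard_intervals(obs):
    """obs: list of (wind_kmh, visibility_km) tuples, one per time step.
    Returns (start, end) index pairs of runs long enough to count."""
    intervals, start = [], None
    for i, (wind, vis) in enumerate(obs + [(0.0, 99.0)]):  # sentinel ends runs
        active = wind >= WIND_KMH and vis <= VISIBILITY_KM
        if active and start is None:
            start = i
        elif not active and start is not None:
            if i - start >= MIN_STEPS:
                intervals.append((start, i - 1))
            start = None
    return intervals
```

Each returned interval corresponds to one inferred occurrence; in the ontological approach, such intervals and their sub-intervals become the temporal parts of the event about which symbolic spatio-temporal queries can be formulated.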
The rapid development of sensing technologies has led to the creation of large volumes of environmental observation data. Quality control information informs users how the data were gathered, processed, and examined. The Sensor Web is a web-centric framework that brings together observations from various providers, and it is essential to capture quality control information within this framework to ensure that observation data are of known and documented quality. In this paper, we present a quality control framework covering different types of environmental observation data and show how it is implemented in the TERENO data infrastructure, which is modeled after the OGC's Sensor Web Enablement (SWE) standards.
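The flavour of the automated checks such a framework applies can be sketched with two classic tests, a plausible-range check and a spike check. These are generic textbook examples, not TERENO's actual quality control procedures or thresholds.

```python
# Generic examples of automated quality-control checks of the kind such a
# framework applies; not TERENO's actual procedures or thresholds.

def range_check(values, lo, hi):
    """Flag values outside a physically plausible range [lo, hi]."""
    return [not (lo <= v <= hi) for v in values]

def spike_check(values, max_step):
    """Flag values that jump too far from the previous observation."""
    return [False] + [abs(b - a) > max_step
                      for a, b in zip(values, values[1:])]

temps = [18.2, 18.4, 45.0, 18.5, -60.0]
print(range_check(temps, -40, 40))  # extreme values flagged
print(spike_check(temps, 5.0))      # sudden jumps flagged
```

In a Sensor Web setting, the outcome of such checks is attached to each observation as a quality flag so that downstream users can see that the data are of known and documented quality.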
CAPTURING DATA PROVENANCE WITH A USER-DRIVEN FEEDBACK APPROACH (Anusuriya Devaraju)
Various portals have been developed to provide an easy way to discover and access public research data sets from various organizations. Data sets are made available with descriptive metadata based on common (e.g., OGC, CUAHSI, FGDC, INSPIRE, ISO, Dublin Core) or proprietary standards to facilitate better understanding and use of the data sets. Provenance descriptions may be included as part of the metadata and are specified from a data provider's perspective. These can include, for example, the different entities and activities involved in a data creation flow, such as sensing platforms, personnel, and data calculation and transformation processes. Moving beyond provider-centric descriptions, data provenance may be complemented with forward provenance records supplied by data consumers, gathered via a user-driven feedback approach. Feedback from data consumers gives valuable insights into the application and assessment of published data sets. This might include descriptions of a scientific analysis in which the data sets were used, a corrected version of an actual data set, or discovered issues and suggestions concerning the quality of the published data sets. Data providers might then use this information to handle erroneous data and to improve existing metadata as well as their data collection and processing methods. Contributors can use the feedback channel to share their scientific analyses, and data consumers can learn more about data sets from other people's experiences, potentially saving time by avoiding the need to interpret or clean data sets themselves.
The goals of the study are to capture feedback from data users on published research data sets, link this feedback to the actual data sets, and ultimately support search and discovery of research data using feedback information. This paper reports preliminary results addressing these goals. We provide a summary of current practices for gathering feedback from end-users on research data portals and discuss their relevance and limitations. Examples from the Earth Science domain of how commentaries from data users might be useful in practice are also included. We then present a data model representing key aspects of user feedback, propose a system architecture to gather and manage feedback from end-users, describe how the core PROV model may be used to represent the provenance of user feedback information, and specify technical solutions for linking feedback to existing data portals.
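A feedback record of the kind described, linked to a published data set by its persistent identifier, might look like the following minimal sketch. The field names and categories are illustrative assumptions, not the paper's actual data model.

```python
from dataclasses import dataclass, field

# Illustrative sketch of a user-feedback record linked to a published data
# set; field names and categories are assumptions, not the paper's actual
# data model.
@dataclass
class Feedback:
    dataset_id: str          # persistent identifier of the data set, e.g. a DOI
    author: str              # who supplied the feedback
    category: str            # e.g. "quality_issue", "usage_report", "correction"
    comment: str
    related_resources: list = field(default_factory=list)  # papers, derived data

fb = Feedback(
    dataset_id="doi:10.0000/example",
    author="data.consumer@example.org",
    category="quality_issue",
    comment="Sensor 7 reports implausible values after 2015-03-01.",
)
```

In the approach described above, such a record would additionally be wrapped in PROV statements (who created it, when, and which data set entity it refers to) so that the feedback itself carries provenance.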
This document provides an overview of geospatial semantics and interoperability. It defines key concepts such as semantics, interoperability, integration, and heterogeneity. It discusses the types of heterogeneities that can occur, such as structural, domain, data, and language conflicts between data sources. Standards and specifications aim to provide syntactic interoperability, but semantic interoperability requires identifying and resolving heterogeneities. The document also introduces the geospatial semantic web and observation data standards that can help achieve greater interoperability.
THE ISSUE OF UNCERTAINTY FOR HYDROLOGIC EVENTS IN THE MISSOURI RIVER WATERSHE... (Boris Shmagin)
1. The document discusses a paradigm shift in science and methodology due to computers, with concepts moving from a simple to complex world. Pattern recognition problems in cognitive science helped develop new methods.
2. The new paradigm introduces direct search for solutions, emphasis on decision making, and a unity of technical and holistic languages for pattern description. This leads to a convergence of exact science and humanities.
3. The main difference between the new and old paradigms is a focus on controlling algorithm complexity rather than function complexity to guarantee inference success. Low complexity algorithms can create complex functions that generalize well.
The document discusses the evolving landscape of semantic technologies and their applications to scientific domains like eScience. It introduces the Tetherless World Constellation, a research group applying semantic web techniques. Examples are given of projects applying semantics to areas like virtual observatories and provenance capture. The value of semantic technologies is discussed for integration, discovery, and validation of scientific data and models. Modular ontologies and semantically-enabled frameworks are presented as important directions for reuse and collaboration.
This document presents a new platform called 4DEOS that allows for the asynchronous visualization of heterogeneous spatiotemporal data. 4DEOS uses a client-broker-server architecture to integrate and visualize different geospatial datasets over time in order to better analyze correlations. It was developed and tested for earthquake prediction studies where visualizing signals from different sources at varying time lags could help identify precursors.
The document discusses the Sensor Web and Semantic Sensor Web. It provides definitions and examples of the Sensor Web from NASA and OGC. The key components of Sensor Web Enablement (SWE) are described including sensor models, encodings, and web services. The role of semantics and ontologies in the Semantic Sensor Web is explained to provide contextual meaning to sensor observations. Rules can be used to derive additional knowledge from semantically annotated sensor data.
This document discusses the Space-Time Cube, a tool for visualizing, analyzing, and managing spatiotemporal data describing human movement and events. It examines use cases like evacuation routes after an earthquake and identifying flocking patterns in pedestrian trajectory data. The Space-Time Cube allows interactive exploration of movement data at different scales, as well as identifying relationships and patterns within spatiotemporal event data, like archaeological site locations. Key challenges with spatiotemporal data include differences in resolution, accuracy, and data types, ranging from dense tracking datasets to sparse event records, requiring appropriate analytical methods.
1) The document proposes a framework for ontology-based change detection in remote sensing data using knowledge information processing to emulate human interpretation.
2) It applies this framework to detect changes like mudslides and flooding in satellite images of different locations.
3) Initial experiments show the framework can accurately detect changes and outperform human interpretation without parameter tuning, demonstrating its potential to understand changes as a human interpreter would.
'deep' semantics in the geosciences: semantic building blocks for a complete ... (The University of Auckland)
In the geosciences, the ontologies available are typically narrowly focused structures fit for a single purpose. In this paper we discuss why this might be, concluding that it is not sufficient to use semantics simply to provide categorical labels for instances: because of the interpretive and uncertain nature of geoscience, researchers need to understand how a conclusion has been reached in order to have any confidence in adopting it. Thus ontologies must address the epistemological questions of how (and possibly why) something is 'known'. We provide a longer justification for this argument, make a case for capturing and representing these deep semantics, provide examples in specific geoscience domains, and briefly touch on a visualisation program called Alfred that we have developed to allow researchers to explore the different facets of ontology that can support them in applying value judgements to the interpretation of geological entities.
The document summarizes a research paper titled "HON4D: Histogram of Oriented 4D Normals for Activity Recognition from Depth Sequences". It proposes a novel descriptor called HON4D that encodes the distribution of surface normal orientations in a 4D space of depth, time, and spatial coordinates for activity recognition from depth image sequences. The 4D space is quantized using the vertices of a polychoron structure to create bins. This allows the HON4D descriptor to capture more complex and articulated motions than existing holistic approaches. Evaluation shows it outperforms these prior methods and can also be adapted for unaligned dataset recognition.
The document summarizes research on daily living activity recognition using efficient combination of high and low level cues. The researchers propose an approach that fuses body pose estimation and low-level cues like optical flow to produce an enriched descriptor. A Fisher kernel representation is then used to model the temporal variation in video sequences for recognizing activities. The approach achieves state-of-the-art results on the ADL Rochester dataset.
Moving beyond the 4th Dimension of Quantifying, Analyzing & Visualizing A... (chrismalzone)
Presentation and paper by Chris Malzone, delivered by Mike Mutschler (RESON). It focuses on the benefits of fusing multiple sources of acoustic data, introducing the concept of fusion and the analysis of multi-source 4-dimensional data. The point is brought home through a habitat-mapping case study in the US Virgin Islands, analysed with the Eonfusion software.
Almost the same as the talk given to Ph.D. students one year ago. It covers the problem of research reproducibility and the tools that support it. First come some "theoretical" arguments, followed by an enumeration of tools.
Laurent Etienne's presentation at Geomatics Atlantic 2012 (www.geomaticsatlantic.com) in Halifax, June 2012. More session details at http://lanyrd.com/2012/geomaticsatlantic2012/stbgx/ .
Publication and long term archival of observational data in the field of environmental sciences is a challenging topic of today's eScience research. The amount of effort that goes into technical and scientific quality assurance prior to publication is considerable and might well turn out to be a barrier to data publication. Our project's goal is to lower the amount of manual effort and, at the same time, increase data quality in the process of submitting observational data for publication – in this case meteorological observational data. This goal is divided into the following subgoals:
Establish a standard procedure for the publication of observational data in the area of meteorology including quality information.
Develop a workflow system to automate the publication process.
Make the procedure usable for environmental sciences in general.
Integrate the procedure into an existing central data repository for meteorology (the CERA database at the World Data Center for Climate).
This talk is about the current state of the project from an eResearch and technical point of view.
Aspects of Reproducibility in Earth Science (Raul Palma)
The document discusses aspects of reproducibility in earth science research within the European Virtual Environment for Research - Earth Science Themes (EVEREST) project. The key objectives of EVEREST are to establish an e-infrastructure to facilitate collaborative earth science research through shared data, models, and workflows. Research Objects (ROs) will be used to capture and share workflows, processes, and results to help ensure reproducibility and preservation of earth science research. An example RO is described for mapping volcano deformation using satellite imagery and other data sources. Issues around reproducibility related to data access, software dependencies, and manual intervention in workflows are also discussed.
A semantic framework and software design to enable the transparent integratio... (Patricia Tavares Boralli)
This document proposes a conceptual framework to unify representations of natural systems knowledge. The framework is based on separating the ontological nature of an object of study from the context of its observation. Each object is associated with a concept defined in an ontology and an observation context describing aspects like location and time. Models and data are treated as generic knowledge sources with a semantic type and observation context. This allows flexible integration and calculation of states across heterogeneous sources by composing their observation contexts and resolving semantic compatibility. The framework aims to simplify knowledge representation by abstracting away complexity related to data format and scale.
Current and pending projects at the University of Kansas Biodiversity Research Center include FishNet 2, an improved infrastructure for sharing biodiversity information; DataONE, which ensures preservation and access to earth observation data; and VDC, a virtual distributed network of data centers that supports access to biodiversity, ecological and environmental data through open standards and protocols.
Data Facilities Workshop - Panel on Current Concepts in Data Sharing & Intero... (EarthCube)
This series of presentations was given at the EarthCube Data Facilities End-User Workshop held January 15-17, 2014 in Washington, DC. This workshop provided a forum to discuss the unique requirements and challenges associated with developing the communication, collaboration, interoperability, and governance structures that will be required to build EarthCube in conjunction with existing and emerging NSF/GEO facilities.
This panel and discussion, specifically, outlined and explained several current concepts in data sharing and interoperability, featuring presentations by:
Paul Morin (UMN): Polar Cyberinfrastructure
Don Middleton (UCAR): Atmospheric/Climate
Kerstin Lehnert (LDEO): Domain Repositories & Physical Samples
David Schindel (CBOL, GRBio): Biological Perspective & Collections
Hank Leoscher (NEON): Observation Networks
Daniel Fuka (Virginia Tech) and Ruth Duerr (NSIDC): Brokering
Ilya Zaslavsky (UCSD): Cross-Domain Interoperability
Taking advantage of state-of-the-art underwater vehicles and current networking capabilities, the visionary double objective of this work is to "open to people connected to the Internet an access to ocean depths anytime, anywhere." Today, these people can only perceive the changing surface of the sea from the shore and know almost nothing of what lies hidden beneath. If they could explore the seabed and become knowledgeable, they would get involved in finding alternative solutions to our vital terrestrial problems: pollution, climate change, destruction of biodiversity, and exhaustion of the Earth's resources. The second objective is to assist professionals of the underwater world in performing their tasks by augmenting the perception of the scene and offering automated actions such as wildlife monitoring and counting. The introduction of Mixed Reality and the Internet into aquatic activities constitutes a technological breakthrough compared with existing related technologies. Through the Internet, anyone, anywhere, at any moment will be able to dive in real time using a Remotely Operated Vehicle (ROV) in the most remarkable sites around the world. The heart of this work is focused on Mixed Reality. The main challenge is to achieve real-time display of a digital video stream to web users by mixing 3D entities (objects or pre-processed underwater terrain surfaces) with 2D video of live images collected in real time by a teleoperated ROV.
The definition and extraction of actionable anomalous discords, i.e. pattern outliers, is a challenging problem in data analysis. It raises the crucial issue of identifying criteria that would render one discord more insightful than another. In this paper, we propose an approach to address this by introducing the concept of the prominent discord. The core idea behind this new concept is to identify dependencies among discords of varying lengths. How can we identify a discord that is prominent? We propose an ordering relation that ranks discords, and we seek a set of prominent discords with respect to this ordering. Our contributions are threefold: 1) a formal definition, an ordering relation, and methods to derive prominent discords based on Matrix Profile techniques; 2) their evaluation over large contextual climate data covering 110 years of monthly observations; and 3) a comparison of an exact method based on STOMP with an approximate approach based on SCRIMP++ to compute the prominent discords and study the trade-off between optimality and CPU cost. The approach is generic, and its pertinence is shown over historical climate data.
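The underlying notion of a discord can be illustrated with a brute-force nearest-neighbour search: for each subsequence, find the distance to its nearest non-overlapping neighbour; the discord is the subsequence whose nearest neighbour is farthest away. The paper computes this quantity with Matrix Profile methods (STOMP, SCRIMP++), which are far more efficient; the sketch below only illustrates the definition.

```python
import math

# Brute-force discord search, illustrating the definition only; Matrix
# Profile methods (STOMP, SCRIMP++) compute the same quantity efficiently.

def discord(series, m):
    """Return (start_index, nn_distance) of the top discord of length m."""
    n = len(series) - m + 1
    best = (-1, -1.0)
    for i in range(n):
        nn = math.inf
        for j in range(n):
            if abs(i - j) < m:       # exclude trivial (overlapping) matches
                continue
            d = math.dist(series[i:i + m], series[j:j + m])
            nn = min(nn, d)
        if nn > best[1]:
            best = (i, nn)
    return best

series = [0, 1, 0, 1, 0, 1, 5, 5, 0, 1, 0, 1, 0, 1]
print(discord(series, 2))  # the (5, 5) bump stands out
```

A prominent discord, as proposed in the paper, then goes one step further by ranking such discords across different lengths m via an ordering relation, rather than reporting the single top discord at one fixed length.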
This document discusses the FAIR (Findable, Accessible, Interoperable, Reusable) principles for data management and describes an approach for assessing how well research data comply with them. The approach involves developing metrics based on the FAIR principles, an automated tool called F-UJI that assesses data against the metrics, and consultation to help data repositories improve their FAIRness over time based on the assessment results. The tool has been used to assess over 10,000 datasets across several data repositories. Challenges and lessons learned are also discussed, such as balancing machine-readable metrics with the human aspects of the principles and ensuring that assessments account for different types and contexts of data. The overall aim is to improve the FAIRness of research data over time through repeated assessment and support.
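The flavour of such automated assessment can be sketched as simple tests over a dataset's metadata record. The checks and field names below are assumptions for the example, not F-UJI's actual metrics or scoring scheme.

```python
# Illustrative sketch of automated FAIR-style checks over a metadata
# record; the checks and field names are assumptions for the example,
# not F-UJI's actual metrics or scoring scheme.

def assess(metadata):
    checks = {
        "has_persistent_id": str(metadata.get("identifier", "")).startswith("doi:"),
        "has_license": bool(metadata.get("license")),
        "has_description": len(metadata.get("description", "")) > 0,
        "has_access_url": metadata.get("access_url", "").startswith("http"),
    }
    score = sum(checks.values()) / len(checks)
    return checks, score

record = {
    "identifier": "doi:10.0000/example",
    "license": "CC-BY-4.0",
    "description": "Hourly soil moisture observations.",
    "access_url": "https://repository.example.org/datasets/42",
}
checks, score = assess(record)
print(score)  # 1.0 for this complete record
```

Running such checks repeatedly over a repository's holdings is what makes it possible to track FAIRness improvement over time, as described above.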
The document provides a 7-step guide to effective research data sharing: 1) Publish your dataset with a persistent identifier; 2) Be generous when describing your data with metadata; 3) Use machine-readable controlled vocabularies; 4) Link related resources with persistent identifiers; 5) Choose an appropriate license; 6) De-identify sensitive data if applicable; 7) Make data available in a format for long-term accessibility and choose a trustworthy research data repository. The guide emphasizes making data findable, accessible, interoperable and reusable (FAIR).
The document discusses the evolving landscape of semantic technologies and their applications to scientific domains like eScience. It introduces the Tetherless World Constellation, a research group applying semantic web techniques. Examples are given of projects applying semantics to areas like virtual observatories and provenance capture. The value of semantic technologies is discussed for integration, discovery, and validation of scientific data and models. Modular ontologies and semantically-enabled frameworks are presented as important directions for reuse and collaboration.
This document presents a new platform called 4DEOS that allows for the asynchronous visualization of heterogeneous spatiotemporal data. 4DEOS uses a client-broker-server architecture to integrate and visualize different geospatial datasets over time in order to better analyze correlations. It was developed and tested for earthquake prediction studies where visualizing signals from different sources at varying time lags could help identify precursors.
The document discusses the Sensor Web and Semantic Sensor Web. It provides definitions and examples of the Sensor Web from NASA and OGC. The key components of Sensor Web Enablement (SWE) are described including sensor models, encodings, and web services. The role of semantics and ontologies in the Semantic Sensor Web is explained to provide contextual meaning to sensor observations. Rules can be used to derive additional knowledge from semantically annotated sensor data.
This document discusses the Space-Time Cube, a tool for visualizing, analyzing, and managing spatiotemporal data describing human movement and events. It examines use cases like evacuation routes after an earthquake and identifying flocking patterns in pedestrian trajectory data. The Space-Time Cube allows interactive exploration of movement data at different scales, as well as identifying relationships and patterns within spatiotemporal event data, like archaeological site locations. Key challenges with spatiotemporal data include differences in resolution, accuracy, and data types, ranging from dense tracking datasets to sparse event records, requiring appropriate analytical methods.
1) The document proposes a framework for ontology-based change detection in remote sensing data using knowledge information processing to emulate human interpretation.
2) It applies this framework to detect changes like mudslides and flooding in satellite images of different locations.
3) Initial experiments show the framework can accurately detect changes without parameter tuning and even outperform human interpretation, demonstrating its potential to interpret changes as a human analyst would.
'Deep' semantics in the geosciences: semantic building blocks for a complete ... – The University of Auckland
In the geosciences, the ontologies available are typically narrowly focused structures fit for single purpose use. In this paper we discuss why this might be, with the conclusion that it is not sufficient to use semantics simply to provide categorical labels for instances—because of the interpretive and uncertain nature of geoscience, researchers need to understand how a conclusion has been reached in order to have any confidence in adopting it. Thus ontologies must address the epistemological questions of how (and possibly why) something is ‘known’. We provide a longer justification for this argument, make a case for capturing and representing these deep semantics, provide examples in specific geoscience domains and briefly touch on a visualisation program called Alfred that we have developed to allow researchers to explore the different facets of ontology that can support them applying value judgements to the interpretation of geological entities.
The document summarizes a research paper titled "HON4D: Histogram of Oriented 4D Normals for Activity Recognition from Depth Sequences". It proposes a novel descriptor called HON4D that encodes the distribution of surface normal orientations in a 4D space of depth, time, and spatial coordinates for activity recognition from depth image sequences. The 4D space is quantized using the vertices of a polychoron structure to create bins. This allows the HON4D descriptor to capture more complex and articulated motions than existing holistic approaches. Evaluation shows it outperforms these prior methods and can also be adapted for unaligned dataset recognition.
The document summarizes research on daily living activity recognition using efficient combination of high and low level cues. The researchers propose an approach that fuses body pose estimation and low-level cues like optical flow to produce an enriched descriptor. A Fisher kernel representation is then used to model the temporal variation in video sequences for recognizing activities. The approach achieves state-of-the-art results on the ADL Rochester dataset.
Moving beyond the 4th Dimension of Quantifying, Analyzing & Visualizing A... – chrismalzone
Presentation & paper by Chris Malzone, given by Mike Mutschler (RESON). Focuses on the benefits of fusing multiple sources of acoustic data, introduces the concept of fusion for multisource 4-dimensional data, and brings the point home with a habitat-mapping case study in the US Virgin Islands analysed with the Eonfusion software.
Almost the same as the talk given to Ph.D. students one year ago. It covers the problem of research reproducibility and the tools that support it: first some theoretical arguments, then a survey of tools.
Laurent Etienne's presentation at Geomatics Atlantic 2012 (www.geomaticsatlantic.com) in Halifax, June 2012. More session details at http://lanyrd.com/2012/geomaticsatlantic2012/stbgx/ .
Publication and long-term archival of observational data in the field of environmental sciences is a challenging topic in today's eScience research. The amount of effort that goes into technical and scientific quality assurance prior to publication is considerable and may well turn out to be a barrier to data publication. Our project's goal is to lower the amount of manual effort and, at the same time, increase data quality in the process of submitting observational data for publication – in this case meteorological observational data. This goal is divided into the following subgoals:
Establish a standard procedure for the publication of observational data in the area of meteorology including quality information.
Develop a workflow system for the automation of the publication process.
Make the procedure usable for environmental sciences in general.
Integration of the procedure into an existing central data repository for meteorology (the CERA database at the World Data Center for Climate).
This talk is about the current state of the project from an eResearch and technical point of view.
Aspects of Reproducibility in Earth Science – Raul Palma
The document discusses aspects of reproducibility in earth science research within the European Virtual Environment for Research - Earth Science Themes (EVEREST) project. The key objectives of EVEREST are to establish an e-infrastructure to facilitate collaborative earth science research through shared data, models, and workflows. Research Objects (ROs) will be used to capture and share workflows, processes, and results to help ensure reproducibility and preservation of earth science research. An example RO is described for mapping volcano deformation using satellite imagery and other data sources. Issues around reproducibility related to data access, software dependencies, and manual intervention in workflows are also discussed.
A semantic framework and software design to enable the transparent integratio... – Patricia Tavares Boralli
This document proposes a conceptual framework to unify representations of natural systems knowledge. The framework is based on separating the ontological nature of an object of study from the context of its observation. Each object is associated with a concept defined in an ontology and an observation context describing aspects like location and time. Models and data are treated as generic knowledge sources with a semantic type and observation context. This allows flexible integration and calculation of states across heterogeneous sources by composing their observation contexts and resolving semantic compatibility. The framework aims to simplify knowledge representation by abstracting away complexity related to data format and scale.
Current and pending projects at the University of Kansas Biodiversity Research Center include FishNet 2, an improved infrastructure for sharing biodiversity information; DataONE, which ensures preservation and access to earth observation data; and VDC, a virtual distributed network of data centers that supports access to biodiversity, ecological and environmental data through open standards and protocols.
Data Facilities Workshop - Panel on Current Concepts in Data Sharing & Intero... – EarthCube
This series of presentations was given at the EarthCube Data Facilities End-User Workshop held January 15-17, 2014 in Washington, DC. This workshop provided a forum to discuss the unique requirements and challenges associated with developing the communication, collaboration, interoperability, and governance structures that will be required to build EarthCube in conjunction with existing and emerging NSF/GEO facilities.
This panel and discussion, specifically, outlined and explained several current concepts in data sharing and interoperability, featuring presentations by:
Paul Morin (UMN): Polar Cyberinfrastructure
Don Middleton (UCAR): Atmospheric/Climate
Kerstin Lehnert (LDEO): Domain Repositories & Physical Samples
David Schindel (CBOL, GRBio): Biological Perspective & Collections
Hank Leoscher (NEON): Observation Networks
Daniel Fuka (Virginia Tech) and Ruth Duerr (NSIDC): Brokering
Ilya Zaslavsky (UCSD): Cross-Domain Interoperability
Taking advantage of state-of-the-art underwater vehicles and current networking capabilities, the visionary double objective of this work is to "open to people connected to the Internet an access to ocean depths, anytime, anywhere." Today, these people can only perceive the changing surface of the sea from the shore, and know almost nothing about what lies hidden beneath it. If they could explore the seabed and become knowledgeable about it, they would get involved in finding alternative solutions to our vital terrestrial problems: pollution, climate change, destruction of biodiversity and exhaustion of the Earth's resources. The second objective is to assist professionals of the underwater world in performing their tasks, by augmenting their perception of the scene and offering automated actions such as wildlife monitoring and counting. The introduction of Mixed Reality and the Internet into aquatic activities constitutes a technological breakthrough compared with existing related technologies. Through the Internet, anyone, anywhere, at any moment will be able to dive in real time, using a Remotely Operated Vehicle (ROV), at the most remarkable sites around the world. The heart of this work is Mixed Reality. The main challenge is to achieve real-time display of a digital video stream to web users by mixing 3D entities (objects or pre-processed underwater terrain surfaces) with 2D video of live images collected in real time by a teleoperated ROV.
The definition and extraction of actionable anomalous discords, i.e. pattern outliers, is a challenging problem in data analysis. It raises the crucial issue of identifying criteria that would render one discord more insightful than another. In this paper, we propose an approach to address this by introducing the concept of prominent discord. The core idea behind this new concept is to identify dependencies among discords of varying lengths. How can we identify a discord that would be prominent? We propose an ordering relation that ranks discords, and we seek a set of prominent discords with respect to this ordering. Our contributions are threefold: 1) a formal definition, an ordering relation and methods to derive prominent discords based on Matrix Profile techniques; 2) their evaluation over large contextual climate data, covering 110 years of monthly data; and 3) a comparison of an exact method based on STOMP and an approximate approach based on SCRIMP++ to compute the prominent discords and study the optimality/CPU tradeoff. The approach is generic and its pertinence is shown over historical climate data.
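The Matrix Profile idea behind these methods can be illustrated with a minimal brute-force sketch: compute, for each subsequence, the distance to its nearest non-trivial match, and take the subsequence farthest from everything else as the top discord. This is only an illustration of the underlying concept; the paper's methods rely on STOMP and SCRIMP++ for efficiency, and the series and parameters below are invented.

```python
import numpy as np

def matrix_profile(ts, m):
    """Brute-force matrix profile: for each length-m subsequence, the
    z-normalized Euclidean distance to its nearest non-trivial match."""
    n = len(ts) - m + 1
    subs = np.array([ts[i:i + m] for i in range(n)])
    mu = subs.mean(axis=1, keepdims=True)
    sd = subs.std(axis=1, keepdims=True)
    sd[sd == 0] = 1.0
    zn = (subs - mu) / sd
    excl = m // 2  # exclusion zone suppresses trivial self-matches
    mp = np.full(n, np.inf)
    for i in range(n):
        for j in range(n):
            if abs(i - j) > excl:
                d = np.linalg.norm(zn[i] - zn[j])
                if d < mp[i]:
                    mp[i] = d
    return mp

def top_discord(ts, m):
    """The top discord is the subsequence farthest from its nearest neighbour."""
    return int(np.argmax(matrix_profile(ts, m)))

# A periodic series with one injected anomaly near index 50.
t = np.sin(np.linspace(0, 20 * np.pi, 200))
t[50] += 5.0
idx = top_discord(t, m=8)  # falls within the anomalous region
```

STOMP computes the same profile exactly in O(n^2) with incremental dot products, while SCRIMP++ approximates it anytime, which is the optimality/CPU tradeoff the paper studies.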
F-UJI: An Automated Assessment Tool for Improving the FAIRness of Research Data – Anusuriya Devaraju
The document describes F-UJI, an automated tool for assessing the FAIRness of research data. It was developed as part of the FAIRsFAIR project to test metrics and a badging scheme for evaluating how well individual datasets adhere to the FAIR data principles. The tool extracts metadata about datasets from repositories, applies 15 core metrics corresponding to the FAIR principles, and returns an assessment report. It was piloted on over 500 datasets each from 5 repositories, to provide recommendations to improve data FAIRness. Feedback from the repositories was incorporated into further developing the metrics and tool.
An Automated Assessment of the FAIRness of Research Data – Anusuriya Devaraju
This document discusses developing metrics and an automated tool to assess how FAIR (findable, accessible, interoperable, reusable) research data is. It describes the FAIR principles, the FAIRsFAIR project aims, developing object assessment metrics in collaboration with repositories, and the F-UJI tool which automatically assesses data based on the metrics. Pilots with several repositories provided recommendations to improve data FAIRness and status updates. The approach aims to iteratively improve FAIR assessment considering repository contexts.
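A rough sketch of how such metric-based assessment might work is given below. The metric IDs refer to the FAIR principles themselves (F1, F2, R1.1); they are not F-UJI's actual metric catalogue or API, and the checks are deliberately simplistic.

```python
# Illustrative metric-based FAIR assessment over harvested metadata.
# Metric IDs name FAIR principles, not F-UJI's real metrics.
METRICS = [
    ("F1", "data is assigned a globally unique, persistent identifier",
     lambda md: str(md.get("identifier", "")).startswith("https://doi.org/")),
    ("F2", "data is described with rich metadata",
     lambda md: bool(md.get("title")) and bool(md.get("creator"))),
    ("R1.1", "metadata includes a clear usage licence",
     lambda md: bool(md.get("license"))),
]

def assess(metadata):
    """Run each metric test against harvested metadata and score the result."""
    results = {mid: bool(test(metadata)) for mid, _desc, test in METRICS}
    score = 100.0 * sum(results.values()) / len(results)
    return {"results": results, "score_percent": round(score, 1)}
```

In practice a tool like this must first harvest metadata from landing pages and APIs; the harvesting step, and the weighting of metrics into maturity levels, is where most of the real complexity lives.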
Towards A Web-Enabled Geo-Sample Web: An Open Source Resource Registration an... – Anusuriya Devaraju
This document discusses the challenges of inconsistent sample cataloguing practices and lack of online sample discovery. It introduces the International Geo Sample Number (IGSN) as a solution to provide globally unique identifiers for physical samples. It then describes the implementation of IGSN at the Commonwealth Scientific and Industrial Research Organisation (CSIRO) in Australia, including developing an allocating agent service, description metadata schema, and sample registration system. Finally, it discusses applications of IGSN for sample tracking, discovery, and linking samples to related resources, and lessons learned from CSIRO's implementation.
Data You May Like: A Recommender System for Research Data Discovery – Anusuriya Devaraju
This document describes a recommender system developed for the CSIRO Data Access Portal to help users discover research data. It examines two types of recommender systems: content-based, which uses item properties, and collaborative filtering, which uses user similarities. The system was built using both explicit metadata and implicit user behavior data. It calculates similarity between datasets across various features using techniques like TF-IDF and develops recommendations. An evaluation found the top recommendations were relevant 98% of the time. Future work includes enhancing the model and further evaluation.
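The content-based similarity step described above can be sketched as a toy TF-IDF/cosine implementation. The dataset descriptions below are invented, and this is not the portal's actual model, which also uses implicit user behaviour.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """TF-IDF weight per term, per document, from whitespace-tokenised text."""
    tokenised = [doc.lower().split() for doc in docs]
    n = len(docs)
    df = Counter(t for toks in tokenised for t in set(toks))
    idf = {t: math.log(n / df[t]) for t in df}
    vecs = []
    for toks in tokenised:
        tf = Counter(toks)
        vecs.append({t: (tf[t] / len(toks)) * idf[t] for t in tf})
    return vecs

def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def recommend(target, docs, k=2):
    """Indices of the k dataset descriptions most similar to docs[target]."""
    vecs = tfidf_vectors(docs)
    sims = [(cosine(vecs[target], v), i) for i, v in enumerate(vecs) if i != target]
    return [i for _s, i in sorted(sims, reverse=True)[:k]]

# Invented descriptions; real input would be dataset titles/abstracts.
docs = [
    "daily rainfall observations queensland",
    "monthly rainfall totals queensland",
    "soil moisture sensor network data",
]
top = recommend(0, docs, k=1)
```

Here the two rainfall datasets share weighted terms, so the rainfall record is recommended for the rainfall query while the soil-moisture record scores zero.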
The Implementation of the International Geo Sample Number in CSIRO: Experienc... – Anusuriya Devaraju
In 2014 the Commonwealth Scientific and Industrial Research Organisation (CSIRO) began to implement the International Geo Sample Number (IGSN) to allow unambiguous identification of physical samples and data derived from these samples. In this paper we describe the requirements for the implementation of persistent identifiers for physical samples in the organisation and technical solutions we developed to meet these requirements.
Using Feedback from Data Consumers to Capture Quality Information on Environm... – Anusuriya Devaraju
Data quality information is essential to facilitate reuse of Earth science data. Recorded quality information must be sufficient for other researchers to select suitable data sets for their analysis and confirm the results and conclusions. In the research data ecosystem, several entities are responsible for data quality. Data producers (researchers and agencies) play a major role in this aspect, as they often include validation checks or data cleaning as part of their work. It is possible that quality information is not supplied with published data sets; where it is available, the descriptions may be incomplete, ambiguous, or address only specific quality aspects. Data repositories have built infrastructures to share data, but not all of them assess data quality; they normally provide guidelines for documenting quality information. Some suggest that scholarly and data journals should take a role in ensuring data quality by involving reviewers to assess data sets used in articles and incorporating data quality criteria in the author guidelines. However, this mechanism primarily addresses data sets submitted to journals. We believe that data consumers will complement existing entities in assessing and documenting the quality of published data sets. This has been adopted in crowdsourcing platforms such as Zooniverse, OpenStreetMap, Wikipedia, Mechanical Turk and Tomnod. This paper presents a framework designed with open source tools to capture and share data users' feedback on the application and assessment of research data. The framework comprises a browser plug-in, a web service and a data model, such that feedback can be easily reported, retrieved and searched. The feedback records are also made available as Linked Data to promote integration with other sources on the Web. Vocabularies from Dublin Core and PROV-O are used to clarify the source and attribution of feedback. The application of the framework is illustrated with the CSIRO's Data Access Portal.
An Open Source Web Service for Registering and Managing Environmental Samples – Anusuriya Devaraju
This document describes the development of an open source web service for registering and managing environmental samples. The system was created to address issues with identifying and sharing metadata for samples that are isolated in different collections. It implements the International Geo Sample Number (IGSN) standard for uniquely identifying samples and includes a descriptive metadata schema and REST API web service to register samples and namespaces within CSIRO. The system was tested on the Capricorn Distal Footprints sample collection project and successfully registered samples and metadata. Future work includes applying the system to additional sample collections and mapping the metadata elements to existing data standards.
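The core of such a registration service, namespace-scoped minting of globally unique sample identifiers, can be sketched as follows. The class, identifier pattern and rules here are hypothetical and invented for illustration; they do not reproduce CSIRO's IGSN service, schema or allocation rules.

```python
import re

class SampleRegistry:
    """Toy in-memory registry: namespace-scoped, globally unique sample IDs.
    Illustrative only; the pattern and rules are hypothetical, not CSIRO's."""

    ID_PATTERN = re.compile(r"^[A-Z]{2,4}[A-Z0-9]{1,10}$")

    def __init__(self):
        self.namespaces = set()
        self.samples = {}

    def register_namespace(self, ns):
        """An allocating agent grants a namespace prefix to a collection."""
        self.namespaces.add(ns.upper())

    def register_sample(self, ns, local_code, metadata):
        """Mint an identifier from namespace + local code and store metadata."""
        ns = ns.upper()
        if ns not in self.namespaces:
            raise ValueError(f"unknown namespace: {ns}")
        igsn = f"{ns}{local_code.upper()}"
        if not self.ID_PATTERN.match(igsn):
            raise ValueError(f"malformed identifier: {igsn}")
        if igsn in self.samples:
            raise ValueError(f"already registered: {igsn}")
        self.samples[igsn] = metadata
        return igsn
```

A REST API would expose `register_namespace` and `register_sample` as endpoints and resolve each identifier to its metadata landing page.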
The document introduces the principles of Linked Data, which aims to share data rather than documents on the web. It describes the four rules of Linked Data and provides examples of existing Linked Data datasets as well as tools for publishing and using Linked Data. The document also discusses extending Linked Data to include geospatial and sensor data by linking web resources, structured geospatial databases, and unstructured geographic information.
1. A Process-Centric Ontological Approach for Integrating Geo-Sensor Data
Anusuriya Devaraju & Werner Kuhn
Institute for Geoinformatics, University of Muenster
{anusuriya.devaraju, kuhn}@uni-muenster.de
FOIS 2010 - 6th International Conference on Formal Ontology in Information Systems, 13th May 2010.
2. A Simple Example…
Mrs Schneider: cut thin pieces from a large piece of cooked meat
Mr Schneider: cut into or shape (a hard material) to produce an object or design
Image Source: http://www.cartoonstock.com/directory/C/Carving.asp
4. Background
Geo-sensors provide key information about geo-processes.
One way to interpret sensor observations is by looking at the geo-processes that influence them.
Challenge:
– How to relate observed properties to geo-processes?
– Develop an approach that captures consensual knowledge of the surface hydrology domain (observed properties and hydro-processes)
5. Motivation
Lack of principled ways of describing different kinds of occurrences
– In the GI domain, terminological inconsistencies have led to disagreements on classifying processes and events [Galton, 2008]
– Existing work: [Yuan, 2001], [Dias, 2004], [Wang, 2004], [Worboys, 2005], etc.
Are observed properties sufficient to classify or identify geo-processes?
– Objects & matter as the 'bearers' of observed properties.
Handle semantic heterogeneities within geo-sensor data
– Handle differences in naming conventions for (a) observed properties (e.g., Gauge Height | Raw Stage) and (b) geo-processes (e.g., InterFlow | SubsurfaceStormFlow)
6. Motivation
From the Sensor Web community
– An ontology of observable property-types to improve the discovery and retrieval of sensor data sources must be available [SWE, OGC 2007].
– Eventually, the integration of domain ontologies […], semantic queries and semantic transformations into the Sensor Web infrastructure has to be addressed [GEOSS Sensor Web Workshop Report, 2008]. A sensor ontology is needed to specify sensor capabilities and the observed phenomena, together with a complementing domain ontology to specify what is being measured and the relation between the observed properties and features of interest in the domain.
Existing approaches to support semantic integration of geo-sensor data
– Focused on ontologies for sensors, observed properties, entities (e.g., physical objects). More examples in the paper ☺
7. Approach
Process-centric* ontological approach: a DOLCE-aligned surface hydrology ontology relating observed properties to geo-processes.
DOLCE specifies (i) a basic-level distinction between processes and events, and (ii) relations between processes and physical properties (via participants).
Related work based on DOLCE: Observation & Sensor [Probst (2007); Kuhn & Ortmann (2010); Neuhaus & Compton (2005); Babitski et al. (2009); Fallahi (2008); Brodaric & Probst (2009)]; Extreme Events [Sherp et al. (2009); MONITOR] …
* The notion 'process' encompasses different kinds of perdurants, such as processes and events.
9. Ontological Relations
Subsumption (SB): all individuals of a universal are necessarily individuals of another.
  Example: SB(WaterObject, Lake); SB(PrecipitationProcess, SnowProcess)
Participation (PC): relates endurants to the perdurants in which they participate.
  Example: PC(Vegetation(x), TranspirationProcess(y), T(t))
Parthood (PP): a time-independent relation holding between two individuals of perdurants or abstracts.
  Example: PP(SnowflakeMelting(x), RainProcess(y))
Temporary Parthood (P): a relation between two individuals of endurant where one is part of the other at a particular time.
  Example: P(Headwater(x), River(y), T(t))
Inherence (qt): a relation between an individual quality and its bearer.
  Example: qt(Salinity(x), River(y)); qt(PrecipitationDuration(a), PrecipitationProcess(b))
11. Discussions: Sensor Data Retrieval
Importing domain categories into the Sensor Network Ontology (Neuhaus, 2009).
Resolving naming ambiguity
o One process can be distinguished from other processes via the participation relation
o The equivalentClass relation identifies a synonymous category
Improving sensor-data retrieval
o Observation requests based on the relations between processes, their participants as well as their properties.
o Example: "How long did the rainstorm occur in a given watershed during the above period?" (asking about duration); "How much water was received from the specified storm?" (asking about interaction); "What is the number of days since the last precipitation?" (dry period preceding precipitation)
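To illustrate how participation and inherence relations can drive this kind of retrieval, here is a toy triple store in Python. The relation names follow the slides, but the instances are invented for the example.

```python
# A toy triple store illustrating process-based observation retrieval.
# Relation names follow the slides; the instances are invented.
triples = {
    # participation: endurants taking part in perdurants
    ("Vegetation01", "participatesIn", "TranspirationProcess01"),
    ("Watershed07", "participatesIn", "RainstormProcess03"),
    # inherence: individual qualities and their bearers
    ("PrecipitationDuration01", "inheresIn", "RainstormProcess03"),
    ("GaugeHeight01", "inheresIn", "River05"),
    # naming ambiguity resolved via equivalence
    ("RawStage", "equivalentClass", "GaugeHeight"),
}

def qualities_of_process(process):
    """Observable properties attached to a process via inherence."""
    return {s for (s, p, o) in triples if p == "inheresIn" and o == process}

def processes_of_participant(participant):
    """Perdurants a given endurant participates in."""
    return {o for (s, p, o) in triples if p == "participatesIn" and s == participant}
```

Asking "how long did the rainstorm occur in this watershed?" then amounts to finding the rainstorm process the watershed participates in and reading off its duration quality.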
12. Ongoing & Future Work
It is harder to pinpoint the bearer of a quality
– The definition of 'features' (from OGC's O&M specification) allows any 'entity' to be classified as a feature type (e.g., geographic objects, events)
– In DOLCE, a physical quality can only inhere in a physical endurant!
Further investigations are required on the concept of quality
– Combinations of qualities form a more complex query; e.g., discharge = area × velocity
Specify participants based on their 'role' with respect to a perdurant
– Amount of water, a particular ground surface, and amount of soil as participants in the infiltration process.
Describe social hydro-concepts, e.g. catchment
13. What's Next…
Can we formalize this?
"A flash flood is a rapid flooding of geomorphic low-lying areas (washes, rivers, dry lakes and basins). It may be caused by heavy rain or meltwater from ice or snow flowing over ice sheets or snowfields. Flash floods can also occur after the collapse of an ice dam, debris dam or a human structure, such as a dam. Flash floods are distinguished from a regular flood by a timescale of less than six hours." *
* http://en.wikipedia.org/wiki/Flash_flood
14. Conclusions
[Diagram: Property, Object, Matter, Process and Event, with their (spatial, temporal) relations]
Semantic Integration of Geo-Sensor Data