The document discusses the evolving landscape of semantic technologies and their applications to scientific domains like eScience. It introduces the Tetherless World Constellation, a research group applying semantic web techniques. Examples are given of projects applying semantics to areas like virtual observatories and provenance capture. The value of semantic technologies is discussed for integration, discovery, and validation of scientific data and models. Modular ontologies and semantically-enabled frameworks are presented as important directions for reuse and collaboration.
201109021 mcguinness ska_meeting
1. The Evolving Semantic Web and Semantic eScience Landscape Deborah L. McGuinness Tetherless World Senior Constellation Chair Professor of Computer and Cognitive Science Rensselaer Polytechnic Institute Troy, NY, USA Joint work with the Tetherless World Constellation eScience, Provenance, and Linked Open Data Teams. Particularly Peter Fox, Jim Hendler, Patrick West, Stephan Zednik, Cynthia Chang, … tw.rpi.edu/people
43. Semantic Web Methodology and Technology Development Process
James L. Benedict, Deborah L. McGuinness, and Peter Fox. A Semantic Web-based Methodology for Building Conceptual Models of Scientific Information. In American Geophysical Union, Fall Meeting (AGU2006), San Francisco, CA, December, 2007. Eos Trans. AGU 88(52), Fall Meet. Suppl., Abstract IN53A-0950.
Schematic of sources of atmospheric disruption: what they are, where they occur in the atmosphere, and how they show up after the eruption in terms of a climate process. The processes are moderately well understood, BUT the data is everywhere, under many different controls.
From NASA: “The importance of the study of stratospheric aerosol is not one that readily connects with the general public. This is not too surprising since aerosol in the stratosphere can be seen with the naked eye (in the form of luminous sunsets following large volcanic eruptions) only a few times over the course of a lifetime. Similarly, consider that under nominal non-volcanic background conditions the stratosphere contains about 1 Tg (1 megatonne). If this material were deposited uniformly onto the surface of the Earth, it would result in a layer only about 1-nm thick or less than one ten-thousandth of the width of a human hair. With this in mind, it is not difficult to imagine that the general public may not appreciate the important role that stratospheric aerosol can play in climate. However, in this era of shrinking science dollars, it is required to develop coherent arguments for continued research and investment into what is almost by definition an esoteric field.”
Types of physical quantities between volcano and climate that need to be related. We need to integrate underlying data from heterogeneous sources.
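The 1-nm figure in the NASA quote can be sanity-checked with a short calculation. The sketch below is only a back-of-the-envelope estimate: the Earth surface area (about 5.1e14 m²) and the aerosol bulk density (about 1700 kg/m³, roughly that of a concentrated sulfuric acid solution) are assumptions that do not appear in the notes.

```python
# Back-of-the-envelope check of the "about 1-nm thick" figure.
# Assumed values (not from the notes): Earth's surface area and aerosol density.
EARTH_SURFACE_AREA_M2 = 5.1e14   # approximate total surface area of the Earth
AEROSOL_DENSITY_KG_M3 = 1700.0   # assumed bulk density of stratospheric aerosol


def layer_thickness_m(mass_kg: float,
                      area_m2: float = EARTH_SURFACE_AREA_M2,
                      density_kg_m3: float = AEROSOL_DENSITY_KG_M3) -> float:
    """Thickness of a uniform layer: volume (mass / density) divided by area."""
    return mass_kg / density_kg_m3 / area_m2


background_mass_kg = 1e9  # 1 Tg = 1e12 g = 1e9 kg (1 megatonne)
thickness_nm = layer_thickness_m(background_mass_kg) * 1e9  # metres -> nanometres
print(f"Layer thickness: {thickness_nm:.2f} nm")
```

Under these assumptions the result comes out slightly above 1 nm, in line with the quoted figure; a different assumed density shifts the number proportionally.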
during January 2000
http://was.tw.rpi.edu/swqp/map.html
http://was.tw.rpi.edu/swqp/trend/epaTrend.html?state=RI&county=3&site=http%3A%2F%2Ftw2.tw.rpi.edu%2Fzhengj3%2Fowl%2Fepa.owl%23facility-110000312135
Please make sure all parameters are selected as shown in this image (facility permit, characteristic, test type), and then click “click”.
http://was.tw.rpi.edu/swqp/map.html
http://inference-web.org/wiki/Semantic_Water_Quality_Portal
ImpacTeen: part of Bridging the Gap: Research Informing Practice and Policy for Healthy Youth Behavior, supported by the Robert Wood Johnson Foundation and administered by Univ. of Illinois at Chicago. http://www.impacteen.org/
Many Benefits:
- Reduced query formation from 8 to 3 steps and reduced choices at each stage
- Allowed scientists to get data from instruments they never knew of before (e.g., photometers in example)
- Supported augmentation and validation of data
- Useful and related data provided without having to be an expert to ask for it
- Integration and use (e.g., plotting) based on inference
- Ask and answer questions not possible before
But Needed: Provenance (SPCDIS, PML), reusability & modularity (SESF)
Deborah McGuinness, Peter Fox, Luca Cinquini, Patrick West, Jose Garcia, James L. Benedict, and Don Middleton. The Virtual Solar-Terrestrial Observatory: A Deployed Semantic Web Application Case Study for Scientific Research. In the Proceedings of the Nineteenth Conference on Innovative Applications of Artificial Intelligence (IAAI-07). Vancouver, British Columbia, Canada, July 22-26, 2007.
Peter Fox, Deborah L. McGuinness, Luca Cinquini, Patrick West, Jose Garcia, James L. Benedict, and Don Middleton. Ontology-supported Scientific Data Frameworks: The Virtual Solar-Terrestrial Observatory Experience. In Computers and Geosciences - Elsevier. Volume 35, Issue 4 (2009).
The current focus of SPCDIS is to model provenance for one VSTO-affiliated service known as the Chromospheric Helium Image Photometer (CHIP) Pipeline. CHIP is a sensor located at the Mauna Loa Solar Observatory (MLSO), which takes pictures of the sun every 3 minutes. These pictures are then sent to a data processing center at the National Center for Atmospheric Research. There, follow-up processing is conducted on the MLSO pictures, such as Flat Field Calibration, which removes optical errors in image data, as well as quality checking, which assigns each picture a grade of GOOD, BAD, or UGLY. Finally, the MLSO pictures are each processed into two kinds of images consumable by scientists: intensity images, which measure the luminosity of certain sections of the sun, and velocity images, which measure how fast matter on certain sections of the sun is moving.
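To make the idea of pipeline provenance concrete, here is a minimal sketch of how the steps described above could be recorded and traced back. It is not PML and not the SPCDIS implementation: the class names, the `run_step` and `lineage` helpers, the ordering of calibration before quality checking, and the sample grade are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional

QUALITY_GRADES = ("GOOD", "BAD", "UGLY")  # grading scale named in the notes


@dataclass
class Artifact:
    """A data product plus a pointer to the step that produced it."""
    name: str
    generated_by: Optional["ProcessStep"] = None
    quality: Optional[str] = None  # set only once a quality check has run


@dataclass
class ProcessStep:
    """One processing activity and the artifacts it consumed."""
    name: str
    inputs: List[Artifact] = field(default_factory=list)


def run_step(step_name: str, inputs: List[Artifact], output_name: str) -> Artifact:
    """Record a step and return its output, linked back to that step."""
    step = ProcessStep(step_name, list(inputs))
    return Artifact(output_name, generated_by=step)


def lineage(artifact: Artifact) -> List[str]:
    """Walk back through generating steps; most recent step comes first."""
    steps: List[str] = []
    frontier = [artifact]
    while frontier:
        current = frontier.pop()
        if current.generated_by is not None:
            steps.append(current.generated_by.name)
            frontier.extend(current.generated_by.inputs)
    return steps


# Hypothetical run: step order and the grade value are illustrative only.
raw = Artifact("raw CHIP image")  # captured at the observatory, no prior step
calibrated = run_step("Flat Field Calibration", [raw], "calibrated image")
graded = run_step("Quality Check", [calibrated], "graded image")
graded.quality = "GOOD"
assert graded.quality in QUALITY_GRADES
intensity = run_step("Intensity Processing", [graded], "intensity image")
velocity = run_step("Velocity Processing", [graded], "velocity image")

print(lineage(velocity))
```

Because every artifact keeps a link to the step that generated it, a scientist looking at a velocity image can ask which calibration and quality-check steps it passed through, which is the kind of question provenance capture is meant to answer.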
We are using regulation data from 4 states (MA, CA, RI, NY) plus 1 set of regulation data from EPA (5 in total).
Preprocessing regulation data: identify the correct limit for each contaminant (some of the data contain English words, not just numbers), and write ad hoc code to convert them into a format that our converter is able to process.
Some links to regulation data:
http://www.dem.ri.gov/pubs/regs/regs/water/h20q09.pdf page 100 (RI)
http://water.epa.gov/drink/contaminants/index.cfm (EPA)
http://www.mass.gov/dep/water/drinking/standards/dwstand.htm (MA)
Data range:
ECHO data range: 10/31/2007 to 09/30/2010
USGS data range: 1955-05-26 to 1999-11-09
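The preprocessing step described above (pulling a numeric limit out of text that mixes English words and numbers) might look something like the following sketch. The sample strings, the set of unit spellings, and the function name `parse_limit_mg_per_l` are hypothetical; the actual ad hoc code and converter input format are not shown in the notes. Treating ppm as mg/L and ppb as ug/L is an approximation that holds for dilute aqueous solutions.

```python
import re
from typing import Optional

# Conversion factors to mg/L; the unit spellings handled here are assumptions.
UNIT_TO_MG_PER_L = {"mg/l": 1.0, "ug/l": 1e-3, "ppm": 1.0, "ppb": 1e-3}

# A number (with optional decimal part) followed by one of the known units.
LIMIT_PATTERN = re.compile(
    r"(?P<value>\d+(?:\.\d+)?|\.\d+)\s*(?P<unit>mg/l|ug/l|ppm|ppb)",
    re.IGNORECASE,
)


def parse_limit_mg_per_l(raw: str) -> Optional[float]:
    """Return the first numeric limit in `raw`, in mg/L, or None if absent."""
    match = LIMIT_PATTERN.search(raw)
    if match is None:
        return None  # e.g. a purely narrative standard with no number
    value = float(match.group("value"))
    factor = UNIT_TO_MG_PER_L[match.group("unit").lower()]
    return value * factor


# Hypothetical regulation entries, for illustration only.
samples = [
    "not to exceed 0.010 mg/L",
    "10 ug/L (action level)",
    "shall be free of visible sheen",
]
for text in samples:
    print(repr(text), "->", parse_limit_mg_per_l(text))
```

Entries that yield `None` would still need manual review, since a limit expressed only in words cannot be converted automatically.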
The user can select the data organizations they trust, and the portal will use only data from the selected organizations.
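A minimal sketch of this source-based filtering is shown below. The `Measurement` record, its field names, and the sample values are hypothetical and are not taken from the portal's actual data model.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Measurement:
    """One water-quality reading; field names are illustrative."""
    site: str
    characteristic: str
    value: float
    source: str  # organization that published the record


def filter_by_trusted_sources(records: Iterable[Measurement],
                              trusted: Set[str]) -> List[Measurement]:
    """Keep only records whose publishing organization the user trusts."""
    return [record for record in records if record.source in trusted]


# Hypothetical records, for illustration only.
records = [
    Measurement("site-A", "Arsenic", 0.012, "EPA"),
    Measurement("site-A", "Arsenic", 0.009, "USGS"),
    Measurement("site-B", "Lead", 0.020, "EPA"),
]
trusted_only = filter_by_trusted_sources(records, {"USGS"})
print([(m.site, m.source) for m in trusted_only])
```

Keeping the source on every record is what makes this kind of user-controlled trust selection possible after data from several organizations has been merged.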
Use Linked Data to enable a common format, preserve data structure, and support incremental data growth
Use a Semantic Web ontology to capture deep semantics
Use the Social Semantic Web to support community contributions
Use SPARQL tools to enable data mash-up and connect back to conventional web tech
Access (Search/Query) => Cleanup/Mashup