201109021 mcguinness ska_meeting

Invited talk for the Square Kilometer Array meeting in Wellington, New Zealand, in Sept 2011 on Semantic eScience and Semantically enabled Virtual Observatories, along with directions

  • Schematic of the sources of atmospheric disruption: what they are, where they occur in the atmosphere, and how they show up after the eruption in terms of a climate process. The processes are moderately well understood, BUT the data is everywhere, under many different controls. From NASA: “The importance of the study of stratospheric aerosol is not one that readily connects with the general public. This not too surprising since aerosol in the stratosphere can be seen with the naked eye (in the form of luminous sunsets following large volcanic eruptions) only a few times over the course of a lifetime. Similarly, consider that under nominal non-volcanic background conditions that the stratosphere contains about 1 Tg (1 megatonne). If this material were deposited uniformly onto the surface of the Earth, it would result in a layer only about 1-nm thick or less than one ten-thousandth of the width of a human hair. With this in mind, it is not difficult to image that the general public may not appreciate the important role that stratospheric aerosol can play in climate. However, in this era of shrinking science dollars, it is required to develop coherent arguments for continued research and investment into what is almost by definition an esoteric field.” Types of physical quantities between volcano and climate need to be related. We need to integrate underlying data from heterogeneous sources.
  • during January 2000
  • James L. Benedict, Deborah L. McGuinness, and Peter Fox. A Semantic Web-based Methodology for Building Conceptual Models of Scientific Information. In American Geophysical Union, Fall Meeting (AGU2006), San Francisco, Ca., December, 2007. Eos Trans. AGU 88(52), Fall Meet. Suppl., Abstract IN53A-0950.
  • http://was.tw.rpi.edu/swqp/map.html
  • http://was.tw.rpi.edu/swqp/trend/epaTrend.html?state=RI&county=3&site=http%3A%2F%2Ftw2.tw.rpi.edu%2Fzhengj3%2Fowl%2Fepa.owl%23facility-110000312135 Please make sure all parameters are selected as shown in this image: facility permit, characteristic, test type, and then click “click”. http://was.tw.rpi.edu/swqp/map.html http://inference-web.org/wiki/Semantic_Water_Quality_Portal
  • http://logd.tw.rpi.edu/demo/tax-cost-policy-prevalence http://logd.tw.rpi.edu/project/popscigrid
  • ImpacTeen: part of Bridging the Gap: Research Informing Practice and Policy for Healthy Youth Behavior, supported by the Robert Wood Johnson Foundation and administered by Univ. of Illinois at Chicago. http://www.impacteen.org/
  • The current focus of SPCDIS is to model provenance for one VSTO-affiliated service known as the Chromospheric Helium Image Photometer (or CHIP) Pipeline. CHIP is a sensor located at the Mauna Loa Solar Observatory (MLSO), which takes pictures of the sun every 3 minutes. In turn, these pictures are sent to a data processing center at the National Center for Atmospheric Research. Here, follow-up processing is conducted on the MLSO pictures, such as Flat Field Calibration, which removes optical errors in image data, as well as quality checking on pictures, leading to grades on the scale of GOOD, BAD, or UGLY. Finally, the MLSO pictures are each processed into two kinds of images consumable by scientists: intensity images, which measure the luminosity of certain sections of the sun, and velocity images, which measure how fast matter on certain sections of the sun is moving.
  • http://was.tw.rpi.edu/swqp/map.html
  • We are using regulation data from 4 states (MA, CA, RI, NY) plus 1 regulation data set from EPA (5 in total). Preprocessing the regulation data: identify the correct limit for each contaminant (some of the data contain English words, not just numbers) and write ad hoc code to convert them into a format that our converter is able to process. Some links to regulation data: http://www.dem.ri.gov/pubs/regs/regs/water/h20q09.pdf page 100 (RI) http://water.epa.gov/drink/contaminants/index.cfm (EPA) http://www.mass.gov/dep/water/drinking/standards/dwstand.htm (MA) Data range: the ECHO date range is 10/31/2007-09/30/2010; the USGS date range is 1955-05-26 to 1999-11-09.
  • The user can select the data organizations he/she trusts, and the portal will use only data from the selected organizations.
  • http://was.tw.rpi.edu/swqp/trend/epaTrend.html?state=RI&county=3&site=http%3A%2F%2Ftw2.tw.rpi.edu%2Fzhengj3%2Fowl%2Fepa.owl%23facility-110000312135 Please make sure all parameters are selected as shown in this image: facility permit, characteristic, test type, and then click “click”. http://was.tw.rpi.edu/swqp/map.html http://inference-web.org/wiki/Semantic_Water_Quality_Portal
  • Use Linked Data to enable a common format, preserve data structure, and support incremental data growth. Use semantic web ontology to capture deep semantics. Use and Social Semantic Web to support community contributions. Use SPARQL tools to enable data mash-up and connect back to conventional web tech. Access (Search/Query) => Cleanup/Mashup

Presentation Transcript

  • The Evolving Semantic Web and Semantic eScience Landscape
    • Deborah L. McGuinness, Tetherless World Senior Constellation Chair, Professor of Computer and Cognitive Science, Rensselaer Polytechnic Institute, Troy, NY, USA
    • Joint work with the Tetherless World Constellation eScience, Provenance, and Linked Open Data Teams. Particularly Peter Fox, Jim Hendler, Patrick West, Stephan Zednik, Cynthia Chang, …
    • tw.rpi.edu/people
  • Introduction
      • Science data is exploding – sensors creating more than we can handle, Linked open data initiatives, etc.
      • Virtual Observatories expanding – in breadth, depth, and semantic usage
      • Introduction to a leading edge interdisciplinary virtual observatory – Virtual Solar Terrestrial Observatory
      • Directions – (that may be even more important for BIG science)
        • Provenance
        • Semantic eScience Framework
      • Discussion
  • Rensselaer Tetherless World Constellation (TWC), http://tw.rpi.edu. Chaired Professors: McGuinness, Fox, Hendler; Research Prof: Luciano; Research Staff: Bao, Chang, Erickson, Shi, West, Zednik
    • Themes:
    • Semantic Foundations
      • Knowledge Provenance / Explanation
      • Ontology Environments
      • Inference
      • Trust
      • Linked Data
    • Xinformatics
      • Semantic eScience
      • Data Science
      • eHealth
      • eEnvironment
    • Future Web
      • Web Science
      • Policy
      • Social
  • Semantic e-Science Motivations
    • AI Goal: AI in service of supporting the next generation of science – interdisciplinary, distributed e-Science
    • Science Goal: Scientists should be able to access a global, distributed knowledge base of scientific data that:
      • appears to be integrated
      • appears to be locally available
    • But… data is obtained by multiple instruments, using various protocols, in differing vocabularies, using (sometimes unstated) assumptions, with inconsistent (or non-existent) meta-data. It may be inconsistent, incomplete, evolving, and distributed.
    • We look to semantic technologies to help.
    McGuinness NSF/NCAR May 6, 2008
  • Virtual Solar Terrestrial Observatory (vsto.org)
    • Interdisciplinary Virtual Observatory for searching, integrating, & analyzing observational, experimental, & model databases.
    • Subject matter: solar, solar-terrestrial and space physics
    • Provides virtual access to specific data, model, tool and material archives containing items from a variety of space- and ground-based instruments and experiments, as well as individual and community modeling and software efforts bridging research and educational use
    • 3 year NSF project; initial deployment in year 1, multiple deployments by year 2; year 3 outreach and broadening
    • While aimed at one interdisciplinary area, it serves as a replicable prototype for interdisciplinary virtual observatories
    • Numerous follow-ons (Semantic Provenance Capture in Data Ingest Systems, SESDI, SESF, SSIII, …)
  • 9/15/2009 McGuinness - Cog Sci - RPI With NCAR, UTEP
  • Some Learnings
    • Successful demonstration of semantic technologies
    • Serves as operational prototype and has been replicated in volcanology and climate response, semantic sea ice, ….
    • Semantic Web methodology for development
    • Modularization of ontologies is critical for re-use (along with designing the ontologies for re-use)
    • Provenance is critical for acceptance
    • Tools, toolkits, and smart frameworks are one next step that we are taking (and we love partners in this endeavor…)
  • Semantic Web Methodology and Technology Development Process
    • Establish and improve a well-defined methodology vision for Semantic Technology based application development; Leverage controlled vocabularies, etc.
    Diagram labels: Use Case; Small Team, mixed skills; Analysis; Adopt Technology Approach; Leverage Technology Infrastructure; Rapid Prototype; Open World: Evolve, Iterate, Redesign, Redeploy; Use Tools; Science/Expert Review & Iteration; Develop model/ontology; Evaluation
    • James L. Benedict, Deborah L. McGuinness, and Peter Fox. A Semantic Web-based Methodology for Building Conceptual Models of Scientific Information. In American Geophysical Union, Fall Meeting (AGU2006), San Francisco, Ca., December, 2007. Eos Trans. AGU 88(52), Fall Meet. Suppl., Abstract IN53A-0950.
  • Semantic Provenance Capture for Data Ingest Systems (SPCDIS)
    • Fact: Scientific data services are increasing in usage and scope, and with these increases comes growing need for access to provenance information.
    • Provenance Project Goal : to design a reusable, interoperable provenance infrastructure.
    • Science Project Goal: design and implement an extensible provenance solution that is deployed at the science data ingest/ product generation time.
    • Outcome: implemented provenance solution in one science setting AND operational specification for other scientific data applications.
    • Extends vsto.org
  • Advanced Coronal Observing System (ACOS) Provenance Use Cases
    • What were the cloud cover and seeing conditions during the observation period of this image?
    • What calibrations have been applied to this image?
    • Why does this image look bad?
  • ACOS Data Ingest
    • Typical science data processing pipelines
    • Distributed
    • Some metadata in silos
    • Much metadata lost
    • Many human-in-loop decisions, events
    • No metadata infrastructure for any user
    • Community is broadening
    Chromospheric Helium Imaging Photometer (CHIP) Data Ingest ACOS – Advanced Coronal Observing System
  • PML Usage in SPCDIS
    • Justification
      • Explanation
      • Causality graph
    • Provenance
      • Conclusion
      • Source
      • Engine
      • Rule
    • Trust
      • Trust/Belief metrics
    Diagram labels: NodeSet (each with Justification and Conclusion, shown three times); Engine; Rule; SourceUsage; Source; DateTime; properties hasAntecedentList, hasSourceUsage, hasInferenceRule, hasInferenceEngine
  • PML in Action
    • This is the PML provenance encoding for a “quick look” gif file that is generated from two image datasets
    Node set for the quicklook gif file. hasConclusion: a reference to the gif file itself. InferenceStep: how the gif file was derived (hasAntecedents, hasInferenceRule, hasInferenceEngine). The “antecedents” of the quicklook gif file are other node sets.
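The node-set structure described on this slide can be sketched in a few lines of Python. This is a minimal illustration only, not PML itself or any Inference Web API: the class names NodeSet and InferenceStep mirror the slide's labels, while the field names, the lineage helper, the rule and engine strings, and the file names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class InferenceStep:
    """How a conclusion was derived (hypothetical fields mirroring the slide)."""
    rule: str                      # stands in for hasInferenceRule
    engine: str                    # stands in for hasInferenceEngine
    antecedents: List["NodeSet"] = field(default_factory=list)


@dataclass
class NodeSet:
    """A conclusion plus, optionally, the step that justifies it."""
    conclusion: str                # stands in for hasConclusion
    step: Optional[InferenceStep] = None


def lineage(node: NodeSet) -> List[str]:
    """Return this node's conclusion followed by all antecedent conclusions, depth-first."""
    found = [node.conclusion]
    if node.step is not None:
        for antecedent in node.step.antecedents:
            found.extend(lineage(antecedent))
    return found


# Hypothetical names: a quicklook gif derived from two image datasets.
dataset_1 = NodeSet("image_dataset_1")
dataset_2 = NodeSet("image_dataset_2")
quicklook = NodeSet(
    "quicklook.gif",
    InferenceStep(
        rule="combine-images",
        engine="quicklook-generator",
        antecedents=[dataset_1, dataset_2],
    ),
)

print(lineage(quicklook))
```

Walking the antecedents in this way is what lets a user trace a product back to the data it came from, which is the kind of question the ACOS provenance use cases ask.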
  • Integrated View
    • Observer log’s information added into quicklook image’s provenance
  • Knowledge Provenance in Action
    • Mobile Wine Agent
    • GILA
    • Combining Proofs in TPTP
    • Cognitive Asst
    • Virtual Observatories
    • Intelligence Analyst Tools
    McGuinness – Inference Web
  • Discussion
    • Semantic technologies can help in many ways – we have demonstrated their use in integration, discovery, access, validation, …
    • Many subject area ontologies exist… and some are modular enough and vetted enough and maintained enough to depend on
    • Moving from semantically-enabled systems to semantically-enabled frameworks is part of our present and future and we think it will be for others
    • Provenance is critical and should be part of the design from day 1 (not an afterthought)…. And languages and tools are emerging
    • Linked data can play a role – e.g., SemantAqua
    • Things you might consider:
      • Use our framework / tools / tutorials such as linked data, Inference Web, Ontologies, SESF
      • Contribute your ontologies, tools, use cases to SESF
      • Collaborate with us…………..
      • Questions: dlm@cs.rpi.edu
  • Tropopause http://aerosols.larc.nasa.gov/volcano2.swf
  • Atmosphere Use Case
    • Determine the statistical signatures of both volcanic and solar forcings on the height of the tropopause
    • From paleoclimate researcher – Caspar Ammann – Climate and Global Dynamics Division of NCAR - CGD/NCAR
    • Layperson perspective:
    • - look for indicators of acid rain in the part of the atmosphere we experience…
    • (look at measurements of sulfur dioxide in relation to sulfuric acid after volcanic eruptions at the boundary of the troposphere and the stratosphere)
    • NASA-funded effort with Fox (NCAR->RPI), Sinha (Va. Tech), Raskin (JPL)
  • Use Case: A Volcano Erupts
    • Preferentially it’s a tropical mountain (+/- 30 degrees of the equator) with ‘acidic’ magma (more SiO2), and it erupts with great intensity so that material and large amounts of gas are injected into the stratosphere.
    • The SO2 gas converts to H2SO4 (sulfuric acid) + H2O (75% H2SO4 + 25% H2O). The half-life of SO2 is about 30-40 days.
    • The sulfuric acid condenses into little super-cooled liquid droplets. These are the volcanic aerosol that will linger around for a year or two.
    • Brewer-Dobson circulation of the stratosphere will transport aerosol to higher latitudes. The particles generate great sunsets, most commonly first seen in fall of the respective hemisphere. The sunlight gets partially reflected; some part gets scattered in the forward direction.
    • The result is that the direct solar beam is reduced, yet diffuse skylight increases. The scattering is responsible for the colorful sunsets, as more and more of the blue wavelengths are scattered away. In mid-latitudes the volcanic aerosol starts to settle, but the most efficient removal from the stratosphere is through tropopause folds in the vicinity of the storm tracks.
    • If particles get over the pole, which happens in spring of the respective hemisphere, then they will settle down and fall onto polar ice caps. It’s from these ice caps that we recover annual records of sulfate flux or deposit.
    • We get ice cores that show continuous deposition information. Nowadays we measure sulfate, or SO4(2-). Earlier measurements were indirect, putting an electric current through the ice and measuring the delay. With acids present, the electric flow would be faster.
    • What we are looking for are pulse-like events with a build-up over a few months (mostly in summer, when the vortex is gone), and then a decay of the peak of about 1/e in 12 months.
    • The distribution of these pulses was found to follow an extreme value distribution (Fréchet) with a heavy tail.
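As a rough worked example of the two time scales in this use case, the sketch below computes how much SO2 remains after a given number of days and how far a sulfate deposition peak has decayed after a given number of months. It assumes simple exponential decay in both cases; the 35-day half-life is an assumed midpoint of the 30-40 day range quoted above, and the function and constant names are illustrative.

```python
import math

SO2_HALF_LIFE_DAYS = 35.0     # assumption: midpoint of the 30-40 day range
PEAK_EFOLDING_MONTHS = 12.0   # the peak decays to about 1/e in 12 months


def so2_fraction_remaining(days, half_life_days=SO2_HALF_LIFE_DAYS):
    """Fraction of the injected SO2 not yet converted after `days`."""
    return 0.5 ** (days / half_life_days)


def peak_fraction_remaining(months, efolding_months=PEAK_EFOLDING_MONTHS):
    """Fraction of the deposition peak left after `months` of 1/e decay."""
    return math.exp(-months / efolding_months)


for days in (35, 70, 105):
    print(f"SO2 remaining after {days} days: {so2_fraction_remaining(days):.3f}")

for months in (6, 12, 24):
    print(f"Peak remaining after {months} months: {peak_fraction_remaining(months):.3f}")
```

The contrast between the two scales is the point: under these assumptions most of the gas has converted within a few months, while the aerosol signal recorded in the ice fades over a year or more.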
  • Inference Web: Making Data Transparent and Actionable Using Semantic Technologies
    • How and when does it make sense to use smart system results & how do we interact with them?
    Knowledge Provenance in Virtual Observatories; Hypothesis Investigation / Policy Advisors; (Mobile) Intelligent Agents; Intelligence Analyst Tools; NSF Interops: SONET, SSIII – Sea Ice
  • Core and framework semantics
  • Ontology Spectrum
    • Catalog/ID
    • Terms/glossary
    • Thesauri “narrower term” relation
    • Informal is-a
    • Formal is-a
    • Formal instance
    • Frames (properties)
    • Value Restrs.
    • Disjointness, Inverse, part-of…
    • General Logical constraints
    From 99 AAAI panel, 2000 Dagstuhl talk
  • Virtual Observatory (VSTO)
    • General: Find data subject to certain constraints and plot appropriately
    • Specific: Plot the observed/measured Neutral Temperature as recorded by the Millstone Hill Fabry-Perot interferometer while looking in the vertical direction at any time of high geomagnetic activity in a way that makes sense for the data.
    November 9, 2006
  • VSTO Results
    • Many Benefits:
      • Reduced query formation from 8 to 3 steps and reduced choices at each stage
      • Allowed scientists to get data from instruments they never knew of before (e.g., photometers in example)
      • Supported augmentation and validation of data
      • Useful and related data provided without having to be an expert to ask for it
      • Integration and use (e.g. plotting) based on inference
      • Ask and answer questions not possible before
    • But Needed Provenance (SPCDIS, PML), reusability & modularity (SESF)
      • Deborah McGuinness, Peter Fox, Luca Cinquini, Patrick West, Jose Garcia, James L. Benedict, and Don Middleton. The Virtual Solar-Terrestrial Observatory: A Deployed Semantic Web Application Case Study for Scientific Research. In the Proceedings of the Nineteenth Conference on Innovative Applications of Artificial Intelligence (IAAI-07). Vancouver, British Columbia, Canada, July 22-26, 2007.
      • Peter Fox, Deborah L. McGuinness, Luca Cinquini, Patrick West, Jose Garcia, James L. Benedict, and Don Middleton. Ontology-supported Scientific Data Frameworks: The Virtual Solar-Terrestrial Observatory Experience. In Computers and Geosciences - Elsevier. Volume 35, Issue 4 (2009).
  • VSTO Instrument
  • VSTO Infrastructure
  • November 9, 2006 Deborah L. McGuinness Partial exposure of Instrument class hierarchy - users seem to LIKE THIS
  • Users Require Provenance!
    • Users demand it! If users (humans and agents) are to use, reuse, and integrate system answers, they must trust them.
    • Intelligence analysts: (from DTO/IARPA’s NIMD)
    • Andrew Cowell, Deborah McGuinness, Carrie Varley, and David A. Thurman. Knowledge-Worker Requirements for Next Generation Query Answering and Explanation Systems. Proc. of Intelligent User Interfaces for Intelligence Analysis Workshop, Intl Conf. on Intelligent User Interfaces (IUI 2006), Sydney, Australia.
    • Intelligent Assistant Users: (from DARPA’s PAL/CALO)
    • Alyssa Glass, Deborah L. McGuinness, Paulo Pinheiro da Silva, and Michael Wolverton. Trustable Task Processing Systems. In Roth-Berghofer, T., and Richter, M.M., editors, KI Journal, Special Issue on Explanation, Künstliche Intelligenz, 2008.
    • Virtual Observatory Users: (from NSF’s VSTO)
    • Deborah McGuinness, Peter Fox, Luca Cinquini, Patrick West, Jose Garcia, James L. Benedict, and Don Middleton. The Virtual Solar-Terrestrial Observatory: A Deployed Semantic Web Application Case Study for Scientific Research. Proc. of the Nineteenth Conference on Innovative Applications of Artificial Intelligence (IAAI-07). Vancouver, British Columbia, Canada.
    • And… as systems become more diverse, distributed, embedded, and depend on more varied data and communities, more provenance and more types are needed
  • Advanced Coronal Observing System (ACOS) Provenance Use Cases
    • What were the cloud cover and seeing conditions during the observation period of this image?
    • What calibrations have been applied to this image?
    • Why does this image look bad?
  • ACOS Data Ingest
    • Typical science data processing pipelines
    • Distributed
    • Some metadata in silos
    • Much metadata lost
    • Many human-in-loop decisions, events
    • No metadata infrastructure for any user
    • Community is broadening
    Chromospheric Helium Imaging Photometer (CHIP) Data Ingest ACOS – Advanced Coronal Observing System
  • PML Usage in SPCDIS
    • Justification
      • Explanation
      • Causality graph
    • Provenance
      • Conclusion
      • Source
      • Engine
      • Rule
    • Trust
      • Trust/Belief metrics
    Diagram labels: NodeSet (each with Justification and Conclusion, shown three times); Engine; Rule; SourceUsage; Source; DateTime; properties hasAntecedentList, hasSourceUsage, hasInferenceRule, hasInferenceEngine
  • A PML-Enhanced Image CHIP Quick-Look CHIP PML-Enhance Quick-Look provenance
  • Integrated View
    • Observer log’s information added into quicklook image’s provenance
  • Provenance aware faceted search Tetherless World Constellation
  • Technologies
    • Semantic Web methodology
    • Medium weight ontologies (although adapted from existing ontologies)
    • Access to data
    • Mapping info / services
    • Reasoning (previous application was linking and exploration)
    • Note – this project was operational in 8 months and is still in use years later
  • Semantically-Enabled Systems -> Semantically-Enabled Frameworks
    • We could continue to build somewhat extensible and reusable systems…. But
    • We wanted broader base of builders and users
    • Frameworks provide many entry and exit points and re-usable (hopefully) seamless components
    • Open source ontologies and software!
    • We love partners in this endeavor…
  • Background
    • Began knowledge environment for GeoSciences discussions – early 2000s
    • Chose a particular interdisciplinary virtual observatory (VSTO) powered by semantic technologies
    • Use case driven – in solar and solar-terrestrial physics with an emphasis on instrument-based measurements and real data pipelines
    • First step – proof of concept semantically-enabled pilot – VSTO quite successful
    • We pushed semantics into applications that were already built on advanced cyberinfrastructure
  • Background II
    • Provenance demands led to Semantic Provenance Capture for Data Ingest Systems
    • Test in new domains – Semantically-Enabled Scientific Data Integration – predict climate impacts following volcanic eruption
    • Reuse worked: semantic integration, semantic provenance, (with modularization and tool requests)
    • Goal now – configurable, re-usable framework with embedded toolkit
  • Framework overview Tetherless World Constellation
  • Semantic Web Methodology and Technology Development Process
    • James L. Benedict, Deborah L. McGuinness, and Peter Fox. A Semantic Web-based Methodology for Building Conceptual Models of Scientific Information. In American Geophysical Union, Fall Meeting (AGU2006), San Francisco, Ca., December, 2007. Eos Trans. AGU 88(52), Fall Meet. Suppl., Abstract IN53A-0950.
  • Application integration with smart, scalable search
    • Rozell et al.
  • Core and framework semantics
  • Status & Discussion
    • Ontology and tool re-use in process or beginning with many projects
      • VSTO re-implementation
      • BCO-DMO (biological and chemical oceanography)
      • Semantic Sea Ice (NSF Interop project)
      • Scientific Observations Network (SONET – NSF Interop)
      • National Ecological Observatory Network (NEON)
      • CSIRO Water Monitoring
      • Your Project Here!
    • Modularization in process
    • Tools like S2S in place and being tested
  • Commonalities
    • Applications – simple linking; integration of many existing vocabularies, simple inference
    • Encoding of meaning – often lightweight – ontologies
    • Semantic Web methodology
    • Often light weight data encodings – triple stores
    • Usually simple reasoners
    • Provenance encodings
    • These are all options that can be used incrementally and at varying degrees of sophistication.
    • While initial applications are often on larger platforms, many can be adapted to mobile platforms
  • Comments
    • Broader groups of people are now building linked data applications – e.g., hackathons for linked govt data, TWC/Elsevier Hackathon, Health 2.0 , etc.
    • Broader groups of people are now building Virtual Observatories AND wanting to integrate more data, disciplines, etc.
    • More interest in encodings of meaning to create smarter and more context-aware applications
    • Growing demand for provenance for attribution, trust, transparency
    • More applications are moving to mobile and becoming ubiquitous
    • More data from sensors and from open data initiative is fueling some applications
    • Things you might consider:
      • Use our framework / tools / tutorials such as linked data, Inference Web, Ontologies, SESF
      • Contribute your modules to SESF
      • Collaborate with us…………..
      • Questions: dlm@cs.rpi.edu
  • Extras
  • Ontology
    • Regulation Ontology
      • Model federal and state water quality regulations for drinking water sources
      • Can be used to define limits: for example, in California, “a measurement value of 0.01 mg/L is the limit for Arsenic”
      • Combined with the core ontology, we can infer that “any water source that contains 0.01 mg/L of Arsenic is a polluted water source.”
    Portion of Cal. Regulation Ontology.
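The inference described on this slide can be approximated procedurally. The sketch below is a hedged illustration in plain Python, not the portal's OWL ontology or reasoner: the limit table, the Measurement class, and the is_polluted function are hypothetical, and treating a value at or above the limit as a violation is an assumption taken from the slide's wording.

```python
from dataclasses import dataclass

# Hypothetical regulation table: (jurisdiction, contaminant) -> limit in mg/L.
# The single entry reflects the California Arsenic example on the slide.
LIMITS_MG_PER_L = {("CA", "Arsenic"): 0.01}


@dataclass(frozen=True)
class Measurement:
    source: str           # name of the water source (illustrative)
    jurisdiction: str     # which regulation applies
    contaminant: str
    value_mg_per_l: float


def is_polluted(m: Measurement) -> bool:
    """True when the measurement reaches the regulated limit (assumed: at or above)."""
    limit = LIMITS_MG_PER_L.get((m.jurisdiction, m.contaminant))
    if limit is None:
        return False      # no regulation known for this pair, so nothing to violate
    return m.value_mg_per_l >= limit


samples = [
    Measurement("well-A", "CA", "Arsenic", 0.01),
    Measurement("well-B", "CA", "Arsenic", 0.005),
]
for sample in samples:
    print(sample.source, "polluted" if is_polluted(sample) else "ok")
```

Keeping the limits in a separate table mirrors the split on the slide between a regulation ontology, which varies by state, and a core ontology that holds the general rule.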
  • Visualization
    • Map Visualization:
      • Presents analyzed results with Google Map
      • Presents explanation on why a water source is marked as polluted
      • Use “Facet” type filter to select type of data
    http://was.tw.rpi.edu/swqp/map.html
  • Selected Follow-up options Limit Violation
  • PopSciGrid in Action http://logd.tw.rpi.edu/demo/tax-cost-policy-prevalence
  • Directions
    • Use sensed personal data to provide context and integrate with aggregated data to provide actionable health advisors – diet & nutrition, exercise, etc.
    • Use PopsciGrid model for other data, e.g., CLASS data about exercise and nutrition in schools
    • Relate to health impacts
    • Expose provenance more effectively
  • Tetherless Faceted Browsing
  • PopSciGrid Revisited (workflow diagram; labels: Data sets, simple ontology; Provenance tools; Visualization tools; Ban coverage; steps CSV2RDF4LOD Direct, SemDiff, Archive, CSV2RDF4LOD Enhance, visualize, Publish; edges derive, integrate, archive)
  • VSTO DataProduct
  • Semantic Web Methodology McGuinness, Fox, West, Garcia, Cinquini, Benedict, Middleton http://www.vsto.org
  • Semantic Provenance Capture for Data Ingest Systems (SPCDIS)
    • Fact: Scientific data services are increasing in usage and scope, and with these increases comes growing need for access to provenance information.
    • Provenance Project Goal : to design a reusable, interoperable provenance infrastructure.
    • Science Project Goal: design and implement an extensible provenance solution that is deployed at the science data ingest/ product generation time.
    • Outcome: implemented provenance solution in one science setting AND operational specification for other scientific data applications.
    • Extends vsto.org
  • PML in Action
    • This is the PML provenance encoding for a “quick look” gif file, which is generated from two image datasets
    Node set for the quicklook gif file. hasConclusion: a reference to the gif file itself. InferenceStep: how the gif file was derived (hasAntecedents, hasInferenceRule, hasInferenceEngine). The “antecedents” of the quicklook gif file are other node sets.
  • CHIP Pipeline (Chromospheric Helium Image Photometer)
    • Raw Data Capture: raw image data captured by CHIP (Chromospheric Helium-I Image Photometer) at the Mauna Loa Solar Observatory (MLSO), Hawaii
    • Follow-up Processing on Raw Data (e.g., Flat Field Calibration) at the National Center for Atmospheric Research (NCAR) Data Center, Boulder, CO
    • Quality Checking (Images Graded: GOOD, BAD, UGLY)
    • Publishes: Intensity Images (GIF), Velocity Images (GIF)
  • Core and Framework Semantics - Multi-tiered interoperability used by
  • TWC and the Semantic Web Layer Cake (diagram labels): SPARQL to Xquery translator; RDFS materialization (Billion triple winner); Govt metadata search; Linked Open Govt Data; SPARQL WG, earlier QL – OWL-QL, Classic’ QL, …; OWL 1 & 2 WG: Edited main OWL Docs, quick reference, OWL profiles (OWL RL); Earlier languages: DAML, DAML+OIL, Classic; RIF WG; AIR accountability tool; DL, KIF, CL, N3Logic; Inference Web, Proof Markup Language, W3C Provenance Working group formal model; Inference Web IW Trust, Air + Trust; Visualization APIs; S2S; Govt Data; Ontology repositories (Ontolingua); Ontology Evolution env: Chimaera; Semantic eScience Ontologies; MANY other ontologies; Transparent Accountable Datamining Initiative (TAMI)
  • SemantAqua (part of SemantEco)
    • Enable/Empower citizens & scientists to explore water pollution sites, facilities, and regulations along with provenance.
    • Demonstrates semantic web technologies in environmental informatics systems.
    • Map presentation of analysis
    • Explanations and Provenance available
    • Use “Facet” type filter to select type of data
    http://was.tw.rpi.edu/swqp/map.html
  • System Architecture access Virtuoso
  • Ontology
    • Core TWC Water ontology
      • Extends existing best practice ontologies, e.g. SWEET, OWL-Time.
      • Includes terms for relevant pollution concepts
      • Can use to conclude: “any water source that has a measurement outside of its allowable range” is a polluted water source.
    Portion of the TWC Water Ontology.
  • Provenance
    • Preserves provenance in the Proof Markup Language (PML).
    • Data Source Level Provenance:
      • The captured provenance data are used to support provenance-based queries.
    • Reasoning level provenance:
      • When a water source has been marked as polluted, the user can access supporting provenance data for the explanations, including the URLs of the source data, intermediate data, and the converted data.
  • Some Foundations
    • Growing body of Open Linked Data
    • Growth and acceptance of ontologies and ontology-enabled service
    • RPI Tetherless World backend tools and service
      • LOGD
      • Inference Web and Proof Markup Language
      • eScience ontologies and infrastructure
  • The Tetherless World Constellation Linked Open Government Data Portal (TWC LOGD). Diagram labels: Create; Convert; Enhance; Query/Access; LOGD SPARQL Endpoint
    • RDF
    • RSS
    • JSON
    • XML
    • HTML
    • CSV
    Community Portal Data.gov deployment
  • A PML-Enhanced Image CHIP Quick-Look CHIP PML-Enhance Quick-Look provenance