WOW13_RPITWC_Web Observatories

Exploration in Web Science:
Instruments for Web
Observatories
Presented by:
Kristine Gloria
Co-authors: Deborah McGuinness and Joanne Luciano
The Tetherless World Constellation
Rensselaer Polytechnic Institute, Troy, NY
With thanks to the extended RPI Tetherless World Team

Agenda
I. Web Observatories at RPI’s Web Science
Research Center
II. Web Observatory Themes
III. Science Data
IV. Health and Life Sciences
V. Open Government
VI. Social Spaces

Web Observatories @ WSRC
At RPI WSRC, our observatories present both
tools and methodologies that empower
researchers to study the web and to make a
difference in the world

Web Observatories Themes
Science Data Observatory
Health & Life Sciences
Observatory
Open Government Observatory
Social Spaces Observatory

Web Observatory Theme
Open Government Observatory

Open Government Data
TWC: International Open Government Data Sets

Web Observatories Themes
Science Data Observatory

SemantAqua
• Enable/Empower citizens &
scientists to explore pollution
sites, facilities, regulations, and
health impacts along with
provenance
• Demonstrates semantic
monitoring possibilities
• Extend to endangered species
and resource management issues
• Explanations and Provenance
available
1. Map view of analyzed results
2. Explanation of pollution
3. Possible health effect of contaminant (from EPA)
4. Filtering by facet to select type of data
5. Link for reporting problems
6. Extended with input from USGS, with population counts for birds & fish

Example Workflow
(SemantAqua)
[Figure: SemantAqua workflow diagram. Stage labels: Archive, CSV2RDF4LOD Direct, CSV2RDF4LOD Enhance, derive, integrate, archive, Publish, visualize]
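As a rough illustration of what a "direct" CSV-to-RDF step in such a workflow might look like, here is a minimal sketch in Python. It is not csv2rdf4lod's actual implementation: the `BASE` namespace, the URI layout, the function names, and the sample rows are all hypothetical, chosen only to show one row becoming one subject and one cell becoming one triple.

```python
import csv
import io
from urllib.parse import quote

BASE = "http://example.org/dataset/"  # hypothetical namespace, not from the slides


def literal(value: str) -> str:
    """Escape a string as an N-Triples plain literal."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def csv_to_ntriples(text: str, dataset: str) -> list:
    """Turn each CSV row into one subject and each non-empty cell into one triple."""
    triples = []
    reader = csv.DictReader(io.StringIO(text))
    for row_number, row in enumerate(reader, start=1):
        subject = f"<{BASE}{quote(dataset)}/row/{row_number}>"
        for column, cell in row.items():
            if cell is None or cell == "":
                continue  # skip empty cells rather than emit empty literals
            name = quote(column.strip().replace(" ", "_"))
            predicate = f"<{BASE}{quote(dataset)}/vocab/{name}>"
            triples.append(f"{subject} {predicate} {literal(cell)} .")
    return triples


# Made-up sample data, for illustration only.
SAMPLE = "site,contaminant,value\nA1,arsenic,0.02\nB7,lead,\n"
for line in csv_to_ntriples(SAMPLE, "water-sample"):
    print(line)
```

An "enhance" step would then presumably replace the per-dataset predicates and string literals with shared vocabulary terms and typed values; that part is not sketched here.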

Semantic Methodology and
Semantic Application Evolution
Originally developed for Virtual Observatories (in
solar-terrestrial), now in water quality, sea ice, volcanology,
mycology, oceans…
McGuinness, Fox, West, Garcia, Cinquini, Benedict,
Middleton. The Virtual Solar-Terrestrial Observatory: A
Deployed Semantic Web Application Case Study for
Scientific Research. Proc. 19th Conf. on Innovative
Applications of Artificial Intelligence (IAAI-07).
http://www.vsto.org
SemantAqua -> SemantEco -> DataOne
- modularizing, broadening,
provenance, interaction
VSTO -> SESDI -> SPCDIS
- modularizing, provenance,
broadening, interaction

Web Observatory Theme
Health & Life Sciences
Observatory

Department of Health and Human Services'
Developer Challenge
In June 2012, HHS issued the first of its seven challenges calling for
developers “to make high value health data more accessible to
entrepreneurs, researchers, and policy makers in the hopes of better
health outcomes for all.”
A group from RPI TWC won first place in the competition by using
semantic technologies and in-house software such as
csv2rdf4lod, LODSPeaKr, Farrah and DataFAQS.
HHS wanted Metadata
"... application of existing voluntary consensus
standards for metadata common to all open
government data"
RPI TWC submitted:
•DCAT - W3C Data Catalog
◦Version controlled on GitHub.
◦Extracted from their CKAN as input to
converter.
•VoID - W3C Vocabulary of Interlinked
Datasets
◦Organized datasets by source, dataset,
version.
◦Provided links to data dumps, Linksets to
LOD.
•PROV - W3C Provenance Interchange
Model
◦Captured during CKAN extraction, retrieval,
conversion, and publishing.
•Dublin Core Metadata Terms
◦Annotated subjects based on descriptions.
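To make the combination of vocabularies concrete, the following Python sketch emits a small Turtle description of one dataset using terms from DCAT, Dublin Core, VoID, and PROV. The namespace URIs and the terms used (`dcat:Dataset`, `dcterms:title`, `dcterms:subject`, `void:dataDump`, `prov:wasDerivedFrom`) are the published ones; the function name, the `example.org` URIs, and the choice of which properties to attach are assumptions for illustration, not the team's actual metadata.

```python
PREFIXES = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dcterms": "http://purl.org/dc/terms/",
    "void": "http://rdfs.org/ns/void#",
    "prov": "http://www.w3.org/ns/prov#",
}


def describe_dataset(uri, title, subjects, dump_url, source_uri):
    """Return a Turtle document describing one dataset (hypothetical helper)."""
    safe_title = title.replace("\\", "\\\\").replace('"', '\\"')
    lines = [f"@prefix {prefix}: <{ns}> ." for prefix, ns in PREFIXES.items()]
    lines.append("")
    lines.append(f"<{uri}>")
    # One resource typed in three vocabularies at once.
    lines.append("    a dcat:Dataset, void:Dataset, prov:Entity ;")
    lines.append(f'    dcterms:title "{safe_title}" ;')
    for subject in subjects:
        lines.append(f"    dcterms:subject <{subject}> ;")
    lines.append(f"    void:dataDump <{dump_url}> ;")
    lines.append(f"    prov:wasDerivedFrom <{source_uri}> .")
    return "\n".join(lines)


# All URIs below are placeholders, not real catalog entries.
turtle = describe_dataset(
    uri="http://example.org/dataset/hospital-compare",
    title="Hospital Compare",
    subjects=["http://example.org/subject/hospitals"],
    dump_url="http://example.org/dumps/hospital-compare.ttl.gz",
    source_uri="http://example.org/ckan/hospital-compare",
)
print(turtle)
```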
HHS wanted Classification
"...classify datasets in our growing catalog,
creating entities, attributes and relations that form
the foundations for better discovery,
integration..."
RPI TWC presented:
•Bottom-up vocabulary and entity reuse
◦Vocabulary created for each dataset
◦Enhanced datasets shifted to reuse vocabulary
and entities from other datasets.
◦Three stub vocabularies for top-level reuse.
•NCBO (National Center for Biomedical Ontology)
Annotations
◦annotator/annotator.py SADI service
◦data/source/bioontology-org/annotator-description-subject/version/retrieve.sh
HHS wanted Liquidity
"new designs ... that form the foundations for ... liquidity"
RPI TWC provided: 2B triples among 1M URIs
•Dataset Linked Data
◦Machine and Human views (via conneg)
◦Faceted search of datasets
•Dataset dumps (.ttl.gz)
◦For each dataset, and for the whole thing.
•Dataset query (http://healthdata.tw.rpi.edu/sparql)
Text https://github.com/jimmccusker/twc-h
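The "machine and human views (via conneg)" and "dataset query" items can be sketched with Python's standard library. The snippet only builds the HTTP requests and sends nothing. The endpoint URL is the one on the slide; whether it is still reachable, and which media types the server actually supports, are not verified here. The helper names and the example query are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "http://healthdata.tw.rpi.edu/sparql"  # from the slide; availability not verified


def negotiated_request(uri, machine=True):
    """Ask for RDF (machine view) or HTML (human view) of the same URI."""
    accept = "text/turtle" if machine else "text/html"
    return Request(uri, headers={"Accept": accept})


def sparql_request(query, endpoint=ENDPOINT):
    """Build a SPARQL Protocol GET request with the query in the URL."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


# Example query (an assumption about how datasets are typed in the store).
QUERY = "SELECT ?dataset WHERE { ?dataset a <http://www.w3.org/ns/dcat#Dataset> } LIMIT 5"
req = sparql_request(QUERY)
print(req.full_url)
print(req.get_header("Accept"))
```

Passing either request to `urllib.request.urlopen` would perform the actual call; that step is left out so the sketch runs offline.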

Web Observatory Themes
Social Spaces Observatory

Twitter Network Observatory
Makani, B. & Zhang, Q.
• Explores the relationships
of people and semantics in
the graph database
• Basic functions:
• Users can visualize and
analyze different types of
sub-graphs
• Performs a set of basic
analyses for other
COSMIC Groups
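A minimal sketch of the kind of sub-graph selection and basic analysis described above, written in plain Python rather than against a graph database. The edge list, relation names, and function names are made up for illustration; the observatory's actual data model and queries are not shown in the slides.

```python
from collections import defaultdict

# Made-up edges: (source, relation, target).
EDGES = [
    ("alice", "follows", "bob"),
    ("bob", "follows", "carol"),
    ("alice", "mentions", "carol"),
    ("carol", "mentions", "#webscience"),
    ("dave", "follows", "alice"),
]


def subgraph(edges, relation):
    """Keep only the edges of one relation type."""
    return [edge for edge in edges if edge[1] == relation]


def degree(edges):
    """Count how many edges touch each node, ignoring direction."""
    counts = defaultdict(int)
    for source, _, target in edges:
        counts[source] += 1
        counts[target] += 1
    return dict(counts)


follows = subgraph(EDGES, "follows")
print(degree(follows))
```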

How can we leverage Social
Media sites…
to identify these communities, and
stakeholders within them?
to gather requirements from these
communities?
First Responders, including Emergency Medical Personnel,
Firefighters, and Police Officers, have active online communities on
Social Media websites.
First Responders (with NIST)
McGuinness, Erickson, Chastain, Fry, Yan, Zhu
http://tw.rpi.edu/web/project/FirstResponders
Find Topics:
Find Users:
