Big Data R&D Strategy - Ensure the long term sustainability, access, and deve...Sky Bristol
Presentation on one of the strategic themes being considered for a U.S. Government Big Data R&D strategy - https://www.nitrd.gov/bigdata/rfi/02102014.aspx.
Deep Earth Computer: A Platform for Linked Science of the Deep Carbon Obser...Xiaogang (Marshall) Ma
Deep Carbon Observatory-Data Science is assembling a Deep Earth Computer for the Deep Carbon Observatory (DCO). The efforts will create a fundamental change in the conduct of Carbon-related research, resting upon a 21st century data science platform, and a series of aggregate data holdings that have never existed before. Data science combines aspects of informatics, data management, library science, computer science and physical science using cyberinfrastructure and information technology. The Deep Earth Computer we build provides these functions at minimum: a concept-type repository, an ability to identify and manage all key entities, agents and activities in the platform, a repository for archiving datasets and associated metadata, collaboration tools, and an integrated portal to manage diverse content and applications, with varied access levels and privacy options. The Deep Earth Computer sets up a platform for the Linked Science of the Deep Carbon Community; that is, not only are scientific assets like data and methods behind scientific settings opened and inter-connected, but the people, organizations, groups, samples, instruments, activities, grants, meetings, etc. are also recorded and inter-connected. Such a platform will promote collaborations among DCO community members, improve the openness and reproducibility of Carbon-related research, and facilitate accreditation to resource (including publications, datasets, instruments, etc.) contributors.
Presentation about the IGSN and ongoing initiatives for the Internet of Samples at the EGU 2015 short course "Open Science Goes Geo: Beyond Data and Software".
Making Small Data BIG (UT Austin, March 2016)Kerstin Lehnert
Presentation given at the Texas Advanced Computing Center. It describes the potential of re-using small data for new science, achievements and the challenges to make small data re-usable.
Building on the Atlas (of Living Australia)Andrew Treloar
Presentation given at Atlas of Living Australia Science Symposium 2013. Discusses Australian National Data Service Applications program and two specific projects: Soils to Satellites (also involving TERN), and Edgar Bird Species distribution.
Research Data Infrastructure for Geochemistry (DFG Roundtable)Kerstin Lehnert
This presentation provides an overview of different aspects of data management for geochemistry and resources available at the EarthChem@IEDA data facility.
GeoChronos: An On-line Collaborative Platform for Earth Observation ScientistsGeoChronos
Presentation given by John Gamon at the AGU Fall Meeting in San Francisco on Dec. 14, 2009. The presentation highlights features and supporting technologies of the GeoChronos Platform
Filtergraph: A fast, flexible and sharable service for visualization in big d...Dan Burger
My talk from the Data Visualization Summit in Boston on September 12, 2013.
This talk will explore Filtergraph, a web application being developed by Vanderbilt’s Initiative in Data-Intensive Astrophysics to conduct rapid and intuitive visualization of large multi-dimensional datasets. Filtergraph has been designed with an understanding of the cognitive workflow in big-data exploration, enabling users to quickly, easily, and intuitively delve into massive datasets and quickly find emergent patterns in the data. While Filtergraph was designed for astronomy research projects at Vanderbilt, including searches for extrasolar planets from databases involving millions of stars, Filtergraph has broad potential for generating flexible, colorful and interactive data-visualization portals using a wide variety of data sources. Currently, Filtergraph has more than 100 users in 20 countries. Filtergraph is freely available at http://filtergraph.vanderbilt.edu/.
FAIRy stories: the FAIR Data principles in theory and in practiceCarole Goble
https://ucsb.zoom.us/meeting/register/tZYod-ippz4pHtaJ0d3ERPIFy2QIvKqjwpXR
FAIRy stories: the FAIR Data principles in theory and in practice
The ‘FAIR Guiding Principles for scientific data management and stewardship’ [1] launched a global dialogue within research and policy communities and started a journey to wider accessibility and reusability of data and preparedness for automation-readiness (I am one of the army of authors). Over the past 5 years FAIR has become a movement, a mantra and a methodology for scientific research and increasingly in the commercial and public sector. FAIR is now part of NIH, European Commission and OECD policy. But just figuring out what the FAIR principles really mean and how we implement them has proved more challenging than one might have guessed. To quote the novelist Rick Riordan “Fairness does not mean everyone gets the same. Fairness means everyone gets what they need”.
As a data infrastructure wrangler I lead and participate in projects implementing forms of FAIR in pan-national European biomedical Research Infrastructures. We apply web-based industry-led approaches like Schema.org; work with big pharma on specialised FAIRification pipelines for legacy data; promote FAIR by Design methodologies and platforms into the researcher lab; and expand the principles of FAIR beyond data to computational workflows and digital objects. Many use Linked Data approaches.
In this talk I’ll use some of these projects to shine some light on the FAIR movement. Spoiler alert: although there are technical issues, the greatest challenges are social. FAIR is a team sport. Knowledge Graphs play a role – not just as consumers of FAIR data but as active contributors. To paraphrase another novelist, “It is a truth universally acknowledged that a Knowledge Graph must be in want of FAIR data.”
[1] Wilkinson, M., Dumontier, M., Aalbersberg, I. et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data 3, 160018 (2016). https://doi.org/10.1038/sdata.2016.18
These slides were presented in a session that we organized at the American Association for Advancement of Science (AAAS) meeting in Chicago, February 2009.
Abstract: New laboratory devices, sensor networks, high-throughput instruments, and numerical simulation systems are producing data at rates that are both without precedent and rapidly growing. The resulting increases in the size, number, and variety of data are revolutionizing scientific practice. These changes demand new computing infrastructures and tools. Until recently, most laboratories and collaborations managed their own data, operated their own computers, and used remote high-performance computers only when required. We are moving to a paradigm in which data will primarily be located and managed on remote clusters, grids, and data centers. In this symposium, we will examine the computing infrastructure designed to serve this emerging era of data-intensive computing from three perspectives: (1) that of grid computing, which enables the creation of virtual organizations that can share remote and distributed resources over the Internet; (2) that of data centers, which are transitioning to providers of integrated storage, data, compute, and collaboration services (the offering of one or more of these integrated services over the Internet is beginning to be called cloud computing); and (3) that of e-science, in which grids, Web 2.0 technologies, and new collaboration and analysis services are merging and changing the way science is conducted. Each speaker will focus on one perspective but also compare and contrast with the others.
Webinar presented on December 5, 2012, by Joan Starr and Perry Willett of CDL/UC3, and Lisa Federer and Claudia Horning from UCLA. Part of the ACRL Digital Curation Interest (DCIG) Group Webinar Series.
LSST Education and Public Outreach (EPO) Amanda Bauer
A talk on the LSST Education and Public Outreach program delivered at the joint LSST Science Collaboration Chairs/Project Science Team telecon on July 18, 2017.
Presented as a Pecha Kucha at Web Science 2013 (Paris), this presentation focuses on a post-modern interpretation of "data" leveraging social theories of Goffman and Foucault.
Data Facilities Workshop - Panel on Current Concepts in Data Sharing & Intero...EarthCube
This series of presentations was given at the EarthCube Data Facilities End-User Workshop held January 15-17, 2014 in Washington, DC. This workshop provided a forum to discuss the unique requirements and challenges associated with developing the communication, collaboration, interoperability, and governance structures that will be required to build EarthCube in conjunction with existing and emerging NSF/GEO facilities.
This panel and discussion, specifically, outlined and explained several current concepts in data sharing and interoperability, featuring presentations by:
Paul Morin (UMN): Polar Cyberinfrastructure
Don Middleton (UCAR): Atmospheric/Climate
Kerstin Lehnert (LDEO): Domain Repositories & Physical Samples
David Schindel (CBOL, GRBio): Biological Perspective & Collections
Hank Loescher (NEON): Observation Networks
Daniel Fuka (Virginia Tech) and Ruth Duerr (NSIDC): Brokering
Ilya Zaslavsky (UCSD): Cross-Domain Interoperability
Lecture for a course at NTNU, 27th January 2021
CC-BY 4.0 Dag Endresen https://orcid.org/0000-0002-2352-5497
See also http://bit.ly/biodiversityinformatics
https://www.gbif.no/events/2021/lecture-ntnu-gbif.html
Scott Edmunds: GigaScience - a journal or a database? Lessons learned from th...GigaScience, BGI Hong Kong
Scott Edmunds talk at the HUPO congress in Geneva, September 6th 2011 on GigaScience - a journal or a database? Lessons learned from the Genomics Tsunami.
A VIVO VIEW OF CANCER RESEARCH: Dream, Vision and RealityPaul Courtney
Presentation made by Paul Courtney (Dana-Farber Cancer Institute, Boston, MA and OHSL, MD) and Anil Srivastava (OHSL) at the 2013 VIVO conference in St. Louis, MO. Material contributed by Rubayi Srivastava (OHSL), Swati Mehta (Centre for Development of Advanced Computing, India), Juliusz Pukacki (Poznan Supercomputing and Network Center, Poland) and Devdatt Dubhashi (Chalmers Institute of Technology, Sweden).
Understanding the Big Picture of e-ScienceAndrew Sallans
A. Sallans. "Understanding the Big Picture of e-Science." Presented at the 2011 eScience Bootcamp at the University of Virginia's Claude Moore Health Sciences Library. 4 March 2011
Biodiversity Informatics: An Interdisciplinary ChallengeBryan Heidorn
"Impacto de la Informática en el Conocimiento de la Biodiversidad: Actualidad y Futuro” at Universidad Nacional de Colombia on August 12, 2011. https://sites.google.com/site/simposioinformaticaicn/home
Scott Edmunds slides for class 8 from the HKU Data Curation (module MLIM7350 from the Faculty of Education) course covering science data, medical data and ethics, and the FAIR data principles.
Open Data in a Big Data World: easy to say, but hard to do?LEARN Project
Presentation at 3rd LEARN workshop on Research Data Management, “Make research data management policies work”
Helsinki, 28 June 2016, by Sarah Callaghan, STFC Rutherford Appleton Laboratory
High Performance Cyberinfrastructure to Support Data-Intensive Biomedical Res...Larry Smarr
08.06.16
Invited Talk
Association of University Research Parks BioParks 2008
"From Discovery to Innovation"
Salk Institute
Title: High Performance Cyberinfrastructure to Support Data-Intensive Biomedical Research Instruments
La Jolla, CA
This is an overview of the Data Biosphere Project, its goals, its architecture, and the three core projects that form its foundation. We also discuss data commons.
Scratchpads: Building web communities supporting biodiversity scienceVince Smith
Presented by Dave Roberts at a meeting titled "Information Technology in Biodiversity Conservation and in Agriculture" organized by the Club of Rome and the EU ICT-ENSURE project, at UNESCO, Paris. January 15th, 2009.
Neuro-symbolic is not enough, we need neuro-*semantic*Frank van Harmelen
Neuro-symbolic (NeSy) AI is on the rise. However, simply machine learning on just any symbolic structure is not sufficient to really harvest the gains of NeSy. These will only be gained when the symbolic structures have an actual semantics. I give an operational definition of semantics as “predictable inference”.
All of this illustrated with link prediction over knowledge graphs, but the argument is general.
Securing your Kubernetes cluster: a step-by-step guide to success! KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Ramesh Iyer
In today's fast-changing business world, companies must adapt and embrace new ideas to keep up with the competition. However, fostering a culture of innovation takes considerable work. It takes vision, leadership and a willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
Connector Corner: Automate dynamic content and events by pushing a buttonDianaGray10
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Have the message received by managers and peers along with a test email for review
But there’s more:
In a second workflow supporting the same use case, you’ll see:
Your campaign sent to target colleagues for approval
If the “Approve” button is clicked, a Jira/Zendesk ticket is created for the marketing design team
But—if the “Reject” button is pushed, colleagues will be alerted via Slack message
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
And...
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
Generating a custom Ruby SDK for your web service or Rails API using Smithyg2nightmarescribd
Have you ever wanted a Ruby client API to communicate with your web service? Smithy is a protocol-agnostic language for defining services and SDKs. Smithy Ruby is an implementation of Smithy that generates a Ruby SDK using a Smithy model. In this talk, we will explore Smithy and Smithy Ruby to learn how to generate custom feature-rich SDKs that can communicate with any web service, such as a Rails JSON API.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
WOW13_RPITWC_Web Observatories
1. Exploration in Web Science: Instruments for Web Observatories
Presented by:
Kristine Gloria
Co-authors: Deborah McGuinness and Joanne Luciano
The Tetherless World Constellation
Rensselaer Polytechnic Institute, Troy, NY
With thanks to the extended RPI Tetherless World Team
2. Agenda
I. Web Observatories at RPI’s Web Science Research Center
II. Web Observatory Themes
III. Science Data
IV. Health and Life Sciences
V. Open Government
VI. Social Spaces
3. Web Observatories @ WSRC
At RPI WSRC, our observatories present both tools and methodologies that empower researchers to study the web and to make a difference in the world.
4. Web Observatories Themes
Science Data Observatory
Health & Life Sciences Observatory
Open Government Observatory
Social Spaces Observatory
8. SemantAqua
• Enables and empowers citizens and scientists to explore pollution sites, facilities, regulations, and health impacts along with provenance
• Demonstrates semantic monitoring possibilities
• Extends to endangered species and resource management issues
• Explanations and provenance available
Screenshot callouts:
1. Map view of analyzed results
2. Explanation of pollution
3. Possible health effect of contaminant (from EPA)
4. Filtering by facet to select type of data
5. Link for reporting problems
6. Extended with input from USGS, with population counts for birds & fish
10. Semantic Methodology and Semantic Application Evolution
Originally developed for Virtual Observatories (in solar-terrestrial), now in water quality, sea ice, volcanology, mycology, oceans…
McGuinness, Fox, West, Garcia, Cinquini, Benedict, Middleton. The Virtual Solar-Terrestrial Observatory: A Deployed Semantic Web Application Case Study for Scientific Research. Proc. 19th Conf. on Innovative Applications of Artificial Intelligence (IAAI-07). http://www.vsto.org
SemantAqua -> SemantEco -> DataOne: modularizing, broadening, provenance, interaction
VSTO -> SESDI -> SPCDIS: modularizing, provenance, broadening, interaction
12. Department of Health and Human Services' Developer Challenge
In June 2012, HHS issued the first of its seven challenges calling for developers “to make high value health data more accessible to entrepreneurs, researchers, and policy makers in the hopes of better health outcomes for all.”
A group from RPI TWC won first place in the competition by using semantic technologies and in-house developed software, such as csv2rdf4lod, LODSPeaKr, Farrah and DataFAQS.
HHS wanted Metadata
"... application of existing voluntary consensus standards for metadata common to all open government data"
RPI TWC submitted:
• DCAT - W3C Data Catalog
◦ Version controlled on GitHub.
◦ Extracted from their CKAN as input to converter.
• VoID - W3C Vocabulary of Interlinked Datasets
◦ Organized datasets by source, dataset, version.
◦ Provided links to data dumps, Linksets to LOD.
• PROV - W3C Provenance Interchange Model
◦ Captured during CKAN extraction, retrieval, conversion, and publishing.
• Dublin Core Metadata Terms
◦ Annotated subjects based on descriptions.
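The four vocabularies listed above can be combined in one dataset description. The following is a minimal sketch only, not the output of csv2rdf4lod or the team's actual pipeline: the function names and all example.org URIs are hypothetical, while the namespace URIs and the terms used (dcat:Dataset, void:Dataset, void:dataDump, prov:wasDerivedFrom, dcterms:title, dcterms:subject) are the published ones. It emits N-Triples using only the standard library.

```python
"""Sketch: describe one dataset with DCAT, VoID, PROV, and Dublin Core terms."""

# Published namespace URIs for the vocabularies named on the slide.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dcat": "http://www.w3.org/ns/dcat#",
    "void": "http://rdfs.org/ns/void#",
    "prov": "http://www.w3.org/ns/prov#",
    "dcterms": "http://purl.org/dc/terms/",
}


def iri(curie):
    """Expand a prefixed name such as 'dcat:Dataset' into an N-Triples IRI."""
    prefix, local = curie.split(":", 1)
    return f"<{NS[prefix]}{local}>"


def literal(text):
    """Quote a plain string literal, escaping backslashes and double quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def describe_dataset(dataset_uri, title, subject_uri, dump_url, source_url):
    """Return N-Triples describing a dataset (all argument values are caller-supplied)."""
    s = f"<{dataset_uri}>"
    triples = [
        (s, iri("rdf:type"), iri("dcat:Dataset")),               # catalog entry
        (s, iri("rdf:type"), iri("void:Dataset")),               # linked dataset
        (s, iri("dcterms:title"), literal(title)),               # Dublin Core title
        (s, iri("dcterms:subject"), f"<{subject_uri}>"),         # subject annotation
        (s, iri("void:dataDump"), f"<{dump_url}>"),              # link to a dump file
        (s, iri("prov:wasDerivedFrom"), f"<{source_url}>"),      # provenance of the conversion
    ]
    return "\n".join(f"{a} {b} {c} ." for a, b, c in triples)


if __name__ == "__main__":
    # Hypothetical placeholder values for illustration only.
    print(describe_dataset(
        "http://example.org/dataset/d1",
        "Example health dataset",
        "http://example.org/subject/health",
        "http://example.org/dumps/d1.ttl.gz",
        "http://example.org/ckan/d1",
    ))
```

Typing the same resource as both dcat:Dataset and void:Dataset is one possible design; a catalog could equally keep the two descriptions on separate resources.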
HHS wanted Classification
"...classify datasets in our growing catalog, creating entities, attributes and relations that form the foundations for better discovery, integration..."
RPI TWC presented:
• Bottom-up vocabulary and entity reuse
◦ Vocabulary created for each dataset
◦ Enhanced datasets shifted to reuse vocabulary and entities from other datasets.
◦ Three stub vocabularies for top-level reuse.
• NCBO (Nat. Center for Biomedical Ont.) Annotations
◦ annotator/annotator.py SADI service
◦ data/source/bioontology-org/annotator-description-subject/version/retrieve.sh
HHS wanted Liquidity
"new designs ... that form the foundations for ... liquidity"
RPI TWC provided: 2B triples among 1M URIs
• Dataset Linked Data
◦ Machine and human views (via conneg)
◦ Faceted search of datasets
• Dataset dumps (.ttl.gz)
◦ For each dataset, and for the whole thing.
• Dataset query (http://healthdata.tw.rpi.edu/sparql)
Text https://github.com/jimmccusker/twc-h
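The machine and human views "via conneg" and the dataset query endpoint mentioned above can be sketched as plain HTTP requests. This is an illustrative sketch, not the project's client code: the helper names and the example query are hypothetical, and whether the endpoint is still online or which media types it serves is an assumption. The requests are only constructed here, not sent.

```python
"""Sketch: content negotiation and a SPARQL GET request, built but not sent."""
from urllib.parse import urlencode
from urllib.request import Request

# Endpoint URL as given on the slide; its current availability is not verified.
SPARQL_ENDPOINT = "http://healthdata.tw.rpi.edu/sparql"


def conneg_request(resource_uri, want_rdf=True):
    """Ask for the machine view (Turtle) or the human view (HTML) of one URI."""
    accept = "text/turtle" if want_rdf else "text/html"
    return Request(resource_uri, headers={"Accept": accept})


def sparql_request(query, endpoint=SPARQL_ENDPOINT):
    """Build a SPARQL protocol GET request carrying the query in the 'query' parameter."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


# Hypothetical example query: count resources typed as void:Dataset.
COUNT_DATASETS = (
    "PREFIX void: <http://rdfs.org/ns/void#> "
    "SELECT (COUNT(?d) AS ?n) WHERE { ?d a void:Dataset }"
)

if __name__ == "__main__":
    req = sparql_request(COUNT_DATASETS)
    print(req.full_url)
    print(req.get_header("Accept"))
```

Passing either request to `urllib.request.urlopen` would perform the actual call; that step is left out so the sketch runs without network access.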
14. Twitter Network Observatory
Makani, B. & Zhang, Q.
• Explores the relationships of people and semantics in the graph database
• Basic functions:
• Users can visualize and analyze different types of sub-graphs
• Performs a set of basic analyses for other COSMIC Groups
15. How can we leverage Social Media sites…
… to identify these communities, and stakeholders within them?
… to gather requirements from these communities?
First Responders, including Emergency Medical Personnel, Firefighters, and Police Officers, have active online communities on Social Media websites.
First Responders (with NIST)
McGuinness, Erickson, Chastain, Fry, Yan, Zhu
http://tw.rpi.edu/web/project/FirstResponders
Find Topics:
Find Users:
Examples from each of these observatories: 1. Science Data Observatory: A. SemantEco B. SemantAqua
Examples from each of these observatories: 1. Open Government Observatory: A. Linked Open Government Data Portal B. International Open Government Dataset
Semantically-enabled environmental monitoring – in this case monitoring water quality. Done initially as a student project in McGuinness’ Semantic eScience class, attracted interest of USGS and has an extension done with USGS. Currently working on a cooperative agreement with USGS to continue. Also used as a model for semantically enabling monitoring of air, soil, food, etc. Project page: http://tw.rpi.edu/web/project/SemantAQUA
Examples from each of these observatories: 1. Health & Life Sciences Observatory: A. HealthData Challenge
Examples from each of these observatories: 1. Social Spaces Data Observatory: A. Twitter Network Observatory B. First Responder Twitter Network
The RPI group has been developing the Twitter Network Observatory to explore the relationships of people and semantics in the graph database. The basic functions are in place: users can visualize and analyze different types of sub-graphs selected by topic and time range. The Twitter Network Observatory performs a set of basic analyses for other COSMIC groups and users to support their purposes. We have been working on adding new functions, including selection based on time range, location, and sentiment. The network (and its topological properties) can be exported to various formats for use in other software (GraphML, XGMML, SVG, etc.).
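The sub-graph selection and GraphML export described in these notes can be illustrated with a small sketch. It is not the observatory's code: the edge record layout (source, target, topic, time) and both function names are assumptions made for illustration, and dates are ISO strings so that plain string comparison orders them correctly.

```python
"""Sketch: pick a sub-graph by topic and time range, then serialize it as GraphML."""
import xml.etree.ElementTree as ET

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"


def select_subgraph(edges, topic, start, end):
    """Keep edges on the given topic whose ISO date lies within [start, end]."""
    return [e for e in edges if e["topic"] == topic and start <= e["time"] <= end]


def to_graphml(edges):
    """Serialize a directed edge list as a minimal GraphML document string."""
    root = ET.Element("graphml", xmlns=GRAPHML_NS)
    graph = ET.SubElement(root, "graph", id="G", edgedefault="directed")
    # Every endpoint of a selected edge becomes a node, emitted once.
    nodes = sorted({e["source"] for e in edges} | {e["target"] for e in edges})
    for name in nodes:
        ET.SubElement(graph, "node", id=name)
    for i, e in enumerate(edges):
        ET.SubElement(graph, "edge", id=f"e{i}", source=e["source"], target=e["target"])
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical sample data for illustration only.
    sample = [
        {"source": "alice", "target": "bob", "topic": "flood", "time": "2013-05-01"},
        {"source": "bob", "target": "carol", "topic": "flood", "time": "2013-06-01"},
        {"source": "alice", "target": "carol", "topic": "fire", "time": "2013-05-02"},
    ]
    print(to_graphml(select_subgraph(sample, "flood", "2013-04-01", "2013-05-15")))
```

The resulting file can be opened by graph tools that read GraphML; topological properties would need additional key and data elements, which this sketch omits.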
Introduction: First Responders, including Emergency Medical Personnel, Firefighters, and Police Officers, have active online communities on Social Media websites. How can we leverage Social Media sites … to gather requirements for active First Responders? … to identify stakeholders within those First Responder communities? * http://www.digitalbuzzblog.com/infographic-24-hours-on-the-internet/