Talk delivered at YOW! Developer Conferences in Melbourne, Brisbane and Sydney, Australia, on 1-9 December 2016.
Abstract: Governments collect a lot of data. Data on air quality, toxic chemicals, laws and regulations, public health, and the census are intended to be widely distributed. Some data is not for public consumption. This talk focuses on open government data: the information that is meant to be made available for the benefit of policy makers, researchers, scientists, industry, community organisers, journalists and members of civil society.
We’ll cover the evolution of Linked Data, which is now being used by Google, Apple, IBM Watson, federal governments, non-profits including CSIRO and OpenPHACTS, and thousands of others worldwide.
Next we’ll delve into the evolution of the U.S. Environmental Protection Agency’s Open Data service, which we implemented using Linked Data and an Open Source Data Platform. Highlights include how we connected to hundreds of billions of open data facts in DBpedia and in PubChem, the world’s largest open database of chemical molecules.
WHO SHOULD ATTEND
Data scientists, software engineers, data analysts, DBAs, technical leaders and anyone interested in utilising linked data and open government data.
1. Extend Your Reach.
Linking Open Government
Data at Scale
YOW! 2016 Conference
Melbourne December 1-2 ~ Brisbane December 5-6
Sydney December 8-9
Bernadette Hyland
CEO & co-founder
3 Round Stones, Inc.
@BernHyland
bhyland@3RoundStones.com
10. Linked Data
Refers to a set of best practices for publishing and
interlinking data for access by both humans and
machines.
Uses the RDF family of syntaxes (e.g., JSON-LD, N3, Turtle)
and HTTP URIs.
11. Linked Data can be published by a person
or organization behind the firewall or on the
public Web.
Linked Data published on the public Web is
generally called Linked Open Data.
- W3C Linked Data Glossary
27. Linked Data on the Web
[Diagram: an RDF graph labelled "my data". A measurement (date 2011-01-01, value 0, units of measure degrees Centigrade) was collected at Galway Airport and collected by a collector who is a Person with first name Michael and last name Hausenblas.]
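The graph on this slide can be written down as subject-predicate-object triples. The sketch below is illustrative only: the `http://example.org/` namespace, the property names and the local identifiers are assumptions (the slide gives labels, not URIs), and it uses plain Python rather than an RDF library.

```python
# Hypothetical namespace: the slide shows labels only, not real URIs.
EX = "http://example.org/"
RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"


def iri(local):
    """Wrap a local name in the assumed namespace, N-Triples style."""
    return f"<{EX}{local}>"


def lit(value):
    """Render a plain literal (datatypes are omitted to keep the sketch short)."""
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


measurement = iri("measurement/1")
collector = iri("person/michael-hausenblas")

triples = [
    (measurement, RDF_TYPE, iri("Measurement")),
    (measurement, iri("date"), lit("2011-01-01")),
    (measurement, iri("value"), lit(0)),
    (measurement, iri("unitsOfMeasure"), lit("degrees Centigrade")),
    (measurement, iri("collectedAt"), iri("place/GalwayAirport")),
    (measurement, iri("collectedBy"), collector),
    (collector, RDF_TYPE, iri("Person")),
    (collector, iri("firstName"), lit("Michael")),
    (collector, iri("lastName"), lit("Hausenblas")),
]


def to_ntriples(rows):
    """Serialize triples one per line, each terminated by ' .'."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in rows)


ntriples = to_ntriples(triples)
print(ntriples)
```

Every node that other data might want to point at (the measurement, the person, the airport) gets its own URI, while plain values such as the date stay as literals.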
28. “Linked Data was part of my initial vision for the Web
and is an important part of the Web’s future. The Web
took off as a web of hyperlinked documents which
were exciting to read, but which could not be
effectively used as data.”
- Tim Berners-Lee
29. The Semantic Web morphed when it hit
the marketplace
44. • Widens EPA’s audience (justifies relevance), for
research, environmental justice
• More cost-effective than web portals backed by
relational databases
• Used for scientific R&D, green chemistry, ++
• Increased transparency
https://opendata.epa.gov
45. 7 Steps to Publish Linked Data
Source: W3C Best Practices for Publishing Linked Data, see https://www.w3.org/TR/ld-bp/
46. Step #1 - Identify
Identify the dataset(s) to be modeled
• Request a copy of the logical and physical model of the
database(s)
• Obtain data extracts (i.e., databases and/or
spreadsheets) or create data in a way that can be
replicated.
47. Step #2 - Model Data
Model data without context to allow for reuse and
easier merging of data sets
• Traditional DBAs organize data for specified
Web services or applications
• In Linked Data, application logic does not drive
the data schema, concepts, etc.
48. Step #2 - Modeling (cont)
Look for real world objects of interest (e.g., people, places,
things, locations, etc.) and model them.
• Investigate how others are already modeling similar or
related data.
• Look for duplication & normalize the data
• Use common sense to decide whether or not to make
a link
49. Step #2 - Modeling (cont)
• Connect data from different sources & authoritative
vocabularies
• Use URIs as names for your objects
• Put aside immediate needs of any application
• Don’t think about how an application will use your data
• Do think about time and how the data will change over
time.
50. Step #3 & 4 - Name & Describe
Identifiers are at the heart of how things
become useful as linked data.
We use the same mechanism for connecting
data as the Web: the humble HTTP URI.
The Web is formed by HTTP URIs that are
essentially connections linking pieces of
information together.
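As an illustration of naming and then describing a thing, the sketch below mints an HTTP URI from a resource type and a local identifier, then attaches a single label to it. The base URI, the `mint_uri` helper and the sample identifiers are hypothetical and are not taken from the EPA service; only the `rdfs:label` property URI is a real, published term.

```python
from urllib.parse import quote

# Hypothetical base for identifiers; a real publisher would use a domain it controls.
BASE = "http://data.example.org/id/"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def mint_uri(kind, local_id):
    """Name a thing: build an HTTP URI from a resource type and a local id."""
    kind_part = quote(kind.strip().lower(), safe="")
    id_part = quote(str(local_id).strip(), safe="")
    return f"{BASE}{kind_part}/{id_part}"


def describe(uri, label):
    """Describe a thing: return one N-Triples line giving it a human-readable label."""
    escaped = label.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{uri}> <{RDFS_LABEL}> "{escaped}" .'


facility_uri = mint_uri("Facility", "F 42")  # made-up identifier
print(facility_uri)
print(describe(facility_uri, "Example facility"))
```

Percent-encoding the local identifier keeps the resulting URI valid even when source identifiers contain spaces or slashes.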
51. Steps #5, 6 & 7 - Convert, Publish & Maintain
5. Write a script or process to convert the data set
repeatedly
6. Publish to the Web and announce it!
7. Maintenance strategy
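Step 5 asks for a conversion that can be run again whenever the source data changes. A minimal sketch of such a script follows, assuming a CSV extract with an `id` column; the sample rows, the column names and the `http://data.example.org/` URIs are all invented for illustration.

```python
import csv
import io

# Made-up extract standing in for a real database or spreadsheet export.
SAMPLE_CSV = """id,name,state
1001,Example Plant A,VA
1002,Example Plant B,MD
"""

BASE = "http://data.example.org/"  # hypothetical namespace


def esc(text):
    """Escape characters that are not allowed raw inside an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def convert(csv_text):
    """Turn each non-empty cell into one triple about the row's subject URI."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{BASE}id/facility/{row['id']}>"
        for column, value in row.items():
            if column == "id" or not value:
                continue
            lines.append(f'{subject} <{BASE}def/{column}> "{esc(value)}" .')
    # Sorting makes repeated runs byte-for-byte comparable.
    return sorted(lines)


output = convert(SAMPLE_CSV)
print("\n".join(output))
```

Because the output is deterministic, two runs over the same extract can be diffed, which helps with the maintenance step.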
52. Take an iterative approach
1. Review of modeling decisions
2. Review vocabularies chosen and developed
3. Modify/update data conversion scripts
4. Do a maintenance walk-through with real use cases
5. Show how to explore data with SPARQL and
visualizations
6. Discuss a persistent identifier strategy (think PURLs)
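To make item 5 concrete, here is a small sketch of preparing a SPARQL query for an endpoint. The endpoint address is a placeholder and the query is generic; the only thing relied on is that the SPARQL 1.1 Protocol accepts a query over HTTP GET in a `query` parameter. The request is built but deliberately not sent.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Placeholder endpoint; substitute the address of a real SPARQL service.
ENDPOINT = "http://data.example.org/sparql"

# A generic exploratory query: show any ten statements in the store.
QUERY = """
SELECT ?s ?p ?o
WHERE { ?s ?p ?o }
LIMIT 10
""".strip()


def sparql_get_url(endpoint, query):
    """Build the GET URL for a SPARQL query (the query goes in the 'query' parameter)."""
    return f"{endpoint}?{urlencode({'query': query})}"


url = sparql_get_url(ENDPOINT, QUERY)
print(url)

# Round-trip check: decoding the URL gives back the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(decoded == QUERY)
```

Starting a walk-through with an open-ended query like this one lets participants see which predicates are actually in the data before writing narrower queries.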
55. Technical DNA of EPA
Linked Data Services
• Built on Open Source Software
• Provides downloadable Linked Open Data (RDF,
JSON-LD)
• Developer guide includes RESTful API, persistent
URL strategy
• Sample apps on GitHub (https://github.com/USEPA)
56. Power of LOD
Combining data sets
in a day with Linked Open
Data from DBpedia &
EPA.
Next the EPA wanted
more chemical data
linked to their data…
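One reason data sets can be combined quickly is that merging two RDF graphs is just a set union of their triples, after which links between identifiers can be followed. The sketch below uses two tiny invented graphs; every `example.org` URI and value is a placeholder rather than real EPA or DBpedia data, and only the `owl:sameAs` property URI is a real term.

```python
SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Invented triples standing in for an agency data set.
agency = {
    ("http://agency.example.org/chemical/c1",
     "http://agency.example.org/def/reportedBy",
     "http://agency.example.org/facility/f1"),
    ("http://agency.example.org/chemical/c1",
     SAME_AS,
     "http://encyclopedia.example.org/resource/ChemicalOne"),
}

# Invented triples standing in for an encyclopedic data set.
encyclopedia = {
    ("http://encyclopedia.example.org/resource/ChemicalOne",
     "http://encyclopedia.example.org/def/abstract",
     "Placeholder description."),
}

# Merging graphs is a set union: no schema mapping step is needed.
merged = agency | encyclopedia


def describe(graph, subject):
    """Collect facts about a subject, following owl:sameAs links one hop."""
    names = {subject} | {o for s, p, o in graph if s == subject and p == SAME_AS}
    return sorted((p, o) for s, p, o in graph if s in names and p != SAME_AS)


facts = describe(merged, "http://agency.example.org/chemical/c1")
for predicate, obj in facts:
    print(predicate, obj)
```

The single `owl:sameAs` statement is what connects the two sources; without it the union would still be valid but the facts would remain unlinked.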
58. PubChem, the world’s
largest open molecular
database
Used by healthcare /
life sciences industry
worldwide - all Linked
Open Data
59. Use of shared
vocabularies, including
SKOS, RDFS, OWL.
Other key vocabularies,
including Dublin Core,
Geo, FOAF, ORG and vCard,
are the “lingua franca” of
data interoperability.
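Shared vocabularies work because each term is itself a URI that anyone can reuse. As a rough illustration, the sketch below expands prefixed names into full term URIs using the published namespace URIs of the vocabularies named on the slide. Two assumptions: "Geo" is taken to mean the W3C Basic Geo (WGS84) vocabulary, and "Dublin Core" is taken to mean DCMI Terms; the `expand` helper is hypothetical.

```python
# Namespace URIs for the vocabularies named on the slide.
PREFIXES = {
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "dcterms": "http://purl.org/dc/terms/",              # assumed: DCMI Terms
    "geo": "http://www.w3.org/2003/01/geo/wgs84_pos#",   # assumed: W3C Basic Geo
    "foaf": "http://xmlns.com/foaf/0.1/",
    "org": "http://www.w3.org/ns/org#",
    "vcard": "http://www.w3.org/2006/vcard/ns#",
}


def expand(prefixed_name):
    """Expand a prefixed name such as 'foaf:name' into a full term URI."""
    prefix, sep, local = prefixed_name.partition(":")
    if not sep or prefix not in PREFIXES:
        raise ValueError(f"unknown or missing prefix in {prefixed_name!r}")
    return PREFIXES[prefix] + local


for term in ("foaf:name", "skos:prefLabel", "dcterms:title"):
    print(term, "->", expand(term))
```

Two publishers that both use `foaf:name` are pointing at the same URI, so their data can be queried together without a mapping table.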
61. [Architecture diagram. Clients: the public and registered developers, using a Web browser or an application, script or automated client. Access points: resource URIs, REST API, SPARQL endpoint. Service: Linked Data management system located at a Tier 1 cloud provider (FISMA compliant), backed by an RDF database.]
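An automated client typically reaches resource URIs like those in this architecture with an ordinary HTTP GET that states which RDF syntax it wants. The sketch below only constructs such a request. The resource URI is a placeholder, and whether a given service honours the `Accept` header (content negotiation) is an assumption that the slides do not confirm.

```python
from urllib.request import Request

# Placeholder resource URI; not an address from the EPA service.
RESOURCE_URI = "http://data.example.org/id/facility/1001"


def linked_data_request(uri, media_type="text/turtle"):
    """Build (but do not send) a GET request asking for a specific RDF syntax."""
    return Request(uri, headers={"Accept": media_type}, method="GET")


req = linked_data_request(RESOURCE_URI, "application/ld+json")
print(req.get_method(), req.full_url)
print("Accept:", req.get_header("Accept"))
```

Passing the request to `urllib.request.urlopen` would send it; that step is left out here because the URI above does not point at a real server.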
62. Linked Data is a gift …
• A worldwide system of linked information systems
• Global addressing scheme for data integration that scales to the
Web
• Nearly immediate data integration to billions of facts