Moving from a web of documents to a web of knowledge is one of the key goals set by the creator of the World Wide Web, Sir Tim Berners-Lee. This vision comes closer to reality each day with the development and integration of new formats and technologies that represent data as knowledge graphs, interlinking concepts within documents and databases. This presentation provides an overview of the generic concepts behind Linked Data, including its formats and the existing technologies that support them, and introduces the key initiatives that rely on these technologies. We will also address the challenge of semantic and knowledge modeling in science and other domains, and the need for more tools to support the use of these technologies. In particular, we will present the semantic annotation service B2NOTE and show how these formats and technologies are used to extend the description of datasets within EUDAT and to make it possible to create new datasets from multiple sources and multiple domains.
Visit https://eudat.eu/eudat-summer-school
Linked Data and Semantic Web - EUDAT Summer School (Yann Le Franc, e-Science Data Factory)
1. www.eudat.eu
EUDAT receives funding from the European Union's Horizon 2020 programme - DG CONNECT e-Infrastructures. Contract No. 654065
Introduction to Linked Data
and Semantic Web
Yann Le Franc, PhD
This work is licensed under the Creative
Commons CC-BY 4.0 licence.
Attribution: EUDAT – www.eudat.eu
Version 2017-1
2. How to cope with an expanding
universe of scientific data?
“The Hitchhiker’s guide to the Semantic Web Galaxy”
4. EUDAT Summer School, 3-7 July 2017, Crete
Outline
• Introduction: a bit of context
• The general principles of Linked Data and standards
• Application: data annotations with B2NOTE
5. Problem: the volume of scientific data is expanding
6. Challenge: Aggregating multi-dimensional data from multiple data sources
7. Similar problem and challenge in Neuroscience
[Figure: multi-scale data from multiple species (labels: Genes, Molecules, Connectivity, Electrical activity, Functional), leading to data aggregation, data analysis and modeling]
11. The current situation: distributed data resources in a large variety of formats
• Data enclosed in information silos: distinct APIs, data published within HTML or unstructured
• 2710 databases related to neuroscience (Neuroscience Information Framework)
• How can we make these data resources interoperable and link them together?
[Figure: separate data sources exposed through Web APIs and HTML pages]
12. A global problem
• The World Wide Web is a global document space
• Documents are interconnected with links
• Data is hidden in HTML pages: easy to use by humans but not by machines
• Large diversity of Web APIs
• Impossible to access and interlink data
• Need for semantics to transform the global document space into a global data space
https://fr.wikipedia.org/wiki/Tim_Berners-Lee
13. A solution for Life Science, the Universe and Everything
14. What is Linked Data?
Tim Berners-Lee (2006) - Design Issues
• Use URIs as names for things
• Use HTTP URIs so that people can look up those names (dereferenceable)
• When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL)
• Include links to other URIs, so that they can discover more things
https://www.w3.org/DesignIssues/LinkedData.html
15. Name things with URIs
• Use URIs instead of URNs (Uniform Resource Names) and DOIs
• Separate the URI representing the real object or concept from the URIs of its descriptions
Example
Real person: http://www.esciencedatafactory.com/people/yann_le_franc
RDF description (for machines): http://www.esciencedatafactory.com/people/yann_le_franc.rdf
HTML description (for humans): http://www.esciencedatafactory.com/people/yann_le_franc.html
16. Make URIs dereferenceable
Make use of HTTP content negotiation
https://www.w3.org/Protocols/rfc2616/rfc2616-sec12.html
Two technical solutions for designing the URIs:
1 - Use content negotiation with a 303 (See Other) redirect
2 - Use hash URIs
https://www.w3.org/TR/cooluris/
17. Make URIs dereferenceable
1 - Use content negotiation with a 303 (See Other) redirect
Message sequence between client and server: GET URI, 303 See URI2, GET URI2, content of URI2
Client request (GET URI)
GET /people/yann_le_franc HTTP/1.1
Host: esciencedatafactory.com
Accept: text/html, application/rdf+xml
Server answer (303 See URI2)
HTTP/1.1 303 See Other
Location: http://www.esciencedatafactory.com/people/yann_le_franc.rdf
Vary: Accept
Client request (GET URI2)
GET /people/yann_le_franc.rdf HTTP/1.1
Host: esciencedatafactory.com
Accept: text/html, application/rdf+xml
Server answer (content of URI2)
HTTP/1.1 200 OK
Content-Type: application/rdf+xml
…
Requires 4 HTTP messages (two requests and two responses) per item
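The 303 exchange can be sketched without a network as a small in-memory simulation. This is only an illustration under stated assumptions: `handle`, `dereference`, the document bodies and the naive substring check on the `Accept` header (a real server would parse media types and q-values) are all hypothetical and not part of the talk.

```python
from typing import Dict, Tuple

BASE = "http://www.esciencedatafactory.com"

# Hypothetical documents describing the person; the bodies are placeholders.
DOCUMENTS: Dict[str, Tuple[str, str]] = {
    "/people/yann_le_franc.rdf": ("application/rdf+xml", "<rdf:RDF>...</rdf:RDF>"),
    "/people/yann_le_franc.html": ("text/html", "<html>...</html>"),
}
# URIs that name real-world things rather than documents.
THINGS = {"/people/yann_le_franc"}

Response = Tuple[int, Dict[str, str], str]


def handle(path: str, accept: str) -> Response:
    """Simulated server: redirect 'thing' URIs to a description document."""
    if path in THINGS:
        # Naive negotiation (assumption): prefer RDF whenever the client lists it.
        suffix = ".rdf" if "application/rdf+xml" in accept else ".html"
        return 303, {"Location": BASE + path + suffix, "Vary": "Accept"}, ""
    if path in DOCUMENTS:
        content_type, body = DOCUMENTS[path]
        return 200, {"Content-Type": content_type}, body
    return 404, {}, ""


def dereference(path: str, accept: str) -> Tuple[int, str, int]:
    """Simulated client: follow one 303 and count the requests it sent."""
    requests_sent = 1
    status, headers, _body = handle(path, accept)
    if status == 303:
        requests_sent += 1
        redirected_path = headers["Location"][len(BASE):]
        status, headers, _body = handle(redirected_path, accept)
    return status, headers.get("Content-Type", ""), requests_sent


result = dereference("/people/yann_le_franc", "text/html, application/rdf+xml")
print(result)
```

In this sketch the client always sends two requests for one item, which is the cost the slide points out for the 303 approach.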
23. Make URIs dereferenceable
2 - Use hash URIs
The document http://www.esciencedatafactory.com/people holds the list of people:
• http://www.esciencedatafactory.com/people#yann_le_franc
• http://www.esciencedatafactory.com/people#john_doe
Message sequence between client and server: GET URI, content of URI
Client request
GET /people HTTP/1.1
Host: esciencedatafactory.com
Accept: application/rdf+xml
Server answer
HTTP/1.1 200 OK
Content-Type: application/rdf+xml
The whole list
The client gets the whole file (which can be cached) and then looks into the file to find the items with the hash
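The hash URI approach can be sketched as follows: the fragment is split off on the client side, the document is fetched once, and each item is then found inside the local copy. The `fetch` stand-in, the cache and the parsed document content are assumptions made for this sketch; only `urllib.parse.urldefrag` is a real standard-library call.

```python
from urllib.parse import urldefrag

# Hypothetical, already-parsed content of the /people document.
PEOPLE_DOCUMENT = {
    "yann_le_franc": {"name": "Yann Le Franc"},
    "john_doe": {"name": "John Doe"},
}

fetched_urls = []  # records every simulated HTTP GET
_cache = {}        # document URL -> parsed document


def fetch(document_url):
    """Stand-in for one HTTP GET of the whole document (no network here)."""
    fetched_urls.append(document_url)
    return PEOPLE_DOCUMENT


def lookup(hash_uri):
    """Resolve a hash URI: fetch the document once, then search it locally."""
    # The fragment never reaches the server; only the document URL is requested.
    document_url, fragment = urldefrag(hash_uri)
    if document_url not in _cache:
        _cache[document_url] = fetch(document_url)
    return _cache[document_url].get(fragment)


yann = lookup("http://www.esciencedatafactory.com/people#yann_le_franc")
john = lookup("http://www.esciencedatafactory.com/people#john_doe")
print(yann, john, fetched_urls)
```

The trade-off visible here is that the first lookup transfers the whole list, while later lookups in the same document cost no further request.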
27. EUDAT Summer School, 3-7 July 2017, Crete
The RDF Data Model
RDF triple: (subject, predicate, object)
Resource A (URI) -> Relation (URI) -> Resource B (URI)
Example: My website (http://www.example.com/index.html) -> Created by -> Me (http://myprofile/name)
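As a minimal sketch, the triple above can be written as a Python tuple and serialized as one N-Triples line. The two resource URIs come from the slide; the predicate URI is an assumption (the Dublin Core "creator" term, read here as "created by"), since the slide gives no URI for the relation.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    object: str


# Assumption: Dublin Core "creator" stands in for the slide's "Created by".
CREATED_BY = "http://purl.org/dc/terms/creator"

triple = Triple(
    subject="http://www.example.com/index.html",
    predicate=CREATED_BY,
    object="http://myprofile/name",
)


def to_ntriples(t):
    """Serialize a triple whose three terms are all URIs as an N-Triples line."""
    return f"<{t.subject}> <{t.predicate}> <{t.object}> ."


print(to_ntriples(triple))
```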
28. EUDAT Summer School, 3-7 July 2017, Crete
RDF in action
Labeled directed graph
From the W3C RDF 1.1 Primer: https://www.w3.org/TR/rdf11-primer/
30. EUDAT Summer School, 3-7 July 2017, Crete
RDF serializations
RDFa
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN"
  "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
      xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <head>
    <meta http-equiv="Content-Type" content="application/xhtml+xml; charset=UTF-8"/>
    <title>Profile page for Yann Le Franc</title>
  </head>
  <body>
    <!-- "about" names the subject, "typeof" its class, "property" a predicate -->
    <div about="http://www.esciencedatafactory.com/people/yann_le_franc" typeof="foaf:Person">
      <span property="foaf:name">Yann Le Franc</span>
    </div>
  </body>
</html>
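To make explicit which triples the RDFa markup carries, here is a toy extractor built on Python's standard html.parser. It is a sketch under strong assumptions, not a conforming RDFa processor: it only understands the about, typeof and property attributes on non-nested elements, and the prefix map is hard-coded rather than read from the xmlns declarations.

```python
from html.parser import HTMLParser

# Prefix map mirroring the xmlns declarations on the slide (hard-coded here).
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}
RDF_TYPE = PREFIXES["rdf"] + "type"


def expand(curie):
    """Expand a compact URI such as foaf:name using PREFIXES."""
    prefix, _, local = curie.partition(":")
    return PREFIXES[prefix] + local


class ToyRDFaParser(HTMLParser):
    """Toy parser: handles only about/typeof/property, without nesting."""

    def __init__(self):
        super().__init__()
        self.triples = []
        self.subject = None   # set by the most recent "about" attribute
        self.pending = None   # predicate waiting for its text value

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "about" in attrs:
            self.subject = attrs["about"]
        if "typeof" in attrs and self.subject:
            self.triples.append((self.subject, RDF_TYPE, expand(attrs["typeof"])))
        if "property" in attrs and self.subject:
            self.pending = expand(attrs["property"])

    def handle_data(self, data):
        if self.pending and data.strip():
            self.triples.append((self.subject, self.pending, data.strip()))
            self.pending = None


SNIPPET = """
<div about="http://www.esciencedatafactory.com/people/yann_le_franc" typeof="foaf:Person">
  <span property="foaf:name">Yann Le Franc</span>
</div>
"""

parser = ToyRDFaParser()
parser.feed(SNIPPET)
for triple in parser.triples:
    print(triple)
```

The snippet yields two triples: one stating that the resource is a foaf:Person and one giving its foaf:name.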
31. EUDAT Summer School, 3-7 July 2017, Crete
Publishing RDF
Subject | Predicate | Object
Alice | is a friend of | Bob
Bob | is interested in | The Mona Lisa
Bob | is a | Person
Bob | is born | 14 July 1990
The Mona Lisa | was created by | Leonardo Da Vinci
La Joconde in Washington | is about | The Mona Lisa
Triple store -> SPARQL endpoint -> SPARQL queries
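The idea of querying a triple store can be illustrated without any RDF library. In this sketch, plain strings stand in for URIs and literals, and the match function is a hypothetical helper in which None plays the role of a SPARQL variable. A real deployment would use a triple store queried through its SPARQL endpoint.

```python
# Triples from the slide, with plain strings standing in for URIs and literals.
TRIPLES = [
    ("Alice", "is a friend of", "Bob"),
    ("Bob", "is interested in", "The Mona Lisa"),
    ("Bob", "is a", "Person"),
    ("Bob", "is born", "14 July 1990"),
    ("The Mona Lisa", "was created by", "Leonardo Da Vinci"),
    ("La Joconde in Washington", "is about", "The Mona Lisa"),
]


def match(store, subject=None, predicate=None, obj=None):
    """Return the triples matching a pattern; None acts like a SPARQL variable."""
    return [
        (s, p, o)
        for (s, p, o) in store
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


# Roughly: SELECT ?s ?p WHERE { ?s ?p "The Mona Lisa" }
about_mona_lisa = match(TRIPLES, obj="The Mona Lisa")

# A join over two patterns: who created what Bob is interested in?
creators = [
    creator
    for (_, _, work) in match(TRIPLES, subject="Bob", predicate="is interested in")
    for (_, _, creator) in match(TRIPLES, subject=work, predicate="was created by")
]

print(about_mona_lisa)
print(creators)
```

The second query follows two edges of the graph in a row, which is the kind of traversal SPARQL expresses by sharing a variable between triple patterns.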
32. EUDAT Summer School, 3-7 July 2017, Crete
Technologies to publish RDF
RDF triple store / Graph database
M. Junghanns and A. Petermann, “Management and Analysis of Big Graph Data: Current Systems and Open Challenges,” … (ed. S. Sakr), 2017.
B. Haslhofer, E. Momeni Roochi, B. Schandl, and S. Zander, “Europeana RDF Store Report,” Mar. 2011.
Z. Kaoudi and I. Manolescu, “RDF in the clouds: a survey,” The VLDB Journal, 2014.
35. EUDAT Summer School, 3-7 July 2017, Crete
Resource 1: http://www.incf.org/images/newsroom/le-franc
Resource 2: http://m.c.lnkd.licdn.com/mpr/mpr/shrink_200_200/p/2/000/22d/056/2bdc24c.jpg
Last name: Le Franc
<last_name>Le Franc</last_name>
Family name: Le Franc
<family_name>Le Franc</family_name>
Do we need anything else?
Synonym/equivalent?
WE NEED COMMON VOCABULARIES TO SHARE THE SAME SEMANTICS
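A small sketch shows why a shared vocabulary matters here. The EQUIVALENT_TERMS mapping is an assumption made for illustration: it declares both local field names equivalent to the FOAF familyName term, so that two records tagged differently can be compared. The record keys resource_1 and resource_2 are hypothetical labels.

```python
import xml.etree.ElementTree as ET

# Assumed mapping: both local field names point at one shared term.
FOAF_FAMILY_NAME = "http://xmlns.com/foaf/0.1/familyName"
EQUIVALENT_TERMS = {
    "last_name": FOAF_FAMILY_NAME,
    "family_name": FOAF_FAMILY_NAME,
}

# The two metadata fragments from the slide, under hypothetical record keys.
RECORDS = {
    "resource_1": "<last_name>Le Franc</last_name>",
    "resource_2": "<family_name>Le Franc</family_name>",
}


def normalize(xml_fragment):
    """Map a one-element record onto a (shared term, value) pair."""
    element = ET.fromstring(xml_fragment)
    return EQUIVALENT_TERMS[element.tag], element.text.strip()


normalized = {name: normalize(xml) for name, xml in RECORDS.items()}

# Without the mapping the tags differ; with it both records agree.
print(normalized["resource_1"] == normalized["resource_2"])
```

Without such a mapping a machine has no way to know that the two tags mean the same thing, even though a human reader sees it immediately.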
36. EUDAT Summer School, 3-7 July 2017, Crete
Do we really need vocabularies?
Yes, if you are interested in:
• Sharing data with others
• Aggregating data from multiple sources
No, if you are a lone scientist in your ivory tower
37. EUDAT Summer School, 3-7 July 2017, Crete
What is an ontology?
“In computer science and information science, an ontology formally represents knowledge as a set of concepts within a domain, using a shared vocabulary to denote the types, properties and interrelationships of the concepts” - Wikipedia
How can we make it better?
You need to create a controlled vocabulary, also called an ontology, that can be used as a common “standardized” vocabulary to annotate your resources
How do you encode this in practice?
W3C semantic web standards:
• RDF Schema
• OWL (Web Ontology Language)
• SKOS (Simple Knowledge Organization System)
39. EUDAT Summer School, 3-7 July 2017, Crete
What is an ontology in practice?
Class
• Unique identifier
• Label
• Human-readable definition
• Other metadata (creator, version, date, …)
40. EUDAT Summer School, 3-7 July 2017, Crete
What is an ontology in practice?
Superclass: unique identifier, label, human-readable definition, other metadata (creator, version, date, …)
Subclass: unique identifier, label, human-readable definition
is_a (subsumption relation): Subclass is_a Superclass
Example: Macaca mulatta is an animal
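The class description and the is_a relation can be sketched as a small data structure. Everything concrete below is hypothetical: the example.org identifiers, the definitions, and the intermediate "primate" class, which is added only to show that subsumption is transitive.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OntologyClass:
    identifier: str                          # unique identifier (hypothetical URIs)
    label: str                               # human-readable label
    definition: str                          # human-readable definition
    is_a: Optional["OntologyClass"] = None   # direct superclass, if any


animal = OntologyClass(
    "http://example.org/onto#Animal", "animal",
    "A multicellular organism of the kingdom Animalia.",
)
primate = OntologyClass(
    "http://example.org/onto#Primate", "primate",
    "A mammal of the order Primates.", is_a=animal,
)
macaque = OntologyClass(
    "http://example.org/onto#MacacaMulatta", "Macaca mulatta",
    "The rhesus macaque.", is_a=primate,
)


def ancestors(cls):
    """Follow the is_a links upward and return every superclass in order."""
    result = []
    current = cls.is_a
    while current is not None:
        result.append(current)
        current = current.is_a
    return result


def is_subclass_of(cls, other):
    """True if 'other' is reachable from 'cls' through is_a links."""
    return other in ancestors(cls)


print([c.label for c in ancestors(macaque)])
print(is_subclass_of(macaque, animal))
```

Walking the is_a chain is the simplest form of the inference that ontology reasoners automate on much larger hierarchies.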
41. EUDAT Summer School, 3-7 July 2017, Crete
What is an ontology in practice?
Person: unique identifier, label, human-readable definition, other metadata (creator, version, date, …)
Yann Le Franc: unique identifier, label, human-readable definition
is_a (subsumption relation): Yann Le Franc is_a Person
42. EUDAT Summer School, 3-7 July 2017, Crete
What is an ontology in practice?
Superclass
Subclass
is_a (subsumption relation): Subclass is_a Superclass
has_a (associative relation) with Superclass 2
43. EUDAT Summer School, 3-7 July 2017, Crete
What is an ontology in practice?
Person
Yann Le Franc
is_a (subsumption relation): Yann Le Franc is_a Person
has_a (associative relation) with Name
Relations between concepts are based on first-order logic
Use reasoners/classifiers and machine learning algorithms
50. EUDAT Summer School, 3-7 July 2017, Crete
Examples of vocabularies
FOAF (Friend Of A Friend)
DCAT (Data Catalog Vocabulary)
PROV (Provenance vocabulary)
Web Annotation
Music Ontology
SIOC (Semantically-Interlinked Online Communities)
51. EUDAT Summer School, 3-7 July 2017, Crete
By user:Marobi1 [CC0], via Wikimedia Commons
https://en.wikipedia.org/wiki/Semantic_Web_Stack
The semantic web stack
52. EUDAT Summer School, 3-7 July 2017, Crete
Limits of the approach
Limitation of a unique formal model: monolithic ontologies
Difficulty of reconciling different models
Lack of validation and quality testing for ontologies
Difficult to reach consensus on research topics
Slow integration of new concepts into existing ontologies
Hard for scientists to use
However, designing common terminologies is valuable and Mostly Harmless
53. EUDAT Summer School, 3-7 July 2017, Crete
Some major RDF resources
Google Knowledge Graph: https://www.google.com/intl/bn/insidesearch/features/search/knowledge.html
Facebook graph: https://developers.facebook.com/docs/graph-api/overview/
Wikidata: https://www.wikidata.org/wiki/Wikidata:Main_Page
Freebase
DBpedia
https://datahub.io/dataset
EBI RDF store
55. EUDAT Summer School, 3-7 July 2017, Crete
Metadata
Different types of metadata describe the context, the content, the format and the history of the data
Metadata are generally frozen after publication of a data record
Descriptive metadata can be incomplete and/or biased by the data publisher's perspective
How to add new metadata/information in a flexible way? Annotations
56. EUDAT Summer School, 3-7 July 2017, Crete
What do we mean by annotation?
By definition, an annotation is “a note added to a text, book, drawing, etc., as a comment or an explanation” (from Merriam Webster).
In our context, it is an assertion we want to make about a digital resource, i.e. a text file, an image, a recording, a movie, etc.
57. EUDAT Summer School, 3-7 July 2017, Crete
Semantic Annotation: General Principles
58. EUDAT Summer School, 3-7 July 2017, Crete
Web Annotation Data Model
Uses the W3C Web Annotation Data Model (https://www.w3.org/TR/annotation-model/)
Serialized in JSON-LD (https://www.w3.org/TR/json-ld/), a JSON-based representation of RDF graphs
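As a hedged illustration of what such an annotation looks like, the sketch below builds a semantic tag following the W3C Web Annotation Data Model and prints it as JSON-LD. The make_semantic_tag helper and all example.org URIs are hypothetical; the exact structure produced by B2NOTE may differ.

```python
import json


def make_semantic_tag(annotation_id, target_uri, term_uri):
    """Build a Web Annotation that tags a target with a vocabulary term."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": annotation_id,
        "type": "Annotation",
        "motivation": "tagging",
        # A semantic tag points at the term's URI rather than at free text.
        "body": {
            "type": "SpecificResource",
            "source": term_uri,
            "purpose": "tagging",
        },
        "target": target_uri,
    }


# All three URIs are placeholders chosen for this example.
annotation = make_semantic_tag(
    annotation_id="http://example.org/annotations/1",
    target_uri="http://example.org/records/42/data.csv",
    term_uri="http://example.org/vocabulary/neuron",
)

serialized = json.dumps(annotation, indent=2)
print(serialized)
```

Because the body is a URI from a controlled vocabulary, two annotators who pick the same term produce annotations that a machine can match exactly.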
59. EUDAT Summer School, 3-7 July 2017, Crete
The annotation “use cases”
Manual annotation of data elements: semantic tagging and file linking
Semi-automatic annotation of data element content: related to the LTER Data Pilot
Data curation: curation status tags
Creation of aggregated datasets from multi-scale or multi-domain datasets
60. EUDAT Summer School, 3-7 July 2017, Crete
B2NOTE
Crowdsourcing annotator
• All annotations are public
• Private annotations in the next release
Easy to use
• Auto-completion with terms from domain-specific controlled vocabularies
• Intuitive user interface
• Easily create new datasets selected on the basis of annotations
Easy integration based on a widget/iframe approach
• Integrates with EUDAT services
• Integrates with community web UIs
Easy to deploy
• Stores triples as JSON-LD in a MongoDB backend
• Uses Django as CMS
63. EUDAT Summer School, 3-7 July 2017, Crete
B2NOTE at work
Try it @ http://b2note.bsc.es
Login/Register, annotation interface, access to annotations
64. EUDAT Summer School, 3-7 July 2017, Crete
B2NOTE at work
Access semantic term information
Search files using annotations
Export annotations and selected data for reuse
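Searching files by annotation and exporting the selection can be sketched over an in-memory list of annotations. This is not the B2NOTE API: the ANNOTATIONS data, the example.org URIs and both helper functions are hypothetical, and a real deployment would query its annotation store instead.

```python
import json

# Hypothetical annotations in a simplified Web Annotation shape.
ANNOTATIONS = [
    {"target": "http://example.org/files/a.csv",
     "body": {"source": "http://example.org/vocabulary/neuron"}},
    {"target": "http://example.org/files/b.csv",
     "body": {"source": "http://example.org/vocabulary/neuron"}},
    {"target": "http://example.org/files/c.csv",
     "body": {"source": "http://example.org/vocabulary/soil"}},
]


def files_tagged_with(annotations, term_uri):
    """Return the sorted, de-duplicated targets annotated with a term."""
    return sorted({a["target"] for a in annotations if a["body"]["source"] == term_uri})


def export_selection(annotations, term_uri):
    """Serialize the annotations carrying a term so they can be reused."""
    selected = [a for a in annotations if a["body"]["source"] == term_uri]
    return json.dumps(selected, indent=2)


neuron_files = files_tagged_with(ANNOTATIONS, "http://example.org/vocabulary/neuron")
exported = export_selection(ANNOTATIONS, "http://example.org/vocabulary/neuron")
print(neuron_files)
```

The list of targets returned by such a search is what allows a new dataset to be assembled from files held in different repositories.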
65. EUDAT Summer School, 3-7 July 2017, Crete
Test integration with B2SHARE
https://trng-b2share.eudat.eu/
66. EUDAT Summer School, 3-7 July 2017, Crete
The added value of annotations
Enrich digital content with your personal keywords without modifying the data record
Structure data differently using annotations
Support data curation before and after publication
Create aggregated datasets from multi-scale or multi-domain datasets
67. EUDAT Summer School, 3-7 July 2017, Crete
Additional Resources
EUDAT Webinar: Organise, retrieve and aggregate data using annotations with B2NOTE