The Power of
Semantic Technologies
to Explore
Linked Open Data
Graphorum & Smart Data Conference, Jan 2017
You will learn how to:
• Convert tabular data into RDF
• Combine local and remote data in a single query
• Graphically explore the connectivity patterns in big diverse data
− 1B+ triples, 1000+ classes, 8 datasets
• Detect suspicious patterns of company control
• Filter news based on relationships between companies and people
• Rank companies per industry and region
Presentation Outline
• Use cases: Relation discovery and Media monitoring
• GraphDB’s OntoRefine: conversion of tabular data into RDF
• FactForge: Open data and news about people and organizations
• Relationship Discovery Examples
• Media Monitoring Examples & Popularity Ranking
• Panama Papers and Global Legal Entity Identifier as Open Data
• Tracing Panama Papers entities in the news
Use cases: Relation
discovery
and Media monitoring
Commercial Company
Database
(e.g. D&B)
Link data!
Reveal more!
Social Media
News
Wikipedia
Private
• Link diverse data in a
Knowledge Graph
• Analyze News and
Social Content
• Extract facts and
link content to data
• Interpret data in context
of big linked data
Content Analytics &
Exploration Platform
GraphDB Linked Open Data
Relation Discovery Case
• Find suspicious
relationships like:
− Company in USA
− Controls another
company in USA
− Through a company in
an off-shore zone
• Show news
relevant to these
companies
Linking News to Big Knowledge Graphs
• The Dynamic Semantic Publishing (DSP) platform links text to knowledge graphs
• One can navigate from news to concepts, entities and topics, and from there to other news
Try it at http://now.ontotext.com
Semantic Media Monitoring
For each
entity:
•popularity
trends
•relevant
news
•related
entities
•knowledge
graph
information
Try it at http://now.ontotext.com
GraphDB OntoRefine:
conversion of tabular
data in RDF
OntoRefine: Data Transformation to RDF
• Based on OpenRefine and integrated in the GraphDB Workbench
• Allows converting tabular data into RDF
− Supported formats are TSV, CSV, *SV, XLS, XLSX, JSON, XML, RDF as XML, and Google Sheets
− Easily filter your data and fix its inconsistencies
− View the cleaned data as RDF
• Exposes a GraphDB SPARQL endpoint
− Transform your data using SPIN functions
− Import your data straight into a GraphDB repository
OntoRefine: Uploading data
• Create new project
− From local / remote files
▪ Supported formats are TSV, CSV, *SV, XLS, XLSX, JSON, XML, RDF as XML, and Google Sheets
▪ When a file is first opened, OntoRefine tries to detect the text encoding and all delimiters
▪ The table configuration can then be fine-tuned
− From clipboard
• Open / import a project
OntoRefine: Viewing tabular data as RDF
• OpenRefine supports RDF as input only
• OntoRefine also supports RDF as output
• Data shown as either records or rows
− A record combines multiple rows that describe the same object and share the value in the first column
• Data stored in a separate repository
− Not to be confused with the current repository available through the SPARQL tab of the GraphDB Workbench
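The difference between rows and records can be illustrated with a minimal Python sketch. It assumes the OpenRefine-style convention in which a row with an empty first cell continues the record started by the closest preceding row with a non-empty first cell. The function name `group_into_records` and the sample rows are hypothetical and only serve the illustration.

```python
from typing import List

Row = List[str]


def group_into_records(rows: List[Row]) -> List[List[Row]]:
    """Group rows into records.

    A row with a non-empty first cell starts a new record; a row with an
    empty first cell is attached to the record that is currently open.
    """
    records: List[List[Row]] = []
    for row in rows:
        starts_new = bool(row and row[0].strip())
        if starts_new or not records:
            records.append([row])
        else:
            records[-1].append(row)
    return records


# Hypothetical sample: one company with two officers, another with one.
rows = [
    ["Acme Ltd", "Alice"],
    ["", "Bob"],
    ["Beta Inc", "Carol"],
]
records = group_into_records(rows)
```

Here three rows collapse into two records, which is what the record view shows instead of the raw row count.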
OntoRefine: RDF-izing data
• Transform data using a CONSTRUCT query
− in the OntoRefine SPARQL endpoint
− directly in the GraphDB SPARQL endpoint
• GraphDB 8.0 supports SPIN functions:
− SPARQL functions for splitting a string
− SPARQL functions for parsing dates
− SPARQL functions for encoding URIs
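The three kinds of helper functions listed above (splitting strings, parsing dates, encoding URIs) can be sketched outside of SPARQL as well. The following Python sketch is not OntoRefine or SPIN code; it only mimics the same steps on a tiny CSV and emits N-Triples lines. The `http://example.org/` namespace, the column names and the sample row are assumptions made for the illustration.

```python
import csv
import io
from datetime import datetime
from typing import List
from urllib.parse import quote

BASE = "http://example.org/"  # hypothetical namespace, not from the slides
XSD_DATE = "http://www.w3.org/2001/XMLSchema#date"


def literal(text: str) -> str:
    """Quote a plain string as an N-Triples literal (backslash and quote only)."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def rows_to_ntriples(csv_text: str) -> List[str]:
    """Turn each CSV row into a handful of N-Triples statements."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Encode the name so it is safe inside a URI.
        subject = "<" + BASE + "company/" + quote(row["name"], safe="") + ">"
        triples.append(f"{subject} <{BASE}name> {literal(row['name'])} .")
        # Parse a day/month/year date and emit it as an xsd:date literal.
        founded = datetime.strptime(row["founded"], "%d/%m/%Y").date().isoformat()
        triples.append(f'{subject} <{BASE}founded> "{founded}"^^<{XSD_DATE}> .')
        # Split a multi-valued cell into one triple per value.
        for sector in row["sectors"].split(";"):
            triples.append(f"{subject} <{BASE}sector> {literal(sector.strip())} .")
    return triples


# Hypothetical input table with one data row.
sample = "name,founded,sectors\nAcme Ltd,31/12/2015,Banking;Insurance\n"
triples = rows_to_ntriples(sample)
```

In OntoRefine the same shaping would be expressed declaratively in a CONSTRUCT query; the sketch only shows what each step contributes to the resulting triples.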
OntoRefine: Importing data in GraphDB
• After transforming the data, import it in the
current repository without leaving the
GraphDB Workbench
− Copy the endpoint of the OntoRefine project
− Go to GraphDB SPARQL menu
− Execute a query to import the results
Combine local and remote data
• SPARQL Federation allows one to retrieve data from a remote endpoint in the middle of a query to a local repository
• For instance, combine local GDP data with the area of each country from DBpedia to calculate GDP per sq. km
Query GDP/Sq.km.
Federation example: GDP per Sq. Km.
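# The gdp: prefix belongs to the local GDP dataset; its namespace is not shown on the slide.
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbp: <http://dbpedia.org/property/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT DISTINCT ?name
       (STR(?area) AS ?areaSqKm) (STR(?GDPperKm) AS ?GDPperSqKm)
{
  # Local data: the property that holds 2015 GDP values, and each country's value
  ?gdp2015prop gdp:forYear 2015 .
  ?country gdp:gdpCountry_Name ?name ; ?gdp2015prop ?gdp2015 .
  # Remote data: country areas fetched from the DBpedia endpoint
  { SELECT (STR(?n) AS ?name) ?area {
      SERVICE <http://dbpedia.org/sparql> {
        ?c a dbo:Country ; rdfs:label ?n ; dbp:areaKm ?area .
      } } }
  BIND(STR(ROUND(xsd:decimal(?gdp2015/1000000000))) AS ?gdp2015bil)
  BIND(xsd:integer((?gdp2015) / ?area) AS ?GDPperKm)
} ORDER BY DESC(?GDPperKm) LIMIT 10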
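The join performed by the federated query can be sketched in Python: local GDP values are matched by country name against areas that come from a remote source, and the quotient is ranked. To keep the sketch runnable offline, the remote call is replaced by a stub; `fake_remote_areas`, the country names and all numbers are made-up placeholders, not real statistics.

```python
from typing import Callable, Dict, List, Tuple


def gdp_per_sq_km(
    local_gdp: Dict[str, float],
    fetch_areas: Callable[[], List[Tuple[str, float]]],
    limit: int = 10,
) -> List[Tuple[str, int]]:
    """Join local GDP with remotely fetched areas and rank by GDP per sq. km."""
    areas = dict(fetch_areas())  # the "remote" part of the query
    result = [
        (name, int(gdp / areas[name]))
        for name, gdp in local_gdp.items()
        if name in areas and areas[name] > 0  # inner join on the country name
    ]
    result.sort(key=lambda pair: pair[1], reverse=True)
    return result[:limit]


def fake_remote_areas() -> List[Tuple[str, float]]:
    """Stand-in for the remote endpoint; values are placeholders."""
    return [("Country A", 100.0), ("Country B", 50.0), ("Country D", 10.0)]


local = {"Country A": 1000.0, "Country B": 2000.0, "Country C": 500.0}
top = gdp_per_sq_km(local, fake_remote_areas)
```

As in the SPARQL version, countries present on only one side of the join ("Country C" and "Country D" here) drop out of the result.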
FactForge: Open data and
news about people and
organizations
http://factforge.net
Our approach to Big Data
1. Integrate data from many sources
− Build a Big Knowledge Graph that integrates relevant data from proprietary databases and taxonomies plus millions of facts of Linked Data
2. Infer new facts and unveil relationships
− Performing reasoning across different data sources
3. Interlink text with big data
− Using text-mining to automatically discover references to concepts and entities
4. Use a graph database for metadata management, querying and search
FactForge: Data Integration
Dataset                                                    Statements
DBpedia (the English version)                              496M
Geonames (all geographic features on Earth)                150M
owl:sameAs links between DBpedia and Geonames              471K
Company registry data (GLEI)                               3M
Panama Papers DB (#LinkedLeaks)                            20M
Other datasets and ontologies: WordNet, WorldFacts, FIBO
News metadata (2000 articles/day enriched by NOW)          473M
Total size (1152M explicit + 322M inferred statements)     1 475M
News Metadata
• Metadata from Ontotext’s Dynamic Semantic Publishing platform
− News stream from Google
− Automatically generated as part of the NOW.ontotext.com semantic news showcase
•News stream from Google since Feb 2015, about 50k news/month
− ~70 tags (annotations) per news article
• Tags link text mentions of concepts to the knowledge graph
− Technically these are URIs for entities (people, organizations, locations, etc.) and key phrases
News Metadata
Category                  Count
International             52 074
Science and Technology    23 201
Sports                    20 714
Business                  15 155
Lifestyle                 11 684
Total                     122 828

Mentions / entity type    Count
Keyphrase                 2 589 676
Organization              1 276 441
Location                  1 260 972
Person                    1 248 784
Work                      309 093
Event                     258 388
RelationPersonRole        236 638
Species                   180 946
News Metadata
Class Hierarchy Map (by number of instances)
Left: The big picture
Right: dbo:Agent class (2.7M organizations and persons)
Sample queries at http://factforge.net
• F1: Big cities in Eastern Europe
• F2: Airports near London
• F3: People and organizations related to Google
• F4: Top-level industries by number of companies
Available as Saved Queries at http://factforge.net/sparql
Note: Open Saved Queries with the folder icon in the upper-right corner
Relationship Discovery
Examples
Offshore control example
• Query: Find companies that control other companies in the same country through a company in an off-shore zone
• How it works:
− Establish a control relationship
− Establish a company-country mapping
− Establish an “off-shore criterion”
− SPARQL it
Off-shore company control example
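SELECT *
FROM onto:disable-sameAs
WHERE {
    # ?c1 controls ?c3 through the intermediary ?c2
    ?c1 fibo-fnd-rel-rel:controls ?c2 .
    ?c2 fibo-fnd-rel-rel:controls ?c3 .
    # ?c1 and ?c3 are in the same country; ?c2 is in a different one
    ?c1 ff-map:orgCountry ?c1_country .
    ?c2 ff-map:orgCountry ?c2_country .
    ?c3 ff-map:orgCountry ?c1_country .
    FILTER (?c1_country != ?c2_country)
    # The intermediary's country is flagged as an off-shore zone
    ?c2_country ff-map:hasOffshoreProvisions true .
}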
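The same pattern can be traced step by step in a minimal Python sketch over in-memory data. It is not how GraphDB evaluates the query; it only spells out the three conditions (control chain, same country at both ends, off-shore intermediary). The company names, the country code "XX" and the function name `find_offshore_chains` are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def find_offshore_chains(
    controls: List[Tuple[str, str]],
    country_of: Dict[str, str],
    offshore_countries: Set[str],
) -> List[Tuple[str, str, str]]:
    """Return (c1, c2, c3) where c1 controls c2, c2 controls c3,
    c1 and c3 share a country, and c2 sits in a different, off-shore country."""
    # Index: controlling company -> companies it controls.
    controlled_by: Dict[str, List[str]] = {}
    for parent, child in controls:
        controlled_by.setdefault(parent, []).append(child)

    chains = []
    for c1, c2 in controls:
        for c3 in controlled_by.get(c2, []):
            k1, k2, k3 = country_of.get(c1), country_of.get(c2), country_of.get(c3)
            if k1 is None or k2 is None:
                continue  # no country mapping, so the pattern cannot match
            if k1 == k3 and k1 != k2 and k2 in offshore_countries:
                chains.append((c1, c2, c3))
    return chains


# Hypothetical data: A -> B -> C goes through off-shore "XX"; A -> D -> E does not.
controls = [("A", "B"), ("B", "C"), ("A", "D"), ("D", "E")]
country_of = {"A": "US", "B": "XX", "C": "US", "D": "DE", "E": "US"}
chains = find_offshore_chains(controls, country_of, {"XX"})
```

Only the chain through "B" is reported, because "D" is in a different country that is not flagged as off-shore.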
Media Monitoring
Examples
Semantic Media Monitoring/Press-Clipping
• We can trace references to a specific company in the news
− This is fairly standard, but we can also handle syntactic variations in names, because state-of-the-art Named Entity Recognition technology is used
− More importantly, we correctly determine whether a given mention of “Paris” refers to Paris (the capital of France), Paris in Texas, Paris Hilton or Paris (the Greek hero)
• We can trace and consolidate references to subsidiaries
• We have a comprehensive industry classification
− The one from DBpedia, but refined to accommodate identifier variations and specialization (e.g. a company classified as dbr:Bank will also be considered classified as dbr:FinancialServices)
Media Monitoring Queries
• F5: Mentions in the news of an organization and its related entities
• F7: Most popular companies per industry, including children
• F8: Regional exposition of company – normalized
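How a ranking changes once mentions of child companies are included can be sketched in Python. The sketch assumes that every mention of a child is credited to its top-level parent; the saved queries may consolidate differently. Company names, the `parent_of` mapping and the function name `rank_with_children` are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

Ranking = List[Tuple[str, int]]


def rank_with_children(
    mentions: List[str], parent_of: Dict[str, str], top: int = 10
) -> Tuple[Ranking, Ranking]:
    """Return (direct ranking, ranking with child mentions rolled up)."""
    direct = Counter(mentions)
    rolled: Counter = Counter()
    for company, count in direct.items():
        # Walk up the control chain to the top-level parent (cycle-safe).
        root, seen = company, {company}
        while root in parent_of and parent_of[root] not in seen:
            root = parent_of[root]
            seen.add(root)
        rolled[root] += count
    return direct.most_common(top), rolled.most_common(top)


# Hypothetical mentions: one list entry per article that mentions the company.
mentions = ["ParentCo", "BrandX", "BrandX", "Solo", "Solo", "Solo", "BrandY"]
parent_of = {"BrandX": "ParentCo", "BrandY": "ParentCo"}
direct_rank, rolled_rank = rank_with_children(mentions, parent_of)
```

In this toy data "Solo" leads the direct ranking, while "ParentCo" overtakes it once the mentions of its two brands are added.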
News Popularity Ranking: Automotive
Rank Company News # Rank Company incl. mentions of child companies News #
1 General Motors 2722 1 General Motors 4620
2 Tesla Motors 2346 2 Volkswagen Group 3999
3 Volkswagen 2299 3 Fiat Chrysler Automobiles 2658
4 Ford Motor Company 1934 4 Tesla Motors 2370
5 Toyota 1325 5 Ford Motor Company 2125
6 Chevrolet 1264 6 Toyota 1656
7 Chrysler 1054 7 Renault-Nissan Alliance 1332
8 Fiat Chrysler Automobiles 1011 8 Honda 864
9 Audi AG 972 9 BMW 715
10 Honda 717 10 Takata Corporation 547
News Popularity: Finance
Rank Company News # Rank Company incl. mentions of controlled News #
1 Bloomberg L.P. 3203 1 Intra Bank 261667
2 Goldman Sachs 1992 2 Hinduja Bank (Switzerland) 49731
3 JP Morgan Chase 1712 3 China Merchants Bank 38288
4 Wells Fargo 1688 4 Alphabet Inc. 22601
5 Citigroup 1557 5 Capital Group Companies 4076
6 HSBC Holdings 1546 6 Bloomberg L.P. 3611
7 Deutsche Bank 1414 7 Exor 2704
8 Bank of America 1335 8 Nasdaq, Inc. 2082
9 Barclays 1260 9 JP Morgan Chase 1972
10 UBS 694 10 Sentinel Capital Partners 1053
Note: Including investment funds, stock exchanges, agencies, etc.
News Popularity: Banking
Rank Company News # Rank Company incl. mentions of controlled News #
1 Goldman Sachs 996 1 China Merchants Bank * 38288
2 JP Morgan Chase 856 2 JP Morgan Chase 1972
3 HSBC Holdings 773 3 Goldman Sachs 1030
4 Deutsche Bank 707 4 HSBC 966
5 Barclays 630 5 Bank of America 771
6 Citigroup 519 6 Deutsche Bank 742
7 Bank of America 445 7 Barclays 681
8 Wells Fargo 422 8 Citigroup 630
9 UBS 347 9 Wells Fargo 428
10 Chase 126 10 UBS 347
Panama Papers and
Global Legal Entity
Identifier as Open Data
Global Legal Entity Identifier (GLEI) data
• Global Markets Entity Identifier (GMEI) Utility data
− The Global Markets Entity Identifier (GMEI) utility is DTCC's legal entity identifier solution offered in collaboration with SWIFT
− We downloaded an XML data dump from https://www.gmeiutility.org/
• RDF-ized company records
− Fields: LEI#, legal name, ultimate parent, registered country
− 3M explicit statements for 211 thousand organizations
▪ For comparison, there are 490 000 organizations in DBpedia, and D&B covers more than 200 million
− 10 821 ultimate parent relationships and 1 632 ultimate parents
• 2 800 organizations from the GLEI dump mapped to DBpedia
GLEI Company Data Sample: ABN-AMRO
lei:businessRegistry Kamer van Koophandel
lei:businessRegistryNumber 34334259
lei:duplicateReference data:549300T5O0D0T4V2ZB28
lei:entityStatus ACTIVE
lei:headquartersCity Amsterdam
lei:headquartersState Noord-Holland
lei:legalForm NAAMLOZE VENNOOTSCHAP
lei:legalName ABN AMRO Bank N.V.
lei:lei BFXS5XCH7N0Y05NIXW11
lei:registeredCity Amsterdam
lei:registeredCountry NL
lei:registeredPostCode 1082 PP
lei:registeredState Noord-Holland
Global Legal Entity Identifier (GLEI) data
Rank  Ultimate parent                    Children  Country
1     The Goldman Sachs Group, Inc.      1 851     US
2     United Technologies Corporation    427       US
3     Honeywell International Inc.       341       US
4     Morgan Stanley                     228       US
5     Cargill, Incorporated              217       US
6     1832 Asset Management L.P.         202       CA
7     Aegon N.V.                         174       NL
8     Union Bancaire Privée, UBP SA      138       CH
9     Citigroup Inc.                     135       US
10    State Street Corporation           128       US

Rank  Country               Companies
1     dbr:United_States     103 548
2     dbr:Canada            17 425
3     dbr:Luxembourg        13 984
4     dbr:Sweden            7 934
5     dbr:United_Kingdom    7 421
6     dbr:Belgium           6 868
7     dbr:Ireland           4 762
8     dbr:Australia         4 385
9     dbr:Germany           3 039
10    dbr:Netherlands       2 561
Offshore Leaks Database from ICIJ
• Published by the International Consortium of Investigative Journalists (ICIJ) on 9 May 2016
• A “searchable database” of about 320 000 offshore companies
− 214 000 extracted from Panama Papers (valid until 2015)
− More than 100 000 from 2013 Offshore leaks investigation (valid until 2010)
• CSV extract from a graph database available for download
• https://offshoreleaks.icij.org/
Offshore Leaks DB as Linked Open Data
• Ontotext published the Offshore Leaks DB as Linked Open Data
• Available for exploration, querying and download at
http://data.ontotext.com
• ONTOTEXT DISCLAIMERS
We use the data as is provided by ICIJ. We make no representations and warranties of any kind,
including warranties of title, accuracy, absence of errors or fitness for particular purpose. All
transformations, query results and derivative works are used only to showcase the service and
technological capabilities and not to serve as basis for any statements or conclusions.
Enrichment and structuring of the data
• Relationship type hierarchy
− About 80 relationship types in the original dataset were organized into a property hierarchy
• Classification of officers into Person and Company
− The original database gives no way to tell whether an officer is a natural person or a company
• Mapping to DBpedia:
− 209 countries referred to in the Offshore Leaks DB are mapped to DBpedia
− About 3000 persons and 300 companies mapped to DBpedia
• Overall size of the repository: 22M statements (20M explicit)
The RDF-ization Process
• Linked data variant produced without programming
− The raw CSV files are RDF-ized using TARQL, http://tarql.github.io/
− Data was further interlinked and enriched in GraphDB using SPARQL
• The process is documented in a README file
• All relevant artifacts are open-source, available at https://github.com/Ontotext-AD/leaks/
• The entire publishing and mapping took about 15 person-days!
− Including data.ontotext.com portal setup, promotion, documentation, etc.
Sample queries at http://data.ontotext.com
• Q1: Countries by number of entities related to them
• Q2: Country pairs by ownership statistics
• Q3: Statistics by incorporation year
• Q4: Officers and entities by number of capital relations
• Q5: Countries in Eastern Europe by number of owners
• Q6: Intermediaries in Asia by name
• Q7: The best connected officers
• Q8: Countries by number of Person and Company officers
Mapping Datasets to
DBPedia with the
GraphDB Lucene
Connector
Mapping datasets to DBPedia
• The task: map people, organizations and locations to IDs in DBPedia
− So that we can analyze the original data with the help of the extra information available in
DBPedia and other datasets that are related to it, e.g. Geonames
− For instance, #LinkedLeaks doesn’t contain any extra information about the companies, e.g.
industry sector, controlling or controlled companies, etc.
• Specific conditions: we had to map by names
− Other than names, the information about the entities in the source datasets couldn’t help the
mapping
▪ Address and country attributes are present, but those appeared to be marginally useful for mapping
− In both cases we mapped locations only in terms of countries and not finer grained locations
▪ For this purpose DBPedia geographic data is sufficient and it is also well mapped with GeoNames
Mapping datasets to DBPedia (2)
• We used the GraphDB connector to Lucene for these mappings
− Using the GraphDB connector, a Lucene index was created for Organizations and People from DBpedia, indexing all sorts of names, descriptions and other textual information for each entity
− The mapping process consists mostly of using the name of the entity from the third-party dataset (in this case Panama Papers or GLEI) as a full-text search (FTS) query, embedded in a SPARQL query
• What does Lucene do better than SPARQL?
− When there is little information other than the name, we benefit from the free-text indexing of Lucene, because it deals well with minor syntactic variations and sorts the results by relevance
− When mapping 300 000 organizations against another 500 000 organizations without a key, a SPARQL query has to consider 300 000 x 500 000 candidate pairs, which is slower than running 300 000 Lucene queries
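To put the numbers in perspective: 300 000 x 500 000 is 150 billion candidate pairs, whereas an index answers each of the 300 000 names with a lookup that touches only entries sharing a term with the query. The Python sketch below is a toy stand-in for that idea, not Lucene and not the GraphDB connector API: a token-level inverted index with a simple overlap score. The function names and the scoring are assumptions made for the illustration.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Set, Tuple


def tokenize(name: str) -> Set[str]:
    """Lowercase the name and split it into tokens, ignoring commas and dots."""
    return set(name.lower().replace(",", " ").replace(".", " ").split())


def build_index(names: List[str]) -> Dict[str, Set[int]]:
    """Inverted index: token -> positions of the names that contain it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for position, name in enumerate(names):
        for token in tokenize(name):
            index[token].add(position)
    return index


def best_match(
    query: str, names: List[str], index: Dict[str, Set[int]]
) -> Tuple[Optional[str], float, int]:
    """Return (best name, score, number of candidates actually compared)."""
    query_tokens = tokenize(query)
    candidates: Set[int] = set()
    for token in query_tokens:
        candidates |= index.get(token, set())
    best, best_score = None, 0.0
    for position in candidates:
        tokens = tokenize(names[position])
        score = len(query_tokens & tokens) / len(query_tokens | tokens)
        if score > best_score:
            best, best_score = names[position], score
    return best, best_score, len(candidates)


names = ["ABN AMRO Bank N.V.", "Goldman Sachs Group, Inc.", "Deutsche Bank AG"]
index = build_index(names)
match, score, compared = best_match("ABN AMRO BANK NV", names, index)
```

The query is compared against two of the three names only, and the spelling variation "NV" versus "N.V." lowers the score without preventing the match. A real full-text index adds analyzers and relevance ranking on top of the same principle.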
#LinkedLeaks Mapping Queries
• Companies mapped by industry
• Companies mapped in the Finance sector
• Politicians mapped
• Available as Saved Queries at http://factforge.net/sparql
• Note: Open Saved Queries with the folder icon in the upper-right corner
Tracing Panama Papers
entities in the news
Tracing Panama Papers entities in the news
• After mapping #LinkedLeaks entities to DBPedia identifiers, we can
load them, together with the mappings, in the FF-NEWS repository
• This way we have in a single repo, mapped to one another:
#LinkedLeaks data, DBPedia, News metadata
• We can make queries like: Give me news mentions of entities which
appear in the Panama Papers dataset
• This way the mapping enabled media monitoring at no extra cost
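The join that the mapping makes possible can be sketched in Python: once leaked entities carry DBpedia identifiers, news mentions keyed by the same identifiers can be looked up directly. All identifiers below are hypothetical placeholders, and `news_for_leaks_entities` is an illustrative name, not part of FactForge.

```python
from typing import Dict, List, Tuple


def news_for_leaks_entities(
    mapping: Dict[str, str], mentions: List[Tuple[str, str]]
) -> Dict[str, List[str]]:
    """Return, for each mapped leaks entity, the articles that mention it.

    mapping:  leaks entity id -> DBpedia id
    mentions: (article id, DBpedia id) pairs taken from the news metadata
    """
    # Invert the mapping so that mentions can be resolved in one lookup.
    by_dbpedia: Dict[str, List[str]] = {}
    for leaks_id, dbpedia_id in mapping.items():
        by_dbpedia.setdefault(dbpedia_id, []).append(leaks_id)

    hits: Dict[str, List[str]] = {}
    for article_id, dbpedia_id in mentions:
        for leaks_id in by_dbpedia.get(dbpedia_id, []):
            hits.setdefault(leaks_id, []).append(article_id)
    return hits


# Hypothetical identifiers for illustration only.
mapping = {"leaks:entity-1": "dbr:Example_Corp", "leaks:officer-7": "dbr:Jane_Doe"}
mentions = [
    ("news:001", "dbr:Example_Corp"),
    ("news:002", "dbr:Someone_Else"),
    ("news:003", "dbr:Example_Corp"),
]
hits = news_for_leaks_entities(mapping, mentions)
```

Entities that are mapped but never mentioned ("leaks:officer-7" here) simply do not appear in the result.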
Thank you!
Experience the technology with NOW: Semantic News Portal
http://now.ontotext.com
and play with open data at
http://factforge.net

More Related Content

What's hot

Reasoning with Big Knowledge Graphs: Choices, Pitfalls and Proven Recipes
Reasoning with Big Knowledge Graphs: Choices, Pitfalls and Proven RecipesReasoning with Big Knowledge Graphs: Choices, Pitfalls and Proven Recipes
Reasoning with Big Knowledge Graphs: Choices, Pitfalls and Proven Recipes
Ontotext
 
Linked Data Experiences at Springer Nature
Linked Data Experiences at Springer NatureLinked Data Experiences at Springer Nature
Linked Data Experiences at Springer Nature
Michele Pasin
 
Diving in Panama Papers and Open Data to Discover Emerging News
Diving in Panama Papers and Open Data to Discover Emerging NewsDiving in Panama Papers and Open Data to Discover Emerging News
Diving in Panama Papers and Open Data to Discover Emerging News
Ontotext
 
Gain Super Powers in Data Science: Relationship Discovery Across Public Data
Gain Super Powers in Data Science: Relationship Discovery Across Public DataGain Super Powers in Data Science: Relationship Discovery Across Public Data
Gain Super Powers in Data Science: Relationship Discovery Across Public Data
Ontotext
 
Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage
Build Narratives, Connect Artifacts: Linked Open Data for Cultural HeritageBuild Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage
Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage
Ontotext
 
Enterprise knowledge graphs
Enterprise knowledge graphsEnterprise knowledge graphs
Enterprise knowledge graphs
Sören Auer
 
Choosing the Right Graph Database to Succeed in Your Project
Choosing the Right Graph Database to Succeed in Your ProjectChoosing the Right Graph Database to Succeed in Your Project
Choosing the Right Graph Database to Succeed in Your Project
Ontotext
 
Efficient Practices for Large Scale Text Mining Process
Efficient Practices for Large Scale Text Mining ProcessEfficient Practices for Large Scale Text Mining Process
Efficient Practices for Large Scale Text Mining Process
Ontotext
 
[Conference] Cognitive Graph Analytics on Company Data and News
[Conference] Cognitive Graph Analytics on Company Data and News[Conference] Cognitive Graph Analytics on Company Data and News
[Conference] Cognitive Graph Analytics on Company Data and News
Ontotext
 
How to Reveal Hidden Relationships in Data and Risk Analytics
How to Reveal Hidden Relationships in Data and Risk AnalyticsHow to Reveal Hidden Relationships in Data and Risk Analytics
How to Reveal Hidden Relationships in Data and Risk Analytics
Ontotext
 
LDOW2015 Position Talk and Discussion
LDOW2015 Position Talk and DiscussionLDOW2015 Position Talk and Discussion
LDOW2015 Position Talk and Discussion
Sören Auer
 
Linking Open, Big Data Using Semantic Web Technologies - An Introduction
Linking Open, Big Data Using Semantic Web Technologies - An IntroductionLinking Open, Big Data Using Semantic Web Technologies - An Introduction
Linking Open, Big Data Using Semantic Web Technologies - An Introduction
Ronald Ashri
 
Using the Semantic Web Stack to Make Big Data Smarter
Using the Semantic Web Stack to Make  Big Data SmarterUsing the Semantic Web Stack to Make  Big Data Smarter
Using the Semantic Web Stack to Make Big Data Smarter
Matheus Mota
 
Benchmarking RDF Metadata Representations: Reification, Singleton Property an...
Benchmarking RDF Metadata Representations: Reification, Singleton Property an...Benchmarking RDF Metadata Representations: Reification, Singleton Property an...
Benchmarking RDF Metadata Representations: Reification, Singleton Property an...
Fabrizio Orlandi
 
A Semantic Data Model for Web Applications
A Semantic Data Model for Web ApplicationsA Semantic Data Model for Web Applications
A Semantic Data Model for Web Applications
Armin Haller
 
Linked data experience at Macmillan: Building discovery services for scientif...
Linked data experience at Macmillan: Building discovery services for scientif...Linked data experience at Macmillan: Building discovery services for scientif...
Linked data experience at Macmillan: Building discovery services for scientif...
Michele Pasin
 
Beyond 2022 project presentation 2021
Beyond 2022 project presentation 2021Beyond 2022 project presentation 2021
Beyond 2022 project presentation 2021
Fabrizio Orlandi
 
ROI in Linking Content to CRM by Applying the Linked Data Stack
ROI in Linking Content to CRM by Applying the Linked Data StackROI in Linking Content to CRM by Applying the Linked Data Stack
ROI in Linking Content to CRM by Applying the Linked Data Stack
Martin Voigt
 
EC-WEB: Validator and Preview for the JobPosting Data Model of Schema.org
EC-WEB: Validator and Preview for the JobPosting Data Model of Schema.orgEC-WEB: Validator and Preview for the JobPosting Data Model of Schema.org
EC-WEB: Validator and Preview for the JobPosting Data Model of Schema.org
Jindřich Mynarz
 
Linked Data Usecases
Linked Data UsecasesLinked Data Usecases
Linked Data Usecases
Myungjin Lee
 

What's hot (20)

Reasoning with Big Knowledge Graphs: Choices, Pitfalls and Proven Recipes
Reasoning with Big Knowledge Graphs: Choices, Pitfalls and Proven RecipesReasoning with Big Knowledge Graphs: Choices, Pitfalls and Proven Recipes
Reasoning with Big Knowledge Graphs: Choices, Pitfalls and Proven Recipes
 
Linked Data Experiences at Springer Nature
Linked Data Experiences at Springer NatureLinked Data Experiences at Springer Nature
Linked Data Experiences at Springer Nature
 
Diving in Panama Papers and Open Data to Discover Emerging News
Diving in Panama Papers and Open Data to Discover Emerging NewsDiving in Panama Papers and Open Data to Discover Emerging News
Diving in Panama Papers and Open Data to Discover Emerging News
 
Gain Super Powers in Data Science: Relationship Discovery Across Public Data
Gain Super Powers in Data Science: Relationship Discovery Across Public DataGain Super Powers in Data Science: Relationship Discovery Across Public Data
Gain Super Powers in Data Science: Relationship Discovery Across Public Data
 
Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage
Build Narratives, Connect Artifacts: Linked Open Data for Cultural HeritageBuild Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage
Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage
 
Enterprise knowledge graphs
Enterprise knowledge graphsEnterprise knowledge graphs
Enterprise knowledge graphs
 
Choosing the Right Graph Database to Succeed in Your Project
Choosing the Right Graph Database to Succeed in Your ProjectChoosing the Right Graph Database to Succeed in Your Project
Choosing the Right Graph Database to Succeed in Your Project
 
Efficient Practices for Large Scale Text Mining Process
Efficient Practices for Large Scale Text Mining ProcessEfficient Practices for Large Scale Text Mining Process
Efficient Practices for Large Scale Text Mining Process
 
[Conference] Cognitive Graph Analytics on Company Data and News
[Conference] Cognitive Graph Analytics on Company Data and News[Conference] Cognitive Graph Analytics on Company Data and News
[Conference] Cognitive Graph Analytics on Company Data and News
 
How to Reveal Hidden Relationships in Data and Risk Analytics
How to Reveal Hidden Relationships in Data and Risk AnalyticsHow to Reveal Hidden Relationships in Data and Risk Analytics
How to Reveal Hidden Relationships in Data and Risk Analytics
 
LDOW2015 Position Talk and Discussion
LDOW2015 Position Talk and DiscussionLDOW2015 Position Talk and Discussion
LDOW2015 Position Talk and Discussion
 
Linking Open, Big Data Using Semantic Web Technologies - An Introduction
Linking Open, Big Data Using Semantic Web Technologies - An IntroductionLinking Open, Big Data Using Semantic Web Technologies - An Introduction
Linking Open, Big Data Using Semantic Web Technologies - An Introduction
 
Using the Semantic Web Stack to Make Big Data Smarter
Using the Semantic Web Stack to Make  Big Data SmarterUsing the Semantic Web Stack to Make  Big Data Smarter
Using the Semantic Web Stack to Make Big Data Smarter
 
Benchmarking RDF Metadata Representations: Reification, Singleton Property an...
Benchmarking RDF Metadata Representations: Reification, Singleton Property an...Benchmarking RDF Metadata Representations: Reification, Singleton Property an...
Benchmarking RDF Metadata Representations: Reification, Singleton Property an...
 
A Semantic Data Model for Web Applications
A Semantic Data Model for Web ApplicationsA Semantic Data Model for Web Applications
A Semantic Data Model for Web Applications
 
Linked data experience at Macmillan: Building discovery services for scientif...
Linked data experience at Macmillan: Building discovery services for scientif...Linked data experience at Macmillan: Building discovery services for scientif...
Linked data experience at Macmillan: Building discovery services for scientif...
 
Beyond 2022 project presentation 2021
Beyond 2022 project presentation 2021Beyond 2022 project presentation 2021
Beyond 2022 project presentation 2021
 
ROI in Linking Content to CRM by Applying the Linked Data Stack
ROI in Linking Content to CRM by Applying the Linked Data StackROI in Linking Content to CRM by Applying the Linked Data Stack
ROI in Linking Content to CRM by Applying the Linked Data Stack
 
EC-WEB: Validator and Preview for the JobPosting Data Model of Schema.org
EC-WEB: Validator and Preview for the JobPosting Data Model of Schema.orgEC-WEB: Validator and Preview for the JobPosting Data Model of Schema.org
EC-WEB: Validator and Preview for the JobPosting Data Model of Schema.org
 
Linked Data Usecases
Linked Data UsecasesLinked Data Usecases
Linked Data Usecases
 

Viewers also liked

The Knowledge Discovery Quest
The Knowledge Discovery Quest The Knowledge Discovery Quest
The Knowledge Discovery Quest
Ontotext
 
Cooking up the Semantic Web
Cooking up the Semantic WebCooking up the Semantic Web
Cooking up the Semantic Web
Ontotext
 
Best Practices for Large Scale Text Mining Processing
Best Practices for Large Scale Text Mining ProcessingBest Practices for Large Scale Text Mining Processing
Best Practices for Large Scale Text Mining Processing
Ontotext
 
Thinking Outside the Table
Thinking Outside the TableThinking Outside the Table
Thinking Outside the Table
Ontotext
 
Analytics for 2014: The Numbers that Matter
Analytics for 2014: The Numbers that MatterAnalytics for 2014: The Numbers that Matter
Analytics for 2014: The Numbers that Matter
WhatCounts, Inc.
 
MDBW-Cappius-Speaker Presentation - Enterprise_Speech_Analytics_v5
MDBW-Cappius-Speaker Presentation - Enterprise_Speech_Analytics_v5MDBW-Cappius-Speaker Presentation - Enterprise_Speech_Analytics_v5
MDBW-Cappius-Speaker Presentation - Enterprise_Speech_Analytics_v5Surya Putchala
 
Designing Teams for Emerging Challenges
Designing Teams for Emerging ChallengesDesigning Teams for Emerging Challenges
Designing Teams for Emerging Challenges
Aaron Irizarry
 
Antidot Semantic Publishing - Réussir un site éditorial agrégeant plusieurs s...
Antidot Semantic Publishing - Réussir un site éditorial agrégeant plusieurs s...Antidot Semantic Publishing - Réussir un site éditorial agrégeant plusieurs s...
Antidot Semantic Publishing - Réussir un site éditorial agrégeant plusieurs s...
Antidot
 
Global RDF Descriptors for Germplasm Data
Global RDF Descriptors for Germplasm DataGlobal RDF Descriptors for Germplasm Data
Global RDF Descriptors for Germplasm Data
Vassilis Protonotarios
 
Publishing Germplasm Vocabularies as Linked Data
Publishing Germplasm Vocabularies as Linked DataPublishing Germplasm Vocabularies as Linked Data
Publishing Germplasm Vocabularies as Linked Data
Valeria Pesce
 
How is smart data cooked?
How is smart data cooked?How is smart data cooked?
How is smart data cooked?
Ontotext
 
European agrobiodioversity, ECPGR network meeting on EURISCO, Central Crop Da...
European agrobiodioversity, ECPGR network meeting on EURISCO, Central Crop Da...European agrobiodioversity, ECPGR network meeting on EURISCO, Central Crop Da...
European agrobiodioversity, ECPGR network meeting on EURISCO, Central Crop Da...
Dag Endresen
 
Innvovative cities: web de données et web sémantique, ressources ubiquitaires...
Innvovative cities: web de données et web sémantique, ressources ubiquitaires...Innvovative cities: web de données et web sémantique, ressources ubiquitaires...
Innvovative cities: web de données et web sémantique, ressources ubiquitaires...
Fabien Gandon
 
Linked Open Data-enabled Strategies for Top-N Recommendations
Linked Open Data-enabled Strategies for Top-N RecommendationsLinked Open Data-enabled Strategies for Top-N Recommendations
Linked Open Data-enabled Strategies for Top-N Recommendations
Cataldo Musto
 
What is #LODLAM?! Understanding linked open data in libraries, archives [and ...
What is #LODLAM?! Understanding linked open data in libraries, archives [and ...What is #LODLAM?! Understanding linked open data in libraries, archives [and ...
What is #LODLAM?! Understanding linked open data in libraries, archives [and ...
Alison Hitchens
 
L’apport du Web sémantique à la recherche d’informations
L’apport du Web sémantique à la recherche d’informationsL’apport du Web sémantique à la recherche d’informations
L’apport du Web sémantique à la recherche d’informationsAref Jdey
 
Intro to Linked Open Data in Libraries Archives & Museums.
Intro to Linked Open Data in Libraries Archives & Museums.Intro to Linked Open Data in Libraries Archives & Museums.
Intro to Linked Open Data in Libraries Archives & Museums.
Jon Voss
 

The Power of Semantic Technologies to Explore Linked Open Data

  • 1. The Power of Semantic Technologies to Explore Linked Open Data Graphorum & Smart Data Conference, Jan 2017
  • 2. You will learn how to:
      • Convert tabular data into RDF
      • Combine local and remote data in a single query
      • Graphically explore the connectivity patterns in big diverse data
        − 1B+ triples, 1000+ classes, 8 datasets
      • Detect suspicious patterns of company control
      • Filter news based on relationships between companies and people
      • Rank companies per industry and region
  • 3. Presentation Outline
      • Use cases: Relation discovery and Media monitoring
      • GraphDB’s OntoRefine conversion of tabular data to RDF
      • FactForge: Open data and news about people and organizations
      • Relationship Discovery Examples
      • Media Monitoring Examples & Popularity Ranking
      • Panama Papers and Global Legal Entity Identifier as Open Data
      • Tracing Panama Papers entities in the news
  • 5. Link data! Reveal more!
      Diagram labels: Commercial Company Database (e.g. D&B), Social Media, News, Wikipedia, Private
      • Link diverse data in a Knowledge Graph
      • Analyze News and Social Content
      • Extract facts and link content to data
      • Interpret data in context of big linked data
  • 6. Content Analytics & Exploration Platform GraphDB Linked Open Data
  • 7. Relation Discovery Case
      • Find suspicious relationships like:
        − Company in USA
        − Controls another company in USA
        − Through a company in an off-shore zone
      • Show news relevant to these companies
  • 8. Linking News to Big Knowledge Graphs
      • The DSP platform links text to knowledge graphs
      • One can navigate from news to concepts, entities and topics, and from there to other news
      Try it at http://now.ontotext.com
  • 9. Semantic Media Monitoring
      For each entity:
      • popularity trends
      • relevant news
      • related entities
      • knowledge graph information
      Try it at http://now.ontotext.com
  • 10. GraphDB OntoRefine: conversion of tabular data in RDF
  • 11. OntoRefine: Data Transformation to RDF
      • Based on OpenRefine and integrated in the GraphDB Workbench
      • Allows converting tabular data into RDF
        − Supported formats are TSV, CSV, *SV, XLS, XLSX, JSON, XML, RDF as XML, and Google sheet
        − Easily filter your data, edit its inconsistencies
        − View the cleaned data as RDF
      • Exposes a GraphDB SPARQL endpoint
        − Transform your data using SPIN functions
        − Import your data straight into a GraphDB repository
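To make the "tabular data into RDF" step concrete, here is a minimal Python sketch, not OntoRefine itself. It emits one N-Triples line per non-empty cell. The `http://example.org/` namespace, the sample CSV, and the function names are assumptions made for illustration; OntoRefine performs the real mapping through its SPARQL endpoint.

```python
import csv
import io
from urllib.parse import quote

BASE = "http://example.org/"  # hypothetical namespace, not taken from the slides


def escape_literal(value):
    """Escape the characters that are special inside an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def rows_to_ntriples(csv_text, key_column):
    """Use key_column to mint the subject; emit one triple per other non-empty cell."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{BASE}resource/{quote(row[key_column], safe='')}>"
        for column, value in row.items():
            if column == key_column or not value:
                continue  # skip the key itself and empty cells
            predicate = f"<{BASE}prop/{quote(column, safe='')}>"
            triples.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return triples


# Illustrative sample table; the last row has an empty population cell.
SAMPLE = "country,capital,population\nBulgaria,Sofia,7100000\nNew Zealand,Wellington,\n"

ntriples = rows_to_ntriples(SAMPLE, key_column="country")
for line in ntriples:
    print(line)
```

Every value is written as a plain string literal here; typing numbers and dates, or linking cells to existing URIs, is the part that the transformation queries described on the following slides would handle.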
  • 12. OntoRefine: Uploading data
      • Create new project
        − From local / remote files
          ▪ Supported formats are TSV, CSV, *SV, XLS, XLSX, JSON, XML, RDF as XML, and Google sheet
          ▪ When a file is first opened, OntoRefine tries to detect the text encoding and all delimiters
          ▪ Allows further fine-tuning of the table configuration
        − From clipboard
      • Open / import a project
  • 13. OntoRefine: Viewing tabular data as RDF
      • OpenRefine supports RDF as input only
      • OntoRefine also supports RDF as output
      • Data shown as either records or rows
        − A record combines multiple rows identifying the same object and sharing the first column
      • Data stored in a separate repository
        − not to be confused with the current repository available through the GraphDB Workbench SPARQL tab
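The difference between rows and records can be sketched in a few lines of Python. This assumes the usual OpenRefine convention that a row with a blank first cell continues the record started by the row above it; the sample rows and the function name are made up for illustration.

```python
def rows_to_records(rows):
    """Group rows into records; a blank first cell continues the previous record."""
    records = []
    for row in rows:
        if row[0] or not records:
            records.append([row])    # a non-empty key starts a new record
        else:
            records[-1].append(row)  # blank key: same object as the row above
    return records


# Illustrative data: the second row belongs to the same company as the first.
ROWS = [
    ["Acme Ltd", "director", "Ann"],
    ["", "director", "Bob"],
    ["Globex", "director", "Cy"],
]

records = rows_to_records(ROWS)
for record in records:
    print(record[0][0], "has", len(record), "row(s)")
```

In row mode the table above has three entries; in record mode it has two, one per company.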
  • 14. OntoRefine: RDF-izing data
      • Transform data using a CONSTRUCT query
        − in the OntoRefine SPARQL endpoint
        − directly in the GraphDB SPARQL endpoint
      • GraphDB 8.0 supports SPIN functions:
        − SPARQL functions for splitting a string
        − SPARQL functions for parsing dates
        − SPARQL functions for encoding URIs
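The three kinds of helper functions listed above (splitting strings, parsing dates, encoding URIs) are typical clean-up steps when turning cells into RDF terms. The Python sketch below shows analogous operations with the standard library only. It does not use or reproduce the SPIN functions themselves; the separator, date format, and base URI are assumptions.

```python
from datetime import datetime
from urllib.parse import quote


def split_values(cell, separator=";"):
    """Split a multi-valued cell and drop empty parts."""
    return [part.strip() for part in cell.split(separator) if part.strip()]


def parse_date(text, fmt="%d/%m/%Y"):
    """Parse a date in the assumed day/month/year format into ISO 8601 (xsd:date style)."""
    return datetime.strptime(text, fmt).date().isoformat()


def mint_uri(base, label):
    """Percent-encode a label so it can be appended safely to a base URI."""
    return base + quote(label.strip(), safe="")


directors = split_values("Ann Lee; Bob Ray ;")
founded = parse_date("05/02/2015")
uri = mint_uri("http://example.org/company/", "Acme & Sons Ltd")
print(directors, founded, uri)
```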
  • 15. OntoRefine: Importing data in GraphDB
      • After transforming the data, import it in the current repository without leaving the GraphDB Workbench
        − Copy the endpoint of the OntoRefine project
        − Go to GraphDB SPARQL menu
        − Execute a query to import the results
  • 16. Combine local and remote data
      • SPARQL Federation allows one to retrieve data from a remote endpoint in the middle of a query to a local repository
      • For instance, local GDP data can be combined with the area of each country from DBpedia to calculate GDP per sq. km
      Figure labels: Query, GDP/Sq.km.
  • 17. Federation example: GDP per Sq. Km.
      SELECT DISTINCT ?name (STR(?area) AS ?areaSqKm) (STR(?GDPperKm) AS ?GDPperSqKm) {
          ?gdp2015prop gdp:forYear 2015 .
          ?country gdp:gdpCountry_Name ?name ;
                   ?gdp2015prop ?gdp2015 .
          {
              SELECT (STR(?n) AS ?name) ?area {
                  SERVICE <http://dbpedia.org/sparql> {
                      ?c a dbo:Country ;
                         rdfs:label ?n ;
                         dbp:areaKm ?area .
                  }
              }
          }
          BIND(STR(ROUND(xsd:decimal(?gdp2015/1000000000))) AS ?gdp2015bil)
          BIND(xsd:integer((?gdp2015) / ?area) AS ?GDPperKm)
      }
      ORDER BY DESC(?GDPperKm)
      LIMIT 10
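The logic of the federated query is a join on country name followed by a division and a top-10 sort. The Python sketch below mirrors that logic on in-memory dictionaries: one stands in for the local GDP data, the other for the areas fetched from the remote endpoint. All country names and figures are invented placeholders, and `int()` only approximates the `xsd:integer` cast.

```python
def gdp_per_sq_km(local_gdp, remote_area, limit=10):
    """Join local GDP with remote area by country name and rank by GDP per sq. km."""
    rows = []
    for name, gdp in local_gdp.items():
        area = remote_area.get(name)
        if not area:
            continue  # no match in the remote data, or unknown/zero area
        rows.append((name, area, int(gdp / area)))
    rows.sort(key=lambda row: row[2], reverse=True)  # ORDER BY DESC(?GDPperKm)
    return rows[:limit]                              # LIMIT 10


# Placeholder values, not real statistics.
LOCAL = {"Country A": 2_000_000_000, "Country B": 9_000_000_000, "Country C": 500_000_000}
REMOTE = {"Country A": 1_000, "Country B": 90_000, "Country D": 10}

ranking = gdp_per_sq_km(LOCAL, REMOTE)
for name, area, density in ranking:
    print(name, area, density)
```

As in the query, a country appears in the result only when both sources know it under exactly the same name, which is why label mismatches between datasets silently drop rows.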
  • 18. FactForge: Open data and news about people and organizations http://factforge.net
  • 19. Our approach to Big Data
      1. Integrate data from many sources
        − Build a Big Knowledge Graph that integrates relevant data from proprietary databases and taxonomies plus millions of facts of Linked Data
      2. Infer new facts and unveil relationships
        − Perform reasoning across different data sources
      3. Interlink text with big data
        − Use text mining to automatically discover references to concepts and entities
      4. Use a graph database for metadata management, querying and search
  • 20. FactForge: Data Integration
    Dataset | Size
    DBpedia (the English version) | 496M
    Geonames (all geographic features on Earth) | 150M
    owl:sameAs links between DBpedia and Geonames | 471K
    Company registry data (GLEI) | 3M
    Panama Papers DB (#LinkedLeaks) | 20M
    Other datasets and ontologies: WordNet, WorldFacts, FIBO |
    News metadata (2000 articles/day enriched by NOW) | 473M
    Total size (1152M explicit + 322M inferred statements) | 1 475M
  • 21. News Metadata • Metadata from Ontotext’s Dynamic Semantic Publishing platform − Automatically generated as part of the NOW.ontotext.com semantic news showcase • News stream from Google since Feb 2015, about 50k news articles/month − ~70 tags (annotations) per news article • Tags link text mentions of concepts to the knowledge graph − Technically these are URIs for entities (people, organizations, locations, etc.) and key phrases
  • 22. News Metadata
    Category | Count
    International | 52 074
    Science and Technology | 23 201
    Sports | 20 714
    Business | 15 155
    Lifestyle | 11 684
    Total | 122 828

    Mentions / entity type | Count
    Keyphrase | 2 589 676
    Organization | 1 276 441
    Location | 1 260 972
    Person | 1 248 784
    Work | 309 093
    Event | 258 388
    RelationPersonRole | 236 638
    Species | 180 946
  • 23. Class Hierarchy Map (by number of instances) Left: The big picture Right: dbo:Agent class (2.7M organizations and persons)
  • 24. Sample queries at http://factforge.net • F1: Big cities in Eastern Europe • F2: Airports near London • F3: People and organizations related to Google • F4: Top-level industries by number of companies Available as Saved Queries at http://factforge.net/sparql Note: Open Saved Queries with the folder icon in the upper-right corner
  • 26. Offshore control example • Query: Find companies that control other companies in the same country through a company in an off-shore zone • How it works: • Establish a control relationship • Establish a company-country mapping • Establish an “off-shore criterion” • SPARQL it
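  • 27. Off-shore company control example
    SELECT *
    FROM onto:disable-sameAs
    WHERE {
      # Control chain: ?c1 controls ?c2, which controls ?c3
      ?c1 fibo-fnd-rel-rel:controls ?c2 .
      ?c2 fibo-fnd-rel-rel:controls ?c3 .
      # ?c1 and ?c3 share a country; ?c2 is registered in a different one
      ?c1 ff-map:orgCountry ?c1_country .
      ?c2 ff-map:orgCountry ?c2_country .
      ?c3 ff-map:orgCountry ?c1_country .
      FILTER (?c1_country != ?c2_country)
      # The intermediary's country has offshore provisions
      ?c2_country ff-map:hasOffshoreProvisions true .
    }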
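The same pattern can be mimicked over a small in-memory graph, which may help when reasoning about what the query matches. This is a sketch only: the company names and the "Offshoria" country are invented, each company is assumed to have a single country, and the function name is hypothetical.

```python
def offshore_control_chains(controls, org_country, offshore_countries):
    """Find (c1, c2, c3) where c1 controls c2 and c2 controls c3,
    c1 and c3 share a country, and c2 sits in a different, offshore country."""
    controlled_by = {}
    for parent, child in controls:
        controlled_by.setdefault(parent, []).append(child)

    chains = []
    for c1, c2 in controls:
        country1 = org_country.get(c1)
        country2 = org_country.get(c2)
        if country1 is None or country2 is None:
            continue  # no country mapping, so the pattern cannot match
        if country2 == country1 or country2 not in offshore_countries:
            continue
        for c3 in controlled_by.get(c2, []):
            if org_country.get(c3) == country1:
                chains.append((c1, c2, c3))
    return chains


# Invented sample data for illustration only.
controls = [("AcmeHolding", "ShellCo"), ("ShellCo", "WidgetWorks"), ("AcmeHolding", "PlainCo")]
org_country = {"AcmeHolding": "US", "ShellCo": "Offshoria", "WidgetWorks": "US", "PlainCo": "US"}

print(offshore_control_chains(controls, org_country, {"Offshoria"}))
```

Direct control within one country (AcmeHolding over PlainCo) is not reported, because the pattern requires an intermediary in an offshore jurisdiction.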
  • 29. Semantic Media Monitoring/Press-Clipping • We can trace references to a specific company in the news − This is fairly standard, but because state-of-the-art Named Entity Recognition technology is used, we can also handle syntactic variations in names − More importantly, we correctly determine whether each mention of “Paris” refers to Paris (the capital of France), Paris in Texas, Paris Hilton or Paris (the Greek hero) • We can trace and consolidate references to subsidiary companies • We have a comprehensive industry classification − The one from DBpedia, refined to accommodate identifier variations and specialization (e.g. a company classified as dbr:Bank is also considered classified as dbr:FinancialServices)
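The specialization rule in the last bullet amounts to closing a company's industries over a "broader industry" relation. A minimal sketch follows, assuming the hierarchy is available as a simple mapping; the mapping below contains only the dbr:Bank example from the slide, and the function name is hypothetical.

```python
def industries_with_broader(direct_industries, broader):
    """Return the given industries plus every broader industry reachable from them."""
    result = set()
    stack = list(direct_industries)
    while stack:
        industry = stack.pop()
        if industry in result:
            continue  # already expanded; this also guards against cycles
        result.add(industry)
        stack.extend(broader.get(industry, ()))
    return result


# Only the pair mentioned on the slide; a real hierarchy would be much larger.
broader = {"dbr:Bank": ["dbr:FinancialServices"]}

print(sorted(industries_with_broader({"dbr:Bank"}, broader)))
```

With this closure in place, a per-industry ranking for dbr:FinancialServices also picks up companies that are only tagged as dbr:Bank.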
  • 30. Media Monitoring Queries • F5: Mentions in the news of an organization and its related entities • F7: Most popular companies per industry, including children • F8: Regional exposition of company – normalized
  • 31. News Popularity Ranking: Automotive
    Rank | Company | News # | Company incl. mentions of child companies | News #
    1 | General Motors | 2722 | General Motors | 4620
    2 | Tesla Motors | 2346 | Volkswagen Group | 3999
    3 | Volkswagen | 2299 | Fiat Chrysler Automobiles | 2658
    4 | Ford Motor Company | 1934 | Tesla Motors | 2370
    5 | Toyota | 1325 | Ford Motor Company | 2125
    6 | Chevrolet | 1264 | Toyota | 1656
    7 | Chrysler | 1054 | Renault-Nissan Alliance | 1332
    8 | Fiat Chrysler Automobiles | 1011 | Honda | 864
    9 | Audi AG | 972 | BMW | 715
    10 | Honda | 717 | Takata Corporation | 547
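The difference between the two rankings comes from attributing mentions of child companies to their parents. A rough sketch of such a roll-up is shown below. It assumes that mentions are rolled up to the top-most parent and that an article mentioning both parent and child counts once; neither detail is confirmed by the slides, and all company names and article ids are invented.

```python
def rank_with_children(mentions, parent_of, top=10):
    """Rank companies by distinct articles, crediting each child's articles to its top-most parent."""

    def top_parent(company):
        seen = set()
        while company in parent_of and company not in seen:
            seen.add(company)  # guard against cyclic ownership data
            company = parent_of[company]
        return company

    rolled_up = {}
    for company, articles in mentions.items():
        rolled_up.setdefault(top_parent(company), set()).update(articles)

    ranking = sorted(rolled_up.items(), key=lambda item: (-len(item[1]), item[0]))
    return [(name, len(articles)) for name, articles in ranking[:top]]


# Invented sample: article ids mentioning each company, and one parent link.
mentions = {"ParentCo": {1, 2}, "ChildCo": {2, 3, 4}, "SoloCo": {5, 6, 7}}
parent_of = {"ChildCo": "ParentCo"}

print(rank_with_children(mentions, {}))         # ranking by direct mentions only
print(rank_with_children(mentions, parent_of))  # ranking including child companies
```

In the sample, ParentCo moves from last place to first once ChildCo's articles are credited to it, which mirrors how Fiat Chrysler Automobiles climbs in the right-hand column of the table.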
  • 32. News Popularity: Finance
    Rank | Company | News # | Company incl. mentions of controlled | News #
    1 | Bloomberg L.P. | 3203 | Intra Bank | 261667
    2 | Goldman Sachs | 1992 | Hinduja Bank (Switzerland) | 49731
    3 | JP Morgan Chase | 1712 | China Merchants Bank | 38288
    4 | Wells Fargo | 1688 | Alphabet Inc. | 22601
    5 | Citigroup | 1557 | Capital Group Companies | 4076
    6 | HSBC Holdings | 1546 | Bloomberg L.P. | 3611
    7 | Deutsche Bank | 1414 | Exor | 2704
    8 | Bank of America | 1335 | Nasdaq, Inc. | 2082
    9 | Barclays | 1260 | JP Morgan Chase | 1972
    10 | UBS | 694 | Sentinel Capital Partners | 1053
    Note: Including investment funds, stock exchanges, agencies, etc.
  • 33. News Popularity: Banking
    Rank | Company | News # | Company incl. mentions of controlled | News #
    1 | Goldman Sachs | 996 | China Merchants Bank * | 38288
    2 | JP Morgan Chase | 856 | JP Morgan Chase | 1972
    3 | HSBC Holdings | 773 | Goldman Sachs | 1030
    4 | Deutsche Bank | 707 | HSBC | 966
    5 | Barclays | 630 | Bank of America | 771
    6 | Citigroup | 519 | Deutsche Bank | 742
    7 | Bank of America | 445 | Barclays | 681
    8 | Wells Fargo | 422 | Citigroup | 630
    9 | UBS | 347 | Wells Fargo | 428
    10 | Chase | 126 | UBS | 347
  • 34. Panama Papers and Global Legal Entity Identifier as Open Data
  • 35. Global Legal Entity Identifier (GLEI) data • Global Markets Entity Identifier (GMEI) Utility data − The GMEI utility is DTCC's legal entity identifier solution offered in collaboration with SWIFT − Downloaded as an XML data dump from https://www.gmeiutility.org/ • RDF-ized company records − Fields: LEI#, legal name, ultimate parent, registered country − 3M explicit statements for 211 thousand organizations ▪ For comparison, there are 490 000 organizations in DBpedia, and D&B covers more than 200 million − 10 821 ultimate parent relationships and 1 632 ultimate parents • 2 800 organizations from the GLEI dump mapped to DBpedia
  • 36. GLEI Company Data Sample: ABN-AMRO
    lei:businessRegistry | Kamer van Koophandel
    lei:businessRegistryNumber | 34334259
    lei:duplicateReference | data:549300T5O0D0T4V2ZB28
    lei:entityStatus | ACTIVE
    lei:headquartersCity | Amsterdam
    lei:headquartersState | Noord-Holland
    lei:legalForm | NAAMLOZE VENNOOTSCHAP
    lei:legalName | ABN AMRO Bank N.V.
    lei:lei | BFXS5XCH7N0Y05NIXW11
    lei:registeredCity | Amsterdam
    lei:registeredCountry | NL
    lei:registeredPostCode | 1082 PP
    lei:registeredState | Noord-Holland
  • 37. Global Legal Entity Identifier (GLEI) data
    Rank | Ultimate parent | Children | Country
    1 | The Goldman Sachs Group, Inc. | 1 851 | US
    2 | United Technologies Corporation | 427 | US
    3 | Honeywell International Inc. | 341 | US
    4 | Morgan Stanley | 228 | US
    5 | Cargill, Incorporated | 217 | US
    6 | 1832 Asset Management L.P. | 202 | CA
    7 | Aegon N.V. | 174 | NL
    8 | Union Bancaire Privée, UBP SA | 138 | CH
    9 | Citigroup Inc. | 135 | US
    10 | State Street Corporation | 128 | US

    Rank | Country | Companies
    1 | dbr:United_States | 103 548
    2 | dbr:Canada | 17 425
    3 | dbr:Luxembourg | 13 984
    4 | dbr:Sweden | 7 934
    5 | dbr:United_Kingdom | 7 421
    6 | dbr:Belgium | 6 868
    7 | dbr:Ireland | 4 762
    8 | dbr:Australia | 4 385
    9 | dbr:Germany | 3 039
    10 | dbr:Netherlands | 2 561
  • 38. Offshore Leaks Database from ICIJ • Published by the International Consortium of Investigative Journalists (ICIJ) on 9th of May • A “searchable database” of about 320 000 offshore companies − 214 000 extracted from Panama Papers (valid until 2015) − More than 100 000 from the 2013 Offshore Leaks investigation (valid until 2010) • CSV extract from a graph database available for download • https://offshoreleaks.icij.org/
  • 40. Offshore Leaks DB as Linked Open Data • Ontotext published the Offshore Leaks DB as Linked Open Data • Available for exploration, querying and download at http://data.ontotext.com • ONTOTEXT DISCLAIMERS We use the data as is provided by ICIJ. We make no representations and warranties of any kind, including warranties of title, accuracy, absence of errors or fitness for particular purpose. All transformations, query results and derivative works are used only to showcase the service and technological capabilities and not to serve as basis for any statements or conclusions.
  • 41. Enrichment and structuring of the data • Relationship type hierarchy − About 80 relationship types in the original dataset were organized into a property hierarchy • Classification of officers into Person and Company − In the original database there is no way to distinguish whether an officer is a physical person • Mapping to DBpedia: − 209 countries referred to in the Offshore Leaks DB are mapped to DBpedia − About 3000 persons and 300 companies mapped to DBpedia • Overall size of the repository: 22M statements (20M explicit)
  • 42. The RDF-ization Process • Linked data variant produced without programming − The raw CSV files are RDF-ized using TARQL, http://tarql.github.io/ − Data was further interlinked and enriched in GraphDB using SPARQL • The process is documented in this README file • All relevant artifacts are open-source, available at https://github.com/Ontotext-AD/leaks/ • The entire publishing and mapping took about 15 person-days! − Including data.ontotext.com portal setup, promotion, documentation, etc.
  • 43. Sample queries at http://data.ontotext.com • Q1: Countries by number of entities related to them • Q2: Country pairs by ownership statistics • Q3: Statistics by incorporation year • Q4: Officers and entities by number of capital relations • Q5: Countries in Eastern Europe by number of owners • Q6: Intermediaries in Asia by name • Q7: The best connected officers • Q8: Countries by number of Person and Company officers
  • 44. Mapping Datasets to DBPedia with the GraphDB Lucene Connector
  • 45. Mapping datasets to DBPedia • The task: map people, organizations and locations to IDs in DBPedia − So that we can analyze the original data with the help of the extra information available in DBPedia and other datasets that are related to it, e.g. Geonames − For instance, #LinkedLeaks doesn’t contain any extra information about the companies, e.g. industry sector, controlling or controlled companies, etc. • Specific conditions: we had to map by names − Other than names, the information about the entities in the source datasets couldn’t help the mapping ▪ Address and country attributes are present, but those appeared to be marginally useful for mapping − In both cases we mapped locations only in terms of countries and not finer grained locations ▪ For this purpose DBPedia geographic data is sufficient and it is also well mapped with GeoNames
  • 46. Mapping datasets to DBPedia (2) • We used the GraphDB connector to Lucene for these mappings − Using the GraphDB connector, a Lucene index was created for Organizations and People from DBPedia, indexing all sorts of names, descriptions and other textual information for each entity − The mapping process consists mostly of using the name of the entity from the 3rd party dataset (in this case Panama Papers or GLEI) as an FTS query, embedded in a SPARQL query • What is it that Lucene does better than SPARQL? − When there is little information other than the name, we benefit from the free text indexing of Lucene, because it deals well with minor syntactic variations and sorts the results by relevance − When mapping 300 000 organizations against another 500 000 organizations, without a key, the complexity of a SPARQL query is 300 000 x 500 000, which is slower than 300 000 Lucene queries
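The complexity argument can be made concrete with a toy inverted index. This is a simplified stand-in written for illustration, not Lucene and not the GraphDB connector: the tokenizer, the overlap score, the function names and the sample entities are all assumptions.

```python
import re
from collections import Counter, defaultdict

# Keyless pairwise matching compares every source name with every target name.
PAIRWISE_COMPARISONS = 300_000 * 500_000  # 150 billion comparisons
INDEX_LOOKUPS = 300_000                   # one index query per source name


def tokens(name):
    """Lower-case a name and split it into alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", name.lower()))


def build_index(names_by_id):
    """Map each token to the ids of the entities whose name contains it."""
    index = defaultdict(set)
    for entity_id, name in names_by_id.items():
        for token in tokens(name):
            index[token].add(entity_id)
    return index


def best_match(index, names_by_id, query):
    """Return the id of the entity whose name overlaps most with the query, or None."""
    query_tokens = tokens(query)
    hits = Counter()
    for token in query_tokens:
        for entity_id in index.get(token, ()):
            hits[entity_id] += 1
    if not hits:
        return None

    def score(entity_id):
        # Shared tokens divided by all tokens seen in either name.
        return hits[entity_id] / len(query_tokens | tokens(names_by_id[entity_id]))

    return max(sorted(hits), key=score)


# Invented target entities standing in for indexed DBpedia organizations.
targets = {"e1": "Example Bank N.V.", "e2": "Example Holdings Ltd", "e3": "Other Bank"}
index = build_index(targets)

print(best_match(index, targets, "EXAMPLE BANK NV"))
```

Each lookup touches only the entities that share at least one token with the query, which is why 300 000 index queries can be far cheaper than 150 billion pairwise comparisons.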
  • 47. #LinkedLeaks Mapping Queries • Companies mapped by industry • Companies mapped in the Finance sector • Politicians mapped • Available as Saved Queries at http://factforge.net/sparql • Note 1: Open Saved Queries with the folder icon in the upper-right corner
  • 50. Tracing Panama Papers entities in the news • After mapping #LinkedLeaks entities to DBPedia identifiers, we can load them, together with the mappings, in the FF-NEWS repository • This way we have in a single repo, mapped to one another: #LinkedLeaks data, DBPedia, News metadata • We can make queries like: Give me news mentions of entities which appear in the Panama Papers dataset • This way the mapping enabled media monitoring at no extra cost
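Once the mappings are loaded, the query "news mentions of entities which appear in the Panama Papers dataset" becomes a join through the shared DBpedia identifiers. The sketch below shows that join on in-memory data; the identifiers, article ids and function name are invented for illustration and do not come from the FF-NEWS repository.

```python
def news_for_leaked_entities(leak_to_dbpedia, mentions_by_article):
    """Return, per article, the DBpedia ids it mentions that are mapped from the leaks data."""
    mapped_ids = set(leak_to_dbpedia.values())
    result = {}
    for article, mentioned_ids in mentions_by_article.items():
        found = mapped_ids & set(mentioned_ids)
        if found:
            result[article] = sorted(found)
    return result


# Invented sample: two leaked entities mapped to DBpedia, three tagged articles.
leak_to_dbpedia = {"leak:officer-1": "dbr:Person_X", "leak:entity-7": "dbr:Company_Y"}
mentions_by_article = {
    "news:1": ["dbr:Company_Y", "dbr:Company_Z"],
    "news:2": ["dbr:Company_Z"],
    "news:3": ["dbr:Person_X", "dbr:Company_Y"],
}

print(news_for_leaked_entities(leak_to_dbpedia, mentions_by_article))
```

No news-specific work is needed for the leaked entities, because the news tags already use DBpedia identifiers; the mapping alone connects the two datasets.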
  • 51. Thank you! Experience the technology with NOW: Semantic News Portal http://now.ontotext.com and play with open data at http://factforge.net

Editor's Notes

  1. DESIGN: Looks better with the graphics. A slightly smaller title and more “air” around the logo would make it better; see the proposed rearrangement.
  2. DESIGN: The gray background makes the slide look different and sets it apart from the others, which is good. On the other hand, it feels somewhat “dull” to me.
  3. DESIGN: The gray background makes the slide look different and sets it apart from the others, which is good. On the other hand, it feels somewhat “dull” to me.
  4. DESIGN: The new graphic is great. On the next slide I will adjust its colors a little, because the saturation and transparency of the colors here should correspond to the “density” of the data. Lighter body text (not bold) works better for me on this particular slide. Our vision is to enable machines to interpret data and text by interlinking those in big knowledge graphs. The web of open data is growing exponentially! There are thousands of datasets, from Wikipedia and Geonames to government statistical data and the Panama Papers. We link open data to analyze news. We extract data from news to produce more open data and analyze social media. We integrate all this with proprietary data and commercial databases. Why? To help journalists, banks, merchants, governments and citizens reveal more! Quicker, with less effort and less stress.
  5. This is the elevator pitch for our overall technology approach, proposition and applications
  6. HOW MANY CONCEPTS DOES A PERSON KNOW?
  7. HOW MANY CONCEPTS DOES A PERSON KNOW?
  8. DESIGN: In this case the orange bands created too strong an accent