From Artwork to Cyber Attacks: Lessons Learned in Building Knowledge Graphs using Semantic Web Technologies

Craig Knoblock, Research Professor at University of Southern California

Over the last few years we have been building domain-specific knowledge graphs for a variety of real-world problems, including creating virtual museums, combating human trafficking, identifying illegal arms sales, and predicting cyber attacks. We have developed a variety of techniques to construct such knowledge graphs, including techniques for extracting data from online sources, aligning the data to a domain ontology, and linking the data across sources. In this talk I will present these techniques and describe our experience in applying Semantic Web technologies to build knowledge graphs for real-world problems.

From Artwork to Cyber Attacks: Lessons
Learned in Building Knowledge Graphs
using Semantic Web Technologies
Craig Knoblock
USC Information Sciences Institute
U.S. Semantic Technologies Symposium
March 1, 2018
Center on Knowledge Graphs: People
Center on Knowledge Graphs: People (cont.)
Center on Knowledge Graphs: Projects
Center on Knowledge Graphs, USC Information Sciences Institute
Goal: Building Knowledge Graphs
From: raw, messy, disconnected (hard to query, analyze & visualize)
To: clean, organized, linked (easy to query, analyze & visualize)
Questions Addressed in this Talk
1. Where should the Semantic Web data come from?
• Triplestores? Linked data? Schema.org?
2. What is the “best” representation of the data in a knowledge graph?
• Very detailed domain-specific ontologies?
3. How should we deal with incomplete and incorrect information?
• Manual curation? Automated data cleaning?
4. How do we organize and store the data for efficient access?
• RDF? Triplestore?
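To make the storage question (item 4) concrete, here is a minimal sketch that holds the same facts in two shapes: as subject-predicate-object triples, and as one nested document per entity. It is an illustration only, not the representation used in the talk. The identifiers (ex:artwork1, ex:artist7), the literal values, and the helper names match and to_documents are all hypothetical, and the schema.org-style property names are used only as plausible labels.

```python
import json
from collections import defaultdict

# Hypothetical example data: identifiers, names, and values are made up.
triples = [
    ("ex:artwork1", "rdf:type", "schema:Painting"),
    ("ex:artwork1", "schema:name", "Untitled Landscape"),
    ("ex:artwork1", "schema:creator", "ex:artist7"),
    ("ex:artist7", "rdf:type", "schema:Person"),
    ("ex:artist7", "schema:name", "A. Painter"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def to_documents(triples):
    """Group triples by subject into one nested document per entity."""
    docs = defaultdict(dict)
    for s, p, o in triples:
        docs[s].setdefault(p, []).append(o)
    return dict(docs)


docs = to_documents(triples)
creators = match(triples, p="schema:creator")

# The document form keeps everything about one entity together.
print(json.dumps(docs["ex:artwork1"], indent=2))
```

The triple form answers pattern queries across entities with one generic function, while the document form returns everything about a single entity in one lookup. Which shape is preferable depends on the access pattern, which is the trade-off the question raises.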
From Artwork to Cyber Attacks: Lessons Learned in Building Knowledge Graphs using Semantic Web Technologies

  • 1. From Artwork to Cyber Attacks: Lessons Learned in Building Knowledge Graphs using Semantic Web Technologies Craig Knoblock USC Information Sciences Institute U.S. Semantic Technologies Symposium March 1, 2018
  • 2. Center on Knowledge Graphs: People
  • 3. Center on Knowledge Graphs: People (cont.)
  • 4. Center on Knowledge Graphs: Projects. Center on Knowledge Graphs, USC Information Sciences Institute
  • 5. Goal: Building Knowledge Graphs. Raw, messy, disconnected (hard to query, analyze & visualize) versus clean, organized, linked (easy to query, analyze & visualize)
  • 6. Questions Addressed in this Talk: 1. Where should the Semantic Web data come from? Triplestores? Linked data? Schema.org? 2. What is the “best” representation of the data in a knowledge graph? Very detailed domain-specific ontologies? 3. How should we deal with incomplete and incorrect information? Manual curation? Automated data cleaning? 4. How do we organize and store the data for efficient access? RDF? Triplestore?
  • 7. Steps To Build a KG (pipeline diagram; current stage: Feature Extraction). Stages: Data Acquisition, Feature Extraction, Feature Alignment, Entity Resolution, Graph Construction, User Interface. Other labels: Crawling, Extraction, Mapping To Ontology, Entity Linking & Similarity, Knowledge Graph Deployment, Query & Visualization, Elastic Search, Graph DB, schema.org, geonames
  • 8. Illegal Arms Sales: 100s of web sites; ATF wants to find people buying and selling across state lines; challenge: extract and align the data across sites
  • 11. Automated Extraction [Minton et al., Inferlink]. Input: a pile of pages
  • 12. Automated Extraction. Input: a pile of pages. Classify by Templates. Result: pages clustered by template
  • 13. Automated Extraction. Input: a pile of pages. Classify by Templates. Result: pages clustered by template. Infer Extractor (shown four times in the diagram). Result: extractor
  • 15. Extraction Evaluation (10 websites, 5 pages each). Results per field, as score (count/total):
      Field | Perfect | Including partial and extra data
      Title | 1.0 (50/50) | 1.0 (50/50)
      Desc | .76 (37/49) | .98 (48/49)
      Seller | .95 (40/42) | .95 (40/42)
      Date | .83 (40/48) | .83 (40/48)
      Price | .87 (39/45) | .98 (44/45)
      Loc | .51 (23/45) | .84 (38/45)
      Cat | .68 (34/50) | .88 (44/50)
      Member Since | 1.0 (35/35) | 1.0 (35/35)
      Expires | .52 (15/29) | .55 (16/29)
      Views | .76 (19/25) | 1.0 (25/25)
      ID | .97 (35/36) | 1.0 (36/36)
  • 16. Steps To Build a KG (pipeline diagram). Stages: Data Acquisition, Feature Extraction, Feature Alignment, Entity Resolution, Graph Construction, User Interface. Other labels: Crawling, Extraction, Mapping To Ontology, Entity Linking & Similarity, Knowledge Graph Deployment, Query & Visualization, Elastic Search, Graph DB, schema.org, geonames
  • 17. Knowledge Graph for Predicting Cyber Attacks (architecture diagram). Sources: Blogs, Twitter, Conferences, CPEs, Darkweb marketplaces, News, CVEs, Darkweb Forums, Abuse.ch, Microsoft Bulletins. Components: Karma models, Cyber Domain Ontology, Elastic Search
  • 18. Cyber Domain Ontology: 28 classes, 97 properties, based on Schema.org
  • 19. Karma: Mapping Data to Ontologies [Knoblock, Szekely, et al., ISWC 2012]. Diagram labels: Services, Relational Sources, Hierarchical Sources, Cyber Ontology, Karma, JSON-LD
  • 20. Map Source to Domain Ontology. Semantic model: maps source to domain ontology. Domain ontology diagram (legend: object property, data property) with classes Software, Vulnerability, Topic, Post, Person and property labels name, version, author, hasVulnerability, description, isTopicOf, isVulnerabilityOf, location, mentions, datePublished, topic, hasTopic, username, isAuthorOf. Sample source rows (Column 1 to Column 5; the second row has one empty cell):
      Bro can you give me a.. | English | windows xp sp3 | CVE-2016-1052 | 303828
      أنا جربت البرنامج وعمل ع … | Arabic | jp2_cdef_destroy | 147075
      salve a tutti, ultimamento … | Italian | cve-2012-4969 execcommand vuln | cve-2012-4969 | 107075
  • 21. Semantic Types. Classes: Post, Topic, Vulnerability, Person. Semantic type labels: text, language, name, userId, name (same sample rows as slide 20)
  • 22. Relationships. Classes: Post, Topic, Vulnerability, Person. Data property labels: text, language, name, userId, name. Object property labels: mentions, hasTopic, author (same sample rows as slide 20)
  • 23. Cyber KG Dashboard
  • 24. Karma Learns the Source Models [Taheriyan et al., ISWC 2013, ICSC 2014]. Inputs: Sample Data, Domain Ontology, Known Semantic Models. Steps: Learn Semantic Types, Construct a Graph, Generate Candidate Models, Rank Results
  • 25. Learning Semantic Types. Requirements: learn from a small number of examples; distinguish both string and numeric values; can be learned quickly and is highly scalable to large numbers of semantic types. Domain ontology classes: Person (name, birthdate), Organization (name), City (name), State (name). Sample data (Person):
      # | name | date | city | state | workplace
      1 | Fred Collins | Oct 1959 | Seattle | WA | Microsoft
      2 | Tina Peterson | May 1980 | New York | NY | Google
  • 26. Training machine learning model [Pham et al., ISWC 2016]
  • 28. Construct a Graph: construct a graph from semantic types and ontology
  • 29. Determine Relationships: select minimal tree that connects all semantic types, using a customized Steiner tree algorithm [Kou & Markowsky, 1981]. Diagram: initial model
  • 30. Refining the Model: impose constraints on Steiner tree algorithm. Diagram: correct model
  • 31. Knowledge Graphs: Karma uses semantic models to create knowledge graphs; Karma semi-automatically builds semantic models
  • 32. American Art Collaborative [Knoblock et al., ISWC 2017]: consortium of 14 American art museums; explore the use of Linked Data for research, education, and outreach; build 5-star Linked Data for the museums; create tools to support the construction of Linked Data
  • 33. Example Model of Actor for Amon Carter
  • 34. Complete Model of Actor for Amon Carter
  • 35. AAC Data Statistics
  • 38. Statistics on What Was Mapped
  • 39. Steps To Build a KG (pipeline diagram). Stages: Data Acquisition, Feature Extraction, Feature Alignment, Entity Resolution, Graph Construction, User Interface. Other labels: Crawling, Extraction, Mapping To Ontology, Entity Linking & Similarity, Knowledge Graph Deployment, Query & Visualization, Elastic Search, Graph DB, schema.org, geonames
  • 40. Collective Entity Resolution [Zhu et al., ISWC’16]: identifying and linking instances of the same real-world entity. Multi-type graph with node types product, manufacturer, description, and price. Node labels: Product 1 to Product 5; manufacturers Bose Electronic, Sony, Bosch, Bose; descriptions Quiet Comfort 25 Noise Cancelling Headphone, Noise Cancelling Headphones, Premium Noise Cancelling Headphones, Dish Washer, Bose Noise Cancelling Headphones; prices 292, 599, 229, 299
  • 41. Collective Entity Resolution [Zhu et al., ISWC’16] (same multi-type graph as slide 40)
  • 42. Common Approach: Pairwise Comparisons. Example table of Product 1 to Product 5 (columns: Manufacturer, Title, Price). Manufacturers: Sony, Bosch, Bose, Bose Electronic. Titles: Noise Cancelling Headphones, Premium Noise Cancelling Headphones, Dish Washer, Bose Noise Cancelling Headphones, Quiet Comfort 25 Noise Cancelling Headphone. Prices: 292, 599, “299, 229”, 299. Similarity labels: Jaro 0.5, distance 0.2, Jaccard 0.3. Acceptance threshold: 0.8
  • 43. Graph Summarization: Original Graph (the multi-type product graph from slide 40)
  • 45. Graph Summarization: Super-Nodes (same product graph)
  • 47. Super-Links (same product graph)
  • 48. Predict Links In Original Graph (diagram nodes: Bose Electronic, Bosch, Bose, Product 3, Product 4, Product 5)
  • 49. Predict Links In Original Graph (same nodes, shown twice)
  • 50. Predict Links In Original Graph (same nodes)
  • 51. Re-Clustering Improves Reconstruction Quality (same nodes, shown twice)
  • 52. Quality Comparison. Each group lists Author, Paper, Product:
      System | Precision | Recall | F-measure
      Limes-F | 0.958, 0.827, 0.446 | 0.864, 0.761, 0.16 | 0.909, 0.792, 0.236
      Silk-F | 0.846, 0.877, 0.459 | 0.986, 0.756, 0.348 | 0.91, 0.812, 0.395
      Gsum | 0.727, 0.668, 0.01 | 0.569, 0.624, 0.587 | 0.638, 0.645, 0.02
      CoSum-B | 0.993, 0.871, 0.58 | 0.94, 0.611, 0.477 | 0.966, 0.718, 0.524
      Limes-MO | 0.912, 0.827, 0.446 | 0.944, 0.761, 0.16 | 0.928, 0.792, 0.236
      Silk-MO | 0.932, 0.877, 0.459 | 0.958, 0.756, 0.348 | 0.945, 0.812, 0.395
      Serf | 0.985, 0.837, 0.436 | 0.687, 0.808, 0.186 | 0.809, 0.822, 0.261
      CoSum-P | 0.999, 0.771, 0.639 | 0.997, 0.997, 0.695 | 0.998, 0.87, 0.666
      Commercial | 0.615, 0.63, 0.622 (partial row)
      AuthorLDA | 0.995 (partial row)
  • 53. Steps To Build a KG (pipeline diagram). Stages: Data Acquisition, Feature Extraction, Feature Alignment, Entity Resolution, Graph Construction, User Interface. Other labels: Crawling, Extraction, Mapping To Ontology, Entity Linking & Similarity, Knowledge Graph Deployment, Query & Visualization, Elastic Search, Graph DB, schema.org, geonames
  • 54. Counter Human Trafficking
  • 55. DIG for Counter Human Trafficking
  • 56. Find the locations where a potential victim was advertised
  • 57. Successfully deployed and used to find victims and prosecute traffickers
  • 58. Graph Construction: assembling the data for efficient query & analysis. Data represented in JSON-LD. Stored in ElasticSearch: cloud-based search engine based on Apache Lucene; horizontal scaling, replication, load balancing; queries are fast; everything is a document. Bulk loading: massive data imports (> 100M web pages). Real-time updates: live, changing data (~5,000 pages/hour)
  • 59. ElasticSearch Data Model: efficient indexing and query. Diagram labels: Adult Service, Offer, Person, Phone, Web Page
  • 60. Indexing for High Performance Knowledge Graph Queries. Chart: average query times in milliseconds under a single-user query load on 1.2 billion triples, comparing a state-of-the-art graph database (RDF) with DIG indexing deployed in ElasticSearch
  • 61. Knowledge Graph Performance in Cyber Domain. Index time for 16 million documents: ~2.5 hours. Query times: average for keyword searches, 8 msec; find a specific CVE, 14 msec; get all mentions of a MS Bulletin in all sources, 48 msec; get all malware named ‘Locky’ and sort results by observedDate, 68 msec; get all blogs mentioning keyword ‘microsoft’ with a date range, 98 msec; aggregate and give document counts for each publisher/sensor, 409 msec
  • 62. Lessons Learned: Questions Addressed in This Talk. 1. Where should the Semantic Web data come from? Triplestores? Linked data? Schema.org? 2. What is the best representation of the data in a knowledge graph? Do we want to use the most detailed ontology possible? 3. How should we deal with missing and incomplete information? Manual curation? Automated data cleaning? 4. How do we organize and store the data for efficient access? RDF? Triplestore?
  • 63. Lessons Learned: Questions Addressed in This Talk. 1. Where should the Semantic Web data come from? Triplestores? Linked data? Schema.org? Lesson: The Web! Waiting for the rest of the world to adopt the Semantic Web and provide the data in RDF is an approach doomed to failure!
  • 64. Lessons Learned: Questions Addressed in This Talk. 1. Where should the Semantic Web data come from? Triplestores? Linked data? Schema.org? 2. What is the “best” representation of the data in a knowledge graph? Do we want to use the most detailed ontology possible? Lesson: The simplest one you need for the problem you are trying to solve. Overly complicated ontologies that attempt to be comprehensive for a domain get in the way of solving the real problems.
  • 65. Lessons Learned: Questions Addressed in This Talk. 1. Where should the Semantic Web data come from? Triplestores? Linked data? Schema.org? 2. What is the best representation of the data in a knowledge graph? Carefully curated domain-specific ontologies? 3. How should we deal with missing and incorrect information? Manual curation? Automated data cleaning? Lesson: Clean where possible, but need techniques that can face these problems. The world is a messy place, and the ability to deal with it allows us to solve real-world problems.
  • 66. Lessons Learned: Questions Addressed in This Talk. 1. Where should the Semantic Web data come from? Triplestores? Linked data? Schema.org? 2. What is the best representation of the data in a knowledge graph? Carefully curated domain-specific ontologies? 3. How should we deal with missing and incomplete information? Manual curation? Automated data cleaning? 4. How do we organize and store the data for efficient access? RDF? Triplestore? Lesson: In whatever datastore best meets the goals of the problem! It is a mistake to equate the Semantic Web with triples and triplestores.
  • 67. Important Directions for Future Research: 1. Techniques for extracting data from the online sources. 2. Approaches to quickly build, refine, and extend ontologies to solve specific problems. 3. Methods for semantically annotating data from extracted sources. 4. Scalable and configurable techniques for entity resolution. 5. Highly scalable algorithms for querying and reasoning. 6. Ability to publish and query semantic data on web pages.
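
The semantic model on slides 20 to 22 can be read as a small lookup from source columns to ontology classes and properties. The Python sketch below applies such a model to one row of the sample table and emits a JSON-LD-style document. The column-to-property assignment, the dictionary names, and the function `row_to_jsonld` are assumptions made for illustration; this is not Karma's API or its actual output format.

```python
import json

# Hypothetical semantic model: column index -> (ontology class, data property).
# The assignment of columns to properties is an assumed reading of slides 21-22.
SEMANTIC_TYPES = {
    0: ("Post", "text"),
    1: ("Post", "language"),
    2: ("Topic", "name"),
    3: ("Vulnerability", "name"),
    4: ("Person", "userId"),
}

# Object properties that link the Post node to the other nodes (slide 22 labels).
RELATIONSHIPS = {
    "Topic": "hasTopic",
    "Vulnerability": "mentions",
    "Person": "author",
}


def row_to_jsonld(row):
    """Build one nested JSON-LD-style document from a five-column source row."""
    nodes = {}
    for index, value in enumerate(row):
        if not value:
            continue  # skip empty cells so no empty nodes are created
        cls, prop = SEMANTIC_TYPES[index]
        nodes.setdefault(cls, {"@type": cls})[prop] = value
    post = nodes.pop("Post", {"@type": "Post"})
    for cls, node in nodes.items():
        post[RELATIONSHIPS[cls]] = node
    return post


example_row = ["Bro can you give me a..", "English", "windows xp sp3",
               "CVE-2016-1052", "303828"]
document = row_to_jsonld(example_row)
print(json.dumps(document, indent=2))
```

Nesting the related nodes under the Post keeps each row a single document, which matches the "everything is a document" storage described on slide 58.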

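As a rough illustration of the pairwise approach on slide 42, the sketch below scores two product records with a weighted sum of per-field similarities and accepts the pair when the score reaches the threshold. It assumes that the slide's numbers are weights (0.5, 0.2, 0.3) and that 0.8 is the acceptance threshold; the pairing of measures with fields is a guess, `difflib.SequenceMatcher` stands in for Jaro similarity, and the two records are hypothetical.

```python
from difflib import SequenceMatcher

# Assumed weights and threshold, read off slide 42.
WEIGHTS = {"manufacturer": 0.5, "price": 0.2, "title": 0.3}
THRESHOLD = 0.8


def string_similarity(a, b):
    """Character-level similarity in [0, 1]; a stand-in for Jaro, not Jaro itself."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def jaccard(a, b):
    """Jaccard similarity over lower-cased word sets."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a and not words_b:
        return 1.0
    return len(words_a & words_b) / len(words_a | words_b)


def price_similarity(p, q):
    """One minus the relative price difference; 0.0 if either price is missing."""
    if p is None or q is None:
        return 0.0
    largest = max(abs(p), abs(q))
    if largest == 0:
        return 1.0
    return 1.0 - abs(p - q) / largest


def match_score(a, b):
    """Weighted sum of the three field similarities."""
    return (WEIGHTS["manufacturer"] * string_similarity(a["manufacturer"], b["manufacturer"])
            + WEIGHTS["price"] * price_similarity(a["price"], b["price"])
            + WEIGHTS["title"] * jaccard(a["title"], b["title"]))


def is_match(a, b):
    return match_score(a, b) >= THRESHOLD


# Hypothetical records loosely based on the slide's product examples.
headphones = {"manufacturer": "Bose", "title": "Bose Noise Cancelling Headphones", "price": 299}
dish_washer = {"manufacturer": "Bosch", "title": "Dish Washer", "price": 599}
print(match_score(headphones, dish_washer), is_match(headphones, dish_washer))
```

Each pair is scored in isolation here, which is the limitation the collective, graph-based approach on the following slides is meant to address.
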
Editor's Notes

  1. Karma offers suggestions on how to do the mapping
  2. Tokenize values in a given labeled column into pure alphabetic, numeric, and symbol tokens. Extract features from the tokens and the column name, and associate them with the column’s semantic type.
  3. Waiting for the rest of the world to adopt the Semantic Web and provide the data in RDF is an approach doomed to failure!
  4. Overly complicated ontologies that attempt to be comprehensive for a domain get in the way of solving the real problems.
  5. The world is a messy place and the ability to deal with it allows us to solve real-world problems
  6. It is a mistake to equate the Semantic Web with triples and triplestores.
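
Editor's note 2 describes the tokenization step used when learning semantic types. A minimal sketch, assuming regular-expression tokenization and simple count-based features; the feature names and the functions `tokenize`, `token_kind`, and `column_features` are hypothetical and are not the features or code used by Karma:

```python
import re
from collections import Counter

# Runs of letters, runs of digits, or a single symbol character.
TOKEN_PATTERN = re.compile(r"[^\W\d_]+|\d+|[^\w\s]|_")


def tokenize(value):
    """Split a cell value into pure alphabetic, numeric, and symbol tokens."""
    return TOKEN_PATTERN.findall(value)


def token_kind(token):
    if token.isalpha():
        return "alpha"
    if token.isdigit():
        return "numeric"
    return "symbol"


def column_features(column_name, values):
    """Count token kinds, non-numeric tokens, and column-name tokens."""
    features = Counter()
    for value in values:
        for token in tokenize(value):
            kind = token_kind(token)
            features["kind=" + kind] += 1
            if kind != "numeric":
                # Numbers vary too much to be useful as literal features.
                features["token=" + token.lower()] += 1
    for token in tokenize(column_name):
        features["name=" + token.lower()] += 1
    return features


features = column_features("vulnerability", ["CVE-2016-1052", "cve-2012-4969"])
print(features.most_common())
```

A feature vector like this would then be paired with the column's known semantic type (for example, the name of a Vulnerability) as one training example for a classifier.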