K E E N A N A L Y T I C S 1
Semantic Analytics, Smart Data
SPEAKERS:
Dr. Arthur Keen, Principal
Keen Analytics
Thomas Kelly, Practice Director
Cognizant Technology Solutions, Inc.
Operator, get me Klondike 5-397
Data Ecosystems are Growing in Complexity
• Tens of thousands of databases
• Millions to billions of data elements
• Dozens of markets
• Hundreds to thousands of social media sites
Analytics without semantics is like having a multilingual conversation without interpreters
Semantics Manages the Complexity of Data Variety
Semantic Analytics
Data Science
Domains
Technologies
Analytic Methods
Semantics
• Knowledge
• Expertise
• Abstraction & Diversity
• Consistency
Semantics
• Data Meaning
• Context
• Relationships
• Vocabulary
Semantic Analytics
• Emphasis is on data relationships, not just the data
• Data focus is on data concepts (abstraction), not the diversity of implementation details
• Data assumptions are made explicit in the semantic model
• The semantics guide the analytics process, rather than just the analyst’s knowledge
• SPARQL is a key component, but not the only tool in the semantic analytics toolbox
Challenges in Semantic Analytics
• Semantic models that do not abstract data concepts from their implementation details
• Semantic models that are missing semantics
• Semantic data that is missing a semantic model
• Rich, accurate provenance is required to establish confidence in the analytics results
• Data cleansing must meet requirements for accuracy, consistency, and fitness for the purpose of the analytic task and result
Semantic Analytics in the Data-to-Action Loop
[Figure: data-to-action loop in which Analytics and Semantics combine into Semantic Analytics]
Analytics steps: Analyze, Transform, Classify, Correlate, Predict, Interpret, COAs
Semantics: Representation/Provenance, Tag/Inference, Inference
Questions raised: Which relationships are relevant? What class? What kind of group? Define new relationships?
Intelligence Pyramid: Data (D), Information (I), Knowledge (K), Wisdom (W)
Clustering Through Semantic Tagging
Image Credit: historyinthecity.blogspot.com
Semantic Tags are keywords used to describe a resource (webpages, documents, business transactions)
Semantic Tags
• Tend to be user- or publisher-defined based on preferences, including terminology and depth of attribution
• May have ambiguities to resolve (synonyms, reuse/overuse, too specific, language, jargon)
Key Benefits
• Faster search of content
• Greater precision of search results
Source-Directed Tags
• Manual selection and entry by the author
• Automated population by the publisher, such as professional literature or publication websites
• Automatically excerpted from a corpus through semantic analysis of the content, guided by a controlled vocabulary
Clustering Through Semantic Tagging
Source: Implementing Iterative Algorithms with SPARQL http://ceur-ws.org/Vol-1133/paper-36.pdf
Peer-Pressure Clustering

Assign Each Tag Vertex to a Cluster
DROP GRAPH <urn:ga/g/xjz0> ;
CREATE GRAPH <urn:ga/g/xjz0> ;
INSERT { GRAPH <urn:ga/g/xjz0>
    { ?s <urn:ga/p/inCluster> ?s } }
WHERE {
  { SELECT DISTINCT ?s WHERE {
      ?s <urn:ga/p/hasLink> ?o . } } }

For Each Tag Vertex, Populate Cluster Assignments of Neighbors
# [i] and [i+1] stand for the graph names of the previous and next iteration
DROP GRAPH <urn:ga/g/xjz[i+1]> ;
CREATE GRAPH <urn:ga/g/xjz[i+1]> ;
INSERT { GRAPH <urn:ga/g/xjz[i+1]>
    { ?s <urn:ga/p/inCluster> ?clus3 } }
WHERE {
  { SELECT ?s (SAMPLE(?clus) AS ?clus3) WHERE {
      # Highest neighbor-cluster count for each vertex
      { SELECT ?s (MAX(?clusCt) AS ?maxClusCt) WHERE {
          SELECT ?s ?clus (COUNT(?clus) AS ?clusCt)
          WHERE { ?s <urn:ga/p/hasLink> ?o .
                  GRAPH <urn:ga/g/xjz[i]>
                    { ?o <urn:ga/p/inCluster> ?clus } }
          GROUP BY ?s ?clus
        } GROUP BY ?s }
      # Number of neighbors in each cluster, per vertex
      { SELECT ?s ?clus (COUNT(?clus) AS ?clusCt)
        WHERE { ?s <urn:ga/p/hasLink> ?o .
                GRAPH <urn:ga/g/xjz[i]>
                  { ?o <urn:ga/p/inCluster> ?clus } }
        GROUP BY ?s ?clus }
      # Keep the most frequent cluster; SAMPLE picks one when counts tie
      FILTER (?clusCt = ?maxClusCt)
    } GROUP BY ?s } }

Observation
• No use of semantics features, such as vocabulary and knowledge management capabilities
Strengths
• Effective over large volumes of data
• Comprehensive use of RDF data structure features
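The two SPARQL steps above can also be read as a plain loop. The following is a minimal Python sketch of that idea, not the cited paper's implementation. The function name, the fixed iteration cap, and the tie-break rule (smallest cluster label instead of SPARQL's arbitrary SAMPLE) are assumptions made for illustration. Unlike the SPARQL version, a vertex whose neighbors have no cluster simply keeps its current one.

```python
from collections import Counter


def peer_pressure_clusters(links, max_iters=20):
    """Cluster vertices by repeatedly adopting the most common neighbor cluster.

    links: iterable of (source, target) pairs, standing in for hasLink triples.
    """
    neighbors = {}
    for source, target in links:
        neighbors.setdefault(source, []).append(target)

    # Initialization: every vertex with an outgoing link is its own cluster.
    cluster = {vertex: vertex for vertex in neighbors}

    for _ in range(max_iters):
        updated = {}
        for vertex, outs in neighbors.items():
            votes = Counter(cluster[o] for o in outs if o in cluster)
            if not votes:
                # Assumption: keep the old cluster when no neighbor can vote.
                updated[vertex] = cluster[vertex]
                continue
            top = max(votes.values())
            # Assumption: deterministic tie-break on the smallest label.
            updated[vertex] = min(c for c, n in votes.items() if n == top)
        if updated == cluster:
            break  # assignments are stable
        cluster = updated
    return cluster
```

With synchronous updates like these, a pair of vertices that link only to each other can swap labels forever, so the sample data used to try this out should include a self-link for every vertex.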
Clustering Through Semantic Tagging
Common Taxonomy for Semantic Tags
Positive, Negative
Terms used in Semantic Tags: Ecstatic, Pleased, Okay
Clustering Through Semantic Tagging
Knowledge-based Taxonomy for Semantic Tags
Positive, Negative
Preferred Terms, Synonyms, and Common Misspellings: Ecstatic, Inspired, Charged, Excited; Estatic, Extatic, Egstatic
Frequently-Used Generalizations and Degrees of Specificity: Exceeds Need, Very Satisfied, Satisfied, Somewhat Satisfied
Clustering through Semantic Tagging
Process
1. Select a set of triples containing URIs of the resources, as well as the semantic tags assigned to the resources
2. Map the semantic tags to an N-level taxonomy of preferred tags, based on exact and synonym matches, and desired degree of specificity
3. Cluster resources with highest frequency semantic tag pairs
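The process above can be sketched in a few lines of Python. This is an illustration only: the PREFERRED lookup table and both function names are hypothetical, and the mapping of each label to a preferred term is an assumed stand-in for a real taxonomy of preferred labels, synonyms, and misspellings.

```python
from collections import Counter
from itertools import combinations

# Hypothetical taxonomy lookup: any known label (preferred term, synonym,
# or common misspelling), lowercased, mapped to its preferred term.
PREFERRED = {
    "ecstatic": "Ecstatic",
    "estatic": "Ecstatic",
    "extatic": "Ecstatic",
    "egstatic": "Ecstatic",
    "satisfied": "Satisfied",
    "very satisfied": "Satisfied",
}


def normalize(tag):
    """Return the preferred term for a raw tag, or None if it is unknown."""
    return PREFERRED.get(tag.strip().lower())


def tag_pair_counts(resource_tags):
    """Count how many resources carry each pair of preferred tags.

    resource_tags: dict mapping a resource URI to its list of raw tags.
    """
    counts = Counter()
    for tags in resource_tags.values():
        preferred = sorted({p for p in map(normalize, tags) if p is not None})
        # Each unordered pair is counted once per resource.
        counts.update(combinations(preferred, 2))
    return counts
```

Calling `tag_pair_counts(...).most_common()` then lists the highest-frequency tag pairs first, which is the input to the clustering step.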
Clustering through Semantic Tagging

Insert Sample Data
:Webpage1 :hasTag "10101" .
:Webpage1 :hasTag "1030303B" .
:Webpage2 :hasTag "10201" .
:Webpage2 :hasTag "1030301" .
:Webpage3 :hasTag "1030303B" .
:Webpage3 :hasTag "10201A" .
:Webpage4 :hasTag "10101B" .
:Webpage4 :hasTag "10302A" .
:Webpage5 :hasTag "1030301" .
:Webpage5 :hasTag "10101A" . …

Find Preferred/Generalized Tag Value
INSERT { ?SemanticTagURI :clusterTagValue ?clusterTagLabel }
WHERE {
  ?SemanticTagURI rdf:type :SemanticTag ;
      :hasTag ?tagLabel .
  ?Concept rdf:type skos:Concept ;
      ( skos:prefLabel|skos:altLabel|skos:hiddenLabel ) ?tagLabel .
  OPTIONAL {
    ?Concept :degreeOfSpecificity :<SPECIFICITY> ;
        skos:prefLabel ?clusterTagLabel . }
  OPTIONAL {
    ?Concept :degreeOfSpecificity ?Specificity .
    ?Concept skos:broader* ?BroaderConcept .
    ?BroaderConcept :degreeOfSpecificity ?BroaderSpecificity .
    FILTER ( ?BroaderSpecificity = :<SPECIFICITY> )
    ?BroaderConcept skos:prefLabel ?clusterTagLabel . } }

Generate Tag Pairs
INSERT
  { ?SemanticTagEdgeURI
      rdf:type :SemanticTagEdge ;
      :resourceURI ?resource ;
      :edgeNode1 ?clusterTagLabel1 ;
      :edgeNode2 ?clusterTagLabel2 . }
WHERE {
  ?SemanticTagURI1
      rdf:type :SemanticTag ;
      :resourceURI ?resource ;
      :clusterTagValue ?clusterTagLabel1 .
  ?SemanticTagURI2
      rdf:type :SemanticTag ;
      :resourceURI ?resource ;
      :clusterTagValue ?clusterTagLabel2 .
  FILTER ( ?clusterTagLabel1 != ?clusterTagLabel2 )
  BIND ( URI( CONCAT( str(?resource),
      ?clusterTagLabel1, ?clusterTagLabel2 ) ) AS ?SemanticTagEdgeURI ) }

Taxonomy
Concept
- Preferred Tag Term
- Synonyms, Misspellings
- Broader/Generalized Concepts
- Degree of Specificity

Results
• Highest Frequency Tag Pairs
• Highest Frequency Solitary Tags *
• Triple and Quadruple Tag Sets *
* Not depicted
Semantic Analytics in Two Flavors
Semantics on Analytics: semantic-assisted analysis, such as money laundering, fraud detection, community detection, insider trading…
Analytics on Semantics: understanding risk (financial trading & cyber security), transaction optimization, vulnerability assessment…
Discover Abnormal Behavior
[Figure: Probability plotted against Degree Centrality. Labeled regions: Rare Occurrence (Infrequent Communication), Normal Communication Levels, Rare Occurrence (Frequent Communication)]
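One way to make the idea on this slide concrete is to compute each node's degree centrality and flag the values that sit far from the mean. The sketch below is an assumption-laden illustration, not the speakers' method: the function names, the use of a z-score, and the threshold of 2.0 are all hypothetical choices.

```python
from collections import Counter
from statistics import mean, pstdev


def degree_centrality(edges):
    """Degree of each node divided by (n - 1), for undirected edge pairs."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    n = len(degree)
    if n < 2:
        return {}
    return {node: d / (n - 1) for node, d in degree.items()}


def flag_rare(centrality, z_threshold=2.0):
    """Label nodes whose centrality lies in either tail of the distribution."""
    if not centrality:
        return {}
    values = list(centrality.values())
    mu, sigma = mean(values), pstdev(values)
    flags = {}
    for node, value in centrality.items():
        z = 0.0 if sigma == 0 else (value - mu) / sigma
        if z >= z_threshold:
            flags[node] = "frequent"      # unusually many communications
        elif z <= -z_threshold:
            flags[node] = "infrequent"    # unusually few communications
    return flags
```

Repeated edges between the same two nodes are counted each time here, so a node that exchanges many messages with one partner also scores high; whether that is desirable depends on the analysis.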
Identifying and predicting behavior changes
Observe, Orient, Decide, Act
[Figure: Network Density plotted over Time]
Classify and predict group behavior using communication network density
• What kind of organization is this?
• What is their objective/intent? Distributing food? Terrorist attack? Cyber attack? Merger/Acquisition? Bank robbery?
• When are they going to act?
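Network density is the fraction of possible pairs of members that actually communicate. A minimal sketch of tracking it over time follows, assuming undirected communication, a fixed membership taken from everyone seen in the data, and a hypothetical input format of (period, sender, receiver) records.

```python
def density(edges, nodes):
    """Undirected density: distinct communicating pairs / possible pairs."""
    n = len(nodes)
    if n < 2:
        return 0.0
    # Ignore self-messages and collapse repeats and direction into one pair.
    pairs = {frozenset((a, b)) for a, b in edges if a != b}
    return 2 * len(pairs) / (n * (n - 1))


def density_by_period(messages):
    """Return {period: density} for (period, sender, receiver) records."""
    messages = list(messages)
    # Assumption: group membership is everyone who appears in any period.
    nodes = {person for _, a, b in messages for person in (a, b)}
    by_period = {}
    for period, a, b in messages:
        by_period.setdefault(period, []).append((a, b))
    return {p: density(by_period[p], nodes) for p in sorted(by_period)}
```

A rising series from `density_by_period` is the kind of signal the slide suggests watching; interpreting what the change means is left to the analyst and the semantic model.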
Understanding Risk: Systemic Risk Analysis
Transitive risk exposure in a network of trading partners and holding companies
[Figure, built up over five slides: a network of Company nodes labeled A through R. Edges are first shown as a generic relationship, then typed as controlledBy and tradesWith. Nodes are then classified as Bank and as BankHoldingCompany, and the final slide marks risk on the network.]
A bank holding company controls a bank or controls a bank holding company
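The recursive definition of a bank holding company can be applied as a fixed-point rule, and transitive exposure can be approximated by reachability. The sketch below assumes both: the function names are hypothetical, and treating every controlledBy or tradesWith edge as an undirected path for risk is an illustrative assumption, since the slides do not define how risk propagates.

```python
def infer_bank_holding_companies(banks, controlled_by):
    """Apply the rule until no new holding companies are found.

    banks: set of entities known to be banks.
    controlled_by: iterable of (controlled, controller) pairs.
    """
    controlled_by = list(controlled_by)
    holding = set()
    changed = True
    while changed:
        changed = False
        for controlled, controller in controlled_by:
            controls_bank_like = controlled in banks or controlled in holding
            if controls_bank_like and controller not in holding:
                holding.add(controller)
                changed = True
    return holding


def exposed_to(start, edges):
    """Entities reachable from start over undirected edges (assumed exposure)."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen, stack = {start}, [start]
    while stack:
        for neighbor in adjacency.get(stack.pop(), ()):
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    seen.discard(start)
    return seen
```

In an RDF store the same rule would more naturally be expressed as an ontology axiom or a property path, so that the classification is inferred rather than computed in application code.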
In SPARQL
# Retrieve the RDFRank of every node that has a :to edge
PREFIX : <http://pagerank/>
PREFIX rank: <http://www.ontotext.com/owlim/RDFRank#>
SELECT DISTINCT ?node ?rank
WHERE {
  GRAPH <http://pagerank> {
    { ?node :to [] . } UNION { [] :to ?node }
    ?node rank:hasRDFRank ?rank .
  }
}
ORDER BY ?node

# Set the RDFRank epsilon parameter
PREFIX rank: <http://www.ontotext.com/owlim/RDFRank#>
INSERT DATA { rank:epsilon rank:setParam "0.001" . }
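The graph name in the query above suggests a PageRank-style ranking. For readers unfamiliar with that family of algorithms, here is a generic PageRank power iteration in Python. It is not the RDFRank implementation; the function name, the damping factor of 0.85, and the use of epsilon as a stopping threshold on the largest per-node change are assumptions for illustration.

```python
def pagerank(edges, damping=0.85, epsilon=0.001, max_iters=100):
    """Generic PageRank over directed (source, target) edges."""
    nodes = sorted({n for edge in edges for n in edge})
    if not nodes:
        return {}
    out = {n: [] for n in nodes}
    for source, target in edges:
        out[source].append(target)

    count = len(nodes)
    rank = {n: 1.0 / count for n in nodes}
    for _ in range(max_iters):
        # Rank held by nodes without outgoing edges is spread over all nodes.
        dangling = sum(rank[n] for n in nodes if not out[n])
        base = (1 - damping) / count + damping * dangling / count
        new_rank = {n: base for n in nodes}
        for source in nodes:
            for target in out[source]:
                new_rank[target] += damping * rank[source] / len(out[source])
        delta = max(abs(new_rank[n] - rank[n]) for n in nodes)
        rank = new_rank
        if delta < epsilon:
            break  # converged to within the requested precision
    return rank
```

A smaller epsilon gives more stable scores at the cost of more iterations, which is the usual trade-off behind a precision parameter of this kind.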
Questions?
Thank you!
Speakers
Thomas (Tom) Kelly
Practice Director, Enterprise Information Management, Cognizant
Thomas Kelly is a Director in Cognizant’s Enterprise Information Management (EIM) Practice and heads its Semantic Technology Center of Excellence. He has 20-plus years of technology consulting experience in leading data warehousing, business intelligence and big data projects, focused primarily on the life sciences, healthcare, and financial services industries. Tom can be reached at Thomas.Kelly@cognizant.com.
Dr. Arthur Keen
Principal, Keen Analytics
Arthur Keen possesses a deep understanding of graph analytics, predictive modeling, unstructured data, categorization, text mining, natural language processing, data mining algorithms, neural networks, and Artificial Intelligence. He has used his expertise in these areas to provide thought leadership and develop applications and evaluations in multiple domains including intelligence/security informatics, business intelligence, cyber security, financial analysis, corporate governance, retail and energy. Arthur can be reached at akeen@keenassoc.com.

More Related Content

Similar to Semantic Analytics, Smart Data

SKOS as a key element in Enterprise Linked Data Strategies
SKOS as a key element in Enterprise Linked Data StrategiesSKOS as a key element in Enterprise Linked Data Strategies
SKOS as a key element in Enterprise Linked Data StrategiesSemantic Web Company
 
Assessment In Spreadsheets
Assessment In SpreadsheetsAssessment In Spreadsheets
Assessment In Spreadsheetsguest46de76
 
슬라이드 1
슬라이드 1슬라이드 1
슬라이드 1butest
 
Text Analytics for Non-Experts
Text Analytics for Non-ExpertsText Analytics for Non-Experts
Text Analytics for Non-ExpertsSynaptica, LLC
 
Semantic Perspectives for Contemporary Question Answering Systems
Semantic Perspectives for Contemporary Question Answering SystemsSemantic Perspectives for Contemporary Question Answering Systems
Semantic Perspectives for Contemporary Question Answering SystemsAndre Freitas
 
Different Semantic Perspectives for Question Answering Systems
Different Semantic Perspectives for Question Answering SystemsDifferent Semantic Perspectives for Question Answering Systems
Different Semantic Perspectives for Question Answering SystemsAndre Freitas
 
Introduction of semantic technology for SAS programmers
Introduction of semantic technology for SAS programmersIntroduction of semantic technology for SAS programmers
Introduction of semantic technology for SAS programmersKevin Lee
 
Haystack 2018 - Algorithmic Extraction of Keywords Concepts and Vocabularies
Haystack 2018 - Algorithmic Extraction of Keywords Concepts and VocabulariesHaystack 2018 - Algorithmic Extraction of Keywords Concepts and Vocabularies
Haystack 2018 - Algorithmic Extraction of Keywords Concepts and VocabulariesMax Irwin
 
Identifying Security Risks Using Auto-Tagging and Text Analytics
Identifying Security Risks Using Auto-Tagging and Text AnalyticsIdentifying Security Risks Using Auto-Tagging and Text Analytics
Identifying Security Risks Using Auto-Tagging and Text AnalyticsEnterprise Knowledge
 
Irmac presentation for website
Irmac presentation for websiteIrmac presentation for website
Irmac presentation for websiteFrank Barnes
 
Introduction to Application Profiles
Introduction to Application ProfilesIntroduction to Application Profiles
Introduction to Application ProfilesDiane Hillmann
 
SKOS as the focal point of linked data strategies
SKOS as the focal point of linked data strategiesSKOS as the focal point of linked data strategies
SKOS as the focal point of linked data strategiesSemantic Web Company
 
Semantic Web in Action: Ontology-driven information search, integration and a...
Semantic Web in Action: Ontology-driven information search, integration and a...Semantic Web in Action: Ontology-driven information search, integration and a...
Semantic Web in Action: Ontology-driven information search, integration and a...Amit Sheth
 
Climbing the Ontology Mountain to Achieve a Successful Knowledge Graph
Climbing the Ontology Mountain to Achieve a Successful Knowledge GraphClimbing the Ontology Mountain to Achieve a Successful Knowledge Graph
Climbing the Ontology Mountain to Achieve a Successful Knowledge GraphEnterprise Knowledge
 
Data Mining and the Web_Past_Present and Future
Data Mining and the Web_Past_Present and FutureData Mining and the Web_Past_Present and Future
Data Mining and the Web_Past_Present and Futurefeiwin
 
Dataset Catalogs as a Foundation for FAIR* Data
Dataset Catalogs as a Foundation for FAIR* DataDataset Catalogs as a Foundation for FAIR* Data
Dataset Catalogs as a Foundation for FAIR* DataTom Plasterer
 

Similar to Semantic Analytics, Smart Data (20)

SKOS as a key element in Enterprise Linked Data Strategies
SKOS as a key element in Enterprise Linked Data StrategiesSKOS as a key element in Enterprise Linked Data Strategies
SKOS as a key element in Enterprise Linked Data Strategies
 
Assessment In Spreadsheets
Assessment In SpreadsheetsAssessment In Spreadsheets
Assessment In Spreadsheets
 
슬라이드 1
슬라이드 1슬라이드 1
슬라이드 1
 
Text Analytics for Non-Experts
Text Analytics for Non-ExpertsText Analytics for Non-Experts
Text Analytics for Non-Experts
 
Semantic Perspectives for Contemporary Question Answering Systems
Semantic Perspectives for Contemporary Question Answering SystemsSemantic Perspectives for Contemporary Question Answering Systems
Semantic Perspectives for Contemporary Question Answering Systems
 
Different Semantic Perspectives for Question Answering Systems
Different Semantic Perspectives for Question Answering SystemsDifferent Semantic Perspectives for Question Answering Systems
Different Semantic Perspectives for Question Answering Systems
 
Introduction of semantic technology for SAS programmers
Introduction of semantic technology for SAS programmersIntroduction of semantic technology for SAS programmers
Introduction of semantic technology for SAS programmers
 
Haystack 2018 - Algorithmic Extraction of Keywords Concepts and Vocabularies
Haystack 2018 - Algorithmic Extraction of Keywords Concepts and VocabulariesHaystack 2018 - Algorithmic Extraction of Keywords Concepts and Vocabularies
Haystack 2018 - Algorithmic Extraction of Keywords Concepts and Vocabularies
 
Recommender Systems and Linked Open Data
Recommender Systems and Linked Open DataRecommender Systems and Linked Open Data
Recommender Systems and Linked Open Data
 
The Power of Data
The Power of DataThe Power of Data
The Power of Data
 
Identifying Security Risks Using Auto-Tagging and Text Analytics
Identifying Security Risks Using Auto-Tagging and Text AnalyticsIdentifying Security Risks Using Auto-Tagging and Text Analytics
Identifying Security Risks Using Auto-Tagging and Text Analytics
 
Irmac presentation for website
Irmac presentation for websiteIrmac presentation for website
Irmac presentation for website
 
Introduction to Application Profiles
Introduction to Application ProfilesIntroduction to Application Profiles
Introduction to Application Profiles
 
SKOS as the focal point of linked data strategies
SKOS as the focal point of linked data strategiesSKOS as the focal point of linked data strategies
SKOS as the focal point of linked data strategies
 
Semantic Web in Action: Ontology-driven information search, integration and a...
Semantic Web in Action: Ontology-driven information search, integration and a...Semantic Web in Action: Ontology-driven information search, integration and a...
Semantic Web in Action: Ontology-driven information search, integration and a...
 
Climbing the Ontology Mountain to Achieve a Successful Knowledge Graph
Climbing the Ontology Mountain to Achieve a Successful Knowledge GraphClimbing the Ontology Mountain to Achieve a Successful Knowledge Graph
Climbing the Ontology Mountain to Achieve a Successful Knowledge Graph
 
Taxonomy Quality Assessment
Taxonomy Quality AssessmentTaxonomy Quality Assessment
Taxonomy Quality Assessment
 
Data Mining and the Web_Past_Present and Future
Data Mining and the Web_Past_Present and FutureData Mining and the Web_Past_Present and Future
Data Mining and the Web_Past_Present and Future
 
Data Mining
Data MiningData Mining
Data Mining
 
Dataset Catalogs as a Foundation for FAIR* Data
Dataset Catalogs as a Foundation for FAIR* DataDataset Catalogs as a Foundation for FAIR* Data
Dataset Catalogs as a Foundation for FAIR* Data
 

Recently uploaded

Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% SecurePooja Nehwal
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...amitlee9823
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Delhi Call girls
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxolyaivanovalion
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfMarinCaroMartnezBerg
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxolyaivanovalion
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfLars Albertsson
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfadriantubila
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxJohnnyPlasten
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxolyaivanovalion
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFxolyaivanovalion
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxolyaivanovalion
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxfirstjob4
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxolyaivanovalion
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...amitlee9823
 

Recently uploaded (20)

Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
Sampling (random) method and Non random.ppt
Sampling (random) method and Non random.pptSampling (random) method and Non random.ppt
Sampling (random) method and Non random.ppt
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 

Semantic Analytics, Smart Data

  • 1. K E E N A N A L Y T I C S 1 Semantic SPEAKERS: Dr. Arthur Keen, Principal Keen Analytics Thomas Kelly, Practice Director Cognizant Technology Solutions, Inc.
  • 2. K E E N A N A L Y T I C S 2 Operator, get me Klondike 5-397
  • 3. K E E N A N A L Y T I C S 3 Data Ecosystems are Growing in Complexity Tens of thousands of databases Millions to billions of data elements Dozens of markets Hundreds to thousands of social media sites
  • 4. K E E N A N A L Y T I C S 4 Analytics without semantics is like having a multi-lingual conversation without interpreters Semantics Manages the Complexity of Data Variety
  • 5. K E E N A N A L Y T I C S 5 Semantic Analytics Data Science Domains Technologies Analytic Methods Semantics • Knowledge • Expertise • Abstraction & Diversity • Consistency Semantics • Data Meaning • Context • Relationships • Vocabulary
  • 6. K E E N A N A L Y T I C S 6 Semantic Analytics Emphasis is on data relationships, not just the data Data focus is on data concepts (abstraction), not the diversity of implementation details Data assumptions are made explicit in the semantic model The semantics guide the analytics process, rather than just the analyst’s knowledge SPARQL is a key component, but not the only tool in the semantic analytics toolbox
  • 7. K E E N A N A L Y T I C S 7 Challenges in Semantic Analytics 1 2 3 Semantic models that do not abstract data concepts from their implementation details 4 Semantic models that are missing semantics Semantic data that is missing a semantic model Rich, accurate provenance is required to establish confidence in the analytics results 5 Data cleansing must meet requirements for accuracy, consistency, and fitness for the purpose of the analytic task and result
  • 8. K E E N A N A L Y T I C S 8 Semantics Analytics in the Data to Action Loop Analyze Transform Classify Correlate Predict Interpret COA’s Semantics Which relationships relevant? What class? What kind of group? Define New Relationships? Inference Tag/Inference Representation/Provenance Wisdom Knowledge Information Data W K I D W K I D W K I D W K I D Semantics Intelligence Pyramid Analytics Semantic Analytics
  • 9. K E E N A N A L Y T I C S 9 Clustering Through Semantic Tagging Image Credit: historyinthecity.blogspot.com Semantic Tags • Tend to be user- or publisher-defined based on preferences, including terminology and depth of attribution • May have ambiguities to resolve (synonyms, reuse/overuse, too specific, language, jargon) Key Benefits • Faster search of content • Greater precision of search results Semantic Tags are keywords used to describe a resource (webpages, documents, business transactions) Source-Directed Tags • Manual selection and entry by the author • Automated population by the publisher, such as professional literature or publication websites • Automatically excerpted from a corpus through semantic analysis of the content, guided by a controlled vocabulary
  • 10. K E E N A N A L Y T I C S 10 Clustering Through Semantic Tagging Source: Implementing Iterative Algorithms with SPARQL http://ceur-ws.org/Vol-1133/paper-36.pdf DROP GRAPH <urn:ga/g/xjz[i+1]> ; CREATE GRAPH <urn:ga/g/xjz[i+1]> ; INSERT { GRAPH <urn:ga/g/xjz[i+1]> {?s <urn:ga/p/inCluster> ?clus3 } } WHERE { { SELECT ?s (SAMPLE(?clus) AS ?clus3) WHERE { { SELECT ?s (MAX(?clusCt) AS ?maxClusCt) WHERE { SELECT ?s ?clus (COUNT(?clus) AS ?clusCt) WHERE { ?s <urn:ga/p/hasLink> ?o . GRAPH <urn:ga/g/xjz[i] > ?clus } } GROUP BY ?s ?clus } GROUP BY ?s } { SELECT ?s ?clus (COUNT(?clus) AS ?clusCt) WHERE { ?s <urn:ga/p/hasLink> ?o . GRAPH <urn:ga/g/xjz[i]> { ?o <urn:ga/p/inCluster ?clus } } GROUP BY ?s ?clus } FILTER (?clusCt = ?maxClusCt) } GROUP BY ?s } } DROP GRAPH <urn:ga/g/xjz0> ; CREATE GRAPH <urn:ga/g/xjz0> ; INSERT { GRAPH <urn:ga/g/xjz0> {?s <urn:ga/p/inCluster> ?s } } WHERE { { SELECT DISTINCT ?s WHERE { { SELECT ?s <urn:ga/p/hasLink> ?o . } } Assign Each Tag Vertex to a Cluster For Each Tag Vertex, Populate Cluster Assignments of Neighbors Peer-Pressure Clustering Observation • No use of semantics features, such as vocabulary and knowledge management capabilities Strengths • Effective over large volumes of data • Comprehensive use of RDF data structure features
  • 11. K E E N A N A L Y T I C S 11 Clustering Through Semantic Tagging Positive Negative Ecstatic Pleased Okay Terms used in Semantic Tags Common Taxonomy for Semantic Tags
  • 12. K E E N A N A L Y T I C S 12 Clustering Through Semantic Tagging Positive Negative Ecstatic Inspired Charged Excited Exceeds Need Very Satisfied Satisfied Somewhat Satisfied Preferred Terms, Synonyms, and Common Misspellings Frequently-Used Generalizations and Degrees of Specificity Knowledge-based Taxonomy for Semantic Tags Estatic Extatic Egstatic
  • 13. K E E N A N A L Y T I C S 13 Clustering through Semantic Tagging Process Cluster resources with highest frequency semantic tag pairs Map the semantic tags to an N-level taxonomy of preferred tags, based on exact and synonym matches, and desired degree of specificity Select a set of triples containing URIs of the resources, as well as the semantic tags assigned to the resources
  • 14. K E E N A N A L Y T I C S 14 INSERT { ?SemanticTagEdgeURI rdf:type :SemanticTagEdge ; :resourceURI ?resource ; :edgeNode1 ?clusterTagLabel1 ; :edgeNode2 ?clusterTagLabel2 . } WHERE { ?SemanticTagURI1 rdf:type :SemanticTag ; :resourceURI ?resource ; :clusterTagValue ?clusterTagLabel1 . ?SemanticTagURI2 rdf:type :SemanticTag ; :resourceURI ?resource ; :clusterTagValue ?clusterTagLabel2 . FILTER ( ?clusterTagLabel1 != ?clusterTagLabel2 ) BIND ( URI( CONCAT( str(?resource), ?clusterTagLabel1, ?clusterTagLabel2 ) ) AS ?SemanticTagEdgeURI ) } Clustering through Semantic Tagging :Webpage1 :hasTag “10101” . :Webpage1 :hasTag “1030303B” . :Webpage2 :hasTag “10201” . :Webpage2 :hasTag “1030301” . :Webpage3 :hasTag “1030303B” . :Webpage3 :hasTag “10201A” . :Webpage4 :hasTag “10101B” . :Webpage4 :hasTag “10302A” . :Webpage5 :hasTag “1030301” . :Webpage5 :hasTag “10101A” . … INSERT { ?SemanticTagURI :clusterTagValue ?clusterTagLabel } WHERE { ?SemanticTagURI rdf:type :SemanticTag ; :hasTag ?tagLabel . ?Concept rdf:type skos:Concept ; ( skos:prefLabel|skos:altLabel|skos:hiddenLabel ) ?tagLabel . OPTIONAL { ?Concept :degreeOfSpecificity :<SPECIFICITY> ; skos:prefLabel ?clusterTagLabel . } OPTIONAL { ?Concept :degreeOfSpecificity ?Specificity . ?Concept skos:broader* ?BroaderConcept . ?BroaderConcept :degreeOfSpecificity ?BroaderSpecificity . FILTER ( ?BroaderSpecificity = :<SPECIFICITY> ) ?BroaderConcept skos:prefLabel ?clusterTagLabel . } } Insert Sample Data Find Preferred/Generalized Tag Value Generate Tag Pairs Concept - Preferred Tag Term - Synonyms, Misspellings - Broader/Generalized Concepts - Degree of Specificity Taxonomy • Highest Frequency Tag Pairs • Highest Frequency Solitary Tags * • Triple and Quadruple Tag Sets * Results * Not depicted
  • 15. Semantic Analytics in Two Flavors. Semantics on Analytics: semantic-assisted analysis (money laundering, fraud detection, community detection, insider trading…). Analytics on Semantics: understanding risk (financial trading and cyber security), transaction optimization, vulnerability assessment…
  • 16. Discover Abnormal Behavior [Figure: probability distribution over degree centrality; the tails are labeled Rare Occurrence (Infrequent Communication) and Rare Occurrence (Frequent Communication), the center Normal Communication Levels]
  • 17. Identifying and predicting behavior changes. Classify and predict group behavior using communication network density. [Figure: Network Density plotted against Time; labels: Observe, Orient, Decide, Act] What kind of organization is this? What is their objective/intent? Distributing food? Terrorist attack? Cyber attack? Merger/Acquisition? Bank robbery? When are they going to act?
  • 18. Understanding Risk: Systemic Risk Analysis. Transitive risk exposure in a network of trading partners and holding companies. [Figure: network of nodes A to R linked by generic "relationship" edges; legend: Company]
  • 19. Systemic Risk Analysis: Transitive risk exposure in a network of trading partners and holding companies. [Figure: the same network with edges typed as controlledBy or tradesWith; legend: Company]
  • 20. Systemic Risk Analysis: Transitive risk exposure in a network of trading partners and holding companies. [Figure: the same network with controlledBy and tradesWith edges; legend: Company, Bank]
  • 21. Systemic Risk Analysis: Transitive risk exposure in a network of trading partners and holding companies. A bank holding company controls a bank or controls a bank holding company. [Figure: the same network with controlledBy and tradesWith edges; legend: Bank, BankHoldingCompany]
  • 22. Systemic Risk Analysis: Transitive risk exposure in a network of trading partners and holding companies. [Figure: the same network with controlledBy and tradesWith edges; legend: Bank, BankHoldingCompany, risk]
  • 23. In SPARQL
      PREFIX : <http://pagerank/>
      PREFIX rank: <http://www.ontotext.com/owlim/RDFRank#>
      SELECT DISTINCT ?node ?rank
      WHERE {
        GRAPH <http://pagerank> {
          { ?node :to [] . } UNION { [] :to ?node }
          ?node rank:hasRDFRank ?rank .
        }
      }
      ORDER BY ?node

      PREFIX rank: <http://www.ontotext.com/owlim/RDFRank#>
      INSERT DATA { rank:epsilon rank:setParam "0.001" . }
  • 24. Questions?
  • 25. Thank you!
  • 26. Speakers
      Thomas (Tom) Kelly, Practice Director, Enterprise Information Management, Cognizant. Thomas Kelly is a Director in Cognizant’s Enterprise Information Management (EIM) Practice and heads its Semantic Technology Center of Excellence. He has 20-plus years of technology consulting experience in leading data warehousing, business intelligence and big data projects, focused primarily on the life sciences, healthcare, and financial services industries. Tom can be reached at Thomas.Kelly@cognizant.com.
      Dr. Arthur Keen, Principal, Keen Analytics. Arthur Keen possesses a deep understanding of graph analytics, predictive modeling, unstructured data, categorization, text mining, natural language processing, data mining algorithms, neural networks, and Artificial Intelligence. He has used his expertise in these areas to provide thought leadership and develop applications and evaluations in multiple domains including intelligence/security informatics, business intelligence, cyber security, financial analysis, corporate governance, retail and energy. Arthur can be reached at akeen@keenassoc.com.
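The tag-pair clustering on slides 13 and 14 can also be illustrated outside a triple store. The following is a minimal Python sketch, not the speakers' implementation: the tag values come from the sample data on slide 14, while the `cluster_label` mapping is a hypothetical stand-in for the SKOS taxonomy lookup (it assumes, purely for illustration, that tags sharing a five-character prefix generalize to the same preferred tag).

```python
from collections import Counter
from itertools import combinations

# Sample tags taken from slide 14.
resource_tags = {
    "Webpage1": ["10101", "1030303B"],
    "Webpage2": ["10201", "1030301"],
    "Webpage3": ["1030303B", "10201A"],
    "Webpage4": ["10101B", "10302A"],
    "Webpage5": ["1030301", "10101A"],
}

# HYPOTHETICAL taxonomy mapping (assumption): raw tag -> preferred tag at the
# desired degree of specificity. In the slides this comes from SKOS concepts.
cluster_label = {
    "10101": "10101", "10101A": "10101", "10101B": "10101",
    "10201": "10201", "10201A": "10201",
    "1030301": "10303", "1030303B": "10303",
    "10302A": "10302",
}


def tag_pair_counts(resource_tags, cluster_label):
    """Count how many resources carry each unordered pair of cluster labels."""
    counts = Counter()
    for tags in resource_tags.values():
        # Generalize each tag, drop unknown tags, and de-duplicate per resource.
        labels = sorted({cluster_label[t] for t in tags if t in cluster_label})
        # Sorted input makes (a, b) and (b, a) the same pair.
        counts.update(combinations(labels, 2))
    return counts


pair_counts = tag_pair_counts(resource_tags, cluster_label)
for pair, count in pair_counts.most_common():
    print(pair, count)
```

Counting unordered pairs avoids the double counting that an inequality filter on the two labels would produce, since such a filter yields each pair in both orders.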
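The bank holding company rule on slide 21 is recursive: a company that controls a bank is a bank holding company, and so is a company that controls a bank holding company. A minimal Python sketch of that rule as a fixed-point computation follows. The edge list and node names are hypothetical, chosen only to exercise the rule, and do not reproduce the network in the figure.

```python
def infer_bank_holding_companies(banks, controlled_by):
    """Apply the slide-21 rule until no new holding companies are found.

    banks: set of organizations known to be banks.
    controlled_by: iterable of (controlled, controller) pairs.
    """
    edges = list(controlled_by)
    holding = set()
    changed = True
    while changed:
        changed = False
        for controlled, controller in edges:
            controls_bank_or_bhc = controlled in banks or controlled in holding
            if controls_bank_or_bhc and controller not in holding:
                holding.add(controller)
                changed = True
    return holding


# HYPOTHETICAL example data (assumption, not the graph from the slides).
banks = {"A", "B"}
controlled_by = [
    ("M", "N"),  # M is controlled by N
    ("A", "M"),  # bank A is controlled by M
    ("B", "N"),  # bank B is controlled by N
    ("C", "O"),  # C is not a bank, so O is not inferred
]
holding = infer_bank_holding_companies(banks, controlled_by)
print(sorted(holding))
```

In a triple store the same inference would typically be expressed as a rule or a property path over controlledBy; the loop here only makes the recursion explicit.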

Editor's Notes

  1. Recent sessions on semantic analytics focus on data prep -- how semantics supports analytics. We’ll talk about how the analytics are based on and enabled by semantics.
  2. Bell Telephone Operator story -- everyone becomes an operator -- everyone will become a data analyst or data scientist. How can semantics fundamentally change how analytics are performed -- by experts and non-experts?
  3. We’re seeing an explosion of data resources that are available to support analytic activities. But with each new set of data comes new terminology, data formats, definitions, context, and more. Our ability to analyze a set of data is no longer impeded by technology’s limitations but rather by our own ability to absorb and understand the variety of data that we will leverage to solve problems. While many technologies can ingest standard data formats, disparate encoding formats and the lack of data meaning captured by these systems cause lengthy delays in onboarding new data into an organization’s data ecosystem. Further, once the data has been onboarded and rationalized, the variety of data repositories and the types of data that they manage far exceed a human’s ability to remember which databases and data elements are best for which purposes.
  4. Data Science combines expertise in Domains, Technology, and Analytic Methods. Semantics capture and embed data meaning, context, and predefined data integration.
  5. Semantic data that is missing a semantic model: we may have triples, but no class or property definitions. Semantic models that are missing semantics: data and relationship definitions, but no properties or annotations that describe the data. Semantic models that do not abstract data concepts from their implementation details --
  6. Optional: check whether the term is a known term in the hierarchy. If not, bind the tag and stop.
  7. 3-4 minutes: Semantic analytics comes in two flavors. The first is Semantics on Analytics, or semantically assisted analytics, where semantic models including vocabularies, ontologies, and provenance are consulted before, during, and after analytics. Typical use cases are fraud detection, money laundering detection, community detection, and insider trading detection. The second, Analytics on Semantics, involves applying the analytics to the semantic model (network). This is used for understanding the behavior of complex systems, for example in risk analysis, cyber security, and vulnerability assessments.
  8. 3-4 minutes: We are looking for abnormal behavior in communication. In a complex graph with high dimensionality (lots of properties) we need guidance on which relationships constitute communication. We can use this to restrict the relationships being considered. In this example we use degree centrality as a metric and compute a frequency distribution for it in order to find rare behavior. We are seeking anomalies in either tail of the distribution, for example a hacker with no internal communication, or a sharp increase or decrease in communication between trading partners.
  9. 3-4 minutes: Describe how metrics like network density (or triangle counts) indicate kinds of behavior and behavior changes, and how this can be used to predict behavior.
  10. Slides 13 to 18 are like flash cards. 3-4 minutes for the set. Spend no more than 20 seconds on each. Given a complex graph representing interactions between trading partners, we would like to understand the relative risk being absorbed by different entities that results from the trading activity. (The sample data below is inserted under the same http://risk/example/ prefix and named graph that the query and the results use.)
      PREFIX : <http://risk/example/>
      INSERT DATA {
        GRAPH <http://risk/example> {
          :A :tradesWith :D. :D :tradesWith :A.
          :B :tradesWith :D. :D :tradesWith :B.
          :C :tradesWith :D. :D :tradesWith :C.
          :D :tradesWith :E. :E :tradesWith :D.
          :D :tradesWith :I. :I :tradesWith :D.
          :D :tradesWith :H. :H :tradesWith :D.
          :D :tradesWith :G. :G :tradesWith :D.
          :E :tradesWith :F. :F :tradesWith :E.
          :F :tradesWith :H. :H :tradesWith :F.
          :H :tradesWith :J. :J :tradesWith :H.
          :G :tradesWith :J. :J :tradesWith :G.
          :H :tradesWith :L. :L :tradesWith :H.
          :J :tradesWith :L. :L :tradesWith :J.
          :I :tradesWith :K. :K :tradesWith :I.
          :K :tradesWith :L. :L :tradesWith :K.
        }
      }

      PREFIX rank: <http://www.ontotext.com/owlim/RDFRank#>
      INSERT DATA { _:b1 rank:compute _:b2. }

      PREFIX : <http://risk/example/>
      PREFIX rank: <http://www.ontotext.com/owlim/RDFRank#>
      SELECT DISTINCT *
      WHERE {
        GRAPH <http://risk/example> {
          ?bank :tradesWith [] .
          ?bank rank:hasRDFRank ?rank .
        }
      }
      ORDER BY DESC(?rank)

          bank                     rank
       1  http://risk/example/D    "1.00"^^xsd:float
       2  http://risk/example/H    "0.51"^^xsd:float
       3  http://risk/example/J    "0.38"^^xsd:float
       4  http://risk/example/L    "0.38"^^xsd:float
       5  http://risk/example/E    "0.27"^^xsd:float
       6  http://risk/example/I    "0.27"^^xsd:float
       7  http://risk/example/F    "0.26"^^xsd:float
       8  http://risk/example/K    "0.26"^^xsd:float
       9  http://risk/example/G    "0.25"^^xsd:float
      10  http://risk/example/A    "0.13"^^xsd:float
      11  http://risk/example/B    "0.13"^^xsd:float
      12  http://risk/example/C    "0.13"^^xsd:float

      PREFIX rank: <http://www.ontotext.com/owlim/RDFRank#>
      INSERT DATA { _:b1 rank:computeIncremental "true" }
  11. We identify the controlledBy and tradesWith relationships as relationships that propagate risk. In a real deployment you would use actual transactions rather than abstracting it like this.
  12. We identify the organizations that are banks.
  13. And infer the bank holding companies using the bank holding company rule.
  14. We apply the PageRank algorithm to the topology to provide an overall picture of the relative risk of the organizations.
  15. If running out of time, skip over this one. Briefly, here is how this is done in SPARQL. Can explain in more detail after session. If asked, I used a combination of Ontotext, Linkurious, and Neo4j to do this.
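Note 8 describes using degree centrality and its frequency distribution to find rare communication behavior. A minimal Python sketch of that idea is below, under stated assumptions: the communication edges, the node names, and the `max_share` threshold are all hypothetical, and "rare" is taken to mean that at most a given share of nodes have the same centrality value.

```python
from collections import Counter


def degree_centrality(edges, nodes=()):
    """Normalized degree: distinct neighbors divided by (node count - 1).

    `nodes` lets callers include participants with no communication at all.
    """
    neighbors = {node: set() for node in nodes}
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(nbrs) / (n - 1) for node, nbrs in neighbors.items()}


def rare_nodes(centrality, max_share=0.1):
    """Return nodes whose centrality value occurs in at most max_share of nodes."""
    frequency = Counter(round(value, 6) for value in centrality.values())
    total = len(centrality)
    return {
        node
        for node, value in centrality.items()
        if frequency[round(value, 6)] / total <= max_share
    }


# HYPOTHETICAL communication edges (assumption): one very active hub,
# six ordinary participants, and one participant with no communication.
edges = [("hub", f"p{i}") for i in range(1, 7)]
edges += [("p1", "p2"), ("p3", "p4"), ("p5", "p6")]
centrality = degree_centrality(edges, nodes=["loner"])
rare = rare_nodes(centrality, max_share=0.2)
print(sorted(rare))
```

Both tails show up in the result: the hub communicates far more than normal and the isolated participant far less, which corresponds to the two kinds of rare occurrence on slide 16.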
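Note 9 refers to network density as an indicator of behavior change. For an undirected network of n members with E distinct links, density is 2E / (n(n - 1)), the fraction of possible links that are present. The sketch below computes it per time window; the member list and the windows are hypothetical, and a fixed membership is assumed so that values are comparable across windows.

```python
def network_density(edges, members):
    """Fraction of possible undirected links among `members` that are present."""
    members = set(members)
    n = len(members)
    if n < 2:
        return 0.0
    # Treat (a, b) and (b, a) as one link; ignore self-loops and outsiders.
    links = {
        frozenset((a, b))
        for a, b in edges
        if a != b and a in members and b in members
    }
    return 2 * len(links) / (n * (n - 1))


def density_over_time(windows, members):
    """Density for each time window, in the order given."""
    return [network_density(edges, members) for edges in windows]


# HYPOTHETICAL group and communication windows (assumption).
members = ["a", "b", "c", "d"]
windows = [
    [("a", "b")],
    [("a", "b"), ("b", "c"), ("c", "d")],
    [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("c", "d")],
]
densities = density_over_time(windows, members)
print(densities)
```

A rising series like this one is the kind of signal slide 17 associates with a group moving toward action; how to classify a particular trajectory is left to the analyst and is not modeled here.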
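Note 14 applies PageRank to the trading topology. For readers without a triple store at hand, here is a minimal Python sketch of a generic power-iteration PageRank over the tradesWith pairs listed in note 10. It is not Ontotext's RDFRank implementation: the damping factor, tolerance, and iteration limit are assumptions, and the scores sum to 1 rather than being scaled like the table in note 10, so the numbers will not reproduce that table.

```python
def pagerank(edges, damping=0.85, tol=1e-10, max_iter=200):
    """Generic PageRank by power iteration over directed (source, target) edges."""
    out = {}
    for src, dst in edges:
        out.setdefault(src, set()).add(dst)
        out.setdefault(dst, set())
    nodes = sorted(out)
    n = len(nodes)
    rank = dict.fromkeys(nodes, 1.0 / n)
    for _ in range(max_iter):
        # Rank held by nodes without outgoing edges is spread evenly.
        dangling = sum(rank[v] for v in nodes if not out[v])
        new = dict.fromkeys(nodes, (1 - damping) / n + damping * dangling / n)
        for v in nodes:
            if out[v]:
                share = damping * rank[v] / len(out[v])
                for w in out[v]:
                    new[w] += share
        delta = sum(abs(new[v] - rank[v]) for v in nodes)
        rank = new
        if delta < tol:
            break
    return rank


# tradesWith pairs from note 10; each pair is asserted in both directions there.
pairs = "AD BD CD DE DI DH DG EF FH HJ GJ HL JL IK KL".split()
edges = [(a, b) for a, b in pairs] + [(b, a) for a, b in pairs]
ranks = pagerank(edges)
for node in sorted(ranks, key=ranks.get, reverse=True):
    print(node, round(ranks[node], 4))
```

Because every trade is recorded in both directions, the scores track how many trading partners each node has, which is why D, with seven partners, comes out on top.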