Relation-wise Automatic Domain-Range Information Management for Knowledge Entries
Md-Mizanur Rahoman & Ryutaro Ichise
The Graduate University for Advanced Studies, Tokyo, Japan
National Institute of Informatics, Tokyo, Japan
Begum Rokeya University, Rangpur, Bangladesh
Outline
• Background
• Problem & Possible Solution
• Proposed Framework
• Experiment
• Conclusion
30-Jan-2017 Relation-wise Automatic Domain-Range Information Management for Knowledge Entries I Rahoman & Ichise 2
Background
• knowledge-base (KB) construction and management have gained interest
• relations play a major role in KBs
• construction – generation of knowledge entries
<Subject, relation, Object>
• e.g., <Obama, born_in, Hawaii>
• management – validation of knowledge entries
• e.g., domain(born_in) = Person, range(born_in) = Place
• not all knowledge bases maintain domain-range validation for relations, e.g., Freebase
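The domain-range check described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `ENTITY_TYPE` and `SCHEMA` tables, the `Triple` type, and the `is_valid` helper are all hypothetical names, and the type assignments are made up for the example.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """A knowledge entry <Subject, relation, Object>."""
    subject: str
    relation: str
    obj: str


# Hypothetical type assignments and schema, for illustration only.
ENTITY_TYPE = {"Obama": "Person", "Hawaii": "Place", "Paprika": "Book"}
SCHEMA = {"born_in": ("Person", "Place")}  # relation -> (domain, range)


def is_valid(triple: Triple) -> bool:
    """Return True when the subject and object types match the relation's domain and range."""
    if triple.relation not in SCHEMA:
        return False  # no domain-range information to check against
    domain, range_ = SCHEMA[triple.relation]
    return (ENTITY_TYPE.get(triple.subject) == domain
            and ENTITY_TYPE.get(triple.obj) == range_)


print(is_valid(Triple("Obama", "born_in", "Hawaii")))
print(is_valid(Triple("Paprika", "born_in", "Hawaii")))
```

A hand-written schema like this is the manual step that the deck aims to replace with learned models.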
Problem
• existence of wrong entries – e.g., in current
• costly maintenance - domain-range selection is not automatic
• manual checking is time-consuming
• requires domain-level expertise
Subject            Relation  Object
Paprika            type      Book
Paprika            author    Yasutaka Tsutsui
Freedom in Exile   type      Book
Freedom in Exile   author    14
Possible Solution
• Intuition
• Subjects of a relation should hold some similarity
• extract features for Subject entities and generate a learning model, e.g.,
• Subject(born_in) complies only if it is a Person, i.e., the domain
• Objects of a relation should hold some similarity
• extract features for Object entities and generate a learning model, e.g.,
• Object(born_in) complies only if it is a Place, i.e., the range
Proposed Framework
• required resource
• language-specific relations - e.g., born_in, spouse, author, etc.
• language-specific training examples - e.g., entries
• language-specific large text corpus - e.g.,
Subject   Relation  Object
Obama     born_in   Hawaii
Trump     born_in   New York
Clinton   born_in   Chicago
…         …         …
Proposed Framework
• process
• Word Vectorizer
• generate features for words from a large text corpus
• Model Generator
• generate supervised machine learning models for the extracted features
Word Vectorizer
• take large text corpus e.g.,
• use the Word2Vec* implementation for word embedding
• generate feature vectors for the text vocabulary
• maintain the linguistic context of the corpus
• map similar words to similar vectors
* https://code.google.com/p/word2vec/
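To illustrate why words used in similar contexts end up with similar vectors, here is a small sketch. It is not Word2Vec: it swaps in plain co-occurrence counts as a dependency-free stand-in for the learned embeddings the deck uses. The toy corpus, the `window` parameter, and the function names are assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0


# Toy corpus, invented for illustration.
corpus = [
    "obama was born in hawaii",
    "trump was born in newyork",
    "paprika is a novel",
]
vecs = cooccurrence_vectors(corpus)

# Person names share contexts ("was", "born"), so they score higher with each other
# than with a book title.
print(cosine(vecs["obama"], vecs["trump"]) > cosine(vecs["obama"], vecs["paprika"]))
```

A real embedding model learns dense vectors rather than raw counts, but the property the framework relies on is the same: shared context implies nearby vectors.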
Model Generator (1/4)
• For each relation
• collect positive and negative training words
• collect feature vectors for the training words
• generate two supervised machine learning models (a domain model and a range model) that classify
• whether a word element belongs to the domain
• whether a word element belongs to the range
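One way the two per-relation models could be used to validate a knowledge entry is sketched below. The slides do not say how the two decisions are combined; accepting an entry only when both models accept it is an assumption of this sketch, as are the `check_entry` name, the toy vectors, and the threshold lambdas standing in for trained models.

```python
from typing import Callable, Dict, Sequence, Tuple

Vector = Sequence[float]
Classifier = Callable[[Vector], bool]
Entry = Tuple[str, str, str]  # (Subject, relation, Object)


def check_entry(entry: Entry,
                vectors: Dict[str, Vector],
                domain_models: Dict[str, Classifier],
                range_models: Dict[str, Classifier]) -> bool:
    """Accept an entry only if its Subject fits the domain model and its Object fits the range model."""
    subject, relation, obj = entry
    if subject not in vectors or obj not in vectors:
        return False  # assumption: words without a feature vector cannot be validated
    return (domain_models[relation](vectors[subject])
            and range_models[relation](vectors[obj]))


# Toy feature vectors and stand-in models, invented for illustration.
vectors = {"Obama": [1.0, 0.0], "Hawaii": [0.0, 1.0]}
domain_models = {"born_in": lambda v: v[0] > 0.8}  # "looks like a Person"
range_models = {"born_in": lambda v: v[1] > 0.8}   # "looks like a Place"

print(check_entry(("Obama", "born_in", "Hawaii"), vectors, domain_models, range_models))
print(check_entry(("Hawaii", "born_in", "Obama"), vectors, domain_models, range_models))
```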
Model Generator (2/4)
• positive features
• collected from existing knowledge entries
• divided into Subject element feature vectors and Object element feature vectors
Subject   Relation  Object
Obama     born_in   Hawaii
Trump     born_in   New York
Clinton   born_in   Chicago
…         …         …
Model Generator (3/4)
• negative features
• collected from random vocabulary words of the text corpus
• exclude word elements already used as positives
• keep the number of negative training words equal to the number of positive ones
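The negative-sampling rules above can be sketched as follows. The function name, the fixed `seed`, and the toy vocabulary are assumptions for illustration; the slides do not specify how the random draw is performed.

```python
import random


def sample_negative_words(vocabulary, positives, seed=0):
    """Draw as many random vocabulary words as there are positives, excluding the positives."""
    candidates = sorted(set(vocabulary) - set(positives))
    if len(candidates) < len(positives):
        raise ValueError("vocabulary too small for balanced sampling")
    # A seeded generator keeps the sketch reproducible; sorting makes the draw
    # independent of set iteration order.
    return random.Random(seed).sample(candidates, len(positives))


# Toy data, invented for illustration.
positive_subjects = ["obama", "trump", "clinton"]
vocabulary = ["obama", "trump", "clinton", "hawaii", "novel", "blue", "run", "genre"]

negatives = sample_negative_words(vocabulary, positive_subjects)
print(len(negatives) == len(positive_subjects))
print(set(negatives).isdisjoint(positive_subjects))
```

Random negatives may occasionally be valid members of the domain or range that simply do not appear in the training entries, so this scheme yields noisy labels.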
Model Generator (4/4)
• models
• domain model
• generated from Subject element feature vectors and negative word feature vectors
• uses a decision tree-based learning model
• range model
• generated from Object element feature vectors and negative word feature vectors
• uses a decision tree-based learning model
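As a rough illustration of training a domain model on positive and negative feature vectors, the sketch below fits a depth-1 decision tree (a decision stump). This is a deliberately simplified stand-in: the slides say only that a decision tree-based learner is used, not which one. The `train_stump` function and the two-dimensional toy vectors are assumptions.

```python
def train_stump(vectors, labels):
    """Fit a depth-1 decision tree: pick the (feature, threshold, polarity) with the fewest errors."""
    best = None
    for f in range(len(vectors[0])):
        for threshold in sorted({v[f] for v in vectors}):
            for polarity in (True, False):
                preds = [(v[f] >= threshold) == polarity for v in vectors]
                errors = sum(p != y for p, y in zip(preds, labels))
                if best is None or errors < best[0]:
                    best = (errors, f, threshold, polarity)
    _, f, threshold, polarity = best
    # The returned model answers: does this word vector belong to the domain?
    return lambda v: (v[f] >= threshold) == polarity


# Toy feature vectors, invented for illustration.
subject_vectors = [[0.9, 0.1], [0.8, 0.2]]   # positives: Subjects of the relation
negative_vectors = [[0.1, 0.7], [0.2, 0.9]]  # negatives: random vocabulary words

domain_model = train_stump(subject_vectors + negative_vectors,
                           [True, True, False, False])

print(domain_model([0.85, 0.15]))
print(domain_model([0.15, 0.80]))
```

The range model would be trained the same way, with Object vectors as the positives.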
Experiment
• resource
• relations – 32 frequent English relations (among the first 100)
• Cat-1 – range values are distributed over the domain, e.g., candidate
• Cat-2 – range values are concentrated over the domain, e.g., genre
• training examples – entries for the relations
• text corpus – English
• evaluation metric – accuracy
Result
• purpose – show how accurately the models detect correct (pos) and incorrect (neg) entries, and a mix of both (i.e., pos + neg)
• finding – words of the same type have similar feature vectors, so the models generalize over words
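The three accuracy figures (pos, neg, and mix) can be computed as sketched below. The prediction lists are invented purely to show the arithmetic and are not results from the experiment; the `accuracy` helper is a hypothetical name.

```python
def accuracy(predictions, gold):
    """Fraction of predictions that match the gold labels."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold labels must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty set")
    return sum(p == g for p, g in zip(predictions, gold)) / len(gold)


# Made-up model outputs: True means "entry accepted as correct".
pos_pred = [True, True, True, False]     # predictions on correct entries
neg_pred = [False, False, False, False]  # predictions on incorrect entries

pos_acc = accuracy(pos_pred, [True] * len(pos_pred))
neg_acc = accuracy(neg_pred, [False] * len(neg_pred))
mix_acc = accuracy(pos_pred + neg_pred,
                   [True] * len(pos_pred) + [False] * len(neg_pred))

print(pos_acc, neg_acc, mix_acc)
```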
Conclusion
• Observation
• a relation should hold the same type of elements as Subjects and the same type of elements as Objects
• generalizing over Subjects and Objects can automatically generate the domain and range of a relation; the experimental results support this assumption
• Future Work
• look for more sophisticated learning models than decision trees
• investigate word embeddings other than the word2vec default
