Subhabrata Mukherjee, Jitendra Ajmera and Sachindra Joshi.
Domain Cartridge: Unsupervised Framework for Shallow Domain Ontology Construction from Corpus
Proc. of the 23rd ACM International Conference on Information and Knowledge Management (CIKM). 2014.
CHEAP Call Girls in Saket (-DELHI )đ 9953056974đ(=)/CALL GIRLS SERVICE
Â
Domain Cartridge: Unsupervised Framework for Shallow Domain Ontology Construction from Corpus
1. Domain Cartridge: Unsupervised
Framework for Shallow Domain Ontology
Construction from Corpus
Subhabrata Mukherjee
Jitendra Ajmera, Sachindra Joshi
Max Planck Institute for Informatics
IBM India Research Lab
CIKM 2014
October 9, 2014
2. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Motivation: Domain Term Discovery
Usefulness for Parsing. Consider the examples:
âinstall sprint navigationâ
âproblem with touch screen of the mobileâ
ESG Parser generates noisy or incomplete parse without the
smartphone domain knowledge that âsprint navigation, touch
screenâ are multi-word concepts
3. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Motivation: Domain Term Discovery
Usefulness for Parsing. Consider the examples:
âinstall sprint navigationâ
âproblem with touch screen of the mobileâ
ESG Parser generates noisy or incomplete parse without the
smartphone domain knowledge that âsprint navigation, touch
screenâ are multi-word concepts
4. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Motivation: Domain Relation Discovery
Interactive dialogue systems
For user query âbattery of my device depletes fast", the
knowledge âbatteryâ is a Feature-Of âdeviceâ enables the system
to clarify about the type of device
Query expansion
E.g. Consider Synonyms along with the original query, âbatteryâ
is a Feature-Of âphoneâ as well as âtabletâ âdeviceâ
Query re-formulation
For user query âscreen freezes E5150", the knowledge âE5150â
is a Type-Of âErrorâ results in query re-formulation âscreen
freezes error E5150"
5. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Motivation: Domain Relation Discovery
Interactive dialogue systems
For user query âbattery of my device depletes fast", the
knowledge âbatteryâ is a Feature-Of âdeviceâ enables the system
to clarify about the type of device
Query expansion
E.g. Consider Synonyms along with the original query, âbatteryâ
is a Feature-Of âphoneâ as well as âtabletâ âdeviceâ
Query re-formulation
For user query âscreen freezes E5150", the knowledge âE5150â
is a Type-Of âErrorâ results in query re-formulation âscreen
freezes error E5150"
6. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Motivation: Domain Relation Discovery
Interactive dialogue systems
For user query âbattery of my device depletes fast", the
knowledge âbatteryâ is a Feature-Of âdeviceâ enables the system
to clarify about the type of device
Query expansion
E.g. Consider Synonyms along with the original query, âbatteryâ
is a Feature-Of âphoneâ as well as âtabletâ âdeviceâ
Query re-formulation
For user query âscreen freezes E5150", the knowledge âE5150â
is a Type-Of âErrorâ results in query re-formulation âscreen
freezes error E5150"
7. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Unsupervised Framework
Typically for a domain, a lot of knowledge articles, manuals,
tutorials etc. are available in a variety of formats
Most of these documents have less hyperlink and table
(info-box as in Wikipedia) information
Challenge is to learn a shallow ontology from raw unannotated
plain text
8. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Unsupervised Framework
Typically for a domain, a lot of knowledge articles, manuals,
tutorials etc. are available in a variety of formats
Most of these documents have less hyperlink and table
(info-box as in Wikipedia) information
Challenge is to learn a shallow ontology from raw unannotated
plain text
9. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Unsupervised Framework
Typically for a domain, a lot of knowledge articles, manuals,
tutorials etc. are available in a variety of formats
Most of these documents have less hyperlink and table
(info-box as in Wikipedia) information
Challenge is to learn a shallow ontology from raw unannotated
plain text
10. Domain Cartridge as a Graph
install insert
device
handset
android
blackberry
Operating
system
samsung
sim
card
Samsung
Galaxy
victory
Samsung
array
Domain
term
Domain
process
11. Domain Cartridge as a Graph
install insert
device
handset
android
blackberry
Operating
system
samsung
sim
card
Samsung
Galaxy
victory
Samsung
array
Domain
term
Domain
process
Synonyms
12. Domain Cartridge as a Graph
install insert
device
handset
android
blackberry
Operating
system
samsung
sim
card
Samsung
Galaxy
victory
Samsung
array
Domain
term
Domain
process
Feature-Of
13. Domain Cartridge as a Graph
install insert
device
handset
android
blackberry
Operating
system
samsung
sim
card
Samsung
Galaxy
victory
Samsung
array
Domain
term
Domain
process
Type-Of
14. Domain Cartridge as a Graph
install insert
device
handset
android
blackberry
Operating
system
samsung
sim
card
Samsung
Galaxy
victory
Samsung
array
Domain
term
Domain
process
Action-On
15. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Roadmap
Unsupervised framework for shallow domain ontology
construction:
Domain Term Discovery (DTD)
Improvement of Parser performance by DTD
Domain Relation Discovery (DRD)
Use-Case: Improvement of an in-house Question-Answering
system
Experiments: Manual Evaluation, Comparison with BabelNet,
WordNet, Yago
Conclusions
16. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Roadmap
Unsupervised framework for shallow domain ontology
construction:
Domain Term Discovery (DTD)
Improvement of Parser performance by DTD
Domain Relation Discovery (DRD)
Use-Case: Improvement of an in-house Question-Answering
system
Experiments: Manual Evaluation, Comparison with BabelNet,
WordNet, Yago
Conclusions
17. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Roadmap
Unsupervised framework for shallow domain ontology
construction:
Domain Term Discovery (DTD)
Improvement of Parser performance by DTD
Domain Relation Discovery (DRD)
Use-Case: Improvement of an in-house Question-Answering
system
Experiments: Manual Evaluation, Comparison with BabelNet,
WordNet, Yago
Conclusions
18. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Roadmap
Unsupervised framework for shallow domain ontology
construction:
Domain Term Discovery (DTD)
Improvement of Parser performance by DTD
Domain Relation Discovery (DRD)
Use-Case: Improvement of an in-house Question-Answering
system
Experiments: Manual Evaluation, Comparison with BabelNet,
WordNet, Yago
Conclusions
19. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Corpus: Knowledge articles, manuals, tutorials etc.
Domain Cartridge: Framework
20. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Parsing
Domain Cartridge: Framework
21. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Parsing
English Slot Grammar (ESG) parser is used
50 - 100 times faster than the Charniak parser
Shallow semantic relationship (SSR) annotation done over ESG
parser output to generate normalized parser relations
E.g., âSamsung has a battery" and âSamsungâs battery died" both
generate the same relation ânnMod:samsung_batteryâ
22. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Domain Cartridge: Framework
23. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Lucene Index
For efďŹcient retrieval of relations, documents, positional (offset)
information, handling proximity based queries or answer scorers
E.g: (Doc. Freq. of) All relations containing âinstallâ or
ârel:svo:quickbook_update_ďŹleâ
TokenStream modiďŹed to produce contiguous representation of
all types of concepts (ngrams, relations, words etc.)
24. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Domain Cartridge: Framework
25. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Domain Term Discovery
ESG parser maintains a domain term lexicon of multi-word
concepts. E.g. âtouch screen, sprint navigationâ
Noun Phrase Chunking done on document titles to extract
frequently occuring concepts as domain words
Enrich lexicon and bootstrap parser
Parser generates reďŹned output
High precision but low recall â as titles are precise, clean but
short
To extract more ďŹne-grained domain terms HITS is used on
parser output
26. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Domain Term Discovery
ESG parser maintains a domain term lexicon of multi-word
concepts. E.g. âtouch screen, sprint navigationâ
Noun Phrase Chunking done on document titles to extract
frequently occuring concepts as domain words
Enrich lexicon and bootstrap parser
Parser generates reďŹned output
High precision but low recall â as titles are precise, clean but
short
To extract more ďŹne-grained domain terms HITS is used on
parser output
27. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Domain Term Discovery
ESG parser maintains a domain term lexicon of multi-word
concepts. E.g. âtouch screen, sprint navigationâ
Noun Phrase Chunking done on document titles to extract
frequently occuring concepts as domain words
Enrich lexicon and bootstrap parser
Parser generates reďŹned output
High precision but low recall â as titles are precise, clean but
short
To extract more ďŹne-grained domain terms HITS is used on
parser output
28. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
HITS
Any Shallow Semantic Relation (SSR) from ESG parser can be
visualized as a hub generating domain terms
Any domain term can be visualized as an authority inďŹuenced
by incoming features from hubs
Link strength between the hub and authority is taken as the
number of mutual participating SSR in the Lucene Index
Good authorities are incorporated in Parser Domain Term
Lexicon
Parser is re-run, reďŹned relations generated and previous steps
iterated till convergence
29. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
HITS
Any Shallow Semantic Relation (SSR) from ESG parser can be
visualized as a hub generating domain terms
Any domain term can be visualized as an authority inďŹuenced
by incoming features from hubs
Link strength between the hub and authority is taken as the
number of mutual participating SSR in the Lucene Index
Good authorities are incorporated in Parser Domain Term
Lexicon
Parser is re-run, reďŹned relations generated and previous steps
iterated till convergence
30. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
HITS
Any Shallow Semantic Relation (SSR) from ESG parser can be
visualized as a hub generating domain terms
Any domain term can be visualized as an authority inďŹuenced
by incoming features from hubs
Link strength between the hub and authority is taken as the
number of mutual participating SSR in the Lucene Index
Good authorities are incorporated in Parser Domain Term
Lexicon
Parser is re-run, reďŹned relations generated and previous steps
iterated till convergence
31. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
HITS
Any Shallow Semantic Relation (SSR) from ESG parser can be
visualized as a hub generating domain terms
Any domain term can be visualized as an authority inďŹuenced
by incoming features from hubs
Link strength between the hub and authority is taken as the
number of mutual participating SSR in the Lucene Index
Good authorities are incorporated in Parser Domain Term
Lexicon
Parser is re-run, reďŹned relations generated and previous steps
iterated till convergence
32. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
HITS
Any Shallow Semantic Relation (SSR) from ESG parser can be
visualized as a hub generating domain terms
Any domain term can be visualized as an authority inďŹuenced
by incoming features from hubs
Link strength between the hub and authority is taken as the
number of mutual participating SSR in the Lucene Index
Good authorities are incorporated in Parser Domain Term
Lexicon
Parser is re-run, reďŹned relations generated and previous steps
iterated till convergence
33. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
34. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Feedback
Domain Cartridge: Framework
35. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Parser Performance Improvement
Number of incomplete parses went down by 73% after
incorporating domain terms in the parser lexicon
36. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Domain Terms
samsung blackberry software-version application htc-evo wi-
ďŹ memory-card bluetooth motorola kyocera browser voicemail
microsoft-exchange lg-optimus samsung-m400 samsung-galaxy-
victory software-updates samsung-array text-messaging wallpa-
per synchronization iphone touch-screen blackberry-bold
Table: Snapshot of multi-word domain terms extracted by NP Chunking.
wi-ďŹ optimus-g set-up novatel-wireless e-mail sierra-wireless
apple-id google-maps play-music mobile-network 10-digit
internet-explorer slacker-radio caller-id google-search address-
book my-computer software-update blackberry-id as-well-as
windows-update terms-of-service drop-down pro-700 add-on
scp-2700 mac-os device-manager voice-mail non-camera
Table: Snapshot of multi-word domain terms extracted by HITS (not
found by NP Chunking).
37. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Domain Cartridge: Framework
38. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Random Index (RI)
Random Indexing is used for computing word-relatedness and
dimensionality reduction
RI considers âterm X termâ co-occurrence, as opposed to
âterm X documentâ matrix â allowing for incremental learning of
context information, scaling up with the corpus size
We adopt the notion of Relational Distributional Similarity. Two
terms are considered similar if they appear in a similar context
participating in similar Shallow Semantic Relations
39. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Random Index (RI)
Random Indexing is used for computing word-relatedness and
dimensionality reduction
RI considers âterm X termâ co-occurrence, as opposed to
âterm X documentâ matrix â allowing for incremental learning of
context information, scaling up with the corpus size
We adopt the notion of Relational Distributional Similarity. Two
terms are considered similar if they appear in a similar context
participating in similar Shallow Semantic Relations
40. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Random Index
Instead of taking the raw neighborhood of N terms, we use only
those neighboring terms that share a syntactic dependency
(given by ESG parser) with the target term
Lucene Index is queried for term and relation co-occurrence
statistics while constructing high dimensional random index
vector for each term
41. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Domain Cartridge: Framework
42. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Synonym Discovery
Random Index is used to get top N similar terms for a given term
based on Relational Distributional Similarity
HITS is used to get the dominant domain terms and domain
(SSR) relations
Sim(wi, wj) =
p Ili =lj ,ki =kj
(fwki
,p, fwkj
,p )
p r Ili =lr ,ki =kr (fwki
,p, fwkr ,p )
Numerator counts the number of times any word appears in both
feature vectors with similar dominant SSR relations
Denominator counts the number of times it appears in the feature
vector of any other word with that SSR relation
43. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Synonym Discovery
Random Index is used to get top N similar terms for a given term
based on Relational Distributional Similarity
HITS is used to get the dominant domain terms and domain
(SSR) relations
Sim(wi, wj) =
p Ili =lj ,ki =kj
(fwki
,p, fwkj
,p )
p r Ili =lr ,ki =kr (fwki
,p, fwkr ,p )
Numerator counts the number of times any word appears in both
feature vectors with similar dominant SSR relations
Denominator counts the number of times it appears in the feature
vector of any other word with that SSR relation
44. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Synonym Discovery
High recall but low precision. E.g. âfolderâ and âtabâ have a high
relational distributional similarity due to overlapping context
words like âopen, close, minimize, shortcut, multiple"
ESG parser attributes like noun, animated, physical object,
device are used to increase precision
45. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Domain Cartridge: Framework
46. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Action-On Discovery
Action-On represents any activity or âmethodâ on a domain term
E.g. ârel:svo:tap_add_account, rel:svo:phone_access_internet,
rel:svo:mobile_sync_phone, rel:svo:account_use_phone etc."
HITS index is traversed to extract all the dominant dm (action on
entities), and verb-object from svo (subject-verb-object) SSR
47. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Action-On Discovery
Action-On represents any activity or âmethodâ on a domain term
E.g. ârel:svo:tap_add_account, rel:svo:phone_access_internet,
rel:svo:mobile_sync_phone, rel:svo:account_use_phone etc."
HITS index is traversed to extract all the dominant dm (action on
entities), and verb-object from svo (subject-verb-object) SSR
48. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Type-Of Discovery
Type-Of relations depict Is-A hierarchy i.e. a parent-child relation
E.g. ârel:svo:devices_include_HTC, rel:npo:applications_
such-as_WhatsApp, rel:npo:features_like_call,
rel:npo:contact_such-as_address".
svo like âincludeâ and npo âlike, such-as, asâ etc. are most
informative
Dominant domain terms from HITS, bearing Hearst patterns like
âorâ, âespeciallyâ
E.g. service_or_process, prev_or_next,
messages_especially_sms_and_ mms
49. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Type-Of Discovery
Type-Of relations depict Is-A hierarchy i.e. a parent-child relation
E.g. ârel:svo:devices_include_HTC, rel:npo:applications_
such-as_WhatsApp, rel:npo:features_like_call,
rel:npo:contact_such-as_address".
svo like âincludeâ and npo âlike, such-as, asâ etc. are most
informative
Dominant domain terms from HITS, bearing Hearst patterns like
âorâ, âespeciallyâ
E.g. service_or_process, prev_or_next,
messages_especially_sms_and_ mms
50. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Feature-Of Discovery
Feature-Of relations depict components or functionalities of a
domain term
nnMod depicting noun-noun modiďŹcations, and subject-object
(so) extracted from svo SSR used to discover Feature-Of from
HITS index. E.g.
ârel:nnMod:network_life, rel:nnMod:account_settings,
rel:nnMod:iPhone_battery etc."
ârel:svo:motorola-photon-4g_run_device-
software, rel:svo:router_decrease_signal-strength,
rel:svo:ďŹnd-a-store_open_locator etc."
51. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Feature-Of Discovery
Feature-Of relations depict components or functionalities of a
domain term
nnMod depicting noun-noun modiďŹcations, and subject-object
(so) extracted from svo SSR used to discover Feature-Of from
HITS index. E.g.
ârel:nnMod:network_life, rel:nnMod:account_settings,
rel:nnMod:iPhone_battery etc."
ârel:svo:motorola-photon-4g_run_device-
software, rel:svo:router_decrease_signal-strength,
rel:svo:ďŹnd-a-store_open_locator etc."
52. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Smartphone Ontology Snapshot
53. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Domain Term Evaluation
5000 articles, tutorials and manuals from the smartphone domain
We used the Back-of-the-Book Index (BOI) of manuals, to create
ground truth for domain term discovery
Baselines:
WordNet (G. A. Miller. Wordnet: A lexical database for english. COMMUNICATIONS OF THE ACM, 38,
1995.)
BabelNet (R. Navigli and S. P. Ponzetto. BabelNet: Building a very large multilingual semantic network. ACL
â10.)
Yago (F. M. Suchanek, G. Kasneci, and G. Weikum. Yago: a core of semantic knowledge. WWW â07.)
54. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Domain Term Evaluation
Method Recall
WordNet 22.62%
NP Chunking on Titles 32.45%
HITS 40.87%
Yago 43.77%
BabelNet 53.74%
Table: Domain term evaluation.
55. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Recall of a Question-Answering System
Recall@N With Domain
Term Lexicon
Without domain
term lexicon
recall@1 0.40 0.33
recall@2 0.49 0.45
Table: Performance of a QA system with and without domain term
lexicon.
Incorporation of domain terms in parser lexicon improves QA
system performance
1
D. Gondek et al. A framework for merging and ranking of answers in
DeepQA. IBM Journal of Research and Development, 56(3), 2012.
56. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Domain Relation Evaluation
2000 word pairs (500 for each of four categories) are manually
annotated by two annotators
High Kappa score of 0.92 for Synonyms and 0.70 for Type-Of due
to the large number of word pairs, 84% and 62%, the annotators
agree are not Synonyms and Type-Of respectively
57. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Comparison
Relations in BabelNet come either from (unlabeled) Wikipedia
hyperlinks or WordNet
Yago uses features like âwikipedia:redirectâ links to detect
synonymous contexts
WordNet âHyponyms, âHypernymsâ are similar to Type-Of, and
âMeronyms, Holonymsâ are subset of Feature-Of
BabelNet, WordNet, Yago do not have any category like
Action-On
Yago has Type-Of categories derived from WordNet hyponymy
and hypernymy, and Wikipedia categories
System Type-Of Feature-Of Action-On
BabelNet, WordNet 19.27% - -
Yago 25.12% - -
Domain Cartridge 77% 85.7% 68%
Table: Recall comparison of systems for 3 relations.
58. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Comparison
Relations in BabelNet come either from (unlabeled) Wikipedia
hyperlinks or WordNet
Yago uses features like âwikipedia:redirectâ links to detect
synonymous contexts
WordNet âHyponyms, âHypernymsâ are similar to Type-Of, and
âMeronyms, Holonymsâ are subset of Feature-Of
BabelNet, WordNet, Yago do not have any category like
Action-On
Yago has Type-Of categories derived from WordNet hyponymy
and hypernymy, and Wikipedia categories
System Type-Of Feature-Of Action-On
BabelNet, WordNet 19.27% - -
Yago 25.12% - -
Domain Cartridge 77% 85.7% 68%
Table: Recall comparison of systems for 3 relations.
59. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Comparison
Relations in BabelNet come either from (unlabeled) Wikipedia
hyperlinks or WordNet
Yago uses features like âwikipedia:redirectâ links to detect
synonymous contexts
WordNet âHyponyms, âHypernymsâ are similar to Type-Of, and
âMeronyms, Holonymsâ are subset of Feature-Of
BabelNet, WordNet, Yago do not have any category like
Action-On
Yago has Type-Of categories derived from WordNet hyponymy
and hypernymy, and Wikipedia categories
System Type-Of Feature-Of Action-On
BabelNet, WordNet 19.27% - -
Yago 25.12% - -
Domain Cartridge 77% 85.7% 68%
Table: Recall comparison of systems for 3 relations.
60. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Comparison
Relations in BabelNet come either from (unlabeled) Wikipedia
hyperlinks or WordNet
Yago uses features like âwikipedia:redirectâ links to detect
synonymous contexts
WordNet âHyponyms, âHypernymsâ are similar to Type-Of, and
âMeronyms, Holonymsâ are subset of Feature-Of
BabelNet, WordNet, Yago do not have any category like
Action-On
Yago has Type-Of categories derived from WordNet hyponymy
and hypernymy, and Wikipedia categories
System Type-Of Feature-Of Action-On
BabelNet, WordNet 19.27% - -
Yago 25.12% - -
Domain Cartridge 77% 85.7% 68%
Table: Recall comparison of systems for 3 relations.
61. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Comparison
Relations in BabelNet come either from (unlabeled) Wikipedia
hyperlinks or WordNet
Yago uses features like âwikipedia:redirectâ links to detect
synonymous contexts
WordNet âHyponyms, âHypernymsâ are similar to Type-Of, and
âMeronyms, Holonymsâ are subset of Feature-Of
BabelNet, WordNet, Yago do not have any category like
Action-On
Yago has Type-Of categories derived from WordNet hyponymy
and hypernymy, and Wikipedia categories
System Type-Of Feature-Of Action-On
BabelNet, WordNet 19.27% - -
Yago 25.12% - -
Domain Cartridge 77% 85.7% 68%
Table: Recall comparison of systems for 3 relations.
62. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Distributional Similarity Comparison
System Precision Recall F-Score
Yago 38% 32% 34.37%
BabelNet, WordNet 83% 31% 45.14%
Domain Cartridge (DC) 58% 41% 47.60%
DC + WordNet 62% 40% 49.00%
DC + ESG Parser Features 65% 39% 49.14%
Table: Precision-Recall comparison of Domain Cartridge
(random-indexing, HITS and sim. eqn.) with other systems.
63. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Comparison with Distributional Similarity
Measures in WordNet
WordNet F-Score
LCH 0.22
RES 0.31
JCN 0.42
PATH 0.42
LIN 0.43
WUP 0.43
LESK 0.45
Domain Cartridge 0.49
Table: F-Score comparison of WordNet similarity measures with
Domain Cartridge.
64. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Conclusions
Unsupervised framework for shallow domain ontology
construction, without using manually annotated resources
Multi-words form an important component of Domain Term
Discovery
Incorporation of domain terms in parser lexicon results in 73%
reduction in incomplete parses, improving performance of an
in-house QA system by upto 7%
Synonym discovery approach, using Relational Distributional
Similarity, RI, HITS etc., performs better than other existing
approaches
Future Work: Measure customization time required for domain
adaptation of the QA system
65. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Conclusions
Unsupervised framework for shallow domain ontology
construction, without using manually annotated resources
Multi-words form an important component of Domain Term
Discovery
Incorporation of domain terms in parser lexicon results in 73%
reduction in incomplete parses, improving performance of an
in-house QA system by upto 7%
Synonym discovery approach, using Relational Distributional
Similarity, RI, HITS etc., performs better than other existing
approaches
Future Work: Measure customization time required for domain
adaptation of the QA system
66. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Conclusions
Unsupervised framework for shallow domain ontology
construction, without using manually annotated resources
Multi-words form an important component of Domain Term
Discovery
Incorporation of domain terms in parser lexicon results in 73%
reduction in incomplete parses, improving performance of an
in-house QA system by upto 7%
Synonym discovery approach, using Relational Distributional
Similarity, RI, HITS etc., performs better than other existing
approaches
Future Work: Measure customization time required for domain
adaptation of the QA system
67. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Conclusions
Unsupervised framework for shallow domain ontology
construction, without using manually annotated resources
Multi-words form an important component of Domain Term
Discovery
Incorporation of domain terms in parser lexicon results in 73%
reduction in incomplete parses, improving performance of an
in-house QA system by upto 7%
Synonym discovery approach, using Relational Distributional
Similarity, RI, HITS etc., performs better than other existing
approaches
Future Work: Measure customization time required for domain
adaptation of the QA system
68. Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Conclusions
Unsupervised framework for shallow domain ontology
construction, without using manually annotated resources
Multi-words form an important component of Domain Term
Discovery
Incorporation of domain terms in parser lexicon results in 73%
reduction in incomplete parses, improving performance of an
in-house QA system by upto 7%
Synonym discovery approach, using Relational Distributional
Similarity, RI, HITS etc., performs better than other existing
approaches
Future Work: Measure customization time required for domain
adaptation of the QA system