STL: A Similarity Measure Based on Semantic, Terminological and Linguistic Information
Nitish Aggarwal, joint work with Tobias Wunner and Mihael Arcan
DERI, NUI Galway
firstname.lastname@deri.org
Friday, 19th Aug, 2011
DERI, Friday Meeting

Overview
  • Motivation & Applications
  • Why STL?
  • Semantic
  • Terminology
  • Linguistic
  • Evaluation
  • Conclusion and future work

Motivation & Applications: Semantic Annotation
  • Similarity between corpus data and ontology concepts
  • Example text: “SAP AG held €1615 million in short-term liquid assets (2009)”
  • Annotation: “dbpedia:SAP_AG” “xEBR:LiquidAssets” at “dbpedia:year:2009”

Motivation & Applications: Semantic Search
  • Similarity between query and index object
  • Example queries: “SAP liquid asset in 2010”, “Current asset of SAP last year”, “Net cash of SAP in 2010”, “SAP total amount received in 2010”
  • Index object: “dbpedia:SAP_AG” “xEBR:liquid asset” at “dbpedia:year:2010”

Motivation & Applications: Ontology Matching & Alignment
  • Similarity between ontology concepts
  • IFRS concepts shown in the figure: ifrs:StatementOfFinancialPosition, Ifrs:Assets, ifrs:BiologicalAssets, Ifrs:CurrentAssets, Ifrs:NonCurrentAssets, ifrs:PropertyPlantAndEquipment, ifrs:CashAndCashEquivalents, Ifrs:TradeAndOtherCurrentReceivables, Ifrs:Inventories
  • xEBR concepts shown in the figure: xebr:KeyBalanceSheetAssets, xebr:SubscribedCapitalUnpaid, xebr:FixedAssets, xebr:CurrentAssets, xebr:TangibleFixedAssets, xebr:IntangibleFixedAssets, xebr:Amount Receivable, xebr:LiquidAssets
  • Question posed for pairs of concepts across the two hierarchies: Similarity = ?

Classical Approaches
  • String similarity: Levenshtein distance, Dice coefficient
  • Corpus-based: LSA, ESA, Google distance, Vector-Space Model
  • Ontology-based: path distance, information content
  • Syntax similarity: word order, part of speech

Why STL?
  • Semantic: semantic structure and relations
  • Terminology: complex terms expressing the same concept
  • Linguistic: phrase and dependency structure

STL
  • Definition: linear combination of semantic, terminological and linguistic similarity, with weights obtained by linear regression
  • Formula used: STL = w1*S + w2*T + w3*L + Constant
  • w1, w2, w3 represent the contribution of each component

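The slide says only that the weights come from a linear regression; it does not say how the regression was run. As a minimal sketch, assuming ordinary least squares over (S, T, L) scores and human similarity judgements, the fit could look like the following. The function names and the pure-Python solver are illustrative choices, not the authors' implementation.

```python
from typing import List, Sequence, Tuple

Weights = Tuple[float, float, float, float]  # (w1, w2, w3, constant)


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations update both sides together.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_stl_weights(
    features: Sequence[Tuple[float, float, float]],
    targets: Sequence[float],
) -> Weights:
    """Least-squares fit of w1, w2, w3 and the constant (assumed OLS)."""
    rows = [[s, t, l, 1.0] for s, t, l in features]  # last column = constant
    k = 4
    # Normal equations: (X^T X) w = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    w1, w2, w3, const = solve_linear_system(xtx, xty)
    return w1, w2, w3, const


def stl(s: float, t: float, l: float, weights: Weights) -> float:
    """STL = w1*S + w2*T + w3*L + Constant."""
    w1, w2, w3, const = weights
    return w1 * s + w2 * t + w3 * l + const
```

`fit_stl_weights` needs at least four term pairs whose (S, T, L) scores are not linearly dependent; with fewer, the normal equations are singular and the solver divides by zero.
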
Semantic
  • Wu-Palmer: 2*depth(MSCA) / (depth(c1) + depth(c2)), where MSCA is the most specific common abstraction of c1 and c2
  • Resnik’s Information Content: IC(c) = -log p(c)
  • Intrinsic Information Content (Pirro09): avoids the need to analyse large corpora

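The two formulas above can be sketched on a toy hierarchy. The parent links below are an assumed fragment loosely based on the asset concepts named in these slides, and counting the root at depth 1 is also an assumption; neither is taken from the xEBR taxonomy itself.

```python
import math
from typing import Dict, List, Optional

# Hypothetical fragment of an asset hierarchy (child -> parent).
PARENT: Dict[str, Optional[str]] = {
    "Assets": None,
    "Fixed Assets": "Assets",
    "Current Assets": "Assets",
    "Tangible Fixed Assets": "Fixed Assets",
    "Amount Receivable": "Current Assets",
    "Stocks": "Current Assets",
}


def path_to_root(concept: str) -> List[str]:
    """Return the concept followed by its ancestors up to the root."""
    path: List[str] = []
    node: Optional[str] = concept
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def depth(concept: str) -> int:
    """Depth with the root counted as 1 (assumed convention)."""
    return len(path_to_root(concept))


def msca(c1: str, c2: str) -> str:
    """Most specific common abstraction: the deepest shared ancestor."""
    ancestors = set(path_to_root(c1))
    for node in path_to_root(c2):  # walks upward, so the first hit is deepest
        if node in ancestors:
            return node
    raise ValueError("concepts do not share a root")


def wu_palmer(c1: str, c2: str) -> float:
    """2*depth(MSCA) / (depth(c1) + depth(c2))."""
    return 2 * depth(msca(c1, c2)) / (depth(c1) + depth(c2))


def resnik_ic(probability: float) -> float:
    """IC(c) = -log p(c), with p(c) estimated elsewhere, e.g. from a corpus."""
    return -math.log(probability)
```

In this toy tree, "Stocks" and "Amount Receivable" share "Current Assets" and score higher than "Tangible Fixed Assets" and "Amount Receivable", which meet only at the root.
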
Cont.
  • Intrinsic information content (iIC) [formula not preserved in this transcript], where sub(c) is the number of sub-concepts of a given concept c
  • Pirro_Similarity

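The iIC formula itself is missing from this transcript. One widely used definition of intrinsic information content, due to Seco et al. and computed from the taxonomy alone, is iIC(c) = 1 - log(sub(c) + 1) / log(N), where N is the total number of concepts. The sketch below assumes that definition; it is not confirmed to be the exact formula shown on the slide, and `total_concepts` is a name introduced here for illustration.

```python
import math


def intrinsic_ic(num_subconcepts: int, total_concepts: int) -> float:
    """Assumed Seco-style iIC: 1 - log(sub(c) + 1) / log(N).

    Leaves (no sub-concepts) get the maximum value 1.0; a root that
    subsumes every other concept gets 0.0.
    """
    if total_concepts < 2:
        raise ValueError("need at least two concepts")
    if not 0 <= num_subconcepts < total_concepts:
        raise ValueError("sub(c) must lie in [0, total_concepts)")
    return 1.0 - math.log(num_subconcepts + 1) / math.log(total_concepts)
```

Because the value depends on the concept count N and on how sub-concepts are counted, this sketch need not reproduce the IC figures quoted in the slides.
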
Cont. (worked example on the xEBR asset hierarchy; figure)
  • MSCA: Assets; subconcepts = 48, IC (TFA) = 0.32
  • Concepts below Assets: Subscribed Capital Unpaid, Fixed Assets, Current Assets, Stocks, Tangible Fixed Assets, Amount Receivable
  • Amount Receivable: subconcepts = 6, IC (AR) = 0.69
  • Tangible Fixed Assets: subconcepts = 9, IC (TFA) = 0.60
  • Pirro_Sim = 0.33; Pirro_Sim = ?
  • Labels under Amount Receivable: Amount Receivable [total], Amount Receivable within one year, Amount Receivable after more than one year, Trade Debtors, Other Debtors
  • Labels under Tangible Fixed Assets: Other Tangible Fixed Assets, Property, Plant and Equipment, Payments on account and asset in construction, Furniture Fixture and Equipment, Other Fixture, Land and Building, Plant and Machinery, Other Property, Plant and Equipment, Property, Plant and Equipment [Total]

Limitation
  • Does semantic structure reflect a good similarity? Not necessarily.
  • E.g. in xEBR, the parent-child relation is used to describe the layout of concepts
  • “Work in progress” is not a type of asset, although both are linked via the parent-child relationship

Terminology
  • Definition: common naming convention
  • N-gram vs. subterms: in the financial domain, the bigram “Intangible Fixed” is a substring of “Other Intangible Fixed Assets” but not a subterm
  • Terminological similarity: maximal subterm overlap

Cont.
  • Term: Trade Debts Payable After More Than One Year
  • Subterm decomposition: [[Trade][Debts]][Payable][After More Than One Year]
  • Subterm sources: [SAP:Payable], [Ifrs:After More Than One Year], [Investoword:Debt], [FinanceDict:Trade Debts], [Investopedia:Trade]
  • Compared term: Financial Debts Payable After More Than One Year
  • Subterm decomposition: Financial[Debts][Payable][After More Than One Year]

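The idea of subterm overlap can be sketched as follows. The lexicon is a hypothetical stand-in for the term sources named above, and scoring the overlap with a Dice coefficient over the sets of recognised subterms is an assumption; the slides say only "maximal subterm overlap".

```python
from typing import Set, Tuple

# Hypothetical subterm lexicon, standing in for entries collected from
# sources such as financial dictionaries and taxonomies.
LEXICON: Set[str] = {
    "trade debts", "trade", "debts", "payable",
    "after more than one year", "financial",
    "fixed assets", "intangible fixed assets",
}


def tokens(term: str) -> Tuple[str, ...]:
    """Lower-case, whitespace-separated tokens of a term."""
    return tuple(term.lower().split())


def subterms(term: str, lexicon: Set[str]) -> Set[str]:
    """All contiguous token spans of the term that are known subterms.

    An arbitrary n-gram such as "intangible fixed" is not returned unless
    the lexicon lists it, which is the n-gram vs. subterm distinction.
    """
    toks = tokens(term)
    found: Set[str] = set()
    for start in range(len(toks)):
        for end in range(start + 1, len(toks) + 1):
            candidate = " ".join(toks[start:end])
            if candidate in lexicon:
                found.add(candidate)
    return found


def subterm_overlap(a: str, b: str, lexicon: Set[str]) -> float:
    """Dice coefficient over recognised subterms (assumed scoring)."""
    sa, sb = subterms(a, lexicon), subterms(b, lexicon)
    if not sa and not sb:
        return 0.0
    return 2 * len(sa & sb) / (len(sa) + len(sb))
```

With this lexicon, the two terms above share "debts", "payable" and "after more than one year", while "trade debts", "trade" and "financial" are each found in only one of them.
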
Multilingual Subterms
  • Translated subterms: available in other languages
  • Advantage: reflect terminological similarities that may be available in one language but not in others
  • Example: “Property Plant and Equipment”@en, “Sachanlagen”@de, “Tangible Fixed Asset”@en

Linguistic
  • Syntactic information beyond simple word order: phrase structure, dependency structure
  • Phrase structure:
      Intangible fixed : adj adj > ??
      Intangible fixed assets : adj adj n > NP
  • Dependency structure:
      Amounts receivable : N Adv : receive:mod, amounts:head
      Received amounts : V N : receive:mod, amounts:head

Evaluation
  • Data set: xEBR finance vocabulary; 269 terms (concept labels); 72,361 (269*269) term pairs
  • Benchmarks: SimSem59, a sample of 59 term pairs; SimSem200, a sample of 200 term pairs (under construction)

Experiment
  • An overview of similarity measures

Experiment Results (SimSem59)
  • STL formula used: STL = 0.1531 * S + 0.5218 * T + 0.1041 * L + 0.1791
  • Correlation between similarity scores & SimSem59 (chart): semantic contribution, terminology contribution, linguistic contribution

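To illustrate how the reported formula would be applied and compared against a benchmark, the sketch below plugs in the coefficients from the slide and computes a correlation between two score lists. The slide does not say which correlation coefficient was used; Pearson's r is assumed here, and the function names are illustrative.

```python
import math
from typing import Sequence

# Coefficients as reported on the slide for SimSem59.
W_S, W_T, W_L, CONST = 0.1531, 0.5218, 0.1041, 0.1791


def stl_score(s: float, t: float, l: float) -> float:
    """STL = 0.1531*S + 0.5218*T + 0.1041*L + 0.1791."""
    return W_S * s + W_T * t + W_L * l + CONST


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation (assumed; the slide only says 'correlation')."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences of length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

A typical use would score every benchmark pair with `stl_score` and pass the resulting list, together with the human judgements, to `pearson`. With these weights the terminological score T moves the result most, which matches the conclusion drawn in the slides.
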
Conclusion
  • STL outperforms more traditional similarity measures
  • Largest contribution by T (Terminological Analysis)
  • Multilingual subterms perform better than monolingual

Future work
  • Evaluation on larger data sets and vocabularies (IFRS): 3000+ terms, 9M term pairs
  • Richer set of linguistic operations: “recognise” => “recognition” by derivation rule verb_lemma+"ion"
  • Similarity between subterms: “Staff Costs” and “Wages And Salaries”
