What do cats have to do with explicit semantics?

  1. www.isocat.org What do cats have to do with explicit semantics? Menzo Windhouwer (MPI for Psycholinguistics, menzo.windhouwer@mpi.nl) and Ineke Schuurman (KU Leuven & Utrecht University, ineke@ccl.kuleuven.be)
  2. TTNWW and ISOcat • TTNWW: TST Tools voor het Nederlands als Web services in een Workflow (language and speech technology tools for Dutch as web services in a workflow) • A CLARIN-NL and CLARIN-VL pilot project • Goal: to let researchers in the humanities use our tools and resources easily, even when a whole chain of tools and resources is involved. 20 January 2012 CLIN22 - TTNWW Project
  3. TTNWW and ISOcat • Issues when using such a ‘chain’: – Is the meaning of notion X in resource/tool A the same as in resource/tool B? – Is the meaning of notion X in resource/tool A the same as that of notion Y in resource/tool B? – If they are not the same, are they related? If so, how? => ISOcat and friends to the rescue!
  4. Explicit semantics • Language resources are valuable assets – store them in an archive to ensure persistence! – later generations can research material that can only be collected now • Problem: the terminology used might ‘rot’ – terms acquire a (slightly) different meaning over (long) periods of time – later generations need to know what the terms mean today • Solution: make semantics explicit
  5. The ISOcat Data Category Registry http://www.isocat.org/ • An ISOcat data category is “an elementary descriptor in a linguistic structure or an annotation scheme” (ISO 12620:2009) • ISOcat data categories have unique and persistent identifiers, which can be resolved over the web, e.g. http://www.isocat.org/datcat/DC-78
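The identifier scheme on this slide can be illustrated with a minimal Python sketch. It assumes that every data category identifier has the same shape as the single example shown (`http://www.isocat.org/datcat/DC-78`); the helper names `datcat_pid` and `datcat_number` are hypothetical and not part of any ISOcat API. The sketch only builds and parses identifier strings and does not contact the registry.

```python
import re

# Assumed PID shape, generalized from the one example on the slide:
# http://www.isocat.org/datcat/DC-78
PID_PATTERN = re.compile(r"^http://www\.isocat\.org/datcat/DC-(\d+)$")


def datcat_pid(number: int) -> str:
    """Build the identifier string for a data category number (hypothetical helper)."""
    if number <= 0:
        raise ValueError("data category numbers are assumed to be positive integers")
    return f"http://www.isocat.org/datcat/DC-{number}"


def datcat_number(pid: str) -> int:
    """Extract the data category number from an identifier, or raise ValueError."""
    match = PID_PATTERN.match(pid.strip())
    if match is None:
        raise ValueError(f"not in the assumed ISOcat identifier form: {pid!r}")
    return int(match.group(1))
```

Because two resources that cite the same identifier refer to the same data category, comparing these strings (or the extracted numbers) is enough to detect shared semantics.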
  6. Annotate all elements in a linguistic resource [Figure: a linguistic resource annotated with the data categories /lexicon/, /language/, /alphabet/, /entry/, /japanese/, /ipa/, /lemma/, /writtenForm/]
  7. Sharing structure • By using ISOcat data category references, specifications of elementary descriptors can be shared between structures • How can the (annotated) structures themselves be shared? • A companion registry for ISOcat is under development: SCHEMAcat • This registry should persistently store any kind of schema, e.g., XML schemata or EBNF grammars
  8. Annotated CGN/DCOI grammar
       tag = pos ( feat* )
       # @dcr:datcat ‘WW’ http://www.isocat.org/datacat/DC-1424
       # @dcr:datcat ‘TW’ http://www.isocat.org/datacat/DC-1334
       # @dcr:datcat ‘VG’ http://www.isocat.org/datacat/DC-1226
       # @dcr:datcat ‘TSW’ http://www.isocat.org/datacat/DC-2717
       pos = N | ADJ | WW | TW | VNW | LID | VZ | VG | BW | TSW
       feat = NTYPE | GETAL | GRAAD | GENUS | NAAMVAL | POSITIE | BUIGING | GETAL-N | WVORM | PVTIJD | PVAGR | NUMTYPE | VWTYPE | PDTYPE | PERSOON | STATUS | NPAGR | LWTYPE | VZTYPE | CONJTYPE | SPECTYPE
       NTYPE = soortnaam | eigennaam
       GETAL = enkelvoud | meervoud | getal
       GRAAD = basis | comparatief | superlatief | diminutief
       GENUS = genus | zijdig | masculien | feminien | onzijdig
       NAAMVAL = standaard | nominatief | oblique | bijzonder | genitief | datief
       POSITIE = prenominaal | nominaal | postnominaal | vrij
       BUIGING = zonder | met-e | met-s
       GETAL-N = zonder-n | meervoud-n
       WVORM = persoonsvorm | buigbaar | infinitief | onvdw | voltdw
       # @dcr:datcat PVTIJD http://www.isocat.org/datacat/DC-1286
       # @dcr:datcat ‘verleden’ http://www.isocat.org/datacat/DC-1347
       # @dcr:datcat ‘conjunctief’ http://www.isocat.org/datacat/DC-1843
       PVTIJD = tegenwoordig | verleden | conjunctief
       PVAGR = enkelvoud | meervoud | met-t
       NUMTYPE = hoofdtelwoord | rangtelwoord
       VWTYPE = pr | persoonlijk | reflexief | reciprook | bezittelijk | vb | vragend | betrekkelijk | exclamatief | aanwijzend | onbepaald
       PDTYPE = pronomen | adv-pronomen | determiner | gradeerbaar
       PERSOON = persoon | 1 | 2 | 2v | 2b | 3 | 3p | 3 | 3v | 3o
       STATUS = vol | gereduceerd | nadruk
       NPAGR = agr | evon | rest | evz | mv | agr3 | evmo | rest3 | evf | mv
       LWTYPE = bepaald | onbepaald
       VZTYPE = initieel | versmolten | finaal
       CONJTYPE = nevenschikkend | onderschikkend
       SPECTYPE = afgebroken | onverstaanbaar | vreemd | deeleigen | meta | commentaar | achtergrond | afkorting | symbool | dialect
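A rough Python sketch of how a tool might read the `@dcr:datcat` annotations in such a grammar. It assumes one annotation per comment line in the form `# @dcr:datcat <symbol> <URL>`, with the symbol optionally quoted, as the slide suggests. The function name `datcat_annotations` and the `SAMPLE` excerpt are illustrative only; this is not the SCHEMAcat implementation.

```python
import re

# One annotation per comment line: "# @dcr:datcat <symbol> <URL>".
# The symbol may be wrapped in straight or curly quotes, as on the slide.
ANNOTATION = re.compile(
    r"^#\s*@dcr:datcat\s+[‘'’]?(?P<symbol>[^\s‘'’]+)[‘'’]?\s+(?P<url>\S+)\s*$"
)


def datcat_annotations(grammar: str) -> dict[str, str]:
    """Map each annotated grammar symbol to its data category URL."""
    mapping: dict[str, str] = {}
    for line in grammar.splitlines():
        match = ANNOTATION.match(line.strip())
        if match:
            mapping[match.group("symbol")] = match.group("url")
    return mapping


# Excerpt of the annotated grammar from the slide (URLs copied as shown there).
SAMPLE = """\
tag = pos ( feat* )
# @dcr:datcat ‘WW’ http://www.isocat.org/datacat/DC-1424
# @dcr:datcat ‘TW’ http://www.isocat.org/datacat/DC-1334
pos = N | ADJ | WW | TW | VNW | LID | VZ | VG | BW | TSW
# @dcr:datcat PVTIJD http://www.isocat.org/datacat/DC-1286
PVTIJD = tegenwoordig | verleden | conjunctief
"""

annotations = datcat_annotations(SAMPLE)
```

Here `annotations["WW"]` yields the URL attached to the tag `WW`, so a downstream tool can look up what the tag means without knowing the CGN/DCOI tagset.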
  9. Sharing relations • Ontological relationships can be defined among data categories and (other) concepts • These relationships allow crosswalks between various resource models – discover related resources that use (different levels of) semantically close data categories • RELcat is a companion registry that will allow storing (and sharing) a linguist’s individual view on these relationships: http://lux13.mpi.nl/relcat/ (alpha)
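The crosswalk idea can be sketched in Python under strong simplifying assumptions. The relation names (`sameAs`, `almostSameAs`), the tool B label `verb`, and the identifier `DC-X1` are hypothetical placeholders; only `WW` and `DC-1424` come from the slides. The sketch treats every relation as symmetric and looks only at direct relations, which a real relation registry need not do.

```python
from typing import NamedTuple


class Relation(NamedTuple):
    first: str     # data category identifier
    relation: str  # hypothetical relation name, e.g. "almostSameAs"
    second: str    # data category identifier


def compare(dc_a: str, dc_b: str, view: list[Relation]) -> str:
    """Say how two data categories relate under one linguist's view.

    Relations are treated as symmetric and only direct relations are used.
    """
    if dc_a == dc_b:
        return "identical"
    for rel in view:
        if {rel.first, rel.second} == {dc_a, dc_b}:
            return rel.relation
    return "unrelated"


# Tag-to-data-category mappings of two tools. "WW" -> DC-1424 is from the
# slides; tool B and its identifier "DC-X1" are made up for illustration.
TOOL_A = {"WW": "DC-1424"}
TOOL_B = {"verb": "DC-X1"}

# One linguist's individual view on how the data categories relate.
VIEW = [Relation("DC-1424", "almostSameAs", "DC-X1")]

answer = compare(TOOL_A["WW"], TOOL_B["verb"], VIEW)
```

Swapping in a different `VIEW` changes the answer without touching either tool's mapping, which mirrors the point that relations capture individual views while data categories stay shared.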
  10. Semantic network [Figure with the labels: Linguistic resource (schema); Linguistic knowledge base; Data categories; Containers; Concepts; Relation; Schema Registry - SCHEMAcat; Data Category Registry - ISOcat; Concept Registry; Relation Registry - RELcat]
  11. Conclusion • CLARIN(-NL/-VL), including TTNWW, is working towards a set of registries that enable the community to collaboratively make semantics explicit by: – sharing elementary descriptors (data categories), persistently – sharing structure (schemata), persistently – sharing ontological relations, reflecting individual world views
  12. What do cats have to do with explicit semantics?
  13. Thank you for your attention! Visit www.isocat.org. Questions? www.isocat.org/forum/ or isocat@mpi.nl
