The ISO-DCR
CMDI tutorial, 17 January 2011

Marc Kemps-Snijders (a), Menzo Windhouwer (b), Sue Ellen Wright (c)
(a) Meertens Institute, (b) MPI for Psycholinguistics, (c) Kent State University
menzo.windhouwer@mpi.nl

Outline
  • ISOcat: a Data Category Registry
  • The role of data categories in CMDI
  • A glimpse of ISOcat
  • Status of the metadata profile

ISOcat: a Data Category Registry
  • The reference implementation of ISO 12620:2009, "Terminology and other content and language resources — Specification of data categories and management of a Data Category Registry for language resources"
  • A data category is
      • the result of the specification of a given data field
      • an elementary descriptor in a linguistic structure or an annotation scheme

Data categories and linguistic resources
[Diagram: a schema for a lexicon (Lexicon, Lexical Entry, Form, Sense, with cardinalities 1..* and 0..*) shown next to a schema for a typological database. Data category labels in the diagram: partOfSpeech, Lemma, writtenForm (twice), Word Form, grammaticalGender (twice), lexicalType, wordOrder.]
  • Shared semantics!

Data category specification
  • Administrative information
      • Identifier
      • Version
      • Origin
      • Justification
      • Status
  • Descriptive information
      • Names, definitions, examples and explanations in various languages (English is mandatory)
      • Application (domain) specific names
  • Conceptual domain
      • Possible values (per profile)
  • Linguistic information
      • Examples and explanations for various languages
      • Possible values for various languages
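
As a rough illustration of the sections listed on this slide, a data category specification could be modelled as a small record type. This is only a sketch: the class names, field names, sample values and the way the "English is mandatory" rule is checked are assumptions made for this example, not the ISOcat data model or API.

```python
from dataclasses import dataclass, field


@dataclass
class LanguageSection:
    """Descriptive information for one language (hypothetical structure)."""
    name: str
    definition: str
    examples: list[str] = field(default_factory=list)
    explanations: list[str] = field(default_factory=list)


@dataclass
class DataCategory:
    # Administrative information
    identifier: str
    version: str
    origin: str
    justification: str
    status: str
    # Descriptive information, keyed by language code
    descriptions: dict[str, LanguageSection]
    # Conceptual domain: possible values per profile
    conceptual_domain: dict[str, list[str]] = field(default_factory=dict)
    # Linguistic information: possible values per object language
    linguistic_values: dict[str, list[str]] = field(default_factory=dict)

    def __post_init__(self):
        # The slide states that the English description is mandatory.
        if "en" not in self.descriptions:
            raise ValueError("an English language section is mandatory")


# All sample values below are invented for illustration.
gender = DataCategory(
    identifier="grammaticalGender",
    version="1:0",
    origin="example",
    justification="needed to describe lexicon entries",
    status="candidate",
    descriptions={
        "en": LanguageSection("grammatical gender", "Illustrative definition text.")
    },
    conceptual_domain={"Metadata": ["masculine", "feminine", "neuter"]},
    linguistic_values={"fr": ["masculine", "feminine"]},
)
print(gender.identifier, sorted(gender.conceptual_domain))
```

Keeping the conceptual domain keyed by profile and the linguistic values keyed by language mirrors the "per profile" and "for various languages" wording of the slide.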

The role of data categories in CMDI
  • CMD components, elements and items can have links to concepts
      • These links should be resolvable to a concept description
      • This concept description gives explicit semantics
      • Elements and components can use different terminology but still have common semantics
  • ISOcat provides resolvable links to the semantic description of data categories (DCs)
      • CMD items: simple DCs
      • CMD elements: complex DCs
      • CMD components: container DCs (upcoming)
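
The point about different terminology with common semantics can be made concrete with a small sketch: two elements share semantics when their concept links are identical, whatever they are called. The profile names, element names and example.org links below are hypothetical placeholders, not real ISOcat data categories.

```python
from collections import defaultdict

# (profile, element name, ConceptLink); every value here is a made-up example.
ELEMENTS = [
    ("LexiconProfile", "Headword", "http://example.org/datcat/DC-A"),
    ("CorpusProfile", "Lemma", "http://example.org/datcat/DC-A"),
    ("CorpusProfile", "Genre", "http://example.org/datcat/DC-B"),
]


def group_by_concept(elements):
    """Map each ConceptLink to the (profile, name) pairs that reference it."""
    groups = defaultdict(list)
    for profile, name, link in elements:
        groups[link].append((profile, name))
    return dict(groups)


def same_semantics(elements, a, b):
    """True if elements a and b, given as (profile, name), share a ConceptLink."""
    links = {(profile, name): link for profile, name, link in elements}
    return links[a] == links[b]


for link, users in group_by_concept(ELEMENTS).items():
    print(link, users)
```

Here "Headword" and "Lemma" use different names but end up in the same group, because both point to the same concept link.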

Data category references in CMDI

<CMD_Component name="HeadWordType">
  <CMD_Element name="HeadWordType" ConceptLink="http://www.isocat.org/datcat/DC-2486">
    <ValueScheme>
      <enumeration>
        <item ConceptLink="http://www.isocat.org/datcat/DC-286">Lemma</item>
        <item ConceptLink="http://www.isocat.org/datcat/DC-2948">Word form</item>
        <item ConceptLink="http://www.isocat.org/datcat/DC-350">Phrase</item>
        <item ConceptLink="http://www.isocat.org/datcat/DC-1386">Sentence</item>
        <item ConceptLink="http://www.isocat.org/datcat/DC-2599">Other</item>
        <item ConceptLink="http://www.isocat.org/datcat/DC-2592">Unspecified</item>
      </enumeration>
    </ValueScheme>
  </CMD_Element>
</CMD_Component>
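
To show how the ConceptLink attributes in this component could be read back out, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The XML is the component from the slide; the function name concept_links is an assumption of this example, and nothing is resolved over the network.

```python
import xml.etree.ElementTree as ET

COMPONENT_XML = """
<CMD_Component name="HeadWordType">
  <CMD_Element name="HeadWordType" ConceptLink="http://www.isocat.org/datcat/DC-2486">
    <ValueScheme>
      <enumeration>
        <item ConceptLink="http://www.isocat.org/datcat/DC-286">Lemma</item>
        <item ConceptLink="http://www.isocat.org/datcat/DC-2948">Word form</item>
        <item ConceptLink="http://www.isocat.org/datcat/DC-350">Phrase</item>
        <item ConceptLink="http://www.isocat.org/datcat/DC-1386">Sentence</item>
        <item ConceptLink="http://www.isocat.org/datcat/DC-2599">Other</item>
        <item ConceptLink="http://www.isocat.org/datcat/DC-2592">Unspecified</item>
      </enumeration>
    </ValueScheme>
  </CMD_Element>
</CMD_Component>
"""


def concept_links(xml_text):
    """Return (tag, label, ConceptLink) for every node carrying a ConceptLink."""
    root = ET.fromstring(xml_text)
    links = []
    for node in root.iter():  # document order, root included
        link = node.get("ConceptLink")
        if link is None:
            continue
        # Elements are labelled by their name attribute, items by their text.
        label = node.get("name") or (node.text or "").strip()
        links.append((node.tag, label, link))
    return links


for tag, label, link in concept_links(COMPONENT_XML):
    print(f"{tag:12} {label:12} {link}")
```

The element itself carries one link and each enumeration item carries its own, which matches the split between complex DCs (elements) and simple DCs (items) described on the previous slide.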

A glimpse of ISOcat
http://www.isocat.org/

Status of the metadata profile
  • Initial set of data categories has been created (to never disappear) by the Athens Core group
      • But your own components might need your own specific DCs
  • Translations for many EU languages have been added
  • No ISO standardization yet
      • TDG is working towards starting up the process
      • Athens Core group will become more prominent
  • The addition of container DCs to be linked to CMD components is planned

Standardization
[Diagram: the standardization workflow. Bodies shown: Submission group, Stewardship group, Thematic Domain Group, Decision Group, Data Category Registry Board. Steps shown: Validation, Evaluation, Publication, with two "rejected" exits.]

Component Registry
  • Interacts with ISOcat
      • Access to public DCs in the metadata profile
      • So, to access your private DCs, you currently have to make them public
  • Working towards:
      • The ability to access your private workspace from the Component Registry
      • You will still have to make your DCs public if you make your component/profile public

Thank you for your attention!
  • Visit www.isocat.org
  • Questions? isocat@mpi.nl or http://trac.clarin.nl/ / helpdesk@clarin.nl or the CLARIN-NL ISOcat tutorial 2011
