Taalportaal: an online grammar of Dutch and Frisian
Frank Landsbergen, Carole Tiberius & Roderik Dernison
Institute for Dutch Lexicology, Leiden, The Netherlands
{frank.landsbergen, carole.tiberius, roderik.dernison}@inl.nl
Taalportaal workflow
The input for Taalportaal is formed by
grammatical texts in XML. Most Taalportaal
authors write directly in XML using the
Taalportaal editing environment. The materials
of the Syntax of Dutch follow a different route
because of their origin: they are Word
documents which are automatically converted
to XML.
The Taalportaal authors, who are based at
different institutes and universities throughout
the country, store their grammatical texts in a
central Subversion repository. This data is
copied onto a local file system and forms,
together with the database of terms and the
database of bibliographic references, the core of
the Taalportaal web application. The process,
from retrieving the data from the Subversion
repository to displaying it online, is
completely automated, so that data updates
can be carried out at regular intervals.
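As an illustration of one automated step between repository and website, the sketch below copies only well-formed XML files from a checked-out working copy to a directory read by the web application. The function name publish_topics, the flat directory layout and the rule for rejecting files are assumptions made for this example; the poster does not describe how the actual pipeline is implemented.

```python
import shutil
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def publish_topics(working_copy: Path, web_root: Path) -> tuple[list[str], list[str]]:
    """Copy well-formed XML files from working_copy to web_root.

    Hypothetical helper: returns (published, rejected) file names,
    both in alphabetical order.
    """
    web_root.mkdir(parents=True, exist_ok=True)
    published: list[str] = []
    rejected: list[str] = []
    for source in sorted(working_copy.glob("*.xml")):
        try:
            ET.parse(source)  # raises ParseError if the file is not well-formed
        except ET.ParseError:
            rejected.append(source.name)
            continue
        shutil.copy2(source, web_root / source.name)
        published.append(source.name)
    return published, rejected


if __name__ == "__main__":
    # Small demonstration with temporary directories (assumed layout).
    with tempfile.TemporaryDirectory() as tmp:
        checkout = Path(tmp) / "checkout"
        checkout.mkdir()
        (checkout / "topic.xml").write_text("<topic id='t1'/>", encoding="utf-8")
        print(publish_topics(checkout, Path(tmp) / "web"))
```

Files that fail the check are left out of the copy and reported, so a scheduled run can continue without publishing broken topics.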
Taalportaal website
The core of the Taalportaal website consists of a vast number of topics: small, independently readable
texts on a specific grammatical subject from Dutch or Frisian phonology, morphology or syntax.
Elements of a topic page:
(1) search box
(2) left panel with metadata information, table of contents, and links to bibliography and glossary
(3) breadcrumbs, showing the position of the current page in the table of contents hierarchy
(4) right panel with options: zoom, comment, print, email, cite (not available in the current beta version)
(5) short introductory description of what the topic is about
(6) main text of topic
(7) reference list and links to related topics (not available in the current beta version)
Author infrastructure
The content of Taalportaal is created by a team of
authors. Content is, where possible, taken from
existing descriptions of Dutch and Frisian, and
updated where needed. Since the original sources
generally have a linear structure, they have been
rewritten to make them more suitable for the
internet.
Authors use the XML-editor oXygen, which has
been customized to the specific needs of the
project. The editor is linked to an extensive
database of bibliographical references, from which
authors can insert entries into their texts and in which
they can add or edit entries. This database has been filled
by merging the references from several textbooks
on Dutch phonology, morphology and syntax.
Authors use a Taalportaal XML schema, which is
based on the DITA format (http://dita.xml.org), to
enforce a topic-based approach to writing.
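To make the topic-based approach concrete, the sketch below parses a small topic and extracts the fields a topic page would need. The element names (topic, title, shortdesc, body) are those of the generic DITA topic type; the actual Taalportaal schema is only said to be based on DITA and may differ, and the sample content and the function name summarize_topic are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical topic using generic DITA element names; the real
# Taalportaal schema may use different or additional elements.
SAMPLE_TOPIC = """\
<topic id="example-topic">
  <title>Example topic title</title>
  <shortdesc>One-sentence summary of the topic.</shortdesc>
  <body>
    <p>Main text of the topic.</p>
  </body>
</topic>
"""


def summarize_topic(xml_text: str) -> dict[str, str]:
    """Return the id, title and short description of a single topic."""
    root = ET.fromstring(xml_text)
    if root.tag != "topic":
        raise ValueError(f"expected a <topic> root element, got <{root.tag}>")
    return {
        "id": root.get("id", ""),
        "title": (root.findtext("title") or "").strip(),
        "shortdesc": (root.findtext("shortdesc") or "").strip(),
    }


if __name__ == "__main__":
    print(summarize_topic(SAMPLE_TOPIC))
```

Because every topic is a self-contained unit with its own identifier, title and summary, a site can list, link and display topics independently of the order in which they were written.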
What is Taalportaal?
Taalportaal will create an online portal containing an
exhaustive and fully searchable electronic grammar
of Dutch and Frisian phonology, morphology and
syntax. Its content will be in English.
Why Taalportaal?
• Currently, no comprehensive scientific online grammar of
Dutch and Frisian exists.
• Taalportaal will serve the scientific community by organizing,
integrating and completing the grammatical knowledge of
both languages, and by making this data accessible in an
innovative way.
• The digital design of the portal enables interoperability
between the linguistic categories of phonology, morphology
and syntax on the one hand, and between the two languages
on the other. The portal’s rich crosslinking will benefit these
domains of research, which are now often studied in
isolation.
Taalportaal consortium and sponsor (logos)
Taalportaal timeline (2011-2016): start of the project, beta version online, delivery of final version
Dutch syntax
All data for the module of Dutch syntax is taken
directly from the existing (and almost completed)
Syntax of Dutch, an up-to-date and comprehensive
syntactic description of Standard Dutch. Since it is
written in Word and covers thousands of pages, it
has to be converted to XML before it can be added
to the portal. This conversion is done with a Perl
script written specifically for this job. Since the
converted material can still contain errors, all texts
need to be checked manually before they can appear
on the portal site. Currently, one of the three parts
of the Syntax of Dutch has been converted and is
published on the Taalportaal website.
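The authors' converter is a Perl script whose details are not given here. Purely as an illustration of the target of such a conversion, the following Python sketch wraps paragraphs that have already been extracted from a Word document in a topic element. The function name paragraphs_to_topic, the element names and the rule for skipping empty paragraphs are assumptions, not the project's actual conversion logic.

```python
import xml.etree.ElementTree as ET


def paragraphs_to_topic(topic_id: str, title: str, paragraphs: list[str]) -> str:
    """Build a minimal topic from plain-text paragraphs (hypothetical layout)."""
    topic = ET.Element("topic", {"id": topic_id})
    ET.SubElement(topic, "title").text = title
    body = ET.SubElement(topic, "body")
    for text in paragraphs:
        cleaned = text.strip()
        if cleaned:  # skip empty paragraphs left over from the source layout
            ET.SubElement(body, "p").text = cleaned
    # ElementTree escapes characters such as "&" and "<" when serializing.
    return ET.tostring(topic, encoding="unicode")


if __name__ == "__main__":
    print(paragraphs_to_topic("t1", "Sample title", ["First paragraph.", ""]))
```

Even when the output is well-formed, a conversion like this cannot tell whether a paragraph was split or labelled correctly, which is one reason a manual check remains necessary.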
www.taalportaal.org

LREC Ton vd Wouden
