1. Frank Landsbergen, Carole Tiberius & Roderik Dernison; Institute for Dutch Lexicology, Leiden, The Netherlands; {frank.landsbergen, carole.tiberius, roderik.dernison}@inl.nl
Taalportaal: an online grammar of Dutch and Frisian
Taalportaal workflow
The input for Taalportaal consists of
grammatical texts in XML. Most Taalportaal
authors write directly in XML using the
Taalportaal editing environment. The material
from the Syntax of Dutch follows a different
route because of its origin: it consists of Word
documents that are automatically converted
to XML.
The Taalportaal authors, who are based at
different institutes and universities throughout
the country, store their grammatical texts in a
central Subversion repository. This data is
copied onto a local file system and forms,
together with the database of terms and the
database of bibliographic references, the core of
the Taalportaal web application. The process
from retrieving the data from the Subversion
repository to displaying it online is
completely automated, so that data updates
can be carried out at regular intervals.
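The update step described above (copying the repository contents to a local file system and picking up the XML texts) could be sketched roughly as follows. This is an illustration only, not the project's actual tooling: the function names, the repository URL and the directory layout are all assumptions, and the only external command used is the standard `svn export`.

```python
import subprocess
from pathlib import Path


def build_export_command(repo_url, dest):
    """Return the `svn export` command that copies the repository to `dest`."""
    # --force lets the export overwrite an earlier copy in the same directory.
    return ["svn", "export", "--force", repo_url, str(dest)]


def collect_topic_files(root):
    """Return all XML files below `root`, sorted for a stable processing order."""
    return sorted(Path(root).rglob("*.xml"))


def update_local_copy(repo_url, dest, runner=subprocess.run):
    """Export the repository (hypothetical URL) and list the XML files found."""
    # `runner` is injectable so the step can be exercised without a real server.
    runner(build_export_command(repo_url, dest), check=True)
    return collect_topic_files(dest)
```

Running `update_local_copy` from a scheduler (for example a nightly job) would give the regular update interval mentioned in the text; what happens to the collected files afterwards is left out of this sketch.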
Taalportaal website
The core of the Taalportaal website consists of a vast number of topics: small, independently readable
texts on a specific grammatical subject from Dutch or Frisian phonology, morphology or syntax.
(1) search box
(2) left panel with metadata information, table of contents, and links to bibliography and glossary
(3) breadcrumbs, showing the position of the current page in the table of contents hierarchy
(4) right panel with options zoom, comment, print, email, cite (not available in the current beta version)
(5) short introductory description of what the topic is about
(6) main text of topic
(7) reference list and links to related topics (not available in the current beta version)
Author infrastructure
The content of Taalportaal is created by a team of
authors. Content is, where possible, taken from
existing descriptions of Dutch and Frisian, and
updated where needed. Since the original sources
generally have a linear structure, they have been
rewritten to make them more suitable for the
internet.
Authors use the XML editor oXygen, which has
been customized to the specific needs of the
project. This editor contains a link to an extensive
database of bibliographic references, which
authors can use to insert references into their texts
and to add or edit entries. This database has been filled
by merging the references from several textbooks
on Dutch phonology, morphology and syntax.
Authors use a Taalportaal XML schema, which is
based on the DITA format (http://dita.xml.org), to
enforce a topic-based approach to writing.
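To make the topic-based approach concrete, the sketch below shows what a minimal DITA-style topic might look like and how its basic structure could be checked. The element names (`topic`, `title`, `shortdesc`, `body`) come from standard DITA; the actual Taalportaal schema is not shown in this poster, so the sample topic and the list of required elements are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# A hypothetical topic: a title, a short introductory description, and a body.
SAMPLE_TOPIC = """\
<topic id="example-topic">
  <title>Example topic</title>
  <shortdesc>One or two sentences that say what the topic is about.</shortdesc>
  <body>
    <p>Main text of the topic.</p>
  </body>
</topic>
"""

# Assumed project rule: every topic carries all three child elements.
REQUIRED_CHILDREN = ("title", "shortdesc", "body")


def missing_elements(xml_text):
    """Return the names of required elements that the topic lacks."""
    root = ET.fromstring(xml_text)
    if root.tag != "topic":
        return ["topic"]
    return [name for name in REQUIRED_CHILDREN if root.find(name) is None]


print(missing_elements(SAMPLE_TOPIC))
```

In practice a schema-aware editor performs this kind of check while the author types; the function above only mimics the idea that each text is a small, self-contained unit with a fixed skeleton.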
Dutch syntax
What is Taalportaal?
Taalportaal will create an online portal containing an
exhaustive and fully searchable electronic grammar
of Dutch and Frisian phonology, morphology and
syntax. Its content will be in English.
Why Taalportaal?
• Currently, no comprehensive scientific online grammar of
Dutch and Frisian exists.
• Taalportaal will serve the scientific community by organizing,
integrating and completing the grammatical knowledge of
both languages, and by making this data accessible in an
innovative way.
• The digital design of the portal enables interoperability
between the linguistic categories of phonology, morphology
and syntax on the one hand, and between the two languages
on the other. The portal’s rich crosslinking will benefit these
domains of research, which are now often studied in
isolation.
Taalportaal consortium
Sponsor:
Taalportaal timeline
2011 2012 2013 2014 2015 2016
start of the project; beta version online; delivery of final version
All data for the module of Dutch syntax is taken
directly from the existing (and almost completed)
Syntax of Dutch, an up-to-date and comprehensive
syntactic description of Standard Dutch. Since its
data is written in Word and covers thousands of
pages, it has to be converted to XML before it can
be added to the portal. This conversion is done
using a Perl script that was created specifically
for this job. Since the converted material can
still contain errors, all texts need to be checked
manually before they can appear on the portal
site. Currently, one of the three parts of the
Syntax of Dutch has been converted and is
published on the Taalportaal website.
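Because converted material can still contain errors, an automatic first pass could flag files that are not even well-formed XML before the manual check begins. The sketch below is one possible way to do this in Python; it is not the Perl conversion script mentioned above, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def well_formedness_error(path):
    """Return the parser's error message, or None if the file parses cleanly."""
    try:
        ET.parse(str(path))
    except ET.ParseError as exc:
        return str(exc)
    return None


def triage(paths):
    """Map each file name to its parse error (None means well-formed)."""
    return {Path(p).name: well_formedness_error(p) for p in paths}
```

Files that pass this check still need the manual review described in the text, since well-formed XML can contain conversion errors in its content.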
www.taalportaal.org