1. Terminology Management
in Technical Communication
tcworld India
Bangalore, 11 – 12 March 2011
Klaus-Dirk Schmitz
Institute for Information Management
Faculty 03
University of Applied Sciences Cologne
klaus.schmitz@fh-koeln.de
2. Overview
What is terminology?
Basic principles of terminology management
Terminology management in companies
Tools for managing terminology
Terminology extraction tools
Terminology databases
Terminology control tools
K.-D. Schmitz, IIM, FH Köln
3. What is terminology?
Note: If you omit the password, MultiTerm
prompts you for a password when loading,
assuming the database is password-protected.
If you log on as the system administrator, you
are normally asked whether you want exclusive
access to the database. This is not the case when
opening a database using parameters; in this
case, it is assumed that you do want exclusive
access. Only when exclusive access is not
available, MultiTerm does assume that you still
want to take part in normal multi-user operation.
6. What is terminology?
Terms in the sample note: database, exclusive access, loading, log on, MultiTerm*, multi-user operation, open a database, parameter, password, password-protected, prompt, system administrator
terminology
= vocabulary of a subject field
= totality of the concepts and their designations in a subject field (DIN 2342)
= set of designations belonging to one special language (ISO 1087-1)
8. Object
Any part of the perceivable or
conceivable world
Objects may be material (e.g. mouse)
or immaterial (e.g. magnetism)
9. Concept
Unit of thinking
made up of characteristics
that are derived by categorizing objects
having a number of identical properties (DIN)
Unit of knowledge created by a unique
combination of characteristics (ISO)
Concepts are not bound to particular languages.
They are, however, influenced by social or cultural
background
10. Term
Example: “mouse”
Designation of a defined concept
in a special language
by a linguistic expression
Designation: Any representation of a concept
A term may consist of one or more words
Single-word term: mouse, printer, laser
Multi-word term: laser printer, printer with single-sheet feed
11. Synonymy
Communication can be disturbed!
“return key?”
“enter key”
12. Synonymy
Synonymy exists if two or more terms in a given language represent the same concept.
14. Homonymy / Polysemy
Communication can fail!
“mouse”
15. Homonymy / Polysemy
Homonymy exists if one term or several terms that have the same external form refer to several concepts.
Polysemy: etymological affinity, the same word.
True homonymy: different words with the same form, no etymological affinity.
Differentiation between polysemy and homonymy is irrelevant for terminology work.
19. Term-related issues
Criteria for the selection and creation/coining of
terms:
Transparency/motivation
Consistency
Appropriateness
Linguistic economy
Derivability
Linguistic correctness
Preference for native language
20. Transparency
The concept designated by the term can be
inferred without a definition
(e.g. pipe wrench or adjustable wrench vs. monkey wrench)
The meaning of the term is visible by:
morphological motivation
page setup, error message
data network identification code*
network printing device setup*
semantic motivation
worm, virus, infected file, vulnerability, firewall*
21. Consistency
Terminology must be defined accurately and
used consistently at least within:
one document
one product
one company or organization
Only one term for each concept
(avoid synonyms!)
Only one concept for each term
(avoid homonyms!)
But also consistent term coining (see Wikipedia: wrench)
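The two rules on this slide (one term per concept, one concept per term) can be checked mechanically once terms are linked to concept identifiers. The following is a minimal sketch, assuming a hypothetical list of (concept ID, term) pairs; the IDs, the sample data and the function names are illustrative and not taken from any particular tool.

```python
from collections import defaultdict

# Hypothetical sample data: (concept_id, term) pairs for one language.
PAIRS = [
    ("C1", "enter key"),
    ("C1", "return key"),   # second term for the same concept: synonym
    ("C2", "mouse"),        # pointing device
    ("C3", "mouse"),        # animal: same form, different concept
    ("C4", "laser printer"),
]

def find_synonyms(pairs):
    """Return concepts that are designated by more than one term."""
    terms_by_concept = defaultdict(set)
    for concept_id, term in pairs:
        terms_by_concept[concept_id].add(term)
    return {c: sorted(t) for c, t in terms_by_concept.items() if len(t) > 1}

def find_homonyms(pairs):
    """Return terms that designate more than one concept."""
    concepts_by_term = defaultdict(set)
    for concept_id, term in pairs:
        concepts_by_term[term].add(concept_id)
    return {t: sorted(c) for t, c in concepts_by_term.items() if len(c) > 1}

print(find_synonyms(PAIRS))
print(find_homonyms(PAIRS))
```

Each finding is a candidate for a decision by the terminologist (which term is preferred, which concept keeps the term); the check itself only reports violations of the two rules.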
22. Appropriateness
Appropriateness means that terms:
have to be familiar to the user (localization!)
don’t cause confusion or insecurity
have no negative connotations (neutral, politically correct)
express installation (only components needed)
network installation (all components)
system error, severe error, fatal error, user error etc.
master/slave, web designers’ bible, knowledge nugget etc.
nuclear energy vs. atomic energy
24. Other features
Linguistic economy
Ultrakurzwellenüberreichweitenfernsehrichtfunkverbindung
Derivability
medicinal plant vs. herb → herbal, herbalist, ...
Bedeutungslehre vs. Semantik
Linguistic correctness
aktualisieren vs. updaten, geupdated, upgedatet, ...
OpenSource, Cafe ToGo, fünfköpfiger Familienvater, …
Preference for native language
Startseite vs. Homepage, Multifunktionsleiste vs. Ribbon
26. Terminology in companies
Terminology is an important carrier of knowledge for
domain-specific information in companies
Within a company, but also between company and
customers as well as between company and suppliers
Many departments and sectors of a company create,
disseminate, retrieve and use terminology
BUT: Terminology management is controversial:
Costs + effort
Quality and efficiency
Therefore: Costs and benefits of terminology management
27. tekom survey
Successful terminology
management in companies
Practical tips and guidelines:
Basic principles, implementation,
cost-benefit analysis, system
overview
published in German 4/2010,
about 300 pages, CD with data
also available in English
28. tekom survey
Online questionnaire end of 2009
about 1,000 sent out, mostly to tekom members
High response rate of 940 questionnaires (77% tekom)
34% managerial staff and CEOs
64% employees
67% industrial enterprises
15% software companies
13% service providers (TD / translation / localization)
(And: questionnaire for tool providers, questionnaire for
benchmarking companies (25), 2 benchmarking workshops)
29. tekom survey: the situation
On average, a company
manages 11.81 different information products
creates 5.87 different types of technical documentation
has 5.04 different sections/departments involved in creating (new) terms
translates information products into 10.1 different languages
30. Information products by product phase (departments involved: information products)
Product planning and development
R&D & Product Management: Product specifications
R&D: Procedural instructions; Process documentation for product development; CAD graphics, 3D models pertaining to the product; Laboratory manuals
R&D & software development: Software GUIs / user menus
R&D & TD marketing: Images pertaining to the product
R&D & TD: Data sheets; Parts lists
R&D & Marketing: Packaging labels / product labels
QM: Quality documentation
Product marketing
R&D & Product management: Information pertaining to product release changes
Marketing and product management: Product information sheets for pre-sales
Marketing: Marketing material; Customer presentations
Sales and Marketing & TD: Product catalogues; Price catalogues; Contract documents
Corporate communications: Press releases
Product usage
Technical Communication (TD): Assembly and installation instructions; Commissioning manuals; Online help
Training & TD: Training documents
Marketing and TD: Multimedia/simulation programs
TD & R&D & software dev.: User interfaces (GUIs) / software descriptions
R&D: Control elements and labels
Product maintenance
TD: Maintenance/Service manuals
TD & Service: Repair manuals; Spare parts catalogues
31. Who is creating terminology?
(Multiple answers; on average 5.04 different departments/sections)
Technical documentation 79.7%
Research / (software) development / engineering 79.7%
Marketing 63.5%
Product management / Portfolio management 61.3%
Translation / Localization 40.4%
Distribution / Sales 39.3%
Customer service / After sales 30.7%
Training 28.4%
Management board 26.3%
Corporate communications / Public relations 24.4%
Quality assurance / Quality management 16.3%
Purchase / Procurement 12.3%
Assembly / Assembly planning / Production 12.3%
Servicing / Maintenance 8.2%
IT service 6.7%
32. Terminology problems
84.2% report that various departments/sections always or very frequently use different terms for the same concept
70.7% report that differing terms for the same concept are always or very frequently used in various documents
47.1% of the staff always or very frequently have problems understanding technical terms on the spot
51.1% of the staff always or very frequently have to ask for or look up the correct term for a given concept
34. Opinions about terminology
[Chart: share of respondents who rate the benefit of terminology management as rather large to very large, for four aspects: saving time, making work easier, improving quality (of documents and communication), better understanding for the customer. Values shown in the chart: 96.9%, 96.0%, 96.0%, 87.7%]
35. Terminology documentation
Documentation of terminology
Definitions 84.3%
Subject field information 78.2%
Status (preferred, admitted, deprecated, do not use etc.) 72.3%
Grammatical information (Gender, POS, Number etc.) 51.4%
Project, product, customer, department information 44.9%
Illustrations 34.8%
And: Context, Example, Synonym, No-Term, Source, Explanation,
References, Position numbers, short forms
36. Model for a cost-benefit analysis
1. Analysis of the problem
2. Analysis of the impact (very often, very expensive, bad quality)
3. Framework conditions for the value of benefit (many
documents, high volumes, many languages, using TMS, using CMS)
4. Definition of goals (management triangle: costs, time, quality)
5. Analysis of benefit (consistent terminology, fewer changes, higher
TM match rate, fewer queries by translators)
6. With/without comparison
7. Analysis of costs (initial costs: system, implementation, training;
running costs: licenses, personnel, working hours)
8. Cost-benefit analysis
9. Success factors and risks (early, involve all, workflow)
37. Key performance indicators: benefits
For a specific application scenario:
Costs for changes of terms in the source language
Costs for queries from translators
Costs for terminology related translation corrections
Translation costs
e.g. for changes of terms:
duration in number of hours of work for making changes
wages for the hours of work
number of changes made per document
number of newly created documents per year
And: when changes? in which formats/systems? with/without termbase!
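The indicators listed for changes of terms can be multiplied out into a yearly figure. The sketch below shows one possible way to do this; the formula, the function name and all input values are illustrative assumptions, not figures from the survey.

```python
def yearly_term_change_cost(hours_per_change, hourly_wage,
                            changes_per_document, new_documents_per_year):
    """Yearly cost of changing terms in source-language documents."""
    cost_per_change = hours_per_change * hourly_wage
    return cost_per_change * changes_per_document * new_documents_per_year

# Hypothetical inputs for a with/without-termbase comparison.
without_termbase = yearly_term_change_cost(0.25, 40.0, 20, 100)
with_termbase = yearly_term_change_cost(0.25, 40.0, 5, 100)

print(without_termbase)                  # 20000.0
print(with_termbase)                     # 5000.0
print(without_termbase - with_termbase)  # 15000.0
```

The difference between the two scenarios is the benefit attributed to this one indicator; the other indicators (translator queries, translation corrections, translation costs) would be calculated in the same way and added up.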
38. Key performance indicators: costs
For a specific application scenario:
Procurement costs for a termbase system (12,000-40,000 €)
Costs for system support and update (1,100-9,000 €)
Time for one SL term entry (ø 30 min) plus TL info (ø 20 min)
Number of new entries/year (60-600) + updated entries (100-1200)
Monthly salaries for the terminology staff
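The cost indicators can be combined in the same way. The sketch below uses the lower bounds of the ranges quoted on this slide; the hourly rate, the number of target languages, the decision to count the target-language time once per language, and the treatment of the support costs as a yearly amount are assumptions made only for illustration.

```python
def yearly_entry_cost(new_entries, target_languages, hourly_rate,
                      sl_minutes=30, tl_minutes=20):
    """Staff cost per year for creating new entries.

    sl_minutes / tl_minutes default to the average times quoted on the slide;
    applying tl_minutes once per target language is an assumption.
    """
    minutes_per_entry = sl_minutes + tl_minutes * target_languages
    return new_entries * minutes_per_entry / 60 * hourly_rate

procurement = 12_000        # lower bound of the quoted range, in EUR
support_per_year = 1_100    # lower bound of the quoted range, in EUR (assumed yearly)
staff = yearly_entry_cost(new_entries=60, target_languages=3, hourly_rate=50.0)

first_year_total = procurement + support_per_year + staff
print(staff)             # 4500.0
print(first_year_total)  # 17600.0
```

Updated entries and other running costs are left out here; they would be added as further terms of the same sum.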
39. Cost-benefit analysis
Type of problems, degree of necessity, degree of effects
Framework conditions for optimum usage: e.g. number of languages, amount of translation, TMS usage, number of employees in TD
Benefit key indicators: e.g. costs for source text changes, costs for answering translators' questions, costs for target text corrections, TMS match rates
Cost key indicators: e.g. investment costs for termbase system, costs for training personnel, running costs for terminology management, system maintenance
Alternatives for benefits: without defined terminology / with defined terminology
Alternatives for costs: (without defined terminology) / with defined terminology
Figures to compare from benchmarking or estimation
Evaluation and comparison of benefits; evaluation and comparison of costs
45. Terminology extraction
In many application scenarios of terminology work, the extraction of
terminology from (existing) textual material is recommended.
We can differentiate between the following extraction methods:
Monolingual term extraction (text in electronic form)
Bilingual term extraction (parallel aligned texts, i.e. TMs)
Manual (human) term extraction
Computer-assisted term extraction (tools propose term candidates)
• With statistical methods (for “all” languages, cannot use
knowledge about syntax)
• With linguistic methods (better results, but only for “important”
languages)
• With hybrid methods (combining statistical and linguistic methods)
46. Terminology extraction tools
Features of (monolingual) term extraction tools:
Common functionalities from concordance programs (e.g.
WordSmith): identify words, word statistics, KWIC index,
alphabetic/frequency order
Reducing inflected word forms to the basic canonical form:
needed for real statistics, needs morphological knowledge
Filtering and ignoring function words (articles, conjunctions etc.)
and general language words (but what is general language?)
Filtering and ignoring terms that are already included in a term
base
Identifying multi-word terms, noun phrases and verbal phrases
Identifying discontinuous elements and elliptical constructions
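The KWIC (keyword in context) index mentioned among the concordance functions can be sketched in a few lines. This is a simplified illustration, assuming whitespace and punctuation tokenization and a fixed context width; the function name and parameters are hypothetical.

```python
import re

def kwic(text, keyword, width=3):
    """Return (left context, keyword, right context) for each hit."""
    tokens = re.findall(r"\w+(?:-\w+)*", text)
    hits = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, token, right))
    return hits

SAMPLE = ("If you omit the password, MultiTerm prompts you for a password "
          "when loading, assuming the database is password-protected.")

for left, key, right in kwic(SAMPLE, "password"):
    print(f"{left:>25} | {key} | {right}")
```

The hyphenated form "password-protected" is kept as one token and is therefore not listed as a hit for "password"; recognizing such related forms is where the morphological knowledge mentioned above comes in.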
47. Terminology extraction tools
Improving and enriching term candidates with SDL MultiTerm Extract
48. Terminology extraction tools
Settings to improve the results of term extraction with SDL MultiTerm Extract
49. Terminology extraction tools
Benefits and problems of term extraction tools:
Term extraction tools are helpful in preparing terminology for
large translation projects (with several translators) and for an
initial feeding of a term base (with company or subject specific
terminology)
The result of a term extraction is a list of term candidates; the list
must be checked; but what about terms in the texts that were not
extracted?
The results are only terms (and context examples), with no other
terminological information; they are a kind of to-do list for the
terminologist
The more linguistic knowledge a tool uses, the better the results; but what about “less
common” and minority languages?
50. Terminology management systems
Terminology management systems are software
applications that are designed to manage terminological
data.
They support tasks related to terminology work and store the
results: Terminological data can be entered, edited, deleted,
retrieved and filtered.
Most of the systems available on the market are based on
(relational) database systems (MS Access, SQL, Oracle).
They can be seen as a kind of CAT tool (CAT = computer-assisted
translation).
Tables in word processing or spreadsheet programs are not
adequate for terminology management!
52. Designing a terminology management solution
Before designing a terminology management
solution and choosing, adapting or programming a
terminology management application:
Analyze the needs and objectives
Specify the user groups, tasks and workflow
Define the terminological data categories needed
Take into account the basic modelling principles
Model the terminological entry
Select, adapt, develop the software
Meta data
53. Typology of data categories I
Complex data categories
Open data categories
content not predictable and defined by specification
e.g.: term, definition, note
Closed data categories
content defined by a limited set of possible values
e.g.: gender, part of speech, geographical usage
Simple data categories
content only yes or no; values of closed data categories
e.g.: masculine, noun, DE
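The distinction between open and closed data categories maps naturally onto free-text fields and enumerations. The sketch below is one possible rendering, with hypothetical class names and deliberately incomplete value sets; the enumeration members correspond to the simple data categories (the permitted values).

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class Gender(Enum):
    """Closed data category: only these values are permitted."""
    MASCULINE = "masculine"
    FEMININE = "feminine"
    NEUTER = "neuter"

class PartOfSpeech(Enum):
    """Closed data category with an illustrative, incomplete value set."""
    NOUN = "noun"
    VERB = "verb"
    ADJECTIVE = "adjective"

@dataclass
class TermInfo:
    term: str                        # open data category: free text
    part_of_speech: PartOfSpeech     # closed data category
    gender: Optional[Gender] = None  # closed data category, optional
    note: str = ""                   # open data category: free text

entry = TermInfo(term="Drucker", part_of_speech=PartOfSpeech("noun"),
                 gender=Gender("masculine"))
print(entry)

try:
    Gender("common")   # not in the value list, so it is rejected
except ValueError as err:
    print("rejected:", err)
```

Restricting closed categories to a value list is what keeps entries consistent and filterable; open categories cannot be validated in this way.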
54. Typology of data categories II
Concept-oriented data categories
e.g.: subject field, figure
Language-oriented data categories
e.g.: definition?
Term-oriented data categories
e.g.: part of speech, context
Administrative data categories
e.g.: author, date, note
Special data categories
e.g.: term, language, (structural elements), (shared resources)
56. Concept orientation
All terminological information belonging
to one concept, including all terms in all
languages and all term-related and
administrative data, must be stored in
one terminological entry
concept = terminological entry
57. Lexicographical view / model / entry
[Diagram: one word in the centre, linked to several meanings]
58. Terminological view / model / entry
[Diagram: one concept in the centre, linked to several terms]
Descriptive terminology management
59. Terminological view / model / entry
[Diagram: one concept in the centre, linked to several terms; one term shown in brackets]
Prescriptive terminology management
64. Term autonomy
All terms belonging to one concept should
be managed (in one terminological entry)
as autonomous (repeatable) blocks of data
categories without any preference for a
specific term
Therefore all terms can be documented with the
relevant term-related data categories
Term autonomy is necessary for the main term, all
synonyms, all variants, and all short forms
Term autonomy is not explicitly discussed in
theoretical literature
65. Concept orientation & term autonomy
TermEntry
Concept
represented by ID-No. and/or classification / notation
Language 1 Language 2 Language 3 ...
+ AuxInfo + AuxInfo + AuxInfo
Term 1 Term 1 Term 1
+ AuxInfo + AuxInfo + AuxInfo
Term 2 Term 2
+ AuxInfo + AuxInfo
Term 3
+ AuxInfo
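The entry structure shown on this slide can be sketched as nested data classes: one entry per concept, one section per language, and any number of equally structured term blocks per language. The class and field names below are illustrative assumptions and are not taken from a specific terminology management system or from TBX.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Term:
    """One autonomous term block; every term carries its own data categories."""
    text: str
    part_of_speech: str = ""
    status: str = ""        # e.g. preferred, admitted, deprecated
    context: str = ""

@dataclass
class LanguageSection:
    language: str
    definition: str = ""
    terms: List[Term] = field(default_factory=list)

@dataclass
class TermEntry:
    """One entry per concept, holding all languages and all terms."""
    concept_id: str
    subject_field: str = ""
    languages: Dict[str, LanguageSection] = field(default_factory=dict)

    def add_term(self, language, term):
        # Create the language section on first use, then append the term.
        section = self.languages.setdefault(language, LanguageSection(language))
        section.terms.append(term)

    def terms_in(self, language):
        section = self.languages.get(language)
        return [t.text for t in section.terms] if section else []

entry = TermEntry("C1", subject_field="computer hardware")
entry.add_term("en", Term("enter key", "noun", "preferred"))
entry.add_term("en", Term("return key", "noun", "admitted"))
entry.add_term("de", Term("Eingabetaste", "noun", "preferred"))
print(entry.terms_in("en"))
```

The synonym "return key" gets the same set of fields as "enter key" instead of being squeezed into a single "synonym" field of the main term; that is the point of term autonomy.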
67. Terminological data modeling
Terminological data categories
ISO 12620:1999 / 12620:2009
definition, subject field, grammar, context,
project code, author, date etc.
Data Category Registry (DCR)
Terminological data modeling principles
ISO 12200 / 16642 / 30042
meta model, concept orientation, term autonomy,
TBX (Termbase eXchange)
68. Terminology control
In many application scenarios of terminology work, the checking of
correct and consistent use of terminology in documents (created by
technical writers or translators) is recommended.
We can differentiate between the following control methods:
Monolingual terminology control
Bilingual terminology control (for translations)
Manual (human) terminology control (part of proof reading & QA)
Computer-assisted term control (tools analyze and check documents)
• Without linguistic methods (for “all” languages, using the content
of a term base)
• With linguistic methods (better results, but only for “important”
languages)
69. Terminology control tools
Features of terminology checking tools:
Similar to spell checkers and auto correction
Integrated into editors, authoring systems, CAT tools, but also as
stand-alone programs
Directly during the writing process of a document or translation, or
as an autonomous process (when the document is finished)
Connection to the term base entries (interactive or via
export/import)
Very often combined with grammar and style checking (controlled
language)
Using fuzzy search and/or linguistics (inflected terms in texts vs.
canonical form of the terms in term bases)
Deprecated terms must be maintained in the term base (no-terms)
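A terminology check without linguistic methods can be sketched as a lookup of deprecated terms in the text. The example below assumes a hypothetical termbase extract that maps each deprecated term to its preferred term; the tolerance for a plural "s" is a crude stand-in for the fuzzy or linguistic matching that real tools provide.

```python
import re

# Hypothetical termbase extract: deprecated term -> preferred term.
DEPRECATED = {
    "return key": "enter key",
    "monkey wrench": "adjustable wrench",
}

def check_terms(text, deprecated=DEPRECATED):
    """Report deprecated terms found in text, with their preferred terms.

    Matching is case-insensitive and tolerates a trailing plural "s".
    """
    findings = []
    for bad, preferred in deprecated.items():
        pattern = re.compile(r"\b" + re.escape(bad) + r"s?\b", re.IGNORECASE)
        for match in pattern.finditer(text):
            findings.append((match.start(), match.group(0), preferred))
    return sorted(findings)  # ordered by position in the text

TEXT = "Press one of the Return keys. Tighten it with a monkey wrench."
for position, found, preferred in check_terms(TEXT):
    print(f"{position}: '{found}' -> use '{preferred}'")
```

The check can only flag what the term base contains, which is why deprecated terms have to be recorded there in the first place.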
71. Terminology control tools
Automatic detection of linguistic variants with acrolinx IQ Suite
72. Conclusion 1
You will find terminology problems in many companies
In any case, terminology management will improve
important factors such as:
better communication, consistent corporate language, avoidance of
errors and misunderstandings, faster training, better translations
Not all benefits are easy to calculate and quantify
Benefits and ROI depend on several company specific
framework conditions
Terminology management is no luxury, it is a necessity
for all industrial companies and service providers,
and this can be documented by key indicators
73. Conclusion 2
Only excellently managed terminology can guarantee
high-quality information and communication:
follow guidelines for term creation and
term selection
model your termbase with concept orientation
and term autonomy, and include appropriate
data categories for documenting terms
Wrong decisions and mistakes can later be
repaired only with huge effort and cost!
74. Thank you for your attention
Prof. Dr. Klaus-Dirk Schmitz
Fachhochschule Köln
Fakultät 03 - ITMK/IIM
Mainzer Str. 5
50678 Köln
klaus.schmitz@fh-koeln.de