Multi-Language Database Design Approaches
Contents
• Column Approach
• Multi-row Approach
• Single Translation Table Approach
• Additional Translation Table Approach
• Conclusion
Building a multilanguage website is not a trivial task, and you will
encounter many problems along the way. One of them is how to store the
content of the site in the database for each language.
A little research on the Web turns up plenty of resources on the topic,
but there is no magic solution: each approach depends on your
requirements, the size of the database, the complexity of your site, etc.
We will therefore discuss only the major techniques. If you want to learn
more, a web search will turn up additional information.
1. Column Approach
This is the simplest solution: for every text field that needs to be
translated, it adds one column per language. A table may contain several
such fields (title, name, description, etc.), and each of them is
multiplied by the number of languages.
Advantages:
• Simplicity - easy to implement
• Easy querying - no JOINs required
• No duplicates - there is only one row for each record; only the
translated columns are repeated, once per language
Disadvantages:
• Hard to maintain - works well for 2-3 languages, but becomes really
hard when you have a lot of columns or a lot of languages
• Hard to add a new language - adding a language requires schema changes
(and special access rights for the db user) in each table with
multi-language content
• Stores empty space - if not all translations are required (e.g. in some
places the default language should always be used), you end up with
redundant data or empty db fields
• Column selection logic - your code has to work out which column it
should read or write depending on the current language
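The column approach can be sketched as follows. This is only an illustration, assuming SQLite through Python's built-in sqlite3 module; the product table, the title_en/title_de/title_fr columns and the get_title helper are hypothetical names, not part of the original slides. It shows both the JOIN-free query and the two costs listed above: the code must pick the column per language, and a new language needs a schema change.

```python
import sqlite3

# Hypothetical schema: one title column per supported language.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE product (
        id       INTEGER PRIMARY KEY,
        price    REAL NOT NULL,
        title_en TEXT NOT NULL,  -- default language
        title_de TEXT,           -- NULL when no translation exists
        title_fr TEXT
    )
    """
)
conn.execute(
    "INSERT INTO product (id, price, title_en, title_de) "
    "VALUES (1, 9.99, 'Chair', 'Stuhl')"
)

# Languages that have a column. The column name is built from this
# whitelist only, never from raw user input.
SUPPORTED = {"en", "de", "fr"}


def get_title(conn, product_id, lang, default="en"):
    """Return the title in `lang`, falling back to the default language."""
    if lang not in SUPPORTED:
        lang = default
    # COALESCE picks the default-language column when the translation is NULL.
    sql = f"SELECT COALESCE(title_{lang}, title_{default}) FROM product WHERE id = ?"
    row = conn.execute(sql, (product_id,)).fetchone()
    return row[0] if row else None


title_de = get_title(conn, 1, "de")  # translated column is filled
title_fr = get_title(conn, 1, "fr")  # title_fr is NULL, so the English title is used

# Adding Spanish means altering the schema of every translated table.
conn.execute("ALTER TABLE product ADD COLUMN title_es TEXT")
SUPPORTED.add("es")
columns = [row[1] for row in conn.execute("PRAGMA table_info(product)")]
```

With three translated fields and ten languages, the same table would carry thirty such columns, which is where the maintenance cost comes from.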
2. Multi-row Approach
This solution is similar to the column approach, but instead of
duplicating the content in columns it duplicates it in rows: each record
is stored once per language.
Advantages:
• Simplicity - easy to implement
• Easy querying - no JOINs required
Disadvantages:
• Hard to maintain - every column that is not translated must be changed
in the rows for all languages, e.g. changing the price of a single
product means repeating the update for every language
• Hard to add a new language - requires repeating the insert for each
record (cloning the row of the default language)
• Duplicate content - you will have a lot of duplicate content in all the
columns that are not translated
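A minimal sketch of the multi-row approach, again assuming SQLite via Python's sqlite3 module. The product table with a composite (id, lang) key is a hypothetical layout chosen for illustration. It shows the simple read and the duplicated price that has to be updated in every language row.

```python
import sqlite3

# Hypothetical schema: the same product id appears once per language.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE product (
        id    INTEGER NOT NULL,
        lang  TEXT    NOT NULL,
        price REAL    NOT NULL,  -- not translated, so repeated in every row
        title TEXT    NOT NULL,
        PRIMARY KEY (id, lang)
    )
    """
)
conn.executemany(
    "INSERT INTO product (id, lang, price, title) VALUES (?, ?, ?, ?)",
    [
        (1, "en", 9.99, "Chair"),
        (1, "de", 9.99, "Stuhl"),
        (1, "fr", 9.99, "Chaise"),
    ],
)

# Reading is a plain single-table query, no JOIN needed.
title_de = conn.execute(
    "SELECT title FROM product WHERE id = ? AND lang = ?", (1, "de")
).fetchone()[0]

# Changing the price has to touch one row per language.
cursor = conn.execute("UPDATE product SET price = ? WHERE id = ?", (12.50, 1))
rows_updated = cursor.rowcount

# Adding a language means cloning the default-language row for every record.
conn.execute(
    "INSERT INTO product (id, lang, price, title) "
    "SELECT id, 'es', price, title FROM product WHERE lang = 'en'"
)
language_rows = conn.execute(
    "SELECT COUNT(*) FROM product WHERE id = 1"
).fetchone()[0]
```

If an update forgets the `WHERE id = ?` form and filters on a single language instead, the copies drift apart, which is the consistency risk behind the "hard to maintain" point.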
3. Single Translation Table Approach
From a database structure perspective this seems to be the cleanest
solution: all texts that need to be translated are stored in a single
translation table. It is better suited to dynamic websites that support a
large number of languages, or that intend to add new languages in the
future and want to do so with ease.
Advantages:
• Proper normalization - a clean, relational approach
• Ease of adding a new language - doesn't require schema changes
• All translations in one place - readable, maintainable database
Disadvantages:
• Complex querying - multiple joins are required to retrieve, for
example, the correct product description
• Hard to maintain - every operation (insert, delete, update) needs more
complicated queries
• All translations in one place - if that one table is missing or
damaged, the whole site is affected
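One possible layout for the single translation table approach is sketched below, assuming SQLite via Python's sqlite3 module. The translation, translation_entry and product tables and the get_product helper are hypothetical; the slides do not prescribe a schema. Each translatable field holds an id into the shared translation table, so a query needs one join per translated field.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- One id per translatable piece of text, shared by all tables.
    CREATE TABLE translation (
        id INTEGER PRIMARY KEY
    );
    -- The actual text, one row per (text, language).
    CREATE TABLE translation_entry (
        translation_id INTEGER NOT NULL REFERENCES translation(id),
        lang           TEXT    NOT NULL,
        content        TEXT    NOT NULL,
        PRIMARY KEY (translation_id, lang)
    );
    CREATE TABLE product (
        id             INTEGER PRIMARY KEY,
        price          REAL    NOT NULL,
        title_id       INTEGER NOT NULL REFERENCES translation(id),
        description_id INTEGER NOT NULL REFERENCES translation(id)
    );
    INSERT INTO translation (id) VALUES (1), (2);
    INSERT INTO translation_entry (translation_id, lang, content) VALUES
        (1, 'en', 'Chair'),          (1, 'de', 'Stuhl'),
        (2, 'en', 'A wooden chair'), (2, 'de', 'Ein Holzstuhl');
    INSERT INTO product (id, price, title_id, description_id)
        VALUES (1, 9.99, 1, 2);
    """
)


def get_product(conn, product_id, lang):
    """Fetch price, title and description; one JOIN per translated field."""
    return conn.execute(
        """
        SELECT p.price, t.content, d.content
        FROM product AS p
        JOIN translation_entry AS t
          ON t.translation_id = p.title_id AND t.lang = :lang
        JOIN translation_entry AS d
          ON d.translation_id = p.description_id AND d.lang = :lang
        WHERE p.id = :id
        """,
        {"id": product_id, "lang": lang},
    ).fetchone()


product_de = get_product(conn, 1, "de")

# A new language only needs new rows, not a schema change.
conn.executemany(
    "INSERT INTO translation_entry (translation_id, lang, content) VALUES (?, ?, ?)",
    [(1, "fr", "Chaise"), (2, "fr", "Une chaise en bois")],
)
product_fr = get_product(conn, 1, "fr")
```

Inserting a product is also more involved here: the translation ids have to be created first and then referenced from the product row, which is the extra work the "hard to maintain" point refers to.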
4. Additional Translation Table Approach
This is a variation of the single translation table approach that seems
easier to maintain and work with. For each table that stores information
that may need to be translated, an additional table is created. The
original table stores only language-independent data, and the new table
stores all the translated text.
Advantages:
• Proper normalization - a clean, relational approach
• Ease of adding a new language - doesn't require schema changes
• Columns keep their names - no "_lang" suffixes or similar are needed
• Easy to query - relatively simple querying (only one JOIN is required)
Disadvantages:
• May double the number of tables - you have to create a translation
table for every table that has columns that need to be translated
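The additional translation table approach might look like this. It is a sketch assuming SQLite via Python's sqlite3 module; the product and product_translation tables and the get_product helper are hypothetical names. The language-independent price stays in product, while title and description move to a companion table keyed by (product_id, lang).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Language-independent data only.
    CREATE TABLE product (
        id    INTEGER PRIMARY KEY,
        price REAL NOT NULL
    );
    -- Companion table: one row per product and language.
    CREATE TABLE product_translation (
        product_id  INTEGER NOT NULL REFERENCES product(id),
        lang        TEXT    NOT NULL,
        title       TEXT    NOT NULL,  -- plain column names, no _en/_de suffix
        description TEXT,
        PRIMARY KEY (product_id, lang)
    );
    INSERT INTO product (id, price) VALUES (1, 9.99);
    INSERT INTO product_translation (product_id, lang, title, description) VALUES
        (1, 'en', 'Chair', 'A wooden chair'),
        (1, 'de', 'Stuhl', 'Ein Holzstuhl');
    """
)


def get_product(conn, product_id, lang):
    """Fetch price, title and description with a single JOIN."""
    return conn.execute(
        """
        SELECT p.price, t.title, t.description
        FROM product AS p
        JOIN product_translation AS t
          ON t.product_id = p.id AND t.lang = ?
        WHERE p.id = ?
        """,
        (lang, product_id),
    ).fetchone()


product_de = get_product(conn, 1, "de")

# The price lives in exactly one row, so it is updated once for all languages.
rows_updated = conn.execute(
    "UPDATE product SET price = ? WHERE id = ?", (12.50, 1)
).rowcount
product_en = get_product(conn, 1, "en")
```

Every other translatable table (categories, pages, and so on) would get its own companion table in the same style, which is how the table count can double.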
Conclusion
The four approaches presented above give an idea of how differently the
problem can be tackled. They are of course not the only possible options,
just the most popular ones. You can always adapt them, e.g. by
introducing views that save you from writing complex joins directly in
your code.
Remember that the solution you choose depends mostly on your project
requirements. If you need simplicity and are sure that the number of
supported languages is small and fixed, you could go with option 1. If
you require a bit more flexibility and can afford joins when querying
multilingual data, option 3 or 4 would be a possible solution.
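The idea of hiding joins behind a view can be sketched as follows, assuming SQLite via Python's sqlite3 module and the hypothetical product / product_translation layout of approach 4. The view name product_localized is likewise an assumption made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE product (
        id    INTEGER PRIMARY KEY,
        price REAL NOT NULL
    );
    CREATE TABLE product_translation (
        product_id INTEGER NOT NULL REFERENCES product(id),
        lang       TEXT    NOT NULL,
        title      TEXT    NOT NULL,
        PRIMARY KEY (product_id, lang)
    );
    -- The JOIN is written once, inside the view.
    CREATE VIEW product_localized AS
        SELECT p.id, p.price, t.lang, t.title
        FROM product AS p
        JOIN product_translation AS t ON t.product_id = p.id;

    INSERT INTO product (id, price) VALUES (1, 9.99);
    INSERT INTO product_translation (product_id, lang, title) VALUES
        (1, 'en', 'Chair'),
        (1, 'de', 'Stuhl');
    """
)

# Application code queries the view like an ordinary table.
row_de = conn.execute(
    "SELECT price, title FROM product_localized WHERE id = ? AND lang = ?",
    (1, "de"),
).fetchone()
```

The application then reads as if it were using the multi-row layout, while the stored data stays normalized.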