Chinese Minority Language Support in OpenOffice.org
Published in: Business, Technology

Transcript

  • 1. Chinese Minority Language Support in OpenOffice.org Institute of Software, Chinese Academy of Sciences Lead / Tibetan Native-lang Project Yanmin Jia 贾彦民 [email_address]
  • 2. Agenda
    • Status quo
    • Fund support from government
    • Language computing features
    • Character set and encoding
    • Support in OpenOffice.org
    • Demo
    • Problem & future work
    • Conclusion
  • 3. Status quo
    • China is a multi-lingual and multi-cultural country
      • 29 minorities have their own languages/scripts
        • Tibetan, Mongolian, Uighur, Kazak, Khalkhas, Yi, and Tai Le
      • Population of native speakers
        • Tibetan: more than 2 million in China
          • Also spoken in Ladakh, Nepal, Bhutan, and the northern areas of India bordering Tibet
        • Mongolian: more than 3 million
          • Inner Mongolia and Outer Mongolia
        • Uighur: more than 8 million
          • Xinjiang (Sinkiang)
    • Language computing
        • Microsoft platform
          • Uniscribe in Vista
        • Institute of Software Chinese Academy of Sciences
          • Red Flag & OpenOffice.org
  • 4. Fund support from government
    • 863 Hi-Tech Research and Development Program of China
      • “Linux Operating System and Office Suite for Minority Scripts” (2003AA1Z2110)
    • Knowledge Innovation Project sponsored by Chinese Academy of Sciences
      • “Platform-Independent Tibetan Information Processing System Based on Linux” (KGCX2-SW-504)
    • Electronic Product Development Fund sponsored by the Ministry of Information Industry
      • Cross-platform Tibetan Office suite
  • 5. Language computing features
    • Writing Direction & formatting style
      • Uighur: bidirectional text
        • Right to left horizontally
      • Mongolian
        • From left to right vertically
      • Tibetan
        • From left to right horizontally
        • Special line-breaking behavior
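The horizontal writing directions listed above can be read off the Unicode bidirectional class of each character. The following is a minimal sketch using Python's standard `unicodedata` module; the function name `base_direction` and the sample strings are illustrative assumptions, not part of the presentation. The bidirectional class only distinguishes left-to-right from right-to-left, so the vertical orientation of Mongolian is not visible here and has to be handled by the layout itself.

```python
import unicodedata

def base_direction(text: str) -> str:
    """Return 'rtl' or 'ltr' based on the first strongly directional character."""
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi in ("R", "AL"):      # right-to-left, or Arabic letter
            return "rtl"
        if bidi == "L":              # left-to-right
            return "ltr"
    return "ltr"  # assumption: default to left-to-right when no strong character is found

# Illustrative samples (letters from each script, written as escapes).
uighur = "\u0626\u06c7\u064a\u063a\u06c7\u0631"     # Arabic-script letters
tibetan = "\u0f56\u0f7c\u0f51"                       # Tibetan letters
mongolian = "\u182e\u1823\u1829\u182d\u1823\u182f"   # Mongolian letters

for name, sample in [("Uighur", uighur), ("Tibetan", tibetan), ("Mongolian", mongolian)]:
    print(name, base_direction(sample))
```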
  • 6. Language computing features (Cont.)
    • Complex script
      • Character shaping
        • The same character takes different shapes depending on the context
      • Ligature
        • Certain character sequences are rendered as a single shape
      • Character positioning
        • Grapheme & pre-composed character
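To make the grapheme idea concrete: a Tibetan syllable is stored as a sequence of code points, but a base letter together with its subjoined letters and vowel signs is drawn as one vertical stack. A rough sketch, assuming that every character in a Unicode mark category (`Mn`, `Mc`, `Me`) attaches to the preceding base character; this is a simplification of real grapheme cluster rules, and `split_clusters` is a hypothetical helper name.

```python
import unicodedata

def split_clusters(text: str) -> list:
    """Group each base character with the combining marks that follow it."""
    clusters = []
    for ch in text:
        # Tibetan subjoined letters and vowel signs have a mark category.
        if clusters and unicodedata.category(ch).startswith("M"):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters

# Seven code points: BA, SA, subjoined GA, subjoined RA, vowel sign U, BA, SA.
word = "\u0f56\u0f66\u0f92\u0fb2\u0f74\u0f56\u0f66"
clusters = split_clusters(word)
print(len(word), "code points ->", len(clusters), "clusters")
```

A pre-composed character set assigns one code point to a whole stack such as the second cluster above, whereas the Unicode encoding composes it from its parts and leaves the stacking to the font.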
  • 7. Character set and encoding
    • In the 1980s
      • Software providers each used their own proprietary character sets
    • Since 1997
      • Unicode standard
        • Tibetan 1997 (U+0F00~U+0FFF)
        • Mongolian 2000 (U+1800~U+18AF)
        • Uighur uses the Arabic script as its writing system
    • Tibetan pre-composed character set standard
      • Tibetan coded character set Extension A (GB/T20542-2006)
        • Widely used Tibetan BrdaRten (pre-composed characters)
      • Tibetan coded character set Extension B
        • Devanagari transliteration of Tibetan
        • Non-BMP
    • Tibetan and Himalayan Digital Library
      • Jomolhari (Tibetan font)
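Because each script occupies a contiguous Unicode block, a character's script can be identified with a simple range check. The sketch below uses the Tibetan block (U+0F00 to U+0FFF), the Mongolian block (U+1800 to U+18AF), and the basic Arabic block (U+0600 to U+06FF) used for Uighur; the table and function names are illustrative assumptions, and Arabic supplement and presentation-form blocks are deliberately left out.

```python
# Block ranges (inclusive). Only the basic Arabic block is listed here.
SCRIPT_BLOCKS = {
    "Tibetan": (0x0F00, 0x0FFF),
    "Mongolian": (0x1800, 0x18AF),
    "Arabic": (0x0600, 0x06FF),
}

def script_of(ch: str):
    """Return the name of the block containing ch, or None if it is in none of them."""
    cp = ord(ch)
    for name, (first, last) in SCRIPT_BLOCKS.items():
        if first <= cp <= last:
            return name
    return None

for sample in ["\u0f40", "\u1820", "\u0626", "A"]:
    print(f"U+{ord(sample):04X}", script_of(sample))
```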
  • 8. Support in OpenOffice.org
    • Rendering
      • Smart font — OpenType
        • GSUB
        • GPOS
      • Complex Layout Engine
        • Input
          • an array of Unicode characters in logical order
        • Output
          • an array of glyph indices
          • an array of character indices for the glyphs
          • an array of glyph positions
      • Vcl
        • ICU LayoutEngine (Linux)
        • Uniscribe (Windows)
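The input and output of a complex layout engine listed above can be illustrated with a toy model. This is only a sketch of the data flow, not the ICU LayoutEngine or Uniscribe API: the character map, ligature table, and advance widths below are invented stand-ins for a font's cmap, GSUB, and metric data, and Latin letters are used so the result is easy to read.

```python
from dataclasses import dataclass

# Hypothetical font data, invented for this example.
CMAP = {"f": 1, "i": 2, "n": 3}                  # character -> glyph index
LIGATURES = {(1, 2): 10}                         # GSUB-like rule: f + i -> one glyph
ADVANCES = {1: 300, 2: 250, 3: 500, 10: 520}     # horizontal advance per glyph

@dataclass
class LayoutResult:
    glyphs: list         # glyph indices
    char_indices: list   # for each glyph, index of the first character it covers
    positions: list      # (x, y) pen position of each glyph

def layout(text: str) -> LayoutResult:
    """Map characters in logical order to positioned glyphs."""
    glyphs, char_indices = [], []
    i = 0
    while i < len(text):
        gid = CMAP[text[i]]
        nxt = CMAP.get(text[i + 1]) if i + 1 < len(text) else None
        if (gid, nxt) in LIGATURES:              # substitute two glyphs with one
            glyphs.append(LIGATURES[(gid, nxt)])
            char_indices.append(i)
            i += 2
        else:
            glyphs.append(gid)
            char_indices.append(i)
            i += 1
    positions, x = [], 0
    for gid in glyphs:                           # accumulate advances along the baseline
        positions.append((x, 0))
        x += ADVANCES[gid]
    return LayoutResult(glyphs, char_indices, positions)

result = layout("fin")
print(result)
```

The character index array is what lets the editor map a glyph back to the text, for example when placing the cursor after a ligature that covers two characters.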
  • 9. Support in OpenOffice.org(Cont.)
    • Rendering
      • Mongolian support in ICU LayoutEngine
        • MongolianOpenTypeLayoutEngine
        • Mongolian text layout process goes through the following steps
          • Subdivide string of characters into runs
          • Further divide each run into clusters
          • Each cluster is labeled by feature tag
          • Apply feature tag information to each cluster
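The four steps above can be sketched roughly as follows. This is not the code of the layout engine described in the slides; it assumes the simplest possible model, in which a run is a maximal sequence of Mongolian letters (taken here as U+1820 to U+1877), every letter is its own cluster, and the feature tag is one of the registered OpenType positional tags `isol`, `init`, `medi`, or `fina`. Variation selectors and other control characters are ignored.

```python
def is_mongolian_letter(ch: str) -> bool:
    # Assumption: treat U+1820..U+1877 as the letter range for this sketch.
    return 0x1820 <= ord(ch) <= 0x1877

def split_runs(text: str) -> list:
    """Step 1: return (start, end) index pairs of maximal Mongolian letter runs."""
    runs, start = [], None
    for i, ch in enumerate(text):
        if is_mongolian_letter(ch):
            if start is None:
                start = i
        elif start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(text)))
    return runs

def tags_for_run(length: int) -> list:
    """Steps 2 and 3: one cluster per letter, labeled by its position in the run."""
    if length == 1:
        return ["isol"]
    return ["init"] + ["medi"] * (length - 2) + ["fina"]

def label_text(text: str) -> list:
    """Step 4 input: (character, feature tag) pairs for the shaping pass."""
    labeled = []
    for start, end in split_runs(text):
        labeled.extend(zip(text[start:end], tags_for_run(end - start)))
    return labeled

sample = "\u182e\u1823\u1829 \u1820"   # a three-letter run, a space, a single letter
for ch, tag in label_text(sample):
    print(f"U+{ord(ch):04X}", tag)
```

A shaping engine would then look up each tag in the font's GSUB table to select the contextual glyph form for that cluster.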
  • 10. Support in OpenOffice.org(Cont.)
    • Rendering
      • Bypass Uniscribe on Windows
        • Most minority-language speakers are Windows users
        • UniscribeLayout is replaced by IcuLayoutEngine
        • Mongolian, Uighur, and Tibetan can then be rendered correctly
  • 11. Support in OpenOffice.org(Cont.)
    • Vertical Text Formatting for Mongolian
      • The vertical text frame is mapped to a horizontal text frame by rotation
      • Normal horizontal text formatting is performed
      • The text frame is then mapped back to its original vertical orientation
      • The mapping between frames of different directions is reversible
  • 12. Support in OpenOffice.org(Cont.)
    • Vertical text formatting for Mongolian
      • Three functions in sw determine the location and exchange the width and height of the embedded frames
        • SwitchVerticalToHorizontal
        • SwitchHorizontalToVertical
        • SwapWidthAndHeight
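The map-format-map-back idea can be illustrated with a small geometric sketch. The Python functions below borrow the three names from the slide but are not the implementation in the sw module; they assume the simplest case, in which Mongolian lines run top to bottom and columns advance left to right, so that the mapping reduces to exchanging the two axes of a rectangle. The `Rect` type is a hypothetical helper.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rect:
    x: int
    y: int
    width: int
    height: int

def swap_width_and_height(r: Rect) -> Rect:
    """Exchange the extents of a frame, keeping its origin."""
    return Rect(r.x, r.y, r.height, r.width)

def switch_vertical_to_horizontal(r: Rect) -> Rect:
    """Map a rectangle from the vertical frame into the horizontal frame."""
    # Assumption: the line direction (down) becomes x, the column direction (right) becomes y.
    return Rect(r.y, r.x, r.height, r.width)

def switch_horizontal_to_vertical(r: Rect) -> Rect:
    """Inverse mapping; under the assumption above it is the same axis exchange."""
    return Rect(r.y, r.x, r.height, r.width)

# An embedded frame placed in a vertical text frame.
embedded = Rect(x=40, y=100, width=30, height=200)
as_horizontal = switch_vertical_to_horizontal(embedded)   # format here as ordinary text
restored = switch_horizontal_to_vertical(as_horizontal)
print(as_horizontal, restored == embedded)
```

Because the mapping is its own inverse in this simplified model, the existing horizontal formatting code can be reused unchanged between the two calls.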
  • 13. Support in OpenOffice.org(Cont.)
    • Locale
      • ICU
      • I18n/l10n
  • 14. Support in OpenOffice.org(Cont.)
    • GUI translation
      • Step 1: Add the new language to the resource system
      • Step 2: Add the new language to the build environment
      • Step 3: Add the new language to the localization tools
      • Step 4: Extract strings and messages from the source code
      • Step 5: Translate the extracted strings and messages into the new language
      • Step 6: Merge the translated strings and messages into the source code
      • Step 7: Add the new language to the installation set project
      • Step 8: Add the new language to the modules “helpcontent” and “readlicense_oo”
      • (http://l10n.openoffice.org/adding_language.html)
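Steps 4 to 6 form an extract, translate, merge cycle. The sketch below shows only the shape of that cycle with plain Python dictionaries; it does not use the OpenOffice.org localization tools or their file formats, and the resource identifiers, the language code `bo`, and the glossary values (placeholders, not real Tibetan translations) are all assumptions made for illustration.

```python
# Hypothetical resource table: resource id -> {language code: string}.
resources = {
    "MENU_FILE": {"en-US": "File"},
    "MENU_EDIT": {"en-US": "Edit"},
    "MENU_HELP": {"en-US": "Help"},
}

def extract(res: dict, source_lang: str = "en-US") -> list:
    """Step 4: collect the unique source strings to hand to translators."""
    return sorted({entry[source_lang] for entry in res.values()})

def translate(strings: list, glossary: dict) -> dict:
    """Step 5: look up each string; untranslated strings are simply left out."""
    return {s: glossary[s] for s in strings if s in glossary}

def merge(res: dict, translations: dict, lang: str, source_lang: str = "en-US") -> list:
    """Step 6: write translations back and return the ids still missing one."""
    missing = []
    for res_id, entry in res.items():
        source = entry[source_lang]
        if source in translations:
            entry[lang] = translations[source]
        else:
            missing.append(res_id)
    return missing

# Placeholder glossary values stand in for real translations.
glossary = {"File": "<bo:File>", "Edit": "<bo:Edit>"}
missing = merge(resources, translate(extract(resources), glossary), "bo")
print(missing)
```

Reporting the untranslated identifiers makes the lack of a uniform glossary visible early, since the same source string should always map to the same target term.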
  • 15. Demo
  • 16. Demo (Cont.)
  • 17. Demo (Cont.)
  • 18. Problem & future work
    • Problems
      • Developing software for minority languages is not commercially profitable
      • Translation is not easy for Chinese minority languages
        • No uniform glossary
        • Not enough people who have mastered both programming and a minority language
      • More funding support is needed
    • Future work
      • More features
        • Transliteration & sorting
      • Training & application
        • Money
      • Work together with the OpenOffice.org community
      • Strong collaboration with software providers
  • 19. Conclusion
    • Minority languages are an amazing world
      • The languages will be lost if they aren't saved
    • OpenOffice.org is very valuable for minority languages
      • OpenOffice.org should pay more attention to minority languages
    • Software corporations are welcome to keep an eye on Chinese minority languages
      • SUN, Chinese 2000, Novell and so on
    • More developer documentation for OpenOffice.org is needed
    • Establish a Chinese minority language federation based on OpenOffice.org
  • 20.
    • Thanks!