Internationalisation And Globalisation



This is a very old presentation, but if you look past the use of VB6 there is still plenty of value in it. I presented it at the VBUG Annual Conference in 2003.

Published in: Economy & Finance, Technology

  • Internationalisation And Globalisation

    1. 1. Internationalisation and Globalisation Visual Basic 6
    2. 2. Alan Dean <ul><li>alan.dean@retailexperience.co.uk </li></ul><ul><li>or </li></ul><ul><li>adean </li></ul><ul><li>©2003 </li></ul>
    3. 3. Credit to Kaplan <ul><li>“ Internationalisation with Visual Basic” </li></ul><ul><ul><li>Michael S. Kaplan </li></ul></ul><ul><ul><li>ISBN 0672319772 </li></ul></ul>
    4. 4. Credit to Appleman <ul><li>“ Visual Basic Programmer’s Guide to the Win32 API” </li></ul><ul><ul><li>Dan Appleman </li></ul></ul><ul><ul><li>ISBN 0672315904 </li></ul></ul>
    5. 5. Outline <ul><li>“ In a connected world, it is increasingly important to be able to implement solutions for users across the world. Unfortunately, the ability to do this with VB6 is not well documented, requires a lot of effort to understand and is not available 'out of the box'.” </li></ul><ul><li> </li></ul>
    6. 6. Contents <ul><li>The following subjects are covered: </li></ul><ul><ul><li>Characters </li></ul></ul><ul><ul><li>Keyboards </li></ul></ul><ul><ul><li>Fonts (very briefly…) </li></ul></ul><ul><ul><li>Languages </li></ul></ul><ul><ul><li>Strings </li></ul></ul><ul><ul><li>Techniques to code an internationalised application </li></ul></ul>
    7. 7. Terminology
    8. 8. Terminology – Contents <ul><li>Globalisation </li></ul><ul><li>Internationalisation (i18N) </li></ul><ul><li>Multinationalisation (M18N) </li></ul><ul><li>Translation </li></ul><ul><li>Localisation (L10N) </li></ul>
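The abbreviations on this slide are numeronyms: the first letter, the number of letters left out, then the last letter. A minimal sketch of that rule follows; the function name `numeronym` is only an illustration and is not part of the original deck.

```python
def numeronym(term: str) -> str:
    """Abbreviate a term as first letter + count of inner letters + last letter."""
    if len(term) < 3:
        return term  # too short to abbreviate
    return f"{term[0]}{len(term) - 2}{term[-1]}"


for term in ("internationalisation", "multinationalisation", "localisation"):
    print(term, "->", numeronym(term))
```

The counts come out as 18, 18 and 10, which matches the i18N, M18N and L10N forms used in the slides.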
    9. 9. Internationalisation (i18N) <ul><li>The process of converting an application to be capable of multinationalisation and localisation </li></ul><ul><li>Culture-specific issues are addressed </li></ul><ul><ul><li>e.g. conventions, preferences, data formatting </li></ul></ul><ul><li>Depends upon default system or user preferences </li></ul><ul><li>Does not require the translation of the text of an application </li></ul>
    10. 10. Globalisation <ul><li>“The process of designing and developing an application that supports localized user interfaces and regional data for users in multiple cultures” (Source: .NET Framework Developer's Guide) </li></ul>
    11. 11. Multinationalisation (M18N) <ul><li>The process of converting an application to support multiple cultures </li></ul><ul><li>A significant enhancement of i18N </li></ul><ul><li>Multiple language availability, including crossing the code page barrier </li></ul><ul><ul><li>E.g. Office2000 multilanguage packs (langpacks) and Win2000 multilanguage user interface (MUI) </li></ul></ul>
    12. 12. Translation <ul><li>The process of representing the text of an application in another language </li></ul><ul><ul><li>e.g. dialogs, menus, alerts, documentation etc. </li></ul></ul><ul><li>For example, the ‘File|Open’ menu item is translated to ‘Fichier|Ouvrir’ in French </li></ul><ul><ul><li>Microsoft International Word List </li></ul></ul><ul><li>Converts the meaning and sense of the text, not just the words </li></ul>
    13. 13. Beware Babelfish! <ul><li>“Insert the boot disk into Drive A” </li></ul><ul><ul><li>Translate from English to German using Babelfish </li></ul></ul><ul><ul><li>“Setzen Sie die Aufladung Scheibe in Antrieb A ein” which means </li></ul></ul><ul><ul><li>“Insert the charge disk in Propulsion A” </li></ul></ul><ul><li>“Legen Sie die Boot Diskette in Laufwerk A ein” is the correct translation </li></ul>
    14. 14. Localisation (L10N) <ul><li>The process of converting an application to adhere to the local culture of a user </li></ul>
    15. 15. Terminology - Summary <ul><li>Explained some of the general terms used around internationalisation </li></ul><ul><li>Discussed the scope of the terms used </li></ul>
    16. 16. About Characters
    17. 17. About Characters - Contents <ul><li>Character Repertoires </li></ul><ul><li>Character Codes & Encoding </li></ul><ul><li>Character Sets </li></ul><ul><ul><li>ASCII, ANSI, DBCS, Unicode </li></ul></ul><ul><li>Windows Character Set Usage </li></ul>
    18. 18. Character (definition) <ul><li>character noun … 7. letter or symbol : any written or printed letter, number, or other symbol … Source: Encarta World English Dictionary </li></ul>
    19. 19. Character (alternate definition) <ul><li>A character is the atomic unit of textual communication </li></ul>
    20. 20. Character Repertoire <ul><li>An abstract set of distinct characters </li></ul><ul><ul><li>Usually defined by specifying a name and sample presentation of each character </li></ul></ul><ul><ul><li>The ordering of characters for sorting purposes is not defined </li></ul></ul><ul><ul><li>Either: </li></ul></ul><ul><ul><ul><li>Fixed (e.g. English), or </li></ul></ul></ul><ul><ul><ul><li>Open (e.g. Unicode, Chinese) </li></ul></ul></ul>
    21. 21. Character Repertoire (English) <ul><li>The character repertoire of English contains </li></ul><ul><ul><li>Alphabet </li></ul></ul><ul><ul><ul><li>Upper case A ‘A’ … Lower case Z ‘z’ </li></ul></ul></ul><ul><ul><li>Punctuation </li></ul></ul><ul><ul><ul><li>Period . Ellipsis … </li></ul></ul></ul><ul><ul><ul><li>Comma , Semicolon ; Colon : </li></ul></ul></ul><ul><ul><ul><li>Question Mark ? Exclamation Point ! </li></ul></ul></ul><ul><ul><ul><li>Quotation Marks “” Parentheses () </li></ul></ul></ul><ul><ul><ul><li>Apostrophe ’ Hyphen - </li></ul></ul></ul>
    22. 22. Character Repertoires
    23. 23. Character Code <ul><li>A mapping between an unsigned integer and a character </li></ul><ul><ul><li>e.g. 65=‘A’ </li></ul></ul><ul><li>The VB Functions Chr$(…) and Asc(…) address this mapping </li></ul><ul><ul><li>e.g. Chr$(65) returns “A” </li></ul></ul><ul><ul><li>e.g. Asc(“A”) returns 65 </li></ul></ul>
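The same integer-to-character mapping can be shown outside VB. This small sketch uses Python's built-in `ord()` and `chr()` as stand-ins for `Asc(…)` and `Chr$(…)`. Note one assumption: Python works with Unicode code points, whereas VB's `Asc` returns a code in the current ANSI code page, so the two only agree reliably in the ASCII range.

```python
# ord() maps a character to its integer code; chr() maps the code back.
code = ord("A")
char = chr(65)

print(code)  # 65
print(char)  # A
```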
    24. 24. Character Encoding <ul><li>The process of collating code points by assigning an unsigned integer to each character in a repertoire </li></ul><ul><li>The output of encoding is a character set </li></ul><ul><li>The values assigned imply ordering of the character set, but the ordering may not be meaningful </li></ul>
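The last point, that the ordering implied by the assigned values may not be meaningful, is easy to see with a plain sort. The sketch below sorts three sample words by their code values; the word list is an arbitrary illustration.

```python
# Sorting by raw code values puts every upper-case letter before every
# lower-case letter, and accented letters after both.
words = ["apple", "Zebra", "\u00e9clair"]  # "\u00e9" is e-acute
by_code = sorted(words)

print(by_code)  # ['Zebra', 'apple', 'éclair']
```

A dictionary would list these as apple, éclair, Zebra, so sorting for display needs locale-aware collation rather than a comparison of code values.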
    25. 25. Character Set <ul><li>An encoded character repertoire </li></ul><ul><li>There are a large number of character sets </li></ul><ul><li>Character sets are not language specific </li></ul><ul><ul><li>e.g. Latin Alphabet No.1 (ISO 8859-1) </li></ul></ul>
    26. 26. ASCII Character Set
    27. 27. ANSI Character Sets
    28. 28. Double-byte Character Sets (DBCS) <ul><li>aka MBCS (Multi-byte character set) </li></ul><ul><ul><li>Because the first 128 characters are single-byte encoded, as in ANSI </li></ul></ul><ul><ul><li>Additional characters are double-byte encoded </li></ul></ul><ul><li>Double-byte encoding </li></ul><ul><ul><li>The first (or ‘lead’) byte signals that both it and the next byte are to be interpreted as a single character </li></ul></ul>
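As a rough illustration of lead-byte handling, the sketch below walks a byte string encoded in Windows code page 932 (Japanese Shift-JIS) and splits it into characters. The helper names are hypothetical, and the lead-byte ranges used (0x81 to 0x9F and 0xE0 to 0xFC) apply to code page 932 only; other DBCS code pages use different ranges.

```python
def is_lead_byte(b: int) -> bool:
    # Lead-byte ranges for Windows code page 932 only.
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC


def split_dbcs(data: bytes) -> list:
    """Split cp932 bytes into one bytes object per character."""
    chars, i = [], 0
    while i < len(data):
        # A lead byte claims the following byte as part of the same character.
        step = 2 if is_lead_byte(data[i]) and i + 1 < len(data) else 1
        chars.append(data[i:i + step])
        i += step
    return chars


text = "A\u65e5\u672cB"  # Latin A, two kanji, Latin B
encoded = text.encode("cp932")
parts = split_dbcs(encoded)

print(len(text), len(encoded))  # 4 characters, 6 bytes
print([p.decode("cp932") for p in parts])
```

The point of scanning from the start is that a trail byte can have the same value as an ordinary single-byte character, so a byte cannot be interpreted on its own.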
    29. 29. Double-byte character
    30. 30. DBCS Example
    31. 31. Unicode Character Set <ul><li>All characters are double-byte encoded (as far as Windows is concerned anyway: UCS-2/UTF-16, where UTF-16 represents characters outside the Basic Multilingual Plane as a pair of 16-bit units) </li></ul><ul><li>Although DBCS and Unicode both use double-byte encoding, the mapping differs </li></ul><ul><li>All characters in the Unicode character set are given a unique value </li></ul>
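To make the difference in mapping concrete, this sketch encodes the same characters both ways, using code page 932 as an example DBCS. The choice of code page and sample characters is an assumption made for illustration.

```python
kanji = "\u65e5"  # a single CJK character

utf16 = kanji.encode("utf-16-le")
dbcs = kanji.encode("cp932")

# Both encodings need two bytes for this character, but the values differ.
print(utf16.hex(), dbcs.hex())

# In UTF-16 even 'A' takes two bytes; in a DBCS code page it takes one.
print(len("A".encode("utf-16-le")), len("A".encode("cp932")))
```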
    32. 32. Character Set Comparison
    33. 33. Character Repertoires Revisited
    34. 34. Windows Character Set Usage <ul><li>16-bit Windows and Win9x use ANSI character sets </li></ul><ul><ul><li>Known as Code Pages </li></ul></ul><ul><li>WinNT-based Windows uses Unicode </li></ul>
    35. 35. Windows Code Page <ul><li>A table of 256(+) code points for a language </li></ul><ul><ul><li>First 128 code points are the same (the ASCII table of non-printing and English characters) </li></ul></ul><ul><ul><li>Next 128(+) are used for non-English characters needed by the language </li></ul></ul><ul><li>Based on ANSI character sets </li></ul>
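The following sketch checks both halves of that description using Python's built-in codecs for three Windows code pages. The particular byte (0xE9) and code pages are arbitrary choices for illustration.

```python
# The first 128 code points decode identically in every Windows code page.
low = bytes(range(128))
same_low_half = low.decode("cp1252") == low.decode("cp1251") == low.decode("ascii")
print(same_low_half)

# The upper half depends on the code page: one byte, three different characters.
byte = bytes([0xE9])
for page in ("cp1252", "cp1251", "cp1253"):
    print(page, byte.decode(page))
```

This is why a byte string is only meaningful together with the code page it was written in.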
    36. 36. <ul><li>Windows Code Page 1252, etc. </li></ul><ul><li>http://www.microsoft.com/globaldev/reference/sbcs/1252.htm </li></ul>
    37. 37. About Characters - Summary <ul><li>Explained how characters are gathered into repertoires, and are then encoded into character sets </li></ul><ul><li>Described the main character sets supported by Windows </li></ul>
    38. 38. About Keyboards
    39. 39. About Keyboards - Contents <ul><li>Scan Codes </li></ul><ul><li>Keyboard Layouts </li></ul><ul><li>Virtual Keys </li></ul>
    40. 40. Scan Code <ul><li>A hardware-dependent code sent by a keyboard to indicate a keyboard operation </li></ul><ul><li>Scan codes can vary between different keyboards </li></ul>
    41. 41. Keyboard Layout <ul><li>A definition of the scan codes supported by a keyboard </li></ul><ul><ul><li>Win3.x has a single system-wide layout </li></ul></ul><ul><ul><li>Win9x and WinNT support multiple layouts on a system-wide and per-thread basis </li></ul></ul>
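    42. 42. Virtual Key <ul><li>An abstraction of scan codes, so that interpretation of input need not be hardware-specific </li></ul><ul><li>API Constants exist with VK_ prefix </li></ul><ul><ul><li>e.g. VK_RETURN, VK_SHIFT (letter keys have no named constant in the Win32 headers; they use the ASCII values ‘A’ to ‘Z’) </li></ul></ul>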
    43. 43. From Key to Character
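The path from key to character can be sketched as two lookups: scan code to virtual key (through the active keyboard layout), then virtual key plus shift state to character. The tables and the `key_to_char` function below are hypothetical and cover only two keys; they are not the Windows implementation.

```python
# Illustrative tables only: scan code -> virtual key, per keyboard layout.
SCAN_TO_VK = {
    "US": {0x10: 0x51, 0x1E: 0x41},  # these two physical keys give Q and A
    "FR": {0x10: 0x41, 0x1E: 0x51},  # AZERTY swaps them
}


def key_to_char(scan_code: int, layout: str, shift: bool = False) -> str:
    vk = SCAN_TO_VK[layout][scan_code]
    # Letter virtual keys share their values with ASCII 'A'..'Z'.
    ch = chr(vk)
    return ch if shift else ch.lower()


print(key_to_char(0x10, "US"))
print(key_to_char(0x10, "FR"))
print(key_to_char(0x10, "FR", shift=True))
```

The same physical key therefore produces a different character once the layout changes, which is the reason applications should work with characters or virtual keys and not raw scan codes.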
    44. 44. Keyboard limitations <ul><li>Keyboards are an effective data entry method for most languages </li></ul><ul><li>However there are no keyboards for character-based languages because there are no keyboards with thousands of keys… </li></ul><ul><ul><li>i.e. Far East languages (also known as Chinese/Japanese/Korean, or CJK languages) </li></ul></ul>
    45. 45. Input Method Editor (IME) <ul><li>Software to allow the input of CJK characters </li></ul><ul><ul><li>A group that approximates a character is selected </li></ul></ul><ul><ul><li>An actual character can then be selected from the group </li></ul></ul><ul><li>Run by the Input Method Manager (IMM) </li></ul>
    46. 46. Japanese IME
    47. 47. About Keyboards - Summary <ul><li>Explained how keystrokes become characters </li></ul><ul><li>Briefly discussed non-keyboard input </li></ul>
    48. 48. About Fonts
    49. 49. About Fonts - Contents <ul><li>Character-based systems </li></ul><ul><li>Graphic-based systems </li></ul><ul><li>Glyphs & Fonts </li></ul>
    50. 50. Character-based Systems <ul><li>Such systems display characters only </li></ul>
    51. 51. Graphic-based Systems <ul><li>Such systems display glyphs, not characters </li></ul>
    52. 52. Glyph <ul><li>A glyph is a graphical representation of a character </li></ul>
    53. 53. Font <ul><li>A collection of glyphs </li></ul>
    54. 54. About Fonts - Summary <ul><li>Discussed the difference between character-based and graphic-based systems </li></ul><ul><li>Briefly discussed the representation of characters by glyphs and fonts </li></ul>
    55. 55. About Languages
    56. 56. About Languages - Contents <ul><li>Languages </li></ul><ul><li>Locales </li></ul>
    57. 57. Language (definition) <ul><li>language noun 1. speech of group : the speech of a country, region, or group of people, including its diction, syntax, and grammar … Source: Encarta World English Dictionary </li></ul>
    58. 58. Locale <ul><li>A specific international market where a target user is working </li></ul><ul><li>Encompasses localisation issues: </li></ul><ul><ul><li>e.g. conventions, culture, language, preferences </li></ul></ul><ul><ul><li>including formatting of numbers, currencies, etc. </li></ul></ul><ul><ul><li>phraseology can vary also </li></ul></ul>
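As a small example of locale-dependent formatting, the sketch below renders the same amount under two sets of conventions. The `CONVENTIONS` table and `format_amount` function are hypothetical; a real application would read these settings from the operating system and not hard-code them.

```python
# Hypothetical, hard-coded conventions for two locales.
CONVENTIONS = {
    "en-GB": {"group": ",", "decimal": ".", "currency": "£{}"},
    "de-DE": {"group": ".", "decimal": ",", "currency": "{} €"},
}


def format_amount(value: float, locale_name: str) -> str:
    conv = CONVENTIONS[locale_name]
    text = f"{value:,.2f}"  # e.g. '1,234.50' with fixed separators
    # Swap both separators in a single pass so they do not clobber each other.
    table = str.maketrans({",": conv["group"], ".": conv["decimal"]})
    return conv["currency"].format(text.translate(table))


print(format_amount(1234.5, "en-GB"))
print(format_amount(1234.5, "de-DE"))
```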
    59. 59. Locale Identifier (LCID) <ul><li>A 32-bit unsigned integer that identifies the locale for the system or thread </li></ul><ul><li>Commonly pronounced el-sid </li></ul>
    60. 60. LCID Structure
    61. 61. LCID Language <ul><li>Language Identifier </li></ul><ul><ul><li>A combination of the primary and secondary language identifiers </li></ul></ul><ul><li>Primary Language Identifier </li></ul><ul><ul><li>Represents the language itself </li></ul></ul><ul><ul><li>(e.g. ‘English’) </li></ul></ul><ul><li>Secondary Language Identifier </li></ul><ul><ul><li>Represents the country or region where the language is spoken </li></ul></ul><ul><ul><li>(e.g. ‘English as spoken in the United Kingdom’) </li></ul></ul>
    62. 62. LCID Sorting <ul><li>Sort Identifier </li></ul><ul><ul><li>Represents the order in which characters are to be sorted (usually the default) </li></ul></ul><ul><li>Sort Version </li></ul><ul><ul><li>Currently unused (it is reserved and must be set to 0) </li></ul></ul>
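The fields described on the last two slides can be pulled apart with shifts and masks. This sketch assumes the usual Win32 layout: the language identifier in the low 16 bits (primary language in the low 10 bits, secondary language in the next 6), the sort identifier in bits 16 to 19, and the sort version in bits 20 to 23. The function name `split_lcid` is an illustration only.

```python
def split_lcid(lcid: int) -> dict:
    """Break an LCID into its component fields."""
    lang_id = lcid & 0xFFFF
    return {
        "language_id": lang_id,
        "primary_language": lang_id & 0x3FF,
        "secondary_language": lang_id >> 10,
        "sort_id": (lcid >> 16) & 0xF,
        "sort_version": (lcid >> 20) & 0xF,
    }


# 0x0809 is English (United Kingdom): primary 0x09, secondary 0x02.
print(split_lcid(0x0809))
# 0x00010407 is German (Germany) with the phone-book sort order.
print(split_lcid(0x00010407))
```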
    63. 63. Locale Coverage <ul><li>Windows does not have locales for all possible language / region combinations </li></ul><ul><ul><li>In fact, almost without exception, a locale is only supported if there is a country or region that speaks the language </li></ul></ul><ul><ul><li>For example there is no locale for Esperanto, Coptic or Latin and certainly not for Klingon! </li></ul></ul>
    64. 64. Locale Usage <ul><li>Settings associated with Locales are heavily used by Windows, COM and VB </li></ul><ul><ul><li>So, the current Locale fundamentally affects the processing of information on a system </li></ul></ul><ul><li>Settings are accessed through the Regional Options control panel </li></ul>
    65. 65. About Languages - Summary <ul><li>Discussed the relationship between languages and locales </li></ul><ul><li>Explained the structure of the locale identifier </li></ul>
    66. 66. About Strings
    67. 67. About Strings - Contents <ul><li>C Strings </li></ul><ul><li>VB Strings </li></ul><ul><li>VB String calls to COM and Win32 API functions </li></ul>
    68. 68. String <ul><li>An array of characters </li></ul><ul><li>Not a primitive datatype </li></ul><ul><li>A number of string datatypes exist </li></ul><ul><ul><li>e.g. LPSTR, BSTR, etc. </li></ul></ul>
    69. 69. Pointer to String (LPSTR) <ul><li>C datatype </li></ul><ul><li>Null-terminated </li></ul><ul><li>Used extensively throughout the Windows API </li></ul>
    70. 70. Basic String (BSTR) <ul><li>COM datatype, used by VB internally </li></ul><ul><li>A pointer to Unicode character data that is preceded in memory by a length prefix giving the size of the string in bytes </li></ul><ul><ul><li>A contract for creation (allocation) </li></ul></ul><ul><ul><li>A contract for destruction (deallocation) </li></ul></ul><ul><ul><li>An API </li></ul></ul>
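The memory layout of a BSTR can be modelled with a byte buffer: a 4-byte length prefix holding the byte count, the UTF-16 character data, then a two-byte null terminator that the count does not include. The helpers below are a sketch of that layout only; real code allocates BSTRs through the COM API, and the BSTR pointer itself points at the first character, just after the prefix.

```python
import struct


def bstr_bytes(text: str) -> bytes:
    """Model the memory block behind a BSTR."""
    data = text.encode("utf-16-le")
    # Prefix counts bytes of character data, excluding the terminator.
    return struct.pack("<I", len(data)) + data + b"\x00\x00"


def read_bstr(blob: bytes) -> str:
    """Read the string back using the length prefix, not the terminator."""
    (length,) = struct.unpack_from("<I", blob, 0)
    return blob[4:4 + length].decode("utf-16-le")


block = bstr_bytes("Hi")
print(block.hex(" "))
print(read_bstr(block))
```

Because the length is stored explicitly, a BSTR can contain embedded null characters, which a null-terminated LPSTR cannot.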
    71. 71. VB COM Calls <ul><li>Both VB and COM use Unicode, so strings are not transposed into alternate character sets </li></ul>
    72. 72. VB Win32 API Calls <ul><li>Character encoding </li></ul><ul><ul><li>VB and WinNT use Unicode encoding, but </li></ul></ul><ul><ul><li>Win9x uses ANSI encoding </li></ul></ul><ul><li>Unfortunately VB does not know the encoding expected by the target API call </li></ul><ul><ul><li>Strings are therefore encoded as ANSI </li></ul></ul><ul><ul><li>Thus the call succeeds on both Win9x and WinNT, but this is wasteful on WinNT… </li></ul></ul>
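The conversion to ANSI is not only wasteful; it can also lose data. The sketch below imitates the Unicode-to-ANSI step with Python's `cp1252` codec, assuming a Western European system code page; the sample string is arbitrary.

```python
# A Unicode string containing a character that code page 1252 cannot represent.
text = "Price: \u03a9 100"  # U+03A9 is GREEK CAPITAL LETTER OMEGA

# Convert to ANSI, substituting '?' for unmappable characters,
# then convert back, as happens when an ANSI API runs on a Unicode system.
ansi = text.encode("cp1252", errors="replace")
round_trip = ansi.decode("cp1252")

print(round_trip)
print(round_trip == text)
```

On WinNT the ANSI entry point converts the string back to Unicode before doing any work, so the call pays for two conversions and may still receive altered text.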
    73. 73. VB Win9x API Call
    74. 74. VB WinNT API Call
    75. 75. VB WinNT API Call (Unicode)
    76. 76. About Strings - Summary <ul><li>Discussed C and VB strings </li></ul><ul><li>Explained how COM and Win32 API string function calls are transacted </li></ul>
    77. 77. An Internationalised App
    78. 78. 1.0.1 <ul><li>‘ Plain vanilla’ VB Standard EXE </li></ul>
    79. 79. 2.0.2 <ul><li>1st attempt to internationalise </li></ul><ul><ul><li>Addition of resource file </li></ul></ul>
    80. 80. 2.1.2 <ul><li>2nd attempt to internationalise </li></ul><ul><ul><li>Isolate persistent strings </li></ul></ul>
    81. 81. 2.2.2 <ul><li>3rd attempt to internationalise </li></ul><ul><ul><li>Parameterise resource strings </li></ul></ul>
    82. 82. 2.2.3 <ul><li>4th attempt to internationalise </li></ul><ul><ul><li>Loading with current LCID </li></ul></ul><ul><ul><li>By setting thread locale </li></ul></ul>
    83. 83. 3.0.4 <ul><li>5th attempt to internationalise </li></ul><ul><ul><li>Loading with current LCID (again…) </li></ul></ul><ul><ul><li>By loading resources directly </li></ul></ul>
    84. 84. 3.1.5 <ul><li>6th attempt to internationalise </li></ul><ul><ul><li>Loading with current LCID (yet again!) </li></ul></ul><ul><ul><li>By employing satellite resource </li></ul></ul>
    85. 85. 3.1.6 <ul><li>7th attempt to internationalise </li></ul><ul><ul><li>Loading all strings from satellite resources </li></ul></ul>
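Parameterising resource strings means storing a whole sentence with numbered placeholders, so that each translation can put the arguments in its own order. The sketch below shows the idea; the `RESOURCES` table, the sample sentences and the `message` function are illustrative and are not taken from the demo application.

```python
# One complete sentence per language, with positional placeholders.
RESOURCES = {
    "en": "File {0} was copied to {1}.",
    "de": "Nach {1} wurde die Datei {0} kopiert.",
}


def message(lang: str, *args: str) -> str:
    return RESOURCES[lang].format(*args)


print(message("en", "report.txt", "Backup"))
print(message("de", "report.txt", "Backup"))
```

Building the sentence by concatenating fragments in code would fix the English word order and make the second translation impossible without a code change.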
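One possible way to choose a satellite resource for the current LCID is to try the exact language identifier first, then the primary language alone, then a default. The sketch below models each satellite as a dictionary; the `SATELLITES` table, the string keys and the fallback order are assumptions for illustration and may differ from what the demo application does.

```python
# Hypothetical satellite resources, keyed by language identifier.
SATELLITES = {
    0x000C: {"file_open": "Ouvrir"},                     # French (neutral)
    0x0009: {"file_open": "Open", "file_exit": "Exit"},  # English (neutral)
}
DEFAULT_LANG = 0x0009


def load_string(lcid: int, key: str) -> str:
    lang_id = lcid & 0xFFFF
    # Most specific first: full language ID, primary language, default.
    for candidate in (lang_id, lang_id & 0x3FF, DEFAULT_LANG):
        table = SATELLITES.get(candidate)
        if table and key in table:
            return table[key]
    raise KeyError(key)


print(load_string(0x040C, "file_open"))  # French (France)
print(load_string(0x040C, "file_exit"))  # not translated, falls back
print(load_string(0x0809, "file_open"))  # English (United Kingdom)
```

Falling back string by string means a partially translated satellite still yields a usable interface.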
    86. 86. Conclusion <ul><li>Covered Characters, Keyboards, Fonts, and Languages </li></ul><ul><li>Explained Strings and the usage of Strings </li></ul><ul><li>Coded a simple internationalised application </li></ul>
    87. 87. Thank You <ul><li>alan.dean@retailexperience.co.uk </li></ul><ul><li>or </li></ul><ul><li>adean </li></ul><ul><li>©2003 </li></ul>