Achievements and Lessons Learned by ANLoc

EPFL (École polytechnique fédérale de Lausanne): Senior Scientist at EPFL and Executive Director at Kamusi Project
Presented by: Martin Benjamin (martin@kamusi.org)
Director of ANLoc Locales and Terminology Subprojects

Funded by IDRC, Acacia, project number 104475
Managed in IDRC regional office for Middle East and North Africa, Cairo
Achievements and Lessons Learned by the African Network for Localization
• Languages and information technology in Africa: the challenges for localization
• Addressing the challenges: ANLoc and its subprojects
• Lessons learned: two case studies
• The long view: the outlook for IT in African languages and African societies going forward
Languages and information technology in Africa: the challenges for localization
[Photos: National University of Rwanda, July 2008; Rwanda, July 2008; Dar es Salaam, Tanzania, July 2008]
African language facts:
• As many as 2,000 languages are spoken in Africa by 1,000,000,000 people
• Over 200 languages are spoken by more than 500,000 people each
• At least 15 languages are spoken by more than 10,000,000 people each: Amharic, Arabic, Berber, Chewa, Fula, Hausa, Igbo, Kinyarwanda, Malagasy, Oromo, Shona, Somali, Swahili, Yoruba, Zulu
• Primary education in Africa is often in local, regional, or national languages
• IT in Africa is mostly available in English, French, or Portuguese
Enter ANLoc
L10n = Localization
Adapting ICT so that it can be used by:
• Non-specialists
• Non-elites
• Non-speakers of global languages
• Students
• Anyone
In other words, most of Africa’s 1 billion people
[Photo: IT+46 training workshop, Kampala, November 2006]
Addressing the challenges: ANLoc and its subprojects
Enabling L10n
• Locales
• Fonts
• Keyboards
Activating L10n
• Terminologies
• Translation tools
• Spellcheckers
• Localizing software
• Training localizers
Sustaining L10n
• Language and ICT policy
• Network development
[Photo: Software Freedom Day, Accra, September 2008]
Enabling L10n
• Locales
• Fonts
• Keyboards
These are things that must exist for a language before any software localization can occur
[Photo: Translate Firefox event, Kampala, August 2008]
Locales
The basic information sets needed to configure computers for a language:
• Which character sets to use
• How dates and numbers appear
• The direction in which text is written
• Names of days, weeks, months
• Currency symbols, measurement systems
• Other background information that computers need for a language
In French: “paramètres régionaux”
Makes it easy to write and share documents in a language
Makes it possible to develop software, websites, mobile phones, ATMs, etc., for a language
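As a rough illustration of the kind of information a locale carries, the sketch below models a few of the items listed on this slide in Python. The class name `LocaleData`, its field names, and the Swahili sample values are illustrative assumptions; this is not the data format used by ANLoc or CLDR.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class LocaleData:
    """Hypothetical, simplified locale record (not a real CLDR schema)."""
    language_code: str      # language code, e.g. "sw" for Swahili
    text_direction: str     # "ltr" or "rtl"
    day_names: tuple        # Monday first, to match date.weekday()
    month_names: tuple      # January first
    currency_symbol: str
    decimal_separator: str
    group_separator: str

    def format_date(self, d: date) -> str:
        """Render a date using this locale's day and month names."""
        day = self.day_names[d.weekday()]
        month = self.month_names[d.month - 1]
        return f"{day}, {d.day} {month} {d.year}"

    def format_amount(self, value: float) -> str:
        """Render a currency amount with this locale's separators."""
        whole, frac = f"{value:,.2f}".split(".")
        whole = whole.replace(",", self.group_separator)
        return f"{self.currency_symbol} {whole}{self.decimal_separator}{frac}"


# Sample values for Swahili, for illustration only.
SWAHILI = LocaleData(
    language_code="sw",
    text_direction="ltr",
    day_names=("Jumatatu", "Jumanne", "Jumatano", "Alhamisi",
               "Ijumaa", "Jumamosi", "Jumapili"),
    month_names=("Januari", "Februari", "Machi", "Aprili", "Mei", "Juni",
                 "Julai", "Agosti", "Septemba", "Oktoba", "Novemba", "Desemba"),
    currency_symbol="TSh",
    decimal_separator=".",
    group_separator=",",
)

print(SWAHILI.format_date(date(2008, 7, 1)))
print(SWAHILI.format_amount(1234567.5))
```

Software that reads such a record does not need to know anything about the language itself, which is why locale data has to exist before localization can start.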
Fonts
• Many African languages have letters that do not exist in the standard European character set
• ANLoc is creating Free and Open Source fonts that contain all characters for numerous African languages that have been included in the Unicode standard
• Availability of a font with all the necessary characters is essential for using IT in a language
• Fonts are integrated with ANLoc Keyboards and Translation Tools
• Documentation and dissemination need more attention
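To make the point about missing letters concrete, the following sketch lists the characters in a text that cannot be encoded in Latin-1 (ISO 8859-1). Treating Latin-1 as the "standard European character set" is an assumption made for this example, and the function name `chars_outside_latin1` is hypothetical.

```python
import unicodedata


def chars_outside_latin1(text: str) -> list:
    """Return the distinct characters of `text` that Latin-1 cannot encode."""
    missing = []
    # NFC composes base letters and combining marks where Unicode allows it.
    for ch in unicodedata.normalize("NFC", text):
        try:
            ch.encode("latin-1")
        except UnicodeEncodeError:
            if ch not in missing:
                missing.append(ch)
    return missing


# Hooked letters used in Hausa and dotted vowels used in Yoruba,
# mixed with characters that Latin-1 does cover.
sample = "\u0253 \u0257 \u0199 \u1eb9 \u1ecd and plain ascii \u00e9"
for ch in chars_outside_latin1(sample):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNKNOWN')}")
```

A font (and a character encoding) that lacks any of the reported code points cannot display the language correctly.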
Keyboards
• Mapping the characters of a language’s alphabet to the keys on a QWERTY or AZERTY keyboard
• Completely integrated with the output of the Fonts subproject for each specific language
• 12 keyboards available in most recent Windows and Mac builds
• 30 keyboards available for Linux: http://is.gd/CjGi
• Documentation and dissemination need more attention
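The mapping idea can be sketched as a lookup table from key combinations to characters. The layout below is hypothetical (an AltGr layer on a QWERTY keyboard that produces Hausa hooked letters) and is not one of the ANLoc keyboards; the names `HAUSA_LAYER` and `type_keys` are invented for the example.

```python
# Hypothetical layout: (modifier, key) -> character produced.
# An empty modifier string means the key is pressed on its own.
HAUSA_LAYER = {
    ("altgr", "b"): "\u0253",        # ɓ
    ("altgr", "d"): "\u0257",        # ɗ
    ("altgr", "k"): "\u0199",        # ƙ
    ("altgr+shift", "b"): "\u0181",  # Ɓ
    ("altgr+shift", "d"): "\u018a",  # Ɗ
    ("altgr+shift", "k"): "\u0198",  # Ƙ
}


def type_keys(keystrokes, layout=HAUSA_LAYER):
    """Translate (modifier, key) pairs into text using the layout.

    Combinations the layout does not define fall back to the plain key.
    """
    return "".join(layout.get((mod, key), key) for mod, key in keystrokes)


print(type_keys([("altgr", "d"), ("", "a"), ("", "y"), ("", "a")]))
```

A real keyboard definition is written in the format each operating system expects, but the underlying table has this shape.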
Activating L10n
• Terminologies
• Translation tools
• Spellcheckers
• Localizing Software
• Training Localizers
These are the building blocks to ensure the viability of L10n for a language
[Photo: tzLUG, Dar es Salaam, December 2008]
Terminologies
• 2500 IT terms selected from more than 1100 translation files for Free and Open Source Software
• Definitions for each term in English
• Glossmaster software for rapid glossary development by project partners
• Producing terms + definitions in 14 African languages
• Working with Translation Bureau (Public Works and Government Services Canada) to add a French component
• Direct export to Virtaal translation tool of the Tools subproject
• Free online dissemination through PALDO (kamusi.org)
Translation Tools
Provide good tools to a wide range of users, including:
• Less skilled people
• People who cannot translate from English
• People with less commonly supported needs, such as custom fonts, ISO 639-3 codes, complex writing systems, right-to-left writing
Help beginners do the right thing and work productively right away
Integrate with existing resources such as glossaries and translation memories
Main tools being developed:
• Pootle – translation management, online translation
• Virtaal – powerful desktop (offline) translation tool
• Translate Toolkit – underlying technology for other tools, with numerous tools for L10n engineering, planning, QA, etc.
Products already in use for OpenOffice, Mozilla, Creative Commons, OLPC, Opera, and many others
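As an illustration of integrating with a translation memory, the lookup can be approximated as suggesting the stored translation of the closest previously translated string. This is a minimal sketch using Python's standard `difflib` module; it is not how Pootle or Virtaal implement matching, and the sample Swahili entries, the `MEMORY` table, and the `suggest` function are assumptions made for the example.

```python
import difflib

# Tiny illustrative translation memory: English source -> Swahili target.
MEMORY = {
    "Open file": "Fungua faili",
    "Save file": "Hifadhi faili",
    "Close window": "Funga dirisha",
}


def suggest(source, memory=MEMORY, cutoff=0.6):
    """Return (matched_source, translation) for the closest entry, or None."""
    matches = difflib.get_close_matches(source, list(memory), n=1, cutoff=cutoff)
    if not matches:
        return None
    return matches[0], memory[matches[0]]


print(suggest("Open files"))  # close to a stored entry
print(suggest("Zoom"))        # nothing similar enough in the memory
```

Raising `cutoff` makes suggestions more conservative, which may suit beginners who could otherwise accept a poor match.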
Spellcheckers
Create tools to simplify technical development:
• CorpusCatcher – collects texts from the web
• Spelt – word classification with a focus on productivity
Create three spellcheckers for languages of partners in the network:
• Gikuyu – Bantu, East (Kenya), agglutinative morphology
• Zulu – Bantu, South (South Africa), agglutinative morphology
• Yoruba – non-Bantu, West (Nigeria), rich tonal system
Spellcheckers are created for Hunspell for easy integration with office and internet tools (OpenOffice, Firefox, Thunderbird, others)
Build expertise for more work in this area going forward
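The workflow implied by this slide (collect text, derive a word list, check new text against it) can be sketched as follows. This is a simplified illustration, not the CorpusCatcher, Spelt, or Hunspell implementation; the function names, the frequency threshold, and the Swahili sample sentences are assumptions made for the example.

```python
import string
import unicodedata
from collections import Counter


def tokenize(text):
    """Split on whitespace, normalize to NFC, and strip ASCII punctuation."""
    words = []
    for token in text.split():
        word = unicodedata.normalize("NFC", token).strip(string.punctuation).lower()
        if word:
            words.append(word)
    return words


def build_wordlist(corpus, min_count=2):
    """Keep words seen at least `min_count` times, to filter out likely typos."""
    counts = Counter(tokenize(corpus))
    return {word for word, n in counts.items() if n >= min_count}


def misspelled(text, wordlist):
    """Return the words in `text` that are not in the word list."""
    return [word for word in tokenize(text) if word not in wordlist]


corpus = "Habari za asubuhi. Habari za jioni. Habari za asubuhi, rafiki."
wordlist = build_wordlist(corpus)
print(sorted(wordlist))
print(misspelled("Habari za asubhi", wordlist))
```

A flat word list like this does not cope with agglutinative morphology, where a single stem can appear in very many inflected forms; Hunspell's affix rules are one way to describe such forms compactly.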
Localizing Software
• Starting with Firefox, a key software application that is free, open source, extremely useful, and widely used
• Focus on languages for which glossaries are being developed in the Terminology subproject
• Creating L10n communities with pools of expertise that can continue with more projects
• For many languages, this is a demonstration that will prove for the first time that L10n is viable
Training Localizers
• Create training course modules with the Institute for Localisation Professionals (TILP Ireland) to cover local L10n needs
• Establish local pools of skilled L10n professionals
• Open source sprint will create material aimed directly at volunteer localizers
• Work toward a certification system for L10n professionals
[Photo: Arabize training workshop, Cairo, July 2008]
Sustaining L10n
• Language and ICT policy (taken individually and together)
• Network development
These are the foundations to ensure ongoing pursuit of L10n for speakers of African languages
Language and ICT Policy
• Review current state of language policy around Africa
• Review where language fits into ICT policy
• Provide resources for policy planners to understand language and ICT issues
• Engage policy planners and decision makers in support for expanded access to ICT through L10n
[Photo: International Mother Language Day, Paris, February 2009]
Network Development
• Website with capacity for contributions by all network members: http://africanlocalization.net
• Active discussion list for partner communication
• Annual network meetings for major partner organizations
• Recruitment of new partners through website, subprojects, and outreach
Lessons learned: two case studies

Locales
• Time and effort required to recruit participants through networks
• Volunteer model for data contributions
• Upstreaming data: finding a thirst for project outputs

Terminologies
• Time and effort required for software development
• Payment model for significant data contributions
• Technology obstacles for African partners
• Managing the scope: finding a hunger for joining in
Lessons learned: Locales
Time and effort required to recruit participants through networks
• Ambitious goal of 100 languages
• Need to find people with the necessary combination of computer skills, network access, and language knowledge
• For languages in the long tail, that means we need to identify and recruit from among about ½ million total speakers
• Even some languages with more than 10,000,000 speakers have not produced a single volunteer
Lessons learned: Locales
Time and effort required to recruit participants through networks
• Broadcasting through existing networks (mailing lists, newsgroups)
• Exploring new social networking opportunities (Facebook, Twitter)
• Using the personal networks of ANLoc members
Lessons learned: Locales
Volunteer model for data contributions
• Amount of work is only 2 to 3 hours per language
• Small payments to 100 people in 50 countries would be a logistical nightmare (even if we had a budget to cover it)
• New recruitment campaigns have addressed this question head on: “And to answer the most common question in advance, yes, volunteer means for free - for your language, for your country, but not for money.”
Lessons learned: Locales
Upstreaming data: finding a thirst for project outputs
• CLDR (Common Locale Data Repository)
• Google, IBM, Wikimedia Foundation
Lessons learned: Terminologies
Time and effort required for software development
Software must:
• Be simple to use
• Be fast
• Be lightweight
• Deal with numerous linguistic complexities
• Interlink numerous languages
Lessons learned: Terminologies
Payment model for significant data contributions
• Project takes about 2 months of professional labor per language
• Payment for each language occurs when all 2500 entries are complete
• Payment model ensures that work gets done and that quality control can be implemented
Lessons learned: Terminologies
Technology obstacles for African partners
• Power outages
• Connectivity problems
• Adequate equipment has not been a problem for our partners
Lessons learned: Terminologies
Managing the scope: finding a hunger for joining in
• Project provides a consistent, carefully chosen set of L10n terminology that can be used for any language
• English glossary with clear definitions is a resource that is not available to localizers elsewhere on the web
• Project cut from 24 languages to 12 to fit within budget constraints
• Additional language groups are seeking to join on a volunteer basis
The long view: the outlook for IT in African languages and African societies going forward
• Use of ANLoc outputs by consumers
• Continued L10n through people and tools enabled by ANLoc
• Strong and growing network of African IT and language professionals
• Increased industry L10n activity
• Establishing the expectation that IT will be available in African languages: making localization the new normal
[Photos: Isimikinyi, Tanzania, June 2005; Isimikinyi, Tanzania, July 2008]
Achievement And Lessons Learned By An Loc

  • 1. Presented by: Martin Benjamin (martin@kamusi.org) Director of ANLoc Locales and Terminology Subprojects Funded by IDRC, Acacia, project number 104475 Managed in IDRC regional office for Middle East and North Africa, Cairo
  • 2. Achievements and Lessons Learned by the African Network for Localization Languages and information technology in Africa: the challenges for localization Addressing the challenges: ANLoc and its subprojects Lessons learned: two case studies The long view: the outlook for IT in African languages and African societies going forward
  • 3. Languages and information technology in Africa: the challenges for localization
  • 4. National University of Rwanda July 2008 Rwanda, July 2008 Dar es Salaam, Tanzania, July 2008
  • 6. African language facts: • As many as 2000 languages spoken in Africa by 1,000,000,000 people • Over 200 languages are spoken by more than 500,000 people each • At least 15 languages are spoken by more than 10,000,000 people each: Amharic, Arabic, Berber, Chewa, Fula, Hausa, Igbo, Kinyarwanda, Malagasy, Oromo, Shona, Somali, Swahili, Yoruba, Zulu • Primary education in Africa is often in local, regional, or national languages • IT in Africa is mostly available in English, French, or Portuguese
  • 7. Enter ANLoc Adapting ICT so that it can be used by: • Non-specialists IT+46 training workshop, Kampala, November 2006 • Non-elites • Non-speakers of global L10n = languages Localization • Students • Anyone In other words, most of Africa’s 1 billion people
  • 8. Addressing the challenges: ANLoc and its subprojects Enabling L10n Software Freedom Day, Accra, September 2008 • Locales • Fonts • Keyboards Activating L10n • Terminologies • Translation tools • Spellcheckers • Localizing software • Training localizers Sustaining L10n • Language and ICT policy • Network development
  • 9. Enabling L10n Locales These are things that Fonts must exist for a language before any software Keyboards localization can occur Translate Firefox event, Kampala, August 2008
  • 10. Locales The basic information sets needed to configure computers for a language •What character sets to use •How dates and numbers appear •Direction text is written •Names of days, weeks, months •Currency symbols, measurement systems •Other background information that computers need for a language In French: “paramètres régionaux” Makes it easy to write and share documents in a language Makes it possible to develop software, websites, mobile phones, ATMs, etc, for a language
  • 11. Fonts Many African languages have letters that do not exist in the standard European character set ANLoc is creating Free and Open Source fonts that contain all characters for numerous African languages that have been included in the UNICODE standard Availability of a font with all the necessary characters is elemental for using IT in a language Fonts are integrated with ANLoc Keyboards and Translation Tools Documentation and dissemination need more attention
  • 12. Keyboards Mapping the characters of a language’s alphabet to the keys on a qwerty or azerty keyboard Completely integrated with the output of the Fonts subproject for each specific language 12 keyboards available in most recent Windows and Mac builds 30 keyboards available for Linux: http://is.gd/CjGi Documentation and dissemination need more attention
  • 13. Activating L10n Terminologies These are the building Translation tools blocks to ensure the viability of L10n for a Speelcheekers language Localizing Software Training Localizers tzLUG, Dar es Salaam, December 2008
  • 14. Activating L10n Terminologies These are the building Translation tools blocks to ensure the viability of L10n for a Spellcheckers language Localizing Software Training Localizers tzLUG, Dar es Salaam, December 2008
  • 15. Terminologies 2500 IT terms selected from more than 1100 translation files for Free and Open Source Software Definitions for each term in English Glossmaster software for rapid glossary development by project partners Producing terms + definitions in 14 African languages Working with Translation Bureau (Public Works and Governments Services Canada) to add a French component Direct export to Virtaal translation tool of the Tools subproject Free online dissemination through PALDO (kamusi.org)
  • 16. Translation Tools Provide good tools to a wide range of users, including: • Less skilled people • People who cannot translate from English • People with less-frequently provided needs, such as custom fonts, ISO 639-3 codes, complex writing systems, right-to-left writing Help beginners do the right thing and work productively right away Integrate with existing resources such as glossaries and translation memories Main tools being developed • Pootle – translation management, online translation • Virtaal – powerful desktop (offline) translation tool • Translate Toolkit – underlying technology for other tools, with numerous tools for L10n engineering, planning, QA, etc. Products already in use for OpenOffice, Mozilla, Creative Commons, OLPC, Opera, and many others
  • 17. Translation Tools Provide good tools to a wide range of users, including: • Less skilled people • People who cannot translate from English • People with less-frequently provided needs, such as custom fonts, ISO 639-3 codes, complex writing systems, right-to-left writing Help beginners do the right thing and work productively right away Integrate with existing resources such as glossaries and translation memories Main tools being developed • Pootle – translation management, online translation • Virtaal – powerful desktop (offline) translation tool • Translate Toolkit – underlying technology for other tools, with numerous tools for L10n engineering, planning, QA, etc. Products already in use for OpenOffice, Mozilla, Creative Commons, OLPC, Opera, and many others
  • 18. Spellcheckers Create tools to simplify technical development • CorpusCatcher – collects texts from the web • Spelt – word classification with a focus on productivity Create three spellcheckers for languages of partners in the network: • Gikuyu – Bantu, East (Kenya), agglutinative morphology • Zulu – Bantu, South (South Africa), agglutinative morphology • Yoruba – non-Bantu, West (Nigeria), rich tonal system Spellcheckers are created for Hunspell for easy integration with office and internet tools (OpenOffice, Firefox, Thunderbird, others) Build expertise for more work in this area going forward
  • 19. Localizing Software Starting with Firefox, a key software application that is free, open source, extremely useful, and widely used Focus on languages for which glossaries are being developed in the Terminology subproject Creating L10n communities with pools of expertise that can continue with more projects For many languages, this is a demonstration that will prove that L10n is viable for the first time
  • 20. Training Arabize training Localizers workshop, Cairo, July 2008 Create training course modules with the Institute for Localisation Professionals (TILP Ireland) to cover local L10n needs Establish local pools of skilled L10n professionals Open source sprint will create material aimed directly at volunteer localizers Work toward a certification system for L10n professionals
  • 21. Sustaining L10n Language and ICT policy (taken individually and together) Network development These are the foundations to ensure ongoing pursuit of L10n for speakers of African languages
  • 22. Language and ICT Policy Review current state of language policy around Africa Review where language fits into ICT policy Provide resources for policy planners to understand language and ICT issues Engage policy planners and decision makers in support for expanded access to ICT through L10n International Mother Language Day Paris, February 2009
  • 23. Network Development Website with capacity for contributions by all network members: http://africanlocalization.net Active discussion list for partner communication Annual network meetings for major partner organizations Recruitment of new partners through website, subprojects, and outreach
  • 24. Lessons learned: two case studies
      • Locales
          • Time and effort required to recruit participants through networks
          • Volunteer model for data contributions
          • Upstreaming data: finding a thirst for project outputs
      • Terminologies
          • Time and effort required for software development
          • Payment model for significant data contributions
          • Technology obstacles for African partners
          • Managing the scope: finding a hunger for joining in
  • 25. Lessons learned: Locales
      Time and effort required to recruit participants through networks
      • Ambitious goal of 100 languages
      • Need to find people with the necessary combination of computer skills, network access, and language knowledge
      • For languages in the long tail, that means we need to identify and recruit from among about ½ million total speakers
      • Even some languages with more than 10,000,000 speakers have not produced a single volunteer
  • 26. Lessons learned: Locales
      Time and effort required to recruit participants through networks
      • Broadcasting through existing networks (mailing lists, newsgroups)
      • Exploring new social networking opportunities (Facebook, Twitter)
      • Using the personal networks of ANLoc members
  • 27. Lessons learned: Locales
      Volunteer model for data contributions
      • Amount of work is only 2 to 3 hours per language
      • Small payments to 100 people in 50 countries would be a logistical nightmare (even if we had a budget to cover it)
      • New recruitment campaigns have addressed this question head on: “And to answer the most common question in advance, yes, volunteer means for free - for your language, for your country, but not for money.”
  • 28. Lessons learned: Locales
      Upstreaming data: finding a thirst for project outputs
      • CLDR (Common Locale Data Repository)
      • Google
      • IBM
      • Wikimedia Foundation
  • 29. Lessons learned: Terminologies
      Time and effort required for software development
      Software must:
      • Be simple to use
      • Be fast
      • Be lightweight
      • Deal with numerous linguistic complexities
      • Interlink numerous languages
  • 30. Lessons learned: Terminologies
      Payment model for significant data contributions
      • Project takes about 2 months of professional labor per language
      • Payment for each language occurs when all 2500 entries are complete
      • Payment model ensures that work gets done and that quality control can be implemented
  • 31. Lessons learned: Terminologies
      Technology obstacles for African partners
      • Power outages
      • Connectivity problems
      • Adequate equipment has not been a problem for our partners
  • 32. Lessons learned: Terminologies
      Managing the scope: finding a hunger for joining in
      • Project provides a consistent, carefully chosen set of L10n terminology that can be used for any language
      • English glossary with clear definitions is a resource that is not available to localizers elsewhere on the web
      • Project cut from 24 languages to 12 to fit within budget constraints
      • Additional language groups are seeking to join on a volunteer basis
  • 33. The long view: the outlook for IT in African languages and African societies going forward
      • Use of ANLoc outputs by consumers
      • Continued L10n through people and tools enabled by ANLoc
      • Strong and growing network of African IT and language professionals
      • Increased industry L10n activity
      • Establishing the expectation that IT will be available in African languages: making localization the new normal
      Isimikinyi, Tanzania, June 2005
      Isimikinyi, Tanzania, July 2008
  • 34. Presented by: Martin Benjamin (martin@kamusi.org) Director of ANLoc Locales and Terminology Subprojects Funded by IDRC, Acacia, project number 104475 Managed in IDRC regional office for Middle East and North Africa, Cairo
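
Slide 18 describes collecting texts from the web and classifying words in order to build Hunspell spellcheckers. The word-list step of such a pipeline might be sketched as follows. This is only an illustration under stated assumptions: the function names `tokenize`, `candidate_words`, and `to_dic` and the `min_count` cutoff are hypothetical, and nothing here reflects how CorpusCatcher or Spelt actually work. The one fixed point is the Hunspell `.dic` layout, which starts with a line giving the number of entries, followed by one word per line.

```python
import unicodedata
from collections import Counter


def tokenize(text):
    """Yield lowercase words made of letters and combining marks.

    Combining marks are kept so that tone-marked letters (as in Yoruba)
    stay inside one word. Assumption: apostrophes and hyphens end a word,
    which is a simplification for some orthographies.
    """
    word = []
    for ch in unicodedata.normalize("NFC", text):
        # Unicode categories starting with L are letters, with M are marks.
        if unicodedata.category(ch)[0] in ("L", "M"):
            word.append(ch)
        elif word:
            yield "".join(word).lower()
            word = []
    if word:
        yield "".join(word).lower()


def candidate_words(text, min_count=2):
    """Return sorted words seen at least min_count times (hypothetical cutoff)."""
    counts = Counter(tokenize(text))
    return sorted(w for w, c in counts.items() if c >= min_count)


def to_dic(words):
    """Format words as a Hunspell .dic file: entry count, then one word per line."""
    return "\n".join([str(len(words)), *words]) + "\n"


if __name__ == "__main__":
    sample = "moto moto maji moto maji kiti"
    print(to_dic(candidate_words(sample)), end="")
```

A frequency cutoff is a crude filter for typos and foreign words in web text. In practice the candidate list would still need review by speakers of the language, which is the kind of classification work the slide attributes to Spelt.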

Editor's Notes

  1. 4 languages of Rwanda: English, French, Swahili, Kinyarwanda
  2. Story about getting a business card made in Kigali. Software in English and French, conversation in Kinyarwanda and a bit of Swahili, cards took about an hour to design because the designer couldn’t read all the menus. National University of Rwanda has a haphazard collection of computers that use English or French, depending on who donated or purchased the equipment.