2019 11-06 introduction to localization silicon valley-pc_ final_20191106

#LocWorld41
Introduction to Localization
Paul Cerda

#LocWorld41
Presentation highlights
• Introduction
• Zooming out — Globalization The Macro view
• Jargonese
• Zooming in — WELD: Whole enterprise Localization Design
• Where are you standing?
• Localization’s Anatomy — WELD
• Localization types and mediums: Continuous, Agile, voice, device, longevity, priority
• Localization Tools — CAT, TMS, Terminology
© The Word in Bits, 2019

#LocWorld41
Who am I?

#LocWorld41
Terminology (Jargon)
Globalization (Product and business strategy decisions)
Adaptation of a product to the sensibilities and needs of a new market. This
includes localization and internationalization.
Internationalization (Code and development decisions)
Engineering of a product to enable efficient adaptation of that product to
local requirements. This includes separation of code from content and revision
of code to integrate programmatic solutions for internationalization.
Localization (Language/Culture/market decisions)
Localization is the process of adapting a (software) product and
accompanying materials to suit a target-market locale. This includes
translation.

#LocWorld41
Why Localize?
Which image will sell more e-book readers in Africa or Latin America?

#LocWorld41
Localization: adapt product for new locales
It is a small part of a larger whole.
Globalization is... product culturation supported and sustained by internationalized code and localized content.
I18n
L10n
Product
culturation
What you need to be aware of in localization

#LocWorld41
Localization: the 10,000-foot view
Production
Content
and
Metadata
Metrics
Tools
Back Office

#LocWorld41
Production: Translation and localization ops.
The PM, engineering, and translation teams work to extract, transform, translate, transcreate, and ingest localized
content and images used for globalized products and services. Continuous improvement calls for tweaking cost,
time, and quality. Decisions about levels of localization, human or machine translation, tools, and processes fall to
those tasked with localization.
Ops
Cost
Time
Scope
Production Concerns

#LocWorld41
Production: The Players
Ops
Loc PM
Engineer
Linguists
MLV/SLV
Roles and responsibilities

#LocWorld41
The handoffs and handbacks
• Multiple TMS and CMS systems
• APIs
• SDK integrations
• Spreadsheets
• Resource Files

#LocWorld41
Localization: flow of data and information
[Diagram: content flows from the client side (CMS, source DB, code, app) to the vendor side. Loc. Eng. creates files; files are sent directly or added to a TMS or CAT tool; Loc. Eng. preps files or helps with MT; linguists translate or post-edit on both sides. Loc PMs manage the work across teams.]

#LocWorld41
Localization Production: the Main Players
Localization PM: Vendor and Client
Localization Engineer: There will be client and vendor loc engineers.
Linguist: Vendor and possibly client linguists may translate, review, edit, or post-edit
translations.
MLV: Multi-language vendor. Think of this as your main contact for all the individuals
and smaller firms that do your linguistic work.
SLV: Single language vendors work specifically from and to a single language.
MT Provider: If you have an MT provider, they will customize NMT or SMT for you.
QA Team: The QA team will test for linguistic and functional issues in your localized software.

#LocWorld41
Localization Project Mgmt: Necessary Skills
The localization project manager manages all the details and troubleshoots project
and sometimes program level work. They work closely with the localization engineer
to make sure the content is localized for the appropriate markets.
• Project management skills
• Linguistic skills
• Negotiation skills
• Financial management skills

#LocWorld41
Localization Engineer: Necessary Skills
Localization engineers are the scripting and coding counterparts of the localization
project managers. They are responsible for reducing manual effort and ensuring that
programmatic means are incorporated to reduce cost, time, and errors.
• Data Munging
• API integrations
• Automating processes
• Python, Java, or JavaScript skills

#LocWorld41
Content Specific Issues
Multimedia
• Video
• Subtitles
• Dubbing
• Voiceover
• Images/embedded text
• Animations
Data Intensive
• Machine translation
• Artificial Intelligence
• Voice UI
Special Regulations or
Requirements
• Privacy
• Transcreation
• Games
• Medical
• Legal
Content issues are characterized by the unique content requirements
and the volume of content needed to produce viable systemic solutions.

#LocWorld41
Time-Specific Issues
Poor planning and design
I18n not planned, functionality not scoped for multiple locales, assumption that
translation is the only localization activity.
Linguistic Development Process
Translation time, File Prep time, development dependencies, product
Artificial launch timelines
Reduction in scope, abandoning localization altogether or sending out half-localized
products

#LocWorld41
Scale Specific Issues
Volume
• Time
• Budget
• Quality
• Process
• Product
Locales
• Budget
• Timeline
• Resources
• Maintenance
• Data stores
Content Lifecycle
• Tool Integration
• Team interactions
• Authoritative source
• Source/Target issues
• Creation Approval
Deprecation

#LocWorld41
How do you measure Loc Quality?
• Quality is what the customer says it is.
• SAE J2450, LISA, MQM, DQF
Requires Human Review
• MT: Proximity to human equivalent (Levenshtein, BLEU, METEOR, etc.)
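As a rough illustration of the "proximity to human equivalent" idea, the Python sketch below computes a Levenshtein (edit) distance between an MT output and a human reference and normalizes it to a 0..1 score. The function names `levenshtein` and `similarity` and the normalization by the longer string are assumptions made for this example, not something the slides prescribe. BLEU and METEOR work differently (they compare word n-grams rather than character edits) and are not shown here.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def similarity(mt_output: str, human_reference: str) -> float:
    """Normalize edit distance to a 0..1 score (1.0 means identical)."""
    longest = max(len(mt_output), len(human_reference))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(mt_output, human_reference) / longest
```

A score near 1.0 suggests the post-editor had little to change; a low score suggests heavy editing. Any threshold for "good enough" would have to be agreed with the customer.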

#LocWorld41
Linguistic Quality and how to scale it
• Automate objective quality
• Put vendors, tools, and metrics in
place for subjective quality
• Scale subjective reviews
• Design machine translation post-editing (MTPE) efforts for vendors and tools
• Automated MT quality reviews
Quality
Objective
Subjective
Human
quality at
scale
MTPE
MT and
MTPE
Quality at
scale

#LocWorld41
Human Translation
Human
Translation
Freelancer
In-house
Vendor

#LocWorld41
Machine Translation
Machine
Translation
Rule-Based
Statistical
Neural

#LocWorld41
History of MT
• One of the earliest goals for computers in the 1950s
• Rule-based MT ruled the 80s and 90s
• 2000-2015: SMT reigned, with a variety of strategies
• 2015-Present: Neural Machine Translation (NMT)
• 2018-Present: Unsupervised m

#LocWorld41
Rules-Based Machine Translation
• First production-ready MT systems
• Semantics-based. Designed for each individual locale
• Highly complex and labor-intensive to create but results were
great.
• Required Linguists, CS teams, a lot of effort

#LocWorld41
Statistical Machine Translation
• Brute-Force: Massive Bilingual corpora necessary
• Large Data companies began scraping the web for bilingual
data to create corpora.
• Google, MS, others created translation tooling to capture data
from translators
• SMT was notoriously inaccurate at first, but with time and
discipline specific training it became essential to the
Localization industry.
• SMT engines were trained with a source and target in mind.
Language arcs were not designed to be bidirectional

#LocWorld41
Neural Machine Translation
• Sometimes called deep learning. These engines depend on
recurrent neural networks.
• NMT requires fewer sentences, but it still requires a bilingual
corpus. This is problematic for long-tail languages that lack
the corpora.
• Latest unsupervised engines can use monolingual data and
show promise, but it is still early.

#LocWorld41
Post-edited machine translation
In-house
Vendor
Freelancer
SMT vs. NMT

#LocWorld41
Where are you standing?
What matters to you will decide which of the areas on the next slide you own,
measure, depend on, or ignore.
LPMs: Micro-level: production, operations, testing, regression, project costs
Globalization/Localization leads: Macro-level: everything but data, unless there is a fire
C-Level: Metrics for growth, sales, conversion, MAU etc., ROI in relation to
international growth. Localization is a part of this, but not the whole.

#LocWorld41
Localization in the Enterprise
Production
Content
and
Metadata
Metrics
Tools
Back Office

#LocWorld41
Localization’s Anatomy
WELD in Detail or what does your organization do?

#LocWorld41
Production: The Players
Ops
Loc PM
Engineer
Translators
MLV/SLV
Roles and responsibilities

#LocWorld41
Back Office Concerns
Easing the pain of internal customers
Back office
Payment processes, fund reallocation, IP holdings, worker classification, tax, legal, finance, and
other elements.
Back
Office
Finance
Legal
Tax

#LocWorld41
Back Office: The Players
Easing the pain of internal customers
Back office
Payment processes, fund reallocation, IP holdings, worker classification, tax, legal, finance, and
other elements.
Back
Office
Finance
teams
Lawyers
Tax
Lawyers

#LocWorld41
Tools / Technologies Concerns
Scaling and systematizing localization
Tools
1. The tools used for loc production.
2. The tools that disseminate, transform, and ingest the source and localized content.
Tools
Loc
Content
Lifecycle
Development
Tech:
API
MTs
AI
TMS
CAT

#LocWorld41
Tools / Technologies Players
Scaling and systematizing localization
Teams
1. Content Lifecycle
2. Multilingual
3. Content dependent
Tools
Loc
Teams
Content
Teams
Dev
Teams

#LocWorld41
Content and Metadata Concerns: Creation
Content and Metadata
What is done with it: Ingestion, security, deployment, and transformation
Description: Metadata helps to identify content and usage so that the tools know how to use or
process it.
Data
Content,
Data types
Data
management
Data use
Communication and format

#LocWorld41
Content and Metadata Players: Creation
Data and Metadata
• What is done with gathered data: Analytics, Machine Learning, Predictive analysis, AI, Product
design
• What is done with created data: Lifecycle manager
Data
KM
teams
Support
teams
Product, Marketing, Sales teams
Who owns and parses the data?

#LocWorld41
Content and Metadata Concerns: Capture
Content and Metadata
What is done with it: Ingestion, analysis, transformation, product features, machine learning, AI
Description: Captured data is stored, mined, leveraged, and re-deployed
Data
Capture
Data
management
Data use
Communication and format

#LocWorld41
Content and Metadata Players: Capture
Data and Metadata
• What is done with gathered data: Analytics, Machine Learning, Predictive analysis, AI, Product
design
• What is done with created data: Lifecycle manager
Data
Data
Analysts
ML
teams
SEO/ Product teams
What is done with the data?

#LocWorld41
Metrics concerns
Proving your worth
Metrics
Each level of your business will measure something different, so it is important to know what they are
measuring. The data will be useful for you to contextualize and create valuations for your work.
Metrics
Operational
Divisional
C-Level

#LocWorld41
Metrics Players
Proving your worth
Metrics
Each level of your business will measure something different so it is important to know what they
are measuring. The data will be useful for you to contextualize and create value for your work.
Metrics
Development
Data
Analytics
SEO

#LocWorld41
Globalization efforts broaden the scope
If your organization is making a concerted push globally, it may organize many
other teams to work on components that interact with localization.
Product: What does the product look like in each locale?
I18n: Can the code handle localized content and regional needs?
UX: Is the user experience transferable to other locales?
CX: Is the customer experience adapted to locale-specific expectations?
International Analytics: Are there analytics by country or region to glean insights?
ROI: Is there a clear return on investment case for localization or specific locales?
Corporate structure: Does the corporate structure delineate ownership of localized
features?

#LocWorld41
Enterprise design for localization
What does your organization look like?

#LocWorld41
The Enterprise parts: Client-Side
The Wall: A disconnect between localization and the rest of the company. Localization is often
seen as the last step and the long pole in international product development.
The Silo: Business units build their own infrastructures for localization and rarely
leverage a centralized set of tools, vendors, or processes.
The Hub: Localization is a centralized or platform function shared across the
company.

#LocWorld41
The Wall
The wall is common in localization. Every division sends its content over the wall to
localization and waits for the finished content to return.
Product Dev Content
Loc

#LocWorld41
The Silo
Many large enterprises silo their work and this creates many loc processes.
[Diagram: Biz 1, Biz 2, and Biz 3, each with Prod, Dev, Content, and Loc.]

#LocWorld41
The Silo and the Wall
Many large enterprises have silos and walls. Loc teams work separately in horizontal
orgs.
[Diagram: Biz 1, Biz 2, and Biz 3, each with Prod, Dev, Content, and Loc.]

#LocWorld41
The Hub
Many large enterprises make loc a hub. And these loc teams are conversant in the
company products.
[Diagram: Loc hub with Biz 1, Biz 2, Biz 3, Biz 4, and Biz 5.]

#LocWorld41
The Hub and Wall
Many large enterprises make loc a hub. But they throw work over the wall to loc
teams not conversant in products.
[Diagram: Loc hub with Biz 1, Biz 2, Biz 3, Biz 4, and Biz 5.]

#LocWorld41
The Hub and Silo
Some enterprises have central tools, but individual loc teams interact with an
enterprise localization team.
Product Dev Content Product Dev Content Product Dev Content
Enterprise localization Tools

#LocWorld41
The Hub and Silo and Wall
Some localization teams have content thrown over the wall and work with enterprise
loc teams to manage the content with little to no context.
Product Dev Content
Loc
Enterprise localization Tools
Product Dev Content Product Dev Content

#LocWorld41
Supply Parts: Vendor-Side
MLV: Multi-language vendor. A Vendor of vendors
SLV: Single-language vendor. Provides a single language
to several MLVs.
Translators: In-House or Freelance
• In-house translators usually know the products and services
better, and there are fewer issues with access to systems.
• Freelancers may be cheaper, but they are not usually
dedicated solely to a single company, so they need more
context.

#LocWorld41
MLV
• A multi-language vendor engages with large clients for millions of words and
multimillion-dollar contracts.
• Amalgamators of capacity and large-scale problem solvers.
• Can staff and bring team in-house for short term spikes and long-term needs.

#LocWorld41
SLV
• Specialists in a single language or multiple languages and dialects in a given
region.
• Supply MLVs and occasionally enterprise clients for specific languages or regions.
• Handle smaller volumes, but they are closer to the linguists.

#LocWorld41
The MLV in the Silo and the Wall
Many large enterprises have silos and walls. Loc teams work separately in horizontal
orgs and they pass to MLVs who pass to SLVs who pass to linguists.
[Diagram: Biz 1, Biz 2, and Biz 3, each with Prod, Dev, Content, and Loc.]

#LocWorld41
Mojibake
Mojibake occurs when
character encoding is
incorrect.
Buttons have the correct text
because they are images, but
rendered text is scrambled
because the character
encoding does not match
the page settings.
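A minimal Python sketch of how this kind of mismatch arises, assuming text written as UTF-8 and read back as Latin-1. The sample string and variable names are illustrative only; the slide does not say which encodings were involved in its screenshot.

```python
original = "Español"

# Bytes as written by a system that stores text as UTF-8.
utf8_bytes = original.encode("utf-8")

# A reader that wrongly assumes Latin-1 turns each UTF-8 byte into its own
# character, which is what scrambles the rendered text.
garbled = utf8_bytes.decode("latin-1")

# Reversing the wrong step recovers the text, provided no bytes were lost.
repaired = garbled.encode("latin-1").decode("utf-8")

print(garbled)   # EspaÃ±ol
print(repaired)  # Español
```

The usual fix is to declare and use one encoding end to end, rather than repairing strings after the fact.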

#LocWorld41
Hard-Coded Strings
Hard-coded strings are
strings that have been
stored in the code rather
than abstracted into a
separate file that gets
pulled in at runtime.
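To make the contrast concrete, here is a small Python sketch. The `RESOURCES` table, the `welcome` key, and both function names are hypothetical; in a real product the per-locale strings would usually live in separate resource files rather than inline JSON.

```python
import json

# Hard-coded: the English text lives in the code, so it cannot be
# translated without editing and redeploying the code itself.
def greet_hard_coded(name: str) -> str:
    return "Welcome, " + name + "!"


# Externalized: strings are looked up by key in per-locale resources.
# Inline JSON stands in for separate resource files in this sketch.
RESOURCES = {
    "en": json.loads('{"welcome": "Welcome, {name}!"}'),
    "es": json.loads('{"welcome": "¡Bienvenido, {name}!"}'),
}


def greet(name: str, locale: str = "en") -> str:
    strings = RESOURCES.get(locale, RESOURCES["en"])  # fall back to English
    return strings["welcome"].format(name=name)


print(greet("Ana", "es"))  # ¡Bienvenido, Ana!
```

Note that the whole sentence is one resource with a named placeholder, so translators can move `{name}` wherever the target language needs it.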

#LocWorld41
Pseudo-Localization: Example I18n prep
Start and end markers: All strings are
encapsulated in [ ]. If a developer doesn’t see
these characters they know the string has been
clipped by an inflexible UI element.
Transformation of ASCII characters to extended
character equivalents: Stresses the UI from a
vertical line height perspective, tests font and
encoding support, and weeds out strings that
haven’t been externalized correctly (they will not
have the Pseudo Localization applied to them).
Padding text: Simulates translation induced
expansion. In our case we add “one two three
four”…etc after each string, simulating 40%
expansion. Note that we don’t apply expansion
to areas of the UI where text length has already
been limited by other systems prior to display on
the UI, doing so would cause false positives ( e.g.
synopsis text, titles, etc ).
https://medium.com/netflix-techblog/pseudo-localization-netflix-12fff76fbcbe
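The three techniques described above can be sketched in a few lines of Python. This is an illustrative approximation, not the implementation from the linked article: the character mapping, the `pseudo_localize` name, and the `pad` switch (for strings whose length is already limited upstream) are assumptions made for this example.

```python
# Map some ASCII letters to accented equivalents (partial, illustrative map).
ACCENTED = str.maketrans(
    "AEIOUaeiouCcNn",
    "ÀÉÎÕÜàéîõüÇçÑñ",
)
PADDING_WORDS = ["one", "two", "three", "four", "five", "six", "seven", "eight"]


def pseudo_localize(text: str, expansion: float = 0.4, pad: bool = True) -> str:
    """Accent the text, optionally pad it by about `expansion`, and wrap it in [ ]."""
    result = text.translate(ACCENTED)
    if pad:
        target = int(len(text) * expansion)  # padding characters to add
        words = []
        added = 0
        i = 0
        while added < target:
            word = PADDING_WORDS[i % len(PADDING_WORDS)]
            words.append(word)
            added += len(word) + 1  # the word plus its separating space
            i += 1
        if words:
            result += " " + " ".join(words)
    # Start and end markers reveal clipping by inflexible UI elements.
    return "[" + result + "]"


print(pseudo_localize("Play movie"))
print(pseudo_localize("Save", pad=False))
```

A string that shows up in the UI without brackets or accents was probably never externalized; a string missing its closing bracket was clipped.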

#LocWorld41
What does a globalized product look like?
Code
Culturation
Language
UX
UI
