Wanted: Best Practices for Collaborative Translation
Alain Désilets
National Research Council of Canada
firstname.lastname@example.org
A (Very) Brief History of Collaborative Translation
Circa 2005: “Wikis, what’s that?”
Circa 2006: “I know about Wikipedia, but I hear it’s garbage because anyone can write anything on it.”
Circa 2007: “You know, I have been to Wikipedia a couple of times and was pleasantly surprised by the quality of what I found there.”
Circa 2008: “Actually, this wiki stuff is really interesting. Now, I routinely use Wikipedia in my work, and although I am cautious with it, I find it useful. I have a sense that this wiki/collaborative stuff will have a wider impact for translation, but I’m not quite sure how.”
A (Very) Brief History of Collaborative Translation (2)
Circa 2009: “Whoa! Translation Crowdsourcing is going to put me out of a job!”
Circa 2010: “Well, I guess that was a storm in a teacup. Crowdsourcing will be used in some specific and limited contexts, but it won’t take over. Maybe this is more an opportunity for us than a threat…”
Circa 2011: “Hmm… getting this collaborative translation stuff to work right is hard and confusing.”
Talk Outline
• The different “flavors” of Collaborative Translation
• Common issues and challenges in Collaborative Translation
• Capturing Collaborative Translation best practices in the form of Design Patterns
The different flavors of Collaborative Translation
Definition
Collaborative Translation is the use of any open online collaboration technology or process to help with translation tasks, or with tasks related to translation (ex: terminology).
Available in the following flavors…
• Translation crowdsourcing
• Collaborative terminology resources
• Translation memory sharing
• Online marketplaces for translators
• Agile translation teamware
• Post-editing by the crowd
Translation Crowdsourcing
Mechanical-Turk-like systems to support the translation of content by large crowds of mostly amateurs, through an open-call process.
By far the most talked-about collaborative translation approach.
• Software user interface (Facebook, Adobe, Symantec, Firefox)
• Technical documentation (Adobe, Symantec)
• Transcripts of videos of an “inspirational” nature (TED Talks, Adobe TV)
• Humanitarian aid content (Translators without Borders, Kiva.org, Haiti Earthquake Mission)
• Large-scale collection of linguistic data for research purposes or machine translation training (NAACL Workshop on Crowdsourcing)
Collaborative terminology resources
Wikipedia-like platforms for the creation and maintenance of large terminology resources by a crowd of translators, terminologists, domain experts, and even general members of the public.
• Wikipedia
• Wiktionary
• ProZ’s Kudoz forum
• Urban Dictionary
• TermWiki.com
• TikiWiki
• Reverso dictionary
Translation memory sharing
Platforms for large-scale pooling and sharing of multilingual parallel corpora between organizations and individuals.
• TAUS Data Association
• MyMemory
• Google Translator Toolkit
• WeBiText
Often, collaboration is “implicit”, for example, in the case of WeBiText.
Online marketplaces for translators
eBay-like disintermediated environments for connecting customers and translators directly, with minimal intervention by a middleman.
• ProZ.com
• TranslatorsCafe
• Translated.net
The collaborative aspect comes from things like “open call sourcing” and reputation management based on community assessment.
Agile translation teamware
Wiki-like systems and processes that allow multidisciplinary teams of professionals (translators, terminologists, domain experts, revisers, managers) to collaborate on large translation projects, using an agile, grassroots, parallelized process instead of the more top-down, assembly-line approach found in most translation workflow systems.
No specific software or site, but many case studies describe how to implement this approach using general-purpose collaboration tools like wikis or BaseCamp.
• Beninatto & De Palma, 2008
• Calvert, 2008
• Yahaya, 2008
Some translation workflow systems are starting to market themselves as being “collaborative”.
Post-editing by the crowd
Systems allowing a large crowd of mostly amateurs to correct the output of machine translation systems, often with the aim of improving the system’s accuracy.
• Asia Online’s Wikipedia translation project
• Google Translate allows anonymous users to correct the outputs produced by the system
• Likewise for Microsoft’s Bing Translator
Is this REALLY New?
Weren’t Terminology Databases, Translation Memories and Translation Workflow Systems already collaborative?
• Yes, but…
• … Collaborative Translation is about using these kinds of groupware technologies in the context of much larger groups or communities, where people have fewer reasons to trust each other a priori.
It’s one thing to open yourself to collaboration with colleagues and customers.
It’s quite another thing to open yourself to the whole world.
Common issues and Challenges in Collaborative Translation
This is NOT easy
Choosing a flavor and tailoring it to your needs is still somewhat of a black art, guided by trial and error.
Many important and poorly understood issues arise, many of which are common to most flavors:
• Alignment with business goals
• Quality control
• Crowd motivation
• Proper role of professionals
Alignment with business goals
Why are you doing this in the first place? Which flavor can deliver what you want?
The actual benefit you get from a given flavor is not necessarily what you think!
Translation crowdsourcing
• Reduce cost? – Yes, but not the biggest benefit.
• Decrease lead time? – Definitely.
• Translation more in tune with the target audience’s idiosyncrasies? – Also.
• Most importantly: increase brand loyalty by engaging end-users as co-creators of products, instead of passive consumers.
Quality Control
How do you control quality when you open yourself to contributions from a potentially large group of “outsiders”?
Many ways:
• Screen contributors before letting them in (ex: Translators without Borders, Kiva.org).
• Have members of the community vote on the quality of each other’s work (ex: Facebook, Translated.net).
• Have in-house professionals revise the work done by the community (ex: Facebook).
Quality Control (2)
Do not assume that the quality of community-produced content will be lower.
For instance, Wikipedia has been shown to measure up to professionally produced encyclopedias like Britannica (English) and Brockhaus (German).
Quality issues tend to iron themselves out, provided that you attract a sufficiently large number of the right people.
Wisdom-of-crowds effects work surprisingly well when the following conditions are met:
• Diversity
• Independence
• Aggregation
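The community-voting approach to quality control, together with the aggregation condition above, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names `aggregate_votes` and `pick_best`, the 1 to 5 score scale, the use of the median as the aggregation rule, and the `min_voters` threshold are hypothetical and do not describe how Facebook, Translated.net, or any other site actually implements voting.

```python
from collections import defaultdict
from statistics import median


def aggregate_votes(votes):
    """Return {candidate: (median_score, voter_count)}.

    `votes` is an iterable of (candidate, voter_id, score) tuples.
    Each voter counts once per candidate (their last vote wins).
    """
    per_candidate = defaultdict(dict)
    for candidate, voter_id, score in votes:
        per_candidate[candidate][voter_id] = score
    return {
        candidate: (median(scores.values()), len(scores))
        for candidate, scores in per_candidate.items()
    }


def pick_best(votes, min_voters=3):
    """Return the highest-scored candidate with enough voters, or None."""
    summary = aggregate_votes(votes)
    eligible = {c: s for c, (s, n) in summary.items() if n >= min_voters}
    if not eligible:
        return None
    return max(eligible, key=eligible.get)


# Hypothetical votes on three candidate translations of a "Save" button.
votes = [
    ("Enregistrer", "ana", 5), ("Enregistrer", "ben", 4), ("Enregistrer", "cy", 5),
    ("Sauver", "ana", 2), ("Sauver", "ben", 3), ("Sauver", "dee", 1),
    ("Sauvegarder", "eve", 5),  # only one voter: not trusted yet
]
print(pick_best(votes))
```

The median is used here because it is less sensitive to a single extreme vote than the mean, and the minimum number of voters is a crude stand-in for the "sufficiently large number of the right people" condition.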
Crowd motivation
If you are to attract and retain enough of the “right people”, you need to understand why they might contribute.
• Mandated by management (ex: Agile translation teamware)
• Emotional bond with the content (ex: Facebook, and surprisingly, Adobe)
• Prestige of the content (ex: TED Talks)
• Wanting to do good (ex: Translators without Borders, Kiva, Haiti Earthquake Mission, data collection for scientific research)
• Pride in one’s native language (ex: data collection for R&D in MT for low-density languages)
• Trying to perfect second-language skills
• Trying to make a go of a professional translation career (ex: Kiva.org)
• And in some cases, $$$
  – Will this be the dominant scenario?
  – How to set compensation high enough to attract good contributors, but not so high that it interferes with more intrinsic motivations, or attracts people out to game the system?
Role of professionals
Some flavors of CT are designed specifically for professionals (ex: Agile translation teamware, Online translator marketplaces).
But some (ex: Translation crowdsourcing) tend to de-emphasize their role.
When should professionals be involved, and what should be their role?
• Revise work done by amateurs?
  – Focus on more challenging aspects of translation like terminology, style, fluidity?
• Manage and coach the crowd?
• Focus on more mission-critical and hard-to-translate content?
Translation Crowdsourcing may actually increase the size of the pie, by making it possible to tackle content and/or small languages that would otherwise not have been dealt with.
Capturing Collaborative Translation best practices in the form of Design Patterns
Wanted: Best Practices
Collaborative Translation presents practitioners with a varied and complex range of approaches and technologies.
Selecting a flavor and tuning it to meet your needs is complex.
We need some sort of concise, easy-to-consult repository of best practices for that field.
We propose a way to create such a repository collaboratively, as a community, in the form of a design pattern language.
About Design Patterns
A format for describing a common solution to a common problem in a given field.
Originally used in Architecture, but since then adopted in other fields such as Software Engineering, Education, etc.
Design Patterns Example
Publish Contributions Rapidly
Context: This pattern is useful for motivating contributors in any collaborative translation context, but it is particularly useful in translation crowdsourcing scenarios.
Problem: Contributors are often motivated by a desire to have a positive impact on the community they are participating in. However, they cannot achieve this sense of being useful if their contributions do not become available to the rest of the community in a reasonable amount of time.
Solution: Therefore, minimize the delay between the moment when a member of the community contributes to the site and the moment when that contribution becomes publicly available to the rest of the community. Ideally, the contribution should become visible to the rest of the community as soon as the user clicks on the Save button.
Design Patterns Example (2)
Related patterns
  – Point System is another way for contributors to get a sense of how useful they have been to the community.
  – Campaign Progress Gauge is another practice which allows members of the community to see the positive impact of their actions. The main difference is that it operates at a community/project level rather than at an individual/contribution level.
Real-life examples
  – At Facebook, translations become available in a matter of hours.
  – In the context of software localization by the crowd, Adobe makes a conscious effort to wrap the community’s translations into every new release of the product.
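The pattern format used in this example (name, context, problem, solution, related patterns, real-life examples) can be captured as a small data structure. The following is only a sketch: the class name `DesignPattern`, its field names, and the `render` method are hypothetical, and nothing here reflects how the patterns are actually stored on the wiki.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DesignPattern:
    """One best practice, using the sections shown on the example slides."""
    name: str
    context: str
    problem: str
    solution: str
    related_patterns: List[str] = field(default_factory=list)
    real_life_examples: List[str] = field(default_factory=list)

    def render(self) -> str:
        """Format the pattern as plain text, skipping empty sections."""
        sections = [
            ("Context", self.context),
            ("Problem", self.problem),
            ("Solution", self.solution),
            ("Related patterns", "; ".join(self.related_patterns)),
            ("Real-life examples", "; ".join(self.real_life_examples)),
        ]
        lines = [self.name]
        lines += [f"{title}: {body}" for title, body in sections if body]
        return "\n".join(lines)


# Abbreviated version of the example pattern from the slides.
pattern = DesignPattern(
    name="Publish Contributions Rapidly",
    context="Motivating contributors, especially in translation crowdsourcing.",
    problem="Contributors do not feel useful if their work appears late.",
    solution="Minimize the delay between contribution and publication.",
    related_patterns=["Point System", "Campaign Progress Gauge"],
    real_life_examples=["Facebook", "Adobe"],
)
print(pattern.render())
```

Keeping related patterns as a list of names makes it easy to cross-link entries, which is the main thing that turns a set of isolated patterns into a pattern language.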
TAUS Roundtable on Collaborative Translation
Wiki “Barn Raising” workshop held on October 12th, 2011 at Localization World in Santa Clara.
12 practitioners
• One third with hands-on experience of CT (NRC, Adobe, Symantec, Kiva, World Wide Lexicon)
• Two thirds with no experience, but a strong interest in trying it (In Every Language, MemSource, Firma 8, SPIL Games)
Talks by the experienced users about what worked and what didn’t.
Followed by brainstorming on what the recurring best practices seem to be.
The Best Practices
End result:
=> 50+ best practices organized into 6 themes
Planning and Scoping: Translation as User Engagement, Align Stakeholder Expectations, Early and Continuous Clarification of Translator Expectations, Backup Plan, Project Check Points, Appoint Initial Community Manager, Clear Objectives, Identify Compatible Content
Community Motivation: Campaign Progress Gauge, Contributor Recognition, Leader Board, Official Certificate, Point System, Offer Double Points, Hand-Out Unique Branded Products, Contributor of the Month, Grant Special Access Rights, Playful Casual Translation, Campaign, Publish Contributions Rapidly, Playful Competition Among Contributors
The Best Practices (2)
Quality: Content-Specific Testing, Entry Exam, Peer Review, Automatic Reputation Management, Random Spot-Checking, Revision Crowdsourcing, Users as Translators, Voting, Transparent Quality Level, Publish then Revise
Contributor Career Path: Flexible Contributor Career Path, Lurker to Contributor Transition, Anonymous Translation, Find the Leaders, Support Variable Levels of Involvement, Community Manager, Content Prioritizer
Right Sizing: Appropriate Chunk Size, Community-Appropriate Project Size, Break Up Crowd Into Teams, Require Minimal Involvement Level, Keep the Crowd Small, Volunteer Team Leaders
The Best Practices (3)
Tools and Processes: Hint at Content Priority, First In, First Out, Task Self Selection, Layered Fallbacks, Official Linguistic Resources, Automatic Suggestions, Provide Context, In-Place Translation, Community Forum, Analytics for Content Prioritization, Simplicity First, Good Examples of Contributions, Encourage Self-Set Deadlines
Some Observations
The bulk of the practices relate to Translation Crowdsourcing.
=> We need to spend more time capturing practices for other flavors of Collaborative Translation.
The bulk of the practices so far are not specific to translation.
• They would be useful in the context of crowdsourcing efforts in any domain.
• Maybe all we need is to codify and/or learn about the best practices for crowdsourcing in general?
The more similar two organizations are, the more similar their practices will be (ex: Kiva and TWB, versus Kiva and Adobe).
Conclusion
Collaborative Translation presents practitioners with a very large and varied set of tools and processes.
Choosing a particular flavor of CT and tailoring it to meet one’s needs can be a daunting task.
We need a concise, easy-to-consult, modular compendium of current best practices in that area.
We have started building such a compendium in the form of a wiki site (www.collaborative-translation-patterns.com) which captures best practices in the form of design patterns.
We invite everyone in this room to contribute to it if they can.