2013 GALA Miami: Breaking into Latin American Markets on a Small Budget

The Latin American market is made up of several distinct Spanish dialects. A company that really wants to reach a specific audience in Latin America must use the right dialect. But how can marketing materials be translated into four or five Spanish dialects without dramatically increasing costs? This session will discuss how a joint effort to create an MT engine for translating international Spanish into specific Latin American dialects (Spanish for Argentina, Chile, Colombia, Mexico, and Puerto Rico) made this challenge feasible, economical, and replicable.

  • 1. An MT Case Study: Breaking into Latin American Markets on a Small Budget
    María Azqueta (SeproTec) & Diego Bartolomé (tauyou)
  • 2. Spanish Worldwide
    Spanish Language:
    • Also known as Castellano.
    • Latin-derived Romance language.
    • Spanish is one of the six official languages of the United Nations and an official language of the European Union.
  • 3. Spanish Worldwide
  • 4. Spanish Worldwide
    [Bar chart, native speakers in millions (axis 0 to 1200): Mandarin Chinese 955, Spanish 407, English 360, Hindi/Urdu 311]
    Second most spoken language by number of native speakers
  • 5. Spanish Worldwide
    • For demographic reasons, the percentage of the world's population that speaks Spanish as a native language is increasing, while the percentage of Chinese and English speakers is decreasing.
    • Within three or four generations, […]% of the world's population will communicate in Spanish.
    • In […]5[…], the United States will be the world's foremost Spanish-speaking country.
  • 6. Spanish on the Internet
    • Spanish is the third most widely used language on the Net.
    • The use of Spanish on the Net has experienced a growth rate of 807.4% between 2000 and 2011.
    • Spain and Mexico are among the 20 countries with the highest number of internet users.
    • The demand for documents in Spanish is the fourth largest from among the world's languages.
  • 7. Spanish Worldwide and its Differences
    High demand for translations into Spanish.
    But… is the same Spanish spoken everywhere?
  • 8. Spanish Worldwide and its Differences
    RAE (Royal Spanish Academy):
    – Created in the 18th century, it is widely seen as the arbiter of what is considered standard Spanish.
    – It produces authoritative dictionaries and grammar guides.
    – Although its decisions are not formally binding, they are widely followed in both Spain and Latin America.
  • 9. Spanish Worldwide and its Differences
    Different dialects and many differences:
    Lexical variations
    Grammatical differences
    Idioms
  • 10. Spanish Worldwide and its Differences
    Market Trend:
    ‘Neutral’ or ‘International’ Spanish
    Latin American Spanish & European Spanish
  • 11. Why Adapt to the Local Spanish of Each Country?
    To reach different markets
    People are most likely to buy when a product is advertised in their dialect
  • 12. Why Adapt to the Local Spanish of Each Country?
    Client A (Gaming Industry)
    EN: Take a card from the deck
    ES: Coge una carta de la baraja
  • 13. Why Adapt to the Local Spanish of Each Country?
    ES: Coge una carta de la baraja
    AR: Agarrá una carta del mazo
    CL: Toma una carta del naipe
    CO: Coge una carta de la baraja
    MX: Saca una carta de la baraja
    PR: Coge una carta de la baraja
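  • 14. Why Adapt to the Local Spanish of Each Country?
    Coger (32 entries), http://rae.es/rae.html
    1. tr. Asir, agarrar o tomar. U. t. c. prnl. (transitive: to grasp, seize, or take; also used pronominally)
    31. intr. vulg. Am. Realizar el acto sexual (intransitive, vulgar, Americas: to have sexual intercourse)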
  • 15. Advise Clients
    If you really want to break into a specific market, you must decide which country you want to target and localize your material for the different Spanish dialects spoken in each individual country.
  • 16. The Main Problems Clients Face
  • 17. Is there a cost-efficient solution on the market?
  • 18. tauyou MT Solution at SeproTec
    Hybrid machine translation since January 2011
    Languages: EN, ES, PT, GA, FR, IT…
    Domains: Legal, Technical…
    Glossaries and forbidden words lists
    Average translated words per month: 700,000
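Slide 18 names forbidden words lists as a feature of the engine but does not show how they are applied. As a rough sketch only, such a check might scan localized output for terms that should not appear in a given market, for example forms of coger (slide 14). The locale codes, the `FORBIDDEN` table, and the `find_forbidden` helper are hypothetical; the choice of locales is an assumption drawn from slide 13, where the AR, CL, and MX versions avoid the verb.

```python
import re

# Hypothetical forbidden-word lists per locale. Only the "coger" entries are
# motivated by the slides; a real list would be maintained by linguists.
FORBIDDEN = {
    "es-AR": {"coge", "coger"},
    "es-CL": {"coge", "coger"},
    "es-MX": {"coge", "coger"},
}


def find_forbidden(text, locale):
    """Return the forbidden words found in text for the given locale, in order."""
    banned = FORBIDDEN.get(locale, set())
    words = re.findall(r"\w+", text.lower())
    return [word for word in words if word in banned]


print(find_forbidden("Coge una carta de la baraja", "es-MX"))
print(find_forbidden("Coge una carta de la baraja", "es-CO"))
```

A check like this would only flag text for a post-editor; choosing the replacement is the job of the glossary.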
  • 19. Initial Brainstorming
    MT from EN > different ES dialects
    Extensive post-editing would be required
  • 20. Final Scope of the Project
    Human translation + revision: English > Spanish (Spain)
    MT of Spanish (Spain) into Spanish from:
    • Argentina
    • Chile
    • Colombia
    • Mexico
    • Puerto Rico
  • 21. Initial Approach for Latin American MT
    Traditional Workflow
    1. Gather translation memories (EN → ES-XX)
    2. Add generic material
    3. Develop engine
    4. Add linguistic pre- and post-processing
    5. Improve quality over time
  • 22. Drawbacks
    Varying MT Quality: depending on the domain and dialect
    Initial Inconsistencies among Dialects: handled with glossaries
    Medium Post-Editing Effort: could be improved over time
  • 23. New Approach
    Translate EN to Standard ES: via standard high-quality human translation
    Convert Standard ES to Latin American Variants: from Spanish to Spanish
    Better final quality is achieved
  • 24. Specifications
    Countries: Argentina, Chile, Colombia, Mexico, Puerto Rico
    Internal Glossaries to Handle Lexical Variations
    It corrects discordance
    Idioms
    Grammatical Differences
    It adapts verb tenses
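The Spanish-to-Spanish conversion described on slides 23 and 24 can be pictured with a minimal glossary-substitution sketch. This is an illustration under stated assumptions, not the tauyou engine: the `GLOSSARIES` table, the locale codes, and the `localize` function are hypothetical, and the entries come only from the card-game sentence on slide 13. Multi-word entries such as "de la baraja" → "del mazo" show one simple way to keep article and noun in agreement when the noun's gender changes.

```python
import re

# Illustrative glossaries built only from the example sentence on slide 13.
# Keys are Spanish (Spain) phrases in lowercase; values are local equivalents.
GLOSSARIES = {
    "es-AR": {"coge": "agarrá", "de la baraja": "del mazo"},
    "es-CL": {"coge": "toma", "de la baraja": "del naipe"},
    "es-CO": {},
    "es-MX": {"coge": "saca"},
    "es-PR": {},
}


def localize(text, locale):
    """Replace glossary phrases in text with their equivalents for locale."""
    glossary = GLOSSARIES[locale]
    if not glossary:
        return text
    # Try longer phrases first so multi-word entries win over single words.
    phrases = sorted(glossary, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(p) for p in phrases) + r")\b",
        re.IGNORECASE,
    )

    def swap(match):
        source = match.group(0)
        target = glossary[source.lower()]
        # Keep sentence-initial capitalization.
        if source[0].isupper():
            return target[0].upper() + target[1:]
        return target

    return pattern.sub(swap, text)


for code in GLOSSARIES:
    print(code, localize("Coge una carta de la baraja", code))
```

Plain substitution like this covers lexical variation only; the verb-tense adaptation mentioned on slide 24 would need grammatical rules beyond a lookup table.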
  • 25. Testing the Prototype Engine
    Extraction of several texts (fashion, real estate, human resources, automobile)
    Sent to linguists and/or translators in each target country for localization
    Performance of the same localizations by the engine
    Comparison and contrasting of human and machine localization results
  • 26. First Bug Report
    Human vs. Machine
    Not all terms were localized
    Concordance issues (masc./fem.; sing./pl.)
    Verbal tenses for Argentina
    MT: 7.78% error rate
  • 27. First Bug Report
    Some terms were changed/localized by the engine, but not by the humans. (example)
    Human error or MT error?
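The slides do not say how the error rate was measured. One plausible way to score the human-versus-machine comparison from slides 25 to 27 is a token-level mismatch rate, sketched below. The `token_error_rate` function and its definition (unmatched tokens divided by the longer sentence length) are assumptions made for illustration and may differ from the metric behind the reported figures.

```python
from difflib import SequenceMatcher


def token_error_rate(human, machine):
    """Share of tokens that differ between a human and a machine localization."""
    ref = human.lower().split()
    hyp = machine.lower().split()
    if not ref and not hyp:
        return 0.0
    matcher = SequenceMatcher(None, ref, hyp, autojunk=False)
    # Count tokens that appear in the same order in both versions.
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return 1.0 - matched / max(len(ref), len(hyp))


# Human AR localization vs. a hypothetical engine output that localized
# the verb but missed "de la baraja".
rate = token_error_rate("Agarrá una carta del mazo", "Agarrá una carta de la baraja")
print(f"{rate:.2%}")
```

As slide 27 points out, a mismatch does not say which side is wrong, so a score like this only locates the differences that someone then has to review.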
  • 28. Testing the Prototype Engine
    A glossary was created by extracting the terms localized by the linguists/translators.
    This glossary was then sent to the same people who localized the texts to verify that all the terms were correctly localized and nothing was missing.
  • 29. Testing the Prototype Engine
    The glossary grew by 36.91%!
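The glossary extraction on slide 28 and the growth figure on slide 29 can be sketched as follows. This is a hedged illustration: `extract_terms` and `growth_percent` are hypothetical helpers, the token-level diff is an assumed extraction method, and the entry counts in the usage line are made-up round numbers, not the project's data.

```python
from difflib import SequenceMatcher


def extract_terms(source, localized):
    """Collect source -> localized phrase pairs where the two texts differ."""
    src = source.lower().split()
    loc = localized.lower().split()
    matcher = SequenceMatcher(None, src, loc, autojunk=False)
    terms = {}
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # A "replace" span is a phrase the linguist changed for the local market.
        if tag == "replace":
            terms[" ".join(src[i1:i2])] = " ".join(loc[j1:j2])
    return terms


def growth_percent(before, after):
    """Percentage growth of a glossary from before to after entries."""
    return (after - before) / before * 100


print(extract_terms("Coge una carta de la baraja", "Agarrá una carta del mazo"))
print(growth_percent(100, 125))  # hypothetical entry counts
```

Pairs extracted this way still need the human verification pass described on slide 28, since a diff cannot tell a deliberate localization from a stylistic rewrite.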
  • 30. Testing the Prototype Engine
    People can miss things.
    Although many different variants of Spanish exist, Spanish speakers understand many terms that are foreign to their own dialect when they read them in context, sometimes to the point of accepting them as their own. I believe that this may be due to the phenomenon of globalization and the internet.
  • 31. Latest Bug Report
    MT: 1.21% error rate
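To put the two reported figures side by side, the drop from 7.78% (slide 26) to 1.21% (slide 31) can be expressed as a relative reduction. The `relative_reduction` helper is only an illustrative name; the two inputs are the figures from the slides.

```python
def relative_reduction(first, latest):
    """Relative drop from first to latest, as a percentage of first."""
    return (first - latest) / first * 100


# Error rates reported in the first and latest bug reports.
print(f"{relative_reduction(7.78, 1.21):.1f}%")  # prints 84.4%
```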
  • 32. Achievements
    Very little post-editing needed
    Reduced error rate
    Shortened deadlines
    Significant cost reduction
  • 33. Conclusions
    Human localization is not perfect.
    MT is not perfect either.
    Combining human and machine translation helps achieve high quality and reduce cost.
  • 34. Further Work
    Improving Glossaries: through a simple web interface for PE
    Extending Spanish Language Coverage: more dialects, Traductor.cervantes.es
    Incorporating more languages: English, French and Portuguese
  • 35. Bibliography
    Yule, G. (2006). The Study of Language (3rd ed.). New York: Cambridge University Press.
    RAE
    Instituto Cervantes
    http://www.linguapress.com
  • 36. THANK YOU FOR YOUR TIME!
    María Azqueta, mazqueta@seprotec.com
    Diego Bartolomé, diego.bartolome@tauyou.com