    Translation Technology Platform

Kirti Vashee
VP Sales, Asia Online
Kirti.vashee@asiaonline.net
The Consumer Market (Large Buyer & Publisher Perspective)
• Revolutionize the Internet experience for non-English speakers in Asia
• Provide 1 billion+ local-language pages online using mostly translated
  open license content, combined with compelling portal and social
  networking style services in Thailand, Indonesia, India, Malaysia,
  Philippines, Vietnam and China, Japan & Korea

The Enterprise Market (Translation Tools Vendor Perspective)
• Revolutionize the enterprise translation process with a comprehensive,
  continuous learning SMT platform
• SaaS environment for data cleaning and preparation, on-demand SMT engine
  development, and ongoing comprehensive post-editing and correction to
  continuously improve engines
      TM                                                      Copyright © 2008, All Rights Reserved
• The only SMT technology provider that is also a major user of
  ALT technology on one of the largest translation projects in the
  world: English Wikipedia (1B+ words) into 11 Asian languages
  using SMT and crowdsourcing
• The translation tools and technology platform used to
  accomplish this are also being made available as a SaaS
  product for the enterprise translation market




Battlefield of words
Fusion with customer support
Continuous translation
Community translation
Industry-shared language data
Massive online collaboration
Translation automation


[Diagram labels: User Manuals, Support Documentation, Knowledge Base Data, User Generated Content, Interactive Support (Email, Instant Messaging, Voice, Blogs)]


        •   Web 2.0 is much more interactive and dynamic
        •   Globalization will be further driven by internet penetration into Asia
        •   Word-of-mouth marketing is gaining prominence all over the world
        •   Unstructured content in blogs and review sites is becoming critical
        •   The dialogue with global customers needs to be more interactive
Continuous Improvement HDSMT Engines
[Diagram labels. Departments: Sales / Marketing, Product Management, Content Management, Customer Support. Systems and channels: Blogs, CRM, Biz Intelligence, Human Resources, ECM, BPM, CRM, Email, IM. Target: The Global Customer]




   •   Highly adaptive, human-driven process for continuous output quality
       improvement in SMT engines and translation automation
   •   Intensive collaboration with human translators to raise the quality of SMT
   •   Integration with content creation and content refinement tools to enhance
       speed and improve business process management
   •   Continued evolution in standards to facilitate sharing of linguistic assets
• Comprehensive SaaS Platform that facilitates the
  translation and continued refinement of any large high
  value translatable corpus using HDSMT
• Existing Feature Set
   –   Data Cleaning & Preparation Tools
   –   On Demand SMT engine development
   –   Support for both user created and online dictionaries and glossaries
   –   Ability to pool data for greater leverage
   –   Multiple level domain support
   –   Seamless integration with collaborative post-editing environment
   –   Real time updates of translated assets
   –   Web Services based APIs for integration
• System and process foundation for managed online
  community collaboration

Data Management
• Bilingual Data Preparation & Cleaning
• Bilingual Data Normalization & Optimization
• Source Cleanup and Preparation
• Grammar and Spelling validation
• Monolingual Data Extraction & Analysis

SMT Engine
• SMT System Training & Development
• Monolingual Data Training
• Ongoing Corpus Refinement and Tuning
• Analysis and Evaluation of Ngrams

Output Proofing & Editing
• Error Pattern Identification & Correction
• Automated error correction tools
• Continuing Cycle of Exception Identification and Correction
• Development of small sets of new data to correct errors
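
The slide does not say how the "Analysis and Evaluation of Ngrams" step is carried out. As a rough illustration only, the sketch below counts n-grams over an already tokenized sentence, which is the basic building block such an analysis would likely start from. The function name `ngram_counts` and the whitespace tokenization are assumptions made for this example, not part of the platform; many Asian languages would need a word segmentation step first.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count every contiguous run of n tokens in the token list."""
    # range() is empty when the sentence is shorter than n, so the
    # result is simply an empty Counter in that case.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


# Hypothetical input: whitespace tokenization is an assumption of this sketch.
tokens = "the cat sat on the cat".split()
bigrams = ngram_counts(tokens, 2)
most_common_bigram, frequency = bigrams.most_common(1)[0]
```

Frequent n-grams in raw output that never appear in trusted reference text are one plausible signal for the error patterns listed under Output Proofing & Editing.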




• Data Cleaning Utilities to normalize and standardize data
  prior to consolidation to provide maximum leverage
• Recent study for TAUS proves conclusively that sharing
  clean data provides leverage
   – A smaller amount of clean data can produce better results than
     datasets even 2X larger
   – Consistent terminology matters and provides real leverage
   – Data optimized for TM tools can be "dirty data" for SMT
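
To make "normalize and standardize data prior to consolidation" concrete, here is a minimal sketch of the kind of cleaning pass that might be applied to bilingual segment pairs before they are pooled. It is not the platform's implementation: the function names, the character-based length-ratio filter, and the threshold of 3.0 are all assumptions chosen for illustration.

```python
import unicodedata


def normalize_segment(text):
    """Apply Unicode NFC normalization and collapse runs of whitespace."""
    return " ".join(unicodedata.normalize("NFC", text).split())


def clean_pairs(pairs, max_length_ratio=3.0):
    """Return normalized, de-duplicated (source, target) pairs.

    Pairs with an empty side, or whose character lengths differ by more
    than max_length_ratio (an assumed threshold), are dropped.
    """
    seen = set()
    cleaned = []
    for source, target in pairs:
        source, target = normalize_segment(source), normalize_segment(target)
        if not source or not target:
            continue  # one side is empty after normalization
        ratio = max(len(source), len(target)) / min(len(source), len(target))
        if ratio > max_length_ratio:
            continue  # likely misaligned pair
        if (source, target) in seen:
            continue  # duplicate once whitespace is normalized
        seen.add((source, target))
        cleaned.append((source, target))
    return cleaned


# Hypothetical sample data for the sketch.
raw_pairs = [
    ("Hello  world", "Bonjour le monde"),
    ("Hello world", "Bonjour le monde"),
    ("", "orphan target"),
    ("Hi", "A very long and unrelated sentence here"),
]
clean = clean_pairs(raw_pairs)
```

A character-length ratio is a crude alignment check and would need a different threshold for language pairs with very different scripts.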




[Cycle diagram, steps in order]
• Initial System put into production
• Trained Internal Experts begin initial clean up and correction process
• Expert Users also allowed to make changes
• Community: All users allowed to suggest changes, which go through a vetting process
• Changes are collected and added to initial corpus to drive continuous retraining
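
The cycle above can be sketched as a small routing rule: changes from experts are accepted directly, community suggestions wait for vetting, and everything accepted is collected for the next retraining run. The class names, role strings, and the `approve` callback below are hypothetical and exist only for this illustration; the slides do not describe how vetting is actually implemented.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Assumed role names; the slides only distinguish experts from "all users".
TRUSTED_ROLES = {"internal_expert", "expert_user"}


@dataclass(frozen=True)
class Correction:
    source: str
    corrected_target: str
    author_role: str


@dataclass
class FeedbackLoop:
    corpus_additions: List[Correction] = field(default_factory=list)
    pending_review: List[Correction] = field(default_factory=list)

    def submit(self, correction: Correction) -> None:
        """Accept expert changes directly; queue community suggestions."""
        if correction.author_role in TRUSTED_ROLES:
            self.corpus_additions.append(correction)
        else:
            self.pending_review.append(correction)

    def vet(self, approve: Callable[[Correction], bool]) -> None:
        """Move approved suggestions into the corpus and discard the rest."""
        for correction in self.pending_review:
            if approve(correction):
                self.corpus_additions.append(correction)
        self.pending_review.clear()


loop = FeedbackLoop()
loop.submit(Correction("source A", "fixed A", "internal_expert"))
loop.submit(Correction("source B", "fixed B", "community"))
loop.submit(Correction("source C", "   ", "community"))
queued_before_vetting = len(loop.pending_review)
# Placeholder vetting rule: reject suggestions with a blank correction.
loop.vet(lambda c: bool(c.corrected_target.strip()))
```

After vetting, `corpus_additions` holds the data that would be added to the initial corpus for continuous retraining.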




[Chart labels: Initial System; Spelling & Terminology; Targeted Corrections of Bad Learning]
[Output categories: Correct, Mistranslation, Syntax/Grammar, Terminology, Spelling, Punctuation]
Human Feedback can raise the raw output to previously unseen quality levels
Information Requests
• GetAccountInformation
• GetAccountUsageHistory
• GetAvailableDomainCombinationsForLanguagePair
• GetAvailableDomainsForLanguagePair
• GetAvailableLanguagePairs
• GetCustomDomainsForLanguagePair

Data Storage
• CreateDataset
• DeleteDataset
• DeleteDataFromDataset
• DownloadDataset
• DownloadDatasetItem
• GetDatasetList
• GetDatasetItemList
• LinkDataToDataset
• MergeDatasets
• UploadData
• UploadGlossary
• UploadImage
• UploadLanguageModel
• UploadMonolingualText
• UploadOCRPageLayout
• UploadPhrasePairs
• UploadTranslationMemory
• UploadZIP

Data Training
• CancelTrainingJob
• GetTrainingJobList
• GetTrainingJobStatus
• SubmitDatasetForTraining

Data Preparation
• CleanText
• ExtractText
• NormalizeText
• OCRImage
• ParagraphAlignLanguagePairText
• SentenceAlignLanguagePairText
• SentenceSegmentText
• SpellCheckText
• WordSegmentText

Translation
• CancelTranslationJob
• GetTranslationJobList
• GetTranslationJobStatus
• SubmitDatasetForTranslation
• SubmitSinglePhraseForTranslation

Parameter           Type      Description
sUsername           String    The username of the person making the request.
sPassword           String    The password of the person making the request.
iAccountNo          Integer   The account number that this request is associated with.
iDepartmentNo       Integer   The department number that this request is associated with.
iLanguagePairCode   Integer   The code for the language pair that is being looked up.
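
The slide lists operation names and one set of parameters, but not the transport, endpoint, or response format. The sketch below therefore only shows how a client might assemble and type-check the five listed parameters before sending them. The `build_request` helper, the shape of the returned dictionary, and the pairing of these parameters with `GetAvailableDomainsForLanguagePair` are assumptions for illustration, not the documented API.

```python
# Parameter names and types as listed on the slide (String -> str, Integer -> int).
REQUEST_PARAMETERS = {
    "sUsername": str,
    "sPassword": str,
    "iAccountNo": int,
    "iDepartmentNo": int,
    "iLanguagePairCode": int,
}


def build_request(operation, **values):
    """Return a plain dict describing one call (hypothetical envelope shape)."""
    unknown = sorted(set(values) - set(REQUEST_PARAMETERS))
    if unknown:
        raise ValueError(f"unknown parameters: {unknown}")
    missing = [name for name in REQUEST_PARAMETERS if name not in values]
    if missing:
        raise ValueError(f"missing parameters: {missing}")
    for name, expected_type in REQUEST_PARAMETERS.items():
        if not isinstance(values[name], expected_type):
            raise TypeError(f"{name} must be of type {expected_type.__name__}")
    return {
        "operation": operation,
        "parameters": {name: values[name] for name in REQUEST_PARAMETERS},
    }


# Assumed pairing of operation and parameters; credentials are placeholders.
request = build_request(
    "GetAvailableDomainsForLanguagePair",
    sUsername="demo-user",
    sPassword="demo-password",
    iAccountNo=1001,
    iDepartmentNo=7,
    iLanguagePairCode=42,
)
```

Validating on the client side keeps malformed calls from reaching the service; how the dictionary is serialized would depend on the actual Web Services interface.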


[Workflow diagram]
• Publishers: provide existing human translated content for training language engines; leverage ASP translation service for translation of new material; new translations are sent back to the publisher
• Translation Systems: constant improvement
• Translation SaaS and Asia Online Portal: original content is translated to the local language, and translated content is made available to users
• User: accesses online content in local language
• Social Networks / Community: translated content is proof read using community principles and paid proof readers using the Asia Online proof reading system
• Human Proof Readers: translations are proof read via the ASP proof reading system; proof reading is still required whether human or machine translation

• Integrated data cleaning, data preparation, SMT systems
  development and post-editing environment
• Comprehensive proof-reading and post-editing environment
  that is integrated with core SMT engines to enable instant
  updates (result: greater control and better systems)
• Greater transparency of many key SMT building blocks to
  enable users to see and modify what the system has learnt,
  resulting in greater control and better systems
• A richer and deeper taxonomy for domains to ensure the best
  quality (result: better systems)
• Incremental additions of new training data to any existing
  system to enable rapid updates (result: faster updates)
• Easy handling of terminology, glossaries, dictionaries
Kirti Vashee
VP Sales, Asia Online
kirti.vashee@asiaonline.net

More Related Content

What's hot

ScaleBase Webinar: Methods and Challenges to Scale Out a MySQL Database
ScaleBase Webinar: Methods and Challenges to Scale Out a MySQL DatabaseScaleBase Webinar: Methods and Challenges to Scale Out a MySQL Database
ScaleBase Webinar: Methods and Challenges to Scale Out a MySQL DatabaseScaleBase
 
Intel IT Cloud Strategy
Intel IT Cloud StrategyIntel IT Cloud Strategy
Intel IT Cloud Strategytdwiindia
 
France rediscover informix support
France   rediscover informix supportFrance   rediscover informix support
France rediscover informix supportFranckThomas
 
Complex Er[jl]ang Processing with StreamBase
Complex Er[jl]ang Processing with StreamBaseComplex Er[jl]ang Processing with StreamBase
Complex Er[jl]ang Processing with StreamBasedarach
 
AWS Partner Presentation - Riverbed
AWS Partner Presentation - RiverbedAWS Partner Presentation - Riverbed
AWS Partner Presentation - RiverbedAmazon Web Services
 
Master agile development and testing
Master agile development and testingMaster agile development and testing
Master agile development and testingvmglover
 
Etendez votre datacenter avec aws v4
Etendez votre datacenter avec aws v4Etendez votre datacenter avec aws v4
Etendez votre datacenter avec aws v4Amazon Web Services
 
Introduction to Enterprise Cloud Economics
Introduction to Enterprise Cloud EconomicsIntroduction to Enterprise Cloud Economics
Introduction to Enterprise Cloud EconomicsEverest Group
 
Managing content in_a_mobile_world
Managing content in_a_mobile_worldManaging content in_a_mobile_world
Managing content in_a_mobile_worldQuestexConf
 
Training Intro.2009.Gm
Training Intro.2009.GmTraining Intro.2009.Gm
Training Intro.2009.Gmmxnprns
 
BPOS Information
BPOS InformationBPOS Information
BPOS Informationsleyland
 
Ecm mythbusters the_real_story_behind_vendor_marketing
Ecm mythbusters the_real_story_behind_vendor_marketingEcm mythbusters the_real_story_behind_vendor_marketing
Ecm mythbusters the_real_story_behind_vendor_marketingQuestexConf
 

What's hot (17)

ScaleBase Webinar: Methods and Challenges to Scale Out a MySQL Database
ScaleBase Webinar: Methods and Challenges to Scale Out a MySQL DatabaseScaleBase Webinar: Methods and Challenges to Scale Out a MySQL Database
ScaleBase Webinar: Methods and Challenges to Scale Out a MySQL Database
 
Intel IT Cloud Strategy
Intel IT Cloud StrategyIntel IT Cloud Strategy
Intel IT Cloud Strategy
 
France rediscover informix support
France   rediscover informix supportFrance   rediscover informix support
France rediscover informix support
 
Complex Er[jl]ang Processing with StreamBase
Complex Er[jl]ang Processing with StreamBaseComplex Er[jl]ang Processing with StreamBase
Complex Er[jl]ang Processing with StreamBase
 
AWS Partner Presentation - Riverbed
AWS Partner Presentation - RiverbedAWS Partner Presentation - Riverbed
AWS Partner Presentation - Riverbed
 
Iris-Corp's corporate business profile
Iris-Corp's corporate business profileIris-Corp's corporate business profile
Iris-Corp's corporate business profile
 
Master agile development and testing
Master agile development and testingMaster agile development and testing
Master agile development and testing
 
Etendez votre datacenter avec aws v4
Etendez votre datacenter avec aws v4Etendez votre datacenter avec aws v4
Etendez votre datacenter avec aws v4
 
Introduction to Enterprise Cloud Economics
Introduction to Enterprise Cloud EconomicsIntroduction to Enterprise Cloud Economics
Introduction to Enterprise Cloud Economics
 
Managing content in_a_mobile_world
Managing content in_a_mobile_worldManaging content in_a_mobile_world
Managing content in_a_mobile_world
 
Training Intro.2009.Gm
Training Intro.2009.GmTraining Intro.2009.Gm
Training Intro.2009.Gm
 
Avaya Data Network
Avaya Data NetworkAvaya Data Network
Avaya Data Network
 
BPOS Information
BPOS InformationBPOS Information
BPOS Information
 
Ecm mythbusters the_real_story_behind_vendor_marketing
Ecm mythbusters the_real_story_behind_vendor_marketingEcm mythbusters the_real_story_behind_vendor_marketing
Ecm mythbusters the_real_story_behind_vendor_marketing
 
Avaya ip office
Avaya ip officeAvaya ip office
Avaya ip office
 
OSS Business models
OSS Business modelsOSS Business models
OSS Business models
 
Mobility Solution for Life Insurance Enterprises
Mobility Solution for Life Insurance EnterprisesMobility Solution for Life Insurance Enterprises
Mobility Solution for Life Insurance Enterprises
 

Viewers also liked

The Latest Advances in Patent Machine Translation
The Latest Advances in Patent Machine TranslationThe Latest Advances in Patent Machine Translation
The Latest Advances in Patent Machine TranslationIconic Translation Machines
 
ICIC 2014 High volume, High Quality Patent Translation across Multiple Domain...
ICIC 2014 High volume, High Quality Patent Translation across Multiple Domain...ICIC 2014 High volume, High Quality Patent Translation across Multiple Domain...
ICIC 2014 High volume, High Quality Patent Translation across Multiple Domain...Dr. Haxel Consult
 
TAUS Roundtable Moscow, User Empowered Machine Translation, Dion Wiggins, Asi...
TAUS Roundtable Moscow, User Empowered Machine Translation, Dion Wiggins, Asi...TAUS Roundtable Moscow, User Empowered Machine Translation, Dion Wiggins, Asi...
TAUS Roundtable Moscow, User Empowered Machine Translation, Dion Wiggins, Asi...TAUS - The Language Data Network
 
machine translation beginning...
machine translation beginning...machine translation beginning...
machine translation beginning...Muneeb Khan
 
Experiments with Different Models of Statistcial Machine Translation
Experiments with Different Models of Statistcial Machine TranslationExperiments with Different Models of Statistcial Machine Translation
Experiments with Different Models of Statistcial Machine Translationkhyati gupta
 
ICIC 2013 Conference Proceedings Richard Garner (LexisNexis)
ICIC 2013 Conference Proceedings Richard Garner (LexisNexis)ICIC 2013 Conference Proceedings Richard Garner (LexisNexis)
ICIC 2013 Conference Proceedings Richard Garner (LexisNexis)Dr. Haxel Consult
 
User Empowered Machine Translation. Dion Wiggins, Asia Online
User Empowered Machine Translation. Dion Wiggins, Asia OnlineUser Empowered Machine Translation. Dion Wiggins, Asia Online
User Empowered Machine Translation. Dion Wiggins, Asia OnlineABBYY Language Serivces
 
Weaving Dataflows with Silk - ScalaMatsuri 2014, Tokyo
Weaving Dataflows with Silk - ScalaMatsuri 2014, TokyoWeaving Dataflows with Silk - ScalaMatsuri 2014, Tokyo
Weaving Dataflows with Silk - ScalaMatsuri 2014, TokyoTaro L. Saito
 
BRANDING — Fashion Institute of Technology Denim Project Presentation — Feb 2...
BRANDING — Fashion Institute of Technology Denim Project Presentation — Feb 2...BRANDING — Fashion Institute of Technology Denim Project Presentation — Feb 2...
BRANDING — Fashion Institute of Technology Denim Project Presentation — Feb 2...BrandingDoneRight
 
Weaving Technology V1
Weaving Technology V1Weaving Technology V1
Weaving Technology V1Nazrul
 
Dual Learning for Machine Translation (NIPS 2016)
Dual Learning for Machine Translation (NIPS 2016)Dual Learning for Machine Translation (NIPS 2016)
Dual Learning for Machine Translation (NIPS 2016)Toru Fujino
 

Viewers also liked (14)

The Latest Advances in Patent Machine Translation
The Latest Advances in Patent Machine TranslationThe Latest Advances in Patent Machine Translation
The Latest Advances in Patent Machine Translation
 
ICIC 2014 High volume, High Quality Patent Translation across Multiple Domain...
ICIC 2014 High volume, High Quality Patent Translation across Multiple Domain...ICIC 2014 High volume, High Quality Patent Translation across Multiple Domain...
ICIC 2014 High volume, High Quality Patent Translation across Multiple Domain...
 
TAUS Roundtable Moscow, User Empowered Machine Translation, Dion Wiggins, Asi...
TAUS Roundtable Moscow, User Empowered Machine Translation, Dion Wiggins, Asi...TAUS Roundtable Moscow, User Empowered Machine Translation, Dion Wiggins, Asi...
TAUS Roundtable Moscow, User Empowered Machine Translation, Dion Wiggins, Asi...
 
machine translation beginning...
machine translation beginning...machine translation beginning...
machine translation beginning...
 
Experiments with Different Models of Statistcial Machine Translation
Experiments with Different Models of Statistcial Machine TranslationExperiments with Different Models of Statistcial Machine Translation
Experiments with Different Models of Statistcial Machine Translation
 
Why MT Matters
Why MT MattersWhy MT Matters
Why MT Matters
 
ICIC 2013 Conference Proceedings Richard Garner (LexisNexis)
ICIC 2013 Conference Proceedings Richard Garner (LexisNexis)ICIC 2013 Conference Proceedings Richard Garner (LexisNexis)
ICIC 2013 Conference Proceedings Richard Garner (LexisNexis)
 
User Empowered Machine Translation. Dion Wiggins, Asia Online
User Empowered Machine Translation. Dion Wiggins, Asia OnlineUser Empowered Machine Translation. Dion Wiggins, Asia Online
User Empowered Machine Translation. Dion Wiggins, Asia Online
 
Weaving Dataflows with Silk - ScalaMatsuri 2014, Tokyo
Weaving Dataflows with Silk - ScalaMatsuri 2014, TokyoWeaving Dataflows with Silk - ScalaMatsuri 2014, Tokyo
Weaving Dataflows with Silk - ScalaMatsuri 2014, Tokyo
 
BRANDING — Fashion Institute of Technology Denim Project Presentation — Feb 2...
BRANDING — Fashion Institute of Technology Denim Project Presentation — Feb 2...BRANDING — Fashion Institute of Technology Denim Project Presentation — Feb 2...
BRANDING — Fashion Institute of Technology Denim Project Presentation — Feb 2...
 
Process control of weaving
Process control of weaving Process control of weaving
Process control of weaving
 
Machine translation
Machine translationMachine translation
Machine translation
 
Weaving Technology V1
Weaving Technology V1Weaving Technology V1
Weaving Technology V1
 
Dual Learning for Machine Translation (NIPS 2016)
Dual Learning for Machine Translation (NIPS 2016)Dual Learning for Machine Translation (NIPS 2016)
Dual Learning for Machine Translation (NIPS 2016)
 

Similar to TAUS Scotland Asia Online Technology Platform V1

MiTiN 2013 Keynote in Detroit Michigan
MiTiN 2013 Keynote in Detroit MichiganMiTiN 2013 Keynote in Detroit Michigan
MiTiN 2013 Keynote in Detroit MichiganKirti Vashee
 
ICIC 2013 New Product Introductions CEPT
ICIC 2013 New Product Introductions CEPTICIC 2013 New Product Introductions CEPT
ICIC 2013 New Product Introductions CEPTDr. Haxel Consult
 
Outsourcing The Next Frontier In Editorial Workflow Shivaji Sengupta
Outsourcing The Next Frontier In Editorial Workflow Shivaji SenguptaOutsourcing The Next Frontier In Editorial Workflow Shivaji Sengupta
Outsourcing The Next Frontier In Editorial Workflow Shivaji SenguptaNXTKey Corporation
 
Tms days 04 2012 manuel herranz pangea mt
Tms days 04 2012 manuel herranz pangea mtTms days 04 2012 manuel herranz pangea mt
Tms days 04 2012 manuel herranz pangea mtManuel Herranz
 
Lexcelera MT Breaking Compromises
Lexcelera MT Breaking CompromisesLexcelera MT Breaking Compromises
Lexcelera MT Breaking CompromisesLoriThicke
 
雲端推動的人工智能革命
雲端推動的人工智能革命雲端推動的人工智能革命
雲端推動的人工智能革命Amazon Web Services
 
How to Purchase Translations and What to Look For in a Supplier
How to Purchase Translations and What to Look For in a SupplierHow to Purchase Translations and What to Look For in a Supplier
How to Purchase Translations and What to Look For in a SupplierResearchShare
 
TAUS OPEN SOURCE MACHINE TRANSLATION SHOWCASE, Seattle, Full Service Enterpri...
TAUS OPEN SOURCE MACHINE TRANSLATION SHOWCASE, Seattle, Full Service Enterpri...TAUS OPEN SOURCE MACHINE TRANSLATION SHOWCASE, Seattle, Full Service Enterpri...
TAUS OPEN SOURCE MACHINE TRANSLATION SHOWCASE, Seattle, Full Service Enterpri...TAUS - The Language Data Network
 
TAUS webinar The Big Picture View On The Translation Industry, March 2013
TAUS webinar The Big Picture View On The Translation Industry, March 2013TAUS webinar The Big Picture View On The Translation Industry, March 2013
TAUS webinar The Big Picture View On The Translation Industry, March 2013TAUS - The Language Data Network
 
Business Process Management and Virtual Worlds
Business Process Management and Virtual WorldsBusiness Process Management and Virtual Worlds
Business Process Management and Virtual WorldsIan Hughes / epredator
 
Machine Translation Master Class at the EUATC Conference by Diego Bartolome
Machine Translation Master Class at the EUATC Conference by Diego BartolomeMachine Translation Master Class at the EUATC Conference by Diego Bartolome
Machine Translation Master Class at the EUATC Conference by Diego Bartolometauyou
 
Talend Open Studio and Hortonworks Data Platform
Talend Open Studio and Hortonworks Data PlatformTalend Open Studio and Hortonworks Data Platform
Talend Open Studio and Hortonworks Data PlatformHortonworks
 
WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...
WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...
WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...Welocalize
 
The Next Generation of Big Data Analytics
The Next Generation of Big Data AnalyticsThe Next Generation of Big Data Analytics
The Next Generation of Big Data AnalyticsHortonworks
 
Webinar: Increase technology Uptake with Software Usage Metering Tools
Webinar: Increase technology Uptake with Software Usage Metering ToolsWebinar: Increase technology Uptake with Software Usage Metering Tools
Webinar: Increase technology Uptake with Software Usage Metering ToolsOpen iT Inc.
 
Etuma Feedback Analysis API offering
Etuma Feedback Analysis API offeringEtuma Feedback Analysis API offering
Etuma Feedback Analysis API offeringEtuma
 
5 IT Trends That Reduce Cost And Improve Web Performance - A Forrester and Go...
5 IT Trends That Reduce Cost And Improve Web Performance - A Forrester and Go...5 IT Trends That Reduce Cost And Improve Web Performance - A Forrester and Go...
5 IT Trends That Reduce Cost And Improve Web Performance - A Forrester and Go...Compuware APM
 

Similar to TAUS Scotland Asia Online Technology Platform V1 (20)

MiTiN 2013 Keynote in Detroit Michigan
MiTiN 2013 Keynote in Detroit MichiganMiTiN 2013 Keynote in Detroit Michigan
MiTiN 2013 Keynote in Detroit Michigan
 
ICIC 2013 New Product Introductions CEPT
ICIC 2013 New Product Introductions CEPTICIC 2013 New Product Introductions CEPT
ICIC 2013 New Product Introductions CEPT
 
Outsourcing The Next Frontier In Editorial Workflow Shivaji Sengupta
Outsourcing The Next Frontier In Editorial Workflow Shivaji SenguptaOutsourcing The Next Frontier In Editorial Workflow Shivaji Sengupta
Outsourcing The Next Frontier In Editorial Workflow Shivaji Sengupta
 
Tms days 04 2012 manuel herranz pangea mt
Tms days 04 2012 manuel herranz pangea mtTms days 04 2012 manuel herranz pangea mt
Tms days 04 2012 manuel herranz pangea mt
 
Lexcelera MT Breaking Compromises
Lexcelera MT Breaking CompromisesLexcelera MT Breaking Compromises
Lexcelera MT Breaking Compromises
 
雲端推動的人工智能革命
雲端推動的人工智能革命雲端推動的人工智能革命
雲端推動的人工智能革命
 
How to Purchase Translations and What to Look For in a Supplier
How to Purchase Translations and What to Look For in a SupplierHow to Purchase Translations and What to Look For in a Supplier
How to Purchase Translations and What to Look For in a Supplier
 
TAUS OPEN SOURCE MACHINE TRANSLATION SHOWCASE, Seattle, Full Service Enterpri...
TAUS OPEN SOURCE MACHINE TRANSLATION SHOWCASE, Seattle, Full Service Enterpri...TAUS OPEN SOURCE MACHINE TRANSLATION SHOWCASE, Seattle, Full Service Enterpri...
TAUS OPEN SOURCE MACHINE TRANSLATION SHOWCASE, Seattle, Full Service Enterpri...
 
TAUS webinar The Big Picture View On The Translation Industry, March 2013
TAUS webinar The Big Picture View On The Translation Industry, March 2013TAUS webinar The Big Picture View On The Translation Industry, March 2013
TAUS webinar The Big Picture View On The Translation Industry, March 2013
 
Business Process Management and Virtual Worlds
Business Process Management and Virtual WorldsBusiness Process Management and Virtual Worlds
Business Process Management and Virtual Worlds
 
TIRTA ERP
TIRTA ERPTIRTA ERP
TIRTA ERP
 
2011 10-26 bpm-talk_andrew_watson
2011 10-26 bpm-talk_andrew_watson2011 10-26 bpm-talk_andrew_watson
2011 10-26 bpm-talk_andrew_watson
 
Smt & data quality
Smt & data qualitySmt & data quality
Smt & data quality
 
Machine Translation Master Class at the EUATC Conference by Diego Bartolome
Machine Translation Master Class at the EUATC Conference by Diego BartolomeMachine Translation Master Class at the EUATC Conference by Diego Bartolome
Machine Translation Master Class at the EUATC Conference by Diego Bartolome
 
Talend Open Studio and Hortonworks Data Platform
Talend Open Studio and Hortonworks Data PlatformTalend Open Studio and Hortonworks Data Platform
Talend Open Studio and Hortonworks Data Platform
 
WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...
WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...
WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...
 
The Next Generation of Big Data Analytics
The Next Generation of Big Data AnalyticsThe Next Generation of Big Data Analytics
The Next Generation of Big Data Analytics
 
Webinar: Increase technology Uptake with Software Usage Metering Tools
Webinar: Increase technology Uptake with Software Usage Metering ToolsWebinar: Increase technology Uptake with Software Usage Metering Tools
Webinar: Increase technology Uptake with Software Usage Metering Tools
 
Etuma Feedback Analysis API offering
Etuma Feedback Analysis API offeringEtuma Feedback Analysis API offering
Etuma Feedback Analysis API offering
 
5 IT Trends That Reduce Cost And Improve Web Performance - A Forrester and Go...
5 IT Trends That Reduce Cost And Improve Web Performance - A Forrester and Go...5 IT Trends That Reduce Cost And Improve Web Performance - A Forrester and Go...
5 IT Trends That Reduce Cost And Improve Web Performance - A Forrester and Go...
 

TAUS Scotland Asia Online Technology Platform V1

  • 1. TM Translation Technology Platform Kirti Vashee VP Sales, Asia Online Kirti.vashee@asiaonline.net
  • 2. Revolutionize the enterprise Revolutionize the Internet translation process with a experience for non-English comprehensive, continuous speakers in Asia learning SMT platform Provide 1 billion+ local-language pages online SaaS environment that allows data using mostly translated open license content, cleaning and preparation, develop SMT combined with compelling portal and social networking style services in Thailand, engines on demand and enable ongoing Indonesia, India, Malaysia, Philippines, comprehensive post editing and correction Vietnam and China, Japan & Korea to continuously improve engines The Consumer Market The Enterprise Market Large Buyer & Translation Tools Publisher Perspective Vendor Perspective TM Copyright © 2008, All Rights Reserved
  • 3.
      • The only SMT technology provider that is also a major user of ALT technology on one of the largest translation projects in the world: English Wikipedia (1B Words+) into 11 Asian languages using SMT and crowdsourcing
      • The translation tools and technology platform used to accomplish this is also being made available as a SaaS product for the enterprise translation market
  • 4.
      • Battlefield of words
      • Fusion with customer support
      • Continuous translation
      • Community translation
      • Industry-shared language data
      • Massive online collaboration
      • Translation automation
  • 5. Interactive Support
      [Diagram labels: Email, Instant Messaging, Voice Support, Knowledge Base, Knowledge Base Data, User Manuals, Documentation, User Generated Content, Blogs]
      • Web 2.0 is much more interactive and dynamic
      • Globalization will be further driven by internet penetration into Asia
      • Word-of-mouth marketing is gaining prominence all over the world
      • Unstructured content in blogs and review sites is becoming critical
      • The dialogue with global customers needs to be more interactive
  • 6. Continuous Improvement HDSMT Engines
      [Diagram labels around "The Global Customer": Sales / Marketing, Blogs, CRM, Product Management, Biz Intelligence, Human Resources, Content Management, ECM, BPM, Email, Customer Support, IM]
      • Highly adaptive human driven process for continuous output quality improvement in SMT engines and translation automation
      • Intensive collaboration with human translators to raise quality of SMT
      • Integration with content creation and content refinement tools to enhance speed and improve business process management
      • Continued evolution in standards to facilitate sharing linguistic assets
  • 7.
      • Comprehensive SaaS Platform that facilitates the translation and continued refinement of any large high value translatable corpus using HDSMT
      • Existing Feature Set
          – Data Cleaning & Preparation Tools
          – On Demand SMT engine development
          – Support for both user created and online dictionaries and glossaries
          – Ability to pool data for greater leverage
          – Multiple level domain support
          – Seamless integration with collaborative post-editing environment
          – Real time updates of translated assets
          – Web Services based APIs for integration
      • System and process foundation for managed online community collaboration
  • 8.
      Data Management
          • Bilingual Data Preparation & Cleaning
          • Bilingual Data Normalization & Optimization
          • Source Cleanup and Preparation
          • Grammar and Spelling validation
          • Monolingual Data Extraction & Analysis
      SMT Engine
          • SMT System Training & Development
          • Monolingual Data Training
          • Ongoing Corpus Refinement and Tuning
          • Analysis and Evaluation of Ngrams
      Output Proofing & Editing
          • Error Pattern Identification & Correction
          • Automated error correction tools
          • Continuing Cycle of Exception Identification and Correction
          • Development of small sets of new data to correct errors
  • 10.
      • Data Cleaning Utilities to normalize and standardize data prior to consolidation to provide maximum leverage
      • Recent study for TAUS proves conclusively that sharing clean data provides leverage
          – Smaller amount of clean data can produce better results than datasets even 2X larger
          – Consistent Terminology matters and provides real leverage
          – Data optimized for TM Tools can be “dirty data” for SMT
  • 11. [Process diagram, steps as labelled]
      • Initial System put into production
      • Trained Internal Experts begin initial clean up and correction process
      • Expert Users also allowed to make changes
      • Community: All users allowed to suggest changes which go through vetting process
      • Changes are collected and added to initial corpus to drive continuous retraining
  • 12. [Diagram labels: Initial System; Targeted Corrections of Bad Learning; Spelling & Terminology; Correct Mistranslation; Syntax/Grammar; Terminology; Spelling; Punctuation]
      Human Feedback can raise the raw output to previously unseen quality levels
  • 15. API operations by group
      Information Requests
          GetAccountInformation
          GetAccountUsageHistory
          GetAvailableDomainCombinationsForLanguagePair
          GetAvailableDomainsForLanguagePair
          GetAvailableLanguagePairs
          GetCustomDomainsForLanguagePair
      Data Storage
          CreateDataset
          DeleteDataset
          DeleteDataFromDataset
          DownloadDataset
          DownloadDatasetItem
          GetDatasetList
          GetDatasetItemList
          LinkDataToDataset
          MergeDatasets
          UploadData
          UploadGlossary
          UploadImage
          UploadLanguageModel
          UploadMonolingualText
          UploadOCRPageLayout
          UploadPhrasePairs
          UploadTranslationMemory
          UploadZIP
      Data Training
          CancelTrainingJob
          GetTrainingJobList
          GetTrainingJobStatus
          SubmitDatasetForTraining
      Data Preparation
          CleanText
          ExtractText
          NormalizeText
          OCRImage
          ParagraphAlignLanguagePairText
          SentenceAlignLanguagePairText
          SentenceSegmentText
          SpellCheckText
          WordSegmentText
      Translation
          CancelTranslationJob
          GetTranslationJobList
          GetTranslationJobStatus
          SubmitDatasetForTranslation
          SubmitSinglePhraseForTranslation
      Example request parameters (name, type, description)
          sUsername, String: The username of the person making the request.
          sPassword, String: The password of the person making the request.
          iAccountNo, Integer: The account number that this request is associated with.
          iDepartmentNo, Integer: The department number that this request is associated with.
          iLanguagePairCode, Integer: The code for the language pair that is being looked up.
  • 21. [Workflow diagram linking Publishers, the Asia Online SaaS Portal, Translation Systems, Human Proof Readers, Users and Social Networks / Community; labels as shown]
      • Publishers provide existing human translated content for training language engines
      • Leverage ASP Translation service for translation of new material
      • Original Content translated to local language
      • Translations are proof read via ASP proof reading system
      • Translated content proof read using community principles and paid proof readers using Asia Online proof reading system
      • Proof reading still required whether human or machine translation
      • New translations sent back to publisher
      • Translated content made available to users
      • User accesses online content in local language
      • Constant Improvement
  • 22.
      • Integrated data cleaning, data preparation, SMT systems development and post-editing environment
      • Comprehensive proof-reading and post-editing environment that is integrated with core SMT engines to enable instant updates
      • Greater transparency of many key SMT building blocks to enable users to see and modify what the system has learnt resulting in greater control and better systems
      • A richer and deeper taxonomy for domains to ensure the best quality
      • Incremental additions of new training data to any existing system to enable rapid updates
      • Easy handling of terminology, glossaries, dictionaries
      Side callouts: Greater Control & Better systems; Better systems; Faster updates
  • 23. Kirti Vashee, VP Sales, Asia Online, kirti.vashee@asiaonline.net
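
Slide 15 names the API operations and five request parameters but does not show the wire format, the endpoint, or which parameters each operation takes. The sketch below is therefore only an illustration of how a client might bundle those parameters for one of the listed operations. The `Credentials` class, the `build_request` helper, the `"operation"` key and the sample values are all hypothetical; only the operation names and the parameter names (`sUsername`, `sPassword`, `iAccountNo`, `iDepartmentNo`, `iLanguagePairCode`) come from the slide.

```python
from dataclasses import dataclass, asdict

# Operation names copied from the "Information Requests" group on slide 15.
INFORMATION_REQUESTS = frozenset({
    "GetAccountInformation",
    "GetAccountUsageHistory",
    "GetAvailableDomainCombinationsForLanguagePair",
    "GetAvailableDomainsForLanguagePair",
    "GetAvailableLanguagePairs",
    "GetCustomDomainsForLanguagePair",
})


@dataclass(frozen=True)
class Credentials:
    """Hypothetical container for the four account parameters on the slide."""
    sUsername: str
    sPassword: str
    iAccountNo: int
    iDepartmentNo: int


def build_request(operation: str, credentials: Credentials, **params) -> dict:
    """Assemble a request payload as a plain dict.

    Assumption: the real service's envelope is not documented in the deck,
    so this only gathers the named parameters. No network call is made.
    """
    if operation not in INFORMATION_REQUESTS:
        raise ValueError(f"unknown information request: {operation}")
    payload = {"operation": operation}  # hypothetical key name
    payload.update(asdict(credentials))
    payload.update(params)
    return payload


# Sample values are made up for illustration.
creds = Credentials("demo_user", "demo_password", 1001, 7)
request = build_request(
    "GetAvailableDomainsForLanguagePair", creds, iLanguagePairCode=42
)
```

Keeping the account parameters in one immutable object means they can be reused across calls, while operation-specific values such as `iLanguagePairCode` are passed per request.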