The document summarizes the agenda for the TAUS Moses Roundtable meeting, which included welcome and introductions, a presentation on the results of the Moses survey, a discussion of the Moses roadmap, and a session to discuss and prioritize areas for potential cooperation among Moses users from industry. The roadmap presentation aimed to discuss how the needs of industry can help guide future development of the open source Moses machine translation toolkit.
Jaap van der Meer, Director of TAUS, shares a compilation of the feedback on the Big Idea as well as a complete overview of new TAUS features and services and new partnerships.
MT is useful, and it gets better and more useful when it is customized to the terminology and style of the documents to be translated. But it is extra work, not much, but extra work. In this talk you’ll get an overview of MT domain customization, its benefits, pitfalls, and conditions for making it work, as well as an overview of the actual work and helpful vs. not so helpful training documents. The theory of MT. Introduction to MT: short history, the pros and cons of different techniques. Statistical MT versus rule-based MT and what the brand new model-based MT can offer, as well as the hybridization and the challenges and possible breakthroughs.
Daniel Gervais, Executive Vice-president, MultiCorpora
Recent developments in TAUS Data Association super cloud-based data-sharing, coupled with advanced leveraging technologies, produce measurable increases in segment matching. However, there are heated debates about how translation pollution can arise in this context, and about potential antidotes for such pollution. Daniel provides case studies to assess a central question that everyone is posing today: does increased matching through advanced leveraging technology equate to real productivity gain? Daniel's talk will offer innovative thinking on new collaboration models between linguists and TM systems.
A "simplified guide to SMT" is about as simple as a "simplified guide to Photoshop." Professional tools require expertise. The questions are, what levels of expertise are required, how do you acquire them and what processes contribute to a successful SMT program? These fundamentals are the same whether you're planning to use an outsourcing service or preparing to operate an in-house system. This session reviews these fundamentals with examples that reference use cases with PTTools' DoMT Desktop, a commercial application with a Moses kernel.
This presentation is a part of the MosesCore project that encourages the development and usage of open source machine translation tools, notably the Moses statistical MT toolkit. MosesCore is supported by the European Commission Grant Number 288487 under the 7th Framework Programme.
For the latest updates go to http://www.statmt.org/mosescore/
or follow us on Twitter - #MosesCore
The Future of Technical Communication is Marketing
Scott Abel
Once a prospect buys a product or service, the content they interact with is no longer familiar. The instructions provided don't look, feel, or sound anything like the marketing and sales materials that introduced them to your brand. Neither does the service contract, the warranty, the customer support website, the product documentation, nor the training materials.
The extensive variability in customer experience — and each customer touchpoint — creates a different and inconsistent version of the brand, some that bear little or no resemblance to the brand that executives believe they are building. There are often as many brands as there are touchpoints.
For no good reason, the content experience changes drastically -- and not in a good way. That's why organizations that recognize the importance of a unified customer experience have started rethinking what it means to be customer-centric.
Some forward-thinking organizations are reorganizing customer-facing content creators into teams under one roof. They're breaking down the barriers (the silos) that prevent them from collaborating and from creating a unified customer content experience.
In this presentation, delivered at Acrolinx Day at LavaCon 2014 Portland, Scott Abel, The Content Wrangler, discussed the challenges of content inconsistency and incongruity, and why he thinks the future of technical communication is marketing.
Kirti Vashee, Vice-President Enterprise Translation Sales, Asia Online
Rustin Gibbs, Solutions Architect, Moravia Worldwide
Kirti and Rustin provide insights into an innovative approach to the practical use of MT in situations where the bilingual data is of insufficient volume and the monolingual data is of unclear relevance. They use examples from the travel and publishing industries to show the individual steps of the process and to equip participants with information on which language and language technology tools exist for building a high-quality translation engine.
This document summarizes a presentation on terminology trends from a blogger's perspective. It discusses how language lovers use social networks like blogs, Facebook, and Twitter to communicate about terminology by researching, asking questions, answering questions of followers, reporting on conferences, and providing helpful tips, news, and job opportunities. Social networks produce large amounts of text data that can be used for terminology research to analyze evolving language and identify neologisms. Tools like the Global Language Monitor use natural language processing of social media to track new terms and their usage in real-time.
Kevin Knight, Senior Research Scientist and Fellow, Information Sciences Institute, Research Associate Professor, University of Southern California
A clear long-term vision motivates research in automatic language translation. The vision is that you read, write, listen, and speak in your own language, and computer software translates whenever necessary. Reading this paragraph but don't know English? No problem, computer will translate. Launching a new product in Eastern Europe? No problem. Boyfriend doesn't speak Korean? No problem.
This is certainly one of the most compelling visions in computer science, and it has animated a great deal of research. How do we get from here to there? This talk will look at recent improvements, noting how ideas have moved from impractical to mainstream, as well as covering current problems and future directions.
As content published on the Internet becomes more and more dominated by video, the requirements for language translation have also changed. Specifically, video publishers and distributors have a significant interest in balancing translation time and accuracy. To this end, Pactera has invested in solutions that leverage machine translation to reduce overall translation time and recruit human translators to improve accuracy in a Wikipedia-like fashion. At Pactera, we aim to help video content reach billions of people who could not be reached before.
This TAUS webinar outlines the many facets of translation technology and shares big picture analysis of key opportunities and challenges going forward.
Olga Beregovaya, CEO Americas, PROMT
PROMT's approach to engine hybridization differs from many other companies’ technology in that it uses statistical methods at every stage of the translation process: pre-editing, transfer and post-editing. The hybrid engine defines syntactic, lexical and grammar choices on an “atomic” level, rather than processing complete translated sentences. Pilot case examples will be used to demonstrate the robustness of these advances.
The cognitive era and the future of content
Scott Abel
The document discusses how cognitive computing could help Manuel, a nutritionist, more effectively produce and deliver healthy recipes and content to customers. It notes that Manuel currently struggles to produce enough content across multiple channels to meet customer expectations. A cognitive computing system could learn from Manuel's large collection of structured and unstructured content, understand customer needs, and help deliver personalized recommendations and experiences. This would help Manuel scale his business and provide an exceptional customer experience.
Jaap van der Meer will present key findings from the MT Market Report that TAUS published. For more information, see: https://www.taus.net/think-tank/reports/translate-reports/mt-market-report-2014
A generation ago, an emerging group of localization service providers was able to exploit the opportunities of its time and earn rich rewards, leaving the traditional industry in its wake.
We are at a pivotal moment once again. New opportunities with Technology, Data, Metrics and Connectivity are already fueling growth in our industry. There will be winners and losers.
This Webinar covers the market landscape and TAUS activity on:
Translation automation, Language data sharing, Translation quality evaluation, Interoperability
The document summarizes a TAUS Machine Translation Showcase event held in Vancouver, Canada on October 29, 2014. It includes an agenda for presentations on machine translation applications at eBay, getting started with SMT, seamless globalization with crowd post-editing, and an introduction to the Matecat open-source CAT tool. The document also provides an overview of the machine translation market trends presented by TAUS, including growing market size, opportunities and challenges in the industry, and predictions for the future of machine translation and post-editing.
This document discusses the state of post-editing of machine translation output. It covers several topics:
1) There is still controversy around post-editing due to misunderstandings of what it involves.
2) University programs are beginning to include courses on machine translation and post-editing.
3) There is still a lack of shared best practices for post-editing and it remains an evolving skill.
4) The proposed ISO standard for post-editing may be premature and does not accurately capture the current state of post-editing.
Our statistical machine translation platform and hybrid features were presented at the European Commission offices in Luxembourg last Tuesday 22nd September. It is one of the tools that the European Union will consider, among other commercial machine translation solutions, to help fulfil its mandate for CEF (Connecting Europe Facility). Pangeanic’s CEO, Manuel Herranz, presented the current state of the art that PangeaMT version 3 represents. Representatives from the EU were particularly interested in the solid data management features, machine translation engine retraining routines, data cleaning, and automated engine training and creation features. One of the key features of the new PangeaMT version is the possibility to change translation algorithms and use other engines such as the rule-based Apertium or Thot as well as the default Moses. It is also compatible with 3rd-party calls from other systems. Its powerful API can also provide machine translated output to requests from anywhere in the world, although the platform is designed for onsite use at translation companies and organizations. PangeaMT is also compatible with several popular translation formats like ttx, sdlxliff, memoq, memsource, and most xml-based Tikal formats.
TAUS is an innovation think tank for the translation industry that aims to shape the industry and increase its size and significance. It publishes reports on trends and strategies, holds events on topics like machine translation and interoperability, and runs labs for members to collaborate on projects around dynamic quality evaluation, interoperability standards, and open-source machine translation. The organization provides directories of language technologies, shares language data to improve translation quality and automation, and supports industry players and entrepreneurs through knowledge sharing and strategic discussions.
The proposed developments were wide-reaching and have significant implications for how the industry conducts business. We received 680 usable responses to the consultation questionnaire and a wealth of new ideas on further new features and services. Responses came from every stakeholder group in the industry: translators, corporate buyers, public sector buyers, service providers, technology vendors, academia, and consultants / sector analysts / commentators.
A very large majority recognize the benefits of sharing translation memories. There was a strong endorsement of plans to provide users and members with more intelligent and easier access to data through translation matching and open APIs for services. View the presentation to see how people voted, what has been prioritized, and when new services will be delivered.
The document provides an introduction to the OBASHI methodology, which is used to create visual maps of a business. It discusses the key elements of OBASHI, including Business & IT (B&IT) diagrams, Dataflow Analysis Views (DAV), and the six layers of the B&IT model. The methodology aims to provide clarity on how a business works and how it is supported by IT assets, in order to facilitate communication, planning, and improvement initiatives.
STAR Group provides machine translation (MT) solutions that integrate with their translation memory (TM) system called Transit. STAR MT uses statistical machine translation trained on customer-specific reference materials and terminology to provide MT suggestions during translation projects in Transit. These MT suggestions are treated similarly to fuzzy matches from the TM, allowing translators to easily validate or post-edit translations as needed. The integrated system aims to improve translation quality and efficiency over pure MT or TM alone.
TAUS is an innovation think tank for the translation industry that aims to increase automation and interoperability. It publishes reports on trends, holds events on topics like machine translation, and runs labs on interoperability and quality metrics. Its goals are to shape the industry, support innovation, and help members make informed decisions through knowledge sharing and defining new strategies.
Moses from the point of view of an LSP
This presentation is a part of the MosesCore project that encourages the development and usage of open source machine translation tools, notably the Moses statistical MT toolkit.
MosesCore is supported by the European Commission Grant Number 288487 under the 7th Framework Programme.
Latest news on Twitter - #MosesCore
The document announces a Translation Technology Showcase event hosted by TAUS on February 28, 2017 in Shenzhen. The event will feature presentations from various translation technology companies on topics like multichannel translation for the digital economy, using free and open source tools, leveraging large translation memories, and neural machine translation. The agenda lists out the scheduled presentations and their times. The document also mentions that TAUS recently published an updated Translation Technology Landscape Report covering trends in the industry and profiles of over 80 companies.
Kevin Knight, Senior Research Scientist and Fellow, Information Sciences Institute, Research Associate Professor, University of Southern California
A clear long-term vision motivates research in automatic language translation. The vision is that you read, write, listen, and speak in your own language, and computer software translates whenever necessary. Reading this paragraph but don't know English? No problem, computer will translate. Launching a new product in Eastern Europe? No problem. Boyfriend doesn't speak Korean? No problem.
This is certainly one of the most compelling visions in computer science, and it has animated a great deal of research. How do we get from here to there? This talk will look at recent improvements, noting how ideas have moved from impractical to mainstream, as well as covering current problems and future directions.
This presentation is a part of the MosesCore project that encourages the development and usage of open source machine translation tools, notably the Moses statistical MT toolkit. MosesCore is supported by the European Commission Grant Number 288487 under the 7th Framework Programme.
For the latest updates go to http://www.statmt.org/mosescore/
or follow us on Twitter - #MosesCore
As contents published on the Internet are becoming more and more dominated by videos, requirements on the language translation have also changed. Specifically, video publishers and distributors have a significant interest in balancing both the translation time and the accuracy. To this end, Pactera has invested in solutions, which leverage machine translation to reduce the overall translation time, and recruit human translators to improve the accuracy in a Wikipedia-like fashion. At Pactera, we aim to help video contents to reach billions of people that were not possible before.
This TAUS webinar outlines the many facets of translation technology and shares big picture analysis of key opportunities and challenges going forward.
Olga Beregovaya, CEO Americas, PROMT
PROMT's approach to engine hybridization differs from many other companies’ technology, using statistical methods on every stage of translation process: pre-editing, transfer and post-editing. The hybrid engine defines syntactic, lexical and grammar choices on an “atomic” level, rather than processing complete translated sentences. Pilot case examples will be used to demonstrate the robustness of advances.
The cognitive era and the future of contentScott Abel
The document discusses how cognitive computing could help Manuel, a nutritionist, more effectively produce and deliver healthy recipes and content to customers. It notes that Manuel currently struggles to produce enough content across multiple channels to meet customer expectations. A cognitive computing system could learn from Manuel's large collection of structured and unstructured content, understand customer needs, and help deliver personalized recommendations and experiences. This would help Manuel scale his business and provide an exceptional customer experience.
Jaap van der Meer will present key findings from the MT Market Report that TAUS published. For more information, see: https://www.taus.net/think-tank/reports/translate-reports/mt-market-report-2014
This presentation is a part of the MosesCore project that encourages the development and usage of open source machine translation tools, notably the Moses statistical MT toolkit. MosesCore is supported by the European Commission Grant Number 288487 under the 7th Framework Programme.
For the latest updates go to http://www.statmt.org/mosescore/
or follow us on Twitter - #MosesCore
A generation ago an emerging group of localization service providers were able to the exploit the opportunities of their times and earn rich rewards, leaving the traditional industry in their wake.
We are at a pivotal moment once again. New opportunities with Technology, Data, Metrics and Connectivity are already fueling growth in our industry. There will be winners and losers.
This Webinar covers the market landscape and TAUS activity on:
Translation automation, Language data sharing, Translation quality evaluation, Interoperability
This presentation is a part of the MosesCore project that encourages the development and usage of open source machine translation tools, notably the Moses statistical MT toolkit.
MosesCore is supported by the European Commission Grant Number 288487 under the 7th Framework Programme.
For the latest updates, follow us on Twitter - #MosesCore
The document summarizes a TAUS Machine Translation Showcase event held in Vancouver, Canada on October 29, 2014. It includes an agenda for presentations on machine translation applications at eBay, getting started with SMT, seamless globalization with crowd posting editing, and an introduction to the Matecat open-source CAT tool. The document also provides an overview of the machine translation market trends presented by TAUS, including growing market size, opportunities and challenges in the industry, and predictions for the future of machine translation and post-editing.
This document discusses the state of post-editing of machine translation output. It covers several topics:
1) There is still controversy around post-editing due to misunderstandings of what it involves. 2) University programs are beginning to include courses on machine translation and post-editing. 3) There is still a lack of shared best practices for post-editing and it remains an evolving skill. 4) The proposed ISO standard for post-editing may be premature and does not accurately capture the current state of post-editing.
Our statistical machine translation platform and hybrid features were presented at the European Commission offices in Luxembourg last Tuesday 22nd September. It is one of the tools that the European Union will consider, among other machine translation commercial solutions, as a tool to help its mandate for CEF (Connecting Europe Facility). Pangeanic’s CEO, Manuel Herranz, presented the current state-of-the-art that PangeaMT version 3 represents. Representatives from the EU were particularly interested in the solid data management features, machine translation engine retraining routines, data cleaning and automated engine training and creation features. One of key features with the new PangeaMT version is the possibility to change translation algorithms and use rule-based systems like Apertium and Thot as well as the default Moses. It is also compatible with 3rd-party calls from other systems. Its powerful API can also provide machine translated output to requests anywhere in the world, although the platform is designed for onsite use at translation companies and organizations. PangeaMT is also compatible with several popular translation formats like ttx, sdlxliff, memoq, memsource, and most xml-based Tikal formats.
TAUS is an innovation think tank for the translation industry that aims to shape the industry and increase its size and significance. It publishes reports on trends and strategies, holds events on topics like machine translation and interoperability, and runs labs for members to collaborate on projects around dynamic quality evaluation, interoperability standards, and open-source machine translation. The organization provides directories of language technologies, shares language data to improve translation quality and automation, and supports industry players and entrepreneurs through knowledge sharing and strategic discussions.
The proposed developments were wide-reaching and have significant implications for how the industry conducts business. We received 680 usable responses to the consultation questionnaire and a wealth of new ideas on further new features and services. Responses came from every stakeholder group in the industry: translators, corporate buyers, public sector buyers, service providers, technology vendors, academia, and consultants / sector analysts / commentators.
A very large majority recognize the benefits of sharing translation memories. There was a strong endorsement of plans to provide users and members with greater intelligent access and easier access to data through translation matching and open APIs for services. View the presentation to see how people voted, what has been prioritized, and when new services will be delivered.
The document provides an introduction to the OBASHI methodology, which is used to create visual maps of a business. It discusses the key elements of OBASHI, including Business & IT (B&IT) diagrams, Dataflow Analysis Views (DAV), and the six layers of the B&IT model. The methodology aims to provide clarity on how a business works and how it is supported by IT assets, in order to facilitate communication, planning, and improvement initiatives.
STAR Group provides machine translation (MT) solutions that integrate with their translation memory (TM) system called Transit. STAR MT uses statistical machine translation trained on customer-specific reference materials and terminology to provide MT suggestions during translation projects in Transit. These MT suggestions are treated similarly to fuzzy matches from the TM, allowing translators to easily validate or post-edit translations as needed. The integrated system aims to improve translation quality and efficiency over pure MT or TM alone.
This presentation is a part of the MosesCore project that encourages the development and usage of open source machine translation tools, notably the Moses statistical MT toolkit.
MosesCore is supported by the European Commission Grant Number 288487 under the 7th Framework Programme.
For the latest updates, follow us on Twitter - #MosesCore
TAUS is an innovation think tank for the translation industry that aims to increase automation and interoperability. It publishes reports on trends, holds events on topics like machine translation, and runs labs on interoperability and quality metrics. Its goals are to shape the industry, support innovation, and help members make informed decisions through knowledge sharing and defining new strategies.
Moses from the point of view of an LSP
This presentation is a part of the MosesCore project that encourages the development and usage of open source machine translation tools, notably the Moses statistical MT toolkit.
MosesCore is supported by the European Commission Grant Number 288487 under the 7th Framework Programme.
Latest news on Twitter - #MosesCore
The document announces a Translation Technology Showcase event hosted by TAUS on February 28, 2017 in Shenzhen. The event will feature presentations from various translation technology companies on topics like multichannel translation for the digital economy, using free and open source tools, leveraging large translation memories, and neural machine translation. The agenda lists out the scheduled presentations and their times. The document also mentions that TAUS recently published an updated Translation Technology Landscape Report covering trends in the industry and profiles of over 80 companies.
"Empower" is a buzzword that has been pushed around extensively by many Machine Translation (MT) vendors. Empowerment by its very nature implies that you must put in some effort to gain the control it promises, and of course that you have the experience and skills required to be empowered. In reality, few MT vendors offer much more than the ability to upload translation memories. True MT empowerment comes from having total control and transparency across the entire customization and translation process. MT empowerment also enables the business as a whole to expand its capabilities and reach by performing tasks that were previously unattainable with a human-only translation approach.
This showcase demonstrates how Language Studio™ empowers organizations to use MT optimally and strategically by enabling project managers to control and define the MT customization process. Language Studio™ provides a wide range of tools and processes that enable customers to have complete control over their custom MT engines. With the guidance of Language Studio™ Linguists, the process is streamlined with expertise gained from building thousands of custom engines. This expertise is leveraged to meet your specific custom MT requirements. Just like a human translation project, every custom engine is unique and is managed in a similar manner, with term definition, style guides and quality assurance.
Similar to TAUS Moses Roundtable, Prague, 11 September 2013 (20)
The document introduces the Dynamic Quality Framework (DQF), which aims to standardize quality measurements across the translation industry. It describes DQF as inclusive, industry-shared, and data-informed. The framework integrates with common CAT tools and TMS through open APIs to collect translation and review data and provide interactive dashboards and reports for performance tracking and benchmarking at the project and organizational level.
The document discusses the evolution of machine translation (MT) technology over time from early conceptual ideas to modern neural machine translation (NMT) systems. It uses metaphors of a band changing their sound over time by adding new band members, such as an "MT guy", to represent how translation companies can adapt to new technologies. The presentation encourages translation businesses to thoughtfully integrate new tools like NMT by involving stakeholders and focusing on people in the process of transition.
The document summarizes the results of a machine translation evaluation that compared human and machine translations. Several human and machine translation systems were evaluated on a test set containing sentences translated between English and Chinese. The top performing systems were combinations of human and machine translations. There was criticism of claims of machine translation achieving "human parity" due to limitations in the test set only using sentences rather than documents, and evaluators not being qualified translators. Neural machine translation systems are argued to have advantages over statistical and rule-based systems by processing full sentences and storing additional context in hidden layers.
The document discusses how artificial intelligence and neural machine translation will change the role of human translation over time. While AI can handle the translation process at scale, humans will still be needed for local knowledge, problem solving, and tasks like optimizing processes, improving output quality, and ensuring quality. However, a fragmented technology landscape slows businesses down. The solution proposed is an integrated localization hub that connects content systems, translation technology, and translation services through a single API to address current issues where technical knowledge and system fragmentation are still barriers.
The document discusses innovation in machine translation and language technology. It notes that translation is becoming more data-driven and algorithmic, with machines learning from large amounts of data. It also mentions that translation may become invisible and automated like utilities such as electricity. The document then lists some concepts characterizing innovative contest candidates in game changer awards, such as advanced machine translation, artificial intelligence, and automated quality evaluation. Finally, it states that six contestants will each have six minutes to pitch their innovative ideas.
Review processes as the last step in quality assurance workflows are “notorious for causing delays and frustrations”. The reason normally is a flawed process: Many manual steps for the PMs, the lack of intuitive, layout-oriented collaboration software, plus the expectation of review to “fix a broken translation” in the last second rather than giving strategic process input. globalReview shifts this paradigm: As an integrated, collaborative platform with full layout editing it provides a positive review experience. At the same time, it pushes quality upstream applying DQF principles: Flexible content profiles define precise quality expectations; issue categories and scoring effectively gauge and also track translation quality over time; a sampling module allows for fast yet accurate quality evaluation. Put together, this allows the customer to raise the process from painful review to strategic quality management and gain valuable business intelligence.
A global P2P trading platform for TMs will be introduced. The Tmxmall TM marketplace is the core, with client TM software as the input and CAT tools as the output. CAT users can search the TMs of client users without requiring those users to upload their TMs to the cloud.
The presentation will introduce the NLP technologies used in Shiyibao and the main product features, covering the following points:
Automatic grading of translations, based on an algorithm for automatic translation quality evaluation;
Automatic comments based on rule matching;
Sorting of translations by similarity or by specific fragments, to dramatically improve the efficiency of reviewing and commenting on translations.
In today’s digital economy, content is becoming smaller, more fragmented, and in need of on-demand translation in minutes and around the clock. Traditional localization models are no longer sufficient in meeting these always-on, agile, fast, and small translation requirements of the digital age. This is why mobile translation services like Stepes that are able to deliver quality, speed, and scalability are poised to see tremendous growth. During this 6-minute presentation, Stepes will demonstrate live its instant human translation service for micro content. Powered by human translators from around the world, Stepes is the world’s first mobile translation ecosystem delivering quality translation services using a networking model similar to Uber and Lyft.
This document discusses TruTran's open machine translation platform and the trends in machine translation engine development. It notes that neural network technology allows each company to have its own customized trained neural machine translation engine. The open source nature of neural networks means that machine translation will be "generalized" or available to more users. However, enterprises currently lack professionals skilled in natural language processing and training data can be difficult to process. TruTran's platform aims to address these issues by allowing users to easily upload custom training data and corpora, select a domain to train an engine, and have the engine trained within 6 days on the platform's resources. This gives each company their own commercial-grade machine translation engine at low cost and with their
Kirk Zhang, the COO of Wiitrans, presented on their semantic matching and translator resource management tools which aim to deliver high quality translations by matching content to appropriate translators based on their individual translator profiles and histories. The tools analyze translator-specific language assets, glossaries, and translation memories to best match work to translators and simplify the translation process.
The document describes a computer-aided translation and interpretation training system called CATS. It provides course management, multi-lingual resource centers, and translation management platforms to support online translation and interpretation courses. CATS allows instructors to upload multimedia content and documents, create translation cases and assignments, and evaluate student work. It aims to improve over traditional methods of collecting assignments through email by offering an integrated online platform for pre-class, in-class, and post-class activities.
Most LSPs have still not converted their translated bilingual documents into TMs. Even LSPs that have established TMs face disorganized TM management and low efficiency. This report will show how to build TMs quickly with the Tmxmall Cloud-Based Smart Aligner, and how to manage large-scale TMs with the Private Cloud-Based TM to support pre-translation with large-scale TMs and team cooperation. The report will also introduce the Tmxmall TM marketplace, which is expected to promote TM sharing. Finally, we will share LSPs' experience with alignment and Private Cloud-Based TM management in reducing translation costs and increasing profits.
SDL is the leader in global content management and language translation solutions. With more than 20 years of experience, SDL helps companies build relevant online experiences that deliver transformative business results on a global scale. The translation industry continues to grow, and freelancers, LSPs and corporate clients all see increased demand as more and more content is created, so we have to address them all. As a market-leading translation productivity tool, SDL Trados Studio is trusted by over 200,000 translation professionals to boost productivity, control quality and aid collaboration. SDL has launched Trados Studio 2017. This presentation will introduce SDL Trados Studio 2017 and highlight SDL’s new productivity booster, UPLIFT, which has been well received by global clients.
This document discusses Lingosail's translation technology products and services, including machine translation, corpus construction, and translation services. It outlines how Lingosail's machine translation post-editing (MTPE) solutions can provide easier entry into translation for clients, higher translation efficiency, and more scalable management of translation workflows. The document also describes Lingosail's patent post-editing training course for translators, which saw hundreds of participants last year and resulted in trainees increasing their translation speed and quality after training.
This document discusses how to introduce machine translation (MT) into a company to improve localization processes. It outlines challenges with the current process of 30 localization loops involving 40 translators across different locations with no quality or cost control. Introducing MT for display text localization could speed up availability, lower costs by 25%, and reduce unnecessary translation loops by 50%. A short-term goal is to use MT for development phases with a final quality loop involving human translation and post-editing. Long-term preparation is needed to expand MT use while addressing risks, quality guidelines, and system environments.
This document discusses integrating XTM Cloud and TAUS DQF to enable higher quality translation projects. Key steps include creating accounts in both systems, configuring LQA parameters and issues in XTM, creating translation projects in XTM with LQA steps, performing translations and LQA reviews in XTM, and then viewing productivity and quality results in the TAUS DQF system. The integration is meant to provide benefits like higher productivity, improved quality, and better data to evaluate machine translation systems.
Quality standards in the industry have come a long way. They have evolved over the years, but their focus on quality definitions based on errors and metrics has remained the accepted wisdom. Expectations of end users are changing. Every piece of content has a job to do, and it is often to touch the heart of users rather than just the mind by delivering information that is accurate and whose quality is measurable. A new “quality evaluation paradigm” is emerging. This calls for a new profile for translators, one that is different from what has been typical for the past few decades. This presentation will look at this trend in more detail, considering how to test these new types of translators fast and effectively. What matters in this emerging quality model and what does it possibly mean for DQF?
2. This slide may not be used or copied without permission from TAUS
Moses Users – Finding Common Ground
Are there areas where Moses users (from industry) can cooperate? (beyond what is already done as part of MosesCore)
Area                 Cooperation
Knowledge Sharing    Yes
Sharing Investment   ?
Sharing Code         ?
3. Open/Proprietary
4. Agenda
14:00 Welcome
14:10 Introductions
14:30 Results Moses Survey
15:00 Moses Roadmap
15:30 Discussion on Areas for Cooperation
16:00 BREAK
16:30 Review/Prioritize Areas for Cooperation
17:15 Wrap Up and Adjourn
6. Introductions
o Konstantinos Chatzitheodorou, Alpha CRC
o Natalia Kljueva, Charles University
o Shadi Salen, Charles University
o Milan Condak, Condak.net s.r.o.
o Bonnie Dorr, DARPA
o Zdena Závůrková, IBM
o Anabela Barreiro, INESC-ID
o Adam Lopez, Johns Hopkins University
o Christian Buck, Lantis
o Michal Kašpar, Lingea s.r.o.
o Jacek Skarbek, LocStar
o Tomas Fulatak, Moravia
o Niko Papula, Multilizer
7. Introductions
o Daniel Rosàs, Pactera
o Francis Tyers, Prompsit
o WonYoung Seo, Samsung Electronics
o SeungWook Lee, Samsung Electronics
o Falko Schaefer, SAP AG
o Alexander Semerenko, Seznam.cz, a.s.
o Jie Jiang, Capita T&I
o Martin Baumgärtner, STAR Language Technology & Solutions GmbH
o Ronald Horsselenberg, TransIT BV
o Ulrich Germann, University of Edinburgh
o Varvara Logacheva, USFD
o Alex Yanishevsky, Welocalize
o Andrzej Zydroń, XTM-INTL
8. Introductions
Organisers
o Ondrej Bojar, Charles University
o Philipp Koehn, University of Edinburgh
o Barry Haddow, University of Edinburgh
o Hieu Hoang, University of Edinburgh
o Achim Ruopp, TAUS
o Rahzeb Choudhury, TAUS
10. MT @ Alpha CRC
o Working with MT since 2006
o Hybrid phrase-based MT system
o Post-editing cost evaluation
Developed Reverse Analysis, a methodology for
evaluating post-editing effort based on how much of the
MT output was edited
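The idea of judging post-editing effort by how much of the MT output was changed can be illustrated with a word-level edit distance. The sketch below is not Alpha CRC's Reverse Analysis, whose details are not given in the slides; it is a generic approximation, and the function names `word_edit_distance` and `edit_rate` are hypothetical.

```python
def word_edit_distance(mt_tokens, pe_tokens):
    """Levenshtein distance over tokens (insertions, deletions, substitutions)."""
    prev = list(range(len(pe_tokens) + 1))
    for i, mt_tok in enumerate(mt_tokens, start=1):
        curr = [i]
        for j, pe_tok in enumerate(pe_tokens, start=1):
            cost = 0 if mt_tok == pe_tok else 1
            curr.append(min(prev[j] + 1,          # delete an MT token
                            curr[j - 1] + 1,      # insert a post-edited token
                            prev[j - 1] + cost))  # keep or substitute
        prev = curr
    return prev[-1]


def edit_rate(mt_output, post_edited):
    """Edits per post-edited word; 0.0 means the MT output was left untouched."""
    mt_tokens = mt_output.split()
    pe_tokens = post_edited.split()
    if not pe_tokens:
        return 0.0 if not mt_tokens else 1.0
    return word_edit_distance(mt_tokens, pe_tokens) / len(pe_tokens)


if __name__ == "__main__":
    mt = "the house small is red"
    pe = "the small house is red"
    print(f"edit rate: {edit_rate(mt, pe):.2f}")
```

A plain edit distance counts a moved word as two edits, so a production metric would probably treat reordering separately; that refinement is left out here.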
11. Alpha MT flow
Selection of training data → Training (Moses, SRILM, MGIZA, MERT …) → Translation → Post-editing → Re-training
Rules insertion: POS, syntactic, morphology
Terminology
Legend: optional / mandatory
12. TAUS Moses Roundtable
My MTs and my CATs
Milan Čondák
Condak.net s.r.o., Petřvald
11-Sep-2013
Prague, Czech Republic
13. My MTs and my CATs
PC Translator
o My first MT was the Czech program PC Translator. This
software runs on Windows; one language is a foreign language
and the other is Czech. PC Translator has bidirectional
indexes and can translate in both directions.
o There were three main modules: a Dictionary, an Editor
and a Dictionary Manager.
o PC Translator worked in two modes: translating an
entire file or translating text in the Editor.
o During text translation, the terminology of the
open sentence was displayed.
14. Wordfast Classic in MS Word
o Wordfast Classic (WFC) offers MT integration
that works in MS Word
o So I asked the developer of PC Translator to create an API
for MS Word. He created three APIs: for MS Word, for
MS Outlook and for MS IE. Later he added APIs for more
email clients and web browsers.
o WFC began to use a new feature, the Companion. A new
window displays the terminology of the open segment
that is found in the Wordfast glossary.
15. Wordfast Classic in MS Word
16. Web translation services in my MT and CATs
o PC Translator can show offers from Google and Bing:
o http://www.condak.net/machine_t/cs/comprendo/cs/07.html
o MetaTexis for Word 2007 + Web MT Servers:
o http://www.condak.net/cat_other/virtaal/20130821/cs/02.html
o Free Translation via Internet works without registration.
o Virtaal Plugins - models for TM and MT:
o http://www.condak.net/cat_other/virtaal/20130821/cs/03.html
17. TAUS Moses Roundtable
MT in localization company
Jacek Skarbek
LocStar
11-Sep-2013
Prague, Czech Republic
18. Our experience with MT
o We are a software localization provider that has been
using CAT tools for 17 years – we have large TMs
o Some of our clients send us work with machine-translated
content (both for no matches and as an alternative to TM
fuzzies) produced by their own MT solutions – our job is to
treat the MT like fuzzy matches (no typical post-editing)
o For about 2 years we have been using MT for one of our
main customers. We buy MT content from a third party that
uses its own solution based on Moses and TMs provided
by the customer.
o We have collected large TMs and are testing our internal
Moses-based solution for use in a production
environment
19. Moses related problems/areas to improve
o Tag handling/inline markup – only partially resolved by
M4Loc solutions
o Lack of an API for better/easier integration of Moses into
production workflows
o We need better handling of terminology/software items, i.e.
something like <zone>, but with the phrase in the zone treated
separately from the rest of the sentence at the level of the TM and
as part of the sentence at the level of the LM
o Inflections in Slavic languages
20. Other problems
o Odd approach to MT rates – customers tend to cut the
translation rate by the same percentage as the measured (or
estimated) acceleration of the translation work itself, although
translation rates also cover project management and all
other linguistic and technical tasks that are not
accelerated by MT
o We are not allowed to use MT for some customers – it is
restricted by the work agreement. Although we treat
machine-translated content as fuzzy matches, they are afraid that it
would affect the final quality of the translation
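The pricing argument on this slide can be made concrete with a small calculation. The figures used below (translation work making up 60% of the rate, a 30% speed-up from MT) are invented purely for illustration and do not come from the slides; `fair_rate_reduction` is a hypothetical helper name.

```python
def fair_rate_reduction(translation_share, speedup):
    """Fraction by which the total rate falls if only translation work gets cheaper.

    translation_share: part of the rate that pays for translation itself (0..1);
                       the remainder covers project management and other tasks.
    speedup:           fraction of translation effort saved by MT (0..1).
    """
    if not (0.0 <= translation_share <= 1.0 and 0.0 <= speedup <= 1.0):
        raise ValueError("both arguments must lie between 0 and 1")
    return translation_share * speedup


if __name__ == "__main__":
    # Assumed figures for illustration only.
    reduction = fair_rate_reduction(translation_share=0.6, speedup=0.3)
    print(f"defensible rate reduction: {reduction:.0%}")
```

Under these assumed figures the defensible reduction is 0.6 × 0.3 = 18% of the rate, not the 30% a customer would apply by discounting the whole rate by the measured speed-up.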
21. TAUS Moses Roundtable
Samsung Electronics
Cooperation and SMT
Seung-Wook Lee, Wonyoung Seo
Samsung Electronics Corporation
11-Sep-2013
Prague, Czech Republic
22. Samsung Electronics Cooperation and Machine Translation
o Our team provides translation services for various internal
groupware applications (e.g., instant messenger) for the
department
o One of our main concerns is expanding language pairs
Very little bilingual corpus data is available for
most languages, such as Asian languages
Is indirect translation the solution? How do we deal with
error propagation?
Working groups and developer meetings for those
languages may be necessary
23. TAUS Moses Roundtable
Statistical Machine Translation
at SAP
Dr. Falko Schaefer
SAP Language Services
11-Sep-2013
Prague, Czech Republic
24. SMT Project at SLS
o SAP Language Services (SLS) has successfully worked
with rule-based MT for over 20 years
o However, the growing demand for a new breed of MT
meant that SLS began to embark on Moses-based
SMT in early 2013
o The SLS MT project aims to establish a new MT
service to reduce translation throughput time and
cost
o To that end SLS works with an external partner to
support implementation and knowledge transfer
26. Company Intro
o One of top 10 LSPs (language service providers)
o Department dedicated to MT and Language Tools
(evaluation of MT, productivity workbench, corpus
preparation, vendor selection, vendor training and
certification)
o MT agnostic
o MT integrated into TMS/GMS
27. Areas of Interest
o Productization of Moses
-lower barrier of entry
-interoperability
o Integration into TMS/GMS
o Tag handling
o Predictive modeling
29. Demographic Composition
[Bar chart: share of survey respondents per category for 2011, 2012 and 2013 (axis 0% to 35%). Categories: Consultant, Language Service Provider/Agency, Other, Research Institute, Translation Buyer, Translation Technology Provider, Translator]
30. Ranking of Requested Moses Improvements
2013 Rank  2012 Rank  2011 Rank  Requested improvement
1          1          3          Training and translation speed
2          4          1          Integrating Moses into existing workflow/system (e.g. TM integration)
3          3          2          Installing and using Moses
4          -          -          Terminology Management
5          2          4          Evaluation results (e.g. evaluating productivity)
6          5          5          Language-specific issues
7          -          -          Advanced features (e.g. tree-based translation)
8          7          7          Customer support
-          6          6          Easier to get the right human resources
31. #1 Requested Moses Improvement
Training and Translation Speed
o Users are aware that SMT requires a considerable
amount of computing resources
o Request driven by management and user demands
Fast-turn-around/online translation
Frequent re-training of systems with new/updated data
o Recommendation
Integrate recent training speed improvements into the
training tool chain
Document recommendations on how best to use the training
speed improvements
Further optimize performance for multi-threaded decoding
32. #2 Requested Moses Improvement
Integrating Moses into Existing Workflows/Systems
o Integration into growing number of diverse systems
TMS/CAT/TEnT
Content Management Systems
Automated Speech Recognition
Dialog Systems
…
o Recommendation
Comprehensive, stable and well documented APIs to the
decoder and data produced by it
RESTful HTTP API (Google/Bing compatible?)
Finish Okapi/M4Loc file format support
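A rough idea of what a "Google-compatible" RESTful interface might look like is sketched below using only the Python standard library. This is an illustration under stated assumptions, not an existing Moses API: the `/translate` path, the `translate` stub (which merely tags its input instead of calling a decoder) and the helper names are hypothetical, and the JSON shape is modelled loosely on the Google Translate v2 response format.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse


def translate(text, source, target):
    """Hypothetical stand-in for a real decoder call; it only tags the input."""
    return f"[{source}->{target}] {text}"


def build_response(query_string):
    """Turn a URL query string into (HTTP status, JSON-serializable body)."""
    params = parse_qs(query_string)
    texts = params.get("q", [])
    source = params.get("source", [""])[0]
    target = params.get("target", [""])[0]
    if not texts or not source or not target:
        message = "q, source and target are required"
        return 400, {"error": {"code": 400, "message": message}}
    translations = [{"translatedText": translate(t, source, target)} for t in texts]
    return 200, {"data": {"translations": translations}}


class TranslateHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        url = urlparse(self.path)
        if url.path != "/translate":
            status, body = 404, {"error": {"code": 404, "message": "not found"}}
        else:
            status, body = build_response(url.query)
        payload = json.dumps(body, ensure_ascii=False).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8080):
    """Run the sketch locally; the port number is an arbitrary choice."""
    HTTPServer(("127.0.0.1", port), TranslateHandler).serve_forever()
```

Calling `serve()` and requesting `/translate?q=hello&source=en&target=cs` would return the tagged text in a `data.translations` list. Keeping `build_response` separate from the HTTP handler means the request logic can be tested without opening a socket.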
33. #3 Requested Moses Improvement
Installing and Using Moses
o Still a range of installation experiences from “No
problem” to “very complex to understand and to
implement”
o Should Moses team provide installable packages?
o Windows support?
o UI?
o Recommendation
Occasional stable releases of Moses as installable packages
across different platforms – consistency is key
Take on the maintenance and release of required
components abandoned by their original developers
34. #4 Requested Moses Improvement
Terminology Management
o Terminology injection from terminological resources
Term bases
Named entity recognizers
o Recommendation:
Better documentation of the XML input feature
Ensuring that the XML input feature minimally impacts
translation quality of the overall sentence
Handling input with named entities marked up by named
entity recognizers
Ensure XML input can be handled in the complete tool chain
(e.g. tokenizer)
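Terminology injection through XML input can be pictured as a preprocessing step that wraps known terms in markup carrying the required translation. The sketch below assumes the `translation="..."` attribute form described in the Moses documentation; the tag name `term`, the function `mark_up_terms` and the sample termbase are arbitrary choices made here for illustration.

```python
import re
from xml.sax.saxutils import escape, quoteattr


def mark_up_terms(sentence, termbase):
    """Wrap each termbase match in the sentence with a translation attribute.

    termbase maps source-language terms to required target-language terms.
    Longer terms are tried first so that they win over their substrings.
    Matching is case-sensitive and works on untokenized text.
    """
    if not termbase:
        return escape(sentence)
    terms = sorted(termbase, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(" + "|".join(re.escape(t) for t in terms) + r")(?!\w)"
    )
    pieces = []
    last = 0
    for match in pattern.finditer(sentence):
        # Escape the plain text between matches so the result stays well-formed.
        pieces.append(escape(sentence[last:match.start()]))
        source_term = match.group(1)
        target_term = termbase[source_term]
        pieces.append(
            f"<term translation={quoteattr(target_term)}>{escape(source_term)}</term>"
        )
        last = match.end()
    pieces.append(escape(sentence[last:]))
    return "".join(pieces)


if __name__ == "__main__":
    # Hypothetical one-entry termbase.
    print(mark_up_terms("open the control panel now",
                        {"control panel": "Systemsteuerung"}))
```

The marked-up sentence would then go to the decoder with XML input enabled (in Moses, the `-xml-input` switch). Whether the tokenizer and the other preprocessing steps leave the markup intact is exactly the tool-chain concern raised in the recommendation.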
35. #5 Requested Moses Improvement
Evaluation
o Expansion of the metrics that can be used to tune
Moses MT systems
Specifically for MT+post-editing scenario
o Evaluation and productivity testing systems
Can be external
o Recommendation
Integrate tuning metrics into Moses that allow optimizing
systems for the MT+post-editing usage scenario
Ensure interoperability with external evaluation/productivity
testing systems, e.g. TAUS DQF, QT Launchpad
36. #6 Requested Moses Improvement
Language-Specific Issues
o Moses is focused on a relatively small set of European
languages
o Survey participants would like to see tools for more
languages included
o Full Unicode support
o Recommendation
Test and improve Unicode support in the language-
independent core
Recommend and document use of additional language tools
Encourage users to report Unicode issues and provide
language-specific data
37. #7 Requested Moses Improvement
Advanced Features for Moses Commercialization
o Received a broad cross-section of requests
o Researchers develop cutting-edge technologies that
could benefit industry
o Too often conversations still happen in distinct
academic and industry silos
o Recommendation
Start a conversation explaining how newly developed
methods and technologies can help the industry to address
critical MT issues, e.g.
o Tree-based/syntax-based models
o Morphologically rich languages
38. #8 Requested Moses Improvement
Customer Support
o Moses support mailing list considered excellent
o Few requests for professional support or faster
support response times
o Recommendation
Continue excellent support on mailing list
Improve documentation for some industry-relevant features
to allow easier adoption
39. Moses Open Source Project
Strengths
o Additions/updates to “core” Moses: decoder, training, LM
Latest methods
Benefiting all users
o Documentation and support
o MosesCore funded releases and tutorial
Weaknesses
o Few contributions by long-time industry Moses users
Adobe Moses Tools
DoMY CE
o Complexity of installation/use for entry-level users
40. Moses Future
Academic Project
o Sharing platform for research progress
o Unstable code base
o Complex use
o Integration by few sophisticated technology providers
o Similar OSS project
HTK speech recognition toolkit
Broad Adoption
o Ease of installation
o Ease of use for diverse scenarios
o Pre-trained engines
o Similar OSS projects
PostgreSQL
NLTK (Natural Language Toolkit)
CMU Sphinx
Not mutually exclusive!
41. PostgreSQL
Object-relational database management system
o Started in 1986 by Michael Stonebraker at UC
Berkeley
o Evolved from research project into universal RDBMS
o Used by Apple, BASF, Skype, Redhat, governments,
universities …
o Broad contributor base
Often industry funded
o PostgreSQL license (similar to MIT license)
o Commercial support through EnterpriseDB
o Consulting/training available
42. Discussion To Follow
o Discuss industry needs
o Identify areas of industry cooperation
o Discuss line between open source project and
proprietary add-ons
43. Open/Proprietary
46. Development in Moses
• Moses is mainly developed in academia
• Academic research progress is somewhat unpredictable
• Biases
1. quality
2. scalability
3. usability
• Sometimes research use cases do not match industry use cases
(e.g., translation of news vs. technical documentation)
Philipp Koehn Roadmap 11 September 2013
48. Progress in Models
[Timeline, 1990 to 2010: word-based models, phrase-based models, formal grammar-based models, linguistic grammar-based models, semantics]
49. Progress in Methods
[Timeline, 1990 to 2010: probabilistic models, parameter tuning, large-scale discriminative training]
51. Quality
Some examples from UEDIN systems in WMT 2013
• Better machine learning methods
– operation sequence model
• Linguistically motivated models
– syntax-based machine translation model
• More data
– training a language model on 130 billion words
54. Huge, I Say Huge!, Language Model
• Unpruned 5-gram language model trained on 130 billion words
• Training straightforward [Heafield et al., ACL 2013]
• Decoding requires 1TB RAM machine
• Best performance at WMT2013 (manual judgment)
Language pair    System          Score
Spanish–English  UEDIN-HEAFIELD  0.624
Spanish–English  "ONLINE-B"      0.595
Spanish–English  UEDIN           0.570
French–English   UEDIN-HEAFIELD  0.638
French–English   UEDIN           0.604
French–English   "ONLINE-B"      0.591
Czech–English    UEDIN-HEAFIELD  0.607
Czech–English    "ONLINE-B"      0.582
Czech–English    UEDIN           0.562
55. Usability
• Main uses of Moses
– productivity tool for professional translators
– gisting for information discovery
• Research driven by real-world use
– incremental training
– handling of tags
– terminology management
– quality estimation
56.
• Integration of statistical MT and collaborative translation memories
• Novel technology
– Self-tuning machine translation
– User adaptive machine translation
– Informative machine translation
• Open source workbench
• Extensive testing by translation agency
57.
• Cognitive studies of translator behaviour based on key logging and eye tracking
• Novel types of assistance to human translators
– interactive translation prediction
– interactive editing
– adaptive translation models
• Open source workbench
• Field tests by translation agency and online volunteer translation platforms
58.
• Use of machine translation for community content
• Novel technology
– Pre-editing of content
– Monolingual and bilingual post-editing
– Development of feedback loops
• Use in
– commercial product forum relating to Symantec network security products
– content in community of volunteer translators Traducteurs sans Frontières
59. The Future: Better Models
• Syntax-based and semantic statistical models
– improvements to basic tools of natural language processing
– requires annotated data resources, annotation standards
– new models, training methods, inference algorithms
• Exploitation of data: machine learning
– different types: parallel, comparable, monolingual, interactive
– scaling up of existing machine learning methods
– adaptation to user needs
• Integration with other technologies
– human translation and localization workflows
– speech recognition
– dialog systems
– information retrieval
– data mining
– communication systems
60. The Future: Better Usability
• Installation
– MOSESCORE installer, pre-built binaries
– pre-installed virtual machines for Amazon EC et al.
• Resources
– ongoing efforts to make data publicly available
– memory and time efficient training and decoding
• Integration into workflows
– addressing requirements of professional translators
– industry-led projects on handling tags, untranslated terms, terminology
– MOSESCORE "arrows" workflow management
– various server process implementations, e.g., based on Google API
62. TAUS Moses Roundtable
Review and Discussion of
Sharing Options in the
Industry
Rahzeb Choudhury, Achim Ruopp
TAUS
11-Sep-2013
Prague, Czech Republic
63. Sharing Knowledge
o TAUS Machine Translation Showcases
Co-located with Localization World Conferences
Familiarizing the industry with Moses/SMT
Users share experiences
Panel discussions
o TAUS Machine Translation and Moses Tutorial
Online tutorial teaching theory and practice
300+ registered users
Developed in collaboration with UEdin
o This TAUS Moses Roundtable
64. Sharing Code
o DoMY CE
Prepare training corpora
Train & tune SMT models
Manage SMT resources
Translate documents
o M4Loc – Moses for Localization
Integration with popular open source Okapi localization
framework
Adobe Moses Tools
o In Moses /contrib folder
Moses for Mere Mortals
Several web APIs
o Language-specific non-breaking prefix files
65. Industry Sharing
Knowledge
Investment
Code
66. Discussion: Ideas for Sharing
o What are common use scenarios?
(among participants)
o MT as a productivity enhancer
Beginner, Pilot, Implementation, Production, Ongoing rollout
o MT to gist – no participants involved in the scenario
o How do we make them easier to achieve?
67. Beginners - Installing and Using Moses
o Still a range of installation experiences from “No
problem” to “very complex to understand and to
implement”
o Should Moses team provide installable packages?
o Windows support?
o UI?
o Recommendation
Occasional stable releases of Moses as installable packages
across different platforms – consistency is key
Take on the maintenance and release of required
components abandoned by their original developers
68. Beginners - Installing and Using Moses
o Conclusions during meeting:
The resources available (Moses site, support list, MT and
Moses Tutorial) are sufficient
The v1 release is very welcome; participants look forward to
future releases
Try to make these resources more easily discoverable, and
ensure the documentation stays up to date and easy to use
69. Implementation
Integrating Moses into Existing Workflows/Systems
o Integration into growing number of diverse systems
TMS/CAT/TEnT
Content Management Systems
Automated Speech Recognition
Dialog Systems
…
o Recommendation
Comprehensive, stable and well documented APIs to the
decoder and data produced by it
RESTful HTTP API (Google/Bing compatible?)
Finish Okapi/M4Loc file format support
70. Implementation
Integrating Moses into Existing Workflows/Systems
o Conclusions during meeting:
Main areas of cooperation (APIs and formatting) covered by
current activity
TAUS to help with next steps for Moses4Loc (Formatting) to
help ensure there is thorough testing
71. Production
Training and Translation Speed
o Users are aware that SMT requires a considerable
amount of computing resources
o Request driven by management and user demands
Fast-turn-around/online translation
Frequent re-training of systems with new/updated data
o Recommendation
Integrate recent training speed improvements into the
training tool chain
Document recommendations on how best to use the training
speed improvements
Further optimize performance for multi-threaded decoding
72. Production
Training and Translation Speed
o Conclusions during meeting:
Participants did not have any specific ideas beyond what the
MosesCore consortium members are already doing
73. Other Issues/Ideas Raised
o Lack of Data
The TAUS Data repository was shown as a potential source of
training data
o Interoperability
Going forward it would be good to be able to share translation
and language models
Participants briefly discussed the complexity of the challenge
o Shared engines
It was suggested that baseline language/industry/domain engines
be made available
Making the engines built as part of the TAUS Developing Talent
project available may be a good start. TAUS will look into this.