This document discusses collaborative translation and outlines some common issues. It begins with a brief history of attitudes toward collaborative translation from 2005 to 2011. It then outlines different flavors of collaborative translation, such as crowdsourcing, terminology resources, and translation memory sharing. Common challenges are discussed, such as alignment with business goals, quality control, crowd motivation, and defining the professional's role. The talk concludes that capturing best practices for collaborative translation in the form of design patterns would be useful.
Wanted: Best Practices for Collaborative Translation
1. Wanted: Best Practices for Collaborative
Translation
Alain Désilets
National Research Council of Canada
alain.desilets@nrc-cnrc.gc.ca
2. A (Very) Brief History
of Collaborative
Translation
Circa 2005: “Wikis, what's that?”
Circa 2006: “I know about Wikipedia, but I hear it’s garbage because
anyone can write anything on it.”
Circa 2007: “You know, I have been to Wikipedia a couple of times and
was pleasantly surprised by the quality of what I found there.”
Circa 2008: “Actually, this wiki stuff is really interesting. Now, I routinely
use Wikipedia in my work, and although I am cautious with it, I find it
useful.
I have a sense that this wiki/collaborative stuff will have a wider impact
for translation, but I’m not quite sure how.”
3. A (Very) Brief History
of Collaborative
Translation (2)
Circa 2009: “Whoa! Translation Crowdsourcing is going to put me out of a
job!”
Circa 2010: “Well, I guess that was a storm in a teacup. Crowdsourcing
will be used in some specific and limited contexts, but it won't take
over. Maybe this is more an opportunity for us than a threat…”
Circa 2011: “Hmm… getting this collaborative translation stuff to work
right is hard and confusing.”
4. Talk Outline
The different “flavors” of Collaborative Translation
Common issues and Challenges in Collaborative
Translation
Capturing Collaborative Translation best-practices
in the form of Design Patterns
7. Definition
Collaborative Translation is the use
of any open online collaboration
technology or process, in order to
help with translation tasks, or tasks
related to translation (ex:
terminology).
8. Available in the
following flavors…
• Translation crowdsourcing
• Collaborative terminology resources
• Translation memory sharing
• Online marketplaces for translators
• Agile translation teamware
• Post-editing by the crowd
9. Translation
Crowdsourcing
Mechanical-Turk-like systems to support the translation of content
by large crowds of mostly amateurs, through an open-call
process.
By far the most talked about collaborative translation approach
• Software user interface (Facebook, Adobe, Symantec, Firefox)
• Technical documentation (Adobe, Symantec)
• Transcripts of videos of an “inspirational” nature (TED Talks, Adobe TV)
• Humanitarian aid content (Translators without Borders, Kiva.org, Haiti
Earthquake Mission)
• Large scale collection of linguistic data for research purposes or machine
translation training (NAACL Workshop on Crowdsourcing).
10. Collaborative
terminology resources
Wikipedia-like platforms for the creation and maintenance of large
terminology resources by a crowd of translators, terminologists,
domain experts, and even general members of the public.
• Wikipedia
• Wiktionary
• ProZ’s Kudoz forum
• Urban Dictionary
• TermWiki.com
• TikiWiki
• Reverso dictionary
11. Translation memory
sharing
Platforms for large scale pooling and sharing of multilingual
parallel corpora between organizations and individuals.
• TAUS Data Association
• MyMemory
• Google Translator Toolkit
• WeBiText
Often, collaboration is “implicit”, for example, in the case of
WeBiText.
12. Online marketplaces
for translators
eBay-like disintermediated environments for connecting customers
and translators directly, with minimal intervention by a middle
man.
• ProZ.com
• TranslatorsCafe
• Translated.net
Collaborative aspects come from things like “open call sourcing”
and reputation management based on community assessment.
13. Agile translation
teamware
Wiki-like systems and processes that allow multidisciplinary teams
of professionals (translators, terminologists, domain experts,
revisers, managers) to collaborate on large translation projects,
using an agile, grassroots, parallelized process instead of the
more top-down, assembly-line approach found in most
translation workflow systems.
No specific software or site, but many case studies describing how
to implement this approach, using general purpose
collaboration tools like wikis, BaseCamp.
• Beninatto & De Palma, 2008,
• Calvert, 2008
• Yahaya, 2008
Some translation workflow systems are starting to market themselves
as being “collaborative”.
14. Post-editing by the
crowd
Systems allowing a large crowd of mostly amateurs to correct the
output of machine translation systems, often with the aim of
improving the system’s accuracy.
• Asia Online’s Wikipedia translation project
• Google Translate allows anonymous users to correct the
outputs produced by the system
• Likewise for Microsoft’s Bing Translator
15. Is this REALLY New?
Weren’t Terminology Databases, Translation Memories and
Translation Workflow Systems already collaborative?
• Yes, but…
• … Collaborative Translation is about using these kinds of
groupware technologies in the context of much larger groups or
communities, where people have fewer reasons to trust each
other a priori.
It’s one thing to open yourself to collaboration with colleagues and
customers.
It’s quite another thing to open yourself to the whole world.
17. This is NOT easy
Choosing a flavor and tailoring it to your needs is still somewhat of
a black art, guided by trial and error.
There are lots of important and poorly understood issues that
arise, many of which are common to most flavors:
• Alignment with business goals
• Quality control
• Crowd motivation
• Proper role of professionals
18. Alignment with
business goals
Why are you doing this in the first place? Which flavor can deliver
what you want?
The actual benefit you get from a given flavor is not necessarily
what you think!
Translation crowdsourcing
• Reduce cost? – Yes, but not the biggest benefit.
• Decrease lead time? – Definitely.
• Translation more in-tune with target audience’s idiosyncrasies?
-- Also
• Most importantly: Increase brand loyalty by engaging end-users
as co-creators of products, instead of passive consumers.
19. Quality Control
How to control quality when you open yourself to contributions
from a potentially large group of “outsiders”?
Many ways:
• Screen contributors before letting them in (ex: Translators
without Borders, Kiva.org).
• Have members of the community vote on the quality of each
other’s work (ex: Facebook, Translated.net).
• Have in-house professionals revise the work done by the
community (ex: Facebook).
20. Quality Control (2)
Do not assume that quality of community-produced content will be
lower.
For instance, Wikipedia provably measures up to professionally
produced encyclopedias like Britannica (English) and Brockhaus
(German).
Quality issues tend to iron themselves out provided that you attract
a sufficiently large number of the right people.
Wisdom-of-crowds effects work surprisingly well when the
following conditions are met:
• Diversity
• Independence
• Aggregation
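As a rough illustration of the aggregation condition, here is a minimal sketch of majority voting over candidate translations. It is not taken from the talk or from any of the platforms mentioned: the function name `aggregate_votes`, the `min_votes` threshold, and the sample votes are all hypothetical, and the sketch assumes each vote was cast independently by a different contributor.

```python
from collections import Counter


def aggregate_votes(votes, min_votes=3):
    """Pick the candidate translation with the most votes.

    `votes` is a list of candidate strings, one per contributor
    (hypothetical data shape). Returns (winner, share_of_votes),
    or None when too few votes exist to aggregate meaningfully.
    """
    if len(votes) < min_votes:
        return None
    winner, count = Counter(votes).most_common(1)[0]
    return winner, count / len(votes)


# Hypothetical votes for a French translation of a "Save" button label.
votes = ["Enregistrer", "Sauvegarder", "Enregistrer", "Enregistrer", "Sauver"]
result = aggregate_votes(votes)
print(result)
```

The share of votes returned alongside the winner could be used to flag low-agreement items for professional review, which is one way to combine voting with in-house revision.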
21. Crowd motivation
If you are to attract and retain enough of the “right people” you
need to understand why they might contribute.
• Mandated by management (ex: Agile Translation teamware)
• Emotional bond with the content (ex: Facebook, and surprisingly, Adobe)
• Prestige of the content (ex: TEDTalks)
• Wanting to do good (ex: Translators without Borders, Kiva, Haiti Earthquake
Mission, Data collection for scientific research)
• Pride in one’s native language (ex: Data collection for R&D in MT for
low-density languages)
• Trying to perfect second language skills
• Trying to make a go of a professional translation career (ex: Kiva.org)
• And in some cases, $$$
– Will this be the dominant scenario?
– How to set compensation high enough to attract good contributors, but not so
high that it interferes with more intrinsic motivations, or attracts people out to
game the system?
22. Role of professionals
Some flavors of CT are designed specifically for professionals (ex: Agile
translation teamware, Online translator marketplaces).
But some (e.g. Translation crowdsourcing), tend to de-emphasize their role.
When should professionals be involved, and what should be their role?
• Revise work done by amateurs?
– Focus on more challenging aspects of translation like terminology,
style, fluidity?
• Manage and coach the crowd?
• Focus on more mission-critical and hard to translate content?
Translation Crowdsourcing may actually increase the size of the pie, by
making it possible to tackle content and/or small languages that would
otherwise not have been dealt with at all.
24. Wanted: Best-Practices
Collaborative Translation presents practitioners with a varied and
complex set of different approaches and technologies.
Selecting a flavor and tuning it to meet your needs is complex.
We need some sort of concise, easy-to-consult repository of
best practices for that field.
We propose a way to collaboratively create such a repository as a
community, in the form of a design pattern language.
26. About Design Patterns
A format for describing a common
solution to a common problem in
a given field
Originally used in Architecture, but
since then adopted in other fields
such as Software Engineering,
Education, etc.
27. Design Patterns
Example
Publish Contributions Rapidly
Context
This pattern is useful for motivating contributors in any collaborative
translation context, but it is particularly useful in translation crowdsourcing
scenarios.
Problem
Contributors are often motivated by a desire to have a positive impact on the
community they are participating in. However, they cannot achieve this
sense of being useful, if their contributions do not become available to the
rest of the community in a reasonable amount of time.
Solution
Therefore, minimize the delay between the moment when a member of the
community contributes to the site, and the moment where it becomes
publicly available to the rest of the community. Ideally, the contribution
should become visible to the rest of the community as soon as the user
clicks on the Save button.
28. Design Patterns
Example (2)
Related patterns
– Point System is another way for a contributor to get a sense of how useful he
has been to the community.
– Campaign Progress Gauge is another practice which allows members of the
community to see the positive impact of their actions. The main difference is
that it operates more at a community/project level rather than at an
individual/contribution level.
Real-life examples
– At Facebook, translations become available in a matter of hours.
– In the context of software localization by the crowd, Adobe makes a conscious
effort to wrap the community's translations into every new release of the
product.
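The pattern format used in this example (name, context, problem, solution, related patterns, real-life examples) can be sketched as a small record type. This is only an illustrative assumption about how such a repository might store its entries; the `DesignPattern` class and its `to_text` method are hypothetical and do not describe the actual wiki. The field texts are shortened from the slides above.

```python
from dataclasses import dataclass, field


@dataclass
class DesignPattern:
    """One best practice, using the section headings shown on the slides."""
    name: str
    context: str
    problem: str
    solution: str
    related_patterns: list[str] = field(default_factory=list)
    examples: list[str] = field(default_factory=list)

    def to_text(self) -> str:
        """Render the pattern as plain text, one section per heading."""
        lines = [
            self.name,
            "Context", self.context,
            "Problem", self.problem,
            "Solution", self.solution,
            "Related patterns", *[f"- {p}" for p in self.related_patterns],
            "Real-life examples", *[f"- {e}" for e in self.examples],
        ]
        return "\n".join(lines)


pattern = DesignPattern(
    name="Publish Contributions Rapidly",
    context="Any collaborative translation context, especially crowdsourcing.",
    problem="Contributors cannot feel useful if their work is slow to appear.",
    solution="Minimize the delay between contribution and public availability.",
    related_patterns=["Point System", "Campaign Progress Gauge"],
    examples=["At Facebook, translations become available in a matter of hours."],
)
print(pattern.to_text())
```

Keeping related patterns as a list of names makes it easy to cross-link entries, which is what turns a set of isolated practices into a pattern language.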
29. TAUS Roundtable on
Collaborative
Translation
Wiki “Barn Raising” workshop held on October 12th, 2011 at
Localization World in Santa Clara.
12 practitioners
• One third with hands-on experience of CT (NRC, Adobe,
Symantec, Kiva, World Wide Lexicon)
• Two thirds with no experience, but a strong interest in trying it
(In Every Language, MemSource, Firma 8, SPIL Games)
Talks by the experienced users about what worked and didn’t.
Followed by brainstorming of what the recurring best-practices
seem to be.
30. The Best-Practices
End result:
=> 50+ best practices organized into 6 themes
Planning and Scoping
Translation as User Engagement, Align Stakeholder Expectations, Early and
Continuous Clarification of Translator Expectations, Backup Plan, Project
Check Points, Appoint Initial Community Manager, Clear Objectives, Identify
Compatible Content
Community Motivation
Campaign Progress Gauge, Contributor Recognition, Leader Board, Official
Certificate, Point System, Offer Double Points, Hand-Out Unique Branded
Products, Contributor of the Month, Grant Special Access Rights, Playful
Casual Translation, Campaign, Publish Contributions Rapidly, Playful
Competition Among Contributors
31. The Best-Practices (2)
Quality
Content-Specific Testing, Entry Exam, Peer Review, Automatic Reputation
Management, Random Spot-Checking, Revision Crowdsourcing, Users as
Translators, Voting, Transparent Quality Level, Publish then Revise
Contributor Career Path
Flexible Contributor Career Path, Lurker to Contributor Transition, Anonymous
Translation, Find the Leaders, Support Variable Levels of Involvement,
Community Manager, Content Prioritizer
Right Sizing
Appropriate Chunk Size, Community-Appropriate Project Size, Break Up
Crowd Into Teams, Require Minimal Involvement Level, Keep the Crowd
Small, Volunteer Team Leaders
32. The Best-Practices (3)
Tools and Processes
Hint at Content Priority, First In, First Out, Task Self Selection, Layered
Fallbacks, Official Linguistic Resources, Automatic Suggestions, Provide
Context, In-Place Translation, Community Forum, Analytics for Content
Prioritization, Simplicity First, Good Examples of Contributions, Encourage
Self-Set Deadlines
33. Some Observations
The bulk of practices relate to Translation Crowdsourcing.
=> We need to spend more time capturing practices for other
flavors of Collaborative Translation
The bulk of the practices so far are not specific to translation.
• They would be useful in the context of crowdsourcing efforts in
any domain.
• Maybe all we need is to codify and/or learn about the best
practices for crowdsourcing in general?
The more similar two organizations are, the more similar their
practices will be (ex: Kiva and TWB, versus Kiva and Adobe).
35. Conclusion
Collaborative Translation presents practitioners with a very large
and varied set of tools and processes.
Choosing a particular flavor of CT and tailoring it to meet one’s
needs can be a daunting task.
We need a concise, easy-to-consult, modular compendium of
current best practices in that area.
We have started building such a compendium in the form of a wiki
site (www.collaborative-translation-patterns.com) which
captures best practices in the form of design patterns.
We invite everyone in this room to contribute to it if they can.