What is hiatus?
hiatus is a localization QA tool. It is like xBench, but differs in the details.
See slides 4 & 5 for details of what you can check with this tool.
You can get it from here: https://github.com/ahanba/hiatus
Requisites?
Ruby 1.9.2 or 1.9.3
Ruby gems:
  tk (when you install Ruby, check the option for tk)
  nokogiri (gem install nokogiri)
  rubyzip (gem install zip)
Windows XP/Vista/7 Japanese*
MS Office Excel (mandatory), Word (optional)
*By default, the character code for Excel is set to Shift-JIS (the Windows JA OS code). If you change that setting, you can use hiatus on an OS in another language.
Supported bilingual file formats
  CSV (LocStudio Dump)
  TXT (Tab Separated)
  DOC*
  DOCX*
  RTF* (Trados RTF)
  XLZ (Idiom XLZ)
  TTX
  TMX
  TBX
  XLS*
  XLSX*
*To check these, Microsoft Word/Excel is required.
What can be checked?
  Glossary Error
  Style Check
  Monolingual Search
  Check Numbers
  Inconsistency (Source > Target)
  Inconsistency (Target > Source)
  Length
  Skipped Translation
  Missing/Added Tag
  Unsourced Term (Source > Target)
  Unsourced Term (Target > Source)
Auto Conversion
When you run the Glossary check, you can enable the “Auto Conversion” option.
For example, if you search for “make” with this option enabled, the following terms are also detected:
[make, makes, made, making]
Currently (05/11/2012), this function is implemented only for English.
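To make the idea concrete, here is a minimal Ruby sketch of how a base form could be expanded into its inflected forms. It is not taken from hiatus: the names IRREGULAR, expand_term and term_pattern, the small irregular-verb table, and the crude suffix rules are all assumptions made for illustration.

```ruby
# Hypothetical sketch of "Auto Conversion": expand an English base form
# into a pattern that also matches its inflected forms.
# IRREGULAR and the suffix rules are illustrative assumptions, not hiatus code.
IRREGULAR = {
  "make" => %w[make makes made making]
}

# Return the list of word forms to search for.
def expand_term(term)
  return IRREGULAR[term] if IRREGULAR.key?(term)
  stem = term.sub(/e\z/, "")            # "use" -> "us", "check" -> "check"
  [term, term + "s", stem + "ed", stem + "ing"].uniq
end

# Build one case-insensitive, whole-word pattern from all forms.
def term_pattern(term)
  alternatives = expand_term(term).map { |form| Regexp.escape(form) }
  Regexp.new("\\b(?:" + alternatives.join("|") + ")\\b", Regexp::IGNORECASE)
end

pattern = term_pattern("make")
segments = ["She makes tea.", "It was made here.", "Remake it."]
hits = segments.select { |s| s =~ pattern }
# hits keeps the first two segments; "Remake" is skipped by the word boundary.
```

The suffix rules are deliberately naive (they do not double consonants, for instance), so a real implementation would need a fuller rule set or a dictionary.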
Set up
1. Download & install Ruby 1.9.3 from http://rubyinstaller.org/
   Check the “Install tk” option.
2. Download & set up RubyGems from http://rubygems.org/pages/download
   Just run “ruby setup.rb” to set it up.
3. Run the following gem commands:
   gem install nokogiri
   gem install zip
Overview
These are the components of hiatus.
To run hiatus:
1. Edit Config.yaml
2. Run hiatus.rb
Checks
  Glossary Error   Source-Target bilingual check
  Monolingual      Source or Target monolingual check
  Number Check     Checks that numbers are consistent with the source
  Inconsistency    Source<->Target inconsistency
  Length           Catches translations that are too long or too short
  Unsourced        Catches English terms that are in the translation but not in the source
  Tag              Missing/Added tag check
  Skip             Catches potentially skipped translations
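As an illustration of what a number consistency check involves, the sketch below extracts the numbers from a source and a target segment and reports the ones that do not pair up. The function names numbers_in and number_mismatches and the number pattern are assumptions for this example, not hiatus internals.

```ruby
# Hypothetical sketch of a number check (names and pattern are assumptions).

# Extract number tokens: digits, optionally with "." or "," between digit groups.
def numbers_in(text)
  text.scan(/\d+(?:[.,]\d+)*/)
end

# Remove one occurrence of each item of `other` from a copy of `list`.
def subtract_once(list, other)
  rest = list.dup
  other.each do |n|
    i = rest.index(n)
    rest.delete_at(i) if i
  end
  rest
end

# Numbers present in the source but not the target (:missing) and vice versa (:added).
def number_mismatches(source, target)
  src = numbers_in(source)
  tgt = numbers_in(target)
  { :missing => subtract_once(src, tgt), :added => subtract_once(tgt, src) }
end

result = number_mismatches(
  "Wait 30 seconds, then retry 3 times.",
  "Warten Sie 30 Sekunden und versuchen Sie es 5 Mal erneut."
)
# result reports "3" as missing from the target and "5" as added.
```

Comparing the tokens as strings means that locale-specific formats (1,000 versus 1.000) are flagged as mismatches; whether that is desirable depends on the project.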
Edit Config.yaml
  bilingual       Path of the target bilingual files (root dir). Files in subfolders are also checked.
  output          Output path of the report
  report          Output report format. Currently only “XLS”.
  source/target   Two-character source/target language code
  glossary        Path of the glossary files
  monolingual     Path of the monolingual files
  check section   Enter True or False for each check. Set True for the checks you want to run.
  ignore100       If True, 100% match segments are skipped. Valid for TTX & XLZ.
  ignoreICE       If True, ICE match segments are skipped. Valid for TTX & XLZ.
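The following Ruby sketch shows what such a configuration might look like and how it can be read with the standard yaml library. The top-level keys come from the list above; the exact layout, the sample paths, and the check names under "check" are assumptions and may differ from the Config.yaml shipped with hiatus.

```ruby
require "yaml"

# Hypothetical Config.yaml content. Top-level keys follow the slide;
# the paths and the check names under "check" are illustrative assumptions.
SAMPLE_CONFIG = <<YAML_TEXT
bilingual: C:/work/project/bilingual
output: C:/work/project/report
report: XLS
source: en
target: ja
glossary: C:/work/project/glossary
monolingual: C:/work/project/monolingual
check:
  glossary: True
  monolingual: False
  numbers: True
ignore100: True
ignoreICE: False
YAML_TEXT

config = YAML.load(SAMPLE_CONFIG)

# Collect the names of the checks switched on with True.
enabled_checks = config["check"].select { |_name, on| on }.keys
```

With this sample, enabled_checks contains "glossary" and "numbers", and config["ignore100"] is the boolean true rather than a string.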
Config of Glossary file
3-column tab-separated values: Source Term, Target Term, Option
TXT file, UTF-8 without BOM
Options:
  i              Regexp, Autoconv = ON, ignore case
  m              Regexp, Autoconv = ON, multiline
  e              Regexp, Autoconv = ON, extended
  Start with #   Regexp = ON. If it starts with #, no Autoconv.
  blank          Regexp = OFF, Autoconv = OFF. Case sensitive.
  z              Regexp = OFF, Autoconv = OFF. Ignore case.
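The option column can be read as a recipe for building a matcher. The sketch below is one possible interpretation in Ruby, not the hiatus implementation: source_matcher and glossary_error? are hypothetical names, the "#" case is assumed to mean an option field that starts with "#", the target term is assumed to be matched literally, and Auto Conversion is left out for brevity.

```ruby
# Hypothetical interpretation of the glossary option column.
# Assumptions: "#" means the option field starts with "#"; Autoconv is not applied here.
FLAGS = {
  "i" => Regexp::IGNORECASE,
  "m" => Regexp::MULTILINE,
  "e" => Regexp::EXTENDED
}

# Build the pattern used to find the source term in a source segment.
def source_matcher(term, option)
  case option.to_s
  when ""            then Regexp.new(Regexp.escape(term))                      # literal, case sensitive
  when "z"           then Regexp.new(Regexp.escape(term), Regexp::IGNORECASE)  # literal, ignore case
  when "i", "m", "e" then Regexp.new(term, FLAGS[option])                      # term is a regexp
  when /\A#/         then Regexp.new(term)                                     # regexp, no Autoconv
  else raise ArgumentError, "unknown option: #{option}"
  end
end

# True when the source term occurs but the target term is absent from the target.
def glossary_error?(glossary_line, source_segment, target_segment)
  src_term, tgt_term, option = glossary_line.chomp.split("\t", 3)
  return false unless source_segment =~ source_matcher(src_term, option)
  !target_segment.include?(tgt_term)
end

glossary_error?("file\tDatei\tz", "Open the File menu.", "Offnen Sie das Menu.")
glossary_error?("file\tDatei",    "Open the File menu.", "Offnen Sie das Menu.")
```

The first call reports an error because "z" ignores case and "Datei" is missing; the second does not, because a blank option is case sensitive and "file" never matches "File".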
Config of Monolingual file
4-column tab-separated values: s or t, Term, Option, Comment
TXT file, UTF-8 without BOM
s or t: s = search the source, t = search the target
Options:
  i              Regexp, Autoconv = ON, ignore case
  m              Regexp, Autoconv = ON, multiline
  e              Regexp, Autoconv = ON, extended
  Start with #   Regexp = ON. If it starts with #, no Autoconv.
  blank          Regexp = OFF, Autoconv = OFF. Case sensitive.
  z              Regexp = OFF, Autoconv = OFF. Ignore case.
Comment: Comment to display on the report
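A monolingual rule differs from a glossary rule mainly in that it picks one side of the segment and carries a comment for the report. The Ruby sketch below illustrates that routing under stated assumptions: Rule, parse_monolingual_line and monolingual_hits are hypothetical names, segments are modeled as hashes with :source and :target keys, the "#" case is assumed to be an option field starting with "#", and Auto Conversion is not applied.

```ruby
# Hypothetical sketch of a monolingual rule (all names are assumptions).
Rule = Struct.new(:side, :pattern, :comment)

# Parse "s-or-t<TAB>term<TAB>option<TAB>comment" into a Rule.
def parse_monolingual_line(line)
  side, term, option, comment = line.chomp.split("\t", 4)
  flags = { "i" => Regexp::IGNORECASE, "m" => Regexp::MULTILINE, "e" => Regexp::EXTENDED }
  pattern =
    case option.to_s
    when ""            then Regexp.new(Regexp.escape(term))
    when "z"           then Regexp.new(Regexp.escape(term), Regexp::IGNORECASE)
    when "i", "m", "e" then Regexp.new(term, flags[option])
    when /\A#/         then Regexp.new(term)
    else raise ArgumentError, "unknown option: #{option}"
    end
  Rule.new(side.downcase, pattern, comment.to_s)
end

# Search only the side named by the rule and pair each hit with the comment.
def monolingual_hits(rule, segments)
  key = rule.side == "s" ? :source : :target
  segments.select { |seg| seg[key] =~ rule.pattern }
          .map { |seg| [seg[key], rule.comment] }
end

rule = parse_monolingual_line("t\t  \t\tDouble space")
segments = [
  { :source => "Click OK.",  :target => "Klicken Sie  auf OK." },
  { :source => "Save  file", :target => "Datei speichern" }
]
hits = monolingual_hits(rule, segments)
```

Here the rule searches targets only, so the double space in the second source segment is ignored and hits contains just the first target together with its comment.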