RedPen, a document checker

  1. RedPen, a Document Checker. Takahiko Ito
  2. Background: programming environment. Software engineers use many tools when developing software. Tools: CheckStyle, FindBugs, lint, Valgrind, CI, etc. ➔ These tools help maintain quality.
  3. Background: writing situations. Software engineers write a large amount of natural-language documents. Examples: manuals, tutorials, blog posts, specifications. Unfortunately, there is no handy tool for checking the quality of documents. ➔ The quality of documents does not improve.
  4. Motivation. Formatting issues can be checked automatically, so writers can concentrate on the content of their documents. ➔ We have made RedPen, a document checker.
  5. What is RedPen? A validation tool for documents written in natural languages, e.g., English, Japanese, and Chinese. Target: technical papers, manuals, and so on.
  6. Function of RedPen. RedPen detects problems in input documents. Problems: sentence length, inconsistent terminology, misspellings, …
  7. Example: low-quality text. "Some of software works in more than one machines and such distributed software can handle large amount of data or works in severe environments because such software make use of much computer resources. In this paper we call a server works in a cluster as ‘instance.’ for example, in search engines or distributed databases, the fractions of indexes are stored in multiple instances.Such system need a component to merge the query results before the return the results to the users." Callouts on the slide: Small letter! Too long sentence! Need space!
  8. Features of RedPen: handy configuration; language independent.
  9. Usage: RedPen. Users pick the items to check (validators). RedPen provides many validators.
  10. Example of RedPen configuration: <validator-list> <validator name="SentenceLength" /> <validator name="InvalidCharacter" /> <validator name="SpellCheck" /> <validator name="SectionLength" /> </validator-list> Callouts on the slide: sentence length, invalid character, spell check.
  11. Available validators: SentenceLength, InvalidExpression, SpaceAfterPeriod, CommaNumber, WordNumber, SuggestExpression, InvalidCharacter, SpaceWithSymbol, KatakanaEndHyphen, KatakanaSpellCheck, SectionLength, ParagraphNumber, ParagraphStartWith
  12. Command. RedPen provides a simple command: $ redpen -c config-file input (supported formats: Markdown, Textile, PlainText)
  13. Sample server. Launched by the following command: $ java -jar redpen.war
  14. Demo
  15. Future work. RedPen currently focuses on simple functions. In the future, RedPen will support more sophisticated and experimental functions proposed in research fields, and will provide a plugin system.
  16. Summary. Introduction of RedPen: a validation tool for documents written in natural languages. Usage: configuration, a handy command, and a server. Future work.
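
The validator idea from slides 6, 7, and 10 (users pick checks by name, and each check reports problems in the text) can be illustrated with a minimal Python sketch. This is not RedPen's implementation or API: RedPen itself is run through the `redpen` command shown above, and every function name, the `check` helper, the options format, and the naive sentence splitter below are assumptions made for illustration. `SentenceLength` and `SpaceAfterPeriod` borrow names from the validator list; `StartWithCapital` is a hypothetical name for the "small letter" callout and does not appear in that list.

```python
import re
from dataclasses import dataclass


@dataclass
class Issue:
    validator: str
    message: str


def split_sentences(text):
    # Naive splitter (assumption): a sentence is a run of text up to '.', '!' or '?'.
    return [s.strip() for s in re.findall(r"[^.!?]+[.!?]*", text) if s.strip()]


def sentence_length(text, max_length=120):
    # Flag sentences longer than max_length characters (the limit is an assumed default).
    return [Issue("SentenceLength", f"{len(s)} characters (limit {max_length}): {s[:30]}")
            for s in split_sentences(text) if len(s) > max_length]


def space_after_period(text):
    # Flag sentence-ending punctuation that is immediately followed by a letter.
    return [Issue("SpaceAfterPeriod", f"no space after '{m.group(0)}'")
            for m in re.finditer(r"\w+[.!?](?=[A-Za-z])", text)]


def start_with_capital(text):
    # Flag sentences whose first alphabetic character is lowercase.
    issues = []
    for s in split_sentences(text):
        first = next((c for c in s if c.isalpha()), "")
        if first.islower():
            issues.append(Issue("StartWithCapital", f"starts with a small letter: {s[:30]}"))
    return issues


# Registry of hypothetical validators, looked up by name as in a configuration file.
VALIDATORS = {
    "SentenceLength": sentence_length,
    "SpaceAfterPeriod": space_after_period,
    "StartWithCapital": start_with_capital,
}


def check(text, config):
    # config maps a validator name to its options, mirroring a list of chosen validators.
    issues = []
    for name, options in config.items():
        issues.extend(VALIDATORS[name](text, **options))
    return issues


if __name__ == "__main__":
    sample = "this sentence starts with a small letter. Short one.No space here."
    config = {"SentenceLength": {"max_length": 30}, "SpaceAfterPeriod": {}, "StartWithCapital": {}}
    for issue in check(sample, config):
        print(f"[{issue.validator}] {issue.message}")
```

Keeping each check as an independent function behind a name-to-function registry is one simple way to let users enable only the checks they want, which is the spirit of the configuration slide. A real checker would need a language-aware sentence splitter, since this one breaks on abbreviations such as "e.g.".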