Improving writing aids, the community way


Writing aids (namely the spell checker, thesaurus, hyphenation patterns and grammar checker) for OpenOffice.org (OOo) can always be improved and streamlined. The best environment for a collaborative effort to create and improve such tools is the local Native-Lang community: the work gets done by people who use and appreciate OOo, and the community is rewarded when its work is made available in subsequent releases.

However, a number of issues must be solved for such a community project to succeed. We will examine some of them: the technical expertise needed to build and maintain the individual tools and extension packages; the licenses and legal reviews needed to get tools included in the official builds of OOo; the management of roles within the community and effective delegation; and the need for readily accessible, easy-to-use interfaces for newcomers who just want to quickly contribute an idea or improvement.

Published in: Technology

  1. OOoCon Budapest, 2 September 2010: Improving Writing Aids, the Community Way. Andrea Pescetti, Italian N-L Project Lead
  2. Getting Writing Aids Started
  3. Writing Aids: Overview
       • Spell Checker
       • Thesaurus
       • Hyphenation Patterns
       • Grammar Checker
  7. Spell Checker
       • The spell checking engine, Hunspell, is integrated in all versions of OOo.
       • Hunspell dictionaries (suitable for OOo, Thunderbird and more) are available for about 100 languages.
  10. Thesaurus
       • Engine: integrated in all recent versions of OOo.
       • OOo-specific tool and format: you will usually have to start from scratch.
       • Documentation: OOo project
  13. Hyphenation Patterns
       • Engine: Hyphen, included in the Hunspell project; integrated in all versions of OOo.
       • Format: tool-specific, but conversion from TeX patterns is available (with caveats), so start from the TeX patterns!
       • TeX Archive:
  16. Grammar Checker
       • Not integrated in OOo as a user-visible tool as of 3.2.1, but an API is available.
       • Several options are available as extensions: LanguageTool, LightProof, CoGrOO and more.
       • Rules for your language: tool-dependent format.
  19. Licensing Issues
  20. Mere Aggregation
       • Wide spectrum of licenses for writing aids; most are incompatible with the OOo license, LGPLv3.
       • But they are pure data files.
       • FSF: this is “mere aggregation”, so licenses do not need to be compatible (issue 65039).
  23. Extensions (OXT)
       • Data for writing aids (except grammar) have been packaged as extensions since OOo 3.x.
       • This reinforces the “mere aggregation” concept.
       • Data files within the extension may have different licenses: still “mere aggregation”.
  26. Choose your license
       • LGPLv3 (latest) is compatible with the OOo codebase and ensures that any distributed modified versions remain free.
       • GPLv3: in OOo, no significant differences (mere aggregation).
       • AGPLv3: usage on a network (WWW) counts as distribution.
  29. Meet Sun/Oracle legal
       • Licenses aside, copyright holders must sign the OCA for their work to appear in the OOo code repository.
       • Usual choice: external contribution, no OCA required.
       • Sun legal was very slow, but Oracle legal froze the process!
  32. Distributed Management
  33. Use a repository
       • Make writing aids available to all contributors in an online repository.
       • Use version control.
       • Expose an easy, web-based change tracking interface to show differences between revisions.
  36. Spell Checker
       • One file in text format.
       • Human readable, except rules.
       • Good for collaborative editing.
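To illustrate why a plain-text word list suits collaborative editing, here is a minimal Python sketch that reads and rewrites a Hunspell-style `.dic` word list. It assumes the common layout (an entry count on the first line, then one `word/FLAGS` entry per line) and ignores escaped slashes and morphological fields; the function names and the sample words are made up for illustration.

```python
def parse_dic(text):
    """Parse the text of a .dic word list into (declared_count, {word: flags})."""
    lines = text.splitlines()
    declared_count = int(lines[0].split()[0])  # first line: approximate entry count
    entries = {}
    for line in lines[1:]:
        line = line.strip()
        if not line:
            continue
        # "word/FLAGS": the part after the first slash names the affix rules.
        word, _, flags = line.partition("/")
        entries[word] = flags
    return declared_count, entries


def serialize_dic(entries):
    """Write entries back in sorted order with a refreshed count line."""
    words = sorted(entries)
    lines = [str(len(words))]
    for word in words:
        flags = entries[word]
        lines.append(f"{word}/{flags}" if flags else word)
    return "\n".join(lines) + "\n"


sample = "2\ncasa/S\ngatto\n"
declared, entries = parse_dic(sample)
entries["zebra"] = "S"  # a contributor adds one word
updated = serialize_dic(entries)
```

Keeping the entries sorted means that a contributed word shows up as a one-line difference between revisions.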
  39. Thesaurus
       • One file in text format and an automatically generated index.
       • Human readable.
       • Good for collaborative editing.
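The "automatically generated index" can be pictured with a short Python sketch. It assumes the MyThes-style layout as commonly described: the data file starts with an encoding line, each entry is a `word|N` line followed by N meaning lines, and the index lists each word with the byte offset of its entry. This is an illustration under those assumptions, not a replacement for the index generator shipped with the thesaurus tools, and `build_index` is a hypothetical name.

```python
def build_index(dat: bytes) -> bytes:
    """Build index bytes (encoding, entry count, word|offset lines) from thesaurus data."""
    lines = dat.split(b"\n")
    encoding = lines[0].strip()
    offset = len(lines[0]) + 1  # byte position just after the encoding line
    entries = []
    i = 1
    while i < len(lines):
        line = lines[i]
        if not line.strip():
            offset += len(line) + 1
            i += 1
            continue
        word, count = line.rsplit(b"|", 1)
        entries.append((word, offset))
        span = int(count) + 1  # the entry line plus its meaning lines
        offset += sum(len(item) + 1 for item in lines[i:i + span])
        i += span
    entries.sort()
    out = [encoding, str(len(entries)).encode("ascii")]
    out += [word + b"|" + str(position).encode("ascii") for word, position in entries]
    return b"\n".join(out) + b"\n"


dat = b"UTF-8\ncasa|1\n(noun)|abitazione|dimora\ngatto|1\n(noun)|micio\n"
index = build_index(dat)
```

Because the index is derived entirely from the data file, contributors only ever edit the data file and the index is regenerated by a script.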
  42. Hyphenation
       • One text file.
       • Format: as arcane as it can get!
       • Changes very rarely.
       • Fix bugs upstream, in TeX.
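To make the "arcane" format a little less opaque, the following Python sketch applies TeX-style (Liang) patterns to a word. In each pattern, digits sit between letters; the highest digit at each position wins, and odd values allow a break. The nine patterns are the classic textbook set for "hyphenation", not a real language file, and the sketch omits the minimum-fragment limits that real engines apply near word edges.

```python
def parse_pattern(pattern):
    """Split a pattern such as 'hy3ph' into ('hyph', [0, 0, 3, 0, 0])."""
    letters = []
    values = [0]
    for ch in pattern:
        if ch.isdigit():
            values[-1] = int(ch)
        else:
            letters.append(ch)
            values.append(0)
    return "".join(letters), values


def hyphenate(word, patterns):
    """Return the word split at the positions where the patterns allow a break."""
    work = "." + word.lower() + "."  # dots mark the word boundaries
    points = [0] * (len(work) + 1)
    for letters, values in patterns:
        start = work.find(letters)
        while start != -1:
            for offset, value in enumerate(values):
                points[start + offset] = max(points[start + offset], value)
            start = work.find(letters, start + 1)
    pieces = [""]
    for ch, point in zip(word, points[2:]):
        pieces[-1] += ch
        if point % 2 == 1:  # an odd value allows a break after this letter
            pieces.append("")
    return [piece for piece in pieces if piece]


PATTERNS = [parse_pattern(p) for p in
            "hy3ph he2n hena4 hen5at 1na n2at 1tio 2io o2n".split()]
print("-".join(hyphenate("hyphenation", PATTERNS)))  # hy-phen-ation
```

Even values act as vetoes: `he2n` forbids a break between "e" and "n", and `hen5at` permits one after "hen" only in that wider context.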
  46. Grammar checker
       • LanguageTool: rules in XML.
       • Basic XML knowledge needed.
       • Fix upstream, in LanguageTool.
       • Collaboration possible.
  50. Packaging
       • Generation of the OXT extension can be scripted.
       • It is even possible to automatically generate an updated OXT for every committed change of a file.
       • Keep generated OXT files in the same repository.
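A scripted packaging step could look roughly like the Python sketch below. An OXT file is a ZIP archive, so the standard `zipfile` module is enough. The manifest content reflects the usual layout of dictionary extensions as I understand it and should be checked against an existing extension; `build_oxt` and the file names are hypothetical, and the `dictionaries.xcu` content is a placeholder that a real package would have to supply.

```python
import io
import zipfile

# Assumed manifest for a dictionary extension; compare with a real OXT before use.
MANIFEST = """<?xml version="1.0" encoding="UTF-8"?>
<manifest:manifest xmlns:manifest="http://openoffice.org/2001/manifest">
  <manifest:file-entry
      manifest:media-type="application/vnd.sun.star.configuration-data"
      manifest:full-path="dictionaries.xcu"/>
</manifest:manifest>
"""


def build_oxt(files):
    """Pack {archive path: bytes} plus a manifest into OXT (ZIP) bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as oxt:
        oxt.writestr("META-INF/manifest.xml", MANIFEST)
        for name in sorted(files):  # stable order gives reproducible listings
            oxt.writestr(name, files[name])
    return buffer.getvalue()


oxt_bytes = build_oxt({
    "dictionaries.xcu": b"<placeholder/>",  # placeholder, not a valid configuration
    "it_IT.dic": b"1\ncasa\n",
})
```

A commit hook or scheduled job could call such a function after each change and store the result next to the sources.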
  53. Team Structure
       • All components are independent; collaboration is possible in every component.
       • A packaging manager (or a script!) to generate extensions.
       • A release manager to make stable versions available in OOo.
  56. Community Involvement
  57. Community Involvement
       • The Native-Lang community is the best group of people to improve writing aids.
       • Motivated users, who will benefit directly from their work.
       • Main issue: providing tools that make it possible to manage contributions efficiently.
  60. Web-based interface
       • Allow quick and easy reporting of missing, erroneous and wrongly hyphenated words.
       • Easy to set up: a basic web form, or embedded in, e.g., a Drupal site.
       • Notifications: e-mail to the maintainers group; suggestions stored in an online database.
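The storage and notification steps behind such a form can be sketched in a few lines of Python using only the standard library. The table layout, the suggestion kinds and the maintainers address are assumptions made for illustration; the message is built but not sent.

```python
import sqlite3
from email.message import EmailMessage


def store_suggestion(conn, word, kind, note=""):
    """Insert one suggestion and return its row id."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS suggestions ("
        "id INTEGER PRIMARY KEY, word TEXT NOT NULL, kind TEXT NOT NULL, note TEXT)"
    )
    cursor = conn.execute(
        "INSERT INTO suggestions (word, kind, note) VALUES (?, ?, ?)",
        (word, kind, note),
    )
    conn.commit()
    return cursor.lastrowid


def build_notification(word, kind, maintainers):
    """Prepare the e-mail for the maintainers group (sending is left out)."""
    message = EmailMessage()
    message["To"] = maintainers
    message["Subject"] = f"[dictionary] {kind}: {word}"
    message.set_content(f"A user reported the word '{word}' as: {kind}.")
    return message


conn = sqlite3.connect(":memory:")  # a real site would use a persistent database
row_id = store_suggestion(conn, "gattino", "missing")
mail = build_notification("gattino", "missing", "maintainers@example.org")
```

Parameterized queries keep user-submitted words from being interpreted as SQL.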
  63. Web-based interface
  64. Expose web services
       • Allow direct usage of the web application, with no need to submit a form.
       • Parameters can be embedded in a URL, so users don't have to explicitly open the site.
       • Suitable for inclusion in applications or macros.
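Embedding the parameters in a URL might look like the Python sketch below. The endpoint and the parameter names (`action`, `lang`, `word`) are hypothetical; only `urllib.parse.urlencode` is a real library call.

```python
from urllib.parse import urlencode

# Hypothetical actions, mirroring the kinds of report a user could file.
ACTIONS = {"add", "remove", "hyphenation"}


def build_report_url(base_url, word, action, lang):
    """Return a URL that files a report without the user filling in a form."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    query = urlencode({"action": action, "lang": lang, "word": word})
    return f"{base_url}?{query}"


url = build_report_url("https://example.org/suggest", "perché", "add", "it")
print(url)  # https://example.org/suggest?action=add&lang=it&word=perch%C3%A9
```

`urlencode` percent-encodes non-ASCII letters as UTF-8, which matters for most of the languages a Native-Lang project deals with.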
  67. Web services in OXT
       • Ideally, embed a macro in the OXT dictionary package distributed with OOo.
       • Right click on a word to show:
            • Nominate for inclusion in dictionary.
            • Nominate for removal from dictionary.
            • Report wrong hyphenation.
  71. Thesaurus maintenance
       • Vithesaurus: an existing online tool for collaboratively creating and maintaining a thesaurus.
       • In use (German) at
       • Free software; can be installed on your own server.
  75. Handling Duplicates
       • In a large community, the same suggestion is usually reported more than once by different users.
       • It's a plus: the web application can deal with duplicates and rank suggestions by frequency, for more efficient operation.
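Ranking by frequency takes only a few lines. This Python sketch assumes that reports arrive as plain strings and that trimming and lowercasing is an acceptable way to merge duplicates, which may not hold for proper nouns or for languages where case is significant.

```python
from collections import Counter


def rank_suggestions(reports):
    """Merge duplicate reports and return (word, count) pairs, most reported first."""
    counts = Counter(word.strip().lower() for word in reports if word.strip())
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


ranking = rank_suggestions(["Wiki", "blog", "wiki ", "podcast", "blog", "wiki"])
print(ranking)  # [('wiki', 3), ('blog', 2), ('podcast', 1)]
```

Maintainers can then review from the top of the list, where a single decision settles the largest number of reports.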
  77. Handling Wrong Reports
       • Most annoying use case: users make wrong suggestions and keep repeating them!
       • The web application helps with “motivated blacklisting”: repeated wrong submissions are caught, and a message explaining the rejection can be shown to the user.
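One way to picture "motivated blacklisting" is a lookup table that maps each rejected word to the reason for its rejection. In this Python sketch the data structure, the function name and the sample entry are all hypothetical.

```python
def review_submission(word, blacklist):
    """Return (accepted, message); rejected words come with their stored reason."""
    key = word.strip().lower()
    if key in blacklist:
        return False, f"'{key}' was already reviewed and rejected: {blacklist[key]}"
    return True, f"Thank you, '{key}' has been queued for review."


# Hypothetical blacklist entry: rejected word mapped to the reason shown to users.
blacklist = {"qual'è": "the correct spelling is 'qual è', without an apostrophe"}

accepted, message = review_submission("Qual'è", blacklist)
```

Showing the reason tells the user why the word will not be added, which should cut down on repeated submissions of the same mistake.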
  79. Thanks for your attention. Andrea Pescetti, Italian N-L Project Lead, PLIO Board Member. Image credits: Flickr, PLIO Archives.