Improving Writing Aids, the Community Way
Andrea Pescetti, Italian N-L Project Lead
OOoCon Budapest, 2 September 2010
Writing aids for OpenOffice.org (spell checker, thesaurus, hyphenation patterns, grammar checker) can always be improved and streamlined. The best environment for a collaborative effort to create and improve such tools is the local Native-Lang community: the work gets done by people who use and appreciate OOo, and the community is rewarded by having its work made available in subsequent releases.

However, a number of issues must be solved to ensure the success of such a community project. We will examine some of them: the technical expertise needed to build and maintain the individual tools and extension packages; the licenses and legal reviews needed to get tools included in the official builds of OpenOffice.org; the management of roles within the community and effective delegation; and the need for readily accessible, easy-to-use interfaces for newcomers who just want to quickly contribute an idea or improvement.

Getting Writing Aids Started

Writing Aids: Overview
  • Spell Checker
  • Thesaurus
  • Hyphenation Patterns
  • Grammar Checker

Spell Checker
  • The spell checking engine Hunspell is integrated in all versions of OOo.
  • Hunspell dictionaries (suitable for OOo, Thunderbird and more) are available for about 100 languages.
  • http://hunspell.sf.net

Thesaurus
  • Engine: integrated in all recent versions of OOo.
  • OOo-specific tool and format; you will usually have to start from scratch.
  • Documentation: OOo project lingucomponent.openoffice.org

Hyphenation Patterns
  • Engine: Hyphen, included in the Hunspell project; integrated in all versions of OOo.
  • Format: tool-specific, but conversion from TeX patterns is available (with caveats), so start from TeX patterns!
  • TeX Archive: http://ctan.org/

Grammar Checker
  • Not integrated in OOo as a user-visible tool as of 3.2.1, but an API is available.
  • Several options are available as extensions: LanguageTool, LightProof, CoGrOO and more.
  • Rules for your language: tool-dependent format.

Licensing Issues

Mere Aggregation
  • Wide spectrum of licenses for writing aids; most are incompatible with the OOo license, LGPLv3.
  • But they are pure data files.
  • FSF: this is “mere aggregation”, so licenses do not need to be compatible: issue 65039.

Extensions OXT
  • Data for writing aids (except grammar) have been packaged as extensions since OOo 3.x.
  • This reinforces the “mere aggregation” concept.
  • Data files within the extension may have different licenses: still “mere aggregation”.

Choose your license
  • LGPLv3 (latest) is compatible with the OOo codebase and ensures that any distributed modified versions remain free.
  • GPLv3: in OOo, no significant differences (mere aggregation).
  • AGPLv3: usage on a network (WWW) counts as distribution.

Meet Sun/Oracle legal
  • Licenses aside, copyright holders must sign the OCA for their work to appear in the OOo code repository.
  • Usual choice: external contribution, no OCA required.
  • Sun legal was very slow, but Oracle legal froze the process!

Distributed Management

Use a repository
  • Make writing aids available to all contributors in an online repository.
  • Use version control.
  • Expose an easy, web-based change-tracking interface to show differences between revisions.

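
The kind of difference a change-tracking interface would display can be sketched in a few lines of Python. This is only an illustration using the standard `difflib` module; the revision labels `r1` and `r2` and the sample word lists are made up, and a real setup would normally rely on the repository's own web viewer.

```python
import difflib

def word_list_diff(old_text, new_text):
    """Return a unified diff between two revisions of a word list."""
    return list(difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile="r1",   # hypothetical revision labels
        tofile="r2",
        lineterm="",
    ))

old_revision = "casa\nlibro\n"
new_revision = "casa\nlibro\nsempre\n"

for line in word_list_diff(old_revision, new_revision):
    print(line)
```

Because the data files are plain text with one entry per line, an added or removed word shows up as a single `+` or `-` line.
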
Spell Checker
  • One file in text format.
  • Human readable, except rules.
  • Good for collaborative editing.

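
To illustrate why the word list is easy to edit collaboratively, here is a minimal Python sketch that reads it. It assumes the usual Hunspell `.dic` layout (an approximate entry count on the first line, then one `word/FLAGS` entry per line) and ignores extras such as morphological fields. The sample words and the flag `S` are made up; the flags point to affix rules in the companion `.aff` file, which is the less readable part.

```python
def parse_dic(text):
    """Parse Hunspell-style .dic content into a {word: flags} mapping."""
    lines = text.splitlines()
    entries = {}
    # The first line holds the (approximate) number of entries; skip it.
    for line in lines[1:]:
        line = line.strip()
        if not line:
            continue
        # Everything after the first "/" is the flag string (may be empty).
        word, _, flags = line.partition("/")
        entries[word] = flags
    return entries

# Illustrative sample only, not taken from a real dictionary.
SAMPLE_DIC = """3
casa/S
libro/S
sempre
"""

print(parse_dic(SAMPLE_DIC))
```
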
Thesaurus
  • One file in text format and an automatically generated index.
  • Human readable.
  • Good for collaborative editing.

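
The automatically generated index can be sketched as follows. This assumes the MyThes-style layout: the data file starts with an encoding line, each entry is a `word|N` line followed by N meaning lines, and the index lists the encoding, the entry count, and sorted `word|byte offset` pairs. Treat that layout as an assumption and check the lingucomponent documentation before relying on it; the sample entries are invented.

```python
def build_index(dat_bytes):
    """Build index content (bytes) for thesaurus data given as bytes."""
    lines = dat_bytes.splitlines(keepends=True)
    encoding_line = lines[0]
    offset = len(encoding_line)      # first entry starts after the encoding line
    index = []
    i = 1
    while i < len(lines):
        entry, count = lines[i].rstrip(b"\r\n").split(b"|")
        n_meanings = int(count)
        block = lines[i:i + 1 + n_meanings]   # entry line plus its meaning lines
        index.append((entry, offset))
        offset += sum(len(line) for line in block)
        i += 1 + n_meanings
    index.sort()
    out = [encoding_line.rstrip(b"\r\n"), str(len(index)).encode("ascii")]
    out += [entry + b"|" + str(off).encode("ascii") for entry, off in index]
    return b"\n".join(out) + b"\n"

# Illustrative data only.
SAMPLE_DAT = (
    b"UTF-8\n"
    b"quick|1\n"
    b"(adj)|fast|rapid\n"
    b"big|1\n"
    b"(adj)|large|huge\n"
)

print(build_index(SAMPLE_DAT).decode("utf-8"))
```

Since the index is derived entirely from the data file, only the data file needs to be edited by hand; the index can be regenerated after every change.
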
Hyphenation
  • One text file.
  • Format: as arcane as it can get!
  • Changes very rarely.
  • Fix bugs upstream, in TeX.

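
To give an idea of what the pattern format encodes, here is a small Python sketch of TeX-style (Liang) pattern matching. Digits in a pattern are weights for the gap at that position, the highest weight wins for each gap, and odd values allow a break. The patterns below are the commonly cited textbook example for the English word "hyphenation", not material from the talk, and the `left_min`/`right_min` defaults are assumptions.

```python
import re

def parse_pattern(pattern):
    """Split a pattern such as 'hy3ph' into its letters and gap weights."""
    letters = re.sub(r"\d", "", pattern)
    weights = [0] * (len(letters) + 1)
    pos = 0
    for ch in pattern:
        if ch.isdigit():
            weights[pos] = int(ch)   # weight for the gap before letter `pos`
        else:
            pos += 1
    return letters, weights

def hyphenate(word, patterns, left_min=2, right_min=2):
    """Return the pieces of `word` separated at the allowed break points."""
    table = dict(parse_pattern(p) for p in patterns)
    padded = "." + word.lower() + "."        # dots mark the word boundaries
    scores = [0] * (len(padded) + 1)         # scores[i]: gap before padded[i]
    for start in range(len(padded)):
        for end in range(start + 1, len(padded) + 1):
            weights = table.get(padded[start:end])
            if weights:
                for k, w in enumerate(weights):
                    scores[start + k] = max(scores[start + k], w)
    pieces, last = [], 0
    for j in range(left_min, len(word) - right_min + 1):
        if scores[j + 1] % 2 == 1:           # odd weight: break before word[j]
            pieces.append(word[last:j])
            last = j
    pieces.append(word[last:])
    return pieces

PATTERNS = ["hy3ph", "he2n", "hena4", "hen5at", "1na", "n2at", "1tio", "2io", "o2n"]

print("-".join(hyphenate("hyphenation", PATTERNS)))  # hy-phen-ation
```
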
Grammar checker
  • LanguageTool: rules in XML.
  • Basic XML knowledge needed.
  • Fix upstream, in LanguageTool.
  • Collaboration possible.

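
As a rough idea of what "rules in XML" means, the Python sketch below loads a simplified, hypothetical rule loosely modelled on LanguageTool's pattern rules and applies it with a naive token matcher. The rule ID, the message text and the matcher are illustrative assumptions; real LanguageTool rules support far more (regular expressions, part-of-speech tags, exceptions) and are evaluated by LanguageTool itself.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical rule file (not the full LanguageTool schema).
SAMPLE_RULES = """
<rules lang="en">
  <rule id="BED_ENGLISH" name="Possible typo: bed English">
    <pattern>
      <token>bed</token>
      <token>English</token>
    </pattern>
    <message>Did you mean "bad English"?</message>
  </rule>
</rules>
"""

def load_rules(xml_text):
    """Return a list of (rule id, token list, message) tuples."""
    root = ET.fromstring(xml_text)
    rules = []
    for rule in root.iter("rule"):
        tokens = [t.text for t in rule.find("pattern").findall("token")]
        rules.append((rule.get("id"), tokens, rule.findtext("message")))
    return rules

def check(sentence, rules):
    """Report every rule whose tokens appear consecutively in the sentence."""
    words = sentence.split()
    hits = []
    for rule_id, tokens, message in rules:
        n = len(tokens)
        for i in range(len(words) - n + 1):
            if words[i:i + n] == tokens:
                hits.append((rule_id, message))
    return hits

print(check("Sorry for my bed English today", load_rules(SAMPLE_RULES)))
```
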
Packaging
  • Generation of the OXT extension can be scripted.
  • It is even possible to automatically generate an updated OXT for every committed change of a file.
  • Keep generated OXT files in the same repository.

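
Scripted packaging can be as small as the Python sketch below, since an OXT file is a zip archive. The sketch assumes the source directory already contains everything the extension needs (typically `description.xml`, `META-INF/manifest.xml`, a `dictionaries.xcu` file and the data files); the function name and the paths in the usage note are hypothetical.

```python
import zipfile
from pathlib import Path

def build_oxt(source_dir, oxt_path):
    """Zip every file under source_dir into an .oxt archive."""
    source = Path(source_dir)
    # Note: write the archive outside source_dir so it does not include itself.
    with zipfile.ZipFile(oxt_path, "w", zipfile.ZIP_DEFLATED) as oxt:
        for path in sorted(source.rglob("*")):
            if path.is_file():
                # Store paths relative to the extension root, with "/" separators.
                oxt.write(path, path.relative_to(source).as_posix())
    return oxt_path
```

A commit hook or a nightly job could call something like `build_oxt("dict-xx", "build/dict-xx.oxt")` after each change, which is one way to obtain an updated OXT for every committed revision.
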
Team Structure
  • All components are independent; collaboration is possible in every component.
  • A packaging manager (or a script!) to generate extensions.
  • A release manager to make stable versions available in OOo.

Community Involvement
  • The Native-Lang community is the best group of people to improve writing aids.
  • Motivated users, who will benefit directly from their work.
  • Main issue: providing tools that allow contributions to be managed efficiently.

Web-based interface
  • Allow quick and easy reporting of missing, erroneous and wrongly hyphenated words.
  • Easy to set up: a basic web form, or embed it in, e.g., a Drupal site.
  • Notifications: e-mail to the maintainers group, suggestions stored in an online database.

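
The storage side of such a form can be sketched with Python's built-in `sqlite3` module. The table layout, the three report kinds and the function names are assumptions made for illustration; the e-mail notification to maintainers is left out.

```python
import sqlite3

# Hypothetical report kinds, mirroring the three cases on the slide.
KINDS = {"missing", "erroneous", "hyphenation"}

def open_db(path=":memory:"):
    """Open (or create) the suggestions database."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS suggestions ("
        "id INTEGER PRIMARY KEY, word TEXT NOT NULL, "
        "kind TEXT NOT NULL, note TEXT)"
    )
    return db

def report(db, word, kind, note=""):
    """Store one suggestion and return its row id."""
    word = word.strip()
    if not word or kind not in KINDS:
        raise ValueError("invalid report")
    with db:  # commits on success, rolls back on error
        cur = db.execute(
            "INSERT INTO suggestions (word, kind, note) VALUES (?, ?, ?)",
            (word, kind, note),
        )
    return cur.lastrowid

db = open_db()
report(db, "casa", "missing")
print(db.execute("SELECT word, kind FROM suggestions").fetchall())
```
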
Expose web services
  • Allow direct usage of the web application, with no need to submit a form.
  • Parameters can be embedded in a URL; users don't have to explicitly open the site.
  • Suitable for inclusion in applications or macros.

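
Embedding the parameters in a URL could look like the Python sketch below. The endpoint `https://dictionary.example.org/report` and the parameter names `word`, `kind` and `lang` are hypothetical; the point is only that a macro or application can build the request without the user opening the site.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

BASE_URL = "https://dictionary.example.org/report"  # hypothetical endpoint

def build_report_url(word, kind, lang="it"):
    """Client side: encode a report as a URL (non-ASCII is percent-encoded)."""
    return BASE_URL + "?" + urlencode({"word": word, "kind": kind, "lang": lang})

def parse_report_url(url):
    """Server side: recover the parameters from the query string."""
    query = parse_qs(urlsplit(url).query)
    return {key: values[0] for key, values in query.items()}

url = build_report_url("perché", "missing")
print(url)
print(parse_report_url(url))
```
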
Web services in OXT
  • Ideally, embed a macro in the OXT dictionary package distributed with OOo.
  • Right click on a word to show:
      • Nominate for inclusion in dictionary.
      • Nominate for removal from dictionary.
      • Report wrong hyphenation.

Thesaurus maintenance
  • Vithesaurus: an existing online tool for collaboratively creating and maintaining a thesaurus.
  • In use (German) at http://www.openthesaurus.de
  • Free software; can be installed on your own server.
  • http://vithesaurus.sf.net

Handling Duplicates
  • In a large community, the same suggestion is usually reported more than once by different users.
  • It's a plus: the web application can deal with duplicates and rank suggestions by frequency, so maintainers can work more efficiently.

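
Ranking by frequency can be sketched in Python with `collections.Counter`. Normalising reports by trimming whitespace and lower-casing is an assumption made here to merge duplicates; it would be wrong for entries where capitalisation matters, such as proper nouns.

```python
from collections import Counter

def rank_suggestions(reports):
    """Merge duplicate (word, kind) reports and rank them by frequency."""
    counts = Counter((word.strip().lower(), kind) for word, kind in reports)
    return counts.most_common()

# Illustrative reports from different users.
reports = [
    ("Casa", "missing"),
    ("casa", "missing"),
    ("libro", "erroneous"),
    ("casa ", "missing"),
]

for (word, kind), count in rank_suggestions(reports):
    print(count, kind, word)
```
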
Handling Wrong Reports
  • Most annoying case: some users make wrong suggestions and keep repeating them!
  • The web application helps with “motivated blacklisting” (a blacklist that records the reason for each rejection): repeated wrong submissions are caught automatically and a message can be shown to the user.

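
A blacklist with reasons can be sketched in Python as a simple mapping from rejected word to explanation. The entry, the messages and the function name are illustrative assumptions, not the behaviour of the application described in the talk.

```python
# Hypothetical blacklist: rejected submission -> reason shown to the user.
BLACKLIST = {
    "qual'è": 'The standard spelling is "qual è", without the apostrophe.',
}

def handle_submission(word, blacklist, queue):
    """Reject known-wrong suggestions with a reason; queue everything else."""
    key = word.strip().lower()
    if key in blacklist:
        return "Not accepted: " + blacklist[key]
    queue.append(key)
    return "Thank you, your suggestion has been recorded."

pending = []
print(handle_submission("Qual'è", BLACKLIST, pending))
print(handle_submission("sempre", BLACKLIST, pending))
```

Maintainers only have to judge a wrong suggestion once; later resubmissions never reach the review queue.
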
Thanks for your attention
Andrea Pescetti, Italian N-L Project Lead, PLIO Board Member
pescetti@openoffice.org
Image credits: Flickr, PLIO Archives.