Machine Translation of Indic Languages using apertium

Loading...

Flash Player 9 (or above) is needed to view presentations.
We have detected that you do not have it on your computer. To install it, go here.

0 comments

Post a comment

    Post a comment
    Embed Video
    Edit your comment Cancel

    3 Favorites

    Machine Translation of Indic Languages using apertium - Presentation Transcript

    1. Machine Translation for Indic Languages using Apertium Pranava Swaroop S Malaviya National Institute of Technology, Jaipur
    2. Conclusion! ● Most of the Machine Translation Systems are closed ● They rarely are friendly towards their peers ● Most Indian MT systems are stagnant or closed or badly documented ● Interchange is mostly a “Mission Impossible – 5” ;-) ● Integration !!!
    3. Results(Indian perspective) ● Indian MT engines either stagnant ● Undocumented ● Coded by different people ● No uniformity ● Mostly CLOSED ● Mostly Undeployable ● Versioning is a term which most of the Indian organizations(producing MT) have never heard of
    4. So?? ● Need for an Open Collaboration ● Would be great if there is a base is readily available ● Must be well documented ● Must be portable ● Must be active
    5. Why all this? ● India has more than 18 Languages defined in the constitution ● Very less literary resources {digital} ● Need for rapid conversion and Immediate generation of digital data ● Need for collaboration ● Though the accuracy may be low during initial phases
    6. And ● Indic Languages belong to the huge group namely: ● Indo-European ● Indo-Iranian ● Indo-Aryan ● Dravidian ● Some of the well known languages from these groups have already well formed corpus and translation rules.
    7. The use? Can we inherit some properties? ●
    8. Any options? ● Apertium!! ● Apertium is an open-source machine translation toolbox (http://www.apertium.org) providing: ● 1 An open-source modular shallow-transfer machine ● translation engine with: ● text format management ● finite-state lexical processing ● statistical lexical disambiguation ● shallow transfer based on finite-state pattern matching
    9. ● * Spanish–Catalan (apertium-es-ca) ● * Spanish–Portuguese (apertium-es-pt) ● * Spanish–Galician (apertium-es-gl) ● * Occitan–Catalan (apertium-oc-ca) ● * French–Catalan (apertium-fr-ca) ● * English–Catalan (apertium-en-ca)
    10. Do what with that? ● Most of the Indic languages are well known as close neihbours ● Most of the grammatical constructs are almost the same. ● Use apertium for the translation of close neighbours, though it is known that apertium works for sparsely spaced languages
    11. Introduction ● Translation of closely related languages ● The need to write specific translation rules ● http://apertium.svn.sourceforge.net/viewvc/apertium ● http://xixona.dlsi.ua.es/~fran/hindidict.txt
    12. ● Please download apertium and lt-toolbox from http://www.apertium.org ● Live demonstration ● Extend it to different languages ● The application to urdu hindi translation ● http://sanskrit.uohyd.ernet.in/~anusaaraka/urdu/ Urdu-Hindi-Translation/
    13. Contribute ● Mail me pmadhyastha@acm.org ● Join apertium@irc.freenode.net ● https://lists.sourceforge.net/lists/listinfo/apertium-stu

    + guest9762aguest9762a, 12 months ago

    custom

    524 views, 3 favs, 0 embeds more stats

    More info about this document

    © All Rights Reserved

    Go to text version

    • Total Views 524
      • 524 on SlideShare
      • 0 from embeds
    • Comments 0
    • Favorites 3
    • Downloads 4
    Most viewed embeds

    more

    All embeds

    less

    Flagged as inappropriate Flag as inappropriate
    Flag as inappropriate

    Select your reason for flagging this presentation as inappropriate. If needed, use the feedback form to let us know more details.

    Cancel
    File a copyright complaint
    Having problems? Go to our helpdesk?

    Categories

    Tags