Accentuate Us!: Lightning Talk



In this talk, given to the Saint Louis Lambda Lounge, Michael Schade quickly discusses the background, approach, and technical implementation of the system, and then demonstrates the new Vim plugin, the unreleased Apple Mac OS X service, and the just-released 0.9 version of the Firefox add-on.

Published in: Technology

  • Name
    19-year-old entrepreneur, student at Saint Louis University
    Co-founded Spearhead with mom, with Kevin Scannell of SLU
  • Expanded, 45-minute version online
    Going to start with background, architecture, and finally some demos
  • - 90% loss!
    - Irrevocable loss
    - Each is a repository of the culture, traditions, and world view
    - Akin to extinction of animal or plant species
    - They’re looking to the Internet and technology for that.
    So, let’s help!
  • - Even Unicode-encoded languages often lack appropriate input methods
    - Identified problem: keyboard input
  • - Every character that allows a diacritic is a classification problem
    - Trained with a corpus of texts with diacritics
    - Never-before-seen words: statistics of 3-character sequences in a neighborhood of the character in question
  • Simple: only three calls
    - Langs: get languages & localizations
    - Lift: accentuate text (legacy)
    - Feedback: add to corpora, improve models
  • Clients send requests to load-balancing proxy "distribution center"
    Proxy
      - Load balances across same-language API servers
      - Allows quick management of servers (no DNS propagation time!)
      - Increases privacy (masks real UA, IP)
    API servers run by language communities!
      - Makes keeping it free doable
      - Helps learn technology
      - Distributed to language hot spots (French servers for French-using zones, etc.)
  • Firefox API request
    Blue text is most important to proxy server!
    Information in headers so we don’t unpack body
    UA must start with "…"
      - Analytics
      - Mismatch resolution
      - Spam prevention
  • Accentuate the differences: API server receives less information!
    Client is not identifiable based on:
      - UA
      - Host
      - IP
    Blue parts are what is different from API request
  • Emacs users: stand your ground!
    Version 1.0: early alpha; will
      - Grab context words
      - Modularize processing
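
The notes above describe diacritic restoration as a per-character classification problem backed by statistics over 3-character sequences. The following is a toy sketch of that idea only, not the talk's actual model: it assumes a single trigram centered on each character and a simple majority vote, and the function names (`clusters`, `contexts`, `train`, `restore`) are hypothetical.

```python
import unicodedata
from collections import Counter, defaultdict

PAD = " "  # padding used for context at the start and end of the text


def clusters(text):
    """Split text into base characters, each with its combining marks attached."""
    out = []
    for ch in unicodedata.normalize("NFD", text):
        if out and unicodedata.combining(ch):
            out[-1] += ch  # attach the mark to the preceding base character
        else:
            out.append(ch)
    return out


def contexts(bases):
    """Return, for each position, the unaccented trigram centered on it."""
    padded = [PAD] + bases + [PAD]
    return ["".join(padded[i:i + 3]) for i in range(len(bases))]


def train(corpus):
    """Count which accented form was seen for each unaccented trigram context."""
    marked = clusters(corpus)
    bases = [c[0] for c in marked]
    model = defaultdict(Counter)
    for ctx, form in zip(contexts(bases), marked):
        model[ctx][form] += 1
    return model


def restore(model, text):
    """Re-accent text: pick the most frequent form per context, else keep the character."""
    bases = [c[0] for c in clusters(text)]
    out = []
    for ctx, base in zip(contexts(bases), bases):
        seen = model.get(ctx)
        out.append(seen.most_common(1)[0][0] if seen else base)
    return unicodedata.normalize("NFC", "".join(out))
```

For instance, a model trained on the two words "café été" restores "cafe" to "café", while leaving a word with no matching contexts, such as "the", unchanged. A real system would combine far more context and a much larger corpus per language.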

    1. Accentuate Us!
       Michael Schade
       December 2, 2010
    2.
    3. Keyboard Input
       • Lack appropriate input methods
       • Electronic texts often entered as plain ASCII
         o Transliteration: Cherokee ᎦᎸᏉᏗᏳ → galvquodiyu
         o Omitting diacritics: Lingala likɔngá → likonga
         o Ad hoc approaches: Irish béal → be/al
       • Diacritics matter!
       • Omission leads to ambiguities, misunderstandings
         o leite vs. léite
    4. Statistical Machine Learning
       • Classification problem
       • Machine learning
       • Never-before-seen words
         o French: "cera" vs. "cerc," "cabl" vs. "cabo"
         o Under-resourced languages
       • 114 trained languages!
    5. API
       • Protocol: JSON
       • Calls
         o langs
         o lift
         o feedback
       • Sample Call
         o { "call": "charlifter.lift", "lang": "ht", "text": "Bon, la fe sa apre demen pito, le la we mwen andey.", "locale": "ht" }
       • Full documentation at
    6. Service Architecture
       • API Servers
       • Load-Balancing Proxy
       • Clients
    7. HTTP Communication (Proxy)
       Cache-Control: no-cache
       Connection: keep-alive
       Pragma: no-cache
       Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
       Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
       Accept-Encoding: gzip,deflate
       Accept-Language: en-us,en;q=0.5
       Host: …
       User-Agent: … Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv: Gecko/20100914 Firefox/3.6.1
       Content-Length: 113
       Content-Type: application/json; charset=utf-8
       Keep-Alive: 115
       {"call":"charlifter.lift","lang":"ht","text":"Bon, la fe sa apre demen pito, le la we mwen andey.","locale":"ht"}
    8. HTTP Communication (API)
       Cache-Control: no-cache
       Connection: close
       Pragma: no-cache
       Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
       Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
       Accept-Encoding: gzip,deflate
       Accept-Language: en-us,en;q=0.5
       Host: ht…
       User-Agent: …
       Content-Length: 113
       Content-Type: application/json; charset=utf-8
       {"call":"charlifter.lift","lang":"ht","text":"Bon, la fe sa apre demen pito, le la we mwen andey.","locale":"ht"}
    9. Demos (and a sneak preview of 1.0!)
    10. Thank You!
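
The two HTTP slides differ in a handful of headers: the request forwarded to the API server carries a different Host and User-Agent, uses `Connection: close`, and drops `Keep-Alive`. A minimal sketch of that rewriting step, combined with round-robin selection among same-language servers, might look like the following. Everything named here is an assumption for illustration: the server names, the `PROXY_USER_AGENT` value, and the `route` function are hypothetical, and the language code is passed in as an argument because the header that carries it is not preserved in the slides.

```python
from itertools import cycle

# Hypothetical pool: language code -> round-robin iterator over that language's API servers.
SERVERS = {"ht": cycle(["ht-1.api.example", "ht-2.api.example"])}

# Hypothetical generic identity the proxy presents in place of the client's.
PROXY_USER_AGENT = "example-proxy"

# Headers the proxy replaces or drops before forwarding (compared case-insensitively).
REWRITTEN = {"host", "user-agent", "connection", "keep-alive"}


def route(client_headers, lang):
    """Pick the next server for lang and build the headers to forward to it."""
    server = next(SERVERS[lang])
    # Copy everything except the identifying or connection-level headers.
    headers = {k: v for k, v in client_headers.items() if k.lower() not in REWRITTEN}
    headers["Host"] = server                   # client-facing host is not passed on
    headers["User-Agent"] = PROXY_USER_AGENT   # real UA is masked
    headers["Connection"] = "close"            # as on the API-side slide
    return server, headers
```

Because the proxy opens its own connection to the API server, the client's IP address is masked as well without any header handling; the JSON body can be forwarded untouched.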