Sunpinyin and Its Future

       Mike(Dai) Qin
  mikeandmore@gmail.com

     University of Toronto


       July 17, 2011
Who Am I




• Just call me Mike.
• Used to be a student at Zhejiang University, Hangzhou. Now
  admitted to the University of Toronto.
• Was born in Tianjin. (That’s why I’m here.)
• Have been using Linux since high school. A Fedora user now.
• Have been a sunpinyin committer since winter 2009.
Pinyin Input Method




• Shows several candidates for the Pinyin that the user types.
• Lots of commercial and free implementations.
    • Pinyin ABC
    • Microsoft Pinyin
    • Sogou Pinyin
    • QQ Pinyin
Approach


• Dictionary based
    • A dictionary that contains all possible words.
    • Always looks words up in the dictionary on user input.
    • Adjusts the order of candidates when the user commits one.
• Pros: Easy to implement.
• Cons: Not intelligent enough when words are combined.
• Implementations:
    • Pinyin ABC (Commercial)
    • Fcitx (GPL)
    • ibus-pinyin (GPL)
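
The dictionary-based approach above can be sketched in a few lines of Python. This is a toy illustration only, not code from Pinyin ABC, Fcitx or ibus-pinyin: the class name `DictionaryIME`, its methods and the three sample entries are all hypothetical.

```python
from collections import defaultdict


class DictionaryIME:
    """Toy dictionary-based Pinyin lookup (hypothetical, for illustration only)."""

    def __init__(self, entries):
        # entries: iterable of (pinyin, word) pairs; their order sets the initial ranking.
        self.table = defaultdict(list)
        self.commits = defaultdict(int)
        for pinyin, word in entries:
            self.table[pinyin].append(word)

    def candidates(self, pinyin):
        # Most-committed words first; sorted() is stable, so ties keep dictionary order.
        words = self.table.get(pinyin, [])
        return sorted(words, key=lambda w: -self.commits[(pinyin, w)])

    def commit(self, pinyin, word):
        # Remember the user's choice so it ranks higher next time.
        self.commits[(pinyin, word)] += 1


ime = DictionaryIME([("shi", "是"), ("shi", "时"), ("shi", "事")])
first = ime.candidates("shi")   # dictionary order
ime.commit("shi", "事")
second = ime.candidates("shi")  # the committed word moves to the front
```

Each syllable is ranked on its own here, which is the weakness named in the cons: nothing in the lookup knows which words tend to follow each other.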
Approach



• N-Gram based
    • Has a database of conditional probabilities.
    • Tries to find the sentence with the highest probability.
    • Interpolates between the user's commit history and the database.
• Pros: Intelligence!
• Cons: Where can I get the database?
• Implementations:
    • Sunpinyin (LGPL/CDDL)
    • Sogou Pinyin (Commercial)
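
A minimal sketch of the N-gram idea, assuming a bi-gram model for both the database and the user history to keep it short. The lexicon, the probabilities, the interpolation weight of 0.3 and the function names are all made up for illustration; this is not Sunpinyin's actual code or data.

```python
import math

# Hypothetical toy data: Pinyin syllable -> candidate words.
LEXICON = {"ta": ["他", "她"], "shi": ["是", "事"]}

# Hypothetical P(word | previous word); "<s>" marks the sentence start.
DATABASE = {
    ("<s>", "他"): 0.6, ("<s>", "她"): 0.4,
    ("他", "是"): 0.9, ("他", "事"): 0.1,
    ("她", "是"): 0.9, ("她", "事"): 0.1,
}


def interpolated(prev, word, history, weight=0.3, floor=1e-6):
    # Linear interpolation between the user history and the shipped database.
    p_db = DATABASE.get((prev, word), floor)
    p_user = history.get((prev, word), 0.0)
    return weight * p_user + (1 - weight) * p_db


def best_sentence(syllables, history=None):
    history = history or {}
    # Viterbi search: best[word] = (log-probability, path) of the best
    # partial sentence that ends in `word`.
    best = {"<s>": (0.0, [])}
    for syllable in syllables:
        step = {}
        for word in LEXICON[syllable]:
            step[word] = max(
                ((score + math.log(interpolated(prev, word, history)), path + [word])
                 for prev, (score, path) in best.items()),
                key=lambda item: item[0],
            )
        best = step
    return "".join(max(best.values(), key=lambda item: item[0])[1])


default = best_sentence(["ta", "shi"])
personal = best_sentence(["ta", "shi"], history={("<s>", "她"): 1.0})
```

With the database alone the search prefers 他是; once the user history favours 她 at the start of a sentence, the interpolated score flips the result to 她是.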
Sunpinyin and OpenGram Project



• Sunpinyin is an input method that uses the N-Gram based approach.
• Free as in freedom. LGPL license.
• It uses tri-grams for the built-in model and bi-grams for user history.

• OpenGram project aims at creating a tri-gram database for
  Simplified Chinese.
• Free as in freedom. CC license.
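
To make "tri-gram database" concrete, here is a rough sketch that counts tri-grams in a tiny pre-segmented corpus and turns them into conditional probabilities. The corpus, the `<s>`/`</s>` padding markers and the function name are assumptions for illustration; nothing here describes how OpenGram actually builds its data, and a real database would also need smoothing for unseen tri-grams.

```python
from collections import Counter


def trigram_probabilities(sentences):
    """Maximum-likelihood P(w3 | w1, w2) from pre-segmented sentences."""
    trigrams, contexts = Counter(), Counter()
    for words in sentences:
        # Pad so the first real word also has a two-word context.
        padded = ["<s>", "<s>"] + list(words) + ["</s>"]
        for w1, w2, w3 in zip(padded, padded[1:], padded[2:]):
            trigrams[(w1, w2, w3)] += 1
            contexts[(w1, w2)] += 1
    return {key: count / contexts[key[:2]] for key, count in trigrams.items()}


# Hypothetical three-sentence corpus, already split into words.
corpus = [["我", "是", "学生"], ["我", "是", "老师"], ["他", "是", "学生"]]
probs = trigram_probabilities(corpus)
```

Here `probs[("我", "是", "学生")]` is 0.5, because "我 是" is followed by 学生 in one of its two occurrences.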
Current Progress - Sunpinyin


• Released 2.0.3, and we’re working on 2.1/2.5 release.
• Works on Linux, BSD and OSX platforms, with a native
  interface.
• Ported to Ibus (ibus-sunpinyin) and Scim (scim-sunpinyin). Also
  has a standalone version (xsunpinyin).
• Progress on the next release:
    • Multiple Best Sentence. done
    • Partial Sentence. done
    • Plugin Support. WIP
• Needs maintainers!
What Do I Need to Know Before Joining Sunpinyin?



• Passion.
• A little C++ knowledge. (Nobody on the team knows C++
  completely. :) )
• That’s enough! A plus, maybe:
    • Ibus/Scim API, Xorg API or OSX API.
    • Python API. (Plugin Support)
    • Windows API. (Windows Port?)
Q&A
