NLP Task            | English   | Indic Languages | Nepali
Machine Translation | Very Good | Good            |
                    | Very Good | Fair            | None (little groundwork)
                    | Very Good | Poor            | Very Poor
POS Tagging         | Good      | Poor            | Very Poor
Sentiment Analysis  | Very Good | Fair            |
Speech Recognition  | Good      | Poor            |
What So Far?
–Prof. James A. Hendler
University of Maryland
“I have the students learn Python in our
undergraduate and graduate Semantic Web
courses. Why? Because basically there's nothing
else with the flexibility and as many web
• NLTK, although not the most efficient
implementation, provides a lot of awesome tools
to quickly prototype a hypothesis
• SciPy + NumPy: Everything that isn't in NLTK is
definitely in these libraries. If you want to use more
advanced algorithms like Latent Semantic
Indexing or Latent Dirichlet Allocation, Python has
libraries to do that.
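As a rough illustration of the Latent Semantic Indexing mentioned above, the following is a minimal sketch using only NumPy's SVD. The four-document corpus, the choice of k = 2, and the helper name `cosine` are assumptions made up for this example, not part of the original slides.

```python
import numpy as np

# Toy corpus, invented for illustration only.
docs = [
    "python nlp library",
    "python web framework",
    "nlp tagging library",
    "web framework django",
]

# Build a term-document count matrix: rows are terms, columns are documents.
vocab = sorted({word for doc in docs for word in doc.split()})
index = {word: i for i, word in enumerate(vocab)}
counts = np.zeros((len(vocab), len(docs)))
for j, doc in enumerate(docs):
    for word in doc.split():
        counts[index[word], j] += 1

# LSI keeps only the k largest singular values of the matrix.
k = 2  # assumed number of latent dimensions
U, s, Vt = np.linalg.svd(counts, full_matrices=False)
doc_vectors = (np.diag(s[:k]) @ Vt[:k]).T  # one k-dimensional row per document


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Compare the two "web framework" documents with an unrelated pair.
print(cosine(doc_vectors[1], doc_vectors[3]))
print(cosine(doc_vectors[1], doc_vectors[2]))
```

In this toy setup the two web-framework documents should land closer together in the reduced space than the web-framework and tagging documents do. A real pipeline would typically add term weighting such as TF-IDF and a sparse SVD routine.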
• Python has really great XML/HTML parsing
libraries such as Beautiful Soup and Scrape.py.
You can use these libraries to quickly scrape the web and generate large
data sets to improve the performance of your models (because let's face
it, big data trumps complexity)
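To give a feel for how little code the scraping step needs, here is a minimal sketch that pulls paragraph text out of an HTML page. It deliberately swaps in the standard library's `html.parser` instead of Beautiful Soup so that it runs without extra installs; the class name `ParagraphCollector`, the function `extract_paragraphs`, and the sample HTML are all hypothetical.

```python
from html.parser import HTMLParser


class ParagraphCollector(HTMLParser):
    """Collects the text found inside <p> elements."""

    def __init__(self):
        super().__init__()
        self.in_paragraph = False
        self.paragraphs = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.in_paragraph = True
            self.paragraphs.append("")

    def handle_endtag(self, tag):
        if tag == "p":
            self.in_paragraph = False

    def handle_data(self, data):
        # Text inside nested tags such as <b> is still part of the paragraph.
        if self.in_paragraph:
            self.paragraphs[-1] += data


# Hypothetical page content; a real scraper would fetch this over HTTP.
SAMPLE_HTML = """
<html><body>
  <h1>Reviews</h1>
  <p>Great phone, <b>love</b> the camera.</p>
  <p>Battery life is poor.</p>
</body></html>
"""


def extract_paragraphs(html):
    """Return the stripped text of every <p> element in the given HTML."""
    parser = ParagraphCollector()
    parser.feed(html)
    parser.close()
    return [p.strip() for p in parser.paragraphs]


sentences = extract_paragraphs(SAMPLE_HTML)
print(sentences)
```

The resulting list of sentences is the kind of raw material that could then be labeled and fed to a model. With Beautiful Soup installed, the same extraction would likely be shorter still.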
• Python has great web-frameworks like Django/
If you invent a revolutionary sarcasm detector that can predict trends in
the stock market, you can quickly integrate it into a web service, make
millions, and buy a large island in a third-world country.
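Wrapping a model in a web service can be sketched in a few lines. The example below uses the standard library's WSGI support rather than Django so that it stays self-contained; `score_sarcasm` is a hypothetical placeholder for a trained model, and the cue phrases, port, and URL shape are assumptions.

```python
import json
from urllib.parse import parse_qs
from wsgiref.simple_server import make_server


def score_sarcasm(text):
    """Hypothetical stand-in for a trained model: counts a few cue phrases."""
    cues = ("yeah right", "oh great")
    lowered = text.lower()
    return sum(lowered.count(cue) for cue in cues)


def app(environ, start_response):
    """WSGI application: GET /?text=... returns the score as JSON."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    text = query.get("text", [""])[0]
    body = json.dumps({"text": text, "score": score_sarcasm(text)}).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def serve(host="localhost", port=8000):
    """Run the app on a development server until interrupted."""
    with make_server(host, port, app) as server:
        server.serve_forever()
```

Calling `serve()` starts a local development server; a framework such as Django would add routing, templates, and a database layer on top of the same idea.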
• Consider your other options: It would not make
sense to use a compiled language like C++/Java
for this type of work unless you needed to increase
performance (computational speed, not model
As far as I can tell, Ruby is completely useless for any Machine Learning,
Data Mining, or Natural Language Processing task. Maybe you could use
Lisp, but at this point, Python has a larger ecosystem.