 An ongoing Natural Language Processing project (using Python and the NLTK toolkit) that extracts the sentiment of a question and its title on Stack Overflow and determines its polarity. Based on these findings, it checks whether the rules and guidelines that the SO community imposes on its users are strictly followed.

  1. Project based on Natural Language Processing (NLP) techniques
  2. NLP definition: The term Natural Language Processing encompasses a broad set of techniques for the automated generation, manipulation, and analysis of natural (human) languages. NLP therefore mainly comprises three things:
     • Automated generation of natural languages.
     • Text manipulation of natural languages.
     • General analysis of natural languages.
  3. Diving into the project: This project focuses on the following website: Stack Overflow, a major question-and-answer forum for developers and programmers.
  4. Brilliance of Stack Overflow:
     • One can generally expect an answer to a question within 10 to 15 minutes.
     • It offers tags ranging from “python” to “java”, and even fields like “image processing” and “artificial intelligence”.
  5. Downsides of Stack Overflow:
     • It has a very strict voting mechanism, which is especially difficult for a beginner to handle.
     • Questions and answers written in an odd or unclear way attract down-votes, which also reduce the user's overall reputation.
  6. Major reasons for receiving down-votes:
     • Questions showing no research effort.
     • Attempts that were made but not mentioned in the question. “What have you tried?” is a very common reply to questions that show no personal effort.
     • Questions or answers containing broken links are likely to be down-voted.
  7. Major reasons for receiving down-votes (continued):
     • If the title of the question is badly formatted, e.g. it starts with “How do I”, the question is a candidate for down-votes.
     • Questions whose titles or posts carry negative polarity are unlikely to go viral. We conclude this from the paper by Jonah Berger and Katherine L. Milkman on viral content on the internet.
  8. What is a badly formatted title? According to the Stack Overflow community, the following are examples of badly formatted titles:
     • Titles starting with “How do I” or “How can I”.
     • Titles that contain tag keywords. A title that contains a tag word is also considered badly formatted.
  9. Major goals of the project:
     • Do high-rated questions have well-formatted titles?
     • Does the sentiment of a question's title play a significant role in the question's success? Do titles with positive sentiment draw more attention?
  10. Major benefits of this project:
     • New programmers and developers will be able to spot their mistakes when framing a question or an answer. This will reduce their chances of receiving down-votes and of getting blocked by moderators.
     • It will also lead to neater, cleaner questions, making it easier for existing developers to answer them.
  11. END
     • Anirban Ghosh, Roll – 05, IT Sec – A, 3rd year
     • Aryak Sengupta, Roll – 14, IT Sec – A, 3rd year
     • Mentor: Prof. Tapan Kumar Hazra
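As a rough illustration of the checks described in slides 8 and 9, the two title rules and a title polarity label could be sketched as below. This is not the project's actual implementation: the project uses NLTK, while this sketch uses plain Python so that it runs without any downloads. The function names, the tokenizer, and the two tiny word lists are all hypothetical stand-ins; a real analysis would likely use a proper sentiment lexicon or classifier.

```python
import re

# The two prefixes named in the slides as signs of a badly formatted title.
# The trailing \b keeps titles such as "How do iterators work" from matching.
BAD_PREFIX = re.compile(r"(how do i|how can i)\b")

# Toy word lists (assumption): placeholders for a real sentiment lexicon.
POSITIVE_WORDS = {"elegant", "efficient", "best", "clean", "simple"}
NEGATIVE_WORDS = {"stupid", "broken", "ugly", "horrible", "fails"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens.

    Keeps '+' and '#' so that tags like 'c++' and 'c#' survive; tags that
    start with a dot (for example '.net') are not handled by this sketch.
    """
    return re.findall(r"[a-z0-9+#]+", text.lower())


def is_badly_formatted(title, tags):
    """Return True if the title breaks either rule from slide 8."""
    lowered = title.strip().lower()
    # Rule 1: the title starts with "How do I" or "How can I".
    if BAD_PREFIX.match(lowered):
        return True
    # Rule 2: the title contains one of the question's tag keywords.
    words = set(tokenize(title))
    return any(tag.lower() in words for tag in tags)


def polarity(title):
    """Label a title 'positive', 'negative' or 'neutral' by word counts."""
    words = tokenize(title)
    score = sum(w in POSITIVE_WORDS for w in words) - sum(
        w in NEGATIVE_WORDS for w in words
    )
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    sample_title = "How do I sort a list in Python?"
    print(is_badly_formatted(sample_title, ["python", "sorting"]))
    print(polarity(sample_title))
```

Comparing these labels against question scores would be one way to approach the two goals in slide 9, although the slides do not say how the project itself measures success.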