



Published in: Technology


  1. 1. Comprehensive sentimental data mining, analysis and visualization to improve Business Outcomes. The Team: Arjit Sachdeva, Prashast Kumar Singh, Nishtha Pande, Dhruv Mahajan
  2. 2. The Problem: Making sense of data to help consumer-centric companies, governments and similar organizations make decisions that improve their products.
  3. 3. Scenario #1 Imagine you’re a mobile manufacturer, say HTC. You just launched your flagship phone, the HTC X, which took off on the Internet, with users posting their reactions to it. But is there any way you could actually go through hundreds of thousands of those reviews individually and use that data for your organization? Is it possible to analyze the sentiment of users in what they wrote about the HTC X? How about analyzing not just the positive/negative quotient of the posts, but also getting summarized feedback on what users liked the most, and disliked the most, about the HTC X?
  4. 4. Scenario #2 The government wants to connect to hundreds of thousands of people and analyze their views. How can it connect directly with people to answer questions like these? The government wants to know how people are reacting to a new policy announcement: •Which parts of the policy do the voters like (for example, tax cuts)? •Which parts of the policy need to be changed or modified? Getting feedback on proposed laws: •What do people think about a proposed law (positive/negative response)? •How can the proposal be improved? •Analyse the negative comments.
  5. 5. Our approach towards a Comprehensive sentimental analysis and visualization tool
  6. 6. Break up a review into sentences, and parse each sentence using the rules of English grammar. Identify the various relationships (dependencies) existing between all pairs of words. Filter the relevant relationships and make a list of relevant nouns and adjectives. Assign scores using a self-learning scoring algorithm. Use the generated data structures to visualize data and answer businesses’ questions.
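The first step above, breaking a review into sentences, can be sketched as follows. This is only a minimal illustration: the regex-based splitter and the name `split_sentences` are assumptions made here, not the tool's actual implementation, and a real pipeline would more likely rely on the sentence splitter that ships with the parser.

```python
import re


def split_sentences(review: str) -> list[str]:
    """Naively split a review after '.', '!' or '?' followed by whitespace.

    Hypothetical helper for illustration; it does not handle abbreviations
    such as "e.g." or decimal numbers followed by a space.
    """
    parts = re.split(r"(?<=[.!?])\s+", review.strip())
    return [part for part in parts if part]


review = "The camera is great. Battery life is poor! Would I buy it again?"
sentences = split_sentences(review)
for sentence in sentences:
    print(sentence)
```

Each resulting sentence would then be handed to the parser on its own.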
  7. 7. PARSING Parsing is the process of assigning structural descriptions to sequences of words in a natural language.
  8. 8. IDENTIFYING RELATIONSHIPS The Stanford typed dependencies representation was designed to provide a simple description of the grammatical relationships in a sentence that can easily be understood.
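As a rough sketch of how typed dependencies could be filtered down to noun/adjective pairs, the example below works on hand-written triples for the sentence "The new camera is great." The `Dependency` type, the `noun_adjective_pairs` function and the choice of keeping only `amod` and `nsubj` relations are assumptions for illustration; the slides do not say which relations the tool keeps. The triples and Penn Treebank tags are typed in by hand rather than produced by a parser.

```python
from typing import NamedTuple


class Dependency(NamedTuple):
    relation: str   # dependency label, e.g. "amod"
    governor: str   # head word of the relation
    dependent: str  # word that depends on the head


def noun_adjective_pairs(deps, pos_tags):
    """Return (noun, adjective) pairs from amod and copular nsubj relations."""
    pairs = []
    for dep in deps:
        gov_tag = pos_tags.get(dep.governor, "")
        dep_tag = pos_tags.get(dep.dependent, "")
        # amod(camera, new): an adjective directly modifies a noun.
        if dep.relation == "amod" and gov_tag.startswith("NN") and dep_tag.startswith("JJ"):
            pairs.append((dep.governor, dep.dependent))
        # nsubj(great, camera): the noun is the subject of a predicate adjective.
        elif dep.relation == "nsubj" and gov_tag.startswith("JJ") and dep_tag.startswith("NN"):
            pairs.append((dep.dependent, dep.governor))
    return pairs


# Hand-written input for "The new camera is great."
deps = [
    Dependency("det", "camera", "The"),
    Dependency("amod", "camera", "new"),
    Dependency("nsubj", "great", "camera"),
    Dependency("cop", "great", "is"),
]
pos_tags = {"The": "DT", "new": "JJ", "camera": "NN", "is": "VBZ", "great": "JJ"}

pairs = noun_adjective_pairs(deps, pos_tags)
print(pairs)
```

Checking the part-of-speech tags keeps relations such as `nsubj(love, I)` out of the result, since there the governor is a verb rather than an adjective.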
  9. 9. SCORING NOUNS We find the scores of the Adjectives present using the SentiWordNet API. These scores are then assigned to the corresponding Nouns and stored in Guava structures.
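The scoring step might look roughly like the sketch below. Several things here are assumptions: `TOY_LEXICON` holds made-up signed scores rather than real SentiWordNet values, each adjective is collapsed to a single number, and a plain `defaultdict` of lists stands in for the Guava structures mentioned on the slide.

```python
from collections import defaultdict

# Hypothetical adjective scores in [-1, 1]; NOT real SentiWordNet values.
TOY_LEXICON = {"great": 0.75, "fast": 0.5, "new": 0.0, "poor": -0.625}


def score_nouns(pairs, lexicon):
    """Attach each adjective's score to the noun it describes.

    Returns a dict mapping noun -> list of scores (a stand-in for a multimap).
    Adjectives missing from the lexicon are skipped.
    """
    scores = defaultdict(list)
    for noun, adjective in pairs:
        if adjective in lexicon:
            scores[noun].append(lexicon[adjective])
    return dict(scores)


pairs = [
    ("camera", "great"),
    ("camera", "fast"),
    ("battery", "poor"),
    ("screen", "shiny"),  # not in the toy lexicon, so it is ignored
]
noun_scores = score_nouns(pairs, TOY_LEXICON)
print(noun_scores)
```

Keeping every score per noun, rather than a running total, leaves the later aggregation steps (means, counts, distributions) free to use the raw values.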
  10. 10. VISUALIZATION Intuitive 2D and 3D visualizations of every aspect of data, mapping changes in sentiments about your brand, demographics and other analytics
  11. 11. A few challenges: Analyzing the sentiment in data is a very complex task for a machine because of the many, often unpredictable, soft and hard variables that come into play when interpreting it. The main problem is that the sentiment of a sentence only rarely lies in the sentence itself and is instead rooted in the cultural context around that sentence. This requires the algorithm to process a vast amount of densely interconnected information to answer what is, in human terms, a fairly simple question. Just a few keywords taken separately won’t do the job. Consider, for example: The Government is wrong in its decision because it is a racist one. We need to consider many combinations together to figure out WHY people think the decision is wrong.
  12. 12. PERT CHART: Retrieve data from various social media channels. Load the collected data into the database. ANALYSIS: Behavior Segmentation, Share of Voice, Affinity Relation. Summarized feedback with intuitive 2D and 3D visualizations of every aspect of data.
  13. 13. VISUALIZATION How does STARK attempt to answer a few generic scenarios?
  14. 14. Company A: Can you summarize, in specific detail, what the user said about my product? STARK shows the summary of the reviews.
  15. 15. Company A: We had incorporated a new kind of camera with a super-fast zoom. How strongly did the user talk about the camera? STARK processes the reviews and generates the following meter graph for CAMERA. The meter graph shows that the user has responded positively to the quality of the camera.
  16. 16. Company A: Overall, how strongly did he express his views about my product? STARK shows the mean sentiment distribution of the various components, i.e., the aggregated mean sentiment shown by all users towards each component.
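One way the aggregated mean sentiment per component could be computed is sketched below. The input layout (component name mapped to a list of signed scores), the function name `mean_sentiment` and the sample numbers are all assumptions for illustration, not data from the tool.

```python
from statistics import mean


def mean_sentiment(scores_by_component):
    """Average the signed scores for each component, skipping empty lists."""
    return {
        component: mean(scores)
        for component, scores in scores_by_component.items()
        if scores
    }


# Made-up scores pooled across all users' reviews.
scores_by_component = {
    "camera": [0.75, 0.25],
    "battery": [-0.5, -0.25, -0.75],
    "screen": [],
}
means = mean_sentiment(scores_by_component)
print(means)
```

Components with no scored adjectives are left out rather than reported as zero, so "never mentioned" is not confused with "neutral".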
  17. 17. Company A: Since we had many new things in our product this time, I'd like to know which feature he talked about the most. STARK shows the percentage distribution of the various components in the review. It gives an overview of the components that are being talked about and to what extent.
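The percentage distribution of components can be sketched as a simple share-of-mentions count. The flat list of component mentions and the name `mention_share` are hypothetical; the slides do not describe how the tool counts mentions.

```python
from collections import Counter


def mention_share(mentions):
    """Return each component's share of all mentions, as a percentage."""
    counts = Counter(mentions)
    total = sum(counts.values())
    return {component: 100.0 * n / total for component, n in counts.items()}


# Made-up list: one entry per time a component is mentioned in the review.
mentions = ["camera", "battery", "camera", "screen",
            "camera", "battery", "camera", "screen"]
share = mention_share(mentions)
print(share)
```

The component with the largest share is the one talked about the most, which is what a pie or bar chart of these percentages would surface.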
  18. 18. Company A: I still need one more detail. Did he talk about the camera only positively, only negatively, or both? How many times positively and how many negatively? STARK shows the sentiment distribution of the various components. Sentiment distribution means the sentiment shown by the user towards each component.
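Counting positive and negative mentions per component could be sketched like this. The input layout and the rule that a score above zero counts as positive and below zero as negative are assumptions; scores of exactly zero are treated as neutral and counted in neither bucket.

```python
def polarity_counts(scores_by_component):
    """Count positive (> 0) and negative (< 0) scores for each component."""
    result = {}
    for component, scores in scores_by_component.items():
        result[component] = {
            "positive": sum(1 for s in scores if s > 0),
            "negative": sum(1 for s in scores if s < 0),
        }
    return result


# Made-up signed scores for one user's review.
scores_by_component = {
    "camera": [0.75, 0.5, -0.25, 0.0],
    "battery": [-0.625, -0.5],
}
counts = polarity_counts(scores_by_component)
print(counts)
```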
  19. 19. Company A: Could you finally quantify the scores assigned to each feature? STARK shows the scatter plot and line graph of all the features.
  20. 20. Cheers to BIG DATA in a SMALL WORLD •Arjit Sachdeva •Dhruv Mahajan •Nishtha Pande •Prashast Kumar Singh