Feature Specific Sentiment Analysis for Product Reviews, Subhabrata Mukherjee and Pushpak Bhattacharyya, In Proceedings of the 13th International Conference on Intelligent Text Processing and Computational Linguistics (CICLing 2012), New Delhi, India, March 2012 (http://www.cse.iitb.ac.in/~pb/papers/cicling12-feature-specific-sa.pdf)
Paolo Rosso, "On irony detection in social media" (AINL Conferences)
What linguistic patterns do social media users follow to express irony in very short phrases? Linguistic devices such as ambiguity, incongruity, unexpectedness, and an emotional context much broader than mere negative or positive polarity play a very important role as triggers of irony. In ironic texts the literal meaning of the message is usually negated, yet formal markers of negation are absent. This makes irony detection a very hard task. In this talk I will describe how irony is expressed in social media (Twitter, Amazon, Facebook, etc.) and survey the current state of the art in automatic irony detection. Irony detection matters for text-analysis tasks such as sentiment classification, opinion extraction, and reputation analysis, and there is clear interest in the topic within the research community. At SemEval 2015 a shared task on the topic will be organized (Sentiment Analysis of Figurative Language in Twitter, http://alt.qcri.org/semeval2015/task11/). I will close with the even harder problem of distinguishing irony, satire, and sarcasm, for example: "If you find it hard to laugh at yourself, I would be happy to do it for you."
Sarcasm is a peculiar form of sentiment expression in which the surface sentiment differs from the implied sentiment. Sarcasm detection on social media platforms has in the past been applied mainly to textual utterances, using lexical indicators (such as interjections and intensifiers), linguistic markers, and contextual information (such as user profiles or past conversations) to detect the sarcastic tone. However, modern social media platforms let users create multimodal messages in which audiovisual content is integrated with the text, making the analysis of any single mode in isolation incomplete.
We first study the relationship between the textual and visual aspects in multimodal posts from three major social media platforms (Instagram, Tumblr and Twitter), and we run a crowdsourcing task to quantify the extent to which images are perceived as necessary by human annotators.
Moreover, we propose two different computational frameworks to detect sarcasm that integrate the textual and visual modalities. Results show the positive effect of combining modalities for the detection of sarcasm across platforms and methods.
Summarization and opinion detection in product reviews - papanaboinasuman
This document describes a project to build a system that provides structured summaries of product reviews by extracting product features and associated opinions. It outlines the end-to-end architecture of the system, including modules for crawling reviews, preprocessing text, extracting and analyzing features and opinions, and providing a feature-based summary. An evaluation of the system shows a precision of 75% and recall of 90% for correctly identifying features and opinions.
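The reported figures reduce to simple count arithmetic. A minimal sketch, with hypothetical counts chosen only to reproduce the 75% precision / 90% recall numbers (the project's actual confusion matrix is not given here):

```python
# Precision/recall arithmetic behind the reported evaluation.
# The counts below are illustrative, not the project's real data.
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) for a feature/opinion extractor."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return precision, recall

# Hypothetical counts that yield 75% precision and 90% recall.
p, r = precision_recall(true_positives=90, false_positives=30, false_negatives=10)
print(p, r)  # 0.75 0.9
```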
The document discusses techniques for analyzing sentiment and opinions in consumer reviews. It begins by introducing sentiment classification of reviews as positive or negative. It then discusses several approaches to sentiment classification including unsupervised methods using pointwise mutual information and supervised methods using machine learning techniques. The document also discusses analyzing reviews at the sentence level to extract product features that are commented on and determine if the comments are positive or negative. It proposes techniques for feature extraction, feature refinement, identifying sentiment orientation, and generating a feature-based summary. Finally, it discusses related work on other sentiment analysis and opinion mining tasks.
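The pointwise mutual information approach mentioned above can be sketched in a few lines. This follows Turney-style semantic orientation against seed words; the seed words and co-occurrence counts below are toy assumptions, not values from the document:

```python
import math

# Unsupervised semantic orientation via pointwise mutual information:
# SO(phrase) = PMI(phrase, "excellent") - PMI(phrase, "poor").
# With shared totals this reduces to the log-ratio below.
def semantic_orientation(near_excellent, near_poor, hits_excellent, hits_poor):
    """Positive result suggests positive polarity; counts are co-occurrence hits."""
    return math.log2((near_excellent * hits_poor) / (near_poor * hits_excellent))

# A phrase co-occurring mostly with "excellent" gets a positive score.
so = semantic_orientation(near_excellent=80, near_poor=10,
                          hits_excellent=1000, hits_poor=1000)
print(so)  # 3.0
```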
Sarcasm & Thwarting in Sentiment Analysis [IIT-Bombay] - Sagar Ahire
1) The document discusses various linguistic phenomena including irony, sarcasm, and thwarting. It presents algorithms for detecting sarcasm and thwarting in text.
2) For sarcasm detection, a semi-supervised algorithm uses pattern-based and punctuation-based features to classify sentences, achieving up to 81% accuracy.
3) Thwarting detection compares sentiment across levels of a domain ontology, using either rule-based or machine learning approaches, with the latter approach achieving up to 81% accuracy.
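As a rough illustration of what the pattern- and punctuation-based features for sarcasm detection look like, here is a sketch; the patterns themselves are invented for illustration and are not the algorithm's actual pattern set:

```python
import re

# Toy feature extractor combining pattern-based and punctuation-based
# signals, in the spirit of the semi-supervised approach described above.
# The pattern list is an illustrative assumption.
POSITIVE_PATTERNS = [r"\bi love\b", r"\bso great\b", r"\bjust what i needed\b"]

def sarcasm_features(sentence):
    s = sentence.lower()
    return {
        "pattern_hits": sum(bool(re.search(p, s)) for p in POSITIVE_PATTERNS),
        "exclamations": s.count("!"),
        "question_marks": s.count("?"),
        "has_ellipsis": "..." in s,
    }

feats = sarcasm_features("I love waiting in line for hours!!!")
```

A classifier would then be trained on vectors of such features rather than on raw text.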
This document provides an overview of text classification and the Naive Bayes machine learning algorithm. It defines text classification as assigning categories or labels to documents, and discusses different approaches like human labeling, rule-based classification, and machine learning. Naive Bayes is introduced as a simple supervised learning method that calculates the probability of documents belonging to different categories based on word frequencies. The document then reviews probability concepts and shows how Naive Bayes makes the "naive" assumption that words are conditionally independent given the topic to classify documents probabilistically using Bayes' theorem.
The Festival della Scienza is an annual science festival held in Genoa, Italy from October 21st to November 2nd, 2011. The 2011 festival celebrates the 150th anniversary of the unification of Italy and highlights scientific excellence in Italy over the past 150 years. The festival features lectures, exhibitions, laboratories and other events focused on science, hosted both in Genoa and other major Italian cities. Notable speakers include scientists from the United States, who are the guest country for 2011, celebrating the 150th anniversary of the Massachusetts Institute of Technology. The festival aims to showcase both Italy's scientific history and contemporary scientists working to advance knowledge and bring Italy into new scenarios for a better future.
The big data phenomenon has transformed how data is accessed and used. Sentiment analysis (SA) is one of the most heavily exploited areas, applied for profit through business intelligence applications. This paper reviews trends in SA and relates the area's growth to the big data era.
Introduction to text classification using naive bayes - Dhwaj Raj
This document provides an overview of text classification and the Naive Bayes classification method. It defines text classification as assigning categories, topics or genres to documents. It describes classification methods like hand-coded rules and supervised machine learning. It explains the bag-of-words representation and how Naive Bayes classification works by calculating the probability of a document belonging to a class using Bayes' rule and independence assumptions. It discusses parameter estimation and how to build a multinomial Naive Bayes classifier for text classification tasks.
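The multinomial Naive Bayes procedure described here fits in a short from-scratch sketch over bag-of-words counts, with Laplace smoothing in the probability estimates; the toy corpus is invented for illustration:

```python
import math
from collections import Counter

# Minimal multinomial Naive Bayes over bag-of-words counts,
# with Laplace (add-one) smoothing. Illustrative sketch only.
class MultinomialNB:
    def fit(self, docs, labels):
        self.classes = set(labels)
        self.prior = {c: labels.count(c) / len(labels) for c in self.classes}
        self.word_counts = {c: Counter() for c in self.classes}
        for doc, c in zip(docs, labels):
            self.word_counts[c].update(doc.split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, doc):
        def log_posterior(c):
            denom = sum(self.word_counts[c].values()) + len(self.vocab)
            score = math.log(self.prior[c])
            for w in doc.split():
                score += math.log((self.word_counts[c][w] + 1) / denom)
            return score
        return max(self.classes, key=log_posterior)

clf = MultinomialNB().fit(
    ["great phone love it", "terrible battery awful", "love the screen", "awful service"],
    ["pos", "neg", "pos", "neg"],
)
print(clf.predict("love this great screen"))  # pos
```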
This document discusses and provides examples of supervised and unsupervised learning. Supervised learning involves using labeled training data to learn relationships between inputs and outputs and make predictions. An example is using data on patients' attributes to predict the likelihood of a heart attack. Unsupervised learning involves discovering hidden patterns in unlabeled data by grouping or clustering items with similar attributes, like grouping fruits by color without labels. The goal of supervised learning is to build models that can make predictions when new examples are presented.
Tweets Classification using Naive Bayes and SVM - Trilok Sharma
This document summarizes a project to automatically classify tweets into predefined Wikipedia categories. It discusses using three approaches, Naive Bayes, SVM, and a rule-based classifier, to sort tweets into 11 categories such as business, sports, and politics. It explains the preprocessing steps used, such as outlier removal, stemming, and spell checking. Accuracy results from 10-fold cross-validation show the SVM and rule-based approaches achieving over 80% accuracy on most categories. The project analyzed real-time tweet data via an API and achieved high classification speeds.
The document provides an overview of artificial neural networks and their learning capabilities. It discusses:
- How biological neural networks in the brain inspired artificial neural networks
- The basic structure of artificial neurons and how they are connected in a network
- Single layer perceptrons and how they can be trained to learn simple tasks using supervised learning algorithms like the perceptron learning rule
- Multilayer neural networks with one or more hidden layers that can learn more complex patterns using backpropagation to modify weights.
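The perceptron learning rule from the single-layer case above can be sketched on a linearly separable toy task (logical AND); the learning rate and epoch count are arbitrary illustrative choices:

```python
# Single-layer perceptron trained with the classic perceptron learning
# rule: w <- w + lr * (target - output) * x. Toy task: logical AND.
def train_perceptron(samples, epochs=10, lr=0.1):
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            out = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            err = target - out
            w[0] += lr * err * x1
            w[1] += lr * err * x2
            b += lr * err
    return w, b

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(AND)

def predict(x1, x2):
    return 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
```

Because AND is linearly separable, the rule converges to a correct decision boundary; XOR would not, which is what motivates the multilayer networks and backpropagation mentioned above.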
This document discusses computational intelligence and supervised learning techniques for classification. It provides examples of applications in medical diagnosis and credit card approval. The goal of supervised learning is to learn from labeled training data to predict the class of new unlabeled examples. Decision trees and backpropagation neural networks are introduced as common supervised learning algorithms. Evaluation methods like holdout validation, cross-validation and performance metrics beyond accuracy are also summarized.
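Of the evaluation methods mentioned, k-fold cross-validation is easy to make concrete; a minimal index-splitting sketch (model training itself omitted):

```python
# Plain k-fold cross-validation index generator: every example appears
# in exactly one test fold; the rest form the training set for that fold.
def k_fold_indices(n, k):
    """Yield (train_idx, test_idx) pairs for n examples split into k folds."""
    fold = n // k
    for i in range(k):
        test = list(range(i * fold, (i + 1) * fold if i < k - 1 else n))
        train = [j for j in range(n) if j not in test]
        yield train, test

splits = list(k_fold_indices(10, 5))
```

Accuracy is then averaged over the k test folds, giving a less optimistic estimate than a single holdout split.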
This document summarizes key concepts from the CS 221 lecture on machine learning. It discusses supervised learning techniques like Naive Bayes classification, linear regression, perceptrons, and SVMs. It also covers unsupervised learning through k-nearest neighbors and discusses challenges like overfitting, generalization, and the curse of dimensionality.
Sentiment analysis software uses natural language processing and artificial intelligence to analyze text such as reviews and identify whether the opinions and sentiments expressed are positive or negative. It can help businesses understand customer perceptions of products and brands. While sentiment analysis works reasonably well for classifying simple positive and negative sentiments, it faces challenges in dealing with ambiguity and nuance in human language. The accuracy of sentiment analysis depends on factors such as the complexity of the language analyzed and how finely sentiments are classified.
Modeling Social Data, Lecture 6: Classification with Naive Bayes - jakehofman
The document is a slide presentation on Naive Bayes classification. It introduces the concept of Naive Bayes classification using examples such as medical diagnosis and spam filtering. It then describes the mathematical formulation of Naive Bayes as applying Bayes' rule to classify documents represented as bags of words. The presentation notes that Naive Bayes works better than expected due to its computational efficiency and ability to be updated with new data, though its performance depends on the document representation used.
This document provides an overview of sentiment analysis techniques including AFINN-111, SentiWordNet, and document classification. It describes analyzing sentiment at the word level using lexicons and at the document level. Key steps are outlined such as tokenization, part-of-speech tagging, word sense disambiguation, and assigning sentiment scores. Issues with analyzing short texts like tweets are also discussed. The document provides references and links to related projects and APIs.
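Word-level lexicon scoring in the AFINN style can be sketched as a sum of per-word valences; the mini-lexicon below is a stand-in for the real AFINN-111 list, not an excerpt from it:

```python
# AFINN-style scoring: each word carries an integer valence in [-5, 5]
# and the text score is the sum over tokens. Mini-lexicon is illustrative.
LEXICON = {"good": 3, "great": 3, "love": 3, "bad": -3, "awful": -3, "hate": -3}

def lexicon_score(text):
    """Sum the valence of every known token; unknown tokens score 0."""
    return sum(LEXICON.get(tok, 0) for tok in text.lower().split())

score = lexicon_score("Great camera but awful battery")
print(score)  # 0
```

This ignores negation and word sense, which is why the document's later steps (POS tagging, word sense disambiguation) matter.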
This document discusses predicting movie box office success based on sentiment analysis of tweets. It presents the methodology, which includes collecting twitter data on movies, preprocessing the data by removing noise and irrelevant tweets, using a trained classifier to label tweets as positive, negative, neutral or irrelevant, and calculating a PT-NT ratio based on these labels to predict if a movie will be a hit, flop or average. Related work on using social media to predict outcomes is also discussed.
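The PT-NT ratio step can be sketched as plain counting; the hit/flop thresholds here are illustrative assumptions, since the document does not specify them:

```python
# PT-NT (positive-to-negative tweet) ratio for a hit/flop/average verdict.
# Thresholds below are illustrative assumptions, not the paper's values.
def pt_nt_verdict(labels, hit_threshold=2.0, flop_threshold=0.5):
    pos = labels.count("positive")
    neg = labels.count("negative")  # neutral/irrelevant tweets are ignored
    ratio = pos / neg if neg else float("inf")
    if ratio >= hit_threshold:
        return "hit"
    if ratio <= flop_threshold:
        return "flop"
    return "average"

verdict = pt_nt_verdict(["positive"] * 8 + ["negative"] * 3 + ["neutral"] * 5)
print(verdict)  # hit
```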
Adaptation of Sentiment Analysis to New Linguistic Features, Informal Language Form and World Knowledge - Subhabrata Mukherjee
Adaptation of Sentiment Analysis to New Linguistic Features, Informal Language Form and World Knowledge, Subhabrata Mukherjee and Pushpak Bhattacharyya, Master's Thesis, IIT Bombay, Dept. of Computer Science and Engineering
TwiSent: A Multi-Stage System for Analyzing Sentiment in Twitter - Subhabrata Mukherjee
TwiSent: A Multi-Stage System for Analyzing Sentiment in Twitter, Subhabrata Mukherjee, Akshat Malu, Balamurali A.R. and Pushpak Bhattacharyya, In Proceedings of the 21st ACM Conference on Information and Knowledge Management (CIKM 2012), Hawaii, Oct 29 - Nov 2, 2012 (http://www.cse.iitb.ac.in/~pb/papers/cikm2012-twisent.pdf)
This document appears to be a transcript from a presentation on software estimation. It discusses how estimation is different for software projects compared to construction projects due to factors like constantly changing tools and requirements in software. A key example given is the Berlin Brandenburg Airport project, whose costs ballooned from an original budget of €2.83 billion to over €9.4 billion due to delays and changes. The presentation argues that while estimation is important, it is difficult to be perfectly accurate for software due to its inherently changing nature.
Lean Engineering: Engineering for Learning & Experimentation in the Enterprise - Rosenfeld Media
Bill Scott: "Lean Engineering: Engineering for Learning & Experimentation in the Enterprise"
Enterprise UX 2015 • May 13, 2015 • San Antonio, TX, USA
http://enterpriseux.net
PWAs are a hot topic, and it is important to understand that they are a different approach to apps than the traditional model of packaging something and letting the user install it. In this keynote you'll see some of the differences.
WEBASSEMBLY - What's the right thing to write? - Shin Yoshida
https://github.com/wbcchsyn/slide-WEBASSEMBLY-whats-the-right-thing-to-write.git
What is WebAssembly?
According to webassembly.org,
WebAssembly (abbreviated Wasm) is a binary instruction format for a stack-based virtual machine.
I think of it as a standard for making programming logic abstract.
"A standard for making programming logic abstract."
What does it mean?
What is the advantage?
Let’s talk about WebAssembly while looking back on the computer history.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with one Invariant Section: "Shin Yoshida wrote this document with the goal of contributing to a fair and safe world. Funai Soken Digital Incorporated agrees with the vision and compensated him for his work."; no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled "GNU Free Documentation License".
Mobile App Feature Configuration and A/B Experiments - lacyrhoades
The document discusses feature configuration and A/B testing in mobile apps. It describes how Etsy uses feature flags and continuous experimentation to iteratively develop and test new features. Features can be enabled or disabled for certain users or groups. Experiments follow a process of setting up a feature flag, determining user eligibility, coding the feature, internal testing, then launching the feature to a percentage of users while collecting analytics. This allows gathering feedback to improve products and user experience.
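A percentage rollout of the kind described can be sketched with a stable hash bucket per user, so the same user always sees the same variant; the function and flag names are illustrative, not Etsy's actual API:

```python
import hashlib

# Deterministic percentage rollout: each (flag, user) pair hashes into a
# stable bucket 0-99, so a user's variant never flips between requests.
# Names are illustrative assumptions, not a real feature-flag API.
def is_enabled(flag_name, user_id, rollout_percent):
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < rollout_percent

enabled = is_enabled("new_checkout", "user-42", 50)  # stable for this user
```

Raising `rollout_percent` from, say, 10 to 50 only adds users; everyone already in the feature stays in it, which keeps the experiment's analytics consistent.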
The Ember.js Framework - Everything You Need To Know - All Things Open
All Things Open 2014 - Day 2
Thursday, October 23rd, 2014
Yehuda Katz
Founder of Tilde
Front Dev 1
The Ember.js Framework - Everything You Need To Know
This document provides style guidelines for 2adpro's communication with clients, including sections on branding, grammar, tone, crisis communication, and presentations. It emphasizes consistency in language, grammar, punctuation and formatting. Specific guidelines address issues like sentence structure, word choice, punctuation, spacing, and phrasing to ensure clear communication for an international client base.
This document discusses how data science can be used to help people plan vacations by analyzing reviews of hotels and other destinations. It describes building a model using natural language processing techniques like bag-of-words modeling and decision trees on hotel review data to match people's descriptions of their dream vacations to the most suitable locations. Some limitations of this approach are also outlined, like not accounting for word frequency or context between words. The document promotes an online data science bootcamp for learning skills like those used in this example.
1) The document discusses Darwin Phones, a mobile phone sensing system that applies distributed computing concepts to allow for the evolution, pooling, and collaborative inference of classification models on mobile devices.
2) Darwin Phones aims to address the limitations of traditional supervised mobile sensing approaches that require retraining models for different environments and do not scale well.
3) The key aspects of Darwin Phones include allowing classification models to evolve on devices without supervision, pooling existing models between devices, and enabling collaborative inference across multiple devices.
This document discusses multi-tenant SaaS applications and the implications of sharing data and business logic across organizations. It notes that while multi-tenanting aims to reduce costs through a single codebase and database, it also blurs ownership and responsibilities. Further, as organizations adapt their business logic to a shared SaaS environment over time, their practices may influence and change the application in ways that are then shared with all users.
This document discusses contextual references in English texts. It defines contextual references as words that substitute for other words used earlier in the text to avoid repetition. It provides examples of different types of contextual references like pronouns, possessive adjectives, relative pronouns, and words that leave out repeated nouns. The document also discusses using specific words and pro-clauses to refer to ideas from the previous sentence.
Smartphone security and privacy: you're doing it wrongGraham Lee
Before you can get security or privacy features correct, you must understand how people think and how this will impact any UI you show for your privacy settings. In this presentation, I discuss the user's mental model and see how this impacts on iPhone and Android privacy UI.
The document provides an overview of a training session on data provenance given by Thurstan Young. Thurstan Young discusses the definition of data provenance, guidance available in the RDA toolkit, elements related to data provenance, recording methods for different elements, and applications of data provenance in RDA. He compares coverage of these topics between the current and new RDA data toolkits.
"The Cutting Edge" - Palletways Business Club Presentationgeorge_edwards
George Edwards gave a talk on applying cutting-edge web development practices to the transportation and distribution industries. He discussed how his background led him to see opportunities for innovation by applying an outside perspective. At Speed Welshpool, he created all-in-house software called Transpire to put them at the forefront technologically and handle their operations. He explained concepts like web development, frameworks as tools, and why industries should utilize existing resources from the vibrant web development community instead of reinventing the wheel.
This document discusses classifying tweets as related or not related to particular companies based on company profiles. It presents an approach that represents tweets and company profiles as bags of keywords. It describes generating basic company profiles from sources like homepage crawls, metadata tags, categories, and feedback. It then discusses classifying tweets by comparing their overlap with the positive and negative evidence in a company's profile. The document also introduces the concept of a relatedness factor to handle ambiguity and describes approaches using a default or random decision based on this factor. Finally, it proposes an active learning approach to augment company profiles with associated words from the live tweet stream.
Datatium - using data as a material for contextually responsive design.Andrew Fisher
Rersponsive design has changed how we build sites, however whilst we've addressed many of the technical challenges of devices we haven't understood the underlying behaviour that is occurring. This talk highlights how context is increasingly important and how data can be used to create responsive experiences beyond simply reflowing of web pages.
SearchLove Boston 2017 | Will Critchlow | Building Robot AllegiancesDistilled
Under Sundar Pichai, Google is doubling down on machine learning and artificial intelligence. Computer capabilities are improving at a frightening rate, and there are already parts of our jobs that would be better off done by robots. In this talk, Will is going to highlight the areas where humans are falling behind and give you some tips on what to do about it.
Slides from my DevOpsExpo London talk "From oops to NoOps".
They tell you in these conferences that DevOps is not about tools, but about culture. And they are partially right. I am going to tell you that it’s not only about culture or tools but also abstractions.
It is a lot about how you see software and its value. About our mental model of what software is: how it runs, evolves, and interacts with the other facets of an enterprise.
We used to view software as code. As a state of code. Now we think about software as change, as a flow. A dynamic system where people, machines, and processes interact continuously.
At Platform.sh we spend a bunch of time asking ourselves not “How do you build?” - or even “How do you build consistently?” - but rather “What does it mean to consistently build in a world where change is good?” A world that lets you push security fixes into production as soon as they’re available because you don’t want to be an Equifax but you do want stability.
In this presentation, I will go over what we think software is and why having the right ideas about software will help you get your culture right and your tooling aligned, as well as gain in productivity, and general happiness and well-being.
Similar to Feature specific analysis of reviews (20)
XtremeDistil: Multi-stage Distillation for Massive Multilingual ModelsSubhabrata Mukherjee
Massive distillation of pre-trained language models like multilingual BERT with 35x compression and 51x speedup (98% smaller and faster) retaining 95% F1-score over 41 languages
OpenTag: Open Attribute Value Extraction From Product ProfilesSubhabrata Mukherjee
Guineng Zheng, Subhabrata Mukherjee, Xin Luna Dong, Feifei Li
KDD 2018, London, UK
OpenTag brings deep learning and active learning together for state-of-the-art imputation and open entity extraction system.
Probabilistic Graphical Models for Credibility Analysis in Evolving Online Co...Subhabrata Mukherjee
One of the major hurdles preventing the full exploitation of information from online communities is the widespread concern regarding the quality and credibility of user-contributed content. We propose probabilistic graphical models that can leverage the joint interplay between multiple factors --- like user interactions, community dynamics, and textual content --- to automatically assess the credibility of user-contributed online information, expertise of users and their evolution with user-interpretable explanation. We devise new models based on Conditional Random Fields that enable applications such as extracting reliable side-effects of drugs from user-contributed posts in health forums, and identifying credible news articles in news forums.
Online communities are dynamic, as users join and leave, adapt to evolving trends, and mature over time. To capture this dynamics, we propose generative models based on Hidden Markov Model, Latent Dirichlet Allocation, and Brownian Motion to trace the continuous evolution of user expertise and their language model over time. This allows us to identify expert users and credible content jointly over time, improving state-of-the-art recommender systems by explicitly considering the maturity of users. This enables applications such as identifying useful product reviews, and detecting fake and anomalous reviews with limited information.
Continuous Experience-aware Language Model
Subhabrata Mukherjee, Stephan Günnemann and Gerhard Weikum
Proc. of the 22nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD). 2016
Experience aware Item Recommendation in Evolving Review CommunitiesSubhabrata Mukherjee
Experience aware Item Recommendation in Evolving Review Communities
Subhabrata Mukherjee, Hemank Lamba and Gerhard Weikum
IEEE International Conference in Data Mining (ICDM) 2015
Domain Cartridge: Unsupervised Framework for Shallow Domain Ontology Construc...Subhabrata Mukherjee
Subhabrata Mukherjee, Jitendra Ajmera and Sachindra Joshi.
Domain Cartridge: Unsupervised Framework for Shallow Domain Ontology Construction from Corpus
Proc. of the 23rd ACM International Conference on Information and Knowledge Management (CIKM). 2014.
Leveraging Joint Interactions for Credibility Analysis in News CommunitiesSubhabrata Mukherjee
Leveraging Joint Interactions for Credibility Analysis in News Communities,
Subhabrata Mukherjee and Gerhard Weikum,
Max Planck Institute for Informatics,
CIKM 2015
People on Drugs: Credibility of User Statements in Health ForumsSubhabrata Mukherjee
People on Drugs: Credibility of User Statements in Health Communities. Subhabrata Mukherjee, Gerhard Weikum and Cristian Danescu-Niculescu-Mizil. Proc. of the 20th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD). 2014
Author-Specific Hierarchical Sentiment Aggregation for Rating Prediction of R...Subhabrata Mukherjee
Author-Specific Hierarchical Sentiment Aggregation for Rating Prediction of Reviews, Subhabrata Mukherjee and Sachindra Joshi, In Proc. of the 9th edition of the Language Resources and Evaluation Conference (LREC 2014), Reykjavik, Iceland, May 26-31, 2014
Joint Author Sentiment Topic Model, Subhabrata Mukherjee, Gaurab Basu and Sachindra Joshi, In Proc. of the SIAM International Conference in Data Mining (SDM 2014), Pennsylvania, USA, Apr 24-26, 2014 [http://people.mpi-inf.mpg.de/~smukherjee/jast.pdf]
Leveraging Sentiment to Compute Word Similarity, Balamurali A.R., Subhabrata Mukherjee, Akshat Malu and Pushpak Bhattacharyya, In Proceedings of the 6th International Global Wordnet Conference (GWC 2011), Matsue, Japan, Jan, 2012 (http://www.cse.iitb.ac.in/~pb/papers/gwc12-sense-sa.pdf)
WikiSent : Weakly Supervised Sentiment Analysis Through Extractive Summarizat...Subhabrata Mukherjee
WikiSent : Weakly Supervised Sentiment Analysis Through Extractive Summarization With Wikipedia, Subhabrata Mukherjee and Pushpak Bhattacharyya, In Proceedings of the European Conference on Machine Learning (ECML PKDD 2012), Bristol, U.K., 24-28 Sept, 2012 (http://www.cs.bris.ac.uk/~flach/ECMLPKDD2012papers/1125567.pdf)
YouCat : Weakly Supervised Youtube Video Categorization System from Meta Data...Subhabrata Mukherjee
YouCat : Weakly Supervised Youtube Video Categorization System from Meta Data & User Comments using WordNet & Wikipedia, Subhabrata Mukherjee and Pushpak Bhattacharyya, In Proceedings of the 24th International Conference on Computational Linguistics (COLING 2012), IIT Bombay, Mumbai, Dec 8 - Dec 15, 2012 (Long Paper)
Sentiment Analysis in Twitter with Lightweight Discourse AnalysisSubhabrata Mukherjee
Sentiment Analysis in Twitter with Lightweight Discourse Analysis, Subhabrata Mukherjee and Pushpak Bhattacharyya, In Proceedings of the 24th International Conference on Computational Linguistics (COLING 2012), IIT Bombay, Mumbai, Dec 8 - Dec 15, 2012 (http://www.cse.iitb.ac.in/~pb/papers/coling12-discourse-sa.pdf)
"What does it really mean for your system to be available, or how to define w...Fwdays
We will talk about system monitoring from a few different angles. We will start by covering the basics, then discuss SLOs, how to define them, and why understanding the business well is crucial for success in this exercise.
How information systems are built or acquired puts information, which is what they should be about, in a secondary place. Our language adapted accordingly, and we no longer talk about information systems but applications. Applications evolved in a way to break data into diverse fragments, tightly coupled with applications and expensive to integrate. The result is technical debt, which is re-paid by taking even bigger "loans", resulting in an ever-increasing technical debt. Software engineering and procurement practices work in sync with market forces to maintain this trend. This talk demonstrates how natural this situation is. The question is: can something be done to reverse the trend?
"Frontline Battles with DDoS: Best practices and Lessons Learned", Igor IvaniukFwdays
At this talk we will discuss DDoS protection tools and best practices, discuss network architectures and what AWS has to offer. Also, we will look into one of the largest DDoS attacks on Ukrainian infrastructure that happened in February 2022. We'll see, what techniques helped to keep the web resources available for Ukrainians and how AWS improved DDoS protection for all customers based on Ukraine experience
The Department of Veteran Affairs (VA) invited Taylor Paschal, Knowledge & Information Management Consultant at Enterprise Knowledge, to speak at a Knowledge Management Lunch and Learn hosted on June 12, 2024. All Office of Administration staff were invited to attend and received professional development credit for participating in the voluntary event.
The objectives of the Lunch and Learn presentation were to:
- Review what KM ‘is’ and ‘isn’t’
- Understand the value of KM and the benefits of engaging
- Define and reflect on your “what’s in it for me?”
- Share actionable ways you can participate in Knowledge - - Capture & Transfer
"$10 thousand per minute of downtime: architecture, queues, streaming and fin...Fwdays
Direct losses from downtime in 1 minute = $5-$10 thousand dollars. Reputation is priceless.
As part of the talk, we will consider the architectural strategies necessary for the development of highly loaded fintech solutions. We will focus on using queues and streaming to efficiently work and manage large amounts of data in real-time and to minimize latency.
We will focus special attention on the architectural patterns used in the design of the fintech system, microservices and event-driven architecture, which ensure scalability, fault tolerance, and consistency of the entire system.
Conversational agents, or chatbots, are increasingly used to access all sorts of services using natural language. While open-domain chatbots - like ChatGPT - can converse on any topic, task-oriented chatbots - the focus of this paper - are designed for specific tasks, like booking a flight, obtaining customer support, or setting an appointment. Like any other software, task-oriented chatbots need to be properly tested, usually by defining and executing test scenarios (i.e., sequences of user-chatbot interactions). However, there is currently a lack of methods to quantify the completeness and strength of such test scenarios, which can lead to low-quality tests, and hence to buggy chatbots.
To fill this gap, we propose adapting mutation testing (MuT) for task-oriented chatbots. To this end, we introduce a set of mutation operators that emulate faults in chatbot designs, an architecture that enables MuT on chatbots built using heterogeneous technologies, and a practical realisation as an Eclipse plugin. Moreover, we evaluate the applicability, effectiveness and efficiency of our approach on open-source chatbots, with promising results.
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-EfficiencyScyllaDB
Freshworks creates AI-boosted business software that helps employees work more efficiently and effectively. Managing data across multiple RDBMS and NoSQL databases was already a challenge at their current scale. To prepare for 10X growth, they knew it was time to rethink their database strategy. Learn how they architected a solution that would simplify scaling while keeping costs under control.
Discover top-tier mobile app development services, offering innovative solutions for iOS and Android. Enhance your business with custom, user-friendly mobile applications.
What is an RPA CoE? Session 2 – CoE RolesDianaGray10
In this session, we will review the players involved in the CoE and how each role impacts opportunities.
Topics covered:
• What roles are essential?
• What place in the automation journey does each role play?
Speaker:
Chris Bolin, Senior Intelligent Automation Architect Anika Systems
The Microsoft 365 Migration Tutorial For Beginner.pptxoperationspcvita
This presentation will help you understand the power of Microsoft 365. However, we have mentioned every productivity app included in Office 365. Additionally, we have suggested the migration situation related to Office 365 and how we can help you.
You can also read: https://www.systoolsgroup.com/updates/office-365-tenant-to-tenant-migration-step-by-step-complete-guide/
"NATO Hackathon Winner: AI-Powered Drug Search", Taras KlobaFwdays
This is a session that details how PostgreSQL's features and Azure AI Services can be effectively used to significantly enhance the search functionality in any application.
In this session, we'll share insights on how we used PostgreSQL to facilitate precise searches across multiple fields in our mobile application. The techniques include using LIKE and ILIKE operators and integrating a trigram-based search to handle potential misspellings, thereby increasing the search accuracy.
We'll also discuss how the azure_ai extension on PostgreSQL databases in Azure and Azure AI Services were utilized to create vectors from user input, a feature beneficial when users wish to find specific items based on text prompts. While our application's case study involves a drug search, the techniques and principles shared in this session can be adapted to improve search functionality in a wide range of applications. Join us to learn how PostgreSQL and Azure AI can be harnessed to enhance your application's search capability.
Must Know Postgres Extension for DBA and Developer during MigrationMydbops
Mydbops Opensource Database Meetup 16
Topic: Must-Know PostgreSQL Extensions for Developers and DBAs During Migration
Speaker: Deepak Mahto, Founder of DataCloudGaze Consulting
Date & Time: 8th June | 10 AM - 1 PM IST
Venue: Bangalore International Centre, Bangalore
Abstract: Discover how PostgreSQL extensions can be your secret weapon! This talk explores how key extensions enhance database capabilities and streamline the migration process for users moving from other relational databases like Oracle.
Key Takeaways:
* Learn about crucial extensions like oracle_fdw, pgtt, and pg_audit that ease migration complexities.
* Gain valuable strategies for implementing these extensions in PostgreSQL to achieve license freedom.
* Discover how these key extensions can empower both developers and DBAs during the migration process.
* Don't miss this chance to gain practical knowledge from an industry expert and stay updated on the latest open-source database trends.
Mydbops Managed Services specializes in taking the pain out of database management while optimizing performance. Since 2015, we have been providing top-notch support and assistance for the top three open-source databases: MySQL, MongoDB, and PostgreSQL.
Our team offers a wide range of services, including assistance, support, consulting, 24/7 operations, and expertise in all relevant technologies. We help organizations improve their database's performance, scalability, efficiency, and availability.
Contact us: info@mydbops.com
Visit: https://www.mydbops.com/
Follow us on LinkedIn: https://in.linkedin.com/company/mydbops
For more details and updates, please follow up the below links.
Meetup Page : https://www.meetup.com/mydbops-databa...
Twitter: https://twitter.com/mydbopsofficial
Blogs: https://www.mydbops.com/blog/
Facebook(Meta): https://www.facebook.com/mydbops/
"Choosing proper type of scaling", Olena SyrotaFwdays
Imagine an IoT processing system that is already quite mature and production-ready and for which client coverage is growing and scaling and performance aspects are life and death questions. The system has Redis, MongoDB, and stream processing based on ksqldb. In this talk, firstly, we will analyze scaling approaches and then select the proper ones for our system.
In the realm of cybersecurity, offensive security practices act as a critical shield. By simulating real-world attacks in a controlled environment, these techniques expose vulnerabilities before malicious actors can exploit them. This proactive approach allows manufacturers to identify and fix weaknesses, significantly enhancing system security.
This presentation delves into the development of a system designed to mimic Galileo's Open Service signal using software-defined radio (SDR) technology. We'll begin with a foundational overview of both Global Navigation Satellite Systems (GNSS) and the intricacies of digital signal processing.
The presentation culminates in a live demonstration. We'll showcase the manipulation of Galileo's Open Service pilot signal, simulating an attack on various software and hardware systems. This practical demonstration serves to highlight the potential consequences of unaddressed vulnerabilities, emphasizing the importance of offensive security practices in safeguarding critical infrastructure.
This talk will cover ScyllaDB Architecture from the cluster-level view and zoom in on data distribution and internal node architecture. In the process, we will learn the secret sauce used to get ScyllaDB's high availability and superior performance. We will also touch on the upcoming changes to ScyllaDB architecture, moving to strongly consistent metadata and tablets.
Essentials of Automations: Exploring Attributes & Automation ParametersSafe Software
Building automations in FME Flow can save time, money, and help businesses scale by eliminating data silos and providing data to stakeholders in real-time. One essential component to orchestrating complex automations is the use of attributes & automation parameters (both formerly known as “keys”). In fact, it’s unlikely you’ll ever build an Automation without using these components, but what exactly are they?
Attributes & automation parameters enable the automation author to pass data values from one automation component to the next. During this webinar, our FME Flow Specialists will cover leveraging the three types of these output attributes & parameters in FME Flow: Event, Custom, and Automation. As a bonus, they’ll also be making use of the Split-Merge Block functionality.
You’ll leave this webinar with a better understanding of how to maximize the potential of automations by making use of attributes & automation parameters, with the ultimate goal of setting your enterprise integration workflows up on autopilot.
Essentials of Automations: Exploring Attributes & Automation Parameters
Feature specific analysis of reviews
1. Feature Specific Sentiment Analysis of Reviews
Subhabrata Mukherjee and Pushpak Bhattacharyya
Dept. of Computer Science and Engineering, IIT Bombay
13th International Conference on Intelligent Text Processing and Computational Intelligence (CICLING 2012), New Delhi, India, March 2012
2. MOTIVATION CONTD…
Sentiment analysis is always with respect to a particular entity or feature
A feature may be implicit or explicit
This work concerns explicit features
3. MOTIVATION
I have an ipod and it is a great buy but I'm probably the only person that dislikes the iTunes software.
Here the sentiment w.r.t. ipod is positive, whereas that w.r.t. the software is negative
4. ENTITY AND FEATURES
An entity may be analyzed from the point of view of multiple features
Entity – Titanic
Features – Music, Direction, Plot etc.
Given a sentence, how to identify the set of features?
5. SCENARIO
Each sentence can contain multiple features and mixed opinions (positive and negative)
Reviews mixed from various domains
No prior information about the set of features except the target feature
6. MAIN FEATURES OF THE ALGORITHM
Does not require any prior information about any domain
Unsupervised – but needs a small untagged dataset to tune parameters
Does not require any prior feature set
Groups the set of features into separate clusters, which need to be pruned or labeled
8. Hypothesis Example
“I want to use Samsung which is a great product but am not so sure about using Nokia”.
Here “great” and “product” are related by an adjective modifier relation; “product” and “Samsung” are related by a relative clause modifier relation. Thus “great” and “Samsung” are transitively related.
Here “great” and “product” are more related to “Samsung” than they are to “Nokia”.
Hence “great” and “product” come together to express an opinion about the entity “Samsung” rather than about the entity “Nokia”.
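The transitive relatedness sketched above can be viewed as a breadth-first walk over dependency edges starting from the target entity. The triples below are hypothetical, hand-written stand-ins for parser output covering only the significant relations, not an actual parse:

```python
from collections import defaultdict, deque

# Hypothetical dependency triples (relation, governor, dependent) for the
# example sentence; in practice these would come from a dependency parser.
edges = [
    ("amod", "product", "great"),     # "great" modifies "product"
    ("rcmod", "Samsung", "product"),  # relative clause modifier
    ("dobj", "use", "Samsung"),
    ("dobj", "using", "Nokia"),
]

def related_words(target, edges):
    """Collect all words transitively related to `target` by walking the
    dependency graph in both directions (breadth-first search)."""
    graph = defaultdict(set)
    for _, gov, dep in edges:
        graph[gov].add(dep)
        graph[dep].add(gov)
    seen, queue = {target}, deque([target])
    while queue:
        word = queue.popleft()
        for nbr in graph[word]:
            if nbr not in seen:
                seen.add(nbr)
                queue.append(nbr)
    return seen - {target}

print(related_words("Samsung", edges))
```

With these edges, “great” and “product” reach “Samsung” but not “Nokia”, matching the intuition on the slide.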
16. Example of a Review
I have an ipod and it is a great buy but I'm probably the only person that dislikes the iTunes software.
22. Feature Extraction: Domain Info Not Available
Initially, all the nouns are treated as features and added to the feature list F.
F = { ipod, buy, person, software }
Pruning the feature set: merge 2 features if they are strongly related
“buy” merged with “ipod”, when target feature = “ipod”; “person, software” will be ignored.
“person” merged with “software”, when target feature = “software”; “ipod, buy” will be ignored.
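A minimal sketch of this pruning step, assuming dependency triples are available from a parser. `prune_features` and the toy parse below are illustrative, not the paper's implementation:

```python
# Hypothetical pruning step: start from all candidate noun features, merge a
# candidate into the target when the two are linked by a dependency relation,
# and ignore the rest.
def prune_features(nouns, target, edges):
    linked = {target}
    for _, gov, dep in edges:
        if gov == target:
            linked.add(dep)
        elif dep == target:
            linked.add(gov)
    merged = [n for n in nouns if n in linked]
    ignored = [n for n in nouns if n not in linked]
    return merged, ignored

nouns = ["ipod", "buy", "person", "software"]
# Toy parse linking "buy" to "ipod" and "person" to "software".
parse = [("conj", "ipod", "buy"), ("nn", "software", "person")]
merged, ignored = prune_features(nouns, "ipod", parse)
print(merged, ignored)  # -> ['ipod', 'buy'] ['person', 'software']
```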
27. Relations
Direct Neighbor Relation
Captures short-range dependencies
Any 2 consecutive words (such that neither of them is a stop word) are directly related
Consider a sentence S and 2 consecutive words w_i, w_{i+1} ∈ S. If neither is a stop word, then they are directly related.
Dependency Relation
Captures long-range dependencies
Let Dependency_Relation be the list of significant relations.
Any 2 words w_i and w_j in S are directly related if there exists a relation R ∈ Dependency_Relation s.t. R(w_i, w_j) holds.
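The two relation types above can be sketched in a few lines. The stop-word list, the significant-relation set, and the toy parse are hypothetical placeholders, not the paper's actual resources:

```python
# Assumed, abbreviated stop-word list and significant-relation set.
STOP_WORDS = {"i", "a", "an", "the", "is", "that", "and", "but", "to"}
SIGNIFICANT = {"nsubj", "dobj", "amod", "advmod", "nn", "neg"}

def direct_neighbor_pairs(tokens):
    """Short-range: consecutive words are related when neither is a stop word."""
    return [
        (w1, w2)
        for w1, w2 in zip(tokens, tokens[1:])
        if w1.lower() not in STOP_WORDS and w2.lower() not in STOP_WORDS
    ]

def dependency_pairs(edges):
    """Long-range: words linked by a significant dependency relation."""
    return [(gov, dep) for rel, gov, dep in edges if rel in SIGNIFICANT]

tokens = "I have an ipod and it is a great buy".split()
parse = [("amod", "buy", "great"), ("dobj", "have", "ipod")]
print(direct_neighbor_pairs(tokens))  # -> [('great', 'buy')]
print(dependency_pairs(parse))        # -> [('buy', 'great'), ('have', 'ipod')]
```

The union of both pair sets gives the word-word graph that the feature clustering and sentiment aggregation operate on.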
42. Evaluation – Dataset 1
2500 sentences
Varied domains like antivirus, camera, dvd, ipod, music player, router, mobile
Each sentence tagged with a feature and polarity w.r.t. the feature
Acid Test: each review has a mix of positive and negative comments
43. Parameter Learning
Dependency parsing uses approx. 40 relations.
Relation space – (2^40 − 1) possible subsets.
Infeasible to probe the entire relation space.
Fix relations certain to be significant: nsubj, nsubjpass, dobj, amod, advmod, nn, neg
Reject relations certain to be non-significant
44. Parameter Learning Contd…
This leaves around 21 relations, some of which may not be significant.
Compute leave-one-relation-out accuracy over a training set.
Find the relations for which there is a significant accuracy change.
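The leave-one-relation-out search can be sketched as below. `evaluate` stands in for running the full feature-specific classifier with a given relation set; the toy score table uses the (hypothetically rounded) numbers from the slide that follows:

```python
# Leave-one-relation-out search: drop each candidate relation in turn,
# re-evaluate on the tuning set, and keep a relation only if removing it
# hurts accuracy by at least `threshold`.
def significant_relations(candidates, evaluate, threshold=1.0):
    baseline = evaluate(set(candidates))
    keep = set()
    for rel in candidates:
        acc_without = evaluate(set(candidates) - {rel})
        if baseline - acc_without >= threshold:  # removal hurts -> significant
            keep.add(rel)
    return keep

# Toy evaluator mirroring the dep/rcmod table: removing either relation
# actually *improves* accuracy, so neither is kept.
scores = {frozenset({"dep", "rcmod"}): 66.0,
          frozenset({"rcmod"}): 69.0,   # without dep
          frozenset({"dep"}): 67.0,     # without rcmod
          frozenset(): 68.0}
evaluate = lambda rels: scores[frozenset(rels)]
print(significant_relations(["dep", "rcmod"], evaluate))  # -> set()
```

An empty result here matches the deck's conclusion that dep (and rcmod) can be dropped from the significant-relation set.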
48. Significant Relations Contd…
Leaving out dep improves accuracy most

Relation Set          Accuracy
With Dep+Rcmod        66
Without Dep           69
Without Rcmod         67
Without Dep+Rcmod     68
56. Evaluation – Dataset 2
Extracted 500 sentences
Varied domains like camera, laptop, mobile
Each sentence tagged with a feature and polarity w.r.t. the feature
“Exploiting Coherence for the Simultaneous Discovery of Latent Facets and associated Sentiments”
60. CONCLUSIONS
Incorporating feature specificity improves sentiment accuracy.
Dependency relations capture long-range dependencies, as is evident from the accuracy improvement.
Work to be extended for implicit features and domain-dependent sentiment.