Published in: Technology
Sais svcc

  1. By: Pradeep Pujari
  2. • Sentiment Analysis?
     • Sentiment Analysis – General Architecture
     • Little Lucene
     • Sentiment Analysis and Solr
     • Applications of Sentiment Analysis
     • Code Walkthrough
  3. Who am I? Working mostly in the search domain. Search = IR + ML + NLP.
  4. Who am I? Contributing to SolrSherlock, an open source project: http://solrsherlock.github.io/SolrSherlock/
  5. What is Sentiment Analysis? A linguistic analysis technique that identifies opinion early in a piece of text. Examples: "The movie is great." "The movie stars Mr. X." "The movie is horrible."
  6. What is Sentiment Analysis? Challenging. Difficulty: too easy / too hard. Misclassification.
  7. What is Sentiment Analysis? Sentiment Analysis, NLP, Cognitive Science.
  8. What is Sentiment Analysis? Humans can easily understand emotions. Can a machine be trained to do it?
  9. • SA offers organizations the ability to monitor in real time and act accordingly
     • Marketing managers, PR firms, campaign managers, politicians, equity investors, and online shoppers are direct beneficiaries
     • http://www.tweetfeel.com
     • http://www.nytimes.com/interactive/us/politics/2010-twitter-candidates.html
  10. • Document-Level: supervised/unsupervised learning
      • Sentence-Level: supervised learning
      • Feature-Based Sentiment Analysis: all NP in corpus and polarity
      • Sentiment Lexicon Acquisition: WordNet
  11. • Open-source Java-based search engine
      • Provides document indexing with arbitrary fields and fast search
      • Several relevance and ranking algorithms
  12. 1. Create an index
      2. Add ‘document’ representations of items
      3. Construct queries
      4. Ask for results (will be scored)
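  13. // ItemInfo and getItems() are the slide's placeholders for application data.
      IndexWriterConfig config = /* configure */ ;
      Directory dir = FSDirectory.open(indexFile);
      IndexWriter w = new IndexWriter(dir, config);
      for (ItemInfo item : getItems()) {
          Document doc = new Document();
          // TextField (Lucene 4+) indexes the value as analyzed text and stores it.
          doc.add(new TextField("title", item.title, Field.Store.YES));
          doc.add(new TextField("tags", item.tags, Field.Store.YES));
          w.addDocument(doc);
      }
      w.close();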
  14. IndexSearcher idx = getIndexSearcher();
      IndexReader reader = idx.getIndexReader();
      TopDocs results = idx.search(q, n + 1);
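The four steps on slide 12 (create an index, add documents, construct a query, ask for scored results) can be illustrated without any library. The following is a minimal sketch of a toy in-memory inverted index with term-frequency scoring in plain Java. It is not Lucene's API or its ranking; the class name `ToyIndex` and its methods are hypothetical, chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy inverted index (hypothetical, not Lucene): term -> (doc id -> term frequency). */
final class ToyIndex {
    private final Map<String, Map<Integer, Integer>> postings = new HashMap<>();
    private int nextId = 0;

    /** Step 2: add a document and return its id. */
    int add(String text) {
        int id = nextId++;
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, t -> new HashMap<>()).merge(id, 1, Integer::sum);
        }
        return id;
    }

    /** Steps 3 and 4: score each document by summed term frequency, return the top n ids. */
    List<Integer> search(String query, int n) {
        Map<Integer, Integer> scores = new HashMap<>();
        for (String term : tokenize(query)) {
            Map<Integer, Integer> hits =
                    postings.getOrDefault(term, Collections.<Integer, Integer>emptyMap());
            hits.forEach((id, tf) -> scores.merge(id, tf, Integer::sum));
        }
        List<Integer> ids = new ArrayList<>(scores.keySet());
        // Highest score first; ties go to the document added earlier.
        ids.sort((a, b) -> {
            int byScore = Integer.compare(scores.get(b), scores.get(a));
            return byScore != 0 ? byScore : Integer.compare(a, b);
        });
        return new ArrayList<>(ids.subList(0, Math.min(n, ids.size())));
    }

    /** Lowercase and split on anything that is not a letter or digit. */
    private static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();            // Step 1: create an index
        index.add("The movie is great");
        index.add("The movie stars Mr. X");
        index.add("The movie is horrible");
        System.out.println(index.search("great movie", 2));
    }
}
```

A real Lucene index adds analyzers, per-field storage, and a far more refined similarity function; the sketch only shows why results come back scored and ordered.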
  15. • PyLucene provides Python bindings to Java Lucene
      • Lucy is in C with bindings for other languages
      • Lucene.NET
      • Solr provides a search server (with REST API) on top of Lucene
  16. Solr? Architecture components: Http Request Servlet, Admin Interface, Update Servlet, Standard Request Handler, Custom Request Handler, Response Writer, Solr Core, Lucene, Analysis, UIMA config, Caching, Update Handler.
  17. Why Solr?
      • Linguistics module: stems, lemmas and synonyms
      • Multi-language capability: CJKAnalyzer, UIMA analyzers
      • UIMA integration: UpdateProcessorChain
  18. Why Solr?
      • Extract domain-specific entities and concepts
      • Time and cost: Solr setup, 5 mins; UIMA annotators, 5 days
      • Enrich text, write to dedicated field
  19. Applications: Tagging entities in review text. "I wasn't really in the market for another tablet, but my girlfriend ended up getting one for me so she got me on this one. I would like to say that this tablet reminds me of the first Motorola Droid smartphone that came out several years back. The phone jam packed a ton of bells & whistles into its hardware and software to give a lot of bang for your buck. This is what it feels like amazon has done with the Kindle Fire 8.9. They have put a lot of advanced hardware and innovative software, so for the average user, specially someone who absorbs a lot of media, you get a lot for the price. But just because you get a lot for the price, doesn't mean it is without its flaws."
  20. Applications:
      • Consumer feedback about products
      • Which product features are more relevant
      • Polarity
  21. Use case: Digital SLR with Full 1080p HD Video. "There are many preprogrammed scene modes that make this a very easy camera to use. The picture quality is beyond belief, and even better for the price." Price:
  22. Why UIMA?
      • UIMA Framework manages components and data flow – no coding
      • Deploy pipeline of analysis engines
      • AEs wrap NLP algorithms
      • Aggregate analysis engine: Language Detection, Sentence Annotator, POS Annotator, NER (Person, Place, Organization)
  23. Solr+UIMA components: Data, Solr Update RequestProcessor, UIMA AE, Index, Lucene, Solr QParser.
  24. NLP+UIMA:
      • Use POS in query understanding
      • Boosting terms
      • Synonym expansion
      • Extract concepts/entities
      • Faceting using entities
      • Identify places in query and use spatial queries
  25. Ideas: Sentiment Analysis App
      • Identify subjective sentences from text
      • Remove noisy sentences – regex, conditional probability
      • Graph min cut – LingPipe
      • Subjectivity lexicons
      • Discard facts and objective sentences
  26. Ideas: Sentiment Analysis App. Subjectivity detector (Subjective / Objective), then Polarity Classifier.
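The two-stage idea on slides 25 and 26 (discard objective sentences first, then classify the polarity of what remains) can be sketched in a few lines. This is a minimal illustration under stated assumptions: the word lists are tiny hypothetical lexicons, the class name `TwoStageSentiment` is made up, and a simple lexicon lookup stands in for the regex, conditional-probability, or min-cut subjectivity detectors the slides mention.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical two-stage sketch: subjectivity detector, then polarity classifier. */
final class TwoStageSentiment {
    // Toy lexicons, assumed for illustration only.
    private static final Set<String> POSITIVE_WORDS =
            new HashSet<>(Arrays.asList("great", "easy", "innovative"));
    private static final Set<String> NEGATIVE_WORDS =
            new HashSet<>(Arrays.asList("horrible", "flaws", "poor"));

    enum Label { OBJECTIVE, POSITIVE, NEGATIVE, NEUTRAL }

    /** Stage 1: a sentence counts as subjective if it contains any lexicon word. */
    static boolean isSubjective(String sentence) {
        for (String w : words(sentence)) {
            if (POSITIVE_WORDS.contains(w) || NEGATIVE_WORDS.contains(w)) {
                return true;
            }
        }
        return false;
    }

    /** Stage 2: polarity by counting positive minus negative words. */
    static Label classify(String sentence) {
        if (!isSubjective(sentence)) {
            return Label.OBJECTIVE; // facts are discarded before polarity is computed
        }
        int score = 0;
        for (String w : words(sentence)) {
            if (POSITIVE_WORDS.contains(w)) score++;
            if (NEGATIVE_WORDS.contains(w)) score--;
        }
        if (score > 0) return Label.POSITIVE;
        if (score < 0) return Label.NEGATIVE;
        return Label.NEUTRAL;
    }

    /** Lowercase and split on non-letters. */
    private static String[] words(String sentence) {
        return sentence.toLowerCase().split("[^a-z]+");
    }

    public static void main(String[] args) {
        System.out.println(classify("The movie is great."));
        System.out.println(classify("The movie stars Mr. X"));
        System.out.println(classify("The movie is horrible."));
    }
}
```

Filtering objective sentences first keeps factual statements such as "The movie stars Mr. X" from diluting or confusing the polarity stage.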
  27. Ideas: Sentiment Analysis App
      • Sentiment intensity: SentiWordNet
      • WordNet-Affect: WordNet + annotated concepts
      • Hybrid model with adding dictionary
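Sentiment intensity from a scored lexicon can be sketched as follows. SentiWordNet attaches positivity and negativity scores to WordNet synsets; the sketch below assumes a much simpler word-level dictionary with made-up scores, and averages positive minus negative over the words it recognizes. The class name `IntensityScorer`, the entries, and the numbers are all hypothetical and do not come from SentiWordNet.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical intensity scorer over a tiny word -> {positive, negative} dictionary. */
final class IntensityScorer {
    private static final Map<String, double[]> LEXICON = new HashMap<>();
    static {
        // Made-up scores for illustration; real values would come from a lexicon resource.
        LEXICON.put("great", new double[] {0.75, 0.0});
        LEXICON.put("easy", new double[] {0.5, 0.125});
        LEXICON.put("horrible", new double[] {0.0, 0.875});
    }

    /** Mean of (positive - negative) over recognized words; 0.0 when none match. */
    static double intensity(String text) {
        double sum = 0.0;
        int matched = 0;
        for (String w : text.toLowerCase().split("[^a-z]+")) {
            double[] scores = LEXICON.get(w);
            if (scores != null) {
                sum += scores[0] - scores[1];
                matched++;
            }
        }
        return matched == 0 ? 0.0 : sum / matched;
    }

    public static void main(String[] args) {
        System.out.println(intensity("The movie is great"));
        System.out.println(intensity("The movie is horrible"));
    }
}
```

In a hybrid model, a score like this could be one feature alongside a trained classifier's output rather than the final answer on its own.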
  28. Update Processor Chain. Diagram labels: Product Reviews; Update Handler with processor chain; Remove Duplicates processor; Logging processor; Custom Transform processor; Index processor; Sentence Detection processor; Sentiment Classifier; Company Name Annotator; Sentiment Score processor; Text Analyzers; Lucene; Lucene Index.
  29. Let’s look at the code
  30. • Data transformation or post processing
      • UpdateProcessorFactory
      • LogUpdateProcessorFactory
      • UIMAUpdateProcessorFactory
      • UpdateRequestProcessorChain: pipeline of UpdateRequestProcessors
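The pipeline idea behind an update processor chain can be sketched in plain Java: each processor may enrich the incoming document and then hands it to the next one, with the last processor writing to the index. This is an illustrative stand-in, not Solr's classes; Solr's own `UpdateRequestProcessor` works on an `AddUpdateCommand` rather than a `Map`. The names `ProcessorChainSketch`, `SentimentProcessor`, and `IndexProcessor` are hypothetical, and the keyword check is a placeholder for a real classifier. The field names `content_text` and `sentiment_type` are taken from the configuration on slide 33.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical plain-Java sketch of a chain of update processors. */
final class ProcessorChainSketch {

    /** Base processor: by default just forwards the document to the next link. */
    abstract static class Processor {
        private final Processor next;

        Processor(Processor next) {
            this.next = next;
        }

        void processAdd(Map<String, Object> doc) {
            if (next != null) {
                next.processAdd(doc);
            }
        }
    }

    /** Enriches the document with a sentiment label before passing it on. */
    static final class SentimentProcessor extends Processor {
        SentimentProcessor(Processor next) {
            super(next);
        }

        @Override
        void processAdd(Map<String, Object> doc) {
            String text = String.valueOf(doc.getOrDefault("content_text", "")).toLowerCase();
            // Placeholder rule standing in for a real sentiment classifier.
            String label = text.contains("great") ? "positive"
                    : text.contains("horrible") ? "negative" : "neutral";
            doc.put("sentiment_type", label);
            super.processAdd(doc);
        }
    }

    /** Last link: stores the (now enriched) document in an in-memory "index". */
    static final class IndexProcessor extends Processor {
        final List<Map<String, Object>> index = new ArrayList<>();

        IndexProcessor() {
            super(null);
        }

        @Override
        void processAdd(Map<String, Object> doc) {
            index.add(doc);
        }
    }

    public static void main(String[] args) {
        IndexProcessor sink = new IndexProcessor();
        Processor chain = new SentimentProcessor(sink);
        Map<String, Object> doc = new HashMap<>();
        doc.put("content_text", "The movie is great.");
        chain.processAdd(doc);
        System.out.println(sink.index);
    }
}
```

Because enrichment happens before the indexing step, the added field is stored with the document and can later be searched or faceted on like any other field.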
  31. <requestHandler name="/update" class="solr.XmlUpdateRequestHandler">
        <lst name="defaults">
          <str name="update.processor">uima</str>
        </lst>
      </requestHandler>
  32. • Stanford NER
  33. <updateRequestProcessorChain name="uima">
        <processor class="org.apache.solr.uima.processor.UIMAUpdateRequestProcessorFactory">
          <lst name="uimaConfig">
            <lst name="runtimeParameters">
            </lst>
            <lst name="analysisEngine">
              <str name="defaultanalysisEngine">/org/apache/uima/desc/OverridingParamsExtServicesAE.xml</str>
            </lst>
            <lst name="analyzeFields">
              <bool name="merge">false</bool>
              <arr name="fields">
                <str>content_text</str>
              </arr>
            </lst>
            <lst name="fieldMappings">
              <lst name="type">
                <str name="name">org.apache.uima.DictionaryEntry</str>
                <lst name="mapping">
                  <str name="feature">coveredText</str>
                  <str name="field">sentiment_keyword,sentiment_type</str>
                </lst>
              </lst>
            </lst>
          </lst>
        </processor>
      </updateRequestProcessorChain>
  34. • http://lucene.apache.org/solr/
      • http://uima.apache.org/
      • http://alias-i.com/lingpipe/demos/tutorial/sentiment/read-me.html
      • http://openie.cs.washington.edu/
      • http://wiki.apache.org/solr/SolrUIMA
  35. Questions?
  36. Thank You. Email: pradeepp@rocketmail.com
