This project builds word clouds from customer reviews. We use Natural Language Processing tools such as a Probabilistic Context-Free Grammar (PCFG), a neural-network-based dependency parser and WordNet to extract phrases and group semantically similar ones into word clouds.
2. Index
- Introduction and Motivation for the Project
- Inspiration - TripAdvisor
- Stages of Word Clouds Extraction
- Grammar based Approach
- Grammar + Relationship based approach
- Entity Grouping Logic
- Demo
4. Stages of Word Clouds Extraction
1. Cleaning Reviews - grouping reviews by doctor id
2. POS Tagging - tagging each word with its part of speech
3. Extracting Grammatical Sentences - extracting complete sentences from each review
4. Extracting Dependencies within a Sentence from the Model - extracting the dependencies between parts of speech within a sentence
5. Entity Grouping Logic - grouping related entities into word clouds
6. Structuring Results - structuring the results as JSON
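The first and last stages above can be sketched in a few lines of plain Python. This is only an illustration: the field names (`doctor_id`, `text`, `word_clouds`, `label`, `phrases`) and the whitespace-only cleaning step are assumptions, not the project's actual schema or cleaning rules.

```python
import json
import re
from collections import defaultdict


def clean_text(text):
    """Collapse runs of whitespace and strip both ends of a review."""
    return re.sub(r"\s+", " ", text).strip()


def group_reviews_by_doctor(reviews):
    """Stage 1: map each doctor id to its cleaned, non-empty reviews."""
    grouped = defaultdict(list)
    for review in reviews:
        text = clean_text(review["text"])  # "text" is an assumed field name
        if text:
            grouped[review["doctor_id"]].append(text)  # assumed field name
    return dict(grouped)


def structure_results(clouds_by_doctor):
    """Stage 6: serialise {doctor_id: {cloud_label: phrases}} to JSON."""
    payload = [
        {
            "doctor_id": doctor_id,
            "word_clouds": [
                {"label": label, "phrases": sorted(phrases)}
                for label, phrases in sorted(clouds.items())
            ],
        }
        for doctor_id, clouds in sorted(clouds_by_doctor.items())
    ]
    return json.dumps(payload, indent=2)
```

Stages 2 to 5 would sit between these two functions, turning each doctor's list of reviews into the `{cloud_label: phrases}` mapping that `structure_results` expects.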
5. Grammar based Approach
Sentence Chunker
Probabilistic Context-Free Grammar (PCFG) based model, built on rules.
- Each rule carries a conditional probability: the probability of its expansion given the symbol being expanded
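To make the idea of rules with conditional probabilities concrete, here is a minimal pure-Python sketch. The toy grammar, its probabilities and the example tree are invented for illustration and are not the project's grammar; lexical (word-level) probabilities are left out to keep it short.

```python
from collections import defaultdict
from math import isclose

# Hypothetical toy grammar: each value is P(right-hand side | left-hand side).
RULES = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("DT", "NN")): 0.6,
    ("NP", ("JJ", "NN")): 0.4,
    ("VP", ("VBZ", "JJ")): 0.7,
    ("VP", ("VBZ", "NP")): 0.3,
}


def is_proper(rules):
    """Check that the probabilities of all rules sharing a left-hand side sum to 1."""
    totals = defaultdict(float)
    for (lhs, _rhs), probability in rules.items():
        totals[lhs] += probability
    return all(isclose(total, 1.0) for total in totals.values())


def tree_probability(tree, rules):
    """Multiply the probabilities of every rule used in a parse tree.

    A tree is (label, child, child, ...); a leaf is (pos_tag, "word").
    Leaves count as probability 1.0 because lexical rules are omitted here.
    """
    label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):
        return 1.0
    rhs = tuple(child[0] for child in children)
    probability = rules.get((label, rhs), 0.0)  # unknown rule -> 0.0
    for child in children:
        probability *= tree_probability(child, rules)
    return probability


# Example parse of "the doctor is friendly" under the toy grammar.
TREE = (
    "S",
    ("NP", ("DT", "the"), ("NN", "doctor")),
    ("VP", ("VBZ", "is"), ("JJ", "friendly")),
)
```

For `TREE`, the rules used are S -> NP VP (1.0), NP -> DT NN (0.6) and VP -> VBZ JJ (0.7), so `tree_probability(TREE, RULES)` is their product, 0.42 up to floating-point rounding. A chunker built on a PCFG would prefer the parse with the highest such product.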
6. Grammar + Relationship based approach
POS Tagging
- I’m happy to share my feelings
Relationships between POS Tags
- happy/JJ -> share/VB, feelings/NNS -> my/PRP$
Final Solution:-
PCFG + Neural Net based Dependency Parsing
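The relationship step can be illustrated with a small sketch that works on head -> dependent edges. The tokens, Penn Treebank style tags and edges below are written by hand as a stand-in for what a POS tagger and dependency parser might return for the example sentence; no real parser is called, and the function names are hypothetical.

```python
from collections import defaultdict

# Hand-written stand-in for tagger output: (word, POS tag) per token.
TOKENS = [
    ("I", "PRP"), ("'m", "VBP"), ("happy", "JJ"), ("to", "TO"),
    ("share", "VB"), ("my", "PRP$"), ("feelings", "NNS"),
]

# Hand-written stand-in for parser output: (head index, dependent index).
EDGES = [(2, 0), (2, 1), (2, 4), (4, 3), (4, 6), (6, 5)]


def dependents_by_head(edges):
    """Index the edges so each head maps to the list of its dependents."""
    children = defaultdict(list)
    for head, dependent in edges:
        children[head].append(dependent)
    return children


def format_edges(tokens, edges):
    """Render each edge as 'head/TAG -> dependent/TAG'."""
    return ["{}/{} -> {}/{}".format(*tokens[h], *tokens[d]) for h, d in edges]


def noun_phrases(tokens, edges):
    """For every noun, join it with its direct dependents in sentence order."""
    children = dependents_by_head(edges)
    phrases = []
    for index, (_word, tag) in enumerate(tokens):
        if tag.startswith("NN"):
            span = sorted(children[index] + [index])
            phrases.append(" ".join(tokens[i][0] for i in span))
    return phrases
```

With these inputs, `format_edges(TOKENS, EDGES)` includes `happy/JJ -> share/VB` and `feelings/NNS -> my/PRP$`, and `noun_phrases(TOKENS, EDGES)` returns `["my feelings"]`, the kind of noun-centred phrase that can then be passed to the grouping step.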
7. Entity Grouping Logic
Group the entities (mainly nouns) from the phrases into buckets/word clouds
1. Current Algo -> Pre-defined Static Mappings + Semantic Synonyms (using WordNet) of the entities
2. Spell-check the words while grouping them into the buckets
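The grouping steps above can be sketched as follows. Everything here is an assumption made for illustration: the static mappings and bucket names are invented, a small hand-written synonym table stands in for a WordNet lookup so that the sketch has no dependencies, and spell checking is approximated with the standard library's `difflib.get_close_matches` rather than whatever spell checker the project uses.

```python
from collections import defaultdict
from difflib import get_close_matches

# Hypothetical static mappings: entity -> bucket (word cloud) name.
STATIC_MAPPINGS = {
    "doctor": "doctor", "physician": "doctor",
    "staff": "staff", "nurse": "staff", "receptionist": "staff",
    "wait": "waiting time", "queue": "waiting time",
}

# Hypothetical stand-in for a WordNet synonym lookup.
SYNONYMS = {"doc": ["doctor"], "delay": ["wait"], "personnel": ["staff"]}


def correct_spelling(word, vocabulary, cutoff=0.8):
    """Return the closest known word, or the word itself if none is close."""
    matches = get_close_matches(word, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else word


def bucket_for(entity):
    """Spell-check an entity, then resolve it via mappings, then synonyms."""
    vocabulary = list(STATIC_MAPPINGS) + list(SYNONYMS)
    word = correct_spelling(entity.lower(), vocabulary)
    if word in STATIC_MAPPINGS:
        return STATIC_MAPPINGS[word]
    for synonym in SYNONYMS.get(word, []):
        if synonym in STATIC_MAPPINGS:
            return STATIC_MAPPINGS[synonym]
    return word  # an unseen entity becomes its own bucket


def group_phrases(entity_phrases):
    """Group (entity, phrase) pairs into {bucket: [phrases]}."""
    clouds = defaultdict(list)
    for entity, phrase in entity_phrases:
        clouds[bucket_for(entity)].append(phrase)
    return dict(clouds)
```

For example, `bucket_for("Docter")` resolves to the `doctor` bucket through the spelling step, and `bucket_for("delay")` reaches `waiting time` through the synonym table. Matching against a small known vocabulary keeps corrections conservative, at the cost of leaving misspellings of unknown words untouched.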
Future Work:-
Static Mappings -> Training Data -> Classifier (ML Model)