11. Raw comments (~250,000 out of 1 million)
Census tallies of houses with internet access
NLTK (Natural Language Toolkit)
used to build a term-document matrix,
normalized with tf-idf
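
A minimal sketch of the term-document matrix and tf-idf step, using only the Python standard library as a stand-in for NLTK. The tokenizer, the function names, and the sample comments are illustrative assumptions, not taken from the original project, and the weighting shown (relative term frequency times log inverse document frequency) is one common tf-idf variant among several.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase, keep runs of letters and apostrophes.
    return re.findall(r"[a-z']+", text.lower())


def tfidf_matrix(docs):
    """Return (vocab, rows): one row per document, one column per term."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(t for toks in tokenized for t in set(toks))
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # avoid division by zero on empty comments
        rows.append(
            [(counts[t] / total) * math.log(n_docs / df[t]) for t in vocab]
        )
    return vocab, rows


# Made-up sample comments, for illustration only.
comments = [
    "Keep the internet open",
    "Keep the internet open",
    "Protect the open internet for everyone",
]
vocab, matrix = tfidf_matrix(comments)
```

Terms that occur in every comment (here "the", "internet", "open") get weight zero under this variant, so the rarer terms dominate each row.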
Found template comments by identifying
identical rows
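
One way the identical-row check could be sketched: group matrix rows by their exact contents and report any group with more than one member. The function name, the `min_copies` parameter, and the toy matrix are assumptions for illustration.

```python
from collections import defaultdict


def find_templates(rows, min_copies=2):
    """Group row indices whose vectors are exactly equal.

    Returns a list of index groups, each with at least `min_copies` members.
    """
    groups = defaultdict(list)
    for idx, row in enumerate(rows):
        # Tuples are hashable, so an identical row maps to the same key.
        groups[tuple(row)].append(idx)
    return [idxs for idxs in groups.values() if len(idxs) >= min_copies]


# Toy matrix: rows 0, 2 and 3 are the same "template" comment.
rows = [
    [0.0, 0.5, 0.5],
    [0.3, 0.0, 0.7],
    [0.0, 0.5, 0.5],
    [0.0, 0.5, 0.5],
]
templates = find_templates(rows)
```

Exact float comparison is safe here only because identical comments produce identical arithmetic; rounding each value before grouping would be a reasonable safeguard if rows come from different pipelines.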
Scored sentiment using AFINN-111
2,477 English words rated between -5 and +5
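
A hedged sketch of lexicon-based scoring in the AFINN style: sum the rating of every word in a comment that appears in the word list. AFINN-111 is distributed as a tab-separated file of term and integer score; the four entries below are a tiny illustrative stand-in rather than values checked against the real file, and the function names are hypothetical.

```python
import re


def load_afinn(lines):
    """Parse tab-separated 'term<TAB>score' lines into a dict."""
    lexicon = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        term, score = line.rsplit("\t", 1)
        lexicon[term] = int(score)
    return lexicon


def score_comment(text, lexicon):
    # Sum the ratings of known words; unknown words contribute 0.
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(tok, 0) for tok in tokens)


# Illustrative entries only; a real run would read the AFINN-111 file.
sample_lines = ["good\t3", "bad\t-3", "love\t3", "hate\t-3"]
lexicon = load_afinn(sample_lines)
positive = score_comment("I love a good open internet", lexicon)
negative = score_comment("I hate bad rules", lexicon)
```

This single-token lookup ignores any multi-word entries in the list and does not handle negation, so it is a rough first pass rather than a full sentiment model.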
Front end built using Twitter Bootstrap, AWS,
and D3