Be the first to like this
A common task in natural language processing is category-specific lexicon mining, or identifying words and phrases that are associated with the presence or absence of a specific category. For example, lists of words associated with positive (vs. negative) product reviews may be automatically discovered from labeled corpora.
In the 1960s, the semanticists A. J. Greimas and F. Rastier developed a framework for turning two opposing categories into a network of 10 semantic classes. This talk introduces an algorithm for discovering lexicons associated with those semantic classes given a corpus of categorized documents. This algorithm is implemented as part of Scattertext, and the output can be viewed in an interactive browser-based visualization.