A common task in natural language processing is category-specific lexicon mining, or identifying words and phrases that are associated with the presence or absence of a specific category. For example, lists of words associated with positive (vs. negative) product reviews may be automatically discovered from labeled corpora.
In the 1960s, the semanticists A. J. Greimas and F. Rastier developed a framework for turning two opposing categories into a network of 10 semantic classes. This talk introduces an algorithm for discovering lexicons associated with those semantic classes given a corpus of categorized documents. This algorithm is implemented as part of Scattertext, and the output can be viewed in an interactive browser-based visualization.
Lexicon Mining for Semiotic Squares: Exploding Binary Classification
1. Lexicon Mining for Semiotic Squares: Exploding Binary Classification
Jason S. Kessler*
Data Day Texas
January 27, 2018
@jasonkessler http://bit.ly/LexiconMining
*No, not that Jason Kessler.
2. By the end of the talk, you’ll know how to programmatically create semiotic squares like this one.
Source: http://www.squadrati.com/2014/03/31/quadrato-semiotico-dei-wine-lovers/
5. Lexicon speculation
Bo Pang, Lillian Lee and Shivakumar Vaithyanathan. Thumbs up? Sentiment classification using machine learning techniques. EMNLP. 2002.
6. Lexicon mining ≠ lexicon speculation
Bo Pang, Lillian Lee and Shivakumar Vaithyanathan. Thumbs up? Sentiment classification using machine learning techniques. EMNLP. 2002.
7. Lexicon mining
• Let’s do a deep dive into lexicon mining
• We’ll walk through the notebook linked below:
• http://bit.ly/LexiconMining
9. R: ggplot2
• Most labels are legible
• Labels can overlap points
• Dense areas can be tough to read
• Not recommended.
• Use Scattertext instead, if possible.
10. Monroe et al. plot
Burt Monroe, Michael Colaresi and Kevin Quinn. Fightin' words: Lexical feature selection and evaluation for identifying the content of political conflict. Political Analysis. 2008.
• Identifies terms with z-scores > 1.96
• Labels still overlap, and only a few points are labeled
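The z-score cutoff above can be illustrated with a rough sketch. This assumes the z-scores are those of the log-odds-ratio with a Dirichlet prior described in the Monroe et al. paper. The uniform prior value, the two-sided use of the 1.96 threshold, and the toy counts are all assumptions made for illustration, not values from the talk.

```python
import math
from collections import Counter


def log_odds_z_scores(counts_a, counts_b, prior=0.01):
    """Z-score of each term's log-odds-ratio (category A vs. B), smoothed
    with a uniform Dirichlet prior. The prior value is an assumption."""
    vocab = set(counts_a) | set(counts_b)
    n_a = sum(counts_a.values())
    n_b = sum(counts_b.values())
    prior_total = prior * len(vocab)
    z_scores = {}
    for term in vocab:
        y_a = counts_a.get(term, 0) + prior
        y_b = counts_b.get(term, 0) + prior
        # Log-odds of the term within each category.
        log_odds_a = math.log(y_a / (n_a + prior_total - y_a))
        log_odds_b = math.log(y_b / (n_b + prior_total - y_b))
        # Approximate variance of the difference of the two log-odds.
        variance = 1.0 / y_a + 1.0 / y_b
        z_scores[term] = (log_odds_a - log_odds_b) / math.sqrt(variance)
    return z_scores


# Toy, made-up term counts for two categories (illustration only).
positive = Counter({"great": 30, "movie": 50, "boring": 2, "the": 200})
negative = Counter({"great": 3, "movie": 48, "boring": 25, "the": 205})

z_scores = log_odds_z_scores(positive, negative)
# Keep terms beyond the 1.96 cutoff in either direction (assumed two-sided).
significant = {t: z for t, z in z_scores.items() if abs(z) > 1.96}
```

With these toy counts, only the strongly skewed terms pass the cutoff, while frequent but evenly distributed terms such as "the" do not.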
11. Michael Colaresi and Zuhaib Mahmood. Do the robot: Lessons from machine learning to improve conflict forecasting. Journal of Peace Research. 2017.
R package available: https://github.com/zsmahmood89/ModelCriticism
12. In defense of stop words
Cindy K. Chung and James W. Pennebaker. Counting Little Words in Big Data: The Psychology of Communities, Culture, and History. EASP. 2012
In times of shared crisis, “we” use increases, while “I” use decreases.
I/we: age, social integration
I: lying, social position relative to hearer, testosterone
13. Stop words in context: gender
Newman, ML; Groom, CJ; Handelman LD, Pennebaker, JW. Gender Differences in Language Use: An Analysis of 14,000 Text Samples. 2008
Bold: entirely stopwords

LIWC Dimension                        Effect Size (Cohen’s d)
                                      (>0 F, <0 M) MANOVA p<.001
All Pronouns                           0.36
Present tense verbs (walk, is, be)     0.18
Feeling (touch, hold, feel)            0.17
Certaintyns (always, never)            0.14
Word count                             NS
Numbers                               -0.15
Prepositions                          -0.17
Words >6 letters                      -0.24
Swear words                           -0.22
Articles                              -0.24

• Performed on a variety of language categories, including speech.
• Other studies have found that function words are the best predictors of gender.
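For readers unfamiliar with the effect size used in the table above, here is a minimal sketch of Cohen's d using the pooled standard deviation. The per-document rates are hypothetical numbers invented for illustration and are not data from Newman et al.

```python
import statistics


def cohens_d(group_1, group_2):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    n1, n2 = len(group_1), len(group_2)
    var1 = statistics.variance(group_1)  # sample variance (n - 1 denominator)
    var2 = statistics.variance(group_2)
    pooled_sd = (((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)) ** 0.5
    return (statistics.mean(group_1) - statistics.mean(group_2)) / pooled_sd


# Hypothetical pronoun rates (percent of words) per text sample.
female_rates = [14.0, 15.5, 13.2, 16.1, 14.8]
male_rates = [13.1, 14.2, 12.9, 15.0, 13.6]

# Positive d means the first group (here, female authors) uses the category more.
d = cohens_d(female_rates, male_rates)
```

The sign convention matches the table: values above zero indicate more use by the first group, values below zero more use by the second.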
14. Lexicon Mining for Semiotic Squares: Exploding Binary Classification
15. Original Semiotic Square paper:
Greimas, A.J. and Francis Rastier. 1968. “The Interaction of Semiotic Constraints,” Yale French Studies. 41: 86-105.
Pelkey, Jamin. 2017. Greimas embodied: How kinesthetic opposition grounds the semiotic square. Semiotica 214(1). 277–305
17. Language about movies
Positive: positive sentiment
Negative: negative sentiment
Opposition relation: these are treated as contrastive concepts
18. Language about movies
Positive: positive sentiment
Negative: negative sentiment
¬Negative: language that’s not negative
Entailment relation: ¬Negative includes positive sentiment
19. Language about movies
Positive: positive sentiment
Negative: negative sentiment
¬Negative: language that’s not negative
Contradiction relation: Negative and ¬Negative are mutually exclusive*
*Exceptions exist, e.g., “damning with faint praise”
20. Language about movies
Positive: positive sentiment
Negative: negative sentiment
¬Negative: language that’s not negative
¬Positive: language that’s not positive
Complete the square
21. Language about movies
Positive: positive sentiment
Negative: negative sentiment
¬Negative: language that’s not negative
¬Positive: language that’s not positive
Neutral term: Objective (plot descriptions)
22. Language about movies
Positive: positive sentiment
Negative: negative sentiment
¬Negative: language that’s not negative
¬Positive: language that’s not positive
Neutral term: Objective (plot descriptions)
Complex term: Evaluative (reviews)
23. Language about movies
Positive: positive sentiment
Negative: negative sentiment
¬Negative: language that’s not negative
¬Positive: language that’s not positive
Neutral term: Objective (plot descriptions)
Complex term: Evaluative (reviews)
Positive deixis: aspects of well-reviewed movies
24. Language about movies
Positive: positive sentiment
Negative: negative sentiment
¬Negative: language that’s not negative
¬Positive: language that’s not positive
Neutral term: Objective (plot descriptions)
Complex term: Evaluative (reviews)
Positive deixis: aspects of well-reviewed movies
Negative deixis: aspects of poorly-reviewed movies
25. Language about movies: the semiotic square
Positive: positive sentiment
Negative: negative sentiment
¬Negative: language that’s not negative
¬Positive: language that’s not positive
Objective: plot descriptions
Evaluative: reviews
Aspects of well-reviewed movies
Aspects of poorly-reviewed movies
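As a rough illustration, the square built up over the preceding slides can be written as a small data structure. The class and field names below are hypothetical and are not part of Scattertext. Only the four terms and the four combined positions shown on the slides are modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SemioticSquare:
    """Hypothetical container for the positions of a semiotic square."""
    term_a: str  # e.g., "Positive"
    term_b: str  # e.g., "Negative"

    def terms(self):
        # The two opposed terms plus their contradictories.
        return {
            "a": self.term_a,
            "b": self.term_b,
            "not_a": "¬" + self.term_a,
            "not_b": "¬" + self.term_b,
        }

    def metaterms(self):
        # Each combined position pairs two of the four terms.
        t = self.terms()
        return {
            "complex_term": (t["a"], t["b"]),        # e.g., evaluative language
            "neutral_term": (t["not_a"], t["not_b"]),  # e.g., objective language
            "positive_deixis": (t["a"], t["not_b"]),
            "negative_deixis": (t["b"], t["not_a"]),
        }


square = SemioticSquare("Positive", "Negative")
positions = square.metaterms()
```

In the movie example, `positions["neutral_term"]` corresponds to plot descriptions and `positions["complex_term"]` to reviews in general.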
26. Lexicalizing the Semiotic Square
• Task: find language associated with each element of the square
• Domain of discourse: corpus of documents pertaining to domain (e.g., movie-related text)
• Corpus is divided into three categories
[Diagram: the domain of discourse contains the square, with Pos and Neg on top, ¬Neg and ¬Pos below, and the labels Neutral Term, Complex Term, Positive Deixis, and Negative Deixis]
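A minimal sketch of the setup above follows. It assumes the three categories are positive reviews, negative reviews, and plot descriptions, and it places each term on two axes: positive vs. negative, and review vs. plot. The smoothed log frequency ratio is a stand-in scorer chosen for brevity, not necessarily the scoring function used in Scattertext, and the tiny corpus is made up.

```python
import math
from collections import Counter


def term_counts(docs):
    """Count whitespace-delimited, lowercased tokens across documents."""
    return Counter(tok for doc in docs for tok in doc.lower().split())


def log_ratio(counts_1, counts_2, vocab, smoothing=1.0):
    """Smoothed log ratio of relative frequencies (stand-in association score)."""
    n1 = sum(counts_1.values()) + smoothing * len(vocab)
    n2 = sum(counts_2.values()) + smoothing * len(vocab)
    return {
        t: math.log((counts_1[t] + smoothing) / n1)
        - math.log((counts_2[t] + smoothing) / n2)
        for t in vocab
    }


# Made-up three-category corpus (illustration only).
corpus = {
    "positive": ["a great and moving film", "great acting and a moving story"],
    "negative": ["a boring and dull film", "dull acting and a boring story"],
    "plot": ["a detective hunts a killer", "a killer stalks a small town"],
}

pos = term_counts(corpus["positive"])
neg = term_counts(corpus["negative"])
plot = term_counts(corpus["plot"])
review = pos + neg  # both review categories pooled

vocab = sorted(set(review) | set(plot))
x = log_ratio(pos, neg, vocab)      # > 0: more positive, < 0: more negative
y = log_ratio(review, plot, vocab)  # > 0: review language, < 0: plot language
coords = {t: (x[t], y[t]) for t in vocab}
```

Each element of the square can then be read off a region of this plane. For example, terms with high `x` and high `y` are candidates for the Positive corner.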
32. Positive + Negative complex term:
- Terms near the blue point (by Euclidean distance)
- Limit to Quadrant II and Quadrant I
- Captures terms which are associated with reviews but not highly polar.
[Scatterplot axis labels: NEGATIVE, POSITIVE, PLOT, REVIEW]
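The selection rule above can be sketched as a nearest-neighbor search restricted to the upper half-plane. The term coordinates and the position of the reference ("blue") point are hypothetical. The slide does not give the point's location, so placing it at the top center of the plot is an assumption, as is treating positive `y` as the review side.

```python
import math


def nearest_terms(coords, anchor, n=3, keep=lambda x, y: True):
    """Return the n terms closest to anchor (Euclidean), among those kept."""
    candidates = [
        (math.dist(xy, anchor), term)
        for term, xy in coords.items()
        if keep(*xy)
    ]
    return [term for _, term in sorted(candidates)[:n]]


# Hypothetical coordinates: x = negative-to-positive, y = plot-to-review.
coords = {
    "film": (0.1, 0.9),
    "acting": (-0.1, 0.8),
    "great": (0.9, 0.7),
    "boring": (-0.9, 0.6),
    "killer": (0.0, -0.8),
    "town": (0.2, -0.9),
}

# Assumed reference point: non-polar (x = 0) and strongly review-like (y = 1).
anchor = (0.0, 1.0)

# Quadrants I and II are the points with y > 0.
complex_terms = nearest_terms(coords, anchor, n=2, keep=lambda x, y: y > 0)
```

With these toy coordinates the result is `["film", "acting"]`: review vocabulary that leans neither positive nor negative.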
34. "The square is a map of logical possibilities. As such, it can be used as a heuristic device, and in fact, attempting to fill it in stimulates the imagination… the theory of the square allows us to see all thinking as a game, with the logical relations as the rules and concepts current in a given language and culture as the pieces."
Lithuanian stamp issued honoring the 100th birthday of Greimas.
36. In defense of stop words
Function words reveal psychological traits. Person A is tentative, B is stiff, C is easygoing.
Cindy K. Chung and James W. Pennebaker. The Psychological Functions of Function Words. Social Communication. 2007.
37. James Clifford. The Predicament of Culture. The Predicament of Culture: Twentieth-Century Ethnography, Literature, and Art. Harvard Univ. Press. 1988.
38. Language about movies
Positive, Negative
¬Negative, ¬Positive
Objective
Evaluative
Aspects of well-reviewed movies
Aspects of poorly-reviewed movies
Lexicalizing the Semiotic Square
39.
40. Lexicalizing the Semiotic Square
[Diagram: Pos and Neg on top, ¬Neg and ¬Pos below, with the labels Neutral Term, Complex Term, Positive Deixis, and Negative Deixis]
Positive documents (e.g., positive reviews)
Negative documents (e.g., negative reviews)
Lexicons: compare positive and negative corpora
41. Vitaliy Kaurov. Finding X in Espresso: Adventures in Computational Lexicology. Wolfram Blog. 2017.
42. Jason Kessler. Using Scattertext and the Python NLP Ecosystem for Text Visualization. PyData. July 2017.
43. Josh Katz, Claire Cain Miller and Kathleen A. Flynn. The Words Men and Women Use When They Write About Love. The Upshot. The New York Times. Nov 2017.