Positive words carry less information than negative words

Sasahara Lab's Journal club 2014/4/23


Positive words carry less information than negative words Presentation Transcript

  • 1. Positive words carry less information than negative words. D. Garcia, A. Garas and F. Schweitzer, EPJ Data Science, 2012. JClub 2014.4.23, by Kazutoshi Sasahara
  • 2. Introduction
      • Is human language biased towards positive emotion, or is it neutral?
      • Statistical properties of word frequency and length:
          • Word frequency ∝ (word rank)^-1 (Zipf 1949)
          • Word frequency predicts word length, as a result of a principle of least effort
          • Word length increases with information content for efficient communication (Piantadosi et al. 2011)
  • 3. Introduction (cont.)
      • Pollyanna hypothesis (Boucher & Osgood 1969): a universal human tendency to use evaluatively positive words more frequently than evaluatively negative words in communicating.
      • Previous studies reported an emotional bias, but without adequate controls
          • Problems in the use of Amazon Mechanical Turk
          • Possible biases
              • Acquiescence bias
              • Social desirability bias
              • Framing effects
  • 4. Data Analysis
      • This paper examined emotional bias in three major languages on the Internet.
          • English (56.6%), German (6.5%), Spanish (4.6%)
      • Data
          • Established lexica of affective word usage: English: 1,034 words, German: 2,902, Spanish: 1,034
          • Google N-gram dataset: 10^12 tokens
      • Valence (v): the degree of pleasure induced by the affective word usage, rescaled between -1 and 1.
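The rescaling of valence to the range -1 to 1 on slide 4 can be sketched as a linear map. The slide does not state the original rating scale of the lexica, so the 1 to 9 range and the sample ratings below are assumptions made only for illustration.

```python
def rescale_valence(rating, low=1.0, high=9.0):
    """Linearly map a rating in [low, high] to a valence in [-1, 1].

    The 1-9 default range is an assumption for illustration; the slide only
    says that valence is rescaled between -1 and 1.
    """
    if not low <= rating <= high:
        raise ValueError("rating outside the assumed scale")
    midpoint = (low + high) / 2.0
    half_range = (high - low) / 2.0
    return (rating - midpoint) / half_range


# Hypothetical ratings, not taken from the lexica used in the paper.
ratings = {"happy": 8.0, "table": 5.0, "murder": 1.5}
valence = {word: rescale_valence(r) for word, r in ratings.items()}
```

With this map, the midpoint of the assumed scale becomes v = 0, so the sign of v separates positive from negative words.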
  • 5. Results: Frequency of emotional words
      • Positive words predominate on the Internet.
      • Figure 1 (Garcia et al. EPJ Data Science 2012, 1:3, www.epjdatascience.com/content/1/1/3): Emotion word clouds with frequencies calculated from Google's crawl. In each word cloud for English (left), German (middle), and Spanish (right), the size of a word is proportional to its frequency of appearance in the trillion-token Google N-gram dataset [26]. Word colors are chosen from red (negative) to green (positive) in the valence range from psychology studies [7–9]. For the three languages, positive words predominate on the Internet.
      • Figure annotations: v = -1: red, v = +1: green; "Exception"
  • 6. Results: Distribution of emotional words
      • The median shifts significantly towards positive values (around 0.3).
      • 95% confidence intervals (Wilcoxon tests):
          • English: 0.257 ± 0.032
          • German: 0.167 ± 0.017
          • Spanish: 0.287 ± 0.035
      • Empirical evidence of positive bias
      • Figure panels: No control, Control
  • 7. Results: Relation between information and valence (1)
      • Information content is measured by self-information, $I(w) = -\log_2 P(w)$, which provides more knowledge about valence than frequency does.
      • Negative correlation between v and I: positive words carry less information than negative words.
      • The magnitude of the correlation coefficient becomes smaller for larger contexts (N).
      • Table 1: Correlations between word valence and information measurements.

                      English    German     Spanish
          ρ(v,f)       0.222**    0.144**    0.236**
          ρ(v,I)      -0.368**   -0.325**   -0.402**
          ρ(v,I′)     -0.294**   -0.222**   -0.311**
          ρ(v,I2)     -0.332**   -0.301**   -0.359**
          ρ(v,I3)     -0.313**   -0.201**   -0.340**
          ρ(v,I4)     -0.254**   -0.049*    -0.162**

        Correlation coefficients of the valence (v), frequency f, self-information I, and information content measured for 2-grams I2, 3-grams I3, and 4-grams I4, and with self-information I′ measured from the frequencies reported in [42–44]. Significance levels: *p < 0.01, **p < 0.001.
      • Annotation on the table: ← Control analysis
      • Excerpt from the paper (cut off in the slide): "[…] ones, but this nonlinear mapping between frequency and self-information makes the latter more closely related to word valence than the former. The first two lines of Table 1 show the Pearson's correlation coefficient of word valence and frequency ρ(v,f), followed by the correlation coefficient between word valence and self-information, ρ(v,I). For all three languages, the absolute value of the correlation coefficient with I is larger than with f, showing that self-information provides more knowledge about word valence than plain […]"
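As a minimal sketch of the two quantities on slide 7, the code below computes self-information I(w) = -log2 P(w) from word counts and the Pearson correlation ρ(v,I). The word list, valences, and counts are hypothetical toy values, not data from the paper, and estimating P(w) as count divided by total is an assumption.

```python
import math


def self_information(count, total):
    """Return I(w) = -log2 P(w), with P(w) estimated as count / total."""
    return -math.log2(count / total)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy data: word -> (valence, corpus count). Not from the paper.
toy = {"good": (0.8, 4000), "home": (0.5, 2000), "table": (0.0, 1000),
       "sad": (-0.6, 500), "murder": (-0.9, 250)}
total = sum(count for _, count in toy.values())
valences = [v for v, _ in toy.values()]
info = [self_information(c, total) for _, c in toy.values()]
rho_v_I = pearson(valences, info)  # negative for this toy data
```

Because the logarithm compresses large frequency differences, ρ(v,I) and ρ(v,f) generally differ in magnitude, which is the comparison the first two rows of Table 1 make.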
  • 8. Results: Relation between information and valence (2)
      • For all languages and context sizes, valence decreases with information content.
      • Figure, left panel: color shows valence (v), size shows self-information (I).
      • Figure, right panel: average self-information, $-\frac{1}{N} \sum_{i=1}^{N} \log_2 P(W = w \mid C = c_i)$
  • 9. Results: Additional analysis of valence, length, and self-info (1)
      • The sign of valence matters: ρ(abs(v),I) is small, and not significant for English.
      • Word length increases with I(w): ρ(l,I) is positive.
      • Valence and word length are at most weakly related: ρ(v,l) is very low or not significant.
      • Combined influence of valence and length on I(w): partial correlations ρ(v,I|l) and ρ(l,I|v)
      • There is an additional dimension in the communication process, related to emotional content (v) rather than communication efficiency (l).
      • Table 2: Additional correlations between valence, self-information and length.

                         English     German      Spanish
          ρ(abs(v),I)     0.032 ◦     0.109***    0.135***
          ρ(l,I)          0.378***    0.143***    0.361***
          ρ(v,l)         -0.044 ◦    -0.071***   -0.112***
          ρ(v,I|l)       -0.379***   -0.319***   -0.399***
          ρ(l,I|v)        0.389***    0.126***    0.357***

        Correlation coefficients of the valence (v), absolute value of the valence (abs(v)), and word length (l) versus self-information (I). Partial correlations are calculated for both variables (ρ(v,I|l), ρ(l,I|v)), and correlation between valence and length (ρ(v,l)). Significance levels: ◦ p < 0.3, *p < 0.1, **p < 0.01, ***p < 0.001.
      • Excerpt from the paper, "Additional analysis of valence, length and self-information": "In order to provide additional support for our results, we tested different hypotheses impacting the relation between word usage and valence. First, we calculated Pearson's and Spearman's correlation coefficients between the absolute value of the valence and the self-information of a word, ρ(abs(v),I) (see Table 2). We found both correlation coefficients to be around […] for German and Spanish, while they are not significant for English. The dependence between valence and self-information disappears if we ignore the sign of the valence, which means, indeed, that the usage frequency of a word is not just related to the overall emotional intensity, but to the positive or negative emotion expressed by the word. Subsequently, we found that the correlation coefficient between word length and self-information (ρ(l,I)) is positive, showing that word length increases with self-information. These values of ρ(l,I) are consistent with previous results […]."
      • Excerpt from the paper (left margin cut off in the slide): "[…] the known result that word lengths adapt to information […] independent semantic feature of valence. Valence is also […] but not to the symbolic representation of the word through […] influence of context by controlling for word frequency. In Table 3 […] correlation coefficients of valence with information content for […] controlling for self-information. We find that most of the […] of negative sign, with the exception of I2 for English. The […] is probably related to two-word constructions such […]"
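To illustrate the partial correlations ρ(v,I|l) and ρ(l,I|v) on slide 9, the sketch below applies the standard first-order partial correlation formula to the English values from Tables 1 and 2. Whether the paper computed its partial correlations with exactly this formula is an assumption; the function name is hypothetical.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# English column of Tables 1 and 2 on the slides (rounded to three decimals).
r_vI = -0.368   # rho(v, I)
r_lI = 0.378    # rho(l, I)
r_vl = -0.044   # rho(v, l)

# rho(v, I | l): valence vs. self-information, controlling for length.
r_vI_given_l = partial_correlation(r_vI, r_vl, r_lI)
# rho(l, I | v): length vs. self-information, controlling for valence.
r_lI_given_v = partial_correlation(r_lI, r_vl, r_vI)
```

From these rounded inputs the results land within rounding distance of the English entries for ρ(v,I|l) and ρ(l,I|v) in Table 2. Because ρ(v,l) is close to zero, controlling for one variable barely changes the correlation of the other with I.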
  • 10. Results: Additional analysis of valence, length, and self-info (2)
      • Most of the correlations remain significant and negative, except I2 for English.
      • Knowing the possible contexts of a word (N = 2, 3, 4) provides more information about word valence than self-information alone.
      • Table 3: Partial correlation coefficients between valence and information content.

                       English     German      Spanish
          ρ(v,I2|I)    -0.034 ◦    -0.100***   -0.058*
          ρ(v,I3|I)    -0.101**    -0.070***   -0.149***
          ρ(v,I4|I)    -0.134***   -0.020*     -0.084**

        Correlation coefficients of the valence (v) and information content measured on different context sizes (I2, I3, I4) controlling for self-information (I). Significance levels: ◦ p < 0.3, *p < 0.1, **p < 0.01, ***p < 0.001.
      • Excerpt from the paper (cut off in the slide): "[…] correlation coefficients between valence and length ρ(v,l) are very low or not significant. In order to test the combined influence of valence and length to self-information, we calculated the partial correlation coefficients ρ(v,I|l) and ρ(l,I|v). The results are shown in Table 2, and are within the […]% confidence intervals of the original correlation coefficients ρ(v,I) and ρ(l,I). This provides support for the existence of an additional dimension in the communication process closely related to emotional content rather than communication efficiency. This is consistent with the known result that word lengths adapt to information […]"
  • 11. Summary
      • Empirical evidence for a positive bias in language
          • Positive words are used more frequently.
          • Pollyanna hypothesis
          • Facilitation of social links
      • Negative words convey more information content than positive words.
      • Word frequency is determined by
          • not only word length and information content
          • but also emotional content