2. My Assumption
If his tweet is emotionally intense, either positive or
negative, it will get more retweets.
3. Data
7374 tweets from July 16, 2015 to November 11, 2016
https://www.kaggle.com/kingburrito666/better-donald-trump-tweets
4. Data
I scored each tweet using the Python package “textblob” and named
the column “Polarity”.
Polarity is scaled from -10 to 10.
“MAKE AMERICA GREAT AGAIN!” = 10
“Wow, Twitter, Google and Facebook are burying the FBI
criminal investigation of Clinton. Very dishonest media!”
= -2.625
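The scoring step can be sketched as follows. This is a minimal sketch, assuming the -10 to 10 scale was obtained by multiplying TextBlob's polarity (which lies in [-1, 1]) by 10. The helper names `scale_polarity` and `score_tweet` are hypothetical; the code actually used is not shown on the slides.

```python
def scale_polarity(raw_polarity, factor=10.0):
    """Map a raw polarity in [-1, 1] onto [-10, 10] (the factor of 10 is an assumption)."""
    if not -1.0 <= raw_polarity <= 1.0:
        raise ValueError("raw polarity must lie in [-1, 1]")
    return raw_polarity * factor


def score_tweet(text, factor=10.0):
    """Score one tweet with TextBlob and rescale it (needs `pip install textblob`)."""
    # Imported lazily so scale_polarity() works even without TextBlob installed.
    from textblob import TextBlob
    return scale_polarity(TextBlob(text).sentiment.polarity, factor)
```

Under this assumption, a raw TextBlob polarity of -0.2625 corresponds to -2.625 on the slide's scale.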
5. Hypothesis and model
“More Emotional, either positive or negative, more Retweets”
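$\text{Retweets} = \beta_0 + \beta_1\,\text{Polarity} + \beta_2\,\text{Polarity}^2 + u$
The fitted curve should be a quadratic that is concave up, with a positive minimum:
$\beta_2 > 0, \quad \beta_0 - \dfrac{\beta_1^2}{4\beta_2} > 0$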
Explanatory variables are statistically significant
Satisfies the MLR (multiple linear regression) assumptions
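The conditions on the coefficients can be read off by completing the square. The derivation below uses only the quadratic model from this slide, writing x for Polarity; the symbol x* for the minimizing polarity is introduced here for illustration.

```latex
\begin{aligned}
\beta_0 + \beta_1 x + \beta_2 x^2
  &= \beta_2\left(x + \frac{\beta_1}{2\beta_2}\right)^2
     + \beta_0 - \frac{\beta_1^2}{4\beta_2}, \\
x^* &= -\frac{\beta_1}{2\beta_2}, \qquad
\text{minimum value} = \beta_0 - \frac{\beta_1^2}{4\beta_2}
\quad (\text{when } \beta_2 > 0).
\end{aligned}
```

With β2 > 0 the parabola opens upward, so retweets rise as polarity moves away from x* in either direction. A positive minimum value presumably serves to keep predicted retweets above zero at every polarity.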
6. OLS model
Successfully acquired the data and ran the regression in R
$\text{Retweets} = 4366.961 - 151.191\,\text{Polarity} + 10.627\,\text{Polarity}^2$
Global minimum: (Polarity, Retweets) = (7.114, 3829.21)
VIF= 1.524855
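The reported minimum and VIF can be checked with a few lines of arithmetic. This is a minimal sketch: the function names are hypothetical, and the VIF inversion assumes the usual definition VIF = 1 / (1 - R^2), where R^2 comes from regressing one explanatory variable on the other.

```python
def quadratic_minimum(b0, b1, b2):
    """Return (x*, y*) for y = b0 + b1*x + b2*x**2; a minimum exists only when b2 > 0."""
    if b2 <= 0:
        raise ValueError("the parabola has a minimum only when b2 > 0")
    x_star = -b1 / (2.0 * b2)
    y_star = b0 - b1 ** 2 / (4.0 * b2)
    return x_star, y_star


def r_squared_from_vif(vif):
    """Invert VIF = 1 / (1 - R^2) to recover the R^2 between the regressors."""
    return 1.0 - 1.0 / vif


# Coefficients and VIF as reported on the slide.
x_min, y_min = quadratic_minimum(4366.961, -151.191, 10.627)
r2 = r_squared_from_vif(1.524855)
print(f"minimum at Polarity = {x_min:.3f}, Retweets = {y_min:.2f}")
print(f"implied R^2 between Polarity and Polarity^2 = {r2:.4f}")
```

The first line of output reproduces the slide's global minimum of (7.114, 3829.21).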
7. Breusch-Pagan Test
Is this model homoscedastic?
$u^2 = \delta_0 + \delta_1\,\text{Polarity} + \delta_2\,\text{Polarity}^2 + \text{error}$
$H_0: \delta_1 = \delta_2 = 0$
$H_1$: $H_0$ is not correct
Test statistic: 5.093 with 7372 degrees of freedom;
the p-value is 0.6158%.
The p-value is below 1%, so H0 is rejected: the model is heteroscedastic.
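The test can be sketched in plain Python as the F form of the Breusch-Pagan test: regress the squared OLS residuals on Polarity and Polarity^2, then test the two slopes jointly. This is a minimal sketch on synthetic data, not the tweet data, and all function names are hypothetical. The last line assumes that the slide's 5.093 is an F statistic with 2 numerator degrees of freedom and the reported 7372 denominator degrees of freedom.

```python
import random


def ols(X, y):
    """OLS via the normal equations; returns (coefficients, residuals)."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    # Gaussian elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda row: abs(a[row][c]))
        a[c], a[p] = a[p], a[c]
        for row in range(c + 1, k):
            f = a[row][c] / a[c][c]
            for j in range(c, k + 1):
                a[row][j] -= f * a[c][j]
    # Back substitution.
    beta = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(a[i][j] * beta[j] for j in range(i + 1, k))
        beta[i] = (a[i][k] - tail) / a[i][i]
    resid = [yi - sum(b * x for b, x in zip(beta, r)) for r, yi in zip(X, y)]
    return beta, resid


def f_pvalue_2(f_stat, df_denom):
    """Upper-tail p-value of F(2, df_denom); closed form valid for 2 numerator df."""
    return (1.0 + 2.0 * f_stat / df_denom) ** (-df_denom / 2.0)


def breusch_pagan_f(x, y):
    """F form of the Breusch-Pagan test for y = b0 + b1*x + b2*x^2 + u."""
    X = [[1.0, xi, xi * xi] for xi in x]
    _, u = ols(X, y)
    u2 = [e * e for e in u]
    _, v = ols(X, u2)  # auxiliary regression of u^2 on x and x^2
    mean_u2 = sum(u2) / len(u2)
    r2 = 1.0 - sum(e * e for e in v) / sum((w - mean_u2) ** 2 for w in u2)
    df = len(x) - 3
    f_stat = (r2 / 2.0) / ((1.0 - r2) / df)
    return f_stat, f_pvalue_2(f_stat, df)


# Synthetic data: the error spread grows with |x|, so H0 should be rejected.
rng = random.Random(0)
xs = [rng.uniform(-10, 10) for _ in range(2000)]
ys = [4000 - 150 * x + 10 * x * x + (1 + abs(x)) * rng.gauss(0, 1) for x in xs]
f_stat, p_value = breusch_pagan_f(xs, ys)

# Slide figures, under the F(2, 7372) assumption stated above.
slide_p = f_pvalue_2(5.093, 7372)
```

Under that assumption `slide_p` comes out at about 0.62%, in line with the reported 0.6158%. A chi-square (LM) reading of 5.093 with 2 degrees of freedom would give a much larger p-value, which is why the F form is assumed here.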
8. Robust standard error
                        Polarity     Polarity^2
Robust standard error   25.914264    6.426094
t value                 -5.834277    1.653726
Polarity^2 is statistically insignificant at the 5% level (|t| = 1.65 < 1.96)
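The t values are the OLS coefficients divided by their robust standard errors, which can be verified directly. This is a minimal sketch: `robust_t_test` is a hypothetical helper, and the p-values use the large-sample normal approximation, which is an assumption (the slides do not say which reference distribution was used).

```python
from statistics import NormalDist


def robust_t_test(coef, robust_se, alpha=0.05):
    """Return (t, two-sided p-value, significant?) using the normal approximation."""
    t = coef / robust_se
    p = 2.0 * (1.0 - NormalDist().cdf(abs(t)))
    return t, p, p < alpha


# Coefficients from the OLS slide, robust standard errors from this slide.
t1, p1, sig1 = robust_t_test(-151.191, 25.914264)
t2, p2, sig2 = robust_t_test(10.627, 6.426094)
print(f"Polarity:   t = {t1:.3f}, significant at 5%: {sig1}")
print(f"Polarity^2: t = {t2:.3f}, significant at 5%: {sig2}")
```

The linear term stays significant, while the quadratic term does not reach the 5% threshold once robust standard errors are used.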