This document analyzes social media data from Twitter and other sources over a 3-day period. It finds that Twitter accounts for 88% of the 5000 total articles and 100% of updates, followers, and followings. Chi-square and correlation tests show a relationship between followers and followings on Twitter, though the correlation is low. Sentiment analysis finds nearly all articles have neutral sentiment. The analysis provides insights into social media usage and interactions across different platforms.
2. DATA ANALYSIS
The given social media data set contains 5000 articles posted by 2272 unique users over a period of 3
days across 7 media providers. The media providers are-
• Aggregator
• Buy/Sell
• Forum Posts
• Forum Replies
• Generic Blogs
• Mainstream Media
• Twitter
MEDIA_PROVIDER
                          Frequency   Percent   Valid Percent   Cumulative Percent
Valid   Aggregator               57       1.1             1.1                  1.1
        Buy/Sell                  4       0.1             0.1                  1.2
        Forum Posts              70       1.4             1.4                  2.6
        Forum Replies           345       6.9             6.9                  9.5
        Generic Blogs            22       0.4             0.4                 10.0
        Mainstream Media        101       2.0             2.0                 12.0
        TWITTER                4401      88.0            88.0                100.0
        Total                  5000     100.0           100.0
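The percent and cumulative percent columns follow directly from the frequencies. A minimal Python sketch of that calculation, using the counts from the table; the function name `frequency_table` is illustrative, and the rounding of a statistical package may differ in edge cases:

```python
from itertools import accumulate

# Article counts per media provider, copied from the frequency table.
counts = {
    "Aggregator": 57,
    "Buy/Sell": 4,
    "Forum Posts": 70,
    "Forum Replies": 345,
    "Generic Blogs": 22,
    "Mainstream Media": 101,
    "TWITTER": 4401,
}

def frequency_table(counts):
    """Return rows of (label, frequency, percent, cumulative percent)."""
    total = sum(counts.values())
    running = accumulate(counts.values())  # running sum of frequencies
    return [
        (label, n, round(100 * n / total, 1), round(100 * cum / total, 1))
        for (label, n), cum in zip(counts.items(), running)
    ]

for label, n, pct, cum in frequency_table(counts):
    print(f"{label:<18}{n:>6}{pct:>8.1f}{cum:>8.1f}")
```

The cumulative column is computed from the running sum of frequencies rather than by adding rounded percentages, which avoids accumulating rounding error.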
3. Analysis of Updates and Sentiments
Row Labels Sum of UPDATES
Aggregator 0
Buy/Sell 0
Forum Posts 0
Forum Replies 0
Generic Blogs 0
Mainstream Media 0
TWITTER 160000982
Grand Total 160000982
Descriptive Statistics
                        N   Minimum   Maximum       Mean   Std. Deviation
UPDATES              5000         0   1033694   32000.20        88128.117
Valid N (listwise)   5000
Interpretations-
• Only Twitter articles have received updates.
• The mean number of updates per article is approximately 32000, and the maximum number of updates
received by a single article is 1033694.
• The article with the maximum number of updates has the content “Samsung, Alibaba to team up on mobile
payment systems: Reports: Samsung Electronics and… https://t.co/xeKWYboocy SPS®”
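As a quick consistency check, the reported mean can be recovered from the grand total of updates and the number of articles. The sketch below uses only figures from the tables above; the variable names are illustrative.

```python
# Figures copied from the tables above.
total_updates = 160000982   # grand total of UPDATES (all from Twitter)
n_articles = 5000           # N in the descriptive statistics

# Mean updates per article = total updates / number of articles.
mean_updates = total_updates / n_articles
print(round(mean_updates, 2))
```

The result agrees with the reported mean of 32000.20 to two decimal places.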
4. BLOG_POST_SENTIMENT
                   Frequency   Percent   Valid Percent   Cumulative Percent
Valid   Neutral         4999     100.0           100.0                100.0
        Positive           1       0.0             0.0                100.0
        Total           5000     100.0           100.0
Media Providers     Sentiments            Updates
Aggregator          Neutral                     0
Buy/Sell            Neutral                     0
Forum Posts         Neutral                     0
Forum Replies       Neutral                     0
Generic Blogs       Neutral                     0
Mainstream Media    Neutral, Positive           0
Twitter             Neutral             160000982
Grand Total                             160000982
Interpretations-
• The articles carry two types of sentiment: positive and neutral.
• Out of 5000 articles, 4999 have neutral sentiment and 1 has positive sentiment.
• The article with positive sentiment has the content “integrate it in future gadgets. NFC has
featured prominently in some smartphones, and popular services such as Apple Pay, Samsung Pay and
Google Wallet -- helping the technology gain traction among retailers. Fitbit rival Jawbone has already
tied up with American Express Co to let users pay through its UP4...”
5. Analysis of Media Providers used by Unique Users
MEDIA_PROVIDER
                          Frequency   Percent   Valid Percent   Cumulative Percent
Valid   Aggregator               12       0.5             0.5                  0.5
        Buy/Sell                  1       0.0             0.0                  0.6
        Forum Posts              57       2.5             2.5                  3.1
        Forum Replies           315      13.9            13.9                 16.9
        Generic Blogs            15       0.7             0.7                 17.6
        Mainstream Media          4       0.2             0.2                 17.8
        TWITTER                1868      82.2            82.2                100.0
        Total                  2272     100.0           100.0
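[Figure: pie chart of unique users by media provider (Aggregator, Buy/Sell, Forum Posts, Forum Replies, Generic Blogs, Mainstream Media, TWITTER). TWITTER accounts for 82% of unique users and Forum Replies for 14%.]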
6. Analysis of Following
Row Labels Sum of FOLLOWING
Aggregator 0
Buy/Sell 0
Forum Posts 0
Forum Replies 0
Generic Blogs 0
Mainstream Media 0
TWITTER 1278320
Grand Total 1278320
Interpretations-
• Only Twitter users have followings.
• Out of 1868 unique Twitter users, 1789 have followings in the interval 0-2500 and 54 have
followings in the interval 2500-5000.
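Interval counts like these come from binning each user's following count. A minimal sketch of such binning, assuming half-open intervals of width 2500 (the source does not say how boundary values are treated) and a small made-up list of counts rather than the real data:

```python
from collections import Counter

def bin_counts(values, width):
    """Count values per half-open interval [k*width, (k+1)*width)."""
    bins = Counter((v // width) * width for v in values)
    return {(lo, lo + width): n for lo, n in sorted(bins.items())}

# Hypothetical following counts for six users (not from the data set).
following = [12, 480, 2499, 2500, 3100, 7200]

for (lo, hi), n in bin_counts(following, 2500).items():
    print(f"{lo}-{hi}: {n} users")
```

With half-open intervals a user with exactly 2500 followings falls into 2500-5000; a different boundary convention would shift such users into the lower bin.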
7. Analysis of Followers
Row Labels Sum of FOLLOWERS
Aggregator 0
Buy/Sell 0
Forum Posts 0
Forum Replies 0
Generic Blogs 0
Mainstream Media 0
TWITTER 12637437
Grand Total 12637437
Interpretations-
• Only Twitter users have followers.
• Out of 1868 unique Twitter users, 1829 have followers in the interval 0-20000.
8. The Chi Square Test
To test the independence of the followers and following variables-
Our hypotheses are-
H0 - The following and followers variables are independent of each other.
H1 - The following and followers variables are not independent of each other.
Test Statistic-
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
where
$\chi^2$ = Chi-square test statistic for independence
$O$ = Observed frequency for a combination of the two nominal variables
$E$ = Expected frequency for a combination of the two nominal variables
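The chi-square statistic for independence sums (observed - expected)^2 / expected over all cells of the cross-tabulation, where the expected count of a cell is its row total times its column total divided by the grand total. A minimal sketch on a small made-up 2x2 table, not the data set itself; the function name is illustrative, and the code assumes every row and column has a non-zero total:

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence: row total * column total / N.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    # Degrees of freedom: (rows - 1) * (columns - 1).
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

# Hypothetical 2x2 table of observed counts (not from the data set).
observed = [[30, 10],
            [20, 40]]
stat, df = chi_square_independence(observed)
print(round(stat, 3), df)
```

In this toy table every expected count is at least 20, so the usual chi-square approximation is reasonable; tables with many sparse cells need more care.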
10. TEST RESULT
Chi-Square Tests
                     Value            df       Asymptotic Significance (2-sided)
Pearson Chi-Square   939802.589(a)    718263   .000
N of Valid Cases     1868
a. 719960 cells (100.0%) have expected count less than 5. The minimum expected count is .00.
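Calculated chi-square = 939802.589
Tabulated chi-square = 720235.6
Calculated chi-square > Tabulated chi-square
Here, the calculated chi-square value is greater than the tabulated chi-square value at the 5% level of
significance.
Hence we reject the null hypothesis and accept the alternative hypothesis, which means the
followers and following variables are not independent of each other. Since every cell has an expected
count below 5 (note a of the test result table), the chi-square approximation is weak here and this
conclusion should be read with caution.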
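The comparison above can be reproduced approximately without statistical tables. The sketch below uses the Wilson-Hilferty approximation for the chi-square critical value, which is a substitute chosen for illustration: the source does not say how the tabulated value was obtained. The function name `chi2_critical` is hypothetical; the calculated value and degrees of freedom are taken from the test result.

```python
from statistics import NormalDist

def chi2_critical(df, alpha=0.05):
    """Approximate upper-tail chi-square critical value (Wilson-Hilferty)."""
    z = NormalDist().inv_cdf(1 - alpha)   # standard normal quantile
    c = 2 / (9 * df)
    return df * (1 - c + z * c ** 0.5) ** 3

calculated = 939802.589   # Pearson chi-square from the test result
df = 718263               # degrees of freedom from the test result

tabulated = chi2_critical(df)
print(round(tabulated, 1))
print("reject H0" if calculated > tabulated else "fail to reject H0")
```

For degrees of freedom this large the approximation lands very close to the tabulated value quoted above, and the decision is the same: the calculated value exceeds the critical value.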
11. Correlation Approach to find the association
Following Followers
Following 1 0.039
Followers 0.039 1
Interpretations-
• The Karl Pearson correlation coefficient between the followers and following variables for the given dataset is 0.039.
• Since the value is only slightly greater than zero, there is at most a very weak positive linear association
between these two variables.
** Here I have calculated the correlation by considering unique cases only, as there are many duplicate cases in these columns.
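A minimal sketch of the Pearson correlation on de-duplicated cases follows. It assumes that a "unique case" means a distinct (following, followers) pair, which the source does not spell out, and it uses made-up numbers rather than the real data; `pearson_r` is an illustrative helper.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical (following, followers) pairs, including duplicates.
pairs = [(100, 250), (100, 250), (300, 90), (1200, 400), (50, 3000), (300, 90)]

# Drop duplicate cases before correlating, keeping first-seen order.
unique_pairs = list(dict.fromkeys(pairs))
following, followers = zip(*unique_pairs)
print(round(pearson_r(following, followers), 3))
```

De-duplicating matters because a user who posts many articles would otherwise contribute the same pair many times and dominate the coefficient.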
12. Correlation among following, followers and updates
Following Followers Updates
Following 1
Followers 0.022 1
Updates 0.103 0.046 1
Interpretations of Correlation Coefficient-
• +1 indicates a perfect positive correlation between variables.
• 0 indicates no linear correlation between variables.
• -1 indicates a perfect negative correlation between variables.
** Here I have calculated the correlations by considering all the cases, as each article has a different number of updates.