Exploring Image Virality in Google Plus

What Makes an Image Viral?

•  Virality: the tendency of a piece of content either to spread quickly within a community or to receive a great deal of attention from it.

•  Our virality indicators:
   –  Plusones
   –  Replies
   –  Reshares

Image Characteristics: Brightness, Saturation, Orientation, Animation, …

Collection

•  Google+ API used to harvest the public posts of the 1000 most-followed users on Google+.

•  Time span of one year, from June 28th 2011 (Google+ launch date) to June 29th 2012.

•  Roughly 200K posts containing a picture as attachment.

Methodology

Virality indexes are studied with Complementary Cumulative Distribution Functions (CCDFs).

Very useful for comparing different image categories, e.g. color images vs. b/w.

Suppose we check the value of F(75), where 75 is the number of plusones, and find the values 0.30 and 0.15 for color and b/w images respectively. Then 30% of the color pictures posted on G+ received more than 75 plusones, while among those in b/w only 15% did. Put simply, color images have a virality index (on plusones) twice that of b/w images.

distribution thickening toward low virality score. In order
to evaluate the “virality power” of the features taken into
account, we compare the virality indexes in terms of empirical
Complementary Cumulative Distribution Functions (CCDFs).
These functions are commonly used to analyse online social
networks in terms of growth in size and activity (see for
example [14], [15], or the discussion presented in [17]) and
also for measuring content diffusion, e.g. the number of retweets
of a given content [16]. Basically these functions account for
the probability p that a virality index will be greater than n
and are defined as follows:

$$\hat{F}(n) = \frac{\text{number of posts with virality index} > n}{\text{total number of posts}} \qquad (1)$$
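The empirical CCDF defined above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `empirical_ccdf` and the plusone counts are made up for the example.

```python
from bisect import bisect_right


def empirical_ccdf(counts, n):
    """Fraction of posts whose virality index is strictly greater than n."""
    if not counts:
        raise ValueError("counts must be non-empty")
    ordered = sorted(counts)
    # bisect_right gives the number of values <= n; the rest are > n.
    greater = len(ordered) - bisect_right(ordered, n)
    return greater / len(ordered)


# Hypothetical plusone counts for two image categories (made-up numbers).
color_plusones = [3, 12, 80, 150, 40, 76, 9, 200, 75, 5]
bw_plusones = [1, 7, 90, 20, 75, 4, 30, 2, 11, 60]

f_color = empirical_ccdf(color_plusones, 75)
f_bw = empirical_ccdf(bw_plusones, 75)
print(f_color, f_bw)
```

Evaluating the function at the same threshold for two groups of posts gives the kind of pairwise comparison reported throughout the slides (for instance 0.30 vs. 0.17 at n = 75).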
4. It has been noted (see, for instance, http://on.wsj.com/zjRr06) that, especially in the time frame we consider, that is the first year of Google+, users’ activity did not increase much despite the exploding network size.

Text Only vs. Image

for text-only posts but we do not investigate this issue
here).
• Also, if we focus on simple appreciation (plusoners in Figure 5.a), the results are intriguing: up to about 75 plusoners the probability is higher for posts containing images, while above this threshold the situation reverses. This finding supports the hypothesis that images have a higher initial impact in the information flow, as argued with the aforementioned “rapid cognition” model, whereas above a certain threshold high-quality textual content plays a major role.
Fig. 5. Virality CCDFs for posts with image vs. text-only posts. Panels: (a) number of plusoners, (b) number of replies, (c) number of resharers; y-axis: $\hat{F}(n)$; legend: with image, without attachments.
B. Static vs. Animated
6. With respect to plusoners and replies, static images tend to show higher CCDFs (respectively about two and three times higher, $\hat{F}_{\text{plus}}(75)$ = 0.30 vs. 0.17, $\hat{F}_{\text{repl}}(50)$ = 0.22 vs. 0.08, K–S test p < 0.001), while on resharers the opposite holds.
The fact that $\hat{F}_{\text{resh}}(n)$ is about two times higher for posts containing animated images ($\hat{F}_{\text{resh}}(10)$ = 0.48 vs. 0.27, K–S test p < 0.001) can potentially be explained by the fact that animated images are usually built to convey a small “memetic” clip, i.e. funny, cute or quirky situations, as suggested in [24]. To verify this hypothesis we annotated a small random subsample of 200 images. 81% of these animated images were found to be “memetic” (two annotators were used; an image is a positive example if it scores 1 on at least one of the aforementioned dimensions; annotator agreement is very high, Cohen’s kappa 0.78). These findings indicate that animated images are mainly a vehicle for amusement, at least on Google+.
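The K–S tests quoted above compare two empirical distributions. A minimal sketch of the two-sample Kolmogorov–Smirnov statistic D, the largest gap between the two empirical CDFs, is given below. The function name `ks_statistic` and the sample values are assumptions made for the example; the sketch does not compute the p-value, for which a statistics library such as SciPy (`scipy.stats.ks_2samp`) would normally be used.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample K-S statistic: max |ECDF_a(x) - ECDF_b(x)| over all x."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a = sorted(sample_a)
    b = sorted(sample_b)
    d = 0.0
    # The supremum is attained at one of the observed values.
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical reshare counts for static and animated images (made-up numbers).
static_reshares = [0, 1, 2, 2, 5, 8, 11]
animated_reshares = [3, 9, 12, 15, 20, 31]
print(ks_statistic(static_reshares, animated_reshares))
```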
C. Image Orientation
We focused on the question of whether image orientation (landscape, portrait and squared) has any impact on virality indexes.
While orientation does not seem to have a strong impact on resharers, with a mild prevalence of horizontal pictures (see Figure 7.c), plusoners and replies discriminate well among the various image orientations. In particular, portrait images show a higher probability of being viral than squared images, which in turn show a higher probability than landscapes (see Figure 7.a and 7.b). Furthermore, CCDFs indicate that vertical images tend to be more viral than horizontal ones ($\hat{F}_{\text{plus}}(75)$ = 0.38 vs. 0.26, $\hat{F}_{\text{repl}}(50)$ = 0.38 vs. 0.17, K–S test p < 0.001). Hence, while squared images fall in the middle on every metric, landscape images have a lower viral probability for plusoners and replies but a slightly higher probability for reshares.
This can be partially explained by the fact that we are analyzing “celebrities” posts. If a vertically oriented image contains the portrait of a celebrity, it is more likely to be appreciated than reshared, since the act of resharing can also be seen as a form of “self-representation” of the follower (we will analyze the impact of pictures containing faces in the following section). The opposite holds for landscapes, i.e. they are more likely to be reshared and used for self-representation by resharers.
D. Images containing one face
In traditional mono-directional media (e.g. tv, billboards,
Posts with an image: the probability of reshares is almost three times higher, but the probability of being viral is lower when it comes to the number of comments.

Plusones: complex interaction.

–  $\hat{F}_{\text{resh}}(10)$ = 0.28 vs. 0.10

–  $\hat{F}_{\text{repl}}(50)$ = 0.33 vs. 0.22

Text Only vs. Image

•  Reshares: within a vast information flow, visual cues grab the user’s attention.

BUT

•  Comments: text-only posts elicit more “linguistic elaboration” than images.

•  Plusones: images have a higher initial impact; after that, high-quality textual content plays a major role.

Static images: higher virality for plusones and replies, lower for reshares

–  $\hat{F}_{\text{plus}}(75)$ = 0.30 vs 0.17

–  $\hat{F}_{\text{repl}}(50)$ = 0.22 vs 0.08

–  $\hat{F}_{\text{resh}}(10)$ = 0.27 vs 0.48

Static vs. Animated

Fig. 6. Virality CCDFs for static vs. animated images. Panels (a), (b), (c); y-axis: $\hat{F}(n)$; legend: static image, animated image.
strategy applicable to Social Media? Understanding the effect
Fig. 7. Virality CCDFs for image orientation. Panels (a), (b), (c); y-axis: $\hat{F}(n)$; legend: vertical, square, horizontal.
ratio are too few to further verify the hypotheses.

Static vs. Animated

Animation adds a further dimension to the expressivity of pictures.

Annotated a random subsample of 200 images. 81% of animated images were “memetic”. Two annotators; an image is a positive example if it scores 1 on at least one of the dimensions funny | cute | quirky. Cohen’s kappa 0.78

Animated images are mainly a vehicle for amusement, at least on Google+, and tend to be reshared more.
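The agreement figures quoted for the annotation studies are Cohen's kappa values. As a minimal sketch, kappa for two annotators can be computed as below; the function name `cohens_kappa` and the ten binary labels are made up for illustration and are not the study's annotations.

```python
def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: (observed - expected agreement) / (1 - expected)."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Share of items on which the two annotators agree.
    observed = sum(x == y for x, y in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each annotator's label frequencies.
    categories = set(labels_a) | set(labels_b)
    expected = sum(
        (labels_a.count(c) / n) * (labels_b.count(c) / n) for c in categories
    )
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label
    return (observed - expected) / (1.0 - expected)


# Hypothetical "memetic" (1) vs. "not memetic" (0) labels for ten images.
annotator_1 = [1, 1, 1, 0, 0, 1, 0, 1, 1, 0]
annotator_2 = [1, 1, 0, 0, 0, 1, 0, 1, 1, 1]
kappa = cohens_kappa(annotator_1, annotator_2)
print(round(kappa, 3))
```

Kappa corrects raw agreement for chance: in this toy example the annotators agree on 8 of 10 images, but the chance-corrected value is lower.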

Image Orientation

No strong impact on reshares, while plusones and replies discriminate well in favor of vertical images.

•  $\hat{F}_{\text{plus}}(75)$ = 0.38 vs 0.26

•  $\hat{F}_{\text{repl}}(50)$ = 0.38 vs. 0.17

Image Orientation

Landscape pics: lower viral probability for plusones and replies but slightly higher for reshares

If a vertical image contains the portrait of a celebrity, it is more likely to be appreciated than reshared, since the act of resharing is a form of “self-representation” of the follower.

Random subsample of 200 images. 55% Instagrammed. 65% including b/w.

Two annotators; an image is a positive example if it is clearly recognized as modified with a filter; annotator agreement is high, Cohen’s kappa 0.68.

Squared images are typical of services à la Instagram, providing a so-called “vintage effect”.

Face vs. No Face

Considering any image containing at least one face.

The effect of faces on virality is statistically significant but small. Pictures containing faces have slightly higher replies and plusones but lower reshares.

Fig. 7. Virality CCDFs for images containing faces vs. images without faces. Panels (a), (b), (c); y-axis: $\hat{F}(n)$; legend: no faces, one face.
while for resharers it is 27% higher in favor of high brightness
Fig. 8. Virality CCDFs for image Brightness. Panels (a), (b), (c); y-axis: $\hat{F}(n)$; legend: mean bright. ≤ 0.85, mean bright. > 0.85.
essentially emotional experience, whereas shape corresponds

Selfies tend to be reshared less?

A subsample of pics where faces cover at least 10% of the surface.

Differences among indexes increase (higher plusones and comments, lower reshares), as expected.

Grayscale vs Colored

The impact and meaning of black-and-white photography have been studied from different perspectives (e.g. semiotics and psychology) and in different professional fields (from documentary to advertising). Rudolf Arnheim argues that color produces emotional experience, whereas shape corresponds to intellectual pleasure. Goal: understand whether such effects can be spotted in virality indexes.

Grayscale vs Colored

Fig. 9. Virality CCDFs for Grayscale vs. Colored images. Panels (a), (b), (c); y-axis: $\hat{F}(n)$; legend: mean sat. ≤ 0.05, mean sat. > 0.05.
in the context of real-time visual concept classiﬁcation.
Fig. 10. Virality CCDFs for horizontal edges. Panels (a), (b), (c); y-axis: $\hat{F}(n)$; legend: y-edge intensity ≤ 0.009, y-edge intensity > 0.009.
followers. On the contrary, animated images that usually
Perceptual grayscale: using image mean saturation (threshold of 0.05).

Colored images (saturation > 0.05): higher probability of collecting plusones and replies. No relevant difference on reshares.

Still, the photographer category raises its probability by 50% on grayscale.

Consistent with the idea that black-and-white photography is a form of artistic expression mainly used by professionals.
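The saturation threshold above (0.05) and the brightness threshold shown in the brightness figure (0.85) can be applied to an image as sketched below. This is a sketch under stated assumptions: it takes saturation and brightness to be the S and V channels of the HSV color space, which the slides do not confirm, and the names `mean_saturation_and_brightness` and `describe` are hypothetical.

```python
import colorsys

# Thresholds quoted in the slides and figure legends.
GRAYSCALE_MAX_SATURATION = 0.05
HIGH_BRIGHTNESS_MIN = 0.85


def mean_saturation_and_brightness(pixels):
    """pixels: iterable of (r, g, b) tuples with components in 0..255.

    Assumption: saturation and brightness are the HSV S and V channels.
    """
    total_s = total_v = 0.0
    count = 0
    for r, g, b in pixels:
        _, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        total_s += s
        total_v += v
        count += 1
    if count == 0:
        raise ValueError("image has no pixels")
    return total_s / count, total_v / count


def describe(pixels):
    """Label an image as perceptually grayscale and/or highly bright."""
    sat, bright = mean_saturation_and_brightness(pixels)
    return {
        "grayscale": sat <= GRAYSCALE_MAX_SATURATION,
        "high_brightness": bright > HIGH_BRIGHTNESS_MIN,
    }


# Two tiny hypothetical "images": mid-gray pixels and a pure red pixel.
gray_image = [(128, 128, 128), (200, 200, 200)]
red_image = [(255, 0, 0)]
print(describe(gray_image), describe(red_image))
```

In practice the pixel tuples would come from an image-loading library; plain tuples are used here so the sketch depends only on the standard library.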

Image Brightness

Usually images with high brightness are cartoon-like or “photoshopped”.

Image Brightness

Fig. 9. Virality CCDFs for image Brightness. Panels (a), (b), (c); y-axis: $\hat{F}(n)$; legend: mean bright. ≤ 0.85, mean bright. > 0.85.
Rudolf Arnheim, for example, argues that color produces

Brighter images: lower probability of being viral on plusones and replies, but higher probability on reshares.

–  $\hat{F}_{\text{plus}}(75)$ = 0.31 vs 0.18

–  $\hat{F}_{\text{repl}}(50)$ = 0.23 vs 0.12

–  $\hat{F}_{\text{resh}}(10)$ = 0.26 vs 0.33

Image Brightness

•  Random subsample of 200 very bright images. 88% contained text, but only 13% were cartoons and 13% photoshopped. Only 21% were considered funny or memetic.

Content meant to be mainly informative, complementary to animated pics, which are mainly intended for amusement.

•  Vast majority: infographics, screenshots of software, social-network posts etc.

Two annotators, four binary categories: contain-text | comics | real-picture | funny. Cohen’s kappa 0.74

VIRALITY INDEXES CORRELATION

Correlation Analysis

•  Plusones and replies always show high correlation, while replies and reshares always show low correlation.

TABLE II. VIRALITY INDEXES CORRELATION ON THE VAR
DATASET CUTS, PEARSON COEFFICIENT AND MIC WITH PARAM
α = 0.5, c = 10 USED.
| | Pearson | MIC |
|---|---|---|
| Static images | | |
| plusoners vs. replies | 0.723 | 0.433 |
| plusoners vs. resharers | 0.550 | 0.217 |
| replies vs. resharers | 0.220 | 0.126 |
| Animated Images | | |
| Text Only | | |
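The Pearson coefficients reported in the table measure linear correlation between pairs of virality indexes over the same posts. A minimal sketch follows; the function name `pearson` and the per-post counts are made up for illustration, and MIC is not reproduced here.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-post counts (made-up numbers).
plusoners = [10, 40, 75, 120, 300]
replies = [2, 9, 20, 25, 70]
r = pearson(plusoners, replies)
print(round(r, 3))
```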
In Table III instead, we sum up some of the main ﬁ
Plusones and reshares: mild correlation in most cases, but high for funny pictures

Procedural effect: the follower expresses his/her appreciation for the funny picture and, after that, reshares the content. Since resharing also implies writing a comment in the new post, the reply is likely not to be added to the original VIP’s post.

Endorsement vs. Self-Representation

•  Plusones and replies are a form of endorsement, while reshares correspond to self-representation.

   –  Pictures containing faces are endorsed but not used for self-representation by VIPs’ followers.

   –  Animated images, containing funny material, are more likely to provoke reshares.

Studies show that people tend to represent themselves with positive feelings, and positive moods appear to be associated with social interactions.

•  Investigate possible interactions between image characteristics and VIPs’ typology.

•  To what extent are the results generalizable, or typical of a community gathered around a common interest?

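TABLE VI. CONTINGENCY TABLE OF IMAGE-CATEGORY DISTRIBUTIONS OVER USER-CATEGORIES.

| User-category | Grayscale | Colored | High Brightness | Low Brightness | Containing Face | Containing No Face | Squared | Vertical | Horizontal | Total |
|---|---|---|---|---|---|---|---|---|---|---|
| No Category | 7% | 6% | 9% | 6% | 5% | 7% | 4% | 5% | 7% | 6% |
| Actor | 4% | 6% | 5% | 5% | 8% | 5% | 5% | 6% | 5% | 5% |
| Artist | 5% | 6% | 7% | 6% | 6% | 6% | 5% | 7% | 6% | 6% |
| Company | 0% | 1% | 1% | 1% | 1% | 1% | 1% | 1% | 1% | 1% |
| Entrepreneur | 8% | 7% | 6% | 7% | 7% | 7% | 8% | 5% | 8% | 7% |
| Music | 3% | 16% | 3% | 16% | 19% | 12% | 15% | 29% | 8% | 14% |
| Not Available | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% |
| Organization | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% |
| Other | 0% | 0% | 0% | 0% | 0% | 0% | 2% | 0% | 0% | 0% |
| Photography | 31% | 19% | 9% | 22% | 15% | 23% | 23% | 14% | 23% | 20% |
| Politician | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% |
| Sport | 0% | 3% | 1% | 3% | 4% | 2% | 2% | 2% | 3% | 2% |
| Technology | 27% | 22% | 40% | 20% | 19% | 24% | 16% | 18% | 25% | 22% |
| TV | 1% | 2% | 1% | 2% | 3% | 1% | 5% | 1% | 2% | 2% |
| Website | 1% | 2% | 2% | 2% | 2% | 2% | 1% | 1% | 2% | 2% |
| Writing | 11% | 10% | 17% | 10% | 11% | 10% | 11% | 10% | 11% | 11% |
| KL-divergence | 0.173 | 0.002 | 0.259 | 0.003 | 0.027 | 0.006 | 0.047 | 0.076 | 0.029 | |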
at the Kullback-Leibler (KL) divergence of specific image
categories with respect to the reference distribution (i.e., taken
as the total number of images posted by each user-category),
we observe very few but interesting effects due to specific
user-categories. In particular, while all the KL divergences are
very small, two of them (for Grayscale and High Brightness,
and exploiting descriptors such as color histograms, oriented-
edges histograms; (c) building upon the vast literature available
in the context of scene/object recognition, dividing our dataset
into specific categories in order to analyse relations between
categories of natural images and their virality.

Discussion

•  Kullback-Leibler (KL) divergence of image categories with respect to the reference distribution.

•  KL divergences are all very small; those for Grayscale and High Brightness are a little higher, explained by the distribution gap in two user categories.

   –  High Brightness: the probability of Technology users doubles (from 22% to 40%), while Music and Photography reduce theirs to one third. Consistent with the analysis of infographics and screenshots of software programs (connected to technology).

   –  Grayscale: Photography users raise their probability by 50%, while Music reduces it to one third. Consistent with the idea that black-and-white photography is a form of artistic expression mainly used by professionals.
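The KL divergence used in this comparison can be sketched as follows. The sketch makes several assumptions not confirmed by the slides: it uses the natural logarithm, takes the image-category distribution as P and the reference distribution as Q, and returns infinity when P puts mass where Q has none. The function name `kl_divergence` and the two toy distributions are hypothetical, so the output is not expected to match the table values.

```python
import math


def kl_divergence(p, q):
    """KL(P || Q) = sum_i p_i * ln(p_i / q_i), with 0 * ln(0 / q) taken as 0."""
    if len(p) != len(q) or not p:
        raise ValueError("distributions must be non-empty and of equal length")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0:
            continue  # a zero-probability term contributes nothing
        if q_i == 0:
            return math.inf  # P has mass where Q has none
        total += p_i * math.log(p_i / q_i)
    return total


# Toy distributions over two user categories (made-up numbers).
category_dist = [0.5, 0.5]
reference_dist = [0.25, 0.75]
print(round(kl_divergence(category_dist, reference_dist), 4))
```

A value near zero means that an image category is spread over user categories in about the same way as images overall, which is how the small divergences in the table are read.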

•  A preliminary study showing that the perceptual characteristics of an image can strongly affect the virality of the post embedding it.

•  Across various kinds of images (e.g. cartoons, panoramas or self-portraits) and related features (e.g. orientation, animation), users’ reactions are affected in different ways.

•  Further details: Marco Guerini, Jacopo Staiano, Davide Albanese. Exploring Image Virality in Google Plus. In Proceedings of IEEE/ASE SocialCom (2013)
