Ageing Factor: a Potential Altmetric for Observing Events and Attention Spans in Microblogs
Victoria Uren & Aba-Sah Dadzie
Why look at science discussion on Twitter?
• Public engagement with science matters:
  • Enthuse kids to learn science,
  • Inform people about fascinating stuff,
  • Build consensus for social and economic change,
  • The public paid for our research!
• Social media present a great opportunity to “talk nerdy” to the public (on.ted.com/Marshall)
Challenges
1. Level of tweeting low for science
  • Of 32 astronomy terms from the UNESCO Thesaurus, 6 occurred at a usable level
  • Earth, Moon, Sun, Stars, Universe, Space
2. Scientific tweets lost in noise
  • 0.04 of tweets in the UNESCO terms sample were on topic
  • ~0.4 for popular culture tweets (Mejova and Srinivasan 2012)
Meteor Showers – coming to a sky near you!
• Debris from comets streams to Earth on parallel paths
• Shooting stars appear to radiate from a point
• Predictable time and place
• Fun to observe with friends!
• Geminid 13-14 Dec 2011
• Quadrantid 3 Jan 2012
Images from Wikipedia
Ageing Factor
$$AF = \sqrt[i]{\frac{k}{k+l}}$$
Where:
i is the cut-off time in hours,
k is the number of retweets originating at least i hours ago,
l is the number of retweets originating less than i hours ago,
k + l is therefore all the retweets in the sample.
Based on Avramescu’s metric from Egghe & Rousseau (1990) – using hours instead of years.
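The ageing factor definition can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `ageing_factor`, the idea of representing each retweet by the timestamp at which it originated, and the reference time `now` against which age is measured are all hypothetical choices made for the example.

```python
from datetime import datetime, timedelta


def ageing_factor(origin_times, now, cutoff_hours):
    """Return AF = (k / (k + l)) ** (1 / i) for a sample of retweets.

    origin_times: origin timestamp of each retweet (assumed representation).
    now: reference time used to measure age (assumed, e.g. sampling time).
    cutoff_hours: the cut-off i, in hours.
    """
    if cutoff_hours <= 0:
        raise ValueError("cutoff_hours must be positive")
    if not origin_times:
        raise ValueError("the sample is empty")
    cutoff = timedelta(hours=cutoff_hours)
    # k: retweets originating at least i hours before the reference time.
    k = sum(1 for t in origin_times if now - t >= cutoff)
    total = len(origin_times)  # k + l, every retweet in the sample
    return (k / total) ** (1.0 / cutoff_hours)


if __name__ == "__main__":
    # Made-up sample: ages of 30, 90, 120 and 600 minutes.
    now = datetime(2012, 1, 3, 18, 0)
    sample = [now - timedelta(minutes=m) for m in (30, 90, 120, 600)]
    print(ageing_factor(sample, now, 1))  # 3 of 4 are at least 1 h old: 0.75
```

With i = 1 the root disappears and the 1hAF is simply the proportion of retweets that are at least one hour old, so a burst of fresh activity pulls the value down.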
Assumptions
• Assumption 1: ageing factors for topics which concern special events will be lower than suitable baselines.
• Assumption 2: ageing factors which are higher than suitable baselines are associated with topics in which interest is sustained over time.
Experiment 1
• Dec 13-14th 2011 – Geminid meteor shower
• Training set:
  • 8980 tweets
  • Dec 14th 2011 22:36 GMT - Dec 14th 2011 23:18 GMT
• Test set:
  • 81891 tweets
  • Dec 14th 2011 23:18 GMT - Dec 15th 2011 03:30 GMT
• Human categorization by reading of tweets in the training data
  • 9 composite searches
  • 3 baseline searches
• 1hAF & 24hAF
Experiment 2
• Jan 3 2012 - Quadrantid meteor shower
  • > 2 weeks later
• Four time windows:
  • 0:00-5:59 GMT (6)
  • 6:00-11:59 GMT (12)
  • 12:00-17:59 GMT (18)
  • 18:00-23:59 GMT (24)
• Are 1hAF values low for this event?
• Does the time of day matter (it must be dark to see meteors)?
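The four six-hour windows can be assigned mechanically. The sketch below is only an illustration: the helper names `window_label` and `count_by_window` are hypothetical, and it assumes the timestamps are already expressed in GMT.

```python
from collections import Counter
from datetime import datetime


def window_label(timestamp):
    """Map a GMT timestamp to 6, 12, 18 or 24, the label of its six-hour window."""
    # Hours 0-5 give 6, 6-11 give 12, 12-17 give 18, 18-23 give 24.
    return (timestamp.hour // 6 + 1) * 6


def count_by_window(timestamps):
    """Count how many timestamps fall into each labelled window."""
    return Counter(window_label(t) for t in timestamps)


if __name__ == "__main__":
    # Made-up timestamps on the day of the Quadrantid shower.
    times = [
        datetime(2012, 1, 3, 5, 59),
        datetime(2012, 1, 3, 18, 0),
        datetime(2012, 1, 3, 23, 59),
    ]
    print(dict(count_by_window(times)))
```

Labelling each window by its closing hour matches the (6), (12), (18), (24) tags used on the slide, so results can be referred to as @6, @12, @18 and @24.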
Results – Queries from Training Data
[Figure: values on a y-axis from 0 to 1 for the Batch, Space, Astro and Meteor training-data queries, with one series per time window: 6, 12, 18, 24]
• Astro AND events @12: of 275 total retweets, 18 contain the term quadrantid while 213 contain the term wish – these are NOT about meteor showers!
• Astro & events and Meteor both contain “shooting star” and are low cf. Astro in 12
Modified Queries
• Background Knowledge - 3 astronomical events:
  • Quadrantid meteor shower night of 3-4 Jan.
  • 2nd of the twin Grail spacecraft moving into orbit around the Moon on the 2nd of Jan.
  • Proximity of the Moon and the planet Jupiter in the night sky on the 2nd of Jan.
Results – Modified Queries
• Space AND grail @18 lies within the expected variance of the population
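The slides do not say how the expected variance band was computed. One common construction for a funnel plot, assumed here purely for illustration, uses normal-approximation binomial control limits around a baseline proportion; this fits the 1hAF because with i = 1 it is a proportion k / (k + l). The function names, the `baseline` parameter and the default z = 1.96 (roughly 95% limits) are all hypothetical choices, not taken from the source.

```python
import math


def funnel_limits(baseline, n, z=1.96):
    """Approximate control limits for a proportion observed on n retweets.

    baseline: assumed population proportion (hypothetical input).
    n: number of retweets behind the observed point.
    z: width of the band in standard errors (1.96 is roughly 95%).
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 <= baseline <= 1.0:
        raise ValueError("baseline must lie in [0, 1]")
    se = math.sqrt(baseline * (1.0 - baseline) / n)
    # Clamp to [0, 1] because a proportion cannot leave that range.
    return max(0.0, baseline - z * se), min(1.0, baseline + z * se)


def is_outside_funnel(value, baseline, n, z=1.96):
    """Return True when the observed value falls outside the control limits."""
    low, high = funnel_limits(baseline, n, z)
    return value < low or value > high
```

Because the band narrows as n grows, a small query needs a more extreme ageing factor than a large one before it counts as unusual, which is why the number of retweets is reported next to each value.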
Results – 3 “Interesting” Sets
• 2 Astro AND quad points
  • @18: AF 0.15, 182 retweets; @24: AF 0.22, 330 retweets
  • Inference: retweeting activity around the Quadrantid meteor shower was significant in the hours of darkness for the UK and USA
• 1 Space NOT grail point
  • @6: AF 0.71, 274 retweets
  • 216 of the retweets contained the phrase “join NASA”
  • “Oh really? You need space? You might as well join NASA.”
  • Inference: this is a funny joke (apparently)!
Conclusions
• 1hAF does support analysis of the smaller datasets typical of scientific posts
• 24hAF was not a sensitive measure
  • 24h time window too long for Twitter
• Funnel plot suggests some observations are significant
• Both low and high 1hAF were observed
  • High interest = low 1hAF
  • Long attention span = high 1hAF
Future Directions
• NLP/ML approach for identifying scientific posts?
  • Clustering better than categorisation because data changes rapidly with time?
  • (after >2 weeks our training data queries were outdated)
• Different cutoffs for the AF?
  • E.g. 6hAF for quarter days
• Episodic vs steady tweets (Hu et al. 2012)
  • Episodic -> low AF?
  • Steady -> high AF?
• Different types of participant?
References
Y. Mejova and P. Srinivasan, “Crossing Media Streams with Sentiment: Domain Adaptation in Blogs, Reviews and Twitter,” in Sixth International AAAI Conference on Weblogs and Social Media, 2012, pp. 234-241.
L. Egghe and R. Rousseau, Introduction to Informetrics. Elsevier, 1990.
Y. Hu et al., “What Were the Tweets About? Topical Associations between Public Events and Twitter Feeds,” in Sixth International AAAI Conference on Weblogs and Social Media, 2012, pp. 154-161.