Vladimir Alexiev | Semantic Enrichment of Twitter Microposts Helps Understand Post-Brexit Reactions

http://2016.semantics.cc/vladimir-alexiev

Published in: Technology

  1. 1. Semantic Enrichment of Twitter Microposts Helps Understand Post-Brexit Reactions
     Dr. Laura Tolosi
     Presented by: Dr. Vladimir Alexiev
  2. 2. About Pheme
     ● A pheme is a meme enhanced with truthfulness information
       – … also the Greek goddess of fame and rumors
     ● Research project funded by the FP7-ICT Programme, now in its final year
     ● Concerned with veracity analysis in Social Media:
       – Aspects of veracity: rumor, misinformation, disinformation, true/false information, support / deny attitude
     ● Consortium: University of Sheffield, UK; Universitaet des Saarlandes, Germany; MODUL University Vienna GMBH, Austria; Ontotext AD, Bulgaria; ATOS Spain SA, Spain; King's College London, UK; The University of Warwick, UK; SwissInfo.ch, Switzerland; ihub Ltd, Kenya
  3. 3. Semantic annotations for the journalism dashboard
     ● Streaming from Twitter on selected topics and adding semantic annotations in real time (>50 tweets/s)
     ● Metadata:
       – Whatever comes from the Twitter API
       – Entity tagging: person, location, organization, event, etc.
       – LOD: DBpedia + Geonames
       – Veracity (via ML inference):
         ● Rumor probability
         ● Controversiality: support / deny / question annotation for each tweet
       – Clustering of tweets into stories
     ● The journalism dashboard use case:
       – Provides journalists with a real-time Twitter monitoring platform
       – An intuitive interface for viewing and filtering the above metadata
  4. 4. Semantic annotations flow in Pheme
     ● Tweet-processing flow for three languages: EN, DE and BG
       – Diagram includes only Ontotext's components: multilingual rumor classification and concept tagging
  5. 5. Semantic database
     ● Pheme Ontology diagram
     ● GraphDB SPARQL endpoint:
  6. 6. Post-Brexit analysis
     ● Between June 6th and August 25th we annotated more than 800,000 tweets about Brexit
     ● Advantages of using LOD annotations in Twitter analysis, demonstrated via:
       – Showcase 1: how are the large UK administrative regions mentioned on Twitter after the Brexit decision?
       – Showcase 2: who are the people most mentioned on Twitter after the Brexit decision?
  7. 7. Showcase 1: Regions analysis
     ● SPARQL query for finding mentions of UK administrative regions: 19,674 records
       – Lines 8-9: date and text of tweet
       – Lines 10-13: tweet mentions location
       – Line 17: country via Geonames is UK
       – Line 18: location is an administrative unit of rank 1 in Geonames
       – Line 19: data channel is Brexit
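The query itself is not part of this transcript (presumably it was shown as an image), so the line numbers above cannot be matched to any text here. As a rough illustration only, a query with the same five ingredients might look like the sketch below, wrapped in Python so the data channel can be passed in. The `pheme:` namespace IRI and all `pheme:*` property names are hypothetical placeholders, not the real Pheme ontology terms. The `gn:` terms come from the Geonames ontology; filtering the UK with country code "GB" is an assumption about how the original query was written.

```python
from string import Template

# SPARQL sketch. Everything under the pheme: prefix is a HYPOTHETICAL
# placeholder; only the gn: (Geonames ontology) terms are real vocabulary.
REGION_QUERY = Template("""\
PREFIX gn: <http://www.geonames.org/ontology#>
PREFIX pheme: <http://example.org/pheme#>  # hypothetical namespace

SELECT ?date ?text ?location ?name
WHERE {
  # date and text of the tweet (hypothetical properties)
  ?tweet pheme:createdAt ?date ;
         pheme:text ?text ;
         # tweet mentions a location (hypothetical property)
         pheme:mentions ?location ;
         # restrict to one data channel (hypothetical property)
         pheme:dataChannel "$channel" .
  # location is in the UK and is a first-order administrative unit
  ?location gn:countryCode "GB" ;
            gn:featureCode gn:A.ADM1 ;
            gn:name ?name .
}
""")


def build_region_query(channel: str) -> str:
    """Return the SPARQL text for one data channel, e.g. "Brexit"."""
    # Accept only plain alphanumeric names so the value cannot break
    # out of the quoted SPARQL literal.
    if not channel.isalnum():
        raise ValueError("channel must be alphanumeric")
    return REGION_QUERY.substitute(channel=channel)
```

Calling `build_region_query("Brexit")` returns a query string that could be sent to a SPARQL endpoint once the placeholder terms are replaced with the actual ontology vocabulary.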
  8. 8. Showcase 1: Regions analysis
     ● Regions by number of mentions in our dataset:
       – The opposing vote of Scotland attracts much discussion on Twitter
  10. 10. Showcase 1: Regions analysis
      ● Alternative ways of mentioning Northern Ireland in tweets
        – Note the not so widespread references: Norn Iron, Six Counties, The Occupied Six Counties
        – Source: https://en.wikipedia.org/wiki/Alternative_names_for_Northern_Ireland
  11. 11. Showcase 1: Regions analysis
      ● NLP analysis of tweets and regions: key terms whose usage differs most between tweets about Scotland and tweets about England
        – Significance of association between the presence of a term and the mention of Scotland / England, measured by the p-value of Fisher's test
        – E.g.: #indyref2 is mentioned significantly more often in tweets about Scotland than in tweets about England
          ● it is a hashtag about a referendum on Scotland's independence
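To make the Fisher's test step concrete, here is a minimal sketch of a one-sided Fisher's exact test on a 2x2 table (region mentioned x term present), using only the Python standard library. The slides do not say whether a one-sided or two-sided test was used, nor which software computed it, so this is an illustration of the technique rather than the authors' procedure. The counts in the demo are made up and are not taken from the Pheme dataset.

```python
from math import comb


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact test for the 2x2 table

                     term present   term absent
        Scotland          a              b
        England           c              d

    Returns P(X >= a) under the hypergeometric null hypothesis that
    the term is unrelated to which region a tweet mentions.
    """
    row1 = a + b          # tweets mentioning Scotland
    col1 = a + c          # tweets containing the term
    n = a + b + c + d     # all tweets in the table
    denom = comb(n, row1)
    k_max = min(row1, col1)
    # Sum the hypergeometric probabilities of tables at least as
    # extreme as the observed one (term count in row 1 >= a).
    tail = sum(
        comb(col1, k) * comb(n - col1, row1 - k)
        for k in range(a, k_max + 1)
    )
    return tail / denom


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    p = fisher_exact_greater(40, 960, 5, 1995)
    print(f"one-sided p-value: {p:.3g}")
```

A small p-value for a term means it appears with Scotland more often than chance would explain; ranking terms by this p-value gives the "most distinctly used" list described above.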
  13. 13. Showcase 2: People analysis
      ● SPARQL query for retrieving mentions of known people: 53,355 records
        – Lines 12-13: tweet mentions Person
        – Line 10: Person is known from DBpedia, not an ML inference
        – Lines 14-16: retrieve date of birth from DBpedia
        – Line 17: data channel is Brexit
  14. 14. Showcase 2: People analysis
      ● Year-of-birth distribution of people mentioned in post-Brexit tweets
        – Note the long tail on the left: historical figures?
        – Peak around 1950: people aged ~65 now
        – The peak at age ~65 relates to the frequent discussions about the different voting preferences of older and younger people
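A distribution like this can be produced by binning the birth years returned by the people query. The sketch below bins by decade; the bin width and the function name are assumptions for illustration, since the slides do not state how the chart was built.

```python
from collections import Counter


def birth_decade_histogram(birth_years):
    """Count birth years per decade, returned in chronological order.

    `birth_years` is an iterable of integer years, one per mention
    (or one per person, depending on what should be counted).
    """
    counts = Counter((year // 10) * 10 for year in birth_years)
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    # Birth years of a few people named in the slides:
    # Henry VIII, Churchill, Hawking, Merkel, May, Johnson.
    sample = [1491, 1874, 1942, 1954, 1956, 1964]
    for decade, n in birth_decade_histogram(sample).items():
        print(f"{decade}s: {'#' * n}")
```

Even on this tiny sample the shape described above is visible: isolated early decades form the long left tail, and the counts cluster around the 1950s.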
  15. 15. Showcase 2: People analysis
      ● Historical figures:
        – Sir Winston Churchill, Henry VIII, Adam Smith, Adolf Hitler, Sir Arthur Harris, Ralph Vaughan Williams, George Santayana, Richard III, Aldous Huxley, Isaac Newton
        – Tweets reveal insightful analogies with the past:
  16. 16. Showcase 2: People analysis
      ● Contemporary figures, most mentioned. Stephen Hawking stands out from the crowd of politicians
        – Theresa May (11,452), Boris Johnson (3,961), Nigel Farage (2,926), David Cameron (2,868), Andrea Leadsom (1,388), Angela Merkel (922), Jeremy Corbyn (820), Nicola Sturgeon (780), Stephen Hawking (746)
      ● Support / deny / question tweets mentioning these personalities:
        – Surprisingly many question-like tweets mention Angela Merkel
        – Tweets mentioning Stephen Hawking have an attitude of deny
      ● Example tweet: "Our attitude towards wealth played a crucial role in Brexit. We need a rethink" - Stephen Hawking https://t.co/IA0tr0l8Jm #Brexit #UK
  18. 18. Showcase 2: People analysis
      ● Mentions of young people are comparatively few:
        – 183 people born after 1975 were mentioned
        – At a quick glance, most of them are sportsmen (mostly football players) or actors, not activists for Brexit
        – Most mentioned:
          ● Ruth Davidson (leader of the Scottish Conservative and Unionist Party),
          ● Will Straw (British policy researcher and Labour Party politician),
          ● Paul Nuttall (Deputy Leader of the UK Independence Party),
          ● Max Schrems (Austrian lawyer, author and privacy activist),
          ● Tim Stanley (English blogger, journalist and historian),
          ● Tulip Siddiq (British Labour Party and Co-operative Party politician),
          ● Julia Reda (German politician and activist),
          ● Tom Cotton (American politician who is the junior United States Senator from Arkansas),
          ● Chuka Umunna (British Labour politician),
          ● Laura Kuenssberg (British journalist, currently the political editor of BBC News), etc.
      ● This confirms the previous result that Scotland is the hottest topic
      ● Independent analyses of regions and people support and reinforce each other via LOD enrichment
  19. 19. Conclusions
      ● For microposts, the enrichment with LOD is extremely helpful
        – It provides the context that is necessary for understanding opinion / trend, etc.
        – It makes the computer “read like a human”, by recalling and relating to external common knowledge
      ● Reasoning about political regions, the age of the people mentioned, and their functions is possible only with semantic enrichment
        – We can only imagine that historians and journalists would greatly benefit from the collection of quotes and analogies with the past that we discovered
  20. 20. Awards
      ● For its contribution to Pheme, Ontotext is nominated for the Innovation Radar Prize 2016 (among 40 of the best EU-funded innovators)
  21. 21. Ontotext's analysis of Twitter before the Brexit vote
      ● Ontotext's Twitter analysis before the polls speculated that, at least based on Social Media trends, the Brits want out
  22. 22. References
      Read the complete post-Brexit vote analysis here: Deliverable D4.1.2. LOD-based reasoning about rumors
