Leveraging the Power of Social Media

A light intro to natural language processing on social media, presented as an invited talk at the University of Sheffield Engineering Symposium 2014 in the AI session. As well as an introduction to the area, this presentation covers powerful real-world applications of social media, and touches on the work we do in the Sheffield NLP group.

Video cast: https://www.youtube.com/watch?v=QUbRmUinhHw&feature=youtu.be

  1. Leveraging the Power of Social Media. Leon Derczynski, Natural Language Processing Group, Department of Computer Science, Faculty of Engineering, University of Sheffield
  2. work in the field of “computational linguistics”; focus on turning text into “understanding” and “decision support”
  3. the “AI effect” (Pamela McCorduck): artificial intelligence is less impressive when we know how it works
  4. the “AI effect” (Pamela McCorduck): artificial intelligence is less impressive when we know how it works ..so this talk won't have deep technical detail
  5. language
  6. (note huge evolutionary advancement)
  7. ??
  8. social media
  9. social media – a poster child for big data
  10. big data: promises new insights; is (was) a cool buzzword; causes headaches; what is it?
  11. V: velocity. twitter: 255 000 000 users / month; Facebook: 1 280 000 000 users / month
  12. VV: volume. reddit: 34 000 000 posts / month; twitter: 650 000 000 messages / month
  13. VVV: variety
  14. there are many online social networks; we need one of these
  15. there are many online social networks; we need one of these
  16. artificial intelligence
  17. “Human knowledge is expressed in language. So computational linguistics is very important.” - Mark Steedman
  18. Start: sequence of bytes; [natural language processing goes here]; End: actionable knowledge
  19. why bother programming at all?
  20. why bother programming at all? … let the computer program itself!
  21. machine learning: make decisions about tasks based on things you've seen before; a little bit like human learning
  22. give text and examples of what we want done; machine learns from these examples
  23. understanding language
  24. social media text is surprisingly formal
  25. they see me rollin - a typo?
  26. they see me rollin they hatin - perhaps not. G-dropping mapped from speech
  27. they see me rollin they hatin patrollin - incidentally, this linguistic phenomenon is a good predictor of education level
  28. they see me rollin they hatin patrollin tryna catch me ridin dirty - a new style! flawless; not a single mistake
  29. omb x - surely they mean “omg”?
  30. omb ✔ - the keys are like, right next to each other X really? this guy?
  31. Shall we go out for dinner this evening? Ey yo wen u gon let me tap dat
  32. spelling ability distribution in net slang users
  33. with spelling ability distribution in non-slang users
  34. Do you feel luccy, punk?
  35. Do you feel luccy, punk?
  36. challenge 1: what language is this anyway je bent Jacques cousteau niet die een nieuwe soort heeft ontdekt, het is duidelijk, ze bedekken hun gezicht. Get over it RT @TomPIngram: VIVA LAS VEGAS 16 - NEWS #constantcontact http://t.co/VrFzZaa7
  37. challenge 2: pls type better. “I wonde rif Tsubasa is okay..” - misplaced space = two new words; “no homwork tonight.. suprising??” - maybe there should be!
  38. challenge 3: finding names. derek x is a person; miles x might be a person; Marie Claire x should not be a person; Exodus Porter x probably an OK person, but actually a beer
  39. challenge 3: finding names. Spicy Pickle Jr. x apparently actually a person
  40. challenge 3: finding names. Spicy Pickle Jr. x apparently actually a person ???
  41. old news
  42. social media defends against earthquakes 2010
  43. Japanese and US quake response times: down from ~20s to ~17.5s
  44. social media predicts epidemics 2012; exhibit a: one dead crow
  45. social media mentions of dead crows predict WNV in humans: “There's a dead crow in my garden”
  46. social media predicts you getting flu 2012
  47. @mari: i think im sick ugh.. great potential for misuse :)
  48. this november: social media dispatches fire engines 2014
  49. trust
  50. if hospitals and fire stations act based on tweets, wrong information is extra-harmful
  51. rumours; speculation; misinformation; disinformation; who can you trust online? Imagine a lie detector for politicians / Fox News
  52. responsibility
  53. 1. Collect tweets 2. ???? 3. Profit!
  54. how long do we keep them for? - “15 years is OK, right?” - NSA; what do we store and process? - “just metadata, it's harmless” - GCHQ
  55. (from Kurt Opsahl's slides at the Chaos Communication Congress, photo by Marion Marschalek)
  56. bias
  57. news style social media
  58. most of our language AI was trained on news text; the bias is: - middle class - white - working age - educated - male - 1980s/1990s - from the US - journalist - following AP guidelines
  59. your phone rewards you if you talk and write like (ok.. sort of)
  60. your phone rewards you if you talk and write like (ok.. sort of) .. and punishes you when you don't. (not cool!)
  61. twitter bias is different: - not German or Nordic - are young(ish); lower requirements: - you can publish even if you're not a journalist - still operates beyond the 1990s; some new requirements: - you do need access to the internet... - ...and twitter ( 对不起,中国人: “sorry, Chinese people” )
  62. the big picture
  63. we're racing ahead and improving life quality; there is immense value in “trivia”; understanding social media lets us help people better
  64. understanding social media lets us help people better. Thank you! Leon Derczynski
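
Slides 21 and 22 describe machine learning as giving the computer text plus examples of what we want done, and slide 36 names language identification as a first challenge. A minimal sketch of how those two ideas could fit together is shown here. It is not the method used in the talk or by the Sheffield NLP group: the character-trigram model, the function names (char_ngrams, train, classify) and the tiny training set are all assumptions made for illustration, with sentences loosely adapted from the slides.

```python
from collections import Counter
import math


def char_ngrams(text, n=3):
    """Return overlapping character n-grams of a lower-cased, space-padded string."""
    padded = " " * (n - 1) + text.lower() + " "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def train(examples):
    """Count character n-grams per label from (text, label) pairs."""
    counts = {}
    for text, label in examples:
        counts.setdefault(label, Counter()).update(char_ngrams(text))
    return counts


def classify(text, counts):
    """Return the label whose n-gram counts give the text the highest
    add-one-smoothed log-likelihood."""
    vocab = len({gram for c in counts.values() for gram in c})
    best_label, best_score = None, -math.inf
    for label, c in counts.items():
        total = sum(c.values())
        score = sum(
            math.log((c[gram] + 1) / (total + vocab))
            for gram in char_ngrams(text)
        )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training examples ("things you've seen before");
# a real system would need far more data than this.
EXAMPLES = [
    ("shall we go out for dinner this evening", "en"),
    ("there is a dead crow in my garden", "en"),
    ("i think i am sick", "en"),
    ("het is duidelijk ze bedekken hun gezicht", "nl"),
    ("je bent niet die een nieuwe soort heeft ontdekt", "nl"),
    ("ik denk dat ik ziek ben", "nl"),
]

model = train(EXAMPLES)
print(classify("there is a dead crow", model))
print(classify("een nieuwe soort", model))
```

Nothing in the code names a language rule; the behaviour comes entirely from the labelled examples, which is the point of slide 22. A model this small would be expected to struggle on the mixed-language tweet from slide 36, which is exactly why that slide calls it a challenge.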
