Kevin Teh Insight presentation

  1. Disambiguating Twitter Search. Kevin Teh, kkwteh@gmail.com. Insight Data Science Fellows Program, March 2013. Tuesday, February 26, 2013.
  2. That’s not the python that I meant...
  3. The solution? cluster-pluck.
  4. cluster-pluck disambiguates Twitter search in real time.
  5. It works in Spanish too!
  7. Tools: Word Filter, Web Application, 300,000 Tweets, Filter, User.
  8. Algorithm: (1) read the query and download a corpus of 1500 tweets; (2) filter out common words; (3) select potentially meaningful words: count occurrences, rank the remaining words by number of occurrences and select the top 10, and rank the remaining words by rate of capitalization and select the top 10; (4) cluster the candidate words into groups: link two candidate words if their relative proportion of co-occurrence is greater than 0.25, then rank the connected components by total occurrences and take the top 3; (5) assign tweets to clusters.
  9. Kevin Teh, kkwteh@gmail.com. Math PhD, May ’13. Topic: Noncommutative Geometry (Whatever that is). B.A.Sc., April ’07, Engineering Science (Whatever that is).
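
The steps on the Algorithm slide can be sketched in Python roughly as follows. This is a minimal illustration, not the cluster-pluck source. It rests on several assumptions the slides do not spell out: tweets arrive as a plain list of strings rather than being downloaded; `COMMON_WORDS` is a tiny hypothetical stop-word list; the query word itself is excluded from the candidates; the two top-10 lists are merged by union; "relative proportion of co-occurrence" is taken to be the number of tweets containing both words divided by the number of tweets containing the rarer word; and a tweet goes to the cluster it shares the most words with. The function name `cluster_tweets` and its parameters are likewise made up for this sketch.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical stand-in for the slide's "common words" filter.
COMMON_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is",
                "i", "my", "on", "for", "with", "at"}
TOKEN = re.compile(r"[A-Za-z']+")


def cluster_tweets(tweets, query, top_words=10, top_clusters=3,
                   link_threshold=0.25):
    """Return (clusters, assignment): word sets, and a cluster index
    (or None) for each tweet."""
    skip = COMMON_WORDS | {query.lower()}  # assumption: drop the query word
    occurrences = Counter()   # total occurrences of each kept word
    capitalized = Counter()   # occurrences starting with an uppercase letter
    tweet_words = []          # set of kept lowercase words per tweet
    for tweet in tweets:
        kept = set()
        for token in TOKEN.findall(tweet):
            word = token.lower()
            if word in skip:
                continue
            occurrences[word] += 1
            if token[0].isupper():
                capitalized[word] += 1
            kept.add(word)
        tweet_words.append(kept)

    # Candidate words: top by count plus top by capitalization rate
    # (union of the two lists is an assumption).
    by_count = [w for w, _ in occurrences.most_common(top_words)]
    by_caps = sorted(
        occurrences,
        key=lambda w: (-capitalized[w] / occurrences[w], -occurrences[w], w),
    )[:top_words]
    candidates = sorted(set(by_count) | set(by_caps))
    candidate_set = set(candidates)

    # Count tweets containing each candidate, and each pair of candidates.
    doc_freq = Counter()
    co_occur = Counter()
    for words in tweet_words:
        present = sorted(words & candidate_set)
        doc_freq.update(present)
        co_occur.update(combinations(present, 2))

    # Link two words when their co-occurrence proportion exceeds the threshold.
    adjacency = {w: set() for w in candidates}
    for (a, b), n in co_occur.items():
        if n / min(doc_freq[a], doc_freq[b]) > link_threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)

    # Connected components of the link graph, found by depth-first search.
    components, seen = [], set()
    for start in candidates:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            word = stack.pop()
            if word in component:
                continue
            component.add(word)
            stack.extend(adjacency[word] - component)
        seen |= component
        components.append(component)

    # Keep the components with the most total occurrences.
    components.sort(key=lambda c: -sum(occurrences[w] for w in c))
    clusters = components[:top_clusters]

    # Assign each tweet to the cluster it overlaps most, or None if no overlap.
    assignment = []
    for words in tweet_words:
        overlaps = [len(words & c) for c in clusters]
        best = max(overlaps, default=0)
        assignment.append(overlaps.index(best) if best > 0 else None)
    return clusters, assignment
```

Calling `cluster_tweets(tweets, "python")` on a list of tweet strings returns the word clusters and, for each tweet, the index of its cluster. The 0.25 threshold, the top-10 cutoffs, and the top-3 cluster limit are the values given on the slide; everything else is a design choice made for the sketch.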
