Enable tweet-geolocation and don’t drive ERTs crazy!
   Improving situational awareness using Twitter




   Juan Sixto, Oscar Peña, Bernhard Klein and Diego López-de-Ipiña
      DeustoTech−Deusto Institute of Technology, University of Deusto
           SMERST 2013. University of Warwick, Coventry UK
Outline


                        ● Motivations
                        ● Why Twitter?
                        ● Extracting Relevant Information
                        ● Location Problems
                        ● Noise Reduction
                        ● Disambiguation
                        ● Proposed Solutions
                        ● Results
                        ● Conclusions and Future Work




http://www.flickr.com/photos/usnavy/8154906115/
Motivations

                         ● SABESS Project

                         ● Social networks offer higher
                         availability than traditional
                         communication services.

                         ● People tend to post on social
                         networks more often than they alert
                         Emergency Response Teams (ERTs).

                         ● Incoming data is easy to process
                         automatically (programmatic analysis).




http://www.flickr.com/photos/wiertz/8553028974
Why Twitter?

                        Advantages
                        ● Very popular communication tool
                        (>175M tweets/day and 200M users)
                        ● De facto tool for broadcasting news
                        ● Real-time information


                        Drawbacks
                        ● Short posts (140 characters max)
                        ● A lot of noise in the Twitterverse




http://www.flickr.com/photos/usnavy/8154906115/
Extracting relevant information

                                         ● What?
                                            ● Incidents filtered by
                                               keywords + context
                                         ● Where?
                                            ● User profile's location
                                            ● Geo-tagged tweets
                                            ● NER System
                                            (Named Entity Recognition)
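
The "What?" and "Where?" steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the keyword list, the dictionary field names and the priority order (geo-tag, then NER result, then profile location) are all assumptions made for the example, and the "context" check mentioned on the slide is left out.

```python
# Hypothetical keyword list; the slides do not give the actual one.
INCIDENT_KEYWORDS = {"fire", "flood", "accident", "earthquake"}


def mentions_incident(tweet):
    """'What?': keep tweets that contain at least one incident keyword."""
    words = {w.strip("#.,!?").lower() for w in tweet.get("text", "").split()}
    return bool(words & INCIDENT_KEYWORDS)


def location_source(tweet):
    """'Where?': return the first available location source.

    The priority order used here is an assumption for illustration.
    """
    if tweet.get("coordinates") is not None:      # geo-tagged tweet
        return ("geotag", tweet["coordinates"])
    if tweet.get("ner_location"):                 # place name found in the text
        return ("ner", tweet["ner_location"])
    if tweet.get("profile_location"):             # free-text profile field
        return ("profile", tweet["profile_location"])
    return ("unknown", None)


example = {
    "text": "Car on #fire outside my window",
    "ner_location": "Springfield",
    "profile_location": "Illinois",
}
print(mentions_incident(example), location_source(example))
```

Returning the source name together with the value lets later stages treat a precise geo-tag differently from a free-text profile field.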




     http://www.flickr.com/photos/rosauraochoa/3283888598/
Location Problems

                                   ● Ungeotagged tweets & users

                                   ● Noise

                                   ● Ambiguity
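
To make the ambiguity problem concrete: a place name such as "Springfield" matches several cities. One possible heuristic, shown here only as a sketch and not taken from the slides, is to pick the candidate closest to some context point such as the user's profile location. The candidate coordinates are approximate and the function names are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

# Approximate coordinates of three US cities named Springfield.
CANDIDATES = {
    "Springfield, Illinois": (39.78, -89.65),
    "Springfield, Massachusetts": (42.10, -72.59),
    "Springfield, Missouri": (37.21, -93.29),
}


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(h))


def closest_candidate(context_point, candidates=CANDIDATES):
    """Return the candidate name nearest to the context point."""
    return min(candidates,
               key=lambda name: haversine_km(context_point, candidates[name]))


boston = (42.36, -71.06)  # assumed context, e.g. a profile location
print(closest_candidate(boston))
```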




 http://www.flickr.com/photos/scottod/5653885470/
Noise Reduction

                                  ● Slang Cleaner
                                  ● Stop Words
                                  ● Hashtags
                                  ● URLs
                                  ● Mentions




http://www.flickr.com/photos/rarebeasts/4468517649/
Noise Reduction

                                                   ● Slang Cleaner
                                                   ● Stop Words
                                                   ● Hashtags
                                                   ● URLs
                                                   ● Mentions

Oh My God, being woken up to a car on #fire
right outside my window, only in #Springfield!.
http://bitly.com/16KWmdM




              http://www.flickr.com/photos/rarebeasts/4468517649/
Noise Reduction
Oh My God, being woken up to a car on #fire
right outside my window, only in #Springfield!.
http://bitly.com/16KWmdM

● Slang Cleaner, Stop Words
God, woken car #fire window, #Springfield!.
http://bitly.com/16KWmdM

● Hashtags
God, woken car fire window, Springfield!.
http://bitly.com/16KWmdM

● URLs, Mentions
God, woken car fire window, Springfield!.
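
The cleaning steps on this slide can be approximated with a few lines of Python. This is a sketch under stated assumptions: the stop-word list is a tiny hand-picked one chosen to reproduce the slide's example, the regular expressions are simplified, and the slang cleaner is omitted because the slides do not describe it.

```python
import re

# Tiny illustrative stop-word list; a real system would use a full one.
STOP_WORDS = {"oh", "my", "being", "up", "to", "a", "on", "right",
              "outside", "only", "in"}

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")


def reduce_noise(text):
    # 1. Stop words: compare tokens without punctuation or a leading '#'.
    words = [w for w in text.split()
             if w.strip("#.,!?").lower() not in STOP_WORDS]
    text = " ".join(words)
    # 2. Hashtags: keep the word, drop the '#' sign.
    text = text.replace("#", "")
    # 3. URLs and 4. mentions: remove them entirely.
    text = URL_RE.sub("", text)
    text = MENTION_RE.sub("", text)
    # Collapse any whitespace left behind by the removals.
    return " ".join(text.split())


tweet = ("Oh My God, being woken up to a car on #fire right outside "
         "my window, only in #Springfield!. http://bitly.com/16KWmdM")
print(reduce_noise(tweet))
# prints: God, woken car fire window, Springfield!.
```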



              http://www.flickr.com/photos/rarebeasts/4468517649/
Noise Reduction
                                                 ● Slang Cleaner
                                                 ● Stop Words
                                                 ● Hashtags
                                                 ● URLs
                                                 ● Mentions

God, woken car fire window, Springfield!.

               NER Analyser

           Springfield [LOCATION].
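
The input and output of the NER step can be illustrated with a deliberately simple stand-in. The sketch below is not a real NER system: it looks tokens up in a hypothetical mini-gazetteer, whereas a trained NER tool would rely on context and would not need the place name to be listed in advance.

```python
# Hypothetical mini-gazetteer standing in for a trained NER model.
GAZETTEER = {"springfield", "coventry", "bilbao"}


def tag_locations(text):
    """Return (word, 'LOCATION') pairs for tokens found in the gazetteer."""
    tagged = []
    for token in text.split():
        word = token.strip(".,!?")
        if word.lower() in GAZETTEER:
            tagged.append((word, "LOCATION"))
    return tagged


print(tag_locations("God, woken car fire window, Springfield!."))
# prints: [('Springfield', 'LOCATION')]
```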




               http://www.flickr.com/photos/rarebeasts/4468517649/
Noise Reduction Results




   http://www.flickr.com/photos/rarebeasts/4468517649/
Disambiguation




http://www.flickr.com/photos/dougtone/5180424902/
Proposed Solutions

                                          ● Geolocation APIs
                                              ● Nominatim (OSM)
                                              ● GeoNames
                                              ● Google Reverse Geocoder
                                              ● Yahoo GeoPlanet
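
As an example of how one of these services can be queried, the sketch below builds a request for the public Nominatim search endpoint and parses its JSON answer into candidate coordinates. The helper names and the User-Agent string are placeholders, and the slides do not say how the authors called the API; check the service's usage policy before sending real traffic.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

NOMINATIM_SEARCH = "https://nominatim.openstreetmap.org/search"


def build_search_url(place, limit=5):
    """Build a Nominatim search URL for a free-text place name."""
    params = {"q": place, "format": "json", "limit": limit}
    return NOMINATIM_SEARCH + "?" + urlencode(params)


def parse_candidates(payload):
    """Turn a Nominatim JSON array into (name, lat, lon) tuples."""
    return [(item["display_name"], float(item["lat"]), float(item["lon"]))
            for item in json.loads(payload)]


def geocode(place, user_agent="tweet-geolocation-sketch"):
    """Perform the network call; the user_agent value is a placeholder."""
    request = Request(build_search_url(place),
                      headers={"User-Agent": user_agent})
    with urlopen(request, timeout=10) as response:
        return parse_candidates(response.read().decode("utf-8"))


print(build_search_url("Springfield"))
```

A query for an ambiguous name such as "Springfield" typically returns several candidates, which is where the disambiguation step comes in.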




http://www.flickr.com/photos/leehaywood/5047795870/
Proposed Solutions Demo




   http://www.flickr.com/photos/leehaywood/5047795870/
Results
                                              ● NER Tool Comparison

                                              ● Geo-location API Comparison




http://www.flickr.com/photos/rosasay/4675053765/
Conclusions and Future Work


                                      ● Other Social Networks

                                      ● Improve Conversation Graphs

                                      ● Increase Accuracy




      http://www.flickr.com/photos/usnavy/8612336419
Enable tweet-geolocation and don’t drive ERTs crazy!
   Improving situational awareness using Twitter




                                               Juan Sixto Cesteros
                                                 jsixto@deusto.es
                                                   @JuanSixtoC




Enable tweet-geolocation and don’t drive ERTs crazy!
   Improving situational awareness using Twitter



All rights of images are reserved by the original owners*,
        the rest of the content is licensed under a
          Creative Commons by-sa 3.0 license.




                     *See references in each slide
