Detecting, Modeling, & Predicting
    User Temporal Intention
         in Social Media
          Hany M. SalahEldeen
          Old Dominion University

        Advisor: Dr. Michael L. Nelson

       JCDL ’12 Doctoral Consortium
Michael Jackson Dies




                   Snapshot on: June 25th 2009
http://web.archive.org/web/20090625232522/http://www.cnn.com/
Jeff tweets about it…




          Published on: June 25th 2009
https://twitter.com/mdnitehk/status/2333993907
Jenny is off the grid
Jeff’s friend Jenny was on a vacation in Hawaii
for a month…
Jenny starts catching up a month later




                                             Read on: July 26th 2009


When she came back she checked Jeff’s tweets and was
shocked!
          https://twitter.com/mdnitehk/status/2333993907
Jenny follows the link on July 26th




                     CNN page on: July 26th 2009
 http://web.archive.org/web/20090726234411/http://www.cnn.com/
Jenny is confused!
• Implication:
  – Jenny thought Jeff was making a joke about her
    favorite singer, and she got mad at him


• Problem:
  – The tweet and the resource the tweet links to
    have become unsynchronized.
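
The two archive links in this example share one shape: an archive prefix, a 14-digit timestamp, then the original URI. A minimal sketch of building such a link from a share time follows. The function name and the exact share time are illustrative assumptions; the prefix and timestamp layout are read off the URIs shown on the slides.

```python
from datetime import datetime

# Prefix taken from the archive URIs shown on the slides.
WAYBACK_PREFIX = "http://web.archive.org/web/"


def wayback_uri(original_uri: str, when: datetime) -> str:
    """Build an archive URI of the form prefix + YYYYMMDDhhmmss + / + original URI.

    `wayback_uri` is a hypothetical helper name, not part of any library.
    """
    timestamp = when.strftime("%Y%m%d%H%M%S")
    return f"{WAYBACK_PREFIX}{timestamp}/{original_uri}"


# Assumed share time, chosen to match the snapshot timestamp on the slide.
share_time = datetime(2009, 6, 25, 23, 25, 22)
print(wayback_uri("http://www.cnn.com/", share_time))
```

Had Jenny followed a link of this form instead of the live page, she would have seen a copy of CNN.com captured close to the time Jeff tweeted.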
The Egyptian Revolution
Reading about it on Storify in
       March 2012….




     http://storify.com/maq4sure/egypts-revolution
I noticed some shared images are missing




       http://storify.com/maq4sure/egypts-revolution
Some tweets are still intact…




https://twitter.com/miss_amy_qb/status/32477898581483521
…and some lost their meaning with the
    disappearance of the images



       https://twitter.com/aishes/status/32485352102952960
                                                                Missing ?




    https://twitter.com/omar_chaaban/status/32203697597452289
The tweet remains but the shared
      image disappeared…




       http://yfrog.com/h5923xrvbqqvgzj
Cairo….we have a problem
• Implication:
  – The reader cannot understand what the author of
    the tweet meant because the image is not
    available.


• Problem:
  – The post is available but the linked resource
    (image) is completely missing.
The Anatomy of a Tweet
[Figure: an annotated tweet (social post) with these parts labeled]
  – Author’s username
  – Other user mention
  – Tweet body
  – Interaction options
  – Publishing timestamp
  – Shortened URL to resource
  – Hash tag
  – Shared resource
3 URIs = 3 Chances to fail
Explanation in MJ’s example
[Figure: timeline of time points t1, t2, …, tn]
User’s Temporal Intention
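[Figure: grid of share time and click time against implicit and explicit intention]
  – Rows: Share time, Click time
  – Columns: Implicit, Explicit
  – Labels on the grid:
      • The focus of our research
      • Instrumented shortener
      • Instrumented web client
      • Out of our scope: purview of Facebook, Twitter, Google, etc.
      • Engineering problem: solved by providing tools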
Sometimes you want a
       previous version




                 The Correct Temporal
                      Intention

CNN.com at the closest time to the tweet: 25th June 2009 ~ 7pm
Sometimes you want the
      current version




                The Correct Temporal
                     Intention


In this case the current state of the press releases page
Research Question

  Can we estimate the users’
intention at the time of posting
   and reading to predict and
maintain temporal consistency?
Research Goals
• Detect the temporal intention of the:
    1.   The author at sharing time
    2.   The reader at dereferencing time
• Model this intention as a function of time, nature of the resource,
   and its context.
• Predict how resources change with time and the intention behind
   sharing them to minimize inconsistency.
• Implement the prediction model to automatically preserve
   vulnerable social content that is prone to change or loss.
• Create an environment implementing this framework that
   provides smooth temporal navigation of the social web.
Related Work
•   User’s Web Search Intention
     –   A. Ashkan ECIR ’09
     –   C. Lee AINA ’05
     –   A. Loser IRSW ’08
     –   L. Azzopardi ECIR ’09
     –   R. Baeza-Yates SPIRE ’06
     –   N. Dai HT ’11
•   Commercial Intention
     –   Q. Guo SIGIR ’10
     –   A. Benczur AIRWeb ’07
•   Sentiment Analysis
     –   G. Mishne AAAI ’06
     –   J. Bollen JCS ’11
•   Access to Archives
     –   H. Van de Sompel OR ’09
•   Persistence of shared resources
     –   M. Nelson D-Lib ’02
     –   R. Sanderson OR ’11
     –   F. McCown JCDL ’07
•   URL Shortening
     –   D. Antoniades WWW ’11
•   Tweeting, Micro-blogging and Popularity
     –   S. Wu WWW ’11
     –   A. Java SNA-KDD ’07
     –   H. Kwak WWW ’10
•   Social Networks Growth and Evolution
     –   B. Meeder WWW ’11
Dissertation Plan
  BEGIN
          Read Literature
          Collect Datasets
          Analyze Archives Coverage
          Analyze Shortened URIs
          Prototype Application
          Analyze Shared Resources Persistence and Coverage
          Analyze Contextual Intention (current state)

          Create Intention-based dataset
          Extract Intention Features
          Train a Parametric Model to predict intention
          Evaluate, test, cross-validate the model
          Create a mockup application
          Extend the model to induce preservation
          Finish Writing the Dissertation


PhD Defense
Estimating Web Archiving Coverage
• Goal: Estimate how much of the public web is present in the public archives
  and how many copies are available.
• Action:
   – Getting 4 different datasets from 4 different sources:
          •   Search Engines Indices
          •   Bit.ly
          •   DMOZ
          •   Delicious.
• Results:                                         *




• Publications:
     – How much of the web is archived? JCDL '11
* Table Courtesy of Ahmed AlSum JCDL 2011
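
As a rough illustration of the "how many copies" part of this goal, the sketch below counts archived copies (mementos) in a Memento TimeMap given in link format, then reports what share of a URI sample has at least one copy. The sample TimeMap is hand-written for illustration, and the function names are assumptions rather than the tooling used in the JCDL '11 study.

```python
import re
from typing import Dict

# Matches one link-format entry: <URI>; rel="relation values"
LINK_PATTERN = re.compile(r'<([^>]+)>\s*;\s*rel="([^"]*)"')


def count_mementos(timemap_text: str) -> int:
    """Count entries whose rel attribute includes the value "memento".

    rel may hold several space-separated values (for example
    "first memento"), so the attribute is split before the check.
    """
    count = 0
    for _uri, rel in LINK_PATTERN.findall(timemap_text):
        if "memento" in rel.split():
            count += 1
    return count


def archived_fraction(copies_per_uri: Dict[str, int]) -> float:
    """Fraction of sampled URIs that have at least one archived copy."""
    if not copies_per_uri:
        return 0.0
    archived = sum(1 for n in copies_per_uri.values() if n > 0)
    return archived / len(copies_per_uri)


# Hand-written sample TimeMap (illustrative only).
SAMPLE_TIMEMAP = (
    '<http://www.cnn.com/>; rel="original",\n'
    '<http://web.archive.org/web/20090625232522/http://www.cnn.com/>; '
    'rel="memento"; datetime="Thu, 25 Jun 2009 23:25:22 GMT",\n'
    '<http://web.archive.org/web/20090726234411/http://www.cnn.com/>; '
    'rel="memento"; datetime="Sun, 26 Jul 2009 23:44:11 GMT"'
)

print(count_mementos(SAMPLE_TIMEMAP))
```

Running `count_mementos` over one TimeMap per sampled URI and feeding the counts to `archived_fraction` gives the two quantities the goal asks for: coverage and number of copies.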
Shortened URI analysis
•   Goal: Better understand URI shortening and resolution, the effect of time
    on this process, and the correlation between a page’s features and
    characteristics and its resolution.

•   Action:
     – Collect freshly created bit.ly URIs
     – Get hourly click logs, rate of change, social networking spread, and other
       contextual information
     – Longitudinal study

•   Evaluation:
     – Compare results with the frequency-of-change analysis of Cho and
       Garcia-Molina.
     – Compare results with Antoniades et al. WWW 2011.
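
Resolving a shortened URI means following redirects until a final resource is reached. The sketch below walks a redirect chain held in an in-memory table, so it runs without network access. The table, the `short.example` URIs, and the function name are all hypothetical; a real study would fill the table from observed HTTP redirects.

```python
from typing import Dict, List


def resolve_chain(start: str, redirects: Dict[str, str], max_hops: int = 10) -> List[str]:
    """Return the list of URIs visited, starting at `start`.

    A URI that is absent from `redirects` is treated as the final
    resource. The walk also stops on a redirect loop or once
    `max_hops` redirects have been followed.
    """
    chain = [start]
    current = start
    while current in redirects and len(chain) <= max_hops:
        target = redirects[current]
        if target in chain:  # redirect loop: stop rather than spin
            break
        chain.append(target)
        current = target
    return chain


# Hypothetical redirect table: short URI -> intermediate -> final page.
REDIRECTS = {
    "http://short.example/abc": "http://news.example/story?id=1",
    "http://news.example/story?id=1": "http://news.example/story/1",
}

print(resolve_chain("http://short.example/abc", REDIRECTS))
```

Every URI in the returned chain is a point where resolution can break, which is one reason to repeat the resolution over time in a longitudinal study.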
Estimating Loss of Shared Resources
               in Social Media
•   Goal: Estimate how many of the resources shared in social media are still
    available on the current web or in the public archives, and how this changes with time.
•   Action:
     – Sampling from 6 public events
     – Events spanning 3 years
     – Existence in the current web
     – Existence in the public archives
     – Find relation with time
•   Results:
     – After the 1st year, ~11% of shared resources will be lost
     – After that, loss continues at about 0.02% per day
•   Publications:
     – A year after the Egyptian revolution, 10% of the social media documentation is gone.
       http://ws-dl.blogspot.com/2012/02/2012-02-11-losing-my-revolution-year.html
     – Losing my revolution: How Many Resources Shared on Social Media Have Been Lost?
       TPDL '12
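
The two figures in the results can be turned into a quick estimate of loss at a given age. The sketch below uses the slide's numbers (about 11% after the first year, then 0.02% per day). The linear ramp inside the first year is an assumption added only to make the function total; the slide gives no within-year shape.

```python
FIRST_YEAR_LOSS_PCT = 11.0   # from the slide: ~11% lost after the first year
DAILY_LOSS_PCT = 0.02        # from the slide: 0.02% lost per day afterwards
DAYS_PER_YEAR = 365


def estimated_loss_pct(days_since_shared: int) -> float:
    """Estimated percentage of shared resources lost after a number of days."""
    if days_since_shared < 0:
        raise ValueError("days_since_shared must be non-negative")
    if days_since_shared <= DAYS_PER_YEAR:
        # Assumption: loss grows linearly during the first year.
        return FIRST_YEAR_LOSS_PCT * days_since_shared / DAYS_PER_YEAR
    extra_days = days_since_shared - DAYS_PER_YEAR
    return min(100.0, FIRST_YEAR_LOSS_PCT + DAILY_LOSS_PCT * extra_days)


for days in (365, 730, 1095):
    print(days, round(estimated_loss_pct(days), 1))
```

Under these numbers, two years after sharing the estimate is 11 + 0.02 × 365, or roughly 18.3%.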
User Intention Analysis
•   Goal: Better understand user intention and the factors that affect it,
    and create a new training and testing set.

•   Action:
     –   Get a sample set of tweets selected at random
     –   Extract the URIs
     –   Get closest Memento
     –   Download the snapshot & current version
     –   Use Amazon’s Mechanical Turk workers to choose the best version
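
The "get closest Memento" step can be sketched as picking the archived copy whose capture time is nearest to the tweet's publishing time. The capture times below come from the archive URIs earlier in the slides; the tweet time and the function name are assumptions for illustration.

```python
from datetime import datetime
from typing import List


def closest_memento(capture_times: List[datetime], target: datetime) -> datetime:
    """Return the capture time with the smallest absolute distance to `target`."""
    if not capture_times:
        raise ValueError("no archived copies available")
    return min(capture_times, key=lambda captured: abs(captured - target))


# Capture times read from the two archive URIs shown earlier.
CAPTURES = [
    datetime(2009, 6, 25, 23, 25, 22),
    datetime(2009, 7, 26, 23, 44, 11),
]

# Assumed tweet time on June 25th 2009 (the slide gives only the date).
tweet_time = datetime(2009, 6, 25, 22, 0, 0)
print(closest_memento(CAPTURES, tweet_time))
```

The selected snapshot and the current version of the page are then the two candidates shown to the Mechanical Turk workers.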

•   Evaluation:
     – Measure inter-rater agreement and confidence.
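
One common way to quantify agreement between two raters is Cohen's kappa, which corrects raw agreement for agreement expected by chance. The slides do not name a specific statistic, so the sketch below is one possible choice, and the labels and ratings are made up for illustration.

```python
from collections import Counter
from typing import Sequence


def cohens_kappa(rater_a: Sequence[str], rater_b: Sequence[str]) -> float:
    """Cohen's kappa for two raters labeling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty item list")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's label frequencies, summed.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    labels = set(rater_a) | set(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in labels) / (n * n)
    if expected == 1.0:  # both raters used one identical label throughout
        return 1.0
    return (observed - expected) / (1 - expected)


# Made-up ratings: which version each worker judged to match the author's intent.
worker_1 = ["archived", "archived", "current", "current"]
worker_2 = ["archived", "current", "current", "current"]
print(cohens_kappa(worker_1, worker_2))  # prints 0.5
```

With more than two workers per item, a multi-rater statistic such as Fleiss' kappa would be the analogous measure.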
Proposed Work
•   Data Gathering
•   Feature Extraction
•   Modeling the intention engine
•   Evaluation
•   Application: Prediction and Preservation
Possible Solution for Jenny



       The resource has changed since the last time it was shared.
       Do you wish to see the version the author intended or
       the current version?

                      Current Version     Intended Version
Proposed Framework


                                               Archived Version




                 Feature
                                  Classifier
                Extraction

              Example Features:                Current Version

              - Tweet Content
              - Click Logs
              - Other Tweets
              - Shared Resource
              - Timemaps
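
The framework's data flow (feature extraction, then a classifier that picks the archived or current version) can be sketched end to end with toy parts. Everything concrete here is a placeholder: the three features, the cue-word list, and the hand-written rule stand in for the trained model the plan calls for, and the tweet text is invented.

```python
from dataclasses import dataclass

# Hypothetical cue words suggesting the author meant "the page as it is right now".
TIME_SENSITIVE_WORDS = {"breaking", "now", "today", "just"}


@dataclass
class SharedPost:
    tweet_text: str         # tweet content
    resource_changed: bool  # current page differs from the copy nearest share time
    memento_count: int      # archived copies listed in the resource's TimeMap


@dataclass
class Features:
    has_time_cue: bool
    resource_changed: bool
    has_archived_copy: bool


def extract_features(post: SharedPost) -> Features:
    """Turn raw inputs into the (placeholder) feature set."""
    words = {w.strip(".,!?:;#").lower() for w in post.tweet_text.split()}
    return Features(
        has_time_cue=bool(words & TIME_SENSITIVE_WORDS),
        resource_changed=post.resource_changed,
        has_archived_copy=post.memento_count > 0,
    )


def classify(features: Features) -> str:
    """Placeholder rule standing in for a trained classifier."""
    if features.has_archived_copy and features.resource_changed and features.has_time_cue:
        return "archived"
    return "current"


# Invented example post in the spirit of the opening scenario.
post = SharedPost("Breaking: sad news on the front page http://short.example/abc", True, 2)
print(classify(extract_features(post)))
```

Swapping `classify` for a model trained on the Mechanical Turk judgments would leave the rest of the pipeline unchanged.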
Extra Slides
Archive Shortener Application
Estimating Shared Resources Loss in Social Media
My Publications
•   S. G. Ainsworth, A. Alsum, H. SalahEldeen, M. C. Weigle, and M. L. Nelson. How
    much of the web is archived? In Proceedings of the 11th annual international
    ACM/IEEE joint conference on Digital libraries, JCDL '11, pages 133–136, 2011.

•   H. SalahEldeen and M. L. Nelson. Losing my revolution: How much social media
    content has been lost? Accepted at TPDL 2012.

•   H. SalahEldeen and M. L. Nelson. Losing my revolution: A year after the Egyptian
    revolution, 10% of the social media documentation is gone.
    http://ws-dl.blogspot.com/2012/02/2012-02-11-losing-my-revolution-year.html
References
•   D. Antoniades, I. Polakis, G. Kontaxis, E. Athanasopoulos, S. Ioannidis, E. P. Markatos, and T. Karagiannis. we.b: the web of short
    urls. In Proceedings of the 20th international conference on World wide web, WWW '11, pages 715–724, New York, NY, USA,
    2011. ACM.
•   A. Ashkan, C. L. Clarke, E. Agichtein, and Q. Guo. Classifying and characterizing query intent. In Proceedings of the 31st
    European Conference on IR Research on Advances in Information Retrieval, ECIR '09, pages 578–586, Berlin, Heidelberg, 2009.
    Springer-Verlag.
•   L. Azzopardi and M. de Rijke. Query intention acquisition: A case study on automatically inferring structured queries. In
    Proceedings DIR-2006, 2006.
•   R. Baeza-Yates, L. Calderon-Benavides, and C. Gonzalez-Caro. The intention behind web queries. In F. Crestani, P. Ferragina, and
    M. Sanderson, editors, String Processing and Information Retrieval, volume 4209 of Lecture Notes in Computer Science, pages
    98–109. Springer Berlin / Heidelberg, 2006. doi:10.1007/11880561_9.
•   A. Benczur, I. Biro, K. Csalogany, and T. Sarlos. Web spam detection via commercial intent analysis. In Proceedings of the 3rd
    international workshop on Adversarial information retrieval on the web, AIRWeb '07, pages 89–92, New York, NY, USA, 2007.
    ACM.
•   J. Bollen, H. Mao, and X.-J. Zeng. Twitter mood predicts the stock market. CoRR, abs/1010.3003, 2010.
•   N. Dai, X. Qi, and B. D. Davison. Bridging link and query intent to enhance web search. In Proceedings of the 22nd ACM
    conference on Hypertext and hypermedia, HT '11, pages 17–26, New York, NY, USA, 2011. ACM.
•   N. Dai, X. Qi, and B. D. Davison. Enhancing web search with entity intent. In Proceedings of the 20th international conference
    companion on World wide web, WWW '11, pages 29–30, New York, NY, USA, 2011. ACM.
•   K. Durant and M. Smith. Predicting the political sentiment of web log posts using supervised machine learning techniques
    coupled with feature selection. In O. Nasraoui, M. Spiliopoulou, J. Srivastava, B. Mobasher, and B. Masand, editors, Advances in
    Web Mining and Web Usage Analysis, volume 4811 of Lecture Notes in Computer Science, pages 187–206. Springer Berlin /
    Heidelberg, 2007. doi:10.1007/978-3-540-77485-3_11.
•   Q. Guo and E. Agichtein. Ready to buy or just browsing?: detecting web searcher goals from interaction data. In Proceedings of the 33rd
    international ACM SIGIR conference on Research and development in information retrieval, SIGIR '10, pages 130–137, New York, NY, USA,
    2010. ACM.
•   A. Java, X. Song, T. Finin, and B. Tseng. Why we twitter: understanding microblogging usage and communities. In Proceedings of the 9th
    WebKDD and 1st SNA-KDD 2007 workshop on Web mining and social network analysis, WebKDD/SNA-KDD '07, pages 56–65, New York, NY,
    USA, 2007. ACM.
•   H. Kwak, C. Lee, H. Park, and S. Moon. What is twitter, a social network or a news media? In Proceedings of the 19th international
    conference on World wide web, WWW '10, pages 591–600, New York, NY, USA, 2010. ACM.
•   C.-H. L. Lee and A. Liu. Modeling the query intention with goals. In Proceedings of the 19th International Conference on Advanced
    Information Networking and Applications - Volume 2, AINA '05, pages 535–540, Washington, DC, USA, 2005. IEEE Computer Society.
•   A. Loser, W. M. Barczynski, and F. Brauer. What's the intention behind your query? a few observations from a large developer community.
    In IRSW, 2008.
•   F. McCown, N. Diawara, and M. L. Nelson. Factors affecting website reconstruction from the web infrastructure. In JCDL '07: Proceedings of
    the 7th ACM/IEEE-CS Joint Conference on Digital Libraries, pages 39–48, 2007.
•   B. Meeder, B. Karrer, A. Sayedi, R. Ravi, C. Borgs, and J. Chayes. We know who you followed last summer: inferring social link creation times
    in twitter. In Proceedings of the 20th international conference on World wide web, WWW '11, pages 517–526, New York, NY, USA, 2011.
    ACM.
•   G. Mishne. Predicting movie sales from blogger sentiment. In AAAI 2006 Spring Symposium on Computational Approaches to Analysing
    Weblogs (AAAI-CAAW), 2006.
•   M. L. Nelson and B. D. Allen. Object persistence and availability in digital libraries. D-Lib Magazine, 8(1), 2002.
•   R. Sanderson, M. Phillips, and H. Van de Sompel. Analyzing the persistence of referenced web resources with memento. CoRR,
    abs/1105.3459, 2011.
•   H. Van de Sompel, M. L. Nelson, R. Sanderson, L. Balakireva, S. Ainsworth, and H. Shankar. Memento: Time travel for the web. CoRR,
    abs/0911.1112, 2009.
•   S. Wu, J. M. Hofman, W. A. Mason, and D. J. Watts. Who says what to whom on twitter. In Proceedings of the 20th international conference
    on World wide web, WWW '11, pages 705–714, New York, NY, USA, 2011. ACM.

 

Recently uploaded (20)

Bed Making ( Introduction, Purpose, Types, Articles, Scientific principles, N...
Bed Making ( Introduction, Purpose, Types, Articles, Scientific principles, N...Bed Making ( Introduction, Purpose, Types, Articles, Scientific principles, N...
Bed Making ( Introduction, Purpose, Types, Articles, Scientific principles, N...
 
Mule event processing models | MuleSoft Mysore Meetup #47
Mule event processing models | MuleSoft Mysore Meetup #47Mule event processing models | MuleSoft Mysore Meetup #47
Mule event processing models | MuleSoft Mysore Meetup #47
 
clinical examination of hip joint (1).pdf
clinical examination of hip joint (1).pdfclinical examination of hip joint (1).pdf
clinical examination of hip joint (1).pdf
 
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UPLAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
 
spot a liar (Haiqa 146).pptx Technical writhing and presentation skills
spot a liar (Haiqa 146).pptx Technical writhing and presentation skillsspot a liar (Haiqa 146).pptx Technical writhing and presentation skills
spot a liar (Haiqa 146).pptx Technical writhing and presentation skills
 
How to Setup Warehouse & Location in Odoo 17 Inventory
How to Setup Warehouse & Location in Odoo 17 InventoryHow to Setup Warehouse & Location in Odoo 17 Inventory
How to Setup Warehouse & Location in Odoo 17 Inventory
 
Hindi varnamala | hindi alphabet PPT.pdf
Hindi varnamala | hindi alphabet PPT.pdfHindi varnamala | hindi alphabet PPT.pdf
Hindi varnamala | hindi alphabet PPT.pdf
 
Présentationvvvvvvvvvvvvvvvvvvvvvvvvvvvv2.pptx
Présentationvvvvvvvvvvvvvvvvvvvvvvvvvvvv2.pptxPrésentationvvvvvvvvvvvvvvvvvvvvvvvvvvvv2.pptx
Présentationvvvvvvvvvvvvvvvvvvvvvvvvvvvv2.pptx
 
NEWSPAPERS - QUESTION 1 - REVISION POWERPOINT.pptx
NEWSPAPERS - QUESTION 1 - REVISION POWERPOINT.pptxNEWSPAPERS - QUESTION 1 - REVISION POWERPOINT.pptx
NEWSPAPERS - QUESTION 1 - REVISION POWERPOINT.pptx
 
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdfবাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
 
Philippine Edukasyong Pantahanan at Pangkabuhayan (EPP) Curriculum
Philippine Edukasyong Pantahanan at Pangkabuhayan (EPP) CurriculumPhilippine Edukasyong Pantahanan at Pangkabuhayan (EPP) Curriculum
Philippine Edukasyong Pantahanan at Pangkabuhayan (EPP) Curriculum
 
ZK on Polkadot zero knowledge proofs - sub0.pptx
ZK on Polkadot zero knowledge proofs - sub0.pptxZK on Polkadot zero knowledge proofs - sub0.pptx
ZK on Polkadot zero knowledge proofs - sub0.pptx
 
How to deliver Powerpoint Presentations.pptx
How to deliver Powerpoint  Presentations.pptxHow to deliver Powerpoint  Presentations.pptx
How to deliver Powerpoint Presentations.pptx
 
A Independência da América Espanhola LAPBOOK.pdf
A Independência da América Espanhola LAPBOOK.pdfA Independência da América Espanhola LAPBOOK.pdf
A Independência da América Espanhola LAPBOOK.pdf
 
Chapter wise All Notes of First year Basic Civil Engineering.pptx
Chapter wise All Notes of First year Basic Civil Engineering.pptxChapter wise All Notes of First year Basic Civil Engineering.pptx
Chapter wise All Notes of First year Basic Civil Engineering.pptx
 
Traditional Musical Instruments of Arunachal Pradesh and Uttar Pradesh - RAYH...
Traditional Musical Instruments of Arunachal Pradesh and Uttar Pradesh - RAYH...Traditional Musical Instruments of Arunachal Pradesh and Uttar Pradesh - RAYH...
Traditional Musical Instruments of Arunachal Pradesh and Uttar Pradesh - RAYH...
 
Advanced Java[Extra Concepts, Not Difficult].docx
Advanced Java[Extra Concepts, Not Difficult].docxAdvanced Java[Extra Concepts, Not Difficult].docx
Advanced Java[Extra Concepts, Not Difficult].docx
 
Beyond Degrees - Empowering the Workforce in the Context of Skills-First.pptx
Beyond Degrees - Empowering the Workforce in the Context of Skills-First.pptxBeyond Degrees - Empowering the Workforce in the Context of Skills-First.pptx
Beyond Degrees - Empowering the Workforce in the Context of Skills-First.pptx
 
The History of Stoke Newington Street Names
The History of Stoke Newington Street NamesThe History of Stoke Newington Street Names
The History of Stoke Newington Street Names
 
MARY JANE WILSON, A “BOA MÃE” .
MARY JANE WILSON, A “BOA MÃE”           .MARY JANE WILSON, A “BOA MÃE”           .
MARY JANE WILSON, A “BOA MÃE” .
 

Hany's JCDL Doctoral Consortium

  • 1. Detecting, Modeling, & Predicting User Temporal Intention in Social Media Hany M. SalahEldeen Old Dominion University Advisor: Dr. Michael L. Nelson JCDL ‘12 Doctoral Consortium
  • 2. Michael Jackson Dies Snapshot on: June 25th 2009 http://web.archive.org/web/20090625232522/http://www.cnn.com/
  • 3. Jeff tweets about it… Published on: June 25th 2009 https://twitter.com/mdnitehk/status/2333993907
  • 4. Jenny is off the grid Jeff’s friend Jenny was on vacation in Hawaii for a month…
  • 5. Jenny starts catching up a month later Read on: July 26th 2009 When she came back, she checked Jeff’s tweets and was shocked! https://twitter.com/mdnitehk/status/2333993907
  • 6. Jenny follows the link on July 26th CNN page on: July 26th 2009 http://web.archive.org/web/20090726234411/http://www.cnn.com/
  • 7. Jenny is confused! • Implication: – Jenny thought Jeff was making a joke about her favorite singer, and she got mad at him • Problem: – The tweet and the resource the tweet links to have become unsynchronized.
  • 9. Reading about it on Storify in March 2012…. http://storify.com/maq4sure/egypts-revolution
  • 10. I noticed some shared images are missing http://storify.com/maq4sure/egypts-revolution
  • 11. Some tweets are still intact… https://twitter.com/miss_amy_qb/status/32477898581483521
  • 12. …and some lost their meaning with the disappearance of the images https://twitter.com/aishes/status/32485352102952960 Missing ? https://twitter.com/omar_chaaban/status/32203697597452289
  • 13. The tweet remains but the shared image disappeared… http://yfrog.com/h5923xrvbqqvgzj
  • 14. Cairo… we have a problem • Implication: – The reader cannot understand what the author of the tweet meant because the image is not available. • Problem: – The post is available but the linked resource (image) is completely missing.
  • 15. The Anatomy of a Tweet
  • 16. The Anatomy of a Tweet • Social Post • Author’s username • Other user mention • Tweet Body • Hash Tag • Shortened URL to resource • Interaction options • Publishing timestamp • Shared Resource
  • 17. 3 URIs = 3 Chances to fail
  • 19. t3 t4 t5 t7 t8 t9 … tn t1 t2 t6
  • 20. User’s Temporal Intention • Share time: Implicit, Explicit • Click time: Implicit, Explicit • The Focus of our research • Instrumented shortener • Instrumented web client • Out of our scope: – Purview of Facebook, Twitter, Google, …etc – Engineering problem – Solved by providing tools
  • 21. Sometimes you want a previous version The Correct Temporal Intention CNN.com at the closest time to the tweet: 25th June 2009 ~ 7pm
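Requesting the page "at the closest time to the tweet" can be sketched with the Wayback Machine URL pattern that already appears on the earlier slides (http://web.archive.org/web/ followed by a 14-digit timestamp and the original URL). The snippet below is a minimal illustration, not the author's tooling: the function name is hypothetical, the time 19:00 UTC merely stands in for "~7pm" (the slide gives no time zone), and it assumes the archive resolves such a request to the capture nearest the requested timestamp.

```python
from datetime import datetime, timezone

# URL prefix as it appears in the archived CNN links on the slides.
WAYBACK_PREFIX = "http://web.archive.org/web"


def wayback_url_near(uri: str, shared_at: datetime) -> str:
    """Build a Wayback Machine URL for `uri` at the time `shared_at`.

    `shared_at` should be timezone-aware; it is converted to UTC and
    formatted as the 14-digit YYYYMMDDhhmmss stamp used in archive URLs.
    """
    stamp = shared_at.astimezone(timezone.utc).strftime("%Y%m%d%H%M%S")
    return f"{WAYBACK_PREFIX}/{stamp}/{uri}"


# Assumed share time for illustration: 25 June 2009, 19:00 UTC.
tweet_time = datetime(2009, 6, 25, 19, 0, 0, tzinfo=timezone.utc)
print(wayback_url_near("http://www.cnn.com/", tweet_time))
```

The resulting URL has the same shape as the snapshot links shown for June 25th and July 26th 2009; only the timestamp differs.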
  • 22. Sometimes you want the current version The Correct Temporal Intention In this case the current state of the press releases page
  • 23. Research Question Can we estimate the users’ intention at the time of posting and reading to predict and maintain temporal consistency?
  • 24. Research Goals • Detect the temporal intention of: 1. The author at sharing time 2. The reader at dereferencing time • Model this intention as a function of time, the nature of the resource, and its context. • Predict how resources change with time and the intention behind sharing them, to minimize inconsistency. • Implement the prediction model to automatically preserve vulnerable social content that is prone to change or loss. • Create an environment implementing this framework that provides smooth temporal navigation of the social web.
  • 25. Related Work • User’s Web Search Intention – A. Ashkan ECIR ’09 – C. Lee AINA ’05 – A. Loser IRSW ’08 – L. Azzopardi ECIR ’09 – R. Baeza-Yates SPIR ’06 – N. Dai HT ’11 • Commercial Intention – Q. Guo SIGIR ’10 – A. Benczur AIRWeb ’07 • Sentiment Analysis – G. Mishne AAAI ’06 – J. Bollen JCS ’11 • Social Networks Growth and Evolution – B. Meeder WWW ’11 • Persistence of shared resources – M. Nelson D-Lib ’02 – R. Sanderson OR ’11 – F. McCown JCDL ’07 • URL Shortening – D. Antoniades WWW ’11 • Tweeting, Micro-blogging and Popularity – S. Wu WWW ’11 – A. Java SNA-KDD ’07 – H. Kwak WWW ’10 • Access to Archives – H. Van de Sompel OR ’09
  • 26. Dissertation Plan • BEGIN • Read Literature • Collect Datasets • Analyze Archives Coverage • Analyze Shortened URIs • Prototype Application • Analyze Shared Resources Persistence and Coverage • [Current State] • Analyze Contextual Intention • Create Intention-based dataset • Extract Intention Features • Train a Parametric Model to predict intention • Evaluate, test, cross-validate the model • Create a mockup application • Extend the model to induce preservation • Finish Writing the Dissertation • PhD Defense
  • 28. Estimating Web Archiving Coverage • Goal: Estimate how much of the public web is present in the public archives and how many copies are available. • Action: – Collect 4 datasets from 4 different sources: • Search engine indices • Bit.ly • DMOZ • Delicious • Results: * • Publications: – How much of the web is archived? JCDL '11 * Table courtesy of Ahmed AlSum, JCDL 2011
  • 30. Shortened URI analysis • Goal: Better understand URI shortening and resolving, the effect of time on this process, and the correlation between a page’s features and characteristics and its resolution. • Action: – Fresh Bit.lys – Get hourly clicklogs, rate of change, social networking spread, and other contextual information – Longitudinal study • Evaluation: – Compare results with the frequency-of-change analysis of Cho and Garcia-Molina. – Compare results with Antoniades et al. WWW 2011.
  • 32. Estimating Loss of Shared Resources in Social Media • Goal: Estimate how much of the public web is present in the public archives and how many copies are available. • Action: – Sample from 6 public events – Events spanning 3 years – Check existence on the current web – Check existence in the public archives – Find the relation with time • Results: – After the first year, ~11% is lost – After that, we continue to lose 0.02% daily • Publications: – A year after the Egyptian revolution, 10% of the social media documentation is gone. http://ws-dl.blogspot.com/2012/02/2012-02-11-losing-my-revolution-year.html – Losing my revolution: How Many Resources Shared on Social Media Have Been Lost? TPDL '12
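The two reported figures (about 11% lost after the first year, then a further 0.02% per day) can be turned into a rough back-of-the-envelope estimate. The sketch below is only one reading of the slide: it assumes the daily rate is in percentage points and accumulates linearly after day 365, and it declines to guess anything for the first year, since the slide gives no within-year curve. The function name is hypothetical.

```python
def estimated_loss_percent(days_since_shared: int) -> float:
    """Rough percentage of shared resources lost after `days_since_shared` days.

    Assumption (not stated on the slide): loss grows linearly after the
    first year, by 0.02 percentage points per day on top of ~11%.
    """
    if days_since_shared < 365:
        raise ValueError("the slide reports no figure for the first year")
    loss = 11.0 + 0.02 * (days_since_shared - 365)
    return min(loss, 100.0)  # a percentage cannot exceed 100


# Example: two and three years after sharing.
for years in (2, 3):
    print(f"{years} years: ~{estimated_loss_percent(365 * years):.1f}% lost")
```

Under this linear reading, roughly 18% of resources would be missing two years after sharing; a different model (for example, a compounding rate) would give different numbers.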
  • 33. Dissertation Plan BEGIN Read Literature Collect Datasets Analyze Archives Coverage Analyze Shortened URIs Prototype Application Analyze Shared Resources Persistence and Coverage User Intention Analysis Create Intention-based dataset Extract Intention Features Train a Parametric Model to predict intention Evaluate, test, cross-validate the model Create a mockup application Extend the model to induce preservation Finish Writing the Dissertation PhD Defense
  • 34. User Intention Analysis • Goal: Better understand user intention and the factors that affect it, and create a new testing and training set. • Action: – Get a sample set of tweets selected at random – Extract the URIs – Get the closest Memento – Download the snapshot & current version – Use Amazon’s Mechanical Turk to choose the best version • Evaluation: – Measure cross-rater agreement and confidence.
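The evaluation step above mentions measuring cross-rater agreement. One common statistic for this is Cohen's kappa, which corrects the raw agreement between two raters for agreement expected by chance. The slides do not say which measure is used, so the sketch below is an illustrative assumption; the labels "archived" and "current" and the sample ratings are made up for the example.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty item list")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings: which version of the resource fits each tweet best?
worker_1 = ["archived", "archived", "current", "current"]
worker_2 = ["archived", "current", "current", "current"]
print(cohens_kappa(worker_1, worker_2))
```

With more than two raters per item, as is typical on Mechanical Turk, a multi-rater statistic such as Fleiss' kappa would be the more natural choice.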
  • 35. Proposed Work • Data Gathering • Feature Extraction • Modeling the intention engine • Evaluation • Application: Prediction and Preservation
  • 37. Possible Solution for Jenny • The resource has changed since the last time it was shared. Do you wish to see the version the author intended or the current version? • Current Version • Intended Version
  • 38. Proposed Framework • Archived Version • Current Version • Feature Extraction • Classifier • Example Features: – Tweet Content – Click Logs – Other Tweets – Shared Resource – Timemaps
  • 42. Estimating Shared Resources Loss in Social Media
  • 44. My Publications • S. G. Ainsworth, A. Alsum, H. SalahEldeen, M. C. Weigle, and M. L. Nelson. How much of the web is archived? In Proceedings of the 11th annual international ACM/IEEE joint conference on Digital libraries, JCDL '11, pages 133-136, 2011. • H. SalahEldeen and M. L. Nelson. Losing my revolution: How much social media content has been lost? Accepted at TPDL 2012. • H. SalahEldeen and M. L. Nelson. Losing my revolution: A year after the Egyptian revolution, 10% of the social media documentation is gone. http://ws-dl.blogspot.com/2012/02/2012-02-11-losing-my-revolution-year.html.
  • 45. References • D. Antoniades, I. Polakis, G. Kontaxis, E. Athanasopoulos, S. Ioannidis, E. P. Markatos, and T. Karagiannis. we.b: the web of short urls. In Proceedings of the 20th international conference on World wide web, WWW '11, pages 715-724, New York, NY, USA, 2011. ACM. • A. Ashkan, C. L. Clarke, E. Agichtein, and Q. Guo. Classifying and characterizing query intent. In Proceedings of the 31st European Conference on IR Research on Advances in Information Retrieval, ECIR '09, pages 578-586, Berlin, Heidelberg, 2009. Springer-Verlag. • L. Azzopardi and M. de Rijke. Query intention acquisition: A case study on automatically inferring structured queries. In Proceedings DIR-2006, 2006. • R. Baeza-Yates, L. Calderon-Benavides, and C. Gonzalez-Caro. The intention behind web queries. In F. Crestani, P. Ferragina, and M. Sanderson, editors, String Processing and Information Retrieval, volume 4209 of Lecture Notes in Computer Science, pages 98-109. Springer Berlin / Heidelberg, 2006. 10.1007/11880561_9. • A. Benczur, I. Biro, K. Csalogany, and T. Sarlos. Web spam detection via commercial intent analysis. In Proceedings of the 3rd international workshop on Adversarial information retrieval on the web, AIRWeb '07, pages 89-92, New York, NY, USA, 2007. ACM. • J. Bollen, H. Mao, and X.-J. Zeng. Twitter mood predicts the stock market. CoRR, abs/1010.3003, 2010. • N. Dai, X. Qi, and B. D. Davison. Bridging link and query intent to enhance web search. In Proceedings of the 22nd ACM conference on Hypertext and hypermedia, HT '11, pages 17-26, New York, NY, USA, 2011. ACM. • N. Dai, X. Qi, and B. D. Davison. Enhancing web search with entity intent. In Proceedings of the 20th international conference companion on World wide web, WWW '11, pages 29-30, New York, NY, USA, 2011. ACM. • K. Durant and M. Smith. Predicting the political sentiment of web log posts using supervised machine learning techniques coupled with feature selection. In O. Nasraoui, M. Spiliopoulou, J. Srivastava, B. Mobasher, and B. Masand, editors, Advances in Web Mining and Web Usage Analysis, volume 4811 of Lecture Notes in Computer Science, pages 187-206. Springer Berlin / Heidelberg, 2007. 10.1007/978-3-540-77485-3_11.
  • 46. References • Q. Guo and E. Agichtein. Ready to buy or just browsing?: detecting web searcher goals from interaction data. In Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval, SIGIR '10, pages 130-137, New York, NY, USA, 2010. ACM. • A. Java, X. Song, T. Finin, and B. Tseng. Why we twitter: understanding microblogging usage and communities. In Proceedings of the 9th WebKDD and 1st SNA-KDD 2007 workshop on Web mining and social network analysis, WebKDD/SNA-KDD '07, pages 56-65, New York, NY, USA, 2007. ACM. • H. Kwak, C. Lee, H. Park, and S. Moon. What is twitter, a social network or a news media? In Proceedings of the 19th international conference on World wide web, WWW '10, pages 591-600, New York, NY, USA, 2010. ACM. • C.-H. L. Lee and A. Liu. Modeling the query intention with goals. In Proceedings of the 19th International Conference on Advanced Information Networking and Applications - Volume 2, AINA '05, pages 535-540, Washington, DC, USA, 2005. IEEE Computer Society. • A. Loser, W. M. Barczynski, and F. Brauer. What's the intention behind your query? a few observations from a large developer community. In IRSW, 2008. • F. McCown, N. Diawara, and M. L. Nelson. Factors affecting website reconstruction from the web infrastructure. In JCDL '07: Proceedings of the 7th ACM/IEEE-CS Joint Conference on Digital Libraries, pages 39-48, 2007. • B. Meeder, B. Karrer, A. Sayedi, R. Ravi, C. Borgs, and J. Chayes. We know who you followed last summer: inferring social link creation times in twitter. In Proceedings of the 20th international conference on World wide web, WWW '11, pages 517-526, New York, NY, USA, 2011. ACM. • G. Mishne. Predicting movie sales from blogger sentiment. In AAAI 2006 Spring Symposium on Computational Approaches to Analysing Weblogs (AAAI-CAAW), 2006. • M. L. Nelson and B. D. Allen. Object persistence and availability in digital libraries. D-Lib Magazine, 8(1), 2002. • R. Sanderson, M. Phillips, and H. Van de Sompel. Analyzing the persistence of referenced web resources with memento. CoRR, abs/1105.3459, 2011. • H. Van de Sompel, M. L. Nelson, R. Sanderson, L. Balakireva, S. Ainsworth, and H. Shankar. Memento: Time travel for the web. CoRR, abs/0911.1112, 2009. • S. Wu, J. M. Hofman, W. A. Mason, and D. J. Watts. Who says what to whom on twitter. In Proceedings of the 20th international conference on World wide web, WWW '11, pages 705-714, New York, NY, USA, 2011. ACM.