Thou Shalt not Share Collections of Tweets: Should we give a TOS?

A conversation about Twitter's recent moves to enforce aspects of its API TOS to prohibit online research services archives for download. This was informed by recent discussion on the AoIR mailing list and my own experiences.

Published in: Technology

Transcript

  • 1. Thou Shalt not Share Collections of Tweets: Should we give a TOS?
  • 2. “Thou Shalt not Share Collections ...”
    Interest sparked by AoIR discussion
    Post by Prof Stuart Shulman on May 5th
  • 3. The Original Post (OP)
    [Posted: Thu May 5 05:24:10 PDT 2011]
  • 4. What Twitter said
  • 5. 5
  • 6. Twitter-History a.k.a. ‘Twistory’
    “We hope Twitter will realize the value of enabling researchers, journalists and citizens better ways to search, sort and analyze clusters of this important historical information.”
  • 7. Twitter appears to think so too!
  • 8. Twitter says “desist!”
    Prohibited other services from offering archives (for download):
    E.g., 140kit, TwapperKeeper, DiscoverText, ...
    Shut down 3rd party clients (Twidroyd & UberTwitter) for:
    Private Direct Messages longer than 140 characters
    Trademark infringement
    Changing the content of users' Tweets in order to make money
  • 9. Twitter responds ...
    “... abide by a simple set of rules that are in the interests of our users, as well as the health and vitality of the platform as a whole.”
    “... on an average day we turn off more than one hundred services that violate our API rules of the road.”
    “You can download Twitter for Blackberry, Twitter for Android and other official Twitter apps here. You can also try our mobile web site or apps from other third-party developers.”
  • 10. Why now?
  • 11. Perspectives:
    Online social messaging service (user)
    Open ecosystem infrastructure (developer)
    Historical social record (researchers)
    Post “tweets” with max. 140 characters in real-time
    Publicly accessible (cf. CB radios) with some privacy
    Provides search (limited)
    Uses & develops open-source software (e.g., Cassandra, Lucene, FlockDB, ...)
  • 12. 12
  • 13. Some Twitter numbers
    Valuation: $4 billion (January 2011)
    Investment: $360 million in total ($200 million of it in Dec 2010)
    Employees: 400 (Jan 2011), of whom 200 are engineers
    Revenue: advertising estimated at $150 million for 2011
    No. of tweets: 140-150 million per day
    Users/Accounts: 200 million (approx.)
    Website ranking: Top 10 to Top 20
    Twitter search: One billion queries per day
  • 14. 2006 (late)-2008
  • 15. 2009-2010
  • 16. 2011
  • 17. A quick aside ...
  • 18. Twitter Research
    Services: 140kit, TwapperKeeper, DiscoverText, The Archivist, ...
    Some hundreds of publications
    Areas:
    Social network analysis, recommendations systems, social influence, user sentiment, business strategy, disaster prediction & alerts, education, software engineering, politics, ...
    Using:
    Content analysis (narrative), ethnography, SVMs, TextRank, TFIDF, BoW, POS, ...
  • 19. The Twitter API
    REST API uses HTTP protocol
    All website features supported through API
    Programming libraries available
    Rate limiting (user & IP):
    Anonymous: 150 requests per hour
    OAuth: 350 requests per hour
    Whitelisted: e.g., 20,000 requests per hour
    Streaming offerings:
    Spritzer (1%)
    Gardenhose (10%)
    Firehose (100%)
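To give a feel for what these limits mean for a researcher collecting tweets, here is a minimal arithmetic sketch in Python. It only restates the figures on the slide; it does not call the Twitter API. The function names are hypothetical, the whitelist figure is assumed to be per hour like the other tiers, and the default daily volume of 140 million tweets is the lower bound quoted on the "Some Twitter numbers" slide.

```python
# Figures taken from the slide; names are illustrative, not part of any Twitter library.
REQUESTS_PER_HOUR = {
    "anonymous": 150,
    "oauth": 350,
    "whitelisted": 20_000,  # assumption: the slide's "20,000 requests" is per hour
}

# Share of the full public stream delivered by each streaming level, in percent.
STREAM_PERCENT = {"spritzer": 1, "gardenhose": 10, "firehose": 100}


def requests_per_day(tier: str) -> int:
    """Maximum REST requests in 24 hours if the hourly quota is fully used."""
    return REQUESTS_PER_HOUR[tier] * 24


def seconds_between_requests(tier: str) -> float:
    """Steady polling interval that exactly exhausts the hourly quota."""
    return 3600 / REQUESTS_PER_HOUR[tier]


def expected_stream_tweets(level: str, tweets_per_day: int = 140_000_000) -> int:
    """Rough number of tweets per day a streaming level would deliver."""
    return tweets_per_day * STREAM_PERCENT[level] // 100


if __name__ == "__main__":
    for tier in REQUESTS_PER_HOUR:
        print(tier, requests_per_day(tier), round(seconds_between_requests(tier), 2))
    for level in STREAM_PERCENT:
        print(level, expected_stream_tweets(level))
```

On these assumptions, an anonymous client can poll at most once every 24 seconds, while even the 1% Spritzer sample would deliver on the order of a million tweets a day.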
  • 20. General Terms of Service (Nov 2010)
    Under “Your Rights”:
    “... You grant us a worldwide, non-exclusive, royalty-free license (with the right to sublicense) to use, copy, reproduce, process, adapt, modify, publish, transmit, display and distribute such Content in any and all media or distribution methods (now known or later developed).”
  • 21. TOS tips
    “This license is you authorizing us to make your Tweets available to the rest of the world and to let others do the same. But what’s yours is yours – you own your content.”
    “Twitter has an evolving set of rules for how API developers can interact with your content. These rules exist to enable an open ecosystem with your rights in mind.”
  • 22. API TOS (Feb 2011)
    Access to Twitter Content:
    You will not attempt or encourage others to:
    sell, rent, lease, sublicense, redistribute, or syndicate the Twitter API or Twitter Content to any third party for such party to develop additional products or services without prior written approval from Twitter
    Content = “All use of the Twitter API and content, documentation, code, and related materials made available to you on or through Twitter.”
  • 23. Authorised resyndication = GNIP
    First authorized reseller of Twitter data, Nov 2010
    Offerings:
    Halfhose (50%, $30k / mo)
    Decahose (10%, $5k / mo)
    Power Track ($0.10 per 1,000 Tweets)
    Link Stream ($50k / mo)
    User Mention Stream ($20k / mo)
    Keyword Search
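As a rough illustration of what these prices imply for a research budget, the sketch below compares the per-tweet Power Track rate with the flat monthly streams. It uses only the prices listed on the slide; the function names are hypothetical. The comparison is indicative only, since it assumes a filtered Power Track collection is an acceptable substitute for a sampled stream.

```python
# Prices as listed on the slide (USD). Names are illustrative only.
POWER_TRACK_CENTS_PER_1000_TWEETS = 10  # $0.10 per 1,000 tweets
MONTHLY_FEE_USD = {
    "halfhose": 30_000,
    "decahose": 5_000,
    "link_stream": 50_000,
    "user_mention_stream": 20_000,
}


def power_track_cost_usd(n_tweets: int) -> float:
    """Cost of collecting n_tweets at the per-1,000-tweet Power Track rate."""
    return n_tweets * POWER_TRACK_CENTS_PER_1000_TWEETS / 1000 / 100


def break_even_tweets(product: str) -> int:
    """Monthly tweet volume at which Power Track costs as much as a flat-fee stream."""
    fee_cents = MONTHLY_FEE_USD[product] * 100
    return fee_cents * 1000 // POWER_TRACK_CENTS_PER_1000_TWEETS


def annual_cost_usd(product: str) -> int:
    """Twelve months of a flat-fee stream."""
    return MONTHLY_FEE_USD[product] * 12


if __name__ == "__main__":
    print(power_track_cost_usd(1_000_000))  # one million tweets
    print(break_even_tweets("decahose"))
    print(annual_cost_usd("decahose"))
```

At the listed rate, a one-million-tweet dataset costs $100 via Power Track, and the Decahose only becomes the cheaper option above 50 million tweets per month.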
  • 24. Potential consequences
    Obstruct peer review of datasets
    Prevent researchers from getting access to data (in a timely way, if at all)
    Stifle innovations (most come from user community & 3rd party developers!)
    Users become more cautious about using social media
    Twitter becomes less useful (protest, reporting, ...)
    Twitter services become hacking targets (unreliable, unstable, slow, ...)
    Social science researchers twiddle their thumbs
  • 25. One solution ...
    One solution?
  • 26. Talking points
    Is there a problem here?
    Does Twitter have any obligation to users, developers & researchers?
    Is it worthwhile (or even ethical) to violate Twitter’s TOS to get access to researchable data?
    Should users’ content even be available to researchers?
  • 27. Thanks!