Social Media Collecting
with Social Feed Manager
(in 10 minutes)
Collecting social media from API vs. website
Advantages of API:
● Structured data (typically JSON).
● Tend to be stable.
● Some provide more metadata than
available from website.
● Data can be collected efficiently.
Disadvantages of API:
● Not all social media platforms have
complete, public APIs.
● Data not readily human-viewable.
● Each API is different.
● Some platforms (notably Twitter) limit
data sharing.
● There may be greater limitations on
the data available from an API,
especially historical data.
A tweet
https://gist.github.com/justinlittman/462a398d161002a8caff0905bf4e5f7f
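To illustrate the "structured data (typically JSON)" point, here is a minimal sketch of reading a few fields from a tweet record in Python. The sample record is invented for illustration and is not the tweet in the gist above. The field names (created_at, id_str, text, user.screen_name, entities.hashtags) are assumed to follow the Twitter REST API v1.1 tweet format, and summarize_tweet is a hypothetical helper, not part of SFM.

```python
import json
from datetime import datetime

# Hypothetical, heavily trimmed tweet record (an assumption for illustration).
# A real tweet from the API carries many more fields than these.
SAMPLE_TWEET = """
{
  "created_at": "Sat Jan 21 17:05:00 +0000 2017",
  "id_str": "100000000000000001",
  "text": "Collecting social media with Social Feed Manager #sfm",
  "user": {"screen_name": "example_user", "followers_count": 42},
  "entities": {"hashtags": [{"text": "sfm", "indices": [49, 53]}]}
}
"""

# Timestamp layout assumed for the "created_at" field.
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def summarize_tweet(raw_json):
    """Parse one tweet (a JSON string) and return a small flat summary dict."""
    tweet = json.loads(raw_json)
    created = datetime.strptime(tweet["created_at"], CREATED_AT_FORMAT)
    return {
        "id": tweet["id_str"],
        "screen_name": tweet["user"]["screen_name"],
        "created_at": created.isoformat(),
        "text": tweet["text"],
        # Hashtags arrive pre-extracted, so no text parsing is needed.
        "hashtags": [tag["text"] for tag in tweet["entities"]["hashtags"]],
    }


if __name__ == "__main__":
    print(summarize_tweet(SAMPLE_TWEET))
```

The same record is not readily human-viewable as raw JSON, which is the trade-off noted under the disadvantages: the structure helps programs, not casual readers.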
Social Feed Manager (SFM)
● Open-source software.
● Developed by GW Libraries with grant from National Historical Publications &
Records Commission.
● Collects social media from APIs of Twitter, Tumblr, Flickr & Sina Weibo. Also
collects web resources.
● Supports requirements of researchers and archivists/librarians.
○ Collect to answer specific research questions.
○ Proactively collect to support future research.
● Intended to be provided as a service to the members of a community (as
opposed to individual use).
Example collections
● 2016 U.S. election
○ 280 million tweets
○ Separate collections for
candidates, debates,
conventions
○ Published to Harvard Dataverse
● End of Term (EOT) collection
○ 3000 Twitter accounts
○ 70 Tumblr blogs
○ Continuing as 2017-2020
Federal Term collection
● Women’s March
● Trump Administration officials
● 115th U.S. Congress
● News outlets
● Chinese anti-corruption
○ Sina Weibo and Twitter
● ISIS-related Twitter users
● Latin American political leaders
● George Washington University
○ Official accounts and student
groups
Social Feed Manager
@SocialFeedMgr
sfm@gwu.edu
http://go.gwu.edu/sfm
Justin Littman
@justin_littman
justinlittman@gwu.edu

Social Feed Manager presentation at Archives Unleashed 3.0
