Presentation as part of panel, “Social media, web archiving, and digital libraries,” Web Archives and Digital Libraries workshop, Joint Conference on Digital Libraries (JCDL), June 23, 2016.
Social Feed Manager, WADL/JCDL 2016
1. Social Feed Manager
Laura Wrubel
@liblaura @SocialFeedMgr
http://go.gwu.edu/sfm
Web Archives and Digital Libraries workshop, JCDL 2016
Social Feed Manager is supported by the National Historical Publications & Records Commission
2. Allows users to create collections of data
from social media platforms
7. Research documentation (for researchers)
≈ provenance metadata (for archivists)
(and it’s really important for both)
8. Creation
Authoring of the social media
● Creation metadata is provided by Twitter as JSON via API.
● Social media user metadata:
○ Screen name
○ Date account created
○ Location
● Tweet metadata:
○ Date
○ Tweet text
○ Mentions
○ Hashtags
○ URLs
○ Source (how posted)
● SFM records it in WARC files.
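As a rough illustration of the creation metadata listed above, the sketch below pulls those fields out of a tweet object. It assumes the JSON shape of Twitter's REST API v1.1 tweet objects; the sample tweet and the function name `creation_metadata` are hypothetical and are not taken from SFM.

```python
# Hypothetical, hand-written sample in the shape of a v1.1 tweet object.
SAMPLE_TWEET = {
    "created_at": "Thu Jun 23 14:00:00 +0000 2016",
    "text": "Talking about #webarchiving with @example_account http://example.org/",
    "source": "Twitter Web Client",
    "user": {
        "screen_name": "example_user",
        "created_at": "Mon Jan 05 09:30:00 +0000 2009",
        "location": "Washington, DC",
    },
    "entities": {
        "user_mentions": [{"screen_name": "example_account"}],
        "hashtags": [{"text": "webarchiving"}],
        "urls": [{"expanded_url": "http://example.org/"}],
    },
}


def creation_metadata(tweet):
    """Collect the user and tweet fields named on the slide into one dict."""
    user = tweet.get("user", {})
    entities = tweet.get("entities", {})
    return {
        # Social media user metadata
        "screen_name": user.get("screen_name"),
        "account_created": user.get("created_at"),
        "location": user.get("location"),
        # Tweet metadata
        "date": tweet.get("created_at"),
        "text": tweet.get("text"),
        "mentions": [m["screen_name"] for m in entities.get("user_mentions", [])],
        "hashtags": [h["text"] for h in entities.get("hashtags", [])],
        "urls": [u["expanded_url"] for u in entities.get("urls", [])],
        "source": tweet.get("source"),
    }


if __name__ == "__main__":
    for field, value in creation_metadata(SAMPLE_TWEET).items():
        print(f"{field}: {value}")
```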
9. Selection
Decisions by the SFM user that lead SFM to harvest the tweet
Recorded in the SFM database
● Collection information
○ Harvest type
○ Harvest options (e.g., incremental, harvest web resources)
○ Credentials (API keys)
○ Description of collection
● Seeds for the collection (which vary by platform)
○ Screen name
○ UID
○ Keywords to filter on
● Change log
○ Change note
○ Fields changed
○ User who made change
○ Date of change
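One way to picture the change log above is as a record built by comparing a collection's settings before and after an edit. This is a minimal sketch under that assumption; the function `change_log_entry`, the field names, and the example settings are hypothetical and do not describe SFM's actual database schema.

```python
from datetime import datetime, timezone


def change_log_entry(old, new, user, note, when=None):
    """Build one change-log record from two versions of a collection's settings."""
    # Record every key whose value differs, as (old value, new value).
    fields_changed = {
        key: (old.get(key), new.get(key))
        for key in sorted(set(old) | set(new))
        if old.get(key) != new.get(key)
    }
    when = when or datetime.now(timezone.utc)
    return {
        "change_note": note,
        "fields_changed": fields_changed,
        "user": user,
        "date": when.isoformat(),
    }


if __name__ == "__main__":
    before = {"harvest_type": "user timeline", "incremental": True}
    after = {"harvest_type": "user timeline", "incremental": False}
    print(change_log_entry(before, after, "example_user", "Re-harvest full timelines"))
```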
10. Collection
How SFM retrieved the tweet from Twitter’s API
● Collection metadata is received by SFM’s Twitter harvester & recorded
within WARCs.
● WARCs include the exact HTTP request/response
○ URL with params such as user account id or keywords
○ HTTP headers
● WARC record headers also include:
○ Date WARC record created
○ Server information
○ Fixity (digest values)
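To make the WARC record headers above concrete, here is a minimal sketch that assembles the named headers for one HTTP response using only the Python standard library. The header names come from the WARC 1.0 format; the function names, the user id in the URL, and the IP address are illustrative assumptions, and this is not how SFM itself writes WARCs.

```python
import base64
import hashlib
import uuid
from datetime import datetime, timezone


def sha1_base32(data):
    """Return a digest label in the common 'sha1:<base32>' form."""
    return "sha1:" + base64.b32encode(hashlib.sha1(data).digest()).decode("ascii")


def warc_response_headers(target_uri, http_response, ip_address, when=None):
    """Build WARC headers for a raw HTTP response (status line, headers, body)."""
    when = (when or datetime.now(timezone.utc)).astimezone(timezone.utc)
    # The payload is the HTTP body, i.e. everything after the first blank line.
    _, _, payload = http_response.partition(b"\r\n\r\n")
    return [
        ("WARC-Type", "response"),
        ("WARC-Record-ID", f"<urn:uuid:{uuid.uuid4()}>"),
        ("WARC-Date", when.strftime("%Y-%m-%dT%H:%M:%SZ")),    # date record created
        ("WARC-Target-URI", target_uri),                       # URL with params
        ("WARC-IP-Address", ip_address),                       # server information
        ("WARC-Block-Digest", sha1_base32(http_response)),     # fixity of the block
        ("WARC-Payload-Digest", sha1_base32(payload)),         # fixity of the body
        ("Content-Type", "application/http; msgtype=response"),
        ("Content-Length", str(len(http_response))),
    ]


if __name__ == "__main__":
    # Hypothetical request URL and response; the user id is made up.
    uri = "https://api.twitter.com/1.1/statuses/user_timeline.json?user_id=12345"
    response = b"HTTP/1.1 200 OK\r\nContent-Type: application/json\r\n\r\n[]"
    for name, value in warc_response_headers(uri, response, "192.0.2.1"):
        print(f"{name}: {value}")
```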
11. Collection (cont.)
● WARC file metadata, recorded in the SFM database:
○ File location
○ File size
○ Fixity
○ Creation date
● Harvest metadata:
○ Date
○ Collection
○ Date harvest started
○ Date harvest ended
○ Messages (informational, warning, or error)
○ Token/seed updates
○ Basic stats on number of items collected
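The WARC file metadata above (location, size, fixity, creation date) can be gathered with the standard library alone. The sketch below is one possible approach, not SFM's implementation: the function name is hypothetical, SHA-1 is an assumed default, and the file's modification time stands in for a creation date because a true creation time is not available on every platform.

```python
import hashlib
import os
from datetime import datetime, timezone


def warc_file_metadata(path, algorithm="sha1", chunk_size=1024 * 1024):
    """Return location, size, fixity, and an approximate creation date for a file."""
    digest = hashlib.new(algorithm)
    # Hash in chunks so large WARC files are never loaded into memory at once.
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    stat = os.stat(path)
    return {
        "location": os.path.abspath(path),
        "size": stat.st_size,
        "fixity": f"{algorithm}:{digest.hexdigest()}",
        # Assumption: modification time is used as a proxy for creation time.
        "created": datetime.fromtimestamp(stat.st_mtime, timezone.utc).isoformat(),
    }
```

Recomputing the digest later and comparing it with the stored value is the usual way to confirm that a file has not changed since harvest.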
13. How is this useful? http://bit.ly/tweet-prov
● Which of this provenance metadata do you (researcher,
archivist, librarian, etc.) want access to?
● How do you want access to this metadata? In SFM’s UI? In
reports when exports are created? Exposed via SFM’s
software libraries? A REST API? Machine-readable?
Human-readable?
● What metadata have we missed?
● Do the answers to the previous questions vary by discipline
(e.g., humanities, social science, etc.)?
● Are there other relevant specifications or standards that we
should consider? Is there value in a mapping to or providing
output in accordance with metadata standards such as
PREMIS or PROV?