Media Ecology Project slides from Open Repositories 2015

Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad

Check these out next

1 of 25 Ad

More Related Content

Slideshows for you (20)

Similar to Media Ecology Project slides from Open Repositories 2015 (20)

Advertisement

Recently uploaded (20)

Advertisement

Media Ecology Project slides from Open Repositories 2015

  1. Online Access for Scholars Adds Value to Media Archives • John Bell • john.p.bell@dartmouth.edu • @nmdjohn • Mark J Williams • mark.j.williams@dartmouth.edu • @markj2
  2. ARCHIVES • Dartmouth’s institutional repository • Hydra/Fedora + Open Linked Data • Doesn’t actually exist yet
  3. BOUNCE • Users Stay / Users Leave
  4. REPO WORLD PROBLEMS • http://www.gettyimages.com/detail/88172463 • http://www.gettyimages.com/detail/157405041 • http://www.gettyimages.com/detail/86438099
  5. WHAT IS MEP?
  6. METADATA • DC: Descriptive metadata and media ID • FOAF: Contributors • SKOS: Shared vocabularies • OA: Time-based open annotations • RDF: Encoding (XML and JSON)
  7. APPLICATIONS • Classroom annotation environment • Metadata sharing server • Semantic web publishing platform • Collaborative vocabulary builder
  8. ONOMY DEVELOPMENT
  9. SCALAR DEVELOPMENT
  10. SCALAR DEVELOPMENT
  11. SCALAR DEVELOPMENT
  12. SCALAR DEVELOPMENT
  13. MEDIATHREAD DEVELOPMENT
  14. CURRENT MODEL • User clicks the Mediathread bookmarklet on a Library of Congress video • The bookmarklet scrapes metadata from the Library of Congress page and stores it in Mediathread’s local database • Scalar embeds media in a similar way, but uses search APIs rather than a bookmarklet and scraping
  15. NEW MODEL • User loads a controlled vocabulary into Mediathread to use for tagging • Later, Scalar will use the same vocabulary to help import Mediathread’s annotations into a Scalar book
  16. SIDELOADED METADATA • Mediathread sends a second request to MEP for any metadata about the video, then adds the result to its database • The user never knows MEP was queried
  17. ADDED METADATA • The user and colleagues add annotations, commentary, tags, etc. to the video in Mediathread • When the user publishes their work, Mediathread sends the user’s new metadata to MEP
  18. ADDED METADATA • The user’s metadata is added to what MEP knows about the video and what it sends to Mediathread next time that video is loaded • If the Library of Congress can accept the new metadata, MEP will send it back to enhance the original source’s records
  19. CONNECTIVITY • Later, the user decides they want to use Scalar’s presentation tools to publish the annotations they made in Mediathread • When the user imports the same piece of media into Scalar, it queries MEP to see if it has any annotations for that URL and imports any that it finds
  20. PILOT: PAPER PRINT COLLECTION • Rare early cinema Paper Print collection from the Library of Congress • Scholar participants from DOMITOR (international society for the study of early cinema)
  21. PILOT: HISTORICAL NEWS MEDIA • Multiple archive sources including WGBH, University of South Carolina, UCLA, and University of Georgia/Peabody Award Collection • Featuring newsfilm from both motion picture and television news sources • Scholars include Mark Cooper from University of South Carolina and Ross Melnick from UC Santa Barbara
  22. PILOT: IN THE LIFE • Famous public television program that assayed gay and lesbian life in the United States • Includes all programs plus all B-roll, interviews, etc.
  23. PILOT: FILMS DIVISION • Films Division has produced state-sponsored documentary, informational, and experimental cinema since India’s independence in 1947 • Developing scholarly participation in both India and the US
  24. AFFILIATED PROJECTS • ACTION • Red Hen Lab • Visual Learning Group • Spehr/Dalton Film Data Rescue • Archival Memory Studies Project
  25. John Bell • john.p.bell@dartmouth.edu • @nmdjohn • Mark J Williams • mark.j.williams@dartmouth.edu • @markj2

Editor's Notes

  • Going to start by talking about a completely different project: Dartmouth Academic Commons. Dartmouth’s IR. In designing the IR we of course looked around to see what other people were doing and what we might expect to see in terms of utilization. One stat that consistently stood out to me was the bounce rate: how many people come to an IR from a search engine and then immediately leave after only viewing one page.
  • I’ve seen different rates at different places, but in looking at public stats from different IRs a pretty typical number seems to be 75-85%. On the spectrum of web sites, that’s quite high. Why? Users leave because we’ve turned over responsibility for contextualizing our content to Google. A repository is seen as an end point in an external search – it’s the shelf where you store the PDFs and MP4s. For DAC, we wanted to change this number. We don’t want to let Google or Bing’s algorithm set context for the work in our repository. We need to create strong ties between different items in the repo and show them to users in a compelling enough way that they don’t treat each piece of content as a discrete item. Google might show them the first thing they look at, but we want them to see how that is tied to other things we have. Doing so promotes understanding of the work and, not incidentally, the value of Dartmouth as an institution.
  • So what do you need to set context, and why aren’t we doing it strongly enough now? A few options:
    Not enough content. Your repo or archive may simply not have enough volume, or enough related volume, to establish a strong contextual relationship between items.
    Not enough metadata. There are lots of archives and repos out there that, for various reasons, simply don’t have a lot of information about the content that’s in them. Without that, you can’t set context.
    Not the right type of metadata. Many archives and repos do see themselves as the shelf for the PDFs and thus are more concerned with collecting technical metadata than semantic. It’s organized, but not in a way that’s useful for strong contextualization.

    I can’t help you with #1, but #2 and #3 are where we come back to the Media Ecology Project. We’re trying to harness the creative work processes of researchers, academics, and eventually crowds so they become a source of high-quality contextual metadata. We’re starting where the problems are the toughest to solve: culturally significant time-based media.
  • MEP is a partnership network that facilitates collaborative analysis of moving image materials. Our goal is to find ways to connect researchers to the archives and software tools needed for close textual studies of the subject matter, production, reception, and representational practices of media.

    I’m going to step through MEP’s technology development and goals, and then I’m going to hand it over to Mark who will introduce some of the scholarly work that MEP is helping to facilitate.

    In talking about MEP’s technology there are really two types of outputs that we should be talking about: data models and applications. Our models are mostly implementations of open data specs that are aimed at harnessing very specific types of metadata.
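
    As a rough illustration of how those specs fit together, a single time-based annotation serialized as JSON might look something like the sketch below (shown as a commented JavaScript object). The property names follow the Open Annotation, Dublin Core, FOAF, and SKOS vocabularies named on the metadata slide, but the exact structure is an assumption for illustration, not MEP’s actual output.

        // Illustrative only: MEP's real serialization may differ.
        var annotation = {
          "@context": "http://www.w3.org/ns/oa.jsonld",
          "@type": "oa:Annotation",
          // FOAF: who contributed the annotation
          "annotatedBy": { "@type": "foaf:Person", "foaf:name": "Jane Scholar" },
          // OA: free-text commentary as the annotation body
          "hasBody": { "@type": "dctypes:Text", "chars": "Cut to reaction shot." },
          // OA + Media Fragments: which seconds of which video are described
          "hasTarget": {
            "@type": "oa:SpecificResource",
            "hasSource": "http://archive.example.org/video/123",  // placeholder media ID
            "hasSelector": { "@type": "oa:FragmentSelector", "value": "t=120,150" }
          },
          // SKOS: a tag drawn from a shared controlled vocabulary
          "tags": [{ "@type": "skos:Concept", "skos:prefLabel": "reaction shot" }]
        };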
  • Key points: embed media, don’t upload or copy it. Onomy is aimed at small groups creating controlled vocabularies for individual projects, mostly for tagging across applications. Each application is in a different stage of development.
  • Scholar’s workbench – a media chooser that allows users to collect and organize clips, staging them for inclusion into a Scalar publication. It’s still in development and not publicly available yet, but it’s on track to be a key component in the larger ecology of applications.
  • We’ve also been working with Scalar to upgrade their time-based annotation features. It’s also still in development, but these screenshots show the new interface ANVC is developing.
  • The idea is that this list of tags will be loaded from Onomy.org and provide a controlled vocabulary to use for annotation across platforms.
  • Once a tag is applied, the terms and their definitions will become part of the time-based annotation, appearing and disappearing during playback as appropriate.
  • We’ve also been working closely with Columbia’s CCNMTL to integrate Onomy with Mediathread’s vocabulary system. You can see here that it will now accept a URL for JSON produced by Onomy and import it. One difference between Mediathread’s Onomy support and Scalar’s is that Scalar will flatten out all terms it imports to be equal. Here, you can see that Mediathread understands basic hierarchies and will treat the top-most term in a hierarchy as a category, with child terms imported as tags beneath that category. Those terms can then be applied to time-based annotations of media clips, and when they are, they also automatically become filters that can be used to discover both clips and specific sub-clips. We’re currently working with CCNMTL on an interface refresh that will remove some of the long lists here and provide a better user experience when working with complex vocabularies.
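
    To make the hierarchy handling concrete, the vocabulary JSON that Mediathread pulls from an Onomy URL might look roughly like the following; the field names are assumptions for illustration rather than Onomy’s documented export format. The top-level term becomes a Mediathread category and its children become the tags beneath it.

        // Hypothetical shape of an Onomy vocabulary export.
        var vocabulary = {
          "terms": [
            {
              "name": "Camera Movement",      // becomes a Mediathread category
              "children": [                   // become tags in that category
                { "name": "pan" },
                { "name": "tilt" },
                { "name": "tracking shot" }
              ]
            }
          ]
        };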
  • Where are we going with this? What’s the vision of how all of this will work in the end?

    So let’s look at how these platforms work now. Their technical infrastructure is quite different, but many of the principles are the same, so we will just use Mediathread as our example. If a user wants to bring a video into Mediathread for analysis they have to install a bookmarklet: a small JavaScript program stored in the browser’s bookmarks. With that installed they visit a site with videos they want to comment on, for instance the Library of Congress. When they find the video they want to use they click the Mediathread bookmarklet, which scrapes the HTML of the page to find whatever metadata is available and then brings the video link and metadata into Mediathread. Any metadata the bookmarklet finds was part of the original page’s HTML.
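
    A minimal sketch of that scraping step follows, assuming a bookmarklet that collects whatever descriptive metadata the page exposes and hands it to the annotation platform. The selectors and the import URL are placeholders, not Mediathread’s actual code.

        // Placeholder bookmarklet: the endpoint and selectors are illustrative.
        javascript:(function () {
          var meta = {
            url: location.href,
            title: document.title,
            // Only metadata already present in the page's HTML can be scraped.
            description: (document.querySelector('meta[name="description"]') || {}).content || '',
            video: (document.querySelector('video source') || {}).src || ''
          };
          // Hand the video link and scraped metadata off to the analysis platform.
          window.open('https://mediathread.example.edu/import?data=' +
                      encodeURIComponent(JSON.stringify(meta)));
        })();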
  • Under the new model, we start out with a vocabulary loaded into Mediathread.
  • MEP proposes to add another source of metadata about the imported videos. Under the MEP workflow, after a video is brought into Mediathread the bookmarklet would make a second request for metadata to the MEP server. If the video in question had previously been marked up by anyone using an MEP-compatible platform, then this second request would return more robust metadata about the object than could be scraped from the original web page itself. This process is completely transparent to the user, though; other than seeing more information about the video than they would have otherwise, they never have to take any action to trigger the MEP query.
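
    A sketch of that second, invisible request might look like the function below; the MEP host, query parameter, and merge behavior are assumptions for illustration.

        // Hypothetical sideloading step; mep.example.org is a placeholder host.
        function sideloadMetadata(mediaUrl, scrapedRecord, done) {
          var req = new XMLHttpRequest();
          req.open('GET', 'https://mep.example.org/metadata?media=' +
                          encodeURIComponent(mediaUrl));
          req.onload = function () {
            if (req.status !== 200) { done(scrapedRecord); return; }  // MEP knows nothing yet
            var mepRecord = JSON.parse(req.responseText);
            // Merge silently: the user just sees a richer record, never the query itself.
            for (var key in mepRecord) {
              if (!(key in scrapedRecord)) { scrapedRecord[key] = mepRecord[key]; }
            }
            done(scrapedRecord);
          };
          req.send();
        }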
  • So now that the video is in Mediathread, the user does what they wanted to do all along and adds new commentary and annotation to the video. As Mediathread currently exists, that is where the process ends. However, with the MEP third-party server in play, the user has a new option when they are finished with their annotations: they can publish their work back up to the MEP server.
  • If the user decides to publish their annotations then they will be added to what MEP knows about the video clip. Future scholars who look at the same clip from an MEP-compatible platform will be able to see those annotations, broadening the base of knowledge about the clip. In addition, MEP offers the ability for the original source archive to also capture those annotations and add them to its existing metadata about the video. This is a significant new capability because many archives have digitized videos for which they have limited metadata, which means that those videos are not easily discoverable or usable. If the archive’s current infrastructure isn’t capable of reading the new metadata from MEP, then the MEP server will store it until the archive is able to integrate it into its own records. MEP wants to facilitate new forms and uses of metadata but will not force any archive to change its system to accommodate it; instead, it will simply cache the information until the archive wants to access it.
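
    A corresponding sketch of the publish step, again with a placeholder endpoint and an assumed payload shape: the user’s new annotations are posted to MEP keyed on the media URL, and MEP holds them until the source archive is ready to ingest them.

        // Hypothetical publish call; the endpoint and field names are assumptions.
        function publishAnnotations(mediaUrl, annotations) {
          var req = new XMLHttpRequest();
          req.open('POST', 'https://mep.example.org/annotations');
          req.setRequestHeader('Content-Type', 'application/json');
          // MEP stores these against the media URL; the source archive can pull
          // them in later, whenever its own systems can accept them.
          req.send(JSON.stringify({ media: mediaUrl, annotations: annotations }));
        }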
