Lightning Talk on the Macaw Metadata collection tool for book-like objects. Presented at Code4Lib 2012 in Seattle, WA on Feb 8, 2012.
[SLIDE 1] The Smithsonian Libraries (SIL) digitizes "book-like things". Books, trade catalogs, pamphlets, you name it.
[SLIDE 2] Most of the books that we digitize are sent to the Biodiversity Heritage Library (BHL), a worldwide group of natural history and botanical libraries digitizing legacy literature.
[SLIDE 3] At the SIL, most of our digitized books are really sent to the Internet Archive. And then to BHL. During this process they collect metadata about the pages so that we can read them online at BHL. And the Internet Archive. But some of our books are too large to fit the usual scanning hardware.
[SLIDE 4] So we bought a camera. But since those oversized books are being digitized outside our normal process, we needed a new way to manage and collect the metadata. We could have used spreadsheets, but really, it wasn't a good idea. And other ways were too cumbersome to even consider. So we developed a piece of software to help us collect and manage that information.
[SLIDE 5] We called it Macaw, for Metadata Collection and Workflow. But it ended up not doing much workflow at all.
Macaw collects metadata about the pages of the item in question. This information is needed to effectively read the item at the Internet Archive, subsequently at BHL, and eventually on our website, where we are building a new Linked Open Data digital library. Macaw collects the metadata in a rich, attractive and easy-to-use interface.
[SLIDE 6] Macaw is modular in three places. It has customizable import modules to get data from wherever you want. Our installation of Macaw reaches into a custom database and also makes Z39.50 requests to our catalog. Macaw can also import a basic CSV file that you upload into the user interface. Macaw has customizable export modules to send data to wherever you want. Like everyone else at BHL, we send our data to the Internet Archive. We also send some data to the SIL website itself. The metadata that Macaw collects is also modular and customizable. Out of the box, so to speak, we offer a basic set of page-level metadata collection fields. But this can be customized to your needs. In theory Macaw can collect metadata about other collections of images, not just book-like things.
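To make the import and export side of that modularity a little more concrete, here is a minimal sketch in Python. It is not Macaw's code (Macaw is written in PHP), and every name in it (`Item`, `ImportModule`, `ExportModule`, `CsvImport`, `MemoryExport`, `run`, and the `identifier` column) is hypothetical, chosen only for illustration. It shows one way a core loop can stay the same while the import source and export targets are swapped out.

```python
from __future__ import annotations

import csv
import io
from dataclasses import dataclass, field
from typing import Protocol


@dataclass
class Item:
    """A book-like thing: an identifier plus free-form metadata fields."""
    identifier: str
    metadata: dict[str, str] = field(default_factory=dict)


class ImportModule(Protocol):
    """Anything that can produce items (a database, a catalog, a CSV file)."""
    def fetch(self) -> list[Item]: ...


class ExportModule(Protocol):
    """Anything that can receive a finished item (an archive, a website)."""
    def export(self, item: Item) -> None: ...


class CsvImport:
    """Hypothetical import module: one CSV row per item.

    Assumes a column named 'identifier'; every other column is kept
    as a metadata field.
    """
    def __init__(self, text: str) -> None:
        self.text = text

    def fetch(self) -> list[Item]:
        items = []
        for row in csv.DictReader(io.StringIO(self.text)):
            identifier = row.pop("identifier")
            items.append(Item(identifier=identifier, metadata=dict(row)))
        return items


class MemoryExport:
    """Hypothetical export module that only records what it was sent."""
    def __init__(self) -> None:
        self.sent: list[str] = []

    def export(self, item: Item) -> None:
        self.sent.append(item.identifier)


def run(importer: ImportModule, exporters: list[ExportModule]) -> list[Item]:
    """Pull items from one import module and push each to every export module."""
    items = importer.fetch()
    for item in items:
        for exporter in exporters:
            exporter.export(item)
    return items
```

In this sketch, swapping `CsvImport` for a module that queries a catalog, or adding a second exporter, would not require any change to `run`.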
[SLIDE 7] Macaw is open source, hosted at Google Code. It is built in PHP on the CodeIgniter framework, uses PostgreSQL or MySQL (Summer 2012), and can be installed in minutes.
If you would like to use Macaw, download it, but keep in mind you need some PHP skills to create an import or export module if you choose to use them. And I think you'll at least want to export somewhere, unless you are uploading to the Internet Archive; we include that export in the main distribution.
If you want to know more, check out our Google Code page, where you can read documentation and wikis.