4. Digital Publishing at the Getty
• Mandate: publish the museum’s collection catalogues in a digital format
• Attempts to shoehorn academic books into a CMS didn’t work very well
• Team of 2 (developer, product manager) embedded in a larger print
publishing team
• Didn’t want to lose the beautiful design and long-term availability of printed
books
5. Web books on the JAM Stack
• Ancient Terracottas from South Italy and Sicily, Maria Lucia Ferruzza
• Roman Mosaics in the J. Paul Getty Museum, Alexis Belis
• Ancient Lamps in the J. Paul Getty Museum, Jean Bussiere and Birgitta Lindros Wohl
• Many more at www.getty.edu/publications/digital/
6. Tech Stack
• Middleman static site generator (easy to extend in Ruby)
• Plain text back-end (markdown & YAML files), stored on GitHub
• Vue.js and other JavaScript libraries for interactive UI elements
• PrinceXML and CSS3 for PDF generation at build-time
Hi everyone! Thanks so much for being here this evening. My name is Eric Gardner. I’m a software engineer at Rumors, a design studio in Portland. I’ve been passionate about the technology of the “JAM Stack” since before that acronym existed! Tonight I’d like to talk about one application of this technology that I hope you find interesting – using static web tech as a system for digital publishing.
First some background. Prior to joining Rumors in Portland, I worked for several years as a software developer at the Getty Museum in Los Angeles.
This was just another day at the office. I highly recommend a visit if you’re ever in LA; the architecture alone is pretty amazing. And admission is free!
The Getty is a world-class art museum, and it also houses a publishing department that produces dozens of art-related books each year. I worked within this department during my time there.
We found a solution based on static site generator technology, or what you’d now call the JAM Stack. We began publishing books this way in 2016 with Ancient Terracottas from South Italy and Sicily. Since then a total of 6 major scholarly publications have been produced this way at the museum, with several more in the pipeline. I’m going to talk mostly about the first “trilogy” here.
All of these are available online at www.getty.edu/publications/digital. They are also open source, and the code is available at github.com/gettypubs.
Since this is a tech conference, here’s a quick overview of our tech stack for these projects.
We wanted the products we were creating to feel like books.
The simplicity of building sites on top of static generator tech allowed us to focus on things like creating a great reading experience and responsive design. You can see some examples of that design here. We were inspired by sites like Medium and wanted to deliver an elegant, distraction-free experience to the reader.
Another key feature was making the full text of the books searchable. We used the Lunr.js library for this feature.
Lunr is described by its creator as “a little bit like Solr but much smaller and not as bright”.
We still thought it was pretty amazing. No back-end server was required: we generated a JSON index at build time, which was consumed by the client at runtime.
Building a JSON index of all text pages in Middleman is pretty straightforward, since you have all of Ruby at your disposal. The software has a “sitemap” object which can be queried and manipulated in various ways. Here we simply looped through the pages and constructed a simple data structure containing text content and basic metadata for each page, which then gets rendered as JSON.
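The loop described above can be sketched roughly as follows. This is a minimal illustration and not the code from the Getty repositories: the `Page` struct is a hypothetical stand-in for Middleman's sitemap resources, and the field names (`url`, `title`, `type`, `content`) and sample pages are assumptions made for the example.

```ruby
require "json"

# Hypothetical stand-in for a sitemap resource (assumption: the real
# Middleman objects expose similar data through a different interface).
Page = Struct.new(:url, :title, :type, :body, keyword_init: true)

# Strip tags and collapse whitespace so only searchable text remains.
def plain_text(html)
  html.gsub(/<[^>]+>/, " ").gsub(/\s+/, " ").strip
end

# Build one record per page; the numeric id lets the client map a
# search hit back to its metadata.
def build_search_index(pages)
  pages.each_with_index.map do |page, id|
    {
      id: id,
      url: page.url,
      title: page.title,
      type: page.type,
      content: plain_text(page.body)
    }
  end
end

# Placeholder pages, invented for illustration only.
pages = [
  Page.new(url: "/catalogue/1/", title: "Sample Entry", type: "entry",
           body: "<p>Sample <em>entry</em> text</p>"),
  Page.new(url: "/introduction/", title: "Introduction", type: "essay",
           body: "<h1>Introduction</h1><p>Sample essay text</p>")
]

index = build_search_index(pages)
puts JSON.pretty_generate(index)
```

In a real build the resulting JSON would be written out as a static file next to the HTML pages, so the browser can fetch it like any other asset.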
On the client side, the JSON data is fetched and fed into the Lunr search index. Fields can be given different weights, for example to prioritize matches found in the title.
Leaflet is an open-source mapping library, but it also supports deep-zoom image display.
I wrote a simple Ruby script to split up high-resolution images from the museum into tiles at various zoom levels and uploaded them to S3. The catalogue entry pages can then display the images with a zoomable interface similar to Google Maps. We were able to display hundreds of views of dozens of artifacts, at high levels of detail – something that would simply not have been possible in a printed publication.
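To give a feel for what such a tiling script has to work out, here is a small sketch of the arithmetic behind a tile pyramid. It is not the original script: the 256-pixel tile size, the halving of resolution per zoom level, and the helper names `max_zoom` and `tile_pyramid` are all assumptions for illustration, and the actual image cropping and S3 upload are left out.

```ruby
TILE_SIZE = 256 # assumed tile edge length in pixels

# Smallest zoom level whose tile grid covers the full-resolution image,
# given that level 0 is a single tile and each level doubles the scale.
def max_zoom(width, height, tile_size = TILE_SIZE)
  longest = [width, height].max
  zoom = 0
  zoom += 1 while tile_size * (2**zoom) < longest
  zoom
end

# For every zoom level, compute the scaled image size and how many
# tile columns and rows are needed to cover it.
def tile_pyramid(width, height, tile_size = TILE_SIZE)
  top = max_zoom(width, height, tile_size)
  (0..top).map do |zoom|
    scale = 2**(top - zoom)
    w = (width.to_f / scale).ceil
    h = (height.to_f / scale).ceil
    {
      zoom: zoom,
      width: w,
      height: h,
      cols: (w.to_f / tile_size).ceil,
      rows: (h.to_f / tile_size).ceil
    }
  end
end

# Example with a hypothetical 4000 x 3000 pixel source image.
tile_pyramid(4000, 3000).each do |level|
  puts "zoom #{level[:zoom]}: #{level[:cols]} x #{level[:rows]} tiles"
end
```

Each level's tiles would then be cut from a resized copy of the source image and stored under a path that encodes zoom, column, and row, which is the kind of URL pattern a Leaflet tile layer expects.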
Link out to http://www.getty.edu/publications/terracottas/catalogue/1/ if desired
You can also use Leaflet to display actual maps, of course.
We were able to connect artifacts with the locations where they were found or manufactured.
These catalogues were mainly composed of static HTML pages generated by Middleman. But when more interactivity was required, Vue.js proved to be a useful tool for building complex UI components. It is easy to introduce a Vue component in an otherwise standard web page.
This example comes from the Ancient Lamps catalogue, which contained over 600 entries. Using Vue’s computed properties feature, it was easy to do things like filter and sort a large collection of data in response to multiple criteria instantaneously.
Perhaps the “killer feature” of our web books was the fact that we could also provide EPUB, PDF, and print-on-demand versions of the same material automatically.
This is one place where Middleman really shines – out of all the static site generators, it is the easiest one to add custom functionality to. The extension API lets you hook into the build process at various points and inject custom code. I wrote a small extension in Ruby that queried the sitemap during a build to find all the pages where a front matter flag had been set for inclusion in the print version. These pages were then fed to the PrinceXML tool, which generated a PDF from them.
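The selection step can be sketched outside of Middleman like this. It is a simplified illustration, not the actual extension: the `pdf: true` flag name, the file paths, and the helper names are assumptions, and the front matter is parsed from plain strings rather than read from the sitemap. The only external detail relied on is that the `prince` command accepts input files followed by `-o` and an output path.

```ruby
require "yaml"

# Matches a YAML front matter block at the very top of a source file.
FRONT_MATTER = /\A---\s*\n(.*?)\n---\s*\n/m

# Parse the front matter of a page source into a hash (empty if absent).
def front_matter(source)
  match = source.match(FRONT_MATTER)
  match ? (YAML.safe_load(match[1]) || {}) : {}
end

# Keep only the pages whose front matter opts in to the print edition.
# The flag name "pdf" is a hypothetical choice for this sketch.
def print_pages(sources)
  sources.select { |_path, source| front_matter(source)["pdf"] == true }.keys
end

# Assemble the argument list for Prince: inputs first, then -o OUTPUT.
def prince_command(paths, output)
  ["prince", *paths, "-o", output]
end

# Placeholder page sources keyed by their built HTML path.
sources = {
  "build/cover.html"       => "---\ntitle: Cover\npdf: true\n---\n<h1>Cover</h1>\n",
  "build/search.html"      => "---\ntitle: Search\npdf: false\n---\n<h1>Search</h1>\n",
  "build/catalogue/1.html" => "---\ntitle: Entry 1\npdf: true\n---\n<h1>Entry 1</h1>\n",
  "build/about.html"       => "<h1>No front matter</h1>\n"
}

command = prince_command(print_pages(sources), "build/book.pdf")
puts command.join(" ")
# With Prince installed, the command could be run via system(*command).
```

Passing the arguments as an array rather than a single shell string avoids quoting problems when page paths contain spaces.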
Here is an example of the kinds of layouts we were able to create using CSS.
The little-known CSS Paged Media module supports pretty sophisticated control over printed layouts (margins, columns, bleeds, breaks, page numbering, etc.). Parts of this spec are already implemented in major browsers, but the command-line PrinceXML software (not open source, unfortunately) offers the best support and was instrumental in generating the print editions.
Compared to traditional books, most websites have a laughably short lifespan. But scholarly work needs to be available for years if not decades so that future researchers can make use of it.
The JAM stack is a great way to mitigate this problem.
Between the human-readable text files in Git, the static web versions that require no back-end services, and the EPUB, PDF, and print versions that we generated at the same time, we are confident that at least some versions of these books will still be usable decades from now.
Another huge advantage of this approach is that the relative simplicity of the tech lowers the barriers to entry for non-specialists.
When your data looks like this, it’s much easier to collaborate with non-technical staff (editors, designers, etc.).
We were able to successfully train editors to use tools like GitHub and text editors. This is an example “revision” of a book after it was published.
One of our editors (Ruth, pictured here) enjoyed the process so much that she wrote about it for the Getty blog.
Finally, our initial work on these publications led to the development of a digital publishing framework, Quire, that is still in active development at the Getty.
Quire builds on top of the Hugo static site generator and uses a CLI written in Node.js to add support for building additional publication formats (PDF, EPUB, etc.). The project is currently in alpha, but several museums and small publishers already have projects in production using this tool. More info at the GitHub repo here.
Before I wrap up, I just want to acknowledge some of the amazing open-source tools that made these projects possible.
Finally, I want to leave you all with a challenge, and that is: