Linking the American Art Museum to the Cloud

What is Linked Open Data?
Data published using existing internet protocols, with a URI (Uniform Resource Identifier) as the primary discoverable entity for a resource (e.g. person, object, web page, etc.)

THE FIVE STARS OF LOD:
★ make your stuff available on the web (whatever format) under an open license
★★ make it available as structured data (e.g., Excel instead of image scan of a table)
★★★ use non-proprietary formats (e.g., CSV instead of Excel)
★★★★ use URIs to identify things, so that people can point at your stuff
★★★★★ link your data to other data to provide context

What is it good for?
• Making your data more discoverable and useful to everybody
• Making the web machine-readable at a more granular level
• Allowing for more sophisticated queries using inference
• Connecting your data to other people’s data

For American Art, Linked Open Data will:
• Make our collections data more findable on the web
• Create connections with other museums that have related artworks
• Create connections with other non-museum resources, such as the New York Times
• Create connections with our dispersed content on social media (e.g. Flickr)
• Help us better adapt to the changing web

Examples: Europeana
• Digitized collections of museums, libraries, archives and galleries across Europe.
• Open metadata on 20 million texts, images, videos and sounds.
• A subset of 2.4 million objects from 8 direct Europeana providers, encompassing over 200 cultural institutions from 15 countries, is served according to the Linked Data recipes.
• Virtual exhibitions showcase some of the content available.

Examples: Pelagios
• Stands for ‘Pelagios: Enable Linked Ancient Geodata In Open Systems’.
• Aim is to help introduce Linked Open Data into online resources that refer to places in the Ancient World.
• Allows you to find content related to a specific place.

Getting Started

Initial Questions
• Will it take a lot of time and resources to prepare our data?
• How does LOD differ from what a Google search can do?
• Is it foolish to be doing this before standards are in place?
• What if people create inappropriate links to our data?
• Will it be worth the time and effort in the end?
• How do we handle all of the non-public data that we have?
• Is it possible to make sense of all the acronyms?

The Project
• Working with the Information Sciences Institute (ISI) and Department of Computer Science at the University of Southern California.
• Goal: Publish 5-star Linked Open Data of our complete collections data (41,000 objects, 8,000 artists).
• Project phases: prepare the data, create an ontology, map the data to RDF, link the data to hub datasets, publish the data.

The Process

Preparing the Data
• Collections data is stored in TMS. We have over 100 tables.
• We decided to publish only the data that is already visible on our website.
• We used an existing output report from our database.
• Several fields needed to be interpreted before they could be mapped to RDF.

Designing the Ontology
• We built our ontology around existing ontologies:
  • An augmented version of Europeana Data Model v.2 for the overall framework
  • SKOS for classification of artworks, artist and place names
  • Dublin Core for tombstone data
  • RDA Group 2 Elements for biographical information
  • schema.org for geographical data

The American Art Ontology

The Process

Mapping the Data to RDF (Resource Description Framework)
• Used the KARMA tool to model the data
• The system learns with each dataset, so the process becomes easier and faster

For example:
Subject | Predicate | Object
www.americanart.si.edu/linkeddata/person/3406 | saam:Person | “Thomas Moran”
www.americanart.si.edu/linkeddata/person/3406 | rdaGr2:dateOfBirth | “1837”
www.americanart.si.edu/linkeddata/person/3406 | owl:sameAs | http://live.dbpedia.org/page/Thomas_Moran
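
To make the subject/predicate/object structure concrete, here is a minimal Python sketch that writes the three example statements as N-Triples lines. It is an illustration only, not the output of KARMA. The `http://` scheme on the subject, the `saam` and `rdaGr2` namespace URIs, and the helper names `expand` and `to_ntriple` are assumptions introduced for this example; `owl:sameAs` is written with the standard casing of the OWL property.

```python
# Prefix expansions. The saam and rdaGr2 namespace URIs are ASSUMED for
# illustration; only the owl namespace is the standard W3C one.
PREFIXES = {
    "saam": "http://www.americanart.si.edu/linkeddata/schema/",  # assumed
    "rdaGr2": "http://rdvocab.info/ElementsGr2/",                # assumed
    "owl": "http://www.w3.org/2002/07/owl#",
}


def expand(curie):
    """Turn a prefixed name such as 'owl:sameAs' into a full URI."""
    prefix, local = curie.split(":", 1)
    return PREFIXES[prefix] + local


def to_ntriple(subject, predicate, obj, obj_is_uri=False):
    """Format one statement as an N-Triples line."""
    if obj_is_uri:
        obj_term = "<" + obj + ">"
    else:
        # Escape backslashes and double quotes inside plain literals.
        escaped = obj.replace("\\", "\\\\").replace('"', '\\"')
        obj_term = '"' + escaped + '"'
    return "<" + subject + "> <" + expand(predicate) + "> " + obj_term + " ."


# The slide omits the URI scheme; "http://" is assumed here.
MORAN = "http://www.americanart.si.edu/linkeddata/person/3406"

statements = [
    (MORAN, "saam:Person", "Thomas Moran", False),
    (MORAN, "rdaGr2:dateOfBirth", "1837", False),
    (MORAN, "owl:sameAs", "http://live.dbpedia.org/page/Thomas_Moran", True),
]

ntriples_lines = [to_ntriple(*s) for s in statements]
print("\n".join(ntriples_lines))
```

Each printed line is one statement: the first two attach literal values to the artist's URI, and the third points to another dataset's URI, which is the "link" in Linked Open Data.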

The Process

Linking the Data to External Data
• Verify matches before publishing
• Have already linked artists to:
  • DBPedia: 2,194
  • New York Times: 70
• Additionally, can link artists to:
  • Getty Union List of Artist Names: 2,110 (ULAN is not yet published as LOD, but will be)
  • Rijksmuseum dataset: 551 (links are not yet verified)
• In the works:
  • Linking places to GeoNames
  • Linking concepts to AAT
  • Linking to datasets from other museums
  • Linking to social media content

Publishing
• Plan to publish complete dataset and all verified links under a CC0 license
• Data will be CC0, but images will be maintained under a restricted license
• Include example records and SPARQL endpoint
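
The "verify matches before publishing" step can be pictured as a simple filter: only links that a person has confirmed become published statements. The Python sketch below assumes a hypothetical `CandidateLink` record and hypothetical helper names; it does not describe the project's actual verification interface. The first sample link is the Thomas Moran example from the slides (with an assumed `http://` scheme), and the second uses made-up example.org URIs to stand in for an unverified match.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateLink:
    """A proposed match between a museum URI and an external URI (hypothetical record)."""
    local_uri: str
    external_uri: str
    verified: bool


def publishable(links):
    """Keep only the links that have been verified."""
    return [link for link in links if link.verified]


def same_as_statements(links):
    """Turn verified links into (subject, predicate, object) statements."""
    return [
        (link.local_uri, "owl:sameAs", link.external_uri)
        for link in publishable(links)
    ]


candidates = [
    # From the slides' example; the "http://" scheme is assumed.
    CandidateLink(
        "http://www.americanart.si.edu/linkeddata/person/3406",
        "http://live.dbpedia.org/page/Thomas_Moran",
        verified=True,
    ),
    # Made-up placeholder for a match nobody has checked yet.
    CandidateLink(
        "http://example.org/person/unverified",
        "http://example.org/external/unverified",
        verified=False,
    ),
]

published = same_as_statements(candidates)
print(len(published), "of", len(candidates), "candidate links would be published")
```

Keeping unverified candidates in the working set, rather than deleting them, means they can be reviewed later and published once confirmed.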

Some Answers

Answers to Initial Questions:
• Will it take a lot of time and resources to prepare our data?
  • Using KARMA to model the data and a visual interface to verify the links reduced the staff time that would have been needed to do this manually. Working with ISI certainly helped kick-start the process.
• How does LOD differ from what a Google search can do?
  • LOD eliminates the “noise” of a Google search. With LOD you can query specific facts. With Google you query documents and then have to read the document to get the facts.
• Is it foolish to be doing this before standards are in place?
  • There are already some standards in place. Plus, being one of the first means that we have the opportunity to help shape the standards.
• What if people create inappropriate links to our data?
  • You cannot control what people say about you on the internet!
• Will it be worth the time and effort in the end?
  • We believe so! It will allow us to better adapt to the future of the web.
• How do we handle all of the non-public data that we have?
  • We opted to publish only our public data.
• Is it possible to make sense of all the acronyms?
  • Yes! It takes time, but you do eventually grasp all the different terms.
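
The point about querying specific facts rather than documents can be sketched in a few lines of Python. This is an in-memory stand-in for a real query against a SPARQL endpoint: the `match` helper is hypothetical, the data are the three example statements from the mapping slide, and the `http://` scheme on the subject is assumed.

```python
# Example statements from the mapping slide, as (subject, predicate, object).
# The "http://" scheme on the subject is assumed.
MORAN_URI = "http://www.americanart.si.edu/linkeddata/person/3406"

facts = [
    (MORAN_URI, "saam:Person", "Thomas Moran"),
    (MORAN_URI, "rdaGr2:dateOfBirth", "1837"),
    (MORAN_URI, "owl:sameAs", "http://live.dbpedia.org/page/Thomas_Moran"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Ask one precise question: what is this person's date of birth?
birth_dates = [obj for (_, _, obj) in match(facts, s=MORAN_URI, p="rdaGr2:dateOfBirth")]
print(birth_dates)
```

The answer comes back as a value, not as a page to read, which is the difference from a document search that the answer above describes.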

Some Conclusions
• We initially planned to use only a sample of collections data. In the end, we used data for our entire collection: over 41,000 objects!
• Linking to datasets like DBPedia and the New York Times will greatly expand the content we offer on our website.
• Linking to datasets from other art museums will increase the accessibility and reach of art collections and cultural heritage online.
• We’re excited about the potential to link to our content on social media sites, with an object page serving as a “hub” for all types of content about that object.
• We see great potential in using Linked Open Data to curate stories about artworks and artists that connect museums and datasets around the world in new and surprising ways.

What’s Next?
• Embed linked content on object pages and artist pages on our website (Wikipedia, the New York Times, etc.).
• Improve representation of artists on Wikipedia, adding articles and infoboxes where possible to increase the number of matches in DBPedia.
• Create an ongoing maintenance plan to ensure that the linked open data reflects new and edited museum data.
• Tag object- and person-related museum content on social sites like Flickr and YouTube so that we can create links to that content on our website, too.
• Investigate mapping and linking an artwork’s subject.
• Expand the LOD in ways that will enhance research.
• Create a tool that allows users to “curate stories” using LOD:
  • http://prezi.com/htrvh2jrcsio/curating-stories-with-linked-open-data/
• Encourage others to build applications with our data.