MCA Assembly
Debbie Campbell
NATIONAL LIBRARY OF AUSTRALIA
28th September 2009
http://creativecommons.org/licenses/by-nc-sa/2.1/au/
Digital best practice – a library viewpoint
• The National Library has established a national information infrastructure
• It underpins all of the Library’s web discovery services including Music Australia
• It is open for any agency to participate in
• It is based on ten principles of best practice
I am going to cover digital best practice for discovery services.
#1 access
On behalf of every Australian, the National Library collects documentary resources of all kinds, including digital materials. Many of these resources are needed for creative endeavour and research, so the National Library makes sure that access to them is at the forefront of all of its services. In most situations this access is free. To increase the availability of resources, we are developing a new service called Trove, which will provide combined access to all of our web discovery services. The service will be available with its new design in the next month or so for you to use.
The service contains resources from many Australian institutions because they have collaborated in building the national information infrastructure.
#1 access
This is Trove beta, a prototype. The resources at the moment are mostly in the form of records, but the number of these which link to a digital object online is increasing rapidly.
#1 access
The service contains some selected overseas sources. We have identified full-text resources which we think will be of interest to Australians and which are available for harvesting into our service.
If I do a search on J S Bach, it shows two sources: Google Books, and the Open Library, which is managed by the Internet Archive.
#1 access
If I click on the Open Library option, I can read the work online or continue through to where a copy is held in an Australian library. Note the borrow option.
I am going to move on and explain what you can do to share your content with Trove. Together, we can make sure that people find your resources too.
#2 standards
The most important standards are the ones which facilitate access. By this I mean every work or resource you look after should be uniformly described. This ensures that resources can be discovered consistently, and remain discoverable over time. For example, if I do a search on Mondo Rock, the descriptive standards that have been applied allow searching to be precise. Note the format and keyword information on the left-hand side of the screen. In Trove, we do not prescribe a single descriptive standard. The results above are drawn from MARC21 records, which follow the standard global bibliographic schema, as well as from records in a schema designed for the web known as Dublin Core, and in some cases we have also indexed full text. The important thing is to choose a standard which allows both descriptions and content to be shared in web spaces.
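As an illustration of uniform description, here is a minimal sketch of a record built with Dublin Core elements. The namespace is the standard Dublin Core element set namespace; the `record` wrapper element, the `make_dc_record` helper and all field values are assumptions made for this example, not the format Trove actually ingests.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace
ET.register_namespace("dc", DC_NS)


def make_dc_record(fields):
    """Build a simple Dublin Core record from a dict of element name -> value(s)."""
    record = ET.Element("record")  # hypothetical wrapper element
    for name, values in fields.items():
        if isinstance(values, str):
            values = [values]
        for value in values:
            element = ET.SubElement(record, f"{{{DC_NS}}}{name}")
            element.text = value
    return record


# Illustrative values only, not a real catalogue record.
example = make_dc_record({
    "title": "Example song book",
    "creator": "Example Composer",
    "type": "Sound",
    "format": "audio/mpeg",
    "subject": ["Songs", "Australian music"],
})
xml_text = ET.tostring(example, encoding="unicode")
print(xml_text)
```

Because every record uses the same element names, a discovery service can index fields such as format and subject consistently, which is what makes the kind of filtering shown on the left-hand side of the screen possible.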
#3 persistence
The National Library works with other agencies both nationally and internationally to construct standards for persistence. These include persistent identifiers for resources, and for people and organisations. Applying persistent identifiers provides a shortcut for finding someone or something, identifying them unambiguously, and citing them reliably.
International standards for identifying resources have existed since the 1960s, such as the ISBN for books. More recently, a shared standard for people and organisations, called a party identifier, has gained traction. The National Library has developed a party identifier for national use as part of its People Australia program. Here you can see a party identifier for Stephen Leek. The People Australia program has started a process of seeking data from a range of agencies. When we harvest the data, we match as much as we can to group all of the information about a person or organisation so that it comes together in Trove. So when you click on the label which says “About Stephen Leek”, all of those matched sources appear. We then provide the persistent identifier back to the agency which gave us the data so they can use it too.
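The matching step can be sketched roughly as follows. This is only an assumption about how such grouping might work: it matches on a normalised name alone, whereas a production service would presumably use more evidence. The `party-000001` identifier format, the function names and the agency names are all hypothetical.

```python
from collections import defaultdict


def normalise_name(name):
    """Lower-case a name and collapse whitespace so trivial variants match."""
    return " ".join(name.lower().split())


def group_by_party(records):
    """Group contributed records by normalised name, one identifier per group."""
    groups = defaultdict(list)
    for record in records:
        groups[normalise_name(record["name"])].append(record)
    parties = {}
    for number, (_, members) in enumerate(sorted(groups.items()), start=1):
        party_id = f"party-{number:06d}"  # hypothetical identifier format
        parties[party_id] = members
    return parties


# Illustrative contributions from two hypothetical agencies.
records = [
    {"name": "Stephen Leek", "agency": "Agency A"},
    {"name": "stephen  leek", "agency": "Agency B"},
    {"name": "Another Person", "agency": "Agency A"},
]
parties = group_by_party(records)
for party_id, members in parties.items():
    print(party_id, [member["agency"] for member in members])
```

Each group then carries a single identifier that can be handed back to every contributing agency, so that all of them refer to the same person or organisation in the same way.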
#4 interoperability
Persistent identifiers are an example of interoperability at work. Interoperability depends on agreement between agencies to use the same standards to manage and share data. The information can then be disclosed to the web in both places. This is important because you can’t assume that someone looking for music material will start with your service. They might start with Google, or they might start with Trove. We work with Google tools to make Australian content visible in the Google search engine, to bring both national and global audiences back to content we have invested in. So if I search for a fourth class Australian song book, Google points to a record which, when clicked, leads to the resource in our Music Australia service.
We expend effort on ensuring that records in our services are all surfaced through Google. It is a value-add that we provide back to agencies which collaborate with us.
#4 interoperability
Here is the Australian song book record in Music Australia.
#5 common file formats
The National Library uses common file formats when digitising resources in our collection. Not only does this ensure that the files are more easily shared on the web, but it also means that it is easier to preserve them for future use. The Library supports a wide range of formats in its collection, depending on the resource type. It also creates view copies and preservation copies of its own material, and each of these is assigned its own identifier. Here is the example for the Australian song book: we use JPEG for view copies.
A lot of creative work occurs at the cutting edge of technology, but that doesn’t mean longevity should be left out of consideration at the outset.
#6 open platforms
code.nla.gov.au
Similarly, we recommend choosing open platforms. By that I mean look for software which is in common usage, because developers can share their interest and expertise, thus guaranteeing more longevity for it as well. Over the last five years or so, the Library has chosen various pieces of open source software to underpin its catalogue and some of its other discovery services, such as Australian Research Online. Trove is also being built using open source. As a result of this shift, we are sharing our knowledge for others to exploit. One of the pieces of code we have shared here is a copyright checking algorithm, which works with library catalogues. It displays the copyright information held in MARC records more prominently, in a way which we hope people can more easily understand. Of course, if your agency doesn’t have the capacity to fund programming in this way, you can still ask software vendors to provide you with platforms that use open protocols for sharing data and information.
#7 rights & licensing
Rights and licensing are an important consideration when sharing a new work in the public domain. The National Library uses Creative Commons licences on some occasions, such as for staff papers, presentations and policy documents. The essential consideration is to ensure that rights and licensing conditions are always included in information about a resource. This may be achieved with a link to a standard statement. We encourage the use of Creative Commons licences in our services as well, for example in Picture Australia, where photographers share their Flickr photographs and provide us with information about their images. This does not mean that we are advocating loss of revenue. A snippet, a sound sample, a thumbnail – small snapshots of content are incredibly useful for creating awareness and even sales…
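One way to act on this principle is to check that every record carries a link to a rights statement before it is shared. The sketch below assumes records are simple dictionaries with a `rights` field; that field name, the `missing_rights` helper and the sample records are hypothetical. The licence URL is the one used for this presentation.

```python
# Creative Commons licence URL taken from the title slide of this talk.
CC_BY_NC_SA_AU = "http://creativecommons.org/licenses/by-nc-sa/2.1/au/"


def missing_rights(records):
    """Return the identifiers of records with no rights statement link."""
    return [
        record["id"]
        for record in records
        if not (record.get("rights") or "").strip()
    ]


# Illustrative records only.
records = [
    {"id": "rec-1", "title": "Staff paper", "rights": CC_BY_NC_SA_AU},
    {"id": "rec-2", "title": "Photograph", "rights": ""},
    {"id": "rec-3", "title": "Sound sample"},
]
print(missing_rights(records))
```

A check like this can run before records are contributed to a shared service, so that rights gaps are fixed at the source.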
#7 rights & licensing
http://www.nla.gov.au/openpublish/index.php/nlasp/article/view/1468/1796
A few years ago, we embarked on a purchasing arrangement with Destra Media. The business failed, but we were able to keep the metadata and cover art such as this example in Music Australia. Robyn Holmes has written an excellent paper about this event (URL above).
However, the experience hasn’t put us off exploring other possible commercial options to ensure that the service provides access to the widest possible range of Australian music. In particular, we are looking at open platforms which support interoperability in some way.
#8 preservation
My colleague Paul Koerbin is going to talk in more detail about the National Library’s preservation activities, which we conduct on behalf of the nation, but the general principle is that anyone creating resources, especially those which have been publicly funded, has a responsibility to look after them.
This is an example of a website we have captured over a few years. PANDORA content is included in Trove.
The idea is to start work with preservation in mind. If planning and execution follow the preceding seven principles, then achieving the eighth is far easier.
#9 participation
I want to finish by telling you about two methods of achieving engagement. The first is audience participation, and this shows an example from our Australian Newspapers service. I have done a search on the Music Council of Australia. There are many articles; this one is from 1933. You can see we display both an image of the original newspaper article and the OCR'd text. We provide the latter so that anyone can correct the text. Because we are dealing with four million pages of newsprint, the Library doesn’t have the resources to correct errors in the OCR text.
Since we made this feature, which is an example of crowd-sourcing, public in July last year, the general public has corrected 3.4 million lines of text. A few individuals have corrected about 250,000 lines each, and we feature the top 10 in a text correctors hall of fame on the home page of this service. You can see that the quality of this article is quite high, so the text is able to be searched reliably, but that is not always the case, hence the need to correct the text. Although you don’t have to obtain a sign-on to correct text, it is useful to have one to keep a record of your work. It can also be used when adding tags to the articles. Why is this important? Well, if someone were doing a research project on the history of the Music Council, they could tag each article they find with those words. A list of the articles is then kept for later reference. Community groups have started using this feature to improve the text and keep a list of articles for research purposes. The functions of the newspaper service will be transferred to Trove, and we will make the text correction feature available for other content as we digitise it. We have been overwhelmed by the positive reaction we have received – people are welcoming the opportunity to participate in a national endeavour for posterity.
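The way tags keep a list of articles for later reference can be pictured as a simple index from tag to article identifiers. This is a sketch of the general idea only, not how the newspaper service is implemented; the `TagIndex` class and the article identifiers are hypothetical.

```python
from collections import defaultdict


class TagIndex:
    """Map each tag to the set of articles that carry it."""

    def __init__(self):
        self._articles_by_tag = defaultdict(set)

    def add_tag(self, article_id, tag):
        # Normalise case and spacing so the same words always match.
        self._articles_by_tag[tag.strip().lower()].add(article_id)

    def articles_for(self, tag):
        # .get avoids creating empty entries for tags nobody has used.
        return sorted(self._articles_by_tag.get(tag.strip().lower(), set()))


index = TagIndex()
index.add_tag("article-1933-001", "Music Council of Australia")
index.add_tag("article-1947-014", "music council of australia")
index.add_tag("article-1947-014", "concerts")
print(index.articles_for("Music Council of Australia"))
```

Looking up a tag returns every article a researcher has marked with those words, which is the list kept for later reference.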
#10 collaboration
http://librariesaustralia.nla.gov.au/searchbox
The other form of engagement is collaboration. We welcome the opportunity to work with agencies that wish to share records or content with us. We use standard mechanisms to support exchange. And we have built little services around exchange protocols so that you can benefit from the services too. By that I mean we provide various mechanisms to allow other agencies to query our services in real time, or harvest records for inclusion in their services. These mechanisms are all available as open source. For example, you can pop a little search box such as this one on your web site. This little box will be rebranded Trove in a few weeks’ time.
I am happy to provide further information about any of our collaborative opportunities at any stage. Together we can create a national exemplar of digital best practice.