Trove: Collecting, Sharing and Improving Digital Data: Changing roles of librarians and users. 4 May 2010. Rose Holley
Published in: Technology

  • Thank you for inviting me to speak here today. I am going to talk about changing roles for librarians and users, drawing to a large extent on my own personal experiences, with particular reference to Trove and Australian Newspapers.
  • Let's step back 100 years and look at the qualities a librarian needed to work in Australia. I shall read you extracts of an article from the Sydney Morning Herald dated 1907. “A good all-round education is essential, and also a knowledge of two languages besides English. Practically it is necessary that intending librarians should hold a University degree, and understand French. A knowledge of German is desirable also, or any other living language, while Latin is distinctly useful. The average woman is not capable of administering a department properly, say the authorities. It is not likely that many women will become the heads of libraries; they are handicapped by various limitations; limitations perhaps of physical strength, perhaps of temperament. "Nine out of every ten women are unfitted to be at the head of a library," remarked Mr. Anderson, the Government Librarian, "but sometimes a tenth is discovered, and she is beyond price." There are only 11 women at present employed in the Government libraries. Work at the lending branch of the Free Public Library in Sydney does not make so much demand on the intellectual as on the physical powers of the women assistants. It is chiefly mechanical. Beginners feel the strain of standing for so many hours, and of carrying so many books. They get accustomed to it after a time, and then find the work fairly pleasant. Occasionally an assistant shows capacity in indexing or cataloguing or something of that sort. Then she may rise to a higher grade. There are no library schools in Sydney, as there are in America, Germany, and other parts of the world. It is to be hoped that some such training will be accorded to Sydney girls in the near future, and that duly qualified women will take up the higher branches of library work, which hitherto they have not done.” Thank goodness times have changed!
  • Let's move forward to 25 years ago. This is me in my first job as a Public Reference Librarian after gaining my degree in Librarianship. I was working in a state-of-the-art public library in Surrey, UK, spending 90% of my time on the reference desk. Let's take a closer look at what is happening in this photograph. You can see on the right the microfiche reader, which was our most essential tool. I had microfiche for books in our library and for books in print, and these were in title and author order only. Members of the public did not have access to the microfiche and had to queue at the desk for help. It was also not available in all branches, and the public usually had to come to the central library to get up-to-date, accurate information. This was not ideal but necessary, because the microfiche cost a lot of money to buy, as did the readers. We had just converted the card catalogue to microfiche, which was a big step forward. It was normal to expect that the librarian had control over the information and was the gatekeeper. Secondly, if you can read the sign I am holding, it says ‘a librarian is on duty, please wait here’. It was normal for me to have a constant queue of at least 20 people with a wait time of half an hour, so the public expected (but didn’t like) to have to wait for information. Thirdly, their only way to access the information was in person or via the telephone – note the telephone on the desk. We were only able to answer the telephone if we did not have a queue in front of us. The key points are that 25 years ago information was controlled by librarians and was centralised, users found it difficult to find or get information themselves, and none of this was immediate; it all took time. For the 10% of my time when I was not on the desk I was in charge of what we called ‘community information’. This consisted of a card index file of local clubs and societies, the poster display walls (you can see one behind me) and the council leaflets.
Public libraries have always been an important part of community building and sharing, and this has not changed.
  • The most radical change I have experienced in my working life to date is the arrival of the internet. The internet changed everything for libraries. Before the internet, information was: produced by relatively few large and powerful publishers; discovered through metadata hand-crafted by librarians; expensive and centralised. Post web, information is: produced by anyone; discovered by full text and bottom-up linking effects; cheap and distributed. According to the ABS, by December 2009 two thirds of the population had broadband internet access. Obviously in Morgan, South Australia this is not taken for granted – hence the signpost!
  • With the arrival of the internet and digital technologies, libraries began to digitise resources so they could be accessed online. This was a very laborious and cost-intensive process: libraries have always been concerned about long-term preservation, not just digital access, so library standards for digitisation were high. Between 2001 and 2006 I managed the digitisation of over 100,000 items in New Zealand, including maps, photos, architectural drawings, artworks, journals, archives and documents, and around the world other librarians were doing the same. Then libraries decided to ramp it up and go into ‘mass digitisation’, for example digitising millions of pages of newspapers and books. To date millions of items have been digitised by libraries internationally. This photo of me is from 2004. I can’t describe the excitement we had when I received 50 CDs containing thousands of digitised archival images back from our digitisation contractor. These were exciting times!
  • Initially most libraries built their own websites to give users access to their resources, but we came to realise that in the digital world users do not see institution walls, and it was more helpful for them to discover resources in a large pool. Collaborative digital discovery services like Picture Australia and its New Zealand equivalent Matapihi were created. We could do this because libraries have been working to the same standards, such as MARC and Dublin Core, for many years. However, we still tended to group like formats together in services (for example digital pictures only) and we didn’t mix the non-digitised records with the digitised ones, so generally our library catalogues were still totally separate from our digital discovery services.
  • Whilst we focused on digitisation and delivering things online, we did not seem to realise that the public lost valuable social interactions that they had before in the physical environment. Web 2.0 is really about building those social engagements back into the digital world. To give you an example, when I worked in the reference library users often used to underline words in books and write their own notes by the side. If they did this they would get a stiff fine, and if they did it repeatedly they would be banned from the library, but people still continued to do it. Now on a digital book it is possible to add a virtual Post-it note, an annotation or a comment, and we encourage this! It has taken us a long time to realise that what the users want to do has not actually changed, and that it is good to let people do these things.
  • And to go into user spaces such as Facebook to connect with our new virtual communities.
  • And now we are in 2010. We have firmly acknowledged that in order to give users the best service, collaboration and data sharing are key. But there is more than this. The two direction statements the NLA is working towards are as follows: “We will explore new models for creating and sharing information and for collecting materials, including supporting the creation of knowledge by our users.” “The changing expectations of users that they will not be passive receivers of information, but rather contributors and participants in information services.”
  • In the year 2010, hot off the press (from last week actually), this is what I was saying, and Trove is the result so far of the Library’s strategic direction statements. I will play this short news video. (If I cannot play it: in the video I am talking about the importance of making data from libraries, both digital and not, accessible through a search engine like Google, which is what Trove is.) The library has recognised that it is no longer enough to offer only your own data, or collaborative library data; the public want the widest possible access to the widest amount of content from any cultural heritage institution or relevant organisation that is on, about or by Australians. Trove does this. It also enables the public to engage in new ways with data and add value to it, and soon they will also be able to engage with each other within the virtual community that has been created.
  • At this point, since I’ve now mentioned the ‘G’ word, I want to remind you all why we need libraries and why we are different to Google or Amazon or Wikipedia. It’s because we have made some promises about our content: long-term preservation and access; no commercial motives; universal access; “free for all”, ALWAYS AND FOREVER.
  • Personally I do not see the current conditions threatening libraries, I see huge opportunities for libraries in the year 2010, and a chance for them to demonstrate their relevance in society, now more than ever before. However I know some of you may strongly disagree with me here. So I’m going to talk a bit more about the 2 projects I manage which are Australian Newspapers and Trove.
  • This is the Australian Newspapers service. The site was released in August 2008 and contains millions of articles from out-of-copyright Australian newspapers from 1803 to 1954. It is being heavily used, with about a million users at present and 100,000 searches per day. Users can either keyword search or browse by date, title or state. The most popular searches are the personal names George, William and Thomas, and birth, death, murder, marriage, hanging, cricket, gold and shipping.
  • The service is the outcome of The Australian Newspapers Digitisation Program which began 3 years ago in 2007. This website gives full details about the digitisation program, including how digitisation is undertaken, progress and title selection.
  • The overall objective of the Australian Newspaper Digitisation Program is to improve access to Australian newspapers, focusing first on content that is out of copyright – so up until the end of 1954. Up until now people wanting to research historic Australian newspapers needed to go to libraries across Australia and scroll through reels of microfilm. This program aimed to provide an online service that will let people anywhere, anytime access these newspapers via the internet. The service is now available. It is free. You can full text search across every page of every newspaper in the service, including advertising, cartoons, letters to the editor as well as the news and sports articles.
  • The program is national and collaborative. Every state and territory library in Australia is involved. Each state and territory has made its selection of titles and provided microfilm to the NLA, and by 2011, 40 million articles will be available. After this it is hoped that state, territory and public libraries will begin to contribute regional titles.
  • A great enabler in this program was the existence of ANPlan – the Australian Newspaper Preservation Plan. This is also a national collaboration between the state and territory libraries. For a number of years ANPlan had been working to ensure that significant newspapers were preserved via microfilming. This meant that the microfilms could be sourced and used for digitisation. Digitising newspapers from microfilm costs about a third of that from hard copy, so this has enabled much more content to be digitised.
  • A complete microfilm of the Sydney Morning Herald did not exist, but due to a very generous donation of $1 million by the Vincent Fairfax Family Foundation we have been able to source the hard copies of the SMH and include these in the service. This title is now complete except for 3500 missing issues, which have now been found in the stacks and will be digitised and made available soon.
  • Of all the items you can digitise, newspapers are among the hardest to digitise and deliver, for a number of reasons. The NLA decided that setting up a national infrastructure and aiming to have all Australian Newspapers accessible from one place would be the best thing for the nation. This was a big challenge and one that no other country had attempted. The national infrastructure consisted of 5 things: online and offline storage for digital files; development of a NCM to manage the digitisation workflows; development of a public delivery system to provide access to the newspapers; quality assurance processes and team; and establishing a panel of newspaper digitisation contractors.
  • Also decided at the start was that the end product would not be your average database. It would meet the changing needs and expectations of users, and users would be heavily involved in the development of the service. The team largely followed the Google development path, first releasing a prototype, which was given to each of the ANPlan stakeholders for comment for 6 weeks in 2007, and then a beta version, which was released to the public in August 2008.
  • The public quickly found the site, mostly via genealogy blogs and forums, and the news winged its way around the world in a matter of seconds. This is an example of a popular international forum where the news of OCR text correction wings its way from Mary in Italy to William in Gateshead, UK, (PAGE)
  • to Zoe in London, to Uncle John in Bedfordshire, and then to Harry’s mum in Brisbane in a matter of minutes. Without any promotion I expected to perhaps have 20 users contact me to help with testing, instead it was a couple of thousand, so every user of beta was considered a ‘tester’. Feedback was gathered in a variety of ways, by e-mail, by survey, by feedback form, by reading blogs and forums such as this. User feedback strongly guided the development of the service.
  • One of the innovative features in the first release was the ability for members of the public to correct or enhance the OCR text. When digitising old newspapers, the process is to convert a digital image into full text using Optical Character Recognition (OCR) software. This works well on new, clear documents, but on old newspapers, where the font and paper are of poor quality and microfilms may be out of focus, the output often turns into gibberish. After investigating every possible technical way of improving this, we came to the conclusion that the best way was by hand and human eye. We could not possibly afford to pay contractors to do this ‘re-keying’, so the lead programmer, Kent Fitch, suggested we open it up for the public to do. If the text was made accurate, searching would be instantly improved for everyone, since the search works over the OCR text.
  • This was very controversial, since no such thing had ever been done by a library or archive before, and it was considered high risk. The risks were identified as: no one will do it, OR people will deliberately vandalise the text. The possibility that people would just do lots of it well was considered extremely unlikely. Because it was unknown whether people would do it, it was decided not to put valuable time into developing a moderation module which might be unnecessary, but just to see how it went without moderation. Also, to lessen barriers, it was decided registration would not be necessary. To reduce risk, the added data would be kept in layers and not integrated into the original metadata (although it would appear to users as if they had changed the actual metadata). All layers of data would be searched. While we were doing this we also decided to add the ability for people to add comments and tags to articles. NEXT – quick demo.
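The layering idea in this note can be illustrated with a minimal Python sketch. This is not the NLA's implementation: the `Article` class, its field names and the sample text are all hypothetical, and the sketch only assumes what the note states, namely that corrections sit in a separate layer, the original OCR text is never overwritten, and search covers every layer.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Article:
    """Hypothetical article record: the OCR text is never overwritten."""

    ocr_lines: list[str]
    # Each correction is kept as its own layer entry: (line_no, user, text).
    corrections: list[tuple[int, str, str]] = field(default_factory=list)

    def correct(self, line_no: int, user: str, text: str) -> None:
        """Record a correction without touching the original OCR line."""
        self.corrections.append((line_no, user, text))

    def display_text(self) -> list[str]:
        """What the user sees: OCR text with the latest corrections applied."""
        lines = list(self.ocr_lines)
        for line_no, _user, text in self.corrections:
            lines[line_no] = text  # later corrections win
        return lines

    def matches(self, term: str) -> bool:
        """Search every layer: the original OCR and all corrections."""
        term = term.lower()
        layers = self.ocr_lines + [text for _, _, text in self.corrections]
        return any(term in line.lower() for line in layers)
```

With this shape a vandalised or mistaken correction can be dropped from the layer without any loss, because the OCR text underneath it was never changed.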
  • This is the article view. Users can zoom in or out and choose to view the article in the context of the entire page. They can also navigate to any other page within the newspaper issue. The electronically generated text created through the OCR process is displayed on the left hand side. This is also where the users can use the 3 enhancement features. They can drag the viewing pane to see more of it or less. Users can tag the article with keywords and they can write comments and notes about the article. If users log in they will be able to choose to make their tags and comments public or private, so they can share their comments with all users or add their own private research notes that only they can access. One feature that we believe is innovative and not available in any other online newspaper service is the ability for the user to correct the electronically generated text. There are a number of reasons why the electronically created text is not always 100% accurate, mainly to do with the quality of the original newspaper that the image was created from. Users can correct the text by clicking on the ‘Help fix this text’ button. We will now use these features on this article. The article we are looking at is the first report in an Australian paper of the sinking of the Titanic. It’s in the Northern Territory Times on 19 April 1912.
  • I want to tag the article with ‘Titanic sinking’. If a user does not log in when they first enter the service, then the first time they want to enhance an article they will be offered the option to log in. At this point they can either log in or enter the captcha to verify they are human (and not a robot attempting to do something undesirable).
  • Once logged in or verified with captcha a user can enter their tags.
  • Now I want to add a comment. Those of you who read this article may have noticed that it was reported that all passengers were safely rescued from the Titanic and the weather was calm. I’ll just add a comment to say this was unfortunately not the case.
  • Now I have zoomed in on the image and if the OCR text was inaccurate I would edit it in the box on the left. This is what we call the power edit mode. In this article the text is actually very accurate so has either OCR’d very well, or already been corrected by someone else.
  • Now we can review the article with all the enhancements we have made showing on the left: tags, comments and corrections. We can view the history of all the enhancements (both our own and other people’s).
  • The results are pretty astounding both to the National Library of Australia and the world in general. So far over 9000 users have been actively correcting text each month and they have so far corrected 12 million lines of text. They have also been using the other features especially tagging to further improve the quality and depth of the article information.
  • This graph shows the rising rate of text correction. Text correction peaks over the Christmas period and long weekends. There has been no time since the release of the service when text correction has not been taking place. It goes on 24/7.
  • So after all this activity the most common question people kept asking me was “Who are these people?” and also “Why do they do it?” Some people even suspected that the text correctors were really library staff, which is not the case. The text correctors are real, normal people. We sent some of them a survey to find answers to our questions about how long they spend correcting, why they do it, what motivates them, what would motivate them to do more or less? The responses were very interesting.
  • The three main reasons for correcting text were: (1) we’re helping to provide an accurate record of Australian history; (2) we want to record family names and help others as we go; (3) we think it is a useful cause that will help all Australians, the Library, and ourselves, and we are willing to give time for this.
  • But also people gave these reasons for doing so much – in some cases up to 40 hrs a week. Because after all if you don’t enjoy it you wouldn’t keep at it, so loving it and finding it interesting and fun were really important.
  • The motivating factors given were no different to those that motivate anyone to do anything: for example, they enjoy it, they have their own research goals, they think about the main outcome (i.e. making it better for everyone), they have been given a high level of trust and respect to do the job, and it is a challenge.
  • To maintain or increase their motivation they again gave standard motivational answers. Things we had not done which they would like were to give them detailed instructions on how to do the job, to create for them a feeling of team spirit and being part of a virtual community, to recognise their achievements and acknowledge they were making a difference, and lastly to give them more content. They said the more content they were given the more they would do. Many noted that we had not publicised the service in any way or called for volunteers and the potential to harness a lot more volunteers was vast.
  • In response to numerous requests we instigated the ‘hall of fame’. The top 5 correctors show on the home page as well as in the hall of fame. Originally the hall of fame only showed the top 10 but users wanted to see more, so now it is anyone who has corrected more than 5000 lines per month. Users are still asking for entire league tables however so they can see where they are in the big picture. This is a motivating factor for them. During development it was suggested that we need to use gaming technologies to encourage people to correct text but this has so far not proved necessary!
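The ranking rule described in this note (the top 5 on the home page, and anyone above 5000 lines per month in the hall of fame) could be sketched in Python as follows. The function name and the `(user, lines)` input shape are assumptions made for illustration, not details of the actual service.

```python
from collections import Counter

HALL_OF_FAME_THRESHOLD = 5000  # lines per month, as stated in the talk
HOME_PAGE_SLOTS = 5            # top correctors shown on the home page


def monthly_rankings(corrections):
    """Rank correctors for one month.

    `corrections` is a hypothetical iterable of (user, lines_corrected)
    events. Returns (hall_of_fame, home_page), each a list of
    (user, total_lines) pairs in descending order of lines corrected.
    """
    totals = Counter()
    for user, lines in corrections:
        totals[user] += lines
    ranked = totals.most_common()
    hall_of_fame = [(u, n) for u, n in ranked if n > HALL_OF_FAME_THRESHOLD]
    home_page = ranked[:HOME_PAGE_SLOTS]
    return hall_of_fame, home_page
```

The `ranked` list is already the full league table that users were asking for; showing each user their own position would only need a lookup of their index in it.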
  • We did not prompt the public to give their opinion on text correction, though many did anyway. Although no-one had encountered anything like this before they quickly understood the aim and thought it was a good idea. Once people understood they likened it to Wikipedia, though it was not quite the same since the original image is always there for verification.
  • All our top 5 correctors are Australians living in Victoria, New South Wales, and Queensland. The five turned out to be 6, since one was a married couple sharing a logon to do research. Of the 6, 4 are female and 2 male. One is working full-time, one is a stay-at-home mum and 4 are retired. They are aged between 38 and 70. Three of the correctors are correcting as a volunteer ‘do good’ activity and trying to think up topics to correct, whereas the other 3 are correcting around their own areas of family history and local research. 2 of the 6 are also transcribing shipping records and births, marriages and deaths for other organisations. Here are some quotes from some of our top correctors. Julie is our top corrector. She is in her thirties and is a stay-at-home mum. She mainly corrects articles on local history and murder and corrects whole articles at a time. She says “I enjoy the correction – it’s a great way to learn more about past history and things of interest whilst doing a service to the community by correcting text for the benefit of others. I keep doing it because of the knowledge that you are doing something that will benefit future people that wish to access articles on their family history.”
  • Catherine is located in Washington DC and works full-time as the Director of an e-commerce company. She says “I enjoy typing, want to do something useful and find the content fascinating. I do it to benefit others”. Also she does not watch much TV. Lyn and Maurie, a retired couple, work on it together as part of their family history shipping research. They also do voluntary work for the mariners records. They say “We get sick of doing housework, we find text correction addictive and it helps us and other people. How can you not correct errors when you see them?”
  • Mick is recently retired from IT. He says “I thought I could be of some assistance to the project. It benefits me and other people. It helps with my family research. I would do more if I had broadband and did not have to share the computer with the rest of my family!” Fay is retired. She says “I enjoy the challenge, I need something to do in my spare time and it benefits me and others”.
  • Each user has a profile page where they can view their latest tagging, commenting and text correction activities. The user profile pages are visible to other users. At this stage users cannot edit their profiles. It is desirable however that users are able to edit and personalise their profiles so they can share information about themselves and their research interests with other users.
  • By browsing user profile pages we can see 3 distinct methods that people use to correct text. This first profile shows us that this user is looking at lots of different articles with a similar subject (flying saucers and UFOs) and just correcting a few lines in each article. The profile shows the article, the date changed, the old text and the new text.
  • The next user profile shows method 2 – find an interesting article and then correct the whole article. Two of our top correctors are correcting long articles on gruesome murders; this is a popular theme. Text correctors report doing 1-3 hrs of text correction at a sitting on average. The average visitor spends 17 minutes searching and reading articles in a session.
  • Text correction method 3 – names in family notices. Text correction method 4 – methodically working through a paper eg Canberra Times
  • Several people can correct the same article. All corrections are saved and viewable in the history of the article. All versions of corrections are searched. It is the last correction that is visible in the left hand pane. Articles are corrected by many users when they are either very long, very significant, or very illegible. For example, this article is in the first Australian newspaper – the Sydney Gazette and NSW Advertiser of March 1803. Around 20 people have made corrections to this article. It is particularly challenging because of its use of the long s (ſ), which looks like an f, in place of an s.
  • This is the text correction history of this article, showing all the different users and what parts they corrected.
  • The lessons we have learnt from this activity are that engaging with users and building virtual communities is just as important to the users as providing the data itself. They want to be part of a community. By giving the users a high level of trust we have built commitment and loyalty in the community. Another lesson we have learnt is that using the term ‘text correction’ is not always helpful. It implies that something will be corrected and the old version deleted, which has caused concern to stakeholders and to the public. However, as users undertake the activity it has become apparent that what they are doing is ‘enhancing’ or ‘enriching’ the data. They are actually creating layers on top of the original data, and all the layers can be transparent and separately or jointly searchable. The term ‘enhancement of data’ has not yet become common terminology in Australian libraries, but it will not be long before it does and is commonly understood by both the public and libraries. Lastly, we know that Australian Newspapers has had a big ‘social impact’ on people’s lives and the genealogical community. We are unable to quantitatively measure the impact or predict what may happen next.
  • Because the work these people are doing is so invaluable the Director General of the NLA decided to honour the top correctors in the annual NLA Australia Day Awards (which is usually for library staff). The text correctors are considered part of the NLA family for the invaluable work they are doing. It was very interesting for me to finally meet these individuals in person and be able to thank them. They had not necessarily realised what an impact they were making.
  • Julie the top corrector has featured in the media and become a star. She loves correcting articles about Bendigo murders.
  • Almost without realising it, we had unleashed something amazing. We had gone to the next level of Web 2.0, which I shall call crowdsourcing. The crowd (the general public) were working together to improve the quality of the newspapers for the common good so that searching would be more effective. In the same way that people thought it would be impossible for the public to create their own online encyclopedia without any rules (Wikipedia), the public had gathered together to assist the NLA in a mammoth task it could never have done on its own.
  • The traditional model of libraries holding control over data shifted to the community. I like this quote from Barack Obama on community engagement and volunteering: “Don’t under-estimate the power of people who join together …. They can accomplish amazing things”. This is true. People want to achieve amazing things, and we as librarians have the power to give them both the data and the tools to do this – they will do the rest themselves. The challenge for the library is now how to nurture, sustain and grow this virtual community we have created and their resulting activities.
  • We are now quite sure that the community has the enthusiasm, knowledge and time to help us.
  • And we can benefit from this. Crowdsourcing enables us to achieve goals that we would never have the resources, financial or staff, to achieve ourselves, and the community can add value to our services and help improve our resources. The community is actively engaged and we are able to effectively utilise their knowledge.
  • In turn we are encouraging a sense of public ownership and responsibility towards cultural heritage items, many of which hold significance for our nation. We build trust and loyalty of our community and through the activity we can demonstrate the relevance and value of libraries in our society today.
  • I am now going to talk briefly about Trove – the search engine for Australian resources. The Library had a master plan, and the Australian Newspapers service was in fact a test bed for the idea to transform service delivery and our internal IT infrastructure in the future. Because Australian Newspapers worked so well, the beta model of software development, the underlying IT infrastructure and the approach to user engagement have been applied to all the other discovery services the library manages, which are rolled into Trove.
  • Trove is an aggregation of 90 million items from over 1000 libraries and other organisations. Its key feature is the single search across different types of content. Trove has social and data engagement features. Two of our most heavily used services, Australian Newspapers and Picture Australia, are included in Trove. Trove aims to help you find and get unique Australian resources, and although it predominantly features library, archive, museum and gallery data, it is not limited to this.
  • The key features of Trove are: 1. Firstly, and most importantly, it is a single search. In one click you can simultaneously search across several groups of information: books, journals, magazines and articles; images; Australian digitised newspapers; diaries, letters and archives; maps; music, sound and video; archived websites; and information about people and organisations. 2. Secondly, you can browse through these groups or zones one at a time if you prefer to seek only one type of content, for example newspapers. 3. Thirdly, you are able to restrict your searches to online content only, and/or content held in locations near to you. This is a very useful feature for the large majority of users.
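As a rough illustration of a single search that returns results grouped by zone, with the optional "online only" and single-zone restrictions, here is a small Python sketch. The record fields (`zone`, `title`, `online`) and the `search` function are hypothetical and say nothing about how Trove is actually built.

```python
from collections import defaultdict


def search(records, term, online_only=False, zone=None):
    """Return matching titles grouped by zone.

    `records` is a hypothetical list of dicts with the keys
    'zone', 'title' and 'online'. Passing `zone` limits the search
    to one zone; `online_only` keeps only online items.
    """
    term = term.lower()
    results = defaultdict(list)
    for rec in records:
        if zone is not None and rec["zone"] != zone:
            continue
        if online_only and not rec["online"]:
            continue
        if term in rec["title"].lower():
            results[rec["zone"]].append(rec["title"])
    return dict(results)
```

For example, `search(records, "ethel turner", online_only=True)` would return a dictionary whose keys are the zones that contain at least one online match, which mirrors the per-zone result counts shown on the results screen.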
  • Results are unbiased: relevancy ranking returns the best and most relevant information possible, similar to the values of a good reference librarian (subject to the initial choices made by the user, e.g. location, immediacy). Results are returned in the same zones that we saw on the home page. You can see in each zone how many results are found. Most searches retrieve vast numbers of results because of the wealth and richness of the repository that is being searched. It is likely that you will want to refine or limit your search results, and you can do this by using the facets on the left-hand side of the screen. The facets change depending on what content you are looking at, so for example the book, journal, magazine and article zone has a facet to refine by braille book or audio book. We recognise that many people just want items that are immediately accessible, i.e. digitised or online, as fast as possible, so the links to online content appear immediately at this stage, although we haven’t yet drilled down to a detailed results screen. The check boxes to restrict the content to online or Australian are always visible so that they can be checked or unchecked at any point in the search. Here is a concrete example. Suppose a scholar is researching the life and works of Ethel Turner, the author of “Seven little Australians”.
Through Trove that scholar is able to discover: books by and about Ethel Turner, with information on the location of those books in Australian libraries, and with access to the full content where the work is out of copyright; articles, conference papers, theses and other research dealing with Ethel Turner, including content from university open access repositories; pictures of Ethel Turner from libraries, museums and archives; newspaper articles dealing with Ethel Turner and published prior to 1955; archived web sites that refer to Ethel Turner; music, sound and video resources, including audio books and information about the ABC television series of Seven little Australians; information about papers, letters, diaries and other records relating to Ethel Turner that are in archival collections; and biographies of Ethel Turner from sources such as the Australian Women's Register, the Dictionary of Australian Biography Online, and Wikipedia. Note that last point. Trove includes biographical data: it serves as the online interface to the data contribution program called “People Australia”. I am now going to drill down further into the results in some of the zones to show you some other features of the service, starting with the books, journals, magazines and articles zone. Let’s start by selecting the first book in the list, Seven little Australians by Ethel Turner.
  • We have applied the FRBR work and version structure to resources. Therefore at the top of the screen you can see the details of the work, Seven little Australians. Beneath this all the different editions (117 in this case) are grouped together in a box. Grouping them together like this makes it much quicker for users to find items than listing every single item as a separate record, as is usual in a library catalogue. The online versions are always listed first in this version box, which helps users who want to ‘find and get’ as quickly and easily as possible. All versions are expandable and collapsible if you want to see more detail. On the right-hand side are works which may be related. For books, at version level you can check the copyright status and have the citation provided in a variety of formats.
  • We have enabled direct linking through to bookshops that sell the item you are looking at. If no match for the item is found, suggestions of bookshops that may have it are given. We have pre-populated versions with tags and reviews from Amazon and Wikipedia, and users can also add their own tags and comments to versions. Because of the difficulty we have had in correctly putting items into version/edition groups due to inconsistent data, users can help improve the display by merging or splitting versions or works if they notice they are grouped incorrectly. Guidance is given on this in the help.
  • When viewing results in zones you have the option to expand a zone so it fills the page by using the arrow, or to minimise some or all of the other zones by using the minimise icon.
  • We are now viewing the original diaries of Ethel Turner, which are held in the State Library of NSW. You can see that I have now minimised the other zones on the right.
  • We are in the process of integrating the Australian Newspapers fully into Trove and expect to be redirecting users from the blue version to the Trove version (which will have the same functionality) in June.
  • We are enhancing the user profiles. In order to find items in libraries near you, the service needs to know where you are, so you set your library preferences in your profile after registering. It is not compulsory to register to use the service; you register only if you want to. Your profile also keeps a history of your data enhancements such as tagging, commenting, corrections, merging and splitting.
  • The recent interactions users have made are also displayed on the homepage for everyone to see. You can see the number of searches in the last hour, newspaper article corrections so far today, works merged or split this week, items tagged this week, and comments this month.
  • For example, users can now comment on items such as images, books and archives as well as newspaper articles, share valuable information, and rate items.
  • Tags are a means for users to group items they are interested in, highlight specific items, and provide an additional way of searching.
  • This is the result list for the tag ‘sinking of the centaur’
  • Trove is in an early stage of development. Plans for the year ahead are centred around expanding content and developing new features. The Library is interested to hear from potential new contributors who will be able to have their metadata harvested. Important features the Library is working on are a forum through which users can communicate with each other, alerts for new content, the ability to re-purpose content via APIs (for example, if you use Primo you may be able to utilise the metadata from Trove), and enhanced getting options. We don’t want users to find dead ends in Trove. I would encourage you all to use Trove for searching if you haven’t already, and then you can discover for yourself how easy it is to find a wealth of high quality Australian information. Then please pass the word on. You could use Trove in your Library and Information Week promotions. Whether you are tracing your family history, researching a topic, reading for pleasure, teaching or studying, Trove will help you. Trove is a free service for all Australians.
  • To summarise what I said earlier, I think 2010 is a year of opportunity for libraries. We really need to be thinking about transforming from holding power and control over information to enabling freedom over its use, sharing and re-purposing. “Freedom is actually a bigger game than power. Power is about what you can control. Freedom is about what you can unleash.” This quote really resonates with me. Roles for librarians and users are changing.
  • When you go away today I would like you to think about the following things in the big picture context, and how you as a librarian have a role to play in making some of these things happen. The importance of collaboration for digitisation, storage, service delivery, crowdsourcing: “Gravity”. Building social engagement into our digital interactions: tools. What we may want crowdsourcing help with. Why we want the help: improve quality, social engagement, add new content.
  • Ensure we work with open standards for data sharing, e.g. OAI. Possibilities of data exchange with APIs. Making our data discoverable via Google and Wikipedia. Changing institutional strategic thinking from power/control to freedom.
  • Thank you for listening to me today. I am happy to take questions.
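The work and version grouping described in the notes on the books zone can be sketched in a few lines of Python. This is only an illustrative sketch, not a description of how Trove is implemented: the `Version` class, its fields and the `group_versions` function are hypothetical names, and the sketch assumes every record already carries a reliable work identifier, which the notes make clear is the hard part in practice because of inconsistent data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    """One edition of a work (hypothetical record shape)."""
    work_key: str   # assumed identifier shared by all editions of a work
    label: str      # short description of this edition
    online: bool    # True if the full content can be read online


def group_versions(versions):
    """Group editions under their work, listing online editions first."""
    works = defaultdict(list)
    for version in versions:
        works[version.work_key].append(version)
    # sorted() is stable, so editions keep their original order
    # within the online group and within the offline group.
    return {
        key: sorted(group, key=lambda v: not v.online)
        for key, group in works.items()
    }


records = [
    Version("seven-little-australians", "print edition A", False),
    Version("seven-little-australians", "digitised edition", True),
    Version("seven-little-australians", "print edition B", False),
    Version("another-work", "print edition", False),
]
grouped = group_versions(records)
for key, group in grouped.items():
    print(key, [v.label for v in group])
```

Sorting on a single boolean keeps the rule from the notes (online versions first) separate from the grouping itself, so a merge or split by a user would only need to change a record's `work_key`.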
  • Transcript

    • 1.
      • Rose Holley
      • Manager, Trove
      • 2010 Reference at the Metcalfe Seminar, SLNSW
      • 4 May 2010
      • Collecting, sharing and improving data:
      • Changing roles for librarians and users
    • 2. Women librarians 1907
      • Cite:
    • 3. Reference Librarian 1985
    • 4. Arrival of the Internet Photo courtesy Genevieve Bell. Location: near Morgan, South Australia
    • 5. Digitisation
      • Millions of items digitised by cultural heritage institutions
      • Maps, photos, artworks, architectural plans, journals, archives, documents, books, newspapers, music.
    • 6. Collaborative Delivery
    • 7. Web 2.0 – data engagement
    • 8. Web 2.0 user engagement
    • 9. NLA Strategic Directions 2009-2011
      • “We will explore new models for creating and sharing information and for collecting materials, including supporting the creation of knowledge by our users.”
      • “The changing expectations of users that they will not be passive receivers of information, but rather contributors and participants in information services.”
    • 10. Librarians 2010
    • 11. Why do we need libraries?
      • Long term preservation and access
      • No commercial motives
      • Universal access
      • “Free for all”
      • ALWAYS and FOREVER….
    • 12. 2010 Library Opportunities
      • Technology has turned discovery on its head:
      • Content can be created by anyone
      • Content can be described by anyone
      • Libraries are still needed:
      • Vast amounts of data
      • Information expertise
      • Gatekeepers – open doors with technology
    • 13. Australian Newspapers 17 million articles now, 40 million by 2011
    • 14.
    • 15. Objectives
      • Increase access to Australian newspapers
      • Build a national service that will provide free online access from the first Australian newspaper published in 1803 through to the end of 1954
      • Key Features of the service
        • Online access
        • Freely available
        • Full text searchable
    • 16. National Program and Content
      • Initial focus on major titles from each state and territory
      • ‘Regional’ titles being contributed by libraries 2010 onwards
      • Coverage: published between 1803 – 1954
      • (out of copyright)
      West Australian, Northern Territory Times, Courier Mail, Advertiser, Sydney Morning Herald, Sydney Gazette, Argus, Mercury, Canberra Times
    • 17.
    • 18. Sydney Morning Herald 1831 – 1954 now available online
    • 19. National Infrastructure
      • Storage
      • Newspaper Content Management system (digitisation workflow)
      • Public delivery system
      • Panel of digitisation contractors (mass digitisation)
      • Quality assurance processes and team
    • 20. Prototype/Beta
    • 21.  
    • 22.  
    • 23. Text correction
    • 24. Greatest fears!
      • No one will do it
      • OR
      • People will deliberately vandalise the text.
      • Questions?
      • Moderation?
      • Login?
      • Integration of data?
    • 25. Interaction at article level
    • 26. Add a tag ‘titanic sinking’
    • 27.  
    • 28. Add a comment
    • 29. Fix text – power edit mode
    • 30. After enhancements
    • 31. Achievements
      • March 2010 (1.5 yrs since release)
        • 9,000+ volunteers
        • 12.5 million lines of text corrected (600,000 newspaper articles)
        • 400,000 tags added
        • 7,600 comments added
    • 32. Text Correction Activity
    • 33. “Who are the text correctors?” Flickr: LucLeqay
    • 34. Why correct text?
      • Australian history - Helping to provide accurate record (sometimes linked to local history research)
      • Family Names - Doing family history and help others with names as they go by correcting
      • Useful cause and want to help Australian community/Library/themselves
    • 35. Comments from text correctors
      • I love it
      • It’s interesting and fun
      • It is a worthy cause
      • It’s addictive
      • I am helping with something important e.g. recording history, finding new things
      • I want to do some voluntary work
      • I want to help non-profit making organisations like libraries
      • I want to learn something
      • It’s a challenge
      • I want to give something back to the community
      • You trust me to do it so I’ll do it
    • 36. Motivating factors
      • Pleasure
      • Short and long term goals
      • Concentrating on outcomes
      • Trust and Respect given
      • The challenge
    • 37. Maintaining motivation
      • Detailed instructions - If you want a specific result, give us specific instructions. We will work better when we know exactly what’s expected.
      • Team Spirit - Create an online environment of camaraderie. We’ll work more effectively when we feel like part of team or virtual community. We don’t want to let others down.
      • Recognize achievement - Make a point to recognize achievements one-on-one and also in group settings. We like to think we are being noticed and are making a difference. Show us how we fit into the big picture.
      • Raising the bar – The more we do the more you should expect us to do. We’ll do a lot more if you give us a lot more content. That would be our highest motivational factor.
    • 38. Hall of Fame
    • 39. Views of the public
      • ‘OCR text correction is great! I think I just found my new hobby!’
      • ‘It’s looking like it will be very cool and the text fixing and tagging is quite addictive.’
      • ‘An interesting way of using interested readers “labour”! I really like it.’
      • ‘A wonderful tool - the amount of user control is very surprising but refreshing.’
      • ‘I applaud the capability for readers to correct the text.’
    • 40. Profiles of top correctors
    • 41.  
    • 42.  
    • 43. User profile page
    • 44. Text Correction – method 1
    • 45. Text correction – method 2
    • 46. Text Correction – Method 3
    • 47. One article corrected by many
    • 48. View all corrections on this article
    • 49. Lessons Learnt
      • Engaging with users just as important as improving data quality (in opinion of users)
      • Giving users high level of trust results in commitment and loyalty
      • ‘ Correction’ implies deletion vs ‘Enhancement’ implies adding layers safely
      • Big social impact
    • 50.
    • 51. 391,378 lines improved
    • 52. Crowdsourcing
      • Web 2.0 = Social engagement on the internet
      • Interactions with data and other users:
      • Helps users to help themselves
      • Crowdsourcing
      • Many people working together to achieve a big goal via web 2.0 features. Result usually for common good and will benefit many.
      A book to read: Clay Shirky, ‘Here Comes Everybody’.
    • 53. The power
      • "Don't under estimate the power of people who join together…. they can accomplish amazing things,"
      • Barack Obama 19 Jan 2009 Speaking on community engagement and involvement and voluntary work
      • Rose says:
      • People want to work together to achieve amazing things – we as librarians have the power to give them both the data and tools to do this - they will do the rest……
    • 54. Community has: 1. Enthusiasm 2. Knowledge 3. Time
    • 55. Benefits for libraries
      • Achieving goals that the library does not have resource for
      • Improving/adding value to your resource/service
      • Active engagement with the community
      • Utilising knowledge of community
    • 56. Benefits
      • Encouraging sense of public ownership and responsibility towards cultural heritage items
      • Building trust and loyalty of the community
      • Demonstrating relevance and value of libraries
    • 57.
    • 58. Content sources
      • Australian Collaborative Services
        • ANBD – 1000 libraries
        • Pandora - websites
        • ARO - Research
        • RAAM - Archives
        • Picture Australia
        • Australian Newspapers
      • Open sources
        • Open Library (Internet Archive)
        • HathiTrust
        • OAIster
      • Targets – websites
        • Amazon
        • Wikipedia
        • Google Books
        • YouTube
    • 59. Browse groups/zones; single search; restrict search
    • 60. Refine/limit search results; groups/zones results; get item
    • 61. Grouping of versions; get options
    • 62. Buy; add tag; add comment; merge/split versions and works if incorrect
    • 63. Minimise; expand
    • 64. Minimised zones
    • 65.  
    • 66. User profile: your settings and history
    • 67. finding information just got easier.....
    • 68.
    • 69.
    • 70.
    • 71. Trove: Future developments
        • Expanding content – new contributors
        • New features
          • Forum
          • Adding context to and between items
          • RSS feeds
          • API
          • Enhancements re getting options
          • Site harvesting by Google
    • 72. Power vs Freedom
      • “Freedom is actually a bigger game than power. Power is about what you can control. Freedom is about what you can unleash.” Harriet Rubin
      • Changing role for librarians and users………
    • 73. What Do Librarians Need to Consider?
      • The importance of collaboration for digitisation, storage, service delivery, crowdsourcing. “Gravity”
      • Building social engagement into our digital interactions – tools.
      • What we want crowdsourcing help with?
      • Why we want the help: improve quality, social engagement, add new content?
    • 74. What Do Librarians Need to Consider?
      • Ensure we work with open standards for data sharing e.g. OAI
      • Possibilities of data exchange with APIs
      • Making our data discoverable via Google and Wikipedia
      • Changing institutional strategic thinking from power/control to freedom
    • 75. [email_address] “Rose, the site you manage is a nightmare! It’s addictive. Keeps me awake at night. Congratulations! Mary.” Questions?