Hopefully my talk today will serve as a bridge between the organizational and personnel concepts that we have talked about today and the more technical aspects that will be discussed in Module 2 tomorrow. Essentially our world, both personal and professional, is becoming more socially network driven. This demands a different way of working, both within our organizational confines and in how we work with others. Instead of the linear ways in which our world and work have been organized and linked (columns and rows, labels and categories), we have a flatter structure, where labels and relationships can shift and there are more opportunities to engage and do more, but differently.
Capacity to do more, but it is imperative to examine what work is no longer a priority. Driving interdisciplinary scholarship. More opportunities for individuals to work on projects outside of their normal organizational confines. Opportunities for organizations to market what they do best, knowing which partnerships to engage in, and when, for the maximum benefit.
The President of CLIR, Charles Henry, has written much about macrosolutions, and I recommend his most recent article in the January issue of EDUCAUSE Review. Basically, macrosolutions are where institutions come together, share resources and create solutions that create convergence, or an integral dependency, to provide a service that was once locally owned. Paul Courant and John Wilkin of the University of Michigan refer to this as “above-campus” library services in an EDUCAUSE Review article from August 2010. Is this the same as outsourcing, or as some of the limited collaborative projects like interlibrary loan? Although there are similar characteristics, macrosolutions are collaboration to the extreme: it is SUPER sonic collaboration. Collaborations exist on a scale. Günter Waibel discusses this in his OCLC report Collaboration Contexts: Framing Local, Group and Global Solutions.
Deeper collaborations trend toward convergence, a transformative process that eventually will change behaviors, processes and organizational structures, and leads to a fundamental interconnectedness and interdependence among the partners. In transformative collaborations, participants find efficiencies that free up time and resources to focus on the things they do best. At the extreme end of the continuum, convergence in a specific area may turn into infrastructure: a service that is so deeply embedded into our everyday life that it becomes visible only when it breaks down.
From Collaboration Contexts: Framing Local, Group and Global Solutions, Günter Waibel
How many are familiar with HathiTrust? HathiTrust is a partnership of major research institutions building an immense digital preservation repository. A majority of the content is from Google book scans, but other digital collections are represented. This project has achieved:
• Economies of scale for digital preservation and associated services
• A larger pool of digital preservation expertise, grown through real-world experience
• Trusted collaboration
As research libraries face financial pressures and weigh the relative value of print and digital volumes, this growing digital aggregation of research library content has the potential to support current local collection development decisions.
A recent OCLC research report examines the feasibility of outsourcing management of low-use print books held in academic libraries to shared service providers, including large-scale print repositories like ReCAP and digital repositories like HathiTrust. Based on a year-long study of data from New York University’s Bobst Library, HathiTrust, ReCAP, and WorldCat, the authors concluded that there is sufficient material in HathiTrust to duplicate a portion of virtually any academic library in the United States, and that there is adequate duplication between HathiTrust and large-scale print storage facilities to enable a great number of academic libraries to reconsider their local print management operations.
As of June 2010, the median rate of duplication between titles held by university libraries in the U.S. Association of Research Libraries (ARL) and the HathiTrust Digital Library exceeds 30%; that is to say, nearly a third of the content purchased by research-intensive libraries in the United States has already been digitized and is preserved in a shared digital repository.
If the current growth trajectory of the HathiTrust Digital Library is sustained, it is projected that more than 60% of the retrospective print collections held in ARL libraries will be duplicated in HathiTrust by June 2014. This growth rate far exceeds average annual acquisitions in ARL libraries, suggesting that the digital replication of legacy collections will outpace growth of new physical collections, enabling a transformation in traditional library operations, staffing and space requirements.
The median potential space savings for an ARL library is approximately 36,000 linear feet, or the equivalent of more than 45,000 assignable square feet (a conservative estimate). The total annual cost avoidance available today would amount to $500,000 to $2 million (13,828,000 to 55,312,000 roubles) per ARL library, depending on the physical environment (e.g., open stacks on campus or high-density off-site storage) in which the titles would be managed locally.
There are some obstacles to achieving this vision of a cloud-sourced library. Only about 30% of HathiTrust is in the public domain. It requires a network of shared print services; the good news is that only a small number of print providers are needed to achieve 70% collection duplication. And there needs to be a service to manage access to this print repository network.
http://www.oclc.org/research/publications/library/2011/2011-01.pdf
http://downloads.alcts.ala.org/ce/06062012_ACprecon_shared_collections_planning_slides.pdf
• Increases preservation capacity.
• Reinvests space.
• Reduces risk of loss of scarce and unique copies.
• Shifts library resources to new services/materials.
• Encourages greater access through digitization.
• Increases support for scholarship through inter-institutional collaboration.
• Reduces rate of unnecessarily duplicative print collection growth.
These are the library-focused efforts, or the view from the library perspective. We haven't even gotten to the data center or HPC organized groups.
This is not the time to go it alone. For success in this area it is imperative to be connected to others, and it is complete folly not to share with others the load that this is going to take. Talk about DPN and other network efforts.
We should be paying closer attention to the data curation conversation, as it is not just another siloed service in the library; it is the core service of our libraries. Data curation activities enable data discovery and retrieval, maintain data quality, add value, and provide for reuse over time.
Our community used to expect researchers to come to us, ask us questions about our collections, and use our digital collections in our environment. Now researchers want their library resources presented as a platform...
This is how Amazon is represented as a decentered model. Replace Amazon with your library, and look at the cloud as the “cloud library” envisioned by the OCLC report. By approaching library functions in this way, the library itself ceases to be a stand-alone island, a world unto itself. It transcends the idea of place and functions more like an ecosystem, enabling the freedom to experiment and respond proactively to user needs.
In the Decentered Library model, collections and services are united not by a place or a website but by the library brand and its message. As a result, the organizational borders become more permeable, and the “library” becomes more integrated into the information network.
If our libraries are becoming more flattened and distributed, why are we still trying to organize in a linear fashion?
• An organizational model to solve the more challenging problems facing libraries, cultural heritage, and higher ed.
• A faith-based model: trust that the folks attracted to the need or opportunity are the best ones to solve the problem.
• This is the way to organize around information-based issues that require creative thinking, as these are large, overarching, international challenges that involve many stakeholders.
• It allows organizations and individuals to contribute to the process at the time and in the instance when they can do the most good and make the most impact.
• It is how the NDSA, the DPC, and the planning stages for the DPLA are organized.
• This is not an easy model: it requires lots of overhead and is risky, but it ultimately creates socially driven emergent communities capable of enacting global change.
Leadership challenge – both top down and from where you are.
Are we doing things, through hiring decisions, budget, and policy that keep us from fully taking advantage of the socially driven information ecology?
Rufus Pollock of the Open Knowledge Foundation said at a conference: the best thing to do with your data will be done by someone else. This is not a bad thing; in fact it is extraordinary. How are you and your library facilitating this? We are a remix culture. Are you invested? Or do you abstain? Or worse, restrict?
Golden moment of advocacy for libraries in the area of Open.
CLIR report: challenges of data research, curation and reuse; restrictive research environments – insert refe
Permeable: cross-discipline and institutional research space.
Courage in our skills, courage in our abilities, courage to trust our partners, and courage in order to expand our tolerance for risk.
The connectedness of it all: it is all very fluid. In order to be a thriving, relevant information organization (or better yet, knowledge organization), our libraries have to connect, collaborate, and converge in order to be a vibrant hub on the network. We need to be part of the research process and not passively wait for the end product to arrive on our loading docks and online through our subscription services. This is an unprecedented, boundless opportunity for libraries, limited only by us.
Boundless Opportunity
The Impact of Cloud-Based* Services for Libraries
Rachel L. Frick
Director, Digital Library Federation
Council on Library and Information Resources
Ticer Summer School
August 21, 2012
Cloud-Based* Services
• Not just technical infrastructure
• Distributed services, collections, expertise
Network Opportunities
• Capacity to do more
• Leverage local expertise
• Amplify local excellence
Macrosolutions: towards convergence
“Common to these efforts will be developing strong coalitions that bring together diverse institutions within a national framework; federating shared resources and interests, including collections, technology, and expertise; and creating a genuine, volitional dependency on other participating institutions for the provision of what was once a locally owned and managed asset. We are calling these collaborative projects macro solutions.”
CLIR Annual Report, 2009-2010, p. 3
Collaboration Continuum
• Common Interest
• Common Values
• Convergence
http://www.oclc.org/research/publications/library/2010/2010-09.pdf
High Risk / High Reward
• Requires high trust threshold / risk tolerance
• Dependence on others
• Less control
Research Library at Web-scale
• 10,449,391 total volumes
• 5,516,747 book titles
• 272,663 serial titles
• 3,657,286,850 pages
• 468 terabytes
• 124 miles = 199.5 kilometers
• 8,490 tons (US) = 7,702 metric tons
• 3,140,629 volumes (~30% of total) in the public domain
Cloud Sourcing Library Collections
Managing Print in the Mass Digitized Library Environment
Constance Malpas, 2011
• 1/3 of U.S. ARL content duplicated in HathiTrust
• Shared Print Archiving / Collective Collections
• Regional Print / Digital Archives Service Centers
http://www.oclc.org/research/publications/library/2011/2011-01.pdf
Print Archiving: network scale
• ReCAP - http://recap.princeton.edu/
• WEST - http://www.cdlib.org/services/west/about/
• ASERL / University of Florida: US Gov Docs - http://www.aserl.org/programs/gov-doc/
• Maine Shared Print - http://www.maineinfonet.net/mscs/
• Organizational Node: Center for Research Libraries Print Archive Community Forum - http://www.crl.edu/archiving-preservation/print-archives/forum
New Metrics
How do we:
• Count collections?
• Measure “quality”?
• Reward high ratios of services, collections per budget $?
• Rate trustworthiness?
• Identify good collaborators / team players?
http://www.flickr.com/photos/blackcountrymuseums/4887803840
Pause for a moment http://www.flickr.com/photos/hckyso/3870006964/
Networked Collections: not just books
• Digitized Primary Resource Collections
  • Europeana - http://www.europeana.eu/portal/
  • Biodiversity Heritage Library - http://www.biodiversitylibrary.org/
• Scholarly Communications
  • OA publications / IRs, disciplinary repositories
• Research Data
  • DataONE - http://www.dataone.org/
  • OpenAIRE - http://www.openaire.eu/
Challenge of Data Collections
• BIG DATA vs. small data
  • Data sharing, small science and institutional repositories. Melissa H. Cragin, Carole L. Palmer, Jacob R. Carlson, and Michael Witt. Philosophical Transactions of the Royal Society A 2010; 368(1926): 4023-4038. doi:10.1098/rsta.2010.0165
• Preservation services
  • Brief online interview with Sayeed Choudhury, JHU. http://youtu.be/oWw7Ifn1Xx8
• Data post-production services: access, reuse, remix
Challenge of Data Collections
• Researchers aligned with discipline, not institution
• Restrictive campus IT policies
• Not adequate network storage
• Focused on publication, not curation
• Data breach (privacy) top concern
• Library viewed as dispensary of goods, not a data service partner
http://www.clir.org/pubs/reports/pub154
Data Preservation Communities
• Professional organizations providing guidance
  • International Digital Curation Centre - http://www.dcc.ac.uk/
  • Digital Preservation Coalition - http://www.dpconline.org/
  • National Digital Stewardship Alliance - http://www.digitalpreservation.gov/ndsa/index.html
  • Open Planets Foundation - http://www.openplanetsfoundation.org/
• Centers that “bridge the gap”
  • Data to Insight Center - http://d2i.indiana.edu/
  • D2C2 - http://d2c2.lib.purdue.edu/
  • UC3, California Digital Library - http://www.cdlib.org/services/uc3/
• Networks that balance the load
  • TextGrid - http://www.textgrid.de/
  • DataONE - http://www.dataone.org/
  • Data Conservancy - http://dataconservancy.org/
Why prioritize data curation services?
• Data are emerging as the research output of importance
  • Data papers, for example Ecological Society of America: http://esapubs.org/archive/archive_D.htm
  • Data citation - http://www.datacite.org/
  • Databib - http://databib.org/
• Published journal articles will be less important
  • Metadata of the research data
  • Gravemarker of research activity and version of dataset
What are the conversations on your campus?
• How is the library positioning itself in your campus’ data ecology?
  • Active participant?
  • Research partner?
  • Passive, at the end of the process?
• How is your library connected to larger data communities?
http://www.flickr.com/photos/marcwathieu/2979581445/
Collections = DATA
• Data sets are not just scientific and business tables or spreadsheets
• Not just generated by satellites and sensors
• Libraries (archives, museums): potential distributed data stores
Computational Research
• Digital Humanities
• Digging into Data Challenge - http://www.diggingintodata.org/
• CLIR publication: One Culture - http://www.clir.org/pubs/reports/pub151/pub151.pdf
Case Study: Historic Newspapers
• Chronicling America
  • http://chroniclingamerica.loc.gov/
  • 5 million page images from historic newspapers with OCR from organizations in 25 states
  • ~4 million hits per day
• Traditional research:
  • SEARCHING for stories
• Data research:
  • MINING newspaper OCR for trends across time periods and geographic areas
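To make the searching versus mining distinction concrete, here is a minimal Python sketch of the mining side: counting how many pages mention a term, grouped by decade and state. The records, the field layout (year, state, OCR text), and the function names `decade_of` and `term_trend` are all hypothetical, invented for illustration; this is not the Chronicling America data model or API.

```python
from collections import Counter
import re

# Hypothetical OCR page records: (publication year, state, OCR text).
# These rows are invented for illustration; they are not real newspaper data.
PAGES = [
    (1898, "Kansas", "The railroad reached town and the railroad depot opened."),
    (1899, "Kansas", "Wheat prices fell again this season."),
    (1905, "Texas", "New railroad line planned; drought continues."),
    (1912, "Texas", "Drought relief meeting held at the courthouse."),
    (1913, "Kansas", "Railroad strike ends after two weeks."),
]


def decade_of(year):
    """Map a year to the first year of its decade, e.g. 1898 -> 1890."""
    return (year // 10) * 10


def term_trend(pages, term):
    """Count pages mentioning `term`, grouped by (decade, state)."""
    # Whole-word, case-insensitive match against the OCR text.
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    counts = Counter()
    for year, state, text in pages:
        if pattern.search(text):
            counts[(decade_of(year), state)] += 1
    return counts


if __name__ == "__main__":
    for (decade, state), n in sorted(term_trend(PAGES, "railroad").items()):
        print(f"{decade}s {state}: {n} page(s)")
```

A search returns individual stories; the same pass over the whole collection returns a table of counts that can be charted across time and geography. Real OCR text is noisy, so an exact whole-word match like this one would undercount in practice.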
Case Study: Historic Newspapers http://www.stanford.edu/group/ruralwest/cgi-bin/drupal/visualizations/us_newspapers
Data Research Service Needs
• To use collections as a whole, mining and organizing the information in novel and innovative ways
• Algorithmic and visualization tools
• Working with both the artifact and its data representation
Data Collection Services
• The ingest and inventory of such collections, other than scale, is basically understood.
• How much ingest processing should be done with data collections, or collections that can be treated as data?
• Do we process collections to create a variety of derivatives that might be used in various forms of analysis before ingesting them?
• Do we have sufficient infrastructure to support full discovery?
• Do we load collections into analytical tools?
Library Service Implications
• Collections as “self-serve”
• If we only provide access to data, do we limit it to native format or provide pre-processed or on-the-fly format transformation services for downloads?
• Can we handle the download traffic?
• Can our staff develop the expertise to provide guidance to researchers in using analytical tools?
• Do we leave researchers to fend for themselves?
New Librarianship
• Honesty about the limits of re-tooling
• Re-think the librarian’s role in research
• Crucial leadership challenge
• Priorities of traditional services
• “Stop moving the books, okay?”
• Back to Basics
• Collections that are unique
• REAL research support
• Archiving, preservation, and access: distributed, but at scale
Get out of the comfort zone
• Take the time to ask the hard questions
• Consider the possibility for radical change
• Are we deciding for today? Or making the hard choice for tomorrow?
• Are we network ready?
http://www.flickr.com/photos/iamthebestartist/203179552/
Being ready
• Research environments (including library systems) with permeable borders
• Advocacy
• Value of “Open Data”
• Facilitating information flow
• Courage
http://www.clir.org/pubs/reports/pub154/pub154.pdf
Connected-ness
Bollen J, Van de Sompel H, Hagberg A, Bettencourt L, Chute R, et al. (2009) Clickstream Data Yields High-Resolution Maps of Science. PLoS ONE 4(3): e4803. doi:10.1371/journal.pone.0004803
Credits and Attribution
Ideas and contributions
• Patricia Cruse, UC3, California Digital Library
• Lorcan Dempsey, OCLC
• Josh Greenberg, Sloan Foundation
• Leslie Johnston, Library of Congress
• Günter Waibel, Smithsonian Institution
• Jon Voss, Historypin / We Are What We Do
• Martin Kalfatovic, Smithsonian Institution Libraries / BHL
• Charles Henry and my colleagues at CLIR