Lipstick on a Pig: Integrated Library Systems 30 November 2010
Tip of the week: Communication • Writing for your library colleagues is not like writing academic papers. • No required page counts, and no points for prolixity. • Nobody cares about your erudition. (Lit review only to make a serious point, or to leverage peer pressure.) • Get to the point. Fast. • “Executive summary” for people who won’t read it all (which is most of them!). • Bullet points are good. Compound-complex sentences and ten-dollar words are bad. • Always ask: What is the purpose of this thing I am writing? Who cares? • Communicate? Persuade? Train? • Upbeat matters. Even if you’re frustrated, tired, fed up.
Tool of the week: Creative Commons• Creators who license their creations for reuse; no need to ask permission • But observe any conditions on the license!• Images: http://flickr.com/creativecommons/ • or Compfight: http://compfight.com/ • or Flickr Storm: http://zoo-m.com/flickr-storm• Music: search for “podsafe music”• Brilliant for decorating presentations and podcasts• Remember to give back!
Tool of the week: Online presence• Yes, you need to worry about your egoGoogle.• You also need to worry if you don’t have one. • Are you missing a chance to stand out in a competitive field? • Once you’re on the job market, THEY ALL HAVE MLS-ES.• You’ll have to take the bad with the good... and only you can decide what’s worth it. Consider: • Professional network (assists, job leads, conference buddies) • What fits your way of presenting yourself (blog, Flickr, YouTube, Ravelry, FriendFeed, Facebook, LinkedIn, SlideShare) • Whether and how to pull all of it together into a portfolio.
Weekly reflection (for next week’s discussion)• Google yourself. Could you find yourself? Do you have Googlegangers?• Any surprises, good or bad?• Is this what you want a potential employer knowing about you? • If not, what do you plan to do about it?• Are you satisfied with your own online privacy?
Software development models: why you care• How your software was built affects: • how much you pay for it, up-front and ongoing • which chunk of budget those costs come from • how much you can do with and to it • how much it will cost to support and train people on it • how much control you have over your data and how your data are presented to your patrons • how good it is• There is no one right answer. There are only tradeoffs, which you need to understand.
Building it yourself• Some libraries deliberately and intentionally develop their own software. Go them!• Some libraries do it by accident! • One bright tinkerer whomps something up. • The library comes to depend on it. • ... and then the tinkerer leaves. Oops. • ... or the computing world changes such that the whomped-up thing no longer works. Oops. • Tinkerers are great. But make them document. And have a plan for transitioning off the whomped-up thing!
Off-the-shelf software• What you buy in the TechStore• Made by for-profit companies • Though small developers and shareware makers are still out there!• Certain expectations of performance, stability, polish, documentation • May vary somewhat depending on customer base• May rely on proprietary file formats for customer lock-in• Pricing: usually “per seat” or “site licensed”
Vendor software• Usually springs up in niches where off-the-shelf software can’t sell enough seats • ... e.g. ILS software for libraries! Also learning-management systems!• You pay to run the software AND for a certain level of customer service • Installation help • Employee training, user groups, conferences • Technical support (up to and including vendor-run servers)• You’ll still need local tech staff, often! • Installing and customizing these things is a HASSLE.• But there will be strict limits on what you can do.
Use the source, Luke!• “Source code” = the instructions that humans write for computers to follow• “Compiled code” or “binary code” = source code that has been munged to be directly understandable by the computer • Not interpretable by humans any more! • This is the only form in which proprietary software is distributed (usually), and why you can’t peek under its hood. • “Compiler,” “interpreter,” “virtual machine” all bits and pieces of the source-code to compiled-code transformation.
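To make the source-versus-compiled distinction concrete, here is a minimal sketch using Python's built-in `compile`. One caveat up front: Python compiles to bytecode for its own virtual machine, not to machine code the processor runs directly, so treat this as an analogy for the transformation rather than a picture of how proprietary binaries are built. The variable names and the `<slide-demo>` label are just for illustration.

```python
# Source code: text a human can read and edit.
source = "total = 2 + 3"

# Compile it. The result is a code object whose instructions are raw bytes
# meant for Python's virtual machine, not for human eyes.
code = compile(source, "<slide-demo>", "exec")

print(source)          # the readable form
print(code.co_code)    # the compiled form: opaque bytes

# The compiled form still runs and does what the source said.
namespace = {}
exec(code, namespace)
print(namespace["total"])
```

If you were handed only the bytes from `code.co_code`, you could run them but not comfortably read or change them. That is the position proprietary software usually puts you in.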
Open-source software• The source code is open! • You can (legally) download and install it without paying. • You can (legally) read it. • You can (legally) change it. • You can (legally) resell it (sometimes with caveats).• Developers “license” their code under one of a number of open-source licenses • Commonest: GNU General Public License (GPL), which has a sting in its tail • Also notable: BSD license, Artistic License • OSI maintains a vetted list of open-source licenses.
Brief digression: open source, open standard, open access• Open source: refers to SOFTWARE• Open standard: refers to RULES for protocols, file formats, software specs, etc. • “Reference implementation:” software that shows how software that complies with a particular standard should work • Example: W3C’s Amaya browser• Open access: refers to the SCHOLARLY LITERATURE
I’m not a programmer. Why should I care about the source?• Do you benefit when other people hack on the software? • With open source, quite possibly yes. • If there’s a good API, quite possibly yes. • With API-less proprietary software, rarely and only indirectly.• What happens when a software company goes out of business? Or kills a product? • Proprietary software: decay and obsolescence. • Open-source software: new companies, forks, options.• Security • Security-through-obscurity doesn’t work. No software is perfectly secure, but OSS has a good track record of fast patches.
Should I use open-source or proprietary software, Dorothea?• It depends. There are tradeoffs. • $$$ vs. staff time/expertise: “free as in kittens” • Ease of use/installation vs. control • Professional support vs. ad-hoc online communities• You can’t always know what your experience will be. • Some vendor support is horrible. Some is great. Some online communities are horrible. Some are great. • Some open-source projects move fast. Some don’t. Some vendors move fast. Most don’t (most can’t!).• Only you understand your library’s situation.• ASK AROUND before you invest, either way.
The worst of OSS: DSpace• Few developers (and until recently, all volunteers), so change is slow.• Arrogant developers, so change is out of touch with actual user needs. • Why did publicly-accessible statistics take YEARS? • This has gotten better of late. It’s still not perfect.• Architecture deeply hostile to casual hacking, so innovation is slow. • APIs? What APIs? Plugins? Who needs plugins? And why should we have a space to share code?• Usability? This is open-source software! We don’t need no stinkin’ usability!
The worst of vendor software: ILSes• Migration is a huge hassle, so vendors lock in customers and have little further incentive to serve them.• Heinous hardware-price markup• Totally opaque data models; few APIs; licenses that forbid tinkering• Horrendous customer support• Stunningly slow to innovate (partly our fault!)
What’s an ILS?• Integrated Library System• THE system that handles library operations.• “Modules” • Acquisitions • Cataloguing • OPAC • Circulation/patron management • Also: serials, metasearch, e-resource managers (sometimes), link resolvers... separately or bundled• Underneath: heap big relational database!
State of the market• Big consolidations in mid-decade • Players: Endeavor (Voyager), Ex Libris, Sirsi/Dynix (Horizon)• Up-and-coming open-source packages • Koha: geared toward public libraries • Evergreen: geared toward library consortia, is building code for academic libraries (e.g. serials management) • eXtensible Catalog Project: University of Rochester• Some service innovation • WorldCat Local • LibraryThing for Libraries• Typical ILS replacement cycle: 5 to 10 years
Lipsticking the pig• Libraries turned to outside vendors, homegrown solutions • NCSU: adopted Endeca, who are a web-commerce firm • UVa: Solr/Flare/Blacklight (ha ha ha) • Scriblio, VuFind, etc.• What were they looking for? • USABILITY! • Faceted searching/browsing • Better associations among records (quasi-FRBRization) • Better correlation between user language and controlled vocabularies • Generally: making the data work harder!
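Faceted browsing is easier to grasp once you see how little machinery it needs. The following is a rough sketch, not how Endeca or Solr actually do it: count the values of a few fields across a result set, show the counts as clickable filters, and narrow the set when the user picks one. The records and field names (`format`, `language`) are made up for illustration.

```python
from collections import Counter

# Made-up records standing in for a search result set.
records = [
    {"title": "Example Title A", "format": "book", "language": "eng"},
    {"title": "Example Title B", "format": "book", "language": "spa"},
    {"title": "Example Title C", "format": "map", "language": "eng"},
]


def facet_counts(records, fields):
    """Count how often each value appears in each facet field."""
    return {
        field: Counter(rec[field] for rec in records if field in rec)
        for field in fields
    }


def narrow(records, field, value):
    """Keep only the records matching the facet value the user clicked."""
    return [rec for rec in records if rec.get(field) == value]


counts = facet_counts(records, ["format", "language"])
print(counts["format"])                     # how many books, maps, ...
books = narrow(records, "format", "book")
print([rec["title"] for rec in books])
```

The point for librarians: facets are only as good as the fields underneath them. Inconsistent or missing values in the records show up directly as useless facets.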
More pieces: Link resolvers and OpenURL• You have a citation. How do you find out if the library has the article among its e-resources?• OpenURL: protocol for checking citation information against a library’s list of vendor-provided e-journals and article databases • Pack citation info into a URL or a teeny XML document• Link resolver: gizmo that takes in an OpenURL and returns list of available copies.• SFX (Ex Libris) current market leader
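Here is roughly what "pack citation info into a URL" looks like, as a sketch using the key/value flavor of OpenURL 1.0 (Z39.88-2004). The resolver address and the citation are invented for illustration; your library's link resolver will have its own base URL, and real resolvers accept more fields than shown here.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical resolver address; substitute your own library's.
RESOLVER_BASE = "https://resolver.example.edu/openurl"


def build_openurl(citation, base=RESOLVER_BASE):
    """Pack a journal-article citation into an OpenURL 1.0 query string."""
    params = {
        "url_ver": "Z39.88-2004",                       # OpenURL 1.0
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",  # "this is a journal citation"
    }
    for key, value in citation.items():
        params["rft." + key] = value                    # rft = the thing being cited
    return base + "?" + urlencode(params)


# Made-up citation, for illustration only.
citation = {
    "atitle": "An Example Article",
    "jtitle": "Journal of Examples",
    "volume": "12",
    "spage": "34",
    "date": "2009",
    "issn": "1234-5678",
}

url = build_openurl(citation)
print(url)

# A resolver does the reverse: unpack the citation, then check holdings.
unpacked = parse_qs(urlsplit(url).query)
print(unpacked["rft.issn"])
```

Nothing in the URL says where the article lives. That is the whole trick: the database hands over the citation, and the library's own resolver decides which copy this particular patron can reach.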
Still more pieces: e-resource management• You just bought a Big Deal. How do you update holdings and URLs in your OPAC? How do you update your link resolver?• How do you keep track of who bought what out of which fund? Or who to call when something breaks? Or usage stats?• Market leader: Serials Solutions • Service (auto-holdings-updating), not just product. • Open-source (though dependent on MS Access) entrant: ERMes
Catalog vs. “resource discovery” • What’s actually in an OPAC? • Print books, maps, sheet music • Title-level serials • Maybe govdocs, theses/dissertations, collection records for stuff in special collections • What’s not? • The rest of the world! Including digital collections, stuff on the web, article-level access to journals, finding aids... • The information world is bigger than it used to be! • So is the ILS/OPAC an INVENTORY tool, or a DISCOVERY tool? • And what is our inventory, really?
First-cut solution: Metasearch • How many databases are you willing to search? With all their different interfaces? • Metasearch to the rescue! or something. • Single search interface presented to the user. • Sends user’s query to various databases; receives, processes (deduping, relevance ranking), and presents the results. • Some databases use search protocols like Z39.50 and SRU/SRW. Others have to be screenscraped. • Lousy solution. • Slow, not always good at processing results, coverage not always the best, search bells and whistles gone.
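The send, receive, dedupe, rank loop can be sketched in a few lines. Everything here is a stand-in: the two "databases" are hard-coded functions rather than Z39.50 or SRU connections, the records are made up, and ranking by "how many sources returned it" is just one crude option, not what any particular metasearch product does.

```python
def search_db_a(query):
    """Stand-in for a remote database; a real one would be a network call."""
    return [
        {"title": "Cataloging Rules!", "year": 2008},
        {"title": "Metadata Basics", "year": 2005},
    ]


def search_db_b(query):
    """A second stand-in source with its own formatting quirks."""
    return [{"title": "cataloging rules", "year": 2008}]


def dedupe_key(record):
    """Normalize title (case, punctuation, spacing) plus year for matching."""
    kept = "".join(
        ch for ch in record["title"].lower() if ch.isalnum() or ch.isspace()
    )
    return (" ".join(kept.split()), record.get("year"))


def metasearch(query, sources):
    """Query every source, merge duplicates, rank by number of sources."""
    merged = {}
    for name, search in sources.items():
        for record in search(query):          # each call is a slow round trip
            entry = merged.setdefault(
                dedupe_key(record), {"record": record, "sources": []}
            )
            entry["sources"].append(name)
    return sorted(merged.values(), key=lambda e: len(e["sources"]), reverse=True)


results = metasearch("cataloging", {"db_a": search_db_a, "db_b": search_db_b})
for entry in results:
    print(entry["record"]["title"], entry["sources"])
```

Even this toy shows where the pain comes from. The loop waits on every source in turn, and the deduping only works when titles happen to normalize to the same string.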
Next try: Building local index for search• Tricky to do! • Which data sources can you legally build your index from? • Of those, how many have an API? Or will you be stuck screenscraping HTML? • Or do you have to work with your link resolver?• See also: Google Scholar • Essentially this is what GS does. They make special arrangements to crawl publisher sites, even behind firewalls.
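For contrast with metasearch, here is a toy version of the local-index idea: gather the records once, build a word-to-record lookup ahead of time, and answer queries from that lookup instead of from live remote databases. The record identifiers and text are invented, and real indexers (Solr, for one) do far more, such as stemming, ranking, and field weighting.

```python
from collections import defaultdict

# Made-up harvested records from three different kinds of sources.
harvested = {
    "opac:1": "Introduction to open access publishing",
    "repo:7": "Open access and institutional repositories",
    "ejournal:3": "Serials pricing in academic libraries",
}


def build_index(records):
    """Map each word to the set of record ids containing it (inverted index)."""
    index = defaultdict(set)
    for rec_id, text in records.items():
        for word in text.lower().split():
            index[word].add(rec_id)
    return index


def search(index, query):
    """Return ids of records containing every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return hits


index = build_index(harvested)
print(sorted(search(index, "open access")))
print(sorted(search(index, "serials")))
```

The search itself is fast because the slow work happened at harvest time. The hard questions from the slide are all about getting data into `harvested` in the first place.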
Now: “web-scale” discovery• OPAC layers (or ILS replacements, or ILS add-ins) that purport to offer one-stop shopping: OPAC, digital collections, serials, etc. • Serials Solutions: Summon • WorldCat Local • Ex Libris: Primo Central • EBSCO: EBSCO Discovery Service (EDS)• First question: is this a SEARCH TOOL or a CONTENT DATABASE or both?• Next question: coverage? • Players VERY close-mouthed about serials coverage.
The future of MARC• Bluntly: it doesn’t have one. • As a file format, it’s LONG past its sell-by date. • Does not fit into the mashup universe at all. • Making it work with current-gen technology is a tremendous resource drain. • In hindsight, decisions made so that MARC could easily output human-readable catalog cards are hurting us badly now that catalog cards aren’t what we want any more.• That said, we have a lot of data in it. • If you become a cataloger, you will be involved in a mass data migration. Have fun! (Believe me, I feel your pain.)• Migration to what? Well, that’s the question. • The answer is probably multiple. But RDA is part of the answer.
What is RDA?• Resource Description and Access • the next analogue to AACR2• Does not assume MARC or ISBD underneath! • Diane Hillmann, others actively working on linked-data/RDF expressions.• Claimed benefits • Expand the universe of what is describable • Spend less time on rules pilpul, punctuation, and other cruft • Less emphasis on “record,” more on linkages • Ability to make our records work with/for outside world • FRBRization
Right, so what’s FRBR?• Functional Requirements for Bibliographic Records• Relational data model for catalog records.• Recognizes that not all parts of a bibliographic record describe the same thing • Author: of a “work” • Page count: of an “edition”• “FRBRizing” a catalog means drawing all those relationship arrows between records, and then doing something with them for patrons.• We can do this mechanically. Sort of. Some of it.
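A small sketch may help with the "not all parts of a record describe the same thing" idea. It uses only the two levels named on the slide, work and edition (the full FRBR model has more), and it "FRBRizes" flat records mechanically by matching on normalized author and title. The records are placeholders, and the matching rule is deliberately naive, which is exactly why the slide says "sort of."

```python
from dataclasses import dataclass, field


@dataclass
class Edition:
    """Facts that belong to one published edition."""
    publisher: str
    year: int
    pages: int


@dataclass
class Work:
    """Facts that belong to the work itself, plus links to its editions."""
    author: str
    title: str
    editions: list = field(default_factory=list)


def frbrize(flat_records):
    """Group flat catalog records into works by normalized author and title."""
    works = {}
    for rec in flat_records:
        key = (rec["author"].strip().lower(), rec["title"].strip().lower())
        work = works.setdefault(key, Work(rec["author"], rec["title"]))
        work.editions.append(Edition(rec["publisher"], rec["year"], rec["pages"]))
    return list(works.values())


# Placeholder records: two editions of one work, plus one unrelated work.
flat_records = [
    {"author": "Author, Example", "title": "A Sample Novel",
     "publisher": "First Press", "year": 1990, "pages": 310},
    {"author": "author, example", "title": "A Sample Novel ",
     "publisher": "Second Press", "year": 2004, "pages": 352},
    {"author": "Writer, Another", "title": "Some Other Book",
     "publisher": "First Press", "year": 2001, "pages": 198},
]

works = frbrize(flat_records)
for work in works:
    print(work.title, [e.year for e in work.editions])
```

The mechanical match catches differences in case and stray spaces. It would miss a translated title or a variant form of the author's name, which is the part that still needs human catalogers or much better authority data.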
Next problem: Who owns our records?• OCLC controls union catalog in the US. • But OCLC didn’t author most of the records!• Huge, ongoing flap about who can use/remix those records, with or without permission.• Open-records initiatives springing up • Open Library • Michigan: http://blog.okfn.org/2010/11/29/open-bibliographic-data-how-should-the-ecosystem-work/• To be clear: legal restrictions on reuse and mashups damage librarianship’s online presence. We can’t afford not to settle this.
Last problem: How does our data fit into the Web?• This is not entirely a catalog problem. • What about our digitized collections? Born-digital holdings? Finding aids? Usage data? Authority data? • What are our APIs?• To what extent do we NEED local catalogs? • Uncomfortable but necessary question! Do we need to reinvent Google? If so, how do we exchange records for stuff that isn’t in our ILS? • Are we overinvested in the ILS?• How do we facilitate appropriate reuse of our data? Do we/can we bar inappropriate reuse?