Anyone who’s had to gather database or e-journal statistics knows the headaches involved. First of all you need to know what to get – which databases and e-journals you have that you can get stats for, which ones you can’t, and which reports you need for each one, whether it’s one of the many COUNTER reports available or some combination of whatever the vendor provides. Next you have to know how to get it – where to go to get stats, whether it’s a vendor website or e-mailing your sales rep, what your login is, and what to do after you log in. You always seem to need stats on short notice to review particular products, so you have to be able to gather them quickly and put them together with historical statistics to help selectors make decisions. And the hardest part is putting it all together, showing which resources are underused, are too expensive to justify continuing, are growing or declining in use, or need more seats because they’re in high demand.
Well, we got a handle on this, and so can you! Over the course of the last year, we have been working hard to make gathering and reporting our e-resource usage statistics organized, efficient, and manageable. Our strategy has involved reimagining our relationship to the work involved and the data necessary to make it happen. We’re also all about sharing the work we’ve done, so you can benefit from it, too. In this presentation I’m going to discuss our new, three-tiered approach to managing e-resource usage statistics (which involves new ways to control data and organize and distribute work), options for reporting, and subscription and open-source solutions you can use at your library.
When I first became responsible for our e-resources usage statistics, it was easy to get overwhelmed. Responsibility for collecting them had been divided among different units and passed around to different staff members. There were duplicate and incomplete files saved all over – in shared folders, on people’s hard drives. Some files were arranged according to name, others according to fund code, others according to staff member, others according to systems that were unidentifiable. Worse, files like these existed for every single year.
Collating stats by fiscal year and calculating price-per-use was done by hand in separate files, and the information was then copied into a final report.
With our database and e-journal collections growing and our budget getting tighter, we knew we needed a new approach.
Since early this year, we’ve been re-developing our statistics strategy according to three main principles. 1 – we’ve gotten control of our data. That means we restructured how we save administrative information related to statistics, how we maintain instructions for collecting statistics, and how and where we save our stats files themselves. 2 – we have reoriented ourselves to how this work is organized, making sure to integrate statistics into the lifecycle of every resource, from its acquisition to its cancellation, so that our information remains accurate, and we’ve reorganized the work of gathering statistics itself in order to minimize it. 3 – less and simpler work means it can be stratified into tasks that are divided up by staff level: we can integrate a student worker into helping us, and not all the work falls on one person at one time.
First, and most importantly, we integrated administrative information about usage statistics into our ERM. We use the Millennium Electronic Resources Management system to manage our e-resources, and we started using the ERM admin field to save the username, password and url for retrieval. This field has a separate authorization, so it is easy to limit who can see this information, but the process for managing this is integrated into the Millennium authorization system. Therefore, anyone who needs to see this can have access to it, but it’s not easy for any one person to export, duplicate, or change it. Even if no administrative module exists, we include this field with a note to that effect, so we never have to wonder whether we’ve checked on this or not. We also repurposed a fixed field to include information about whether and what kind of statistics are available for a resource, so it is easy to run a list. We are also using the usage statistics field in every resource record. If no statistics are available, we include a note that says that. If statistics are available, we include a link to the wiki where we keep the downloading instructions. Since many different databases can share one set of retrieval information, this is efficient. For example, we have 57 EBSCO databases. Instead of duplicating downloading instructions in our ERM 57 times (and updating them 57 times if they change), we keep them on one easily-edited web-accessible wiki page which is linked to our ERM through this field.
Our Database Usage Stats wiki covers 70 platforms and is arranged by platform or, in the case of a database that doesn’t share its platform, database name. We’re using our campus wiki software, so we don’t have to worry about it being secure or stable. The wiki is searchable, so you can either navigate to the page you need or search for the database you’re looking for.
Each page lists the database or databases for which these instructions are used, the reports necessary to gather (whether they’re COUNTER or not), and explicit instructions for both gathering the statistics and saving them. Even if it’s a platform for which you have to e-mail a vendor, I’ve included the vendor’s e-mail address and exact text for the e-mail that needs to be sent – because I want to be able to do this work quickly and easily and not re-think through what I’ve thought through in the past. Since this is a wiki, it is very easy to update these instructions when they change (which they do frequently, as vendors migrate platforms or consolidate). The url to get stats from is included, but the username and password are not. This is because we made our wiki public – we want anyone to be able to use it. We have password-protected the ability to update pages, and it is very easy to change who is allowed to do this when we need to.
Finally, we’ve gotten a handle on the stats reports themselves. All reports are saved in one place on a shared drive and are organized according to the same principle as the wiki – by platform or database name. Each folder has separate folders for separate reports, and each file name includes the database name, so it’s easy to search for it if you don’t know where to look. Stats for each year are saved in ONE workbook, with a worksheet for each year, so it is always easy to compile historical statistics for review.
After we did all this work to organize our stats, it would be a shame if we didn’t maintain our system. So we reviewed our workflow to make it sustainable. First, we made sure to integrate statistics into our e-resources management process. I use checklists to help me remember what to do during each step of the process, and have made sure to include notes related to statistics. This means that as soon as we acquire a database I make sure we ask the vendor about gathering usage statistics and input the admin username and password into the ERM, fill out the fixed field that lets us know how statistics are gathered, add the resource to our stats wiki (whether it needs a new page or is on an existing vendor platform), add it to the autostat table in our ERM if we are going to use SUSHI to obtain journal statistics, create a folder for it on the R drive, etc. If a database is cancelled or changes platforms, we go through the same steps to update or remove it. This way we never have to wonder if we’re missing something when we gather statistics in the spring, because we’ve been keeping our documentation up to date all year long.
You may have noticed I just said “in the spring.” That’s right – we are only getting stats once a year now, instead of monthly. We also don’t finesse them in any way – we save them the way they come and don’t try to collate use on the fiscal year – we just keep everything on the calendar year, and add the numbers when we need to report them.
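Adding up calendar-year numbers for a fiscal-year report can be sketched in a few lines of Python. This is only an illustration: the function name, the dictionary layout, and the sample counts are hypothetical, and a July-to-June fiscal year is an assumption.

```python
# A minimal sketch, assuming monthly counts are keyed by (calendar year, month)
# and that the fiscal year runs July through June. All numbers are made up.

def fiscal_year_total(monthly, fy_end_year):
    """Sum monthly counts for the fiscal year that ends in June of fy_end_year."""
    months = [(fy_end_year - 1, m) for m in range(7, 13)]  # July-December
    months += [(fy_end_year, m) for m in range(1, 7)]      # January-June
    # Months with no saved count are treated as zero use.
    return sum(monthly.get(key, 0) for key in months)

# Hypothetical data: 100 uses per month in 2011, 200 per month in 2012.
monthly = {(2011, m): 100 for m in range(1, 13)}
monthly.update({(2012, m): 200 for m in range(1, 13)})

fy2012 = fiscal_year_total(monthly, 2012)
print(fy2012)
```

The saved files stay untouched on the calendar year; the fiscal-year figure is computed only at reporting time.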
This simpler work has allowed us to break the work down into smaller jobs that can be divvied out among librarians, staff and student workers. So we have one person focusing on administrative information, reporting and keeping things going in the ERM, one person who maintains our downloading instructions, helps gather stats, and maintains information in our ERM, and we are able to let the actual collecting of COUNTER reports be done by a student. In addition, a variety of other people have valuable jobs that allow this process to run smoothly – serials staff do coverage load monthly and work in Serials Solutions (both necessary for how we report e-journal downloads), acquisitions staff maintain cost information and invoice data, and IT staff help us manage the ERM and troubleshoot.
Perhaps the hardest part about managing statistics is managing their reporting. We have pushed our e-journal usage reporting onto our ERM, after working very hard to prepare for that. For our database reporting, we’re going to keep creating spreadsheets, for now, and I’ll explain why. But first, I want to demonstrate a couple of guiding principles we’ve embraced related to e-resource statistics reporting.
First, focus on statistics that are both meaningful – meaning they are actually related to use – and can be compared across platforms. We used to only report searches, but with OpenURL I’m not sure that’s the most descriptive use – for example, people might search in a general database like Academic Search Complete or a discovery layer like Summon, but connect to a different database to download an article, and if we are not looking at full-text downloads, we’ll miss that use. Likewise, as relevancy ranking improves, more users might be connecting to a database, but searches might still be falling. So if we can get COUNTER-compliant statistics, we get searches, sessions and full-text downloads for our databases, because all of these can be meaningful uses. If we can’t get COUNTER-compliant statistics, we focus on data that might be comparable to searches, sessions and downloads and ignore a lot of the other data our non-COUNTER-compliant vendors give us (like “hits” or number of search results – what good is that?).
We also want to paint a comparative picture of use over time. This fall I experimented with the most illustrative way to communicate this, and I’ve decided to calculate percent change in use from the previous year and overall.
I tried line graphs and pie charts, but they are not really very good for comparing more than a few resources, and I discovered that most of our librarians responded more to percent change in use numbers than the graphs I tried to make.
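The percent change figures mentioned here are simple arithmetic. The following Python sketch shows one way to compute them; the function name and the sample yearly counts are hypothetical, and "overall" is assumed to mean the change from the earliest year on file to the latest.

```python
# A minimal sketch of percent change in use. The data are made up, and
# "overall" is assumed to mean first year on file versus the latest year.

def percent_change(old, new):
    """Return the percent change from old to new, or None if old is zero."""
    if old == 0:
        return None  # percent change is undefined when there was no prior use
    return (new - old) * 100 / old

# Hypothetical yearly full-text downloads for one database, oldest first.
yearly_use = [4000, 5000, 4500]

previous_year_change = percent_change(yearly_use[-2], yearly_use[-1])
overall_change = percent_change(yearly_use[0], yearly_use[-1])

print(f"Change from previous year: {previous_year_change:+.1f}%")
print(f"Change overall: {overall_change:+.1f}%")
```

Two signed numbers per database fit easily in a spreadsheet column, which is part of why they travel better than a chart with twenty lines.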
I’ve discovered that a little inaccuracy isn’t the end of the world. We estimate the use for our databases for the current year by doubling the July-December use in our report. Otherwise, we would have to collect statistics more than once a year. It turns out these estimates are not so off base that they throw our entire review off.
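As a rough illustration, the doubling estimate could be written like this in Python. The function name and the monthly counts are hypothetical; only the rule itself (twice the July-December use) comes from the talk.

```python
# A minimal sketch of the estimate: current-year use is taken to be
# twice the July-December total. The sample counts are made up.

def estimate_annual_use(july_to_december_counts):
    """Estimate a full year of use as twice the July-December total."""
    if len(july_to_december_counts) != 6:
        raise ValueError("expected six monthly counts, July through December")
    return 2 * sum(july_to_december_counts)

# Hypothetical searches for one database, July through December.
half_year = [120, 150, 900, 1100, 1000, 400]
estimated = estimate_annual_use(half_year)
print(estimated)
```

The estimate will run high or low for resources whose use is heavily seasonal, which is the inaccuracy being accepted here.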
We also don’t worry about the “EBSCO bump.” Have any of you ever thought about this? It’s the term our head of reference gave to the EBSCO statistics inflation that comes from users choosing “all databases” when they go to EBSCO (we have 57) instead of just one. Whenever this happens, each of our EBSCO databases gets a search and session, but they are not actually meaningful uses. We decided that “Vente et Gestion” is our measure of the EBSCO bump at BG, since it gets about 10,000 searches a year, but we don’t think any of our students are looking for French language business news. When you’re calculating cost-per-use, this seems like a big number. But when I backed out these uses for our paid EBSCO databases, I found that the cost-per-use only rose an average of $1.26 each, and only one was significant, and it was for a database we had decided to cancel anyway. So I don’t worry about it. The biggest piece of advice I have to give about reporting is just to minimize noise. Anything more than the database name, the use, the cost and cost per use, and perhaps a note is just too much information. Put it in alphabetical order so people can find what they’re looking for, and format it to print before you distribute it, because everyone wants to print it.
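Backing out the EBSCO bump, as described above, amounts to subtracting a baseline from each database's searches before dividing cost by use. Here is a hedged Python sketch; the function names, the cost, and the search count are hypothetical, and the baseline of 10,000 is the approximate yearly figure quoted for “Vente et Gestion.”

```python
# A minimal sketch of cost-per-use with and without the "EBSCO bump".
# The baseline is the roughly 10,000 yearly searches quoted in the talk;
# the cost and search figures below are made up.

BUMP_SEARCHES = 10_000

def cost_per_use(cost, uses):
    """Return cost divided by uses, or None when there is no use to divide by."""
    if uses <= 0:
        return None
    return cost / uses

def bump_adjusted(cost, searches, bump=BUMP_SEARCHES):
    """Return (raw, adjusted) cost-per-use, where adjusted removes the bump."""
    raw = cost_per_use(cost, searches)
    adjusted = cost_per_use(cost, searches - bump)
    return raw, adjusted

raw, adjusted = bump_adjusted(cost=6000.00, searches=40_000)
print(f"raw: ${raw:.2f}  adjusted: ${adjusted:.2f}")
```

When the adjusted figure comes back as None, the database has no searches beyond the baseline, which is itself a signal worth a note in the report.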
There’s always the option of outsourcing! How many of you subscribe to services like Scholarly Stats or 360 Counter, that go and get your stats for you? I’m going to talk briefly about three services I’ve used a little: Scholarly Stats, which was the first of these kinds of services, SwetsWise Selection Support, which is built off of Scholarly Stats, and Serials Solutions 360 Counter assessment service.
Scholarly Stats is definitely the most basic kind of service. You give them your logins and passwords and they collect reports for you, collate them, and make them available for you to download from their website. You pay per platform, and they gather stats monthly.
Their reports are nothing fancy – really just all your COUNTER reports smushed together. They don’t gather non-COUNTER compliant statistics, and there is no way to add your cost data or adjust the reporting period. This is really a service for people who don’t want to go out there and download their reports themselves, and who want them collated by title.
Swets purchased Scholarly Stats and built SwetsWise Selection Support as an enhancement to that service. They still gather reports for you, and you still pay per platform, but you can also enter and upload your own platforms and statistics and add your cost information or pull it in from Swets if you use them as your serials subscription agent. It is SUSHI compliant, so you can set it up so your reports are pulled into your ERM. It takes COUNTER-compliant reports, or will take reports you force into a COUNTER format, or it will allow you to input statistics for a specific title manually.
It also comes with a number of pre-formatted reports that you can run for any time period you like. The only thing I don’t like about this is that it is very serials-oriented, so it’s great for getting cost-per-use on a specific journal title, but the same process to get cost-per-use on a database is rather cumbersome. If you want to learn more about it, I suggest asking a Swets rep for access to the tutorial I’ve linked here, which gives a very thorough overview of the service.
360 Counter is Serials Solutions’ service. You don’t pay per platform – instead, you pay an annual fee and Serials Solutions will collect statistics for ALL of your COUNTER-compliant platforms annually (more often if you pay more). You have the option of manually uploading data for non-COUNTER-compliant platforms. If you are a 360 Resource Manager customer (which is their ERM) cost information will be pulled in from that system. If not, you can either manually input it or upload it through a spreadsheet template they provide. They are not yet using SUSHI. You can ask for access to their demo site to play around with the interface.
In Scholarly Stats and SwetsWise Selection Support, the reports download in Excel. In 360 Counter, they display on screen and give the option to link or download in a variety of formats.
I don’t believe that outsourcing is a magic bullet. It’s largely limited to COUNTER-compliant sources, or requires you to put non-COUNTER-compliant data into COUNTER format – which is work for you. For example, we get statistics from 50 platforms that are NOT COUNTER compliant. It doesn’t integrate with your cost data, so requires manually inputting that – also work for you. You could build your stats workflow around an outsourced solution, but this potentially requires a lot of up-front work, not the least of which is determining what to get stats for and how to get them. It also requires ongoing maintenance. If your vendor doesn’t collect all your statistics, you’ll still end up collecting some yourself and inputting them manually. Once we did all the work of figuring out what to get, how to get it and how to push what we could into our ERM, I realized that, for us, the hard work was done, that gathering stats was, for the most part, going to be a once-a-year process that we could largely entrust to students, while managing stats, whether we outsourced or not, was something we could never escape. So we actually cancelled our subscription to SwetsWise Selection Support this fall.
Of course, a big part of the reason I felt confident doing that was because we have an ERM. While not that many libraries in Ohio have the III ERM or any plans to get it, it is important to know that there are a number of open-source ERMs that incorporate statistics in some way. The benefits of using an ERM include storing your administrative data, like your usernames and passwords, too. I have listed four open-source ERMs here, but I’m really only going to talk about the first two.
ERMes is certainly the most widely used open-source ERM, with over 40 users. It is the only open-source ERM (and only ERM I know of at all) that takes database instead of journal reports. It runs on Microsoft Access. However, it doesn’t truly allow for uploading COUNTER reports – rather, you append them as spreadsheets to the database and the process requires a lot of modification by hand. I asked the developers for an example of a report, but the only thing they could send me is what you see here. I think they have developed a method for getting the data into the database, but it would be largely up to the end-user to construct a query to get it out again.
CORAL, developed by Notre Dame, is, in my opinion, the most robust open-source ERM. The statistics and reporting modules are standalone, so you can implement just those if you’d like. You can actually upload your Journal 1 reports. However, you can only run calendar-year reports, and there is no way to incorporate your cost data, manually or otherwise, in order to calculate cost-per-use.
E-Matrix, developed by NCSU, allows users to upload Journal 1 reports, but they require some editing before they can be uploaded. Use statistics display within the records. I’m not sure if there is any way to collate them into a larger report.
CUFTS is the only open-source ERM that can use SUSHI. http://researcher.sfu.ca/files/cuftsAdminGuide.pdf
What you should do:
• implement an open-source ERM that can take COUNTER reports
• save all your usernames and passwords in there, or in Serials Solutions, or in your order records in Acquisitions – NOT in spreadsheets and DEFINITELY not in e-mail
• use my wiki for downloading instructions
• do it once a year or twice – not monthly
• prioritize if you can’t do them all
• don’t waste money on outsourcing
Streamlining Stats: Creating an Efficient Workflow for Gathering, Reporting and Interpreting E-Resources Usage Statistics
Amy Fry, Electronic Resources Coordinator, Bowling Green State University
images by Ken Fager, http://www.flickr.com/photos/kenfagerdotcom/
ALAO 2012 Annual Conference, 4 November 2012
The Database* Statistics Dilemma *and e-journal What do I get? How do I get it? I need it now! How do I put it all together?
We got a handle on this, and so can you!
• Data control
• Organization of work
• Distribution of labor
A quick note on definitions.
• COUNTER: a NISO* standard for collecting and reporting database (DB), e-journal (JR) and e-book (EB) usage. Counts searches, sessions, and full-text downloads.
• SUSHI: a NISO standard for automatic retrieval of COUNTER-compliant data.
• ERMs: Electronic Resource Management systems.
*NISO: National Information Standards Organization
Our new philosophy.
• Data control: Everything in its (one and only) place, all linked together.
• Organization of work: Stats are integrated into the resource’s lifecycle.
• Distribution of labor: There are jobs for librarians, staff and students.
Data control ONE place for logins and passwords: our ERM.
Data control ONE place for instructions: our wiki.
Anyone can access it! http://dbusestats.bgsu.wikispaces.net
Data control ONE place for the stats files: our R drive.
Organization of work Stats are integrated into the lifecycle of each resource.
Organization of work Simpler work is easier to do. • Don’t manipulate spreadsheets – leave them as is. • Only get stats once a year.
Distribution of labor
Everyone gets to help!
Librarian jobs:
• Manage logins and passwords
• Manage reporting
• Manage the SUSHI table in the ERM
• Manage stats through each resource’s lifecycle
Staff jobs:
• Write downloading instructions
• Make changes as needed in the wiki (adding, removing, and changing resources)
• Convert files to XML; upload XML files into the ERM
Student jobs:
• Collect and save stats according to wiki instructions
Other jobs:
• Serials, Acquisitions & IT staff work with order records, do coverage load, and more.
Reporting
Our medium
 – For databases: spreadsheets, for now
 – For e-journals: our ERM
Our philosophy
 – Focus on meaningful counts with cross-platform applicability
 – Try to paint a picture over time
 – Accuracy isn’t everything
[Pie chart: Science database use at BGSU, fall 2010. Largest shares: EBC Scholarly/Reference: Springer 27%; Web of Science 25%; SPORTDiscus with full text 11%; MathSciNet 7%; SciFinder Scholar 6%; CINAHL plus with full text 5%. The remaining databases (Health Source: Consumer Edition, Health source: nursing/academic edition, Environment complete, MEDLINE with full text, Alt-HealthWatch, Biological Abstracts, Computers & applied sciences complete, Agricola, GeoRef, GreenFILE) account for 1% to 3% each.]
[Line graph: Science database use at BGSU, fall 2010. Vertical axis from 0 to 45,000; horizontal axis points 1, 2, 3. One line per database: EBC Scholarly/Reference: Springer, Web of Science, SPORTDiscus with full text, MathSciNet, SciFinder Scholar, CINAHL plus with full text, Health Source: Consumer Edition, Health source: nursing/academic edition, Environment complete, MEDLINE with full text, Alt-HealthWatch, Biological Abstracts, Agricola, GeoRef, Computers & applied sciences complete, GreenFILE, Physical education index, Compendex, Garden, landscape & horticulture index, Beilstein, Inspec, Lexi-PALS drug guide.]
Feel like outsourcing?
Scholarly Stats
 – Basic stats collection service, pay per platform
 – Basic collated reports
SwetsWise Selection Support
 – Stats collection service (pay per platform)
 – Options to upload data manually
360 Counter (Serials Solutions)
 – Unlimited platforms with annual subscription
 – Options to upload data manually
They have a demo site! Ask your rep for a login.
Drawbacks to outsourcing
• Possible time lag for data availability
• Retrieval is limited to COUNTER-compliant sources
• Not tied in to your order records
• You have to pay for it
ALSO: You’re never, ever going to get out of collecting data altogether unless you decide to stop looking at your stats.
You can use a free ERM
• ERMes (University of Wisconsin-La Crosse) – Incorporates DB1 reports
• CORAL (Notre Dame) – Incorporates JR1 reports
• E-Matrix (NCSU) – Incorporates JR1 reports
• CUFTS (Simon Fraser) – Incorporates JR1 reports; uses SUSHI
ERMes
• http://murphylibrary.uwlax.edu/erm/
• Runs on Microsoft Access
• Robust user community
• Allows you to upload COUNTER DB1 statistics (by linking a modified Excel spreadsheet to the Access database), input cost, and create fiscal year reports
See page 19 of this pdf for more info on how ERMes handles use stats: http://bit.ly/ERMesManual .
CORAL
• http://erm.library.nd.edu
• Built with PHP 5 and MySQL
• Can use the statistics module and add-on reporting module independently of the other modules
• Allows you to upload JR1 reports
• No incorporation of cost data; only calendar year reporting
See this pdf for more about the CORAL Usage Statistics module: http://bit.ly/CORALManual .
E-Matrix
• http://lib.ncsu.edu/e-matrix/
• Built with PostgreSQL
• Allows you to upload COUNTER JR1 reports (requires some report modification)
• Displays statistics within records
CUFTS
• http://researcher.sfu.ca/cufts/erm
• Built with PostgreSQL
• Allows you to pull in COUNTER JR1 reports with SUSHI
• Displays statistics and cost per use within records
See pages 53 & 113 of this pdf for more info on how CUFTS handles use stats: http://researcher.sfu.ca/files/cuftsAdminGuide.pdf
What you need to do now
• You need an ERM (you really do!).
• Use the BGSU stats wikis, and contribute your own information.
• Stop getting stats monthly.
• If you’re going to outsource, do it wisely.
What we’re going to do next
• Force our EJC statistics into COUNTER format so we can upload them to our ERM, too
• Explore other reporting options for databases