Summary of Retention, Preservation and Withdrawal Scenario Development Meeting with SCS, April 8, 2013

Introduction

On April 8, 2013, Maine Shared Collections Strategy (MSCS) Project Team and Collections Management Committee representatives met with Rick Lugg and Andy Breeding from Sustainable Collection Services (SCS) at Colby College to review the preliminary collections analysis SCS had conducted. Slides from the meeting can be found here. This was an exciting opportunity for MSCS to look at its collection data for the first time in a manner that would facilitate the development of retention, preservation and withdrawal criteria. During the meeting, attendees agreed that MSCS would initially concentrate on those items held by 1-2 MSCS partners and:

  • Analyze and take action only on pre-2003 copies
  • Retain the copies if any circulation or internal use
  • Retain material that falls into local protection categories (Specific Maine items) even if no circulation
  • Retain Special Collections/Archives copies even if no circulation
  • Retain materials on course reserves even if no circulation
  • Retain unique in OCLC (only 0-9 copies in OCLC) even if no circulation
  • Compare with both HathiTrust and Internet Archive

Collection summaries

Prior to the meeting, Andy had circulated copies of the group collections summaries to attendees. The summaries are a categorical overview of the data set: Circulation Counts, WorldCat/Peer Counts, Date-Related Counts, and HathiTrust overlap. They are intended as a starting point for building preliminary withdrawal and preservation scenarios by combining various data elements.

Rick reviewed how the data used in the MSCS collection summaries had been collected and prepared. He made it clear to attendees that the summaries were only the first attempt at MSCS data reports, although they were not too far from the finished product. Over the next couple of weeks, SCS are going to certify the data sets. Because of the recent reclamations MSCS libraries had gone through, the MSCS data set was much cleaner than SCS are used to seeing and did not require as much remediation.

SCS will continue to update the HathiTrust comparisons so they are as current as possible. Rick was pleased to announce that SCS had successfully compared MSCS holdings with the Internet Archive, which was a first for them. However, SCS still have some data validation to perform before they are fully comfortable with the reliability of the Internet Archive data.

As the collection summaries had been distributed prior to the meeting, MSCS representatives had already raised three issues regarding the data:

  • Circulation rates: separating circulating & non-circulating materials
  • USM: Internal Use counts were omitted from URSUS Export table (264,000)
  • Portland Public Library: ‘Last return date’ omitted from ‘last charge date’ calculations

Rick confirmed that the circulation rates in the summary did contain non-circulating items, but he showed a revised circulation counts tally, limited to circulating copies, that did not distort the overall circulation percentages. The omitted USM internal use counts and the Portland Public Library ‘last return date’ omission have now been addressed. Rick asked attendees to continue reviewing the summaries and to report back on any anomalies they identify, because this is the data MSCS will be using in the collections analysis process.

High level views of MSCS data

The data set contained a total of 2,958,905 bib records (un-filtered), 3,420,061 items (un-filtered) and 1,754,598 unique items (filtered), some of which are held by multiple partners.

Rick highlighted some interesting points that could be drawn from the data. Average circulation rates were higher than SCS usually see in projects, which he put down to the higher circulation rates at public libraries. There was also a higher proportion of unique holdings (38%) amongst the MSCS group than SCS usually see. Rick commented that this uniqueness would have been an issue had MSCS been primarily identifying withdrawal candidates. SCS are going to look into the effect that the inclusion of special collections items has had on the data, including the percentage of unique holdings.

Rick presented graphs with MSCS circulating title holdings by holding level and commented on potential retention, preservation and withdrawal scenarios. The different scenarios are dependent on libraries’ comfort levels with copies being retained, yields and workload for technical services staff inserting 583 retention & preservation notes.

There are 99 titles held by all MSCS libraries including “The lobster gangs of Maine” by James M. Acheson!

Using the date-related counts of publication year and last item add date, Rick was able to show trends in acquisition history across the partners, including the impact of budget cuts, the growth of e-books, and the effects of the shared approval plans at Colby, Bates and Bowdoin, where the libraries are acquiring less of the same material.

A total of 133,383 titles were filtered using the local protection rules developed by MSCS to identify Maine titles.

A total of 168,451 MSCS public domain titles are also in the HathiTrust. There were 183,883 titles in the Internet Archive that are not in HathiTrust. Comparisons between the HathiTrust and Internet Archive holdings will make for some interesting reporting to IMLS. SCS also have URLs available for HathiTrust and Internet Archive public domain titles, which they can provide to MSCS libraries for use in their catalogs.

Rick presented the subject analysis work SCS had successfully completed to allow MSCS libraries to view the data in LC, in DDC, and in a combined view. The subject mapping involved was another first for SCS.

Deeper into the data

Andy looked deeper into the MSCS data sets using boxplots showing the distribution of total charges by library and by LC and Dewey classes and subclasses, and then applied that view to individual libraries. Interesting points from the data include that Portland Public titles circulate more frequently, that University of Maine has the biggest ‘G’ collection, and that the academics have very similar circulation levels. Using the example of home economics, Andy was able to show how this was much more of a subject strength for the publics.

MSCS retention, preservation and withdrawal scenario development

Using available data as their guide, Andy and Rick walked through a number of different retention, preservation and withdrawal scenarios. After first outlining the differences between “Title Set” and “Title Holding”, Andy highlighted the types of decisions that drive shared-print scenarios, for example: Which of these title-sets are eligible for draw down and which must be retained in their entirety? How many title-holdings can be withdrawn? Which ones are withdrawn and which are retained?

Andy showed a real-life example of applying different criteria and conditions, and the different factors involved. Attendees debated the importance of ensuring rare and Maine items are protected, and the ‘grubby book’ factor: retaining low-circulation copies because of their presumed good physical condition, rather than copies that have circulated widely and may be in poor physical condition as a result. As Rick highlighted, there is a danger of drowning in the data. With so many possibilities available, MSCS needs to use the data to focus on actionable analyses of retention, preservation and withdrawal opportunities.

MSCS decided to begin by analyzing uniquely held titles (see slide 27 of Rick and Andy’s presentation). Although the proportion of these titles is higher than expected, the set is still a manageable size, both in terms of analysis and in terms of the workload for technical services and collections management staff ingesting holdings information and batch loading 583 statements. It is also well distributed and provides an opportunity to report results within the grant period. The following criteria for making decisions on unique materials were developed for the 1-2 title holdings level (the number of items in this category will rise when non-circulating titles are added):

  • Analyze and take action only on pre-2003 copies
  • Retain the copies if any circulation or internal use
  • Retain material that falls into local protection categories (Specific Maine items) even if no circulation
  • Retain Special Collections/Archives copies even if no circulation
  • Retain materials on course reserves even if no circulation
  • Retain unique in OCLC (only 0-9 copies in OCLC) even if no circulation
  • Compare with both HathiTrust and Internet Archive
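
As an illustration only, the criteria above could be applied en masse with a simple rule-based check along the following lines. This is a minimal sketch, not the process SCS actually used: the record layout, the field names and the `HeldCopy`, `retention_reasons` and `classify` names are all hypothetical, and it assumes that “pre-2003” refers to publication year. The HathiTrust and Internet Archive comparison is left out because it is a comparison step rather than a retain rule.

```python
from dataclasses import dataclass


@dataclass
class HeldCopy:
    """One library's copy of a title. All field names are hypothetical."""
    pub_year: int                      # assumed meaning of "pre-2003"
    oclc_holdings: int                 # number of copies reported in OCLC
    mscs_holders: int = 1              # number of MSCS partners holding the title
    total_charges: int = 0             # recorded circulations
    internal_uses: int = 0             # recorded in-house uses
    local_protection: bool = False     # e.g. specific Maine items
    special_collections: bool = False  # Special Collections/Archives copy
    on_course_reserve: bool = False


def retention_reasons(copy: HeldCopy) -> list[str]:
    """Return every retain rule from the list above that applies to this copy."""
    reasons = []
    if copy.total_charges > 0 or copy.internal_uses > 0:
        reasons.append("circulation or internal use")
    if copy.local_protection:
        reasons.append("local protection category")
    if copy.special_collections:
        reasons.append("Special Collections/Archives")
    if copy.on_course_reserve:
        reasons.append("course reserves")
    if copy.oclc_holdings <= 9:
        reasons.append("unique in OCLC (0-9 copies)")
    return reasons


def classify(copy: HeldCopy) -> str:
    """Classify a copy as out of scope, retained, or needing further review."""
    # Only titles held by 1-2 partners and published before 2003 are analyzed.
    if copy.mscs_holders > 2 or copy.pub_year >= 2003:
        return "out_of_scope"
    # No retain rule fired: the copy is not withdrawn automatically,
    # it is only flagged for closer review.
    return "retain" if retention_reasons(copy) else "review"
```

For example, `classify(HeldCopy(pub_year=1990, oclc_holdings=50, total_charges=3))` returns `"retain"`, while the same copy with no charges returns `"review"`. Copies that trigger no rule are flagged for review rather than withdrawal, since the zero-circulation titles still need separate criteria.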

The goal is not to look at individual titles, but to apply general rules and some quality assurance checks to ensure MSCS are comfortable with the items appearing on the lists. Attendees agreed that it was important not to get ‘bogged down’ in the lists. To make MSCS sustainable, rules need to be applied en masse, with the understanding that a limited number of items may be retained unnecessarily, but that they will be reviewed again in 15 years.

MSCS will need to look in more detail at those items with 0 circulations and apply different criteria to make decisions regarding which titles are retained. Although many of these titles are dispersed across the partner libraries, MSCS will still need to look at the items in more detail using subject strengths and other factors such as space and loan periods to allocate retention responsibility.

MSCS representatives expressed a desire to see lists of potential withdrawal candidates, which MSCS libraries can choose to withdraw because they are being retained by another partner. Clem commented that it would also be helpful if this could somehow be included in the item’s record (non-public), so libraries can go back and view those titles which were identified as withdrawal candidates during the MSCS group collections analysis. Rick and Andy reminded MSCS that this can only be achieved once retention commitments have been allocated to MSCS partner libraries. MSCS will also need to decide how many copies of an item are retained across the partners.

Attendees also debated their willingness to rely on digital surrogates and whether they considered the Internet Archive as trusted a repository as the HathiTrust.

Conclusion

Attendees agreed that it had been a successful meeting and were pleased that preliminary retention and withdrawal criteria had been developed. MSCS groups will continue to review the collection summaries and report back on any identified anomalies. SCS are going to return to working on MSCS data during the week of April 15th. SCS will have a revised summary for MSCS in two weeks and data reports with the MSCS retention criteria applied in three weeks.