For: Digital talking book distribution analysis. Task 1 - Distribution system analysis and selection (Final report : September 16, 2005)
Section 2 - Key Variables and Other Considerations Affecting DTB Distribution System Costs
One major assumption made for the evaluation of DTB distribution systems costs was that production of DTB titles in the future system would be identical or very similar to that for RC books in the current system and during the recent past. That is, the number of new titles produced per year, the quantities produced (for those titles that are mass-produced), their lengths (i.e., duration in minutes), and the mixture of content offerings would be approximately the same as that for the current system and would furthermore not vary under the DTB distribution options.
Therefore, an analysis of RC book production was performed using data from the NLS Production Inventory Control (PICS) system. The source data consisted of a file of all the RC records in the system as of the report date (4/15/2005), regardless of the production phase for the books. It was later determined, after comparing these data with detailed circulation data from a sample of network libraries, that approximately 1-out-of-8 title records for the NLS collection were missing from the PICS data extract file. However, supplementary information for these missing records from PICS was later provided and incorporated into most of the analyses described in this report.
While some information for most of the RC titles produced before the implementation of PICS (in 1988) is contained in the system, those titles produced before PICS implementation lack "date shipped" information (among other information), which is necessary to determine when a title was first available for circulation; about one-third of the titles in the extract lacked ship dates. Therefore, all lower-numbered titles with no ship date and those with a 1988 ship date were simply considered "FY 1988" in the analyses. The lowest RC Number in the file with a ship date was RC23500 with a ship date of 11/08/1988, while the highest was RC58954 with a ship date of 3/23/2005. The earliest ship date in the file was 4/14/1988 for RC26012, while the latest ship date was 5/18/2005 for RC58717. Data for FY 2005 were excluded from the analyses because the time period is for a partial rather than a complete year.
The more pertinent results of the analyses of the PICS production data are noted below.
- Titles Shipped: 1,978 RC titles were shipped during FY 2004, and production has averaged 1,985 titles per year over the last five years, as shown in Appendix 1. Because production has been at an almost uniform level in recent years, the baseline value used in the cost model is 2,000 new DTB titles produced per year.
- Number of Copies Produced per Title: Over the past five years, the number of RC copies produced per title varied from lows in the upper 400's to slightly over 2,000 for the titles with the very highest production quantities, as shown in Appendix 2. The 5-year average low was 476 copies, and the high was 1,984. The 5-year average production level was 932 copies per title, and the median was 855; these values showed little variation among the five years in the time period. The graph in Appendix 3 plots the percent of new titles produced vs. the number of copies per title produced using an average for FY 2000 – FY 2004, and shows a bimodal behavior, i.e., there is a pronounced second, albeit lower, maximum in the 1,300-1,400 copy per title class. A baseline value of 925 DTB copies per title (adjusted from 932 with data received in the second PICS extract) was used for the All Mass Duplication option, while other values documented later in this report were used for evaluation of the mass duplication portion of the Hybrid option; for the All DOD option, the number of copies per title produced is irrelevant.
- Length of Titles: RC Title durations ranged from a few minutes to dozens of hours for books produced during FY 2000 – FY 2004, but both of these extremes are outliers and unusual, as shown in the table in Appendix 4. As the graph in Appendix 5 shows, while the percent of titles produced with durations longer than about 1,100 minutes falls off rapidly, the number with greater than average duration but less than 1,100 minutes is sizable; this profile also holds for each of the five years in the time period. The 5-year average length (unweighted) was 676 minutes, and the median was 609 minutes, i.e., more titles fall below the average than above. The graph in Appendix 6 shows both the unweighted and weighted average title length for the years FY 2000 through FY 2004. The 5-year average profile was used for planning purposes, and a weighted average 5-year value of 694 minutes per DTB title was used as a baseline value for title length in the cost model.
- Number of RC Containers: Because DTBs will be more compact than RC books in several respects, the containers-per-copy ratios will be different for the two media. However, it was still necessary to understand the current requirements for RC containers in order to understand the current system and how it might be affected. Almost all RC titles fit into a single container: approximately 98.5% of titles require one container (whether a 4- or 6-cassette type), and only 1.5% of titles require two or more (only five titles over the past five years required more than two containers).
Some additional findings of lesser importance, but of some relevance to the project, also resulted from the analysis of the PICS data and are listed below.
- Total Narration Cost: was $4,436,373 for FY 2004, and has averaged just under $5 million per year ($4,935,886) for the past 5 years [based upon initial incomplete PICS data, and is estimated to be short by approximately 12%]. Narration costs will not vary under any of the three distribution options being considered. Each option will impose identical requirements for narration contractors which are to narrate, edit and produce the digital book file master according to the specified NISO standard in a "wav" file format; to also produce an analog master from the digital master for titles being produced in RC format during the transition period; and produce a compressed digital master using the specified compression algorithm and deliver to the NLS, book mass-duplication contractors and/or DOD Center(s).
- Total Duplication Cost: was $3,800,169 for FY 2004, and has averaged $3,679,401 for the past 5 years [based upon initial incomplete PICS data, and is estimated to be short by approximately 12%]. Note: This total excludes the cost of RC containers, which are provided as Government Furnished Equipment (GFE) to book producers, and cost on average about $0.51 per copy, or about $950,000 per year, which is significant.
- Average Narration Cost per Title: was $2,873 during FY 2004, and has averaged $2,806 for the past 5 years [based upon initial incomplete PICS data, but is an average and probably very close to actual]. There will be no differences in narration costs per title under the distribution options.
- Average Duplication Cost per Title: was $2,010 during FY 2004, and has averaged $2,089 for the past 5 years [based upon initial incomplete PICS data, but is an average and probably very close to actual]. Note: this excludes container costs, which add about $472 per title ($0.51 per container times 925 copies per title), and is hence significant.
- Average Narration Cost per Copy: was $3.04 during FY 2004, and has averaged $2.99 for the past 5 years [based upon initial incomplete PICS data, but is an average and probably very close to actual]. Note: there will be no differences in average narration costs per title under the distribution options.
- Average Duplication Cost per Copy: was $2.13 during FY 2004, and has averaged $2.23 for the past 5 years [based upon initial incomplete PICS data, but is an average and probably very close to actual]. Note: excludes container cost, which adds about $0.51 per copy, and is hence significant.
- Total Minutes Duplicated: was 967,593,791 during FY 2004, and has averaged 1,145,234,077 over the past 5 years [both values are based upon initial incomplete PICS data and are probably about 12% short]. This was calculated in order to determine weighted-average title duration.
- English/Non-English Titles: during FY 2004, 1,567 titles produced were in English and 32 were not in English, while during the past five years the corresponding averages were 1,737 and 34, respectively [based upon initial incomplete PICS data, and is probably about 12% short].
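The container-cost notes and the weighted-average title duration mentioned in the list above reduce to simple arithmetic. The following is a minimal Python sketch of that arithmetic. The function names and the three-title sample are illustrative assumptions and are not PICS data; only the $0.51, 925 and 2,000 values come from this section.

```python
# Baseline values quoted in this section.
CONTAINER_COST_PER_COPY = 0.51   # dollars per RC container (GFE)
COPIES_PER_TITLE = 925           # baseline copies per title
TITLES_PER_YEAR = 2000           # baseline new titles per year


def container_cost_per_title(cost_per_copy=CONTAINER_COST_PER_COPY,
                             copies=COPIES_PER_TITLE):
    """Container cost added to each title's duplication cost."""
    return cost_per_copy * copies


def container_cost_per_year(titles=TITLES_PER_YEAR):
    """Approximate annual container cost at baseline production."""
    return container_cost_per_title() * titles


def weighted_average_minutes(titles):
    """Copy-weighted average duration.

    titles: iterable of (duration_minutes, copies_produced) pairs.
    Equivalent to total minutes duplicated divided by total copies.
    """
    titles = list(titles)
    total_minutes = sum(minutes * copies for minutes, copies in titles)
    total_copies = sum(copies for _, copies in titles)
    return total_minutes / total_copies


# Hypothetical sample: longer titles produced in larger quantities
# pull the weighted average above the unweighted one.
sample = [(300, 500), (600, 900), (1200, 1400)]
unweighted = sum(minutes for minutes, _ in sample) / len(sample)
weighted = weighted_average_minutes(sample)
```

The per-title result is about $472 and the annual result is a little under $950,000, matching the rounded figures in the notes above.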
Additional calculations were performed using the PICS data extract in order to determine the percent of recorded book titles, based on an average production profile spanning FY 2000 – FY 2004, that would fit on DTB flash memory cartridges of three different capacities. Although these calculations were based upon the initial PICS extract that was missing approximately 1/8 of the records, the profile is likely identical for the entire population. Using the envisioned compression algorithm, 711 minutes of book length can be provided by 128 MB of flash memory capacity, which was an estimate provided by NLS.
A table in Appendix 7 shows for each of three cartridge capacities, i.e., 128 MB, 256 MB and 512 MB cartridges, the percent of titles produced that would require 1, 2, 3 or 4 DTB cartridges. If a 128 MB cartridge is used, about two-thirds of titles would fit on a single cartridge, about 30% would require two cartridges, 4% would require 3 cartridges, and 1% would need 4. If a 256 MB cartridge is used, then over 94% of titles would fit on a single cartridge, between 5% and 6% would require 2 cartridges, and about 1/3 of 1% would require 3. If a 512 MB cartridge is used, virtually all titles (99.7%) would fit on a single cartridge and the small remainder would fit on 2.
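The cartridge counts described above can be sketched as a ceiling division of title length by cartridge capacity. The Python below is an illustration only: it assumes that playing time scales linearly with capacity from the NLS estimate of 711 minutes per 128 MB, and the function names and sample durations are hypothetical.

```python
import math
from collections import Counter

MINUTES_PER_128_MB = 711  # NLS estimate quoted in this section


def cartridge_minutes(capacity_mb):
    """Minutes of audio per cartridge (assumes linear scaling with capacity)."""
    return MINUTES_PER_128_MB * capacity_mb / 128


def cartridges_needed(duration_minutes, capacity_mb):
    """Number of cartridges required for one title of the given length."""
    return max(1, math.ceil(duration_minutes / cartridge_minutes(capacity_mb)))


def cartridge_profile(durations, capacity_mb):
    """Percent of titles requiring 1, 2, 3, ... cartridges."""
    counts = Counter(cartridges_needed(d, capacity_mb) for d in durations)
    total = sum(counts.values())
    return {n: 100 * c / total for n, c in sorted(counts.items())}


# Hypothetical title lengths in minutes, not taken from the PICS extract.
sample_durations = [100, 700, 800, 1500]
profile_128 = cartridge_profile(sample_durations, 128)
```

Running `cartridge_profile` over the actual title durations for each of the three capacities would produce a table of the kind described for Appendix 7.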
RC book circulation in the national library program has averaged approximately 20,000,000 copies per year over the last five years, although the value for FY 2004 was closer to 19,000,000 copies. A value of 20,000,000 copies per year for systemwide DTB circulation is used in the planning model for total system circulation workload. This value is used even though possibly 5% of book circulation will continue to be on RC for non-converted titles after steady-state DTB operations have been achieved, and even though it is about 5% above the actual book circulation for FY 2004.
Appendix 8 shows statistics reported to NLS by network libraries for FY 2004, which include SRL readership and circulation with the associated RL. The table shows cassette readership and circulation data where IND_RDS is individual readers (about 95% of circulation); INST_RDS is institutional readers (the institution itself is the patron); IND_CIRC is circulation to individuals; INT_CIRC is circulation to institutions; MAG_CIRC is direct magazine circulation; ILL_CIRC is interlibrary loan circulation; and TOTBCIRC is total book circulation (the sum of IND_CIRC, INT_CIRC and ILL_CIRC, and excludes magazines).
The libraries are listed in descending order of total circulation, and the table shows the percent of total circulation that each operation constitutes (PCTOFTOT) and its rank (RANK). Activity ranges from a high in FL of over 2,100,000 (note: about half of which is from SRLs and half from the RL), to a low of about 1,100 for the Virgin Islands; excluding the VI, HI with about 27,000 is the lowest RL. The Maryland RL, whose operations were investigated to serve as a basis for network library operations, is highlighted in bold.
Summary statistics are shown at the bottom of the table; average circulation is about 315,000 and the median is about 233,000, and there is great variability. Maryland is below the mean, and only slightly below the median annual circulation. WY is not included as an entity in the summary counts since RC circulation services are provided entirely by another state. The graph in Appendix 9 shows the frequency distribution of these regional library systems for 100,000 copy-per-year circulation classes, i.e., the first class has circulation of less than 100,000, the second class has more than 100,000 but less than 200,000, etc.
The statistics shown in Appendix 10 are similar to those shown in Appendix 8, except that RLs and SRLs and their associated circulation for FY 2004 are shown independently, i.e., RL circulation net of SRL circulation is listed separately from SRL circulation, and the entries are ranked from highest to lowest circulation. Circulation ranges from a high of just over 1,200,000 for Los Angeles to a low of about 500 for an SRL in Virginia, with an average of about 141,000 and a median of about 61,000, with great variation in size. The graph in Appendix 11 plots the data in Appendix 10 and makes clear the profile of library circulation size, which is not normally distributed but rather shows a large number of small operations, a moderate number of larger "mid-size" operations, and a small number of large operations.
As stated above, the Maryland RL’s operations were reviewed in order to establish a basis for network library RC book distribution operations. NLS has concluded that while there are differences among libraries in the manner in which they perform book distribution, the procedures employed in Maryland are generally indicative of all operations but for a very few exceptions. The Maryland RL, excluding its one subregional library, has a book circulation of about 600 RC copies per day (ref. Appendix 12); including the SRL, the system has a circulation of about 800 copies per day.
The USPS makes one combined early morning delivery-pickup at the Maryland RL a day. The new titles, reader returns, reader non-deliveries, Braille books and other parcels are in mixed hampers, and must first be sorted out. There is a considerable variation in the volume of day-to-day receipts, which results in an uneven library workload, particularly for profile select-generated circulation (which is driven by copy limits and reader returns). This volatility is shown in graphs in Appendix 13 which shows actual daily shipping activity and average shipping activity for about a 2.5-month time period, and Appendix 14 which shows actual day-of-week and average day-of-week shipping activity for the time period.
New books are received (checked into the system, as well as physically handled) in the receiving area, and reader returns are received in the quick-pick area after the books have been shelved. One copy of each new title received is designated a master copy, and the container is marked with a red dot. An OCR serial number, or "P-Label," is placed on the right side of the container lid for all other copies. The P-Labels are then scanned to indicate receipt. All new copies, including the master, are then taken to main collection storage.
The shelving used in collection storage is of standard library cantilever design. The shelf spacing is on 12" centers, and storage is 7 shelves high. The shelves are 36" wide x 12" deep, and there are 7 storage slots in a shelf opening. There are therefore 49 storage slots per section. The capacity of one slot with 2-deep storage is 14 containers. The P-Label on the container is not visible from the storage aisle.
The books in collection storage are sequenced numerically by title number, and there is no formal locator in the inventory record. Only one storage slot is generally assigned to a title, and 41 shelf sections are needed to accommodate a full year’s collection (i.e., 2,000/49). A summary of the space now allocated to collection storage is presented in Appendix 15. As can be seen, RC storage accounts for approximately 29% of total facility area.
There are 34 sections of turnaround shelving and storage is 5 shelves high. The shelves are 36" wide by 12" deep, but container storage is only 1-deep. The containers are stored on their sides, so that the P-Label copy number is visible, and there are 23 books in a shelf opening. Total quick-pick shelf capacity is 3,910 cassette books, or about 6.5 average days throughput.
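The shelving figures given for collection storage and for the quick-pick (turnaround) area follow from the stated dimensions. A minimal Python sketch of that arithmetic is shown below; the function names are illustrative, and the default of 600 copies per day is the Maryland RL daily circulation cited earlier in this section.

```python
import math


def collection_slots_per_section(shelves_high=7, slots_per_shelf=7):
    """Storage slots in one section of collection shelving."""
    return shelves_high * slots_per_shelf


def sections_for_titles(titles_per_year, slots_per_section):
    """Sections needed when each title is assigned one slot."""
    return math.ceil(titles_per_year / slots_per_section)


def quick_pick_capacity(sections=34, shelves_high=5, books_per_opening=23):
    """Total containers that fit in the quick-pick area (1-deep storage)."""
    return sections * shelves_high * books_per_opening


def days_of_throughput(capacity, copies_per_day=600):
    """Average days of circulation that the quick-pick area can hold."""
    return capacity / copies_per_day


slots = collection_slots_per_section()        # 7 x 7 slots
sections = sections_for_titles(2000, slots)   # one year of new titles
capacity = quick_pick_capacity()              # 34 x 5 x 23 containers
days = days_of_throughput(capacity)
```

With the stated dimensions this gives 49 slots per section, 41 sections per year of new titles, 3,910 quick-pick containers, and about 6.5 days of throughput.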
Copies returned by patrons are first taken to the inspection ("rewind") area, where the containers are opened and the cassettes are checked, repaired or replaced, as necessary. A summary of the anomalies that are encountered in a typical workweek is presented in Appendix 16. As can be seen, by far the most common problem is cassettes not being rewound (about 50% of circulation based upon 3,000 copies per week); following this, each at about 1.5% of circulation, are damaged cassettes and an incorrect number of cassettes in the containers.
The books that pass inspection are delivered to the quick-pick area, and are placed in shelf storage the same day. Shelving space for one day’s receipts is provided by moving residual books to collection storage. This transfer is a daily procedure and is performed in a constant sequence so that books have about 6.5 workdays to reside in the quick-pick area before being transferred to main collection storage.
Each of the 170 shelf openings in the quick-pick area (34 sections of 5 shelves each) has a unique 3-digit number. The receiving of customer returns then consists of keying-in a 3-digit shelf location, and scanning the P-Label of each copy in that location from left to right (random storage). This makes the copies available for recirculation, and changes the patron record for the title from Has-Now to Has-Had.
Pick tickets for the next day’s shipments are generated during the overnight system batch routine and printed the first thing in the morning by the earliest-arriving staff, and are thus available shortly after the start of work. The pick tickets are 3" x 5" address cards and two separate strings of tickets are prepared – one for the quick-pick area and one for the main collection area. The pick tickets are prepared in storage location sequence in both instances. Some 62.5% of the pick tickets are for books in the turnaround area, and 37.5% are for books in collection storage, as shown in Appendix 17 based upon a sample of one week’s shipping activity.
The pick-tickets for the turnaround area have an assigned P-Label on them, as the system knows which copy is to be shipped and the patron Has-Now record has been charged. Order filling then consists of matching the P-Label number on the pick ticket/address card with the P-Label number on the book’s container. Verification of order shipment from the quick-pick area is done only on an exception basis, i.e., if a copy is not shipped for whatever reason, then the patron Has-Now record is relieved of that copy.
The pick tickets for the main collection storage area show only the title number, as the system does not know what copy of that title will be shipped. To make this bridge, there is also a transaction number printed on the pick ticket. The picking process then includes scanning the copy number (P-Label) on the container and the transaction number on the pick ticket. The patron Has-Now record is then charged and the copy is removed from available inventory.
The library has to enter ILL orders to the Multistate Centers (MSCs) effectively offline, i.e., although email is used, the process is manually-intensive. While there is confirmation from the MSC of order receipt and shipment, there is no verification of book return, and therefore no update to the patron Has-Now or Has-Had record is made possible or performed.
When storage space becomes critical, the books in main collection storage are consolidated. This consolidation consists of selectively placing more than one title in a storage slot, and a reduction of the number of copies per title retained, based upon both a circulation history and the judgment of the library Director. Space requirements are thereby reduced some 37%, but there is apparently no collection weeding or further consolidation at a later date, and the 7-high shelf spacing is not changed.
An analysis was performed to determine the loss rate for DTBs that will be experienced in the future distribution system. The assumption was made that the loss rate for DTBs will be approximately the same as that recently experienced for RC books.
Using data from eight regional and two subregional libraries that use the Keystone Library Automation System (KLAS), RC book losses were calculated as a percentage of RC circulation. The libraries form eight reporting entities: Alabama RL, Colorado RL, Idaho RL, Massachusetts RL and SRL (combined), North Carolina RL, New York SRL, the Cleveland and Cincinnati, Ohio RLs (combined), and Oklahoma RL. Data for each of four full consecutive fiscal years, FY 2001 - FY 2004, were evaluated.
There are two classes of RC losses reported in KLAS library systems: (1) books lost by readers and the USPS (i.e., lost when not in a library’s possession); and (2) books lost when in the possession of the libraries. Each of these two classes of loss was considered separately, as was total books lost, i.e., the sum of both loss classes that are tracked in and reported by the KLAS system. Both annual and 4-year average loss rates were calculated for each library, and both weighted and unweighted averages of these rates were determined across all libraries.
Appendix 18 shows circulation and loss statistics and calculations of loss rates for all libraries and years. The graph in Appendix 19 shows the loss rates for books lost by readers and USPS, Appendix 20 that for books lost while in possession of the library, and Appendix 21 the total for both classes of losses. As the data show and as Keystone noted, reporting conventions differ among libraries. Both the CO and MA libraries effectively do not charge patrons with losses, but rather charge the library with all losses. However, the NY SRL is at the other extreme; while for FY 2001 and FY 2002 it showed almost no losses in either class, in FY 2003 and FY 2004 it reported virtually all losses as lost by readers and USPS.
The reported loss rates in Appendix 19 range from lows of a fraction of a percent to slightly over 6% for OH in one year and OK in another. Variation within library across years is caused by relatively large loss write-offs occurring relatively infrequently, which is indicative of an operating procedure performed periodically rather than continuously. It was therefore necessary that data for multiple, consecutive years of activity be analyzed.
The graph in Appendix 20 shows data for books lost when in possession of the libraries. The high is OH in FY 2002 of almost 5%, while several libraries show lows in several years that are a fraction of a percent. While some variation within library across years is seen, most notably for OH, variations in this class of loss are noticeably less than that reported for the losses by readers and USPS class.
The total loss rate, i.e., combined reader/USPS and library losses, is shown in Appendix 21. In this graph, both the weighted (by circulation) and unweighted multi-library average loss rates are shown; the weighted average rate is generally (FY 2001 being the exception) slightly higher than the unweighted rate. The weighted loss rate falls slightly above, and the unweighted rate slightly below, 3% of circulation. This is consistent with the statistics in Appendix 18, which show the 4-year weighted average loss rate to be 3.2% and the unweighted loss rate to be 2.6%.
Lastly, a calculation was performed to determine the reader/USPS loss rate after excluding data from the CO RL, MA RL and SRL, and NY SRL for the reasons previously cited, i.e., all three use a reporting convention that reports all losses in a single loss class rather than in the classes in which they actually occurred. The motivation for this calculation is that the Duplication-on-Demand center(s) would not be subject to the "lost by library" class of loss because collections would not be maintained, i.e., a copy of a title could not get "misplaced" in a collection; only pilferage could cause such losses. These adjusted multi-state average loss rates, weighted and unweighted, closely parallel the same multi-state average rates for reader/USPS losses, i.e., the weighted loss rate falls slightly above and the unweighted rate slightly below 3% of circulation, as shown in Appendix 22. The 4-year multi-state weighted average loss rate for this reduced sample size is 3.0% (a 1.2% FY 2001 rate brought the 4-year average down to 3%) and the unweighted rate is 2.4%.
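The difference between the weighted and unweighted multi-library loss rates can be illustrated with a short Python sketch. The function name and the three-library sample are hypothetical and are not the Appendix 18 data. The weighted rate is total losses divided by total circulation, and the unweighted rate is the simple mean of the individual library rates.

```python
def loss_rates(records):
    """Return (weighted, unweighted) loss rates as fractions of circulation.

    records: list of (circulation, copies_lost) pairs, one per library.
    """
    per_library = [lost / circ for circ, lost in records]
    unweighted = sum(per_library) / len(per_library)
    weighted = (sum(lost for _, lost in records)
                / sum(circ for circ, _ in records))
    return weighted, unweighted


# Hypothetical libraries: the largest one has the highest loss rate,
# so the circulation-weighted average exceeds the unweighted average.
sample = [(1_000_000, 40_000), (200_000, 4_000), (100_000, 1_000)]
weighted, unweighted = loss_rates(sample)
```

A weighted rate above the unweighted rate, as reported here, indicates that the higher-circulation libraries in the sample tend to report the higher loss rates.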
While the loss rate experienced by the Duplication-on-Demand Center(s) may, in theory, have the potential of being slightly lower than that experienced by the libraries (because no collections would be maintained by the Centers), both because the calculated estimates are so close and because of the variability that exists in the data, the same loss rate is assumed for the Center(s) and the libraries. Therefore, the baseline value of the key variable "Loss Rate of DTBs" used in the cost evaluation model is 3%.
In order to help determine the quantity of DTB cartridges needed as a working inventory for the DOD Center(s) under the Hybrid and All DOD options, and to determine maximum circulation of copies from library collections that can be achieved under the Mass Duplication and Hybrid options, it is first necessary to estimate what the average turnaround time for DTBs will be in the future system. This is the average time that a DTB will spend outbound in the USPS mail, with a reader, and inbound in the mail, i.e., the time elapsed between the date that a DTB is shipped to a reader and the date that it is received back at the library or DOD center.
The assumption was made that average turnaround time for DTBs in the future system will be identical or very similar to the average turnaround time for RCs in the current system. An analysis of RC turnaround time during FY 2004 was performed using data from 10 regional libraries and two subregional libraries. The data were obtained via special extracts from libraries using the KLAS (previously cited) and CUL library automation systems and the sample, though not scientifically selected, was chosen by NLS to incorporate both geographic and size diversity.
The table in Appendix 23 shows the average turnaround times for the eight KLAS library entities previously cited: Alabama RL (without its SRLs), Colorado RL, Idaho RL, Massachusetts RL and SRL combined, North Carolina RL, New York SRL, Ohio Cincinnati and Cleveland RLs combined, and the Oklahoma RL. Average turnaround times ("Days Out") ranged from a low of about 22 days for Idaho to a high of 41 days for the two Ohio RLs. The source data included a few records for books loaned during FY 2004 but not yet returned as of the date the report was run in FY 2005; these records were excluded from the calculation of the average turnaround times.
The unweighted average turnaround time among these libraries was 32.3 days and the median was 31.4 days. The time that individual titles were out ranged from 0 days (for walk-ins) to 573 days. While other statistics could be computed from the source data, such as frequency distributions of turnaround times, only average turnaround time was needed for estimating the costs of distribution options so they were not developed.
The Iowa RL performed calculations using their CUL system and reported an average RC book turnaround time of 35 days during FY 2004. A median value of 21 days was also reported, indicating that a majority of books have turnaround times below the average. A mode of 13 days was also reported. These values are shown in a table in Appendix 24.
The Carnegie Library of Pittsburgh (PA(W) RL) performed calculations using their CUL system and provided a frequency distribution of RCs returned during FY 2004 by 8 classes of turnaround time ranging from "less than 8 days" to "greater than 360 days," which is shown in Appendix 25. The midpoint of each class range and the number of returns for each class were then used to estimate the average turnaround time, which was found to be 37.0 days.
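The midpoint method described above can be sketched in a few lines of Python. The class boundaries and return counts below are hypothetical and are not the Appendix 25 data. In particular, an open-ended class such as "greater than 360 days" needs an assumed upper bound before a midpoint can be taken, and the choice of that bound affects the estimate.

```python
def grouped_mean(classes):
    """Estimate a mean from a frequency distribution using class midpoints.

    classes: list of (lower_days, upper_days, returns) tuples.
    """
    total_returns = sum(n for _, _, n in classes)
    weighted_sum = sum((lower + upper) / 2 * n for lower, upper, n in classes)
    return weighted_sum / total_returns


# Hypothetical four-class distribution of returns by days out.
sample_classes = [
    (0, 8, 100),
    (8, 30, 500),
    (30, 60, 300),
    (60, 360, 100),
]
estimate = grouped_mean(sample_classes)
```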
The table shown in Appendix 26 presents the calculation of unweighted and weighted average RC turnaround times for FY 2004 for all libraries analyzed. The minimum value was 21.6 days and the maximum was 41.0 days. The unweighted average was 33.0 days, the median was 33.5 days, and the weighted average was 35.3 days (using FY 2004 RC circulation reported to NLS as the weighting factor). A bar graph in Appendix 27 shows FY 2004 average RC turnaround times for all libraries in the sample.
Given these findings, a value of either 34 or 35 days should be used as the baseline value for average DTB loan turnaround time in the cost model, with a range from 33 to 36 days. The baseline value used in the cost model, possibly erring very slightly on the side of conservatism, is 35 days.
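One common way to relate average turnaround time to the number of copies out on loan, and to the number of loans a single copy can support in a year, is Little's law (average items in a system equal the arrival rate times the average time in the system). The Python sketch below applies that relationship to the baseline values. It is an illustration under that assumption and is not necessarily the calculation used in the cost model; the function names and the 365-day year are assumptions.

```python
def copies_outstanding(annual_circulation, turnaround_days, days_per_year=365):
    """Average number of copies out on loan at any one time (Little's law)."""
    return annual_circulation / days_per_year * turnaround_days


def max_loans_per_copy(turnaround_days, days_per_year=365):
    """Upper bound on loans per copy per year, ignoring library handling time."""
    return days_per_year / turnaround_days


# Baseline values from this section: 20,000,000 copies per year, 35 days.
baseline_outstanding = copies_outstanding(20_000_000, 35)
baseline_loans = max_loans_per_copy(35)

# Sensitivity over the 33 to 36 day range given above.
sensitivity = {days: copies_outstanding(20_000_000, days) for days in range(33, 37)}
```

The estimate scales linearly with turnaround time, so the choice between 34 and 35 days changes it by about 3%.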
An analysis of Has-Had RC book circulation data for FY 2004 for the 10 KLAS network libraries (8 reporting entities) in the sample was performed in order to determine what proportion of total recorded book circulation is generated automatically by library automation systems via profile select versus that which is generated by orders placed for specific titles. While not a key variable in the cost model, because it influences the timing of circulation workload placed upon the book distribution system, it was examined to assist in conceptualizing the processes necessary in future DTB distribution operations.
The "Select Code" field in the KLAS Has-Had records was used to determine whether a book was circulated in response to an order for a specific title (via mail, email, telephone or walk-in-based orders) or was selected based upon reader preferences (subject, author and/or series preferences). Codes used by both the Version 5 and Version 7 KLAS systems were included in the data records and were used to designate circulation as either profile-select or reserve/request based.
The table in Appendix 28 shows the proportions of circulation that were profile select or reserve-request based for each library in the sample and for the group as a whole. With the sole exception of Alabama, which had only about 32% of circulation generated by profile select (note: Alabama data is circulation for the RL only, and does not include circulation for five SRLs in the Alabama system, and is unique in this regard among the libraries in the sample), all libraries had a majority of total circulation generated by profile select. Excluding Alabama, the lowest was Colorado with 55.7%, and the highest was that for the two Ohio RLs with 77.6%.
The unweighted average portion of total circulation that was profile select-based among all libraries was 59.5%, the median was 60.6%, and the weighted average was 65.0%. A graph in Appendix 29 shows the percentage of total circulation that was profile-select based for all libraries in the sample. The baseline value used for planning purposes is 65% of total circulation being profile select-based.
Detailed RC circulation data for each of the four fiscal years FY 2001 – FY 2004 from the sample KLAS libraries and CUL-based data from the Iowa RL were analyzed. This was done in order to help determine profiles of book demand necessary for estimating workload and associated costs under the distribution options.
Pareto’s Law effectively states that, in many applications, a distinct majority of an activity is associated with a minority of the entities in the population that generates the activity. This concept is critical to the economic feasibility of the Hybrid option from the perspective of NLS. The Hybrid option attempts to exploit this phenomenon by mass-duplicating the most popular titles (the minority) and producing the less popular titles (the majority) by DOD. The principal advantage is that this both minimizes the investment in relatively expensive book media and leaves the majority of the distribution workload with the network libraries, which have traditionally performed this role and borne the associated costs.
Analyses were first performed in order to determine whether Pareto’s Law indeed applies to network library book circulation, for both individual libraries and systemwide; if so, to what degree it applies; and what the ramifications are for DTB distribution options. It was determined that Pareto’s Law clearly applies to circulation at all of the sample libraries, for all four years for which data were provided and analyzed, both individually and, much more importantly, in the aggregate.
Appendix 30 shows a table with the cumulative percent of total RC circulation for FY 2004 generated by title classes, each of which constitutes 5% of the total RC collection, with the titles sorted in descending order of popularity. The rule-of-thumb for such a relationship is that 20% of the entities in a population account for 80% of the activity. However, each of the nine library systems examined, with the single exception of OH (79%), reaches 80% or more of total circulation with the most popular 15% of titles in the collection; the highest is NC, with 88.7% of circulation being generated by the top 15% of titles in the collection. In fact, NC and the NY SRL achieve 80% of total circulation with the top 10% of titles in their collections. The cumulative percent of circulation vs. cumulative percent of titles in collection relationship for all of the libraries in the sample for FY 2004 is shown in a graph in Appendix 31, where it is apparent that the same relationship essentially holds in all cases; activity for FYs 2001, 2002 and 2003 was also analyzed and found to have basically the same pattern. One of the most striking findings in these data is that approximately half of the titles in the book collections do not circulate at all.
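The cumulative-share calculation behind a table such as the one in Appendix 30 can be sketched as follows. This is only an illustration: the function name and the 20-title sample collection are hypothetical and are not drawn from the library data analyzed in this report.

```python
def cumulative_share(circulation_by_title, top_fraction):
    """Share of total circulation generated by the most popular titles.

    circulation_by_title: annual circulation counts, one entry per title
    top_fraction: portion of the collection to include, e.g. 0.20 for the top 20%
    """
    ranked = sorted(circulation_by_title, reverse=True)  # most popular first
    total = sum(ranked)
    if total == 0:
        return 0.0
    n_top = round(len(ranked) * top_fraction)
    return sum(ranked[:n_top]) / total


# Hypothetical 20-title collection in which half of the titles never circulate.
sample = [400, 250, 120, 80, 50, 40, 30, 15, 10, 5] + [0] * 10

for fraction in (0.05, 0.10, 0.15, 0.20):
    share = cumulative_share(sample, fraction)
    print(f"top {fraction:.0%} of titles: {share:.1%} of circulation")
```

Applying the same function to the summed circulation of all libraries by title would give the combined (weighted) systemwide profile.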
However, even if the Pareto’s Law effect holds for each library individually, it may not necessarily hold for the aggregate, i.e., systemwide circulation, if readers in different libraries have fundamentally different preferences. In theory, systemwide demand could potentially be uniform if the most popular titles at some libraries are the least popular titles at other libraries, and vice versa. But as clearly demonstrated by the graph in Appendix 32, the weighted average combined circulation by title of all libraries, which is the value needed both to estimate systemwide costs and to determine whether the circulation profile makes the Hybrid option feasible, follows the same pattern as that of the individual libraries, with the top 20% of titles in the collection accounting for 80% (FY 2001) to 83% (FY 2004) of total circulation over the four years examined.
In the graph in Appendix 33, weighted average systemwide circulation is plotted against unweighted (i.e., the average of all nine library systems in the sample) average systemwide circulation for FY 2004. The curve for the weighted average systemwide circulation falls very slightly below the curve for the unweighted average systemwide circulation. This difference is essentially caused by reader preferences that vary geographically. However, as stated above, systemwide weighted average circulation nevertheless adheres to Pareto’s Law with about 83% of total circulation being generated by the most popular 20% of titles in the collection.
An analysis was then performed in order to address another issue concerning Pareto’s Law and its application to the DTB distribution system. This concern is that, even though we now know that Pareto’s Law holds for combined network library circulation, what would be experienced at individual libraries, especially small ones, if the profile for the combined (systemwide) circulation were used. The table in Appendix 34 shows what the resulting circulation profiles at the nine individual library systems would be if the titles are ranked in descending popularity order for the combined profile. The conclusion is that the most popular books systemwide are also, generally, the most popular books in all the individual libraries as well. With the single exception of ID, with 77.1%, all libraries would achieve 80% or more of total circulation from the top 20% of titles in a collection whose ranking is based upon systemwide (combined) circulation. Some libraries would achieve 90% of circulation from the top 20% of titles in the collection, if they had the choice of which titles to use.
Appendix 35 shows the results of an analysis of combined circulation for the sample libraries which determined the incremental contribution of various proportions of annual title production to total annual circulation, which is the most relevant concern for a "title-based" Hybrid model. In this table, the "Top 5%" class includes the circulation generated by the most popular 5% of titles produced for each and every production year for all the titles in the collection, i.e., the top 5% produced during FY 2004, the top 5% produced during FY 2003, etc. These analyses were performed for all four years FY 2001 – FY 2004. While the profile demonstrates a Pareto’s Law characteristic, i.e., a minority of titles constituting a majority of activity (the top 15% of titles account for about 50% of total demand), the tendency is less than that observed for collection-based proportions. The production-based profile indicates that approximately 39%-40% of the most popular mass-duplicated titles account for about 80% of total circulation. The reason for the difference in the two profiles is simply aging of the titles (as described below) and a consequent significant waning of popularity, so that, e.g., some titles in the second quintile of a recent production year have more circulation than titles that were in the first quintile of their production year but are now many years old.
Appendix 36 contains a table which shows the production levels for title classes sorted from most to least popular titles in the collection. Both production levels (copies per title produced) and cumulative production levels are shown for all classes. For example, the average number of copies per title produced for books that constitute the most popular 20% of titles in the collection was 1,114 (using the value for FY 2004), whereas 1,224 copies per title were produced for the top 5% of titles. This finding is consistent with the fact that libraries order more copies of titles that they expect to be more popular than average, and NLS produces them in larger quantities than average; titles expected to be of below-average demand are ordered and produced in smaller quantities than average. The data indicate that libraries have a very good sense of the relative popularity of book types, as there are very few anomalies in the relationship between quantity ordered and popularity (however, this could also be a partially self-fulfilling prophecy, as the books with more copies will be relatively more available to circulate). However, the age of a title and its contents/author also strongly affect demand, as shown below.
The data in Appendix 36 also indicate that demand does not vary with title length. That is, there is apparently no correlation between title demand and title length, and this was apparently the case for each of the four years FY 2001 – FY 2004.
Appendix 37 shows the average number of copies per title produced for various proportions of annual title production, corresponding to the title classes previously cited in Appendix 35, wherein the "Top 5%" class is the most popular 5% of titles produced for each and every production year for all the titles in the collection, i.e., the top 5% produced during FY 2004, the top 5% produced during FY 2003, etc. The values in this table show exactly the same trend as do the copies per title data in Appendix 36, i.e., the higher the title class in the ranking, the greater the number of copies per title originally produced.
Demand for books is known anecdotally to vary strongly with title age, i.e., readers in the program generally like to read the most recent materials and new releases as opposed to titles many years old; there are, of course, a few exceptions to this rule which are generally referred to as "perennially popular" titles. An analysis was therefore conducted in order to determine how book demand wanes with title age.
Appendix 38 contains data for FY 2001 – FY 2004 which shows for each fiscal year of production, for 1988 (which includes all prior years) through FY 2004, how each cohort of books produced annually contributes to total circulation in each of the four years. These data were reformatted to create the table shown in Appendix 39, which shows how titles in 1-year age classes contribute to total circulation for each of the four years examined and a 4-year average. Note that for the age class "1 Year or Less," which are titles produced during the year for which circulation is being considered, a title on average has only a half year to circulate. That is, titles are produced continuously throughout the year, with some having virtually all year to circulate, some having a reasonable portion but less than an entire year, and some having very little time to circulate. This is why circulation associated with this class is lower than that associated with the "2 Years or Less" class.
An interesting finding resulting from these data is that weeding of mass-duplicated copies can begin during or at the end of the third year after production, potentially reducing the number of copies per title that have to be stored in collections significantly, maybe by as much as 50%, and possibly about another 15% (of the original quantity, or 30% of the remaining quantity) over the next three to five years. Utilizing such procedures, cartridge reuse rates after attaining steady-state operations for library-managed books could be as high as 68%, but to assume any higher is optimistic given attrition at 3% of circulation, as cited later in the report. Based upon current RC operations, a reuse rate of 47% would be expected (based upon the disposal/recycling throughput of RCs as reported by the NLS RC recycling contractor).
Another possibility is that, if all titles were mass-duplicated and "migrated" to DOD production after 6 years of distribution by the libraries, regardless of how relatively popular any individual titles are at that time, the resulting workload split would be about 80% of total circulation performed by the libraries and 20% performed by the DOD centers. If the rule of migration to DOD after 3 years of age were used, a 60% libraries/40% DOD workload split would result. Migration after 8 years of age would result in an 85% libraries/15% DOD workload split. A table in Appendix 40 shows that the demand for fast movers (in this instance, the top 20% of titles in both FY 2001 and FY 2002 were considered) wanes at virtually the same rate as does that for all books, despite some that are "always popular."
Appendix 41 shows results of univariate linear regression analysis for title circulation as a function of title age. In this analysis, one calculation was performed including records for book titles in the "Less than 1 year old" age class, and another was performed excluding them for the reason cited above. There is a clear correlation between title circulation and title age, but the R-Squared value in "the 20’s" indicates that factors other than age also influence demand.
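For readers unfamiliar with the technique, a univariate least-squares fit and its R-Squared value can be computed as in the sketch below. The ages and circulation counts are made-up illustrative values, not the data behind Appendix 41, and the function name is an assumption.

```python
def linear_regression(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-Squared: portion of the variance in y explained by the fitted line.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared


# Hypothetical title ages (years) and annual circulation per title.
ages = [1, 2, 3, 4, 5, 6, 7, 8]
circulation = [30, 55, 20, 35, 5, 25, 10, 12]

slope, intercept, r_squared = linear_regression(ages, circulation)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, R-Squared={r_squared:.2f}")
```

A negative slope with an R-Squared well below 1 corresponds to the situation described above: demand falls with age, but age alone leaves much of the variation unexplained.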
As previously mentioned, libraries order and NLS produces more copies per title for books that are expected to be relatively popular and fewer copies for those expected to be less popular than average. Appendix 42 shows the results of univariate linear regression analysis for title circulation as a function of the number of copies of the title produced. Although this factor is known to have some influence upon demand, there is virtually no correlation between demand and original quantity produced. This occurs because title popularity is also strongly correlated with title age and book subject/author. Thus an older title, which may have been very popular in its first few years of circulation, may have relatively little circulation after several years in the collection but was produced originally in relatively large quantities in order to support circulation for the first several years after production.
Appendix 43 shows the results of a bivariate linear regression analysis for title demand as a function of both age and quantity produced. There is a clear correlation between title circulation and title age and production quantity, but the R-Squared value in "the 20’s" indicates that factors other than these also influence demand, and most of the correlation is driven by age rather than quantity produced.
An analysis of book demand by subject was also performed in order to gain insight into which types of subjects are relatively more or less popular in the program. A similar analysis by author was not performed and is not as useful for planning because (1) as current authors retire and/or die, while the works of some may retain popularity for some years, no further titles will be written by them; and (2) new authors will arise in the future, some of whom will become popular during the period that the flash memory-based DTB system is in operation, and the future popularity of their works is simply unknown at this juncture.
For this purpose, the "Genre" subject code used by Massachusetts, as provided by Keystone Systems, was used to identify NLS titles by subject. The NLS Copy Allotment Subject Code was not used because the PICS data extract that contained this data element was missing approximately 1-out-of-8 records. A listing of the Massachusetts subject codes and their definitions is shown in Appendix 44.
Appendix 45 contains a table with title circulation by subject for all titles in the collection and for the most popular 5%, 10%, 15% and 20% of titles in the collection. This listing is arbitrarily sorted in descending order for the most popular 10% of titles in the collection; it could instead be sorted in a similar manner using the 5%, 15% or 20% classes. As these data show, some subjects are very much more in demand than other subjects.
Appendix 46 and Appendix 47 show circulation profiles for FY 2004 for the most popular 20% and 10% of titles ranked in descending order by subject, respectively. For example, in Appendix 46, Mysteries are the most popular subject. There were 3,788 total mystery titles in the NLS RC collection in FY 2004 (8.3% of titles in the collection), of which 1,484 were among the most popular 20% of titles, and of the circulation generated by the top 20% of all titles in the collection (which we know was 83.5% of total circulation during FY 2004), Mysteries account for fully 16.2%, followed by Romance with 10.1%, etc. The cumulative percent of circulation associated with each subject is also shown.
The major conclusions regarding book circulation and their impacts upon the DTB distribution system are: (1) Pareto’s Law does apply to the circulation at individual libraries and for combined systemwide circulation, which is a necessary condition for implementation of the Hybrid option, but the relationship for a title-based Hybrid is less than a pure 80/20 ratio due to the waning popularity of titles with age regardless of their initial popularity; (2) the number of copies per title produced, which is the sum of libraries’ orders for a new title in anticipation of expected reader demand for a title, is related to demand but the relationship is weak because age and subject/author also very strongly influence demand; (3) the length of a title has no influence upon demand; (4) demand is strongly correlated with title age, and demand begins to wane significantly three years after a title is produced, and after another three-to-five years (6-to-8 total) approaches a residual, asymptotic level; (5) although demand is correlated with the author of the title, using author as a predictor variable for future planning is not prudent for the reasons cited; and (6) demand is strongly correlated with subject matter, with Mysteries, Romance, Suspense, Westerns and several other genres being by far the most popular subjects.
However, there is not a single algorithm which can be used to forecast demand or determine precisely a weeding rate for books. In the case of weeding, for example, the "perennially popular" titles must be identified by librarian knowledge and system-generated circulation reports. In the case of production, occasionally there is a popular author who produces a title in what is otherwise a relatively unpopular subject. These are exceptions to the rule, however, and would not prevent the NLS copy allotment system (which is based upon defaults for subjects, modified when needed to compensate for characteristics of individual titles) from being effective if it employs the general rules.
Appendix 48 presents a table, provided by NLS, depicting the estimated wholesale unit prices for 128, 256 and 512 MB flash memory cartridges, by year, for 2001 through 2008. Because the cost model assumes that a steady-state operation will be in place sometime after 2008, a wholesale unit price of $6 per 256 MB DTB cartridge is assumed as the baseline standard and cost value for use in the model. A 512 MB cartridge is assumed to have a wholesale unit price of $9 per cartridge, while it is assumed that 128 MB cartridges probably will not be available on the market.
The write speed of flash memory is assumed to be about 2 MB/s by the time steady-state operations are achieved; the value may indeed turn out to be twice that. For the erase speed of flash memory, it is assumed for planning purposes that only several seconds (1-5) will be required to erase the average-size DTB. Given the average size of an NLS DTB, i.e., about 120 MB in compressed format, it will require at least 30 seconds and possibly 60 seconds to first erase and then load an average DTB title in duplication operations.
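The 30-to-60 second range follows from the assumed title size and write speeds, as the short calculation below illustrates. The function name is hypothetical; the sketch simply adds the assumed erase time to the transfer time, so the totals run a few seconds above the rounded planning figures.

```python
def erase_and_load_seconds(title_mb, write_mb_per_s, erase_s):
    """Time to erase a cartridge and then write one title to it."""
    return erase_s + title_mb / write_mb_per_s


AVG_TITLE_MB = 120  # average compressed NLS DTB size assumed in this report

for write_speed in (2.0, 4.0):   # assumed speed, and twice that
    for erase_s in (1, 5):       # assumed erase time range, in seconds
        total = erase_and_load_seconds(AVG_TITLE_MB, write_speed, erase_s)
        print(f"write {write_speed} MB/s, erase {erase_s} s: {total:.0f} s per copy")
```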
Appendix 49 presents the library automation systems used by network libraries as of early in 2005. A total of six systems are used, of which two are unique to individual libraries (Texas and Albany NY) and four are used by multiple libraries (CUL, KLAS, READS and SIRSI). Libraries currently using the SIRSI system (5 libraries) are migrating to other systems because of the support level being received. READS was developed and is supported by NLS (used by 15 library systems). KLAS is by far the predominant system, used by 29 RL systems.
Although specific impacts upon network library ADP systems are to be assessed in the next task of this project, an estimate was needed for the cost impact of changes to library ADP systems required by the various DTB distribution options. To this end, a macro-level functional specification for library ADP systems under each of the distribution options was developed and presented to Keystone Systems in order to obtain a "ballpark estimate" of costs associated with such changes. These functional specifications are shown in Appendix 50.
Keystone's response indicated that changes to their system would be considered part of regular improvements to their system and would be absorbed as part of ongoing software maintenance. They also noted that any third-party software that would need to be integrated would have to be paid for through an increased cost to users. Since the details of the selected option are a requirement of Task 2 of this study, we do not yet know whether third-party software will be required. In addition, the READS system is provided to cooperating libraries by NLS without cost. Thus, at this time we are using a zero cost factor for changes that might be required to library ADP systems.
The conceptual operating modes for a DOD Center and its roles in the distribution system for DTBs described here were developed to provide a rational way of estimating prospective costs, and for evaluating other meaningful attributes of the Hybrid Distribution option. Alternative ways of satisfying these functional requirements can later be examined, if desired, should the Hybrid distribution option be selected by the NLS for implementation. Because the Hybrid option involves both mass-duplication of DTBs and DOD of DTBs, it was developed in the most detail, with facets also applicable to the All Mass Duplication and All DOD options.
Nominally, 20% of circulation could eventually be provided by DOD facilities under the Hybrid option, and two geographically separated production facilities may be required (primarily because of risk diversification concerns). The mission of these facilities will be to fill reader demand for books that are not stocked by some or all of the libraries. These will be the slowest-moving titles that now account for about 20% of reader demand. This will include some of the orders that are now sent to the MSCs. Foreign language titles could also be handled by the DOD Centers, but there also might be good reason to maintain collections in some network libraries.
For costing purposes, we must necessarily assume that the Duplication-on-Demand facilities will be operating at full capacity at some future point-in-time (TBD). We would like to discuss with NLS how long the facilities would operate before being made obsolete by newer technology; a nominal estimate used is 10 years after the transition to flash memory-based DTBs is completed in circa 2012.
To pursue the costing of this distribution alternative, some appreciation of the number of copies that these facilities would duplicate will be needed. For now, a simple ratio of total demand should suffice for macro-simulation, as noted below:
Number of copies:
- Total annual throughput (20% of 20,000,000): 4,000,000
- Copies per day @ 250 working days per year: 16,000
- Copies per day for each of two facilities: 8,000
- Copies per minute for a net 400-minute workday: 20
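The chain of ratios above can be reproduced with a few lines of arithmetic. The constant names below are illustrative only; the values are those stated in this section.

```python
ANNUAL_CIRCULATION = 20_000_000   # systemwide copies circulated per year
DOD_SHARE_PERCENT = 20            # nominal share of circulation filled by DOD
WORKING_DAYS_PER_YEAR = 250
NUMBER_OF_CENTERS = 2
NET_MINUTES_PER_WORKDAY = 400

annual_dod_copies = ANNUAL_CIRCULATION * DOD_SHARE_PERCENT // 100
copies_per_day = annual_dod_copies // WORKING_DAYS_PER_YEAR
copies_per_day_per_center = copies_per_day // NUMBER_OF_CENTERS
copies_per_minute_per_center = copies_per_day_per_center // NET_MINUTES_PER_WORKDAY

print(f"Total annual throughput:     {annual_dod_copies:>10,}")
print(f"Copies per day:              {copies_per_day:>10,}")
print(f"Copies per day per facility: {copies_per_day_per_center:>10,}")
print(f"Copies per minute:           {copies_per_minute_per_center:>10,}")
```

Changing the DOD share or the number of facilities in this sketch shows how sensitive the per-minute production requirement is to those two assumptions.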
This would eventually be the production throughput of each of two Duplication-on-Demand Centers, but there would be much less production capacity needed at startup (which will be addressed in subsequent project tasks). The building configuration and the layout design would provide for modular expansion at a later date, with minimum disruption to ongoing operations.
The achievable throughput of a production line would be between 20 and 30 copies per minute, depending solely upon the speed of duplication; the duplication speed could turn out to be half of this value. The design specification for the duplicator line should therefore be at least 30 copies per minute. All upstream and downstream production tasks must also be accomplished in this time frame to prevent bottlenecks, and duplicate workstations and equipment would therefore be provided as needed.
The handling units in this instance would be the DTB cartridge and probably a reusable shipping/return envelope, and there would never be more than one cartridge in a shipment. The envelope would be made of a flexible durable plastic, and would have a fastener (such as a zip-lock) that could be readily opened and closed (TBD). The routing would be by USPS letter mail, which may prove to be a faster and more reliable service than parcel mail.
The cartridge labeling would include all of the print information that is now on a cassette, plus a bar-coded unique transaction number which would be assigned by the computer system at the time of order entry. Braille labeling would be provided for Braille readers, who constitute 5% of the readership.
The cartridge must be conveyable through the automated labeling, duplication and packing stations. It would most likely resemble the present cassette to satisfy this functional handling requirement, but would be smaller than a cassette. It may also have some accommodation to make the labels more easily removable when the cartridges are prepared for recirculation.
A DOD Center would have two distinct production areas, which would be operationally independent, as follows:
- Input operations, including receiving, check-in of patron returns, and reconditioning of cartridges;
- Output operations, including cartridge labeling, duplication, check-out of patron shipments, packing and shipping.
The two operations could possibly have different starting times (TBD). However, output operations would always end on or before the 5:00 PM deadline for same-day shipment of patron orders. DOD service response times would therefore compare favorably with present response times for library shipments.
A sizeable storage area for recycled cartridges would be located between the two operating areas. This buffer is needed to counter the variations in cartridge input and output. When workflow is in balance, the area would be half-full. Perhaps an 8-hour supply of cartridges would be adequate to provide this buffering capability. Two-hundred-forty new cartridges would be added to the buffer supply on an average day, to provide for an expected attrition rate of 3% of circulation.
The upstream operations would begin by setting aside the unopened containers of books returned by the USPS as undeliverable. These would be taken to the DOD office for special processing. A material handler would then distribute the unopened envelopes to the receiving workstations, and later, take the emptied envelopes to the packing stations for reuse. Two-hundred-forty new envelopes would be needed on an average day, to provide for an expected attrition rate of 3% of circulation.
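The daily replacement quantities for cartridges and envelopes follow directly from the per-Center throughput and the assumed attrition rate. The sketch below shows the arithmetic; the function name is hypothetical, and the annual figure is derived here for illustration rather than taken from the report.

```python
def daily_replacements(daily_circulation, attrition_rate):
    """Units that must be added per day to offset attrition."""
    return round(daily_circulation * attrition_rate)


DAILY_COPIES_PER_CENTER = 8_000   # copies per day for each of two facilities
ATTRITION_RATE = 0.03             # 3% of circulation
WORKING_DAYS_PER_YEAR = 250

new_cartridges_per_day = daily_replacements(DAILY_COPIES_PER_CENTER, ATTRITION_RATE)
new_envelopes_per_day = daily_replacements(DAILY_COPIES_PER_CENTER, ATTRITION_RATE)
# Derived for illustration: yearly cartridge replacements at one Center.
new_cartridges_per_year = new_cartridges_per_day * WORKING_DAYS_PER_YEAR

print(new_cartridges_per_day, new_envelopes_per_day, new_cartridges_per_year)
```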
At a receiving workstation, each cartridge would be passed through a reader (e.g., a bar code reader) to notify the computer system that the book has been returned. The computer system would then notify the library of the return, using either the library’s transaction number or possibly the DOD transaction number (TBD). This notification could be either on-line or batched, depending on when and how the library would use the information (TBD).
The labels would then be manually removed from the cartridges (the label specification will call for ready removal), and the reconditioned cartridges would be placed into stackable tote boxes. A tote box would hold 180 to 240 cartridges, depending upon the cartridge thickness. The tote boxes would be stored in-process on flatbed carts, and there would be 20 tote boxes per cart.
Labels for cartridge duplication will be prepared in the office and there will be two label printing machines to provide the needed capacity. The labels would be prepared using a database containing the label format for all titles, in combination with the transaction number and the title number, which will reside in the customer file. Only a barcoded transaction number will appear on the cartridge label. For each title to be duplicated, the system would create a label and associate it with the barcoded transaction number.
There would be two cartridge labeling machines to provide both scheduling flexibility and backup protection, and the labeling rate of each machine would be 25 cartridges per minute. Unlabeled cartridges would be fed into the labelers from the in-process tote boxes, and after labeling, the cartridges would be automatically loaded into dispenser containers. The labeler would be off-the-shelf equipment, but some modification or addition to the takeaway conveyor may be required. The dispenser containers would later interface with metering devices that would automatically singulate cartridges onto the conveyor feeding each duplicator. The design of the dispenser containers and the associated loading/metering devices will be an integral part of the duplicator systems design.
The duplicators will in all likelihood be custom-designed equipment, but may turn out to be off-the-shelf equipment that is customized to the NLS DTB cartridge standard depending upon the evolution of flash memory-based commercial book production. For a duplicating capacity of 20 cartridges per minute, there would be 10 duplicators (TBD – if 1 minute per copy is required for duplication of the average DTB rather than 30 seconds, then 10 duplicators would still be used but operated on two rather than one shift per day). A duplicator would have an input and an output conveyor (or chutes) that would be an integral part of the duplicator design. The input conveyor would probably be of the indexing type, whereby, on signal the entire conveyor is moved forward one cartridge interval at a time. The indexing signal would be provided by the duplicator, immediately following duplication of a cartridge.
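The duplicator count and the two-shift alternative can be checked with the sketch below, which assumes each duplicator loads one cartridge at a time. The function names are hypothetical.

```python
import math


def duplicators_needed(target_copies_per_minute, seconds_per_copy):
    """Duplicators required to sustain the target rate on a single shift."""
    copies_per_duplicator_per_minute = 60 / seconds_per_copy
    return math.ceil(target_copies_per_minute / copies_per_duplicator_per_minute)


def daily_output(duplicators, seconds_per_copy, minutes_per_shift, shifts):
    """Copies produced per day by a bank of duplicators."""
    return int(duplicators * 60 / seconds_per_copy * minutes_per_shift * shifts)


# 30 seconds per copy: 10 duplicators sustain 20 copies per minute.
print(duplicators_needed(20, 30))
# 60 seconds per copy: the same 10 duplicators reach 8,000 copies per day
# only when operated on two 400-minute shifts.
print(daily_output(10, 60, 400, shifts=2))
```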
The inventory of all titles that could be duplicated by a Center would reside in a large file, or "image," server utilizing hard-drive storage for quick read-times rather than optical jukebox-based storage. If 40,000 titles are to be stored eventually, in compressed format, then the required capacity of the server would be 5 TB (TBD). The server would be loaded with new titles in compressed format (8 per average workday), via data telecommunicated over high-speed channels and/or via delivery of CD/DVD (TBD), to support acquisition of (large) DTB files from NLS book narration contractors. Even if the compressed title masters are acquired from narration contractors by the DOD Centers via telecommunications, the CD/DVD copies of the title masters would nevertheless also be shipped to the DOD Centers and stored there on-site (or in close proximity offsite) in a backup collection for restore and recovery operations as necessary.
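The 5 TB figure is consistent with the average title size cited earlier, as the rough sizing sketch below shows. It assumes decimal units (1 TB = 1,000,000 MB) and the 120 MB average compressed title size; the annual growth figure is derived here for illustration and the function names are hypothetical.

```python
def server_capacity_tb(title_count, avg_title_mb):
    """Storage needed for the title collection, in decimal terabytes."""
    return title_count * avg_title_mb / 1_000_000


def annual_growth_gb(titles_per_workday, working_days, avg_title_mb):
    """Storage added per year by newly loaded titles, in decimal gigabytes."""
    return titles_per_workday * working_days * avg_title_mb / 1_000


print(f"{server_capacity_tb(40_000, 120):.1f} TB for 40,000 titles")
print(f"{annual_growth_gb(8, 250, 120):.0f} GB added per year")
```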
Once a new cartridge has been metered onto the duplicator input conveyor, a scanner located above the conveyor would read the bar coded transaction number on the cartridge. The system will then obtain the title number from the transaction number in the open order file. This would signal the server to retrieve the files for that title and send them (the files will be grouped in a compact "image" format) to a RAM buffer. For a duplicating capacity of 20 cartridges per minute, the retrieval time required of a single server would be 3 seconds, and two servers, rather than one, could possibly be required (TBD). Based upon operations at Recording for the Blind & Dyslexic (RFB&D), this rate is just attainable with a single high-end image server. As noted earlier in this report, the total time required to first erase and then load an average sized NLS DTB on the duplicators will be 30 to 60 seconds. In the event that this loading time is about 60 seconds rather than 30, the Centers would operate on two shifts rather than one and a single image server should easily support the duplication function.
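The 3-second retrieval requirement is simply the interval between successive cartridges at the target rate. The sketch below shows that calculation and, as a derived illustration not stated in the report, the sustained read rate it implies for a 120 MB average title. Function names are hypothetical.

```python
def retrieval_budget_seconds(copies_per_minute, servers=1):
    """Time each server has to retrieve one title image."""
    return 60 * servers / copies_per_minute


def sustained_read_mb_per_s(avg_title_mb, copies_per_minute):
    """Aggregate read rate needed to keep the duplicators supplied."""
    return avg_title_mb * copies_per_minute / 60


print(retrieval_budget_seconds(20))             # one server
print(retrieval_budget_seconds(20, servers=2))  # two servers sharing the load
print(sustained_read_mb_per_s(120, 20))         # MB/s across all servers
```

At 10 copies per minute (the two-shift case), the same functions give a 6-second budget for a single server, which is why one server should then suffice.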
A cartridge in loading position would be first erased before duplication, and after duplication the cartridge would be automatically cycled out of the duplicator and onto one or two takeaway conveyors (TBD) feeding the packing stations which would be located along the conveyor. The conveyor(s) would be of belt-type accumulation design, with adjustable speed controls.
At a packing station, the packer would remove a cartridge from the conveyor, and immediately pass the cartridge through a reader (e.g., a bar code reader) to signal that the order has been (or will be) shipped. The computer system would then notify the library accordingly. This notification could be either on-line or batched, depending upon when and how the libraries would use the information (TBD).
This data entry would also notify the system that a shipping label is required at the station. The packer would then proceed to pack the cartridge, while a self-adhesive label is being printed on a high speed printer located at the station. The labeled envelope would then be placed in a USPS hamper for same-day shipment.
There would be two separate computer systems in a DOD Center. The first system would be a dedicated computer that would manage the cartridge dispenser-server-RAM buffer-duplicator-conveyor interface. This system would be an integral part of the duplicator design and procurement contract.
The second system would be for more conventional computerized operations, including:
• communications with network libraries on order entry and order status
• managing the shippable order backlog and planning the next day’s production
• producing strings of clear plastic labels for all cartridges
• producing strings of special Braille labels for 5% of the cartridges
• interface with receiving stations for patron returns and possibly new title masters
• interface with packing stations for patron shipments
• printing shipping labels at the packing stations on command
The libraries served by a Duplication Center would enter a patron’s order by transmitting the following information to the Center:
• Network Library placing the order, the NLS unique identifier, e.g., "NC1A"
(either as a field in the order records, or in the transmitted file header)
• Date of Order, e.g., "04012008"
• Library Patron ID, e.g., "123456"
(to facilitate tracking by the library on the library’s information system)
• DB Number, unique identifier for the title ordered, e.g., "DB68000"
• Patron Name, e.g. "John Doe"
• Patron Street Address, e.g., "1234 Main Street"
• Patron City, County or Town, e.g., "Anytown"
• Patron State, e.g., "PA"
• Patron Zip Code, e.g. "12345-6789" (probably plus 4 format)
• Patron Braille Reader or not, e.g. "B" or "1"
• Rush Order or not, e.g., "Y" or "N" (Yes/No)
An order record using the above structure would appear as below; the Library Patron ID, Patron Name, Patron Street Address and Patron City fields would probably be set to established fixed lengths using either blanks (for address fields) or leading zeros (for Patron ID) as appropriate:
NC1A|04012008|123456|DB68000|John Doe|1234 Main Street|Anytown|NC|123456789|B|N
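As an illustration only, a record in this layout could be split into named fields as sketched below. The class and function names are hypothetical, the date is assumed to be in MMDDYYYY order, and the Braille indicator is kept as a raw string because its coding is not fully defined above.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass
class PatronOrder:
    library_id: str      # NLS unique identifier of the network library
    order_date: date     # assumed MMDDYYYY in the transmitted record
    patron_id: str       # kept as text so leading zeros survive
    db_number: str
    patron_name: str
    street: str
    city: str
    state: str
    zip_code: str
    braille_flag: str    # raw value; coding is still to be determined
    rush: bool


def parse_order(record: str) -> PatronOrder:
    """Split one pipe-delimited order record into its eleven fields."""
    fields = [f.strip() for f in record.split("|")]
    if len(fields) != 11:
        raise ValueError(f"expected 11 fields, got {len(fields)}")
    (library_id, raw_date, patron_id, db_number, name,
     street, city, state, zip_code, braille_flag, rush) = fields
    if rush not in ("Y", "N"):
        raise ValueError(f"rush flag must be Y or N, got {rush!r}")
    return PatronOrder(
        library_id=library_id,
        order_date=datetime.strptime(raw_date, "%m%d%Y").date(),
        patron_id=patron_id,
        db_number=db_number,
        patron_name=name,
        street=street,
        city=city,
        state=state,
        zip_code=zip_code,
        braille_flag=braille_flag,
        rush=(rush == "Y"),
    )


order = parse_order(
    "NC1A|04012008|123456|DB68000|John Doe|1234 Main Street|Anytown|NC|123456789|B|N"
)
print(order.library_id, order.db_number, order.order_date.isoformat(), order.rush)
```

Stripping whitespace from each field is a convenience for the blank-padded fixed-length fields mentioned above; a production format would also need a rule for names or addresses that contain the delimiter.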
The individual orders that do not require batching for profile-select would be sent on-line to the Center during the day, or batched at the end of the day (TBD), and would account for 35% of total shipping demand. The remaining 65% of the orders would be processed in the libraries overnight and would be batched to the Center early the next day (TBD).
On receipt of all required information, and verification that an authorized title has been ordered, the Center would assign a transaction number to each order, and transmit the library order (transaction) numbers, or possibly the DOD transaction numbers, to the libraries to confirm receipt and order entry (TBD). This information would be entered into patron records in the libraries’ systems. All further communication between the libraries and the Centers would reference either the DOD transaction number or the library order/transaction number.
With some 35% of all patron orders received before 5:00 PM, planning the next day’s shipments would be done in two parts. The Center would combine these early orders with any unfilled orders remaining in the database, and preference would always be given to the oldest orders. The remaining 65% of patron orders that are received early the next day would be combined in a similar manner.
All orders requiring Braille labeling would be placed first in the order queue. There could be some merit in sorting the remaining orders by title number (TBD). This would then be the sequence in which the cartridges are labeled and the orders are duplicated.
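One possible way to express this sequencing is a single sort key, sketched below. The record type and field names are hypothetical, and the tie-breaking order (Braille first, then oldest order, then title number) is an assumption, since the merit of sorting by title number is still TBD.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class QueuedOrder:
    transaction_no: int
    db_number: str
    received: date
    braille: bool


def production_sequence(orders):
    """Braille-labeled orders first, then oldest orders, then by title number."""
    # False sorts before True, so "not braille" puts Braille orders at the front.
    return sorted(orders, key=lambda o: (not o.braille, o.received, o.db_number))


backlog = [
    QueuedOrder(1, "DB68000", date(2008, 4, 1), False),
    QueuedOrder(2, "DB60001", date(2008, 4, 1), False),
    QueuedOrder(3, "DB65000", date(2008, 4, 2), True),
    QueuedOrder(4, "DB61000", date(2008, 3, 31), False),
]

for order in production_sequence(backlog):
    print(order.transaction_no, order.db_number, order.received, order.braille)
```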
The cartridge labels for 35% of the orders would be prepared in the office in the early evening and must be available before the start of work the next day. The labels for the remaining 65% of orders would be prepared early the next day. There would be separate label batches for cartridges requiring Braille labeling, in both instances.
Shipping demand in the duplication Center is expected to vary by day of week, but just how much it will vary is yet to be determined. For scheduling purposes, a stable workload is best; achieving this would require a carryover order backlog from day to day, but not necessarily from week to week.
A sampling of possible variations in shipping demand at a DOD Center by day of week is shown in Appendix 14. This information is taken from shipping statistics of the Maryland RL, and is for illustration only. As can be seen, shipping activity varied greatly by day of week at the MD RL, and we can reasonably expect that there could be a somewhat similar variation at the DOD Centers.
Volatility in demand associated with fluctuations in reserve and request orders would be reduced by pooling such demand for the multiple libraries served by a Center. However, 65% of total circulation is generated by profile-select, which is driven by book returns that drop readers below their copy limits, so that the system then selects more books for them. If USPS returns vary at a Center as they do for the Maryland RL, then the shipping variability at the Centers for the portion of circulation driven by profile-select may be as great as that for Maryland.
The estimated staffing for a Hybrid distribution DOD Center would be 16 people, consisting of 12 production personnel, 3 office personnel and one supervisor, as shown in Appendix 51. This would be a daytime 1-shift operation, thereby providing patrons with the same or better shipping response times as now, and there would be no backorders (TBD); as previously mentioned, however, if duplication time for the average DTB is closer to 60 seconds rather than 30, the Center may operate (possibly only duplicate) a second shift as well.
The estimated size of each of the two production facilities required for the Hybrid DOD distribution option is 3,600 square feet, and functional space allocations are provided in Appendix 52. In making these projections, we have assumed that a DOD Center would be a compatible part of the operations of a much larger contractor. The 3,600 square feet is therefore only the net production and office area required, and does not include common areas, such as receiving and shipping docks and employee facilities.
The estimated capital equipment costs for a DOD Center are shown in Appendix 53. With the single exception of Item 18, "Office Computer System – Software," no costs would be shared with another DOD Center. If there is more than one Center (two are recommended), the costs for this system, which would probably require a custom design (TBD in Task 2), would be shared between the two operations.
The estimated total costs to duplicate a DTB on demand are shown in Appendix 54. The major components of cost, which are for labor, facilities, equipment, materials and profit, are all shown. Labor costs are the dominant variable, and the average hourly rates used in the cost model are derived from comparable hourly rates as currently listed by the Bureau of Labor Statistics (BLS).
Based upon these components and an annual volume of 2,000,000 copies per Center, a value of $0.66 per DTB copy (which is net of the cost of flash memory cartridges and a reusable shipping/return envelope, which will be provided by NLS as GFE) is derived and subsequently used as the baseline value in the cost model. The cost of DOD duplication using new cartridges would be $0.56 per DTB copy (Appendix 55).
We believe that the costs of mass-duplicating cartridge books can best be estimated by comparing them with the projected costs for producing DTBs on demand, as well as by making comparisons with portions of present RC production practices. However, we have not yet had an opportunity to visit a typical cassette duplication facility and would like to do this at the beginning of Task 2 (Magnetix was not duplicating NLS RC books during our site visit there). Of particular concern are the costs of label printing and cartridge labeling, the handling and labeling of containers, and how the cartridges and containers are matched-up.
In preparing the DOD cost model, we have calculated the cost of preparing the cartridge label as $0.04 per copy, and this cost is included in the staffing charges. The estimated cost of the label itself is $0.05 per copy and is included in the materials charges. These baseline costs were used in estimating the additional costs that would be incurred in labeling containers under the mass-duplication portion of the Hybrid option.
The daily requirement for new book production is the same for all distribution options, and has been segregated for costing purposes as follows:
|Books using new cartridges and containers||2,400|
|Books using reconditioned cartridges and containers||5,000|
|Total Daily Production (8 titles/day times 925 copies/title)||7,400|
This estimate considers an average annual production of 2,000 new book titles, an average of 925 copies per title mass-produced, and an attrition rate equal to 3% of circulation. In a steady-state operation, which is assumed for the cost model, 1,850,000 copies (2,000 titles x 925 copies/title) would be mass-duplicated annually and 600,000 copies (3% of 20,000,000) would be lost. This macro-calculation effectively establishes a maximum cartridge reuse rate of 68% ((1,850,000-600,000)/1,850,000). The values for the best overall reuse/weeding rate to use for long-term planning are subject to further review and confirmation, as they are critical to the economics of both the All Mass Duplication option and the Hybrid option.
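The steady-state arithmetic above can be reproduced directly from the stated inputs. The sketch below uses only figures quoted in this section; the variable names are illustrative, and the number of production days per year is an inference from 2,000 titles at 8 titles per day rather than a figure given in the text.

```python
TITLES_PER_YEAR = 2_000
COPIES_PER_TITLE = 925
ANNUAL_CIRCULATION = 20_000_000
ATTRITION_PERCENT = 3          # attrition as a percentage of circulation
TITLES_PER_DAY = 8

mass_duplicated = TITLES_PER_YEAR * COPIES_PER_TITLE          # copies per year
lost = ANNUAL_CIRCULATION * ATTRITION_PERCENT // 100          # copies lost per year
max_reuse_rate = (mass_duplicated - lost) / mass_duplicated   # share of reusable cartridges
daily_production = TITLES_PER_DAY * COPIES_PER_TITLE          # copies per production day
implied_production_days = TITLES_PER_YEAR // TITLES_PER_DAY   # inferred, not stated

print(f"Mass-duplicated per year: {mass_duplicated:,}")
print(f"Lost per year:            {lost:,}")
print(f"Maximum reuse rate:       {max_reuse_rate:.1%}")
print(f"Daily production:         {daily_production:,}")
print(f"Implied production days:  {implied_production_days}")
```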
These figures also illustrate that the cost of reconditioning cartridges and containers must be considered in determining the unit cost of mass-duplicating DTBs. We believe that this reconditioning should be done by a duplication contractor, rather than by a third party, so as to minimize the transfer costs between the libraries and the duplication contractors (TBD).
The contemplated DOD Centers, if implemented, would probably operate on only one shift so as to satisfy the design parameter of same-day service, and would therefore be available for production on a second shift. We believe that the best way to utilize this latent potential is to mass-duplicate books using only reconditioned cartridges and containers. An additional production capacity of 2,500 DTBs per day in each of the DOD centers would be sufficient to centralize the reconditioning of all cartridges and containers.
As previously noted, the average cost of duplicating a DTB copy using reconditioned cartridges would be $0.66. The estimated additional cost of reconditioning and relabeling a container would be $0.33. The total costs of mass-duplicating 2,500 DTBs per day would therefore be $0.99 per copy.
For a production of 1,200 DTBs per day using new cartridges and new containers, the cost per copy would be $0.81. Systemwide, the total cost of mass duplication would be $0.94 per copy.
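The systemwide figure is a volume-weighted average of the two per-copy costs. The sketch below recomputes it from the rounded values quoted in this section and the daily volumes for two Centers (5,000 reconditioned and 2,400 new); the variable names are illustrative. With these rounded inputs the result is about $0.93, so the $0.94 quoted above presumably reflects unrounded values from the appendices.

```python
# All costs in cents per copy, as quoted (rounded) in the text.
DOD_COPY_CENTS = 66            # duplication using reconditioned cartridges
RECONDITION_CENTS = 33         # reconditioning and relabeling a container
NEW_COPY_CENTS = 81            # duplication using new cartridges and containers

RECONDITIONED_PER_DAY = 5_000  # 2,500 per Center x 2 Centers
NEW_PER_DAY = 2_400            # 1,200 per Center x 2 Centers

reconditioned_copy_cents = DOD_COPY_CENTS + RECONDITION_CENTS
total_cents = (RECONDITIONED_PER_DAY * reconditioned_copy_cents
               + NEW_PER_DAY * NEW_COPY_CENTS)
systemwide_cents = total_cents / (RECONDITIONED_PER_DAY + NEW_PER_DAY)

print(f"Reconditioned copy: ${reconditioned_copy_cents / 100:.2f}")
print(f"Systemwide average: ${systemwide_cents / 100:.3f} per copy")
```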
Here are the additional resources that would be needed to provide a mass-duplication capability of 2,500 DTBs per day in a DOD Center:
- Space and carts for in-process container storage;
- Workstations to remove old container labels and apply new labels;
- Label printers (2 print and 1 Braille) and office space for the printers;
- Workstations for packing and labeling containers;
- Packing station label printers;
- Label supplies; and
- Additional staffing
Although production operations in a DOD Center could be leveraged to also mass-duplicate DTBs as explained above, the mass-duplication process used by other contractors could be much less capital intensive than DOD production, as noted below:
- Mass-production would not require a mass-storage device/file image server or the very highest speed optical cable; the system would not need ready access to thousands of titles, only to the digital files for the one book being mass-duplicated at the moment.
- The digital book files would be loaded from a hard drive into one or multiple RAM buffers only once; after that, the data would be repeatedly copied from the RAM buffer unit(s) to the cartridges.
- There could be a classic labor vs. capital tradeoff for cartridge handling on the DTB mass duplicator, i.e., either use a $75,000 duplicator as designed for DOD use with customized automated cartridge handling, or "keep it simple" and have staff insert and remove the cartridges from mating connectors on the RAM buffer(s).
- Information system requirements would be greatly simplified relative to those required for DOD operations; there would be neither a requirement for communicating on a two-way basis with multiple libraries for multiple orders, nor for enabling the simultaneous production of copies from multiple title masters.
The actual impact on total costs using the simplified information support and duplication processes described would require further study. It was only during the writing of this report that a commercial firm that produces books on flash memory was identified; NLS also identified a second firm doing the same. For baseline projections in the cost model, we assume that core mass-duplication costs for DTBs (i.e., costs net of GFE and the reconditioning of both cartridges and containers) will not exceed those for DOD production.
We believe that the costs presented herein are well measured for Task 1 planning purposes. These and other costs will be revisited in Task 2, and some will no doubt change.
2.12 Capacity Comparisons of Present Cassette Container Shelving Converted to Pro Forma Cartridge Container Storage
The present cassette container storage module is a shelving section measuring 84" high x 36" wide by 12" deep. There are seven shelves in each module and the vertical shelf interval is 12". The shelf spacing is adjustable in 1" increments, and the shelf interval for cartridge container storage would be 9".
The present cassette container is 6" long x 4.75" wide x 1.5" high, and the most favorable dimensions of the cartridge container are 6" long x 4.125" wide x 1.07" high. Because the present and future containers are both 6" long, container length is of no consequence in making the comparisons.
In this analysis, the shelving container capacities shown are for 1-deep container storage at 100% space utilization. But the ratios developed are equally appropriate for both 2-deep container storage and for less than 100% occupancy.
The capacity analysis shows that by tailoring the suggested new cartridge container dimensions for optimum shelf storage, the container capacity of each storage module would increase 47%. There would also be a concurrent increase of 47% in the number of slot facings available. We believe that the number of slot facings is a more valid measure of storage capacity than container count.
Perhaps more to the point, if nothing more were done than converting the entire RC inventory to DTB inventory, the collections in all libraries could be accommodated in 32% less space.
These conclusions were developed by analyzing the shelving usage of the Maryland Regional Library. Details of the calculations, with associated comments, are shown below.
|Dimension Description||Present Container||Suggested New Container|
|Dimension (inches)||Now||Alt. 1||Alt. 2||Alt. 3||Alt. 4|
|Vertical Shelf Interval||12||9||9||9||9|
|Usable Shelf Opening||11||8||8||8||8|
|Clearance to Top Shelf||0.5||0.5||0.5||0.5||0.5|
|Maximum Stack Height||10.5||7.5||7.5||7.5||7.5|
|Containers per stack||7.0||5.0||7.0||7.5||6.0|
For Alternative 1: This option shows how present shelf spacing could be reprofiled to conform with the proposed shelf spacing for the suggested new container dimensions, if desired. This configuration could be particularly helpful in phasing-in cartridge container storage, and in weeding the inventory of aging cassette books to save space.
For Alternative 2: The container height shown is what would be needed to accommodate 7-high cartridge container stacking, which is the same stack capacity as the present cassette container. This greatly simplifies the before-and-after comparisons.
For Alternative 3: The container height shown is what we consider to be the minimum height that would be needed to accommodate the necessary container marking. This is the height of the lid of the present container, on which the labels are placed.
For Alternative 4: The container height shown is what would be needed to accommodate 6-high container stacking, and could be used in the event that the additional height would make for better container design.
Comparative storage capacity per shelf opening for Alternative 2: No change
|Dimension Description||Present Container||Suggested New Container|
|Dimension (inches)||Now||Alt. 1||Alt. 2|
|Horizontal Shelf Interval||36||36||36|
|Usable Shelf Opening||35||35||35|
|Clearance Between Containers||0.25||0.25||0.25|
|Stacks per Shelf Opening||7.00||8.00||9.00|
For Alternative 1: This is the only change in container width that would be of any practical benefit. It will increase storage capacity, and will also permit the label size on the hinge end of the present container to be greatly increased.
For Alternative 2: The container width shown is what would be needed to accommodate 9-across container stacks, and is provided for reference only. It would create a serious container labeling problem, and probably would not meet the USPS definition of a parcel.
Comparative slot facings per opening for Alternative 1: +14.28%
|Dimension Description||Present Container||Suggested Container|
|Shelf Section Module Height||84||81|
|Vertical Shelf Spacing||12||9|
|Shelf Openings per Module||7||9|
Comparative increase in container storage capacity per module: +47% (1.2857 x 1.1428)
Comparative increase in slot facings per module: +47% (1.2857 x 1.1428)
Note: Converting the above capacity figures to space needs, the present library collections could be stored in 32% less space, calculated as 100% - (100%/147%).
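The capacity ratios above can be reproduced from the shelf and container dimensions given in this section. The following is a sketch under stated assumptions: each stack is taken to occupy the container width plus one 0.25" clearance, stack counts are rounded down to whole containers, and the function names are illustrative rather than part of the original analysis.

```python
import math

USABLE_SHELF_WIDTH = 35.0   # inches, from a 36" horizontal shelf interval
CLEARANCE = 0.25            # inches between containers
MODULE_HEIGHT = 84          # inches


def module_capacity(container_width, container_height, shelf_interval, max_stack_height):
    """Return (containers per module, slot facings per module) for 1-deep storage."""
    stacks = math.floor(USABLE_SHELF_WIDTH / (container_width + CLEARANCE))
    per_stack = math.floor(max_stack_height / container_height)
    shelves = MODULE_HEIGHT // shelf_interval
    return stacks * per_stack * shelves, stacks * shelves


# Present cassette container: 4.75" wide x 1.5" high on a 12" shelf interval.
present_capacity, present_facings = module_capacity(4.75, 1.5, 12, 10.5)
# Suggested cartridge container: 4.125" wide x 1.07" high on a 9" shelf interval.
new_capacity, new_facings = module_capacity(4.125, 1.07, 9, 7.5)

capacity_gain = new_capacity / present_capacity - 1
facings_gain = new_facings / present_facings - 1
space_saving = 1 - present_capacity / new_capacity

print(f"Containers per module: {present_capacity} -> {new_capacity} (+{capacity_gain:.0%})")
print(f"Slot facings per module: {present_facings} -> {new_facings} (+{facings_gain:.0%})")
print(f"Space saving for the same collection: {space_saving:.0%}")
```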
The potential reuse of surplus library RC storage space for each of the distribution options is as follows:
- All Mass Duplication Option – 50%
[32% for less DTB space needed (relative to RC), plus 18% for faster weeding and consolidation of weeded inventories]
- All Duplication-on-Demand Distribution Option – 100%
- Hybrid Distribution Option – variable depending upon title apportionment split measured relative to the All Mass Duplication option
It should be noted that collection storage space will not be freed-up until the DOD Centers are in operation and there could, in fact, be a slight increase in space needs during the transition period. This issue will be further addressed during development of the transition plan in Task 4 of this project.
For the Maryland Regional Library, the percentage of library space used for RC storage is about 29% as cited earlier in the report. The percentage of RC storage space used in six other regional libraries was provided by NLS and is shown along with that for Maryland in Appendix 56. For this sample of seven regional libraries, the unweighted average percentage of total facility space used for RC storage is 47% and the weighted average is 40%. The weighted average value of 40% is used in the cost model.
The present cassette containers, which are of end-opening clamshell design, are 6" long x 4.75" wide x 1.5" deep, and the lid portion of the container is 1" high. The bottom of the container contains a slot for a 3" x 5" address card, and a Braille label on the container appears directly below the address card.
The hinge end of the container has a 9/16" x 1 3/8" label showing the title number. One side of the lid has a 9/16" x 3 7/8" label showing the title number, title description, and other information. The other side of the lid is used by the libraries to affix a label containing a scannable copy number, as this is the only free surface available.
The pro forma cartridge container is 6" long x 4.125" wide x 1.07" high and could possibly be of other than clamshell design. There would be sufficient room on one end of the container for a label perhaps 3 5/8" long. The information on this label could include the title number, space for a bar-coded title number, and space for the libraries to affix a label containing a scannable copy number. With this orientation, the copy number would be visible from the storage aisle. All labels could and should be higher than 9/16", for better readability.
An estimated 67% of the new titles produced would make use of reconditioned containers. All container labels should therefore be designed for ready removal, after three or more years in circulation.
An alternative container labeling method that has some merit is to use a copy of the cartridge label, which has both print and Braille legend, or perhaps a smaller version would be better. The label would be placed on the top of the container, which would be an advantageous location for the reader, and a cartridge labeling machine could probably be modified to affix the label to the container.
We have also considered the impending problem that a reader would have in distinguishing the one cartridge in five that should be returned to a DOD Center rather than to a network library. Since no liberties can be taken with the cartridge design, this differentiation must come from the label. While there is some merit in using a different color, the most definitive solution would be to provide either a tactile border on the Braille portion of the label or a tactile feature on the print label (TBD).