Monday, March 8, 2010

Trying to Finish the Book

As I recently observed, I have been reading Ira Gitler's Swing to Bop: An Oral History of the Transition in Jazz in the 1940s. For all the problematic elements of oral history, which I have previously cited, now with reinforcement from Max Hastings, this book was a great companion as I wandered about San Francisco from one performance to another. Reading this collection of statements was rather like perusing a photograph album. No entry demanded very much time, and the continuity was very loosely structured. So, whenever I found myself waiting, I could just dip into the book and pick up a few more statements to fill the gap in time.

There was one gap, however, for which I was not prepared. I had borrowed this book from the Main Branch of the San Francisco Public Library; and, as I neared the end of the book (while waiting for the beginning of yesterday's Chamber Music San Francisco concert), I discovered that the last four pages were missing. On closer inspection I realized that they had been rather meticulously cut out, probably with something like an X-Acto knife. It was such a clean job that the only evidence of missing pages was in the discontinuity of page numbers and the "suspended" paragraph at the end of page 316.

I still have to report this to the Library; and, in their new automated age, I shall be interested to see what it will take to bring this to the attention of a human being who can act on the problem. My more immediate problem, however, was whether or not I would be able to read those four pages while the rest of the book was fresh in my mind. The good news was that the book is now in the Google books collection, and the four pages were there. The bad news was that getting to them pointed out some problematic elements, which I had just heard cited on Book TV in a talk that Robert Darnton had given at the Harvard Book Store. Darnton's bottom line was that, in the interest of efficient throughput, Google had reduced the operation of managing Google books to assembly-line-like "business processes" (Darnton did not use that dreaded phrase, but it perfectly captures the target of his criticism) with little (if any) oversight from any human beings (let alone experienced librarians). Darnton observed that, in the absence of such oversight, George Eliot's Middlemarch had gone "into the system" (again, my words) with one of its eight "Books" missing. This left me wondering whether or not I would find the meager four pages I had yet to read in my own book.

Given Google's commitment to search, finding those pages was more difficult than I anticipated. Yes, there is a pull-down menu for a table of contents with each chapter hyperlinked to its first page. Unfortunately, the last two elements of the printed table of contents were missing from this menu, the "Epilogue" (which had two of the four missing pages) and the Index. The latter was particularly curious. Most (but not all, again probably due to inadequate human oversight) of the page references in the index are hyperlinked, making this "digital" index a real asset for reading from the screen; yet there was no direct path that would lead the serious reader to it. Fortunately, I could do a text search on "Epilogue," which returned only two hits, one of which was the first page of the "Epilogue." (The other was the table of contents page.) I could then back up two pages and account for all of the missing material with perfect continuity.

Even before I had seen Darnton on Book TV, I had been following his arguments about the risks of entrusting the responsibility for a major library to a commercial enterprise like Google. I now have a more personal appreciation of those risks. I would express the major lesson from my experience by paraphrasing that notorious motto from the National Rifle Association:

Business processes do not manage libraries; librarians manage libraries.

Whatever its advantages (and this is far from the first time I have drawn upon them), Google books has created a world in which there is no evidence of a librarian, let alone any appreciation that the presence of a librarian might be relevant. Ultimately, all that seems to matter is that a vast number of books be scanned and then "processed" by software that, like all software, seems to have bugs. Identifying those bugs and doing something about them, however, does not appear to be part of the "business process model," which means that the motto for the result will be: What you get is what you get. I can appreciate why the Director of the Harvard University Libraries would be queasy about this prospect! On the other hand, if, as I have suggested, ours has become a culture in which little value is attached to memory, why should we care how many defects there are in the Google books business processes?
