Digital Research with Google Books
Digital tools such as databases and Google Books are unquestionably valuable for research. In producing anything, though, the tools one uses are not neutral. For any given topic, I could write with a pen or I could write with a computer, but the tool I choose will in some way shape what I produce.
I’ve tried to analyze how I use Google Books, to see what effect it has on my research. I’ve chosen Google Books because it’s the tool I’m most familiar with, but I think my thoughts could apply to other tools, like the Internet Archive. There are six ways that I use Google Books, listed roughly in order of sophistication.
First, Google Books works like any library catalog, only for a larger collection than any one library. Using it this way is comparable to using WorldCat.
Second, Google Books provides access to texts that aren’t available in nearby libraries. For now Google only gives the full text of books that are in the public domain, but in the future books that are in copyright but out of print may be available.
Third, Google Books permits me to verify quotations or page numbers quickly without having the book. With just a fragment of a quotation, it’s often possible to find the whole quotation and corresponding page number. Amazon’s “look inside this book” also works for some more recent books that Google does not provide access to. Of course, this technique is just a crutch made necessary by poor note taking.
Fourth, using Google Books I can search the text of my library to find which books address a given topic or cite a given author. For example, searching for “Unitarianism” shows me that I own four books that mention the topic. Actually, though, Sydney Ahlstrom’s Religious History of the American People is missing from the search results, even though it’s in my digital library.
Fifth, I can find books that refer to certain topics without a lot of leg work in the library. For example, I was looking for references to an obscure person whom no one has written a book or article about. I could have gone to the library and looked in a lot of indexes, but chances are he isn’t in them. A quick search found him in several books, including this one.
Sixth, once I’ve narrowed down the sources I want to study, I can search for particular concepts or keywords. For example, I wrote a paper about timekeepers in the poetry of Robert Herrick. I searched inside the Google Books version of his collected poems for keywords such as “clock,” “watch,” “hour,” “minute,” “year,” “calendar” and “pendulum.” That search quickly identified the poems that I needed to consider. I also used this technique when studying John Eliot and his use of the sabbath in his mission to the New England Indians. Using an online repository that contained all of Eliot’s printed works, I searched for “sabbath” and found most of the places where he had used the term. Searching electronically was much faster than reading every work. I even found some uses of the term in Algonquian texts that I otherwise would not have read. I suppose this method is a very simple form of text-mining.
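For readers who want to try this kind of keyword search on a text they have saved locally, here is a minimal sketch in Python. It is not how Google Books works internally. The function name `find_keyword_hits` and the sample passages are hypothetical, and the passages are invented placeholders rather than Herrick's lines. Only the keyword list comes from the search described above.

```python
import re

# Keywords from the Herrick search described above.
KEYWORDS = ["clock", "watch", "hour", "minute", "year", "calendar", "pendulum"]

def find_keyword_hits(texts, keywords):
    """Map each title to the sorted list of keywords found in its text.

    Matching is case-insensitive and whole-word only, so "hour" does not
    match "hours". That limit is deliberate here: it shows how literal
    a simple search is.
    """
    hits = {}
    for title, body in texts.items():
        words = set(re.findall(r"[a-z']+", body.lower()))
        found = sorted(k for k in keywords if k in words)
        if found:
            hits[title] = found
    return hits

# Placeholder passages, invented for illustration (not Herrick's text).
sample = {
    "Passage A": "The clock struck, and the hour was late.",
    "Passage B": "A garden full of roses and of lilies.",
    "Passage C": "Another year upon the calendar is gone.",
}

hits = find_keyword_hits(sample, KEYWORDS)
for title, found in hits.items():
    print(title, found)
```

A search like this only narrows the reading list. It says nothing about passages that treat the topic without using any of the chosen words.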
The first five techniques I’ve mentioned are straightforward. So long as they are considered an aid to research and not a replacement for traditional work in archives and libraries, I doubt that too many scholars would take issue with them. Those techniques aren’t substantively different from using a card catalog or an index.
I have some misgivings, however, about my sixth technique. Searching primary sources for key terms is a good way to identify sources or to locate where a source discusses a topic. Such searches can even help find sources that one might otherwise have passed over. But searching the sources is no substitute for actually reading them.
Without reading sources both extensively and closely, a scholar risks missing much of what his sources have to offer. First, he will probably miss the context of most of the sources. Had I the time, I would have been better served to have read Herrick’s complete works. For my paper on Eliot, I did read every source that I could find in its entirety. Second, the OCR for these books can be pretty poor, especially for older books. A simple search will often miss references that reading would not. Third, searching for keywords will often miss concepts that a scholar never thought to look for.
In illustration of that last point, take Herrick’s poem “His Winding Sheet.” Herrick describes himself as lying in the grave “to be reveal’d / Next, at that Great Platonick yeere.” I did search for “year,” but of course the variant spelling didn’t show up. And even if it had, I could never have anticipated searching for a Platonic year, an idea about the alignment of planets that proved to be one of my most important sources for Herrick’s ideas about time. Since I didn’t know what to look for, I didn’t find it until I read through the book.
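The spelling half of that problem can be sketched in a few lines of Python, using only the line quoted above. The pattern `VARIANT_YEAR` is my own hand-written guess at a few early modern spellings (year, yeare, yeere, yere), not an established list of variants, and it would need checking against real texts.

```python
import re

# The line from "His Winding Sheet" as quoted above.
line = "to be reveal'd / Next, at that Great Platonick yeere."

# An exact, modern-spelling search finds nothing.
exact = re.findall(r"\byear\b", line, flags=re.IGNORECASE)

# A looser pattern (an assumption, not a vetted list of variants)
# that allows spellings such as year, yeare, yeere and yere.
VARIANT_YEAR = re.compile(r"\bye+a?re?s?\b", re.IGNORECASE)
loose = VARIANT_YEAR.findall(line)

print(exact)  # []
print(loose)  # ['yeere']
```

Even the looser pattern only finds the word. It cannot tell me that “Platonick yeere” names a concept worth pursuing, which is the part that required reading.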
My point is not that using Google Books or the like is detrimental to scholarship. To the contrary, I think it’s an invaluable tool. But it’s a tool for finding sources to read closely, not a substitute for close reading itself.