I am an assistant professor in the Department of History and Art History at George Mason University, working on digital history and the history of American religions. You can find a link to just about anything I've worked on in my CV or in the blog archives. Some of my work is described in more detail on the research page, and my syllabi are on the teaching page along with workshop materials. You can write to me at lincoln@lincolnmullen.com.

Detecting Text Reuse in Nineteenth-Century Legal Documents: Methods and Preliminary Results

How can you track changes in the law of nearly every state in the United States over the course of half a century? How can you figure out which states borrowed laws from one another, and how can you visualize the connections among the legal system as a whole?

Kellen Funk, a historian of American law, is writing a dissertation on how codes of civil procedure spread across the United States in the second half of the nineteenth century. He and I have been collaborating on the digital part of this project, which involves identifying and visualizing the borrowings between these codes. The problem of text reuse is a common one in digital history/humanities projects. In this post I want to describe our methods and lay out some of our preliminary results. To get a fuller picture of this project, you should read the four posts that Kellen has written about it.

Figure 1: David Dudley Field, leading member of the New York commission that drafted the Field Code of 1850. Image from the [New York Public Library](http://digitalcollections.nypl.org/items/510d47df-a7c9-a3d9-e040-e00a18064a99).

The question stated

After the American Revolution, almost all states were common law jurisdictions. Beginning in the 1840s, legal reformers, led most prominently by the New York attorney David Dudley Field, attempted to revolutionize legal practice by preparing codes of civil procedure. For civil cases these codes did away with common law ways of bringing lawsuits and running trials and instead offered a simplified, rationalized, supposedly democratic way of conducting lawsuits. The first such code was submitted by a three-man commission to the New York state legislature in 1848. In 1850 the legislature enacted a revised version, commonly called the Field Code. Other states followed suit and codified their civil procedure. As Kellen writes:

Over the course of the next several decades almost every state eventually adopted the Field reforms, eliminating their courts of chancery (equity) and requiring “fact pleading,” party testimony, and unregulated lawyers’ fees. Some states adopted their reforms piecemeal, but most enacted codes of their own, often following one or another draft (but usually the later, complete draft) of the Field Code.

From the first Field Code in New York in 1848 through the 1890s, thirty-two states enacted versions of the Field Code, and civil procedure in other states was influenced by it. States often revised their civil procedure, so there were multiple codes from many states. Sometimes the state commissions explicitly stated which codes they borrowed from. But in many cases, such borrowings have to be inferred from the text.

This has the makings of a classic digital history problem. To begin with, the codes are public documents from the nineteenth century so there are no copyright restrictions. The texts are readily available from Google Books or the Hathi Trust. Kellen is in the process of identifying and gathering all of the relevant texts and quite a few others besides into his American Legislation Project. Second, we have a large volume of text: the 1850 New York code alone runs to 969 pages. These are texts which are amenable to distant reading. In digital literary studies one could at least make the argument that, say, novels are interesting when read closely.1 But (though I don’t wish to disparage the sources that legal historians use) the codes themselves are mind-numbing. It is probably uncontroversial to say that we are all better off reading these texts at a distance instead of slogging through them subsection by subsection. And third, the relationships between these texts can easily be expressed in terms of network analysis—one of the common methods of digital history.

And most important, there is a real historical payoff to doing digital work on these codes. Legal historians have long known about the Field Code in New York; they have also long known that other states emulated that code. No one, however, has figured out exactly which parts of which codes were borrowed and whether states adapted their codes from states other than New York.2 In other words, this project is also a classic problem of historical revisionism, whereby an accepted historical narrative is made considerably more complex. And, as I’ll report below, we have indeed found that the relationships between state legal codes are more complex than previously argued.

An example borrowing

But first, let’s look at one instance of the kinds of borrowings that we are trying to detect in order to see what the potential difficulties are. For this example, I’ll show one place where California’s 1851 code of civil procedure borrowed from New York’s 1850 code. Keep in mind that California was admitted as a state in September 1850, so in 1851 it was revising its code of civil procedure for the first time as a state.

First, here is a section on serving summonses from the 1850 New York code.3

Figure 2: An excerpt from the 1850 New York code of civil procedure.


§ 628. The summons must be served by delivering a copy thereof, as follows:

1. If the suit be against a corporation, to the president or other head of the corporation, secretary, cashier, or managing agent thereof:

2. If against a minor under the age of fourteen years, to such minor personally, and also to his father, mother, …

And here is the related section in the California code.4 I have highlighted the changes in yellow. You can look at the surrounding text for both codes on Google Books and get a sense of the other borrowings.

Figure 3: An excerpt from the 1851 California code of civil procedure.


§ 29. The summons shall be served by delivering a copy thereof attached to the certified copy of the complaint as follows:

1st. If the suit be against a corporation, to the president or other head of the corporation, secretary, cashier, or managing agent thereof:

2d. If against a minor under the age of fourteen years, to such minor personally, and also to his father, mother, or guardian; or if there …

The borrowing is obvious, but variations make it complicated to detect the similarity algorithmically. In the example above, California adds the words “attached to the certified copy of the complaint.” Sometimes the variations are due to local circumstances, as in the example below, where the California code inserts the local names of its courts and sensibly requires that actions happen in San Francisco instead of New York. In addition to actual changes in the text, imperfect OCR adds noise. Though we have taken some pains to get good OCR and were mostly successful, I think, the noise still adds an additional challenge.

Figure 4: Section 22 of the California code from 1851.

So the challenge is to figure out how to detect these borrowings at a large scale, both identifying which codes were most likely to be derived from others and also identifying which sections in particular were borrowed. This project is thus similar in purpose and method, though rather different in content, to the Viral Texts project created by Ryan Cordell, Elizabeth Maddock Dillon, David Smith, Abby Mullen, Peter Roby, Kevin Smith, and Matthew Williamson to detect communities of reprinting in nineteenth-century newspapers.

Using n-grams to detect borrowings

For our test run of the project we started with 61 codes, which comprise some ten thousand pages and four million words. This is only a middling-sized dataset, compared to some, but it is still nothing to sneeze at. Our first priority in investigating this corpus of civil procedure codes was to figure out whether we could detect (and visualize) where borrowings occurred between codes. Like the Viral Texts team (and many other people doing the same kind of work), we use n-grams to break the text into small chunks in order to detect reuse.5

Our method worked like this. After stripping punctuation and tokenizing the text into words, we first split the texts into n-grams: we chose five-grams after some experimentation. We then filtered out n-grams that seemed unreasonable: anything containing a non-alphabetic character is likely to be just OCR noise. We then found all of the n-grams that were shared between codes (i.e., the intersection of the two sets).

Using n-grams in this way lets us detect shared passages while allowing for minor changes and errors in the OCR. I hope the following example does not belabor what is a fairly simple point. But let’s take the example above where California 1851 borrowed from New York 1850, and split the two passages into five-grams. Splitting the texts into short sequences duplicates the words across the n-grams. This redundancy permits us to detect matches even when there are variations—meaningful or not—in the text. The table below shows the five-grams from the excerpts above and whether or not they match.

| NY 1850 | CA 1851 | match? |
|---|---|---|
| the summons must be served | the summons shall be served | no |
| summons must be served by | summons shall be served by | no |
| must be served by delivering | shall be served by delivering | no |
| be served by delivering a | be served by delivering a | yes |
| served by delivering a copy | served by delivering a copy | yes |
| by delivering a copy thereof | by delivering a copy thereof | yes |
| delivering a copy thereof as | delivering a copy thereof attached | no |
| a copy thereof as follows | a copy thereof attached to | no |
| as follows if the suit | as follows if the suit | yes |
| follows if the suit be | follows if the suit be | yes |
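The steps described above (tokenize, split into five-grams, drop n-grams containing non-alphabetic characters, intersect) can be sketched in a few lines of base R. This is a minimal illustration and not the project's actual code: the helper names `tokenize_words()` and `ngrams()` are hypothetical, and the input is only the opening clause of each section, without the numbered subsections.

```r
# Lowercase the text, replace punctuation with spaces, and split into words.
tokenize_words <- function(text) {
  text <- tolower(text)
  text <- gsub("[[:punct:]]", " ", text)
  words <- strsplit(text, "[[:space:]]+")[[1]]
  words[words != ""]
}

# Build every run of n consecutive words, then drop any n-gram containing
# a character other than a-z or a space (treated here as likely OCR noise).
ngrams <- function(words, n = 5) {
  if (length(words) < n) return(character(0))
  starts <- seq_len(length(words) - n + 1)
  grams <- vapply(starts,
                  function(i) paste(words[i:(i + n - 1)], collapse = " "),
                  character(1))
  grams[!grepl("[^a-z ]", grams)]
}

ny <- "The summons must be served by delivering a copy thereof, as follows:"
ca <- paste("The summons shall be served by delivering a copy thereof",
            "attached to the certified copy of the complaint as follows:")

ny_grams <- ngrams(tokenize_words(ny))
ca_grams <- ngrams(tokenize_words(ca))

# The shared five-grams are the intersection of the two sets.
shared <- intersect(ny_grams, ca_grams)
print(shared)
# "be served by delivering a", "served by delivering a copy",
# "by delivering a copy thereof"
```

Even though California changed "must" to "shall" and inserted a clause, three overlapping five-grams survive, which is the redundancy the table illustrates.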

We then extended this technique to compare any two codes. That is, for each five-gram in the California 1851 code, we see whether it has a match in the New York 1850 code. The plot below shows a vertical line for each five-gram in the California code that has a match somewhere in the New York code.

Figure 5: Places where the California 1851 code borrowed from the New York 1850 code.

One’s reaction to a first look at that plot ought to be skepticism. It is interesting to note that there are definitely places in the code where the borrowing is heavier than in other places. I take that variability as an invitation for Kellen to go back and read those sections of the code. But one should be skeptical because the degree to which it looks like California borrowed from New York is almost entirely dependent on the level of transparency one assigns to the vertical lines indicating matches. I could make you think that everything or nothing was borrowed by adjusting that one setting.6
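To make the matching step and the transparency setting concrete, here is a rough base R sketch. It assumes the five-grams have already been extracted (the two vectors are the contiguous five-grams from the table above), and the function name `plot_matches()` is hypothetical. The default `alpha = 0.1` follows the value reported in note 6; everything else about the plot styling is a guess.

```r
# Five-grams in document order, taken from the table of examples.
ny_grams <- c("the summons must be served", "summons must be served by",
              "must be served by delivering", "be served by delivering a",
              "served by delivering a copy", "by delivering a copy thereof",
              "delivering a copy thereof as", "a copy thereof as follows")
ca_grams <- c("the summons shall be served", "summons shall be served by",
              "shall be served by delivering", "be served by delivering a",
              "served by delivering a copy", "by delivering a copy thereof",
              "delivering a copy thereof attached", "a copy thereof attached to")

# For each California five-gram, is there a match anywhere in New York?
is_match <- ca_grams %in% ny_grams
match_positions <- which(is_match)

# Draw one semi-transparent vertical line per matching position.
plot_matches <- function(positions, total, alpha = 0.1) {
  plot(NULL, xlim = c(1, total), ylim = c(0, 1), yaxt = "n",
       xlab = "Position of five-gram in destination code", ylab = "")
  abline(v = positions, col = rgb(0, 0, 0, alpha = alpha))
}

# Only draw when running interactively, so the script has no side effects.
if (interactive()) plot_matches(match_positions, length(ca_grams))
```

Raising or lowering `alpha` changes how dark dense regions look without changing `match_positions`, which is why the plot should be read comparatively and not on its own.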

However, I don’t want you to look at that plot in isolation. Instead, compare it to the plot below which shows matches between Michigan’s 1853 book of court rules, covering the same topics as a code of civil procedure, and New York’s 1850 code.

Figure 6: Places where the Michigan 1853 book of court rules borrowed from the New York 1850 code.

Just as we chose the California 1851 code because we knew it borrowed from New York, we also chose the 1853 Michigan code because we knew it did not borrow from New York. And that is what the two plots show: California borrows heavily and Michigan borrows hardly at all. The noise in the Michigan plot is what we would expect. All of these codes are formulaic documents using a highly specialized (and verbose) vocabulary, so it is not surprising that there are phrases of even five words that occasionally show up in both codes. Exactly how reliable this method is remains to be determined, and of course these plots could be improved. But these plots do provide persuasive, if pragmatic, evidence that this method works: we can detect which codes borrow and which codes do not. Furthermore, by writing a kind of keywords in context function, we were able to spot check the matches to see that they were in fact significant borrowings of meaningful length.7

Visualizing the network of borrowings

Once we were sure that we could usefully detect matches between codes, we wanted to get an overview of our entire corpus.8 To do this, we reduced the relationship between each pair of codes to a single number expressing the degree to which code B borrowed from code A. We tried this several ways. The simplest way is to figure out whether a given n-gram in code B also occurs in code A, then to calculate the proportion of matches to the total number of n-grams. We have to experiment further with this. Another way was to calculate the ratio between the number of unique five-grams shared by both codes and the number of unique five-grams in the destination code. This gives us a single number between 0 and 1 for each combination of codes.9 Conceptually, this is equivalent to the proportion of the destination code that is borrowed from the origin code. So for example the California 1851 code above has a score of .387 when compared to the New York 1850 code: a significant proportion of the California code was borrowed from the New York code. When a code is compared to itself, the pairing gets a score of 1, i.e., they are a perfect match. Two codes in which there were absolutely no borrowings would receive a score of 0. For comparison, when the Michigan 1853 is compared to New York 1850 (as plotted above) it receives a score of .032.
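The two scores just described can be written down compactly. The sketch below is an illustration under stated assumptions and not our project code: the function names are hypothetical, and the n-grams are stand-in labels (`"g1"`, `"g2"`, ...) in place of real five-grams so that the arithmetic is easy to follow.

```r
# Stand-in labels for five-grams; "g3" occurs twice in the destination.
origin      <- c("g1", "g2", "g3", "g4")
destination <- c("g2", "g3", "g3", "g5", "g6")

# Simpler method: the share of all destination n-grams (repeats included)
# that also occur somewhere in the origin code.
proportion_matched <- function(origin, destination) {
  mean(destination %in% origin)
}

# Second method: unique shared n-grams divided by unique destination n-grams.
unique_ratio <- function(origin, destination) {
  length(intersect(origin, destination)) / length(unique(destination))
}

proportion_matched(origin, destination)  # 3 of 5 n-grams match: 0.6
unique_ratio(origin, destination)        # 2 of 4 unique n-grams match: 0.5
unique_ratio(destination, destination)   # a code compared to itself: 1
```

The two methods differ only in how repeated n-grams are counted, and both are asymmetric: swapping the origin and destination generally gives a different score.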

From there we computed a matrix of similarity scores between all the possible borrowings. The table excerpted below shows the degree of similarity between California’s 1851 code and the other codes listed.

CA 1851’s similarity to other codes

| code | score |
|---|---|
| AZ 1865 | .759 |
| NV 1861 | .695 |
| NV 1869 | .656 |
| UT 1870 | .476 |
| NY 1850 | .387 |
| FL 1870 | .188 |
| NC 1868 | .184 |
| NY 1849 | .174 |
| CA 1850 | .139 |
| NY 1848 | .115 |
| OH 1853 | .091 |
| KY 1851 | .065 |
| MO 1849 | .056 |
| MO 1856 | .045 |
| GA 1860 | .022 |
| MI 1853 | .013 |
| CT 1879 | .003 |
| UT 1853 | .002 |

Since 1851 was very early in the codification movement, most of these comparisons are anachronistic: it is of course impossible for California in 1851 to have borrowed from Arizona in 1865. So we filtered our matrix so we did not look for earlier codes borrowing from later codes. Time’s arrow alone is helpful for sorting out these relationships. The remaining matrix expresses a network graph, where the rows and columns are the nodes, and the values of the matrix are the weights of the edges. In such a matrix every code is connected to every other code, even if the connection (as between CA 1851 and MO 1849 in the table above) is so weak as to be just noise. So we again filtered the matrix to remove connections below an arbitrary weight.
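The two filters can be illustrated with the California 1851 scores from the table above. This is a sketch and not our actual code: it stores the scores as an edge list instead of a matrix, the helper `code_year()` is hypothetical, and treating only strictly earlier codes as possible sources is an assumption. The cutoff of .10 is the one used in the first network graph.

```r
# A subset of the scores from the CA 1851 table above.
scores <- c("AZ 1865" = .759, "NV 1861" = .695, "NY 1850" = .387,
            "NY 1849" = .174, "CA 1850" = .139, "NY 1848" = .115,
            "KY 1851" = .065, "MO 1849" = .056, "MI 1853" = .013)

# One row per candidate borrowing: did CA 1851 borrow from `source`?
edges <- data.frame(source = names(scores), destination = "CA 1851",
                    weight = unname(scores), stringsAsFactors = FALSE)

# Pull the year out of a label such as "NY 1850".
code_year <- function(code) as.integer(sub("^.* ", "", code))

# Time's arrow: keep only sources published before the destination.
edges <- edges[code_year(edges$source) < code_year(edges$destination), ]

# Drop weak connections that are probably just noise.
edges <- edges[edges$weight >= 0.10, ]

print(edges$source)
# "NY 1850" "NY 1849" "CA 1850" "NY 1848"
```

The rows that remain are the weighted edges of the network; MO 1849 passes the date filter but falls below the cutoff.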

Using the resulting matrix, we created the following network visualization of how codes borrowed from one another.10 (Click the figure to get a bigger version and zoom in with your browser.)

Figure 7: Network graph of borrowings between codes of civil procedure in the nineteenth-century United States (weight ≥ .10).

The network graph below extracts the largest family of codes that were related to one another, and is much stricter about which connections are retained. Just as we expected, this graph centers on the 1850 New York code, which radiates connections out to the surrounding codes. (If the Field Code were not central, as we know it to be from other historical research, then we would have had a strong presumption that our method had a problem.) But what is different about this visualization is the detailed connections between all the other codes.

Figure 8: The largest family of codes which borrowed from one another (weight ≥ .25).
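One way to think about extracting a "family" is as finding the largest connected component of the filtered network. The base R sketch below takes that interpretation as an assumption; it is not necessarily how the figure was produced. The handful of edges are illustrative only, chosen from borrowings mentioned in this post, and they carry no weights.

```r
# Illustrative edges only: each row says `to` borrowed from `from`.
edges <- data.frame(
  from = c("NY 1850", "CA 1851", "CA 1851", "NY 1849", "NC 1868"),
  to   = c("CA 1851", "NV 1861", "AZ 1865", "NC 1868", "FL 1870"),
  stringsAsFactors = FALSE
)

# Start with every code in its own family, then repeatedly merge the
# families at either end of each edge until nothing changes.
nodes <- unique(c(edges$from, edges$to))
family <- setNames(seq_along(nodes), nodes)
repeat {
  previous <- family
  for (i in seq_len(nrow(edges))) {
    pair <- c(edges$from[i], edges$to[i])
    family[pair] <- min(family[pair])
  }
  if (identical(previous, family)) break
}

# Keep the codes belonging to the most common family label.
sizes <- table(family)
largest_label <- as.integer(names(sizes)[which.max(sizes)])
largest_family <- names(family)[family == largest_label]
print(largest_family)
```

For a corpus of real size a graph library would be the more natural tool, but the loop shows the idea: codes end up in the same family whenever a chain of borrowings connects them.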

Preliminary findings and next steps

Our most significant finding so far both confirms and modifies the existing interpretations of the codification movement. The New York 1850 code, the original Field Code, was indeed the most important of the codes of civil procedure. But the NY 1850 code was only weakly to moderately related to its derivatives. Its significance was not in providing the model for each of the codes of civil procedure that followed. Rather, the Field Code provided the model for several regional families of codes. The Field Code, for instance, strongly influenced the California 1851 code. But the California 1851 code provided the archetype for a set of codes in the Western states. Utah, Montana, Arizona, Nevada, Colorado, and Idaho borrowed their codes almost entirely from California’s. Other regional borrowings can be detected. North Carolina in 1868 borrowed from New York’s 1849 code (an earlier version of the 1850 code), and Florida and South Carolina in 1870 borrowed from North Carolina. In 1862 Oregon borrowed from New York’s 1850 code, and in 1900 Alaska borrowed from Oregon. The story is not that states adopted the Field Code, but that the first or more important state in each region used the Field Code as a model, and then the other states crafted codes from local models that worked for the particular needs and conditions of those places.

There is much more to be done with these codes. I’ll mention the three most obvious next steps, then two possible further steps. First, the easiest thing to do is to extend this analysis to additional codes, which is simply a matter of identifying and OCRing them and then re-running the analysis. Second, it will be necessary to identify the content of the borrowings between codes. For example, which parts of the Field Code did California adopt, and how did California change those adoptions over time? Which parts of California’s code did the western states adopt? And third, it will be necessary to improve our analysis and make it more rigorous.

On that last point. We are at the stage in the project where we have proven (to ourselves, at least) the usefulness of this method. I often find that I can divide data analysis projects like this one into two phases (usually iterated). The first is a period of quick and dirty building of code to discover what the main questions are and what possible approaches might work. The second is a period of working to answer specific questions using methods which can be justified step by step to other scholars. In between, it’s necessary to burn down all the work that one has already done and start fresh. I think we have discovered the main lines of inquiry in the first phase, and now we are in the intermediate (read, burning down to rebuild better) phase.

Finally, two tentative further steps. First, it may prove worthwhile to create an interactive web version of some of these visualizations, and perhaps of a “code browser,” in order to permit others to explore these borrowings for themselves. Whether there would be enough interest to justify the expense of time we have not yet determined. And second, we think this method is more broadly applicable to the field of legal history. We hope that readers of these posts might offer some critiques of our methods. Eventually we plan to write up our methods in a more formal way, as well as to extend our work to legislative borrowings more broadly.


Appendix: check our work

All the code for this project is available on GitHub. In the text directory of that repository you can find the plain-text versions of all the codes that we have OCRed, if you wish to run your own analysis. See also Kellen’s American Legislation Project, which gathers the available scanned versions of the published codes. Several RPubs files (1, 2, 3, 4, 5) show the earlier stages of our investigation. I’m not really proud of some of the code in those documents, but this is as close to open notebook history as we can get. We will of course improve the code as we go.

  1. Well, not all novels, and certainly not all nineteenth-century novels.

  2. For that matter, no one has done the archival work on the legal reformers either. It is important to remember that Kellen’s larger project is a mix of macroscopic digital historical work and old-fashioned archival Sitzfleisch.

  3. The Code of Civil Procedure of the State of New-York (1850), title 4, section 628, p. 257.

  4. “An Act to regulate proceedings in Civil Cases, in the Courts of Justice of this State,” title 3, section 29, in The Statutes of California, Passed at the Second Session of the Legislature (1851), 55.

  5. Some of our code is modeled on the algorithms described in David A. Smith, Ryan Cordell, and Elizabeth Maddock Dillon, “Infectious texts: Modeling Text Reuse in Nineteenth-century Newspapers,” IEEE Workshop on Big Data and the Humanities (2013).

  6. For the record, the alpha setting for that plot is 0.1.

  7. A next step will be to extract not just phrases or sentences but the large sections of the codes which are heavily reused, just as Viral Texts does in extracting reprinted newspaper articles.

  8. The charts below are from the first 61 codes that we found and OCRed. We will eventually extend this approach to every relevant code that can be found. At last count we are up to 76.

  9. In our work so far, the difference between these two methods has not produced significant differences in the network graphs below. We used the second method most, and report on it in this post. But we think the simpler method is probably better and we are going to move to that. See the discussion of next steps below.

  10. There are other ways of visualizing the results, including clustering the civil procedure codes, but that is a subject for further work.

An R Client for the Internet Archive API


In support of some of my research projects, I created a simple R package to access the Internet Archive’s API. The package is intended to search for items, to retrieve their metadata in a usable form, and to download the files associated with the items. The package, called internetarchive, is available on GitHub. The README and the vignette have a full explanation, but here is a brief overview.

First, you can do keyword searches:

ia_keyword_search("isaac hecker")

Next, you can do advanced searches, specifying which fields you want to search:

ia_search(c("publisher" = "american tract society",
            date = "1840 TO 1850"))

Having retrieved a list of items using either of those search functions, you can get the items and their associated metadata in a data frame. Here we use magrittr’s pipe operator (%>%) to create a pipeline:

ia_keyword_search("isaac hecker") %>%
  ia_get_items() %>%
  ia_metadata()

My intended use is for downloading the files associated with items to create a corpus of texts. In this example, we search for items and download only the plain text files. The filtering is provided by dplyr.

ia_keyword_search("isaac hecker") %>%
  ia_get_items() %>%
  ia_files() %>%
  filter(type == "txt") %>%
  ia_download(dir = "texts")

The functions ia_metadata(), ia_files(), and ia_download() all return data frames, which should be easily filtered, reshaped, and joined as necessary. I hope the package is useful for creating corpora for text mining as well as for downloading sources to read in batches.

Doubtless there are parts of the Internet Archive API, especially in the advanced search and file types, that I haven’t adequately explored. I’ll be glad for bug reports.

Review of Religion and the Marketplace in the United States

[This post was originally published at Religion in American History.]

As Heath Carter has noted, we are due for a bumper crop of books on religion and capitalism in the United States. I want to briefly take note of a new collection of essays on the subject which came out of a conference held at Heidelberg University in 2011: Jan Stievermann, Philip Goff, Detlef Junker, Anthony Santoro, and Daniel Silliman, eds., Religion and the Marketplace in the United States (New York: Oxford University Press, 2015).

Religion in the Marketplace cover

I came to this collection looking for a critique of the persistent metaphor of a “market of religion.” (To be more precise, I’ve been sketching ideas for an essay on this topic. When I saw in the Amazon preview that the title of Brooks Holifield’s essay was similar to mine, I figured I needed to read it.) In the introduction the editors begin by critiquing “two metanarratives” about religion and the marketplace. The first of these “compartmentalizes religion and economics as more or less distinct spheres of human life that causally explain each other” (9). Most of the essays in this volume complicate this metanarrative. Mark Valeri on eighteenth-century New England merchants, Grant Wacker on Billy Graham, and Hilde Løvdal Stephens on James Dobson all describe small, daily interactions between religious belief and practice and the economy. Likewise a trio of essays on markets for religious books by Matthew Hedstrom, Günter Leypoldt, and Daniel Silliman show how groups from liberal Protestants to pretribulationist evangelicals navigated and created markets in religious commodities.

The second metanarrative that the editors critique is the idea that “the relationship of religion and markets … explains ‘the American difference,’ why America seems so religious in comparison with other Western countries” (15). The bookend essays to this collection by Brooks Holifield and Kathryn Lofton take on this idea. In a tightly argued essay on “The Limitations of Market Explanations,” Holifield makes short work of the idea, not so much refuting it as showing its implausibility. He argues instead that there is a “contingent, not necessary” connection between markets in religion and religiosity. Lofton makes the critique more general with a meditative essay on neoliberalism. She offers two observations: that historians are currently writing in an era of neoliberalism, and that most of the essays in the volume argue for a close connection between religious actors and the marketplace. From these observations she asks, “Is all American religion now neoliberal? Or is it merely the case that our scholarship has been so determined?” (285). Lofton doesn’t give a definitive answer to this question, but the asking is what makes her essay the highlight of the collection. My suspicion is that the idea of “a [metaphorical] market of religion” has become a crutch we reach for too often to describe American religious interactions without explaining them.

So there are two reasons you might want to pick up this edited collection. The body of the book offers a number of thoughtful, nuanced expositions of the daily interactions between religious actors and the economy. But Holifield and Lofton call into question the terms on which we write about religions and markets.

Exploring Elections for Massachusetts Governor in the Early Republic

The New Nation Votes database (NNV) offers election returns from the early American republic collected by Philip Lampi and digitized by Tufts University and the American Antiquarian Society. Several scholars writing in a 2013 issue of the Journal of the Early Republic have tackled questions such as voter turnout and measures of party competitiveness (Brooke), the resurgence of the Federalists after 1808 (Lampi), the expansion of the franchise (Ratcliffe), and families and the turnover of congressmen (Zagarri). My aim is much more preliminary: to see what kind of analysis, in particular mapping, might be done with the dataset.1 I have wanted to explore this dataset for some time, so here is a preliminary investigation into the Massachusetts gubernatorial elections up to 1824.

The first aim is to get an overview of party politics in the elections for governor. The chart below shows the percentage of votes won by the Federalist and Democratic-Republican parties from 1796 to 1824. The overall pattern in elections for governors is fairly plain. From 1797 until 1805, the Federalists had a strong hold on the office, putting Increase Sumner, Moses Gill, and Caleb Strong in the governor’s chair.2 Caleb Strong came to office (after both Sumner and Gill had died in office) with some competition in 1800, but his hold was fairly secure until 1805. That year inaugurated stiff competition for the governorship, which switched hands repeatedly until the election of 1813. The War of 1812 and the rise of younger Federalists gave the Federalists the upper hand until they lost the 1823 election, never to win the Massachusetts governorship again. Note that there are a few oddities in this chart which I have not resolved. For example, John Brooks was listed as a Federalist every year from 1816 to 1821, except 1818 to 1819; I don’t know whether that means Brooks really ran without an affiliation or whether it is an omission in the data. But this chart more or less confirms the argument of Philip Lampi (and earlier, of David Hackett Fischer).

Figure 1: Percentage of the vote won by the Federalist and Democratic-Republican parties in elections for Massachusetts governor, 1796–1824. Data: [A New Nation Votes](http://elections.lib.tufts.edu/).

The next task is to see which politicians were serious contenders for governor of Massachusetts. I’m arbitrarily defining a contender as someone who managed to win at least 10 percent of the vote in at least one election. More than one thousand people are listed in the gubernatorial election returns from 1787 to 1824, but only twenty got at least 10% of the vote, and only ten won office in their own right.3 The chart below shows the careers of those contenders. There are too many people on this chart for the colors to be much help, so I’ve labeled the lines for the more significant figures. John Hancock had a secure tenure, while Samuel Adams’s was somewhat more rocky. But note that Increase Sumner, the first Federalist governor, also won high proportions of the popular vote. Starting in 1800 and certainly by 1805, elections were contested much more heavily; the nature of gubernatorial politics changed. We can see the arcs of people’s careers. Federalist Caleb Strong won a close election in 1800 and gradually increased his margin of victory, but with the new regime of competition he was an on-again, off-again candidate until 1815. Federalist John Brooks enjoyed seven wins in a row, but his last three elections were contested by William Eustis. Eustis never did defeat Brooks, but he did defeat Federalists Harrison Gray Otis in 1823 and Samuel Lathrop in 1824. Elbridge Gerry was the William Jennings Bryan of the early republic (except Gerry eventually won), running repeatedly for governor starting in 1788, but not even coming close until he won in 1810.

Figure 2: Contenders in elections for Massachusetts governor who won more than 10% of the vote, 1787–1824. Data: [A New Nation Votes](http://elections.lib.tufts.edu/).
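The 10 percent rule for contenders is simple to express in code. The sketch below uses base R on a tiny, entirely made-up set of returns; the column names, candidate labels, and vote counts are all hypothetical and are not drawn from the NNV dataset.

```r
# Made-up returns for two elections; NOT real NNV data.
returns <- data.frame(
  year      = c(1800, 1800, 1800, 1801, 1801, 1801),
  candidate = c("Candidate A", "Candidate B", "Candidate C",
                "Candidate A", "Candidate B", "Candidate C"),
  votes     = c(520, 450, 30, 600, 380, 20),
  stringsAsFactors = FALSE
)

# Each candidate's share of the total vote cast in that year.
returns$share <- returns$votes / ave(returns$votes, returns$year, FUN = sum)

# A contender won at least 10% of the vote in at least one election.
best_share <- tapply(returns$share, returns$candidate, max)
contenders <- names(best_share)[best_share >= 0.10]
print(contenders)
```

With real returns, the town-level vote counts would first need to be summed to one total per candidate per year.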

Next, I’ve created maps for three gubernatorial elections: 1800, 1807, and 1823. These maps are exceedingly rough-and-ready, intended for exploration rather than argument. I made them using my cartographer package. The election returns are in the NNV dataset at the level of towns, so I geocoded the names of 869 towns in Massachusetts and Maine.4 This is not ideal, but since towns tend to split rather than move, the locations should be more or less correct for these exploratory maps. The county boundaries are from the Atlas of Historical County Boundaries via my USAboundaries package. The maps each include layers for the top two or three candidates. Red represents the Federalists; blue, the Democratic-Republicans. Each town is sized according to the number of votes for the candidate.5 Click on each town to get the number of votes. This way of layering the votes for each candidate is not ideal. Perhaps a better solution would be to show how many more votes each candidate won in a particular place; e.g., Strong won 142 more votes than Gerry in Brookfield.

Some general observations about the importance of space in these elections.

First, Boston was far and away the biggest city in Massachusetts, but it had little impact on the elections. In the 1800 election, Gerry got only 24 more votes than Strong in Boston, a difference of less than 1 percent of the turnout. In 1823, Otis got only 108 more votes than Eustis. Only in 1807 did Strong get significantly more votes than Sullivan (and Strong still lost the election). Even though Boston contributed more votes than any other place, and though sometimes it went for Democratic-Republicans and sometimes it went for Federalists, it was not really a swing city because the two parties were usually closely tied in Boston.

Second, in the 1800 election Strong won because he won Berkshire and Hampshire Counties in Western Massachusetts. Gerry’s support in those counties was virtually non-existent.6 Gerry, though, did much better in Maine, especially away from the coast. Strong also did well in Essex County, a Federalist center of strength.

Figure 3: Election for Massachusetts governor, 1800. Data: A New Nation Votes.

By 1807 the Democratic-Republican candidate, James Sullivan, did far better than Gerry had in the Western counties, and in some Western towns he did better than Strong. Sullivan even made some inroads into Essex County and Cape Cod, though Strong made some inroads into Maine. This election was closely contested in nearly every town, and Sullivan narrowly defeated Strong by gaining votes in places that had gone heavily for Federalists in earlier elections. The change in politics from dominance by one party to heavily contested elections that we noted in the charts above also appears on this map.

Figure 4: Election for Massachusetts governor, 1807. Data: A New Nation Votes.

In 1823, Otis maintained some of the Federalist strength in western Massachusetts, though he also lost (I suspect that when Maine gained statehood in 1820, the Federalists benefited slightly from a decline of Republican votes). But Otis was defeated in most of the towns surrounding Boston.

Figure 5: Election for Massachusetts governor, 1823. Data: A New Nation Votes.

These maps show comparatively little of the split between “blue” cities and “red” country that we are accustomed to in modern electoral maps. This is hardly surprising, since mass urbanization happened much later. But what is surprising in these few maps is how close the vote was in many towns. The line between Federalists and Democratic-Republicans did not run between towns but through them. Elections were highly competitive at the state level, but that competition was also reflected in most towns.7 There is a lot more work to do, including figuring out a better way of representing votes by town, creating maps for all the Massachusetts gubernatorial elections, extending the analysis to other states and other types of elections, and taking on questions such as voter turnout and changing patterns of votes within particular towns.


If you would like to look up a particular election or candidate, use the table below.

Table 1: Top five contenders in each election for governor of Massachusetts, 1787–1824. Data: A New Nation Votes. Search by year or candidate name.

  1. See the summer 2013 issue of the Journal of the Early Republic, which includes the following articles: Caroline F. Sloat, “A New Nation Votes and the Study of American Politics, 1789-1824,” Journal of the Early Republic 33, no. 2 (2013): 183–86, doi:10.1353/jer.2013.0042; John L. Brooke, “‘King George Has Issued Too Many Pattents for Us’: Property and Democracy in Jeffersonian New York,” Journal of the Early Republic 33, no. 2 (2013): 187–217, doi:10.1353/jer.2013.0037; Donald Ratcliffe, “The Right to Vote and the Rise of Democracy, 1787-1828,” Journal of the Early Republic 33, no. 2 (2013): 219–54, doi:10.1353/jer.2013.0033; Philip J. Lampi, “The Federalist Party Resurgence, 1808-1816: Evidence from the New Nation Votes Database,” Journal of the Early Republic 33, no. 2 (2013): 255–81, doi:10.1353/jer.2013.0029; Rosemarie Zagarri, “The Family Factor: Congressmen, Turnover, and the Burden of Public Service in the Early American Republic,” Journal of the Early Republic 33, no. 2 (2013): 283–316, doi:10.1353/jer.2013.0026; Andrew W. Robertson, “Afterword: Reconceptualizing Jeffersonian Democracy,” Journal of the Early Republic 33, no. 2 (2013): 317–34, doi:10.1353/jer.2013.0023.

  2. Samuel Adams (won 1796) is listed as a Republican in NNV.

  3. Governors Levi Lincoln Sr., Moses Gill, and Marcus Morton succeeded governors who died in office but did not win office in their own right. The turnover between parties must be attributed, at least in part, to weak successors running for governor.

  4. This required an additional step to distinguish between Maine and Massachusetts, since until 1820, towns in what is today Maine were part of Massachusetts. A few populated places, such as “Number 8 and 9” in Maine could not be geocoded, but those places account for fewer than 100 votes per election.

  5. Because many more people voted in later elections, the relationship between the size of the circles and the number of votes varies from map to map.

  6. I am surprised that a Federalist did better in Western Massachusetts. Am I wrong to be surprised?

  7. Of course there are some exceptions. Cheshire and Adams stand out to me: both were home to mills, and Cheshire had a glass factory. Did these mill towns have a different kind of politics?

New Syllabus: Religion and Capitalism in the United States

This semester I’m teaching a new graduate seminar titled “Religion and Capitalism in the United States.” The readers of the Religion in American History blog gave me many suggestions for the readings. The syllabus is available online. Here is the class description:

The relationship between religion and capitalism has long exercised historians of the United States, and before them it concerned the people whom historians study. In this class, you will meet many people whose religion led them to interact with capitalism in incredibly diverse ways. You will meet the Puritans whose work ethic supposedly created capitalism, but who insisted on resting on the Sabbath; Moravian missionaries who made converts and money; slaves, slaveowners, and abolitionists who all claimed the Bible when reckoning with the capitalist system of slavery; a Protestant writer who insisted that Jesus was a businessman, and Catholics who believed Jesus called them to a kind of socialism; African American preachers who marketed their recorded sermons; Jews who mass-manufactured matzah and created Yiddish socialism; an industrialist who wrote The Gospel of Wealth, and laborers who created churches for the working class; nineteenth-century consumers who turned gift-giving into a ritual, and a twenty-first-century television personality who turned consumption into therapy; converts who thought religion required poverty, and Prosperity Gospelers who thought it promised wealth. You will read primary sources from American history, secondary works in both religious history and the new history of capitalism, and excerpts from theorists of religion and capitalism. Through these readings and your own research project, you are invited to make sense of this perpetual historical puzzle.

Religious History and Religious Studies Syllabi from the Past Semester

Happy New Year, Religion in American History readers.

One of my favorite ways to get to know a scholar is to read her syllabi. Syllabi show how scholars put together a whole field. (And probably no text reveals personality as much as the introduction and policies on a syllabus.) Yet unfortunately teaching documents are shared less routinely than our research, so we are much more likely to know a scholar’s books and articles than her syllabi. Following the example of Paul Putz’s regular lists of new books, I intend to start posting a roundup of syllabi for religious history and religious studies from the past semester from whoever wishes to contribute.

So here is a list of past syllabi from people who replied to my entreaties. Only a small number replied this first time, but if you would like to add your syllabus to this list, feel free to leave a link in the comments, or you can e-mail me a document and I’ll add it (lincoln@lincolnmullen.com).

N.B. The following syllabi have been added since this post was first published: