I often have small snippets of Markdown that I want to copy to the clipboard and then paste as HTML. I thought about writing an extension for Visual Studio Code, or a custom script for Boop. But that seemed like a lot of work for a simple task. And then I remembered: Unix.

pbpaste | pandoc | pbcopy

That one-liner will work on a Mac: it pipes the Markdown on the clipboard into Pandoc and copies the resulting HTML back to the clipboard.

Can’t get much simpler than that.

If you want to go from HTML to Markdown, the one-liner is a little longer:

pbpaste | pandoc -f html -t markdown | pbcopy
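If you do this often, the pipelines are easy to wrap in small shell functions. Here is a convenience sketch; the function names are my own invention rather than anything standard:

    # Convert clipboard Markdown to HTML, and back again (macOS only; requires pandoc).
    md2html() { pbpaste | pandoc -f markdown -t html | pbcopy; }
    html2md() { pbpaste | pandoc -f html -t markdown | pbcopy; }

Drop them in your .zshrc and either conversion becomes a single command.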

If you are a GMU student, faculty, or staff member, you can access library resources from off campus using a library proxy. Suppose you want to access this article:

Rosenzweig, Roy. “Scarcity or Abundance? Preserving the Past in a Digital Era.” American Historical Review 108, no. 3 (2003): 735–62. https://doi.org/10.1086/ahr/108.3.735.

Following that DOI will take you to this page at the Oxford University Press website. This version is behind a paywall, however: we can’t see the full article.

A paywalled version of the article.
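The usual fix is to run the URL through the library’s proxy server, which prompts you to log in with your university credentials and then passes you along to the publisher. As a rough sketch of the pattern, EZproxy-style proxies take the target URL as a parameter; the hostname below is a placeholder, not GMU’s actual proxy address:

    # Placeholder proxy host; substitute your library's real proxy prefix.
    open "https://proxy.example.edu/login?url=https://doi.org/10.1086/ahr/108.3.735"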



The start of the academic year at RRCHNM also means the return of many of our graduate students. This week RRCHNM welcomed twenty-five graduate research assistants or graduate affiliates.

RRCHNM on the first day of classes.

Graduate students are a critical part of the work that RRCHNM creates, and RRCHNM is in turn core to the experience of many of the graduate students in GMU’s Department of History and Art History.

A number of the graduate students at RRCHNM are graduate research assistants. As a part of their PhD fellowship from GMU, they work on RRCHNM projects. Coming alongside our faculty and staff, GRAs are a full part of project teams. They contribute their expertise both on the subject matter of our projects and on the digital history skills that go into making them. While participating in the projects they learn from the faculty and staff and quickly move from being research assistants in name to research collaborators in fact. Our alumni from RRCHNM take what they have learned to become digital historians in libraries, museums, businesses, research centers, and academic departments.

The kinds of work that GRAs do at RRCHNM can be quite varied. For example, one GRA is working on writing and producing episodes for The Green Tunnel podcast about the history of the Appalachian Trail. Two GRAs are working with our partners at the National Museum of African American History and Culture to create a digital archive from the collections of five HBCUs. Two other GRAs are contributing to large-scale projects to turn historical collections into datasets. And two first-year GRAs are working on World History Commons to create educational resources and lesson plans for K–12 teachers, parents, and students.

Graduate students are also a part of RRCHNM so that they can work on research projects that they start and lead themselves. Our graduate research affiliates—who are MA or PhD students in history or art history, as well as in related disciplines—benefit from the mentoring of RRCHNM’s staff and faculty. We currently have three working groups—on public history, spatial history, and data analysis—meeting regularly throughout the year to discuss works in progress and new publications on these topics. We regularly offer a “Basics of” series in which people at RRCHNM offer one another introductions to podcasting, educational resources, community engagement, various technologies, and so-called “soft skills” such as budgeting and project management. The graduate students themselves also form a community for peer mentoring.

The research projects that graduate students at RRCHNM create lead the field of digital history as much as any of the work by our faculty and staff. PhD students at GMU were the first to create born-digital dissertations. Our affiliates regularly present their research at academic conferences and publish their projects online. A number of our current affiliates are working on digital dissertations that combine visualizations, maps, databases, or public engagement with the prose more typically associated with dissertations.

To most people, RRCHNM is associated with the digital projects that we create and make freely available online. Those online projects are the most visible part of RRCHNM. But another key part of our work is educating students to do digital history so that they can in turn achieve our mission—to democratize history. And that is work we have been doing for decades.


This weekend I am giving a presentation about the future of digital scholarship in the field of American religion at the Biennial Conference on Religion and American Culture. In the presentation I’ll be sharing a number of digital projects in American religion that I’ve learned a lot from. Since the proceedings of the conference will be published later, I won’t publish my remarks here now. But for the sake of conference participants who might want to follow along, here is a list of the projects I’ll mention without notes or comment.


Hi folks. Lest this newsletter’s status be downgraded from occasional to sporadic, let me catch you up on the latest news about digital history and American religion from my small corner of the world.


My colleagues at RRCHNM and I have started a newsletter about our work. Titled “American Religion @ RRCHNM,” the newsletter is headed up by our excellent outreach manager, Bridget Buckovich, which means that it is published regularly in the middle of each month. We’ve published our first two issues. There is lots of good stuff in the most recent issue, including blog posts by our graduate research assistant Caroline Greer, an interactive visualization that the whole team worked on, and fascinating materials for Passover and Easter from our pandemic collecting projects. Check it out, and then consider subscribing.


Mapping historical congregations in U.S. cities

Speaking of our work on American religion at RRCHNM, we recently released an interactive map of urban American congregations drawn from the 1926 Census of Religious Bodies. Here’s a bit more about the map.

In the early twentieth century, the U.S. Census Bureau conducted surveys of American religious congregations every ten years and published reports on the data it collected. The Bureau categorized denominations into denominational families, linking together churches that shared history, theology, or practice. This interactive map displays congregations by denomination and denominational family in American cities, that is, places with 25,000 or more residents.

To give you a taste, here are Pentecostal congregations located in cities in 1926.

Pentecostal cities map

We will be releasing the underlying data and adding three other decades (1906, 1916, and 1936) soon. The whole team worked on this dataset and map, but special thanks to my colleague and our developer-scholar, Jason Heppler.


DataScribe’s version 1.0 release

And how does one go about getting the dataset for such a map? Well, you have to transcribe the historical sources—in this case, the published records of the 1926 census—into structured data. And how does one do that? For some time, my colleagues at RRCHNM and some former colleagues now at the Corporation for Digital Scholarship have been working on DataScribe, a module (i.e., plugin) for Omeka S that helps you transcribe historical sources into datasets. The project, which is led by my colleague Jessica Otis, recently reached version 1.0. You can find the software on the project home page.

DataScribe preview

Computing Cultural Heritage in the Cloud

During the fall semester, I had the privilege of working with the Library of Congress Labs—cool place, cool people—on a project called Computing Cultural Heritage in the Cloud. The LC Labs team did an amazing job documenting the project’s progress, and you can find all their blog posts on the project home page. But you might as well start with the blog post on project outcomes, so you can hear about the fascinating work done by my fellow researchers, Andromeda Yelton and Lauren Tilton. You can find the software I developed for the project on GitHub. If you’ve ever had a hankering to download all the digitized collections at the Library of Congress and run machine-learning models across them, then this is for you.


Shoutout to the Uncivil Religion project

Finally, a shoutout to one of the most interesting and significant digital projects on American religion to be released recently. Uncivil Religion is an online collection of essays and media galleries that seeks to capture and interpret the religious dimensions of the January 6 insurrection at the U.S. Capitol. The lead essay describes the insurrection as “a religious, yet religiously incoherent event.” Perhaps you, like me, found the ways that religion “showed up” (to borrow another phrase from the project) to be an endlessly tangled pile of confusion. Uncivil Religion is the best source I’ve found for trying to understand the religious dimensions of that event. The project comes out of the University of Alabama and the National Museum of American History, and it is directed by Michael J. Altman and Jerome Copulsky with Peter Manseau as an advisor.


Recent blog posts

City-level data post

Updates

Reading: Herman Melville’s The Confidence-Man.

Listening: I’ve been enjoying an “American roots music” (mostly bluegrass) band called The Petersens that I found on YouTube.

Planning: My wife, Abby Mullen, has accepted a job at the U.S. Naval Academy teaching—you guessed it—U.S. naval history.

Working: America’s Public Bible is through editorial board approval, and I need to submit the final manuscript in one month’s time exactly.


I was asked to write up what I thought made someone a good academic mentor, in less than a page. Since I had to write it up quickly, here it is for further thought. This list is partial and based on my own experience, but here is what I’ve observed from watching the good mentors that I have had.

  1. What got you here is not what will get them there. Too much of mentorship is the mentor describing their own career path. While there is value in hearing other people’s stories as a quick route for understanding how the academy works, the chances that someone else will follow the same career path are nil. A good mentor helps someone else find their own way forward, based on their values, interests, and goals, as well as the changing circumstances in the academy.
  2. Where you wanted to go is not where they want to go. Too much of mentorship is the mentor trying to reproduce him- or herself via the person being mentored. But other people’s career goals—not to mention how their career fits into their personal life—can and should be very different than your own.
  3. People can find their own answers. Almost all of the time, people can work out what their own values, interests, priorities, strengths, and so forth are. They seldom need suggestions about what to do or even how to do it. What they need is someone to talk to who genuinely listens and can help them figure those things out for themselves. Occasionally they need someone they trust to give them “permission” to do what they’ve already figured out.
  4. Explain the boring stuff. Many things about an academic career are not hard: they are just hard to learn. For instance, the mechanics of grant writing are not so difficult, but they are completely opaque the first time someone does it. One of the few times when a mentor should talk more than listen or ask questions is in explaining the routine, boring things: the hidden knowledge that blocks people (especially women and minorities) from success.
  5. Share failures as well as successes. When I was in grad school, I got a “revise and resubmit” from a journal then never resubmitted, because I thought that was just a polite way for the editor to say, “Get lost.” I’ve used this example to illustrate how the “pipeline” of academic research works … and to show grad students how much smarter they are than me!
  6. Open doors. Whenever possible, make introductions that benefit the person being mentored.
  7. Informal mentorship trumps formal mentorship. I have had good formal mentors, but their significance was secondary to some truly generous and wise informal mentors. My point is not to critique the idea of formal mentorship. But I do think that formal mentorship is a temporary relationship to help people until they find informal mentors for themselves—which is a great outcome.
  8. Never take credit. The successes of the person being mentored belong only to them, never to the mentor. However, some bragging on their behalf is allowed.

RRCHNM is a shop that is working more and more on computational history and historical data visualization. But we are also first and foremost a web shop: ever since Roy Rosenzweig saw the potential of the internet and left CD-ROMs behind, we’ve been committed to delivering history via people’s web browsers. Those two commitments are becoming increasingly compatible. For example, Ben Schmidt has written persuasively about the next decade of data programming happening in the browser via JavaScript. But combining data analysis and the web takes work. In this blog post, I want to explain how we are solving one aspect of that challenge via our custom data API.

We have a lot of datasets in play for RRCHNM’s projects. Some of the spatial datasets, such as Natural Earth and the Atlas of Historic County Boundaries, we use over and over across projects. AHCB is a critical part of both Mapping Early American Elections and American Religious Ecologies. Some of the datasets are small and intended for display. Others are large text corpora, such as Chronicling America, Gale’s Making of Modern Law, or all of the full text collections from the Library of Congress gathered as part of Computing Cultural Heritage in the Cloud, from which we compute derivative datasets of biblical quotations, legal citations, or the like. Even those derivative datasets can be fairly large and unwieldy. And other datasets are ones that we are transcribing ourselves using our DataScribe tool. These include the data about religious congregations from the 1926 Census of Religious Bodies and about London’s bills of mortality.

The version of record for these datasets is typically a PostgreSQL database. We use a relational database for—well—all the reasons everyone else uses relational databases. In particular, we value the strong guarantees a database provides about the data being strongly typed and well structured. We find it useful to be able to access the exact same data via, say, R for data analysis and a web application for display. And of course, there is the ability to query and index the data, combine datasets through joins, provide shared access, and so forth. PostgreSQL is not an exciting choice; it may very well be the least exciting choice imaginable. But rock solid and boring is a great place to be for critical infrastructure. 

An example of what some of the data looks like from the American Religious Ecologies project. It might not look like much, but we had to reverse engineer an entire federal census in order to create it.

That still leaves the problem of getting the data out of the database and into the user’s browser. We needed a solution that could provide some key features:

  • The data should be delivered in a format easily usable for web visualization, which means JSON or GeoJSON.
  • The data should be reshaped as necessary. Frequently the form that data is stored in, typically some kind of normalized form, is not the way that the data should be structured for display.
  • Large datasets must be queryable. Although browsers can handle more and more, that does not mean they should be made to do so; ideally, only the minimum amount of data necessary should be sent to the browser.
  • It should be easily extensible as we add new projects, and it should not require us to reinvent the wheel every time we start a new project. Rather, it should let us use existing data and functionality (such as the AHCB dataset I mentioned) across projects.
  • And, if the need arises, it should allow the browser to write back to the database.

JSON from the data API. It’s not exciting, but if it’s what you need, it’s very useful.

Our solution was to create a custom data API for RRCHNM projects, which we call Apiary. (Yes, we know other projects use that name, but this is just our internal codename.) The API is written in Go, a simple but powerful language well suited to our needs here, and it is containerized using Docker for ease of deployment. The API essentially consists of a thin, fairly minimal application that provides the necessary boilerplate to set up a database connection, a web server, and so forth. Individual endpoints that provide specific datasets are then added as handlers. Adding a new dataset, or a new view on a dataset, is thus as straightforward as writing a new function in Go. And since those handlers fall into a few different types, in most instances the main work of adding a new endpoint is writing a new SQL query.
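To give a sense of what consuming one of those endpoints looks like, here is a hedged example from the command line; the hostname, path, and query parameter are invented for illustration and are not Apiary’s actual routes:

    # Hypothetical endpoint returning denomination data as JSON; jq just pretty-prints a slice of it.
    curl -s "https://data.example.org/relcensus/denomination-families?year=1926" | jq '.[0:3]'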

Our data API is available under an open-source license on GitHub. (You can also take a look at the API’s endpoints.) To be clear, this project is a highly custom application, not a library or a general-purpose application. Nearly all of the handlers would be of no use to non-RRCHNM projects, and you would have to create your own database, queries, endpoints, and so forth. But as we look around at the landscape of digital history and digital humanities projects, we see other projects that have a similar need to store, structure, query, and display data in the browser. Perhaps the general idea of a data API could prove useful to other institutions and scholars.


I recently had to set up a new Mac for work. Generally speaking, this happens so infrequently that it is worth setting up the new machine from scratch, rather than using Migration Assistant. I like to avoid carrying over the cruft that comes from several years of a constantly updated development environment, and all my work files are in iCloud Drive or GitHub anyway. But still, that leaves a fair bit of setup to do to get things working correctly.

For a long time I’ve kept my dotfiles in a GitHub repository. This sets up configuration for ZSH, Neovim, Go, R, Homebrew, LaTeX, Git, and the like. While a lot of it is Mac specific, the shell and text editor configuration work fine on Linux machines, so I can easily bring settings to servers. This makes customizing my development environment fairly painless. (And Visual Studio Code now has great settings sync, so that takes care of itself.)

Of course not everything can go in a public GitHub repository. Recently, I’ve taken to having a single file (~/.env.zsh) which contains project and service credentials, as well as machine-specific settings, stored as environment variables. For example, all the projects that I create pull their database connection settings from environment variables. And setting the number of cores available on a particular machine makes scaling up parallel processing easier. This file, like SSH keys, is easy to move over to a new machine.

Some machine-specific settings from my environment file.
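To make that concrete, here is a sketch of the kind of entries such a file might hold; the variable names and values are hypothetical rather than my actual configuration:

    # ~/.env.zsh: sourced from .zshrc, never committed to the public dotfiles repo.
    export APB_DB_HOST="localhost"   # per-project database connection settings
    export APB_DB_PORT="5432"
    export APB_DB_USER="research"
    export CORES=8                   # machine-specific: cores available to parallel jobs

A line in .zshrc along the lines of [ -f ~/.env.zsh ] && source ~/.env.zsh loads it on any machine where the file exists.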

What was new to me this time was using Homebrew bundles for installing software and dependencies. While I’ve used Homebrew for a long time, I recently learned from a blog post by Casey Liss that Homebrew has for a while now supported creating a list of software to install. In addition to installing CLIs and other packages from Homebrew proper, and GUI applications as Homebrew Casks, it even supports (though not particularly reliably) installing applications from the Mac App Store.

So I set up a Brewfile for my work machine. This worked great for setting up the new machine, and it is nice to have an explicit record of the software that I need to have installed.
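Roughly, the workflow looks like this; treat it as a sketch rather than my exact commands, and the Brewfile entries shown in the comments are placeholders:

    # On the old machine: write everything currently installed to a Brewfile.
    brew bundle dump --file="$HOME/Brewfile"
    # On the new machine: install every formula, cask, and App Store app listed there.
    brew bundle install --file="$HOME/Brewfile"
    # The Brewfile itself is just a list of entries like:
    #   brew "pandoc"
    #   cask "visual-studio-code"
    #   mas "Some App", id: 123456789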


As promised, here is the first installment of an occasional series on my tech stack. If you want to jump straight to the history, see below for two book recommendations on labor history and religious history.


Last week I mentioned that I would write about the technology stack that I use to do digital history work. For this week, let me briefly introduce the concept.

Talking about “a stack” of technologies is not the same thing as talking about digital tools. The whole discussion of “tools” in digital humanities, or the broader cultural contexts of life hacking, productivity pr0n, and their ilk, is not something I want to get into right now.

Here is how a technology stack and “tools” are different. A tool is a specific means to a specific end. You want to make a map, therefore you use X. You need to clean data, therefore you use Y. You need to make a network, therefore you use Z. There’s nothing wrong with that, of course. But too often that approach leads to a shallow understanding of the method being used. Networks and maps are different, of course, but they can also be worked on using the same technologies. The short-term convenience of using a highly specialized tool buys a lot of long-term pain, because the data and outputs are rarely reusable or interoperable.

So what I want to talk about for a few newsletter issues are the components of an overall approach to the kind of digital history I do (computational, spatial, data visualization and so forth), and how they work together. These are parts of a system for doing research that work together across projects. The goal is to build fluency with using these parts of the system, so that future projects can build on the data and systems I have developed along the way.

For example, almost all of my projects store their data in PostgreSQL, whether the data is spatial or textual, and whether it comes from an API or a bulk download. If I need to analyze that data, I do it in R. If it needs to go on the web, I use some combination of Go and JavaScript. Those different parts of the stack are all pulling from the same database.
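To sketch what that looks like in practice (the connection string, table, and scripts here are invented for illustration), every part of the stack reads the same connection settings:

    # One connection string per machine, exported as an environment variable.
    export RESEARCH_DB_URL="postgres://localhost:5432/research"
    # Ad hoc queries from the shell ...
    psql "$RESEARCH_DB_URL" -c "SELECT count(*) FROM congregations;"
    # ... while analysis scripts in R and the web application in Go read the same variable.
    Rscript analysis.R
    go run ./cmd/server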

Here are a few of the pieces I’m sure I want to talk about. Perhaps there will be others too.

  • Data store: PostgreSQL, PostGIS
  • Web and general-purpose programming: Go
  • Data analysis and visualization: R and JavaScript
  • Websites: Hugo
  • Web servers and hosting

More in the coming weeks.


It was just Labor Day in the U.S., so here are two book recommendations.

The first is Roy Rosenzweig’s Eight Hours for What We Will: Workers and Leisure in an Industrial City, 1870–1920. I now work at the Roy Rosenzweig Center for History and New Media, though I never met the man Roy Rosenzweig. (I was once assigned to hand out educational materials at the Organization of American Historians on behalf of RRCHNM. I was mostly unsuccessful at that task, but at least four people came up to me to tell me, at great length, how much Rosenzweig meant to them. It made an impression.) But I knew Eight Hours for What We Will a long time before I came to RRCHNM, and not just because it still belongs on the exam list for every PhD in American history. One of my textbooks as an undergrad had the forbidding title, Historiography: Ancient, Medieval, and Modern. In reviewing labor history, it mentioned Rosenzweig’s history of Worcester, Massachusetts. Since Worcester is the closest city to the town where I grew up, I looked it up in the university library and loved it. The book still sparkles with historical imagination and appreciation for its subjects, and it is very much worth reading.

The second is Heath Carter’s Union Made: Working People and the Rise of Social Christianity in Chicago. You may detect that I love history books which are carefully anchored in a particular place. Carter’s book is very much a history of Chicago, but it also unpacks the class divisions in specific congregations. The book begins with labor union representatives preaching Labor Day sermons in church pulpits in 1910. I think you will find that practice a contrast with practically any American church a century later.


Updates

Working: Containerizing and adapting my prediction model for America’s Public Bible so that I can use it for Computing Cultural Heritage in the Cloud.

Listening: Tennessee Ernie Ford, Nearer the Cross (1958): found an LP for 99¢ at the used bookstore.

Reading: Just finished Margaret O’Mara’s The Code: Silicon Valley and the Remaking of America, which I thought was both really good and a really good read.

Hiking: Sky Meadows State Park with my family.


Hi folks. It has been an embarrassingly long time since I wrote an issue of this newsletter. A few things happened. The sheer exhaustion of the pandemic caught up with me, as I am sure it did with you. But even more, I took on a major non-work responsibility—the details don’t matter for our purposes—and I have tried to discharge my duty faithfully. But a new academic year is upon us, and I hope to get back to writing this newsletter. Below is a scattershot of updates to get started again.


I read in the news that in an address on the crisis in Afghanistan, President Biden quoted Isaiah 6:8 (“Here am I; Send me”), referring to American service members. I think it is fair to say that it is a jarring and not at all typical use of the text. Certainly I had never encountered a use of that text in that context before. So I had to wonder, Is there a history of using that verse to refer to the military? But before I could work on it myself, my feed reader turned up Chris Gehrz’s post for The Anxious Bench: “‘Here I am, send me’ in American Military History.” Chris uses my America’s Public Bible to turn up a number of earlier examples of similar uses. You should take a look.


Speaking of America’s Public Bible, I am working as quickly as I can to complete the updated version to send to the press. Here’s Isaiah 6:8 in the long-running prototype version, which you can continue to access.

APB prototype

And here is the far more reliable and (I hope) more useful version that is forthcoming but still in development.

APB development

It’s not just a visual refresh: I’ve also extended the chronological range, found a lot more quotations, and added an interpretative layer to the project.


Speaking of layers, one of the most useful essays I’ve read on the form of digital scholarship is Robert Darnton’s 1999 essay, “The New Age of the Book.” I was not precocious enough to be reading the New York Review of Books at the dawn of the new millennium, but I had the good fortune to hear Darnton speak on a similar subject at the Brandeis University library while I was in graduate school and subsequently discovered the essay. I’ve found his idea of an e-book as a pyramid of scholarly materials—from a broad base of sources ascending to an interpretative point—to be a persuasive goal for digital scholarship.

I’ve tried to structure the new version of the project along those lines. Here’s a draft of the “how to use this site” section:

The elements of this site form an interpretative pyramid, something like the e-books that Robert Darnton envisioned.

  • At the base are quotations in the newspaper. You can browse the gallery of quotations to see examples, or see the datasets for a complete list.
  • Those quotations are aggregated into trend lines, which are accompanied by tables of quotations. You can start by browsing the featured verses.
  • Verse histories take the information from the trend line and the quotations and offer brief interpretative essays on their history.
  • Longer essays and other explorations introduce the site and its methods, and address topical questions in the history of the Bible in the United States.

Speaking of e-books, America’s Public Bible will be a digital monograph, and it will be published more like a book than like a website. (It will even have the obligatory colon and subtitle: A Commentary.) But I hope to continue adding things as occasion arises, and one of its primary purposes is to be an ongoing platform for other people’s scholarship, too.

And so I’ve recently become a part of Computing Cultural Heritage in the Cloud at the LC Labs. Part of my aim there is to extend APB by finding biblical quotations across all the Library of Congress’s digital collections. But my not-so-secret other aim is to hang out with the cool folks at LC Labs and my fellow researchers, Andromeda Yelton and Lauren Tilton. (Mission accomplished.) Here’s a post from the Library about the project, and here’s a story in the Wall Street Journal.


Returning to Darnton’s ideas for e-books, there is a kind of homology between his concept of a layered pyramid of scholarship and what programmers would call “the tech stack.” The stack is the set of technologies that enables some kind of software product. For example, you might have heard of the LAMP stack, which undergirds popular software like WordPress: Linux (the operating system), Apache (the web server), MySQL (the database), and PHP/Perl/Python (the programming language). Well, I don’t use any of those. But I thought I might start writing an occasional series on the technology stack that I do use for my digital research. Why? Because I love PostgreSQL and I think you should too. More on that next time.


Updates

Listening: Unearthed.

Working: Collaborating with colleagues on a map of city-level data from the Censuses of Religious Bodies.

Playing: MLB The Show.

Reading: Ted Gioia, Healing Songs.

Watching: Mythic Quest. The series as a whole is dumb yet charming, but the standalone episode “A Dark Quiet Death” was truly moving.