In this course you will learn to apply computational methods to create historical arguments. You will learn to work with historical data, including finding, gathering, manipulating, analyzing, visualizing, and arguing from data, with special attention to geospatial, textual, and network data. These methods will be taught primarily through scripting in the R programming language. While historical methods can be applied to many topics and time periods, they cannot be understood separate from how the discipline forms meaningful questions and interpretations, nor divorced from the particularities of the sources and histories of some specific topic. You will therefore work through a series of example problems using datasets from the history of nineteenth-century U.S. religion, and you will apply these methods to a dataset in your own field of research.
After taking this course, you will be able to:
perform exploratory data analysis; clean, tidy, and manipulate data; gather historical data from print and manuscript sources; use existing historical data sets; create common visualizations; work with geospatial, textual, and network data.
write scripts using the R programming language and its extensive set of packages.
understand the place of data analysis and visualization within humanities computing, digital history, and the discipline of history.
conceive of and execute a research project in computational history suitable for treatment in a dissertation chapter or journal article.
take the course “Programming in History/New Media,” a.k.a. Clio 3, should you choose.
Bring a computer to each class meeting. We will use R and RStudio. Install them on your own computer. You will also have access to an RStudio Server instance which will let you use R in your browser. Much of your work for the course will go on GitHub, so sign up for an account.
All required readings are available online for free or through the GMU libraries, though they can also be purchased (sometimes in more complete editions) in print or as e-books. These are the books we will use most frequently.
Be prepared. Preparation and participation are expected as a matter of course in a graduate class. Complete all readings and assignments before class. If the readings include sample code or questions at the end, work through them as part of doing the readings.
Worksheets and weekly assignments (20%). Many classes will have an assignment due before class begins. Some will require you to do library research; others will be practice data analysis worksheets. Some of the questions on the worksheet will be easy; most will be difficult; some you may find nearly impossible. The aim is to practice. We will go over the worksheets in class each week. If you attempt a problem and can’t solve it, you should still turn in whatever work you did on it. Students who complete all the easy and moderate-difficulty questions, attempt the very difficult questions, and ask for help as needed will do just fine. These assignments will be graded by completion, with three levels: “incomplete,” “acceptable,” “excellent.” Unless otherwise specified, these assignments should be submitted as a PDF or a standalone HTML file, one file per assignment. Name them like this: Mullen-worksheet-week02.pdf. Submit them to this Dropbox folder.
Analysis assignments (3 @ 15% each). You will do three analysis assignments, each demonstrating a specific skill in data analysis. For these assignments you will be given a historical dataset and asked some interpretative questions. You will prepare an RMarkdown document containing prose, code, and tables or visualizations to answer the historical questions and, as necessary, explain your methods. You will be given a starter GitHub repository with the data and questions. Submit your final analysis as a PDF to this Dropbox folder. You will also be evaluated on the code in your GitHub repository, which I must be able to run on my computer.
R package tutorial (10%). At our second meeting, you will pick from a list of R packages not covered in this class. You will be assigned a week (beginning at week 7) during which you will teach the class for 15 minutes about the topic you selected. As part of that teaching, you will prepare a PDF handout. That handout should include these parts: (1) a one- to two-paragraph summary of what the package does and why it is useful; (2) a brief section of example code and results; (3) a bulleted list of examples (historical if possible) where the package was used. A draft of that handout is due to me one week before you are scheduled to teach. I will offer feedback, and you will give the class a revised version in Slack on the Friday before you teach.
Research paper (25%). You will write one research paper suitable for a presentation at a disciplinary or digital humanities conference (see for example the CFP for Current Research in Digital History, or the CFP for the major conference in your field). This paper must advance a historical argument using data analysis of a set of sources that you choose from your research interests. Submit this paper as a PDF or self-contained HTML file (if it includes interactive visualizations) to this Dropbox folder. Further instructions will be given throughout the semester. Due May 10.
Find primary source data tables, datasets, or corpora from your field of historical research. At least one of these must be a source which can be transcribed into a tabular dataset in a later week. Post full citations and URLs in the Slack group, along with a sentence or two explaining what you’ve found. Examine the links that other people post before class.
Wickham and Grolemund, R for Data Science, ch. 1, 4, 6, 8.
Shari Rabin, “‘Let us Endeavor to Count Them Up’: The Nineteenth-Century Origins of American Jewish Demography,” American Jewish History 101, no. 4 (2017): 419–440, https://doi.org/10.1353/ajh.2017.0060.
Wickham and Grolemund, R for Data Science, ch. 12–13.
Healy, Data Visualization, ch. 4–5, 8.
John Theibault, “Visualizations and Historical Arguments,” in Writing History in the Digital Age, ed. Jack Dougherty and Kristen Nawrotzki (University of Michigan Press, 2013), https://doi.org/10.3998/dh.12230987.0001.001.
Stephen Robertson, “Putting Harlem on the Map,” in Writing History in the Digital Age, ed. Jack Dougherty and Kristen Nawrotzki (University of Michigan Press, 2013).
Week 9 (Mar. 26): Text analysis
Mapping assignment due (see GitHub repository for data and instructions).
Silge and Robinson, Tidy Text Mining with R, ch. 1–2, 4–7.
Wickham and Grolemund, R for Data Science, ch. 14.
Graham, Milligan, Weingart, Macroscope, ch. 3–4.
Tim Hitchcock and William J. Turkel, “The Old Bailey Proceedings, 1674–1913: Text Mining for Evidence of Court Behavior,” Law and History Review 34, no. 4 (2016): 929–955, https://doi.org/10.1017/S0738248016000304.
Week 10 (Apr. 2): Text analysis
Matthew L. Jockers and Ted Underwood, “Text-Mining the Humanities,” in A New Companion to Digital Humanities, ed. Susan Schreibman, Ray Siemens, and John Unsworth (Wiley, 2016), 291–306. Available through the GMU library.
Matthew K. Gold et al., “Forum: Text Analysis at Scale,” in Debates in the Digital Humanities 2016 (University of Minnesota Press, 2016), 525–568.
Ryan Cordell, “Reprinting, Circulation, and the Network Author in Antebellum Newspapers,” American Literary History 27, no. 3 (2015): 417–445, https://doi.org/10.1093/alh/ajv028.
David A. Smith, Ryan Cordell, and Abby Mullen, “Computational Methods for Uncovering Reprinted Texts in Antebellum Newspapers,” American Literary History 27, no. 3 (2015): E1–E15, https://doi.org/10.1093/alh/ajv029.
Week 11 (Apr. 9): Network analysis
Text analysis assignment due (see GitHub repository for data and instructions).
Matthew Lincoln, “Social Network Centralization Dynamics in Print Production in the Low Countries, 1550–1750,” International Journal for Digital Art History 2 (2016): 134–157, https://doi.org/10.11588/dah.2016.2.25337.
Network analysis assignment due (see GitHub repository for data and instructions).
Topic and readings to be determined by the needs of student research papers.
Week 14 (Apr. 30): TBD
Topic and readings to be determined by the needs of student research papers.
This syllabus may be updated online as necessary. The online version of this syllabus is the only authoritative version.
Students must satisfactorily complete all assignments (including participation assignments) in order to pass this course. Your attendance is expected at every meeting. If you must be absent, I request that you notify me in advance of the class meeting. I am sometimes willing to grant extensions for cause, but you must request an extension before the assignment’s due date. For every day or part of a day that an assignment is late without an extension, I may reduce your grade. No work (other than final exams and final projects) will be accepted later than the last day that the class meets. I will discuss grades only in person during office hours.
See the George Mason University catalog for general policies, as well as the university statement on diversity. You are expected to know and follow George Mason’s policies on academic integrity and the honor code. If you are a student with a disability and you need academic accommodations, please see me and contact the Office of Disability Services at 703-993-2474 or through their website. All academic accommodations must be arranged through that office. You are responsible for verifying your enrollment status. Please note these dates from the academic calendar.
Last day to add a class or drop a class without penalty: January 29, 2018.
Last day to drop a class without special permission: February 23, 2018.