Computational History (Spring 2020)

This syllabus comes from https://lincolnmullen.com/courses/data.2020/. Only the online version of this syllabus is authoritative, and it may be updated as necessary.

Course: HIST 697-001. Spring 2020. Department of History and Art History, George Mason University. 3 credits. Meets Mondays, 7:20–10:00pm in Music Theater Building 1008.

Instructor: Lincoln Mullen <lmullen@gmu.edu>. Office: Research Hall 484. Office hours: By appointment. Book an appointment.

Important updates

This syllabus has been modified for the move to online classes for the remainder of the semester. This online syllabus will be kept up to date and is your best source of guidance for the requirements of this course.

The key changes are these:

  1. Class will meet at 7:20 p.m. on Mondays via Webex. Class sessions will be recorded for anyone who cannot make it at that time or who might experience technical difficulties.

  2. I will provide written tutorials of the techniques we are learning, to the extent possible. The tutorials will be disseminated via Slack. These will be in addition to the in-class explanations and sample code customarily provided.

  3. The class calendar and some readings have changed because of the change in the university calendar. See the schedule below.

  4. While the final project will retain the same emphasis on independent data analysis that produces historical insight for your field, the details of the assignment have changed. See the assignments section below.

  5. I will continue to be available to you in office hours, but now via Webex. In fact, the variety of times of day when I will be available will be much greater. Here’s how to meet with me individually.

What won’t change is that I am committed to you and your success in the course. Please let me know whenever you need help.

Course description

In this course you will learn to use computational methods to create historical interpretations. You will work with historical data, which includes finding, gathering, manipulating, analyzing, visualizing, and arguing from datasets, with special attention to geospatial, textual, and network data. These methods will be taught primarily using the R programming language. While data analysis methods can be applied to many topics and time periods, they cannot be understood separate from how the discipline forms meaningful questions and interpretations, nor divorced from the particularities of the sources and histories of some specific topic. You will therefore work through a series of example problems using datasets from the history of the nineteenth-century United States, and then apply the methods to write a research paper using a dataset from your own historical field.

Learning goals

After taking this course, you will be able to

Essential information

Most required readings are available online or through the GMU libraries. These are the main books that we will be using.

This is a graduate methods course in a field that moves reasonably quickly. The syllabus is likely to change over the course of the semester. In particular, I am likely to send you additional projects or visualizations to look at before class, which should be treated the same as other assigned readings.

All communication for this course will happen in our Slack group. Read this getting started guide if you need help. The Slack group is your primary place to ask for help. Please ask for help in the public channels rather than private messages. You are almost certainly not the only person to have your question, and asking and answering questions publicly benefits everyone. When you ask a question, help me help you by including the code that you are asking about and any error messages that are relevant.

You are always welcome to talk with me during office hours via Webex. My office hours page has instructions on how to book an appointment and connect to a Webex session. If the scheduled times don’t work for you, please contact me and suggest a few other times that would work for you.

Bring a computer to each class meeting. For the most part, we will be using an RStudio Server instance hosted by RRCHNM, which you can log in to using a web browser. But you should also install some key software on your computer. See the list under the heading for the first week. I will assume that you have a computer with some kind of Unix-like operating system available. The easiest will be macOS or a Linux distribution. But if you use Windows, good news: R has very good support for Windows.

In general I have provided datasets and questions for you to work on for all the assignments except the final paper. But for any assignment, you may substitute a dataset from your own historical field after checking with me. The forward-thinking graduate student will try to find such datasets early in the semester in order to use the intermediate assignments as preparation for the final assignment. If you can peer even farther into the future, you could try to use the final assignment as a test run for work you might want to do in one of your own research projects, such as a conference presentation, article, or dissertation.

Assignments

For each assignment, you should send me the completed HTML file knit from your RMarkdown document. Please submit the assignments via the Blackboard page for this class. Send the assignments before the start of class on the day on which they are due.

Preparation and participation are expected as a matter of course in a graduate class. Complete all readings and submit all assignments before class. If the readings include sample code or questions at the end, work through them as part of doing the readings, though you do not need to submit them and I will not check them. Final grades will be calculated using the typical percentage-based grading scale (A = 93–100, A- = 90–92, B+ = 88–89, B = 83–87, B- = 80–82, … F = 0–59).

Worksheets and weekly assignments (25%). Many classes will have an assignment due before class begins. Some will require you to do library research; others will be practice data analysis worksheets. Some of the questions on the worksheets will be easy; most will be difficult; some you may find nearly impossible. The aim is to practice. We will go over the worksheets in class each week. If you attempt a problem and can’t solve it, you should still turn in whatever work you did on it. Students who complete all the easy and moderately difficult questions, attempt the very difficult questions, and ask for help as needed will do just fine. These assignments will be graded on completion.

Analysis assignments (4 × 10% = 40%). You will do four analysis assignments, each demonstrating a specific skill in data analysis. For these assignments you will be given a historical dataset and asked some interpretative questions. You will prepare an RMarkdown document containing prose, code, and tables or visualizations to answer the historical questions and, as necessary, explain your methods. For these assignments I will provide a dataset that you can work with (but see below).

Final project (35%). You will designate one of the analysis assignments as a stepping stone to your final project. For that analysis assignment, you will use the same dataset that you will use for the final project. You will try out one of the methods we are learning on that dataset. In addition to the normal feedback that I will provide on an assignment, I will also give you guidance about how to refine and expand your analysis, visualizations, and interpretations. Then, you will expand and revise the work you did in the analysis assignment for the final project. This expanded version should include more prose and citations, not to exceed 1,500 words. The visualizations and data analysis should be expanded if necessary and refined in each case to the level of quality that would be expected in a published article. Each table and figure must have a caption written in complete sentences. Explain your methods as needed, but write in a way which would be understandable and compelling to any historian working in your field. The final assignment will be evaluated according to two primary criteria: (1) Did the visualizations significantly improve in refinement and quality? (2) Does the combination of prose and visualizations convey a meaningful historical argument? Due Monday, May 18 at 5pm.

Schedule

Week 1 (January 27): Introduction to computational history

Assignment:

Readings:

Do your level best to get these set up before the first day of class:

These are mostly optional, but it would be helpful to have them:

Week 2 (February 3): Data from history and historians

Assignment:

Readings:

Browse:

Week 3 (February 10): Basics of R

Assignment:

Readings:

Week 4 (February 17): Data manipulation

Assignment:

Readings:

Week 5 (February 24): Data visualization

Assignment:

Readings:

Week 6 (March 2): Exploratory data analysis

Assignment:

Readings:

Spring break (March 9)

Extended spring break (March 16)

Week 7 (March 23): Maps

Assignment:

Readings:

For reference:

Week 8 (March 30): Networks

Assignment:

Readings:

Browse:

For reference:

Week 9 (April 6): Texts

Assignment:

Readings:

For reference:

Week 10 (April 13): Word embeddings

Readings:

Week 11 (April 20): Clustering (unsupervised classification)

Assignment:

Readings:

Week 12 (April 27): Prediction (supervised classification)

Readings:

Week 13 (May 4): Next steps with computational history

Readings:

Week 14 (May 11): Final project workshop

Assignment:

Fine print

This syllabus may be updated online as necessary. The online version of this syllabus is the only authoritative version.

Students must satisfactorily complete all assignments in order to pass this course. I am sometimes willing to grant extensions on assignments for cause, but you must request an extension before the assignment’s due date. For every day or part of a day that an assignment is late without an extension, I may reduce your grade. No work (other than final projects) will be accepted after the last day that the class meets. I will discuss grades only in person during office hours.

See the George Mason University catalog for general policies, as well as the university statement on diversity. You are expected to know and follow George Mason’s policies on academic integrity and the honor code. If you are a student with a disability and you need academic accommodations, please see me and contact the Office of Disability Services at 703-993-2474 or through their website. All academic accommodations must be arranged through that office. You are responsible for verifying your enrollment status. Please note the dates for dropping and adding courses from the GMU academic calendar.

This syllabus draws ideas and assignments from many people and syllabi, including Taylor Arnold, Andrew Goldstone, Jason Heppler, Ben Schmidt, and Lauren Tilton.