Clio 2: Computational History (Spring 2019)

This syllabus comes from https://lincolnmullen.com/courses/clio2.2019/. Only the online version of this syllabus is authoritative, and it may be updated as necessary.

Course: HIST 697-001. Spring 2019. Department of History and Art History, George Mason University. 3 credits. Meets Mondays, 7:20–10:00pm in RRCHNM conference room, Research Hall 402.

Instructor: Lincoln Mullen <lmullen@gmu.edu>. Office: Research Hall 457. Office hours: By appointment. Book an appointment.

Course description

In this course you will learn to use computational methods to create historical arguments. You will work with historical data, including finding, gathering, manipulating, analyzing, visualizing, and arguing from data, with special attention to geospatial, textual, and network data. These methods will be taught primarily through scripting in the R programming language. While historical methods can be applied to many topics and time periods, they cannot be understood separate from how the discipline forms meaningful questions and interpretations, nor divorced from the particularities of the sources and histories of some specific topic. You will therefore work through a series of example problems using datasets from the history of the nineteenth-century United States, and you will apply these methods to a dataset in your own field of research.

Additional emphasis will be placed on publishing your scholarship on the web. While this is not a course in web development, you will learn the basics of publishing documents on the web, including familiarity with command line programs, getting files onto servers, basic web technologies such as HTML and CSS, Git and GitHub, and packages such as RMarkdown for publishing data analysis.

In other words, this class will teach you how to have something historically meaningful to say from data, and how to publish what you want to say on the web.

Learning goals

After taking this course, you will be able to

Essential information

This is a graduate methods course in a field that moves reasonably quickly. The syllabus is likely to change over the course of the semester. In particular, I am likely to send you additional projects or visualizations to look at before class.

You are always welcome to talk with me during office hours. While you can drop in, I strongly encourage you to book an appointment. If the scheduled times don’t work for you, email me and suggest a few other times that would work for you.

All communication for this course will happen in our Slack group. Read this getting started guide if you need help. Slack is your primary place to ask for help.

Bring a computer to each class meeting. See the list of software that you will need to install under the heading for the first week. This class is going to assume that you have a computer with some kind of Unix-like operating system available. The easiest will be macOS or a Linux distribution. But if you use Windows, good news: you can install the Windows Subsystem for Linux, though after that you are mostly on your own to figure out the peculiarities of Windows.

You will need a basic web server and your own domain. If you do not already have a domain and web hosting, the personal shared hosting from Reclaim Hosting will be more than adequate.

One textbook is required in print (though it is partially available online).

All the other required readings are available online or through the GMU libraries, though they can also be purchased—sometimes in more complete editions—in print or e-books. These are the books we will use most frequently.

In general I have provided datasets and questions for you to work on for all the assignments except the final paper. But for any assignment, you may substitute a dataset from your own historical interests after checking with me. If you are forward-thinking, you will try to find such datasets early in the semester so that you can use the intermediate assignments as preparation for your final assignment. If you can peer even farther into the future, you can try to use the final assignment as a test run of work you might want to do in one of your own research projects, such as an article or a dissertation.

Assignments

Assignments should be submitted via this form unless otherwise instructed. For each assignment you will be expected to turn in two things. The first will be a web page with a public-facing presentation of your work. The other will be a GitHub repository with the source code that creates that web page.

Preparation and participation are expected as a matter of course in a graduate class. Complete all readings and submit all assignments before class. If the readings include sample code or questions at the end, work through them as part of doing the readings. Final grades will be calculated using the typical percentage-based grading scale (A = 93–100, A- = 90–92, B+ = 88–89, B = 83–87, B- = 80–82, … F = 0–59).

Worksheets and weekly assignments (20%). Many classes will have an assignment due before class begins. Some will require you to do library research; others will be practice data analysis worksheets. Some of the questions on the worksheets will be easy; most will be difficult; some you may find nearly impossible. The aim is to practice. We will go over the worksheets in class each week. If you attempt a problem and can’t solve it, you should still turn in whatever work you did on it. Students who complete all the easy and moderately difficult questions, attempt the very difficult questions, and ask for help as needed will do just fine. These assignments will be graded for completion.

Analysis assignments (3 × 15% = 45%). You will do three analysis assignments, each demonstrating a specific skill in data analysis. For these assignments you will be given a historical dataset and asked some interpretative questions. You will prepare an RMarkdown document containing prose, code, and tables or visualizations to answer the historical questions and, as necessary, explain your methods. You will be given a starter GitHub repository that you can fork with the data and questions.

Research paper (35%). You will write a research paper suitable for a presentation at a disciplinary or digital humanities conference. This paper should advance a historical argument on the basis of computational historical methods, though you can and should use more traditional historical methods as necessary. The body of the paper should be about 2,000 words in length. It should include notes in Chicago format like any other work of history. The paper should include embedded visualizations or tables as appropriate. Each table and figure must have a caption written in complete sentences. The paper should be attractively presented on your website using the Radix RMarkdown format. Explain your methods as needed, but write in a way that would be understandable and compelling to any historian working in your field. The paper should be accompanied by a GitHub repository containing your data and code in a reproducible analysis. Ideally this paper could be presented at a conference, and it could serve as a trial for computational work you might do in a larger research project. As a model, see the most recent CFP for Current Research in Digital History. Due Monday, May 13 at 5pm.

Schedule

Week 1 (January 28): The web

Do your level best to get these set up before the first day of class:

Read:

Week 2 (February 4): Data from history and historians

Assignment:

Read:

Browse:

Week 3 (February 11): Basics of R

Assignment:

Read:

Browse:

Week 4 (February 18): Data manipulation

Assignment:

Read:

Week 5 (February 25): Data visualization

Assignment:

Read:

Week 6 (March 4): More data manipulation and visualizations

Assignment:

Read:

Spring break (March 11)

Week 7 (March 18): Exploratory data analysis

Assignment:

Read:

Week 8 (March 25): Mapping

Assignment:

Read:

Browse:

Week 9 (April 1): Mapping

Read:

Browse:

Week 10 (April 8): Text analysis

Assignment:

Read:

Browse:

Week 11 (April 15): Text analysis

Read:

Week 12 (April 22): Network analysis

Assignment:

Read:

Week 13 (April 29): Supervised classification

Read:

Week 14 (May 6): TBD

Topic and readings to be determined by the needs of student research papers.

Read:

Possible topic:

Fine print

This syllabus may be updated online as necessary. The online version of this syllabus is the only authoritative version.

Students must satisfactorily complete all assignments (including participation assignments) in order to pass this course. Your attendance is expected at every meeting. If you must be absent, I request that you notify me in advance of the class meeting. I am sometimes willing to grant extensions on assignments for cause, but you must request an extension before the assignment’s due date. For every day or part of a day that an assignment is late without an extension, I may reduce your grade. No work (other than final projects) will be accepted after the last day that the class meets. I will discuss grades only in person during office hours.

See the George Mason University catalog for general policies, as well as the university statement on diversity. You are expected to know and follow George Mason’s policies on academic integrity and the honor code. If you are a student with a disability and you need academic accommodations, please see me and contact the Office of Disability Services at 703-993-2474 or through their website. All academic accommodations must be arranged through that office. You are responsible for verifying your enrollment status. Please note the dates for dropping and adding courses from the GMU academic calendar.