Code

The software packages below are actively maintained. Where appropriate, they have been peer-reviewed and are accompanied by a published paper describing the software. If you are interested in the code I have written for particular analyses, those repositories are listed with the associated publications or projects.

tokenizers: Fast, consistent tokenization of natural language text

JOSS paper | Package website | GitHub repository | CRAN

textreuse: Detect text reuse and document similarity

Package website | GitHub repository | CRAN

USAboundaries: Historical and contemporary boundaries of the United States of America

JOSS paper | Package website | GitHub repository | CRAN

historydata: Datasets for historians

Package website | GitHub repository | CRAN

internetarchive: An R client for the Internet Archive API

Package website | GitHub repository | CRAN

gender: Predict gender from names using historical data

DHQ paper | Package website | GitHub repository | CRAN