Code Library

Open-source data science tools for students and scholars

Everything here runs on free or open-source software on a reasonably modern laptop or desktop. I'll try to illustrate my code with interesting real data; you can choose to read along or code along.

Please install R and RStudio on your machine, in that order. Welcome to programming: we'll be doing a lot of copy + paste.

Recommended guides to complement the content on this page:

  1. Ismay and Kim - Getting Started with Data in R

  2. Wickham and Grolemund - R for Data Science

Coming soon...

Cleaning and prepping a dataset for semantic analysis

Using an LDA Topic Model to sample from a large document dataset

Using an LDA Topic Model to compare the thematic fingerprints of authors in a large document dataset

Creating a co-author network from data exported from Web of Science

Creating a co-citation network from data exported from Web of Science

Constructing Non-linear Online Conversations (NOCs) from Twitter