Code Library

Open-source data science tools for students and scholars

Everything here runs on free or open-source software, on a reasonably modern laptop or desktop. I'll try to illustrate my code with interesting real data; you can choose to read or code along.

Please install R and RStudio on your machine, in that order. Welcome to programming: we'll be doing a lot of copy + paste.

Recommended guides to complement the content on this page:

  1. Ismay and Kim - Getting Started with Data in R

  2. Wickham and Grolemund - R for Data Science

Requirements: An internet connection and a Twitter developer account with an approved Academic Track project.


Requirements: A tabular text dataset, with one row per document and all the text in one column.
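If you're not sure what that shape looks like, here is a minimal sketch in base R. The column names doc_id and text, and the three example documents, are placeholders I made up for illustration; your own data can use any names, as long as each row is one document and the full text sits in a single column.

```r
# Hypothetical example data: one row per document, all text in one column.
# The column names (doc_id, text) are illustrative assumptions, not required names.
docs <- data.frame(
  doc_id = c("doc1", "doc2", "doc3"),
  text = c(
    "First document text goes here.",
    "Second document text goes here.",
    "Third document text goes here."
  ),
  stringsAsFactors = FALSE
)

# Quick sanity checks on the expected shape:
# the text column holds character strings, and each document ID appears once.
stopifnot(is.character(docs$text))
stopifnot(!anyDuplicated(docs$doc_id))

# Peek at the structure of the table.
str(docs)
```

If your data lives in a spreadsheet or CSV file, the same checks should work once you have read it into a data frame.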


Requirements: A tabular text dataset, with one row per document and all the text in one column (analysis dataset provided in-page).


To cite this page for code or research methodology, please use:

Bhardwaj, A. (2022) Code Library: Open-source data science tools for students and scholars. https://www.abhardwaj.net/code



Coming soon...

Cleaning and prepping a dataset for semantic analysis

Using an LDA Topic Model to sample from a large document dataset

Using an LDA Topic Model to compare the thematic fingerprints of authors in a large document dataset

Creating a co-author network from data exported from Web of Science

Creating a co-citation network from data exported from Web of Science

Constructing Non-linear Online Conversations (NOCs) from Twitter