Code Library
Open-source data science tools for students and scholars
Everything here runs on free or open-source software on a reasonably modern laptop or desktop. I'll try to illustrate my code with interesting real data; you can choose to read or code along.
Please install R and RStudio on your machine, in that order. Welcome to programming: we'll be doing a lot of copy + paste.
Recommended guides to complement the content on this page:
Requirements: An internet connection and a Twitter developer account with an approved Academic Track project.
Requirements: A tabular text dataset with one row per document and all the text in one column.
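As a rough illustration of that layout, here is a minimal sketch in base R. The column names `doc_id` and `text` and the three example rows are made up for illustration; your own data will have different names and content.

```r
# Hypothetical example data: the column names doc_id and text are
# illustrative assumptions, not names any tutorial here requires.
docs <- data.frame(
  doc_id = c("doc1", "doc2", "doc3"),
  text = c(
    "Topic models summarise large collections of documents.",
    "Co-author networks link scholars who publish together.",
    "Each row holds the full text of exactly one document."
  ),
  stringsAsFactors = FALSE
)

# Basic checks: one row per document, all text in a single character column.
stopifnot(!anyDuplicated(docs$doc_id))
stopifnot(is.character(docs$text), !any(is.na(docs$text)))

# Write the table to a plain-text CSV file and read it back in.
path <- tempfile(fileext = ".csv")
write.csv(docs, path, row.names = FALSE)
docs_in <- read.csv(path, stringsAsFactors = FALSE)

# Inspect the structure of the re-imported table.
str(docs_in)
```

If your file passes checks like these after `read.csv()`, it is probably in a workable shape; any extra columns (author, date, and so on) can sit alongside the text column.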
Coming soon...
Cleaning and prepping a dataset for semantic analysis
Using an LDA Topic Model to sample from a large document dataset
Using an LDA Topic Model to compare the thematic fingerprints of authors in a large document dataset
Creating a co-author network from data exported from Web of Science
Creating a co-citation network from data exported from Web of Science
Constructing Non-linear Online Conversations (NOCs) from Twitter