An interactive swirl course that focuses on tools for the computational analysis of language and text

swirlTA/toolsfortextanalysis

This online course was created during the pandemic and designed specifically for the University of Zurich students who enrolled in Autumn 2020. Besides the swirl lessons, the course includes a number of accompanying videos that introduce some of the most relevant concepts as well as the data used in the swirl lessons. You can find the videos here. As you will see, these videos were only intended for use within that specific course. They will be renewed soon, but for now we appreciate your understanding… And we hope you get a laugh out of them!

Overview

In this swirl course you will learn the basics of the programming language R and will apply it to the computational analysis of language and text, including text mining, regular expressions and data cleaning, analysis and visualisation. The goal of this course is twofold:

  1. To learn how to digitally obtain different kinds of information from large quantities of texts, and
  2. To learn how to digitally clean, analyse and visualise data. Although most of the examples come from linguistics, all the techniques are also applicable to literary analysis.

Some of the skills that we'll learn are:

  1. Basics of the R programming language.
  2. Basic data analysis and data visualization with the "tidyverse".
  3. Calculating some basic data of a text: how many words/sentences/n-grams does it have? Which are the most frequent ones?
  4. Designing and searching regular expressions.
  5. Automatic language recognition.
  6. Basic lemmatization of a text.
  7. Topic modelling and sentiment analysis.
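
As a rough illustration of items 3 and 4, the following base R sketch counts words, sentences and bigrams in a toy sentence and runs a simple regular expression search. The example text and all variable names are made up for illustration and do not come from the course materials; the swirl lessons may use different functions and packages.

```r
# Toy text; a hypothetical stand-in for the course data.
text <- "The cat sat on the mat. The dog sat too!"

# Lower-case the text and split it on anything that is not a letter.
words <- strsplit(tolower(text), "[^a-z]+")[[1]]
words <- words[words != ""]

# Number of word tokens, and a frequency table sorted from most to least frequent.
n_words <- length(words)
freq <- sort(table(words), decreasing = TRUE)

# Rough sentence count: number of runs of ".", "!" or "?".
n_sentences <- lengths(regmatches(text, gregexpr("[.!?]+", text)))

# Bigrams: pair each word with the word that follows it.
bigrams <- paste(head(words, -1), tail(words, -1))

# Regular expression search: all words ending in "at".
at_words <- grep("at$", words, value = TRUE)

print(n_words)
print(head(freq, 3))
print(at_words)
```

Splitting on non-letters is a deliberately naive tokeniser; it only handles unaccented lower-case Latin letters, so real texts will usually need a more careful approach.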

We will analyse different kinds of texts: literary texts, tweets, WhatsApp chats…

Program

The lessons in the swirl course and the accompanying videos are grouped into thematic blocks. In the 'Program' tab you can see which video goes with which lesson.

How to get set for the course

R

In this course we're going to learn how to use the R programming language for textual analysis. For that we need to install R (macOS / Windows). Because we will work with an editor that makes our life easier and our work prettier, we will also need to install RStudio.

swirl

For this course you will also need to install swirl, an R package for teaching and learning R programming. It is interactive and very easy to use. Once you have R and RStudio installed and running on your computer, you can install swirl easily: go to swirl's website and follow the steps that are listed there.
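
For orientation, a typical installation from the R console looks roughly like the sketch below. It assumes swirl is installed from CRAN with the standard `install.packages()` call; the steps on swirl's website remain the authoritative reference.

```r
# Install swirl from CRAN (only needed once per R installation).
install.packages("swirl")

# Load the package and start the interactive session in the console.
library(swirl)
swirl()
```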

Once swirl is installed and started, the console in R will start "talking to you": follow the instructions there. It will ask you for a name and offer you some courses. The course you want to take is called "Tools for text analysis". Simply follow the instructions to download and start the course!

Go to the swirl help page if you need more information or check out the course forum.

Course forum

You can post questions or difficulties you have in this course forum. Some general difficulties and issues have already been discussed, so you might find the solution to your problem there. Keep in mind that this is a public GitHub repository, so refrain from sharing personal information (like private email addresses).

Downloads

For some lessons in the swirl course you need .csv or .txt files. In the 'Download' tab you can find instructions to download them.
