Introduction

This workshop shows how to get started with R. It focuses on using R to analyse language data, but it is useful for anyone who is new to R. It is aimed at beginners and shows how to set up an R session in RStudio, how to set up R projects, and how to perform basic operations on text using R.

By the end of this workshop, you should be able to:

  • Get started with R
  • Orient yourself in R and RStudio
  • Work with text data at a basic level: load and save text data, clean text data, and create simple plots
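
As a rough illustration of the text-data basics listed above, the following sketch saves, loads, and cleans a small text and then plots word frequencies. It uses only base R. The example sentences and the temporary file are made up for illustration and are not part of the workshop materials.

```r
# Hypothetical example text; in practice this would come from your own file
text <- c("This is  a FIRST example sentence!", "And here is a second one.")

# Save the text to a temporary file, then load it again
path <- tempfile(fileext = ".txt")
writeLines(text, path)
loaded <- readLines(path)

# Clean: convert to lower case, remove punctuation, collapse repeated spaces
clean <- tolower(loaded)
clean <- gsub("[[:punct:]]", "", clean)
clean <- gsub("\\s+", " ", clean)
clean <- trimws(clean)

# Split into words, count how often each occurs, and draw a simple bar plot
words <- unlist(strsplit(clean, " "))
freq <- sort(table(words), decreasing = TRUE)
barplot(freq, las = 2, main = "Word frequencies")
```

With real data, `path` would point to a text file inside your R project rather than to a temporary file.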

This workshop is intended for beginners with little or no previous experience using R. No prior knowledge of R is required.

Installing R and RStudio

  • You have NOT yet installed R on your computer?

    • You have a Windows computer? Then click here for downloading and installing R

    • You have a Mac? Then click here for downloading and installing R

  • You have NOT yet installed RStudio on your computer?

    • Click here for downloading and installing RStudio.

A more detailed explanation of how to download and install R and RStudio, created by the UQ Library, is available here.

Introduction to the Workshop

Are you looking to improve your text analysis skills using R? This introductory session will focus on setting up efficient workflows for analysing language data in R and RStudio. The workshop is ideal for participants with basic R skills who want to explore how R can be applied to text analysis. In this hands-on session, you will learn:

  • How to properly configure R and RStudio for text analysis projects
  • Techniques for loading, cleaning, and processing language data in R
  • How to generate concordances (keyword-in-context displays) to better understand word usage in context
  • Methods for extracting collocations and identifying keywords in your dataset
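
To make the idea of a concordance concrete, here is a minimal keyword-in-context sketch in base R. The function name `kwic_simple`, its `window` parameter, and the example sentence are hypothetical and chosen for illustration only; they are not the workshop's own code. The tokenizer is deliberately crude and assumes plain English text, since it keeps only the letters a to z.

```r
# Minimal keyword-in-context (KWIC) sketch; kwic_simple is a hypothetical helper
kwic_simple <- function(text, keyword, window = 2) {
  # Crude tokenization: lower-case, then split on anything that is not a-z
  tokens <- unlist(strsplit(tolower(text), "[^a-z]+"))
  tokens <- tokens[tokens != ""]

  # Positions at which the keyword occurs
  hits <- which(tokens == tolower(keyword))

  # Up to `window` tokens to the left of each hit
  left <- vapply(hits, function(i) {
    idx <- seq(i - window, i - 1)
    paste(tokens[idx[idx >= 1]], collapse = " ")
  }, character(1))

  # Up to `window` tokens to the right of each hit
  right <- vapply(hits, function(i) {
    idx <- seq(i + 1, i + window)
    paste(tokens[idx[idx <= length(tokens)]], collapse = " ")
  }, character(1))

  data.frame(left = left, keyword = tokens[hits], right = right,
             stringsAsFactors = FALSE)
}

# Hypothetical example sentence
res <- kwic_simple("The cat sat on the mat. The dog chased the cat away.", "cat")
print(res)
```

Each row of the result is one occurrence of the keyword with its neighbouring words. For real projects, a dedicated package is usually a better choice than a hand-written helper; quanteda, for example, provides a `kwic()` function.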

This session will focus on best practices and optimal workflows, helping you work more efficiently and accurately when processing language data. Whether you’re a linguist, social scientist, or researcher in the humanities, this workshop will give you the skills to get the most out of your data analysis in R.

Dr Sam Hames

Dr Sam Hames is a research fellow in computational humanities with UQ’s School of Languages and Cultures and also works on the Language Data Commons of Australia and the Australian Text Analytics Platform. Sam’s PhD was on machine learning for medical imaging analysis, and he has an extensive background as a data-focused software developer supporting social media and web researchers. His primary research focus is to understand how computation can enable qualitative and interpretive inquiry across the humanities and social sciences.

Dr Martin Schweinberger

Dr Martin Schweinberger is a Lecturer in Applied Linguistics in the School of Languages and Cultures at the University of Queensland (UQ). Martin has specialized in computational approaches to analysing language data with a focus on corpus linguistics and quantitative analyses. His research interests lie in language variation and change, language use and acquisition, and reproducibility in the language sciences. Martin is co-director of the Language Technology and Data Analysis Laboratory (LADAL), board member of the International Computer Archive of Modern and Medieval English (ICAME), Vice-President Profession of the International Society for the Linguistics of English (ISLE), as well as CI at the Language Data Commons of Australia (LDaCA) and the Australian Text Analytics Platform (ATAP).

This repo was initially generated from a bookdown template available here: https://github.com/jtr13/bookdown-template.