Instructors

This guide provides information on how to structure your course, how to use the textbook, and how to access additional resources to support your teaching.

Overview

The structure of the book takes an approach which first introduces the conceptual aspects and then the practical aspects and implementation strategies of quantitative text analysis. To understand how you might approach teaching with this book, it is important to understand the structure of the book and how the resources are connected to the book.

The Data to Insight Hiearchy, introduced in the Preface, is used as a framework to understand the process of text analysis.

Figure 1: Data to Insight Hierarchy

Figure 1 illustrates how the relationship between each of the parts of the book and the overall goal of the book. Part I is an orientation to the area of text analysis. Subsequently, Part II, III, IV, and V cover the foundational concepts, preparation, analysis, and contribution, respectively.

The book, its parts, and their chapters are connected to a series of materials to develop student experience and confidence in working with the concepts and practical strategies employed in text analysis research.

In Table 1, the chapters are connected to lessons and labs that can be used to support the learning process. The lessons are interactive programming tutorials that introduce students to essential R programming skills. Recipes elaborate on concepts in the chapter and provide more engagement with the programming techniques from the lesson(s) and chapter to further develop student experience in the practice of data analysis. The labs are hands-on exercises that allow students to practice the skills they have learned throughout the sequence and apply it in a more free-form environment.

Table 1: Resource overview
Part Chapter Lesson1 Recipe Lab2
Preface Intro to Swirl Literate Programming and Quarto Writing with code
Part I: Orientation
Text analysis Workspace
Vectors
Academic writing with Quarto Crafting scholarly documents
Part II: Foundations
Data Objects
Packages and functions
Reading, inspecting, and writing datasets Dive into datasets
Analysis Summarizing data
Visual summaries
Descriptive assessment of datasets Trace the datascape
Research Project environment Understanding the computing environment Scaffolding reproducible research
Part III: Preparation
Acquire Control Statements
Custom Functions
Collecting and documenting data Harvesting research data
Curate Pattern matching
Tidy datasets
Organizing and documenting datasets Taming data
Transform Reshape Datasets by Rows
Reshape datasets by Columns
Transforming and documenting datasets Dataset alchemy
Part IV: Analysis
Explore Advanced objects Exploratory analysis methods Pattern discovery
Predict Advanced visualization Building predictive models Text classification
Infer Advanced tables Building inference models Statistical inference
Part V: Contribution
Communicate Computing environment Manage project and computing environments Future-proofing research

Modifiable materials

I encourage you to modify the materials to fit the needs of your course. This apply particularly to the lessons and labs. Both of these resources can be found on the GitHub repository for the book. Feel free to fork the repository and modify the materials as needed. If you believe that your modifications would be beneficial to others, please consider submitting a pull request to the repository.

Course structures

Given the structure of the book and its resources, there are various options for organizing a course. Below I have structured three potential courses: introductory, intermediate, and advanced. Each option will have a recommended course structure and a list of materials that can be used for support. Each of these considers the following questions:

  • What the student profile is.
  • What the learning outcomes are.
  • What content to cover.
  • What computing environment to use.
  • What supplementary materials to use.

I will also suggest some possible deliverables for each course type as well as some general considerations from my own experience in teaching with this book.

Student profile

  • No prior experience with programming or data analysis.
  • An interest in or are studying linguistics, or a language-related field.
  • Want to know how to use programming to analyze language data.

Goals

To introduce students to the practice of data analysis with a focus on language data. An introduction to the fundamental aspects of R programming and an undestanding of reproducible research and its importance in data analysis.

Content

Chapters 0-4 are covered in this course. The corresponding supplementary materials are used to support the learning process.

Deliverables

  • Literature review, research statement, and research question

Computing environment

  • Posit Cloud or other cloud-based computing environment.
    The reasoning here is to help avoid some of the hurdles that come with setting up a local environment. This will allow students to focus on the content and not the setup. It also allows for a uniform environment for all students.

Notes

Given the fact that only 5 chapters are covered, the text and materials can be completed in much shorter time than a typical 15 week semester. It is recommended to supplement this book and its resources with other materials relevant to the course that will enhance the students’ learning experience and preparation for the literature review, research statement, and research question.

For intro students, the material in the Preparation, Analysis, and Contribution parts can be quite overwhelming. However, if there is interest, content from those chapters may be used as needed.

Student profile

  • Students have some experience with data analysis, likely with graphical user interfaces (GUIs) and some experience with programming.
  • Students have taken an introductory course in linguistics or a related field.
  • Students are interested in learning how to use programming with R to analyze language data.

Goals

  • To introduce students to designing a reproducible research project, preparing data for analysis, and choosing an appropriate analysis method (exploratory, predictive, or inferential). The students will be able to prepare a prospectus which outlines this research project.

Content

  • Textbook (12) chapters
  • Lessons (swirl interactive programming lessons: numbers 1-17)
  • Recipes: 0-10

Deliverables

  • Lab exercises: 0-11
  • Prospectus

Computing environment

  • An IDE (RStudio) running in a containerized environment.
    Students will likely want to have more control over their working environments. While instructors will want to have a consistent environment for the class. Using a containerized environment will works well in this scenario. Furthermore, the isolation between the host and container will help to avoid any unexpected issues or interactions with the host system.

Notes

The textbook and resources are an ideal fit for this student profile and course goals. In this course structure, each chapter is covered and the preparatory and follow up materials are used to support the learning process. The lessons and labs are used to help students develop their programming skills and apply them to the content of the chapters, respectively.

In my experience, the prospectus allows students to demonstrate their understanding of the fundamental aspects of the research process and highlight their ability to apply these concepts to a topic of their choosing. Given that there is a lot to learn in the course, and the learning curve is steep for some students, a full research project may be too much to ask in a single semester.

Student profile

  • Students have experience in programming (likely not with R) and data analysis.
  • Students have taken an intermediate course in linguistics or a related field.
  • Students are interested in learning how to apply these skills to language data and/ or want to learn how to approach projects with reproducibility in mind.

Goals

  • To provide students with the skills to work with designing and conducting a reproducible research project. Students will demonstrate the ability to reproduce the results and collaborate with others on a project.

Content

  • Textbook (12) chapters
  • Lessons (swirl interactive programming lessons: numbers 1-17)
  • Recipes: 0-10

Deliverables

  • Lab exercises: 0-11
  • Prospectus
  • Research project

Computing environment

  • IDE (RStudio or VSCode) running on the host machine (or in a containerized environment).
    Given that students have experience with programming and data analysis, they will likely have a preferred IDE or Editor. RStudio and VSCode both have functionality (or plugin functionality) which make the particularly well-suited for working with R and Quarto.
    Furthermore, you may choose to allow students to use their preferred environment. However, it is also worth considering the use of containerized environments to ensure that all students have a consistent environment for the course.

Notes

In this learning scenario, some of the content early on may be condensed or skipped altogether (particularly Chapters 1-4). Note that the lessons and labs are still useful for students to develop R fundamentals and apply them. These may be assigned in a more self-directed manner, with students working through them at their own pace. In addition to the prospectus, students will complete a research project that demonstrates their ability to apply the concepts and skills learned in the course. In addition, to the research process proper, creating a reproducible research project will be a key component of the course.

Teaching resources

Templates

Computing

Environments

Resources

Further reading

Data and datasets

  • View Guide 06 for more information on data and datasets.

Community resources