Introduction

Goals

We’re going to start the introduction by writing down some basic goals that underlie the construction and content of this book. We’re writing this for you, the reader, but also to hold ourselves accountable as we write. So, feel free to read if you are interested or skip ahead if you aren’t.

The goals of this book are:

  1. To teach you how to use R and RStudio as tools for applied epidemiology.1 Our goal is not to teach you to be a computer scientist or an advanced R programmer. Therefore, some readers who are experienced programmers may catch some technical inaccuracies regarding what we consider to be the fine points of what R is doing “under the hood.”

  2. To make this writing as accessible and practically useful as possible without stripping out all of the complexity that makes doing epidemiology in real life a challenge. In other words, We’re going to try to give you all the tools you need to do epidemiology in “real world” conditions (as opposed to ideal conditions) without providing a whole bunch of extraneous (often theoretical) stuff that detracts from doing. Having said that, we will strive to add links to the other (often theoretical) stuff for readers who are interested.

  3. To teach you to accomplish common tasks, rather than teach you to use functions or families of functions. In many R courses and texts, there is a focus on learning all the things a function, or set of related functions, can do. It’s then up to you, the reader, to sift through all of these capabilities and decided which, if any, of the things that can be done will accomplish the tasks that you are actually trying to accomplish. Instead, we will strive to start with the end in mind. What is the task we are actually trying to accomplish? What are some functions/methods we could use to accomplish that task? What are the strengths and limitations of each?

  4. To start each concept by showing you the end result and then deconstruct how we arrived at that result, where possible. We find that it is easier for many people to understand new concepts when learning them as a component of a final product.

  5. To learn concepts with data instead of (or alongside) mathematical formulas and text descriptions, where possible. We find that it is easier for many people to understand new concepts by seeing them in action.

Text conventions used in this book

  • We will hyperlink many keywords or phrases to their glossary entry.
  • Additionally, we may use bold face for a word or phrase that we want to call attention to, but it is not necessarily a keyword or phrase that we want to define in the glossary.
  • Highlighted inline code is used to emphasize small sections of R code and program elements such as variable or function names.

Other reading

If you are interested in R4Epi, you may also be interested in:

  • Hands-on Programming with R by Garrett Grolemund. This book is designed to provide a friendly introduction to the R language.

  • R for Data Science by Hadley Wickham, Mine Çetinkaya-Rundel, and Garrett Grolemund. This book is designed to teach readers how to do data science with R.

  • Statistical Inference via Data Science: A ModernDive into R and the Tidyverse. This book is designed to be a gentle introduction to the practice of analyzing data and answering questions using data the way data scientists, statisticians, data journalists, and other researchers would.

  • Reproducable Research with R and RStudio by Christopher Gandrud. This book gives you tools for data gathering, analysis, and presentation of results so that you can create dynamic and highly reproducible research.

  • Advanced R by Hadley Wickham. This book is designed primarily for R users who want to improve their programming skills and understanding of the language.


  1. In this case, “tools for applied epidemiology” means (1) understanding epidemiologic concepts; and (2) completing and interpreting epidemiologic analyses.↩︎