Chapter 6 Reproducibility and Reporting with R Markdown

Reproducibility is one of the core values in data science, and R makes it both achievable and easy! Imagine trying to recreate someone’s analysis only to find that you get different results or that they left out crucial steps. Frustrating, right? Reproducibility is the answer: it means that anyone can get the same results every time by following the same steps.

Why Reproducibility Matters

  • Trustworthiness: When your results can be replicated, others can trust your analysis.
  • Error Detection: Re-running the same code helps catch mistakes early.
  • Efficiency: With reproducible scripts, you save time if you need to redo parts of your analysis.

6.1 Key Tools in R for Reproducibility and Reporting

Let’s dive into the tools that make reproducibility and reporting a breeze in R:

  1. R Markdown: This is the gold standard for reproducible reports in R. You can combine code, its output, and formatted commentary in one document. Think of it as pairing your code with a notebook-style narrative.
  • Interactive Demo: Create an R Markdown file in RStudio by clicking File > New File > R Markdown…. You can add headers, code chunks, and text.
  • Run Your Code: Run each chunk individually, or click Knit to create a fully formatted report with all your code and outputs embedded.
  2. Setting a Seed for Consistency: R’s random number generator can be controlled with set.seed(). For instance:
set.seed(42)
sample(1:100, 5)
## [1] 49 65 25 74 18

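Running these two lines together always produces the same random sample, which keeps your analysis consistent from run to run. (The exact numbers can differ in R versions before 3.6.0, which used a different default sampling algorithm.)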
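To see what the seed actually controls, the short sketch below draws three samples. The variable names are illustrative only. It assumes nothing beyond base R, and it shows that resetting the seed, not merely setting it once, is what repeats a draw.

```r
# Draw once after setting the seed
set.seed(42)
first_draw <- sample(1:100, 5)

# Draw again WITHOUT resetting the seed: the generator continues from
# its current state, so this draw is not a repeat of the first one
second_draw <- sample(1:100, 5)

# Reset the seed and draw again: this repeats the first draw
set.seed(42)
third_draw <- sample(1:100, 5)

identical(first_draw, third_draw)  # TRUE
```

In a report, a common habit is to call set.seed() once near the top, so that knitting the whole document from start to finish gives the same numbers each time. Printing sessionInfo() at the end of the report also records the R and package versions that produced them.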

  3. Code Commenting and Documentation: Clear comments make your analysis easy to understand for others and for yourself. Use comments (#) in your code to describe steps, and include documentation for more complex functions.

Below is an example of a comment.

# This is a comment
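For a more complex function, a documentation header plus a few inline comments goes a long way. The sketch below is one possible style, not a required one: the function name mpg_to_l_per_100km is hypothetical, and the #' lines follow the roxygen2 comment convention, which R itself treats as ordinary comments.

```r
#' Convert fuel economy from miles per US gallon to litres per 100 km.
#'
#' @param mpg Numeric vector of fuel economy values in miles per US gallon.
#' @return Numeric vector of the same length, in litres per 100 km.
mpg_to_l_per_100km <- function(mpg) {
  # Fail early on bad input so errors surface close to their cause
  stopifnot(is.numeric(mpg), all(mpg > 0, na.rm = TRUE))

  # 1 mile per US gallon corresponds to about 235.215 L/100 km
  235.215 / mpg
}

# Apply the helper to the first few cars in mtcars
head(mpg_to_l_per_100km(mtcars$mpg))
```

The header says what the function expects and returns, while the inline comments explain why each step is there. That is usually more helpful to a future reader than restating what the code does.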

6.2 Creating Reproducible Reports

Let’s walk through a simple activity where we create a reproducible report:

  1. Set Up Your R Markdown File
  • Open RStudio and create a new R Markdown file.
  • Add a title, your name, and the date.
  • Start with an introduction. Below is an example of a report that introduces R Markdown.

  • Enter the relevant details and press OK to create the file. RStudio automatically generates an introductory report that explains how R Markdown works. For more information about R Markdown, see the official R Markdown documentation.
  2. Add Your Code and Analysis

Insert code chunks for each analysis step. For example, try loading and summarizing the mtcars data set:

# Load the data 
data(mtcars)

# Summary of the data set
summary(mtcars)
##       mpg             cyl             disp             hp       
##  Min.   :10.40   Min.   :4.000   Min.   : 71.1   Min.   : 52.0  
##  1st Qu.:15.43   1st Qu.:4.000   1st Qu.:120.8   1st Qu.: 96.5  
##  Median :19.20   Median :6.000   Median :196.3   Median :123.0  
##  Mean   :20.09   Mean   :6.188   Mean   :230.7   Mean   :146.7  
##  3rd Qu.:22.80   3rd Qu.:8.000   3rd Qu.:326.0   3rd Qu.:180.0  
##  Max.   :33.90   Max.   :8.000   Max.   :472.0   Max.   :335.0  
##       drat             wt             qsec             vs        
##  Min.   :2.760   Min.   :1.513   Min.   :14.50   Min.   :0.0000  
##  1st Qu.:3.080   1st Qu.:2.581   1st Qu.:16.89   1st Qu.:0.0000  
##  Median :3.695   Median :3.325   Median :17.71   Median :0.0000  
##  Mean   :3.597   Mean   :3.217   Mean   :17.85   Mean   :0.4375  
##  3rd Qu.:3.920   3rd Qu.:3.610   3rd Qu.:18.90   3rd Qu.:1.0000  
##  Max.   :4.930   Max.   :5.424   Max.   :22.90   Max.   :1.0000  
##        am              gear            carb      
##  Min.   :0.0000   Min.   :3.000   Min.   :1.000  
##  1st Qu.:0.0000   1st Qu.:3.000   1st Qu.:2.000  
##  Median :0.0000   Median :4.000   Median :2.000  
##  Mean   :0.4062   Mean   :3.688   Mean   :2.812  
##  3rd Qu.:1.0000   3rd Qu.:4.000   3rd Qu.:4.000  
##  Max.   :1.0000   Max.   :5.000   Max.   :8.000
  3. Customize and Style Your Report
  • Add section headers, bold text, and bullet points to organize your report.
  • You can use ggplot2 to add visualizations for a polished look.
library(ggplot2)
ggplot(mtcars, aes(x = hp, y = mpg)) +
  geom_point() +
  labs(title = "Horsepower vs. Miles per Gallon")

  4. Knit the Report
  • Click the Knit button to render your report into an HTML, PDF, or Word document.
  • Notice how your code, output, and comments are all integrated.

Here is how the report should look when knitted.
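The four steps above can also be driven from a script instead of the RStudio buttons. The sketch below is a minimal illustration under stated assumptions: the file name mtcars_report.Rmd, the title, and the chunk labels are made up for this example, the report uses a base R plot to avoid extra dependencies (the ggplot2 code from step 3 would work in its place), and the knit step only runs if the rmarkdown package and pandoc are available.

```r
# Hypothetical file name; the report is written to a temporary directory
rmd_path <- file.path(tempdir(), "mtcars_report.Rmd")

# Build the chunk fence (three backticks) programmatically
fence <- strrep("`", 3)

# Steps 1 to 3: header, narrative text, and code chunks
report_lines <- c(
  "---",
  'title: "Horsepower and Fuel Economy in mtcars"',
  'author: "Your Name"',
  "output: html_document",
  "---",
  "",
  "## Introduction",
  "",
  "This report summarizes the built-in mtcars data set.",
  "",
  paste0(fence, "{r summary}"),
  "summary(mtcars)",
  fence,
  "",
  "## Visualization",
  "",
  paste0(fence, "{r scatter}"),
  "plot(mtcars$hp, mtcars$mpg, xlab = 'hp', ylab = 'mpg')",
  fence
)
writeLines(report_lines, rmd_path)

# Step 4: knit, which is roughly what the Knit button does for you
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  html_path <- rmarkdown::render(rmd_path,
                                 output_format = "html_document",
                                 quiet = TRUE)
  message("Report written to: ", html_path)
}
```

Rendering from a script means the report can be rebuilt without opening RStudio, which helps when the same analysis has to be rerun on updated data.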

6.3 Going Beyond: Shiny for Interactive Reporting

For advanced projects, consider using Shiny to create interactive reports! Shiny apps can run right in your browser and allow users to interact with your data in real time.
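As a small preview, a Shiny app is built from a user interface definition and a server function. The sketch below is only an illustration and assumes the shiny package is installed; the input and output IDs (xvar, scatter) and the choice of mtcars are made up for this example.

```r
library(shiny)

# User interface: a drop-down to pick a variable and a placeholder for a plot
ui <- fluidPage(
  titlePanel("Exploring mtcars"),
  selectInput("xvar", "Variable to plot against mpg:",
              choices = setdiff(names(mtcars), "mpg"),
              selected = "hp"),
  plotOutput("scatter")
)

# Server: redraws the plot whenever the selected variable changes
server <- function(input, output, session) {
  output$scatter <- renderPlot({
    plot(mtcars[[input$xvar]], mtcars$mpg,
         xlab = input$xvar, ylab = "mpg")
  })
}

# Combine both parts into an app object
app <- shinyApp(ui = ui, server = server)

# Launch the app only in an interactive session
if (interactive()) {
  runApp(app)
}
```

Changing the drop-down triggers renderPlot() again, which is what lets users interact with the data in real time.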

Shiny is covered in more detail in the next topic.

Reproducibility is a powerful skill. Keep practicing, and you’ll quickly see how it improves your data work!

Hands-on Exercises

Create an R Markdown file with:

  1. A title and introduction explaining your analysis.
  2. An example dataset analysis (try using iris or mtcars).
  3. A basic visualization.
  4. A conclusion summarizing your findings.
  5. Knit the report to HTML.

Solution

The instructor will show students the example report (R Markdown and HTML files) in the reproducibility_projects/example/ directory.

The solution is provided in reproducibility_projects/solution/solution.Rmd.

Here is how the documents should look.
