Exercises

Data

Hypoxia MAP Treatment Dataset

The Hypoxia MAP dataset was contributed by Dr. Amy Nowacki, Associate Professor, Cleveland Clinic. Please cite this resource as: Amy S. Nowacki, “Hypoxia MAP Treatment Dataset”, TSHS Resources Portal (2022). Available at causeweb.org/tshs/hypoxia. The data are licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license, and are intended for educational purposes only.

Download CSV: hypoxia.csv

Note: some variables in the data are coded as numbers, as follows.

  • Female
    • 0 = male
    • 1 = female
  • Race
    • 1 = African American
    • 2 = Caucasian
    • 3 = Other
  • Type Surg
    • 1 = gastroenterostomy
    • 2 = gastric restrictive procedure
    • 3 = gastroplasty
    • 4 = removal of gastric restrictive device

R Script

The R script used in the exercises can be downloaded below.

Download .R Script: analysis.R

analysis.R
# Load packages -----------------------------------------------------------

library(readr)
library(ggplot2)
library(dplyr)
library(ggforestplot)
library(forestmodel)
library(ggcorrplot)


# Read data ---------------------------------------------------------------

hypoxia <- read_csv("data/hypoxia.csv")
View(hypoxia)


# Useful functions --------------------------------------------------------

# An AHI of 5-14 is mild; 15-29 is moderate and
# 30 or more events per hour characterizes severe sleep apnea.
# 1 = (AHI < 5); 2 = (5 ≤ AHI < 15);
# 3 = (15 ≤ AHI < 30); 4 = (AHI ≥ 30)

# Add a character `Sex` column ("Male"/"Female") from the binary `Female`
# column, and convert the AHI severity category to a factor
convert_to_factors <- function(data = hypoxia) {
  output <- data |>
    mutate(
      Sex = if_else(.data$Female == 1, "Female", "Male"),
      .after = Female
    ) |>
    mutate(AHI = factor(.data$AHI))
  return(output)
}
hypoxia <- convert_to_factors()


# Exploratory plots -------------------------------------------------------

# Histograms and bar charts
ggplot(hypoxia) +
  geom_histogram(aes(Age))
# AHI is now a factor, so use a bar chart
# (geom_histogram() requires a continuous variable)
ggplot(hypoxia) +
  geom_bar(aes(AHI))

# Correlation
hypoxia |>
  select(where(is.numeric)) |>
  cor(use = "complete.obs") |>
  ggcorrplot()
ggsave("corr_plot.png", width = 5, height = 5)

# Multiple factors
ggplot(hypoxia) +
  geom_bar(aes(AHI)) +
  facet_wrap(~Sex)

ggplot(hypoxia) +
  geom_bar(aes(AHI)) +
  facet_wrap(Race ~ Sex)

ggplot(hypoxia) +
  geom_bar(aes(AHI)) +
  facet_grid(Race ~ Sex)

hypoxia <- hypoxia |>
  mutate(Diabetesfct = factor(Diabetes))
ggplot(hypoxia) +
  geom_bar(aes(AHI, fill = Diabetesfct), position = "fill") +
  scale_fill_brewer(palette = "Dark2") +
  theme_minimal() # plain theme with no grey background


# Summary statistics ------------------------------------------------------

# Mean
mean(hypoxia$Age)
mean(hypoxia$BMI)
mean(hypoxia$`Duration of Surg`)

# Standard Deviation
sd(hypoxia$Age)

# Counts
table(hypoxia$AHI)
table(hypoxia$`Duration of Surg`)


# Statistical tests -------------------------------------------------------

# Chi-squared tests of AHI factors
chisq.test(table(hypoxia$AHI, hypoxia$Sex))
chisq.test(table(hypoxia$AHI, hypoxia$Race))
chisq.test(table(hypoxia$AHI, hypoxia$Smoking))
chisq.test(table(hypoxia$AHI, hypoxia$Diabetes))

# Two-sample t-test of sleep time between patients at or below the
# median BMI and patients above it, assuming equal variances
BMI1 <- hypoxia |> filter(BMI <= median(BMI)) # nolint
BMI2 <- hypoxia |> filter(BMI > median(BMI)) # nolint
t.test(BMI1$Sleeptime, BMI2$Sleeptime, var.equal = TRUE)
t.test(BMI1$Sleeptime, BMI2$Sleeptime, var.equal = TRUE)$p.value


# Modelling ---------------------------------------------------------------

mod_data <- hypoxia |>
  mutate(severe_AHI = if_else(AHI == 4, 1, 0)) |>
  select(-c(Female, AHI))

mod1 <- glm(severe_AHI ~ ., data = mod_data, family = "binomial")
summary(mod1)

mod2 <- glm(severe_AHI ~ Age + Sex + BMI, data = mod_data, family = "binomial")
summary(mod2)

mod3 <- glm(severe_AHI ~ Age + Sex, data = mod_data, family = "binomial")
summary(mod3)

mod4 <- glm(severe_AHI ~ Age + BMI, data = mod_data, family = "binomial")
summary(mod4)

mod5 <- glm(severe_AHI ~ Sex + BMI, data = mod_data, family = "binomial")
summary(mod5)

mod6 <- glm(severe_AHI ~ Age, data = mod_data, family = "binomial")
summary(mod6)

mod7 <- glm(severe_AHI ~ Sex, data = mod_data, family = "binomial")
summary(mod7)

mod8 <- glm(severe_AHI ~ BMI, data = mod_data, family = "binomial")
summary(mod8)

# Results -----------------------------------------------------------------

# AIC
aic_results <- c(
  mod2$aic, mod3$aic, mod4$aic,
  mod5$aic, mod6$aic, mod7$aic, mod8$aic
)

# Forest plot of best model
forest_model(mod2)
ggsave("forestplot.png")

Exercises

Exercise 1: Introduction to Git and GitHub for R

  1. Create a GitHub account if you don’t already have one, and make sure you have Git installed on your laptop.

  2. Install and load the usethis and gitcreds packages.

  3. Configure Git, then run create_github_token() and follow the instructions to generate a token.

  4. Run gitcreds_set() and paste in the token when prompted.

Hint: Open happygitwithr.com/https-pat!

Solution

Go to github.com and create an account.

From R, configure git with your GitHub username, and the email associated with your GitHub account:

library(usethis)
use_git_config(
  user.name = "username",
  user.email = "username@example.org"
)

Then run create_github_token() which will open up a web browser:

Then click Generate token and copy the token that appears.

Back in R, run gitcreds_set() and paste in the token when prompted.

Exercise 2: Creating and cloning a repository

  1. Create a GitHub repository.

  2. Create a new R project, and clone the repository.

Solution

From your GitHub profile page, click Repositories then click New:

Go to File –> New Project –> Version Control –> Git

Copy the URL from GitHub (ending with .git) into the Repository URL field, and select where you want the project to be saved.

Exercise 3: Committing and pushing changes

  1. Commit the initial files created when making a project.

  2. Download the hypoxia.csv file above and save it into a folder called data in your git project. Download the analysis.R file and add it to the project as well.

  3. Add the data folder to your .gitignore file.

  4. Commit and push your changes, and view them on GitHub.

Solution

From the Git tab in RStudio, click the check boxes of the files you want to commit:

Then click Commit and a pop-up window should appear. Enter an (informative) commit message. Then click Commit again.

You should then see another pop-up, which you can Close. Copy and paste the files from this morning's session into the git folder. You should see them appear in the Git tab:

Open up the .gitignore file in RStudio (you might need to set Show hidden files). Add the line data/ to the .gitignore file:

.Rproj.user
.Rhistory
.RData
.Ruserdata
data/

The data folder should no longer show in the Git tab, and the .gitignore file should show that it has been modified.
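The same edit can be made from the R console rather than by hand. The following is a minimal sketch in base R, assuming the working directory is the project root; add_to_gitignore() is a hypothetical helper written for this example, not part of any package.

```r
# Hypothetical helper: append `entry` to a .gitignore file
# unless that exact line is already present.
add_to_gitignore <- function(entry, path = ".gitignore") {
  # Read the current contents, or start from nothing if the file is missing
  existing <- if (file.exists(path)) readLines(path, warn = FALSE) else character(0)

  # Only write when the entry is new, so repeated calls do not duplicate it
  if (!entry %in% existing) {
    writeLines(c(existing, entry), path)
  }

  # Return the resulting contents invisibly for inspection
  invisible(readLines(path, warn = FALSE))
}

# Example: ignore the data folder in the current project
# add_to_gitignore("data/")
```

If you already have usethis loaded, its use_git_ignore() function should achieve a similar result.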

Commit the changes as you did before. Then click Push.

Go back to the repository on github.com. You should see your R files, but not your data.
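If you prefer the console to the Git tab, staging and committing can also be scripted. The sketch below assumes the gert package is installed (it is a dependency of usethis); commit_files() is a hypothetical wrapper written for this example, and the file name and commit message are illustrative.

```r
library(gert)

# Hypothetical helper: stage the named files and record a single commit.
# `author` is optional; when NULL, gert uses the name and email from
# your Git configuration.
commit_files <- function(files, message, repo = ".", author = NULL) {
  git_add(files, repo = repo)                           # equivalent to ticking the check boxes
  git_commit(message, author = author, repo = repo)     # returns the commit id
}

# Example, run from the project root:
# commit_files("analysis.R", "Add analysis script")
# git_push()   # equivalent to clicking Push
```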

Exercise 4: Working in branches

  1. Create a new branch from RStudio.

  2. Make an edit to your R script and commit the changes.

  3. Push your new branch to GitHub and open a pull request.

Solution

In the Git tab, click New Branch and give it a useful name. Then click Create.

Make an edit and commit your changes (as you did in the previous exercise). Push your changes.

Go back to the repository on github.com. You should see a box to Compare & pull request. Click it.

Add a description, then click Create pull request.

You can click Files changed to see what changes have been made.
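Creating the branch can also be done from the R console. This is a minimal sketch, again assuming the gert package is installed; start_branch() is a hypothetical helper and the branch name is illustrative. Pushing the branch and opening the pull request can still be done as described above.

```r
library(gert)

# Hypothetical helper: create a branch from the current commit, switch to it,
# and return the name of the branch that is now checked out.
start_branch <- function(name, repo = ".") {
  git_branch_create(name, checkout = TRUE, repo = repo)
  git_branch(repo = repo)
}

# Example, run from the project root:
# start_branch("tidy-plots")
```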

Exercise 5: Handling git conflicts

  1. Use the web editor to make a change on main.

  2. Return to your pull request and fix any merge conflicts.

  3. Merge the pull request and delete the new branch.

Solution
  • From the Code tab on the GitHub website, click through to the file you want to edit.

  • Make a change, then click Commit changes… in the top right.

  • Return to the Pull requests tab. You might see a message that says This branch has conflicts that must be resolved.

  • If not, go ahead and click Merge pull request.

  • If you have a conflict, there are different ways to resolve it. For small conflicts, it can be easiest to click Use web editor. You’ll see the two different versions of the code.

  • Edit the file, and select (or further edit) the code you want to keep.

  • Then click Mark as resolved, then Commit merge.

  • You can then return to the Pull requests page, and Confirm merge:

Remember to delete the branch if you’re finished with it!

In RStudio, change the branch back to main, and Pull the changes.

Exercise 6: Sharing and reviewing

  1. Create a new branch and edit your R script in some way (e.g. add some code or comments).

  2. Commit your changes, open a pull request, and describe the changes.

  3. Request a review from someone else in the room (and vice versa).

Note: you may need to add them as a collaborator to the repository. (You can remove them afterwards.)

  4. Review someone else’s code - leave comments and decide whether to approve.

  5. Bonus: add a README.md file to your repository, with a brief summary of the project and stating what packages and version of R you have used.

Solution
  • Create a branch, commit changes, and create a pull request as you did in the previous exercise.

  • Within the GitHub repository, go to Settings -> Collaborators, then click Add people and add the person next to you as a collaborator using their username.

  • Request a review from your new collaborator.
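For the bonus task, the R and package versions can be collected from R itself rather than typed by hand. The following is a minimal sketch in base R; readme_lines() is a hypothetical helper, and the title, summary, and README layout are illustrative choices rather than a required format.

```r
# Hypothetical helper: build the lines of a README.md from a title,
# a short summary, and the names of the packages used.
readme_lines <- function(title, summary, packages) {
  # Look up the installed version of each package
  # (packageVersion() raises an error if a package is not installed)
  versions <- vapply(
    packages,
    function(pkg) as.character(utils::packageVersion(pkg)),
    character(1)
  )

  c(
    paste("#", title),
    "",
    summary,
    "",
    "## Software",
    "",
    paste("-", R.version.string),
    paste0("- ", packages, " ", versions)
  )
}

# Example, using packages loaded in analysis.R:
# writeLines(
#   readme_lines(
#     "Hypoxia analysis",
#     "Exploratory analysis of the Hypoxia MAP Treatment Dataset.",
#     c("readr", "ggplot2", "dplyr")
#   ),
#   "README.md"
# )
```

Remember to commit the new README.md so that it appears on the repository's front page on GitHub.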