Problem Statement

On January 28, 1986, a routine launch was anticipated for the Challenger space shuttle. Seventy-three seconds into the flight, the shuttle broke apart, killing all seven crew members on board. An investigation into the cause of the disaster focused on a critical seal called an O-ring, and it is believed that damage to these O-rings during a shuttle launch may be related to the ambient temperature during the launch. Observational data on O-rings were collected for 23 randomly selected shuttle missions, where the mission order is based on the temperature at the time of the launch. temp gives the temperature in degrees Fahrenheit, and damage is 1 when an O-ring failed and 0 when it was undamaged. (Tweaked a bit from ??? [Chapter 8])

Competing Hypotheses

In words

  • Null hypothesis: The population coefficient on temp corresponding to the logistic regression modeling whether or not an O-ring failed as a function of temperature is zero.

  • Alternative hypothesis: The population coefficient on temp corresponding to the logistic regression modeling whether or not an O-ring failed as a function of temperature is nonzero.

Another way with words

  • Null hypothesis: There is no association between temperature and whether or not an O-ring failed.

  • Alternative hypothesis: There is an association between temperature and whether or not an O-ring failed.

In symbols (with annotations)

  • \(H_0: \beta_1 = 0\), where \(\beta_1\) represents the population coefficient on temp corresponding to the logistic regression modeling whether or not an O-ring failed as a function of temperature
  • \(H_A: \beta_1 \ne 0\)
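
For reference, the model behind these hypotheses can be written out explicitly. This is a sketch in standard logistic regression notation; the symbols \(p\) (the probability that an O-ring fails on a given launch) and \(\beta_0\) (the population intercept) are introduced here for illustration and are not named in the original text.

\[ \log\left(\dfrac{p}{1 - p}\right) = \beta_0 + \beta_1 \cdot \text{temp} \]

If \(\beta_1 = 0\), the right-hand side reduces to \(\beta_0\), so the probability of failure would not depend on temperature at all. That is why the two wordings of the hypotheses above say the same thing.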

Set \(\alpha\)

It’s important to set the significance level before using the data to carry out the test. Let’s set the significance level at 5% here.

Exploring the sample data

library(dplyr)
library(knitr)
library(ggplot2)
library(oilabs)
orings <- read.delim("orings.txt")

The scatterplot below shows the relationship between damage and temp. Note the use of factor() here, which treats the damage variable as two groups (1 and 0) instead of as a continuous scale between 0 and 1.

# Jitter only: combining geom = "point" with geom_jitter() would draw every mission twice
ggplot(data = orings, mapping = aes(x = temp, y = factor(damage))) + 
  geom_jitter(height = 0.4, alpha = 0.5)

Guess about statistical significance

We are looking to see if temp is a significant predictor of damage. Based on the plot, it seems reasonable to guess that we will reject the null hypothesis here. The launches below 65 degrees show damage, while most of the launches above 65 degrees do not.

Check conditions

Remember that in order to use the shortcut (formula-based, theoretical) approach, we need to check that some conditions are met.

  1. Linear relationship between the link-transformed (logit) response and the predictors:
    This condition is difficult to check without binning the data and computing something called the empirical logit function. The plot of the data and the fitted logistic curve can usually provide a good idea of the quality of the fit.

    qplot(x = temp, y = damage, 
        data = orings, geom = "point") +
      stat_smooth(method = "glm", 
        method.args = list(family = "binomial"),
        se = FALSE)

We have no clear reason to doubt the fit based on this plot.

  2. Independent observations: If cases are selected at random, the independent observations condition is met.

    The shuttle missions were selected at random here so the independent observations condition is met.
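
The binning idea mentioned under the first condition can be sketched roughly as follows. This is a minimal illustration and not part of the original analysis: the helper name empirical_logit, the choice of equal-width bins, the default of four bins, and the 0.5 adjustment are all assumptions made for the example.

```r
# Hypothetical helper: empirical logit of a 0/1 response within bins of a predictor.
# x: numeric predictor; y: 0/1 response; n_bins: number of equal-width bins (assumed).
empirical_logit <- function(x, y, n_bins = 4) {
  bins <- cut(x, breaks = n_bins)      # equal-width bins over the range of x
  failures <- tapply(y, bins, sum)     # number of 1s in each bin
  totals <- tapply(y, bins, length)    # number of observations in each bin
  bin_means <- tapply(x, bins, mean)   # average x value within each bin
  data.frame(
    bin = levels(bins),
    mean_x = as.vector(bin_means),
    n = as.vector(totals),
    # Adding 0.5 to both counts keeps the log finite when a bin is all 0s or all 1s
    logit = as.vector(log((failures + 0.5) / (totals - failures + 0.5)))
  )
}

# Usage with the data loaded earlier (assumes columns named temp and damage):
# empirical_logit(orings$temp, orings$damage)
```

If the empirical logits fall roughly along a straight line when plotted against mean_x, the linearity condition looks plausible. With only 23 missions, each bin holds very few observations, so any such plot should be read loosely.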

Test statistic

The test statistic is a random variable based on the sample data. Here, we want to look at a way to estimate the population slope \(\beta_1\). A good guess is the sample coefficient \(B_1\). Recall that this sample coefficient in logistic regression is actually a random variable that will vary as different samples are (theoretically, would be) collected.

We next look at our fitted curve from our sample of data:

oring_glm <- glm(damage ~ temp, data = orings, family = binomial)
summary(oring_glm)
## 
## Call:
## glm(formula = damage ~ temp, family = binomial, data = orings)
## 
## Deviance Residuals: 
##    Min      1Q  Median      3Q     Max  
## -1.061  -0.761  -0.378   0.452   2.217  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)  
## (Intercept)   15.043      7.379    2.04    0.041 *
## temp          -0.232      0.108   -2.14    0.032 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 28.267  on 22  degrees of freedom
## Residual deviance: 20.315  on 21  degrees of freedom
## AIC: 24.32
## 
## Number of Fisher Scoring iterations: 5

We are looking to see how likely it is for us to have observed a sample slope of \(b_{1, obs} = -0.232\) or more extreme, assuming that the population slope is 0 (that is, assuming the null hypothesis is true). If the conditions are met and assuming \(H_0\) is true, we can “standardize” this original test statistic of \(B_1\) into a \(Z\) statistic that follows a \(N(0,1)\) distribution:

\[ Z =\dfrac{ B_1 - 0}{ {SE}_{B_1} } \sim N(0, 1) \]

where \({SE}_{B_1}\) represents the standard error of \(B_1\), that is, the standard deviation of the sampling distribution of the sample slope.
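
As a quick illustration of the standardization above, the calculation can be done by hand with the rounded estimate and standard error printed in the summary output. The variable names b1_obs, se_b1, and z_obs are chosen here for the example.

```r
# Values copied (rounded to three decimals) from the summary(oring_glm) output
b1_obs <- -0.232   # estimated coefficient on temp
se_b1 <- 0.108     # its standard error

# Standardize: (estimate - hypothesized value) / standard error
z_obs <- (b1_obs - 0) / se_b1
z_obs
```

Because the inputs are rounded, the result is close to, but not exactly, the -2.14 that R reports from the unrounded estimates.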

Observed test statistic

While one could compute this observed test statistic by “hand”, the focus here is on the set-up of the problem and in understanding which formula for the test statistic applies. We can use the glm function here to fit a curve and conduct the hypothesis test. (We’ve already run this code earlier in the analysis, but it is shown here again for clarity.)

oring_glm <- glm(damage ~ temp, data = orings, family = binomial)
summary(oring_glm)
## 
## Call:
## glm(formula = damage ~ temp, family = binomial, data = orings)
## 
## Deviance Residuals: 
##    Min      1Q  Median      3Q     Max  
## -1.061  -0.761  -0.378   0.452   2.217  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)  
## (Intercept)   15.043      7.379    2.04    0.041 *
## temp          -0.232      0.108   -2.14    0.032 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 28.267  on 22  degrees of freedom
## Residual deviance: 20.315  on 21  degrees of freedom
## AIC: 24.32
## 
## Number of Fisher Scoring iterations: 5

We see here that the \(b_{1, obs}\) value is around -0.232 which corresponds to a \(z_{obs}\) value of -2.14. Note that these values are negative. This corresponds to a negative relationship between temperature and failure, which is what we saw in the scatterplot earlier.
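
To make the negative sign more concrete, the sketch below converts the rounded estimates from the output into an odds ratio and into fitted failure probabilities. The temperatures 55 and 75 are arbitrary illustrative choices within the plotted range, not values singled out in the original analysis, and the variable names are hypothetical.

```r
b0_obs <- 15.043   # intercept, rounded, from the summary output
b1_obs <- -0.232   # coefficient on temp, rounded, from the summary output

# Multiplicative change in the odds of damage per one-degree increase in temp
odds_ratio <- exp(b1_obs)

# Fitted probability of damage at two illustrative temperatures;
# plogis() is the inverse logit, 1 / (1 + exp(-x))
p_55 <- plogis(b0_obs + b1_obs * 55)
p_75 <- plogis(b0_obs + b1_obs * 75)

round(c(odds_ratio = odds_ratio, p_55 = p_55, p_75 = p_75), 3)
```

An odds ratio below 1 and a fitted probability that drops as temp rises both tell the same story as the negative \(z_{obs}\).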

Compute \(p\)-value

The \(p\)-value, the probability of observing a \(z\) value of -2.14 or more extreme (in either direction) in our null distribution, is 0.032. This can also be calculated in R directly:

2 * pnorm(-2.14)
## [1] 0.032355
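
An equivalent way to reach the decision, sketched here as a cross-check rather than as part of the original analysis, is to compare \(|z_{obs}|\) with the two-sided critical value of the standard normal distribution at the chosen \(\alpha = 0.05\). The variable names are illustrative.

```r
alpha <- 0.05
z_obs <- -2.14                      # from the summary output

z_crit <- qnorm(1 - alpha / 2)      # two-sided critical value, about 1.96
p_value <- 2 * pnorm(-abs(z_obs))   # same two-sided p-value as computed above

c(z_crit = z_crit, p_value = p_value)
abs(z_obs) > z_crit                 # TRUE means z_obs falls in the rejection region
p_value < alpha                     # TRUE means reject the null hypothesis
```

The two comparisons always agree, since both are based on the same standard normal null distribution.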

State conclusion

Because the \(p\)-value of 0.032 is below our significance level of 0.05, we have sufficient evidence to reject the null hypothesis. Our initial guess that temp is a significant predictor of damage is supported by the data.
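
As a complementary check that is not part of the original write-up, one could also build an approximate 95% confidence interval for the population coefficient on temp. The sketch below uses the normal approximation (a Wald-style interval) with the rounded estimate and standard error from the summary output, so the endpoints are approximate and the variable names are illustrative.

```r
b1_obs <- -0.232   # coefficient on temp, rounded, from the summary output
se_b1 <- 0.108     # its standard error, rounded, from the summary output

# Estimate plus or minus the 97.5th percentile of N(0, 1) times the standard error
z_star <- qnorm(0.975)
ci <- c(lower = b1_obs - z_star * se_b1,
        upper = b1_obs + z_star * se_b1)
round(ci, 3)
```

If the whole interval lies below zero, it agrees with rejecting \(H_0: \beta_1 = 0\) at the 5% level.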