Five parts of significance tests

  1. Assumptions

  2. Hypotheses

  3. Test statistic

  4. P-value

  5. Conclusion/inference

Assumptions

Valid tests about parameters require assuming a few things about the data:

  1. Proper definition of the population - If your sampling frame is not representative of the population, your inferences about that population will almost always be inaccurate.

  2. Randomization - Significance tests assume that samples were randomly selected from a population. They essentially fail if this is not true.

  3. Distributional assumptions - Some tests assume that variables follow a particular probability distribution (Normal, Binomial, etc.).

  4. Sample size - The validity of tests generally increases with larger sample sizes.

Formalizing Hypotheses

\[ \begin{aligned} H_{0}: & \text{ The null hypothesis.}\\ H_{a}:& \text{ The alternative hypothesis.}\\ \end{aligned} \]

Do polls suggest that it’s reasonable to believe that Trump will carry Michigan?

\[ \begin{aligned} H_{0}: & \pi_{Trump,MI} = 0.50 \\ H_{a}:& \pi_{Trump,MI} > 0.50 \\ \end{aligned} \]

- A presidential candidate who carries more than 50% of Michigan's vote wins the state's electoral votes.

Are Democrats less Christian than Republicans, on average?

\[ \begin{aligned} H_{0}: & \pi_{CDem} = \pi_{CRep} \\ H_{a}:& \pi_{CDem} < \pi_{CRep} \\ \end{aligned} \]

- Using survey data, we can explore whether the population percentage of Democrats who are Christian is less than the population percentage of Republicans who are Christian.

Did shaming voters increase turnout?

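\[ \begin{aligned} H_{0}: & \text{Turnout}_{T} = \text{Turnout}_{C} \\ H_{a}:& \text{Turnout}_{T} > \text{Turnout}_{C} \\ \end{aligned} \]

- Here \(T\) presumably denotes the treatment group (voters who were shamed) and \(C\) the control group.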

Test statistic

\[ \frac{\text{Observed} - \text{Expected}}{\text{SE}} \]
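As a minimal sketch of this ratio for a sample mean, the snippet below computes the statistic by hand and compares it with the value that `t.test()` reports. The data vector `scores` and the null value `mu0` are made up for illustration, and the standard error is assumed to be the usual one-sample estimate, the sample standard deviation divided by the square root of n.

```r
# Hypothetical data: ten made-up scores (not taken from the lecture)
scores <- c(41, 44, 43, 45, 42, 40, 46, 43, 44, 42)
mu0 <- 50                                  # value of the mean under H0 (assumed)

observed <- mean(scores)                   # what we observed in the sample
se <- sd(scores) / sqrt(length(scores))    # estimated standard error of the mean
t_stat <- (observed - mu0) / se            # (observed - expected) / SE

# t.test() reports the same statistic in its `statistic` component
t_builtin <- unname(t.test(scores, mu = mu0)$statistic)
print(c(by_hand = t_stat, builtin = t_builtin))
```

The statistic measures how many standard errors the observed value falls from the value expected under the null hypothesis.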

P-value
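The p-value is the probability, computed assuming \(H_{0}\) is true, of a test statistic at least as extreme as the one observed. A minimal sketch of that calculation, assuming the statistic is approximately standard Normal under \(H_{0}\); the value `z` is invented for illustration.

```r
z <- 1.8   # hypothetical observed test statistic

# Which tail counts as "extreme" depends on the alternative hypothesis
p_greater   <- pnorm(z, lower.tail = FALSE)            # H_a: parameter > null value
p_less      <- pnorm(z)                                # H_a: parameter < null value
p_two_sided <- 2 * pnorm(abs(z), lower.tail = FALSE)   # H_a: parameter != null value

print(round(c(greater = p_greater, less = p_less, two_sided = p_two_sided), 4))
```

With this made-up value, the upper-tail p-value falls below 0.05 while the two-sided p-value does not, so the choice of alternative hypothesis can change the conclusion.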

Significance test for a mean: are Georgians more conservative than South Carolinians?

Significance test for a mean: Are older people (65+) “cold” toward Muslims and atheists?


Steps

  1. Define \(H_{0}\) and \(H_{a}\).

  2. Choose a test statistic.

  3. Find the p-value for the test statistic (p).

  4. Compare the p-value to a preset significance level (\(\alpha = 0.05\) or \(\alpha = 0.01\)).

  5. Reject \(H_{0}\) if \(p \leq \alpha\); otherwise, do not reject \(H_{0}\).
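The comparison of the p-value with \(\alpha\) in the steps above can be sketched as a small helper. The function name `decide` is hypothetical, and the p-values passed to it are made up.

```r
# Hypothetical helper: compare a p-value with a preset significance level
decide <- function(p, alpha = 0.05) {
  stopifnot(p >= 0, p <= 1, alpha > 0, alpha < 1)
  if (p <= alpha) "reject H0" else "fail to reject H0"
}

print(decide(0.003))               # small p-value
print(decide(0.20))                # large p-value
print(decide(0.03))                # below the default alpha of 0.05
print(decide(0.03, alpha = 0.01))  # same evidence, stricter threshold
```

The last two calls use the same p-value but reach different decisions, which is why \(\alpha\) should be fixed before looking at the data.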

Putting this all together…

Significance test for a proportion

\[ z = \frac{\hat{\pi} - \pi_{0}}{\sigma_{0}}, \qquad \sigma_{0} = \sqrt{\frac{\pi_{0}(1-\pi_{0})}{n}} \]
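A minimal sketch of this calculation in R. The counts are invented for illustration (they are not poll results), and the upper-tail p-value assumes the alternative \(\pi > \pi_{0}\) and a sample large enough for the Normal approximation to be reasonable.

```r
n <- 400           # hypothetical sample size
successes <- 220   # hypothetical number of "yes" responses
pi0 <- 0.50        # proportion under H0

pi_hat <- successes / n                    # sample proportion
sigma0 <- sqrt(pi0 * (1 - pi0) / n)        # standard error under H0
z <- (pi_hat - pi0) / sigma0               # test statistic
p_value <- pnorm(z, lower.tail = FALSE)    # P(Z >= z) for H_a: pi > pi0

print(c(pi_hat = pi_hat, z = z, p_value = p_value))
```

For comparison, `prop.test(successes, n, p = pi0, alternative = "greater", correct = FALSE)` should report the same p-value, with its `X-squared` statistic equal to the square of `z`.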

Significance test for a proportion: Do a majority of Republicans think Trump is “even tempered”?


  1. Describe the null and alternative hypotheses.
  2. Calculate the test statistic.
  3. Make a decision about the question.

Significance tests in R

**Can the population mean be less than 50?**

# Simulate 100 draws; no seed is set, so the output changes from run to run
religdata <- rnorm(n = 100, mean = 43, sd = 2)
# H0: the population mean is 50 or more; Ha: it is less than 50
t.test(x = religdata,
       mu = 50,
       alternative = "less")
## 
##  One Sample t-test
## 
## data:  religdata
## t = -32.97, df = 99, p-value < 2.2e-16
## alternative hypothesis: true mean is less than 50
## 95 percent confidence interval:
##      -Inf 43.46561
## sample estimates:
## mean of x 
##  43.11908

Significance tests in R

**Can the population mean be equal to 44?**

# Draw a fresh simulated sample (again without a seed)
religdata <- rnorm(n = 100, mean = 43, sd = 2)
# H0: the population mean equals 44; Ha: it does not equal 44
t.test(x = religdata,
       mu = 44,
       alternative = "two.sided")
## 
##  One Sample t-test
## 
## data:  religdata
## t = -5.6771, df = 99, p-value = 1.37e-07
## alternative hypothesis: true mean is not equal to 44
## 95 percent confidence interval:
##  42.53130 43.29206
## sample estimates:
## mean of x 
##  42.91168

Accepting v. not rejecting the null hypothesis
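One way to see why "not rejecting" differs from "accepting": the same sample can fail to reject several different null values, and they cannot all be the true mean. A minimal sketch with made-up data; the vector `x` and the candidate null values are hypothetical.

```r
# Hypothetical sample of ten values (sample mean is 43)
x <- c(41, 44, 43, 45, 42, 40, 46, 43, 44, 42)

# Test three different null values against two-sided alternatives
null_values <- c(42.5, 43, 43.5)
p_values <- sapply(null_values, function(m) t.test(x, mu = m)$p.value)

print(data.frame(mu0 = null_values, p_value = round(p_values, 3)))
```

None of these null values is rejected at \(\alpha = 0.05\), so the data are compatible with each of them. A large p-value therefore shows only that the data do not contradict \(H_{0}\), not that \(H_{0}\) is true.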