Assumptions
Hypotheses
Test statistic
P-value
Conclusion/inference
Valid tests about parameters require assuming a few things about the data:
Proper definition of the population - If your sampling frame is not representative of the population, your inferences will almost always be wrong/inaccurate.
Randomization - Significance tests assume that samples were randomly selected from a population. They essentially fail if this is not true.
Distributional assumptions - Some tests assume that variables follow a particular probability distribution (Normal, Binomial, etc.).
Sample size - The validity of many tests improves with larger sample sizes, because the approximations they rely on become more accurate.
\[ \begin{aligned} H_{0}: & \text{ The null hypothesis.}\\ H_{a}:& \text{ The alternative hypothesis.}\\ \end{aligned} \]
Every significance test has two hypotheses about the value of the parameter.
The null hypothesis - a prior belief about a value that the parameter takes.
The alternative hypothesis - a belief about the value of the parameter if the null hypothesis is found to be untrue/unlikely.
In general, the null hypothesis represents the status quo or “no effect”.
\[ \begin{aligned} H_{0}: & \pi_{Trump,MI} = 0.50 \\ H_{a}:& \pi_{Trump,MI} > 0.50 \\ \end{aligned} \] - A presidential candidate gets the electoral votes of a state if they carry over 50% of the votes.
\[ \begin{aligned} H_{0}: & \pi_{CDem} = \pi_{CRep} \\ H_{a}:& \pi_{CDem} < \pi_{CRep} \\ \end{aligned} \] - We can use survey data to explore whether the population percentage of Democrats who are Christian is less than the population percentage of Republicans who are Christian.
\[ \begin{aligned} H_{0}: & Turnout_{T} = Turnout_{C} \\ H_{a}:& Turnout_{T} > Turnout_{C} \\ \end{aligned} \]
\[ \frac{\text{Observed} - \text{Expected}}{\text{SE}} \]
The test statistic is what you use to determine whether to reject or not reject the null hypothesis.
It’s usually calculated as the difference between the observed estimate and the value expected under the null hypothesis, divided by the standard error.
The p-value measures how compatible the observed data are with the null hypothesis.
It is the probability that the test statistic is equal to or more extreme than the observed value, in the direction predicted by \(H_{a}\).
Calculated by presuming that \(H_{0}\) is true.
Small p-values (\(<0.05\) or \(<0.01\)) suggest that the observed data would be unusual if \(H_{0}\) were true.
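As a rough illustration of how a p-value follows from a test statistic, the sketch below assumes the statistic is approximately standard Normal when \(H_{0}\) is true. The helper name `p_from_z` and the value `z_obs = 2.1` are hypothetical, chosen only for illustration.

```r
# Hypothetical helper: p-value for a z statistic, assuming the statistic
# is approximately standard Normal when H0 is true.
p_from_z <- function(z, alternative = c("two.sided", "less", "greater")) {
  alternative <- match.arg(alternative)
  switch(alternative,
    less      = pnorm(z),                      # P(Z <= z)
    greater   = pnorm(z, lower.tail = FALSE),  # P(Z >= z)
    two.sided = 2 * pnorm(-abs(z))             # both tails
  )
}

z_obs <- 2.1  # illustrative value, not taken from any data set
p_from_z(z_obs, "greater")    # one-sided, Ha predicts a larger value
p_from_z(z_obs, "two.sided")  # two-sided, Ha predicts a difference
```

The direction of \(H_{a}\) decides which tail counts as "more extreme", which is why the same statistic can give different p-values under different alternatives.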
Survey of \(N = 100\) people from GA, \(N = 100\) people from SC.
Ask: “On a scale from 1 to 5 where 1 is Very Liberal and 5 is Very Conservative, where would you place yourself?”
Results: \(\bar{x}_{GA} = 3\), \(\bar{x}_{SC} = 4\), \(SE = 1\).
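A worked version of this example, under two assumptions that are not stated above: the hypotheses are \(H_{0}: \mu_{GA} = \mu_{SC}\) versus \(H_{a}: \mu_{GA} \neq \mu_{SC}\), and \(SE = 1\) is the standard error of the difference in means. The test statistic is treated as approximately standard Normal.

\[ \begin{aligned} z &= \frac{(\bar{x}_{SC} - \bar{x}_{GA}) - 0}{SE} = \frac{(4 - 3) - 0}{1} = 1 \\ p &= 2 \times P(Z \geq 1) \approx 0.32 \end{aligned} \]

Because \(0.32 > 0.05\), a one-point difference with a standard error this large would not be unusual if \(H_{0}\) were true, so \(H_{0}\) would not be rejected.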
Survey of \(N = 1600\) people.
Steps
1. Define \(H_{0}\) and \(H_{a}\).
2. Choose a test statistic.
3. Find the p-value for the test statistic (p).
4. Compare the p-value to a preset significance level (\(\alpha = 0.05\) or \(\alpha = 0.01\)).
5. Reject \(H_{0}\) if \(p \leq \alpha\); otherwise, do not reject \(H_{0}\).
Assumptions
Hypotheses
Test statistic
P-value
Conclusion
\[ z = \frac{\hat{\pi} - \pi_{0}}{\sigma_{0}}; \sigma_{0} = \sqrt{\pi_{0}(1-\pi_{0})/n} \]
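The formula above can be sketched in a few lines of R. The sample proportion `pi_hat = 0.52` is hypothetical, `n = 1600` simply mirrors the survey size mentioned earlier, and the one-sided alternative \(H_{a}: \pi > \pi_{0}\) is an assumption made for this illustration.

```r
# Sketch of a one-sample z-test for a proportion.
n      <- 1600   # sample size
pi_hat <- 0.52   # hypothetical sample proportion (not from real data)
pi_0   <- 0.50   # value of the parameter under H0

sigma_0 <- sqrt(pi_0 * (1 - pi_0) / n)   # standard error computed under H0
z       <- (pi_hat - pi_0) / sigma_0     # test statistic
p_value <- pnorm(z, lower.tail = FALSE)  # upper tail, since Ha: pi > pi_0

alpha <- 0.05
c(z = z, p_value = p_value)
if (p_value <= alpha) "Reject H0" else "Do not reject H0"
```

With these illustrative numbers, \(z = 1.6\) and the one-sided p-value is roughly 0.055, so \(H_{0}\) would not be rejected at \(\alpha = 0.05\). Note that the standard error uses \(\pi_{0}\), not \(\hat{\pi}\), because the p-value is calculated by presuming that \(H_{0}\) is true.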
**Can the population mean be less than 50?**
religdata <- rnorm(n = 100, mean = 43, sd = 2)
# H0: mu = 50 versus Ha: mu < 50 (one-sided test)
t.test(x = religdata,
       mu = 50,
       alternative = "less")
##
## One Sample t-test
##
## data: religdata
## t = -32.97, df = 99, p-value < 2.2e-16
## alternative hypothesis: true mean is less than 50
## 95 percent confidence interval:
## -Inf 43.46561
## sample estimates:
## mean of x
## 43.11908
**Can the population mean be equal to 44?**
religdata <- rnorm(n = 100, mean = 43, sd = 2)
# H0: mu = 44 versus Ha: mu != 44 (two-sided test)
t.test(x = religdata,
       mu = 44,
       alternative = "two.sided")
##
## One Sample t-test
##
## data: religdata
## t = -5.6771, df = 99, p-value = 1.37e-07
## alternative hypothesis: true mean is not equal to 44
## 95 percent confidence interval:
## 42.53130 43.29206
## sample estimates:
## mean of x
## 42.91168
It is not correct to say that you “accept” the null hypothesis.
This is because you are testing only one of many possible parameter values. Saying that you “do not reject” the null emphasizes that you cannot rule it out as the correct parameter value.
But there may be several other plausible values.
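A minimal sketch of this point, using simulated data like the examples above. The seed and the candidate null values are arbitrary choices made for illustration: several different null values taken from inside the 95% confidence interval all fail to be rejected, so none of them can be "accepted" as the single true value.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible
x <- rnorm(n = 100, mean = 43, sd = 2)

# 95% confidence interval for the mean from a two-sided one-sample t-test
ci <- t.test(x)$conf.int

# Candidate null values: three inside the interval, plus 44 for comparison
mu_0 <- c(ci[1] + 0.05, mean(x), ci[2] - 0.05, 44)
p_values <- sapply(mu_0, function(m) t.test(x, mu = m)$p.value)

data.frame(mu_0    = round(mu_0, 2),
           p_value = signif(p_values, 3),
           reject  = p_values <= 0.05)
```

Every value inside the 95% interval has a two-sided p-value above 0.05, so each one is a null hypothesis that the data cannot rule out.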