March 14, 2017

Two Group Comparisons

  • Often, we want to test hypotheses about two groups.

  • Does the outcome of a treatment differ from a control?

  • Does the United States have more income inequality than Canada?

  • Are there more military interventions under Democratic or Republican presidents?

Bivariate Analyses

  • When comparing two groups, you are generally conducting a bivariate analysis, i.e., an analysis that compares two variables.

  • When doing bivariate analysis, your samples can be one of two types:

  1. Dependent samples: the samples contain the same subjects, or the values for one group of subjects affect the values for the other, e.g., housework between husbands and wives, or repeated measurements of test scores on the same people.

  2. Independent samples: observations in one sample are independent of the observations in the other, e.g., randomly selected subjects in Michigan and randomly selected subjects in Georgia asked about their party affiliation.

Bivariate Analyses

  • Bivariate analyses are typically conducted with independent samples, or with samples that are at least assumed to be independent.

  • Sometimes it's tricky to figure out whether samples are dependent or independent.

Bivariate Analyses

  • Does the outcome of a treatment differ from a control? (Independent)

  • Does the United States have more income inequality than Canada? (Independent, but may have dependencies)

  • Are there more military interventions under Democratic or Republican presidents? (?)

Bivariate Analyses

  • In the context of simple significance testing, we generally assume that samples are independent.

  • There are more sophisticated methods of dealing with dependent samples that we'll learn about when we get back from the break.

Significance testing with two samples

  • The main difference between significance testing with one sample and two samples is the standard error.

  • With two independent samples, the standard error of the difference is \(\sqrt{se_{1}^2 + se_{2}^2}\).

  • Everything else is pretty much the same.
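
One way to see where this standard error comes from, assuming the two estimates are independent: the variance of a difference between independent estimates is the sum of their variances, so the standard errors add in quadrature. The test statistic then has the same form as in the one-sample case. The labels \(est_{1}\), \(est_{2}\), and \(se_{diff}\) are notation introduced here for illustration only.

\[ se_{diff} = \sqrt{se_{1}^2 + se_{2}^2}, \qquad \text{test statistic} = \frac{(est_{2} - est_{1}) - 0}{se_{diff}} \]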

Are there more military interventions under Democratic or Republican presidents?

  • Here we are interested in comparing:
  1. \(m_{democrat}\) - average # of military interventions under Democratic presidents from 1900-Present.

  2. \(m_{republican}\) - average # of military interventions under Republican presidents from 1900-Present.

\[ \hat{m}_{democrat} = \sum_{i=1}^{N}\frac{intervention_{i}}{Terms_{democrat}} \]

Are there more military interventions under Democratic or Republican presidents?

  • Let's say that there are 29 total terms.

  • \(Terms_{democrat} = 14\), \(Terms_{republican} = 15\)

  • \(\hat{m}_{democrat}= 1.5\) , \(\hat{m}_{republican}= 1.8\)

  • \(se_{democrat}= 0.2\) , \(se_{republican}= 0.3\)
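
As a rough sketch of how these numbers feed into the test, the snippet below combines the two standard errors and forms the test statistic for a null hypothesis of no difference. The function name `difference_test_statistic` is hypothetical, and the value 1.8 is assumed to be the Republican mean.

```python
import math

def difference_test_statistic(est1, se1, est2, se2):
    """Return (difference, se of the difference, test statistic) for two independent samples."""
    se_diff = math.sqrt(se1**2 + se2**2)  # standard errors combine in quadrature
    diff = est2 - est1
    return diff, se_diff, diff / se_diff  # null hypothesis: the true difference is 0

# Slide values; 1.8 is assumed to be the Republican mean.
diff, se_diff, stat = difference_test_statistic(1.5, 0.2, 1.8, 0.3)
print(f"difference = {diff:.2f}, se = {se_diff:.3f}, test statistic = {stat:.2f}")
```

With only 14 and 15 terms per group, a t distribution would be the more cautious reference distribution for judging this statistic.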

Significance tests with two proportions

  • Hypothesis tests for proportions are most useful with categorical data.

  • Example: Does Prayer Help Coronary Surgery Patients?

Significance tests with two proportions

Whether complications occurred for heart surgery patients who did or did not have group prayer

Prayer   Complications   No Complications   Total
Yes      315             289                604
No       304             293                597

  • Study: patients randomly assigned to two groups
  1. Christian volunteers instructed to pray.
  2. No instructions to pray.
  • Question: Did prayer reduce complications?

Significance tests with two proportions

  • Start by thinking about what we are comparing.

  • We want to compare the % of people in the "Prayer" group that had complications (\(\pi_{1}\)) vs. % of people in the "No Prayer" group that had complications (\(\pi_{2}\)).

  • \(\hat{\pi}_{1} = 315/604 = 0.522\), \(\hat{\pi}_{2} = 304/597 = 0.509\)

Significance tests with two proportions

  • What are the null and alternative hypotheses?

  • Perform the significance test to answer the question.
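
One possible way to carry out this test is sketched below, assuming a two-sided alternative and the usual pooled-proportion standard error under the null hypothesis \(\pi_{1} = \pi_{2}\). The function name `two_proportion_z_test` is hypothetical; the counts come from the prayer table, with 597 as the "No Prayer" group size.

```python
import math

def two_proportion_z_test(x1, n1, x2, n2):
    """z-test of H0: pi1 == pi2, using the pooled proportion for the standard error."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # common proportion assumed under H0
    se0 = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se0
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Prayer group: 315 of 604 had complications; no-prayer group: 304 of 597.
z, p_value = two_proportion_z_test(315, 604, 304, 597)
print(f"z = {z:.2f}, two-sided p-value = {p_value:.2f}")
```

If the alternative is one-sided (prayer reduces complications, so \(\pi_{1} < \pi_{2}\)), the p-value would instead be the lower-tail probability of z.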

Confidence intervals with two proportions

  • What is the 95% confidence interval for the effect of prayer?
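
A minimal sketch of one common approach, assuming a large-sample normal interval with the unpooled standard error and a critical value of 1.96. The function name `diff_proportions_ci` is hypothetical; the counts come from the prayer table.

```python
import math

def diff_proportions_ci(x1, n1, x2, n2, z_crit=1.96):
    """Large-sample confidence interval for pi1 - pi2 (unpooled standard error)."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z_crit * se, diff + z_crit * se

# Prayer group: 315 of 604 had complications; no-prayer group: 304 of 597.
lower, upper = diff_proportions_ci(315, 604, 304, 597)
print(f"95% CI for the difference in proportions: ({lower:.3f}, {upper:.3f})")
```

An interval that contains 0 is consistent with no difference between the two groups.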

Significance testing with two means

  • The test for comparing two means is pretty much identical to the one for comparing two proportions.

  • Only the reference distribution (a t distribution rather than the standard normal) and the formula for the standard error change.

  • Example: Who spends more time doing housework, men or women?

Significance testing with two means

Cooking and washing up minutes, per day, for a national survey of men and women working full time in Britain.

Sex     Sample Size   Mean (minutes/day)   SD (minutes/day)
Men     1219          23                   32
Women   733           37                   16

  • What are the null and alternative hypotheses?

  • Perform a significance test to answer the question: who spends more time doing housework, men or women?
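
The test could be sketched as follows, assuming independent samples, a standard error of \(\sqrt{s_{1}^2/n_{1} + s_{2}^2/n_{2}}\), and a two-sided alternative. The function name `two_mean_test` is hypothetical. With samples this large the t distribution is nearly identical to the standard normal, so the sketch uses the normal for the p-value as an approximation.

```python
import math

def two_mean_test(mean1, sd1, n1, mean2, sd2, n2):
    """Test of H0: mu1 == mu2 for two independent samples (large-sample approximation)."""
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    t = (mean2 - mean1) / se
    # Two-sided p-value, using the standard normal in place of the t distribution.
    p_value = math.erfc(abs(t) / math.sqrt(2))
    return se, t, p_value

# Men: n = 1219, mean 23, SD 32; women: n = 733, mean 37, SD 16.
se, t, p_value = two_mean_test(23, 32, 1219, 37, 16, 733)
print(f"se = {se:.2f}, t = {t:.1f}, two-sided p-value = {p_value:.3g}")
```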

Significance testing with two means

  • What is the 95% confidence interval for the difference between time spent doing housework by men and women?
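
One way to compute this interval is sketched below, assuming independent samples and a critical value of 1.96, which is reasonable here because both samples are large. The function name `diff_means_ci` is hypothetical; the inputs come from the housework table.

```python
import math

def diff_means_ci(mean1, sd1, n1, mean2, sd2, n2, crit=1.96):
    """Large-sample confidence interval for mu2 - mu1 with independent samples."""
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    diff = mean2 - mean1
    return diff - crit * se, diff + crit * se

# Men: n = 1219, mean 23, SD 32; women: n = 733, mean 37, SD 16.
lower, upper = diff_means_ci(23, 32, 1219, 37, 16, 733)
print(f"95% CI for women minus men: ({lower:.1f}, {upper:.1f}) minutes per day")
```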