[POLS 4150] Two Group Comparisons

March 14, 2017

Two Group Comparisons

Often, we want to test hypotheses about two groups.
Does the outcome of a treatment differ from a control?
Does the United States have more income inequality than Canada?
Are there more military interventions under Democratic or Republican presidents?

When comparing two groups, you are generally conducting a bivariate analysis, i.e., an analysis that compares two variables.
When doing bivariate analysis, you can have two types of samples:

Dependent samples - the samples contain the same subjects, or the values for one group of subjects affect the values for the other, e.g., housework between husbands and wives, or repeated measurements of test scores on the same people.
Independent samples - observations in one sample are independent of the observations in the other sample, e.g., randomly selected subjects in Michigan and randomly selected subjects in Georgia asked about their party affiliation.

Bivariate analyses are typically conducted with independent samples, or at the very least with samples assumed to be independent.
Sometimes it's tricky to figure out whether samples are dependent or independent.

Does the outcome of a treatment differ from a control? (Independent)
Does the United States have more income inequality than Canada? (Independent, but may have dependencies)
Are there more military interventions under Democratic or Republican presidents? (?)

In the context of simple significance testing, we generally assume that samples are independent.
There are more sophisticated methods of dealing with dependent samples that we'll learn about when we get back from the break.

The main difference between significance testing with one sample and two samples is the standard error.
With two independent samples, the standard error is now \(\sqrt{se_{1}^2 + se_{2}^2}\), where \(se_{1}\) and \(se_{2}\) are the standard errors of the two individual samples.
Everything else is pretty much the same.
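
Spelled out for the two cases used in these notes, and assuming the usual one-sample standard errors, the combined standard error takes the following form. The symbols \(s\) (sample standard deviation), \(\hat{\pi}\) (sample proportion), and \(n\) (sample size) are introduced here for illustration.

\[ se_{means} = \sqrt{\frac{s_{1}^2}{n_{1}} + \frac{s_{2}^2}{n_{2}}} \qquad se_{proportions} = \sqrt{\frac{\hat{\pi}_{1}(1-\hat{\pi}_{1})}{n_{1}} + \frac{\hat{\pi}_{2}(1-\hat{\pi}_{2})}{n_{2}}} \]

Under a null hypothesis of no difference between the groups, the test statistic is then the difference between the two estimates divided by this standard error:

\[ \frac{estimate_{1} - estimate_{2}}{\sqrt{se_{1}^2 + se_{2}^2}} \]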

\(m_{democrat}\) - average # of military interventions under Democratic presidents from 1900-Present.
\(m_{republican}\) - average # of military interventions under Republican presidents from 1900-Present.

\[ \hat{m}_{democrat} = \sum_{i=1}^{N}\frac{intervention_{i}}{Terms_{democrat}} \]

Whether complications occurred for heart surgery patients who did or did not have group prayer
Prayer	Complications	No Complications	Total
Yes	315	289	604
No	304	293	597

Start by thinking about what we are comparing.
We want to compare the proportion of people in the "Prayer" group that had complications (\(\pi_{1}\)) vs. the proportion of people in the "No Prayer" group that had complications (\(\pi_{2}\)).
\(\pi_{1} = 315/604 = 0.522\), \(\pi_{2} = 304/597 = 0.509\)
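
As a rough sketch of how the comparison could be carried through, the Python below computes the two proportions from the table, combines their standard errors with the \(\sqrt{se_{1}^2 + se_{2}^2}\) formula, and forms a z statistic. The variable names are made up for illustration, and the two-sided alternative is an assumption. A textbook test of equal proportions often uses a pooled standard error instead; this sketch sticks with the unpooled formula given in these notes.

```python
import math

# Counts taken from the prayer table above.
complications_prayer, n_prayer = 315, 604
complications_no_prayer, n_no_prayer = 304, 597

# Sample proportions with complications in each group.
p1 = complications_prayer / n_prayer
p2 = complications_no_prayer / n_no_prayer

# Standard error of each proportion, then the two-sample standard error.
se1 = math.sqrt(p1 * (1 - p1) / n_prayer)
se2 = math.sqrt(p2 * (1 - p2) / n_no_prayer)
se_diff = math.sqrt(se1**2 + se2**2)

# z statistic under the null hypothesis of no difference.
z = (p1 - p2) / se_diff

# Two-sided p-value from the standard normal distribution (assumed alternative).
p_value = math.erfc(abs(z) / math.sqrt(2))

print(f"p1 = {p1:.3f}, p2 = {p2:.3f}")
print(f"se = {se_diff:.4f}, z = {z:.2f}, p-value = {p_value:.2f}")
```

With these counts the z statistic is well below conventional critical values, so the data give little evidence of a difference in complication rates between the two groups.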

The test for comparing two means is nearly identical to the one for comparing two proportions.
Only the standard error and the reference distribution (the t distribution rather than the normal) change.
Example: Who spends more time doing housework, men or women?

Cooking and washing up minutes, per day, for a national survey of men and women working full time in Britain.
Sex	Sample Size	Mean Minutes/Day	SD
Men	1219	23	32
Women	733	37	16

What are the null and alternative hypotheses?
Perform a significance test to answer the question: who spends more time doing housework, men or women?

What is the 95% confidence interval for the difference between time spent doing housework by men and women?
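
One possible way to work through these questions is sketched below in Python, using the numbers from the housework table. It assumes a null hypothesis of equal mean minutes for men and women against a two-sided alternative, and it uses the normal distribution as a large-sample stand-in for the t distribution, which is reasonable with samples this large. Variable names are illustrative only.

```python
import math
from statistics import NormalDist

# Summary statistics from the housework table above.
n_men, mean_men, sd_men = 1219, 23, 32
n_women, mean_women, sd_women = 733, 37, 16

# Standard error of each sample mean, then the two-sample standard error.
se_men = sd_men / math.sqrt(n_men)
se_women = sd_women / math.sqrt(n_women)
se_diff = math.sqrt(se_men**2 + se_women**2)

# Difference in means (women minus men) and its test statistic.
diff = mean_women - mean_men
z = diff / se_diff

# Two-sided p-value, normal approximation (assumption: large samples).
p_value = math.erfc(abs(z) / math.sqrt(2))

# 95% confidence interval for the difference in means.
crit = NormalDist().inv_cdf(0.975)
ci_low = diff - crit * se_diff
ci_high = diff + crit * se_diff

print(f"difference = {diff} minutes, se = {se_diff:.2f}, z = {z:.1f}")
print(f"p-value = {p_value:.3g}")
print(f"95% CI: ({ci_low:.1f}, {ci_high:.1f})")
```

The interval lies entirely above zero, which matches the very large test statistic: in this sample, women spend more minutes per day on cooking and washing up than men.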