wu :: forums » riddles » hard » Statistics
 Author Topic: Statistics  (Read 6672 times)
Benny
Uberpuzzler
Posts: 1024
 Statistics   « on: Feb 16th, 2008, 12:56am »

Suppose that a psychologist wants to compare the intelligence quotient of a typical person's reading
ability to their math ability.

Let us define the following variables:

Z=a person's overall recorded IQ

X= A person's mathematical recorded IQ

Y= A person's reading ability recorded IQ.

Now if our psychologist measures the above characteristics for many individuals, and then computes

Covariance(X-Z, Y-Z) = -0.5 with these values, then what can we say about a person's mathematical
ability compared to their reading ability?

In an actual experiment, why might it be necessary to compute Covariance(X -Z, Y-Z) instead of
Covariance(X,Y) in order to find a relation between X and Y?

Hint: This question requires some knowledge of the psychology of intelligence.

If we want to understand our world — or how to change it — we must first understand the rational choices that shape it.
pex
Uberpuzzler
Posts: 880
 Re: Statistics   « Reply #1 on: Feb 16th, 2008, 6:36am »

Is this "hard"?

Cov(X-Z,Y-Z) < 0 means that on average, given two people of the same "overall intelligence level" (whatever that may be), if one of the two is better at math, it is likely that the other is better at reading.

Cov(X,Y) is not a useful measure here, because underlying "overall intelligence" will be confounding the results. In fact, it is likely that Cov(X,Y) > 0: given two arbitrary people, if one of the two is better at math, it is likely that the same person is also better at reading, simply because of the higher "overall intelligence level".
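
A toy illustration of the confounding argument, as a minimal sketch only. The scores are invented, and the model (a shared level g plus a tilt t toward math or reading, with Z assumed equal to g) is an assumption made for the example, not part of the puzzle. In this toy model the negative sign of Cov(X-Z, Y-Z) is partly built in, because X-Z = t and Y-Z = -t; a real overall IQ score would include other components.

```python
def sample_cov(a, b):
    """Sample covariance with the usual n-1 denominator."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)

# Hypothetical data: g is a shared "overall" level, t tilts a person
# toward math (+) or reading (-). Both lists are invented.
g = [90, 100, 110, 120]
t = [5, -5, -5, 5]

X = [gi + ti for gi, ti in zip(g, t)]   # math score
Y = [gi - ti for gi, ti in zip(g, t)]   # reading score
Z = g                                   # overall score (assumed equal to g)

cov_raw = sample_cov(X, Y)
cov_adjusted = sample_cov([x - z for x, z in zip(X, Z)],
                          [y - z for y, z in zip(Y, Z)])
print(cov_raw, cov_adjusted)
```

Here the raw covariance is dominated by the spread in g, while the adjusted covariance only sees the tilt.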

Benny
Uberpuzzler
Posts: 1024
 Re: Statistics   « Reply #2 on: Feb 16th, 2008, 10:52am »

Oops, no.

Benny
Uberpuzzler
Posts: 1024
 Re: Statistics   « Reply #3 on: Feb 16th, 2008, 11:36am »

You wrote:
"Cov(X-Z,Y-Z) < 0 means that on average, given two people of the same "overall intelligence level" (whatever that may be), if one of the two is better at math, it is likely that the other is better at reading."
..........................
Yes, that's right. More generally, one can say that the larger the difference between one's math IQ and their overall IQ, the larger the difference between their reading IQ and their overall IQ will be, but in the opposite direction.
...........................

Then, you wrote:

"Cov(X,Y) is not a useful measure here, because underlying "overall intelligence" will be confounding the results. In fact, it is likely that Cov(X,Y) > 0: given two arbitrary people, if one of the two is better at math, it is likely that the same person is also better at reading, simply because of the higher "overall intelligence level"."
...........

In general, we can compare the covariance between all sorts of different mental areas (not just math and reading). An interesting question is why some mental abilities may be positively correlated, while others are negatively correlated.


Benny
Uberpuzzler
Posts: 1024
 Re: Statistics   « Reply #4 on: Mar 17th, 2008, 9:04am »

Here's a different question:

For the one-way analysis of variance with a independent samples, each of size n, prove that

Expected value{n*sum((x_ibar -xbar)^2)/(a-1)} = variance + n*sum((alpha_i)^2)/(a-1)

note: Each sum is taken from 1 to a. The alpha_i are called the treatment effects and are such that
sum(alpha_i)=0
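
One way the proof can go, as a sketch under an assumed model: the usual one-way layout x_ij = mu + alpha_i + eps_ij, with independent errors of mean 0 and variance sigma^2. The symbols mu, eps_ij and sigma^2 are not in the exercise as posted and are introduced here for illustration.

```latex
% Assumed model: x_{ij} = \mu + \alpha_i + \varepsilon_{ij},
% i = 1..a, j = 1..n, errors independent with mean 0 and variance \sigma^2.
\begin{align*}
\bar{x}_i - \bar{x}
  &= \alpha_i + (\bar{\varepsilon}_i - \bar{\varepsilon})
  && \text{since } \textstyle\sum_i \alpha_i = 0, \\
E\big[(\bar{x}_i - \bar{x})^2\big]
  &= \alpha_i^2 + \operatorname{Var}(\bar{\varepsilon}_i - \bar{\varepsilon})
   = \alpha_i^2 + \frac{\sigma^2}{n}\cdot\frac{a-1}{a}, \\
E\!\left[\frac{n\sum_{i=1}^{a}(\bar{x}_i - \bar{x})^2}{a-1}\right]
  &= \frac{n}{a-1}\left(\sum_{i=1}^{a}\alpha_i^2 + \frac{(a-1)\,\sigma^2}{n}\right)
   = \sigma^2 + \frac{n\sum_{i=1}^{a}\alpha_i^2}{a-1}.
\end{align*}
```

The cross term vanishes in the second line because the averaged errors have mean 0.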

Eigenray
wu::riddles Moderator
Uberpuzzler
Posts: 1948
 Re: Statistics   « Reply #5 on: Mar 17th, 2008, 12:48pm »

What are the i?  Shouldn't they appear on the LHS as well?

Benny
Uberpuzzler
Posts: 1024
 Re: Statistics   « Reply #6 on: Mar 17th, 2008, 4:12pm »

I think the alpha_i represent the difference between E(x) and E(x_i). In other words, they represent the effect of using different treatments. It is an exercise I found in my textbook.

Benny
Uberpuzzler
Posts: 1024
 Re: Statistics   « Reply #7 on: Jun 29th, 2008, 1:27pm »

The Wisdom of Crowds

http://en.wikipedia.org/wiki/The_Wisdom_of_Crowds

Ask enough people to estimate something, and the average of their guesses will get you surprisingly close to the right answer

In an election year, people might disagree about who makes the best candidate, but you don't hear much argument on the merits of democracy: that millions of average people can, together, make a wise decision ... unless the game is rigged.

In the early 20th century, the controversial Englishman Sir Francis Galton tried to statistically test whether mobs of common folk were capable of choosing well.

What Sir Francis actually found was that, mathematically at least, there's often wisdom in a crowd.

Sir Francis Galton was a nobleman who scorned the common masses. He thought that votes of governance should be left to higher classes. He'd prove with all the data from a contest inescapable, of guessing even simple things that commoners were incapable.

He went to the market, stood before a crowd and asked the people to guess an ox's weight.

He told them, "Guess the weight correctly and win a prize!"

An eager crowd queued up to play, eight hundred made a guess that day.

Sir Francis Galton had 800 data points, but no one guessed the weight correctly. Then, he declared to the crowd, "And now the ox's weight is exactly...eleven hundred ninety-eight pounds. There are no winners!"

Sir Francis knew the crowd Would never guess the weight -- How might they judge important things, if left to meet that fate?

With mathematics he would show how far they went astray. But in the end his theory was in total disarray. Because, by graphing all the guesses (a curve resembling the cumulative distribution function of the normal distribution) and determining their median, he showed that if the crowd were one, its estimate is keen.

That's because, while no individual guessed the actual weight, the average of all the individual guesses was almost exactly right. The average will generally be better than a randomly selected individual guess.

The median of the masses assures us of success.

And the larger the number of guesses we toss in the more likely we are to get the right answer about the oxen.

Sir Francis found the wisdom of the crowds.

While I'm sure that averaging readings is a good way to minimize error, I don't think that it will ensure success in every single case.

Examples: the wrong politicians that get elected, the wrong politicians that get nominated, etc.

Also, I think that the average tends to the real reading because of the central limit theorem: we can treat the variations in individual readings as random variation. Given that, many random variables acting together will produce a Gaussian random variable, with its mean at the correct reading.
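
A small simulation of the averaging argument, as a rough sketch. The spread of 75 pounds, the bias of 50 pounds and the normal shape of the guesses are all assumptions made for the example; only the weight of 1198 pounds and the crowd size of 800 come from the story. Strictly speaking, the average settling near its target is the law of large numbers, and it only lands on the true value if the individual errors are centred on zero.

```python
import random
import statistics

TRUE_WEIGHT = 1198   # pounds, the figure quoted in the story
SPREAD = 75          # assumed standard deviation of individual guesses

def crowd_mean(n_guesses, bias, rng):
    """Average of n_guesses noisy guesses whose errors are centred on `bias`."""
    guesses = [rng.gauss(TRUE_WEIGHT + bias, SPREAD) for _ in range(n_guesses)]
    return statistics.fmean(guesses)

rng = random.Random(0)  # fixed seed so the run is repeatable
unbiased_crowd = crowd_mean(800, bias=0, rng=rng)
biased_crowd = crowd_mean(800, bias=50, rng=rng)   # everyone overestimates

print(abs(unbiased_crowd - TRUE_WEIGHT))  # error of the unbiased crowd
print(abs(biased_crowd - TRUE_WEIGHT))    # error of the biased crowd
```

More guesses shrink the random scatter of the average, but they do nothing about a shared bias.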


towr
wu::riddles Moderator
Uberpuzzler

Some people are average, some are just mean.

Posts: 13730
 Re: Statistics   « Reply #8 on: Jun 29th, 2008, 2:10pm »

on Jun 29th, 2008, 1:27pm, BenVitale wrote:
 In an election year, people might disagree about who makes the best candidate, but you don't hear much argument on the merits of democracy
The only merit of democracy is that the alternatives are worse.

Quote:
 that millions of average people can, together, make a wise decision ...
Wise? I'm not sure what Utopia you're living in, but wisdom is something I rarely find in politics.

Quote:
 unless the game is rigged.
Which it often, intended or not, is.
http://en.wikipedia.org/wiki/Voting#Fair_voting
"Economist Kenneth Arrow lists five characteristics of a fair voting system. However, Arrow's impossibility theorem shows that it is impossible for any voting system which offers more than three options per question to have all 5 characteristics at the same time."

Quote:
 Sir Francis found the wisdom of the crowds.
Strange definition of wisdom, to guess the weight of an ox correctly.
Being accurate and wise are different things.

Quote:
I think it depends very, very much on the type of question you ask a crowd.
I wouldn't trust them with "is 0.999... equal to 1?" for example. Let alone questions of a more difficult nature.

Crowds may make a good estimate at measurements and most probably in some other areas, but that has little to do with anything like wisdom.

 « Last Edit: Jun 29th, 2008, 2:12pm by towr »

Wikipedia, Google, Mathworld, Integer sequence DB
Hippo
Uberpuzzler
Posts: 919
 Re: Statistics   « Reply #9 on: Jun 30th, 2008, 12:28am »

I am not sure the crowd would have explained quantum theory "paradoxes" well in the 18.. century.
Maybe in 22..? ... Isn't that rather a process of "education"?
 « Last Edit: Jun 30th, 2008, 12:28am by Hippo »

TenaliRaman
Uberpuzzler

I am no special. I am only passionately curious.

Posts: 1001
 Re: Statistics   « Reply #10 on: Jun 30th, 2008, 12:58am »

The "average" answer being close to a "true" value is by very definition only correct for questions which have objective truths and not subjective truths.

-- AI

Self discovery comes when a man measures himself against an obstacle - Antoine de Saint Exupery
ThudnBlunder
wu::riddles Moderator
Uberpuzzler

The dewdrop slides into the shining Sea

Posts: 4489
 Re: Statistics   « Reply #11 on: Jun 30th, 2008, 1:34am »

A former colleague once asked me, "Why do we divide by n-1 to estimate the population variance, not by n?"
Replying that dividing by n-1 gives an unbiased estimator of the true variance is not very enlightening.


THE MEEK SHALL INHERIT THE EARTH.....................................................................er, if that's all right with the rest of you.
towr
wu::riddles Moderator
Uberpuzzler
Posts: 13730
 Re: Statistics   « Reply #12 on: Jun 30th, 2008, 3:02am »

on Jun 30th, 2008, 1:34am, ThudanBlunder wrote:
 A former colleague once asked me, "Why do we divide by n-1 to estimate the population variance, not by n?" Replying that dividing by n-1 gives an unbiased estimator of the true variance is not very enlightening.   So, off the top of your head, how would you answer?
Well, I'd start with saying that with one datapoint, you have no idea at all about the variance, it can still be anything. So rather than divide by n=1, which would give 0 variance, you divide by n-1=0, giving 0/0, or unknown, variance.
If he buys that, it beats showing how the expected value of the unbiased estimator equals the true variance. (And so saves you the trouble of finding pen and paper, and possibly having to explain various statistical calculations.)

Eigenray
wu::riddles Moderator
Uberpuzzler
Posts: 1948
 Re: Statistics   « Reply #13 on: Jun 30th, 2008, 3:28am »

The variance is the expected value of the square of the distance to the mean.  When you estimate the variance based on a sample, you don't know the mean, so you use the sample mean.  But the sample mean will always be closer to your samples than the true mean: the quantity $c$ that minimizes $\sum_i (x_i - c)^2$ is precisely the mean of the $x_i$.

So when you find the variance of a sample, using the sample's own mean, you tend to underestimate the variance.
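
The underestimate can be checked exactly on a small case. This is a sketch with an assumed setup that is not from the thread: samples of size 3 drawn with replacement from a fair six-sided die, with every possible sample enumerated so that no randomness is involved.

```python
from fractions import Fraction
from itertools import product

def sum_sq_dev(sample):
    """Sum of squared deviations from the sample's own mean."""
    m = Fraction(sum(sample), len(sample))
    return sum((x - m) ** 2 for x in sample)

faces = range(1, 7)   # assumed population: a fair die
n = 3                 # assumed sample size

mu = Fraction(sum(faces), len(faces))
sigma2 = sum((x - mu) ** 2 for x in faces) / len(faces)   # true variance

# Every equally likely sample of size n, drawn with replacement.
samples = list(product(faces, repeat=n))

avg_biased = sum(sum_sq_dev(s) / n for s in samples) / len(samples)
avg_unbiased = sum(sum_sq_dev(s) / (n - 1) for s in samples) / len(samples)

print(sigma2, avg_biased, avg_unbiased)
```

Dividing by n averages out to (n-1)/n times the true variance, while dividing by n-1 averages out to the true variance itself.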
 « Last Edit: Jun 30th, 2008, 3:30am by Eigenray »

pex
Uberpuzzler
Posts: 880
 Re: Statistics   « Reply #14 on: Jun 30th, 2008, 4:20am »

on Jun 30th, 2008, 3:28am, Eigenray wrote:
 The variance is the expected value of the square of the distance to the mean.  When you estimate the variance based on a sample, you don't know the mean, so you use the sample mean.  But the sample mean will always be closer to your samples than the true mean: the quantity $c$ that minimizes $\sum_i (x_i - c)^2$ is precisely the mean of the $x_i$.   So when you find the variance of a sample, using the sample's own mean, you tend to underestimate the variance.

Yes, I've always liked the interpretation of "losing a degree of freedom" by using the sample mean rather than the true (but unknown) population mean.

More generally, if we fit a linear model $y_i = \beta_1 x_{1i} + \beta_2 x_{2i} + \dots + \beta_k x_{ki} + \varepsilon_i$, then we estimate the variance of the error term by $\left(\sum_i e_i^2\right) / (n - k)$. (Here, $e_i$ is the residual from the fitted model; the estimated value of $\varepsilon_i$, if you like.) The usual sample variance is just the case where k = 1 and the single "explanatory variable" is a constant term.
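
The k = 1 case can be checked directly. This is a minimal sketch: the helper name `error_variance` and the data are made up for illustration, and only the constant-only model is fitted (its least-squares fit is the sample mean).

```python
import statistics

def error_variance(y, fitted, k):
    """Residual sum of squares divided by n - k (k = number of fitted coefficients)."""
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    return sum(e * e for e in residuals) / (len(y) - k)

y = [2.0, 4.0, 4.0, 6.0, 9.0]   # invented observations

# Constant-only model: the least-squares fit of a single constant is the mean.
constant_fit = [statistics.fmean(y)] * len(y)
s2 = error_variance(y, constant_fit, k=1)

print(s2, statistics.variance(y))
```

With k = 1 the divisor is n - 1, so the result agrees with the ordinary sample variance.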
 « Last Edit: Jun 30th, 2008, 4:23am by pex »

rmsgrey
Uberpuzzler
Posts: 2846
 Re: Statistics   « Reply #15 on: Jun 30th, 2008, 10:32am »

The "Wisdom of the Masses" is going to have trouble with bimodal distributions too - for instance, if you went down to Wimbledon and asked each spectator in the crowd where they thought the next serve would bounce, you'd end up with two dense clusters - one in each of the corners of the relevant service box away from the net. The mean would be somewhere in between, where almost nobody guesses, and the serve almost never goes...
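
A tiny numeric version of the bimodal problem, as a sketch. The guesses are invented and are measured as a fraction of the service box width (0 is one corner, 1 the other); that scale is an assumption for the example.

```python
# Hypothetical guesses: two clusters, one near each far corner.
left_cluster = [0.05, 0.08, 0.10, 0.12, 0.15]
right_cluster = [0.85, 0.88, 0.90, 0.92, 0.95]
guesses = left_cluster + right_cluster

mean_guess = sum(guesses) / len(guesses)

# Distance from the crowd's mean to the nearest thing anyone actually guessed.
nearest_gap = min(abs(g - mean_guess) for g in guesses)

print(mean_guess, nearest_gap)
```

The mean sits in the middle of the box, far from every individual guess.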

ThudnBlunder
wu::riddles Moderator
Uberpuzzler
Posts: 4489
 Re: Statistics   « Reply #16 on: Jun 30th, 2008, 4:41pm »

on Jun 30th, 2008, 4:20am, pex wrote:
 Yes, I've always liked the interpretation of "losing a degree of freedom" by using the sample mean rather than the true (but unknown) population mean.

That was my explanation, too

Christine
Full Member
Posts: 159
 Re: Statistics   « Reply #17 on: Oct 23rd, 2008, 3:38pm »

In multivariate regression, one can compare two models where one model has an additional term by using the F-test. This tests the significance of adding the extra term.

Is there a similar test to compare models of the same number of independent variables (IV), but the last IV different in each model?

R², adjusted R², R² (pred), and Mallows' Cp all show a difference, but do not test the significance of this difference.

william wu
Posts: 1291
 Re: Statistics   « Reply #18 on: Oct 23rd, 2008, 8:50pm »

Hi Christine,


[ wu ] : http://wuriddles.com / http://forums.wuriddles.com
Earendil
Newbie
Posts: 46
 Re: Statistics   « Reply #19 on: Oct 23rd, 2008, 9:58pm »

Nevertheless, unbiased estimators were created only for the problem of finding a "best estimator", that is, one which minimizes the expected distance with the parameter, to be meaningful (in some sense).

Dividing by "n" yields an estimator which is biased but which is better than dividing by "n-1" in various ways. For instance, its expected distance to the parameter is smaller than that of "n-1" for any value of the parameter.

I don't see why "unbiased" is a good property besides making the problem of the "best estimator" tractable by a classic statistician.
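
The comparison of expected squared distance can be worked out exactly on one small case. This is a sketch under assumptions that are not from the thread: samples of size 2 from a fair six-sided die, with "distance" read as mean squared error. It checks a single example and is not a proof for every distribution or sample size.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)   # assumed population: a fair die
n = 2                 # assumed sample size

mu = Fraction(sum(faces), len(faces))
sigma2 = sum((x - mu) ** 2 for x in faces) / len(faces)   # true variance

def estimate(sample, divisor):
    """Sum of squared deviations from the sample mean, over the given divisor."""
    m = Fraction(sum(sample), len(sample))
    return sum((x - m) ** 2 for x in sample) / divisor

# Every equally likely sample of size n, drawn with replacement.
samples = list(product(faces, repeat=n))

def mse(divisor):
    """Exact mean squared error of the estimator over all samples."""
    return sum((estimate(s, divisor) - sigma2) ** 2 for s in samples) / len(samples)

mse_n = mse(n)
mse_n_minus_1 = mse(n - 1)

print(mse_n, mse_n_minus_1)
```

In this example the divide-by-n estimator is biased but has the smaller mean squared error.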

towr
wu::riddles Moderator
Uberpuzzler
Posts: 13730
 Re: Statistics   « Reply #20 on: Oct 24th, 2008, 12:22am »

on Oct 23rd, 2008, 9:58pm, Earendil wrote:
 I don't see why "unbiased" is a good property besides making the problem of the "best estimator" tractable by a classic statistician.
Because you want to say something about the population you took the sample from, and not about the sample itself. For large n the numbers converge, obviously, but for small n there is a distinct difference.
The biased measures always make you seem more certain than you have cause to be.

Earendil
Newbie
Posts: 46
 Re: Statistics   « Reply #21 on: Nov 2nd, 2008, 8:24pm »

on Oct 24th, 2008, 12:22am, towr wrote:
 Because you want to say something about the population you took the sample from, and not about the sample itself. For large n the numbers converge, obviously, but for small n there is a distinct difference.   The biased measures always make you seem more certain than you have cause to be.

Could you give me an example, please?

towr
wu::riddles Moderator
Uberpuzzler
Posts: 13730
 Re: Statistics   « Reply #22 on: Nov 3rd, 2008, 2:04am »

on Nov 2nd, 2008, 8:24pm, Earendil wrote:
 Could you give me an example, please?
Suppose I have a population of size 100; let's just take the numbers 1..100.
Now take a sample of size 1; say we get 37. Our biased variance is 0, suggesting that without a doubt all 100 numbers are 37. Our unbiased variance doesn't exist (or we could say it's infinite, depending on how you treat division by 0). In other words, it suggests there isn't enough information in our sample to say anything about the variance of the population.
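
The same example in code, as a minimal sketch using Python's standard `statistics` module: `pvariance` divides by n and `variance` divides by n-1. Storing `None` for the undefined case is a choice made here for illustration.

```python
import statistics

sample = [37]   # the single draw from the example

# Dividing by n: a lone data point has zero spread around itself.
biased = statistics.pvariance(sample)

# Dividing by n-1: the library refuses, since one point carries no
# information about the spread of the population.
try:
    unbiased = statistics.variance(sample)
except statistics.StatisticsError:
    unbiased = None   # treated here as "undefined"

print(biased, unbiased)
```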
 « Last Edit: Nov 3rd, 2008, 2:08am by towr »