Topic: Statistics

Benny
Uberpuzzler
Posts: 1024


Statistics
« on: Feb 16th, 2008, 12:56am »

Suppose that a psychologist wants to compare a typical person's reading ability to their math ability, as measured by intelligence quotient (IQ) scores. Let us define the following variables: Z = a person's overall recorded IQ, X = a person's recorded mathematical IQ, Y = a person's recorded reading IQ. Now, if our psychologist measures these characteristics for many individuals and then computes Covariance(X - Z, Y - Z) = -0.5 from these values, what can we say about a person's mathematical ability compared to their reading ability? In an actual experiment, why might it be necessary to compute Covariance(X - Z, Y - Z) instead of Covariance(X, Y) in order to find a relation between X and Y? Hint: This question requires some knowledge of the psychology of intelligence.


If we want to understand our world — or how to change it — we must first understand the rational choices that shape it.



pex
Uberpuzzler
Posts: 880


Re: Statistics
« Reply #1 on: Feb 16th, 2008, 6:36am »

Is this "hard"? Cov(X - Z, Y - Z) < 0 means that on average, given two people of the same "overall intelligence level" (whatever that may be), if one of the two is better at math, it is likely that the other is better at reading. Cov(X, Y) is not a useful measure here, because underlying "overall intelligence" will be confounding the results. In fact, it is likely that Cov(X, Y) > 0: given two arbitrary people, if one of the two is better at math, it is likely that the same person is also better at reading, simply because of the higher "overall intelligence level".
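The confounding argument can be checked with a small simulation. This is only a sketch under made-up assumptions: the score model, the "tilt" trade-off term, and all the numbers are hypothetical and are not taken from any real IQ data.

```python
import random

def cov(u, v):
    """Sample covariance of two equal-length lists (divides by n - 1)."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (n - 1)

def simulate(n_people=20000, seed=0):
    """Return (Cov(X, Y), Cov(X - Z, Y - Z)) for a hypothetical population."""
    rng = random.Random(seed)
    xs, ys, zs = [], [], []
    for _ in range(n_people):
        z = rng.gauss(100, 15)          # shared "overall" level (assumed)
        tilt = rng.gauss(0, 5)          # assumed trade-off: math up, reading down
        xs.append(z + tilt + rng.gauss(0, 2))   # math score X
        ys.append(z - tilt + rng.gauss(0, 2))   # reading score Y
        zs.append(z)
    raw = cov(xs, ys)
    adjusted = cov([x - z for x, z in zip(xs, zs)],
                   [y - z for y, z in zip(ys, zs)])
    return raw, adjusted

raw, adjusted = simulate()
print(f"Cov(X, Y)         = {raw:.1f}")
print(f"Cov(X - Z, Y - Z) = {adjusted:.1f}")
```

In this toy model the raw covariance comes out positive because both scores ride on Z, while the covariance of the differences from Z comes out negative because of the built-in trade-off.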





Benny
Uberpuzzler
Posts: 1024


Re: Statistics
« Reply #2 on: Feb 16th, 2008, 10:52am »

Oops, no.





Benny
Uberpuzzler
Posts: 1024


Re: Statistics
« Reply #3 on: Feb 16th, 2008, 11:36am »

You wrote: "Cov(X - Z, Y - Z) < 0 means that on average, given two people of the same "overall intelligence level" (whatever that may be), if one of the two is better at math, it is likely that the other is better at reading."

Yes, that's right. More generally, one can say that the larger the difference between one's math IQ and one's overall IQ, the larger the difference between one's reading IQ and one's overall IQ will be, but in the opposite direction.

Then, you wrote: "Cov(X, Y) is not a useful measure here, because underlying "overall intelligence" will be confounding the results. In fact, it is likely that Cov(X, Y) > 0: given two arbitrary people, if one of the two is better at math, it is likely that the same person is also better at reading, simply because of the higher "overall intelligence level"."

In general, we can compare the covariance between all sorts of different mental areas (not just math and reading). An interesting question is why some mental abilities may be positively correlated, while others are negatively correlated. What are your thoughts?





Benny
Uberpuzzler
Posts: 1024


Re: Statistics
« Reply #4 on: Mar 17th, 2008, 9:04am »

Here's a different question: For the one-way analysis of variance with a independent samples, each of size n, prove that

$$E\left[\frac{n\sum_{i=1}^{a}(\bar{x}_i-\bar{x})^2}{a-1}\right] = \sigma^2 + \frac{n\sum_{i=1}^{a}\alpha_i^2}{a-1},$$

where $\sigma^2$ is the variance. Note: each sum is taken from 1 to a. The $\alpha_i$ are called the treatment effects and are such that $\sum_i \alpha_i = 0$.





Eigenray
wu::riddles Moderator Uberpuzzler
Posts: 1948


Re: Statistics
« Reply #5 on: Mar 17th, 2008, 12:48pm »

What are the $\alpha_i$? Shouldn't they appear on the LHS as well?





Benny
Uberpuzzler
Posts: 1024


Re: Statistics
« Reply #6 on: Mar 17th, 2008, 4:12pm »

I think the alpha_i represent the difference between E(x) and E(x_i). In other words, they represent the effect of using different treatments. It is an exercise I found in my textbook.
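For what it's worth, here is a sketch of how the identity can be derived. It assumes the usual one-way fixed-effects model, which the exercise does not spell out: $x_{ij} = \mu + \alpha_i + \varepsilon_{ij}$ for $i = 1..a$, $j = 1..n$, with independent errors of mean 0 and variance $\sigma^2$. The symbols $\mu$ and $\varepsilon_{ij}$ are my own notation. Under this model the $\alpha_i$ do appear on the left-hand side, hidden inside the group means.

```latex
% Assumed model (not stated in the thread):
%   x_{ij} = \mu + \alpha_i + \varepsilon_{ij},  \sum_i \alpha_i = 0,
%   \varepsilon_{ij} independent, E[\varepsilon_{ij}] = 0, Var(\varepsilon_{ij}) = \sigma^2.
\begin{align*}
\bar{x}_i &= \mu + \alpha_i + \bar{\varepsilon}_i, \qquad
\bar{x} = \mu + \bar{\varepsilon}
  \quad \text{(the $\alpha_i$ sum to zero)} \\
\bar{x}_i - \bar{x} &= \alpha_i + (\bar{\varepsilon}_i - \bar{\varepsilon}) \\
\sum_{i=1}^{a} (\bar{x}_i - \bar{x})^2
  &= \sum_{i=1}^{a} \alpha_i^2
   + 2 \sum_{i=1}^{a} \alpha_i (\bar{\varepsilon}_i - \bar{\varepsilon})
   + \sum_{i=1}^{a} (\bar{\varepsilon}_i - \bar{\varepsilon})^2 \\
% The cross term has expectation 0 because E[\bar{\varepsilon}_i - \bar{\varepsilon}] = 0.
% The a group-mean errors are independent with variance \sigma^2 / n, so:
E\Big[\sum_{i=1}^{a} (\bar{\varepsilon}_i - \bar{\varepsilon})^2\Big]
  &= (a-1)\,\mathrm{Var}(\bar{\varepsilon}_i) = (a-1)\,\frac{\sigma^2}{n} \\
E\Big[\frac{n \sum_{i=1}^{a} (\bar{x}_i - \bar{x})^2}{a-1}\Big]
  &= \sigma^2 + \frac{n \sum_{i=1}^{a} \alpha_i^2}{a-1}
\end{align*}
```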





Benny
Uberpuzzler
Posts: 1024


Re: Statistics
« Reply #7 on: Jun 29th, 2008, 1:27pm »

The Wisdom of Crowds
http://en.wikipedia.org/wiki/The_Wisdom_of_Crowds

Ask enough people to estimate something, and the average of their guesses will get you surprisingly close to the right answer.

In an election year, people might disagree about who makes the best candidate, but you don't hear much argument on the merits of democracy: that millions of average people can, together, make a wise decision ... unless the game is rigged.

In the early 20th century, a controversial Englishman, Sir Francis Galton, tried to test statistically whether mobs of common folk were capable of choosing well. What Sir Francis actually found was that, mathematically at least, there's often wisdom in a crowd.

Sir Francis Galton was a nobleman who scorned the common masses. He thought that votes of governance should be left to higher classes. He'd prove, with all the data from a contest inescapable, of guessing even simple things that commoners were incapable.

He went to the market, stood before a crowd and asked the people to guess an ox's weight. He told them, "Guess the weight correctly and win a prize!" An eager crowd queued up to play; eight hundred made a guess that day.

Sir Francis Galton had 800 data points, but no one guessed the weight correctly. Then he declared to the crowd, "And now the ox's weight is exactly ... eleven hundred ninety-eight pounds. There are no winners!"

Sir Francis knew the crowd would never guess the weight. How might they judge important things, if left to meet that fate? With mathematics he would show how far they went astray. But in the end his theory was in total disarray. Because, by graphing all the guesses (a curve resembling the cumulative distribution function of the normal distribution) and determining their median, he showed that if the crowd were one, its estimate is keen.

That's because, while no individual guessed the actual weight, the average of all the individual guesses is almost exactly right. The average will generally be better than a randomly selected individual guess. The median of the masses assures us of success. And the larger the number of guesses we toss in, the more likely we are to get the right answer about the oxen. Sir Francis found the wisdom of the crowds.

While I'm sure that averaging readings is a good way to minimize error, I don't think that it will ensure success in every single case. Examples: the wrong politicians that get elected, the wrong politicians that get nominated, etc.

Also, I think that the average tends to the real reading because of the central limit theorem: we can take the variations in individual readings as random variation. Given that, many random variables acting together will produce a Gaussian random variable, with the average at the correct reading.

Your thoughts, please.
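A quick way to probe the averaging argument is to simulate it. The sketch below assumes each guess is the true weight plus Gaussian noise, optionally shifted by a shared bias. The spread and bias values are hypothetical and are not Galton's data. Averaging shrinks the random scatter, but it centres on the mean of the guesses, which equals the true value only when the crowd has no shared bias.

```python
import random
import statistics

TRUE_WEIGHT = 1198  # pounds, the ox weight quoted in the post

def crowd_estimates(n_guesses, bias=0.0, spread=75.0, seed=0):
    """Return (mean, median) of simulated guesses.

    Each guess is TRUE_WEIGHT + bias + Gaussian noise. Both bias and
    spread are hypothetical parameters chosen for illustration.
    """
    rng = random.Random(seed)
    guesses = [TRUE_WEIGHT + bias + rng.gauss(0, spread)
               for _ in range(n_guesses)]
    return statistics.mean(guesses), statistics.median(guesses)

for n in (10, 800):
    mean, median = crowd_estimates(n)
    print(f"unbiased crowd, n={n}: mean={mean:.0f}, median={median:.0f}")

mean, median = crowd_estimates(800, bias=100.0)
print(f"biased crowd,   n=800: mean={mean:.0f}, median={median:.0f}")
```

With a shared bias, adding more guesses makes the crowd more confidently wrong rather than more accurate.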





towr
wu::riddles Moderator Uberpuzzler
Some people are average, some are just mean.
Posts: 13730


Re: Statistics
« Reply #8 on: Jun 29th, 2008, 2:10pm »

on Jun 29th, 2008, 1:27pm, BenVitale wrote:
In an election year, people might disagree about who makes the best candidate, but you don't hear much argument on the merits of democracy

The only merit of democracy is that the alternatives are worse.

Quote:
that millions of average people can, together, make a wise decision ...

Wise? I'm not sure what Utopia you're living in, but wisdom is something I rarely find in politics.

Quote:
unless the game is rigged.

Which it often, intended or not, is. http://en.wikipedia.org/wiki/Voting#Fair_voting "Economist Kenneth Arrow lists five characteristics of a fair voting system. However, Arrow's impossibility theorem shows that it is impossible for any voting system which offers more than three options per question to have all 5 characteristics at the same time."

Quote:
Sir Francis found the wisdom of the crowds.

Strange definition of wisdom, to guess the weight of an ox correctly. Being accurate and wise are different things.

I think it depends very, very much on the type of question you ask a crowd. I wouldn't trust them with "is 0.999... equal to 1?" for example. Let alone questions of a more difficult nature. Crowds may make a good estimate at measurements and most probably in some other areas, but that has little to do with anything like wisdom.

Also: http://www.youtube.com/watch?v=59Ga5PuckA0

« Last Edit: Jun 29th, 2008, 2:12pm by towr »
Wikipedia, Google, Mathworld, Integer sequence DB



Hippo
Uberpuzzler
Posts: 919


Re: Statistics
« Reply #9 on: Jun 30th, 2008, 12:28am »

I am not sure the crowd would have explained quantum theory "paradoxes" well in the 18.. century. Maybe in the 22..? ... Isn't that rather a process of "education"?

« Last Edit: Jun 30th, 2008, 12:28am by Hippo »



TenaliRaman
Uberpuzzler
I am no special. I am only passionately curious.
Posts: 1001


Re: Statistics
« Reply #10 on: Jun 30th, 2008, 12:58am »

The "average" answer being close to a "true" value is, by its very definition, only correct for questions which have objective truths and not subjective truths. -- AI


Self discovery comes when a man measures himself against an obstacle - Antoine de Saint Exupery



ThudnBlunder
wu::riddles Moderator Uberpuzzler
The dewdrop slides into the shining Sea
Posts: 4489


Re: Statistics
« Reply #11 on: Jun 30th, 2008, 1:34am »

A former colleague once asked me, "Why do we divide by n - 1 to estimate the population variance, not by n?" Replying that dividing by n - 1 gives an unbiased estimator of the true variance is not very enlightening. So, off the top of your head, how would you answer?


THE MEEK SHALL INHERIT THE EARTH.....................................................................er, if that's all right with the rest of you.



towr
wu::riddles Moderator Uberpuzzler
Some people are average, some are just mean.
Posts: 13730


Re: Statistics
« Reply #12 on: Jun 30th, 2008, 3:02am »

on Jun 30th, 2008, 1:34am, ThudanBlunder wrote:
A former colleague once asked me, "Why do we divide by n - 1 to estimate the population variance, not by n?" Replying that dividing by n - 1 gives an unbiased estimator of the true variance is not very enlightening. So, off the top of your head, how would you answer?

Well, I'd start with saying that with one data point, you have no idea at all about the variance; it can still be anything. So rather than divide by n = 1, which would give 0 variance, you divide by n - 1 = 0, giving 0/0, or unknown, variance. If he buys that, it beats showing how the expected value of the unbiased estimator equals the true variance. (And so saves you the trouble of finding pen and paper, and possibly having to explain various statistical calculations.)





Eigenray
wu::riddles Moderator Uberpuzzler
Posts: 1948


Re: Statistics
« Reply #13 on: Jun 30th, 2008, 3:28am »

The variance is the expected value of the square of the distance to the mean. When you estimate the variance based on a sample, you don't know the mean, so you use the sample mean. But the sample mean will always be closer to your samples than the true mean: the quantity $\mu$ that minimizes $\sum_i (x_i - \mu)^2$ is precisely the mean of the $x_i$. So when you find the variance of a sample, using the sample's own mean, you tend to underestimate the variance.
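The underestimate can be seen exactly on a toy case by enumerating every possible sample. The population (a fair die) and the sample size n = 2 are assumptions chosen only to keep the enumeration small.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3, 4, 5, 6]   # hypothetical population: a fair die
n = 2                             # sample size, assumed for illustration

def mean(values):
    """Exact arithmetic mean as a Fraction."""
    return Fraction(sum(values), len(values))

def sum_sq_dev(sample, center):
    """Sum of squared deviations of the sample from a given center."""
    return sum((Fraction(x) - center) ** 2 for x in sample)

true_mean = mean(population)
true_var = sum_sq_dev(population, true_mean) / len(population)

# All equally likely ordered samples of size n, drawn with replacement.
samples = list(product(population, repeat=n))

# Average of each estimator over every possible sample.
avg_div_n = sum(sum_sq_dev(s, mean(s)) / n for s in samples) / len(samples)
avg_div_n_minus_1 = sum(sum_sq_dev(s, mean(s)) / (n - 1) for s in samples) / len(samples)
avg_true_center = sum(sum_sq_dev(s, true_mean) / n for s in samples) / len(samples)

print("true variance:                ", true_var)
print("sample mean, divide by n:     ", avg_div_n)
print("sample mean, divide by n - 1: ", avg_div_n_minus_1)
print("true mean, divide by n:       ", avg_true_center)
```

Measuring deviations from the true mean and dividing by n is right on average. Measuring from the sample mean shrinks the sum of squares, and dividing by n - 1 compensates for that shrinkage.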

« Last Edit: Jun 30th, 2008, 3:30am by Eigenray »



pex
Uberpuzzler
Posts: 880


Re: Statistics
« Reply #14 on: Jun 30th, 2008, 4:20am »

on Jun 30th, 2008, 3:28am, Eigenray wrote:
The variance is the expected value of the square of the distance to the mean. When you estimate the variance based on a sample, you don't know the mean, so you use the sample mean. But the sample mean will always be closer to your samples than the true mean: the quantity $\mu$ that minimizes $\sum_i (x_i - \mu)^2$ is precisely the mean of the $x_i$. So when you find the variance of a sample, using the sample's own mean, you tend to underestimate the variance.

Yes, I've always liked the interpretation of "losing a degree of freedom" by using the sample mean rather than the true (but unknown) population mean. More generally, if we fit a linear model $y_i = \beta_1 x_{1i} + \beta_2 x_{2i} + \dots + \beta_k x_{ki} + \varepsilon_i$, then we estimate the variance of the error term by $(\sum_i e_i^2) / (n - k)$. (Here, $e_i$ is the residual from the fitted model; the estimated value of $\varepsilon_i$, if you like.) The usual sample variance is just the case where k = 1 and the single "explanatory variable" is a constant term.
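The n - k divisor can be checked by simulation for the smallest interesting case, k = 2 (a constant term plus one slope). Everything here is an assumed toy setup: the design points, the true coefficients, and the Gaussian errors are made up for illustration.

```python
import random

def ols_residual_ss(xs, ys):
    """Fit y = b1 + b2*x by least squares; return the residual sum of squares."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b2 = sxy / sxx
    b1 = my - b2 * mx
    return sum((y - b1 - b2 * x) ** 2 for x, y in zip(xs, ys))

def average_estimates(trials=20000, sigma=1.0, seed=0):
    """Average the two error-variance estimates over many simulated data sets."""
    rng = random.Random(seed)
    xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]   # fixed design, n = 6 (assumed)
    n, k = len(xs), 2                      # k = 2 fitted coefficients
    total_n = total_nk = 0.0
    for _ in range(trials):
        # Hypothetical true model: y = 1 + 0.5*x + error
        ys = [1.0 + 0.5 * x + rng.gauss(0, sigma) for x in xs]
        ssr = ols_residual_ss(xs, ys)
        total_n += ssr / n
        total_nk += ssr / (n - k)
    return total_n / trials, total_nk / trials

by_n, by_n_minus_k = average_estimates()
print(f"divide by n:     {by_n:.3f}")
print(f"divide by n - k: {by_n_minus_k:.3f}")
```

With a true error variance of 1, the n - k version averages close to 1, while dividing by n averages close to (n - k)/n.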

« Last Edit: Jun 30th, 2008, 4:23am by pex »



rmsgrey
Uberpuzzler
Posts: 2856


Re: Statistics
« Reply #15 on: Jun 30th, 2008, 10:32am »

The "Wisdom of the Masses" is going to have trouble with bimodal distributions too. For instance, if you went down to Wimbledon and asked each spectator in the crowd where they thought the next serve would bounce, you'd end up with two dense clusters, one in each of the corners of the relevant service box away from the net. The mean would be somewhere in between, where almost nobody guesses, and the serve almost never goes...
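A toy version of the serve example, with made-up numbers: positions are measured across the service box on a 0 to 1 scale, and the guesses cluster near the two corners at 0.1 and 0.9. The cluster centres and widths are assumptions, not tennis data.

```python
import random
import statistics

def serve_guesses(n=1000, seed=0):
    """Simulate guesses that alternate between two corner clusters."""
    rng = random.Random(seed)
    guesses = []
    for i in range(n):
        center = 0.1 if i % 2 == 0 else 0.9   # assumed cluster centres
        guesses.append(rng.gauss(center, 0.05))
    return guesses

guesses = serve_guesses()
crowd_mean = statistics.mean(guesses)
# Fraction of individual guesses that land within 0.1 of the crowd's mean.
near_mean = sum(abs(g - crowd_mean) < 0.1 for g in guesses) / len(guesses)

print(f"crowd mean: {crowd_mean:.2f}")
print(f"share of guesses near the mean: {near_mean:.3f}")
```

The mean lands in the middle of the box, a spot that essentially none of the individual guesses are near.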





ThudnBlunder
wu::riddles Moderator Uberpuzzler
The dewdrop slides into the shining Sea
Posts: 4489


Re: Statistics
« Reply #16 on: Jun 30th, 2008, 4:41pm »

on Jun 30th, 2008, 4:20am, pex wrote:
Yes, I've always liked the interpretation of "losing a degree of freedom" by using the sample mean rather than the true (but unknown) population mean.

That was my explanation, too.





Christine
Full Member
Posts: 159


Re: Statistics
« Reply #17 on: Oct 23rd, 2008, 3:38pm »

In multiple regression, one can compare two models where one model has an additional term by using the F-test. This tests the significance of adding the extra term. Is there a similar test to compare models with the same number of independent variables (IVs), but with the last IV different in each model? R^2, adjusted R^2, predicted R^2, and Mallows's Cp all show a difference, but do not test the significance of this difference.





Earendil
Newbie
Posts: 46


Re: Statistics
« Reply #19 on: Oct 23rd, 2008, 9:58pm »

Nevertheless, unbiased estimators were introduced only to make the problem of finding a "best estimator" (that is, one which minimizes the expected distance to the parameter) meaningful in some sense. Dividing by n yields an estimator which is biased but which is better than dividing by n - 1 in various ways. For instance, its expected distance to the parameter is smaller than that of the n - 1 version for any value of the parameter. I don't see why "unbiased" is a good property besides making the problem of the "best estimator" tractable by a classic statistician.
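The claim about expected distance can be illustrated by simulation if "distance" is read as squared error and the data are assumed normal. Both are my assumptions; the post does not say which distance or distribution is meant. Under those assumptions the mean squared errors work out to (2n - 1)/n^2 times sigma^4 for the divide-by-n estimator and 2/(n - 1) times sigma^4 for the divide-by-(n - 1) estimator.

```python
import random

def mse_of_variance_estimators(n=5, sigma2=1.0, trials=20000, seed=0):
    """Return (MSE dividing by n, MSE dividing by n - 1) for normal samples."""
    rng = random.Random(seed)
    sigma = sigma2 ** 0.5
    se_n = se_n1 = 0.0
    for _ in range(trials):
        xs = [rng.gauss(0, sigma) for _ in range(n)]
        m = sum(xs) / n
        ss = sum((x - m) ** 2 for x in xs)        # sum of squares about the sample mean
        se_n += (ss / n - sigma2) ** 2            # squared error of the biased estimate
        se_n1 += (ss / (n - 1) - sigma2) ** 2     # squared error of the unbiased estimate
    return se_n / trials, se_n1 / trials

mse_n, mse_n1 = mse_of_variance_estimators()
print(f"MSE dividing by n:     {mse_n:.3f}")
print(f"MSE dividing by n - 1: {mse_n1:.3f}")
```

For n = 5 and unit variance the formulas give 0.36 and 0.50, so the biased estimator is closer on average in this sense, even though it is too small on average.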





towr
wu::riddles Moderator Uberpuzzler
Some people are average, some are just mean.
Posts: 13730


Re: Statistics
« Reply #20 on: Oct 24th, 2008, 12:22am »

on Oct 23rd, 2008, 9:58pm, Earendil wrote:
I don't see why "unbiased" is a good property besides making the problem of the "best estimator" tractable by a classic statistician.

Because you want to say something about the population you took the sample from, and not about the sample itself. For large n the numbers converge, obviously, but for small n there is a distinct difference. The biased measures always make you seem more certain than you have cause to be.





Earendil
Newbie
Posts: 46


Re: Statistics
« Reply #21 on: Nov 2nd, 2008, 8:24pm »

on Oct 24th, 2008, 12:22am, towr wrote:
Because you want to say something about the population you took the sample from, and not about the sample itself. For large n the numbers converge, obviously, but for small n there is a distinct difference. The biased measures always make you seem more certain than you have cause to be.

Could you give me an example, please?





towr
wu::riddles Moderator Uberpuzzler
Some people are average, some are just mean.
Posts: 13730


Re: Statistics
« Reply #22 on: Nov 3rd, 2008, 2:04am »

on Nov 2nd, 2008, 8:24pm, Earendil wrote:
Could you give me an example, please?

Suppose I have a population of size 100; let's just take the numbers 1..100. Now take a sample of size 1; say we get 37. Our biased variance is 0, suggesting that without a doubt all 100 numbers are 37. Our unbiased variance doesn't exist (or we could say it's infinite, depending on what you do on dividing by 0). In other words, it suggests there isn't enough information from our sample to say anything about the variance of the population.
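The same example can be run with Python's standard `statistics` module, where `pvariance` divides by the number of data points and `variance` divides by one less. This is just an illustration of the arithmetic above; the variable names are my own.

```python
import statistics

population = list(range(1, 101))   # the numbers 1..100 from the example
sample = [37]                      # a sample of size 1

# Variance of the whole population (divides by N = 100).
population_variance = statistics.pvariance(population)

# "Biased" estimate from the sample: divides by n = 1.
biased_estimate = statistics.pvariance(sample)

# "Unbiased" estimate: divides by n - 1 = 0, so it is undefined here.
try:
    unbiased_estimate = statistics.variance(sample)
except statistics.StatisticsError:
    unbiased_estimate = None       # the library refuses a single data point

print(population_variance, biased_estimate, unbiased_estimate)
```

The library's refusal to compute a sample variance from one point matches the "not enough information" reading.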

« Last Edit: Nov 3rd, 2008, 2:08am by towr »



