

Title: Statistics Post by BenVitale on Feb 16^{th}, 2008, 12:56am Suppose that a psychologist wants to compare the intelligence quotient of a typical person's reading ability to their math ability. Let us define the following variables: Z = a person's overall recorded IQ, X = a person's mathematical recorded IQ, Y = a person's reading ability recorded IQ. Now if our psychologist measures the above characteristics for many individuals, and then computes Covariance(X−Z, Y−Z) = −0.5 with these values, then what can we say about a person's mathematical ability compared to their reading ability? In an actual experiment, why might it be necessary to compute Covariance(X−Z, Y−Z) instead of Covariance(X, Y) in order to find a relation between X and Y? Hint: This question requires some knowledge of the psychology of intelligence. 

Title: Re: Statistics Post by pex on Feb 16^{th}, 2008, 6:36am Is this "hard"? Cov(X−Z, Y−Z) < 0 means that on average, given two people of the same "overall intelligence level" (whatever that may be), if one of the two is better at math, it is likely that the other is better at reading. Cov(X, Y) is not a useful measure here, because the underlying "overall intelligence" will be confounding the results. In fact, it is likely that Cov(X, Y) > 0: given two arbitrary people, if one of the two is better at math, it is likely that the same person is also better at reading, simply because of the higher "overall intelligence level". 

Title: Re: Statistics Post by BenVitale on Feb 16^{th}, 2008, 10:52am Oops, no. 

Title: Re: Statistics Post by BenVitale on Feb 16^{th}, 2008, 11:36am You wrote: "Cov(X−Z, Y−Z) < 0 means that on average, given two people of the same "overall intelligence level" (whatever that may be), if one of the two is better at math, it is likely that the other is better at reading." .......................... Yes, that's right. More generally, one can say that the larger the difference between one's math IQ and their overall IQ, the larger the difference between their reading IQ and their overall IQ, but in the opposite direction. ........................... Then, you wrote: "Cov(X, Y) is not a useful measure here, because the underlying "overall intelligence" will be confounding the results. In fact, it is likely that Cov(X, Y) > 0: given two arbitrary people, if one of the two is better at math, it is likely that the same person is also better at reading, simply because of the higher "overall intelligence level"." ........... In general, we can compare the covariance between all sorts of different mental areas (not just math and reading). An interesting question is why some mental abilities may be positively correlated, while others are negatively correlated. What are your thoughts? 
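A quick numerical sketch of pex's point, with entirely made-up numbers: model each person's scores as a shared general factor Z plus a math-vs-reading trade-off S. The confounded covariance Cov(X, Y) then comes out positive, while Cov(X−Z, Y−Z) exposes the negative trade-off:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

z = rng.normal(100, 10, n)        # overall IQ (shared general factor)
s = rng.normal(0, 5, n)           # math-vs-reading trade-off
x = z + s + rng.normal(0, 3, n)   # math IQ: general factor plus trade-off
y = z - s + rng.normal(0, 3, n)   # reading IQ: general factor minus trade-off

cov_xy = np.cov(x, y)[0, 1]            # confounded by z: about Var(z) - Var(s) = 75
cov_diff = np.cov(x - z, y - z)[0, 1]  # trade-off visible: about -Var(s) = -25

print(cov_xy, cov_diff)
```

With these (illustrative) parameters, both the positive Cov(X, Y) and the negative Cov(X−Z, Y−Z) follow directly from the shared factor construction.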

Title: Re: Statistics Post by BenVitale on Mar 17^{th}, 2008, 9:04am Here's a different question: For the one-way analysis of variance with a independent samples of size n, prove that E{ n·Σ(x̄_i − x̄)² / (a−1) } = σ² + n·Σ(α_i)² / (a−1). Note: each sum is taken from i = 1 to a. The α_i are called the treatment effects and are such that Σ α_i = 0. 

Title: Re: Statistics Post by Eigenray on Mar 17^{th}, 2008, 12:48pm What are the α_{i}? Shouldn't they appear on the LHS as well? 

Title: Re: Statistics Post by BenVitale on Mar 17^{th}, 2008, 4:12pm I think the α_i represent the difference between E(x_i) and the overall mean E(x). In other words, each one represents the effect of using a different treatment. It is an exercise I found in my textbook. 
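The identity can be checked by Monte Carlo before proving it. Below is a sketch with arbitrarily chosen values (a = 4 groups, n = 10 observations each, σ = 2, and treatment effects α that sum to zero): the average of the treatment mean square n·Σ(x̄_i − x̄)²/(a−1) over many replications should approach σ² + n·Σα_i²/(a−1).

```python
import numpy as np

rng = np.random.default_rng(1)
a, n = 4, 10                                 # a groups, n observations per group
alpha = np.array([3.0, -1.0, -1.0, -1.0])    # treatment effects, sum to 0
mu, sigma = 50.0, 2.0

reps = 20_000
mstr = np.empty(reps)
for r in range(reps):
    # each row is one group's sample: overall mean + treatment effect + noise
    x = mu + alpha[:, None] + rng.normal(0, sigma, (a, n))
    xbar_i = x.mean(axis=1)                  # group means
    xbar = x.mean()                          # grand mean
    mstr[r] = n * ((xbar_i - xbar) ** 2).sum() / (a - 1)

theory = sigma**2 + n * (alpha**2).sum() / (a - 1)
print(mstr.mean(), theory)                   # both approximately 44.0
```

Under the null hypothesis (all α_i = 0) the second term vanishes and the statistic estimates σ² alone, which is what makes the F-ratio in ANOVA work.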

Title: Re: Statistics Post by BenVitale on Jun 29^{th}, 2008, 1:27pm The Wisdom of Crowds http://en.wikipedia.org/wiki/The_Wisdom_of_Crowds Ask enough people to estimate something, and the average of their guesses will get you surprisingly close to the right answer. In an election year, people might disagree about who makes the best candidate, but you don't hear much argument on the merits of democracy: that millions of average people can, together, make a wise decision ... unless the game is rigged. In the early 20th century, a controversial Englishman, Sir Francis Galton, tried to test statistically whether mobs of common folk were capable of choosing well. What Sir Francis actually found was that, mathematically at least, there's often wisdom in a crowd. Sir Francis Galton was a nobleman who scorned the common masses. He thought that votes of governance should be left to higher classes. He'd prove with all the data from a contest inescapable, of guessing even simple things that commoners were incapable. He went to the market, stood before a crowd, and asked the people to guess an ox's weight. He told them, "Guess the weight correctly and win a prize!" An eager crowd queued up to play; eight hundred made a guess that day. Sir Francis Galton had 800 data points, but no one guessed the weight correctly. Then he declared to the crowd, "And now the ox's weight is exactly... eleven hundred ninety-eight pounds. There are no winners!" Sir Francis knew the crowd would never guess the weight. How might they judge important things, if left to meet that fate? With mathematics he would show how far they went astray. But in the end his theory was in total disarray. By graphing all the guesses (which fell into a roughly normal distribution) and determining their median, he showed that if the crowd were one, its estimate is keen. 
That's because, while no individual guessed the actual weight, the average of all the individual guesses is almost exactly right. The average will generally be better than a randomly selected individual guess. The median of the masses assures us of success. And the larger the number of guesses we toss in, the more likely we are to get the right answer about the oxen. Sir Francis found the wisdom of the crowds. While I'm sure that averaging readings is a good way to ensure error minimization, I don't think that it will ensure success in every single case. Examples: the wrong politicians that get elected, the wrong politicians that get nominated, etc. Also, I think that the average tends to the real reading because of the central limit theorem: we can treat the variations in individual readings as randomized variation. Given that, many random variables acting together will produce a Gaussian random variable, with the average at the correct reading. Your thoughts, please. 
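The Galton story can be replayed numerically. Assuming (for illustration only) that each of the 800 guesses is the true weight of 1198 pounds plus independent noise, the crowd's median lands far closer to the truth than a typical individual guess does:

```python
import numpy as np

rng = np.random.default_rng(2)
truth = 1198.0                               # the ox's weight in Galton's contest
guesses = truth + rng.normal(0, 80, 800)     # 800 noisy individual guesses (assumed model)

crowd = np.median(guesses)                   # the crowd's collective estimate
individual_err = np.abs(guesses - truth).mean()  # typical error of one person
crowd_err = abs(crowd - truth)               # error of the crowd as a whole

print(crowd_err, individual_err)             # crowd error is far smaller
```

The caveats raised below in the thread apply: this works because the errors are independent and centered on the truth, which is exactly what fails for subjective questions or systematically biased crowds.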

Title: Re: Statistics Post by towr on Jun 29^{th}, 2008, 2:10pm on 06/29/08 at 13:27:34, BenVitale wrote:
http://en.wikipedia.org/wiki/Voting#Fair_voting "Economist Kenneth Arrow lists five characteristics of a fair voting system. However, Arrow's impossibility theorem shows that it is impossible for any voting system which offers more than three options per question to have all 5 characteristics at the same time."
Being accurate and wise are different things.
I wouldn't trust them with "is 0.999... equal to 1?", for example. Let alone questions of a more difficult nature. Crowds may make a good estimate of measurements, and most probably in some other areas, but that has little to do with anything like wisdom. Also: http://www.youtube.com/watch?v=59Ga5PuckA0 

Title: Re: Statistics Post by Hippo on Jun 30^{th}, 2008, 12:28am I am not sure the crowd would have explained quantum theory "paradoxes" well in the 18th century. Maybe in the 22nd? ... Isn't that rather a process of "education"? 

Title: Re: Statistics Post by TenaliRaman on Jun 30^{th}, 2008, 12:58am The "average" answer being close to a "true" value is, by its very definition, only correct for questions which have objective truths, not subjective truths.  AI 

Title: Re: Statistics Post by ThudanBlunder on Jun 30^{th}, 2008, 1:34am A former colleague once asked me, "Why do we divide by n−1 to estimate the population variance, not by n?" Replying that dividing by n−1 gives an unbiased estimator of the true variance is not very enlightening. So, off the top of your head, how would you answer? 

Title: Re: Statistics Post by towr on Jun 30^{th}, 2008, 3:02am on 06/30/08 at 01:34:27, ThudanBlunder wrote:
If he buys that, it beats showing how the expected value of the unbiased estimator equals the true variance. (And so saves you the trouble of finding pen and paper, and possibly having to explain various statistical calculations.) 

Title: Re: Statistics Post by Eigenray on Jun 30^{th}, 2008, 3:28am The variance is the expected value of the square of the distance to the mean. When you estimate the variance based on a sample, you don't know the mean, so you use the sample mean. But the sample mean will always be closer to your samples than the true mean: the quantity μ that minimizes Σ(x_{i} − μ)^{2} is precisely the mean of the x_{i}. So when you find the variance of a sample, using the sample's own mean, you tend to underestimate the variance. 
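Eigenray's underestimation effect is easy to see numerically. Sketch with arbitrary parameters (true variance 4, sample size 5): averaging each estimator over many samples, dividing by n systematically undershoots by the factor (n−1)/n, while dividing by n−1 recovers the true variance.

```python
import numpy as np

rng = np.random.default_rng(3)
true_var = 4.0
n, reps = 5, 100_000

# reps independent samples of size n from N(0, true_var)
samples = rng.normal(0, np.sqrt(true_var), (reps, n))
biased = samples.var(axis=1, ddof=0).mean()    # divide by n
unbiased = samples.var(axis=1, ddof=1).mean()  # divide by n - 1

print(biased, unbiased)   # approximately 3.2 (= 4 * 4/5) and 4.0
```

The shortfall is exactly the factor (n−1)/n, which is why multiplying the /n estimator by n/(n−1), i.e. dividing by n−1 in the first place, removes the bias.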

Title: Re: Statistics Post by pex on Jun 30^{th}, 2008, 4:20am on 06/30/08 at 03:28:12, Eigenray wrote:
Yes, I've always liked the interpretation of "losing a degree of freedom" by using the sample mean rather than the true (but unknown) population mean. More generally, if we fit a linear model y_{i} = β_{1}x_{1i} + β_{2}x_{2i} + ... + β_{k}x_{ki} + ε_{i}, then we estimate the variance of the error term by (Σ_{i} e_{i}^{2}) / (n − k). (Here, e_{i} is the residual from the fitted model; the estimated value of ε_{i}, if you like.) The usual sample variance is just the case where k = 1 and the single "explanatory variable" is a constant term. 
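pex's generalization checks out by simulation as well. A sketch with invented values (n = 50 observations, k = 3 regressors including the constant, error variance 2): the average of Σe_i²/(n−k) over many fitted regressions recovers the true error variance.

```python
import numpy as np

rng = np.random.default_rng(4)
n, k = 50, 3                    # observations and regressors (incl. constant)
sigma2 = 2.0                    # true error variance
beta = np.array([1.0, 2.0, -1.0])
reps = 20_000

est = np.empty(reps)
for r in range(reps):
    # design matrix: a constant column plus k-1 random regressors
    X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
    y = X @ beta + rng.normal(0, np.sqrt(sigma2), n)
    e = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]   # OLS residuals
    est[r] = (e ** 2).sum() / (n - k)                  # divide by n - k, not n

print(est.mean())   # approximately 2.0
```

Dividing by n instead of n−k would undershoot by the factor (n−k)/n, for exactly the degrees-of-freedom reason described above.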

Title: Re: Statistics Post by rmsgrey on Jun 30^{th}, 2008, 10:32am The "Wisdom of the Masses" is going to have trouble with bimodal distributions too. For instance, if you went down to Wimbledon and asked each spectator in the crowd where they thought the next serve would bounce, you'd end up with two dense clusters, one in each of the corners of the relevant service box away from the net. The mean would be somewhere in between, where almost nobody guesses, and the serve almost never goes... 

Title: Re: Statistics Post by ThudanBlunder on Jun 30^{th}, 2008, 4:41pm on 06/30/08 at 04:20:46, pex wrote:
That was my explanation, too :) 

Title: Re: Statistics Post by Christine on Oct 23^{rd}, 2008, 3:38pm In multivariate regression, one can compare two models where one model has an additional term by using the F-test. This tests the significance of adding the extra term. Is there a similar test to compare models with the same number of independent variables (IVs), but with the last IV different in each model? R², adjusted R², predicted R², and Mallows' Cp all show a difference, but do not test the significance of this difference. 

Title: Re: Statistics Post by william wu on Oct 23^{rd}, 2008, 8:50pm Hi Christine, Please make a new thread when posting a new question. Thanks! 

Title: Re: Statistics Post by Earendil on Oct 23^{rd}, 2008, 9:58pm Nevertheless, unbiased estimators were created only so that the problem of finding a "best estimator", that is, one which minimizes the expected distance to the parameter, would be meaningful (in some sense). Dividing by "n" yields an estimator which is biased but which is better than dividing by "n−1" in various ways. For instance, its expected distance to the parameter is smaller than that of the "n−1" estimator for any value of the parameter. I don't see why "unbiased" is a good property besides making the problem of the "best estimator" tractable by a classic statistician. 

Title: Re: Statistics Post by towr on Oct 24^{th}, 2008, 12:22am on 10/23/08 at 21:58:39, Earendil wrote:
The biased measures always make you seem more certain than you have cause to be. 

Title: Re: Statistics Post by Earendil on Nov 2^{nd}, 2008, 8:24pm on 10/24/08 at 00:22:54, towr wrote:
Could you give me an example, please? 

Title: Re: Statistics Post by towr on Nov 3^{rd}, 2008, 2:04am on 11/02/08 at 20:24:15, Earendil wrote:
Now take a sample of size 1; say we get 37. Our biased variance is 0, suggesting that, without a doubt, all 100 numbers are 37. Our unbiased variance doesn't exist (or we could say it's infinite, depending on what you do when dividing by 0). In other words, it suggests there isn't enough information in our sample to say anything about the variance of the population. 
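towr's size-1 example maps directly onto NumPy's ddof convention (ddof=0 divides by n, ddof=1 by n−1): the biased estimator confidently reports zero spread, while the unbiased one returns NaN from the 0/0, i.e. "no information".

```python
import numpy as np

sample = np.array([37.0])            # a single observation

print(np.var(sample, ddof=0))        # biased (divide by n=1): 0.0, spuriously certain
print(np.var(sample, ddof=1))        # unbiased (divide by n-1=0): nan, no information
```

NumPy emits a runtime warning for the second call, which is arguably the right behavior: the estimator is undefined rather than zero.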

Powered by YaBB 1 Gold - SP 1.4! Forum software copyright © 2000-2004 Yet another Bulletin Board 