wu :: forums (http://www.ocf.berkeley.edu/~wwu/cgi-bin/yabb/YaBB.cgi)
riddles >> putnam exam (pure math) >> Expected Length of Projection On Plane (8/26/2002)
(Message started by: william wu on Aug 26th, 2002, 10:07pm)

Title: Expected Length of Projection On Plane (8/26/2002)
Post by william wu on Aug 26th, 2002, 10:07pm
11:22 PM 8/26/2002

A column vector x is distributed randomly and uniformly over the surface of the unit sphere in an N-dimensional Euclidean space. What is the expected (mean) value of the squared length of x's projection onto a K-dimensional plane through the sphere's center?

Title: Re: Expected Length of Projection onto Plane
Post by william wu on Aug 27th, 2002, 3:00am
So here's what I'm thinking.

First, there's a simple formula for finding the projection of one vector onto another. We can express both x and the K-dimensional plane as vectors. Assuming K < N, we will append N-K zero-valued coefficients to the vector describing the K-dimensional plane, so that we now have two vectors of length N. Now we can apply the projection formula and take the square of the returned projection.

What we now have is the squared projection for one particular x. But the problem statement wants the mean value of the squared projection, given that x is distributed randomly and uniformly over the surface of the unit sphere. The formula for expectation value is:

E[X] = summation over sample space [ p(s) X(s) ]

where X is a random variable and s is an event in the sample space. In the context of this problem, an event corresponds to the choosing of a particular column vector in N-dimensional space. Here all events are equally likely, and thus, the probability of choosing a particular event is simply 1/A, where A is the surface area of the unit sphere. The random variable X returns the squared length of the projection dictated by the choice of column vector. Since we are working with a continuous domain, calculating the expectation value will require a surface integral.

expectationVal = (1/A) * surfaceInteg [ X( <x1, x2, x3, ... , xN> ) * dA ]

Finding the surface area of an N-dimensional sphere seems rather nontrivial ... well, I couldn't visualize it. Through online research I found the following formula:

A = N (pi^(N/2))/((N/2)!)r^(N- 1)

The surface area of a ball in N-dim space is actually just the derivative of the ball's volume, which is dictated by the formula

r^n * pi^(n/2) / Gamma(n/2)

where Gamma(x) is "the smooth continuous version of the discrete factorial function". Whatever that means.

I don't like how my approach required this complicated equation for surface area of spheres. Perhaps there is an easier way? More importantly, does this solution make any sense at all?  :-/



Title: Re: Expected Length of Projection onto Plane
Post by wowbagger on Aug 27th, 2002, 7:18am
Nice problem.
At first thought, your solution does make sense to me, but:

In my opinion, this problem calls for spherical coordinates - in N dimensions of course. I have to admit I'm not familiar with such a concept, but you surely get rid of your surface integral. And you can easily create a uniform distribution of points on your N-dimensional unit sphere.

Think about how you generalize from (plane) polar coords to 3-dim. spher. coords: You add an axis perpendicular to your existing ones and describe your point using one more angle with values ranging from 0 to pi.
I believe one can do exactly the same going from (N-1) dimensions to N.

If this should be true - and the projection onto one axis generalizes straightforwardly from 3-dim. spher. coords -, the rest of the problem is about averaging the sines that emerge due to the projections.

I figured out a fairly simple formula, but I may well be wrong with the projection stuff.
Hope this helps.

Title: Re: Expected Length of Projection onto Plane
Post by william wu on Aug 27th, 2002, 7:01pm
Yes spherical coordinates are in order. I talked with Kahan today, and my approach works; however, the infinitesimally small pieces of area on the surface of the sphere should be expressed as something like (dr dTheta). Further, I was incorrect to say that the probability of choosing a point on the sphere is 1/A, where A is the surface area of the sphere. The probability of choosing a point in any continuum is zero (e.g. the probability of choosing any one point at random in the range [0,1] is zero). What I should have done was compute the probability of choosing a patch of area on the surface -- which would then be something like (dr dTheta / A).

The point of the problem, however, was to discourage me from using this approach. Once you deal with computing areas in N-dimensions, things get difficult very quickly. Here is a much simpler solution.

* * *


First we should note what is special about Euclidean space. A Euclidean space is special because there is a defined length function: the length of a vector <x> is defined by sqrt( x_1^2 + x_2^2 + ... + x_N^2 ). Another way of phrasing this is you multiply the transpose of <x> with the original <x>, and then square root that product.

Define the column vector <x> = { x_1, x_2, ... , xN }. Then the projection of this vector onto the K-dimensional plane is given by <p> = { x_1, x_2, ... , x_K, 0, 0, ... , 0 }.

Now we want the expected squared length of this projection. From the definition of Euclidean distance (let || delimit the magnitude):

E(|| <p> ||^2) = E( x_1^2 + x_2^2 + ... + x_K^2).

Because the column vector is distributed uniformly at random, we can consider this sum of random variables to be simply K times the expected value of one of these random variables. That is,

E(|| <p> ||^2) = k E ( x_1^2 ).

Why is this valid? Let's say we interchanged the positions of two of these random variables. This would correspond to a rotation of the sphere. However, every rotation leaves the sphere unchanged, since the distribution of the sphere is uniform -- so you can't even tell that you rotated it. So linearity of expectation applies.

Now we are simply left with computing E ( x_1^2).

Because the column vector <x> lies on the unit sphere, we know that

x_1^2 + x_2^2 + ... + x_N^2 = 1

We can also say:

E (x_1^2 + x_2^2 + ... + x_N^2) = 1

By linearity of expectation:

N E ( x_1^2 ) = 1
E ( x_1^2 ) = 1 / N

Thus, our final answer for the expected squared projection is:

E(|| <p> ||^2) = K (1 / N) = K / N

So simple! Wow?

* * *


Concepts required to solve this problem elegantly:

1) Euclidean space
2) Uniform Distribution
3) Linearity of Expectation

Title: Re: Expected Length of Projection onto Plane
Post by wowbagger on Aug 28th, 2002, 11:30am

on 08/27/02 at 19:01:04, william wu wrote:
[...] the infinitesimally small pieces of area on the surface of the sphere should be expressed as something like (dr dTheta). Further, I was incorrect to say that the probability of choosing a point on the sphere is 1/A, where A is the surface area of the sphere. The probability of choosing a point in any continuum is zero (e.g. the probability of choosing any one point at random in the range [0,1] is zero). What I should have done was compute the probability of choosing a patch of area on the surface -- which would then be something like (dr dTheta / A).

Correct. In fact, the infinitesimal surface element should be r^2*sin(theta)*d(phi)*d(theta) in three dimensions.


Quote:
Because the column vector is distributed uniformly at random, we can consider this sum of random variables to be simply K times the expected value of one of these random variables. That is,

E(|| <p> ||^2) = k E ( x_1^2 ).

Why is this valid? Let's say we interchanged the positions of two of these random variables. This would correspond to a rotation of the sphere. However, every rotation leaves the sphere unchanged, since the distribution of the sphere is uniform -- so you can't even tell that you rotated it. So linearity of expectation applies.


I agree on condition that the Cartesian coordinates x_j are truly independent, uniformly distributed random variables. But is this the case? They have to obey the constraint that the sum of their squares be equal to one!

I wrote a short program that produces average values close to your suggested solution of K/N. However, it does so only if I construct the initial points by first setting each coordinate to a random value (uniformly distributed between -1 and +1) and afterwards normalizing the resulting vector. Points constructed in this way are distributed uniformly over the surface of a n-dim. hypercube (before normalization), but surely not over the surface of the corresponding unit sphere. To see this, picture the case N=2, i.e. a square of length 2 with an inscribed unit circle. Consider only the value of x (let y=1) for reasons of simplification. Now choose a random variable in the interval [0; dx], this maps to an angle (measured from the y axis) interval of [0; d(alpha)_A] with d(alpha)_A=dx. If your x is in [1-dx; 1] instead, this corresponds to an alpha from [pi/4-d(alpha)_D; pi/4] with d(alpha)_D=dx/2. This will cause your "random" points to cluster near the diagonals while the regions near the axes will be covered to a lesser extent.

Now let me construct the initial points using N-dimensional spherical coordinates. The radial coordinate will be r=1 of course, so there remain N-1 angular coordinates. The first angle a_1 is drawn uniformly from [0; 2*pi), the other N-2 angles a_2,...,a_{N-1} from [0;pi]. I expect this to result in a uniform distribution of points on the surface of the unit sphere.

How to calculate E(x_1^2+x_2^2+...+x_K^2)?
The obvious possibility is to compute the values of the x_j using the known angles a_1,...,a_{N-1}. You extend from 2 to 3 dimensions (and likewise in the general case) by adding an angle a_2 (usually called theta) measured from the z axis, thus:
   x = r*sin(a_2)*cos(a_1)
   y = r*sin(a_2)*sin(a_1)
   z = r*cos(a_2)
In general, we have for j >= 2:
   x_j = r*cos(a_{j-1})* product_{m=j...N-1} sin(a_m)
and x_1 = product_{m=1...N-1} sin(a_m).
On the other hand, one could (at least try to) argue that projecting down from N to N-1 dimensions, you scale the length of your vector by sin(a_N-1), so you have
E(x_1^2+...+x_K^2) = E( product_{j=K...N-1} sin^2(a_j) ).

My crude numerical tests (using uniformly distributed angles a_j) show that both ways lead to the same result and are in accordance with the formula I already mentioned in my last post: E = 2^(K-N).
Clearly, this result leads to values drastically different from the one your formula suggests. Especially so for large N and constant difference N-K or constant ratio K/N. Unfortunately I can't confidently figure out the "reasonable" behaviour of our expectation value for large N - like what's the effect of adding one more dimension?
The distribution of my random points doesn't seem to be too uniform though.

Strangely enough, I came up with a calculation that yields your result of K/N. But it involves the functional determinant (I don't know whether that's the correct mathematical term in English) of our N-dim. spher. coords. Well ok, I looked at the cases d=2,3,4 and conjectured the general formula without proof. Integrating over the whole surface of the unit sphere should result in 1 (if normalized by diving by the area). Now for every direction x_j one projects out one gets a factor of sin(a_j)^(j+1). I won't go into the details (unless asked to), but one has to start with the last axis, j=N-1. Upon integration this leads to a factor of (N-1)/N. The next projection gives us a factor of (N-2)/(N-1), resulting in a total of (N-2)/N and so on, always ending up with K/N.

I am confused. I don't know which result is "more likely", so as to find the flaw in the reasoning on behalf of the other. Hopefully my notation isn't adding any confusion.

What do you think about the "uniform distribution over the sphere" topic?

Title: Re: Expected Length of Projection onto Plane
Post by Robman on Aug 29th, 2002, 11:01am
then there is the quick and cheesy solution...

you are on the surface of a sphere, so:

x.0^2 + x.1^2 + ...... x.N^2 = 1

so the expectation E(x.0^2 + x.1^2 + ...... x.N^2 ) = 1

now E(x.0^2 + x.1^2 + ...... x.N^2 ) = E(x.0^2) + E(x.1^2)+.....E(x.N^2)

by symmetry, each dim. contributes the same to the avg,
so E(x.k^2) = 1/N
so E(x.0^2 + x.1^2 + ...... x.K^2) = K/N     Q.E.D.
-Rob

Title: Re: Expected Length of Projection onto Plane
Post by william wu on Sep 5th, 2002, 9:49pm

on 08/28/02 at 11:30:36, wowbagger wrote:
I agree on condition that the Cartesian coordinates x_j are truly independent, uniformly distributed random variables. But is this the case? They have to obey the constraint that the sum of their squares be equal to one!


Hiya wowbagger; sorry for the late reply.

The following two lines of logic

E (x_1^2 + x_2^2 + ... + x_N^2) = 1
N E ( x_1^2 ) = 1

does NOT require that the random variables x1 to xN be independent. As you pointed out, they are not independent at all -- they are quite correlated. If I know that x12 is .99, that causes all the other random variables x2 through xN to take on relatively small values.  But this doesn't matter; the linearity of expectation used here only requires that each variable be identically distributed. And this is stipulated by the problem statement, which says:


Quote:
A column vector x is distributed randomly and uniformly over the surface of the unit sphere in an N-dimensional Euclidean space.


You can exchange the positions of the random variables to rotate the sphere. But since all the variables have identical distribution, you wouldn't be able to tell that the sphere had rotated anyways.

I'm not really sure what to make of your E = 2^[K-N] result ... I assure you that K/N is the answer though. The surface integration / polar coordinates approach really overcomplicates the problem. My mind is rather fatigued with academic theory right now; maybe when it's more clear I'll have something more intelligent to say.



Powered by YaBB 1 Gold - SP 1.4!
Forum software copyright © 2000-2004 Yet another Bulletin Board