wu :: forums (http://www.ocf.berkeley.edu/~wwu/cgi-bin/yabb/YaBB.cgi)
riddles >> hard >> Simulated Continuous Random Variables
(Message started by: william wu on Oct 22nd, 2002, 1:04am)

Title: Simulated Continuous Random Variables
Post by william wu on Oct 22nd, 2002, 1:04am

As part of a computer simulation, you want to generate values for a continuous random variable X whose probability density function fX you know. The only random number generator you have at your disposal is the C function drand48(), which returns non-negative, double-precision floating-point values uniformly distributed over the interval [ 0.0 , 1.0 ). How can you generate values for X?


Note: Is this Greek to you? Here's a Wikipedia article on probability density functions; read up:
http://www.wikipedia.org/wiki/Probability_density_function


Title: Re: Simulated Continuous Random Variables
Post by william wu on Oct 22nd, 2002, 1:29am
I won't lie, this is adapted from a homework problem of mine due later this week. After discussing the problem with a friend, I have an idea for an algorithm, but I'm having difficulty convincing myself that it works, and if so, why exactly. I'm pretty stressed for time nowadays, so I figured I'd see what you guys think. It's a simple algorithm; here it is, hidden by color (maybe think about the problem a little first before you peek; bad ideas that seem to make intuitive sense can be very dangerous):

Possible solution (highlight to see):

Generate a random number between 0 and 1 using drand48; call this random number K. Write K as a binary fraction. Now imagine placing your finger on the curve of fX. Initially, position your finger at the median of X. Note that at this location, the area under fX from -inf to your finger is exactly 1/2. Now we will iterate through the digits of K and move our finger in a binary-search fashion across fX until we run out of digits, which will eventually happen since the computer's precision is finite. If the first digit of K is 0, move your finger to the left, so that the area under fX from -inf to your new finger position is 1/4. Similarly, if the first digit of K is 1, move your finger to the right, so that the area under fX from -inf to your new finger position is 3/4. After this first iteration, look at the second digit of K: if it is 0, move left so that the area from -inf to your finger decreases by 1/8; if it is 1, move right so that the area increases by 1/8. Repeat until machine precision is reached. The x-coordinate of your finger's final location is then the randomly generated value of X.
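
A minimal C sketch of this finger walk, assuming a quantile function (the map from left-tail area back to an x-coordinate) is available; the logistic quantile below is only a placeholder so the sketch is self-contained:

Code:
#include <stdlib.h>   /* drand48 */
#include <math.h>     /* log */

/* Placeholder inverse CDF: the standard logistic distribution, chosen
   only so the sketch runs; any invertible CDF would do here. */
static double quantile(double p) {
    return log(p / (1.0 - p));
}

/* The finger walk described above: start at the median (left area 1/2)
   and move by 1/4, 1/8, ... in probability mass per binary digit of K. */
double sample_by_bitwalk(void) {
    double K = drand48();
    double p = 0.5;      /* area under fX to the left of the finger */
    double step = 0.25;  /* size of the next move, in probability mass */
    for (int i = 0; i < 52; i++) {  /* a double carries 52 mantissa bits */
        K *= 2.0;
        if (K >= 1.0) { K -= 1.0; p += step; }  /* digit 1: move right */
        else          {           p -= step; }  /* digit 0: move left  */
        step *= 0.5;
    }
    return quantile(p);  /* x-coordinate of the finger's final position */
}

As the digits run out, p converges to K itself, so the whole walk amounts to evaluating the quantile function once at K.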


Title: Re: Simulated Continuous Random Variables
Post by TimMann on Oct 22nd, 2002, 2:16am
What's the point of the binary search? Doing any one step in the binary search presupposes that you know how to compute the inverse of the function g(y) = Integral from -oo to y of f(x) dx, for the value of g you're trying at this step. If you know how to do that, why not just compute g^-1(drand48()) to start with? I don't see that there's anything special about multiples of negative powers of 2 that makes it easier to compute g^-1 at those points.

Still, if you don't know how to invert the function directly, it may still be that something roughly like a binary search is the answer. It's just that you are searching the range (-oo, oo), not [0,1], so the search can't be purely binary; as you initially march off toward -oo or oo, you have to do something like doubling the step size each time until you finally take a step big enough to overshoot the answer. After that you can binary-search the last interval.
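
In C, that search might look like the following sketch, assuming only that the CDF g can be evaluated numerically; the standard normal CDF via erf() is used purely as an example of a g that is easy to compute but has no closed-form inverse:

Code:
#include <stdlib.h>   /* drand48 */
#include <math.h>     /* erf, sqrt */

/* Example CDF: standard normal, computable but not invertible in
   closed form. Any numerically evaluable g(y) would do. */
static double cdf(double y) {
    return 0.5 * (1.0 + erf(y / sqrt(2.0)));
}

double sample_by_bracketing(void) {
    double u = drand48();
    /* March outward, doubling the step, until [lo, hi] brackets the answer. */
    double lo = 0.0, step = 1.0;
    while (cdf(lo) > u) { lo -= step; step *= 2.0; }
    double hi = 0.0;
    step = 1.0;
    while (cdf(hi) < u) { hi += step; step *= 2.0; }
    /* Then binary-search the last interval down to machine precision. */
    for (int i = 0; i < 64; i++) {
        double mid = 0.5 * (lo + hi);
        if (cdf(mid) < u) lo = mid; else hi = mid;
    }
    return 0.5 * (lo + hi);
}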

Title: Re: Simulated Continuous Random Variables
Post by oliver on Oct 23rd, 2002, 5:16am
Hmm, is this a trick question, i.e. is the "real" problem more or less tangent to the content of the question?

Because as Tim said, solving the first part of the problem is just inverting a monotonous function.

But the second part is unsolvable: finding the right mapping from a finite subset M of [0,1] to the real numbers with which we can sensibly interpolate the given g, if this mapping is supposed to be independent of f.

For instance, let's construct f as a sum of characteristic functions of [0,1], scaled and shifted along the x axis.
*Small correction*: we smooth the characteristic functions to make them continuous.

If we take 10^100 of these ...

Title: Re: Simulated Continuous Random Variables
Post by Eric Yeh on Oct 23rd, 2002, 1:30pm
Yeah, I'd implement it much like Tim suggested.  Take the random number and invert against the CDF using powers of two and then a binary search.  Alternatively, do the full integration (numerically) first by branching out from 0, using power-of-two steps, until the accumulated area reaches 1 to arbitrary precision; then you have a tabulated map which you can invert via interpolation.
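
A minimal C sketch of that tabulate-then-interpolate approach, using a fixed uniform grid rather than power-of-two steps for brevity; the density, grid bounds, and table size here are all illustrative assumptions:

Code:
#include <stdlib.h>   /* drand48 */
#include <math.h>     /* exp, sqrt, M_PI */

#define N    4096       /* table resolution (placeholder choice) */
#define XMIN -10.0      /* grid assumed wide enough to hold nearly all the mass */
#define XMAX  10.0

/* Example density to tabulate; any computable f would do. */
static double f(double x) { return exp(-0.5 * x * x) / sqrt(2.0 * M_PI); }

static double xs[N + 1], Fs[N + 1];   /* grid points and tabulated CDF */

/* Tabulate the CDF once, accumulating f with the trapezoid rule. */
void build_table(void) {
    double h = (XMAX - XMIN) / N;
    xs[0] = XMIN;
    Fs[0] = 0.0;
    for (int i = 1; i <= N; i++) {
        xs[i] = XMIN + i * h;
        Fs[i] = Fs[i - 1] + 0.5 * h * (f(xs[i - 1]) + f(xs[i]));
    }
    for (int i = 0; i <= N; i++) Fs[i] /= Fs[N];   /* renormalize to total 1 */
}

/* Draw: binary-search the table for u, then linearly interpolate. */
double sample_by_table(void) {
    double u = drand48();
    int lo = 0, hi = N;
    while (hi - lo > 1) {
        int mid = (lo + hi) / 2;
        if (Fs[mid] < u) lo = mid; else hi = mid;
    }
    double t = (u - Fs[lo]) / (Fs[hi] - Fs[lo]);
    return xs[lo] + t * (xs[hi] - xs[lo]);
}

Call build_table() once up front; each draw is then just a table lookup plus interpolation.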

Ollie:  While I'm sure almost all (and hell, I'll even mean that in the mathematical sense  ;) ) functions are monotonous, I think that here you might want monotonic  ;)

Best,
Eric

Title: Re: Simulated Continuous Random Variables
Post by Chronos on Oct 24th, 2002, 11:47am
Isn't this one of the examples in Numerical Recipes?  I certainly seem to recall seeing this in class.

Title: Re: Simulated Continuous Random Variables
Post by william wu on Oct 28th, 2002, 2:59am
Cool, thanks. I see what you mean about assuming I could compute an inverse ... sometimes you can't, so you resort to numerical methods. But if you can, you can just plug the randomly generated number between 0 and 1 into the inverse of the CDF. This works because the random variable Y = CDF(X) is uniformly distributed between 0 and 1. Simple and neat.
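
For instance, a minimal sketch for an exponential random variable, whose CDF 1 - e^(-lambda*x) inverts in closed form:

Code:
#include <stdlib.h>   /* drand48 */
#include <math.h>     /* log */

/* Inverse-transform sampling for an exponential with rate lambda:
   CDF(x) = 1 - exp(-lambda*x), so CDF^-1(u) = -log(1 - u) / lambda. */
double sample_exponential(double lambda) {
    double u = drand48();            /* uniform on [0, 1) */
    return -log(1.0 - u) / lambda;   /* 1 - u lies in (0, 1], so log is safe */
}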


Title: Re: Simulated Continuous Random Variables
Post by Grimbal on Sep 23rd, 2010, 2:08pm
One way to avoid computing the inverse is to adjust from a known distribution.
Suppose you have a density g(x) and a real c such that c*g(x) > f(x) for all x.  Suppose you know how to generate an x according to the distribution g.
Then you just need to generate y uniformly in [0..1] and reject x if y*c*g(x) > f(x).  When that happens, you start over with a fresh x.
That should work even for distributions with unbounded support, provided you can find a g(x) with a suitable c that is not too big.
I think this has a name: the Monte Carlo method.
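
A minimal C sketch of this accept/reject loop, with an illustrative (assumed) choice of distributions: a standard normal target f under a standard Cauchy proposal g, with c = 1.53 so that c*g(x) > f(x) everywhere:

Code:
#include <stdlib.h>   /* drand48 */
#include <math.h>     /* exp, tan, sqrt, M_PI */

/* Target density: standard normal (illustrative choice). */
static double f(double x) { return exp(-0.5 * x * x) / sqrt(2.0 * M_PI); }

/* Proposal density: standard Cauchy, easy to sample by inverse transform. */
static double g(double x) { return 1.0 / (M_PI * (1.0 + x * x)); }

double sample_by_rejection(void) {
    const double c = 1.53;   /* max of f/g is about 1.52, so c*g dominates f */
    for (;;) {
        /* Draw x from g: the Cauchy quantile is tan(pi*(u - 1/2)). */
        double x = tan(M_PI * (drand48() - 0.5));
        double y = drand48();         /* uniform height in [0, 1) */
        if (y * c * g(x) <= f(x))     /* the point fell under the graph of f */
            return x;                 /* accept */
        /* otherwise reject and try again with a fresh x */
    }
}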

Title: Re: Simulated Continuous Random Variables
Post by pex on Sep 23rd, 2010, 2:36pm

on 09/23/10 at 14:08:39, Grimbal wrote:
One way to avoid computing the inverse is to adjust from a known distribution.
Suppose you have a density g(x) and a real c such that c*g(x) > f(x) for all x.  Suppose you know how to generate an x according to the distribution g.
Then you just need to generate y uniformly in [0..1] and reject x if y*c*g(x) > f(x).  When that happens, you start over with a fresh x.
That should work even for distributions with unbounded support, provided you can find a g(x) with a suitable c that is not too big.
I think this has a name: the Monte Carlo method.

It's known as rejection sampling (http://en.wikipedia.org/wiki/Rejection_sampling). The Monte Carlo method (http://en.wikipedia.org/wiki/Monte_Carlo_method) is the more general idea of approximating something that is infeasible to calculate exactly using random draws; it does not specify how these draws are to be obtained.


