Author 
Topic: Generating Random Trees (Read 2286 times) 

william wu
wu::riddles Administrator
Gender:
Posts: 1291


Generating Random Trees
« on: Sep 4^{th}, 2003, 1:27am » 
Quote Modify

Described below are two ways for generating a random binary tree with n nodes. Although they seem very different, they are surprisingly equivalent  that is, both methods produce the same probability distribution on binary trees with n nodes. The problem here is to prove that they are equivalent. Method 1  First expand the root node: /\  Now expand one of the two terminal nodes at random, so you end up with either /\ /\ or /\ /\  Continue in this fashion at each step, choosing one of the terminal nodes of the tree equally at random, and expanding it. Keep doing this until you have a tree with n terminal nodes. Method 2  Choose an integer N1 uniformly distributed between 1 and (n1) inclusive. Now starting with the root node, expand it and assign N1 to the left child, and n  N1 to the right child: /\ N1 nN1  Now choose an integer N2 uniformly distributed between 1 and (N11) inclusive. Also, independently choose an integer N3 uniformly distributed between 1 and ((nN1)1) inclusive. Expanding in the same fashion, we now have the following tree:  Continue expanding in this fashion until no more subdivisions are possible at any terminal node. The result is your random tree.

« Last Edit: Sep 4^{th}, 2003, 1:28am by william wu » 
IP Logged 
[ wu ] : http://wuriddles.com / http://forums.wuriddles.com



James Fingas
Uberpuzzler
Gender:
Posts: 949


Re: Generating Random Trees
« Reply #1 on: Sep 16^{th}, 2003, 1:31pm » 
Quote Modify

I think it is sufficient to show that method 1 divides the new nodes into the left and right branches of the root node with the same probability distribution as the second case. The rest can be proved recursively from that. I'll keep thinking on it.


IP Logged 
Doc, I'm addicted to advice! What should I do?



william wu
wu::riddles Administrator
Gender:
Posts: 1291


Re: Generating Random Trees
« Reply #2 on: Sep 29^{th}, 2003, 11:01pm » 
Quote Modify

I don't know the solution myself. The hint for this problem is "Polya's Urn Model", which I am unfamiliar with.

« Last Edit: Sep 29^{th}, 2003, 11:02pm by william wu » 
IP Logged 
[ wu ] : http://wuriddles.com / http://forums.wuriddles.com



visitor
Guest


Re: Generating Random Trees
« Reply #3 on: Sep 30^{th}, 2003, 11:17am » 
Quote Modify
Remove

What are the chances that your final tree will have one long original branch that never branches further. In the second method it's simply 2/n, the odds that you'll pick either 1 or n1. In the first method, the first branch creates two trunks, the second eliminates one. The odds are 2/3 that this trunk will remain unbranched in the next division; times 3/4 for the division after that; times 4/5 for the division after that. The middle numbers will all cancel out, leaving 2/n. From any end node that is reached in method two, you can back up to the branching that created it and count the number of times the other branch was further subdivided, and the odds must, once again be identical, since, at the time when this node was created, you were simply dealing with a number that acts just like our topofthetree number.


IP Logged 



visitor
Guest


Re: Generating Random Trees
« Reply #4 on: Sep 30^{th}, 2003, 11:19am » 
Quote Modify
Remove

I meant 2/(n1), of course.


IP Logged 



James Fingas
Uberpuzzler
Gender:
Posts: 949


Re: Generating Random Trees
« Reply #5 on: Oct 17^{th}, 2003, 1:04pm » 
Quote Modify

Continuing from my first post, where I state that the two models are equivalent iff they produce the same probability distribution for the number of nodes in the left and right subtrees of the root node. This property is easily verified as follows: The second method obviously produces a uniform distribution on the number of nodes in the left and right subtrees. That is to say, if there are n nodes to be distributed, the left subtree of the root will have an equal chance of having 1 or 2 or 3 ... or n1 of them. What remains is to show that this is true of the first method. This is quite easy. The first step is always to put 1 node each into the left and right subtrees. So we start with the following tree: o o The next step will, with equal probability, split either the left or the right subtree up: (o o) o (p = 0.5) o (o o) (p = 0.5) The next step: ((o o) o) o (p = 1/3 * 0.5) (o (o o)) o (p = 1/3 * 0.5) (o o) (o o) (p = 1/3 * 0.5) (o o) (o o) (p = 1/3 * 0.5) o ((o o) o) (p = 1/3 * 0.5) o (o (o o)) (p = 1/3 * 0.5) You can see that there are only three distinct possibilities for the distribution of nodes between the left and right subtrees, and that they all have the same probability. Now I'll skip over the diagrams and do this with math: Define p(n,k) to be the probability that there are k nodes in the left hand subtree of the root node when there are n leaf nodes in the entire tree. p(2,1) = 1 Evidently, when we start with a scenario p(n,k) and add another leaf node, we have a k/n probability of getting k+1 in the left, and an (nk)/n probability of getting k. Looking at it the other way, p(n+1,k) = p(n,k1)*(k1)/n + p(n,k)*(nk)/n. If we assume that p(n,k) = 1/(n1) for 1 <= k < n, then we get: p(n+1,k) = 1/(n1)*(k1)/n + 1/(n1)*(nk)/n = (k1+nk)/n/(n1) = (n1)/n/(n1) = 1/n This shows inductively that p(n,k) = 1/(n1), with the observation that although p(n,0) was called for in the inductive step (and is zero, not 1/(n1)), it was multiplied by 0 so the answer is still valid. So the two methods distribute nodes to the left and right subtrees of the root node in the same manner.


IP Logged 
Doc, I'm addicted to advice! What should I do?



