wu :: forums (http://www.ocf.berkeley.edu/~wwu/cgi-bin/yabb/YaBB.cgi)
riddles >> putnam exam (pure math) >> Non trivial BST Insertion
(Message started by: howard roark on Feb 15th, 2009, 9:25pm)

Title: Non trivial BST Insertion
Post by howard roark on Feb 15th, 2009, 9:25pm
Assume you are building a  BST (Binary Search tree) of n nodes using purely by insertions, that all keys are distinct, and that all key orderings for the insertions were equally likely.

Write an efficient algorithm which takes as input the root of a tree, and which returns the probability
that that particular tree would have been built under these assumptions.


Hint 1: If the tree has 3 nodes, root node and children on both sides of root node, probability of that tree is 1/3

PS:I thought this is a better place for this question than cs forums. Please tell me if I should move this to cs forums

Title: Re: Non trivial BST Insertion
Post by mistaken_id on Feb 16th, 2009, 9:55pm
I doubt if the algorithm even needs n as input.. ???

Title: Re: Non trivial BST Insertion
Post by towr on Feb 17th, 2009, 2:59am
My first guess is to
1) count the nodes in each subtree
2) multiply for each node (#left+#right)!/left!/#right!  (where #left is the number of node in the left subtree, and analogous for #right)
3) divide the whole by #root!
That would be O(n) runtime if it checks out.

Title: Re: Non trivial BST Insertion
Post by howard roark on Feb 17th, 2009, 12:23pm
Cool! so the algo doesnt need n as the input.

However, run time does depend on n and it is O(n)....

Title: Re: Non trivial BST Insertion
Post by howard roark on Feb 17th, 2009, 12:33pm
what does #root mean?

#left+#right +1 ??

Title: Re: Non trivial BST Insertion
Post by towr on Feb 17th, 2009, 12:33pm
You can calculate n by going through the tree (and in fact you have to, #root=n). And to be sure you account for the whole tree you need to visit every node, so you can't do better than O(n).

Title: Re: Non trivial BST Insertion
Post by towr on Feb 17th, 2009, 12:35pm

on 02/17/09 at 12:33:25, howard roark wrote:
what does #root mean?

#left+#right +1 ??
For the root node, yes. I strictly use it for the top of the tree here though.

Title: Re: Non trivial BST Insertion
Post by howard roark on Feb 17th, 2009, 1:00pm
logic behind your algo??

I somehow feel that your algo assumes number of trees with n nodes is n!, which I think is wrong.

Number of trees with n nodes is Catalan-number (n).


I know I am wrong, Just tell me where I am wrong......

Title: Re: Non trivial BST Insertion
Post by towr on Feb 17th, 2009, 1:35pm

on 02/17/09 at 13:00:51, howard roark wrote:
I somehow feel that your algo assumes number of trees with n nodes is n!, which I think is wrong.
Ah yes.
What the algorithm returns is the probability of a random sequence of input resulting in that particular tree. And there are n! such random input sequences.
But this does seem to be exactly what was asked for.

For instance take all inputs of 3 elements
1,2,3 ->  (1,(2,(3)))
1,3,2 ->  (1,((2),3))
2,1,3 ->  ((1),2,(3))
2,3,1 ->  ((1),2,(3))
3,1,2 ->  ((1,(2)), 3)
3,2,1 ->  (((1),2), 3)

The middle two result in the same tree, so we have probability 2/6=1/3 of getting that tree, exactly the number you gave.

Title: Re: Non trivial BST Insertion
Post by towr on Feb 17th, 2009, 3:02pm

on 02/17/09 at 13:00:51, howard roark wrote:
logic behind your algo??
The possible input sequences are constraint only by the fact that the ancestor of a node has to come earlier in the sequence. So the positions of nodes in different subtrees are independent, as long as they come after their common ancestor. So if we ignore the order the elements of each subtree occurs in for a moment, we're just mixing two groups, which can be done in (n+m)!/(n!m!) ways. And then next we determine the order of those subgroups recursively.

Title: Re: Non trivial BST Insertion
Post by howard roark on Mar 1st, 2009, 10:11pm
Is the running time of the algorithm O(n).......I think it will be more than that, because as we move up we have to find factorials of larger numbers which are as big as n  in the root

Title: Re: Non trivial BST Insertion
Post by towr on Mar 2nd, 2009, 1:06am

on 03/01/09 at 22:11:19, howard roark wrote:
Is the running time of the algorithm O(n).......I think it will be more than that, because as we move up we have to find factorials of larger numbers which are as big as n  in the root
You can make a table of all factorials you need with O(n) multiplications. Although that's a small comfort when the size of the result grows beyond the machineword. There might be ways to improve it though.

Title: Re: Non trivial BST Insertion
Post by Eigenray on Mar 2nd, 2009, 5:50am
We can also write it as P(T) = 1/|T| * P(L)*P(R), which looks pretty nice (looks better in my head than on the page though).

Let F(n) be the maximum value of P(T) over all trees of size n.  Then n!*F(n) is the [link=http://www.ocf.berkeley.edu/~wwu/cgi-bin/yabb/YaBB.cgi?board=riddles_cs;action=display;num=1233192384]number of heaps of size n[/link].

Title: Re: Non trivial BST Insertion
Post by towr on Mar 2nd, 2009, 6:51am

on 03/02/09 at 05:50:52, Eigenray wrote:
We can also write it as P(T) = 1/|T| * P(L)*P(R), which looks pretty nice.
Indeed it does. And the interpretation that goes with it is so much simpler as well. In any subtree every node has an equal chance of being the root.
Or to put it another way, you can create the tree by putting all elements in a row, and then recursively pick one from the row uniformly randomly to function as root, and splitting the row in two at its position. Then repeat for both subtrees.

[edit]I liked the old version of the formula better :P[/edit]

Title: Re: Non trivial BST Insertion
Post by howard roark on Mar 2nd, 2009, 8:12pm

on 03/02/09 at 05:50:52, Eigenray wrote:
Let F(n) be the maximum value of P(T) over all trees of size n.  Then n!*F(n) is the [link=http://www.ocf.berkeley.edu/~wwu/cgi-bin/yabb/YaBB.cgi?board=riddles_cs;action=display;num=1233192384]number of heaps of size n[/link].

How is the above result possible?

Title: Re: Non trivial BST Insertion
Post by Eigenray on Mar 2nd, 2009, 10:23pm
F(n) = maxk<n  C(n-1,k) F(k) F(n-1-k).

Compared to the other formula it's not surprising.  Attached are the plots of { C(n-1,k) F(k) F(n-1-k) :  0 http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/le.gif k < n }, for n =1,...,256.  Note the two maxima that look like they are moving back and forth.  The index moving to the left is actually standing still at (one less than) a power of 2; the other one grows until it reaches the next (one less than a) power of 2.



Powered by YaBB 1 Gold - SP 1.4!
Forum software copyright © 2000-2004 Yet another Bulletin Board