wu :: forums (http://www.ocf.berkeley.edu/~wwu/cgi-bin/yabb/YaBB.cgi)
riddles >> easy >> probability and odds
(Message started by: x2862 on Feb 6th, 2011, 8:08pm)

Title: probability and odds
Post by x2862 on Feb 6th, 2011, 8:08pm
Suppose you choose n numbers from 1 to x. If k numbers are drawn from 1 to x (k > n),  what is the probability that those k numbers contain your chosen set of n numbers?

Now suppose you choose p sets of n numbers from 1 to k. Again if  k numbers are drawn from 1 to x (k > n), what is minimum value of p so that the probability that those k numbers contain at least 1 of your chosen sets of n numbers is 100%?

I figure the odds to be [hide]C(x - n, k - n) to C(x, k) - C(x - n, k - n)[/hide], where [hide]C(x, n) = x! / (n! (x - n)!)[/hide].
I also figure the minimum value of p to be [hide]C(x,n)/C(k,n) rounded uo to the nearest whole number[/hide].
Is this correct?
Is there an easy way to calculate the sets of p numbers for any values of x, k and, n without the use of a computer?

Title: Re: probability and odds
Post by towr on Feb 6th, 2011, 10:08pm

on 02/06/11 at 20:08:31, x2862 wrote:
Suppose you choose n numbers from 1 to x. If k numbers are drawn from 1 to x (k > n),  what is the probability that those k numbers contain your chosen set of n numbers?
[hide]C(k,n) / C(x,n). The number of ways to pick n numbers from the superset of k numbers, divided by the number of ways to pick n numbers from the whole range of x.[/hide]

Title: Re: probability and odds
Post by towr on Feb 7th, 2011, 10:52am
Heh, the second part is easier than I though.
p = [hide]C(x,n) - C(k,n) + 1; i.e. one more than the number possible sets of n numbers which are not subsets of the k numbers.[/hide]

Title: Re: probability and odds
Post by SMQ on Feb 7th, 2011, 11:31am

on 02/07/11 at 10:52:50, towr wrote:
Heh, the second part is easier than I thought.

No, I don't think it is; the formula you give is an upper bound, but p can often be substantially lower.  For instance, with x = 6, k = 3, n = 2, the minimum p is easily determined by hand to be 7, where your formula gives 12.

--SMQ

Title: Re: probability and odds
Post by towr on Feb 7th, 2011, 11:34am
Oh wait, I didn't notice that you get to choose the sets with n numbers.
My number is for when you only get to pick p.

Title: Re: probability and odds
Post by x2862 on Feb 7th, 2011, 4:09pm
Suppose x = 9 and k = 8
Combinations without repetition (x=9, k=8)
C(9,8)=9
{1,2,3,4,5,6,7,8} {1,2,3,4,5,6,7,9} {1,2,3,4,5,6,8,9}
{1,2,3,4,5,7,8,9} {1,2,3,4,6,7,8,9} {1,2,3,5,6,7,8,9}
{1,2,4,5,6,7,8,9} {1,3,4,5,6,7,8,9} {2,3,4,5,6,7,8,9}

that's 9 combinations

Now lets say I picked 3 numbers.

Combinations without repetition (x=9, n=3)
C(9,3)=84
{1,2,3} {1,2,4} {1,2,5} {1,2,6} {1,2,7} {1,2,8} {1,2,9} {1,3,4} {1,3,5} {1,3,6} {1,3,7} {1,3,8} {1,3,9} {1,4,5} {1,4,6} {1,4,7} {1,4,8} {1,4,9} {1,5,6} {1,5,7} {1,5,8} {1,5,9} {1,6,7} {1,6,8} {1,6,9} {1,7,8} {1,7,9} {1,8,9} {2,3,4} {2,3,5} {2,3,6} {2,3,7} {2,3,8} {2,3,9} {2,4,5} {2,4,6} {2,4,7} {2,4,8} {2,4,9} {2,5,6} {2,5,7} {2,5,8} {2,5,9} {2,6,7} {2,6,8} {2,6,9} {2,7,8} {2,7,9} {2,8,9} {3,4,5} {3,4,6} {3,4,7} {3,4,8} {3,4,9} {3,5,6} {3,5,7} {3,5,8} {3,5,9} {3,6,7} {3,6,8} {3,6,9} {3,7,8} {3,7,9} {3,8,9} {4,5,6} {4,5,7} {4,5,8} {4,5,9} {4,6,7} {4,6,8} {4,6,9} {4,7,8} {4,7,9} {4,8,9} {5,6,7} {5,6,8} {5,6,9} {5,7,8} {5,7,9} {5,8,9} {6,7,8} {6,7,9} {6,8,9} {7,8,9}

As we can see at least one of our 84 number sets match every set of 8 numbers. So we can say p = 84 but this is not the lowest value. I can see that we only need at most 2 sets {1,2,3) and {7,8,9}. That tells me that 2 is the lowest value for p.
Another way to get 2 is to take C(k=8,n=3) which equals 56. There are 56 sets of three numbers in a set of 8 numbers. So 84/56=1.5 rounded up that’s 2.
I still arrive at  C(x,n)/C(k,n)

Going further suppose n=5 instead then C{x=9,n=5)would give 126 combinations
{1,2,3,4,5} {1,2,3,4,6} {1,2,3,4,7} {1,2,3,4,8} {1,2,3,4,9} {1,2,3,5,6} {1,2,3,5,7} {1,2,3,5,8} {1,2,3,5,9} {1,2,3,6,7} {1,2,3,6,8} {1,2,3,6,9} {1,2,3,7,8} {1,2,3,7,9} {1,2,3,8,9} {1,2,4,5,6} {1,2,4,5,7} {1,2,4,5,8} {1,2,4,5,9} {1,2,4,6,7} {1,2,4,6,8} {1,2,4,6,9} {1,2,4,7,8} {1,2,4,7,9} {1,2,4,8,9} {1,2,5,6,7} {1,2,5,6,8} {1,2,5,6,9} {1,2,5,7,8} {1,2,5,7,9} {1,2,5,8,9} {1,2,6,7,8} {1,2,6,7,9} {1,2,6,8,9} {1,2,7,8,9} {1,3,4,5,6} {1,3,4,5,7} {1,3,4,5,8} {1,3,4,5,9} {1,3,4,6,7} {1,3,4,6,8} {1,3,4,6,9} {1,3,4,7,8} {1,3,4,7,9} {1,3,4,8,9} {1,3,5,6,7} {1,3,5,6,8} {1,3,5,6,9} {1,3,5,7,8} {1,3,5,7,9} {1,3,5,8,9} {1,3,6,7,8} {1,3,6,7,9} {1,3,6,8,9} {1,3,7,8,9} {1,4,5,6,7} {1,4,5,6,8} {1,4,5,6,9} {1,4,5,7,8} {1,4,5,7,9} {1,4,5,8,9} {1,4,6,7,8} {1,4,6,7,9} {1,4,6,8,9} {1,4,7,8,9} {1,5,6,7,8} {1,5,6,7,9} {1,5,6,8,9} {1,5,7,8,9} {1,6,7,8,9} {2,3,4,5,6} {2,3,4,5,7} {2,3,4,5,8} {2,3,4,5,9} {2,3,4,6,7} {2,3,4,6,8} {2,3,4,6,9} {2,3,4,7,8} {2,3,4,7,9} {2,3,4,8,9} {2,3,5,6,7} {2,3,5,6,8} {2,3,5,6,9} {2,3,5,7,8} {2,3,5,7,9} {2,3,5,8,9} {2,3,6,7,8} {2,3,6,7,9} {2,3,6,8,9} {2,3,7,8,9} {2,4,5,6,7} {2,4,5,6,8} {2,4,5,6,9} {2,4,5,7,8} {2,4,5,7,9} {2,4,5,8,9} {2,4,6,7,8} {2,4,6,7,9} {2,4,6,8,9} {2,4,7,8,9} {2,5,6,7,8} {2,5,6,7,9} {2,5,6,8,9} {2,5,7,8,9} {2,6,7,8,9} {3,4,5,6,7} {3,4,5,6,8} {3,4,5,6,9} {3,4,5,7,8} {3,4,5,7,9} {3,4,5,8,9} {3,4,6,7,8} {3,4,6,7,9} {3,4,6,8,9} {3,4,7,8,9} {3,5,6,7,8} {3,5,6,7,9} {3,5,6,8,9} {3,5,7,8,9} {3,6,7,8,9} {4,5,6,7,8} {4,5,6,7,9} {4,5,6,8,9} {4,5,7,8,9} {4,6,7,8,9} {5,6,7,8,9}


C(k=8,n=5) gives 56 number sets 126/56=2.25 for a minimum p value of  3

The long way we can see we only need 3 of the sets of 5 numbers to cover all sets of 8 numbers {1,2,3,4,5}, {1,2,3,4,6} and {5,6,7,8,9}.

Again correct me if I am wrong as this is all new math for me.

Title: Re: probability and odds
Post by x2862 on Feb 7th, 2011, 5:44pm

on 02/07/11 at 11:31:41, SMQ wrote:
For instance, with x = 6, k = 3, n = 2, the minimum p is easily determined by hand to be 7, where your formula gives 12.

--SMQ

with x = 6, k = 3, n = 2,
C(x,n)=15
{1,2} {1,3} {1,4} {1,5} {1,6} {2,3} {2,4} {2,5} {2,6} {3,4} {3,5} {3,6} {4,5} {4,6} {5,6}

C(x,k)=20
{1,2,3} {1,2,4} {1,2,5} {1,2,6} {1,3,4} {1,3,5} {1,3,6} {1,4,5} {1,4,6} {1,5,6} {2,3,4} {2,3,5} {2,3,6} {2,4,5} {2,4,6} {2,5,6} {3,4,5} {3,4,6} {3,5,6} {4,5,6}

C(k,n)=3
{1,2} {1,3} {2,3}

15/3=5 so the minimum value of p should be 5 if my formula was correct.
The actual value of p is 6 so my formula does not work after all.

{1,2} {1,3} {2,3} {4,5} {4,6} {5,6}

what i did notice is that there are exactly 4 duplicates for any givin set of 2 numbers with C(x,k).
and 20/4 is also 5. however the minimum p I get by hand is still 7.

Title: Re: probability and odds
Post by SMQ on Feb 9th, 2011, 11:01am
By exhaustive computer search:
x = 0
k = 0 1
x = 1
k = 0 1
k = 1 1 1
x = 2
k = 0 1
k = 1 1 2
k = 2 1 1 1
x = 3
k = 0 1
k = 1 1 3
k = 2 1 2 3
k = 3 1 1 1 1
x = 4
k = 0 1
k = 1 1 4
k = 2 1 3 6
k = 3 1 2 2 4
k = 4 1 1 1 1 1
x = 5
k = 0 1
k = 1 1 5
k = 2 1 4 10
k = 3 1 3 4 10
k = 4 1 2 2 3 5
k = 5 1 1 1 1 1 1
x = 6
k = 0 1
k = 1 1 6
k = 2 1 5 15
k = 3 1 4 6 20
k = 4 1 3 3 6 15
k = 5 1 2 2 2 3 6
k = 6 1 1 1 1 1 1 1
x = 7
k = 0 1
k = 1 1 7
k = 2 1 6 21
k = 3 1 5 9 35
k = 4 1 4 5 12 35
k = 5 1 3 3 5 7 21
k = 6 1 2 2 2 3 4 7
k = 7 1 1 1 1 1 1 1 1
x = 8
k = 0 1
k = 1 1 8
k = 2 1 7 28
k = 3 1 6 12 56
k = 4 1 5 7 ? 70
k = 5 1 4 4 8 ? 56
k = 6 1 3 3 4 6 ? 28
k = 7 1 2 2 2 2 3 4 8
k = 8 1 1 1 1 1 1 1 1 1

Title: Re: probability and odds
Post by towr on Feb 10th, 2011, 1:53pm
I predict those three question marks (in bold) are 4, 11, 14, 20, 12, 7 (http://oeis.org/A066010)
For the 'row' underneath there's the sequence http://oeis.org/A066019
But it doesn't gives a new perspective for calculating those 'rows'.

Title: Re: probability and odds
Post by x2862 on Feb 10th, 2011, 1:57pm
@SMQ, Thanks for the triangles. I found the useful even though I do not fully understand them. I notice that each triangle has all the minimum values for p given any x,k, and n.(correct me if I am wrong) Also if you stack the triangles into a pyramid one face will be Pascal’s triangle. As to how to compute the minimum value of p I still don’t see it. I f you see something I don’t I would like to know as I an very curious.(that, and the fact I like learning new things).

@everyone
What I have started to do in the mean time is calculate all the possibilities for small sets up to x=10.  In doing so I have made some interesting observations.
Given x=6 k=3 n=2 the minimum value of p =6
C(x,n) = 15 i.e. there are 15 ways to pull 2 numbers at random from a set of 6.
First I tried all 3,003 possible groupings of 5 and found no match.
Then I took C(15,p) = 5,005 i.e. there is 5,005 ways to group the 15 pairs into sub groups of 6.
Of those 5,005 sub groups there is only 10 sub groups that will guarantee a match within 3 numbers selected at random.
Those sets are as follows:
[1,2} {1,3} {2,3} {4,5} {4,6} {5,6]
[1,2} {1,4} {2,4} {3,5} {3,6} {5,6]
[1,2} {1,5} {2,5} {3,4} {3,6} {4,6]
[1,2} {1,6} {2,6} {3,4} {3,5} {4,5]
[1,3} {1,4} {2,5} {2,6} {3,4} {5,6]
[1,3} {1,5} {2,4} {2,6} {3,5} {4,6]
[1,3} {1,6} {2,4} {2,5} {3,6} {4,5]
[1,4} {1,5} {2,3} {2,5} {3,5} {4,6]
[1,5} {1,6} {2,3} {2,4} {3,4} {5,6]

What I notice is there is exactly 2 of every integer 1 thru 6 in each set.
I believe I have found a way to calculate the number of times each integer should  be replicated and where. I not sure though.

So to test this I looked up the minimum p from the triangles you provided for x=7 k=3 and n=2. It should be 9. I then applied my algorithm and came up with the following set:
{1,2} {1,3} {1,4} {2,3} {2,4} {3,4} {5,6} {5,7} {6,7}

This set works for any given 3 numbers out of 7. I think I might have gotten lucky though. As my algorithm only seems to work if x=2. I will look into it further. Possibly I have to exponentially increase my algorithm. But I do not think much is possible beyond calculating the odds, combinations, and minimum p. Finding an algorithm  to calculate the actual sets without using a brute force method on large numbers would imply that P = NP.

Title: Re: probability and odds
Post by ThudnBlunder on Feb 10th, 2011, 2:24pm

on 02/10/11 at 13:57:44, x2862 wrote:
Finding an algorithm  to calculate the actual sets without using a brute force method on large numbers would imply that P = NP.

Per Aspera Ad Astra (http://www.urbandictionary.com/define.php?term=Per%20aspera%20ad%20Astra&defid=2931759)  ;)

Title: Re: probability and odds
Post by x2862 on Feb 10th, 2011, 3:25pm

on 02/10/11 at 13:53:05, towr wrote:
I predict those three question marks (in bold) are 4, 11, 14, 20, 12, 7 (http://oeis.org/A066010)
For the 'row' underneath there's the sequence http://oeis.org/A066019
But it doesn't gives a new perspective for calculating those 'rows'.



Thanks for that interesting find. I am going tomorrow to the library over at cal state San Bernardino. they have the book CRC Handbook of Combinatorial Designs, which was mentioned in your link to the Encyclopedia of Integer Sequences.

Feel free to suggest any other books I might find helpful on the subject, as I will want to read those too.

Title: Re: probability and odds
Post by SMQ on Feb 10th, 2011, 4:16pm

on 02/10/11 at 13:57:44, x2862 wrote:
I notice that each triangle has all the minimum values for p given any x,k, and n.(correct me if I am wrong)

Correct.

Quote:
Also if you stack the triangles into a pyramid one face will be Pascal’s triangle.

Also correct; that would be the n = k face.  For that case p is simply C(k,x) since you need to pick all k-sets for full cover.

Quote:
As to how to compute the minimum value of p I still don’t see it.

Nor do I, and towr's links would seem to indicate that it's an open problem since the sequences in OEIS contain unknown values.

Quote:
I f you see something I don’t I would like to know as I an very curious.

From studying the results, I believe can calculate p in the following cases:
n = 0 --> p = 1
n = 1 --> p = x - k + 1
n = 2 --> p = (x mod (k - 1))C(2,http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/lceil.gifx/(k-1)http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/rceil.gif) + (k - 1 - (x mod (k - 1)))C(2,http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/lfloor.gifx/(k-1)http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/rfloor.gif)
n = k --> p = C(k,x)
k = x - 1 --> p = http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/lceil.gifx/(x-n)http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/rceil.gif
k = x --> p = 1

The only particularly interesting one of those is the n = 2 case (for which there may be a better closed-form expression).  My insight being that n = 2 can be viewed as an undirected graph where the elemnts of X are the vertexes and the p sets of two elements are the edges.  In this case the most efficient solution is to divide the graph into k-1 fully-connected independent subgraphs.  That way at most k-1 elements can be chosen such that no two are connected by an edge of the graph; the k-th vertex must be connected to some other already-chosen vertex by an edge, and so any set of k vertexes will contain at least one edge.

This idea can be extended to the cases where n = a, k = (a-1)b + 1 for any integers a and b >= 1, but I don't see a simple extension to any other cases.

--SMQ

Title: Re: probability and odds
Post by SMQ on Feb 11th, 2011, 8:54am

on 02/10/11 at 13:53:05, towr wrote:
I predict those three question marks (in bold) are 4, 11, 14, 20, 12, 7 (http://oeis.org/A066010)
For the 'row' underneath there's the sequence http://oeis.org/A066019
But it doesn't gives a new perspective for calculating those 'rows'.

Actually, that might be an interesting result since the sequences you link to are answering a different--though related--question!

The problem here asks: what is the minimum number of n-sets you can pick such that every k-set contains at least one of the chosen n-sets?

The covering problem asks: what is the minimum number of k-sets you can pick such that every n-set is contained in at least one of the chosen k-sets?

It would appear, then, that p(x, k, n) = C(x,  x - n, x - k).  I guess that's not a deep observation, since if A, B http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/subset.gif X then A http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/subset.gif B http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/biglongleftrightarrow.gif BC http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/subset.gif AC.  Ah well.

--SMQ

Title: Re: probability and odds
Post by x2862 on Feb 11th, 2011, 11:10pm
I have found the following which is interesting

http://oeis.org/A084546

The numbers match C(C(x,n),p) so if I could produce those same results without p then I could easily calculate p.

I also notice that that the lowest value p could be in this case would be the value just before the median value. I still am looking into wither this can be of any use to finding p.

Title: Re: probability and odds
Post by towr on Feb 12th, 2011, 6:31am
In what way does that sequence has anything to do with this problem? I don't see our numbers match up in any way with that sequence. ?!

Title: Re: probability and odds
Post by x2862 on Feb 12th, 2011, 2:46pm
Perhaps that sequence dose not relate to the problem at hand. perhaps it does. I have no idea. But I noticed some interesting things.
Here is what I noticed, given x=6, k=3, and n = 2 what are the possible ways to arrange the 15 different sets of 2 into sets of p.  
If p = 1 then C(15,1) =    15.
If p = 2 then C(15,2) =    105.
If p = 3 then C(15,3) =    455.
If p = 4 then C(15,4) =    1365.
If p = 5 then C(15,5) =    3003.
If p = 6 Then C(15,6) =   5005. //here is our minimum p (6). It is also the sixth value.
If p = 7 Then C(15,7) =   6435. //here is the apex of the curve.
If p = 8 Then C(15,8) =   6435.
If p = 9 Then C(15,9) =   5005.
If p = 10 Then C(15,10) = 3003.
If p = 11 Then C(15,11) = 1365.
If p = 12 Then C(15,12) = 455.
If p = 13 Then C(15,12) = 105.
If p = 14 Then C(15,14) = 15.

Notice the matching numbers?
15, 105, 455, 1365, 3003, 5005, 6435, 6435, 5005, 3003, 1365, 455, 105, 15

Now lets change the problem to x=7, k=3, n=2.
Now there are 21 different ways to arrange 7 numbers into groups of two

Starting with 21 the sequence gives
21, 210, 1330, 5985, 20349, 54264, 116280, 203490, 293930, 352716, 352716, 293930, 203490, 116280, 54264, 20349, 5985, 1330, 210, 21

Doing the above for p I find the same thing.

If p = 1 Then C(21,1) =   21
If p =  2 Then C(21,2) =   210
If p =  3 Then C(21,3) =   1330
If p =  4 Then C(21,4) =   5985
If p =  5 Then C(21,5) =   20349
If p =  6 Then C(21,6) =   54264
If p =  7 Then C(21,7) =   116280
If p =  8 Then C(21,8) =   203490
If p =  9 Then C(21,9) =   293930 //here is our minimum p (9).
If p =  10 Then C(21,10) =   352716 //here is the apex of the curve
If p =  11 Then C(21,11) =   352716
If p =  12 Then C(21,12) =   293930
If p =  12 Then C(21,13) =   203490
If p =  14 Then C(21,14) =   116280
If p =  15 Then C(21,15) =   54264
If p =  16 Then C(21,16) =   20349
If p =  17 Then C(21,17) =   5985
If p =  18 Then C(21,18) =   1330
If p =  19 Then C(21,19) =   210
If p =  20 Then C(21,20) =   21

Hope this clarifies  what I noticed.



Powered by YaBB 1 Gold - SP 1.4!
Forum software copyright © 2000-2004 Yet another Bulletin Board