wu :: forums (http://www.ocf.berkeley.edu/~wwu/cgi-bin/yabb/YaBB.cgi)
riddles >> easy >> Set P contains n elements
(Message started by: navdeep1771 on Apr 12th, 2019, 9:43am)

Title: Set P contains n elements
Post by navdeep1771 on Apr 12th, 2019, 9:43am
A set P contains n elements. A subset A of set P is chosen at random. Set P is reconstructed by replacing the elements of A. Another subset B is chosen at random. Find the probability that A[intersection]B has exactly m(m<n) elements common.

Title: Re: Set P contains n elements
Post by rmsgrey on Apr 17th, 2019, 6:15am
I don't have a closed form, but, if p is the probability that |A[intersection]B|=m where A and B are uniformly and independently chosen from the 2n subsets of P, I get: [hide]p=n!/{m!*2^(2n)}*SUMk=mn2^(n-k)/{(n-k)!*(k-m)!}[/hide]

Reasoning:
[hideb]
For A to have k elements, with m<=k<=n, there are nCk such sets A.
For each such A, there are kCm subsets of size m that could be overlaps.
For each such overlap, the inclusion or exclusion of the k elements of A in B are fixed, while the remaining n-k elements of P can be in or out freely, giving 2n-k possible Bs for each possible overlap.
Combining it all gives the number of pairs of sets A and B with an intersection of size exactly m and set A having size exactly k as:
nCk*kCm*2n-k
=n!/{k!(n-k)!}*k!/{m!(k-m)!}*2n-k
=2n-k*n!/{m!(k-m)!(n-k)!}
So the total number of pairs of sets A and B with an intersection of size exactly m is:
SUMk=mn[2n-kn!/{m!(k-m)!(n-k)!}]
=n!/m!SUMk=mn2n-k/{(k-m)!(n-k)!}
And there are 2n possibilities for each of A and B independently, giving 22n equally likely possible pairs of A and B, and a final probability of:
n!/{m!*2^(2n)}*SUMk=mn2^(n-k)/{(n-k)!*(k-m)!}
[/hideb]

Title: Re: Set P contains n elements
Post by navdeep1771 on Apr 17th, 2019, 10:39am
I am struggling to write math. That YABBC tags don't work. Neither I can post an image, as it requires some minimum no. of posts. Well I can't understand what's the operation between "n" and "2" in your SUM. It would be better if you can post it in an image where the answer is crisp clear. Let's check for some values, suppose n = 7 and m = 5. What do you got?

Title: Re: Set P contains n elements
Post by towr on Apr 17th, 2019, 10:50am
My result is [hide]C(n, m) * 3(n-m) / 4n[/hide]

[hide]I'm approaching it as: color each item with one of 4 colors (say, red, yellow, blue and green), and then count that a given color (say, red) occurs m times.
So there's C(n, m) ways to pick m red elements, then color the other n-m elements with the 3 remaining colors in one of 3(n-m) ways. And then you divide by 4n for the probability
The analogy is the red and yellow elements are in set A, and red and blue elements are in set B, green ones are in neither[/hide]

for n=7, m=5 I get 189/47 ~= 0.0115

Title: Re: Set P contains n elements
Post by towr on Apr 17th, 2019, 11:29am
NB wolframalpha happily simplifies rmsgrey's result to the same as mine. (Albeit with C(n, m) expanded to n!/m!/(n-m)! )

Title: Re: Set P contains n elements
Post by navdeep1771 on Apr 17th, 2019, 8:45pm
That's correct. The strategy towr used is very similar to that I know i.e. by Venn Diagram. [hide] There are n elements in set P. Construct 2 sets A and B in set P having some intersection part. So now total there are 4 ways to send n elements in Set P which are:
Set A only, Set B only, Intersection of Set A and B, and neither in Set A nor in Set B. So total = 4^n.
Favorable : Selecting any m elements from n and then sending those m elements into the intersection partof Set A and Set B. Which is C(n,m)*3^(n-m). [/hide]

Title: Re: Set P contains n elements
Post by towr on Apr 17th, 2019, 10:19pm
It also looks quite simple to extend this to k sets
[hide]C(n, m) * (2k -1)(n-m) / 2k*n[/hide]

Title: Re: Set P contains n elements
Post by rmsgrey on Apr 18th, 2019, 5:26am

on 04/17/19 at 10:39:42, navdeep1771 wrote:
I am struggling to write math. That YABBC tags don't work. Neither I can post an image, as it requires some minimum no. of posts. Well I can't understand what's the operation between "n" and "2" in your SUM. It would be better if you can post it in an image where the answer is crisp clear. Let's check for some values, suppose n = 7 and m = 5. What do you got?


n was the upper limit of the sum, so "sum from k=m to n of {(two raised to the (n-k)'th power) divided by ...}

In Towr's analogy, I'm breaking the total cases based on the number of yellow items, and then summing those cases.

I would like to point out that towr's extended formula also applies to situations where there are k independently chosen subsets, and you want there to be precisely m that are both in the intersection of any subset of those k sets and outside the union of the remaining sets. For example, if you had sets A, B, C, D, E and F, the formula, with k=6, would give the correct value for the probability of there being precisely m elements that are both in A, B and C, and outside D, E and F.

A simple proof can be found by considering that D can be replaced by its complement, ~D, and similarly E by ~E, F by ~F, without changing the probabilities, and A&B&C&~D&~E&~F then becomes A&B&C&D&E&F - the common intersection of all 6 subsets.



Powered by YaBB 1 Gold - SP 1.4!
Forum software copyright © 2000-2004 Yet another Bulletin Board