```

wu :: forums
(http://www.ocf.berkeley.edu/~wwu/cgi-bin/yabb/YaBB.cgi)

riddles >> easy >> Expected number of throws before n heads in a row
(Message started by: NickH on May 30th, 2004, 1:40pm)

```

Title: Expected number of throws before n heads in a row
Post by NickH on May 30th, 2004, 1:40pm
What is the expected number of times you would have to toss a fair coin in order to obtain n consecutive heads?

For example, you could obtain two consecutive heads as follows:

HH (2 tosses)
THH (3 tosses)
HTHH (4 tosses)

... and so on.

Title: Re: Expected number of throws before n heads in a
Post by towr on May 30th, 2004, 2:35pm
N + the expected length of a sequence [e](which doesn't end on a H)[/e] with less than N consecutive heads ..  ;)

Title: Re: Expected number of throws before n heads in a
Post by grimbal on May 30th, 2004, 2:49pm
It's 1+(expected throws to get heads).
Because after 1 throw, the probability of getting the same side again is the same as the probability of getting heads.

Title: Re: Expected number of throws before n heads in a
Post by Eigenray on May 30th, 2004, 10:59pm
Let E(k) be the expected number of tosses required immediately after a string of k heads:[hide]
E(k) = 1/2 [ 1+E(k+1) ] + 1/2 [ 1+E(0) ], for k<n,
E(n) = 0.
We may write
E(k+1) = 2E(k) - 2 - E(0),
and easily convince ourselves that
E(k) = E(0) - 2k+1 + 2,
and since E(n) = E(0) - 2n+1+2 = 0,
E(0) = 2n+1 - 2.
[/hide]::

Title: Re: Expected number of throws before n heads in a
Post by THUDandBLUNDER on May 31st, 2004, 1:10am

Quote:

LOL (http://db.uwaterloo.ca/~alopez-o/math-faq/mathtext/node18.html)!

;D

Title: Re: Expected number of throws before n heads in a
Post by Grimbal on May 31st, 2004, 8:32am
Oops.  :-[  Right answer to the wrong question.

What I solved is the expected number of throws to get 2 times the same side.

If we want only HH, then it is twice that value.

And if I develop for the general case, I find like Eigenray.

Title: Re: Expected number of throws before n heads in a
Post by Icarus on May 31st, 2004, 7:55pm
T&B - Thanks for the link! I have often heard of this bill, but never before seen the actual thing. As the accompanying article points out, the bill itself does not appear to assign "3" as the value of [pi], even though apparently the author claimed it did. But instead, it assigns 4 other values to [pi] (none even good approximations) at various times.

I also note that contrary to common claim, it appears that the Indiana legislature did not think they could decide mathematical truth by legislative fiat. Rather, being not so educated as to be able to cut through the heavy b.s. in this bill (I found it extremely taxing, and was unable to determine the meaning of some phrases), they thought that they were being given the chance to adopt proven new discoveries into their education curriculum without inccuring added costs. They were hoodwinked into thinking that these results were proven by false claims of peer review (at least, I find it highly doubtful that the American Mathematical Monthly ever accepted this load of bunk).

It would appear that much of the ridicule of the Indiana Legislature induced by this action was undeserved (much, but not all - they still should have checked with real mathematicians before voting - but this sort of mistake is made by legislatures all the time).

Title: Re: Expected number of throws before n heads in a
Post by THUDandBLUNDER on May 31st, 2004, 9:05pm

Quote:
 ...but this sort of mistake is made by legislatures all the time).

There was also an extremely successful April Fool's Day hoax (http://www.snopes.com/religion/pi.htm).

Title: Re: Expected number of throws before n heads in a
Post by william wu on Jul 27th, 2004, 5:10am
I know a sweet way to do this problem, from martingale theory. This approach may at first seem unnecessarily complicated, but it has the advantage of being robust enough to easily solve more difficult questions, such as

In drawing lowercase English letters a through z uniformly at random, the expected time till you see willywilly is 26^10 + 26^5.

In drawing numerals over 0-9 uniformly at random, the expected time till you see 1231231 is 10^7 + 10^4 + 10.

In flipping a fair coin, the expected time till you see HTHTTHT is 2^7 + 2^2.

This is a worthwhile technique to know for analyzing the time till certain patterns occur in random processes.

Here's the argument; hopefully after reading it you'll feel it was worth your time ...

First imagine a casino which contains only one game -- the game of betting on whether a fair sided coin comes up heads or tails. The odds are fair, meaning that if you bet \$X on a guess and you are correct, you get \$2X back as reward; otherwise you lose \$X.

Now for the gamblers. At every time step, a new gambler enters the casino, to join the existing pool of gamblers inside. Now, all the gamblers use the same policy. Each comes in with only \$1, and they progressively bet all their monies on the sequence Hn = HHHH ... H.

So for example, at time 1, gambler #1 appears; we will denote him by G1. He bets \$1 on H. If he wins, he gets \$2 back -- all of which he bets on H in the next time step. And so forth. If he ever loses, he goes broke and leaves the casino. However, if he's lucky enough to get a straight of n heads, then he'll get \$2n back, in which case the casino managers get seriously discouraged and the casino immediately shuts down. Thus the casino shuts down exactly when the first occurrence of n consecutive heads is completed. Let this time be T.

Now we need three crucial observations:

1) Eventually the casino will shut down, since eventually the sequence of n consecutive heads will rear its "head" :). When it's all over, the ONLY gamblers who will have made any money in this casino are the final n gamblers, GT-n+1 through GT. These n gamblers cause a monetary loss of 2n+1 - 2 - n for the casino.

"Proof:" GT-n+1 is the "shutdown" gambler who was lucky enough to complete the entire heads sequence. But while that sequence is being completed, new gamblers GT-n+2 through GT are coming in at each new time step, successfully betting on partial sequences. Thus the last guy to come in gets \$2, the guy before him gets \$4 (for HH), and the guy before him gets \$8 dollars ... etc. until GT-n+1, who gets \$2n. Thus their total worth is 2 + 4 + 8 + ... + 2n = 2n+1 - 2. Finally, to compute how much the casino has to pay out, note that each of the n gamblers pays \$1 to play. Thus we subtract \$n from their total worth.

Now you ask, what about all the gamblers before GT-n+1, who won a couple coin flips? Well, since the n consecutive heads have never occurred yet, all those gamblers had to have lost all their monies, since they always bet everything they have.

2) For the stretch of time between n and T-n inclusive, the casino earns money at an amortized rate of \$1 per time unit.

"Proof:" Remember, all the gamblers with entry times prior to T-n+1 end up losing all their money. Since they came in with only \$1, the casino gets \$1 from each of those gamblers. And exactly one new gambler enters at each time step.

Note that this \$1 will be in the casino's possession after at most n units of time have elapsed since the losing gambler's entry. Thus, starting at time n, we can say that the casino is earning money at a rate of \$1 per time unit.

3) Since the odds are fair, the expected amount of money which leaves the casino is equal to the expected amount of money that enters the casino.

"Proof": Obvious; this is what we mean by fair odds.

Finally, we combine these three facts.

From (1), we know that the money going out is 2n+1 - 2 - n., and all the gamblers prior to the shutdown gambler have lost their money.

From (2), we know that the money going into the casino comes in at a amortized rate of exactly \$1 per each time step, prior to the shutdown gambler's time of entry.

From (3), since expected money going out = expected money going in, then, the expected money coming in must also be 2n+1 - 2 - n. Finally, since the money comes in at an amoritzed rate of \$1 per time unit, the expected time till the string of n consecutive heads BEGINS is 2n+1 - 2 - n. Finally we add n to this time, to get the expected time till the string finishes: 2n+1 - 2.

One can easily extend the above argument to other kinds of patterns over different alphabets.

Footnote: An strange consequence of this I noticed is that the more "self-similarity" your pattern has when reading it from left to right, the longer it will take for the string to be seen. My amusing interpretation of this is that history is less likely to repeat itself!

Title: Re: Expected number of throws before n heads in a
Post by towr on Jul 27th, 2004, 6:19am

on 07/27/04 at 05:10:29, william wu wrote:
 Footnote: An strange consequence of this I noticed is that the more "self-similarity" your pattern has when reading it from left to right, the longer it will take for the string to be seen.
You mean it would take longer to get 123123 than 132213 ? It seems to me that can't be the case, seeing as independence makes both strings equally likely..

Title: Re: Expected number of throws before n heads in a
Post by william wu on Jul 27th, 2004, 6:27am
Yes, according to the argument, the expected time to see your first string is longer than the time to see the second. I didn't believe it either, but I did some simulations a while back on strings to that effect, and I think I confirmed it. Although I still find it disturbing myself. You have to be careful though -- a lot of simulations are necessary in order to get estimates accurate enough to see the tiny differences in expected time. Namely, the standard errors of the sample means should be significantly less than the difference. This difference is only a constant term, where as the expected times themselves are on the order of n^(alphabet size), where n is the length of the pattern. (For instance, in your example, assuming alphabet = {1,2,3,}, the difference is only (3^6 + 3^3) - (3^6 - 3^2) = 27 - 9 = 18.) When simulating, try to choose strings that maximize the ratio of this difference to the expected time till the patterns occur, within the computational limits of available resources.

Title: Re: Expected number of throws before n heads in a
Post by towr on Jul 27th, 2004, 8:14am
In both cases you can define (the same) matrix
Ai :=
[2/3, 1/3, 1/3, 2/3, 1/3, 1/3,   0;
1/3, 1/3, 1/3,   0, 1/3, 1/3,   0;
0,   1/3,   0,   0,   0,   0,   0;
0,   0, 1/3,   0,   0,   0,   0;
0,   0,   0, 1/3,   0,   0,   0;
0,   0,   0,   0, 1/3,   0,   0;
0,   0,   0,   0,   0, i/3,   1]

which gives the state-transition at step i
and
[prod]i=1[supinfty]Ai [times][1,0,0,0,0,0,0]  gives the end state of the system.
And considering the transition matrices are the same in both cases, the end result is the same, with the last element of the vector (which holds the expected waiting time) being 756..

132312 on the other hand really would take less time. It's intuitive to see why, because if you at some point have the first 5 numbers, you can either
-get the number 2, in which case you're done
-get the number 1, in which case you're back at only having the first number of the sequence
-or you get a 3, in which case you have the first 2 numbers of the sequence.

With 123123 if you have the first 5 numbers, you either
-get a 3 and finish
-get a 1, and start again with only the first number
-or get a 2 and you'll have to start over completely

So on average you have to repeat more of the sequence there..

Title: Re: Expected number of throws before n heads in a
Post by william wu on Jul 28th, 2004, 4:19am
Not sure what that matrix is about. States? If it's supposed to be a probability transition matrix, the rows don't add to one either.

Anyways, your latter comments I think are a great way to see it. You can consider the minimal finite state machines which one can build to recognize these patterns. The expected time to get to the absorption state is then given by some markov chain analysis. But basically when trying to recognize patterns with high left-to-right repetitive substructure, you often pay the price of retracting to very early states when an sequence mismatch occurs.

Title: Re: Expected number of throws before n heads in a
Post by towr on Jul 28th, 2004, 5:03am

on 07/28/04 at 04:19:33, william wu wrote:
 Not sure what that matrix is about. States? If it's supposed to be a probability transition matrix, the rows don't add to one either.
The columns add to one (except for the 5th, as that doesn't give a chance, but probability*time = (partial) expected time)

You start without anything of the string, then either you get 1 with probability 1/3, or 2 or 3 (and thus are left with nothing of the string) with probability 2/3.
If you have the first 1, then you can get a 2 with probability 1/3, or you get a 1 (in which case you still only have the first number of a string) with probability 1/3, or you get a 3 with probability 1/3 in which case you have to start over.
etc

Near the end you have 12312 and either get a 3 with probability 1/3 in which case you're done and it took i steps, so expected time get incremented by i/3 * (fraction in state 12312), other wise you return to 1, or nothing both with probability 1/3..

Title: Re: Expected number of throws before n heads in a
Post by THUDandBLUNDER on Aug 22nd, 2004, 9:56am

Quote:
 It seems to me that can't be the case, seeing as independence makes both strings equally likely..

Title: Re: Expected number of throws before n heads in a
Post by Porsche911 on Dec 4th, 2007, 10:06am
With Markov Chains we get the same result! (10010010 for 1231231):
Here is a nice introduction:
http://www.dartmouth.edu/~chance/teaching_aids/books_articles/probability_book/Chapter11.pdf

We have 8 different states:
{0,1,2,3,4,5,6,7}

State 7: ...1231231 = Absorption State
State 6: ...123123
State 5: ...12312
State 4: ...1231
State 3: ...123
State 2: ...12
State 1: ...1
State 0: ...

In MathematicaCode:

TransitionMatrix = {
{9/10, 1/10, 0, 0, 0, 0, 0, 0},
{8/10, 1/10, 1/10, 0, 0, 0, 0, 0},
{8/10, 1/10, 0, 1/10, 0, 0, 0, 0},
{9/10,  0, 0, 0, 1/10, 0, 0, 0},
{8/10, 1/10, 0, 0, 0, 1/10, 0, 0},
{8/10, 1/10, 0, 0, 0, 0, 1/10, 0},
{9/10,  0, 0, 0, 0, 0, 0., 1/10},
{0, 0, 0, 0, 0, 0., 0, 1];

Q = {
{9/10, 1/10, 0, 0, 0, 0, 0},
{8/10, 1/10, 1/10, 0, 0, 0, 0},
{8/10, 1/10, 0, 1/10, 0, 0, 0},
{9/10, 0, 0, 0, 1/10, 0, 0},
{8/10, 1/10, 0, 0, 0, 1/10, 0},
{8/10, 1/10, 0, 0, 0, 0, 1/10},
{9/10, 0, 0, 0, 0, 0, 0];

(*Q = TransitionMatrix without absorption states*)

ExpectedTimeToAbsorption =
Inverse[(IdentityMatrix[7] - Q)].{1,1,1,1,1,1,1}
TimeTill1231231={1, 0, 0, 0, 0, 0, 0}.T

(*TimeTill1231231=10010010*)

Title: Re: Expected number of throws before n heads in a
Post by Hippo on Dec 5th, 2007, 5:58am

on 12/04/07 at 10:06:36, Porsche911 wrote:
 With Markov Chains we get the same result! (10010010 for 1231231)

They were probably interested on 123123 (resp. 123312 only. ... I have no time to study the link yet.

The problem with towr analysis was the wrong calculation for restarting ... I hope the link solves it well

on 12/05/07 at 06:21:04, towr wrote:
 ?! Which analysis are you referring to here?

The last one. But let me think about it a little more ... this was just the first impression and sorry if the analysis was Ok and my impression wrong.

Title: Re: Expected number of throws before n heads in a
Post by towr on Dec 5th, 2007, 6:21am

on 12/05/07 at 05:58:32, Hippo wrote:
 The problem with towr analysis was the wrong calculation for restarting ...
?! Which analysis are you referring to here?

Title: Re: Expected number of throws before n heads in a
Post by Porsche911 on Dec 13th, 2007, 2:39am
Hey Hippo,

no we are interested in 1231231.

Title: Re: Expected number of throws before n heads in a
Post by fiziwig on Dec 20th, 2007, 10:15am

Quote:
 Footnote: An strange consequence of this I noticed is that the more "self-similarity" your pattern has when reading it from left to right, the longer it will take for the string to be seen.

I really had to convince myself that this counter-intuitive result was true. Here's how I proved it to myself:

Rather than playing the game until someone wins, we continue to play the game for an infinite length of time. We know that any given string will occur with equal frequency as any other given string. In other words, we expect the interval between occurrences of a given string to be, on average, the same regardless of the string in question. However, given that "123123" has just been encountered for the first time, we already have 3 out of the 6 digits matching as we look for the second occurrence, thus giving a slight edge to a second consecutive occurrence. Since the likelihood of a second consecutive occurrence is higher than average, which would give a very much shorter interval, in order for the long-term average interval to be correct, the interval before the first occurrence must be larger.

Quote:
 132312 on the other hand really would take less time. It's intuitive to see why, because if you at some point have the first 5 numbers, you can either  -get the number 2, in which case you're done-get the number 1, in which case you're back at only having the first number of the sequence-or you get a 3, in which case you have the first 2 numbers of the sequence. With 123123 if you have the first 5 numbers, you either  -get a 3 and finish  -get a 1, and start again with only the first number-or get a 2 and you'll have to start over completely So on average you have to repeat more of the sequence there..

132312 is claimed to be favored to show up earlier.

If you have the first 4 digits you can either
-get 1 and finish
-get 2 and start again with NO digits correct
-get 3 and start again with NO digits correct

123123 is claimed to be favored to show up later.

If you get the first 4 digits you can either
-get 2 and be finished
-get 1 and start over with only 1 digit correct
-get 3 and start over with NO digits correct.

Therefore, the precursor to the favored string is disfavored, and the precursor to the disfavored string is favored. Is the advantage of a certain string always balanced by the disadvantage of its precursor strings? Apparently not, since in the final analysis, some strings are more or less favored to occur (for the first time) sooner than expected.

--gary

Title: Re: Expected number of throws before n heads in a
Post by towr on Dec 20th, 2007, 11:13am
Possibly the last step always outweighs all possible previous steps; but I'm not really good at picking up a thought 3 years on..