Author |
Topic: 100 prisoners & a light bulb (Read 167246 times) |
|
Grimbal
wu::riddles Moderator Uberpuzzler
Gender:
Posts: 7527
|
|
Re: 100 prisoners & a light bulb
« Reply #350 on: Mar 16th, 2005, 3:28pm » |
|
Then there are the binary groupings. In this solution, the idea is to move groups of 2^k token that can be combined 2 together to make groups of 2^(k+1), with the aim to get a single group of 128 token. 28 extra token must be added to make a nice round total (in binary, that is). The transfers are organized in stages: a first stage is meant to transfer single token, the next stage is to transfer and combine 2-groups, then a stage for 4-groups, etc. up to the stage where 64-groups are transferred, hopefully to end up with a single 128-group. There was a confusion how to deal with stage changes. How to avoid a prisonner to send a single token that would be received as a 2-group. The answer is that the receiver must add to his tokens the count of the previous day, regardless of how well it suits him. If the prisonner had no token and planned to pass down whatever he receives, he might receive a single token on the first day of the 2-group stage. In that situation he has one, but has only the possibility to transfer 2. So he has to get back to his cell with his token. On the other side, if he had 1 already, he could immediately continue passing on the 2-group. In general, it is convenient to have long stretches of same group sizes, because then, a receiver of a group who can not combine it can immediately send it further. If the group size changes all the time, (as in a proposed solution that goes 1,2,1,4,1,2,1,8,...), many prisonners will get stuck with a group that they can not transfer. I think mattian's proposal of long stages of 1's then 2's, 4's, etc, up to 64's should be best. I just would add a "showball" at the beginning, i.e. have the N(d) look like: 1,2,3,...,S,1,...,1,2...2,4...4,...,64...64, (and restart 1...1,2...2, etc.). The question is to optimize the length of the snowball, the lengths of each stage in the initial cycle, and the lengths of the stages in the "repeat" cycles, that are probably shorter OK, that's it. I hope it makes sense. Actually, I just wanted to clarify a few points, like the idea that the light is a 1-bit message, that its meaning must be fixed from the start, and that token received must be accepted, even though they can be passed to the next prisonner, but only if the day's group size allows.
|
|
IP Logged |
|
|
|
SMQ
wu::riddles Moderator Uberpuzzler
Gender:
Posts: 2084
|
|
Re: 100 prisoners & a light bulb
« Reply #351 on: Mar 17th, 2005, 6:42am » |
|
Snowballs and badges and crowns, oh my! This was the thread that originally drew me into the forums, and while I haven't found anything new to contribute, I just thought I'd pop up and say "excellent summary!" Just for completeness, here are the details of gypo's best run time opening (from page 11, runtime of 3489, wherein a "badge" means a prisoner is collecting 1-tokens and a "crown" means a prisoner is collecting 10-tokens): on Oct 23rd, 2003, 4:29am, gypo wrote:* Initially, all prisoners have one token each, zero badges and zero crowns (same as before). * On day 0 of each cycle, if the light is on, the prisoner gets a badge and three tokens. If the light is off he gets nothing. If he now has one or more tokens or one or more badge, he turns the light on and subtracts one token from his inventory, even if he ends up with a negative number of tokens. If the prisoner has no tokens and no badges, he switches the light off and gets a badge. * On days 1 and 2, the prisoner does nothing if the light is off. If the light is on and he has a token or a badge, he leaves the light on and subtracts one token from his inventory. If the light is on but he has no tokens or badges, he turns the light off and adds d tokens and a badge to his inventory. * On day 3, if the light is off the prisoner does nothing. If the light is on and he doesn't have a badge already, he switches the light off and gets three tokens and a badge. If the light is on and he already has a badge, he leaves the light on and does nothing. * Whoever gets a badge during cycle 0 also gets a crown. * On day 1 of stage 1, if the light is on, the prisoner gets a badge and three tokens, switches the light off and pretends it was off to start with. This opening will result in somewhere between 14 and 25 drones (out of 90) being counted during the first 40 days, which gives you some idea of the power of the method. I haven't really optimised the stage lengths very thoroughly, but this is what I used to achieve the 3489 days: stage 0: 10 cycles of 4 days; stage 1: 2000 days; stage 2: 1500 days; stage 3: 300 days; stage 4: 300 days. |
| --SMQ
|
« Last Edit: Mar 17th, 2005, 6:43am by SMQ » |
IP Logged |
--SMQ
|
|
|
rmsgrey
Uberpuzzler
Gender:
Posts: 2873
|
|
Re: 100 prisoners & a light bulb
« Reply #352 on: Mar 17th, 2005, 8:05am » |
|
One possible optimisation to gypo's method is to extend the allocation phase by another 2 cycles, to hand out 12 badges, the first four of which are "badge+" and require 9 tokens to fill; the remaining 8 requiring 8 tokens each. It may also be worth tweaking the allocation cycle lengths slightly so earlier cycles drop the third day (pass the badge if you already have one and it hasn't yet been collected, without increasing the snowball) and later cycles add a fourth day (ditto). It may also be worth tweaking the sizes of the mini-snowballs. The reason for suggesting 12*8ish rather than 10*10 is that someone reported better results with that division when looking at pre-assigned leaders.
|
|
IP Logged |
|
|
|
Icarus
wu::riddles Moderator Uberpuzzler
Boldly going where even angels fear to tread.
Gender:
Posts: 4863
|
|
Re: 100 prisoners & a light bulb
« Reply #353 on: Mar 17th, 2005, 7:41pm » |
|
Concerning the straight binary solution (no snowball round or other variations) and how to handle the extra 28 tokens: Suppose at the beginning of a stage, there are L prisoners with the appropriate level of token (call these prisoners "active" for the stage), out of a total of N prisoners. Because of how the binary scheme works, each active prisoner visits the room once, and then becomes inactive - it is impossible for an active prisoner to visit the room and remain active. I will assume that the stage lasts long enough for all active prisoners to visit. The expected time for all active prisoners to visit can be calculated as follows. While k prisoners are active, the probability of one visiting on any particular day is k/N, and the expected time until one does visit is the inverse: N/k. So the expected time from the start of the stage to the visit of the first active prisoner is N/L. The expected time from this visit to the 2nd visit of an active prisoner is N/(L-1). This continues until the expected time between the visit of the 2nd-to-last to the last active prisoner is N/1. This gives a total expected conclusion time of N * [sum] 1/k for k=1 to L. (By "conclusion time", I mean the number of days it takes for all prisoners to become inactive in the stage, assuming that the stage lasts long enough. The stage itself may last much longer, as no one knows it has accomplished all it can.) Now if N is not a power of 2, and no special measures are taken to correct for this, at the beginning of some stages, L is odd. The light starts each stage in the off position (turning it off is part of the previous stage, even if it happens on the same day). Each active prisoner toggles the state of the light, and since L is odd, eventually it will be on when no active prisoners remain, indicating a lone token left in the room, and no one able to pick it up. Call this the (unsuccessful) conclusion of the stage. Consider a fixed order of prisoner visits, and let us look at the effect of various approaches for fixing this problem. Call the last active prisoner to visit in the uncorrected scheme "Ender". Grimbal/Mattian's method is to bequeath an extra token to the person visiting on the first day of the stage. If the first visitor was inactive, this makes him active, so he now switches on the light, whereas before he left it off. The next active visitor would have turned it on before, but now he turns it off. And the following active visitor now flips it on, where before he turned it off, and so on. The light bulb is now in the opposite state every day than it would of been in the uncorrected version. The result is that Ender turns the light off instead of on, and no tokens are left over. Note that it is still the same prisoner Ender who concludes the stage in both the corrected and uncorrected versions. So the two versions last the same length of time. Now, however, the stage has a successful conclusion. --------------------------------------- This post is too long for YaBB, so I am breaking it in two here.
|
|
IP Logged |
"Pi goes on and on and on ... And e is just as cursed. I wonder: Which is larger When their digits are reversed? " - Anonymous
|
|
|
Icarus
wu::riddles Moderator Uberpuzzler
Boldly going where even angels fear to tread.
Gender:
Posts: 4863
|
|
Re: 100 prisoners & a light bulb
« Reply #354 on: Mar 17th, 2005, 7:42pm » |
|
Continuing: Suppose instead that the extra token is given to some randomly chosen inactive prisoner. The effect of his first visit to the room is to swap the state of the bulb from then on in the stage. If his first visit to the room occurs before that of Ender, then just as before, Ender turns the light off instead of on and concludes the active stage after the same amount of time. If, his first visit falls after Ender's, then Ender will turn on the light, and he will turn it off, extending the conclusion of the stage by a certain number of days (expectation = N*((N-1)/N)T, where T is the number of days until Ender's visit). Now suppose the extra token is given to a randomly chosen active prisoner. The prisoner is removed from the active list, and the toggle the prisoner induces is removed, flipping the bulb state for all later prisoners. If the chosen prisoner is not Ender, this has no effect on the conclusion time of the stage, as Ender still ends it. On the other hand, if Ender is chosen, then the 2nd last active visitor (call him Pretender) becomes the last active prisoner, which shaves an expected N days off the round. Since Ender has 1/L chance of being the chosen prisoner, the overall expection in this case is a N/L decrease in conclusion time. Of course, there is no way to pick an active prisoner in advance, so assume that any prisoner is chosen at random. There is an L/N chance the prisoner is active, resulting in an expected -N/L change in conclusion time, and a (1 - L/N) chance the prisoner is inactive, resulting in an expected (N-1)T/NT-1 change in conclusion time, where T = N[sum] 1/k, k=1..L. Total expected conclusion time is T - 1 + (N-L)(1-1/N)T. For our particular case of N=100, there are 3 such stages: #3, #4, and #5, by my terminology. For stage 3, L=25, T ~= 100*(3.816) = 381.6 days. And the change in conclusion time for picking a prisoner at random is - 1 + (75)(.99)381.6 ~= 0.6 days. For stage 4, L=13, T ~= 100*(3.180) = 318 days. And the delta conclusion time is -1 + (87)*(.99)318 ~= 2.56 days. For stage 5, L=7, T ~= 100*(2.593) = 259.3 days. And the delta conclusion time is -1 + (93)*(.99)259.3 ~= 5.87 days. So it appears that Mattian & Grimbal's method could (with judicious stage length choices) save an expected 9 days off of simply picking a person at random in each stage to receive the extra token. However, in AlexH's scheme, they are not given to separate random prisoners, but to the same prisoner. To understand the effect of this, consider these cases: (1)The chosen prisoner gains a token in stages 1 & 2. (prob 1/4) Or, the chosen prisoner donates a token in one or the other of these two stages (prob 3/4), but (2) the prisoner gains a token in stage 3 (prob = (3/4)(1/2) = 3/8). Or the prisoner does not gain a token in stage 3 (and donates the extra one given him)(prob=(3/4)(1/2) = 3/8), but (3) does gain a token in stage 4 (prob = (3/8)(1/2) = 3/16). (4) Or the prisoner also donates his stage 4 token (prob = 3/16). In case 1, the prisoner is able to build a level 6 token by the end of stage 2, and therefore has no need to take part in stages 3, 4, 5, which now have 24, 12, 6 active members. This changes the expected conclusion times to 377.6, 310.3, and 245 days, saving a total of 26 days over the G/M method. In case 2, the prisoner takes part in stage 3, but not in stages 4 and 5, which now have 26, 12, 6 active members. Conclusion times are 385.4, 310.3, 245, saving 18.2 days over G/M. In case 3, the prisoner is in stages 3 and 4, but not in 5. Active members are 26, 14, 6. Times are 385.4, 325.2, 245, saving 3.3 days over G/M. In case 4, the prisoner is active in all three stages. Active members are 26, 14, 8. Times are 385.4, 325.2, 271.8, costing 23.5 days more than G/M. The total expected difference is (.25)(-26) + (.375)(-18.2) + (.1875)(-3.3) + (.1875)(23.5) ~= -9.5 days. So in the end, AlexH's method saves better than 9 days in stage conclusion time over Grimbal and Mattian's. This is not the same thing as saving 9 days in expected imprisonment, but does indicate a better overall performance. Finally, I end with this question: Does it improve matters to tell the prisoner given the extra tokens to never give away a token in the first two stages? (If he enters the room with the light off, he greedily keeps his token instead of turning the light on.) This attempts to swing matters towards that -26 day total, at the cost of ineffective visits in the first two stages.
|
« Last Edit: Mar 17th, 2005, 7:50pm by Icarus » |
IP Logged |
"Pi goes on and on and on ... And e is just as cursed. I wonder: Which is larger When their digits are reversed? " - Anonymous
|
|
|
mattian
Senior Riddler
Gender:
Posts: 404
|
|
Re: 100 prisoners & a light bulb
« Reply #355 on: Mar 18th, 2005, 8:29am » |
|
Hi guys, Icarus, I know I was supposed to give you an explanation of my results, but I have lost the source code for my original simulation. I'm in the middle of rewriting it now. In fact I have developed an engine that will run any strategy defined - ie. The general framework is built and will run an arbitrary strategy that is defined as a set of rules. I intend to include all the main strategies from this forum and then I will post the finished product and the source code to the forum when it is done, complete with the results of the tests. As for my own method, at the moment I'm getting about 5000 - so I've left something out (maybe the bit that cheats ) in my new implementation, but I will keep you posted. In brief, the mattian method - I will call it that for now - incorporates an initial analysis phase (which could be used for any of the methods) that aims to optimise the stage durations. It does this by establishing (using statistics from hundreds of iterations) the lengths L[...] of each stage in a cycle required in order for there to be an X percent chance that everyone has traded by the end of the cycle. It then determines (again using statistics) the average number of left over traders in each stage after the first cycle and starts all over again by recalculating L based on the remaining traders. This analysis phase continues until the remaining number traders is less than some threshold T. The results L of the analysis are then used in the actual simulation. X% of the time, the prisoners escape within the sum of the first cycle of stages' durations. X% x (100-X)% of the time, escape time is the sum of the first and second cycles of stages, etc. The trick then is to chose nice values for X and T. T is easy - 0.0001 is a good choice. The analysis phase is a separate method which is analagous to the initial meeting in the courtyard because it runs through test cases to establish some parameters for the strategy. The testing framework allows you to specify the all the fixed parameters, such as the number of prisoners, for example, and then requires that you implement a single method which defines the behaviour of prisoner when he enters the room given two parameters as inputs : the current day, the state of the light. Any strategy can be defined by the same set of rules that all prisoners must follow when they enter the room so all the other dynamics of this problem are handled outside the specific strategy handler method. I will try to run some of the strategies tonight using this initial analysis and see what I get. As for the mattian method, I believe my best results were around 3500, but only if I chose not to include every 10th iteration on average. In other words, every now and then, a result of around 15000 would come up and pull the average way up. That is why I posted a few months ago, asking if we should be considering the overall average to be more significant than the median - so is { 4500, 4501, 4499 } better than { 3501, 3499, 7001 }.
|
« Last Edit: Mar 18th, 2005, 2:00pm by mattian » |
IP Logged |
|
|
|
mattian
Senior Riddler
Gender:
Posts: 404
|
|
Re: 100 prisoners & a light bulb
« Reply #356 on: Mar 18th, 2005, 9:14am » |
|
on Mar 17th, 2005, 7:42pm, Icarus wrote: So in the end, AlexH's method saves better than 9 days in stage conclusion time over Grimbal and Mattian's. This is not the same thing as saving 9 days in expected imprisonment, but does indicate a better overall performance. |
| Actually, there may be a better way than AlexH's scheme: One of the other optimizations I have in my scheme is to award the power to be greedy to the any prisoner who has a balance of at least 64 - so he may collect tokens in any stage and never give any away. A similar strategy could be used for the extra tokens - give 99 prisoners a balance of 1 and the final prisoner a balance of 29 with the power to be greedy until his balance reaches 32. And then when he (or another prisoner) hits 64, it's back to being greedy. I will mull over this for a bit - and add it to the sim.
|
|
IP Logged |
|
|
|
mattian
Senior Riddler
Gender:
Posts: 404
|
|
Re: 100 prisoners & a light bulb
« Reply #357 on: Mar 18th, 2005, 10:12am » |
|
Actually! This is the perfect reason to use a strategy mentioned in some previous posts which use the first three days to handle special cases for optimization. So how about this? : *Initial balance for each prisoner is 1. *If a prisoner visits the room on day 1 or day 2, his balance is set to 0. The prisoner of day 1 switches the light off. *If the prisoners of day 1 and day 2 are the same person, then on the second day, the light is switched on. *If the prisoner visiting on day 3 is the same person as on the other two days, he adds 29 to his 0 balance. *If the prisoner on day 3 is not there for the third time, his new balance is 30 if the light is on, and 31 if it is off. So after three days, the special prisoner with the excess balance will have a balance of 29, 30 or 31. *If the prisoner of day 3 has 31, he leaves the light on and his balance changes to 0, otherwise he turns it off and he keeps his balance. *If the prisoner of day 4 has a balance of 1 and the light is on, then his balance becomes 32, and he switches the light off. Normal trading begins on day 5. *If the prisoner of day 4 has a balance of 0 and the light is on, his new balance is 31 and he switches the light off. Normal trading begins on day 5. *If the prisoner of day 4 has a balance of 1 and the light is off, Normal trading begins on day 4 with this prisoner. * If the prisoner of day 4 has a balance of 0 and the light is off, he leaves it off and trading begins on day 5. So after four days, the special prisoner (which has not necessarily been the same person throughout the four days), will have a balance of 29, 30, 31 or 32. And 32 is the most likely outcome by far. Thus within four days, the most likely outcome is that: *three prisoners are rendered inactive, and *one prisoner has hit stage 6. (32-token trading).
|
|
IP Logged |
|
|
|
mattian
Senior Riddler
Gender:
Posts: 404
|
|
Re: 100 prisoners & a light bulb
« Reply #358 on: Mar 18th, 2005, 10:41am » |
|
Here's another thought, I don't know if this has been mentioned before - it probably has, and I'm sure you believe that I have read all the posts ... and forgotten a lot. I think I need to brush up on the details of the thread. But nevertheless, ... For this exercise let me refer to days, prisoners and light states according to the following examples: DAY | PR# | L-S | BAL ----------------------- D12 : P43 : 0-1 : 01-00 D25 : P21 : 1-1 : 00-00 D98 : P01 : 0-0 : 00-00 * The first means that on day 12 prisoner number 43 swithed the light from off to on and his balance changed from 1 to 0. * The second means that on day 25 prisoner number 21 left the light on and his balance didn't change. * The third means that on day 98 prisoner number 1 left the light off and his balance didn't change, etc. Along the same lines as my previous post which has the likely outcome of a prisoner with a 32 balance after four days, what about this potential optimizing strategy... Allocate the first N days to chance, as follows. D01 : P01 : ?-1 : 01-00 D02 : P02 : 1-1 : 01-00 D03 : P03 : 1-1 : 01-00 D04 : P04 : 1-1 : 01-00 D05 : P05 : 1-1 : 01-00 ... ... D38 : P38 : 1-1 : 01-00 D39 : P04 : 1-0 : 01-67 ... <-- LOOK HERE : BAL=38+1+28 D40 : P39 : 0-0 : 01-01 D41 : P40 : 0-0 : 01-01 ... ... DN : P?? : 0-0 : ??-?? So after N days a prisoner has collected between 1 and N tokens depending on how many prisoners form the sequence without repetition. A good choice of N might be the average number of days it takes for one prisoner in 100 to be selected a second time. 101 is the obvious maximum, 2 is the obvious minimum. At the very least, there is a fair chance you could get a prisoner into level 7 (trading 64-tokens) within the first 50 days by combining the sequence count with the excess 28 tokens. This represents the first cycle. The second cycle could work the same way except that unlike the first cycle which is terminated by a repeating prisoner, the second and subsequent cycles are terminated by prisoners who do not have the currency to trade in the current cycle - ie. zero balances, inappropriate balances, repeating prisoners within that cycle (but they won't have the appropriate currency anyway). This idea (as far as I know) is in its infancy, so please feel free to contribute. It seems as though it may be able to be combined with other strategies, as an optimization. Although as the unbroken sequences become shorter and shorter with the list of eligible traders, this scheme would ultimately slow the thing down. But for the first phase, alone, it could establish a healthy leader and may prove very successful.
|
« Last Edit: Mar 18th, 2005, 10:45am by mattian » |
IP Logged |
|
|
|
mattian
Senior Riddler
Gender:
Posts: 404
|
|
Re: 100 prisoners & a light bulb
« Reply #359 on: Mar 18th, 2005, 2:16pm » |
|
With reference to the G/M strategy (Grimbal/Mattian) : Some clarification on the rules for greediness. Prisoners whose binary representation of their balance, contains more 1's than 0's should be greedy while prisoners with more 0's than 1's should be keen to give up their tokens. ALSO, If a prisoner has exactly half the tokens (64), he should be greedy in all levels except in 64-trading. If he has more than half the tokens, he should be greedy always.
|
« Last Edit: Mar 18th, 2005, 2:20pm by mattian » |
IP Logged |
|
|
|
SMQ
wu::riddles Moderator Uberpuzzler
Gender:
Posts: 2084
|
|
Re: 100 prisoners & a light bulb
« Reply #360 on: Mar 19th, 2005, 6:41am » |
|
Thanks, rmsgrey, for pointing me in an interesting direction! It had never occurred to me before reading your post that the badges need not require the same number of tokens to fill. That is, different prisoners with badges (hereafter badgers) can be looking for different numbers of tokens to fill their badges, and, since only filled badges are passed to the crowner, the system still works--the crowner counts filled badges rather than their tokens. Why is this useful? Because in the initial short snowball rounds the earlier rounds likely collect more tokens than the later rounds. By adjusting the size of the badges (and, slightly, the number of days in the individual snowball rounds), we can make it more liekly that all badgers start out needing approximately the same number of tokens to fill their badge, wihch makes the first stage more efficient. (Since otherwise badgers who need fewer tokens fill their badges early and no longer participate.) Sticking with 10 badgers for the moment (one of whom is also the crowner), I found the following series of badge sizes (used in gypo's stage 0) result in an average runtime (over 10,000,000 trials) of 3426: 1st cycle 6 days long, one badge of size 12 awarded (and also the crown) 2nd & 3rd cycles 5 days long, two badges of size 11 awarded 4th - 6th cycles 5 days long, three badges of size 10 awarded 7th - 10th cycles 4 days long, four badges of size 9 awarded. This solution still uses gypo's stage lengths (2000, 1600, 300, 300, ...) which seem to be close to optimum for 10 badges. Hopefully I can spend some time this weekend investigating solutions with fewer or more than 10 badges--there's still certainly room for improvement: I'm looking to break 3400. --SMQ
|
|
IP Logged |
--SMQ
|
|
|
Icarus
wu::riddles Moderator Uberpuzzler
Boldly going where even angels fear to tread.
Gender:
Posts: 4863
|
|
Re: 100 prisoners & a light bulb
« Reply #361 on: Mar 19th, 2005, 2:08pm » |
|
Just for the record, biwema was the first to suggest that the number of tokens collected by different leaders could differ from leader to leader. SWF made use of the same idea (with just one person collecting a different amount) to come up with his record setting (at the time) proposal. Other than those two, however, the schemes have all had constant collection requirements.
|
|
IP Logged |
"Pi goes on and on and on ... And e is just as cursed. I wonder: Which is larger When their digits are reversed? " - Anonymous
|
|
|
rmsgrey
Uberpuzzler
Gender:
Posts: 2873
|
|
Re: 100 prisoners & a light bulb
« Reply #362 on: Mar 19th, 2005, 2:10pm » |
|
Good to know someone's developing some of my suggestions - even if I'm not putting in the hard graft on evaluating potential solutions on this thread, I am generally interested in people's results. I'm currently wrestling with the concept of a hybridised system, with a mixture of dynamically labelled tokens, and what might well end up as a binary collection method for anonymous or semi-anonymous tokens and aggregates in an attempt to combine the rapid finishing of labelled tokens with the rapid starts of dynamic allocation systems. So far, all I'm getting is a headache, but I've got a feeling there might be something there.
|
|
IP Logged |
|
|
|
Icarus
wu::riddles Moderator Uberpuzzler
Boldly going where even angels fear to tread.
Gender:
Posts: 4863
|
|
Re: 100 prisoners & a light bulb
« Reply #363 on: Mar 19th, 2005, 4:36pm » |
|
Here is an abbreviated history of ideas in the thread I put together as part of my (still) upcoming survey. They are listed in chronological order, not order of best time. I've chosen these posts because I think they each add a significant idea to what has come before. By "Run-time", I mean the expected length of imprisonment under the scheme. The run-times reported are not necessarily those obtained by the proposer listed, but rather are the best reported by anyone who analyzed any minor variation of the scheme (not including any later mentioned idea). Where I know the run-time to great accuracy, I have denoted it as "Run-time = ". Where I am less sure of the accuracy for any reason, I have denoted it as "Run-time ~ ". The first two run-time values I derived myself from first principles. The others are all values I have found mentioned in posts, and are all obtained from simulations. The sudden switch back to the much longer running distribution schemes in the last two entries was the result of investigations into a suggestion of mine that distribution schemes might be profitably combined with counting schemes. (So far, this has proved to be a false hope, though apparently rmsgrey still thinks it might work. He had also suggested a somewhat different means of combining the two methods prior to mine.) I have not included Mattian's or later methods into this history because I am not yet sure what is in them. When I get the time to figure them out, I may edit the history to add them. History BanderSnatch - Complete cycle scheme. Run-time = 1.0608 x 1044 days. Captain Caveman (quoting from a slash-dot site) - Single stage counting scheme ("Leader solution"). Run-time = 10418 days (28.5 years). sjbrown - Distribution scheme, stage length = 1, only the prisoner can pass his marker. Run-time ~ 8230 years (per Mattian). S. Owen - Distribution Scheme, stage length = 1 (anyone possessing a marker can pass it): Run-time 252 years (per Mattian). Salem - Single stage counting scheme with 100 day "snowball" round. Run-time =9367 days (~25.6 years, per Mattian). AlexH - Single stage counting scheme with short "snowball" round. Run-time = 9321 days (per Mattian). Paul Hammond - Multi-stage counting scheme (future cycles start from scratch). Run-time = 3964 days (~10.8 years, per Mattian). AlexH - Multi-stage counting scheme (future cycles are continuations, stage lengths vary between cycles). Run-time ~ 3600 days AlexH - Leaderless binary counting scheme. Run-time ~ 4250 days biwema - Counting scheme with mixed-value tokens: Run-time = 3600 days. SWF - Counting scheme with multi-leader snowball round. Run-time = 3535.6 days. Rezyk - Counting scheme (consolidation round, greedy leader, 2nd cycle degeneration). Run-time = 3532 days. gypo - Counting scheme (borrowing against badges). Run-time = 3489 days Paul Hammond - Distribution scheme with stage length >1. Run-time ~ 54,000 days (148 years) rmsgrey - Distribution scheme with varying stage lengths between cycles: Run-time ~ 53,000 days (145 years) By the way, My kudos for most groundbreaking idea goes to Paul Hammond for introducing the multi-stage concept. This was the seminal post that pushed this puzzle into entirely new territory.
|
« Last Edit: Aug 8th, 2006, 5:46pm by Icarus » |
IP Logged |
"Pi goes on and on and on ... And e is just as cursed. I wonder: Which is larger When their digits are reversed? " - Anonymous
|
|
|
mattian
Senior Riddler
Gender:
Posts: 404
|
|
Re: 100 prisoners & a light bulb
« Reply #364 on: Mar 19th, 2005, 4:39pm » |
|
Working on it - still hovering around 4000 - not sure what I'm missing. Will keep you posted.
|
|
IP Logged |
|
|
|
Icarus
wu::riddles Moderator Uberpuzzler
Boldly going where even angels fear to tread.
Gender:
Posts: 4863
|
|
Re: 100 prisoners & a light bulb
« Reply #365 on: Mar 19th, 2005, 5:05pm » |
|
By the way, Mattian, when you brought up the matter of how to judge effectiveness of a scheme (whether or not expected release time was the best), you stated that your scheme compared favorably to others in most runs, but had occasional long runs that brought the mean run-time up above that of other schemes. All staged counting schemes run in cycles (with a semi-exception for Rezyk's, which has only two cycles, the 2nd of which is not like the first). Each cycle consists of a series of stages, and it is only possible for a prisoner to declare victory in the last stage of any cycle. This results in exactly the pattern you described. Assuming the scheme has been optimized for mean run-time, most runs will result in a prisoner declaring in the first cycle. But occasionally things won't work out in this cycle, and a 2nd cycle occurs. This always adds a tremendous amount of time to the run, as victory cannot be declared until the final stage comes around again. On extremely rare occasions, the 2nd cycle will fail, and a 3rd cycle occurs, which once again adds a big contribution to the length of the run. Etc. In other words, every counting scheme will come in significantly under the mean for most runs (runs with a 1st cycle victory), but occasionally will toss in much longer times (runs with 2nd cycle victory) and on rare occasions toss in an extremely long time (3rd cycle or later victory). So IF you were comparing your 1st cycle victory runs to their mean run-time, the comparison is invalid, because their 1st cycle victory runs were much shorter too.
|
|
IP Logged |
"Pi goes on and on and on ... And e is just as cursed. I wonder: Which is larger When their digits are reversed? " - Anonymous
|
|
|
mattian
Senior Riddler
Gender:
Posts: 404
|
|
Re: 100 prisoners & a light bulb
« Reply #366 on: Mar 19th, 2005, 5:18pm » |
|
Yes, I knew what was causing those nR spikes (where n is the number of cycles and R is the runtime of a succesful one cycle run). I honestly don't remember if my results included the spikes or not - based on the current numbers, I assumed yesterday that they must have excluded the spikes, but now I'm not so sure - I doubt I would have submitted them as comparable to the other solutions if I had to doctor them first. I will have to go back and read my posts. I'm getting close now though, I can feel it. If I can save myself another 500 days on average, I get to save face. .
|
|
IP Logged |
|
|
|
Icarus
wu::riddles Moderator Uberpuzzler
Boldly going where even angels fear to tread.
Gender:
Posts: 4863
|
|
Re: 100 prisoners & a light bulb
« Reply #367 on: Mar 19th, 2005, 5:48pm » |
|
I wasn't accusing you of doctoring the numbers! You quite plainly stated you were getting the occasional spike: on Jul 3rd, 2004, 10:17pm, mattian wrote:My results look something like this: 3352 2944 3323 5929 (this bastard affects my beautiful average so far) 3172 3098 6345 (and again) 2982 6083 (yet again) These are numbers from a real sample I just ran. If you ignore all the high numbers, the low numbers are quite nice. |
| I just wanted to point out that the same situation holds in all multi-stage counting schemes. ____________________________ A more thorough means of reporting these schemes would be to report the 1st cycle victory mean time & probability, 2nd cycle victory mean time & probability, later cycles mean time & probability (one value covering all later cycles), and finally, overall mean time. This would allow several comparisons between schemes, and possibly point out addition directions for improvement. ____________________________ When it comes to looking for improvements, I would suggest a look at Rezyk's scheme (see the link in my history). With all the other schemes, I have some idea as to why they improved on previous methods. But Resyk's has me baffled. Almost everything I see in it looks either dubious (the consolidation round), or downright going in the wrong direction (degeneration - tokens built in the first cycle are torn apart in the second and collected by a single leader, also binary first 3 stages). Yet somehow Rezyk was able to beat all earlier comers (if just barely), and has been bested by only one proposal since. I am curious what it is that results in Rezyk's fast times. Perhaps exploiting it could improve times farther.
|
« Last Edit: Mar 19th, 2005, 6:05pm by Icarus » |
IP Logged |
"Pi goes on and on and on ... And e is just as cursed. I wonder: Which is larger When their digits are reversed? " - Anonymous
|
|
|
SWF
Uberpuzzler
Posts: 879
|
|
Re: 100 prisoners & a light bulb
« Reply #368 on: Mar 19th, 2005, 6:48pm » |
|
Some of the claims made for various strategies appear to be questionable, so it is a good idea to have somebody verify. I programmed Rezyk's approach and verified the numbers claimed. Back when I was working on this problem, I tried many variations and still have some notes. Long ago I gave up working this problem, but with the renewed interest, thought I would share this information. Unfortunately my notes are not well documented, making it difficult to remember what they mean. My notes have a table giving how many days needed to reduce 100 prisoners to n prisoners who have each collected similar numbers of signals. Based on those results I made tables for reducing n prisoners down to m for some of the more promising values of n and m. Finally there is a table for number of days for the m prisoners to pass the information to the final prisoner who knows all 100 have visited. Those tables are bit long to include here, but for the basic method reducing 100 down to 1 in one step took an average of 10391 days, which is slightly different from the 10418 from Captain Caveman. I may have used a slightly different approach that is a little more efficient (perhaps Captain Caveman set the prisoner responsible for collecting signals as the prisoner selected on Day 1, while I may have made him the one who enters on day 2). One table that is not too long to include here gives mean number of days for using a "snowballing" method plus two stages of counting, with the 'secondary counters' (see my old post for what that means) counting to N, and 'primary counter counting' to M. This first table is for after a fairly crude optimization of length of each stage and only about 20000 cases run to get an average. Due to limited number of cases run, the mean number of days may be accurate to within a few days: N M Mean Number of Days 10 10 3537 11 12 3544 9 10 3537 12 16 3574 14 16 3633 7 9 3654 Using the top three cases from that table, more care was taken to optimize the lengths of the stages and for each case around 10 million cases were run for a more accurate average. The results were: N M Mean Number of Days 11 12 3535.55 10 10 3537.20 9 10 3537.21
|
« Last Edit: Mar 19th, 2005, 6:51pm by SWF » |
IP Logged |
|
|
|
mattian
Senior Riddler
Gender:
Posts: 404
|
|
Re: 100 prisoners & a light bulb
« Reply #369 on: Mar 19th, 2005, 7:31pm » |
|
Regarding the validity of schemes in general - I have almost finished the prison model which can run any scheme defined within a single class. I intend to drop all the schemes into this model so that the numbers can be compared using the same prison engine. When I have a solid product, I will drop it here for people to use and improve - it will work for anyone who has a JVM installed on their system - I will post it as a single jar file. In order to define a scheme, one essentially describes the behaviour of a prisoner who is party to that scheme. The all prisoners inherit from the Prisoner class which implements an interface which mandates the definition of a method called drawConclusion which is passed two parameters - a reference to the lightBulb (a simple class instantiated in the Prison object which provides the Prisoner with the ability to set the bulb state or view the bulb state) and the day since the prisoners were locked up. The standard Prisoner class which implements the IPrisoner interface defines drawConclusion to unconditionally return false. Inherited from Prisoner is a class called OptimisticPrisoner which returns true unconditionally and is extended by another class called thinkingPrisoner which is the root of all scheme-following prisoners. The difference between the two initial prisoners (prisoner and optimisticPrisoner) is that the first remains in prison forever and the second is killed on the first day along with all his inmate friends. The various schemes in this forum tend to belong to one of three or four groups - binary solutions, multilevel, leader solutions, etc. I haven't written them yet, but I intend to develop the chain of inheritance such that it mimicks the development of the ideas in each branch of thinking in this forum. I'm curious to see how the numbers compare to those posted for the various schemes. INCLUDING MY OWN!!!
|
« Last Edit: Mar 19th, 2005, 8:08pm by mattian » |
IP Logged |
|
|
|
mattian
Senior Riddler
Gender:
Posts: 404
|
|
Re: 100 prisoners & a light bulb
« Reply #370 on: Mar 19th, 2005, 7:39pm » |
|
Another point is that we should probably formalise the number of iterations that we consider acceptable. SWF for example, posted his results recently after 10 000 000 iterations. Is that enough? Too much? Or could we say that a result is considered comparable to the others if it conforms to the requirement that it holds in simulation of 10 000 000 iterations? Anyone object to 10 000 000 as the standard?
|
|
IP Logged |
|
|
|
rmsgrey
Uberpuzzler
Gender:
Posts: 2873
|
|
Re: 100 prisoners & a light bulb
« Reply #371 on: Mar 20th, 2005, 8:05am » |
|
Well, so far, my hybridisation schemes are tending to converge on gypo's scheme anyway, so, while there is something in it, that something looks rather like what we've already got... A brief summary of (my) current notation (in the hopes of inspiring people): Token - every prisoner has (at least) one to start. If anyone ever collects them all, the prisoners win. n-Token - an aggregate of n tokens, usually used in reference to the binary method family of solutions. Badge - each badge requires a number of tokens to "fill" it. If anyone ever collects all the badges and they're all filled, the prisoners should win. Crown - the unique designator for the single "leader" both in single-leader and in two stage methods. The prisoner with the crown is responsible for collecting filled badges. Snowball - a number of consecutive days in which the number of tokens passed increases by 1 each day until someone is unable to donate the token, and gains a large count instead. (the snowball period then has to be comleted before any further transactions) Label - a unique identifier for a specific object of a specific type. A filled, labelled object can only be copied and not passed, and a given label can only apply to one object - if one prisoner gets two token labels, he has to either pass one label on or obtain an unlabelled token from somewhere. Uniquely, loose labels can be passed by an off bulb. *********************************************** Some thoughts on optimising for asymmetric badges: at present, the effect of dynamically assigning badges using gypo's method or minor variations is to render some number of prisoners totally neutral (quoted figure is roughly 20+/-5) and leave the badges with required counts distributed over the range 6-10. It seems to me to be sensible to ignore those initial counts to start with and figure out the optimum setup for counting prisoners starting from a situation with a number of pre-assigned badges and a number of pre-assigned neutral prisoners (or a reduced set of prisoners with a chance of skipping given days and don't look too hard at what happens between stages) set up to match the expected case coming out of the assignment stage, then tweak the assignment stage to match. So with gypo's figures, you're looking at about 80 active prisoners, which looks like 9 badges, each wanting about 9 tokens, which ends up resembling SWF's solution very closely - allowing for the fact that SWFs badges are symmetric apart from the one paired with the crown. *********************************************** New thought on hybridisation: What if you use snowballs and dynamic allocations so that, for instance, badge#1 is created day 1, then filled over the snowball up to day 10, and shared up to (say) day 50, then badge#2 similarly, up to badge#5 or badge#6 then 4 (say) anonymous badges created to take up the slack - a reasonably large number of people will start out with copies of badges #1 and #2, and hopefully later badges up to the end of labelling. The anonymous badges and any unfilled labelled badges are filled during a standard stage 1, the anonymous badges are consolidated to a labelled crown in a standard stage 2, and then the labelled items are shared during a 3rd stage - with substage lengths based on expected numbers sharing each object.
|
|
IP Logged |
|
|
|
Paul Hammond
Newbie
Posts: 29
|
|
Re: 100 prisoners & a light bulb
« Reply #372 on: Mar 20th, 2005, 12:35pm » |
|
If people are going to start verifying claimed averages for the various schemes in this thread I ought to concede that the result of 3900 days that I claimed for my scheme must be wrong, judging by the results for obviously superior schemes later on. I remember somebody questioning it at the time, but I never did find the bug in my simulation code which led to the underestimate.
|
|
IP Logged |
|
|
|
Icarus
wu::riddles Moderator Uberpuzzler
Boldly going where even angels fear to tread.
Gender:
Posts: 4863
|
|
Re: 100 prisoners & a light bulb
« Reply #373 on: Mar 20th, 2005, 2:36pm » |
|
SWF - 10418 days for the single leader solution was not a number Captain Caveman came up with. Note that I said that the values given were not necessarily those of the proposers - instead they were the best values given for any scheme that made use of the ideas listed before, but not after. For instance, all the 3600 day values I gave came from you, SWF, not the proposers, as you stated this value as being exact for a couple of mixed-value-token plans not making use of your multi-leader snowball round, and also said that the [10, 10] binary method did not do much worse. I also stated (but admittedly, in a way that is easy to miss) that the first two values I gave (1.0715 x 1044 days for the complete cycle scheme, and 10418 days for the single-stage counting method), I myself derived mathematically. They did not come from simulation runs. (To arrive at the complete cycle run-time by straight simulation would take longer than the age of the universe, even on the fastest supercomputers). It might also be possible to calculate the run-time for the single stage with snowball round solution, but I have not done so yet. It might even be possible for the simple multi-stage solutions, though there are significant difficulties I haven't seen a way around. Paul Hammond - your erroneous calculation was later, for the binary stages routine. While your 3900 day value for your original scheme is unchecked, it does not seem unreasonable. SWF stated later that the [10, 10] scheme did not do much worse than 3600 days, and that version included AlexH's suggestion that later cycles continue the original stages. If you will recall, in your original scheme if the first cycle failed to produce a release, later cycles were to start over from scratch - everybody acting as if it were day 1 again. Clearly this would result in a longer run-time, but since the first cycle is successful more than 90% of the time, the increase should be by less than 10% - i.e. roughly in the range of 3900 days. Mattian - 10,000,000 simulations is probably a bit excessive to demand for every number quoted. If all you are doing is giving a general idea of a scheme's run-time, the lesser simulation counts are fine. And some people may not be able to devote the time and processing power to run 10,000,000 simulations. But with any number to be accepted as accurate, at the vary least, the simulation count they come from should be reported.
|
|
IP Logged |
"Pi goes on and on and on ... And e is just as cursed. I wonder: Which is larger When their digits are reversed? " - Anonymous
|
|
|
mattian
Senior Riddler
Gender:
Posts: 404
|
|
Re: 100 prisoners & a light bulb
« Reply #374 on: Mar 20th, 2005, 4:17pm » |
|
Okay - the engine is built. It is not yet well commented and it needs some kind of a user interface - whether it is as mundane as command-line parameters or a fully fledged gui. Right now, the class containing the main method, is being recompiled every time I change parameters (such as number of prisoners, number of iterations over which to extract an average, which prisoner type to use, etc.) The first of the results is in: BanderSnatchPrisoner has been developed and dropped into the prison. Running the simulation had my computer staring back at me blankly as BanderSnatchPrisoners patiently waited for a sequence of a 100 non-repeating visits. Rather than wait 3x10^31 millenia for my computer to finish running a single iteration, I modified the problem for the BanderSnatchPrisoner. The result of 10000 iterations in a prison containing ONLY 10 BanderSnatchPrisoners, is an average escape time of 2775.4017. This result is close to the probability-based expectation of 2755.7319. Next victim: Captain Caveman and his Single stage counting scheme. BanderSnatchPrisoner.java (for those who are interested) /** BanderSnatch - Jul 24th, 2002, 2:19am * * My first thought on this is rather simple, though in reality it would * (most likely) take far too long to be practical. Starting the day of * the meeting, each prisoner starts counting the days. As each prisoner * goes into the room, they leave the light in the off position. However, * if they have been in the room once before in the hundred days, they * turn the light on. Once the light is turned on, it stays on for the * remainder of the initial hundred days as a signal to the others that * someone has been in the room more than once, and hence someone will not * have been in the room at the end of the hundred days. On the hundredth * day, the prisoner turns the light off, starting another hundred day * cycle. If, on the last day of any hundred day cycle, the light remains * in the off position, it can be said with certainty that every prisoner * has been in the room, since no prisoner has been there twice. * * Though this solution would eventually work, probability says that it * would not work in a timely manner, since it goes in 100 day cycles, * rather than having the possibility of the hundred days starting at any * arbitrary time. */ package com.monahansen.prison; public class BanderSnatchPrisoner extends ThinkingPrisoner { long currentStage = -1; public BanderSnatchPrisoner(int id, int totalNumberOfPrisoners) { super(id, totalNumberOfPrisoners); this.currentStage = -1; } public boolean drawConclusion(IBooleanSwitch lightBulb, long currentDay) { boolean conclusion = false; do { // if the day is 1, 101, 201, etc. we're into the next cycle - reset light if ((currentDay % this.totalNumberOfPrisoners) == 1) { lightBulb.setState(false); } // if the light bulb is on, just leave the room - no point continuing with this stage if (lightBulb.getState() == true) { conclusion = false; break; } // if it is the 100th day in the cycle, and the light is off, declare freedom if ((currentDay % this.totalNumberOfPrisoners) == 0) { conclusion = true; break; } // if the current stage is equal to my recorded stage, then switch on the light and leave. if (((long)(currentDay / this.totalNumberOfPrisoners)) == this.currentStage) { lightBulb.setState(true); break; } // record visit in this stage this.currentStage = (long)(currentDay / this.totalNumberOfPrisoners); } while (false); return conclusion; } }
|
« Last Edit: Mar 20th, 2005, 6:01pm by mattian » |
IP Logged |
|
|
|
|