

Title: HARD: Hamming Distance Questions Post by D on Jul 26^{th}, 2002, 3:09pm The wording of this problem seems confusing. I'm guessing I don't have the math background to solve it anyway. But it might be fun to think about. DO... you want a ?formula? that will generate 2 sets [x,y] with n elements of length k bitstrings such that for each relation (x,y) the hamming distance will be greater than c ?? Or maybe like, what are n, x, and y given k and c? OR: a formula that answers: A pair of sets with n elements of length k bitstrings, optimally chosen will have a minimum hamming distance of c? The first one seems much harder and more likely to be a programming question (fe: how do you do this in linear or polynomial time?) For the second though, you can start with some boundries. You know that with length k bitstrings, the maximum hamming distance between any two bitstrings is k. So if n = 1, the answer is k[i]. You also know that if [i]n = 2^k, the answer is 0. Hrm.. I don't have time to solve this at work right now, maybe I'll try and work on it later at home. This is kind of a fun problem. D 

Title: Re: HARD: Hamming Distance Questions Post by william wu on Jul 26^{th}, 2002, 4:09pm Sorry, I will repost a clearer wording of the problem later. It was not my intent to actually generate the elements of the optimal set  although that is also an interesting problem that we could explore. The latter question in your post is what I'm interested in. Assume you already have the optimally chosen set, which maximizes how spread out the elements are in Hamming space. Then what is the minimum Hamming distance D? Here is the notion of Hamming space, summarized in a useful picture: This is the Hamming space for k=3. Start at a vertex. Change one bit to cross one edge. Change two bits to cross two edges. Change three bits to cross three edges. Note the important difference between Hamming space and the everyday notion of numeric distance. Whereas we would normally say that 100 (4 in binary) and 000 (0 in binary) are 4 units apart, in Hamming space, they are only one unit apart, and very close. We can start with some easy cases. For instance, if n = 2^k, then D = 1. Also, if n = 2, then D = k. This is clearly illustrated by the picture above. Note that the points furthest apart in Hamming space are on opposite corners of the cube, and that those corners have exactly inverse polarities. Thanks! 

Title: Re: HARD: Hamming Distance Questions Post by D on Jul 26^{th}, 2002, 11:55pm DAMN!!! I had a huge writeup of my thoughts and it just got wiped by accidentally going back a page!! Grrr... Ok, firstly, I had 2 questions. 1: Can the sets overlap? It's really more of a boundry question. If they don't overlap, then the minimum distance is 1. 2: are the sets always of equal size? Meaning that if you add one element to set x, then you always add an element to set y?  Ok, here's some abstract thinking about the problem. If you start with n = 1, then the D = k. Considering the hamming distance as the length of a path from an element of x to an element of y. If you add one element to each set, even if they are of opposite polarity you must have decreased the distance of an original path by 1. However, I believe that although still need to prove that (assuming equal sized sets) you can add up to n/2 (round down) elements before the path decreases even more. The reason I say this is, lets say the sets weren't of equal size, you start adding elements to set y, the optimal elements to add first would be all elements that are distance 1 from the original element. This means you can add k elements before D goes down again. I think that if you're adding 2 elements at a time it goes down twice as fast. But I'm not sure, it may go down faster. So... my thoughts all point to a linear solution. y = mx + b y = D x = n m = f(n)  a function of n b = yintercept, what is D if k = 0. First, lets evaluate m. The maximum D is k, when n = 1, and the minimum D is 1 when n = 2^k/2. m = y1y2 / x1x2 m = (k1)/(12^k/2) ... some algebra ... m = [ (k1) / (2^{k1} 1) ] Now, b. The tricky part about b is that what does n = 0 means, both sets are empty? How far away are you from no element to no element? So instead we take a known point on the line (n=1, D=k) and knowing the slope, evaluate b. D = mx + b k = mn + b k  m = b b = k + [ (k1) / (2^{k1} 1) ] So.. putting it together. D = [ (k1) / (2^{k1} 1) ]n + k + [ (k1) / (2^{k1} 1) ] Hrm.... seems a little messy. Oh well this is just some thoughts assuming a linear solution. D 

Title: Re: HARD: Hamming Distance Questions Post by D on Jul 27^{th}, 2002, 12:06am Um... I just ran some numbers. It works ok for k = 1, and k = 2 and not quite as elegant for k = 3. The equation needs some fine tuning. It may be just to round down D. But I'll think about it a little more. D 

Title: Re: HARD: Hamming Distance Questions Post by D on Jul 27^{th}, 2002, 12:19am It's 5 minutes later, in the last 5 minutes I've thrown out several cans of caffeine cluttering my desk and replaced them with a fresh can.. and came up with another thought. It's probably not linear... The reason is the number of paths you can take goes up exponentially and I think that at every level you have more elements you could add while keeping the hamming distance the same than you had at the previous level. Although the distance still went down. I'm thinking primarily about different size sets because it's easier to visualize a solution and hoping equal size sets would be just an extension of this. BTW, if this is the case, the equation will be some sort of D = 1/k^{p}*n + c 

Title: Re: HARD: Hamming Distance Questions Post by D on Jul 27^{th}, 2002, 4:49pm I was out doing some homework and started working on this again. I tried experiementing with k = 4. For some reason, n = 1, D = 4 n = 2, D = 3 n = 3, D = 2. I couldn't come up with a set that did better at n = 3. I was working with equal sized sets such that if you add an element to x, you have to add one to y as well. Interestingly enough what I observed was that when choosing the next pair beyond n = 1, it's polar opposite would have a D = 4, the poles (0000, 1111), had D = 3, and any element I chose seemed to share 2 bits with the first chosen on the opposite set. D 

Title: Re: HARD: Hamming Distance Questions Post by Bruce on Jul 30^{th}, 2002, 4:31am Coming at this problem from the other side  consider the set of all binary strings of length k. How many of them can we select if we insist that any pair of select elements has a Hamming distance of at least d? Well, there are 2^{k1}.^{k}C_{d} (where ^{k}C_{d} is k!/d!(kd)!) pairs of elements with Hamming distance d  so we can select 2^{k1}.SUM(n=d, n=k, ^{k}C_{n}) elements before we are forced to break our constraint. Any help? 

Title: Re: HARD: Hamming Distance Questions Post by pio on Jul 30^{th}, 2002, 4:35am I start with an hipotesis. I think that the number of bitstrings that are at a hamming distance of D of a given string is equal to the D position of the N row of the tartaglia triangle. That is C(n,d) For example: in N=3 D=0 > 1 bitstring D=1 > 3 bitstrings D=2 > 3 bitstrings D=3 < 1 bitstring I can't do it here, but if you draw a four dimensions hipercube (yes, you can) you may be able to see that this is also true. in N=4 D=0 > 1 bitstring D=1 > 4 bitstrings D=2 > 6 bitstrings D=3 > 4 bitstrings D=4 > 1 bitstring I guess this is true for every N, (anybody can prove it?) Then we can see also that we get an increase of the distance only for an odd N. With all this I've tried several aproachs and so long I've found this. My guess is that for any given K: D=K when n=2 D=(K+1) DIV 2 when 2<n<=2^(2+((K+1) MOD 2)) But i don't know how can I extend this for every K and every N. I think I'm in the right track. What do you think? Any suggestions? 

Title: Re: HARD: Hamming Distance Questions Post by Steven Noble on Jul 30^{th}, 2002, 5:40pm pio said "I think that the number of bitstrings that are at a hamming distance of D of a given string is equal to the D position of the N row of the tartaglia triangle. " later pio said "I guess this is true for every N, (anybody can prove it?)" I won't give you the proof for this statement because there is no reason you shouldn't be able to get it yourself. The easiest way to go about it is probably a recursive proof using one of the simplest characteristics of the triangle (be it tartaglia or pascal or Yanghui). Hope that's enough. As for the rest of the problem has anybody made a script that will give a bunch of data to look at? I had a number theory prof who used to say that the most important affect computers have had on mathematics is they can give mathematicians a lot of data to stare at while we search for an answer. (he had a much more elegant way of saying it... but he strongly believed that if u stare at something for long enough eventually the answer will come to you) 

Title: Re: HARD: Hamming Distance Questions Post by cnmne on Jul 30^{th}, 2002, 6:31pm Solving for H, given N and K 1<K 1<N<2^K at N<2, H is undefined at N=2, H = K at 2^(K1)<N<=2^K, H = 1 at N>2^K, H=0 (must be duplicates) I would have to guess that it is a rounded logorithm with a little tweaking. There is obviously a relationship between 2^1...K1 and H but it is not exact. 

Title: Re: HARD: Hamming Distance Questions Post by Nathan Hellweg on Jul 30^{th}, 2002, 11:01pm YABB ate my last messageo so I'll be brief. If you have n strings, each string k bits long, the hamming distance is d, then,and if d is even, then this ineqality holds. n* sun(from i=0 to d/2)(k choose i) <= 2^k Moreover, if the inequality holds for some d, then there is a set of strings with the appropriate properties. The trick is finding a closed form for the sum/inverting the sum. I suspect that the sum can be expressed as an integral in a nice way, but I'm not quite there yet. 

Title: Re: HARD: Hamming Distance Questions Post by oliver on Jul 31^{st}, 2002, 1:06am Nathan, You came to that formula by looking at "Balls" around the points of the maximum set, right? With "Balls" I mean B_{d}(x) := {y  d(x,y) <= d} That's the way I found this formula, too  but with one small correction. You have to use balls with radius (d1)/2. If two points have Hamming distance d (d even), there's clearly a point between them which has Hamming distance d/2 to each of them, therefore belonging to both "Balls". That means that the Balls aren't disjunct and the inequality may not hold (these are all my problems, just speculating). Maybe you had another idea, anyway, but if that was your reasoning, maybe you know if this inequality is sharp enough, i.e. that we now just need to find the maximum d so that the inequality hold. I'm not so sure about that. 

Title: Re: HARD: Hamming Distance Questions Post by oliver on Jul 31^{st}, 2002, 11:03am FWIW, I wrote some small python methods, in order to get some numbers to work with. The problem is that I cannot prove ATM that my solutions are ideal, but for all instances I have tested it (k<=4) the solution seems right. Another problem is that I may simply have an error in my code ;>. Anyway, for all people interested in logical/mathematical thinking, python is a nice and clear language. I do something like Erasthostenes' Sieve. I'll explain it with the code (btw. it's python). For using it, you'll have to just replace all _'s with spaces, otherwise the indendation would have been destroyed. Code:
build_space(k): constructs a dictionary (hash table) of all binary strings of length k, represented as integers. build_ball(k,d): constructs a "ball" of hamming diameter d1 around 0 (remember, integer representation, this is the bitstring with just 0's), leaving out zero itself. IOW, this ball contains all bitstrings which have at max d1 nonzero bits. Remark: We'll use that for two given bitstrings x,y, the hamming distance between the less than d, if and only if x XOR y is element of this ball. hs(k,d): hamming_sieve. I'll let you figure out that one ;P Using: Code:
we get k: 3, d: 2, n: 4, set: [0, 3, 5, 6] k: 4, d: 2, n: 8, set: [0, 3, 5, 6, 9, 10, 12, 15] k: 4, d: 3, n: 2, set: [0, 7] k: 5, d: 2, n: 16, set: [0, 3, 5, 6, 9, 10, 12, 15, 17, 18, 20, 23, 24, 27, 29, 30] k: 5, d: 3, n: 4, set: [0, 7, 25, 30] k: 5, d: 4, n: 2, set: [0, 15] Here's a longer list: k: 3, d: 2, n: 4 k: 4, d: 2, n: 8 k: 4, d: 3, n: 2 k: 5, d: 2, n: 16 k: 5, d: 3, n: 4 k: 5, d: 4, n: 2 k: 6, d: 2, n: 32 k: 6, d: 3, n: 8 k: 6, d: 4, n: 4 k: 6, d: 5, n: 2 k: 7, d: 2, n: 64 k: 7, d: 3, n: 16 k: 7, d: 4, n: 8 k: 7, d: 5, n: 2 k: 7, d: 6, n: 2 k: 8, d: 2, n: 128 k: 8, d: 3, n: 16 k: 8, d: 4, n: 16 k: 8, d: 5, n: 4 k: 8, d: 6, n: 2 k: 8, d: 7, n: 2 k: 9, d: 2, n: 256 k: 9, d: 3, n: 32 k: 9, d: 4, n: 16 k: 9, d: 5, n: 4 k: 9, d: 6, n: 4 k: 9, d: 7, n: 2 k: 9, d: 8, n: 2 I'd be glad to hear criticisms, etc. 

Title: Re: HARD: Hamming Distance Questions Post by Nathan Hellweg on Jul 31^{st}, 2002, 11:09am on 07/31/02 at 01:06:46, oliver wrote:
You're right, I've got an off by one error, so it's (d1)/2 and odd d's. Really, I should be using balls with radius d1 and figuring out how much the get shared. 

Title: Re: HARD: Hamming Distance Questions Post by oliver on Jul 31^{st}, 2002, 11:09am Argh, just saw that in the code I used at least two variable names with underscores: ball_1 and d_ball Please consider that if you do the search&replace thing. Sorry. 

Title: Re: HARD: Hamming Distance Questions Post by Nathan Hellweg on Jul 31^{st}, 2002, 8:11pm I haven't gone through and done the algebra, but it looks like there is an approximation that involves the inverse error function (yuck) that can be derived by using the stirling approximation for the binomial terms, and then integrating instead of taking the sum. 

Title: Re: HARD: Hamming Distance Questions Post by william wu on Aug 3^{rd}, 2002, 11:54pm OK there's some heavy analyses going on, so it's going to take me a while to process all this information skeptically. I'll get back to you guys later, perhaps after my stupid GRE. Others are welcome to help me comb through the logic though. I appreciate all the effort that's going into this. In the meantime, I'll be memorizing obfuscated words :P 

Title: d=floor(k/log_2(n)) Post by Nathan Hellweg on Aug 5^{th}, 2002, 11:22am I just had an epiphany. How does this sound: You've got to maintain a hamming distance of d. Let's make the (likely) assumption that I can create a situation where the distance is always a multiple of d. I need to come up witha good explanation for this assumption, but it seems reasonable to me right now. Then if you have k digits, then then you can pack in 2^(floor k/d) numbers. So n=2^(floor k/d) Now, if you're given n then you can pack with the distance: floor(k/log_2(n)) under the same circumstances. so, if that assumption holds, then: d=floor(k/log_2(n)). Clearly this is optimal when k/log_2(n) is an integer... Now I need to prove the assumption :/ Of course it's prolly strong enough for you CS types. 

Title: Re: HARD: Hamming Distance Questions Post by Nathan Hellweg on Aug 5^{th}, 2002, 11:26am Please Ignore Previous Post %) 

Title: Re: HARD: Hamming Distance Questions Post by oliver on Aug 5^{th}, 2002, 11:59am Nathan, I see you already retraced that, but maybe you're on to something. First, the number of bitstrings of length n is a group, with group operation XOR. And it seem like the minimal sets I provided are always subgroups of these sets, that means that they also are groups with group operation XOR for themselves. That is consistent with your claim (hope ;D) that the distance is always a multiple of d. If the above could be proved, we would at least know that k is a factor of 2^n. As a consequence, we know at least that k for a optimal group is of the form 2^i, for an integer i. 

Title: Re: HARD: Hamming Distance Questions Post by Nathan Hellweg on Aug 6^{th}, 2002, 11:11pm Here's the result of the foray into estimation: We had: n*\sum_{i=0}{(d1)/2}(k choose i) <= 2^k Now, there is an estimation that goes like this: n*\intgral_{\infty}{(dk/2)/k*8}((2\pi)^{.5}*e^{x^2/2}dx) <=2^k By substitution we get \integral ((2\pi)^{.5}*e^{x^2/2}dx) = \pi^{0.5} * erf(x*2^{.5}) since erf(\infty\=0) we get n*pi^{0.5} * erf(((dk/2)/k*8 )*2^{.5}) <= 2^k Some algebra gives ierf(2^k*\pi^{.5}/n)>=((dk/2)/k*8 )*2^{.5} More algebra gives ierf(2^k*\pi^{.5}/n)*2^{3.5}*k+k/2>=d Where ierf is the inverse error function and erf is the error function. Which is probably a good estimate for large k,d,and n. 

Title: Re: HARD: Hamming Distance Questions Post by NoJoy on Aug 15^{th}, 2002, 12:44pm I'm going to step through my [incomplete] analysis, even though I'm pretty sure I'm just restating what others have said before. Let's define a closed ball of radius d around a bitstring s of length k: B(k,s,d) = {t  H(s,t) <= d} How many bitstrings are in such a ball? By symmetry, B(k,s,d) = B(k,0,d), where 0 is the string of k zeroes. But a bitstring is at Hamming distance i from 0 iff it has i onebits, so the number of bitstrings at Hamming distance i from 0 is the number of ways of choosing i onebits from k bits. So B(k,0,d) = sum(0 <= i <= d : k choose i). Let's denote this by f(k,d). [As an aside, the terms of this sum are the previously mentioned elements of the kth row of the XXX triangle, and if I understand the approximation thread, there is a smooth approximation to this upanddown staircase which can be integrated to produce the ith partial sum, right?] So suppose we have a set of n bitstrings of length k with minimum pairwise Hamming distance d. For odd values of d, we can form a ball of radius (d1)/2 around each element of the set which is disjoint from each other such ball, so n*f(k,d) <= 2^k, and f(k,d) <= (2^k)/n. For even values of d, things are not so clear, because the balls overlap. But I'd like to think that if we account for the overlap, we're left with finding the maximum d satisfying some inequality. [As an aside, if I follow the approximation thread, this corresponds to using an inverse function, right?] So I'm left with three unanswered questions: 1. Is is actually always possible to pack the balls this tight, i.e. is the inequality tight enough? 2. What is required to account for the overlap in the case of even values of d? 3. Would an expression like d = {max ...} count as a "closed form solution"? 

Title: Re: HARD: Hamming Distance Questions Post by oliver on Aug 15^{th}, 2002, 1:40pm Quote:
Nojoy, I fear that this is never a closed form solution, because if you think a little bit it's quite easy to write down exactly the definition of d with this. And btw. you should really read the other threads IMO, because they answer some of your questions, and you'll see that the code I posted does exactly use your B(...) ideas. 

Title: Re: HARD: Hamming Distance Questions Post by James Fingas on Aug 19^{th}, 2002, 2:04pm Uhoh! I have run into a little snag I thought I'd let you know about. The angle I've been taking on this problem is similar to Oliver's angle (but without the code). For a given length k and a distance D, find the maximum possible number of messages n. I have been generating these messages by hand (it's pretty hard to do). At first, I found that n was always a power of 2, but I can disprove this by counterexample. Take k=8, D=3. First of all using the "ball" argument, there can be a maximum of 28 messages: total messages for k=8: 8^2 = 256 each message has 8 neighbours (messages only one bit away), plus the message itself. 256/9=28.4 Since none of these neighbours can be shared, then it must hold true that we have less than 28 messages. However, examine the following 20 messages, and please verify that they are all at least 3 bits different. 0000 0000 1110 0000 0011 1000 0000 1110 1000 0011 1001 0100 0010 0101 0100 1001 0101 0010 1010 1010 0101 0101 1101 1000 0110 1100 0011 0110 0001 1011 1000 1101 1100 0110 0110 0011 1011 0001 1111 1111 Therefore, there are between 20 and 28 messages that you can fit into 8 bits with a minimum Hamming distance of 3. Since there are no powers of 2 between 20 and 28, then obviously n isn't always a power of 2! So Oliver, I guess that you have to revise your list to read k: 8, d: 2, n: 128 k: 8, d: 3, n: 20(+?) etc. This has given me a good reason to look over all my calculations, and now I'm not sure in what direction to proceed. All I know for sure is that: 1) for k < D, you can't send any messages 2) for k >= D, you can send at least two messages 3) for 2k >= 3D, you can sent at least four messages 4) something similar can probably be worked out for 8 messages, but it gets complicated 5) for D = 2, you can send 2^(k1) messages 6) for D = 1, you can send 2^k messages (I hope this doesn't surprise anyone ;) 7) if you can send n messages with k=k1, D=D1, then you can send at least n messages with k=2*k1, D=2*D1. Sometimes, you can send more 8) if you can send n messages with k=k1, D=D1, then you can send at most n messages with k=k1+1, D=D1+1. (I'm pretty sure of this) 9) if you can send n messages with k=k1, D=D1, then you can send at most 2n messages with k=k1+1, D=D1. (I'm pretty sure of this one too) Anyways, I'm sorry to throw a monkeywrench into the whole works, but I've found that we can't restrict n to be powers of 2. In the communications world, I don't think this problem was ever solved. It turns out that thinking in terms of a Hamming distance usually doesn't suit the problem. The idea behind Hamming Distance was to build redundancy into the coding of signals, so that if you got a bit wrong, you would know you got it wrong, and possibly be able to correct it. However, since most signals aren't sent out in binary, but rather in mary signalling, the Hamming distance between signals won't match up with the actual probabilities of bits having errors. So they moved on to trelliscoded modulation, then on to Turbo Codes and stuff, without (AFAIK) solving the Hamming distance problem. From a communications perspective, you can encode signals by starting with m message bits, then tacking on (km) parity bits. By choosing the parity bits correctly, you can make the Hamming distance between all possible messages quite large (bounded by (km+1)). In practice, choosing those parity bits is half luck, half magic, and all you could do was to catalogue what parity bits worked best. When trelliscoded modulation came around, the communications field did an endrun around the problem, and started to use probability (a la maximum aposteriori probability, or MAP receivers) to calculate which message was most likely transmitted. Good Luck! James Fingas 

Title: Re: HARD: Hamming Distance Questions Post by oliver on Aug 19^{th}, 2002, 11:38pm Nice work James! Your example is indeed correct (I tested it with small program). That's pretty bad. It certainly kills the thesis that the minimal set is always a subgroup (or trivially can be 1:1 mapped to one). Nasty. Wow, I wouldn't have thought that the algorithm I implemented was simply wrong. But there's no need to excuse, what you did was certainly very helpful. 

Title: Re: HARD: Hamming Distance Questions Post by James Fingas on Aug 29^{th}, 2002, 7:28am Here's how I'm currently working on the problem. First of all, draw out Pascal's triangle, and pick out the kth row. Now the 1 on the end of the kth row corresponds with the first message we'll pick. Now the interior of the "ball" of messages that are closer the D away from our first message are the next D1 numbers in the row. In the Dth row are all the messages which are far enough away from our first message. This gives us an idea of the "surface area" of the "ball" around our first message. But how are the messages on this surface organized? How many can we fit on the surface with a distance >=D? What is the optimal way to choose them? And then things get a little more complicated. After we've put as many messages as we can on the surface, we have a new "surface" made up of all the messages that are exactly D away from at least one of the selected messages. This surface will probably be "bumpy"not all messages on the surface will have the same distance from our first message. How large is this surface? How many messages can we fit? How do we choose where to fit them in? I'm guessing we should prefer to choose the messages which are closer to the first message. As we choose more layers of messages, the surface gets more and more convoluted. However, I think that we can probably get an optimal packing of messages with a greedy algorithm. If not, then choosing the messages will be very difficult... 

Title: Re: HARD: Hamming Distance Questions Post by eKtoric on Oct 11^{th}, 2002, 2:18pm Forgive me for interrupting. I haven't given it much thought, but one thing did come to mind that might point in a different direction. Upon first reading, the question reminded me of Orthogonality. What popped into my head was the selection of interference codes for CDMA whereby the signal is filtered by an orthogonal code. My CDMA is rusty (so I could have stated that all wrong), but perhaps this area of study may shed some additional thoughts? 

Title: Re: HARD: Hamming Distance Questions Post by oliver on Oct 16^{th}, 2002, 6:53am Riddlers, this problem apparently is harder, much harder, then we (I) thought. You can see for yourself if you search in google for: "coding theory" "sphere packing" Here are some link for your reading pleasure: http://www.research.att.com/~njas/doc/1218codes.ps and (this actually contains a riddle!) http://www.research.att.com/~njas/doc/1218related.ps The main page (of an at&t researcher) is http://www.research.att.com/~njas/ Ok, I officially declare I have given up on this ;), but it's really fascinating that no one didn't see how hard this problem really is ... 

Title: Re: HARD: Hamming Distance Questions Post by local__ on Nov 21^{st}, 2002, 7:49am hi, i think i'm almost finished.. but i'm not sure if i understand the task correctly.. i will tell it in my words and can u correct me please? i'm not pro at english u have given K.. then in this K there is max bitstrings 2^{k}.. and then n is between> 0 < n < 2^{k} .. but for n = 2 the distance is K, so we can assume 2 < n < 2^{k}..... so u have given K and N ,, and we need formula that will compute the minimal distance between any 2 bitstrings, and this minimal distance (= min )MUST be as maximazed as possible.. so there cant be another subset of same power(same number of elements) at same K, that will have minimal H distance, bigger than the "min"? tnx for answering ok , if this is it, i can tell u its not so hard as u think it is.. just needs apropriate aprouch (aproach?)(approuch?) 

Title: Re: HARD: Hamming Distance Questions Post by wingyen LAU on Nov 23^{rd}, 2002, 11:45pm Hi ALL: I don't remember the exact details to this problem. And I am not sure if i understand this question full. But I don't have a little memory of seeing this problem in one of my summer school course (discrete math) from binghamton university by prof. guzman http://www.math.binghamton.edu/dept/server/research.html. I think the solution has an upper bound for n. I'll try to get back to u guys if i find the solution again. WING 

Title: Re: HARD: Hamming Distance Questions Post by william wu on Dec 15^{th}, 2002, 2:53am on 10/16/02 at 06:53:31, oliver wrote:
SWEET! I've never seen this one before! I'll have to read this paper later. In the Math. Dept. Commons Room at Bell Labs in Murray Hill there is a lightbulb game built by Elwyn Berlekamp nearly twenty years ago. There are 100 lightbulbs, arranged in a 10 × 10 array. At the back of the box there are 100 individual switches, one for each bulb. On the front there are 20 switches, one for each row and column. Throwing one of the rear switches changes the state of a single bulb, while throwing one of the front switches changes the state of a whole row or column. Suppose some subset S of the 100 bulbs are turned on using the rear switches. Let f (S) be the minimum number of illuminated bulbs that can now be attained by throwing any sequence of row and column switches. The problem is to determine It sounds very similar to Positive Matrix Sums in the medium section (http://www.ocf.berkeley.edu/~wwu/riddles/medium.shtml#positiveMatrixSums): You have an m x n grid with some real numbers in each cell. You can multiply any row by 1, or any column by 1. Show that by making multiplications of this kind, the sum of the numbers in every row, and the sum of the numbers in every column, can all be made positive simultaneously. Quote:
Actually a Mr. Fisher emailed me earlier this year, hoping to convince me that the Maximum Minimum Hamming Distance puzzle I posted was an unsolved problem. He works on software that generates codes such as the ones we are looking for, and it uses table lookups to determine maximum minimum Hamming distances for given parameters. We argued for a little while, but after some miscommunications I thought he did not understand the problem correctly, so I just ignored his advice. I guess the truth comes out. 

Title: Re: HARD: Hamming Distance Questions Post by william wu on Dec 15^{th}, 2002, 2:59am on 11/23/02 at 23:45:51, wingyen LAU wrote:
Yeah, when I checked out this problem over the summer, I recall trying to apply a basic coding theory bound whose escapes me at the moment. My knowledge of information theory however is practically garbage though (periodic frivolous browsings of a textbook for a few weeks ... hopefully I'll take an actual info theory course someday), so it's entirely possible that the bound doesn't apply in this case. Mr. Fisher mentioned it though in his email so maybe it does; I'll have to track down what exactly he said. In any case, I'm not interested in a bound, I want the exact numbers. 

Title: Re: HARD: Hamming Distance Questions Post by Joe Pellino on Jan 13^{th}, 2003, 2:59pm I have no idea about this problem. Actually i just want to know willy wu's email adress. I want to email him a solution to 100 prisoners and a lightbulb 

Title: Re: HARD: Hamming Distance Questions Post by Damien Fisher on Sep 18^{th}, 2003, 10:59pm on 12/15/02 at 02:53:38, william wu wrote:
Hey, that's me! No worries about the miscommunication. After all, I knew I was right...hehe. :) There's plenty of interesting work on this. The software I work on is at http://magma.maths.usyd.edu.au, but there is plenty of resources available for free. E.g., here (http://www.win.tue.nl/~aeb/voorlincod.html) is an (outofdate) database of bounds for various values of n and k for linear codes. If you search around you'll find tons more work. 

Title: Re: HARD: Hamming Distance Questions Post by Damien Fisher on Sep 24^{th}, 2003, 7:51pm on 12/15/02 at 02:59:18, william wu wrote:
In some cases (especially over GF(2), which I believe is the case you are interested in), the known lower and upper bounds meet, giving the "exact number" you are interested in. E.g. (this is output from Magma, BKLC = "Best Known Linear Code"): > SetPrintLevel("Minimal"); > SetVerbose("BestCode",true); > SetPrintLevel("Minimal"); > SetVerbose("BestCode", true); > BKLCLowerBound(GF(2), 22, 13), BKLCUpperBound(GF(2), 22, 13); 5 5 > BKLC(GF(2), 22, 13); Construction of a [ 22 , 13 , 5 ] Code: [1]: [23, 14, 5] Linear Code over GF(2) Construction from a stored generator matrix [2]: [22, 13, 5] Linear Code over GF(2) Shortening of [1] at { 23 } [22, 13, 5] Linear Code over GF(2) true 

Title: Re: HARD: Hamming Distance Questions Post by Digitalis on Oct 27^{th}, 2004, 3:51pm Give up on this one. The case n=k eventually resolves into the Hadamard Conjecture, which is considered to be one of the central and most difficult problems in combinatorics. Mathworld has a decent blurb about it. 

Title: Re: HARD: Hamming Distance Questions Post by NumBeast on Jun 13^{th}, 2006, 7:30pm Unless I am being completely Stupid, You all are being completely stupid. :P to quote James " For a given length k and a distance D, find the maximum possible number of messages n" be warned that you'll never get a complete list, there can be multiple n's for one distance. ;D James, you gave 2 lists of 20 bitstrings that are "3" apart from each other. in the first (and second) list you have 2 0000's and multiple other repeats. not to mention the multiple bitstrings that are less than 3 away from each other. 8) EVERY STRING OF LENGTH k THAT HAS A DISTANCE OF k1 FROM A GIVEN STRING, HAVE A DISTANCE OF 2 FROM EACH OTHER. 8) I have a proof to this but I'll let you do it if n=k D=2 (unproven) so here are my thoughts... If n=2 D=k, this is pretty obvious For any K say that when n=x, D=y then when n>X D<=Y, meaning when the number of bitstrings increases the maximum minimum distance either stays the same or decreases. so here is a 'correct' listing that I found in 2 hours, I'll update it later as I spend more time on this. (the max for n is 2^k) k=1 n=2 D=1 (1,0) K=2 n=2 D=2 (00,11) n>=3 D=1 K=3 n=2 D=3 (000,111) n=3 D=2 (010,111,100) n=4 D=2 (011,000,110,101) n>=5 D=1 k=4 n=2 D=4 (above theorm) n=3 D=2 (other not so above theorm) n=4,5 D=2 n>=6 D=1 k=5 n=2 D=5 n=3 D=3 n=4 D=3 n=5,6 D=2 n>=7 D=1 BTY in all of my calculatins I have assumed that n =! 1,0 k>=1 as I mentioned at the beginning I might have made multiple mistakes while reading your posts and/or writing my own, if so I am sorry for insulting you and please tell me where I went wrong P.S. If you are reading this and happen to wonder why I am talking to people who wrote thier last posts 2 years ago, the answer is it makes me feel better 

Title: Re: HARD: Hamming Distance Questions Post by towr on Jun 14^{th}, 2006, 1:24am on 06/13/06 at 19:30:43, NumBeast wrote:
Indeed with 4 bits, you can't get more than 16 distinct binary strings, let alone 20. 

Title: Re: HARD: Hamming Distance Questions Post by NumBeast on Jun 15^{th}, 2006, 9:30am Thanks for pointing that out. ;) and i've ben thinking about it, trying to" For a given length k and a distance D, find the maximum possible number of messages n" is a good way to set some milestones for a more through approach later. :( 

Title: Re: HARD: Hamming Distance Questions Post by Avin on Jul 27^{th}, 2006, 7:20am Hi NumBeast, Your numbers are nonoptimal. I got Quote:
The following is a list of 8 4bit strings with d=2: 0000 0011 0101 0110 1001 1010 1100 1111 In fact, I can generalize this list to show that for n=2^(k1) and k>=2, D>=2, and it should be easy to show that D<=2 as well. This is why I assert that for k=5 n=16, D is 2 as well: take the above list of 4 bit strings and append a 0 to them, then take the 8 remaining 4 bit strings that are not on that list (which also satisfy the property of d=2) and append a 1 to them. Now you've got 16 5bit strings for which d=2. Actually when I first saw this problem a few minutes ago before getting to the case of n>=5 I thought the problem was mind numbingly easy and that all values of n would follow a similar pattern based on the next smaller power of 2, and I couldn't understand why this was an unsolved problem, but once I started trying it out for n>=5 I realized why I was mistaken. 

Title: Re: HARD: Hamming Distance Questions Post by jczerwinski on Mar 1^{st}, 2008, 9:55am I've already found myself posting to this thread twice, then retracting what I've said, because I've realized it was incorrect :) Anyway, I can also confirm that James' set of 20 k=8 strings has no two elements with d < 3 from each other. So, oliver's algorithm isn't generating an optimal set of these bitstrings. I wrote the same algorithm in c: Code:
Basically, for each k and d, the algorithm tries to build an optimal set. That is, the largest possible set that has hamming distance d or greater. The thing is, oliver and I simply assumed that iterating over the natural numbers n and checking if, for every n, the distance between the binary coding for n and the numbers already in our set is less than our minimum distance. If it is, we skip the number. If it isn't, we add the number to the set. So, we know the natural iteration over the natural numbers (n, n+1, ...) does not generate an optimal set. However, there does exist some optimal set for any k and d, and possibly more than one. So, there must exist some iteration over the natural numbers, and some two seed elements of our set, that when tested against the elements already in our set, will generate the optimal set. What is this iteration? My first other guess was ensuring that the numbers we consider adding to the set are not only greater than d away from all other elements of our set, but also that the distance is as small as possible. So, first only add elements that are exactly d from our seed elements. Then, once you've iterated over all the bitstrings of length k, iterate over them again, this time allowing length d and length d+1. Do this until, on the last iteration, you've allowed any distance greater than d. Turns out, this iteration is no better. It gives the exact same (suboptimal) table that oliver gives. If anyone wants the code or output for this, I'll send it over. I also have code that tested James' 8bit set for its minimum hamming distance. If anyone wants that i'll send it as well. It's all in C. I'm going to keep trying different iterations today. Next up is going to be a pascal's triangle traversal. Start from 000...000, then iterate through the bitstrings d=1 from 0 (I.e. 0001, 0010, 0100, 1000 for k=4, d=1), then goto the next d, and try it again. I'll keep you posted. James suggested something along those lines. I'll implement his idea exactly if it's clear enough. Also, if he ever comes back, I want to know how the heck he generated his 20 string set. It's a doozy. 

Title: Re: HARD: Hamming Distance Questions Post by jczerwinski on Mar 2^{nd}, 2008, 1:01am Mmk. So the algorithm is pretty much done. Getting a seg fault for k=5, so I'll have to fix it up in the morning. But! The numbers are different from the algorithm oliver and myself implemented earlier! Here're the tables it's given me so far: k: 1 d: 1 n: 2 k: 2 d: 1 n: 4 d: 2 n: 2 k:3 d: 1 n: 8 d: 2 n: 5 (Different! One more here!) d: 3 n: 2 k:4 d: 1 n: 16 d: 2 n: 8 d: 3 n: 3 (Different! One more here!) d: 4 n: 2 So, clearly up to k = 4, this algorithm generates at least as optimal a set as the first, as well as a more optimal set for some distances. Progress! So, the algorithm works like this. For every k, d, we compute an array where the elements in the array are k digit bitstrings exactly d from the 0 bitstring. Then, we take all these arrays of elements and concatenate them, in order of increasing distance from 0. So, we've got an array that enumerates these bitstriings. Then we use this to feed the same algorithm as before. The idea is, by picking strings more optimally using a finelyhoned input sequence, we should be able to pack them tighter. I haven't bothered trying to prove this yet  I want the rest of my tables, and then I'll try and optimize further, and if I don't think it's possible, try and prove optimality. Anyway, progress! I'll post the code tomorrow, as well as new, improved tables :) 

Title: Re: HARD: Hamming Distance Questions Post by jczerwinski on Mar 2^{nd}, 2008, 12:35pm Mmk. So turns out a couple of those numbers above are the result of bad programming. Namely, those that were different than before ;) However, I fixed the program, and it still is more optimal than the original algorithm. Here are the updated tables, to k = 10. k: 1 d: 1 n: 2 k: 2 d: 1 n: 4 d: 2 n: 2 k: 3 d: 1 n: 8 d: 2 n: 4 d: 3 n: 2 k: 4 d: 1 n: 16 d: 2 n: 8 d: 3 n: 2 d: 4 n: 2 k: 5 d: 1 n: 32 d: 2 n: 16 d: 3 n: 4 d: 4 n: 2 d: 5 n: 2 k: 6 d: 1 n: 64 d: 2 n: 32 d: 3 n: 8 d: 4 n: 4 d: 5 n: 2 d: 6 n: 2 k: 7 d: 1 n: 128 d: 2 n: 64 d: 3 n: 16 d: 4 n: 8 d: 5 n: 2 d: 6 n: 2 d: 7 n: 2 k: 8 d: 1 n: 256 d: 2 n: 128 d: 3 n: 16 d: 4 n: 16 d: 5 n: 4 d: 6 n: 2 d: 7 n: 2 d: 8 n: 2 k: 9 d: 1 n: 512 d: 2 n: 256 d: 3 n: 32 d: 4 n: 16 d: 5 n: 6 (Different) d: 6 n: 4 d: 7 n: 2 d: 8 n: 2 d: 9 n: 2 k: 10 d: 1 n: 1024 d: 2 n: 512 d: 3 n: 64 d: 4 n: 32 d: 5 n: 12 (Different) d: 6 n: 4 d: 7 n: 2 d: 8 n: 2 d: 9 n: 2 d: 10n: 2 Here's oliver's original list, for comparison: k: 3, d: 2, n: 4 k: 4, d: 2, n: 8 k: 4, d: 3, n: 2 k: 5, d: 2, n: 16 k: 5, d: 3, n: 4 k: 5, d: 4, n: 2 k: 6, d: 2, n: 32 k: 6, d: 3, n: 8 k: 6, d: 4, n: 4 k: 6, d: 5, n: 2 k: 7, d: 2, n: 64 k: 7, d: 3, n: 16 k: 7, d: 4, n: 8 k: 7, d: 5, n: 2 k: 7, d: 6, n: 2 k: 8, d: 2, n: 128 k: 8, d: 3, n: 16 k: 8, d: 4, n: 16 k: 8, d: 5, n: 4 k: 8, d: 6, n: 2 k: 8, d: 7, n: 2 k: 9, d: 2, n: 256 k: 9, d: 3, n: 32 k: 9, d: 4, n: 16 k: 9, d: 5, n: 4 k: 9, d: 6, n: 4 k: 9, d: 7, n: 2 k: 9, d: 8, n: 2 So, this new algorithm is still generating better sets than the original, up to k=9. Still, it hasn't generated James' 20string set for k=8, d=3 yet, so clearly it's still not optimal. I'm figuring, when we arrange the bitstrings based on their distance from 0, and then start choosing elements to add to our set, we choose the first natural number whose binary representation is d from 0 as our second set element. This is fine: however, then, this same element has a similar "Pascal Sequence". That is, we already ordered the natural numbers so that the distance from 0 of the next number in the array is greater than or equal to the distance from 0 of the current element in the array. Well, such a sequence exists with respect to the next string we add to our set, just as it did with 0. So, to what extent do these sets overlap? It's actually easy to generate the new sequence for any other element given that element: just take the sequence for the 0 string and exclusiveor all the bits in each string in the sequence with our new string. We can then compare these two arrays to find shared elements between the sets, and select any elements that are shared first. I know the problem we're trying to solve isn't the generation of these sets, but it seems that we need to have them if we want to know how many elements are in them. For the time being, anyway. Maybe I'll just write a brute force algorithm to try all possible combinations of sets for any k and then compute the minimum hamming distance for them all..... but I hate doing things so stupidly... and the search space would get massive extremely quickly. 

Title: Re: HARD: Hamming Distance Questions Post by jczerwinski on Mar 2^{nd}, 2008, 12:54pm Interesting! I just tried the new algorithm PLUS the semiimproved old one. That is, I generate the new "Pascal Sequence", and then, ensure that the distance between any element I add to the set is EXACTLY d from all the elements, and then I go over the elements so that the distance is between d and d+1.... until the upper bound = k. I get these tables: k: 1 d: 1 n: 2 k: 2 d: 1 n: 4 d: 2 n: 2 k: 3 d: 1 n: 8 d: 2 n: 4 d: 3 n: 2 k: 4 d: 1 n: 16 d: 2 n: 8 d: 3 n: 2 d: 4 n: 2 k: 5 d: 1 n: 32 d: 2 n: 16 d: 3 n: 4 d: 4 n: 2 d: 5 n: 2 k: 6 d: 1 n: 64 d: 2 n: 22 (Worse) d: 3 n: 8 d: 4 n: 4 d: 5 n: 2 d: 6 n: 2 k: 7 d: 1 n: 128 d: 2 n: 64 d: 3 n: 13 (Worse) d: 4 n: 8 d: 5 n: 2 d: 6 n: 2 d: 7 n: 2 k: 8 d: 1 n: 256 d: 2 n: 107 (Worse) d: 3 n: 17 (Better) d: 4 n: 16 d: 5 n: 4 d: 6 n: 2 d: 7 n: 2 d: 8 n: 2 k: 9 d: 1 n: 512 d: 2 n: 256 d: 3 n: 32 d: 4 n: 16 d: 5 n: 6 d: 6 n: 4 d: 7 n: 2 d: 8 n: 2 d: 9 n: 2 k: 10 d: 1 n: 1024 d: 2 n: 386 d: 3 n: 64 d: 4 n: 32 d: 5 n: 12 d: 6 n: 4 d: 7 n: 2 d: 8 n: 2 d: 9 n: 2 d: 10 n: 2 Notice some of them are better, some of them are worse! And the better set is for the same k, d that James found that nice set for. Weird. 

Title: Re: HARD: Hamming Distance Questions Post by dk39ab on Mar 4^{th}, 2008, 3:50am Hi, there's believed to be an approximate solution to this. From Information Theory, Inference, and Learning Algorithms (http://www.inference.phy.cam.ac.uk/mackay/itila/) by David MacKay: Quote:
In the terminology used here the length k is N and the number of strings n is 2^{RN}. also, H_{2}(x) = xlog_{2}(1/x)+(1x)log_{2}(1/(1x)) So, the maximum minimum distance should be around kH_{2}^{1}(1log_{2}(n)/k) for large k, if this conjecture holds. 

Title: Re: HARD: Hamming Distance Questions Post by jczerwinski on Mar 5^{th}, 2008, 2:18pm Thanks for the link. Here's another. http://scholar.lib.vt.edu/theses/available/etd05042004120929/unrestricted/thesis_revised.pdf These binary strings we're considering are closely related to Hadamard matrices. A poster suggested earlier that this problem resolves into the Hadamard Conjecture for the case n=k. He seems to be correct. The conjecture states, in the same terms we have been using here, that for any bitstring length of 4k for any k of the naturals, there exists at least 4k bitstrings with d = k/2 from one another. So, the problem we've been looking at here is in some ways more complicated than even the hadamard conjecture. So, even a simple proof of this: k = string length d = minimum hamming distance n = max number of such strings k = 4j (j of the naturals) d = 2j n = 4j (or greater) For all j, would be quite the breakthrough. So, apparently, even solving for this case has never been done. Or at least not published. So, that means an algorithm even to generate such sets has not been proven, in general, to exist. And only sets up to some 800 or so bitlength have been proven to exist. This suggests (I'm hypothesizing) that it's computationally difficult to generate such optimal sets. Nieve algorithms that do this are probably worse than factorial time in the size of k. So my comment on trying it above would be utterly useless for almost any input. Anyway, I'm going to go study that paper :) 

Title: Re: HARD: Hamming Distance Questions Post by yoyoy on Oct 14^{th}, 2013, 11:11am "How many bitstrings are in such a ball? By symmetry, B(k,s,d) = B(k,0,d), where 0 is the string of k zeroes. But a bitstring is at Hamming distance i from 0 iff it has i onebits, so the number of bitstrings at Hamming distance i from 0 is the number of ways of choosing i onebits from k bits. " So, shouldn't f() = kCi ? and not sum( 0<=i<=d). And what is this sum defining f() at all? Not able to understand! Also, I don't nderstand the inequality : nf()<=2^k Isn't n always equal to 2^k? bcoz n is the no. of strings possible. So, say if k=3, and we know that it forms a cube with 8 vertices. so, n=2^3=8. I'm new to this world of hamming codes. Trying to understand this concept. 

Title: Re: HARD: Hamming Distance Questions Post by titan on Oct 18^{th}, 2013, 8:47am Please help!! I really want to understand this! Please refer to my last post on this thread. 

Title: Re: HARD: Hamming Distance Questions Post by Grimbal on Oct 20^{th}, 2013, 12:58am f(k,d) = B(k,0,d) = sum(0 <= i <= d : k choose i). Obviously f(k,d) cannot be kCi because there is no i. If you mean kCd, you only count the strings at distance = d and not <= d. n*f(k,d)<=2^k results from the fact that there is no overlap between the sets B(k,s,d). Each of the 2^k bitstrings can appear only once. n*f(k,d)=2^k would mean there is no gap between the sets. This is not true in general. 

Powered by YaBB 1 Gold  SP 1.4! Forum software copyright © 20002004 Yet another Bulletin Board 