wu :: forums
« wu :: forums - CRC32 interview challenge »

Welcome, Guest. Please Login or Register.
May 19th, 2024, 5:54am

RIDDLES SITE WRITE MATH! Home Home Help Help Search Search Members Members Login Login Register Register
   wu :: forums
   riddles
   cs
(Moderators: Icarus, towr, SMQ, ThudnBlunder, Grimbal, Eigenray, william wu)
   CRC32 interview challenge
« Previous topic | Next topic »
Pages: 1  Reply Reply Notify of replies Notify of replies Send Topic Send Topic Print Print
   Author  Topic: CRC32 interview challenge  (Read 1756 times)
Leo Broukhis
Senior Riddler
****





   


Gender: male
Posts: 459
CRC32 interview challenge  
« on: Jun 13th, 2003, 5:38pm »
Quote Quote Modify Modify

A candidate is given access to a computer that has a compiler/interpreter of the candidate's favorite programming language, an implementation of CRC32 in that language, no network access, no data files (in particular, no word lists like /usr/share/dict/words on Unux systems) and is asked to produce two different, short, easy to remember [to a lay person] strings that have the same CRC32 in approx. 30 minutes. What should the candidate do?
IP Logged
towr
wu::riddles Moderator
Uberpuzzler
*****



Some people are average, some are just mean.

   


Gender: male
Posts: 13730
Re: CRC32 interview challenge  
« Reply #1 on: Jun 15th, 2003, 7:43am »
Quote Quote Modify Modify

short? well just try all 6-8 letter strings should be doable in half an hour. Maybe a hash to easy find a double. And perhaps a constraint on the string form (like consonant, vowel, consonant, consonant, vowel, consonant)
IP Logged

Wikipedia, Google, Mathworld, Integer sequence DB
Leo Broukhis
Senior Riddler
****





   


Gender: male
Posts: 459
Re: CRC32 interview challenge  
« Reply #2 on: Jun 15th, 2003, 8:40am »
Quote Quote Modify Modify

There are two problems with this: first, I wouldn't say that pseudo-words are easily remembered by laypeople, let alone arbitrary letter sequences and second, the number of 8-letter strings with a constraint (e.g. of the form CWCCWCWC) is less than 12% of 232, the number of possible values of CRC32.
 
Hint 1 (or rather a clarification): a string is easy to remember when it consists of known words, preferably forming a phrase.
 
Hint 2: At the end of 30 minutes, it is not a good idea to ask the interwiewer to wait until the program finishes running for more than a few seconds. Therefore, the program must terminate, say, within one minute. And, if it does not succeed in finding a pair, an estimation of its chances of not finding a pair would be in order.
 
Hint 3: needless to say, I have a few such pairs.
IP Logged
towr
wu::riddles Moderator
Uberpuzzler
*****



Some people are average, some are just mean.

   


Gender: male
Posts: 13730
Re: CRC32 interview challenge  
« Reply #3 on: Jun 15th, 2003, 2:16pm »
Quote Quote Modify Modify

hmm...
I read the problem wrong. Also I don't know what crc32 is..
IP Logged

Wikipedia, Google, Mathworld, Integer sequence DB
Leo Broukhis
Senior Riddler
****





   


Gender: male
Posts: 459
Re: CRC32 interview challenge  
« Reply #4 on: Jun 15th, 2003, 3:19pm »
Quote Quote Modify Modify

CRC32 is a 32-bit Cyclic Redundancy Check, computed by dividing a polynomial represented by the given string, by a particular (I don't remember offhand what its value is exactly) irreducible polynomial of power 32. It is used as an integrity check in data compressors, e.g. ZIP (WinZip, GZip, etc.).
IP Logged
BNC
Uberpuzzler
*****





   


Gender: male
Posts: 1732
Re: CRC32 interview challenge  
« Reply #5 on: Jun 15th, 2003, 11:41pm »
Quote Quote Modify Modify

CRC stands for Cyclic Redundancy Check. The 32 comes from the fact it calculates a 32-bit checksum. CRC32 is an algorhytm for calculating a unique identifier for a file. It's used in programs like PKZIP to identify files and make sure that they are original. (there's 1 in 2^32 chance of two files getting the same CRC32 checksum and being mistaken as being the same file).
 
CRC's have the interesting property that they will detect ALL single burst error events of their size or less regardless of location.  
 
http://www.codeproject.com/cpp/crc32.asp
http://forth.sourceforge.net/algorithm/crc32/index.html
IP Logged

How about supercalifragilisticexpialidociouspuzzler [Towr, 2007]
Leo Broukhis
Senior Riddler
****





   


Gender: male
Posts: 459
Re: CRC32 interview challenge  
« Reply #6 on: Jun 17th, 2003, 4:41pm »
Quote Quote Modify Modify

A hint: Birthday Paradox.
IP Logged
xlh
Guest

Email

Re: CRC32 interview challenge  
« Reply #7 on: Jun 24th, 2003, 8:52pm »
Quote Quote Modify Modify Remove Remove

I think towr's answer is pretty good. If we count 21 consonants and 5 vowels we have over 222 6 letter words of the form (CVCCVC) and the words would be fairly easy to remember because they would be just 2 pronouncable syllables each.  
 
I guess the answer you're gunning for would be that you type in small sets of words that form a sentence like [firstname] [lastname] got a [color][item] for [holiday]. Type in 16 entries for each category and you've got 220 sentences.
 
By Sterling's approximation, the odds of failing on the phrases case are approximately 2.5E-56, and much lower still for the 6 letter strings.
IP Logged
Leo Broukhis
Senior Riddler
****





   


Gender: male
Posts: 459
Re: CRC32 interview challenge  
« Reply #8 on: Jun 24th, 2003, 10:42pm »
Quote Quote Modify Modify

Quote:
I think towr's answer is pretty good. If we count 21 consonants and 5 vowels we have over 222 6 letter words of the form (CVCCVC) and the words would be fairly easy to remember because they would be just 2 pronouncable syllables each.  

 
This will not work because the CRC32 of most pairs of such strings are not independent for the birthday paradox to work. If I remember correctly, CRC32 is guaranteed not to match on strings that differ by less than 4 characters.
 
I am not sure if 16 entries in each of 5 categories provide enough randomness to appeal to the birthday paradox, but 300 hundred nouns and 300 adjectives seem enough. To wit: "electric vest" and "salty bus", "motorcycle space" and "pie Sunday", "file bread" and "son law".
IP Logged
Grimbal
wu::riddles Moderator
Uberpuzzler
*****






   


Gender: male
Posts: 7527
Re: CRC32 interview challenge  
« Reply #9 on: May 29th, 2004, 2:28pm »
Quote Quote Modify Modify

Don't forget that you have no access to a dictionnary.
 
I have a completely different approach, that I tried and works well.
 
In some other thread, there was the question how to detect loops in linked lists.  This gave me an idea.
 
I create a function to encode a 32bit number into a string.  For example, in the pattern cvccvcvnnn.
 
Then I start with a randon string str0.
I compute
crc0 = crc(str0) and str1 = encode(crc0),
crc1 = crc(str1) and str2 = encode(crc1),
etc.
 
Then I use the clever algorithm to detect loops without the need to store all elements.  This happens when a string in the sequence happens to have the same CRC as a previous string.
 
Then I repeat the method to have multiple pairs to choose of.
 
Using this method, I found:
crc(rikteja44) = crc(josloza41)
crc(pykkety81) = crc(rogwopo21)
 
or with pattern cvcv...nn
crc(jeromim83) = crc(wyfuny73)
 
The rest is a matter of designing an encoding function that gives more meaningful words.
IP Logged
Leo Broukhis
Senior Riddler
****





   


Gender: male
Posts: 459
Re: CRC32 interview challenge  
« Reply #10 on: May 29th, 2004, 5:00pm »
Quote Quote Modify Modify

on May 29th, 2004, 2:28pm, grimbal wrote:
Don't forget that you have no access to a dictionnary.

I don't. How long would it take to come with a list af a few hundred common words? A few minutes at most.
Quote:

I have a completely different approach, that I tried and works well.
 
In some other thread, there was the question how to detect loops in linked lists.  This gave me an idea.
www.md5crk.com could be of interest to you. They use the Pollard rho method as well.
Quote:

The rest is a matter of designing an encoding function that gives more meaningful words.

 
True, but it might be hard within the time limit. Although a method used in the some authentication schemes where a common word encodes each byte would work.
IP Logged
Grimbal
wu::riddles Moderator
Uberpuzzler
*****






   


Gender: male
Posts: 7527
Re: CRC32 interview challenge  
« Reply #11 on: May 31st, 2004, 7:39am »
Quote Quote Modify Modify

on May 29th, 2004, 5:00pm, Leonid Broukhis wrote:

I don't. How long would it take to come with a list af a few hundred common words? A few minutes at most. www.md5crk.com could be of interest to you. They use the Pollard rho method as well.

I see.  Then,  inded, using the birthday paradox and storing the results will bring shorter strings than with loop detection.  If I want to do that with loop detection, it is necessary that the string encodes completely the CRC.  That means that with a 256-word dictionnary, I still need 4 words.
The loop detection method is just faster and easier to code because no storage and lookup is necessary.
IP Logged
Pages: 1  Reply Reply Notify of replies Notify of replies Send Topic Send Topic Print Print

« Previous topic | Next topic »

Powered by YaBB 1 Gold - SP 1.4!
Forum software copyright © 2000-2004 Yet another Bulletin Board