wu :: forums
« wu :: forums - String Hashing »

Welcome, Guest. Please Login or Register.
May 19th, 2024, 9:32am

RIDDLES SITE WRITE MATH! Home Home Help Help Search Search Members Members Login Login Register Register
   wu :: forums
   riddles
   cs
(Moderators: william wu, ThudnBlunder, Grimbal, SMQ, Eigenray, Icarus, towr)
   String Hashing
« Previous topic | Next topic »
Pages: 1  Reply Reply Notify of replies Notify of replies Send Topic Send Topic Print Print
   Author  Topic: String Hashing  (Read 2208 times)
John_Gaughan
Uberpuzzler
*****



Behold, the power of cheese!

5187759 5187759   john23874   SnowmanJTG
WWW Email

Gender: male
Posts: 767
String Hashing  
« on: Jan 25th, 2004, 11:08pm »
Quote Quote Modify Modify

This is slightly subjective and not very difficult -- what do you think is the best way to hash a string? Assume an ASCII string, without the extended character set (i.e. 7 bit). I'm interested in the general plan of attack, not "use MD5."
 
I ask because in one of my classes this topic came up and some of the methods proposed just don't seem very good to me.
 
By "best" I mean what method will produce good hashes and have decent (but not necessarily optimal) running time.
IP Logged

x = (0x2B | ~0x2B)
x == the_question
towr
wu::riddles Moderator
Uberpuzzler
*****



Some people are average, some are just mean.

   


Gender: male
Posts: 13730
Re: String Hashing  
« Reply #1 on: Jan 26th, 2004, 1:16am »
Quote Quote Modify Modify

I suppose you could take a long long int (64 bits), and fill it by adding characteri * primei .. I don't really know what it'll do, but I'm sure it's decent enough for some applications Tongue
 
Or maybe  
long double hash = 1.0;
while(character[i])
{
  srand(character[i]);
  hash*= rand(chararcter[i++])
}
(supposedly since rand is reasonably uniformly distributed, hash will too)
 
hmm.. there must be some information to be found on this somewhere..
IP Logged

Wikipedia, Google, Mathworld, Integer sequence DB
John_Gaughan
Uberpuzzler
*****



Behold, the power of cheese!

5187759 5187759   john23874   SnowmanJTG
WWW Email

Gender: male
Posts: 767
Re: String Hashing  
« Reply #2 on: Jan 27th, 2004, 6:12am »
Quote Quote Modify Modify

I think you have the right idea -- pull information from somewhere besides the contents of the string. Primes are a good direction to go in. I don't know about rand(), since this is not guaranteed to give the same numbers on different implementations given the same seed (i.e. C++ or Java, Windows or Linux, etc).
 
Some of the ideas presented are to add up all the characters, ignoring carry/overflow bits; alternately multiply and divide characters (h = (a1a2)/a3...) doing integer division, and a couple others. Each one has weaknesses and it is possible to guess the original string.
 
It does depend on the application. Doing a quick and computationally easy hash so you can hopefully find a shortcut when comparing two objects is one thing, and protecting /etc/passwd or /etc/shadow is something different. And doing checksums on files can be even trickier.
IP Logged

x = (0x2B | ~0x2B)
x == the_question
towr
wu::riddles Moderator
Uberpuzzler
*****



Some people are average, some are just mean.

   


Gender: male
Posts: 13730
Re: String Hashing  
« Reply #3 on: Jan 27th, 2004, 7:40am »
Quote Quote Modify Modify

If rand isn't deterministic given a seed, you can always try and use an algorithm for your own pseudo random number generator. In the end the hash should probably be as random-like as possible, as this will make it more usefull (for use in hashmaps for instance, it should be as uniform as possible modulo whatever prime you use as its size)
 
I suppose it'd be best if the hash isn't the same for every permutation of the input string. So that means the calculation shouldn't be associative. In that sense adding or multiplying all characters (or functions of de seperate characters) is probably not a very good idea.  
 
Guessing the original string from the hash is in general quite difficult, since in most cases the strings can be any length, and hashes have a fixed length. Only if you know enough constraints can you guess the original.  
The problem with passwords is that they are generally short (6-10 characters), and most people like using words. So after encrypting all the words from a dictionary you can generally guess some of the passwords in a password-file.
IP Logged

Wikipedia, Google, Mathworld, Integer sequence DB
John_Gaughan
Uberpuzzler
*****



Behold, the power of cheese!

5187759 5187759   john23874   SnowmanJTG
WWW Email

Gender: male
Posts: 767
Re: String Hashing  
« Reply #4 on: Jan 27th, 2004, 12:25pm »
Quote Quote Modify Modify

on Jan 27th, 2004, 7:40am, towr wrote:
If rand isn't deterministic given a seed, you can always try and use an algorithm for your own pseudo random number generator.

I don't think it's the algorithm so much as what prime numbers you use to turn the crank. Every random number generator I've seen is basically the one Knuth has in his book, rand = input * big prime (mod m).
 
Quote:
Only if you know enough constraints can you guess the original.

Not necessarily. Heuristics can give clues about the generating algorithm (bit patterns, for example), and combined with a dictionary attack on modern hardware, it is possible to crack a weak password using a weak hash in a few hours. I've seen it myself. I don't have the depth of computer science knowledge needed to do it myself, but I do understand how it is done. Again, a nice side benefit of previous security-related jobs. But I know what you're saying. I'm not disagreeing, I'm just adding my two cents.
IP Logged

x = (0x2B | ~0x2B)
x == the_question
towr
wu::riddles Moderator
Uberpuzzler
*****



Some people are average, some are just mean.

   


Gender: male
Posts: 13730
Re: String Hashing  
« Reply #5 on: Jan 27th, 2004, 1:39pm »
Quote Quote Modify Modify

on Jan 27th, 2004, 12:25pm, John_Gaughan wrote:
Not necessarily. Heuristics can give clues about the generating algorithm (bit patterns, for example), and combined with a dictionary attack on modern hardware, it is possible to crack a weak password using a weak hash in a few hours.
That presupposes there is a weak password. If every password is a completely random string a dictionary attack won't do anything.
And if the string is sufficiently larger than the hash the chance is nil you'll ever guess a password brute force (if the hashing algorithm is even half decent).  
Both the length, and being a word are constraints, as is 'easy to remember for people'. So given those you may be able to crack it, but without them I very much doubt it..
If you look at the unconstrained problem, things are much more difficult. I would be very interested to see anyone reconstuct a, oh let's say, 120 Mb program given only a 32 character hash. Or even a full 10-20 word sentence used as a password.. (A shame many programs only look at the first 8 characters..)
Without constraints the chance is too slim you'll find a string that gives the same hash, and slimmer still it would be the original.
IP Logged

Wikipedia, Google, Mathworld, Integer sequence DB
Pages: 1  Reply Reply Notify of replies Notify of replies Send Topic Send Topic Print Print

« Previous topic | Next topic »

Powered by YaBB 1 Gold - SP 1.4!
Forum software copyright © 2000-2004 Yet another Bulletin Board