« Return to Thread: Cryptographic hash uniquness (was Simple network client)

Re: Cryptographic hash uniquness (was Simple network client)

by Richard Kelsall :: Rate this Message:

Reply to Author | View in Thread

Bulat Ziganshin wrote:

> Hello Peter,
>
> Thursday, January 31, 2008, 8:01:36 PM, you wrote:
>
>> files with different content generating the same hash)... My
>> intuition told me that the odds of two cryptographic hashes (on
>> meaningful content) colliding was much less than the earth being
>> destroyed by an asteroid... But this is just intuition... What does
>> computer science tell us about this?
>
> you may be interested to know that widely used rsync algorithms relies
> on 128-bit hashes and its author speculated about its reliability:
>
> http://samba.org/~tridge/phd_thesis.pdf
>

Interesting paper. Thank you. I had a quick read of the bit relating to
hashes and my understanding is this. He uses a weak, quick and simple
hash to deal with 99.99% (I invented that percentage) of cases - if the
hash is different we know the files are definitely different - if the
hash collides he then does a strong, slow, secure hash and relies on
this as the answer. But he's relying on the strong hash rather than
doing a byte by byte comparison because there is a major cost (a network
transmission of the file) to doing the proper byte by byte comparison.
If you had both files accessible at a low cost it might be better to
byte by byte compare them when you get a collision rather than use
the strong hash.

The right approach may be to assume that collisions will occur and
cater for this properly in the program somehow. A good tip for testing
this sort of thing is to have the length of the hash (maximum size of
the array or whatever you want to test) as a parameter that you can
turn down to a very low value to induce collisions (overflows etc) to
see whether the program still works. And then turn it back up for live
use.

A cryptographic hash appears as a completely random function of the
input so the likelihood of a collision is simply 2^(bits in hash).


Richard.
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@...
http://www.haskell.org/mailman/listinfo/haskell-cafe

 « Return to Thread: Cryptographic hash uniquness (was Simple network client)

LightInTheBox - Buy quality products at wholesale price