I'm writing some Perl code for a project I'm working on. One of my needs is to convert a string into a 32-bit integer. The catch is that the conversion must be deterministic (i.e. a hash function).

I looked at using CRC32, but the resulting integer wasn't always the same each time. I looked at using MD5, but MD5 produces a 32-character hex string (128 bits), which will overflow a 32-bit integer.

I decided to do something a bit weird: because an unsigned 32-bit integer can hold values up to 4,294,967,295, I could take the first 8 characters of the MD5 value and then convert each of the letters to a corresponding digit. So the function looks like this in Perl:

    use Digest::MD5 qw(md5_hex);

    # Function to convert string into a unique 32-bit integer
    sub convert_string
    {
        my ($str) = @_;
        my $md5str = md5_hex($str);               # 32 hex characters
        my $md5strsub = substr $md5str, 0, 8;     # keep the first 8
        $md5strsub =~ tr/a-f/1-6/;                # map letters a-f to digits 1-6
        return $md5strsub;
    }
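For comparison, here is a minimal sketch of a variant that skips the letter-to-digit mapping. Eight hex characters already encode exactly 32 bits, so they can be read directly as a number with Perl's built-in `hex`. The name `string_to_u32` is hypothetical, and the sketch assumes the input is a byte string (`md5_hex` dies on wide characters, so text would need encoding first, for example with `Encode::encode_utf8`).

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical helper: map a string to an unsigned 32-bit integer by
# reading the first 8 hex digits of its MD5 digest as a number.
sub string_to_u32 {
    my ($str) = @_;
    my $prefix = substr(md5_hex($str), 0, 8);   # 8 hex digits = 32 bits
    return hex($prefix);                        # value in 0 .. 4294967295
}

print string_to_u32("hello"), "\n";
```

Unlike the `tr/a-f/1-6/` mapping, which sends `a` and `1` to the same digit, this keeps all 32 bits of the prefix. Like any 32-bit hash, it can still collide, so the result is deterministic but not guaranteed unique.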