Luosky's Playground

It's better to burn out than to fade away.

Be Careful With NSString's Hash


My recent iOS project, Aimeiwei, is an image-oriented app focused on food. We use lots of images, and each picture a user uploads can exist in two aspect ratios: the original square food picture, and a cropped rectangular picture that can be used as the restaurant’s header image. I used EGOImageLoading to cache and display images from our remote server. But I came across a strange problem: sometimes the rectangular image view displayed the square image instead, despite the fact that the URLs of the two images are different and the URL of the rectangular one points to the correct rectangular image.

After a long time digging, I found that the problem was rooted in the hash method of NSString. These two URLs:


Their hashes are identical! (And EGOImageLoading uses the hash result as the key when caching an image. So if the square image had already been cached, the rectangular view displayed the cached square image.)

Then I found this article from back in 2004, which says:


The hash is a convolution of the first and last eight bytes plus the length of the string basically the byte values are shifted and added to the string length.
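To make that description concrete, here is a toy model in Python. It is only a sketch of the behaviour the quote describes, not Apple's actual implementation: the function name `sampled_hash`, the multiply-and-add mixing step, and both URLs are hypothetical and chosen purely for illustration.

```python
def sampled_hash(text: str) -> int:
    """Toy hash that looks only at the first and last eight bytes plus the length."""
    data = text.encode("utf-8")
    # Strings of 16 bytes or fewer are hashed in full; longer ones are sampled.
    sampled = data if len(data) <= 16 else data[:8] + data[-8:]
    result = len(data)
    for byte in sampled:
        # Multiply-and-add mixing step (an arbitrary choice), kept to 32 bits.
        result = (result * 31 + byte) & 0xFFFFFFFF
    return result


# Hypothetical URLs: same length, same first and last eight bytes,
# different only in the middle.
square_url = "http://example.com/images/square/0001/photo.jpg"
header_url = "http://example.com/images/header/0001/photo.jpg"

# A cache keyed by this hash cannot tell the two images apart:
# looking up the header URL finds the entry stored for the square URL.
cache = {sampled_hash(square_url): "square image"}
looked_up = cache.get(sampled_hash(header_url))
```

Any two strings of equal length that share their first and last eight bytes collide under a scheme like this, and image URLs that differ only in a path segment in the middle fit that pattern easily.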

I do know that the hash result is not guaranteed to be unique, but I didn’t expect uniqueness as weak as this. The implementation of hash must have been improved over the years, because the data provided in the article no longer gives the same result. But as of now, in 2012, on the iOS 6 SDK, those two URLs in my project still yield the same hash.

The fix is simple: just use an MD5 digest of the URL as the cache key. Now I use this fork instead. If I have time someday, I may give SDWebImage a try.
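The idea behind the fix can be sketched in a few lines of Python. This is an illustration under assumptions, not the code from the fork: the helper name `cache_key` and both URLs are hypothetical. The point is that the digest is computed over every byte of the URL, so a difference in the middle changes the key.

```python
import hashlib


def cache_key(url: str) -> str:
    """Return a hex MD5 digest of the whole URL, for use as a cache key."""
    return hashlib.md5(url.encode("utf-8")).hexdigest()


# Hypothetical URLs that differ only in the middle.
square_url = "http://example.com/images/square/0001/photo.jpg"
header_url = "http://example.com/images/header/0001/photo.jpg"

# Only the square image has been cached so far.
cache = {cache_key(square_url): "square image"}

# The header URL has its own key, so this lookup misses
# instead of returning the square image.
looked_up = cache.get(cache_key(header_url))
```

MD5 is used here only to derive a well-distributed key, not for security, so its known cryptographic weaknesses should not matter for this purpose.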