Tuesday, December 16, 2008

How to get a stable hash code in .NET

I often see code that calls object.GetHashCode() to get an object's hash code.
The key point: treating that hash code as a database index, or using it to communicate with other programs, is wrong.

See the Remarks section for object.GetHashCode() on MSDN.
I quote the important sentences here (from the VS 2008 documentation for object.GetHashCode()):
The default implementation of the GetHashCode method does not guarantee unique return values for different objects. Furthermore, the .NET Framework does not guarantee the default implementation of the GetHashCode method, and the value it returns will be the same between different versions of the .NET Framework. Consequently, the default implementation of this method must not be used as a unique object identifier for hashing purposes.
In other words, a change of .NET Framework version can make GetHashCode() return a different value.

The GetHashCode method for an object must consistently return the same hash code as long as there is no modification to the object state that determines the return value of the object's Equals method. Note that this is true only for the current execution of an application, and that a different hash code can be returned if the application is run again.

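This makes it even clearer: even on the same .NET Framework version, the same object may return a different hash code in a different process, or when your program is simply run again.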

 

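So a value from GetHashCode() is usable as an index only within the current run of the program. Once several programs need to communicate, or the value is written to a database or a file to serve as an index next time, using it is wrong!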
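As a minimal sketch of that distinction (the class name InProcessOnly and the sample strings are made up for illustration), the program below uses hash codes only inside one run, which is the case the documentation does cover:

```csharp
using System;
using System.Collections.Generic;

public static class InProcessOnly
{
    public static void Main()
    {
        // Fine: Dictionary calls GetHashCode() internally, and the values
        // only have to stay consistent for the lifetime of this process.
        Dictionary<string, int> counts = new Dictionary<string, int>();
        counts["apple"] = 1;
        counts["apple"] += 1;
        Console.WriteLine(counts["apple"]); // prints 2

        // Within a single run, equal strings give equal hash codes.
        int h1 = "apple".GetHashCode();
        int h2 = string.Concat("app", "le").GetHashCode();
        Console.WriteLine(h1 == h2); // prints True

        // Not fine: saving h1 to a file or a database column and comparing
        // it against a value computed by a later run or another program.
    }
}
```

The actual number in h1 is deliberately not printed: nothing guarantees it will be the same on another run or another runtime version.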

 

So how do you get a hash code that does not change?
One approach, listed below, is to use MD5CryptoServiceProvider to produce a fixed hash code.

using System;
using System.Collections.Generic;
using System.Text;
using System.Security.Cryptography;

namespace HashCodeTool
{
    /// <summary>
    /// Uses an MD5 hash to produce a string->long hash key.
    /// </summary>
    public sealed class MD5HashUtils
    {
        /// <summary>
        /// Constructor disabled, because this class only provides static functions.
        /// </summary>
        private MD5HashUtils() { }

        // Shared instance; readonly so the object used as the lock target never changes.
        private static readonly MD5CryptoServiceProvider md5 = new MD5CryptoServiceProvider();

        /// <summary>
        /// Uses an MD5 hash to produce a string->long hash key.
        /// It should be thread safe.
        /// If the string is null or empty, returns 0.
        /// </summary>
        /// <param name="str">The string to hash.</param>
        /// <returns>The 16-byte MD5 digest folded into 8 bytes.</returns>
        public static long ComputeHashKey(string str)
        {
            if (String.IsNullOrEmpty(str))
                return 0;

            // For thread safety, lock the shared md5 instance.
            // Encoding.Unicode means the UTF-16 (little-endian) bytes of the string are hashed.
            byte[] buf;
            lock (md5)
                buf = md5.ComputeHash(Encoding.Unicode.GetBytes(str));

            // Fold the digest: XOR each byte into byte position (i mod 8) of the result.
            int longbytes = 8;
            int maskindex = 0;
            long r = 0;
            for (int i = 0; i < buf.Length; i++)
            {
                r = r ^ ((long)buf[i] << (maskindex * 8));
                maskindex++;
                if (maskindex >= longbytes)
                    maskindex = 0;
            }
            return r;
        }

        /// <summary>
        /// Uses an MD5 hash to produce a string->int hash key.
        /// It should be thread safe.
        /// If the string is null or empty, returns 0.
        /// </summary>
        /// <param name="str">The string to hash.</param>
        /// <returns>The 16-byte MD5 digest folded into 4 bytes.</returns>
        public static int ComputeIntHashKey(string str)
        {
            if (String.IsNullOrEmpty(str))
                return 0;

            // For thread safety, lock the shared md5 instance.
            byte[] buf;
            lock (md5)
                buf = md5.ComputeHash(Encoding.Unicode.GetBytes(str));

            // Fold the digest: XOR each byte into byte position (i mod 4) of the result.
            int intbytes = 4;
            int maskindex = 0;
            int r = 0;
            for (int i = 0; i < buf.Length; i++)
            {
                r = r ^ ((int)buf[i] << (maskindex * 8));
                maskindex++;
                if (maskindex >= intbytes)
                    maskindex = 0;
            }
            return r;
        }
    }
}
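
The folding loop in ComputeHashKey may be easier to follow on a hand-made digest. The sketch below is only an illustration: FoldDemo and Fold are hypothetical names that do not appear in the listing, and the 16 bytes are made up rather than a real MD5 result. It repeats the same XOR folding, where byte i lands in byte position i mod 8 of the long.

```csharp
using System;

public static class FoldDemo
{
    // Same folding as ComputeHashKey: byte i is XORed into
    // byte position (i % 8) of the 64-bit result.
    public static long Fold(byte[] buf)
    {
        long r = 0;
        for (int i = 0; i < buf.Length; i++)
            r ^= (long)buf[i] << ((i % 8) * 8);
        return r;
    }

    public static void Main()
    {
        byte[] digest = new byte[16]; // all zero, standing in for an MD5 digest
        digest[0] = 0x01;             // lands in byte 0
        digest[8] = 0x03;             // also lands in byte 0: 0x01 ^ 0x03 = 0x02
        digest[1] = 0xFF;             // lands in byte 1

        Console.WriteLine(Fold(digest).ToString("X")); // prints FF02
    }
}
```

Because the result depends only on the digest bytes, and the digest depends only on the UTF-16 bytes of the input string, any other program that wants the same key has to use the same encoding and the same folding.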

