@alexanderlz
Created January 7, 2013 14:33
Crypto hash UDF for Apache Hive. Lets users hash values from HiveQL. Can be used to obfuscate data with MD5 or SHA-1.
1. Compile the code and package it as a JAR.
2. Inside the Hive shell, run "add jar YOUR_JAR.jar;"
3. Still inside the shell, run "create temporary function crypt_hash as 'com.hiveextensions.udf.CryptoHash';"
4. Usage: crypt_hash(FIELD_TO_HASH, ALGORITHM), where ALGORITHM can be any MessageDigest algorithm name, such as "sha-1" or "md5". The standard names are listed at:
http://docs.oracle.com/javase/6/docs/technotes/guides/security/StandardNames.html#MessageDigest
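Which names step 4 accepts depends on the security providers installed in the JVM that runs Hive. As a rough sketch (the class name ListDigestAlgorithms is made up for illustration and is not part of the gist), this standalone Java program prints the MessageDigest algorithms the local JVM reports:

```java
import java.security.Security;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of the gist: lists digest algorithms known to this JVM.
final class ListDigestAlgorithms {

    // Returns the MessageDigest algorithm names registered by the installed providers, sorted.
    static Set<String> available() {
        return new TreeSet<>(Security.getAlgorithms("MessageDigest"));
    }

    public static void main(String[] args) {
        for (String name : available()) {
            System.out.println(name);
        }
    }
}
```

The output varies between JVMs and provider configurations, so treat it as a local check, not a definitive list.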
package com.hiveextensions.udf;

import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

import org.apache.hadoop.hive.ql.exec.UDF;

/** Hive UDF that returns the hex-encoded digest of a string. */
public final class CryptoHash extends UDF {

    public String evaluate(final String s, final String algorithm) {
        // NULL input yields NULL output.
        if (s == null)
            return null;
        try {
            MessageDigest md = MessageDigest.getInstance(algorithm);
            // getBytes() uses the platform default charset.
            md.update(s.getBytes());
            byte[] mdhash = md.digest();
            // Encode each byte as two lowercase hex digits.
            StringBuilder builder = new StringBuilder();
            for (byte b : mdhash) {
                builder.append(Integer.toString((b & 0xff) + 0x100, 16).substring(1));
            }
            return builder.toString();
        } catch (NoSuchAlgorithmException nsae) {
            System.out.println("Unable to find hash algorithm! msg[ " + nsae.getMessage() + " ]");
            System.exit(-1);
        }
        return null;
    }
}
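To check the digest logic without a Hive installation, the same hex encoding can be tried in a plain Java class. This is only a sketch under a few assumptions that differ from the UDF above: the class name HexDigestSketch is hypothetical, the input is encoded as UTF-8 instead of the platform default charset, and an unknown algorithm raises IllegalArgumentException instead of calling System.exit.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical standalone sketch, not part of the gist.
final class HexDigestSketch {

    // Returns the lowercase hex digest of s, or null when s is null.
    static String hexDigest(String s, String algorithm) {
        if (s == null) {
            return null;
        }
        try {
            MessageDigest md = MessageDigest.getInstance(algorithm);
            // Assumption: UTF-8 input encoding (the UDF uses the platform default).
            byte[] hash = md.digest(s.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                // Mask to 0..255 so each byte formats as exactly two hex digits.
                sb.append(String.format("%02x", b & 0xff));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // Assumption: report the problem to the caller instead of exiting the JVM.
            throw new IllegalArgumentException("Unknown digest algorithm: " + algorithm, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(hexDigest("hello", "MD5"));   // 5d41402abc4b2a76b9719d911017c592
        System.out.println(hexDigest("hello", "SHA-1")); // aaf4c61ddcc5e8a2dabede0f3b482cd9aea9434d
    }
}
```

Throwing an exception lets a caller decide how to handle a bad algorithm name; whether that is preferable inside a Hive task is a design choice this sketch does not settle.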