require 'digest/md5'
# This demonstrates an approach you can use to deterministically generate fake
# data based on user data to anonymize it.
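# An array of substitute names read from a file, one name per line
# (blank lines are skipped so they can never be chosen as a replacement)
substitute_names = File.readlines('names.txt', chomp: true).reject(&:empty?)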
# Real customer data, e.g., from a database
real_name = "John Realname"
# Hash the real name
name_hash = Digest::MD5.hexdigest(real_name) # => "ee4743cf5e5e2e90e27bf15f44e4afa9"
# Convert the hex digest to an integer
name_hash_as_integer = name_hash.to_i(16) # => 316726291424644485479127505202124730281
# Reduce the integer to a valid index into the list of substitute names
new_name_index = name_hash_as_integer % substitute_names.count
# New name to use instead
replacement_name = substitute_names[new_name_index] # => "Ava Johnson"
puts replacement_name
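
The steps above can be wrapped in a small reusable method. The following is a minimal sketch, not part of the original snippet: the method name `substitute_for` and the inline `SAMPLE_NAMES` list (standing in for `names.txt`) are assumptions made for illustration.

```ruby
require 'digest/md5'

# Hypothetical helper: maps a real value to a substitute by hashing it and
# using the hash, modulo the list size, as an index into the substitutes.
def substitute_for(real_value, substitutes)
  raise ArgumentError, 'substitutes must not be empty' if substitutes.empty?

  digest_as_integer = Digest::MD5.hexdigest(real_value).to_i(16)
  substitutes[digest_as_integer % substitutes.length]
end

# Assumed sample data standing in for the contents of names.txt.
SAMPLE_NAMES = ['Ava Johnson', 'Liam Smith', 'Mia Brown', 'Noah Davis'].freeze

first  = substitute_for('John Realname', SAMPLE_NAMES)
second = substitute_for('John Realname', SAMPLE_NAMES)
puts first == second # the same input always yields the same substitute
```

Because the index is taken modulo the list size, two different real names can map to the same substitute, and this becomes more likely the shorter the list is. The mapping also changes if the list is reordered or resized, so the substitute file should stay fixed for results to remain stable across runs.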