@Shoeboxam
Created April 9, 2024 19:38
OpenDP ALP
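import faker
import opendp.prelude as dp

# OpenDP requires opting in to "contrib" constructors before they can be used
dp.enable_features("contrib")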

# count the occurrences of each distinct string in the dataset
counter = dp.t.make_count_by(
    dp.vector_domain(dp.atom_domain(T=str)),
    dp.symmetric_distance(),
    MO=dp.L1Distance[int])

# release the counts through an ALP queryable
alp_meas = counter >> dp.m.then_alp_queryable(
    scale=1.,
    total_limit=10_000,  # the sum of all values
    value_limit=500,  # the max of all values
)

# privacy loss when neighboring datasets differ by one record
print("ε =", alp_meas.map(1))
# >>> ε = 1.0
# try the mechanism out on synthetic data
fake = faker.Faker()
data = [fake.first_name() for _ in range(10_000)]
alp_qbl = alp_meas(data)
# all queries sent to `alp_qbl` are post-processing
# common names have higher counts
# (outputs below are from one run; the data and the noise are both random)
print(alp_qbl("Michael"))
# >>> 248.0
# "Sharon" is relatively less common
print(alp_qbl("Sharon"))
# >>> 36.0
# unlikely to see any instances of Elizondeth
print(alp_qbl("Elizondeth"))
# >>> 0.0
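
To judge how close the ALP estimates are, it can help to compute the exact, non-private counts alongside them. The sketch below uses only the standard library. The helper names `exact_counts` and `within_limits` and the toy data are hypothetical and not part of OpenDP; `within_limits` simply checks the two bounds as the comments above describe them (sum of all values, max of all values).

```python
from collections import Counter


def exact_counts(data):
    """Non-private histogram: how many times each distinct value occurs."""
    return Counter(data)


def within_limits(counts, total_limit, value_limit):
    """Check the bounds as described in the comments on the ALP constructor:
    total_limit bounds the sum of all counts, value_limit bounds the largest count."""
    total = sum(counts.values())
    largest = max(counts.values(), default=0)
    return total <= total_limit and largest <= value_limit


# hypothetical toy data, standing in for the faker-generated names
toy_data = ["Michael"] * 5 + ["Sharon"] * 2 + ["Alex"]
toy_counts = exact_counts(toy_data)

print(toy_counts["Michael"])     # 5
print(toy_counts["Elizondeth"])  # 0, absent keys count as zero
print(within_limits(toy_counts, total_limit=10_000, value_limit=500))  # True
```

Running `exact_counts(data)` on the same `data` list passed to `alp_meas` gives a baseline to compare each `alp_qbl(name)` estimate against. The exact counts are not differentially private and should not be released.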