@gakhov
Created July 30, 2019 10:26
Example: How to use HyperLogLog from the pdsa Python library
import json

from pdsa.cardinality.hyperloglog import HyperLogLog

# precision=10 gives 2^{10} = 1024 counters
hll = HyperLogLog(precision=10)

# visitors.txt holds one JSON object per line, each with an 'ip' field
with open('visitors.txt') as f:
    for line in f:
        ip = json.loads(line)['ip']
        hll.add(ip)

# Approximate number of distinct IPs seen
num_of_unique_visitors = hll.count()
print('Unique visitors', num_of_unique_visitors)

# Memory used by the counter
size_in_bytes = hll.size()
print('Size in bytes', size_in_bytes)
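
To judge whether a given precision is good enough, it can help to look at the expected error and to compare against an exact count on a small sample. The sketch below does not use pdsa at all. It assumes the usual HyperLogLog estimate of about 1.04 / sqrt(m) relative standard error for m counters, which comes from the general analysis of the algorithm rather than from anything pdsa reports. The helper names `hll_parameters` and `exact_unique_ips` and the sample records are hypothetical.

```python
import json
import math


def hll_parameters(precision):
    """Return (number of counters, approximate relative standard error)."""
    m = 2 ** precision               # e.g. precision=10 -> 1024 counters
    std_error = 1.04 / math.sqrt(m)  # textbook HyperLogLog estimate (assumption)
    return m, std_error


def exact_unique_ips(lines):
    """Exact baseline: collect every distinct 'ip' value in a set."""
    return len({json.loads(line)['ip'] for line in lines})


# Hypothetical sample records, one JSON object per line as in visitors.txt
sample = [
    '{"ip": "10.0.0.1"}',
    '{"ip": "10.0.0.2"}',
    '{"ip": "10.0.0.1"}',
]

counters, error = hll_parameters(10)
print('Counters', counters)
print('Relative standard error', round(error, 4))
print('Exact unique visitors', exact_unique_ips(sample))
```

The exact baseline keeps every distinct IP in memory, so it only makes sense for checking the estimate on small inputs. Avoiding that cost on large ones is the reason to use HyperLogLog.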

gakhov commented Jul 30, 2019

pdsa is a Python library, available at https://github.com/gakhov/pdsa
