@dankleiman
Created March 10, 2017 20:08
Code for Real-time Deduping at Scale
buckets_per_min = keys_per_minute / keys_per_bucket
buckets_per_min = 2,000,000 / 100 = 20,000
buckets_per_sec = 20,000 / 60 ≈ 333
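The bucket arithmetic above can be checked with a few lines of Ruby. This is only a sketch: the constant and method names are made up for illustration, and the per-day figure is simply the per-minute figure multiplied out.

```ruby
# Bucket-count arithmetic using the figures quoted above.
KEYS_PER_MINUTE = 2_000_000
KEYS_PER_BUCKET = 100

def buckets_per_min
  KEYS_PER_MINUTE / KEYS_PER_BUCKET   # integer division
end

def buckets_per_sec
  buckets_per_min / 60                # truncates 333.33... to 333
end

def buckets_per_day
  buckets_per_min * 60 * 24
end

puts "buckets per minute: #{buckets_per_min}"
puts "buckets per second: #{buckets_per_sec}"
puts "buckets per day:    #{buckets_per_day}"
```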
(overhead + key_bytes + value_bytes) * keys_per_minute * 60 * 24
(64 + 36 + 13) * 2,000,000 * 60 * 24 = 325,440,000,000 bytes ≈ 303 GiB (~300GB)
HMGET(bucket, message_ids)
message_ids.each do
  if existing owner matches or not set
    not dupe
  else
    dupe
  end
end
HMSET(bucket, new_message_ids_and_values)
return dupes
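The read-compare-write flow above can be sketched against an in-memory stand-in for Redis. Everything here is an assumption for illustration: `FakeRedis` is not a real client, `dedupe` is a hypothetical name, and the owner strings are invented. Only the HMGET and HMSET semantics are modelled.

```ruby
# In-memory stand-in for the two Redis hash commands used above (not a real client).
class FakeRedis
  def initialize
    @hashes = Hash.new { |h, k| h[k] = {} }
  end

  # Like HMGET: values for the given fields, in order, nil when a field is unset.
  def hmget(bucket, fields)
    fields.map { |f| @hashes[bucket][f] }
  end

  # Like HMSET: writes every field/value pair into the bucket.
  def hmset(bucket, pairs)
    @hashes[bucket].merge!(pairs)
  end
end

# messages: { message_id => owner_id }. Returns the ids judged to be dupes.
def dedupe(store, bucket, messages)
  ids = messages.keys
  existing = store.hmget(bucket, ids)
  dupes = []
  fresh = {}
  ids.zip(existing).each do |id, owner|
    if owner.nil?
      fresh[id] = messages[id]   # not set: record our owner
    elsif owner != messages[id]
      dupes << id                # set by a different owner: dupe
    end                          # same owner: a replay, not a dupe
  end
  store.hmset(bucket, fresh) unless fresh.empty?
  dupes
end

store = FakeRedis.new
p dedupe(store, "b1", { "m1" => "p0:10", "m2" => "p0:11" })  # first sighting
p dedupe(store, "b1", { "m1" => "p0:10" })                   # replay by same owner
p dedupe(store, "b1", { "m1" => "p3:99", "m3" => "p3:100" }) # m1 claimed elsewhere
```

Because the read and the write are separate steps, two concurrent callers could both see an id as unset; the sketch ignores that.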
((per_key_overhead + key_bytes + value_bytes) * keys_per_minute * 60 * 24) +
((per_bucket_overhead + bucket_key_bytes) * buckets_per_minute * 60 * 24)
((1 + 16 + 8) * 2,000,000 * 60 * 24) +
((64 + 14) * 20,000 * 60 * 24)
= (72,000,000,000 + 2,246,400,000) = 74,246,400,000 bytes ≈ 69 GiB
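The two-term estimate above can be reproduced with a small helper. This is a sketch under the stated byte counts; `bytes_per_day` and `gib` are hypothetical names, and the conversion assumes the "GB" figure is binary gigabytes (2^30 bytes).

```ruby
MINUTES_PER_DAY = 60 * 24

# Daily storage: a per-key term plus an optional per-bucket term.
def bytes_per_day(per_key_overhead:, key_bytes:, value_bytes:, keys_per_minute:,
                  per_bucket_overhead: 0, bucket_key_bytes: 0, buckets_per_minute: 0)
  keys    = (per_key_overhead + key_bytes + value_bytes) * keys_per_minute * MINUTES_PER_DAY
  buckets = (per_bucket_overhead + bucket_key_bytes) * buckets_per_minute * MINUTES_PER_DAY
  keys + buckets
end

# Bytes to GiB, rounded to one decimal place.
def gib(bytes)
  (bytes / 2.0**30).round(1)
end

bucketed = bytes_per_day(per_key_overhead: 1, key_bytes: 16, value_bytes: 8,
                         keys_per_minute: 2_000_000,
                         per_bucket_overhead: 64, bucket_key_bytes: 14,
                         buckets_per_minute: 20_000)
puts "bucketed: #{bucketed} bytes (#{gib(bucketed)} GiB)"
```

Leaving the bucket arguments at their defaults gives the single-term form used for plain top-level keys.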
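# bucket: the Redis hash key for this message_id (how it is derived is not shown here)
message_ids.each do |message_id, owner_id|
  client.HSETNX(bucket, message_id, owner_id) # claim owner
  client.HGET(bucket, message_id)             # determine winner
end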
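The claim-then-read pattern above relies on HSETNX writing a field only when it is absent, so the first writer wins and every later caller reads back the winner. A minimal in-memory sketch follows. `ClaimStore` and `dupes_for` are hypothetical names, not a real Redis client, and since Redis HSETNX takes a key, a field and a value, the sketch assumes a hash key (`bucket`) is passed alongside the message id.

```ruby
# In-memory stand-in that mimics HSETNX / HGET semantics (not a real client).
class ClaimStore
  def initialize
    @hashes = Hash.new { |h, k| h[k] = {} }
  end

  # Like HSETNX: sets the field only if it is absent; true when it was set.
  def hsetnx(key, field, value)
    return false if @hashes[key].key?(field)
    @hashes[key][field] = value
    true
  end

  # Like HGET: the stored value, or nil when the field is unset.
  def hget(key, field)
    @hashes[key][field]
  end
end

# messages: { message_id => owner_id }. Returns ids whose winner is someone else.
def dupes_for(store, bucket, messages)
  messages.each_with_object([]) do |(message_id, owner_id), dupes|
    store.hsetnx(bucket, message_id, owner_id)    # claim owner
    winner = store.hget(bucket, message_id)       # determine winner
    dupes << message_id unless winner == owner_id
  end
end

store = ClaimStore.new
p dupes_for(store, "b1", { "m1" => "A", "m2" => "A" })  # A claims both
p dupes_for(store, "b1", { "m2" => "B", "m3" => "B" })  # m2 already belongs to A
```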
(per_key_overhead + key_bytes + value_bytes) * keys_per_minute * 60 * 24
(64 + 16 + 8) * 2,000,000 * 60 * 24 = 253,440,000,000 bytes ≈ 236 GiB (~230GB)
// 8-byte value: 2-byte partition followed by the low 6 bytes of the 8-byte offset
ByteBuffer.allocate(2).putShort(partition.toShort).array ++ ByteBuffer.allocate(8).putLong(offset).array.slice(2, 8)
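The same 8-byte layout (a 2-byte big-endian partition followed by the low 6 bytes of a big-endian 64-bit offset) can be sketched in Ruby with `Array#pack`. The names `pack_owner` and `unpack_owner` are hypothetical, and the round trip assumes the offset fits in 48 bits.

```ruby
# Pack partition (16-bit) and offset (low 48 bits) into 8 big-endian bytes.
def pack_owner(partition, offset)
  [partition].pack("s>") + [offset].pack("q>").byteslice(2, 6)
end

# Reverse of pack_owner: returns [partition, offset].
def unpack_owner(bytes)
  partition = bytes.byteslice(0, 2).unpack1("s>")
  # Restore the two dropped high bytes as zeros before decoding the offset.
  offset = ("\x00\x00".b + bytes.byteslice(2, 6)).unpack1("q>")
  [partition, offset]
end

packed = pack_owner(3, 1_234_567)
puts packed.bytesize
p unpack_owner(packed)
```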
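# bucket: the Redis hash key for these message_ids (how it is derived is not shown here)
client.pipeline do |pipeline|
  message_ids.each do |message_id, owner_id|
    pipeline.HSETNX(bucket, message_id, owner_id) # claim owner
    pipeline.HGET(bucket, message_id)             # determine winner
  end
end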
import java.nio.ByteBuffer
val uuid = java.util.UUID.fromString("ce059644-18a0-4f27-bc2b-c2a2d4d4e7bf")
val hi = uuid.getMostSignificantBits
val lo = uuid.getLeastSignificantBits
ByteBuffer.allocate(16).putLong(hi).putLong(lo).array
// => Array(-50, 5, -106, 68, 24, -96, 79, 39, -68, 43, -62, -94, -44, -44, -25, -65)
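For comparison, the same 36-character UUID string can be reduced to 16 raw bytes in Ruby by stripping the dashes and hex-decoding. This is a sketch; `uuid_to_bytes` is a hypothetical helper name. Ruby reports bytes as unsigned by default, so the signed view is included to line up with the JVM output above.

```ruby
# Convert a textual UUID into its 16-byte binary form.
def uuid_to_bytes(uuid)
  [uuid.delete("-")].pack("H*")
end

raw = uuid_to_bytes("ce059644-18a0-4f27-bc2b-c2a2d4d4e7bf")
puts raw.bytesize          # 16, versus 36 for the string form
p raw.unpack("c*")         # signed bytes, comparable to the Scala Array above
```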