Implement a capped pool which starts dropping messages once it reaches <size>. This will have a much lower failure rate (due to load) than a database or standard queue.
For example, if there is a burst of data coming in, we'd first stick it into the pool, and then fire off a job to the queue (the job would be argless and simply say "get data from pool"; it could also eventually be replaced with a continuous processor). This guarantees that your queue isn't overflowing with large amounts of data, but rather with potential no-ops (just like our buffer implementation), and also ensures you don't "get behind": instead you simply lose messages.
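The drop-on-full behavior can be sketched in plain Python (a hypothetical in-memory model, not the Redis implementation described below; the class and its semantics are assumptions for illustration):

```python
from collections import deque

class CappedPool:
    """In-memory sketch of the capped pool: accepts writes until it holds
    `size` messages, then silently drops new ones."""

    def __init__(self, size):
        self.size = size
        self.items = deque()

    def add(self, item):
        if len(self.items) >= self.size:
            return False  # pool is full: the message is simply lost
        self.items.append(item)
        return True

    def get(self):
        # an argless "get data from pool" job; may be a no-op if the pool is empty
        return self.items.popleft() if self.items else None

pool = CappedPool(size=3)
accepted = sum(pool.add(n) for n in range(10))  # burst of 10 messages
```

Here a burst of 10 messages leaves only 3 in the pool; all 10 queued jobs are cheap, but 7 of them turn out to be no-ops when they find nothing to fetch.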
Add the entry to a sorted set, and trim the sorted set to SIZE. Entries get a random score between 0 and sys.maxint (small chance of collision).
with redis.map() as conn:
    conn.zadd(key, giant_blob_of_data, random.randint(0, sys.maxint))
    # ranks are 0-indexed, so removing ranks pool_size..-1 keeps pool_size entries
    conn.zremrangebyrank(key, pool_size, -1)
Get an entry by doing a zrange and a zremrangebyrank:
with redis.map() as conn:
    # 0, 0 is an inclusive rank range: fetch, then remove, a single entry
    item = conn.zrange(key, 0, 0)
    conn.zremrangebyrank(key, 0, 0)
One thing of note here is that any item assigned a higher score has a higher chance of getting trimmed (since trimming removes the highest ranks). This is probably statistically insignificant, but worth mentioning.
We use sys.maxint (instead of pool_size) as the maximum score value because we want to ensure that, even with a low volume of data, there's a very low chance of a score collision. (This is also more statistically correct.)
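A quick birthday-problem estimate makes the difference concrete (a sketch: the 1,000 live entries and the 10,000 pool_size are made-up numbers, and Python 3's sys.maxsize stands in for sys.maxint):

```python
import math
import sys

def collision_probability(n, score_space):
    """Birthday-problem approximation for n random scores drawn
    uniformly from a space of `score_space` values."""
    return 1 - math.exp(-n * (n - 1) / (2.0 * score_space))

n = 1000                                        # hypothetical number of live entries
p_narrow = collision_probability(n, 10000)      # scores drawn from [0, pool_size]
p_wide = collision_probability(n, sys.maxsize)  # scores drawn from [0, sys.maxint]
```

With the narrow range a collision is near-certain; with the full integer range it is vanishingly unlikely.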
Adding an entry is a single call:
redis.zadd(key, giant_blob_of_data, random.randint(0, pool_size))
Get an entry by doing a zrangebyscore and a zremrangebyscore:
val = random.randint(0, pool_size)
with redis.map() as conn:
    # probe upward and downward from the random score; at least one side will match
    item_a = conn.zrangebyscore(key, val, '+inf', start=0, num=1, withscores=True)
    item_b = conn.zrevrangebyscore(key, val, '-inf', start=0, num=1, withscores=True)
# pick either item, doesn't matter
item, score = (item_a or item_b)[0]
# remove the matching scored item
redis.zremrangebyscore(key, score, score)
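The pick-either-side trick can be modeled in plain Python with bisect (a sketch: a sorted list of (score, member) pairs stands in for the Redis sorted set, and fetch_random is a hypothetical name):

```python
import bisect
import random

def fetch_random(scored_items, rng=random):
    """Pop a pseudo-random entry from a sorted list of (score, member)
    pairs, mimicking the zrangebyscore/zrevrangebyscore probe."""
    if not scored_items:
        return None
    val = rng.randint(0, scored_items[-1][0])
    # first entry with score >= val (the zrangebyscore side) ...
    idx = bisect.bisect_left(scored_items, (val,))
    if idx == len(scored_items):
        # ... else fall back to the highest score below (the zrevrangebyscore side)
        idx -= 1
    return scored_items.pop(idx)
```

Because entries with scores near each other are interchangeable here, which side wins genuinely doesn't matter, matching the "pick either item" comment above.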
I tend to think having a hash where you HINCRBY exceptions to get a count (reducing memory overhead), plus a separate set to maintain cardinality, would be a good idea.
something like
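A sketch of that hash + set layout (the key names and helper are hypothetical, and a tiny in-memory stand-in plays the part of Redis so the HINCRBY/SADD/SCARD semantics are visible; real code would issue the same commands over a connection):

```python
class MiniRedis:
    """Minimal in-memory stand-in for the two Redis structures involved."""

    def __init__(self):
        self.hashes = {}
        self.sets = {}

    def hincrby(self, key, field, amount=1):
        h = self.hashes.setdefault(key, {})
        h[field] = h.get(field, 0) + amount
        return h[field]

    def sadd(self, key, member):
        s = self.sets.setdefault(key, set())
        if member in s:
            return 0
        s.add(member)
        return 1

    def scard(self, key):
        return len(self.sets.get(key, set()))

def record_exception(conn, exc_hash):
    # HINCRBY: one hash field per distinct exception keeps per-exception counts cheap
    count = conn.hincrby('exc:counts', exc_hash, 1)
    # SADD: the companion set answers "how many distinct exceptions?" via SCARD
    conn.sadd('exc:seen', exc_hash)
    return count
```

The hash holds one small integer per distinct exception rather than one entry per occurrence, and SCARD on the set gives the cardinality in O(1).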