@shakyaabiral
Last active September 11, 2023 13:27
Python script to dump data from or load data into Redis
"""
Install the following requirements into your virtual environemnt
`pip install click redis`
Usage:
To load data into redis
python redis_dump.py load [filepath]
To dump data into redis
python redis_dump.py dump [filepath] --search '*txt'
"""
import click
import redis
import json
import logging
import os
@click.command()
@click.argument('action')
@click.argument('filepath')
@click.option('--search', help="Key search patter. eg `*txt`")
def main(action, filepath, search):
r = redis.StrictRedis(host='127.0.0.1', port=6379, db=0) # update your redis settings
cache_timeout = None
if action == 'dump':
out = {}
for key in r.scan_iter(search):
out.update({key: r.get(key)})
if len(out) > 0:
try:
with open(filepath, 'w') as outfile:
json.dump(out, outfile)
print('Dump Successful')
except Exception as e:
print(e)
else:
print("Keys not found")
elif action == 'load':
try:
with open(filepath) as f:
data = json.load(f)
for key in data:
r.set(key, data.get(key), cache_timeout)
print('Data loaded into redis successfully')
except Exception as e:
print(e)
if __name__ == '__main__':
log_fmt = '%(asctime)s - %(name)s - %(levelname)s - %(message)s'
logging.basicConfig(level=logging.INFO, format=log_fmt)
main()
@vveliev-tc

Thanks for sharing.
I think there are faster load approaches available, if load speed is important for the task:

  • the fastest way would be to use redis-cli mass insertion (I was getting about 150k OPS on a local Docker setup; see the sketch after this list)
  • Python with a redis pipeline on the same machine was giving me about 5k OPS

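For the redis-cli route, the documented mass-insertion mode reads commands in the Redis protocol from stdin via `redis-cli --pipe`. A minimal sketch of generating such a protocol file in Python (the file names `users.txt` and `users.resp` are placeholders, not part of the gist):

    def gen_redis_proto(*args):
        # Encode one command in the Redis wire protocol (RESP)
        proto = '*{}\r\n'.format(len(args))
        for arg in args:
            arg = str(arg)
            proto += '${}\r\n{}\r\n'.format(len(arg.encode()), arg)
        return proto

    # newline='' stops Python from rewriting the \r\n the protocol requires
    with open('users.txt') as src, open('users.resp', 'w', newline='') as dst:
        for line in src:
            user = line.strip().lower()
            if user:
                dst.write(gen_redis_proto('SADD', 'users', user))

    # Then load everything in one shot:
    #   cat users.resp | redis-cli --pipe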
Here is the code snippet for the pipeline approach:

    def process_file(self, file_path):
        chunksize = 10000
        pipe = self.redis.pipeline()
        with open(file_path, "r") as process_f:
            line_index = 0
            for line in process_f:
                try:
                    line_index += 1
                    # Expect "first last" per line; a line without a space raises ValueError
                    first_name, last_name = line.strip().split(' ', 1)
                    if not last_name:
                        continue  # `next` was a no-op statement here; `continue` skips the line
                    user = first_name.strip() + '.' + last_name.strip()
                    pipe.sadd('users', user.lower())
                    if line_index % chunksize == 0:
                        # Flush the pipeline once per chunk, not once per line
                        pipe.execute()
                        logging.info("Processing index %s", line_index)
                except ValueError:
                    logging.warning("Skipping malformed line %s", line_index)
            # Flush the remainder after the loop; the original `finally` block
            # executed the pipeline on every line, which defeated the chunking
            pipe.execute()
            logging.info("Number of records %s", self.redis.scard('users'))

@IliaFeldgun

r.get(key) won't work with value types other than string.
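
A sketch of a type-agnostic alternative, built on Redis's DUMP/RESTORE commands, which serialize any value type opaquely; base64 makes the binary payload JSON-safe (the function names here are illustrative, not part of the gist):

    import base64
    import json
    import redis

    # Raw client on purpose: DUMP returns binary data, so decode_responses must stay off
    r = redis.StrictRedis(host='127.0.0.1', port=6379, db=0)

    def dump_any(filepath, search='*'):
        out = {}
        for key in r.scan_iter(search):
            payload = r.dump(key)    # works for strings, lists, hashes, sets, zsets
            if payload is not None:  # key may have expired mid-scan
                out[key.decode()] = base64.b64encode(payload).decode()
        with open(filepath, 'w') as f:
            json.dump(out, f)

    def load_any(filepath):
        with open(filepath) as f:
            data = json.load(f)
        for key, blob in data.items():
            # ttl=0 means no expiry; replace=True overwrites existing keys
            r.restore(key, 0, base64.b64decode(blob), replace=True)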
