Last active
September 2, 2020 12:43
Speed up bulk-exists check with Python sets
# Code for blog article:
# https://www.helmut.dev/speed-up-bulk-exists-check-with-python-sets.html

# When you need to do an "exists" check for a high volume of
# items (think hundreds of thousands or more), doing a query
# for each will take ages and put a strain on your database.
# Rather, you could extract all the relevant field values
# into a list and then check whether it contains each record's
# reference value. However, a membership check on a list scans
# it element by element, which is also expensive in terms of
# computing time. A set uses hash-based lookups instead, which
# improves performance dramatically.

# Standard "exists" check: one query per item
for item in external_records:
    if not Data.objects.filter(external_id=item.id).exists():
        pass  # Do something

# Bulk "exists" check for very high volumes: one query in total
existing_ids = set(Data.objects.values_list("external_id", flat=True))
for item in external_records:
    if item.id not in existing_ids:
        pass  # Do something
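
The list-versus-set claim in the comments above can be checked without a database. The following is a minimal sketch using plain integers in place of `external_id` values; the names `existing_ids_list`, `existing_ids_set`, and `incoming_ids` and the sizes are made up for illustration, and actual timings will vary by machine.

```python
import timeit

# Hypothetical stand-ins for the ids already stored and the ids to check.
existing_ids_list = list(range(0, 20_000, 2))  # even numbers only
existing_ids_set = set(existing_ids_list)
incoming_ids = list(range(2_000))


def missing_with_list():
    # Each "not in" scans the list element by element.
    return [i for i in incoming_ids if i not in existing_ids_list]


def missing_with_set():
    # Each "not in" is a hash lookup.
    return [i for i in incoming_ids if i not in existing_ids_set]


if __name__ == "__main__":
    list_seconds = timeit.timeit(missing_with_list, number=1)
    set_seconds = timeit.timeit(missing_with_set, number=1)
    print(f"list: {list_seconds:.4f}s  set: {set_seconds:.4f}s")
```

Both functions return the same ids; only the cost of each membership test differs, and the gap widens as the number of stored ids grows.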
At a certain scale you can't pull back all the external ids in your DB. It's better to ask the database which of the ids you already have in hand exist.
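
One way to read this suggestion is sketched below; it is an assumption about what the commenter had in mind, not their code. The helper names `chunked` and `find_missing_ids`, the `fetch_existing` callable, and the default chunk size are all hypothetical. The idea is to send the candidate ids to the database in batches and get back only the ones that exist, so memory use depends on the batch size rather than on the table size.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def chunked(items: Iterable, size: int) -> Iterator[list]:
    """Yield successive lists of at most `size` items."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def find_missing_ids(
    candidate_ids: Iterable,
    fetch_existing: Callable[[list], Iterable],
    chunk_size: int = 1000,  # arbitrary default; tune for your database
) -> List:
    """Return the candidate ids that fetch_existing does not report back."""
    missing = []
    for chunk in chunked(candidate_ids, chunk_size):
        # One query per chunk instead of one per item or one for the whole table.
        existing = set(fetch_existing(chunk))
        missing.extend(i for i in chunk if i not in existing)
    return missing


# With the Data model from the gist, fetch_existing could be (assumed usage):
#   lambda chunk: Data.objects.filter(external_id__in=chunk)
#                     .values_list("external_id", flat=True)
```

The set is still what makes the per-chunk membership check cheap; the difference is that it only ever holds the ids returned for one batch.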