Skip to content

Instantly share code, notes, and snippets.

@ZechCodes
Last active May 2, 2021 16:52
Show Gist options
  • Save ZechCodes/472d317fbb699446e4b31ea1d5283446 to your computer and use it in GitHub Desktop.
Save ZechCodes/472d317fbb699446e4b31ea1d5283446 to your computer and use it in GitHub Desktop.
Sets vs Lists in Python

Sets vs Lists in Python

This is just a fun demo of how lists and sets handle the in lookup.

Code

class Testing:
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return f"<{self.name}>"

    def __eq__(self, other):
        print(f"{other!r} == {self!r}?")
        return other.name == self.name

    def __hash__(self):
        print("hash ", self)
        return hash(self.name)


print(" Testing List ".center(24, "="))
print(" Storing ".center(24, "-"))
x = [Testing(i) for i in range(20)]

print()
print(" Looking Up ".center(24, "-"))
contains = Testing(10) in x

print()
print(f"It is {'not ' * (not contains)}in the list")

print()
print(" Testing Set ".center(24, "="))
print(" Storing ".center(24, "-"))
x = {Testing(i) for i in range(20)}

print()
print(" Looking Up ".center(24, "-"))
contains = Testing(10) in x

print()
print(f"It is {'not ' * (not contains)}in the set")

We're just creating a custom object that compares and hashes based on a name we give it.

Output

===== Testing List =====
------- Storing --------

------ Looking Up ------
<0> == <10>?
<1> == <10>?
<2> == <10>?
<3> == <10>?
<4> == <10>?
<5> == <10>?
<6> == <10>?
<7> == <10>?
<8> == <10>?
<9> == <10>?
<10> == <10>?

It is in the list

===== Testing Set ======
------- Storing --------
hash  <0>
hash  <1>
hash  <2>
hash  <3>
hash  <4>
hash  <5>
hash  <6>
hash  <7>
hash  <8>
hash  <9>
hash  <10>
hash  <11>
hash  <12>
hash  <13>
hash  <14>
hash  <15>
hash  <16>
hash  <17>
hash  <18>
hash  <19>

------ Looking Up ------
hash  <10>
<10> == <10>?

It is in the set

Conclusion

The list doesn't call any methods on the object when storing it while the set calls the hash method on each item. When doing the in lookup the list has to iterate through each item until it finds a match while set just gets the hash of the item and grabs the matching item before comparing them. So list requires a linear number of checks while set (typically) only needs a single comparison.

Bonus

Here's a snippet that could result in a set lookup taking more than one comparison. It hashes on 1 value but compares on 2.

class Testing:
    def __init__(self, name, value):
        self.name = name
        self.value = value

    def __repr__(self):
        return f"<{self.name}:{self.value}>"

    def __eq__(self, other):
        print(f"{other!r} == {self!r}?")
        return other.name == self.name and other.value == self.value

    def __hash__(self):
        print("hash ", self)
        return hash(self.name)


print(" Testing Set ".center(24, "="))
print(" Storing ".center(24, "-"))
x = {Testing(i, 1) for i in range(5)}
x.update(Testing(i, 2) for i in range(5))

print()
print(" Looking Up ".center(24, "-"))
contains = Testing(1, 2) in x

print()
print(f"It is {'not ' * (not contains)}in the set")
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment