Skip to content

Instantly share code, notes, and snippets.

Last active Jul 17, 2021
What would you like to do?
Use Hypothesis for smoketests of
Runs PBTs on `bytes.count` and `bytes.find`. Also confirms results match Python 3.9.
How to use:
- build CPython 3.11 from source
- run ./python.exe -m ensurepip
- run ./python.exe -m pip install hypothesis
- run ./python.exe # this file
Note: `MAX` value below generates tests that on my machine execute between
2 - 5 minutes. Decrease it if it's too slow for you.
import atexit
from collections import Counter
import os
import subprocess
from tempfile import NamedTemporaryFile
from textwrap import dedent
import unittest
from hypothesis import given, settings, HealthCheck
from hypothesis.strategies import binary
MAX = 256
stats: Counter[int] = Counter()
too_slow = HealthCheck.too_slow
def confirm_count_on_python39(
needle: bytes, haystack: bytes, count: int, found: int
) -> None:
content = dedent(
needle = {needle!r}
haystack = {haystack!r}
count = {count!r}
found = {found!r}
assert haystack.count(needle) == count
assert haystack.find(needle) == found
with NamedTemporaryFile("w", suffix=".py", delete=False) as f:
try:["python3.9",], check=True)
class TestCount(unittest.TestCase):
def test_count_of_self_is_one(self, b):
stats['t1'] += 1
self.assertEqual(b.count(b), 1)
@settings(deadline=None, suppress_health_check=[too_slow])
@given(binary(max_size=MAX), binary(min_size=MAX + 1, max_size=16 * MAX))
def test_count_doesnt_crash(self, needle, haystack):
stats['t2'] += 1
count = haystack.count(needle)
self.assertGreaterEqual(count, 0)
if count:
stats['t2.count'] += 1
found = haystack.find(needle)
self.assertNotEqual(found, -1)
confirm_count_on_python39(needle, haystack, count, found)
self.assertEqual(needle.count(haystack), 0)
self.assertEqual(needle.find(haystack), -1)
mid = len(haystack) // 2
for i in range(1, 100, 3):
needle = haystack[mid:mid+i]
found = haystack.find(needle)
self.assertLessEqual(found, mid)
if found == mid:
stats['t2.found'] += 1
count = haystack.count(needle)
confirm_count_on_python39(needle, haystack, count, found)
def print_stats():
if __name__ == "__main__":
Copy link

ambv commented Jul 16, 2021

@Zac-HD, any simple ideas how to make those more interesting?

Copy link

Zac-HD commented Jul 17, 2021

  • test_count_doesnt_crash will have a very large minimum example size, with len(haystack) >= 157. Probably not worth doing anything about this, but FYI.
  • Maybe restricting your bytestrings to a small alphabet could be interesting, by increaing collisions? e.g. with from_regex(b"[abc]{257,}", fullmatch=True)
  • Driving this with a coverage-guided fuzzer would probably be interesting; I'd suggest Atheris since there's a lot of C code involved. If you want to try HypoFuzz for Python code though just let me know, OSS devs are welcome to a free copy.

Otherwise this looks good to me!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment