Last active
December 11, 2023 21:53
-
-
Save nickyreinert/5fe95435ce28cc559bc11e43e83eabfb to your computer and use it in GitHub Desktop.
Improve Calculation of Prime Numbers in Python
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
{ | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"# Prime Number Generation\n", | |
"We need all prime numbers up to 50.000. Now. Quick. Whats the best approach?\n", | |
"\n", | |
"Let's start with what we know: A prime number has two natural divisors, 1 a the number itself. \n", | |
"\n", | |
"We can utilize the modulo operator, that returns 0, if number / divisor leaves no remainder. \n", | |
"\n", | |
"If that's the case, the number is not prime. " | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 44, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"def isPrime1(number):\n", | |
" for i in range(2, number):\n", | |
"\n", | |
" if (number % i) == 0:\n", | |
" return False\n", | |
" \n", | |
" return True" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Let's measure the time this algorithm needs to find all primes up to 50.000:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 45, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"found 5133 primes\n", | |
"CPU times: user 6.2 s, sys: 106 ms, total: 6.31 s\n", | |
"Wall time: 6.57 s\n" | |
] | |
} | |
], | |
"source": [ | |
"%%time\n", | |
"\n", | |
"count = 0\n", | |
"\n", | |
"for number in range(2, 50000):\n", | |
" if isPrime1(number):\n", | |
" count += 1\n", | |
"\n", | |
"print(f'found {count} primes')" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Around 6 seconds. That can be improved. Let's tinker the algorithm. \n", | |
"\n", | |
"It also makes sense to test divisors up to a given limit: Every number greater than \"number / 2\" would lead to a result with decimals. " | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 46, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"def isPrime2(number):\n", | |
" limit = int(number / 2)\n", | |
" for i in range(2, limit + 1):\n", | |
"\n", | |
" if (number % i) == 0:\n", | |
" return False\n", | |
" \n", | |
" return True" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Let's see how this approach performs:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 47, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"found 5133 primes\n", | |
"CPU times: user 2.95 s, sys: 40.2 ms, total: 2.99 s\n", | |
"Wall time: 3.08 s\n" | |
] | |
} | |
], | |
"source": [ | |
"%%time\n", | |
"\n", | |
"count = 0\n", | |
"\n", | |
"for number in range(2, 50000):\n", | |
" if isPrime2(number):\n", | |
" count += 1\n", | |
"\n", | |
"print(f'found {count} primes')" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Half the time? Wonderful! There's another trick regarding this limit: Don't test numbers greater than square root of the number we are looking for. Because if one divisor is greater than sqrt(number) the other divisor must be lower than sqrt(divisor)." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 51, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"import math\n", | |
"def isPrime3(number):\n", | |
" limit = int(math.sqrt(number))\n", | |
" for i in range(2, limit + 1):\n", | |
"\n", | |
" if (number % i) == 0:\n", | |
" return False\n", | |
" \n", | |
" return True" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 52, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"found 5133 primes\n", | |
"CPU times: user 66.9 ms, sys: 4.52 ms, total: 71.5 ms\n", | |
"Wall time: 72 ms\n" | |
] | |
} | |
], | |
"source": [ | |
"%%time\n", | |
"\n", | |
"count = 0\n", | |
"\n", | |
"for number in range(2, 50000):\n", | |
" if isPrime3(number):\n", | |
" count += 1\n", | |
"\n", | |
"print(f'found {count} primes')" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Wow, that is impressive! Let's add another step: It's fair to ignore even numbers. \n", | |
"\n", | |
"First of all: An even number is naturaly not a prime. So we can skip all even numbers when checking numbers up to 50.000.\n", | |
"\n", | |
"And besides that: We can also skip even numbers in our testing loop, because an even number in a multiplication always leads to an even result. \n", | |
"\n", | |
"But as a prime number is always odd (except the 2, of course), it cannot have an even number as a factor. So checking an even number would always lead to a modulo unequal 0. \n", | |
"\n", | |
"We can achieve this by checking the last digit of the given number:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 57, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"import math\n", | |
"def isPrime4(number):\n", | |
" limit = int(math.sqrt(number))\n", | |
" for i in range(2, limit + 1):\n", | |
"\n", | |
" if str(i)[-1] in [0,2,4,5,6,8]:\n", | |
" continue\n", | |
"\n", | |
" if (number % i) == 0:\n", | |
" return False\n", | |
" \n", | |
" return True" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"We add the same test to our main loop:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 58, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"found 5133 primes\n", | |
"CPU times: user 255 ms, sys: 5.94 ms, total: 261 ms\n", | |
"Wall time: 264 ms\n" | |
] | |
} | |
], | |
"source": [ | |
"%%time\n", | |
"\n", | |
"count = 0\n", | |
"\n", | |
"for number in range(2, 50000):\n", | |
" if str(number)[-1] in [0,2,4,5,6,8]:\n", | |
" continue\n", | |
" if isPrime4(number):\n", | |
" count += 1\n", | |
"\n", | |
"print(f'found {count} primes')" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Ok, something went wrong. Why is the loop slower? Let's try something more sophisticated. \n", | |
"\n", | |
"Right now we are using a string operation for this odd/even test. But Python allows a smarter way: We can define the step size for range(). The range now needs to start at 3: " | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 59, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"import math\n", | |
"def isPrime5(number):\n", | |
" limit = int(math.sqrt(number))\n", | |
" for i in range(3, limit + 1, 2):\n", | |
"\n", | |
" if (number % i) == 0:\n", | |
" return False\n", | |
" \n", | |
" return True" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Same for the main loop. The test will start at 3 so we increment count by 1 to also honor the magic 2:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 61, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"found 5133 primes\n", | |
"CPU times: user 41.1 ms, sys: 3.3 ms, total: 44.4 ms\n", | |
"Wall time: 45.5 ms\n" | |
] | |
} | |
], | |
"source": [ | |
"%%time\n", | |
"\n", | |
"count = 1\n", | |
"for number in range(3, 50000, 2):\n", | |
"\n", | |
" if isPrime5(number):\n", | |
" count += 1\n", | |
"\n", | |
"print(f'found {count} primes')" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"And there we are, back on the road to perfect optimisation! Remember how we startet with 6 seconds?" | |
] | |
} | |
], | |
"metadata": { | |
"kernelspec": { | |
"display_name": "Python 3", | |
"language": "python", | |
"name": "python3" | |
}, | |
"language_info": { | |
"codemirror_mode": { | |
"name": "ipython", | |
"version": 3 | |
}, | |
"file_extension": ".py", | |
"mimetype": "text/x-python", | |
"name": "python", | |
"nbconvert_exporter": "python", | |
"pygments_lexer": "ipython3", | |
"version": "3.11.4" | |
} | |
}, | |
"nbformat": 4, | |
"nbformat_minor": 2 | |
} |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment