Skip to content

Instantly share code, notes, and snippets.

@kvedala
Last active April 29, 2020 03:02
Show Gist options
  • Save kvedala/9713842d5e264c94e293a930b479f597 to your computer and use it in GitHub Desktop.
Save kvedala/9713842d5e264c94e293a930b479f597 to your computer and use it in GitHub Desktop.
Durand-Kerner Roots - Python
Display the source blob
Display the rendered blob
Raw
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Display the source blob
Display the rendered blob
Raw
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"_draft": {
"nbviewer_url": "https://gist.github.com/9713842d5e264c94e293a930b479f597"
},
"gist": {
"id": "9713842d5e264c94e293a930b479f597",
"data": {
"description": "make the cofficient of x^n equal to 1. This improves the stability of the algorithm",
"public": true
}
},
"hide_input": false,
"kernelspec": {
"name": "python3",
"display_name": "Python 3",
"language": "python"
},
"nbTranslate": {
"hotkey": "alt-t",
"sourceLang": "en",
"targetLang": "fr",
"displayLangs": [
"*"
],
"langInMainMenu": true,
"useGoogleTranslate": true
},
"toc": {
"nav_menu": {},
"number_sections": true,
"sideBar": true,
"skip_h1_title": false,
"base_numbering": 1,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": true,
"toc_position": {},
"toc_section_display": true,
"toc_window_display": false
},
"varInspector": {
"window_display": true,
"cols": {
"lenName": 16,
"lenType": 16,
"lenVar": 40
},
"kernels_config": {
"python": {
"library": "var_list.py",
"delete_cmd_prefix": "del ",
"delete_cmd_postfix": "",
"varRefreshCmd": "print(var_dic_list())"
},
"r": {
"library": "var_list.r",
"delete_cmd_prefix": "rm(",
"delete_cmd_postfix": ") ",
"varRefreshCmd": "cat(var_dic_list()) "
}
},
"types_to_exclude": [
"module",
"function",
"builtin_function_or_method",
"instance",
"_Feature"
],
"position": {
"left": "1228px",
"top": "197px",
"width": "312px",
"height": "384px",
"right": "20px"
}
},
"language_info": {
"name": "python",
"version": "3.7.7",
"mimetype": "text/x-python",
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"pygments_lexer": "ipython3",
"nbconvert_exporter": "python",
"file_extension": ".py"
},
"colab": {
"name": "Durand-Kerner Roots - Python",
"provenance": [],
"collapsed_sections": [],
"include_colab_link": true
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/kvedala/9713842d5e264c94e293a930b479f597/notebook.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"metadata": {
"toc": true,
"id": "3Fi7Okap2gJs",
"colab_type": "text"
},
"source": [
"<h1>Table of Contents<span class=\"tocSkip\"></span></h1>\n",
"<div class=\"toc\"><ul class=\"toc-item\"><li><span><a href=\"#Fast-Numerical-algorithm-to-find-all-the-roots-of-an-n-th-degree-polynomial\" data-toc-modified-id=\"Fast-Numerical-algorithm-to-find-all-the-roots-of-an-n-th-degree-polynomial-1\"><span class=\"toc-item-num\">1&nbsp;&nbsp;</span>Fast-Numerical algorithm to find all the roots of an n-th degree polynomial</a></span></li><li><span><a href=\"#Using-Numba\" data-toc-modified-id=\"Using-Numba-2\"><span class=\"toc-item-num\">2&nbsp;&nbsp;</span>Using Numba</a></span></li><li><span><a href=\"#Using-Cython\" data-toc-modified-id=\"Using-Cython-3\"><span class=\"toc-item-num\">3&nbsp;&nbsp;</span>Using Cython</a></span></li><li><span><a href=\"#Testing-&amp;-Benchmarking\" data-toc-modified-id=\"Testing-&amp;-Benchmarking-4\"><span class=\"toc-item-num\">4&nbsp;&nbsp;</span>Testing &amp; Benchmarking</a></span><ul class=\"toc-item\"><li><span><a href=\"#Known-bugs-in-Durand–Kerner-algorithm\" data-toc-modified-id=\"Known-bugs-in-Durand–Kerner-algorithm-4.1\"><span class=\"toc-item-num\">4.1&nbsp;&nbsp;</span>Known bugs in Durand–Kerner algorithm</a></span><ul class=\"toc-item\"><li><span><a href=\"#Fail-cases\" data-toc-modified-id=\"Fail-cases-4.1.1\"><span class=\"toc-item-num\">4.1.1&nbsp;&nbsp;</span>Fail cases</a></span></li><li><span><a href=\"#Fail-cases-resolved\" data-toc-modified-id=\"Fail-cases-resolved-4.1.2\"><span class=\"toc-item-num\">4.1.2&nbsp;&nbsp;</span>Fail cases resolved</a></span></li></ul></li></ul></li></ul></div>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "kRgF7Ylg2gJx",
"colab_type": "text"
},
"source": [
"# Fast-Numerical algorithm to find all the roots of an n-th degree polynomial\n",
"The algorithm is an implementation of [Durand–Kerner method](https://en.wikipedia.org/wiki/Durand–Kerner_method). The algorithm is implemented using the [numba](https://numba.pydata.org) optimizer and also using [cython](http://cython.readthedocs.io/). The two implementations are then compared with the implementation of numpy and we see that both usually perform better than `numpy.solve`. "
]
},
{
"cell_type": "code",
"metadata": {
"ExecuteTime": {
"end_time": "2020-02-29T22:11:53.086454Z",
"start_time": "2020-02-29T22:11:50.238096Z"
},
"run_control": {
"marked": false
},
"trusted": true,
"id": "vxpdkFwO2gJ3",
"colab_type": "code",
"colab": {}
},
"source": [
"import numpy as np\n",
"import numba\n",
"from typing import Callable, List\n",
"from math import sqrt\n",
"%reload_ext cython"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "1WRuKQsp2gKg",
"colab_type": "text"
},
"source": [
"# Using Numba"
]
},
{
"cell_type": "code",
"metadata": {
"ExecuteTime": {
"end_time": "2020-02-29T22:11:53.120963Z",
"start_time": "2020-02-29T22:11:53.091450Z"
},
"trusted": true,
"id": "vYjWyN982gKk",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 35
},
"outputId": "f3fc8fa9-966e-4963-c93b-7ea114c2d88f"
},
"source": [
"@numba.njit(nogil=True, boundscheck=False, cache=True)\n",
"def f(poly, x: complex) -> complex:\n",
" out = 0 + 0j\n",
" L = len(poly)\n",
" for i in range(L):\n",
" k = L - i - 1\n",
" if poly[k] != 0:\n",
" out += poly[k] * pow(x, i)\n",
"\n",
"# if poly[i] != 0:\n",
"# out += poly[i] * (x ** i)\n",
" return out\n",
"\n",
"\n",
"@numba.jit(parallel=True, \n",
" nopython=True, boundscheck=False, nogil=True, cache=True)\n",
"def solve(poly: np.ndarray, max_iter: int = 50000):\n",
" epsilon = 5e-16\n",
" ii = 0\n",
" scale = 500\n",
" L = poly.shape[0] - 1\n",
" roots = (np.random.random(L) + np.random.random(L) * 1j) - (0.5 + 0.5j)\n",
" roots *= scale\n",
" max_delta = 1\n",
"\n",
" while max_delta > epsilon and ii < max_iter:\n",
" max_delta = 0\n",
" ii += 1\n",
" if L > 20:\n",
" for i in numba.prange(L): # for ith root\n",
" deno = 1 + 0j\n",
" delta = f(poly, roots[i])\n",
" for k in range(L):\n",
" if k != i:\n",
" delta /= (roots[i] - roots[k])\n",
" roots[i] -= delta\n",
" # print(ii, roots, delta)\n",
" max_delta += ((delta.real * delta.real) +\n",
" (delta.imag * delta.imag)) / L\n",
" else:\n",
" for i in range(L): # for ith root\n",
" deno = 1 + 0j\n",
" delta = f(poly, roots[i])\n",
" for k in range(L):\n",
" if k != i:\n",
" delta /= (roots[i] - roots[k])\n",
" roots[i] -= delta\n",
" # print(ii, roots, delta)\n",
" max_delta += ((delta.real * delta.real) +\n",
" (delta.imag * delta.imag)) / L\n",
"\n",
"\n",
" max_delta = np.sqrt(max_delta)\n",
" return roots, max_delta\n",
"\n",
"\n",
"solve(np.array([1, 0, -1]))"
],
"execution_count": 2,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"(array([ 1.+0.j, -1.+0.j]), 1.7849884655831547e-29)"
]
},
"metadata": {
"tags": []
},
"execution_count": 2
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "3u-SFiz-2gK0",
"colab_type": "text"
},
"source": [
"# Using Cython"
]
},
{
"cell_type": "code",
"metadata": {
"ExecuteTime": {
"end_time": "2020-02-29T22:12:11.454576Z",
"start_time": "2020-02-29T22:11:53.126778Z"
},
"scrolled": false,
"trusted": true,
"id": "GywU0ooa2gK3",
"colab_type": "code",
"colab": {}
},
"source": [
"%%cython -c=-Ofast -c=-flto=thin\n",
"# cython: boundscheck=False, wraparound=False, cdivision=True\n",
"\n",
"cimport numpy as np\n",
"import numpy as np\n",
"\n",
"cimport cython\n",
"from cython.parallel cimport prange\n",
"\n",
"# from libc.math cimport abs, sqrt, fmax\n",
"# from libcpp.complex cimport pow as cpow, real, imag, abs\n",
"\n",
"cdef extern from \"complex.h\" nogil:\n",
" np.complex128_t cpow(np.complex128_t, long)\n",
" double cabs(np.complex128_t) \n",
" double creal(np.complex128_t) \n",
" double cimag(np.complex128_t) \n",
"\n",
"cdef extern from \"math.h\" nogil:\n",
" long double sqrtl(long double)\n",
" long double fabsl(long double)\n",
" long double fmaxl(long double, long double)\n",
"\n",
" \n",
"cdef np.complex128_t f_c(double[:] poly, np.complex128_t x) nogil:\n",
" cdef:\n",
" np.complex128_t out = 0 + 0j\n",
" long i, k, L = poly.shape[0]\n",
" \n",
" for i in range(L):\n",
" k = L - i - 1\n",
" if poly[k] != 0:\n",
" out += poly[k] * cpow(x, i)\n",
"# if poly[i] != 0:\n",
"# out += <np.complex128_t> poly[i] * (x ** i)\n",
" return out \n",
"\n",
"\n",
"def solve_c(double[:] poly, long max_iter = 500000):\n",
" cdef:\n",
" double epsilon = 5e-10\n",
" long ii = 0, i, k, L = poly.shape[0] - 1\n",
" double scale = 500.0\n",
" np.ndarray[np.complex128_t, ndim=1] roots = ((np.random.random(L) + np.random.random(L) * 1j) \\\n",
" - (.5 + .5j)) * scale\n",
"# np.complex128_t[:] roots = _roots\n",
" np.complex128_t deno, delta\n",
" long double max_delta = 1, d2, rd, id\n",
" \n",
" with nogil:\n",
" while max_delta > epsilon and ii < max_iter:\n",
" max_delta = 0\n",
" ii += 1\n",
" for i in range(L):\n",
"# deno = 1 + 0j\n",
"# delta = 0 + 0j\n",
" delta = f_c(poly, roots[i])\n",
" for k in range(L):\n",
" if k != i:\n",
" delta /= (roots[i] - roots[k])\n",
" roots[i] -= delta\n",
" rd = creal(delta)\n",
" id = cimag(delta)\n",
" max_delta += ((rd * rd) + (id * id)) / L\n",
"# with gil:\n",
"# print(ii, roots, delta)\n",
" max_delta = sqrtl(max_delta)\n",
"# print(\"max_delta = \", max_delta)\n",
" return roots, max_delta"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "mjhmrUtv2gLF",
"colab_type": "text"
},
"source": [
"# Testing & Benchmarking"
]
},
{
"cell_type": "code",
"metadata": {
"ExecuteTime": {
"end_time": "2020-02-29T22:12:11.494565Z",
"start_time": "2020-02-29T22:12:11.466458Z"
},
"trusted": true,
"id": "w3tEdDny2gLI",
"colab_type": "code",
"colab": {}
},
"source": [
"def benchmark(polynom):\n",
" L = len(polynom)\n",
" msg = ''\n",
" for i in range(L):\n",
" if polynom[i] == 0:\n",
" continue\n",
" if i > 0:\n",
" msg += '+ '\n",
" msg += '({})x^{} '.format(polynom[i], L-i-1)\n",
" print(msg)\n",
" print('--------------')\n",
" \n",
" print(\"Using numpy.solve\")\n",
" %timeit np.roots(polynom)\n",
" roots1 = np.roots(polynom)\n",
" print(roots1)\n",
" print('--------------')\n",
" print(\"Using numba implementation\")\n",
" %timeit roots = solve(polynom)\n",
" roots = solve(polynom)\n",
" print(roots)\n",
" print('--------------')\n",
" print(\"Using cython implementation\")\n",
" %timeit roots2, _ = solve_c(polynom)\n",
" roots2, _ = solve_c(polynom)\n",
" print(roots2)\n",
" print('--------------')"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "I-qkJ6bu2gLb",
"colab_type": "text"
},
"source": [
"## Known bugs in Durand–Kerner algorithm\n",
"### Fail cases"
]
},
{
"cell_type": "code",
"metadata": {
"trusted": true,
"id": "YKCUNO8R2gLi",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 623
},
"outputId": "aa2a5b0a-92f8-4a11-c05d-5933b5b9d25e"
},
"source": [
"# 1 - fail example:\n",
"polynom = np.array([4, 0, -1], dtype=float)\n",
"benchmark(polynom)\n",
"print('================')\n",
"# 2 - fail example\n",
"polynom = np.array([2, 13, -7], dtype=float)\n",
"benchmark(polynom)"
],
"execution_count": 6,
"outputs": [
{
"output_type": "stream",
"text": [
"(4.0)x^2 + (-1.0)x^0 \n",
"--------------\n",
"Using numpy.solve\n",
"The slowest run took 399.88 times longer than the fastest. This could mean that an intermediate result is being cached.\n",
"10000 loops, best of 3: 59 µs per loop\n",
"[-0.5 0.5]\n",
"--------------\n",
"Using numba implementation\n",
"The slowest run took 19131.08 times longer than the fastest. This could mean that an intermediate result is being cached.\n",
"1 loop, best of 3: 75.6 µs per loop\n",
"(array([-inf-infj, nan+nanj]), nan)\n",
"--------------\n",
"Using cython implementation\n",
"The slowest run took 4.43 times longer than the fastest. This could mean that an intermediate result is being cached.\n",
"10000 loops, best of 3: 121 µs per loop\n",
"[-7.99123143+1.48700444j nan +infj]\n",
"--------------\n",
"================\n",
"(2.0)x^2 + (13.0)x^1 + (-7.0)x^0 \n",
"--------------\n",
"Using numpy.solve\n",
"The slowest run took 4.84 times longer than the fastest. This could mean that an intermediate result is being cached.\n",
"10000 loops, best of 3: 59.5 µs per loop\n",
"[-7. 0.5]\n",
"--------------\n",
"Using numba implementation\n",
"100 loops, best of 3: 14.4 ms per loop\n",
"(array([-84121.4758112 +51790.51808499j, -136.70609964 +82.11608597j]), 139921.18903100255)\n",
"--------------\n",
"Using cython implementation\n",
"1 loop, best of 3: 387 ms per loop\n",
"[ 8.31462374e+00+4.56625783e+00j -2.20698923e+04-9.57221948e+03j]\n",
"--------------\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "yhdGU1IJ2gL1",
"colab_type": "text"
},
"source": [
"### Fail cases resolved"
]
},
{
"cell_type": "code",
"metadata": {
"trusted": true,
"id": "CIZ6Pfg22gL3",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 641
},
"outputId": "de18b478-7941-4683-e9e6-4e2d11d57708"
},
"source": [
"# 1 - fail example - resolved:\n",
"polynom = np.array([4, 0, -1])\n",
"polynom = polynom / polynom[0]\n",
"benchmark(polynom)\n",
"# 2 - fail example - resolved\n",
"polynom = np.array([2, 13, -7])\n",
"polynom = polynom / polynom[0]\n",
"benchmark(polynom)"
],
"execution_count": 7,
"outputs": [
{
"output_type": "stream",
"text": [
"(1.0)x^2 + (-0.25)x^0 \n",
"--------------\n",
"Using numpy.solve\n",
"The slowest run took 4.95 times longer than the fastest. This could mean that an intermediate result is being cached.\n",
"10000 loops, best of 3: 59.1 µs per loop\n",
"[-0.5 0.5]\n",
"--------------\n",
"Using numba implementation\n",
"The slowest run took 30.43 times longer than the fastest. This could mean that an intermediate result is being cached.\n",
"10000 loops, best of 3: 22.2 µs per loop\n",
"(array([-0.5-1.92592994e-34j, 0.5+0.00000000e+00j]), 3.9263788673002175e-17)\n",
"--------------\n",
"Using cython implementation\n",
"The slowest run took 24.82 times longer than the fastest. This could mean that an intermediate result is being cached.\n",
"100000 loops, best of 3: 18.1 µs per loop\n",
"[ 0.5+0.0000000e+00j -0.5+2.0317014e-17j]\n",
"--------------\n",
"(1.0)x^2 + (6.5)x^1 + (-3.5)x^0 \n",
"--------------\n",
"Using numpy.solve\n",
"The slowest run took 4.74 times longer than the fastest. This could mean that an intermediate result is being cached.\n",
"10000 loops, best of 3: 59.2 µs per loop\n",
"[-7. 0.5]\n",
"--------------\n",
"Using numba implementation\n",
"The slowest run took 13.15 times longer than the fastest. This could mean that an intermediate result is being cached.\n",
"10000 loops, best of 3: 34.6 µs per loop\n",
"(array([ 0.5+0.00000000e+00j, -7. -1.62452469e-16j]), 3.587542551130318e-16)\n",
"--------------\n",
"Using cython implementation\n",
"The slowest run took 22.21 times longer than the fastest. This could mean that an intermediate result is being cached.\n",
"10000 loops, best of 3: 19.7 µs per loop\n",
"[-7. -2.67810438e-16j 0.5+0.00000000e+00j]\n",
"--------------\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"ExecuteTime": {
"end_time": "2020-02-29T22:12:22.905023Z",
"start_time": "2020-02-29T22:12:11.526177Z"
},
"scrolled": false,
"trusted": false,
"id": "BLHeoyMP2gME",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 384
},
"outputId": "f739b7e2-60a8-4f36-a8e8-dcc0774deb0e"
},
"source": [
"polynom = np.array([1, 0, 0, 10, 0, 0, -27], dtype=float)\n",
"benchmark(polynom)"
],
"execution_count": 9,
"outputs": [
{
"output_type": "stream",
"text": [
"(1.0)x^6 + (10.0)x^3 + (-27.0)x^0 \n",
"--------------\n",
"Using numpy.solve\n",
"The slowest run took 4.40 times longer than the fastest. This could mean that an intermediate result is being cached.\n",
"10000 loops, best of 3: 72.3 µs per loop\n",
"[-2.30277564+0.j 1.15138782+1.9942622j 1.15138782-1.9942622j\n",
" -0.65138782+1.1282368j -0.65138782-1.1282368j 1.30277564+0.j ]\n",
"--------------\n",
"Using numba implementation\n",
"100 loops, best of 3: 9.83 ms per loop\n",
"(array([-2.30277564+4.85609416e-16j, 1.15138782+1.99426220e+00j,\n",
" 1.30277564+2.32216653e-33j, -0.65138782+1.12823680e+00j,\n",
" -0.65138782-1.12823680e+00j, 1.15138782-1.99426220e+00j]), 2.048948560685953e-16)\n",
"--------------\n",
"Using cython implementation\n",
"10000 loops, best of 3: 93.1 µs per loop\n",
"[-0.65138782-1.12823680e+00j -0.65138782+1.12823680e+00j\n",
" 1.30277564-1.14341592e-32j 1.15138782+1.99426220e+00j\n",
" 1.15138782-1.99426220e+00j -2.30277564-4.39589541e-17j]\n",
"--------------\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"ExecuteTime": {
"start_time": "2020-02-29T22:11:50.565Z"
},
"trusted": false,
"id": "hMLlZqAu2gMO",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 476
},
"outputId": "a34f0f00-5c4c-4b4d-f098-672e943152dd"
},
"source": [
"polynom = np.array([1,0,0,0,0,0,0,0,0,-1], dtype=float)\n",
"benchmark(polynom)"
],
"execution_count": 10,
"outputs": [
{
"output_type": "stream",
"text": [
"(1.0)x^9 + (-1.0)x^0 \n",
"--------------\n",
"Using numpy.solve\n",
"The slowest run took 5.14 times longer than the fastest. This could mean that an intermediate result is being cached.\n",
"10000 loops, best of 3: 88.7 µs per loop\n",
"[-0.93969262+0.34202014j -0.93969262-0.34202014j -0.5 +0.8660254j\n",
" -0.5 -0.8660254j 0.17364818+0.98480775j 0.17364818-0.98480775j\n",
" 1. +0.j 0.76604444+0.64278761j 0.76604444-0.64278761j]\n",
"--------------\n",
"Using numba implementation\n",
"10000 loops, best of 3: 217 µs per loop\n",
"(array([ 1. +0.j , 0.17364818+0.98480775j,\n",
" -0.5 +0.8660254j , 0.76604444-0.64278761j,\n",
" -0.93969262-0.34202014j, -0.5 -0.8660254j ,\n",
" 0.17364818-0.98480775j, -0.93969262+0.34202014j,\n",
" 0.76604444+0.64278761j]), 1.1483671133852965e-16)\n",
"--------------\n",
"Using cython implementation\n",
"10000 loops, best of 3: 172 µs per loop\n",
"[ 1. -3.08148791e-33j 0.76604444+6.42787610e-01j\n",
" 0.17364818+9.84807753e-01j -0.5 +8.66025404e-01j\n",
" 0.76604444-6.42787610e-01j -0.93969262+3.42020143e-01j\n",
" -0.5 -8.66025404e-01j -0.93969262-3.42020143e-01j\n",
" 0.17364818-9.84807753e-01j]\n",
"--------------\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"ExecuteTime": {
"start_time": "2020-02-29T22:11:50.593Z"
},
"trusted": false,
"id": "-Fo3oZqp2gMX",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 403
},
"outputId": "05d2148e-33db-4fd3-b293-aed50f73f29e"
},
"source": [
"polynom = np.array([1, 0, 0, 0, -1, -1], dtype=float)\n",
"benchmark(polynom)"
],
"execution_count": 11,
"outputs": [
{
"output_type": "stream",
"text": [
"(1.0)x^5 + (-1.0)x^1 + (-1.0)x^0 \n",
"--------------\n",
"Using numpy.solve\n",
"The slowest run took 5.79 times longer than the fastest. This could mean that an intermediate result is being cached.\n",
"10000 loops, best of 3: 65.4 µs per loop\n",
"[ 1.16730398+0.j 0.18123244+1.0839541j 0.18123244-1.0839541j\n",
" -0.76488443+0.35247155j -0.76488443-0.35247155j]\n",
"--------------\n",
"Using numba implementation\n",
"The slowest run took 7.89 times longer than the fastest. This could mean that an intermediate result is being cached.\n",
"10000 loops, best of 3: 113 µs per loop\n",
"(array([ 0.18123244+1.08395410e+00j, -0.76488443+3.52471546e-01j,\n",
" 0.18123244-1.08395410e+00j, 1.16730398+3.70314704e-33j,\n",
" -0.76488443-3.52471546e-01j]), 1.799710450582027e-16)\n",
"--------------\n",
"Using cython implementation\n",
"10000 loops, best of 3: 78 µs per loop\n",
"[ 0.18123244+1.08395410e+00j -0.76488443+3.52471546e-01j\n",
" 0.18123244-1.08395410e+00j -0.76488443-3.52471546e-01j\n",
" 1.16730398-8.56796441e-33j]\n",
"--------------\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"ExecuteTime": {
"start_time": "2020-02-29T22:11:50.665Z"
},
"trusted": false,
"id": "VqUCWLlk2gMh",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 366
},
"outputId": "d45bde1d-aadb-4d56-bfbb-3e92bcbb0831"
},
"source": [
"polynom = np.array([1,-3,3,-5], dtype=float)\n",
"benchmark(polynom)"
],
"execution_count": 12,
"outputs": [
{
"output_type": "stream",
"text": [
"(1.0)x^3 + (-3.0)x^2 + (3.0)x^1 + (-5.0)x^0 \n",
"--------------\n",
"Using numpy.solve\n",
"The slowest run took 5.13 times longer than the fastest. This could mean that an intermediate result is being cached.\n",
"10000 loops, best of 3: 61.3 µs per loop\n",
"[2.58740105+0.j 0.20629947+1.37472964j 0.20629947-1.37472964j]\n",
"--------------\n",
"Using numba implementation\n",
"The slowest run took 8.89 times longer than the fastest. This could mean that an intermediate result is being cached.\n",
"10000 loops, best of 3: 55.1 µs per loop\n",
"(array([0.20629947+1.37472964e+00j, 0.20629947-1.37472964e+00j,\n",
" 2.58740105-5.47382213e-48j]), 2.3375548826626946e-16)\n",
"--------------\n",
"Using cython implementation\n",
"The slowest run took 14.33 times longer than the fastest. This could mean that an intermediate result is being cached.\n",
"10000 loops, best of 3: 37.6 µs per loop\n",
"[0.20629947+1.37472964e+00j 2.58740105-7.57306469e-29j\n",
" 0.20629947-1.37472964e+00j]\n",
"--------------\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"scrolled": false,
"trusted": false,
"id": "mFD6jhGd2gMp",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 1000
},
"outputId": "a7b27961-4d1b-4d64-cbaa-04ff42fba255"
},
"source": [
"N = 61\n",
"polynom = np.zeros(N, dtype=float)\n",
"polynom[0] = 1\n",
"polynom[N-1] = -1\n",
"benchmark(polynom)"
],
"execution_count": 13,
"outputs": [
{
"output_type": "stream",
"text": [
"(1.0)x^60 + (-1.0)x^0 \n",
"--------------\n",
"Using numpy.solve\n",
"1000 loops, best of 3: 900 µs per loop\n",
"[-1.00000000e+00+0.j -9.94521895e-01+0.10452846j\n",
" -9.94521895e-01-0.10452846j -9.78147601e-01+0.20791169j\n",
" -9.78147601e-01-0.20791169j -9.51056516e-01+0.30901699j\n",
" -9.51056516e-01-0.30901699j -9.13545458e-01+0.40673664j\n",
" -9.13545458e-01-0.40673664j -8.66025404e-01+0.5j\n",
" -8.66025404e-01-0.5j -8.09016994e-01+0.58778525j\n",
" -8.09016994e-01-0.58778525j -7.43144825e-01+0.66913061j\n",
" -7.43144825e-01-0.66913061j -6.69130606e-01+0.74314483j\n",
" -6.69130606e-01-0.74314483j -5.87785252e-01+0.80901699j\n",
" -5.87785252e-01-0.80901699j -5.00000000e-01+0.8660254j\n",
" -5.00000000e-01-0.8660254j -4.06736643e-01+0.91354546j\n",
" -4.06736643e-01-0.91354546j -3.09016994e-01+0.95105652j\n",
" -3.09016994e-01-0.95105652j -2.07911691e-01+0.9781476j\n",
" -2.07911691e-01-0.9781476j -1.04528463e-01+0.9945219j\n",
" -1.04528463e-01-0.9945219j -5.68989300e-16+1.j\n",
" -5.68989300e-16-1.j 1.04528463e-01+0.9945219j\n",
" 1.04528463e-01-0.9945219j 2.07911691e-01+0.9781476j\n",
" 2.07911691e-01-0.9781476j 3.09016994e-01+0.95105652j\n",
" 3.09016994e-01-0.95105652j 4.06736643e-01+0.91354546j\n",
" 4.06736643e-01-0.91354546j 5.00000000e-01+0.8660254j\n",
" 5.00000000e-01-0.8660254j 5.87785252e-01+0.80901699j\n",
" 5.87785252e-01-0.80901699j 6.69130606e-01+0.74314483j\n",
" 6.69130606e-01-0.74314483j 7.43144825e-01+0.66913061j\n",
" 7.43144825e-01-0.66913061j 8.09016994e-01+0.58778525j\n",
" 8.09016994e-01-0.58778525j 1.00000000e+00+0.j\n",
" 9.94521895e-01+0.10452846j 9.94521895e-01-0.10452846j\n",
" 9.78147601e-01+0.20791169j 9.78147601e-01-0.20791169j\n",
" 9.51056516e-01+0.30901699j 9.51056516e-01-0.30901699j\n",
" 9.13545458e-01+0.40673664j 9.13545458e-01-0.40673664j\n",
" 8.66025404e-01+0.5j 8.66025404e-01-0.5j ]\n",
"--------------\n",
"Using numba implementation\n",
"10 loops, best of 3: 11.9 ms per loop\n",
"(array([-6.69130606e-01+7.43144825e-01j, -9.51056516e-01+3.09016994e-01j,\n",
" -5.87785252e-01-8.09016994e-01j, -9.51056516e-01-3.09016994e-01j,\n",
" 1.04528463e-01+9.94521895e-01j, -9.94521895e-01-1.04528463e-01j,\n",
" 9.94521895e-01+1.04528463e-01j, -5.87785252e-01+8.09016994e-01j,\n",
" -4.55892420e-17-1.00000000e+00j, -2.07911691e-01+9.78147601e-01j,\n",
" -8.66025404e-01+5.00000000e-01j, -3.09016994e-01-9.51056516e-01j,\n",
" 6.69130606e-01+7.43144825e-01j, 9.51056516e-01-3.09016994e-01j,\n",
" 9.78147601e-01+2.07911691e-01j, -3.09016994e-01+9.51056516e-01j,\n",
" 4.06736643e-01+9.13545458e-01j, 5.87785252e-01+8.09016994e-01j,\n",
" 3.09016994e-01-9.51056516e-01j, 5.00000000e-01+8.66025404e-01j,\n",
" -8.09016994e-01-5.87785252e-01j, 8.09016994e-01+5.87785252e-01j,\n",
" -1.04528463e-01+9.94521895e-01j, 2.07911691e-01-9.78147601e-01j,\n",
" 4.06736643e-01-9.13545458e-01j, 5.87785252e-01-8.09016994e-01j,\n",
" 9.13545458e-01-4.06736643e-01j, 9.94521895e-01-1.04528463e-01j,\n",
" 7.43144825e-01-6.69130606e-01j, 8.66025404e-01+5.00000000e-01j,\n",
" -1.00000000e+00+3.15661262e-17j, -9.94521895e-01+1.04528463e-01j,\n",
" -4.06736643e-01-9.13545458e-01j, -7.43144825e-01-6.69130606e-01j,\n",
" -9.13545458e-01-4.06736643e-01j, -9.13545458e-01+4.06736643e-01j,\n",
" -1.04528463e-01-9.94521895e-01j, -2.07911691e-01-9.78147601e-01j,\n",
" -7.43144825e-01+6.69130606e-01j, -6.69130606e-01-7.43144825e-01j,\n",
" -5.00000000e-01-8.66025404e-01j, 1.00000000e+00+2.80259693e-45j,\n",
" 6.69130606e-01-7.43144825e-01j, 3.09016994e-01+9.51056516e-01j,\n",
" -4.06736643e-01+9.13545458e-01j, 9.13545458e-01+4.06736643e-01j,\n",
" 9.78147601e-01-2.07911691e-01j, -1.91436462e-16+1.00000000e+00j,\n",
" 1.04528463e-01-9.94521895e-01j, 5.00000000e-01-8.66025404e-01j,\n",
" -8.66025404e-01-5.00000000e-01j, 2.07911691e-01+9.78147601e-01j,\n",
" 8.09016994e-01-5.87785252e-01j, 9.51056516e-01+3.09016994e-01j,\n",
" 8.66025404e-01-5.00000000e-01j, -9.78147601e-01-2.07911691e-01j,\n",
" -9.78147601e-01+2.07911691e-01j, -8.09016994e-01+5.87785252e-01j,\n",
" 7.43144825e-01+6.69130606e-01j, -5.00000000e-01+8.66025404e-01j]), 3.0828975039459106e-16)\n",
"--------------\n",
"Using cython implementation\n",
"100 loops, best of 3: 9.9 ms per loop\n",
"[-1.04528463e-01-9.94521895e-01j -8.66025404e-01-5.00000000e-01j\n",
" 6.69130606e-01-7.43144825e-01j -5.00000000e-01+8.66025404e-01j\n",
" 4.06736643e-01-9.13545458e-01j 7.43144825e-01-6.69130606e-01j\n",
" 9.94521895e-01+1.04528463e-01j -7.43144825e-01+6.69130606e-01j\n",
" 6.69130606e-01+7.43144825e-01j -2.07911691e-01+9.78147601e-01j\n",
" 9.78147601e-01+2.07911691e-01j -5.00000000e-01-8.66025404e-01j\n",
" 8.09016994e-01+5.87785252e-01j -9.13545458e-01-4.06736643e-01j\n",
" -4.06736643e-01-9.13545458e-01j -6.69130606e-01-7.43144825e-01j\n",
" 9.94521895e-01-1.04528463e-01j 8.09016994e-01-5.87785252e-01j\n",
" 1.04528463e-01-9.94521895e-01j 8.66025404e-01-5.00000000e-01j\n",
" 1.00000000e+00+7.22223729e-35j -9.78147601e-01-2.07911691e-01j\n",
" -9.51056516e-01+3.09016994e-01j 9.13545458e-01-4.06736643e-01j\n",
" -5.87785252e-01-8.09016994e-01j -3.09016994e-01+9.51056516e-01j\n",
" 5.87785252e-01-8.09016994e-01j -8.09016994e-01+5.87785252e-01j\n",
" -6.69130606e-01+7.43144825e-01j -8.66025404e-01+5.00000000e-01j\n",
" 1.04528463e-01+9.94521895e-01j 5.87785252e-01+8.09016994e-01j\n",
" 5.00000000e-01+8.66025404e-01j 9.51056516e-01-3.09016994e-01j\n",
" 5.00000000e-01-8.66025404e-01j 3.09016994e-01-9.51056516e-01j\n",
" 9.78147601e-01-2.07911691e-01j -4.06736643e-01+9.13545458e-01j\n",
" 3.09016994e-01+9.51056516e-01j -4.96923288e-17-1.00000000e+00j\n",
" -3.09016994e-01-9.51056516e-01j -9.94521895e-01-1.04528463e-01j\n",
" 4.06736643e-01+9.13545458e-01j -8.09016994e-01-5.87785252e-01j\n",
" -2.47296713e-17+1.00000000e+00j -7.43144825e-01-6.69130606e-01j\n",
" -5.87785252e-01+8.09016994e-01j 8.66025404e-01+5.00000000e-01j\n",
" 2.07911691e-01-9.78147601e-01j -9.13545458e-01+4.06736643e-01j\n",
" -1.00000000e+00+1.71645897e-16j -2.07911691e-01-9.78147601e-01j\n",
" 2.07911691e-01+9.78147601e-01j 9.13545458e-01+4.06736643e-01j\n",
" -9.51056516e-01-3.09016994e-01j -1.04528463e-01+9.94521895e-01j\n",
" -9.78147601e-01+2.07911691e-01j 7.43144825e-01+6.69130606e-01j\n",
" -9.94521895e-01+1.04528463e-01j 9.51056516e-01+3.09016994e-01j]\n",
"--------------\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"trusted": false,
"id": "oixlYGxa2gM1",
"colab_type": "code",
"colab": {}
},
"source": [
""
],
"execution_count": 0,
"outputs": []
}
]
}
@kvedala
Copy link
Author

kvedala commented Apr 9, 2020

Performance improvement

Performing

polynom = polynom / polynom[0]

i.e., by ensuing the coefficient of the highest power of x to be 1 significantly improves the numerical stability.

For example, the above algorithm fails for:

# 1 - fail example:
polynom = np.array([4, 0, -1])
benchmark(polynom)
# 2 - fail example
polynom = np.array([2, 13, -7])
benchmark(polynom)

but the problem gets resolved when executing as such:

# 1 - fail example - resolved:
polynom = np.array([4, 0, -1])
polynom = polynom / polynom[0]
benchmark(polynom)
# 2 - fail example - resolved
polynom = np.array([2, 13, -7])
polynom = polynom / polynom[0]
benchmark(polynom)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment