@rjodon
Created February 24, 2021 09:05
numpy_savetxt_benchmark.ipynb
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "numpy_savetxt_benchmark.ipynb",
"provenance": [],
"collapsed_sections": [],
"authorship_tag": "ABX9TyMPxexvXmwpxY1xTOJoStQJ",
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/rjodon/2f05a60825f9bf885faf359be66d9a73/numpy_savetxt_benchmark.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "code",
"metadata": {
"id": "syeOp7pHMtan"
},
"source": [
"\n",
"\"\"\"\n",
"Small benchmark of numpy.savetxt() vs. Python's built-in open() + write()\n",
"\n",
"* Timing uses the %%time cell magic around an explicit loop.\n",
"* Why? timeit can't control Python caches here, so its results would not reflect reality.\n",
"\n",
"* savetxt is highly inefficient for large arrays (dim > 500): it walks the array row by row,\n",
"  formatting each row and then writing it to the file.\n",
"\n",
"\"\"\"\n",
"import numpy\n",
"\n",
"nc = 2; nb = 20; ni = 6; ia = 20; ib = 20; ic = 0\n",
"scale_factor = 10000\n",
"\n",
"U1 = numpy.array(range(0, scale_factor))\n",
"U2 = numpy.array(range(scale_factor, 2*scale_factor))\n",
"U3 = numpy.array(range(3*scale_factor, 4*scale_factor))\n",
"U4 = numpy.array(range(4*scale_factor, 5*scale_factor))\n",
"U5 = numpy.array(range(5*scale_factor, 6*scale_factor))\n",
"\n",
"a = nc*(nb*nc*ia+nc*ib+ic)+U1\n",
"a2 = ia + U1\n",
"b2 = ib + U3\n",
"c2 = ic + U4\n",
"b = nc * (nb * nc * a2 + nc * b2 + c2) + U2\n",
"A = numpy.array((a, b, U5)).T"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "K02_PyvmMyMP",
"outputId": "deb14a30-cadc-4f71-bb04-a38f84167519"
},
"source": [
"%%time\n",
"# NUMPY SAVETXT (a cell magic must be the first line of the cell)\n",
"for i in range(100):\n",
"    numpy.savetxt(\"savetxt.txt\", A)"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"CPU times: user 3.41 s, sys: 128 ms, total: 3.54 s\n",
"Wall time: 3.61 s\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "1Jdsx3bcM3uw",
"outputId": "cd49f477-4be2-4f13-d001-dfb467cecc00"
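},
"source": [
"%%time\n",
"# Builtin IO: open the file once, write 100 times\n",
"# Caveat: numpy.array2string abbreviates arrays above the print threshold\n",
"# (1000 elements by default), so this writes a summarized A, not every row.\n",
"with open(file=\"write.txt\", mode=\"w\") as f:\n",
"    for i in range(100):\n",
"        f.write(numpy.array2string(A))"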
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"CPU times: user 15.1 ms, sys: 825 µs, total: 16 ms\n",
"Wall time: 19.5 ms\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "10dIHCkYNM_c",
"outputId": "df6c88bc-03ae-422c-9fd2-a1971a3ef1c1"
},
"source": [
"%%time\n",
"# Builtin IO + opening/closing each iteration\n",
"for i in range(100):\n",
"    with open(file=\"write.txt\", mode=\"w\") as f:\n",
"        f.write(numpy.array2string(A))"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"CPU times: user 22.7 ms, sys: 3.12 ms, total: 25.8 ms\n",
"Wall time: 31.8 ms\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "Bk1lB4dvN4g1"
},
"source": [
""
],
"execution_count": null,
"outputs": []
}
]
}
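
A note on the comparison in the notebook: numpy.array2string abbreviates arrays larger than its print threshold, so the open() + write() cells write a shortened text rather than the full array that savetxt writes. The sketch below is one possible way to compare the two approaches on identical output. It is not part of the original notebook; the helpers rows_to_text and timed are hypothetical names introduced here, the array values are arbitrary, and the single-run timings are only indicative.

```python
import os
import tempfile
import time

import numpy as np


def rows_to_text(arr, fmt="%.18e", delimiter=" "):
    """Format every row of a 2-D array as one line, with no truncation."""
    return "".join(delimiter.join(fmt % v for v in row) + "\n" for row in arr)


def timed(fn):
    """Run fn() once and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


# Same shape as A in the notebook (10000 rows, 3 columns); values are arbitrary.
A = np.arange(30000).reshape(10000, 3)

# array2string abbreviates large arrays, so it does not serialize every row.
summary = np.array2string(A)
print("array2string lines:", summary.count("\n") + 1,
      "| abbreviated:", "..." in summary)

with tempfile.TemporaryDirectory() as tmp:
    savetxt_path = os.path.join(tmp, "savetxt.txt")
    write_path = os.path.join(tmp, "write.txt")

    # Time numpy.savetxt with its default format.
    _, t_savetxt = timed(lambda: np.savetxt(savetxt_path, A))

    # Time a plain open() + write() of the fully formatted array.
    def write_full():
        with open(write_path, "w") as f:
            f.write(rows_to_text(A))

    _, t_write = timed(write_full)

    # Check that both files hold the same text before comparing timings.
    with open(savetxt_path) as f1, open(write_path) as f2:
        same_output = f1.read() == f2.read()

print(f"savetxt: {t_savetxt:.4f} s | open + write: {t_write:.4f} s "
      f"| identical: {same_output}")
```

The default format "%.18e" and the single-space delimiter mirror savetxt's defaults, which is why the two files can be compared directly. Calling numpy.set_printoptions(threshold=...) before array2string would be another way to avoid the abbreviation, at the cost of a different text layout than savetxt.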