@eteq
Created April 19, 2021 17:57
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Performance tests of Numerical and Astronomy Python Workflows for a User in 2021\n",
"\n",
"*Erik Tollerud* \n",
"\n",
"*@eteq*\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This post contains a typical execution time analysis of several possible devices that might be available to the typical scientist (particularly Astronomy, although probably applicable to similar programming-oriented science disciplines) in early 2021.\n",
"\n",
"Note, though: while this provides some useful checks on speed of basic workflow elements, in many scientific workflows, the most relevant metric is not the speed of your computer, but rather the speed of your ability to write the code! To me the real test is: when you run a cell, is it slow enough to interrupt your mental flow? If not, it's fast enough. The below is hopefully a guide to show exactly that. Any workflow you have to walk away from, well, you probably have something else to do anyway, so orders-of-magnitude are all that matter!\n",
"\n",
"**Methodology Caveat**\n",
"These tests were not executed particularly carefully *by design* - this is meant to represent the experience of someone who does not want to sit down and fiddle with their machine settings for hours and hours to shave off every last millisecond of run time. So there is no special compiling/optimization, adjust ment of caching, or even closing of browser tabs to make the performance metrics more consistent. But that's on purpose, because I always have a bunch of tabs open when working, anyway!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Preliminaries "
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"from collections import defaultdict\n",
"\n",
"import numpy as np\n",
"\n",
"%matplotlib inline\n",
"from matplotlib import pyplot as plt"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Generic Numeric/scientific Operations\n",
"\n",
"We start with a more numerically-oriented set of tests. These focus on basic number crunching, linear algebra, raw numerical I/O, etc. Specifically, the following:\n",
"\n",
"* Basic Numpy arithmetic\n",
"* Eigenvalue decomposition\n",
"* Singular Value Decomposition (on a non-square matrix)\n",
"* Least Squares fit to a Gaussian\n",
"* Writing a \"small\" array (300 KB)\n",
"* Writing a \"largish\" array (300 MB)\n",
"* For all of the above, the time to generate the random data sets is checked\n",
"\n",
"The cell below contains the full set of tests. It writes out to a file because that's the easiest way to make sure the same thing is run on all the different devices, but you can also comment out the `%%writefile` line and run it directly in the notebook."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Overwriting npprofile.py\n"
]
}
],
"source": [
"%%writefile npprofile.py\n",
"\n",
"import os\n",
"import time\n",
"import tempfile\n",
"\n",
"import numpy as np\n",
"\n",
"from scipy import optimize\n",
"\n",
"\n",
"N = 50\n",
"sz = 300\n",
"\n",
"\n",
"ts = []\n",
"ts_rand = []\n",
"for i in range(N + 1):\n",
" prest = time.time()\n",
" a = np.random.randn(sz * sz)\n",
" st = time.time()\n",
" c = ((1.2*a + 3 ) / 4 - 5.6 ) ** 4 \n",
" et=time.time()\n",
" ts.append((et-st)*1000)\n",
" ts_rand.append((st-prest)*1000)\n",
"# remove the first one since it might be tainted by imports\n",
"for l in (ts, ts_rand):\n",
" del l[0]\n",
"\n",
"print(f'Arithmetic Operations (n={sz*sz})', np.mean(ts), '±', np.std(ts), 'ms')\n",
"print('RNG', np.mean(ts_rand), '±', np.std(ts_rand), 'ms')\n",
"\n",
"ts = []\n",
"ts_rand = []\n",
"for i in range(N + 1):\n",
" prest = time.time()\n",
" a = np.random.randn(sz, sz)\n",
" st = time.time()\n",
" e=np.linalg.eig(a)\n",
" et=time.time()\n",
" ts.append((et-st)*1000)\n",
" ts_rand.append((st-prest)*1000)\\\n",
"# remove the first one since it might be tainted by imports\n",
"for l in (ts, ts_rand):\n",
" del l[0]\n",
"\n",
"print(f'Eig ({sz} x {sz})', np.mean(ts), '±', np.std(ts), 'ms')\n",
"print('RNG', np.mean(ts_rand), '±', np.std(ts_rand), 'ms')\n",
"\n",
"ts = []\n",
"ts_rand = []\n",
"for i in range(N + 1):\n",
" prest = time.time()\n",
" a = np.random.randn(2*sz, sz)\n",
" st = time.time()\n",
" e=np.linalg.svd(a)\n",
" et=time.time()\n",
" ts.append((et-st)*1000)\n",
" ts_rand.append((st-prest)*1000)\n",
"# remove the first one since it might be tainted by imports\n",
"for l in (ts, ts_rand):\n",
" del l[0]\n",
"\n",
"print(f'SVD ({2*sz} x {sz})', np.mean(ts), '±', np.std(ts), 'ms')\n",
"print('RNG', np.mean(ts_rand), '±', np.std(ts_rand), 'ms')\n",
"\n",
"\n",
"\n",
"ts = []\n",
"ts_rand = []\n",
"def fopt(p, x, y):\n",
" A, mu, sig = p\n",
" model = A*np.exp(-0.5 * ((x-mu)/sig)**2)\n",
" return model - y\n",
"for i in range(N + 1):\n",
" prest = time.time()\n",
" xgauss = np.linspace(-3, 3, sz*sz)\n",
" ygauss = fopt((10, 1, .6), xgauss, 0) + np.random.randn(len(xgauss))\n",
" st = time.time()\n",
" optimize.least_squares(fopt, [11, .5, .9], args=[xgauss, ygauss], method='lm')\n",
" et = time.time()\n",
" ts.append((et-st)*1000)\n",
" ts_rand.append((st-prest)*1000)\n",
"# remove the first one since it might be tainted by imports\n",
"for l in (ts, ts_rand):\n",
" del l[0]\n",
"\n",
"print(f'Leastsq (n={sz*sz})', np.mean(ts), '±', np.std(ts), 'ms')\n",
"print('RNG', np.mean(ts_rand), '±', np.std(ts_rand), 'ms')\n",
"\n",
"tr1 = []\n",
"tr2 = []\n",
"tkB = []\n",
"tMB = []\n",
"for i in range(N//8 + 1): #//8 because writing can be hard on SSDs, and seems to be very repeatable anyway\n",
" t1 = time.time()\n",
" kbarr = np.random.randint(256, size=1024*sz, dtype='uint8')\n",
" t2 = time.time()\n",
" Mbarr = np.random.randint(256, size=1024*1024*sz, dtype='uint8')\n",
" t3 = time.time()\n",
" tr1.append(t2 - t1)\n",
" tr2.append(t3 - t2)\n",
"\n",
" with tempfile.NamedTemporaryFile() as f:\n",
" st = time.time()\n",
" np.save(f, kbarr)\n",
" et = time.time()\n",
" tkB.append(et-st)\n",
"\n",
" with tempfile.NamedTemporaryFile() as f:\n",
" st = time.time()\n",
" np.save(f, Mbarr)\n",
" et = time.time()\n",
" tMB.append(et-st)\n",
"# remove the first one since it might be tainted by imports\n",
"for l in (tr1, tr2, tkB, tMB):\n",
" del l[0]\n",
"\n",
"print(f'Write ({sz} KB)', np.mean(tkB), '±', np.std(tkB), 'ms')\n",
"print('RNG', np.mean(tr1), '±', np.std(tr1), 'ms')\n",
"print(f'Write ({sz} MB)', np.mean(tMB), '±', np.std(tMB), 'ms')\n",
"print('RNG', np.mean(tr2), '±', np.std(tr2), 'ms')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Below is the results of running the above on several devices:\n",
"\n",
"* A new laptop with a top-of-the-line Intel mobile processor (mobile = balance of power efficiency and performance), on Linux or Windows (using [WSL](https://docs.microsoft.com/en-us/windows/wsl/about))\n",
"* An iMac that was top-of-the-line in 2015\n",
"* A hand-built Windows desktop with a mid-range Intel processor from 2014 on [WSL](https://docs.microsoft.com/en-us/windows/wsl/about)\n",
"* A Samsung S21 Ultra, which uses the Qualcomm 888, a state-of-the-art ARM processor\n",
"\n",
"Note that, aside from the hand-built machine (which is hard to apples-to-apples compare) these are all relatively similar prices to purchase as of 2021.\n",
"\n",
"Android is a relatively locked-down OS, so there's not a single obvious way to use Python, but there are various apps that let you get to a terminal or otherwise run Python - hence several different mechanisms of running python scripts on Android are included for the S21. Specifically:\n",
"* [Pydroid](https://play.google.com/store/apps/details?id=ru.iiec.pydroid3&hl=en_US&gl=US), a more IDE like approach, \n",
"* [Termux](https://termux.com/), a relatively thin layer around the linux system at the core of Android, and \n",
"* [AnLinux](https://play.google.com/store/apps/details?id=exa.lnx.a&hl=en_US&gl=US), a layer that essentially creates more familiar linux environments on top of Termux."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"nppprofile_data = {}\n",
"\n",
"#HP spectre x360 (2020) on ubuntu 20.04 shortly after reboot (attached to an egpu):\n",
"#Intel(R) Core(TM) i7-1065G7 CPU @ 1.30GHz\n",
"nppprofile_data['ultrabook-linux'] = \"\"\"\n",
"Arithmetic Operations (n=90000) 1.6756057739257812 ± 0.3688577466492849 ms\n",
"RNG 1.7036724090576172 ± 0.3487183753829759 ms\n",
"Eig (300 x 300) 87.32826232910156 ± 7.635946221140548 ms\n",
"RNG 2.881178855895996 ± 0.20569738749601182 ms\n",
"SVD (600 x 300) 31.614999771118164 ± 3.6539909780748583 ms\n",
"RNG 6.116633415222168 ± 0.1097590840008892 ms\n",
"Leastsq (n=90000) 37.378525733947754 ± 1.3944810645194972 ms\n",
"RNG 3.880915641784668 ± 0.08596371316124923 ms\n",
"Write (300 KB) 0.0008246898651123047 ± 0.000797926920265935 ms\n",
"RNG 0.0004380544026692708 ± 0.000132861889237448 ms\n",
"Write (300 MB) 0.15468204021453857 ± 0.0038820216803405073 ms\n",
"RNG 0.35952723026275635 ± 0.042130659425058725 ms\n",
"\n",
"\"\"\"\n",
"\n",
"#Same machine on windows subsystem for linux\n",
"nppprofile_data['ultrabook-wsl'] = \"\"\"\n",
"Arithmetic Operations (n=90000) 4.832429885864258 ± 0.18373089164953693 ms\n",
"RNG 2.8391218185424805 ± 0.13812821901274971 ms\n",
"Eig (300 x 300) 113.80751132965088 ± 22.498752193976614 ms\n",
"RNG 4.834494590759277 ± 0.5230535848315021 ms\n",
"SVD (600 x 300) 43.30723285675049 ± 22.532194668748375 ms\n",
"RNG 10.174250602722168 ± 0.8574586455109767 ms\n",
"Leastsq (n=90000) 48.835811614990234 ± 3.2048900663966524 ms\n",
"RNG 5.980625152587891 ± 0.4472850713612565 ms\n",
"Write (300 KB) 0.0004915396372477213 ± 8.223571042651581e-05 ms\n",
"RNG 0.00046324729919433594 ± 3.511450546600878e-05 ms\n",
"Write (300 MB) 0.1921478509902954 ± 0.011080115081110224 ms\n",
"RNG 0.4553593397140503 ± 0.03169751993885473 ms\n",
"\n",
"\"\"\"\n",
"\n",
"\n",
"# 2015 iMac nearly maxed out\n",
"#Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz\n",
"nppprofile_data['iMac-2015']=\"\"\"\n",
"Arithmetic Operations (n=90000) 2.3137378692626953 ± 0.08105763590143897 ms\n",
"RNG 2.070302963256836 ± 0.035946508882229194 ms\n",
"Eig (300 x 300) 40.73537349700928 ± 0.8877029951961521 ms\n",
"RNG 2.100992202758789 ± 0.05319846743322194 ms\n",
"SVD (600 x 300) 19.57542896270752 ± 0.6718816379966097 ms\n",
"RNG 4.210338592529297 ± 0.08190987009025406 ms\n",
"Leastsq (n=90000) 43.709702491760254 ± 1.626838423905233 ms\n",
"RNG 3.3311891555786133 ± 0.1692955819066108 ms\n",
"Write (300 KB) 0.00046435991923014325 ± 0.00019626250313539386 ms\n",
"RNG 0.0005466938018798828 ± 8.96728588509377e-05 ms\n",
"Write (300 MB) 0.43807244300842285 ± 0.033577342528322805 ms\n",
"RNG 0.47133394082387287 ± 0.029525434342570843 ms\n",
"\n",
"\"\"\"\n",
"\n",
"# Midrange windows desktop on WSL\n",
"#Intel(R) Core(TM) i5-4670K CPU @ 3.40GHz\n",
"nppprofile_data['desktop-2014-wsl'] = \"\"\"\n",
"Arithmetic Operations (n=90000) 5.980334281921387 ± 0.2104658348356849 ms\n",
"RNG 4.324584007263184 ± 1.0804979944176138 ms\n",
"Eig (300 x 300) 50.49713134765625 ± 2.0623265808992093 ms\n",
"RNG 2.8104686737060547 ± 0.0156216365157509 ms\n",
"SVD (600 x 300) 28.721413612365723 ± 5.101158795147819 ms\n",
"RNG 6.0588836669921875 ± 0.55144857116891 ms\n",
"Leastsq (n=90000) 87.6505708694458 ± 3.08129761424699 ms\n",
"RNG 6.100587844848633 ± 0.4707221997963097 ms\n",
"Write (300 KB) 0.0013281901677449544 ± 0.0013958411843367031 ms\n",
"RNG 0.0004692872365315755 ± 1.4913818493830237e-05 ms\n",
"Write (300 MB) 0.14026343822479248 ± 0.0023074786719720066 ms\n",
"RNG 0.6560348272323608 ± 0.009111507587664876 ms\n",
"\n",
"\"\"\"\n",
"\n",
"#Samsung S21 Ultra using Pydroid\n",
"# Snapdragon 888\n",
"nppprofile_data['highandroid-2021-pydroid'] = \"\"\"\n",
"Arithmetic Operations (n=90000) 2.443075180053711 ± 0.09796898794472672 ms\n",
"RNG 2.286715507507324 ± 0.03912888908399019 ms\n",
"Eig (300 x 300) 146.36609077453613 ± 13.92771544144655 ms\n",
"RNG 2.5898265838623047 ± 0.3223041232934862 ms\n",
"SVD (600 x 300) 120.31131744384766 ± 8.356954123605899 ms\n",
"RNG 5.516853332519531 ± 0.5379121736035692 ms\n",
"Leastsq (n=90000) 103.03671360015869 ± 3.413509906924827 ms\n",
"RNG 5.648460388183594 ± 0.5531529589972044 ms\n",
"Write (300 KB) 0.0007154544194539388 ± 2.2834068417978086e-05 ms\n",
"RNG 0.0014925003051757812 ± 9.316595580475708e-05 ms\n",
"Write (300 MB) 0.33723100026448566 ± 0.0038397555237155026 ms\n",
"RNG 1.3123644590377808 ± 0.05261338173658255 ms\n",
"\"\"\"\n",
"\n",
"#Samsung S21 Ultra using Termux\n",
"# Snapdragon 888\n",
"# some are 0 because scipy is unavailable\n",
"nppprofile_data['highandroid-2021-termux'] = \"\"\"\n",
"Arithmetic Operations (n=90000) 2.175107002258301 ± 0.10424237669373446 ms\n",
"RNG 1.7603015899658203 ± 0.04076398733090065 ms\n",
"Eig (300 x 300) 210.49492835998535 ± 3.9608078538338294 ms\n",
"RNG 1.8459177017211914 ± 0.05442355394188722 ms\n",
"SVD (600 x 300) 268.0877351760864 ± 2.306343832925803 ms\n",
"RNG 3.6109209060668945 ± 0.08536337754031413 ms\n",
"Leastsq (n=90000) 0.0 ± 0.0 ms\n",
"RNG 0.0 ± 0.0 ms\n",
"Write (300 KB) 0.0005832513173421224 ± 2.3740978869500227e-05 ms\n",
"RNG 0.0005519390106201172 ± 2.1529958964280505e-05 ms\n",
"Write (300 MB) 0.3305274248123169 ± 0.005845571513869176 ms\n",
"RNG 0.3960670630137126 ± 0.001248675055207535 ms\n",
"\n",
"\"\"\"\n",
"\n",
"#Samsung S21 Ultra using AnLinux Ubuntu, enhanced processing, no battery optimization\n",
"# Snapdragon 888\n",
"nppprofile_data['highandroid-2021-anlinux'] = \"\"\"\n",
"Arithmetic Operations (n=90000) 1.5455341339111328 ± 0.25446268813232686 ms\n",
"RNG 1.5759468078613281 ± 0.1421778326858043 ms\n",
"Eig (300 x 300) 105.60596942901611 ± 9.093983856141131 ms\n",
"RNG 2.099013328552246 ± 0.03460866388542104 ms\n",
"SVD (600 x 300) 193.0505132675171 ± 23.852222798661916 ms\n",
"RNG 17.71977424621582 ± 3.4671751132076625 ms\n",
"Leastsq (n=90000) 41.48430824279785 ± 1.7697894147441546 ms\n",
"RNG 4.266977310180664 ± 1.0803862724244608 ms\n",
"Write (300 KB) 0.0005009969075520834 ± 5.461965613345033e-05 ms\n",
"RNG 0.0005480448404947916 ± 7.536600306582695e-05 ms\n",
"Write (300 MB) 0.32338865598042804 ± 0.0029522671250881453 ms\n",
"RNG 0.38102173805236816 ± 0.03997051661768271 ms\n",
"\"\"\""
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"test_names = {}\n",
"vals = defaultdict(list)\n",
"uncs = defaultdict(list)\n",
"for data_entry in nppprofile_data.values():\n",
" for idx, line in enumerate(data_entry.strip().split('\\n')):\n",
" nameval, unc = line.replace(' ms', '').split(' ± ')\n",
" nameval = nameval.split(' ')\n",
" val = nameval[-1]\n",
" name = ' '.join(nameval[:-1])\n",
"\n",
" test_names[idx] = name\n",
" vals[idx].append(float(val))\n",
" uncs[idx].append(float(unc))\n",
"\n",
"vals = dict(vals)\n",
"uncs = dict(uncs)\n",
"machine_names = list(nppprofile_data.keys())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Having pulled the data and refomatted into a suitable format, we can now plot the results:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 864x1152 with 12 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"fig, axs = plt.subplots(len(vals)//2, 2, figsize=(12, 16))\n",
"\n",
"andmsk = np.array(['android' in nm for nm in machine_names])\n",
"\n",
"for i in vals.keys():\n",
" ax = axs[i//2, i%2]\n",
" val = np.array(vals[i])\n",
" unc = np.array(uncs[i])\n",
" test_name = test_names[i]\n",
" \n",
" xs = np.arange(len(machine_names))\n",
" ax.errorbar(xs[~andmsk], val[~andmsk], unc[~andmsk], fmt='o')\n",
" ax.errorbar(xs[andmsk], val[andmsk], unc[andmsk], fmt='o')\n",
" #ax.errorbar(np.arange(len(machine_names)), val, unc, fmt='o')\n",
" ax.set_title(test_name)\n",
" ax.set_ylim(0, ax.get_ylim()[1])\n",
" \n",
"for ax in axs.ravel():\n",
" ax.set_xticklabels('')\n",
"for ax in axs[:, 0]:\n",
" ax.set_ylabel('ms')\n",
" \n",
"if len(vals) % 2:\n",
" # odd number of tests\n",
" for ax in (axs[-1, 0], axs[-2, 1]):\n",
" ax.set_xticks(np.arange(len(machine_names)))\n",
" ax.set_xticklabels(machine_names, rotation=90)\n",
" axs[-1, 1].set_visible(False)\n",
"else:\n",
" # even number of tests\n",
" for ax in axs[-1]:\n",
" ax.set_xticks(np.arange(len(machine_names)))\n",
" ax.set_xticklabels(machine_names, rotation=90)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Orange is the phone, and Blue are the Intel-based \"traditional computers\". \n",
"\n",
"Rather shockingly, the phone equals or out-performs the computers in several cases. It is a few x slower for eigenvalue decomposition and SVD, however. It's not implausible that this is driven by the lack of carefully-optimized linear algebra libraries for aarch64, though, as the platform is less used for these applications, and I spent less time trying to find optimal packages or compile the underlying libraries myself."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Practical Astronomy Operations\n",
"\n",
"Below are more astronomy-oriented workflows, leaning heavily on [Astropy](http://www.astropy.org/). Concretely, these are:\n",
"\n",
"* Creation of a `SkyCoord` object uniformly sampling the sphere, using either a trigonometric approach, or a conceptually simpler spherical-volume-sampling approach.\n",
"* Matching two `SkyCoord` catalogs on-sky or in 3D space.\n",
"* `astropy.table.Table` basic operations\n",
"* `astropy.cosmology` redshift-to-luminosity distance conversion (for Planck2018 LCDM)\n",
"* Mixing of table and coordinate operations in one test\n",
"* FITS I/O and arithmetic operations on a smallish but relatively typical astronomy CCD-sized image\n",
"\n",
"The cell below can be run as-is or saved to a script and run that way."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Overwriting astprofile.py\n"
]
}
],
"source": [
"%%writefile astprofile.py\n",
"\n",
"import os\n",
"import time\n",
"import tempfile\n",
"\n",
"import numpy as np\n",
"\n",
"from astropy import units as u\n",
"from astropy.coordinates import SkyCoord, CartesianRepresentation\n",
"from astropy import table, cosmology\n",
"from astropy.io import fits\n",
"\n",
"N = 25\n",
"\n",
"sz_big = 512*512\n",
"sz_small = 1024\n",
"\n",
"dts1 = []\n",
"dts2 = []\n",
"for i in range(N + 1):\n",
" t1 = time.time()\n",
" phi = 2 * np.pi * np.random.rand(sz_big) * u.radian\n",
" theta = (np.arccos(1 - 2*np.random.rand(sz_big)) - np.pi/2)* u.radian\n",
" sc = SkyCoord(dec=theta, ra=phi)\n",
"\n",
" t2 = time.time()\n",
"\n",
" xyz = np.random.rand(3, sz_big*2)*2 - 1\n",
" r = np.sum(xyz**2, axis=0)**0.5\n",
" scr = SkyCoord(CartesianRepresentation(xyz[:, r<1][:, :sz_big]*u.au))\n",
" assert len(sc) == sz_big\n",
" t3 = time.time()\n",
"\n",
" dts1.append((t2 - t1)*1000)\n",
" dts2.append((t3 - t2)*1000)\n",
"# remove the first one since it might be tainted by imports\n",
"for l in (dts1, dts2):\n",
" del l[0]\n",
"\n",
"print(f'sc gen acos', np.mean(dts1), '±', np.std(dts1), 'ms')\n",
"print(f'sc gen r', np.mean(dts2), '±', np.std(dts2), 'ms')\n",
"\n",
"tmatch1 = []\n",
"tmatch2 = []\n",
"ttable1 = []\n",
"ttable2 = []\n",
"for i in range(N + 1):\n",
" sc_small = SkyCoord(ra=np.random.rand(sz_small)*np.pi/10 * u.radian,\n",
" dec=np.random.rand(sz_small)*np.pi/20 * u.radian,\n",
" distance=np.random.randn(sz_small)**2 * u.au)\n",
"\n",
" sc_big2 = sc.copy()\n",
"\n",
" st = time.time()\n",
" idx = sc_small.match_to_catalog_sky((sc_big2))[0]\n",
" sc_big2[idx]\n",
" et = time.time()\n",
" tmatch1.append((et-st)*1000)\n",
"\n",
" sc_big2 = scr.copy()\n",
" st = time.time()\n",
" idx = sc_small.match_to_catalog_3d((sc_big2))[0]\n",
" sc_big2[idx]\n",
" et = time.time()\n",
" tmatch2.append((et-st)*1000)\n",
"\n",
" objnames = ['SGC{i}' for i in range(sz_big)]\n",
" mags = np.random.rand(sz_big)*10 + 10\n",
"\n",
" st = time.time()\n",
" tab = table.Table(data={'name': objnames, 'ra': sc_big2.ra, 'dec': sc_big2.dec, 'r_mag': mags})\n",
" subtab = tab[tab['r_mag'] < 15]\n",
" st2 = time.time()\n",
" SkyCoord.guess_from_table(subtab)\n",
" et = time.time()\n",
" ttable1.append((st2-st)*1000)\n",
" ttable2.append((et-st2)*1000)\n",
"# remove the first one since it might be tainted by imports\n",
"for l in (tmatch1, tmatch2, ttable1, ttable2):\n",
" del l[0]\n",
"\n",
"print(f'sc match sky ({sz_big} + {sz_small}):', np.mean(tmatch1), '±', np.std(tmatch1), 'ms')\n",
"print(f'sc match 3d ({sz_big} + {sz_small}):', np.mean(tmatch2), '±', np.std(tmatch2), 'ms')\n",
"print(f'table operations (N={sz_big}):', np.mean(ttable1), '±', np.std(ttable1), 'ms')\n",
"print(f'table+sc guess (N={sz_big}):', np.mean(ttable2), '±', np.std(ttable2), 'ms')\n",
"\n",
"tcosmo = []\n",
"for i in range(N):\n",
" zs = np.random.rand(sz_small)\n",
" st = time.time()\n",
" cosmology.Planck18.luminosity_distance(zs)\n",
" et = time.time()\n",
" tcosmo.append((et-st)*1000)\n",
"# remove the first one since it might be tainted by imports\n",
"for l in (tcosmo,):\n",
" del l[0]\n",
"print(f'cosmology/lum dist (N={sz_small})', np.mean(tcosmo), '±', np.std(tcosmo), 'ms')\n",
"\n",
"\n",
"sz_image = (1024, 2048)\n",
"im = np.random.randint(2**15, size=sz_image, dtype='uint16')\n",
"\n",
"t_img_ops = []\n",
"t_img_writes = []\n",
"t_img_reads = []\n",
"for i in range(N//5):\n",
" st = time.time()\n",
" hdu = fits.PrimaryHDU(im)\n",
" hdul = fits.HDUList([hdu])\n",
"\n",
" new_data = hdul[0].data - 1.2*hdul[0].data\n",
" hdul.append(fits.ImageHDU(new_data))\n",
"\n",
" et = time.time()\n",
" t_img_ops.append((et-st)*1000)\n",
"\n",
" with tempfile.NamedTemporaryFile() as f:\n",
" st = time.time()\n",
" hdul.writeto(f)\n",
" et = time.time()\n",
" t_img_writes.append((et-st)*1000)\n",
" f.seek(0)\n",
" st = time.time()\n",
" ff = fits.open(f, mode='update')\n",
" d = [hdu.data.copy() for hdu in ff]\n",
" et = time.time()\n",
" t_img_reads.append((et-st)*1000)\n",
"# remove the first one since it might be tainted by imports\n",
"for l in (t_img_ops, t_img_writes, t_img_reads):\n",
" del l[0]\n",
"\n",
"print(f'fits operations ({sz_image}):', np.mean(t_img_ops), '±', np.std(t_img_ops), 'ms')\n",
"print(f'write fits ({sz_image} x 2):', np.mean(t_img_writes), '±', np.std(t_img_writes), 'ms')\n",
"print(f'read fits ({sz_image} x 2):', np.mean(t_img_reads), '±', np.std(t_img_reads), 'ms')"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"astprofile_data = {}\n",
"\n",
"#HP spectre x360 (2020) on ubuntu 20.04 shortly after reboot (attached to an egpu):\n",
"#Intel(R) Core(TM) i7-1065G7 CPU @ 1.30GHz\n",
"astprofile_data['ultrabook-linux'] = \"\"\"\n",
"sc gen acos 12.842092514038086 ± 0.6722904480313653 ms\n",
"sc gen r 50.011844635009766 ± 1.396443023079295 ms\n",
"sc match sky (262144 + 1024): 60.12982368469238 ± 8.70020309929742 ms\n",
"sc match 3d (262144 + 1024): 58.58334541320801 ± 2.0472988809456214 ms\n",
"table operations (N=262144): 29.957971572875977 ± 0.8245573441342412 ms\n",
"table+sc guess (N=262144): 2.676219940185547 ± 0.27850526726104774 ms\n",
"cosmology/lum dist (N=1024) 8.378162384033203 ± 10.027830058918557 ms\n",
"fits operations ((1024, 2048)): 8.668661117553711 ± 2.4358964051300016 ms\n",
"write fits ((1024, 2048) x 2): 14.883184432983398 ± 3.074780985818674 ms\n",
"read fits ((1024, 2048) x 2): 6.48188591003418 ± 1.3695594136788176 ms\n",
"\"\"\"\n",
"\n",
"#Same machine on windows subsystem for linux\n",
"astprofile_data['ultrabook-wsl'] = \"\"\"\n",
"sc gen acos 14.613504409790039 ± 0.7271670047555933 ms\n",
"sc gen r 56.64877891540527 ± 1.7119539234432843 ms\n",
"sc match sky (262144 + 1024): 69.91011619567871 ± 1.8061275966995818 ms\n",
"sc match 3d (262144 + 1024): 71.30675315856934 ± 1.8147002549001203 ms\n",
"table operations (N=262144): 32.91806221008301 ± 1.124963057940266 ms\n",
"table+sc guess (N=262144): 3.114309310913086 ± 0.26113611482609267 ms\n",
"cosmology/lum dist (N=1024) 11.075417200724283 ± 0.805692122180425 ms\n",
"fits operations ((1024, 2048)): 7.7422261238098145 ± 1.9125828402586698 ms\n",
"write fits ((1024, 2048) x 2): 14.831364154815674 ± 0.3663309177892698 ms\n",
"read fits ((1024, 2048) x 2): 6.840229034423828 ± 1.1189445580168444 ms\n",
"\"\"\"\n",
"\n",
"# 2015 iMac nearly maxed out\n",
"#Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz\n",
"astprofile_data['iMac-2015']=\"\"\"\n",
"sc gen acos 13.168296813964844 ± 1.0857403537185053 ms\n",
"sc gen r 53.190011978149414 ± 3.155536790274871 ms\n",
"sc match sky (262144 + 1024): 51.28659248352051 ± 3.682698417194298 ms\n",
"sc match 3d (262144 + 1024): 50.61375617980957 ± 2.1685561403578313 ms\n",
"table operations (N=262144): 34.831552505493164 ± 1.599833744113237 ms\n",
"table+sc guess (N=262144): 2.776632308959961 ± 0.3777887938317537 ms\n",
"cosmology/lum dist (N=1024) 10.283837715784708 ± 0.05534307494328314 ms\n",
"fits operations ((1024, 2048)): 11.98512315750122 ± 3.069392256270619 ms\n",
"write fits ((1024, 2048) x 2): 53.21228504180908 ± 0.3454074507822927 ms\n",
"read fits ((1024, 2048) x 2): 32.88692235946655 ± 1.148757105488188 ms\n",
"\"\"\"\n",
"\n",
"# Midrange windows desktop on WSL\n",
"#Intel(R) Core(TM) i5-4670K CPU @ 3.40GHz\n",
"astprofile_data['desktop-2014-wsl'] = \"\"\"\n",
"sc gen acos 15.425453186035156 ± 0.4633349179205587 ms\n",
"sc gen r 68.1332015991211 ± 6.873381615891233 ms\n",
"sc match sky (262144 + 1024): 74.66437339782715 ± 2.0985225455659644 ms\n",
"sc match 3d (262144 + 1024): 77.05792427062988 ± 1.0324446191673073 ms\n",
"table operations (N=262144): 35.49234390258789 ± 0.6469692568082389 ms\n",
"table+sc guess (N=262144): 3.2297611236572266 ± 0.16665856379837637 ms\n",
"cosmology/lum dist (N=1024) 10.366390148798624 ± 0.28290357005102457 ms\n",
"fits operations ((1024, 2048)): 20.4164981842041 ± 9.674421361333419 ms\n",
"write fits ((1024, 2048) x 2): 15.024721622467041 ± 0.4563531644205001 ms\n",
"read fits ((1024, 2048) x 2): 22.221803665161133 ± 6.64362245100332 ms\n",
"\"\"\"\n",
"\n",
"#Samsung S21 Ultra using Pydroid\n",
"# Snapdragon 888\n",
"astprofile_data['highandroid-2021-pydroid'] = \"\"\"\n",
"sc gen acos 23.489856719970703 ± 0.42243584345656904 ms\n",
"sc gen r 77.1247386932373 ± 0.28679086159876765 ms\n",
"sc match sky (262144 + 1024): 63.94397735595703 ± 9.868335193276538 ms\n",
"sc match 3d (262144 + 1024): 60.43668746948242 ± 0.42401081727717355 ms\n",
"table operations (N=262144): 39.35305595397949 ± 0.4626641513497839 ms\n",
"table+sc guess (N=262144): 5.342597961425781 ± 0.24932977509490548 ms\n",
"cosmology/lum dist (N=1024) 11.884967486063639 ± 0.9114423661391373 ms\n",
"fits operations ((1024, 2048)): 13.368892669677734 ± 0.48117895875093597 ms\n",
"write fits ((1024, 2048) x 2): 26.287508010864258 ± 1.5190836158341297 ms\n",
"read fits ((1024, 2048) x 2): 28.046226501464844 ± 12.596456665834499 ms\n",
"\"\"\"\n",
"\n",
"#Samsung S21 Ultra using AnLinux Ubuntu, enh processing/no battery optimization\n",
"# Snapdragon 888\n",
"astprofile_data['highandroid-2021-anlinux'] = \"\"\"\n",
"sc gen acos 13.16655158996582 ± 0.2982963311616191 ms\n",
"sc gen r 45.601444244384766 ± 2.2279403657623265 ms\n",
"sc match sky (262144 + 1024): 51.74835205078125 ± 0.7975183843463821 ms\n",
"sc match 3d (262144 + 1024): 52.8885555267334 ± 0.7346391386768037 ms\n",
"table operations (N=262144): 46.15926742553711 ± 0.40573682356127655 ms\n",
"table+sc guess (N=262144): 2.8734397888183594 ± 0.055282268475837726 ms\n",
"cosmology/lum dist (N=1024) 6.613612174987793 ± 0.15888013579429994 ms\n",
"fits operations ((1024, 2048)): 10.212361812591553 ± 2.448267983131714 ms\n",
"write fits ((1024, 2048) x 2): 15.184521675109863 ± 0.5650024980852292 ms\n",
"read fits ((1024, 2048) x 2): 27.90158987045288 ± 4.763780226317586 ms\n",
"\"\"\""
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"test_names = {}\n",
"vals = defaultdict(list)\n",
"uncs = defaultdict(list)\n",
"for data_entry in astprofile_data.values():\n",
" for idx, line in enumerate(data_entry.strip().split('\\n')):\n",
" nameval, unc = line.replace(' ms', '').split(' ± ')\n",
" nameval = nameval.split(' ')\n",
" val = nameval[-1]\n",
" name = ' '.join(nameval[:-1])\n",
"\n",
" test_names[idx] = name\n",
" vals[idx].append(float(val))\n",
" uncs[idx].append(float(unc))\n",
"\n",
"vals = dict(vals)\n",
"uncs = dict(uncs)\n",
"machine_names = list(astprofile_data.keys())"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"False"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"bool(len(vals) % 2)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 864x1152 with 10 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"fig, axs = plt.subplots(int(np.ceil(len(vals)/2)), 2, figsize=(12, 16))\n",
"\n",
"\n",
"andmsk = np.array(['android' in nm for nm in machine_names])\n",
"\n",
"for i in vals.keys():\n",
" ax = axs[i//2, i%2]\n",
" val = np.array(vals[i])\n",
" unc = np.array(uncs[i])\n",
" test_name = test_names[i]\n",
" \n",
" xs = np.arange(len(machine_names))\n",
" ax.errorbar(xs[~andmsk], val[~andmsk], unc[~andmsk], fmt='o')\n",
" ax.errorbar(xs[andmsk], val[andmsk], unc[andmsk], fmt='o')\n",
" ax.set_title(test_name)\n",
" ax.set_ylim(0, ax.get_ylim()[1])\n",
" \n",
"for ax in axs.ravel():\n",
" ax.set_xticklabels('')\n",
"for ax in axs[:, 0]:\n",
" ax.set_ylabel('ms')\n",
"\n",
"if len(machine_names) % 2:\n",
" # odd number of tests\n",
" for ax in (axs[-1, 0], axs[-2, 1]):\n",
" ax.set_xticks(np.arange(len(machine_names)))\n",
" ax.set_xticklabels(machine_names, rotation=90)\n",
" axs[-1, 1].set_visible(False)\n",
"else:\n",
" # even number of tests\n",
" for ax in axs[-1]:\n",
" ax.set_xticks(np.arange(len(machine_names)))\n",
" ax.set_xticklabels(machine_names, rotation=90)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Orange is the phone, and Blue are the Intel-based \"traditional computers\". \n",
"\n",
"Broadly speaking, all perform fairly similarly. In some cases the phone beats out the computers, and in others vice versa. The only striking outlier is the slow hard drive on the older mac (not necessarily surprising, as it is the oldest SSD on the list given that the mid-range desktop got a newer drive several years later."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Conclusions"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Surprisingly: for most applications, the phone is comparable to, or in a few cases even exceeds the performance of the laptop/desktop devices. In a few cases it's 2-3x slower, although this is probably at least in part due to a lack of care in compilation/optimization for aarch64.\n",
"\n",
"**If your android phone can be connected to a keyboard/mouse/monitor, and has one of the \"Desktop-like\" modes available, it may be a reasonable replacement for a laptop or mid-range desktop if you are a scientific programmer/data scientist.**\n",
"\n",
"Caveat: this is a pretty top-of-the-line ARM processor for phones, and the more traditional computer comparisons are more like mid-range devices. Then again, computers are usually more expensive than phones, such that all options listed here are fairly comparable as of 2021 (since the older devices are cheaper than their brand-new counterparts). So from a FLOP per dollar perspective on a regular daily-work device, compabilities are suprisingly close. We may be nearing the only-one-device-needed threshold!"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.5"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
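
The benchmark scripts in the notebook repeat one pattern many times: time a call, discard the first sample, and print `name mean ± std ms` so the parsing cells can read it back. A minimal sketch of that pattern as reusable helpers might look like the following. The names `time_ms`, `format_result`, and `parse_result` are hypothetical and do not appear in the notebook, and the sketch uses `time.perf_counter` where the notebook uses `time.time`, so treat it as an illustration under those assumptions, not as the author's method.

```python
import statistics
import time


def time_ms(func, n_runs=50, n_warmup=1):
    """Return (mean, std) of func's wall-clock run time in milliseconds.

    The first n_warmup calls are executed but not recorded, mirroring the
    'remove the first one' step in the benchmark scripts.
    """
    samples = []
    for i in range(n_warmup + n_runs):
        start = time.perf_counter()
        func()
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        if i >= n_warmup:
            samples.append(elapsed_ms)
    # pstdev is the population standard deviation, like np.std's default
    return statistics.mean(samples), statistics.pstdev(samples)


def format_result(name, mean, std):
    """Format a result in the 'name value ± std ms' layout used by the scripts."""
    return f"{name} {mean} ± {std} ms"


def parse_result(line):
    """Split a formatted line back into (name, value, uncertainty)."""
    nameval, unc = line.replace(" ms", "").split(" ± ")
    name, val = nameval.rsplit(" ", 1)
    return name, float(val), float(unc)


if __name__ == "__main__":
    # Hypothetical workload, chosen only so the example needs no third-party packages
    mean, std = time_ms(lambda: sum(x * x for x in range(10_000)), n_runs=20)
    line = format_result("Sum of squares (n=10000)", mean, std)
    print(line)
    print(parse_result(line))
```

Because every measurement goes through one helper that always converts to milliseconds, a units mismatch between the printed label and the recorded value is harder to introduce by accident.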
@eteq
eteq commented Apr 19, 2021

Additional runs of the benchmark scripts welcome! Feel free to just drop the output here in the comments.
