@brian-armstrong-discord
Created November 7, 2017 19:59
Lilliput vs. Pillow-simd
Setup: Intel Haswell, Debian. Both tests ran against the same image libraries (libjpeg-turbo etc).
Both ran on a single thread only. Save qualities were 85 for JPEG and WEBP, compress level 7 for PNG.
Both tests ran many iterations and then averaged results.
Benchmarking code can be found at https://github.com/discordapp/lilliput-bench
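The "many iterations, then averaged" methodology can be sketched roughly as below. This is only an illustration, not the code from lilliput-bench: the function names `bench` and `report` and the default iteration count are assumptions made for the example.

```python
import time


def bench(fn, iterations=1000):
    """Call fn() repeatedly; return (avg, min, max) wall-clock time in ms."""
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return sum(samples) / len(samples), min(samples), max(samples)


def report(label, stats):
    """Format one result line in roughly the style used in the tables below."""
    avg, lo, hi = stats
    return f"{label}, avg: {avg:.2f} ms min: {lo:.2f} ms max: {hi:.2f} ms"
```

For example, `print(report("noop", bench(lambda: None)))` prints one line per test. Reporting min and max next to the average makes one-off outliers (such as a slow first call) easy to spot.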
Test types:
- Header reading: We don't actually know what's in a blob of image bytes when we get it. Reading the header allows us
to decide if we want to resize the image.
- Resize, 256x256 => 32x32: We have lots of icon-sized assets that we need resized into various smaller formats.
These make up a pretty sizable percentage of our resizes.
- Resize, 1920x1080 => 800x600: This test is a somewhat typical case for some arbitrary remote image that we've
fetched and have to crop and resize down to a size the client can use. This gives us a test case for large images.
- Transcode: Several test cases that are useful for us. Image decode/encode only, no resizing.
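To show why a header read can be so much cheaper than a full decode, here is a minimal sketch that pulls the dimensions out of a PNG blob by looking only at its first 24 bytes. It is not how Lilliput or Pillow-SIMD read headers; `png_dimensions` and `fake_png_header` are hypothetical helpers written for this example, and only PNG is handled.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(blob):
    """Return (width, height) from a PNG's IHDR chunk, or None if not a PNG."""
    # Layout: signature (8) + chunk length (4) + chunk type (4) + width (4) + height (4)
    if len(blob) < 24 or blob[:8] != PNG_SIGNATURE or blob[12:16] != b"IHDR":
        return None
    width, height = struct.unpack(">II", blob[16:24])
    return width, height


def fake_png_header(width, height):
    """Build only the signature and IHDR chunk of a PNG, enough for header tests."""
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 2, 0, 0, 0)  # 8-bit RGB
    crc = zlib.crc32(b"IHDR" + ihdr)
    return (PNG_SIGNATURE + struct.pack(">I", len(ihdr)) + b"IHDR"
            + ihdr + struct.pack(">I", crc))
```

A caller could use the returned size to decide whether an image needs resizing at all before paying for a decode.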
Pillow-simd
======================
JPEG 1920x1080 header read: 1920x1080, avg: 0.037861 ms min: 0.033855 ms max: 13.420820 ms
PNG 1920x1080 header read: 1920x1080, avg: 0.041355 ms min: 0.038862 ms max: 0.269175 ms
WEBP 1920x1080 header read: 1920x1080, avg: 37.851110 ms min: 35.255909 ms max: 65.973997 ms
GIF 1920x1080 header read: 1920x1080, avg: 0.034195 ms min: 0.031948 ms max: 0.703096 ms
JPEG 256x256 => 32x32: 1083 Bytes, avg: 0.97 ms min: 0.89 ms max: 1.75 ms
PNG 256x256 => 32x32: 2090 Bytes, avg: 1.39 ms min: 1.31 ms max: 2.42 ms
WEBP 256x256 => 32x32: 1366 Bytes, avg: 2.88 ms min: 2.65 ms max: 4.70 ms
GIF 256x256 => 32x32: 61644 Bytes, avg: 71.76 ms min: 68.54 ms max: 96.00 ms
JPEG 1920x1080 => 800x600: 123522 Bytes, avg: 39.03 ms min: 34.09 ms max: 42.21 ms
PNG 1920x1080 => 800x600: 856122 Bytes, avg: 278.56 ms min: 272.84 ms max: 298.88 ms
WEBP 1920x1080 => 800x600: 93564 Bytes, avg: 147.06 ms min: 142.81 ms max: 171.15 ms
GIF 1920x1080 => 800x600: 4017933 Bytes, avg: 1819.24 ms min: 1734.70 ms max: 1915.06 ms
PNG 256x256 => WEBP 256x256: 9790 Bytes, avg: 20.67 ms min: 20.12 ms max: 31.40 ms
JPEG 256x256 => PNG 256x256: 38724 Bytes, avg: 7.13 ms min: 6.76 ms max: 9.91 ms
GIF 256x256 => PNG 256x256: 8511 Bytes, avg: 1.50 ms min: 1.44 ms max: 1.92 ms
Lilliput
======================
JPEG 1920x1080 header read: 1920x1080, avg: 0.005083 ms, min: 0.004398 ms, max: 0.663941 ms
PNG 1920x1080 header read: 1920x1080, avg: 0.003636 ms, min: 0.003362 ms, max: 0.134671 ms
WEBP 1920x1080 header read: 1920x1080, avg: 0.002145 ms, min: 0.001795 ms, max: 1.949846 ms
GIF 1920x1080 header read: 1920x1080, avg: 0.003795 ms, min: 0.003521 ms, max: 0.140039 ms
JPEG 256x256 => 32x32: 1078 Bytes, avg: 0.62 ms, min: 0.58 ms, max: 1.07 ms
PNG 256x256 => 32x32: 1411 Bytes, avg: 0.92 ms, min: 0.88 ms, max: 1.43 ms
WEBP 256x256 => 32x32: 946 Bytes, avg: 3.19 ms, min: 2.87 ms, max: 4.57 ms
GIF 256x256 => 32x32: 18310 Bytes, avg: 27.96 ms, min: 27.31 ms, max: 39.77 ms
JPEG 1920x1080 => 800x600: 117991 Bytes, avg: 37.30 ms, min: 36.06 ms, max: 50.85 ms
PNG 1920x1080 => 800x600: 722169 Bytes, avg: 178.61 ms, min: 176.41 ms, max: 203.03 ms
WEBP 1920x1080 => 800x600: 88172 Bytes, avg: 129.14 ms, min: 124.67 ms, max: 171.35 ms
GIF 1920x1080 => 800x600: 2372725 Bytes, avg: 3255.42 ms, min: 3220.95 ms, max: 3301.01 ms
PNG 256x256 => WEBP 256x256: 9790 Bytes, avg: 22.52 ms, min: 21.53 ms, max: 31.95 ms
JPEG 256x256 => PNG 256x256: 40134 Bytes, avg: 6.38 ms, min: 6.20 ms, max: 9.17 ms
GIF 256x256 => PNG 256x256: 18053 Bytes, avg: 5.57 ms, min: 5.44 ms, max: 7.88 ms
Conclusion:
Lilliput seems to be a suitable replacement for Pillow-SIMD for our use cases.
@homm

homm commented Oct 17, 2018

Bonus track: I have implemented optimization for ImageOps.fit which will be available in Pillow-SIMD 5.4.

It used to crop the image first, then resize the cropped image to the desired size. Now it calculates the crop size as before, but instead of cropping, it passes the rectangle to the resize function, which uses it as the source region. As a result, you can expect a speedup of about 32% in the JPEG 1920x1080 => 800x600 test.
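
The crop-rectangle step can be sketched as follows. This is a simplified illustration, not the actual ImageOps.fit code: `fit_box` is a hypothetical helper that assumes a centered crop with no bleed.

```python
def fit_box(src_size, dst_size):
    """Centered crop rectangle of the source that matches the target aspect ratio."""
    src_w, src_h = src_size
    dst_w, dst_h = dst_size
    # Compare aspect ratios by cross-multiplying integers to avoid float error.
    if src_w * dst_h > src_h * dst_w:
        # Source is wider than the target: trim left and right.
        crop_w = src_h * dst_w / dst_h
        left = (src_w - crop_w) / 2
        return (left, 0, left + crop_w, src_h)
    # Source is taller than (or matches) the target: trim top and bottom.
    crop_h = src_w * dst_h / dst_w
    top = (src_h - crop_h) / 2
    return (0, top, src_w, top + crop_h)
```

For the 1920x1080 => 800x600 case this gives the rectangle (240, 0, 1680, 1080). With Pillow, such a rectangle can be handed to `Image.resize` through its `box` argument, e.g. `im.resize((800, 600), box=fit_box(im.size, (800, 600)))`, so no intermediate cropped image has to be allocated.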

JPEG 1920x1080 header read:     1920x1080,      avg: 0.075202 ms        min: 0.072956 ms        max: 0.258923 ms
PNG 1920x1080 header read:      1920x1080,      avg: 0.054562 ms        min: 0.049829 ms        max: 0.174999 ms
WEBP 1920x1080 header read:     1920x1080,      avg: 0.526440 ms        min: 0.169992 ms        max: 5.101919 ms
GIF 1920x1080 header read:      1920x1080,      avg: 0.043169 ms        min: 0.037909 ms        max: 0.396013 ms
JPEG 256x256 => 32x32:  1109 Bytes,     avg: 0.68 ms    min: 0.67 ms    max: 0.91 ms
PNG 256x256 => 32x32:   2603 Bytes,     avg: 3.04 ms    min: 3.03 ms    max: 3.17 ms
WEBP 256x256 => 32x32:  652 Bytes,      avg: 1.81 ms    min: 1.80 ms    max: 2.00 ms
GIF 256x256 => 32x32:   1897 Bytes,     avg: 3.39 ms    min: 3.38 ms    max: 3.57 ms
JPEG 1920x1080 => 800x600:      107450 Bytes,   avg: 17.66 ms   min: 17.60 ms   max: 20.16 ms
PNG 1920x1080 => 800x600:       810516 Bytes,   avg: 239.70 ms  min: 239.29 ms  max: 241.37 ms
WEBP 1920x1080 => 800x600:      84676 Bytes,    avg: 108.89 ms  min: 108.72 ms  max: 114.47 ms
GIF 1920x1080 => 800x600:       411470 Bytes,   avg: 83.33 ms   min: 83.17 ms   max: 85.63 ms
PNG 256x256 => WEBP 256x256:    14668 Bytes,    avg: 12.45 ms   min: 12.43 ms   max: 12.66 ms
JPEG 256x256 => PNG 256x256:    109987 Bytes,   avg: 23.32 ms   min: 23.23 ms   max: 23.62 ms
GIF 256x256 => PNG 256x256:     43766 Bytes,    avg: 3.42 ms    min: 3.40 ms    max: 3.53 ms

Overall, the JPEG 1920x1080 => 800x600 test is now 77% faster on Pillow-SIMD than on Lilliput.

@WardBenjamin

WardBenjamin commented Nov 17, 2020

@homm For what it's worth, I'm not seeing the same results you are, using:

  • Go 1.15.5 (latest stable)
  • discord/lilliput b93131c
  • Python 3.8.5, Pillow-SIMD==7.0.0.post3 (all native dependencies installed, compiled with AVX2)
    • side note, the latest version of Pillow-SIMD on PyPI is lagging behind, and upstream Pillow is a major version ahead as well
  • homm/lilliput-bench 5afe036
    • this has all of your modifications included

On a virtualized Ubuntu 20.04 system with 6 cores (host CPU: Ryzen 5 2600X):

lscpu output
Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
Address sizes:                   48 bits physical, 48 bits virtual
CPU(s):                          6
On-line CPU(s) list:             0-5
Thread(s) per core:              1
Core(s) per socket:              6
Socket(s):                       1
NUMA node(s):                    1
Vendor ID:                       AuthenticAMD
CPU family:                      23
Model:                           8
Model name:                      AMD Ryzen 5 2600X Six-Core Processor
Stepping:                        2
CPU MHz:                         3600.002
BogoMIPS:                        7200.00
Virtualization:                  AMD-V
Hypervisor vendor:               KVM
Virtualization type:             full
L1d cache:                       192 KiB
L1i cache:                       384 KiB
L2 cache:                        3 MiB
L3 cache:                        16 MiB
NUMA node0 CPU(s):               0-5
Vulnerability Itlb multihit:     Not affected
Vulnerability L1tf:              Not affected
Vulnerability Mds:               Not affected
Vulnerability Meltdown:          Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1:        Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:        Mitigation; Full AMD retpoline, STIBP disabled, RSB filling
Vulnerability Srbds:             Not affected
Vulnerability Tsx async abort:   Not affected
Flags:                           fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid tsc_known_freq pni pclmulqdq ssse3 cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx rdrand hypervisor lahf_lm cmp_legacy svm cr8_legacy abm sse4a misalignsse 3dnowprefetch ssbd vmmcall fsgsbase avx2 rdseed clflushopt arat nrip_save flushbyasid decodeassists
lsb-release output
Distributor ID:	Ubuntu
Description:	Ubuntu 20.04.1 LTS
Release:	20.04
Codename:	focal
uname output
Linux BEN-VM-UBUNTU 5.4.0-53-generic #59-Ubuntu SMP Wed Oct 21 09:38:44 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

Here are my results with Lilliput:

JPEG 1920x1080 header read:	1920x1080,	avg: 0.004800 ms,	min: 0.003180 ms,	max: 0.323940 ms
PNG 1920x1080 header read:	1920x1080,	avg: 0.005300 ms,	min: 0.004790 ms,	max: 0.527940 ms
WEBP 1920x1080 header read:	1920x1080,	avg: 0.001274 ms,	min: 0.001120 ms,	max: 0.138480 ms
GIF 1920x1080 header read:	1920x1080,	avg: 0.003031 ms,	min: 0.002840 ms,	max: 0.126557 ms
JPEG 256x256 => 32x32:	        1117 Bytes,	avg: 0.47 ms,	min: 0.41 ms,	max: 9.69 ms
PNG 256x256 => 32x32:   	2623 Bytes,	avg: 1.73 ms,	min: 1.63 ms,	max: 11.49 ms
WEBP 256x256 => 32x32:  	650 Bytes,	avg: 1.37 ms,	min: 1.27 ms,	max: 4.27 ms
GIF 256x256 => 32x32:   	1954 Bytes,	avg: 1.58 ms,	min: 1.47 ms,	max: 3.95 ms
JPEG 1920x1080 => 800x600:	104917 Bytes,	avg: 20.43 ms,	min: 19.49 ms,	max: 29.70 ms
PNG 1920x1080 => 800x600:	748236 Bytes,	avg: 115.92 ms,	min: 109.55 ms,	max: 163.37 ms
WEBP 1920x1080 => 800x600:	82102 Bytes,	avg: 91.57 ms,	min: 82.92 ms,	max: 129.05 ms
GIF 1920x1080 => 800x600:	314587 Bytes,	avg: 72.85 ms,	min: 65.94 ms,	max: 94.18 ms
PNG 256x256 => WEBP 256x256:	14668 Bytes,	avg: 9.28 ms,	min: 8.76 ms,	max: 19.37 ms
JPEG 256x256 => PNG 256x256:	108668 Bytes,	avg: 10.08 ms,	min: 9.58 ms,	max: 16.11 ms
GIF 256x256 => PNG 256x256:	102390 Bytes,	avg: 11.13 ms,	min: 10.64 ms,	max: 25.81 ms

and Pillow-SIMD:

JPEG 1920x1080 header read:	1920x1080,	avg: 0.053079 ms	min: 0.046968 ms	max: 0.318050 ms
PNG 1920x1080 header read:	1920x1080,	avg: 0.039227 ms	min: 0.035048 ms	max: 0.718832 ms
WEBP 1920x1080 header read:	1920x1080,	avg: 1.223647 ms	min: 0.134468 ms	max: 10.828972 ms
GIF 1920x1080 header read:	1920x1080,	avg: 0.026811 ms	min: 0.024319 ms	max: 1.053572 ms
JPEG 256x256 => 32x32:	        1109 Bytes,	avg: 0.51 ms	min: 0.47 ms	max: 5.39 ms
PNG 256x256 => 32x32:	        2603 Bytes,	avg: 2.35 ms	min: 2.27 ms	max: 8.18 ms
WEBP 256x256 => 32x32:	        652 Bytes,	avg: 1.53 ms	min: 1.46 ms	max: 5.44 ms
GIF 256x256 => 32x32:	        1897 Bytes,	avg: 2.52 ms	min: 2.40 ms	max: 9.40 ms
JPEG 1920x1080 => 800x600:	107450 Bytes,	avg: 14.77 ms	min: 14.30 ms	max: 17.00 ms
PNG 1920x1080 => 800x600:	810516 Bytes,	avg: 199.02 ms	min: 193.53 ms	max: 210.72 ms
WEBP 1920x1080 => 800x600:	84676 Bytes,	avg: 93.90 ms	min: 88.23 ms	max: 130.73 ms
GIF 1920x1080 => 800x600:	411470 Bytes,	avg: 69.07 ms	min: 65.87 ms	max: 79.41 ms
PNG 256x256 => WEBP 256x256:	14668 Bytes,	avg: 10.38 ms	min: 9.88 ms	max: 15.85 ms
JPEG 256x256 => PNG 256x256:	109987 Bytes,	avg: 20.04 ms	min: 19.31 ms	max: 42.36 ms
GIF 256x256 => PNG 256x256:	43766 Bytes,	avg: 3.06 ms	min: 2.89 ms	max: 4.13 ms

A few of my observations:

  • The results of my WEBP header read in Pillow-SIMD are over 2x yours. Weird.
  • The Pillow-SIMD WEBP header read is also almost 1000x slower than Lilliput. Lilliput also shows a 9x better GIF header read, a 7x better PNG read, and an 11x better JPEG header read. So yes, Pillow-SIMD still totally loses the header test.
  • My Python bench runs faster than yours across the board, but so does my Lilliput bench compared to the original benchmarks. Since 2018, there have been several updates to both libraries, the core Python/Go tooling, etc. I'm also running a faster processor, but in a virtual environment.
  • Lilliput shows better average speed results in every test except for (3/15):
    • JPEG 1920x1080 => 800x600 Lilliput: 20.43ms, Pillow-SIMD: 14.77ms
    • GIF 1920x1080 => 800x600 Lilliput: 72.85ms, Pillow-SIMD: 69.07ms
    • GIF 256x256 => PNG 256x256 Lilliput: 11.13ms, Pillow-SIMD: 3.06ms (this is significant, but not a common real world use case)
  • In JPEG 256x256 => PNG 256x256, Lilliput is about 2x faster. In PNG 1920x1080 => 800x600, Lilliput is over 70% faster. In PNG 256x256 => 32x32, Lilliput is over 35% faster. In GIF 256x256 => 32x32, Lilliput is almost 60% faster.
  • It's still hard to compare PNG and GIF compression ratio and speed, but in the PNG 1920x1080 => 800x600 test Lilliput produces an 8% smaller PNG, and does it faster. That's a clear loss for Pillow-SIMD. The GIF 1920x1080 => 800x600 comparison is less clear, since Lilliput wins on compression ratio while Pillow-SIMD is slightly faster.
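
For anyone who wants to recheck the ratios quoted above, here is a small sketch that recomputes them from the average times in the two tables in this comment. The dictionary and function names are just for illustration.

```python
# Average times in ms, copied from the two result tables in this comment.
LILLIPUT = {
    "JPEG header read": 0.004800, "PNG header read": 0.005300,
    "WEBP header read": 0.001274, "GIF header read": 0.003031,
    "JPEG 256x256 => 32x32": 0.47, "PNG 256x256 => 32x32": 1.73,
    "WEBP 256x256 => 32x32": 1.37, "GIF 256x256 => 32x32": 1.58,
    "JPEG 1920x1080 => 800x600": 20.43, "PNG 1920x1080 => 800x600": 115.92,
    "WEBP 1920x1080 => 800x600": 91.57, "GIF 1920x1080 => 800x600": 72.85,
    "PNG 256x256 => WEBP 256x256": 9.28, "JPEG 256x256 => PNG 256x256": 10.08,
    "GIF 256x256 => PNG 256x256": 11.13,
}
PILLOW_SIMD = {
    "JPEG header read": 0.053079, "PNG header read": 0.039227,
    "WEBP header read": 1.223647, "GIF header read": 0.026811,
    "JPEG 256x256 => 32x32": 0.51, "PNG 256x256 => 32x32": 2.35,
    "WEBP 256x256 => 32x32": 1.53, "GIF 256x256 => 32x32": 2.52,
    "JPEG 1920x1080 => 800x600": 14.77, "PNG 1920x1080 => 800x600": 199.02,
    "WEBP 1920x1080 => 800x600": 93.90, "GIF 1920x1080 => 800x600": 69.07,
    "PNG 256x256 => WEBP 256x256": 10.38, "JPEG 256x256 => PNG 256x256": 20.04,
    "GIF 256x256 => PNG 256x256": 3.06,
}


def speedup(test):
    """How many times faster Lilliput is than Pillow-SIMD (below 1 means slower)."""
    return PILLOW_SIMD[test] / LILLIPUT[test]


# Tests where Pillow-SIMD has the lower average time.
pillow_wins = [test for test in LILLIPUT if speedup(test) < 1]

for test in LILLIPUT:
    print(f"{test}: {speedup(test):.2f}x")
```

Note that "70% faster" here means a speedup of about 1.7x (Pillow-SIMD takes about 70% more time), which is not the same as Lilliput taking 70% less time.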

Overall, this shows a very compelling advantage for Lilliput, even in a fair competition.
