Lilliput vs. Pillow-simd
Setup: Intel Haswell, Debian. Both tests ran against the same image libraries (libjpeg-turbo etc).
Both ran on a single thread only. Save qualities were 85 for JPEG and WEBP, compress level 7 for PNG.
Both tests ran many iterations and then averaged results.
Benchmarking code can be found at https://github.com/discordapp/lilliput-bench
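For readers who want to see the shape of the measurement, here is a minimal sketch of a timing loop that produces avg/min/max lines like the ones below. It is not the lilliput-bench code; the names `bench` and `report` are made up for illustration, and the workload is a stand-in for a real decode/resize/encode call.

```python
import time


def bench(fn, iterations=100):
    """Call fn() `iterations` times; return (avg, min, max) in milliseconds."""
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return sum(samples) / len(samples), min(samples), max(samples)


def report(label, stats):
    """Format one result line in the style used by the tables in this gist."""
    avg, lo, hi = stats
    return f"{label}, avg: {avg:.2f} ms min: {lo:.2f} ms max: {hi:.2f} ms"


if __name__ == "__main__":
    # Stand-in workload; a real run would process an image here.
    print(report("sum(range(10000))", bench(lambda: sum(range(10000)), 50)))
```

Reporting min and max next to the average matters here: a single slow first call shows up in max without being hidden by the mean.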
Test types:
- Header reading: We don't actually know what's in a blob of image bytes when we get it. Reading the header allows us
to decide if we want to resize the image.
- Resize, 256x256 => 32x32: We have lots of icon-sized assets that we need resized into various smaller formats.
These make up a pretty sizable percentage of our resizes.
- Resize, 1920x1080 => 800x600: This test is a somewhat typical case for some arbitrary remote image that we've
fetched and have to crop and resize down to a size the client can use. This gives us a test case for large images.
- Transcode: A few test cases that are useful for us. Image decode/encode only, no resizing.
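To make the "header reading" test type concrete, here is a small sketch of what such a read involves for PNG, using only the Python standard library. It is not how Lilliput or Pillow implement it; `png_dimensions` is a hypothetical helper that relies on the PNG layout of an 8-byte signature followed by the IHDR chunk.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(blob):
    """Return (width, height) of a PNG blob by inspecting its first 24 bytes."""
    if len(blob) < 24 or blob[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG")
    # Bytes 8..12 hold the first chunk's length, bytes 12..16 its type.
    # The PNG format requires IHDR to be the first chunk.
    if blob[12:16] != b"IHDR":
        raise ValueError("missing IHDR chunk")
    # Width and height are big-endian unsigned 32-bit integers.
    width, height = struct.unpack(">II", blob[16:24])
    return width, height
```

No pixel data is decoded, which is why a header read can be orders of magnitude cheaper than a resize.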
Pillow-simd
======================
JPEG 1920x1080 header read: 1920x1080, avg: 0.037861 ms min: 0.033855 ms max: 13.420820 ms
PNG 1920x1080 header read: 1920x1080, avg: 0.041355 ms min: 0.038862 ms max: 0.269175 ms
WEBP 1920x1080 header read: 1920x1080, avg: 37.851110 ms min: 35.255909 ms max: 65.973997 ms
GIF 1920x1080 header read: 1920x1080, avg: 0.034195 ms min: 0.031948 ms max: 0.703096 ms
JPEG 256x256 => 32x32: 1083 Bytes, avg: 0.97 ms min: 0.89 ms max: 1.75 ms
PNG 256x256 => 32x32: 2090 Bytes, avg: 1.39 ms min: 1.31 ms max: 2.42 ms
WEBP 256x256 => 32x32: 1366 Bytes, avg: 2.88 ms min: 2.65 ms max: 4.70 ms
GIF 256x256 => 32x32: 61644 Bytes, avg: 71.76 ms min: 68.54 ms max: 96.00 ms
JPEG 1920x1080 => 800x600: 123522 Bytes, avg: 39.03 ms min: 34.09 ms max: 42.21 ms
PNG 1920x1080 => 800x600: 856122 Bytes, avg: 278.56 ms min: 272.84 ms max: 298.88 ms
WEBP 1920x1080 => 800x600: 93564 Bytes, avg: 147.06 ms min: 142.81 ms max: 171.15 ms
GIF 1920x1080 => 800x600: 4017933 Bytes, avg: 1819.24 ms min: 1734.70 ms max: 1915.06 ms
PNG 256x256 => WEBP 256x256: 9790 Bytes, avg: 20.67 ms min: 20.12 ms max: 31.40 ms
JPEG 256x256 => PNG 256x256: 38724 Bytes, avg: 7.13 ms min: 6.76 ms max: 9.91 ms
GIF 256x256 => PNG 256x256: 8511 Bytes, avg: 1.50 ms min: 1.44 ms max: 1.92 ms
Lilliput
======================
JPEG 1920x1080 header read: 1920x1080, avg: 0.005083 ms, min: 0.004398 ms, max: 0.663941 ms
PNG 1920x1080 header read: 1920x1080, avg: 0.003636 ms, min: 0.003362 ms, max: 0.134671 ms
WEBP 1920x1080 header read: 1920x1080, avg: 0.002145 ms, min: 0.001795 ms, max: 1.949846 ms
GIF 1920x1080 header read: 1920x1080, avg: 0.003795 ms, min: 0.003521 ms, max: 0.140039 ms
JPEG 256x256 => 32x32: 1078 Bytes, avg: 0.62 ms, min: 0.58 ms, max: 1.07 ms
PNG 256x256 => 32x32: 1411 Bytes, avg: 0.92 ms, min: 0.88 ms, max: 1.43 ms
WEBP 256x256 => 32x32: 946 Bytes, avg: 3.19 ms, min: 2.87 ms, max: 4.57 ms
GIF 256x256 => 32x32: 18310 Bytes, avg: 27.96 ms, min: 27.31 ms, max: 39.77 ms
JPEG 1920x1080 => 800x600: 117991 Bytes, avg: 37.30 ms, min: 36.06 ms, max: 50.85 ms
PNG 1920x1080 => 800x600: 722169 Bytes, avg: 178.61 ms, min: 176.41 ms, max: 203.03 ms
WEBP 1920x1080 => 800x600: 88172 Bytes, avg: 129.14 ms, min: 124.67 ms, max: 171.35 ms
GIF 1920x1080 => 800x600: 2372725 Bytes, avg: 3255.42 ms, min: 3220.95 ms, max: 3301.01 ms
PNG 256x256 => WEBP 256x256: 9790 Bytes, avg: 22.52 ms, min: 21.53 ms, max: 31.95 ms
JPEG 256x256 => PNG 256x256: 40134 Bytes, avg: 6.38 ms, min: 6.20 ms, max: 9.17 ms
GIF 256x256 => PNG 256x256: 18053 Bytes, avg: 5.57 ms, min: 5.44 ms, max: 7.88 ms
Conclusion:
Lilliput seems to be a suitable replacement for Pillow-SIMD for our use cases.

fenollp commented Nov 14, 2017

Kinda weird on GIF 1920x1080 => 800x600:

pillow-simd:
GIF 1920x1080 => 800x600: 4017933 Bytes, avg: 1819.24 ms min: 1734.70 ms max: 1915.06 ms
lilliput:
GIF 1920x1080 => 800x600: 2372725 Bytes, avg: 3255.42 ms, min: 3220.95 ms, max: 3301.01 ms

Might be a good idea to closely watch clients uploading large GIFs (who does that anyway?)

kkopachev commented Feb 27, 2018

The benchmark seems a bit weird.
Was pillow-simd compiled with AVX2 optimizations?
Why is the image converted to RGB/RGBA mode in the Pillow case?
It looks like a straight comparison of resizes in Pillow and OpenCV is incorrect, since they produce different results. https://www.reddit.com/r/Python/comments/4j5mla/pillowsimd_is_25_times_faster_than_pillow_and_10/d340k8z/
I noticed that the resized JPEGs have different output sizes. Since both libs use the same version of libjpeg-turbo with the same settings, they should output the same encoded JPEG given the same RGB bytes. But the output JPEG sizes differ, which means the resizes produced different results.
I also noticed that the JPEG=>PNG transcode produced different results, which is not expected. PNG=>WEBP produced the same output (in terms of file size).

Not sure if it's significant or not, but bench_transcode ran 100 iterations in the Pillow case and 1000 in Go.

homm commented Oct 15, 2018

First of all, never trust a benchmark you can't reproduce. You can't reproduce lilliput-bench for an obvious reason: there are no test images. And since this benchmark tests graphics codecs, the results depend heavily on the exact file modes and even the content. So the first thing you need to do is create some test data.

Now, with the installation instructions, we can start. These are my results on an i5-4430 CPU (the same Haswell architecture as in the original test) and Ubuntu 18.04.1 LTS, running on bare metal. Pillow-SIMD version 5.2.0.post0 compiled with AVX2, running on Python 2.7.

JPEG 1920x1080 header read:  1920x1080,  avg: 0.036190 ms  min: 0.033855 ms  max: 7.170200 ms
PNG 1920x1080 header read:   1920x1080,  avg: 0.040357 ms  min: 0.038862 ms  max: 0.266075 ms
WEBP 1920x1080 header read:  1920x1080,  avg: 0.678090 ms  min: 0.149965 ms  max: 14.508009 ms
GIF 1920x1080 header read:   1920x1080,  avg: 0.029355 ms  min: 0.025988 ms  max: 0.365019 ms
JPEG 256x256 => 32x32:  1132 Bytes,     avg: 0.87 ms    min: 0.81 ms    max: 1.06 ms
PNG 256x256 => 32x32:   3022 Bytes,     avg: 3.41 ms    min: 3.40 ms    max: 3.54 ms
WEBP 256x256 => 32x32:  678 Bytes,      avg: 2.25 ms    min: 2.24 ms    max: 2.49 ms
GIF 256x256 => 32x32:   1897 Bytes,     avg: 3.55 ms    min: 3.53 ms    max: 3.91 ms
JPEG 1920x1080 => 800x600:      112307 Bytes,   avg: 25.98 ms   min: 25.38 ms   max: 26.21 ms
PNG 1920x1080 => 800x600:       929497 Bytes,   avg: 331.29 ms  min: 330.92 ms  max: 331.71 ms
WEBP 1920x1080 => 800x600:      92466 Bytes,    avg: 121.59 ms  min: 121.45 ms  max: 126.35 ms
GIF 1920x1080 => 800x600:       411470 Bytes,   avg: 85.91 ms   min: 85.58 ms   max: 89.52 ms
PNG 256x256 => WEBP 256x256:    14668 Bytes,    avg: 12.43 ms   min: 12.41 ms   max: 12.65 ms
JPEG 256x256 => PNG 256x256:    109987 Bytes,   avg: 23.17 ms   min: 23.05 ms   max: 23.42 ms
GIF 256x256 => PNG 256x256:     43766 Bytes,    avg: 3.40 ms    min: 3.37 ms    max: 3.54 ms

Lilliput master (commit 3960219) running on go 1.10.1.

JPEG 1920x1080 header read:  1920x1080,  avg: 0.005600 ms,  min: 0.004915 ms,  max: 0.172405 ms
PNG 1920x1080 header read:   1920x1080,  avg: 0.003186 ms,  min: 0.002956 ms,  max: 0.051915 ms
WEBP 1920x1080 header read:  1920x1080,  avg: 0.001392 ms,  min: 0.001290 ms,  max: 0.017828 ms
GIF 1920x1080 header read:   1920x1080,  avg: 0.004994 ms,  min: 0.004805 ms,  max: 0.029046 ms
JPEG 256x256 => 32x32:  1117 Bytes,     avg: 0.67 ms,   min: 0.67 ms,   max: 0.97 ms
PNG 256x256 => 32x32:   2623 Bytes,     avg: 8.75 ms,   min: 8.72 ms,   max: 8.94 ms
WEBP 256x256 => 32x32:  650 Bytes,      avg: 8.43 ms,   min: 8.42 ms,   max: 9.05 ms
GIF 256x256 => 32x32:   1954 Bytes,     avg: 5.35 ms,   min: 5.29 ms,   max: 5.77 ms
JPEG 1920x1080 => 800x600:      104917 Bytes,   avg: 31.27 ms,  min: 31.22 ms,  max: 33.52 ms
PNG 1920x1080 => 800x600:       748236 Bytes,   avg: 406.31 ms, min: 405.98 ms, max: 407.65 ms
WEBP 1920x1080 => 800x600:      82102 Bytes,    avg: 683.10 ms, min: 682.86 ms, max: 689.18 ms
GIF 1920x1080 => 800x600:       314587 Bytes,   avg: 222.03 ms, min: 221.75 ms, max: 222.30 ms
PNG 256x256 => WEBP 256x256:    14668 Bytes,    avg: 76.34 ms,  min: 76.27 ms,  max: 76.54 ms
JPEG 256x256 => PNG 256x256:    108668 Bytes,   avg: 27.84 ms,  min: 27.79 ms,  max: 28.03 ms
GIF 256x256 => PNG 256x256:     102390 Bytes,   avg: 35.29 ms,  min: 35.09 ms,  max: 35.41 ms

Some conclusions can already be drawn.

  1. Pillow-SIMD totally lost the "header" test. Indeed, most file parsing in Pillow is done in Python itself without any speedups. On the other hand, the absolute numbers are often less than 0.1 ms, which is acceptable for most cases.
  2. There is a huge difference between 35 ms, the minimum time for querying the WEBP image in the original results, and 0.15 ms in my results. This is due to a problem in earlier versions of Pillow, where the whole WEBP image was loaded on opening. This was eventually fixed.
  3. It is absolutely meaningless to compare PNG and GIF compression ratio and speed. Both formats are lossless, so if you have identical source images, the more time you spend on compression, the better compression you get. The problem is that the source images are not identical (I'll explain this below) and the default settings are different.
  4. Something is broken with WEBP compression on Linux in the latest Lilliput. It is slow as hell. WEBP is a format with lossy compression (at least in this test), and its speed shouldn't depend heavily on the compression ratio. This is confirmed by this issue.
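The trade-off in point 3 can be tried out directly. PNG stores its pixel data with DEFLATE, so the sketch below uses Python's `zlib` module on an arbitrary byte string as a stand-in for image data. `compress_at_levels` is a hypothetical helper, and the numbers it prints depend on the input and the machine.

```python
import time
import zlib


def compress_at_levels(data, levels=(1, 6, 9)):
    """Return {level: (compressed_size_bytes, elapsed_ms)} for lossless DEFLATE."""
    results = {}
    for level in levels:
        start = time.perf_counter()
        packed = zlib.compress(data, level)
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        # Lossless: every level must decode to exactly the same bytes.
        assert zlib.decompress(packed) == data
        results[level] = (len(packed), elapsed_ms)
    return results


if __name__ == "__main__":
    # Stand-in payload; a real comparison would use raw pixel rows.
    payload = b"".join(bytes([i % 251, (i * 7) % 256, i % 13]) for i in range(200000))
    for level, (size, ms) in compress_at_levels(payload).items():
        print(f"level {level}: {size} bytes in {ms:.2f} ms")
```

Every level decodes to the same bytes, so size and time only say something about the encoder settings, not about the image library doing the resize.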

The most important test (the one that reflects the performance of the library itself) is one that is not demanding on compression ratio (i.e. uses a lossy format), is not broken like WEBP, and operates on large enough images. Only one test meets these requirements: the JPEG 1920x1080 => 800x600 test. In the original benchmark it is faster on Lilliput; in my benchmark it is faster on Pillow-SIMD. There is no information about which version of Pillow-SIMD was used for the original test or whether it was compiled with AVX2 support. But from the results, I assume a pre-4.3 version was used, without the significant optimizations for the most general cases (which were already available at the time of the test). But this is only the beginning. In the original test, several more errors were made that distort the results. Let's look at the benchmark code.

Resizing filter

Lilliput uses INTER_AREA interpolation for OpenCV's resize function. Pillow uses very flexible, high-quality resizing based on convolutions. It lets you choose the exact resizing quality, which also affects performance. LANCZOS, which is used in the benchmark, produces very sharp images that look much better than INTER_AREA interpolation. This affects not only the resize speed itself, but also the speed of the subsequent compression and the resulting size of the encoded images. The BICUBIC filter is a bit cheaper and its result is much closer to INTER_AREA, while still slightly better and sharper.
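A rough way to see why BICUBIC is cheaper than LANCZOS in a convolution-based resize: the filter radius (3 source pixels for Lanczos-3, 2 for bicubic) is stretched by the downscale factor, and every output pixel sums over that window. The sketch below is a simplified cost model under that assumption, not Pillow's actual code; `taps_per_axis` and the `FILTER_SUPPORT` table are illustrative names.

```python
import math

# Filter radius in source pixels at scale 1.0 (standard kernel definitions).
FILTER_SUPPORT = {"BOX": 0.5, "BILINEAR": 1.0, "BICUBIC": 2.0, "LANCZOS": 3.0}


def taps_per_axis(src, dst, filter_name):
    """Upper bound on source pixels read per output pixel along one axis."""
    # The kernel is widened only when downscaling.
    scale = max(src / dst, 1.0)
    support = FILTER_SUPPORT[filter_name] * scale
    return math.ceil(support) * 2 + 1


if __name__ == "__main__":
    for name in ("BICUBIC", "LANCZOS"):
        print(name, taps_per_axis(1920, 800, name))
```

For a 1920 to 800 pixel reduction this model gives 11 taps for BICUBIC and 17 for LANCZOS along one axis, so the per-pixel work differs by roughly that ratio before any SIMD effects.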

Added alpha-channel to PNG and WEBP images

As @kkopachev mentioned before, there is a very interesting line which converts JPEG images to RGB (which they already are) and adds an alpha channel to other formats. For JPEGs this is only one extra copy, but for other formats it increases the size of the compressed images and the time spent on compression.

Preload codecs

Pillow is written in Python, a dynamic scripting language. Each imported Python file contributes to the loading time, which is why Pillow loads codecs lazily. The downside is that this time is spent during the test. You can see that for JPEG and WEBP the maximum header read time is 100x higher than the average and minimum.
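One generic way to keep such one-time costs out of the statistics is an untimed warm-up call before the measured loop. The sketch below shows the idea with a made-up `LazyCodec` class standing in for a library that initializes on first use; it is not the change made to the benchmark. With Pillow itself, calling `Image.init()` before timing should have a similar effect.

```python
import time


class LazyCodec:
    """Hypothetical stand-in for a library that loads its codecs on first use."""

    def __init__(self):
        self.init_calls = 0
        self._ready = False

    def read_header(self):
        if not self._ready:
            # Simulated one-time cost (e.g. importing a codec plugin).
            time.sleep(0.01)
            self.init_calls += 1
            self._ready = True
        return (1920, 1080)


def bench_with_warmup(fn, iterations=100, warmup=1):
    """Run `warmup` untimed calls, then return (avg, min, max) in ms."""
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return sum(samples) / len(samples), min(samples), max(samples)
```

With `warmup=0` the 10 ms initialization lands in the first sample and inflates max, which is the pattern visible in the header-read rows above.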

Here is what we get if we fix all these problems in the Pillow-SIMD version:

JPEG 1920x1080 header read:     1920x1080,      avg: 0.068893 ms        min: 0.066996 ms        max: 0.257015 ms
PNG 1920x1080 header read:      1920x1080,      avg: 0.049018 ms        min: 0.043869 ms        max: 0.116825 ms
WEBP 1920x1080 header read:     1920x1080,      avg: 0.559106 ms        min: 0.155926 ms        max: 5.062103 ms
GIF 1920x1080 header read:      1920x1080,      avg: 0.035386 ms        min: 0.032902 ms        max: 0.418901 ms
JPEG 256x256 => 32x32:  1109 Bytes,     avg: 0.70 ms    min: 0.69 ms    max: 0.91 ms
PNG 256x256 => 32x32:   2603 Bytes,     avg: 3.05 ms    min: 3.04 ms    max: 3.15 ms
WEBP 256x256 => 32x32:  652 Bytes,      avg: 1.85 ms    min: 1.84 ms    max: 3.40 ms
GIF 256x256 => 32x32:   1897 Bytes,     avg: 3.42 ms    min: 3.41 ms    max: 3.58 ms
JPEG 1920x1080 => 800x600:      107480 Bytes,   avg: 23.43 ms   min: 22.57 ms   max: 23.53 ms
PNG 1920x1080 => 800x600:       810489 Bytes,   avg: 243.70 ms  min: 243.24 ms  max: 244.43 ms
WEBP 1920x1080 => 800x600:      84690 Bytes,    avg: 112.28 ms  min: 112.14 ms  max: 115.48 ms
GIF 1920x1080 => 800x600:       411470 Bytes,   avg: 84.42 ms   min: 84.26 ms   max: 87.32 ms
PNG 256x256 => WEBP 256x256:    14668 Bytes,    avg: 12.43 ms   min: 12.42 ms   max: 12.62 ms
JPEG 256x256 => PNG 256x256:    109987 Bytes,   avg: 23.18 ms   min: 23.11 ms   max: 23.28 ms
GIF 256x256 => PNG 256x256:     43766 Bytes,    avg: 3.42 ms    min: 3.39 ms    max: 3.50 ms

Fair result:

  • JPEG 256x256 => 32x32: speed is equal, mainly because of the slow parsing of images in Python. Size is lower for Pillow-SIMD.
  • PNG and WEBP 256x256 => 32x32: much faster in Pillow with the same size.
  • GIF 256x256 => 32x32: faster in Pillow with a smaller size.
  • JPEG 1920x1080 => 800x600: faster in Pillow-SIMD (still the fastest software for resizing images on CPU) with a slightly bigger size due to the sharper result.
  • PNG and GIF 1920x1080 => 800x600: time is better and size is worse in Pillow. As I said before, the more time you spend on compression, the smaller the output image. The difference is only in the settings.
  • WEBP 1920x1080 => 800x600: slightly bigger file size due to a sharper result, and a dramatic slowdown for Lilliput.
  • PNG 256x256 => WEBP 256x256: dramatic slowdown for WEBP in Lilliput.
  • JPEG 256x256 => PNG 256x256: a bit faster, a bit larger due to PNG compression.
  • GIF 256x256 => PNG 256x256: a dramatic slowdown for Lilliput due to an unnecessary palette to RGB conversion.

So there are no tests except header read where Lilliput truly wins. In turn, I want to say that in Pillow we need to look at image detection and parsing speed.

In total, I've made five pull requests in lilliput-bench repository:

@brian-armstrong-discord, could you look at them?

homm commented Oct 17, 2018

Bonus track: I have implemented an optimization for ImageOps.fit, which will be available in Pillow-SIMD 5.4.

It used to crop the image first, then resize the cropped image to the desired size. Now it calculates the crop size as before, but instead of cropping, it passes the rectangle to the resize function, which uses it as the source region. As a result, you can expect a speedup of about 32% in the JPEG 1920x1080 => 800x600 test.
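To illustrate the rectangle being passed around, here is a small sketch that computes a centered crop box with the target aspect ratio. It is a simplification of what a fit operation does, not the ImageOps.fit source, and `fit_box` is a hypothetical name. With Pillow, a box like this can be handed to `Image.resize` through its `box` argument instead of calling `crop` first.

```python
def fit_box(src_size, dst_size):
    """Centered crop rectangle (left, top, right, bottom) with dst's aspect ratio."""
    src_w, src_h = src_size
    dst_w, dst_h = dst_size
    # Compare aspect ratios by cross-multiplication to stay in integers.
    if src_w * dst_h > src_h * dst_w:
        # Source is wider than the target: trim the left and right sides.
        crop_w = src_h * dst_w / dst_h
        left = (src_w - crop_w) / 2
        return (left, 0, left + crop_w, src_h)
    # Source is taller than (or equal to) the target: trim top and bottom.
    crop_h = src_w * dst_h / dst_w
    top = (src_h - crop_h) / 2
    return (0, top, src_w, top + crop_h)


if __name__ == "__main__":
    print(fit_box((1920, 1080), (800, 600)))
```

For the 1920x1080 => 800x600 case the box covers a 1440x1080 region, so skipping the intermediate crop avoids copying about 1.5 million pixels per image.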

JPEG 1920x1080 header read:     1920x1080,      avg: 0.075202 ms        min: 0.072956 ms        max: 0.258923 ms
PNG 1920x1080 header read:      1920x1080,      avg: 0.054562 ms        min: 0.049829 ms        max: 0.174999 ms
WEBP 1920x1080 header read:     1920x1080,      avg: 0.526440 ms        min: 0.169992 ms        max: 5.101919 ms
GIF 1920x1080 header read:      1920x1080,      avg: 0.043169 ms        min: 0.037909 ms        max: 0.396013 ms
JPEG 256x256 => 32x32:  1109 Bytes,     avg: 0.68 ms    min: 0.67 ms    max: 0.91 ms
PNG 256x256 => 32x32:   2603 Bytes,     avg: 3.04 ms    min: 3.03 ms    max: 3.17 ms
WEBP 256x256 => 32x32:  652 Bytes,      avg: 1.81 ms    min: 1.80 ms    max: 2.00 ms
GIF 256x256 => 32x32:   1897 Bytes,     avg: 3.39 ms    min: 3.38 ms    max: 3.57 ms
JPEG 1920x1080 => 800x600:      107450 Bytes,   avg: 17.66 ms   min: 17.60 ms   max: 20.16 ms
PNG 1920x1080 => 800x600:       810516 Bytes,   avg: 239.70 ms  min: 239.29 ms  max: 241.37 ms
WEBP 1920x1080 => 800x600:      84676 Bytes,    avg: 108.89 ms  min: 108.72 ms  max: 114.47 ms
GIF 1920x1080 => 800x600:       411470 Bytes,   avg: 83.33 ms   min: 83.17 ms   max: 85.63 ms
PNG 256x256 => WEBP 256x256:    14668 Bytes,    avg: 12.45 ms   min: 12.43 ms   max: 12.66 ms
JPEG 256x256 => PNG 256x256:    109987 Bytes,   avg: 23.32 ms   min: 23.23 ms   max: 23.62 ms
GIF 256x256 => PNG 256x256:     43766 Bytes,    avg: 3.42 ms    min: 3.40 ms    max: 3.53 ms

Overall, the JPEG 1920x1080 => 800x600 test becomes 77% faster on Pillow-SIMD than on Lilliput.
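The 77% figure follows from the two averages reported above, 31.27 ms for Lilliput and 17.66 ms for Pillow-SIMD with the ImageOps.fit change. A tiny sketch of the arithmetic, with `percent_faster` as an illustrative helper name:

```python
def percent_faster(slow_ms, fast_ms):
    """How much faster `fast_ms` is than `slow_ms`, as a percentage."""
    return (slow_ms / fast_ms - 1.0) * 100.0


if __name__ == "__main__":
    # Averages for JPEG 1920x1080 => 800x600 taken from the tables above.
    print(f"{percent_faster(31.27, 17.66):.0f}% faster")
```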
