@ppearson
Last active October 15, 2017 07:07
Compiler Benchmarks v2
Machine:
--------
Dual-socket Intel Xeon E5-2643 (Sandy Bridge), 4 physical cores per processor (8 logical with Hyper-Threading), 16 logical cores in total. 3.3 GHz.
Ubuntu 16.04, Kernel 4.4.0.
Compilers:
----------
g++ 4.8.5 (Ubuntu package)
g++ 4.9.3 (compiled from source)
g++ 5.4 (Ubuntu package)
g++ 6.3 (compiled from source)
g++ 7.1 (compiled from source)
llvm/clang 3.8 (Ubuntu package)
llvm/clang 3.9 (compiled from source)
llvm/clang 4.0 (compiled from source)
llvm/clang 5.0 (compiled from source)
Scenes:
Scene 1:
-------
Cornell box (floor diffuse procedural texture with Beckmann microfacet spec lobe, walls diffuse + Beckmann microfacet spec lobe),
1 bust model (544k triangles) with conductor microfacet BSDF (GGX), 1 dragon model (535k triangles) with dielectric refractive
BSDF with brute force internal scattering for SSS with multiple scattering, and Beckmann microfacet dielectric lobe.
Three area lights (two quads, one sphere).
Config: resolution: 1024x768, max path length: 6, volumetric integrator (can't always early-out on occlusion rays), Mitchell-Netravali pixel filter,
144 stratified samples per pixel, sampling all lights per hit, with MIS.
Scene 2:
--------
Cornell box (floors and most walls diffuse, but back wall with additional Beckmann microfacet spec lobe), with dense voxel grid volumetric bunny
(converted from OpenVDB examples) with Isotropic phase function.
Two quad lights.
Config: resolution: 1024x768, max path length: 6, volumetric integrator, Mitchell-Netravali pixel filter, 81 stratified samples per pixel, sampling
all lights per hit, with MIS.
Woodcock tracking volume sampling, with multiple scattering, and two transmittance samples per scatter event per light (due to sampling all lights per hit).
Volume roughening was off, so trilinear voxel lookups were taking place for every lookup.
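As a rough illustration of the Woodcock (delta) tracking mentioned above, here is a minimal Python sketch of the distance-sampling loop. It is not Imagine's code: the names woodcock_track, density_at, sigma_max and t_max are hypothetical, and the density is treated as an arbitrary function of distance along the ray rather than a trilinear lookup into a voxel grid.

```python
import math
import random

def woodcock_track(density_at, sigma_max, t_max, rng=random):
    """Sample the distance to the next real collision along a ray.

    density_at(t) is the extinction coefficient at distance t (assumed to be
    at most sigma_max everywhere). Returns None if the ray exits at t_max
    without a real collision.
    """
    t = 0.0
    while True:
        # Tentative free-flight step through a fictitious homogeneous medium
        # whose extinction is the majorant sigma_max.
        t -= math.log(1.0 - rng.random()) / sigma_max
        if t >= t_max:
            return None
        # Accept as a real collision with probability density_at(t) / sigma_max;
        # otherwise it is a null collision and tracking continues.
        if rng.random() * sigma_max < density_at(t):
            return t
```

Every tentative step costs one density lookup, which is why the cost of the per-lookup voxel interpolation matters so much in this scene.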
Scene 3:
-------
Single 10M triangle mesh of a scanned church ornament, with diffuse texture provided by a 1M-point point cloud lookup texture (KDTree,
with a very large filter radius due to the way the colour was scanned - the colour info of the scanned model was very bad and had weird gaps,
while the position topology was much more detailed and constant). Constant Beckmann microfacet spec lobe on the material as well.
Single Physical Sky Environment light.
Config: resolution: 1024x768, max path length: 5, non-volumetric integrator (can early out on occlusion rays), Mitchell-Netravali pixel filter,
81 stratified samples per pixel, with MIS. One light sample (NEE) per hit, with Environment Directional culling disabled.
Compilation Benchmarks:
=======================
Imagine compilation from clean with varying numbers of threads (up to -j16) on an SSD.
Dual socket Xeon 4 core (8 virtual with Hyperthreading), 16 total possible threads.
GCC 4.8 and 5.4 were from packages on the Ubuntu system, the other versions were built from the release
tarball with the flag '--enable-checking=release'.
LLVM 3.8 was an Ubuntu package, other versions were compiled from source using CMake, in release config.
IlmBase / OpenEXR lib compilation generally took around 13-14 seconds (with 8 threads) before the actual core Imagine build.
It is part of the overall Imagine build and is included in these times. Compilation of these libs used the same compiler
and number of threads as the overall build of Imagine.
All times are in seconds, measured by running the 'make' command under 'time' on the command line.
Compilation flags used for Imagine:
[main test optimisation flag] -fPIC -ffast-math -mfpmath=sse -msse -msse2 -msse3 -mssse3 -msse4
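The procedure described above (clean build, three runs per thread count) could be scripted roughly as follows. This is only a sketch: the original numbers came from the shell's 'time', whereas this measures wall-clock time from Python, and the 'make clean' target and the function names time_command and build_times are assumptions, not part of Imagine's build.

```python
import subprocess
import time

def time_command(cmd, cwd=None):
    """Run cmd to completion and return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, cwd=cwd, check=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start

def build_times(jobs_list=(16, 8, 4, 2), repeats=3, cwd=None):
    """Time a from-clean build for each -j value.

    Assumes the Makefile in cwd has a 'clean' target (hypothetical here).
    Returns {jobs: [seconds, ...]}.
    """
    results = {}
    for jobs in jobs_list:
        runs = []
        for _ in range(repeats):
            subprocess.run(["make", "clean"], cwd=cwd, check=True,
                           stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
            runs.append(time_command(["make", f"-j{jobs}"], cwd=cwd))
        results[jobs] = runs
    return results
```

Calling build_times(cwd="path/to/source") would return one list of timings per thread count, in the same shape as the blocks listed below.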
GCC 4.8 - O2
------------
Executable size: 7.0 MB
Threads 16:
59.841
60.131
59.442
Threads 8:
60.628
60.051
60.132
Threads 4:
101.829
101.674
101.396
Threads 2:
213.482
213.261
213.114
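To make the thread scaling easier to read, the GCC 4.8 - O2 timings above can be reduced to means and speedups relative to the 2-thread build. A small sketch (the helper name scaling is made up for illustration):

```python
from statistics import mean

# GCC 4.8 - O2 build times in seconds, copied from the listing above.
gcc48_o2 = {
    16: [59.841, 60.131, 59.442],
    8: [60.628, 60.051, 60.132],
    4: [101.829, 101.674, 101.396],
    2: [213.482, 213.261, 213.114],
}

def scaling(times_by_jobs, baseline_jobs=2):
    """Return {jobs: speedup of the mean time relative to baseline_jobs}."""
    base = mean(times_by_jobs[baseline_jobs])
    return {jobs: base / mean(runs) for jobs, runs in times_by_jobs.items()}

speedups = scaling(gcc48_o2)
for jobs in sorted(speedups):
    print(f"-j{jobs}: mean {mean(gcc48_o2[jobs]):.1f} s, "
          f"{speedups[jobs]:.2f}x vs -j2")
```

For this configuration the step from 8 to 16 threads gives almost no improvement (about 60.3 s versus 59.8 s on average).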
GCC 4.8 - O3
------------
Executable size: 7.3 MB
Threads 16:
62.512
61.737
61.442
Threads 8:
61.412
62.493
62.205
Threads 4:
106.469
106.144
105.896
Threads 2:
224.267
223.586
223.733
GCC 4.8 - Os
------------
Executable size: 5.3 MB
Threads 16:
54.603
54.715
54.439
Threads 8:
54.815
54.689
54.922
Threads 4:
91.855
91.87
91.749
Threads 2:
194.542
194.439
194.8
GCC 4.9 - O2
------------
Executable size: 7.1 MB
Threads 16:
62.31
61.799
61.713
Threads 8:
61.74
62.703
62.502
Threads 4:
105.992
105.69
105.287
Threads 2:
220.324
220.259
220.075
GCC 4.9 - O3
------------
Executable size: 7.4 MB
Threads 16:
64.438
64.819
63.857
Threads 8:
64.775
65.272
64.619
Threads 4:
110.787
110.948
110.75
Threads 2:
230.619
232.864
231.835
GCC 4.9 - Os
------------
Executable size: 5.3 MB
Threads 16:
57.088
57.574
57.36
Threads 8:
57.508
58.219
58.086
Threads 4:
95.893
95.429
96.127
Threads 2:
200.721
200.951
201.233
GCC 5.4 - O2
------------
Executable size: 7.0 MB
Threads 16:
56.37
56.009
56.12
Threads 8:
57.737
56.82
56.957
Threads 4:
94.304
94.556
94.529
Threads 2:
198.292
199.248
198.54
GCC 5.4 - O3
------------
Executable size: 7.3 MB
Threads 16:
58.78
57.664
58.519
Threads 8:
58.881
59.043
59.356
Threads 4:
99.441
98.768
98.874
Threads 2:
207.124
207.046
207.153
GCC 5.4 - Os
------------
Executable size: 5.5 MB
Threads 16:
52.83
52.172
52.39
Threads 8:
52.751
53.381
53.602
Threads 4:
86.549
86.803
86.525
Threads 2:
183.507
183.541
182.916
GCC 6.3 - O2
------------
Apparent link time is randomly really quick... Not really sure why...
Executable size: 6.7 MB
Threads 16:
74.396
73.467
74.075
Threads 8:
78.984
78.411
78.448
Threads 4:
137.218
138.147
137.392
Threads 2:
284.157
284.882
283.985
GCC 6.3 - O3
------------
Again, apparent link time is randomly really quick... Not really sure why...
Executable size: 7.1 MB
Threads 16:
76.632
76.516
76.016
Threads 8:
81.107
81.669
81.17
Threads 4:
141.875
142.072
142.853
Threads 2:
292.574
292.586
293.139
GCC 6.3 - Os
------------
Executable size: 5.2 MB
Threads 16:
70.683
69.843
70.024
Threads 8:
74.455
74.422
74.548
Threads 4:
129.823
129.754
129.512
Threads 2:
268.364
268.086
268.73
GCC 7.1 - O2
------------
Executable size: 6.7 MB
Threads 16:
78.993
79.897
78.937
Threads 8:
83.411
82.705
82.687
Threads 4:
145.639
145.88
145.476
Threads 2:
299.749
299.609
300.479
GCC 7.1 - O3
------------
Executable size: 7.2 MB
Threads 16:
81.355
81.797
82.333
Threads 8:
85.523
85.877
86.716
Threads 4:
150.832
151.321
151.446
Threads 2:
311.928
312.119
312.278
GCC 7.1 - Os
------------
Executable size: 5.2 MB
Threads 16:
74.015
74.86
74.504
Threads 8:
78.68
78.018
78.725
Threads 4:
137.305
137.042
137.244
Threads 2:
283.421
284.306
284.203
LLVM 3.8 - O2
-------------
Executable size: 6.1 MB
Threads 16:
84.039
84.547
84.186
Threads 8:
96.33
97.757
96.717
Threads 4:
170.771
170.789
170.537
Threads 2:
352.958
352.803
353.2
LLVM 3.8 - O3
-------------
Executable size: 6.3 MB
Threads 16:
84.542
84.766
85.246
Threads 8:
98.47
98.599
98.49
Threads 4:
171.952
171.513
172.31
Threads 2:
356.062
356.775
356.547
LLVM 3.8 - Os
-------------
Executable size: 5.9 MB
Threads 16:
81.425
81.401
82.091
Threads 8:
96.325
95.837
96.25
Threads 4:
166.823
166.875
166.545
Threads 2:
344.239
345.815
345.974
LLVM 3.9 - O2
-------------
Executable size: 6.0 MB
Threads 16:
84.118
83.999
83.968
Threads 8:
98.71
98.793
99.3
Threads 4:
176.522
175.991
176.233
Threads 2:
360.231
360.408
359.833
LLVM 3.9 - O3
-------------
Executable size: 6.1 MB
Threads 16:
85.276
84.93
85.196
Threads 8:
99.261
99.14
99.731
Threads 4:
177.88
177.156
178.234
Threads 2:
362.155
362.198
362.311
LLVM 3.9 - Os
-------------
Executable size: 5.8 MB
Threads 16:
83.016
83.417
83.384
Threads 8:
98.257
98.066
97.915
Threads 4:
173.716
174.068
173.514
Threads 2:
353.999
353.962
353.397
LLVM 4.0 - O2
-------------
Executable size: 6.1 MB
Threads 16:
89.441
89.416
89.283
Threads 8:
103.972
104.062
103.104
Threads 4:
186.017
185.519
184.825
Threads 2:
376.839
376.678
376.368
LLVM 4.0 - O3
-------------
Executable size: 6.4 MB
Threads 16:
91.575
90.685
91.046
Threads 8:
105.05
105.409
104.657
Threads 4:
187.384
187.752
187.468
Threads 2:
382.76
382.571
382.46
LLVM 4.0 - Os
-------------
Executable size: 5.7 MB
Threads 16:
87.344
87.25
87.296
Threads 8:
100.811
102.074
101.682
Threads 4:
182.331
181.492
181.532
Threads 2:
369.027
370.263
370.341
LLVM 5.0 - O2
-------------
Executable size: 6.1 MB
Threads 16:
91.529
90.416
89.689
Threads 8:
105.28
105.641
105.001
Threads 4:
189.056
189.158
188.792
Threads 2:
385.945
384.774
385.497
LLVM 5.0 - O3
-------------
Executable size: 6.4 MB
Threads 16:
91.954
92.189
91.925
Threads 8:
106.6
105.872
106.283
Threads 4:
191.126
191.063
190.726
Threads 2:
388.319
388.847
388.34
LLVM 5.0 - Os
-------------
Executable size: 5.7 MB
Threads 16:
87.429
88.009
87.76
Threads 8:
103.402
103.53
103.287
Threads 4:
184.591
184.65
184.972
Threads 2:
376.469
376.916
376.678
Rendering Benchmarks
====================
Pure render times - after scene loading and building (including acceleration structure building) - in seconds, measured with code in Imagine.
Using 16 threads (so all logical cores - i.e. Hyperthreaded cores).
Scene 1:
--------
GCC 4.8 - O2
------------
64.01
63.99
64.06
63.05
63.03
63.17
GCC 4.8 - O3
------------
63.09
63.16
62.91
63.55
63.75
63.71
GCC 4.8 - Os
------------
87.03
86.81
86.46
87.23
86.93
86.79
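The cost of -Os relative to -O2 for GCC 4.8 on this scene can be worked out from the two blocks above. A minimal sketch (the function name relative_slowdown is illustrative only):

```python
from statistics import mean

# Scene 1 render times in seconds for GCC 4.8, copied from the listings above.
gcc48_o2 = [64.01, 63.99, 64.06, 63.05, 63.03, 63.17]
gcc48_os = [87.03, 86.81, 86.46, 87.23, 86.93, 86.79]

def relative_slowdown(baseline, candidate):
    """Ratio of mean candidate time to mean baseline time (above 1.0 means slower)."""
    return mean(candidate) / mean(baseline)

ratio = relative_slowdown(gcc48_o2, gcc48_os)
print(f"O2 mean {mean(gcc48_o2):.2f} s, Os mean {mean(gcc48_os):.2f} s, "
      f"Os/O2 = {ratio:.3f}")
```

On these numbers the -Os build takes roughly 37% longer to render than the -O2 build.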
GCC 4.9 - O2
------------
64.26
64.33
64.37
64.42
64.48
64.57
GCC 4.9 - O3
------------
64.46
64.35
64.44
64.55
64.47
64.84
GCC 4.9 - Os
------------
88.13
88.16
88.07
88.01
87.94
88.28
GCC 5.4 - O2
------------
63.75
63.9
63.92
63.48
63.66
63.47
GCC 5.4 - O3
------------
63.8
64.02
63.9
63.84
64.13
64.33
GCC 5.4 - Os
------------
88.96
89.21
89.11
88.65
88.9
88.69
GCC 6.3 - O2
------------
62.59
62.43
62.62
62.96
63.1
63.09
GCC 6.3 - O3
------------
62.57
62.39
62.64
62.82
62.85
62.82
GCC 6.3 - Os
------------
86.76
86.37
86.5
86.83
86.87
87.14
GCC 7.1 - O2
------------
60.72
60.98
60.78
61.2
60.87
61.37
GCC 7.1 - O3
------------
60.94
61.12
60.86
60.96
61.03
61.04
GCC 7.1 - Os
------------
83.04
83.57
82.82
82.4
83.07
83.02
LLVM 3.8 - O2
-------------
76.07
76.06
75.8
75.3
75.65
75.25
LLVM 3.8 - O3
-------------
77.22
77.34
77.01
76.57
76.62
76.6
LLVM 3.8 - Os
-------------
76.42
77.69
77.34
77.36
77.21
77.67
LLVM 3.9 - O2
-------------
Two different binaries with identical size and MD5sum gave consistently different results... Just luck with random memory layouts?
74.46
74.58
74.77
76.49
76.58
74.67
76.33
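The note above mentions two binaries giving consistently different results, but the listing does not say which run came from which binary. One way to at least see the two groups is to split the sorted timings at their largest gap. This is a sketch under that assumption, not a record of how the runs were actually grouped:

```python
from statistics import mean

# LLVM 3.9 - O2, Scene 1 render times in seconds, copied from the listing above.
runs = [74.46, 74.58, 74.77, 76.49, 76.58, 74.67, 76.33]

def split_at_largest_gap(values):
    """Sort values and split them into two groups at the widest gap."""
    ordered = sorted(values)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    cut = gaps.index(max(gaps)) + 1
    return ordered[:cut], ordered[cut:]

low, high = split_at_largest_gap(runs)
print(f"fast group: {low}, mean {mean(low):.2f} s")
print(f"slow group: {high}, mean {mean(high):.2f} s")
```

The two groups differ by a little under 2 seconds on average, far more than the spread within either group.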
LLVM 3.9 - O3
-------------
76.04
75.76
76.15
76.46
76.58
76.83
LLVM 3.9 - Os
-------------
77.65
77.69
78.11
76.62
76.63
78.51
LLVM 4.0 - O2
-------------
75.76
76.14
75.94
75.65
75.92
75.82
LLVM 4.0 - O3
-------------
75.51
75.24
75.22
75.84
76.08
75.81
LLVM 4.0 - Os
-------------
76.25
76.6
76.17
76.11
76.0
76.19
LLVM 5.0 - O2
-------------
59.10494
59.46831
59.31131
59.29539
59.46924
59.34171
LLVM 5.0 - O3
-------------
59.76187
59.73822
59.82621
59.86747
59.74842
59.5381
LLVM 5.0 - Os
-------------
59.39022
59.79142
59.8394
59.86937
59.37262
59.5809
Scene 2:
--------
GCC 4.8 - O2
------------
180.42
179.87
179.26
179.24
181.25
180.7
GCC 4.8 - O3
------------
184.58
178.43
181.49
184.16
182.88
179.92
GCC 4.8 - Os
------------
241.02
239.71
242.65
244.07
241.14
239.81
GCC 4.9 - O2
------------
176.33
175.67
176.02
174.59
173.89
175.57
GCC 4.9 - O3
------------
174.7
175.57
171.87
171.16
175.22
173.88
GCC 4.9 - Os
------------
244.94
246.6
243.93
245.41
244.53
245.39
GCC 5.4 - O2
------------
196.91
194.95
197.9
196.44
196.92
199.01
GCC 5.4 - O3
------------
199.99
198.91
199.61
200.78
199.38
199.4
GCC 5.4 - Os
------------
245.25
244.72
245.1
247.04
241.9
243.69
GCC 6.3 - O2
------------
196.32
197.83
196.47
196.12
196.61
195.59
GCC 6.3 - O3
------------
198.64
196.44
197.65
198.83
199.34
202.6
GCC 6.3 - Os
------------
242.72
241.29
238.91
243.22
244.56
242.09
GCC 7.1 - O2
------------
196.17
196.24
195.67
195.4
195.19
195.11
GCC 7.1 - O3
------------
194.22
193.6
192.64
195.76
193.63
193.95
GCC 7.1 - Os
------------
232.36
231.42
231.05
230.37
231.59
228.99
LLVM 3.8 - O2
-------------
162.14
162.42
162.59
164.55
161.75
163.62
LLVM 3.8 - O3
-------------
163.35
162.99
163.58
160.53
162.69
162.64
LLVM 3.8 - Os
-------------
163.3
163.35
162.5
164.46
164.47
161.3
LLVM 3.9 - O2
-------------
164.37
164.13
163.18
165.05
162.44
172.58
LLVM 3.9 - O3
-------------
165.42
165.03
164.55
167.48
166.62
164.71
LLVM 3.9 - Os
-------------
166.81
166.78
166.03
165.14
164.75
166.47
LLVM 4.0 - O2
-------------
164.38
162.44
162.21
163.86
162.43
162.65
LLVM 4.0 - O3
-------------
161.66
163.28
163.17
164.01
162.47
163.74
LLVM 4.0 - Os
-------------
164.74
162.84
162.46
163.66
163.26
163.17
LLVM 5.0 - O2
-------------
164.52
163.77
166.37
163.09
162.72
164.06
LLVM 5.0 - O3
-------------
165.13
165.61
165.08
163.84
164.05
164.2
LLVM 5.0 - Os
-------------
165.07
164.91
164.27
164.72
162.85
163.31
Scene 3:
--------
GCC 4.8 - O2
------------
33.32226
33.48312
33.47908
33.36416
33.33967
33.49392
GCC 4.8 - O3
------------
33.2799
33.28379
33.32472
33.4115
33.27513
33.30364
GCC 4.8 - Os
------------
38.98763
39.21229
39.16736
39.03934
39.13645
39.01673
GCC 4.9 - O2
------------
33.57348
33.55575
33.51794
33.5071
33.5833
33.54093
GCC 4.9 - O3
------------
33.2321
33.15517
33.21386
33.39822
33.41943
33.08116
GCC 4.9 - Os
------------
39.07437
38.8753
38.76696
38.80994
38.89322
38.9306
GCC 5.4 - O2
------------
33.54735
33.36386
33.50536
33.46032
33.33546
33.34143
GCC 5.4 - O3
------------
32.80032
32.76485
32.72706
32.77562
32.72713
32.89044
GCC 5.4 - Os
------------
38.94966
39.06506
39.06821
39.02556
39.02453
39.06697
GCC 6.3 - O2
------------
32.60718
32.54882
32.6611
32.73773
32.8702
32.80432
GCC 6.3 - O3
------------
32.38673
32.46868
32.58288
32.23814
32.35074
32.32375
GCC 6.3 - Os
------------
38.29238
38.19671
38.24157
38.25996
38.40025
38.38785
GCC 7.1 - O2
------------
32.80596
32.77457
32.71059
32.66917
32.76218
32.71005
GCC 7.1 - O3
------------
31.83457
31.94614
31.94188
32.05389
31.89560
31.93986
GCC 7.1 - Os
------------
37.50075
37.5858
37.58491
37.70684
37.84435
37.83472
LLVM 3.8 - O2
-------------
42.01244
42.04181
42.12865
42.33704
42.34631
42.20418
LLVM 3.8 - O3
-------------
41.83455
41.82077
42.00944
41.74965
41.79423
41.78191
LLVM 3.8 - Os
-------------
42.92604
42.96269
42.96959
43.26074
43.25978
43.3138
LLVM 3.9 - O2
-------------
41.83468
41.91025
41.93987
42.09748
42.16781
42.15306
LLVM 3.9 - O3
-------------
41.4883
41.43939
41.50032
41.34367
41.31301
41.36161
LLVM 3.9 - Os
-------------
42.71885
42.76104
42.71565
42.99491
42.90148
42.86661
LLVM 4.0 - O2
-------------
40.62165
40.64902
40.64455
40.80723
40.71471
40.68766
LLVM 4.0 - O3
-------------
41.27802
41.26978
41.33857
41.38128
41.36125
41.33312
LLVM 4.0 - Os
-------------
42.51669
42.93512
42.55947
42.10146
42.05888
42.21505
LLVM 5.0 - O2
-------------
33.22879
33.42002
33.33204
33.1982
33.21148
33.23548
LLVM 5.0 - O3
-------------
33.08145
33.03157
33.15889
33.14731
33.30016
33.28567
LLVM 5.0 - Os
-------------
34.03637
34.10956
33.988
34.11975
33.85392
33.81756
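For anyone wanting to analyse the rendering numbers rather than read them, the plain-text layout used above ("Scene N:" headings, "Compiler version - Opt" headings, one timing per line) is simple to parse. The following is a minimal sketch that assumes it is fed only the rendering section; the function name parse_runs and the key layout are made up for illustration.

```python
import re

SCENE = re.compile(r"^Scene ?(\d+):$")
HEADING = re.compile(r"^(GCC|LLVM) (\d+\.\d+) - (O2|O3|Os)$")

def parse_runs(text):
    """Map (scene, compiler, version, opt) -> list of timings in seconds."""
    results = {}
    scene = None
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        scene_match = SCENE.match(line)
        if scene_match:
            scene = int(scene_match.group(1))
            current = None
            continue
        heading_match = HEADING.match(line)
        if heading_match:
            current = (scene,) + heading_match.groups()
            results.setdefault(current, [])
            continue
        if current is None:
            continue
        try:
            results[current].append(float(line))
        except ValueError:
            # Dash underlines, free-text notes and blank lines are skipped.
            pass
    return results
```

The compilation section would need extra handling for the "Threads N:" labels, since this sketch would otherwise merge all thread counts under one heading.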