@ppearson
Created December 14, 2014 17:54
Compiler Benchmark Raw numbers
Scene1: CornellBox room (fully-enclosed, with front wall invisible to camera rays), with two area
lights, and Stanford dragon (1M tris) with SSS material and Robot (1.6M tris) with metal material
and car paint material (both microfacet). Using full brute-force multiple-scattering volumetric rendering for SSS with MIS,
two light samples. Max path length 5, diffuse/spec depth 4. 512x512, 128 SPP, MN
Scene2: Anisotropic metal plane (with procedural simplex noise bump texture), voxel volume (67x46x66, trilinear
interpolation) with dense grids for density and temperature with a (pretty bad) blackbody shader
for the emission, toy train (17K tris) with 8-bit image textures for diffuse, spec colour and bump,
and RC helicopter (807K tris) with metal, rough
and smooth plastic materials, with quaternion interpolation motion-blur on the rotor blades, multiple
scattering with MIS, HDR environment light physical sky. Max path length 5, diffuse/spec depth 4.
720x576, 128 SPP, MN
Scene3: Island with 165,000 instanced trees. Reflective ocean plane with procedural simplex noise bump texture,
tree bark textured (diffuse, spec colour 16-bit images and procedural bump), tree leaf material: fresnel spec with
backlight diffuse, HDR environment light physical sky.
Deep render at 720x576, 64 SPP, Max path length 4, diff/spec depth 3, LS
CPU: dual E5-2643
Compiler flags:
GCC, LLVM:
-O2 -march=native -mfpmath=sse -fPIC -ffast-math -msse -msse2 -msse3 -mssse3 -msse4
-O3 -march=native -mfpmath=sse -fPIC -ffast-math -msse -msse2 -msse3 -mssse3 -msse4
Intel:
-O2 -fPIC -fp-model fast -msse4 -no-intel-extensions
-O3 -fPIC -fp-model fast -msse4 -no-intel-extensions
Also did a couple of ICC tests with AVX, using:
-O3 -inline-level=2 -xHost -fPIC -fp-model fast -no-intel-extensions
Build times, 8 threads (-j8)
g++ 4.7.3 - O2
---------
43.39 seconds
43.32 seconds
43.46 seconds
g++ 4.7.3 - O3
---------
44.8 seconds
44.8 seconds
44.59 seconds
g++ 4.8.3 - O2
---------
43.41 seconds
43.36 seconds
43.54 seconds
g++ 4.8.3 - O3
---------
44.78 seconds
44.8 seconds
44.65 seconds
g++ 4.9.2 - O2
---------
45.82 seconds
45.65 seconds
45.47 seconds
g++ 4.9.2 - O3
---------
47.19 seconds
47.36 seconds
47.05 seconds
llvm 3.6 - O2
--------
60.55 seconds
59.92 seconds
59.5 seconds
llvm 3.6 - O3
--------
61.39 seconds
60.07 seconds
61.09 seconds
ICC 15.0 - O2
---------
51.32 seconds
52.25 seconds
(FlexLM license server issues, so got fed up of testing)
ICC 15.0 - O3
---------
50.67 seconds
51.57 seconds
(FlexLM license server issues)
ICC 15.0 - O3 + AVX
---------
53.13 seconds
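For a quick side-by-side of the build times, the runs above can be averaged per compiler. A minimal sketch in Python, with the timings copied from the lists above; the dictionary layout and the names `build_times` and `summarise` are only for illustration and are not part of the original test setup. Note that the ICC entries have fewer runs, so their averages are less reliable.

```python
from statistics import mean

# Build times in seconds, copied from the runs listed above.
build_times = {
    "g++ 4.7.3 O2": [43.39, 43.32, 43.46],
    "g++ 4.7.3 O3": [44.8, 44.8, 44.59],
    "g++ 4.8.3 O2": [43.41, 43.36, 43.54],
    "g++ 4.8.3 O3": [44.78, 44.8, 44.65],
    "g++ 4.9.2 O2": [45.82, 45.65, 45.47],
    "g++ 4.9.2 O3": [47.19, 47.36, 47.05],
    "llvm 3.6 O2": [60.55, 59.92, 59.5],
    "llvm 3.6 O3": [61.39, 60.07, 61.09],
    "ICC 15.0 O2": [51.32, 52.25],       # only two runs recorded
    "ICC 15.0 O3": [50.67, 51.57],       # only two runs recorded
    "ICC 15.0 O3 + AVX": [53.13],        # single run
}


def summarise(times):
    """Return the mean build time for each compiler configuration."""
    return {name: mean(runs) for name, runs in times.items()}


build_means = summarise(build_times)
fastest = min(build_means, key=build_means.get)

# Print configurations from fastest to slowest, with the ratio to the fastest.
for name, avg in sorted(build_means.items(), key=lambda kv: kv[1]):
    ratio = avg / build_means[fastest]
    print(f"{name:18s} {avg:6.2f} s  ({ratio:.2f}x)")
```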
Executable sizes (bytes):
GCC 4.7.3 O2: 5450540
GCC 4.7.3 O3: 5599779
GCC 4.8.3 O2: 5533478
GCC 4.8.3 O3: 5620703
GCC 4.9.2 O2: 5572401
GCC 4.9.2 O3: 5641183
LLVM 3.6 SVN O2: 5548575
LLVM 3.6 SVN O3: 5546138
ICC 15.0 O2: 7294836
ICC 15.0 O3: 7307492
benchmark timings
======
16 threads
g++ 4.7.3 - O2
--------------
Scene1:
66.16
66.61
66.64
66.56
66.87
Scene2:
83.46
83.21
82.98
82.65
83.03
Scene3:
78.81
78.43
78.49
78.73
78.23
Mipmap:
48.30592
48.52533
48.75819
48.46161
48.45272
Simplex Noise:
58.05239
58.00146
58.04959
58.06759
g++ 4.7.3 - O3
--------------
Scene1:
66.40
66.96
66.19
66.78
66.72
Scene2:
82.92
83.51
82.88
83.74
Scene3:
79.72
79.37
79.87
80.21
80.07
Mipmap:
48.51241
48.32204
48.19698
48.55927
Simplex Noise:
58.14884
58.02725
58.00416
58.07203
58.03000
58.04147
g++ 4.8.3 - O2
--------------
Scene1:
66.46
65.76
66.14
66.50
Scene2:
82.01
81.58
81.68
82.10
82.58
Scene3:
77.36
77.88
77.27
77.96
Mipmap:
48.60808
48.39812
48.54532
48.46358
Simplex Noise:
57.02998
57.19717
58.07825
57.13549
57.17956
57.49133
57.16777
g++ 4.8.3 - O3
--------------
Scene1:
67.13
66.98
67.38
67.31
Scene2:
80.72
81.29
81.11
81.24
Scene3:
79.27
79.59
79.38
79.78
Mipmap:
48.52480
48.50242
48.53644
48.50602
Simplex Noise:
57.18209
57.18441
57.18813
57.16033
57.15507
g++ 4.9.2 - O2
--------------
Scene1:
67.93
67.31
67.70
67.14
Scene2:
85.84
85.35
83.28
83.00
88.43
82.39
87.98
86.52
Scene3:
79.84
79.92
84.51
79.84
84.70
78.99
82.54
79.14
Mipmap:
48.94262
49.08687
48.99200
48.94998
Simplex Noise:
54.19921
54.19584
54.26487
54.53203
54.21451
g++ 4.9.2 - O3
--------------
Scene1:
67.34
67.73
67.27
67.49
Scene2:
80.66
83.23
81.78
83.81
84.42
80.28
81.27
84.03
Scene3:
83.87
79.53
83.54
79.56
79.94
79.47
79.54
80.09
Mipmap:
48.37876
48.62971
48.46039
48.47242
Simplex Noise:
54.20487
54.25435
54.22344
54.12791
54.18971
LLVM 3.6 - O2
-------------
Scene1:
65.12
65.06
64.11
64.61
65.06
64.54
64.91
64.87
Scene2:
70.04
70.29
69.68
70.15
(this was so much faster that I ran a few more just to check)
69.86
70.32
69.71
Scene3:
79.53
79.76
79.21
80.06
Mipmap:
56.80294
56.44910
56.71217
56.24026
56.48234
Simplex Noise:
49.23084
49.35447
49.27471
49.48672
49.43071
LLVM 3.6 - O3
-------------
Scene1:
65.04
64.08
63.94
64.22
64.91
Scene2:
69.89
70.35
70.27
69.90
Scene3:
78.84
79.03
79.01
79.14
Mipmap:
56.37586
56.25753
56.69339
56.39633
56.60046
Simplex Noise:
49.66771
49.61252
49.66463
49.63885
ICC 15.0 - O2
--------
Scene1:
66.26
67.03
66.13
65.62
66.54
65.94
Scene2:
Wouldn't run without issues
Scene3:
81.52
81.19
81.27
81.25
Mipmap:
55.78321
55.74312
55.18375
55.17110
55.16112
Simplex Noise:
55.94818
55.94186
56.08366
55.94086
ICC 15.0 - O3
--------
Scene1:
67.14
66.06
65.78
66.14
65.33
66.83
Scene2:
Wouldn't run without issues
Scene3:
81.98
81.28
81.30
81.21
Mipmap:
55.09479
55.28289
54.99294
55.09926
55.07039
Simplex Noise:
59.31031
56.01021
55.99270
56.01010
55.98562
56.01012
ICC 15.0 - O3 - AVX
----------
Scene1:
65.07
65.56
66.52
66.10
66.31
Scene2:
ICC 15.0 - O3, -xHost (AVX) and full inlining at the compiler's discretion (-inline-level=2):
Scene1:
Mipmap:
55.34891
55.36664
55.38241
ICC 15.0 - -O3 -no-prec-div -fp-model fast=2 -xHost:
Scene1:
66.40
66.17
Scene3:
82.39
82.88
Mipmap:
55.08535
55.03044
Simplex Noise:
53.73193
53.74097
54.43051
53.74580
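The per-scene runs are easier to compare once reduced to a mean and a spread. A minimal sketch in Python for Scene2, using three of the configurations listed above (values copied from the 16-thread results; lower is faster). The names `scene2` and `relative_to`, and the choice of g++ 4.7.3 O2 as baseline, are assumptions made for illustration.

```python
from statistics import mean, stdev

# Scene2 timings copied from the 16-thread results above.
scene2 = {
    "g++ 4.7.3 O2": [83.46, 83.21, 82.98, 82.65, 83.03],
    "g++ 4.9.2 O2": [85.84, 85.35, 83.28, 83.00, 88.43, 82.39, 87.98, 86.52],
    "LLVM 3.6 O2": [70.04, 70.29, 69.68, 70.15, 69.86, 70.32, 69.71],
}


def relative_to(baseline, runs):
    """Ratio of mean timings; values below 1.0 mean faster than the baseline."""
    return mean(runs) / mean(baseline)


baseline = scene2["g++ 4.7.3 O2"]
for name, runs in scene2.items():
    print(
        f"{name:14s} mean {mean(runs):6.2f}  "
        f"stdev {stdev(runs):5.2f}  "
        f"{relative_to(baseline, runs):.3f}x of baseline"
    )
```

The sample standard deviation is worth printing alongside the mean here: the g++ 4.9.2 runs for Scene2 and Scene3 vary by several units between runs, so small differences between those configurations should be read with caution.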