@zeux
Created December 5, 2018 04:43
Amazon A1 (AArch64) vs Amazon T2 (x64). VtxCodec uses SSSE3 on x64 and NEON on ARM. Look at revision history to see the diff.
a1.medium $ ./build/release/meshoptimizer buddha.obj
# buddha.obj: 549409 vertices, 1087474 triangles; read in 483.46 msec; indexed in 354.61 msec
Original : ACMR 1.556966 ATVR 3.081784 (NV 3.124747 AMD 3.277660 Intel 2.289651) Overfetch 2.105950 Overdraw 1.200370 in 0.00 msec
Random : ACMR 2.999919 ATVR 5.937897 (NV 5.937882 AMD 5.937935 Intel 5.936783) Overfetch 10.839888 Overdraw 1.218682 in 33.38 msec
Cache : ACMR 0.661465 ATVR 1.309272 (NV 1.590738 AMD 1.434356 Intel 1.138871) Overfetch 1.509062 Overdraw 1.206893 in 477.86 msec
CacheFifo: ACMR 0.689948 ATVR 1.365651 (NV 1.706663 AMD 1.516610 Intel 1.229416) Overfetch 1.552013 Overdraw 1.197034 in 146.00 msec
Overdraw : ACMR 2.776432 ATVR 5.495538 (NV 5.508446 AMD 5.527603 Intel 5.314811) Overfetch 8.624212 Overdraw 1.086317 in 209.29 msec
Fetch : ACMR 1.556966 ATVR 3.081784 (NV 3.124747 AMD 3.277660 Intel 2.289651) Overfetch 2.105950 Overdraw 1.200370 in 25.13 msec
FetchMap : ACMR 1.556966 ATVR 3.081784 (NV 3.124747 AMD 3.277660 Intel 2.289651) Overfetch 2.105950 Overdraw 1.200370 in 30.68 msec
Complete : ACMR 0.667803 ATVR 1.321817 (NV 1.602355 AMD 1.446700 Intel 1.167953) Overfetch 1.242535 Overdraw 1.105052 in 628.91 msec
Stripify : ACMR 0.695596 ATVR 1.376829 (NV 1.697231 AMD 1.591876 Intel 1.139321); 1894000 strip indices (58.1%) in 81.61 msec
Meshlets : 11285 meshlets (avg vertices 64.0, avg triangles 96.4, not full 111) in 29.08 msec
ConeCull : rejected apex 2724 (24.1%) / center 2723 (24.1%), trivially accepted 540 (4.8%) in 83.73 msec
ConeCull8: rejected apex 2663 (23.6%) / center 2663 (23.6%), trivially accepted 645 (5.7%) in 83.73 msec
ShadowIB : ACMR 0.656863 (1.01x improvement); 543439 shadow vertices (1.01x improvement) in 93.90 msec
IdxCodec : 9.9 bits/triangle (post-deflate 6.2 bits/triangle); encode 51.37 msec, decode 11.71 msec (1.04 GB/s)
VtxPack : 128.0 bits/vertex (post-deflate 63.9 bits/vertex)
VtxCodec : 45.5 bits/vertex (post-deflate 39.9 bits/vertex); encode 50.51 msec, decode 10.51 msec (0.78 GB/s)
VtxCodecO: 37.9 bits/vertex (post-deflate 33.3 bits/vertex); encode 39.87 msec, decode 8.49 msec (0.72 GB/s)
Simplify : 1087474 triangles => 5 LOD levels down to 261072 triangles in 1630.95 msec, optimized in 1810.75 msec
ACMR 0.663601...0.685217 Overfetch 1.319371..1.201511 Codec VB 42.4 bits/vertex IB 17.2 bits/triangle
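
Several derived figures in the log above can be cross-checked from the raw counts it prints. The following is a minimal sketch, not part of the benchmark tool. It assumes, without the log stating so, that indices are 32-bit, that the VtxCodec input is the 16-byte vertex implied by the VtxPack line (128 bits/vertex), and that "GB" means 2**30 bytes. The helper name `throughput_gib_per_s` is hypothetical.

```python
# Cross-check a few derived numbers from the a1.medium log.
VERTICES = 549409
TRIANGLES = 1087474


def throughput_gib_per_s(num_bytes, msec):
    """Bytes processed per second, in units of 2**30 bytes (assumed meaning of GB)."""
    return num_bytes / (msec / 1000.0) / 2**30


# ASSUMPTION: 3 indices per triangle, 4 bytes per index; decode took 11.71 msec.
idx_rate = throughput_gib_per_s(TRIANGLES * 3 * 4, 11.71)

# ASSUMPTION: 16 bytes per vertex (128 bits/vertex from VtxPack); decode took 10.51 msec.
vtx_rate = throughput_gib_per_s(VERTICES * 16, 10.51)

# ACMR counts transformed vertices per triangle, ATVR per vertex,
# so ATVR = ACMR * triangles / vertices (checked on the "Original" row).
atvr_from_acmr = 1.556966 * TRIANGLES / VERTICES

# Stripify: strip indices as a fraction of the 3-per-triangle index list.
strip_ratio = 1894000 / (TRIANGLES * 3)

print(f"IdxCodec decode: {idx_rate:.2f} GB/s")
print(f"VtxCodec decode: {vtx_rate:.2f} GB/s")
print(f"ATVR from ACMR:  {atvr_from_acmr:.6f}")
print(f"Strip indices:   {strip_ratio:.1%}")
```

Under these assumptions the computed values agree with the logged IdxCodec and VtxCodec throughput, the Original ATVR, and the Stripify percentage. The VtxCodecO line does not fit a 16-byte vertex, which suggests it encodes a smaller vertex layout.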