@heatd
Created June 12, 2022 22:55
[pfalcato@PC-PEDRO-ARCH tinymembench]$ ./tinymembench
tinymembench v0.4.9 (simple benchmark for memory throughput and latency)
==========================================================================
== Memory bandwidth tests                                               ==
==                                                                      ==
== Note 1: 1MB = 1000000 bytes                                          ==
== Note 2: Results for 'copy' tests show how many bytes can be          ==
==         copied per second (adding together read and writen           ==
==         bytes would have provided twice higher numbers)              ==
== Note 3: 2-pass copy means that we are using a small temporary buffer ==
==         to first fetch data into it, and only then write it to the   ==
==         destination (source -> L1 cache, L1 cache -> destination)    ==
== Note 4: If sample standard deviation exceeds 0.1%, it is shown in    ==
==         brackets                                                     ==
==========================================================================
C copy backwards : 4739.7 MB/s (1.0%)
C copy backwards (32 byte blocks) : 4735.0 MB/s (0.6%)
C copy backwards (64 byte blocks) : 4731.6 MB/s (1.6%)
C copy : 4800.2 MB/s (0.4%)
C copy prefetched (32 bytes step) : 4689.6 MB/s (0.5%)
C copy prefetched (64 bytes step) : 4685.5 MB/s (11.5%)
C 2-pass copy : 4621.5 MB/s (7.1%)
C 2-pass copy prefetched (32 bytes step) : 4593.1 MB/s (0.7%)
C 2-pass copy prefetched (64 bytes step) : 4596.0 MB/s (0.7%)
C fill : 7335.6 MB/s (1.1%)
C fill (shuffle within 16 byte blocks) : 7352.8 MB/s (1.1%)
C fill (shuffle within 32 byte blocks) : 7349.4 MB/s (0.6%)
C fill (shuffle within 64 byte blocks) : 7322.6 MB/s (6.8%)
---
standard memcpy : 6715.9 MB/s (0.7%)
standard memset : 15051.6 MB/s (13.6%)
---
MOVSB copy : 5925.9 MB/s (7.5%)
MOVSD copy : 5937.8 MB/s (0.7%)
SSE2 copy : 4815.2 MB/s (5.9%)
SSE2 nontemporal copy : 6847.2 MB/s (1.0%)
SSE2 copy prefetched (32 bytes step) : 4674.0 MB/s (7.1%)
SSE2 copy prefetched (64 bytes step) : 4699.8 MB/s (5.7%)
SSE2 nontemporal copy prefetched (32 bytes step) : 6740.9 MB/s (2.5%)
SSE2 nontemporal copy prefetched (64 bytes step) : 6793.3 MB/s (0.4%)
SSE2 2-pass copy : 4729.6 MB/s (0.7%)
SSE2 2-pass copy prefetched (32 bytes step) : 4633.9 MB/s (0.8%)
SSE2 2-pass copy prefetched (64 bytes step) : 4636.5 MB/s (0.7%)
SSE2 2-pass nontemporal copy : 4287.9 MB/s (1.0%)
SSE2 fill : 7315.6 MB/s (0.4%)
SSE2 nontemporal fill : 16398.0 MB/s (0.9%)
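
Note 3 in the banner describes the "2-pass copy" only in words, and Notes 1 and 2 define how the MB/s figures are counted. The C sketch below illustrates both ideas. It is not tinymembench's own code: the names `two_pass_copy`, `copy_mb_per_s`, and `STAGE_BYTES`, and the 2048-byte staging size, are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical staging-buffer size: small enough to stay resident in L1. */
#define STAGE_BYTES 2048

/* Copy n bytes from src to dst in two passes per chunk:
 * pass 1 reads the source into a small staging buffer (source -> L1 cache),
 * pass 2 writes the staging buffer out (L1 cache -> destination). */
static void two_pass_copy(void *dst, const void *src, size_t n)
{
    unsigned char stage[STAGE_BYTES];
    unsigned char *d = dst;
    const unsigned char *s = src;

    assert(dst != NULL && src != NULL);

    while (n > 0) {
        size_t chunk = n < STAGE_BYTES ? n : STAGE_BYTES;
        memcpy(stage, s, chunk);   /* pass 1: fetch into the staging buffer */
        memcpy(d, stage, chunk);   /* pass 2: store to the destination */
        s += chunk;
        d += chunk;
        n -= chunk;
    }
}

/* Throughput as defined by Notes 1 and 2: 1 MB = 1000000 bytes, and only
 * the bytes copied are counted (not bytes read plus bytes written). */
static double copy_mb_per_s(size_t bytes_copied, double seconds)
{
    return (double)bytes_copied / 1e6 / seconds;
}
```

Under these definitions, copying 2000000 bytes in 0.5 s is reported as 4.0 MB/s. Counting the reads and the writes separately would give 8.0 MB/s for the same run, which is the "twice higher" number Note 2 refers to.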