@fwsGonzo
Last active February 18, 2024 14:09
$ ./rvlinux ../binaries/STREAM/build/stream
-------------------------------------------------------------
STREAM version $Revision: 5.10 $
-------------------------------------------------------------
This system uses 4 bytes per array element.
-------------------------------------------------------------
Array size = 20000000 (elements), Offset = 0 (elements)
Memory per array = 76.3 MiB (= 0.1 GiB).
Total memory required = 228.9 MiB (= 0.2 GiB).
Each kernel will be executed 10 times.
The *best* time for each kernel (excluding the first iteration)
will be used to compute the reported bandwidth.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 96551 microseconds.
(= 96551 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function    Best Rate MB/s  Avg time     Min time     Max time
Copy:           18262.5     0.010872     0.008761     0.012888
Scale:           9456.8     0.017738     0.016919     0.022430
Add:            10730.6     0.022911     0.022366     0.024496
Triad:           7793.5     0.032249     0.030795     0.036608
-------------------------------------------------------------
Solution Validates: avg error less than 1.000000e-06 on all three arrays
-------------------------------------------------------------
>>> Program exited, exit code = 0 (0x0)
Instructions executed: 1177805251 Runtime: 1489.893ms Insn/s: 791mi/s
Pages in use: 233 (932 kB virtual memory, total 1978 kB)

fwsGonzo commented Nov 12, 2022

The STREAM benchmark shown above was run with libriscv in interpreter mode, that is, without binary translation.
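
For reference, the Best Rate column can be reproduced from the Min time column. The sketch below assumes the usual STREAM byte accounting (Copy and Scale touch two arrays per iteration, Add and Triad touch three) and MB = 10^6 bytes. The names `ARRAYS_TOUCHED` and `best_rate_mb_s` are illustrative and do not come from STREAM or libriscv.

```python
# Values taken from the benchmark output above.
ARRAY_SIZE = 20_000_000      # elements per array
BYTES_PER_ELEMENT = 4        # "This system uses 4 bytes per array element."

# Assumption: number of arrays read or written per iteration of each kernel.
ARRAYS_TOUCHED = {"Copy": 2, "Scale": 2, "Add": 3, "Triad": 3}


def best_rate_mb_s(kernel: str, min_time_s: float) -> float:
    """Bandwidth in MB/s (1 MB = 10^6 bytes) derived from the best time."""
    bytes_moved = ARRAYS_TOUCHED[kernel] * BYTES_PER_ELEMENT * ARRAY_SIZE
    return 1.0e-6 * bytes_moved / min_time_s


# Min time column of the interpreter-mode run.
interpreter_min_times = {
    "Copy": 0.008761,
    "Scale": 0.016919,
    "Add": 0.022366,
    "Triad": 0.030795,
}

for kernel, min_time in interpreter_min_times.items():
    print(f"{kernel + ':':8s}{best_rate_mb_s(kernel, min_time):10.1f} MB/s")
```

The results should agree with the printed Best Rate values to within rounding, since the Min time column is itself rounded to six decimals.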


fwsGonzo commented Oct 25, 2023

With binary translation:

$ ./rvlinux ../binaries/STREAM/build/stream 
-------------------------------------------------------------
STREAM version $Revision: 5.10 $
-------------------------------------------------------------
This system uses 4 bytes per array element.
-------------------------------------------------------------
Array size = 20000000 (elements), Offset = 0 (elements)
Memory per array = 76.3 MiB (= 0.1 GiB).
Total memory required = 228.9 MiB (= 0.2 GiB).
Each kernel will be executed 10 times.
 The *best* time for each kernel (excluding the first iteration)
 will be used to compute the reported bandwidth.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 16090 microseconds.
   (= 16090 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function    Best Rate MB/s  Avg time     Min time     Max time
Copy:           33320.2     0.004919     0.004802     0.005124
Scale:          32956.0     0.004941     0.004855     0.005010
Add:            31775.8     0.007746     0.007553     0.007912
Triad:          30852.0     0.007901     0.007779     0.008037
-------------------------------------------------------------
Solution Validates: avg error less than 1.000000e-06 on all three arrays
-------------------------------------------------------------
>>> Program exited, exit code = 0 (0x0)
Instructions executed: 1177821750  Runtime: 394.081ms  Insn/s: 2989mi/s
Pages in use: 233 (932 kB virtual memory, total 1978 kB)
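
To put the two runs side by side, here is a rough sketch that computes the whole-program speedup and the per-kernel bandwidth ratios from the figures printed in the two outputs. The dictionary and function names are made up for illustration. Note that the runtime covers the entire program (array initialisation and validation included), not just the timed kernels.

```python
# Whole-program figures from the "Instructions executed" lines of each run.
interpreter = {"instructions": 1_177_805_251, "runtime_ms": 1489.893}
translated = {"instructions": 1_177_821_750, "runtime_ms": 394.081}

# Best Rate (MB/s) per kernel from each run.
interpreter_rates = {"Copy": 18262.5, "Scale": 9456.8, "Add": 10730.6, "Triad": 7793.5}
translated_rates = {"Copy": 33320.2, "Scale": 32956.0, "Add": 31775.8, "Triad": 30852.0}


def million_insn_per_s(run: dict) -> float:
    """Instructions per second, in millions, for one run."""
    return run["instructions"] / (run["runtime_ms"] / 1000.0) / 1.0e6


# Ratio of total runtimes: how much faster the translated run finished.
runtime_speedup = interpreter["runtime_ms"] / translated["runtime_ms"]

# Ratio of reported bandwidth for each kernel.
kernel_speedups = {
    kernel: translated_rates[kernel] / interpreter_rates[kernel]
    for kernel in interpreter_rates
}

print(f"interpreter: {million_insn_per_s(interpreter):.0f} mi/s")
print(f"translated:  {million_insn_per_s(translated):.0f} mi/s")
print(f"runtime speedup: {runtime_speedup:.2f}x")
for kernel, ratio in kernel_speedups.items():
    print(f"{kernel + ':':8s}{ratio:.2f}x")
```

The ratios vary by kernel: Copy improves the least and Triad the most, while the overall runtime drops by a bit under 4x.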
