@ZenithalHourlyRate
Last active May 17, 2022 08:08
Benchmark of OpenSSL AES for RISC-V 64

setup

This is evaluated on a Rocket core with Zb/Zk support (note: the current implementation needs only 1 cycle for aes64esm, at considerable hardware cost) with this config, running at 100 MHz on an xc7k325tffg900-2 FPGA board.

We have the following implementations of AES

  1. pure C version
  2. rv64i asm from #17640
  3. rv64i zkne zknd asm from #18197

We will also evaluate the following implementations of GHASH

  1. pure C version
  2. rv64i zbb zbc asm from #17640

All openssl binaries are statically compiled (for a busybox rootfs) using riscv64-musl-ubuntu-18.04-nightly-2022.04.23-nightly.tar.gz as the toolchain (CC="riscv64-unknown-linux-musl-gcc" ./config linux64-riscv64 --static -static). The resulting build flags are

version: 3.1.0-dev
built on: Wed Apr 27 12:28:05 2022 UTC
options: bn(64,64)
compiler: riscv64-unknown-linux-gnu-gcc -fPIC -pthread -Wa,--noexecstack -Wall -O3 -DOPENSSL_USE_NODELETE -DOPENSSL_PIC -DOPENSSL_BUILDING_OPENSSL -DNDEBUG
CPUINFO: N/A

You may get the compiled binaries from here. In this archive, openssl-c is compiled from master, and openssl is compiled from the result of merging #17640 and #18197. To enable zkn/zbb/zbc in openssl, you need to export OPENSSL_riscvcap="rv64gc_zknd_zkne_zbb_zbc". openssl-zkn always turns on zknd/zkne. Thus, evaluating aes-gcm takes five configurations: openssl-c, openssl, OPENSSL_riscvcap="rv64gc_zbb_zbc" openssl, openssl-zkn, and OPENSSL_riscvcap="rv64gc_zbb_zbc" openssl-zkn.

The results are produced via openssl speed -evp aes-128-cbc and similar commands.

aes-cbc

  1. pure C
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-128-cbc        992.20k     1151.62k     1205.93k     1219.90k     1206.95k     1202.94k
aes-192-cbc        870.31k     1000.02k     1043.09k     1047.55k     1040.38k     1039.65k
aes-256-cbc        778.06k      878.81k      910.42k      919.20k      912.04k      912.04k
  2. rv64i asm
aes-128-cbc       1117.01k     1311.21k     1374.95k     1380.69k     1365.33k     1364.42k
aes-192-cbc        985.86k     1127.53k     1173.76k     1185.31k     1171.46k     1164.84k
aes-256-cbc        862.07k      987.25k     1020.25k     1031.88k     1024.00k     1021.27k
  3. rv64i zknd zkne asm
aes-128-cbc       2982.12k     5290.10k     6566.44k     6993.92k     6991.97k     6673.75k
aes-192-cbc       2882.87k     4948.61k     6093.14k     6458.37k     6430.72k     6191.95k
aes-256-cbc       2626.09k     4543.57k     5619.41k     5985.43k     5955.58k     5739.86k

aes-gcm

  1. pure C
AES-128-GCM        669.18k      789.44k      820.31k      829.82k      824.68k      824.66k
AES-192-GCM        613.60k      710.79k      737.11k      748.31k      740.27k      740.27k
AES-256-GCM        567.73k      652.84k      669.44k      676.18k      676.73k      677.21k
  2. rv64i asm AES with pure C GHASH
AES-128-GCM        769.17k      879.58k      913.38k      917.50k      912.35k      912.04k
AES-192-GCM        701.95k      794.22k      820.23k      827.08k      819.20k      816.48k
AES-256-GCM        645.42k      723.03k      743.17k      748.99k      745.47k      743.25k
  3. rv64i asm AES with rv64i zbb zbc asm GHASH
AES-128-GCM       1071.15k     1280.19k     1357.49k     1366.02k     1356.20k     1348.95k
AES-192-GCM        944.85k     1110.39k     1164.59k     1177.77k     1157.80k     1157.80k
AES-256-GCM        844.49k      974.10k     1018.61k     1024.68k     1016.47k     1015.81k
  4. rv64i zknd zkne AES with pure C GHASH
AES-128-GCM       1454.57k     1826.72k     1960.41k     1992.36k     1966.08k     1956.22k
AES-192-GCM       1428.87k     1783.33k     1914.54k     1945.94k     1926.08k     1906.01k
AES-256-GCM       1406.88k     1743.47k     1870.34k     1902.11k     1882.24k     1862.31k
  5. rv64i zknd zkne AES with rv64i zbb zbc asm GHASH
AES-128-GCM       2997.58k     5072.03k     6260.44k     6614.36k     6605.66k     6346.07k
AES-192-GCM       2902.73k     4780.91k     5834.40k     6132.05k     6120.71k     5887.32k
AES-256-GCM       2790.03k     4505.79k     5467.53k     5721.77k     5682.52k     5507.00k

aes-ecb

  1. pure C
AES-128-ECB       1094.13k     1260.14k     1312.85k     1328.12k     1306.37k     1305.26k
AES-192-ECB        961.99k     1086.69k     1124.69k     1132.56k     1116.84k     1117.84k
AES-256-ECB        847.37k      946.86k      971.52k      980.16k      972.63k      969.89k
  2. rv64i asm AES
AES-128-ECB       1269.05k     1456.46k     1512.19k     1529.49k     1504.15k     1490.94k
AES-192-ECB       1081.70k     1232.67k     1275.98k     1284.97k     1267.03k     1261.57k
AES-256-ECB        973.74k     1074.26k     1106.45k     1111.67k     1094.08k     1095.92k
  3. rv64i zknd zkne asm AES
AES-128-ECB       4478.88k     8493.74k    11570.60k    12583.56k    12566.53k    11736.41k
AES-192-ECB       4240.11k     7866.35k    10170.45k    10936.32k    10934.54k    10267.31k
AES-256-ECB       3888.08k     7267.68k     9070.42k     9681.42k     9622.87k     9134.49k

aes-cfb

  1. pure C
AES-128-CFB        917.28k     1004.54k     1082.73k     1090.78k     1079.48k     1072.31k
AES-192-CFB        823.19k      914.43k      941.65k      946.18k      939.35k      936.23k
AES-256-CFB        745.31k      816.36k      843.34k      844.46k      838.31k      838.25k
  2. rv64i asm
AES-128-CFB        338.70k      355.04k      358.66k      360.28k      361.65k      359.91k
AES-192-CFB        325.36k      337.29k      343.55k      345.90k      345.64k      346.06k
AES-256-CFB        311.81k      327.06k      328.87k      330.83k      332.03k      332.03k
  3. rv64i zknd zkne asm
AES-128-CFB        428.22k      457.01k      463.45k      467.14k      465.39k      465.02k
AES-192-CFB        426.39k      454.89k      460.29k      462.85k      464.21k      464.21k
AES-256-CFB        423.84k      452.54k      459.52k      462.34k      459.95k      461.83k

aes-ctr

  1. pure C
AES-128-CTR        841.20k      990.67k     1025.37k     1026.73k     1021.27k     1015.81k
AES-192-CTR        793.91k      873.56k      898.22k      907.22k      898.65k      895.66k
AES-256-CTR        719.43k      771.75k      807.13k      807.94k      805.55k      802.82k
  2. rv64i asm
AES-128-CTR        331.19k      347.53k      349.70k      352.90k      351.09k      351.47k
AES-192-CTR        317.47k      331.01k      335.45k      335.19k      337.48k      336.36k
AES-256-CTR        307.51k      318.98k      323.87k      322.90k      324.95k      324.44k
  3. rv64i zknd zkne asm
AES-128-CTR        422.02k      444.53k      450.30k      451.93k      453.29k      451.78k
AES-192-CTR        418.08k      442.71k      448.51k      451.72k      450.56k      451.78k
AES-256-CTR        414.99k      438.55k      445.78k      448.64k      447.83k      446.34k

aes-ofb

  1. pure C
AES-128-OFB        937.65k     1051.63k     1090.73k     1100.37k     1086.81k     1083.19k
AES-192-OFB        838.62k      922.20k      955.25k      956.76k      950.27k      947.11k
AES-256-OFB        759.83k      823.06k      845.23k      850.37k      846.51k      843.69k
  2. rv64i asm
AES-128-OFB        338.41k      354.54k      359.68k      360.45k      361.97k      361.09k
AES-192-OFB        326.45k      337.02k      344.53k      344.06k      345.64k      346.06k
AES-256-OFB        312.18k      326.96k      330.23k      332.20k      332.03k      333.14k
  3. rv64i zknd zkne asm
AES-128-OFB        432.85k      460.03k      465.66k      468.51k      466.56k      468.11k
AES-192-OFB        431.09k      456.45k      464.13k      467.14k      465.39k      466.56k
AES-256-OFB        425.91k      455.15k      460.29k      463.37k      462.67k      462.67k

aes-ccm

  1. pure C
AES-128-CCM        281.98k      483.10k      588.89k      621.23k      630.78k      633.51k
AES-192-CCM        248.12k      416.60k      503.38k      533.58k      538.88k      542.48k
AES-256-CCM        217.79k      364.82k      439.64k      465.42k      472.41k      473.56k
  2. rv64i asm
AES-128-CCM        318.22k      548.56k      676.17k      717.49k      725.73k      723.94k
AES-192-CCM        278.81k      470.10k      570.11k      605.15k      616.45k      615.08k
AES-256-CCM        244.97k      411.01k      499.76k      527.75k      535.21k      537.00k
  3. rv64i zknd zkne asm
AES-128-CCM       1255.25k     2669.89k     3968.51k     4475.80k     4542.59k     4427.52k
AES-192-CCM       1142.58k     2470.68k     3606.77k     4029.44k     4090.54k     4011.07k
AES-256-CCM       1085.42k     2300.51k     3309.76k     3670.36k     3723.39k     3642.71k

aes-ocb

  1. pure C
AES-128-OCB        955.12k     1073.37k     1145.92k     1159.96k     1145.24k     1141.42k
AES-192-OCB        848.50k      960.06k      995.99k     1001.47k      991.81k      988.50k
AES-256-OCB        760.24k      849.12k      877.85k      885.64k      876.74k      873.81k
  2. rv64i asm
AES-128-OCB       1061.44k     1250.25k     1305.69k     1318.23k     1299.80k     1294.34k
AES-192-OCB        939.57k     1077.00k     1125.37k     1129.13k     1120.58k     1119.57k
AES-256-OCB        847.00k      954.28k      991.21k      998.31k      982.50k      983.04k
  3. rv64i zknd zkne asm
AES-128-OCB       1046.05k     1246.97k     1301.83k     1315.45k     1298.66k     1290.04k
AES-192-OCB        929.53k     1080.77k     1124.43k     1133.93k     1116.84k     1117.84k
AES-256-OCB        834.92k      945.92k      987.70k      994.55k      983.04k      979.77k

aes-xts

  1. pure C
AES-128-XTS        556.05k      936.43k     1118.12k     1179.48k     1182.38k     1179.65k
AES-256-XTS        437.51k      709.95k      850.54k      889.51k      898.65k      895.66k
  2. rv64i asm
AES-128-XTS        641.34k     1069.89k     1277.27k     1348.32k     1343.49k     1343.49k
AES-256-XTS        488.29k      803.56k      962.01k     1007.27k     1014.38k     1010.35k
  3. rv64i zknd zkne asm
AES-128-XTS        631.39k     1060.79k     1275.63k     1338.03k     1340.76k     1338.03k
AES-256-XTS        481.96k      799.72k      957.22k     1001.13k     1008.25k     1004.89k

Bad performance for CTR/CFB/OFB and Alignment

In the benchmark above, the accelerated versions (asm and zkne asm) of CTR/CFB/OFB mode performed even worse than the pure C version. After debugging with gdb, I found that the cause is not the optimization (-O3) of the C code but the alignment of memory accesses.

gdb into CTR mode

Originally I thought the C build had a dedicated kernel of type ctr128_f for CRYPTO_ctr128_encrypt_ctr32, which would explain why it performs better than CRYPTO_ctr128_encrypt with a block128_f asm kernel. I could not find such a function in the source tree, so I decided to use gdb to look for that ctr128_f function.

However, it turned out that the C version uses the generic AES_encrypt (type block128_f) for CTR mode. The call path is therefore the same for both C and asm, and the only difference is the AES encryption kernel. I was confused.

I often share my debugging process (e.g. backtraces) with a group of friends. At this point @cyyself asked whether the problem might be related to memory alignment.

Some background on memory alignment on RISC-V: the ISA allows misaligned loads and stores but does not require hardware to support them. In my setup, Rocket Chip does not handle misaligned loads/stores in hardware. Instead, each one traps into OpenSBI and is emulated there, which causes a drastic performance drop (imagine every load/store in the AES kernel being a syscall).

At first I thought this could not be the problem, since every memory access in my asm code is naturally aligned to 8 bytes relative to the base addresses in registers a0/a1. Then I suddenly realized that the addresses in a0/a1 themselves might be misaligned.

That turned out to be the case. Here is the backtrace.

(gdb) bt
#0  AES_encrypt (in=0x3ff7fca874 "", out=0x3ff7fca864 "", key=0x3ff7fca8a0) at crypto/aes/aes_core.c:1442
#1  0x00000000001add3a in CRYPTO_ctr128_encrypt (in=0x3ff7fab3d0 "", out=0x3ff7fab3d0 "", len=16, key=0x3ff7fca8a0, ivec=0x3ff7fca874 "", ecount_buf=0x3ff7fca864 "", 
    num=0x3ffffff46c, block=0x2bb740 <AES_encrypt>) at crypto/modes/ctr128.c:126
#2  0x00000000002b99d8 in ossl_cipher_hw_generic_ctr (dat=0x3ff7fca7e0, out=0x3ff7fab3d0 "", in=0x3ff7fab3d0 "", len=16)
    at providers/implementations/ciphers/ciphercommon_hw.c:120
#3  0x00000000002b590e in ossl_cipher_generic_stream_update (vctx=0x3ff7fca7e0, out=0x3ff7fab3d0 "", outl=0x3ffffff508, outsize=16, in=0x3ff7fab3d0 "", inl=16)
    at providers/implementations/ciphers/ciphercommon.c:469
#4  0x0000000000170312 in EVP_EncryptUpdate (ctx=0x3ff7fd8140, out=0x3ff7fab3d0 "", outl=0x3ffffff548, in=0x3ff7fab3d0 "", inl=16) at crypto/evp/evp_enc.c:647
#5  0x0000000000054434 in EVP_Update_loop (args=0x3ffffff588) at apps/speed.c:744
#6  0x0000000000055430 in run_benchmark (async_jobs=0, loop_function=0x54336 <EVP_Update_loop>, loopargs=0x3ff7ff6ad0) at apps/speed.c:1120
#7  0x00000000000588bc in speed_main (argc=0, argv=0x3ffffffaf8) at apps/speed.c:2259
#8  0x0000000000034fce in do_cmd (prog=0x3ff7ff8080, argc=5, argv=0x3ffffffad0) at apps/openssl.c:418
#9  0x0000000000034c2a in main (argc=5, argv=0x3ffffffad0) at apps/openssl.c:298
(gdb) p &(((PROV_CIPHER_CTX*)0x3ff7fca7e0)->oiv)
$12 = (unsigned char (*)[16]) 0x3ff7fca854
(gdb) p &(((PROV_CIPHER_CTX*)0x3ff7fca7e0)->num)
$13 = (unsigned int *) 0x3ff7fca850

In AES_encrypt, in (0x3ff7fca874) and out (0x3ff7fca864) are not 8-byte aligned: both addresses are 4 modulo 8. From the call (*block) (ivec, ecount_buf, key) in CRYPTO_ctr128_encrypt we know that in and out are the IV and the buffer, and from ossl_cipher_hw_generic_ctr we know that both belong to a PROV_CIPHER_CTX, namely dat (0x3ff7fca7e0). Note that even though the CTX itself is aligned, its members iv and buf are not. The cause is the preceding unsigned int num, which takes only 4 bytes under the lp64 ABI, so the byte arrays after it start at 0x3ff7fca854, 4 bytes past an 8-byte boundary.

So I hacked around it with this commit, which changes unsigned int num into unsigned long num and thereby makes dat->iv 8-byte aligned. The resulting speedup is dramatic: with zkn, AES-128-CTR at 1024 bytes goes from 450.22k to 7836.85k.

openssl-c (original version)
AES-128-CTR        879.73k      988.59k     1018.11k     1023.66k     1015.81k     1012.43k
openssl-c-long (hacked version for pure C)
AES-128-CTR       1048.14k     1181.93k     1231.11k     1239.72k     1224.69k     1219.27k
openssl-zkn (original version)
AES-128-CTR        423.31k      443.78k      450.01k      450.22k      450.29k      451.78k
openssl-zkn-long (hacked version for rv64i zkn asm)
AES-128-CTR       3590.52k     6035.76k     7389.32k     7836.85k     7771.48k     7427.41k

The performance of the C version also improves!

New benchmark

Then I made a new benchmark using the openssl-long-aes.sh script below and these binaries. The performance of CTR/CFB/OFB now improves across the board, as all three modes use the IV. Some other modes also improve thanks to the aligned buffer (I have not investigated this yet).

Start cbc for 10 sec
openssl-c
AES-128-CBC        962.67k     1138.03k     1195.67k     1206.07k     1200.13k     1196.47k
AES-192-CBC        838.36k      991.01k     1034.52k     1041.31k     1032.19k     1034.43k
AES-256-CBC        757.12k      871.66k      904.70k      911.67k      906.85k      901.86k
openssl
AES-128-CBC       1051.10k     1278.34k     1344.74k     1370.11k     1356.60k     1351.97k
AES-192-CBC        943.71k     1111.19k     1162.80k     1173.61k     1163.26k     1158.83k
AES-256-CBC        848.62k      975.48k     1016.70k     1025.84k     1016.43k     1014.79k
openssl-zkn
AES-128-CBC       2808.04k     5072.70k     6398.18k     6927.97k     6907.49k     6622.41k
AES-192-CBC       2715.28k     4780.21k     5998.37k     6383.62k     6371.74k     6124.34k
AES-256-CBC       2643.32k     4525.23k     5584.28k     5918.92k     5912.17k     5686.12k
openssl-c-long
AES-128-CBC       1012.16k     1216.29k     1283.71k     1300.99k     1280.77k     1283.22k
AES-192-CBC        877.18k     1043.62k     1095.96k     1109.81k     1097.45k     1092.81k
AES-256-CBC        798.39k      918.89k      955.70k      963.89k      956.01k      954.23k
openssl-long
AES-128-CBC       1135.74k     1392.07k     1473.69k     1493.20k     1477.84k     1468.18k
AES-192-CBC        972.45k     1178.46k     1247.33k     1259.11k     1244.36k     1245.58k
AES-256-CBC        889.92k     1037.60k     1083.01k     1092.20k     1079.71k     1076.99k
openssl-zkn-long
AES-128-CBC       3470.16k     7364.80k    10284.26k    11412.28k    11477.80k    10785.59k
AES-192-CBC       3327.43k     6761.92k     9175.04k    10050.36k    10075.34k     9533.85k
AES-256-CBC       3200.97k     6267.98k     8270.62k     8980.79k     8975.97k     8527.87k

Start ecb for 10 sec
openssl-c
AES-128-ECB       1088.22k     1254.21k     1309.59k     1321.88k     1303.35k     1295.97k
AES-192-ECB        952.69k     1078.12k     1111.45k     1123.02k     1110.54k     1108.09k
AES-256-ECB        844.52k      941.20k      968.27k      970.85k      964.05k      960.78k
openssl
AES-128-ECB       1251.52k     1441.27k     1501.41k     1516.75k     1489.31k     1484.54k
AES-192-ECB       1079.82k     1224.79k     1267.10k     1276.42k     1261.13k     1250.49k
AES-256-ECB        957.91k     1065.02k     1098.73k     1103.56k     1090.90k     1086.81k
openssl-zkn
AES-128-ECB       4330.83k     8672.99k    11491.35k    12495.77k    12468.22k    11688.35k
AES-192-ECB       4065.33k     7855.45k    10042.39k    10842.42k    10844.57k    10226.89k
AES-256-ECB       3936.47k     7172.77k     9025.79k     9626.42k     9563.34k     9056.21k
openssl-c-long
AES-128-ECB       1065.48k     1249.50k     1303.63k     1318.40k     1300.89k     1296.32k
AES-192-ECB        937.71k     1072.98k     1113.01k     1118.92k     1108.38k     1107.56k
AES-256-ECB        830.11k      935.72k      965.43k      971.37k      960.10k      960.78k
openssl-long
AES-128-ECB       1239.66k     1443.68k     1505.66k     1519.82k     1491.76k     1482.91k
AES-192-ECB       1074.32k     1224.77k     1270.40k     1279.39k     1261.94k     1258.29k
AES-256-ECB        949.29k     1064.42k     1097.68k     1104.59k     1089.54k     1089.00k
openssl-zkn-long
AES-128-ECB       4420.08k     8550.62k    11478.43k    12497.92k    12498.53k    11699.81k
AES-192-ECB       4026.98k     7826.48k    10060.80k    10875.80k    10831.46k    10230.17k
AES-256-ECB       4075.84k     7051.33k     9025.66k     9596.21k     9585.46k     9076.74k

Start cfb for 10 sec
openssl-c
AES-128-CFB        914.85k     1039.46k     1073.08k     1080.22k     1065.53k     1068.24k
AES-192-CFB        816.16k      911.99k      936.88k      941.88k      933.89k      928.75k
AES-256-CFB        739.20k      816.24k      835.58k      839.99k      833.95k      833.11k
openssl
AES-128-CFB        334.70k      352.60k      356.40k      357.48k      358.45k      358.65k
AES-192-CFB        321.70k      337.75k      341.25k      341.81k      343.38k      343.38k
AES-256-CFB        304.22k      323.26k      327.37k      328.09k      329.81k      329.64k
openssl-zkn
AES-128-CFB        425.96k      454.91k      460.77k      463.26k      464.49k      463.91k
AES-192-CFB        423.59k      452.02k      459.44k      459.67k      461.57k      461.57k
AES-256-CFB        421.39k      449.98k      457.04k      457.52k      457.48k      458.56k
openssl-c-long
AES-128-CFB       1077.41k     1237.98k     1289.63k     1301.20k     1287.78k     1277.95k
AES-192-CFB        941.30k     1066.18k     1102.41k     1109.61k     1097.73k     1093.90k
AES-256-CFB        833.82k      933.51k      957.21k      964.40k      956.83k      955.19k
openssl-long
AES-128-CFB       1197.05k     1420.87k     1484.83k     1491.35k     1478.66k     1469.81k
AES-192-CFB       1048.64k     1209.41k     1253.99k     1264.54k     1244.36k     1245.58k
AES-256-CFB        872.94k     1052.33k     1085.00k     1090.25k     1082.98k     1075.92k
openssl-zkn-long
AES-128-CFB       3962.33k     7795.17k    10398.39k    11404.49k    11422.11k    10764.29k
AES-192-CFB       3716.76k     7090.21k     9243.42k    10026.70k    10038.48k     9494.53k
AES-256-CFB       3570.91k     6530.75k     8342.76k     8972.80k     8943.21k     8504.93k

Start ctr for 10 sec
openssl-c
AES-128-CTR        874.87k      987.02k     1013.63k     1021.13k     1014.79k     1013.16k
AES-192-CTR        787.88k      872.70k      893.52k      895.69k      892.93k      890.40k
AES-256-CTR        709.80k      783.49k      801.00k      803.33k      797.92k      799.54k
openssl
AES-128-CTR        327.25k      343.40k      346.83k      348.06k      348.63k      349.57k
AES-192-CTR        315.17k      328.68k      331.57k      333.41k      335.05k      334.23k
AES-256-CTR        302.10k      316.03k      318.85k      319.80k      321.62k      321.80k
openssl-zkn
AES-128-CTR        415.55k      442.14k      448.00k      450.05k      451.38k      450.85k
AES-192-CTR        414.35k      438.67k      446.11k      447.59k      448.47k      448.47k
AES-256-CTR        413.07k      437.67k      441.50k      445.24k      446.46k      445.95k
openssl-c-long
AES-128-CTR       1031.51k     1183.18k     1227.24k     1234.12k     1224.70k     1219.39k
AES-192-CTR        855.02k     1022.78k     1054.11k     1064.96k     1047.53k     1049.17k
AES-256-CTR        808.36k      899.24k      923.21k      929.79k      921.60k      915.67k
openssl-long
AES-128-CTR       1143.41k     1341.06k     1396.02k     1410.05k     1393.46k     1386.09k
AES-192-CTR        991.90k     1146.37k     1187.28k     1198.80k     1186.65k     1185.02k
AES-256-CTR        894.73k     1003.99k     1036.47k     1046.32k     1035.25k     1034.43k
openssl-zkn-long
AES-128-CTR       3539.68k     6013.04k     7343.36k     7816.81k     7778.30k     7430.14k
AES-192-CTR       3394.59k     5618.19k     6758.02k     7123.76k     7091.81k     6807.55k
AES-256-CTR       3264.01k     5258.12k     6259.99k     6582.89k     6509.36k     6281.63k

Start ofb for 10 sec
openssl-c
AES-128-OFB        933.25k     1054.32k     1083.62k     1093.73k     1083.54k     1076.99k
AES-192-OFB        835.99k      921.69k      946.48k      948.94k      943.72k      938.80k
AES-256-OFB        748.26k      821.78k      842.70k      847.26k      842.14k      841.30k
openssl
AES-128-OFB        335.11k      352.93k      357.09k      358.30k      359.63k      359.73k
AES-192-OFB        321.69k      336.67k      342.12k      343.24k      344.19k      344.67k
AES-256-OFB        309.83k      324.49k      326.94k      329.11k      330.30k      330.63k
openssl-zkn
AES-128-OFB        428.27k      456.04k      463.64k      463.20k      466.48k      465.55k
AES-192-OFB        426.57k      454.73k      461.82k      463.46k      461.92k      463.91k
AES-256-OFB        424.66k      451.69k      459.39k      461.00k      462.03k      459.93k
openssl-c-long
AES-128-OFB       1080.38k     1239.92k     1293.34k     1306.83k     1288.95k     1284.86k
AES-192-OFB        951.28k     1067.15k     1101.47k     1114.73k     1100.19k     1096.63k
AES-256-OFB        843.31k      932.00k      959.54k      961.74k      957.51k      955.87k
openssl-long
AES-128-OFB       1218.63k     1423.60k     1487.08k     1499.65k     1476.36k     1476.20k
AES-192-OFB       1058.32k     1210.99k     1256.73k     1267.30k     1252.56k     1245.58k
AES-256-OFB        930.97k     1050.28k     1084.03k     1092.40k     1083.54k     1081.90k
openssl-zkn-long
AES-128-OFB       4043.26k     7918.44k    10622.59k    11603.25k    11597.29k    10959.26k
AES-192-OFB       3818.86k     7191.71k     9411.84k    10194.23k    10180.20k     9641.98k
AES-256-OFB       3638.45k     6618.84k     8457.11k     9073.97k     9047.24k     8619.62k

Start ccm for 10 sec
openssl-c
AES-128-CCM        280.87k      481.67k      583.42k      618.80k      629.15k      629.52k
AES-192-CCM        243.57k      414.16k      500.86k      527.26k      537.68k      539.03k
AES-256-CCM        216.20k      363.78k      438.17k      461.82k      468.93k      471.39k
openssl
AES-128-CCM        314.89k      548.09k      669.95k      709.63k      719.36k      718.54k
AES-192-CCM        268.76k      463.99k      561.46k      599.65k      604.78k      606.63k
AES-256-CCM        240.22k      405.41k      490.29k      518.35k      530.31k      530.84k
openssl-zkn
AES-128-CCM       1272.38k     2708.22k     3956.98k     4459.62k     4521.16k     4412.21k
AES-192-CCM       1170.62k     2495.08k     3565.41k     4009.57k     4068.15k     3979.67k
AES-256-CCM       1080.16k     2328.54k     3304.96k     3641.45k     3699.51k     3632.33k
openssl-c-long
AES-128-CCM        280.22k      480.86k      585.04k      619.01k      626.06k      629.15k
AES-192-CCM        244.12k      413.47k      500.04k      528.18k      536.86k      535.79k
AES-256-CCM        216.67k      364.13k      438.68k      462.75k      471.04k      471.39k
openssl-long
AES-128-CCM        313.06k      546.78k      669.29k      709.32k      719.36k      720.18k
AES-192-CCM        272.32k      466.36k      568.32k      601.70k      611.33k      612.15k
AES-256-CCM        240.39k      409.78k      493.77k      522.96k      531.66k      532.52k
openssl-zkn-long
AES-128-CCM       1254.85k     2715.69k     3946.78k     4454.50k     4520.35k     4402.38k
AES-192-CCM       1145.90k     2511.84k     3589.07k     4007.22k     4067.33k     3978.04k
AES-256-CCM       1120.97k     2337.06k     3289.73k     3664.79k     3692.13k     3624.14k

Start ocb for 10 sec
openssl-c
AES-128-OCB        935.72k     1085.10k     1133.75k     1153.95k     1138.69k     1129.87k
AES-192-OCB        843.86k      954.92k      985.47k      998.60k      987.96k      982.06k
AES-256-OCB        741.44k      840.65k      869.38k      878.90k      869.99k      869.12k
openssl
AES-128-OCB       1055.92k     1225.58k     1297.38k     1311.33k     1293.04k     1287.78k
AES-192-OCB        922.50k     1065.40k     1107.35k     1117.80k     1106.45k     1099.37k
AES-256-OCB        840.30k      951.62k      983.45k      987.03k      979.60k      977.15k
openssl-zkn
AES-128-OCB       1045.38k     1236.35k     1295.41k     1307.85k     1287.78k     1286.50k
AES-192-OCB        921.92k     1073.06k     1115.26k     1124.86k     1112.47k     1108.09k
AES-256-OCB        827.63k      947.19k      981.38k      987.96k      977.97k      975.51k
openssl-c-long
AES-128-OCB        955.19k     1083.24k     1142.37k     1158.35k     1143.28k     1140.33k
AES-192-OCB        825.89k      952.23k      983.78k      999.32k      987.79k      986.97k
AES-256-OCB        743.88k      844.46k      873.09k      879.51k      874.03k      872.39k
openssl-long
AES-128-OCB       1042.53k     1233.90k     1289.57k     1309.39k     1288.95k     1289.77k
AES-192-OCB        941.78k     1074.24k     1118.08k     1127.73k     1108.91k     1112.47k
AES-256-OCB        828.79k      941.50k      979.66k      988.06k      978.78k      974.54k
openssl-zkn-long
AES-128-OCB       1048.37k     1235.07k     1295.69k     1309.29k     1288.95k     1288.13k
AES-192-OCB        924.58k     1072.08k     1117.97k     1127.32k     1111.36k     1110.84k
AES-256-OCB        830.07k      933.97k      979.28k      986.93k      976.49k      977.15k

Start gcm for 10 sec
openssl-c
AES-128-GCM        664.71k      784.25k      813.77k      821.86k      818.38k      818.38k
AES-192-GCM        608.99k      704.83k      733.72k      738.00k      735.73k      736.54k
AES-256-GCM        567.70k      645.96k      666.88k      671.85k      667.95k      671.74k
openssl
AES-128-GCM        763.35k      873.20k      907.42k      914.33k      905.13k      903.49k
AES-192-GCM        697.90k      788.95k      816.41k      820.53k      814.29k      814.29k
AES-256-GCM        638.22k      717.62k      738.84k      744.65k      739.74k      739.82k
openssl-zkn
AES-128-GCM       1442.82k     1803.65k     1947.67k     1978.16k     1956.25k     1943.14k
AES-192-GCM       1375.95k     1754.18k     1892.28k     1931.98k     1901.10k     1893.74k
AES-256-GCM       1394.31k     1735.58k     1859.97k     1882.83k     1865.91k     1854.67k
openssl-c-long
AES-128-GCM        683.28k      783.48k      815.08k      821.96k      815.10k      818.38k
AES-192-GCM        612.68k      707.61k      735.16k      740.86k      736.54k      735.64k
AES-256-GCM        568.43k      645.40k      666.73k      672.26k      669.29k      670.40k
openssl-long
AES-128-GCM        776.14k      868.97k      909.63k      913.92k      908.49k      906.77k
AES-192-GCM        704.64k      791.16k      816.67k      820.84k      815.93k      815.92k
AES-256-GCM        647.10k      713.84k      741.38k      744.73k      741.45k      741.45k
openssl-zkn-long
AES-128-GCM       1406.97k     1784.51k     1933.95k     1974.27k     1953.48k     1941.50k
AES-192-GCM       1400.78k     1765.06k     1903.54k     1926.96k     1911.19k     1898.91k
AES-256-GCM       1376.80k     1728.02k     1857.95k     1887.44k     1863.68k     1856.31k

Start xts for 10 sec
openssl-c
AES-128-XTS        565.92k      908.86k     1116.48k     1170.84k     1178.01k     1170.75k
AES-256-XTS        421.84k      711.16k      846.85k      885.45k      893.67k      893.67k
openssl
AES-128-XTS        609.14k     1040.01k     1266.28k     1336.22k     1339.39k     1335.30k
AES-256-XTS        485.35k      798.58k      956.47k     1000.14k     1004.97k     1004.34k
openssl-zkn
AES-128-XTS        629.96k     1054.50k     1264.36k     1333.86k     1336.93k     1329.05k
AES-256-XTS        480.54k      797.36k      953.04k      998.81k     1003.34k      999.06k
openssl-c-long
AES-128-XTS        550.98k      923.25k     1115.44k     1171.56k     1178.83k     1175.66k
AES-256-XTS        435.18k      714.35k      847.87k      889.14k      895.39k      894.57k
openssl-long
AES-128-XTS        640.08k     1059.38k     1273.45k     1338.57k     1339.39k     1335.30k
AES-256-XTS        487.40k      802.76k      954.44k     1002.29k     1006.80k     1004.34k
openssl-zkn-long
AES-128-XTS        626.25k     1047.43k     1260.31k     1334.07k     1335.30k     1332.02k
AES-256-XTS        476.90k      791.07k      948.53k      996.86k     1003.52k     1001.70k

Start gcm for 10 (with zbb/zbc)
openssl-c
AES-128-GCM        667.19k      785.46k      814.28k      822.37k      815.93k      819.20k
AES-192-GCM        607.93k      707.30k      732.03k      740.97k      737.36k      735.07k
AES-256-GCM        559.91k      645.12k      665.42k      672.26k      670.11k      671.07k
openssl
AES-128-GCM       1052.32k     1272.72k     1344.67k     1362.02k     1346.76k     1342.15k
AES-192-GCM        932.96k     1097.25k     1151.80k     1164.60k     1153.10k     1150.64k
AES-256-GCM        835.52k      955.69k     1004.77k     1018.98k     1009.88k     1009.25k
openssl-zkn
AES-128-GCM       2960.71k     5031.76k     6198.81k     6546.94k     6546.23k     6316.03k
AES-192-GCM       2797.85k     4689.20k     5726.41k     6083.17k     6033.41k     5848.16k
AES-256-GCM       2769.17k     4486.93k     5413.15k     5687.09k     5652.48k     5449.32k
openssl-c-long
AES-128-GCM        669.84k      781.90k      812.90k      820.33k      816.74k      818.38k
AES-192-GCM        611.55k      706.17k      733.13k      739.23k      736.54k      737.28k
AES-256-GCM        576.21k      642.48k      666.09k      671.44k      669.44k      670.40k
openssl-long
AES-128-GCM       1079.88k     1266.68k     1345.97k     1360.90k     1345.95k     1343.49k
AES-192-GCM        953.07k     1106.26k     1156.02k     1163.67k     1153.43k     1150.64k
AES-256-GCM        853.04k      971.86k     1010.25k     1014.89k     1010.07k     1009.25k
openssl-zkn-long
AES-128-GCM       2887.28k     4978.92k     6201.06k     6607.36k     6547.05k     6325.86k
AES-192-GCM       2796.31k     4693.54k     5778.89k     6122.80k     6082.56k     5845.81k
AES-256-GCM       2710.93k     4444.12k     5408.33k     5709.82k     5663.95k     5477.17k

New ocb/xts

It turned out that ocb/xts were not using the accelerated version. Here is the new benchmark:

Start ocb for 3 sec
./openssl-c-long
AES-128-OCB        960.79k     1100.31k     1145.66k     1160.53k     1145.24k     1141.42k
AES-192-OCB        845.51k      956.76k      986.97k      999.00k      988.50k      988.50k
AES-256-OCB        758.89k      847.94k      873.05k      881.87k      876.74k      870.91k
./openssl-xts
AES-128-OCB       2965.20k     4466.82k     5255.84k     5495.69k     5409.45k     5254.94k
AES-192-OCB       2842.80k     4343.17k     4956.50k     5152.88k     5101.51k     4947.97k
AES-256-OCB       2705.55k     4054.39k     4676.15k     4858.16k     4791.91k     4674.10k

Start xts for 3
./openssl-c-long
AES-128-XTS        549.68k      916.46k     1110.90k     1169.89k     1180.85k     1174.19k
AES-256-XTS        420.04k      670.67k      817.21k      888.15k      896.17k      895.66k
./openssl-xts
AES-128-XTS       2752.42k     4733.33k     5862.83k     6278.46k     6283.26k     6120.71k
AES-256-XTS       2432.03k     4171.88k     5107.63k     5469.18k     5490.56k     5297.49k
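
To read the gain off the table above, the ratio of two throughput cells can be computed directly. The following is a minimal sketch, not part of the original benchmark: `speedup` is a hypothetical helper name, and it assumes the last column is the 16384-byte block size that `openssl speed` uses by default.

```shell
#!/bin/bash
# speedup BASE NEW: print NEW/BASE to two decimals.
# Both arguments are table cells such as "1174.19k"; the trailing "k" is stripped.
# (hypothetical helper, not part of the original benchmark script)
speedup() {
    awk -v base="${1%k}" -v new="${2%k}" 'BEGIN { printf "%.2f\n", new / base }'
}

# AES-128-XTS, last column: openssl-c-long vs openssl-xts
speedup 1174.19k 6120.71k   # prints 5.21
# AES-256-XTS, last column: openssl-c-long vs openssl-xts
speedup 895.66k 5297.49k    # prints 5.91
```

So with the accelerated path actually in use, XTS on large blocks runs roughly 5 to 6 times faster than the pure C build on this core.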
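#!/bin/bash
# Benchmark every AES mode across several OpenSSL builds using `openssl speed`.
SEC=10
LIST="openssl-c openssl openssl-zkn openssl-c-long openssl-long openssl-zkn-long"

# openssl_one <binary> <key bits> <mode>: run one benchmark and keep only the result row
openssl_one() {
    local openssl=$1 bits=$2 mode=$3
    "$openssl" speed -evp "aes-$bits-$mode" -elapsed -seconds "$SEC" 2>/dev/null | grep -i aes | grep k
}

# openssl_all <binary> <mode>: run the three key sizes for one mode
openssl_all() {
    local openssl=$1 mode=$2
    openssl_one "$openssl" 128 "$mode"
    openssl_one "$openssl" 192 "$mode"
    openssl_one "$openssl" 256 "$mode"
}

# mode_one <mode>: run one mode against every binary in LIST
mode_one() {
    local mode=$1 o
    echo "Start $mode for $SEC sec"
    for o in $LIST; do
        echo "$o"
        openssl_all "./$o" "$mode"
    done
    echo
}

# First pass: no capability override
unset OPENSSL_riscvcap
for m in cbc ecb cfb ctr ofb ccm ocb gcm; do
    mode_one "$m"
done

# xts has no 192-bit variant, so it is handled separately
echo "Start xts for $SEC"
for o in $LIST; do
    echo "$o"
    openssl_one "./$o" 128 xts
    openssl_one "./$o" 256 xts
done

# gcm again, with zbb/zbc enabled for GHASH
export OPENSSL_riscvcap="rv64gc_zbb_zbc"
mode_one gcm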
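
The script prints raw result rows only. If a one-number summary per cipher is wanted, a small filter like the one below could be piped after it. This is a sketch under assumptions: `peak_of` is a hypothetical name, and it expects rows in the `AES-... <n>k <n>k ...` shape shown in the results above.

```shell
#!/bin/bash
# peak_of: read `openssl speed` result rows on stdin and print
# "<cipher> <largest throughput in the row>" for each row starting with "AES-".
# (hypothetical helper, not part of the original script)
peak_of() {
    awk '/^AES-/ {
        max = 0
        for (i = 2; i <= NF; i++) {
            v = $i
            sub(/k$/, "", v)          # drop the "k" suffix
            if (v + 0 > max) max = v + 0
        }
        printf "%s %.2f\n", $1, max
    }'
}

# Example with one row taken from the xts results
printf '%s\n' 'AES-128-XTS       2752.42k     4733.33k     5862.83k     6278.46k     6283.26k     6120.71k' | peak_of
# prints: AES-128-XTS 6283.26
```

Lines that do not start with `AES-` (the binary names and the "Start ..." banners) are ignored, so the script's whole output can be fed through unchanged.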

Benchmark of OpenSSL AES for RISC-V 32

This is evaluated on an FPGA built from the RTL in chipsalliance/rocket-chip#2950.

The openssl32-asm-all binary is the result of merging openssl/openssl#17640, openssl/openssl#18197, and openssl/openssl#18308.

Note that there is no aes-gcm with accelerated clmul, as PR 17640 only provides the 64-bit implementation.

Start cbc for 10 sec
./openssl32-c
AES-128-CBC       1010.09k     1259.94k     1339.06k     1361.82k     1344.31k     1335.60k
AES-192-CBC        897.41k     1092.07k     1150.16k     1165.52k     1153.43k     1145.73k
AES-256-CBC        807.06k      951.93k     1000.63k     1013.04k     1004.34k     1001.06k
./openssl32-asm-all
AES-128-CBC       2505.50k     4124.27k     4933.04k     5185.13k     5153.59k     4952.88k
AES-192-CBC       2335.56k     3657.98k     4355.66k     4569.80k     4534.27k     4376.17k
AES-256-CBC       2182.84k     3321.75k     3852.31k     4018.48k     3978.85k     3861.71k

Start ecb for 10 sec
./openssl32-c
AES-128-ECB       1212.59k     1335.07k     1387.39k     1400.12k     1377.08k     1365.06k
AES-192-ECB       1021.90k     1147.13k     1185.25k     1195.01k     1176.01k     1167.01k
AES-256-ECB        906.99k     1003.08k     1030.40k     1036.80k     1023.18k     1016.43k
./openssl32-asm-all
AES-128-ECB       3523.06k     4914.89k     5497.55k     5662.92k     5586.27k     5341.18k
AES-192-ECB       3186.91k     4342.96k     4808.14k     4937.83k     4876.70k     4666.16k
AES-256-ECB       2944.69k     3852.46k     4204.16k     4302.64k     4247.55k     4085.36k

Start cfb for 10 sec
./openssl32-c
AES-128-CFB       1122.53k     1290.65k     1350.48k     1366.02k     1344.31k     1333.96k
AES-192-CFB        981.39k     1112.88k     1157.40k     1168.79k     1152.28k     1145.73k
AES-256-CFB        859.27k      978.66k     1009.79k     1017.75k     1004.97k     1000.06k
./openssl32-asm-all
AES-128-CFB       3145.66k     4472.99k     5028.89k     5189.22k     5126.55k     4916.84k
AES-192-CFB       2914.64k     3967.08k     4442.57k     4578.00k     4521.16k     4349.95k
AES-256-CFB       2645.67k     3540.03k     3912.19k     4015.82k     3972.30k     3842.05k

Start ctr for 10 sec
./openssl32-c
AES-128-CTR       1071.18k     1225.10k     1280.03k     1293.93k     1273.86k     1264.84k
AES-192-CTR        941.24k     1064.40k     1105.33k     1115.65k     1099.09k     1094.45k
AES-256-CTR        845.76k      941.59k      970.50k      977.51k      965.84k      960.10k
./openssl32-asm-all
AES-128-CTR       2808.25k     3793.83k     4184.55k     4293.32k     4239.36k     4089.45k
AES-192-CTR       2615.76k     3418.01k     3765.48k     3859.97k     3812.56k     3694.59k
AES-256-CTR       2405.34k     3102.07k     3384.52k     3461.73k     3416.06k     3320.99k

Start ofb for 10 sec
./openssl32-c
AES-128-OFB       1135.83k     1306.16k     1360.61k     1375.74k     1349.22k     1340.21k
AES-192-OFB        988.81k     1125.99k     1167.21k     1177.80k     1155.89k     1150.16k
AES-256-OFB        878.44k      985.40k     1015.60k     1023.28k     1006.61k     1003.34k
./openssl32-asm-all
AES-128-OFB       3249.57k     4511.85k     5096.68k     5265.10k     5203.56k     4990.49k
AES-192-OFB       2907.45k     4030.89k     4503.81k     4638.11k     4579.33k     4410.57k
AES-256-OFB       2696.21k     3597.01k     3970.30k     4072.14k     4014.90k     3889.56k

Start ccm for 10 sec
./openssl32-c
AES-128-CCM        295.85k      507.05k      617.16k      653.31k      660.43k      659.93k
AES-192-CCM        257.83k      437.79k      530.12k      559.72k      567.14k      566.82k
AES-256-CCM        228.78k      384.97k      465.61k      490.60k      497.58k      497.58k
./openssl32-asm-all
AES-128-CCM        866.68k     1670.21k     2193.79k     2377.83k     2400.26k     2367.49k
AES-192-CCM        787.08k     1484.47k     1915.01k     2063.16k     2073.40k     2052.92k
AES-256-CCM        717.65k     1331.46k     1694.75k     1818.42k     1828.26k     1813.53k

Start ocb for 10 sec
./openssl32-c
AES-128-OCB        928.88k     1062.34k     1107.00k     1117.59k     1099.37k     1094.99k
AES-192-OCB        831.76k      938.95k      973.13k      981.30k      968.29k      965.02k
AES-256-OCB        761.80k      840.92k      867.97k      875.11k      863.44k      860.08k
./openssl32-asm-all
AES-128-OCB        914.65k     1056.29k     1098.44k     1108.79k     1090.36k     1085.17k
AES-192-OCB        832.55k      936.12k      968.96k      976.90k      964.20k      960.78k
AES-256-OCB        753.73k      838.03k      863.64k      869.89k      859.34k      855.17k

Start gcm for 10 sec
./openssl32-c
AES-128-GCM        604.17k      680.86k      704.03k      710.45k      706.26k      706.15k
AES-192-GCM        561.64k      630.00k      647.78k      653.00k      649.63k      649.15k
AES-256-GCM        525.05k      583.14k      599.78k      603.24k      599.87k      600.09k
./openssl32-asm-all
AES-128-GCM        921.52k     1086.25k     1138.61k     1152.92k     1150.16k     1144.10k
AES-192-GCM        895.61k     1053.69k     1104.79k     1118.52k     1116.57k     1110.25k
AES-256-GCM        873.06k     1022.98k     1070.62k     1083.80k     1079.71k     1075.35k

Start xts for 10
./openssl32-c
AES-128-XTS        567.60k      916.10k     1089.51k     1146.16k     1141.15k     1135.91k
AES-256-XTS        438.90k      712.86k      846.23k      888.73k      887.94k      886.37k
./openssl32-asm-all
AES-128-XTS        567.16k      919.17k     1091.61k     1145.04k     1138.69k     1133.14k
AES-256-XTS        443.25k      717.45k      848.20k      889.14k      888.83k      885.49k

New ocb/xts

It turned out that ocb/xts were not using the accelerated version. Here is the new benchmark:

Start ocb for 3 sec
./openssl32-c
AES-128-OCB        929.40k     1055.04k     1106.60k     1117.87k     1096.80k     1094.08k
AES-192-OCB        828.43k      939.35k      973.06k      982.36k      966.17k      963.44k
AES-256-OCB        755.99k      840.60k      868.10k      874.84k      862.89k      860.02k
./openssl32-asm-xts
AES-128-OCB       2023.50k     2560.64k     2796.97k     2860.71k     2823.51k     2763.43k
AES-192-OCB       1851.94k     2384.04k     2578.18k     2638.85k     2599.59k     2555.90k
AES-256-OCB       1738.55k     2225.17k     2416.98k     2465.79k     2435.75k     2392.06k

Start xts for 3
./openssl32-c
AES-128-XTS        558.96k      917.38k     1089.62k     1146.54k     1137.63k     1135.96k
AES-256-XTS        440.86k      706.56k      846.42k      888.49k      887.24k      887.24k
./openssl32-asm-xts
AES-128-XTS       1544.80k     2507.90k     2967.04k     3110.57k     3104.77k     3031.04k
AES-256-XTS       1302.23k     2119.98k     2518.02k     2641.92k     2637.82k     2588.67k