wolfSSL on RISC-V Benchmarks (HiFive Unleashed)

We are excited to share the latest benchmark results of wolfSSL v5.7.0 running on the HiFive Unleashed at 1.4GHz. We implemented AES for ECB, CBC, CTR, GCM, and CCM using assembly for RISC-V. This benchmark demonstrates the performance capabilities of wolfSSL on RISC-V architecture, highlighting our commitment to providing high-performance, lightweight, and secure SSL/TLS solutions across diverse platforms.

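In the runs below, the build with the new assembly optimizations is roughly 47x to 50x faster for AES-CBC and roughly 30x to 35x faster for AES-GCM than the build without them. GMAC throughput is essentially unchanged.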

With RISC-V assembly optimizations:

./configure --enable-riscv-asm && make

root@HiFiveU:~/wolfssl-riscv# ./wolfcrypt/benchmark/benchmark -aes-cbc -aes-gcm
------------------------------------------------------------------------------
 wolfSSL version 5.7.0
------------------------------------------------------------------------------
Math:   Multi-Precision: Wolf(SP) word-size=64 bits=3072 sp_int.c
wolfCrypt Benchmark (block bytes 1048576, min 1.0 sec each)
AES-128-CBC-enc             20 MiB took 1.076 seconds,   18.588 MiB/s
AES-128-CBC-dec             20 MiB took 1.083 seconds,   18.473 MiB/s
AES-192-CBC-enc             20 MiB took 1.245 seconds,   16.062 MiB/s
AES-192-CBC-dec             20 MiB took 1.246 seconds,   16.047 MiB/s
AES-256-CBC-enc             15 MiB took 1.057 seconds,   14.189 MiB/s
AES-256-CBC-dec             15 MiB took 1.055 seconds,   14.212 MiB/s
AES-128-GCM-enc             15 MiB took 1.300 seconds,   11.543 MiB/s
AES-128-GCM-dec             15 MiB took 1.300 seconds,   11.535 MiB/s
AES-192-GCM-enc             15 MiB took 1.425 seconds,   10.526 MiB/s
AES-192-GCM-dec             15 MiB took 1.425 seconds,   10.523 MiB/s
AES-256-GCM-enc             10 MiB took 1.032 seconds,    9.687 MiB/s
AES-256-GCM-dec             10 MiB took 1.032 seconds,    9.691 MiB/s
GMAC Table 4-bit            31 MiB took 1.025 seconds,   30.251 MiB/s
Benchmark complete

Without RISC-V assembly optimizations:

./configure --enable-all && make

root@HiFiveU:~/wolfssl# ./wolfcrypt/benchmark/benchmark -aes-cbc -aes-gcm
------------------------------------------------------------------------------
 wolfSSL version 5.7.0
------------------------------------------------------------------------------
Math:   Multi-Precision: Wolf(SP) word-size=64 bits=4096 sp_int.c
wolfCrypt Benchmark (block bytes 1048576, min 1.0 sec each)
AES-128-CBC-enc              5 MiB took 12.798 seconds,    0.391 MiB/s
AES-128-CBC-dec              5 MiB took 12.672 seconds,    0.395 MiB/s
AES-192-CBC-enc              5 MiB took 15.301 seconds,    0.327 MiB/s
AES-192-CBC-dec              5 MiB took 15.181 seconds,    0.329 MiB/s
AES-256-CBC-enc              5 MiB took 17.820 seconds,    0.281 MiB/s
AES-256-CBC-dec              5 MiB took 17.669 seconds,    0.283 MiB/s
AES-128-GCM-enc              5 MiB took 12.870 seconds,    0.388 MiB/s
AES-128-GCM-dec              5 MiB took 12.870 seconds,    0.388 MiB/s
AES-192-GCM-enc              5 MiB took 15.375 seconds,    0.325 MiB/s
AES-192-GCM-dec              5 MiB took 15.376 seconds,    0.325 MiB/s
AES-256-GCM-enc              5 MiB took 17.878 seconds,    0.280 MiB/s
AES-256-GCM-dec              5 MiB took 17.896 seconds,    0.279 MiB/s
AES-128-GCM-STREAM-enc       5 MiB took 12.878 seconds,    0.388 MiB/s
AES-128-GCM-STREAM-dec       5 MiB took 12.878 seconds,    0.388 MiB/s
AES-192-GCM-STREAM-enc       5 MiB took 15.379 seconds,    0.325 MiB/s
AES-192-GCM-STREAM-dec       5 MiB took 15.385 seconds,    0.325 MiB/s
AES-256-GCM-STREAM-enc       5 MiB took 17.881 seconds,    0.280 MiB/s
AES-256-GCM-STREAM-dec       5 MiB took 17.888 seconds,    0.280 MiB/s
GMAC Table 4-bit            30 MiB took 1.006 seconds,   29.831 MiB/s
Benchmark complete
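
The two runs above can be compared line by line. The following is a minimal Python sketch, not part of wolfSSL, that parses the throughput figures out of benchmark output and computes the speedup per algorithm. The names parse_throughput and speedups are illustrative, and only three lines from each run are included as sample input.

```python
import re

# Matches benchmark lines such as:
#   "AES-128-CBC-enc   20 MiB took 1.076 seconds,   18.588 MiB/s"
LINE_RE = re.compile(r"^(.+?)\s+\d+ MiB took [\d.]+ seconds,\s+([\d.]+) MiB/s")

# Sample lines copied from the two runs above.
ASM_LOG = """\
AES-128-CBC-enc             20 MiB took 1.076 seconds,   18.588 MiB/s
AES-256-GCM-enc             10 MiB took 1.032 seconds,    9.687 MiB/s
GMAC Table 4-bit            31 MiB took 1.025 seconds,   30.251 MiB/s
"""
PLAIN_LOG = """\
AES-128-CBC-enc              5 MiB took 12.798 seconds,    0.391 MiB/s
AES-256-GCM-enc              5 MiB took 17.878 seconds,    0.280 MiB/s
GMAC Table 4-bit            30 MiB took 1.006 seconds,   29.831 MiB/s
"""


def parse_throughput(log):
    """Map each algorithm name to its reported MiB/s figure."""
    results = {}
    for line in log.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            results[match.group(1)] = float(match.group(2))
    return results


def speedups(fast, slow):
    """Ratio fast/slow for every algorithm present in both runs."""
    return {name: fast[name] / slow[name] for name in fast if name in slow}


ratios = speedups(parse_throughput(ASM_LOG), parse_throughput(PLAIN_LOG))
for name, ratio in ratios.items():
    print(f"{name:<20} {ratio:6.1f}x")
```

For the three sample lines this gives about 47.5x for AES-128-CBC-enc, about 34.6x for AES-256-GCM-enc, and about 1.0x for GMAC. Pasting the full logs into the two strings extends the comparison to every algorithm that appears in both runs.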

If you have questions about any of the above, please contact us at facts@wolfSSL.com or +1 425 245 8247.

Download wolfSSL Now

wolfCrypt implementations of LMS/HSS and XMSS/XMSS^MT signatures: build options and benchmarks (Intel x86)

At wolfSSL we’re excited about stateful hash-based signature schemes and CNSA 2.0, and we just had a webinar on this subject. If you recall, we previously added initial support for LMS/HSS and XMSS/XMSS^MT through external integration with the hash-sigs and xmss-reference implementations.

Recently, however, we completed our own wolfCrypt implementations of these algorithms, and would like to share benchmarking results and some of the build options available. Generally, the wolfCrypt implementations of these signature methods are faster, with more options available to tune build size and performance.

With that said, we’ll review some of the more relevant build options and benchmarking data for LMS/HSS and XMSS/XMSS^MT. These benchmarks were obtained on a Fedora 38 workstation with an Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz. Only a single core was used. wolfSSL was built with --enable-intelasm to utilize assembly speedups for all tests. Note: LMS/HSS and XMSS/XMSS^MT support a very wide range of parameters. For the sake of conciseness only a targeted range is benchmarked here.

LMS build options and benchmarking

The five main defines that customize the wolfCrypt LMS/HSS build are the following:

  • WOLFSSL_LMS_LARGE_CACHES
  • WOLFSSL_WC_LMS_SMALL
  • WOLFSSL_LMS_MAX_LEVELS=N
  • WOLFSSL_LMS_MAX_HEIGHT=H
  • WOLFSSL_LMS_VERIFY_ONLY

The define WOLFSSL_LMS_LARGE_CACHES will cache more of the authentication path into memory, speeding up signing operations for larger height trees.

The define WOLFSSL_WC_LMS_SMALL reduces code size and memory use overall, with the tradeoff of much slower signing operations. However, the performance impact for verification is negligible.

The defines WOLFSSL_LMS_MAX_LEVELS and WOLFSSL_LMS_MAX_HEIGHT set compile-time limits on the size of the LMS/HSS hypertree, and mainly reduce code footprint without impacting performance. These can be used to slim the build size if you are only interested in a specific parameter set range. More specifically, WOLFSSL_LMS_MAX_LEVELS sets the max allowed levels in HSS (the number of trees in the hypertree), while WOLFSSL_LMS_MAX_HEIGHT sets the max allowed height per tree for both LMS and HSS.

The define WOLFSSL_LMS_VERIFY_ONLY restricts the build to a smaller verify-only subset (LMS API and data structures needed for keygen/signing are omitted). This does not impact verify performance, and is intended for embedded targets that need verify-only functionality (e.g. wolfBoot). WOLFSSL_LMS_VERIFY_ONLY can be combined with WOLFSSL_WC_LMS_SMALL, WOLFSSL_LMS_MAX_LEVELS, and WOLFSSL_LMS_MAX_HEIGHT for further footprint reduction.

In Table 1 we show benchmarking results (obtained with ./wolfcrypt/benchmark/benchmark -lms_hss) for these different build options, with the external LMS/HSS implementation provided for comparison.

In general we see the default wolfCrypt LMS/HSS performance (wc_lms) is much faster than the external integration (ext_lms) for all categories of operation (keygen, signing, verifying). The WOLFSSL_LMS_LARGE_CACHES (wc_lms large) option speeds up signing operations for larger height trees, but otherwise does not impact performance. The small variations in verify speed across wc_lms, wc_lms large, and wc_lms small are likely just system noise and do not represent a systematic trend. The WOLFSSL_WC_LMS_SMALL option (wc_lms small) significantly reduces signing speed, but leaves verification speed basically unchanged, making this option attractive for verify-only applications in embedded systems.


Table 1: Comparison of wolfCrypt LMS/HSS (wc_lms), wolfCrypt LMS/HSS with WOLFSSL_LMS_LARGE_CACHES (wc_lms large), wolfCrypt LMS/HSS with WOLFSSL_WC_LMS_SMALL (wc_lms small), and the external integration implementation (ext_lms). All values in units of ops/sec.

Parameters Operation     wc_lms  wc_lms large  wc_lms small    ext_lms
L2_H10_W2  keygen         6.482         6.494        12.828      1.330
L2_H10_W2  sign        4437.469      5521.796         6.526    786.083
L2_H10_W2  verify    13954.450     14087.794     13874.450   4789.383
L2_H10_W4  keygen         3.567         3.592         6.954      0.764
L2_H10_W4  sign        2452.361      3052.326         3.562    443.225
L2_H10_W4  verify     6482.891      6707.271      6962.215   2281.440
L3_H5_W4   keygen        70.926        73.673       227.376     17.467
L3_H5_W4   sign        4660.370      4669.019        74.653    820.640
L3_H5_W4   verify     4632.118      4670.963      4790.742   1756.355
L3_H5_W8   keygen         9.395         9.413        29.041      2.265
L3_H5_W8   sign         609.408       605.199         9.542    106.059
L3_H5_W8   verify      561.759       554.635       573.341    214.093
L3_H10_W4  keygen         2.384         2.368         7.128      0.569
L3_H10_W4  sign        2459.698      3067.848         2.376    444.601
L3_H10_W4  verify     4895.203      4345.130      4793.853   1618.676
L4_H5_W8   keygen         7.045         7.017        29.258      1.770
L4_H5_W8   sign         608.915       607.318         7.168    106.881
L4_H5_W8   verify      446.384       443.804       438.542    145.672

Graph 1: Signing speeds for wolfCrypt LMS/HSS (wc_lms), wolfCrypt LMS/HSS with WOLFSSL_LMS_LARGE_CACHES (wc_lms large), and the external integration implementation (ext_lms). All values in units of ops/sec.
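
The effect of WOLFSSL_LMS_LARGE_CACHES on signing can be checked against the Table 1 figures. The sketch below is illustrative only: it copies the signing rows for wc_lms and wc_lms large, and the helper names tree_height and large_cache_gain are made up for this example.

```python
# Signing throughput (ops/sec) copied from Table 1: (wc_lms, wc_lms large).
SIGN_OPS = {
    "L2_H10_W2": (4437.469, 5521.796),
    "L2_H10_W4": (2452.361, 3052.326),
    "L3_H5_W4": (4660.370, 4669.019),
    "L3_H5_W8": (609.408, 605.199),
    "L3_H10_W4": (2459.698, 3067.848),
    "L4_H5_W8": (608.915, 607.318),
}


def tree_height(param_set):
    """Extract H from a parameter set name such as 'L2_H10_W2'."""
    return int(param_set.split("_")[1][1:])


def large_cache_gain(ops):
    """Percentage change in signing speed when large caches are enabled."""
    default, large = ops
    return (large / default - 1.0) * 100.0


for name, ops in SIGN_OPS.items():
    print(f"{name:<10} H={tree_height(name):<3} {large_cache_gain(ops):+6.1f}%")
```

With these numbers the height-10 parameter sets gain a little over 24% in signing speed, while the height-5 sets stay within 1% of the default build, which matches the description of the option as helping larger trees.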

XMSS build options and benchmarking

Three important defines that customize the wc_xmss build are:

  • WOLFSSL_WC_XMSS_SMALL
  • WOLFSSL_XMSS_MAX_HEIGHT=N
  • WOLFSSL_XMSS_VERIFY_ONLY

The define WOLFSSL_WC_XMSS_SMALL reduces code size and memory use overall, with the tradeoff of much slower signing operations and roughly 25-35% slower verification in the benchmarks below.

The define WOLFSSL_XMSS_MAX_HEIGHT=N sets compile time limits on the max height of the hypertree, and mainly reduces code size without impacting performance.

The define WOLFSSL_XMSS_VERIFY_ONLY restricts the build to a smaller verify-only subset, and can be combined with WOLFSSL_WC_XMSS_SMALL and WOLFSSL_XMSS_MAX_HEIGHT for further size reduction. It does not impact verify performance.

In Table 2 we show benchmarking results for XMSS/XMSS^MT for these options (obtained with ./wolfcrypt/benchmark/benchmark -xmss_xmssmt_sha256), with the external XMSS/XMSS^MT implementation for comparison. The default wolfCrypt XMSS/XMSS^MT (wc_xmss) is in general better than the external integration (ext_xmss), for all operations. There is a smaller difference between wc_xmss and ext_xmss as compared to wc_lms and ext_lms though, because ext_xmss can benefit from assembly speedups whereas ext_lms cannot. Similar to LMS, the WOLFSSL_WC_XMSS_SMALL option (wc_xmss small) significantly reduces signing performance, but verify speeds remain fast, making this a good option for embedded verify-only targets.

Table 2: Comparison of wolfCrypt XMSS/XMSS^MT (wc_xmss), wolfCrypt XMSS/XMSS^MT with WOLFSSL_WC_XMSS_SMALL (wc_xmss small), and the external integration implementation (ext_xmss). All values in units of ops/sec.

Parameters             Operation   wc_xmss  wc_xmss small  ext_xmss
XMSS-SHA2_10_256       keygen        1.587          1.079     0.943
XMSS-SHA2_10_256       sign        363.693          1.106   226.782
XMSS-SHA2_10_256       verify     3050.276       2044.995  1892.234
XMSSMT-SHA2_20/2_256   keygen        0.808          1.100     0.472
XMSSMT-SHA2_20/2_256   sign        298.138          0.551   191.214
XMSSMT-SHA2_20/2_256   verify     1307.295        982.836   852.348
XMSSMT-SHA2_20/4_256   keygen        9.880         35.274     7.309
XMSSMT-SHA2_20/4_256   sign        390.942          8.681   290.516
XMSSMT-SHA2_20/4_256   verify      729.433        517.298   443.444
XMSSMT-SHA2_40/4_256   keygen        0.406          1.107     0.237
XMSSMT-SHA2_40/4_256   sign        294.738          0.276   161.656
XMSSMT-SHA2_40/4_256   verify      750.591        487.257   424.986
XMSSMT-SHA2_40/8_256   keygen        5.604         35.318     3.755
XMSSMT-SHA2_40/8_256   sign        469.764          4.374   293.184
XMSSMT-SHA2_40/8_256   verify      361.289        262.160   225.254
XMSSMT-SHA2_60/6_256   keygen        0.266          1.099     0.159
XMSSMT-SHA2_60/6_256   sign        280.160          0.185   144.637
XMSSMT-SHA2_60/6_256   verify      521.610        352.718   295.882
XMSSMT-SHA2_60/12_256  keygen        4.143         35.280     2.505
XMSSMT-SHA2_60/12_256  sign        514.658          2.910   292.371
XMSSMT-SHA2_60/12_256  verify      247.682        170.459   152.471

Graph 2: Verify speeds for wolfCrypt XMSS/XMSS^MT (wc_xmss), wolfCrypt XMSS/XMSS^MT with WOLFSSL_WC_XMSS_SMALL (wc_xmss small), and the external integration implementation (ext_xmss). All values in units of ops/sec.
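
The verification cost of WOLFSSL_WC_XMSS_SMALL can be read off Table 2 directly. The following sketch is illustrative only: it copies the verify rows and uses a made-up helper, slowdown_percent, to show how much slower the small build verifies than the default build, and whether it still beats the external implementation.

```python
# Verify throughput (ops/sec) copied from Table 2:
# (wc_xmss, wc_xmss small, ext_xmss).
VERIFY_OPS = {
    "XMSS-SHA2_10_256": (3050.276, 2044.995, 1892.234),
    "XMSSMT-SHA2_20/2_256": (1307.295, 982.836, 852.348),
    "XMSSMT-SHA2_20/4_256": (729.433, 517.298, 443.444),
    "XMSSMT-SHA2_40/4_256": (750.591, 487.257, 424.986),
    "XMSSMT-SHA2_40/8_256": (361.289, 262.160, 225.254),
    "XMSSMT-SHA2_60/6_256": (521.610, 352.718, 295.882),
    "XMSSMT-SHA2_60/12_256": (247.682, 170.459, 152.471),
}


def slowdown_percent(baseline, other):
    """How much slower `other` is than `baseline`, in percent."""
    return (1.0 - other / baseline) * 100.0


for name, (default, small, external) in VERIFY_OPS.items():
    cost = slowdown_percent(default, small)
    still_ahead = small > external
    print(f"{name:<22} small build verify: {cost:4.1f}% slower, "
          f"faster than ext_xmss: {still_ahead}")
```

For the seven parameter sets benchmarked, the small build verifies between about 25% and 35% slower than the default build, yet it remains ahead of ext_xmss in every row.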

Conclusions

In the benchmarks above, our wolfCrypt implementations of LMS/HSS and XMSS/XMSS^MT are consistently faster than the external reference implementations, with speedups ranging from roughly 35% (some XMSS^MT keygen and signing cases) to more than 5x (LMS/HSS signing), depending on the combination of operation, algorithm, and parameters.

The small footprint build shows fast verification speeds for all parameters, making it an attractive choice for embedded verify-only applications (e.g. wolfBoot).

Overall, our LMS/HSS implementation is faster than XMSS/XMSS^MT (at least on x86), which is consistent with what is known about these two methods. However, which of the two is more appropriate for your use case will ultimately depend on other factors as well, such as signature size, target environment, and parameters used.

If you’re interested in learning more about our post-quantum work, or want to learn more about stateful hash-based signature schemes, contact us at wolfSSL by emailing facts@wolfSSL.com or calling us at +1 425 245 8247 to reach out to your regional wolfSSL business director.


wolfSSL on the Espressif ESP8266 – Better than ever!

It may not be as glamorous as the new ESP32 RISC-V chipsets with all the various hardware acceleration capabilities, but the ESP8266 is a well-established device with a large available codebase and an even larger user community.

Due to high customer demand, we’ve enhanced the wolfSSL libraries for the ESP8266. The recent changes have improved both the ESP-IDF CMake and traditional Makefile builds. They also make it possible to point the build at an existing wolfSSL source tree, as an alternative to using the setup script to copy everything locally.

For make, set the WOLFSSL_ROOT value in components/wolfssl/component.mk

For cmake, there are more options:

  • Set the WOLFSSL_ROOT value in components/wolfssl/CMakeLists.txt
  • Set the WOLFSSL_ROOT environment variable.
  • Place the project, with its components/wolfssl/CMakeLists.txt, in a subdirectory of the wolfSSL source tree.

When a project is in a subdirectory of wolfSSL, the cmake file will search parent directories, up to the root, looking for wolfSSL.
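
The parent-directory search can be pictured with a short sketch. This is a Python illustration of the idea only, not the logic in the wolfSSL CMake file: the function names are hypothetical, and the rule used to recognize a wolfSSL tree (the presence of wolfcrypt/ and wolfssl/ subdirectories) is an assumption made for the example.

```python
import tempfile
from pathlib import Path
from typing import Optional


def looks_like_wolfssl(directory: Path) -> bool:
    # Assumption for this sketch: a wolfSSL source tree is recognized by
    # its wolfcrypt/ and wolfssl/ subdirectories.
    return (directory / "wolfcrypt").is_dir() and (directory / "wolfssl").is_dir()


def find_wolfssl_root(start: Path) -> Optional[Path]:
    """Check `start`, then each parent directory up to the filesystem root."""
    start = start.resolve()
    for candidate in (start, *start.parents):
        if looks_like_wolfssl(candidate):
            return candidate
    return None


# Demo on a throwaway directory layout that mirrors the example project path.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve() / "wolfssl"
    (root / "wolfcrypt").mkdir(parents=True)
    (root / "wolfssl").mkdir()
    project = root / "IDE" / "Espressif" / "ESP-IDF" / "examples" / "wolfssl_client"
    project.mkdir(parents=True)
    print(find_wolfssl_root(project) == root)  # prints True
```

Searching upward from the project means an example kept inside the wolfSSL tree needs no WOLFSSL_ROOT setting at all, while a project stored elsewhere falls back to the explicit options listed above.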

The ability to specify the wolfSSL component source code ensures consistent versioning across projects and facilitates easy updates via GitHub.

You may have seen our recent announcement regarding wolfCrypt hardware acceleration for the ESP32 series. There’s no such capability on the ESP8266. However, there’s still a noticeable difference between debug and release optimizations, as shown at the end of this blog.

Once the Espressif ESP8266 RTOS SDK is installed, it is easy to get the wolfSSL examples working (see the README for more details):


# Set your path to the RTOS SDK,
# shown here for the default location from WSL with VisualGDB
WRK_IDF_PATH=/mnt/c/SysGCC/esp8266/rtos-sdk/v3.4
# or, alternatively (uncomment to use instead):
# WRK_IDF_PATH=~/esp/ESP8266_RTOS_SDK

# Set up the environment
. "$WRK_IDF_PATH/export.sh"

# Optional: install as needed / prompted
# /mnt/c/SysGCC/esp8266/rtos-sdk/v3.4/install.sh

# Fetch wolfssl from GitHub if needed:
cd /workspace
git clone https://github.com/wolfSSL/wolfssl.git

# change directory to wolfssl client example.
cd wolfssl/IDE/Espressif/ESP-IDF/examples/wolfssl_client

# Adjust settings as desired
# Set IP address and wifi SSID name & password
idf.py menuconfig

# Build, flash and monitor
idf.py build flash -p /dev/ttyS70 -b 115200
idf.py monitor -p /dev/ttyS70 -b 74880

Are you interested in using the ESP8266 or ESP32 in your next project? Let us know! We love to hear about how wolfSSL is being used, and can optionally help promote your project on social media, with your approval.

Get Started with wolfSSL

Additional information on getting started with wolfSSL in the Espressif environment is available on the wolfSSL GitHub repository and in a YouTube recording.

Benchmark metrics for the ESP8266, compiler optimization for size (-Os):

Chip is ESP8266 (revision v1), Crystal is 26MHz, cpu freq: 160000000 Hz (160MHz)

I (59) boot: ESP-IDF v3.4 2nd stage bootloader
I (59) boot: compile time 13:01:06
I (68) qio_mode: Enabling default flash chip QIO
I (68) boot: SPI Speed      : 40MHz
I (72) boot: SPI Mode       : QIO
I (78) boot: SPI Flash Size : 2MB
I (84) boot: Partition Table:
I (89) boot: ## Label            Usage          Type ST Offset   Length
I (101) boot:  0 nvs              WiFi data        01 02 00009000 00006000
I (112) boot:  1 phy_init         RF data          01 01 0000f000 00001000
I (124) boot:  2 factory          factory app      00 00 00010000 000f0000
I (136) boot: End of partition table
I (142) esp_image: segment 0: paddr=0x00010010 vaddr=0x40210010 size=0x40874 (264308) map
I (227) esp_image: segment 1: paddr=0x0005088c vaddr=0x40250884 size=0x13cd4 ( 81108) map
I (250) esp_image: segment 2: paddr=0x00064568 vaddr=0x3ffe8000 size=0x004d0 (  1232) load
I (251) esp_image: segment 3: paddr=0x00064a40 vaddr=0x40100000 size=0x00080 (   128) load
I (262) esp_image: segment 4: paddr=0x00064ac8 vaddr=0x40100080 size=0x045a8 ( 17832) load
I (279) boot: Loaded app from partition at offset 0x10000
I (299) main: This is ESP8266 chip with 1 CPU cores, WiFi, 
I (301) main: silicon revision 1, 
I (303) main: 2MB external flash

wolfCrypt Benchmark (block bytes 1024, min 1.0 sec each)
RNG                        575 KiB took 1.022 seconds,  562.622 KiB/s Cycles per byte =   1.57
AES-128-CBC-enc            125 KiB took 1.247 seconds,  100.241 KiB/s Cycles per byte =  17.25
AES-128-CBC-dec            250 KiB took 1.089 seconds,  229.568 KiB/s Cycles per byte =  12.82
AES-192-CBC-enc            100 KiB took 1.087 seconds,   91.996 KiB/s Cycles per byte =  42.30
AES-192-CBC-dec            225 KiB took 1.019 seconds,  220.805 KiB/s Cycles per byte =  23.32
AES-256-CBC-enc            100 KiB took 1.189 seconds,   84.104 KiB/s Cycles per byte =  64.92
AES-256-CBC-dec            225 KiB took 1.075 seconds,  209.302 KiB/s Cycles per byte =  33.33
AES-128-GCM-enc             75 KiB took 1.001 seconds,   74.925 KiB/s Cycles per byte = 113.90
AES-128-GCM-dec             75 KiB took 1.001 seconds,   74.925 KiB/s Cycles per byte = 126.15
AES-192-GCM-enc             75 KiB took 1.053 seconds,   71.225 KiB/s Cycles per byte = 139.33
AES-192-GCM-dec             75 KiB took 1.053 seconds,   71.225 KiB/s Cycles per byte = 153.21
AES-256-GCM-enc             75 KiB took 1.137 seconds,   65.963 KiB/s Cycles per byte = 168.58
AES-256-GCM-dec             75 KiB took 1.137 seconds,   65.963 KiB/s Cycles per byte = 183.13
GMAC Default               342 KiB took 1.001 seconds,  341.658 KiB/s Cycles per byte =  43.08
3DES                       200 KiB took 1.115 seconds,  179.372 KiB/s Cycles per byte =  79.45
MD5                       5225 KiB took 1.000 seconds, 5225.000 KiB/s Cycles per byte =   3.22
SHA                       2300 KiB took 1.000 seconds, 2300.000 KiB/s Cycles per byte =   7.76
SHA-224                   1475 KiB took 1.009 seconds, 1461.843 KiB/s Cycles per byte =  12.71
SHA-256                   1475 KiB took 1.008 seconds, 1463.294 KiB/s Cycles per byte =  13.40
SHA-384                    475 KiB took 1.014 seconds,  468.442 KiB/s Cycles per byte =  43.79
SHA-512                    475 KiB took 1.012 seconds,  469.368 KiB/s Cycles per byte =  45.90
SHA-512/224                475 KiB took 1.012 seconds,  469.368 KiB/s Cycles per byte =  47.99
SHA-512/256                475 KiB took 1.013 seconds,  468.904 KiB/s Cycles per byte =  50.12
SHA3-224                  1250 KiB took 1.018 seconds, 1227.898 KiB/s Cycles per byte =  19.77
SHA3-256                  1175 KiB took 1.003 seconds, 1171.486 KiB/s Cycles per byte =  21.96
SHA3-384                   925 KiB took 1.021 seconds,  905.975 KiB/s Cycles per byte =  28.84
SHA3-512                   650 KiB took 1.024 seconds,  634.766 KiB/s Cycles per byte =  42.68
SHAKE128                  1450 KiB took 1.011 seconds, 1434.224 KiB/s Cycles per byte =  19.80
SHAKE256                  1175 KiB took 1.002 seconds, 1172.655 KiB/s Cycles per byte =  25.22
RIPEMD                    4375 KiB took 1.002 seconds, 4366.267 KiB/s Cycles per byte =   7.01
HMAC-MD5                  5175 KiB took 1.002 seconds, 5164.671 KiB/s Cycles per byte =   6.11
HMAC-SHA                  2325 KiB took 1.009 seconds, 2304.262 KiB/s Cycles per byte =  14.05
HMAC-SHA224               1475 KiB took 1.017 seconds, 1450.344 KiB/s Cycles per byte =  22.83
HMAC-SHA256               1475 KiB took 1.017 seconds, 1450.344 KiB/s Cycles per byte =  23.46
HMAC-SHA384                475 KiB took 1.049 seconds,  452.812 KiB/s Cycles per byte =  74.96
HMAC-SHA512                475 KiB took 1.048 seconds,  453.244 KiB/s Cycles per byte =  77.12
PBKDF2                       0 KiB took 1.077 seconds,    0.174 KiB/s Cycles per byte = 201056.28
RSA     1024  key gen         1 ops took 65.685 sec, avg 65685.000 ms, 0.015 ops/sec
RSA     2048  key gen         1 ops took 77.480 sec, avg 77480.000 ms, 0.013 ops/sec
RSA     2048   public        10 ops took 1.035 sec, avg 103.500 ms, 9.662 ops/sec
RSA     2048  private         2 ops took 44.756 sec, avg 22378.000 ms, 0.045 ops/sec
ECC   [      SECP256R1]   256  key gen         2 ops took 1.662 sec, avg 831.000 ms, 1.203 ops/sec
ECDHE [      SECP256R1]   256    agree         2 ops took 1.668 sec, avg 834.000 ms, 1.199 ops/sec
ECDSA [      SECP256R1]   256     sign         2 ops took 1.688 sec, avg 844.000 ms, 1.185 ops/sec
ECDSA [      SECP256R1]   256   verify         2 ops took 3.212 sec, avg 1606.000 ms, 0.623 ops/sec
CURVE  25519  key gen         2 ops took 1.785 sec, avg 892.500 ms, 1.120 ops/sec
CURVE  25519    agree         2 ops took 1.326 sec, avg 663.000 ms, 1.508 ops/sec
ED     25519  key gen        15 ops took 1.009 sec, avg 67.267 ms, 14.866 ops/sec
ED     25519     sign        14 ops took 1.008 sec, avg 72.000 ms, 13.889 ops/sec
ED     25519   verify         6 ops took 1.140 sec, avg 190.000 ms, 5.263 ops/sec
Benchmark complete

Benchmark metrics for the ESP8266, compiler optimization for debugging (-Og):

Chip is ESP8266 (revision v1), Crystal is 26MHz, cpu freq: 160000000 Hz (160MHz)

I (60) boot: ESP-IDF v3.4 2nd stage bootloader
I (60) boot: compile time 14:00:00
I (69) qio_mode: Enabling default flash chip QIO
I (69) boot: SPI Speed      : 40MHz
I (73) boot: SPI Mode       : QIO
I (79) boot: SPI Flash Size : 2MB
I (85) boot: Partition Table:
I (91) boot: ## Label            Usage          Type ST Offset   Length
I (102) boot:  0 nvs              WiFi data        01 02 00009000 00006000
I (114) boot:  1 phy_init         RF data          01 01 0000f000 00001000
I (125) boot:  2 factory          factory app      00 00 00010000 000f0000
I (137) boot: End of partition table
I (143) esp_image: segment 0: paddr=0x00010010 vaddr=0x40210010 size=0x46e48 (290376) map
I (255) esp_image: segment 1: paddr=0x00056e60 vaddr=0x40256e58 size=0x14560 ( 83296) map
I (284) esp_image: segment 2: paddr=0x0006b3c8 vaddr=0x3ffe8000 size=0x004dc (  1244) load
I (285) esp_image: segment 3: paddr=0x0006b8ac vaddr=0x40100000 size=0x00080 (   128) load
I (296) esp_image: segment 4: paddr=0x0006b934 vaddr=0x40100080 size=0x046a4 ( 18084) load
I (315) boot: Loaded app from partition at offset 0x10000
I (334) main: This is ESP8266 chip with 1 CPU cores, WiFi, 
I (337) main: silicon revision 1, 
I (339) main: 2MB external flash

wolfCrypt Benchmark (block bytes 1024, min 1.0 sec each)
RNG                        375 KiB took 1.019 seconds,  368.008 KiB/s Cycles per byte =   2.74
AES-128-CBC-enc             75 KiB took 1.168 seconds,   64.212 KiB/s Cycles per byte =  27.59
AES-128-CBC-dec            475 KiB took 1.045 seconds,  454.545 KiB/s Cycles per byte =   6.43
AES-192-CBC-enc             75 KiB took 1.272 seconds,   58.962 KiB/s Cycles per byte =  57.19
AES-192-CBC-dec            425 KiB took 1.007 seconds,  422.046 KiB/s Cycles per byte =  12.70
AES-256-CBC-enc             75 KiB took 1.362 seconds,   55.066 KiB/s Cycles per byte =  88.98
AES-256-CBC-dec            400 KiB took 1.053 seconds,  379.867 KiB/s Cycles per byte =  19.31
AES-128-GCM-enc             75 KiB took 1.440 seconds,   52.083 KiB/s Cycles per byte = 121.83
AES-128-GCM-dec             75 KiB took 1.440 seconds,   52.083 KiB/s Cycles per byte = 141.40
AES-192-GCM-enc             50 KiB took 1.026 seconds,   48.733 KiB/s Cycles per byte = 230.32
AES-192-GCM-dec             50 KiB took 1.026 seconds,   48.733 KiB/s Cycles per byte = 249.89
AES-256-GCM-enc             50 KiB took 1.076 seconds,   46.468 KiB/s Cycles per byte = 273.47
AES-256-GCM-dec             50 KiB took 1.076 seconds,   46.468 KiB/s Cycles per byte = 294.31
GMAC Default               247 KiB took 1.003 seconds,  246.261 KiB/s Cycles per byte =  63.60
3DES                       175 KiB took 1.121 seconds,  156.111 KiB/s Cycles per byte =  95.82
MD5                       1100 KiB took 1.008 seconds, 1091.270 KiB/s Cycles per byte =  16.18
SHA                       3900 KiB took 1.000 seconds, 3900.000 KiB/s Cycles per byte =   4.82
SHA-224                    925 KiB took 1.007 seconds,  918.570 KiB/s Cycles per byte =  21.27
SHA-256                    925 KiB took 1.002 seconds,  923.154 KiB/s Cycles per byte =  22.43
SHA-384                    850 KiB took 1.009 seconds,  842.418 KiB/s Cycles per byte =  25.58
SHA-512                    850 KiB took 1.007 seconds,  844.091 KiB/s Cycles per byte =  26.63
SHA-512/224                850 KiB took 1.004 seconds,  846.614 KiB/s Cycles per byte =  27.75
SHA-512/256                850 KiB took 1.008 seconds,  843.254 KiB/s Cycles per byte =  28.90
SHA3-224                   700 KiB took 1.000 seconds,  700.000 KiB/s Cycles per byte =  36.47
SHA3-256                   675 KiB took 1.021 seconds,  661.117 KiB/s Cycles per byte =  39.45
SHA3-384                   525 KiB took 1.035 seconds,  507.246 KiB/s Cycles per byte =  52.72
SHA3-512                   375 KiB took 1.048 seconds,  357.824 KiB/s Cycles per byte =  76.24
SHAKE128                   800 KiB took 1.001 seconds,  799.201 KiB/s Cycles per byte =  37.12
SHAKE256                   675 KiB took 1.031 seconds,  654.704 KiB/s Cycles per byte =  45.49
RIPEMD                    4300 KiB took 1.002 seconds, 4291.417 KiB/s Cycles per byte =   7.35
HMAC-MD5                  1100 KiB took 1.017 seconds, 1081.613 KiB/s Cycles per byte =  29.67
HMAC-SHA                  4100 KiB took 1.004 seconds, 4083.665 KiB/s Cycles per byte =   8.20
HMAC-SHA224                925 KiB took 1.021 seconds,  905.975 KiB/s Cycles per byte =  37.38
HMAC-SHA256                925 KiB took 1.016 seconds,  910.433 KiB/s Cycles per byte =  38.53
HMAC-SHA384                825 KiB took 1.002 seconds,  823.353 KiB/s Cycles per byte =  44.42
HMAC-SHA512                850 KiB took 1.024 seconds,  830.078 KiB/s Cycles per byte =  44.23
PBKDF2                       0 KiB took 1.283 seconds,    0.097 KiB/s Cycles per byte = 311171.78
RSA     1024  key gen         1 ops took 28.932 sec, avg 28932.000 ms, 0.035 ops/sec
RSA     2048  key gen         1 ops took 382.088 sec, avg 382088.000 ms, 0.003 ops/sec
RSA     2048   public        12 ops took 1.130 sec, avg 94.167 ms, 10.619 ops/sec
RSA     2048  private         2 ops took 39.968 sec, avg 19984.000 ms, 0.050 ops/sec
ECC   [      SECP256R1]   256  key gen         2 ops took 1.591 sec, avg 795.500 ms, 1.257 ops/sec
ECDHE [      SECP256R1]   256    agree         2 ops took 1.597 sec, avg 798.500 ms, 1.252 ops/sec
ECDSA [      SECP256R1]   256     sign         2 ops took 1.619 sec, avg 809.500 ms, 1.235 ops/sec
ECDSA [      SECP256R1]   256   verify         2 ops took 3.093 sec, avg 1546.500 ms, 0.647 ops/sec
CURVE  25519  key gen         2 ops took 1.988 sec, avg 994.000 ms, 1.006 ops/sec
CURVE  25519    agree         2 ops took 1.529 sec, avg 764.500 ms, 1.308 ops/sec
ED     25519  key gen        17 ops took 1.038 sec, avg 61.059 ms, 16.378 ops/sec
ED     25519     sign        16 ops took 1.041 sec, avg 65.062 ms, 15.370 ops/sec
ED     25519   verify         6 ops took 1.334 sec, avg 222.333 ms, 4.498 ops/sec
Benchmark complete

Questions?

If you have questions about any of the above, please contact us at facts@wolfSSL.com or call us at +1 425 245 8247.


wolfSSL now supported on PlatformIO

The best encryption libraries are now available on the PlatformIO environment!

At wolfSSL, we continue to embrace rapid prototyping environments, including Arduino, Visual Studio, and now PlatformIO for VS Code, among other IDE applications.

There are hundreds of boards supported by PlatformIO on numerous frameworks and platforms.

We are providing two different official wolfSSL libraries: a standard library and another specifically for Arduino.

There are also two different versions of each: the stable release versions, and staging updates that carry the latest post-release changes.

The stable release versions will generally follow our standard release cycle. The initial 5.7.0 versions include post stable-release updates needed for the initial PlatformIO support.

See the PlatformIO documentation for Getting Started with PlatformIO.

For Windows users using pio from command line:


set PATH=%PATH%;C:\Users\%USERNAME%\.platformio\penv\Scripts\
pio --help
pio account show

Our initial release has full support for Espressif ESP32 boards, but other boards should work with just a few modifications to the wolfSSL user_settings.h file. See the example configs:

https://github.com/wolfSSL/wolfssl/tree/master/examples/configs

Here’s an example platformio.ini file for the ESP32:


[env:esp32dev]
platform = espressif32
board = esp32dev
framework = espidf
upload_port = COM82
monitor_port = COM82
monitor_speed = 115200
build_flags = -DWOLFSSL_USER_SETTINGS, -DWOLFSSL_ESP32
monitor_filters = direct
lib_deps = wolfssl/wolfSSL@^5.7.0-rev.3b

See also: Espressif Systems Leverages PlatformIO Labs Next-Gen Technology for its Software Products.

Is your device working on the PlatformIO environment with wolfSSL? Send us a message and let us help you get started: support@wolfSSL.com or open an issue on GitHub.

Get Started with wolfSSL

Additional information on getting started with wolfSSL in the Espressif environment is available on the wolfSSL GitHub repository and in a YouTube recording.

There’s also a must-see 2024 Roadmap to review all the exciting new features:

Find out more

If you have any feedback, questions, or require support, please don’t hesitate to reach out to us via facts@wolfSSL.com, call us at +1 425 245 8247, or open an issue on GitHub.


PQC support for the Zephyr port

Post-quantum cryptography (PQC) support for the Zephyr port was introduced in the last wolfSSL release using liboqs. This involved adding the necessary files to the CMakeLists.txt for the Zephyr module. Zephyr is an open-source real-time operating system (RTOS) designed for resource-constrained devices and embedded systems. It is maintained by the Linux Foundation and supported by a vibrant community of developers and contributors.

PR #7026 (https://github.com/wolfSSL/wolfssl/pull/7026) also addressed proper random number generation within liboqs by using the wolfSSL interface. Previously, liboqs random data acquisition relied on various sources, depending on the liboqs build configuration. With the changes, a custom RNG method is provided through the OQS_randombytes_custom_algorithm() interface, enabling liboqs to obtain RNG data from wolfSSL for all generic liboqs uses.

If you have questions about post-quantum cryptography or any of the above, please contact facts@wolfSSL.com or call us at +1 425 245 8247.


wolfSSL on MicroBlaze

MicroBlaze, developed by Xilinx, is a soft processor core optimized for Xilinx FPGAs. It offers flexibility and scalability, making it suitable for a wide range of applications, including embedded systems and IoT devices. wolfSSL’s AES-GCM has been integrated with and run on a MicroBlaze soft CPU, and the latest wolfSSL release brought additional enhancements to this integration. On a MicroBlaze, wolfSSL’s AES-GCM lets developers of FPGA-based systems implement secure communication protocols and data encryption mechanisms. wolfSSL can also be set up to use Xilinx’s xilsecure library while running on the MicroBlaze, which increases AES-GCM performance significantly.

For more information about using wolfSSL on a MicroBlaze or if you have questions about any of the above, please contact us at facts@wolfSSL.com or call us at +1 425 245 8247.


RSA-PSS with CRLs

Did you know wolfSSL integrates RSA-PSS signatures with Certificate Revocation List (CRL) support?

RSA-PSS: Enhancing Security Layers

RSA-PSS, or Probabilistic Signature Scheme, represents a modern approach to digital signatures. Unlike traditional RSA signatures, RSA-PSS offers improved security properties, making it more resilient against various cryptographic attacks. By adopting RSA-PSS, wolfSSL users benefit from heightened security, enhancing the integrity of cryptographic operations.

Certificate Revocation List (CRL): Managing Certificate Integrity

In the realm of certificate management, CRL plays a pivotal role. It serves as a mechanism for indicating the revocation status of digital certificates. With CRL, systems can promptly identify and reject compromised or revoked certificates, bolstering the overall security posture. Integrating CRL support into wolfSSL empowers users with efficient certificate management capabilities, ensuring the authenticity and integrity of cryptographic transactions.

Empowering wolfSSL with RSA-PSS and CRL Integration

The fusion of RSA-PSS with CRL support within wolfSSL is a logical step in providing cutting-edge security solutions. Now, wolfSSL users can leverage the combined strength of RSA-PSS signatures and CRL management to fortify their cryptographic environments.

To delve deeper into the RSA-PSS with CRL integration in wolfSSL, visit our GitHub repository (https://github.com/wolfSSL/wolfssl/pull/7119) or reach out to facts@wolfSSL.com for assistance.

Thank you for entrusting wolfSSL as your ally in cybersecurity.

If you have questions about any of the above, please contact us at facts@wolfSSL.com or call us at +1 425 245 8247.


Removal of user RSA

In the last release of wolfSSL, some house cleaning was done on older RSA implementations. The user RSA layer was removed along with the hooks used for tying in IPP. When those were first introduced, we had yet to implement SP (single precision) versions of RSA. Fast forward to today, and there is a faster implementation of RSA in wolfSSL itself. IPP v0.9 was able to do 990.09 RSA 2048-bit sign operations per second, while wolfSSL 5.7.0 was able to run 1,015.23 operations per second. Verify operations took around the same time with both libraries, at 35,714 operations per second on average. These measurements were collected on an older Intel(R) Core(TM) i7-4870HQ CPU. Along with a performant implementation of RSA, there are now crypto callbacks for plugging in custom RSA operations. This being the case, --enable-fastrsa, user RSA, and the IPP hooks were dropped to lower maintenance and reduce bundle size.
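To put the throughput figures quoted above into per-operation terms, the arithmetic can be sketched as below. The helper names are hypothetical and exist only for this illustration; the inputs are the operations-per-second numbers from the paragraph above.

```c
#include <assert.h>

/* Milliseconds per operation for a given throughput in ops/sec. */
double ms_per_op(double ops_per_sec)
{
    assert(ops_per_sec > 0.0);
    return 1000.0 / ops_per_sec;
}

/* Relative throughput gain of new_ops over old_ops, in percent. */
double percent_gain(double old_ops, double new_ops)
{
    assert(old_ops > 0.0);
    return (new_ops / old_ops - 1.0) * 100.0;
}
```

With these helpers, ms_per_op(990.09) is about 1.010 ms per sign and ms_per_op(1015.23) is about 0.985 ms per sign, so percent_gain(990.09, 1015.23) comes to roughly 2.5 percent. A verify at 35,714 operations per second takes about 0.028 ms.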

If you have questions about any of the above, please contact us at facts@wolfSSL.com or call us at +1 425 245 8247.


How to unload intermediate certificates with wolfSSL?

Recently, wolfSSL gained a way to unload intermediate CA certificates. Pull request #7245 (https://github.com/wolfSSL/wolfssl/pull/7245) focuses on optimizing memory management by introducing a function to unload intermediate CA certificates and free up memory. Let’s explore the significance of this code change and its potential impact on efficiency and resource utilization within cryptographic applications.

Specifically, the code change addresses the need to efficiently handle intermediate Certificate Authority (CA) certificates. These certificates, while essential for establishing trust chains in cryptographic operations, can consume valuable memory resources, particularly in resource-constrained environments.

The essence of the code change lies in the introduction of a dedicated function (wolfSSL_CertManagerUnloadIntermediateCerts()) to unload intermediate CA certificates from memory when they are no longer needed. By using this function, developers can optimize resource utilization, thereby enhancing the overall efficiency and stability of cryptographic operations.
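The idea can be pictured with a toy model: a certificate store that tags each entry as a trust anchor or an intermediate, and an unload step that frees only the intermediates. This is a sketch of the concept only. The `sketch_` types and functions are hypothetical and do not reflect how wolfSSL stores signers internally; in real code the call would be wolfSSL_CertManagerUnloadIntermediateCerts() on a certificate manager.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical tags for this sketch; wolfSSL's internal types differ. */
enum sketch_ca_kind { SKETCH_ROOT_CA, SKETCH_INTERMEDIATE_CA };

struct sketch_ca {
    enum sketch_ca_kind kind;
    char name[32];
    struct sketch_ca *next;
};

struct sketch_store {
    struct sketch_ca *head;
};

/* Add an entry at the front of the store.
 * Returns 0 on success, -1 if allocation fails. */
int sketch_store_add(struct sketch_store *s, enum sketch_ca_kind kind,
                     const char *name)
{
    struct sketch_ca *e;
    assert(s != NULL && name != NULL);
    e = (struct sketch_ca *)malloc(sizeof *e);
    if (e == NULL)
        return -1;
    e->kind = kind;
    strncpy(e->name, name, sizeof e->name - 1);
    e->name[sizeof e->name - 1] = '\0';
    e->next = s->head;
    s->head = e;
    return 0;
}

/* Free every intermediate entry and keep the roots.
 * Returns the number of entries freed. */
int sketch_store_unload_intermediates(struct sketch_store *s)
{
    struct sketch_ca **link;
    int freed = 0;
    assert(s != NULL);
    link = &s->head;
    while (*link != NULL) {
        struct sketch_ca *e = *link;
        if (e->kind == SKETCH_INTERMEDIATE_CA) {
            *link = e->next;   /* unlink before freeing */
            free(e);
            freed++;
        } else {
            link = &e->next;
        }
    }
    return freed;
}

/* Count the entries currently held by the store. */
int sketch_store_count(const struct sketch_store *s)
{
    const struct sketch_ca *e;
    int n = 0;
    assert(s != NULL);
    for (e = s->head; e != NULL; e = e->next)
        n++;
    return n;
}

/* Release everything that is left, roots included. */
void sketch_store_free(struct sketch_store *s)
{
    assert(s != NULL);
    while (s->head != NULL) {
        struct sketch_ca *e = s->head;
        s->head = e->next;
        free(e);
    }
}
```

Loading one root and two intermediates and then unloading the intermediates leaves a store of one entry, the root, so later verifications can still anchor to it while the memory held by the chain certificates is returned.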

Key Benefits: The introduction of the function to unload intermediate CA certificates brings several notable benefits:

  1. Efficient Memory Management: By providing a mechanism to unload intermediate CA certificates from memory, the code change ensures efficient utilization of resources. This is particularly crucial in environments where memory constraints are a concern, such as embedded systems and IoT devices.
  2. Prevention of Memory Leaks: Memory leaks can pose significant security and reliability risks in software applications. The new function helps prevent memory leaks by explicitly releasing memory allocated for intermediate CA certificates when they are no longer required, thereby improving the robustness of cryptographic operations.
  3. Scalability and Performance: Optimal memory management contributes to improved scalability and performance of cryptographic applications. By freeing up memory resources, the code change enables applications to handle larger workloads more efficiently, leading to enhanced responsiveness and overall performance.

By incorporating the function to unload intermediate CA certificates, developers can optimize resource utilization and mitigate potential security risks associated with memory management issues. This not only enhances the reliability and stability of cryptographic applications but also contributes to the overall security resilience of the systems in which they are deployed.

If you have questions about any of the above, please contact us at facts@wolfSSL.com or call us at +1 425 245 8247.


TLS on Embedded Systems: UART, I2C or SPI

Recently, we have seen an uptick in interest in securing communications between different embedded modules within a larger system. The academic community has done great work showing that these communications need to be secured, especially in the automotive space.

Are you looking to start securing your internal communications over UART, I2C or SPI? With wolfSSL, no matter how small and constrained your micro-controller, we can help! You can make trade-offs and set build flags to suit your needs with regard to code size, memory usage and binary footprint. For example, if you are running a TLS 1.3 client, we have flags to exclude all server-only code and exclude all earlier versions of TLS and SSL.
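As a rough illustration of the TLS 1.3 client case, a build driven by a user_settings.h file might start from a fragment like the one below. This is a sketch under assumptions: the macros are wolfSSL build options, but the combination shown is incomplete and untested, and a working build also needs key exchange, cipher and extension options chosen for the target.

```c
/* user_settings.h (fragment), used when building with
 * WOLFSSL_USER_SETTINGS defined. Illustrative only, not a complete
 * configuration. */
#define WOLFSSL_TLS13        /* enable TLS 1.3 */
#define WOLFSSL_NO_TLS12     /* leave out TLS 1.2 */
#define NO_OLD_TLS           /* leave out TLS 1.1 and earlier */
#define NO_WOLFSSL_SERVER    /* leave out server-only code */
```

Comparing the size of the resulting binary with and without each define is a simple way to see what every option buys on a given target.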

Some might find the idea of TLS over UART, I2C or SPI to be somewhat strange. Isn’t TLS supposed to be running over a network connection? Actually, with our IO callback system, there is no problem at all. For a great example of how to do it, you can have a look at our STM32 example code. There we show TLS 1.3 over UART both as server and client. Please have a look at https://github.com/wolfSSL/wolfssl/blob/master/IDE/STM32Cube/wolfssl_example.c. You can search there for the ENABLE_TLS_UART macro to better understand how it hooks into our IO callbacks.
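The general shape of such callbacks can be sketched without any hardware. Below, a small byte FIFO stands in for a UART driver, and the two functions follow the shape of wolfSSL's IO callbacks (a buffer, a size and a user context pointer). Everything named `sketch_` or `SKETCH_` is hypothetical; the `ssl` argument is a `void *` and the would-block codes are placeholders so the sketch compiles without wolfSSL headers. Real code would take a `WOLFSSL *` and return WOLFSSL_CBIO_ERR_WANT_READ or WOLFSSL_CBIO_ERR_WANT_WRITE, and this is not the code from the STM32 example.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder "would block" codes for this sketch only. */
#define SKETCH_WANT_READ  (-2)
#define SKETCH_WANT_WRITE (-2)

/* Hypothetical byte FIFO standing in for a UART driver's buffer. */
#define SKETCH_FIFO_SIZE 64
struct sketch_fifo {
    unsigned char data[SKETCH_FIFO_SIZE];
    size_t head;   /* next slot to write */
    size_t tail;   /* next slot to read */
    size_t count;  /* bytes currently stored */
};

/* Store up to n bytes; returns how many were accepted. */
size_t sketch_fifo_push(struct sketch_fifo *f, const unsigned char *src,
                        size_t n)
{
    size_t i = 0;
    while (i < n && f->count < SKETCH_FIFO_SIZE) {
        f->data[f->head] = src[i++];
        f->head = (f->head + 1) % SKETCH_FIFO_SIZE;
        f->count++;
    }
    return i;
}

/* Remove up to n bytes; returns how many were delivered. */
size_t sketch_fifo_pop(struct sketch_fifo *f, unsigned char *dst, size_t n)
{
    size_t i = 0;
    while (i < n && f->count > 0) {
        dst[i++] = f->data[f->tail];
        f->tail = (f->tail + 1) % SKETCH_FIFO_SIZE;
        f->count--;
    }
    return i;
}

/* Receive callback: hand over whatever bytes have arrived, or report
 * "try again later" when nothing is available yet. */
int sketch_uart_recv(void *ssl, char *buf, int sz, void *ctx)
{
    struct sketch_fifo *rx = (struct sketch_fifo *)ctx;
    size_t got;
    (void)ssl;
    assert(rx != NULL && buf != NULL && sz >= 0);
    got = sketch_fifo_pop(rx, (unsigned char *)buf, (size_t)sz);
    return (got == 0 && sz > 0) ? SKETCH_WANT_READ : (int)got;
}

/* Send callback: queue as many bytes as fit, or report "try again
 * later" when the transmit buffer is full. */
int sketch_uart_send(void *ssl, char *buf, int sz, void *ctx)
{
    struct sketch_fifo *tx = (struct sketch_fifo *)ctx;
    size_t put;
    (void)ssl;
    assert(tx != NULL && buf != NULL && sz >= 0);
    put = sketch_fifo_push(tx, (const unsigned char *)buf, (size_t)sz);
    return (put == 0 && sz > 0) ? SKETCH_WANT_WRITE : (int)put;
}
```

In a real build, callbacks of this shape would be registered with wolfSSL_CTX_SetIORecv() and wolfSSL_CTX_SetIOSend(), and the per-connection context pointers set with wolfSSL_SetIOReadCtx() and wolfSSL_SetIOWriteCtx(). Returning a short count or a would-block code lets the TLS layer retry, which is what allows it to run over a slow byte-oriented link.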

Want more details? Want to discuss further how you can secure your data interfaces on your micro-controller? Reach out to facts@wolfssl.com.

If you have questions about any of the above, please contact us at facts@wolfSSL.com or call us at +1 425 245 8247.

