Here at wolfSSL, we are intimately aware of the needs of our embedded customers. It is always about the tradeoffs and optimizations that fit their unique use cases and needs. The tradeoffs are typically between speed, footprint size, and memory usage. In many of our blog posts, we like to focus on our speed performance, but in this post, we look at options around memory usage. This is especially important for our post-quantum algorithms, ML-KEM and ML-DSA, as they are generally faster or on par with their conventional counterparts, such as ECDSA and ECDH, but do use more memory.
We are going to focus on some experiments we ran on a Raspberry Pi 5. We built wolfSSL, ran its testwolfcrypt test application, and collected stack and heap statistics. Here are the results:
Configuration | Algorithm | Stack (bytes) | Heap (bytes) | Total (bytes) | Heap Allocs |
---|---|---|---|---|---|
Small code | MLKEM-512 | 23,568 | 7,552 | 31,120 | 3 |
Small code | MLKEM-768 | 32,672 | 11,968 | 44,640 | 3 |
Small code | MLKEM-1024 | 42,400 | 17,568 | 59,968 | 3 |
Small code | MLDSA-44 | 15,904 | 50,304 | 66,208 | 2 |
Small code | MLDSA-65 | 17,440 | 77,952 | 95,392 | 2 |
Small code | MLDSA-87 | 19,376 | 120,960 | 140,336 | 2 |
Small code with small mem | MLKEM-512 | 23,696 | 3,968 | 27,664 | 3 |
Small code with small mem | MLKEM-768 | 32,928 | 5,824 | 38,752 | 3 |
Small code with small mem | MLKEM-1024 | 42,656 | 7,840 | 50,496 | 3 |
Small code with small mem | MLDSA-44 | 15,856 | 15,656 | 31,512 | 2 |
Small code with small mem | MLDSA-65 | 17,392 | 20,776 | 38,168 | 2 |
Small code with small mem | MLDSA-87 | 19,328 | 26,920 | 46,248 | 2 |
Small code with small mem + stack | MLKEM-512 | 2,112 | 19,306 | 21,418 | 17 |
Small code with small mem + stack | MLKEM-768 | 2,112 | 27,306 | 29,418 | 17 |
Small code with small mem + stack | MLKEM-1024 | 2,112 | 35,786 | 37,898 | 17 |
Small code with small mem + stack | MLDSA-44 | 2,112 | 28,211 | 30,323 | 7 |
Small code with small mem + stack | MLDSA-65 | 2,112 | 33,331 | 35,443 | 7 |
Small code with small mem + stack | MLDSA-87 | 2,160 | 39,475 | 41,635 | 7 |
Here are some interesting points we noticed in the data:
- Stack vs Heap Trade-off: The WOLFSSL_SMALL_STACK configuration dramatically reduces stack usage (from roughly 16K-43K to 2,112 bytes, or 2,160 bytes for ML-DSA-87). The cost is more heap usage and more heap allocations (17 instead of 3 for ML-KEM, 7 instead of 2 for ML-DSA), yet the totals still come out lowest in this configuration.
- ML-KEM Memory Scaling: Memory usage scales predictably with security level. ML-KEM-512 uses ~31K total, ML-KEM-768 uses ~45K, and ML-KEM-1024 uses ~60K in the small code configuration.
- ML-DSA Higher Heap Usage: ML-DSA algorithms use significantly more heap memory (50K-121K) than ML-KEM (7.5K-17.5K) in the small code configuration.
- Small Memory Optimization: Adding the small mem configuration flags reduces total memory usage by roughly 11 to 16 percent for ML-KEM and 52 to 67 percent for ML-DSA. Quite impressive!
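For readers who want to recompute the small mem savings themselves, here is a minimal sketch that derives the percentages from the Total column of the results table. The dictionary and function names are ours, chosen for illustration; only the byte counts come from the measurements above.

```python
# Total (stack + heap) bytes, copied from the results table above.
SMALL_CODE = {
    "MLKEM-512": 31_120,
    "MLKEM-768": 44_640,
    "MLKEM-1024": 59_968,
    "MLDSA-44": 66_208,
    "MLDSA-65": 95_392,
    "MLDSA-87": 140_336,
}
SMALL_CODE_SMALL_MEM = {
    "MLKEM-512": 27_664,
    "MLKEM-768": 38_752,
    "MLKEM-1024": 50_496,
    "MLDSA-44": 31_512,
    "MLDSA-65": 38_168,
    "MLDSA-87": 46_248,
}


def reduction_percent(before: int, after: int) -> float:
    """Percentage saved when going from `before` bytes to `after` bytes."""
    return 100.0 * (before - after) / before


# Savings per algorithm from adding the small mem flags.
savings = {
    alg: reduction_percent(SMALL_CODE[alg], SMALL_CODE_SMALL_MEM[alg])
    for alg in SMALL_CODE
}

for alg, pct in savings.items():
    print(f"{alg:<11} {SMALL_CODE[alg]:>8,} -> "
          f"{SMALL_CODE_SMALL_MEM[alg]:>7,} bytes ({pct:.1f}% smaller)")
```

The same helper works for any other pair of columns, for example comparing heap bytes alone instead of totals.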
If you’re wondering whether you will be able to use post-quantum algorithms on your system, these numbers should help you estimate the resources you will need to allocate.
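As a rough illustration of that kind of sizing exercise, the sketch below checks which of the measured configurations would fit a given stack and heap budget. The budgets and helper names are hypothetical, and the byte counts are the Raspberry Pi 5 testwolfcrypt measurements from the table, so treat the result as a starting point; numbers on your own target and compiler may differ.

```python
from typing import NamedTuple


class Usage(NamedTuple):
    stack: int  # bytes of stack, from the results table
    heap: int   # bytes of heap, from the results table


# Measurements copied from the results table (Raspberry Pi 5, testwolfcrypt).
RESULTS = {
    "small code": {
        "MLKEM-512": Usage(23_568, 7_552),
        "MLKEM-768": Usage(32_672, 11_968),
        "MLKEM-1024": Usage(42_400, 17_568),
        "MLDSA-44": Usage(15_904, 50_304),
        "MLDSA-65": Usage(17_440, 77_952),
        "MLDSA-87": Usage(19_376, 120_960),
    },
    "small code with small mem": {
        "MLKEM-512": Usage(23_696, 3_968),
        "MLKEM-768": Usage(32_928, 5_824),
        "MLKEM-1024": Usage(42_656, 7_840),
        "MLDSA-44": Usage(15_856, 15_656),
        "MLDSA-65": Usage(17_392, 20_776),
        "MLDSA-87": Usage(19_328, 26_920),
    },
    "small code with small mem + stack": {
        "MLKEM-512": Usage(2_112, 19_306),
        "MLKEM-768": Usage(2_112, 27_306),
        "MLKEM-1024": Usage(2_112, 35_786),
        "MLDSA-44": Usage(2_112, 28_211),
        "MLDSA-65": Usage(2_112, 33_331),
        "MLDSA-87": Usage(2_160, 39_475),
    },
}


def configs_that_fit(algorithm: str, stack_budget: int, heap_budget: int) -> list[str]:
    """Return the measured configurations whose stack and heap both fit the budget."""
    return [
        name
        for name, table in RESULTS.items()
        if table[algorithm].stack <= stack_budget
        and table[algorithm].heap <= heap_budget
    ]


# Hypothetical device with an 8 KiB stack and 48 KiB of heap.
print(configs_that_fit("MLDSA-65", 8 * 1024, 48 * 1024))
```

Checking stack and heap separately, rather than the total alone, matters because the small stack option lowers the total by shifting most of the usage onto the heap.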
Here are the configurations and commands used:
Configuration Name | Configurations and Command |
---|---|
Small code | $ ./configure --enable-dilithium=all,44,small --enable-mlkem=all,512,small --enable-trackmemory=verbose --enable-stacksize=verbose $ make $ ./wolfcrypt/test/testwolfcrypt |
Small code with small memory | $ ./configure --enable-dilithium=all,44,small --enable-mlkem=all,512,small CFLAGS="-DWOLFSSL_DILITHIUM_VERIFY_SMALL_MEM -DWOLFSSL_DILITHIUM_SIGN_SMALL_MEM -DWOLFSSL_DILITHIUM_MAKE_KEY_SMALL_MEM -DWOLFSSL_MLKEM_ENCAPSULATE_SMALL_MEM -DWOLFSSL_MLKEM_MAKEKEY_SMALL_MEM" --enable-trackmemory=verbose --enable-stacksize=verbose $ make $ ./wolfcrypt/test/testwolfcrypt |
Small code with small mem and small stack | $ ./configure --enable-dilithium=all,44,small --enable-mlkem=all,512,small CFLAGS="-DWOLFSSL_DILITHIUM_VERIFY_SMALL_MEM -DWOLFSSL_DILITHIUM_SIGN_SMALL_MEM -DWOLFSSL_DILITHIUM_MAKE_KEY_SMALL_MEM -DWOLFSSL_MLKEM_ENCAPSULATE_SMALL_MEM -DWOLFSSL_MLKEM_MAKEKEY_SMALL_MEM" --enable-trackmemory=verbose --enable-stacksize=verbose --enable-smallstack $ make $ ./wolfcrypt/test/testwolfcrypt |
Let us know if you need us to get even tighter in terms of memory usage. Our cryptographers are wizards when it comes to exploiting tradeoffs!
If you have questions about any of the above, please contact us at facts@wolfSSL.com or call us at +1 425 245 8247.