Topic: Benchmark runs 50x times slower with Visual Studio than Cygwin

I built wolfSSL twice on the same computer, a Lenovo desktop with an Intel i5-4460 at 3.2 GHz running Windows 7.  One build uses Cygwin and the other uses Visual Studio 17 Community C++.  The Cygwin version ran 50 times faster than the Visual Studio version on the same computer!  I got virtually identical benchmark results on a Windows 7 2.7 GHz Intel i7 computer using Visual Studio 15 Professional C++.

I ran the two benchmarks using the same block size of 1024.  The Visual Studio benchmark has a default block size of 1024, whereas the Cygwin benchmark program has a default block size of 1048576.  When I set the Visual Studio benchmark block size to 1048576, I get all zeros:

benchmark 1048576
------------------------------------------------------------------------------
wolfSSL version 4.0.0
------------------------------------------------------------------------------
wolfCrypt Benchmark (block bytes 1048576, min 1.0 sec each)
RNG                  0 bytes took 1.000 seconds,    0.000 bytes/s
AES-128-CBC-enc      0 bytes took 1.000 seconds,    0.000 bytes/s
AES-128-CBC-dec      0 bytes took 1.000 seconds,    0.000 bytes/s
AES-192-CBC-enc      0 bytes took 1.000 seconds,    0.000 bytes/s
^C
========================================
Here are the Cygwin Results:
========================================

./benchmark 1024
------------------------------------------------------------------------------
wolfSSL version 4.0.0
------------------------------------------------------------------------------
wolfCrypt Benchmark (block bytes 1024, min 1.0 sec each)
RNG                 95 MB took 1.040 seconds,   91.346 MB/s Cycles per byte =  33.10
AES-128-CBC-enc    225 MB took 1.000 seconds,  225.000 MB/s Cycles per byte =  13.58
AES-128-CBC-dec    235 MB took 1.010 seconds,  232.673 MB/s Cycles per byte =  13.11
AES-192-CBC-enc    195 MB took 1.020 seconds,  191.176 MB/s Cycles per byte =  15.88
AES-192-CBC-dec    200 MB took 1.010 seconds,  198.019 MB/s Cycles per byte =  15.41
AES-256-CBC-enc    170 MB took 1.020 seconds,  166.666 MB/s Cycles per byte =  18.17
AES-256-CBC-dec    175 MB took 1.010 seconds,  173.267 MB/s Cycles per byte =  17.68
AES-128-GCM-enc     35 MB took 1.100 seconds,   31.818 MB/s Cycles per byte =  95.98
AES-128-GCM-dec     35 MB took 1.110 seconds,   31.531 MB/s Cycles per byte =  96.32
AES-192-GCM-enc     35 MB took 1.100 seconds,   31.818 MB/s Cycles per byte =  95.80
AES-192-GCM-dec     35 MB took 1.130 seconds,   30.973 MB/s Cycles per byte =  98.40
AES-256-GCM-enc     35 MB took 1.100 seconds,   31.818 MB/s Cycles per byte =  95.04
AES-256-GCM-dec     35 MB took 1.080 seconds,   32.407 MB/s Cycles per byte =  93.87
AES-CCM            110 MB took 1.010 seconds,  108.911 MB/s Cycles per byte =  28.10
CHACHA             395 MB took 1.010 seconds,  391.088 MB/s Cycles per byte =   7.75
CHA-POLY           285 MB took 1.000 seconds,  285.000 MB/s Cycles per byte =  10.70
MD5                480 MB took 1.000 seconds,  480.000 MB/s Cycles per byte =   6.36
POLY1305          1340 MB took 1.000 seconds, 1339.997 MB/s Cycles per byte =   2.26
SHA                420 MB took 1.000 seconds,  420.000 MB/s Cycles per byte =   7.24
SHA-224            230 MB took 1.010 seconds,  227.722 MB/s Cycles per byte =  13.45
SHA-256            230 MB took 1.010 seconds,  227.723 MB/s Cycles per byte =  13.39
SHA-384            315 MB took 1.010 seconds,  311.881 MB/s Cycles per byte =   9.72
SHA-512            315 MB took 1.000 seconds,  314.999 MB/s Cycles per byte =   9.69
SHA3-224           290 MB took 1.010 seconds,  287.128 MB/s Cycles per byte =  10.55
SHA3-256           275 MB took 1.010 seconds,  272.277 MB/s Cycles per byte =  11.22
SHA3-384           210 MB took 1.010 seconds,  207.921 MB/s Cycles per byte =  14.61
SHA3-512           145 MB took 1.000 seconds,  145.000 MB/s Cycles per byte =  21.01
HMAC-MD5           485 MB took 1.000 seconds,  484.999 MB/s Cycles per byte =   6.29
HMAC-SHA           420 MB took 1.010 seconds,  415.841 MB/s Cycles per byte =   7.28
HMAC-SHA224        230 MB took 1.010 seconds,  227.722 MB/s Cycles per byte =  13.48
HMAC-SHA256        230 MB took 1.020 seconds,  225.490 MB/s Cycles per byte =  13.40
HMAC-SHA384        315 MB took 1.000 seconds,  315.000 MB/s Cycles per byte =   9.70
HMAC-SHA512        315 MB took 1.000 seconds,  314.999 MB/s Cycles per byte =   9.70
RSA     2048 public       3600 ops took 1.010 sec, avg 0.281 ms, 3564.353 ops/sec
RSA     2048 private       400 ops took 1.320 sec, avg 3.300 ms, 303.030 ops/sec
DH      2048 key gen       927 ops took 1.000 sec, avg 1.079 ms, 926.998 ops/sec
DH      2048 agree        1000 ops took 1.060 sec, avg 1.060 ms, 943.395 ops/sec
ECC      256 key gen       987 ops took 1.000 sec, avg 1.013 ms, 986.999 ops/sec
ECDHE    256 agree        1000 ops took 1.040 sec, avg 1.040 ms, 961.537 ops/sec
ECDSA    256 sign         1000 ops took 1.030 sec, avg 1.030 ms, 970.873 ops/sec
ECDSA    256 verify       1500 ops took 1.040 sec, avg 0.693 ms, 1442.305 ops/sec
Benchmark complete

========================================
Here are the Visual Studio 17 Results:
========================================

benchmark
------------------------------------------------------------------------------
wolfSSL version 4.0.0
------------------------------------------------------------------------------
wolfCrypt Benchmark (block bytes 1024, min 1.0 sec each)
RNG                  2 MB took 1.009 seconds,    1.935 MB/s
AES-128-CBC-enc     15 MB took 1.000 seconds,   14.496 MB/s
AES-128-CBC-dec     14 MB took 1.002 seconds,   13.894 MB/s
AES-192-CBC-enc     14 MB took 1.001 seconds,   14.275 MB/s
AES-192-CBC-dec     14 MB took 1.000 seconds,   13.671 MB/s
AES-256-CBC-enc     14 MB took 1.000 seconds,   13.987 MB/s
AES-256-CBC-dec     13 MB took 1.002 seconds,   13.432 MB/s
AES-128-GCM-enc      3 MB took 1.006 seconds,    3.131 MB/s
AES-128-GCM-dec      3 MB took 1.001 seconds,    3.145 MB/s
AES-192-GCM-enc      3 MB took 1.002 seconds,    3.093 MB/s
AES-192-GCM-dec      3 MB took 1.006 seconds,    3.106 MB/s
AES-256-GCM-enc      3 MB took 1.000 seconds,    3.075 MB/s
AES-256-GCM-dec      3 MB took 1.006 seconds,    3.084 MB/s
AES-CCM              7 MB took 1.001 seconds,    7.021 MB/s
CHACHA               8 MB took 1.002 seconds,    8.139 MB/s
CHA-POLY             7 MB took 1.002 seconds,    7.068 MB/s
MD5                 41 MB took 1.001 seconds,   41.140 MB/s
POLY1305            97 MB took 1.000 seconds,   97.459 MB/s
SHA                 10 MB took 1.001 seconds,    9.801 MB/s
SHA-224              4 MB took 1.001 seconds,    4.364 MB/s
SHA-256              4 MB took 1.001 seconds,    4.365 MB/s
SHA-384              5 MB took 1.003 seconds,    5.162 MB/s
SHA-512              5 MB took 1.002 seconds,    5.164 MB/s
SHA3-224            22 MB took 1.001 seconds,   21.909 MB/s
SHA3-256            21 MB took 1.001 seconds,   20.764 MB/s
SHA3-384            16 MB took 1.001 seconds,   16.223 MB/s
SHA3-512            11 MB took 1.002 seconds,   11.477 MB/s
HMAC-MD5            41 MB took 1.000 seconds,   40.707 MB/s
HMAC-SHA            10 MB took 1.001 seconds,    9.728 MB/s
HMAC-SHA224          4 MB took 1.001 seconds,    4.294 MB/s
HMAC-SHA256          4 MB took 1.004 seconds,    4.305 MB/s
HMAC-SHA384          5 MB took 1.004 seconds,    5.035 MB/s
HMAC-SHA512          5 MB took 1.001 seconds,    5.047 MB/s
RSA     1024 key gen         2 ops took 1.174 sec, avg 587.073 ms, 1.703 ops/sec
RSA     2048 key gen         1 ops took 12.811 sec, avg 12810.671 ms, 0.078 ops/sec
RSA     2048 public        266 ops took 1.005 sec, avg 3.778 ms, 264.705 ops/sec
RSA     2048 private        14 ops took 1.151 sec, avg 82.205 ms, 12.165 ops/sec
DH      2048 key gen        37 ops took 1.018 sec, avg 27.522 ms, 36.335 ops/sec
DH      2048 agree          38 ops took 1.048 sec, avg 27.572 ms, 36.269 ops/sec
ECC      256 key gen        96 ops took 1.003 sec, avg 10.443 ms, 95.757 ops/sec
ECDHE    256 agree         108 ops took 1.017 sec, avg 9.412 ms, 106.246 ops/sec
ECDSA    256 sign          106 ops took 1.006 sec, avg 9.492 ms, 105.352 ops/sec
ECDSA    256 verify        156 ops took 1.006 sec, avg 6.446 ms, 155.136 ops/sec

Benchmark complete
wolfSSL Entering wolfCrypt_Cleanup

Re: Benchmark runs 50x times slower with Visual Studio than Cygwin

I forgot to add: here are the testsuite results, which indicate that the Visual Studio 17 configuration passes:

testsuite
error    test passed!
Bad end of line in Base64 Decode
Bad Base64 Decode data, too small
Bad Base64 Decode data, too big
Bad Base64 Decode data, too small
Bad Base64 Decode data, too big
Bad Base64 Decode data, too small
Bad Base64 Decode data, too big
Bad Base64 Decode data, too small
Bad Base64 Decode data, too big
Escape buffer max too small
base64   test passed!
asn      test passed!
MD5      test passed!
SHA      test passed!
SHA-224  test passed!
SHA-256  test passed!
SHA-384  test passed!
SHA-512  test passed!
SHA-3    test passed!
Hash     test passed!
HMAC-MD5 test passed!
HMAC-SHA test passed!
HMAC-SHA224 test passed!
HMAC-SHA256 test passed!
HMAC-SHA384 test passed!
HMAC-SHA512 test passed!
HMAC-SHA3   test passed!
HMAC-KDF    test passed!
GMAC     test passed!
Chacha   test passed!
POLY1305 test passed!
ChaCha20-Poly1305 AEAD test passed!
AES      test passed!
AES192   test passed!
AES256   test passed!
AES-GCM  test passed!
AES-CCM  test passed!
RANDOM   test passed!
GetLength value exceeds buffer length
GetLength value exceeds buffer length
GetLength value exceeds buffer length
wc_SignatureGetSize: Invalid RsaKey key size
RSA Signature Verify difference!
wolfSSL Using RSA OAEP padding
wolfSSL Using RSA OAEP un-padding
wolfSSL Using RSA OAEP padding
wolfSSL Using RSA OAEP un-padding
wolfSSL Using RSA OAEP un-padding
wolfSSL Using RSA OAEP padding
wolfSSL Using RSA OAEP un-padding
wolfSSL Using RSA OAEP padding
wolfSSL Using RSA OAEP un-padding
wolfSSL Using RSA OAEP padding
wolfSSL Using RSA OAEP un-padding
wolfSSL Using RSA OAEP padding
wolfSSL Using RSA OAEP un-padding
wolfSSL Entering wc_PemCertToDer
wolfSSL Entering PemToDer
wolfSSL Entering GetExplicitVersion
wolfSSL Entering GetSerialNumber
Got Cert Header
wolfSSL Entering GetAlgoId
wolfSSL Entering GetObjectId()
Got Algo ID
Getting Cert Name
Getting Cert Name
Got Subject Name
wolfSSL Entering GetAlgoId
wolfSSL Entering GetObjectId()
Got Key
Parsed Past Key
wolfSSL Entering DecodeCertExtensions
wolfSSL Entering GetObjectId()
wolfSSL Entering DecodeSubjKeyId
wolfSSL Entering GetObjectId()
wolfSSL Entering DecodeAuthKeyId
wolfSSL Entering GetObjectId()
wolfSSL Entering DecodeBasicCaConstraint
wolfSSL Entering GetAlgoId
wolfSSL Entering GetObjectId()
wolfSSL Entering GetObjectId()
wolfSSL Entering wc_PemCertToDer
wolfSSL Entering PemToDer
wolfSSL Entering GetExplicitVersion
wolfSSL Entering GetSerialNumber
Got Cert Header
wolfSSL Entering GetAlgoId
wolfSSL Entering GetObjectId()
Got Algo ID
Getting Cert Name
Getting Cert Name
Got Subject Name
wolfSSL Entering GetAlgoId
wolfSSL Entering GetObjectId()
Got Key
Parsed Past Key
wolfSSL Entering DecodeCertExtensions
wolfSSL Entering GetObjectId()
wolfSSL Entering DecodeSubjKeyId
wolfSSL Entering GetObjectId()
wolfSSL Entering DecodeAuthKeyId
wolfSSL Entering GetObjectId()
wolfSSL Entering DecodeBasicCaConstraint
wolfSSL Entering GetAlgoId
wolfSSL Entering GetObjectId()
wolfSSL Using RSA PSS padding
wolfSSL Using RSA PSS un-padding
wolfSSL Using RSA PSS padding
wolfSSL Using RSA PSS un-padding
wolfSSL Using RSA PSS padding
wolfSSL Using RSA PSS un-padding
wolfSSL Using RSA PSS padding
wolfSSL Using RSA PSS un-padding
wolfSSL Using RSA PSS padding
wolfSSL Using RSA PSS un-padding
wolfSSL Using RSA PSS padding
wolfSSL Using RSA PSS un-padding
wolfSSL Using RSA PSS padding
wolfSSL Using RSA PSS un-padding
wolfSSL Using RSA PSS padding
wolfSSL Using RSA PSS un-padding
wolfSSL Using RSA PSS padding
wolfSSL Using RSA PSS un-padding
wolfSSL Using RSA PSS padding
wolfSSL Using RSA PSS un-padding
wolfSSL Using RSA PSS padding
wolfSSL Using RSA PSS un-padding
wolfSSL Using RSA PSS padding
wolfSSL Using RSA PSS un-padding
wolfSSL Using RSA PSS padding
wolfSSL Using RSA PSS un-padding
wolfSSL Using RSA PSS padding
wolfSSL Using RSA PSS un-padding
wolfSSL Using RSA PSS padding
wolfSSL Using RSA PSS un-padding
wolfSSL Using RSA PSS padding
wolfSSL Using RSA PSS un-padding
wolfSSL Using RSA PSS padding
wolfSSL Using RSA PSS un-padding
wolfSSL Using RSA PSS padding
wolfSSL Using RSA PSS un-padding
wolfSSL Using RSA PSS padding
wolfSSL Using RSA PSS un-padding
wolfSSL Using RSA PSS padding
wolfSSL Using RSA PSS un-padding
wolfSSL Using RSA PSS padding
wolfSSL Using RSA PSS un-padding
wolfSSL Using RSA PSS padding
wolfSSL Using RSA PSS un-padding
wolfSSL Using RSA PSS padding
wolfSSL Using RSA PSS un-padding
wolfSSL Using RSA PSS padding
wolfSSL Using RSA PSS un-padding
wolfSSL Using RSA PSS padding
wolfSSL Using RSA PSS un-padding
wolfSSL Using RSA PSS padding
wolfSSL Using RSA PSS un-padding
wolfSSL Using RSA PSS un-padding
wolfSSL Using RSA PSS padding
RSA      test passed!
DH       test passed!
Checking size of PKCS8
wolfSSL Entering wc_CreatePKCS8Key()
Checking size of PKCS8
wolfSSL Entering wc_CreatePKCS8Key()
Checking size of PKCS8
wolfSSL Entering wc_CreatePKCS8Key()
Checking size of PKCS8
wolfSSL Entering wc_CreatePKCS8Key()
Checking size of PKCS8
wolfSSL Entering wc_CreatePKCS8Key()
Checking size of PKCS8
wolfSSL Entering wc_CreatePKCS8Key()
Checking size of PKCS8
wolfSSL Entering wc_CreatePKCS8Key()
Checking size of PKCS8
wolfSSL Entering wc_CreatePKCS8Key()
Checking size of PKCS8
wolfSSL Entering wc_CreatePKCS8Key()
Checking size of PKCS8
wolfSSL Entering wc_CreatePKCS8Key()
Checking size of PKCS8
wolfSSL Entering wc_CreatePKCS8Key()
wolfSSL Entering GetObjectId()
wolfSSL Entering wc_ecc_make_pub
wolfSSL Entering wc_ecc_make_pub
Verify called with private key, generating public part
wolfSSL Entering GetObjectId()
wolfSSL Entering wc_PemCertToDer
wolfSSL Entering PemToDer
wolfSSL Entering GetExplicitVersion
wolfSSL Entering GetSerialNumber
Got Cert Header
wolfSSL Entering GetAlgoId
wolfSSL Entering GetObjectId()
Got Algo ID
Getting Cert Name
Getting Cert Name
Got Subject Name
wolfSSL Entering GetAlgoId
wolfSSL Entering GetObjectId()
wolfSSL Entering GetObjectId()
Got Key
Parsed Past Key
wolfSSL Entering DecodeCertExtensions
wolfSSL Entering GetObjectId()
wolfSSL Entering DecodeSubjKeyId
wolfSSL Entering GetObjectId()
wolfSSL Entering DecodeAuthKeyId
wolfSSL Entering GetObjectId()
wolfSSL Entering DecodeBasicCaConstraint
wolfSSL Entering GetObjectId()
wolfSSL Entering DecodeKeyUsage
wolfSSL Entering GetAlgoId
wolfSSL Entering GetObjectId()
ECC      test passed!
logging  test passed!
mutex    test passed!
memcb    test passed!
Test complete
wolfSSL Entering wolfSSL_Cleanup
wolfSSL Entering wolfCrypt_Cleanup

All tests passed!

Re: Benchmark runs 50x times slower with Visual Studio than Cygwin

I used wolfSSL version 4.0.0 for all of the builds (Cygwin and Visual Studio 17/15).

Re: Benchmark runs 50x times slower with Visual Studio than Cygwin

Hi johncarroll944,

Thanks for reaching out about this issue. Did you build the Visual Studio projects in release mode or in debug mode? There are a few items here that would impact performance.

1) Cygwin might be detecting the Intel assembly and turning it on. The Windows build will not auto-detect anything about the environment since it doesn't use autoconf. You could try manually defining the AES_NI setting in wolfssl/IDE/WIN/user_settings.h and then adding the Windows assembly file to the build.

2) The Cygwin build is likely setting -O[value]; the Windows project might be set up to build for size over speed.

3) Cygwin is likely NOT building in debug mode. Windows by default uses the WIN32 | Debug configuration, which should be changed to either x64 | Release or WIN32 | Release for better performance.

Warm Regards,

K