I am posting these results as they may be helpful to others building on an embedded Cortex M platform.

I had a working project built with Segger Embedded Studio for a Nordic NRF52840 (64MHz Cortex M). We switched our build tooling to cmake/ninja along with the newlib-nano libraries packaged with the ARM GCC tools. One thing I noticed is that the TLS handshake in our application went from 3 seconds to about 16 seconds.

Using the built-in wolf test functions, I was able to track down the issue.

While Segger Embedded Studio uses GCC, they have their own builds of the C standard libraries. The wolfcrypt ECC routines are very sensitive to how the C libraries are built.

Results:

I could eventually get *better* performance than the SES custom library, but I had to use Newlib and not Newlib-nano. This results in a larger binary, but the performance is much better.


wolf test time:

Segger Embedded Studio w/ custom std libs: 208 seconds
ARM GCC Embedded w/ Newlib-Nano: 608 seconds
ARM GCC Embedded w/ Newlib: 202 seconds
ARM GCC Embedded w/ Newlib & WOLFSSL_SP_ARM_CORTEX_M_ASM: 175 seconds


In addition to using Newlib, I found the macro WOLFSSL_SP_ARM_CORTEX_M_ASM helped quite a bit. (It looks like it enables hand-tuned ASM routines for SP math.)


Apparently newlib is built with speed optimizations and newlib-nano is built with size optimizations. I was surprised by the stark difference. I have other math-heavy routines where the difference in execution speed between -O3 and -Os is only 5%. It seems that the ECC SP routines have some sensitivity to the C library.
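
For reference, a settings fragment along these lines is one way to turn on the SP assembly path. Treat it as a sketch rather than my exact configuration: WOLFSSL_SP_ARM_CORTEX_M_ASM is the macro from the results above, while pairing it with WOLFSSL_HAVE_SP_ECC is an assumption about what an ECC-focused build would need, so check both against the wolfSSL documentation for your version.

```c
/* user_settings.h fragment (sketch, see assumptions in the text above) */

/* Assumption: route the ECC math through the SP (single precision) code. */
#define WOLFSSL_HAVE_SP_ECC

/* From the results above: use the hand-tuned Cortex-M assembly routines. */
#define WOLFSSL_SP_ARM_CORTEX_M_ASM
```

Whether you link Newlib or Newlib-nano is decided in the toolchain/linker setup, not in this header.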


Hope this is helpful to anyone building for a Cortex M MCU.

2

(2 replies, posted in wolfMQTT)

Just an update. I have been running continuously for a couple of weeks now. I believe the upgrade to V1.2, which handles MQTT_CODE_CONTINUE correctly in MqttSocket_TlsSocketSend, fixed the problem.

3

(2 replies, posted in wolfMQTT)

A few things as an update

1.)   I am using the library with non-blocking IO enabled
2.)   I was actually using V1.1, not the latest V1.2


I set up a test that would just continually connect, publish one message and then disconnect. In this scenario I noticed I could never get the hardfault (which was the result of a bad malloc/free).

After some more investigation I found a few more things.

a.)  There were instances where my cell modem wasn't actually closing the TCP connection.   This was causing issues when the state machine would restart.

After fixing a.), the actual bug became easier to reproduce (I could get a hardfault during a malloc on almost every reconnect cycle).

I reviewed my IO calls in mqttnet.c and realized that my net_write routine would sometimes actually block until I got an acknowledgement from my modem. I re-implemented the logic to return MQTT_CODE_CONTINUE until I got my response.

That is when I found a memory leak. MqttSocket_TlsSocketSend would return WOLFSSL_CBIO_ERR_WANT_WRITE on an rc of 0 or MQTT_CODE_ERROR_TIMEOUT, but not on MQTT_CODE_CONTINUE. When I traced the call stack, it looked like the connect/hello packet for the TLS connection would malloc many times while I returned MQTT_CODE_CONTINUE from my net_write function. I could make this happen on demand by either forcing my net_write to block, or making it non-blocking and returning MQTT_CODE_CONTINUE.
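
To make the difference concrete, here is a minimal sketch of the return code mapping I would expect from a TLS send callback. It is not the wolfMQTT source: the function name is made up, and the SKETCH_ constants are stand-ins so the snippet compiles on its own. In a real build, use the constants from the wolfMQTT and wolfSSL headers.

```c
/* Stand-in values for illustration only; a real build should take
 * these constants from the wolfMQTT and wolfSSL headers. */
enum {
    SKETCH_MQTT_CODE_CONTINUE      = -101,
    SKETCH_MQTT_CODE_ERROR_TIMEOUT = -7,
    SKETCH_MQTT_CODE_ERROR_NETWORK = -8
};
enum {
    SKETCH_CBIO_ERR_GENERAL    = -1,
    SKETCH_CBIO_ERR_WANT_WRITE = -2
};

/* Translate the result of a net_write call into the value a TLS send
 * callback hands back to the TLS layer (hypothetical helper). */
int sketch_map_send_rc(int rc)
{
    if (rc > 0) {
        return rc; /* bytes actually written */
    }
    if (rc == 0 ||
        rc == SKETCH_MQTT_CODE_ERROR_TIMEOUT ||
        rc == SKETCH_MQTT_CODE_CONTINUE) {
        /* Nothing sent yet: ask the TLS layer to retry the same write. */
        return SKETCH_CBIO_ERR_WANT_WRITE;
    }
    return SKETCH_CBIO_ERR_GENERAL; /* hard failure */
}
```

The important part is that the "continue" code joins 0 and timeout in the "want write" branch instead of falling through to the general error.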

I looked through the latest code on github and saw that the latest version handles MQTT_CODE_CONTINUE in MqttSocket_TlsSocketSend correctly.

I integrated V1.2 of WolfMQTT and started testing again. Hopefully this version fixes the issue.

I do still plan on looking at static allocation in WolfSSL as well.

4

(2 replies, posted in wolfMQTT)

I have successfully gotten WolfMQTT and WolfSSL to communicate with Azure IOT hub through a cell modem. For some time I have been working on adding robustness to the software: the cellular modem can frequently lose its connection, so I have to gracefully terminate and restart. (NRF52 is the platform.)

My system transmits once a minute and I have been letting it run to see where potential hangups are.

1.)    The TCP connection goes down every few hours. I modified the MQTT state machine to go through the WMQ_DISCONNECT and WMQ_NET_DISCONNECT states and then restart at WMQ_BEGIN.

2.) It looks like MqttClient_NetDisconnect() calls functions to free resources used by WolfSSL.

3.)   After about 24 hours I always get an error in the TLS setup.   Today it was:

MqttSocket_TlsConnect Error -1: Num -112, mp_exptmod error state


4.)  Other times I never get through the TLS connection as it always returns a "CONTINUE".  It takes about 24 hours (roughly 50 or 60 restarts) for this error to pop up.

5.)   Since it takes so long to occur, it is hard to capture statistics, but today I was able to attach a debugger. What was interesting is that my IRQ routines were still running (there were diagnostic messages from the modem on a serial terminal).

In every case where there was an issue it was locked up in the std C lib "Free" function.  Unfortunately I did not have any more data from the call stack.

6.)   I believe the only place where my code does malloc/free is in WolfSSL. WolfMQTT did have a malloc for its context struct, but I changed it to a static allocation.

Right now I am trying to make this problem happen more quickly, but I think there is something going wrong with freeing resources. I am going to try to fix the state machine so it keeps disconnecting and reconnecting, to see if the issue still shows up.

I am also going to look at using static allocation for WolfSSL (currently my stack is 65k and the heap is 72k; the WolfSSL test functions pass OK).

I am posting this to see if there is any info on "mp_exptmod error state".   I'll post updates as this might be useful to others.

I just wanted to post that I have things working on my embedded platform. There were several issues, but I wanted to post some notes to help someone else who may be doing something similar.


1.)   Azure cares about correct time.    I was trying to work around not having unix time but it is needed for the connection string.  I had to write the code to ping a NIST time server.

2.)   The MQTT state machine does not report an error when a connection fails. In my case I was getting a return code of "5" in the MQTT CONN ACK packet. This turned out to be an authentication issue.
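
For anyone else decoding that value: MQTT 3.1.1 defines the CONNACK return codes, and 5 means "not authorized". A small lookup like the sketch below makes the debug output easier to read. The function name is hypothetical and not part of WolfMQTT.

```c
/* Map an MQTT 3.1.1 CONNACK return code to a readable string.
 * Hypothetical helper for debug output, not a WolfMQTT function. */
const char *sketch_connack_str(int code)
{
    switch (code) {
        case 0:  return "Connection accepted";
        case 1:  return "Refused: unacceptable protocol version";
        case 2:  return "Refused: identifier rejected";
        case 3:  return "Refused: server unavailable";
        case 4:  return "Refused: bad user name or password";
        case 5:  return "Refused: not authorized";
        default: return "Reserved/unknown";
    }
}
```

Checking the CONNACK return code right after the connect call, instead of only the connect result, would have pointed me at the authentication problem much sooner.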

The root cause of the issue was my embedded (sn)printf library not handling long long formatting. As a result, the connection string was malformed.
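
If your embedded printf has the same limitation, one workaround is to format the 64-bit value by hand and append it with %s. A minimal sketch follows; the function name is hypothetical and this is not necessarily how I fixed it in my project.

```c
#include <stddef.h>
#include <stdint.h>

/* Write value as decimal text into out (NUL-terminated).
 * Returns the number of digits written, or 0 if out is NULL or too small. */
size_t sketch_u64_to_dec(uint64_t value, char *out, size_t out_size)
{
    char tmp[20]; /* UINT64_MAX has 20 decimal digits */
    size_t n = 0;

    /* Collect digits least significant first. */
    do {
        tmp[n++] = (char)('0' + (int)(value % 10u));
        value /= 10u;
    } while (value != 0u);

    if (out == NULL || out_size < n + 1u) {
        return 0;
    }
    /* Reverse into the caller's buffer. */
    for (size_t i = 0; i < n; i++) {
        out[i] = tmp[n - 1u - i];
    }
    out[n] = '\0';
    return n;
}
```

The resulting string can then go into the expiry field of the connection string without relying on %lld support.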


related:

From #1 & 2 in my original post.

The defines OPENSSL_EXTRA and WOLFSSL_SNIFFER were causing the differences in packet sizes and the (null) domain name. They were not used on the embedded platform as they seemed to need a lot of OS support.

Hello:

I have successfully gotten the WolfMQTT Azure example working in Windows with some custom network hardware (an embedded cell modem). I am in the process of porting to my embedded platform and ran into an issue in the MQTT state machine when subscribing to messages from the IOT Hub.

On my embedded platform,  WolfSSL/WolfCrypt is compiling OK and all the unit tests are passing.

I can monitor debug strings from both the Windows and embedded versions, and I noted 3 differences that I cannot find the source of:

1.) Once the MQTT state machine successfully makes a TCP connection,   I get a debug message that the TLS setup has been initiated:

MQTT TLS Setup (1)

At this point,  WolfSSL sends out a “hello” message
a.)    On my Windows version,   I can see the Netwrite function (mqttnet.c) send a 170 byte packet.
b.)    On the embedded build,    the netwrite function sends 146 bytes.

What would cause this difference? (My WolfSSL settings for each platform are at the bottom of this message.)


2.)     After the hello packet is sent, I get the TLS Verify call backs

a.)     Windows:
MQTT TLS Verify Callback: PreVerify 0, Error 0 (none)
  Subject's domain name is (null)
MQTT TLS Verify Callback: PreVerify 0, Error 0 (none)
  Subject's domain name is (null)

b.)     Embedded:
MQTT TLS Verify Callback: PreVerify 0, Error -188 (ASN no signer error to confirm failure)
Subject's domain name is Microsoft IT TLS CA 1
Allowing cert anyways
MQTT TLS Verify Callback: PreVerify 0, Error -188 (ASN no signer error to confirm failure)



What would cause the Windows version to not get the domain name?    The difference here seems to be important.


3.)     At this point I can see transmit/receive activity that is the same on both platforms. There is a point where I get a debug message saying the MQTT connection is successful:

a.)    Windows

MQTT Connect: Success (0)
MQTT Connect Ack: Return Code 0, Session Present 0

b.)     Embedded

MQTT Connect: Success (0)
MQTT Connect Ack: Return Code 5, Session Present 0

Notice the difference in return code.  I could not determine what a “5” means in this context.

When the state machine tries to subscribe to messages coming back from the IOT hub,  the embedded version hangs waiting for a response.    I think there is something going on in #1 and #2 but can’t figure out what to instrument next.

Do you have any ideas on what I can try to debug?  It seems that there may be some compile option that is different between the 2 platforms.



Windows WolfSSL Config

   

        #define OPENSSL_EXTRA
        #define WOLFSSL_RIPEMD
        #define WOLFSSL_SHA512
        #define NO_PSK
        #define HAVE_EXTENDED_MASTER
        #define WOLFSSL_SNIFFER
        #define HAVE_TLS_EXTENSIONS
        #define HAVE_SECURE_RENEGOTIATION


        #define DEBUG_WOLFSSL

        #define HAVE_AESGCM
        #define WOLFSSL_SHA384
        #define WOLFSSL_SHA512

        #define HAVE_SUPPORTED_CURVES
        #define HAVE_TLS_EXTENSIONS

        #define HAVE_ECC
        #define ECC_SHAMIR
        #define ECC_TIMING_RESISTANT


Embedded WolfSSL config

   

        #define WOLFSSL_NRF51
        #define WOLFSSL_USER_IO
        #define DEBUG_WOLFSSL
        #define SIZEOF_LONG 4
        #define SIZEOF_LONG_LONG 8
        #define NO_DEV_RANDOM
        #define NO_FILESYSTEM
        #define NO_MAIN_DRIVER
        #define NO_WRITEV
        #define NO_FILESYSTEM

        #define HAVE_ECC
        #define ECC_SHAMIR
        #define ECC_TIMING_RESISTANT

        #define HAVE_TLS_EXTENSIONS
        #define HAVE_SECURE_RENEGOTIATION
        #define HAVE_AESGCM
        #define HAVE_SUPPORTED_CURVES
        #define HAVE_TLS_EXTENSIONS
        #define HAVE_EXTENDED_MASTER

        #define USE_FAST_MATH
        #define USE_WOLFSSL_MEMORY
        

        /* Enables blinding mode, to prevent timing attacks */
        #define WC_RSA_BLINDING
        #define SINGLE_THREADED
        
        #define WOLFSSL_SHA384
        #define WOLFSSL_SHA512
        #define WOLFSSL_RIPEMD
        #define WOLFSSL_SHA512
        #define WOLFSSL_USER_IO
        #define WOLFSSL_BASE64_ENCODE

        #define ECC_SHAMIR
        #define ECC_TIMING_RESISTANT

        #define SINGLE_THREADED
        
        #if !defined(USE_CERT_BUFFERS_2048) && !defined(USE_CERT_BUFFERS_4096)
            #define USE_CERT_BUFFERS_1024
        #endif

        #define BENCH_EMBEDDED

7

(3 replies, posted in wolfSSL)

Thank you for the detailed analysis as well as the additional compile options to reduce size. I am unsure why particular ciphers are used when opening SSL connections to these cloud services (AWS, Azure), as I am not familiar with the details of SSL/TLS. This data helps, so I can play it safe and make sure to allocate plenty of memory to the heap. I do plan on trying the static memory options once I get the end-to-end system functional. It looks like I just trade heap for stack, but I am glad the library has all these options to play with.

8

(3 replies, posted in wolfSSL)

I am currently porting WolfMQTT and WolfSSL to an embedded platform using the Azure IOT hub example. I started by getting everything working with the Windows-based examples and am now getting the build working on the embedded platform (NRF52840 - Cortex M4). (Everything works OK in the Windows version with a cell modem!)

I have wolfSSL building with the embedded tools and I am running the test routines. I am at the point where I can pass all the tests with my current settings (see below). I noticed that I have to set aside quite a bit of heap and stack for the tests to pass (90k of heap and 64k of stack). The stack/heap settings can be trimmed down a bit, but it took some (slow) trial and error just to get the tests to pass.

Question 1:

Is it a good assumption that wolfSSL will need this amount of heap and stack at runtime (MQTT with TLS to Azure)? It is quite a bit of memory (I have 256k on my platform). My guess is that most of the memory usage is just for the mechanics of the tests.

I am considering moving to another CPU to get some external SDRAM to be safe but want to get a true feel for the resource usage. 
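
One way I could measure this instead of guessing is to wrap the allocator and record a high-water mark. The sketch below is self-contained; all sketch_ names are hypothetical. wolfSSL has a wolfSSL_SetAllocators() hook for installing custom allocators, and I am assuming here that it accepts plain malloc/free/realloc style callbacks, which depends on the memory-related build options.

```c
#include <stddef.h>
#include <stdlib.h>

/* Header stored in front of every block so the size is known on free.
 * The union keeps the payload suitably aligned. */
typedef union {
    size_t size;
    max_align_t align;
} sketch_hdr_t;

static size_t sketch_heap_now;  /* payload bytes currently allocated */
static size_t sketch_heap_peak; /* high-water mark of sketch_heap_now */

static void sketch_account(size_t freed, size_t added)
{
    sketch_heap_now = sketch_heap_now - freed + added;
    if (sketch_heap_now > sketch_heap_peak) {
        sketch_heap_peak = sketch_heap_now;
    }
}

void *sketch_malloc(size_t size)
{
    sketch_hdr_t *h = malloc(sizeof(*h) + size);
    if (h == NULL) {
        return NULL;
    }
    h->size = size;
    sketch_account(0, size);
    return h + 1; /* payload starts right after the header */
}

void sketch_free(void *ptr)
{
    if (ptr == NULL) {
        return;
    }
    sketch_hdr_t *h = (sketch_hdr_t *)ptr - 1;
    sketch_account(h->size, 0);
    free(h);
}

void *sketch_realloc(void *ptr, size_t size)
{
    if (ptr == NULL) {
        return sketch_malloc(size);
    }
    sketch_hdr_t *old = (sketch_hdr_t *)ptr - 1;
    size_t old_size = old->size; /* read before realloc may move the block */
    sketch_hdr_t *h = realloc(old, sizeof(*h) + size);
    if (h == NULL) {
        return NULL; /* original block is still valid and still counted */
    }
    h->size = size;
    sketch_account(old_size, size);
    return h + 1;
}

size_t sketch_heap_in_use(void) { return sketch_heap_now; }
size_t sketch_heap_peak_bytes(void) { return sketch_heap_peak; }
```

Printing sketch_heap_peak_bytes() after a full TLS handshake, rather than after the test suite, should give a more realistic number for the runtime heap requirement. The count excludes the per-block header and the allocator's own overhead, so leave some margin.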

Question 2:

The tests take about 10 minutes (64MHz cortex M4).   Most of that is in the ECC test.  Is this to be expected?

My settings:

#ifdef WOLFSSL_NRF5x
 
        #define WOLFSSL_NRF51
        #define WOLFSSL_USER_IO
 
        #define DEBUG_WOLFSSL
        #define SIZEOF_LONG 4
        #define SIZEOF_LONG_LONG 8
 
        #define NO_ASN_TIME
        #define NO_DEV_RANDOM
        #define NO_FILESYSTEM
        #define NO_MAIN_DRIVER
        #define NO_WRITEV
        #define NO_FILESYSTEM
        #define NO_SESSION_CACHE
        #define NO_PSK
       
        #define HAVE_ECC
        #define HAVE_TLS_EXTENSIONS
        #define HAVE_SECURE_RENEGOTIATION
        #define HAVE_AESGCM
        #define HAVE_SUPPORTED_CURVES
        #define HAVE_TLS_EXTENSIONS
        #define HAVE_EXTENDED_MASTER

        #define USE_FAST_MATH
        #define USE_WOLFSSL_MEMORY
        
        #define TFM_TIMING_RESISTANT
        #define SINGLE_THREADED
        
        #define WOLFSSL_SHA384
        #define WOLFSSL_SHA512
        #define WOLFSSL_RIPEMD
        #define WOLFSSL_SHA512
        #define WOLFSSL_USER_IO
        #define WOLFSSL_BASE64_ENCODE

        #define ECC_SHAMIR
        #define ECC_TIMING_RESISTANT

        #define SINGLE_THREADED
        
        #if !defined(USE_CERT_BUFFERS_2048) && !defined(USE_CERT_BUFFERS_4096)
            #define USE_CERT_BUFFERS_2048
        #endif

        #define BENCH_EMBEDDED
    
      
#endif

9

(3 replies, posted in wolfMQTT)

Thanks for the info. We are in development mode, so this will work for the time being.


I am plumbing this code into a cell modem so I am trying to figure out how much code I actually have to write.

10

(3 replies, posted in wolfMQTT)

Updates.

1.)    I did get a proper x64 build set up for the core WolfSSL library (which solved #2), but I still needed to work around the wc_GetTime problem.

3.) I was able to get data through. One thing compounding the errors was that my internet connection was spotty today. Several websites kept reporting SSL/TLS problems.


At this point things are working well enough that I can start thinking about porting the IO layer and getting it working on the embedded platform. It would be nice to know what is going on with the time function.


Related to the time, is it absolutely necessary to have correct UTC time? My embedded system does not have an RTC.

11

(3 replies, posted in wolfMQTT)

Hello:

I am trying to run the AzureIOT Hub client using the latest versions of WolfSSL and WolfMQTT from the latest snapshot on github.


1.)  (Build Related) 

It seems there are some possible issues in the build system under Windows. The solution for the 64-bit build of WolfSSL still builds 32-bit libraries.


Also, when I set up the project for a 32-bit build, calls to wc_GetTime corrupt the stack. I get a

"Run-Time Check Failure #2 - Stack around the variable 'lTime' was corrupted."

around line 245 in azureiothub.c


The only workaround was to add a line like this:

lTime = XTIME(0);

and comment out the call on line 195 of azureiothub.c


2.)   After #1,   I found that I got an exception in "external.c" (see attached png).

The only fix was to copy the source files for WolfSSL into the MQTT library project and build.

3.)

After getting the projects/libraries to build, I ran azureiothub.exe from Visual Studio 2017 with a debugger attached. The example seems to hang:

AzureIoTHub Client: QoS 1, Use TLS 1
MQTT Net Init: Success (0)
SharedAccessSignature sr=DEMO-IOT-HUB2.azure-devices.net%2fdevices%2fwolftest&sig=HMI1IlcMI5C5%2be8XRxsRMulI%2fXZ2nhqIYABQbmP25eA%3d&se=1544729501
MQTT Init: Success (0)
MQTT TLS Setup (1)


I did change the URI for my own azure account as well as the key.     I got the same results with the stock settings.


When I look in wireshark I can see some DNS queries go out for DEMO-IOT-HUB2.azure-devices.net but nothing else.

What would be the next best step to debug this issue?