1 (edited by embedded 2017-04-15 08:54:40)

Topic: WOLFMQTT_NONBLOCK

Hi,

I'm porting wolfMQTT + wolfSSL (AWS) to a microcontroller.
I started from the example "awsiot.c". In blocking mode the library works properly, but with WOLFMQTT_NONBLOCK there are some issues with the timeout of the read function.

My port of mqttnet.c, return values of NetRead():

    return MQTT_CODE_ERROR_NETWORK; /* socket error */
    return 0;                       /* nothing to read */
    return data_len;                /* number of bytes read */
    return MQTT_CODE_ERROR_TIMEOUT; /* timeout_ms expired */

If there is no activity between the client and the server, NetRead returns MQTT_CODE_ERROR_TIMEOUT after DEFAULT_CMD_TIMEOUT_MS.

src/mqtt_socket.c

    
    if (rc == 0 || rc == MQTT_CODE_ERROR_TIMEOUT) {
        rc = WOLFSSL_CBIO_ERR_WANT_READ;
    }
    else if (rc < 0) {
        rc = WOLFSSL_CBIO_ERR_GENERAL;
    }

But the library overwrites the result with WOLFSSL_CBIO_ERR_WANT_READ both on a timeout and on "no data available".

src/mqtt_socket.c

    
    if (error == SSL_ERROR_WANT_READ) {
    #ifdef WOLFMQTT_NONBLOCK
        rc = MQTT_CODE_CONTINUE;
    #else
        rc = MQTT_CODE_ERROR_TIMEOUT;
    #endif
    }

In the next step the result is set purely from the #define, so the original timeout information is lost and the code takes the same path for "no data available" and for a timeout.

I also tried returning MQTT_CODE_CONTINUE when no data is available, but the library then sets the result to WOLFSSL_CBIO_ERR_GENERAL and of course the code stops.


Is this an issue in the library, or have I misunderstood or implemented something wrongly?
In any case, any suggestions on how to solve the issue?


Re: WOLFMQTT_NONBLOCK

Hi embedded,

In non-blocking mode a timeout should never be reached, ever. The timeout is for one specific "blocking mode" case:

BLOCKING MODE:

1. server and client are connected.
2. Server advances TLS state machine to a "read" state and enters the receive data method.
3. Client ungracefully disconnects (network goes down temporarily or something else unexpected happens)
4. Server gets stuck waiting to receive data FOREVER.
5. Insert TIMEOUT, now server will only get stuck waiting to receive until timeout is reached.
6. Now server has a way to back out and see that the client has disconnected and aborts the session.

NON-BLOCKING MODE:

1. <same as 1 in blocking mode>
2. <same as 2 in blocking mode>
3. <same as 3 in blocking mode>
4. Server checks to see if there is data. No data waiting, returns immediately in non-blocking mode.
5. No need for a timeout
6. Server always returns, whether or not there is data, sees that the client has disconnected and aborts the session.

There are two layers that play a role in non-blocking, the TCP/IP layer and the TLS layer. Could you make sure that the TCP/IP socket is also configured non-blocking at the TCP level? If you are still getting a TIMEOUT while using WOLFMQTT_NONBLOCK, that means non-blocking mode is not completely configured correctly.
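A minimal sketch of what "non-blocking at the TCP level" can look like, assuming a POSIX-style socket API (a microcontroller TCP/IP stack may expose a different call for this). The `sketch_` names and the `SKETCH_READ_*` result codes are hypothetical placeholders, not wolfMQTT values; mapping them onto the library's return codes is left to the port.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical local result codes, for illustration only. */
#define SKETCH_READ_NO_DATA  (-100)
#define SKETCH_READ_ERROR    (-1)

/* Put the descriptor into non-blocking mode. Returns 0 on success, -1 on error. */
static int sketch_set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0) {
        return -1;
    }
    return (fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) ? -1 : 0;
}

/* Non-blocking read: it never waits, so there is no timeout to report. */
static int sketch_net_read(int fd, unsigned char *buf, int buf_len)
{
    ssize_t n = recv(fd, buf, (size_t)buf_len, 0);
    if (n > 0) {
        return (int)n;              /* number of bytes read */
    }
    if (n == 0) {
        return SKETCH_READ_ERROR;   /* peer closed the connection */
    }
    if (errno == EAGAIN || errno == EWOULDBLOCK) {
        return SKETCH_READ_NO_DATA; /* nothing available yet, try again later */
    }
    return SKETCH_READ_ERROR;       /* genuine socket error */
}
```

With the socket configured this way, a read with no pending data returns immediately instead of waiting for the command timeout.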


Best Regards,

Kaleb

Re: WOLFMQTT_NONBLOCK

Hi embedded,

One other note from our MQTT developer:

Also if you have WOLFMQTT_NONBLOCK defined then you'll need to handle the MQTT_CODE_CONTINUE response type.
- D.G. wolfSSL Embedded Engineer
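As a rough illustration of handling the "continue" response, the sketch below keeps calling an operation until it stops reporting that it needs another call. Everything here is a stand-in: `sketch_poll_until_done`, `sketch_fake_step` and the `SKETCH_CODE_*` values are hypothetical, and a real application would call the wolfMQTT client functions and use the constants from the library headers.

```c
/* Stand-in values for illustration; use the real constants from the library. */
#define SKETCH_CODE_SUCCESS   0
#define SKETCH_CODE_CONTINUE  (-101)

typedef int (*sketch_step_fn)(void *ctx);

/* Call step() until it returns something other than CONTINUE,
 * or until max_polls calls have been made. Returns the last result. */
static int sketch_poll_until_done(sketch_step_fn step, void *ctx, int max_polls)
{
    int rc = SKETCH_CODE_CONTINUE;
    int i;
    for (i = 0; i < max_polls && rc == SKETCH_CODE_CONTINUE; i++) {
        rc = step(ctx);
        /* A real application would service other tasks here
         * instead of spinning. */
    }
    return rc;
}

/* Fake operation: reports CONTINUE while *remaining > 0, then SUCCESS. */
static int sketch_fake_step(void *ctx)
{
    int *remaining = (int *)ctx;
    if (*remaining > 0) {
        (*remaining)--;
        return SKETCH_CODE_CONTINUE;
    }
    return SKETCH_CODE_SUCCESS;
}
```

The point is only that a "continue" result means "call the same function again later", not "the operation failed".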


Best Regards,

Kaleb

4 (edited by embedded 2017-04-25 11:11:24)

Re: WOLFMQTT_NONBLOCK

First of all, thank you for your support and suggestions, but some doubts remain.

Kaleb J. Himes wrote:

Hi embedded,

In non-blocking mode a timeout should never be reached, ever. The timeout is for one specific "blocking mode" case:

If NetRead never raises the timeout, how can MqttClient_WaitType restore msg->stat to MQTT_MSG_BEGIN?

If msg->stat is set to MQTT_MSG_WAIT, the MQTT client is stuck in the wait state and cannot start any new action (such as a publish).

    rc = MqttPacket_Read(client, client->rx_buf, client->rx_buf_len, timeout_ms);
    if (rc <= 0) {
        if (rc != MQTT_CODE_CONTINUE) {
            msg->stat = MQTT_MSG_BEGIN;
        }
        return rc;
    }

It is C, so of course it is possible to force the internal state of the MQTT library from the application layer, but that looks ugly and the library is not designed to be used that way. I'm trying to use your library the right way.

Currently I'm looking at this latest version:

wolfMQTT v0.12 (12/20/16) edit

wolfMQTT v0.11 (11/28/16)
wolfSSL (Formerly CyaSSL) Release 3.10.2 (2/10/2017)


Re: WOLFMQTT_NONBLOCK

Another strange behavior with WOLFMQTT_NONBLOCK defined is in some of the write functions.

Just one example:

File mqtt_client.c

    /* Send packet */
    msg->stat = MQTT_MSG_BEGIN;
    rc = MqttPacket_Write(client, client->tx_buf,
                          client->packet.buf_len);

After the client has received a message from the broker, it sends an ACK in response to the publish (according to the QoS level).
In NONBLOCK mode, the library should not proceed to the next step until MqttPacket_Write has completed.
In this case the library sets the next status to MQTT_MSG_BEGIN regardless of the value returned by the function (CONTINUE, SUCCESS, ERROR).
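To make the expectation concrete, here is a sketch of the behavior described above: the message state goes back to "begin" only once the write has stopped reporting "continue". This is not the library's code; `SketchMsgStat`, `sketch_send_ack`, `sketch_fake_write` and the `SKETCH_CODE_*` values are hypothetical stand-ins.

```c
/* Stand-in values for illustration only. */
#define SKETCH_CODE_SUCCESS   0
#define SKETCH_CODE_CONTINUE  (-101)

typedef enum {
    SKETCH_MSG_BEGIN,
    SKETCH_MSG_WRITE
} SketchMsgStat;

typedef int (*sketch_write_fn)(void *ctx);

/* Send an ACK without losing track of a partially completed write. */
static int sketch_send_ack(SketchMsgStat *stat, sketch_write_fn write_fn,
                           void *ctx)
{
    int rc;
    *stat = SKETCH_MSG_WRITE;          /* remember that a write is pending */
    rc = write_fn(ctx);
    if (rc != SKETCH_CODE_CONTINUE) {
        *stat = SKETCH_MSG_BEGIN;      /* finished (success or error) */
    }
    return rc;
}

/* Fake write: needs *remaining extra calls before it completes. */
static int sketch_fake_write(void *ctx)
{
    int *remaining = (int *)ctx;
    if (*remaining > 0) {
        (*remaining)--;
        return SKETCH_CODE_CONTINUE;
    }
    return SKETCH_CODE_SUCCESS;
}
```

With this shape, a caller that gets "continue" can call the same function again and resume in the write state instead of starting over.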

I'm not sure whether I totally misunderstand the workflow with WOLFMQTT_NONBLOCK or the library is NOT stable and/or NOT fully tested.


Re: WOLFMQTT_NONBLOCK

Hi,

Is there a reason you haven't updated to the latest v0.12 wolfMQTT?
https://github.com/wolfSSL/wolfMQTT/releases/tag/v0.12

I believe the issue you are seeing has been addressed in one of these commits:
https://github.com/wolfSSL/wolfMQTT/com … b726821896
https://github.com/wolfSSL/wolfMQTT/com … 86ccb31caa

Also a MQTT_CODE_ERROR_TIMEOUT is considered a failure, is there a reason you aren't returning the MQTT_CODE_CONTINUE return code in non-blocking mode from your NetRead function?

Thanks,
David Garske, wolfSSL


Re: WOLFMQTT_NONBLOCK

Hi David,

I'm very sorry, that was my mistake: I am currently using v0.12. You can check by looking at the code in my first post, which is extracted from v0.12.

I see that you are one of the authors of wolfMQTT; thank you for all the effort from you and your team.

dgarske wrote:

Also a MQTT_CODE_ERROR_TIMEOUT is considered a failure

If the timeout is a failure, why in the example file (mqttclient.c) are the publish of a new message and the ping placed under the condition:
if (rc == MQTT_CODE_ERROR_TIMEOUT)
?

mqttclient.c

                rc = MqttClient_WaitMessage(&mqttCtx->client,
                                                    mqttCtx->cmd_timeout_ms);
...................

                /* check return code */
                if (rc == MQTT_CODE_CONTINUE) {
                    return rc;
                }
                else if (rc == MQTT_CODE_ERROR_TIMEOUT) {

...................

                        rc = MqttClient_Publish(&mqttCtx->client, &mqttCtx->publish);

...................

                        rc = MqttClient_Ping(&mqttCtx->client);

...................
                }

Does it send a ping or a publish only in case of failure? Am I misunderstanding?


dgarske wrote:

is there a reason you aren't returning the MQTT_CODE_CONTINUE return code in non-blocking mode from your NetRead function?

Yes:

mqtt_socket.c

static int MqttSocket_TlsSocketReceive(WOLFSSL* ssl, char *buf, int sz,
    void *ptr)
{
    int rc;
    MqttClient *client = (MqttClient*)ptr;
    (void)ssl; /* Not used */
    rc = client->net->read(client->net->context, (byte*)buf, sz,
        client->cmd_timeout_ms);
    if (rc == 0) {
        rc = WOLFSSL_CBIO_ERR_WANT_READ;
    }
    else if (rc < 0) {
        rc = WOLFSSL_CBIO_ERR_GENERAL;
    }
    return rc;
}

If NetRead returns MQTT_CODE_CONTINUE (i.e. -101), MqttSocket_TlsSocketReceive returns WOLFSSL_CBIO_ERR_GENERAL, and THIS for sure is a failure.

Return cases in my NetRead implementation:

int NetRead(void *context, byte* buf, int buf_len,  int timeout_ms)

case [ nothing to read, continue ]:
    return 0;
case [ bytes read from the incoming buffer ]:
    return data_len;
case [ network error (e.g. socket closed, overflow, etc.) ]:
    return MQTT_CODE_ERROR_NETWORK;
case [ timeout_ms expired ]:
    return MQTT_CODE_ERROR_TIMEOUT;

If I have totally misunderstood the correct return cases, could you please tell me what the right ones are?


Re: WOLFMQTT_NONBLOCK

Hi embedded,

I recognize the gap in getting return codes from NetRead back to the caller via the wolfSSL layer. There is a fix I have in mind to pass that through the callback context. I'll push a branch/pull request to GitHub shortly to better demonstrate this. I'll let you know when it's available.
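One way the idea of passing the code through the callback context could look is sketched below. This is only an illustration under stated assumptions, not the change that was pushed: `SketchIoCtx`, `sketch_tls_receive`, `sketch_resolve`, `sketch_fake_read` and all `SKETCH_*` values are hypothetical stand-ins for the wolfSSL/wolfMQTT types and constants.

```c
/* Stand-in values for illustration; take the real ones from the headers. */
#define SKETCH_CBIO_ERR_GENERAL    (-1)
#define SKETCH_CBIO_ERR_WANT_READ  (-2)
#define SKETCH_CODE_ERROR_TIMEOUT  (-7)
#define SKETCH_CODE_ERROR_NETWORK  (-8)
#define SKETCH_CODE_CONTINUE       (-101)

typedef int (*sketch_read_fn)(void *net_ctx, unsigned char *buf, int len);

typedef struct {
    sketch_read_fn read;  /* port-specific network read */
    void *net_ctx;        /* context handed to read() */
    int last_rc;          /* raw code from the most recent read() */
} SketchIoCtx;

/* TLS receive callback: the TLS layer only understands its own codes,
 * so remember the original result in the context before translating it. */
static int sketch_tls_receive(SketchIoCtx *io, unsigned char *buf, int len)
{
    int rc = io->read(io->net_ctx, buf, len);
    io->last_rc = rc;
    if (rc == 0 || rc == SKETCH_CODE_CONTINUE ||
        rc == SKETCH_CODE_ERROR_TIMEOUT) {
        return SKETCH_CBIO_ERR_WANT_READ;
    }
    if (rc < 0) {
        return SKETCH_CBIO_ERR_GENERAL;
    }
    return rc;
}

/* After the TLS read returns, recover what the network layer really said. */
static int sketch_resolve(const SketchIoCtx *io, int tls_rc)
{
    if (tls_rc == SKETCH_CBIO_ERR_WANT_READ) {
        return (io->last_rc == SKETCH_CODE_ERROR_TIMEOUT)
            ? SKETCH_CODE_ERROR_TIMEOUT : SKETCH_CODE_CONTINUE;
    }
    if (tls_rc < 0) {
        return (io->last_rc < 0) ? io->last_rc : SKETCH_CODE_ERROR_NETWORK;
    }
    return tls_rc;
}

/* Fake network read that returns whatever code *net_ctx holds. */
static int sketch_fake_read(void *net_ctx, unsigned char *buf, int len)
{
    (void)buf;
    (void)len;
    return *(int *)net_ctx;
}
```

Keeping the raw code in the context lets the caller tell "no data yet" apart from "timed out", even though both cross the TLS layer as the same want-read value.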

Thanks,
David Garske, wolfSSL


Re: WOLFMQTT_NONBLOCK

Hi embedded,

I've done some testing and found a bug and also pushed some cleanup to this PR:
https://github.com/wolfSSL/wolfMQTT/pull/24

Let me know your results with this.

This should get merged into master this week (it's in peer review). Then we have a v0.13 release planned early next week.

Thanks,
David Garske, wolfSSL
