Topic: TLSv1.3 bad_record_mac alert

We're using WolfSSL v4.4.0 with the Atmel port, and the WolfSSL peer is sending a bad_record_mac alert when trying to do TLSv1.3 with a HAProxy 2.0.12 server.  Our WolfSSL product is an embedded one which uses a Microchip (formerly Atmel) cryptographic hardware chip.

We see the alert when the WolfSSL peer downloads ~500KiB over HTTPS.  The alert manifests itself in different ways:

  • When it occurs during connection establishment, we get a "bad certificate" error

  • During data transfer, we get a TCP connection reset as the peer side shuts down its end of the connection

I have not confirmed, but I believe the WolfSSL peer is always the one failing to validate the MAC of a message and responding with an alert sent to the HAProxy peer.

The anomaly occurs pretty frequently when the WolfSSL peer is communicating from $job's network (nearly 50% of transfers fail to complete), but much less frequently when communicating via my home AT&T fiber ISP (<5% of transfers fail).  The failure may occur at any point during the transfer.

Googling suggests several possible culprits:

  • One peer is using an old OpenSSL library with a known bug that causes this issue (disproven; HAProxy was built with OpenSSL v1.1.1)

  • The peer sending the bad message is using OpenSSL incorrectly in a multithreaded application (unlikely; HAProxy was designed as a multithreaded application)

  • TCP packet corruption (unlikely, as TCP checksumming should rarely allow a bad packet to pass a checksum check)
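For context on that last point, here is a minimal sketch of the 16-bit one's-complement checksum that TCP relies on. It is an illustration, not a capture of our traffic: it sums the payload only (real TCP also covers a pseudo-header and the TCP header), and the sample bytes are made up. It shows why "rarely" is not "never": the sum is order-independent, so two swapped 16-bit words produce the same checksum.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # big-endian 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


# Made-up sample bytes, for illustration only.
original = bytes.fromhex("1703030010aabbccdd")

# Swap the first two aligned 16-bit words: the data changes,
# but the checksum cannot tell the difference.
swapped = original[2:4] + original[0:2] + original[4:]

# Flip a single bit: this kind of error is always caught.
flipped = bytes([original[0] ^ 0x01]) + original[1:]

print(internet_checksum(original) == internet_checksum(swapped))
print(internet_checksum(original) == internet_checksum(flipped))
```

So a checksum pass does not strictly rule out in-transit corruption, although errors of the undetectable kind should be uncommon.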

Questions:

  1. Could a sporadic crypto operation failure cause this behavior?

  2. If the WolfSSL peer is only receiving corrupt messages, would that exonerate our WolfSSL peer from blame?

  3. Any tips on pinpointing the culprit?

Thanks in advance!

Re: TLSv1.3 bad_record_mac alert

Could a sporadic crypto operation failure cause this behavior?

Yes, it could. You mentioned using an Atmel chip; we have seen cases where offloading to hardware leads to issues if the data is not aligned the way the hardware expects.

If the WolfSSL peer is only receiving corrupt messages, would that exonerate our WolfSSL peer from blame?

If the messages are already corrupted when they arrive at the wolfSSL end, then yes, that would exonerate wolfSSL from being the culprit of the corruption.
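One way to check this, sketched below under some assumptions: if you can capture the raw TLS byte stream at both ends (for example, reassembled from packet captures on the server and next to the device), you can compare it record by record. The helper names here are hypothetical and the sample data is synthetic; the only protocol fact used is the 5-byte TLS record header (1-byte type, 2-byte version, 2-byte big-endian length).

```python
from typing import Iterator, Optional


def iter_tls_records(stream: bytes) -> Iterator[bytes]:
    """Yield each complete TLS record (header + body) found in stream."""
    offset = 0
    while offset + 5 <= len(stream):
        length = int.from_bytes(stream[offset + 3:offset + 5], "big")
        end = offset + 5 + length
        if end > len(stream):
            break  # final record is truncated; stop here
        yield stream[offset:end]
        offset = end


def first_mismatch(sent: bytes, received: bytes) -> Optional[int]:
    """Index of the first record that differs, or None if all compared records match."""
    pairs = zip(iter_tls_records(sent), iter_tls_records(received))
    for index, (a, b) in enumerate(pairs):
        if a != b:
            return index
    return None


def make_record(payload: bytes) -> bytes:
    """Build a synthetic application_data record for the demo."""
    return b"\x17\x03\x03" + len(payload).to_bytes(2, "big") + payload


# Synthetic demo: three records, with one bit flipped inside the second.
sent = make_record(b"A" * 8) + make_record(b"B" * 8) + make_record(b"C" * 8)
damaged = bytearray(sent)
damaged[20] ^= 0x01
received = bytes(damaged)

print(first_mismatch(sent, sent))
print(first_mismatch(sent, received))
```

If the two captures differ, the corruption happened before the data reached wolfSSL. If they are identical and the alert still fires, the receive path on the device (including the hardware offload) becomes the more likely suspect. Note that this sketch only compares records present in both captures, so a truncated capture can hide a later mismatch.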

Any tips on pinpointing the culprit?

You mentioned this happens very frequently from $job's network (50%) and very infrequently from a private network. Is it possible that the IT department on $job's network is monitoring, decrypting, and re-encrypting traffic in order to protect proprietary information or trade secrets?

Warm Regards,

K