Why WPA3 connections drop after 11 hours

In 2018, certification began for the first Wi-Fi devices to support the new WPA3 security protocol, and in subsequent years WPA3 became a common feature on new equipment, including routers and single-board devices like the Raspberry Pi.

But sometimes technology fails in completely unexpected and inexplicable ways. Some users have started reporting a strange bug in which WPA3 wireless connections break after 11 hours for no apparent reason.


WPA3

WPA3 (Wi-Fi Protected Access 3) is based on the cryptographic protocol Simultaneous Authentication of Equals (SAE) by Dan Harkins, the author of many RFCs. The standard adds new features to simplify Wi-Fi security, strengthen authentication, and improve the cryptographic strength and resiliency of mission-critical networks.

WPA3 offers two profiles for personal and corporate networks:

  1. WPA3-Personal provides stronger password-based authentication and protection against brute-force attacks, even for short or simple passwords. This is achieved by replacing the old Pre-Shared Key (PSK) protocol with SAE, which is a PAKE (password-authenticated key agreement) protocol. The key property of a PAKE is that a man in the middle cannot obtain enough information to carry out an "offline" brute-force attack by listening passively; the attacker has to interact with the parties to check each guess.
  2. WPA3-Enterprise imposes much stricter security requirements, using particularly strong cryptographic protocols with a minimum of 192-bit keys and the following cryptographic tools for data protection:

    • Authenticated encryption: 256-bit Galois/Counter Mode Protocol (GCMP-256)
    • Key derivation and confirmation: 384-bit Hashed Message Authentication Mode (HMAC) with Secure Hash Algorithm (HMAC-SHA384)
    • Key establishment and authentication: Elliptic Curve Diffie-Hellman (ECDH) exchange and Elliptic Curve Digital Signature Algorithm (ECDSA) signatures on a 384-bit elliptic curve
    • Robust management frame protection: 256-bit Broadcast/Multicast Integrity Protocol Galois Message Authentication Code (BIP-GMAC-256)

    Selecting the 192-bit mode ensures that all of the tools listed above are used together, providing a consistent security baseline within a WPA3 network.

Disconnecting Wi-Fi connections

The first reports of WPA3 connections dropping after a fixed period date back to 2021. The problem was discussed on the Infineon forum in October 2021.

Engineers concluded that the cause lies in the incorrect operation of the Broadcom/Cypress/Infineon chipset in WPA3 mode. If the security level is lowered to WPA2, everything works fine. The discussion continued for almost a year and ended in August 2022 without a resolution.



Raspberry Pi Pico WH with Infineon CYW43439 wireless chip that supports IEEE 802.11 b/g/n and Bluetooth 5.2

Two and a half years later, the bug still has not been completely fixed. Raspberry Pi owners continue to report similar Wi-Fi connection drops after 11 hours of operation in WPA3 mode. The events in the logs look something like this:

01:03:51 NetworkManager [...]: new IWD device state is connected
[...]
12:04:39 iwd[...]: Received Deauthentication event, reason: 0, from_ap: false
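The gap between the two log entries above can be checked with a few lines of Python. This is only a sketch: the helper name `elapsed_seconds` is made up for illustration, and it assumes both timestamps belong to the same 24-hour window, since the excerpt shows no dates.

```python
from datetime import datetime, timedelta

def elapsed_seconds(start: str, end: str) -> int:
    """Seconds between two HH:MM:SS log timestamps.

    Assumes the second timestamp falls within 24 hours after the first
    (the log excerpt does not include dates).
    """
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    if delta < timedelta(0):
        delta += timedelta(days=1)  # the connection crossed midnight
    return int(delta.total_seconds())

uptime = elapsed_seconds("01:03:51", "12:04:39")
hours, rest = divmod(uptime, 3600)
minutes, seconds = divmod(rest, 60)
print(uptime, f"{hours}h {minutes}m {seconds}s")  # 39648 11h 0m 48s
```

In this particular excerpt the connection lasted 39,648 seconds, that is, 48 seconds past the 11-hour mark.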

Probably the chipset developers decided that the bug is not critical enough to justify spending resources on tracking down the cause. After all, this kind of reverse engineering is far from trivial.

Commenters on the HN developer forum suggest that the bug may be related to the rekey interval. For example, if the interval is set to 3600 s, the eleventh rekey (at 39,600 s) fails, the keys are missing, and the connection is broken due to an authentication error.
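The arithmetic behind the rekey hypothesis is easy to reproduce. The sketch below only lists when periodic rekeys would fire; the 3600 s interval is the example value from the text rather than a measured setting, and `rekey_times` is a hypothetical helper, not part of any Wi-Fi stack.

```python
REKEY_INTERVAL_S = 3600  # example value from the text, not a measured setting

def rekey_times(interval_s: int, count: int) -> list[int]:
    """Seconds since association at which rekeys 1..count would fire."""
    return [n * interval_s for n in range(1, count + 1)]

times = rekey_times(REKEY_INTERVAL_S, 11)
print(times[-1])          # 39600
print(times[-1] / 3600)   # 11.0 hours
```

Under this assumption the eleventh rekey lands exactly on the 11-hour mark, which is what makes the hypothesis plausible.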

Other possible causes include the overflow of a buffer that holds some variable (for example, a time counter). Specifically, an interval of 11 hours 06 minutes 40 seconds (40,000 seconds) corresponds to 10,000,000 ticks at a frequency of 250 Hz, so a failure is possible if someone stores the time (the counter value) as decimal digits and the value gains an extra digit. A similar problem reportedly hit KDE on September 9, 2001 (01:46:40 UTC, still September 8 in the Americas), when the time_t counter went from 999999999 to 1000000000 seconds, which broke the KDE mail client.
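The tick arithmetic can be verified directly. The snippet below is a sketch under the assumptions stated in the text (a 250 Hz tick counter); the string comparison at the end illustrates just one way an extra decimal digit can break code and is not a claim about what actually happens in the chipset firmware or happened in KDE.

```python
TICK_HZ = 250               # tick rate suggested in the text
SUSPECT_TICKS = 10_000_000  # first counter value with eight decimal digits

def ticks_to_hms(ticks: int, hz: int) -> tuple[int, int, int]:
    """Convert a tick count to (hours, minutes, seconds), dropping partial seconds."""
    total_seconds = ticks // hz
    hours, rest = divmod(total_seconds, 3600)
    minutes, seconds = divmod(rest, 60)
    return hours, minutes, seconds

print(SUSPECT_TICKS // TICK_HZ)              # 40000
print(ticks_to_hms(SUSPECT_TICKS, TICK_HZ))  # (11, 6, 40)

# The counter gains a digit at this point: 7 digits before, 8 after.
print(len(str(SUSPECT_TICKS - 1)), len(str(SUSPECT_TICKS)))  # 7 8

# Hypothetical failure mode: comparing timestamps as text instead of numbers.
print("1000000000" < "999999999")  # True, although 1000000000 > 999999999
```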

SSH corruption

In this regard, I remember a story from back in 2012 about how an SSH connection between servers in London and Montreal dropped inexplicably: it either froze or ended without a timeout error. Engineers spent a very long time tracking down the cause of the failures and eventually found that in the body of TCP packets (after byte 576) the 15th byte of every sixteen was damaged. Moreover, the damage was predictable: for example, every h turned into x and every c became s. From the ASCII table it became clear that one bit (0x10) was "stuck" at 1.
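The stuck bit can be found by XOR-ing the character codes from the examples in the story. A minimal sketch; `flipped_bits` is a name chosen here for illustration.

```python
def flipped_bits(sent: str, received: str) -> int:
    """Bit mask of the positions where two characters differ."""
    return ord(sent) ^ ord(received)

for sent, received in [("h", "x"), ("c", "s")]:
    mask = flipped_bits(sent, received)
    # "Stuck at 1" means the received byte equals the sent byte with the bit forced on.
    forced_on = (ord(sent) | mask) == ord(received)
    print(sent, received, hex(mask), forced_on)
# h x 0x10 True
# c s 0x10 True
```

Both pairs differ only in the bit with value 0x10, and in both cases the corrupted character has that bit set.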

For example, a packet filled with zeros arrived at its destination in a modified form. Here is part of one such packet, which originally consisted entirely of zeros:

0x0210  .....
0x0220  0000 0000 0000 0000 0000 0000 0000 0000 ................
0x0230  0000 0000 0000 0000 0000 0000 0000 0000 ................
0x0240  0000 0000 0000 0000 0000 0000 0000 0000 ................
0x0250  0000 0000 0000 0000 0000 0000 0000 1000 ................
0x0260  0000 0000 0000 0000 0000 0000 0000 1000 ................
0x0270  0000 0000 0000 0000 0000 0000 0000 1000 ................
0x0280  0000 0000 0000 0000 0000 0000 0000 1000 ................
0x0290  0000 0000 0000 0000 0000 0000 0000 1000 ................
0x02a0  0000 0000 0000 0000 0000 0000 0000 1000 ................
0x02b0  0000 0000 0000 0000 0000 0000 0000 1000 ................
0x02c0  0000 0000 0000 0000 0000 0000 0000 1000 ................
0x02d0  0000 0000 0000 0000 0000 0000 0000 1000 ................
0x02e0  0000 0000 0000 0000 0000 0000 0000 1000 ................
0x02f0  0000 0000 0000 0000 0000 0000 0000 1000 ................
0x0300  0000 0000 0000 0000 0000 0000 0000 1000 ................
0x0310  0000 0000 0000 0000 0000 0000 0000 1000 ................
0x0320  0000 0000 0000 0000 0000 0000 0000 1000 ................
0x0330  0000 0000 0000 0000 0000 0000 0000 1000 ................
0x0340  0000 0000 0000 0000 0000 0000 0000 1000 ................
0x0350  0000 0000 0000 0000 0000 0000 0000 1000 ................
0x0360  0000 0000 0000 0000 0000 0000 0000 1000 ................
0x0370  0000 0000 0000 0000 0000 0000 0000 1000 ................
0x0380  0000 0000 0000 0000 0000 0000 0000 1000 ................
0x0390  0000 0000 0000 0000 0000 0000 0000 1000 ................
0x03a0  0000 0000 0000 0000 0000 0000 0000 1000 ................
0x03b0  0000 0000 0000 0000 0000 0000 0000 1000 ................
0x03c0  0000 0000 0000 0000 0000 0000 0000 1000 ................
0x03d0  0000 0000 0000 0000 0000 0000 0000 0000 ................
0x03e0  .....
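A pattern like this can be extracted mechanically by comparing what was sent with what arrived. The sketch below is a hypothetical reconstruction, not the engineers' actual tooling: `simulate_stuck_bit` imitates the damage seen in the dump (the 0x10 bit forced on in the 15th byte of each 16-byte row, within a window whose bounds are simply copied from the dump offsets), and `corrupted_offsets` reports where the two buffers differ.

```python
def corrupted_offsets(sent: bytes, received: bytes) -> list[tuple[int, int]]:
    """Return (offset, xor_mask) for every byte that differs."""
    return [(i, a ^ b) for i, (a, b) in enumerate(zip(sent, received)) if a != b]

def simulate_stuck_bit(payload: bytes, start: int, end: int) -> bytes:
    """Hypothetical fault model: force bit 0x10 on in the 15th byte of each
    16-byte row for offsets in [start, end)."""
    out = bytearray(payload)
    for i in range(start, min(end, len(out))):
        if i % 16 == 14:
            out[i] |= 0x10
    return bytes(out)

sent = bytes(1024)                                  # a packet of zeros
received = simulate_stuck_bit(sent, 0x0250, 0x03D0) # window taken from the dump
diffs = corrupted_offsets(sent, received)

strides = {b[0] - a[0] for a, b in zip(diffs, diffs[1:])}
masks = {mask for _, mask in diffs}
print(len(diffs), strides, masks)
```

A single stride of 16 bytes together with a single XOR mask is a strong hint that the damage comes from hardware (for example, a faulty memory line) rather than from software.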

It remained to find out which of the 17 nodes along the route was corrupting the TCP packets. The engineers took advantage of the fact that an SSH connection fails when its packets are corrupted. So they compiled a list of geographically distributed open SSH servers and tested the connection to each of them. Comparing the routes of the failed connections allowed them to identify the specific IP address that was present in all failed routes. The problem was reported to the administrator of that third-party system.
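The route comparison boils down to a set intersection. The following sketch assumes each route is available as a list of hop addresses (for example, from traceroute output); the addresses are placeholders from the documentation ranges, and subtracting hops seen on healthy routes is an extra refinement added here, not something the story describes.

```python
def suspect_hops(failed_routes: list[list[str]],
                 ok_routes: list[list[str]]) -> set[str]:
    """Hops present in every failed route and in no successful one."""
    if not failed_routes:
        return set()
    common = set.intersection(*(set(route) for route in failed_routes))
    healthy = set().union(*(set(route) for route in ok_routes))
    return common - healthy

# Placeholder data: two failing routes and one working route.
failed = [
    ["192.0.2.1", "198.51.100.7", "203.0.113.9"],
    ["192.0.2.1", "198.51.100.7", "203.0.113.20"],
]
ok = [
    ["192.0.2.1", "203.0.113.30"],
]
print(suspect_hops(failed, ok))  # {'198.51.100.7'}
```

The more geographically diverse the test servers are, the fewer hops survive the intersection, which is why the engineers needed a broad list of SSH endpoints.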

… So anything can happen. And switching to a “more secure” version of the protocol does not always guarantee a real increase in security. In some cases, it makes sense to stick with proven solutions that work more reliably. Apparently, in the case of Wi-Fi, WPA2 works more reliably.
