Restarting troublesome USB WiFi devices

I recently set up a Rock64 single-board computer and needed to add WiFi to it, so I plugged in a USB WiFi adapter. This worked fine, except the WiFi sometimes stopped working. The computer runs headless, with no display attached, so I had no easy way to troubleshoot it. But unplugging the USB WiFi adapter and plugging it back in seemed to fix it. I think this may just be an issue with the USB device itself, as I had previously only used it for brief periods plugged into a laptop. I need the WiFi to stay up 24/7 for the application I am working on.

This USB WiFi adapter works initially, but stops after some period of time

While it might be possible to identify the actual issue with the device or its driver and fix it, I decided it would probably be easier to come up with the software equivalent of unplugging the device and plugging it back in.

Identifying the USB device uniquely

The first step is to identify the USB device. Running lsusb makes this easy:

$ lsusb
Bus 005 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 004 Device 024: ID 0bda:2838 Realtek Semiconductor Corp. RTL2838 DVB-T
Bus 004 Device 023: ID 0bda:2838 Realtek Semiconductor Corp. RTL2838 DVB-T
Bus 004 Device 022: ID 0bda:2838 Realtek Semiconductor Corp. RTL2838 DVB-T
Bus 004 Device 021: ID 1a40:0101 Terminus Technology Inc. Hub
Bus 004 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 003 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 001 Device 002: ID 1737:0071 Linksys WUSB600N v1 Dual-Band Wireless-N Network Adapter [Ralink RT2870]
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub

This lists all the USB devices currently plugged into the system. The WiFi device is a "WUSB600N" and has a vendor ID of 1737 and a product ID of 0071. All the WiFi adapters with the same model have that vendor and product ID. So this is a unique enough identifier for this purpose.
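
If you would rather pull the IDs out programmatically than read them off the screen, here is a minimal sketch in Python. It assumes the output format shown above; the names LSUSB_LINE, parse_lsusb and find_ids are made up for illustration and are not part of any tool.

```python
import re

# Matches lines such as:
# "Bus 001 Device 002: ID 1737:0071 Linksys WUSB600N v1 ..."
LSUSB_LINE = re.compile(
    r"^Bus (?P<bus>\d+) Device (?P<device>\d+): "
    r"ID (?P<vendor>[0-9a-fA-F]{4}):(?P<product>[0-9a-fA-F]{4})\s*(?P<name>.*)$"
)


def parse_lsusb(output):
    """Return one dict per line of lsusb output that matches the expected format."""
    devices = []
    for line in output.splitlines():
        match = LSUSB_LINE.match(line.strip())
        if match:
            devices.append(match.groupdict())
    return devices


def find_ids(output, name_fragment):
    """Return (vendor, product) pairs for devices whose description contains name_fragment."""
    wanted = name_fragment.lower()
    return [
        (device["vendor"], device["product"])
        for device in parse_lsusb(output)
        if wanted in device["name"].lower()
    ]
```

The text to parse could come from something like subprocess.run(["lsusb"], capture_output=True, text=True).stdout, and find_ids(text, "WUSB600N") would then give the pair to match against later.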

Finding the device entry in the USB subsystem

To reset the device, I figured there were two ways to do it:

  1. Mess with power management to cut power to the USB device and then re-enable it
  2. Have the device driver reinitialize the device

The second option turned out to be the easier one to implement. Each USB device plugged into a Linux system has an entry in sysfs under /sys/bus/usb/devices/. Unfortunately, those identifiers are basically opaque. Mine look like this on my desktop:

$ ls -l /sys/bus/usb/devices/
total 0
lrwxrwxrwx 1 root root 0 Feb 21 11:13 1-0:1.0 -> ../../../devices/pci0000:00/0000:00:01.2/0000:20:00.0/0000:21:08.0/0000:2a:00.1/usb1/1-0:1.0
lrwxrwxrwx 1 root root 0 Feb 21 11:13 2-0:1.0 -> ../../../devices/pci0000:00/0000:00:01.2/0000:20:00.0/0000:21:08.0/0000:2a:00.1/usb2/2-0:1.0
lrwxrwxrwx 1 root root 0 Feb 21 11:13 3-0:1.0 -> ../../../devices/pci0000:00/0000:00:01.2/0000:20:00.0/0000:21:08.0/0000:2a:00.3/usb3/3-0:1.0
lrwxrwxrwx 1 root root 0 Feb 21 11:13 3-5 -> ../../../devices/pci0000:00/0000:00:01.2/0000:20:00.0/0000:21:08.0/0000:2a:00.3/usb3/3-5
lrwxrwxrwx 1 root root 0 Feb 21 11:13 3-5:1.0 -> ../../../devices/pci0000:00/0000:00:01.2/0000:20:00.0/0000:21:08.0/0000:2a:00.3/usb3/3-5/3-5:1.0
lrwxrwxrwx 1 root root 0 Feb 21 11:13 3-6 -> ../../../devices/pci0000:00/0000:00:01.2/0000:20:00.0/0000:21:08.0/0000:2a:00.3/usb3/3-6
lrwxrwxrwx 1 root root 0 Feb 21 11:13 3-6:1.0 -> ../../../devices/pci0000:00/0000:00:01.2/0000:20:00.0/0000:21:08.0/0000:2a:00.3/usb3/3-6/3-6:1.0
lrwxrwxrwx 1 root root 0 Feb 21 11:13 4-0:1.0 -> ../../../devices/pci0000:00/0000:00:01.2/0000:20:00.0/0000:21:08.0/0000:2a:00.3/usb4/4-0:1.0
lrwxrwxrwx 1 root root 0 Feb 21 11:13 5-0:1.0 -> ../../../devices/pci0000:00/0000:00:08.1/0000:2f:00.3/usb5/5-0:1.0
lrwxrwxrwx 1 root root 0 Feb 21 11:13 5-3 -> ../../../devices/pci0000:00/0000:00:08.1/0000:2f:00.3/usb5/5-3
lrwxrwxrwx 1 root root 0 Feb 21 11:13 5-3:1.0 -> ../../../devices/pci0000:00/0000:00:08.1/0000:2f:00.3/usb5/5-3/5-3:1.0
lrwxrwxrwx 1 root root 0 Feb 21 11:13 5-3.4 -> ../../../devices/pci0000:00/0000:00:08.1/0000:2f:00.3/usb5/5-3/5-3.4
lrwxrwxrwx 1 root root 0 Feb 21 11:13 5-3.4.1 -> ../../../devices/pci0000:00/0000:00:08.1/0000:2f:00.3/usb5/5-3/5-3.4/5-3.4.1
lrwxrwxrwx 1 root root 0 Feb 21 11:13 5-3.4:1.0 -> ../../../devices/pci0000:00/0000:00:08.1/0000:2f:00.3/usb5/5-3/5-3.4/5-3.4:1.0
lrwxrwxrwx 1 root root 0 Feb 21 11:13 5-3.4.1:1.0 -> ../../../devices/pci0000:00/0000:00:08.1/0000:2f:00.3/usb5/5-3/5-3.4/5-3.4.1/5-3.4.1:1.0
lrwxrwxrwx 1 root root 0 Feb 21 11:13 5-3.4.1:1.1 -> ../../../devices/pci0000:00/0000:00:08.1/0000:2f:00.3/usb5/5-3/5-3.4/5-3.4.1/5-3.4.1:1.1
lrwxrwxrwx 1 root root 0 Feb 21 11:13 5-3.4.1:1.2 -> ../../../devices/pci0000:00/0000:00:08.1/0000:2f:00.3/usb5/5-3/5-3.4/5-3.4.1/5-3.4.1:1.2
lrwxrwxrwx 1 root root 0 Feb 21 11:13 5-3.4.2 -> ../../../devices/pci0000:00/0000:00:08.1/0000:2f:00.3/usb5/5-3/5-3.4/5-3.4.2
lrwxrwxrwx 1 root root 0 Feb 21 18:37 5-3.4.2:1.0 -> ../../../devices/pci0000:00/0000:00:08.1/0000:2f:00.3/usb5/5-3/5-3.4/5-3.4.2/5-3.4.2:1.0
lrwxrwxrwx 1 root root 0 Feb 21 18:37 5-3.4.2:1.1 -> ../../../devices/pci0000:00/0000:00:08.1/0000:2f:00.3/usb5/5-3/5-3.4/5-3.4.2/5-3.4.2:1.1
lrwxrwxrwx 1 root root 0 Feb 21 18:37 5-3.4.2:1.2 -> ../../../devices/pci0000:00/0000:00:08.1/0000:2f:00.3/usb5/5-3/5-3.4/5-3.4.2/5-3.4.2:1.2
lrwxrwxrwx 1 root root 0 Feb 21 18:37 5-3.4.2:1.3 -> ../../../devices/pci0000:00/0000:00:08.1/0000:2f:00.3/usb5/5-3/5-3.4/5-3.4.2/5-3.4.2:1.3
lrwxrwxrwx 1 root root 0 Feb 21 11:13 5-3.4.3 -> ../../../devices/pci0000:00/0000:00:08.1/0000:2f:00.3/usb5/5-3/5-3.4/5-3.4.3
lrwxrwxrwx 1 root root 0 Feb 21 11:13 5-3.4.3:1.0 -> ../../../devices/pci0000:00/0000:00:08.1/0000:2f:00.3/usb5/5-3/5-3.4/5-3.4.3/5-3.4.3:1.0
lrwxrwxrwx 1 root root 0 Feb 21 11:13 5-3.4.3:1.1 -> ../../../devices/pci0000:00/0000:00:08.1/0000:2f:00.3/usb5/5-3/5-3.4/5-3.4.3/5-3.4.3:1.1
lrwxrwxrwx 1 root root 0 Feb 21 11:13 5-3.4.4 -> ../../../devices/pci0000:00/0000:00:08.1/0000:2f:00.3/usb5/5-3/5-3.4/5-3.4.4
lrwxrwxrwx 1 root root 0 Feb 21 11:13 5-3.4.4:1.0 -> ../../../devices/pci0000:00/0000:00:08.1/0000:2f:00.3/usb5/5-3/5-3.4/5-3.4.4/5-3.4.4:1.0
lrwxrwxrwx 1 root root 0 Feb 21 11:13 5-3.4.4:1.1 -> ../../../devices/pci0000:00/0000:00:08.1/0000:2f:00.3/usb5/5-3/5-3.4/5-3.4.4/5-3.4.4:1.1
lrwxrwxrwx 1 root root 0 Feb 21 11:13 5-3.4.4:1.2 -> ../../../devices/pci0000:00/0000:00:08.1/0000:2f:00.3/usb5/5-3/5-3.4/5-3.4.4/5-3.4.4:1.2
lrwxrwxrwx 1 root root 0 Feb 21 11:13 5-3.4.4:1.3 -> ../../../devices/pci0000:00/0000:00:08.1/0000:2f:00.3/usb5/5-3/5-3.4/5-3.4.4/5-3.4.4:1.3
lrwxrwxrwx 1 root root 0 Feb 21 11:13 6-0:1.0 -> ../../../devices/pci0000:00/0000:00:08.1/0000:2f:00.3/usb6/6-0:1.0
lrwxrwxrwx 1 root root 0 Feb 21 11:13 6-3 -> ../../../devices/pci0000:00/0000:00:08.1/0000:2f:00.3/usb6/6-3
lrwxrwxrwx 1 root root 0 Feb 21 11:13 6-3:1.0 -> ../../../devices/pci0000:00/0000:00:08.1/0000:2f:00.3/usb6/6-3/6-3:1.0
lrwxrwxrwx 1 root root 0 Feb 21 11:13 6-3.4 -> ../../../devices/pci0000:00/0000:00:08.1/0000:2f:00.3/usb6/6-3/6-3.4
lrwxrwxrwx 1 root root 0 Feb 21 11:13 6-3.4:1.0 -> ../../../devices/pci0000:00/0000:00:08.1/0000:2f:00.3/usb6/6-3/6-3.4/6-3.4:1.0
lrwxrwxrwx 1 root root 0 Feb 21 11:13 usb1 -> ../../../devices/pci0000:00/0000:00:01.2/0000:20:00.0/0000:21:08.0/0000:2a:00.1/usb1
lrwxrwxrwx 1 root root 0 Feb 21 11:13 usb2 -> ../../../devices/pci0000:00/0000:00:01.2/0000:20:00.0/0000:21:08.0/0000:2a:00.1/usb2
lrwxrwxrwx 1 root root 0 Feb 21 11:13 usb3 -> ../../../devices/pci0000:00/0000:00:01.2/0000:20:00.0/0000:21:08.0/0000:2a:00.3/usb3
lrwxrwxrwx 1 root root 0 Feb 21 11:13 usb4 -> ../../../devices/pci0000:00/0000:00:01.2/0000:20:00.0/0000:21:08.0/0000:2a:00.3/usb4
lrwxrwxrwx 1 root root 0 Feb 21 11:13 usb5 -> ../../../devices/pci0000:00/0000:00:08.1/0000:2f:00.3/usb5
lrwxrwxrwx 1 root root 0 Feb 21 11:13 usb6 -> ../../../devices/pci0000:00/0000:00:08.1/0000:2f:00.3/usb6

Thankfully, underneath all that mess you can find files holding the corresponding USB vendor and product IDs. For example:

$ cat /sys/bus/usb/devices/5-3.4.2/idVendor
0d8c
$ cat /sys/bus/usb/devices/5-3.4.2/idProduct
0012

This entry is for the headset adapter I have on my desktop. But the identifier 5-3.4.2 is not stable: if you plug the same device into a different USB port or a USB hub, it can change. So it isn't sufficient to find the correct entry in this filesystem just once. The entries need to be searched and matched against the already known vendor and product ID.

I thought I could be clever and use the find command to search for matching entries in there. But sysfs isn't a regular filesystem and contains way too many symlink loops to make this practical. I ended up writing a script that checks each entry in /sys/bus/usb/devices. Interestingly, not all entries have the files idVendor and idProduct. I don't know much about how the USB subsystem works, or what the numbering scheme means.
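
The search itself can be sketched in a few lines of Python. This is only an illustration of the idea and not necessarily how my script does it; the function names and the root parameter (there so the function can be pointed at a test directory) are assumptions.

```python
import os


def read_sysfs_value(path):
    """Return the stripped contents of a sysfs file, or None if it cannot be read."""
    try:
        with open(path) as handle:
            return handle.read().strip()
    except OSError:
        return None


def find_usb_devices(vendor_id, product_id, root="/sys/bus/usb/devices"):
    """Return the names of entries under root whose idVendor and idProduct both match."""
    matches = []
    for entry in sorted(os.listdir(root)):
        vendor = read_sysfs_value(os.path.join(root, entry, "idVendor"))
        product = read_sysfs_value(os.path.join(root, entry, "idProduct"))
        # Entries without these files read as None, never match, and are skipped
        if vendor == vendor_id.lower() and product == product_id.lower():
            matches.append(entry)
    return matches
```

Only the top level of the directory is listed, so the loops that trip up find are never followed. On my desktop, find_usb_devices("0d8c", "0012") should return the headset adapter's entry.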

Resetting the USB device

Once the entry has been found in sysfs, the driver needs to be reinitialized. This is done by writing to the sysfs entries /sys/bus/usb/drivers/usb/unbind and /sys/bus/usb/drivers/usb/bind. In my previous example the identifier is 5-3.4.2. So to unbind and rebind the driver, we need to open those files and write that value into them. This requires root. You can do this from a shell easily enough if you want to try it out:

$ echo '5-3.4.2' | sudo tee /sys/bus/usb/drivers/usb/unbind # Disable the driver
$ echo '5-3.4.2' | sudo tee /sys/bus/usb/drivers/usb/bind # Enable the driver

This isn't quite the same thing as unplugging the device since the USB device is not actually power cycled at any point, but it seems to work well enough for what I needed to do.
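
The same two writes can be done from Python. The following is a rough sketch rather than a definitive implementation: the function name, the driver_dir parameter and the 2 second pause between the two writes are all assumptions, and it has to run as root to write to the real sysfs files.

```python
import os
import time


def rebind_usb_device(device, driver_dir="/sys/bus/usb/drivers/usb", delay=2.0):
    """Unbind and then bind a USB device by writing its sysfs name to the driver's control files."""
    for action in ("unbind", "bind"):
        with open(os.path.join(driver_dir, action), "w") as handle:
            handle.write(device)
        if action == "unbind":
            # Assumed pause so the device has time to settle before it is bound again
            time.sleep(delay)
```

Calling rebind_usb_device("5-3.4.2") would then be the equivalent of the two shell commands above.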

Detecting when to reset the device

Now that I had a method to reset the USB device from software, I needed to identify when to take that action. I suppose I could just reset it hourly, but that would reset the network when it was not actually required. It also could result in a gap of up to 59 minutes of the device being off the WiFi.

So I needed to come up with a way to detect if the network interface is down. Each network interface has an entry under /sys/class/net. The name there corresponds to the network adapter name as shown by ifconfig. This is what it looks like on my desktop:

$ ls -l /sys/class/net
total 0
lrwxrwxrwx 1 root root 0 Feb 22 07:00 br-eb437df3b396 -> ../../devices/virtual/net/br-eb437df3b396
lrwxrwxrwx 1 root root 0 Feb 22 07:00 docker0 -> ../../devices/virtual/net/docker0
lrwxrwxrwx 1 root root 0 Feb 22 07:00 enp38s0 -> ../../devices/pci0000:00/0000:00:01.2/0000:20:00.0/0000:21:04.0/0000:26:00.0/net/enp38s0
lrwxrwxrwx 1 root root 0 Feb 22 07:00 lo -> ../../devices/virtual/net/lo
lrwxrwxrwx 1 root root 0 Feb 22 13:40 veth58c7568 -> ../../devices/virtual/net/veth58c7568
lrwxrwxrwx 1 root root 0 Feb 22 07:00 virbr0 -> ../../devices/virtual/net/virbr0
lrwxrwxrwx 1 root root 0 Feb 22 07:00 virbr0-nic -> ../../devices/virtual/net/virbr0-nic

This shows all network interfaces on the machine, even virtual ones. You can check whether a network adapter is connected to something by reading its carrier file from a shell. On my desktop, I can check that my ethernet adapter has a connection:

$ cat /sys/class/net/enp38s0/carrier
1

The value 1 in the file means it has a connection to something. It is 0 if there is no connection. Checking this works, but there is a caveat. Whenever the USB WiFi device misbehaved, the corresponding entry for that adapter disappeared entirely. So I came up with this series of checks:

  1. Check if the carrier file exists for the network adapter under /sys/class/net. If it doesn't exist, assume the device is not connected.
  2. Read the carrier file. If the numerical value in it is greater than zero, assume the device is connected.

If the outcome of this check indicates the network device is not connected, then it is time to reset the WiFi adapter. I compiled all this logic into a Python script.
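
Those two checks could look roughly like this in Python. Treat it as a sketch under assumptions: the function name and the root parameter are made up for illustration, and treating a failed read as "not connected" is an extra case I added here (reading carrier can fail, for example when the interface is administratively down).

```python
import os


def interface_is_connected(iface, root="/sys/class/net"):
    """Return True only if the interface has a carrier file holding a value above zero."""
    carrier_path = os.path.join(root, iface, "carrier")
    # Check 1: no carrier file means the interface is missing, so not connected
    if not os.path.exists(carrier_path):
        return False
    # Check 2: a numerical value greater than zero means there is a connection
    try:
        with open(carrier_path) as handle:
            return int(handle.read().strip()) > 0
    except (OSError, ValueError):
        # Assumption: an unreadable or non-numeric carrier file counts as not connected
        return False
```

On my desktop, interface_is_connected("enp38s0") should agree with the cat command shown earlier.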

Running this check on a timer

With all these pieces in place, I needed a way to run this check periodically. This is usually done with a cron job, but on newer Linux distributions it is easier to manage with systemd. You need to create two unit files in /etc/systemd/system. One defines a timer that periodically starts a service. The other is the service itself, which actually runs the check.

systemd timer

A timer in systemd usually just starts a service, but there are a bunch of other nice things you can do with it. I wound up with this definition for the timer.

[Unit]
Description=runs the network restart task
RefuseManualStart=no
RefuseManualStop=no

[Timer]
#Do not execute job if it missed a run due to machine being off
Persistent=false
#wait 5 minutes after boot
OnBootSec=300
#Run every 2 minutes thereafter
OnUnitInactiveSec=120
#File describing job to execute
Unit=restart-network-if-down.service

[Install]
WantedBy=timers.target

This has a couple of advantages over a cron job. The service it starts is called restart-network-if-down. But it doesn't just run that service every 2 minutes. It runs that service 2 minutes after the previous run completes. This is better because the service doesn't always finish its checks instantly when the system is busy. Also, it doesn't run until 5 minutes (300 seconds) after boot. I figure there is no reason to clog up the startup logs by running this check immediately at boot.

systemd service

The systemd service just defines a service of type simple. It's expected to run and then exit.

[Unit]
Description=Restarts the network if the wifi down

[Service]
Type=simple
# Update with the interface name from ifconfig
Environment=IFACE=wlx001ee5d8d053 
# Vendor ID of the USB stick, from lsusb
Environment=VENDOR_ID=1737 
# Product ID of the USB stick, from lsusb
Environment=PRODUCT_ID=0071 
# The script to run
ExecStart=/usr/bin/python3 /opt/usb_reset/reboot_if_down.py

[Install]
WantedBy=default.target
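
The three Environment lines are how the settings reach the script. As a hedged sketch of the receiving side (the function name and the returned dictionary are assumptions, not a copy of my script), reading them in Python could look like this:

```python
import os


def load_settings(environ=None):
    """Read IFACE, VENDOR_ID and PRODUCT_ID, exiting with a message if any is unset."""
    environ = os.environ if environ is None else environ
    settings = {}
    missing = []
    for variable in ("IFACE", "VENDOR_ID", "PRODUCT_ID"):
        # Strip stray whitespace that may trail the value in the unit file
        value = environ.get(variable, "").strip()
        if value:
            settings[variable.lower()] = value
        else:
            missing.append(variable)
    if missing:
        raise SystemExit("missing environment variables: " + ", ".join(missing))
    return settings
```

Exiting with a message means a misconfigured unit shows up as a failed service in the journal, rather than a check that silently does nothing.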

Result

This is what the log shows when the service runs and detects that the network interface is missing:

Feb 26 17:08:29 rtlsdrhost0 systemd[1]: Started Restarts the network if the wifi down.
Feb 26 17:08:29 rtlsdrhost0 python3[184811]: interface wlx001ee5d8d053Z has no carrier file
Feb 26 17:08:29 rtlsdrhost0 python3[184812]: USB 1-1 [1737:0071]
Feb 26 17:08:29 rtlsdrhost0 python3[184812]: unbinding USB device: 1-1
Feb 26 17:08:31 rtlsdrhost0 python3[184812]: binding USB device: 1-1
Feb 26 17:08:34 rtlsdrhost0 systemd[1]: restart-network-if-down.service: Succeeded.

So far this seems to work acceptably.

GitHub

This entire project can be found on GitHub.


Copyright Eric Urban 2022, or the respective entity where indicated