Discussion:
Atheros 9220 - losing connectivity
Volodymyr Kostyrko
2018-11-15 20:55:38 UTC
Hello.

I'm not quite sure what happens so I'll just dump here what I have in
mind so it would be easier to sort out later.

I upgraded from 11-STABLE to 12-STABLE and my card is now working in HT
mode, which is nice. Currently it's detected as:

***@pci0:4:5:0: class=0x028000 card=0x2093168c chip=0x0029168c
rev=0x01 hdr=0x00
vendor = 'Qualcomm Atheros'
device = 'AR922X Wireless Network Adapter'
class = network

ath0: <Atheros 9220> mem 0xfebf0000-0xfebfffff irq 20 at device 5.0 on pci4
ath0: [HT] enabling HT modes
ath0: [HT] 1 stream STBC receive enabled
ath0: [HT] 1 stream STBC transmit enabled
ath0: [HT] 2 RX streams; 2 TX streams
ath0: Enabling register serialisation
ath0: AR9220 mac 128.2 RF5133 phy 13.0
ath0: 2GHz radio: 0x0000; 5GHz radio: 0x00c0

# ifconfig wlan1 list chan
Channel 1 : 2412 MHz 11g ht Channel 8 : 2447 MHz 11g ht
Channel 2 : 2417 MHz 11g ht Channel 9 : 2452 MHz 11g ht
Channel 3 : 2422 MHz 11g ht Channel 10 : 2457 MHz 11g ht
Channel 4 : 2427 MHz 11g ht Channel 11 : 2462 MHz 11g ht
Channel 5 : 2432 MHz 11g ht Channel 12 : 2467* MHz 11g ht
Channel 6 : 2437 MHz 11g ht Channel 13 : 2472* MHz 11g ht
Channel 7 : 2442 MHz 11g ht

Debug enabled:
dev.ath.0.hal.force_full_reset=1
dev.ath.0.hal.debug=1

Everything I see in the logs is:
Nov 15 19:45:54 limbo kernel: ath0: stuck beacon; resetting (bmiss count 4)
Nov 15 19:45:57 limbo kernel: ath0: stuck beacon; resetting (bmiss count 4)
Nov 15 19:46:04 limbo kernel: ath0: stuck beacon; resetting (bmiss count 4)

Though this happens all the time.
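
To put a number on "all the time", the resets can be counted straight from the syslog text. The following is a minimal sketch, not anything from the driver or base system: the function names are made up for illustration, the sample lines are the three quoted above, and the year is an assumption because syslog timestamps do not carry one.

```python
import re
from collections import Counter
from datetime import datetime

# Sample lines copied from the log excerpt above.
LOG = """\
Nov 15 19:45:54 limbo kernel: ath0: stuck beacon; resetting (bmiss count 4)
Nov 15 19:45:57 limbo kernel: ath0: stuck beacon; resetting (bmiss count 4)
Nov 15 19:46:04 limbo kernel: ath0: stuck beacon; resetting (bmiss count 4)
"""

# Matches "<Mon> <day> <HH:MM:SS> <host> kernel: athN: stuck beacon; resetting".
PATTERN = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+kernel:\s+"
    r"ath\d+: stuck beacon; resetting"
)


def stuck_beacon_times(text, year=2018):
    """Return the timestamps of all stuck-beacon resets found in text.

    The year is an assumption supplied by the caller, since syslog omits it.
    """
    times = []
    for line in text.splitlines():
        m = PATTERN.match(line)
        if m:
            stamp = " ".join(m["stamp"].split())  # normalise padded day numbers
            times.append(datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S"))
    return times


def resets_per_minute(times):
    """Count resets per wall-clock minute (HH:MM)."""
    return Counter(t.strftime("%H:%M") for t in times)


def gaps_seconds(times):
    """Seconds between consecutive resets."""
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


times = stuck_beacon_times(LOG)
print(resets_per_minute(times))
print(gaps_seconds(times))
```

Feeding it the whole of /var/log/messages instead of the sample would show whether the resets cluster around the moments clients drop off or are spread evenly.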

What changed after going to 12-STABLE:

1. HT works.
2. Devices are losing connection constantly.

Sometimes devices get back on the network by themselves. Sometimes this
requires an interface down/up, which is fast enough that TCP connections
stay intact. But sometimes the server interface has to be taken down/up
before the others can talk to it. On the other host I see the interface
trying different channels, then hitting the correct one, waiting a little
and going back to trying other channels. On the correct channel it shows
status: no carrier and just sits there longer.

It also looks like some devices keep a good connection while others keep
losing it. There's one device that never drops off.

Thanks for any pointers.
--
Sphinx of black quartz judge my vow.
Adrian Chadd
2018-11-15 21:31:20 UTC
Hi!

Going from 11-stable to 12-stable shouldn't have resulted in a regression.
Interesting, okay. Which devices are they?


-adrian
Volodymyr Kostyrko
2018-11-15 22:01:03 UTC
Post by Adrian Chadd
Hi!
11-stable to 12-stable shouldn't have resulted in a regression?
interesting, okay. Which devices are they?
a) Android phones.
b) Two PCs with Atheros 9227 cards, one running Windows, one DragonFly BSD.
c) The one device where I have seen no failures yet: an Android TV (Android 5).
--
Sphinx of black quartz judge my vow.
Adrian Chadd
2018-11-15 22:02:23 UTC
ok, interesting. I'll see if I can reproduce it later next week. I haven't
changed much on the AR9220 HAL...


-a
Volodymyr Kostyrko
2018-11-21 15:28:28 UTC
Post by Adrian Chadd
ok, interesting. I'll see if I can reproduce it later next week. I haven't
changed much on the AR9220 HAL...
Though I'm no longer sure this is about wireless at all:

Client side:

64 bytes from 172.29.1.1: icmp_seq=3484 ttl=64 time=6769.941 ms
64 bytes from 172.29.1.1: icmp_seq=3485 ttl=64 time=5759.887 ms
64 bytes from 172.29.1.1: icmp_seq=3486 ttl=64 time=4749.831 ms
64 bytes from 172.29.1.1: icmp_seq=3491 ttl=64 time=1079.225 ms
ping: sendto: Network is down
ping: sendto: Network is down
64 bytes from 172.29.1.1: icmp_seq=3543 ttl=64 time=3.459 ms
64 bytes from 172.29.1.1: icmp_seq=3544 ttl=64 time=0.765 ms
64 bytes from 172.29.1.1: icmp_seq=3545 ttl=64 time=0.764 ms

Server side:

17:16:13.780615 IP 172.29.1.147 > 172.29.1.1: ICMP echo request, id 61963, seq 3484, length 64
17:16:13.780626 IP 172.29.1.1 > 172.29.1.147: ICMP echo reply, id 61963, seq 3484, length 64
17:16:13.780665 IP 172.29.1.147 > 172.29.1.1: ICMP echo request, id 61963, seq 3485, length 64
17:16:13.780671 IP 172.29.1.1 > 172.29.1.147: ICMP echo reply, id 61963, seq 3485, length 64
17:16:13.780798 IP 172.29.1.147 > 172.29.1.1: ICMP echo request, id 61963, seq 3486, length 64
17:16:13.780806 IP 172.29.1.1 > 172.29.1.147: ICMP echo reply, id 61963, seq 3486, length 64
17:16:13.901127 IP 172.29.1.147 > 172.29.1.1: ICMP 172.29.1.147 udp port 1209 unreachable, length 36
17:16:13.926867 IP 172.29.1.147 > 172.29.1.1: ICMP 172.29.1.147 udp port 1209 unreachable, length 36
17:16:13.926901 IP 172.29.1.147 > 172.29.1.1: ICMP 172.29.1.147 udp port 2265 unreachable, length 36
17:16:13.953174 IP 172.29.1.147 > 172.29.1.1: ICMP 172.29.1.147 udp port 2265 unreachable, length 36
17:16:14.255022 IP 172.29.1.147 > 172.29.1.1: ICMP echo request, id 61963, seq 3491, length 64
17:16:14.255043 IP 172.29.1.1 > 172.29.1.147: ICMP echo reply, id 61963, seq 3491, length 64
17:16:14.666831 IP 172.29.1.147 > 172.29.1.1: ICMP 172.29.1.147 udp port 1549 unreachable, length 36
17:16:14.666847 IP 172.29.1.147 > 172.29.1.1: ICMP 172.29.1.147 udp port 1191 unreachable, length 36
17:16:14.666853 IP 172.29.1.147 > 172.29.1.1: ICMP 172.29.1.147 udp port 1626 unreachable, length 36
17:16:15.268339 IP 172.29.1.147 > 172.29.1.1: ICMP echo request, id 61963, seq 3492, length 64
17:16:15.268376 IP 172.29.1.1 > 172.29.1.147: ICMP echo reply, id 61963, seq 3492, length 64
17:16:16.292245 IP 172.29.1.147 > 172.29.1.1: ICMP echo request, id 61963, seq 3493, length 64
17:16:16.292265 IP 172.29.1.1 > 172.29.1.147: ICMP echo reply, id 61963, seq 3493, length 64
17:16:17.326008 IP 172.29.1.147 > 172.29.1.1: ICMP echo request, id 61963, seq 3494, length 64
17:16:17.326050 IP 172.29.1.1 > 172.29.1.147: ICMP echo reply, id 61963, seq 3494, length 64
17:16:18.306333 IP 172.29.1.147 > 172.29.1.1: ICMP echo request, id 61963, seq 3495, length 64
17:16:18.306349 IP 172.29.1.1 > 172.29.1.147: ICMP echo reply, id 61963, seq 3495, length 64
17:16:19.305307 IP 172.29.1.147 > 172.29.1.1: ICMP echo request, id 61963, seq 3496, length 64
17:16:19.305341 IP 172.29.1.1 > 172.29.1.147: ICMP echo reply, id 61963, seq 3496, length 64
17:16:20.316914 IP 172.29.1.147 > 172.29.1.1: ICMP echo request, id 61963, seq 3497, length 64
17:16:20.316951 IP 172.29.1.1 > 172.29.1.147: ICMP echo reply, id 61963, seq 3497, length 64
17:16:21.358600 IP 172.29.1.147 > 172.29.1.1: ICMP echo request, id 61963, seq 3498, length 64
17:16:21.358627 IP 172.29.1.1 > 172.29.1.147: ICMP echo reply, id 61963, seq 3498, length 64
17:16:22.667213 IP 172.29.1.147 > 172.29.1.1: ICMP echo request, id 61963, seq 3499, length 64
17:16:22.667240 IP 172.29.1.1 > 172.29.1.147: ICMP echo reply, id 61963, seq 3499, length 64
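
The server-side capture can be checked mechanically: pair every echo request the server saw with the reply it sent. This is only a sketch under stated assumptions. `echo_seqs` and `missing_ranges` are illustrative names, the sample holds three request/reply pairs copied from the capture, and the regex matches only the tcpdump text format shown here (it tolerates the line wrapping a mail client adds).

```python
import re

# Three request/reply pairs copied from the server-side capture.
CAPTURE = """\
17:16:13.780798 IP 172.29.1.147 > 172.29.1.1: ICMP echo request, id 61963, seq 3486, length 64
17:16:13.780806 IP 172.29.1.1 > 172.29.1.147: ICMP echo reply, id 61963, seq 3486, length 64
17:16:14.255022 IP 172.29.1.147 > 172.29.1.1: ICMP echo request, id
61963, seq 3491, length 64
17:16:14.255043 IP 172.29.1.1 > 172.29.1.147: ICMP echo reply, id 61963,
seq 3491, length 64
17:16:15.268339 IP 172.29.1.147 > 172.29.1.1: ICMP echo request, id 61963, seq 3492, length 64
17:16:15.268376 IP 172.29.1.1 > 172.29.1.147: ICMP echo reply, id 61963, seq 3492, length 64
"""

# \s+ lets a match span a wrapped line.
ECHO_RE = re.compile(r"ICMP echo (request|reply),\s+id\s+(\d+),\s+seq\s+(\d+)")


def echo_seqs(text):
    """Collect the sequence numbers of echo requests and replies."""
    seen = {"request": set(), "reply": set()}
    for kind, _ident, seq in ECHO_RE.findall(text):
        seen[kind].add(int(seq))
    return seen


def missing_ranges(seqs):
    """Return (first, last) ranges absent from a sorted list of numbers."""
    return [(a + 1, b - 1) for a, b in zip(seqs, seqs[1:]) if b - a > 1]


seen = echo_seqs(CAPTURE)
# Requests the server received but never answered.
unanswered = sorted(seen["request"] - seen["reply"])
# Requests that never arrived at the server at all.
never_arrived = missing_ranges(sorted(seen["request"]))
print(unanswered, never_arrived)
```

An empty `unanswered` list means the server replied to everything it received, so any reply the client did not see was lost after the server handed it to the interface.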

1. Everything works.
2. Packets 3487 - 3490 don't reach the server.
3. Packet 3491 gets through.
4. Replies to packets 3492 - 3540 don't reach the client.
5. I frob the client link.
6. 2 packets go nowhere while the connection is being re-established.
7. Everything works.

No bstuck increments in the process.
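
The client-side gaps listed above can be recovered from the ping output itself. The sketch below is an illustration, with made-up function names and the replies copied from the client output. It reports which sequence numbers never got an answer and how the round-trip time changed between consecutive replies.

```python
import re

# Replies copied from the client-side ping output.
PING_OUTPUT = """\
64 bytes from 172.29.1.1: icmp_seq=3484 ttl=64 time=6769.941 ms
64 bytes from 172.29.1.1: icmp_seq=3485 ttl=64 time=5759.887 ms
64 bytes from 172.29.1.1: icmp_seq=3486 ttl=64 time=4749.831 ms
64 bytes from 172.29.1.1: icmp_seq=3491 ttl=64 time=1079.225 ms
ping: sendto: Network is down
ping: sendto: Network is down
64 bytes from 172.29.1.1: icmp_seq=3543 ttl=64 time=3.459 ms
64 bytes from 172.29.1.1: icmp_seq=3544 ttl=64 time=0.765 ms
64 bytes from 172.29.1.1: icmp_seq=3545 ttl=64 time=0.764 ms
"""

REPLY_RE = re.compile(r"icmp_seq=(\d+) ttl=\d+ time=([\d.]+) ms")


def parse_replies(text):
    """Return (sequence number, RTT in ms) for every reply line."""
    return [(int(s), float(t)) for s, t in REPLY_RE.findall(text)]


def missing_ranges(seqs):
    """Return (first, last) ranges of sequence numbers with no reply."""
    return [(a + 1, b - 1) for a, b in zip(seqs, seqs[1:]) if b - a > 1]


def rtt_steps(replies):
    """RTT change in ms between replies with adjacent sequence numbers."""
    return [
        (s2, round(t2 - t1, 3))
        for (s1, t1), (s2, t2) in zip(replies, replies[1:])
        if s2 == s1 + 1
    ]


replies = parse_replies(PING_OUTPUT)
print(missing_ranges([seq for seq, _ in replies]))
print(rtt_steps(replies)[:2])
```

The second range it reports runs to 3542 because it also covers the two packets that failed with "Network is down". The RTT of packets 3484 to 3486 falls by about 1010 ms per packet, which would fit queued packets being released in one burst if ping was sending roughly once a second; that interval is an assumption, though the server capture does show those three requests arriving within the same millisecond.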
--
Sphinx of black quartz judge my vow.
Adrian Chadd
2018-11-21 16:29:20 UTC
What does a channel survey give? Is it on a very busy channel? Is there a
nearby device?


-a
Volodymyr Kostyrko
2018-11-21 16:50:55 UTC
Post by Adrian Chadd
what's a channel survey give? is it on a very busy channel? Is there a
nearby device?
# ifconfig wlan0 list scan
SSID/MESH ID      BSSID              CHAN RATE    S:N     INT CAPS
kyivstar 35(1)    14:cc:20:b2:25:ad     1   54M  -91:-96  100 EPS RSN BSSLOAD HTCAP WPS WME
Max               c4:e9:84:59:6a:fc     9   54M  -87:-96  100 EPS RSN HTCAP WME ATH WPS

I'm on channel 6 right now. I'm not setting a fixed channel; I'm using
chanlist 1-11 instead.
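
Whether the two neighbours share spectrum with channel 6 can be worked out from the centre frequencies. This is a rough sketch under stated assumptions: all names are illustrative, a 20 MHz channel width is assumed (an HT40 neighbour would occupy more), and the S:N column is read as signal and noise in dBm. The frequency formula agrees with the list chan output earlier in the thread (channel 1 at 2412 MHz, channel 13 at 2472 MHz).

```python
from dataclasses import dataclass


@dataclass
class Neighbor:
    ssid: str
    channel: int
    signal_dbm: int
    noise_dbm: int

    @property
    def snr_db(self):
        """Signal-to-noise ratio in dB."""
        return self.signal_dbm - self.noise_dbm


def center_mhz(channel):
    """Centre frequency of 2.4 GHz channels 1-13 (5 MHz spacing)."""
    if not 1 <= channel <= 13:
        raise ValueError(f"not a 2.4 GHz channel in 1-13: {channel}")
    return 2407 + 5 * channel


def overlaps(a, b, width_mhz=20):
    """True if two channels of the assumed width share any spectrum."""
    return abs(center_mhz(a) - center_mhz(b)) < width_mhz


# Values copied from the scan output above.
NEIGHBORS = [
    Neighbor("kyivstar 35(1)", 1, -91, -96),
    Neighbor("Max", 9, -87, -96),
]
MY_CHANNEL = 6

for n in NEIGHBORS:
    print(n.ssid, n.channel, overlaps(MY_CHANNEL, n.channel), n.snr_db)
```

Under these assumptions only the network on channel 9 overlaps channel 6, and both neighbours are seen with a single-digit SNR, so the scan alone does not point at a busy channel.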
--
Sphinx of black quartz judge my vow.