Discussion:
[Bug 227602] [regression] ralink hostap mode with authmode WPA2 works in 10.3, broken in CURRENT
b***@freebsd.org
2018-04-18 05:32:40 UTC
Permalink
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=227602

Bug ID: 227602
Summary: [regression] ralink hostap mode with authmode WPA2
works in 10.3, broken in CURRENT
Product: Base System
Version: CURRENT
Hardware: Any
OS: Any
Status: New
Keywords: regression
Severity: Affects Only Me
Priority: ---
Component: wireless
Assignee: ***@FreeBSD.org
Reporter: ***@gmail.com

My WiFi USB dongle, using the ralink driver, works reproducibly and reliably in
hostap mode in 10.3, but never works in CURRENT, giving an error of "run0:
could not load 8051 microcode" when hostapd starts, and becoming invisible to
WiFi clients.

I've confirmed it works well in GitHub commit
74ee552c5dacc20b6dde64cbb8a44e8c8ce975d0 (from around the 10.3 release), and
have begun bisecting for the regression.

In CURRENT it does work in authmode "open", but not with WPA2. The microcode
loads the first time, but doesn't load the second time when hostapd starts. It
loads successfully both times in 74ee552c5dacc20b6dde64cbb8a44e8c8ce975d0.


How I test it:

# ifconfig wlan0 create wlandev run0 wlanmode hostap
# ifconfig wlan0 inet 192.168.0.1 netmask 255.255.255.0 ssid home
# service hostapd start


/etc/rc.conf:

hostapd_enable="YES"


/etc/hostapd.conf:

interface=wlan0
debug=1
ctrl_interface=/var/run/hostapd
ctrl_interface_group=wheel
ssid=home
wpa=2
wpa_passphrase=abc123
wpa_key_mgmt=WPA-PSK
wpa_pairwise=CCMP


My USB dongle:

# usbconfig dump_device_desc

ugen0.5: <Ralink 802.11 n WLAN> at usbus0, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=ON (450mA)

bLength = 0x0012
bDescriptorType = 0x0001
bcdUSB = 0x0200
bDeviceClass = 0x0000 <Probed by interface class>
bDeviceSubClass = 0x0000
bDeviceProtocol = 0x0000
bMaxPacketSize0 = 0x0040
idVendor = 0x148f
idProduct = 0x3070
bcdDevice = 0x0101
iManufacturer = 0x0001 <Ralink>
iProduct = 0x0002 <802.11 n WLAN>
iSerialNumber = 0x0003 <1.0>
bNumConfigurations = 0x0001
--
You are receiving this mail because:
You are the assignee for the bug.
b***@freebsd.org
2018-04-23 20:32:07 UTC

Damjan Jovanovic <***@gmail.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |***@FreeBSD.org
Severity|Affects Only Me |Affects Many People

--- Comment #1 from Damjan Jovanovic <***@gmail.com> ---
I bisected this regression to SVN commit 304629, which is this commit in
GitHub:

b33c688c747fa7cc20a9f21b3e6f4e137647a4cc is the first bad commit
commit b33c688c747fa7cc20a9f21b3e6f4e137647a4cc
Author: hselasky <***@FreeBSD.org>
Date: Mon Aug 22 19:32:50 2016 +0000

Don't separate the status stage of the XHCI USB control transfers into
its own job because this breaks the simplified QEMU XHCI TRB parser,
which expects the complete USB control transfer as a series of back to
back TRBs. The old behaviour is kept under #ifdef in case this change
breaks enumeration of any USB devices.

PR: 212021
MFC after: 1 week

:040000 040000 8fc5db306b826d39d5b25322b5f8f4aae71cd450 05ffa3294c4cbbfb29a04bcf29a53f305313a937 M sys



Upgrading importance and adding author to CC.
--
b***@freebsd.org
2018-04-24 01:12:47 UTC

Damjan Jovanovic <***@gmail.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
Component|wireless |usb
--
b***@freebsd.org
2018-04-24 01:14:13 UTC

Damjan Jovanovic <***@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[regression] ralink hostap  |[regression] r304629 broke
                   |mode with authmode WPA2     |ralink USB hostap mode with
                   |works in 10.3, broken in    |authmode WPA2
                   |CURRENT                     |
--
b***@freebsd.org
2018-04-27 05:29:16 UTC

--- Comment #2 from Damjan Jovanovic <***@gmail.com> ---
Undoing the bad commit (304629) on top of CURRENT gets my WiFi dongle working
again.

Can someone please find a permanent fix for this bug though? It must affect
other USB devices. It also affects the 11.1 release, if not more of them.
--
b***@freebsd.org
2018-04-27 07:01:00 UTC

--- Comment #3 from Hans Petter Selasky <***@FreeBSD.org> ---
Hi,

Can you collect "usbdump -i usbusX -f Y -w log.pcap" traces when plugging in
your device in both cases, with and without reverting r304629?

Plug in your device once and note the X.Y numbers after "ugen".

Unplug your device and start usbdump as shown above. Then plug the device in
again, and save the log after a few seconds.

Upload binary logs to this PR.
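
The mapping from the X.Y numbers to the usbdump arguments can be sketched as a small shell helper. This is only an illustration: the function name usbdump_cmd is made up here, and it assumes the device keeps the same ugenX.Y numbers when it is plugged in again. It prints the command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: print the usbdump capture command for a device
# name such as "ugen0.5" (X = bus number, Y = device address).
usbdump_cmd() {
    name=${1#ugen}          # strip the "ugen" prefix, leaving "X.Y"
    bus=${name%%.*}         # X: everything before the first dot
    addr=${name#*.}         # Y: everything after the first dot
    # Second argument is the output file; defaults to log.pcap.
    printf 'usbdump -i usbus%s -f %s -w %s\n' "$bus" "$addr" "${2:-log.pcap}"
}

usbdump_cmd ugen0.5
```

For the dongle in this report (ugen0.5) this gives the bus and address to capture on; start the printed command as root before plugging the device back in.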

--HPS
--
b***@freebsd.org
2018-04-27 07:07:04 UTC

--- Comment #4 from Hans Petter Selasky <***@FreeBSD.org> ---
Beware that this might be a bug in your USB device, that it only works with
EHCI and not XHCI.
--
b***@freebsd.org
2018-04-27 08:50:56 UTC

--- Comment #5 from Damjan Jovanovic <***@gmail.com> ---
Created attachment 192839
--> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=192839&action=edit
usbdump log with r304629

Thank you so much. Here is the usbdump taken at commit r304629 itself, where I
got that "could not load microcode" error.
--
b***@freebsd.org
2018-04-27 08:59:06 UTC

--- Comment #6 from Damjan Jovanovic <***@gmail.com> ---
Created attachment 192840
--> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=192840&action=edit
usbdump log without r304629

Here is the usbdump taken on CURRENT with commit r304629 reverted, where
hostapd is working and my phone successfully connected.

I think only the 2 blue USB ports on the back of my PC are USB 3. The dongle
was only tested with a black USB port on the front.
--
b***@freebsd.org
2018-04-28 09:47:55 UTC

Hans Petter Selasky <***@FreeBSD.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
Status|New |In Progress
--
b***@freebsd.org
2018-04-28 10:02:07 UTC

--- Comment #7 from Hans Petter Selasky <***@FreeBSD.org> ---
Hi,

In the log-with-r304629.pcap I clearly see ERR=IOERROR which is not present in
the other log file.
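
A quick way to compare the two captures is to count the failed transfers in each. The sketch below assumes the decoded text output of usbdump contains the literal string ERR=IOERROR for such transfers, as quoted above; the function name count_ioerrors is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: count lines on stdin that report ERR=IOERROR.
# grep -c prints 0 and exits non-zero when nothing matches, so the
# "|| true" keeps the function usable under "set -e".
count_ioerrors() {
    grep -c 'ERR=IOERROR' || true
}
```

Typical use would be to decode a saved capture with "usbdump -r log-with-r304629.pcap" and pipe the text into count_ioerrors, then repeat for the other log; a non-zero count in only one of them points at the same difference.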

I suspect that your XHCI controller needs a quirk, i.e. the old behaviour.

If I were to guess, the problem is that your XHCI controller executes the
status stage instantly after the so-called USB data stage, because the patch
makes the DMA jobs back to back. The ralink dongle doesn't handle this and
IOERROR usually means no response. Typically this patch makes the status stage
execute after the next SOF. This should be handled automagically by the XHCI
controller.

Can you post the output of "pciconf -lv" for your XHCI PCI device?

I guess we can make the #ifdef into a system tunable, and then you can set it
in /boot/loader.conf?

--HPS
--
b***@freebsd.org
2018-04-29 06:23:13 UTC

--- Comment #8 from Damjan Jovanovic <***@gmail.com> ---
(In reply to Hans Petter Selasky from comment #7)

Thank you for your help. Here is my pciconf -lv:

***@pci0:0:0:0: class=0x060000 card=0x50001458 chip=0x0c008086 rev=0x06 hdr=0x00
    vendor = 'Intel Corporation'
    device = '4th Gen Core Processor DRAM Controller'
    class = bridge
    subclass = HOST-PCI
***@pci0:0:1:0: class=0x060400 card=0x50001458 chip=0x0c018086 rev=0x06 hdr=0x01
    vendor = 'Intel Corporation'
    device = 'Xeon E3-1200 v3/4th Gen Core Processor PCI Express x16 Controller'
    class = bridge
    subclass = PCI-PCI
***@pci0:0:20:0: class=0x0c0330 card=0x50071458 chip=0x8c318086 rev=0x05 hdr=0x00
    vendor = 'Intel Corporation'
    device = '8 Series/C220 Series Chipset Family USB xHCI'
    class = serial bus
    subclass = USB
***@pci0:0:22:0: class=0x078000 card=0x1c3a1458 chip=0x8c3a8086 rev=0x04 hdr=0x00
    vendor = 'Intel Corporation'
    device = '8 Series/C220 Series Chipset Family MEI Controller'
    class = simple comms
***@pci0:0:26:0: class=0x0c0320 card=0x50061458 chip=0x8c2d8086 rev=0x05 hdr=0x00
    vendor = 'Intel Corporation'
    device = '8 Series/C220 Series Chipset Family USB EHCI'
    class = serial bus
    subclass = USB
***@pci0:0:27:0: class=0x040300 card=0xa0021458 chip=0x8c208086 rev=0x05 hdr=0x00
    vendor = 'Intel Corporation'
    device = '8 Series/C220 Series Chipset High Definition Audio Controller'
    class = multimedia
    subclass = HDA
***@pci0:0:28:0: class=0x060400 card=0x50011458 chip=0x8c108086 rev=0xd5 hdr=0x01
    vendor = 'Intel Corporation'
    device = '8 Series/C220 Series Chipset Family PCI Express Root Port'
    class = bridge
    subclass = PCI-PCI
***@pci0:0:28:2: class=0x060400 card=0x50011458 chip=0x8c148086 rev=0xd5 hdr=0x01
    vendor = 'Intel Corporation'
    device = '8 Series/C220 Series Chipset Family PCI Express Root Port'
    class = bridge
    subclass = PCI-PCI
***@pci0:0:28:3: class=0x060401 card=0x50011458 chip=0x244e8086 rev=0xd5 hdr=0x01
    vendor = 'Intel Corporation'
    device = '82801 PCI Bridge'
    class = bridge
    subclass = PCI-PCI
***@pci0:0:29:0: class=0x0c0320 card=0x50061458 chip=0x8c268086 rev=0x05 hdr=0x00
    vendor = 'Intel Corporation'
    device = '8 Series/C220 Series Chipset Family USB EHCI'
    class = serial bus
    subclass = USB
***@pci0:0:31:0: class=0x060100 card=0x50011458 chip=0x8c508086 rev=0x05 hdr=0x00
    vendor = 'Intel Corporation'
    device = 'B85 Express LPC Controller'
    class = bridge
    subclass = PCI-ISA
***@pci0:0:31:2: class=0x010601 card=0xb0051458 chip=0x8c028086 rev=0x05 hdr=0x00
    vendor = 'Intel Corporation'
    device = '8 Series/C220 Series Chipset Family 6-port SATA Controller 1 [AHCI mode]'
    class = mass storage
    subclass = SATA
***@pci0:0:31:3: class=0x0c0500 card=0x50011458 chip=0x8c228086 rev=0x05 hdr=0x00
    vendor = 'Intel Corporation'
    device = '8 Series/C220 Series Chipset Family SMBus Controller'
    class = serial bus
    subclass = SMBus
***@pci0:1:0:0: class=0x030000 card=0x00000000 chip=0x0f0110de rev=0xa1 hdr=0x00
    vendor = 'NVIDIA Corporation'
    device = 'GF108 [GeForce GT 620]'
    class = display
    subclass = VGA
***@pci0:1:0:1: class=0x040300 card=0x00000000 chip=0x0e0810de rev=0xa1 hdr=0x00
    vendor = 'NVIDIA Corporation'
    device = 'GF119 HDMI Audio Controller'
    class = multimedia
    subclass = HDA
***@pci0:3:0:0: class=0x020000 card=0xe0001458 chip=0x816810ec rev=0x06 hdr=0x00
    vendor = 'Realtek Semiconductor Co., Ltd.'
    device = 'RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller'
    class = network
    subclass = ethernet
***@pci0:4:0:0: class=0x060401 card=0x88921458 chip=0x244e8086 rev=0x41 hdr=0x01
    vendor = 'Intel Corporation'
    device = '82801 PCI Bridge'
    class = bridge
    subclass = PCI-PCI
--
b***@freebsd.org
2018-04-29 06:42:54 UTC

--- Comment #9 from Damjan Jovanovic <***@gmail.com> ---
Apparently r304629 was added because qemu needed these back to back DMA jobs
for its emulated XHCI controller (in bug 212021). But qemu is cross platform,
just like my USB dongle. How does Windows get both of them to work? How does
Linux? Other BSDs? Do they all use these back to back DMA jobs, or does only
FreeBSD use them?

Also I can - very rarely - get the WiFi dongle to work even with r304629
applied. It only works 19.67% of the time. Could there be some kind of timing
issue or race condition that is being exacerbated by r304629? Let me try to get
a usbdump of one of these rare cases. I think it should be very revealing.
--
b***@freebsd.org
2018-04-29 09:00:58 UTC

--- Comment #10 from Damjan Jovanovic <***@gmail.com> ---
Created attachment 192894
--> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=192894&action=edit
usbdump log of the rare case when it works with r304629

Here is the usbdump of the rare case when it fully works at commit r304629
itself. Does it reveal anything useful, compared to the broken case in
attachment 192839?
--
b***@freebsd.org
2018-04-29 09:47:11 UTC
--- Comment #11 from Hans Petter Selasky <***@FreeBSD.org> ---
> How does Linux?

Linux is better at adding quirks when needed.

This affects not only QEMU, but also VMware and all kinds of hypervisors.

--HPS
--
b***@freebsd.org
2018-04-29 09:50:29 UTC

--- Comment #12 from Hans Petter Selasky <***@FreeBSD.org> ---
Hi,

The XHCI controller you have already has some quirks in FreeBSD, if you look at
sys/dev/usb/controller/xhci_pci.c:

case 0x8c318086: /* Lynx Point */

I'll add another one if I find some time later today for this issue.
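
Whether a given machine has the same controller can be checked from the "pciconf -lv" text. The sketch below is an illustration only: the function names are made up, and it relies on PCI class code 0x0c0330 (serial bus / USB / xHCI programming interface) to pick out xHCI controllers, then compares the chip ID against the Lynx Point value quoted above.

```shell
#!/bin/sh
# Hypothetical helper: read "pciconf -lv" text on stdin and print the
# chip ID of every device whose class code is 0x0c0330 (USB xHCI).
xhci_chip_ids() {
    awk '/class=0x0c0330/ {
        for (i = 1; i <= NF; i++)
            if ($i ~ /^chip=/) { sub(/^chip=/, "", $i); print $i }
    }'
}

# Hypothetical helper: succeed if any xHCI controller on stdin carries
# the Lynx Point chip ID from the xhci_pci.c case label.
is_lynx_point() {
    xhci_chip_ids | grep -qx '0x8c318086'
}
```

Run as "pciconf -lv | xhci_chip_ids" on the affected machine; for the listing in comment #8 it would report 0x8c318086, which matches the existing case label.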

--HPS
--
b***@freebsd.org
2018-04-30 07:31:22 UTC

--- Comment #14 from Hans Petter Selasky <***@FreeBSD.org> ---
Please try the patch submitted to head.

Let me know if it doesn't work.
--
b***@freebsd.org
2018-04-30 07:30:45 UTC

--- Comment #13 from commit-***@freebsd.org ---
A commit references this bug:

Author: hselasky
Date: Mon Apr 30 07:30:38 UTC 2018
New revision: 333100
URL: https://svnweb.freebsd.org/changeset/base/333100

Log:
Improve fix in r304629 by allowing configuration of the behaviour
through a SYSCTL instead of a compile time define.

Add quirk by default for all LynxPoint XHCI controllers.

PR: 227602
MFC after: 3 days
Sponsored by: Mellanox Technologies

Changes:
head/sys/dev/usb/controller/xhci.c
head/sys/dev/usb/controller/xhci.h
head/sys/dev/usb/controller/xhci_pci.c
--
b***@freebsd.org
2018-04-30 07:31:37 UTC

Hans Petter Selasky <***@FreeBSD.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
Status|In Progress |Closed
Resolution|--- |FIXED
--
b***@freebsd.org
2018-05-03 07:29:52 UTC

--- Comment #15 from commit-***@freebsd.org ---
A commit references this bug:

Author: hselasky
Date: Thu May 3 07:29:08 UTC 2018
New revision: 333199
URL: https://svnweb.freebsd.org/changeset/base/333199

Log:
MFC r333100:
Improve fix in r304629 by allowing configuration of the behaviour
through a SYSCTL instead of a compile time define.

Add quirk by default for all LynxPoint XHCI controllers.

PR: 227602
Sponsored by: Mellanox Technologies

Changes:
_U stable/11/
stable/11/sys/dev/usb/controller/xhci.c
stable/11/sys/dev/usb/controller/xhci.h
stable/11/sys/dev/usb/controller/xhci_pci.c
--
b***@freebsd.org
2018-05-03 07:39:02 UTC

--- Comment #16 from commit-***@freebsd.org ---
A commit references this bug:

Author: hselasky
Date: Thu May 3 07:38:46 UTC 2018
New revision: 333203
URL: https://svnweb.freebsd.org/changeset/base/333203

Log:
MFC r333100:
Improve fix in r304629 by allowing configuration of the behaviour
through a SYSCTL instead of a compile time define.

Add quirk by default for all LynxPoint XHCI controllers.

PR: 227602
Sponsored by: Mellanox Technologies

Changes:
_U stable/10/
stable/10/sys/dev/usb/controller/xhci.c
stable/10/sys/dev/usb/controller/xhci.h
stable/10/sys/dev/usb/controller/xhci_pci.c
--
b***@freebsd.org
2018-05-16 18:13:31 UTC

--- Comment #17 from Damjan Jovanovic <***@gmail.com> ---
It works with that commit. Thank you so much!

     .-'"""""'-.
   .'           `.
  /   .      .    \
 :                 :
 |                 |
 :   \        /    :
  \   `.____.'    /
   `.           .'
     `-._____.-'
--
b***@freebsd.org
2018-11-06 04:14:08 UTC

Kubilay Kocak <***@FreeBSD.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |***@FreeBSD.org
Assignee|***@FreeBSD.org |***@FreeBSD.org
Flags| |mfc-stable10+,
| |mfc-stable11+