Unable to use USB scanner more than once

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

Unable to use USB scanner more than once

Natasha Kerensikova
Hello,

I recently bought a SANE-supported USB document scanner (Fujitsu
S1300i), in the hope of using it on my OpenBSD desktop. After a bit of
murking around with firmware blobs, I eventually managed to make it work
once. At that is where I'm stuck.

To put it more precisely, after turning on the device, I get exactly
once successful use of `scanimage`, and only timeouts afterwards.

At first I thought of an issue with not sending the firmware blob, but
then I wrote a culled-down use of libusb which only queries the device
status, and I get the same symptoms: status is read fine on the first
try after powering the device, and then only read timeouts (but the
command-writing part is still successful).

Here is the code in question:
https://upload.instinctive.eu/5DpBeF32WUvs1f2Yc3TC34fbz38/test-usb.c
(sorry for the lazy style)

It is probably worth pointing that the exact same code, compiled on
linux with the same libusb version, and the exact same device, manages
to query repeatedly the device status both before and after sending the
frimware blob.

So I belive there is a bug somewhere in libusb port or in OpenBSD USB
stack. My guess would be something that is not correctly cleaned-up
after the first use.

Since you all probably don't have the device, I don't expect a fix, but
I would be very interested in what I can do to further diagnose the
issue. I guess it would be nice to spy on what is going on in the USB
wire, but I don't have that kind of hardware. So I was hoping you could
help me navigate the USB stack to dump useful information.

I have some dumps from libusb-debug, but I cannot find anything useful
here, both runs are identical until the timeout. Maybe you can spot
something more?
https://upload.instinctive.eu/a6TAzULS9R6k82g_U9go4XjmDAo/libusb-debug-first.txt
https://upload.instinctive.eu/kn7VfrcHo1FLaDuU_Lk4L8FqhNI/libusb-debug-second.txt

In case it matters, I'm running CURRENT, with `what /bsd` printing
OpenBSD 6.6-current (GENERIC.MP) #643: Tue Feb 11 00:08:34 MST 2020

Any ideas?


Thanks in advance for your help,
Natasha

Reply | Threaded
Open this post in threaded view
|

Re: Unable to use USB scanner more than once

Stuart Henderson
On 2020/02/12 10:19, Natasha Kerensikova wrote:
> So I belive there is a bug somewhere in libusb port or in OpenBSD USB
> stack.

I believe there is more than one bug :-)

> Since you all probably don't have the device, I don't expect a fix, but
> I would be very interested in what I can do to further diagnose the
> issue. I guess it would be nice to spy on what is going on in the USB
> wire, but I don't have that kind of hardware. So I was hoping you could
> help me navigate the USB stack to dump useful information.

Linux and OpenBSD both support USB packet capture, you could compare
traces (taken with tcpdump or wireshark) between the two, maybe that will
give some clues.

Reply | Threaded
Open this post in threaded view
|

Re: Unable to use USB scanner more than once

Natasha Kerensikova
Hello,

on Wednesday 12 February 2020 at 11:56, Stuart Henderson wrote:

> On 2020/02/12 10:19, Natasha Kerensikova wrote:
> > Since you all probably don't have the device, I don't expect a fix, but
> > I would be very interested in what I can do to further diagnose the
> > issue. I guess it would be nice to spy on what is going on in the USB
> > wire, but I don't have that kind of hardware. So I was hoping you could
> > help me navigate the USB stack to dump useful information.
>
> Linux and OpenBSD both support USB packet capture, you could compare
> traces (taken with tcpdump or wireshark) between the two, maybe that will
> give some clues.

Thanks a lot for pointing it out, I really had no idea such capture
existed and was so easy.

According to what I dumped on Linux, the whole enumeration thing is
cached, and repeated runs of my test program only registers as the
following exchange:

18:52:45.521580 BULK SUBMIT to 1:18:2
        0x0000:  1b03                                     ..
18:52:45.521617 BULK COMPLETE from 1:18:2
18:52:45.521700 BULK SUBMIT to 1:18:1
18:52:45.521819 BULK COMPLETE from 1:18:1
        0x0000:  0800                                     ..

In OpenBSD, there are several exchanges of control packets before the
bulk operations, which seem to be four "GET DESCRIPTOR Request
CONFIGURATION", the odd ones requesting 9 bytes and the even ones
requesting 32 bytes.

So I tried cutting the exploration code from the test case, jumping
directly from the loop after libusb_get_device_list() to the sequence
libusb_open(), libusb_claim_interface(), libusb_bulk_transfer(),
libusb_bulk_transfer(), libusb_close().

Following that change, I still have the same symptoms, but without the
control exchanges. Here is the successful exchange:

23:05:50.796876 bus 0 > addr 8: ep2 bulk 2
  0000: 1b03                                     ..
23:05:50.796895 bus 0 < addr 8: ep2 bulk 0
23:05:50.809142 bus 0 > addr 8: ep1 bulk 0
23:05:50.809163 bus 0 < addr 8: ep1 bulk 2
  0000: 0800                                     ..

And here is the failing one:

23:05:52.829576 bus 0 > addr 8: ep2 bulk 2
  0000: 1b03                                     ..
23:05:52.829640 bus 0 < addr 8: ep2 bulk 0
23:05:52.830543 bus 0 > addr 8: ep1 bulk 0
23:06:02.828968 bus 0 < addr 8: ep1 bulk 0
23:06:02.829025 bus 0 > addr 8: ep0 ctrl 8
  0000: 0002 0100 0081 0000 00                   .........
23:06:02.829027 bus 0 > addr 8: ep0 ctrl 0
  0000: 01                                       .
23:06:02.829227 bus 0 < addr 8: ep0 ctrl 0
  0000: 02                                       .
23:06:11.359685 bus 0 < addr 4: ep1 intr 8
  0000: 0100 0000 0000 0000                      ........

So at this point, nothing I see on the capture explains the difference
of behavior between Linux and OpenBSD.

More intriguing is the fact that according to the capture, when the 10s
timeout occurs, the devices sends an empty response packet to the host,
but nothing in the capture ever says that the timeout is 10s. How is
this transmitted to the device?

It's probably unrelated, but there must be something somewhere that is
left different after the first (successful) exchange and which makes the
subsequent exchanges fail, and I still have no clue on whether the
"memory" is on the host OS or in the device.


Being out of ideas, I followed my intuition about some deinitialization
issue, and tried to reproduce the issue using a single executable run.
The result is available at
https://upload.instinctive.eu/t3QWc3v5LW-MzqNJ_twkqVenpvc/test-usb2.c

Basically, if I use two pairs libusb_bulk_transfer() in a row,
everything works fine, but I release the interface and claim it again
between them, I reproduce the timeout problem.

So at this point, I suspect that libusb_claim_interface() does not
exactly undo what libusb_release_interface().


Any idea on how to proceed at this point?


Thanks for your help,
Natasha

Reply | Threaded
Open this post in threaded view
|

Re: Unable to use USB scanner more than once: XHCI problem

Natasha Kerensikova
Hello,

More detials below, but TL;DR: the problem only happens on XHCI USB,
everything is fine with the same code going through EHCI; and the
problem is the loss of a short bulk xfer from the device, but it never
happens on the first handle opened for the device file, only on
subsequent uses of the same device.

on Wednesday 12 February 2020 at 22:33, Natasha Kerensikova wrote:

> Being out of ideas, I followed my intuition about some deinitialization
> issue, and tried to reproduce the issue using a single executable run.
> The result is available at
> https://upload.instinctive.eu/t3QWc3v5LW-MzqNJ_twkqVenpvc/test-usb2.c
>
> Basically, if I use two pairs libusb_bulk_transfer() in a row,
> everything works fine, but I release the interface and claim it again
> between them, I reproduce the timeout problem.
>
> So at this point, I suspect that libusb_claim_interface() does not
> exactly undo what libusb_release_interface().

To continue the investigations, I got rid of libusb, and used OpenBSD
API directly, using the following code:
https://upload.instinctive.eu/zHYhUftnlxgEvknA9whSUHZHslc/test-usb3.c

I have really no idea what I'm doing, maybe I'm badly misusing the API,
but to my untrained eye it seems reasonable to open(), ioctl(), read()
or write(), and close() the device file, and expecting to leave things
the way I found them.
When running this code, the first iteration of the loop works fine, and
the second read() times out.

To double-check, I tried moving open() and close() outside of the loop,
which means two write/read pairs with the same fds. Both times succeed,
and I could run it several times and it still succeeds.

So I went on with three write/read paris on the same fds, and I'm back
with the first run working, and the subsequent runs timing out; however
only the first read times out, the second and third do work fine.

So it seems there is some kind of parity thing on, with odd requests
putting the system in a wierd, bogus state, and even requests bringing
back to normal.

I tried further diagnosing by enabling kernel debugging traces for ugen,
usb, and xhci. Here is a fragment of /var/log/messages showing the
powering of the USB device, two runs of the binary with three write/read
paris on the same fds, and unpowering of the device, with
ugendebug = 10;
usbdebug = 10;
xhcidebug = 9;
https://upload.instinctive.eu/mI8lANDl7KTetMiuo7EQc3hzRHA/2020-02-13-13-50-messages.txt

As far as I can tell, all read() behave similarly and enter the only
while loop in usbd_transfer(), and the difference between successful and
timing-out read() lies in the external completion of the xfer,
presumably somewhere in xhci.

As said above, this made me try with an older laptop (after updating to
the latest CURRENT), and with its ehci-based USB I cannot reproduce any
part of this problem (and the enumeration of `scanimage -L` is much much
faster).

I guess at this point I'm blocked by my lack of global understanding of
xhci and how it works with usbdi.

Would anyone have any idea on how to further diagnose the issue?


Thanks for help,
Natasha