[RPI4] axen checksum err followed by kernel panic

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

[RPI4] axen checksum err followed by kernel panic

OpenBSD Bug Reporter
Hello,

Recurring bug on OpenBSD 6.7-RELEASE which results in kernel panic.

Hardware
- RPI4
- axen USB ethernet, D-Link DUB-1312 (USB3.0 Gigabit Ethernet Adapter)
- USB storage devices (variety have been tried and bug continues)

Occurs at least once a day. Sometimes more. System is running
24/7. Sufficient cooling, etc.

Tried fresh install and different USB storage devices, i.e. USB memory
sticks, USB hard drives and so on. Haven't tried CURRENT.

Before the panic, the messages "axen0: checksum err (pkt#1)" is
printed one or more times. I have seen it printed, at most, three
times. It is also written to /var/log/messages, sample output below-

> foo$ sudo grep checksum /var/log/messages
> Sep  7 04:50:39 foo /bsd: axen0: checksum err (pkt#1)
> Sep  7 21:50:37 foo /bsd: axen0: checksum err (pkt#1)
> Sep  8 00:10:14 foo /bsd: axen0: checksum err (pkt#1)
> Sep  8 09:24:49 foo /bsd: axen0: checksum err (pkt#1)
> Sep  8 23:21:50 foo /bsd: axen0: checksum err (pkt#1)
> Sep  8 23:21:50 foo /bsd: axen0: checksum err (pkt#1)

The kernel may panic after such message but not always. However, when
it does panic it seems to be after one or more of these messages have
been displayed.

ddb output number: (hopefully useful)

> starting local daemons: cron.
> Mon Sep  7 14:52:30 AEST 2020
>
> OpenBSD/arm64 (foo) (console)
>
> login: panic: assertwaitok: non-zero mutex count: 1
> Stopped at      panic+0x150:        TID    PID    UID     PRFLAGS     PFLAGS  C
> PU  COMMAND
> db_enter() at panic+0x14c
> panic() at mi_switch+0x38
> mi_switch() at sleep_finish+0x7c
> sleep_finish() at tsleep+0xbc
> tsleep() at usbd_transfer+0x23c
> usbd_transfer() at usbd_do_request_flags+0x168
> usbd_do_request_flags() at axen_ax88179_init+0x6c
> https://www.openbsd.org/ddb.html describes the minimum info required in bug
> reports.  Insufficient info makes it difficult to find and fix bugs.
> ddb{2}> trace
> db_enter() at panic+0x14c
> panic() at mi_switch+0x38
> mi_switch() at sleep_finish+0x7c
> sleep_finish() at tsleep+0xbc
> tsleep() at usbd_transfer+0x23c
> usbd_transfer() at usbd_do_request_flags+0x168
> usbd_do_request_flags() at axen_ax88179_init+0x6c
> axen_ax88179_init() at axen_stop+0x30
> axen_stop() at axen_encap+0x168
> axen_encap() at axen_start+0x84
> axen_start() at if_qstart_compat+0x38
> if_qstart_compat() at ifq_serialize+0xc4
> ifq_serialize() at taskq_thread+0x40
> taskq_thread() at proc_trampoline+0x10
> ddb{2}> mach ddbcpu 1
> Stopped at      ampintc_ipi_ddb+0x1c:   db_enter() at ampintc_ipi_ddb+0x18
> ampintc_ipi_ddb() at arm_cpu_intr+0x30
> arm_cpu_intr() at handle_el1h_irq+0x6c
> handle_el1h_irq() at sched_idle+0x220
> sched_idle() at proc_trampoline+0x10
> ddb{1}> trace
> db_enter() at ampintc_ipi_ddb+0x18
> ampintc_ipi_ddb() at arm_cpu_intr+0x30
> arm_cpu_intr() at handle_el1h_irq+0x6c
> handle_el1h_irq() at sched_idle+0x220
> sched_idle() at proc_trampoline+0x10
> ddb{1}> mach ddbcpu 2
> Stopped at      panic+0x150:    db_enter() at panic+0x14c
> panic() at mi_switch+0x38
> mi_switch() at sleep_finish+0x7c
> sleep_finish() at tsleep+0xbc
> tsleep() at usbd_transfer+0x23c
> usbd_transfer() at usbd_do_request_flags+0x168
> usbd_do_request_flags() at axen_ax88179_init+0x6c
> ddb{2}> trace
> db_enter() at panic+0x14c
> panic() at mi_switch+0x38
> mi_switch() at sleep_finish+0x7c
> sleep_finish() at tsleep+0xbc
> tsleep() at usbd_transfer+0x23c
> usbd_transfer() at usbd_do_request_flags+0x168
> usbd_do_request_flags() at axen_ax88179_init+0x6c
> axen_ax88179_init() at axen_stop+0x30
> axen_stop() at axen_encap+0x168
> axen_encap() at axen_start+0x84
> axen_start() at if_qstart_compat+0x38
> if_qstart_compat() at ifq_serialize+0xc4
> ifq_serialize() at taskq_thread+0x40
> taskq_thread() at proc_trampoline+0x10
> ddb{2}> mach ddbcpu 3
> Stopped at      ampintc_ipi_ddb+0x1c:   db_enter() at ampintc_ipi_ddb+0x18
> ampintc_ipi_ddb() at arm_cpu_intr+0x30
> arm_cpu_intr() at handle_el1h_irq+0x6c
> handle_el1h_irq() at sched_idle+0x220
> sched_idle() at proc_trampoline+0x10
> ddb{3}> trace
> db_enter() at ampintc_ipi_ddb+0x18
> ampintc_ipi_ddb() at arm_cpu_intr+0x30
> arm_cpu_intr() at handle_el1h_irq+0x6c
> handle_el1h_irq() at sched_idle+0x220
> sched_idle() at proc_trampoline+0x10
> ddb{3}> ps
>    PID     TID   PPID    UID  S       FLAGS  WAIT          COMMAND
>  43925  411488      1      0  3    0x100083  ttyin         getty
>  81261  460486      1      0  3    0x100098  poll          cron
>  43112  239668      1     99  3    0x100090  poll          sndiod
>  75321  313265      1    110  3    0x100090  poll          sndiod
>  34479  162555      1      0  3        0x80  poll          ftpd
>  13361   80056  58541     95  3    0x100092  kqread        smtpd
>  86037  390071  58541    103  3    0x100092  kqread        smtpd
>   7203  124298  58541     95  3    0x100092  kqread        smtpd
>  13688   48560  58541     95  3    0x100092  kqread        smtpd
>  34867  349798  58541     95  3    0x100092  kqread        smtpd
>  47820  489186  58541     95  3    0x100092  kqread        smtpd
>  58541  116709      1      0  3    0x100080  kqread        smtpd
>  60922  144462      1     77  3    0x100090  poll          dhcpd
>  54345  404136      1      0  3        0x80  select        sshd
>  91630  157036      1      0  3    0x100080  poll          ntpd
>  34697  218273  91625     83  3    0x100092  poll          ntpd
>  91625  499335      1     83  3    0x100092  poll          ntpd
>  47601    4759      1     53  3        0x90  kqread        unbound
>  58797   24645  76108     74  3    0x100092  bpf           pflogd
>  76108    8510      1      0  3        0x80  netio         pflogd
>  35424   70939  89649     73  3    0x100090  kqread        syslogd
>  89649   71851      1      0  3    0x100082  netio         syslogd
>  73736  426743      1     77  3    0x100090  poll          dhclient
>  63099  251459      1      0  3        0x80  poll          dhclient
>  46142   84838  36744    115  3    0x100092  kqread        slaacd
>  86475  180387  36744    115  3    0x100092  kqread        slaacd
>  36744  468962      1      0  3    0x100080  kqread        slaacd
>  19393  162333      0      0  3     0x14200  bored         smr
>  78539  358287      0      0  3     0x14200  pgzero        zerothread
>  78693  387616      0      0  3     0x14200  aiodoned      aiodoned
>  10074    1686      0      0  3     0x14200  syncer        update
>  88759  216228      0      0  3     0x14200  cleaner       cleaner
>  14820  158091      0      0  3     0x14200  reaper        reaper
>  55217  109680      0      0  3     0x14200  pgdaemon      pagedaemon
>  60838  450047      0      0  3     0x14200  bored         bwfm0
>   6308  221213      0      0  3     0x14200  bored         crynlk
>  83000  472552      0      0  3     0x14200  bored         crypto
>  19888  115172      0      0  3     0x14200  bored         tztq
> *10979  443454      0      0  7  0x40014200                idle3
>  19210  149871      0      0  3  0x40014200                idle2
>  44551  465903      0      0  7  0x40014200                idle1
>  35302  387579      0      0  3     0x14200  usbtsk        usbtask
>  66840  504792      0      0  3     0x14200  usbatsk       usbatsk
>  97049  173004      0      0  3     0x14200  mmctsk        sdmmc1
>  11763   70779      0      0  3     0x14200  mmctsk        sdmmc0
>  68179   15525      0      0  3     0x14200  usbsyn        softnet
>   2045  458507      0      0  3     0x14200  bored         systqmp
>  15740   44439      0      0  3     0x14200  bored         systq
>  91058  303122      0      0  3  0x40014200  bored         softclock
>  33017  399155      0      0  7  0x40014200                idle0
>  41574  269690      0      0  3     0x14200  kmalloc       kmthread
>      1  338012      0      0  3        0x82  wait          init
>      0       0     -1      0  3     0x10200  scheduler     swapper
> ddb{3}> show panic
> assertwaitok: non-zero mutex count: 1
> ddb{3}> show registers
> x0                0xffffff800972a800    _end+0x89d6130
> x1                0xffffff8029ef1000
> x2                               0x4
> x3                        0xfffffff0
> x4                0xffffff80003ad620    smc_call
> x5                0xffffff8029f92000
> x6                            0xfde8    $d.0+0xfd8b
> x7                                 0
> x8                0xffffff800972a800    _end+0x89d6130
> x9                0xffffff8000b66000    $d.33
> x10                             0xb0    $d.0+0x53
> x11                       0x60000005
> x12                                0
> x13                                0
> x14                          0x1627a    $d.0+0x1621d
> x15               0x46e60019ae37f6d2
> x16                     0x1ad373c848
> x17                     0x1aadd5b910
> x18               0xffffff8029f96c40
> x19                                0
> x20                            0x801    $d.0+0x7a4
> x21               0xffffff8029f96ce0
> x22               0xffffff8000d1f000    uvm_map_entry_pool+0xd0
> x23                                0
> x24               0xffffff800972c500    _end+0x89d7e30
> x25                                0
> x26                            0x3c0    $d.0+0x363
> x27                              0x3
> x28                              0x1
> x29               0xffffff8029f96c50
> x30                                0
> sp                0xffffff8029f96c40
> spsr                      0xa0000345
> elr               0xffffff800087db74    db_enter+0xc
> lr                0xffffff8000823470    ampintc_ipi_ddb+0x1c
> ampintc_ipi_ddb+0x1c:
> ddb{3}>

On previous crashes, I, unfortunately, did not carry out sufficient
traces, etc. However, here they are for completeness only. Their use
is probably extremely limited and, at most, probably only show that
the panic repeats itself.

ddb output number 2:

> axen0: checksum err (pkt#1)
> axen0: checksum err (pkt#1)
> axen0: checksum err (pkt#1)
> panic: assertwaitok: non-zero mutex count: 1
> Stopped at      panic+0x150:        TID    PID    UID     PRFLAGS     PFLAGS  C
> PU  COMMAND
> db_enter() at panic+0x14c
> panic() at mi_switch+0x38
> mi_switch() at sleep_finish+0x7c
> sleep_finish() at tsleep+0xbc
> tsleep() at usbd_transfer+0x23c
> usbd_transfer() at usbd_do_request_flags+0x168
> usbd_do_request_flags() at axen_ax88179_init+0x6c
> https://www.openbsd.org/ddb.html describes the minimum info required in bug
> reports.  Insufficient info makes it difficult to find and fix bugs.
> ddb{1}> ddb{1}> ddb{1}> Bad character
> ?
> ddb{1}> machine     kill        print       p           pprint      examine
> x           search      set         write       w           delete
> d           break       dwatch      watch       step        s
> continue    c           until       next        match       trace
> bt          call        ps          callout     show        boot
> help        hangman     dmesg
> ddb{1}> db_enter() at panic+0x14c
> panic() at mi_switch+0x38
> mi_switch() at sleep_finish+0x7c
> sleep_finish() at tsleep+0xbc
> tsleep() at usbd_transfer+0x23c
> usbd_transfer() at usbd_do_request_flags+0x168
> usbd_do_request_flags() at axen_ax88179_init+0x6c
> axen_ax88179_init() at axen_stop+0x30
> axen_stop() at axen_encap+0x168
> axen_encap() at axen_start+0x84
> axen_start() at if_qstart_compat+0x38
> if_qstart_compat() at ifq_serialize+0xc4
> ifq_serialize() at taskq_thread+0x40
> taskq_thread() at proc_trampoline+0x10

ddb output number 3:

> ddb{1}> boot dump
> panic: kernel diagnostic assertion "__mp_lock_held(&sched_lock, curcpu()) == 0" failed: file "/usr/src/sys/kern/kern_lock.c", line 63
> Stopped at      panic+0x150:    db_enter() at panic+0x14c
> panic() at _kernel_lock+0x110
> __mp_lock_held() at solock+0x40
> solock() at route_input+0xc4
> route_input() at if_down+0xa4
> if_down() at if_downall+0x40
> if_downall() at boot+0xac

ddb output number 4:

> login: axen0: checksum err (pkt#1)
> panic: assertwaitok: non-zero mutex count: 1
> Stopped at      panic+0x150:        TID    PID    UID     PRFLAGS     PFLAGS  C
> PU  COMMAND
> db_enter() at panic+0x14c
> panic() at mi_switch+0x38
> mi_switch() at sleep_finish+0x7c
> sleep_finish() at tsleep+0xbc
> tsleep() at usbd_transfer+0x23c
> usbd_transfer() at usbd_do_request_flags+0x168
> usbd_do_request_flags() at axen_ax88179_init+0x6c

If required, I can execute any suggested commands in the ddb console
etc required on the next kernel panic (which will happen).

I also noticed that if I run,

> mach ddbcpu 0

when the ddb console shows, the ddb console will freeze/hang.

I haven't carried out an objdump etc as yet.

Kind regards,
--
OpenBSD Bug Reporter

Reply | Threaded
Open this post in threaded view
|

Re: [RPI4] axen checksum err followed by kernel panic

OpenBSD Bug Reporter
Hello again,

Not sure if this is at all related to my previous email. Despite the
same kernel panic message, this is first time I have seen trace
messages like this.

> OpenBSD/arm64 (foo) (console)
>
> login: panic: assertwaitok: non-zero mutex count: 1
> Stopped at      0xffffff800074f198:         TID    PID    UID     PRFLAGS     P
> FLAGS  CPU  COMMAND
> ffffff80007050b0 at 0xffffff800074f194
> ffffff800074f198 at 0xffffff80007ff490
> ffffff80007ff494 at 0xffffff80003d5e84
> ffffff80003d5e88 at 0xffffff80003d51b4
> ffffff80003d51b8 at 0xffffff8000912cd4
> ffffff8000912cd8 at 0xffffff8000913ea4
> ffffff8000913ea8 at 0xffffff80007e5050
> https://www.openbsd.org/ddb.html describes the minimum info required in bug
> reports.  Insufficient info makes it difficult to find and fix bugs.
> ddb{3}> show panic
> assertwaitok: non-zero mutex count: 1
> ddb{3}> show registers
> x0                               0x1
> x1                0xffffff8029ef2000
> x2                               0x4
> x3                              0x10
> x4                0xffffff8029f6e990
> x5                0xffffff800072fcd4
> x6                0xa56b35985c632f01
> x7                0xa56b35985c632f01
> x8                               0x1
> x9                0xffffff8000b67000
> x10                              0xb
> x11                           0x12c3
> x12                                0
> x13                       0xffffffff
> x14                                0
> x15               0x86802f3d4b5762b8
> x16               0xffffff8009b4d024
> x17                              0x2
> x18               0xffffff8029f6ea00
> x19               0xffffff8000a1fe63
> x20                            0x100
> x21               0xffffff8000c75398
> x22               0xffffff8000d1c000
> x23               0xffffff8029f6eb00
> x24               0xffffff8009967800
> x25               0xffffff8000c7f000
> x26                              0x1
> x27               0xffffff8000c7f000
> x28                              0xf
> x29               0xffffff8029f6eaf0
> x30               0xffffff8029ec9000
> sp                0xffffff8029f6ea00
> spsr                      0x60000005
> elr               0xffffff80007050b0
> lr                0xffffff800074f198
> 0xffffff800074f198:
> ddb{3}> trace
> ffffff80007050b0 at 0xffffff800074f194
> ffffff800074f198 at 0xffffff80007ff490
> ffffff80007ff494 at 0xffffff80003d5e84
> ffffff80003d5e88 at 0xffffff80003d51b4
> ffffff80003d51b8 at 0xffffff8000912cd4
> ffffff8000912cd8 at 0xffffff8000913ea4
> ffffff8000913ea8 at 0xffffff80007e5050
> ffffff80007e5054 at 0xffffff80007e5740
> ffffff80007e5744 at 0xffffff80007e6328
> ffffff80007e632c at 0xffffff80007e5a5c
> ffffff80007e5a60 at 0xffffff800052c5b0
> ffffff800052c5b4 at 0xffffff80008f3b34
> ffffff80008f3b38 at 0xffffff800063ebec
> ffffff800063ebf0 at 0xffffff80007c1a6c
> ddb{3}> mach ddbcpu 1
> Stopped at      0xffffff800072cbc4:     ffffff80007050b0 at 0xffffff800072cbc0
> ffffff800072cbc4 at 0xffffff80007b5da8
> ffffff80007b5dac at 0xffffff80005df92c
> ffffff80005df930 at 0xffffff800054be64
> ffffff800054be68 at 0xffffff80007c1a6c
> ddb{1}> trace
> ffffff80007050b0 at 0xffffff800072cbc0
> ffffff800072cbc4 at 0xffffff80007b5da8
> ffffff80007b5dac at 0xffffff80005df92c
> ffffff80005df930 at 0xffffff800054be64
> ffffff800054be68 at 0xffffff80007c1a6c
> ddb{1}> mach ddbcpu 2
> Stopped at      0xffffff800072cbc4:     ffffff80007050b0 at 0xffffff800072cbc0
> ffffff800072cbc4 at 0xffffff80007b5da8
> ffffff80007b5dac at 0xffffff80005df92c
> ffffff80005df930 at 0xffffff800054be64
> ffffff800054be68 at 0xffffff80007c1a6c
> ddb{2}> trace
> ffffff80007050b0 at 0xffffff800072cbc0
> ffffff800072cbc4 at 0xffffff80007b5da8
> ffffff80007b5dac at 0xffffff80005df92c
> ffffff80005df930 at 0xffffff800054be64
> ffffff800054be68 at 0xffffff80007c1a6c
> ddb{2}> mach ddbcpu 3
> Stopped at      0xffffff800074f198:     ffffff80007050b0 at 0xffffff800074f194
> ffffff800074f198 at 0xffffff80007ff490
> ffffff80007ff494 at 0xffffff80003d5e84
> ffffff80003d5e88 at 0xffffff80003d51b4
> ffffff80003d51b8 at 0xffffff8000912cd4
> ffffff8000912cd8 at 0xffffff8000913ea4
> ffffff8000913ea8 at 0xffffff80007e5050
> ddb{3}> trace
> ffffff80007050b0 at 0xffffff800074f194
> ffffff800074f198 at 0xffffff80007ff490
> ffffff80007ff494 at 0xffffff80003d5e84
> ffffff80003d5e88 at 0xffffff80003d51b4
> ffffff80003d51b8 at 0xffffff8000912cd4
> ffffff8000912cd8 at 0xffffff8000913ea4
> ffffff8000913ea8 at 0xffffff80007e5050
> ffffff80007e5054 at 0xffffff80007e5740
> ffffff80007e5744 at 0xffffff80007e6328
> ffffff80007e632c at 0xffffff80007e5a5c
> ffffff80007e5a60 at 0xffffff800052c5b0
> ffffff800052c5b4 at 0xffffff80008f3b34
> ffffff80008f3b38 at 0xffffff800063ebec
> ffffff800063ebf0 at 0xffffff80007c1a6c
> ddb{3}> boot dump
> syncing disks...panic: kernel diagnostic assertion "__mp_lock_held(&sched_lock, curcpu()) == 0" failed: file "/usr/src/sys/kern/kern_lock.c", line 63
> Stopped at      0xffffff800074f198:     ffffff80007050b0 at 0xffffff800074f194
> ffffff800074f198 at 0xffffff800037f41c
> ffffff800037f420 at 0xffffff800045a994
> ffffff800045a998 at 0xffffff800045b0d0
> ffffff800045b0d4 at 0xffffff800045af80
> ffffff800045af84 at 0xffffff800060b760
> ffffff800060b764 at 0xffffff80007ed7a4
> ddb{3}> boot reboot
> rebooting...

No savecore dump file on reboot.

Kind regards,
--
OpenBSD Bug Reporter

Reply | Threaded
Open this post in threaded view
|

Re: [RPI4] axen checksum err followed by kernel panic

OpenBSD Bug Reporter
Hello again,

By way of update, if the kernel panic occurs as first described, boot
dump seems to cause the system to hang on the message 'syncing
disks...' and I have to carry out a hard reset. Not surprisingly, on
boot, savecore reports that there was no core dump.

> OpenBSD/arm64 (foo) (console)
>
> login: panic: assertwaitok: non-zero mutex count: 1
> Stopped at      panic+0x150:        TID    PID    UID     PRFLAGS     PFLAGS  C
> PU  COMMAND
> db_enter() at panic+0x14c
> panic() at mi_switch+0x38
> mi_switch() at sleep_finish+0x7c
> sleep_finish() at tsleep+0xbc
> tsleep() at usbd_transfer+0x23c
> usbd_transfer() at usbd_do_request_flags+0x168
> usbd_do_request_flags() at axen_ax88179_init+0x6c
> https://www.openbsd.org/ddb.html describes the minimum info required in bug
> reports.  Insufficient info makes it difficult to find and fix bugs.
> ddb{2}> show registers
> x0                               0x1
> x1                0xffffff8029ef0000
> x2                               0x4
> x3                              0x10
> x4                0xffffff8029f6c990
> x5                0xffffff8000703a40    _dmamap_sync
> x6                0x4f0267eec17df804
> x7                0x4f0267eec17df804
> x8                               0x1
> x9                0xffffff8000b6a000    $d.11
> x10                              0xb
> x11                           0x12e9    $d.0+0x128c
> x12                                0
> x13                       0xffffffff
> x14                                0
> x15               0xd6e11c01742377cb
> x16               0xffffff8009d1b024    _end+0x8fc771c
> x17                              0x2
> x18               0xffffff8029f6ca00
> x19               0xffffff8000a2106f    $d.2+0x7
> x20                            0x100    $d.0+0xa3
> x21               0xffffff8000c4de80    kprintf_mutex
> x22               0xffffff8000d3f000    db_history+0x1d8
> x23               0xffffff8029f6cb00
> x24               0xffffff8009965000    _end+0x8c116f8
> x25               0xffffff8000c84000    uk_cd+0x10
> x26                              0x1
> x27               0xffffff8000c84000    uk_cd+0x10
> x28                              0xf
> x29               0xffffff8029f6caf0
> x30                                0
> sp                0xffffff8029f6ca00
> spsr                      0x60000005
> elr               0xffffff80005023b4    db_enter+0xc
> lr                0xffffff80005de2f8    panic+0x150
> panic+0x150:
> ddb{2}> show panic
> assertwaitok: non-zero mutex count: 1
> ddb{2}> trace  
> db_enter() at panic+0x14c
> panic() at mi_switch+0x38
> mi_switch() at sleep_finish+0x7c
> sleep_finish() at tsleep+0xbc
> tsleep() at usbd_transfer+0x23c
> usbd_transfer() at usbd_do_request_flags+0x168
> usbd_do_request_flags() at axen_ax88179_init+0x6c
> axen_ax88179_init() at axen_stop+0x30
> axen_stop() at axen_encap+0x168
> axen_encap() at axen_start+0x84
> axen_start() at if_qstart_compat+0x38
> if_qstart_compat() at ifq_serialize+0xc4
> ifq_serialize() at taskq_thread+0x40
> taskq_thread() at proc_trampoline+0x10
> ddb{2}> mach ddbcpu 1
> Stopped at      ampintc_ipi_ddb+0x1c:   db_enter() at ampintc_ipi_ddb+0x18
> ampintc_ipi_ddb() at arm_cpu_intr+0x30
> arm_cpu_intr() at handle_el1h_irq+0x6c
> handle_el1h_irq() at sched_idle+0x220
> sched_idle() at proc_trampoline+0x10
> ddb{1}> trace
> db_enter() at ampintc_ipi_ddb+0x18
> ampintc_ipi_ddb() at arm_cpu_intr+0x30
> arm_cpu_intr() at handle_el1h_irq+0x6c
> handle_el1h_irq() at sched_idle+0x220
> sched_idle() at proc_trampoline+0x10
> ddb{1}> mach ddbpu 2
> No such command
> ddb{1}> mach ddbcpu 2
> Stopped at      panic+0x150:    db_enter() at panic+0x14c
> panic() at mi_switch+0x38
> mi_switch() at sleep_finish+0x7c
> sleep_finish() at tsleep+0xbc
> tsleep() at usbd_transfer+0x23c
> usbd_transfer() at usbd_do_request_flags+0x168
> usbd_do_request_flags() at axen_ax88179_init+0x6c
> ddb{2}> trace
> db_enter() at panic+0x14c
> panic() at mi_switch+0x38
> mi_switch() at sleep_finish+0x7c
> sleep_finish() at tsleep+0xbc
> tsleep() at usbd_transfer+0x23c
> usbd_transfer() at usbd_do_request_flags+0x168
> usbd_do_request_flags() at axen_ax88179_init+0x6c
> axen_ax88179_init() at axen_stop+0x30
> axen_stop() at axen_encap+0x168
> axen_encap() at axen_start+0x84
> axen_start() at if_qstart_compat+0x38
> if_qstart_compat() at ifq_serialize+0xc4
> ifq_serialize() at taskq_thread+0x40
> taskq_thread() at proc_trampoline+0x10
> ddb{2}> mach ddbcpu 3
> Stopped at      ampintc_ipi_ddb+0x1c:   db_enter() at ampintc_ipi_ddb+0x18
> ampintc_ipi_ddb() at arm_cpu_intr+0x30
> arm_cpu_intr() at handle_el1h_irq+0x6c
> handle_el1h_irq() at sched_idle+0x220
> sched_idle() at proc_trampoline+0x10
> ddb{3}> trace
> db_enter() at ampintc_ipi_ddb+0x18
> ampintc_ipi_ddb() at arm_cpu_intr+0x30
> arm_cpu_intr() at handle_el1h_irq+0x6c
> handle_el1h_irq() at sched_idle+0x220
> sched_idle() at proc_trampoline+0x10
> ddb{3}> boot dump
> syncing disks...
 
Kind regards,
--
OpenBSD Bug Reporter

Reply | Threaded
Open this post in threaded view
|

Re: [RPI4] axen checksum err followed by kernel panic

Stuart Henderson
I would try updating to a snapshot to see if it helps the axen errors/panic.

I don't think "boot dump" is possible with USB storage.

--
  Sent from a phone, apologies for poor formatting.
On 9 September 2020 05:56:33 OpenBSD Bug Reporter
<[hidden email]> wrote:

> Hello again,
>
> By way of update, if the kernel panic occurs as first described, boot
> dump seems to cause the system to hang on the message 'syncing
> disks...' and I have to carry out a hard reset. Not surprisingly, on
> boot, savecore reports that there was no core dump.
>
>> OpenBSD/arm64 (foo) (console)
>>
>> login: panic: assertwaitok: non-zero mutex count: 1
>> Stopped at      panic+0x150:        TID    PID    UID     PRFLAGS     PFLAGS  C
>> PU  COMMAND
>> db_enter() at panic+0x14c
>> panic() at mi_switch+0x38
>> mi_switch() at sleep_finish+0x7c
>> sleep_finish() at tsleep+0xbc
>> tsleep() at usbd_transfer+0x23c
>> usbd_transfer() at usbd_do_request_flags+0x168
>> usbd_do_request_flags() at axen_ax88179_init+0x6c
>> https://www.openbsd.org/ddb.html describes the minimum info required in bug
>> reports.  Insufficient info makes it difficult to find and fix bugs.
>> ddb{2}> show registers
>> x0                               0x1
>> x1                0xffffff8029ef0000
>> x2                               0x4
>> x3                              0x10
>> x4                0xffffff8029f6c990
>> x5                0xffffff8000703a40    _dmamap_sync
>> x6                0x4f0267eec17df804
>> x7                0x4f0267eec17df804
>> x8                               0x1
>> x9                0xffffff8000b6a000    $d.11
>> x10                              0xb
>> x11                           0x12e9    $d.0+0x128c
>> x12                                0
>> x13                       0xffffffff
>> x14                                0
>> x15               0xd6e11c01742377cb
>> x16               0xffffff8009d1b024    _end+0x8fc771c
>> x17                              0x2
>> x18               0xffffff8029f6ca00
>> x19               0xffffff8000a2106f    $d.2+0x7
>> x20                            0x100    $d.0+0xa3
>> x21               0xffffff8000c4de80    kprintf_mutex
>> x22               0xffffff8000d3f000    db_history+0x1d8
>> x23               0xffffff8029f6cb00
>> x24               0xffffff8009965000    _end+0x8c116f8
>> x25               0xffffff8000c84000    uk_cd+0x10
>> x26                              0x1
>> x27               0xffffff8000c84000    uk_cd+0x10
>> x28                              0xf
>> x29               0xffffff8029f6caf0
>> x30                                0
>> sp                0xffffff8029f6ca00
>> spsr                      0x60000005
>> elr               0xffffff80005023b4    db_enter+0xc
>> lr                0xffffff80005de2f8    panic+0x150
>> panic+0x150:
>> ddb{2}> show panic
>> assertwaitok: non-zero mutex count: 1
>> ddb{2}> trace
>> db_enter() at panic+0x14c
>> panic() at mi_switch+0x38
>> mi_switch() at sleep_finish+0x7c
>> sleep_finish() at tsleep+0xbc
>> tsleep() at usbd_transfer+0x23c
>> usbd_transfer() at usbd_do_request_flags+0x168
>> usbd_do_request_flags() at axen_ax88179_init+0x6c
>> axen_ax88179_init() at axen_stop+0x30
>> axen_stop() at axen_encap+0x168
>> axen_encap() at axen_start+0x84
>> axen_start() at if_qstart_compat+0x38
>> if_qstart_compat() at ifq_serialize+0xc4
>> ifq_serialize() at taskq_thread+0x40
>> taskq_thread() at proc_trampoline+0x10
>> ddb{2}> mach ddbcpu 1
>> Stopped at      ampintc_ipi_ddb+0x1c:   db_enter() at ampintc_ipi_ddb+0x18
>> ampintc_ipi_ddb() at arm_cpu_intr+0x30
>> arm_cpu_intr() at handle_el1h_irq+0x6c
>> handle_el1h_irq() at sched_idle+0x220
>> sched_idle() at proc_trampoline+0x10
>> ddb{1}> trace
>> db_enter() at ampintc_ipi_ddb+0x18
>> ampintc_ipi_ddb() at arm_cpu_intr+0x30
>> arm_cpu_intr() at handle_el1h_irq+0x6c
>> handle_el1h_irq() at sched_idle+0x220
>> sched_idle() at proc_trampoline+0x10
>> ddb{1}> mach ddbpu 2
>> No such command
>> ddb{1}> mach ddbcpu 2
>> Stopped at      panic+0x150:    db_enter() at panic+0x14c
>> panic() at mi_switch+0x38
>> mi_switch() at sleep_finish+0x7c
>> sleep_finish() at tsleep+0xbc
>> tsleep() at usbd_transfer+0x23c
>> usbd_transfer() at usbd_do_request_flags+0x168
>> usbd_do_request_flags() at axen_ax88179_init+0x6c
>> ddb{2}> trace
>> db_enter() at panic+0x14c
>> panic() at mi_switch+0x38
>> mi_switch() at sleep_finish+0x7c
>> sleep_finish() at tsleep+0xbc
>> tsleep() at usbd_transfer+0x23c
>> usbd_transfer() at usbd_do_request_flags+0x168
>> usbd_do_request_flags() at axen_ax88179_init+0x6c
>> axen_ax88179_init() at axen_stop+0x30
>> axen_stop() at axen_encap+0x168
>> axen_encap() at axen_start+0x84
>> axen_start() at if_qstart_compat+0x38
>> if_qstart_compat() at ifq_serialize+0xc4
>> ifq_serialize() at taskq_thread+0x40
>> taskq_thread() at proc_trampoline+0x10
>> ddb{2}> mach ddbcpu 3
>> Stopped at      ampintc_ipi_ddb+0x1c:   db_enter() at ampintc_ipi_ddb+0x18
>> ampintc_ipi_ddb() at arm_cpu_intr+0x30
>> arm_cpu_intr() at handle_el1h_irq+0x6c
>> handle_el1h_irq() at sched_idle+0x220
>> sched_idle() at proc_trampoline+0x10
>> ddb{3}> trace
>> db_enter() at ampintc_ipi_ddb+0x18
>> ampintc_ipi_ddb() at arm_cpu_intr+0x30
>> arm_cpu_intr() at handle_el1h_irq+0x6c
>> handle_el1h_irq() at sched_idle+0x220
>> sched_idle() at proc_trampoline+0x10
>> ddb{3}> boot dump
>> syncing disks...
>
> Kind regards,
> --
> OpenBSD Bug Reporter

Reply | Threaded
Open this post in threaded view
|

Re: [RPI4] axen checksum err followed by kernel panic

OpenBSD Bug Reporter
Hi Stuart,

Yes, that'll be next move. I'll try it when I get a chance and revert
with results.

However, I thought I'd refrain from immediately doing such just in
case someone on the mailing list suggested that I carry out more
tests, etc. Then again, given I'm running everything off USB drives on
this device, I can probably swap them over when tests are required.

And with "boot dump", I was guessing such but was unsure. Thanks for
the confirmation.

Kind regards,
--
OpenBSD Bug Reporter

On 20-09-09 07:07:35, Stuart Henderson wrote:

> I would try updating to a snapshot to see if it helps the axen errors/panic.
>
> I don't think "boot dump" is possible with USB storage.
>
> --
>  Sent from a phone, apologies for poor formatting.
> On 9 September 2020 05:56:33 OpenBSD Bug Reporter
> <[hidden email]> wrote:
>
> > Hello again,
> >
> > By way of update, if the kernel panic occurs as first described, boot
> > dump seems to cause the system to hang on the message 'syncing
> > disks...' and I have to carry out a hard reset. Not surprisingly, on
> > boot, savecore reports that there was no core dump.
> >
> > > OpenBSD/arm64 (foo) (console)
> > >
> > > login: panic: assertwaitok: non-zero mutex count: 1
> > > Stopped at      panic+0x150:        TID    PID    UID     PRFLAGS     PFLAGS  C
> > > PU  COMMAND
> > > db_enter() at panic+0x14c
> > > panic() at mi_switch+0x38
> > > mi_switch() at sleep_finish+0x7c
> > > sleep_finish() at tsleep+0xbc
> > > tsleep() at usbd_transfer+0x23c
> > > usbd_transfer() at usbd_do_request_flags+0x168
> > > usbd_do_request_flags() at axen_ax88179_init+0x6c
> > > https://www.openbsd.org/ddb.html describes the minimum info required in bug
> > > reports.  Insufficient info makes it difficult to find and fix bugs.
> > > ddb{2}> show registers
> > > x0                               0x1
> > > x1                0xffffff8029ef0000
> > > x2                               0x4
> > > x3                              0x10
> > > x4                0xffffff8029f6c990
> > > x5                0xffffff8000703a40    _dmamap_sync
> > > x6                0x4f0267eec17df804
> > > x7                0x4f0267eec17df804
> > > x8                               0x1
> > > x9                0xffffff8000b6a000    $d.11
> > > x10                              0xb
> > > x11                           0x12e9    $d.0+0x128c
> > > x12                                0
> > > x13                       0xffffffff
> > > x14                                0
> > > x15               0xd6e11c01742377cb
> > > x16               0xffffff8009d1b024    _end+0x8fc771c
> > > x17                              0x2
> > > x18               0xffffff8029f6ca00
> > > x19               0xffffff8000a2106f    $d.2+0x7
> > > x20                            0x100    $d.0+0xa3
> > > x21               0xffffff8000c4de80    kprintf_mutex
> > > x22               0xffffff8000d3f000    db_history+0x1d8
> > > x23               0xffffff8029f6cb00
> > > x24               0xffffff8009965000    _end+0x8c116f8
> > > x25               0xffffff8000c84000    uk_cd+0x10
> > > x26                              0x1
> > > x27               0xffffff8000c84000    uk_cd+0x10
> > > x28                              0xf
> > > x29               0xffffff8029f6caf0
> > > x30                                0
> > > sp                0xffffff8029f6ca00
> > > spsr                      0x60000005
> > > elr               0xffffff80005023b4    db_enter+0xc
> > > lr                0xffffff80005de2f8    panic+0x150
> > > panic+0x150:
> > > ddb{2}> show panic
> > > assertwaitok: non-zero mutex count: 1
> > > ddb{2}> trace
> > > db_enter() at panic+0x14c
> > > panic() at mi_switch+0x38
> > > mi_switch() at sleep_finish+0x7c
> > > sleep_finish() at tsleep+0xbc
> > > tsleep() at usbd_transfer+0x23c
> > > usbd_transfer() at usbd_do_request_flags+0x168
> > > usbd_do_request_flags() at axen_ax88179_init+0x6c
> > > axen_ax88179_init() at axen_stop+0x30
> > > axen_stop() at axen_encap+0x168
> > > axen_encap() at axen_start+0x84
> > > axen_start() at if_qstart_compat+0x38
> > > if_qstart_compat() at ifq_serialize+0xc4
> > > ifq_serialize() at taskq_thread+0x40
> > > taskq_thread() at proc_trampoline+0x10
> > > ddb{2}> mach ddbcpu 1
> > > Stopped at      ampintc_ipi_ddb+0x1c:   db_enter() at ampintc_ipi_ddb+0x18
> > > ampintc_ipi_ddb() at arm_cpu_intr+0x30
> > > arm_cpu_intr() at handle_el1h_irq+0x6c
> > > handle_el1h_irq() at sched_idle+0x220
> > > sched_idle() at proc_trampoline+0x10
> > > ddb{1}> trace
> > > db_enter() at ampintc_ipi_ddb+0x18
> > > ampintc_ipi_ddb() at arm_cpu_intr+0x30
> > > arm_cpu_intr() at handle_el1h_irq+0x6c
> > > handle_el1h_irq() at sched_idle+0x220
> > > sched_idle() at proc_trampoline+0x10
> > > ddb{1}> mach ddbpu 2
> > > No such command
> > > ddb{1}> mach ddbcpu 2
> > > Stopped at      panic+0x150:    db_enter() at panic+0x14c
> > > panic() at mi_switch+0x38
> > > mi_switch() at sleep_finish+0x7c
> > > sleep_finish() at tsleep+0xbc
> > > tsleep() at usbd_transfer+0x23c
> > > usbd_transfer() at usbd_do_request_flags+0x168
> > > usbd_do_request_flags() at axen_ax88179_init+0x6c
> > > ddb{2}> trace
> > > db_enter() at panic+0x14c
> > > panic() at mi_switch+0x38
> > > mi_switch() at sleep_finish+0x7c
> > > sleep_finish() at tsleep+0xbc
> > > tsleep() at usbd_transfer+0x23c
> > > usbd_transfer() at usbd_do_request_flags+0x168
> > > usbd_do_request_flags() at axen_ax88179_init+0x6c
> > > axen_ax88179_init() at axen_stop+0x30
> > > axen_stop() at axen_encap+0x168
> > > axen_encap() at axen_start+0x84
> > > axen_start() at if_qstart_compat+0x38
> > > if_qstart_compat() at ifq_serialize+0xc4
> > > ifq_serialize() at taskq_thread+0x40
> > > taskq_thread() at proc_trampoline+0x10
> > > ddb{2}> mach ddbcpu 3
> > > Stopped at      ampintc_ipi_ddb+0x1c:   db_enter() at ampintc_ipi_ddb+0x18
> > > ampintc_ipi_ddb() at arm_cpu_intr+0x30
> > > arm_cpu_intr() at handle_el1h_irq+0x6c
> > > handle_el1h_irq() at sched_idle+0x220
> > > sched_idle() at proc_trampoline+0x10
> > > ddb{3}> trace
> > > db_enter() at ampintc_ipi_ddb+0x18
> > > ampintc_ipi_ddb() at arm_cpu_intr+0x30
> > > arm_cpu_intr() at handle_el1h_irq+0x6c
> > > handle_el1h_irq() at sched_idle+0x220
> > > sched_idle() at proc_trampoline+0x10
> > > ddb{3}> boot dump
> > > syncing disks...
> >
> > Kind regards,
> > --
> > OpenBSD Bug Reporter
>

Reply | Threaded
Open this post in threaded view
|

Re: [RPI4] axen checksum err followed by kernel panic

OpenBSD Bug Reporter
Hello,

Running -current now.

Using snapshot dated 6 September 2020.

Message "axen0: checksum err (pkt#1)" continues to appear in
/var/log/messages.

No kernel panic yet.

However, new error-

> axen0: usb errors on rx: IOERROR
> axen0: usb error on tx: IOERROR
> axen0: watchdog timeout
> axen0: usb error on tx: IOERROR

Those messages don't appear immediately.

First I can't ping, can't resolve DNS, etc and then those messages are
displayed.

Bringing the device up and down doesn't fix it. i.e. 'ifconfig axen0
down && ifconfig axen0 up'

Due to time constraints, I couldn't investiage further and had to
reboot.

I have not mentioned this to date but I did experience the "exact"
same bug when running 6.7-release and when I was running the device on
USB2 (not USB3 as I am presently doing so). I say "exact" in quotes
because I don't have registers, trace, dumps etc to compare and
confirm that they were *exactly* the same bug.

When I have a chance, I'll pull the latest cvs -current source and
rebuild kernel to see if the bug persists. However, I did notice that
the last commit to if_axen.c in -current was in July 2020. I'll revert
once I've done that.

Kind regards,
--
OpenBSD Bug Reporter