clang build kernel traps

rgcinjp
ppc@

as title ... i've been playing around since i got 6.6-current installed.

first build a gcc4 kernel and test if it boots properly. OK

modified share/mk/bsd.own.mk; modified links to cc, c++, cpp; kernel
build without errors; these steps are from a 2018 mail on how to
"switch armv7 to clang"

... but does not boot

-----
[ using 1135748 bytes of bsd ELF symbol table ]
console out [ATY,Jasper_A] console in [keyboard], using USB
using parent ATY,JasperParent:[]-2147483648/0 sp=6db6db6d inside 80210194-38214000: not MAP_STACK
-----

seems i got a trap while doing of_display_console()

later reverted the links to cc, c++, cpp since library reordering
balks hard. now i just use 'CC=clang CXX=clang++ CPP=clang-cpp make'


will a noob like me ... be able to proceed and 'try' to debug this?

i guess next step would be to drop down to the kernel debugger ...
but i am not sure if that is even possible *haven't actually tried
it yet*. i got no serial console.

Re: clang build kernel traps

rgcinjp

> i guess next step would be to drop down to the kernel debugger ...
> but i am not sure if that is even possible *haven't actually tried
> it yet*. i got no serial console.

i can get into ddb, but n/s results in a hang.

On Sun, Jan 19, 2020 at 11:12:15AM +0900, rgc wrote:

> ppc@
>
> as title ... i've been playing around since i got 6.6-current installed.
>
> first build a gcc4 kernel and test if it boots properly. OK
>
> modified share/mk/bsd.own.mk; modified links to cc, c++, cpp; kernel
> build without errors; these steps are from a 2018 mail on how to
> "switch armv7 to clang"
>
> ... but does not boot
>
> -----
> [ using 1135748 bytes of bsd ELF symbol table ]
> console out [ATY,Jasper_A] console in [keyboard], using USB
> using parent ATY,JasperParent:[]-2147483648/0 sp=6db6db6d inside 80210194-38214000: not MAP_STACK
> -----
>
> seems i got a trap while doing of_display_console()
>
> later reverted the links to cc, c++, cpp since library reordering
> balks hard. now i just use 'CC=clang CXX=clang++ CPP=clang-cpp make'
>
>
> will a noob like me ... be able to proceed and 'try' to debug this?
>
> i guess next step would be to drop down to the kernel debugger ...
> but i am not sure if that is even possible *haven't actually tried
> it yet*. i got no serial console.
>

Re: clang build kernel traps

rgcinjp
On Sun, Jan 19, 2020 at 07:32:42PM +0100, Karel Gardas wrote:

> On 1/19/20 3:32 AM, rgc wrote:
> > [ using 1135748 bytes of bsd ELF symbol table ]
> > > console out [ATY,Jasper_A] console in [keyboard], using USB
> > > using parent ATY,JasperParent:[]-2147483648/0 sp=6db6db6d inside 80210194-38214000: not MAP_STACK
>
> not MAP_STACK looks like a hint -- I would search the tree for it. Perhaps
> this feature is not supported by the clang build on ppc32 yet? Or perhaps
> the build infrastructure for ppc32 does not account for it, or does not
> enforce its use, even though the code requires it?
> Certainly I would start from there...

'not MAP_STACK' was printed from

sys/arch/powerpc/powerpc/trap.c

and that's all i got. i think i got lucky that the trap printed a message
because now there is no log. it just hangs after printing 'using parent'.
i got no idea of the macppc memory map at the moment.

my understanding now is that the kernel crashes quite early ... during the
initial processing of ofw. 'using parent' is in

macppc/macppc/ofw_machdep.c

when i have time i'll compare the disassembly of the low-level files
from both gcc and clang. my knowledge of assembly is not really good,
though.

yorosiku ~

Re: clang build kernel traps

George Koehler-2
In reply to this post by rgcinjp
On Sun, 19 Jan 2020 11:32:26 +0900
rgc <[hidden email]> wrote:

> ...
>
> On Sun, Jan 19, 2020 at 11:12:15AM +0900, rgc wrote:
> > ppc@
> >
> > as title ... i've been playing around since i got 6.6-current installed.
> >
> > first build a gcc4 kernel and test if it boots properly. OK
> >
> > modified share/mk/bsd.own.mk; modified links to cc, c++, cpp; kernel
> > build without errors; these steps are from a 2018 mail on how to
> > "switch armv7 to clang"
> >
> > ... but does not boot
> >
> > -----
> > [ using 1135748 bytes of bsd ELF symbol table ]
> > console out [ATY,Jasper_A] console in [keyboard], using USB
> > using parent ATY,JasperParent:[]-2147483648/0 sp=6db6db6d inside 80210194-38214000: not MAP_STACK
> > -----
> >
> > ...

clang still has bugs and can emit wrong assembly code.

MAP_STACK is documented in mmap(2).  The kernel sometimes checks that
the cpu's stack pointer (register %r1 on PowerPC) points to MAP_STACK
memory.  The check would fail if something corrupted the stack pointer;
"sp=6db6db6d" might be a corrupt value.  You might get more info by
adding printf() calls and rebuilding the kernel.
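
One quick sanity check on that value, sketched in Python.  It assumes the
usual rule of the 32-bit PowerPC ELF ABI that %r1 stays 16-byte aligned;
that is an assumption about what a healthy stack pointer looks like, not
something taken from trap.c, and the helper name is made up for
illustration.  Under that assumption, sp=6db6db6d is not even 4-byte
aligned, which fits the guess that it is corrupt.

```python
# Sanity check on the reported stack pointer; this is not kernel code.
STACK_ALIGN = 16  # assumed ABI stack alignment, in bytes


def looks_aligned(sp, align=STACK_ALIGN):
    """Return True if sp is a multiple of align."""
    return sp % align == 0


sp = 0x6db6db6d  # value copied from the trap message
print(hex(sp), "16-byte aligned:", looks_aligned(sp))
print(hex(sp), "4-byte aligned:", looks_aligned(sp, 4))
```

A misaligned value only suggests corruption; it does not say where the
corruption happened.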

I reported https://bugs.llvm.org/show_bug.cgi?id=40736 where clang and
gcc use incompatible assembly code to return small structs, and am
working on a possible fix.  Without a fix, user commands built by clang
might have trouble calling libraries built by gcc.  If we fix enough
compatibility bugs, we might try building user commands with clang.
This might be easier than building the kernel.

--George

Re: clang build kernel traps

rgcinjp
On Wed, Jan 22, 2020 at 12:42:53AM -0500, George Koehler wrote:

> On Sun, 19 Jan 2020 11:32:26 +0900
> rgc <[hidden email]> wrote:
>
> > ...
> >
> > On Sun, Jan 19, 2020 at 11:12:15AM +0900, rgc wrote:
> > > ppc@
> > >
> > > as title ... i've been playing around since i got 6.6-current installed.
> > >
> > > first build a gcc4 kernel and test if it boots properly. OK
> > >
> > > modified share/mk/bsd.own.mk; modified links to cc, c++, cpp; kernel
> > > build without errors; these steps are from a 2018 mail on how to
> > > "switch armv7 to clang"
> > >
> > > ... but does not boot
> > >
> > > -----
> > > [ using 1135748 bytes of bsd ELF symbol table ]
> > > console out [ATY,Jasper_A] console in [keyboard], using USB
> > > using parent ATY,JasperParent:[]-2147483648/0 sp=6db6db6d inside 80210194-38214000: not MAP_STACK
> > > -----
> > >
> > > ...
>
> clang still has bugs and can emit wrong assembly code.
>

and i think that is what i am seeing right now. for example,

/usr/src/sys/arch/macppc/macppc/locore.S:295

        lwz     %r30,_C_LABEL(battable)@l(%r31) /* get batu */
        mtcr    %r30
        bc      4,30,1f                 /* branch if supervisor valid is false */
        lwz     %r31,_C_LABEL(battable)+4@l(%r31)       /* get batl */
/* We randomly use the highest two bat registers here */
        mftb    %r28
        andi.   %r28,%r28,1
        bne     2f
        mtdbatu 2,%r30

generates

--- gcc4-obj/locore.s   Wed Jan 22 04:29:10 2020
+++ clang-obj/locore.s  Wed Jan 22 04:29:05 2020
@@ -197,9 +197,9 @@
                        21a: R_PPC_ADDR16_LO    battable
      21c:      7f cf f1 20     mtcr    r30
      220:      40 9e 00 48     bne-    cr7,268 <nopbat_1e>
-     224:      83 ff 00 04     lwz     r31,4(r31)
+     224:      83 ff 00 00     lwz     r31,0(r31)
                        226: R_PPC_ADDR16_LO    battable+0x4
-     228:      7f 8c 42 e6     mftb    r28
+     228:      7f 8c 42 a6     mfspr   r28,268
      22c:      73 9c 00 01     andi.   r28,r28,1
      230:      40 82 00 10     bne-    240 <nopbat_1s+0x3c>
      234:      7f dc 83 a6     mtdbatu 2,r30

(full diff attached at the end)

> MAP_STACK is documented in mmap(2).  The kernel sometimes checks that
> the cpu's stack pointer (register %r1 on PowerPC) points to MAP_STACK
> memory.  The check would fail if something corrupted the stack pointer;
> "sp=6db6db6d" might be a corrupt value.  You might get more info by
> adding printf() calls and rebuilding the kernel.
>
> I reported https://bugs.llvm.org/show_bug.cgi?id=40736 where clang and
> gcc use incompatible assembly code to return small structs, and am
> working on a possible fix.  Without a fix, user commands built by clang
> might have trouble calling libraries built by gcc.  If we fix enough
> compatibility bugs, we might try building user commands with clang.
> This might be easier than building the kernel.
>
> --George

good luck with your work.



yorosiku ~

--- gcc4-obj/locore.s Wed Jan 22 04:29:10 2020
+++ clang-obj/locore.s Wed Jan 22 04:29:05 2020
@@ -197,9 +197,9 @@
  21a: R_PPC_ADDR16_LO battable
      21c: 7f cf f1 20 mtcr    r30
      220: 40 9e 00 48 bne-    cr7,268 <nopbat_1e>
-     224: 83 ff 00 04 lwz     r31,4(r31)
+     224: 83 ff 00 00 lwz     r31,0(r31)
  226: R_PPC_ADDR16_LO battable+0x4
-     228: 7f 8c 42 e6 mftb    r28
+     228: 7f 8c 42 a6 mfspr   r28,268
      22c: 73 9c 00 01 andi.   r28,r28,1
      230: 40 82 00 10 bne-    240 <nopbat_1s+0x3c>
      234: 7f dc 83 a6 mtdbatu 2,r30
@@ -219,7 +219,7 @@
 00000268 <nopbat_1e>:
 nopbat_1e():
      268: 7f 88 02 a6 mflr    r28
-     26c: 48 00 08 7f bla     87c <s_dsitrap>
+     26c: 48 00 00 03 bla     0 <cpu_switchto_asm>
  26c: R_PPC_ADDR24 .text+0x87c
 
 00000270 <isitrap>:
@@ -240,7 +240,7 @@
      28c: 7f a0 00 26 mfcr    r29
      290: 7f fb 02 a6 mfsrr1  r31
      294: 7c 31 42 a6 mfsprg  r1,1
-     298: 48 00 09 5b bla     958 <s_isitrap>
+     298: 48 00 00 03 bla     0 <cpu_switchto_asm>
  298: R_PPC_ADDR24 .text+0x958
 
 0000029c <extint>:
@@ -266,7 +266,7 @@
      2cc: 80 21 01 bc lwz     r1,444(r1)
      2d0: 41 82 00 08 beq-    2d8 <nop32_5e+0x2c>
      2d4: 7c 31 42 a6 mfsprg  r1,1
-     2d8: 48 00 09 ca ba      9c8 <extintr>
+     2d8: 48 00 00 02 ba      0 <cpu_switchto_asm>
  2d8: R_PPC_ADDR24 .text+0x9c8
 
 000002dc <decrint>:
@@ -418,9 +418,9 @@
      4dc: 38 42 ff f8 addi    r2,r2,-8
      4e0: 4b ff ff ac b       48c <tlbdsmiss+0x14>
      4e4: 54 23 f0 03 rlwinm. r3,r1,30,0,1
-     4e8: 40 80 00 18 bge-    500 <tlbdsmiss+0x88>
+     4e8: 40 c0 00 18 bge-    500 <tlbdsmiss+0x88>
      4ec: 70 23 00 01 andi.   r3,r1,1
-     4f0: 41 a2 00 28 beq+    518 <tlbdsmiss+0xa0>
+     4f0: 41 e2 00 28 beq+    518 <tlbdsmiss+0xa0>
      4f4: 7c 7b 02 a6 mfsrr1  r3
      4f8: 3c 20 0a 00 lis     r1,2560
      4fc: 48 00 00 34 b       530 <tlbdsmiss+0xb8>
@@ -429,7 +429,7 @@
      508: 7c 7b 02 a6 mfsrr1  r3
      50c: 54 63 97 fe rlwinm  r3,r3,18,31,31
      510: 5c 21 18 43 rlwnm.  r1,r1,r3,1,1
-     514: 40 a2 ff e0 bne-    4f4 <tlbdsmiss+0x7c>
+     514: 40 c2 ff e0 bne+    4f4 <tlbdsmiss+0x7c>
      518: 80 22 00 04 lwz     r1,4(r2)
      51c: 60 21 01 80 ori     r1,r1,384
      520: b0 22 00 06 sth     r1,6(r2)
@@ -468,7 +468,7 @@
      580: 7f d0 42 a6 mfsprg  r30,0
      584: 83 de 01 bc lwz     r30,444(r30)
      588: 38 3e 20 00 addi    r1,r30,8192
-     58c: 48 00 0d 83 bla     d80 <ddbtrap>
+     58c: 48 00 00 03 bla     0 <cpu_switchto_asm>
  58c: R_PPC_ADDR24 .text+0xd80
 
 00000590 <disitrap>:
@@ -487,8 +487,7 @@
      5b0: 7c 3b 02 a6 mfsrr1  r1
      5b4: 7c 2f f1 20 mtcr    r1
      5b8: 7c 31 42 a6 mfsprg  r1,1
-     5bc: 40 91 00 00 ble-    cr4,5bc <realtrap+0xc>
- 5bc: R_PPC_REL14 s_trap
+     5bc: 40 91 00 10 ble-    cr4,5cc <s_trap>
      5c0: 7c 30 42 a6 mfsprg  r1,0
      5c4: 80 21 01 94 lwz     r1,404(r1)
      5c8: 38 21 40 00 addi    r1,r1,16384
@@ -566,7 +565,7 @@
      6d4: 90 c5 01 b4 stw     r6,436(r5)
      6d8: 3f c0 00 00 lis     r30,0
  6da: R_PPC_ADDR16_HA .text+0x128
-     6dc: 3b de 01 28 addi    r30,r30,296
+     6dc: 3b de 00 00 addi    r30,r30,0
  6de: R_PPC_ADDR16_LO .text+0x128
      6e0: 93 c1 00 98 stw     r30,152(r1)
      6e4: 93 e1 00 9c stw     r31,156(r1)
@@ -684,8 +683,7 @@
      86c: 7f e8 03 a6 mtlr    r31
      870: 7f c3 f3 78 mr      r3,r30
      874: 4e 80 00 21 blrl
-     878: 48 00 00 00 b       878 <fork_trampoline+0x14>
- 878: R_PPC_REL24 trapexit
+     878: 4b ff fe 94 b       70c <trapexit>
 
 0000087c <s_dsitrap>:
 s_dsitrap():
@@ -854,7 +852,6 @@
 00000ac4 <extint_call>:
 extint_call():
      ac4: 48 00 00 01 bl      ac4 <extint_call>
- ac4: R_PPC_REL24 extint_call
 
 00000ac8 <intr_exit>:
 intr_exit():
@@ -1121,7 +1118,7 @@
      e88: 90 c5 01 b4 stw     r6,436(r5)
      e8c: 3f c0 00 00 lis     r30,0
  e8e: R_PPC_ADDR16_HA .text+0x128
-     e90: 3b de 01 28 addi    r30,r30,296
+     e90: 3b de 00 00 addi    r30,r30,0
  e92: R_PPC_ADDR16_LO .text+0x128
      e94: 93 c1 00 98 stw     r30,152(r1)
      e98: 93 e1 00 9c stw     r31,156(r1)
@@ -1302,61 +1299,34 @@
     1114: 60 00 00 00 nop
 
 00001118 <rfi_start>:
-rfi_start():
-    1118: 00 00 08 60 .long 0x860
+ ...
  1118: R_PPC_ADDR32 .text+0x860
-    111c: 00 00 08 64 .long 0x864
  111c: R_PPC_ADDR32 .text+0x864
-    1120: 00 00 09 54 .long 0x954
  1120: R_PPC_ADDR32 .text+0x954
-    1124: 00 00 09 58 .long 0x958
  1124: R_PPC_ADDR32 .text+0x958
-    1128: 00 00 0c 18 .long 0xc18
  1128: R_PPC_ADDR32 .text+0xc18
-    112c: 00 00 0c 1c .long 0xc1c
  112c: R_PPC_ADDR32 .text+0xc1c
-    1130: 00 00 11 08 .long 0x1108
  1130: R_PPC_ADDR32 .text+0x1108
-    1134: 00 00 11 0c .long 0x110c
  1134: R_PPC_ADDR32 .text+0x110c
- ...
 
 00001140 <nopbat_start>:
-nopbat_start():
-    1140: 00 00 02 04 .long 0x204
+ ...
  1140: R_PPC_ADDR32 .text+0x204
-    1144: 00 00 02 68 .long 0x268
  1144: R_PPC_ADDR32 .text+0x268
- ...
 
 00001150 <nop32_start>:
-nop32_start():
-    1150: 00 00 01 50 .long 0x150
+ ...
  1150: R_PPC_ADDR32 .text+0x150
-    1154: 00 00 01 5c .long 0x15c
  1154: R_PPC_ADDR32 .text+0x15c
-    1158: 00 00 01 90 .long 0x190
  1158: R_PPC_ADDR32 .text+0x190
-    115c: 00 00 01 9c .long 0x19c
  115c: R_PPC_ADDR32 .text+0x19c
-    1160: 00 00 01 e4 .long 0x1e4
  1160: R_PPC_ADDR32 .text+0x1e4
-    1164: 00 00 01 f0 .long 0x1f0
  1164: R_PPC_ADDR32 .text+0x1f0
-    1168: 00 00 02 74 .long 0x274
  1168: R_PPC_ADDR32 .text+0x274
-    116c: 00 00 02 80 .long 0x280
  116c: R_PPC_ADDR32 .text+0x280
-    1170: 00 00 02 a0 .long 0x2a0
  1170: R_PPC_ADDR32 .text+0x2a0
-    1174: 00 00 02 ac .long 0x2ac
  1174: R_PPC_ADDR32 .text+0x2ac
-    1178: 00 00 02 e0 .long 0x2e0
  1178: R_PPC_ADDR32 .text+0x2e0
-    117c: 00 00 02 ec .long 0x2ec
  117c: R_PPC_ADDR32 .text+0x2ec
-    1180: 00 00 05 64 .long 0x564
  1180: R_PPC_ADDR32 .text+0x564
-    1184: 00 00 05 70 .long 0x570
  1184: R_PPC_ADDR32 .text+0x570
- ...

Re: clang build kernel traps

George Koehler-2
On Wed, 22 Jan 2020 06:36:17 +0900
rgc <[hidden email]> wrote:

> --- gcc4-obj/locore.s   Wed Jan 22 04:29:10 2020
> +++ clang-obj/locore.s  Wed Jan 22 04:29:05 2020
> ...

You might have found a problem with mftb (move from time base) in
clang's assembler.  Some of the other differences in your disassembly
might not cause problems.

This is a sample of the differences:

$ cat sample.s                                                        
        lwz %r31,battable+4@l(%r31)
        mftb %r28
        bla s_dsitrap
        bc 4,17,s_trap
        addi %r30,%r30,idledone@l
        nop
s_dsitrap:
        nop
cpu_switchto_asm:
        nop
idledone:
        nop
$ gcc -c sample.s
$ clang -c -o sample-clang.o sample.s
$ objdump -d sample.o > sample.ds
$ objdump -d sample-clang.o > sample-clang.ds
$ diff -u sample.ds sample-clang.ds
...
 00000000 <s_dsitrap-0x18>:
-   0: 83 ff 00 04 lwz     r31,4(r31)
-   4: 7f 8c 42 e6 mftb    r28
-   8: 48 00 00 1b bla     18 <s_dsitrap>
+   0: 83 ff 00 00 lwz     r31,0(r31)
+   4: 7f 8c 42 a6 mfspr   r28,268
+   8: 48 00 00 03 bla     0 <s_dsitrap-0x18>
    c: 40 91 00 10 ble-    cr4,1c <s_trap>
-  10: 3b de 00 20 addi    r30,r30,32
+  10: 3b de 00 00 addi    r30,r30,0
   14: 60 00 00 00 nop
...

The change from 4 to 0 in 'lwz r31,4(r31)' is not significant if the
relocation for 'battable+4@l' overwrites the 4 or 0 with a different
value.  I see that gcc and clang use the same relocations:

=begin
$ readelf -r sample.o sample-clang.o

File: sample.o

Relocation section '.rela.text' at offset 0x270 contains 3 entries:
 Offset     Info    Type            Sym.Value  Sym. Name + Addend
00000002  00000704 R_PPC_ADDR16_LO   00000000   battable + 4
00000008  00000102 R_PPC_ADDR24      00000000   .text + 18
00000012  00000104 R_PPC_ADDR16_LO   00000000   .text + 20

File: sample-clang.o

Relocation section '.rela.text' at offset 0xc8 contains 3 entries:
 Offset     Info    Type            Sym.Value  Sym. Name + Addend
00000002  00000604 R_PPC_ADDR16_LO   00000000   battable + 4
00000008  00000502 R_PPC_ADDR24      00000000   .text + 18
00000012  00000504 R_PPC_ADDR16_LO   00000000   .text + 20
=end
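
To see why the 4 versus 0 stops mattering once the relocation is applied,
here is a rough Python sketch of what a linker does for these two
relocation types.  It assumes RELA semantics, where the addend comes from
the relocation entry and the field already in the instruction is
overwritten.  The addresses for battable and .text are made up, and the
function names are only for illustration.

```python
# Sketch of applying two PowerPC relocations to a 32-bit instruction word.
# The addresses are hypothetical; only the instruction words and addends
# come from the objdump and readelf output.

def apply_addr16_lo(word, sym, addend):
    """R_PPC_ADDR16_LO: low 16 bits of (S + A) replace the low halfword."""
    return (word & 0xffff0000) | ((sym + addend) & 0xffff)


def apply_addr24(word, sym, addend):
    """R_PPC_ADDR24: (S + A) fills the LI field; AA and LK bits are kept."""
    li_mask = 0x03fffffc
    return (word & ~li_mask & 0xffffffff) | ((sym + addend) & li_mask)


BATTABLE = 0x00345678  # hypothetical address of battable
TEXT = 0x00100000      # hypothetical address of .text

# lwz r31,4(r31) from gcc versus lwz r31,0(r31) from clang
gcc_lwz = apply_addr16_lo(0x83ff0004, BATTABLE, 4)
clang_lwz = apply_addr16_lo(0x83ff0000, BATTABLE, 4)
print(hex(gcc_lwz), hex(clang_lwz))

# bla 18 from gcc versus bla 0 from clang
gcc_bla = apply_addr24(0x4800001b, TEXT, 0x18)
clang_bla = apply_addr24(0x48000003, TEXT, 0x18)
print(hex(gcc_bla), hex(clang_bla))
```

With the same symbol and addend, both starting words end up identical, so
the value left in the field before linking makes no difference.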

Both gcc and clang use 'battable + 4' as the relocation for the
'battable+4@l' in the source code, so the 4 is not lost.  Each
relocation has a different symbol index (Info >> 8, see ELF32_R_SYM in
<sys/exec_elf.h>) because gcc and clang wrote the symbols in a
different order (readelf -s *.o), but the symbols 'battable' and
'.text' are the same.
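
The split of the Info column can be sketched in Python as below.  The two
helpers mirror what ELF32_R_SYM and ELF32_R_TYPE compute, the Info values
are copied from the readelf output above, and the numeric relocation type
constants are the standard PowerPC ELF values rather than something shown
in this thread.

```python
# Split an Elf32 r_info value into symbol index and relocation type.
R_PPC_ADDR24 = 2      # standard PowerPC ELF relocation type numbers
R_PPC_ADDR16_LO = 4


def elf32_r_sym(info):
    """Symbol table index: the upper 24 bits of r_info."""
    return info >> 8


def elf32_r_type(info):
    """Relocation type: the low 8 bits of r_info."""
    return info & 0xff


# (file, Info) pairs copied from the readelf output
entries = [
    ("sample.o", 0x00000704),
    ("sample-clang.o", 0x00000604),
    ("sample.o", 0x00000102),
    ("sample-clang.o", 0x00000502),
]
for name, info in entries:
    print(name, "sym", elf32_r_sym(info), "type", elf32_r_type(info))
```

The types match between the two files and only the symbol indexes differ,
which is what a different symbol order would produce.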

Branches with predictions like bge- beq+ bne- might differ, because
newer versions of the Power ISA changed how to set the branch
prediction bits (but tried to be backward-compatible).
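
As a rough check that the conditional branches differ only in those
bits, this Python sketch pulls the BO and BI fields out of the three
branch pairs in the full diff (gcc word first, clang word second).  The
field positions are the B-form layout as I understand it, and the helper
names are made up for illustration.

```python
# Compare conditional-branch encodings from the gcc and clang objects.
BO_MASK = 0x1f << 21  # BO field: 5 bits just below the primary opcode


def bo(word):
    """BO field (branch options, including the prediction hint bits)."""
    return (word >> 21) & 0x1f


def bi(word):
    """BI field (which condition register bit is tested)."""
    return (word >> 16) & 0x1f


def differs_only_in_bo(a, b):
    """True if the two words are equal outside the BO field."""
    return ((a ^ b) & ~BO_MASK) == 0


# (gcc, clang) instruction words copied from the diff
pairs = [
    (0x40800018, 0x40c00018),  # bge at 4e8
    (0x41a20028, 0x41e20028),  # beq at 4f0
    (0x40a2ffe0, 0x40c2ffe0),  # bne at 514
]
for gcc_word, clang_word in pairs:
    print("BO", bo(gcc_word), "->", bo(clang_word),
          "BI", bi(gcc_word), "->", bi(clang_word),
          "only BO differs:", differs_only_in_bo(gcc_word, clang_word))
```

In all three pairs the condition bit and the displacement are unchanged,
so the branch tests the same thing and goes to the same place.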

I recall that mftb (move from time base) is different between 32-bit
and 64-bit PowerPC, and suspect that clang might be using the wrong
mftb for our 32-bit target, but I won't know until I check the ISA
manual.

--George

Re: clang build kernel traps

George Koehler-2
On Wed, 29 Jan 2020 00:11:40 -0500
George Koehler <[hidden email]> wrote:

> You might have found a problem with mftb (move from time base) in
> clang's assembler.  Some of the other differences in your disassembly
> might not cause problems.
>
> ...
> -   4: 7f 8c 42 e6 mftb    r28
> +   4: 7f 8c 42 a6 mfspr   r28,268

This might not be a problem.
7f 8c 42 e6 is the old encoding, up to PowerPC 2.02.
7f 8c 42 a6 is the new encoding, from Power ISA 2.03.
(PPC_Vers202_Book2_public.pdf, PowerISA_V2.03_Final_Public.pdf)

0x7f8c42e6 >> 1 & 0x3ff == 371  # old mftb
0x7f8c42a6 >> 1 & 0x3ff == 339  # new mfspr

Power ISA 2.03 claims that the new mfspr works on most processors,
but not on the PowerPC 601 or POWER3.  It should work on processors
supported by OpenBSD macppc.  Later, I want to check if the new mfspr
from clang really works on my iMac G3 (PowerPC 750).
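
The two shift-and-mask lines above can be extended into a small Python
decoder for the other fields of the two words.  The field layout (primary
opcode, target register, and a 10-bit register number stored with its two
5-bit halves swapped) is my reading of the X/XFX instruction forms, not
something quoted from this thread, and the function names are only for
illustration.

```python
# Decode the fields of the two encodings of "mftb r28".

def primary_opcode(word):
    """Top 6 bits of the instruction."""
    return word >> 26


def extended_opcode(word):
    """10-bit extended opcode, just above the lowest bit."""
    return (word >> 1) & 0x3ff


def target_reg(word):
    """Destination general-purpose register."""
    return (word >> 21) & 0x1f


def spr_number(word):
    """SPR/TBR number; the instruction stores its two halves swapped."""
    field = (word >> 11) & 0x3ff
    return ((field & 0x1f) << 5) | (field >> 5)


for label, word in [("gcc", 0x7f8c42e6), ("clang", 0x7f8c42a6)]:
    print(label, "opcode", primary_opcode(word),
          "xo", extended_opcode(word),
          "r%d" % target_reg(word),
          "spr", spr_number(word))
```

Both words name register 28 and number 268; only the extended opcode
changes, from 371 to 339.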

--George