boost md context switching on macppc

Otto Moerbeek
Hi,

some time ago boost-md was enabled for powerpc, but the proper context
create and switch functions from
pobj/boost_1_66_0/boost_1_66_0/libs/context/src/asm are not selected
and linked in.

The diff below fixes that part (and now powerdns recursor builds), but
the actual context switching does not work, it segfaults:

Program received signal SIGSEGV, Segmentation fault.
[Switching to thread 466468]
0xa85695f0 in jump_fcontext () from /usr/local/lib/libboost_context-mt.so.9.0

What I'm seeing (in jump_fcontext in jump_ppc32_sysv_elf_gas.S) is that
register r3, which is supposed to point to the memory that will receive
the transfer_t (the return value of jump_fcontext()), is 0.

    # return transfer_t
    stw  %r6, 0(%r3)
    stw  %r5, 4(%r3)

Any suggestion or help is welcome.

Also, is there any point in enabling the boost-md lib for an arch if
this is not working?

        -Otto

Index: Makefile
===================================================================
RCS file: /cvs/ports/devel/boost/Makefile,v
retrieving revision 1.89
diff -u -p -r1.89 Makefile
--- Makefile 9 Aug 2019 11:25:29 -0000 1.89
+++ Makefile 27 Aug 2019 12:19:39 -0000
@@ -17,7 +17,7 @@ EXTRACT_SUFX= .tar.bz2
 FIX_EXTRACT_PERMISSIONS = Yes
 
 REVISION-main= 6
-REVISION-md= 1
+REVISION-md= 2
 
 SO_VERSION= 9.0
 BOOST_LIBS= boost_atomic-mt \
Index: patches/patch-libs_context_build_Jamfile_v2
===================================================================
RCS file: patches/patch-libs_context_build_Jamfile_v2
diff -N patches/patch-libs_context_build_Jamfile_v2
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ patches/patch-libs_context_build_Jamfile_v2 27 Aug 2019 12:19:39 -0000
@@ -0,0 +1,14 @@
+$OpenBSD$
+
+Index: libs/context/build/Jamfile.v2
+--- libs/context/build/Jamfile.v2.orig
++++ libs/context/build/Jamfile.v2
+@@ -315,7 +315,7 @@ alias asm_sources
+      <address-model>32
+      <architecture>power
+      <binary-format>elf
+-     <toolset>clang
++     <toolset>gcc
+    ;
+
+ alias asm_sources


Re: boost md context switching on macppc

George Koehler-2
On Tue, 27 Aug 2019 14:30:50 +0200
Otto Moerbeek <[hidden email]> wrote:

> What I'm seeing (in jump_fcontext in jump_ppc32_sysv_elf_gas.S) is that
> register r3, which is supposed to point to the memory that will receive
> the transfer_t (the return value of jump_fcontext()), is 0.
>
>     # return transfer_t
>     stw  %r6, 0(%r3)
>     stw  %r5, 4(%r3)

Ouch.  This code is for Linux; there is a secret incompatibility
between BSD and Linux on 32-bit PowerPC.  These systems have almost
the same System V ABI, but differ when returning a small struct of up
to 8 bytes.  transfer_t from <boost/context/detail/fcontext.hpp> is
such a struct: its 2 pointers measure 8 bytes.  BSD has no return area
in %r3; the callee should return the transfer_t in %r3 and %r4.  The
params were in %r4 and %r5 in Linux, but are in %r3 and %r4 in BSD.

The fixes might be to

 - delete the lines `stw %r3, 228(%r1)` and `lwz %r3, 228(%r1)`,
   because the return area no longer exists.
 - change `mr %r1, %r4` to `mr %r1, %r3`, because first param is %r3.
 - change `stw %r6, 0(%r3)` to `mr %r3, %r6` (move to %r3 from %r6),
   because first word of transfer_t is %r3.
 - delete `stw %r5, 4(%r3)`, because second param is %r4, and second
   word of transfer_t is %r4, so we would move %r4 to itself.

One might guard the changes with #ifdef __linux__ ... #else ... #endif,
because .S files go through the preprocessor.  The other *ppc32_sysv*.S
files might also need changes.  I haven't tried these changes, because
I will need to upgrade my macppc snapshot, then wait a few days for my
PowerBook G4 to build boost and dependencies.

The System V ABI of 1995 [1], page 3-22, specified,
> A structure or union whose size is less than or equal to 8 bytes shall be
> returned in r3 and r4, as if it were first stored in an 8-byte aligned
> memory area and then the low-addressed word were loaded into r3 and the
> high-addressed word into r4.  Bits beyond the last member of the
> structure or union are not defined.

For an unknown reason, GCC for Linux ignores this and returns a small
struct like a large struct (through a return area in %r3).  GCC for BSD
does return a small struct in %r3 and %r4, but extends it to 4 or 8
bytes by inserting bits before the first member, not "beyond the last
member", given that the system is big-endian; so struct { char c; }
puts c in the low byte of %r3, not the first byte.  By BSD, I mean at
least OpenBSD and NetBSD; I didn't check FreeBSD.

The Linux ABI of 2011 [2] specified what GCC for Linux does.  LLVM and
clang follow Linux (so code built with clang for OpenBSD/macppc crashes
when it calls libraries built with gcc); boost is crashing because we
use gcc in OpenBSD/macppc.

[1] https://refspecs.linuxbase.org/elf/elfspec_ppc.pdf
[2] https://www.polyomino.org.uk/publications/2011/Power-Arch-32-bit-ABI-supp-1.0-Unified.pdf

--
George Koehler <[hidden email]>


Re: boost md context switching on macppc

Otto Moerbeek
On Tue, Aug 27, 2019 at 01:58:22PM -0400, George Koehler wrote:

> On Tue, 27 Aug 2019 14:30:50 +0200
> Otto Moerbeek <[hidden email]> wrote:
>
> > What I'm seeing (in jump_fcontext in jump_ppc32_sysv_elf_gas.S) is that
> > register r3, which is supposed to point to the memory that will receive
> > the transfer_t (the return value of jump_fcontext()), is 0.
> >
> >     # return transfer_t
> >     stw  %r6, 0(%r3)
> >     stw  %r5, 4(%r3)
>
> Ouch.  This code is for Linux; there is a secret incompatibility
> between BSD and Linux on 32-bit PowerPC.  These systems have almost
> the same System V ABI, but differ when returning a small struct of up
> to 8 bytes.  transfer_t from <boost/context/detail/fcontext.hpp> is
> such a struct: its 2 pointers measure 8 bytes.  BSD has no return area
> in %r3; the callee should return the transfer_t in %r3 and %r4.  The
> params were in %r4 and %r5 in Linux, but are in %r3 and %r4 in BSD.

Thanks!

I was already thinking it would be something along these lines; i386 has
a similar issue that I fixed a few months ago.

A first shot did not work here, so if you can take a closer look please
do. In the meantime I'll try to do so as well after reading up on the ABI.

        -Otto


Re: boost md context switching on macppc

George Koehler-2
On Tue, 27 Aug 2019 21:04:00 +0200
Otto Moerbeek <[hidden email]> wrote:

> A first shot did not work here, so if you can take a closer look please
> do. In the meantime I'll try to do so as well after reading up on the ABI.

I made my own attempt to fix the *ppc32_sysv_elf* assembly code in
devel/boost, but I made some mistake.  I believe that I set the stack
pointer %r1 outside MAP_STACK memory.  This causes the machine to
freeze, as the macppc kernel gets stuck in an infinite loop, repeatedly
printing a message like

[jump]57834/195711 sp=9421ffc0 inside ffbee000-fffee000: not MAP_STACK

where "jump" is the name of the executable.  "jump" is one of the
programs from WRKSRC/libs/context/example.

To get the kernel messages to appear, I needed to rcctl stop xenodm,
so xconsole doesn't grab the messages.  Then I ran the executable from
the boot console ttyC0 (Ctrl-Meta-F1).

I don't need boost to reproduce this kernel problem; it is enough to
build a program that sets a bad stack pointer, like

$ cat crash.c
#include <stdlib.h>
int
main(void) {
        malloc(16384);
        __asm__("addi %r1, %r3, 16368");
        exit(0);
}
$ gcc -o crash crash.c
$ ./crash

The stuck kernel responds to nothing -- it doesn't answer ping(8) --
so my only way out is to force off the power, by holding the power
button of my PowerBook G4.  I need to work around this kernel problem;
I might upgrade to a newer snapshot (my kernel is from Aug 26), report
a bug, or try to build a kernel without the MAP_STACK check.

The rest of this email is the *broken* diff to devel/boost.  It
includes your fixes, plus my assembly changes.

Index: Makefile
===================================================================
RCS file: /cvs/ports/devel/boost/Makefile,v
retrieving revision 1.89
diff -u -p -r1.89 Makefile
--- Makefile 9 Aug 2019 11:25:29 -0000 1.89
+++ Makefile 4 Sep 2019 02:39:07 -0000
@@ -17,7 +17,7 @@ EXTRACT_SUFX= .tar.bz2
 FIX_EXTRACT_PERMISSIONS = Yes
 
 REVISION-main= 6
-REVISION-md= 1
+REVISION-md= 2
 
 SO_VERSION= 9.0
 BOOST_LIBS= boost_atomic-mt \
Index: patches/patch-libs_context_build_Jamfile_v2
===================================================================
RCS file: patches/patch-libs_context_build_Jamfile_v2
diff -N patches/patch-libs_context_build_Jamfile_v2
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ patches/patch-libs_context_build_Jamfile_v2 4 Sep 2019 02:39:07 -0000
@@ -0,0 +1,16 @@
+$OpenBSD$
+
+The second "clang" should be "gcc".
+
+Index: libs/context/build/Jamfile.v2
+--- libs/context/build/Jamfile.v2.orig
++++ libs/context/build/Jamfile.v2
+@@ -326,7 +326,7 @@ alias asm_sources
+      <address-model>32
+      <architecture>power
+      <binary-format>elf
+-     <toolset>clang
++     <toolset>gcc
+    ;
+
+ alias asm_sources
Index: patches/patch-libs_context_src_asm_jump_ppc32_sysv_elf_gas_S
===================================================================
RCS file: patches/patch-libs_context_src_asm_jump_ppc32_sysv_elf_gas_S
diff -N patches/patch-libs_context_src_asm_jump_ppc32_sysv_elf_gas_S
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ patches/patch-libs_context_src_asm_jump_ppc32_sysv_elf_gas_S 4 Sep 2019 02:39:07 -0000
@@ -0,0 +1,66 @@
+$OpenBSD$
+
+ELF systems other than Linux use a different convention to return a
+small struct like transfer_t.
+
+Index: libs/context/src/asm/jump_ppc32_sysv_elf_gas.S
+--- libs/context/src/asm/jump_ppc32_sysv_elf_gas.S.orig
++++ libs/context/src/asm/jump_ppc32_sysv_elf_gas.S
+@@ -78,6 +78,9 @@
+ .align 2
+ .type jump_fcontext,@function
+ jump_fcontext:
++    # Linux: jump_fcontext( hidden transfer_t * %r3, %r4, %r5)
++    # Other: transfer_t %r3:%r4 = jump_fcontext( %r3, %r4)
++
+     # reserve space on stack
+     subi  %r1, %r1, 244
+
+@@ -121,7 +124,9 @@ jump_fcontext:
+     stw  %r29, 216(%r1)  # save R29
+     stw  %r30, 220(%r1)  # save R30
+     stw  %r31, 224(%r1)  # save R31
+#ifdef __linux__
+     stw  %r3,  228(%r1)  # save hidden
++#endif
+
+     # save CR
+     mfcr  %r0
+@@ -135,8 +140,12 @@ jump_fcontext:
+     # store RSP (pointing to context-data) in R6
+     mr  %r6, %r1
+
+-    # restore RSP (pointing to context-data) from R4
++    # restore RSP (pointing to context-data) from R4/R3
+#ifdef __linux__
+     mr  %r1, %r4
++#else
++    mr  %r1, %r3
++#endif
+
+     lfd  %f14, 0(%r1)  # restore F14
+     lfd  %f15, 8(%r1)  # restore F15
+@@ -178,7 +187,9 @@ jump_fcontext:
+     lwz  %r29, 216(%r1)  # restore R29
+     lwz  %r30, 220(%r1)  # restore R30
+     lwz  %r31, 224(%r1)  # restore R31
+#ifdef __linux__
+     lwz  %r3,  228(%r1)  # restore hidden
++#endif
+
+     # restore CR
+     lwz   %r0, 232(%r1)
+@@ -195,8 +206,13 @@ jump_fcontext:
+     addi  %r1, %r1, 244
+
+     # return transfer_t
+#ifdef __linux__
+     stw  %r6, 0(%r3)
+     stw  %r5, 4(%r3)
++#else
++    mr   %r3, %r5
++    #    %r4, %r4
++#endif
+
+     # jump to context
+     bctr
Index: patches/patch-libs_context_src_asm_make_ppc32_sysv_elf_gas_S
===================================================================
RCS file: patches/patch-libs_context_src_asm_make_ppc32_sysv_elf_gas_S
diff -N patches/patch-libs_context_src_asm_make_ppc32_sysv_elf_gas_S
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ patches/patch-libs_context_src_asm_make_ppc32_sysv_elf_gas_S 4 Sep 2019 02:39:07 -0000
@@ -0,0 +1,21 @@
+$OpenBSD$
+
+ELF systems other than Linux use a different convention to return a
+small struct like transfer_t.
+
+Index: libs/context/src/asm/make_ppc32_sysv_elf_gas.S
+--- libs/context/src/asm/make_ppc32_sysv_elf_gas.S.orig
++++ libs/context/src/asm/make_ppc32_sysv_elf_gas.S
+@@ -99,10 +99,12 @@ make_fcontext:
+     mffs  %f0  # load FPSCR
+     stfd  %f0, 144(%r3)  # save FPSCR
+
+#ifdef __linux__
+     # compute address of returned transfer_t
+     addi  %r0, %r3, 252
+     mr    %r4, %r0
+     stw   %r4, 228(%r3)
++#endif
+
+     # load LR
+     mflr  %r0
Index: patches/patch-libs_context_src_asm_ontop_ppc32_sysv_elf_gas_S
===================================================================
RCS file: patches/patch-libs_context_src_asm_ontop_ppc32_sysv_elf_gas_S
diff -N patches/patch-libs_context_src_asm_ontop_ppc32_sysv_elf_gas_S
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ patches/patch-libs_context_src_asm_ontop_ppc32_sysv_elf_gas_S 4 Sep 2019 02:39:07 -0000
@@ -0,0 +1,75 @@
+$OpenBSD$
+
+ELF systems other than Linux use a different convention to return a
+small struct like transfer_t.
+
+Index: libs/context/src/asm/ontop_ppc32_sysv_elf_gas.S
+--- libs/context/src/asm/ontop_ppc32_sysv_elf_gas.S.orig
++++ libs/context/src/asm/ontop_ppc32_sysv_elf_gas.S
+@@ -78,6 +78,9 @@
+ .align 2
+ .type ontop_fcontext,@function
+ ontop_fcontext:
++    # Linux: ontop_fcontext( hidden transfer_t * %r3, %r4, %r5, %r6)
++    # Other: transfer_t %r3:%r4 = ontop_fcontext( %r3, %r4, %r5)
++
+     # reserve space on stack
+     subi  %r1, %r1, 244
+
+@@ -121,7 +124,9 @@ ontop_fcontext:
+     stw  %r29, 216(%r1)  # save R29
+     stw  %r30, 220(%r1)  # save R30
+     stw  %r31, 224(%r1)  # save R31
+#ifdef __linux__
+     stw  %r3,  228(%r1)  # save hidden
++#endif
+
+     # save CR
+     mfcr  %r0
+@@ -135,8 +140,12 @@ ontop_fcontext:
+     # store RSP (pointing to context-data) in R7
+     mr  %r7, %r1
+
+-    # restore RSP (pointing to context-data) from R4
++    # restore RSP (pointing to context-data) from R4/R3
+#ifdef __linux__
+     mr  %r1, %r4
++#else
++    mr  %r1, %r3
++#endif
+
+     lfd  %f14, 0(%r1)  # restore F14
+     lfd  %f15, 8(%r1)  # restore F15
+@@ -178,7 +187,9 @@ ontop_fcontext:
+     lwz  %r29, 216(%r1)  # restore R29
+     lwz  %r30, 220(%r1)  # restore R30
+     lwz  %r31, 224(%r1)  # restore R31
+#ifdef __linux__
+     lwz  %r4,  228(%r1)  # restore hidden
++#endif
+
+     # restore CR
+     lwz   %r0, 232(%r1)
+@@ -191,12 +202,21 @@ ontop_fcontext:
+     # adjust stack
+     addi  %r1, %r1, 244
+
+-    # return transfer_t
++    # return transfer_t
+#ifdef __linux__
+     stw  %r7, 0(%r4)
+     stw  %r5, 4(%r4)
++#else
++    mr   %r3, %r7
++    #    %r4, %r4
++#endif
+
+     # restore CTR
+#ifdef __linux__
+     mtctr %r6
++#else
++    mtctr %r5
++#endif
+
+     # jump to ontop-function
+     bctr


Re: boost md context switching on macppc

Otto Moerbeek
On Tue, Sep 03, 2019 at 11:49:05PM -0400, George Koehler wrote:

> On Tue, 27 Aug 2019 21:04:00 +0200
> Otto Moerbeek <[hidden email]> wrote:
>
> > A first shot did not work here, so if you can take a closer look please
> > do. In the meantime I'll try to do so as well after reading up on the ABI.
>
> I made my own attempt to fix the *ppc32_sysv_elf* assembly code in
> devel/boost, but I made some mistake.  I believe that I set the stack
> pointer %r1 outside MAP_STACK memory.  This causes the machine to
> freeze, as the macppc kernel gets stuck in an infinite loop, repeatedly
> printing a message like
>
> [jump]57834/195711 sp=9421ffc0 inside ffbee000-fffee000: not MAP_STACK
>
> where "jump" is the name of the executable.  "jump" is one of the
> programs from WRKSRC/libs/context/example.
>
> To get the kernel messages to appear, I needed to rcctl stop xenodm,
> so xconsole doesn't grab the messages.  Then I ran the executable from
> the boot console ttyC0 (Ctrl-Meta-F1).
>
> I don't need boost to reproduce this kernel problem; it is enough to
> build a program that sets a bad stack pointer, like
>
> $ cat crash.c
> #include <stdlib.h>
> int
> main(void) {
>         malloc(16384);
>         __asm__("addi %r1, %r3, 16368");
>         exit(0);
> }
> $ gcc -o crash crash.c
> $ ./crash
>
> The stuck kernel responds to nothing -- it doesn't answer ping(8) --
> so my only way out is to force off the power, by holding the power
> button of my PowerBook G4.  I need to work around this kernel problem;
> I might upgrade to a newer snapshot (my kernel is from Aug 26), report
> a bug, or try to build a kernel without the MAP_STACK check.

The kernel is supposed to abort programs that have a stack pointer
not pointing to a MAP_STACK flagged region. The repeating is indeed a
bug.

Please post your test program on bugs. This needs to be fixed to be
able to debug the boost problem further.

        -Otto




Re: boost md context switching on macppc

George Koehler-2
On Wed, 4 Sep 2019 07:50:29 +0200
Otto Moerbeek <[hidden email]> wrote:

> The kernel is supposed to abort programs that have a stack pointer
> not pointing to a MAP_STACK flagged region. The repeating is indeed a
> bug.
>
> Please post your test program on bugs. This needs to be fixed to be
> able to debug the boost problem further.
>
> -Otto

Theo de Raadt fixed the kernel last week.  I made some progress with
boost this week, and now send a revised diff; but I now have a problem
with C++ exceptions through ontop_fcontext(), so today's diff is still
broken.

The fixed kernel aborted the program with SIGSEGV and left a core
dump.  My diff for jump*.S had a wrong `mr %r3, %r5`, where the %r5
should be %r6.  My mistake lost the future stack pointer in %r6.

After fixing the %r6, I made 2 other changes:

 1. I fixed make_fcontext() in make*.S to work with jump_fcontext()
    in jump*.S, by adding a trampoline to pass the transfer_t to
    the context-function.

 2. I tried to fix ontop_fcontext() in ontop*.S, so that it passes
    the transfer_t to the ontop-function.

To build the jump and fibonacci examples, I copied
WRKSRC/libs/context/example to my home directory, then did

$ cat Makefile
CXX = eg++
DEBUG = -g
CPPFLAGS = -I/usr/local/include
LDFLAGS = -L/usr/local/lib -lboost_context-mt
all: fibonacci jump
$ make

Now ./jump works, but ./fibonacci fails with

$ ./fibonacci
0 1 1 2 3 5 8 13 21 34
main: done
terminate called after throwing an instance of 'boost::context::detail::forced_unwind'
Abort trap (core dumped)

There are 2 contexts: a main function that takes 10 numbers, and a
ctx::continuation that would generate the infinite sequence of
fibonacci numbers.  As the main function returns, it calls the C++
destructor ~continuation.  The destructor uses ontop_fcontext() to
throw the C++ exception forced_unwind in the context of the fibonacci
generator.  This exception should unwind the stack until
context_entry() catches the forced_unwind; context_entry() is defined
in <boost/context/continuation_fcontext.hpp>.

I interpret the error "terminate called..." to mean that
context_entry() failed to catch the exception.  My guess is that
ontop_fcontext() fails to pass through C++ exceptions.

boost/context uses 3 machine-dependent functions:

 - transfer_t jump_fcontext(fcontext_t to, void *vp) pauses the
   current context and jumps to the other context.  It returns
   struct transfer_t {fcontext_t from, void *vp}.

 - fcontext_t make_fcontext(void *sp, std::size_t size,
   void (*fn)(transfer_t)) takes a stack of the given size, and
   returns a new context.  The first jump to the new context will
   call the entry point fn({from, vp}).

 - transfer_t ontop_fcontext(fcontext_t to, void *vp,
   transfer_t (*fn)(transfer_t)) jumps to the other context, but
   calls fn({from, vp}) in the other context.

{jump,make,ontop}_ppc32_sysv_elf_gas.S implements these functions
in PowerPC assembly.  fcontext_t is a pointer to a special 244-byte
frame on top of the stack.

To recap the original problem: the System V ABI says to return the
8-byte transfer_t in registers %r3 and %r4.  Linux deviates from the
ABI by passing a hidden parameter transfer_t *%r3; this points to an
8-byte return area.  The existing code is for Linux and needs patches
to work on OpenBSD.

ontop_fcontext calls transfer_t fn(transfer_t).  This call is

 - fn(hidden transfer_t *%r3, transfer_t *%r4) on Linux

 - fn(transfer_t *%r3) on OpenBSD

For Linux, the code removes the 244-byte frame from the stack and
makes a tail call to fn.  The storage for *%r3 and *%r4 comes from the
hidden parameters to ontop_fcontext or jump_fcontext in both contexts.
Because of the tail call, ontop_fcontext() is no longer in the stack
when the fibonacci example throws forced_unwind.

For OpenBSD, there are no hidden parameters, so we are missing 8 bytes
of storage for *%r3.  I also don't want to keep the 244-byte frame on
the stack, because 244 isn't a multiple of 16, so the stack pointer
would be misaligned.  I try to solve this by removing the 244-byte
frame and then allocating a 16-byte frame to store *%r3; but now
ontop_fcontext() keeps the 16-byte frame in the stack, and I don't
know how to throw forced_unwind through this frame!

Also, the upstream code has 2 or 3 other bugs:

 1. make_fcontext() misaligns the new stack.  It tries to 16-align the
    stack pointer %r1, but it points %r1 to the 244-byte frame, so
    jump_fcontext() will remove the frame, then call the entry point
    with a misaligned %r1.  I haven't fixed this.

 2. New contexts set %r13 to garbage.  In the ABI, %r13 must point to
    the executable's small data.  I might not fix this, because
    OpenBSD deviates from the ABI by never setting %r13.  Compilers
    seem to never use %r13, because they don't know whether the code
    is for the executable or a shared library.

 3. I'm not sure, but the call to _exit(0) in make_fcontext() might be
    wrong on systems that use the secure PLT, including OpenBSD.

I might want to resize and rearrange the 244-byte frame to fit the ABI
(https://refspecs.linuxbase.org/elf/elfspec_ppc.pdf), but this would
require rewriting most of the code, and might not solve the
forced_unwind problem.  My next idea is to try using eg++ -S to teach
myself how to handle C++ exceptions in assembly.

The broken diff follows.

Index: Makefile
===================================================================
RCS file: /cvs/ports/devel/boost/Makefile,v
retrieving revision 1.89
diff -u -p -r1.89 Makefile
--- Makefile 9 Aug 2019 11:25:29 -0000 1.89
+++ Makefile 12 Sep 2019 01:59:21 -0000
@@ -17,7 +17,7 @@ EXTRACT_SUFX= .tar.bz2
 FIX_EXTRACT_PERMISSIONS = Yes
 
 REVISION-main= 6
-REVISION-md= 1
+REVISION-md= 2
 
 SO_VERSION= 9.0
 BOOST_LIBS= boost_atomic-mt \
Index: patches/patch-libs_context_build_Jamfile_v2
===================================================================
RCS file: patches/patch-libs_context_build_Jamfile_v2
diff -N patches/patch-libs_context_build_Jamfile_v2
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ patches/patch-libs_context_build_Jamfile_v2 12 Sep 2019 01:59:21 -0000
@@ -0,0 +1,17 @@
+$OpenBSD$
+
+ppc32_sysv_elf has 2 instances of "<toolset>clang".
+The second "clang" should be "gcc".
+
+Index: libs/context/build/Jamfile.v2
+--- libs/context/build/Jamfile.v2.orig
++++ libs/context/build/Jamfile.v2
+@@ -326,7 +326,7 @@ alias asm_sources
+      <address-model>32
+      <architecture>power
+      <binary-format>elf
+-     <toolset>clang
++     <toolset>gcc
+    ;
+
+ alias asm_sources
Index: patches/patch-libs_context_src_asm_jump_ppc32_sysv_elf_gas_S
===================================================================
RCS file: patches/patch-libs_context_src_asm_jump_ppc32_sysv_elf_gas_S
diff -N patches/patch-libs_context_src_asm_jump_ppc32_sysv_elf_gas_S
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ patches/patch-libs_context_src_asm_jump_ppc32_sysv_elf_gas_S 12 Sep 2019 01:59:21 -0000
@@ -0,0 +1,66 @@
+$OpenBSD$
+
+ELF systems other than Linux use a different convention to return a
+small struct like transfer_t.
+
+Index: libs/context/src/asm/jump_ppc32_sysv_elf_gas.S
+--- libs/context/src/asm/jump_ppc32_sysv_elf_gas.S.orig
++++ libs/context/src/asm/jump_ppc32_sysv_elf_gas.S
+@@ -78,6 +78,9 @@
+ .align 2
+ .type jump_fcontext,@function
+ jump_fcontext:
++    # Linux: jump_fcontext( hidden transfer_t * %r3, %r4, %r5)
++    # Other: transfer_t %r3:%r4 = jump_fcontext( %r3, %r4)
++
+     # reserve space on stack
+     subi  %r1, %r1, 244
+
+@@ -121,7 +124,9 @@ jump_fcontext:
+     stw  %r29, 216(%r1)  # save R29
+     stw  %r30, 220(%r1)  # save R30
+     stw  %r31, 224(%r1)  # save R31
++#ifdef __Linux__
+     stw  %r3,  228(%r1)  # save hidden
++#endif
+
+     # save CR
+     mfcr  %r0
+@@ -135,8 +140,12 @@ jump_fcontext:
+     # store RSP (pointing to context-data) in R6
+     mr  %r6, %r1
+
+-    # restore RSP (pointing to context-data) from R4
++    # restore RSP (pointing to context-data) from R4/R3
++#ifdef __Linux__
+     mr  %r1, %r4
++#else
++    mr  %r1, %r3
++#endif
+
+     lfd  %f14, 0(%r1)  # restore F14
+     lfd  %f15, 8(%r1)  # restore F15
+@@ -178,7 +187,9 @@ jump_fcontext:
+     lwz  %r29, 216(%r1)  # restore R29
+     lwz  %r30, 220(%r1)  # restore R30
+     lwz  %r31, 224(%r1)  # restore R31
++#ifdef __Linux__
+     lwz  %r3,  228(%r1)  # restore hidden
++#endif
+
+     # restore CR
+     lwz   %r0, 232(%r1)
+@@ -195,8 +206,13 @@ jump_fcontext:
+     addi  %r1, %r1, 244
+
+     # return transfer_t
++#ifdef __Linux__
+     stw  %r6, 0(%r3)
+     stw  %r5, 4(%r3)
++#else
++    mr   %r3, %r6
++    #    %r4, %r4
++#endif
+
+     # jump to context
+     bctr
Index: patches/patch-libs_context_src_asm_make_ppc32_sysv_elf_gas_S
===================================================================
RCS file: patches/patch-libs_context_src_asm_make_ppc32_sysv_elf_gas_S
diff -N patches/patch-libs_context_src_asm_make_ppc32_sysv_elf_gas_S
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ patches/patch-libs_context_src_asm_make_ppc32_sysv_elf_gas_S 12 Sep 2019 01:59:21 -0000
@@ -0,0 +1,67 @@
+$OpenBSD$
+
+ELF systems other than Linux use a different convention to return a
+small struct like transfer_t.
+
+Index: libs/context/src/asm/make_ppc32_sysv_elf_gas.S
+--- libs/context/src/asm/make_ppc32_sysv_elf_gas.S.orig
++++ libs/context/src/asm/make_ppc32_sysv_elf_gas.S
+@@ -90,7 +90,13 @@ make_fcontext:
+     subi  %r3, %r3, 336
+
+     # third arg of make_fcontext() == address of context-function
++#ifdef __Linux__
++    # save context-function as PC
+     stw  %r5, 240(%r3)
++#else
++    # save context-function for trampoline
++    stw  %r5, 252(%r3)
++#endif
+
+     # set back-chain to zero
+     li   %r0, 0
+@@ -99,10 +105,12 @@ make_fcontext:
+     mffs  %f0  # load FPSCR
+     stfd  %f0, 144(%r3)  # save FPSCR
+
++#ifdef __Linux__
+     # compute address of returned transfer_t
+     addi  %r0, %r3, 252
+     mr    %r4, %r0
+     stw   %r4, 228(%r3)
++#endif
+
+     # load LR
+     mflr  %r0
+@@ -111,6 +119,11 @@ make_fcontext:
+ 1:
+     # load LR into R4
+     mflr  %r4
++#ifndef __Linux__
++    # compute abs address of trampoline; use as PC
++    addi  %r7, %r4, trampoline - 1b
++    stw   %r7, 240(%r3)
++#endif
+     # compute abs address of label finish
+     addi  %r4, %r4, finish - 1b
+     # restore LR
+@@ -123,6 +136,19 @@ make_fcontext:
+     mtlr  %r6
+
+     blr  # return pointer to context-data
++
++#ifndef __Linux__
++trampoline:
++    # On systems other than Linux, jump_fcontext is returning the
++    # transfer_t in %r3:%r4, but we need to pass transfer_t * %r3 to
++    # our context-function.
++    lwz   %r0, 8(%r1)   # address of context-function
++    mtctr %r0
++    stw   %r3, 8(%r1)
++    stw   %r4, 12(%r1)  # move transfer_t to stack
++    la    %r3, 8(%r1)   # address of transfer_t
++    bctr
++#endif
+
+ finish:
+     # save return address into R0
Index: patches/patch-libs_context_src_asm_ontop_ppc32_sysv_elf_gas_S
===================================================================
RCS file: patches/patch-libs_context_src_asm_ontop_ppc32_sysv_elf_gas_S
diff -N patches/patch-libs_context_src_asm_ontop_ppc32_sysv_elf_gas_S
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ patches/patch-libs_context_src_asm_ontop_ppc32_sysv_elf_gas_S 12 Sep 2019 01:59:21 -0000
@@ -0,0 +1,91 @@
+$OpenBSD$
+
+ELF systems other than Linux use a different convention to return a
+small struct like transfer_t.
+
+Index: libs/context/src/asm/ontop_ppc32_sysv_elf_gas.S
+--- libs/context/src/asm/ontop_ppc32_sysv_elf_gas.S.orig
++++ libs/context/src/asm/ontop_ppc32_sysv_elf_gas.S
+@@ -78,6 +78,9 @@
+ .align 2
+ .type ontop_fcontext,@function
+ ontop_fcontext:
++    # Linux: ontop_fcontext( hidden transfer_t * %r3, %r4, %r5, %r6)
++    # Other: transfer_t %r3:%r4 = ontop_fcontext( %r3, %r4, %r5)
++
+     # reserve space on stack
+     subi  %r1, %r1, 244
+
+@@ -121,7 +124,9 @@ ontop_fcontext:
+     stw  %r29, 216(%r1)  # save R29
+     stw  %r30, 220(%r1)  # save R30
+     stw  %r31, 224(%r1)  # save R31
++#ifdef __Linux__
+     stw  %r3,  228(%r1)  # save hidden
++#endif
+
+     # save CR
+     mfcr  %r0
+@@ -135,8 +140,12 @@ ontop_fcontext:
+     # store RSP (pointing to context-data) in R7
+     mr  %r7, %r1
+
+-    # restore RSP (pointing to context-data) from R4
++    # restore RSP (pointing to context-data) from R4/R3
++#ifdef __Linux__
+     mr  %r1, %r4
++#else
++    mr  %r1, %r3
++#endif
+
+     lfd  %f14, 0(%r1)  # restore F14
+     lfd  %f15, 8(%r1)  # restore F15
+@@ -178,20 +187,25 @@ ontop_fcontext:
+     lwz  %r29, 216(%r1)  # restore R29
+     lwz  %r30, 220(%r1)  # restore R30
+     lwz  %r31, 224(%r1)  # restore R31
++#ifdef __Linux__
+     lwz  %r4,  228(%r1)  # restore hidden
++#endif
+
+     # restore CR
+     lwz   %r0, 232(%r1)
+     mtcr  %r0
+     # restore LR
+     lwz   %r0, 236(%r1)
++#ifdef __Linux__
+     mtlr  %r0
++#endif
+     # ignore PC
+
+     # adjust stack
+     addi  %r1, %r1, 244
+
+-    # return transfer_t
++#ifdef __Linux__
++    # return transfer_t
+     stw  %r7, 0(%r4)
+     stw  %r5, 4(%r4)
+
+@@ -200,6 +214,21 @@ ontop_fcontext:
+
+     # jump to ontop-function
+     bctr
++#else
++    # On systems other than Linux, the caller didn't allocate memory
++    # for the transfer_t, so we must allocate it.
++    mtctr %r5            # ontop-function
++    stwu  %r1, -16(%r1)  # allocate stack frame
++    stw   %r0, 20(%r1)   # save LR in caller's LR save word
++    stw   %r7, 8(%r1)
++    stw   %r4, 12(%r1)
++    la    %r3, 8(%r1)    # address of transfer_t
++    bctrl                # call ontop-function, return here
++    lwz   %r0, 20(%r1)
++    mtlr  %r0            # restore LR
++    addi  %r1, %r1, 16   # free stack frame
++    blr                  # return to caller
++#endif
+ .size ontop_fcontext, .-ontop_fcontext
+
+ /* Mark that we don't need executable stack.  */

--
George Koehler <[hidden email]>

Re: boost md context switching on macppc

Otto Moerbeek
On Thu, Sep 12, 2019 at 12:54:47AM -0400, George Koehler wrote:

> On Wed, 4 Sep 2019 07:50:29 +0200
> Otto Moerbeek <[hidden email]> wrote:
>
> > The kernel is supposed to abort programs that have a stack pointer
> > not pointing to a MAP_STACK flagged region. The repeating is indeed a
> > bug.
> >
> > Please post your test program on bugs. This needs to be fixed to be
> > able to debug the boost problem further.
> >
> > -Otto
>
> Theo de Raadt fixed the kernel last week.  I made some progress with
> boost this week, and now send a revised diff; but I now have a problem
> with C++ exceptions through ontop_fcontext(), so today's diff is still
> broken.
>
> The fixed kernel aborted the program with SIGSEGV and left a core
> dump.  My diff for jump*.S had a wrong `mr %r3, %r5`, where the %r5
> should be %r6.  My mistake lost the future stack pointer in %r6.
>
> After fixing the %r6, I made 2 other changes:
>
>  1. I fixed make_fcontext() in make*.S to work with jump_fcontext()
>     in jump*.S, by adding a trampoline to pass the transfer_t to
>     the context-function.
>
>  2. I tried to fix ontop_fcontext() in ontop*.S, so that it passes
>     the transfer_t to the ontop-function.
>
> To build the jump and fibonacci examples, I copied
> WRKSRC/libs/context/example to my home directory, then did
>
> $ cat Makefile
> CXX = eg++
> DEBUG = -g
> CPPFLAGS = -I/usr/local/include
> LDFLAGS = -L/usr/local/lib -lboost_context-mt
> all: fibonacci jump
> $ make
>
> Now ./jump works, but ./fibonacci fails with
>
> $ ./fibonacci                                                      
> 0 1 1 2 3 5 8 13 21 34
> main: done
> terminate called after throwing an instance of 'boost::context::detail::forced_unwind'
> Abort trap (core dumped)
>
> There are 2 contexts: a main function that takes 10 numbers, and a
> ctx::continuation that would generate the infinite sequence of
> fibonacci numbers.  As the main function returns, it calls the C++
> destructor ~continuation.  The destructor uses ontop_fcontext() to
> throw the C++ exception forced_unwind in the context of the fibonacci
> generator.  This exception should unwind the stack until
> context_entry() catches the forced_unwind; context_entry() is defined
> in <boost/context/continuation_fcontext.hpp>.
>
> I interpret the error "terminate called..." to mean that
> context_entry() failed to catch the exception.  My guess is that
> ontop_fcontext() fails to pass C++ exceptions through.
>
> boost/context uses 3 machine-dependent functions:
>
>  - transfer_t jump_fcontext(fcontext_t to, void *vp) pauses the
>    current context and jumps to the other context.  It returns
>    struct transfer_t {fcontext_t from, void *vp}.
>
>  - fcontext_t make_fcontext(void *sp, std::size_t size,
>    void (*fn)(transfer_t)) takes a stack of the given size, and
>    returns a new context.  The first jump to the new context will
>    call the entry point fn({from, vp}).
>
>  - transfer_t ontop_fcontext(fcontext_t to, void *vp,
>    transfer_t (*fn)(transfer_t)) jumps to the other context, but
>    calls fn({from, vp}) in the other context.
>
> {jump,make,ontop}_ppc32_sysv_elf_gas.S implements these functions
> in PowerPC assembly.  fcontext_t is a pointer to a special 244-byte
> frame on top of the stack.
>
> To recap the original problem: the System V ABI says to return the
> 8-byte transfer_t in registers %r3 and %r4.  Linux deviates from the
> ABI by passing a hidden parameter transfer_t *%r3; this points to an
> 8-byte return area.  The existing code is for Linux and needs patches
> to work on OpenBSD.
>
> ontop_fcontext calls transfer_t fn(transfer_t).  This call is
>
>  - fn(hidden transfer_t *%r3, transfer_t *%r4) on Linux
>
>  - fn(transfer_t *%r3) on OpenBSD
>
> For Linux, the code removes the 244-byte frame from the stack and
> makes a tail call to fn.  The storage for *%r3 and *%r4 comes from the
> hidden parameters to ontop_fcontext or jump_fcontext in both contexts.
> Because of the tail call, ontop_fcontext() is no longer in the stack
> when the fibonacci example throws forced_unwind.
>
> For OpenBSD, there are no hidden parameters, so we are missing 8 bytes
> of storage for *%r3.  I also don't want to keep the 244-byte frame on
> the stack, because 244 isn't a multiple of 16, so the stack pointer
> would be misaligned.  I try to solve this by removing the 244-byte
> frame and then allocating a 16-byte frame to store *%r3; but now
> ontop_fcontext() keeps the 16-byte frame in the stack, and I don't
> know how to throw forced_unwind through this frame!
>
> Also, the upstream code has 2 or 3 other bugs:
>
>  1. make_fcontext() misaligns the new stack.  It tries to 16-align the
>     stack pointer %r1, but it points %r1 to the 244-byte frame, so
>     jump_fcontext() will remove the frame, then call the entry point
>     with a misaligned %r1.  I haven't fixed this.
>
>  2. New contexts set %r13 to garbage.  In the ABI, %r13 must point to
>     the executable's small data.  I might not fix this, because
>     OpenBSD deviates from the ABI by never setting %r13.  Compilers
>     seem to never use %r13, because they don't know whether the code
>     is for the executable or a shared library.
>
>  3. I'm not sure, but the call to _exit(0) in make_fcontext() might be
>     wrong on systems that use the secure PLT, including OpenBSD.
>
> I might want to resize and rearrange the 244-byte frame to fit the ABI
> (https://refspecs.linuxbase.org/elf/elfspec_ppc.pdf), but this would
> require rewriting most of the code, and might not solve the
> forced_unwind problem.  My next idea is to try using eg++ -S to teach
> myself how to handle C++ exceptions in assembly.
>
> The broken diff follows.

The good news is that it is not broken for my use-case: PowerDNS
Recursor.  It does not use ontop_fcontext. Thanks a lot for working on
this! I am wondering if there are any users of ontop_fcontext in our tree...

        -Otto



Re: boost md context switching on macppc

Stuart Henderson
On 2019/09/12 16:19, Otto Moerbeek wrote:
> The good news is that it is not broken for my use-case: PowerDNS
> Recursor.  It does not use ontop_fcontext. Thanks a lot for working on
> this! I am wondering if there are any users of ontop_fcontext in our tree...

Currently the only port depending on boost-md is pdns-recursor so this would
be enough for our current use.

Re: boost md context switching on macppc

Tracey Emery
On Thu, Sep 12, 2019 at 03:27:30PM +0100, Stuart Henderson wrote:
> On 2019/09/12 16:19, Otto Moerbeek wrote:
> > The good news is that it is not broken for my use-case: PowerDNS
> > Recursor.  It does not use ontop_fcontext. Thanks a lot for working on
> > this! I am wondering if there are any users of ontop_fcontext in our tree...
>
> Currently the only port depending on boost-md is pdns-recursor so this would
> be enough for our current use.

I'd like Kicad 5 to be a second port to use it, but I still can't get it
to work with any of the changes, thus far. It's a mess, so I probably
never will.

--

Tracey Emery

Re: boost md context switching on macppc

Otto Moerbeek
On Thu, Sep 12, 2019 at 08:32:22AM -0600, Tracey Emery wrote:

> On Thu, Sep 12, 2019 at 03:27:30PM +0100, Stuart Henderson wrote:
> > On 2019/09/12 16:19, Otto Moerbeek wrote:
> > > The good news is that it is not broken for my use-case: PowerDNS
> > > Recursor.  It does not use ontop_fcontext. Thanks a lot for working on
> > > this! I am wondering if there are any users of ontop_fcontext in our tree...
> >
> > Currently the only port depending on boost-md is pdns-recursor so this would
> > be enough for our current use.
>
> I'd like Kicad 5 to be a second port to use it, but I still can't get it
> to work with any of the changes, thus far. It's a mess, so I probably
> never will.

Kicad needs some diffs; it allocates stacks itself. I helped henning@
with a diff, but I think he dropped the work; I never heard back from
him about it. I could at least start kicad (on amd64).

If you're interested in the WIP diff, I'll send it to you.

        -Otto

Re: boost md context switching on macppc

George Koehler-2
On Thu, 12 Sep 2019 16:19:18 +0200
Otto Moerbeek <[hidden email]> wrote:

> On Thu, Sep 12, 2019 at 12:54:47AM -0400, George Koehler wrote:
> > The broken diff follows.
>
> The good news is that it is not broken for my use-case: PowerDNS
> Recursor.  It does not use ontop_fcontext. Thanks a lot for working on
> this! I am wondering if there are any users of ontop_fcontext in our tree...
>
> -Otto

Here's a new diff with 3 more fixes:

 1. It changes ontop_fcontext, so the fibonacci example now works.

 2. It changes make_fcontext to align the stack pointer to 16 bytes.
    (Most code runs fine, or only slightly slower, with a 4-aligned
    stack pointer, but AltiVec vectors might cause a problem.)

 3. Our patch-boost_context_pooled_fixedsize_stack_hpp used a wrong
    variable name, so any program that tried to #include
    <boost/context/pooled_fixedsize_stack.hpp> would get an error.
    The diff changes the variable name and bumps REVISION-main; this
    is the only part of the diff to affect arches other than powerpc.

I have no code using pooled_fixedsize_stack, but one of the examples
in boost includes the header via <boost/context/all.hpp>.
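
For reference, the allocation pattern that the patched header relies
on looks roughly like this in isolation.  It is a minimal sketch:
allocate_stack and deallocate_stack are hypothetical names, and only
mmap(), munmap() and the flags are real.  It assumes a system that
defines MAP_ANON and MAP_STACK.

```cpp
#include <sys/mman.h>

#include <cassert>
#include <cstddef>
#include <new>

// Map `size` bytes that the kernel will accept as stack memory.
// mmap() reports failure with MAP_FAILED, not with a null pointer.
inline void* allocate_stack(std::size_t size) {
    assert(size > 0);
    void* vp = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANON | MAP_STACK, -1, 0);
    if (vp == MAP_FAILED) {
        throw std::bad_alloc();
    }
    return vp;
}

// Unmap with the same size that was passed to allocate_stack().
inline void deallocate_stack(void* vp, std::size_t size) noexcept {
    munmap(vp, size);
}
```

The size passed to mmap() has to be the stack size member that the
allocator actually has, which is what the renamed variable fixes.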

I broke the fibonacci example because I caused ontop_fcontext to leave
a stack frame, but didn't provide an .eh_frame for C++ exceptions.
When fibonacci threw an exception, the unwinder couldn't remove the
frame, so it never reached the code that catches the exception.
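
In plain compiled C++ the same pattern works, because the compiler
emits unwind information for every frame between the throw and the
handler.  The sketch below only illustrates that pattern.  All names
in it are hypothetical stand-ins: forced_unwind_sketch for
boost::context::detail::forced_unwind, and entry_catches_unwind for
the role of context_entry().

```cpp
#include <cassert>

// Hypothetical stand-in for boost::context::detail::forced_unwind.
struct forced_unwind_sketch {};

// A local object whose destructor must run while the stack unwinds.
struct cleanup_guard {
    bool& cleaned;
    ~cleanup_guard() { cleaned = true; }
};

// Plays the role of the suspended generator: the exception is raised
// inside its frames, as the ontop-function would do.
inline void generator_body(bool& cleaned) {
    cleanup_guard guard{cleaned};
    throw forced_unwind_sketch{};
}

// Plays the role of the context's outermost frame: it catches the
// unwind request so that it never reaches std::terminate().
inline bool entry_catches_unwind() {
    bool cleaned = false;
    try {
        generator_body(cleaned);
    } catch (const forced_unwind_sketch&) {
        assert(cleaned);  // locals were destroyed before the handler ran
        return true;
    }
    return false;
}
```

A hand-written assembly frame sitting between generator_body() and
the handler, with no unwind information, is what ends this search
early and leads to terminate().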

To fix fibonacci, I go back to having ontop_fcontext make a tail call
to the ontop-function without leaving a stack frame, like it does on
Linux.  I then cheat by placing an 8-byte transfer_t on the *other*
stack; the existing code uses a similar cheat on Linux.  This cheat
will break if the program resumes the other stack before the
ontop-function returns, but this is already broken on Linux.

The diff doesn't fix 2 other bugs:

 1. The handling of register %r13 is wrong, but this seems not to
    matter on OpenBSD, so I'm not trying to fix it.

 2. The call to _exit(0) in make_fcontext is wrong for systems using
    the secure PLT, like OpenBSD.  I have no code that reaches this
    call, but I would expect it to crash because it fails to set %r30
    to the global offset table.

I have stopped work on this diff.  My next task is to report an issue
to GitHub boost/context, about the multiple problems with ppc32.

Index: Makefile
===================================================================
RCS file: /cvs/ports/devel/boost/Makefile,v
retrieving revision 1.89
diff -u -p -r1.89 Makefile
--- Makefile 9 Aug 2019 11:25:29 -0000 1.89
+++ Makefile 14 Sep 2019 00:56:15 -0000
@@ -16,8 +16,8 @@ MASTER_SITES= ${MASTER_SITE_SOURCEFORGE:
 EXTRACT_SUFX= .tar.bz2
 FIX_EXTRACT_PERMISSIONS = Yes
 
-REVISION-main= 6
-REVISION-md= 1
+REVISION-main= 7
+REVISION-md= 2
 
 SO_VERSION= 9.0
 BOOST_LIBS= boost_atomic-mt \
Index: patches/patch-boost_context_pooled_fixedsize_stack_hpp
===================================================================
RCS file: /cvs/ports/devel/boost/patches/patch-boost_context_pooled_fixedsize_stack_hpp,v
retrieving revision 1.1
diff -u -p -r1.1 patch-boost_context_pooled_fixedsize_stack_hpp
--- patches/patch-boost_context_pooled_fixedsize_stack_hpp 13 Dec 2018 19:52:46 -0000 1.1
+++ patches/patch-boost_context_pooled_fixedsize_stack_hpp 14 Sep 2019 00:56:15 -0000
@@ -18,7 +18,7 @@ Index: boost/context/pooled_fixedsize_st
          stack_context allocate() {
 -            void * vp = storage_.malloc();
 -            if ( ! vp) {
-+            void * vp = mmap(NULL, size_, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANON | MAP_STACK, -1, 0);
++            void * vp = mmap(NULL, stack_size_, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANON | MAP_STACK, -1, 0);
 +            if ( vp == MAP_FAILED ) {
                  throw std::bad_alloc();
              }
Index: patches/patch-libs_context_build_Jamfile_v2
===================================================================
RCS file: patches/patch-libs_context_build_Jamfile_v2
diff -N patches/patch-libs_context_build_Jamfile_v2
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ patches/patch-libs_context_build_Jamfile_v2 14 Sep 2019 00:56:15 -0000
@@ -0,0 +1,17 @@
+$OpenBSD$
+
+ppc32_sysv_elf has 2 instances of "<toolset>clang".
+The second "clang" should be "gcc".
+
+Index: libs/context/build/Jamfile.v2
+--- libs/context/build/Jamfile.v2.orig
++++ libs/context/build/Jamfile.v2
+@@ -326,7 +326,7 @@ alias asm_sources
+      <address-model>32
+      <architecture>power
+      <binary-format>elf
+-     <toolset>clang
++     <toolset>gcc
+    ;
+
+ alias asm_sources
Index: patches/patch-libs_context_src_asm_jump_ppc32_sysv_elf_gas_S
===================================================================
RCS file: patches/patch-libs_context_src_asm_jump_ppc32_sysv_elf_gas_S
diff -N patches/patch-libs_context_src_asm_jump_ppc32_sysv_elf_gas_S
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ patches/patch-libs_context_src_asm_jump_ppc32_sysv_elf_gas_S 14 Sep 2019 00:56:15 -0000
@@ -0,0 +1,66 @@
+$OpenBSD$
+
+ELF systems other than Linux use a different convention to return a
+small struct like transfer_t.
+
+Index: libs/context/src/asm/jump_ppc32_sysv_elf_gas.S
+--- libs/context/src/asm/jump_ppc32_sysv_elf_gas.S.orig
++++ libs/context/src/asm/jump_ppc32_sysv_elf_gas.S
+@@ -78,6 +78,9 @@
+ .align 2
+ .type jump_fcontext,@function
+ jump_fcontext:
++    # Linux: jump_fcontext( hidden transfer_t * %r3, %r4, %r5)
++    # Other: transfer_t %r3:%r4 = jump_fcontext( %r3, %r4)
++
+     # reserve space on stack
+     subi  %r1, %r1, 244
+
+@@ -121,7 +124,9 @@ jump_fcontext:
+     stw  %r29, 216(%r1)  # save R29
+     stw  %r30, 220(%r1)  # save R30
+     stw  %r31, 224(%r1)  # save R31
++#ifdef __Linux__
+     stw  %r3,  228(%r1)  # save hidden
++#endif
+
+     # save CR
+     mfcr  %r0
+@@ -135,8 +140,12 @@ jump_fcontext:
+     # store RSP (pointing to context-data) in R6
+     mr  %r6, %r1
+
+-    # restore RSP (pointing to context-data) from R4
++    # restore RSP (pointing to context-data) from R4/R3
++#ifdef __Linux__
+     mr  %r1, %r4
++#else
++    mr  %r1, %r3
++#endif
+
+     lfd  %f14, 0(%r1)  # restore F14
+     lfd  %f15, 8(%r1)  # restore F15
+@@ -178,7 +187,9 @@ jump_fcontext:
+     lwz  %r29, 216(%r1)  # restore R29
+     lwz  %r30, 220(%r1)  # restore R30
+     lwz  %r31, 224(%r1)  # restore R31
++#ifdef __Linux__
+     lwz  %r3,  228(%r1)  # restore hidden
++#endif
+
+     # restore CR
+     lwz   %r0, 232(%r1)
+@@ -195,8 +206,13 @@ jump_fcontext:
+     addi  %r1, %r1, 244
+
+     # return transfer_t
++#ifdef __Linux__
+     stw  %r6, 0(%r3)
+     stw  %r5, 4(%r3)
++#else
++    mr   %r3, %r6
++    #    %r4, %r4
++#endif
+
+     # jump to context
+     bctr
Index: patches/patch-libs_context_src_asm_make_ppc32_sysv_elf_gas_S
===================================================================
RCS file: patches/patch-libs_context_src_asm_make_ppc32_sysv_elf_gas_S
diff -N patches/patch-libs_context_src_asm_make_ppc32_sysv_elf_gas_S
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ patches/patch-libs_context_src_asm_make_ppc32_sysv_elf_gas_S 14 Sep 2019 00:56:15 -0000
@@ -0,0 +1,78 @@
+$OpenBSD$
+
+Stack should have alignment 16 after jump_fcontext drops 244 bytes.
+
+ELF systems other than Linux use a different convention to return a
+small struct like transfer_t.
+
+Index: libs/context/src/asm/make_ppc32_sysv_elf_gas.S
+--- libs/context/src/asm/make_ppc32_sysv_elf_gas.S.orig
++++ libs/context/src/asm/make_ppc32_sysv_elf_gas.S
+@@ -85,12 +85,19 @@ make_fcontext:
+     # shift address in R3 to lower 16 byte boundary
+     clrrwi  %r3, %r3, 4
+
+-    # reserve space for context-data on context-stack
+-    # including 64 byte of linkage + parameter area (R1 % 16 == 0)
+-    subi  %r3, %r3, 336
++    # reserve space on context-stack, including 16 bytes of linkage
++    # and parameter area + 244 bytes of context-data; jump_fcontext
++    # will drop 244 bytes to align the stack (244 % 16 != 0)
++    subi  %r3, %r3, 16 + 244
+
+     # third arg of make_fcontext() == address of context-function
++#ifdef __Linux__
++    # save context-function as PC
+     stw  %r5, 240(%r3)
++#else
++    # save context-function for trampoline
++    stw  %r5, 252(%r3)
++#endif
+
+     # set back-chain to zero
+     li   %r0, 0
+@@ -99,10 +106,12 @@ make_fcontext:
+     mffs  %f0  # load FPSCR
+     stfd  %f0, 144(%r3)  # save FPSCR
+
++#ifdef __Linux__
+     # compute address of returned transfer_t
+     addi  %r0, %r3, 252
+     mr    %r4, %r0
+     stw   %r4, 228(%r3)
++#endif
+
+     # load LR
+     mflr  %r0
+@@ -111,6 +120,11 @@ make_fcontext:
+ 1:
+     # load LR into R4
+     mflr  %r4
++#ifndef __Linux__
++    # compute abs address of trampoline; use as PC
++    addi  %r7, %r4, trampoline - 1b
++    stw   %r7, 240(%r3)
++#endif
+     # compute abs address of label finish
+     addi  %r4, %r4, finish - 1b
+     # restore LR
+@@ -123,6 +137,19 @@ make_fcontext:
+     mtlr  %r6
+
+     blr  # return pointer to context-data
++
++#ifndef __Linux__
++trampoline:
++    # On systems other than Linux, jump_fcontext is returning the
++    # transfer_t in %r3:%r4, but we need to pass transfer_t * %r3 to
++    # our context-function.
++    lwz   %r0, 8(%r1)   # address of context-function
++    mtctr %r0
++    stw   %r3, 8(%r1)
++    stw   %r4, 12(%r1)  # move transfer_t to stack
++    la    %r3, 8(%r1)   # address of transfer_t
++    bctr
++#endif
+
+ finish:
+     # save return address into R0
Index: patches/patch-libs_context_src_asm_ontop_ppc32_sysv_elf_gas_S
===================================================================
RCS file: patches/patch-libs_context_src_asm_ontop_ppc32_sysv_elf_gas_S
diff -N patches/patch-libs_context_src_asm_ontop_ppc32_sysv_elf_gas_S
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ patches/patch-libs_context_src_asm_ontop_ppc32_sysv_elf_gas_S 14 Sep 2019 00:56:15 -0000
@@ -0,0 +1,75 @@
+$OpenBSD$
+
+ELF systems other than Linux use a different convention to return a
+small struct like transfer_t.
+
+Index: libs/context/src/asm/ontop_ppc32_sysv_elf_gas.S
+--- libs/context/src/asm/ontop_ppc32_sysv_elf_gas.S.orig
++++ libs/context/src/asm/ontop_ppc32_sysv_elf_gas.S
+@@ -78,6 +78,9 @@
+ .align 2
+ .type ontop_fcontext,@function
+ ontop_fcontext:
++    # Linux: ontop_fcontext( hidden transfer_t * %r3, %r4, %r5, %r6)
++    # Other: transfer_t %r3:%r4 = ontop_fcontext( %r3, %r4, %r5)
++
+     # reserve space on stack
+     subi  %r1, %r1, 244
+
+@@ -121,7 +124,9 @@ ontop_fcontext:
+     stw  %r29, 216(%r1)  # save R29
+     stw  %r30, 220(%r1)  # save R30
+     stw  %r31, 224(%r1)  # save R31
++#ifdef __Linux__
+     stw  %r3,  228(%r1)  # save hidden
++#endif
+
+     # save CR
+     mfcr  %r0
+@@ -135,8 +140,12 @@ ontop_fcontext:
+     # store RSP (pointing to context-data) in R7
+     mr  %r7, %r1
+
+-    # restore RSP (pointing to context-data) from R4
++    # restore RSP (pointing to context-data) from R4/R3
++#ifdef __Linux__
+     mr  %r1, %r4
++#else
++    mr  %r1, %r3
++#endif
+
+     lfd  %f14, 0(%r1)  # restore F14
+     lfd  %f15, 8(%r1)  # restore F15
+@@ -178,7 +187,9 @@ ontop_fcontext:
+     lwz  %r29, 216(%r1)  # restore R29
+     lwz  %r30, 220(%r1)  # restore R30
+     lwz  %r31, 224(%r1)  # restore R31
++#ifdef __Linux__
+     lwz  %r4,  228(%r1)  # restore hidden
++#endif
+
+     # restore CR
+     lwz   %r0, 232(%r1)
+@@ -191,12 +202,22 @@ ontop_fcontext:
+     # adjust stack
+     addi  %r1, %r1, 244
+
++#ifdef __Linux__
+     # return transfer_t
+     stw  %r7, 0(%r4)
+     stw  %r5, 4(%r4)
+
+     # restore CTR
+     mtctr %r6
++#else
++    # On systems other than Linux, we allocate a transfer_t on the
++    # other stack, below its stack pointer %r7.
++    stw  %r7, -8(%r7)
++    stw  %r4, -4(%r7)
++    la   %r3, -8(%r7)  # address of transfer_t
++
++    mtctr %r5
++#endif
+
+     # jump to ontop-function
+     bctr
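
The two return conventions named in the patch headers above can be modelled in plain C. This is only a sketch based on the comments in the diff: the function names are hypothetical, and the real routines are the assembly in jump/make/ontop_ppc32_sysv_elf_gas.S, not this code.

```c
#include <stddef.h>

typedef void *fcontext_t;

/* Two-word struct returned by jump_fcontext and ontop_fcontext. */
typedef struct {
    fcontext_t fctx;  /* context that was suspended */
    void      *data;  /* user pointer passed along with the switch */
} transfer_t;

/* "Linux:" line in the diff comments: the caller passes a hidden
 * transfer_t * in %r3 and the callee stores the result through it. */
void return_via_hidden_pointer(transfer_t *hidden, fcontext_t from, void *data)
{
    hidden->fctx = from;
    hidden->data = data;
}

/* "Other:" line in the diff comments: the two words come back
 * directly in %r3:%r4, with no hidden pointer argument. */
transfer_t return_in_registers(fcontext_t from, void *data)
{
    transfer_t t = { from, data };
    return t;
}

/* Hypothetical context-function that receives a pointer to transfer_t. */
void *get_data(transfer_t *t)
{
    return t->data;
}

/* Model of the trampoline added to make_fcontext: copy the two
 * register words to the stack, then pass the address of that copy.
 * The real context-function does not return a value; the model
 * returns one only so the result can be inspected. */
void *trampoline_model(void *(*context_fn)(transfer_t *), transfer_t in_regs)
{
    transfer_t on_stack = in_regs;  /* stw %r3, 8(%r1); stw %r4, 12(%r1) */
    return context_fn(&on_stack);   /* la %r3, 8(%r1); bctr */
}
```

Both conventions carry the same two words; the patches only change where the assembly expects to find them.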

Re: boost md context switching on macppc

Otto Moerbeek
On Sat, Sep 14, 2019 at 03:08:10PM -0400, George Koehler wrote:

> On Thu, 12 Sep 2019 16:19:18 +0200
> Otto Moerbeek <[hidden email]> wrote:
>
> > On Thu, Sep 12, 2019 at 12:54:47AM -0400, George Koehler wrote:
> > > The broken diff follows.
> >
> > The good news is that it is not broken for my use-case: PowerDNS
> > Recursor.  It does not use ontop_fcontext. Thanks a lot for working on
> > this! I am wondering if there are any users of ontop_fcontext in our tree...
> >
> > -Otto
>
> Here's a new diff with 3 more fixes:
>
>  1. It changes ontop_fcontext, so the fibonacci example now works.
>
>  2. It changes make_fcontext to align the stack pointer to 16 bytes.
>     (Most code runs fine, or only slightly slower, with a 4-aligned
>     stack pointer, but AltiVec vectors might cause a problem.)
>
>  3. Our patch-boost_context_pooled_fixedsize_stack_hpp used a wrong
>     variable name, so any program that tried to #include
>     <boost/context/pooled_fixedsize_stack.hpp> would get an error.
>     The diff changes the variable name and bumps REVISION-main; this
>     is the only part of the diff to affect arches other than powerpc.
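
The alignment arithmetic behind item 2 can be checked with a small C model. This is a sketch only: the function and constant names are made up for illustration, and the numbers 336, 16 + 244 and 244 are taken from the make_fcontext and jump_fcontext hunks in the diff.

```c
#include <stdint.h>

enum {
    CONTEXT_DATA = 244,       /* bytes jump_fcontext pops: addi %r1, %r1, 244 */
    OLD_RESERVE  = 336,       /* original: subi %r3, %r3, 336 */
    NEW_RESERVE  = 16 + 244   /* patched:  subi %r3, %r3, 16 + 244 */
};

/* Same effect as "clrrwi %r3, %r3, 4": clear the low 4 bits. */
uintptr_t align_down_16(uintptr_t p)
{
    return p & ~(uintptr_t)15;
}

/* Stack pointer seen by the context-function after make_fcontext has
 * reserved `reserve` bytes and jump_fcontext has dropped the
 * 244 bytes of context-data again. */
uintptr_t sp_after_jump(uintptr_t stack_top, unsigned reserve)
{
    return align_down_16(stack_top) - reserve + CONTEXT_DATA;
}
```

With the old reservation the stack pointer lands 92 bytes below a 16-byte boundary, so it is only 4-aligned. With 16 + 244 it lands exactly 16 bytes below the boundary and stays 16-aligned.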
>
> I have no code using pooled_fixedsize_stack, but one of the examples
> in boost includes the header via <boost/context/all.hpp>.
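
For context on the pooled_fixedsize_stack patch: the mmap call it contains reports failure with MAP_FAILED rather than NULL, which is why the patched allocate() tests for MAP_FAILED. A minimal C sketch of the same pattern follows; stack_alloc and stack_free are hypothetical names, and the fallback definition of MAP_STACK is an assumption for systems that lack the flag.

```c
#define _DEFAULT_SOURCE   /* expose MAP_ANON on glibc; harmless elsewhere */
#include <stddef.h>
#include <sys/mman.h>

#ifndef MAP_STACK
#define MAP_STACK 0       /* assumption: treat the flag as a no-op if missing */
#endif

/* Map an anonymous, private region usable as a stack.
 * Returns NULL on failure so callers can use an ordinary null check. */
void *stack_alloc(size_t stack_size)
{
    void *vp = mmap(NULL, stack_size, PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANON | MAP_STACK, -1, 0);
    return vp == MAP_FAILED ? NULL : vp;
}

/* Unmap a region obtained from stack_alloc; returns 0 on success. */
int stack_free(void *vp, size_t stack_size)
{
    return munmap(vp, stack_size);
}
```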
>
> I broke the fibonacci example because I caused ontop_fcontext to leave
> a stack frame, but didn't provide an .eh_frame for C++ exceptions.
> Then fibonacci threw an exception, but the unwinder couldn't remove
> the frame, so it never reached the code to catch the exception.
>
> To fix fibonacci, I go back to having ontop_fcontext make a tail call
> to the ontop-function without leaving a stack frame, like it does on
> Linux.  I then cheat by placing an 8-byte transfer_t on the *other*
> stack; the existing code uses a similar cheat on Linux.  This cheat
> will break if the program resumes the other stack before the
> ontop-function returns, but this is already broken on Linux.
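
The cheat described above, and the way it can break, can be modelled in C with a byte array standing in for the other stack. This is a hedged sketch: all names are hypothetical, and sizeof(transfer_t) is used in place of the 8 bytes of the 32-bit code so that the model also runs on 64-bit hosts.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    void *fctx;
    void *data;
} transfer_t;

/* Write a transfer_t just below the suspended stack's pointer and
 * return its address, like
 *   stw %r7, -8(%r7); stw %r4, -4(%r7); la %r3, -8(%r7)
 * Nothing moves sp, so the slot is not actually reserved. */
unsigned char *place_below_sp(unsigned char *sp, void *data)
{
    transfer_t t = { sp, data };
    unsigned char *slot = sp - sizeof t;
    memcpy(slot, &t, sizeof t);
    return slot;
}

/* Model the suspended context being resumed and pushing a frame of
 * n bytes below its stack pointer. */
void push_frame(unsigned char *sp, size_t n)
{
    memset(sp - n, 0xAA, n);
}

/* Read the transfer_t back out of the slot. */
transfer_t read_slot(const unsigned char *slot)
{
    transfer_t t;
    memcpy(&t, slot, sizeof t);
    return t;
}
```

As long as the other stack stays suspended, the slot keeps its contents. Once that stack is resumed and pushes a frame, the slot is overwritten, which matches the failure mode described in the paragraph above.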
>
> The diff doesn't fix 2 other bugs:
>
>  1. The handling of register %r13 is wrong, but this seems not to
>     matter on OpenBSD, so I'm not trying to fix it.
>
>  2. The call to _exit(0) in make_fcontext is wrong for systems using
>     the secure PLT, like OpenBSD.  I have no code that reaches this
>     call, but I would expect it to crash because it fails to set r30
>     to the global offset table.
>
> I have stopped work on this diff.  My next task is to report an issue
> to GitHub boost/context, about the multiple problems with ppc32.

This PowerDNS Recursor is still happy and the
boost_context_pooled_fixedsize_stack_hpp fix looks ok as well.

I'd say this is good to go in. Thanks,

        -Otto

>
> Index: Makefile
> ===================================================================
> RCS file: /cvs/ports/devel/boost/Makefile,v
> retrieving revision 1.89
> diff -u -p -r1.89 Makefile
> --- Makefile 9 Aug 2019 11:25:29 -0000 1.89
> +++ Makefile 14 Sep 2019 00:56:15 -0000
> @@ -16,8 +16,8 @@ MASTER_SITES= ${MASTER_SITE_SOURCEFORGE:
>  EXTRACT_SUFX= .tar.bz2
>  FIX_EXTRACT_PERMISSIONS = Yes
>  
> -REVISION-main= 6
> -REVISION-md= 1
> +REVISION-main= 7
> +REVISION-md= 2
>  
>  SO_VERSION= 9.0
>  BOOST_LIBS= boost_atomic-mt \
> Index: patches/patch-boost_context_pooled_fixedsize_stack_hpp
> ===================================================================
> RCS file: /cvs/ports/devel/boost/patches/patch-boost_context_pooled_fixedsize_stack_hpp,v
> retrieving revision 1.1
> diff -u -p -r1.1 patch-boost_context_pooled_fixedsize_stack_hpp
> --- patches/patch-boost_context_pooled_fixedsize_stack_hpp 13 Dec 2018 19:52:46 -0000 1.1
> +++ patches/patch-boost_context_pooled_fixedsize_stack_hpp 14 Sep 2019 00:56:15 -0000
> @@ -18,7 +18,7 @@ Index: boost/context/pooled_fixedsize_st
>           stack_context allocate() {
>  -            void * vp = storage_.malloc();
>  -            if ( ! vp) {
> -+            void * vp = mmap(NULL, size_, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANON | MAP_STACK, -1, 0);
> ++            void * vp = mmap(NULL, stack_size_, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANON | MAP_STACK, -1, 0);
>  +            if ( vp == MAP_FAILED ) {
>                   throw std::bad_alloc();
>               }
> Index: patches/patch-libs_context_build_Jamfile_v2
> ===================================================================
> RCS file: patches/patch-libs_context_build_Jamfile_v2
> diff -N patches/patch-libs_context_build_Jamfile_v2
> --- /dev/null 1 Jan 1970 00:00:00 -0000
> +++ patches/patch-libs_context_build_Jamfile_v2 14 Sep 2019 00:56:15 -0000
> @@ -0,0 +1,17 @@
> +$OpenBSD$
> +
> +ppc32_sysv_elf has 2 instances of "<toolset>clang".
> +The second "clang" should be "gcc".
> +
> +Index: libs/context/build/Jamfile.v2
> +--- libs/context/build/Jamfile.v2.orig
> ++++ libs/context/build/Jamfile.v2
> +@@ -326,7 +326,7 @@ alias asm_sources
> +      <address-model>32
> +      <architecture>power
> +      <binary-format>elf
> +-     <toolset>clang
> ++     <toolset>gcc
> +    ;
> +
> + alias asm_sources
> Index: patches/patch-libs_context_src_asm_jump_ppc32_sysv_elf_gas_S
> ===================================================================
> RCS file: patches/patch-libs_context_src_asm_jump_ppc32_sysv_elf_gas_S
> diff -N patches/patch-libs_context_src_asm_jump_ppc32_sysv_elf_gas_S
> --- /dev/null 1 Jan 1970 00:00:00 -0000
> +++ patches/patch-libs_context_src_asm_jump_ppc32_sysv_elf_gas_S 14 Sep 2019 00:56:15 -0000
> @@ -0,0 +1,66 @@
> +$OpenBSD$
> +
> +ELF systems other than Linux use a different convention to return a
> +small struct like transfer_t.
> +
> +Index: libs/context/src/asm/jump_ppc32_sysv_elf_gas.S
> +--- libs/context/src/asm/jump_ppc32_sysv_elf_gas.S.orig
> ++++ libs/context/src/asm/jump_ppc32_sysv_elf_gas.S
> +@@ -78,6 +78,9 @@
> + .align 2
> + .type jump_fcontext,@function
> + jump_fcontext:
> ++    # Linux: jump_fcontext( hidden transfer_t * %r3, %r4, %r5)
> ++    # Other: transfer_t %r3:%r4 = jump_fcontext( %r3, %r4)
> ++
> +     # reserve space on stack
> +     subi  %r1, %r1, 244
> +
> +@@ -121,7 +124,9 @@ jump_fcontext:
> +     stw  %r29, 216(%r1)  # save R29
> +     stw  %r30, 220(%r1)  # save R30
> +     stw  %r31, 224(%r1)  # save R31
> ++#ifdef __Linux__
> +     stw  %r3,  228(%r1)  # save hidden
> ++#endif
> +
> +     # save CR
> +     mfcr  %r0
> +@@ -135,8 +140,12 @@ jump_fcontext:
> +     # store RSP (pointing to context-data) in R6
> +     mr  %r6, %r1
> +
> +-    # restore RSP (pointing to context-data) from R4
> ++    # restore RSP (pointing to context-data) from R4/R3
> ++#ifdef __Linux__
> +     mr  %r1, %r4
> ++#else
> ++    mr  %r1, %r3
> ++#endif
> +
> +     lfd  %f14, 0(%r1)  # restore F14
> +     lfd  %f15, 8(%r1)  # restore F15
> +@@ -178,7 +187,9 @@ jump_fcontext:
> +     lwz  %r29, 216(%r1)  # restore R29
> +     lwz  %r30, 220(%r1)  # restore R30
> +     lwz  %r31, 224(%r1)  # restore R31
> ++#ifdef __Linux__
> +     lwz  %r3,  228(%r1)  # restore hidden
> ++#endif
> +
> +     # restore CR
> +     lwz   %r0, 232(%r1)
> +@@ -195,8 +206,13 @@ jump_fcontext:
> +     addi  %r1, %r1, 244
> +
> +     # return transfer_t
> ++#ifdef __Linux__
> +     stw  %r6, 0(%r3)
> +     stw  %r5, 4(%r3)
> ++#else
> ++    mr   %r3, %r6
> ++    #    %r4, %r4
> ++#endif
> +
> +     # jump to context
> +     bctr
> Index: patches/patch-libs_context_src_asm_make_ppc32_sysv_elf_gas_S
> ===================================================================
> RCS file: patches/patch-libs_context_src_asm_make_ppc32_sysv_elf_gas_S
> diff -N patches/patch-libs_context_src_asm_make_ppc32_sysv_elf_gas_S
> --- /dev/null 1 Jan 1970 00:00:00 -0000
> +++ patches/patch-libs_context_src_asm_make_ppc32_sysv_elf_gas_S 14 Sep 2019 00:56:15 -0000
> @@ -0,0 +1,78 @@
> +$OpenBSD$
> +
> +Stack should have alignment 16 after jump_fcontext drops 244 bytes.
> +
> +ELF systems other than Linux use a different convention to return a
> +small struct like transfer_t.
> +
> +Index: libs/context/src/asm/make_ppc32_sysv_elf_gas.S
> +--- libs/context/src/asm/make_ppc32_sysv_elf_gas.S.orig
> ++++ libs/context/src/asm/make_ppc32_sysv_elf_gas.S
> +@@ -85,12 +85,19 @@ make_fcontext:
> +     # shift address in R3 to lower 16 byte boundary
> +     clrrwi  %r3, %r3, 4
> +
> +-    # reserve space for context-data on context-stack
> +-    # including 64 byte of linkage + parameter area (R1 % 16 == 0)
> +-    subi  %r3, %r3, 336
> ++    # reserve space on context-stack, including 16 bytes of linkage
> ++    # and parameter area + 244 bytes of context-data; jump_fcontext
> ++    # will drop 244 bytes to align the stack (244 % 16 != 0)
> ++    subi  %r3, %r3, 16 + 244
> +
> +     # third arg of make_fcontext() == address of context-function
> ++#ifdef __Linux__
> ++    # save context-function as PC
> +     stw  %r5, 240(%r3)
> ++#else
> ++    # save context-function for trampoline
> ++    stw  %r5, 252(%r3)
> ++#endif
> +
> +     # set back-chain to zero
> +     li   %r0, 0
> +@@ -99,10 +106,12 @@ make_fcontext:
> +     mffs  %f0  # load FPSCR
> +     stfd  %f0, 144(%r3)  # save FPSCR
> +
> ++#ifdef __Linux__
> +     # compute address of returned transfer_t
> +     addi  %r0, %r3, 252
> +     mr    %r4, %r0
> +     stw   %r4, 228(%r3)
> ++#endif
> +
> +     # load LR
> +     mflr  %r0
> +@@ -111,6 +120,11 @@ make_fcontext:
> + 1:
> +     # load LR into R4
> +     mflr  %r4
> ++#ifndef __Linux__
> ++    # compute abs address of trampoline; use as PC
> ++    addi  %r7, %r4, trampoline - 1b
> ++    stw   %r7, 240(%r3)
> ++#endif
> +     # compute abs address of label finish
> +     addi  %r4, %r4, finish - 1b
> +     # restore LR
> +@@ -123,6 +137,19 @@ make_fcontext:
> +     mtlr  %r6
> +
> +     blr  # return pointer to context-data
> ++
> ++#ifndef __Linux__
> ++trampoline:
> ++    # On systems other than Linux, jump_fcontext is returning the
> ++    # transfer_t in %r3:%r4, but we need to pass transfer_t * %r3 to
> ++    # our context-function.
> ++    lwz   %r0, 8(%r1)   # address of context-function
> ++    mtctr %r0
> ++    stw   %r3, 8(%r1)
> ++    stw   %r4, 12(%r1)  # move transfer_t to stack
> ++    la    %r3, 8(%r1)   # address of transfer_t
> ++    bctr
> ++#endif
> +
> + finish:
> +     # save return address into R0
> Index: patches/patch-libs_context_src_asm_ontop_ppc32_sysv_elf_gas_S
> ===================================================================
> RCS file: patches/patch-libs_context_src_asm_ontop_ppc32_sysv_elf_gas_S
> diff -N patches/patch-libs_context_src_asm_ontop_ppc32_sysv_elf_gas_S
> --- /dev/null 1 Jan 1970 00:00:00 -0000
> +++ patches/patch-libs_context_src_asm_ontop_ppc32_sysv_elf_gas_S 14 Sep 2019 00:56:15 -0000
> @@ -0,0 +1,75 @@
> +$OpenBSD$
> +
> +ELF systems other than Linux use a different convention to return a
> +small struct like transfer_t.
> +
> +Index: libs/context/src/asm/ontop_ppc32_sysv_elf_gas.S
> +--- libs/context/src/asm/ontop_ppc32_sysv_elf_gas.S.orig
> ++++ libs/context/src/asm/ontop_ppc32_sysv_elf_gas.S
> +@@ -78,6 +78,9 @@
> + .align 2
> + .type ontop_fcontext,@function
> + ontop_fcontext:
> ++    # Linux: ontop_fcontext( hidden transfer_t * %r3, %r4, %r5, %r6)
> ++    # Other: transfer_t %r3:%r4 = ontop_fcontext( %r3, %r4, %r5)
> ++
> +     # reserve space on stack
> +     subi  %r1, %r1, 244
> +
> +@@ -121,7 +124,9 @@ ontop_fcontext:
> +     stw  %r29, 216(%r1)  # save R29
> +     stw  %r30, 220(%r1)  # save R30
> +     stw  %r31, 224(%r1)  # save R31
> ++#ifdef __Linux__
> +     stw  %r3,  228(%r1)  # save hidden
> ++#endif
> +
> +     # save CR
> +     mfcr  %r0
> +@@ -135,8 +140,12 @@ ontop_fcontext:
> +     # store RSP (pointing to context-data) in R7
> +     mr  %r7, %r1
> +
> +-    # restore RSP (pointing to context-data) from R4
> ++    # restore RSP (pointing to context-data) from R4/R3
> ++#ifdef __Linux__
> +     mr  %r1, %r4
> ++#else
> ++    mr  %r1, %r3
> ++#endif
> +
> +     lfd  %f14, 0(%r1)  # restore F14
> +     lfd  %f15, 8(%r1)  # restore F15
> +@@ -178,7 +187,9 @@ ontop_fcontext:
> +     lwz  %r29, 216(%r1)  # restore R29
> +     lwz  %r30, 220(%r1)  # restore R30
> +     lwz  %r31, 224(%r1)  # restore R31
> ++#ifdef __Linux__
> +     lwz  %r4,  228(%r1)  # restore hidden
> ++#endif
> +
> +     # restore CR
> +     lwz   %r0, 232(%r1)
> +@@ -191,12 +202,22 @@ ontop_fcontext:
> +     # adjust stack
> +     addi  %r1, %r1, 244
> +
> ++#ifdef __Linux__
> +     # return transfer_t
> +     stw  %r7, 0(%r4)
> +     stw  %r5, 4(%r4)
> +
> +     # restore CTR
> +     mtctr %r6
> ++#else
> ++    # On systems other than Linux, we allocate a transfer_t on the
> ++    # other stack, below its stack pointer %r7.
> ++    stw  %r7, -8(%r7)
> ++    stw  %r4, -4(%r7)
> ++    la   %r3, -8(%r7)  # address of transfer_t
> ++
> ++    mtctr %r5
> ++#endif
> +
> +     # jump to ontop-function
> +     bctr
>

Re: boost md context switching on macppc

Rafael Sadowski
On Mon Sep 16, 2019 at 07:54:53PM +0200, Otto Moerbeek wrote:

> This PowerDNS Recursor is still happy and the
> boost_context_pooled_fixedsize_stack_hpp fix looks ok as well.
>
> I'd say this is good to go in. Thanks,

No objections here.

>
> -Otto
>