userland clock_gettime proof of concept

classic Classic list List threaded Threaded
203 messages Options
1 ... 7891011
Reply | Threaded
Open this post in threaded view
|

Re: userland clock_gettime proof of concept

Christian Weisgerber
On 2020-07-03, Scott Cheloha <[hidden email]> wrote:

> Are we doing powerpc, arm64, and sparc64 separately?

hppa can de done as well, but somebody with access to a machine
needs to investigate/ensure that the S bit in the processor status
register is appropriately set to allow userland access to the
Interval Timer register.

--
Christian "naddy" Weisgerber                          [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: userland clock_gettime proof of concept

Paul Irofti-4
In reply to this post by Paul Irofti-4
On Fri, Jul 03, 2020 at 06:36:39PM +0300, Paul Irofti wrote:

>
>
> În 3 iulie 2020 17:55:25 EEST, Mark Kettenis <[hidden email]> a scris:
> >> Date: Fri, 3 Jul 2020 15:13:22 +0200
> >> From: Robert Nagy <[hidden email]>
> >>
> >> On 02/07/20 00:31 +0100, Stuart Henderson wrote:
> >> > running on 38 of these, btw.
> >>
> >> been running with this on all my workstations and laptops and on 3
> >build
> >> servers as well
> >
> >Are the issue that naddy@ saw solved?
> >
> >Did anybody do a *proper* test on anything besides amd64?  Especially
> >on architectures where the optimized clock_gettime is *not* available?
>
> Yes and yes.

So, can we go ahead with this?

Reply | Threaded
Open this post in threaded view
|

Re: userland clock_gettime proof of concept

Mark Kettenis
> Date: Sun, 5 Jul 2020 20:31:19 +0300
> From: Paul Irofti <[hidden email]>
>
> On Fri, Jul 03, 2020 at 06:36:39PM +0300, Paul Irofti wrote:
> >
> >
> > În 3 iulie 2020 17:55:25 EEST, Mark Kettenis <[hidden email]> a scris:
> > >> Date: Fri, 3 Jul 2020 15:13:22 +0200
> > >> From: Robert Nagy <[hidden email]>
> > >>
> > >> On 02/07/20 00:31 +0100, Stuart Henderson wrote:
> > >> > running on 38 of these, btw.
> > >>
> > >> been running with this on all my workstations and laptops and on 3
> > >build
> > >> servers as well
> > >
> > >Are the issue that naddy@ saw solved?
> > >
> > >Did anybody do a *proper* test on anything besides amd64?  Especially
> > >on architectures where the optimized clock_gettime is *not* available?
> >
> > Yes and yes.
>
> So, can we go ahead with this?

I wish your diff wouldn't add thos unneeded NULL and 0 initializers to
files that don't need them.

But ok kettenis@

Reply | Threaded
Open this post in threaded view
|

Re: userland clock_gettime proof of concept

Scott Cheloha
In reply to this post by Paul Irofti-4
On Sun, Jul 05, 2020 at 08:31:19PM +0300, Paul Irofti wrote:

> On Fri, Jul 03, 2020 at 06:36:39PM +0300, Paul Irofti wrote:
> >
> > ?n 3 iulie 2020 17:55:25 EEST, Mark Kettenis <[hidden email]> a scris:
> > >> Date: Fri, 3 Jul 2020 15:13:22 +0200
> > >> From: Robert Nagy <[hidden email]>
> > >>
> > >> On 02/07/20 00:31 +0100, Stuart Henderson wrote:
> > >> > running on 38 of these, btw.
> > >>
> > >> been running with this on all my workstations and laptops and on 3
> > >build
> > >> servers as well
> > >
> > >Are the issue that naddy@ saw solved?
> > >
> > >Did anybody do a *proper* test on anything besides amd64?  Especially
> > >on architectures where the optimized clock_gettime is *not* available?
> >
> > Yes and yes.
>
> So, can we go ahead with this?

ok cheloha@

I'd like to document how all this works somewhere.  I think I'm going
to update tc_init.9.  Maybe we can add a section for "userspace
timecounter drivers" to that page after it is up-to-date with the
state of timecounting in the kernel.

Also, I sanity-tested your most recent diff on this arm64:

OpenBSD 6.7-current (GENERIC.MP) #1: Fri Jul  3 13:04:15 CDT 2020
    [hidden email]:/usr/src/sys/arch/arm64/compile/GENERIC.MP
real mem  = 3073093632 (2930MB)
avail mem = 2946428928 (2809MB)
random: good seed from bootblocks
mainbus0 at root: ACPI
psci0 at mainbus0: PSCI 1.1, SMCCC 1.2
cpu0 at mainbus0 mpidr 0: ARM Cortex-A72 r0p3
cpu0: 48KB 64b/line 3-way L1 PIPT I-cache, 32KB 64b/line 2-way L1 D-cache
cpu0: 1024KB 64b/line 16-way L2 cache
cpu1 at mainbus0 mpidr 1: ARM Cortex-A72 r0p3
cpu1: 48KB 64b/line 3-way L1 PIPT I-cache, 32KB 64b/line 2-way L1 D-cache
cpu1: 1024KB 64b/line 16-way L2 cache
cpu2 at mainbus0 mpidr 2: ARM Cortex-A72 r0p3
cpu2: 48KB 64b/line 3-way L1 PIPT I-cache, 32KB 64b/line 2-way L1 D-cache
cpu2: 1024KB 64b/line 16-way L2 cache
cpu3 at mainbus0 mpidr 3: ARM Cortex-A72 r0p3
cpu3: 48KB 64b/line 3-way L1 PIPT I-cache, 32KB 64b/line 2-way L1 D-cache
cpu3: 1024KB 64b/line 16-way L2 cache
efi0 at mainbus0: UEFI 2.7
efi0: https://github.com/pftf/RPi4 rev 0x10000
smbios0 at efi0: SMBIOS 3.3.0
smbios0: vendor https://github.com/pftf/RPi4 version "UEFI Firmware v1.16" date Jun 18 2020 12:20:53
smbios0: Sony UK Raspberry Pi 4 Model B
apm0 at mainbus0
ampintc0 at mainbus0 nirq 256, ncpu 4 ipi: 0, 1: "interrupt-controller"
agtimer0 at mainbus0: tick rate 54000 KHz
acpi0 at mainbus0: ACPI 6.3
acpi0: sleep states
acpi0: tables DSDT FACP CSRT DBG2 GTDT APIC PPTT SPCR
acpi0: wakeup devices
"BCM2849" at acpi0 not configured
"BCM2835" at acpi0 not configured
"BCM2854" at acpi0 not configured
"ACPI0004" at acpi0 not configured
xhci0 at acpi0 XHC0 addr 0x600000000/0x1000 irq 175, xHCI 1.0
usb0 at xhci0: USB revision 3.0
uhub0 at usb0 configuration 1 interface 0 "Generic xHCI root hub" rev 3.00/1.00 addr 1
"ACPI0007" at acpi0 not configured
"ACPI0007" at acpi0 not configured
"ACPI0007" at acpi0 not configured
"ACPI0007" at acpi0 not configured
"ACPI0004" at acpi0 not configured
"BCM2848" at acpi0 not configured
"BCM2850" at acpi0 not configured
"BCM2856" at acpi0 not configured
"BCM2845" at acpi0 not configured
"BCM2841" at acpi0 not configured
"BCM2841" at acpi0 not configured
"BCM2838" at acpi0 not configured
"BCM2839" at acpi0 not configured
"BCM2844" at acpi0 not configured
pluart0 at acpi0 URT0 addr 0xfe201000/0x1000 irq 153: console
"BCM2836" at acpi0 not configured
"BCM2EA6" at acpi0 not configured
"MSFT8000" at acpi0 not configured
"BCM2847" at acpi0 not configured
"BCM2855" at acpi0 not configured
bse0 at acpi0 ETH0 addr 0xfd580000/0x10000 irq 189: address dc:a6:32:7f:dd:8e
brgphy0 at bse0 phy 1: BCM54210E 10/100/1000baseT PHY, rev. 2
uhub1 at uhub0 port 1 configuration 1 interface 0 "VIA Labs USB2.0 Hub" rev 2.10/4.21 addr 2
umass0 at uhub1 port 1 configuration 1 interface 0 "SanDisk Cruzer Fit" rev 2.10/1.00 addr 3
umass0: using SCSI over Bulk-Only
scsibus0 at umass0: 2 targets, initiator 0
sd0 at scsibus0 targ 1 lun 0: <SanDisk, Cruzer Fit, 1.00> removable serial.07815571280726107280
sd0: 59976MB, 512 bytes/sector, 122830848 sectors
vscsi0 at root
scsibus1 at vscsi0: 256 targets
softraid0 at root
scsibus2 at softraid0: 256 targets
bootfile: sd0a:/bsd
boot device: sd0
root on sd0a (0cc470a97f4b5ef4.a) swap on sd0b dump on sd0b
WARNING: clock lost 15 days
WARNING: CHECK AND RESET THE DATE!

Reply | Threaded
Open this post in threaded view
|

Re: userland clock_gettime proof of concept

Vitaliy Makkoveev
In reply to this post by Paul Irofti-4


> On 5 Jul 2020, at 20:31, Paul Irofti <[hidden email]> wrote:
>
> On Fri, Jul 03, 2020 at 06:36:39PM +0300, Paul Irofti wrote:
>>
>>
>> În 3 iulie 2020 17:55:25 EEST, Mark Kettenis <[hidden email]> a scris:
>>>> Date: Fri, 3 Jul 2020 15:13:22 +0200
>>>> From: Robert Nagy <[hidden email]>
>>>>
>>>> On 02/07/20 00:31 +0100, Stuart Henderson wrote:
>>>>> running on 38 of these, btw.
>>>>
>>>> been running with this on all my workstations and laptops and on 3
>>> build
>>>> servers as well
>>>
>>> Are the issue that naddy@ saw solved?
>>>
>>> Did anybody do a *proper* test on anything besides amd64?  Especially
>>> on architectures where the optimized clock_gettime is *not* available?
>>
>> Yes and yes.
>
> So, can we go ahead with this?
>

Sorry for late reaction. At least VirtualBox based virtual machines started to
panic with the recent kernel. I reverted your diff and panics stopped.

Screenshot attached.


panic2.png (20K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: userland clock_gettime proof of concept

Vitaliy Makkoveev
Sorry for late reaction. At least VirtualBox based virtual machines
started to panic with the recent kernel. I reverted your diff and panics
stopped.
Screenshot attached.

panic2.png (20K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: userland clock_gettime proof of concept

Mark Kettenis
In reply to this post by Vitaliy Makkoveev
> From: Vitaliy Makkoveev <[hidden email]>
> Date: Tue, 7 Jul 2020 01:34:33 +0300
>
> > On 5 Jul 2020, at 20:31, Paul Irofti <[hidden email]> wrote:
> >
> > On Fri, Jul 03, 2020 at 06:36:39PM +0300, Paul Irofti wrote:
> >>
> >>
> >> În 3 iulie 2020 17:55:25 EEST, Mark Kettenis <[hidden email]> a scris:
> >>>> Date: Fri, 3 Jul 2020 15:13:22 +0200
> >>>> From: Robert Nagy <[hidden email]>
> >>>>
> >>>> On 02/07/20 00:31 +0100, Stuart Henderson wrote:
> >>>>> running on 38 of these, btw.
> >>>>
> >>>> been running with this on all my workstations and laptops and on 3
> >>> build
> >>>> servers as well
> >>>
> >>> Are the issue that naddy@ saw solved?
> >>>
> >>> Did anybody do a *proper* test on anything besides amd64?  Especially
> >>> on architectures where the optimized clock_gettime is *not* available?
> >>
> >> Yes and yes.
> >
> > So, can we go ahead with this?
> >
>
> Sorry for late reaction. At least VirtualBox based virtual machines
> started to panic with the recent kernel. I reverted your diff and
> panics stopped.

Should already be fixed.

Reply | Threaded
Open this post in threaded view
|

Re: userland clock_gettime proof of concept

Theo de Raadt-2
In reply to this post by Vitaliy Makkoveev
Ah, it was seen.

But everyone has too much memory these days.

Vitaliy Makkoveev <[hidden email]> wrote:

> Sorry for late reaction. At least VirtualBox based virtual machines
> started to panic with the recent kernel. I reverted your diff and panics
> stopped.
> Screenshot attached.

Reply | Threaded
Open this post in threaded view
|

Re: userland clock_gettime proof of concept

Charlene Wendling
In reply to this post by George Koehler-2
Hi George,

On Thu, 25 Jun 2020 23:22:36 -0400
George Koehler wrote:

> On Mon, 22 Jun 2020 19:12:22 +0300
> Paul Irofti <[hidden email]> wrote:
>
> > New iteration:
> >
> >   - ps_timekeep should not coredump, pointed by deraadt@
> >   - set ps_timekeep to 0 before user uvm_map for randomization
> >   - map timekeep before fixup. confirmed by naddy@ that it fixes
> > NULL init
> >   - initialize va. clarified by kettenis@
>
> Here's macppc again.  My macppc isn't using your newest diff but does
> now need to define TC_TB in <machine/timetc.h>.
>
> The /sys/arch/powerpc/include/timetc.h in your diff never gets used,
> because there is no #include <powerpc/timetc.h>.  On macppc,
>   uname -m => macppc     and
>   uname -p => powerpc    are different,
> and #include <machine/timetc.h> is /sys/arch/macppc/include/timetc.h.
> I suspect that <machine/timetc.h> is /sys/arch/$i/include/timetc.h
> if and only if /sys/arch/$i/compile exists.
>
> 10 days ago, naddy said, "You only need the lower register."  That is
> correct, so this diff also stops using mftbu (the higher register).
>
> --- lib/libc/arch/powerpc/gen/usertc.c.before Wed Jun 24
> 16:42:36 2020 +++ lib/libc/arch/powerpc/gen/usertc.c Wed Jun
> 24 16:46:00 2020 @@ -18,4 +18,17 @@
>  #include <sys/types.h>
>  #include <sys/timetc.h>
>  
> -int (*const _tc_get_timecount)(struct timekeep *, u_int *) = NULL;
> +int
> +tc_get_timecount(struct timekeep *tk, u_int *tc)
> +{
> + u_int tb;
> +
> + if (tk->tk_user != TC_TB)
> + return -1;
> +
> + asm volatile("mftb %0" : "=r"(tb));
> + *tc = tb;
> + return 0;
> +}
> +int (*const _tc_get_timecount)(struct timekeep *tk, u_int *tc)
> + = tc_get_timecount;
> --- sys/arch/macppc/include/timetc.h.before Wed Jun 24
> 16:36:03 2020 +++ sys/arch/macppc/include/timetc.h Wed Jun 24
> 16:37:47 2020 @@ -18,6 +18,7 @@
>  #ifndef _MACHINE_TIMETC_H_
>  #define _MACHINE_TIMETC_H_
>  
> -#define TC_LAST 0
> +#define TC_TB 1
> +#define TC_LAST 2
>  
>  #endif /* _MACHINE_TIMETC_H_ */
> --- sys/arch/macppc/macppc/clock.c.before Wed Jun 24 16:39:58
> 2020 +++ sys/arch/macppc/macppc/clock.c Wed Jun 24 16:40:08
> 2020 @@ -57,7 +57,7 @@
>  static int32_t ticks_per_intr;
>  
>  static struct timecounter tb_timecounter = {
> - tb_get_timecount, NULL, 0x7fffffff, 0, "tb", 0, NULL, 0
> + tb_get_timecount, NULL, 0x7fffffff, 0, "tb", 0, NULL, TC_TB
>  };
>  
>  /* calibrate the timecounter frequency for the listed models */
>

I've applied that diff against what is committed to cvs already, it
works fine on my PowerBook G4 A1138. I'm building some ports in
parallel on that machine at the moment, and met no issues so far.

Thanks a lot for this one, i can see speed improvements :)

Charlène.

Reply | Threaded
Open this post in threaded view
|

Re: userland clock_gettime proof of concept

Theo de Raadt-2
> The /sys/arch/powerpc/include/timetc.h in your diff never gets used,
> because there is no #include <powerpc/timetc.h>.  On macppc,

I am fixing this issue for all the architectures, just being careful
by doing builds first.

Reply | Threaded
Open this post in threaded view
|

Re: userland clock_gettime proof of concept

Paul Irofti-4
In reply to this post by George Koehler-2
On 2020-06-26 06:22, George Koehler wrote:

> On Mon, 22 Jun 2020 19:12:22 +0300
> Paul Irofti <[hidden email]> wrote:
>
>> New iteration:
>>
>>    - ps_timekeep should not coredump, pointed by deraadt@
>>    - set ps_timekeep to 0 before user uvm_map for randomization
>>    - map timekeep before fixup. confirmed by naddy@ that it fixes NULL init
>>    - initialize va. clarified by kettenis@
>
> Here's macppc again.  My macppc isn't using your newest diff but does
> now need to define TC_TB in <machine/timetc.h>.
>
> The /sys/arch/powerpc/include/timetc.h in your diff never gets used,
> because there is no #include <powerpc/timetc.h>.  On macppc,
>    uname -m => macppc     and
>    uname -p => powerpc    are different,
> and #include <machine/timetc.h> is /sys/arch/macppc/include/timetc.h.
> I suspect that <machine/timetc.h> is /sys/arch/$i/include/timetc.h
> if and only if /sys/arch/$i/compile exists.
>
> 10 days ago, naddy said, "You only need the lower register."  That is
> correct, so this diff also stops using mftbu (the higher register).

Reads OK to me. Please make the adjustments to static functions that
kettenis@ mentioned in the alpha thread.

>
> --- lib/libc/arch/powerpc/gen/usertc.c.before Wed Jun 24 16:42:36 2020
> +++ lib/libc/arch/powerpc/gen/usertc.c Wed Jun 24 16:46:00 2020
> @@ -18,4 +18,17 @@
>   #include <sys/types.h>
>   #include <sys/timetc.h>
>  
> -int (*const _tc_get_timecount)(struct timekeep *, u_int *) = NULL;
> +int
> +tc_get_timecount(struct timekeep *tk, u_int *tc)
> +{
> + u_int tb;
> +
> + if (tk->tk_user != TC_TB)
> + return -1;
> +
> + asm volatile("mftb %0" : "=r"(tb));
> + *tc = tb;
> + return 0;
> +}
> +int (*const _tc_get_timecount)(struct timekeep *tk, u_int *tc)
> + = tc_get_timecount;
> --- sys/arch/macppc/include/timetc.h.before Wed Jun 24 16:36:03 2020
> +++ sys/arch/macppc/include/timetc.h Wed Jun 24 16:37:47 2020
> @@ -18,6 +18,7 @@
>   #ifndef _MACHINE_TIMETC_H_
>   #define _MACHINE_TIMETC_H_
>  
> -#define TC_LAST 0
> +#define TC_TB 1
> +#define TC_LAST 2
>  
>   #endif /* _MACHINE_TIMETC_H_ */
> --- sys/arch/macppc/macppc/clock.c.before Wed Jun 24 16:39:58 2020
> +++ sys/arch/macppc/macppc/clock.c Wed Jun 24 16:40:08 2020
> @@ -57,7 +57,7 @@
>   static int32_t ticks_per_intr;
>  
>   static struct timecounter tb_timecounter = {
> - tb_get_timecount, NULL, 0x7fffffff, 0, "tb", 0, NULL, 0
> + tb_get_timecount, NULL, 0x7fffffff, 0, "tb", 0, NULL, TC_TB
>   };
>  
>   /* calibrate the timecounter frequency for the listed models */
>

Reply | Threaded
Open this post in threaded view
|

Re: userland clock_gettime proof of concept

Paul Irofti-4
In reply to this post by Theo de Raadt-2
On 2020-07-08 01:09, Theo de Raadt wrote:
>> The /sys/arch/powerpc/include/timetc.h in your diff never gets used,
>> because there is no #include <powerpc/timetc.h>.  On macppc,
>
> I am fixing this issue for all the architectures, just being careful
> by doing builds first.
>

Thank you for handling this.

Reply | Threaded
Open this post in threaded view
|

Re: userland clock_gettime proof of concept

Mark Kettenis
In reply to this post by Paul Irofti-4
> From: Paul Irofti <[hidden email]>
> Date: Wed, 8 Jul 2020 12:05:04 +0300
>
> On 2020-06-26 06:22, George Koehler wrote:
> > On Mon, 22 Jun 2020 19:12:22 +0300
> > Paul Irofti <[hidden email]> wrote:
> >
> >> New iteration:
> >>
> >>    - ps_timekeep should not coredump, pointed by deraadt@
> >>    - set ps_timekeep to 0 before user uvm_map for randomization
> >>    - map timekeep before fixup. confirmed by naddy@ that it fixes NULL init
> >>    - initialize va. clarified by kettenis@
> >
> > Here's macppc again.  My macppc isn't using your newest diff but does
> > now need to define TC_TB in <machine/timetc.h>.
> >
> > The /sys/arch/powerpc/include/timetc.h in your diff never gets used,
> > because there is no #include <powerpc/timetc.h>.  On macppc,
> >    uname -m => macppc     and
> >    uname -p => powerpc    are different,
> > and #include <machine/timetc.h> is /sys/arch/macppc/include/timetc.h.
> > I suspect that <machine/timetc.h> is /sys/arch/$i/include/timetc.h
> > if and only if /sys/arch/$i/compile exists.
> >
> > 10 days ago, naddy said, "You only need the lower register."  That is
> > correct, so this diff also stops using mftbu (the higher register).
>
> Reads OK to me. Please make the adjustments to static functions that
> kettenis@ mentioned in the alpha thread.

To add to that:

* TC_LAST isn't needed, so kill that
* tc_get_timecount

Also in the sparc64 I did an exact copy of the kernel implementation
of the functions to read the counter.  I only made them static inline.
That makes it easier to verify that they are indeed identical.

> > --- lib/libc/arch/powerpc/gen/usertc.c.before Wed Jun 24 16:42:36 2020
> > +++ lib/libc/arch/powerpc/gen/usertc.c Wed Jun 24 16:46:00 2020
> > @@ -18,4 +18,17 @@
> >   #include <sys/types.h>
> >   #include <sys/timetc.h>
> >  
> > -int (*const _tc_get_timecount)(struct timekeep *, u_int *) = NULL;
> > +int
> > +tc_get_timecount(struct timekeep *tk, u_int *tc)
> > +{
> > + u_int tb;
> > +
> > + if (tk->tk_user != TC_TB)
> > + return -1;
> > +
> > + asm volatile("mftb %0" : "=r"(tb));
> > + *tc = tb;
> > + return 0;
> > +}
> > +int (*const _tc_get_timecount)(struct timekeep *tk, u_int *tc)
> > + = tc_get_timecount;
> > --- sys/arch/macppc/include/timetc.h.before Wed Jun 24 16:36:03 2020
> > +++ sys/arch/macppc/include/timetc.h Wed Jun 24 16:37:47 2020
> > @@ -18,6 +18,7 @@
> >   #ifndef _MACHINE_TIMETC_H_
> >   #define _MACHINE_TIMETC_H_
> >  
> > -#define TC_LAST 0
> > +#define TC_TB 1
> > +#define TC_LAST 2
> >  
> >   #endif /* _MACHINE_TIMETC_H_ */
> > --- sys/arch/macppc/macppc/clock.c.before Wed Jun 24 16:39:58 2020
> > +++ sys/arch/macppc/macppc/clock.c Wed Jun 24 16:40:08 2020
> > @@ -57,7 +57,7 @@
> >   static int32_t ticks_per_intr;
> >  
> >   static struct timecounter tb_timecounter = {
> > - tb_get_timecount, NULL, 0x7fffffff, 0, "tb", 0, NULL, 0
> > + tb_get_timecount, NULL, 0x7fffffff, 0, "tb", 0, NULL, TC_TB
> >   };
> >  
> >   /* calibrate the timecounter frequency for the listed models */
> >
>
>

Reply | Threaded
Open this post in threaded view
|

Re: userland clock_gettime proof of concept

Christian Weisgerber
In reply to this post by George Koehler-2
On 2020-06-26, George Koehler <[hidden email]> wrote:

> Here's macppc again.  My macppc isn't using your newest diff but does
> now need to define TC_TB in <machine/timetc.h>.

Crucial question: How confident are we that TB is in sync on
multiprocessor machines?

--
Christian "naddy" Weisgerber                          [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: userland clock_gettime proof of concept

Theo de Raadt-2
Christian Weisgerber <[hidden email]> wrote:

> On 2020-06-26, George Koehler <[hidden email]> wrote:
>
> > Here's macppc again.  My macppc isn't using your newest diff but does
> > now need to define TC_TB in <machine/timetc.h>.
>
> Crucial question: How confident are we that TB is in sync on
> multiprocessor machines?

A few small test programs exposed the problem on alpha for me.

This one identified the problem in a glaring fashion.  The upper
word of the register is per-cpu, and it toggles back and forth
upon each nanosleep call, as we return to a different cpu.  (Quite
obviously the scheduler has zero afinity)

A slightly different test program was able to identify occasions
when returning to the other cpu would return a clock value which was
earlier.  A variation of test_time_skew.c, I don't have it handy.

Maybe this can be quickly convered for macppc, to test?

I don't understand yet how to fix alpha.  It looks like we need a
mechanism to know what cpu# we are on, and a table for the variation.
(The upper word of the alpha clock register assists with profiling
it seems, so I don't think we can put a delta there)

#include <sys/types.h>
#include <sys/time.h>
#include <stdio.h>

main()
{
        struct timespec s = {0, 0};
        long times[40];
        volatile unsigned long t;
        int i;

        for (i = 0; i < sizeof times / sizeof times[0]; i++) {
                __asm volatile("rpcc %0" : "=r" (t));
                times[i] = t;
                nanosleep(&s, NULL);
        }
        for (i = 0; i < sizeof times / sizeof times[0]; i++) {
                t = times[i];
                printf("%lx %x\n", (t >> 32) & 0xffffffff, t & 0xffffffff);
        }
}


Reply | Threaded
Open this post in threaded view
|

Re: userland clock_gettime proof of concept

Theo de Raadt-2
In reply to this post by Christian Weisgerber
Christian Weisgerber <[hidden email]> wrote:

> On 2020-06-26, George Koehler <[hidden email]> wrote:
>
> > Here's macppc again.  My macppc isn't using your newest diff but does
> > now need to define TC_TB in <machine/timetc.h>.
>
> Crucial question: How confident are we that TB is in sync on
> multiprocessor machines?

Ugly test program (not showing it) seems to show syncronized clocks on
powerpc, or at least closely sycronized.  It might be one register.

Reply | Threaded
Open this post in threaded view
|

Re: userland clock_gettime proof of concept

Scott Cheloha
On Wed, Jul 08, 2020 at 09:36:03AM -0600, Theo de Raadt wrote:

> Christian Weisgerber <[hidden email]> wrote:
>
> > On 2020-06-26, George Koehler <[hidden email]> wrote:
> >
> > > Here's macppc again.  My macppc isn't using your newest diff but does
> > > now need to define TC_TB in <machine/timetc.h>.
> >
> > Crucial question: How confident are we that TB is in sync on
> > multiprocessor machines?
>
> Ugly test program (not showing it) seems to show syncronized clocks on
> powerpc, or at least closely sycronized.  It might be one register.

I guess I'll share mine, then.

With this program I can consistently catch an offset of -2 to -1 on my
G5.  Such a small offset is probably fine, right?  I'm curious if it
gets any larger on other machines.

peanut$ cc -o tb-sync-check tb-sync-check.c -lpthread
peanut$ ./tb-sync-check                              
tb-sync-check: T1: regression: local 91661686941 global 91661686943 (-2)
peanut$ ./tb-sync-check
tb-sync-check: T1: regression: local 91721148999 global 91721149000 (-1)
peanut$ ./tb-sync-check
tb-sync-check: T1: regression: local 91773944538 global 91773944539 (-1)
peanut$ ./tb-sync-check
tb-sync-check: T2: regression: local 91839284894 global 91839284895 (-1)
peanut$ ./tb-sync-check
tb-sync-check: T1: regression: local 91885196234 global 91885196235 (-1)
peanut$ ./tb-sync-check
tb-sync-check: T2: regression: local 92460898355 global 92460898356 (-1)
peanut$ ./tb-sync-check
tb-sync-check: T2: regression: local 92553793423 global 92553793424 (-1)
peanut$ ./tb-sync-check
tb-sync-check: T1: regression: local 92602154422 global 92602154423 (-1)
peanut$ ./tb-sync-check
tb-sync-check: T1: regression: local 92643366617 global 92643366619 (-2)
$
$ cat tb-sync-check.c
#include <sys/param.h>
#include <sys/atomic.h>

#include <err.h>
#include <pthread.h>
#include <stdint.h>
#include <signal.h>

uint64_t tb_global_max;
pthread_mutex_t tb_mtx = PTHREAD_MUTEX_INITIALIZER;
volatile sig_atomic_t uninterrupted = 1;

void handle_int(int);
uint64_t mftb(void);
void *tb_sync_test(void *);

int
main(int argc, char *argv[])
{
        pthread_t thread[2];
        sigset_t empty;
        int error, i;

        signal(SIGINT, handle_int);

        tb_global_max = mftb();

        for (i = 0; i < nitems(thread); i++) {
                error = pthread_create(&thread[i], NULL, tb_sync_test,
                    (void *)(i + 1));
                if (error)
                        errc(1, error, "pthread_create");
        }

        sigemptyset(&empty);
        sigsuspend(&empty);

        for (i = 0; i < nitems(thread); i++) {
                error = pthread_join(thread[i], NULL);
                if (error)
                        errc(1, error, "pthread_join");
        }

        return 0;
}

void
handle_int(int signo)
{
        uninterrupted = 0;
}

/*
 * Cribbed from sys/arch/powerpc/include/cpu.h.
 *
 * Hopefully it's correct.
 */
uint64_t
mftb(void)
{
        u_long scratch;
        u_int64_t tb;

        asm volatile ("1: mftbu %0; mftb %L0; mftbu %1;"
            " cmpw 0,%0,%1; bne 1b" : "=r"(tb), "=r"(scratch));
        return tb;
}

void *
tb_sync_test(void *id)
{
        uint64_t tb_local;
        unsigned int thread_id = (unsigned int)id;

        while (uninterrupted) {
                pthread_mutex_lock(&tb_mtx);
                tb_local = mftb();

                /*
                 * Does the local TimeBase lag behind the global maximum?
                 */
                if (tb_local < tb_global_max) {
                        errx(1, "T%u: regression: local %llu global %llu (%lld)",
                            thread_id,
                            (unsigned long long)tb_local,
                            (unsigned long long)tb_global_max,
                            (long long)(tb_local - tb_global_max));
                }

                /*
                 * Update the global maximum.
                 */
                tb_global_max = tb_local;
                pthread_mutex_unlock(&tb_mtx);
        }

        return NULL;
}

Reply | Threaded
Open this post in threaded view
|

Re: userland clock_gettime proof of concept

George Koehler-2
On Wed, 8 Jul 2020 11:32:09 -0500
Scott Cheloha <[hidden email]> wrote:

> On Wed, Jul 08, 2020 at 09:36:03AM -0600, Theo de Raadt wrote:
> > Ugly test program (not showing it) seems to show syncronized clocks on
> > powerpc, or at least closely sycronized.  It might be one register.
>
> I guess I'll share mine, then.
>
> With this program I can consistently catch an offset of -2 to -1 on my
> G5.  Such a small offset is probably fine, right?  I'm curious if it
> gets any larger on other machines.

/sys/arch/macppc/macppc/cpu.c has code to "Sync timebase" of the cpus.
One cpu does h->tb = ppc_mftb() + 100000; other does ppc_mttb(h->tb).

Your -2 to -1 implies that time went backwards!  I don't like this, but
time also goes backwards in OpenBSD/amd64 and in other systems.
(Microsoft Docs, "Acquiring high-resolution time stamps", says that
stamps "from different threads", within a small range of uncertainty,
have "ambiguous ordering".)

Reply | Threaded
Open this post in threaded view
|

Re: userland clock_gettime proof of concept

George Koehler-2
In reply to this post by Mark Kettenis
On Wed, 8 Jul 2020 14:26:02 +0200 (CEST)
Mark Kettenis <[hidden email]> wrote:

> > From: Paul Irofti <[hidden email]>
> > Reads OK to me. Please make the adjustments to static functions that
> > kettenis@ mentioned in the alpha thread.
>
> To add to that:
>
> * TC_LAST isn't needed, so kill that
> * tc_get_timecount
>
> Also in the sparc64 I did an exact copy of the kernel implementation
> of the functions to read the counter.  I only made them static inline.
> That makes it easier to verify that they are indeed identical.

Here is the diff for macppc after I drop TC_LAST, recopy usertc.c from
amd64 (so tc_get_timecount is now static), and copy ppc_mftbl from
/sys/arch/powerpc/include/cpu.h

OK to commit?

Index: lib/libc/arch/powerpc/gen/usertc.c
===================================================================
RCS file: /cvs/src/lib/libc/arch/powerpc/gen/usertc.c,v
retrieving revision 1.1
diff -u -p -r1.1 usertc.c
--- lib/libc/arch/powerpc/gen/usertc.c 6 Jul 2020 13:33:05 -0000 1.1
+++ lib/libc/arch/powerpc/gen/usertc.c 9 Jul 2020 21:41:47 -0000
@@ -1,4 +1,4 @@
-/* $OpenBSD: usertc.c,v 1.1 2020/07/06 13:33:05 pirofti Exp $ */
+/* $OpenBSD: usertc.c,v 1.2 2020/07/08 09:17:48 kettenis Exp $ */
 /*
  * Copyright (c) 2020 Paul Irofti <[hidden email]>
  *
@@ -18,4 +18,24 @@
 #include <sys/types.h>
 #include <sys/timetc.h>
 
-int (*const _tc_get_timecount)(struct timekeep *, u_int *) = NULL;
+static __inline u_int32_t
+ppc_mftbl (void)
+{
+ int ret;
+ __asm volatile ("mftb %0" : "=r" (ret));
+ return ret;
+}
+
+static int
+tc_get_timecount(struct timekeep *tk, u_int *tc)
+{
+ switch (tk->tk_user) {
+ case TC_TB:
+ *tc = ppc_mftbl();
+ return 0;
+ }
+
+ return -1;
+}
+
+int (*const _tc_get_timecount)(struct timekeep *, u_int *) = tc_get_timecount;
Index: sys/arch/macppc/include/timetc.h
===================================================================
RCS file: /cvs/src/sys/arch/macppc/include/timetc.h,v
retrieving revision 1.1
diff -u -p -r1.1 timetc.h
--- sys/arch/macppc/include/timetc.h 6 Jul 2020 13:33:07 -0000 1.1
+++ sys/arch/macppc/include/timetc.h 9 Jul 2020 21:41:48 -0000
@@ -18,6 +18,6 @@
 #ifndef _MACHINE_TIMETC_H_
 #define _MACHINE_TIMETC_H_
 
-#define TC_LAST 0
+#define TC_TB 1
 
 #endif /* _MACHINE_TIMETC_H_ */
Index: sys/arch/macppc/macppc/clock.c
===================================================================
RCS file: /cvs/src/sys/arch/macppc/macppc/clock.c,v
retrieving revision 1.44
diff -u -p -r1.44 clock.c
--- sys/arch/macppc/macppc/clock.c 6 Jul 2020 13:33:08 -0000 1.44
+++ sys/arch/macppc/macppc/clock.c 9 Jul 2020 21:41:48 -0000
@@ -57,7 +57,7 @@ u_int32_t ns_per_tick = 320;
 static int32_t ticks_per_intr;
 
 static struct timecounter tb_timecounter = {
- tb_get_timecount, NULL, 0x7fffffff, 0, "tb", 0, NULL, 0
+ tb_get_timecount, NULL, 0x7fffffff, 0, "tb", 0, NULL, TC_TB
 };
 
 /* calibrate the timecounter frequency for the listed models */

Reply | Threaded
Open this post in threaded view
|

Re: userland clock_gettime proof of concept

Mark Kettenis
> Date: Fri, 10 Jul 2020 19:03:58 -0400
> From: George Koehler <[hidden email]>
>
> On Wed, 8 Jul 2020 14:26:02 +0200 (CEST)
> Mark Kettenis <[hidden email]> wrote:
>
> > > From: Paul Irofti <[hidden email]>
> > > Reads OK to me. Please make the adjustments to static functions that
> > > kettenis@ mentioned in the alpha thread.
> >
> > To add to that:
> >
> > * TC_LAST isn't needed, so kill that
> > * tc_get_timecount
> >
> > Also in the sparc64 I did an exact copy of the kernel implementation
> > of the functions to read the counter.  I only made them static inline.
> > That makes it easier to verify that they are indeed identical.
>
> Here is the diff for macppc after I drop TC_LAST, recopy usertc.c from
> amd64 (so tc_get_timecount is now static), and copy ppc_mftbl from
> /sys/arch/powerpc/include/cpu.h
>
> OK to commit?
>
> Index: lib/libc/arch/powerpc/gen/usertc.c
> ===================================================================
> RCS file: /cvs/src/lib/libc/arch/powerpc/gen/usertc.c,v
> retrieving revision 1.1
> diff -u -p -r1.1 usertc.c
> --- lib/libc/arch/powerpc/gen/usertc.c 6 Jul 2020 13:33:05 -0000 1.1
> +++ lib/libc/arch/powerpc/gen/usertc.c 9 Jul 2020 21:41:47 -0000
> @@ -1,4 +1,4 @@
> -/* $OpenBSD: usertc.c,v 1.1 2020/07/06 13:33:05 pirofti Exp $ */
> +/* $OpenBSD: usertc.c,v 1.2 2020/07/08 09:17:48 kettenis Exp $ */
>  /*
>   * Copyright (c) 2020 Paul Irofti <[hidden email]>
>   *
> @@ -18,4 +18,24 @@
>  #include <sys/types.h>
>  #include <sys/timetc.h>
>  
> -int (*const _tc_get_timecount)(struct timekeep *, u_int *) = NULL;
> +static __inline u_int32_t
> +ppc_mftbl (void)
> +{
> + int ret;
> + __asm volatile ("mftb %0" : "=r" (ret));
> + return ret;
> +}
> +
> +static int

That should be u_int.  I now see that this is broken in the amd64
version as well.

Otherwise this is ok kettenis@

> +tc_get_timecount(struct timekeep *tk, u_int *tc)
> +{
> + switch (tk->tk_user) {
> + case TC_TB:
> + *tc = ppc_mftbl();
> + return 0;
> + }
> +
> + return -1;
> +}
> +
> +int (*const _tc_get_timecount)(struct timekeep *, u_int *) = tc_get_timecount;
> Index: sys/arch/macppc/include/timetc.h
> ===================================================================
> RCS file: /cvs/src/sys/arch/macppc/include/timetc.h,v
> retrieving revision 1.1
> diff -u -p -r1.1 timetc.h
> --- sys/arch/macppc/include/timetc.h 6 Jul 2020 13:33:07 -0000 1.1
> +++ sys/arch/macppc/include/timetc.h 9 Jul 2020 21:41:48 -0000
> @@ -18,6 +18,6 @@
>  #ifndef _MACHINE_TIMETC_H_
>  #define _MACHINE_TIMETC_H_
>  
> -#define TC_LAST 0
> +#define TC_TB 1
>  
>  #endif /* _MACHINE_TIMETC_H_ */
> Index: sys/arch/macppc/macppc/clock.c
> ===================================================================
> RCS file: /cvs/src/sys/arch/macppc/macppc/clock.c,v
> retrieving revision 1.44
> diff -u -p -r1.44 clock.c
> --- sys/arch/macppc/macppc/clock.c 6 Jul 2020 13:33:08 -0000 1.44
> +++ sys/arch/macppc/macppc/clock.c 9 Jul 2020 21:41:48 -0000
> @@ -57,7 +57,7 @@ u_int32_t ns_per_tick = 320;
>  static int32_t ticks_per_intr;
>  
>  static struct timecounter tb_timecounter = {
> - tb_get_timecount, NULL, 0x7fffffff, 0, "tb", 0, NULL, 0
> + tb_get_timecount, NULL, 0x7fffffff, 0, "tb", 0, NULL, TC_TB
>  };
>  
>  /* calibrate the timecounter frequency for the listed models */
>

1 ... 7891011