6.4-release tset(1) really slow, what have I missed?

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

6.4-release tset(1) really slow, what have I missed?

Adam Thompson
I've successfully installed OpenBSD 6.4-RELEASE at OVH, but I'm noticing
one thing there that's different from everywhere else I've used 6.4.

tset(1) takes approximately 12-15 seconds to execute, (almost) every
time.

On a DigitalOcean VPS running 6.3-STABLE (via openup) tset sensibly
takes about 1 or 2 seconds:
   [hidden email]:~$ time tset -s
   TERM=xterm;
       0m01.01s real     0m00.00s user     0m00.01s system
   [hidden email]:~$ uname -r
   6.3

On the OVH VPS running 6.4-STABLE (via openup), the same command takes
15 seconds:
   [hidden email]:~$ time tset -s
   TERM=xterm;
       0m15.19s real     0m00.00s user     0m00.01s system
   [hidden email]:~$ uname -r
   6.4


That's from two SSH sessions from the same client with the same
parameters.

I've captured ktrace(1) output, which shows tset(1) doing, well,
nothing:
...
  57429/443422  tset     0.035908 CALL  
kbind(0x7f7fffff7678,24,0xecf2201fc1aab9ca)
  57429/443422  tset     0.035933 RET   kbind 0
  57429/443422  tset     0.035950 CALL  
nanosleep(0x7f7fffff7760,0x7f7fffff7750)
  57429/443422  tset     0.035967 STRU  struct timespec { 1 }
  57429/443422  tset     15.809238 STRU  struct timespec { 0 }
  57429/443422  tset     15.809272 RET   nanosleep 0
  57429/443422  tset     15.809303 CALL  
kbind(0x7f7fffff76c8,24,0xecf2201fc1aab9ca)
  57429/443422  tset     15.809380 RET   kbind 0
...

I don't think this is a bug in 6.4, it's clearly environment-specific...
but I have no idea what on earth could be causing it.

(dmesg, etc. omitted in this first message, since it's so ridiculously
specific.)

Oh, and to make it even weirder, it doesn't ALWAYS happen.  It ran
quickly twice earlier today, but never again.

Normally I'd say "it's DNS", and I thought it was due to the slow login
times, but ktrace(1) says otherwise.

Any ideas?

Thanks,
-Adam

Reply | Threaded
Open this post in threaded view
|

Re: 6.4-release tset(1) really slow, what have I missed?

Philip Guenther-2
On Sun, Dec 2, 2018 at 2:15 PM Adam Thompson <[hidden email]> wrote:

> I've successfully installed OpenBSD 6.4-RELEASE at OVH, but I'm noticing
> one thing there that's different from everywhere else I've used 6.4.
>
> tset(1) takes approximately 12-15 seconds to execute, (almost) every
> time.
>
> On a DigitalOcean VPS running 6.3-STABLE (via openup) tset sensibly
> takes about 1 or 2 seconds:
>    [hidden email]:~$ time tset -s
>    TERM=xterm;
>        0m01.01s real     0m00.00s user     0m00.01s system
>    [hidden email]:~$ uname -r
>    6.3
>
> On the OVH VPS running 6.4-STABLE (via openup), the same command takes
> 15 seconds:
>    [hidden email]:~$ time tset -s
>    TERM=xterm;
>        0m15.19s real     0m00.00s user     0m00.01s system
>    [hidden email]:~$ uname -r
>    6.4
>
>
> That's from two SSH sessions from the same client with the same
> parameters.
>
> I've captured ktrace(1) output, which shows tset(1) doing, well,
> nothing:
> ...
>   57429/443422  tset     0.035908 CALL
> kbind(0x7f7fffff7678,24,0xecf2201fc1aab9ca)
>   57429/443422  tset     0.035933 RET   kbind 0
>   57429/443422  tset     0.035950 CALL
> nanosleep(0x7f7fffff7760,0x7f7fffff7750)
>   57429/443422  tset     0.035967 STRU  struct timespec { 1 }
>   57429/443422  tset     15.809238 STRU  struct timespec { 0 }
>   57429/443422  tset     15.809272 RET   nanosleep 0
>   57429/443422  tset     15.809303 CALL
> kbind(0x7f7fffff76c8,24,0xecf2201fc1aab9ca)
>   57429/443422  tset     15.809380 RET   kbind 0
> ...
>
> I don't think this is a bug in 6.4, it's clearly environment-specific...
> but I have no idea what on earth could be causing it.
>

It requested a sleep of 1 second and 15 seconds passed.  That's a kernel
timetracking issue, so the output of "sysctl kern.timecounter" would be a
good place to start.  Is this is an MP kernel using the CPU TSC, but on a
VM where the virtual CPU's TSCs aren't in sync?


Philip Guenther
Reply | Threaded
Open this post in threaded view
|

Re: 6.4-release tset(1) really slow, what have I missed?

Adam Thompson
On 2018-12-02 20:50, Philip Guenther wrote:

> On Sun, Dec 2, 2018 at 2:15 PM Adam Thompson <[hidden email]> wrote:
>
>> I've successfully installed OpenBSD 6.4-RELEASE at OVH, but I'm noticing
>> one thing there that's different from everywhere else I've used 6.4.
>>
>> tset(1) takes approximately 12-15 seconds to execute, (almost) every
>> time.
>>
>> On a DigitalOcean VPS running 6.3-STABLE (via openup) tset sensibly
>> takes about 1 or 2 seconds:
>> [hidden email]:~$ time tset -s
>> TERM=xterm;
>> 0m01.01s real     0m00.00s user     0m00.01s system
>> [hidden email]:~$ uname -r
>> 6.3
>>
>> On the OVH VPS running 6.4-STABLE (via openup), the same command takes
>> 15 seconds:
>> [hidden email]:~$ time tset -s
>> TERM=xterm;
>> 0m15.19s real     0m00.00s user     0m00.01s system
>> [hidden email]:~$ uname -r
>> 6.4
>>
>> That's from two SSH sessions from the same client with the same
>> parameters.
>>
>> I've captured ktrace(1) output, which shows tset(1) doing, well,
>> nothing:
>> ...
>> 57429/443422  tset     0.035908 CALL  
>> kbind(0x7f7fffff7678,24,0xecf2201fc1aab9ca)
>> 57429/443422  tset     0.035933 RET   kbind 0
>> 57429/443422  tset     0.035950 CALL  
>> nanosleep(0x7f7fffff7760,0x7f7fffff7750)
>> 57429/443422  tset     0.035967 STRU  struct timespec { 1 }
>> 57429/443422  tset     15.809238 STRU  struct timespec { 0 }
>> 57429/443422  tset     15.809272 RET   nanosleep 0
>> 57429/443422  tset     15.809303 CALL  
>> kbind(0x7f7fffff76c8,24,0xecf2201fc1aab9ca)
>> 57429/443422  tset     15.809380 RET   kbind 0
>> ...
>>
>> I don't think this is a bug in 6.4, it's clearly environment-specific...
>> but I have no idea what on earth could be causing it.
>
> It requested a sleep of 1 second and 15 seconds passed.  That's a kernel timetracking issue, so the output of "sysctl kern.timecounter" would be a good place to start.  Is this is an MP kernel using the CPU TSC, but on a VM where the virtual CPU's TSCs aren't in sync?
>
> Philip Guenther

Thanks for the pointer!  I noticed, once, that the system clock was way
off, but I assumed that was one of the boots where I didn't have
networking at startup so ntpd(8) -s failed to operate as expected.  Bad
kernel timekeeping would also produce that result, though.

Anyway:

[hidden email]:~$ sysctl kern.timecounter
kern.timecounter.tick=1
kern.timecounter.timestepwarnings=0
kern.timecounter.hardware=acpitimer
kern.timecounter.choice=i8254(0) acpitimer0(1000) dummy(-1000000)

and it's an SP kernel running on one vCPU.  No way of knowing what's
happening under the hood, other than that it's OpenStack Nova, which
IIRC means KVM virtualization.  Note the lack of advertised TSC support.


I'm unsure if my test is valid, but I switched to i8254 (confirmed
successful via sysctl), and tset(1) continues to pause for an
unnaturally long time.  But then I rebooted and re-tested the same
sysctl vaules, and this time tset(1) took 1sec, as expected.  WTF...

Well, putting that into sysctl.conf seems to be a reasonable workaround
for now.  I also enabled timestepwarnings, and they are occurring,
although I don't yet know how often.

Is this likely to be a big enough problem that I should ditch this VPS
platform for now?

dmesg output, to verify SP kernel with no TSC:

OpenBSD 6.4 (GENERIC) #1: Mon Nov 26 09:32:17 CET 2018
[hidden email]:/usr/src/sys/arch/amd64/compile/GENERIC
real mem = 4177379328 (3983MB)
avail mem = 4041621504 (3854MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.8 @ 0xf6890 (10 entries)
bios0: vendor SeaBIOS version "2:1.10.2-58953eb7" date 04/01/2014
bios0: OpenStack Foundation OpenStack Nova
acpi0 at bios0: rev 0
acpi0: sleep states S3 S4 S5
acpi0: tables DSDT FACP SSDT APIC
acpi0: wakeup devices
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee00000: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel Core Processor (Haswell, no TSX, IBRS), 2394.81 MHz,
06-3c-01
cpu0:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,IBRS,IBPB,ARAT,XSAVEOPT,MELTDOWN
cpu0: 64KB 64b/line 2-way I-cache, 64KB 64b/line 2-way D-cache, 512KB
64b/line 16-way L2 cache
cpu0: ITLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
cpu0: DTLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
cpu0: apic clock running at 1000MHz
ioapic0 at mainbus0: apid 0 pa 0xfec00000, version 11, 24 pins
acpiprt0 at acpi0: bus 0 (PCI0)
acpicpu0 at acpi0: C1(@1 halt!)
...etc...

Thanks,

-Adam
Reply | Threaded
Open this post in threaded view
|

Re: 6.4-release tset(1) really slow, what have I missed?

Adam Thompson
On 2018-12-02 22:12, Adam Thompson wrote:

> I'm unsure if my test is valid, but I switched to i8254 (confirmed successful via sysctl), and tset(1) continues to pause for an unnaturally long time.  But then I rebooted and re-tested the same sysctl vaules, and this time tset(1) took 1sec, as expected.  WTF...
>
> Well, putting that into sysctl.conf seems to be a reasonable workaround for now.  I also enabled timestepwarnings, and they are occurring, although I don't yet know how often.

I understand why I got inconsistent results... I had two different hosts
open in the two windows I was using.  Thank goodness they were both just
personal systems!

I'll re-test tomorrow morning when I'm more awake!

-Adam