idle threads say 100%

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

idle threads say 100%

Ted Unangst-6
I know there have been some changes in this area, but this seems like a
remaining bug.

Run top. Press S to show system processes. Observe idle1 and idle3 (the
hyperthreaded idles) stuck at 100%. With 0:00 time.

Now if you switch hw.smt=1 the 100% slowly ticks down, but the time
immediately jumps up to many hours.

Just a peculiarity I noticed. Perhaps there's some way to consolidate special
cases?

Reply | Threaded
Open this post in threaded view
|

Re: idle threads say 100%

Theo de Raadt-2
There was work before release which had to be pulled since it changed
API which some ports could not handle.

A new approach (adding a different API) is in the works from Scott.

However, it is possible there are other bugs in the way the cpus are
'parked'.

Or it could simply be bugs in top, which Scott is also addressing.

> I know there have been some changes in this area, but this seems like a
> remaining bug.
>
> Run top. Press S to show system processes. Observe idle1 and idle3 (the
> hyperthreaded idles) stuck at 100%. With 0:00 time.
>
> Now if you switch hw.smt=1 the 100% slowly ticks down, but the time
> immediately jumps up to many hours.
>
> Just a peculiarity I noticed. Perhaps there's some way to consolidate special
> cases?
>

Reply | Threaded
Open this post in threaded view
|

Re: idle threads say 100%

Mark Kettenis
> From: "Theo de Raadt" <[hidden email]>
> Date: Thu, 01 Nov 2018 14:21:43 -0600
>
> There was work before release which had to be pulled since it changed
> API which some ports could not handle.
>
> A new approach (adding a different API) is in the works from Scott.
>
> However, it is possible there are other bugs in the way the cpus are
> 'parked'.
>
> Or it could simply be bugs in top, which Scott is also addressing.

I think Ted is pointing out a different issue here that doesn't really
have anything to do with SMT.  The issue is that in some cases we end
up with CPUs reporting being 100% busy running the idle thread instead
of reporting being 100% idle.  This happens quite a lot on machines
with lots of CPUs immediately after they are booted.  Usually this
funny state disappears after some time.

An idle CPU is of course running the idle thread, so in that sense
this isn't super-strange.  But it does indicate there is some kind of
accounting issue.  I have a feeling this happens before any processes
have been scheduled on these CPUs.  But I've never found the problem...

> > I know there have been some changes in this area, but this seems like a
> > remaining bug.
> >
> > Run top. Press S to show system processes. Observe idle1 and idle3 (the
> > hyperthreaded idles) stuck at 100%. With 0:00 time.
> >
> > Now if you switch hw.smt=1 the 100% slowly ticks down, but the time
> > immediately jumps up to many hours.
> >
> > Just a peculiarity I noticed. Perhaps there's some way to consolidate special
> > cases?
> >
>
>

Reply | Threaded
Open this post in threaded view
|

Re: idle threads say 100%

Alexander Bluhm
On Fri, Nov 02, 2018 at 10:19:15PM +0100, Mark Kettenis wrote:

> I think Ted is pointing out a different issue here that doesn't really
> have anything to do with SMT.  The issue is that in some cases we end
> up with CPUs reporting being 100% busy running the idle thread instead
> of reporting being 100% idle.  This happens quite a lot on machines
> with lots of CPUs immediately after they are booted.  Usually this
> funny state disappears after some time.
>
> An idle CPU is of course running the idle thread, so in that sense
> this isn't super-strange.  But it does indicate there is some kind of
> accounting issue.  I have a feeling this happens before any processes
> have been scheduled on these CPUs.  But I've never found the problem...

I can easily reproduce this on a 2 socket machine with 4 cores each.
Hyper threads are turned off in BIOS.

schedcpu() does not account the run time if p->p_slptime > 1.  So
fresh idle threads have a CPU percentage of 99%.

USER       PID %CPU %MEM   VSZ   RSS TT  STAT  STARTED       TIME COMMAND
root     11006 99.0  0.0     0     0 ??  RK/6  10:19PM    0:15.51 (idle6)
root     26529 99.0  0.0     0     0 ??  RK/7  10:19PM    2:55.45 (idle7)
root     55382 99.0  0.0     0     0 ??  RK/3  10:19PM    2:55.27 (idle3)
root     84574 99.0  0.0     0     0 ??  RK/4  10:19PM    2:54.90 (idle4)
root     53490 99.0  0.0     0     0 ??  RK/5  10:19PM    2:49.72 (idle5)
root     59318 73.2  0.0     0     0 ??  DK    10:19PM    2:53.36 (idle0)
root     24358  0.0  0.0     0     0 ??  RK/1  10:19PM    2:54.66 (idle1)
root     15902  0.0  0.0     0     0 ??  RK/2  10:19PM    2:55.04 (idle2)

This effect goes away after runnning multiple processes on all cores.

USER       PID %CPU %MEM   VSZ   RSS TT  STAT  STARTED       TIME COMMAND
root     55382 11.7  0.0     0     0 ??  RK/3  10:19PM    4:06.52 (idle3)
root     84574 11.6  0.0     0     0 ??  DK    10:19PM    4:12.00 (idle4)
root     53490 11.7  0.0     0     0 ??  RK/5  10:19PM    4:11.94 (idle5)
root     11006 11.7  0.0     0     0 ??  RK/6  10:19PM    3:48.90 (idle6)
root     26529  1.3  0.0     0     0 ??  RK/7  10:19PM    4:11.81 (idle7)
root     59318  0.4  0.0     0     0 ??  DK    10:19PM    4:09.78 (idle0)
root     24358  0.0  0.0     0     0 ??  RK/1  10:19PM    4:11.37 (idle1)
root     15902  0.0  0.0     0     0 ??  RK/2  10:19PM    4:11.87 (idle2)

If I initialize p_slptime with 127 the 99% effect does not happen
from the beginning.

USER       PID %CPU %MEM   VSZ   RSS TT  STAT  STARTED       TIME COMMAND
root     86374  0.0  0.0     0     0 ??  RK/1  10:18PM    3:10.15 (idle1)
root     78654  0.0  0.0     0     0 ??  DK    10:18PM    3:09.52 (idle2)
root     85088  0.0  0.0     0     0 ??  DK    10:18PM    3:10.82 (idle3)
root     34080  0.0  0.0     0     0 ??  RK/4  10:18PM    2:06.89 (idle4)
root     65391  0.0  0.0     0     0 ??  RK/5  10:18PM    0:20.39 (idle5)
root      6812  0.0  0.0     0     0 ??  RK/6  10:18PM    0:36.33 (idle6)
root      6433  0.0  0.0     0     0 ??  RK/7  10:18PM    0:53.09 (idle7)
root     87244  0.0  0.0     0     0 ??  RK/0  10:18PM    3:07.58 (idle0)

There are still things I don't understand.  After a while the CPU
time for idle5, idle6, idle7 does not increase anymore.  I am doing
iperf3 performance tests on this machine.  My patch makes the results
more unsteady and throughput lower.  It seems that iperf3 processes
get scheduled on CPUs with less memory affinity.

bluhm

Index: kern/kern_sched.c
===================================================================
RCS file: /data/mirror/openbsd/cvs/src/sys/kern/kern_sched.c,v
retrieving revision 1.54
diff -u -p -r1.54 kern_sched.c
--- kern/kern_sched.c 17 Nov 2018 23:10:08 -0000 1.54
+++ kern/kern_sched.c 17 Dec 2018 19:35:33 -0000
@@ -145,6 +145,7 @@ sched_idle(void *v)
  */
  SCHED_LOCK(s);
  cpuset_add(&sched_idle_cpus, ci);
+ p->p_slptime = 127;
  p->p_stat = SSLEEP;
  p->p_cpu = ci;
  atomic_setbits_int(&p->p_flag, P_CPUPEG);