apmd: fix autoaction timeout

classic Classic list List threaded Threaded
9 messages Options
Reply | Threaded
Open this post in threaded view
|

apmd: fix autoaction timeout

Jeremie Courreges-Anglas-2

The diff below improves the way apmd -z/-Z may trigger.

I think the current behavior is bogus, incrementing and checking
apmtimeout like this doesn't make much sense.

Here's a proposal:
- on APM_POWER_CHANGE events, check the battery level and trigger
  autoaction if needed.  This should be enough to make autoaction just
  work with drivers like acpibat(4).
- on kevent timeout (10mn by default now, maybe too long), also check
  the battery level and suspend if needed.  This should be useful only
  if your battery driver doesn't send any APM_POWER_CHANGE event.

While here I also tweaked the warning.

Some more context:
- a subsequent diff would reorder the code to handle similarly the "!rv"
  and "ev->ident == ctl_fd" paths
- I think we want some throttling mechanism, like wait for 1mn after we
  resume before autosuspending again.  But I want to fix obvious
  problems first.
- if battery is still lower than your autoaction limit, your system
  might go back to sleep once if you attempt a resume after plugging
  your AC cable.  Another glitch, another mail to tech@.

Thoughts?  oks?


Index: apmd.c
===================================================================
RCS file: /cvs/src/usr.sbin/apmd/apmd.c,v
retrieving revision 1.91
diff -u -p -r1.91 apmd.c
--- apmd.c 2 Nov 2019 00:41:36 -0000 1.91
+++ apmd.c 25 Jan 2020 18:05:08 -0000
@@ -380,7 +380,6 @@ main(int argc, char *argv[])
  int noacsleep = 0;
  struct timespec ts = {TIMO, 0}, sts = {0, 0};
  struct apm_power_info pinfo;
- time_t apmtimeout = 0;
  const char *sockname = _PATH_APM_SOCKET;
  const char *errstr;
  int kq, nchanges;
@@ -502,13 +501,10 @@ main(int argc, char *argv[])
 
  sts = ts;
 
- apmtimeout += 1;
  if ((rv = kevent(kq, NULL, 0, ev, 1, &sts)) == -1)
  break;
 
- if (apmtimeout >= ts.tv_sec) {
- apmtimeout = 0;
-
+ if (!rv) {
  /* wakeup for timeout: take status */
  powerbak = power_status(ctl_fd, 0, &pinfo);
  if (powerstatus != powerbak) {
@@ -519,8 +515,8 @@ main(int argc, char *argv[])
  if (!powerstatus && autoaction &&
     autolimit > (int)pinfo.battery_life) {
  logmsg(LOG_NOTICE,
-    "estimated battery life %d%%, "
-    "autoaction limit set to %d%% .",
+    "estimated battery life %d%%"
+    " below configured limit %d%%",
     pinfo.battery_life,
     autolimit
  );
@@ -530,10 +526,8 @@ main(int argc, char *argv[])
  else
  hibernate(ctl_fd);
  }
- }
-
- if (!rv)
  continue;
+ }
 
  if (ev->ident == ctl_fd) {
  suspends = standbys = hibernates = resumes = 0;
@@ -575,6 +569,19 @@ main(int argc, char *argv[])
  if (powerstatus != powerbak) {
  powerstatus = powerbak;
  powerchange = 1;
+ }
+
+ if (!powerstatus && autoaction &&
+    autolimit > (int)pinfo.battery_life) {
+ logmsg(LOG_NOTICE,
+    "estimated battery life %d%%"
+    " below configured limit %d%%",
+    pinfo.battery_life, autolimit);
+
+ if (autoaction == AUTO_SUSPEND)
+ suspends++;
+ else
+ hibernates++;
  }
  break;
  default:


--
jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE

Reply | Threaded
Open this post in threaded view
|

Re: apmd: fix autoaction timeout

Ted Unangst-6
Jeremie Courreges-Anglas wrote:
>
> The diff below improves the way apmd -z/-Z may trigger.
>
> I think the current behavior is bogus, incrementing and checking
> apmtimeout like this doesn't make much sense.

this all seems reasonable to me.

> - I think we want some throttling mechanism, like wait for 1mn after we
>   resume before autosuspending again.  But I want to fix obvious
>   problems first.

would agree with that as well.

Reply | Threaded
Open this post in threaded view
|

Re: apmd: fix autoaction timeout

Jeremie Courreges-Anglas-2
In reply to this post by Jeremie Courreges-Anglas-2
On Sat, Jan 25 2020, Jeremie Courreges-Anglas <[hidden email]> wrote:

> The diff below improves the way apmd -z/-Z may trigger.
>
> I think the current behavior is bogus, incrementing and checking
> apmtimeout like this doesn't make much sense.
>
> Here's a proposal:
> - on APM_POWER_CHANGE events, check the battery level and trigger
>   autoaction if needed.  This should be enough to make autoaction just
>   work with drivers like acpibat(4).
> - on kevent timeout (10mn by default now, maybe too long), also check
>   the battery level and suspend if needed.  This should be useful only
>   if your battery driver doesn't send any APM_POWER_CHANGE event.
>
> While here I also tweaked the warning.

This has been committed, thanks Ted.

> Some more context:
> - a subsequent diff would reorder the code to handle similarly the "!rv"
>   and "ev->ident == ctl_fd" paths

Diff below.

> - I think we want some throttling mechanism, like wait for 1mn after we
>   resume before autosuspending again.  But I want to fix obvious
>   problems first.
[...]

Further unify the handling of kevent(2) timeouts and apm events:
fake an APM_POWER_CHANGE event on timeouts.

I think assert(3) is appropriate here but could be convinced otherwise.

ok?


Index: apmd.c
===================================================================
--- apmd.c.orig
+++ apmd.c
@@ -37,6 +37,7 @@
 #include <sys/event.h>
 #include <sys/time.h>
 #include <sys/sysctl.h>
+#include <assert.h>
 #include <stdarg.h>
 #include <stdio.h>
 #include <syslog.h>
@@ -497,7 +498,7 @@ main(int argc, char *argv[])
  error("kevent", NULL);
 
  for (;;) {
- int rv;
+ int rv, event, index;
 
  sts = ts;
 
@@ -528,6 +529,46 @@ main(int argc, char *argv[])
  continue;
  } else if (rv == 0) {
  /* wakeup for timeout: take status */
+ event = APM_POWER_CHANGE;
+ index = -1;
+ } else {
+ assert(rv == 1 && ev->ident == ctl_fd);
+ event = APM_EVENT_TYPE(ev->data);
+ index = APM_EVENT_INDEX(ev->data);
+ }
+
+ logmsg(LOG_DEBUG, "apmevent %04x index %d", event, index);
+
+ switch (event) {
+ case APM_SUSPEND_REQ:
+ case APM_USER_SUSPEND_REQ:
+ case APM_CRIT_SUSPEND_REQ:
+ case APM_BATTERY_LOW:
+ suspends++;
+ break;
+ case APM_USER_STANDBY_REQ:
+ case APM_STANDBY_REQ:
+ standbys++;
+ break;
+ case APM_USER_HIBERNATE_REQ:
+ hibernates++;
+ break;
+#if 0
+ case APM_CANCEL:
+ suspends = standbys = 0;
+ break;
+#endif
+ case APM_NORMAL_RESUME:
+ case APM_CRIT_RESUME:
+ case APM_SYS_STANDBY_RESUME:
+ powerbak = power_status(ctl_fd, 0, &pinfo);
+ if (powerstatus != powerbak) {
+ powerstatus = powerbak;
+ powerchange = 1;
+ }
+ resumes++;
+ break;
+ case APM_POWER_CHANGE:
  powerbak = power_status(ctl_fd, 0, &pinfo);
  if (powerstatus != powerbak) {
  powerstatus = powerbak;
@@ -548,63 +589,9 @@ main(int argc, char *argv[])
  else
  hibernates++;
  }
- } else if (ev->ident == ctl_fd) {
- logmsg(LOG_DEBUG, "apmevent %04x index %d",
-    (int)APM_EVENT_TYPE(ev->data),
-    (int)APM_EVENT_INDEX(ev->data));
-
- switch (APM_EVENT_TYPE(ev->data)) {
- case APM_SUSPEND_REQ:
- case APM_USER_SUSPEND_REQ:
- case APM_CRIT_SUSPEND_REQ:
- case APM_BATTERY_LOW:
- suspends++;
- break;
- case APM_USER_STANDBY_REQ:
- case APM_STANDBY_REQ:
- standbys++;
- break;
- case APM_USER_HIBERNATE_REQ:
- hibernates++;
- break;
-#if 0
- case APM_CANCEL:
- suspends = standbys = 0;
- break;
-#endif
- case APM_NORMAL_RESUME:
- case APM_CRIT_RESUME:
- case APM_SYS_STANDBY_RESUME:
- powerbak = power_status(ctl_fd, 0, &pinfo);
- if (powerstatus != powerbak) {
- powerstatus = powerbak;
- powerchange = 1;
- }
- resumes++;
- break;
- case APM_POWER_CHANGE:
- powerbak = power_status(ctl_fd, 0, &pinfo);
- if (powerstatus != powerbak) {
- powerstatus = powerbak;
- powerchange = 1;
- }
-
- if (!powerstatus && autoaction &&
-    autolimit > (int)pinfo.battery_life) {
- logmsg(LOG_NOTICE,
-    "estimated battery life %d%%"
-    " below configured limit %d%%",
-    pinfo.battery_life, autolimit);
-
- if (autoaction == AUTO_SUSPEND)
- suspends++;
- else
- hibernates++;
- }
- break;
- default:
- ;
- }
+ break;
+ default:
+ ;
  }
 
  if ((standbys || suspends) && noacsleep &&

--
jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE

Reply | Threaded
Open this post in threaded view
|

Re: apmd: fix autoaction timeout

Jeremie Courreges-Anglas-2
On Wed, Feb 12 2020, Jeremie Courreges-Anglas <[hidden email]> wrote:

> On Sat, Jan 25 2020, Jeremie Courreges-Anglas <[hidden email]> wrote:
>> The diff below improves the way apmd -z/-Z may trigger.
>>
>> I think the current behavior is bogus, incrementing and checking
>> apmtimeout like this doesn't make much sense.
>>
>> Here's a proposal:
>> - on APM_POWER_CHANGE events, check the battery level and trigger
>>   autoaction if needed.  This should be enough to make autoaction just
>>   work with drivers like acpibat(4).
>> - on kevent timeout (10mn by default now, maybe too long), also check
>>   the battery level and suspend if needed.  This should be useful only
>>   if your battery driver doesn't send any APM_POWER_CHANGE event.
>>
>> While here I also tweaked the warning.
>
> This has been committed, thanks Ted.
>
>> Some more context:
>> - a subsequent diff would reorder the code to handle similarly the "!rv"
>>   and "ev->ident == ctl_fd" paths
>
> Diff below.
>
>> - I think we want some throttling mechanism, like wait for 1mn after we
>>   resume before autosuspending again.  But I want to fix obvious
>>   problems first.

On top of the previous diff, here's a diff to block autoaction for 60
seconds after:
- autoaction triggers; this prevents apmd from sending multiple suspend
requests before the system goes to sleep
- a resume happens; this gives you 60 seconds to fetch and plug your AC
cable if you notice you're low on power

A side effect is that apmd now ignores stale acpiac(4) data at resume
time, so it apmd doesn't suspend the system again when you resume with
a low battery and AC plugged.

cc'ing Scott since he has a thing for everything time-related. :)

ok?



Index: apmd.c
===================================================================
--- apmd.c.orig
+++ apmd.c
@@ -380,6 +380,7 @@ main(int argc, char *argv[])
  int powerstatus = 0, powerbak = 0, powerchange = 0;
  int noacsleep = 0;
  struct timespec ts = {TIMO, 0}, sts = {0, 0};
+ struct timespec autoaction_timestamp = { 0, 0 };
  struct apm_power_info pinfo;
  const char *sockname = _PATH_APM_SOCKET;
  const char *errstr;
@@ -566,6 +567,7 @@ main(int argc, char *argv[])
  powerstatus = powerbak;
  powerchange = 1;
  }
+ clock_gettime(CLOCK_MONOTONIC, &autoaction_timestamp);
  resumes++;
  break;
  case APM_POWER_CHANGE:
@@ -577,6 +579,8 @@ main(int argc, char *argv[])
 
  if (!powerstatus && autoaction &&
     autolimit > (int)pinfo.battery_life) {
+ struct timespec delay, now;
+
  logmsg(LOG_NOTICE,
     "estimated battery life %d%%"
     " below configured limit %d%%",
@@ -584,10 +588,17 @@ main(int argc, char *argv[])
     autolimit
  );
 
- if (autoaction == AUTO_SUSPEND)
- suspends++;
- else
- hibernates++;
+ delay = autoaction_timestamp;
+ delay.tv_sec += 60;
+ clock_gettime(CLOCK_MONOTONIC, &now);
+ if (timespeccmp(&now, &delay, >)) {
+ if (autoaction == AUTO_SUSPEND)
+ suspends++;
+ else
+ hibernates++;
+ clock_gettime(CLOCK_MONOTONIC,
+    &autoaction_timestamp);
+ }
  }
  break;
  default:

--
jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE

Reply | Threaded
Open this post in threaded view
|

Re: apmd: fix autoaction timeout

Scott Cheloha
On Wed, Feb 12, 2020 at 01:35:22PM +0100, Jeremie Courreges-Anglas wrote:

> On Wed, Feb 12 2020, Jeremie Courreges-Anglas <[hidden email]> wrote:
> > On Sat, Jan 25 2020, Jeremie Courreges-Anglas <[hidden email]> wrote:
> >> The diff below improves the way apmd -z/-Z may trigger.
> >>
> >> I think the current behavior is bogus, incrementing and checking
> >> apmtimeout like this doesn't make much sense.
> >>
> >> Here's a proposal:
> >> - on APM_POWER_CHANGE events, check the battery level and trigger
> >>   autoaction if needed.  This should be enough to make autoaction just
> >>   work with drivers like acpibat(4).
> >> - on kevent timeout (10mn by default now, maybe too long), also check
> >>   the battery level and suspend if needed.  This should be useful only
> >>   if your battery driver doesn't send any APM_POWER_CHANGE event.
> >>
> >> While here I also tweaked the warning.
> >
> > This has been committed, thanks Ted.
> >
> >> Some more context:
> >> - a subsequent diff would reorder the code to handle similarly the "!rv"
> >>   and "ev->ident == ctl_fd" paths
> >
> > Diff below.
> >
> >> - I think we want some throttling mechanism, like wait for 1mn after we
> >>   resume before autosuspending again.  But I want to fix obvious
> >>   problems first.
>
> On top of the previous diff, here's a diff to block autoaction for 60
> seconds after:
> - autoaction triggers; this prevents apmd from sending multiple suspend
> requests before the system goes to sleep
> - a resume happens; this gives you 60 seconds to fetch and plug your AC
> cable if you notice you're low on power
>
> A side effect is that apmd now ignores stale acpiac(4) data at resume
> time, so it apmd doesn't suspend the system again when you resume with
> a low battery and AC plugged.
>
> cc'ing Scott since he has a thing for everything time-related. :)

For the first case, is there any way you can detect that a suspend is
in-progress but not done yet?

I think that'd be cleaner (in some ways) than an autoaction cooldown
timer.

Whenever I want to add an arbitrary delay that isn't per se required
by an interface I wonder whether I'm working around a deficiency in
the state machine instead of addressing the root cause.

Sometimes it can't be helped, but I have to ask.

For the second case, I thought the design of autoaction was to (a)
note that the battery was below a particular threshold and (b) take
action to avert data loss.  If you resume from suspend with battery
below the threshold and no AC I think you would *want* autoaction to
trigger.  Like, it sounds like the state machine is working as
designed.

If the machine is immediately suspending after resume shouldn't you
just plug it in before reattempting resume?  Isn't that better than
having the battery die on you?

Reply | Threaded
Open this post in threaded view
|

Re: apmd: fix autoaction timeout

Jeremie Courreges-Anglas-5
On Wed, Feb 12 2020, Scott Cheloha <[hidden email]> wrote:

> On Wed, Feb 12, 2020 at 01:35:22PM +0100, Jeremie Courreges-Anglas wrote:
>> On Wed, Feb 12 2020, Jeremie Courreges-Anglas <[hidden email]> wrote:
>> > On Sat, Jan 25 2020, Jeremie Courreges-Anglas <[hidden email]> wrote:
>> >> The diff below improves the way apmd -z/-Z may trigger.
>> >>
>> >> I think the current behavior is bogus, incrementing and checking
>> >> apmtimeout like this doesn't make much sense.
>> >>
>> >> Here's a proposal:
>> >> - on APM_POWER_CHANGE events, check the battery level and trigger
>> >>   autoaction if needed.  This should be enough to make autoaction just
>> >>   work with drivers like acpibat(4).
>> >> - on kevent timeout (10mn by default now, maybe too long), also check
>> >>   the battery level and suspend if needed.  This should be useful only
>> >>   if your battery driver doesn't send any APM_POWER_CHANGE event.
>> >>
>> >> While here I also tweaked the warning.
>> >
>> > This has been committed, thanks Ted.
>> >
>> >> Some more context:
>> >> - a subsequent diff would reorder the code to handle similarly the "!rv"
>> >>   and "ev->ident == ctl_fd" paths
>> >
>> > Diff below.
>> >
>> >> - I think we want some throttling mechanism, like wait for 1mn after we
>> >>   resume before autosuspending again.  But I want to fix obvious
>> >>   problems first.
>>
>> On top of the previous diff, here's a diff to block autoaction for 60
>> seconds after:
>> - autoaction triggers; this prevents apmd from sending multiple suspend
>> requests before the system goes to sleep
>> - a resume happens; this gives you 60 seconds to fetch and plug your AC
>> cable if you notice you're low on power
>>
>> A side effect is that apmd now ignores stale acpiac(4) data at resume
>> time, so it apmd doesn't suspend the system again when you resume with
>> a low battery and AC plugged.
>>
>> cc'ing Scott since he has a thing for everything time-related. :)
>
> For the first case, is there any way you can detect that a suspend is
> in-progress but not done yet?

Well, apmd could record that it asked the kernel for a suspend/hibernate
and skip autoaction as long as it doesn't get a resume event.

> I think that'd be cleaner (in some ways) than an autoaction cooldown
> timer.
>
> Whenever I want to add an arbitrary delay that isn't per se required
> by an interface I wonder whether I'm working around a deficiency in
> the state machine instead of addressing the root cause.
>
> Sometimes it can't be helped, but I have to ask.

Initially I only cared about the second case, and then noticed that
APM_POWER_CHANGE events can happen at any time.  Reusing the 60 seconds
timer looked appealing (cheap) but please see the updated diff below.

> For the second case, I thought the design of autoaction was to (a)
> note that the battery was below a particular threshold and (b) take
> action to avert data loss.  If you resume from suspend with battery
> below the threshold and no AC I think you would *want* autoaction to
> trigger.  Like, it sounds like the state machine is working as
> designed.
>
> If the machine is immediately suspending after resume shouldn't you
> just plug it in before reattempting resume?  Isn't that better than
> having the battery die on you?

We can't know when the battery will fail to feed the system.  I suspect
that the resume sequence itself may drain more power than 60 seconds
spent idling (wild guess, no power meter at hand).  So I see no
convincing reason to prevent any use of the system.

Regarding the user experience, I think the user should be put into
control.  60 seconds is enough to plug the power cable or take a quick
look at a document, or even kill apmd if the laptop is really needed
like, *right now*.  I know I've been in this kind of situation several
times.

So here's an updated diff that:
- disables autoaction for 60 seconds after resume.  This is still done
  in a naive way, autoaction won't kick in exactly after 60 seconds
  after resume.  Good enough for now, I think.
- prevents autoaction to kick in several times before suspend/hibernate
- improves naming (suggestions welcome)
- logs why autoaction doesn't kick in
- documents the 60 seconds grace period in the manpage

This still works around the issue with stale acpiac(4) data at resume
time.

Thanks for your input so far, I hope I have addressed your concerns.

Comments / oks?


Index: apmd.c
===================================================================
--- apmd.c.orig
+++ apmd.c
@@ -368,18 +368,20 @@ resumed(int ctl_fd)
 }
 
 #define TIMO (10*60) /* 10 minutes */
+#define AUTOACTION_GRACE_PERIOD (60) /* 1mn after resume */
 
 int
 main(int argc, char *argv[])
 {
  const char *fname = _PATH_APM_CTLDEV;
  int ctl_fd, sock_fd, ch, suspends, standbys, hibernates, resumes;
- int autoaction = 0;
+ int autoaction = 0, autoaction_inflight = 0;
  int autolimit = 0;
  int statonly = 0;
  int powerstatus = 0, powerbak = 0, powerchange = 0;
  int noacsleep = 0;
  struct timespec ts = {TIMO, 0}, sts = {0, 0};
+ struct timespec last_resume = { 0, 0 };
  struct apm_power_info pinfo;
  const char *sockname = _PATH_APM_SOCKET;
  const char *errstr;
@@ -566,6 +568,8 @@ main(int argc, char *argv[])
  powerstatus = powerbak;
  powerchange = 1;
  }
+ clock_gettime(CLOCK_MONOTONIC, &last_resume);
+ autoaction_inflight = 0;
  resumes++;
  break;
  case APM_POWER_CHANGE:
@@ -577,17 +581,30 @@ main(int argc, char *argv[])
 
  if (!powerstatus && autoaction &&
     autolimit > (int)pinfo.battery_life) {
+ struct timespec graceperiod, now;
+
+ graceperiod = last_resume;
+ graceperiod.tv_sec += AUTOACTION_GRACE_PERIOD;
+ clock_gettime(CLOCK_MONOTONIC, &now);
+
  logmsg(LOG_NOTICE,
     "estimated battery life %d%%"
-    " below configured limit %d%%",
-    pinfo.battery_life,
-    autolimit
+    " below configured limit %d%%%s%s",
+    pinfo.battery_life, autolimit,
+    !autoaction_inflight ? "" : ", in flight",
+    timespeccmp(&now, &graceperiod, >) ?
+        "" : ", grace period"
  );
 
- if (autoaction == AUTO_SUSPEND)
- suspends++;
- else
- hibernates++;
+ if (!autoaction_inflight &&
+    timespeccmp(&now, &graceperiod, >)) {
+ if (autoaction == AUTO_SUSPEND)
+ suspends++;
+ else
+ hibernates++;
+ /* Block autoaction until next resume */
+ autoaction_inflight = 1;
+ }
  }
  break;
  default:
Index: apmd.8
===================================================================
--- apmd.8.orig
+++ apmd.8
@@ -128,6 +128,8 @@ If both
 and
 .Fl z
 are specified, the last one will supersede the other.
+After a resume, AC state and estimated battery life are ignored for 60
+seconds.
 .El
 .Pp
 When a client requests a suspend or stand-by state,

--
jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE

Reply | Threaded
Open this post in threaded view
|

Re: apmd: fix autoaction timeout

Jeremie Courreges-Anglas-5
On Thu, Feb 13 2020, Jeremie Courreges-Anglas <[hidden email]> wrote:

[...]

> - documents the 60 seconds grace period in the manpage

That part was not accurate.  Updated wording:

  "After a resume, the effect of those options is inhibited for 60 seconds."


Index: apmd.c
===================================================================
--- apmd.c.orig
+++ apmd.c
@@ -368,18 +368,20 @@ resumed(int ctl_fd)
 }
 
 #define TIMO (10*60) /* 10 minutes */
+#define AUTOACTION_GRACE_PERIOD (60) /* 1mn after resume */
 
 int
 main(int argc, char *argv[])
 {
  const char *fname = _PATH_APM_CTLDEV;
  int ctl_fd, sock_fd, ch, suspends, standbys, hibernates, resumes;
- int autoaction = 0;
+ int autoaction = 0, autoaction_inflight = 0;
  int autolimit = 0;
  int statonly = 0;
  int powerstatus = 0, powerbak = 0, powerchange = 0;
  int noacsleep = 0;
  struct timespec ts = {TIMO, 0}, sts = {0, 0};
+ struct timespec last_resume = { 0, 0 };
  struct apm_power_info pinfo;
  const char *sockname = _PATH_APM_SOCKET;
  const char *errstr;
@@ -566,6 +568,8 @@ main(int argc, char *argv[])
  powerstatus = powerbak;
  powerchange = 1;
  }
+ clock_gettime(CLOCK_MONOTONIC, &last_resume);
+ autoaction_inflight = 0;
  resumes++;
  break;
  case APM_POWER_CHANGE:
@@ -577,17 +581,30 @@ main(int argc, char *argv[])
 
  if (!powerstatus && autoaction &&
     autolimit > (int)pinfo.battery_life) {
+ struct timespec graceperiod, now;
+
+ graceperiod = last_resume;
+ graceperiod.tv_sec += AUTOACTION_GRACE_PERIOD;
+ clock_gettime(CLOCK_MONOTONIC, &now);
+
  logmsg(LOG_NOTICE,
     "estimated battery life %d%%"
-    " below configured limit %d%%",
-    pinfo.battery_life,
-    autolimit
+    " below configured limit %d%%%s%s",
+    pinfo.battery_life, autolimit,
+    !autoaction_inflight ? "" : ", in flight",
+    timespeccmp(&now, &graceperiod, >) ?
+        "" : ", grace period"
  );
 
- if (autoaction == AUTO_SUSPEND)
- suspends++;
- else
- hibernates++;
+ if (!autoaction_inflight &&
+    timespeccmp(&now, &graceperiod, >)) {
+ if (autoaction == AUTO_SUSPEND)
+ suspends++;
+ else
+ hibernates++;
+ /* Block autoaction until next resume */
+ autoaction_inflight = 1;
+ }
  }
  break;
  default:
Index: apmd.8
===================================================================
--- apmd.8.orig
+++ apmd.8
@@ -128,6 +128,7 @@ If both
 and
 .Fl z
 are specified, the last one will supersede the other.
+After a resume, the effect of those options is inhibited for 60 seconds.
 .El
 .Pp
 When a client requests a suspend or stand-by state,


--
jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE

Reply | Threaded
Open this post in threaded view
|

Re: apmd: fix autoaction timeout

Scott Cheloha
In reply to this post by Jeremie Courreges-Anglas-5
On Thu, Feb 13, 2020 at 02:08:32PM +0100, Jeremie Courreges-Anglas wrote:

> On Wed, Feb 12 2020, Scott Cheloha <[hidden email]> wrote:
> > On Wed, Feb 12, 2020 at 01:35:22PM +0100, Jeremie Courreges-Anglas wrote:
> >> On Wed, Feb 12 2020, Jeremie Courreges-Anglas <[hidden email]> wrote:
> >> > On Sat, Jan 25 2020, Jeremie Courreges-Anglas <[hidden email]> wrote:
> >> >> The diff below improves the way apmd -z/-Z may trigger.
> >> >>
> >> >> I think the current behavior is bogus, incrementing and checking
> >> >> apmtimeout like this doesn't make much sense.
> >> >>
> >> >> Here's a proposal:
> >> >> - on APM_POWER_CHANGE events, check the battery level and trigger
> >> >>   autoaction if needed.  This should be enough to make autoaction just
> >> >>   work with drivers like acpibat(4).
> >> >> - on kevent timeout (10mn by default now, maybe too long), also check
> >> >>   the battery level and suspend if needed.  This should be useful only
> >> >>   if your battery driver doesn't send any APM_POWER_CHANGE event.
> >> >>
> >> >> While here I also tweaked the warning.
> >> >
> >> > This has been committed, thanks Ted.
> >> >
> >> >> Some more context:
> >> >> - a subsequent diff would reorder the code to handle similarly the "!rv"
> >> >>   and "ev->ident == ctl_fd" paths
> >> >
> >> > Diff below.
> >> >
> >> >> - I think we want some throttling mechanism, like wait for 1mn after we
> >> >>   resume before autosuspending again.  But I want to fix obvious
> >> >>   problems first.
> >>
> >> On top of the previous diff, here's a diff to block autoaction for 60
> >> seconds after:
> >> - autoaction triggers; this prevents apmd from sending multiple suspend
> >> requests before the system goes to sleep
> >> - a resume happens; this gives you 60 seconds to fetch and plug your AC
> >> cable if you notice you're low on power
> >>
> >> A side effect is that apmd now ignores stale acpiac(4) data at resume
> >> time, so it apmd doesn't suspend the system again when you resume with
> >> a low battery and AC plugged.
> >>
> >> cc'ing Scott since he has a thing for everything time-related. :)
> >
> > For the first case, is there any way you can detect that a suspend is
> > in-progress but not done yet?
>
> Well, apmd could record that it asked the kernel for a suspend/hibernate
> and skip autoaction as long as it doesn't get a resume event.

Hmmm.  So what happens if the suspend/hibernate fails?  Could apmd(8)
get stuck waiting for a resume that will never happen?

> > I think that'd be cleaner (in some ways) than an autoaction cooldown
> > timer.
> >
> > Whenever I want to add an arbitrary delay that isn't per se required
> > by an interface I wonder whether I'm working around a deficiency in
> > the state machine instead of addressing the root cause.
> >
> > Sometimes it can't be helped, but I have to ask.
>
> Initially I only cared about the second case, and then noticed that
> APM_POWER_CHANGE events can happen at any time.  Reusing the 60 seconds
> timer looked appealing (cheap) but please see the updated diff below.
>
> > For the second case, I thought the design of autoaction was to (a)
> > note that the battery was below a particular threshold and (b) take
> > action to avert data loss.  If you resume from suspend with battery
> > below the threshold and no AC I think you would *want* autoaction to
> > trigger.  Like, it sounds like the state machine is working as
> > designed.
> >
> > If the machine is immediately suspending after resume shouldn't you
> > just plug it in before reattempting resume?  Isn't that better than
> > having the battery die on you?
>
> We can't know when the battery will fail to feed the system.  I suspect
> that the resume sequence itself may drain more power than 60 seconds
> spent idling (wild guess, no power meter at hand).  So I see no
> convincing reason to prevent any use of the system.
>
> Regarding the user experience, I think the user should be put into
> control.  60 seconds is enough to plug the power cable or take a quick
> look at a document, or even kill apmd if the laptop is really needed
> like, *right now*.  I know I've been in this kind of situation several
> times.

Fair enough.

> So here's an updated diff that:
> - disables autoaction for 60 seconds after resume.  This is still done
>   in a naive way, autoaction won't kick in exactly after 60 seconds
>   after resume.  Good enough for now, I think.

I think this part is fine.

You could use an alarm(3) to pull you out of kevent(2) prematurely.
That could be done in a separate change.

> - prevents autoaction to kick in several times before suspend/hibernate
> - improves naming (suggestions welcome)
> - logs why autoaction doesn't kick in
> - documents the 60 seconds grace period in the manpage
>
> This still works around the issue with stale acpiac(4) data at resume
> time.
>
> Thanks for your input so far, I hope I have addressed your concerns.
>
> Comments / oks?

Just the question above.

> Index: apmd.c
> ===================================================================
> --- apmd.c.orig
> +++ apmd.c
> @@ -368,18 +368,20 @@ resumed(int ctl_fd)
>  }
>  
>  #define TIMO (10*60) /* 10 minutes */
> +#define AUTOACTION_GRACE_PERIOD (60) /* 1mn after resume */
>  
>  int
>  main(int argc, char *argv[])
>  {
>   const char *fname = _PATH_APM_CTLDEV;
>   int ctl_fd, sock_fd, ch, suspends, standbys, hibernates, resumes;
> - int autoaction = 0;
> + int autoaction = 0, autoaction_inflight = 0;
>   int autolimit = 0;
>   int statonly = 0;
>   int powerstatus = 0, powerbak = 0, powerchange = 0;
>   int noacsleep = 0;
>   struct timespec ts = {TIMO, 0}, sts = {0, 0};
> + struct timespec last_resume = { 0, 0 };
>   struct apm_power_info pinfo;
>   const char *sockname = _PATH_APM_SOCKET;
>   const char *errstr;
> @@ -566,6 +568,8 @@ main(int argc, char *argv[])
>   powerstatus = powerbak;
>   powerchange = 1;
>   }
> + clock_gettime(CLOCK_MONOTONIC, &last_resume);
> + autoaction_inflight = 0;
>   resumes++;
>   break;
>   case APM_POWER_CHANGE:
> @@ -577,17 +581,30 @@ main(int argc, char *argv[])
>  
>   if (!powerstatus && autoaction &&
>      autolimit > (int)pinfo.battery_life) {
> + struct timespec graceperiod, now;
> +
> + graceperiod = last_resume;
> + graceperiod.tv_sec += AUTOACTION_GRACE_PERIOD;
> + clock_gettime(CLOCK_MONOTONIC, &now);
> +
>   logmsg(LOG_NOTICE,
>      "estimated battery life %d%%"
> -    " below configured limit %d%%",
> -    pinfo.battery_life,
> -    autolimit
> +    " below configured limit %d%%%s%s",
> +    pinfo.battery_life, autolimit,
> +    !autoaction_inflight ? "" : ", in flight",
> +    timespeccmp(&now, &graceperiod, >) ?
> +        "" : ", grace period"
>   );
>  
> - if (autoaction == AUTO_SUSPEND)
> - suspends++;
> - else
> - hibernates++;
> + if (!autoaction_inflight &&
> +    timespeccmp(&now, &graceperiod, >)) {
> + if (autoaction == AUTO_SUSPEND)
> + suspends++;
> + else
> + hibernates++;
> + /* Block autoaction until next resume */
> + autoaction_inflight = 1;
> + }
>   }
>   break;
>   default:
> Index: apmd.8
> ===================================================================
> --- apmd.8.orig
> +++ apmd.8
> @@ -128,6 +128,8 @@ If both
>  and
>  .Fl z
>  are specified, the last one will supersede the other.
> +After a resume, AC state and estimated battery life are ignored for 60
> +seconds.
>  .El
>  .Pp
>  When a client requests a suspend or stand-by state,
>
> --
> jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE

Reply | Threaded
Open this post in threaded view
|

Re: apmd: fix autoaction timeout

Jeremie Courreges-Anglas-2
On Fri, Feb 14 2020, Scott Cheloha <[hidden email]> wrote:

> On Thu, Feb 13, 2020 at 02:08:32PM +0100, Jeremie Courreges-Anglas wrote:
>> On Wed, Feb 12 2020, Scott Cheloha <[hidden email]> wrote:
>> > On Wed, Feb 12, 2020 at 01:35:22PM +0100, Jeremie Courreges-Anglas wrote:
>> >> On Wed, Feb 12 2020, Jeremie Courreges-Anglas <[hidden email]> wrote:
>> >> > On Sat, Jan 25 2020, Jeremie Courreges-Anglas <[hidden email]> wrote:
>> >> >> The diff below improves the way apmd -z/-Z may trigger.
>> >> >>
>> >> >> I think the current behavior is bogus, incrementing and checking
>> >> >> apmtimeout like this doesn't make much sense.
>> >> >>
>> >> >> Here's a proposal:
>> >> >> - on APM_POWER_CHANGE events, check the battery level and trigger
>> >> >>   autoaction if needed.  This should be enough to make autoaction just
>> >> >>   work with drivers like acpibat(4).
>> >> >> - on kevent timeout (10mn by default now, maybe too long), also check
>> >> >>   the battery level and suspend if needed.  This should be useful only
>> >> >>   if your battery driver doesn't send any APM_POWER_CHANGE event.
>> >> >>
>> >> >> While here I also tweaked the warning.
>> >> >
>> >> > This has been committed, thanks Ted.
>> >> >
>> >> >> Some more context:
>> >> >> - a subsequent diff would reorder the code to handle similarly the "!rv"
>> >> >>   and "ev->ident == ctl_fd" paths
>> >> >
>> >> > Diff below.
>> >> >
>> >> >> - I think we want some throttling mechanism, like wait for 1mn after we
>> >> >>   resume before autosuspending again.  But I want to fix obvious
>> >> >>   problems first.
>> >>
>> >> On top of the previous diff, here's a diff to block autoaction for 60
>> >> seconds after:
>> >> - autoaction triggers; this prevents apmd from sending multiple suspend
>> >> requests before the system goes to sleep
>> >> - a resume happens; this gives you 60 seconds to fetch and plug your AC
>> >> cable if you notice you're low on power
>> >>
>> >> A side effect is that apmd now ignores stale acpiac(4) data at resume
>> >> time, so it apmd doesn't suspend the system again when you resume with
>> >> a low battery and AC plugged.
>> >>
>> >> cc'ing Scott since he has a thing for everything time-related. :)
>> >
>> > For the first case, is there any way you can detect that a suspend is
>> > in-progress but not done yet?
>>
>> Well, apmd could record that it asked the kernel for a suspend/hibernate
>> and skip autoaction as long as it doesn't get a resume event.
>
> Hmmm.  So what happens if the suspend/hibernate fails?  Could apmd(8)
> get stuck waiting for a resume that will never happen?

Well, if suspend fails maybe there's no point in having apmd retry
a suspend. Also what would get stuck is only the autoaction behavior,
the rest of apmd would keep on working as usual.

acpi_sleep_state seems to properly send a RESUME event even if
suspend/hibernate fails, except in one error case.

But depending on a resume event is not portable, the APM code in i386
and loongson doesn't notify userland about resumes. Something that ought
to be fixed.

--
jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE