amd64 UEFI machine fails to load kernel; displays "ExitBootServices" and reboots

classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

amd64 UEFI machine fails to load kernel; displays "ExitBootServices" and reboots

Ryan-2
I am very interested in trying out OpenBSD on my machine as a desktop
system.

I flashed install60.fs to a pendrive, and attempted to boot my amd64 UEFI
system with it.

It displays the kernel entry point information, followed by the message
"ExitBootServices" and then instantly reboots the machine.


Here is the code where the problem happens:

http://cvsweb.openbsd.org/cgi-bin/cvsweb/src/sys/arch/amd64/stand/efiboot/efiboot.c?rev=1.14&content-type=text/x-cvsweb-markup

void
efi_cleanup(void)
{
        EFI_STATUS status;

        efi_memprobe_internal(); /* sync the current map */
        status = EFI_CALL(BS->ExitBootServices, IH, mmap_key);
        if (status != EFI_SUCCESS)
                panic("ExitBootServices");
}

I spliced in some debugging code and found that the call to
ExitBootServices is returning a status code of 2 (EFI_INVALID_PARAMETER),
which causes the panic and reboot.

Here's a description of this problem in the UEFI Specification [1] (page
221):

"An EFI OS loader must ensure that it has the system's current memory map
at the time it calls ExitBootServices(). This is done by passing in the
current memory map's MapKey value as returned by
EFI_BOOT_SERVICES.GetMemoryMap(). Care must be taken to ensure that the
memory map does not change between these two calls. It is suggested that
GetMemoryMap() be called immediately before calling ExitBootServices(). If
MapKey value is incorrect, ExitBootServices() returns EFI_INVALID_PARAMETER
and GetMemoryMap() with ExitBootServices() must be called again."

I spliced in some more debugging code and found that the MapKey value
(mmap_key) does indeed change during the call to ExitBootServices.

That isolates the problem: ExitBootServices is changing the memory map. The
simple solution seems to be to just retry calling GetMemoryMap to get the
updated MapKey, and then call ExitBootServices a second time. However, I
did some more research before attempting this solution on my machine.


I found this quote from a Linux mailing list archive where they were
discussing the same issue [2]:

"ExitBootServices() can legitimately fail if any of the event handlers
that are signaled by its invocation change the memory map. In that case,
we need to get the memory map and exit boot services again. Only retry
once so that we don't get stuck if ExitBootServices() is returning a
genuine error value."

This tells me that it's normal behaviour for ExitBootServices to fail in
such a way, so I am more confident that this is a legitimate bug and the
code needs to be prepared to call ExitBootServices a second time in case of
failure.


I also found some more directly relevant quotes where the same problem was
being discussed in relation to FreeBSD [3]:

"[the solution] Is not quite that easy -- the memory map itself is passed
as module metadata to the kernel (bi_load_efi_data), so that needs to be
redone as well."

I did some more investigating and found that this appears to be the case in
OpenBSD too:

http://cvsweb.openbsd.org/cgi-bin/cvsweb/src/sys/arch/amd64/stand/libsa/exec_i386.c?rev=1.18&content-type=text/x-cvsweb-markup

        /* Pass memory map to the kernel */
        mem_pass();


and shortly afterwards:

        /*
         * Move the loaded kernel image to the usual place after calling
         * ExitBootServices().
         */
        delta = DEFAULT_KERNEL_ADDRESS - efi_loadaddr;
        efi_cleanup();


It appears that the memory map is passed to the kernel before calling
ExitBootServices (which changes the memory map again, leaving the kernel
with an obsolete copy).


I am very hesitant to attempt any self-made solutions to this problem after
reading this quote, again from [3]:

"... if the memory map has actually changed the kernel will now possibly
corrupt firmware data (or the firmware will corrupt OS/user data)."

It seems that a proper and safe solution may require rearranging a
significant chunk of code to ensure that ExitBootServices is successfully
called prior to passing the memory map to the kernel.

At least, that's my take on it. I am not a strong coder and nowhere near
experienced enough to attempt any such fixes myself, so I thought it would
be best to send this in as a bug report.


[1] (www.uefi.org/sites/default/files/resources/UEFI%202_5.pdf)
[2] (http://www.gossamer-threads.com/lists/linux/kernel/1727968)
[3] (https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=202455)
Reply | Threaded
Open this post in threaded view
|

Re: amd64 UEFI machine fails to load kernel; displays "ExitBootServices" and reboots

Ryan-2
I've successfully booted the machine with the following changes:


--- sys/arch/amd64/stand/efiboot/efiboot.c    2016-06-10 12:36:06.000000000
-0600
+++ sys/arch/amd64/stand/efiboot/efiboot.c    2016-12-28 15:57:36.728002168
-0600
@@ -137,12 +137,16 @@ efi_main(EFI_HANDLE image, EFI_SYSTEM_TA
 void
 efi_cleanup(void)
 {
+      int         retry;
     EFI_STATUS     status;

-    efi_memprobe_internal();    /* sync the current map */
-    status = EFI_CALL(BS->ExitBootServices, IH, mmap_key);
-    if (status != EFI_SUCCESS)
-        panic("ExitBootServices");
+    /* retry once in case of failure */
+    for (retry = 1; retry == 0; retry--) {
+        efi_memprobe_internal();    /* sync the current map */
+        status = EFI_CALL(BS->ExitBootServices, IH, mmap_key);
+        if (status != EFI_SUCCESS && retry == 0)
+            panic("ExitBootServices failed (%d)\n", status);
+    }
 }

 /***********************************************************************


--- sys/arch/amd64/stand/libsa/exec_i386.c    2015-11-26 04:52:40.000000000
-0600
+++ sys/arch/amd64/stand/libsa/exec_i386.c    2016-12-28 15:30:42.201644481
-0600
@@ -116,6 +116,27 @@ run_loadfile(u_long *marks, int howto)
     sr_clear_keys();
 #endif

+    entry = marks[MARK_ENTRY] & 0x0fffffff;
+
+#ifdef EFIBOOT
+    /* Move the loaded kernel image to the usual place */
+    delta = DEFAULT_KERNEL_ADDRESS - efi_loadaddr;
+    memcpy((void *)marks[MARK_START] + delta, (void *)marks[MARK_START],
+        marks[MARK_END] - marks[MARK_START]);
+    for (i = 0; i < MARK_MAX; i++)
+        marks[i] += delta;
+    entry += delta;
+#endif
+
+    printf("entry point at 0x%lx [%x, %x, %x, %x]\n", entry,
+        ((int *)entry)[0], ((int *)entry)[1],
+        ((int *)entry)[2], ((int *)entry)[3]);
+
+#ifdef EFIBOOT
+    /* Sync the memory map and call ExitBootServices() */
+    efi_cleanup();
+#endif
+
     /* Pass memory map to the kernel */
     mem_pass();

@@ -129,34 +150,12 @@ run_loadfile(u_long *marks, int howto)
     makebootargs(av, &ac);
 #endif

-    entry = marks[MARK_ENTRY] & 0x0fffffff;
-
-    printf("entry point at 0x%lx [%x, %x, %x, %x]\n", entry,
-        ((int *)entry)[0], ((int *)entry)[1],
-        ((int *)entry)[2], ((int *)entry)[3]);
-#ifndef EFIBOOT
-    /* stack and the gung is ok at this point, so, no need for asm setup */
-    (*(startfuncp)entry)(howto, bootdev, BOOTARG_APIVER, marks[MARK_END],
-        extmem, cnvmem, ac, (int)av);
-#else
-    /*
-     * Move the loaded kernel image to the usual place after calling
-     * ExitBootServices().
-     */
-    delta = DEFAULT_KERNEL_ADDRESS - efi_loadaddr;
-    efi_cleanup();
-    memcpy((void *)marks[MARK_START] + delta, (void *)marks[MARK_START],
-        marks[MARK_END] - marks[MARK_START]);
-    for (i = 0; i < MARK_MAX; i++)
-        marks[i] += delta;
-    entry += delta;
-#ifdef __amd64__
+#if defined(EFIBOOT) && defined(__amd64__)
     (*run_i386)((u_long)run_i386, entry, howto, bootdev, BOOTARG_APIVER,
         marks[MARK_END], extmem, cnvmem, ac, (intptr_t)av);
 #else
     (*(startfuncp)entry)(howto, bootdev, BOOTARG_APIVER, marks[MARK_END],
         extmem, cnvmem, ac, (int)av);
 #endif
-#endif
     /* not reached */
 }

I diffed my changes against the 6.0 version of the files, but I tested
applying them against both 6.0 and the most current versions and there were
no conflicts either way.

Oh before I forget, while I was reading through the code I noticed what
appears to be a typo in sys/arch/amd64/stand/efiboot/eficall.h:

#define EFI_CALL(...) \
    _efi_call_fn(__VA_ARGS__, _call_9, _call_8, _call_7, _call_6, _call_5, \
            _call_4, _call_3, _call_2, _call_1, _call_1)(__VA_ARGS__)
#endif

It looks like the second instance of _call_1 should be _call_0 (which is
defined earlier), unless I am totally misreading this.
Reply | Threaded
Open this post in threaded view
|

Re: amd64 UEFI machine fails to load kernel; displays "ExitBootServices" and reboots

YASUOKA Masahiko-3
Hi,

On Thu, 29 Dec 2016 14:53:20 -0600
Ryan <[hidden email]> wrote:

> I've successfully booted the machine with the following changes:
>
>
> --- sys/arch/amd64/stand/efiboot/efiboot.c    2016-06-10 12:36:06.000000000
> -0600
> +++ sys/arch/amd64/stand/efiboot/efiboot.c    2016-12-28 15:57:36.728002168
> -0600
> @@ -137,12 +137,16 @@ efi_main(EFI_HANDLE image, EFI_SYSTEM_TA
>  void
>  efi_cleanup(void)
>  {
> +      int         retry;
>      EFI_STATUS     status;
>
> -    efi_memprobe_internal();    /* sync the current map */
> -    status = EFI_CALL(BS->ExitBootServices, IH, mmap_key);
> -    if (status != EFI_SUCCESS)
> -        panic("ExitBootServices");
> +    /* retry once in case of failure */
> +    for (retry = 1; retry == 0; retry--) {
> +        efi_memprobe_internal();    /* sync the current map */
> +        status = EFI_CALL(BS->ExitBootServices, IH, mmap_key);
> +        if (status != EFI_SUCCESS && retry == 0)
> +            panic("ExitBootServices failed (%d)\n", status);
> +    }
>  }

See the condition of "for" carefully.  It seems that it never become
true.  Can you check whether it will still boot successfull after this
problem is fixed?

(snip)

> Oh before I forget, while I was reading through the code I noticed what
> appears to be a typo in sys/arch/amd64/stand/efiboot/eficall.h:
>
> #define EFI_CALL(...) \
>     _efi_call_fn(__VA_ARGS__, _call_9, _call_8, _call_7, _call_6, _call_5, \
>    _call_4, _call_3, _call_2, _call_1, _call_1)(__VA_ARGS__)
> #endif
>
> It looks like the second instance of _call_1 should be _call_0 (which is
> defined earlier), unless I am totally misreading this.

I commited this already.  Thanks.

--yasuoka

Reply | Threaded
Open this post in threaded view
|

Re: amd64 UEFI machine fails to load kernel; displays "ExitBootServices" and reboots

Jonathan Gray-11
On Fri, Dec 30, 2016 at 05:33:44PM +0900, YASUOKA Masahiko wrote:

> Hi,
>
> On Thu, 29 Dec 2016 14:53:20 -0600
> Ryan <[hidden email]> wrote:
> > I've successfully booted the machine with the following changes:
> >
> >
> > --- sys/arch/amd64/stand/efiboot/efiboot.c    2016-06-10 12:36:06.000000000
> > -0600
> > +++ sys/arch/amd64/stand/efiboot/efiboot.c    2016-12-28 15:57:36.728002168
> > -0600
> > @@ -137,12 +137,16 @@ efi_main(EFI_HANDLE image, EFI_SYSTEM_TA
> >  void
> >  efi_cleanup(void)
> >  {
> > +      int         retry;
> >      EFI_STATUS     status;
> >
> > -    efi_memprobe_internal();    /* sync the current map */
> > -    status = EFI_CALL(BS->ExitBootServices, IH, mmap_key);
> > -    if (status != EFI_SUCCESS)
> > -        panic("ExitBootServices");
> > +    /* retry once in case of failure */
> > +    for (retry = 1; retry == 0; retry--) {
> > +        efi_memprobe_internal();    /* sync the current map */
> > +        status = EFI_CALL(BS->ExitBootServices, IH, mmap_key);
> > +        if (status != EFI_SUCCESS && retry == 0)
> > +            panic("ExitBootServices failed (%d)\n", status);
> > +    }
> >  }
>
> See the condition of "for" carefully.  It seems that it never become
> true.  Can you check whether it will still boot successfull after this
> problem is fixed?
>
> (snip)
> > Oh before I forget, while I was reading through the code I noticed what
> > appears to be a typo in sys/arch/amd64/stand/efiboot/eficall.h:
> >
> > #define EFI_CALL(...) \
> >     _efi_call_fn(__VA_ARGS__, _call_9, _call_8, _call_7, _call_6, _call_5, \
> >    _call_4, _call_3, _call_2, _call_1, _call_1)(__VA_ARGS__)
> > #endif
> >
> > It looks like the second instance of _call_1 should be _call_0 (which is
> > defined earlier), unless I am totally misreading this.
>
> I commited this already.  Thanks.

Shouldn't this change also be made to armv7 and arm64?

Reply | Threaded
Open this post in threaded view
|

Re: amd64 UEFI machine fails to load kernel; displays "ExitBootServices" and reboots

YASUOKA Masahiko-4
On Fri, 30 Dec 2016 19:38:20 +1100
Jonathan Gray <[hidden email]> wrote:

> On Fri, Dec 30, 2016 at 05:33:44PM +0900, YASUOKA Masahiko wrote:
>> On Thu, 29 Dec 2016 14:53:20 -0600
>> Ryan <[hidden email]> wrote:
>> > Oh before I forget, while I was reading through the code I noticed what
>> > appears to be a typo in sys/arch/amd64/stand/efiboot/eficall.h:
>> >
>> > #define EFI_CALL(...) \
>> >     _efi_call_fn(__VA_ARGS__, _call_9, _call_8, _call_7, _call_6, _call_5, \
>> >    _call_4, _call_3, _call_2, _call_1, _call_1)(__VA_ARGS__)
>> > #endif
>> >
>> > It looks like the second instance of _call_1 should be _call_0 (which is
>> > defined earlier), unless I am totally misreading this.
>>
>> I commited this already.  Thanks.
>
> Shouldn't this change also be made to armv7 and arm64?

Yes, I committed for armv7 and arm64 as well.

Thanks,

--yasuoka

Reply | Threaded
Open this post in threaded view
|

Re: amd64 UEFI machine fails to load kernel; displays "ExitBootServices" and reboots

Jonathan Gray-11
On Fri, Dec 30, 2016 at 06:49:22PM +0900, YASUOKA Masahiko wrote:

> On Fri, 30 Dec 2016 19:38:20 +1100
> Jonathan Gray <[hidden email]> wrote:
> > On Fri, Dec 30, 2016 at 05:33:44PM +0900, YASUOKA Masahiko wrote:
> >> On Thu, 29 Dec 2016 14:53:20 -0600
> >> Ryan <[hidden email]> wrote:
> >> > Oh before I forget, while I was reading through the code I noticed what
> >> > appears to be a typo in sys/arch/amd64/stand/efiboot/eficall.h:
> >> >
> >> > #define EFI_CALL(...) \
> >> >     _efi_call_fn(__VA_ARGS__, _call_9, _call_8, _call_7, _call_6, _call_5, \
> >> >    _call_4, _call_3, _call_2, _call_1, _call_1)(__VA_ARGS__)
> >> > #endif
> >> >
> >> > It looks like the second instance of _call_1 should be _call_0 (which is
> >> > defined earlier), unless I am totally misreading this.
> >>
> >> I commited this already.  Thanks.
> >
> > Shouldn't this change also be made to armv7 and arm64?
>
> Yes, I committed for armv7 and arm64 as well.
>
> Thanks,

Thanks, in this particular case they use the

#define EFI_CALL(_func_, ...) (_func_)(__VA_ARGS__)

path, it is good to keep them in sync though.