Success upgrade process question from FAQ upgrade caution "should".


Daniel Ouellet
Hi,

I had never done an upgrade before; I always did a full install, from
OpenBSD 2.8 through 3.8.

But I decided to give it a try, just for fun and learning. I went from
3.6 to 3.7, then to 3.8. The process went great, as expected, as long as
you follow Nick's great FAQ.

Thanks, Nick, for them. I am always amazed at how well they are done! You
have to be commended for that, and I think you sure deserve the FAQ Nick
development funds! (;>

But reading it got me thinking about one small section Nick put in there.
If I have learned something over time about Nick's writing, it is that
even small steps that you could very easily overlook are there for a
reason, and they were really thought about before being put in.

I refer to this: "but it should be done now, as usually, the new kernel
will run old userland apps" from the upgrade 3.6 -> 3.7, or 3.7 -> 3.8,
etc. I was very curious about the word "should". Was that there because
of the switch from a.out to ELF between 3.3 and 3.4? That's all I could
think of. Or is there actually some other possibility that, when you do
the process remotely as I did, the box doesn't come back to life?

I am not trying to pick on Nick's words; I have just learned over the
years to really respect his choice of them!

So, I really was trying to think of when that wasn't going to work, or
what the extent of that meaning is in the context of the upgrade process.

I couldn't come up with anything other than the above note on the a.out
upgrade.

Are there any situations where it might actually apply? Or are you
referring to applications more than the OS itself?

Thanks

Daniel


Re: Success upgrade process question from FAQ upgrade caution "should".

Nick Holland
Daniel Ouellet wrote:
[snip some stuff too flattering to repeat]
> But reading it got me thinking about one small section Nick put in there.
> If I have learned something over time about Nick's writing, it is that
> even small steps that you could very easily overlook are there for a
> reason, and they were really thought about before being put in.

heh.  Many people think I write too much. :)

> I refer to this: "but it should be done now, as usually, the new kernel
> will run old userland apps" from the upgrade 3.6 -> 3.7, or 3.7 -> 3.8,
> etc. I was very curious about the word "should". Was that there because
> of the switch from a.out to ELF between 3.3 and 3.4? That's all I could
> think of. Or is there actually some other possibility that, when you do
> the process remotely as I did, the box doesn't come back to life?

The a.out -> ELF conversion was certainly a major example of this kind
of issue (though remote upgrades through the a.out -> ELF conversion had
other major, near-fatal issues, too[1]).

However, there are lots of potential things that could be done on any
platform which would produce new binaries that won't run on old kernels.
Any API/ABI[2] change (a.out -> ELF is just one example of such a thing)
could have that effect.  We also have a lot of platforms, any of which
could have their own little "quiet" API/ABI change that no one bothered
to tell me about.  I'd rather have one procedure for all platforms if
possible, and I'd rather keep the process as consistent from release to
release as possible.  That way, when this step becomes important, you
don't have to notice the "THIS IS DIFFERENT AND REALLY IMPORTANT" note
in the upgradeXX file.

> I am not trying to pick on Nick's words; I have just learned over the
> years to really respect his choice of them!

...for a second opinion, ask some of the translators, who are really
good proofreaders. :)

However, in this case, yes, they were chosen carefully.  This is not a
requirement EVERY TIME, but I don't want to have to verify the upgrade
process for every single platform every single release.

This also follows the standard OpenBSD process when building from source:
  replace kernel, reboot
  replace userland, reboot
If this process doesn't work, that's a form of flag day, because we
always try to make sure that process works.  This is just the binary
version of that process.

New kernels are supposed to maintain some kind of support for old
binaries, but by definition, it isn't possible to always support new
binaries in old kernels.
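
To make that asymmetry concrete, here is a minimal toy model in Python.
It is only a sketch under loud assumptions: the Kernel and Binary classes
and the "approve_action" call are hypothetical names made up for
illustration, and real kernels and binaries are far more complicated.
The only idea it shows is that a new kernel can keep servicing old calls,
while an old kernel cannot know about calls invented after it was built.

```python
# Toy model only. Kernel, Binary, and the "approve_action" call are
# hypothetical names invented for illustration; nothing here is OpenBSD code.
from dataclasses import dataclass


@dataclass(frozen=True)
class Kernel:
    name: str
    calls: frozenset  # kernel interfaces this kernel can service


@dataclass(frozen=True)
class Binary:
    name: str
    needs: frozenset  # kernel interfaces this binary relies on


def can_run(kernel, binary):
    """A binary runs only if the kernel services every call it needs."""
    return binary.needs <= kernel.calls


OLD_CALLS = frozenset({"read", "write", "reboot"})
NEW_CALLS = OLD_CALLS | {"approve_action"}  # new kernel keeps the old calls

old_kernel = Kernel("old kernel", OLD_CALLS)
new_kernel = Kernel("new kernel", NEW_CALLS)
old_reboot = Binary("old reboot", frozenset({"reboot"}))
new_reboot = Binary("new reboot", frozenset({"reboot", "approve_action"}))

# Try every combination of kernel and userland binary.
for kernel in (old_kernel, new_kernel):
    for binary in (old_reboot, new_reboot):
        verdict = "runs" if can_run(kernel, binary) else "FAILS"
        print(f"{binary.name} on {kernel.name}: {verdict}")
```

In this model the only failing combination is the new reboot on the old
kernel, which is the state you would be in if you replaced userland before
booting the new kernel.  Doing the kernel first never passes through that
combination.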

> So, I really was trying to think of when that wasn't going to work, or
> what the extent of that meaning is in the context of the upgrade process.
>
> I couldn't come up with anything other than the above note on the a.out
> upgrade.
>
> Are there any situations where it might actually apply? Or are you
> referring to applications more than the OS itself?

Imagine some new application security technology, similar to W^X or
Propolice, which had to talk to the kernel to "ok" an act.  The old
kernel would not recognize the function call that the new app made, and
thus, the app (reboot, in this case) would not run.  You would end up
with a machine which won't work, but won't reboot, either (whoa.
Windows emulation on OpenBSD!).  Another example: some ACPI call might
get added to reboot someday if/when ACPI gets more supported...again,
same issue, the old, pre-ACPI kernel wouldn't know what to do with it.

(note: I'm not a kernel hacker, I really have no idea how plausible the
stuff I just babbled above is, so either or both of those examples might
be totally bogus.)

Developers reserve the right to make changes like that at any time, and
they could be more subtle than a complete binary format conversion.


IF there is some reason you have to complete an upgrade in one reboot,
or faster than the local boot media process goes, you might want to try
the above referenced footnote [1] below.  (and if that sentence doesn't
cure you of this delusion that I always write efficiently, concisely and
clearly, I am sure there are other examples in this note. :)

Nick.



[1] http://www.holland-consulting.net/obsd/aout-up.html
[2] API/ABI: Application Program Interface, Application Binary
Interface.  How programs talk to each other and the kernel.  Guess the
binary part would be the only "critical" thing in this case, however.


Re: Success upgrade process question from FAQ upgrade caution "should".

Daniel Ouellet
Nick Holland wrote:

> IF there is some reason you have to complete an upgrade in one reboot,
> or faster than the local boot media process goes, you might want to try
> the above referenced footnote [1] below.  (and if that sentence doesn't
> cure you of this delusion that I always write efficiently, concisely and
> clearly, I am sure there are other examples in this note. :)


The idea sure wasn't to do the upgrade in one step at all.

Maybe I didn't express my thought too well. I took it more like this: it
said to reboot after the first step, but then the warning said it
should run, meaning, well, it's possible that it won't reboot at all,
meaning that even doing only the first step doesn't guarantee it will
work and that the server will come back to life to do the second step.

I never meant skipping the first step whatsoever. I wouldn't
question your warning! (;>

I was more curious about in what situations a remote upgrade like this
might not work. I only thought of the a.out situation.

Thanks

Daniel


Re: Success upgrade process question from FAQ upgrade caution "should".

Nick Holland
Daniel Ouellet wrote:

> Nick Holland wrote:
>
>> IF there is some reason you have to complete an upgrade in one reboot,
>> or faster than the local boot media process goes, you might want to try
>> the above referenced footnote [1] below.  (and if that sentence doesn't
>> cure you of this delusion that I always write efficiently, concisely and
>> clearly, I am sure there are other examples in this note. :)
>
>
> The idea sure wasn't to do the upgrade in one step at all.
>
> Maybe I didn't express my thought too well. I took it more like this: it
> said to reboot after the first step, but then the warning said it
> should run, meaning, well, it's possible that it won't reboot at all,
> meaning that even doing only the first step doesn't guarantee it will
> work and that the server will come back to life to do the second step.
>
> I never meant skipping the first step whatsoever. I wouldn't
> question your warning! (;>
>
> I was more curious about in what situations a remote upgrade like this
> might not work. I only thought of the a.out situation.

ah, ok, I think I understand...you were worrying about my use of
qualifiers, probably this one:
   "...it should be done now, as usually, the new kernel will run old
userland apps..."
ok, yeah, that "usually" looks a little scary (you had me looking at the
"should", which caused me to think you were trying to skip a step).  The
qualifier is there because I don't like making absolutes of statements
that I'm not sure will always be absolute, and that statement was untrue
at least once in the past and may be again.  Yes, I was thinking of the
old a.out -> ELF conversion flag day on several platforms.  However, I'm
never going to rule out that a similar flag day might happen in the
future, though I'm not aware of any plans for one now.  So, do as you
were doing, read the appropriate set of instructions, and don't just
assume they are "just like last release".

IF there is a categoric reason why that wouldn't work on any platform,
I'll certainly try to let you know.

When doing a remote (or critical system) upgrade, that is not your real
problem.  What you have to worry about is other things you did to your
system.  The systems I test the remote upgrade process on are pretty
simple -- one is entirely contrived, I install the previous release on it
(often, left over from the previous faq4 update actually), then upgrade
it and make sure it works with the machine six feet away from me.  Then
I do the same thing with a couple machines that are actually pretty
important to me and are a drive and most likely time off work to get to.
So by the time it is committed, I'm pretty confident of the general
idea.  But the machines are all still "pretty simple".

What I can't control is what really version-specific app you have, or
what change you made to /etc/rc or similar.  That's why the primary
disclaimer is the second sentence of the second paragraph at the top of
the page:

   "If you are doing it on a critical or physically remote machine, it
is recommended that you test this process on an identical, local system
to verify its success before attempting on a critical or remote computer."

You will note that is not in the remote upgrade part, that's in the
"general, impacts everyone" part.  For a reason. :)

Hopefully, I got closer to your question that time. :)


Nick.


Re: Success upgrade process question from FAQ upgrade caution "should".

Daniel Ouellet
Nick Holland wrote:

> Daniel Ouellet wrote:
>> Nick Holland wrote:
>>
>>> IF there is some reason you have to complete an upgrade in one reboot,
>>> or faster than the local boot media process goes, you might want to try
>>> the above referenced footnote [1] below.  (and if that sentence doesn't
>>> cure you of this delusion that I always write efficiently, concisely and
>>> clearly, I am sure there are other examples in this note. :)
>>
>> The idea sure wasn't to do the upgrade in one step at all.
>>
>> Maybe I didn't express my thought too well. I took it more like this: it
>> said to reboot after the first step, but then the warning said it
>> should run, meaning, well, it's possible that it won't reboot at all,
>> meaning that even doing only the first step doesn't guarantee it will
>> work and that the server will come back to life to do the second step.
>>
>> I never meant skipping the first step whatsoever. I wouldn't
>> question your warning! (;>
>>
>> I was more curious about in what situations a remote upgrade like this
>> might not work. I only thought of the a.out situation.
>
> ah, ok, I think I understand...you were worrying about my use of
> qualifiers, probably this one:
>    "...it should be done now, as usually, the new kernel will run old
> userland apps..."

Yes that was it. (;>

> ok, yeah, that "usually" looks a little scary (you had me looking at the
> "should", which caused me to think you were trying to skip a step).  The
> qualifier is there because I don't like making absolutes of statements
> that I'm not sure will always be absolute, and that statement was untrue
> at least once in the past and may be again.  Yes, I was thinking of the
> old a.out -> ELF conversion flag day on several platforms.  However, I'm
> never going to rule out that a similar flag day might happen in the
> future, though I'm not aware of any plans for one now.  So, do as you
> were doing, read the appropriate set of instructions, and don't just
> assume they are "just like last release".

Understood. I also thought only of the a.out stuff.

> IF there is a categoric reason why that wouldn't work on any platform,
> I'll certainly try to let you know.
>
> When doing a remote (or critical system) upgrade, that is not your real
> problem.  What you have to worry about is other things you did to your
> system.  The systems I test the remote upgrade process on are pretty
> simple -- one is entirely contrived, I install the previous release on it
> (often, left over from the previous faq4 update actually), then upgrade
> it and make sure it works with the machine six feet away from me.  Then
> I do the same thing with a couple machines that are actually pretty
> important to me and are a drive and most likely time off work to get to.
> So by the time it is committed, I'm pretty confident of the general
> idea.  But the machines are all still "pretty simple".
>
> What I can't control is what really version-specific app you have, or
> what change you made to /etc/rc or similar.  That's why the primary
> disclaimer is the second sentence of the second paragraph at the top of
> the page:

Using packages, removing them as you suggested, and then putting them
back after the fact sounds good to me and makes the system pretty simple,
I guess. So, all looks good to me, and that sure answers the question I
had in the back of my mind! (;> Maybe I will do more remote upgrades now.
Always more comfortable doing it at home with a beer or two. Might
increase the risk with the beer, but it sure increases the pleasure as
well! (:>

>    "If you are doing it on a critical or physically remote machine, it
> is recommended that you test this process on an identical, local system
> to verify its success before attempting on a critical or remote computer."
>
> You will note that is not in the remote upgrade part, that's in the
> "general, impacts everyone" part.  For a reason. :)
>
> Hopefully, I got closer to your question that time. :)


Sure did! Always a pleasure to read your answers.

Many thanks Nick!

Daniel