CPU selection


CPU selection

Paolo Supino
Hi

  I'm in the process of configuring a Dell PowerEdge 860 as a firewall and
I'm debating what kind of CPU to get for an office of about 50 people, a
20 Mb metro Ethernet link, and 15 lightly used Internet servers: FTP, web,
DNS, email, NTP, etc. In addition to being a firewall, the machine will
also act as a NIDS and IPsec peer (something like 10 concurrent tunnels).
The options I have for the CPU are:
1. Intel Celeron 336 at 2.8 GHz, 256 KB cache, 533 MHz FSB.
2. Dual Core Intel Pentium D 915 at 2.8 GHz, 2x2 MB cache, 800 MHz FSB.
3. Dual Core Xeon 3050, 2.13 GHz, 2 MB cache, 1066 MHz FSB.
4. Dual Core Xeon 3060, 2.40 GHz, 4 MB cache, 1066 MHz FSB.
5. Dual Core Xeon 3070, 2.66 GHz, 4 MB cache, 1066 MHz FSB.

  I have to be very price conscious, so will the Celeron hold the load, or
should I take one of the Xeon CPUs?




TIA
Paolo


Re: CPU selection

alexander lind
I don't think the Celeron CPU will have any problems coping with that.

Consider getting two of the machines and CARPing them, for redundancy
and load balancing (not that you will likely really need that).
Also consider putting some extra cash down on a hardware RAID controller
and two SCSI disks for each machine, and run RAID 1 on them, for even
more failover safety.
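
As a rough sketch of what the CARP suggestion looks like on OpenBSD, each
firewall carries a couple of small config files; the interface name,
addresses and password below are made-up placeholders, not anything from
this thread:

    /etc/hostname.carp0 (shared virtual IP on the LAN side):
        inet 192.168.1.1 255.255.255.0 192.168.1.255 vhid 1 pass sekrit

    /etc/hostname.pfsync0 (pf state sync, ideally over a crossover cable):
        up syncdev fxp1

On the standby box the same carp0 line gets "advskew 100" appended so it
only takes over when the master stops advertising; note that older
releases spell the pfsync option "syncif" rather than "syncdev".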

Alec



Re: CPU selection

K Kadow
In reply to this post by Paolo Supino
On 11/2/06, Paolo Supino <[hidden email]> wrote:
>   I'm in the process of configuring a Dell PowerEdge 860 as firewall and
> I debating what kind of CPU to get for the firewall for an office of
> about 50 people, 20MB metro ethernet, and 15 lightly used Internet
> servers: FTP, web, DNS, email, NTP, etc ... In addition for the computer
> being a firewall it will also act as a NIDS and IPSEC peer (something
> like 10 concurrent tunnels).

So the only processes running on-box would be pf, IPSEC, and NIDS?
What sort of NIDS?

The Celeron at 2.8 GHz should be sufficient; I do not recall whether the
PE860 with a Celeron can be upgraded to a Xeon later.

Kevin


Re: CPU selection

Josh-24
In reply to this post by Paolo Supino
I would go with option number 2 :)

The NIDS will probably be the most CPU/memory intensive part, and if
you're running Snort or something like that, be sure to get plenty of
memory (e.g., over a gig).
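
For what it's worth, a typical Snort invocation on the firewall itself
looks something like the line below; the interface name and config path
are only examples, and the exact flags should be checked against Snort's
own documentation:

    snort -D -i fxp1 -c /etc/snort/snort.conf

Here -D runs it as a daemon, -i picks the interface to sniff and -c
points at the ruleset; the size of that ruleset and the traffic on the
interface are what drive the memory use mentioned above.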

Cheers,
    Josh



Re: CPU selection

Michael Lockhart
In reply to this post by Paolo Supino
Paolo,

Celerons will work fine, but in the interests of long-term capacity
planning, I would recommend going with the low-end dual-core Xeon.

Regards,
Mike Lockhart
 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Mike Lockhart        [Systems Engineering & Operations]
StayOnline, Inc
http://www.stayonline.net/
mailto: [hidden email]
GPG: 8714 6F73 3FC8 E0A4 0663  3AFF 9F5C 888D 0767 1550
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=



Re: CPU selection

Paolo Supino
In reply to this post by K Kadow
Hi K Kadow

   The NIDS would be Snort.


TIA
Paolo





Re: CPU selection

Stuart Henderson
In reply to this post by alexander lind
On 2006/11/02 13:36, Alexander Lind wrote:
> Consider getting two of the machines and CARPing them, for redundancy

Agreed; it makes servicing, upgrades and fault diagnosis much simpler.

> Also consider putting some extra cash down on a hw raid controller, and
> 2 scsi disks for each machine, and run raid 1 on them, for even more
> failover safety.

But that doubles the cost of the machine and makes for a more complex
system; if that kind of money is available, the extra box is probably
more useful.


Re: CPU selection

Paolo Supino
In reply to this post by alexander lind
Hi Alexander

   I completely agree with you and in the long run it will happen, but
getting a second machine is beyond my budget for the next couple of months.




TIA
Paolo







Re: CPU selection

alexander lind
In reply to this post by Stuart Henderson
>> Also consider putting some extra cash down on a hw raid controller, and
>> 2 scsi disks for each machine, and run raid 1 on them, for even more
>> failover safety.
>>    
>
> but that doubles the cost of the machine and makes for a more complex
> system - if that type of money is available, the extra box is probably
> more useful
>
>  
I don't agree: the cost of a hardware RAID card and a second SCSI disk
is more than that of one SATA disk, but it does not exactly double the
price. Setting up OpenBSD on a RAIDed machine is also extremely simple
(provided you use a supported RAID card, of course).

The hard drives, next after the PSUs, are in my experience the most
common points of failure, so whenever I set up a server to be used in
production (even if it has a CARP buddy) I try to make sure they are
RAIDed as well.

alec


Re: CPU selection

alexander lind
In reply to this post by Paolo Supino
Hello Paolo

Then at least make sure you get a machine with a backup PSU and RAID. If
downtime is expensive (and it tends to be for most companies) you want
to make sure that your assets are covered when the hardware fails :)

Alec



Re: CPU selection

Nick Holland
In reply to this post by Paolo Supino
Paolo Supino wrote:
> Hi Alexander
>
>    I completely agree with you and in the long run it will happen, but
> getting a second machine is beyond my budget for the next couple of months.

Then, you should go grab a couple OLD machines, and build your firewall
with them.  You probably won't be implementing all the cool stuff right
away, anyway...  Save buying the new machines for when you can do it right.

For reference, we have a DS3 (45 Mbps) and 900 users going through a
CARPed pair of five-year-old machines.  The primary is a 600 MHz Celeron,
the standby is a PIII-750 MHz.  We're not running IPsec or IDS on them,
but these machines seem to have a fair amount of growth potential on 'em.

And yes, the primary machine is "slower" than the backup.

You need the second machine.  Even if you don't run CARP, you need a
second machine.  If you DO run CARP, I'd even argue you need a third
machine:
  Rapid repair: Don't rely on someone else to get yourself back running.
  Testing: "What happens if I do X?"
  Upgrades: Do your upgrade on the second system, make sure all goes as
you expect before doing it on the production machine.
etc.

Granted, your "second" (or third) machine could be the second machine
for a lot of different systems in your company, if you standardize your HW.

As for RAID on a firewall, uh...no, all things considered, I'd rather
AVOID that, actually.  Between added complexity, added boot time, and
disks that can't be used without the RAID controller, it is a major
loser when it comes to total up-time if you do things right.  Put a
second disk in the machine, and regularly dump the primary to the
secondary.  Blow the primary drive, you simply remove it, and boot off
the secondary (and yes, you test test test this to make sure you did it
right!).  RAID is great when you have constantly changing data and you
don't want to lose ANYTHING EVER (i.e., mail server).  When you have a
mostly-static system like a firewall, there are simpler and better ways.
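
A minimal sketch of that second-disk approach, assuming wd0 is the live
disk and wd1 is a standby already carrying a matching disklabel (the
device names and single-partition layout here are assumptions, and the
exact installboot arguments vary by release, so check installboot(8)):

    newfs /dev/rwd1a
    mount /dev/wd1a /altroot
    dump -0af - / | (cd /altroot && restore -rf -)
    /usr/mdec/installboot /altroot/boot /usr/mdec/biosboot wd1

The stock daily(8) script can automate the root-filesystem copy when an
altroot device is configured and ROOTBACKUP=1 is set.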

A couple months ago, our Celeron 600 firewall seemed to be having
"problems", which we thought may have been due to processor load.  We
were able to pull the disk out of it, put it in a much faster machine,
adjust a few files, and we were back up and running quickly...and found
that the problem was actually due to a router misconfig and a runaway
nmap session.  We would not have been able to do that with a RAID card.
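
For the curious, the "few files" in a transplant like that are typically
just the ones that name hardware. For example, if the NIC driver changes
from fxp to em (the driver names here are only an illustration):

    mv /etc/hostname.fxp0 /etc/hostname.em0

plus a glance at /etc/fstab in case the disk attaches as a different
device on the new box.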

Nick.


Re: CPU selection

alexander lind
> As for RAID on a firewall, uh...no, all things considered, I'd rather
> AVOID that, actually.  Between added complexity,
what complexity?
>  added boot time, and
> disks that can't be used without the RAID controller,
Why would you want to use your disk WITHOUT the RAID controller?
>  it is a major
> loser when it comes to total up-time if you do things right.  Put a
> second disk in the machine, and regularly dump the primary to the
> secondary.  Blow the primary drive, you simply remove it, and boot off
> the secondary (and yes, you test test test this to make sure you did it
> right!).
Now you're talking crazy. Let's consider the two setups:
No-RAID setup:
  - two separately controlled disks; you are in charge of syncing
between them
  - if one dies, the machine goes down, and you go to the machine and
manually boot from the backup disk
  - IF you had important data on the dead disk not yet backed up, you
are screwed.
You could almost look at this as a poor man's manual pretend RAID.

RAID setup:
  - two disks, constantly synced; if one dies, the machine does NOT go down
  - if a disk fails, just go and plug a new one in _at your
convenience*_ and it will automatically rebuild, a task any person could
perform with proper direction. Not a second's downtime.

* this is _very_ important if your machine is hosted where you don't
have easy physical access to it. Machines at a colo center would be a
very common scenario.
>  RAID is great when you have constantly changing data and you
> don't want to lose ANYTHING EVER (i.e., mail server).  When you have a
> mostly-static system like a firewall, there are simpler and better ways.
>  
RAID is great for any server. So are SCSI drives. If you are a company
that loses more money on a few hours (or even minutes) of downtime than
it costs to invest in proper servers with proper hardware RAID and SCSI
disks, then you are ill-advised _not_ to RAID all your mission-critical
servers. And have backup machines, too!  Preferably load-balanced.
> A couple months ago, our Celeron 600 firewall seemed to be having
> "problems", which we thought may have been due to processor load.  We
> were able to pull the disk out of it, put it in a much faster machine,
> adjust a few files, and we were back up and running quickly...and found
> that the problem was actually due to a router misconfig and a run-away
> nmap session.  Would not have been able to do that with a RAID card.
>  
Next time, you may want to check what the machine is actually doing
before you start blaming your hardware.
I personally would not trust the OS setup from one machine to run
smoothly in any machine not more or less identical to it as far as the
hardware goes. Especially not for a production unit.
But if you really wanted to, you could move the entire RAID array over
to a different machine, if that makes you happy.

Alec


Re: CPU selection

Ingo Schwarze
Perhaps you missed that Nick was talking about a pair of carp'ed
firewalls.  Failure of one machine means *no* downtime.  Besides,
firewalls rarely need to store any valuable data, almost by definition.

Alexander Lind wrote on Thu, Nov 02, 2006 at 05:27:00PM -0800:

> Now you're talking crazy.

That happens rarely to Nick.  ;-)

In a long time, I can remember only one or two instances where he was
actually proven wrong.


Re: CPU selection

alexander lind
Ingo Schwarze wrote:
> Perhaps you missed that Nick was talking about a pair of carp'ed
> firewalls.  Failure of one machine means *no* downtime.  Besides,
> firewalls rarely need to store any valuable data, almost by definition.
>  
I'm not saying that digging up parts and building a couple of machines
out of old scrap that you could find in my attic (and you could find
enough there to build a server farm, I assure you) and making a whole
farm of CARPed firewalls will not do the trick.
But from an enterprise point of view, spending a few hundred dollars
extra to build machines that are very unlikely to go down in the first
place, and that can be rebuilt with minimum effort if they do, is
usually going to be worthwhile. CARPed or not.

It's a different story for home users, or for someone who is hard up for
cash, of course.


>> Now you're talking crazy.
>>    
>
> That happens rarely to Nick.  ;-)
>
> I remember about one or two instances where he was actually proven
> wrong, in a long time.
>  
Perhaps your memory just isn't that great?
j/k ;)

Alec


Re: CPU selection

Nick Holland
In reply to this post by alexander lind
Alexander Lind wrote:
>> As for RAID on a firewall, uh...no, all things considered, I'd rather
>> AVOID that, actually.  Between added complexity,
> what complexity?

RAID, kiddo.
It's more complex.  It is something else that can go wrong.
And...it DOES go wrong.  Either believe me now, or wish you believed me
later.  Your call.  I spent a lot of time profiting from people who
ignored my advice. :)

>>  added boot time, and
>> disks that can't be used without the RAID controller,
> why would you want to use your disk WITHOUT the raid controller?

Oh, say, maybe your RAID controller failed?
Or the spare machine you had didn't happen to have the same brand and
model RAID card?
Or the replacement RAID card happened to have a different firmware on
it, and the newer firmware wouldn't read your old disk pack?  (yes,
that's a real issue).

>>  it is a major
>> loser when it comes to total up-time if you do things right.  Put a
>> second disk in the machine, and regularly dump the primary to the
>> secondary.  Blow the primary drive, you simply remove it, and boot off
>> the secondary (and yes, you test test test this to make sure you did it
>> right!).
> Now you're talking crazy. Lets consider the two setups:
> No-raid setup:
>   - two separately controlled disks, you are in charge of syncing
> between them

yep.  you better test your work from time to time.
(wow...come to think of it, you better test your RAID assumptions, too.
 Few people do that, they just assume "it works".  This leads to people
proving me right about simplicity vs. complexity)

>   - if one dies, the machine goes down, and you go to the machine, and
> manually boot from the backup disk

yep.  Meanwhile, the system has been running just fine on the SECONDARY
SYSTEM.

>   - IF you had important data on the dead disk not yet backed up, you
> are screwed.

Ah, so you are in the habit of keeping important, non-backed up data on
your firewall?  wow.

> you could almost look at this as poor mans manual pretend raid.

Or as part of RAIC: Redundant Array of Inexpensive Computers.

> Raid setup:
>   - two disks, constantly synced, if one dies, the machine does NOT go down

you are funny.  Or inexperienced.

>   - if a disk fails, just go and plug a new one in _at your
> convenience*_ and it will autmatically rebuild, a task any person could
> perform with proper direction. Not a seconds downtime.

That's the way it is SUPPOSED to work.
Reality is very, very different some times.

Simple systems have simple problems.
Complex systems have complex problems.

The worst down-time events I've ever seen always seem to involve a RAID
system, usually managed by someone who said "does NOT go down!", who
believed that complexity was the solution to a problem.

A RAID controller never causes downtime in a system it's not installed
in.  Power distribution boards don't fail on machines that don't have
them.  Hotplug backplanes don't fail on machines that don't have them.
(seen 'em all happen).

> * this is _very_ important if your machine is hosted where you don't
> have easy physical access to it. Machines at a colo center would be a
> very common scenario.

That is correct... IF that was what we were talking about.  It isn't.
You keep trying to use the wrong special case for the topic at hand.

Design your solutions to meet the problem in front of you, not a totally
unrelated problem.

>>  RAID is great when you have constantly changing data and you
>> don't want to lose ANYTHING EVER (i.e., mail server).  When you have a
>> mostly-static system like a firewall, there are simpler and better ways.
>>  
> RAID is great for any server.

WRONG.
It is good for the right systems in the right places.  There are a lot
of those places.
It is great when administered by someone who understands the limitations
of it.  That, sadly, is uncommon.

> So are scsi drives.

I've been hearing that "SCSI is better!" stuff for 20 years, most of
that while working in service and support of LOTS of companies' computers.

It *may* be true that SCSI drives are more reliable than IDE drives,
though I really suspect that even if it is true on average, the variation
between models is probably greater than the difference between
interfaces.  But that's just the drive, and I'm giving you that.

HOWEVER, by the time you add the SCSI controller, the software and the
other stuff in a SCSI solution, you have a much more cranky beast than
your IDE disk systems usually are.  No, it isn't supposed to be that
way, but experience has shown me that SCSI cards suck, SCSI drivers
suck, you rarely have the right cables and terminators on hand, and
people rarely screw up IDE drivers or chips as badly as they do the SCSI
chips and drivers (and I am most certainly not talking just OpenBSD
here).  No question in my mind on this.  I've seen too many bad things
happen with SCSI...none of which should have...but they did, anyway.

> If you are a company
> that loses more money on a few hours (or even minutes) downtime than it
> costs to invest in proper servers with proper hw raid + scsi disks, then
> you are ill-advised _not_ to raid all your missioncritical servers. And
> have backup machines, too!  Preferably loadbalanced.

No, if controlling downtime is important to you, you have to look at the
ENTIRE solution, not chant mantras that you don't fully understand about
tiny little bits of individual computers that make up whole systems
(note: "system" here being used to indicate much more than one computer).

>> A couple months ago, our Celeron 600 firewall seemed to be having
>> "problems", which we thought may have been due to processor load.  We
>> were able to pull the disk out of it, put it in a much faster machine,
>> adjust a few files, and we were back up and running quickly...and found
>> that the problem was actually due to a router misconfig and a run-away
>> nmap session.  Would not have been able to do that with a RAID card.
>>  
> Next time, you may want to check what the machine is actually doing
> before you start blaming your hardware.
> I personally would not trust the OS setup on one machine to run smoothly
> in any machine not more or less identical to itself as far as the hw
> goes. Especially not for a production unit.

Ah, a Windows user, I see. ;)
Once you understand how OpenBSD works, you will see that this is not a
problem.  It is the same kernel, same supporting files installed to the
same places in the same way and doing the same thing, whether it be on a
486 or a P4.  It is just a (very) few config files that are different.
It's truly wonderful.  It's how things should be.

> But if you really wanted too, you could move the entire raid array over
> to a different machine, if that makes you happy.

Assuming you have practiced and practiced and practiced this process.
Do it wrong, you can kiss all copies of your data bye-bye, too.  Some
RAID controllers make it really easy to do this.  Others make it really
easy to clear your disks of all data...And sometimes, two cards with
really similar model numbers in machines you thought were really close
to being the same have really big differences you didn't anticipate.

Don't get me wrong, RAID has its place, and it has a very good place on
a lot of systems, maybe even most systems that call themselves servers
(and if it wasn't for the cost, most systems that call themselves
workstations, too).  I have a snootload of different types of RAID
systems around here (and btw, bioctl(8) rocks!).  My firewall runs
ccd(4) mirroring, in fact (mostly because I'm curious how it fails in
real life.  All things considered, I much prefer the design I described
earlier).
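
For anyone following along, the status check bioctl(8) gives you is a
one-liner when the controller is supported by bio(4); the controller
name below is just an example:

    bioctl ami0

which lists each volume and member disk along with its state, so a
failed or off-lined member actually gets noticed instead of sitting
there silently.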

But in /this/ case, we are talking about a particular application,
firewalls in an office.  It doesn't really matter one bit what would be
more appropriate at a CoLo site if that's not what we are talking about.

OpenBSD makes it almost trivial to make entire redundant pairs of
machines.  Think of it as RAID on steroids...it isn't just redundant
disks, it is redundant NICs, power supplies, disk controllers, cables,
processors, memory, cases...everything.  PLUS, it not only helps you
with your uptime in case of failure, it also makes a lot of other things
easier, too, such as software and hardware upgrades, so you are more
likely to do upgrades when needed.  At this point, RAID and redundant
power supplies and such just make life more expensive and more complex,
not better.


Last February, I had an opportunity to replace a bunch of old servers at
my company's branch offices.  About 11 branches got big, "classic"
servers, about 15 smaller branches got workstations converted into
servers by adding an Accusys box for simple SATA mirroring.  The big
branches needed the faster performance of the big server, the small
branches just needed to get rid of the old servers that were getting
very unreliable (guess what?  It was the SCSI backplanes and the RAID
controllers that were causing us no end of trouble).  It has been an
interesting test of "server vs. workstation" and "SATA vs. SCSI".

It is actually hard to tell who is ahead...
    Disk failures have been a close call: about the same number of SCSI
disks have failed as SATA disks, but the SCSI systems have ten disks vs.
three for the SATA machines, so there are more SCSI disks, but fewer
SCSI systems.  You will look at the disk count, the users look at "is my
system working?".
    Three SCSI disks have off-lined themselves, but simply unplugging
and plugging them back in has resulted in them coming back up, and
staying up.  Scary.
    Most of the SATA failures have been "clean" failures, though one
drive was doing massive retries, and eventually "succeeded", so the
system performance was horrible until we manually off-lined the
offending drive (which was easy to spot by looking at the disk activity
lights).
    One system actually lost data: one of the SCSI systems had a disk
off-line itself that was not noted by on-site staff, and a week later,
that drive's mirror failed, too (note: the first drive just off-lined
itself..no apparent reason, and it happily went back on-line).
Unfortunately, the second-rate OS on these things lacked something like
bioctl(8) to easily monitor the machines...  Complexity doesn't save you
from user error...though it might add to it.
    The drive failures on the SATA systems immediately result in a
phone call, "there's this loud beeping coming from my server room!".
    We went through a month or two where drives seemed to be popping all
over the place...and since then, things have been very reliable...
    Working with the RAID system on the SCSI machines is something that
needs to be practiced...working with the Accusys boxes is simplicity in
the extreme.
    The little machines are cheap enough we have three spares in boxes
waiting to be next-day shipped to anywhere we MIGHT have a problem.
    The big machines cost a lot of money to next-day ship anywhere, so
we don't even think of it unless we are sure we have a big problem.
    Only one machine has been next-day shipped: one of the big ones, at
a price of about 1/4th the cost of an entire little machine (after the
dual-disk failures, I figured let's get a new machine on site, get them
back up, and we'll worry about the old machine later).

About eight months into the project, I can say, the performance of the
big RAID 1+0 systems rocks, but I love the simplicity of the little
machines...Ask me again in three years. :)

Nick.


Re: CPU selection

alexander lind
>> what complexity?
>>    
>
> RAID, kiddo.
> It's more complex.  It is something else that can go wrong.
> And...it DOES go wrong.  Either believe me now, or wish you believed me
> later.  Your call.  I spent a lot of time profiting from people who
> ignored my advice. :)
>  
Of course RAID is more complex on a hardware level, but that doesn't
exactly make it more complex for _me_, the user, does it?
I have deployed lots and lots of servers, both with and without RAID and
using various different OSes, and I grant you that it used to be a
little tricky to get, for example, Slackware to boot off some
semi-supported RAID devices back in the day, but nowadays it's all
pretty simple IMHO.
And the times when disks have failed, we have plopped in new disks and
they got rebuilt and I lived happily afterwards.
So really, where is your profit margin on someone like me? ;)

>  
>>>  added boot time, and
>>> disks that can't be used without the RAID controller,
>>>      
>> why would you want to use your disk WITHOUT the raid controller?
>>    
>
> Oh, say, maybe your RAID controller failed?
> Or the spare machine you had didn't happen to have the same brand and
> model RAID card?
> Or the replacement RAID card happened to have a different firmware on
> it, and the newer firmware wouldn't read your old disk pack?  (yes,
> that's a real issue).
>  
If indeed the RAID card failed, unlikely as that would be, then it could
get a little messy. Not that I have ever had this problem, but you ought
to be able to downgrade RAID card firmware if you run into the firmware
problem?

>  
>>>  it is a major
>>> loser when it comes to total up-time if you do things right.  Put a
>>> second disk in the machine, and regularly dump the primary to the
>>> secondary.  Blow the primary drive, you simply remove it, and boot off
>>> the secondary (and yes, you test test test this to make sure you did it
>>> right!).
>>>      
>> Now you're talking crazy. Lets consider the two setups:
>> No-raid setup:
>>   - two separately controlled disks, you are in charge of syncing
>> between them
>>    
>
> yep.  you better test your work from time to time.
> (wow...come to think of it, you better test your RAID assumptions, too.
>  Few people do that, they just assume "it works".  This leads to people
> proving me right about simplicity vs. complexity)
>  
If you configure it right it tends to work right. At least it does for me.

>  
>>   - if one dies, the machine goes down, and you go to the machine, and
>> manually boot from the backup disk
>>    
>
> yep.  Meanwhile, the system has been running just fine on the SECONDARY
> SYSTEM.
>  
>  
>>   - IF you had important data on the dead disk not yet backed up, you
>> are screwed.
>>    
>
> Ah, so you are in the habit of keeping important, non-backed up data on
> your firewall?  wow.
>  
Of course, that's where I store my porn.
>  
>> you could almost look at this as poor mans manual pretend raid.
>>    
>
> Or as part of RAIC: Redundant Array of Inexpensive Computers.
>  
which may not always be feasible in an already densely packed rack where
every U is expensive.
>  
>> Raid setup:
>>   - two disks, constantly synced, if one dies, the machine does NOT go down
>>    
>
> you are funny.  Or inexperienced.
>  
Master, you flatter me!
Maybe I'm a lucky bastard, but every single disk failure I have seen in
a RAIDed machine has been solved by pulling the disk out and putting a
new one back in.
It rebuilds for some time, and then the machine is happy again.
I think this has happened to servers I maintain or help maintain five or
so times now.
>  
>>   - if a disk fails, just go and plug a new one in _at your
>> convenience*_ and it will autmatically rebuild, a task any person could
>> perform with proper direction. Not a seconds downtime.
>>    
>
> That's the way it is SUPPOSED to work.
> Reality is very, very different some times.
>  
My servers must be living in fantasyland or something.
> Simple systems have simple problems.
> Complex systems have complex problems.
>
> Worst down-time events I've ever seen always seem to involve a RAID
> system, usually managed by someone who said, "does NOT go down!", who
> believed that complexity was the solution to a problem
>  
How exactly did the machine go down then, I wonder?
> A RAID controller never causes downtime in a system its not installed
> in.  Power distribution boards don't fail on machines that don't have
> them.  Hotplug backplanes don't fail on machines that don't have them.
> (seen 'em all happen).
>  
Flawless logic, sir; I wish courts would apply it in the same way
concerning rapists' genitals and lying politicians' left brain halves (a
study I read suggested the left side is most active when you lie).
>  
>> * this is _very_ important if your machine is hosted where you don't
>> have easy physical access to it. Machines at a colo center would be a
>> very common scenario.
>>    
>
> That is correct... IF that was what we were talking about.  It isn't.
> You keep trying to use the wrong special case for the topic at hand.
>  
I don't think an office firewall should be any less failsafe or easy to
maintain than one at a colo, BUT the colo is for sure more important in
that respect.
> Design your solutions to meet the problem in front of you, not a totally
> unrelated problem.
>  
I don't think they are unrelated.

>  
>>>  RAID is great when you have constantly changing data and you
>>> don't want to lose ANYTHING EVER (i.e., mail server).  When you have a
>>> mostly-static system like a firewall, there are simpler and better ways.
>>>  
>>>      
>> RAID is great for any server.
>>    
>
> WRONG.
> It is good for the right systems in the right places.  There are a lot
> of those places.
> It is great when administered by someone who understands the limitations
> of it.  That, sadly, is uncommon.
>  
OK, maybe it's not so good for someone who doesn't understand what it
does or how to set it up. But that applies to so much more than just
RAID systems.
That's not to say it's always _necessary_. But it's still good to have,
IMHO.

>  
>> So are scsi drives.
>>    
>
> I've been hearing that "SCSI is better!" stuff for 20 years, most of
> that while working in service and support of LOTS of companys' computers.
>
> It *may* be true that SCSI drives are more reliable than IDE drives,
> though I really suspect if it is really true on average, the variation
> between models is probably greater than the difference between
> interfaces.  But that's just the drive, and I'm giving you that.
>
> HOWEVER, by the time you add the SCSI controller, the software and the
> other stuff in a SCSI solution, you have a much more cranky beast than
> your IDE disk systems usually are.  No, it isn't supposed to be that
> way, but experience has shown me that SCSI cards suck, SCSI drivers
> suck, you rarely have the right cables and terminators on hand, and
> people rarely screw up IDE drivers or chips as badly as they do the SCSI
> chips and drivers (and I am most certainly not talking just OpenBSD
> here).  No question in my mind on this.  I've seen too many bad things
> happen with SCSI...none of which that should have...but they did, anyway.
>  
Well, there is something we can agree on. The umpteen different
interface standards are very annoying.
I have not really had any problems with either SCSI card drivers or RAID
controller drivers in Linux or any BSD that I've used, but you may have
had different experiences?
Since SCSI _is_ a more complex system than IDE/SATA, it's not surprising
if those drivers have historically had more bugs in them. Especially
given some manufacturers' stupid non-disclosure BS.

>  
>> If you are a company
>> that loses more money on a few hours (or even minutes) downtime than it
>> costs to invest in proper servers with proper hw raid + scsi disks, then
>> you are ill-advised _not_ to raid all your missioncritical servers. And
>> have backup machines, too!  Preferably loadbalanced.
>>    
>
> No, if controlling downtime is important to you, you have to look at the
> ENTIRE solution, not chant mantras that you don't fully understand about
> tiny little bits of individual computers that make up whole systems
> (note: "system" here being used to indicate much more than one computer).
>  
What do you know about the rest of my systems, eh?  :p
I never said it was about _one_ computer only. I'm just saying that to
me, spending a little more and using RAID/SCSI on an enterprise
firewall, instead of saving a few dollars and using some dual IDE disk
setup, makes more sense. But I'm still CARPing them.

>  
>>> A couple months ago, our Celeron 600 firewall seemed to be having
>>> "problems", which we thought may have been due to processor load.  We
>>> were able to pull the disk out of it, put it in a much faster machine,
>>> adjust a few files, and we were back up and running quickly...and found
>>> that the problem was actually due to a router misconfig and a run-away
>>> nmap session.  Would not have been able to do that with a RAID card.
>>>  
>>>      
>> Next time, you may want to check what the machine is actually doing
>> before you start blaming your hardware.
>> I personally would not trust the OS setup on one machine to run smoothly
>> in any machine not more or less identical to itself as far as the hw
>> goes. Especially not for a production unit.
>>    
>
> Ah, a windows user, I see. ;)
> Understand how OpenBSD works, you will understand that this is not a
> problem.  It is the same kernel, same supporting files installed to the
> same places in the same way and doing the same thing, whether it be on a
> 486 or a P4.  It is just a (very) few config files that are different.
> It's truly wonderful.  It's how things should be.
>  
Hehe, you are right about that: I did try to transplant a Windows disk a
looong time ago and that didn't go very well ;)
I have never transplanted an OpenBSD disk, but I imagine that would go
better, provided you hadn't customized the kernel for the one machine it
was running on.

>  
>> But if you really wanted too, you could move the entire raid array over
>> to a different machine, if that makes you happy.
>>    
>
> Assuming you have practiced and practiced and practiced this process.
> Do it wrong, you can kiss all copies of your data bye-bye, too.  Some
> RAID controllers make it really easy to do this.  Others make it really
> easy to clear your disks of all data...And sometimes, two cards with
> really similar model numbers in machines you thought were really close
> to being the same have really big differences you didn't anticipate.
>  
Good point; I haven't ever done this myself. And I'm hoping it stays
that way.

> Don't get me wrong, RAID has its place, and it has a very good place on
> a lot of systems, maybe even most systems that call themselves servers
> (and if it wasn't for the cost, most systems that call themselves
> workstations, too).  I have a snootload of different types of RAID
> systems around here (and btw, bioctl(8) rocks!).  My firewall runs
> ccd(4) mirroring, in fact (mostly because I'm curious how it fails in
> real life.  All things considered, I much prefer the design I described
> earlier).
>
> But in /this/ case, we are talking about a particular application,
> firewalls in an office.  It doesn't really matter one bit what would be
> more appropriate at a CoLo site if that's not what we are talking about.
>  
actually it does for some installations I do, because I administer them
remotely from another office!

> OpenBSD makes it almost trivial to make entire redundant pairs of
> machines.  Think of it as RAID on steroids...it isn't just redundant
> disks, it is redundant NICs, power supplies, disk controllers, cables,
> processors, memory, cases...everything.  PLUS, it not only helps you
> with your uptime in case of failure, it also makes a lot of other things
> easier, too, such as software and hardware upgrades, so you are more
> likely to do upgrades when needed.  At this point, RAID and redundant
> power supplies and such just make life more expensive and more complex,
> not better.
>  
I very much agree with all this.
The only bad thing here is that you sometimes cannot fit more machines
into the available space.

>
> Last February, I had an opportunity to replace a bunch of old servers at
> my company's branch offices.  About 11 branches got big, "classic"
> servers, about 15 smaller branches got workstations converted into
> servers by adding an Accusys box for simple SATA mirroring.  The big
> branches needed the faster performance of the big server, the small
> branches just needed to get rid of the old servers that were getting
> very unreliable (guess what?  It was the SCSI back planes and the RAID
> controllers that were causing us no end of trouble).  It has been an
> interesting test of "server vs. workstation" and "SATA vs. SCSI".
>  
Really, what kind of raid controllers were they?

> It is actually hard to tell who is ahead...
>     Disk failures have been a close call: about the same number of SCSI
> disks have failed as SATA disks, but the SCSI systems have ten disks vs.
> three for the SATA machines, so there are more SCSI disks, but fewer
> SCSI systems.  You will look at the disk count, the users look at "is my
> system working?".
>     Three SCSI disks have off-lined themselves, but simply unplugging
> and plugging them back in has resulted in them coming back up, and
> staying up.  Scary.
>  
hmms, that would worry me a little too :p .. how old are these disks?

>     Most of the SATA failures have been "clean" failures, though one
> drive was doing massive retries, and eventually "succeeded", so the
> system performance was horrible until we manually off-lined the
> offending drive (which was easy to spot by looking at the disk activity
> lights).
>     One system actually lost data: one of the SCSI systems had a disk
> off-line itself that was not noted by on-site staff, and a week later,
> that drive's mirror failed, too (note: the first drive just off-lined
> itself..no apparent reason, and it happily went back on-line).
> Unfortunately, the second-rate OS on these things lacked something like
> bioctl(4) to easily monitor the machines...  Complexity doesn't save you
> from user error...though it might add to it.
>     The drive failures on the SATA systems immediately results in a
> phone call, "there's this loud beeping coming from my server room!".
>     We went through a month or two where drives seemed to be popping all
> over the place...and since then, things have been very reliable...
>     Working with the RAID system on the SCSI machines is something that
> needs to be practiced...working with the Accusys boxes is simplicity in
> the extreme.
>     The little machines are cheap enough we have three spares in boxes
> waiting to be next-day shipped to anywhere we MIGHT have a problem.
>     The big machines cost a lot of money to next-day ship anywhere, so
> we don't even think of it unless we are sure we got a big problem.
>     Only one machine has been next-day shipped: one of the big ones, at
> a price of about 1/4th the cost of an entire little machine (after the
> dual-disk failures, I figured let's get a new machine on site, get them
> back up, and we'll worry about the old machine later).
>  
sounds like you're working for a pretty massive organization there.. may
I ask which one?
> About eight months into the project, I can say, the performance of the
> big RAID 1+0 systems rock, but I love the simplicity of the little
> machines...Ask me again in three years. :)
>  
heh, I'll put that in my calendar :p

Alec
> Nick.


Re: CPU selection

J.C. Roberts-2
On Thu, 02 Nov 2006 22:03:05 -0800, Alexander Lind <[hidden email]>
wrote:

>> RAID, kiddo.
>> It's more complex.  It is something else that can go wrong.
>> And...it DOES go wrong.  Either believe me now, or wish you believed me
>> later.  Your call.  I spent a lot of time profiting from people who
>> ignored my advice. :)
>>  
>Of course raid are more complex on a hardware level, but that doesn't
>exactly make it more complex for _me_, the user, does it?
>

Alexander,

Yes, it does. Not realizing the increased complexity and risks for the
user just means you drank the koolaid and actually believe the marketing
and advertising nonsense for hardware RAID products. If with *your*
experience you really believe that hardware and firmware never have
serious bugs or catastrophic failures, then you are statistically
overdue for a number of unpleasant surprises.

Here is an interesting question for you which may help you grasp the
concept Nick is preaching; in the event of a nasty failure on a RAID
where you absolutely *must* be able to recover the valuable data, do you
stand a better chance of recovering the data from a hardware RAID
configuration or a software RAID configuration?

Though contrary to the marketing koolaid, the answer is software RAID.
In a hardware RAID you are blindly trusting incompletely documented
hardware and undisclosed firmware. You will *NEVER* have access to the
firmware source code or the chip logic, so you never really know how it
works exactly. In a software RAID configuration (ccd/raidframe/etc), you
have the source code, know exactly how it works and the hardware is far
less complex as well as reasonably well documented in most cases. With
software RAID, at least you have a chance of mounting the raw disks and
piecing things back together manually. The odds of recovery are always
better when things are simple and you actually know how they work.
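
A concrete illustration of that point: with a software mirror, recovery
can begin with nothing more exotic than a read-only mount of one
component, along the lines of

    mount -o ro /dev/wd1d /mnt

(the device name is made up, and whether a component mounts directly
depends on the particular software RAID's on-disk layout; the point is
that the layout is documented and you can reason about it).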

Mindlessly slapping a new disk into a hardware RAID after a disk failure
only works *some* of the time and only for *some* types of failures. If
you're not lucky enough to be in the *some* category, then you'll be
dusting off those outdated backup tapes and updating your resume.
Imagine telling your boss that there is no way to recover the data from
the trashed RAID disks because the vendor refuses to release required
hardware/firmware information.

If you had kept things known and simple by using a software RAID, you
may have had a chance of recovering the company's financial records.

Hardware RAID is fun, fast and useful for some applications but you
should at least understand the additional complexity you're deploying,
the additional risks caused by the complexity and the additional costs
you will bear. When your only concern is reliability then your goal
should be to keep it as simple as feasible. Less complexity and fewer
unknowns not only means fewer things can go wrong but it also means a
greater chance of recovery.

Still not convinced? Let's say a bug is committed to the -CURRENT source
tree in the driver for your hardware RAID card. Since reliability is so
critical to you, surely you have a completely identical hardware setup for
constantly testing your hardware RAID controller with -CURRENT to
prevent that bug from getting into a -RELEASE? Or maybe you went out and
spent the few hundred bucks for an additional RAID controller like the
one you use so you could donate it to one of the developers in the
project who actually work on the driver?

Nope, statistically you're probably a typical user who waits until
release to see if your RAID volumes are hosed by an undiscovered bug.
Luckily, with OpenBSD you have extremely dedicated expert developers
covering up for your short-sightedness.

The path of "Simple, Known and Tested" should be looking really good to
you about now for reliability but if not, then there is really no point
in arguing it any further. Not everyone can provoke Nick into yet
another world class RAID RANT, but those who do darn well ought to learn
something before he pulls out the nail gun again to show you what a
worst case disk failure is really like. (no joke, search the archives).

/JCR


--
Free, Open Source CAD, CAM and EDA Tools
http://www.DesignTools.org


Re: CPU selection

ropers
In reply to this post by alexander lind
On 03/11/06, Alexander Lind <[hidden email]> wrote:
> >> what complexity?
> >>
> >
> > RAID, kiddo.
> > It's more complex.  It is something else that can go wrong.
> > And...it DOES go wrong.  Either believe me now, or wish you believed me
> > later.  Your call.  I spent a lot of time profiting from people who
> > ignored my advice. :)

<snip longer, more detailed argument>

Please allow me to weigh in that Nick is absolutely and completely
right. IMHO you (Alexander) also make some valid points, but they are
mostly those that both Nick and you agree on.

I have learnt what Nick was talking about the hard way.

Some time ago, I inherited a RAID system that caused me a great deal of
grief. Despite me recognizing my own limits and my limited prior RAID
exposure, despite due diligence and doing my homework, despite me
testing things first and trying to be as circumspect as I possibly
could, things still went majorly, horribly wrong.

This wasn't an OpenBSD setup, so it could be seen as off-topic on this
list, but if people are interested, I'd be happy to spell out to you
just what is wrong with commonly found current RAID "technology".

(Part of this actually does have a remote relevance to OpenBSD as it
was a picture book example of just why Adaptec is the scourge of
system administrators everywhere. All of you know this, of course, but
doesn't it always feel great to be vindicated?)

Repeat after me:
"Complexity is the worst enemy of security. Secure systems should be
cut to the bone and made as simple as possible. There is no substitute
for simplicity." (Schneier)

RAID is wonderful in theory.
But it ain't so easy to escape bad RAID products. It can be difficult
to avoid RAID pitfalls. RAID can be surprisingly hard to get right and
unexpectedly easy to screw up.

You'll remember Nick when a screwed up RAID setup bites you.

regards,
--ropers


Re: CPU selection

Rod.. Whitworth-2
On Fri, 3 Nov 2006 11:04:03 +0100, ropers wrote:

>Repeat after me:
>"Complexity is the worst enemy of security. Secure systems should be
>cut to the bone and made as simple as possible. There is no substitute
>for simplicity." (Schneier)
>
>RAID is wonderful in theory.
>But it ain't so easy to escape bad RAID products. It can be difficult
>to avoid RAID pitfalls. RAID can be surprisingly hard to get right and
>unexpectedly easy to screw up.
>
>You'll remember Nick when a screwed up RAID setup bites you.

This may not sound relevant (except to the old wise men) but it is.

I am a pilot (or was until I ran out of time to stay current) and there
is a parallel relating to the case under discussion.

Which would you rather be a passenger in:

A single engine aircraft?
or
A twin engine aircraft?


Guess what? Twins have twice as many engine failures as singles. What
is worse is that, unless the pilot(s) is/are really current and on top
of it, the risk to your neck is worse in a twin with one out than a
single with no power. On some days here in the summer there are twins
that cannot climb out with one donk shut down and feathered.

It's the aviation version of what Bruce Schneier talks about.

The maestro who gave me my multi-engine endorsement had, at that time
(32 years ago) flown the Pacific solo 54 times in singles and he taught
me to figure out when the twin I was flying became a twin. Below that
point he wanted me to treat <his> plane as a single and bend it as
little as possible. We are both still alive.

So I agree with Nick. Unless you need, <really> need, the complexity of
RAID <and> you have proven management skills in disaster recovery with
the variety you are managing, forget it and do better backups.

As far as firewalls are concerned, physical size is not an issue. Pick a
small form factor mobo that will handle the number of packets per
second you need with a good margin and CARP it.

No stinkin' RAID needed. CARP (thanks guys!) is like two planes - not
one plane with two engines. Besides I don't see any RAID for my Compact
Flash on the market. ;-)

Two Soekris or Yawarra Eber units take up less space than a slim
desktop PC, and the combo does exactly what Nick says.

And that was the scenario under discussion. It was not small
enterprise servers, where I run Stardom SATA RAID 1 hardware with
standby hardware of the same rev. AND do backups on offsite media, of
course.

But then I'm even older than Nick........... tho' he's a good lad,
still soaking up experience.
8-)

R/



Simply put:
1> Go find out how MTBF is calculated.
2> Be ready, really ready to handle the one out situation.

From the land "down under": Australia.
Do we look <umop apisdn> from up over?

Do NOT CC me - I am subscribed to the list.
Replies to the sender address will fail except from the list-server.
Your IP address will also be greytrapped for 24 hours after any attempt.
I am continually amazed by the people who run OpenBSD who don't take this advice. I always expected a smarter class. I guess not.


Re: CPU selection

alexander lind
In reply to this post by J.C. Roberts-2
Thanks, I do stand corrected.

Next time I spec out firewalls, I will keep your arguments in mind for
sure, they do make a lot of sense.

Alec

