openbsd clusters

openbsd clusters

Friedrich Locke
Hi folks,

I have been using OpenBSD for about 10 years now. It is my OS of choice.
But due to increasing demand in my working environment I would like to test
cluster technologies using OpenBSD.

Here is what I need:

file systems,
web,
DNS,
LDAP,
email,
Kerberos

I have some ideas about what I could use. For instance, for email service,
I could use openldap+qmail; for Kerberos it is easy to have propagation
performed and to give the KDC service's DNS records in my domain the same
weight.

But for other services I don't know what I could use. An example: I need
a file system that can expand by adding more machines to the network in a
simple way. I was studying OpenAFS, but OBSD 5.1 only supports it on i386,
not amd64. Is there any alternative to it?
Does anybody here use OpenAFS on OpenBSD? Does it scale well? What about
GlusterFS? Which would be the better choice?

What about clustering DNS? Could I use djbdns? And web stuff?

I would like to get in touch with someone from the community who has already
faced such requirements, using OpenBSD of course.

Thanks a lot for your time and cooperation.

Best regards.


Re: openbsd clusters

Stuart Henderson
On 2012-12-22, Friedrich Locke <[hidden email]> wrote:

> Hi folks,
>
> I have been using OpenBSD for about 10 years now. It is my OS of choice.
> But due to increasing demand in my working environment I would like to test
> cluster technologies using OpenBSD.
>
> Here is what I need:
>
> file systems,
> web,
> DNS,
> LDAP,
> email,
> Kerberos
>
> I have some ideas about what I could use. For instance, for email service,
> I could use openldap+qmail,

OpenLDAP+Postfix+Dovecot works pretty well for mail.

>                             for kerberos it is easy to have propagation
> performed and have the same weight in the DNS record for kdc service in my
> domain.
>
> But for other services I don't know what I could use. An example: I need
> a file system that can expand by adding more machines to the network in a
> simple way. I was studying OpenAFS, but OBSD 5.1 only supports it on i386,
> not amd64. Is there any alternative to it?
> Does anybody here use OpenAFS on OpenBSD? Does it scale well? What about
> GlusterFS? Which would be the better choice?

I'm not sure if there's anything really good in this area for OpenBSD.
GlusterFS requires FUSE.

> What about clustering DNS? Could I use djbdns?

You can use any DNS server software, you don't need anything special,
you can just use CARP to get packets to the correct machine (or if you're
in an environment using OSPF then using ospfd to announce a /32 is an
alternative which works well if you want geographical diversity).

I wouldn't recommend anybody use djbdns for a new installation.

>                                                    And web stuff ?

One method is to run CARP on the boxes and just use that to handle
distributing the traffic. Alternatively use a (possibly clustered)
frontend running a proxy (relayd, nginx, varnish, ...) which gives
more options (e.g. serve static files locally and pass requests for
generated pages to a pool of backend machines).
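
As an illustration of the frontend idea (serve static files locally, pass everything else to a pool of backends), here is a minimal Python sketch of the routing decision only. It is not relayd, nginx or varnish configuration; the prefixes, the backend addresses and the plain round-robin policy are all assumptions made up for the example.

```python
from itertools import cycle

# Hypothetical values, not taken from the thread.
STATIC_PREFIXES = ("/static/", "/images/", "/css/")
BACKENDS = ["10.0.0.11:8080", "10.0.0.12:8080", "10.0.0.13:8080"]

# Endless round-robin iterator over the backend pool.
_next_backend = cycle(BACKENDS)


def route(path):
    """Decide where a request goes.

    Returns ("local", None) when the frontend should serve the file itself,
    or ("proxy", backend) when the request should be passed to a backend.
    """
    if path.startswith(STATIC_PREFIXES):
        return ("local", None)
    return ("proxy", next(_next_backend))
```

A real proxy would also health-check the backends and drop dead ones from the pool, which this sketch leaves out.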

> I would like to get in touch with someone from the community who has already
> faced such requirements, using OpenBSD of course.

Apart from filesystems, the things you're asking for are not really
OS-specific.


Re: openbsd clusters

Jiri B-2
On Sat, Dec 22, 2012 at 01:23:12PM +0000, Stuart Henderson wrote:
> > But for other services I don't know what I could use. An example: I need
> > a file system that can expand by adding more machines to the network in a
> > simple way. I was studying OpenAFS, but OBSD 5.1 only supports it on i386,
> > not amd64. Is there any alternative to it?
> > Does anybody here use OpenAFS on OpenBSD? Does it scale well? What about
> > GlusterFS? Which would be the better choice?
>
> I'm not sure if there's anything really good in this area for OpenBSD.
> GlusterFS requires FUSE.

"...or accessed via gfapi client library." So if your app were able to
use this library you could use glusterfs directly without a native POSIX-like
filesystem. Still, how would you back up glusterfs on OpenBSD...?
The same applies to HDFS (Hadoop), doesn't it?

oVirt uses NFS as storage for virtualization hosts and implements its own
logic for checking availability between hosts - SPM. Maybe you could use NFS
and write some stuff around it to guarantee integrity and availability;
in oVirt a host which loses NFS storage is fenced...

IIRC somebody on the list described an NFS-based "clustered" filesystem
using vnd images on NFS, cross-mounted, with RAID on top.

jirib


Re: openbsd clusters

Stuart Henderson
In reply to this post by Stuart Henderson
On 2012-12-22, Stuart Henderson <[hidden email]> wrote:
> I wouldn't recommend anybody use djbdns for a new installation.

Since I had a comment offlist about this I'll elaborate on this a bit
(I used tinydns for many years but don't any more).

- unmaintained upstream: you need to work out which of dozens of
third-party patches are actually needed to fix problems and add important
missing support (TXT/AAAA can only be done as 'generic records' without
patching), and which patches are junk

- awkward nonstandard logging format

- awkward to run with multiple IP addresses, you have to run multiple
daemons one per IP address

- daemontools

- the alternatives available now are much better than the alternatives
available when djbdns was first available

On the plus side, tinydns has a nice method to expire records at a
certain time and a semi-nice way of doing split-horizon DNS, but
this doesn't outweigh the disadvantages for me.
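
To make the 'generic records' point concrete: in the tinydns-data format a generic record line has the shape `:fqdn:n:rdata:ttl`, where n is the numeric record type (28 for AAAA) and rdata may use `\nnn` octal escapes for raw bytes. The Python sketch below builds such a line for an AAAA record. The function name, the default TTL and the choice to escape every byte are assumptions for illustration; check the tinydns-data documentation before relying on the output.

```python
import ipaddress


def aaaa_generic_record(fqdn, address, ttl=86400):
    """Build a tinydns-data generic-record line for an IPv6 address.

    The 16 raw address bytes are written as octal escapes, which is how
    unpatched tinydns-data accepts record types it has no shorthand for.
    """
    packed = ipaddress.IPv6Address(address).packed  # 16 raw bytes
    rdata = "".join(f"\\{byte:03o}" for byte in packed)
    return f":{fqdn}:28:{rdata}:{ttl}"


if __name__ == "__main__":
    print(aaaa_generic_record("www.example.org", "2001:db8::1", 3600))
```

Having to generate lines like this by script for every AAAA record is a fair picture of the awkwardness being described.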


Re: openbsd clusters

David Diggles-2
In reply to this post by Jiri B-2
On Sat, Dec 22, 2012 at 09:12:27AM -0500, Jiri B wrote:

> IIRC somebody on the list described an NFS-based "clustered" filesystem
> using vnd images on NFS cross mounted and RAID on top of it.

Something like pNFS would be ideal http://www.pnfs.com/


Re: openbsd clusters

Friedrich Locke
Does OBSD support it?

On Sat, Dec 22, 2012 at 7:02 PM, David Diggles <[hidden email]> wrote:

> Something like pNFS would be ideal http://www.pnfs.com/


Re: openbsd clusters

Nick Holland
In reply to this post by Friedrich Locke
On 12/22/12 07:54, Friedrich Locke wrote:
...
> But for other services I don't know what I could use. An example: I need
> a file system that can expand by adding more machines to the network in a
> simple way.

In plain English: "I'm not thinking out the design carefully, so I'm
going to rely on fancy shit to haul my ass out of the fire when the
predictable (and not so predictable) happens."

You don't need that for your problem, you need that for the solution you
came up with for your problem.  Your solution is wrong.

You know your needs will change in the future, so build the whole system
around the idea of modular storage and other scalability design features
-- not "unlimited expandable storage".

Chunk your data from the very beginning.  In the case of a mail server,
part of the user's LDAP record indicates the storage unit where it is
stored.

Yes, this is a better design.
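
A minimal Python sketch of the chunking idea, assuming a mail setup where each user's directory record names the storage unit holding the mailbox. A plain dict stands in for LDAP here, and the attribute name `storageUnit`, the unit names and the mount points are all hypothetical.

```python
# Stand-in for the LDAP directory: uid -> attributes.
# "storageUnit" is a made-up attribute name for this example.
DIRECTORY = {
    "alice": {"storageUnit": "store01"},
    "bob": {"storageUnit": "store02"},
}

# One mount point per independent storage unit (hypothetical paths).
UNIT_MOUNTS = {
    "store01": "/mail/store01",
    "store02": "/mail/store02",
}


def mailbox_path(uid):
    """Resolve a user's mailbox through the unit named in the directory."""
    unit = DIRECTORY[uid]["storageUnit"]
    return f"{UNIT_MOUNTS[unit]}/{uid}/Maildir"


def add_unit(name, mount):
    """Bring a new storage unit into service; existing users are untouched."""
    UNIT_MOUNTS[name] = mount


def move_user(uid, new_unit):
    """Point a user at another unit (copying the data is a separate step)."""
    if new_unit not in UNIT_MOUNTS:
        raise KeyError(f"unknown storage unit: {new_unit}")
    DIRECTORY[uid]["storageUnit"] = new_unit
```

Growing the system means calling `add_unit` and pointing new or migrated users at it; retiring an old unit means moving its users one at a time, with no single pool that has to be resized.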

I've seen many designs where the answer was "toss it all in one pool,
let some 'advanced technology' keep my ass out of the fire."  They have
all been total shit.  Usual result: the "advanced technology" gathers
the kindling, splits the logs, lights the fire, and tosses your ass on
the pyre before you ever get around to the first "expansion".  If you
wish to argue that your "problem" is special, and requires One Big Pool
of Storage, feel free to tell me about it (off list), maybe someone's
got one.  More likely, you will be telling me about your SOLUTION which
requires one big pool, not the root problem.  (I'm not above learning
new stuff, but I'm done with assuming most people know something I don't
-- that's something that is really annoying to be wrong about, I'm finding).

Your design should incorporate (among other things):
* initial load handling.
* future load handling improvements.
* future storage upgrade.
* future storage REPLACEMENTS (you want to remove your three year old
storage module in favor of a new one ten times the size, but your six
month old one is still quite good)
* future complete solution replacements. (*)
* the simplest possible solutions that will accomplish the above within
acceptable business frameworks (i.e., not "we'll have our entire IT
staff working a major multi-day holiday because that's the only way we
can accomplish this")

Nick.


(*) if you ever wish to keep a closed source solution OUT of your
operations, this is your magic weapon to use with responsible, thinking
people.  Every closed source solution is built around the idea of
keeping you a captive customer.  But the fact is, if your business is
run well, in 50 years, it can still be around.  You will almost
certainly have to replace entire systems with competing products "some
day" -- your company's success should not be dependent upon a third
party remaining in business.  So, an exit strategy has to be part of any
good system design (even though it almost never is).  How are you going
to scrape your legacy data off your old system and install it into its
replacement?  When the APIs are proprietary, you won't...  Ask your
prospective vendor "If you go bankrupt or otherwise leave the business
next year, how will we move >OUR< data stored in your system to another
product?"  They will start with "We aren't going anywhere", which you
know they would say even if they weren't sure about getting their paychecks
next week.

'course, most people are not thinking about the long-term health of the
company, but the short-term "what can I stuff on my resume on my way out
the door before this blows up"


Re: openbsd clusters

Maxim Bourmistrov-5
:)

A good one!
Nice writing, Nick.

My favorite:

> 'course, most people are not thinking about the long-term health of the
> company, but the short-term "what can I stuff on my resume on my way out
> the door before this blows up"

//mxb


Re: openbsd clusters

Sebastian Neuper
In reply to this post by Nick Holland
On Sat, 22 Dec 2012 22:43:54 -0500
Nick Holland <[hidden email]> wrote:

> On 12/22/12 07:54, Friedrich Locke wrote:
> ...
> > But for other services I don't know what I could use. An example: I need
> > a file system that can expand by adding more machines to the network in a
> > simple way.
>
> in plain English: "I'm not thinking out the design carefully, so I'm
> going to rely on fancy shit to haul my ass out of the fire when the
> predictable (and not so predictable) happens.
>
> You don't need that for your problem, you need that for the solution you
> came up with for your problem.  Your solution is wrong.

So, please let's go into more detail. If you want an OpenBSD file server with a few
terabytes of storage, protected by RAID, that will have to handle
a lot of media files in future and serve them over the network:
what motherboard, CPU, network card and (perhaps) RAID controller would you buy to make sure
that it is well supported by OpenBSD, reliable, easy to maintain and costs less
than 0.5k?

In our company, we purchased a Linux-based media file server (48TB for 40k+) a year ago
and it sucks. Promised features only work sporadically, and to make it work there
are workarounds around workarounds. But I don't want to go into more detail. I think none
of you has heard of Avid or Editshare or works a lot with the Adobe Suite.
Now this server is almost full and we will have to buy an expansion.
Exactly the scenario Nick described.

I have been looking for an OpenBSD solution for my home ever since I first glanced
at our new expensive 'thing'.

But I don't know whether I should follow the blog entry "build a home server
with openbsd 3.9" or the 'howto make a fileserver with openbsd' dated 2 years ago.

So what hardware would you buy for an OpenBSD file server, to make it
fast enough to serve HD video media assets over the network? Which setup is robust,
and tested and proven by yourself?
 
Best, Sebastian.

--
Sebastian Neuper <[hidden email]>


Re: openbsd clusters

Brett Mahar-2
> I have been looking for an OpenBSD solution for my home ever since I first glanced
> at our new expensive 'thing'.
>
> So what hardware would you buy for an OpenBSD file server, to make it
> fast enough to serve HD video media assets over the network? Which setup is robust,
> and tested and proven by yourself?

A couple of years ago I set up a low-end desktop computer with an NFS server and a rum(4). This had no problem streaming video to my laptop, which had an athn(4). Why don't you try something like this for the home network you want to set up, before looking around for specialised hardware?

As a prior poster said, you can set up a few hard drives in a RAID array if you need more storage. Or just buy a second hard drive when your first one fills up, and symlink it into the first.
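
The second-disk-plus-symlink approach can be sketched in a few lines of Python. This assumes the new disk is already formatted and mounted; the function name and the refusal to overwrite an existing path are choices made for the example, not something specified in the thread.

```python
import os


def attach_second_disk(directory_on_new_disk, link_path):
    """Expose a directory on the new disk inside the old disk's tree.

    Creates a symbolic link at link_path pointing at the new directory and
    returns the resolved target, so callers can confirm where data will land.
    """
    if os.path.lexists(link_path):
        raise FileExistsError(f"{link_path} already exists")
    os.symlink(directory_on_new_disk, link_path)
    return os.path.realpath(link_path)
```

For example, `attach_second_disk("/mnt/disk2/video", "/export/media/video2")` (both paths hypothetical) would make the new disk's directory appear under the existing export.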

Brett.


Re: openbsd clusters

Eric Furman-3
In reply to this post by Sebastian Neuper
Not long ago Nick did go into some detail about this very thing.
I don't remember how long ago or what the thread was about,
but you might find it in the archives.
Just search for Nick Holland. Anything you find will be worth
reading in any case. :)


Re: openbsd clusters

Zhang Huangbin
In reply to this post by Friedrich Locke
On Saturday, December 22, 2012 at 8:54 PM, Friedrich Locke wrote:

> For instance, for email service, i could use openldap+qmail


Maybe you can try iRedMail (OpenLDAP + Postfix):
http://www.iredmail.org/install_iredmail_on_openbsd.html 


Re: openbsd clusters

Nick Holland
In reply to this post by Eric Furman-3
On 12/25/12 19:50, Eric Furman wrote:
> Not long ago Nick did go into some detail about this very thing.
> I don't remember how long ago or what the thread was about,
> but you might find it in the archives.
> Just search for Nick Holland. Anything you find will be worth
> reading in any case. :)
>

*blush*
Do not feed The Ego. :)

Probably thinking of this thread:
http://marc.info/?t=117689108200011&r=1&w=2
and my two contributions to it.  A number of other people provided some
good (and some bad) comments, too...read through 'em all.  You get to
decide which are useful and which are not, and what is right and what is
wrong.

Keep in mind that thread is almost six years old...500GB was a big disk
back then.  However, I'm still quite proud of that system.
(and in case you were wondering, my employment ended with that employer
about four months later.  That also makes a great story, but quite
off-topic.  They did replace my system with a proprietary system that
cost many times as much).

Nick.





Re: openbsd clusters

Stefan Sperling-8
On Wed, Dec 26, 2012 at 03:26:43PM -0500, Nick Holland wrote:

> On 12/25/12 19:50, Eric Furman wrote:
> > Not long ago Nick did go into some detail about this very thing.
> > I don't remember how long ago or what the thread was about,
> > but you might find it in the archives.
> > Just search for Nick Holland. Anything you find will be worth
> > reading in any case. :)
> >
>
> *blush*
> Do not feed The Ego. :)

My favourite post of yours is still the nail gun one.
http://www.monkey.org/openbsd/archive/misc/0007/msg01182.html

Perhaps some day there'll be a printed collection :)


Re: openbsd clusters

Johan Beisser
In reply to this post by Nick Holland
On Sat, Dec 22, 2012 at 7:43 PM, Nick Holland
<[hidden email]> wrote:
> On 12/22/12 07:54, Friedrich Locke wrote:
> ...
>> But for other services I don't know what I could use. An example: I need
>> a file system that can expand by adding more machines to the network in a
>> simple way.
>
> in plain English: "I'm not thinking out the design carefully, so I'm
> going to rely on fancy shit to haul my ass out of the fire when the
> predictable (and not so predictable) happens.

Yes and no. Yes, the design is important. No, I actually do have a
need for linear storage that can be easily expanded upon. I could use
a NetApp or similar setup, but then I can't throw more CPU at the
other side of the problem: using the stored data.

So the bigger problem isn't storage space (disk is cheap, after all),
but rather being able to slice and dice the data that's stored on the
system. Processing huge files is much easier when you have a
dozen nodes to do it on.
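
As a rough illustration of spreading one huge file over a dozen nodes, the Python sketch below splits a file of a given size into contiguous byte ranges, one per node. It assumes the data can be cut at arbitrary byte offsets, which is often not true for record-oriented formats; the function name is made up for the example.

```python
def split_ranges(total_size, node_count):
    """Split total_size bytes into node_count contiguous half-open ranges.

    Range sizes differ by at most one byte, and the ranges cover the file
    exactly once with no overlap.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    base, extra = divmod(total_size, node_count)
    ranges = []
    start = 0
    for index in range(node_count):
        # The first `extra` nodes take one additional byte each.
        length = base + (1 if index < extra else 0)
        ranges.append((start, start + length))
        start += length
    return ranges
```

Each node would then read only its own (start, end) range, which is only practical if every node can reach the data, hence the interest in shared or distributed storage.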

I fully agree that being able to later extract and migrate away from
any storage solution is important. Along with that comes migration
paths to new hardware, software, and simple failure recovery (bad
disks, broken node, etc).

Big data takes quite a bit of planning, but it's gotten much easier.
Good thing I don't need to do this quickly...


Re: openbsd clusters

Alvaro Mantilla Gimenez-4
Hi,

  I am following this thread and I should say I consider myself a
total beginner on this kind of subject. With that said, I need to
ask this (probably I am wrong, and I hope I am, because it gives me the
chance to learn from your answers): is there a way that OpenAFS fits
into this design? I am thinking of a group of servers with local
partition schemes like those in Nick's thread, where the servers belong to one
or more OpenAFS cells to give clients linear access plus all the replication
that OpenAFS seems to provide.

  Regards,

      Alvaro


Re: openbsd clusters

Jiri B-2
In reply to this post by Nick Holland
On Wed, Dec 26, 2012 at 03:26:43PM -0500, Nick Holland wrote:

> Probably thinking of this thread:
> http://marc.info/?t=117689108200011&r=1&w=2
> and my two contributions to it.  A number of other people provided some
> good (and some bad) comments, too...read through 'em all.  You get to
> decide which are useful and which are not, and what is right and what is
> wrong.
>
> Keep in mind that thread is almost six years old...500GB was a big disk
> back then.  However, I'm still quite proud of that system.
> (and in case you were wondering, my employment ended with that employer
> about four months later.  That also makes a great story, but quite
> off-topic.  They did replace my system with a proprietary system that
> cost many times as much).

The only use case I can imagine that does not fit this setup of small
partitions combined with filesystem structure and symlinks is this one:

   'unrestricted space offered directly to a user via ftp/sftp/ssh'

As we cannot predict when or how fast he/she would fill the storage,
moving the user's whole data set to a bigger unit later is slow and still not
a solution.

It seems to me that giving a user direct access to his data root dir
while promising him no space restriction is not possible.

On the other hand, if the user does not require one big directory for
his data, then the filesystem layout could be hidden from the user and the
setup mentioned would fit. Instead of direct ftp/sftp, though, the user would use
some specialized client to get his files, and the setup would use some UUID and
keep track of each UUID and its owner (or something similar).

Any comments? Do exists some "proxies" which would mirror files immediately
when a user is uploading them via some common protocol? And when the user
deletes some of his files the "proxy" would delete the copy? (rsyncing
later regularly could be quite problematic if you would have many users
uploading for example a couple of GB files...).

jirib


Re: openbsd clusters

Alvaro Mantilla Gimenez-4
Isn't this what you are trying to accomplish?

http://docs.openafs.org/AdminGuide/index.html#HDRWQ57.html#HDRWQ59

and then, adding space:

http://docs.openafs.org/AdminGuide/index.html#HDRWQ130.html

and if you need to move the volume to another partition/bigger disk:

http://docs.openafs.org/AdminGuide/index.html#HDRWQ177.html#HDRWQ179

"Volumes are easy to move between partitions, on the same or different
machines, because they are by definition smaller than a partition. Perhaps
the most common reasons to move volumes are to balance the load among file
server machines or to take advantage of greater disk capacity on certain
machines. You can move volumes as often as necessary without disrupting
user access to their contents, because the move procedure makes the
contents unavailable for only a few seconds. The automatic tracking of
volume locations in the Volume Location Database (VLDB) assures that access
remains transparent."

Regards,

    Alvaro



Re: openbsd clusters

Jiri B-2
On Thu, Dec 27, 2012 at 05:28:24PM -0600, Alvaro Mantilla Gimenez wrote:

> Is not this what you are trying to accomplish?
>
> http://docs.openafs.org/AdminGuide/index.html#HDRWQ57.html#HDRWQ59
>
> and then, adding space:
>
> http://docs.openafs.org/AdminGuide/index.html#HDRWQ130.html
>
> and if you need to move the volume to another partition/bigger disk:
>
> http://docs.openafs.org/AdminGuide/index.html#HDRWQ177.html#HDRWQ179
>
> "Volumes are easy to move between partitions, on the same or different
> machines, because they are by definition smaller than a partition. Perhaps
> the most common reasons to move volumes are to balance the load among file
> server machines or to take advantage of greater disk capacity on certain
> machines. You can move volumes as often as necessary without disrupting
> user access to their contents, because the move procedure makes the
> contents unavailable for only a few seconds. The automatic tracking of
> volume locations in the Volume Location Database (VLDB) assures that access
> remains transparent."

The discussion, though, was about Nick's "old" designs, which do not use
NFS or AFS.

jirib


Re: openbsd clusters

Nick Holland
In reply to this post by Jiri B-2
On 12/27/12 17:25, Jiri B wrote:

> On Wed, Dec 26, 2012 at 03:26:43PM -0500, Nick Holland wrote:
>> Probably thinking of this thread:
>> http://marc.info/?t=117689108200011&r=1&w=2
>> and my two contributions to it.  A number of other people provided some
>> good (and some bad) comments, too...read through 'em all.  You get to
>> decide which are useful and which are not, and what is right and what is
>> wrong.
>>
>> Keep in mind that thread is almost six years old...500GB was a big disk
>> back then.  However, I'm still quite proud of that system.
>> (and in case you were wondering, my employment ended with that employer
>> about four months later.  That also makes a great story, but quite
>> off-topic.  They did replace my system with a proprietary system that
>> cost many times as much).
>
> Only setup I can imagine which cannot fit into this setup of small
> partitions combined with filesystem structure and symlinks is this one
>
>    'unrestricted space offered directly to a user via ftp/sftp/ssh'
>
> As we cannot predict how fast and when he/she would fit the storage,
> moving later user's whole data to bigger one is slow and still not
> a solution.
>
> It seems to me that giving a user direct access to his data root dir
> while telling him about no space restriction is not possible.

I would say that's true, period.  Fancy stuff only lets you push off the
problem to a bigger number, but you always have some finite storage
available, and if given no limits, no checks, no costs, you WILL fill it
eventually...unless you have an inbound pipe that's slower than your
procurement process for new storage (and I'm going to argue, that's
cheating! :)

If your task definition is "give a user direct access to unlimited
storage", well, yes... I may not have the greatest solution in the world
for you...but then, you crafted the question in a non-business savvy way
to stump me (me: "you don't need unlimited storage for most real world
tasks"  you: "My real world task is to give someone unlimited storage")
-- you are ignoring all laws of economics, and your solution WILL have
serious issues because of that (why do we have a problem with spam?
Because it's painless and risk-free for the sender.  Why are we seeing a
resurgence in telephone-based scams?  Because it's become painless and
risk-free for the scammer.  Why will your task blow up in your face in
predictable ways?  Because there's no cost to the consumer of your disk
space.  Econ 101).

But still...this is not a statement of an actual problem to be solved
("I need to be able to upload lots of huge video files for exchange with
other people"), but a proposed solution ("unlimited direct access to
file systems").  So I'm not going to admit defeat. :)

>
> On the other hand, if the user would not require one big directory for
> his data, then filesystem layout could be hidden to the user and mentioned
> setup would fit - although instead of direct ftp/sftp the user would use
> some specialized client to get his files, the setup would use some UUID and
> keep track of UUID and his owner (or something similar).
>
> Any comments? Do exists some "proxies" which would mirror files immediately
> when a user is uploading them via some common protocol? And when the user
> deletes some of his files the "proxy" would delete the copy? (rsyncing
> later regularly could be quite problematic if you would have many users
> uploading for example a couple of GB files...).

Actually, rsyncing is fantastic for huge files...it can verify quickly
and, where it finds mismatches, sync as fast as the hardware allows.  With
lots of small files, you start running into file system overhead.
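
A minimal sketch of why big files are cheap to verify, assuming a much
simplified scheme: both sides hash fixed-size blocks and only the blocks
whose hashes differ get sent. This is not how rsync is actually implemented
(rsync uses a rolling checksum so it can also match data that has shifted
position), and the names block_digests and blocks_to_send are made up for
the illustration.

```python
import hashlib


def block_digests(data: bytes, block_size: int) -> list[str]:
    """Return one SHA-256 digest per fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def blocks_to_send(old: bytes, new: bytes, block_size: int = 4096) -> list[int]:
    """Indices of blocks in `new` that differ from the block at the same
    offset in `old`, or that lie beyond the end of `old`."""
    old_d = block_digests(old, block_size)
    new_d = block_digests(new, block_size)
    return [
        i for i, digest in enumerate(new_d)
        if i >= len(old_d) or old_d[i] != digest
    ]
```

For a multi-GB file with a few changed blocks, only those block indices
come back; everything else is confirmed by comparing short digests.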

If you look at some of the Big File Sharing Services, I think you will
find this "problem" has been solved....and considering the fact that
many of them offer some service for "free", or at least a fraction of
the price per gigabyte that many high-end solutions give you, I think it
is safe to say it is NOT being done with high-end SANs, but cheap
commodity hw and disks (and low maintenance solutions, too).

Realistically, you will have upload limits.  2GB is a threshold above
which HTTP starts having issues and some file systems start having
issues (note: USB devices are still often formatted with variations of
the FAT file system, which cap a single file at 2GB for FAT16 or 4GB for
FAT32).

So...you let people upload to a "temp" area.  If you accept 2GB as an
upload limit, a 500GB upload area would cover a fair number of uploads.
If you want a 100GB upload limit, well...500GB will fill rapidly, but you
can have a lot of these "temp" areas, and a 2TB file system isn't so
crazy anymore.

Your user uploads to this area; the received file name is uniquely
generated and tracked by a database.  When the upload is complete, you
give the user some kind of "key" to identify THEIR file (maybe just the
original name, when combined with their user ID), and the database tracks
it.  After the upload is complete, the system identifies the size of the
file, looks around in its storage chunks for a place to put it, and
slowly (to not tax the disk I/O) copies it to that location, maybe again
to a backup location on a different physical device, SHA256's the
original file and the two new copies, updates the database with the new,
"permanent" locations, and purges the file from the "temp" upload area.
Note that the file will remain available at every step of the process
after the original upload -- if the first download request is made before
the "move" is complete, it is served from the temp area.

As new storage is bolted on, files can be (slowly) pushed around from old
storage to new storage to allow pathetic looking old storage to be shut
down in favor of shiny new storage.  Files can be pushed around on
existing storage to better optimize the available space in predefined
chunks, too.
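
The flow above could be sketched roughly like this. It is only an
illustration under assumptions, not a description of any real system: the
SQLite table layout and the function names (open_db, receive, migrate,
locate) are invented, "storage chunks" are plain directories, the chunk
choice is simply the first two with enough free space, and the slow,
I/O-friendly copy is left out.

```python
import hashlib
import shutil
import sqlite3
import uuid
from pathlib import Path


def sha256_of(path: Path, bufsize: int = 1 << 20) -> str:
    """Hash a file in pieces so a huge upload never sits in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        while chunk := fh.read(bufsize):
            digest.update(chunk)
    return digest.hexdigest()


def open_db(db_path: str = ":memory:") -> sqlite3.Connection:
    """Create the (hypothetical) tracking table if it does not exist."""
    db = sqlite3.connect(db_path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        " owner TEXT, name TEXT, stored_as TEXT, sha256 TEXT,"
        " location TEXT, backup TEXT, PRIMARY KEY (owner, name))"
    )
    return db


def receive(db, temp_dir: Path, owner: str, name: str, stream) -> Path:
    """Store an upload in the temp area under a generated name and record it."""
    stored_as = uuid.uuid4().hex
    target = temp_dir / stored_as
    with target.open("wb") as fh:
        shutil.copyfileobj(stream, fh)
    db.execute(
        "INSERT INTO files VALUES (?, ?, ?, ?, ?, NULL)",
        (owner, name, stored_as, sha256_of(target), str(target)),
    )
    db.commit()
    return target


def locate(db, owner: str, name: str) -> Path:
    """Where the file can be served from right now (temp or permanent)."""
    row = db.execute(
        "SELECT location FROM files WHERE owner=? AND name=?", (owner, name)
    ).fetchone()
    return Path(row[0])


def migrate(db, owner: str, name: str, chunks: list[Path]) -> tuple[Path, Path]:
    """Copy a staged file to two chunks, verify both, then purge the temp copy."""
    stored_as, digest, temp = db.execute(
        "SELECT stored_as, sha256, location FROM files WHERE owner=? AND name=?",
        (owner, name),
    ).fetchone()
    temp = Path(temp)
    size = temp.stat().st_size
    roomy = [c for c in chunks if shutil.disk_usage(c).free > size]
    if len(roomy) < 2:
        raise RuntimeError("need two storage chunks with enough free space")
    primary, backup = (c / stored_as for c in roomy[:2])
    for copy in (primary, backup):
        shutil.copyfile(temp, copy)
        # Compare each copy against the digest recorded at upload time.
        if sha256_of(copy) != digest:
            copy.unlink()
            raise IOError(f"checksum mismatch on {copy}")
    # Only after both copies verify does the database point away from temp.
    db.execute(
        "UPDATE files SET location=?, backup=? WHERE owner=? AND name=?",
        (str(primary), str(backup), owner, name),
    )
    db.commit()
    temp.unlink()
    return primary, backup
```

Because the database row is updated only after both copies verify, a
download that arrives mid-move is still served from the temp area by
locate().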

Yes, this is starting to get complex, but it is a complexity of simple,
understood tools, which I have more faith in than AFS, which seems to be
understood by few, other than "whatever you want to do, AFS probably
does it.  I think.  dunno, never actually used it".  It does appear that
a successful AFS implementation does require a fair amount of
planning...oh, there's that concept again...


Funny thing...  I actually DO have an app which works out very nicely
with ZFS, and not so well with traditional file systems for one silly
reason.

The app: a disk-to-disk backup system
Each "target" machine (system being backed up) gets its own ZFS file
system in a big backup pool. rsync is used to back up the remote machines
to this ZFS file system, then after a backup, a ZFS snapshot is made,
and the snapshot is sent over to a second machine.  Old snapshots are
deleted after the designated number of backups.

Why ZFS works better than a standard file system: The ability to bolt on
additional storage units is nice, though practically speaking, all the
storage this system will ever have is in the box already, I just add 2TB
chunks as needed, so I really don't count this as a huge win.  The big
win?  "df -h" tells me quickly which backup is chewing the most space,
so if some machine is backing up hundreds of GB every night but should
be mostly static data, I can go after that machine and figure out what
is wrong.  "du -s" is how I'd have to do this on a traditional file
system, and that's very slow when you have tens of thousands of files.
(the ZFS snapshots look like a win, but I've used the rsync --link-dest
option to get the same effect.  A ZFS knowledgeable coworker of mine
argued that the ZFS snapshotting would be more efficient than the
--link-dest option, but I'm not seeing it -- the size of the snapshot
being pushed between machines is often much bigger than I'd expect,
and often much bigger than the incremental rsync transfer.  OK, the ability
to push the snapshot between machines is cool, but considering how fast
these backups usually are, one could just run full rsync backups twice
very easily, and sending ZFS snapshots between machines has some
seriously odd quirks).
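
For the --link-dest variant, a rough sketch of the rotation might look like
the following. The directory layout (one dated directory per backup under
a per-machine root) and the function names are assumptions for the
example; the only rsync options used are -a, --delete and --link-dest.
The command is built but not executed here, so you would hand it to
subprocess.run yourself.

```python
import shutil
from pathlib import Path


def rsync_command(source: str, backup_root: Path, stamp: str) -> list[str]:
    """Build an rsync call that hard-links unchanged files to the newest
    existing backup, so each run only stores what changed."""
    previous = sorted(p for p in backup_root.iterdir() if p.is_dir())
    cmd = ["rsync", "-a", "--delete"]
    if previous:
        # An absolute path avoids --link-dest being read relative to the destination.
        cmd.append(f"--link-dest={previous[-1].resolve()}")
    cmd += [source, str(backup_root / stamp)]
    return cmd


def prune(backup_root: Path, keep: int) -> list[Path]:
    """Delete all but the newest `keep` backup directories; return the removed ones."""
    backups = sorted(p for p in backup_root.iterdir() if p.is_dir())
    doomed = backups[:-keep] if keep > 0 else backups
    for old in doomed:
        shutil.rmtree(old)
    return doomed
```

This relies on the directory names sorting in date order (for example
2012-12-27), which is why ISO-style stamps are assumed.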

Downside: well, this backup system seems to wedge its FreeBSD-based ZFS
very often -- it once managed an uptime of almost 50 days, but that was an
extremely successful run.  ZFS is quirky as hell; even if it were
BSD-licenced, I don't think it meets OpenBSD's standards of "just
works"...it's cool, it's nifty, it sets back the computer industry about
20 years in terms of twisting knobs to get basic functionality... and it
generates mysterious errors for which the official fix is "rebuild your
file system and restore from backup"... um...this IS the backup!

At home, my d2d backup system is running OpenBSD, with finite partitions
and I'm not about to change that.  It runs from [upgrade|power event] to
[upgrade|power event], on old hardware and big disks.

Really, getting quite off-topic here.  If you want to have a buzzword
compliant system, go for it.  It will probably move you further ahead in
the IT business than my conservative approaches will move me.  It isn't
about what works best, it's about what you can add to your resume for
your next job (and when I learn to live by this rule, I'll be a much
happier person).

Nick.