Expected throughput in an OpenBSD virtual server

Expected throughput in an OpenBSD virtual server

Sjöholm Per-Olov
Hi "Misc"

# Background #

I have done some fun lab experiments with a virtual, fully patched OpenBSD 4.9
firewall on top of SuSE Enterprise Linux 11 SP1 running KVM. The virtual
OpenBSD gets 512 MB RAM and one core from a system with two quad-core Xeon
5504 CPUs (2 GHz) sitting in a Dell T410 tower server. I have given the
OpenBSD firewall two dedicated "Intel PRO/1000 MT (82574L)" physical NICs via
PCI passthrough, so OpenBSD sees and uses the real NICs (they are then
unusable to Linux as they are unbound).

I have not measured packets per second, which of course is more relevant. But
as I try to tweak the speed, I don't care whether I measure packets or Mbit/s,
as long as my tweaks give a higher value during the next test. Going in on one
physical NIC and out on the other, with my small ruleset that uses keep state
everywhere, gives me about 400 Mbit/s. AFP, SMB, SCP and NFS give similar
results (I copy large files, a few gigabytes each). I started with a lower
value and, after a few tweaks in sysctl.conf, ended up with this speed of 400
Mbit/s. At this speed I can see that interrupts in the firewall simply eat all
resources. I have no "ip.ifq.drops" or any other drops that I am aware of...


# Question #

I now simply wonder if I can increase this speed. I did one test and replaced
these two physical desktop Intel NICs with a dual-port server adapter (also
Intel, 82546GB). I was interested to see whether a dual-port, more expensive
server adapter could lower my interrupt load. However, OpenBSD yelled
something about "unable to reset PCI device", so I went back to the two
desktop adapters. These low-price desktop adapters, in an Intel i7 desktop
workstation, download over SMB from my server at 119 MByte/s and fill up the
gigabit pipe. So they cannot be too bad...


As PF cannot use SMP, is the only way to bump up the firewall throughput (in
this scenario) to increase the speed of the processor core (i.e. change
server)? Or are there any other interesting configs to try?


Regards

/Per-Olov
--
GPG keyID: 5231C0C4
GPG fingerprint: B232 3E1A F5AB 5E10 7561 6739 766E D29D 5231 C0C4
GPG key:
http://wwwkeys.eu.pgp.net/pks/lookup?op=get&search=0x766ED29D5231C0C4


Re: Expected throughput in an OpenBSD virtual server

Tomas Bodzar-4
Try OpenBSD outside of KVM on real HW and you will see where the bottleneck
is. Anyway, getting 400 Mbit/s under virtualization seems pretty fine; or try
to compare with OpenBSD running in VMware, as there is good support for that
use case.

Of course security is around zero in this scenario, but as you said,
you're doing it for fun :-)

On Mon, Aug 22, 2011 at 2:03 AM, Per-Olov Sjöholm <[hidden email]> wrote:
> [...]

Re: Expected throughput in an OpenBSD virtual server

Sjöholm Per-Olov
On 22 aug 2011, at 07:45, Tomas Bodzar wrote:

> Try OpenBSD outside of KVM on real HW and you will see where's the
> bottleneck. Anyway getting 400Mbit/s under virtualization seems pretty
> fine or try to compare with OpenBSD running in VMware as there's fine
> support for that use.
>
> Of course security is around zero in this scenario, but as you said
> you're doing it for fun :-)
>
> On Mon, Aug 22, 2011 at 2:03 AM, Per-Olov Sjöholm <[hidden email]> wrote:
>> [...]


Please don't top-post.

VMware is commercial software, which I avoid if I can. Also, Linux guests with
virtio drivers give much better performance on the same hardware when using
KVM instead of VMware, and there is no need for VMware tools as everything is
in the stock Linux kernel.

I cannot at this time give a fair test running it on the same hardware as a
physical server instead of a virtual one, as the KVM host runs 10 other
servers. I have, however, tested OpenBSD on other hardware, which ended up
with similar performance. That was a physical box with gigabit Intel NICs
(82541 cards) but a weak quad-core Intel Atom 1.6 GHz processor running the
SMP kernel. At the bottleneck speed there were 100% interrupts at around 400
Mbit/s (same test files and protocols, to be able to give a fair comparison).
Maybe the Intel Atom at 1.6 GHz can be compared to a Xeon 5504 core at 2 GHz?
I am not a processor guru. Anyone?


Regarding security, which you say is "around zero": yes, this is a lab
experiment. But maybe you should say "increased risk", which is a fairer
statement. I have not heard of anyone who managed to hack a scenario like this
in VMware or KVM. Also note that the host OS in my case cannot even see these
devices, as they are unbound. From my point of view it's like the argument
over WiFi, where people say you should use WPA2 with AES to be secure. But the
fact is that standard old WPA without AES, with a reasonable key length (20+
chars), has not been broken by anyone in the world yet (that we know of). One
person claims he managed to break a part of it in a lab. So... WPA = secure,
better performance and better compatibility. If I were NASA or the DoD I would
probably avoid WPA, as someone someday will of course break it; otherwise
not...



So the question remains. Is it likely that a faster CPU core will give better
performance (not that I need it; I'm just doing some lab experiments here)? Is
a faster CPU the best or only way to increase throughput? Of course we assume
the OS tweaks are OK and that reasonable NICs are used. Is there a plan to
change the interrupt handling model in OpenBSD to device polling in future
releases?




Please don't turn this into a security thread from now on, as that is not the
main purpose.


/Per-Olov

A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing in e-mail?


Re: Expected throughput in an OpenBSD virtual server

Alexander Hall-3
On 08/22/11 10:59, Per-Olov Sjöholm wrote:

> Q: What is the most annoying thing in e-mail?

Rants.

Reply | Threaded
Open this post in threaded view
|

Re: Expected throughput in an OpenBSD virtual server

Tomas Bodzar-4
In reply to this post by Sjöholm Per-Olov
On Mon, Aug 22, 2011 at 10:59 AM, Per-Olov Sjöholm <[hidden email]> wrote:

> [...]
> Plz, don't top post

sorry. Sometimes I forgot because here are different rules.

>
> Vmware is commercial software = avoid if I can. Also Linux guests with
> virtio drivers gives much better performance on the same hardware if using
> KVM instead of Vmware. Also, no need for vmware tools as everything is in
> stock Linux kernel.
>
> I cannot at this time give a fair test running it on the same hardware but
> as a physical server instead of a virtual one. This as the KVM host runs 10
> other servers. I have however tested the OpenBSD on another hardware which
> ended up with similar performance. That was on a physical box with Gig
> Intel Nics (82541 cards) but on a weak Quad core Intel Atom 1.6GHz
> processor running the SMP kernel. At the bottle neck speed there was 100%
> interrupts at around 400Mbit (same tested files and protocols to be able to
> give a fair comparison). Maybe the Intel atom 1.6 can be compared to a Xeon
> 5504 core on 2GHz ??? I am not a processor guru. Anyone??

http://marc.info/?l=openbsd-misc&m=126204017310569&w=2

>
>
> regarding security which you say is "around zero". Yes this is a
> laboration. But maybe you should say increased risk which is a more fair
> statement. I have not heard of anyone that managed to hack a scenario like
> this in VMware or KVM. Also note that the host OS itself in my case cannot
> even see these devices as they are unbound. From my point of view it's
> like the race on WiFi where people say you should use WPA2 with AES to be
> secure. But the real fact is that standard old WPA without AES and with a
> reasonable key length (20+ chars) have not been broken by anyone in the
> world yet (what we know). One person claims he manage to break a part of
> it in a lab. So... WPA = secure, better performance and better
> compatibility. If I was Nasa or DoD I would probable avoid WPA as someone
> someday of course will break it, otherwise not...
>
>
>
> So the question remains. Is it likely that a faster cpu core will give
> better performance (not that I need it. Just doing some laborations here).
> Is a faster CPU the best / only way to increase throughput. Of course we
> assume the OS tweak is ok and that reasonable NIC:s are used. Is there a
> plan to change the interrupt handling model in OpenBSD to device polling
> in future releases?

Intel cards are probably the best option on OpenBSD, judging by what a lot of
people here use. A better bus and CPU will help for sure. You may find this
thread useful too:
http://marc.info/?l=openbsd-misc&m=129839483317022&w=2

>
>
>
>
> plz don't make this thread a security one from now on as this is not the
> main purpose.
>
>
> /Per-Olov
>
> A: Because it messes up the order in which people normally read text.
> Q: Why is top-posting such a bad thing?
> A: Top-posting.
> Q: What is the most annoying thing in e-mail?


Re: Expected throughput in an OpenBSD virtual server

Daniel Gracia
In reply to this post by Sjöholm Per-Olov
AFAIK, the OpenBSD kernel is not designed with any form of virtualization toy
in mind, so don't even try to figure performance numbers out of it. They will
be plain wrong.

As http://www.openbsd.org/faq/faq6.html states, there's little you can tweak
to improve your numbers; just get a nicely clocked CPU with a good-sized
cache and give it some loving.

If OpenBSD doesn't satisfy you as is, recode it or stay apart, as you like.

Good luck!

On 22/08/2011 2:03, Per-Olov Sjöholm wrote:

> [...]


Re: Expected throughput in an OpenBSD virtual server

Stuart Henderson
In reply to this post by Tomas Bodzar-4
>> Plz, don't top post
>
> sorry. Sometimes I forgot because here are different rules.

Just try and make your emails look nice and easy to read if you want
other people to read them, especially if you're asking others for help.
Before you hit send, read through your email, if it doesn't look good,
re-edit until it does.

A mess of hundreds of lines of irrelevant quotes with poor line-wrapping
is always hard to read, whether your text is written at the top, the bottom,
or interspersed with the quoted text.


Re: Expected throughput in an OpenBSD virtual server

Sjöholm Per-Olov
In reply to this post by Daniel Gracia
On 22 aug 2011, at 12:09, Daniel Gracia wrote:
> AFAIK, OpenBSD kernel is not designed accounting for any form of
> virtualization toy, so don't even try figuring performance numbers out of
> it. These will be plain wrong.
>
> As http://www.openbsd.org/faq/faq6.html states, there's little you can
> tweak to improve your numbers; just get a nice-clocked, good cache-sized
> CPU and give it some loving.
>
> If OBSD doesn't satisfies you as is, recode it or stay appart, as you like.
>
> Good luck!
>
> On 22/08/2011 2:03, Per-Olov Sjöholm wrote:
>> [...]


> AFAIK, OpenBSD kernel is not designed accounting for any form of
> virtualization toy, so don't even try figuring performance numbers out of
> it. These will be plain wrong.

Why is that? The speed so far seems good enough for a virtual firewall with
this 2 GHz CPU core. Whether you use a virtual or a physical server, you
always want to get the most out of it. I am NOT comparing with a physical
server at all. I want to try to maximize the throughput and see what I can
get out of it as a virtual firewall test. The same applies if you use a
physical server. You can hit the limit and get 100% interrupts with both a
physical and a virtual server, right? I didn't ask for a comparison with a
physical server... I asked what more I can do with it under these
circumstances...


> As http://www.openbsd.org/faq/faq6.html states, there's little you can
> tweak to improve your numbers; just get a nice-clocked, good cache-sized
> CPU and give it some loving.

The FAQ you refer to seems to be of no use at all and is totally unrelated to
this post.



But if you can give hints on how to decrease the interrupt load, I am all
ears. As I see it, if the interrupt handling model in OpenBSD changed to a
polling one, you could maybe increase the throughput at the same processor
speed (just me guessing, though). But the fact is that it is not polling
today. So what can I do with what we have?

Is pure CPU speed the only way? Or is it possible to decrease the interrupt
load with even better NICs?


Regards
/Per-Olov


Re: Expected throughput in an OpenBSD virtual server

Stuart Henderson
> But if you can give hints of how to decrease the interrupt load I am all ears.
> As I see it, if the interrupt handling model i OpenBSD would change to a
> polling one u could maybe increase the throughput at the same processor speed
> (just me guessing though). But now the fact is that it is not polling. So what
> can I do with what we have....

Polling is one mechanism to ensure you aren't handling interrupts all the
time, so that userland remains responsive even when the machine is under
heavy network load. OpenBSD has another way to handle this: MCLGETI.

> Is pure cpu speed the only way? Or is it possible to decrease the interrupt
> load with even better NIC:s?

Here are some things that might help:

- faster CPU
- larger CPU cache
- faster RAM
- reduced overheads (things like switching VM contexts while handling
  packets are not going to help matters)
- improved code efficiency

have you tried -current?


Re: Expected throughput in an OpenBSD virtual server

Sjöholm Per-Olov
On 22 aug 2011, at 22:04, Stuart Henderson wrote:
>> But if you can give hints of how to decrease the interrupt load I am all
>> ears. As I see it, if the interrupt handling model i OpenBSD would change
>> to a polling one u could maybe increase the throughput at the same
>> processor speed (just me guessing though). But now the fact is that it is
>> not polling. So what can I do with what we have....
>
> polling is one mechanism to ensure you aren't handling interrupts all the
> time, so you can ensure userland remains responsive even when the machine
> is under heavy network load. OpenBSD has another way to handle this,
> MCLGETI.


With polling, if I get it right, the context switch overhead is mostly
avoided because the system can choose to look at the device when it is
already in the right context. The drawback could be increased latency in
processing events in a polling model. But according to what I have read, the
latency is reduced to very low values by raising the clock interrupt
frequency. They say polling is better from an OS "time spent on device"
control perspective. Note that I am not a pro in this area, but will for sure
look deeper...

MCLGETI? Is it in if_em.c, if I want to see how it is implemented?

>
>> Is pure cpu speed the only way? Or is it possible to decrease the
>> interrupt load with even better NIC:s?
>
> here are some things that might help:
>
> - faster cpu
> - larger cpu cache
> - faster ram
> - reduce overheads (things like switching VM context while handling
> packets is not going to help matters)
> - improving code efficiency
>
> have you tried -current?
>



I tried to share and use the same interrupt for my network ports, as I have a
guess it could be a boost, but the BIOS did not do what I wanted...
Interrupts could be shared, but not between the ports I wanted. I simply did
not understand the interrupt allocation scheme in my Dell T410 tower server.

I have not tried -current, but will try it as soon as I can. Also, I will try
to do some experiments with the CPU speed of the core the OpenBSD virtual
machine has, to see how the interrupts and throughput are related to the CPU
speed of the allocated core.


Tnx

/Per-Olov


GPG keyID: 5231C0C4
GPG fingerprint: B232 3E1A F5AB 5E10 7561 6739 766E D29D 5231 C0C4
GPG key:
http://wwwkeys.eu.pgp.net/pks/lookup?op=get&search=0x766ED29D5231C0C4


Re: Expected throughput in an OpenBSD virtual server

Christer Solskogen-3
In reply to this post by Stuart Henderson
On Mon, Aug 22, 2011 at 10:04 PM, Stuart Henderson <[hidden email]> wrote:
> - faster ram

Are you sure about that? In almost every benchmark I've seen, fast RAM makes
almost no difference. I would be delighted if what I've been reading is wrong
:-)

--
chs,


Re: Expected throughput in an OpenBSD virtual server

Claudio Jeker
On Mon, Aug 22, 2011 at 10:53:05PM +0200, Christer Solskogen wrote:
> On Mon, Aug 22, 2011 at 10:04 PM, Stuart Henderson <[hidden email]> wrote:
> > - faster ram
>
> Are you sure about that? Almost every benchmark I've seen, fast ram
> has almost nothing to say. I would be delighted if what I've been
> reading is wrong :-)
>

Yes, memory speed matters a lot. DMA goes into main memory, which needs to be
read into the cache when the received packet is accessed. Having the memory
close to the CPU and on fast buses helps in that regard. Big caches will do
the rest.

--
:wq Claudio


Re: Expected throughput in an OpenBSD virtual server

Claudio Jeker
In reply to this post by Sjöholm Per-Olov
On Mon, Aug 22, 2011 at 10:49:47PM +0200, Per-Olov Sjöholm wrote:

> On 22 aug 2011, at 22:04, Stuart Henderson wrote:
> >> But if you can give hints of how to decrease the interrupt load I am
> >> all ears. As I see it, if the interrupt handling model i OpenBSD would
> >> change to a polling one u could maybe increase the throughput at the
> >> same processor speed (just me guessing though). But now the fact is
> >> that it is not polling. So what can I do with what we have....
> >
> > polling is one mechanism to ensure you aren't handling interrupts all
> > the time, so you can ensure userland remains responsive even when the
> > machine is under heavy network load. OpenBSD has another way to handle
> > this, MCLGETI.
>
>
> With polling if I get it right the context switch overhead is mostly
> avoided because the system can choose to look at the device when it is
> already in the right context. The drawback could be increased latency in
> processsing events in a polling model. But according to what I have read,
> the latency is reduced to a very low low values by raising the clock
> interrupt frequency. They say polling is better from a OS "time spent on
> device" control perspective. Note that I am not a pro in this area, but
> will for sure look deeper...

Polling only works reliably at insane HZ settings, which will cause other
issues in other places (some obvious, some not so obvious). In the end,
polling is a poor man's interrupt mitigation (which is also enabled on em(4),
btw.): instead of using the interrupt of the network card, you use the
interrupt of the clock to process the DMA rings. Polling does not gain you
much on good modern HW.
 
> MCLGETI ?? Is it in if_em.c if I want to see how it is implemented?
>

Yes. em(4) has MCLGETI().

> >
> >> Is pure cpu speed the only way? Or is it possible to decrease the
> >> interrupt load with even better NIC:s?
> >
> > here are some things that might help:
> >
> > - faster cpu
> > - larger cpu cache
> > - faster ram
> > - reduce overheads (things like switching VM context while handling
> > packets is not going to help matters)
> > - improving code efficiency
> >
> > have you tried -current?
> >
>
>
>
> I tried to share and use the same interrupt for my network ports as I have
> a guess it could be a boost, but the bios did not want what I wanted....
> Interrupts could be shared, but not between the ports I wanted. I simple
> did not understand the interrupt allocation scheme in my Dell T410 tower
> server.
>
> Have not tried current, but will try current as soon as I can. Also... I
> will try to do some laborations with CPU speed of the core the OpenBSD
> virtual machine has. This to see how the interrupts and throughput is
> related to the CPU speed of the allocated core.
>

Also make sure that the guest can actually access the physical HW directly
without any virtualisation in between. In the end real HW is going to have
less overhead and will be faster than a VM solution.

--
:wq Claudio


Re: Expected throughput in an OpenBSD virtual server

Sjöholm Per-Olov
On 22 aug 2011, at 23:28, Claudio Jeker wrote:
> On Mon, Aug 22, 2011 at 10:49:47PM +0200, Per-Olov Sjöholm wrote:
>> On 22 aug 2011, at 22:04, Stuart Henderson wrote:
>>>> But if you can give hints of how to decrease the interrupt load I am all
>>>> ears. As I see it, if the interrupt handling model in OpenBSD would
>>>> change to a polling one you could maybe increase the throughput at the
>>>> same processor speed (just me guessing though). But now the fact is that
>>>> it is not polling. So what can I do with what we have....
>>>
>>> polling is one mechanism to ensure you aren't handling interrupts all the
>>> time, so you can ensure userland remains responsive even when the machine
>>> is under heavy network load. OpenBSD has another way to handle this,
>>> MCLGETI.
>>
>> With polling, if I get it right, the context switch overhead is mostly
>> avoided because the system can choose to look at the device when it is
>> already in the right context. The drawback could be increased latency in
>> processing events in a polling model. But according to what I have read,
>> the latency is reduced to very low values by raising the clock interrupt
>> frequency. They say polling is better from an OS "time spent on device"
>> control perspective. Note that I am not a pro in this area, but will for
>> sure look deeper...
>
> Polling only works reliably at insane HZ settings which will cause other
> issues in other places (some obvious, some not so obvious). In the end
> polling is a poor man's interrupt mitigation (which is also enabled on
> em(4), btw.) since instead of using the interrupt of the network card you
> use the interrupt of the clock to process the DMA rings. Polling does not
> gain you much on good modern HW.
>
>> MCLGETI ?? Is it in if_em.c if I want to see how it is implemented?
>
> Yes. em(4) has MCLGETI().
>
>>>> Is pure cpu speed the only way? Or is it possible to decrease the
>>>> interrupt load with even better NIC:s?
>>>
>>> here are some things that might help:
>>>
>>> - faster cpu
>>> - larger cpu cache
>>> - faster ram
>>> - reduce overheads (things like switching VM context while handling
>>> packets is not going to help matters)
>>> - improving code efficiency
>>>
>>> have you tried -current?
>>
>> I tried to share and use the same interrupt for my network ports as I
>> guessed it could give a boost, but the bios did not want what I wanted....
>> Interrupts could be shared, but not between the ports I wanted. I simply
>> did not understand the interrupt allocation scheme in my Dell T410 tower
>> server.
>>
>> Have not tried current, but will try current as soon as I can. Also... I
>> will try to do some experiments with the CPU speed of the core the OpenBSD
>> virtual machine has, to see how the interrupts and throughput are related
>> to the CPU speed of the allocated core.
>
> Also make sure that the guest can actually access the physical HW directly
> without any virtualisation in between. In the end real HW is going to have
> less overhead and will be faster than a VM solution.



--snip--
The KVM hypervisor supports attaching PCI devices on the host system to
virtualized guests. PCI passthrough allows guests to have exclusive access to
PCI devices for a range of tasks. PCI passthrough allows PCI devices to appear
and behave as if they were physically attached to the guest operating system.
--snip--
From:
http://docs.fedoraproject.org/en-US/Fedora/13/html/Virtualization_Guide/chap-Virtualization-PCI_passthrough.html


The link above doesn't say anything about the performance cost of doing PCI
passthrough, though. But OpenBSD indeed sees and uses the correct real
physical NIC:s. I am of course _very_ interested in testing by installing
OpenBSD directly on the hardware, but I cannot do that at this time. This is
what OpenBSD sees..
--snip--
em0 at pci0 dev 4 function 0 "Intel PRO/1000 MT (82574L)" rev 0x00: apic 1 int
11 (irq 11), address 00:1b:21:c2:8a:b0
em1 at pci0 dev 5 function 0 "Intel PRO/1000 MT (82574L)" rev 0x00: apic 1 int
10 (irq 10), address 00:1b:21:bf:76:77
--snip--
The MAC:s are these adapters' real MAC:s. When used in OpenBSD, these adapters
are totally unbound in Linux and cannot be seen or used.

This virtual fully patched OpenBSD 4.9 has got one (of a total of eight) Xeon
5504 2GHz cores, 512MB RAM, the above NIC:s and some raised values in sysctl.
As said earlier, it gives at most about 400 Mbit throughput with a small
ruleset that keeps state everywhere. Have tested with NFS, AFP, SCP and SMB,
with different created 2GB ISO:s. All protocols give near the same result
(AFP performs best). Another physical server with a 1.6 GHz Intel Atom with
Intel Gig cards (not the same cards) performs similarly (a little lower
though) and maxes out at near the same speed. When these systems (both the
physical and the virtual) max out, the interrupts eat 100%. Removing the
firewall, the file transfer gives 119 Mbyte/s and maxes out the Gigabit pipe.

These measurements (i.e. the comparison with the physical server) make me
believe that the virtualization is not that bad, at least not from a
performance perspective. A security discussion, however, is another topic.


Maybe the best boost is, as Stuart said, a faster CPU with more GHz and
faster memory. Or are there any other tweaks that can be looked into in
OpenBSD?


But after all, these are primarily some interesting and very fun tests.

/Per-Olov


Re: Expected throughput in an OpenBSD virtual server

john slee
In reply to this post by Sjöholm Per-Olov
On 22 August 2011 23:45, Per-Olov Sjöholm <[hidden email]> wrote:
>> As http://www.openbsd.org/faq/faq6.html states, there's little you can
>> tweak to improve your numbers; just get a nice-clocked, good cache-sized
>> CPU and give it some loving.
>
> The FAQ you refer to seems to be of no use at all and is totally unrelated
> to this post.

It is quite pertinent, actually. See the beginning of section 6.6;

http://www.openbsd.org/faq/faq6.html#Tuning

John


Re: Expected throughput in an OpenBSD virtual server

Sjöholm Per-Olov
On 23 aug 2011, at 01:32, john slee wrote:

> On 22 August 2011 23:45, Per-Olov Sjöholm <[hidden email]> wrote:
>>> As http://www.openbsd.org/faq/faq6.html states, there's little you can
>>> tweak to improve your numbers; just get a nice-clocked, good cache-sized
>>> CPU and give it some loving.
>>
>> The FAQ you refer to seems to be of no use at all and is totally unrelated
>> to this post.
>
> It is quite pertinent, actually. See the beginning of section 6.6;
>
> http://www.openbsd.org/faq/faq6.html#Tuning
>
> John
>


If you would please explain how "baddynamic" and avoiding certain ports will
affect what we are talking about...

Naaahh, let's forget that section

/Per-Olov


Re: Expected throughput in an OpenBSD virtual server

Patrick Lamaiziere
In reply to this post by Sjöholm Per-Olov
Le Mon, 22 Aug 2011 22:49:47 +0200,
Per-Olov Sjöholm <[hidden email]> a écrit :

Hello,
> Have not tried current, but will try current as soon as I can.
> Also... I will try to do some laborations with CPU speed of the core
> the OpenBSD virtual machine has. This to see how the interrupts and
> throughput is related to the CPU speed of the allocated core.

It would be nice to know if current is better with Intel em(4) cards,
because of this commit: http://freshbsd.org/2011/04/13/00/19/01

Here we reach 400 Mbit/s with a CPU rate of ~70%, but we
run OpenBSD 4.9.

Regards.


Re: Expected throughput in an OpenBSD virtual server

Patrick Lamaiziere
In reply to this post by Stuart Henderson
Le Mon, 22 Aug 2011 20:04:50 +0000 (UTC),
Stuart Henderson <[hidden email]> a écrit :

Hello,

> OpenBSD has another way to handle this, MCLGETI.

Is there any documentation (for the human being, not the developer)
about how MCLGETI works? (I don't find a lot about it)

Thanks, regards.


Re: Expected throughput in an OpenBSD virtual server

Ryan McBride-3
In reply to this post by Sjöholm Per-Olov
On Tue, Aug 23, 2011 at 09:10:05AM +0200, Per-Olov Sjöholm wrote:
> If you please will explain how "baddynamic" and avoiding certain ports will
> affect what we are talking about...
>
> Naaahh lets forget that section

I believe people are referring to the text above that:

   One goal of OpenBSD is to have the system Just Work for the vast
   majority of our users. Twisting knobs you don't understand is far more
   likely to break the system than it is to improve its performance. Always
   start from the default settings, and only adjust things you actually see
   a problem with.

   VERY FEW people will need to adjust any networking parameters!


Earlier you asked:

> So the question remains. Is it likely that a faster cpu core will give
> better performance (not that I need it. Just doing some laborations
> here).  Is a faster CPU the best / only way to increase throughput.

Yes, all other things being equal, a faster CPU will help. Other hardware
factors include:

- CPU vendor (AMD vs Intel)
- CPU cache, bus, chipset
- PCI bus
- Network card
- If you are doing IPSec, AES-specific instructions ("AES-NI" on Intel)

Some CPU architectures have much better IO and interrupt performance for
a given clock speed (Sparc64, for example), but cost makes them an
unlikely choice for a firewall.

Things that seem to make very little difference in testing:

- MP vs SP kernel
- i386 vs AMD64


> Of course we assume the OS tweak is ok and that reasonable
> NIC:s are used.  

OS tweaks are usually not OK. The general rule of thumb is that if you
have to ask about them on misc@ because there is no documentation and
you don't understand the effects, then you shouldn't touch them.

PF configuration can have a big effect on your performance for some
types of traffic. In general it's better to worry about making your
ruleset correct and maintainable, but if you MUST write your ruleset
with performance in mind, the following article discusses most of the
issues:

http://www.undeadly.org/cgi?action=article&sid=20060927091645
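
As a purely hypothetical illustration of the kind of advice in that article
(the interface name, ports, and limits below are invented, not taken from this
thread): keep the hot path short with "quick" rules on the busy interface, let
the state table do the per-packet work, and skip interfaces pf does not need
to inspect.

```
# illustrative pf.conf fragment -- invented example, not the poster's ruleset
set skip on lo0            # don't evaluate rules on loopback at all
set limit states 100000    # room for many flows; state lookups are cheap

# The first packet of a flow walks the ruleset once; "quick" stops
# evaluation at the first match, and "keep state" makes every later packet
# a fast state-table lookup instead of another ruleset walk.
pass in quick on em0 proto tcp to port { 22 80 443 } keep state
block in quick on em0
```

`pfctl -si` and `pfctl -vvsr` then show state-table activity and per-rule
evaluation counters, which is how you verify that most packets never touch
the ruleset.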


> Is there a plan to change the interrupt handling model in OpenBSD to
> device polling in future releases ?

No.


Re: Expected throughput in an OpenBSD virtual server

Tomas Bodzar-4
In reply to this post by Patrick Lamaiziere
On Tue, Aug 23, 2011 at 11:10 AM, Patrick Lamaiziere
<[hidden email]> wrote:
> Le Mon, 22 Aug 2011 20:04:50 +0000 (UTC),
> Stuart Henderson <[hidden email]> a écrit :
>
> Hello,
>
>> OpenBSD has another way to handle this, MCLGETI.
>
> Is there a documentation (for the human being, not the developer)
> about how MCLGETI works? (don't find a lot about it)

Maybe these?
http://blogs.oracle.com/video/entry/mclgeti_effective_network_livelock_mitigation
https://www.youtube.com/watch?v=fv-AQJqUzRI
http://wikis.sun.com/display/KCA2009/KCA2009+Conference+Agenda (see Friday 17th)

Looks like only David Gwynne may be able to point to something useful.


>
> Thanks, regards.
