VMs losing network connectivity for a few minutes on a daily basis

mabi
Hello,

I am testing VMM/VMD on OpenBSD 6.4 with OpenBSD 6.4 virtual machines and have noticed that about twice a day the VMs lose their network connectivity for roughly 2-3 minutes. I currently have 3 OpenBSD VMs with very light load on them, and it happens to all of them.

The network connectivity recovers on its own, or when I log in to the VM through the console and initiate, for example, a ping to the outside. The host/hypervisor itself never loses connectivity.
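To pin down exactly when the outages start and end, something like the following could be left running inside one of the VMs. This is only a minimal sketch: `watch_link` is a made-up helper name, and the probe command is whatever you pass in (for example a single ping to the default gateway).

```sh
#!/bin/sh
# watch_link COUNT INTERVAL PROBE...
# Runs PROBE up to COUNT times, sleeping INTERVAL seconds between runs,
# and prints a timestamped line whenever the result changes between
# "up" and "down" (the initial state is printed once as well).
watch_link() {
    count=$1
    interval=$2
    shift 2
    prev=unknown
    i=0
    while [ "$i" -lt "$count" ]; do
        # The probe's exit status decides the state; its output is discarded.
        if "$@" >/dev/null 2>&1; then
            cur=up
        else
            cur=down
        fi
        if [ "$cur" != "$prev" ]; then
            printf '%s link %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$cur"
            prev=$cur
        fi
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

For example, `watch_link 86400 1 ping -c 1 -w 1 192.0.2.1 >> /tmp/link.log` would probe about once a second for roughly a day. The address 192.0.2.1 is a placeholder for the real gateway, and the log path is arbitrary. The resulting timestamps can then be compared with logs on the host and on the switch.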

Now I presume there is either an issue with my network setup or a bug, but I think the network setup is the more likely cause. My network setup on the OpenBSD host consists of two physical network devices (bnx0 + bnx1), which I have bundled in a trunk (trunk0) in failover mode. On top of the trunk I have two VLAN interfaces (vlan2 and vlan6): vlan2 is my private network and vlan6 is my public-facing network (internet). Finally, I have a bridge interface (bridge6) containing my vlan6 interface, to which my VMs connect, as they are directly reachable from the internet.

So the whole chain of network interfaces from host to VM looks like this:

[bnx0+bnx1]-[trunk0]-[vlan6]-[bridge6]-[tap0]-[vio0]


My /etc/vm.conf looks like this:

switch "uplink_vlan6" {
        interface bridge6
}

vm "obsd1vm" {
        memory 2G
        disk "/var/vmm/obsd1vm.qcow2"

        interface {
                switch "uplink_vlan6"
                lladdr fe:e1:bb:03:01:01
        }
}

My /etc/hostname.* files look like this:

/etc/hostname.bnx0
up

/etc/hostname.bnx1
up

/etc/hostname.trunk0
trunkproto failover trunkport bnx0 trunkport bnx1 up

/etc/hostname.vlan2
inet 192.168.1.56 255.255.255.0 192.168.1.255 vnetid 2 parent trunk0 description "private" up

/etc/hostname.vlan6
inet xxx.xxx.xxx.xxx 255.255.255.0 xxx.xxx.xxx.255 vnetid 6 parent trunk0 description "public" up

/etc/hostname.bridge6
add vlan6
up
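Since trunk0 runs in failover mode, one thing that might be worth ruling out is whether the outages coincide with the active trunk port changing. The sketch below only parses `ifconfig trunk0` output in the format shown in this post; `active_trunkport` is a hypothetical helper name, and whether failover events have anything to do with the problem is an open question.

```sh
#!/bin/sh
# active_trunkport: read `ifconfig trunk0` output on stdin and print the
# name of every trunk port whose status field contains "active".
active_trunkport() {
    awk '$1 == "trunkport" && $3 ~ /(^|,)active(,|$)/ { print $2 }'
}
```

Running `ifconfig trunk0 | active_trunkport` from cron once a minute and appending the result to a file with a timestamp would show whether the active port ever flips around the time of an outage.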

The hardware switch behind the host is a Cisco switch, and the two ports connected to the server's two hardware NICs both have the following config:

interface Eth101/1/9
  switchport mode trunk
  switchport trunk native vlan 99
  switchport trunk allowed vlan 2,6


Finally below is the output of ifconfig:

bnx0: flags=8b43<UP,BROADCAST,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST> mtu 1500
        lladdr ---REMOVED---
        index 3 priority 0 llprio 3
        trunk: trunkdev trunk0
        media: Ethernet autoselect (1000baseT full-duplex,rxpause)
        status: active
bnx1: flags=8b43<UP,BROADCAST,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST> mtu 1500
        lladdr ---REMOVED---
        index 4 priority 0 llprio 3
        trunk: trunkdev trunk0
        media: Ethernet autoselect (1000baseT full-duplex,rxpause)
        status: active
bridge6: flags=41<UP,RUNNING>
        description: switch1-uplink_vlan6
        index 5 llprio 3
        groups: bridge
        priority 32768 hellotime 2 fwddelay 15 maxage 20 holdcnt 6 proto rstp
        vlan6 flags=3<LEARNING,DISCOVER>
                port 8 ifpriority 0 ifcost 0
        tap0 flags=3<LEARNING,DISCOVER>
                port 10 ifpriority 0 ifcost 0
trunk0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1500
        lladdr ---REMOVED---
        index 6 priority 0 llprio 3
        trunk: trunkproto failover
                trunkport bnx1
                trunkport bnx0 master,active
        groups: trunk
        media: Ethernet autoselect
        status: active
vlan2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        lladdr ---REMOVED---
        description: private
        index 7 priority 0 llprio 3
        encap: vnetid 2 parent trunk0
        groups: vlan egress
        media: Ethernet autoselect
        status: active
        inet 192.168.1.56 netmask 0xffffff00 broadcast 192.168.1.255
vlan6: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1500
        lladdr ---REMOVED---
        description: public
        index 8 priority 0 llprio 3
        encap: vnetid 6 parent trunk0
        groups: vlan
        media: Ethernet autoselect
        status: active
        inet ---REMOVED--- netmask 0xffffff00 broadcast ---REMOVED---
pflog0: flags=141<UP,RUNNING,PROMISC> mtu 33136
        index 9 priority 0 llprio 3
        groups: pflog
tap0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1500
        lladdr fe:e1:ba:d0:56:1c
        description: vm1-if0-obsd1vm
        index 10 priority 0 llprio 3
        groups: tap
        status: active
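Note that the lladdr on tap0 is the host side of the link; the address set in vm.conf (fe:e1:bb:03:01:01) should be the one the guest's vio0 uses. During an outage it may help to check whether bridge6 still has that guest address learned on tap0. The sketch below filters the output of `ifconfig bridge6 addr`. It assumes each entry is printed as "MAC interface age flags", which is my recollection of the format rather than something shown in this post, and `learned_port` is a made-up helper name.

```sh
#!/bin/sh
# learned_port MAC: read `ifconfig bridge6 addr` output on stdin and print
# the member interface on which MAC was learned. Prints nothing if the
# address is not in the table. The comparison ignores letter case.
learned_port() {
    awk -v mac="$1" 'tolower($1) == tolower(mac) { print $2 }'
}
```

Usage would be `ifconfig bridge6 addr | learned_port fe:e1:bb:03:01:01`. If this prints tap0 while the VM is unreachable, the host bridge probably still knows where the guest is, and the next place to look would be further upstream (for instance the MAC and ARP tables on the switch and gateway).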

One last note: the host and VMs are all patched up to 013_unveil.

I hope I have provided all the relevant details here. If there is anything else I should add, I would be happy to provide more info.

Best regards,
Mabi

Re: VMs losing network connectivity for a few minutes on a daily basis

mabi
I was wondering whether this could have something to do with spanning tree on the bridge6 interface?

An ifconfig on the bridge6 interface shows the following spanning tree settings:

 priority 32768 hellotime 2 fwddelay 15 maxage 20 holdcnt 6 proto rstp

Do I really need spanning tree here? And would it be safe to disable it for a test?
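One way to check whether spanning tree is actually running on any member port: as far as I understand bridge(4), the "priority ... proto rstp" line only shows the bridge-wide parameters, and STP is active only on ports that were explicitly enabled with `stp`, which then carry an STP flag in the ifconfig output. The sketch below lists such ports; `stp_ports` is a hypothetical helper name and the parsing assumes the member-line format shown in this thread.

```sh
#!/bin/sh
# stp_ports: read `ifconfig bridge6` output on stdin and print every member
# interface whose per-port flags include STP.
stp_ports() {
    awk '
        # Member lines look like "vlan6 flags=3<LEARNING,DISCOVER>".
        # The bridge header line ("bridge6: flags=...") is skipped because
        # its first field ends with a colon.
        $2 ~ /^flags=[0-9a-f]+</ && $1 !~ /:$/ {
            flags = $2
            sub(/^[^<]*</, "", flags)
            sub(/>.*$/, "", flags)
            n = split(flags, f, ",")
            for (i = 1; i <= n; i++)
                if (f[i] == "STP")
                    print $1
        }
    '
}
```

With the bridge6 output posted in this thread, `ifconfig bridge6 | stp_ports` should print nothing, since vlan6 and tap0 only show LEARNING and DISCOVER. If that holds, spanning tree is not participating on either port and is an unlikely cause.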

Regards,
Mabi

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Friday, February 1, 2019 7:02 PM, mabi <[hidden email]> wrote:

> Hello,
>
> I am testing VMM/VMD on OpenBSD 6.4 with OpenBSD 6.4 virtual machines and have noticed that about twice a day the VMs lose their network connectivity for roughly 2-3 minutes. I currently have 3 OpenBSD VMs with very light load on them, and it happens to all of them.