SMP on IBM eseries and amd64

Lukáš Macura
Hello all,

we have a small problem with OpenBSD running on $SUBJ.
I think the IRQ routing could be better, but I don't know how to
achieve this. In the BIOS, I cannot change the IRQ for anything. Every
device in this server is on the same IRQ. I really don't know why, but
I cannot change this. When I boot Linux on this machine, IRQ routing is
OK, probably because Linux knows how to change the IRQs of devices.

I do not understand the difference between int and irq in dmesg.
Sorry, I am not an expert in this. But the reality is that on OpenBSD
only one CPU is used, probably because all interrupts are routed
through this CPU. The second CPU stays idle.

Can somebody please tell me what to do to utilise the second CPU? We
need more bandwidth and we want to use one NIC per IRQ, which should
bring better CPU utilisation. Am I right?

Thanks to all,
Lukas Macura

Here is dmesg:
OpenBSD 3.8-current (GENERIC.MP) #6: Thu Nov  3 17:32:14 CET 2005
    [hidden email]:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 1073319936 (1048164K)
avail mem = 908783616 (887484K)
using 22937 buffers containing 107540480 bytes (105020K) of memory
mainbus0 (root)
mainbus0: scanning 0x9d400 to 0x9d7f0 for MP signature
mainbus0: MP floating pointer found in extended bios data area at
0x9d540
mainbus0: MP config table at 0x9e520, 356 bytes long
mainbus0: Intel MP Specification (Version 1.4) (IBM ENSW X336 SMP    )
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Xeon(TM) CPU 3.20GHz, 3200.71 MHz
cpu0:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,NXE,LONG
cpu0: 2MB 64b/line 8-way L2 cache
cpu0: calibrating local timer
cpu0: apic clock running at 200006987Hz
cpu0: kstack at 0xffff80006585c000 for 20480 bytes
cpu0: idle pcb at 0xffff80006585c000, idle sp at 0xffff800065860ff0
cpu1 at mainbus0: apid 6 (application processor)
cpu1: Intel(R) Xeon(TM) CPU 3.20GHz, 3200.12 MHz
cpu1:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,NXE,LONG
cpu1: 2MB 64b/line 8-way L2 cache
cpu1: kstack at 0xffff800065861000 for 20480 bytes
cpu1: idle pcb at 0xffff800065861000, idle sp at 0xffff800065865ff0
mpbios: bus 0 is type PCI
mpbios: bus 1 is type PCI
mpbios: bus 2 is type PCI
mpbios: bus 3 is type PCI
mpbios: bus 4 is type PCI
mpbios: bus 5 is type PCI
mpbios: bus 6 is type PCI
mpbios: bus 7 is type PCI
mpbios: bus 8 is type ISA
ioapic0 at mainbus0 apid 14: pa 0xffff800001ba7f24, virtual wire mode,
version 20, 24 pins
ioapic1 at mainbus0 apid 13: pa 0xffff800001ba7e24, virtual wire mode,
version 20, 24 pins
ioapic2 at mainbus0 apid 12: pa 0xffff800001ba7d24, virtual wire mode,
version 20, 24 pins
ioapic0: int1 attached to isa0 irq 1 (type 0x0 flags 0x0)
ioapic0: int2 attached to isa0 irq 0 (type 0x0 flags 0x0)
ioapic0: int6 attached to isa0 irq 6 (type 0x0 flags 0x0)
ioapic0: int8 attached to isa0 irq 8 (type 0x0 flags 0x5)
ioapic0: int9 attached to isa0 irq 9 (type 0x0 flags 0x0)
ioapic0: int12 attached to isa0 irq 12 (type 0x0 flags 0x0)
ioapic0: int13 attached to isa0 irq 13 (type 0x0 flags 0x0)
ioapic0: int14 attached to isa0 irq 14 (type 0x0 flags 0x0)
ioapic0: int15 attached to isa0 irq 15 (type 0x0 flags 0x0)
mpbios: can't find ioapic 0
ioapic0: int16 attached to pci0 device 29 INT_A (type 0x0 flags 0x0)
ioapic0: int19 attached to pci0 device 29 INT_B (type 0x0 flags 0x0)
ioapic0: int23 attached to pci0 device 29 INT_D (type 0x0 flags 0x0)
ioapic0: int17 attached to pci0 device 31 INT_B (type 0x0 flags 0x0)
ioapic0: int16 attached to pci1 device 1 INT_A (type 0x0 flags 0x0)
mpbios: can't find ioapic 0
ioapic1: int4 attached to pci4 device 1 INT_A (type 0x0 flags 0x0)
ioapic2: int0 attached to pci5 device 1 INT_A (type 0x0 flags 0x0)
ioapic0: int16 attached to pci6 device 0 INT_A (type 0x0 flags 0x0)
ioapic0: int16 attached to pci7 device 0 INT_A (type 0x0 flags 0x0)
local apic: int1 attached to NMI (type 0x1 flags 0x0)
local apic: int0 attached to ExtINT (type 0x3 flags 0x0)
mainbus0: MP WARNING: 348 bytes of extended entries not examined
pci0 at mainbus0 bus 0: configuration mode 1
pchb0 at pci0 dev 0 function 0 "Intel E7710 SMCH" rev 0x0c
"Intel E7710 MCH ERR" rev 0x0c at pci0 dev 0 function 1 not configured
ppb0 at pci0 dev 2 function 0 "Intel E7710 MCH PCIE" rev 0x0c
pci1 at ppb0 bus 2
ppb1 at pci0 dev 4 function 0 "Intel E7710 MCH PCIE" rev 0x0c
pci2 at ppb1 bus 3
ppb2 at pci2 dev 0 function 0 "Intel PCIE-PCIE" rev 0x09
pci3 at ppb2 bus 4
mpt0 at pci3 dev 1 function 0 "Symbios Logic 53c1030" rev 0x08: apic 13
int 4 (irq 10)
mpt0: sending FW Upload request to IOC (size: 36, img size: 69956)
mpt0: IM support: 4
scsibus0 at mpt0: 16 targets
sd0 at scsibus0 targ 0 lun 0: <LSILOGIC, 1030 IM IM, 1000> SCSI2
0/direct fixed
sd0: 139898MB, 139898 cyl, 16 head, 128 sec, 512 bytes/sec, 286511104
sec total
mpt0: target 0 Asynchronous at 0MHz width 8bit offset 0 QAS 0 DT 0 IU 0
ppb3 at pci2 dev 0 function 2 "Intel PCIE-PCIE" rev 0x09
pci4 at ppb3 bus 5
em0 at pci4 dev 1 function 0 "Intel PRO/1000MT (82545GM)" rev 0x04: apic
12 int 0 (irq 10), address 00:0e:0c:9c:07:13
ppb4 at pci0 dev 6 function 0 "Intel E7710 MCH PCIE" rev 0x0c
pci5 at ppb4 bus 6
bge0 at pci5 dev 0 function 0 "Broadcom BCM5721" rev 0x11, BCM5750 B1
(0x4101): apic 14 int 16 (irq 10) address 00:14:5e:0b:3e:ea
brgphy0 at bge0 phy 1: BCM5750 10/100/1000baseT PHY, rev. 0
ppb5 at pci0 dev 7 function 0 "Intel E7710 MCH PCIE" rev 0x0c
pci6 at ppb5 bus 7
bge1 at pci6 dev 0 function 0 "Broadcom BCM5721" rev 0x11, BCM5750 B1
(0x4101): apic 14 int 16 (irq 10) address 00:14:5e:0b:3e:eb
brgphy1 at bge1 phy 1: BCM5750 10/100/1000baseT PHY, rev. 0
vendor "Intel", unknown product 0x359b (class system subclass
miscellaneous, rev 0x0c) at pci0 dev 8 function 0 not configured
uhci0 at pci0 dev 29 function 0 "Intel 82801EB/ER USB" rev 0x02: apic 14
int 16 (irq 10)
usb0 at uhci0: USB revision 1.0
uhub0 at usb0
uhub0: Intel UHCI root hub, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1 at pci0 dev 29 function 1 "Intel 82801EB/ER USB" rev 0x02: apic 14
int 19 (irq 7)
usb1 at uhci1: USB revision 1.0
uhub1 at usb1
uhub1: Intel UHCI root hub, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
ehci0 at pci0 dev 29 function 7 "Intel 82801EB/ER USB" rev 0x02: apic 14
int 23 (irq 5)
usb2 at ehci0: USB revision 2.0
uhub2 at usb2
uhub2: Intel EHCI root hub, rev 2.00/1.00, addr 1
uhub2: 4 ports with 4 removable, self powered
ppb6 at pci0 dev 30 function 0 "Intel 82801BA AGP" rev 0xc2
pci7 at ppb6 bus 1
vga1 at pci7 dev 1 function 0 "ATI Radeon VE QY" rev 0x00
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
pcib0 at pci0 dev 31 function 0 "Intel 82801EB/ER LPC" rev 0x02
pciide0 at pci0 dev 31 function 2 "Intel 82801EB SATA" rev 0x02: DMA,
channel 0 configured to compatibility, channel 1 configured to
compatibility
atapiscsi0 at pciide0 channel 0 drive 0
scsibus1 at atapiscsi0: 2 targets
cd0 at scsibus1 targ 0 lun 0: <HL-DT-ST, DVD-ROM GDR8083N, 0L02> SCSI0
5/cdrom removable
cd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 2
"Intel 82801EB/ER SMBus" rev 0x02 at pci0 dev 31 function 3 not
configured
isa0 at pcib0
isadma0 at isa0
pckbc0 at isa0 port 0x60/5
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0: console keyboard, using wsdisplay0
pmsi0 at pckbc0 (aux slot)
pckbc0: using irq 12 for aux slot
wsmouse0 at pmsi0 mux 0
pcppi0 at isa0 port 0x61
spkr0 at pcppi0
sysbeep0 at pcppi0
cpu0: prelint0 0x700 0x0
cpu0: prelint1 0x400 0x0
cpu0: timer0 0x300c0 0x0
cpu0: pcint0 0x10000 0x0
cpu0: lint0 0x10700 0x0
cpu0: lint1 0x400 0x0
cpu0: err0 0x10000 0x0
ioapic2: int0 0xa070 0x0
ioapic1: int4 0xa060 0x0
ioapic0: int16 0xa061 0x0
ioapic0: int19 0xa062 0x0
ioapic0: int23 0xa063 0x0
dkcsum: sd0 matches BIOS drive 0x80
root on sd0a
rootdev=0x400 rrootdev=0xd00 rawdev=0xd02
cpu1: prelint0 0x10000 0x0
cpu1: prelint1 0x10000 0x0
cpu1: timer0 0x200c0 0x0
cpu1: pcint0 0x10000 0x0
cpu1: lint0 0x10700 0x0
cpu1: lint1 0x400 0x0
cpu1: err0 0x10000 0x0
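
On the int versus irq question: a minimal sketch, assuming the usual
reading of these attach lines, in which "apic N int M" names the I/O APIC
(by its ID) and the input pin that the multiprocessor kernel actually
uses, while the "(irq K)" in parentheses is the legacy IRQ number the
BIOS assigned for non-APIC mode. Under that assumption, the shared
"irq 10" is less alarming than it looks. The helper names are made up
for illustration, and the sample lines are attach lines from the dmesg
above with the mail wrapping undone.

```python
import re
from collections import defaultdict

# Attach lines copied from the dmesg above, re-joined onto single lines.
DMESG = """\
mpt0 at pci3 dev 1 function 0 "Symbios Logic 53c1030" rev 0x08: apic 13 int 4 (irq 10)
em0 at pci4 dev 1 function 0 "Intel PRO/1000MT (82545GM)" rev 0x04: apic 12 int 0 (irq 10), address 00:0e:0c:9c:07:13
bge0 at pci5 dev 0 function 0 "Broadcom BCM5721" rev 0x11, BCM5750 B1 (0x4101): apic 14 int 16 (irq 10) address 00:14:5e:0b:3e:ea
bge1 at pci6 dev 0 function 0 "Broadcom BCM5721" rev 0x11, BCM5750 B1 (0x4101): apic 14 int 16 (irq 10) address 00:14:5e:0b:3e:eb
uhci0 at pci0 dev 29 function 0 "Intel 82801EB/ER USB" rev 0x02: apic 14 int 16 (irq 10)
uhci1 at pci0 dev 29 function 1 "Intel 82801EB/ER USB" rev 0x02: apic 14 int 19 (irq 7)
ehci0 at pci0 dev 29 function 7 "Intel 82801EB/ER USB" rev 0x02: apic 14 int 23 (irq 5)
"""

# Matches: <device> at ... apic <id> int <pin> (irq <legacy irq>)
ATTACH = re.compile(r"^(\w+) at .*apic (\d+) int (\d+) \(irq (\d+)\)", re.M)


def group_by_pin(dmesg):
    """Map (I/O APIC id, pin) to the devices wired to that pin."""
    pins = defaultdict(list)
    for dev, apic, pin, _irq in ATTACH.findall(dmesg):
        pins[(int(apic), int(pin))].append(dev)
    return dict(pins)


def group_by_legacy_irq(dmesg):
    """Map the parenthesised legacy IRQ number to the devices reporting it."""
    irqs = defaultdict(list)
    for dev, _apic, _pin, irq in ATTACH.findall(dmesg):
        irqs[int(irq)].append(dev)
    return dict(irqs)


if __name__ == "__main__":
    for (apic, pin), devs in sorted(group_by_pin(DMESG).items()):
        print(f"apic {apic} int {pin}: {', '.join(devs)}")
    for irq, devs in sorted(group_by_legacy_irq(DMESG).items()):
        print(f"legacy irq {irq}: {', '.join(devs)}")
```

Grouped by pin, em0 and mpt0 each have a pin of their own, while bge0,
bge1 and uhci0 share pin 16 on apic 14. Grouped by the legacy number,
five devices appear to sit on irq 10.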

Re: SMP on IBM eseries and amd64

Niklas Hallqvist
Hi!

Unfortunately, you seem to have an application where OpenBSD is an
inferior choice. The design goal behind SMP was to be able to utilize
extra CPUs without adding too much complexity (believe me, it is complex
enough anyway). This is why we chose the so-called biglock approach,
i.e. only one CPU can execute kernel code at any given time. Interrupts
are kernel code, thus only one interrupt can be serviced at a time.
Furthermore, at this point only the boot processor can execute
interrupts (except timer interrupts, which are CPU-local and thus can
occur on all CPUs), which means high interrupt pressure will clog one
single CPU. This is not optimal, but for many purposes it works quite
OK. If you have several processes that spend most of their CPU time in
userland, this is indeed quite good. An example is compilation: one CPU
will run the compiler, another the preprocessor, and both are expensive
CPU consumers. Web applications can also utilize extra CPUs with
concurrent requests where non-trivial time is spent in the application
code (userland).

It is not good for mostly-kernel or single-process applications, like
routing, filtering or database servers (unless the database
implementation is designed with multiple worker processes and the
queries are non-trivial, i.e. the complexity is in the query
optimization, not in just shoveling data to the clients).
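
To put rough numbers on this, here is a minimal sketch that uses Amdahl's
law as a stand-in model. That choice is an assumption for illustration,
not something taken from the kernel: time spent in the kernel is treated
as fully serialized by the big lock, and userland time as perfectly
parallel. The function name and the example fractions are hypothetical.

```python
def biglock_speedup(kernel_fraction, ncpu):
    """Upper bound on speedup when kernel time cannot run in parallel.

    kernel_fraction: share of total CPU time spent in the kernel (0..1),
                     assumed to be serialized by the big lock.
    ncpu:            number of CPUs available to userland work.
    """
    if not 0.0 <= kernel_fraction <= 1.0:
        raise ValueError("kernel_fraction must be between 0 and 1")
    if ncpu < 1:
        raise ValueError("ncpu must be at least 1")
    serial = kernel_fraction
    parallel = 1.0 - kernel_fraction
    return 1.0 / (serial + parallel / ncpu)


if __name__ == "__main__":
    # Hypothetical workloads: a compile job (mostly userland) and a
    # packet filter (mostly kernel), each on a two-CPU machine.
    for label, frac in (("mostly userland", 0.05), ("mostly kernel", 0.95)):
        print(f"{label}: {biglock_speedup(frac, 2):.2f}x on 2 CPUs")
```

Under this model a workload with 5% kernel time gets about 1.9x from a
second CPU, while one with 95% kernel time gets about 1.03x.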

We have always had intentions to improve this, but intentions are not
enough. Capable people must have the motivation to do the work. What
kind of motivation depends on exactly who is doing the work.

What kind of application is this? Pure routing/filtering? I suspect so,
since you don't get any action from the secondary CPU.

Sorry for the sad answer,
Niklas


Re: SMP on IBM eseries and amd64

Lukáš Macura
Thank you for your answer. Now I know that it makes no sense to compile
new kernels and spend my time on that :)

Our machine is used as a firewall, so we really need to pin IRQs to both
CPUs and utilise both of them.

In this situation we do not achieve even 100 Mbit throughput :( Do you
think that is normal? Is there any other optimization? Is there a
possibility to use the first CPU for the kernel and interrupts and the
second for applications? Right now only one CPU is utilized.

Thank you!
Lukas

On Mon, 2006-06-12 at 09:04 +0200, Niklas Hallqvist wrote:

> What kind of application is this?  Pure routing/filtering?  I suspect so
> since you don't get any action from the secondary CPU.

Re: SMP on IBM eseries and amd64

Niklas Hallqvist
Eh, 100 Mbit throughput on that machine should be achievable with just
one CPU, I think. I don't think you will need SMP for that. What kind of
interrupt load do you get (check with systat vm, or vmstat -i) when
you're throttled? What kind of CPU load do you get? I'm specifically
interested in how much time is spent in intr context vs. sys and idle.
I am really amazed you can load this system with just filtering.

Niklas
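
For the intr/sys/idle breakdown asked about here, a minimal sketch of
the arithmetic, assuming two samples of cumulative CPU tick counters in
the order user, nice, sys, intr, idle (my understanding of how
`sysctl kern.cp_time` reported them on OpenBSD of this era; verify on
the actual system). The function name and the sample numbers are
hypothetical and are not measurements from the machine in this thread.

```python
# Assumed counter order; check it against the system being measured.
STATES = ("user", "nice", "sys", "intr", "idle")


def cpu_shares(before, after):
    """Percentage of time per CPU state between two counter samples."""
    if len(before) != len(STATES) or len(after) != len(STATES):
        raise ValueError(f"expected {len(STATES)} counters per sample")
    deltas = [b - a for a, b in zip(before, after)]
    total = sum(deltas)
    if total <= 0:
        raise ValueError("no ticks elapsed between samples")
    return {name: 100.0 * d / total for name, d in zip(STATES, deltas)}


if __name__ == "__main__":
    # Hypothetical samples taken a few seconds apart under load.
    first = (1000, 0, 500, 2000, 96500)
    second = (1010, 0, 520, 2870, 96600)
    for state, pct in cpu_shares(first, second).items():
        print(f"{state:>4}: {pct:5.1f}%")
```

A result dominated by intr with little idle would point at interrupt
handling as the bottleneck; a mostly idle result would suggest the limit
is elsewhere.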

Lukas Macura wrote:

> Thank you to your answer, now I know that it has no sense to compile
> some new kernels and spend my time :)
>
> Our machine is used as firewall, so we really need to pin irqs to both
> cpu and to utilise both CPUs.
>
> In this situation, we do not achieve even 100mbit throughput :( Do you
> think it is normal? Is there any other optimalization ? Is ther
> possibility to use first cpu for kernel and interrupts and second for
> applications? Now only one cpu is utilized..
>
> Thank you!
> Lukas
>
> On Po, 2006-06-12 at 09:04 +0200, Niklas Hallqvist wrote:
>  
>> Hi!
>>
>> Unfortunately, you seem to have an application where OpenBSD is an
>> inferior choice. The design goal behind SMP was to be able to utilize
>> extra CPUs without adding too much complexity (believe me, it is
>> complex enough anyway). This is why we chose the so-called biglock
>> approach, i.e. only one CPU can execute kernel code at any given time.
>> Interrupts are kernel code, thus only one interrupt can be served at a
>> time. Furthermore, at this point only the boot processor can execute
>> interrupts (except timer interrupts, which are CPU-local and thus can
>> occur on all CPUs), which means high interrupt pressure will clog one
>> single CPU. This is not optimal, but for many purposes it works quite
>> OK. If you have several processes spending most of their CPU time in
>> userland, this is indeed quite good. An example is compilation: one
>> CPU will run the compiler, another the preprocessor, and both are
>> expensive CPU consumers. Web applications can also utilize extra CPUs
>> with concurrent requests where non-trivial time is spent in the
>> application code (userland).
>>
>> It is not good for mostly-kernel or single-process applications, like
>> routing, filtering or database servers (unless the database
>> implementation is designed with multiple worker processes, and the
>> queries are non-trivial, i.e. the complexity is in the query
>> optimization, not in just shoveling data to the clients).
>>
>> We have always had intentions to improve this, but intentions are not
>> enough. Capable people must have motivation to do the work. What kind
>> of motivation depends on exactly who is doing the work.
>>
>> What kind of application is this? Pure routing/filtering? I suspect so,
>> since you don't get any action from the secondary CPU.
>>
>> Sorry for the sad answer,
>> Niklas
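
The biglock behaviour described in the quoted message above can be illustrated with a rough model. This is only a sketch: it treats the kernel share of the workload as the serial part in Amdahl's law, and the two kernel fractions used below are made-up numbers for illustration, not measurements from this machine.

```python
def max_speedup(kernel_fraction: float, cpus: int) -> float:
    """Upper bound on speedup when all kernel work is serialized by one lock.

    Amdahl's law, with the kernel share of the workload as the serial part
    and the userland share spread evenly over the available CPUs.
    """
    if not 0.0 <= kernel_fraction <= 1.0:
        raise ValueError("kernel_fraction must be between 0 and 1")
    if cpus < 1:
        raise ValueError("cpus must be at least 1")
    userland_fraction = 1.0 - kernel_fraction
    return 1.0 / (kernel_fraction + userland_fraction / cpus)


# Hypothetical workloads: the kernel fractions are assumptions, not measured.
workloads = {
    "compilation (assumed 10% kernel time)": 0.10,
    "packet filtering (assumed 95% kernel time)": 0.95,
}

for name, kernel_fraction in workloads.items():
    bound = max_speedup(kernel_fraction, cpus=2)
    print(f"{name}: at most {bound:.2f}x on 2 CPUs")
```

Under this model a workload that lives almost entirely in the kernel gains next to nothing from a second CPU, which matches the explanation given for routing and filtering.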


Re: SMP on IBM eseriesand amd64

Berk D. Demir
In reply to this post by Lukáš Macura
Lukas Macura wrote:

> Thank you for your answer; now I know that it makes no sense to compile
> new kernels and spend my time on that :)
>
> Our machine is used as a firewall, so we really need to pin IRQs to both
> CPUs and to utilise both of them.
>
> In this situation, we do not achieve even 100 Mbit throughput :( Do you
> think that is normal? Is there any other optimization? Is there a
> possibility to use the first CPU for kernel and interrupts and the
> second for applications? Now only one CPU is utilized.

100 Mbit is an easy goal to achieve, even with small PCs.

If you use the MP kernel, APIC support is enabled and the interrupt load
on the CPU decreases dramatically. Using the MP kernel even on
uniprocessor systems reduces the interrupt load.

If you're not using a very low-quality Ethernet adapter, it is quite
possible to handle ~400 Mbit/s of traffic or ~60K pkts/s with
uniprocessor but APIC-enabled workstations.

Use the command "systat vmstat" and watch the interrupt columns on the
right. It will display your network adapter's interrupts; look for its
interface name (em0, sk0, bge0, etc.).

Normalization, modulation and other features of PF can create
significant CPU load. To get detailed info about PF status, use the
command "pfctl -vvvs i".

"congestion" is your enemy. If it keeps going up, you're probably using
a crappy network adapter, or your switch (you're not using hubs, eh?) is
malfunctioning.

To avoid congestion you can bump "net.inet.ip.ifq.maxlen" to a value
your interface card can handle. I'm using a value of 250 for em(4) cards,
which are generally "Intel PRO/1000MT Dual Port Server Adapter
(PWLA8492MT)".

Blindly bumping the number won't help and may worsen the situation. I'm
not an expert on network adapter specs, so you will have to research
that yourself.

Anyway, if you could reproduce the hogged scenario and post the output
of the commands mentioned above here, maybe we can find a clue.

Hope this helps,
bdd


Re: SMP on IBM eseriesand amd64

Lukáš Macura
Thanks to all who are helping me!

Yes, it is strange. I have heard that this machine should have higher
throughput, but it does not. Maybe we will find some bottleneck.

I tried to push some data across the box. The box is slowing the
connection, because the server was able to achieve 40 Mbytes/sec when
the traffic did not go through our box.

We have three interfaces: bge0 and bge1 are the input and output
interfaces of the firewall, and em0 carries VLANs (maybe the VLANs are
slowing communication?), including the DMZ interface vlan4. You can see
that the maximum throughput is around 13 Mbytes/sec.

ifstat -i bge0,bge1,em0
       bge0                bge1                em0
KB/s in  KB/s out   KB/s in  KB/s out   KB/s in  KB/s out
1140.61   1482.30   1846.41  13672.80  12701.09    677.83
1961.63   1906.47   2266.95  14208.18  13021.33   1301.74
1544.95   1913.62   2515.52  14022.09  12926.79   1226.91
1084.10   1814.52   2377.34  13460.40  12318.63    669.62
1102.96   1838.74   2237.76  13447.68  12423.09    637.19
1013.89   1772.47   2314.35  13391.93  12300.46    626.47
1084.60   1593.75   2305.87  13501.58  12369.76    828.49
1066.76   1725.85   2146.51  13557.00  12447.68    547.13
  998.55   1577.78   2157.61  13476.19  12398.57    666.44
1125.56   1702.79   2353.74  13546.82  12380.44    774.68
1349.65   1671.96   2164.74  13813.20  12460.98    667.28
1155.46   1969.64   2501.87  13714.57  12567.29    716.43
  979.44   2056.51   2591.37  13261.97  12266.25    680.06
1126.17   2064.37   2437.21  13480.67  12362.19    533.83
1229.66   2086.62   2715.48  13454.45  12257.00    833.42
1166.42   1984.17   2503.31  13574.11  12516.47    802.70
1141.74   1820.94   2279.01  13584.51  12443.09    632.97
1064.19   1972.24   2494.15  13281.63  12261.40    739.20
1724.37   1999.30   2470.15  14284.73  13271.40   1364.62
1061.01   2081.08   2596.49  13434.53  12417.23    733.66
  984.57   1849.00   2180.34  13182.10  12117.27    405.91
  982.43   2310.92   2583.39  13032.67  12008.25    387.75
1157.07   2538.57   3016.93  13463.59  12438.74    782.56
1338.31   2289.07   2530.70  13566.12  12536.73    715.35
1134.96   2159.54   2464.83  13434.33  12355.31    529.81
1356.44   1943.62   2311.49  13513.62  12323.27    692.58
1024.38   1819.81   2285.28  13275.06  12288.39    666.19
1208.24   1757.76   2261.24  13425.95  12333.64    767.69
  966.42   1702.23   1965.50  13444.56  12452.02    380.01
1012.70   1780.69   2173.39  13175.06  12131.25    514.55
  998.22   1819.66   2416.99  13358.15  12309.87    707.30
1029.78   1939.76   2459.17  13429.32  12416.67    706.06
  908.04   1974.43   2608.94  12978.15  12042.42    771.12
  894.58   1935.90   2485.24  13197.27  12287.65    705.77
  848.73   1757.40   2192.18  12844.69  11978.21    572.72
  600.02   1878.60   2108.00  13096.39  12505.82    378.97
  987.49   1648.36   2361.47  13334.47  12350.65    882.62
  980.52   1847.26   2259.06  13469.34  12554.95    646.34
  912.79   1796.95   2122.29  13211.95  12258.95    434.39
  658.41   1648.06   1847.83  12798.11  12122.58    315.53
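
To compare these numbers with the "100 Mbit" figure discussed earlier in the thread, the KB/s columns can be converted to Mbit/s. The sketch below assumes ifstat counts 1 KB as 1024 bytes; if it uses 1000 bytes, the results are about 2% lower. The two sample values are taken from the first row of the table above.

```python
def kb_per_s_to_mbit_per_s(kb_per_s: float, bytes_per_kb: int = 1024) -> float:
    """Convert a rate in KB/s to Mbit/s (1 Mbit = 1,000,000 bits).

    bytes_per_kb defaults to 1024, which is an assumption about how the
    tool that produced the numbers defines a kilobyte.
    """
    return kb_per_s * bytes_per_kb * 8 / 1_000_000


# First row of the ifstat output above.
samples = {
    "bge1 out": 13672.80,
    "em0 in": 12701.09,
}

for name, rate in samples.items():
    print(f"{name}: {rate:.2f} KB/s = {kb_per_s_to_mbit_per_s(rate):.1f} Mbit/s")
```

With that assumption, a rate of about 13,500 KB/s works out to roughly 110 Mbit/s, so the box is passing slightly more than 100 Mbit/s in this test, still far below what a 1 Gbit link allows.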

I do not know what counts as a low-quality adapter. These are my adapters:

em0 at pci4 dev 1 function 0 "Intel PRO/1000MT (82545GM)" rev 0x04: apic 12 int 0 (irq 10), address 00:0e:0c:9c:07:13

bge0 at pci5 dev 0 function 0 "Broadcom BCM5721" rev 0x11, BCM5750 B1 (0x4101): apic 14 int 16 (irq 10) address 00:14:5e:0b:3e:ea

bge1 at pci6 dev 0 function 0 "Broadcom BCM5721" rev 0x11, BCM5750 B1 (0x4101): apic 14 int 16 (irq 10) address 00:14:5e:0b:3e:eb

vmstat 1
procs   memory        page                    disks     traps         cpu
r b w    avm    fre   flt  re  pi  po  fr  sr sd0 cd0  int   sys   cs us sy id
1 3 0  98944 719784    69   0   0   0   0   0   8   0 3152  1461  189  0 11 89
1 3 0  98944 719784    16   0   0   0   0   0   0   0 13199   952  103  0 22 78
0 3 0  98944 719784    11   0   0   0   0   0   0   0 13332  1010  127  0 31 69
0 3 0  98944 719784     7   0   0   0   0   0   0   0 13294   868  102  0 31 69
0 3 0  98944 719784     7   0   0   0   0   0   0   0 13420  1176  143  0 27 73
0 3 0  98944 719784     7   0   0   0   0   0   0   0 13390  1073  136  0 31 69
0 3 0  98944 719784     7   0   0   0   0   0   0   0 13409  1050   98  0 27 73
0 3 0  98944 719784    11   0   0   0   0   0   0   0 13277  1064  124  0 24 76
0 3 0  98952 719772     9   0   0   0   0   0   0   0 13359  1084  111  0 29 71

systat vmstat shows around 8k interrupts per interface.

pfctl -vvvs i
Status: Enabled for 4 days 22:13:53           Debug: Urgent

Hostid:   0xbd4a2e30
Checksum: 0x9c14bb48b1ac30341e7da19edc7c32d3

Interface Stats for bge0              IPv4             IPv6
  Bytes In                    345357698409          8546928
  Bytes Out                   419614550986          1373749
  Packets In
    Passed                       538779628            82385
    Blocked                        5625101                0
  Packets Out
    Passed                       589144477            14869
    Blocked                         671466                0

State Table                          Total             Rate
  current entries                    10477
  searches                      3232606557         7594.8/s
  inserts                         21975946           51.6/s
  removals                        21965469           51.6/s
Source Tracking Table
  current entries                        0
  searches                               0            0.0/s
  inserts                                0            0.0/s
  removals                               0            0.0/s
Counters
  match                         1307766710         3072.5/s
  bad-offset                             0            0.0/s
  fragment                            1030            0.0/s
  short                                349            0.0/s
  normalize                              0            0.0/s
  memory                                 0            0.0/s
  bad-timestamp                          0            0.0/s
  congestion                         25715            0.1/s
  ip-option                           4593            0.0/s
  proto-cksum                        23052            0.1/s
  state-mismatch                    534903            1.3/s
  state-insert                         199            0.0/s
  state-limit                            0            0.0/s
  src-limit                              0            0.0/s
  synproxy                         1718874            4.0/s
Limit Counters
  max states per rule                    0            0.0/s
  max-src-states                         0            0.0/s
  max-src-nodes                          0            0.0/s
  max-src-conn                           0            0.0/s
  max-src-conn-rate                      0            0.0/s
  overload table insertion               0            0.0/s
  overload flush states                  0            0.0/s
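
Since the advice in the quoted reply is to watch whether "congestion" keeps going up, one way to check is to save this output twice and compare the totals. The sketch below is an assumption-laden helper, not part of pfctl: it only parses the "Counters" section in the layout shown above, and the second snapshot is hypothetical data made up for illustration.

```python
import re

# A counter line looks like: "  congestion      25715      0.1/s"
COUNTER_RE = re.compile(r"^\s+([a-z][a-z \-]*?)\s+(\d+)\s+([\d.]+)/s\s*$")


def parse_counters(text: str) -> dict:
    """Return {counter name: total} for the 'Counters' section only."""
    counters = {}
    in_counters = False
    for line in text.splitlines():
        # Unindented lines are section headers such as "State Table" or "Counters".
        if line and not line[0].isspace():
            in_counters = line.strip() == "Counters"
            continue
        match = COUNTER_RE.match(line)
        if in_counters and match:
            counters[match.group(1)] = int(match.group(2))
    return counters


def counter_growth(before: str, after: str, name: str = "congestion") -> int:
    """Difference in one counter's total between two saved outputs."""
    return parse_counters(after).get(name, 0) - parse_counters(before).get(name, 0)


# Excerpt of the real output above.
SNAPSHOT_1 = """\
State Table                          Total             Rate
  searches                      3232606557         7594.8/s
Counters
  match                         1307766710         3072.5/s
  congestion                         25715            0.1/s
Limit Counters
  max states per rule                    0            0.0/s
"""

# Hypothetical later snapshot; these values are invented for the example.
SNAPSHOT_2 = SNAPSHOT_1.replace("25715", "25790")

print("congestion grew by", counter_growth(SNAPSHOT_1, SNAPSHOT_2))
```

A steadily growing difference between snapshots taken under load would point at the situation the reply warns about; a flat value would suggest looking elsewhere.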

Congestion of the internet link is not the problem, I know this. Our line
is a dedicated 1 Gbps full-duplex link. We are not using hubs :) but a
cisco 6550 ;) It has a little more intelligence than a hub. I hope :))



sysctl net.inet.ip.ifq.maxlen
net.inet.ip.ifq.maxlen=250
This was already set earlier; I found the value in some conference.

Thank you for any next suggestions!

Lukas Macura
UIT




