wg(4) crash

wg(4) crash

Stuart Henderson
Not a great report, but I don't have much more to go on: the machine had
ddb.panic=0 and ddb hung while printing the stack trace. Retyped by
hand, so it may contain typos. It happened a few hours after setting up
wg on it.

uvm_fault(0xffffffff82204e38, 0x20, 0, 1) -> e
fatal page fault in supervisor mode
trap type 6 code 0 rip ffffffff81752116 cs 8 rflags 10246 cr2 20 cpl 0 rsp 00023b35eb0
gsbase 0xffffffff820eaff0 kgsbase 0x0
panic: trap type 6, code=0, pc=ffffffff81752116
Starting stack trace...
panic(ffffffff81ddc97a) at panic+0x11d
kerntrap(ffff800023b35e00) at kerntrap+0x114
alltraps_kern_meltdown() at alltraps_kern_meltdown+0x7b
wg_index_drop(ffff8000012ae000,0) at wg_index_drop+0x96
noise_create_initiation(

OpenBSD 6.9-beta (GENERIC.MP) #383: Sun Mar  7 20:38:08 MST 2021
    [hidden email]:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 34295709696 (32706MB)
avail mem = 33240948736 (31701MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.7 @ 0xcf42c000 (99 entries)
bios0: vendor Dell Inc. version "2.9.0" date 12/06/2019
bios0: Dell Inc. PowerEdge R620
acpi0 at bios0: ACPI 3.0
acpi0: sleep states S0 S4 S5
acpi0: tables DSDT FACP APIC SPCR HPET DMAR MCFG WD__ SLIC ERST HEST BERT EINJ TCPA PC__ SRAT SSDT
acpi0: wakeup devices PCI0(S5) PCI1(S5)
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee00000: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz, 2900.44 MHz, 06-3e-04
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,PERF,ITSC,FSGSBASE,SMEP,ERMS,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
cpu0: apic clock running at 100MHz
cpu0: mwait min=64, max=64, C-substates=0.2.1.1, IBE
cpu1 at mainbus0: apid 32 (application processor)
cpu1: Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz, 1200.01 MHz, 06-3e-04
cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,PERF,ITSC,FSGSBASE,SMEP,ERMS,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu1: 256KB 64b/line 8-way L2 cache
cpu1: disabling user TSC (skew=135)
cpu1: smt 0, core 0, package 1
cpu2 at mainbus0: apid 2 (application processor)
cpu2: Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz, 2900.01 MHz, 06-3e-04
cpu2: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,PERF,ITSC,FSGSBASE,SMEP,ERMS,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu2: 256KB 64b/line 8-way L2 cache
cpu2: smt 0, core 1, package 0
cpu3 at mainbus0: apid 34 (application processor)
cpu3: Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz, 1200.00 MHz, 06-3e-04
cpu3: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,PERF,ITSC,FSGSBASE,SMEP,ERMS,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu3: 256KB 64b/line 8-way L2 cache
cpu3: smt 0, core 1, package 1
cpu4 at mainbus0: apid 4 (application processor)
cpu4: Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz, 2900.00 MHz, 06-3e-04
cpu4: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,PERF,ITSC,FSGSBASE,SMEP,ERMS,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu4: 256KB 64b/line 8-way L2 cache
cpu4: smt 0, core 2, package 0
cpu5 at mainbus0: apid 36 (application processor)
cpu5: Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz, 1200.00 MHz, 06-3e-04
cpu5: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,PERF,ITSC,FSGSBASE,SMEP,ERMS,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu5: 256KB 64b/line 8-way L2 cache
cpu5: disabling user TSC (skew=121)
cpu5: smt 0, core 2, package 1
cpu6 at mainbus0: apid 6 (application processor)
cpu6: Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz, 2900.02 MHz, 06-3e-04
cpu6: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,PERF,ITSC,FSGSBASE,SMEP,ERMS,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu6: 256KB 64b/line 8-way L2 cache
cpu6: smt 0, core 3, package 0
cpu7 at mainbus0: apid 38 (application processor)
cpu7: Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz, 2900.00 MHz, 06-3e-04
cpu7: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,PERF,ITSC,FSGSBASE,SMEP,ERMS,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu7: 256KB 64b/line 8-way L2 cache
cpu7: smt 0, core 3, package 1
cpu8 at mainbus0: apid 8 (application processor)
cpu8: Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz, 2900.00 MHz, 06-3e-04
cpu8: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,PERF,ITSC,FSGSBASE,SMEP,ERMS,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu8: 256KB 64b/line 8-way L2 cache
cpu8: smt 0, core 4, package 0
cpu9 at mainbus0: apid 40 (application processor)
cpu9: Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz, 2900.00 MHz, 06-3e-04
cpu9: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,PERF,ITSC,FSGSBASE,SMEP,ERMS,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu9: 256KB 64b/line 8-way L2 cache
cpu9: smt 0, core 4, package 1
cpu10 at mainbus0: apid 10 (application processor)
cpu10: Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz, 2900.01 MHz, 06-3e-04
cpu10: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,PERF,ITSC,FSGSBASE,SMEP,ERMS,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu10: 256KB 64b/line 8-way L2 cache
cpu10: smt 0, core 5, package 0
cpu11 at mainbus0: apid 42 (application processor)
cpu11: Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz, 2900.00 MHz, 06-3e-04
cpu11: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,PERF,ITSC,FSGSBASE,SMEP,ERMS,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu11: 256KB 64b/line 8-way L2 cache
cpu11: smt 0, core 5, package 1
ioapic0 at mainbus0: apid 0 pa 0xfec00000, version 20, 24 pins
ioapic1 at mainbus0: apid 1 pa 0xfec3f000, version 20, 24 pins, remapped
ioapic2 at mainbus0: apid 2 pa 0xfec7f000, version 20, 24 pins, remapped
acpihpet0 at acpi0: 14318179 Hz
acpimcfg0 at acpi0
acpimcfg0: addr 0xe0000000, bus 0-255
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus 1 (PEX1)
acpiprt2 at acpi0: bus -1 (PE1C)
acpiprt3 at acpi0: bus 3 (PEX2)
acpiprt4 at acpi0: bus 2 (PEX3)
acpiprt5 at acpi0: bus 4 (PEX4)
acpiprt6 at acpi0: bus -1 (PEX5)
acpiprt7 at acpi0: bus 7 (PEX6)
acpiprt8 at acpi0: bus -1 (PEX7)
acpiprt9 at acpi0: bus 64 (PCI1)
acpiprt10 at acpi0: bus 65 (PEXB)
acpiprt11 at acpi0: bus -1 (PEXC)
acpiprt12 at acpi0: bus 66 (PEXD)
acpiprt13 at acpi0: bus -1 (PEXE)
acpipci0 at acpi0 PCI0: 0x00000000 0x00000011 0x00000001
acpicmos0 at acpi0
acpipci1 at acpi0 PCI1: 0x00000000 0x00000011 0x00000001
acpipci2 at acpi0 P0B1: 0x00000000 0x00000011 0x00000001
acpipci3 at acpi0 P1B1: 0x00000000 0x00000011 0x00000001
"PNP0C14" at acpi0 not configured
acpicpu0 at acpi0: C2(350@41 mwait.3@0x20), C1(1000@1 mwait.1)
acpicpu1 at acpi0: C2(350@41 mwait.3@0x20), C1(1000@1 mwait.1)
acpicpu2 at acpi0: C2(350@41 mwait.3@0x20), C1(1000@1 mwait.1)
acpicpu3 at acpi0: C2(350@41 mwait.3@0x20), C1(1000@1 mwait.1)
acpicpu4 at acpi0: C2(350@41 mwait.3@0x20), C1(1000@1 mwait.1)
acpicpu5 at acpi0: C2(350@41 mwait.3@0x20), C1(1000@1 mwait.1)
acpicpu6 at acpi0: C2(350@41 mwait.3@0x20), C1(1000@1 mwait.1)
acpicpu7 at acpi0: C2(350@41 mwait.3@0x20), C1(1000@1 mwait.1)
acpicpu8 at acpi0: C2(350@41 mwait.3@0x20), C1(1000@1 mwait.1)
acpicpu9 at acpi0: C2(350@41 mwait.3@0x20), C1(1000@1 mwait.1)
acpicpu10 at acpi0: C2(350@41 mwait.3@0x20), C1(1000@1 mwait.1)
acpicpu11 at acpi0: C2(350@41 mwait.3@0x20), C1(1000@1 mwait.1)
ipmi at mainbus0 not configured
cpu0: using VERW MDS workaround (except on vmm entry)
pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 "Intel E5 v2 Host" rev 0x04
ppb0 at pci0 dev 1 function 0 "Intel E5 v2 PCIE" rev 0x04
pci1 at ppb0 bus 1
em0 at pci1 dev 0 function 0 "Intel I350" rev 0x01: msi, address bc:30:5b:ef:92:e0
em1 at pci1 dev 0 function 1 "Intel I350" rev 0x01: msi, address bc:30:5b:ef:92:e1
em2 at pci1 dev 0 function 2 "Intel I350" rev 0x01: msi, address bc:30:5b:ef:92:e2
em3 at pci1 dev 0 function 3 "Intel I350" rev 0x01: msi, address bc:30:5b:ef:92:e3
ppb1 at pci0 dev 2 function 0 "Intel E5 v2 PCIE" rev 0x04
pci2 at ppb1 bus 3
ppb2 at pci0 dev 2 function 2 "Intel E5 v2 PCIE" rev 0x04
pci3 at ppb2 bus 2
mfii0 at pci3 dev 0 function 0 "Symbios Logic MegaRAID SAS2208" rev 0x05: msi
mfii0: "PERC H710 Mini", firmware 21.3.5-0002, 512MB cache
scsibus1 at mfii0: 64 targets
sd0 at scsibus1 targ 0 lun 0: <DELL, PERC H710, 3.13> naa.6c81f660f041d8002739a89c03aba527
sd0: 571776MB, 512 bytes/sector, 1170997248 sectors
sd1 at scsibus1 targ 1 lun 0: <DELL, PERC H710, 3.13> naa.6c81f660f041d80027b1947708cc4624
sd1: 2860032MB, 512 bytes/sector, 5857345536 sectors
scsibus2 at mfii0: 256 targets
ppb3 at pci0 dev 3 function 0 "Intel E5 v2 PCIE" rev 0x04: msi
pci4 at ppb3 bus 4
"Intel E5 v2 Address Map" rev 0x04 at pci0 dev 5 function 0 not configured
"Intel E5 v2 IIO RAS" rev 0x04 at pci0 dev 5 function 2 not configured
ppb4 at pci0 dev 17 function 0 "Intel C600 Virtual PCIE" rev 0x05
pci5 at ppb4 bus 5
"Intel C600 MEI" rev 0x05 at pci0 dev 22 function 0 not configured
"Intel C600 MEI" rev 0x05 at pci0 dev 22 function 1 not configured
ehci0 at pci0 dev 26 function 0 "Intel C600 USB" rev 0x05: apic 0 int 23
usb0 at ehci0: USB revision 2.0
uhub0 at usb0 configuration 1 interface 0 "Intel EHCI root hub" rev 2.00/1.00 addr 1
ppb5 at pci0 dev 28 function 0 "Intel C600 PCIE" rev 0xb5
pci6 at ppb5 bus 6
ppb6 at pci0 dev 28 function 7 "Intel C600 PCIE" rev 0xb5
pci7 at ppb6 bus 7
ppb7 at pci7 dev 0 function 0 "Renesas SH7757 PCIE Switch" rev 0x00
pci8 at ppb7 bus 8
ppb8 at pci8 dev 0 function 0 "Renesas SH7757 PCIE Switch" rev 0x00
pci9 at ppb8 bus 9
ppb9 at pci9 dev 0 function 0 "Renesas SH7757 PCIE-PCI" rev 0x00
pci10 at ppb9 bus 10
vga1 at pci10 dev 0 function 0 "Matrox MGA G200eR" rev 0x00
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
ppb10 at pci8 dev 1 function 0 "Renesas SH7757 PCIE Switch" rev 0x00
pci11 at ppb10 bus 11
ehci1 at pci0 dev 29 function 0 "Intel C600 USB" rev 0x05: apic 0 int 22
usb1 at ehci1: USB revision 2.0
uhub1 at usb1 configuration 1 interface 0 "Intel EHCI root hub" rev 2.00/1.00 addr 1
ppb11 at pci0 dev 30 function 0 "Intel 82801BA Hub-to-PCI" rev 0xa5
pci12 at ppb11 bus 12
pcib0 at pci0 dev 31 function 0 "Intel C600 LPC" rev 0x05
ahci0 at pci0 dev 31 function 2 "Intel C600 AHCI" rev 0x05: msi, AHCI 1.3
scsibus3 at ahci0: 32 targets
isa0 at pcib0
isadma0 at isa0
com0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
com1 at isa0 port 0x2f8/8 irq 3: ns16550a, 16 byte fifo
pckbc0 at isa0 port 0x60/5 irq 1 irq 12
pcppi0 at isa0 port 0x61
spkr0 at pcppi0
pci13 at mainbus0 bus 64
ppb12 at pci13 dev 1 function 0 "Intel E5 v2 PCIE" rev 0x04: msi
pci14 at ppb12 bus 65
ppb13 at pci13 dev 3 function 0 "Intel E5 v2 PCIE" rev 0x04: msi
pci15 at ppb13 bus 66
"Intel E5 v2 Address Map" rev 0x04 at pci13 dev 5 function 0 not configured
"Intel E5 v2 IIO RAS" rev 0x04 at pci13 dev 5 function 2 not configured
pci16 at mainbus0 bus 63
"Intel E5 v2 QPI Link" rev 0x04 at pci16 dev 8 function 0 not configured
"Intel E5 v2 QPI Link" rev 0x04 at pci16 dev 9 function 0 not configured
"Intel E5 v2 PCU" rev 0x04 at pci16 dev 10 function 0 not configured
"Intel E5 v2 PCU" rev 0x04 at pci16 dev 10 function 1 not configured
"Intel E5 v2 PCU" rev 0x04 at pci16 dev 10 function 2 not configured
"Intel E5 v2 PCU" rev 0x04 at pci16 dev 10 function 3 not configured
"Intel E5 v2 UBOX" rev 0x04 at pci16 dev 11 function 0 not configured
"Intel E5 v2 UBOX" rev 0x04 at pci16 dev 11 function 3 not configured
"Intel E5 v2 Unicast" rev 0x04 at pci16 dev 12 function 0 not configured
"Intel E5 v2 Unicast" rev 0x04 at pci16 dev 12 function 1 not configured
"Intel E5 v2 Unicast" rev 0x04 at pci16 dev 12 function 2 not configured
"Intel E5 v2 Unicast" rev 0x04 at pci16 dev 13 function 0 not configured
"Intel E5 v2 Unicast" rev 0x04 at pci16 dev 13 function 1 not configured
"Intel E5 v2 Unicast" rev 0x04 at pci16 dev 13 function 2 not configured
"Intel E5 v2 Home Agent" rev 0x04 at pci16 dev 14 function 0 not configured
"Intel E5 v2 Home Agent" rev 0x04 at pci16 dev 14 function 1 not configured
"Intel E5 v2 TA" rev 0x04 at pci16 dev 15 function 0 not configured
"Intel E5 v2 RAS" rev 0x04 at pci16 dev 15 function 1 not configured
"Intel E5 v2 TAD" rev 0x04 at pci16 dev 15 function 2 not configured
"Intel E5 v2 TAD" rev 0x04 at pci16 dev 15 function 3 not configured
"Intel E5 v2 TAD" rev 0x04 at pci16 dev 15 function 4 not configured
"Intel E5 v2 TAD" rev 0x04 at pci16 dev 15 function 5 not configured
"Intel E5 v2 Thermal" rev 0x04 at pci16 dev 16 function 0 not configured
"Intel E5 v2 Thermal" rev 0x04 at pci16 dev 16 function 1 not configured
"Intel E5 v2 Error" rev 0x04 at pci16 dev 16 function 2 not configured
"Intel E5 v2 Error" rev 0x04 at pci16 dev 16 function 3 not configured
"Intel E5 v2 Thermal" rev 0x04 at pci16 dev 16 function 4 not configured
"Intel E5 v2 Thermal" rev 0x04 at pci16 dev 16 function 5 not configured
"Intel E5 v2 Error" rev 0x04 at pci16 dev 16 function 7 not configured
"Intel E5 v2 R2PCIE" rev 0x04 at pci16 dev 19 function 0 not configured
"Intel E5 v2 QPI Link Monitor" rev 0x04 at pci16 dev 19 function 1 not configured
"Intel E5 v2 QPI" rev 0x04 at pci16 dev 19 function 4 not configured
"Intel E5 v2 QPI Link Monitor" rev 0x04 at pci16 dev 19 function 5 not configured
"Intel E5 v2 SAD" rev 0x04 at pci16 dev 22 function 0 not configured
"Intel E5 v2 Broadcast" rev 0x04 at pci16 dev 22 function 1 not configured
"Intel E5 v2 Broadcast" rev 0x04 at pci16 dev 22 function 2 not configured
pci17 at mainbus0 bus 127
"Intel E5 v2 QPI Link" rev 0x04 at pci17 dev 8 function 0 not configured
"Intel E5 v2 QPI Link" rev 0x04 at pci17 dev 9 function 0 not configured
"Intel E5 v2 PCU" rev 0x04 at pci17 dev 10 function 0 not configured
"Intel E5 v2 PCU" rev 0x04 at pci17 dev 10 function 1 not configured
"Intel E5 v2 PCU" rev 0x04 at pci17 dev 10 function 2 not configured
"Intel E5 v2 PCU" rev 0x04 at pci17 dev 10 function 3 not configured
"Intel E5 v2 UBOX" rev 0x04 at pci17 dev 11 function 0 not configured
"Intel E5 v2 UBOX" rev 0x04 at pci17 dev 11 function 3 not configured
"Intel E5 v2 Unicast" rev 0x04 at pci17 dev 12 function 0 not configured
"Intel E5 v2 Unicast" rev 0x04 at pci17 dev 12 function 1 not configured
"Intel E5 v2 Unicast" rev 0x04 at pci17 dev 12 function 2 not configured
"Intel E5 v2 Unicast" rev 0x04 at pci17 dev 13 function 0 not configured
"Intel E5 v2 Unicast" rev 0x04 at pci17 dev 13 function 1 not configured
"Intel E5 v2 Unicast" rev 0x04 at pci17 dev 13 function 2 not configured
"Intel E5 v2 Home Agent" rev 0x04 at pci17 dev 14 function 0 not configured
"Intel E5 v2 Home Agent" rev 0x04 at pci17 dev 14 function 1 not configured
"Intel E5 v2 TA" rev 0x04 at pci17 dev 15 function 0 not configured
"Intel E5 v2 RAS" rev 0x04 at pci17 dev 15 function 1 not configured
"Intel E5 v2 TAD" rev 0x04 at pci17 dev 15 function 2 not configured
"Intel E5 v2 TAD" rev 0x04 at pci17 dev 15 function 3 not configured
"Intel E5 v2 TAD" rev 0x04 at pci17 dev 15 function 4 not configured
"Intel E5 v2 TAD" rev 0x04 at pci17 dev 15 function 5 not configured
"Intel E5 v2 Thermal" rev 0x04 at pci17 dev 16 function 0 not configured
"Intel E5 v2 Thermal" rev 0x04 at pci17 dev 16 function 1 not configured
"Intel E5 v2 Error" rev 0x04 at pci17 dev 16 function 2 not configured
"Intel E5 v2 Error" rev 0x04 at pci17 dev 16 function 3 not configured
"Intel E5 v2 Thermal" rev 0x04 at pci17 dev 16 function 4 not configured
"Intel E5 v2 Thermal" rev 0x04 at pci17 dev 16 function 5 not configured
"Intel E5 v2 Error" rev 0x04 at pci17 dev 16 function 7 not configured
"Intel E5 v2 R2PCIE" rev 0x04 at pci17 dev 19 function 0 not configured
"Intel E5 v2 QPI Link Monitor" rev 0x04 at pci17 dev 19 function 1 not configured
"Intel E5 v2 QPI" rev 0x04 at pci17 dev 19 function 4 not configured
"Intel E5 v2 QPI Link Monitor" rev 0x04 at pci17 dev 19 function 5 not configured
"Intel E5 v2 SAD" rev 0x04 at pci17 dev 22 function 0 not configured
"Intel E5 v2 Broadcast" rev 0x04 at pci17 dev 22 function 1 not configured
"Intel E5 v2 Broadcast" rev 0x04 at pci17 dev 22 function 2 not configured
vmm0 at mainbus0: VMX/EPT
uhub2 at uhub0 port 1 configuration 1 interface 0 "Intel Rate Matching Hub" rev 2.00/0.00 addr 2
uhub3 at uhub2 port 6 configuration 1 interface 0 "no manufacturer Gadget USB HUB" rev 2.00/0.00 addr 3
uhidev0 at uhub3 port 1 configuration 1 interface 0 "Avocent Keyboard/Mouse Function" rev 2.00/0.00 addr 4
uhidev0: iclass 3/1
ukbd0 at uhidev0: 8 variable keys, 6 key codes
wskbd0 at ukbd0: console keyboard, using wsdisplay0
uhidev1 at uhub3 port 1 configuration 1 interface 1 "Avocent Keyboard/Mouse Function" rev 2.00/0.00 addr 4
uhidev1: iclass 3/1
ums0 at uhidev1: 3 buttons, Z dir
wsmouse0 at ums0 mux 0
uhidev2 at uhub3 port 1 configuration 1 interface 2 "Avocent Keyboard/Mouse Function" rev 2.00/0.00 addr 4
uhidev2: iclass 3/1
ums1 at uhidev2: 3 buttons, Z dir
wsmouse1 at ums1 mux 0
uhub4 at uhub1 port 1 configuration 1 interface 0 "Intel Rate Matching Hub" rev 2.00/0.00 addr 2
vscsi0 at root
scsibus4 at vscsi0: 256 targets
softraid0 at root
scsibus5 at softraid0: 256 targets
root on sd0a (67a664a003047604.a) swap on sd0b dump on sd0b
ukbd0: was console keyboard
wskbd0 detached
ukbd0 detached
uhidev0 detached
wsmouse0 detached
ums0 detached
uhidev1 detached
wsmouse1 detached
ums1 detached
uhidev2 detached
WARNING: / was not properly unmounted
uhidev0 at uhub3 port 1 configuration 1 interface 0 "Avocent Keyboard/Mouse Function" rev 2.00/0.00 addr 4
uhidev0: iclass 3/1
ukbd0 at uhidev0: 8 variable keys, 6 key codes
wskbd0 at ukbd0: console keyboard, using wsdisplay0
uhidev1 at uhub3 port 1 configuration 1 interface 1 "Avocent Keyboard/Mouse Function" rev 2.00/0.00 addr 4
uhidev1: iclass 3/1
ums0 at uhidev1: 3 buttons, Z dir
wsmouse0 at ums0 mux 0
uhidev2 at uhub3 port 1 configuration 1 interface 2 "Avocent Keyboard/Mouse Function" rev 2.00/0.00 addr 4
uhidev2: iclass 3/1
ums1 at uhidev2: 3 buttons, Z dir
wsmouse1 at ums1 mux 0

Re: wg(4) crash

Martin Pieuchot
On 19/03/21(Fri) 20:15, Stuart Henderson wrote:

> Not a great report but I don't have much more to go on, machine had
> ddb.panic=0 and ddb hanged while printing the stack trace. Retyped by
> hand, may contain typos. Happened a few hours after setting up wg on it.
>
> uvm_fault(0xffffffff82204e38, 0x20, 0, 1) -> e
> fatal page fault in supervisor mode
> trap type 6 code 0 rip ffffffff81752116 cs 8 rflags 10246 cr2 20 cpl 0 rsp 00023b35eb0
> gsbase 0xffffffff820eaff0 kgsbase 0x0
> panic: trap type 6, code=0, pc=ffffffff81752116
> Starting stack trace...
> panic(ffffffff81ddc97a) at panic+0x11d
> kerntrap(ffff800023b35e00) at kerntrap+0x114
> alltraps_kern_meltdown() at alltraps_kern_meltdown+0x7b
> wg_index_drop(ffff8000012ae000,0) at wg_index_drop+0x96
> noise_create_initiation(

This is a NULL dereference at line 1981 of net/if_wg.c:

wg_index_drop(void *_sc, uint32_t key0)
{
        ...
        /* We expect a peer */
        peer = CONTAINER_OF(iter->i_value, struct wg_peer, p_remote);
        ...
}

Does that mean that `iter' is NULL and `i_value' is at offset 0x20 in that
struct?
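
To make the arithmetic behind that question concrete, here is a minimal sketch. The struct below is a hypothetical stand-in, not copied from if_wg.c: every field name other than i_value is an assumption, and so is the LP64 (amd64) layout. With a NULL base pointer, the address touched when reading a member is simply that member's offset, so a layout like this one would fault at 0x20, which is consistent with "cr2 20" in the trap line.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the index entry; NOT the real definition.
 * Three pointer-sized link fields, a 32-bit key, then the value pointer.
 */
struct demo_index {
	struct demo_index	 *i_next;	/* placeholder list linkage */
	struct demo_index	**i_prevp;	/* placeholder list linkage */
	struct demo_index	 *i_unused;	/* placeholder free-list linkage */
	uint32_t		  i_key;	/* followed by 4 bytes of padding on LP64 */
	void			 *i_value;
};

/* Address that would be read for iter->i_value when iter == NULL. */
static uintptr_t
demo_fault_address(void)
{
	return (uintptr_t)0 + offsetof(struct demo_index, i_value);
}
```

On a 64-bit target the three pointers occupy 24 bytes, the key and its padding another 8, so demo_fault_address() yields 0x20 for this assumed layout.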

Re: wg(4) crash

Stuart Henderson
Oh, let's cc Matt on this too.

On 2021/03/20 11:17, Martin Pieuchot wrote:

> On 19/03/21(Fri) 20:15, Stuart Henderson wrote:
> > Not a great report but I don't have much more to go on, machine had
> > ddb.panic=0 and ddb hanged while printing the stack trace. Retyped by
> > hand, may contain typos. Happened a few hours after setting up wg on it.
> >
> > uvm_fault(0xffffffff82204e38, 0x20, 0, 1) -> e
> > fatal page fault in supervisor mode
> > trap type 6 code 0 rip ffffffff81752116 cs 8 rflags 10246 cr2 20 cpl 0 rsp 00023b35eb0
> > gsbase 0xffffffff820eaff0 kgsbase 0x0
> > panic: trap type 6, code=0, pc=ffffffff81752116
> > Starting stack trace...
> > panic(ffffffff81ddc97a) at panic+0x11d
> > kerntrap(ffff800023b35e00) at kerntrap+0x114
> > alltraps_kern_meltdown() at alltraps_kern_meltdown+0x7b
> > wg_index_drop(ffff8000012ae000,0) at wg_index_drop+0x96
> > noise_create_initiation(
>
> This is a NULL dereference at line 1981 of net/if_wg.c:
>
> wg_index_drop(void *_sc, uint32_t key0)
> {
>         ...
>         /* We expect a peer */
>         peer = CONTAINER_OF(iter->i_value, struct wg_peer, p_remote);
>         ...
> }
>
> Does that mean that `iter' is NULL and `i_value' is at offset 0x20 in that
> struct?
>

Oh, I am an idiot: I had debug set, so the log has more than just the
standard messages around that time. Both sides are OpenBSD wg(4). I did not
have debug enabled on the other side.

[...]
18:51:08.041Z  wg2: Sending handshake initiation to peer 3
18:51:08.091Z  wg2: Receiving handshake initiation from peer 3
18:51:08.091Z  wg2: Sending handshake response to peer 3
18:51:08.091Z  wg2: Unknown handshake response
18:51:13.141Z  wg2: Receiving handshake initiation from peer 3
18:51:13.141Z  wg2: Sending handshake response to peer 3
18:51:13.191Z  wg2: Handshake for peer 3 did not complete after 5 seconds, retrying (try 2)
18:51:13.191Z  wg2: Receiving keepalive packet from peer 3
18:51:13.191Z  wg2: Sending keepalive packet to peer 3
18:52:28.242Z  wg2: Sending keepalive packet to peer 3
18:52:28.342Z  wg2: Receiving keepalive packet from peer 3
18:53:43.343Z  wg2: Sending keepalive packet to peer 3
18:54:58.345Z  wg2: Sending handshake initiation to peer 3
18:54:58.395Z  wg2: Receiving handshake initiation from peer 3
18:54:58.395Z  wg2: Sending handshake response to peer 3
18:54:58.395Z  wg2: Unknown handshake response
<syslog stops here, rest retyped>
wg2: Handshake for peer 3 did not complete after 5 seconds, retrying (try 2)
wg2: Sending handshake initiation to peer 3
wg2: Sending handshake response to peer 3
<null deref here>

Re: wg(4) crash

Matt Dunwoodie
On Sat, 20 Mar 2021 11:48:52 +0000
Stuart Henderson <[hidden email]> wrote:

> oh, let's cc Matt on this too.
>
> On 2021/03/20 11:17, Martin Pieuchot wrote:
> > On 19/03/21(Fri) 20:15, Stuart Henderson wrote:  
> > > Not a great report but I don't have much more to go on, machine
> > > had ddb.panic=0 and ddb hanged while printing the stack trace.
> > > Retyped by hand, may contain typos. Happened a few hours after
> > > setting up wg on it.
> > >
> > > uvm_fault(0xffffffff82204e38, 0x20, 0, 1) -> e
> > > fatal page fault in supervisor mode
> > > trap type 6 code 0 rip ffffffff81752116 cs 8 rflags 10246 cr2 20 cpl 0 rsp 00023b35eb0
> > > gsbase 0xffffffff820eaff0 kgsbase 0x0
> > > panic: trap type 6, code=0, pc=ffffffff81752116
> > > Starting stack trace...
> > > panic(ffffffff81ddc97a) at panic+0x11d
> > > kerntrap(ffff800023b35e00) at kerntrap+0x114
> > > alltraps_kern_meltdown() at alltraps_kern_meltdown+0x7b
> > > wg_index_drop(ffff8000012ae000,0) at wg_index_drop+0x96
> > > noise_create_initiation(  
> >
> > This is a NULL dereference at line 1981 of net/if_wg.c:
> >
> > wg_index_drop(void *_sc, uint32_t key0)
> > {
> >         ...
> >         /* We expect a peer */
> >         peer = CONTAINER_OF(iter->i_value, struct wg_peer, p_remote);
> >         ...
> > }
> >
> > Does that mean that `iter' is NULL and `i_value' is at offset 0x20 in
> > that struct?
> >  

Correct. The issue is we're trying to remove an index that doesn't
exist. wg_index_drop iterates through the list and expects to find a
matching index (perhaps a KASSERT could have been helpful here).
Nevertheless, since index 0 doesn't exist `iter` ends up being NULL.
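
As an illustration of that failure mode, the following is a small userland sketch of a drop routine that tolerates a missing key. It is not the if_wg.c code: the list type and the names demo_entry and demo_drop are made up for this example. In the kernel, a KASSERT on the lookup result would be the alternative way to catch the same condition early.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical singly linked entry, for illustration only. */
struct demo_entry {
	struct demo_entry	*next;
	uint32_t		 key;
	int			 value;
};

/*
 * Unlink the entry carrying `key`. Returns 0 on success and -1 when no
 * entry matches, instead of dereferencing a NULL iterator after the loop.
 */
static int
demo_drop(struct demo_entry **headp, uint32_t key)
{
	struct demo_entry **pp, *iter;

	for (pp = headp; (iter = *pp) != NULL; pp = &iter->next) {
		if (iter->key == key) {
			*pp = iter->next;	/* unlink the match */
			iter->next = NULL;
			return 0;
		}
	}
	return -1;	/* key not present, nothing to drop */
}
```

Called with a key that was never inserted (such as 0 in the report above), demo_drop() returns -1 and leaves the list untouched.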

> Oh, I am an idiot, I had debug set and there is something other than
> just standard messages around that time. Both sides are OpenBSD
> wg(4). I did not have debug on the other side.
>
> [...]
> 18:51:08.041Z  wg2: Sending handshake initiation to peer 3
> 18:51:08.091Z  wg2: Receiving handshake initiation from peer 3
> 18:51:08.091Z  wg2: Sending handshake response to peer 3
> 18:51:08.091Z  wg2: Unknown handshake response
> 18:51:13.141Z  wg2: Receiving handshake initiation from peer 3
> 18:51:13.141Z  wg2: Sending handshake response to peer 3
> 18:51:13.191Z  wg2: Handshake for peer 3 did not complete after 5 seconds, retrying (try 2)
> 18:51:13.191Z  wg2: Receiving keepalive packet from peer 3
> 18:51:13.191Z  wg2: Sending keepalive packet to peer 3
> 18:52:28.242Z  wg2: Sending keepalive packet to peer 3
> 18:52:28.342Z  wg2: Receiving keepalive packet from peer 3
> 18:53:43.343Z  wg2: Sending keepalive packet to peer 3
> 18:54:58.345Z  wg2: Sending handshake initiation to peer 3
> 18:54:58.395Z  wg2: Receiving handshake initiation from peer 3
> 18:54:58.395Z  wg2: Sending handshake response to peer 3
> 18:54:58.395Z  wg2: Unknown handshake response
> <syslog stops here, rest retyped>
> wg2: Handshake for peer 3 did not complete after 5 seconds, retrying (try 2)
> wg2: Sending handshake initiation to peer 3
> wg2: Sending handshake response to peer 3
> <null deref here>

With this information, it was possible to reproduce the issue on my
end. There is a race between sending/receiving handshake packets. This
occurs if we consume an initiation, then send an initiation prior to
replying to the consumed initiation.

In particular, when consuming an initiation, we don't generate the
index until creating the response (which is incorrect). If we attempt
to create an initiation between these two steps, we drop any
outstanding handshake, which in this case has index 0, as set when
consuming the initiation.

The fix attached is to generate the index when consuming the initiation
so that any spurious initiation creation can drop a valid index. The
patch also consolidates setting fields on the handshake.

With this patch applied, I was unable to reproduce the crash.

diff --git net/wg_noise.c net/wg_noise.c
index 86f7823cc83..176c36609fc 100644
--- net/wg_noise.c
+++ net/wg_noise.c
@@ -299,9 +299,6 @@ noise_consume_initiation(struct noise_local *l, struct noise_remote **rp,
     NOISE_TIMESTAMP_LEN + NOISE_AUTHTAG_LEN, key, hs.hs_hash) != 0)
  goto error;
 
- hs.hs_state = CONSUMED_INITIATION;
- hs.hs_local_index = 0;
- hs.hs_remote_index = s_idx;
  memcpy(hs.hs_e, ue, NOISE_PUBLIC_KEY_LEN);
 
  /* We have successfully computed the same results, now we ensure that
@@ -321,6 +318,9 @@ noise_consume_initiation(struct noise_local *l, struct noise_remote **rp,
 
  /* Ok, we're happy to accept this initiation now */
  noise_remote_handshake_index_drop(r);
+ hs.hs_state = CONSUMED_INITIATION;
+ hs.hs_local_index = noise_remote_handshake_index_get(r);
+ hs.hs_remote_index = s_idx;
  r->r_handshake = hs;
  *rp = r;
  ret = 0;
@@ -369,7 +369,6 @@ noise_create_response(struct noise_remote *r, uint32_t *s_idx, uint32_t *r_idx,
  noise_msg_encrypt(en, NULL, 0, key, hs->hs_hash);
 
  hs->hs_state = CREATED_RESPONSE;
- hs->hs_local_index = noise_remote_handshake_index_get(r);
  *r_idx = hs->hs_remote_index;
  *s_idx = hs->hs_local_index;
  ret = 0;
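
For readers following along, the ordering problem the patch addresses can be modelled in a few lines of userland C. This is only a toy sketch under stated assumptions: the table, the handshake struct and all demo_* names are hypothetical, there is no locking, and key wraparound is ignored. It contrasts leaving the local index at 0 when an initiation is consumed with allocating the index at that point.

```c
#include <stdint.h>

#define DEMO_SLOTS 4

struct demo_table {
	uint32_t keys[DEMO_SLOTS];	/* 0 marks a free slot */
	uint32_t next_key;		/* last key handed out */
};

struct demo_handshake {
	uint32_t local_index;		/* 0 means no index allocated */
};

/* Allocate a nonzero index; returns 0 if the table is full. */
static uint32_t
demo_index_get(struct demo_table *t)
{
	for (int i = 0; i < DEMO_SLOTS; i++) {
		if (t->keys[i] == 0) {
			t->keys[i] = ++t->next_key;
			return t->keys[i];
		}
	}
	return 0;
}

/* Remove `key`; returns 0 on success, -1 if it was never allocated. */
static int
demo_index_drop(struct demo_table *t, uint32_t key)
{
	if (key == 0)
		return -1;		/* 0 is never a valid index */
	for (int i = 0; i < DEMO_SLOTS; i++) {
		if (t->keys[i] == key) {
			t->keys[i] = 0;
			return 0;
		}
	}
	return -1;
}

/* Old ordering: consuming leaves the index at 0 until the response. */
static void
demo_consume_old(struct demo_table *t, struct demo_handshake *hs)
{
	(void)t;
	hs->local_index = 0;
}

/* New ordering: the index is allocated while consuming. */
static void
demo_consume_new(struct demo_table *t, struct demo_handshake *hs)
{
	hs->local_index = demo_index_get(t);
}

/* Creating an initiation first drops whatever index the handshake holds. */
static int
demo_create_initiation(struct demo_table *t, struct demo_handshake *hs)
{
	int dropped = demo_index_drop(t, hs->local_index);

	hs->local_index = demo_index_get(t);
	return dropped;		/* -1 models the lookup that found nothing */
}
```

In this model, demo_consume_old() followed by demo_create_initiation() reports a failed drop (the case that ended in the NULL dereference), while the same sequence with demo_consume_new() drops a valid index.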

Re: wg(4) crash

Klemens Nanni-2
On Mon, Mar 22, 2021 at 12:42:27AM +1100, Matt Dunwoodie wrote:

> On Sat, 20 Mar 2021 11:48:52 +0000
> Stuart Henderson <[hidden email]> wrote:
>
> > oh, let's cc Matt on this too.
> >
> > On 2021/03/20 11:17, Martin Pieuchot wrote:
> > > On 19/03/21(Fri) 20:15, Stuart Henderson wrote:  
> > > > Not a great report but I don't have much more to go on, machine
> > > > had ddb.panic=0 and ddb hanged while printing the stack trace.
> > > > Retyped by hand, may contain typos. Happened a few hours after
> > > > setting up wg on it.
> > > >
> > > > uvm_fault(0xffffffff82204e38, 0x20, 0, 1) -> e
> > > > fatal page fault in supervisor mode
> > > > trap type 6 code 0 rip ffffffff81752116 cs 8 rflags 10246 cr2 20 cpl 0 rsp 00023b35eb0
> > > > gsbase 0xffffffff820eaff0 kgsbase 0x0
> > > > panic: trap type 6, code=0, pc=ffffffff81752116
> > > > Starting stack trace...
> > > > panic(ffffffff81ddc97a) at panic+0x11d
> > > > kerntrap(ffff800023b35e00) at kerntrap+0x114
> > > > alltraps_kern_meltdown() at alltraps_kern_meltdown+0x7b
> > > > wg_index_drop(ffff8000012ae000,0) at wg_index_drop+0x96
> > > > noise_create_initiation(  
> > >
> > > This is a NULL dereference at line 1981 of net/if_wg.c:
> > >
> > > wg_index_drop(void *_sc, uint32_t key0)
> > > {
> > >         ...
> > >         /* We expect a peer */
> > >         peer = CONTAINER_OF(iter->i_value, struct wg_peer, p_remote);
> > >         ...
> > > }
> > >
> > > Does that mean that `iter' is NULL and `i_value' is at offset 0x20 in
> > > that struct?
> > >  
>
> Correct. The issue is we're trying to remove an index that doesn't
> exist. wg_index_drop iterates through the list and expects to find a
> matching index (perhaps a KASSERT could have been helpful here).
> Nevertheless, since index 0 doesn't exist `iter` ends up being NULL.
>
> > Oh, I am an idiot, I had debug set and there is something other than
> > just standard messages around that time. Both sides are OpenBSD
> > wg(4). I did not have debug on the other side.
> >
> > [...]
> > 18:51:08.041Z  wg2: Sending handshake initiation to peer 3
> > 18:51:08.091Z  wg2: Receiving handshake initiation from peer 3
> > 18:51:08.091Z  wg2: Sending handshake response to peer 3
> > 18:51:08.091Z  wg2: Unknown handshake response
> > 18:51:13.141Z  wg2: Receiving handshake initiation from peer 3
> > 18:51:13.141Z  wg2: Sending handshake response to peer 3
> > 18:51:13.191Z  wg2: Handshake for peer 3 did not complete after 5 seconds, retrying (try 2)
> > 18:51:13.191Z  wg2: Receiving keepalive packet from peer 3
> > 18:51:13.191Z  wg2: Sending keepalive packet to peer 3
> > 18:52:28.242Z  wg2: Sending keepalive packet to peer 3
> > 18:52:28.342Z  wg2: Receiving keepalive packet from peer 3
> > 18:53:43.343Z  wg2: Sending keepalive packet to peer 3
> > 18:54:58.345Z  wg2: Sending handshake initiation to peer 3
> > 18:54:58.395Z  wg2: Receiving handshake initiation from peer 3
> > 18:54:58.395Z  wg2: Sending handshake response to peer 3
> > 18:54:58.395Z  wg2: Unknown handshake response
> > <syslog stops here, rest retyped>
> > wg2: Handshake for peer 3 did not complete after 5 seconds, retrying (try 2)
> > wg2: Sending handshake initiation to peer 3
> > wg2: Sending handshake response to peer 3
> > <null deref here>
>
> With this information, it was possible to reproduce the issue on my
> end. There is a race between sending/receiving handshake packets. This
> occurs if we consume an initiation, then send an initiation prior to
> replying to the consumed initiation.
>
> In particular, when consuming an initiation, we don't generate the
> index until creating the response (which is incorrect). If we attempt
> to create an initiation between these processes, we drop any
> outstanding handshake which in this case has index 0 as set when
> consuming the initiation.
>
> The fix attached is to generate the index when consuming the initiation
> so that any spurious initiation creation can drop a valid index. The
> patch also consolidates setting fields on the handshake.
>
> With this patch applied, I was unable to reproduce the crash.
This looks good and works, OK kn

sthen, do you want to commit this fix?  I think it should make it into
6.9 release.

> diff --git net/wg_noise.c net/wg_noise.c
> index 86f7823cc83..176c36609fc 100644
> --- net/wg_noise.c
> +++ net/wg_noise.c
> @@ -299,9 +299,6 @@ noise_consume_initiation(struct noise_local *l, struct noise_remote **rp,
>      NOISE_TIMESTAMP_LEN + NOISE_AUTHTAG_LEN, key, hs.hs_hash) != 0)
>   goto error;
>  
> - hs.hs_state = CONSUMED_INITIATION;
> - hs.hs_local_index = 0;
> - hs.hs_remote_index = s_idx;
>   memcpy(hs.hs_e, ue, NOISE_PUBLIC_KEY_LEN);
>  
>   /* We have successfully computed the same results, now we ensure that
> @@ -321,6 +318,9 @@ noise_consume_initiation(struct noise_local *l, struct noise_remote **rp,
>  
>   /* Ok, we're happy to accept this initiation now */
>   noise_remote_handshake_index_drop(r);
> + hs.hs_state = CONSUMED_INITIATION;
> + hs.hs_local_index = noise_remote_handshake_index_get(r);
> + hs.hs_remote_index = s_idx;
>   r->r_handshake = hs;
>   *rp = r;
>   ret = 0;
> @@ -369,7 +369,6 @@ noise_create_response(struct noise_remote *r, uint32_t *s_idx, uint32_t *r_idx,
>   noise_msg_encrypt(en, NULL, 0, key, hs->hs_hash);
>  
>   hs->hs_state = CREATED_RESPONSE;
> - hs->hs_local_index = noise_remote_handshake_index_get(r);
>   *r_idx = hs->hs_remote_index;
>   *s_idx = hs->hs_local_index;
>   ret = 0;
>
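
For readers following along, the race Matt describes can be modelled in a few lines of userland C. This is only a sketch under stated assumptions, not the driver code: the table, struct handshake, index_get(), index_lookup(), index_drop(), consume_initiation_old(), consume_initiation_new() and create_initiation() are hypothetical stand-ins for the wg(4) index list and the noise_* functions touched by the diff. The one deliberate difference is that index_drop() here reports a missing key, where the kernel, per the trace above, dereferenced the NULL lookup result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 8

/* Hypothetical stand-in for one entry in the driver's index list. */
struct index_entry {
	uint32_t key;
	int	 in_use;
};

/* Hypothetical stand-in for the handshake state kept per peer. */
struct handshake {
	uint32_t local_index;
};

static struct index_entry table[TABLE_SIZE];
static uint32_t next_key = 1;	/* index 0 is never handed out */

/* Register and return a fresh non-zero index (0 if the table is full). */
uint32_t
index_get(void)
{
	for (int i = 0; i < TABLE_SIZE; i++) {
		if (!table[i].in_use) {
			table[i].in_use = 1;
			table[i].key = next_key++;
			assert(table[i].key != 0);
			return table[i].key;
		}
	}
	return 0;
}

/* Return the entry registered under key, or NULL if there is none. */
struct index_entry *
index_lookup(uint32_t key)
{
	for (int i = 0; i < TABLE_SIZE; i++) {
		if (table[i].in_use && table[i].key == key)
			return &table[i];
	}
	return NULL;
}

/*
 * Drop an index. Returns -1 when the key was never registered; this is
 * the case in which the kernel code used the NULL pointer.
 */
int
index_drop(uint32_t key)
{
	struct index_entry *e = index_lookup(key);

	if (e == NULL)
		return -1;
	e->in_use = 0;
	return 0;
}

/* Before the patch: consuming an initiation leaves the local index at 0. */
void
consume_initiation_old(struct handshake *hs)
{
	hs->local_index = 0;
}

/* After the patch: a real index is registered as soon as we consume. */
void
consume_initiation_new(struct handshake *hs)
{
	hs->local_index = index_get();
}

/* Creating an initiation first drops the outstanding handshake's index. */
int
create_initiation(struct handshake *hs)
{
	int rc = index_drop(hs->local_index);

	hs->local_index = index_get();
	return rc;
}
```

Calling consume_initiation_old() and then create_initiation() makes the drop miss, because index 0 was never registered. With consume_initiation_new() the same sequence drops a valid entry, which is the ordering the patch establishes.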


Re: wg(4) crash

Stuart Henderson
I committed this a couple of weeks ago.

--
  Sent from a phone, apologies for poor formatting.
On 8 April 2021 06:10:25 Klemens Nanni <[hidden email]> wrote:

> sthen, do you want to commit this fix?  I think it should make it into
> 6.9 release.


Re: wg(4) crash

Klemens Nanni-2
On Thu, Apr 08, 2021 at 08:09:29AM +0100, Stuart Henderson wrote:
> I committed this a couple of weeks ago.
I'm glad it's just me looking at the wrong file's CVS log...
good morning :)