[PATCH] fast conditional console scrolling

johnc
This causes the write-only framebuffer console to only redraw the
chars that differ between the start and end positions.

'time ls -R /usr/src/sys' is 3x faster with this, because most of
the characters stay the same after a scroll.

If this looks good, I can do the same thing for clear rows and copy/
clear columns, although I will need to make a test case for them.

It would probably be a good idea to change the rasops interface to
have generic block copy and clear operations, versus the current
full-column / full-row interface, so tmux and friends could get the
full acceleration.
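
To make the idea of generic block operations concrete, here is a
rough userland sketch of what a rectangle-based copy and clear over the
character-cell buffer could look like. None of this is existing rasops
API: `struct cell`, `struct cellrect`, `block_copy` and `block_clear`
are hypothetical names, and a real version would also have to drive the
framebuffer rather than only the backing store.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for struct wsdisplay_charcell. */
struct cell {
	unsigned int uc;
	unsigned int attr;
};

/* Hypothetical rectangle of character cells, in cell coordinates. */
struct cellrect {
	int row, col;		/* top-left corner */
	int nrows, ncols;	/* extent */
};

/*
 * Copy the rectangle src to the position (dstrow, dstcol) inside a
 * buffer that is cols cells wide.  Rows are visited in an order that
 * reads every overlapping source row before it is overwritten;
 * memmove() takes care of horizontal overlap within a row.
 */
void
block_copy(struct cell *bs, int cols, struct cellrect src,
    int dstrow, int dstcol)
{
	int i, r;

	assert(bs != NULL && cols > 0);
	for (i = 0; i < src.nrows; i++) {
		r = dstrow <= src.row ? i : src.nrows - 1 - i;
		memmove(&bs[(dstrow + r) * cols + dstcol],
		    &bs[(src.row + r) * cols + src.col],
		    src.ncols * sizeof(struct cell));
	}
}

/* Fill the rectangle with one cell value, e.g. a blank. */
void
block_clear(struct cell *bs, int cols, struct cellrect rect,
    struct cell fill)
{
	int r, c;

	assert(bs != NULL && cols > 0);
	for (r = rect.row; r < rect.row + rect.nrows; r++)
		for (c = rect.col; c < rect.col + rect.ncols; c++)
			bs[r * cols + c] = fill;
}
```

With something along these lines, the existing full-row and full-column
operations become special cases where the rectangle spans the whole
width or height of the screen.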

Index: rasops.c
===================================================================
RCS file: /cvs/src/sys/dev/rasops/rasops.c,v
retrieving revision 1.61
diff -u -p -r1.61 rasops.c
--- rasops.c 25 May 2020 09:55:49 -0000 1.61
+++ rasops.c 26 Jun 2020 04:14:13 -0000
@@ -1627,28 +1627,42 @@ rasops_vcons_copyrows(void *cookie, int
 	struct rasops_info *ri = scr->rs_ri;
 	int cols = ri->ri_cols;
 	int row, col, rc;
+	int srcofs;
+	int move;
 
+	/* update the scrollback buffer if the entire screen is moving */
 	if (dst == 0 && (src + num == ri->ri_rows) && scr->rs_sbscreens > 0)
 		memmove(&scr->rs_bs[dst], &scr->rs_bs[src * cols],
-		    ((ri->ri_rows * (scr->rs_sbscreens + 1) * cols) -
-		    (src * cols)) * sizeof(struct wsdisplay_charcell));
-	else
+		    ri->ri_rows * scr->rs_sbscreens * cols
+		    * sizeof(struct wsdisplay_charcell));
+
+	/* copy everything */
+	if ((ri->ri_flg & RI_WRONLY) == 0 || !scr->rs_visible) {
 		memmove(&scr->rs_bs[dst * cols + scr->rs_dispoffset],
-		    &scr->rs_bs[src * cols + scr->rs_dispoffset],
-		    num * cols * sizeof(struct wsdisplay_charcell));
+		    &scr->rs_bs[src * cols + scr->rs_dispoffset],
+		    num * cols * sizeof(struct wsdisplay_charcell));
 
-	if (!scr->rs_visible)
-		return 0;
+		if (!scr->rs_visible)
+			return 0;
 
-	if ((ri->ri_flg & RI_WRONLY) == 0)
 		return ri->ri_copyrows(ri, src, dst, num);
+	}
 
-	for (row = dst; row < dst + num; row++) {
+	/* smart update, only redraw characters that are different */
+	srcofs = (src - dst) * cols;
+
+	for (move = 0 ; move < num ; move++) {
+		row = srcofs > 0 ? dst + move : dst + num - 1 - move;
 		for (col = 0; col < cols; col++) {
 			int off = row * cols + col + scr->rs_dispoffset;
-
-			rc = ri->ri_putchar(ri, row, col,
-			    scr->rs_bs[off].uc, scr->rs_bs[off].attr);
+			int newc = scr->rs_bs[off+srcofs].uc;
+			int newa = scr->rs_bs[off+srcofs].attr;
+			if ( scr->rs_bs[off].uc == newc
+			    && scr->rs_bs[off].attr == newa )
+				continue;
+			scr->rs_bs[off].uc = newc;
+			scr->rs_bs[off].attr = newa;
+			rc = ri->ri_putchar(ri, row, col, newc, newa);
 			if (rc != 0)
 				return rc;
 		}
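
The redraw-only-what-changed loop can also be tried outside the kernel.
The following is a minimal userland sketch of the same idea, not the
rasops code itself: `struct cell`, `copyrows_diff` and the `count_put`
callback are hypothetical names standing in for
`struct wsdisplay_charcell`, the write-only branch of
`rasops_vcons_copyrows` and `ri_putchar`, and the display offset is
assumed to be zero.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct wsdisplay_charcell. */
struct cell {
	unsigned int uc;
	unsigned int attr;
};

/* Hypothetical draw callback, playing the role of ri_putchar. */
typedef int (*putchar_fn)(void *ctx, int row, int col,
    unsigned int uc, unsigned int attr);

/* Example callback: counts how many cells would be redrawn. */
int
count_put(void *ctx, int row, int col, unsigned int uc, unsigned int attr)
{
	(void)row; (void)col; (void)uc; (void)attr;
	++*(int *)ctx;
	return 0;
}

/*
 * Move num rows from row src to row dst in a buffer that is cols cells
 * wide, updating the buffer and calling put() only for cells whose
 * character or attribute actually changes.  Returns 0, or the first
 * non-zero value returned by put().
 */
int
copyrows_diff(struct cell *bs, int cols, int src, int dst, int num,
    putchar_fn put, void *ctx)
{
	int srcofs = (src - dst) * cols;
	int move, row, col, rc;

	assert(bs != NULL && cols > 0 && num >= 0);
	for (move = 0; move < num; move++) {
		/*
		 * Scrolling up walks top to bottom, scrolling down walks
		 * bottom to top, so a source row is always read before
		 * it is overwritten.
		 */
		row = srcofs > 0 ? dst + move : dst + num - 1 - move;
		for (col = 0; col < cols; col++) {
			struct cell *d = &bs[row * cols + col];
			const struct cell *s = d + srcofs;

			if (d->uc == s->uc && d->attr == s->attr)
				continue;	/* already correct on screen */
			*d = *s;
			rc = put(ctx, row, col, d->uc, d->attr);
			if (rc != 0)
				return rc;
		}
	}
	return 0;
}
```

As a small illustration, scrolling a 3-row by 2-column buffer holding
"ab", "ac", "xy" up by one row touches four destination cells, but the
first one already holds 'a', so the callback runs three times.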



Re: [PATCH] fast conditional console scrolling

Paul de Weerd
Hi John,

I tried your diff.  I don't quite see the same 3x improvement that you
report, more like 2x.  I timed 7 runs of ls -R /usr/ports:

Before diff, time ls -R /usr/ports | wc -l 2.897s on average
After diff,  time ls -R /usr/ports | wc -l 2.707s on average

Before diff, time ls -R /usr/ports 2m53.067 on average
After diff, time ls -R /usr/ports 1m30.387 on average

Note that the 'before diff' runs were with a snapshot kernel.  There
may be diffs in there that account for the difference between before
and after of the no-output runs.  See dmesg and full stats below.

So, on average, a speed-up of ~48%.
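
For reference, the arithmetic behind that figure can be checked with a
few lines of C. This is only a sketch using the two with-output
averages quoted above (2m53.067 and 1m30.387, converted to seconds);
the helper names `speedup_factor` and `time_saved_pct` are made up for
the example.

```c
#include <assert.h>

/* How many times faster the "after" run is than the "before" run. */
double
speedup_factor(double before_s, double after_s)
{
	assert(before_s > 0.0 && after_s > 0.0);
	return before_s / after_s;
}

/* Percentage of wall-clock time saved by the "after" run. */
double
time_saved_pct(double before_s, double after_s)
{
	assert(before_s > 0.0 && after_s > 0.0);
	return 100.0 * (before_s - after_s) / before_s;
}

/*
 * With before = 173.067 s and after = 90.387 s:
 *   speedup_factor() is roughly 1.91
 *   time_saved_pct() is roughly 47.8
 */
```

Read this way, the ~48% is the share of wall-clock time saved, which is
the same thing as a run that is a little over 1.9 times faster.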

Thanks!

Paul 'WEiRD' de Weerd

--- full stats -------------------------------------------------------
pre-diff, no output             post-diff, no output
real    user    system          real    user    system
02.94   00.58   02.40           02.70   00.58   02.12
02.88   00.56   02.37           02.71   00.39   02.32
03.03   00.46   02.60           02.70   00.43   02.26
02.85   00.52   02.36           02.69   00.54   02.18
02.88   00.45   02.43           02.62   00.53   02.10
02.87   00.50   02.38           02.72   00.62   02.11
02.83   00.57   02.29           02.81   00.45   02.36

pre-diff, with output           post-diff, with output
real    user    system          real    user    system
2m53.17 00.90   2m52.27         1m30.81 01.23   1m29.50
2m53.12 00.81   2m52.31         1m30.58 01.33   1m29.30
2m53.01 00.88   2m52.11         1m30.49 01.11   1m29.40
2m53.06 01.03   2m52.00         1m30.53 01.29   1m29.26
2m52.99 00.80   2m52.24         1m30.27 01.08   1m29.19
2m53.11 00.96   2m52.16         1m30.40 01.14   1m29.27
2m53.01 00.79   2m52.28         1m30.33 01.11   1m29.24
----------------------------------------------------------------------

--- dmesg ------------------------------------------------------------
OpenBSD 6.7-current (GENERIC.MP) #296: Wed Jun 24 11:34:44 MDT 2020
    [hidden email]:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 34243903488 (32657MB)
avail mem = 33191059456 (31653MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.7 @ 0xec410 (88 entries)
bios0: vendor Dell Inc. version "A22" date 02/01/2018
bios0: Dell Inc. OptiPlex 9020
acpi0 at bios0: ACPI 5.0
acpi0: sleep states S0 S3 S4 S5
acpi0: tables DSDT FACP APIC FPDT SLIC LPIT SSDT SSDT SSDT HPET SSDT MCFG SSDT ASF! DMAR
acpi0: wakeup devices UAR1(S3) RP01(S4) PXSX(S4) PXSX(S4) PXSX(S4) RP05(S4) PXSX(S4) PXSX(S4) PXSX(S4) PXSX(S4) GLAN(S4) EHC1(S3) EHC2(S3) XHC_(S4) HDEF(S4) PEG0(S4) [...]
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee00000: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz, 3692.06 MHz, 06-3c-03
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: TSC skew=0 observed drift=0
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
cpu0: apic clock running at 99MHz
cpu0: mwait min=64, max=64, C-substates=0.2.1.2.4, IBE
cpu1 at mainbus0: apid 2 (application processor)
cpu1: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz, 3691.46 MHz, 06-3c-03
cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu1: 256KB 64b/line 8-way L2 cache
cpu1: TSC skew=1 observed drift=0
cpu1: smt 0, core 1, package 0
cpu2 at mainbus0: apid 4 (application processor)
cpu2: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz, 3691.46 MHz, 06-3c-03
cpu2: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu2: 256KB 64b/line 8-way L2 cache
cpu2: TSC skew=12 observed drift=0
cpu2: smt 0, core 2, package 0
cpu3 at mainbus0: apid 6 (application processor)
cpu3: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz, 3691.46 MHz, 06-3c-03
cpu3: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu3: 256KB 64b/line 8-way L2 cache
cpu3: TSC skew=-1 observed drift=0
cpu3: smt 0, core 3, package 0
ioapic0 at mainbus0: apid 8 pa 0xfec00000, version 20, 24 pins
acpihpet0 at acpi0: 14318179 Hz
acpimcfg0 at acpi0
acpimcfg0: addr 0xf8000000, bus 0-63
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus 1 (RP01)
acpiprt2 at acpi0: bus 2 (RP05)
acpiprt3 at acpi0: bus -1 (PEG0)
acpiprt4 at acpi0: bus -1 (PEG1)
acpiprt5 at acpi0: bus -1 (PEG2)
acpiec0 at acpi0: not present
acpicpu0 at acpi0: C2(200@148 mwait.1@0x33), C1(1000@1 mwait.1), PSS
acpicpu1 at acpi0: C2(200@148 mwait.1@0x33), C1(1000@1 mwait.1), PSS
acpicpu2 at acpi0: C2(200@148 mwait.1@0x33), C1(1000@1 mwait.1), PSS
acpicpu3 at acpi0: C2(200@148 mwait.1@0x33), C1(1000@1 mwait.1), PSS
acpitz0 at acpi0: critical temperature is 105 degC
acpitz1 at acpi0: critical temperature is 105 degC
acpipci0 at acpi0 PCI0: 0x00000010 0x00000011 0x00000000
extent `acpipci0 pcibus' (0x0 - 0xff), flags=0
     0x3f - 0xff
extent `acpipci0 pciio' (0x0 - 0xffffffff), flags=0
     0xcf8 - 0xcff
     0x10000 - 0xffffffff
extent `acpipci0 pcimem' (0x0 - 0xffffffffffffffff), flags=0
     0x0 - 0x9ffff
     0xc0000 - 0xd3fff
     0xe8000 - 0xdf1fffff
     0xfeb00000 - 0xffffffffffffffff
acpicmos0 at acpi0
acpibtn0 at acpi0: PWRB
"PNP0C14" at acpi0 not configured
acpivideo0 at acpi0: GFX0
acpivout0 at acpivideo0: DD1F
cpu0: using VERW MDS workaround (except on vmm entry)
cpu0: Enhanced SpeedStep 3692 MHz: speeds: 3401, 3400, 3200, 3000, 2800, 2700, 2500, 2300, 2100, 1900, 1700, 1500, 1400, 1200, 1000, 800 MHz
pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 "Intel Core 4G Host" rev 0x06
inteldrm0 at pci0 dev 2 function 0 "Intel HD Graphics 4600" rev 0x06
drm0 at inteldrm0
inteldrm0: msi, HASWELL, gen 7
azalia0 at pci0 dev 3 function 0 "Intel Core 4G HD Audio" rev 0x06: msi
azalia0: No codecs found
xhci0 at pci0 dev 20 function 0 "Intel 8 Series xHCI" rev 0x04: msi, xHCI 1.0
usb0 at xhci0: USB revision 3.0
uhub0 at usb0 configuration 1 interface 0 "Intel xHCI root hub" rev 3.00/1.00 addr 1
"Intel 8 Series MEI" rev 0x04 at pci0 dev 22 function 0 not configured
puc0 at pci0 dev 22 function 3 "Intel 8 Series KT" rev 0x04: ports: 16 com
com4 at puc0 port 0 apic 8 int 19: ns16550a, 16 byte fifo
com4: probed fifo depth: 0 bytes
em0 at pci0 dev 25 function 0 "Intel I217-LM" rev 0x04: msi, address b8:ca:3a:93:03:e8
ehci0 at pci0 dev 26 function 0 "Intel 8 Series USB" rev 0x04: apic 8 int 16
usb1 at ehci0: USB revision 2.0
uhub1 at usb1 configuration 1 interface 0 "Intel EHCI root hub" rev 2.00/1.00 addr 1
azalia1 at pci0 dev 27 function 0 "Intel 8 Series HD Audio" rev 0x04: msi
azalia1: codecs: Realtek/0x0280
audio0 at azalia1
ppb0 at pci0 dev 28 function 0 "Intel 8 Series PCIE" rev 0xd4
pci1 at ppb0 bus 1
ppb1 at pci0 dev 28 function 4 "Intel 8 Series PCIE" rev 0xd4: msi
pci2 at ppb1 bus 2
ahci0 at pci2 dev 0 function 0 "Marvell 88SE9128 AHCI" rev 0x20: msi, AHCI 1.2
ahci0: port 7: 1.5Gb/s
scsibus1 at ahci0: 32 targets
uk0 at scsibus1 targ 7 lun 0: <Marvell, 91xx Config, 1.01>
ehci1 at pci0 dev 29 function 0 "Intel 8 Series USB" rev 0x04: apic 8 int 23
usb2 at ehci1: USB revision 2.0
uhub2 at usb2 configuration 1 interface 0 "Intel EHCI root hub" rev 2.00/1.00 addr 1
pcib0 at pci0 dev 31 function 0 "Intel Q87 LPC" rev 0x04
ahci1 at pci0 dev 31 function 2 "Intel 8 Series AHCI" rev 0x04: msi, AHCI 1.3
ahci1: port 0: 6.0Gb/s
ahci1: port 1: 1.5Gb/s
ahci1: port 2: 6.0Gb/s
scsibus2 at ahci1: 32 targets
sd0 at scsibus2 targ 0 lun 0: <ATA, Samsung SSD 850, EMT0> naa.5002538d41b86e4d
sd0: 953869MB, 512 bytes/sector, 1953525168 sectors, thin
cd0 at scsibus2 targ 1 lun 0: <HL-DT-ST, DVD+-RW GT80N, A103> removable
sd1 at scsibus2 targ 2 lun 0: <ATA, WDC WD60EFRX-68L, 82.0> naa.50014ee2b81f4111
sd1: 5723166MB, 512 bytes/sector, 11721045168 sectors
ichiic0 at pci0 dev 31 function 3 "Intel 8 Series SMBus" rev 0x04: apic 8 int 18
iic0 at ichiic0
spdmem0 at iic0 addr 0x50: 8GB DDR3 SDRAM PC3-12800
spdmem1 at iic0 addr 0x51: 8GB DDR3 SDRAM PC3-12800
spdmem2 at iic0 addr 0x52: 8GB DDR3 SDRAM PC3-12800
spdmem3 at iic0 addr 0x53: 8GB DDR3 SDRAM PC3-12800
isa0 at pcib0
isadma0 at isa0
com0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
pckbc0 at isa0 port 0x60/5 irq 1 irq 12
pckbd0 at pckbc0 (kbd slot)
wskbd0 at pckbd0: console keyboard
pcppi0 at isa0 port 0x61
spkr0 at pcppi0
vmm0 at mainbus0: VMX/EPT
uhidev0 at uhub0 port 3 configuration 1 interface 0 "RDing TEMPERHUM1V1.2" rev 2.00/0.01 addr 2
uhidev0: iclass 3/1, 1 report id
ukbd0 at uhidev0 reportid 1: 8 variable keys, 5 key codes
wskbd1 at ukbd0 mux 1
uhidev1 at uhub0 port 3 configuration 1 interface 1 "RDing TEMPERHUM1V1.2" rev 2.00/0.01 addr 2
uhidev1: iclass 3/1
ugold0 at uhidev1
uhub3 at uhub0 port 4 configuration 1 interface 0 "VIA Labs, Inc. USB2.0 Hub" rev 2.10/d.a0 addr 3
uhidev2 at uhub3 port 4 configuration 1 interface 0 "Metadot - Das Keyboard Das Keyboard" rev 2.00/1.00 addr 4
uhidev2: iclass 3/1
ukbd1 at uhidev2: 8 variable keys, 6 key codes
wskbd2 at ukbd1 mux 1
uhidev3 at uhub3 port 4 configuration 1 interface 1 "Metadot - Das Keyboard Das Keyboard" rev 2.00/1.00 addr 4
uhidev3: iclass 3/0, 3 report ids
uhid0 at uhidev3 reportid 1: input=0, output=0, feature=7
uhid1 at uhidev3 reportid 2: input=1, output=0, feature=0
uhid2 at uhidev3 reportid 3: input=3, output=0, feature=0
uhub4 at uhub0 port 5 configuration 1 interface 0 "Texas Instruments product 0x8044" rev 2.10/1.00 addr 5
uhub5 at uhub0 port 6 configuration 1 interface 0 "Texas Instruments product 0x8044" rev 2.10/1.00 addr 6
uvideo0 at uhub0 port 9 configuration 1 interface 0 "Logitech QuickCam Pro 9000" rev 2.00/0.08 addr 7
video0 at uvideo0
uaudio0 at uhub0 port 9 configuration 1 interface 3 "Logitech QuickCam Pro 9000" rev 2.00/0.08 addr 7
uaudio0: class v1, high-speed, sync, channels: 0 play, 1 rec, 2 ctls
audio1 at uaudio0
uhidev4 at uhub0 port 10 configuration 1 interface 0 "Logitech MX518 Gaming Mouse" rev 2.00/40.00 addr 8
uhidev4: iclass 3/1
ums0 at uhidev4: 16 buttons, Z and W dir
wsmouse0 at ums0 mux 0
uhidev5 at uhub0 port 10 configuration 1 interface 1 "Logitech MX518 Gaming Mouse" rev 2.00/40.00 addr 8
uhidev5: iclass 3/0, 17 report ids
ukbd2 at uhidev5 reportid 1: 8 variable keys, 6 key codes
wskbd3 at ukbd2 mux 1
uhid3 at uhidev5 reportid 3: input=4, output=0, feature=0
uhid4 at uhidev5 reportid 4: input=1, output=0, feature=0
uhid5 at uhidev5 reportid 16: input=6, output=6, feature=0
uhid6 at uhidev5 reportid 17: input=19, output=19, feature=0
ugold0: 2 sensors type si7006 (temperature and humidity)
uhub6 at uhub0 port 21 configuration 1 interface 0 "VIA Labs, Inc. USB3.0 Hub" rev 3.00/d.a1 addr 9
uhub7 at uhub1 port 1 configuration 1 interface 0 "Intel Rate Matching Hub" rev 2.00/0.04 addr 2
uhub8 at uhub2 port 1 configuration 1 interface 0 "Intel Rate Matching Hub" rev 2.00/0.04 addr 2
vscsi0 at root
scsibus3 at vscsi0: 256 targets
softraid0 at root
scsibus4 at softraid0: 256 targets
softraid0: sd2 was not shutdown properly
sd2 at scsibus4 targ 1 lun 0: <OPENBSD, SR CRYPTO, 006>
sd2: 953866MB, 512 bytes/sector, 1953519473 sectors
root on sd2a (a0b80508b6693ba1.a) swap on sd2b dump on sd2b
WARNING: / was not properly unmounted
inteldrm0: 1920x1080, 32bpp
wsdisplay0 at inteldrm0 mux 1: console (std, vt100 emulation), using wskbd0
wskbd1: connecting to wsdisplay0
wskbd2: connecting to wsdisplay0
wskbd3: connecting to wsdisplay0
wsdisplay0: screen 1-5 added (std, vt100 emulation)
wskbd1: disconnecting from wsdisplay0
wskbd1 detached
ukbd0 detached
uhidev0 detached
ugold0 detached
uhidev1 detached
uhidev0 at uhub0 port 3 configuration 1 interface 0 "RDing TEMPERHUM1V1.2" rev 2.00/0.01 addr 2
uhidev0: iclass 3/1, 1 report id
ukbd0 at uhidev0 reportid 1: 8 variable keys, 5 key codes
wskbd1 at ukbd0: console keyboard, using wsdisplay0
uhidev1 at uhub0 port 3 configuration 1 interface 1 "RDing TEMPERHUM1V1.2" rev 2.00/0.01 addr 2
uhidev1: iclass 3/1
ugold0 at uhidev1
ugold0: 2 sensors type si7006 (temperature and humidity)
syncing disks... done
rebooting...
OpenBSD 6.7-current (GENERIC.MP) #1: Fri Jun 26 09:51:07 CEST 2020
    [hidden email]:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 34243903488 (32657MB)
avail mem = 33191059456 (31653MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.7 @ 0xec410 (88 entries)
bios0: vendor Dell Inc. version "A22" date 02/01/2018
bios0: Dell Inc. OptiPlex 9020
acpi0 at bios0: ACPI 5.0
acpi0: sleep states S0 S3 S4 S5
acpi0: tables DSDT FACP APIC FPDT SLIC LPIT SSDT SSDT SSDT HPET SSDT MCFG SSDT ASF! DMAR
acpi0: wakeup devices UAR1(S3) RP01(S4) PXSX(S4) PXSX(S4) PXSX(S4) RP05(S4) PXSX(S4) PXSX(S4) PXSX(S4) PXSX(S4) GLAN(S4) EHC1(S3) EHC2(S3) XHC_(S4) HDEF(S4) PEG0(S4) [...]
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee00000: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz, 3692.22 MHz, 06-3c-03
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
cpu0: apic clock running at 99MHz
cpu0: mwait min=64, max=64, C-substates=0.2.1.2.4, IBE
cpu1 at mainbus0: apid 2 (application processor)
cpu1: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz, 3691.47 MHz, 06-3c-03
cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu1: 256KB 64b/line 8-way L2 cache
cpu1: smt 0, core 1, package 0
cpu2 at mainbus0: apid 4 (application processor)
cpu2: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz, 3691.47 MHz, 06-3c-03
cpu2: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu2: 256KB 64b/line 8-way L2 cache
cpu2: smt 0, core 2, package 0
cpu3 at mainbus0: apid 6 (application processor)
cpu3: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz, 3691.47 MHz, 06-3c-03
cpu3: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu3: 256KB 64b/line 8-way L2 cache
cpu3: smt 0, core 3, package 0
ioapic0 at mainbus0: apid 8 pa 0xfec00000, version 20, 24 pins
acpihpet0 at acpi0: 14318179 Hz
acpimcfg0 at acpi0
acpimcfg0: addr 0xf8000000, bus 0-63
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus 1 (RP01)
acpiprt2 at acpi0: bus 2 (RP05)
acpiprt3 at acpi0: bus -1 (PEG0)
acpiprt4 at acpi0: bus -1 (PEG1)
acpiprt5 at acpi0: bus -1 (PEG2)
acpiec0 at acpi0: not present
acpicpu0 at acpi0: C2(200@148 mwait.1@0x33), C1(1000@1 mwait.1), PSS
acpicpu1 at acpi0: C2(200@148 mwait.1@0x33), C1(1000@1 mwait.1), PSS
acpicpu2 at acpi0: C2(200@148 mwait.1@0x33), C1(1000@1 mwait.1), PSS
acpicpu3 at acpi0: C2(200@148 mwait.1@0x33), C1(1000@1 mwait.1), PSS
acpitz0 at acpi0: critical temperature is 105 degC
acpitz1 at acpi0: critical temperature is 105 degC
acpipci0 at acpi0 PCI0: 0x00000010 0x00000011 0x00000000
extent `acpipci0 pcibus' (0x0 - 0xff), flags=0
     0x3f - 0xff
extent `acpipci0 pciio' (0x0 - 0xffffffff), flags=0
     0xcf8 - 0xcff
     0x10000 - 0xffffffff
extent `acpipci0 pcimem' (0x0 - 0xffffffffffffffff), flags=0
     0x0 - 0x9ffff
     0xc0000 - 0xd3fff
     0xe8000 - 0xdf1fffff
     0xfeb00000 - 0xffffffffffffffff
acpicmos0 at acpi0
acpibtn0 at acpi0: PWRB
"PNP0C14" at acpi0 not configured
acpivideo0 at acpi0: GFX0
acpivout0 at acpivideo0: DD1F
cpu0: using VERW MDS workaround (except on vmm entry)
cpu0: Enhanced SpeedStep 3692 MHz: speeds: 3401, 3400, 3200, 3000, 2800, 2700, 2500, 2300, 2100, 1900, 1700, 1500, 1400, 1200, 1000, 800 MHz
pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 "Intel Core 4G Host" rev 0x06
inteldrm0 at pci0 dev 2 function 0 "Intel HD Graphics 4600" rev 0x06
drm0 at inteldrm0
inteldrm0: msi, HASWELL, gen 7
azalia0 at pci0 dev 3 function 0 "Intel Core 4G HD Audio" rev 0x06: msi
azalia0: No codecs found
xhci0 at pci0 dev 20 function 0 "Intel 8 Series xHCI" rev 0x04: msi, xHCI 1.0
usb0 at xhci0: USB revision 3.0
uhub0 at usb0 configuration 1 interface 0 "Intel xHCI root hub" rev 3.00/1.00 addr 1
"Intel 8 Series MEI" rev 0x04 at pci0 dev 22 function 0 not configured
puc0 at pci0 dev 22 function 3 "Intel 8 Series KT" rev 0x04: ports: 16 com
com4 at puc0 port 0 apic 8 int 19: ns16550a, 16 byte fifo
com4: probed fifo depth: 0 bytes
em0 at pci0 dev 25 function 0 "Intel I217-LM" rev 0x04: msi, address b8:ca:3a:93:03:e8
ehci0 at pci0 dev 26 function 0 "Intel 8 Series USB" rev 0x04: apic 8 int 16
usb1 at ehci0: USB revision 2.0
uhub1 at usb1 configuration 1 interface 0 "Intel EHCI root hub" rev 2.00/1.00 addr 1
azalia1 at pci0 dev 27 function 0 "Intel 8 Series HD Audio" rev 0x04: msi
azalia1: codecs: Realtek/0x0280
audio0 at azalia1
ppb0 at pci0 dev 28 function 0 "Intel 8 Series PCIE" rev 0xd4
pci1 at ppb0 bus 1
ppb1 at pci0 dev 28 function 4 "Intel 8 Series PCIE" rev 0xd4: msi
pci2 at ppb1 bus 2
ehci1 at pci0 dev 29 function 0 "Intel 8 Series USB" rev 0x04: apic 8 int 23
usb2 at ehci1: USB revision 2.0
uhub2 at usb2 configuration 1 interface 0 "Intel EHCI root hub" rev 2.00/1.00 addr 1
pcib0 at pci0 dev 31 function 0 "Intel Q87 LPC" rev 0x04
ahci0 at pci0 dev 31 function 2 "Intel 8 Series AHCI" rev 0x04: msi, AHCI 1.3
ahci0: port 0: 6.0Gb/s
ahci0: port 1: 1.5Gb/s
ahci0: port 2: 6.0Gb/s
scsibus1 at ahci0: 32 targets
sd0 at scsibus1 targ 0 lun 0: <ATA, Samsung SSD 850, EMT0> naa.5002538d41b86e4d
sd0: 953869MB, 512 bytes/sector, 1953525168 sectors, thin
cd0 at scsibus1 targ 1 lun 0: <HL-DT-ST, DVD+-RW GT80N, A103> removable
sd1 at scsibus1 targ 2 lun 0: <ATA, WDC WD60EFRX-68L, 82.0> naa.50014ee2b81f4111
sd1: 5723166MB, 512 bytes/sector, 11721045168 sectors
ichiic0 at pci0 dev 31 function 3 "Intel 8 Series SMBus" rev 0x04: apic 8 int 18
iic0 at ichiic0
spdmem0 at iic0 addr 0x50: 8GB DDR3 SDRAM PC3-12800
spdmem1 at iic0 addr 0x51: 8GB DDR3 SDRAM PC3-12800
spdmem2 at iic0 addr 0x52: 8GB DDR3 SDRAM PC3-12800
spdmem3 at iic0 addr 0x53: 8GB DDR3 SDRAM PC3-12800
isa0 at pcib0
isadma0 at isa0
com0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
pckbc0 at isa0 port 0x60/5 irq 1 irq 12
pckbd0 at pckbc0 (kbd slot)
wskbd0 at pckbd0: console keyboard
pcppi0 at isa0 port 0x61
spkr0 at pcppi0
vmm0 at mainbus0: VMX/EPT
uhidev0 at uhub0 port 3 configuration 1 interface 0 "RDing TEMPERHUM1V1.2" rev 2.00/0.01 addr 2
uhidev0: iclass 3/1, 1 report id
ukbd0 at uhidev0 reportid 1: 8 variable keys, 5 key codes
wskbd1 at ukbd0 mux 1
uhidev1 at uhub0 port 3 configuration 1 interface 1 "RDing TEMPERHUM1V1.2" rev 2.00/0.01 addr 2
uhidev1: iclass 3/1
ugold0 at uhidev1
uhub3 at uhub0 port 4 configuration 1 interface 0 "VIA Labs, Inc. USB2.0 Hub" rev 2.10/d.a0 addr 3
uhidev2 at uhub3 port 4 configuration 1 interface 0 "Metadot - Das Keyboard Das Keyboard" rev 2.00/1.00 addr 4
uhidev2: iclass 3/1
ukbd1 at uhidev2: 8 variable keys, 6 key codes
wskbd2 at ukbd1 mux 1
uhidev3 at uhub3 port 4 configuration 1 interface 1 "Metadot - Das Keyboard Das Keyboard" rev 2.00/1.00 addr 4
uhidev3: iclass 3/0, 3 report ids
uhid0 at uhidev3 reportid 1: input=0, output=0, feature=7
uhid1 at uhidev3 reportid 2: input=1, output=0, feature=0
uhid2 at uhidev3 reportid 3: input=3, output=0, feature=0
uhub4 at uhub0 port 5 configuration 1 interface 0 "Texas Instruments product 0x8044" rev 2.10/1.00 addr 5
uhub5 at uhub0 port 6 configuration 1 interface 0 "Texas Instruments product 0x8044" rev 2.10/1.00 addr 6
uvideo0 at uhub0 port 9 configuration 1 interface 0 "Logitech QuickCam Pro 9000" rev 2.00/0.08 addr 7
video0 at uvideo0
uaudio0 at uhub0 port 9 configuration 1 interface 3 "Logitech QuickCam Pro 9000" rev 2.00/0.08 addr 7
uaudio0: class v1, high-speed, sync, channels: 0 play, 1 rec, 2 ctls
audio1 at uaudio0
uhidev4 at uhub0 port 10 configuration 1 interface 0 "Logitech MX518 Gaming Mouse" rev 2.00/40.00 addr 8
uhidev4: iclass 3/1
ums0 at uhidev4: 16 buttons, Z and W dir
wsmouse0 at ums0 mux 0
uhidev5 at uhub0 port 10 configuration 1 interface 1 "Logitech MX518 Gaming Mouse" rev 2.00/40.00 addr 8
uhidev5: iclass 3/0, 17 report ids
ukbd2 at uhidev5 reportid 1: 8 variable keys, 6 key codes
wskbd3 at ukbd2 mux 1
uhid3 at uhidev5 reportid 3: input=4, output=0, feature=0
uhid4 at uhidev5 reportid 4: input=1, output=0, feature=0
uhid5 at uhidev5 reportid 16: input=6, output=6, feature=0
uhid6 at uhidev5 reportid 17: input=19, output=19, feature=0
ugold0: 2 sensors type si7006 (temperature and humidity)
uhub6 at uhub0 port 21 configuration 1 interface 0 "VIA Labs, Inc. USB3.0 Hub" rev 3.00/d.a1 addr 9
uhub7 at uhub1 port 1 configuration 1 interface 0 "Intel Rate Matching Hub" rev 2.00/0.04 addr 2
uhub8 at uhub2 port 1 configuration 1 interface 0 "Intel Rate Matching Hub" rev 2.00/0.04 addr 2
vscsi0 at root
scsibus2 at vscsi0: 256 targets
softraid0 at root
scsibus3 at softraid0: 256 targets
sd2 at scsibus3 targ 1 lun 0: <OPENBSD, SR CRYPTO, 006>
sd2: 953866MB, 512 bytes/sector, 1953519473 sectors
root on sd2a (a0b80508b6693ba1.a) swap on sd2b dump on sd2b
drm:pid0:intel_dp_aux_wait_done *ERROR* [drm] *ERROR* AUX C/port C: did not complete or timeout within 10ms (status 0xa145003f)
inteldrm0: 1920x1080, 32bpp
wsdisplay0 at inteldrm0 mux 1: console (std, vt100 emulation), using wskbd0
wskbd1: connecting to wsdisplay0
wskbd2: connecting to wsdisplay0
wskbd3: connecting to wsdisplay0
wsdisplay0: screen 1-5 added (std, vt100 emulation)
wskbd1: disconnecting from wsdisplay0
wskbd1 detached
ukbd0 detached
uhidev0 detached
ugold0 detached
uhidev1 detached
uhidev0 at uhub0 port 3 configuration 1 interface 0 "RDing TEMPERHUM1V1.2" rev 2.00/0.01 addr 2
uhidev0: iclass 3/1, 1 report id
ukbd0 at uhidev0 reportid 1: 8 variable keys, 5 key codes
wskbd1 at ukbd0: console keyboard, using wsdisplay0
uhidev1 at uhub0 port 3 configuration 1 interface 1 "RDing TEMPERHUM1V1.2" rev 2.00/0.01 addr 2
uhidev1: iclass 3/1
ugold0 at uhidev1
ugold0: 2 sensors type si7006 (temperature and humidity)
----------------------------------------------------------------------


On Thu, Jun 25, 2020 at 09:25:49PM -0700, [hidden email] wrote:
| This causes the write-only framebuffer console to only redraw the
| chars that differ between the start and end positions.
|
| 'time ls -R /usr/src/sys' is 3x faster with this, because most of
| the characters stay the same after a scroll.
|
| If this looks good, I can do the same thing for clear rows and copy/
| clear columns, although I will need to make a test case for them.
|
| It would probably be a good idea to change the rasops interface to
| have generic block copy and clear operations, versus the current
| full-column / full-row interface, so tmux and friends could get the
| full acceleration.
|

--
>++++++++[<++++++++++>-]<+++++++.>+++[<------>-]<.>+++[<+
+++++++++++>-]<.>++[<------------>-]<+.--------------.[-]
                 http://www.weirdnet.nl/                 


Re: [PATCH] fast conditional console scrolling

johnc
I should have been more rigorous -- I had two different changes running
on my system, as well as forcing it to use the 12x24 font for a 160x45
console.

If you apply the "Optimized rasops32 putchar" patch I just posted, you
should see another significant speedup.


-------- Original Message --------
Subject: Re: [PATCH] fast conditional console scrolling
From: Paul de Weerd <[hidden email]>
Date: Fri, June 26, 2020 1:23 am
To: [hidden email]
Cc: "[hidden email]" <[hidden email]>

Hi John,

I tried your diff. I don't quite see the same 3x improvement that you
report, more like 2x. I timed 7 runs of ls -R /usr/ports:

Before diff, time ls -R /usr/ports | wc -l 2.897s on average
After diff, time ls -R /usr/ports | wc -l 2.707s on average

Before diff, time ls -R /usr/ports 2m53.067 on average
After diff, time ls -R /usr/ports 1m30.387 on average

Note that the 'before diff' runs were with a snapshot kernel. There
may be diffs in there that account for the difference between before
and after of the no-output runs. See dmesg and full stats below.

So, on average, a speed-up of ~48%.

Thanks!

Paul 'WEiRD' de Weerd



Re: [PATCH] fast conditional console scrolling

Paul de Weerd
Hi John,

With both your diffs applied, I do indeed see something closer to a 3x
speed-up on my machine.  The average over 7 runs of ls -R /usr/ports
was 64.169s, making for just under a 3x speed-up.  That's on 1920x1080
with the standard font size for that resolution (120x33 console, so
16x32 font).
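
Since the two test setups differ in font size, it may help to see how
the font changes the amount of work per scroll. The sketch below
assumes the text grid is simply the framebuffer size divided by the
glyph size, rounded down; `struct textgrid` and `textgrid_for` are
hypothetical names for the example, not rasops functions.

```c
#include <assert.h>

/* Console size in character cells. */
struct textgrid {
	int cols;
	int rows;
};

/*
 * Assumed relation: the console has as many whole glyphs as fit into
 * the framebuffer in each direction.
 */
struct textgrid
textgrid_for(int width_px, int height_px, int font_w, int font_h)
{
	struct textgrid g;

	assert(font_w > 0 && font_h > 0);
	g.cols = width_px / font_w;
	g.rows = height_px / font_h;
	return g;
}

/* Number of character cells a full-screen redraw has to touch. */
int
textgrid_cells(struct textgrid g)
{
	return g.cols * g.rows;
}
```

Under that assumption, 1920x1080 with a 16x32 font gives the 120x33
console mentioned here (3960 cells), and the same resolution with a
12x24 font would give a 160x45 console (7200 cells), so the two
machines are not doing the same amount of work per screenful.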

Thanks again,

Paul 'WEiRD' de Weerd

On Fri, Jun 26, 2020 at 07:49:55AM -0700, [hidden email] wrote:
| I should have been more rigorous -- I had two different changes running
| on my system, as well as forcing it to use the 12x24 font for a 160x45
| console.
|
| If you apply the "Optimized rasops32 putchar" patch I just posted, you
| should see another significant speedup.
|

--
>++++++++[<++++++++++>-]<+++++++.>+++[<------>-]<.>+++[<+
+++++++++++>-]<.>++[<------------>-]<+.--------------.[-]
                 http://www.weirdnet.nl/