Need help with HDLC / FCS Errors - umsm and ppp

J.C. Roberts-2
* Problem: Lots of HDLC / FCS Errors on Verizon Cellular Wireless Link
 
  For example when downloading a 10MB file, I'll usually get between 300
to 600 FCS errors (PPP> show hdlc). The dismal transfer rate via ftp is
about 20KB/sec (roughly 160 to 200 Kbps) due to all the errors. The
connection should be running in the 400 to 700 Kbps range at worst
according to Verizon and might be able to do 1Mbps or better in my area
since I'm right next to the towers.
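
As a rough sanity check on those numbers, the sketch below converts the
ftp rate to Kbps and estimates what fraction of frames are being dropped.
The figures come from the paragraph above; the 1500-byte frame size is an
assumption (the log does not say what size frames are actually in use),
so treat the percentages as an order-of-magnitude estimate only.

```python
# Figures quoted in the post.
ftp_rate_kbytes = 20          # observed ftp rate, KB/sec
file_mbytes = 10              # size of the test download
fcs_errors = (300, 600)       # range reported by "show hdlc"

# ASSUMPTION (not from the post): each frame carries about 1500 bytes.
ASSUMED_FRAME_BYTES = 1500

payload_kbps = ftp_rate_kbytes * 8
frames = file_mbytes * 1024 * 1024 // ASSUMED_FRAME_BYTES
loss_pct = [round(100 * e / frames, 1) for e in fcs_errors]

print(f"payload rate: {payload_kbps} Kbps")
print(f"frames in download: about {frames}")
print(f"frames with FCS errors: {loss_pct[0]}% to {loss_pct[1]}%")
```

Under that assumption several percent of all frames fail the FCS check,
and each failure is a dropped packet that TCP has to retransmit.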

  I've done tons of reading and researching on HDLC / FCS errors but I
cannot figure out what the problem is with this wireless link. The best
I can do is go through the possible causes that I know of and state the
results, if any, of investigating/testing each cause. Sorry for the long
post but putting all the details in here seemed like the best bet.

  If you have any ideas or insights, I'd love to hear them...

------------------------------------------------------------------------
Hardware: (full dmesg at end of this very long email)
        System:   Old Dell OptiPlex GX1 (PII-400MHz)
        Adapter:  PCI->PCMCIA Card (Ricoh 5C475 Chipset, per dmesg)
        Wireless: Kyocera KPC650 PC-CARD/PCMCIA (Cardbus 32-bit)

Notes on HDLC: High-level Data Link Control
According to wikipedia:  http://en.wikipedia.org/wiki/HDLC
"Some vendors, such as Cisco, implemented protocols such as Cisco HDLC
that used the low-level HDLC framing techniques but didn't use the
standard HDLC header."  http://en.wikipedia.org/wiki/Cisco_HDLC
I'm not sure if Verizon is using Cisco kit or if ppp(8) can handle it?
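
For reference, the FCS that "show hdlc" is counting errors on is the
16-bit frame check sequence that PPP uses in HDLC-like framing (RFC
1662). The sketch below is a plain bitwise version of that check, not the
code ppp(8) actually runs, and the sample payload is arbitrary. It shows
that a frame with a single flipped bit no longer passes the check, which
is all an "FCS error" means.

```python
def fcs16(data: bytes) -> int:
    """Running FCS-16 as described in RFC 1662 (bitwise, no lookup table)."""
    fcs = 0xFFFF                      # initial value
    for byte in data:
        fcs ^= byte
        for _ in range(8):
            # Reflected polynomial 0x8408, processed least significant bit first.
            fcs = (fcs >> 1) ^ 0x8408 if fcs & 1 else fcs >> 1
    return fcs

GOOD_FCS = 0xF0B8   # value left after running over a good frame plus its FCS

def append_fcs(payload: bytes) -> bytes:
    """Return payload followed by its FCS, least significant byte first."""
    fcs = fcs16(payload) ^ 0xFFFF     # the complement is what goes on the wire
    return payload + bytes([fcs & 0xFF, fcs >> 8])

def fcs_ok(frame: bytes) -> bool:
    """True if the frame (payload plus trailing FCS) passes the check."""
    return fcs16(frame) == GOOD_FCS

frame = append_fcs(b"123456789")                  # arbitrary sample payload
damaged = bytes([frame[0] ^ 0x01]) + frame[1:]    # flip a single bit

print(fcs_ok(frame))     # intact frame
print(fcs_ok(damaged))   # same frame with one bit flipped
```

Anything that corrupts, drops, or inserts bytes between the two ends (the
radio link, the card, the USB path, or the tty buffers) would show up the
same way in the counter.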

Notes on Kyocera KPC650:
The device shows up as a USB hub and should have two USB serial ports
attached. The first USB serial port is the typical "modem" and the
second is a "control port" of sorts for the device which is used for
reading connection statistics like signal strength.

The following link shows how Linux finds both USB serial ports on the
USB hub but I've been unable to find any documentation or info
regarding how to use the second USB serial port for controlling the
device.  http://wildbill.nulldevice.net/wordpress/?p=144

Thanks to the efforts of Jonathan Gray (jsg@) and others, the KPC650
shows up on OpenBSD (4.0-stable 2006.11.05) via the umsm(4) driver and
the first USB serial port (/dev/cuaU0) on the KPC650 is accessible for
use with ppp(8) and pppd(8).


------------------------------------------------------------------------
Possible Cause #1: Incorrect CHAT Script
  Though it's possible to get a bunch of initial HDLC / FCS errors due
to a provider sending additional text before/after the CONNECT and your
chat script not being set up to handle it, unfortunately, this is not
the case/cause with VerizonWireless in my area.

------------------------------------------------------------------------
Possible Cause #2: Incorrect Escape Characters
  One possible/probable cause of HDLC / FCS errors is escape characters
not being handled correctly. In particular, when you're using software
flow control (XON/XOFF), you need to escape the ^Q and ^S characters by
setting the ACCMAP to 0x000a0000.

        set ctsrts off
        set accmap 0x000a0000
        set escape 0xff (both with and without)

Unfortunately, this is not the cause of the problems here. It may be
worth noting that the ppp.conf files I've seen/found for other
"Cellular Wireless Providers" like BigPond in Australia do use software
flow control (``set ctsrts off'') with the device but the devices are
not KPC650 cards.
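
To show where the 0x000a0000 value comes from: in the async control
character map, bit N set means "escape control character N". The short
sketch below just decodes the map into the characters it covers; the
helper names are made up for illustration.

```python
def accm_chars(accm: int) -> list:
    """List the control characters (0x00-0x1f) whose bit is set in an ACCM."""
    return [c for c in range(32) if accm & (1 << c)]

def caret(c: int) -> str:
    """Render a control character in caret notation, e.g. 0x11 becomes ^Q."""
    return "^" + chr(c + 0x40)

escaped = accm_chars(0x000A0000)
print([hex(c) for c in escaped])    # byte values that get escaped
print([caret(c) for c in escaped])  # the same values in caret notation
```

Bits 17 and 19 are set, which are 0x11 (XON, ^Q) and 0x13 (XOFF, ^S).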

------------------------------------------------------------------------
Possible Cause #3: Remote End Stops Talking PPP
  Now this is yet another possible cause for getting HDLC / FCS errors
and happens when the remote end decides it doesn't want to talk ppp
any more. Considering the odd "two serial" nature of the KPC650 it might
actually be the problem. At the moment, I've got no clue how the second
USB serial (control port) is supposed to be used but it makes some
sense that it might be used for something more than just returning
connection statistics. Since the OpenBSD umsm(4) driver only has one
USB serial port instead of two, information from the remote end
which should be destined for the second "control port" might mistakenly
be making its way onto the one USB serial port provided by the
driver?

On rare occasions, when shutting down ppp (PPP> quit all) and restarting
it, the chat script fails due to getting junk. I'm not sure if this is
simply because a buffer did not get flushed or if the remote end thinks
it's talking to the nonexistent second control port?

Debug: deflink: physical (put): iflag = a00, oflag = 6, cflag = 1cb00
Phase: deflink: Connected!
Phase: deflink: opening -> dial
Chat: Phone: #777
Chat: deflink: Dial attempt 1 of 1
Debug: m_enqueue: len = 2
Chat: Send: AT\^M
Chat: Expect(30): OK
Chat: Received:
Debug: m_enqueue: len = 3
Debug: m_enqueue: len = 4
Chat: Received: ~\M^?}#\M-@!}%}&} }$8K~~\M^?}#\M-@!}%}'} }$\M-d}1~\^M
Chat: Received: NO CARRIER\^M
Warning: Chat script failed
Phase: deflink: dial -> hangup
Phase: deflink: Disconnected!
Debug: deflink: Close

Simply quitting ppp again and restarting it a second time corrects this
problem every time. If it matters, the received string has been
consistent every time this has happened, so having junk left over in a
buffer seems highly improbable.
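
The "junk" in that Received: line can be picked apart as PPP framing.
The sketch below assumes the log prints non-printable bytes in
vis(3)-style notation (\M^? as 0xff, \M-@ as 0xc0, \M-d as 0xe4), and the
two byte strings are a hand transcription made under that assumption. It
removes the 0x7d escapes, checks the FCS as described in RFC 1662, and
reads the LCP header fields. The helper names are made up for
illustration.

```python
FLAG, ESC = 0x7E, 0x7D
GOOD_FCS = 0xF0B8                      # RFC 1662 "good FCS" value
LCP_CODES = {5: "Terminate-Request", 6: "Terminate-Ack"}   # subset only

def fcs16(data: bytes) -> int:
    """Bitwise FCS-16: initial value 0xffff, reflected polynomial 0x8408."""
    fcs = 0xFFFF
    for byte in data:
        fcs ^= byte
        for _ in range(8):
            fcs = (fcs >> 1) ^ 0x8408 if fcs & 1 else fcs >> 1
    return fcs

def unstuff(body: bytes) -> bytes:
    """Undo PPP byte stuffing: 0x7d means 'next byte XOR 0x20'."""
    out = bytearray()
    octets = iter(body)
    for b in octets:
        if b == ESC:
            b = next(octets) ^ 0x20
        out.append(b)
    return bytes(out)

def decode(raw: bytes) -> dict:
    """Decode one flag-delimited frame: address, control, protocol, LCP header."""
    assert raw[0] == FLAG and raw[-1] == FLAG
    frame = unstuff(raw[1:-1])
    return {
        "protocol": int.from_bytes(frame[2:4], "big"),
        "code": LCP_CODES.get(frame[4], "code %d" % frame[4]),
        "id": frame[5],
        "length": int.from_bytes(frame[6:8], "big"),
        "fcs_ok": fcs16(frame) == GOOD_FCS,
    }

# ASSUMPTION: hand transcription of the two frames in the "Received:" line.
frames = [
    bytes.fromhex("7e ff 7d 23 c0 21 7d 25 7d 26 7d 20 7d 24 38 4b 7e"),
    bytes.fromhex("7e ff 7d 23 c0 21 7d 25 7d 27 7d 20 7d 24 e4 7d 31 7e"),
]
decoded = [decode(f) for f in frames]
for d in decoded:
    print(d)
```

Read that way, both frames are well-formed LCP (protocol 0xc021)
Terminate-Requests with consecutive ids and a valid FCS. That would
suggest the other end is still trying to close the previous PPP session
when the new chat script starts, rather than random line noise, though it
does not say whether the frames come from the network or from the card
itself.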

The above three possible causes were swiped from:
http://www.freebsd.org/doc/en_US.ISO8859-1/books/faq/ppp.html

------------------------------------------------------------------------
Possible Cause #4: Occasional USB Disconnect
  You'll need to forgive my USB ignorance. I just don't use USB devices
very often (more like at all), so I'm not familiar with them or their
"normal" operation under OpenBSD... The "Universal Serial Bus System
Architecture" book is still sitting on my shelf with an unbent binding.

When logged in as root and running ppp in interactive mode from the
command line, I occasionally get kernel messages (white on blue) which
seem to indicate the card has been ejected (i.e. the USB device has
been unplugged) and then reinserted (i.e. the USB device has been
plugged back in). Since I've never used USB on OpenBSD before, I'm not
sure what I'm looking at here:

PPP ON fluffy> umsm0: at uhub1 port 1 (addr 2) disconnected
ucom0 detached
umsm0 detached
Warning: 0.0.0.0/0: Change route failed: errno: No such process
Warning: ff02:6::/32: Change route failed: errno: Network is unreachable
Warning: deflink: Unable to set physical to speed 0
Warning: deflink: Unable to set physical to speed 0
Warning: deflink: Unable to set physical to speed 0
Warning: deflink: tcsetattr: Unable to restore device settings
ppp ON fluffy> umsm0 at uhub1 port 1
umsm0: Qualcomm, Incorporated Qualcomm CDMA Technologies MSM, rev
1.10/0.00, addr 2
ucom at umsm0 portno 0

PPp ON fluffy: Warning: 0.0.0.0/0: Change route failed: errno: No such
process
Warning: ff02:6::/32: Change route failed: errno: Network is unreachable
PPP ON fluffy>


From what little I know about USB on OpenBSD, it looks to me as if the
umsm driver thinks the card has been removed? This only happens after an
hour or two of being connected and sitting idle, so it may be that the
provider (Verizon) is sending some kind of "disconnect" command to the
card and the umsm(4) driver is reacting accordingly. Since ppp is
running in -auto mode, it is also handling the situation gracefully
and the second chunk, the ppp reconnect, only happens when there is
another request to reach the internet.

I doubt this has anything to do with the HDLC / FCS errors but since I
know virtually nothing about USB, it seemed worth mentioning.

------------------------------------------------------------------------
Possible Cause #5: Bad Compression Settings
  In searching for a cause of the HDLC / FCS errors, I've seen many
suggestions to disable one type of compression or another in ppp.conf,
particularly vjcomp. Unfortunately, this is not the cause of the problem
and disable/deny of vjcomp, pred1, deflate, and lqr makes no difference.

------------------------------------------------------------------------
Possible Cause #6: FreeBSD Tick Problem
  From searching and reading up on HDLC and FCS errors, it seems you can
get these errors from problems in your serial device. The FreeBSD camp
has such issues and needs to adjust cp4ticks in sio(4) to get around the
problem, particularly on "fast" serial ports and wireless cards.

http://unix.derkeiler.com/Mailing-Lists/FreeBSD/hackers/2005-11/0222.html
http://www.bsdforums.org/forums/archive/index.php/t-3299.html
http://marc.theaimsgroup.com/?l=freebsd-stable&m=114752376911862&w=2
http://marc.theaimsgroup.com/?l=freebsd-stable&m=114125692710301&w=2
http://www.freebsd.org/cgi/query-pr.cgi?pr=51982

I have no idea if such a problem also exists someplace in OpenBSD?

------------------------------------------------------------------------
Possible Cause #7: USB Modem Buffer Size
  Chris Paul lives in Boulder Creek, CA, USA fairly close to here and
had speed issues with a similar card under OpenBSD on the same Verizon
network. Though I talked to him on the phone the other day, I'm not sure
if he ever managed to figure out that the speed issues were being caused
by HDLC / FCS errors.
http://marc.theaimsgroup.com/?l=openbsd-misc&m=114593019725990&w=2

Jolan suggested increasing the buffer sizes to 2048 in umodem.c.
Though it supposedly helped Chris's transfer rates, the source code
reads as though 1024 is the maximum due to the limitation in ttymalloc.

/*
 * These are the maximum number of bytes transferred per frame.
 * Buffers are this large to deal with high speed wireless devices.
 * Capped at 1024 as ttymalloc() is limited to this amount.
 */
#define UMODEMIBUFSIZE 1024
#define UMODEMOBUFSIZE 1024

Blindly kicking the buffer sizes up to 2048 when ttymalloc() can't
handle it seems like really bad juju. Blindly kicking ttymalloc()
up to 2048 seems like even worse juju.

http://www.openbsd.org/cgi-bin/cvsweb/src/sys/kern/tty.c
http://fxr.watson.org/fxr/source/kern/tty.c?v=OPENBSD#L2234

You might say I have a healthy degree of respect for kernel code or,
more accurately, you could say I'm downright paranoid about messing
with it without knowing the consequences. Does anyone know the possible
consequences of increasing the umodem and ttymalloc sizes?

------------------------------------------------------------------------
ppp log files are available here:
http://www.designtools.org/files/ppp.log.zip

They include just about everything in the way of logging that I can think
of, including debug and async (see ppp.conf below).

Please note that I edited the phone number since it's used as the login
and the same is true for the ppp.conf below.

-----Begin ppp.conf ----------------------------------------------------
# Default Settings
default:
 set log debug async connect Phase Chat LCP IPCP CCP tun command
 disable ipv6cp

# VerizonWireless
vzw:
 set device /dev/cuaU0
 set speed 230400
 set server /var/run/ppp.pid "" 0177
 set dial "ABORT BUSY ABORT NO\\sCARRIER TIMEOUT 30 \"\" \
  AT OK \
  ATZ0 OK \
  ATQ0 OK \
  ATV1 OK \
  ATE1 OK \
  AT&V OK \
  \\dATDT\\T TIMEOUT 70 CONNECT"
 set login
 set phone "#777"
 set authname [hidden email]
 set authkey vzw
 set timeout 0
 set ifaddr 10.0.0.1/0 10.0.0.2/0 255.255.255.0 0.0.0.0
 add! default HISADDR
 enable dns
# set mru 2048
# set mtu 2048
# enable echo
# set echoperiod 30
# set ctsrts on
# disable vjcomp
# deny vjcomp
# set ctsrts off
# set escape 0xff
# set accmap 0x000a0000
# disable vjcomp pred1 deflate lqr
# deny vjcomp pred1 deflate lqr
# disable lqr
# deny lqr
-- End ppp.conf -------------------------------------------------------

Source was updated on 2006.11.05 (-STABLE)

-- Begin dmesg ---------------------------------------------------------
OpenBSD 4.0-stable (GENERIC) #1: Fri Nov 10 20:44:31 PST 2006
    [hidden email]:/usr/src/sys/arch/i386/compile/GENERIC
cpu0: Intel Pentium II ("GenuineIntel" 686-class, 512KB L2 cache) 400
MHz
cpu0:
FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR
real mem  = 133791744 (130656K)
avail mem = 114651136 (111964K)
using 1658 buffers containing 6791168 bytes (6632K) of memory
mainbus0 (root)
bios0 at mainbus0: AT/286+(00) BIOS, date 08/01/01, BIOS32 rev. 0 @
0xffe90, SMBIOS rev. 2.2 @ 0xfb410 (64 entries)
bios0: Dell Computer Corporation OptiPlex GX1 400MTbr+
apm0 at bios0: Power Management spec V1.2
apm0: AC on, battery charge unknown
apm0: flags 30102 dobusy 0 doidle 1
pcibios0 at bios0: rev 2.1 @ 0xf0000/0x10000
pcibios0: PCI IRQ Routing Table rev 1.0 @ 0xfc670/176 (9 entries)
pcibios0: PCI Interrupt Router at 000:07:0 ("Intel 82371AB PIIX4 ISA"
rev 0x00)
pcibios0: PCI bus #3 is the last bus
bios0: ROM list: 0xc0000/0x8000 0xc8000/0x8000
cpu0 at mainbus0
pci0 at mainbus0 bus 0: configuration mode 1 (no bios)
pchb0 at pci0 dev 0 function 0 "Intel 82443BX AGP" rev 0x03
ppb0 at pci0 dev 1 function 0 "Intel 82443BX AGP" rev 0x03
pci1 at ppb0 bus 1
vga1 at pci1 dev 0 function 0 "ATI Rage Pro" rev 0x5c
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
pcib0 at pci0 dev 7 function 0 "Intel 82371AB PIIX4 ISA" rev 0x02
pciide0 at pci0 dev 7 function 1 "Intel 82371AB IDE" rev 0x01: DMA,
channel 0 wired to compatibility, channel 1 wired to compatibility
wd0 at pciide0 channel 0 drive 0: <Maxtor 90650U2>
wd0: 16-sector PIO, LBA, 6149MB, 12594960 sectors
wd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 2
atapiscsi0 at pciide0 channel 1 drive 0
scsibus0 at atapiscsi0: 2 targets
cd0 at scsibus0 targ 0 lun 0: <SAMSUNG, CD-ROM SC-140B, d005> SCSI0
5/cdrom removable
cd0(pciide0:1:0): using PIO mode 4, Ultra-DMA mode 2
uhci0 at pci0 dev 7 function 2 "Intel 82371AB USB" rev 0x01: irq 11
usb0 at uhci0: USB revision 1.0
uhub0 at usb0
uhub0: Intel UHCI root hub, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
piixpm0 at pci0 dev 7 function 3 "Intel 82371AB Power" rev 0x02: SMBus
disabled
cbb0 at pci0 dev 13 function 0 "Ricoh 5C475 CardBus" rev 0x81: irq 9
ppb1 at pci0 dev 15 function 0 "DEC 21152 PCI-PCI" rev 0x03
pci2 at ppb1 bus 3
xl0 at pci0 dev 17 function 0 "3Com 3c905B 100Base-TX" rev 0x24: irq 11,
address 00:c0:4f:27:c5:90
exphy0 at xl0 phy 24: 3Com internal media interface
isa0 at pcib0
isadma0 at isa0
pckbc0 at isa0 port 0x60/5
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0: console keyboard, using wsdisplay0
pmsi0 at pckbc0 (aux slot)
pckbc0: using irq 12 for aux slot
wsmouse0 at pmsi0 mux 0
pcppi0 at isa0 port 0x61
midi0 at pcppi0: <PC speaker>
spkr0 at pcppi0
npx0 at isa0 port 0xf0/16: using exception 16
fdc0 at isa0 port 0x3f0/6 irq 6 drq 2
fd0 at fdc0 drive 0: 1.44MB 80 cyl, 2 head, 18 sec
cardslot0 at cbb0 slot 0 flags 0
cardbus0 at cardslot0: bus 2 device 0 cacheline 0x0, lattimer 0x20
pcmcia0 at cardslot0
biomask effd netmask effd ttymask ffff
pctr: 686-class user-level performance counters enabled
mtrr: Pentium Pro MTRR support
ohci0 at cardbus0 dev 0 function 0 "NEC USB" rev 0x43: irq 9, version
1.0
dkcsum: wd0 matches BIOS drive 0x80
root on wd0a
rootdev=0x0 rrootdev=0x300 rawdev=0x302
usb1 at ohci0: USB revision 1.0
uhub1 at usb1
uhub1: NEC OHCI root hub, rev 1.00/1.00, addr 1
uhub1: 1 port with 1 removable, self powered
ohci1 at cardbus0 dev 0 function 1 "NEC USB" rev 0x43: irq 9, version
1.0
usb2 at ohci1: USB revision 1.0
uhub2 at usb2
uhub2: NEC OHCI root hub, rev 1.00/1.00, addr 1
uhub2: 1 port with 1 removable, self powered
umsm0 at uhub1 port 1
umsm0: Qualcomm, Incorporated Qualcomm CDMA Technologies MSM, rev
1.10/0.00, addr 2
ucom0 at umsm0 portno 0

---End dmesg -----------------------------------------------------------


--
Free, Open Source CAD, CAM and EDA Tools
http://www.DesignTools.org