Re: Degradation of TCP connection
- From: noisetube@xxxxxxxxx
- Date: Wed, 6 Aug 2008 01:00:51 -0700 (PDT)
On Aug 5, 3:01 pm, justin.pear...@xxxxxxxxx wrote:
Bill,
I continue to appreciate your knowledgeable assistance on this
problem! You have been so helpful. To address the points you brought
up:
I'm pretty sure the network driver you have was
not written by Wind River, though I don't know if it was done by
Curtiss-Wright or Marvell. If I had to bet, I'd say that at least some
of the code was provided by Marvell.
You're absolutely right. To quote one of our contacts at Curtiss-
Wright,
"Each of our boards will have an ethernet driver specific
for the chip and the board. For the 124 board, the driver would have
been based on code from Marvell (the manufacturer of the Discovery III
bridge), and as updated by Curtiss Wright Controls Embedded Computing/
Dy 4 Systems."
So that answers that.
- When your application and LabView stop communicating, can you still
ping the target from the Windows XP machine? If yes, the ethernet
driver and the IP layer of the stack are still working, at least to
some extent, and it's something at the TCP layer that's gone wrong. If
no, then it could be the driver, or a serious problem in the stack.
We have not tried this yet, but it's on our (now much longer) list of
things to try once the problem comes up again.
My philosophy is: once you get a target into a failed state, gather as
much data as you can from it before you reset it. Sometimes that's not
a lot, but simple things like ping can often provide helpful clues.
- You say that it looks like the VxWorks target stops receiving
traffic from LabView. How did you determine this? I usually check for
receive operation by adding the target shell and the INCLUDE_IFCONFIG
component.
We had Wireshark running on a separate machine that was watching all
the traffic on the network. Each time this anomaly happens, it starts
when the VxWorks box stops ACKing packets sent from the Windows box.
To be precise, the ACK number on the "VxWorks --> Windows" packets
stops increasing. Soon, the Windows box, noticing that the VxWorks box
is reporting the same ACK number, begins retransmitting packets.
However, the VxWorks box still does not increment its ACK number. At
the same time, the VxWorks box begins retransmitting data to the
Windows box, as though it didn't hear the incrementing ACKs that the
Windows box was sending to VxWorks.
In short, we deduced it from a bunch of Wireshark data. Do you think
this is a valid conclusion?
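
For anyone who wants to automate the check described above, here is a rough sketch in Python. It assumes the capture has already been exported from Wireshark into a list of records; the Segment type, its field names, and the "vxworks"/"windows" labels are made up for illustration and are not anything Wireshark produces by itself. It only flags the point where one sender's ACK number stops advancing. A repeated ACK is normal on an idle connection, so a hit only means something if the peer was sending new data at the time. Sequence-number wraparound is ignored.

```python
from typing import Iterable, NamedTuple, Optional


class Segment(NamedTuple):
    """One captured TCP segment (hypothetical record layout)."""
    time: float  # capture timestamp in seconds
    src: str     # sender label, e.g. "vxworks" or "windows"
    ack: int     # TCP acknowledgment number carried by the segment


def find_ack_stall(segments: Iterable[Segment], sender: str,
                   min_repeats: int = 3) -> Optional[float]:
    """Return the time of the first segment in a run of at least
    min_repeats consecutive segments from `sender` that all carry the
    same ACK number, or None if no such run exists."""
    last_ack = None
    run_start = None
    run_len = 0
    for seg in segments:
        if seg.src != sender:
            continue  # only look at one direction of the conversation
        if seg.ack == last_ack:
            run_len += 1
            if run_len >= min_repeats:
                return run_start
        else:
            # ACK number moved, so start counting a new run
            last_ack = seg.ack
            run_start = seg.time
            run_len = 1
    return None


trace = [
    Segment(0.0, "vxworks", 1000),
    Segment(0.1, "windows", 500),
    Segment(0.2, "vxworks", 2000),
    Segment(0.5, "vxworks", 2000),
    Segment(0.9, "vxworks", 2000),
]
stall_time = find_ack_stall(trace, "vxworks")
```

With this made-up trace the VxWorks side repeats ACK 2000 three times in a row, so the function reports the start of that run.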
Oh. That's interesting. Okay, so if I understand you correctly, the
target _is_ able to transmit packets when the problem occurs. This
definitely points to a failure in the RX path somewhere. The fact that
it's continuing to send TCP segments occasionally means that the
stack's TCP timers are still firing, and that the driver can transmit
frames onto the wire. If it stops receiving packets, the stack will
think the peer hasn't acknowledged the current segment yet and will
keep retransmitting it. (There are sometimes oddball cases where the
stack on one side or the other becomes desynchronized, which just
botches a single TCP stream while other traffic continues to flow
normally. These are rare though, and it doesn't sound like you're
doing anything that would trigger such a condition. In any case, this
is why I asked if you could ping the target once the anomaly occurred.
My suspicion at this point is that you won't be able to.) Not being
able to receive packets could mean a couple of things:
- The RX state in the driver may have fallen out of sync with the chip
- The receiver encountered an error from which the driver couldn't
recover
- RX interrupts have stopped firing
- _all_ interrupts for that port have stopped firing (if TX interrupts
have also stopped, the driver may still be able to send packets onto
the wire for a short time)
We feel much more prepared for the next anomaly, though. We have
enabled ifconfig() in our kernel and are running on the bench with our
fingers crossed.
Per your several suggestions, we'll try
1. Pinging the VxWorks box from another machine on the network
2. Pinging xxx.xxx.xxx.255 from the VxWorks box and seeing what
happens in Wireshark
3. Calling ifconfig() to see what the "Rx packets" and "Tx packets"
counters are doing.
Good. I'm curious to see the result. (And hopefully adding the
additional components won't just make the problem disappear.)
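
The reasoning behind those checks can be written down as a small decision table. The following Python is only a sketch of the logic discussed in this thread, not a diagnostic tool. It assumes you note the "Rx packets" and "Tx packets" values from ifconfig twice, a few seconds apart, while the target is in the failed state; the Counters type, the classify() function, and the wording of the verdicts are all invented for illustration.

```python
from typing import NamedTuple


class Counters(NamedTuple):
    """One snapshot of the interface counters (hypothetical layout)."""
    rx_packets: int
    tx_packets: int


def classify(ping_ok: bool, before: Counters, after: Counters) -> str:
    """Map a ping result and two counter snapshots to a rough verdict."""
    rx_moving = after.rx_packets > before.rx_packets
    tx_moving = after.tx_packets > before.tx_packets
    if ping_ok:
        # Echo replies need both RX and TX to work below the TCP layer.
        return "driver and IP layer alive; suspect the TCP layer"
    if tx_moving and not rx_moving:
        return "RX path stalled; suspect driver RX state or interrupts"
    if not tx_moving and not rx_moving:
        return "no traffic either way; suspect the driver or the whole port"
    return "inconclusive; gather more data before resetting"


verdict = classify(
    ping_ok=False,
    before=Counters(rx_packets=1200, tx_packets=900),
    after=Counters(rx_packets=1200, tx_packets=915),
)
```

In the example the target cannot be pinged, the TX counter still climbs, and the RX counter is frozen, which is the pattern the Wireshark trace suggests.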
- The board should have at least two ethernet ports (three, if Curtiss-
Wright wired up all 3 MACs). As a test, I would enable a second port
on the target and cable it to another machine. For example,
[snip]
You are correct, Curtiss-Wright did wire up another NIC, but
unfortunately we've had a problem enabling it. Another group of my
coworkers is tackling that problem. If they get it working we'll try
setting up another machine on a two-node network, the way you
suggested.
If the second interface shows up when you do "muxShow()" then you
might just be able to do this:
-> ipcom_drv_eth_init "nameofdriver", 1, 0
-> ifconfig "nameofdriver1 10.0.0.1 netmask 255.255.255.0 up"
If it doesn't show up in muxShow(), then it probably needs to be
enabled in the BSP somewhere.
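
When setting up the machine at the other end of that cable, it has to sit in the same subnet as the address given to ifconfig. A quick way to sanity-check an address plan is Python's standard ipaddress module. This is just a sketch: the peer address 10.0.0.2 is an assumption chosen for illustration, and the helper names are made up. The same calculation also gives the broadcast address to use for the broadcast ping test.

```python
import ipaddress


def same_subnet(target: str, netmask: str, peer: str) -> bool:
    """True if `peer` lies in the subnet defined by target/netmask."""
    iface = ipaddress.ip_interface(f"{target}/{netmask}")
    return ipaddress.ip_address(peer) in iface.network


def broadcast_of(target: str, netmask: str) -> str:
    """Directed broadcast address of the subnet defined by target/netmask."""
    iface = ipaddress.ip_interface(f"{target}/{netmask}")
    return str(iface.network.broadcast_address)


# 10.0.0.1 / 255.255.255.0 are the values from the ifconfig line above;
# 10.0.0.2 is a hypothetical address for the second machine.
peer_reachable = same_subnet("10.0.0.1", "255.255.255.0", "10.0.0.2")
bcast = broadcast_of("10.0.0.1", "255.255.255.0")
```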
Thanks again for all your help. We've been in contact with Curtiss-
Wright support and Wind River support, but this thread has provided us
with the most help so far. In fact, our Wind River support contact
provided a cornucopia of advice, almost all of which he copied from
your last post. He omitted to include phrases which would point to a
fault of Wind River's, like your phrase "a serious problem with the
stack." Blah.
That's politics for you.
-Bill
Warm regards,
Justin