Re: TCP Window Size
- From: Trendkill <jpmason@xxxxxxxxx>
- Date: Fri, 01 Jun 2007 06:22:54 -0700
On Jun 1, 7:59 am, Cheema <atif_che...@xxxxxxxxx> wrote:
On May 31, 9:57 pm, "Thrill5" <nos...@xxxxxxxxxxxxx> wrote:
Thin clients do not send large amounts of data between the thin client and
terminal server, so window size wouldn't be the problem. My bet is that the
problem is the Terminal Server. 200 clients on a single Terminal Server is
a lot even for non-database type applications. Are you also monitoring the
performance of the TS? You need to monitor memory usage, CPU, disk I/O and
network I/O, active clients, etc. Even if this is a big multi-cpu TS, you
probably have some type of I/O bottleneck on the server.
Scott
"Trendkill" <jpma...@xxxxxxxxx> wrote in message
news:1180616216.191691.66160@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
On May 31, 8:45 am, Cheema <atif_che...@xxxxxxxxx> wrote:
Hi
I would like to share my experience. We have a database application
with records for 30 million people.
PROBLEM : Slow application access from time to time
Server : data base application
WAN : 12Mbps of clear pipe end to end WAN links
client : A Thin and Terminal Server serving 200 thin/terminal server
clients
Util% : WAN links are max 60% loaded
RTT : 20-120 msec with average RTT of 58msec
Server TCP Window Size : 24kByte
Client TCP Window Size : 65kByte
I strongly believe that this is due to a SENDER and RECEIVER capacity
mismatch.
Kindly advise on this situation:
What TCP window size should be used?
Should it be changed on both ends?
Can FAST TCP be applied in this scenario?
Waiting for your valuable answer.
Thanks
Just from my experience, I have a hard time blaming TCP windowing.
How many concurrent users? Is this real time? How do the queries
look? Are they efficient? What kind of bandwidth per user or per
transaction, and how many users/transactions at any given time?
12Mbps is not that fast, but you need to provide context of whether or
not 12Mbps is enough. Could be anything from server being busy with
backups or some kind of schedule, to WAN pipe utilization going over
80% which would start to impact latency, to service provider, to
anything. Have you used MRTG or Netflow to gauge bandwidth
utilization at these times? How about latency? Do you have a
baseline of these usages and performance during 'good performance'
times? Do you have QoS? Could someone be running an FTP transfer and
killing your pipe?
I won't say that packet/frame sizes are NOT the issue, but I just hate
to look at fundamental networking architecture when there are WAY too
many other variables that are more likely. Not to mention, window
sizes fluctuate, and if this is a small telnet or shell-based
application, they will most likely never get to full size.
Dear Friends
Thanks a lot for your enlightening responses. I would like to further
elaborate on my query; maybe you can help me more.
Age of Problem : 1.5 years
Working on Problem : The problem comes and goes; whenever it appears,
many teams get involved
Server type : IIOP Database application
Client type : Some desktop PCs connect directly, but most of them
connect via thin servers, which also act as TERMINAL Servers and
as clients to the IIOP Database application server
Client means : A terminal server which is also a Thin Server to 100s
of thin clients
I would like to clarify that we are talking about the BULK TCP
transfer between the THIN SERVERS/TS (also acting as clients to the
IIOP Database application) and the IIOP application server.
The reason for posting this here is that it is always the NETWORK
that is blamed first for a slow application, and we use all Cisco
networking devices. Multiple parallel WAN links are load-balanced
using IP LOAD-SHARING PER-PACKET with IP CEF.
EXPERIENCE : I have a hard time blaming TCP windowing !!!
Can you shed some more light on this?
Q : How many concurrent users?
A : There are 6 THIN/TS servers, and the team concerned has divided the
users so that at most 30 are logged in to each at a time. So 30x6 = 180
users.
Q : Is this real time?
A : Yes, it is real-time, transaction-based use.
Q : How do the queries look?
A : A number is entered and a query is run against it. At the front end
a Java-based interface is opened; compressed Java JAR classes are
downloaded from the IIOP server down to the TS/Thin servers.
Q : Are they efficient?
A : How can I find that out? I believe that a certain transaction for 30
days takes about 12 MB of data to be transferred, which includes
screen/graphics updates along with the real data, but a transaction for
a day or two should take 1 MB or less. I believe this data is very WAN
UN-FRIENDLY, but the question is how to make it efficient.
Q : What is slow and what is fast?
A : If query output is displayed in 2-4 seconds, it is fast and OK; if
it takes 15-30 seconds, it is mild; and if it takes a minute or more,
it is slow.
Q : What kind of bandwidth per user or per transaction, and how many
users/transactions at any given time?
A : The six WAN links remain at 50% loading, but at times link usage
goes up to 75%. Per-user transaction and bandwidth needs vary, as some
queries yield less output while others yield more. Total data
transferred between the TS and the IIOP server in one business day is
20 GB. On average, two hundred users are on it at a time. If we assume
200 transactions for each user, then 200x200 = 40,000 transactions, and
if each transaction on average takes 0.5 to 1 MB, then that makes
40,000x0.5 = 20,000 MB.
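As a quick sanity check of the arithmetic above, here is a minimal
Python sketch. The user count, transactions per user, per-transaction
size, and 12 Mbps link come from this thread; the 8-hour business day
and decimal units (1 MB = 10^6 bytes) are assumptions made only to turn
the daily volume into an average rate.

```python
# Rough daily-volume estimate from the figures quoted in the thread.
USERS = 200                 # average concurrent users (from the post)
TX_PER_USER = 200           # transactions per user per day (poster's assumption)
MB_PER_TX = 0.5             # lower bound of the 0.5 to 1 MB estimate

BUSINESS_DAY_HOURS = 8      # ASSUMPTION: not stated anywhere in the thread
LINK_MBPS = 12.0            # 6 x E1, rounded to 12 Mbps as in the post


def daily_volume_mb(users: int, tx_per_user: int, mb_per_tx: float) -> float:
    """Total megabytes moved in one day."""
    return users * tx_per_user * mb_per_tx


def average_mbps(volume_mb: float, hours: float) -> float:
    """Average rate in Mbit/s if the volume is spread evenly over `hours`."""
    return volume_mb * 8 / (hours * 3600)


volume = daily_volume_mb(USERS, TX_PER_USER, MB_PER_TX)
rate = average_mbps(volume, BUSINESS_DAY_HOURS)
print(f"daily volume : {volume:.0f} MB")
print(f"average rate : {rate:.2f} Mbit/s ({rate / LINK_MBPS:.0%} of the link)")
```

Under those assumptions this prints 20000 MB and an average of about
5.56 Mbit/s, roughly 46% of the 12 Mbps aggregate. It is only an
average; it says nothing about the peaks.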
Q : How much total data is transferred in a week between TS Servers
and the IIOP Application Server ?
A : 1 TERABYTE
Q : 12Mbps is not that fast, but you need to provide context of
whether or not 12Mbps is enough. Could be anything from server being
busy with backups or some kind of schedule, to WAN pipe utilization
going over 80% which would start to impact latency, to service
provider, to anything ?
A : There are 6 WAN links of 2 Mbps each (6xE1). Each link varies
from 50% to 70% loading, and at times it goes to 80%. But we have moved
from 2xE1 to 6xE1, and the application is so bandwidth-hungry that even
this BW does not seem enough. The IIOP server is not in our domain, so
we cannot check it. Yes, the WAN pipe touches 80%, but we cannot provide
more bandwidth than that and need to find another way. TTL is also fine,
but I need to check the latency during 80% loading.
Q : Have you used MRTG or Netflow to gauge bandwidth utilization at
these times? How about latency?
A : Yes, we use MRTG and Netflow, and I have detailed traffic stats.
Bandwidth sometimes goes up, and usage on a 2 Mbps link varies from
1.5 to 1.75 Mbps.
Q : Do you have a baseline of these usages and performance during
'good performance' times?
A : It has been up and down; sometimes a complaint comes in, and usually
"no news is good news".
Q : Do you have QoS?
A : No
Q : Could someone be running an FTP transfer and killing your pipe?
A : With Netflow in place I can always catch the culprit, but that
is not the case here.
Q : My bet is that the problem is the Terminal Server. 200 clients on
a single Terminal Server is a lot even for non-database type
applications. Are you also monitoring the performance of the TS? You
need to monitor memory usage, CPU, disk I/O and network I/O, active
clients, etc. Even if this is a big multi-cpu TS, you probably have
some type of I/O bottleneck on the server?
A : Yes, some time ago that was the case, but later the users were split
across different TS and more resources were added. Can you refer me to
a SERVER SIZING URL where I can find the performance parameters you
mentioned? How can I find out whether there is an I/O bottleneck, and
how does it get removed automatically? The TS/Thin Server team
schedules a regular weekly reboot of these machines.
I agree with you that there are too many variables.
TRAFFIC CAPTURE RESULTS
========================
I have captured the traffic and analyzed it. From three TS to the IIOP
application, in 1 min 9 seconds, only about 2 MB of requests were sent,
and against that 65 MB of data was pulled.
Capture Duration : 1 min 9 seconds
Client to Server Data : 2 MB
Server to Client Data : 65 MB
Data Type : TCP
Frames captured : 75000
Application : HTML, IIOP
MSS advertised by both : 1460 bytes
TOO MANY TCP RETRANSMISSIONS, DUPLICATE ACKS, FAST RETRANSMISSIONS etc
TCP window advertised by client : 65k
TCP window advertised by server : 24k
In all 75000 frames, I saw the same TCP WINDOW SIZE. Should it change?
Is there anything wrong? Who controls the window size? I believe that
the SEND TCP window size of the SENDER (IIOP Application) and the
RECEIVE TCP window size at the TS Server, which is fetching data from
the IIOP Server, should be the same.
What do you think from the above data: is it the same or different? And
if not, what should the optimal TCP WINDOW SIZE be?
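For reference, the usual yardstick for this question is the
bandwidth-delay product: the window needed to keep a path full is
bandwidth times RTT, and the most a single connection can carry is one
window per RTT. Each end advertises its own receive window
independently, so the two values do not have to match; for data flowing
from the IIOP server to the TS, it is the window advertised by the TS
(65k) that bounds the transfer. A minimal Python sketch using the
figures from this post, assuming 1 KB = 1024 bytes, "65k" means 65,535
bytes (no window scaling), no packet loss, and a single connection able
to use the whole 12 Mbps:

```python
def bdp_bytes(bandwidth_bps: float, rtt_s: float) -> float:
    """Bandwidth-delay product: bytes in flight needed to fill the pipe."""
    return bandwidth_bps * rtt_s / 8


def window_limited_bps(window_bytes: int, rtt_s: float) -> float:
    """Upper bound on one connection's throughput: one window per RTT."""
    return window_bytes * 8 / rtt_s


LINK_BPS = 12_000_000                     # 12 Mbps aggregate (from the post)
RTTS_MS = (20, 58, 120)                   # min, average, max RTT (from the post)
WINDOWS = {"server (24k)": 24 * 1024,     # ASSUMPTION: 24k = 24 * 1024 bytes
           "client (65k)": 65_535}        # ASSUMPTION: 65k = 65,535 bytes

for rtt_ms in RTTS_MS:
    rtt = rtt_ms / 1000
    print(f"RTT {rtt_ms:>3} ms: window to fill 12 Mbps = "
          f"{bdp_bytes(LINK_BPS, rtt) / 1024:.0f} KB")
    for name, win in WINDOWS.items():
        print(f"    {name}: at most "
              f"{window_limited_bps(win, rtt) / 1e6:.1f} Mbit/s per connection")
```

At the 58 ms average RTT this works out to roughly 85 KB to fill the
pipe, about 3.4 Mbit/s per connection for a 24 KB window, and about
9.0 Mbit/s for a 65,535-byte window. For comparison, the capture above
moved 65 MB in 69 seconds, about 7.5 Mbit/s in total across the three
TS (taking 1 MB as 10^6 bytes).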
As a work-around, I am suggesting that a CLUSTER TERMINAL SERVER be
placed on the IIOP Application LAN, so that the huge amounts of data
transfer only between two machines on the SAME LAN and only screen
refreshes cross the WAN. What do you say?
Waiting for your valuable response...
Best Regards
Cheema
Here are your details on TCP windowing... better than me trying to
write a one-paragraph summary:
http://www.ncsa.uiuc.edu/People/vwelch/net_perf/tcp_windows.html
As for your issue, it sounds like you may just have a simple issue of
number of clients and bandwidth. By 'efficient', I mean: are the
clients asking for all the data at once from the DB, rather than making
cell-by-cell queries? If one client makes a request (or a few), and the
server responds with full packets (usually 1514 bytes or whatever)
until the query has been fulfilled, it is network 'efficient'. If the
client is making hundreds of queries for each additional piece of
information, the application needs to be looked at. Especially over a
WAN with limited bandwidth and added latency, this could kill you.
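To put rough numbers on that, here is a minimal Python sketch comparing
one bulk request with many small ones. The 58 ms RTT, 12 Mbps link, and
roughly 1 MB per transaction are figures from this thread; the count of
300 small queries is purely hypothetical, and the model assumes the
queries are issued one after another on an otherwise idle link,
ignoring TCP slow start and windowing.

```python
def total_time_s(round_trips: int, rtt_s: float,
                 payload_bytes: int, bandwidth_bps: float) -> float:
    """Latency cost of the round trips plus transmission time of the payload."""
    return round_trips * rtt_s + payload_bytes * 8 / bandwidth_bps


RTT_S = 0.058              # average RTT from the original post
LINK_BPS = 12_000_000      # 12 Mbps WAN from the original post
PAYLOAD = 1_000_000        # about 1 MB per transaction (from the thread)
CHATTY_QUERIES = 300       # HYPOTHETICAL stand-in for "hundreds of queries"

bulk = total_time_s(1, RTT_S, PAYLOAD, LINK_BPS)
chatty = total_time_s(CHATTY_QUERIES, RTT_S, PAYLOAD, LINK_BPS)
print(f"one bulk request   : {bulk:.1f} s")
print(f"{CHATTY_QUERIES} small queries : {chatty:.1f} s")
```

With these inputs the same 1 MB takes about 0.7 s as one request and
about 18.1 s as 300 sequential queries, almost all of it spent waiting
on round trips rather than on bandwidth.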
Assuming that is not your issue (nothing to prove it, just saying),
you may just have some overloaded times. When the bandwidth is at 50%
per pipe, is the performance good? When the performance is reported
as bad, does the bandwidth show any clear differences, such as 80%
utilization during these times? If so, it sounds like pure volume is
your bottleneck. If not, and it happens at 50% and 80% alike,
what else is going on in the network or on these boxes? TCP
windowing is negotiated, and while a smaller window will not allow as
much data in flight, I still doubt this is your issue. Non-optimized
windowing would also affect ALL your traffic, not just traffic at
certain times. Either it is negotiating properly or not, and it would
not make sense that some transactions take 1-2 seconds while others
take over a minute. This is not a windowing problem. You need to focus
on volume/usage of the network and these boxes during good and bad
times, and see what correlations you can draw.
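One rough way to see why the 50% versus 80% comparison matters is the
textbook M/M/1 queue, where mean delay grows as 1/(1 - utilization).
This is only an illustrative simplification swapped in for the sake of
the example: real WAN traffic is burstier than the model assumes, and
nothing here is measured from this network.

```python
def mm1_delay_factor(utilization: float) -> float:
    """Mean time in system relative to service time for an M/M/1 queue."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return 1.0 / (1.0 - utilization)


# Utilization levels mentioned in the thread, plus 90% for contrast.
for pct in (50, 60, 75, 80, 90):
    print(f"{pct}% utilization -> delay x{mm1_delay_factor(pct / 100):.1f}")
```

Under that model, delay is 2x the unloaded value at 50%, 5x at 80%, and
10x at 90%, which is why averaged utilization numbers in the 70-80%
range deserve a closer look at the exact times users complain.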