Re: bites, bytes, packets, TCP, and DSL speed
- From: shamino@xxxxxxxxxx (David C.)
- Date: Fri, 06 Jul 2007 02:39:42 GMT
E Z Peaces <cash@xxxxxxxxxxxxxxx> writes:
> Years ago I read that when an eight-bit byte is sent by TCP, two extra
> bits are added. I think that must be wrong because I have found no
> corroboration.
As others have pointed out, there is some confusion here.
Serial-port communication typically uses 10 bits to represent a byte of
data (the commonly used "N-8-1" pattern of 1 start bit, 8 data bits, one
stop-bit, no parity.) In the days of terminal I/O, E-7-1 was also
popular, and also 10 bits per byte (1 start bit, 7 data bits, 1 stop
bit, even parity). Other combinations also exist that can drive it up
to 12 bits per byte (e.g. E-8-2 : 1 start bit, 8 data bits, even parity
and two stop bits), but they are not common.
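
To make the serial-port arithmetic concrete, here is a minimal sketch in Python. The function name and the parity codes ("N", "E") are just illustrative choices, not part of any serial library.

```python
def bits_per_byte(data_bits, parity, stop_bits):
    """Bits on the wire per character for asynchronous serial framing."""
    start_bit = 1                              # every character begins with one start bit
    parity_bit = 0 if parity == "N" else 1     # "N" = no parity, anything else adds one bit
    return start_bit + data_bits + parity_bit + stop_bits

for name, args in [("N-8-1", (8, "N", 1)),
                   ("E-7-1", (7, "E", 1)),
                   ("E-8-2", (8, "E", 2))]:
    print(name, bits_per_byte(*args))
```

This prints 10, 10 and 12 for the three settings described above.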
But none of that is applicable to broadband. The bit-rate you get is in
terms of 8 bits per byte. The extra framing signals are not generally
counted as part of the data rate.
In theory, if you could use every byte for data, you could take your
broadband bit-rate, divide by 8, and get your download speed.
In practice, this doesn't work, because there are several layers of
overhead that get involved.
At the Ethernet layer (or other layers that use Ethernet-like framing),
there will be 18 bytes of overhead per frame (source and destination MAC
address, layer-3 protocol ID, and CRC). The data portion of a standard
Ethernet frame will be between 46 and 1500 bytes (or 64-1518 bytes for
the entire frame).
Layered over Ethernet is the IP layer. A typical IP header has 20 bytes
of overhead (version, header length, type of service, packet length,
packet ID, fragmentation info, TTL, layer-4 protocol ID, header
checksum, source and destination addresses).
With TCP connections, each TCP packet will typically impose another 20
bytes of overhead (source port, destination port, sequence number, ack
number, HLen/reserved, code bits, window info, checksum, urgent pointer,
and options).
So, for a bulk data-transfer (where you're seeing maximum packet sizes),
assuming the network can carry full-size Ethernet frames end-to-end,
you're looking at 58 bytes of overhead for every 1460 bytes of data, or
about 4%, or an effective bit-rate of 8.3 bits per data byte.
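
The arithmetic behind those figures can be checked with a few lines of Python. This is only a sketch using the header sizes quoted above (no IP or TCP options, full-size frames); the constant names are made up for the example.

```python
ETHERNET_OVERHEAD = 18   # MAC addresses (12) + protocol ID (2) + CRC (4)
IP_HEADER = 20           # typical IP header, no options
TCP_HEADER = 20          # typical TCP header, no options
MTU = 1500               # data portion of a standard Ethernet frame

payload = MTU - IP_HEADER - TCP_HEADER                 # bytes of file data per frame
overhead = ETHERNET_OVERHEAD + IP_HEADER + TCP_HEADER  # bytes of headers per frame
frame = payload + overhead                             # total bytes on the wire

overhead_ratio = overhead / payload        # overhead relative to the data carried
bits_per_data_byte = 8 * frame / payload   # wire bits spent per byte of data

print(payload, overhead, frame)
print(round(overhead_ratio * 100, 1), "% overhead")
print(round(bits_per_data_byte, 1), "bits per data byte")
```

With these assumptions the payload is 1460 bytes, the overhead is 58 bytes, and the result rounds to about 4% and 8.3 bits per data byte.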
But this is under ideal circumstances. Under real-world conditions,
your end-to-end MTU size (that is, maximum packet size) may be less than
1500, not all packets are of maximum size, network conditions impose
delays, and TCP's need to acknowledge every packet adds to the delays.
It's been my experience that a 1.5Mbps DSL line will have a maximum
file-transfer speed of about 160KB/s, or about 1.3Mbps, for an overhead
of about 15% (0.2M / 1.3M = 0.15), or an effective bit-rate of 9.2 bits
per data byte.
Note that you're not actually transmitting 9.2 bits per byte. That's
just the effective data rate you typically get.
Because of this, and to make the math easy to compute without a
calculator, I typically take the bit-rate of any data-link and divide by
10 to get a ballpark estimate of file-transfer speed. The estimate
usually ends up a little low, but close enough to be useful.
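
As a quick illustration of the divide-by-10 rule, here is a small Python sketch. The function name is hypothetical, and it assumes link speeds in bits per second and 1 KB = 1000 bytes.

```python
def estimate_transfer_rate(link_bps, bits_per_data_byte=10):
    """Ballpark file-transfer speed in bytes per second for a given link speed."""
    return link_bps / bits_per_data_byte

dsl = 1_500_000  # a 1.5 Mbps DSL line

ideal = estimate_transfer_rate(dsl, bits_per_data_byte=8)  # no overhead at all
ballpark = estimate_transfer_rate(dsl)                     # the divide-by-10 rule

print(ideal / 1000, "KB/s with zero overhead")
print(ballpark / 1000, "KB/s by the divide-by-10 rule")
```

For the 1.5 Mbps line this gives 187.5 KB/s as the theoretical ceiling and 150 KB/s as the rule-of-thumb estimate, which is a little below the roughly 160 KB/s observed in practice.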
> TCP uses packets. A packet header has at least 20 bytes and may have
> as many as 60. A packet may carry 1 to 65,536 bytes of data.
True, but:
IP headers typically don't have options in them, so they're going to be
only 20 bytes most of the time.
Packets larger than your line's MTU size will be fragmented at the IP
layer. Most IP stacks try to avoid fragmentation, because that adds
overhead. They typically perform "path MTU discovery" to determine the
smallest MTU along the path from sender to receiver, and use that for
all packets. Ethernet typically has an MTU of 1500 bytes (the Ethernet
overhead isn't counted here). Some kinds of networks use larger or
smaller MTUs. The so-called "Internet Standard" size is 576 bytes: every
IPv4 host must be able to accept a datagram that large, so it is the
safe default when the path MTU is unknown.
A smaller MTU means less data per packet, and therefore more packets,
and therefore more overhead.
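
To put rough numbers on that, the following Python sketch counts the packets and header bytes needed to move a 1,000,000-byte file at two MTU sizes. It assumes 20-byte IP and 20-byte TCP headers with no options, ignores link-layer framing and acknowledgements, and the function name is made up for the example.

```python
import math

HEADERS = 40  # 20-byte IP header + 20-byte TCP header, no options

def packets_needed(file_bytes, mtu):
    """Number of full or partial packets needed to carry file_bytes of data."""
    payload_per_packet = mtu - HEADERS
    return math.ceil(file_bytes / payload_per_packet)

file_size = 1_000_000
for mtu in (1500, 576):
    count = packets_needed(file_size, mtu)
    print(mtu, count, count * HEADERS)  # MTU, packets, total header bytes
```

Under these assumptions the 1500-byte MTU needs 685 packets (27,400 bytes of headers), while the 576-byte MTU needs 1866 packets (74,640 bytes of headers).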
Also note that some access-level protocols (like PPPoE and VPN
technologies) add additional overhead beyond what I mentioned above.
> That helps explain why data transfer always seems to be slower than
> connection speed. If packet headers are 20 bytes and packets average
> 80 bytes of data, files will be transferred 20% slower than the flow
> of bytes over the internet.
Packets shouldn't be this small for file-transfers, but you are right
about the general concept. Interactive sessions (like a telnet session,
where individual keystrokes may be transmitted in separate packets)
will, of course, have much smaller amounts of data, and therefore a much
higher percentage of overhead.
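
A short Python sketch shows how sharply the header share grows as the payload shrinks. It counts only a 20-byte IP header plus a 20-byte TCP header (no options, no link-layer framing), and the function name is hypothetical.

```python
def header_share(data_bytes, header_bytes=40):
    """Fraction of each packet's bytes that is header rather than data."""
    return header_bytes / (header_bytes + data_bytes)

for data in (1, 80, 1460):
    print(data, "data bytes:", round(header_share(data) * 100, 1), "% header")
```

A single keystroke (1 byte) comes out at about 97.6% header, an 80-byte payload at about 33.3%, and a full 1460-byte payload at about 2.7%.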
> It helps explain why I can download and upload big files faster than
> my figures from speed tests: the packets in tests don't carry as much
> data, on the average.
There are a lot of different factors that may be involved here.
A good speed test should use a mix of packet sizes that closely
resembles real-world conditions, so that shouldn't be the cause of the
discrepancy (although it might be).
The load on the server you're connecting to will usually have a big
impact - file transfer from a busy server will necessarily be slower
because your data-transfer must share CPU time and bandwidth with
transfers from many other people.
Network congestion can also be a factor, but most carriers engineer
their networks well enough that you shouldn't experience this (not
counting unusual events that may temporarily stress part of a
network).
If your tests use different application protocols (e.g. HTTP vs. FTP),
the switching equipment on the network may give one a priority boost
over others. On networks that do this, voice communication will
typically run at a higher priority than file transfer. There may be
other priorities as well.
> I wonder how TCP decides how much data to put in a packet.
TCP stacks will typically perform path MTU discovery to learn the
smallest MTU between yourself and the other end of the connection. Each
packet will be the maximum size that can fit into the MTU, but no larger
(since larger packets will be fragmented in transit, adding more
overhead.)
In addition to this, TCP performs congestion-control analysis and rate
limiting. Various algorithms are used to detect when packets are
delayed, dropped, retransmitted, or delivered out-of-sequence. When
these things happen, TCP will deliberately slow down its transmission
rate to avoid overwhelming an already-congested network.
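
To give a feel for the "slow down when things go wrong" behavior, here is a toy Python sketch of additive-increase/multiplicative-decrease (AIMD), one classic congestion-control pattern. It is a simplification for illustration only, not what any particular TCP stack does; the function name, the starting window and the step sizes are all assumptions.

```python
def aimd(loss_events, window=1.0, increase=1.0, decrease=0.5):
    """Toy AIMD: grow the sending window steadily, cut it when loss is seen.

    loss_events is a sequence of booleans, one per round trip:
    True means a loss was detected during that round trip.
    """
    history = []
    for lost in loss_events:
        if lost:
            window = max(1.0, window * decrease)  # back off sharply on loss
        else:
            window += increase                    # probe for more bandwidth
        history.append(window)
    return history

print(aimd([False, False, False, True, False]))
```

Starting from a window of 1, three clean round trips raise it to 4, the loss halves it to 2, and the next clean round trip brings it back to 3.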
> If I use email for a 1MB attachment, Mail.app will expand the file a
> lot, perhaps to 1.5MB, before sending. What causes that?
E-mail is a text-based protocol. Although modern SMTP implementations
can support 8-bit data, mail clients typically only send printable ASCII
characters (that is, from 32 (space) through 126 (tilde)).
When you send non-ASCII characters, they must be encoded into ASCII.
Typically, the "quoted-printable" encoding is used. UTF-7 is another
commonly used encoding.
When you send a binary attachment, it must also be encoded into ASCII,
in order to reliably transfer through an e-mail network.
The act of encoding a binary file into text using UU-code (for example)
involves breaking the stream of 8-bit data into 6-bit words, each of
which is represented by a single ASCII character. So every 3 bytes of
data is transmitted as 4 characters, plus a byte or two of overhead per
line of encoded text. This effectively increases the file's size by a
little more than 33%.
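
The size increase is easy to measure with Python's standard library. This sketch encodes 4500 random bytes (an arbitrary size, chosen so it splits evenly into 45-byte uuencode lines) with both `binascii.b2a_uu` and `base64.b64encode`; it leaves out the begin/end lines of a real uuencoded file and the line breaks a mail client would add to Base64.

```python
import base64
import binascii
import os

data = os.urandom(4500)  # stand-in for an arbitrary binary file

# uuencode works on lines of at most 45 input bytes; each output line is
# one length character, 60 encoded characters and a newline.
uu = b"".join(binascii.b2a_uu(data[i:i + 45])
              for i in range(0, len(data), 45))

# Base64 also maps every 3 bytes to 4 characters; b64encode adds no line breaks.
b64 = base64.b64encode(data)

print("original:", len(data))
print("uuencode:", len(uu), round(len(uu) / len(data), 3))
print("base64:  ", len(b64), round(len(b64) / len(data), 3))
```

The Base64 output is 6000 bytes (exactly 4/3 of the input) and the uuencoded output is 6200 bytes, the extra 200 bytes being the per-line length character and newline.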
Other encoding schemes, like Base64 (typical for mail attachments),
BinHex (a Mac-specific encoding) and others use different algorithms
with different levels of efficiency, but they all end up doing the same
thing - they turn arbitrary binary data into ASCII text data, which
necessarily makes it larger. (yEnc, often used for binary newsgroup
postings, is a partial exception: it assumes an 8-bit-clean path and
escapes only a few byte values, so its overhead is much smaller.)
-- David