Re: very slow convergence of ntp to correct time.



Eric,

Many years ago the Proteon routers dropped the first packet after the cache timed out; that was a disaster. That case and the ones you describe are exactly what the NTP burst mode is designed for. The first packet in the burst carves the caches all along the route and back. The clock filter algorithm tosses it out in favor of the remaining packets in the burst. No ICMP is needed or wanted.
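
To illustrate why the first packet of a burst ends up discarded, here is a minimal Python sketch of the core idea: prefer the sample with the smallest round-trip delay. The sample values and the `pick_best_sample` helper are made up for illustration. The real ntpd clock filter also accounts for dispersion and sample age, which this sketch leaves out.

```python
from typing import NamedTuple, Sequence


class Sample(NamedTuple):
    offset_ms: float  # measured clock offset
    delay_ms: float   # measured round-trip delay


def pick_best_sample(samples: Sequence[Sample]) -> Sample:
    """Return the sample with the smallest round-trip delay."""
    if not samples:
        raise ValueError("no samples")
    return min(samples, key=lambda s: s.delay_ms)


# Hypothetical burst: the first packet pays the cache-miss penalty
# along the route, so it shows a longer delay and a skewed offset.
burst = [
    Sample(offset_ms=4.1, delay_ms=28.0),
    Sample(offset_ms=0.3, delay_ms=19.5),
    Sample(offset_ms=0.2, delay_ms=19.2),
    Sample(offset_ms=0.4, delay_ms=19.8),
]

best = pick_best_sample(burst)
print(best)
```

The slow first packet is never selected as long as at least one later packet in the burst travels over the already-warmed path.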

Dave

Eric wrote:
On Sun, 20 Jan 2008 17:50:41 GMT, Unruh <unruh-spam@xxxxxxxxxxxxxx> wrote
for the entire planet to see:


david@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx (David Woolley) writes:


In article <Hiqkj.8953$yQ1.2617@edtnps89>,
Unruh <unruh-spam@xxxxxxxxxxxxxx> wrote:

<snip>

I would assume that ntp is giving these samples with long round trip very low weight, or even
eliminating them.

Note: if these spikes are positive, they may be the result of lost ticks.

Don't think so. I think they are 5-10ms transmission delays. The delays disappear if I run at
maxpoll 7 rather than 10, so I suspect the router is forgetting the
addresses and taking its own sweet time about finding them if the time
between transmissions is many minutes.
chrony has a nice feature of being able to send an
echo datagram to the other machine if you want (before the ntp packet), to
wake up the routers along the way.


There are several related effects here that I have experienced in my NTP
network.

First is the possible ARP resolution overhead. If the IP addresses of
your host and of the destination or default gateway are not passing traffic
frequently, the ARP cache in your host or the local router can time out and
need to be reloaded on each poll. This can cost on the order of 5-10ms and
will affect only one direction of the exchange, so the delay is asymmetric.

Unfortunately ARP often uses a 15 minute TTL, and NTP by default backs off
to a poll interval of about 17 minutes (1024 s, maxpoll 10).
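
The arithmetic can be checked with a few lines of Python. This is only a sketch: the poll interval is 2**maxpoll seconds, the 15 minute ARP TTL is the figure quoted in this message (real values vary by OS and device), and the helper names are made up for illustration.

```python
def poll_interval_seconds(poll_exponent: int) -> int:
    """NTP poll intervals are expressed as a power of two, in seconds."""
    return 2 ** poll_exponent


def cache_expires_between_polls(poll_exponent: int, cache_ttl_s: int) -> bool:
    """True if a cache entry with this TTL ages out before the next poll."""
    return poll_interval_seconds(poll_exponent) > cache_ttl_s


ARP_TTL_S = 15 * 60  # assumption: the 15 minute TTL mentioned in this thread

for exponent in (7, 10):
    interval = poll_interval_seconds(exponent)
    expired = cache_expires_between_polls(exponent, ARP_TTL_S)
    print(f"maxpoll {exponent}: {interval} s ({interval / 60:.1f} min), "
          f"ARP entry expires between polls: {expired}")
```

With maxpoll 10 the entry has aged out by the time the next poll goes out. With maxpoll 7 it has not, which is consistent with the delays disappearing at maxpoll 7.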

Then there is the whole problem that many routers all along the path
experience extra overhead on the first packet of a "flow". Route table
lookups are done by destination IP of course, but the result generally has
to be installed into the cache, or FIB, the first time a new source/dest IP
pair shows up. This is often a 1-3ms overhead. And that entry doesn't last
forever either.

Then there is the MAC cache in your switches, which generally purges
entries after 1-5 minutes. This timeout can often be raised, but that can
sometimes cause issues for others when they are reconfiguring part of the
network.

Another issue is NATing or stateful firewalls. There is often outbound
(or inbound) connection setup time. Without special configuration this
state often "times out" before twenty minutes, leading to more asymmetric
delay.

I think the suggestion of a pre-poll ICMP echo is kinda interesting. It
might be possible to limit the packet TTL to five hops or so, just "warming
up" your side of the network. It might also be better to make it a mostly
standard UDP NTP packet so it matches whatever "rules" the intermediate
devices are applying (and you want them to remember). QoS and policy
routing are both sensitive to port numbers, and certainly most firewalls
are protocol sensitive, so matching the initial packet attributes to the
desired high-performance packet attributes would probably help this
technique work.

To mitigate some of these effects it might not have to be done that often.
In many hierarchical network topologies it might serve just to send one
extra packet every 3-5 minutes using the same source IP/port that NTP
normally uses, to any configured server. And it could still have a limited
TTL if desired. That would at least keep the switch and ARP caches fresh
and depending on the design, the policy and NAT caches as well.
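
A rough Python sketch of such a warm-up packet, under stated assumptions: it sends one NTP-shaped UDP datagram (a bare 48-byte client request) to port 123 with a limited IP TTL. The function names and the hostname in the usage note are hypothetical. Unlike the suggestion above, it sends from an ephemeral source port, since binding to port 123 needs privileges and would collide with a running ntpd, so devices that key their state on the source port would not treat it as the same flow.

```python
import socket
import struct

NTP_PORT = 123


def build_ntp_client_packet(version: int = 4) -> bytes:
    """Build a minimal 48-byte NTP client (mode 3) request."""
    # First byte: leap indicator 0, version number, mode 3 (client).
    first_byte = (0 << 6) | (version << 3) | 3
    return struct.pack("!B", first_byte) + bytes(47)


def send_warmup(server: str, ttl: int = 5, port: int = NTP_PORT) -> int:
    """Send one TTL-limited, NTP-shaped datagram; return bytes sent.

    No reply is read: with a small TTL the packet is expected to be
    dropped a few hops out, after refreshing the nearby caches.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, ttl)
        return sock.sendto(build_ntp_client_packet(), (server, port))
```

Calling something like `send_warmup("ntp.example.net")` from a scheduler every 3-5 minutes would approximate the idea. Whether five hops is enough to reach the devices that matter depends on the local topology.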

- Eric