Frequent time reset messages



I'm running a moderate number (around 50) of dual-Opteron machines that
boot diskless into a Linux 2.6.12 SMP kernel and try to sync with a
Symmetricom XLI-GPS stratum-1 NTP server on an isolated network.

The problem is that when I run "ntpq -c peers" on a number of
these machines to check the status of the NTP synchronization, I see
offsets ranging over almost 1000 ms. If I grep through
/var/log/messages, I see messages like these roughly every 20
minutes:

Dec 1 20:30:28 (none) ntpd[27203]: time reset 0.613771 s
Dec 1 20:30:28 (none) ntpd[27203]: synchronisation lost
Dec 1 20:50:45 (none) ntpd[27203]: time reset 0.931388 s
Dec 1 20:50:45 (none) ntpd[27203]: synchronisation lost
Dec 1 21:19:23 (none) ntpd[27203]: time reset 0.451491 s
Dec 1 21:19:23 (none) ntpd[27203]: synchronisation lost
Dec 1 21:36:24 (none) ntpd[27203]: time reset 0.391510 s
Dec 1 21:36:24 (none) ntpd[27203]: synchronisation lost
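
To put numbers on "large and frequent", the intervals between resets and the step sizes can be pulled out of the log with a short script. This is only a rough sketch: the year (2005) is taken from the ntpq output further down, the function names are made up for illustration, and the "implied rate" assumes the clock was correct right after each step and that the whole of the next step built up as steady drift, which may not be what actually happened.

```python
from datetime import datetime

# The "time reset" lines from /var/log/messages quoted above.
LOG = """\
Dec 1 20:30:28 (none) ntpd[27203]: time reset 0.613771 s
Dec 1 20:50:45 (none) ntpd[27203]: time reset 0.931388 s
Dec 1 21:19:23 (none) ntpd[27203]: time reset 0.451491 s
Dec 1 21:36:24 (none) ntpd[27203]: time reset 0.391510 s
"""


def parse_resets(text, year=2005):
    """Return (timestamp, step_seconds) for every 'time reset' line.

    syslog lines carry no year, so one has to be supplied (assumed 2005).
    """
    resets = []
    for line in text.splitlines():
        if "time reset" not in line:
            continue
        fields = line.split()
        stamp = datetime.strptime(
            f"{year} {fields[0]} {fields[1]} {fields[2]}", "%Y %b %d %H:%M:%S"
        )
        resets.append((stamp, float(fields[-2])))  # second-to-last field is the step
    return resets


def implied_rates(resets):
    """Return (interval_s, step_s, ppm) for each pair of consecutive resets.

    Assumption: the whole step accumulated as drift since the previous reset.
    """
    rates = []
    for (t0, _), (t1, step) in zip(resets, resets[1:]):
        interval = (t1 - t0).total_seconds()
        rates.append((interval, step, step / interval * 1e6))
    return rates


resets = parse_resets(LOG)
rates = implied_rates(resets)
for interval, step, ppm in rates:
    print(f"{interval:6.0f} s between resets, step {step:.6f} s, ~{ppm:.0f} ppm")
```

Under those assumptions, the three intervals in this excerpt come out at about 1217 s, 1718 s and 1021 s, i.e. a few hundred ppm each.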

These seem like large (and frequent) steps to be occurring. I have a
fairly simple ntp.conf file:
---------------------------------
restrict default ignore
restrict 10.2.40.1 mask 255.255.255.255 nomodify notrap noquery
restrict 127.0.0.1

server 10.2.40.1 iburst
server 127.127.1.0 iburst # local clock
fudge 127.127.1.0 stratum 5 # default was 10

driftfile /var/lib/ntp/drift
----------------------------------

These machines each have a Gigabit network connection to a high-end
network switch. I believe the NTP server probably has only a 100 Mbit
link, and it handles all the traffic, but I don't think that is the
problem.

Probably the main issue is the CPU and I/O loading on these Opteron
machines. They are each handling streaming data from a FireWire card
(IEEE-1394a), and the CPUs stay fairly busy handling that data, though
they are not pegged at 100% or anything.

Here is a typical ntpq output:
ntpq> as
ind assID status conf reach auth condition last_event cnt
===========================================================
1 48644 9634 yes yes none sys.peer reachable 3
2 48645 9034 yes yes none reject reachable 3
ntpq> rv 48644
status=9634 reach, conf, sel_sys.peer, 3 events, event_reach,
srcadr=ntpserv, srcport=123, dstadr=10.1.1.1, dstport=123, leap=00,
stratum=1, precision=-9, rootdelay=0.000, rootdispersion=5.554,
refid=GPSM, reach=377, unreach=0, hmode=3, pmode=4, hpoll=7, ppoll=7,
flash=00 ok, keyid=0, offset=360.879, delay=2.544, dispersion=3.803,
jitter=6.636, reftime=c739efcd.cf993b0f Thu, Dec 1 2005 21:55:25.810,
org=c739efde.6ea22848 Thu, Dec 1 2005 21:55:42.432,
rec=c739efde.1292f6e8 Thu, Dec 1 2005 21:55:42.072,
xmt=c739efde.0c8ede54 Thu, Dec 1 2005 21:55:42.049,
filtdelay= 2.54 4.42 2.50 2.98 2.55 2.61 2.44 2.68,
filtoffset= 360.88 354.24 412.02 412.20 464.11 -95.25 -78.39 -56.90,
filtdisp= 1.96 3.90 5.82 7.77 9.70 11.62 12.61 13.57
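
The spread in the filter registers above can be summarized with a few lines of Python. This is only a sketch: `filter_values` is a hypothetical helper written for this post, not part of ntpq, and it assumes the register values follow the `name=` token as whitespace-separated numbers, possibly wrapped across lines as in the mail.

```python
# The filtdelay and filtoffset registers from the "rv" output above (values in ms).
RV_TEXT = """\
filtdelay= 2.54 4.42 2.50 2.98 2.55 2.61 2.44
2.68,
filtoffset= 360.88 354.24 412.02 412.20 464.11 -95.25
-78.39 -56.90,
"""


def filter_values(text, name):
    """Collect the numbers that follow 'name=' until a non-numeric token appears."""
    start = text.index(name + "=") + len(name) + 1
    values = []
    for token in text[start:].replace(",", " ").split():
        try:
            values.append(float(token))
        except ValueError:
            break  # reached the next variable name
    return values


offsets = filter_values(RV_TEXT, "filtoffset")
delays = filter_values(RV_TEXT, "filtdelay")

print(f"offset: min {min(offsets):.2f} ms, max {max(offsets):.2f} ms, "
      f"spread {max(offsets) - min(offsets):.2f} ms")
print(f"delay:  min {min(delays):.2f} ms, max {max(delays):.2f} ms")
```

For this sample the offsets span roughly 560 ms (from -95.25 to 464.11) while the round-trip delays stay between 2.44 and 4.42 ms, which is part of why I don't think the network link is the culprit.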

If anyone has any suggestions about what might be happening, or how to
keep these machines synced up more tightly, I would certainly appreciate
it. I've dug around through FAQs, wikis, docs, etc., but I'm not sure
exactly why my time is bouncing around so much.

thanks in advance,
bob
--
Bob Robison bob.robison@xxxxxxxx
Staff Engineer 210-522-3935
Southwest Research Institute San Antonio, TX
