Re: question regarding NTP configuration for clusters, and "cluster time" stability
- From: James Browning <jamesb.fe80@xxxxxxxxx>
- Date: Thu, 1 Oct 2009 11:22:27 -0700 (PDT)
On Sep 14, 12:09 pm, "rotor...@xxxxxxxxx" <rotord...@xxxxxxxxx> wrote:
::::snip::::
I have a product that is comprised of a cluster of Linux nodes, with
the cluster ranging in size from 4 to over 100 nodes. To date, we've
used the version of NTP included in the OS (SLES 10) to maintain
internal time synchronization in the cluster, but without associations
to any external NTP servers nor any hardware based time sources. While
this has worked satisfactorily, it does allow for a gradual drift from
UTC over time, so we'd like to extend the product to eliminate this.
What this means in terms of requirements is that we still must
maintain a stable internal "cluster time" with sub-second tolerance.
This should be trivial for NTP to maintain, as that is a rather loose
tolerance compared to many others I've seen discussed. The requirement
to match true UTC is even looser, as all we're trying to do is enable
the use of an external reference to stop what can be a perpetual
drift. Just to give it a number though, let's say we'd like it to be
within 60 seconds of UTC.
The topology of our cluster has two tiers. All of the nodes are
interconnected over a private network, and some subset of the nodes
also have external connections to the LAN where it is deployed. The
subset is always at least 2 nodes, and can be as high as 25% of the
total number of nodes.
Prior to extending the product to allow use of an external (to the
cluster) NTP server or servers, those nodes with external connections
were configured as peer servers to the internal cluster, with all
other nodes pure clients.
After adding support for external NTP servers, we kept something like
the same config: The nodes with external connections were still
servers to the internal network, and were peers of each other. But now
they were also clients of one or more external servers. I understand
that requiring three or more would be better, and we can do that, but
we still have to ensure stability of the internal cluster time even if
a reduced set of servers (including the null set) were reachable.
Our configuration did not work, because we were able to cause
instability in the internal cluster time with perturbations in the
external server. And we have to guarantee stability even with bad
inputs.
What happened was that some (but not all) of those externally
connected nodes deemed the external server a false ticker, and stopped
believing it. But some of the other externally connected nodes did
not, and as a result there was time divergence between members of this
group. It is this divergence that I'm referring to when I speak of a
lack of stability.
So before I go into configuration details, is there a known "best way"
to handle the sort of requirements I described? It sounds like orphan
mode might provide functionality I'm looking for, but I figured in
parallel with empirical experimentation, I'd pursue the analytical
approach and ask people who know more than me. :)
thanks,
Tim
/me is stupid, wrong, a bad dresser, malodorous, and other bad labels.
Might I suggest looking at the FNN page [1] set up by the Aggregate
[2] at the University of Kentucky. It looks like they have a diagram
[3] that I would use in (wrongly) solving your thing.
One suggestion based on FNN (remember if you ask, you can probably get
permission to add another buzzword for the marketering department) is
to have each node peer with a small number of other nodes (four each
in the 64 node example). Then have them accept service from the four
midstream nodes (which peer with each other) and finally have the
midstream nodes accept service from diverse upstream nodes.
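As a rough illustration of that layout, here is a minimal ntp.conf
sketch. Every address is a hypothetical placeholder (10.0.0.x for
midstream nodes, 10.0.1.x for downstream nodes on the private network,
192.0.2.x for upstream servers on the LAN), and which downstream nodes
peer with which would come from the FNN wiring pattern, not from
anything shown here.

```
# --- ntp.conf on ONE downstream node (addresses are placeholders) ---

# Accept service from the four midstream nodes
server 10.0.0.1 iburst
server 10.0.0.2 iburst
server 10.0.0.3 iburst
server 10.0.0.4 iburst

# Peer with a small number of other downstream nodes
peer 10.0.1.12
peer 10.0.1.27
peer 10.0.1.41
peer 10.0.1.58

# --- ntp.conf on ONE midstream node, here 10.0.0.1 ---

# Accept service from diverse upstream servers on the external LAN
server 192.0.2.10 iburst
server 192.0.2.11 iburst
server 192.0.2.12 iburst

# Peer with the other three midstream nodes
peer 10.0.0.2
peer 10.0.0.3
peer 10.0.0.4
```

The two halves belong in separate files on separate machines; they are
shown together only to make the tiers easy to compare.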
Another suggestion would be to switch on authentication and add
'broadcast' and 'broadcastclient' options to all of the downstream
nodes. This should form relatively tightly coupled machines on each
network segment. Secondly, for each network segment have machines from
other network segments peer with (or accept service from) them. Thirdly
have (at least) one machine on each segment accept service from the
midstream boxen. Fourthly, configure the midstream boxen as above.
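A sketch of what the first step might look like on a downstream node,
assuming symmetric-key authentication with key ID 1, a key file at
/etc/ntp.keys, and 10.0.1.255 as the segment's broadcast address. All
three are assumptions for illustration, not values from this thread.

```
# --- ntp.conf on a downstream node (key ID, paths, addresses assumed) ---

# Symmetric keys shared by every node on the segment
keys /etc/ntp.keys
trustedkey 1

# Send authenticated broadcasts to the local segment
broadcast 10.0.1.255 key 1

# Listen for broadcasts from the other nodes on the segment
broadcastclient
```

The key file has to be identical on every node that is meant to trust
the broadcasts, since a broadcast client with authentication enabled
ignores packets it cannot verify.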
Or maybe modify the two possibilities above by replacing some of the
server options with manycastclient options, and possibly the peer
options and the broadcastclient options as well. Then add the
manycastserver option where appropriate, possibly replacing the
broadcast option.
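For the manycast variant, a minimal sketch might look like the
following. The group address 239.1.1.1 and key ID 1 are placeholders,
and this assumes multicast actually works on the private network, which
is worth checking before relying on it.

```
# --- ntp.conf on a midstream node acting as a manycast server ---
keys /etc/ntp.keys
trustedkey 1
manycastserver 239.1.1.1

# --- ntp.conf on a downstream node, in place of fixed "server" lines ---
keys /etc/ntp.keys
trustedkey 1
manycastclient 239.1.1.1 key 1
```

The attraction is that downstream nodes discover whichever midstream
nodes are alive instead of carrying a hard-coded list of them.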
JamesB192 -- making a molehill out of a mountain, once.
[1] http://aggregate.org/FNN/
[2] http://aggregate.org/
[3] http://aggregate.org/IMG/klat2_wires.gif