Re: [9fans] QTCTL?



I know, I think I was not clear, sorry.

The point is that, referring to the
Tinval-Rinval-Tinval-...
sequence, a server could afford to consider its Rinval acknowledged
if the client happens not to respond (by issuing another Tinval)
within 5 seconds.
That would "freeze" clients for at most 5 seconds when a client fails
to respond,
but that should not happen often, and it would not make other clients slow.

Also, regarding
the irony is, the higher the latency, the greater the cost of synchronization.

If we consider that this would happen only for rw files, and that rd files would
be considered as leased up to the next Rinval mentioning them, the cost would
probably not be too high. But I won't actually know before
implementing and trying it.


On 11/1/07, Latchesar Ionkov <lionkov@xxxxxxxx> wrote:
The problem is that the clients with higher latencies badly need to
be able to cache. And the ones with better latencies can afford not
caching :)

On Nov 1, 2007, at 12:03 PM, Francisco J Ballesteros wrote:

But 5 seconds would be enough to be convinced that a client is not properly
responding to invalidation requests, and to cancel all its leases. Why make
other clients wait any longer? I mean, assuming a central FS and clients
connected to it in a star.

On 11/1/07, Latchesar Ionkov <lionkov@xxxxxxxx> wrote:
The 5 seconds lease might work in the local network case, but not
caching at all is going to work out pretty well too. What if you want
to cache over the Internet and your round-trip is 3-4 seconds :)

On Nov 1, 2007, at 11:03 AM, Russ Cox wrote:

The fact is we have loose consistency now, we just don't call it that.
Anytime you are running a file from a server, you have loose
consistency. It works ok in most cases.

Because all reads and writes go through to the
server, all file system operations on a particular
server are globally ordered, *and* that global ordering
matches the actual sequence of events in the
physical world (because clients wait for R-messages).
That's a pretty strong consistency statement!

Any revocation-based system has to have the server
wait for an acknowledgement from the client.
If there is no wait, then between the time that the server
sends the "oops, stop caching this" and the client
processes it, the client might incorrectly use the
now-invalid data. That's why a change file doesn't
provide the same consistency guarantees as pushing
all reads/writes to the file server. To get those, revocations
fundamentally must wait for the client.

It's also why this doesn't work:

Tcache asks whether the server is prepared to cache
Rcache makes lease available with parameters, Rerror says no.

Tlease says, ok start my lease now (almost immediately
follows Rcache)
Rlease lease expired or lease needs to be given back early

Tcache done with old lease (may immediately ask for a new
lease)
etc.

because the Rlease/Tcache sequence is a s->c->s
message. If a client doesn't respond with the Tcache
to formally give up the lease, the server has no choice
but to wait.

If you are willing to assume that each machine has
a real-time clock that runs approximately at the
same rate (so that different machines agree on
what 5 seconds means, but not necessarily what
time it is right now), then you can fix the above messages
by saying that the client lease is only good for a fixed
time period (say 5 seconds) from the time that the
client sent the Tlease. Then the server can overestimate
the lease length as 5 seconds from when it sent the
Rlease, and everything is safe. And if the server
sends a Rlease and the client doesn't respond with
a Tcache to officially renounce the lease, the server
can just wait until Tlease + 5 seconds and go on.
But that means the client has to be renewing the
lease every 5 seconds (more frequently, actually).
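
The timing argument above can be checked with a small sketch. It is an
illustration only, not part of any 9P implementation: the function name,
the clock offsets and the choice to start the server's count at receipt of
the T-message are all assumptions made for the example. The two clocks
disagree about what time it is but tick at the same rate, and the server's
idea of the lease end never falls before the client's.

```python
LEASE_SECONDS = 5.0  # lease length used in the example above


def lease_ends(send_true, delay, client_offset, server_offset):
    """Return (client_end, server_end) on a shared reference timeline.

    send_true      reference time at which the client sends its T-message
    delay          one-way network delay to the server (>= 0)
    client_offset  how far the client's clock is off from the reference
    server_offset  how far the server's clock is off from the reference
    All four parameters are hypothetical inputs for this sketch.
    """
    # The client counts the lease from its own send time, on its own clock.
    client_local_end = (send_true + client_offset) + LEASE_SECONDS
    # The server counts from when the message arrives, on its own clock.
    server_local_end = (send_true + delay + server_offset) + LEASE_SECONDS
    # Convert both back to the reference timeline for comparison.
    return client_local_end - client_offset, server_local_end - server_offset


client_end, server_end = lease_ends(10.0, 3.5, 1000.0, -250.0)
# The server overestimates the lease by exactly the one-way delay,
# so it never treats the lease as over while the client still uses it.
print(client_end, server_end)
```

Any later starting point on the server side is also a safe overestimate;
the cost is only that the server waits longer than strictly necessary.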

Also, in the case where the lease is just expiring
but not being revoked, then you have to have some
mechanism for establishing the new lease before
the old one runs out. If there is a time between
two leases when you don't hold any leases, then
all your cached data becomes invalid.

The following works:

Tnewlease asks for a new lease
Rnewlease grants the lease, for n seconds starting at time of
Tnewlease

Trenewlease asks to renew the lease
Rrenewlease grants the renewal for n seconds starting at time of
Trenewlease

Now if the server needs to revoke the lease, it just
refuses to renew and waits until the current lease expires.
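
A minimal sketch of the server side of this exchange, assuming a single
lease per client and a server that timestamps each T-message on arrival
(which can only overestimate the client's view of the lease). The class
and method names are hypothetical and exist only for this illustration.

```python
class LeaseServer:
    """Toy lease table: grant, renew, and revoke by refusing renewal."""

    def __init__(self, n_seconds):
        self.n_seconds = n_seconds
        self.expiry = {}       # client -> server-side expiry time
        self.revoking = set()  # clients whose renewals are refused

    def tnewlease(self, client, now):
        """Handle Tnewlease; return lease length, or None for Rerror."""
        if client in self.revoking:
            return None
        self.expiry[client] = now + self.n_seconds
        return self.n_seconds

    def trenewlease(self, client, now):
        """Handle Trenewlease; refuse if expired, absent, or being revoked."""
        expiry = self.expiry.get(client)
        if expiry is None or now >= expiry or client in self.revoking:
            return None
        self.expiry[client] = now + self.n_seconds
        return self.n_seconds

    def start_revoke(self, client):
        """Stop renewing; the lease then dies on its own."""
        self.revoking.add(client)

    def lease_gone(self, client, now):
        """True once the current lease has run out; clears revocation state."""
        expiry = self.expiry.get(client)
        if expiry is None or now >= expiry:
            self.expiry.pop(client, None)
            self.revoking.discard(client)
            return True
        return False


srv = LeaseServer(5.0)
srv.tnewlease("c1", 0.0)    # lease runs until 5.0
srv.trenewlease("c1", 3.0)  # renewed, now runs until 8.0
srv.start_revoke("c1")      # later renewals are refused
print(srv.trenewlease("c1", 6.0), srv.lease_gone("c1", 8.0))
```

No message from the server to the client is needed to revoke: the server
only has to wait until the expiry it recorded at the last renewal.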

You can add a pseudo-callback to speed up revocation
with a cooperative client:

Tneeditback offers to give lease back to server early
Rneeditback says I accept your offer, please do

Tdroplease gives lease back
Rdroplease says okay I got it (not really necessary)

The lease ends when the client sends Tdroplease,
*not* when the server sends Rneeditback. It can't end
at Rneeditback for the same reason change files don't work.
And this can *only* be an optimization, because it
depends on the client sending Tdroplease. To get
something that works in the presence of misbehaved
clients you have to be able to fall back on the
"wait it out" strategy.

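The early give-back can be sketched the same way. This is an illustration
under assumptions, with hypothetical names: the server records when it sent
Rneeditback, but only an incoming Tdroplease or the plain expiry of the
lease lets it treat the lease as over.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Revocation:
    """Server-side state for taking one lease back."""

    expiry: float                               # fallback: wait it out
    needitback_sent_at: Optional[float] = None  # informational only
    dropped_at: Optional[float] = None          # arrival of Tdroplease

    def rneeditback(self, now):
        # Asking for the lease back does not end it.
        self.needitback_sent_at = now

    def tdroplease(self, now):
        # The client has given the lease up; this does end it.
        self.dropped_at = now

    def lease_over(self, now):
        if self.dropped_at is not None and now >= self.dropped_at:
            return True           # cooperative client: fast path
        return now >= self.expiry  # misbehaved client: wait until expiry


good = Revocation(expiry=10.0)
good.rneeditback(2.0)
good.tdroplease(3.0)

bad = Revocation(expiry=10.0)
bad.rneeditback(2.0)  # this client never answers

print(good.lease_over(3.0), bad.lease_over(3.0), bad.lease_over(10.0))
```

The field `needitback_sent_at` is deliberately never consulted by
`lease_over`, which mirrors the point that the lease cannot end at
Rneeditback.
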
One could, of course, use a different protocol with
a 9P front end. That's okay for clients, but you'd still
have to teach the back-end server (i.e. fossil) to speak
the other protocol directly in order to get any guarantees.
(If 9P doesn't cut it then anything that's just in front of
(not in place of) a 9P server can't solve the problem.)

Russ






