Re: Server Side filtering (as pertains to Google Groups)



In news.software.readers on Fri, 15 Aug 2008 08:53:32 -0500,
VanguardLH <V@xxxxxxxxx> wrote:

Peter J Ross wrote:

In news.software.readers on Fri, 15 Aug 2008 08:01:17 -0500,
VanguardLH <V@xxxxxxxxx> wrote:

<...>

The problem with using leafnode, hamster, or any local proxy or NNTP
server is in exceeding a user's quota when leeching from their real NNTP
provider. Many users have a monthly bandwidth quota. Only a few
headers are included in the overview headers, like Message-ID (but you
don't need a proxy to filter on that header). The other headers require
downloading the message, not just the headers. While those with monthly
quotas might not have header downloads counted against the quota,
downloading the messages does count. That means a user could easily and
quickly consume the monthly quota in a few days or maybe just a couple
of hours or even on the first yank of all posts.
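
To put rough numbers on the quota concern above, here is a minimal Python sketch. Every figure in it (article count, average overview size, average body size) is a hypothetical placeholder and not a measurement from any real group or provider. For simplicity it counts overview traffic as well as bodies, although some providers may not charge header downloads against the quota.

```python
def estimated_transfer_bytes(article_count: int,
                             avg_overview_bytes: int,
                             avg_body_bytes: int,
                             bodies_fetched: int) -> int:
    """Estimate total bytes pulled from the upstream NNTP provider."""
    overview = article_count * avg_overview_bytes   # one overview line per article
    bodies = bodies_fetched * avg_body_bytes        # only the bodies actually retrieved
    return overview + bodies


# Hypothetical figures for a busy binaries group (assumptions, not data).
ARTICLES = 10_000
OVERVIEW_BYTES = 300
BODY_BYTES = 500_000

# Proxy pulls every body so that it can filter on non-overview headers.
all_bodies = estimated_transfer_bytes(ARTICLES, OVERVIEW_BYTES, BODY_BYTES, ARTICLES)

# Only the 20 articles the user actually selects are pulled.
selected = estimated_transfer_bytes(ARTICLES, OVERVIEW_BYTES, BODY_BYTES, 20)

print(all_bodies, selected)
```

With these made-up inputs, the body traffic dwarfs the overview traffic, which is the point of the paragraph above.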

In Leafnode 2, you can use something like:

groupdelaybody = alt.binaries.*

or, globally:

delaybody = 1

The effect is to retrieve bodies only for articles whose headers have
been requested by a client.

I've never used these settings myself, but people who frequent binaries
groups might find them helpful.
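
The behaviour described for "delaybody" (headers are listed, a body is fetched only when a client asks for it) can be sketched in Python as below. This is only an illustration of the idea, not Leafnode 2's implementation; the class name DelayedBodySpool, the fake_upstream function and the sample Message-IDs are all hypothetical.

```python
from typing import Callable, Dict


class DelayedBodySpool:
    """Hypothetical local spool: headers stored up front, bodies on demand."""

    def __init__(self, fetch_body: Callable[[str], str]) -> None:
        self._fetch_body = fetch_body          # stands in for the upstream server
        self._headers: Dict[str, str] = {}     # Message-ID -> Subject
        self._bodies: Dict[str, str] = {}      # Message-ID -> body, filled lazily
        self.upstream_fetches = 0              # how many bodies were really pulled

    def add_header(self, message_id: str, subject: str) -> None:
        self._headers[message_id] = subject

    def list_articles(self) -> Dict[str, str]:
        """What the client sees: every known article, body present or not."""
        return dict(self._headers)

    def get_body(self, message_id: str) -> str:
        """Fetch the body from upstream on first request, then reuse the copy."""
        if message_id not in self._headers:
            raise KeyError(message_id)
        if message_id not in self._bodies:
            self._bodies[message_id] = self._fetch_body(message_id)
            self.upstream_fetches += 1
        return self._bodies[message_id]


def fake_upstream(message_id: str) -> str:
    # Placeholder for a real NNTP body retrieval.
    return f"body of {message_id}"


spool = DelayedBodySpool(fake_upstream)
spool.add_header("<a1@example.invalid>", "slrn pre0.9.9 for Windows")
spool.add_header("<a2@example.invalid>", "slrn source tarball")

first = spool.get_body("<a1@example.invalid>")   # pulled from upstream
again = spool.get_body("<a1@example.invalid>")   # served from the local copy
print(spool.upstream_fetches)
```

Both articles stay visible in the listing, but only the one that was requested costs any body bandwidth, and re-reading it costs none.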

But isn't the purpose of the filtering proxy to keep the user from
even seeing the unwanted posts in their NNTP client?

That's one of its potential purposes.

The user would see
the unwanted post and then try to retrieve its body.

If the post is filtered, it won't be seen in the list of available
articles. If it is seen, the user can choose to download it or not.

After retrieving
the body (which consumes bandwidth), the proxy applies its filters to
delete that post.

Leafnode 2 doesn't retrieve bodies, or even full headers, if a "kill"
filter is used on XOVER data.

It doesn't retrieve bodies at all, unless requested by the user, if
"delaybody" is used.

The NNTP client gets an error and reports the post is
no longer available on the server.

That can happen with Leafnode 2 if new filters are applied manually
with "applyfilter", but not otherwise, as far as I'm aware.

Both headers and bodies of "killed" articles are deleted from the
local spool and database.

The idea you propose is to reduce but not eliminate the bandwidth loss
in downloading the bodies of posts the user doesn't want to see.

No. The user might want to see that articles are available, but
retrieve bodies for only some of them.

In alt.binaries.software.newsreaders (if such a group exists), I
might want to filter out all articles that don't have "slrn" in the Subject,
but Leafnode would then allow me to choose whether or not to download
the bodies of the non-filtered articles - e.g. I might want only slrn
pre0.9.9 for Windows, while still being interested in seeing what
other slrn binaries were available.
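
A minimal Python sketch of that kind of Subject filter, applied to overview data alone, might look like the following. It assumes the standard tab-separated overview field order (article number, Subject, From, Date, Message-ID, References, bytes, lines); it is not Leafnode's filter syntax or code, and the sample lines are invented.

```python
import re
from typing import List, NamedTuple


class Overview(NamedTuple):
    number: int
    subject: str
    sender: str
    date: str
    message_id: str
    references: str
    size: int
    lines: int


def parse_overview(line: str) -> Overview:
    """Split one tab-separated overview line into its standard fields."""
    fields = line.rstrip("\r\n").split("\t")
    num, subject, sender, date, msgid, refs, size, lines = fields[:8]
    return Overview(int(num), subject, sender, date, msgid, refs,
                    int(size or 0), int(lines or 0))


def keep(ov: Overview, wanted: "re.Pattern[str]") -> bool:
    """Keep an article only if its Subject matches; no body is needed."""
    return wanted.search(ov.subject) is not None


# Invented overview lines for illustration only.
SAMPLE: List[str] = [
    "101\tslrn pre0.9.9 for Windows\tposter@example.invalid\t"
    "Fri, 15 Aug 2008 10:00:00 +0000\t<a1@example.invalid>\t\t2500000\t32000",
    "102\tSome other reader 1.0\tposter@example.invalid\t"
    "Fri, 15 Aug 2008 10:05:00 +0000\t<a2@example.invalid>\t\t4000000\t51000",
    "103\tslrn source tarball\tposter@example.invalid\t"
    "Fri, 15 Aug 2008 10:10:00 +0000\t<a3@example.invalid>\t\t900000\t11500",
]

wanted = re.compile(r"slrn", re.IGNORECASE)
visible = [ov for ov in map(parse_overview, SAMPLE) if keep(ov, wanted)]

for ov in visible:
    print(ov.number, ov.subject, ov.size)
```

The articles that survive the filter are merely listed; whether to download each body remains a separate decision.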

It
only downloads those bodies of posts that the user selected in their
NNTP client. Alas, this approach still has the user afflicted with
seeing the post in their headers listing in their NNTP client only to
have it error when it tries to get just that post.

I don't see how that can happen if both client and server are behaving
sensibly. Posts listed as available shouldn't subsequently be deleted
by filters.

I don't have my NNTP client downloading all message bodies in a
newsgroup. It already only downloads the body if and when I select a
particular message to retrieve it. I suspect that is how most or all
NNTP clients behave. So the NNTP client is already eliminating
bandwidth consumption by retrieving only the bodies of messages that the
user selects to download. There's no point in downloading all messages
in a newsgroup other than those you want to read.

Some news clients can be configured to download and store all bodies.

This setup in the proxy only reflects what the NNTP client is already
doing.

Yes, but if you're already using a proxy for whatever reason, the
option not to download bodies for all or some groups is useful.

The only difference that I see in behavior is that the NNTP
client would use its filtering to handle the unwanted message (by
deleting it, colorizing it, or whatever) versus the NNTP client issuing
an error that the message is no longer on the server (because the proxy
deleted its copy of that message).

The proxy (in the case of Leafnode 2) deletes filtered articles before
making information about them available to clients. It doesn't delete
non-filtered articles, such as those to which "delaybody" might apply,
at all (except of course when they expire, or if the Leafnode 2 admin
chooses to delete articles manually or with "applyfilter").

For the user to never even see the unwanted posts, the proxy would have
to download all message headers to apply its filters against the
overview headers.

No, Leafnode 2 can filter on the overview headers.

<...>

I don't see where you came out ahead on bandwidth consumption.

Using Leafnode 2 typically requires more bandwidth than not using it,
though there are savings when re-reading articles.

I know too little about proxy/server software other than Leafnode 2 to
comment. I originally only wanted to point out that Leafnode 2 doesn't
have to retrieve bodies for articles that the user doesn't
specifically ask for.


<...>


--
PJR :-)

<http://pjr.lasnobberia.net/usenet/>
<http://slrn-doc.sourceforge.net/>