Re: compression - insights into infinite



On Jun 2, 3:25 am, Thomas Richter <t...@xxxxxxxxxxxxxxxxx> wrote:
jules Gilbert wrote:



I understand your logic Thomas, and would agree with your conclusion,
but for one small fact...

[Insert long and spirited defense here... Sigh. Just like you folks,
I am getting pretty tired of this.]

Briefly, one sentence, I can reduce randomness, even in a very small
amount of data, as I said, a buffer, measured before and after, shows
up as less random after the process -- and no additional information
must be conveyed to the receiver.

Look, this argument is *not* about randomness. If you state that you can
compress *all* data, then this "all" implies random, non-random,
pseudo-random, and what-the-heck-I-can-come-up-with data. It doesn't
matter. As soon as there are more possible input sequences than there
are output sequences, your technology isn't reversible, plain and simple.
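
The counting behind this can be sketched in a few lines of Python. This is only an illustration of the pigeonhole argument, not anything taken from the compressor under discussion; the function name `count_strings_shorter_than` is made up for the example, and the empty bit string is counted as a possible output.

```python
def count_strings_shorter_than(n_bits: int) -> int:
    """Count distinct bit strings of length 0 .. n_bits-1 (empty string included)."""
    # There are 2**k strings of exactly k bits, so the sum is 2**n_bits - 1.
    return sum(2 ** k for k in range(n_bits))


for n in (8, 16):
    inputs = 2 ** n                          # all possible n-bit inputs
    shorter = count_strings_shorter_than(n)  # all candidate "compressed" outputs
    print(f"{n}-bit inputs: {inputs}, strictly shorter outputs: {shorter}")
```

For every n there is exactly one more input than there are strictly shorter outputs, so at least two inputs would have to share an output.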

The 'measurement' is, as I have said, the "stats" routine in the C/
Math library. Basic? Yes. Useful? Incredibly. Wrong? Not very
likely. I'd suggest that maybe I'm measuring something erroneously or
perhaps coming to the wrong conclusion from well constructed
measurements, but since I think I've done everything correctly my next
step is to work with someone who can help me to build a nice
compressor/decompressor package.

For that, it needs an algorithm. The feature of an algorithm is that it
generates for one possible input one predictable, reproducible output.
Unfortunately, you must generate (to decompress again) different outputs
for the same input. That's not an algorithm.

What might work is to publish some of this data, say in an FTP site
and let people examine it.

Well, all fine with me. Again, I throw in the following input for you to
compress:

All 256 one-byte sequences. All 65536 two-byte sequences.

Those include random, pseudo-random, not-at-all-random and whattheheck
other sequences of 16 bits. Posting the output of those would be
enlightening, probably more for you than for us, but nevertheless, do
that, and put the said output on a web-space, or an ftp server, or mail,
and measure their length. I *DO NOT CARE* about your top-secret
compressor or decompressor.

My early work with this method is best described this way:

a) An INPUT cell, let's stay with bytes for this example.

b) OUTPUT cell, residue, part A. The problem with this was that some
cells, say 5% (wrt a 1-byte organization,) were in excess of 8-bits.
Not very convenient for a compressor, of course. The other cells
were, typically, reduced by at least 1-bit.

Now, having done additional work, I have much smaller residues and no
bytes in excess of 8-bits. In fact all output cells contain,
typically, less than four bits. Frequent output values are 4 and 5,
for example. Plus a sign bit, but the sign bits compress quite
handily. (They are best converted into sequences of primes.)

Jules, you still don't understand. All this doesn't matter at all,
unless you're willing to admit that said algorithm won't compress all
input. In fact, like all other compression algorithms, it will fail most
of the time. However, the failure cases for most other algorithms are
interesting (because they include text, graphics or other sources humans
care about), so what are the interesting cases of your algorithm?

Another value, needed to reconstruct the original value, isn't sent.
It's inferred on the receiver side. Remember my two-bit compressor
that worked by inferring the sign bit? This reconstruction *may* be
imperfect, but I have a simple inductive proof that, worst case, the
information that must be sent is always less than 8-bits. (I don't
mean that the compressor may be lossy, I am concerned to produce a
space reduction all the time -- that's what I mean here.)

If the overall amount is less than 8 bits while the input is 8 bits, your
proof is flawed. It is as simple as that.

Again, and this is a very very simple exercise, feed all 256 possible
byte inputs into your compressor, in initial state, and post or collect
the outputs.

Two things can happen:

a) Either, in that said output, sequences of 8 bits or longer appear, in
which case the algorithm might work, but your proof is flawed.

or

b) all sequences that are generated this way are shorter than 8 bits,
in which case there are fewer than 256 of them, in which case at least
two input sequences must map to the same output, in which case you don't
have a decompressor.
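
The exercise can be sketched in Python as below. The two encoders are hypothetical stand-ins invented for this example (they are not the method being discussed): one strips leading zero bits, the other drops the lowest bit. The helper `examine` is also an assumed name; it simply feeds all 256 byte values through an encoder and reports which of the two outcomes occurs.

```python
from collections import defaultdict
from typing import Callable, Dict, List


def strip_leading_zeros(value: int) -> str:
    """Toy encoder (hypothetical): binary digits without leading zeros."""
    return format(value, "b") if value else ""


def drop_low_bit(value: int) -> str:
    """Toy encoder (hypothetical): always emits 7 bits by discarding the lowest bit."""
    return format(value >> 1, "07b")


def examine(encoder: Callable[[int], str]) -> Dict[str, object]:
    """Run every byte value through `encoder` and classify the result."""
    by_output: Dict[str, List[int]] = defaultdict(list)
    for value in range(256):
        by_output[encoder(value)].append(value)
    # Outcome a): inputs whose output is not shorter than the 8-bit input.
    not_shorter = sorted(v for out, vs in by_output.items() if len(out) >= 8 for v in vs)
    # Outcome b): outputs shared by more than one input (not decodable).
    collisions = {out: vs for out, vs in by_output.items() if len(vs) > 1}
    return {"not_shorter": not_shorter, "collisions": collisions}


for enc in (strip_leading_zeros, drop_low_bit):
    report = examine(enc)
    print(enc.__name__,
          "| inputs not shortened:", len(report["not_shorter"]),
          "| colliding outputs:", len(report["collisions"]))
```

Whatever encoder is plugged in, at least one of the two lists comes back non-empty, which is the a)-or-b) dichotomy stated above.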

My proof depends on something I can do in a CAS but don't have the
math to work out by hand (i.e., to understand). Still, I can build a
symbolic system that I'm sure wouldn't work if the mathematics I am
depending on were deficient.

Math depends on the elementary laws of logic, and so does the internal
wiring of a computer. For that simple reason, you will never have the
"math" to represent 256 inputs by fewer than 8 bits.

The savings (the information that need not be sent,) varies with the
organization, but I think I can demonstrate at least a single bit per
byte savings.

Then, I beg you, demonstrate. You'll see that either a) or b) is the
outcome. You may post here. Or on an ftp site, I don't care, but *do*.

Words are cheap.

But before I commit to doing this, can I see a show of hands of people
who would look at the data, please.

So then, post, or mail. My email is valid. I do not want to see your
algorithm, I don't care. All I need to see is the output of all one-byte
sequences, and, if you like, the output of all two-byte sequences. A
text file, one line containing the bits of the output is enough, use the
symbols 0 and 1. No binary encoding required, text files preferred.

That is, the files you post should contain exactly 256 rows, or exactly
65536 rows. Each row defines the output of the corresponding byte input,
given by its line number, starting with line #0 (i.e. the input pattern
of line zero is 00000000 for the byte-file).

But please go over it first - if I find a duplicate output line, you
cannot decompress anymore. And, as said, the output may be either a) or b).
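
A posted file in that format could be checked along the following lines. This is a minimal sketch, assuming the file has already been read into a string; the function name `check_output_table` and the shape of the returned report are choices made for this example only.

```python
from typing import Dict, List, Tuple


def check_output_table(text: str, expected_rows: int = 256,
                       input_bits: int = 8) -> Dict[str, object]:
    """Check a text table with one row of '0'/'1' symbols per input (row 0 = input 0)."""
    rows = text.splitlines()
    bad_format: List[int] = []               # rows containing symbols other than 0/1
    duplicates: List[Tuple[int, int]] = []   # (row, earlier row with identical output)
    not_shorter: List[int] = []              # rows whose output is >= input_bits long
    first_seen: Dict[str, int] = {}
    for line_no, row in enumerate(rows):
        if set(row) - {"0", "1"}:
            bad_format.append(line_no)
            continue
        if row in first_seen:
            duplicates.append((line_no, first_seen[row]))
        else:
            first_seen[row] = line_no
        if len(row) >= input_bits:
            not_shorter.append(line_no)
    return {
        "row_count_ok": len(rows) == expected_rows,
        "bad_format": bad_format,
        "duplicates": duplicates,     # non-empty: cannot be decompressed
        "not_shorter": not_shorter,   # non-empty: not every input was compressed
    }


# Example: the identity "compressor" writes each byte back as 8 bits.
sample = "\n".join(format(i, "08b") for i in range(256))
report = check_output_table(sample)
print(report["row_count_ok"], len(report["duplicates"]), len(report["not_shorter"]))
```

For the two-byte file the same check would be called with `expected_rows=65536` and `input_bits=16`.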

So long,
Thomas

Damn... you must have a really righteous conscience -- being this
philanthropic despite
http://groups.google.ca/group/comp.compression/browse_thread/thread/96ed7ff926a6af7b/d5f0d4604a55fa23

Respect.

And Jules, I command you at this point to continue this
discussion, answer every one of Tom's questions and not ignore the
points he makes, unless you're a complete douche. You owe it to him
after dismissing him in that "A Christmas day compressor" thread. He
wasn't being a *** to you like you complain about the rest of the
crowd, he was unbiased and responded to every technical point of your
system that you elaborated on.

Now pull your *** together and do the same.