random compression proven



On Jul 25, 6:27 am, mcjason <mcja...@xxxxxxxxx> wrote:
On Jul 25, 5:55 am, mcjason <mcja...@xxxxxxxxx> wrote:





On Jul 25, 5:05 am, Thomas Richter <t...@xxxxxxxxxxxxxxxxx> wrote:

mcjason wrote:

Second point above. Please state what "random" means. You haven't done
so yet. Please do your homework - it's really about helping you, not
about annoying you. Nobody can do that for you, you must learn it yourself.

data where the trend is for there to be few repeat occurrences of a length of
data, where it's usually not a worthwhile tradeoff to say one
occurrence of what repeats and use a token for the rest, because tokens
have only a limited way of being said alongside what else is said. Because in
random data the allocation space for a token is usually too exhausted
for there to be a worthwhile way of saying what a token is for what else
is said, for how a repeat occurrence of a length of data can be said
once with a token otherwise.

Not a very reasonable definition, but for the time being, let's take
this. According to this definition, the following string

1234567891012131415161718191202122232425262728293031323334353637383940...

is random (nothing repeats, provably), though a ten-year-old can still
see its construction algorithm.
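
To make the point concrete, here is a minimal sketch, assuming the string above is meant to be the decimal integers written one after another. The function name `counting_string` is made up for illustration. A single short rule produces as much of the string as you like, which is why "few long repeats" alone does not make data random in any useful sense.

```python
def counting_string(limit: int) -> str:
    """Concatenate the decimal integers 1..limit, e.g. '123456789101112'."""
    return "".join(str(i) for i in range(1, limit + 1))


if __name__ == "__main__":
    s = counting_string(40)
    print(s)
    print(len(s), "characters produced by a one-line rule")
```

The whole "construction algorithm" is the body of one function, and its size does not grow with the length of the output.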

Hint: You seem to believe that "random" is an attribute that you can
apply to a sequence you can point at. "Random" is the property of a
process, not of a specific string in particular. Depending on the
process, the string

1111111111111111111111111111111111111111111111111111111111111....

is as likely as the above.
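
As a small illustration of "depending on the process", the sketch below assumes one particular process: independent, uniformly distributed decimal digits. The helper name `probability_uniform_digits` is hypothetical. Under that assumption the probability of a string depends only on its length, so a run of ones and a counting string of the same length are equally likely.

```python
from fractions import Fraction


def probability_uniform_digits(s: str) -> Fraction:
    """Probability that a source of independent, uniform decimal digits
    emits exactly the string s (assumed process, chosen for illustration)."""
    if not all(c in "0123456789" for c in s):
        raise ValueError("expected a string of decimal digits")
    return Fraction(1, 10 ** len(s))


if __name__ == "__main__":
    print(probability_uniform_digits("1111111111"))
    print(probability_uniform_digits("1234567891"))
```

A different process (say, one that emits "1" with probability 0.99) would rank the two strings very differently, which is the sense in which randomness belongs to the process.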

I understand perfectly why this can be seen as a problem when it comes
to compressing with the technique of saying what repeats once with a
token for other occurrences.

I'm not saying this. *You* say this.

It's intuitive to think of this the way
the problem is usually described. But I can't find anywhere a statement
that random data being hard to compress isn't connected with the idea of
only working by making repeat occurrences fewer, with tokens
taking up a naming allocation.

It's very limited to think that's the only way to compress. I gave A
PERFECT analogy of how this is VERY WRONG.

*Sigh* You gave a non-working example. What makes you believe that I
think in "patterns"? I don't. My field is *image compression*, and you
can compress images even though there are no patterns; the algorithms
used there do not look for matched patterns. Hence, please do not try to
tell me what I do and do not know - I think it's time for you to
deepen your research.

it's to say this proves how random is compressible. take it whatever
way you want, I know it's right.

Using a definition of "random" that makes sense (your definition
doesn't, I wouldn't call either of the strings random), you cannot
compress random strings.

say for every length of data there can be a shape, a shape that's
different for every way the data is different.
given perfect math it would be a shape the same size as the data,
because of making a different shape for every way the data is
different.

That's a "data model"; the question is: "Is this data model reasonable
for compressing data?" And the answer is: For every model one can construct
data that cannot be successfully modeled by it (IOW, cannot be
compressed, using an optimal entropy coding algorithm on the output of
the model). In your case, the model would be to draw shapes or curves or
spheres. As long as you don't give better arguments as to why you believe
the model you have is good, and which type of data it is good for,
this is a lost attempt.

What you don't seem to realize is that while it is fairly true that more
complex models can describe more complex data, these models *also*
require more modeling parameters you somehow have to encode as part of
the message. It is a trade-off between simplicity of the model against
the size of the model parameters. Choosing a simple pattern repetition
model (as in LZ77) leaves only few model parameters (length and offset),
but it is only sufficient to match patterns exactly (from the past) and
not to describe sequences with a more complicated construction algorithm
(such as the one I gave above). You can surely introduce models that do that
better, but then you also need more parameters.
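
For readers who have not seen a pattern repetition model, here is a rough sketch in the spirit of LZ77. It is not the historical algorithm or any library's implementation: the names `lz77_tokens` and `lz77_expand`, the greedy search, the window of 255 and the minimum match length of 3 are all assumptions made for illustration. It shows that the only model parameters per match are an offset and a length.

```python
def lz77_tokens(data: bytes, window: int = 255, min_len: int = 3):
    """Greedy LZ77-style parse into ('lit', byte) and ('copy', offset, length)."""
    tokens = []
    i = 0
    while i < len(data):
        best_len, best_off = 0, 0
        # Try every earlier start position inside the window.
        for start in range(max(0, i - window), i):
            length = 0
            # A match may run past position i (overlapping copy).
            while i + length < len(data) and data[start + length] == data[i + length]:
                length += 1
            if length > best_len:
                best_len, best_off = length, i - start
        if best_len >= min_len:
            tokens.append(("copy", best_off, best_len))
            i += best_len
        else:
            tokens.append(("lit", data[i]))
            i += 1
    return tokens


def lz77_expand(tokens) -> bytes:
    """Inverse of lz77_tokens: replay literals and copies."""
    out = bytearray()
    for tok in tokens:
        if tok[0] == "lit":
            out.append(tok[1])
        else:
            _, offset, length = tok
            for _ in range(length):
                out.append(out[-offset])  # byte by byte, so overlaps work
    return bytes(out)


if __name__ == "__main__":
    print(lz77_tokens(b"abcabcabcabc"))
```

Such a model handles exact repeats from the past well, but it has no way to express a rule like "write the next integer", so a counting string gets little benefit from it.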

In the end, you'll never have an algorithm that "perfectly compresses
everything" because even though your model is then very complete, it is
so complicated that you need to transmit too much data just to describe
it. You *cannot* win this game, it's a logical constraint about maps
between finite sets, a very elementary one.
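
The "maps between finite sets" constraint can be checked by simple counting. The sketch below (function names are hypothetical) counts the bit strings of length n and all strictly shorter bit strings. There is always one fewer shorter string than there are inputs, so no lossless (one-to-one) scheme can shorten every input of length n.

```python
def count_bitstrings(n: int) -> int:
    """Number of distinct bit strings of exactly n bits."""
    return 2 ** n


def count_shorter_bitstrings(n: int) -> int:
    """Number of distinct bit strings of length 0, 1, ..., n-1."""
    return sum(2 ** k for k in range(n))  # always 2**n - 1


if __name__ == "__main__":
    for n in (1, 8, 16):
        print(n, count_bitstrings(n), count_shorter_bitstrings(n))
```

However the encoder is built, whether with tokens, shapes or curves, its outputs are still finite strings that the decoder must tell apart, so the same count applies.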

now say for two lengths of data, a shape for each.

now.. this might be a little harder to believe is right.

I'm not arguing at this level - you don't seem to understand.

given a shape, and another shape, there is math to say the shape but
made different, to the other shape, where the math to say one shape
different to the other shape is smaller than the other shape. So
instead of saying two shapes, say one shape and the math to make the
shape different as the other shape.

All very well, but you still need data to describe this "different", and
you'll soon find out (once you dare to try to implement it) that
the overall byte budget required to describe this "different" is higher
than the byte budget you save by using this model, at least for *most* data.

If you don't believe this, I urge you to implement your idea in an
algorithm and observe this yourself. Depending on the data set, the most
successful models are simple.
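
One quick way to observe this yourself, before writing a new algorithm, is to hand two inputs to an existing general-purpose compressor. The sketch below uses Python's standard `zlib` module. The seed, the sizes and the helper name `compressed_size` are arbitrary choices for illustration, and the "noise" comes from a pseudo-random generator, so it only stands in for random data.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of the zlib stream for data at the highest level."""
    return len(zlib.compress(data, 9))


rng = random.Random(0)  # fixed seed (arbitrary) so runs are repeatable
noise = bytes(rng.getrandbits(8) for _ in range(4096))
repetitive = b"BBBB" * 1024

if __name__ == "__main__":
    print("noise:     ", len(noise), "->", compressed_size(noise))
    print("repetitive:", len(repetitive), "->", compressed_size(repetitive))
```

The repetitive input shrinks a great deal, while the noise comes out slightly larger than it went in because the container overhead is not paid back. Note that the noise here could in principle be described by the generator and its seed, but that is a model zlib does not have, which is the trade-off described above.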

given a perfect idea of how this would work, shouldn't it be that the
math has a 50% rightful claim of being smaller than the other shape,
and a 50% rightful claim of being bigger than the other shape?
Shouldn't it, just thinking of the most ideal condition there
could be?

doesn't that make sense when there could be some math smaller to say
one shape made to be changed is another shape, smaller than the other
shape? and some math bigger than the other shape? shouldn't the idea
round off as a 50/50 of smaller and bigger than the other shape? to
say a shape changed is another shape.

It all makes sense to say so, but your algorithm also has to say so,
namely has to communicate this to the decoder. And *that* is where your
problem is.
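
The 50/50 intuition can be tested against a count. The sketch below (the function name is hypothetical) gives an upper bound on the share of n-bit inputs that any lossless coder can make at least k bits shorter, simply because only so many shorter outputs exist for the decoder to tell apart.

```python
from fractions import Fraction


def max_fraction_shortened(n: int, k: int) -> Fraction:
    """Upper bound on the share of n-bit inputs that a lossless coder can
    map to outputs at least k bits shorter (outputs of length 0..n-k)."""
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    available_outputs = 2 ** (n - k + 1) - 1
    return Fraction(available_outputs, 2 ** n)


if __name__ == "__main__":
    # Share of 16-bit inputs that could be shortened by a full byte.
    print(max_fraction_shortened(16, 8))
```

For a saving of one byte (k = 8) the bound is below 1 in 128, far from half of all inputs. Whatever is saved on those few inputs has to be paid for on the others, since the decoder needs to be told which case it is looking at.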

Again, if you don't believe me, construct this algorithm and you'll see
yourself.

So long,
        Thomas

I have an easy time believing one thing....

say for all there is to compress... put it in a geometry area.
now say it's just that.

now the file is just that, and 1 token to say that's what expands, is
just the block there in the geometry area.

so nothing different about the size really.

now.. instead of one block, this instead...

find every instance of BBBB, and separate the block.

so in
"sdfjl44tn98324jbBBBB098wutjk0982kjaerjtjkbBBBBsejh2348095bb23ybyBBBB2hi2u553vb23bnjfngBBBB"

now say one BB block and the blocks before and after each BB

now the geometry area is with that

now one curved line as the token to draw that pattern.

so lost is every occurrence of BBBB except one, so 6 bytes lost.
gained is what it took to say more blocks, and a curved line that
might be slightly bigger but not much?

so the tradeoff of finding a data block of _ANY SIZE_ that has
occurrences of BB, like in any size this can happen once in a while.

no pigeonhole concept here because tokens aren't mixed with data, it's
the geometry area and curves outside it as all there is to expect.
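
To put numbers on this, here is one possible accounting of the BBBB example, written as a sketch. Everything about the format is an assumption made for illustration: the blocks are stored once each with a one-byte length, the "curved line" is stored as one byte per visited block, and one more byte holds the block count. The names `encode`, `decode` and `encoded_size` are hypothetical.

```python
SAMPLE = (b"sdfjl44tn98324jbBBBB098wutjk0982kjaerjtjkbBBBB"
          b"sejh2348095bb23ybyBBBB2hi2u553vb23bnjfngBBBB")


def encode(data: bytes, pattern: bytes):
    """Return (blocks, path): the distinct blocks, stored once each, and
    the order in which the 'curve' visits them."""
    blocks = []
    path = []

    def visit(block: bytes) -> None:
        if not block:
            return  # nothing to store for an empty piece
        if block not in blocks:
            blocks.append(block)
        path.append(blocks.index(block))

    pieces = data.split(pattern)
    for i, piece in enumerate(pieces):
        visit(piece)
        if i < len(pieces) - 1:
            visit(pattern)  # the pattern sits between consecutive pieces
    return blocks, path


def decode(blocks, path) -> bytes:
    """Follow the path through the blocks to rebuild the data."""
    return b"".join(blocks[i] for i in path)


def encoded_size(blocks, path) -> int:
    """Assumed format: 1 count byte, 1 length byte per block plus its
    contents, and 1 index byte per path step."""
    return 1 + sum(1 + len(b) for b in blocks) + len(path)


if __name__ == "__main__":
    blocks, path = encode(SAMPLE, b"BBBB")
    assert decode(blocks, path) == SAMPLE
    print(len(SAMPLE), "bytes in,", encoded_size(blocks, path), "bytes out")
```

Under this accounting the example comes out slightly larger than the input: dropping three copies of BBBB saves 12 bytes, but the block lengths and the path cost 14. The path is data the decoder needs, so it has to be counted with everything else, and the counting argument applies to the blocks and the curve taken together.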

to say separate blocks there might as well be the simplest way....
say one block after another, but make it so one block after another is
at a location starting different, like it is to say a spiral starting
at the center, but one that a curve
can always find its way through easily maybe?

see how this proves random is compressible?

because in random data any size it's good to see BBBB once in a while,
but it's only a curve slightly more complicated and saying blocks like
before and after each BBBB... but for what there is to say about size
being bigger, it's to say a separate block and a curve slightly more
complicated for each time BBBB is found?

it's like.. easy to see maybe?

See how I can say this....

in data any length, no matter what....

store in a geometry area, but say no different than the data together
and one token.
so no bigger really....

now say this is what is being compressed...

... any length ... "abcdefghijklmnopqrstuvwxyz efcdab cderfab" ... any
length... "erfab 123456789 da" .,.. any length ...

then it's to store...

BLOCK, "ab", "cd", "ef", "ghijklmnopqrstuvwxyz", BLOCK, "erf",
"123456789", "da", BLOCK

and one token...

a curved line... BLOCK - "ab" - "cd" - "ef" - "qghijklmnopqrstuvwxyz "
- "ef" - "cd" - "ab"  -  "cd" - "er" - BLOCK"f" - "ab" - BLOCK -
BLOCK"erf" - "ab" - "123456789 " - BLOCK"da" - BLOCK

so it has to say 14 blocks instead of 1, and a curved line that isn't
just saying at one place, but is saying through 14 blocks like how
they're situated.

now that's to lose 15 bytes, but gained is explaining 14 blocks
instead of one, and gained is a curve more complicated.

so that's about at odds with saying nothing better.

so what makes this better now?

isn't it to find that going on forever is to find better than what it
takes to explain a new block and how a curve becomes more complicated
for every
"ab", "cd", "ef", ...


did i ever screw that up... hehe

... any amount ... "abcdefghijklmnop" ... "opmnklijghefcdab" ... any
amount
stored as....

BLOCK_BEFORE, "ab", "cd", "ef", "gh", "ij", "kl", "mn", "op",
BLOCK_AFTER

so then a curved line BLOCK_BEFORE - "ab" - "cd" - "ef" - "gh" - "ij"
- "kl" - "mn" - "op" - "op" - "mn" - "kl" - "ij" - "gh" - "ef" - "cd"
- "ab" - BLOCK_AFTER

so....

stored with block separation, to say one block after another makes a
spiral say for example but one a curve draws through well.

so...

total size now... each block, as separated, and a curved line.


16 bytes lost, 10 blocks separated instead of 1, and a curved line
more complex.
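
The cost of the "curved line more complex" can be estimated rather than guessed. The sketch below assumes the curve is nothing more than a sequence of choices among the stored blocks, each choice equally likely; the function name `path_cost_bits` is hypothetical. For the example above that means 18 steps through 10 blocks.

```python
import math


def path_cost_bits(steps: int, blocks: int) -> float:
    """Bits needed to name `steps` successive choices among `blocks`
    blocks, assuming every block is equally likely at each step."""
    if steps < 0 or blocks < 1:
        raise ValueError("need steps >= 0 and blocks >= 1")
    return steps * math.log2(blocks)


if __name__ == "__main__":
    bits = path_cost_bits(18, 10)
    print(round(bits, 2), "bits, about", round(bits / 8, 1), "bytes")
```

That is roughly 60 bits, or about 7.5 bytes, for the path alone. The decoder also has to learn where each of the 10 blocks starts and ends, and once that is added the 16 bytes saved are largely or entirely used up.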


so it's to say that forever as the size of data, any 2 bytes as found
to be "ab", "cd", "ef", "gh", "ij", or "kl" is for one block
separation, and a curved line slightly more complicated.

that's about even right? unless it's slightly better right?

so now it's only to find in data of arbitrary length more, 3
characters found together more than once to be at even better odds.
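
How often 3 characters turn up together more than once can be worked out for a stated process. The sketch below assumes independent, uniformly random bytes and counts the expected number of position pairs whose k-byte substrings are equal; the function name is hypothetical.

```python
def expected_repeated_pairs(n: int, k: int, alphabet: int = 256) -> float:
    """Expected number of position pairs whose k-symbol substrings are
    equal, for n independent, uniformly random symbols."""
    positions = n - k + 1
    if positions < 2:
        return 0.0
    pairs = positions * (positions - 1) // 2
    # Any two positions hold equal k-symbol substrings with
    # probability alphabet ** -k under the assumed process.
    return pairs / alphabet ** k


if __name__ == "__main__":
    print(expected_repeated_pairs(4096, 3))
```

In 4096 random bytes this gives roughly 0.5, so a repeated 3-byte run is expected about once in every two such files. Pointing back at the earlier copy takes about 12 bits just for the position (log2 of 4096), and every other position still needs some way to say "no repeat here", so the rare 24-bit saving does not cover the bookkeeping.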
