Re: Compression by descent
- From: jacko <jackokring@xxxxxxxxx>
- Date: Thu, 16 Aug 2007 10:17:05 -0700
Hi as the group knows me quite well ;) i think i should give my
opinion!
On 12 Aug, 22:55, Ashley Labowitz <sporecoun...@xxxxxxxxx> wrote:
The program extracts approximate roots of the high-degree fractal
polynomials using Newton's method (quadratic convergence) and saves
off just enough digits.
the convergence of the fractals taylor series chosen will decide the
number of significant coefficients, and how acurate they have to be
stored. so a faster converging fractal would need less coefficients to
represent the same acuracy, BUT you'd also have to show that there is
no overhead in coefficient storage because more bits are required in
the coefficient field.
For files with structured data, there is generally a way to "align"
the bytes so that they match up with edges of a fractal, particularly
so when many degrees of freedom are permitted. Consider the following
class of polynomials:
define align, as i seem to think you are quantizing the complex plane
into boxes, and placing one datum byte in each box, and you have not
explained how the datum is constructed by the decompressor back from
'sample' points within the boxes.
z^n + a cos(z)^n + b cos(sin(z)) + c
can not be bothered to show any convergence properties?
When iterated, this takes on periodic and chaotic orbits over the
complex plane, and by basing the fractal edges on whether the point
diverges, we can describe a region with unlimited self-similarity on
progressively smaller scales. That is basically how Mandelbrot first
discovered his set, but he didn't discuss this particular polynomial.
how does this relate to a byte within a box on the complex plane?
For instance, given a fragment of data, say
74 6f 20 62 65 20 6f 72 20 6e 6f 74 20 74 6f 20 62 65
fractals identify the small-scale patterns (such as repeated 20's) and
large-scale patterns (the byte spectrum, generating grammar and common
stems). For these reason, it may outperform gzip or related simple-
minded compressors.
maybe it could but so what? does it work on its own output?
By contrast, for random data it is almost as simple. We use
knapsacking to optimizing the placement of fractal curve alignments to
fully cover the data, taking advantage of "found order", or the
correlations that can be discovered. Any random file will contain
many instances of the same byte for example. This by itself is not
enough however. If it was, we could simply sort the bytes in the
file, run-length encode it, and send the arrangement of the bytes
separately. However, it is the arranging that almost all the
information is contained!
shouldn't be any more complicated?! does the above paragraph imply you
remove a fractal, and calculate a set of residuals, so another fratal
can be found? an why do the roots have to be high order, is this due
to convergence speed requirements of a taylor series?
if you factor a file the sorting destroys no information, but as bytes
are not commutative within the file then this RLE process would fail.
Instead, we use Taylor series from calculus to manage the arrangement
complexity. To illustrate an example, I'll use a random 16 bytes:
$ python -c "f = open('/dev/random', 'r'); print f.read(36)" | hexdump
-C
00000000 fa 41 c2 72 55 8d c8 27 b1 4a ac a2 13 68 37 d9
|.A.rU..'.J...h7.|
00000010 d5 60 de af d8 85 9c d9 3c 7e 1e ca c1 59 c5 cd
|.`......<~...Y..|
00000020 dd 1e 3d f9 0a |..=..|
In effect, the fractals treat this as floating point numbers and use
approximations from the fractal dimension and polynomial roots (of
which there are usually an infinite number). I can use Euler's
theorem to find these quickly, e.g.,
pi^2 - pi^1.7 + cos(sin(pi/1.7)) = 3.44071322 (variant of polynomial
above)
e^2 - e^1.7 + cos(sin(e/1.7)) = 2.45574537
and "fa 41 c2 72" from the above is 4198613618 = 2 * 179 * 11727971.
Notice the relationship between these 3 primes the sequence of
polynomial values taken as follows:
1464905053 = 18959 * 77267
1756147005 = 3 * 3 * 5 * 17 * 907 * 2531
1815095743 = 359 * 5055977
417443012 = 2 * 2 * 7 * 227 * 65677
whats this got to do with roots?
Factoring could be a problem, but there are fast methods for this.
I'm using an EC method now (yay for libraries doing all the work for
you!). Right now, I'm not sure how to encode prime numbers, but even
if I have to add one, then factor, this extra flag doesn't offset the
compression savings.
try shanks or pollard rho or lenstras factor methods. or get a quantum
computer :).
In this case, the 36 bytes above compress down to
d1 1f 4e 99 4b 3d db 22 0e c3 b0 6d ae 4e e8 cc 23 a4 81 17 10 f5 c8
5f 72 ea b3 c7 07 76 f2 f5
which is only 32 bytes, and it contains all the information needed to
reconstruct the original 36.
are you 100% sure?
looking at your post, the only things which relate to anything i have
thought about before are the factorization to introduce commutivity in
a data stream, the 2x+1 to obtain another factor set from a prime, and
residual calculations as in 1 bit dacs which at four times
oversampling (4 bits) can be made to output 16 bits resolution.
so how is the number in the byte box on the complex plane calculated?
describe the decopressor as this involves no patterm matching, just
pattern construction.
cheers
.
- Follow-Ups:
- Re: Compression by descent
- From: Ashley Labowitz
- Re: Compression by descent
- References:
- Compression by descent
- From: Ashley Labowitz
- Re: Compression by descent
- From: Phil Carmody
- Re: Compression by descent
- From: industrial_one
- Re: Compression by descent
- From: Ashley Labowitz
- Re: Compression by descent
- From: industrial_one
- Re: Compression by descent
- From: Ashley Labowitz
- Re: Compression by descent
- From: Jim Leonard
- Re: Compression by descent
- From: Phil Carmody
- Re: Compression by descent
- From: collection60
- Re: Compression by descent
- From: Ashley Labowitz
- Re: Compression by descent
- From: collection60
- Re: Compression by descent
- From: Phil Carmody
- Re: Compression by descent
- From: collection60
- Re: Compression by descent
- From: Ashley Labowitz
- Compression by descent
- Prev by Date: Re: enhanced fast_gztell/fast_gzseek idea
- Next by Date: Re: enhanced fast_gztell/fast_gzseek idea
- Previous by thread: Re: Compression by descent
- Next by thread: Re: Compression by descent
- Index(es):
Relevant Pages
|
Loading