Re: Magic Compression? (Why won't this work) (Code provided)
- From: Phil Carmody <thefatphil_demunged@xxxxxxxxxxx>
- Date: 18 Feb 2008 20:34:54 +0200
partner50145703@xxxxxxxxxxxxxxx writes:
I entirely accept the premise that there is no way to compress all
random data, but I'd love some advice on why you can't avoid this
using a time-check and verify system.
I have an "algorithm" (read: simple shell script)
that I think would successfully substantially compress Mark Nelson
AMillionRandomDigits
[1] and successfully decode it.
I'd love to get some feedback. I've included my proof-of-concept code
at the bottom of this post.
It would seem that it is possible to compress the file, and return it
to it's original form, if we can successfully leverage the time-used
to create it as a variable.
On any given machine, we can re-create any file by starting from a
blank file, and then appending.. 00000, then modifying to 00001,
00010, expanding until we have tried every iteration until we have, by
brute-force, found our file.
How do you know if you've found your file unless you also
have a copy of that file to hand. In which case, you've
compressed the file down to precisely its original size,
congratulations.
While this certainly can create our file, the problem is that the
solution solves too much.. While it DOES create our file, it also
creates all other files. That's certainly not going to be very useful.
If we're clever, and willing to accept a very small chance of loss
[2], I contend that we can re-create this file successfully.
Begin by iterating through all possible numbers, out to the number of
digits in the original. Every time we iterate, compare the MD5 (or
SHA, or whatever hash you prefer) to the hash of the original file.
Continue to iterate until you have created every possible file given
the requisite number of digits.
"But wait", you say, "You will get multiple files back, since you are
brute-force cracking the hash"
You're right, but that isn't a problem. When we encode the file in the
first place, we should go through the entire process of creating each
file, and time how long it takes until we create the correct one. We
can save this as a *very rough* time format. (Unix EPOCH is probably
fine). This should be a rough time format to avoid wasting bits..
Seconds is ALMOST CERTAINLY close enough.
How many bits of information will this timing value require
on average when brute forcing files of say 10, 100, 1000,
10000 bits?
When decoding, we again generate every possible file given the number
of bits.. We compare each against the hash of the original, and get a
list of matches.. Then, we can compare the elapsed time of each of the
matches, to the time required to create the original.
Do you think your argument works equally well with a CRC rather
than a cryptographic hash?
If so, then you know nothing about CRCs. If not, then highlight
the place in your argument where the two deviate.
Phil
--
Dear aunt, let's set so double the killer delete select all.
-- Microsoft voice recognition live demonstration
.
- Follow-Ups:
- References:
- Magic Compression? (Why won't this work) (Code provided)
- From: partner50145703
- Magic Compression? (Why won't this work) (Code provided)
- Prev by Date: Re: Best existing binary compressor method?
- Next by Date: Re: A request for samples to work with.
- Previous by thread: Re: Magic Compression? (Why won't this work) (Code provided)
- Next by thread: Re: Magic Compression? (Why won't this work) (Code provided)
- Index(es):
Relevant Pages
|