Re: Compressing List of Tuples
- From: bullockbefriending bard <kinch1967@xxxxxxxxx>
- Date: 30 May 2007 12:18:02 -0700
Once again thanks for pointing me towards this very useful thread.
I've taken the trouble to modify your 3-tuple example to more closely
suit my 6-tuple problem.
You will see that my modified method (perhaps naively) only performs
one rotation through the 6 columns as it does the greedy merge. I'm
afraid I don't really understand your reference to permutations. Well,
I *do* know that there are 6 ways to order a the elements of a 3-tuple
and 720 ways to order the elements of a 6-tuple... but I'm not really
clear on why this should be relevant given that the innermost loop in
your example compares 5 elements of two consecutive tuples in order to
see if a merge condition exists in element 6 - I just can't see why
one would need to care about order of taking these?
There must be something I'm just not getting here. My modified method
achieves 'reasonable' (>20x) compression on a file with 140K+ tuples
(needless to say, these are not random and there is obviously some
hint of underlying structure). However, assuming I *have* somehow
missed part of the point of your algorithm, I'd really appreciate it
if you could point it out to me, as I certainly wouldn't mind
sacrificing longer run time for greater compression.
(In addition, as far as I can tell, your 3-tuple code and my above
method in 3-tuple form, perform identically on input data sets... )
Many thanks,
Peter
Following are your version from the thread you pointed me to + what
seems to work ~OK for me.
def foo(lolol):
#lolol for a list of lists of lists
for i in xrange(2):
for j in xrange(3):
lolol.sort()
k = 1
while k < len(lolol):
if lolol[k-1][:2] == lolol[k][:2]:
lolol[k-1][2].extend(lolol.pop(k)[2])
lolol[k-1][2].sort()
else:
k += 1
#rotate!
for k in lolol:
k.append(k.pop(0))
#reverse!
for k in lolol:
k.reverse()
return lolol
def combination_compression_algorithm_(combinations):
"""Greedily compress all combinations on one field, then next field,
etc.
Parameter combinations is structured so:
[
[[1],[2],[3],[4],[5],[6]],
[[1],[2],[3],[4],[5],[7]],
]
Given the above input, returned combinations should be:
[
[[1],[2],[3],[4],[5],[6,7]]
]
"""
for do_it_six_times in xrange(6):
# we want everthing to be in lexically ascending order of
# combinations: [[1],[2],[3],[4],[5],[6]] < [[1],[2],[3],[4],[5],
[7]]
# < [[1],[2],[3],[5],[14],[14]]
combinations.sort()
k = 1
while k < len(combinations):
# compare fields 0-4 of two consecutive combinations, e.g.
# [[1],[2],[3],[4],[5],[6]], [[1],[2],[3],[4],[5],[7]]
if combinations[k-1][:5] == combinations[k][:5]:
# [[1],[2],[3],[4],[5]] == [[1],[2],[3],[4],[5]] =>
# we append the elements of combinations[k][5]
# to the equivalent sublist in the preceding combination, we
# can we can get rid of combinations[k] and
# combinations[k-1] becomes [[1],[2],[3],[4],[5],[6,7]]
# surprisingly combinations.pop(k) doesn't seem to result
# in slow run times for lists of 100K+ combinations.
combinations[k-1][5].extend(combinations.pop(k)[5])
# following line seems not necessary since top of outer loop
# combinations.sort() puts everything into lexically ascending
# order such that above line will always maintain correct
# element order in combinations[k-1][5]
#combinations[k-1][5].sort()
else:
# greedy algorithm. we keep merging into combinations[k-1][5]
# until we run out of matches. only then do we increment k.
k += 1
# rotate combinations left so that (e.g.)
# combination [[1],[2],[3],[4],[5],[6]] => [[2],[3],[4],[5],[6],[1]]
# the outer loop has us do this 6 times, so that upon termination
# of outer loop, applied the combination[5] merge of the inner loop
# to each of the 6 elements of combination[] and each combination is
# returned to original starting order of elements.
for combi in combinations:
combi.append(combi.pop(0))
return combinations
A *very* similar question to this was posted a couple weeks ago...
http://groups.google.com/group/comp.compression/browse_thread/thread/...
The algorithm I provided in that thread relies on certain features of
the permutations of 3 elements to merge groupings of 3-tuples. If you
were to iterate over the 720 permutations of your 6-tuples, using the
(very simple) matching algorithm I provided, you may be able to get
fairly decent results.
- Josiah
.
- Follow-Ups:
- Re: Compressing List of Tuples
- From: Josiah Carlson
- Re: Compressing List of Tuples
- References:
- Compressing List of Tuples
- From: bullockbefriending bard
- Re: Compressing List of Tuples
- From: Josiah Carlson
- Compressing List of Tuples
- Prev by Date: Re: OT: RFC, PRNG
- Next by Date: Re: Compressing List of Tuples
- Previous by thread: Re: Compressing List of Tuples
- Next by thread: Re: Compressing List of Tuples
- Index(es):
Relevant Pages
|