Re: Reducing the size of a pdf file



On Aug 28, 4:19 pm, bugbear <bugbear@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
phh...@xxxxxxxxx wrote:
On Aug 28, 1:40 pm, Hans-Werner Hilse <hi...@xxxxxx> wrote:
Hi,

On Tue, 28 Aug 2007 12:15:52 -0000 phh...@xxxxxxxxx wrote:

I have a scanned book with 500 pages in a pdf file. The size of the
file is about 90 MB. Is there some way of reducing the size of the pdf
file? I am using Linux.
That certainly depends:
- your images might not have optimal depth, or they might not (yet) use
the most size-efficient compression scheme,
- you might accept lossy compression, in which case it's a trade-off
between quality and size,
- it might be OK for you to reduce the images' resolution.

On Linux, I'd extract the images using pdfimages (comes with the
poppler library; older versions are in the Xpdf toolchain). Then I'd
compress them to TIFF images, choosing the best possible compression
scheme (e.g. G4 for b/w images), using the libtiff tools, and do other
steps, if needed. I'd then recreate the PDF using tiff2pdf. This is the
way for lossless compression, of course.
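
The steps above could be sketched roughly as follows. This is only an
illustration, not a tested recipe: the function name shrink_pdf and the
"pag" prefix are made up, and it assumes that pdfimages, ppm2tiff, tiffcp
and tiff2pdf are all installed and that every page is a bilevel scan, so
that pdfimages writes .pbm files (G4 only applies to 1-bit images).

```shell
#!/bin/sh
# Sketch: rebuild a scanned PDF with G4-compressed page images.
# Assumptions: pdfimages (poppler or Xpdf) and the libtiff tools
# ppm2tiff, tiffcp and tiff2pdf are on PATH; all pages are bilevel.

shrink_pdf() {
    in=$1
    out=$2
    [ -f "$in" ] || { echo "no such file: $in" >&2; return 1; }

    # Scratch directory for the extracted images (needs plenty of space).
    work=$(mktemp -d) || return 1

    # 1. Extract one image per page: pag-000.pbm, pag-001.pbm, ...
    pdfimages "$in" "$work/pag" || return 1

    # 2. Convert each image to a G4-compressed TIFF.
    for img in "$work"/pag-*.pbm; do
        ppm2tiff -c g4 "$img" "${img%.pbm}.tif" || return 1
    done

    # 3. Join the pages into one multi-page TIFF, then wrap it in a PDF.
    tiffcp "$work"/pag-*.tif "$work/all.tif" || return 1
    tiff2pdf -o "$out" "$work/all.tif" || return 1

    # Clean up only on success; on failure the files stay for inspection.
    rm -rf "$work"
}
```

It would be called as `shrink_pdf book.pdf book-small.pdf`. Grayscale or
colour pages come out of pdfimages as .pgm or .ppm and would need a
different compression scheme (or a conversion to bilevel first).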

You might also want to test JPEG2000 on your images -- though I
currently don't have a good suggestion regarding the tool to assemble
the images back to a PDF.

There's also DjVu, which is very well suited for this task, but you'd
be leaving PDF waters at that point.

As a side note: 90 MB for 500 scanned pages (about 180 KB per page)
sounds pretty reasonable, so you might already have the best compromise.

Thanks, Hans-Werner and Ken, for your very detailed and helpful
replies. I have meanwhile converted the PDF file to PPM with
pdfimages. However, when I try to convert to TIFF, I get the following
error:

$ ppm2tiff -c g4 pag-160.ppm pag-160.tif
pag-160.tif: Error writing TIFF header.
$

Any ideas?

Well, this is outside the scope of comp.text.pdf,
but my first guess would be disk full.
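
One way to check that guess before rerunning the conversion is to look
at the free space on the target file system. A minimal sketch, assuming
a POSIX df; the helper name free_kb is made up for illustration:

```shell
#!/bin/sh
# free_kb [DIR]: print the free space, in KiB, of the file system that
# holds DIR (default: the current directory).
# -P forces the one-line-per-file-system POSIX format, -k uses 1024-byte
# units; field 4 of the second line is the "Available" column.
# Assumes the file system name contains no spaces.
free_kb() {
    df -Pk "${1:-.}" | awk 'NR == 2 { print $4 }'
}
```

Running `free_kb .` in the directory holding the .ppm files shows how
much room is left; the extracted images are uncompressed, so they take
far more space than the PDF they came from.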

You are right: disk full! I could not imagine that my disk was already
full... Thanks!

Paul

