Stripping images from PDFs using Ghostscript
A long PDF needed printing, but only the text was important. Since it was full of images, removing them would save a lot of ink.
It turns out Ghostscript has some very nice filters for removing whole classes of content from a file. You can remove text, images, or vector objects without changing the rest of the layout.
For example, to strip vector graphics and images from a PDF, leaving only the text, you can use:
gs -o text-only.pdf -sDEVICE=pdfwrite -dFILTERVECTOR -dFILTERIMAGE pdf-with-pictures.pdf
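The same idea works in the other direction: Ghostscript also has a -dFILTERTEXT switch, so any two of the three filters can be combined to keep just one class of content. Below is a minimal sketch of a wrapper for that. The function names gs_filter_flags and keep_only and the GS override variable are made up for illustration, and it assumes your installed gs supports all three filter switches.

```shell
#!/bin/sh
# Hypothetical helper: print the filter flags that remove everything
# except the requested class of content.
gs_filter_flags() {
    case "$1" in
        text)   echo "-dFILTERIMAGE -dFILTERVECTOR" ;;
        images) echo "-dFILTERTEXT -dFILTERVECTOR" ;;
        vector) echo "-dFILTERTEXT -dFILTERIMAGE" ;;
        *)      echo "unknown mode: $1" >&2; return 1 ;;
    esac
}

# Hypothetical wrapper. Usage: keep_only MODE INPUT.pdf OUTPUT.pdf
# Set GS to point at a different binary (assumption: defaults to "gs").
keep_only() {
    flags=$(gs_filter_flags "$1") || return 1
    # $flags is deliberately unquoted so it splits into separate arguments.
    "${GS:-gs}" -o "$3" -sDEVICE=pdfwrite $flags "$2"
}
```

With these functions loaded, keep_only text pdf-with-pictures.pdf text-only.pdf runs the same command as the example above, and keep_only images pdf-with-pictures.pdf images-only.pdf would drop the text and vector objects instead.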
If you don’t have Ghostscript installed but use Docker, there are container images that make it easy. Mount the current directory into the container so it can read the input and write the output:
docker run --rm -v "$(pwd)":/app -w /app minidocks/ghostscript gs -o text-only.pdf -sDEVICE=pdfwrite -dFILTERVECTOR -dFILTERIMAGE pdf-with-pictures.pdf