Archive for October, 2013

25 October, 2013

Manipulating .pdf files on Linux using Ghostscript

by gorthx

I have to digitally fold, spindle, and mutilate .pdf documents frequently. On Ubuntu, I tried GIMP, pdftops, pdftk, and some truly tortuous gymnastics involving screencaps, but none of them really did what I wanted.

Then I found Ghostscript.

It’s a command line tool, which I dig, because it means that I can type instead of having to point & click, and I can write quick shell scripts to do my dirty work.

Here’s how I use it most often:

Combine multiple .pdfs into a single file:
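# in1.pdf and in2.pdf are placeholder names; list your own input files in the order you want them
gs -sDEVICE=pdfwrite \
-o 2012_final_report.pdf \
in1.pdf in2.pdf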

Pull first page only from multiple files:
for each in 2012_Account_Statement_*
do
cp "$each" "${each}.backup"
gs -sDEVICE=pdfwrite \
-dFirstPage=1 -dLastPage=1 \
-o "${each%.pdf}_firstpage.pdf" \
"$each"
done
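
The ${each%.pdf} bit in the output name is plain shell parameter expansion: it strips a trailing .pdf so the new suffix lands before the extension. Here is a minimal sketch of just that step. The firstpage_name helper and the file name are made up for illustration.

```shell
# firstpage_name is a hypothetical helper, not part of Ghostscript.
firstpage_name() {
    # ${1%.pdf} removes a trailing ".pdf" from the first argument, if present
    printf '%s\n' "${1%.pdf}_firstpage.pdf"
}

firstpage_name 2012_Account_Statement_01.pdf
# prints: 2012_Account_Statement_01_firstpage.pdf
```

If the name has no .pdf on the end, the expansion leaves it alone and the suffix is simply appended.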

Combine multiple .pdfs and convert them to B&W:
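# in1.pdf and in2.pdf are placeholder names; list your own input files in the order you want them
gs -sDEVICE=pdfwrite \
-sColorConversionStrategy=Gray \
-dProcessColorModel=/DeviceGray \
-dCompatibilityLevel=1.4 \
-dAutoRotatePages=/None \
-o 2012_final_report.pdf \
in1.pdf in2.pdf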
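
Since this one has a lot of switches to remember, it is a good candidate for a quick shell function. This is a rough sketch, not a polished tool: merge_gray and the GS override variable are names I made up, and the switches are the same ones shown above.

```shell
# merge_gray OUTPUT INPUT...
# Hypothetical wrapper: combines the inputs into OUTPUT, converted to grayscale.
merge_gray() {
    out=$1
    shift
    # ${GS:-gs} runs gs unless the (made-up) GS variable names another command,
    # e.g. GS=echo to print the arguments instead of running Ghostscript.
    ${GS:-gs} -sDEVICE=pdfwrite \
        -sColorConversionStrategy=Gray \
        -dProcessColorModel=/DeviceGray \
        -dCompatibilityLevel=1.4 \
        -dAutoRotatePages=/None \
        -o "$out" "$@"
}

# Example (file names are placeholders):
#   merge_gray 2012_final_report.pdf in1.pdf in2.pdf
```

Putting the output name first means everything after it is treated as input, so a glob works there too.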

The Ghostscript Quick Start guide is here.

18 October, 2013

Try these at home!

by gorthx

We had one of those truly amazing meetings at PDXPUG this week. Along with the ideas that came out of this meeting (such as leveraging Calagator for optimal scheduling of new user groups, and this), Matt Smiley schooled a bunch of us on some basic Unix utilities. Recorded here so I don’t forget them; these are version-dependent, YMMV.

In less, -S prevents line wrap; you then use the left and right arrow keys to scroll sideways through your output. This is super-handy when viewing wide, tabular output.

In top:
– M (shift-m) sorts by memory
– s lets you choose the refresh rate

In sar -d and iostat -x output:
– await is the value to use for disk latency
– svctm is not :) (it’s a calculated value instead of an actual measurement). The sar man page notes that this field is not to be trusted and will be removed in the future.

– collect ongoing stats: iostat -x -t -k 1 100
-x = extended stats
-t = include timestamps
-k = measurements in kB :)
1 = one second intervals
100 = 100 reports

You can throw out the first (and possibly second) set of data this collects, since it contains the cumulative stats since the system started rather than stats for the interval. The same goes for running a single timepoint with no interval: the since-boot numbers are all you get.
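
Throwing out those first reports by hand gets old, so here is a rough sketch of a filter that does it. The drop_first_reports name is made up. It assumes each report's device table starts with a line beginning with "Device" and ends at a blank line, which may not hold for every sysstat version. It also keeps only the device rows, so timestamps and CPU lines are dropped.

```shell
# drop_first_reports [N]
# Hypothetical helper: reads iostat output on stdin and prints only the
# device rows from reports after the first N (default 1).
drop_first_reports() {
    awk -v skip="${1:-1}" '
        /^Device/ { n++; in_dev = 1; next }   # header row starts a device table
        NF == 0   { in_dev = 0; next }        # blank line ends it
        in_dev && n > skip                    # print rows once past the skipped reports
    '
}

# Example:
#   iostat -x -t -k 1 100 | drop_first_reports 2
```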

I also learned about a couple of monitoring tools I need to check out: saidar and Datadog.