Archive for ‘tidbits’

30 March, 2016

This month’s RDS tip, a pgloader exercise, and bonus recruiter spam

by gorthx

1. Several months back, AWS re-worked the web console. You can get sparkline-style graphs for CPU usage, memory, storage, and database connections at the top level, instead of having to drill down to the CloudWatch metrics for the instance.

I find this really handy – but for one quirk with the graphs:
[screenshot: connection_threshold]

There’s white space to the right of the red line. Therefore, I was interpreting the red line to mean “trouble is brewing & I should do something about this”.

Turns out that red line is the limit; there’s nowhere else to go. Whoops!

2. This week I finally had a reason to try out pgloader: one of my analysts needed some help loading some ugly fixed-width data [1].

Installing was super-easy on my Mac (`brew install pgloader`). I worked through the examples before starting to work with my actual dataset [2].

The data to be loaded came bundled as several text files: the data, plus three or four additional files describing the layout [3]. I wrote a truly glorious string of cut, sed, paste, and awk to create a pgloader control file that would work. And it did!

Except:

field1 | field2 | field3 
--------+--------+----------
 [null] | D | IMITRI
 [null] | G | ABRIELLE
 [null] | M | ARK
 [null] | S | ELENA

Uh, what?

The data descriptions had character counts starting at 1, and pgloader expects them to start at 0 (as they should).  (For extra fun, the first column in all records of this dataset was nothing but spaces.)
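
The off-by-one fix is easy to script. A minimal sketch, assuming a made-up layout listing (field name, 1-based start column, width; the real description files looked different) and pgloader's `name from START for LENGTH` field syntax. The function name is hypothetical too:

```shell
# Hypothetical layout: field name, 1-based start column, width.
layout='col_a 1 1
col_b 2 10'

# Shift each start column down by one so the offsets are 0-based,
# and print one pgloader-style field spec per line.
to_zero_based() {
  awk '{ printf "%s from %d for %d\n", $1, $2 - 1, $3 }'
}

printf '%s\n' "$layout" | to_zero_based
```

For the sample layout this prints `col_a from 0 for 1` and `col_b from 1 for 10`, ready to paste into the field list of a control file.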

3. This week’s hilarious recruiter spam:
“Hey there Gabrielle! I was doing my homework on sites like Meetup, GitHub, etc. and I noticed your Java skills.”

I’m pretty sure you didn’t.



1 – Is that redundant?
2 – 600 columns of fixed-width data?  Who does that?!
3 – Why couldn’t it all be in one file? Again: who does this?! [4]
4 – People who hate DBAs, that’s who.

2 February, 2015

Updating My Linux Command line Toolbox, episode 3

by gorthx

Part 2

This week’s tips:

1. ulimit -a will show you all settings, plus the units.

2. crontab -l -u [user] will read out another user’s crontab for you (assuming you have the right perms)

3. and what I call “diff-on-the-fly” – pass the output of shell commands to diff. I like this one because I don’t make a bunch of “temporary” files that I forget to clean up later.

diff <([shell commands]) <([other shell commands])

For example, I need to compare ids in two files, but they’re in different fields in each file, and not in the same order:

diff <(cut -d"," -f1 file1 | sort -u) <(cut -d"," -f3 file2 | sort -u)

8 September, 2014

Updating My Linux Command line Toolbox, episode 2

by gorthx

Part 1

Five more, all from this week:

1. date -u to get your date in UTC

2. pushd and popd – create your own directory stack.  I’m still trying this one out.  (“why not just use the up arrow?”)

3. pbcopy – copy to clipboard from the command line (this one is macOS-only).

4. !$ contains the last arg of the previous command, so you can do something like this:
ls -l filename.*    # check what you have in the dir
vi !$

5. This one is my favorite: !?[string] runs the last command that contains that string.

1 November, 2013

Powershell.

by gorthx

Yes, here we are again, with me using a Windows machine. I can’t decide if Powershell makes having to use Windows tolerable, or just throws salt in the wounds. Powershell provides much more efficient methods of searching files and moving/renaming them than messing with Exploder, but every time I need it, I have to look up the syntax because it’s just not familiar.

Here are samples of the commands I use regularly, so they’re all in one place & I can easily C&P them from anywhere.

Find all .zip files:
Get-ChildItem -path c:\path\to\search -recurse -filter *zip

Order of the options is not important, and recurse can be shortened to rec.

Find a certain file somewhere on my hard drive:
Get-ChildItem -path c:\ -filter settings.xml -rec

I search file content a lot, so I made an alias for grep (also in my profile), because it’s easier for me to remember:
Set-Alias grep select-string

Find my notes about JSON, somewhere on my hard drive:
Get-ChildItem -path c:\ -inc *.txt -rec | grep -pattern "json"
…this is a case-insensitive search.

Convoluted way to move files (still looking for something easier):
Get-ChildItem -path c:\old\path -rec -filter *zip | foreach-object { copy-item -path $_.fullname -destination c:\new\path }

If your paths or filenames include spaces, you’ll have to quote them, of course.

There is a way to diff files but I find the output nearly unusable.

Additional tips:
– You don’t have to type the commands in camel case; powershell will transform them.
– There is some tab-completion available.
– I added this to my profile to save my history between sessions: https://lopsa.org/content/persistent-history-powershell. There’s no up/down arrow paging for commands from a previous session, though; you have to list the history items and then execute them from the menu. (With a command e.g. “i 2”. Yeah, that’s intuitive. Feels like the 80s in here.)

And: <esc> clears the current line, the way <ctrl>-u does in a unix shell.


Useful links:
http://msdn.microsoft.com/en-us/library/windows/desktop/bb613488(v=vs.85).aspx
http://www.powershellatoms.com/desktop-management/creating-persistent-aliases-in-powershell/
http://blogs.technet.com/b/heyscriptingguy/archive/2012/02/27/use-powershell-to-copy-files-to-a-shared-drive.aspx

25 October, 2013

Manipulating .pdf files on Linux using Ghostscript

by gorthx

I have to digitally fold, spindle, and mutilate .pdf documents frequently. On Ubuntu, I tried the GiMP, pdftops, pdftk, and some truly tortuous gymnastics involving screencaps, but none of them really did what I wanted.

Then I found Ghostscript.

It’s a command line tool, which I dig, because it means that I can type instead of having to point & click, and I can write quick shell scripts to do my dirty work.

Here’s how I use it most often:

Combine multiple .pdfs into a single file:
gs -sDEVICE=pdfwrite \
-o 2012_final_report.pdf \
2012-*_receipts.pdf

Pull first page only from multiple files:
for each in 2012_Account_Statement_*
do
cp "$each" "${each}.backup"
gs -sDEVICE=pdfwrite \
-dFirstPage=1 -dLastPage=1 \
-o "${each%.pdf}_firstpage.pdf" \
"${each}"
done

Combine multiple .pdfs and convert them to B&W:
gs -sDEVICE=pdfwrite \
-sColorConversionStrategy=Gray \
-dProcessColorModel=/DeviceGray \
-dCompatibilityLevel=1.4 \
-dAutoRotatePages=/None \
-o 2012_final_report.pdf \
2012-*_receipts.pdf

The Ghostscript Quick Start guide is here.

18 October, 2013

Try these at home!

by gorthx

We had one of those truly amazing meetings at PDXPUG this week. Along with the ideas that came out of this meeting (such as, leveraging Calagator for optimal scheduling of new user groups and this), Matt Smiley schooled a bunch of us on some basic unix utilities. Recorded here so I don’t forget them; these are version-dependent, YMMV.

less:
-S prevents line wrap, then you use the arrow keys to page through your output. This is super-handy when viewing wide, tabular output.

top:
– shift-m (M) sorts by mem
– s lets you choose the refresh rate

sar:
– await is the value to use for disk latency
– svctm is not :) (it’s a calculated value instead of an actual measurement). The sar man page notes that this field is not to be trusted and will be removed in the future.

iostat
– collect ongoing stats: iostat -x -t -k 1 100
-x = extended stats
-t = include timestamps
-k = measurements in kB :)
1 = one second intervals
100 = 100 reports, then exit

Your first (and possibly second) set of data collected from this can be thrown out, as it contains the cumulative stats since the system started. This also affects running a single timepoint.

I also learned about a couple of monitoring tools I need to check out: saidar and Data Dog.

8 March, 2013

Updating My Linux Command line Toolbox

by gorthx

Over the past few months I’ve been working with some people with many more years of unix-y experience than I have. They’ve been teaching me new stuff, and showing me updated versions of commands I’ve been using in sometimes kludgy ways. Here are 10 examples.

1. join, which feels kind of like a stone-age tool; it’s not that versatile, and I’m sure there are perl one-liners that could do the same thing. But join’s readily available and it’s a fast way to join two files, if they meet the requirements: same field separator, a “key” field (which is easy to add with nl, #2 below), etc.

2. nl to number the lines in a file.

My most-used switches:
nl -v 800 -w 3 -n rn -s, oldfile > newfile
-v = start with this number
-w = number of characters (in this example, numbers > 999 will be truncated)
-n rn = ‘right justified, no 0 padding’ so you don’t have to go back through another round of text processing to strip them off
-s, = use a comma as a field sep
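
Tips 1 and 2 work together. A minimal sketch, with made-up file contents and a hypothetical helper name, that uses nl to add a key field to two files and then joins them on it. join expects both inputs sorted on the key, which sequential numbers of equal width already are:

```shell
# Add a comma-separated line number as the key field (same switches as above).
add_key() {
  nl -v 800 -w 3 -n rn -s,
}

tmp=$(mktemp -d)
printf 'alice\nbob\n'     | add_key > "$tmp/names.csv"
printf 'admin\nanalyst\n' | add_key > "$tmp/roles.csv"

# -t, tells join the field separator; the key defaults to field 1 in both files.
join -t, "$tmp/names.csv" "$tmp/roles.csv"

rm -r "$tmp"
```

This prints `800,alice,admin` and `801,bob,analyst`.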

3. truncate as a desperate move to free up some space. Bonus: do this on a log file to confuse your coworkers.

4. watch [1]
Current favorite:
watch -n 10 "psql -d my_db -c \"SELECT datname, procpid, usename, backend_start, xact_start, query_start, waiting, current_query FROM pg_stat_activity WHERE current_query LIKE 'autovacuum:%'\""

5. lastlog, last, and lastb
most recent login, all logins, and all bad logins

6. df -h instead of df -k
df -k was one of the first unix commands I learned. It used to be easy to read the blocks, Used, and Available columns (they had fewer digits back then). df -h makes it easy again, and shows me the units. The muscle memory on this one has been particularly hard to shed.

7. echo * | wc -w instead of ls | wc -l or ls -1 | wc -l
The first sysadmin I ever worked with taught me ls -1 | wc -l. Turns out you don’t need the -1 to list the files individually, if you are piping the output to another command.

Use echo * to quickly list the number of files in a directory, when that number is more than a few thousand; assumes no spaces in the filenames.
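
A quick way to try this out, as a sketch: the function name and the temp directory are made up for the demo. Two things to keep in mind: in an empty directory the unexpanded `*` still counts as one word, and dotfiles are not counted.

```shell
# Count directory entries with the shell's own globbing instead of ls.
# Assumes no whitespace in the filenames; runs in a subshell so the
# caller's working directory is left alone.
count_entries() (
  cd "$1" && echo * | wc -w
)

tmp=$(mktemp -d)
touch "$tmp/a.log" "$tmp/b.log" "$tmp/c.log"
count_entries "$tmp"
rm -r "$tmp"
```

For the three files created here the count is 3 (some versions of wc pad the number with leading spaces).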

8. awk can be kind of intimidating due to its sheer power and uninformative error messages. I have a small cheatsheet I keep handy, for tasks I do frequently. These are the most recent additions:

To remove everything owned by me in a directory:
ls -l | awk '/gabrielle/{print $9}' | xargs rm

To find all files > 0 size in a directory:
ls -l | awk '$5>0{print}'
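
One caveat with both of these: parsing ls -l output is fragile. /gabrielle/ matches that string anywhere on the line, including in a filename, and $9 breaks on names with spaces. A sketch of the same two tasks with find, assuming a find that supports -maxdepth (GNU and BSD both do); the function names are made up for illustration:

```shell
# Regular files directly in directory $1 that are owned by user $2.
files_owned_by() {
  find "$1" -maxdepth 1 -type f -user "$2"
}

# Regular files directly in directory $1 with a size greater than 0.
nonempty_files() {
  find "$1" -maxdepth 1 -type f -size +0
}

tmp=$(mktemp -d)
: > "$tmp/empty.txt"          # zero-length file
echo data > "$tmp/full.txt"   # non-empty file

nonempty_files "$tmp"
files_owned_by "$tmp" "$(id -un)"

rm -r "$tmp"
```

These only list the matches. To delete instead, append `-exec rm {} +` to the find command once the list looks right.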

9. Everything here: https://dougvitale.wordpress.com/2011/12/21/deprecated-linux-networking-commands-and-their-replacements/

10. And sar, which I’ll write a separate post about.

Thanks to Joe, Vibhor, Oscar, and of course MJM.

1 – “How did I not know about ‘watch’?” “You were using Solaris all these years, m’dear.”

17 January, 2012

Re-docking an accidentally undocked chat in pidgin

by gorthx

In pidgin, I have my chat tabs across the top (should do a screencap & show an example). Every once in a while, I’ll accidentally drag one out of the queue.

Usually you can just drag the single tab back into the rank of tabs to re-dock. Occasionally this doesn’t work; try un-maximizing both windows, and that should allow you to re-dock the tab.

27 September, 2011

Tuesday Tidbit – JOIN on multiple fields

by gorthx

I use this so rarely I have a hard time remembering the syntax; I always try to use a comma instead of AND. Which, of course, throws an error.

Looks like this:
SELECT [stuff]
FROM table1 t1
JOIN table2 t2 ON
(t1.field1 = t2.field1 AND t1.field2 = t2.field2);

26 August, 2011

This week’s tips

by gorthx

Two things I learned on accident this week:
vim: when your cursor is over a number, ctrl-a increments it. (Members of @pdxhackathon tell me that ctrl-x decrements it.) Not sure where I’ll use it, but it’s interesting.

IOS (the Cisco kind): ‘sh run’ and then ‘/[string]’ to search is way faster than ‘sh run | include [string]’.


Also, a few weeks ago, I upgraded my production systems to PostgreSQL 9. A seamless transition, as usual. So far the only thing that’s tripping me up is the change to \d [viewname] (see the release notes) – you now have to use \d+ [viewname] to see the queries a view is based on. Just a matter of training my fingers to do something different.

I also discovered how to drop multiple columns from a table in a single statement:
ALTER TABLE tablename DROP COLUMN column1, DROP COLUMN column2, DROP COLUMN column3;

I don’t know if that’s new or I just figured it out (and I’m too lazy to go look up the release notes to find out right now), but I like it.