25 April, 2014

Mistakes I make with Perl hashes

by gorthx

I’ve been doing a bit of Perl programming the past couple of months. I felt pretty rusty at first (it’s been over a year since I’ve written anything serious with it) but am getting back into the swing of things. My main use of Perl is manipulating delimited text data (from databases or flat files) for reports, loading into databases, and the like. For these types of tasks, I really prefer hashes (and HoH…oH) to arrays because I can give my variables appropriately-named keys, such as $switch{'card'}{'port'}. It’s a lot more obvious what that’s doing than $switch[12][2]. Obvious is good, especially 6 months later when I’ve come back to a project and am saying “what the hell is this” (as we all do).

I always seem to make the same three mistakes with hashes. The first two feature the fun symptom “I’m getting data that’s different from what I expect, and it’s randomly different every time I run my program”:

1. Oops, I need my data in a certain order. This is the first thing I forget if I haven’t written any Perl code for a while: Perl hands hash data back in whatever order it feels like. When I’m writing my initial tests, I’m using smaller data sets (like maybe one or two hash “rows”) and I get lucky and my data’s in the sort order I want. Then I get some real data, and … “oh right!” I fix this by using an AoH instead, or storing (some of) the keys in a separate array (feels kludgy), and I suppose some day I will get around to trying one of the permutations of Tie::Hash.

2. I fail to provide a unique key for the hash. This is another one that doesn’t become apparent until I’m working with real data: a “random” small data sample has two unique identifiers, but when I get a bigger data set I find out there are actually five, and so on, and pretty soon I’ve got a HoH…oH that’s 17 levels deep. (I kid.) (Maybe.)

3. The bane of my existence: typos in my hash keys. ‘use strict’ doesn’t protect against this, because hash keys are just strings and a misspelled one is a perfectly legal new key. Writing tests helps[1], but typos still occasionally slip through[2]. I troubleshoot with rigorous use of Data::Dumper::Simple and the debugger.
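
Here is a minimal sketch of one way to guard against each of the three mistakes above. The switch data, the row layout, the '|' separator, and all variable names are made up for illustration. lock_keys (from the core Hash::Util module) is not something I mentioned above; it is just one option for catching key typos at run time.

```perl
use strict;
use warnings;
use Hash::Util qw(lock_keys);

# Made-up sample rows: switch name, card, port, status.
my @rows = (
    [ 'sw2', 1, 4, 'up'   ],
    [ 'sw1', 3, 1, 'down' ],
    [ 'sw1', 3, 1, 'up'   ],   # same switch/card/port as the row above
);

# Mistake 2: build the key from every field that identifies a row, and
# record collisions instead of silently overwriting earlier data.
my (%status_for, @duplicates);
for my $row (@rows) {
    my ($switch, $card, $port, $status) = @$row;
    my $key = join '|', $switch, $card, $port;
    push @duplicates, $key if exists $status_for{$key};
    $status_for{$key} = $status;
}

# Mistake 1: never rely on the order keys() returns; sort explicitly.
# (This is a plain string sort; numeric fields would need their own comparison.)
my @ordered = sort keys %status_for;
print "$_ => $status_for{$_}\n" for @ordered;
print "duplicate key: $_\n" for @duplicates;

# Mistake 3: after lock_keys(), touching a key outside the current set is fatal.
my %port = ( card => 3, port => 1 );
lock_keys(%port);

my $typo_error = '';
eval { my $value = $port{prot}; 1 } or $typo_error = $@;   # 'prot' is the typo
print "caught: $typo_error" if $typo_error;
```

A composite string key is only one choice; a deeper HoH keyed on the same fields would work too. Either way, the duplicate check is what surfaces the "not actually unique" problem before it reaches a report.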



1 – Thank you, PDX.pm.
2 – My test for variable names: 3 months (6 months, a year) later, is it still obvious what this variable holds, and would I name it the same thing?