November | 2009 | the gabriellephant

Archive for November, 2009

6 November, 2009

Refactoring!

by gorthx

Last night at the hackathon, we refactored one of our queries from my review of Refactoring SQL Applications.*

First, we had a duplicate field name in the original select. Not a problem if you’re just doing a select, but if you want to create a table (temp or otherwise) from the data, it won’t work. So we replaced the first num_rows with rows_in_bytes.

Also, reading over this five months after the original attempt, I realize it’s a lot clearer if we don’t use table aliases in the outer SELECTs.

Then, we got some advice from Greg Smith that we shouldn’t do joins on pg_class.relname – this can screw you up if you have different schemas with identical table names. You want to use oids (which I’d always thought was not desirable, but I’m assured it’s ok if you’re doing it with the system tables – you don’t want your application to depend on them, though. :) ) So, instead, we match pg_namespace.oid with pg_class.relnamespace.

Selena’s illustration of how this works:
SELECT relname, relkind
FROM pg_class
JOIN pg_namespace ON pg_namespace.oid = pg_class.relnamespace
WHERE relkind = 'r'
AND pg_namespace.nspname = 'public';

The new & improved version of the query can be found on the Pg wiki.

I wanted to compare the new query against the old, so I created a couple of temp tables containing the results… and discovered we had a couple of data discrepancies: a few of our tables were listed twice in the original query results, with different values for num_rows, only one of which was correct for the current schema:

portal=# SELECT count(*) from detectorid_count;
 count
-------
     0
(1 row)

portal=# SELECT count(*) from stations;
 count
-------
   350
(1 row)

portal=# SELECT count(*) from test_agg ;
 count
-------
     0
(1 row)

It turns out we’d run into the exact problem that Greg had warned us about. The additional rows were from identically-named tables in other namespaces.
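To see why matching on the table name alone pulls in extra rows, here's a minimal sketch. It doesn't talk to a real Postgres server: it uses Python's sqlite3 module with toy stand-ins for pg_class and pg_namespace, made-up oids, and the count(*) values from this post parked in reltuples (which in real Postgres is only an estimate). The original query isn't reproduced in this post, so the name-only lookup below is just an assumption about its general shape.

```python
import sqlite3

# Toy stand-ins for two PostgreSQL system catalogs in an in-memory SQLite
# database. The oids are invented; reltuples holds the count(*) values
# quoted in the post (assumption: real reltuples is only an estimate).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pg_namespace (oid INTEGER PRIMARY KEY, nspname TEXT);
CREATE TABLE pg_class (oid INTEGER PRIMARY KEY, relname TEXT,
                       relnamespace INTEGER, relkind TEXT,
                       reltuples INTEGER);
INSERT INTO pg_namespace VALUES (1, 'public'), (2, 'selena'), (3, 'wendell');
INSERT INTO pg_class VALUES
  (10, 'detectorid_count', 1, 'r', 0),
  (11, 'stations',         1, 'r', 350),
  (12, 'test_agg',         1, 'r', 0),
  (20, 'detectorid_count', 2, 'r', 631),
  (21, 'test_agg',         2, 'r', 1386),
  (30, 'stations',         3, 'r', 22);
""")

# Matching on relname only: every schema's copy of the table shows up.
by_name = conn.execute("""
    SELECT relname, reltuples
    FROM pg_class
    WHERE relname IN ('detectorid_count', 'stations', 'test_agg')
    ORDER BY relname, reltuples
""").fetchall()

# Matching through pg_namespace.oid = pg_class.relnamespace: one row per
# table, restricted to the schema we actually care about.
by_schema = conn.execute("""
    SELECT relname, reltuples
    FROM pg_class
    JOIN pg_namespace ON pg_namespace.oid = pg_class.relnamespace
    WHERE relkind = 'r' AND pg_namespace.nspname = 'public'
    ORDER BY relname
""").fetchall()

print(len(by_name), "rows when matching on name only")
print(len(by_schema), "rows when matching on namespace oid")
```

The name-only version returns each table twice, once per schema, which is the same pattern as the doubled rows in our old results.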

Find your namespaces:
portal=# SELECT nspname from pg_namespace order by 1;
      nspname
--------------------
 information_schema
 pg_catalog
 pg_temp_1
 pg_temp_2
 pg_toast
 pg_toast_temp_1
 pg_toast_temp_2
 public
 selena
 wendell
(10 rows)

Find your data:
portal=# SELECT count(*) from selena.detectorid_count ;
 count
-------
   631
(1 row)

portal=# SELECT count(*) from wendell.stations ;
 count
-------
    22
(1 row)

portal=# SELECT count(*) from selena.test_agg ;
 count
-------
  1386
(1 row)

Note that these match the additional data from our original query.
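If you'd rather not hunt through schemas by hand, one possible shortcut is to ask the catalog which table names live in more than one namespace. Here's a rough sketch of that idea, again against a toy SQLite copy of pg_class and pg_namespace rather than a real server. The oids are invented, and the `highways` table is a made-up extra row added only to show that non-colliding names drop out.

```python
import sqlite3

# Toy catalog: invented oids, and 'highways' is a hypothetical table that
# exists only in public, so it should not be reported as a collision.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pg_namespace (oid INTEGER PRIMARY KEY, nspname TEXT);
CREATE TABLE pg_class (oid INTEGER PRIMARY KEY, relname TEXT,
                       relnamespace INTEGER, relkind TEXT);
INSERT INTO pg_namespace VALUES (1, 'public'), (2, 'selena'), (3, 'wendell');
INSERT INTO pg_class VALUES
  (10, 'detectorid_count', 1, 'r'),
  (11, 'stations',         1, 'r'),
  (12, 'test_agg',         1, 'r'),
  (13, 'highways',         1, 'r'),
  (20, 'detectorid_count', 2, 'r'),
  (21, 'test_agg',         2, 'r'),
  (30, 'stations',         3, 'r');
""")

# List (table, schema) pairs for every ordinary table whose name appears
# in more than one namespace.
collisions = conn.execute("""
    SELECT pg_class.relname, pg_namespace.nspname
    FROM pg_class
    JOIN pg_namespace ON pg_namespace.oid = pg_class.relnamespace
    WHERE pg_class.relkind = 'r'
      AND pg_class.relname IN (
          SELECT relname
          FROM pg_class
          WHERE relkind = 'r'
          GROUP BY relname
          HAVING count(*) > 1
      )
    ORDER BY pg_class.relname, pg_namespace.nspname
""").fetchall()

for relname, nspname in collisions:
    print(f"{nspname}.{relname}")
```

The inner query should carry over to a real Postgres catalog more or less unchanged, though I've only sketched it against the toy tables here.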

Thanks, Greg!

—
* No, I haven’t finished reading it yet…I don’t read during the summer, I ride my bike.

Posted in PostgreSQL

Tags: databases, hackathon, PostgreSQL
