Archive for ‘Books’

15 December, 2010

PostgreSQL 9 Admin Cookbook

by gorthx

The PostgreSQL community has been waiting for a “cookbook”, and I’m really excited that we finally have one. The PostgreSQL 9 Admin Cookbook contains many tips & techniques I’m going to put to immediate use. Like, pgloader and the ON_ERROR_STOP option to psql. This book also reminded me that there’s still a lot I don’t know about Postgres: for example, I’ve never worked with pg_controldata or the quote_ident() function. If you’re a Pg admin (or wannabe), you should give this book at least a run-through, even if you think you already know everything.

Most of the “recipes” in this cookbook will stand on their own; some require material from previous or other referenced sections to make sense. I am a sucker for conversational style, and while the book gets off to a rough start, it does even out after a bit. Big blocks of SQL are formatted consistently in a style that, while it’s not one I use myself, is easy to read. (This is very important.)

The authors give a great explanation of why they prefer their filesystem set up a specific way, and how to do it (of course, it probably helps that I agree with them on this point); and good advice about schema & relation names. Specific problem-solving tools I found useful are: the list of steps to troubleshoot failed connections; specific things to do if a backend is hung, or a query is taking too long; and generating test data and taking random samples of real-world data. They also give warnings where something you do might cause application downtime.

Sadly, this book suffers from inadequate editing. I often struggled against the grammar and organization to understand what the authors were trying to express. This frustrated me and slowed me down; a detriment in a technical manual. I also found that I couldn’t rely on the (seemingly auto-generated) index to find what I need, although the TOC and chapter headings helped out a lot there.

Bottom line: would I buy a hard copy? At the listed pricepoint (US$45 at this time), probably not. However, it’s a great first effort and I eagerly await the second edition.


Disclaimer: I received a free review copy of this book from the publisher, in pdf format.

Advertisements
6 June, 2009

Book Review (part I): Refactoring SQL Applications, with bonus queries

by gorthx

It’s taking me quite a while to wade through Stephan Faroult’s Refactoring SQL Applications. I just finished Chapter 2 & figured I’d better just go ahead with the review.

It’s quite humorous – I mean, there’s a section called “Queries of Death” – but this is some dense material, make no mistake. I tried to keep my copy nice so I could loan it to others, but I had to give up and get out The Pen, and it’s been highlighted and scribbled on.

Small gripe: the layout of the example queries makes them hard to read (capitalizing the conditionals would help). I’d also like to see more examples of result sets.

The section about statistics sparked a lively discussion on #pdxpug about cardinality vs selectivity*. What I thought I knew about indexes has been thrown on its head – don’t base your decisions just on whether or not the column in question is searched on.

One of the recommendations for “Sanity Checks” is to take a good look at your indexes. For starters, check for tables with no indexes, or a lot of indexes. There’s a sample query to pull the number of rows, indexes, and some info about those indexes for each table. Faroult only shows sample queries for Oracle, SQL Server, and MySQL, so Selena & I put our heads together & came up with an equivalent for PostgreSQL:

(Only works on 8.3; ditch the pg_size_pretty if you’re on an earlier version)

SELECT
    t.tablename,
    pg_size_pretty(c.reltuples::bigint) AS num_rows,
    c.reltuples AS num_rows,
    count(indexname) AS number_of_indexes,
    CASE WHEN x.is_unique = 1 THEN 'Y'
       ELSE 'N'
    END AS unique,
    SUM(case WHEN number_of_columns = 1 THEN 1
              ELSE 0
            END) AS single_column,
    SUM(case WHEN number_of_columns IS NULL THEN 0
             WHEN number_of_columns = 1 THEN 0
             ELSE 1
           END) AS multi_column
FROM pg_tables t
LEFT OUTER JOIN pg_class c ON t.tablename=c.relname
LEFT OUTER JOIN
       (SELECT indrelid,
           max(CAST(indisunique AS integer)) AS is_unique
       FROM pg_index
       GROUP BY indrelid) x
       ON c.oid = x.indrelid
LEFT OUTER JOIN
    ( SELECT c.relname as ctablename, ipg.relname as indexname, x.indnatts as number_of_columns FROM pg_index x
           JOIN pg_class c ON c.oid = x.indrelid
           JOIN pg_class ipg on ipg.oid = x.indexrelid  )
    as foo
    ON t.tablename = foo.ctablename
WHERE t.schemaname='public'
GROUP BY t.tablename, c.reltuples, x.is_unique
order by 2;

It took quite a bit of chocolate to wrap that up…afterwards, Selena decided that it would be neat to look at table & index sizes and see which indexes were being scanned and how many tuples fetched:

SELECT
    t.tablename,
    indexname,
    c.reltuples AS num_rows,
    pg_size_pretty(pg_relation_size(t.tablename)) AS table_size,
    pg_size_pretty(pg_relation_size(indexrelname)) AS index_size,
    CASE WHEN x.is_unique = 1  THEN 'Y'
       ELSE 'N'
    END AS unique,
    idx_scan AS number_of_scans,
    idx_tup_read AS tuples_read,
    idx_tup_fetch AS tuples_fetched
FROM pg_tables t
LEFT OUTER JOIN pg_class c ON t.tablename=c.relname
LEFT OUTER JOIN
       (SELECT indrelid,
           max(CAST(indisunique AS integer)) AS is_unique
       FROM pg_index
       GROUP BY indrelid) x
       ON c.oid = x.indrelid
LEFT OUTER JOIN
    ( SELECT c.relname as ctablename, ipg.relname as indexname, x.indnatts as number_of_columns, idx_scan, idx_tup_read, idx_tup_fetch,indexrelname FROM pg_index x
           JOIN pg_class c ON c.oid = x.indrelid
           JOIN pg_class ipg ON ipg.oid = x.indexrelid
           JOIN pg_stat_all_indexes psai ON x.indexrelid = psai.indexrelid )
    as foo
    ON t.tablename = foo.ctablename
WHERE t.schemaname='public'
order by 1,2;



cardinality: size of the relation (“number of rows in [something]”)
selectivity: percent of the relation that’s selected
cardinality * selectivity = number of tuples in your results set.

14 April, 2009

Book Review: The Manga Guide to Databases

by gorthx

Everybody’s reading it!

I admit, I was skeptical at first. But, The Manga Guide to Databases is not just a lightweight illustrated treatment of RDBMSs. Detailed text follows each chapter in the story of Princess Ruruna and her servant Cain as they attempt to bring order to the chaos of the Kingdom of Koa’s fruit export business. Don’t skip those text sections, there’s additional info in there (I learned something new: the definitions of DDL, DML, and DCL as types of SQL queries) and the study questions are relevant to the material. I like that the book covers different data models, and includes a discussion of some relational operations. The explanations of transactions and ACID principles were very concise & clear. My only complaint is the somewhat confusing explanation of joins.

This book gets bonus points for having a thorough index and a “frequently used SQL statements” cheat sheet.

Bottom line: Entertaining yet educational.

Tags: