Back in January, PDXPUG had a lab night to try out the postgres_fdw.
Our labs are even more casual than our meetings: I don’t set an agenda, I invite the attendees to choose specific questions or topics they want to investigate. Here’s what we came up with for our FDW lab:
– What is it
– Use cases; who has one?
– Best practices?
– Security concerns?
– Performance concerns? (We didn’t get to this.)
– Any diffs between 9.3 and 9.4? (… or this either.)
– Test drive!
Our main focus was the Test Drive. Setup went pretty well for most folks, though one attendee had permissions issues & fell back to file_fdw.
There are four steps (outlined in the docs)
0. Install the extension ;)
1. Create a foreign server
2. Create a user mapping
3. Create your foreign table(s)
\des to describe foreign servers
\deu to describe user mappings
\det to describe foreign tables
My use case: quick & dirty ETL solution. I pull data from several postgres datasources1 into a staging database (for data conformance and other tasks), and then push it out to a data mart.
To keep it all reasonably organized, I put the foreign tables for each data source in a schema named for that datasource, e.g. the data from HR goes to schema ‘hr’, data from Accounting goes to ‘accounting’, etc. (This has tripped me up a couple of times, which I’ll cover in a later post.)
Here’s a quick example: I have a ‘users’ table in the accounting database, in the public schema. I want to put it in the ‘accounting’ schema locally.
CREATE SERVER accounting FOREIGN DATA WRAPPER postgres_fdw OPTIONS (
, host ‘server.boozemckidney.com’
, port ‘5432’
, updatable ‘false’
CREATE USER MAPPING FOR public SERVER accounting OPTIONS (
, user ‘[username]’
CREATE FOREIGN TABLE accounting.users (
, username text
, table_name ‘users’
, updatable ‘false’
Note that you don’t get an error if the schema & table you specify doesn’t exist in the remote database, so check your work. I just run a quick SELECT … LIMIT 1 on the foreign table. This will also help pick up permissions problems.
According to the docs, the default for updatable is ‘true’, and a table setting overrides a server setting. I originally intepreted that to mean that a table’s default setting of true would override a server’s explicit setting of false, which seemed odd, and my tests don’t bear that out. (Nice to see that Pg once again does the sensible thing.) Regardless, I explicitly set this value at both the server and table level, enabling it only on tables I want to update. Of course, you should also set appropriate permissions on the remote table itself.
If you have a lot of tables you need to create fdw for, script the process! This looks like a good option: http://www.postgresonline.com/journal/archives/322-Generating-Create-Foreign-Table-Statements-for-postgres_fdw.html
You can ALTER FOREIGN TABLE for the field names as you would a regular table, but I haven’t figured out how/if you can update the options (like, if somebody changes the target hostname or dbname2).
The credentials associated with user mappings are pg_dumped as cleartext. Ideally, I guess you’d use .pgpass to manage the passwords for you, but I don’t have that option on RDS. To avoid accidentally committing schema dumps that include credentials to my git repo, I added this to my pre-commit hook:
sed -i .bak "s/password '.*',/password '[pwd]',/" schema_dump.sql
(The .bak is required on OSX.)
As usual, I picked up some additional “off-topic” info: check out SELECT * [sequence] or \d [sequence]!
Thanks to Matt S for hanging out afterwards and introducing me to the differences between Oracle’s and Pg’s query planners.
1 – and a few others that are being a pain in my ass :koff:salesforce:koff:
2 – of course, why worry about that, because that NEVER happens, right?