Last weekend we held the biggest PDXPUGDay we’ve had in a while! Five speakers plus a few lightning talks added up to a fun lineup. About 1/3 of the ~50 attendees were in town for FOSS4G; I think the guy from New Zealand will be holding the “visitor farthest from PDXPUG” title for a good long while. Some folks from SEAPUG daytripped down (hi!) and we made plans for PDXPUG to road trip up there, probably for next year’s LinuxFestNW.
My highlights:
HSTORE, XML, JSON, and JSONB – David Wheeler
– Pg’s XML features are pretty neat, but I still think XML needs to DIAF. Perhaps that’s just my previous experience speaking.
– We renamed the HSTORE containment operator (@>) to the “ice cream cone operator”, courtesy of Mark Wong.
– Operations on JSON are slower than on HSTORE. That’s interesting.
– The storage overhead for JSONB is higher than for regular JSON, because it doesn’t compress very well. Josh B took an audience vote on improving compression at the expense of slowing down operations, and it was pretty evenly split.
– As usual, David included benchmarks and gave good overviews of when to use which data type.
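For anyone who hasn't met the ice cream cone operator yet, here's a minimal sketch of containment checks on both HSTORE and JSONB. The keys and values are made up for illustration, and it assumes the hstore extension is available and that your server is new enough to have JSONB (9.4+).

```sql
-- hstore ships as an extension, so it has to be installed first.
CREATE EXTENSION IF NOT EXISTS hstore;

-- Does the left-hand hstore contain the key/value pair on the right?
-- (Illustrative data; this returns true.)
SELECT 'pug=>pdx, city=>portland'::hstore @> 'pug=>pdx'::hstore AS hstore_contains;

-- The same operator works for JSONB containment.
-- (Illustrative data; this also returns true.)
SELECT '{"pug": "pdx", "city": "portland"}'::jsonb @> '{"pug": "pdx"}'::jsonb AS jsonb_contains;
```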
Snapshotted Data Versioning – Eric Hanson
Eric gave a talk about this at PDXPUG last year and was showing an updated version of what Aquameta’s up to. Eric’s philosophy is “make everything data, and then make a UI for it”.
– Implemented FUSE for Pg, bidirectional, so you can change your data by making updates directly in the database or by editing a text file on the filesystem. I believe this was described as “perverse” by a certain audience member.
Data Near Here – Veronika Megler
– Another update to a previous PDXPUG talk
– Scientists report that they spend up to 80% of their time just finding data relevant to their research. Not collecting – locating previously saved data. What a time sink.
– Parsers for each data format have to be custom coded.
Portal Update – Kristin Tufte
– Another example of pulling data from many different sources in many “unique” formats!
– Current research is looking at crosswalk button presses as a potential way to count pedestrians.
– I’d like to get ahold of the traffic light data, to see if the light at 32nd and Powell really is the longest light in Portland, or if that’s just my imagination.
AWS Faceoff (Cloud Shootout!) – Josh Berkus
I don’t care too much about Postgres on AWS – if I’m going to go that route, I’ll buy my own hardware, TYVM.
– RDS has a limited number of extensions installed, and PL/R isn’t one of them.* They did just add pg_stat_statements, which is cool. The Amazon support people are taking requests, and are attentive to the community, according to Josh. (I don’t have enough experience with that to have an opinion.)
– Performance on RDS just isn’t that great; Josh got 325 TPS read/write, and 1430 TPS read-only.
– Then there was the cost comparison; RDS and Heroku don’t look that great compared to hosting it yourself, but you’d need to factor in the cost of support staff there.
Thanks for a great event!
—
* I decided to see for myself what extensions were available. Mark warned me “don’t shed too many tears for what they don’t have”. To my surprise, many of my favorites are available – plperl, plpgsql, postgis, and tablefunc! (SO EXCITE MUCH PIVOT)
Check what’s available on your instance with this command:
SHOW rds.extensions;
Note that “SELECT * FROM pg_available_extensions ORDER BY name;” will show you a bunch of stuff that’s not necessarily available on RDS. (Something I wish they’d fix.)
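If you want to go a step further, here's a rough sketch of checking what's already installed in a database and then trying out the pivot that tablefunc gives you. The rows are made-up sample data, and I'm assuming your role is allowed to run CREATE EXTENSION (on RDS that generally means a member of rds_superuser).

```sql
-- Extensions already installed in the current database.
SELECT extname, extversion FROM pg_extension ORDER BY extname;

-- Install tablefunc, which provides crosstab().
CREATE EXTENSION IF NOT EXISTS tablefunc;

-- Pivot (row_name, category, value) rows into one column per category.
-- The source query must be ordered by row name, then category.
SELECT *
FROM crosstab($$
  SELECT * FROM (VALUES
    ('row1'::text, 'a'::text, 1),
    ('row1'::text, 'b'::text, 2),
    ('row2'::text, 'a'::text, 3),
    ('row2'::text, 'b'::text, 4)
  ) AS v (row_name, category, value)
  ORDER BY 1, 2
$$) AS ct (row_name text, a int, b int);
```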