Easy clock dimension table

by gorthx

We’re currently redesigning our data warehouse, and we’re experimenting with a clock (or time, or time-of-day) dimension to represent the time of day, separate from the date dimension we already use. This table should contain a record for each minute of a day, allowing us to easily determine business hours in various timezones, etc etc.

I’m not entirely sure it’s going to work for us just yet; we have a lot of timestamp data, and seems like it’s more trouble than it’s worth1. But it never hurts to proof-of-concept something, right? (Except for all those times the proof-of-concept ends up being the production system…)

Anyway, somebody commented that “you could have the analyst whip up the dataset in just a few hours” … :sideeye: or, I could do it myself in Postgres in just a few minutes:

CREATE TABLE clock_dimension (
 id serial primary key
 , full_timestamp time
 , hour_of_day_24 integer
 , hour_of_day_12 integer
 , am_pm text
 , minute_of_hour integer
 , business_hours_weekday boolean
 , business_hours_weekend boolean
 );

-- fill it with values
INSERT INTO clock_dimension (full_timestamp) VALUES (
generate_series('2016-08-05 00:00'::timestamp, '2016-08-05 23:59'::timestamp, '1 minute')::time
);
UPDATE clock_dimension SET
hour_of_day_24 = date_part('hour', full_timestamp)
, hour_of_day_12 = CASE
WHEN date_part('hour', full_timestamp) > 12
THEN date_part('hour', full_timestamp) - 12
WHEN date_part('hour', full_timestamp) = 0
THEN 12
ELSE date_part('hour', full_timestamp)
END
, am_pm = CASE
WHEN date_part('hour', full_timestamp) >=12
THEN 'pm'
ELSE 'am'
END
, minute_of_hour = date_part('minute', full_timestamp)
, business_hours_weekday = CASE
WHEN full_timestamp BETWEEN '08:00' AND '21:00'
THEN 't'::boolean
ELSE 'f'::boolean
END
, business_hours_weekend = 'f'
;

Sample data:

 id  | full_timestamp | hour_of_day_24 | hour_of_day_12 | am_pm | minute_of_hour | business_hours_weekday | business_hours_weekend 
-----+----------------+----------------+----------------+-------+----------------+------------------------+------------------------
 901 | 15:00:00       |             15 |              3 | pm    |              0 | t                      | f
 902 | 15:01:00       |             15 |              3 | pm    |              1 | t                      | f
 903 | 15:02:00       |             15 |              3 | pm    |              2 | t                      | f
 904 | 15:03:00       |             15 |              3 | pm    |              3 | t                      | f
 905 | 15:04:00       |             15 |              3 | pm    |              4 | t                      | f
 906 | 15:05:00       |             15 |              3 | pm    |              5 | t                      | f
 907 | 15:06:00       |             15 |              3 | pm    |              6 | t                      | f
 908 | 15:07:00       |             15 |              3 | pm    |              7 | t                      | f
 909 | 15:08:00       |             15 |              3 | pm    |              8 | t                      | f
 910 | 15:09:00       |             15 |              3 | pm    |              9 | t                      | f
 911 | 15:10:00       |             15 |              3 | pm    |             10 | t                      | f
 912 | 15:11:00       |             15 |              3 | pm    |             11 | t                      | f
 913 | 15:12:00       |             15 |              3 | pm    |             12 | t                      | f
 914 | 15:13:00       |             15 |              3 | pm    |             13 | t                      | f
 915 | 15:14:00       |             15 |              3 | pm    |             14 | t                      | f
 916 | 15:15:00       |             15 |              3 | pm    |             15 | t                      | f



1 – see also http://www.kimballgroup.com/2004/02/design-tip-51-latest-thinking-on-time-dimension-tables/

Advertisements

4 Comments to “Easy clock dimension table”

  1. What are some practical examples of how you would use this?

    • Hi Dan!

      You’d use the id as an FK in a fact table. Supposedly it makes it easier to figure out things like “what’s happening in vs out of business hours” and such. Whether that is the case or not remains to be seen ;) I’ll report back in a few months.

      g.

  2. The insert and immediate update pretty much guarantees that your table will be 2x as big as it needs to be. Do all the transformations in the first insert to avoid that.

    For that matter, I think you might have better luck defining some immutable functions like is_pm() and is_weekday() and then doing partial indexes `WHERE is_weekday(full_timestamp) = true`. You would have thrown an index on the weekday column in the dimension table anyway, at least this one is half the size…

    Even better, if you have those things defined in immutable functions, they can be used for queries, indexes, and as filters during the ETL.

    Full disclosure: I’m not a fan of the Kimball model. Their methods are highly effective at defeating effective partitioning as well as any in-db compression mechanisms, and many query optimizers can’t “see” the star-join.

    • Hi Corey! Thanks for commenting.

      You are correct about the insert/update taking up additional space. Because this table is only 1440 records, I’m not too worried about it at this point.

      And yeah, I’m not enamored of this plan myself yet. The point of this exercise is to get something up fast that I can experiment with, and determine if it makes things better or worse in terms of database layout, ETL complexity, and query complexity.

      Those are great suggestions about the functions! I’ll keep that in mind if/when I move forward with this.

%d bloggers like this: