Skip to content

Instantly share code, notes, and snippets.

@elfsternberg
Last active May 9, 2020 20:36
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save elfsternberg/3f51f8114c7f3866585f17818b93f9a6 to your computer and use it in GitHub Desktop.
Save elfsternberg/3f51f8114c7f3866585f17818b93f9a6 to your computer and use it in GitHub Desktop.
Uncle Bob's "Principles of Programming" lecture, reduced to a small SQL file with examples.
/*
Microsoft elder Robert Martin, aka "Uncle Bob," has a long series of
videos where he lectures on the joys and benefits of object oriented
programming. One thing he likes to say over and over is that your
data should not be tied up by the choice of representation, and that
the most problematic of all data representations is the choice of a
database far too early in the development process.
In his presentation on the Single Responsibility Principle (which I
agree with, in general!), he decomposes a ridiculously
over-engineered function designed to calculate travel expenses for a
human resources organization. And he does, indeed, decompose it,
into a suite of expense types, categories, and individual expenses,
revealing, automatically flagging those that are over some
reimbursement limit, and presenting a tally of how much is to be
reimbursed.
It nagged at me that he was describing *data types* as functions.
His thinking was far too mired in object oriented programming. I
wondered if, in fact, the entire thing could have been modeled using
*only* a database.
Of course it can.
The problem set is straightforward: Given a list of expenses
(person, date, cost, category (breakfast, lunch, dinner, ground
transport, air transport), and other tables that describe
meta-categories (dining, travel) and max reimbursable amount per
category, write a SQL statement that returns:
1. A list of all the expenses, categorized, meta-categorized, and
annotated with max reimbursables.
2. Annotate further with the reimbursed amount and an "attention"
note if the amount is over
3. Return a cleaned up copy of this table, as well as sum totals of
the travel and dining expenses during the time period.
*/
drop view if exists expense_report;
drop table if exists expenses;
drop table if exists users;
drop table if exists expense_types;
drop table if exists expense_categories;
create table users (
id int primary key generated by default as identity,
username text not null);
create table expense_categories (
id int primary key generated by default as identity,
expense_category text not null);
create table expense_types (
id int primary key generated by default as identity,
expense_type text not null,
maximum money,
category_id int references expense_categories);
create table expenses (
id int primary key generated by default as identity,
user_id int references users,
event timestamp not null,
type_id int references expense_types,
amount money not null
);
INSERT INTO users (username)
VALUES
('Uncle Bob');
INSERT INTO expense_categories (expense_category)
VALUES
('Dining'),
('Travel'),
('Other');
INSERT INTO expense_types (expense_type, maximum, category_id)
VALUES
('Dinner', 50, 1),
('Breakfast', 10, 1),
('Lunch', NULL, 1),
('Air Travel', NULL, 2),
('Taxi', NULL, 2),
('Other', 0, 3);
INSERT INTO expenses (user_id, event, type_id, amount)
VALUES
(1, '2020-01-08 04:05:06', 1, 45.00),
(1, '2020-01-09 09:05:06', 2, 12.00),
(1, '2020-01-09 16:05:06', 4, 250.00),
(1, '2020-01-09 21:05:06', 1, 53.00),
(1, '2020-01-09 22:05:06', 6, 127.00),
(1, '2020-01-09 19:05:06', 5, 27.00);
/*
This is the first piece of magic: we're creating a virtual table in
which derivative information such as how much will be reimbursed if
the traveller exceeds a limit, what to do if there is no limit, and
flagging that overage for attention. This view also unifies the
information above into a decorated whole, allowing the user to make
decisions on what information to extract.
*/
CREATE OR REPLACE FUNCTION max_reimbursed(amount money, maximum money)
RETURNS money AS
$$
BEGIN
IF maximum IS NOT NULL THEN
RETURN least(amount, maximum);
ELSE
RETURN amount;
END IF;
RETURN amount;
END;
$$
LANGUAGE 'plpgsql' IMMUTABLE;
CREATE OR REPLACE FUNCTION max_highlight(amount money, maximum money)
RETURNS text AS
$$
BEGIN
IF maximum IS NOT NULL THEN
IF amount > maximum THEN
RETURN '*';
ELSE
RETURN ' ';
END IF;
END IF;
RETURN ' ';
END;
$$
LANGUAGE 'plpgsql' IMMUTABLE;
CREATE VIEW expense_report AS SELECT
users.id AS user_id,
expense_types.id AS expense_type_id,
expense_categories.id AS expense_categories_id,
username,
event,
expense_type,
expense_category,
amount,
maximum,
max_highlight(amount, maximum) AS over,
max_reimbursed(amount, maximum) AS reimbursed
FROM expenses
INNER JOIN expense_types
ON type_id = expense_types.id
LEFT JOIN expense_categories
ON category_id = expense_categories.id
JOIN users
ON user_id = users.id;
/*
Once the `expense_report` view is constructed, all that's left
is to specify which user (and probably a date range), and the
information Uncle Bob specified should be displayed, ordered by
the times when the events happened:
*/
SELECT
username,
event,
expense_type,
expense_category,
amount,
maximum,
reimbursed,
over
FROM expense_report
WHERE user_id = 1
ORDER BY event;
/*
And finally, create a summary of the categorical reimbursements,
filtering out any zeroed rows.
*/
WITH expense_summary AS
(
SELECT
user_id,
username,
expense_category,
SUM(amount) AS total_amount,
SUM(reimbursed) AS total_reimbursed
FROM
expense_report
GROUP BY
user_id,
username,
expense_category)
SELECT
username, expense_category, total_amount, total_reimbursed
FROM
expense_summary
WHERE
user_id = 1 AND
total_reimbursed > '$0.0';
/*
Expected return:
username | event | expense_type | expense_category | amount | maximum | reimbursed | over
-----------+---------------------+--------------+------------------+---------+---------+------------+------
Uncle Bob | 2020-01-08 04:05:06 | Dinner | Dining | $45.00 | $50.00 | $45.00 |
Uncle Bob | 2020-01-09 09:05:06 | Breakfast | Dining | $12.00 | $10.00 | $10.00 | *
Uncle Bob | 2020-01-09 16:05:06 | Air Travel | Travel | $250.00 | | $250.00 |
Uncle Bob | 2020-01-09 19:05:06 | Taxi | Travel | $27.00 | | $27.00 |
Uncle Bob | 2020-01-09 21:05:06 | Dinner | Dining | $53.00 | $50.00 | $50.00 | *
Uncle Bob | 2020-01-09 22:05:06 | Other | Other | $127.00 | $0.00 | $0.00 | *
(6 rows)
username | expense_category | total_amount | total_reimbursed
-----------+------------------+--------------+------------------
Uncle Bob | Dining | $110.00 | $105.00
Uncle Bob | Travel | $277.00 | $277.00
*/
@elfsternberg
Copy link
Author

In order to demonstrate further how I could attend to the needs of Uncle Bob's various maxims, I have broken out the CASE statements into individual functions, the better to make clear their purposes. Both are basically three-line if statements: IF there's a maximum value and the expense amount exceeds it, return the maximum value and a highlight, else return the amount and no highlight. PSQL's primitiveness makes this annoying and noisy, but hardly fatal.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment