Skip to content

Instantly share code, notes, and snippets.

@hopeseekr
Last active May 23, 2018 22:30
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save hopeseekr/13279dc6c9d6f978aa12828a8eaca479 to your computer and use it in GitHub Desktop.
Save hopeseekr/13279dc6c9d6f978aa12828a8eaca479 to your computer and use it in GitHub Desktop.
Advanced SQL: Materialized View with Many-To-Many Squashing and Regexp

So say you have a table with a many-many relationship to another, and you want to create a caching table (aka a materialized view):

CREATE TABLE orders_cache (
customer_id int primary_key, 
payment_type varchar(10000),
INDEX(payment_type(255))
) CHARSET=utf8;

First thing you'll probably want to do is normalize the payment_bases.detail_type from App\Model\Payment\% to just %. The best way to do this is using the SQL-standard substring() method: SUBSTR(pay.detail_type, 26).

Then, you'll want to flatten the query results so that each order has exactly 1 row but no data is lost. Each database server tends to have something to do this. MySQL and SQLite have GROUP_CONCAT, or array_agg in PostgreSQL, but they all have something. This is the equivalent of PHP's implode() and has a default delimiter of ,.

GROUP_CONCAT(SUBSTR(pay.detail_type, 26)) AS payment_type


However, NULLs work very poorly in WHERE LIKEs (they don't work) and or is a terrible performance hit, so we need to convert the missing payments to text, too. Let's use the codebase's None status. To do this, use the COALESCE() function. Every SQL flavor I am aware of has this function.

GROUP_CONCAT(COALESCE(SUBSTRING(pay.detail_type, 26), 'None')) AS payment_type

With this: You can easily create the cache in a single query by doing this:

INSERT INTO orders_cache 
SELECT o.customer_id, 
    GROUP_CONCAT(COALESCE(SUBSTRING(pay.detail_type, 26), 'None')) AS payment_type
FROM orders o 
LEFT JOIN payment_bases pay ON pay.customer_id=o.customer_id 
GROUP BY o.customer_id
ORDER BY o.customer_id;

But how do you actually query whether an order has a CreditCard xor Paypal payment? What if you want to find orders that have either no payment, cc, -or- paypal? How do you also make sure you get orders with both CC -and- PP? In the most performant way?

Well, all modern SQL dialects also have regular expression searches. So, in this case, you'd run the following code:

$paymentTypesParam = implode('|', $paymentTypes);
SELECT * FROM orders_cache WHERE payment_type REGEXP 'CreditCard|Paypal|None';

Output:

+-------------+-----------------------+
| customer_id | payment_type          |
+-------------+-----------------------+
|          31 | CreditCard            |
|         150 | None                  |
|         608 | CreditCard,CreditCard |
|      202326 | CreditCard,Paypal     |
|      202668 | Paypal,CreditCard     |
|      202898 | Paypal,Paypal         |
+-------------+-----------------------+
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment