This small codebase uses Docker to refresh materialized views in Postgresql on a periodic basis. It's intended to be installed in Elasticbeanstalk but can be run from your laptop. The Docker image is about 52 MB.
Materialized views defined in the target database with names ending in hourly
and daily
will get refreshed.
- Build the Docker image
docker build -t mvs_refresh .
- If you need to jump into the image to test the script
python /refresh.py
, use the appropriate environment variables:
docker exec -e MV_DB_HOST='...rds.amazonaws.com' \
-e MV_DB_USER='db_write_user' \
-e MV_DB_PW='secret' \
-e MV_DB='database' \
-it mvs_refresh /bin/sh
- Make a file
.ebextensions/secret.config
that includes the environment variable info (fill in ... appropriately):
option_settings:
- option_name: MV_DB_HOST
value: ...
- option_name: MV_DB_USER
value: ...
- option_name: MV_DB_PW
value: ...
- option_name: MV_DB
value: ...
- Start eb and deploy
eb init
# follow prompts
eb deploy
N.B.: You'll need to make sure that your database has given your EC2 instance(s) proper permissions to access the database.
- As a user that can SELECT on all the relevant tables and can CREATE in that database, make a materialize view. You should name it something that you'll remember and add a suffix
hourly
ordaily
so it can be refreshed appropriately.
CREATE MATERIALIZED VIEW usefulname_hourly AS
SELECT super_horrible_SQL_that_makes_your_eyes_bleed
FROM massive_table
JOIN another_massive_table ...
WHERE ...
GROUP BY ...
- For the refresh to work (using CONCURRENT access), you'll need to create an index on at least one of the columns in that MV:
Since the connection was not opened using a with statement, shouldn't there be a conn.commit() at some point after executing a refresh?
ref: http://initd.org/psycopg/docs/connection.html