coltoneshaw/Observations.md

## Observations.md

      
    Raw
  

              Observations.md
            
          
    Observations

Config Suggestions

If the database has less than 12GB RAM, reduce the below settings.
Note that you can set shared_buffers and effective_cache_size higher on the primary database because they generally receive less traffic.
Primary Database

# This value should match the "SqlSettings.MaxConnections" value within your config.json for Mattermost
# This is a suggestion and can be set lower / higher based on the size of your server.
max_connections = 1020
tcp_keepalives_idle = 5
tcp_keepalives_interval = 1
tcp_keepalives_count = 5

# Set both of the below settings to 65% of total memory. For a 32 GB instance, it should be 21 GB.
# If on a smaller server, set this to 20% or less total RAM.
# ex: 512MB would work for a 4GB RAM server
shared_buffers = 512MB
effective_cache_size = 512MB

# Set it to 16 MB for readers and 32 MB for writers. If it's a single instance, 16 MB should be sufficient. If the instance is of a lower capacity than r5.xlarge, then set it to a lower number.
work_mem = 16MB

# 1GB (reduce this to 512MB if your server has less than 32GB of RAM)
autovacuum_work_mem = 512MB
autovacuum_max_workers = 4
autovacuum_vacuum_cost_limit = 500

#Set it to 1.1 unless the DB is using spinning disks.
random_page_cost = 1.1

restart_after_crash = on
Replica Database

In addition to the above settings
# connection string to sending server
# This was created when you added a replication role to the primary database.
# username - replace "test" with the role you made
# password - replace "testpassword" with the role password
# host - replace "x.x.x.x" with your IP or URL string.
primary_conninfo = 'user=test password=testpassword host=x.x.x.x port=5432 sslmode=prefer sslcompression=0 gssencmode=prefer krbsrvname=postgres target_session_attrs=any'

# replication slot on sending server
# This needs to be configured using the docs below for "Keeping the replica and primary in sync."
primary_slot_name = 'replica1'

hot_standby = on
# Allows query communication between reader and primary. Suggested to prevent any timeouts.
hot_standby_feedback = on
Enable Replica Lag logging in Mattermost


Add to the Mattermost config.
Within the Mattermost config.json, add the below to the SqlSettings.ReplicaLagSettings. The query will pull the diff between the current WAL and the replica. The DataSource string should point to your primary database.

username - Is the PostgreSQL username with permissions to the mattermost database. Usually, this is mmuser.
password - Password assigned to the username from above.
connectionString - IP address or URL of the PostgresSQL instance. It doesn't require :5432.

{
"SqlSettings": {
    "ReplicaLagSettings": [
    {
        "DataSource": "postgres://username:password@connectionString/mattermost?sslmode=disable\u0026connect_timeout=10",
        "QueryAbsoluteLag": "select usename, pg_wal_lsn_diff(pg_current_wal_lsn(),replay_lsn) as metric from pg_stat_replication;",
        "QueryTimeLag": null
    }
    ]
}
}


Give permissions
Before restarting, give the mmuser access to the PostgreSQL monitoring role. Updating permissions should be done on the PostgreSQL primary database, not the replica.
PostgreSQL Role documentation
sudo -u postgres psql
postgres=# GRANT pg_monitor TO mmuser;


Update chart and restart Mattermost.
In the Mattermost Performance Monitoring v2, update the Replica Lag chart to use the value mattermost_db_replica_lag_abs instead of the current time value.
The query should be the below:
mattermost_db_replica_lag_abs{instance=~"$server"}


Restart Mattermost
You should see this go into effect. Check the logs for any query errors.
sudo systemctl restart mattermost


Keeping the replica and primary in sync

If a read replica fails, when it attempts to recover, it will be unable to because the primary database will not have stored the WAL log it requires to sync back up. You'll see the below in the logs usually if this happens.
2022-03-25 04:38:27.369 UTC [77723] mmuser@mattermost FATAL:  the database system is in recovery mode
...
2022-03-25 04:44:12.456 UTC [77947] LOG:  entering standby mode
2022-03-25 04:44:12.465 UTC [77947] LOG:  redo starts at 6/5FA18838
2022-03-25 04:44:12.969 UTC [77950] mmuser@mattermost FATAL:  the database system is starting up
2022-03-25 04:44:12.971 UTC [77951] mmuser@mattermost FATAL:  the database system is starting up
2022-03-25 04:44:13.574 UTC [77947] LOG:  consistent recovery state reached at 6/6DA47060
2022-03-25 04:44:13.575 UTC [77946] LOG:  database system is ready to accept read only connections
2022-03-25 04:44:23.290 UTC [77947] LOG:  invalid resource manager ID 215 at 6/7236A070
2022-03-25 04:44:23.306 UTC [77975] LOG:  started streaming WAL from primary at 6/72000000 on timeline 1
2022-03-25 04:44:23.306 UTC [77975] FATAL:  could not receive data from WAL stream: ERROR:  requested WAL segment 000000010000000600000072 has already been removed

Option 1 - WAL Archive / Recovery

You can use WAL Archiving with recovery on the primary / read-replica. I did not explore this option in testing because it requires an additional setup. However, docs can be found on PostgreSQL's page for continuous archiving.
Option 2 - Replication Slots

Replication slots allow the primary database to save the WAL archive for the read-replica until all transactions have been updated, confirmed then it gets flushed. Some information on this can be found here or here Effectively, this allows for an easy recovery on the read-replica if it crashes.
The only genuine concern with this option is that if your read replica goes offline for an extended period, the primary will store the WAL archive until the replica is back online. With the replica offline, it could consume a large amount of storage if offline for days/weeks as the primary stores the WAL archive.

To view existing replication slots - select * from pg_replication_slots;
Check replication slot lag - SELECT redo_lsn, slot_name,restart_lsn,
Remove a replication slot - select pg_drop_replication_slot(‘ocean’);

Setup


On the primary, you need to add a replication slot with the below.
You can replace replica1 with any unique value you want.
select pg_create_physical_replication_slot(‘replica1’);


In the PostgreSQL config on the read replica, set the primary_slot_name value to the slot name above


Edit the read replica Postgres.conf file to use the slot name.
File location - /etc/postgres/12/main/postgres.conf
primary_slot_name = 'replica1'


Restart the replica PostgreSQL service.
sudo systemctl restart postgresql


You can confirm this is running by running the below command on the primary PostgreSQL instance.
SELECT
	client_addr, state, pid, slot_name, active
FROM
	pg_stat_replication
INNER JOIN
	pg_replication_slots on pid = pg_replication_slots.active_pid;
You should see:
  client_addr  |   state   |  pid   | slot_name | active
 ---------------+-----------+--------+-----------+--------
 x.x.x.x       | streaming | 116774 | replica1   | t


Preventing query cancels on read-replica

Option 1 - Enable Hot Standby Feedback

This allows the primary/replica to communicate about queries happening and not cancel the read replica. This seemed to have the most overall benefit with the least amount of errors on the replica. They claim it adds primary system load, but it appears to be minimal at most.
hot_standby_feedback = on

Option 2 - Add a standby delay

This delay allows the standby server to wait for an update on the primary instead of canceling the query. This will reduce the overall errors in the database getting updated during a query. However, it can cause slower responses within Mattermost. Additionally, this still has some replica lag and some timed-out queries due to reaching the Mattermost max query time.
max_standby_archive_delay = 300s
max_standby_streaming_delay = 300s

Enable Crash Restart

restart_after_crash = on

Disable OOM

This should be a last resort. If it's getting OOM killed, increase the server specs first.
https://www.percona.com/blog/2019/08/02/out-of-memory-killer-or-savior/

  
## Read Replica Setup.md

      
    Raw
  

              Read Replica Setup.md
            
          
    Set up PostgreSQL to use read replication.

This guide assumes you've already set up your Mattermost database and have it running correctly.
Setting up the primary


Create the replica role
We are creating a role on your primary to allow an incoming connection to replicate the database.

username replace this with the username you want your replica to use to connect to the primary.
password replace with your replica connection password.

sudo -u postgres psql
postgres=# CREATE ROLE username WITH REPLICATION PASSWORD 'password' LOGIN;
Output
CREATE ROLE


Create a replication slot
You can replace replica1 with anything you want to identify your read replica. Note that this needs to be a unique name per replica you connect.
select pg_create_physical_replication_slot(‘replica1’);


Allow the replica communication
Edit the pg_hba.conf file on the primary database to allow incoming communication via the configured above username
/etc/postgresql/12/main/pg_hba.conf

username - This is the role username configured in step 1 above.
replica-IP - This is the IP address of your replica.

...
host    replication     username    replica-IP/32   md5


Restart Postgres
sudo systemctl restart postgresql


Setup the Read Node


Install Postgres
sudo apt-get install postgres


Clear the replica data directory.
Find the data directory first by the command below:
sudo -u postgres psql
postgres=#SHOW data_directory;
    data_directory
-----------------------------
/var/lib/postgresql/12/main
(1 row)
Delete the contents of the data directory on the replica.
sudo -u postgres rm -r /var/lib/postgresql/12/main
sudo -u postgres mkdir /var/lib/postgresql/12/main
sudo -u postgres chmod 700 /var/lib/postgresql/12/main


Backup the primary database onto the replica
The backup will take some time, depending on the database size. Testing with a 16GB database took ~4 minutes.

The -U username option allows you to specify the user you connect to the primary cluster as. This is the role you created in the previous step.
It will prompt for a password. This will be what was specified in the prior step

sudo -u postgres pg_basebackup -h primary-ip-addr -p 5432 -U username -D /var/lib/postgresql/12/main/ -Fp -Xs -R


Edit the read replica postgres.conf file to use the slot name.
This primary_slot_name is the same slot you configured above in Step 2 of setting up the primary.
File location - /etc/postgres/12/main/postgres.conf
primary_slot_name = 'replica1'


Restart postgres
sudo systemctl restart postgresql


Verify connectivity


Connect to PostgreSQL on the primary server.
sudo -u postgres psql


Copy and run the below command
SELECT client_addr, state, pid, slot_name, active
FROM pg_stat_replication
INNER JOIN pg_replication_slots on pid = pg_replication_slots.active_pid;
You should see:
client_addr  |   state   |  pid   | slot_name | active
---------------+-----------+--------+-----------+--------
x.x.x.x       | streaming | 116774 | replica1   | t


Setup the replica config in Mattermost

Edit the config.json file on the Mattermost server and include the below:

username - This is the same username on your primary database
password - This is the same password on your primary database
databaseIP - This points to the database IP of your replica

{
    "SqlSettings": {
        "DataSourceReplicas": [
            "postgres://username:password@databaseIP/mattermost?sslmode=disable\u0026connect_timeout=10"
        ],
    }
}

  
## Tests.md

      
    Raw
  

              Tests.md
            
          
    Setup

These tests were preformed on:

Mattermost instance running in AWS on c5.2xlarge
Mattermost version 6.3.9.
Two Postgres 12 databases running AWS c5.large and the above replication setup.

Results


Test #
Test Name
Start Time
End Time
Highest Replica Lag
NumErrors
Termination
Query Cancelling
User Request
Supported Users
Changes
Notes


1
Tuned PG with hot standby feedback
2022-03-28 16:32:41
2022-03-28 17:33:12
43.6MB
3536
0
19
25
4672
Tuned Postgres to Agniva's settings. Adjusted the reader to 512MB for the RAM settings and hot_standby_feedback = on
High replica lag only occured once. Usually less than 1MB


1b
Validation of above
2022-03-28 17:37:28
2022-03-28 18:36:24
39.4MB
3297
0
0
12
4623
-
High replica lag only occured once. Usually less than 1MB


2
Tuned PG with hot standby feedback off
2022-03-28 18:42:29
2022-03-28 19:44:21
150MB
6370
886
5941
6
4685
Same as Test 1 but with hot_standby_feedback = off
Replica lag was over 100MB for a bulk of the time once 4k users was hit.


2b
Validation of above
2022-03-28 19:47:12
2022-03-28 20:46:55
166MB
5149
831
5657
2
4618
-
Same as test 2, but slightly higher max replica lag,


3
Tuned, hot standby feedback off, delay
2022-03-29 13:14:22
2022-03-29 14:21:04
271MB
33785
745
3918
10
5239
hot_standby_feedback = off with delay set to 100s
Postgres crashed. Replica lag was over 200MB multiple times once 5k users was hit.


3b
Validation of test 3
2022-03-29 14:38:51
2022-03-29 15:43:22
218MB
6097
436
3754
10
4729
-
Lag was over 150MB for the bulk after 4k users.


4
Delay increased to 300s
2022-03-29 16:53:15
2022-03-29 17:56:37
466MB
6989
125
937
15
4608
increased the delay settings to 300s
Replica lag increased dramatically. Frequently hit 200MB+


4b
Validation of test 4
2022-03-29 18:09:54
2022-03-29 19:10:14
264MB
6664
93
829
73
4512
-
Not as high of replica lag as test 4, but still averaged over 150MB frequently


## Troubleshooting.md

      
    Raw
  

              Troubleshooting.md
            
          
    Troubleshooting Postgres

Log Location:

/var/log/postgresql/postgresql-<version>-main.log
Out of Sync Replica

Check on the primary database to see if the replica is still connected and streaming. If no replica is connected, you will see a blank table returned.
postgres=# select * from pg_stat_replication \x\g\x
-[ RECORD 1 ]----+------------------------------
pid              | 4643
usesysid         | 16384
usename          | test
application_name | 12/main
client_addr      | 172.31.70.183
client_hostname  |
client_port      | 41482
backend_start    | 2022-03-28 13:11:48.668763+00
backend_xmin     |
state            | streaming
sent_lsn         | 14/61502C98
write_lsn        | 14/61502C98
flush_lsn        | 14/61502C98
replay_lsn       | 14/61502C98
write_lag        |
flush_lag        |
replay_lag       |
sync_priority    | 0
sync_state       | async
reply_time       | 2022-03-28 13:40:28.312833+00
Check how bad the delay is between the primary and replica. The higher the number, the more out of sync the replica is. The query will return the difference in bytes for how out of sync it is.
So, 200000000 will be 200MB out of sync. Ideally, this metric is below 10MB out of sync, and when it starts to reach high numbers, that usually means the replica has gone down and requires a restart.
select usename, client_addr, state, pg_wal_lsn_diff(pg_current_wal_lsn(),replay_lsn) as delay from pg_stat_replication;
 usename |  client_addr  |   state   | delay
---------+---------------+-----------+-------
 test    | 172.31.70.183 | streaming |     0
(1 row)
Resyncing a Replica

Usually, this is as simple as just restarting the PostgreSQL service on the read replica. If the read replica was misconfigured or fallen too far behind the WAL archive on the primary, you might need to rebuild the replica's database with the below commands.


Delete all the postgres data on the read replica.
sudo -u postgres rm -r /var/lib/postgresql/12/main
sudo -u postgres mkdir /var/lib/postgresql/12/main
sudo -u postgres chmod 700 /var/lib/postgresql/12/main


Redownload the data from the primary database.
Use this command on the replica.

-h primary-ip-addr - This will be the IP address of your primary database.
-U username - the username here is the role that you configured within the primary database for the replica to connect with. You can check the name by using \du+ on the primary database.
It will prompt for the password you configured for that role.

sudo -u postgres pg_basebackup -h primary-ip-addr -p 5432 -U username -D /var/lib/postgresql/12/main/ -Fp -Xs -R


-U test is the replication role configured on the primary to accept communications. If you do not remember the role name, you can run \du+ on the primary database.
-D /var/lib/postgresql/12/main/ this tells the command where to output the backup.
-h primary-ip-addr The IP address of the primary database.


Restart Postgres on the read replica.
sudo systemctl restart postgresql
If PostgreSQL fails to start, use sudo tail -n 300 /var/log/postgresql/* to check what issues occurred during startup.


Check Connection

pg_isready is a utility designed to check if your box can access the PostgreSQL install.


Install pg_isready.
Replace 12 with the version string of your PostgreSQL instance.
Example for Postgres 12
sudo apt-get install postgresql-client-common postgresql-client-12


Test the connection
This will reach out to the PostgreSQL instance running at the below and make sure it's accepting connections properly.
You can find all this data in the mattermost config.json file. If you run the below you can use most of the data from the DataSource string to fill this in.
sudo grep "SqlSettings" -A 10 /opt/mattermost/config/config.json
pg_isready -d <db_name> -h <host_name> -p <port_number> -U <db_user>
Expected Response:
accepting connections
Test #	Test Name	Start Time	End Time	Highest Replica Lag	NumErrors	Termination	Query Cancelling	User Request	Supported Users	Changes	Notes
1	Tuned PG with hot standby feedback	2022-03-28 16:32:41	2022-03-28 17:33:12	43.6MB	3536	0	19	25	4672	Tuned Postgres to Agniva's settings. Adjusted the reader to 512MB for the RAM settings and `hot_standby_feedback = on`	High replica lag only occured once. Usually less than 1MB
1b	Validation of above	2022-03-28 17:37:28	2022-03-28 18:36:24	39.4MB	3297	0	0	12	4623	-	High replica lag only occured once. Usually less than 1MB
2	Tuned PG with hot standby feedback off	2022-03-28 18:42:29	2022-03-28 19:44:21	150MB	6370	886	5941	6	4685	Same as Test 1 but with `hot_standby_feedback = off`	Replica lag was over 100MB for a bulk of the time once 4k users was hit.
2b	Validation of above	2022-03-28 19:47:12	2022-03-28 20:46:55	166MB	5149	831	5657	2	4618	-	Same as test 2, but slightly higher max replica lag,
3	Tuned, hot standby feedback off, delay	2022-03-29 13:14:22	2022-03-29 14:21:04	271MB	33785	745	3918	10	5239	`hot_standby_feedback = off` with delay set to 100s	Postgres crashed. Replica lag was over 200MB multiple times once 5k users was hit.
3b	Validation of test 3	2022-03-29 14:38:51	2022-03-29 15:43:22	218MB	6097	436	3754	10	4729	-	Lag was over 150MB for the bulk after 4k users.
4	Delay increased to 300s	2022-03-29 16:53:15	2022-03-29 17:56:37	466MB	6989	125	937	15	4608	increased the delay settings to 300s	Replica lag increased dramatically. Frequently hit 200MB+
4b	Validation of test 4	2022-03-29 18:09:54	2022-03-29 19:10:14	264MB	6664	93	829	73	4512	-	Not as high of replica lag as test 4, but still averaged over 150MB frequently