Skip to content

Instantly share code, notes, and snippets.

@streamer45
Last active February 9, 2022 09:39
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save streamer45/868c451164f6e8069d8b398685a31b6e to your computer and use it in GitHub Desktop.
Save streamer45/868c451164f6e8069d8b398685a31b6e to your computer and use it in GitHub Desktop.

5.39.0 -> 6.0.0 Migration Results (70M+ posts)

Dataset

Table Rows
Posts 72M
Reactions 14M
Preferences 6M
Threads 5M
ChannelMembers 4M
Channels 140k
Users 10k

Database Cluster

  • 1x r5.2xlarge Aurora RDS writer (8vCPU, 64GB RAM)
  • 1x r5.2xlarge Aurora RDS reader (8vCPU, 64GB RAM)

MySQL

# Query Duration
1 ALTER TABLE Posts MODIFY COLUMN FileIds text, MODIFY COLUMN Props JSON, DROP COLUMN ParentId; 3h40m
2 ALTER TABLE ChannelMembers MODIFY NotifyProps JSON; ~2m
3 ALTER TABLE LinkMetadata MODIFY Data JSON; ~50ms
4 ALTER TABLE Jobs MODIFY Data JSON; ~90ms
5 ALTER TABLE Sessions MODIFY Props JSON; ~100ms
6 ALTER TABLE Threads MODIFY Participants JSON; ~2m
7 ALTER TABLE Users MODIFY Props JSON; ~3s
8 ALTER TABLE Users MODIFY NotifyProps JSON; ~2s
9 ALTER TABLE Users MODIFY Timezone JSON; ~2s
10 UPDATE Posts SET RootId = ParentId WHERE RootId = '' AND RootId != ParentId; ~12m
11 UPDATE CommandWebhooks SET RootId = ParentId WHERE RootId = '' AND RootId != ParentId; ~1ms
12 ALTER TABLE CommandWebhooks DROP COLUMN ParentId; ~1ms
13 CREATE INDEX idx_posts_root_id_delete_at ON Posts (RootId, DeleteAt); ~15m
14 DROP INDEX idx_posts_root_id ON Posts; ~500ms
15 CREATE INDEX idx_channels_team_id_display_name ON Channels (TeamId, DisplayName); ~2s
16 CREATE INDEX idx_channels_team_id_type ON Channels (TeamId, Type); ~400ms
17 DROP INDEX idx_channels_team_id ON Channels; ~30ms
18 CREATE INDEX idx_threads_channel_id_last_reply_at ON Threads (ChannelId, LastReplyAt); ~23s
19 DROP INDEX idx_threads_channel_id ON Threads; ~60ms
20 CREATE INDEX idx_channelmembers_user_id_channel_id_last_viewed_at ON ChannelMembers (UserId, ChannelId, LastViewedAt); ~20s
21 CREATE INDEX idx_channelmembers_channel_id_scheme_guest_user_id ON ChannelMembers (ChannelId, SchemeGuest, UserId); ~20s
24 DROP INDEX idx_channelmembers_user_id ON ChannelMembers; ~60ms
25 CREATE INDEX idx_status_status_dndendtime ON Status (Status, DNDEndTime); ~40ms
26 DROP INDEX idx_status_status ON Status; ~30ms

Notes

Query #1 needed some DB specific tuning to succeed as it showed very high memory consumption (~16GB peak) which on config defaults caused an OOM (out of memory) database crash:

  • Overall Posts table size was of ~25GB data and ~45GB index.
  • innodb_buffer_pool_size was modified from the default of {DBInstanceClassMemory*3/4} to {DBInstanceClassMemory*1/2} to allow for more freeable memory.
  • Significant DB CPU usage was noticed throughout the query execution.

Freeable Memory

image

CPU Usage

image

PostgreSQL

# Query Duration
1 ALTER TABLE channelmembers ALTER COLUMN notifyprops TYPE jsonb USING notifyprops::jsonb ~35s
2 ALTER TABLE jobs ALTER COLUMN data TYPE jsonb USING data::jsonb; ~20ms
3 ALTER TABLE linkmetadata ALTER COLUMN data TYPE jsonb USING data::jsonb; ~40ms
4 ALTER TABLE posts ALTER COLUMN props TYPE jsonb USING props::jsonb; ~1h
5 ALTER TABLE sessions ALTER COLUMN props TYPE jsonb USING props::jsonb; ~100ms
6 ALTER TABLE threads ALTER COLUMN participants TYPE jsonb USING participants::jsonb; ~43s
7 ALTER TABLE users ALTER COLUMN props TYPE jsonb USING props::jsonb; ~1s
8 ALTER TABLE users ALTER COLUMN notifyprops TYPE jsonb USING notifyprops::jsonb; ~1s
9 ALTER TABLE users ALTER COLUMN timezone TYPE jsonb USING timezone::jsonb; ~1s
10 UPDATE Posts SET RootId = ParentId WHERE RootId = '' AND RootId != ParentId; ~23s
11 ALTER TABLE Posts DROP COLUMN ParentId; ~3ms
12 UPDATE CommandWebhooks SET RootId = ParentId WHERE RootId = '' AND RootId != ParentId; ~1ms
13 ALTER TABLE CommandWebhooks DROP COLUMN ParentId; ~5ms
14 CREATE INDEX idx_posts_root_id_delete_at ON Posts (RootId, DeleteAt); ~1m30s
15 DROP INDEX idx_posts_root_id; ~120ms
16 CREATE INDEX idx_channels_team_id_display_name ON Channels (TeamId, DisplayName); ~60ms
17 CREATE INDEX idx_channels_team_id_type ON Channels (TeamId, Type); ~25ms
18 DROP INDEX idx_channels_team_id; ~4ms
19 CREATE INDEX idx_threads_channel_id_last_reply_at ON Threads (ChannelId, LastReplyAt); ~6s
20 DROP INDEX idx_threads_channel_id; ~20ms
21 CREATE INDEX idx_channelmembers_user_id_channel_id_last_viewed_at ON ChannelMembers (UserId, ChannelId, LastViewedAt); ~10s
22 CREATE INDEX idx_channelmembers_channel_id_scheme_guest_user_id ON ChannelMembers (ChannelId, SchemeGuest, UserId); ~10s
23 DROP INDEX idx_channelmembers_user_id; ~8ms
24 CREATE INDEX idx_status_status_dndendtime ON Status (Status, DNDEndTime); ~14ms
25 DROP INDEX idx_status_status; ~6ms
26 ALTER TABLE posts ALTER COLUMN fileids TYPE character varying(300); ~4ms
@bbodenmiller
Copy link

FYI additional explanations appear to exist at https://gist.github.com/streamer45/59b3582118913d4fc5e8ff81ea78b055.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment