Skip to content

Instantly share code, notes, and snippets.

@cdimartino
Last active July 23, 2020 02:48
Show Gist options
  • Save cdimartino/7560ffff932bcb06832db4841bd5bdbf to your computer and use it in GitHub Desktop.
Save cdimartino/7560ffff932bcb06832db4841bd5bdbf to your computer and use it in GitHub Desktop.
UTF8MB4 not properly indexing my 💩

TLDR;

Unique index containing an utf8mb4 encoded string are 🤮 due to improperly indexed 💩 value.

Database State:

All collations appear to be utf8mb4 compatible. Note also that innodb_large_prefix = on.

mysql> select version();
+-----------+
| version() |
+-----------+
| 5.7.24    |
+-----------+

mysql> SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%' or Variable_name LIKE 'innodb_large_prefix';
+--------------------------+--------------------+
| Variable_name            | Value              |
+--------------------------+--------------------+
| character_set_client     | utf8mb4            |
| character_set_connection | utf8mb4            |
| character_set_database   | utf8mb4            |
| character_set_filesystem | binary             |
| character_set_results    | utf8mb4            |
| character_set_server     | utf8mb4            |
| character_set_system     | utf8               |
| collation_connection     | utf8mb4_general_ci |
| collation_database       | utf8mb4_unicode_ci |
| collation_server         | utf8mb4_unicode_ci |
| innodb_large_prefix      | ON                 |
+--------------------------+--------------------+

Table creation with UTF8MB4 support

CREATE TABLE `test_unique_utf8_indices` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `value` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `index_test_unique_utf8_indices_on_value` (`value`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;

Add some 💩 to it:

mysql> insert into test_unique_utf8_indices (value) values ("💩");
Query OK, 1 row affected (0.00 sec)

mysql> select * from test_unique_utf8_indices;
+----+-------+
| id | value |
+----+-------+
|  1 | 💩      |
+----+-------+
1 row in set (0.00 sec)

mysql> insert into test_unique_utf8_indices (value) values ("🥰");
ERROR 1062 (23000): Duplicate entry '?' for key 'index_test_unique_utf8_indices_on_value'
mysql> insert into test_unique_utf8_indices (value) VALUES ("🤮");
ERROR 1062 (23000): Duplicate entry '?' for key 'index_test_unique_utf8_indices_on_value'
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment