Skip to content

Instantly share code, notes, and snippets.

Embed
What would you like to do?
## Schema Question
### Example Schema
Assuming blog service.
```
CREATE TABLE tag (
id INT PRIMARY KEY,
value VARCHAR(32)
);
CREATE TABLE article (
id INT PRIMARY KEY,
content TEXT,
created_at DATETIME
);
CREATE TABLE tagging (
tag_id INT,
article_id INT
);
```
### Q1
Describe SQL to fetch the all of tags associated to the article wiht id is 1
- answer
```
SELECT tag.*
FROM tagging, tag
WHERE tagging.tag_id = tag.id
AND tagging.article_id = 1
;
```
- note
- It will scan the `tagging` table to fetch the records with
`article_id=1`. We can improve the query performacne to add index
to the `tagging` table.
### Q2 (if the candidate not use the index)
How dou you improve the query performance?
- answer
- Add index to `article_id`
```
CREATE INDEX idx_article_id_on_tagging
ON tagging(article_id);
```
or
- `article_id and tag_id` on tagging table. (better for the query performance)
```
CREATE INDEX idx_article_id_and_tag_id_on_tagging
ON tagging(article_id, tag_id);
```
- note
- Assuming `tagging` table is chosen as the outer table for the join
because we specified `tagging.id = 1` so that db enngine can
narrowing the fetch records of the `tagging` table.
- But it still need to scan the `tagging` table. Then we can improve the
I/O performance to eliminate the extra I/O access via using index
of that table.
- It is ok to add an index to `article_id` on the `tagging` table,
but we can more improve the performance by adding index to
`article_id and tag_id`. Then we can avoid access to the actual
`tag` table.
- Such a index is called `covering index`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment