Skip to content

Instantly share code, notes, and snippets.

@raldred
Created March 24, 2011 10:46
Show Gist options
  • Save raldred/884864 to your computer and use it in GitHub Desktop.
Save raldred/884864 to your computer and use it in GitHub Desktop.
pre-preparing inner join results for faster querying and pivoting of data
create temporary table if not exists imports
(
year int(4),
country_id int(11),
value int(30),
INDEX `index_on_year_and_country_id` (year,country_id)
);
truncate imports;
insert into imports (
select year,country_id,sum(value)
from import_exports
where direction = "I"
AND export_type_id IN (841,842)
group by year,country_id
);
create temporary table if not exists exports
(
year int(4),
country_id int(11),
value int(30),
INDEX `index_on_year_and_country_id` (year,country_id)
);
truncate exports;
insert into exports (
select year,country_id,sum(value)
from import_exports
where direction = "X"
AND export_type_id IN (841,842)
group by year,country_id
);
select
ie.year
,ie.country_id
,imports.value as import_value
,exports.value as export_value
from import_exports ie
inner join imports ON
imports.country_id = ie.country_id
AND imports.year = ie.year
inner join exports ON
exports.country_id = ie.country_id
AND exports.year = ie.year
where ie.export_type_id IN (841,842)
group by ie.year,ie.country_id;
@raldred
Copy link
Author

raldred commented Mar 24, 2011

I need the results in the follow format
where there is 1 row for each country for each year
the import and export (direction) values are summed into seperate columns:

The data is organised like:

| year | country_id | direction | export_type_id | value   |
| 1998 | 100        | I         | 2354           | 454511  |
| 1998 | 100        | I         | 2355           | 2223    |
| 1998 | 100        | I         | 2356           | 5415161 |
| 1998 | 100        | X         | 2354           | 34444   |
| 1998 | 100        | X         | 2355           | 213     |
| 1998 | 100        | X         | 2356           | 85161   |
| 1999 | 100        | I         | 2354           | 454511  |
| 1999 | 100        | I         | 2355           | 2223    |
| 1999 | 100        | I         | 2356           | 5415161 |
| 1999 | 100        | X         | 2354           | 34444   |
| 1999 | 100        | X         | 2355           | 213     |
| 1999 | 100        | X         | 2356           | 85161   |

The query produces something like this:

| year | country_id | sum(direction I) | sum(direction X) |
| 1998 | 100        | 5871895          | 119818           |
| 1999 | 100        | 5871895          | 119818           |

The data table contains over half a million rows.
bringing the query time down from 12s to <500ms

This is all down to the pre preparing of the results of the inner joins into temporary tables, this way we can add indexes to the results, making the join rapid.
If we just do a sub query in the inner joins mysql will create derived tables without any indexes.

I'm fairly happy with the speed improvement, however, I still want to get down to <10ms

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment