SQL ServerでGROUP BYを指定した場合、データのグルーピングのために暗黙的にソート処理が発生するが、 複数列でグルーピングした際のソート順は、GROUP BY句で指定した列順とは無関係のようなので動作を確認した。
- Windows 7 Professional 64bit
- Microsoft SQL Server 2012 (SP1) - 11.0.3128.0 (X64)
-
OracleのHRスキーマのデータを移植し、employees表を使用
-
employee_id列のPRIMARY KEYはクラスタ化索引
-
first_name列、last_name列の索引は表スキャンをさせるためにDROP済み
-
hire_date列の他にもう1つ日付データが必要だったためdate2列を末尾に追加済み
1> select * from employees 2> go
employee_id first_name last_name email phone_number hire_date job_id salary commission_pct manager_id department_id date2 ----------- -------------------- ------------------------- ------------------------- -------------------- ---------------- ---------- ----------- -------------- ----------- ------------- ---------------- 100 Steven King SKING 515.123.4567 1987-06-17 AD_PRES 24000 NULL NULL 90 2013-10-23 101 Neena Kochhar NKOCHHAR 515.123.4568 1989-09-21 AD_VP 17000 NULL 100 90 2013-10-23 102 Lex De Haan LDEHAAN 515.123.4569 1993-01-13 AD_VP 17000 NULL 100 90 2013-10-23 103 Alexander Hunold AHUNOLD 590.423.4567 1990-01-03 IT_PROG 9000 NULL 102 60 2013-10-23 104 Bruce Ernst BERNST 590.423.4568 1991-05-21 IT_PROG 6000 NULL 103 60 2013-10-23 105 David Austin DAUSTIN 590.423.4569 1997-06-25 IT_PROG 4800 NULL 103 60 2013-10-23 106 Valli Pataballa VPATABAL 590.423.4560 1998-02-05 IT_PROG 4800 NULL 103 60 2013-10-23 107 Diana Lorentz DLORENTZ 590.423.5567 1999-02-07 IT_PROG 4200 NULL 103 60 2013-10-23 108 Nancy Greenberg NGREENBE 515.124.4569 1994-08-17 FI_MGR 12000 NULL 101 100 2013-10-23 109 Daniel Faviet DFAVIET 515.124.4169 1994-08-16 FI_ACCOUNT 9000 NULL 108 100 2013-10-23 110 John Chen JCHEN 515.124.4269 1997-09-28 FI_ACCOUNT 8200 NULL 108 100 2013-10-23 111 Ismael Sciarra ISCIARRA 515.124.4369 1997-09-30 FI_ACCOUNT 7700 NULL 108 100 2013-10-23 112 Jose Manuel Urman JMURMAN 515.124.4469 1998-03-07 FI_ACCOUNT 7800 NULL 108 100 2013-10-23 113 Luis Popp LPOPP 515.124.4567 1999-12-07 FI_ACCOUNT 6900 NULL 108 100 2013-10-23 114 Den Raphaely DRAPHEAL 515.127.4561 1994-12-07 PU_MAN 11000 NULL 100 30 2013-10-23 115 Alexander Khoo AKHOO 515.127.4562 1995-05-18 PU_CLERK 3100 NULL 114 30 2013-10-23 116 Shelli Baida SBAIDA 515.127.4563 1997-12-24 PU_CLERK 2900 NULL 114 30 2013-10-23 117 Sigal Tobias STOBIAS 515.127.4564 1997-07-24 PU_CLERK 2800 NULL 114 30 2013-10-23 118 Guy Himuro GHIMURO 515.127.4565 1998-11-15 PU_CLERK 2600 NULL 114 30 2013-10-23 119 Karen Colmenares KCOLMENA 515.127.4566 1999-08-10 PU_CLERK 2500 NULL 114 30 2013-10-23 120 Matthew Weiss MWEISS 650.123.1234 1996-07-18 ST_MAN 8000 NULL 100 50 2013-10-23 121 Adam Fripp AFRIPP 650.123.2234 1997-04-10 ST_MAN 8200 NULL 100 50 2013-10-23 122 Payam Kaufling PKAUFLIN 650.123.3234 1995-05-01 ST_MAN 7900 NULL 100 50 2013-10-23 123 Shanta Vollman SVOLLMAN 650.123.4234 1997-10-10 ST_MAN 6500 NULL 100 50 2013-10-23 124 Kevin Mourgos KMOURGOS 650.123.5234 1999-11-16 ST_MAN 5800 NULL 100 50 2013-10-23 125 07ia Nayer JNAYER 650.124.1214 1997-07-16 ST_CLERK 3200 NULL 120 50 2013-10-23 126 Irene Mikkilineni IMIKKILI 650.124.1224 1998-09-28 ST_CLERK 2700 NULL 120 50 2013-10-23 127 James Landry JLANDRY 650.124.1334 1999-01-14 ST_CLERK 2400 NULL 120 50 2013-10-23 128 Steven 03kle S03KLE 650.124.1434 2000-03-08 ST_CLERK 2200 NULL 120 50 2013-10-23 129 Laura Bissot LBISSOT 650.124.5234 1997-08-20 ST_CLERK 3300 NULL 121 50 2013-10-23 130 Mozhe Atkinson MATKINSO 650.124.6234 1997-10-30 ST_CLERK 2800 NULL 121 50 2013-10-23 131 James 03low JAMRLOW 650.124.7234 1997-02-16 ST_CLERK 2500 NULL 121 50 2013-10-23 132 TJ Olson TJOLSON 650.124.8234 1999-04-10 ST_CLERK 2100 NULL 121 50 2013-10-23 133 Jason Mallin JMALLIN 650.127.1934 1996-06-14 ST_CLERK 3300 NULL 122 50 2013-10-23 134 Michael Rogers MROGERS 650.127.1834 1998-08-26 ST_CLERK 2900 NULL 122 50 2013-10-23 135 Ki Gee KGEE 650.127.1734 1999-12-12 ST_CLERK 2400 NULL 122 50 2013-10-23 136 Hazel Philtanker HPHILTAN 650.127.1634 2000-02-06 ST_CLERK 2200 NULL 122 50 2013-10-23 137 Renske Ladwig RLADWIG 650.121.1234 1995-07-14 ST_CLERK 3600 NULL 123 50 2013-10-23 138 Stephen Stiles SSTILES 650.121.2034 1997-10-26 ST_CLERK 3200 NULL 123 50 2013-10-23 139 John Seo JSEO 650.121.2019 1998-02-12 ST_CLERK 2700 NULL 123 50 2013-10-23 140 Joshua Patel JPATEL 650.121.1834 1998-04-06 ST_CLERK 2500 NULL 123 50 2013-10-23 141 Trenna Rajs TRAJS 650.121.8009 1995-10-17 ST_CLERK 3500 NULL 124 50 2013-10-23 142 Curtis Davies CDAVIES 650.121.2994 1997-01-29 ST_CLERK 3100 NULL 124 50 2013-10-23 143 Randall Matos RMATOS 650.121.2874 1998-03-15 ST_CLERK 2600 NULL 124 50 2013-10-23 144 Peter Vargas PVARGAS 650.121.2004 1998-07-09 ST_CLERK 2500 NULL 124 50 2013-10-23 145 John Russell JRUSSEL 011.44.1344.429268 1996-10-01 SA_MAN 14000 0 100 80 2013-10-23 146 Karen Partners KPARTNER 011.44.1344.467268 1997-01-05 SA_MAN 13500 0 100 80 2013-10-23 147 Alberto Errazuriz AERRAZUR 011.44.1344.429278 1997-03-10 SA_MAN 12000 0 100 80 2013-10-23 148 Gerald Cambrault GCAMBRAU 011.44.1344.619268 1999-10-15 SA_MAN 11000 0 100 80 2013-10-23 149 Eleni Zlotkey EZLOTKEY 011.44.1344.429018 2000-01-29 SA_MAN 10500 0 100 80 2013-10-23 150 Peter Tucker PTUCKER 011.44.1344.129268 1997-01-30 SA_REP 10000 0 145 80 2013-10-23 151 David Bernstein DBERNSTE 011.44.1344.345268 1997-03-24 SA_REP 9500 0 145 80 2013-10-23 152 Peter Hall PHALL 011.44.1344.478968 1997-08-20 SA_REP 9000 0 145 80 2013-10-23 153 Christopher Olsen COLSEN 011.44.1344.498718 1998-03-30 SA_REP 8000 0 145 80 2013-10-23 154 Nanette Cambrault NCAMBRAU 011.44.1344.987668 1998-12-09 SA_REP 7500 0 145 80 2013-10-23 155 Oliver Tuvault OTUVAULT 011.44.1344.486508 1999-11-23 SA_REP 7000 0 145 80 2013-10-23 156 01ette King JKING 011.44.1345.429268 1996-01-30 SA_REP 10000 0 146 80 2013-10-23 157 Patrick Sully PSULLY 011.44.1345.929268 1996-03-04 SA_REP 9500 0 146 80 2013-10-23 158 Allan McEwen AMCEWEN 011.44.1345.829268 1996-08-01 SA_REP 9000 0 146 80 2013-10-23 159 Lindsey Smith LSMITH 011.44.1345.729268 1997-03-10 SA_REP 8000 0 146 80 2013-10-23 160 Louise Doran LDORAN 011.44.1345.629268 1997-12-15 SA_REP 7500 0 146 80 2013-10-23 161 Sarath Sewall SSEWALL 011.44.1345.529268 1998-11-03 SA_REP 7000 0 146 80 2013-10-23 162 Clara Vishney CVISHNEY 011.44.1346.129268 1997-11-11 SA_REP 10500 0 147 80 2013-10-23 163 Danielle Greene DGREENE 011.44.1346.229268 1999-03-19 SA_REP 9500 0 147 80 2013-10-23 164 Mattea 03vins M03VINS 011.44.1346.329268 2000-01-24 SA_REP 7200 0 147 80 2013-10-23 165 David Lee DLEE 011.44.1346.529268 2000-02-23 SA_REP 6800 0 147 80 2013-10-23 166 Sundar Ande SANDE 011.44.1346.629268 2000-03-24 SA_REP 6400 0 147 80 2013-10-23 167 Amit Banda ABANDA 011.44.1346.729268 2000-04-21 SA_REP 6200 0 147 80 2013-10-23 168 Lisa Ozer LOZER 011.44.1343.929268 1997-03-11 SA_REP 11500 0 148 80 2013-10-23 169 Harrison Bloom HBLOOM 011.44.1343.829268 1998-03-23 SA_REP 10000 0 148 80 2013-10-23 170 Tayler Fox TFOX 011.44.1343.729268 1998-01-24 SA_REP 9600 0 148 80 2013-10-23 171 William Smith WSMITH 011.44.1343.629268 1999-02-23 SA_REP 7400 0 148 80 2013-10-23 172 Elizabeth Bates EBATES 011.44.1343.529268 1999-03-24 SA_REP 7300 0 148 80 2013-10-23 173 Sundita Ku03 SKU03 011.44.1343.329268 2000-04-21 SA_REP 6100 0 148 80 2013-10-23 174 Ellen Abel EABEL 011.44.1644.429267 1996-05-11 SA_REP 11000 0 149 80 2013-10-23 175 Alyssa Hutton AHUTTON 011.44.1644.429266 1997-03-19 SA_REP 8800 0 149 80 2013-10-23 176 Jonathon Taylor JTAYLOR 011.44.1644.429265 1998-03-24 SA_REP 8600 0 149 80 2013-10-23 177 Jack Livingston JLIVINGS 011.44.1644.429264 1998-04-23 SA_REP 8400 0 149 80 2013-10-23 178 Kimberely Grant KGRANT 011.44.1644.429263 1999-05-24 SA_REP 7000 0 149 NULL 2013-10-23 179 Charles Johnson CJOHNSON 011.44.1644.429262 2000-01-04 SA_REP 6200 0 149 80 2013-10-23 180 Winston Taylor WTAYLOR 650.507.9876 1998-01-24 SH_CLERK 3200 NULL 120 50 2013-10-23 181 Jean Fleaur JFLEAUR 650.507.9877 1998-02-23 SH_CLERK 3100 NULL 120 50 2013-10-23 182 03tha Sullivan MSULLIVA 650.507.9878 1999-06-21 SH_CLERK 2500 NULL 120 50 2013-10-23 183 Girard Geoni GGEONI 650.507.9879 2000-02-03 SH_CLERK 2800 NULL 120 50 2013-10-23 184 Nandita Sarchand NSARCHAN 650.509.1876 1996-01-27 SH_CLERK 4200 NULL 121 50 2013-10-23 185 Alexis Bull ABULL 650.509.2876 1997-02-20 SH_CLERK 4100 NULL 121 50 2013-10-23 186 07ia Dellinger JDELLING 650.509.3876 1998-06-24 SH_CLERK 3400 NULL 121 50 2013-10-23 187 Anthony Cabrio ACABRIO 650.509.4876 1999-02-07 SH_CLERK 3000 NULL 121 50 2013-10-23 188 Kelly Chung KCHUNG 650.505.1876 1997-06-14 SH_CLERK 3800 NULL 122 50 2013-10-23 189 Jennifer Dilly JDILLY 650.505.2876 1997-08-13 SH_CLERK 3600 NULL 122 50 2013-10-23 190 Timothy Gates TGATES 650.505.3876 1998-07-11 SH_CLERK 2900 NULL 122 50 2013-10-23 191 Randall Perkins RPERKINS 650.505.4876 1999-12-19 SH_CLERK 2500 NULL 122 50 2013-10-23 192 Sarah Bell SBELL 650.501.1876 1996-02-04 SH_CLERK 4000 NULL 123 50 2013-10-23 193 Britney Everett BEVERETT 650.501.2876 1997-03-03 SH_CLERK 3900 NULL 123 50 2013-10-23 194 Samuel McCain SMCCAIN 650.501.3876 1998-07-01 SH_CLERK 3200 NULL 123 50 2013-10-23 195 Vance Jones VJONES 650.501.4876 1999-03-17 SH_CLERK 2800 NULL 123 50 2013-10-23 196 Alana Walsh AWALSH 650.507.9811 1998-04-24 SH_CLERK 3100 NULL 124 50 2013-10-23 197 Kevin Feeney KFEENEY 650.507.9822 1998-05-23 SH_CLERK 3000 NULL 124 50 2013-10-23 198 Donald OConnell DOCONNEL 650.507.9833 1999-06-21 SH_CLERK 2600 NULL 124 50 2013-10-23 199 Douglas Grant DGRANT 650.507.9844 2000-01-13 SH_CLERK 2600 NULL 124 50 2013-10-23 200 Jennifer Whalen JWHALEN 515.123.4444 1987-09-17 AD_ASST 4400 NULL 101 10 2013-10-23 201 Michael Hartstein MHARTSTE 515.123.5555 1996-02-17 MK_MAN 13000 NULL 100 20 2013-10-23 202 Pat Fay PFAY 603.123.6666 1997-08-17 MK_REP 6000 NULL 201 20 2013-10-23 203 Susan Mavris SMAVRIS 515.123.7777 1994-06-07 HR_REP 6500 NULL 101 40 2013-10-23 204 Hermann Baer HBAER 515.123.8888 1994-06-07 PR_REP 10000 NULL 101 70 2013-10-23 205 Shelley Higgins SHIGGINS 515.123.8080 1994-06-07 AC_MGR 12000 NULL 101 110 2013-10-23 206 William Gietz WGIETZ 515.123.8181 1994-06-07 AC_ACCOUNT 8300 NULL 205 110 2013-10-23
検証で使用した列のカーディナリティ
1> SELECT count(distinct last_name) as last_name,
2> count(distinct first_name) as first_name,
3> count(distinct job_id) as job_id,
4> count(distinct salary) as salary,
5> count(distinct department_id) as department_id,
6> count(distinct manager_id) as manager_id,
7> count(distinct hire_date) as hire_date,
8> count(distinct date2) as date2
9> FROM employees;
10> go
last_name first_name job_id salary department_id manager_id hire_date date2
----------- ----------- ----------- ----------- ------------- ----------- ----------- -----------
102 91 19 57 11 18 98 1
SELECT count(*)
FROM employees
GROUP BY last_name, first_name;
Sort(ORDER BY:([first_name] ASC, [last_name] ASC))
※カーディナリティ低→高
SELECT count(*)
FROM employees
GROUP BY first_name, job_id;
Sort(ORDER BY:([job_id] ASC, [first_name] ASC))
※カーディナリティ低→高
SELECT count(*)
FROM employees
GROUP BY hire_date, date2;
Sort(ORDER BY:([date2] ASC, [hire_date] ASC))
※カーディナリティ低→高
SELECT count(*)
FROM employees
GROUP BY department_id, last_name;
Sort(ORDER BY:([last_name] ASC, [department_id] ASC))
※文字(高)→数値(低)
SELECT count(*)
FROM employees
GROUP BY salary, job_id;
Sort(ORDER BY:([job_id] ASC, [salary] ASC))
文字(低)→数値(高)
SELECT count(*)
FROM employees
GROUP BY department_id, job_id;
Sort(ORDER BY:([job_id] ASC, [department_id] ASC))
※文字(高)→数値(低)
数値+文字列の場合、カーディナリティに関わらず常に文字列を優先キーにしているように見えるが、
以下のSQLを確認中、まれにソート順が入れ替わる現象が発生した。
今のところ再現条件は確認できていない。
SELECT department_id, job_id, AVG(salary) AS average
FROM employees
GROUP BY department_id, job_id;
SELECT count(*)
FROM employees
GROUP BY hire_date, job_id;
Sort(ORDER BY:([job_id] ASC, [hire_date] ASC))
※文字(低)→日付(高)
SELECT count(*)
FROM employees
GROUP BY hire_date, last_name;
Sort(ORDER BY:([last_name] ASC, [hire_date] ASC))
※文字(高)→日付(低)
文字列+日付では、文字列が優先キーになっているように見える。
しかし、数値+文字列のようにソート順が入れ替わるケースがあるのかもしれない。(未確認)
SELECT count(*)
FROM employees
GROUP BY hire_date, department_id;
Sort(ORDER BY:([department_id] ASC, [hire_date] ASC))
※数値(低)→日付(高)
SELECT count(*)
FROM employees
GROUP BY date2, department_id;
Sort(ORDER BY:([department_id] ASC, [date2] ASC))
※数値(高)→日付(低)
数値+日付では、数値が優先キーになっているように見える。
- GROUP BY句に指定された複数列が同じデータ型の場合
- カーディナリティが低い方から順にソートされる
- GROUP BY句に指定された複数列のデータ型が異なる場合
- 文字 > 数値 > 日付 の順でソートされる(ケースが多い)
- 他の要因でソート順が変わるケースが存在する
- バッファ・キャッシュの状態?
- 似たSQLとカーソル共有されている?
- 以前のソートが影響を与えている?
- 実は完全に気まぐれとか?
- データ型による優先順位より、列統計のヒストグラムを見ている可能性が高いか?
- sp_helpstats 表名, 'ALL' で統計のリストを確認
- 索引がある列は索引名が統計名になっている
- _WA_Sys_nnnnnnnn_xxxxxxxx はWHEREで使用されたことで自動作成された統計の名前
- 一度自動作成された統計は、他の統計と同じく自動更新の対象となる
- dbcc show_statistics(表名, 統計名) でヒストグラムなどを参照可能
- CREATE STATISTICS 統計名 ON 表名(列名) で統計を手動作成可能
- DROP STATISTICS 表名.統計名 で統計を削除可能
- 列統計の有無で暗黙ソートの順が変動するか要検証