Hive SQL: distribute rows by their id/primary key, then sort them within each group
```
device_id rank
12 1
12 2
12 3
10 1
10 2
```
For rows like these, use the SQL below to send each ``device_id`` to a single reducer and sort its rows by ``rank``:
```
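-- `my_table` is a placeholder name; `table` itself is a reserved word in Hive
select
    device_id,
    rank
from
    my_table
distribute by
    device_id      -- rows with the same device_id go to the same reducer
sort by
    device_id,     -- keeps each device's rows together within a reducer
    rank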
```
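To see why the sort key matters, here is a rough Python simulation of the semantics, not Hive itself. It assumes two reducers and uses simple modulo partitioning as a stand-in for Hive's hash partitioner. The helper name ``distribute_and_sort`` is made up for this sketch. ``distribute by`` decides which reducer a row goes to, and ``sort by`` only orders rows inside each reducer, so when two device ids share a reducer, sorting by ``rank`` alone interleaves them.

```python
from collections import defaultdict


def distribute_and_sort(rows, num_reducers, distribute_key, sort_key):
    """Simulate DISTRIBUTE BY + SORT BY: partition rows, then sort each partition."""
    reducers = defaultdict(list)
    for row in rows:
        # Assumption: modulo partitioning stands in for Hive's hash partitioner.
        reducers[distribute_key(row) % num_reducers].append(row)
    # SORT BY orders rows within each reducer only, never across reducers.
    return {r: sorted(part, key=sort_key) for r, part in sorted(reducers.items())}


# (device_id, rank) pairs from the sample data, in shuffled order.
rows = [(12, 2), (10, 1), (12, 3), (10, 2), (12, 1)]

# Both device ids are even, so with 2 reducers they land on reducer 0 together.
by_rank = distribute_and_sort(rows, 2, lambda r: r[0], lambda r: r[1])
by_device_then_rank = distribute_and_sort(rows, 2, lambda r: r[0], lambda r: (r[0], r[1]))

print(by_rank[0])              # devices 10 and 12 are interleaved by rank
print(by_device_then_rank[0])  # each device's rows are contiguous and ordered by rank
```

With more reducers, or a different hash, the two devices may land on separate reducers and the problem stays hidden, which is why sorting by ``rank`` alone can look correct on small samples.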
Also, I'm not yet sure what the difference is between ``distribute by``, ``group by``, and ``cluster by``; I will create another gist about it later.