Skip to content

Instantly share code, notes, and snippets.

@kiurtis
Last active February 26, 2021 14:54
Show Gist options
  • Save kiurtis/d3aaf428406f808208637e352c8116eb to your computer and use it in GitHub Desktop.
Save kiurtis/d3aaf428406f808208637e352c8116eb to your computer and use it in GitHub Desktop.
------------ Duplicates management ------------
-- Show the duplicate rows in <table>
select * from <table> ou
where (select count(*) from <table> inr
where inr.id = ou.id) > 1
order by id;
---- OR
select id, count(*)
from <table>
group by id
HAVING count(*) > 1
-- Drop the duplicate rows (based on their values in <duplicating_column>). Keep the first row when ordering by sorting_columns.
-- If one need to order columns in different order, you need to tweak the snippet using col1 DESC, col2 ASC.
-- Remarks: this won't work if there is null values <sorting_columns>. Coalesce() can be used in this case.
DELETE FROM <table>
where <sorting_columns> NOT IN
SELECT <sorting_columns> FROM
(SELECT *,ROW_NUMBER() OVER
(PARTITION BY <columns_to_deduplicate> ORDER BY <sorting_columns> DESC) rn
FROM <table>
) tmp WHERE rn = 1;
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment