Skip to content

Instantly share code, notes, and snippets.

@jometho
Created May 24, 2018 10:02
Show Gist options
  • Save jometho/7732ee518f8752e6320618c0ccad3af0 to your computer and use it in GitHub Desktop.
Save jometho/7732ee518f8752e6320618c0ccad3af0 to your computer and use it in GitHub Desktop.
Handling duplicates records in postgresql
--Given table coverage with field name which is guaranteed uniqueness
-- You can select duplicated records with this "sequel" on the table
select coverage.name, count(*) from catalog.coverage GROUP BY coverage.name
HAVING count(*) > 1;
--Same scenario now say you want to delete the redundant tables and remain with only one record for each
--You can use this SQL, where name is the unique column
--ctid is a postgresql system columns for the physical row location of a record
DELETE FROM catalog.coverage a USING (
SELECT MIN(ctid) as ctid, name
FROM catalog.coverage
GROUP BY name HAVING COUNT(*) > 1
) b
WHERE a.name = b.name
AND a.ctid <> b.ctid
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment