Skip to content

Instantly share code, notes, and snippets.

Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save onhate/e8e00221ac4e7957dbcbd10b003fe50a to your computer and use it in GitHub Desktop.
Save onhate/e8e00221ac4e7957dbcbd10b003fe50a to your computer and use it in GitHub Desktop.
SQL Sample to remove duplicate events by event_id and collector_tstamp on PostgreSQL
DELETE
FROM atomic.events
WHERE ctid IN (SELECT ctid
FROM (SELECT ctid, ROW_NUMBER() OVER (PARTITION BY event_id, collector_tstamp ) AS rn
FROM atomic.events) t
WHERE rn > 1);
DELETE
FROM atomic.io_snowplow_foundation_content_1
WHERE ctid IN (SELECT ctid
FROM (SELECT ctid, ROW_NUMBER() OVER (PARTITION BY root_id, root_tstamp ) AS rn
FROM atomic.io_snowplow_foundation_content_1) t
WHERE rn > 1);
DELETE
FROM atomic.com_snowplowanalytics_snowplow_timing_1
WHERE ctid IN (SELECT ctid
FROM (SELECT ctid, ROW_NUMBER() OVER (PARTITION BY root_id, root_tstamp ) AS rn
FROM atomic.com_snowplowanalytics_snowplow_timing_1) t
WHERE rn > 1);
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment