Skip to content

Instantly share code, notes, and snippets.

Show Gist options
  • Save Martins6/8d594e11bd60c6c41a171a4fa73ac604 to your computer and use it in GitHub Desktop.
Save Martins6/8d594e11bd60c6c41a171a4fa73ac604 to your computer and use it in GitHub Desktop.
Cleaning or validating emails and telephones numbers with RegEx with SQL in Presto DB. (Portuguese Version)
SELECT *,
CASE
WHEN e.email like '%_@_%_.__%'
AND NOT regexp_like(coalesce(trim(regexp_extract(e.email, '.+?(?=@)')), '$'),
'[^a-zA-Z0-9\._-\ã\õ\ẽ\é\ó\ú]')
THEN lower(e.email)
ELSE 'N/A'
END
) AS email,
CASE
WHEN regexp_like(regexp_replace(e.telephone, '[^0-9]'), '^[0-9]')
AND length(regexp_replace(e.telephone, '[^0-9]')) > 7
AND NOT regexp_like(regexp_replace(e.telephone, '[^0-9]'), '\b(\d)\1+\b')
THEN regexp_replace(e.telephone, '[^0-9]')
ELSE 'N/A'
END
) AS telephone,
FROM database.table as e
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment