Created
January 11, 2023 17:46
-
-
Save alecbw/c34ab9369fdcbf012c34666a8deea42b to your computer and use it in GitHub Desktop.
Clean a column of urls with mixed HTTP/HTTPS and www. prefixes
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
WITH sample_data AS ( | |
SELECT 'https://www.example.com' AS url | |
) | |
SELECT | |
CASE | |
WHEN split_part(split_part(url, '://', 1), 'www.', 1) = url THEN url | |
WHEN split_part(url, '://', 1) = url and split_part(url, 'www.', 1) != url THEN split_part(url, 'www.', 2) | |
WHEN split_part(url, '://', 1) != url and split_part(url, 'www.', 1) = url THEN split_part(url, '://', 2) | |
ELSE split_part(split_part(url, '://', 2), 'www.', 2) | |
END as result | |
FROM sample_data |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment