@simonklee
Last active August 29, 2015 13:56
Domain whitelist
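import re


class DomainWhitelist(object):
    # Finds anything in the text that looks like a domain name.
    url_re = re.compile(
        r'([a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,6})',
        re.IGNORECASE)
    # Whitelist pattern for kogama.com and kogama.com.br, with an
    # optional scheme and an optional short prefix.
    kogama_re = re.compile(
        r'^(http[s]?://)?([a-z0-9-.]{2,8})?kogama.com(.br)?',
        re.IGNORECASE)

    def check(self, value):
        # findall() yields one tuple of groups per match; the full
        # match comes first, so the joined string starts with it.
        for url in self.url_re.findall(value):
            if not self.kogama_re.match(''.join(url)):
                return False
        return True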
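
A quick usage sketch of the class above may help when reading the discussion that follows. The class body is repeated only so the snippet runs on its own; the sample comments are made up for illustration.

```python
import re


class DomainWhitelist(object):
    url_re = re.compile(
        r'([a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,6})',
        re.IGNORECASE)
    kogama_re = re.compile(
        r'^(http[s]?://)?([a-z0-9-.]{2,8})?kogama.com(.br)?',
        re.IGNORECASE)

    def check(self, value):
        for url in self.url_re.findall(value):
            if not self.kogama_re.match(''.join(url)):
                return False
        return True


whitelist = DomainWhitelist()

# Illustrative inputs, not taken from the gist.
print(whitelist.check("play at www.kogama.com today"))  # True
print(whitelist.check("visit evil.com for free gold"))  # False
print(whitelist.check("no links here"))                 # True
```

A text with no domain-like token passes, because the loop never finds anything to reject.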
@alexandrebini

@simonz05 take a look: http://rubular.com/r/LFWxz7kpgd

I just changed the "dot" part:

r'^(http[s]?://)?([a-z0-9]{2,8}.)?kogama.com(.br)?'
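
One detail that applies to both the gist pattern and this variant: an unescaped `.` matches any character, and `match()` only anchors the start of the string. The sketch below compares the two patterns with a third, stricter one. The name `escaped` and its pattern are assumptions made for illustration, not something proposed in the thread, and the sample strings are invented.

```python
import re

# Pattern from the gist.
original = re.compile(
    r'^(http[s]?://)?([a-z0-9-.]{2,8})?kogama.com(.br)?', re.IGNORECASE)
# Variant from the comment above.
changed = re.compile(
    r'^(http[s]?://)?([a-z0-9]{2,8}.)?kogama.com(.br)?', re.IGNORECASE)
# Hypothetical stricter variant: dots escaped, end of string anchored.
escaped = re.compile(
    r'^(https?://)?([a-z0-9]{2,8}\.)?kogama\.com(\.br)?$', re.IGNORECASE)

samples = [
    "www.kogama.com",
    "kogama.com.br",
    "notkogama.com",
    "kogamaxcom",
    "kogama.com.evil.com",
]

results = {}
for s in samples:
    results[s] = (bool(original.match(s)),
                  bool(changed.match(s)),
                  bool(escaped.match(s)))
    print(s, results[s])
```

With these samples, the first two patterns accept all five strings, while the escaped one accepts only `www.kogama.com` and `kogama.com.br`. The stricter variant allows a single subdomain label only, which may or may not be what the whitelist needs.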

@kogama

kogama commented Feb 6, 2014

The modified regex only tests whether a link is on the whitelist. It is the first regex that decides whether something is a link at all, so that is the one that needs to handle every URL case. This is the one: r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*(),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+'
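
The two-stage idea described here can be sketched as follows: one pattern finds links, a second decides whether each found link is allowed. The function name `check_links` and the way the two patterns are wired together are assumptions for illustration; the thread does not show this exact code.

```python
import re

# Stage 1: the link-detection pattern quoted in the comment above.
link_re = re.compile(
    r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*(),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+')
# Stage 2: the whitelist pattern from the gist.
kogama_re = re.compile(
    r'^(http[s]?://)?([a-z0-9-.]{2,8})?kogama.com(.br)?', re.IGNORECASE)


def check_links(value):
    """Hypothetical helper: True when every detected link is whitelisted."""
    links = link_re.findall(value)  # no capturing groups, so full matches
    return all(kogama_re.match(link) for link in links)


print(check_links("see http://www.kogama.com/games"))  # True
print(check_links("go to http://evil.com now"))        # False
print(check_links("go to www.evil.com now"))           # True
```

The last call returns True because stage 1 never detects the link, so stage 2 has nothing to reject. That is why the detection pattern has to cover every form a URL can take.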

@alexandrebini

Got it.

The one you sent does not match www.domain.com without http(s)://: http://rubular.com/r/FqYasTLz5l

This one does: http://rubular.com/r/FAsg1mTTnB

r'(http[s]?://|www)(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+'
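
A small comparison of the two detection patterns, written as a sketch. The helper `find_links` and the sample text are made up here. Because the second pattern contains a capturing group, `findall()` would return only that group, so the helper uses `finditer()` and `group(0)` to get the full match.

```python
import re

with_scheme = re.compile(
    r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*(),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+')
scheme_or_www = re.compile(
    r'(http[s]?://|www)(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+')


def find_links(pattern, text):
    """Hypothetical helper: return the full text of every match."""
    return [m.group(0) for m in pattern.finditer(text)]


text = "try www.domain.com or http://domain.com/page"

print(find_links(with_scheme, text))
# ['http://domain.com/page']
print(find_links(scheme_or_www, text))
# ['www.domain.com', 'http://domain.com/page']
print(find_links(scheme_or_www, "domain.com"))
# []
```

A bare domain with neither a scheme nor a `www` prefix is still not detected by either pattern.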

@simonklee
Author

I think I'll simplify it to just look for domain names. It should solve the issue.

[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,6}

http://rubular.com/r/GOuWVw98Qi
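
To illustrate what the domain-only pattern picks up, here is a minimal sketch. The sample text is invented, and `re.IGNORECASE` is assumed because the gist compiles its patterns that way.

```python
import re

domain_re = re.compile(
    r'[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,6}', re.IGNORECASE)

text = "domain.com, www.domain.com and http://sub.domain.co.uk/page"

# finditer() + group(0) gives the whole match even though the
# pattern contains a capturing group.
found = [m.group(0) for m in domain_re.finditer(text)]
print(found)
# ['domain.com', 'www.domain.com', 'sub.domain.co.uk']
```

The scheme and the path are ignored; only the host part is extracted, whether or not the link starts with http(s):// or www.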

@alexandrebini

We may have problems with usernames, though, won't we? http://rubular.com/r/nFfQpESoPP

@simonklee
Author

Yes and no. You won't be able to write a comment with the username "simon.kogama" in it. However, that is not something we can solve, since domain names and usernames follow some of the same rules.

However, we do not run these validations on usernames directly. Usernames are checked at creation against a much stricter regex, which limits the possibilities. You can create a username like xxx.com, but that is about it.
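
The username collision can be reproduced with a short sketch. The function `comment_allowed` is a hypothetical stand-in for the gist's `check` method; it uses `group(0)` instead of joining the `findall()` tuple, which is a simplification. The stricter username regex mentioned above is not shown in the thread, so it is not modelled here.

```python
import re

# Domain-only detection pattern and whitelist pattern from the gist.
domain_re = re.compile(
    r'[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,6}', re.IGNORECASE)
kogama_re = re.compile(
    r'^(http[s]?://)?([a-z0-9-.]{2,8})?kogama.com(.br)?', re.IGNORECASE)


def comment_allowed(text):
    """Hypothetical helper: True when every domain-like token is whitelisted."""
    return all(kogama_re.match(m.group(0)) for m in domain_re.finditer(text))


print(comment_allowed("nice level, simon.kogama!"))  # False
print(comment_allowed("nice level, simon!"))         # True
print(comment_allowed("ask xxx.com about it"))       # False
print(comment_allowed("see www.kogama.com"))         # True
```

"simon.kogama" is rejected because "kogama" fits the `[a-z]{2,6}` top-level-domain part of the detection pattern, so the whole username looks like a domain that is not on the whitelist.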
