@Codehunter-py
Created February 20, 2022 19:05
The check_web_address function checks if the text passed qualifies as a top-level web address, meaning that it contains alphanumeric characters (which includes letters, numbers, and underscores), as well as periods, dashes, and a plus sign, followed by a period and a character-only top-level domain such as ".com", ".info", ".edu", etc.
import re

def check_web_address(text):
    # One or more non-whitespace characters, a period, then a letters-only TLD
    pattern = r"^\S+\.[a-zA-Z]+$"
    result = re.search(pattern, text)
    return result is not None

print(check_web_address("gmail.com")) # True
print(check_web_address("www@google")) # False
print(check_web_address("www.Coursera.org")) # True
print(check_web_address("web-address.com/homepage")) # False
print(check_web_address("My_Favorite-Blog.US")) # True
@louiswordz
I had to use a backslash instead. Thanks for the clue!

@Codehunter-py
Author

Thanks for the info!

@MrkTheCoder

MrkTheCoder commented Nov 28, 2023

Hello

When solving a problem, we should pay attention to:

  1. the requirements, and
  2. thinking of more test cases!

Therefore, to satisfy these 4 criteria:

"contains alphanumeric characters (which includes letters, numbers, and underscores), as well as periods, dashes, and a plus sign"

Criteria for start of text: \S is not the correct special sequence to use here, since it is equal to [^ \t\n\r\f\v], which means it also accepts characters such as ! ? # $ % ^ & * and so on. The correct special sequence here is \w, which for ASCII text is equal to [a-zA-Z0-9_], together with the 3 other requested characters (period, dash, and plus sign). We have to put them all inside a character class: [\w.+-]. Please check the official Python documentation on special sequences.
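
To make the difference concrete, here is a small sketch comparing the gist's original \S-based pattern with a character-class version. The names loose and strict and the sample "a#b$c.org" are only illustrative and are not part of the exercise:

```python
import re

# Pattern from the original gist: any non-whitespace characters before the TLD
loose = re.compile(r"^\S+\.[a-zA-Z]+$")
# Explicit character class: word characters, period, plus sign, dash
strict = re.compile(r"^[\w.+-]+\.[a-zA-Z]+$")

samples = ["gmail.com", "ww!.test9.net", "a#b$c.org", "w+w.test11.net"]

# Map each sample to (loose result, strict result)
results = {s: (bool(loose.search(s)), bool(strict.search(s))) for s in samples}

for sample, (loose_ok, strict_ok) in results.items():
    print(f"{sample!r}: loose={loose_ok}, strict={strict_ok}")
```

The two patterns only disagree on inputs that contain punctuation outside the requested set, such as "ww!.test9.net".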

followed by a **period**

It is equal to: \.

a character-only top-level domain such as ".com", ".info", ".edu", etc.

Criteria for end of text: the requirement is "character-only", so we cannot use the special sequence \w, since it also covers digits and the underscore. We have to write it like this: [a-zA-Z]
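
As a quick illustration of this point, the sketch below swaps only the top-level-domain part of the pattern. The names word_tld and alpha_tld and the sample "site.c_m" are made up for this example:

```python
import re

# TLD written with \w: also lets digits and underscores through
word_tld = re.compile(r"^[\w.+-]+\.\w+$")
# TLD written with an explicit letters-only class
alpha_tld = re.compile(r"^[\w.+-]+\.[a-zA-Z]+$")

samples = ["123.test8.123", "site.c_m", "site.com"]

# Map each sample to (\w-TLD result, letters-only-TLD result)
tld_results = {s: (bool(word_tld.search(s)), bool(alpha_tld.search(s))) for s in samples}

for sample, (word_ok, alpha_ok) in tld_results.items():
    print(f"{sample!r}: \\w TLD={word_ok}, [a-zA-Z] TLD={alpha_ok}")
```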

..., beginning and end-of-line characters, ...

You didn't copy and paste the last sentence! It also asks you to use both the beginning-of-line and end-of-line characters (^ and $).
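
A rough sketch of why the anchors matter with re.search: without ^ and $, the pattern is allowed to match just a part of the text. The variable names body, unanchored, and anchored are only for illustration:

```python
import re

body = r"[\w.+-]+\.[a-zA-Z]+"
unanchored = re.compile(body)            # may match anywhere inside the text
anchored = re.compile("^" + body + "$")  # must cover the whole text

samples = ["web-address.com/homepage", " www.test6.com", "gmail.com"]

# Map each sample to (unanchored result, anchored result)
anchor_results = {s: (bool(unanchored.search(s)), bool(anchored.search(s))) for s in samples}

for sample, (loose_ok, full_ok) in anchor_results.items():
    print(f"{sample!r}: unanchored={loose_ok}, anchored={full_ok}")
```

Without the anchors, "web-address.com/homepage" is accepted because the substring "web-address.com" matches on its own.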

So the answer with extra test cases is:

#!/usr/bin/env python3
import re

def check_web_address(text):
    pattern = r"^[\w.+-]+?\.[a-zA-Z]+?$"
    result = re.search(pattern, text)
    return bool(result)


print(1, "Should be 'True':", check_web_address("gmail.com"))  # True
print(2, "Should be 'False':",  check_web_address("www@google"))  # False
print(3, "Should be 'True':",  check_web_address("www.Coursera.org"))  # True
print(4, "Should be 'False':",  check_web_address("web-address.com/homepage"))  # False
print(5, "Should be 'True':",  check_web_address("My_Favorite-Blog.US"))  # True
print(6, "Should be 'False':",  check_web_address(" www.test6.com"))  # False
print(7, "Should be 'True':",  check_web_address("123.test7.com"))  # True
print(8, "Should be 'False':",  check_web_address("123.test8.123"))  # False
print(9, "Should be 'False':",  check_web_address("ww!.test9.net"))  # False
print(10, "Should be 'False':",  check_web_address("w w.test10.net"))  # False
print(11, "Should be 'True':",  check_web_address("w+w.test11.net"))  # True
print(12, "Should be 'True':",  check_web_address("w-w.test12.net"))  # True
print(13, "Should be 'True':",  check_web_address("pass.under.test13.net"))  # True
print(14, "Should be 'True':",  check_web_address("Just-For.You+mE.test+14+.net"))  # True

You may ask: why did you put the ? quantifier after the + quantifier?
I wanted to make + non-greedy! Of course, you can also leave it out: with the ^ and $ anchors in place, it does not change whether the text matches.
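
If you want to check this yourself, here is a minimal sketch. It compares the greedy and non-greedy versions on the test strings from above, then shows an unanchored case where the ? does change what is captured. The sample "a.b.c" is made up for this example:

```python
import re

greedy = re.compile(r"^[\w.+-]+\.[a-zA-Z]+$")
lazy = re.compile(r"^[\w.+-]+?\.[a-zA-Z]+?$")

cases = [
    "gmail.com", "www@google", "www.Coursera.org", "web-address.com/homepage",
    "My_Favorite-Blog.US", " www.test6.com", "123.test7.com", "123.test8.123",
    "ww!.test9.net", "w w.test10.net", "w+w.test11.net", "w-w.test12.net",
    "pass.under.test13.net", "Just-For.You+mE.test+14+.net",
]

# With both anchors, greedy and non-greedy agree on every test string
same_verdict = all(bool(greedy.search(c)) == bool(lazy.search(c)) for c in cases)
print("Same True/False result on all cases:", same_verdict)

# Without anchors, the ? changes how much text is captured
greedy_part = re.search(r"[\w.+-]+\.", "a.b.c").group()
lazy_part = re.search(r"[\w.+-]+?\.", "a.b.c").group()
print(repr(greedy_part), repr(lazy_part))
```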

Happy Coding
