pip install beautifulsoup4
# On Debian/Ubuntu Linux
$ apt-get install python-bs4    (for Python 2)
$ apt-get install python3-bs4   (for Python 3)
$ pip install requests
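# CodersArts Assignment Help - Python
# For programming or project help, contact the official CodersArts website.
from bs4 import BeautifulSoup
import requests

page_link = 'URL'  # placeholder: replace with the address of the page to scrape
# Assumed continuation (the original snippet stops at page_link):
# fetch the page and parse its HTML so that `soup` is available below.
page_response = requests.get(page_link, timeout=5)
soup = BeautifulSoup(page_response.content, 'html.parser')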
soup.title
# <title>Returns title tags and the content between the tags</title>
soup.title.string
# 'Returns the content inside a title tag as a string'
soup.p
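To try these accessors without fetching a live page, the following minimal sketch parses a small HTML string. The sample markup and the variable names are made up for illustration; only `BeautifulSoup`, `.title`, `.string`, `.p`, `get_text()` and `find_all()` come from the library.

```python
from bs4 import BeautifulSoup

# Hypothetical sample document, used instead of a live page.
html = """
<html>
  <head><title>Sample page</title></head>
  <body>
    <p class="intro">First paragraph.</p>
    <p>Second paragraph with a <a href="https://example.com">link</a>.</p>
  </body>
</html>
"""

soup = BeautifulSoup(html, "html.parser")

title_tag = soup.title           # the whole <title> element
title_text = soup.title.string   # only the text inside <title>
first_p = soup.p                 # the first <p> element in the document
all_p = soup.find_all("p")       # every <p> element, in document order

print(title_tag)                 # <title>Sample page</title>
print(title_text)                # Sample page
print(first_p.get_text())        # First paragraph.
print(len(all_p))                # 2
```

Attribute access such as `soup.p` only ever returns the first matching tag, so `find_all()` is the call to reach for when every occurrence is needed.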
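# The 14 metacharacters: . ^ $ * + ? { } [ ] \ | ( )
\  : drops the special meaning of the character that follows
[] : defines a character class
^  : matches at the beginning of the string
$  : matches at the end of the string
.  : matches any character except a newline
?  : matches zero or one occurrence
|  : or (matches either alternative)
*  : matches zero or more occurrences
+  : matches one or more occurrences
{} : matches a specified number of occurrences
() : groups part of a pattern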
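The metacharacters listed above are easier to remember once each is seen in action. The sketch below uses only the standard `re` module; the sample strings are invented for illustration.

```python
import re

text = "cat hat cot c.t\ncut"

# . matches any character except a newline
dot = re.findall(r"c.t", text)             # ['cat', 'cot', 'c.t', 'cut']

# \ drops the special meaning of the dot, so only a literal "." matches
escaped = re.findall(r"c\.t", text)        # ['c.t']

# [] defines a character class
char_class = re.findall(r"[ch]at", text)   # ['cat', 'hat']

# ^ and $ anchor the match to the start and end of the string
starts = re.findall(r"^cat", text)         # ['cat']
ends = re.findall(r"cut$", text)           # ['cut']

# ? makes the preceding item optional (zero or one occurrence)
optional = re.findall(r"colou?r", "color colour")   # ['color', 'colour']

# | matches either alternative
either = re.findall(r"cat|cut", text)      # ['cat', 'cut']

# * matches zero or more occurrences of the preceding item
star = re.findall(r"ab*", "a ab abbb")     # ['a', 'ab', 'abbb']

for name, value in [("dot", dot), ("escaped", escaped), ("star", star)]:
    print(name, value)
```

Writing patterns as raw strings (`r"..."`) keeps Python from interpreting the backslashes before the regex engine sees them.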
# codersarts
# Special sequences in regular expressions. The bracketed equivalents hold for ASCII matching
# (bytes patterns or the re.ASCII flag); with Python 3 str patterns, \d, \s and \w also match
# Unicode digits, whitespace and word characters.
\d : Matches any decimal digit; this is equivalent to the class [0-9].
\D : Matches any non-digit character; this is equivalent to the class [^0-9].
\s : Matches any whitespace character; this is equivalent to the class [ \t\n\r\f\v].
\S : Matches any non-whitespace character; this is equivalent to the class [^ \t\n\r\f\v].
\w : Matches any alphanumeric character or underscore; this is equivalent to the class [a-zA-Z0-9_].
\W : Matches any character that is not alphanumeric or an underscore; this is equivalent to the class [^a-zA-Z0-9_].
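As a quick check of the sequences above, this sketch runs a few of them over an invented sample string. It also illustrates one caveat: in Python 3, with `str` patterns, `\d` matches more than `[0-9]` unless the `re.ASCII` flag is given.

```python
import re

# Hypothetical sample text for illustration.
sample = "Order 42\tshipped on 2024-01-15, ref_A7!"

digits = re.findall(r"\d+", sample)      # runs of decimal digits
words = re.findall(r"\w+", sample)       # runs of letters, digits and "_"
whitespace = re.findall(r"\s", sample)   # individual whitespace characters

print(digits)      # ['42', '2024', '01', '15', '7']
print(words)       # ['Order', '42', 'shipped', 'on', '2024', '01', '15', 'ref_A7']
print(whitespace)  # [' ', '\t', ' ', ' ', ' ']

# "\u0663" is ARABIC-INDIC DIGIT THREE, a decimal digit outside [0-9].
mixed = "\u0663a5"
unicode_digits = re.findall(r"\d", mixed)          # matches both digits
ascii_digits = re.findall(r"\d", mixed, re.ASCII)  # matches only '5'

print(len(unicode_digits), ascii_digits)  # 2 ['5']
```

If the input is expected to contain only ASCII digits, passing `re.ASCII` or writing `[0-9]` explicitly avoids surprises.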
import re

# Compile the pattern once, then reuse it on several strings
pattern = re.compile('my')
result = pattern.findall('my name is naveen')
print(result)
result2 = pattern.findall('Hi Hi This is codersarts')
print(result2)
Output:
['my']
[]
import re

s = "HELLO there HOW are YOU"
# Split on whitespace that is followed by an uppercase letter
parts = re.compile(r"(?<!^)\s+(?=[A-Z])(?!.\s)").split(s)
print(parts)
Output:
['HELLO there', 'HOW are', 'YOU']
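The split pattern above packs three lookaround assertions into one line. One way to read it is to spell it out with `re.VERBOSE`, as in the sketch below. The name `splitter` and the second test string are illustrative additions, not part of the original snippet.

```python
import re

splitter = re.compile(
    r"""
    (?<!^)      # the match must not begin at the very start of the string
    \s+         # one or more whitespace characters (the part that is removed)
    (?=[A-Z])   # the next character must be an uppercase ASCII letter
    (?!.\s)     # ...and must not be a single character followed by whitespace
    """,
    re.VERBOSE,
)

first = splitter.split("HELLO there HOW are YOU")
second = splitter.split("HELLO there I am HERE")

print(first)   # ['HELLO there', 'HOW are', 'YOU']
print(second)  # ['HELLO there I am', 'HERE']
```

The second call shows the effect of the final assertion: the space before the one-letter word "I" is not a split point, because "I" is immediately followed by whitespace.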