@me-suzy
Last active April 19, 2023 07:14
Python error: ValueError: substring not found
So, I have this text that I must translate with the googletrans library:
<!-- ARTICOL START -->
<p class="mb-40px">La muerte de Lorenzo el Magnífico puso fin al periodo.</p>
<p class="mb-40px">My name is print</p>
<p class="mb-40px">I must go home
<p class="mb-40px">I love my laptop when is clean</p>
<!-- ARTICOL FINAL -->
**The part of the Python code with the problem is this:**
for p in soup.findAll('p', class_='mb-40px'):
    begin_comment = str(soup).index('<!-- ARTICOL START -->')
    end_comment = str(soup).index('<!-- ARTICOL FINAL -->')
    if begin_comment < str(soup).index(str(p)) < end_comment:
        recursively_translate(p)
**It returns this error:**
Traceback (most recent call last):
  File "E:\BB\safary.py", line 143, in <module>
    begin_comment = str(soup).index('<!-- ARTICOL START -->')
ValueError: substring not found
-----------ANSWERS----------
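The `ValueError` is raised by Python's `str.index`, which fails when the exact substring is not in the string. Here the string is `str(soup)`, the HTML as Beautiful Soup writes it back out after parsing, and that output can differ from the source file when the markup is malformed.
In the example above, the closing `</p>` tag is missing on the third paragraph (`I must go home`).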
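To make the failure easier to diagnose, the marker lookup can be done with `str.find`, which returns `-1` instead of raising. This is only a minimal sketch: the helper name `marker_span` is hypothetical, and it assumes the two comments appear in the string exactly as written in the sample.

```python
START = '<!-- ARTICOL START -->'
END = '<!-- ARTICOL FINAL -->'

def marker_span(html, start=START, end=END):
    """Return (begin, finish) offsets of the two markers, or None if either is missing."""
    begin = html.find(start)  # str.find returns -1 where str.index would raise ValueError
    if begin == -1:
        return None
    # Look for the end marker only after the start marker.
    finish = html.find(end, begin + len(start))
    if finish == -1:
        return None
    return begin, finish
```

Calling `marker_span(str(soup))` and checking for `None` before the loop shows whether the markers survived parsing, instead of stopping with a traceback.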
Use this regex to find every line with an unclosed `<p class="mb-40px">` tag and close it with `</p>`:
- <kbd>Ctrl</kbd>+<kbd>H</kbd>
- Find what: `^(<p class="mb-40px">)((?!</p>).)*$`
- Replace with: `$0</p>`
- **CHECK** *Wrap around*
- **CHECK** *Regular expression*
- **UNCHECK** `. matches newline`
- <kbd>Replace all</kbd>
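The same repair can probably be done from Python rather than in an editor. The sketch below reuses the regex from the steps above with `re.MULTILINE`, so that `^` and `$` match at line boundaries. It assumes each paragraph sits on its own line, as in the sample, and the function name `close_unclosed_paragraphs` is hypothetical.

```python
import re

# Same pattern as in the editor steps: a line that starts with the opening
# tag and contains no </p> anywhere before the end of the line.
UNCLOSED_P = re.compile(r'^(<p class="mb-40px">)((?!</p>).)*$', re.MULTILINE)

def close_unclosed_paragraphs(html):
    """Append </p> to every line matched by UNCLOSED_P."""
    # \g<0> is Python's equivalent of $0 (the whole match).
    return UNCLOSED_P.sub(r'\g<0></p>', html)
```

Run it on the raw file contents before handing them to Beautiful Soup. Files with Windows line endings may need the `\r` stripped first, since `.` would match it.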
Also, in the Python code it is simpler to delete the two lines that look up ARTICOL START and ARTICOL FINAL, together with the `if` that depends on them (leaving the `if` in place would raise a `NameError`):
for p in soup.findAll('p', class_='mb-40px'):
    recursively_translate(p)