@ilovefreesw
Last active May 4, 2023 09:54
A Python-Selenium script that takes screenshots of webpages in bulk using headless Chrome, reading the URLs from a text file. Tutorial: https://www.ilovefreesoftware.com/26/tutorial/how-to-take-full-page-screenshot-in-bulk-from-multiple-urls.html
import os
import time

from selenium import webdriver
from selenium.common.exceptions import WebDriverException
from selenium.webdriver.common.by import By
from tqdm import tqdm

Links_File = r''  # path to a text file with one URL per line
OP_DIR = r''      # directory where the screenshots are saved
i = 1             # counter used to name the output files

# Page height in pixels plus an optional offset X
S = lambda X: driver.execute_script('return document.body.scrollHeight') + X

# Read the URLs, skipping blank lines
with open(Links_File, "r") as f:
    lines = [line.strip() for line in f if line.strip()]

options = webdriver.ChromeOptions()
# Run Chrome without a window; the options.headless attribute is deprecated in recent Selenium 4 releases
options.add_argument('--headless')
options.add_argument('--log-level=3')
driver = webdriver.Chrome(options=options)
driver.execute_cdp_cmd('Network.setUserAgentOverride', {"userAgent": 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.4103.97 Safari/537.36'})
print(driver.execute_script("return navigator.userAgent;"))

for link in tqdm(lines, ncols=65):
    try:
        driver.get(link)
        time.sleep(5)  # give the page time to finish loading
        driver.set_window_size(1024, S(0))  # May need manual adjustment
        # os.path.join avoids the invalid '\{' escape in the original f-string
        driver.find_element(By.TAG_NAME, "body").screenshot(os.path.join(OP_DIR, f'{i}.png'))
        i = i + 1
    except WebDriverException:
        print(link)  # report the URL that failed and move on
        continue

driver.quit()
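
The "May need manual adjustment" note on the window height is there because `document.body.scrollHeight` does not always cover the whole page. One possible adjustment is sketched below. It assumes that taking the largest of several height properties is good enough for the sites in question. The helper name `full_page_height` and the `minimum` fallback are hypothetical and not part of the script.

```python
# JavaScript that returns the largest of the common page-height properties.
PAGE_HEIGHT_JS = """
return Math.max(
    document.body.scrollHeight,
    document.body.offsetHeight,
    document.documentElement.scrollHeight,
    document.documentElement.offsetHeight
);
"""


def full_page_height(driver, extra=0, minimum=768):
    """Return a window height in pixels for a full-page screenshot.

    `extra` plays the role of X in the script's S(X) lambda; `minimum`
    (an assumed fallback) guards against pages that report a tiny height.
    """
    height = driver.execute_script(PAGE_HEIGHT_JS)
    return max(int(height) + extra, minimum)


# Usage inside the loop, in place of S(0):
# driver.set_window_size(1024, full_page_height(driver))
```

Pages that load content lazily while scrolling may still come out cut off, so this is a starting point rather than a guaranteed fix.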
@ilovefreesw
Author

Hi, this works well.

But it takes a screenshot of the login page.

How can I take a screenshot of a page that requires the user to be logged in?

Thank you for sharing and for the hard work.

This script is not meant for that. To log in to a website, you need to add more code, and what that code looks like depends on the website.
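
As a rough idea of what those extra lines could look like, here is a minimal sketch for a site with a plain username/password form. The helper name `log_in`, the URL, the field names, and the credentials are all placeholders. Real sites differ, and many add CAPTCHAs or two-factor prompts that this does not handle.

```python
def log_in(driver, login_url, fields, submit_locator):
    """Open login_url, fill in each form field, and submit the form.

    fields maps a Selenium locator tuple such as (By.NAME, "username")
    to the text that should be typed into the matching element.
    """
    driver.get(login_url)
    for locator, text in fields.items():
        driver.find_element(*locator).send_keys(text)
    driver.find_element(*submit_locator).click()


# Usage, once, before the screenshot loop (all values are placeholders):
# log_in(
#     driver,
#     "https://example.com/login",
#     {(By.NAME, "username"): "me", (By.NAME, "password"): "secret"},
#     (By.CSS_SELECTOR, "button[type=submit]"),
# )
# time.sleep(5)  # wait for the post-login redirect before the first driver.get(link)
```

Because the same `driver` is reused for every URL, the session cookies set by the login carry over to the pages visited in the loop.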

@Unscrew5772

Doesn't work for me. Is it because Selenium removed the find-element-by-tag method?

@chiraagshah-qa

I used the following:
driver.find_element(By.TAG_NAME, 'body')

@ilovefreesw
Author

If you downgrade your Selenium version, it will work.

@hejhopsa

I got this error:

Traceback (most recent call last):
  File "C:\Users\mateusz\Downloads\imx.to\bulk_webpage_screenshots.py", line 29, in <module>
    driver.find_elements(By.TAG_NAME, "body").screenshot(f'{OP_DIR}\{i}.png')
AttributeError: 'list' object has no attribute 'screenshot'

@ilovefreesw
Author

ilovefreesw commented Nov 19, 2022

@hejhopsa I made a few changes. See if it works now.
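
For context, the `AttributeError` in the traceback comes from calling `.screenshot()` on the result of `find_elements` (plural), which returns a list of elements. `find_element` (singular) returns one element. If the list form is wanted anyway, a minimal sketch of handling it safely could look like this. The helper name `screenshot_first_match` is made up for illustration.

```python
def screenshot_first_match(driver, locator, path):
    """Save a screenshot of the first element matching locator.

    Returns False when nothing matches instead of raising.
    """
    elements = driver.find_elements(*locator)  # always a list, possibly empty
    if not elements:
        return False
    # Index into the list: the list itself has no screenshot() method.
    return elements[0].screenshot(path)


# Usage inside the loop:
# screenshot_first_match(driver, (By.TAG_NAME, "body"), f"{i}.png")
```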

@fazio79

fazio79 commented Jan 10, 2023

Great script, thank you. Do you have a workaround to accept all the cookie banners, so that some screenshots don't come out grayed out?

@ilovefreesw
Author

ilovefreesw commented Jan 10, 2023

You can do it in two ways.

  1. Inject JavaScript tailored to the website you are taking a screenshot of.
  2. Load Chrome with an extension installed that blocks cookie banners and other popups. Try https://adlock.com/ or https://crumbs.org/en/
    You will need the CRX file of one of these extensions, which you can get using this: https://chrome.google.com/webstore/detail/get-crx/dijpllakibenlejkbajahncialkbdkjc

Now, you can load the extension using CRX like this:

options.add_extension('pathToCRX')

Add this after line 18 of the script (it must come before the driver is created) and replace pathToCRX with the path to the extension's CRX file.
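
For the first option, injecting JavaScript, a minimal sketch might hide the banner elements before the screenshot is taken. The helper name `hide_elements` and the CSS selectors in the usage note are hypothetical. The right selectors have to be found per website, for example with the browser's developer tools.

```python
# JavaScript that hides every element matching any selector in arguments[0].
HIDE_ELEMENTS_JS = """
const selectors = arguments[0];
let hidden = 0;
for (const selector of selectors) {
    for (const el of document.querySelectorAll(selector)) {
        el.style.setProperty('display', 'none', 'important');
        hidden += 1;
    }
}
return hidden;
"""


def hide_elements(driver, selectors):
    """Hide all elements matching the CSS selectors; return how many were hidden."""
    return driver.execute_script(HIDE_ELEMENTS_JS, list(selectors))


# Usage inside the loop, after time.sleep(5) and before the screenshot
# (the selectors are placeholders for whatever the target site uses):
# hide_elements(driver, ["#cookie-banner", ".consent-overlay"])
```

Hiding the banner does not accept or reject cookies. It only removes the overlay from the picture, which is usually enough for screenshots.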

@fazio79

fazio79 commented Jan 10, 2023

Thank you, following your suggestions worked fine!
