@pletch
Last active April 12, 2024 03:02
Scrape PFSense DHCP Leases Status Page and Export Results to JSON
#!/usr/bin/env python3
# This Python script provides a function that queries the pfSense (v2.4+) DHCP leases status page and returns a list of tuples
# containing IP address, MAC address, and hostname. To use it, ensure lxml is installed via your package manager or via pip.
#
# 16-Dec-2016 - Original release
# 3-Sep-2020 - Minor update to match formatting of leases page in latest pfSense version (2.4.5).
# 9-Sep-2020 - Backported improvements to handle table rows with missing data, use global variables for user/pass/server_ip,
# and return list from scrape function as implemented by fryguy04 in fork here:
# https://gist.github.com/fryguy04/7d12b789260c47c571f42e5bc733a813
# 9-Sep-2020 - Added parsing of pfSense lease table header. Discovered that adding element of ClientID in the static dhcp
# definitions alters the column sequence. This modification ensures that the correct columns are found and parsed.
# 9-Sep-2020 - Removed file export function
#
import sys
import requests
from lxml import html
import re

url = "http://192.168.1.1/status_dhcp_leases.php"  # Change url to match your pfSense machine address. Note http or https!
user = 'your_username'  # Username for pfSense login
password = 'your_password'  # Password for pfSense login


def scrape_pfsense_dhcp(url, user, password):
    ip = []
    mac = []
    dhcp_name = []

    # Fetch the login page first; it embeds the CSRF token required by the login form.
    s = requests.session()
    r = s.get(url, verify=False)
    matchme = 'csrfMagicToken = "(.*)";var'
    csrf = re.search(matchme, str(r.text))
    if csrf is None:
        raise RuntimeError('csrfMagicToken not found; check that url points to your pfSense box.')

    # Log in, then request the leases page again with the authenticated session.
    payload = {
        '__csrf_magic': csrf.group(1),
        'login': 'Login',
        'usernamefld': user,
        'passwordfld': password
    }
    r = s.post(url, data=payload, verify=False)
    r = s.get(url, verify=False)
    tree = html.fromstring(r.content)

    # Read the header row so columns are located by name; adding a ClientID
    # to a static DHCP definition changes the column order.
    tr_elements = tree.xpath('//tr')
    headers = [header.text for header in tr_elements[0]]
    ip.extend(tree.xpath('//body[1]//div[1]//div[2]//div[2]//table[1]//tbody//tr//td[' + str(headers.index('IP address') + 1) + ']//text()'))
    mac.extend(tree.xpath('//body[1]//div[1]//div[2]//div[2]//table[1]//tbody//tr//td[' + str(headers.index('MAC address') + 1) + ']//text()'))

    # Rows without a hostname get a placeholder so the hostname list keeps one entry per row.
    for node in tree.xpath('//body[1]//div[1]//div[2]//div[2]//table[1]//tbody//tr//td[' + str(headers.index('Hostname') + 1) + ']'):
        if node.text is None:
            dhcp_name.append('no_hostname')
        else:
            dhcp_name.append(node.text)

    # MAC cells carry surrounding whitespace; strip it.
    for i in range(len(mac)):
        mac[i] = mac[i].strip()

    return list(zip(ip, mac, dhcp_name))


if __name__ == "__main__":
    dhcp_list = scrape_pfsense_dhcp(url, user, password)
    for entry in dhcp_list:
        print(entry)
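
The gist title mentions exporting to JSON, while the changelog notes that the file export function was removed. As a rough sketch only, the list of (ip, mac, hostname) tuples returned by scrape_pfsense_dhcp could be serialized as shown here. The helper name `leases_to_json`, the key names, and the sample data are assumptions for illustration and are not part of the original script.

```python
import json


def leases_to_json(leases):
    """Convert (ip, mac, hostname) tuples into a JSON array of objects.

    Hypothetical helper; the key names are an assumption.
    """
    records = [
        {"ip": ip, "mac": mac, "hostname": hostname}
        for ip, mac, hostname in leases
    ]
    return json.dumps(records, indent=2)


# Made-up sample in the shape scrape_pfsense_dhcp() returns.
sample = [
    ("192.168.1.10", "aa:bb:cc:dd:ee:01", "laptop"),
    ("192.168.1.11", "aa:bb:cc:dd:ee:02", "no_hostname"),
]
print(leases_to_json(sample))
```

Writing the returned string to a file with a regular `open(..., "w")` call would restore a file export if one is wanted.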
@zejar

zejar commented Nov 9, 2017

This doesn't seem to work for me; I keep getting "Max retries exceeded with url" errors.
The full error is: requests.exceptions.SSLError: HTTPSConnectionPool(host='192.168.1.1', port=443): Max retries exceeded with url: /status_dhcp_leases.php (Caused by SSLError(SSLEOFError(8, u'EOF occurred in violation of protocol (_ssl.c:590)'),))

@pletch
Author

pletch commented Dec 20, 2017

My first guess is you have limited WebConfigurator access to HTTPS for your pfSense box. I am using HTTP (and only tested the script with this) since my pfSense box is only accessed on my local private network. I will take a look and see if I can get the script working with HTTPS as well.

Update: I did a brief bit of troubleshooting by enabling HTTPS access in the pfSense WebConfigurator. The script works if you update the url in line 28 to start with https:// and fails if it is left as http in the expectation of a redirect. I'm not sure this is the root cause of your issue, but I'd start there.
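
A small sketch of the workaround described above: point the script at the https address directly instead of relying on a redirect. One plausible reason the redirect path fails is that a redirected login POST may not arrive with its form data intact, but that is an assumption and was not verified here. The helper name `force_https` is hypothetical.

```python
from urllib.parse import urlsplit


def force_https(url):
    """Return url with its scheme replaced by https (hypothetical helper)."""
    return urlsplit(url)._replace(scheme="https").geturl()


url = force_https("http://192.168.1.1/status_dhcp_leases.php")
print(url)
```

The script already passes verify=False to requests, so a self-signed pfSense certificate should not block the connection once the scheme is correct.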

@fryguy04

FYI for anyone using this: I created a cleaned-up version that puts the data into a struct and accounts for rows with missing information (such as the IP), so you get clean (IP, MAC, Hostname) entries. The code above builds three separate lists, but the data won't line up if any field is missing, which is highly likely if a machine connected, got a DHCP IP, and then disconnected. Shout-out to the original author for doing all the hard work.
https://gist.github.com/fryguy04/7d12b789260c47c571f42e5bc733a813
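
To illustrate the alignment point, here is a minimal sketch of row-wise parsing: each table row yields one tuple, and an empty cell becomes an empty string instead of being dropped. It uses the standard library's xml.etree.ElementTree on a made-up, simplified table so that it runs without extra packages. The real leases page is HTML and would still need lxml, and this is not the code from the linked fork. The name `parse_rows` and the sample markup are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified lease table; the second row has no hostname.
SAMPLE = """
<table>
  <thead><tr><th>IP address</th><th>MAC address</th><th>Hostname</th></tr></thead>
  <tbody>
    <tr><td>192.168.1.10</td><td>aa:bb:cc:dd:ee:01</td><td>laptop</td></tr>
    <tr><td>192.168.1.11</td><td>aa:bb:cc:dd:ee:02</td><td></td></tr>
    <tr><td>192.168.1.12</td><td>aa:bb:cc:dd:ee:03</td><td>printer</td></tr>
  </tbody>
</table>
"""


def parse_rows(table_xml):
    """Return one (ip, mac, hostname) tuple per data row."""
    table = ET.fromstring(table_xml.strip())
    headers = [th.text for th in table.iter("th")]
    wanted = [headers.index(h) for h in ("IP address", "MAC address", "Hostname")]
    rows = []
    for tr in table.iter("tr"):
        cells = tr.findall("td")
        if not cells:  # the header row contains only <th> cells
            continue
        # An empty cell has text None; use '' so the row keeps three fields.
        rows.append(tuple((cells[i].text or "").strip() for i in wanted))
    return rows


rows = parse_rows(SAMPLE)
for row in rows:
    print(row)
```

Because every tuple is built from a single row, a missing value can never shift the entries that follow it.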

@fryguy04

Also of note is the library below. It was overkill for what I needed (DHCP only), but it might be helpful for others.
https://github.com/ndejong/pfsense_fauxapi

@dannieldin195

Hi, great work. I just don't get how you obtain the __csrf_magic value.
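
In the script above, the value comes from the login page itself: the first GET returns a page whose inline JavaScript assigns csrfMagicToken, and a regular expression pulls the quoted value out before it is posted back as __csrf_magic. The sketch below shows that step in isolation. The page fragment and token value are made up, the function name `extract_csrf_token` is hypothetical, and the pattern is a non-greedy variant of the one in the script.

```python
import re

# Hypothetical fragment of the login page; the token value is invented.
SAMPLE_PAGE = (
    '<script>var csrfMagicToken = "sid:0123abcd,1600000000";'
    'var csrfMagicName = "__csrf_magic";</script>'
)


def extract_csrf_token(page_text):
    """Return the csrfMagicToken value embedded in the page, or None."""
    match = re.search(r'csrfMagicToken = "(.*?)";var', page_text)
    return match.group(1) if match else None


token = extract_csrf_token(SAMPLE_PAGE)
print(token)
```

The extracted string is what goes into the '__csrf_magic' field of the login payload.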

@clayrosenthal

@fryguy04 I extended your version to allow passing env vars for the authentication and url. I also switched from xpath to other lxml methods: get_element_by_id, iterchildren, itersiblings

https://gist.github.com/clayrosenthal/9c22108eaa18e1a079144738e3c7737c
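
A rough sketch of the environment-variable idea, for readers who want it without changing scripts: read the url and credentials from the environment and fall back to the placeholders used in the original script. The variable names PFSENSE_URL, PFSENSE_USER, and PFSENSE_PASSWORD and the helper `load_settings` are assumptions for illustration and may not match the names used in the linked gist.

```python
import os


def load_settings(env=None):
    """Read connection settings from the environment (hypothetical names)."""
    if env is None:
        env = os.environ
    return {
        "url": env.get("PFSENSE_URL", "http://192.168.1.1/status_dhcp_leases.php"),
        "user": env.get("PFSENSE_USER", "your_username"),
        "password": env.get("PFSENSE_PASSWORD", "your_password"),
    }


settings = load_settings({"PFSENSE_USER": "admin"})
print(settings["url"], settings["user"])
```

The resulting values can be passed straight to scrape_pfsense_dhcp(url, user, password), which keeps the password out of the source file.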
