@cobryan05
Last active July 29, 2024 02:04
Python Script to disable ASLR and make nv fatbins read-only to reduce memory commit
# Simple script to disable ASLR and make .nv_fatb sections read-only
# Requires: pefile ( python -m pip install pefile )
# Usage: fixNvPe.py --input path/to/*.dll

import argparse
import pefile
import glob
import os
import shutil

def main(args):
    failures = []
    for file in glob.glob( args.input, recursive=args.recursive ):
        print(f"\n---\nChecking {file}...")
        pe = pefile.PE(file, fast_load=True)
        nvbSect = [ section for section in pe.sections if section.Name.decode().startswith(".nv_fatb")]
        if len(nvbSect) == 1:
            sect = nvbSect[0]
            size = sect.Misc_VirtualSize
            aslr = pe.OPTIONAL_HEADER.IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE
            writable = 0 != ( sect.Characteristics & pefile.SECTION_CHARACTERISTICS['IMAGE_SCN_MEM_WRITE'] )
            print(f"Found NV FatBin! Size: {size/1024/1024:0.2f}MB ASLR: {aslr} Writable: {writable}")
            if (writable or aslr) and size > 0:
                print("- Modifying DLL")
                if args.backup:
                    bakFile = f"{file}_bak"
                    print(f"- Backing up [{file}] -> [{bakFile}]")
                    if os.path.exists( bakFile ):
                        print( f"- Warning: Backup file already exists ({bakFile}), not modifying file! Delete the 'bak' to allow modification")
                        failures.append( file )
                        continue
                    try:
                        shutil.copy2( file, bakFile)
                    except Exception as e:
                        print( f"- Failed to create backup! [{str(e)}], not modifying file!")
                        failures.append( file )
                        continue
                # Disable ASLR for DLL, and disable writing for section
                pe.OPTIONAL_HEADER.DllCharacteristics &= ~pefile.DLL_CHARACTERISTICS['IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE']
                sect.Characteristics = sect.Characteristics & ~pefile.SECTION_CHARACTERISTICS['IMAGE_SCN_MEM_WRITE']
                try:
                    newFile = f"{file}_mod"
                    print( f"- Writing modified DLL to [{newFile}]")
                    pe.write( newFile )
                    pe.close()
                    print( f"- Moving modified DLL to [{file}]")
                    os.remove( file )
                    shutil.move( newFile, file )
                except Exception as e:
                    print( f"- Failed to write modified DLL! [{str(e)}]")
                    failures.append( file )
                    continue

    print("\n\nDone!")
    if len(failures) > 0:
        print("***WARNING**** These files needed modification but failed: ")
        for failure in failures:
            print( f" - {failure}")

def parseArgs():
    parser = argparse.ArgumentParser( description="Disable ASLR and make .nv_fatb sections read-only", formatter_class=argparse.ArgumentDefaultsHelpFormatter )
    parser.add_argument('--input', help="Glob to parse", default="*.dll")
    parser.add_argument('--backup', help="Backup modified files", default=True, required=False)
    parser.add_argument('--recursive', '-r', default=False, action='store_true', help="Recurse into subdirectories")
    return parser.parse_args()

###############################
# program entry point
#
if __name__ == "__main__":
    args = parseArgs()
    main( args )
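
A note on the `--backup` argument above: it is declared with `default=True` but no `type=` or `action=`, so any value passed on the command line arrives as a string, and the string `"False"` is still truthy. The sketch below reproduces that behaviour and shows one possible alternative using `argparse.BooleanOptionalAction` (Python 3.9+). The `--backup-flag` option name and `build_parser` helper are hypothetical, chosen only for this illustration; they are not part of the script.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="--backup parsing sketch")
    # Same declaration style as the script: the value stays a string.
    parser.add_argument("--backup", default=True)
    # Hypothetical alternative: generates --backup-flag / --no-backup-flag.
    parser.add_argument("--backup-flag", default=True,
                        action=argparse.BooleanOptionalAction)
    return parser

args = build_parser().parse_args(["--backup", "False", "--no-backup-flag"])
print(repr(args.backup), bool(args.backup))  # 'False' True
print(args.backup_flag)                      # False
```

With the script as written, backups can therefore only be skipped by passing an empty string (`--backup ""`) or by editing the default.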
@CaptainStabs

I've been running https://github.com/lucidrains/lightweight-gan mostly successfully, but with random crashes due to illegal memory accesses and BSoDs. I can't really test whether this script causes them, but it's something to keep an eye on.

Still, great work on this (hopefully) temporary patch! Hopefully NVIDIA will get it together (even if we know they won't) and make an official fix for this.

@ryancinsight

ryancinsight commented Feb 1, 2022

.pyd files are also DLLs, so you may want to include them in your script, for example:

  • D:\Users**\Miniconda3\Lib\site-packages\torchvision_C.pyd
  • D:\Users**\Miniconda3\pkgs\torchvision-0.12.0.dev20220130-py39_cu113\Lib\site-packages\torchvision_C.pyd

I also couldn't get glob to recurse, so I switched to pathlib's Path:
# Simple script to disable ASLR and make .nv_fatb sections read-only
# Requires: pefile ( python -m pip install pefile )
# Usage: fixNvPe.py --input path/to/directory [--recursive]

import argparse
import pefile
import os
import shutil
from pathlib import Path

def main(args):
    failures = []
    # --ext is a list by default, but a single string when given on the command line
    extension = args.ext if isinstance(args.ext, list) else [args.ext]
    for ext in extension:
        # --input is a directory here; rglob() descends into subdirectories
        for file in Path(args.input).rglob(f'*.{ext}') if args.recursive else Path(args.input).glob(f'*.{ext}'):
            print(f"\n---\nChecking {file}...")
            pe = pefile.PE(file, fast_load=True)
            nvbSect = [ section for section in pe.sections if section.Name.decode().startswith(".nv_fatb")]
            if len(nvbSect) == 1:
                sect = nvbSect[0]
                size = sect.Misc_VirtualSize
                aslr = pe.OPTIONAL_HEADER.IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE
                writable = 0 != ( sect.Characteristics & pefile.SECTION_CHARACTERISTICS['IMAGE_SCN_MEM_WRITE'] )
                print(f"Found NV FatBin! Size: {size/1024/1024:0.2f}MB ASLR: {aslr} Writable: {writable}")
                if (writable or aslr) and size > 0:
                    print("- Modifying DLL")
                    if args.backup:
                        bakFile = f"{file}_bak"
                        print(f"- Backing up [{file}] -> [{bakFile}]")
                        if os.path.exists( bakFile ):
                            print( f"- Warning: Backup file already exists ({bakFile}), not modifying file! Delete the 'bak' to allow modification")
                            failures.append( file )
                            continue
                        try:
                            shutil.copy2( file, bakFile)
                        except Exception as e:
                            print( f"- Failed to create backup! [{str(e)}], not modifying file!")
                            failures.append( file )
                            continue
                    # Disable ASLR for DLL, and disable writing for section
                    pe.OPTIONAL_HEADER.DllCharacteristics &= ~pefile.DLL_CHARACTERISTICS['IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE']
                    sect.Characteristics = sect.Characteristics & ~pefile.SECTION_CHARACTERISTICS['IMAGE_SCN_MEM_WRITE']
                    try:
                        newFile = f"{file}_mod"
                        print( f"- Writing modified DLL to [{newFile}]")
                        pe.write( newFile )
                        pe.close()
                        print( f"- Moving modified DLL to [{file}]")
                        os.remove( file )
                        shutil.move( newFile, file )
                    except Exception as e:
                        print( f"- Failed to write modified DLL! [{str(e)}]")
                        failures.append( file )
                        continue

    print("\n\nDone!")
    if len(failures) > 0:
        print("***WARNING**** These files needed modification but failed: ")
        for failure in failures:
            print( f" - {failure}")

def parseArgs():
    parser = argparse.ArgumentParser( description="Disable ASLR and make .nv_fatb sections read-only", formatter_class=argparse.ArgumentDefaultsHelpFormatter )
    parser.add_argument('--ext', help="extension", default=["dll","pyd","exe"])
    # The default was "*", which Path() treats as a literal directory name; "." searches the current directory
    parser.add_argument('--input', help="Directory to search", default=".")
    parser.add_argument('--backup', help="Backup modified files", default=True, required=False)
    parser.add_argument('--recursive', '-r', default=False, action='store_true', help="Recurse into subdirectories")

    return parser.parse_args()

###############################
# program entry point
#
if __name__ == "__main__":
    args = parseArgs()
    main( args )
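
On glob not recursing in the original script: `glob.glob(pattern, recursive=True)` only descends into subdirectories when the pattern itself contains a `**` component, so `--input path/to/*.dll --recursive` still matches a single directory. A minimal sketch of the difference follows; `find_dlls` and the temporary file names are hypothetical and exist only for this demonstration.

```python
import glob
import os
import tempfile

def find_dlls(root, recursive):
    # "**" is what actually enables descent; recursive=True alone does nothing.
    parts = (root, "**", "*.dll") if recursive else (root, "*.dll")
    return sorted(glob.glob(os.path.join(*parts), recursive=recursive))

with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "sub"))
    for rel in ("a.dll", os.path.join("sub", "b.dll")):
        open(os.path.join(root, rel), "wb").close()

    # recursive=True without "**" in the pattern: top level only
    flat = sorted(os.path.relpath(p, root)
                  for p in glob.glob(os.path.join(root, "*.dll"), recursive=True))
    deep = [os.path.relpath(p, root) for p in find_dlls(root, recursive=True)]

print(flat)       # ['a.dll']
print(len(deep))  # 2
```

With the original script, passing a pattern such as `path\to\**\*.dll` together with `--recursive` should therefore have the same effect as the pathlib version.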

@jgerityneurala

Hi @cobryan05, is this code available for redistribution under an explicit license?

@cobryan05
Author

> Hi @cobryan05, is this code available for redistribution under an explicit license?

You can consider it public domain. Do whatever you'd like with it.

@jgerityneurala

Cheers! Thanks for the thoughtful write-up of the problem and the code 💙

@KlausRyu

KlausRyu commented Apr 7, 2022

Hi @cobryan05, I have a question after using your script. I don't know what that error about sdh.dll is...

This is with yolov5; I was trying to run val.py.

Capture

@cobryan05
Author

cobryan05 commented Apr 7, 2022

> Hi, I have a question after using your script. I don't know what that error about sdh.dll is...

@KlausRyu Your error is that you are running out of memory. My script will significantly reduce memory usage for files that have an .nv_fatb section, but that's it. It's still quite possible that you simply don't have enough memory, even with the reduction.

Make sure you actually ran the script. It looks like in your case you should have run
python fixNvPe.py --input C:\Users\Admin\anaconda3\envs\newyolo\lib\site-packages\torch\lib\*.dll

If that ran successfully and you still hit that error, then you just don't have enough memory. Increase your page file size, decrease your number of workers, or add more RAM.
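
One way to confirm that a DLL was actually patched is to read the two flags back from its headers. The sketch below does this with the standard library only (no pefile), using the field offsets from the PE/COFF layout. `nv_fatb_status` and `fake_pe` are hypothetical helper names, and `fake_pe` builds a synthetic header purely so the example runs without a real DLL.

```python
import struct

DYNAMIC_BASE = 0x0040     # IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE (ASLR)
MEM_WRITE = 0x80000000    # IMAGE_SCN_MEM_WRITE

def nv_fatb_status(data):
    """Return (aslr_enabled, fatbin_writable); writable is None without .nv_fatb."""
    if data[:2] != b"MZ":
        raise ValueError("not a PE image")
    pe_off, = struct.unpack_from("<I", data, 0x3C)            # e_lfanew
    if data[pe_off:pe_off + 4] != b"PE\0\0":
        raise ValueError("not a PE image")
    n_sections, = struct.unpack_from("<H", data, pe_off + 6)
    opt_size, = struct.unpack_from("<H", data, pe_off + 20)   # SizeOfOptionalHeader
    opt_off = pe_off + 24                                     # signature + COFF header
    dll_chars, = struct.unpack_from("<H", data, opt_off + 70)
    writable = None
    for i in range(n_sections):
        base = opt_off + opt_size + 40 * i                    # 40-byte section headers
        if data[base:base + 8].startswith(b".nv_fatb"):
            chars, = struct.unpack_from("<I", data, base + 36)
            writable = bool(chars & MEM_WRITE)
    return bool(dll_chars & DYNAMIC_BASE), writable

def fake_pe(dll_chars, sect_chars):
    """Synthetic PE32+ headers with a single .nv_fatb section (demo data only)."""
    dos = b"MZ" + bytes(0x3A) + struct.pack("<I", 0x40)
    coff = struct.pack("<HHIIIHH", 0x8664, 1, 0, 0, 0, 240, 0x2022)
    opt = bytearray(240)
    struct.pack_into("<H", opt, 0, 0x20B)       # PE32+ magic
    struct.pack_into("<H", opt, 70, dll_chars)  # DllCharacteristics
    sect = b".nv_fatb" + bytes(28) + struct.pack("<I", sect_chars)
    return dos + b"PE\0\0" + coff + bytes(opt) + sect

before = nv_fatb_status(fake_pe(0x0160, 0xC0000040))
after = nv_fatb_status(fake_pe(0x0120, 0x40000040))
print(before)  # (True, True)
print(after)   # (False, False)
```

For a real file, the headers could be passed in as bytes read from the start of the DLL; a patched file should report `(False, False)`.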

@KlausRyu

KlausRyu commented Apr 8, 2022

@cobryan05 Was it successful? Some of the files look like they failed.
Or is there a problem with my torch install?

Capture2

@cobryan05
Author

No, it wasn't successful. Read the log. "Not modifying file. Delete the 'bak' to allow modification" Perhaps you ran it once, then upgraded Torch and it overwrote the DLLs, so you have files that need modification but it won't modify them since it won't overwrite the backup.

@Evanwu1125

> No, it wasn't successful. Read the log. "Not modifying file. Delete the 'bak' to allow modification" Perhaps you ran it once, then upgraded Torch and it overwrote the DLLs, so you have files that need modification but it won't modify them since it won't overwrite the backup.

So is there any solution to this problem?

@cobryan05
Author

cobryan05 commented Apr 30, 2022

The log says:
"Warning: Backup file already exists (C:\User\Admin\anaconda3\envs\newyolo\lib\site-packages\torch\lib\torch_cuda_cu.dll_bak), not modifying file! Delete the 'bak' file to allow modification."

This is telling you that the file C:\User\Admin\anaconda3\envs\newyolo\lib\site-packages\torch\lib\torch_cuda_cu.dll_bak already exists, so it is not modifying the C:\User\Admin\anaconda3\envs\newyolo\lib\site-packages\torch\lib\torch_cuda_cu.dll. You must delete the file that ends with '_bak' to allow modifying the file.
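
To see which backups are blocking modification before deleting anything, the leftover files can be listed first. This is a minimal sketch, assuming the `_bak` suffix the script appends; `stale_backups` is a hypothetical helper, and the file names in the demo are created in a temporary directory only for illustration.

```python
import tempfile
from pathlib import Path

SUFFIX = "_bak"  # suffix the script appends when it backs up a file

def stale_backups(lib_dir):
    """List '<name>_bak' files whose '<name>' also exists in lib_dir."""
    found = []
    for bak in Path(lib_dir).glob(f"*{SUFFIX}"):
        if bak.name == SUFFIX:
            continue  # nothing left once the suffix is stripped
        original = bak.with_name(bak.name[:-len(SUFFIX)])
        if original.is_file():
            found.append(bak)
    return sorted(found)

with tempfile.TemporaryDirectory() as tmp:
    for name in ("torch_cuda_cu.dll", "torch_cuda_cu.dll_bak", "orphan.dll_bak"):
        (Path(tmp) / name).touch()
    names = [p.name for p in stale_backups(tmp)]

print(names)  # ['torch_cuda_cu.dll_bak']
```

Each returned path can then be removed with `Path.unlink()` once you are sure the old backup is no longer needed, after which the script will create a fresh backup on its next run.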

@GucciFlipFlops1917

Thank you! Works like a charm for paging file errors running https://github.com/minimaxir/aitextgen.

@szan12

szan12 commented Nov 24, 2022

Hi @cobryan05, I'm getting this output:
"Failed to write modified DLL! [[WinError 5] Access is denied: .. "

What should I do to allow the file to be modified?

image

@cobryan05
Author

cobryan05 commented Nov 28, 2022

@szan12 For it to be getting 'access denied' in your User directory, I would assume the file is in use. Try restarting your computer and then running it, or try typing "taskkill /f /im python.exe" in your command prompt before running it (this will forcefully close any Python process you have running). If that still fails, try running from a cmd prompt started with "Run as administrator", but that shouldn't be necessary in the 'user' directory.
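
A quick way to check for this condition before running the script is to try opening the file for read/write without changing it. This is a rough sketch under the assumption that Windows refuses write access to a DLL that is loaded by a running process or marked read-only, which Python reports as `PermissionError`; `can_open_for_write` is a hypothetical helper, and the demo uses a temporary file rather than a real DLL.

```python
import os
import tempfile

def can_open_for_write(path):
    """Return True if path can be opened read/write; the file is not modified."""
    try:
        with open(path, "r+b"):
            return True
    except PermissionError:
        return False

# Demo on a scratch file that nothing else has open.
fd, tmp_path = tempfile.mkstemp(suffix=".dll")
os.close(fd)
try:
    unlocked = can_open_for_write(tmp_path)
finally:
    os.remove(tmp_path)

print(unlocked)  # True
```

If this returns False for a DLL, closing the processes that use it (as suggested above) is the first thing to try.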

@zclhjw

zclhjw commented Jun 11, 2023

I can't remove the folder's read-only attribute. When I clear it, it is automatically restored.

@colorfuldarkgray

Thank you for sharing the code and the troubleshooting. Unfortunately, this didn't work for me. I am surprised that I can't train an FCN8s on an NVIDIA 3090 with 24 GB of VRAM, when I could do that on an older server with two 12 GB cards. The main difference is that the old server ran Keras on Linux. So I'll have to change OS and maybe go back to Keras/TF.

Best regards!
