@henrygab
Last active January 9, 2023 04:30
Thoughts on Driver Verifier -- ERRORS LIKELY -- Proceed at own risk

Getting information to root-cause a bluescreen / bugcheck

Not an easy road

First, I understand how frustrating it can be to track down the root cause of a bugcheck (bluescreen). It's perfectly normal to be frustrated, hit dead-ends, and be unable to find folks who can help.

Limitations

The information in this gist is limited to showing how one might get more useful information when a machine bugchecks, by using validation settings which are built into Windows. These settings are usually helpful in generating a dump file that more precisely points out what driver is at fault.

However, it is very possible that a driver (or software) is not actually at fault. For example, bugchecks can occur because of hardware-based issues. Maybe something is causing the DIMMs to return bad data. That one example could be caused by so many things … a power supply, a faulty capacitor on the motherboard, insufficient filtering allowing power line noise to corrupt memory, marginal quality DIMMs, marginal quality DIMM slots on the motherboard, insufficient cooling on the DIMMs, intentional overclocking of the CPU, etc.

What this is NOT

Nothing here will fix any underlying issue. Instead, these things are intended to make the machine bluescreen faster, as soon as something can be detected as incorrect. This results in a memory dump that is closer to the causation point, and thus helps developers find the cause.

Again ... this will NOT fix the underlying issues, and may even make your machine not boot (while verifier settings are enabled -- see below for safe mode recovery options).

Don't do this if this is the only computer you have, or if you are not comfortable with the risks, or if you don't understand the concepts talked about.

Conclusion

These instructions are based on Windows 11 (as of December 2022). The Windows kernel (and the verifier.exe program) have gained more capabilities over time to detect misbehaving drivers, but many of the settings existed as far back as Windows XP.

# Getting more data in the memory dumps
By default, when the machine bugchecks / bluescreens, very little
information is written to the dump file. This is how to change
that.
## WARNING -- BitLocker -- WARNING
If you have BitLocker enabled, changing these startup
settings may prevent the system from booting until you
enter a "BitLocker recovery key". If you know what this means,
and you've got that available ... Great! If not, stop here.
## Windows 11
* Control Panel --> System
* Choose the "About" (tab? option? item?)
* Choose the "Advanced system settings" text link
This opens the old-style UI where this setting is stored.
* Yes, there are **_three_** buttons marked `Settings...` (sigh)
* Under `Startup and Recovery`, click `Settings...`
* Under `System Failure`, `Write debugging information`
* Select `Kernel Memory Dump` (or `Complete Memory Dump` if needed)
* Enable `Overwrite any existing file`
* Remember the path listed under `Dump File`
* Click `OK`, which will likely require a reboot.
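
After the reboot, it can be reassuring to read the setting back. The following is a minimal sketch, assuming the dialog stores its values under `HKLM\SYSTEM\CurrentControlSet\Control\CrashControl` as `CrashDumpEnabled`, `Overwrite`, and `DumpFile`, and assuming the numeric meanings listed in the table (both are from my memory of Microsoft's documentation, so please verify them). The helper names are my own.

```python
import sys

# Assumed meanings of CrashControl\CrashDumpEnabled; verify against Microsoft's docs.
DUMP_TYPES = {
    0: "None",
    1: "Complete Memory Dump",
    2: "Kernel Memory Dump",
    3: "Small Memory Dump",
    7: "Automatic Memory Dump",
}

# Assumed registry location of the Startup and Recovery settings.
CRASH_CONTROL_KEY = r"SYSTEM\CurrentControlSet\Control\CrashControl"


def describe_dump_type(value):
    """Translate a CrashDumpEnabled value into the name shown in the dialog."""
    return DUMP_TYPES.get(value, f"Unknown ({value})")


def read_crash_control():
    """Read the dump settings from the registry (Windows only)."""
    import winreg  # standard library, but only present on Windows

    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, CRASH_CONTROL_KEY) as key:
        dump_type, _ = winreg.QueryValueEx(key, "CrashDumpEnabled")
        overwrite, _ = winreg.QueryValueEx(key, "Overwrite")
        dump_file, _ = winreg.QueryValueEx(key, "DumpFile")
    return describe_dump_type(dump_type), bool(overwrite), dump_file


if __name__ == "__main__" and sys.platform == "win32":
    print(read_crash_control())
```

This only reads the values; changing them is still best done through the dialog described above.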
## Copying that dump file
When the machine next bugchecks, if you want that memory dump,
you'll need to copy it someplace else. Moving, copying, or even
reading that file requires administrator privileges.
You might make a directory, such as `c:\dumps`.
You might want to name the dump files using date and time.
You might choose a sortable version of date and time:
Example: `c:\dumps\20211230-184459.dmp` might be a bugcheck
that occurred December 30th, 2021, at 6:44pm (+59 seconds).
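
The copy-and-rename step could be scripted. This is a minimal sketch, assuming you run it from an elevated prompt and pass in the path you noted under `Dump File`. The function names are hypothetical, and it names the copy after the dump file's last-modified time, which is only an approximation of when the bugcheck occurred.

```python
import shutil
from datetime import datetime
from pathlib import Path


def archive_name(when):
    """Build a sortable file name such as 20211230-184459.dmp."""
    return when.strftime("%Y%m%d-%H%M%S") + ".dmp"


def archive_dump(dump_path, archive_dir):
    """Copy the dump into archive_dir, named after the dump's modification time."""
    dump_path = Path(dump_path)
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)  # e.g. c:\dumps
    when = datetime.fromtimestamp(dump_path.stat().st_mtime)
    target = archive_dir / archive_name(when)
    shutil.copy2(dump_path, target)  # copy2 also preserves timestamps
    return target
```

For example, `archive_dump(dump_file_path, r"c:\dumps")`, where `dump_file_path` is whatever path the `Dump File` setting showed on your machine.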
## Evaluating the dump
Use `WinDbg Preview`, which is available in the Microsoft Store.
Open the memory dump as a dump file / crash dump (names and instructions vary by version).
Run `!analyze -v` to (hopefully) learn what caused the bugcheck.

If your next step is to enable "driver verifier"....

EXPECT Loss of Use of the Computer

Driver verifier can often catch memory corruption as it happens. At the same time, enabling some of these options can drastically affect performance (while the settings are enabled). Using driver verifier will cause your machine to bluescreen when a violation is detected ... which might be during boot.

While driver verifier is enabled, your computer may not be usable. With luck, you can boot into safe mode, disable the settings, and then be back as you were.

When done investigating

Remember to run verifier again, to disable it, when you're done.

What success looks like

You will get a bugcheck: the machine will crash and generate a memory dump file. When you reboot, you (or the developer) can load the memory dump file into the WinDbg debugger. Once it's loaded, `!analyze -v` will give useful information.

Specifically, you hope for a bugcheck code that explicitly identifies both the specific driver, and the specific error that was caught.

Bugcheck codes in the range 0xC1..0xFE tend to identify the culprit fairly accurately, IIRC.
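
As a quick way to triage a batch of dumps, the range check can be written down directly. This is only a sketch of the rule of thumb above, not an official classification, and the function name is my own. The example codes in the comments are ones I'm confident of by name (0xC4 is DRIVER_VERIFIER_DETECTED_VIOLATION, 0x50 is PAGE_FAULT_IN_NONPAGED_AREA).

```python
def in_precise_range(bugcheck_code):
    """Return True if the code falls in the 0xC1..0xFE rule-of-thumb range."""
    return 0xC1 <= bugcheck_code <= 0xFE


def parse_bugcheck(text):
    """Parse a code as WinDbg prints it, e.g. '0x000000C4' or 'C4'."""
    return int(text, 16)


# 0xC4 (DRIVER_VERIFIER_DETECTED_VIOLATION) is inside the range.
print(in_precise_range(parse_bugcheck("0x000000C4")))
# 0x50 (PAGE_FAULT_IN_NONPAGED_AREA) is outside it.
print(in_precise_range(parse_bugcheck("0x00000050")))
```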

General steps (for all iterations)

Here are the general steps to follow for each iteration (the iterations are listed in more detail below):

  1. Open an administrative (elevated) command prompt
  2. Run verifier.exe
  3. Select Create Custom settings (for code developers), then select Next
  4. Enable the options for this iteration (listed later), then select Next
  5. Choose Select driver names from a list, and select Next
  6. Select the drivers for this iteration, and select Finish
  7. Reboot the machine to apply the settings

I will define two common sets of options:

Option Set A

Option set A is relatively low overhead, yet still finds many common problems with drivers. When using this set, enable only the following three settings in driver verifier:

  • Special Pool
  • Pool Tracking
  • DMA checking

Option Set B

Option set B applies stronger verification, with a correspondingly higher (potential) performance impact. When using this set, enable the three settings in driver verifier as listed in option set A:

  • Special Pool
  • Pool Tracking
  • DMA checking

And in addition, enable the following additional settings in driver verifier:

  • I/O verification
  • Invariant MDL checking for stack
  • Invariant MDL checking for driver
  • Force IRQL checking
  • Miscellaneous checks
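
If you prefer the command line to the GUI, option sets A and B can be expressed as a flags bitmask for `verifier /flags ... /driver ...`. The sketch below is hedged: the bit values are as I recall them from Microsoft's verifier.exe documentation, so check them against `verifier /?` on your system before relying on them. The driver names are hypothetical placeholders, and the helper names are my own.

```python
# Assumed flag bits (from memory of Microsoft's docs; verify with `verifier /?`).
FLAGS = {
    "Special Pool": 0x1,
    "Force IRQL checking": 0x2,
    "Pool Tracking": 0x8,
    "I/O verification": 0x10,
    "DMA checking": 0x80,
    "Miscellaneous checks": 0x800,
    "Invariant MDL checking for stack": 0x2000,
    "Invariant MDL checking for driver": 0x4000,
}

OPTION_SET_A = ["Special Pool", "Pool Tracking", "DMA checking"]
OPTION_SET_B = OPTION_SET_A + [
    "I/O verification",
    "Invariant MDL checking for stack",
    "Invariant MDL checking for driver",
    "Force IRQL checking",
    "Miscellaneous checks",
]


def flags_mask(options):
    """OR together the flag bits for the named options."""
    mask = 0
    for name in options:
        mask |= FLAGS[name]
    return mask


def verifier_command(options, drivers):
    """Build (but do not run) the equivalent verifier.exe command line."""
    return f"verifier /flags 0x{flags_mask(options):X} /driver " + " ".join(drivers)


# example1.sys and example2.sys are placeholders for your suspect drivers.
print(verifier_command(OPTION_SET_A, ["example1.sys", "example2.sys"]))
print(verifier_command(OPTION_SET_B, ["example1.sys", "example2.sys"]))
```

The script only prints the command; running it (from an elevated prompt) and rebooting is still up to you.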

Option Set "Sledgehammer"

This option set applies nearly all verification options. However, these settings have the largest performance impact.

  • Enable all settings listed with flag type "Standard"
  • Enable the following settings that are listed as "Additional" verification:
    • IRP logging
    • Invariant MDL checking for stack
    • Invariant MDL checking for driver
    • Port/miniport interface checking
    • DDI compliance checking (additional)
    • NDIS/WIFI verification
    • Code integrity checks

Iteration #1

This is the lowest-overhead iteration. You believe you know the 1-2 drivers that are most suspect.

  • When setting which options are enabled in driver verifier, use option set A
  • When selecting which drivers to verify:
    • Select only those drivers that you already suspect

Iteration #2

This is still mostly low overhead. You still believe it's 1-2 drivers that are the most likely cause.

  • When setting which options are enabled in driver verifier, use option set B
  • When selecting which drivers to verify:
    • Select only those drivers that you already suspect

Iteration #3

  • When setting which options are enabled in driver verifier, use option set Sledgehammer
  • When selecting which drivers to verify:
    • Select only those drivers that you already suspect

Iteration #4

If iterations 1-3 didn't highlight the error, or you didn't have very specific drivers in mind as suspect, then this is the right place to start. This iteration broadens the net some, while excluding some higher-impact drivers.

  • When setting which options are enabled in driver verifier, use option set A
  • When selecting which drivers to verify:
    • Sort the drivers list by provider (click on that column header)
    • Enable for all drivers except the following
      • Any driver provided by Microsoft Corporation
      • Any driver provided by Intel Corporation or Intel(R) Corporation

Iteration #5

  • When setting which options are enabled in driver verifier, use option set B
  • When selecting which drivers to verify:
    • Use the same driver list selections as iteration #4.

Iteration #6

  • When setting which options are enabled in driver verifier, use option set Sledgehammer
  • When selecting which drivers to verify:
    • Use the same driver list selections as iteration #4.

Iteration #7

If none of the above found the culprit, then you start enabling driver verifier on all drivers ... period.

  • When setting which options are enabled in driver verifier, use option set A
  • When selecting which drivers to verify, choose "Automatically select all drivers installed on this computer"

Iteration #8

  • When setting which options are enabled in driver verifier, use option set B
  • When selecting which drivers to verify, choose "Automatically select all drivers installed on this computer"

Iteration #9

  • When setting which options are enabled in driver verifier, use option set Sledgehammer
  • When selecting which drivers to verify, choose "Automatically select all drivers installed on this computer"
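
Iterations #1 through #9 are really just every combination of three driver selections and three option sets, with the driver selection widening only after all three option sets have been tried. A small sketch makes the pattern explicit (the labels are my own shorthand for the selections described above):

```python
from itertools import product

# Driver selections, from narrowest to broadest.
DRIVER_SCOPES = [
    "suspect drivers only",
    "all drivers except Microsoft and Intel",
    "all drivers",
]

# Option sets, from lowest to highest overhead.
OPTION_SETS = ["A", "B", "Sledgehammer"]


def iteration_plan():
    """Return (iteration number, driver scope, option set) for iterations 1-9."""
    combos = product(DRIVER_SCOPES, OPTION_SETS)  # option set varies fastest
    return [(n, scope, opts) for n, (scope, opts) in enumerate(combos, start=1)]


for number, scope, opts in iteration_plan():
    print(f"Iteration #{number}: option set {opts}, {scope}")
```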

Iterations #10 and higher

There are some settings you haven't enabled. Maybe one of them will help. I don't know ... my suggestions may be dated, and may fail to include new driver verifier options that were specifically designed to help with the type of issue you're facing.

If the machine bugchecks at boot with driver verifier enabled

  • Boot in safe mode (use your other computer to find instructions)
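  • Copy the dump file (by default %SystemRoot%\MEMORY.DMP, or whatever path was listed under Dump File) to someplace else, so it won't get overwritten and can be analyzed later
  • Disable driver verifier while in safe mode (for example, run verifier /reset from an elevated command prompt)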
  • Reboot