Getting information to root-cause a bluescreen / bugcheck
Not an easy road
First, I understand how frustrating it can be to track down the root
cause of a bugcheck (bluescreen). It's perfectly normal to be frustrated, hit dead-ends,
and be unable to find folks who can help.
Limitations
The information in this gist is limited to showing how one might
get more useful information when a machine bugchecks, by using validation
settings which are built into Windows. These settings are usually
helpful in generating a dump file that more precisely points out what
driver is at fault.
However, it is very possible that a driver (or software) is not actually
at fault. For example, bugchecks can occur because of hardware-based
issues, such as something causing the DIMMs to return bad data. That
one example could be caused by so many things: a power supply, a
faulty capacitor on the motherboard, insufficient filtering allowing
power line noise to corrupt memory, marginal quality DIMMs, marginal
quality DIMM slots on the motherboard, insufficient cooling on the DIMMs,
intentional overclocking of the CPU, etc.
What this is NOT
Nothing here will fix any underlying issue. Instead, these things
are intended to make the machine bluescreen faster, as soon
as something can be detected as incorrect. This results in a memory
dump that is closer to the causation point, and thus helps developers
find the cause.
Again ... this will NOT fix the underlying issues, and may
even make your machine not boot (while verifier settings are enabled
-- see below for safe mode recovery options).
Don't do this if this is the only computer you have, or if you are
not comfortable with the risks, or if you don't understand the
concepts talked about.
Conclusion
These instructions are based on Windows 11 (as of December 2022).
Over time, the Windows kernel (and the verifier.exe program) has gained
more capabilities to detect misbehaving drivers, but many of the
settings existed even in Windows XP.
If your next step is to enable "driver verifier"....
EXPECT Loss of Use of the Computer
Driver verifier can often catch memory corruption as it happens.
At the same time, enabling some of these options can drastically
affect performance (while the settings are enabled). Using driver
verifier will cause your machine to bluescreen when a violation
is detected ... which might be during boot.
While driver verifier is enabled, your computer may not be usable.
With luck, you can boot into safe mode, disable the settings, and
then be back as you were.
When done investigating
Remember to run verifier again, to disable it, when you're done.
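For reference, the same cleanup can also be done from an elevated command prompt instead of the GUI. The sketch below only assembles and prints the two command lines involved (`verifier /querysettings` to show what is configured, `verifier /reset` to clear everything). The helper name is illustrative, nothing is executed, and actually running the commands requires Windows, administrator rights, and a reboot afterwards.

```python
# Sketch: command lines for checking and then clearing driver verifier settings.
# Nothing is executed here; the commands are only printed for copy/paste.

def verifier_cleanup_commands():
    """Return the commands to inspect, then clear, driver verifier settings."""
    return [
        ["verifier.exe", "/querysettings"],  # show the settings currently configured
        ["verifier.exe", "/reset"],          # clear all settings (applies after reboot)
    ]

for cmd in verifier_cleanup_commands():
    print(" ".join(cmd))
```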
What success looks like
You will get a bugcheck: the machine will crash and generate a memory
dump file. When you reboot, you (or the developer) can load the memory
dump file into the WinDbg debugger. Once it's loaded, !analyze will
give useful information.
Specifically, you hope for a bugcheck code
that explicitly identifies both the specific driver, and the specific
error that was caught.
Bugcheck codes in the range 0xC1..0xFE tend to be fairly accurate, IIRC.
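As a quick triage aid, here is a minimal sketch that checks whether a bugcheck code falls in the 0xC1..0xFE range mentioned above. The function name is made up for illustration, and the range itself is the rule of thumb from this gist, not an official classification.

```python
# Sketch: does a bugcheck code fall in the range this gist calls "fairly accurate"?

def in_likely_accurate_range(code: int) -> bool:
    """True if the bugcheck code is within 0xC1..0xFE (inclusive)."""
    return 0xC1 <= code <= 0xFE

# 0xC4 is DRIVER_VERIFIER_DETECTED_VIOLATION; 0x0A is IRQL_NOT_LESS_OR_EQUAL.
for code in (0xC4, 0x0A):
    print(f"0x{code:02X}: {in_likely_accurate_range(code)}")
```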
General steps (for all iterations)
Here are the general steps that will be followed by each iteration
(iterations are listed in more detail below):
Open administrative (elevated) command prompt
Run verifier.exe
Select Create Custom settings (for code developers), then select Next
Enable the options for this iteration (listed later), then select Next
Choose Select driver names from a list, and select Next
Select the drivers for this iteration, and select Finish
Reboot the machine to apply the settings
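The GUI steps above have a command-line equivalent of the form `verifier /flags <mask> /driver <names>` (or `/all` for every driver). The sketch below only builds that argument list; it does not run anything. The helper name and the driver file names are hypothetical, and the example mask 0x1 is the Special Pool bit as I recall it from the verifier command-line help, so confirm with `verifier /?` before relying on it.

```python
# Sketch: build (but do not run) a verifier.exe command line.

def build_verifier_command(flags, drivers=None):
    """Return the verifier.exe argument list for a flag mask.

    drivers: list of driver file names, or None to verify all drivers.
    """
    cmd = ["verifier.exe", "/flags", hex(flags)]
    if drivers is None:
        cmd.append("/all")           # verify every installed driver
    elif not drivers:
        raise ValueError("empty driver list; pass None to mean all drivers")
    else:
        cmd += ["/driver", *drivers]  # verify only the named drivers
    return cmd

# Hypothetical driver names, for illustration only.
print(" ".join(build_verifier_command(0x1, ["example1.sys", "example2.sys"])))
```

Run the resulting command from an elevated prompt, then reboot to apply the settings, just as with the GUI.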
I will define two common sets of options:
Option Set A
Option set A is relatively low overhead, yet still finds
many common problems with drivers. When using this set,
enable only the following three settings in driver verifier:
Special Pool
Pool Tracking
DMA checking
Option Set B
Option set B applies stronger verification, with a
correspondingly higher (potential) performance impact.
When using this set, enable the three settings in
driver verifier as listed in option set A:
Special Pool
Pool Tracking
DMA checking
In addition, enable the following
settings in driver verifier:
I/O verification
Invariant MDL checking for stack
Invariant MDL checking for driver
Force IRQL checking
Miscellaneous checks
Option Set "Sledgehammer"
This option set applies nearly all verification options.
However, these settings have the largest performance
impact.
Enable all settings listed with flag type "Standard"
Enable the following settings that are listed as "Additional" verification:
IRP logging
Invariant MDL checking for stack
Invariant MDL checking for driver
Port/miniport interface checking
DDI compliance checking (additional)
NDIS/WIFI verification
Code integrity checks
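If you prefer `verifier /flags` over the GUI checkboxes, the three option sets can be expressed as bit masks. The sketch below is my own mapping of the setting names above to flag bits, based on my recollection of the documented `/flags` values (with 0x209BB as the "Standard" set). Treat every constant as an assumption to double-check against `verifier /?` on your Windows build.

```python
# Sketch: option sets A, B, and Sledgehammer as verifier /flags bit masks.
# ASSUMPTION: bit values recalled from the verifier command-line documentation.
SPECIAL_POOL    = 0x00000001
FORCE_IRQL      = 0x00000002
POOL_TRACKING   = 0x00000008
IO_VERIFICATION = 0x00000010
DMA_CHECKING    = 0x00000080
IRP_LOGGING     = 0x00000400
MISC_CHECKS     = 0x00000800
MDL_STACK       = 0x00002000  # Invariant MDL checking for stack
MDL_DRIVER      = 0x00004000  # Invariant MDL checking for driver
PORT_MINIPORT   = 0x00010000
DDI_ADDITIONAL  = 0x00080000
NDIS_WIFI       = 0x00200000
CODE_INTEGRITY  = 0x02000000
STANDARD        = 0x000209BB  # all settings with flag type "Standard"

SET_A = SPECIAL_POOL | POOL_TRACKING | DMA_CHECKING
SET_B = (SET_A | IO_VERIFICATION | MDL_STACK | MDL_DRIVER
         | FORCE_IRQL | MISC_CHECKS)
SLEDGEHAMMER = (STANDARD | IRP_LOGGING | MDL_STACK | MDL_DRIVER
                | PORT_MINIPORT | DDI_ADDITIONAL | NDIS_WIFI | CODE_INTEGRITY)

for name, mask in (("A", SET_A), ("B", SET_B), ("Sledgehammer", SLEDGEHAMMER)):
    print(f"Option set {name}: /flags 0x{mask:X}")
```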
Iteration #1
This is the lowest-overhead iteration. You believe you know the 1-2
drivers that are most suspect.
When setting which options are enabled in driver verifier, use option set A
When selecting which drivers to verify:
Select only those drivers that you already suspect
Iteration #2
This is still mostly low overhead. You still believe it's 1-2
drivers that are the most likely cause.
When setting which options are enabled in driver verifier, use option set B
When selecting which drivers to verify:
Select only those drivers that you already suspect
Iteration #3
When setting which options are enabled in driver verifier, use option set Sledgehammer
When selecting which drivers to verify:
Select only those drivers that you already suspect
Iteration #4
If iterations 1-3 didn't highlight the error, or you didn't have
very specific drivers in mind as suspect, then this is the right
place to start. This iteration broadens the net some, while
excluding some higher-impact drivers.
When setting which options are enabled in driver verifier, use option set A
When selecting which drivers to verify:
Sort the drivers list by provider (click on that column header)
Enable for all drivers except the following
Any driver provided by Microsoft Corporation
Any driver provided by Intel Corporation or Intel(R) Corporation
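The exclusion rule above can be sketched as a simple filter over (driver, provider) pairs. The driver names and the "Example Vendor" provider below are invented for illustration; in practice the list would come from the provider column of the verifier driver list.

```python
# Sketch: keep every driver except those from the excluded providers.
EXCLUDED_PROVIDERS = {
    "Microsoft Corporation",
    "Intel Corporation",
    "Intel(R) Corporation",
}

def drivers_to_verify(drivers):
    """drivers: iterable of (driver_name, provider) tuples."""
    return [name for name, provider in drivers
            if provider.strip() not in EXCLUDED_PROVIDERS]

# Hypothetical sample data, not real driver names.
sample = [
    ("example_gpu.sys", "Example Vendor Inc."),
    ("example_ms.sys", "Microsoft Corporation"),
    ("example_net.sys", "Intel(R) Corporation"),
]
print(drivers_to_verify(sample))
```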
Iteration #5
When setting which options are enabled in driver verifier, use option set B
When selecting which drivers to verify:
Use the same driver list selections as iteration #4.
Iteration #6
When setting which options are enabled in driver verifier, use option set Sledgehammer
When selecting which drivers to verify:
Use the same driver list selections as iteration #4.
Iteration #7
If none of the above found the culprit, then you start enabling
driver verifier on all drivers ... period.
When setting which options are enabled in driver verifier, use option set A
When selecting which drivers to verify, choose
"Automatically select all drivers installed on this computer"
Iteration #8
When setting which options are enabled in driver verifier, use option set B
When selecting which drivers to verify, choose
"Automatically select all drivers installed on this computer"
Iteration #9
When setting which options are enabled in driver verifier, use option set Sledgehammer
When selecting which drivers to verify, choose
"Automatically select all drivers installed on this computer"
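Iterations 1 through 9 are simply every combination of three driver scopes and three option sets, walked from least to most intrusive. A short sketch that prints the whole plan as a checklist (the scope labels are my own shorthand for the selections described above):

```python
# Sketch: enumerate iterations 1-9 as (driver scope) x (option set).
from itertools import product

SCOPES = [
    "suspect drivers only",
    "all drivers except Microsoft/Intel",
    "all drivers",
]
OPTION_SETS = ["A", "B", "Sledgehammer"]

def iteration_plan():
    """Return a list of (iteration_number, scope, option_set) tuples."""
    combos = product(SCOPES, OPTION_SETS)  # scope varies slowest
    return [(i, scope, opts) for i, (scope, opts) in enumerate(combos, start=1)]

for number, scope, opts in iteration_plan():
    print(f"Iteration #{number}: option set {opts}, {scope}")
```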
Iterations #10 and higher
There are some settings you haven't enabled. Maybe one of them will help.
I don't know ... my suggestions may be dated, and may fail to include new
driver verifier options that were specifically designed to help with the type
of issue you're facing.
If the machine bugchecks at boot with driver verifier enabled
Boot into safe mode (use your other computer to find instructions)
Copy the dump file from %windir%\system32 to someplace else
(so it won't get overwritten and can be analyzed later)
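If you end up collecting several dumps across iterations, a timestamped copy keeps them from overwriting each other. Below is a minimal sketch; the function name is hypothetical, and both paths are parameters because the dump location and the destination depend on your machine's configuration.

```python
# Sketch: copy a dump file to a safe place under a timestamped name.
import shutil
from datetime import datetime
from pathlib import Path

def preserve_dump(dump_path, dest_dir):
    """Copy dump_path into dest_dir as <name>-<YYYYmmdd-HHMMSS><ext>.

    Returns the path of the copy. The original file is left in place.
    """
    src = Path(dump_path)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)  # create destination if missing
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = dest / f"{src.stem}-{stamp}{src.suffix}"
    shutil.copy2(src, target)                # copy2 also preserves timestamps
    return target

# Example call (hypothetical paths; substitute your own):
# preserve_dump(r"C:\path\to\dump.dmp", r"D:\saved-dumps")
```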