BSOD - Rule of thumb?

  • Thread starter: BigAl.NZ
Bruce said:
I concede the possibility, certainly. Although, so far, diagnostics
run on the hard drives of machines with driver failures haven't reported
any such problems. I'm not saying bad sectors can't cause a BSOD, just
that it hasn't yet happened to me.
Agreed. I just wouldn't rule it out.

Cheers,

Cliff
 
Robert said:
Certainly a good candidate but my experience of "bad drivers" has
been "bad in the sense that the programmer was apparently wearing
boxing gloves and drunk off his head while coding" more than
anything else.
That wouldn't explain why they work for ages and then stop working.

Cheers,

Cliff
 
Enkidu said:
That wouldn't explain why they work for ages and then stop working.

Not exactly. However, a device driver could trigger a BSOD (bugcheck)
due to a coding issue for a number of reasons including, but not
restricted to:

Explicitly, e.g.:
- It executes a call to KeBugCheck (because the programmer felt it was
safer to bring down the system than to continue and potentially corrupt
data, for example)

Indirectly, due to one of, say:
- Unhandled boundary conditions
- Change of hardware or configuration e.g. BIOS change
- Workload variations
- Changes in other software components on which it is dependent or
interacts

The most common Windows bugcheck I've seen that is usually attributed to
a device driver in some way is:

0x0000000A: IRQL_NOT_LESS_OR_EQUAL

A look at the callstack in the BSOD data often provides a clue as to
what device drivers may be involved with the issue.
 
bok said:
Not exactly. However, a device driver could trigger a BSOD (bugcheck)
due to a coding issue for a number of reasons including, but not
restricted to:

Explicitly, e.g.:
- It executes a call to KeBugCheck (because the programmer felt it was
safer to bring down the system than to continue and potentially corrupt
data, for example)

Indirectly, due to one of, say:
- Unhandled boundary conditions
- Change of hardware or configuration e.g. BIOS change
- Workload variations
- Changes in other software components on which it is dependent or
interacts

The most common Windows bugcheck I've seen that is usually attributed to
a device driver in some way is:

0x0000000A: IRQL_NOT_LESS_OR_EQUAL

A look at the callstack in the BSOD data often provides a clue as to
what device drivers may be involved with the issue.
I've often wondered why people don't just set their IRQL to
GREATER_THAN! 8-) 8-)

Cheers,

Cliff
 
Hi Guys,

I was speaking to someone the other day who said as a rule of thumb,
BSOD errors are generally related to either :
(a) Memory
(b) Bad sectors on critical parts of the HDD.

Anyone care to add their opinion to this?

I don't see the value in coming up with such a rule of thumb. As others
have pointed out elsewhere Windows BSODs are triggered for a variety of
different reasons - including but not limited to hardware, software or
configuration issues.

Whenever I have run into a BSOD that I need to deal with I note down the
information that's presented and look it up on online resources (from a
working machine) usually starting with the bugcheck code and then
refining the search based on the explanation and likely causes.
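
As a rough illustration of that first lookup step, here is a minimal Python sketch. The table holds only a handful of well-known stop codes and is nowhere near complete; the function name describe_bugcheck is made up for this example and is not part of any Microsoft tool.

```python
# A few well-known Windows bugcheck (stop) codes. The table is deliberately
# incomplete and only meant to illustrate the lookup step.
BUGCHECK_NAMES = {
    0x0000000A: "IRQL_NOT_LESS_OR_EQUAL",
    0x0000001E: "KMODE_EXCEPTION_NOT_HANDLED",
    0x00000050: "PAGE_FAULT_IN_NONPAGED_AREA",
    0x0000007B: "INACCESSIBLE_BOOT_DEVICE",
    0x000000D1: "DRIVER_IRQL_NOT_LESS_OR_EQUAL",
}


def describe_bugcheck(code_text):
    """Turn a stop code as written on the blue screen into 'code: NAME'."""
    code = int(code_text, 16)  # accepts "0x0000000A", "0xA" or "a"
    name = BUGCHECK_NAMES.get(code, "unknown (look it up online)")
    return "0x{:08X}: {}".format(code, name)


print(describe_bugcheck("0x0000000A"))
print(describe_bugcheck("0x1e"))
```

Anything the table doesn't know about still gets normalised to the eight-digit form, which is usually the most useful thing to paste into a search.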

In some cases I have resorted to analyzing the kernel crash dump that
you can configure Windows to produce if required.

I recall some 10 years ago a customer of ours put in a fault report and
included a Windows NT kernel crash dump following a BSOD on one of their
servers because our (non-Microsoft) software happened to be running on
their machine.

After opening the crash dump it didn't take long to discover it was a
bugcheck 0x1e KMODE_EXCEPTION_NOT_HANDLED. Parameter 1 indicated
'access violation', the 'kmode exception' occurred in kernel32.dll, and
the process was ups.exe (both Microsoft products). So I replied to the
contact telling them where to stick it (I mean send it) ;-)
 
Hi Guys,

I was speaking to someone the other day who said as a rule of thumb,
BSOD errors are generally related to either :
(a) Memory
(b) Bad sectors on critical parts of the HDD.

Anyone care to add their opinion to this?

Not true, especially not the second one. Memory, as in defective or
mismatched RAM, is a fairly common cause but not sufficiently so as to
be considered one of the two leading causes.

Ron Martell Duncan B.C. Canada
--
Microsoft MVP (1997 - 2006)
On-Line Help Computer Service
http://onlinehelp.bc.ca
Syberfix Remote Computer Repair

"Anyone who thinks that they are too small to make a difference
has never been in bed with a mosquito."
 
Enkidu said:
I've often wondered why people don't just set their IRQL to
GREATER_THAN! 8-) 8-)

Smileys noted.

Alternatively they could set IRQL to < 2 (DISPATCH_LEVEL) before
attempting to access paged memory.
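
For anyone who hasn't met IRQLs before, the rule can be modelled in a few lines of Python. This is only a toy model, not driver code: the three constants match the documented numeric values of the lowest IRQLs, while the function name may_touch_pageable_memory is invented for the example.

```python
# Numeric values of the three lowest Windows IRQLs.
PASSIVE_LEVEL = 0
APC_LEVEL = 1
DISPATCH_LEVEL = 2


def may_touch_pageable_memory(current_irql):
    """Pageable memory is only safe to touch below DISPATCH_LEVEL, because
    a page fault cannot be serviced at DISPATCH_LEVEL or above."""
    return current_irql < DISPATCH_LEVEL


for name, irql in [("PASSIVE_LEVEL", PASSIVE_LEVEL),
                   ("APC_LEVEL", APC_LEVEL),
                   ("DISPATCH_LEVEL", DISPATCH_LEVEL)]:
    verdict = "ok" if may_touch_pageable_memory(irql) else "risks bugcheck 0xA"
    print(name, verdict)
```

A real driver only crashes if the page actually happens to be paged out at that moment, which is one reason this kind of bug can hide for a long time.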
 
Enkidu said:
That wouldn't explain why they work for ages and then stop working.

Well depending on how hypothetical you want to get (though I have seen this
particular one before) it could happen that way if a fault in a driver was
only triggered on random and odd occasions, so wasn't obviously and
immediately connected to the driver being updated/installed.
 
Robert said:
Well depending on how hypothetical you want to get (though I have
seen this particular one before) it could happen that way if a fault
in a driver was only triggered on random and odd occasions, so wasn't
obviously and immediately connected to the driver being
updated/installed.

Very true. And probably a PITA to troubleshoot, besides <g>!
 
Robert said:
Well depending on how hypothetical you want to get (though I have
seen this particular one before) it could happen that way if a fault
in a driver was only triggered on random and odd occasions, so wasn't
obviously and immediately connected to the driver being
updated/installed.

All academic, folks. It seems that the intellectually challenged OP didn't run
Prime95 to test the machine past the first minute, as it failed. (????)
Instead of realising that he'd just isolated the problem (or even telling us
until now) he decided to waste all our time speculating when the problem is
more-than-likely (99.999% chance) in the CPU/RAM subsystem or settings
thereof.

"The only difference between human stupidity and genius is that the latter is
finite" Al Einstein (paraphrased)
 
~misfit~ said:
All academic, folks. It seems that the intellectually challenged OP
didn't run Prime95 to test the machine past the first minute, as it
failed. (????) Instead of realising that he'd just isolated the
problem (or even telling us until now) he decided to waste all our
time speculating when the problem is more-than-likely (99.999%
chance) in the CPU/RAM subsystem or settings thereof.

Well... yes... but what's the fun in threads about odd errors if we can't
talk about war stories and stuff too ;-)
 
That wouldn't explain why they work for ages and then stop working.

Many ways this can happen.

Example:

Shoddy programmer writes driver to access information from a system
.DLL file based on a fixed offset from the beginning of the file
instead of using the internal function name for the needed component.
A new version of that .DLL file is released by Microsoft and the
physical location of that function within the file has been changed.
Accessing the new .DLL file version will now cause the driver to
crash.

This specific example has happened many many times.
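
A toy Python model of the same failure, for illustration only. The two "DLL versions", their routine names and the helper functions are all hypothetical; a real DLL resolves names through its export table rather than a Python dict, but the principle is the same.

```python
# Two hypothetical versions of the same system DLL, modelled as a list of
# routines plus an export table mapping function names to positions.
DLL_V1 = {
    "code": ["init", "read_status", "shutdown"],
    "exports": {"Init": 0, "ReadStatus": 1, "Shutdown": 2},
}
# In the newer version a routine was inserted, so ReadStatus has moved.
DLL_V2 = {
    "code": ["init", "log_event", "read_status", "shutdown"],
    "exports": {"Init": 0, "LogEvent": 1, "ReadStatus": 2, "Shutdown": 3},
}

HARD_CODED_OFFSET = 1  # where ReadStatus happened to live in version 1


def call_by_offset(dll):
    """The shoddy approach: assume the routine never moves."""
    return dll["code"][HARD_CODED_OFFSET]


def call_by_name(dll, name):
    """The robust approach: resolve the position through the export table."""
    return dll["code"][dll["exports"][name]]


for label, dll in [("v1", DLL_V1), ("v2", DLL_V2)]:
    print(label, call_by_offset(dll), call_by_name(dll, "ReadStatus"))
```

Against version 1 both approaches reach read_status; against version 2 the fixed offset lands on log_event instead, which in a real driver means jumping into the wrong code.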

Ron Martell Duncan B.C. Canada
 