CompTIA A+ Certification All-In-One Exam Guide, Seventh Edition - Michael Meyers [420]
What Has Changed?
Systems that run properly tend to continue to run properly. A system that has undergone a hardware or software change is far more likely to misbehave than one that has not been changed. If something has gone wrong, talk to the user to determine whether anything in particular has occurred since the system last worked properly. Has new software been installed? Did the user add some new RAM? Change the Windows domain? Run a Windows Update? Drop the monitor on the floor? Not only do you need to consider those types of changes, but you must also make sure that unrelated changes don’t send you down the wrong path. The fact that someone installed a new floppy drive yesterday probably doesn’t have anything to do with the printer that isn’t working today.
Last, consider side effects of changes that don’t seem to have anything to do with the problem. For example, I once had a customer whose system kept freezing up in Windows. I knew he had just added a second hard drive, but the system booted up just fine and ran normally, except that it would freeze up after a few minutes. The hard drive wasn’t the problem. The problem was that he had unplugged the CPU fan while installing the drive. When I discover a change has been made, I like to visualize the process of the change to consider how that change may have directly or indirectly contributed to a problem. In other words, if you run into a situation where a person added a NIC to a functioning PC that now won’t boot, you need to think about what part of the installation process could have gone wrong and caused the PC to stop working.
Check the Environment
I use the term environment in two totally different fashions in this book. The first way is the most classic definition: the heat, humidity, dirt, and other outside factors that can affect the operation of the system. The other definition is more technical and addresses the computing environment of the system and other surrounding systems: What type of system do they run? What OS? What is their network connection? What are the primary applications they use? What antivirus program do they run? Do other people use the system?
Answering these questions gives you an overview of what is affecting this system both internally and externally. A quick rundown of these issues can reveal possible problems that might not be otherwise recognized. For example, I once got a call from a user complaining she had no network connection. I first checked the NIC to ensure it had link lights (always the first thing to check to ensure a good physical connection!) only to discover that she had no link lights—someone had decided to turn on a space heater, which destroyed the cable!
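A few of the computing-environment questions above (what OS, what type of system, what the machine is called on the network) can be answered by the system itself. The following is only a minimal sketch in Python, using the standard library's platform and socket modules; the function name environment_snapshot and the dictionary keys are illustrative choices, not part of any standard tool, and the sketch does not cover antivirus software, primary applications, or other users, which you still need to ask about.

```python
import platform
import socket


def environment_snapshot():
    """Collect a few basic facts about the computing environment.

    Hypothetical helper for illustration only: it covers the OS,
    hardware architecture, and host name, nothing more.
    """
    return {
        "os": platform.system(),            # OS family, e.g. "Windows" or "Linux"
        "os_release": platform.release(),   # OS release string
        "os_version": platform.version(),   # detailed version/build string
        "machine": platform.machine(),      # hardware architecture
        "hostname": socket.gethostname(),   # name the system reports for itself
    }


if __name__ == "__main__":
    # Print one fact per line so it can be jotted into a trouble ticket.
    for key, value in environment_snapshot().items():
        print(f"{key:12} {value}")
```

Running the script on the problem system and on a neighboring system that works gives you two snapshots to compare, which is one quick way to spot an environmental difference.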
Reproducing the Problem
My official rule on problems with a PC is this: “If a problem happens only once, it is not a problem.” PCs are notorious for occasionally locking up, popping errors, and displaying all types of little quirks that a quick reboot fixes and that then never happen again. Why do these things happen? I don’t know, although I’m sure if someone wanted me to guess I could come up with a clever explanation. But the majority of PCs simply don’t have redundancy built in, and it’s okay for them to occasionally hiccup.
A problem becomes interesting to me if it happens more than once. If it happens twice, the chances are much higher that it will happen a third time. I want to see it happen that third time, under my supervision. I will direct the user to try to reproduce the problem while I am watching to see what triggers the failure. Knowing the trigger is a huge clue that helps you localize the real problem. Intermittent failures are the single most frustrating events that take place in a technician’s life. But do remember that many seemingly intermittent problems really