Difference between revisions of "Troubleshooting guide/Hardware problems"
Revision as of 10:50, 13 October 2018
Hardware diagnosis software can be used to determine whether the problems on your PC are being caused by faulty or broken hardware. There are many utilities that are designed to scan the physical components of your computer to check whether they are in good condition.
Relevant software
- Ultimate Boot CD (UBCD)
- Hiren's Boot CD
- Piriform's Speccy - System information tool for Windows
- WhoCrashed - Windows kernel crash dump analyzer
Show hardware components
Windows
Open the DirectX Diagnostic Tool:
- Windows Vista and later: open the Start Screen/Start Menu, type dxdiag and press ↵ Enter.
- Windows XP: press ⊞ Win+R, type dxdiag and press ↵ Enter.
Open the System Information utility:
- Windows Vista and later: open the Start Screen/Start Menu, type msinfo32 and press ↵ Enter.
- Windows XP: press ⊞ Win+R, type msinfo32 and press ↵ Enter.
Linux
Through the Terminal
$ lspci
$ lsusb
See also Linux.
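To save both listings in one pass (for example, to attach them to a support request), the two commands can be wrapped in a small script. This is only a sketch: the function name `list_hardware` is hypothetical, and it assumes Python 3.7 or later is installed.

```python
import shutil
import subprocess


def list_hardware(tools=("lspci", "lsusb")):
    """Run each listing tool that is installed and return its output by name."""
    listings = {}
    for tool in tools:
        if shutil.which(tool) is None:
            # Tool is not installed (or not on PATH); record that and move on.
            listings[tool] = None
            continue
        # check=False: a non-zero exit status is not treated as fatal here.
        result = subprocess.run([tool], capture_output=True, text=True, check=False)
        listings[tool] = result.stdout
    return listings
```

Calling `list_hardware()` returns a dictionary whose values are the raw text printed by each tool, or `None` for a tool that is not installed.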
Stability testing
Many parts of a PC work together to run a game. Crashes are often caused by problems where two or more parts interact. The first question to be asked when a crash occurs is whether the PC is stable without the game running.
Power supply unit (PSU)
Non-deterministic problems are sometimes caused by a bad power supply unit (PSU).[1] If the power supply is not stable, it is futile to test other parts of the system because they will yield inconsistent results. Power supplies do not indicate whether they are having problems because they generally do not include self-testing hardware. Sometimes, though, electrical noise (buzzing) can be heard.
The best way to test a power supply is with a dedicated PC power supply tester. If one is not available, load the PSU as heavily as possible, generally by running all stress tests at once and connecting as many external devices as are available (preferably in "cold" conditions). If this hampers system stability in any way, swap the PSU for a known working one: if the instability disappears, the original PSU is the likely cause; if it persists, the PSU can be ruled out.
Memory (RAM)
Memory stability testing is performed using the memtest86+ utility.
Windows Vista and later also include a built-in memory tester, which can be started by running mdsched.exe.
If the errors do not appear in a random pattern (meaning the issue is confined to specific RAM addresses), it might be possible to simply bypass the affected locations.[2][3]
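On Linux, one way to bypass a confined range is the kernel's `memmap=nn$ss` boot parameter, which marks the region from `ss` to `ss+nn` as reserved. The sketch below turns a list of failing addresses into such a parameter. It assumes the addresses reported by the memory tester are physical addresses and that a 4 KiB page size applies; the function name `memmap_param` and the example addresses are hypothetical.

```python
def memmap_param(bad_addresses, page_size=4096):
    """Build a memmap= kernel parameter reserving the pages that cover bad_addresses."""
    if not bad_addresses:
        raise ValueError("at least one failing address is required")
    # Round the lowest address down and the highest address up to page boundaries.
    start = (min(bad_addresses) // page_size) * page_size
    end = (max(bad_addresses) // page_size + 1) * page_size
    # Kernel syntax: memmap=<size>$<start>, both given here in hexadecimal.
    return f"memmap={end - start:#x}${start:#x}"
```

For example, `memmap_param([0x7c000123, 0x7c001fff])` returns `memmap=0x2000$0x7c000000`. The `$` character usually needs escaping when the parameter is written into a bootloader configuration file, and a single range like this is only sensible when the failing addresses are close together.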
Drive (SSD, SSHD, HDD, etc.)
Drive stability testing is performed using smartmontools. Using the -x argument to the utility, verify the following:
- The drive is not overheating (SSDs may not have a temperature sensor; in that case, measure their temperature some other way).
- The drive is not reporting read or write faults in its error log.
- The drive is not reporting a pre-fail condition.
If each of those items is true, then follow the directions to perform a short self-test. Verify that the drive completes and passes this test. If not, go to the support section of the drive vendor's website and follow the directions to download their drive analysis software. Follow the directions to obtain a specific problem report and return the drive if it is under warranty. If the drive is not under warranty, replace it with a new one.
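Part of the checklist above can be automated by asking smartctl for JSON output (`-j`, available in smartmontools 7.0 and later) instead of reading the `-x` report by eye. The sketch below makes several assumptions: the JSON field names `smart_status.passed` and `temperature.current` should be verified against the output of your smartmontools version, `/dev/sda` is a placeholder device path, and the 60 °C limit is an arbitrary illustrative threshold, not a vendor figure. It does not inspect the error log, which still needs to be read manually.

```python
import json
import subprocess


def load_report(device="/dev/sda"):
    """Run 'smartctl -x -j' on the device and return the parsed JSON report."""
    # smartctl uses its exit status as a bit mask, so a non-zero status is not fatal.
    result = subprocess.run(
        ["smartctl", "-x", "-j", device],
        capture_output=True, text=True, check=False,
    )
    return json.loads(result.stdout)


def check_drive(report, max_temp_c=60):
    """Return a list of human-readable problems found in a parsed report."""
    problems = []
    # Assumed field: overall health verdict reported by the drive.
    if report.get("smart_status", {}).get("passed") is False:
        problems.append("drive reports a failing SMART health status")
    # Assumed field: current temperature in degrees Celsius.
    temp = report.get("temperature", {}).get("current")
    if temp is None:
        problems.append("no temperature reading; measure it some other way")
    elif temp > max_temp_c:
        problems.append(f"drive temperature {temp} C exceeds {max_temp_c} C")
    return problems
```

A typical use would be `check_drive(load_report("/dev/sda"))`, usually run with administrator rights; an empty list means none of the automated checks found a problem. The short self-test itself is started with `smartctl -t short` and its result is shown by `smartctl -l selftest`.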
CPU
Intel CPU testing is performed using the Intel Processor Diagnostic Tool.
AMD does not release diagnostic software for end users. Use Prime95 to stress test the CPU and see if it fails. The Windows Event Log may record a machine check error code; this will provide more specific information on what caused the problem.
Graphics card (GPU)
If you are experiencing visual artifacts or sluggishness in visual applications, there may be a problem with your GPU. First, use a utility such as GPU-Z or HWMonitor to see whether the card is running above its recommended temperature (the maximum is usually around 80 °C); if so, the card will be throttling itself in self-preservation. In that case, check that the fans or blocks on the card are functioning correctly, and blow out any dust or debris build-up with a can of compressed air.
If that isn't the problem, check the video drivers to see whether a new version is available or whether the one you are using is reported as unstable; in either case, perform a clean install of the drivers. Next, if you have integrated video as well as a discrete card, make sure the computer correctly switches to the discrete card in game instead of staying on the integrated one. Lastly, try to memtest the VRAM and stress test the card with FurMark. If all else fails, check the warranty on your GPU (most are 2-3 years) and RMA the card for repair or a rebate of some sort.
CPU + GPU
Run both tests together. Remember to lower the priority of the CPU test so that it does not bottleneck the GPU test.[4]
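The priority can be lowered by hand in the task manager, or the CPU test can be started at reduced priority from a script. The following is a minimal sketch, assuming Python 3.7 or later; the helper names are hypothetical, and the command passed in is whatever launches your CPU stress test.

```python
import os
import subprocess
import sys


def low_priority_popen_kwargs():
    """Return Popen keyword arguments that start a child at reduced priority."""
    if sys.platform == "win32":
        # Windows: request the "below normal" priority class for the new process.
        return {"creationflags": subprocess.BELOW_NORMAL_PRIORITY_CLASS}
    # POSIX: raise the niceness of the child just before the program is executed.
    return {"preexec_fn": lambda: os.nice(10)}


def run_low_priority(command):
    """Start the command (a list of arguments) at reduced priority."""
    return subprocess.Popen(command, **low_priority_popen_kwargs())
```

The GPU test is then started normally, so the scheduler favours it whenever both tests compete for CPU time.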
References
- [1] Could my power supply be causing memory errors? - Ask Leo!
- [2] linux - How to blacklist a correct bad RAM sector according to MemTest86+ error indication? - Unix & Linux Stack Exchange
- [3] memory - Running Windows with defective RAM - Super User
- [4] The right way to stress-test an overclocked PC « Trying To Be Helpful