Sometime toward the end of 2019, my 10-year-old PC (an HP Pavilion Elite HPE-360z) started acting up. Its fan got really loud from time to time, and the system randomly froze or rebooted on its own. The first thing I suspected was the health of the hard disks, because I had started seeing a bunch of what looked like kernel messages or kernel panics.
The information on this site is the result of my research on the Internet and my own experience. It is written for my own purposes and may not be suitable for others.
Checking the Hard Disks
First, I ran e2fsck on the suspected device:
# e2fsck -f -y -v /dev/sda1
It returned with no errors. I also ran badblocks to check the device for bad blocks:
# badblocks -v /dev/sda1 > badsectors.txt
This passed with no bad blocks... Then I learned about S.M.A.R.T. (Self-Monitoring, Analysis and Reporting Technology). It lets a drive detect, report, and possibly log its own health status, and it works better than the badblocks command on SATA and SCSI devices.
When I ran the smartctl command to view the drive's error log, errors were reported.
# smartctl -l error /dev/sda
smartctl 7.0 2018-12-30 r4883 [x86_64-linux-4.19.97-gentoo] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
SMART Error Log Version: 1
Warning: ATA error count 1800 inconsistent with error log pointer 1
ATA Error Count: 1800 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.
Error 1800 occurred at disk power-on lifetime: 56133 hours (2338 days + 21 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
04 51 fd 81 f4 2c a0 Error: ABRT
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
b0 d4 fd 81 4f c2 a0 00 00:04:40.097 SMART EXECUTE OFF-LINE IMMEDIATE
b0 d0 fe 00 4f c2 a0 00 00:04:39.810 SMART READ DATA
b0 d8 fe 00 4f c2 a0 00 00:04:39.497 SMART ENABLE OPERATIONS
b0 d0 ff 00 4f c2 a0 00 00:04:38.150 SMART READ DATA
b0 d8 ff 00 4f c2 a0 00 00:04:37.834 SMART ENABLE OPERATIONS
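The two numbers worth pulling out of a log like this are the lifetime error count and the power-on time. Here is a minimal sketch of how that could be scripted; the function names ata_error_count and hours_to_days are my own, hypothetical helpers, and the parsing assumes the "ATA Error Count:" line looks exactly like the one in the output above.

```shell
# Read `smartctl -l error` output on stdin and print the lifetime ATA error count.
# Assumes a line of the form "ATA Error Count: <number> ..." as in the log above.
ata_error_count() {
    sed -n 's/^ATA Error Count: \([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Convert power-on hours to "D days + H hours", the same way smartctl prints it.
hours_to_days() {
    hours=$1
    echo "$((hours / 24)) days + $((hours % 24)) hours"
}

# Typical use (needs root and smartmontools installed):
#   smartctl -l error /dev/sda | ata_error_count
#   hours_to_days 56133
```

An empty result from ata_error_count just means no matching line was found, so it is worth looking at the full log before concluding that the drive is healthy.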
So, I replaced the bad disk with another one that was lying around, thinking this would solve the issue for the time being. I was wrong...
Issues Not Resolved...
While I was installing Debian 10 on the replacement disk, the installation still froze. Uh-oh. The old disk certainly had issues, but it didn't seem to be the only problem.
Next, I suspected the memory modules. I reseated all 4 modules, but that didn't help. I then installed memtest86+ and ran it. While it was running, I noticed that the CPU temperature was skyrocketing. It went above 100°C!!! I should have known, since the CPU fan had been complaining loudly for a while.
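In hindsight, watching the CPU temperature from the running system would have pointed at the real problem much earlier. Below is a rough sketch of such a check. It assumes a Linux kernel that exposes temperatures in millidegrees Celsius under /sys/class/thermal; the thermal_zone0 path, the 90-degree default limit, and the function names millic_to_c and check_temp are all my own assumptions and may not match your hardware.

```shell
# Convert a sysfs reading in millidegrees Celsius to whole degrees Celsius.
millic_to_c() {
    echo "$(( $1 / 1000 ))"
}

# Print a warning when a reading (millidegrees C) is at or above a limit (degrees C).
# The default limit of 90 is an illustrative assumption, not a vendor specification.
check_temp() {
    reading=$1
    limit=${2:-90}
    temp_c=$(millic_to_c "$reading")
    if [ "$temp_c" -ge "$limit" ]; then
        echo "WARNING: ${temp_c}C"
    else
        echo "OK: ${temp_c}C"
    fi
}

# Typical use (the zone number varies from system to system):
#   check_temp "$(cat /sys/class/thermal/thermal_zone0/temp)"
```

Running something like this from cron, or simply checking it by hand when the fan gets loud, is one possible way to notice overheating before the system starts freezing.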
Really, Really Dusty CPU fan
In the past, I had opened the HPE-360z from time to time and blown out the dust with a can of compressed air, but I had never cleaned the CPU heatsink and fan. Oh boy, I was so surprised to see chunks of dust stuck in the fan and between the heatsink fins. I couldn't believe how much dust had collected over the years (the GPU fan was just as dusty as the CPU fan). I also noticed that the CPU was stuck to its heatsink, and I was unable to separate them. While trying, I damaged some of the CPU's pins... so I decided to upgrade the CPU.
It's Time To Upgrade
Yeah, finding parts for a 10-year-old PC wasn't that easy. You might think it's better to buy a new, better system, but I had been with this one for 10 years and had maxed out its memory and all. I couldn't really bring myself to toss it out just like that.
The fastest CPU that my motherboard could take was the AMD Phenom II X6 1090T Black Edition, according to the HP website and other sources. Since the old CPU was stuck to its heatsink and the system couldn't boot anymore, I couldn't check which CPU I had, so I wasn't sure whether this one would actually be an upgrade. Nevertheless, I decided to get it, along with another heatsink and fan.
After Upgrading the CPU
I was a bit worried that it wouldn't boot without a BIOS update, but the system booted up successfully with the new CPU and heatsink/fan. Maybe the old CPU was an AMD Phenom II X6 1090T after all.
Now my PC is working, and better than before. No more complaining from the CPU fan. I still have that bad hard disk and need to do something about it, but that's something to deal with later.