Monday, September 5, 2011

Strange Hard Drive Behaviour On MacBook Pro

Experienced some strange and, thus far, inexplicable behaviour between a MacBook Pro and its hard drive. The Mac would intermittently go off into la-la land, allowing the user to do nothing more than move the mouse cursor, which had turned into the typical spinning kaleidoscope. It was like the Mac entered a temporary coma upon random user interaction like opening a folder, starting an app, browsing the internet... pretty much anything could trigger it. However, the length of time that the Mac remained in the coma seemed to increase with each occurrence, sometimes taking 5+ minutes to wake up. Eventually, it would just stay in this coma state, requiring a forced restart. Note that during each coma minimal life could be detected (mouse still moves, some mouse-overs still work, menus may still drop down) and I could always put my ear to the laptop and hear the hard drive clicking away like mad in search of something it never seemed to find.

At first, forced shut-downs resulted in the Mac taking 20-30 minutes to boot up again. But eventually each forced shut-down resulted in the Mac "losing track" of the hard drive. Booting up the Mac in Verbose mode yielded various complaints, all related to issues with the hard drive (could not find it, I/O errors, time-outs, etc.), and eventually the screen would just continuously roll over with error messages. If not in Verbose mode you'd miss out on all this fun and be left staring at an Apple logo slowly burning itself into your LCD.

Sometimes, enough forced restarts would get the Mac to boot again (albeit 30 minutes later), but most times it required a boot from the system disk, followed by a so-called 'successful' disk repair, followed by another 30 minutes of boot up, followed by more random comas, followed by another forced shut-down... and so the cycle continued until even a boot from the system disk could do nothing. Why? Because at that point the Mac could no longer even mount the drive! (Note: in case you're wondering, I wasn't continuing the cycle just for fun but, rather, with each attempt trying to recover data from the drive before its suspected death.) Now, with the Mac not even finding the drive, I assumed it dead... Wrong!

Now things get interesting. A scheduled trip to a Mac 'genius' at the local Mac store resulted in the 'genius' proclaiming my hard drive dead and suggesting I buy a new one off the internet. Ok, I'm cool with that, but I still want to know WHY the stupid thing died after just 1-1/2 years. Thankfully the 'genius' said getting to the drive was easy as cheese and instructed me to just take out the screws on the bottom. Right on, let's see what's in this baby! So I took the drive out and popped in another to see if I could isolate the issue. As suspected, the Mac finds the new drive w/o issues and I'm even able to re-install OS X and go on about life as usual. But I decide not to do that. Wanting to know why the drive died, I do a little more poking around and find that with a USB-SATA cable, my WinXP laptop can detect it just fine. Unfortunately, due to the Mac format, I can't actually read it and recover any more data. If I do the same on the Mac (plug it in over USB), it fails to mount. So I figure, since my WinXP machine can at least detect it, I'll run some tests.

First I do the obvious thing and consult the mfr of the drive, Hitachi, for a drive test. Man, that was a bad idea! Their drive test, called Drive Fitness Test (DFT), which you place on a boot-up disk, first failed to do anything other than spin the CD-ROM drive. After much wasted time troubleshooting, I eventually burned the supplied ISO again to find it now works... kind of. It works on my old WinXP box but requires the drive to be internal, and my old WinXP box doesn't support SATA. So I try the boot disk on both the Mac and a newer WinXP box. With both the new machines I just get a bunch of HIMEM/XMS errors, blah blah blah more errors, and it dies. Many more minutes wasted troubleshooting and still no go. WTF Hitachi!

So then I go to the Seagate website and download their SeaTools boot-up ISO, also for testing drives. This time it works fine, on all machines. Thanks Seagate! So I run the boot disk on the newer WinXP box, but the SeaTools 'short' test says the drive is fine. So I install the drive back in the Mac just to double-check and... no dice. Ok, so let's try the 'long' test. This one really is long. A good 3 hours later, the result is a big fat PASS. Well, some suggestions I came across on the web say that some issues may only be detected by mfr software. Except in this case the mfr s/w doesn't do diddly squat! Again, WTF Hitachi!

Ok, so for some reason I decide to try the same long test with the drive installed in the Mac. So I remove the other drive (with the new OS X install) and put the original drive back in, forget to boot up off the disk, and the Mac starts up fine (yes, on the F'd up hard drive) as if nothing ever happened. Ok, I'll say it again, WTF!

So, somehow a 'long' test (which explicitly states it doesn't affect drive data) using Seagate s/w on a Hitachi drive cured the drive. Well, I doubt that, but here's my theory. I think the Mac got into some buggy state that was "not happy" with the drive. Maybe some stored parameters (PRAM/NVRAM) were causing the issue. I learned later that these can be reset, but it's too late to try now. The drive at hand uses the relatively new "GPT Protected" partition format, so maybe there are still some bugs to iron out in the interface between OS X and this partition type. In any case, I'm convinced the state of the drive never changed, but the state of the Mac easily could have during use with the temporarily installed drive and its newly installed OS X. The fact that the problematic drive started to work again after the long test is surely just coincidence.
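For anyone curious what that "GPT Protected" label is about: as far as I understand it, a GPT disk keeps a "protective" MBR in its first sector (one partition entry of type 0xEE) so that older tools see the disk as occupied, with the real GPT header in the next sector starting with the signature "EFI PART". Below is a rough Python sketch of that check, not a diagnostic tool. The function names are my own invention, 512-byte sectors are an assumption, and the demo runs on a fake in-memory image rather than a real drive.

```python
SECTOR_SIZE = 512  # assumption: classic 512-byte logical sectors


def has_protective_mbr(sector0: bytes) -> bool:
    """Return True if sector 0 looks like a GPT protective MBR."""
    # A valid MBR ends with the boot signature bytes 0x55 0xAA.
    if len(sector0) < SECTOR_SIZE or sector0[510:512] != b"\x55\xaa":
        return False
    # Four 16-byte partition entries start at offset 446;
    # the partition type byte sits at offset 4 within each entry.
    types = [sector0[446 + 16 * i + 4] for i in range(4)]
    return 0xEE in types


def has_gpt_header(sector1: bytes) -> bool:
    """Return True if sector 1 starts with the GPT header signature."""
    return sector1[:8] == b"EFI PART"


def describe(first_two_sectors: bytes) -> str:
    """Classify the partition scheme from the first two sectors of a disk."""
    s0 = first_two_sectors[:SECTOR_SIZE]
    s1 = first_two_sectors[SECTOR_SIZE:2 * SECTOR_SIZE]
    if has_protective_mbr(s0) and has_gpt_header(s1):
        return "GPT"
    if len(s0) == SECTOR_SIZE and s0[510:512] == b"\x55\xaa":
        return "MBR"
    return "unknown"


def make_fake_gpt_image() -> bytes:
    """Build a hypothetical two-sector image that mimics a GPT disk."""
    s0 = bytearray(SECTOR_SIZE)
    s0[446 + 4] = 0xEE            # first partition entry: protective type
    s0[510:512] = b"\x55\xaa"     # MBR boot signature
    s1 = bytearray(SECTOR_SIZE)
    s1[:8] = b"EFI PART"          # GPT header signature
    return bytes(s0 + s1)


if __name__ == "__main__":
    print(describe(make_fake_gpt_image()))
    print(describe(bytes(2 * SECTOR_SIZE)))
```

To point this at a real drive you would read the first 1024 bytes from the raw device with admin rights; the device path depends on the OS and is left out here on purpose. Note that a clean result only says the partition table is readable, which fits my theory that the drive itself was never the problem.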

PS. Part of why I'm going on and on and on about this (sorry about the length) is that I've heard of multiple other cases where MacBook owners have had their hard drives declared dead long before their expected life. Maybe some of these people replaced a perfectly good drive! If you end up in this situation, maybe this blog will save your ass. Or rather, your hard drive's ass.
