06 May 2009

My drive is failing... or not

The recent, informative post by davidz about the new storage handling in GNOME helped me a lot in understanding the error message that scared the shit out of me when trying the Fedora 11 Preview (from a LiveUSB) on my Eee PC.

[Screenshot: Palimpsest]


It was instant panic when I saw it, so obviously I was in a hurry to see the details:

[Screenshot: Palimpsest]


There is one single error: the hard drive temperature, at around 50°C, and I calmed down a bit. Later I searched a bit on Google and it looks like this is a usual hard drive temperature for netbooks, where the ventilation is limited. Pretty much a scary false alarm (but maybe I am just ignorant here).

[Screenshot: Palimpsest]


Overall, the application could be, IMO, better worded: even the items passing the test (green dots) are labeled "Pre-fail", making you feel something bad is going to happen.

[Screenshot: Palimpsest]


I don't know, maybe this is a philosophical view: anything living is going to die and anything working is going to fail, life is pre-death and working is pre-fail.

5 comments:

  1. I think the underlying problem is the default thresholds as understood by SMART. I hit a different snag: after some discussion in the bug report, it turned out my drive's firmware was giving out bogus numbers.

    Like most hardware UI, this is going to run into some snags on first introduction. The fact that you and I are finally seeing the SMART diagnostics, even if we see weirdness, will hopefully mean the SMART diagnostics will become more reliable as a health metric.

    I guess we need a "don't panic" period where people with failure notices of any type are encouraged to submit information for review, with the aim of helping to fine-tune the default thresholds to sane values.

  2. Hi,

    You are definitely encouraged to file bugs at

    http://bugzilla.gnome.org/enter_bug.cgi?product=gnome-disk-utility

    for these dialogs and graphs, with suggestions on how to improve things. FWIW, the graph / attribute stuff isn't great yet, but at least it's a start. Specifically, I too would really like to use words other than PreFail and OldAge.

    Second, make sure to use the latest DeviceKit-disks packages. Versions prior to 004-2 are known to cry wolf; the 004-2 packages have a fix where we don't take failing old_age attributes into account when deciding whether to put up the scary warning icon. More info about that here:

    https://bugzilla.redhat.com/show_bug.cgi?id=495956#c9

  3. Also, specifically, the 004-2 packages should fix the problem with your EeePC since temperature attributes are normally OldAge and we only look at PreFail attributes to determine if an icon should be shown.

  4. That application seems to work perfectly. Truth is: a harddisk is ALWAYS near (or past) a total failure. ;)

  5. @jspaleta: I think the application could use a button like "ignore this kind of error".

    @davidz: I would have commented about this on your blog, but unfortunately you have the comments closed...
    I will definitely install F11 on the Eee in a few days (probably next week, when I hope to get enough free time), update to the latest Rawhide, see if the warning still persists, and file bugs if needed.
