Unixronin

Saturday, August 28th, 2004 10:14 am (UTC)
Are you sure the Evil Empire is responsible for this one? "...a cycle where the rover would reboot itself, over and over" certainly sounds like Microsoft product behavior, and marking deleted files with a special character is an old MS-DOS trick (actually I think it dates back to QDOS)--but "DOS" is a generic term, and Microsoft isn't mentioned in the story. The OS designer/vendor was Wind River.

It looks like the real problem was that at just one point some people allowed themselves to forget that they were programming an embedded system and allowed a piece of third-party software to be a memory hog.

Moral: rocket science has no tolerance for sloppiness, not even a tiny bit of it in an otherwise stunningly brilliant project.

ObMicrosoftSlam: If the rovers were running a Microsoft OS, we'd all have been getting spam relayed from Mars for months now.
Saturday, August 28th, 2004 11:29 am (UTC)
If my memory is correct, Microsoft came up with the FAT filesystem in the first place, right? It's not properly Wind River's fault; they may not have known that the third-party tool (whatever it is, which is unspecified) required all the directory structures on the flash RAM to be mirrored in main memory. Certainly when they designed the OS, they weren't to know that NASA would use it on a rover with limited main memory and a FAT-formatted flash RAM device. It wasn't the mirroring of directory structures into main memory that in itself caused the problem; with a decent filesystem, that wouldn't have been an issue. Lots of very good operating systems do that.

The root problem was the use of a filesystem on the flash RAM in which directory structures grow forever, because directory entry slots are never re-used, in an environment in which large numbers of small data files are constantly being created and written to the flash RAM. From reading NASA's description of the issue, if FAT recycled directory entries from deleted files instead of keeping them around forever, the problem would never have occurred.
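To make the growth pattern concrete, here is a rough toy model of the behaviour described above. It is only a sketch: `ToyDirectory`, `recycle_slots`, and `table_size_after` are made-up names for illustration, and none of this is real FAT driver or rover code. The one real detail is the 0xE5 byte, which FAT writes over the first character of a name to mark the entry as deleted.

```python
DELETED = 0xE5  # FAT overwrites the first name byte with 0xE5 on delete


class ToyDirectory:
    """Hypothetical flat directory table (illustration only, not real FAT)."""

    def __init__(self, recycle_slots):
        self.recycle_slots = recycle_slots
        self.slots = []  # each slot is a bytearray holding a file name

    def create(self, name):
        entry = bytearray(name.encode("ascii"))
        if self.recycle_slots:
            # Reuse the first tombstoned slot, if there is one.
            for i, slot in enumerate(self.slots):
                if slot[0] == DELETED:
                    self.slots[i] = entry
                    return
        # Otherwise the table grows by one slot.
        self.slots.append(entry)

    def delete(self, name):
        target = name.encode("ascii")
        for slot in self.slots:
            if bytes(slot) == target:
                slot[0] = DELETED  # tag as deleted; the slot stays in the table
                return
        raise FileNotFoundError(name)


def table_size_after(cycles, recycle_slots):
    """Create and delete `cycles` small files; return the slot count."""
    directory = ToyDirectory(recycle_slots)
    for i in range(cycles):
        name = f"DATA{i:04d}"
        directory.create(name)
        directory.delete(name)
    return len(directory.slots)


if __name__ == "__main__":
    print("never reused:", table_size_after(1000, recycle_slots=False))
    print("recycled:    ", table_size_after(1000, recycle_slots=True))
```

Without recycling, the table ends up with one slot per file ever created, even though every file has been deleted. With recycling it stays at a single slot. Anything that mirrors the whole table in main memory pays for every one of those dead slots.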

I know, the original intention of simply tagging files as deleted and keeping the directory entries in place was to allow for file undeletion. But given the nature of FAT and how it allocates disk space, the odds are against any deleted file being recoverable at all across more than a single reboot anyway, and even then, file undeletion on FAT is a hazardous operation highly likely to result in crosslinked files and a corrupted filesystem.