Re (2): [Oberon] Reason for 'read pio error' ?

muller at inf.ethz.ch
Mon Sep 2 23:20:23 CEST 2002


Andreas Dörr <andreas.doerr at workingobjects.de> wrote:
> Concerning the automatic remapping of bad sectors: my current theory is 
> that the controller can only do this after a low-level format has marked 
> the bad sectors. Maybe someone could provide more definite information ...

I also don't know, but would speculate that no low-level format is required.

> Well, I guess almost all file systems assume that they operate on perfect 
> media. I would not expect bad block handling from AosFS.

Not really.  I think this is a true oversight, although it is very
rarely a problem.

> Creating a file that contains all these bad blocks is the most 
> straightforward solution to the problem. From my point of view it has the 
> following drawback: safe access to a partition containing a bad sector file 
> is only possible at the file level. Any tool accessing the partition at the 
> disk block level will fail.

Such a tool would have to do its own bad block handling.

> Another solution would be to implement bad sector forwarding at the disk 
> driver level. This solution is transparent to all higher levels. It is even 
> possible to access the partition at the block level. In BSD-based Unix 
> systems the command "bad144" does this.

I'm not a big fan of this, since it hides the performance hit of the 
remapping.  Urs Hölzle of Google told an interesting anecdote during
a talk at the ETH.  Google uses consumer hard disks for their servers,
and one of the problems is that they do automatic bad-block mapping 
(in hardware).  The disks keep working, but performance degrades 
significantly due to the additional seeks.  Apparently they built 
some monitoring tools to detect bad (but working) disks for replacement.
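
To make the hidden cost concrete, here is a toy model of driver-level
forwarding. It is a sketch in Python rather than Oberon, it is not taken
from bad144 or from any real driver, and every name in it (ForwardingDisk,
sequential_cost, the remap table) is hypothetical. Head movement is measured
simply as the distance in block numbers, which is an assumption, not a
model of a real disk.

```python
class ForwardingDisk:
    """Toy model of bad-block forwarding in a disk driver (hypothetical)."""

    def __init__(self, remap):
        self.remap = dict(remap)   # bad block number -> spare block number
        self.head = 0              # block the head was last positioned on
        self.seek_distance = 0     # total head movement, in blocks
        self.forwarded = 0         # number of accesses that were redirected

    def read(self, block):
        """Position the head for `block` and return the block actually used."""
        target = self.remap.get(block, block)
        if target != block:
            self.forwarded += 1
        self.seek_distance += abs(target - self.head)
        self.head = target
        return target


def sequential_cost(remap, blocks):
    """Return (total seek distance, forwarded accesses) for reading `blocks`."""
    disk = ForwardingDisk(remap)
    for block in blocks:
        disk.read(block)
    return disk.seek_distance, disk.forwarded
```

Comparing sequential_cost({}, range(10)) with sequential_cost({5: 1000},
range(10)) shows how a single forwarded block turns a sequential read into
two long seeks.  Exposing a counter like `forwarded` is one way a monitoring
tool could notice a degraded but still working disk.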

> For this solution a new common "BadBlock" error code needs to be introduced 
> to interface module "AosDisks". Otherwise a high level module like 
> "Partitions" would be bound to an error code coming from an implementation 
> of the interface module "AosDisks" (in this case "AosATADisk").

See below.

> Since I own the "offending" disk, I will try to do this.

Good!  We need more volunteers willing to get their hands dirty 
with programming (and then afterwards stand back, look at 
what was done and clean up :-).

> As a first step I 
> will implement a command which creates a file containing a given list of 
> bad blocks. If this is done, I will adapt the formatting code in 
> Partitions.Mod.

Sounds like a good approach.

> At first I will directly check for AosATADisk.Mod pio read 
> and pio write error codes.

That's too specific, since a controller with bus mastering,
or another driver, will give other error messages.  I suspect
it would be safer to treat any read error (on an in-range block)
that is not a missing media or locking error, as a bad block error. 
If the error persists after a few tries, it is likely to persist 
after the next reboot also.
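
A minimal sketch of this strategy follows, in Python rather than Oberon.
The result codes and the read_block callback are hypothetical stand-ins
and do not correspond to the actual AosDisks constants or to
Controller.Transfer(); the retry count of 3 is also just an assumption.

```python
from enum import Enum, auto


class Result(Enum):
    OK = auto()
    MEDIA_MISSING = auto()
    LOCKED = auto()
    OTHER_ERROR = auto()   # any driver-specific failure, e.g. a PIO read error


# Errors that say nothing about the state of the block itself.
NOT_BLOCK_RELATED = {Result.MEDIA_MISSING, Result.LOCKED}


def is_bad_block(read_block, block, num_blocks, tries=3):
    """Return True if an in-range `block` keeps failing for a block-related reason.

    `read_block(block)` is a hypothetical callback returning a Result.
    """
    if not 0 <= block < num_blocks:
        raise ValueError("block out of range")
    for _ in range(tries):
        res = read_block(block)
        if res is Result.OK:
            return False   # no error, or a transient one that went away
        if res in NOT_BLOCK_RELATED:
            return False   # missing media or locking: not a bad block
    return True            # still failing after `tries` attempts
```

The caller never needs to know which driver-specific code was returned,
so the same check works for PIO, bus mastering, or a different driver.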

> However, in the long run AosATADisk.Mod should 
> convert these error codes to a common error code defined in AosDisks.Mod. 
> Any idea, how the original error code could be preserved? 
> Controller.Transfer() only returns one result code.

The strategy above seems simpler than adding additional error 
return codes.  If required, you could assign additional codes to
global variables for debugging purposes, where they can be viewed
with System.State.  Or write messages with AosOut/Kernel.Write*.

"Patrik Reali" <reali at acm.org> wrote:
> In Aos, there is a command to
> write one or more blocks to a file: you should try to write the offending
> block and then (separately) the blocks after it. This could tell whether only
> one block or every block past that one shows the same behaviour.

I think Patrik means Partitions.ShowBlocks dev#part block [blocks] ~
"block" is the starting block to display in hex, and "blocks" is 
the number of blocks to display (Esc interrupts).  See Partitions.Tool.
Partitions.PartitionToFile and Partitions.FileToPartition could also be used.
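
The test Patrik describes can be sketched as follows, again in Python
rather than Oberon.  The read_ok callback is a hypothetical stand-in for
whatever actually reads one block and reports success; it is not the
interface of Partitions.ShowBlocks.

```python
def probe(read_ok, first, count):
    """Read each block in [first, first + count) separately.

    `read_ok(block)` is a hypothetical callback returning True on success.
    Returns the list of block numbers that failed.
    """
    return [b for b in range(first, first + count) if not read_ok(b)]


def describe(bad, first, count):
    """Summarise the failures found by probe() for the same range."""
    if not bad:
        return "no failures"
    if bad == [first]:
        return "only the offending block fails"
    if len(bad) == count:
        return "every block in the range fails"
    return "scattered failures"
```

If only the first block fails, a single bad sector is the likely cause; if
every later block fails as well, the problem probably lies elsewhere.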

cheers
-- Pieter

--
Pieter Muller, Computer Systems Institute, ETH Zurich / MCT Lab, Zurich
Native Oberon OS: http://www.oberon.ethz.ch/native/


