posted by Ninho(R), 23.04.2008, 10:06

Hi Japheth! I don't know if this is the appropriate place to discuss this, or even whether you're interested in discussing it, but anyway... I briefly reviewed your HIMEMX code and comments, and I think there's a point that is not entirely correct here:

mov cr0,eax
;--- the 80386 (and 80486?) need a short delay after switching to PM
;--- before a segment register can be set! Any instruction is sufficient.
dec ax ; clear PE bit
mov ds,dx

;;;; ... etc.

The point is not that "a short delay" is required; it's that there are incorrectly predecoded instructions. The predecode queue, distinct from the prefetch queue, holds up to three complete instructions in the 386! If the "mov ds,dx" instruction followed the "mov cr0,eax" immediately, it would for sure be predecoded while still in real mode, and when subsequently executed, even though PE=1, it wouldn't do what you expect from a protected mode segment load.

How many predecoded instructions are in the predecode queue at a given point in time depends in a complex way on the length of the instructions and their alignment (for filling/refilling the prefetch queue). Since your code has been time-tested, I assume the bug does not manifest itself any more, but it's not a robust fix either.

The 486 changed how instructions were prefetched and predecoded. Because of the on-die cache, the prefetch queue is in fact smaller, if it even exists; details are not easily found and are difficult to ascertain by experiment without snooping hardware cycles. Also, decoding is done very differently from the 286 and 386. The details, of course, were never documented by Intel.

Anyway, the good news is that there is a 100% correct and documented fix: you must do a near jmp $+2 right after the move to cr0. This flushes both the instruction prefetch and predecode queues.
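
Applied to the fragment quoted above, it would look something like this. It's only an untested sketch, and I'm assuming the surrounding code is unchanged: GDT already loaded, interrupts disabled, EAX holding the CR0 value with PE set, and DX holding the data selector.

mov cr0,eax ; set PE, CPU is now in protected mode
jmp $+2 ; jump to the next instruction: flushes the prefetch and predecode queues
dec ax ; clear PE bit in AX (bit 0 was set, so DEC clears it)
mov ds,dx ; now decoded and executed as a protected mode segment load

;;;; ... etc.

The "dec ax" is kept only because the code needs it later to leave protected mode; with the jmp in place it no longer has to act as a filler, and its position relative to the segment load doesn't matter.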




