DOS ain't dead - Question about some x86 opcodes

Question about some x86 opcodes (Developers)

posted by mht , Wroclaw, Poland, 02.11.2008, 14:02

Hello,

I have a question about specific x86 opcodes that I hope someone here can answer.

The following instructions:
ADD reg/mem16, immediate
ADC reg/mem16, immediate
SUB reg/mem16, immediate
SBB reg/mem16, immediate
CMP reg/mem16, immediate
have two forms, one with a 16-bit immediate value and the other with an 8-bit immediate value that is sign-extended to 16 bits, for example:
ADD DI, 1234h -> 81 C7 34 12
ADD DI, 12h -> 83 C7 12
(In general, bits 5..3 of the second byte, the ModR/M byte, select the operation: 000=ADD, 010=ADC, 011=SBB, 101=SUB, 111=CMP.) However, it is unclear whether the remaining three instructions in this group, the logical ones (001=OR, 100=AND, 110=XOR), also accept "short" immediates, i.e., whether
OR DI, 12h -> 83 CF 12 "short"
is valid, or
OR DI, 12h -> 81 CF 12 00 "long"
must be used instead.
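
To make the two encodings concrete, here is a minimal Python sketch that builds both forms for a 16-bit register operand. The names `encode_group1` and `fits_sign_extended_imm8` are hypothetical, made up for this illustration. It only covers the register case (mod = 11) and ignores the separate accumulator-specific opcodes that assemblers often prefer for AX. It also says nothing about whether a given CPU accepts the "short" form for OR/AND/XOR, which is exactly the open question.

```python
# Bits 5..3 of the ModR/M byte select the operation within the group.
GROUP1_OPS = {"ADD": 0, "OR": 1, "ADC": 2, "SBB": 3,
              "AND": 4, "SUB": 5, "XOR": 6, "CMP": 7}
# Register numbers placed in bits 2..0 (r/m field) when mod = 11.
REG16 = {"AX": 0, "CX": 1, "DX": 2, "BX": 3,
         "SP": 4, "BP": 5, "SI": 6, "DI": 7}

def fits_sign_extended_imm8(imm):
    """True if sign-extending the low byte of imm gives back the 16-bit value."""
    imm &= 0xFFFF
    low = imm & 0xFF
    extended = (low | 0xFF00) if (low & 0x80) else low
    return extended == imm

def encode_group1(op, reg, imm, short=True):
    """Encode 'op reg16, imm' as opcode 83 (imm8) or opcode 81 (imm16)."""
    modrm = 0xC0 | (GROUP1_OPS[op] << 3) | REG16[reg]
    imm &= 0xFFFF
    if short and fits_sign_extended_imm8(imm):
        return bytes([0x83, modrm, imm & 0xFF])
    # Long form: 16-bit immediate, low byte first.
    return bytes([0x81, modrm, imm & 0xFF, imm >> 8])

if __name__ == "__main__":
    for args in [("ADD", "DI", 0x1234), ("ADD", "DI", 0x12), ("OR", "DI", 0x12)]:
        print(args, encode_group1(*args).hex())
    print("OR long:", encode_group1("OR", "DI", 0x12, short=False).hex())
```

Passing `short=False` forces the 81 form, which is what a tool sticking to the old 8086 documentation would emit for OR/AND/XOR.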

The "short" form for OR/AND/XOR is certainly documented for the 386 and later, and seems to be undocumented for earlier x86 processors (I even remember seeing an 8086 opcode chart with empty spaces in the corresponding places, but I no longer have it).

However, many assemblers and compilers actually do use the "short" forms even when targeting the 8086! These are the results of my testing:

JWASM (recent version)               "short"
NASM with "-O2" (0.98.38)            "short"
TASM 3.1                             "short"
TD 3.1 integrated asm                "short"
BC++ 3.1 code generator              "short"
BC++ 3.1 inline asm                  "long"
TP 7.0 code generator                "long"
TP 7.0 inline asm                    "long"
TP 7.0 (in runtime library)          "short"
MS-DOS DEBUG (WinXP version)         "long"
DR DOS 6.00 SID (R3.2) debugger      "long"
DR-DOS 7.03 DEBUG (R1.51)            "short"
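
For anyone who wants to check another tool, one possible approach is to assemble a single `OR DI, 12h` and look at the first bytes of the output. The following Python sketch classifies such bytes; the function name `classify_group1_imm16` is hypothetical, and it assumes plain 16-bit code with no prefixes in front of the instruction.

```python
# Operation names indexed by bits 5..3 of the ModR/M byte.
OPS = ["ADD", "OR", "ADC", "SBB", "AND", "SUB", "XOR", "CMP"]

def classify_group1_imm16(code):
    """Return (operation, 'short' or 'long') for an 81/83 encoding, else None."""
    if len(code) < 2 or code[0] not in (0x81, 0x83):
        return None
    op = OPS[(code[1] >> 3) & 7]
    form = "short" if code[0] == 0x83 else "long"
    return (op, form)

if __name__ == "__main__":
    for hexbytes in ["83cf12", "81cf1200", "83c712"]:
        print(hexbytes, classify_group1_imm16(bytes.fromhex(hexbytes)))
```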

Any additions (particularly, Microsoft and Intel tools) and/or corrections to the above list are of course welcome, but even more welcome is an explanation!

Do assemblers and compilers from major software vendors really produce code that does not run on some processors? That seems rather incredible...

My hypothesis is that the undocumented "short" opcodes for AND/OR/XOR actually do work on all Intel chips and clones (such as the NEC V20). Older compilers/assemblers (and parts of their old code still remaining in newer versions) rely on the old official Intel specs and always use the "long" form. Newer versions know that the short forms are always safe and use them. But this is only a hypothesis.

Even in the DOS world, many software writers do not care about anything below the 386 nowadays. But I like to maintain compatibility whenever possible. And, last but not least, the ability to save one byte of code is sometimes critical ;-)

Can anyone help?

Michal
mht@bttr-software.de
