DOS ain't dead

SSE instructions in DOS programs? (Developers)

posted by marcov(R), 04.01.2019, 23:37

> BTW you will need extra alignment for SSE data (and
> maybe instructions?) I think 16B boundary.

(I do some SSE2/SSE3 and AVX2 at work for image processing routines, not under DOS. I don't know all the system-level details, such as the CR4 bits, since we deliver the machines the code runs on. But I do know how to craft simple routines like image rotation, colour conversion and kernel routines.)

Whether misalignment causes an exception or just a penalty depends on the exact CPU and instruction set (SSE3 has unaligned-load instructions that, afaik, are aliases for the normal ones on the Core 4xxx series and newer (?)). But it is indeed better to align naturally, iow to align x-byte operations on x rounded up to a power of two, so that an operation never crosses a 4K/8K page boundary.
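To make the "naturally aligned accesses never cross a page" point concrete, here is a minimal sketch in plain C. The helper names (round_up_pow2, crosses_boundary, naturally_aligned) are made up for illustration, and 4096 is just the usual x86 page size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round x up to the next power of two (x >= 1). */
static size_t round_up_pow2(size_t x)
{
    size_t p = 1;
    while (p < x)
        p <<= 1;
    return p;
}

/* Nonzero if the access [addr, addr + len) spans a multiple of `boundary`.
   `boundary` must be a power of two, e.g. 4096 for a page. */
static int crosses_boundary(uintptr_t addr, size_t len, size_t boundary)
{
    assert(len > 0 && boundary != 0 && (boundary & (boundary - 1)) == 0);
    return (addr / boundary) != ((addr + len - 1) / boundary);
}

/* Nonzero if addr is naturally aligned for an access of len bytes,
   i.e. a multiple of len rounded up to a power of two. */
static int naturally_aligned(uintptr_t addr, size_t len)
{
    return (addr & (round_up_pow2(len) - 1)) == 0;
}
```

For example, a 16-byte access at address 4088 is not naturally aligned and straddles the 4096 boundary, while one at 4080 is aligned and stays inside the page.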

This also means that for local variables the stack must be aligned accordingly. That is sometimes a problem for 32-bit compilers (for 64-bit, 16-byte stack alignment is part of the standard ABI, so there the problem shifts to 32-byte alignment with AVX2).
(But that probably needs support for a 16-byte datatype too, since a 16-byte array of bytes aligns to 1 byte, not 16.)
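A small sketch of the datatype point, assuming a C11 compiler (for _Alignas, _Alignof and static_assert). The type names bytes16 and vec16 and the helper is_aligned are hypothetical, not from any particular compiler or library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Plain byte array: 16 bytes in size, but only 1-byte alignment is required. */
typedef unsigned char bytes16[16];

/* Hypothetical 16-byte datatype: same size, but the compiler must place
   every object of this type on a 16-byte boundary. */
typedef struct {
    _Alignas(16) unsigned char b[16];
} vec16;

static_assert(_Alignof(bytes16) == 1, "byte array aligns like its element");
static_assert(_Alignof(vec16) == 16, "vec16 carries its own alignment");
static_assert(sizeof(vec16) == 16, "no padding expected");

/* Nonzero if p is a multiple of `alignment` (a power of two). */
static int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p & (alignment - 1)) == 0;
}
```

A local vec16 should then pass is_aligned(&v, 16), provided the compiler can actually keep the stack aligned, which is exactly the 32-bit problem mentioned above.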

The AVX-512 rollout seems to have stalled, and it is not widely available, even on newly bought Intel machines. Some of the high-core-count server CPUs clocked back heavily when using AVX2 on all cores. Possibly it is too expensive at the current node, and the whole rollout is stalled because of Intel's process problems.

Ryzen has AVX2, but with some inefficiencies (it uses two 128-bit execution units to handle a 256-bit instruction), which is rumoured to be fixed in the Ryzen 3000 series this summer. That said, you usually get more cores per buck, so effectively it is still pretty ok.

> There was a problem that SSE
> code of FFMPEG didn't work under DOS/DJGPP so I disabled all inline asm
> code for SSE but I never went deep into it if there's some workaround.

If e.g. the ffmpeg library expects the stack to be aligned, but some code calling into ffmpeg doesn't keep it aligned, then you get trouble.
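One possible workaround for that kind of mismatch, sketched under assumptions: a routine that cannot trust the caller's stack alignment over-allocates a local buffer and aligns a pointer into it by hand. This is not how ffmpeg does it; align_up16 and sum16 are made-up names, and the byte loop only stands in for real SSE code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VEC_ALIGN 16u

/* Return the first address >= p that is a multiple of VEC_ALIGN. */
static unsigned char *align_up16(unsigned char *p)
{
    uintptr_t a = ((uintptr_t)p + (VEC_ALIGN - 1)) & ~(uintptr_t)(VEC_ALIGN - 1);
    return p + (a - (uintptr_t)p);
}

/* Hypothetical routine that wants a 16-byte-aligned scratch buffer but
   makes no assumption about how the caller aligned the stack. */
static unsigned sum16(const unsigned char *src)
{
    unsigned char raw[16 + VEC_ALIGN - 1]; /* 15 spare bytes for alignment */
    unsigned char *scratch = align_up16(raw);
    unsigned total = 0;
    size_t i;

    assert(((uintptr_t)scratch & (VEC_ALIGN - 1)) == 0);
    memcpy(scratch, src, 16);              /* src may be unaligned */
    for (i = 0; i < 16; i++)               /* stand-in for aligned SSE loads */
        total += scratch[i];
    return total;
}
```

GCC also documents a -mstackrealign option and a force_align_arg_pointer function attribute for x86, which realign the stack on function entry. I haven't checked how well those work with DJGPP.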

