Back to home page

DOS ain't dead

Forum index page

Log in | Register

Back to the forum
Board view  Mix view

Assembler optimalisation - how to avoid a jump? (Developers)

posted by marcov, 13.05.2012, 23:22

> >shr ecx,1;rep movsd;adc ecx,ecx;rep movsw {fast 32bit write}
> If I want to transfer 2 pixels than I move 4 bytes. So:
> 2 shr 1 = 1 (and CF is set to zero)
>
> REP MOVSD with ECX=1 does one pass of dword transfer so two pixels are
> moved
> After this ECX is guaranted to be zero. And because CF is zero too, after
> ADC ECX,ECX is ECX still zero. Then...
> REP MOVSW with ECX=0 SKIPS THE TRANSFER
>
> That is how this piece works.

The fast move routine that I use is the SSE (move_JOH_SSE_10) routine in the archive on this side:

http://fastcode.sourceforge.net/

(the results of some runtime optimization contests for Delphi, most of them flowed into D2006, but for more specialistic uses I still browse through them, even if slightly outdated)

I also use it for images (but my rowlengths are typically long, 2048 1-byte pixels etc). We have a athlon64/core2 minimum though.

Note that afaik it is common to first align with a few movsb , and then do the bulk with the largest granularity move aligned, and then maybe another few odd ones.

 

Complete thread:

Back to the forum
Board view  Mix view
22760 Postings in 2121 Threads, 402 registered users (0 online)
DOS ain't dead | Admin contact
RSS Feed
powered by my little forum