Re: A89: matrix


That's awesome!  But I think I found one error that (hehe) everyone can
assume is a typo, right?

>   dbra    d1,loop                ; Do the loop if needed

Isn't d1 being used as one of the transfer registers?  I think you meant to
put d0 in here.  Correct me if I'm wrong, but I think d1 would change its
value every loop, and you'd end up moving too much or too little of the
memory you wanted to move.
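
For reference, here is a minimal sketch of the quoted loop with d0 as the
counter. The register roles are assumptions, not something stated in the
original post: a0 = source, a1 = destination, d0.w = number of 96-byte
chunks minus one. As far as I know, the 68000 does not accept (An)+ as the
destination of a register-to-memory movem, so the sketch stores through
(a1) and 48(a1) and advances a1 with lea instead. The extra lea adds a
little fetch overhead, so the bandwidth figures quoted below would shift
slightly.

```asm
; Sketch only. Assumed on entry:
;   a0   = source address
;   a1   = destination address
;   d0.w = (number of 96-byte chunks) - 1, since dbra runs d0+1 times
        movem.l d1-d7/a2-a6,-(sp)      ; save the 12 transfer registers
loop:
        movem.l (a0)+,d1-d7/a2-a6      ; load 48 bytes, a0 advances by 48
        movem.l d1-d7/a2-a6,(a1)       ; store them at a1
        movem.l (a0)+,d1-d7/a2-a6      ; load the next 48 bytes
        movem.l d1-d7/a2-a6,48(a1)     ; store them at a1+48
        lea     96(a1),a1              ; advance destination by 96 bytes
        dbra    d0,loop                ; d0 is not a transfer register
        movem.l (sp)+,d1-d7/a2-a6      ; restore the saved registers
```

Because d0 is outside the movem register list, the loads never overwrite
the loop count.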

On a side note: why not use the register a7?

-Miles Raymond      EML: m_rayman@bigfoot.com
ICQ: 13217756       IRC: Killer2        AIM: KilIer2 (kilier2)
http://www.bigfoot.com/~m_rayman/

----- Original Message -----
From: Zoltan Kocsi <zoltan@bendor.com.au>
To: <assembly-89@lists.ticalc.org>
Sent: Thursday, July 08, 1999 5:24 AM
Subject: Re: A89: matrix


> Olle Hedman writes:
>  > you don't have much of a choice other than move.l (Ax)+,(Ay)+
>
> Actually, you do. If you have lots of stuff to move then a little more
> cost on the preparation/cleanup side doesn't matter if you can save
> heaps on the actual transfer. In that case this might help:
>
>   movem.l d1-d7/a2-a6,-(sp)      ; Save all registers
> loop:
>   movem.l (a0)+,d1-d7/a2-a6      ; Suck in 48 bytes at once
>   movem.l d1-d7/a2-a6,(a1)+      ; Store them at destination
>   movem.l (a0)+,d1-d7/a2-a6      ; Suck next 48 bytes in
>   movem.l d1-d7/a2-a6,(a1)+      ; Store at destination
>   ...
>   dbra    d1,loop                ; Do the loop if needed
>   movem.l (sp)+,d1-d7/a2-a6      ; Restore registers
>
> With the move.l method you waste 1 bus cycle on the insn fetch for
> every 4 bytes moved. With the movem.l method you waste 4 cycles for
> every 48 bytes moved, that is, your bus bandwidth loss goes from 20%
> (1 cycle in 5) to only 7.7% (4 in 52). Of course if your block size is
> known a priori and it is small enough to warrant a loop unroll, then
> your d0 becomes free so you can move up to 52 bytes per 2 insns, which
> further decreases the bandwidth waste to 7.1%. If you need absolutely
> everything that is possible, you can disable the interrupts, save a7
> to some known location and include a7 in the transfer too - your
> wasted bandwidth will reach an all-time low of 6.7%.
>
> It's an old trick which was worth doing on a 68000. With the advent of
> the 68010 it went out of fashion, because the 68010 and all later CPUs
> have a loop mode (or equivalent) where data blocks can be moved
> without insn fetches slowing down the transfer (i.e. your copy speed
> is only limited by the actual transfer speed of the bus). On the old
> 68000, however, the above method was quite popular when you needed
> those few extra bus cycles.
>
> Regards,
>
> Zoltan


