Re: A89: TIGCCLIB 2.0 released!


Hi!

> Considering that all the HW2 grayscale routines are branched off of the
> one I wrote, I felt inclined to look at this one.

Almost all the grayscale routines I have seen before are exactly the same
as mine.

> Since I have a special version of VTI meant to emulate HW2 to some degree
> AND a HW2 calc (for those of you who don't know, my calc was stolen and 
> my new one is HW2), I have to point out that it's gonna produce some 
> pretty shitty grayscale.

My previous grayscale routine (the one from TIGCCLIB 1.1) produced quite a
fine picture on HW2-VTI, but a terrible one on real HW2. The new grayscale
routine produces bad (flickering) grayscale on HW2-VTI, but my beta testers
told me that the picture is absolutely stable on real HW2 (I also checked
the grayscale demo on one HW2 calc, and it was stable).

> I don't mean to insult you or your code; technically, it's somewhat
> functional.  But, as you know, we're copying a buffer to LCD_MEM every
> fourth time int1 is called.  That's incredible processor intensive, even
> with our extra 2 MHz on HW2.

Yes.

> [on a side note: I see __gmax is being used as the value to reset the
> counter, __gcnt, to.  Why?  

You may have noticed that __gmax is 2 on HW2 and 3 on HW1. Look at the
instruction move.l %d1,__gcnt. It also sets __gmax...

> a higher number could be used to intentionally flip between two
> images), then it's understandable - but you don't do that.

I am sorry, but I can't understand what you are talking about in this
sentence...

> So why not make it a constant 3

Because it is not a constant. Btw, when it WAS a simple constant 3, it 
flickered enormously on a real HW2 calc (but not on HW2-VTI). I concluded
that it MUST be 2 on HW2 and 3 on HW1. Julien Muchembled told me the same
thing. I don't know why.

> and save the few precious clock cycles necessary to read
> from a relative address?

A loss of 10 clock cycles 87.5 times per second is a degradation of
only 0.0087%, which is really not much...

> While that routine is optimal size-wise, it's just too slow to be anything
> but extremely ugly.  First of all, you can't afford to have the interrupt
> determine which routine to use.  Instead, have two separate routines, and
> have the grayscale manager determine which one to install.

Again, the speed degradation is about 0.013%, by my calculation.

> Second, the loop that I quoted above MUST be unrolled.

Universal OS and DoorsOS do not unroll the loop either!

> I recommend fully unrolling it, or, if you insist on maintaining some 
> kind of balance between speed and size, limit it to only 8 loops at 
> most.

I had an idea for keeping the size small with an unrolled loop: allocate a
buffer and construct the unrolled loop in it in the initialization routine :-)
But the authors of the shells (Julien and Xavier) told me that, in their
experience, unrolling is not necessary.
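
A rough sketch of how such an initialization routine might fill the buffer.
Everything here is an assumption for illustration, not code from TIGCCLIB:
the function name build_unrolled_copy is hypothetical, one plane is assumed
to be 3840 bytes, and the copy is assumed to use move.l (%a0)+,(%a1)+
(opcode 0x22D8) ended by rts (opcode 0x4E75).

```c
#include <stddef.h>
#include <stdint.h>

#define PLANE_BYTES 3840u              /* assumed size of one LCD plane */
#define PLANE_LONGS (PLANE_BYTES / 4u) /* longwords to copy per plane */
#define OP_MOVE_L   0x22D8u            /* 68000: move.l (%a0)+,(%a1)+ */
#define OP_RTS      0x4E75u            /* 68000: rts */

/* Size of the generated routine in 16-bit words: one opcode per
   longword copied, plus the final rts. */
#define COPY_CODE_WORDS (PLANE_LONGS + 1u)

/* Fill `code` with the fully unrolled copy routine.
   Returns the number of words written, or 0 if the buffer is too small. */
static size_t build_unrolled_copy(uint16_t *code, size_t capacity_words)
{
    size_t i;

    if (capacity_words < COPY_CODE_WORDS)
        return 0;
    for (i = 0; i < PLANE_LONGS; i++)
        code[i] = OP_MOVE_L;       /* one copy instruction per longword */
    code[PLANE_LONGS] = OP_RTS;    /* return to the caller */
    return COPY_CODE_WORDS;
}
```

With these assumptions the generated routine takes 961 words (1922 bytes) of
RAM at run time, while the program itself only carries the short generator.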

> That loop is currently using up 1,442
> clock cycles and being called 87.5 times per second.  That's 126,179
> clock cycles per second - MASSIVE speed loss.

It seems massive, but when you turn it into a percentage, it is 1.26% :-)
Not noticeable in practice.
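
The percentages in this message can be reproduced with a small helper. This
is only a sketch: the name overhead_percent is hypothetical, and the 10 MHz
clock is an assumption (it is the figure implied by the 1.26% above).

```c
/* Assumed CPU clock in Hz (implied by the 1.26% figure, not measured). */
#define CPU_HZ 10000000.0

/* Percentage of CPU time used by work that costs `cycles` clock cycles
   and runs `rate` times per second. */
static double overhead_percent(double cycles, double rate)
{
    return cycles * rate / CPU_HZ * 100.0;
}
```

For example, overhead_percent(10, 87.5) gives about 0.0087 and
overhead_percent(1442, 87.5) gives about 1.26.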

> If I'm reading your code correctly, there is one MAJOR BUG in your
> grayscale code. On HW2, LCD_MEM is constantly being written to and
> therefore anything drawn directly there fails. However, on HW1, LCD_MEM
> is plane 0. Therefore, many programmers assume that anything written to
> LCD_MEM will draw to plane 0. TIOS functions, by default, draw to
> LCD_MEM.  Thus many programmers assume they can call drawstrxy or any
> other TIOS function right after turning on grayscale and it will draw to
> the screen, which is NOT true on HW2!

This is not a bug but a feature; it is noted in the documentation. To quote:
"In a grayscale mode, don't assume that any plane is on LCD_MEM due to
HW2 support..."

> I suggest that you call PortSet and set the drawing area to plane 0
> when grayscale is turned on, and PortRestore() when it is turned off.

Good idea. Btw, you can see that PortRestore is already called when
grayscale is turned off.

> You may want to leave out those port changes and put them in the
> documentation, as a few programs won't need them, but that number is so
> few and the number of bad programmers who will ignore documentation is
> so many that I highly recommend that you have the routines do it.

I accept your idea. Btw, the common rule is: "if really nothing else helps,
then read the documentation" :-)

> And now, a question or two for the master: Is there any way I can use
> constants/defines from the C code in inline ASM?  Since the ASM code
> is in a string, it's not going to be modified by the compiler in any
> way, so anything in there will be interpreted literally, correct?

There IS a method, but it is rather awkward (see, for example, my definition
of the _ram_call macro in compat.h). Define the following two macros (the
second level is needed so that the argument is macro-expanded before it is
turned into a string):

#define _xstr_(x) #x
#define _str_(x) _xstr_(x)

Suppose that you have the following define:

#define LCD_MEM 0x4C00

Then, because adjacent string literals are concatenated, the following
statement:

asm("lea "_str_(LCD_MEM)",%a0")

will expand to

asm("lea 0x4C00,%a0")

I hope that this is what you wanted to ask.
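
A small host-side sketch that shows the resulting string without needing an
assembler. The macro names XSTR_ and STR_ and the two variables are
hypothetical, chosen only for this illustration; two macro levels are used
so that LCD_MEM is expanded to its value before it is stringified.

```c
#define XSTR_(x) #x        /* stringifies its argument exactly as written */
#define STR_(x)  XSTR_(x)  /* expands the argument first, then stringifies */

#define LCD_MEM 0x4C00     /* same value as in the example in this message */

/* The text that would be passed to asm(); the three adjacent string
   literals are concatenated by the compiler. */
static const char *const lea_text = "lea " STR_(LCD_MEM) ",%a0";

/* For comparison: a single level of stringification keeps the macro name. */
static const char *const raw_text = XSTR_(LCD_MEM);
```

Here lea_text holds "lea 0x4C00,%a0", while raw_text holds "LCD_MEM",
which is why the extra macro level matters.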

> Also, I see you don't use volatile anywhere - when exactly will 
> the compiler attempt to optimize the ASM code itself (and possibly
> screw it up)?

The compiler will not screw up any ASM code which is defined outside of
any function. Only code which is embedded in functions may possibly be
optimized.

> And that's my lengthy discourse for the day - Zeljko, I mean no offence
> to your code; I'm just providing some constructive criticism in an 
> attempt to prove it

No problem, I am open to any criticism, especially the constructive 
kind :-) And I think that you wanted to say "improve" instead of "prove"...

> (and get it compatible with my calc! =)

Have you really tried it on your real calc, or only on HW2-VTI? As I
said, HW2-VTI behaves incorrectly!!!

Cheers,

Zeljko Juric