RE: A85: Tyrant ports

To: assembly-85@lists.ticalc.org
Subject: RE: A85: Tyrant ports
From: kaus <kaus@cybrzn.com>
Date: Fri, 13 Nov 1998 22:24:57 -0600
Delivered-To: assembly-85-outgoing@towerguard.unix.edu.sollentuna.se
In-Reply-To: <01BE0F45.404B7680@RichardLewis>
Reply-To: assembly-85@lists.ticalc.org

>Problem is, I wrote that relocation routine a while ago (several months). 
> I don't understand it anymore.  UGH.  I want to try a new method anyway. 
> Do you know how Usgard does its relocation?  Does it just remap 
>everything?

Usgard's relocation system is simple, and yet complicated at the same time.
It works the same way that DOS/WIN based PCs execute programs.  I think
UNIX/LINUX also uses the same idea, but I am not sure.

So, first I will describe how DOS/WIN PCs execute EXE programs.
All EXE files have a header section before the actual code.  This header
includes the entry point, the size of the header, the size of the actual
load module (what gets loaded into memory to be executed) and the
relocation table.  This relocation table is the key part.  It consists of
a series of offsets into the load module that need to be modified.  So,
anyway, the EXE file is read, the load module is loaded into RAM wherever
space is available, and then the relocator goes to work.  It goes through
the relocation table in the header and visits each offset in the program.
Each offset holds the segment part of a FAR reference.  This segment will
be a different number depending on where in RAM the program gets loaded.
So, it takes the value at the offset, adds the actual segment the program
got loaded into, and writes the sum back over the old value.  After it
does this for all FAR references, it actually runs the program, calling
its entry point address from the header.  When it regains control after
the program is finished, it just deallocates the RAM used by the load
module and continues on its way.
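
The PC-style fixup loop described above can be sketched in a few lines of
Python.  This is only an illustration under simplifying assumptions: the
relocation table is given as a plain list of byte offsets into the load
module (a real EXE header stores each entry as an offset:segment pair), and
the names relocate_exe, module and load_segment are made up for the example.

```python
import struct

def relocate_exe(load_module: bytes, reloc_offsets, load_segment: int) -> bytearray:
    """Return a patched copy of the load module, as the loader would hold it in RAM."""
    image = bytearray(load_module)   # work on a copy: the file on disk is not touched
    for off in reloc_offsets:
        (seg,) = struct.unpack_from("<H", image, off)   # segment value stored in the file
        struct.pack_into("<H", image, off, (seg + load_segment) & 0xFFFF)
    return image

# Two filler bytes, then a FAR pointer stored as an offset word (0x0004)
# followed by a segment word (0x0010).  The segment word sits at byte 4.
module = bytes([0x90, 0x90]) + struct.pack("<HH", 0x0004, 0x0010)

# Pretend the loader found free memory at segment 0x1234.
patched = relocate_exe(module, [4], 0x1234)
```

Only the segment word changes; the 16 bit offset half of the FAR pointer is
left alone, which is why NEAR references need no fixing on the PC.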

There are a few differences in the way Usgard does things, basically
because of the platform it is on.  The Usgard system has the program code
first, and then at the end of the code is the relocation table.  Instead of
containing absolute offsets into the code, it optimizes things a bit by
using relative offsets, each measured from the previous relocation.  For
instance, if your program needs relocation at the 20th code byte, a 20 is
the first value in the relocation table.  Then, if the next relocation is
34 bytes after that one, it puts a 34 in the reloc table.  You get the
idea.  Also different is what needs to be relocated.  In a PC based
system, everything is segmented.  There are relative references (a signed
8 bit displacement, roughly +/-128 bytes), like the Usgard ones, that
don't need to be relocated.  Then there are the NEAR references, which are
in-segment 16 bit addresses on the PC.  These don't need to be relocated
either, because the program is always at $0000 in a segment.  Then there
are the FAR ones, which are 32 bit references.  The PC needs to relocate
these, because the segment number is variable.  On the calc, we can't use
a FAR reference.  The NEAR references are the ones that need to be
relocated, because we only have the 16 bit address space.  Since the
program could be anywhere in RAM, it has no way of knowing where it will
be, and so where to point its NEAR pointers.  In the Usgard system, what is
actually stored at each reference's place in the code is a relative offset
from the beginning of the code.  So, if your program is 2000 bytes long,
and you need to get a string that is the last 2 bytes of the code, 1998 is
stored in the code.  An entry in the relocation table points to this
address so that before Usgard runs the program it can change this 1998 to
a valid address.
Just before it runs the program, when it is going through the relocation
table, it comes upon this 1998, adds the program's real address in the
calc's RAM to it, and puts the result back where it was.  That real
address is the (PROGRAM_ADDR) value.
The final difference between the way the Usgard system and the Intel system
work is that since the Intel PC system gets the program code from a disk,
it isn't actually modifying the _real_ code on the disk when it relocates
the program.  On the Usgard/TI-calc platform, it __is__ really modifying
the code, and so after the program quits, the operating system goes through
the relocation table again, only this time it _subtracts_ the program_addr
value from the code values.  This puts the code back in the state it was in
prior to executing the program, ready to be executed again next time.
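
Here is a rough Python model of the Usgard-style scheme just described:
relative table entries, adding the program address before the run, and
subtracting it again afterwards.  It is a sketch, not Usgard's actual code.
The table is passed as a Python list (the byte format of the real table is
not covered here), and decode_deltas, apply_relocation and the load address
0x9000 are names and values invented for the example.

```python
import struct

def decode_deltas(deltas):
    """Turn relative table entries (20, 34, ...) into code offsets (20, 54, ...)."""
    offsets, pos = [], 0
    for d in deltas:
        pos += d
        offsets.append(pos)
    return offsets

def apply_relocation(ram, program_addr, deltas, sign=+1):
    """Add (sign=+1) or subtract (sign=-1) program_addr at every site, in place."""
    for off in decode_deltas(deltas):
        site = program_addr + off
        (value,) = struct.unpack_from("<H", ram, site)
        struct.pack_into("<H", ram, site, (value + sign * program_addr) & 0xFFFF)

PROGRAM_ADDR = 0x9000                      # assumed load address for this example
program = bytearray(2000)
struct.pack_into("<H", program, 20, 1998)  # reference to the last 2 bytes of the code
struct.pack_into("<H", program, 54, 100)   # a second reference, 34 bytes further on
deltas = [20, 34]

ram = bytearray(0x10000)                   # the whole 16 bit address space
ram[PROGRAM_ADDR:PROGRAM_ADDR + len(program)] = program

apply_relocation(ram, PROGRAM_ADDR, deltas, +1)          # before running
(during,) = struct.unpack_from("<H", ram, PROGRAM_ADDR + 20)

apply_relocation(ram, PROGRAM_ADDR, deltas, -1)          # after the program quits
(after,) = struct.unpack_from("<H", ram, PROGRAM_ADDR + 20)
```

While the program runs, the word at code byte 20 holds PROGRAM_ADDR + 1998,
a real address.  After the second pass it holds 1998 again, so the stored
program is back in its position-independent form.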

The system gets a bit more complicated when the possibility exists that any
given program that is about to be executed _could_ work with the VAT and
variables, which could cause the TI's auto repositioning system to move the
program all over the place.  The things that could do this are the delete
command and the resize command.  The TI is a very clean device.  It likes
all its variables to be right next to each other in RAM with no gaps in
between.  This is all fine and dandy from the TI-OS point of view, but it
is what creates all the problems with relocation on the TI-calcs.  It can
be triggered whenever someone deletes a variable, or whenever someone
resizes a variable.  What if the programmer did that in his program?  His
program might have moved!  How to deal with this is left as an exercise for
the reader :)))  (the Usgard crew found a way.  can you?)
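
To make the problem concrete, here is a small Python model of a program
that gets moved after it has been relocated.  Everything in it is assumed
for illustration: the addresses, the sizes, the idea that the deleted
variable sits just below the program, and the helper patch_sites.  The
repair at the end is just one obvious possibility, not a description of
what Usgard actually does.

```python
import struct

def patch_sites(ram, base, offsets, amount):
    """Add `amount` to the 16 bit word at each base+offset, in place."""
    for off in offsets:
        (v,) = struct.unpack_from("<H", ram, base + off)
        struct.pack_into("<H", ram, base + off, (v + amount) & 0xFFFF)

ram = bytearray(0x10000)
VAR_ADDR, VAR_SIZE = 0x9000, 10        # hypothetical variable just below the program
PROG_SIZE = 32
old_addr = VAR_ADDR + VAR_SIZE         # program starts out at 0x900A
sites = [4]                            # one relocation site, at code offset 4

struct.pack_into("<H", ram, old_addr + 4, 30)   # reference to code offset 30
patch_sites(ram, old_addr, sites, old_addr)     # normal relocation before the run

# The running program deletes the variable.  Memory is compacted, so the
# program slides down by VAR_SIZE bytes to close the gap.
new_addr = old_addr - VAR_SIZE
ram[new_addr:new_addr + PROG_SIZE] = ram[old_addr:old_addr + PROG_SIZE]

# The patched word moved along with the code but still names the old spot.
(stale,) = struct.unpack_from("<H", ram, new_addr + 4)

# One possible repair (an assumption): shift every site by the move distance.
patch_sites(ram, new_addr, sites, new_addr - old_addr)
(fixed,) = struct.unpack_from("<H", ram, new_addr + 4)
```

Here stale is old_addr + 30, which no longer points into the program, while
fixed is new_addr + 30.  Note that the exit-time pass also has to subtract
the address the program ended up at, not the one it started at.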

You might want to read _all_ the Usgard docs to find out more.

Jonathan Kaus
References:
RE: A85: Tyrant ports
From: Richard Owen Lewis <richardlewis@cedarcity.net>