Author Topic: Assembly Coding Optimization  (Read 14662 times)


Fallen Ghost

  • Guest
Assembly Coding Optimization
« Reply #15 on: April 27, 2007, 02:50:00 pm »
Hey! Where did your posts go?

Well, is it kind of self-explanatory that jp (hl) works and jp (de) does not work while jp (ix) and jp (iy) both work?

And why not jr (a)...

Offline Halifax

  • LV9 Veteran (Next: 1337)
  • *********
  • Posts: 1334
  • Rating: +2/-1
    • View Profile
    • TI-Freakware
Assembly Coding Optimization
« Reply #16 on: April 27, 2007, 02:56:00 pm »
Well, yeah, it should be, because IX is supposed to be a replacement wherever HL is, and DE and BC aren't. And there would be problems with jr (a), since someone could treat A as unsigned and make it 255 instead of between -128 and 127, so yeah.
There are 10 types of people in this world-- those that can read binary, and those that can't.

Offline Iambian

  • Coder Of Tomorrow
  • LV8 Addict (Next: 1000)
  • ********
  • Posts: 739
  • Rating: +216/-3
  • Cherry Flavoured Nommer of Fishies
    • View Profile
Assembly Coding Optimization
« Reply #17 on: April 30, 2007, 05:45:00 am »
JP (HL) works with JP (IX) and JP (IY) and not JP (DE) because of the way the Z80 instruction set works. Since there's an opcode for JP (HL), and the docs say that many instructions that work with HL can also work with IX and IY, that would seem self-explanatory. A little study of the instruction set would reveal that any IY or IX instruction is simply the corresponding HL instruction with a prefix byte attached to tell the processor to use the index register instead of the original HL register. Because of the way the processor works, you also get these "undocumented" instructions that allow one to edit the low byte or the high byte of the index registers on its own.
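
The prefix relationship is easiest to see with the encodings side by side. This is only a small listing, with the bytes taken from the standard Z80 opcode tables; the .db lines assume a TASM-style assembler that has no mnemonics for the undocumented half-register loads.

```asm
; Encoding listing only, not meant to be run top to bottom.
 jp (hl)          ; E9
 jp (ix)          ; DD E9  (same opcode, DD prefix)
 jp (iy)          ; FD E9  (same opcode, FD prefix)

 ld a,h           ; 7C
 .db $DD,$7C      ; undocumented: ld a,ixh (high byte of IX)
 ld a,l           ; 7D
 .db $FD,$7D      ; undocumented: ld a,iyl (low byte of IY)
```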

Is JR (A) even a valid instruction? I can't find it anywhere in the documentation, so I dun think that's supported by the Z80. Correct me if I'm wrong by showing me the corresponding opcode and what the binary turns out to be.

But, if one wanted to use SMC relating to a JR instruction and a jump table, one could try something like this, provided that the table is page-aligned (aligned to $xx00):
CODE
 LD HL,$8000 ;assuming that the table is in the $8000-$80FF space
 LD L,A
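
The snippet breaks off after LD L,A. One way the idea might continue is sketched here, purely as a guess at the technique and not the poster's actual code. It assumes the table holds one JR displacement byte per entry, that the code runs from RAM so it can patch itself, and that the assembler accepts $ for the current address (TASM does); the label smc_jr is made up.

```asm
; Hypothetical sketch: table of JR displacement bytes at $8000-$80FF.
 ld h,$80          ; high byte of the page-aligned table
 ld l,a            ; A = entry number, so HL = $8000 + A
 ld a,(hl)         ; fetch the displacement byte for this entry
 ld (smc_jr+1),a   ; overwrite the operand of the JR below
smc_jr:
 jr $              ; assembles as 18 FE; the second byte is patched at run time
```

The displacement is counted from the address just after the JR, so each table entry would be the target address minus (smc_jr+2).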
A Cherry-Flavored Iambian draws near... what do you do? ...

Offline Halifax

Assembly Coding Optimization
« Reply #18 on: April 30, 2007, 09:57:00 am »
Wow, I am amazed. Anyway, no, Fallen_Ghost was not saying that jr (a) is a valid instruction; he was simply trying to make a point about why it was not self-explanatory. He was saying, hey, why wouldn't jr (a) work? That is all.

Offline Iambian

Assembly Coding Optimization
« Reply #19 on: April 30, 2007, 12:23:00 pm »
Oh. I thought that anyone with a decent knowledge of the Z80 instruction set would see the connection between any instruction that includes HL and turning it to IX or IY.

The only thing that is *not* self-explanatory is the lack of an EX DE,IY or EX DE,IX instruction, or any other exchanges that play around with HL. (Believe me; I've tested the EX DE... thing.)

And on a previous topic, the assembler would actually take JP (DE) or JP (A). TASM would complain about a missing label, tho. If you have either DE or A defined as a label, TASM would evaluate (DE) as the location of the label itself and not the stuff at the label, as the indirection might've indicated.

Perhaps this is a good way to obfuscate your code?

Of course, if you have the assembler invoked with the right flags, TASM would emit a warning.

Offline Jon

  • LV5 Advanced (Next: 300)
  • *****
  • Posts: 278
  • Rating: +0/-0
    • View Profile
Assembly Coding Optimization
« Reply #20 on: May 21, 2007, 01:34:00 pm »
I'm wondering, does ex (sp),ix/ex (sp),iy work?

Offline Halifax

Assembly Coding Optimization
« Reply #21 on: May 21, 2007, 11:46:00 pm »
Well, if ex (sp),hl works then I would imagine so, because as he said, ix is just hl with an extra byte attached to tell it it's ix.

Offline calc84maniac

  • eZ80 Guru
  • Coder Of Tomorrow
  • LV11 Super Veteran (Next: 3000)
  • ***********
  • Posts: 2912
  • Rating: +471/-17
    • View Profile
    • TI-Boy CE
Assembly Coding Optimization
« Reply #22 on: May 22, 2007, 07:58:00 am »
Well the ex commands, unfortunately, are exceptions to that rule (last i checked)
"Most people ask, 'What does a thing do?' Hackers ask, 'What can I make it do?'" - Pablos Holman

Offline Halifax

Assembly Coding Optimization
« Reply #23 on: May 22, 2007, 12:46:00 pm »
EX   (SP),HL E3   1 NOP 1
EX   (SP),IX E3DD 2 NOP 1
EX   (SP),IY E3FD 2 NOP 1

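As you can see, DD stands for "use IX" and FD stands for "use IY". (These lines are in TASM's table format, which lists the opcode bytes in reverse; in the assembled code the prefix comes first, i.e. DD E3 and FD E3.)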

Offline Iambian

Assembly Coding Optimization
« Reply #24 on: May 23, 2007, 04:01:00 am »
EX (SP),IY and the like *do* work, according to the "documentation". They're slow, so I wouldn't recommend using 'em unless you ran out of registers or something (which is very likely in high-intensity situations).

It's just that EX DE,HL lacks that same kind of modifier. And don't we wish EXX had that kind of modifier? :P

But for some other assembly code optimizations...
( meh. I'm running out of things to say... )

To make use of the wonderful array of flags, you should make all counting loops decrement toward zero. In this way, you remove the CP instruction otherwise needed before the condition test, and the loop runs faster. If you're using register B for this purpose, then you should already be familiar with the DJNZ instruction. If you need something counting upward, however, you can use an extra register to keep track of the count. If you are running short on registers, you can derive the upward count from the down-counter with some CPL/NEG magic, e.g. when the counter runs downward from 255, or when the value you need happens to be the complement or negative of the counter.
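
A minimal sketch of the difference, assuming a loop of 10 passes; the labels and the count are made up for illustration.

```asm
; Counting up: an explicit CP is needed on every pass.
 ld b,0
count_up:
 ; ... loop body ...
 inc b
 ld a,b
 cp 10
 jr nz,count_up

; Counting down: DJNZ decrements B and loops until it reaches zero.
 ld b,10
count_down:
 ld a,10
 sub b            ; optional: A = 10 - B, i.e. 0,1,...,9 as B runs 10 down to 1
 ; ... loop body using A as the upward count ...
 djnz count_down
```

The LD A / SUB B pair is only needed when the body actually wants the upward count; without it the second loop carries no per-pass overhead beyond the DJNZ itself.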

As with any ASM optimization, you ought to write your code first and then see if you can improve its form in any way: taking advantage of a conveniently placed flag, condensing the code so it uses fewer registers, replacing "slow" instructions with "faster" ones, limiting the scope of what your code is doing to *exactly* what it should do... And if you change what a subroutine expects as input, make sure the calling routines are updated to supply values that suit the change.

Sometimes, you will want to change the input registers of certain subroutines because the calling routine will usually have its data in a certain format. For example, if you needed a routine that would extract the size bytes out of a program var right after a _chkfindsym, you'd have it take DE as the address of the program, since that is where _chkfindsym leaves it. If you're doing this multiple times, you're saving yourself the bytes of an "EX DE,HL" after every call to _chkfindsym.
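
Roughly what that could look like. This is a sketch under assumptions: bcall and _ChkFindSym are taken to come from the usual ti83plus.inc, OP1 is assumed to already hold the variable name, the variable is assumed to be in RAM, and get_size and not_found are made-up names.

```asm
; Call site: no EX DE,HL needed between the lookup and the helper.
 bcall(_ChkFindSym)  ; on success DE -> the variable's data
 jr c,not_found      ; carry set = variable does not exist
 call get_size
 ; ... HL = size, DE -> first data byte ...
not_found:
 ret

; Hypothetical helper: read the two size bytes of a variable.
; In:  DE -> start of the variable's data
; Out: HL = size, DE -> first byte after the size word
get_size:
 ld a,(de)
 ld l,a              ; low byte of size
 inc de
 ld a,(de)
 ld h,a              ; high byte of size
 inc de
 ret
```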

Experiment with these optimizations, but before you do any kind of optimization, please, for the love of all that is good, make sure that the program works prior to potentially breaking it. You'll thank yourself later.



Offline calc84maniac

Assembly Coding Optimization
« Reply #25 on: June 12, 2007, 04:32:00 am »
The Better CP HL,DE
compares hl to de, same flag outputs as 8-bit compare
CODE
 or a
 sbc hl,de
 add hl,de
"Most people ask, 'What does a thing do?' Hackers ask, 'What can I make it do?'" - Pablos Holman

Offline Halifax

Assembly Coding Optimization
« Reply #26 on: June 12, 2007, 05:09:00 am »
hmm yes very nice calc84maniac. Simple yet straight to the point.

Offline Iambian

Assembly Coding Optimization
« Reply #27 on: June 12, 2007, 06:36:00 am »
QUOTE (calc84maniac @ 12 Jun, 2007, 10:32)
The Better CP HL,DE
compares hl to de, same flag outputs as 8-bit compare
CODE
 or a
 sbc hl,de
 add hl,de

Are you sure that would work? Wouldn't the "ADD HL,DE" destroy the Z flag set by the SBC? If anything, shouldn't the "OR A \ SBC HL,DE" and the "ADD HL,DE" be switched?

Or is it something I'm not getting?

Offline calc84maniac

Assembly Coding Optimization
« Reply #28 on: June 12, 2007, 07:25:00 am »
ADD HL,DE only modifies the carry flags (carry and half-carry, plus resetting N), which doesn't matter in this case: the Z flag from the SBC is left alone. ADC HL,DE, however, modifies the same flags as SBC HL,DE.
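
An annotated version of the routine with a possible call site, as a sketch; the labels are made up. The carry also survives the round trip: if the SBC did not borrow, adding DE back lands on the original HL without overflow, and if it did borrow, the add wraps past $FFFF and sets carry again. H and N do come out differently from a real CP.

```asm
; Non-destructive 16-bit compare of HL against DE.
cp_hl_de:
 or a             ; clear carry so SBC subtracts DE only
 sbc hl,de        ; Z set if HL = DE, carry set if HL < DE (unsigned)
 add hl,de        ; restore HL; Z, S, P/V unchanged, carry comes out the same
 ret

; Possible use:
 call cp_hl_de
 jr z,is_equal    ; HL = DE
 jr c,is_less     ; HL < DE
 ; falls through when HL > DE
is_equal:
is_less:
```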
"Most people ask, 'What does a thing do?' Hackers ask, 'What can I make it do?'" - Pablos Holman

Offline Jon

Assembly Coding Optimization
« Reply #29 on: June 15, 2007, 02:14:00 pm »
Calc84maniac is right, partly. If the sbc causes a sign change, so will the add, coming back. However, unless HL=0, the above set of commands will always yield an nz, even if hl=de. You need to use the additional 21 cc's (push \ pop) to back up flags.
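
For reference, the push/pop version would look roughly like this (the label name is made up; cycle counts are from the standard Z80 timing tables). Zilog's manual lists ADD HL,ss as changing only H, N and C, so on that reading the extra 21 cycles are needed only when H or N must also match what the SBC produced.

```asm
; Variant that keeps every flag exactly as SBC HL,DE left it.
cp_hl_de_all_flags:
 or a             ; 4 cycles
 sbc hl,de        ; 15 cycles
 push af          ; 11 cycles: save the flags from SBC
 add hl,de        ; 11 cycles: restore HL (changes H, N, C)
 pop af           ; 10 cycles: bring the SBC flags back
 ret
```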