Messages - Runer112

1426
Axe / Re: Axe Q&A
« on: March 14, 2011, 09:13:59 pm »
You would want to do something like the following:

inData(getKey+1,Data(KEYCODE+1,ANOTHER_KEYCODE+1,ETC+1,0))

inData() relies on a zero byte to mark the end of the data to search. Without that terminator, the routine would just keep searching through RAM until it happened to find the value, which is probably not what you want. This is why I added a 0 as the final byte of data. And because zero marks the end, a zero anywhere else in the data would cause problems. This is why I added +1 to getKey and to all the keycodes you want to search for, so that neither the value being searched for nor any of the match values can ever be zero.
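
As a rough illustration of why the terminator and the +1 offsets matter, here is a minimal Python model of a zero-terminated search. It is a sketch and not Axe's actual routine: the function name `in_data`, the 1-based return value with 0 meaning "not found", and the keycode values 9, 15, and 54 are all assumptions made for the example.

```python
def in_data(value, data):
    """Search zero-terminated data for value.

    Returns the 1-based position of the first match, or 0 if the
    terminating zero byte is reached first (assumed convention).
    """
    for position, byte in enumerate(data, start=1):
        if byte == 0:          # a zero byte ends the search
            return 0
        if byte == value:
            return position
    raise ValueError("data has no zero terminator")


# Hypothetical keycodes, each stored with +1 so that no entry is zero.
KEYS = [9 + 1, 15 + 1, 54 + 1, 0]


def lookup(getkey_result):
    # getKey gives 0 when no key is pressed; the +1 keeps the
    # searched-for value from ever colliding with the terminator.
    return in_data(getkey_result + 1, KEYS)
```

With these made-up keycodes, `lookup(15)` lands on the second entry, and `lookup(0)` (no key pressed) reaches the terminator and reports no match.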


EDIT: I'll look at the inData() routine though and see if there's a way to avoid this necessity of adding 1s, so maybe with a future release of Axe you won't have to worry about it. But you'll always need the final data byte to be a 0.

1427
Axe / Re: Axe Q&A
« on: March 13, 2011, 01:06:48 am »
A few things can cause the bad axiom error. My first guess is that whatever you put in the #Axiom() statement is causing the problem. What you enter there has to be the name of the axiom exactly as it appears on the calculator, with correct capitalization and without a prefix like "appv". For instance, for the precompiled version of MemKit that was distributed in the Axe 0.5.0 zip file, you would use #Axiom(MemKit). The only other cause I can think of is that the axiom wasn't compiled correctly. If the name isn't the problem, post the program your compiler produced and I can check for that second possibility.

1428
The Axe Parser Project / Re: Assembly Programmers - Help Axe Optimize!
« on: March 12, 2011, 10:47:24 pm »
I'm back to take on a few more routines that I either couldn't follow or just decided not to try in my first mass optimization post.



p_Sqrt: 1 byte and 4 cycles saved. I still think it may be a good idea to replace this with a restoring square root algorithm, though, like the one I suggested a while ago here. Although maybe not that exact one, because I wrote that when I was still not too familiar with assembly and it may not be very optimized.

Code: (Original code: 14 bytes, n*37+36 cycles) [Select]
p_Sqrt:
.db __SqrtEnd-1-$
ld a,-1
ld d,a
ld e,a
__SqrtLoop:
add hl,de
inc a
dec e
dec de
jr c,__SqrtLoop
ld h,0
ld l,a
ret
__SqrtEnd:
   
   
Code: (Optimized code: 13 bytes, n*37+32 cycles) [Select]
p_Sqrt:
.db __SqrtEnd-1-$
ld de,-1&$FF
ld b,e
ld c,e
__SqrtLoop:
add hl,bc
inc e
dec c
dec bc
jr c,__SqrtLoop
ex de,hl
ret
__SqrtEnd:
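
For readers who don't follow Z80, both versions of p_Sqrt appear to use the same idea: subtract the odd numbers 1, 3, 5, ... from the input and count how many subtractions succeed, since 1 + 3 + ... + (2k-1) = k*k. Below is a small Python model of that idea. The function name is made up for the example, and the comments map the steps to the optimized routine as I read it.

```python
def sqrt_by_odd_subtraction(n):
    """Integer square root of a 16-bit value via 1 + 3 + 5 + ... = k*k."""
    n &= 0xFFFF
    root = 0
    odd = 1
    while n >= odd:   # corresponds to add hl,bc producing a carry
        n -= odd
        odd += 2      # dec c / dec bc moves the addend from -1 to -3, -5, ...
        root += 1     # e starts at -1 in the assembly, so it ends at this count
    return root
```

The loop runs once per unit of the result, which matches the n*37 term in the cycle counts above (n being the square root returned).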
   



p_Sin: 3 bytes and 8 cycles saved.

Code: (Original code: 29 bytes, too lazy to test cycles) [Select]
p_Sin:
.db __SinEnd-1-$
add a,a
rr c
ld d,a
cpl
ld e,a
xor a
ld b,8
__SinLoop:
rrc e
jr nc,__SinSkip
add a,d
__SinSkip:
rra
djnz __SinLoop
adc a,a
ld l,a
ld h,b
rl c
ret nc
cpl
inc a
ret z
ld l,a
dec h
ret
__SinEnd:
   
   
Code: (Optimized code: 26 bytes, 8 cycles fewer than the original) [Select]
p_Sin:
.db __SinEnd-1-$
ld c,a
add a,a
ld d,a
cpl
ld e,a
xor a
ld b,8
__SinLoop:
rra
rrc e
jr nc,__SinSkip
add a,d
__SinSkip:
djnz __SinLoop
ld l,a
ld h,b
or c
ret p
xor a
sub l
ret z
ld l,a
dec h
ret
__SinEnd:
   



p_Log: 1 byte saved.

Code: (Original code: 11 bytes, n*31+17 cycles) [Select]
p_Log:
.db 11
ld a,16
scf
__LogLoop:
adc hl,hl
dec a
jr nc,__LogLoop
ld l,a
ld h,0
   
   
Code: (Optimized code: 10 bytes, n*31+13 cycles) [Select]
p_Log:
.db 10
ld de,16
scf
__LogLoop:
adc hl,hl
dec e
jr nc,__LogLoop
ex de,hl
   

Before we leave this routine, though, the output for hl=0 isn't really correct; it returns 255. You could change the dec e in my suggested routine to dec de to give a slightly more accurate result of -1, but again, that's not quite correct either. The real result of log(0) should be negative infinity, which would be most properly represented by -32768. For the small cost of 3 bytes, the following routine would give you this result:

Code: (Mathematically correct code: 13 bytes, only slightly slower) [Select]
p_Log:
.db 13
ld de,16
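__LogLoop:
dec e ; e = index of the bit about to be shifted out (15 down to 0)
add hl,hl ; carry = that bit; the Z flag from dec e is preserved
jr c,__LogLoopEnd
jr nz,__LogLoop ; falls through with carry clear only when hl was 0
__LogLoopEnd: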
ex de,hl
ccf
rr h
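
To make the intended results concrete, here is a small Python model of a 16-bit base-2 logarithm with a configurable result for an input of zero. It is only a sketch of the behavior described above, not a translation of any one routine; the function name and the `zero_result` parameter are made up for the example.

```python
def log2_16(n, zero_result=-32768):
    """floor(log2(n)) for 1 <= n <= 0xFFFF; zero_result when n == 0."""
    n &= 0xFFFF
    if n == 0:
        return zero_result
    bit = 15
    while not (n & 0x8000):   # shift left until the top bit is set
        n = (n << 1) & 0xFFFF
        bit -= 1
    return bit
```

Passing `zero_result=255` or `zero_result=-1` reproduces the two other zero-input results discussed above, while the default stands in for negative infinity.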
   



p_Exp: As with the suggestion above, this isn't an optimization but a suggested improvement. I see two issues with the current routine. First, it returns 2^(input mod 256) instead of 2^(input), the latter of which is probably what you would expect of a 16-bit math function. Second, it does nothing special for inputs with high values mod 256, which could result in it taking up to 7195 cycles. The following routine corrects both behaviors. Also note that it is a subroutine instead of inline code (more on turning inline code into subroutines later).

Code: (Mathematically correct code: 16 bytes, only slightly slower) [Select]
p_Exp:
.db __ExpEnd-p_Exp-1
ld b,l
ld a,l
and %11110000
or h
ld hl,0
ret nz
inc b
scf
__ExpLoop:
adc hl,hl
djnz __ExpLoop
ret
__ExpEnd:
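
The suggested p_Exp can be followed step by step in Python. This is a sketch of my reading of the routine above, with a made-up function name; the comments note which instruction each step stands for.

```python
def exp2_16(n):
    """2**n truncated to 16 bits, for a 16-bit input n."""
    n &= 0xFFFF
    if n & 0xFFF0:        # and %11110000 / or h: any such bit means n >= 16
        return 0          # ret nz with hl = 0, since 2**n overflows 16 bits
    result = 0
    carry = 1             # scf seeds the single 1 bit
    for _ in range(n + 1):                 # inc b / djnz: n + 1 passes
        shifted = (result << 1) | carry    # adc hl,hl
        result = shifted & 0xFFFF
        carry = shifted >> 16
    return result
```

Because every input of 16 or more returns immediately, the loop never runs more than 16 times, which is what removes the long worst case mentioned above.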
   



__DrawMskAligned: 2 bytes, 72 cycles saved. I only tackled the aligned part of the masked sprite routine because the rest is scary.

Code: (Original code: 33 bytes, 1481 cycles) [Select]
__DrawMskAligned:
dec hl
__DrawMskAlignedLoop:
ld a,(ix+0)
xor (ix+8)
cpl

ld c,a
ld a,(hl)
or (ix+0)
and c
ld (hl),a

ld de,appBackUpScreen-plotSScreen
add hl,de

ld a,c
and (hl)
or (ix+0)
ld (hl),a

inc ix
ld de,plotSScreen-appBackUpScreen+12
add hl,de

djnz __DrawMskAlignedLoop
     
   
Code: (Optimized code: 31 bytes, 1409 cycles) [Select]
__DrawMskAligned:
dec hl
__DrawMskAlignedLoop:
push hl
ld de,appBackUpScreen-plotSScreen
add hl,de

ld a,(ix+0)
ld d,a
xor (ix+8)
cpl
ld e,a

and (hl)
or d
ld (hl),a

pop hl

ld a,(hl)
or d
and e
ld (hl),a

inc ix
ld de,12
add hl,de

djnz __DrawMskAlignedLoop
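
Both versions of the loop apply the same two bitwise updates per sprite row, just in a different order. The Python sketch below models one row; the function and parameter names are made up, with `layer0` standing for the byte at (ix+0) and `layer8` for the byte at (ix+8).

```python
def draw_mask_byte(front, back, layer0, layer8):
    """Apply one masked-sprite byte to the front and back buffer bytes.

    front is the byte in plotSScreen and back the byte in
    appBackUpScreen; both results are returned as a tuple.
    """
    same = ~(layer0 ^ layer8) & 0xFF      # xor (ix+8) / cpl
    new_front = (front | layer0) & same   # or, then and
    new_back = (back & same) | layer0     # and, then or
    return new_front, new_back
```

Read bit by bit: where both layer bits are 0 the buffers are left alone, where both are 1 both buffers get a 1, where only `layer0` is 1 the front bit is cleared and the back bit is set, and where only `layer8` is 1 both bits are cleared.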
     





Finally, here are some routines that I feel would be better suited to be subroutines instead of inline code. Feel free to disagree with me on any or all of these.
  • p_DispStrApp and p_TextStrApp: Any application that displays text will most likely be doing so far more than once, making the size savings from turning these into subroutines probably quite large. Also, the OS routines used to display text are slow enough that you wouldn't notice any speed difference from the overhead of a subroutine.
  • p_Length: Not really necessary to turn into a subroutine, it just seems like the kind of function that should be.
  • p_Log and p_Exp: Like p_Mul and p_Sqrt, I think that math routines should probably be subroutines. The current routines are only 11 and 10 bytes respectively, but if you use the routines I suggested above, they would then be 13 and 16 bytes respectively, making them much more worthy of being a subroutine. The p_Exp routine I suggested actually relies on being a subroutine.
  • p_GetBit and p_GetBit16: Although listed as only being 12 bytes in Commands.inc, they have an extra 2 bytes of overhead register shifting not shown in the routines. Since these are both 14-byte math functions that could definitely be called on more than once in programs that use them, I feel that they should probably be subroutines.
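
The size argument behind these suggestions can be sketched with some quick arithmetic. The model below is deliberately simplified and is an assumption on my part: it charges 3 bytes per call and 1 byte for a ret (the sizes of those Z80 instructions) and ignores any extra register shuffling the compiler may add around a call.

```python
CALL_SIZE = 3   # bytes for a Z80 call nn
RET_SIZE = 1    # bytes for a Z80 ret


def inline_size(routine_bytes, uses):
    """Total bytes when the routine body is pasted at every use."""
    return routine_bytes * uses


def subroutine_size(routine_bytes, uses):
    """Total bytes for one shared copy plus a call at every use."""
    if uses == 0:
        return 0
    return routine_bytes + RET_SIZE + CALL_SIZE * uses
```

Under these assumptions, a 14-byte routine used once costs 14 bytes inline versus 18 as a subroutine, but from the second use onward the subroutine wins (28 versus 21 bytes), and the gap grows with every further use.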

And for one going in the other direction, perhaps p_OnKey doesn't need to be a subroutine? Very few programs use the on key, and I'm guessing that any program that does isn't likely to use it more than once. Or at least if you don't want to change it, can you change the .db 10 into .db __OnKeyEnd-p_OnKey-1 or .db __OnKeyEnd-1-$? I was always confused when I saw this routine about whether or not it was inserted as inline code or a subroutine.

1429
The Axe Parser Project / Re: Bug Reports
« on: March 11, 2011, 12:54:03 pm »
The section of RAM pointed to by L2 is statVars. It contains data for most of the statistics variables, so changing this data will corrupt any existing statistics variables.

Quigibo, perhaps implement B_CALL(_DelRes) somehow? It would probably have to be manually called by the user at their discretion, but it would be useful for people who want to use L2. And for that matter, B_CALL(_DelRes) should probably be a part of the interrupt setup, which it currently is not.

1430
[FR] Programmation Axe Parser / Re: Problème Output()
« on: March 09, 2011, 03:18:48 pm »
input returns a pointer to a string of tokens. However, the text commands display strings of ASCII characters, not tokens. Only the uppercase letters and the digits have token values equal to their ASCII values. Anything else will not display correctly.

1431
[FR] Programmation Axe Parser / Re: Problème Output()
« on: March 09, 2011, 02:30:01 pm »
To display numbers with the text commands, you need to use ►Dec, like this. I assume that's your problem, but maybe not.

Code: [Select]
Disp 12345►Dec
Output(11,0,12345►Dec)
Text 12345►Dec
Text(40,30,12345►Dec)


To display text, you don't use a modifier like ►Dec, so something like that isn't the problem. I can think of two possible problems: either the text isn't a valid string, or the pointer to the string isn't correct. If you post the code you're having trouble with, maybe I can spot what's wrong.


And pardon my French if there are any errors; I don't speak it natively.


EDIT: Ninja'd.

1432
The Axe Parser Project / Re: Features Wishlist
« on: March 08, 2011, 05:46:47 pm »
thepenguin77's perfect grayscale tutorial brings up a good request: changing the display routines to advance in rows instead of in columns. Although this would possibly slow the routines down a bit, even without perfect crystal timer synchronization, this should make the grayscale look better.

1433
The Axe Parser Project / Re: Bug Reports
« on: March 07, 2011, 09:48:33 pm »
Deep Thought, getKey(0) wasn't designed to give a specific value according to which key was pressed. It was simply designed to return 0 if no key is pressed and something that's not 0 otherwise. If you want a value corresponding to the actual key pressed, the normal getKey does that.

1434
Axe / Re: The Optimization Compilation
« on: March 07, 2011, 02:15:18 am »
Oh, well I haven't tested it but I think that still might be slightly faster than DispGraph at 15MHz, not 100% sure though. It would certainly be closer. And as long as you keep the back buffer blank, using DispGraphr would result in the same image put on the screen as DispGraph.


EDIT: Actually let me test this right now...


EDIT 2: The results are in! Data first, analysis after.

  • [Normal speed] DispGraphr: ~94 fps
  • [Normal speed] DispGraph: ~76 fps
  • [Full speed] DispGraph: ~114 fps

At normal speed (6MHz), DispGraphr is about 25% faster than DispGraph. However, in an interesting turn of events, at full speed (~16MHz on my calculator) the results are pretty much reversed, with DispGraph being about 20% faster than DispGraphr. Now that I think of it, this makes sense. The DispGraphr routine runs about as fast as data can be output to the LCD driver. At 6MHz, the DispGraph routine can't run this fast, because each failed driver readiness check costs a good amount of time. At 15/16MHz, however, the check runs very quickly and execution continues much sooner after the driver becomes ready, so the checking adds only a little overhead. Once the check exits, the remainder of the loop also executes in less than half the normal time, making up most of that overhead. On top of this, the rest of the routine, including setup and the loop that the data sending loop is inside of, runs more than twice as fast, which explains the switch in the speed placings.
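
The percentages above can be checked from the measured frame rates. The short Python sketch below does that arithmetic. It assumes that DispGraphr stays at roughly 94 fps at full speed, which is my reading of the claim that it is limited by the LCD driver; no full-speed DispGraphr figure is listed above. The helper names are made up.

```python
def percent_faster(fps_a, fps_b):
    """How much faster fps_a is than fps_b, in percent."""
    return (fps_a / fps_b - 1.0) * 100.0


def cycles_per_frame(clock_hz, fps):
    """Average CPU cycles available per displayed frame."""
    return clock_hz / fps


normal = percent_faster(94, 76)   # DispGraphr vs DispGraph at 6MHz
full = percent_faster(114, 94)    # assumes DispGraphr stays near 94 fps
```

This gives a little under 24% at normal speed and a little over 21% at full speed, in line with the rounded figures in the post. At 6MHz and 76 fps, DispGraph has roughly 79,000 cycles per frame.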

1435
Axe / Re: The Optimization Compilation
« on: March 07, 2011, 01:59:01 am »
I'm a little confused... I'm not sure in what context you would be switching clock speeds. And by a compatibility mode, do you mean the fact that the DispGraph routine waits for the screen driver to say that it's ready? I'm a bit confused about that because calling it a compatibility "mode" would suggest that there's something you can just turn on to automatically activate the safety.

But disregarding things I'm confused about, I do know that the two commands are close to the same speed, but because DispGraphr avoids driver readiness checking, it is in fact faster than DispGraph at 6MHz. And you can't really compare them at 15MHz, because DispGraphr doesn't work properly at that clock speed due to the lack of driver checking. Also, you don't need to store to the back buffer every frame for DispGraphr; you can still display black and white images by just keeping the back buffer clear.

1436
Axe / Re: The Optimization Compilation
« on: March 07, 2011, 12:54:32 am »
What do you mean DispGraphr is not faster? Using fewer cycles at the same clock speed pretty much defines it as being faster. Although the actual byte retrieval, calculations, and outputting to the screen take more cycles, it doesn't have the safety checks built-in that DispGraph does, which slow the DispGraph routine down to being slower than the DispGraphr routine.

1437
The Axe Parser Project / Re: Bug Reports
« on: March 05, 2011, 10:00:35 pm »
Ah, you are indeed correct, I'll fix that now.

1438
The Axe Parser Project / Re: Bug Reports
« on: March 05, 2011, 08:59:20 pm »
Are you sure? ASM in 28 days says the following regarding the type byte:

Bit    Meaning
0-4    Object type
5      If a graph equation, then it's active if set.
6      Variable is used during graphing if set.
7      Variable is designated for link transfer if set.


I considered bits 5-7 possibly being set, but I don't think they ever should be set for a real or complex variable. I believe bits 5 and 6 only apply to graph equations, and bit 7 should only be set by the OS link transfer system, which wouldn't be used in the middle of an assembly program.
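
In code, the layout quoted above amounts to masking off the low five bits for the object type and testing the top three bits individually. Here is a minimal Python sketch; the constant and function names are made up, and the sample values in the note below are arbitrary and not taken from any real variable.

```python
TYPE_MASK = 0x1F            # bits 0-4: object type
FLAG_GRAPH_ACTIVE = 0x20    # bit 5: active graph equation
FLAG_USED_IN_GRAPH = 0x40   # bit 6: used during graphing
FLAG_LINK_TRANSFER = 0x80   # bit 7: designated for link transfer


def split_type_byte(type_byte):
    """Separate the object type from the three flag bits."""
    return {
        "type": type_byte & TYPE_MASK,
        "graph_active": bool(type_byte & FLAG_GRAPH_ACTIVE),
        "used_in_graph": bool(type_byte & FLAG_USED_IN_GRAPH),
        "link_transfer": bool(type_byte & FLAG_LINK_TRANSFER),
    }
```

For example, `split_type_byte(0x8C)` reports type 0x0C with only the link transfer flag set, so a comparison against the raw byte would miss a type that a comparison against `type_byte & TYPE_MASK` would catch.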

1439
Axe / Re: Basic or Assembly?
« on: March 04, 2011, 02:43:26 pm »
The first two bytes of an assembly program (not counting the size bytes) are required to be $BB, $6D. That is how the OS determines whether or not a program is an assembly program.
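
A check like the one described can be sketched in a few lines of Python. The function name is made up, and `program_data` is assumed to be the program's contents with the two size bytes already stripped off.

```python
ASM_HEADER = bytes([0xBB, 0x6D])


def is_asm_program(program_data):
    """True if the data starts with the $BB, $6D assembly header."""
    return program_data[:2] == ASM_HEADER
```

Slicing keeps the check safe for programs shorter than two bytes, which simply compare as not equal.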

1440
Computer Programming / Re: Command Line GIMP Batch Operations?
« on: March 04, 2011, 01:01:05 am »
At first I was trying a script I wrote, but when I had problems with that I tried the examples in their tutorial. For instance, after copying their simple-unsharp-mask example and saving it as script.scm in the .gimp-2.6\scripts\ directory, when I try this:
Code: [Select]
gimp-2.6.exe -i -b '(simple-unsharp-mask "0000.bmp" 5.0 0.5 0)' -b '(gimp-quit 0)'
I get the following:
Code: [Select]
GIMP-Error: Opening 'c:\Program Files (x86)\GIMP-2.0\bin\5.0' failed: No such file or directory

GIMP-Error: Opening 'c:\Program Files (x86)\GIMP-2.0\bin\0.5' failed: No such file or directory

GIMP-Error: Opening 'c:\Program Files (x86)\GIMP-2.0\bin\0)'' failed: No such file or directory

GIMP-Error: Opening 'c:\Program Files (x86)\GIMP-2.0\bin\0)'' failed: No such file or directory

batch command executed successfully
batch command executed successfully
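
One possible reading of these errors is that the Windows command line does not treat single quotes as quoting characters, so each -b argument is split at its spaces and the leftover pieces are taken as file names. The Python sketch below models that with a deliberately simplified splitter (only double quotes group text, and backslash escaping is ignored). The function name is made up, and this is not the real Windows parsing code.

```python
def split_like_windows(command_line):
    """Split on whitespace, treating only double quotes as quoting."""
    tokens, current = [], []
    in_quotes = started = False
    for ch in command_line:
        if ch == '"':
            in_quotes = not in_quotes   # the quote itself is dropped
            started = True
        elif ch.isspace() and not in_quotes:
            if started:
                tokens.append("".join(current))
            current, started = [], False
        else:
            current.append(ch)
            started = True
    if started:
        tokens.append("".join(current))
    return tokens


COMMAND = (
    "gimp-2.6.exe -i -b '(simple-unsharp-mask \"0000.bmp\" 5.0 0.5 0)' "
    "-b '(gimp-quit 0)'"
)
tokens = split_like_windows(COMMAND)
```

The resulting tokens include `5.0`, `0.5`, and `0)'` (twice), which are exactly the names in the error messages. If this reading is right, wrapping each -b argument in double quotes and escaping the inner quotes as \" would be the usual shape on Windows, though I haven't tested that here.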
