Omnimaga

Calculator Community => TI Calculators => Axe => Topic started by: Matrefeytontias on January 23, 2014, 01:02:36 pm

Title: [Axe] Plane deformations are fun
Post by: Matrefeytontias on January 23, 2014, 01:02:36 pm
Hey guys,

Graphically impressive (read: useless) programs have always been my thing, so when I figured out that I could do some intense LUT-based demoscene-like programs (my first one was actually nSpeedX3D, which uses the same concept), I took the only programmable machine that was near me: my TI-83+. Could have been my PC, but I was at school.

So, in a mere 3 hours I wrote a small LUT generator that takes X, Y as input (the center of the screen being 0,0) and gives U, V as output. Basically, X and Y are screen coordinates, and U and V are texture coordinates, so I use those to draw things on the screen on a bit-by-bit basis - very slow in Axe, but whatever, it was only meant to look good, not to be fast. Actually, you can argue a bit about it looking good, but I couldn't really do any better with Axe's precision and the z80 calcs' LCD.
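
For readers who think in C rather than Axe, here is a minimal sketch of what such a LUT generator might look like. It is not the actual Axe source: the names `deform_fn`, `deform_identity` and `build_lut` are made up for illustration, and the 8x8 texture wrap (`& 7`) is an assumption.

```c
#include <stdint.h>

#define SCREEN_W 96
#define SCREEN_H 64

/* Hypothetical callback type: maps centered screen coords (x, y)
   to texture coords (u, v). */
typedef void (*deform_fn)(int x, int y, int *u, int *v);

/* Simplest possible deformation: texture coords equal screen coords. */
static void deform_identity(int x, int y, int *u, int *v)
{
    *u = x;
    *v = y;
}

/* Fill one U and one V entry per pixel (2 * 6144 = 12288 values in total),
   wrapped to an assumed 8x8 texture. */
static void build_lut(deform_fn f, uint8_t *ulut, uint8_t *vlut)
{
    for (int sy = 0; sy < SCREEN_H; sy++) {
        for (int sx = 0; sx < SCREEN_W; sx++) {
            int u, v;
            /* Shift the origin to the center of the screen. */
            f(sx - SCREEN_W / 2, sy - SCREEN_H / 2, &u, &v);
            ulut[sy * SCREEN_W + sx] = (uint8_t)(u & 7);
            vlut[sy * SCREEN_W + sx] = (uint8_t)(v & 7);
        }
    }
}
```

Calling `build_lut(deform_identity, ulut, vlut)` with two 6144-byte arrays gives the plain tiled texture; any other deformation only has to supply a different callback.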

So yeah, basically, to perform a plane deformation, you don't need many things. First, you want to make sure everything works, so you just do U = X and V = Y. It gives this:

(http://img.ourl.ca/firstPostPD0.gif)

(Note how the LUT generation takes ages. We actually generate 12288 coordinates.)
Now that you're certain it works, you can start doing actually fun and interesting things ;D

r = sqrt(x² + y²)
a = tan⁻¹(x, y)


u = x * 8 // abs(y)
v = 512 / abs(y)

(http://img.ourl.ca/firstPostPD1.gif)
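
As a rough C rendering of the floor formula above (a sketch only, not the Axe code): the function name `deform_floor` is hypothetical, and skipping the horizon row where y is 0 is an assumption, since the post does not say how the division by zero is handled.

```c
#include <stdlib.h>

/* Floor/ceiling plane: texture coords scale with 1/|y|, the distance from
   the horizon row. Returns 0 on the horizon (y == 0), where the formula
   would divide by zero, and 1 otherwise. */
static int deform_floor(int x, int y, int *u, int *v)
{
    int ay = abs(y);
    if (ay == 0)
        return 0;          /* assumption: the caller leaves this pixel alone */
    *u = (x * 8) / ay;     /* signed division, like Axe's // */
    *v = 512 / ay;
    return 1;
}
```

Rows close to the horizon get large U and V values, which is what makes the texture look like it recedes into the distance.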

u = x * cos(r * 2) - (y * sin(r * 2)) // 256
v = y * cos(r * 2) + (x * sin(r * 2)) // 256

(http://img.ourl.ca/firstPostPD2.gif)

u = 1.0 /* (r * 256 + 0.5 + (sin(a * 5) // 2))
v = a * 3 * 256 /* 3.142

(http://img.ourl.ca/firstPostPD3.gif)

u = r * cos(a + r) // 320
v = r * sin(a + r) // 320

(http://img.ourl.ca/firstPostPD4.gif)

u = a
v = 1.0 /* r

(http://img.ourl.ca/firstPostPD5.gif)

Also, the whole program includes those 5 scenes and is only 1354 bytes. So it can be done very lightly.

To use it, just put a number between 0 and 5 inclusive in Ans before running Asm(prgmPLDEFORM). You need at least 13800 bytes of free RAM though. No test is done to see if Ans is out of bounds, but that shouldn't do anything other than display random garbage. Binaries attached, source will come later (when cleaned).

I'm going to make everything more ergonomic and release it as a pure eye-candy program I guess (along with some additional equations).
Title: Re: [Axe] Plane deformations are fun
Post by: fb39ca4 on January 23, 2014, 01:25:23 pm
That's what I initially used for my Nspire Mode 7 engine, but I moved on to faster methods. For the mode 7-type plane, the derivative of the coordinates between pixels on the same row of the screen is constant, so you can just add a constant vector to the coordinates for a pixel to get the coordinates for the next. It's not as easy when you get into using trig functions, but you could keep the values of sine and cosine, and add them to each other to quickly and dynamically compute their values, though this probably won't work out so well with the limited precision you have.
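
A minimal sketch of the incremental idea in C, assuming 8.8 fixed-point coordinates and an 8x8 texture (both assumptions, as is the name `fill_row`): along one screen row the step between neighbouring pixels is constant, so each pixel costs two additions instead of a multiply or divide.

```c
#include <stdint.h>

/* Fill one screen row of texture coordinates by repeated addition.
   u0, v0: coords of the leftmost pixel; du, dv: constant per-pixel step.
   All values are 8.8 fixed point held in unsigned 16-bit words, so
   negative steps simply wrap around modulo 65536. */
static void fill_row(uint16_t u0, uint16_t v0, uint16_t du, uint16_t dv,
                     uint8_t *urow, uint8_t *vrow, int width)
{
    uint16_t u = u0, v = v0;
    for (int x = 0; x < width; x++) {
        urow[x] = (uint8_t)((u >> 8) & 7);   /* integer part, wrapped to 8x8 */
        vrow[x] = (uint8_t)((v >> 8) & 7);
        u = (uint16_t)(u + du);              /* one addition per coordinate */
        v = (uint16_t)(v + dv);
    }
}
```

For a Mode 7 style plane, `u0`, `v0`, `du` and `dv` would be recomputed once per row, which is where the remaining division goes.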
Title: Re: [Axe] Plane deformations are fun
Post by: Matrefeytontias on January 23, 2014, 01:41:16 pm
I already draw the scenes in the fastest way possible; I push bits into A and once I've done that 8 times, I store A to a byte in plotSScreen. This is faster than Pxl-On. The problem is that you can't actually do much faster in Axe (or maybe I'm missing something huge).
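
In C terms, the packing described here could look roughly like the sketch below. The function name `pack_pixels` and the intermediate `bits` array are assumptions made for illustration; the real program produces each bit from the LUT on the fly.

```c
#include <stdint.h>

#define SCREEN_BYTES 768   /* 96 x 64 pixels at 1 bit per pixel */

/* bits[] holds one 0/1 value per pixel in row-major order, leftmost pixel
   first. Eight of them are shifted into an accumulator, which is then
   written out with a single store. */
static void pack_pixels(const uint8_t *bits, uint8_t *buffer)
{
    for (int i = 0; i < SCREEN_BYTES; i++) {
        uint8_t acc = 0;
        for (int b = 0; b < 8; b++)
            acc = (uint8_t)((acc << 1) | (bits[i * 8 + b] & 1));
        buffer[i] = acc;   /* one store per 8 pixels */
    }
}
```

The gain over a per-pixel plot routine comes from doing the address computation once per byte instead of once per pixel.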
Title: Re: [Axe] Plane deformations are fun
Post by: Sorunome on January 23, 2014, 02:38:23 pm
wow, just wow, that is just looking too awesome :P (and yes, it is useless, lol :P)
Title: Re: [Axe] Plane deformations are fun
Post by: Matrefeytontias on January 23, 2014, 02:50:17 pm
Of course it's useless :P its only point is to look good x) and be as fast as possible (actually, not very fast).
Title: Re: [Axe] Plane deformations are fun
Post by: DJ Omnimaga on January 23, 2014, 09:37:53 pm
Woah that looks great! :O Also, screenshot 2 gave me some ideas since the tiles were from supersonic ball, but of course I would need to learn how to do mode 7 (and have it run fast) on the HP Prime. :P
Title: Re: [Axe] Plane deformations are fun
Post by: fb39ca4 on January 23, 2014, 10:24:52 pm
If you need help wrapping your head around Mode 7, just ask. None of the explanations I found on the internet made sense to me, so I struggled with it for a while.
Title: Re: [Axe] Plane deformations are fun
Post by: DJ Omnimaga on January 23, 2014, 11:20:01 pm
Ok I'll ask in due time. Mac Bernick made something looking similar to Mode 7, but I didn't check his code yet and it seems to lack support for horizontal scrolling (it just goes straight forward).
Title: Re: [Axe] Plane deformations are fun
Post by: Matrefeytontias on January 24, 2014, 10:40:44 am
Quote from: DJ Omnimaga
Woah that looks great! :O Also, screenshot 2 gave me some ideas since the tiles were from supersonic ball, but of course I would need to learn how to do mode 7 (and have it run fast) on the HP Prime. :P
Oh I didn't notice that :P
Title: Re: Re: [Axe] Plane deformations are fun
Post by: DJ Omnimaga on January 24, 2014, 11:09:20 am
Actually now that I think about it, some Supersonic Ball tiles were taken from The Reign of Legends 3 and Metroid II Evolution :P
Title: Re: [Axe] Plane deformations are fun
Post by: TIfanx1999 on January 26, 2014, 09:47:47 am
This does look awesome Matref! ;D Damn shame it isn't a bit faster though. :/
Title: Re: [Axe] Plane deformations are fun
Post by: Xeda112358 on January 26, 2014, 09:56:09 am
That really does look cool! Is there a source? If I ever get free time, I might want to look into optimizing it with assembly or something.
Title: Re: [Axe] Plane deformations are fun
Post by: Matrefeytontias on January 26, 2014, 11:03:00 am
Quote from: TIfanx1999
This does look awesome Matref! ;D Damn shame it isn't a bit faster though. :/
I swear I tried to make it fast though :/ I don't think there's much to be done about it in Axe.

Quote from: Xeda112358
That really does look cool! Is there a source? If I ever get free time, I might want to look into optimizing it with assembly or something.
I'm sure you can make it at least twice as fast with ASM, because if I had to write the drawing code in C it would be (direct translation from Axe):
Code: [Select]
for(y = 0; y < 64; y++)
  for(x = 0; x < 96; x++)
    // conceptually one entry per pixel; the real 768-byte buffer packs 8 pixels per byte
    plotSScreen[y * 96 + x] = texture[ (ylut[y * 96 + x] & 7) * 8 + xlut[y * 96 + x] ];

Lemme get on my PC and I'll attach the source.
Title: Re: [Axe] Plane deformations are fun
Post by: Xeda112358 on January 26, 2014, 12:01:55 pm
Okay, cool, thanks!
Title: Re: [Axe] Plane deformations are fun
Post by: ClrDraw on January 28, 2014, 11:21:05 am
The third one...  O.O that's crazy.
Title: Re: [Axe] Plane deformations are fun
Post by: fb39ca4 on January 28, 2014, 11:59:37 am
If you're using an 8x8 texture, that's only 64 possible coordinates needed, so you could just have one byte store both the x and y coordinate in the LUT.
Title: Re: [Axe] Plane deformations are fun
Post by: Matrefeytontias on January 28, 2014, 01:29:42 pm
That'd only be slower. I store 2 coordinates because it's faster to access than a packed word that I'd have to decompress.
Title: Re: [Axe] Plane deformations are fun
Post by: fb39ca4 on January 28, 2014, 02:22:24 pm
You don't need to unpack anything. Just store a number from 0-63 which corresponds with the index of each pixel in the texture as if it were a 1-D array.
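
A small sketch of that suggestion in C, assuming an 8x8 texture stored as 64 bytes in row-major order. The names `merge_luts` and `sample` are hypothetical; the point is only that the row and column are combined once, at LUT generation time, rather than for every pixel of every frame.

```c
#include <stdint.h>

/* Combine separate column (x) and row (y) LUTs into a single LUT of
   texel indices in the range 0..63. */
static void merge_luts(const uint8_t *xlut, const uint8_t *ylut,
                       uint8_t *tlut, int count)
{
    for (int i = 0; i < count; i++)
        tlut[i] = (uint8_t)(((ylut[i] & 7) << 3) | (xlut[i] & 7));
}

/* Per-pixel lookup at draw time: one table read, no arithmetic. */
static uint8_t sample(const uint8_t *texture, const uint8_t *tlut, int pixel)
{
    return texture[tlut[pixel]];
}
```

This also halves the LUT size, from two bytes per pixel to one.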
Title: Re: [Axe] Plane deformations are fun
Post by: ben_g on January 28, 2014, 03:25:46 pm
I hope this code is slightly faster, but I haven't been able to test it, so I'm not sure if it even works.

Code: [Select]
;°xlut -> A
;°ylut -> B

Render:
  ld hl, plotsScreen
  push hl

RenderLoop:
  ld c, 8
  xor a
ByteLoop:
  ld hl, (A)
  ld b, (hl)
  ld hl, (B)
  ld e, (hl)
  ld d, 0
  ld hl, texture
  add hl, de
  ld d, (hl)
  inc b
TexelLoop:
  sra d
  djnz TexelLoop
  set 0, a
  jr c, TexelEnd
  res 0, a
TexelEnd:
  ld hl, (A)
  inc hl
  ld (A), hl
  ld hl, (B)
  inc hl
  ld (B), hl
  sla a
  dec c
  jr nz, ByteLoop
  rr a
  pop hl
  ld (hl), a
  inc hl
  push hl
  ld de, plotsScreen+768
  or a
  sbc hl, de
  jr nz, RenderLoop ; keep looping until all 768 buffer bytes are written
  pop hl
  ret
Title: Re: [Axe] Plane deformations are fun
Post by: Matrefeytontias on January 28, 2014, 03:38:54 pm
Quote from: fb39ca4
You don't need to unpack anything. Just store a number from 0-63 which corresponds with the index of each pixel in the texture as if it were a 1-D array.
Oh that's actually a good idea. I'll try and see how much faster it is.

Quote from: ben_g
I hope this code is slightly faster, but I haven't been able to test it, so I'm not sure if it even works. [...]
As I seldom have access to my PC, that will require some time before testing.
Title: Re: [Axe] Plane deformations are fun
Post by: ben_g on January 28, 2014, 03:51:40 pm
I just saw in the C source code that you seem to be storing the texture as an image with one byte for every pixel (64 bytes)?
My asm code expects a pointer to a sprite with 1 bit per pixel (8 bytes).
Title: Re: [Axe] Plane deformations are fun
Post by: Matrefeytontias on January 28, 2014, 03:58:20 pm
Yeah, I use 8 BPP for the texture to avoid bit unpacking. It's definitely not Axe's thing.
Title: Re: [Axe] Plane deformations are fun
Post by: ben_g on January 28, 2014, 04:07:11 pm
Indeed. And neither is per-pixel drawing. But for assembly, bit unpacking runs at almost the same speed as reading unpacked bits. And with assembly, the usage of variables can be avoided as well, which should also give a speed boost when doing really repetitive stuff.

Would you prefer it if I convert my code to HEX? That may make it easier to test for you.
Title: Re: [Axe] Plane deformations are fun
Post by: Matrefeytontias on January 28, 2014, 04:16:52 pm
Actually I use another code now, based on fb39ca4's proposition. In C, it would be :

Code: [Select]
// usual for loops, x, y
plotSScreen[y * 96 + x] = texture[ tlut[y * 96 + x] ];
I'll put a pointer to texture in $8000 and a pointer to tlut in $8002.
Title: Re: [Axe] Plane deformations are fun
Post by: ben_g on January 28, 2014, 04:59:57 pm
You can optimize it by using only 1 loop:
Code: [Select]
:For(A,0,6143)
:plotsScreen[A] = texture[tlut[A]]
:End
Title: Re: [Axe] Plane deformations are fun
Post by: Matrefeytontias on January 28, 2014, 05:16:51 pm
But no, in Axe I must pack the bits to plotSScreen ;D
Title: Re: [Axe] Plane deformations are fun
Post by: ben_g on January 28, 2014, 06:40:29 pm
This is my attempt at converting it to assembly:
Code: [Select]
Asm(214093E50E08AF2A02805E160021(°Texture)1946CBC7CB402802CB872A028023220280CB270D20E1CB1FE17723E5114096B7ED5220D0E1)
I have disassembled it to see if I converted it to HEX correctly, and I did, but what I don't know is if my assembly code is fully correct, so back up your data just in case.

EDIT: the more readable source code:
Code: [Select]
Render:
  ld hl, plotsScreen
  push hl

RenderLoop:
  ld c, 8
  xor a
ByteLoop:
  ld hl, ($8002)
  ld e, (hl)
  ld d, 0
  ld hl, texture
  add hl, de
  ld b, (hl)
  set 0, a
  bit 0, b
  jr z, TexelEnd
  res 0, a
TexelEnd:
  ld hl, ($8002)
  inc hl
  ld ($8002), hl
  sla a ;optimization: add a, a
  dec c
  jr nz, ByteLoop
  rr a
  pop hl
  ld (hl), a
  inc hl
  push hl
  ld de, plotsScreen+768
  or a
  sbc hl, de
  jr nz, RenderLoop
  pop hl
  ret

#comment
HEX:
214093E5 -> 4 bytes
RenderLoop:
0E08AF -> 3 bytes
ByteLoop:
2A02805E1600 -> 6 bytes
21(°Texture) -> 3 bytes ld hl, texture
1946CBC7CB40 -> 6 bytes
2802 -> 2 bytes jr z, TexelEnd
CB87 -> 2 bytes
TexelEnd:
2A028023220280CB270D -> 10 bytes
20E1 -> 2 bytes jr nz, ByteLoop
CB1FE17723E5114096B7ED52 -> 12 bytes
20D0 -> 2 bytes jr nz, renderLoop
E1 -> 1 byte

TOTAL COMMAND:
Asm(214093E50E08AF2A02805E160021(°Texture)1946CBC7CB402802CB872A028023220280CB270D20E1CB1FE17723E5114096B7ED5220D0E1)
#endcomment
Title: Re: [Axe] Plane deformations are fun
Post by: Matrefeytontias on January 28, 2014, 08:19:49 pm
Okay, so I optimized your code a bit and completely replaced my drawing code with it. I also replaced the translation by a simple texture wrapping. Everything happened to work, and HOLY SHIT IT'S FAST *.*

I really think that speed was multiplied by 6 or 7. Be ready for an awesomeness galore. Note that these screenshots are still 6 MHz and that the program is actually faster and smoother on-calc :w00t:

Apart from that, it now requires a little bit more than 7500 bytes of RAM, instead of the previous 13800 ;D

DEFORM.8xp is the Axe source, PLDEFORM.8xp is the executable. As before, put a number in [0, 5] inclusive in Ans before using it, no out-of-bounds test made blah blah blah. I was busy working on optimizing it ;D
Title: Re: [Axe] Plane deformations are fun
Post by: fb39ca4 on January 28, 2014, 09:42:02 pm
Glad to hear that it helped. This would be a great effect to have for a demoscene program. Speaking of demoscene, you should have a magnifying lens effect, as was common in demos for older systems. Also, if you want the GIFs to be smoother, crank up the framerate setting in Wabbitemu.
Title: Re: [Axe] Plane deformations are fun
Post by: DJ Omnimaga on January 28, 2014, 11:55:31 pm
Woah that's quite a nice speed improvement O.O!
Title: Re: [Axe] Plane deformations are fun
Post by: Matrefeytontias on January 29, 2014, 01:30:07 am
Quote from: fb39ca4
Glad to hear that it helped. This would be a great effect to have for a demoscene program. Speaking of demoscene, you should have a magnifying lens effect, as was common in demos for older systems. Also, if you want the GIFs to be smoother, crank up the framerate setting in Wabbitemu.
We're not talking about the same thing here. A lens effect can't be achieved with a single formula for U and V, it needs a whole algorithm.
Title: Re: [Axe] Plane deformations are fun
Post by: Matrefeytontias on January 29, 2014, 10:15:43 am
Bump,

Okay so I tried my best to optimize PLDEFORM like hell, and I managed to make it exactly 1023 bytes ! :w00t: Since I had to drop #ExprOn, it remains 6 FPS, but I tried and at 15 MHz it's 14.30 FPS :D

So I've made an actual demo out of it, named it Illogical (guess why ;D) and posted it on pouet.net and ticalc.org. While it's waiting for approval, I provide a screenshot of the 15 MHz version + download link for binaries and source ! :D
Title: Re: [Axe] Plane deformations are fun
Post by: Sorunome on January 29, 2014, 10:16:50 am
That is awesome fast! (but yeah, 15MHZ, make it faaaaaaster >:D )
Title: Re: [Axe] Plane deformations are fun
Post by: Matrefeytontias on January 29, 2014, 10:18:05 am
I really tried everything to make it fast, I don't think it's possible that it'll ever go even half an FPS faster :P

EDIT : here you go :D http://www.pouet.net/prod.php?which=62454
Title: Re: [Axe] Plane deformations are fun
Post by: Eiyeron on January 29, 2014, 10:27:39 am
Aaand here's (http://glsl.heroku.com/e#13862.0) what I did, inspired from your program.
Title: Re: [Axe] Plane deformations are fun
Post by: Matrefeytontias on January 29, 2014, 10:29:31 am
Haha nice, glad to see I inspired people ;D
Title: Re: [Axe] Plane deformations are fun
Post by: XiiDraco on January 29, 2014, 10:30:50 am
Woa. Eiyeron. That's pretty cool!
Title: Re: [Axe] Plane deformations are fun
Post by: Eiyeron on January 29, 2014, 10:34:42 am
Quick test edits. That's getting very fun! :D (http://glsl.heroku.com/e#13862.1)
Last time I post about it here, modularized all the code and added tunnel effect. Comment/uncomment the defines to get your version! :p (http://glsl.heroku.com/e#13862.3)
Title: Re: [Axe] Plane deformations are fun
Post by: fb39ca4 on January 29, 2014, 11:19:27 am
Quote from: Matrefeytontias
We're not talking about the same thing here. Lens effect can't be achieved with a single formula for U and V, it needs a whole algorithm.
Here's something I put together quickly that produces the lens effect as a function of UV.
https://www.shadertoy.com/view/Xsj3RV
Title: Re: [Axe] Plane deformations are fun
Post by: Matrefeytontias on January 29, 2014, 11:21:22 am
Okay good, now give me the separate equations for U and V and I'll try to put it in my plane deformer.
Title: Re: [Axe] Plane deformations are fun
Post by: Sorunome on January 29, 2014, 11:24:44 am
bleh, my chrome doesn't support webgl :(
Title: Re: [Axe] Plane deformations are fun
Post by: fb39ca4 on January 29, 2014, 11:25:28 am
After transforming the coordinates so that the origin is at the center of the screen and r is the distance from the origin, x' = 0.5 * x * (r^2 + r) and y' = 0.5 * y * (r^2 + r) for all points where r < 0.5. Otherwise, the coordinates are unchanged.
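
Written out in C, the formula above might look like the sketch below. The names `lens_scale` and `deform_lens` are made up, and the coordinates are assumed to be normalized so that a radius of 0.5 is meaningful (for example with the screen height mapped to 1).

```c
/* Scale factor applied to both x and y: 0.5 * (r^2 + r) inside the lens
   (r < 0.5), and 1 (coordinates unchanged) outside it. */
static double lens_scale(double r)
{
    return (r < 0.5) ? 0.5 * (r * r + r) : 1.0;
}

/* r is the distance of (x, y) from the origin, passed in by the caller. */
static void deform_lens(double x, double y, double r, double *xo, double *yo)
{
    double s = lens_scale(r);
    *xo = x * s;
    *yo = y * s;
}
```

Taken literally, the factor jumps from about 0.375 just inside the lens to 1 at its edge, so a visible rim at r = 0.5 is to be expected.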

Also, a suggestion for the demo: You should generate the LUT for the next scene in the background while the current one runs, so that there isn't a pause between scenes. After each frame is drawn, generate one line of the LUT.
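
One way this could be structured, sketched in C with made-up names (`lut_builder`, `lut_builder_step`, `texel_identity`) and assuming there is enough free RAM to hold a second 6144-byte LUT while the current one is still being displayed:

```c
#include <stdint.h>

#define SCREEN_W 96
#define SCREEN_H 64

/* Hypothetical per-pixel function returning a texel index in 0..63. */
typedef uint8_t (*texel_fn)(int x, int y);

/* Example deformation for testing: the plain tiled texture. */
static uint8_t texel_identity(int x, int y)
{
    return (uint8_t)(((y & 7) << 3) | (x & 7));
}

typedef struct {
    texel_fn f;          /* deformation of the upcoming scene */
    uint8_t *next_lut;   /* LUT being built in the background */
    int row;             /* next row to generate */
} lut_builder;

/* Generate a single row of the next LUT. Meant to be called once per
   rendered frame; returns 1 once the whole LUT is complete. */
static int lut_builder_step(lut_builder *b)
{
    if (b->row >= SCREEN_H)
        return 1;
    for (int sx = 0; sx < SCREEN_W; sx++)
        b->next_lut[b->row * SCREEN_W + sx] =
            b->f(sx - SCREEN_W / 2, b->row - SCREEN_H / 2);
    b->row++;
    return b->row >= SCREEN_H;
}
```

With one row per frame, the next scene is ready after 64 frames, at which point the two LUT pointers can be swapped.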
Title: Re: [Axe] Plane deformations are fun
Post by: Matrefeytontias on January 29, 2014, 11:30:34 am
Apparently I'm getting overflows. Lemme try harder before seeking help if it doesn't work.
Title: Re: [Axe] Plane deformations are fun
Post by: Eiyeron on January 29, 2014, 11:32:05 am
Code: [Select]
#ifdef TELEPORT_EFFECT
u = a;
v = 1./(r + sin(r + time));
#endif
TELEPORTATION TIME! \o/
Title: Re: [Axe] Plane deformations are fun
Post by: Matrefeytontias on January 30, 2014, 02:39:49 pm
Bump,

So my demo made it on Pouet ! :w00t: https://www.pouet.net/prod.php?which=62454

And ticalc too ! :D http://www.ticalc.org/archives/files/fileinfo/458/45819.html

Let's spam the staff with mails saying to feature it :evillaugh:

Final gif :

(http://www.ticalc.org/archives/files/ss/851/85138.gif)
Title: Re: [Axe] Plane deformations are fun
Post by: Sorunome on January 30, 2014, 02:40:57 pm
So that is now 15MHZ again, right?
Or did you manage to get 6MHZ to run it at that speed? O.O
Title: Re: [Axe] Plane deformations are fun
Post by: Matrefeytontias on January 30, 2014, 02:41:26 pm
No, it's 15 MHz :P I wish it was 6 MHz, although even at that lower speed it runs acceptably fast.
Title: Re: [Axe] Plane deformations are fun
Post by: Sorunome on January 30, 2014, 02:42:06 pm
Still epic, though :thumbsup:
Title: Re: [Axe] Plane deformations are fun
Post by: Runer112 on February 02, 2014, 02:09:43 am
My turn!

I was convinced this effect could be made faster, and after a lot of careful thought and crazy tricks, I managed to bump up the FPS: from 14.3 to 18.5, an improvement of about 30%! Again, it is running at 15MHz. It may look slightly different, as I had to rotate the texture up and right by one pixel per frame (rather than down and right) due to complications of the immensely aggressive optimization. I'll attach the source to this post, and here's a gif proving that it does indeed work:

(http://i.imgur.com/VlOGcAc.gif)

Spoiler For Oh, and did I mention...:
... that that's just the pure Axe version? :evillaugh:

As is always the case for assembly, and is especially the case for really specific, concise algorithms, if you know what you're doing you can get big performance gains over a compiled language. Re-coding only the rendering core in just as (if not more) aggressively optimized assembly, I registered a huge boost in FPS: 18.5 to 44, an improvement of about 140% on my pure Axe version and 200% on the existing assembly core version! The source for this will be attached as well, and luckily this one is 100% stable, so play with it all you want! Here's a gif again, showing it off, although keep in mind that it's actually rendering about twice as many frames as the gif captured:

(http://i.imgur.com/8efCZZd.gif)

Spoiler For Wait a second...:
... why does that gif say 6MHz and load effects so slowly? Because it is 6MHz! :evillaugh: :evillaugh: :evillaugh:

The true 15MHz FPS for the version with the assembly core is a stupidly high 107! This is with no pre-rendering of frames or any such cheating, every frame is rendered pixel-by-pixel as always. So the total performance markup on the original 14.3 FPS comes to about 650%. You can try it for yourself by simply un-commenting the Full in the setup part of the assembly version source. And although it doesn't even capture a quarter of the frames rendered and the original effects are all but impossible to discern, here's a gif:

(http://i.imgur.com/uDGyzwA.gif)

Beat that. :P

EDIT: Apparently, at 15MHz, the assembly version is too fast for the LCD driver on my calculator and glitches out a bit. Whoops. I should probably fix that...
Title: Re: [Axe] Plane deformations are fun
Post by: Sorunome on February 02, 2014, 03:50:56 am
wow, just wow O.O
You sir, are amazing! :thumbsup:
Title: Re: [Axe] Plane deformations are fun
Post by: Eiyeron on February 02, 2014, 05:57:42 am
And that, kids, is how Runer112 made my day.
Title: Re: [Axe] Plane deformations are fun
Post by: Matrefeytontias on February 02, 2014, 07:17:39 am
Okay what the actual fuck. So you're getting 18.5, then 44, then 107 FPS ? You overclocked your calc or what.

EDIT: tested and yes. I see you actually wrote your own code, only taking my deformation functions. I can't understand shit of what you wrote, so yeah.

I find it actually a bit discouraging, how whatever code we come up with, you can make it 600% faster.
Title: Re: [Axe] Plane deformations are fun
Post by: TIfanx1999 on February 02, 2014, 07:41:42 am
Quote from: Matrefeytontias
So my demo made it on Pouet ! :w00t: [...]

Very nice stuff. ^^

Quote from: Runer112
My turn! [...] Beat that. :P

Why am i not surprised. :P Those are some nice speed gains. :D

Quote from: Matrefeytontias
Okay what the actual fuck. So you're getting 18.5, then 44, then 107 FPS ? [...]

Didn't you know? Runer112 is some sort of wizard or something working his assembly magic. :P
Title: Re: [Axe] Plane deformations are fun
Post by: fb39ca4 on February 02, 2014, 11:02:40 am
Quote from: Runer112
My turn! [...] Beat that. :P
Do you think this could be made into a mode 7 engine?
Title: Re: [Axe] Plane deformations are fun
Post by: Matrefeytontias on February 02, 2014, 11:03:33 am
Nope, since it only uses 1 tile. It's really nothing more than a demo effect.
Title: Re: [Axe] Plane deformations are fun
Post by: Runer112 on February 02, 2014, 12:19:26 pm
For any of those curious, perhaps I should try to give a basic explanation of the mind of a madman, or rather of how I got the speed boosts I did, and provide a look into my mindset on optimization. I assure you it was not magic, just very careful optimization and smart LUT generation! I had a lot of fun and spent a lot of time on the code itself, so I may as well take a bit more to share with you guys.



The core (the duty of which is to render one full byte) of the pure Axe version went through a few iterations. In fact, I still left the old iterations, except the first two, in a block comment in the source, each with a comment of how fast it was.




Now, the real fun one, the assembly core. This may make varying degrees of sense to you, as not only is it assembly, but it's quite hacky assembly. I'll paste the source I used for reference here, and then give a quick rundown of how it actually works.

Code: [Select]
Disp:
ld (spSave),sp
ld c,$20
ld a,$80
out ($10),a
ld sp,(LUT)
ColLoop:
ld de,64*256+%11111110
ld hl,ColLoop
ld a,c
inc c
ld b,7
djnz $
out ($10),a ;152cc into, 153cc loop
cp $2C
ld a,e
ret c
ld sp,(spSave)


;Pixel:
add a,a ;or adc a,a
ret c
out ($11),a ;147cc into, 148cc loop
ld a,e
scf
dec d
ret nz
jp (hl)

Firstly, a lot of time is saved by making the core render right to the LCD, skipping the extra ~60000 cycles of writing to a buffer and then having to read it back out later with delays injected for the slow LCD driver. But the real meat of the speed boost comes from not simply storing a 64-byte bit-exploded sprite, but storing a 576-byte array of 64 9-byte "codelets," each responsible for one pixel/texel of the source sprite (this is the code you see labelled ;Pixel:). Each byte (8 pixels) is initialized with a=%11111110 and the carry flag set, so for each pixel, you can simply perform add a,a to shift the result left by one bit and rotate in a 0 bit, or adc a,a to shift the result left by one bit and rotate in the carry bit, which will always be 1. That is, it will always be 1 until eight of these bits have been shifted in and that 0 in the lowest bit of %11111110 is finally shifted out, which allows for easy determination of when a byte is done by checking the state of the carry flag (1=not done, 0=done).
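
The sentinel trick can be modelled in C, which may make it easier to follow than the assembly. This is only a simulation of the idea, with a hypothetical `render_bytes` function and an explicit `carry_out` variable standing in for the Z80 carry flag; it says nothing about how the codelets themselves are laid out.

```c
#include <stdint.h>

/* Pack texel bits MSB-first. The accumulator starts as %11111110; the byte
   is complete when the sentinel 0 is the bit that falls out of the top. */
static int render_bytes(const uint8_t *bits, int nbits, uint8_t *out)
{
    int nbytes = 0;
    uint8_t a = 0xFE;   /* seven 1s followed by the sentinel 0 */
    for (int i = 0; i < nbits; i++) {
        int carry_out = (a >> 7) & 1;              /* bit leaving the top */
        a = (uint8_t)((a << 1) | (bits[i] & 1));   /* add a,a (0) or adc a,a (1) */
        if (!carry_out) {      /* sentinel shifted out: 8 bits collected */
            out[nbytes++] = a;
            a = 0xFE;          /* re-arm for the next byte */
        }
    }
    return nbytes;
}
```

No separate bit counter is needed: the same shift that inserts a pixel also reports whether the byte is finished.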

So each codelet writes its texel in only one instruction with lightning-fast speed, only 4 cycles. The real trick, then, is directing control to these 64 codelets with lightning-fast speed. And just as each codelet handled writing its texel in only one instruction, this is also achieved in only one instruction: ret c. This conditional return, which returns only if the carry flag is set, uses the carry flag effects described above to handle the done/not done determination. But how the hell is a return supposed to help, and when does the 6144-byte texel-mapping LUT get read? The answer is simple: that's what the return does! The texel-mapping LUT was instead blown up to 12288 bytes, giving each pixel 2 bytes, just like a stack entry. And instead of simply storing one of 64 texel indices, each entry stores a pointer to one of the 64 codelets. So to start rendering a frame, the stack pointer is pointed to the start of the LUT and a simple return will take you to the codelet to produce the next pixel!

For a time analysis, the extremely tight loop of ad(d/c) a,a \ ret c to produce each bit gives a stupidly low 15 cycles per bit. Multiplying this by 8 for eight bits in a byte, and adding the overhead of handling each full byte, each byte takes a mere 148 cycles to render; about 5 times faster than my fastest Axe version. Taking into account the fact that the Axe version still needs to perform an LCD update and the different formats of the data that needs to be rotated each frame, the assembly version ultimately comes out to about 6x faster than the Axe version, and with only 46 bytes worth of unique assembly instructions executed.
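
As a back-of-the-envelope check on those figures, here is the arithmetic as a small C sketch. The 148 cycles per byte is taken from the paragraph above and 768 is the number of bytes on a 96x64 screen; the function names are made up, and the nominal 6 MHz and 15 MHz clock values are an assumption about the real hardware speeds.

```c
/* Figures from the post: 148 cycles per finished byte, 768 bytes per frame. */
enum { CYCLES_PER_BYTE = 148, BYTES_PER_FRAME = 768 };

static long cycles_per_frame(void)
{
    return (long)CYCLES_PER_BYTE * BYTES_PER_FRAME;   /* 113664 */
}

/* Upper bound on the frame rate if the CPU did nothing but run the core. */
static double max_fps(double clock_hz)
{
    return clock_hz / (double)cycles_per_frame();
}
```

This gives a ceiling of roughly 52.8 FPS at 6 MHz and about 132 FPS at 15 MHz. The reported 44 and 107 FPS sit below those bounds, which is consistent with the extra per-column setup and the per-frame texture rotation not counted here.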



Final notes:

While writing this post, I realized the mistake I made with the pure Axe version (I allocated enough space in the high memory region for the sprite data, but not for the stack space that the rest of my program actually needs). I believe I have now fixed it and the Axe version should be stable, so I have updated that file in my original post.

Matrefeytontias, since most of this code is still yours and I would have had no clue about this general rendering method or how to make any of these effects myself, feel free to absorb my code into yours and use it as you see fit.

Also, my computer had a massive glitch in which the screen had crazy artifacts everywhere and the only audio was digital noise when I was about 95% done writing this post. I had feared that I lost my ~1.5 hours of work. Thank you Google, because when I restarted, Chrome prompted to restore my session and it had amazingly saved everything I wrote.
Title: Re: [Axe] Plane deformations are fun
Post by: Matrefeytontias on February 02, 2014, 12:49:45 pm
<_< how could you think of that with a human brain xD it's just stupid how it's optimized. Amazing as always.

Also, I'm kind of a maniac about my work, so I usually don't use anybody's code unless I'm capable of replicating it. And I'm clearly not capable of replicating that, so it'll stay yours.
Title: Re: [Axe] Plane deformations are fun
Post by: Runer112 on February 02, 2014, 01:04:33 pm
At least on ticalc, you can specify co-authors of files. One of us should put them on there with the other specified as a co-author. :) On sites that don't allow that explicit specification, it's up to you if you'd feel comfortable updating it with a description that credits me for the renderer.

Hmm, for some reason, the assembly version doesn't work on my actual calculator... I was really looking forward to seeing the blur that would result. Guess I'd better investigate this.

EDIT: Also, to clarify: most of the code is still really yours. My compulsive coding style drove me to perform little optimizations, move things around, and make personal stylistic changes, but most of the code still does the same thing you originally designed it to. I could probably recreate the assembly version from your original source only by importing my texel-rendering codelet generator and assembly core, with slight modifications to the big texel LUT generation and the texture rotating code.

EDIT 2: I couldn't run the assembly version because I didn't have enough RAM for the 12288-byte LUT. But I discovered another problem... it's too fast for the LCD driver. x.x
Title: Re: [Axe] Plane deformations are fun
Post by: Matrefeytontias on February 02, 2014, 01:12:35 pm
I won't update my version with your code, since it's on several sites that don't permit file editing at all (especially pouet.net). But yeah you're right, you should upload it on ticalc named "Illogical optimized" or something, with me as co-author.
Title: Re: [Axe] Plane deformations are fun
Post by: DJ Omnimaga on February 07, 2014, 01:43:57 pm
My turn!

I was convinced this effect could be made faster, and after a lot of careful thought and crazy tricks, I managed to bump up the FPS: from 14.3 to 18.5, an improvement of about 30%! Again, it is running at 15MHz. It may look slightly different, as I had to rotate the texture up and right by one pixel per frame (rather than down and right) due to complications of the immensely aggressive optimization. I'll attach the source to this post, and here's a gif proving that it does indeed work:

(http://i.imgur.com/VlOGcAc.gif)

Spoiler For Oh, and did I mention...:
... that that's just the pure Axe version? :evillaugh:

As is always the case for assembly, and is especially the case for really specific, concise algorithms, if you know what you're doing you can get big performance gains over a compiled language. Re-coding only the rendering core in just as (if not more) aggressively optimized assembly, I registered a huge boost in FPS: 18.5 to 44, an improvement of about 140% on my pure Axe version and 200% on the existing assembly core version! The source for this will be attached as well, and luckily this one is 100% stable, so play with it all you want! Here's a gif again, showing it off, although keep in mind that it's actually rendering about twice as many frames as the gif captured:

(http://i.imgur.com/8efCZZd.gif)

Spoiler For Wait a second...:
... why does that gif say 6MHz and load effects so slowly? Because it is 6MHz! :evillaugh: :evillaugh: :evillaugh:

The true 15MHz FPS for the version with the assembly core is a stupidly high 107! This is with no pre-rendering of frames or any such cheating; every frame is rendered pixel-by-pixel as always. So the total performance markup on the original 14.3 FPS comes to about 650%. You can try it for yourself by simply un-commenting the Full in the setup part of the assembly version source. And although it doesn't even capture a quarter of the frames rendered and the original effects are all but impossible to discern, here's a gif:

(http://i.imgur.com/uDGyzwA.gif)
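For anyone who wants to double-check the percentages, here's a tiny Python sketch of the arithmetic. The FPS values are the ones quoted in this post; improvement_percent is just a hypothetical helper name.

```python
def improvement_percent(old_fps, new_fps):
    """Relative gain of new_fps over old_fps, in percent."""
    return (new_fps / old_fps - 1) * 100

ORIGINAL = 14.3      # original version at 15MHz
PURE_AXE = 18.5      # optimized pure Axe version at 15MHz
ASM_6MHZ = 44        # assembly core at 6MHz
ASM_15MHZ = 107      # assembly core at 15MHz

print(round(improvement_percent(ORIGINAL, PURE_AXE)))    # the "about 30%"
print(round(improvement_percent(PURE_AXE, ASM_6MHZ)))    # the "about 140%"
print(round(improvement_percent(ORIGINAL, ASM_15MHZ)))   # the "about 650%"
print(round(ASM_15MHZ / PURE_AXE, 1))                    # speedup factor at 15MHz
```

The last line is the ratio behind the "about 6x faster than the Axe version" figure from the time analysis.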

Beat that. :P

EDIT: Apparently, at 15MHz, the assembly version is too fast for the LCD driver on my calculator and glitches out a bit. Whoops. I should probably fix that...
You could basically port Mario Kart O.O