Omnimaga
Calculator Community => Other Calc-Related Projects and Ideas => TI-Nspire => Topic started by: fb39ca4 on January 08, 2013, 09:18:21 pm
-
Just something I've been working on the past couple days. The screenshot makes it look slow, but it is actually running in full resolution on the lower half of the screen at just over 60fps in the emulator. So far I haven't implemented turning, but that will come soon, and shouldn't be too much of a performance hit. The map is just a 64 x 64 array of random colors, I'll add proper texture tiles later. It's only in grayscale because I don't have a CX or a CX boot1 dump *cough* *cough*.
-
Looks cool!
Have you put any thought into implementing rotations?
-
Wow, this is looking awesome, it also runs smooth :D
-
I'll work on rotations today. Also, are there any functions to access the timer in Ndless, or do I have to read memory addresses directly? I'd like to make a proper FPS counter.
-
If you're using... *cough* *cough* nSDL, there is SDL_GetTicks().
-
I think I'm doing something wrong, because SDL_GetTicks always returns 0. Also, on my calculator, it only runs at 35 FPS :(. Of course, with the horrible LCD response times, it's not much of a difference. Can someone test this on their CX calculator, and let me know how long it takes in seconds to get from one end of the map to the other?
-
It takes about 26-27 seconds from bottom to top and the other way around. From left to right it takes about 24 seconds.
-
So it's even slower on a CX. Movement is done in increments of 1/8 of a pixel on the map, so you need 64 * 8 = 512 frames to cross the map, which works out to ~20FPS. Could it be because the screen buffer is stored in SDRAM on the CX instead of the SRAM on non-CX calcs?
-
I think I'm doing something wrong, because SDL_GetTicks always returns 0.
Are you sure about that? Did you try printing out the value it returns? I just tried it on the emulator and seems to be working.
-
Yes, I used uart_printf and it displayed 0. Anyways, I figured out how to read the timer value directly, and that works fine. However, converting the frame time to a float isn't working for some reason. I used:
float floatFrameTime = (float)(time - oldTime);
uart_printf("\n%u %f", time - oldTime, floatFrameTime);
time-oldTime displays correctly (it is usually in the range of 360 - 500, measured in 1/32768ths of a second), but floatFrameTime always prints 0.00.
-
Yeah that's because of a bug or some issue with printing floats. The nspire syscalls for some reason can't handle printing floats. Try casting it to an integer.
Also remember the value returned by SDL_GetTicks() is a Uint32.
EDIT: BTW, to get the timer value, did you read the RTC or the millisecond timer?
-
Oh, so that's why. Now it works fine, but it only shows up as full numbers which I guess is good enough.
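For anyone who wants a fractional FPS readout without %f, here is a minimal sketch that stays in integer math. It assumes the timer counts at 32768 ticks per second, as described above; the names fps_times_100 and format_fps are made up for this example and are not part of Ndless or nSDL.

```c
#include <stdint.h>
#include <stdio.h>

#define TICKS_PER_SECOND 32768u  /* timer rate mentioned above (assumed) */

/* Frames per second scaled by 100, using integers only.
   Returns 0 for a zero-length frame to avoid dividing by zero. */
static uint32_t fps_times_100(uint32_t frame_ticks) {
    if (frame_ticks == 0)
        return 0;
    return (TICKS_PER_SECOND * 100u) / frame_ticks;
}

/* Writes the FPS as "NN.NN" without any floating-point specifier. */
static void format_fps(char *buf, size_t len, uint32_t frame_ticks) {
    uint32_t f = fps_times_100(frame_ticks);
    snprintf(buf, len, "%u.%02u", (unsigned)(f / 100u), (unsigned)(f % 100u));
}
```

The whole and fractional parts are printed as two unsigned integers, so the float formatting problem never comes into play.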
-
Weird shit happens with rotation. :P
Also, CalcCapture is acting weird. At the end, you can see that it inserted a picture of itself into a few frames.
-
Looks like a fun ocean :P
-
Wow, nice, but is it a bug that the screen suddenly turns black?
-
The whole thing is a bug.
-
Weird shit happens with rotation. :P
Also, CalcCapture is acting weird. At the end, you can see that it inserted a picture of itself into a few frames.
It always did that for me. You see CalcCapture at the end in almost all screenshots I made with it (see this screenshot from Early 2004 for example)
(http://www.omnimaga.org/oldsite/battledemo.gif)
It also has poor framerate (although it still does the job I guess). Anyway nice to see more work being done on Mode 7 lately for the Nspire :)
-
Which method are you using ? Maybe I can help you :)
-
I suspect the problem is inaccurate trig or more likely fixed point multiplication routines.
I was originally calculating a lookup table for every pixel and then translating the values for each pixel based on the camera position and rotation. Now, I am only doing a lookup table for every row of the screen, and interpolating values within the row, which is actually how the GBA and SNES do it. It also improved the framerate as you can see in the FPS counter.
I still have to add back in rotation. Fortunately, it shouldn't be as much of a performance hit as you saw last time, because all the translation will be done per row, instead of per pixel.
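To make the per-row idea concrete, here is a rough sketch in C. It is not the code from this project. It assumes 16.16 fixed point, a 320-pixel-wide screen, a 64 x 64 byte map stored as a flat array, and a simple projection where the ground distance for a row is height * focal / (rows below the horizon). All names (RowSetup, setup_row, draw_row, fix_mul, fix_div) and the sign convention for the sideways direction are invented for illustration.

```c
#include <stdint.h>

typedef int32_t fix16;                 /* 16.16 fixed point */
#define FIX_ONE 65536

enum { SCREEN_W = 320, MAP_SIZE = 64 };  /* MAP_SIZE must be a power of two */

static fix16 fix_mul(fix16 a, fix16 b) { return (fix16)(((int64_t)a * b) / FIX_ONE); }
static fix16 fix_div(fix16 a, fix16 b) { return (fix16)(((int64_t)a * FIX_ONE) / b); }

typedef struct {
    fix16 u, v;    /* map coordinate of the leftmost pixel of the row */
    fix16 du, dv;  /* map-space step per screen pixel */
} RowSetup;

/* All the expensive math happens once per row. rows_below_horizon must be >= 1. */
static RowSetup setup_row(int rows_below_horizon, fix16 cam_x, fix16 cam_y,
                          fix16 cam_height, fix16 focal,
                          fix16 cos_a, fix16 sin_a) {
    RowSetup r;
    /* Ground distance seen on this row. */
    fix16 dist = fix_div(fix_mul(cam_height, focal), rows_below_horizon * FIX_ONE);
    /* Map units covered by one screen pixel at that distance. */
    fix16 scale = fix_div(dist, focal);
    /* Step along the camera's sideways vector (-sin, cos); the sign is a convention. */
    r.du = fix_mul(scale, -sin_a);
    r.dv = fix_mul(scale, cos_a);
    /* Start at the row's centre point and back up half a screen width. */
    r.u = cam_x + fix_mul(dist, cos_a) - (SCREEN_W / 2) * r.du;
    r.v = cam_y + fix_mul(dist, sin_a) - (SCREEN_W / 2) * r.dv;
    return r;
}

/* The inner loop only adds and masks: one map lookup per pixel. */
static void draw_row(uint8_t *dst, const uint8_t *map, RowSetup r) {
    for (int x = 0; x < SCREEN_W; x++) {
        unsigned mx = ((uint32_t)r.u >> 16) & (MAP_SIZE - 1);
        unsigned my = ((uint32_t)r.v >> 16) & (MAP_SIZE - 1);
        dst[x] = map[my * MAP_SIZE + mx];
        r.u += r.du;
        r.v += r.dv;
    }
}
```

With this split, rotation only changes the cos_a and sin_a values fed into setup_row, which is why it costs little once the per-pixel work is reduced to two additions.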
-
Looks kinda nice. I like how the resolution is maximum :D
-
Well, I've got rotation by a fixed angle working, and it still runs >60FPS at full resolution. (Not that it matters given how slow the response time of the LCD on the actual thing is.) Now to write *accurate* trig functions so I can change the angle dynamically. Also - screen tearing is from the emulator, not the program.
-
60 FPS??? I wonder if it would be as fast on the CX? O.O (without the screen blur and stuff)
What is the speed for bigger maps?
-
If you're measuring 64FPS on the emulator you can be sure it's lower on-calc. The emulator doesn't have very accurate timing, because it executes every instruction in 1 cycle and doesn't emulate memory access delays (not that I'd expect it to).
-
Yeah, real calc is like ~25% the speed of the emulator for me :P
-
Since I'm getting quite interested in Mode7, I was planning to see how it worked.
I thought of a way that would look a bit like the screenshots you gave: each "pixel" is shown as a trapezoid, from small (top) to big (bottom).
However, when I look at screenshots of F-Zero (68k), for example http://www.ticalc.org/archives/files/fileinfo/381/38175.html, this doesn't seem to be the case.
Is that then a different kind of mode7?
-
Looks like it is doing the same thing, but at a lower resolution so you can't see the trapezoid shape.
EDIT: So I'm having trouble running this code that tests my fixed point sine and cosine functions. The input and output are 16.16 fixed point numbers, and there are four units (in fixed point, so it would be an integer 4 << 16) to a rotation.
When I run it, I get this:
Error at PC=00000104: Invalid condition code
Backtrace:
Frame PrvFrame Self Return Start
00000004: invalid address
debug>
This is the actual code. To build it, just put it in main.c and compile with the Ndless SDK (or however else you compile Ndless programs).
#include <os.h>
#include <nspireio2.h>
#define numToFix(a) (a << 16)
#define floatToFix(a) (int32_t)(a * 65536)
int32_t sin(int32_t a) {
int32_t sine_table[257] =
{0x0000, 0x0192, 0x0324, 0x04B6, 0x0648, 0x07DA, 0x096C, 0x0AFE,
0x0C8F, 0x0E21, 0x0FB2, 0x1144, 0x12D5, 0x1466, 0x15F6, 0x1787,
0x1917, 0x1AA7, 0x1C37, 0x1DC7, 0x1F56, 0x20E5, 0x2273, 0x2402,
0x2590, 0x271D, 0x28AA, 0x2A37, 0x2BC4, 0x2D50, 0x2EDB, 0x3066,
0x31F1, 0x337B, 0x3505, 0x368E, 0x3817, 0x399F, 0x3B26, 0x3CAD,
0x3E33, 0x3FB9, 0x413E, 0x42C3, 0x4447, 0x45CA, 0x474D, 0x48CE,
0x4A50, 0x4BD0, 0x4D50, 0x4ECF, 0x504D, 0x51CA, 0x5347, 0x54C3,
0x563E, 0x57B8, 0x5931, 0x5AAA, 0x5C22, 0x5D98, 0x5F0E, 0x6083,
0x61F7, 0x636A, 0x64DC, 0x664D, 0x67BD, 0x692D, 0x6A9B, 0x6C08,
0x6D74, 0x6EDF, 0x7049, 0x71B1, 0x7319, 0x7480, 0x75E5, 0x774A,
0x78AD, 0x7A0F, 0x7B70, 0x7CD0, 0x7E2E, 0x7F8B, 0x80E7, 0x8242,
0x839C, 0x84F4, 0x864B, 0x87A1, 0x88F5, 0x8A48, 0x8B9A, 0x8CEA,
0x8E39, 0x8F87, 0x90D3, 0x921E, 0x9368, 0x94B0, 0x95F6, 0x973C,
0x987F, 0x99C2, 0x9B02, 0x9C42, 0x9D7F, 0x9EBC, 0x9FF6, 0xA12F,
0xA267, 0xA39D, 0xA4D2, 0xA605, 0xA736, 0xA866, 0xA994, 0xAAC0,
0xABEB, 0xAD14, 0xAE3B, 0xAF61, 0xB085, 0xB1A8, 0xB2C8, 0xB3E7,
0xB504, 0xB620, 0xB73A, 0xB852, 0xB968, 0xBA7C, 0xBB8F, 0xBCA0,
0xBDAE, 0xBEBC, 0xBFC7, 0xC0D0, 0xC1D8, 0xC2DE, 0xC3E2, 0xC4E3,
0xC5E4, 0xC6E2, 0xC7DE, 0xC8D8, 0xC9D1, 0xCAC7, 0xCBBB, 0xCCAE,
0xCD9F, 0xCE8D, 0xCF7A, 0xD064, 0xD14D, 0xD233, 0xD318, 0xD3FA,
0xD4DB, 0xD5B9, 0xD695, 0xD770, 0xD848, 0xD91E, 0xD9F2, 0xDAC4,
0xDB94, 0xDC61, 0xDD2D, 0xDDF6, 0xDEBE, 0xDF83, 0xE046, 0xE106,
0xE1C5, 0xE282, 0xE33C, 0xE3F4, 0xE4AA, 0xE55E, 0xE60F, 0xE6BE,
0xE76B, 0xE816, 0xE8BF, 0xE965, 0xEA09, 0xEAAB, 0xEB4B, 0xEBE8,
0xEC83, 0xED1C, 0xEDB2, 0xEE46, 0xEED8, 0xEF68, 0xEFF5, 0xF080,
0xF109, 0xF18F, 0xF213, 0xF294, 0xF314, 0xF391, 0xF40B, 0xF484,
0xF4FA, 0xF56D, 0xF5DE, 0xF64D, 0xF6BA, 0xF724, 0xF78B, 0xF7F1,
0xF853, 0xF8B4, 0xF912, 0xF96E, 0xF9C7, 0xFA1E, 0xFA73, 0xFAC5,
0xFB14, 0xFB61, 0xFBAC, 0xFBF5, 0xFC3B, 0xFC7E, 0xFCBF, 0xFCFE,
0xFD3A, 0xFD74, 0xFDAB, 0xFDE0, 0xFE13, 0xFE43, 0xFE70, 0xFE9B,
0xFEC4, 0xFEEA, 0xFF0E, 0xFF2F, 0xFF4E, 0xFF6A, 0xFF84, 0xFF9C,
0xFFB1, 0xFFC3, 0xFFD3, 0xFFE1, 0xFFEC, 0xFFF4, 0xFFFB, 0xFFFE, 0x10000};
    switch ((a & 0x3000) >> 16) {
        case 0:
            return sine_table[((a >> 8) & 0xFF)];
        case 1:
            return sine_table[256 - ((a >> 8) & 0xFF)];
        case 2:
            return -sine_table[((a >> 8) & 0xFF)];
        case 3:
            return -sine_table[256 - ((a >> 8) & 0xFF)];
    }
}
inline int32_t cos(int32_t a) {
    return sin(a + 0x1000);
}
int main(void) {
    int n;
    for (n = 0; n < 1024; n++) {
        uart_printf("\n%4l, %08X, %08X", n, sin(n << 8), cos(n << 8));
    }
    return 0;
}
-
The problem is something with the uart_printf() and not your sin() and cos(). I don't know much about printf specifiers, but %4l doesn't seem to be valid and results in a warning (if you'd use printf instead of uart_printf, obviously).
Why do you use uart_printf and not the normal printf?
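For reference, %4l is an incomplete conversion (a length modifier with no conversion letter after it); for an int index, %4d is the usual choice. Below is a small sketch of a matching format string, using snprintf from standard C so it can be tried on a PC. The helper name format_row is made up, and whether uart_printf accepts the same format is an assumption I have not checked.

```c
#include <stdint.h>
#include <stdio.h>

/* Formats one row of the test output. %4d matches the int index; the
   int32_t results are cast to unsigned int so they match %08X. */
static int format_row(char *buf, size_t len, int n, int32_t s, int32_t c) {
    return snprintf(buf, len, "%4d, %08X, %08X",
                    n, (unsigned int)s, (unsigned int)c);
}
```

The casts matter on targets where int32_t is a long rather than an int, since %X expects an unsigned int argument.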
-
The function uart_printf fails.
Your code is correct, replace uart_printf with printf.
printf("\n%4l, %08X, %08X", n, sin(n << 8), cos(n << 8));
I have basically no idea what uart_printf does...
-
There's also a bug in this code, (a & 0x3000) >> 16 always gives zero. I think you meant to use 0x30000?
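Here is a sketch of how the lookup could look with that mask fixed. It also changes the cosine offset to 0x10000, on the assumption that a quarter rotation is 1.0 in this format (4 << 16 per full rotation); the posted code adds 0x1000 instead. To keep it short, the 257-entry table from the post above is passed in as a parameter rather than repeated; fix_sin and fix_cos are names chosen for this example.

```c
#include <stdint.h>

/* a: angle in 16.16 fixed point, 4.0 (4 << 16) per full rotation.
   table: the 257-entry quarter-wave sine table from the post above. */
static int32_t fix_sin(const int32_t table[257], int32_t a) {
    uint32_t index = ((uint32_t)a >> 8) & 0xFF;   /* position inside the quadrant */
    switch (((uint32_t)a >> 16) & 3) {            /* quadrant = bits 16-17 (mask 0x30000) */
    case 0:  return  table[index];
    case 1:  return  table[256 - index];
    case 2:  return -table[index];
    default: return -table[256 - index];          /* quadrant 3 */
    }
}

static int32_t fix_cos(const int32_t table[257], int32_t a) {
    /* cos(x) = sin(x + quarter rotation); a quarter rotation is 0x10000 here. */
    return fix_sin(table, a + 0x10000);
}
```

Using default for the last quadrant means every path returns a value. If the table lives inside the function as in the original, declaring it static const should avoid rebuilding it on the stack on every call.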
-
uart_printf is in nspireio2.h, and it works fine. I guess the name is there to differentiate printing to the screen from printing over RS232. I was using it because the regular printf was not working for me for some reason. Anyways, I fixed the bug calc84 pointed out, plus another one: part of the code still treated the angle as a floating point number, which I had neglected to change when I switched to the fixed point trig functions. Rotation works now :D Tilemapping is next on my list.
EDIT: Tested on the actual calculator (Clickpad Nspire), and it runs at 35fps.
EDIT: Enabling -O3 in gcc increases it to 44fps on the calculator.
-
Good to hear about the speed. Do you think you can move the camera up and display a little bit further so that we don't see the squares this big (like http://www.omnimaga.org/index.php?action=dlattach;topic=15526.0;attach=14553;image ) or would it slow it down too much?
-
I can do that, I just changed the height of the camera sometime between those two screenshots. The one you mentioned was taken with height = 16, and the last one I uploaded was at height = 2. Actually, when I plan on doing texture mapping, what is each square ATM will have a whole texture in it.
Here's a screenshot with height = 32:
-
Ah ok nice. I guess if it's slightly closer it would be fine, since the farthest pixels will get messed up anyway due to low calc resolution. How fast does it run on a 512x512 map?
-
It should run at about the same speed no matter how large the map is, possibly a little slower because there would be more cache misses with a larger map.
-
Lookin' mighty fine t0xic_kitt3n!
-
I'm pretty positive you could get a nice speed boost if you used one of the many available fixed-point math libraries in C, especially if the program messes around a lot with floating point data (the ARM doesn't have an FPU).
-
I am using my own 16.16 fixed point number routines in most of the code. It should be faster than using a library because I can write only the features I need for this.
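For readers unfamiliar with 16.16 fixed point, a minimal set of helpers might look like the sketch below. These are not the routines from this project; the names (fix16, INT_TO_FIX, FIX_TO_INT, fix16_mul, fix16_div) are invented, and the 64-bit intermediate is just one straightforward way to avoid overflow.

```c
#include <stdint.h>

typedef int32_t fix16;                        /* 16 integer bits, 16 fraction bits */

/* Fully parenthesised so expressions like INT_TO_FIX(a + b) behave. */
#define INT_TO_FIX(n)  ((fix16)((n) * 65536))
#define FIX_TO_INT(f)  ((int)((f) / 65536))   /* truncates toward zero */

/* The product of two 16.16 values has 32 fraction bits, so widen to
   64 bits first and then drop the extra 16 fraction bits. */
static fix16 fix16_mul(fix16 a, fix16 b) {
    return (fix16)(((int64_t)a * b) / 65536);
}

/* Pre-scale the dividend so the quotient keeps 16 fraction bits. b must not be 0. */
static fix16 fix16_div(fix16 a, fix16 b) {
    return (fix16)(((int64_t)a * 65536) / b);
}
```

For example, 1.5 is stored as 98304, and fix16_mul(98304, 98304) gives 147456, which is 2.25.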
-
Maybe, but I doubt it is faster. Many of the libraries use very clever hacks, such as this (http://cvs.isoar.ca/viewvc/cvs/math-sll/math-sll.c?revision=1.16&view=markup) one that even uses ARM assembly.
-
And AFAICT functions from a lib are only included if they are actually used, aren't they?
I don't know how the Ndless SDK does it, but most compilers don't include functions that are not used, so having too many functions doesn't really matter, since they don't end up in the program.
-
As promised, here are some textures! I also made movement dependent on the direction the camera is facing, rather than a fixed direction for each key as it was previously.
-
Awesome!
BTW, about the SDL_GetTicks() issue you mentioned earlier, it was my fault. I had completely forgotten to uncomment a line in nSDL that enabled bus access for the timer, and consequently it worked on the emulator but not on the actual hardware (i.e., it always returned 0). It was extremely dumb and careless of me, but I've updated it, so SDL_GetTicks() should work now (I've tested it on both calculators, everything runs smoothly).
-
Thanks! I haven't been using nSDL, so I just switched to reading the timer directly. However, back when I was using SDL_GetTicks(), it was returning zero on the emulator as well.
-
I believe you were asking for the source code hoffa, so here it is for anyone who wants to take a look. Also, now it supports multiple (crappy procedural) textures (but there's nothing stopping you from loading them from a file) as well as solid colors. On the map, squares with 0-15 in them are interpreted as the color, and when n >= 16, the texture with id n - 16 is used.
EDIT: Uploaded wrong version, fixed, and added screenshot.
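As an illustration of the map encoding described above (0-15 is a solid color, n >= 16 selects texture n - 16), a lookup could be sketched like this. The 16 x 16 texture size, the flat texture array layout, and the name sample_cell are assumptions for the example, not taken from the attached source.

```c
#include <stdint.h>

enum { TEX_SIZE = 16, FIRST_TEXTURE_ID = 16 };  /* TEX_SIZE is an assumed value */

/* cell:     byte read from the map
   textures: textures stored back to back, TEX_SIZE * TEX_SIZE bytes each
   tx, ty:   position inside the map square, in texels */
static uint8_t sample_cell(uint8_t cell, const uint8_t *textures,
                           unsigned tx, unsigned ty) {
    if (cell < FIRST_TEXTURE_ID)
        return cell;                                  /* 0-15: the color itself */
    const uint8_t *tex =
        textures + (cell - FIRST_TEXTURE_ID) * TEX_SIZE * TEX_SIZE;
    return tex[(ty % TEX_SIZE) * TEX_SIZE + (tx % TEX_SIZE)];
}
```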
-
Thanks a lot! That's some pretty damn clean code, it's a pleasure to read/mess with. :ninja:
EDIT:
Purely for viewing pleasure, a CX color version:
(http://i.imgur.com/VZUJpoJ.png)
And some on-calc footage to see how it runs:
(I edited a few lines to get it working with nSDL and got about 6 FPS extra at the same time)
It runs at a rather constant 28-31 FPS on CX (could certainly be pushed further if it was optimized for 16 bpp) and 45-47 FPS on Touchpad when the whole lower half of the screen is filled.
EDIT2: I added the TNS if somebody wants to play with it.
-
Thanks for adding CX support! I would do it myself, but I don't have a CX or the boot1 ROM for the emulator.
-
Man, that looks really smooth. :D
Well done!
-
Good news, I managed to push to a constant 40 FPS on CX, and over 70 FPS (!!) on Touchpad.
-
What is the format for colors with nSDL?
-
When using 8 bpp (when initializing SDL with 8 bpp) it's palettized (every color mapped to a certain RGB value; palette is located in screen->format->palette (http://www.libsdl.org/docs/html/sdlpalette.html) and can be changed using SDL_SetColors() (http://www.libsdl.org/docs/html/sdlsetcolors.html)). It generates some default palette which is the one I used.
When using 16 bpp, colors are just encoded as 16-bit high color (http://en.wikipedia.org/wiki/High_color#16-bit_high_color).
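As a quick illustration of the 16-bit case, here is a sketch of packing 8-bit RGB components into the 5-6-5 layout described in that article. The function name pack_rgb565 is made up, and that the calculator's 16 bpp mode uses exactly this layout is an assumption based on the link above.

```c
#include <stdint.h>

/* Packs 8-bit components into 16-bit high color:
   bits 15-11 red, bits 10-5 green, bits 4-0 blue. */
static uint16_t pack_rgb565(uint8_t r, uint8_t g, uint8_t b) {
    return (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}
```

In SDL code, SDL_MapRGB(screen->format, r, g, b) does the equivalent conversion for whatever format the surface actually has, so it is the safer choice when the layout is not known in advance.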
-
Good news!
I've now added real color support and with "real" textures! I removed the plain color/texture division and used the whole byte as a palette index (i.e., 8-bit, 256 colors). It was the best method speed/simplicity/size-wise.
After some nitpicky optimization, I managed to push the framerate to 90+ FPS on Touchpad/Clickpad, and to a constant 45 FPS on CX! ^-^ I believe it's now pretty close to the theoretical maximum performance you can expect.
Aaaand a screenshot because everyone loves 'em:
(http://i.imgur.com/uNbazqn.png)
Source code and binary attached!
-
Wow, looks awesome! :D
-
(http://i.imgur.com/ltvwD3U.png)
This'll be my final update.
Changes:
- Performance (oncalc): 100 FPS on Touchpad, most probably ~50 FPS on CX (can't test it, lil' brother lost his calc)
- Separated the mode7 part more and renamed stuff for more consistency (it now resembles an engine much more)
- Fixed some issue with positions on the map being wrong
- Added a simple function to load textures (only works with 8-bit surfaces, i.e. surface has to be converted first if necessary)
- Cleaned up stuff
Code & binary attached.
EDIT:
One thing I've noticed is that accessing global variables is much slower than accessing any other type of variable. Just using the global screen SDL_Surface in the rendering function decreased the framerate by nearly 10 FPS! It might be good to know if speed is a concern. IIRC it also applied to accessing structure members.
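One way to act on this is to read the global once and keep what the loop needs in local variables. The sketch below shows the pattern only; Surface is a stand-in struct rather than the real SDL_Surface, g_screen and fill_rows are hypothetical names, and how much it helps will depend on the compiler and on how Ndless loads the program.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for an 8 bpp surface (not the real SDL_Surface). */
typedef struct {
    uint8_t *pixels;
    int w, h, pitch;
} Surface;

Surface *g_screen;  /* hypothetical global, set up elsewhere */

/* Fills rows [first_row, last_row) with one color. */
static void fill_rows(uint8_t color, int first_row, int last_row) {
    /* Touch the global and the struct once, outside the loops. */
    Surface *screen = g_screen;
    const int w = screen->w;
    const int pitch = screen->pitch;
    uint8_t *row = screen->pixels + (size_t)first_row * (size_t)pitch;

    for (int y = first_row; y < last_row; y++) {
        for (int x = 0; x < w; x++)
            row[x] = color;   /* the inner loop only uses locals */
        row += pitch;
    }
}
```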
-
I wonder why accessing global variables is slower.
-
Ndless fake relocation atm.
-
Darn this looks really nice. Someone should really make a racing game or use the engine for a game like Pilotwings or an RPG world map.
-
Hoffa recommended that I use it in my F-Zero game (see my sig). I think that's what I'm going to do, if t0xic_kitt3n agrees.
-
I don't have any problem with it, go ahead. However, I was planning on adding sprites, so you may want to wait before you build around it.
-
I guess it would be a good idea unless you manage to make yours faster or something. (Or if t0xic's stopped being updated regularly and it slowed you down in your project due to having to wait or something)
-
I'm going to use it now, and if you do add sprite support, I'll just update my project to work with the new engine.
-
What does __attribute__ do?
static __inline__ __attribute__((always_inline))
m7Vec_t* m7AddVec(m7Vec_t* a, m7Vec_t* b) {
a->x += b->x;
a->y += b->y;
return a;
}
-
__attribute__ is the GCC-specific way to attach extra (not defined by the C/C++ standard but useful nevertheless) implementation-defined semantic information to functions, variables and statements, which triggers changes in the behaviour of the code generation. Several other compilers have their own way to attach implementation-defined information, with their own names for the implementation-defined prefix and the implementation-defined attributes, e.g. __declspec with MSC.
For instance, __attribute__((always_inline)) will try extra hard (more than inline / __inline__) to inline a function, and complain if for some reason, it cannot perform said inlining. The opposite is __attribute__((noinline)), which will force GCC to provide an out-of-line function even if it might have wanted to inline the function (e.g. in C, because it's "static", i.e. limited to the current TU, and has a single caller).
There are dozens of other GCC attributes, the most significant of which are supported by compilers which pretend to be GCC (especially Clang, and less so ICC). Some attributes are available across all targets, while some others are platform-specific (e.g. banks of memory which require special instructions, interrupt handlers with special epilogue / prologue, interrupt vectors, etc.).
Note that the C++11 standard defines a generic way to attach attributes to functions / variables / statements (though it defines a ridiculously small number of attributes, IIRC only [[noreturn]] and [[carries_dependency]]), but most of the major C++ compilers do not support it at all, or at least, in a fully spec-compliant way. No released version of GCC supports generalized attributes yet.
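A small side-by-side sketch of the two attributes mentioned above, for GCC or a compiler that accepts GCC attributes. Vec2 and the two function names are made up for the example; both functions compute the same thing and differ only in how the compiler is asked to emit them.

```c
/* GCC-style attributes: this will not build with compilers that lack them. */
typedef struct { int x, y; } Vec2;

/* Asks GCC to inline every call, even at low optimization levels. */
static inline __attribute__((always_inline))
Vec2 *vec2_add_inlined(Vec2 *a, const Vec2 *b) {
    a->x += b->x;
    a->y += b->y;
    return a;
}

/* Forces a real out-of-line function, even with a single caller. */
static __attribute__((noinline))
Vec2 *vec2_add_outlined(Vec2 *a, const Vec2 *b) {
    a->x += b->x;
    a->y += b->y;
    return a;
}
```

Comparing the generated assembly of the two (for example with gcc -S) is a quick way to see the difference.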
-
Necro,
I looked in the engine and it seems that it only supports fixed-size tilemaps right ?
-
Plop,
I'm having big problems with it D: if you're reading this please go to http://ourl.ca/18582 to help me, I describe my problem there.