Messages - Vogtinator
76
« on: June 04, 2015, 10:55:16 am »
Quote: I don't have the time to play around with the Ndless programs, so I don't see any point in not updating it.
It's a one-way route, there's no way back yet.
Quote: I understand, but I'm giving this to a friend and, tbh, he would definitely update it if I didn't.
I'm fairly sure he wouldn't, Ndless is very useful.
77
« on: June 02, 2015, 07:07:19 pm »
In theory, yes, but there may be some breakages.
78
« on: June 01, 2015, 06:42:49 pm »
It should be, it's a symlink pointing to the latest build.
80
« on: May 29, 2015, 02:18:21 pm »
You'd need Ndless for that first, and if your calc is new and has OS 4.0 preinstalled, that's not possible right now.
81
« on: May 21, 2015, 11:02:19 am »
Quote: So, for example, unrolling the loop would help quite a bit. 8 cycles for copying a single pixel without any transformation is way too much.
Quote: It would, but as I said, that's only possible if Xsrc % 2 == Xdest % 2, so it's not widely applicable.
Somehow I was thinking about 32-bit access here, I don't know why.
82
« on: May 20, 2015, 07:44:41 am »
Quote: So, for example, unrolling the loop would help quite a bit. 8 cycles for copying a single pixel without any transformation is way too much.
It would, but as I said, that's only possible if Xsrc % 2 == Xdest % 2, so it's not widely applicable.
Quote: 8 cycles for copying a single pixel without any transformation is way too much.
Yeah, but I guess you can't do much about it without using asm (and I aim not to, except where there are very obvious improvements). The assembler output doesn't look much different with more gcc optimizations.
Quote: Given your example, which seems to copy without the transparency check, we end up with something like this (cycle estimate at end of line):
I guess it could be improved by one cycle if the cmp is moved between the ldrh/strh?
83
« on: May 18, 2015, 01:44:43 pm »
Sadly both routines appear to be quite unoptimized. I guess you could make it faster by doing 32-bit transfers, but the shortest asm version with word-transfers is the nGL version minus ldrh r8, [r0, #6], because that should happen outside of the loop and r1 could be used as counter instead of r12. Basically (r0 is source, r1 is dest, r2 is end of source):
loop:
    ldrh r3, [r0], #2
    strh r3, [r1], #2
    cmp r0, r2
    bne loop
Also, 32-bit transfers would be impossible if source or dest isn't 32-bit aligned, and they aren't aligned if you have an odd X.
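To make the alignment point concrete, here is a minimal C sketch. It is not taken from nGL or n2DLib, and copy_row16 and copy_row_words are hypothetical names. It copies a row of 16-bit pixels two at a time, but only when source and destination share the same offset within a 32-bit word; otherwise it falls back to the halfword loop. memcpy with a fixed size of 4 is used to stay within C's aliasing rules, and whether the compiler turns it into a single ldr/str depends on the compiler and flags.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Baseline: one 16-bit load and one 16-bit store per pixel,
 * the C equivalent of the ldrh/strh loop. */
static void copy_row16(uint16_t *dst, const uint16_t *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    const uint16_t *end = src + n;
    while (src != end)
        *dst++ = *src++;
}

/* Copies n pixels, two per iteration where possible.
 * Returns 1 if the 32-bit path was used, 0 if it fell back. */
static int copy_row_words(uint16_t *dst, const uint16_t *src, size_t n)
{
    assert(dst != NULL && src != NULL);

    /* Both pointers must sit at the same offset inside a 32-bit word.
     * For 16-bit pixels this is the Xsrc % 2 == Xdest % 2 condition. */
    if ((((uintptr_t)dst ^ (uintptr_t)src) & 2u) != 0) {
        copy_row16(dst, src, n);
        return 0;
    }

    /* Copy one leading pixel if needed to reach a 4-byte boundary. */
    if (n > 0 && ((uintptr_t)src & 2u) != 0) {
        *dst++ = *src++;
        n--;
    }

    /* Main loop: 4 bytes (two pixels) per iteration. */
    while (n >= 2) {
        uint32_t two_pixels;
        memcpy(&two_pixels, src, sizeof two_pixels);
        memcpy(dst, &two_pixels, sizeof two_pixels);
        src += 2;
        dst += 2;
        n -= 2;
    }

    /* Trailing pixel, if the count was odd. */
    if (n != 0)
        *dst = *src;
    return 1;
}
```

With sprites blitted at arbitrary X positions the fallback gets taken about half of the time, which is why this isn't widely applicable.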
84
« on: May 17, 2015, 08:07:19 am »
Quote: I don't know why I even bother answering this the second time, I wrote this some time ago already. Yeah well people wouldn't stop talking shit about how n2DLib was slower than nGL even though not a single test was ever made.
I would make a test, but n2DLib doesn't support TEXTURE-TEXTURE blitting, which nGL does. Although that shouldn't make a huge difference, it'll be unfair.
Quote: You optimized it? Oh yeah cool, so did pierrotdu18, Hayleia and I.
Well, I'm sorry to say, but it just doesn't look like it. The drawSprite routine, as the simplest example, makes a call to setPixel per pixel. This is bad for four reasons:
- Function calls are slow
- Two comparisons
- Multiplication
- Variable loaded from RAM, indirectly
For easier comparison, here are the two inner loops of drawSprite and the nGL equivalent, compiled with the same flags as your example:
00000a80 <setPixel>:
 a80: e35100ef cmp r1, #239 ; 0xef
 a84: 93500d05 cmpls r0, #320 ; 0x140
 a88: 33a03d05 movcc r3, #320 ; 0x140
 a8c: 30210193 mlacc r1, r3, r1, r0
 a90: 359f300c ldrcc r3, [pc, #12] ; aa4 <setPixel+0x24>
 a94: 31a01081 lslcc r1, r1, #1
 a98: 35933000 ldrcc r3, [r3]
 a9c: 318320b1 strhcc r2, [r3, r1]
 aa0: e12fff1e bx lr
 aa4: 00011078 .word 0x00011078
In drawSprite:
 c94: e1550008 cmp r5, r8
 c98: e08b3005 add r3, fp, r5
 c9c: aa000008 bge cc4 <drawSprite+0x68>
 ca0: e0da20b2 ldrh r2, [sl], #2
 ca4: e1d630b4 ldrh r3, [r6, #4]
 ca8: e1530002 cmp r3, r2
 cac: 0a000002 beq cbc <drawSprite+0x60>
 cb0: e1a01004 mov r1, r4
 cb4: e1a00005 mov r0, r5
 cb8: ebffff70 bl a80 <setPixel>
 cbc: e2855001 add r5, r5, #1
In nGL's drawTexture:
 5a0: e25cc001 subs ip, ip, #1
 5a4: 3a000005 bcc 5c0 <drawTexture(...)+0x158>
 5a8: e0d560b2 ldrh r6, [r5], #2
 5ac: e1d080b6 ldrh r8, [r0, #6]
 5b0: e2811002 add r1, r1, #2
 5b4: e1580006 cmp r8, r6
 5b8: 114160b2 strhne r6, [r1, #-2]
As that code is run per pixel, I guess that is definitely a noticeable difference.
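For readers who prefer C over disassembly, the difference can be sketched roughly like this. This is an illustration only, not the actual n2DLib or nGL source: set_pixel, draw_row_slow, draw_row_fast and the 320x240 framebuffer are all assumed names and values. The slow variant pays for a call, two bounds comparisons, a multiplication and an indirect load of the screen pointer on every pixel; the fast variant clips once per row and then only walks two pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { SCREEN_W = 320, SCREEN_H = 240 };

static uint16_t framebuffer[SCREEN_W * SCREEN_H];
/* Global pointer, so each set_pixel call has to load it from RAM first. */
static uint16_t *screen = framebuffer;

/* Per-pixel routine: bounds check, multiply, indirect store. */
static void set_pixel(unsigned x, unsigned y, uint16_t c)
{
    if (x < SCREEN_W && y < SCREEN_H)
        screen[y * SCREEN_W + x] = c;
}

/* Slow variant: one function call for every non-transparent pixel. */
static void draw_row_slow(const uint16_t *src, unsigned x, unsigned y,
                          unsigned w, uint16_t transparent)
{
    assert(src != NULL);
    for (unsigned i = 0; i < w; i++)
        if (src[i] != transparent)
            set_pixel(x + i, y, src[i]);
}

/* Fast variant: clip once, compute the destination address once,
 * then just compare and store inside the loop. */
static void draw_row_fast(const uint16_t *src, unsigned x, unsigned y,
                          unsigned w, uint16_t transparent)
{
    assert(src != NULL);
    if (x >= SCREEN_W || y >= SCREEN_H)
        return;
    if (w > SCREEN_W - x)
        w = SCREEN_W - x;              /* clip at the right edge */

    uint16_t *dst = screen + y * SCREEN_W + x;
    for (; w != 0; w--, src++, dst++)
        if (*src != transparent)
            *dst = *src;
}
```

Both variants draw the same pixels; the sketch only handles clipping at the right edge, since the coordinates are unsigned.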
85
« on: May 16, 2015, 04:59:43 pm »
Quote: That would be great if you made a tutorial about texture mapping.
Will do, that's the next step.
Quote: Also, does anyone know if nGL is faster for 2D stuff than n2DLib?
The 3D parts definitely not. Although it's faster on desktop machines to use orthogonal projection for 2D rendering as that's hardware accelerated, that's not the case here. There is a small 2D part in texturetools.cpp, for working with TEXTURE objects, like (GL_LINEAR scaled) blitting, block blitting with 50% opacity, resizing and converting to greyscale for classic calcs, but not much more. Those parts are optimized, thus probably faster than n2DLib (never tested, but it might show if you blit an excessive amount of pixels) and support blitting from TEXTURE to TEXTURE instead of blitting to screen only. If you read some older posts in this thread and n2DLib, you might notice that there was already quite a discussion about speed...
Edit: Lesson 2 - Texture mapping, is up.
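As an aside on the 50% opacity blitting mentioned above: for 16-bit colours there is a well-known trick that averages two pixels without unpacking the channels. The sketch below assumes RGB565 pixels and uses made-up names (blend50_rgb565, blend_row50); it is not the code from texturetools.cpp.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Average two RGB565 pixels. The mask 0xF7DE clears the lowest bit of
 * each channel, so the shift cannot leak a bit into the channel below
 * and the sum cannot overflow into the channel above. */
static uint16_t blend50_rgb565(uint16_t a, uint16_t b)
{
    return (uint16_t)(((a & 0xF7DEu) >> 1) + ((b & 0xF7DEu) >> 1));
}

/* Blend a row of source pixels onto the destination at 50% opacity. */
static void blend_row50(uint16_t *dst, const uint16_t *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++)
        dst[i] = blend50_rgb565(dst[i], src[i]);
}
```

The price is one bit of precision per channel, which is usually not visible on a 16-bit display.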
86
« on: May 13, 2015, 08:38:25 pm »
I finally got around to writing a tutorial on how to use nGL: http://github.com/Vogtinator/nGL
Maybe we'll see some more 3D games on the nspire now!
87
« on: May 07, 2015, 04:47:10 pm »
There is no downside to using Zehn, as make-prg generates backwards-compatible tns's, even with compression enabled. It also doesn't make any sense to have only the Zehn file and no backwards-compatible one as it adds not even 10k.
88
« on: May 02, 2015, 07:24:13 am »
Quote: Using g++ (GCC 4.8) works just fine, but with nspire-g++ (GCC 4.9) I'm getting lots of "undefined reference to" errors when linking. This is what I was trying to compile, by the way: https://github.com/dmitrysmagin/fceu320-rzx50
You have to resolve them on a case-by-case basis. It's not the compiler's fault.
Quote: Finding a good emulator is not easy.
Definitely not, emulators are complex by nature.
89
« on: May 01, 2015, 12:08:22 pm »
Quote: Speaking about the NES, I did try to port FCEUX, but I failed because C++ support in Ndless is limited and lacking, so I can't port it yet.
Except for threads, everything should be working. What doesn't work?
90
« on: May 01, 2015, 10:57:22 am »
If you enable CONFIG_KEYBOARD_NSPIRE it should work OOTB with the right DTB.