Edit: I didn't downvote you. I really like discussions both sides benefit from.
Quote from: Hayleia on July 06, 2014, 02:05:34 pm
Quote from: Vogtinator on July 06, 2014, 01:58:34 pm
Use a Texture struct rather than an unsigned short* for better readability.
Well, two of us come from Axe and care less about readability than about speed (even though it seems like we are not Nspire pros) or efficiency.

It should compile to the same code. If not, it could be faster due to alignment if you use a "flexible array member" (StackOverflow question)
Also, I was once told by a moderator that even if a n00b is being annoying or someone spams, it's against the rules to be aggressive. So on Omni, if you question people's reading skills, tell members they have no brain, yell at them using cussing and all-caps, or try to make them look stupid/inferior, you cannot use any excuse to get away with it (the only way to get away with it is if you don't get caught). This is more of a heads-up, because in the past I got banned for being rude too (although not for very long), and you can see my rating ratio as an indicator.
and starting a pissing contest about which lib is better or not and doing so in someone else's lib thread.
I find it legitimate to be angry when people keep bashing my work and saying without any stats to prove it "yes my lib is faster than his lib".
Moreover, I keep proving my lib is not slow, but people keep ignoring me
and seeing it as the slowest thing of the universe, at the point that they'll take any other lib instead of it.
Of course, if someone does test and proves by a + b that Vogtinator's lib is faster than mine at an identical task, then no problem. But just stating it with no numbers of any kind is insulting.
Well, that answers the "speed problem", but not the "efficiency" one, because converting the array into a struct requires a conversion function -.-
struct Texture {
    uint16_t width;
    uint16_t height;
    uint16_t transparent_color;
    uint16_t bitmap[];
} __attribute__((packed));
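To make the "flexible array member" idea concrete, here is a minimal sketch in C. It assumes the raw unsigned short array starts with width, height and transparent color, followed by the pixels; that layout is an assumption for illustration, not something confirmed in this thread. The helper name texture_from_array is hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Header fields followed directly by the pixels, all 16-bit wide. */
struct Texture {
    uint16_t width;
    uint16_t height;
    uint16_t transparent_color;
    uint16_t bitmap[];   /* flexible array member: adds nothing to sizeof */
};

/* Hypothetical helper. Assumes raw is laid out as
 * {width, height, transparent_color, pixel0, pixel1, ...}. */
static struct Texture *texture_from_array(const uint16_t *raw)
{
    size_t pixels = (size_t)raw[0] * raw[1];
    size_t bytes = sizeof(struct Texture) + pixels * sizeof(uint16_t);
    struct Texture *t = malloc(bytes);
    if (t != NULL)
        memcpy(t, raw, bytes);   /* header and pixels share one layout */
    return t;
}
```

Under that assumption the conversion is a single allocation plus one memcpy, done once at load time rather than per frame. The caller owns the result and releases it with free().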
or blaming it for nKaruga slowdowns on CX models (even though we all saw what happened with the transition from 84+SE to 84+CSE).
I think it might be a better idea to use a different texture for the selected block, since I tend to confuse it with actual glass windows and the like. Would a square with nothing except an outline that slowly flashes between black and white be better?
<off-topic again sorry, but got to clear that up>Okay so I finally have a computer again, so I worked on n2DLib after several days when I couldn't. So I tried to implement your suggestions as good as I could.
First, sorry for being angry like this. I know it's ridiculous and shit, but I just had enough of seeing people implying the lib was supposedly slow. (and Vogtinator I never meant you, I know aeTIos kept saying it some time ago although he stopped now ; or maybe I'm just being too proud or paranoid, don't know which).
Anyway, well, it's a lib, so compile flags are up to the user.
DJ : I didn't want to start a war about which lib was better. I was just asking people to please stop assuming my lib is slow without even trying it or wondering whether their own code is at fault. I remember some days ago on IRC, Streetwalrus was saying something like "tetris is slow, it's n2DLib's fault" and was serious about it, and of course it turned out the problem was his own code. The fact itself isn't really important, it's just the way of thinking that's annoying. But let's forget that and pretend it never happened.

Vogtinator : about your optimizations, I do remember asking you for some, but I don't know, maybe I was working on something else at that moment so I forgot afterwards. I did my best at implementing them this time; it should be visible in GitHub's history.

Although I find it a good idea, I can't afford inverting the buffer only when updateScreen-ing depending on the screen model, because maybe one day someone will not want to erase the buffer each frame and that will give weird results.
So I guess that's it, sorry for the off-topic, sorry for being upset, sorry for saying stupid shit, things like that.

Back on-topic, when testing the latest version on an emulated TI-Nspire CX CAS with Ndless 3.1 r914, I noticed you couldn't swim in water or lava. Is that intended?
Quote from: Matrefeytontias on July 09, 2014, 02:25:59 pm
<off-topic again sorry, but got to clear that up>Okay so I finally have a computer again, so I worked on n2DLib after several days when I couldn't. So I tried to implement your suggestions as good as I could.

Looks better, but still a lot to improve! (Any branch per pixel, for instance if(has_color), definitely needs to go!)

Quote
Although I find it a good idea, I can't afford inverting the buffer only when updateScreen-ing depending on the screen model, because maybe one day someone will not want to erase the buffer each frame and that will give weird results.

That's the reason nGL allocates a third buffer on monochrome calcs (yup, one on CX and three on non-CXs)! 320*240*2 = 153,600 bytes (~150 KB), that's almost nothing.
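The extra-buffer idea can be sketched as follows. This is not nGL's actual code: the function name invert_into is hypothetical, and treating "inverting" as a bitwise NOT of each 16-bit pixel is an assumption made for illustration. The point is only that the inverted frame goes to a separate output buffer, so the drawing buffer survives untouched for programs that do not clear it every frame.

```c
#include <stddef.h>
#include <stdint.h>

enum { SCREEN_W = 320, SCREEN_H = 240 };
#define SCREEN_PIXELS ((size_t)SCREEN_W * SCREEN_H)

/* Hypothetical sketch: write the inverted frame into a separate buffer.
 * The drawing buffer is only read, never modified. */
static void invert_into(const uint16_t *draw, uint16_t *out, size_t count)
{
    for (size_t i = 0; i < count; i++)
        out[i] = (uint16_t)~draw[i];   /* assumed inversion: bitwise NOT */
}
```

With 16-bit pixels, one such buffer takes SCREEN_PIXELS * sizeof(uint16_t) = 153,600 bytes, which is the figure quoted above.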
Quote
First, sorry for being angry like this. I know it's ridiculous and shit, but I just had enough of seeing people implying the lib was supposedly slow. (and Vogtinator I never meant you, I know aeTIos kept saying it some time ago although he stopped now ; or maybe I'm just being too proud or paranoid, don't know which).

Well, then I wonder why you posted here... To tell everybody my lib was slow as well/as fast as yours?
Quote
Anyway, well, it's a lib, so compile flags are up to the user.

Yeah, but then you should use the correct flags for your example; it's an example after all.
Quote
So I guess that's it, sorry for the off-topic, sorry for being upset, sorry for saying stupid shit, things like that.Back on-topic, when testing the last version on an emulated TI-Nspire CX CAS with Ndless 3.1 r914, I noticed you couldn't swim in water nor lava, is that intended ?

No, definitely not! You can't swim in lava, and water needs to be at least two blocks deep to swim in. I just tested and it works for me :-/
Edit: Is it permitted to double-post to separate a release announcement from the rest?
And there's still more to optimize: partial loop unrolling, for example, since 2x16bit access is slower than 1x32bit.
To give you an example of an almost fully optimized function, I optimized drawTexture some more: https://github.com/Vogtinator/crafti/blob/master/texturetools.cpp#L151
GCC does some partial unrolling, but doesn't transform two 16bit accesses (ldrh/strh) into one 32bit access (ldr/str), although -Ofast should do something. Should I report a bug?
Code:
    8e28: e15392b4 ldrh r9, [r3, #-36] ; 0xffffffdc
    8e2c: e14292b4 strh r9, [r2, #-36] ; 0xffffffdc
    8e30: e15392b2 ldrh r9, [r3, #-34] ; 0xffffffde
    8e34: e14292b2 strh r9, [r2, #-34] ; 0xffffffde
Quote from: Vogtinator on July 06, 2014, 01:58:34 pm
And there's still more to optimize. Partial loop unrolling as 2x16bit access is slower than 1x32bit for example.
To give you an example of an almost fully optimized function, I optimized drawTexture some more: https://github.com/Vogtinator/crafti/blob/master/texturetools.cpp#L151
GCC does some partial unrolling, but doesn't transform 2 16bit accesses (ldrh/strh) to 32bit access (ldr/str), although -Ofast should do something, should I report a bug?
Code:
    8e28: e15392b4 ldrh r9, [r3, #-36] ; 0xffffffdc
    8e2c: e14292b4 strh r9, [r2, #-36] ; 0xffffffdc
    8e30: e15392b2 ldrh r9, [r3, #-34] ; 0xffffffde
    8e34: e14292b2 strh r9, [r2, #-34] ; 0xffffffde

There are actually two reasons why a compiler cannot use 32bit memory move instructions here.

1:
Code:
    COLOR *dest_ptr = dest.bitmap + dest_x + dest_y * dest.width, *src_ptr = src.bitmap + src_x + src_y * src.width;
dest_ptr might be src_ptr + 1 (*src_ptr is not const), so moving two texels at a time would produce different results in this case.

2:
It is not guaranteed that src_ptr and dest_ptr are each aligned to 4 bytes, which is required if you want to move 4 bytes at a time on ARM.
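The two points above can be sketched in C as follows. This is an illustration only, not the crafti or n2DLib code: copy_row is a hypothetical name, and COLOR is assumed to be a 16-bit type. The restrict qualifiers are the caller's promise that the rows do not overlap (point 1), and the runtime check takes the 32-bit path only when both pointers are 4-byte aligned (point 2). Whether GCC then actually emits ldr/str for the inner loop is not guaranteed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint16_t COLOR;   /* assumed 16-bit pixel type */

/* Hypothetical row copy. restrict: caller promises dest and src do not overlap. */
static void copy_row(COLOR *restrict dest, const COLOR *restrict src, size_t count)
{
    size_t i = 0;

    /* Move two pixels at a time only if both pointers are 4-byte aligned. */
    if ((((uintptr_t)dest | (uintptr_t)src) & 3u) == 0) {
        for (; i + 2 <= count; i += 2) {
            uint32_t pair;
            memcpy(&pair, src + i, sizeof pair);    /* one 32-bit load  */
            memcpy(dest + i, &pair, sizeof pair);   /* one 32-bit store */
        }
    }

    /* Remaining pixels, or the whole row when unaligned: 16-bit copies. */
    for (; i < count; i++)
        dest[i] = src[i];
}
```

Using memcpy for the 32-bit move avoids casting COLOR* to uint32_t*, which would run into strict aliasing rules; compilers commonly reduce a fixed 4-byte memcpy to a single load and store.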
Quote from: Vogtinator on July 09, 2014, 02:50:37 pm
Quote
First, sorry for being angry like this. I know it's ridiculous and shit, but I just had enough of seeing people implying the lib was supposedly slow. (and Vogtinator I never meant you, I know aeTIos kept saying it some time ago although he stopped now ; or maybe I'm just being too proud or paranoid, don't know which).
Well, then I wonder why you posted here.. To tell everybody my lib was slow as well/as fast as yours?

That "by the way, it's faster than n2DLib" just sitting here without any further development really made me upset.