Messages - Xeda112358

556

ASM / Re: Fast 8-bit Squaring

« on: January 28, 2014, 12:29:06 pm »

I managed to optimize the first iteration. I worked through it while in class and saved two bytes and 8 cycles.

EDIT: Here is some pseudocode for getting the full square (not just the lower bits):

Code: [Select]

A_sqrd:
;Input: A is an x-bit number (for example 8-bit, 16-bit, etc.)
;Output: A*A (2x bits wide)
	A<<1→B					;Store A shifted left once into B (B is x bits wide).
	0→ACC					;Initialise the accumulator (2x bits wide) with 0
	1<<(x-1)→MASK				;Set the top (most significant) bit of MASK
	For (x-1) Iterations:
		ACC<<1→ACC			;shift ACC (same as *2 or ACC+ACC)
		B<<1→B				;shift left once, keep the bit shifted out as carry
		((SBC(0,0)^A)&MASK)+ACC→ACC	;SBC(0,0) is "subtract with carry" and will yield either -1 (carry flag set) or 0 (carry flag reset).
		(MASK>>1)|MASK→MASK		;arithmetic right shift
	Return ((A<<x)-A)-ACC
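
A rough Python model of the pseudocode can be used to check it on a PC. The function name and the explicit bit-width handling are assumptions made for this sketch and are not part of the original routine; it only illustrates that the loop leaves (2^x-1)*A-A*A in ACC, so the final subtraction yields A*A.

```python
def a_sqrd(a, x=8):
    """Model of the A_sqrd pseudocode for an x-bit input (hypothetical helper name)."""
    full = (1 << x) - 1                 # all x bits set, i.e. -1 as an x-bit value
    top = 1 << (x - 1)                  # most significant bit of an x-bit value
    b = (a << 1) & full                 # A<<1 -> B, truncated to x bits
    acc = 0                             # ACC is allowed to grow to 2x bits
    mask = top
    for _ in range(x - 1):
        acc <<= 1                       # ACC<<1 -> ACC
        carry = (b & top) != 0          # the bit that B<<1 shifts out
        b = (b << 1) & full
        fill = full if carry else 0     # SBC(0,0): -1 if carry is set, else 0
        acc += (fill ^ a) & mask
        mask = (mask >> 1) | top        # arithmetic right shift keeps the top bit
    return ((a << x) - a) - acc


# Exhaustive check for 8-bit inputs
assert all(a_sqrd(a) == a * a for a in range(256))
```

Passing a different x (for example `a_sqrd(300, 16)`) models the wider versions mentioned in the input comment.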

Implemented on the Z80, it kind of sucks, taking twice as long as a traditional routine. However, I think it is novel, and it takes 7 iterations:

Code: [Select]

L_sqrd:
;Input: A
;Output: A*A->HL
;It's really slow compared to a generic 8-bit multiplication
;It's also pretty big
;But it is a novel approach!
	ld e,a            ;this just stays a constant for the algorithm
	rlca
	ld d,a            ;this will be rotated to get the bits of the input
	ld hl,0           ;the accumulator (H has to start at 0 as well)
	ld bc,0780h       ;b is the loop counter (7 iterations), c is a mask
loop:
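	add hl,hl         ;shift the accumulator
	rlc d             ;next bit of the input into carry
	sbc a,a           ;a = FFh if the bit was set, else 0
	xor e             ;the input, or its complement
	and c             ;keep only the masked upper bits
	add a,l           ;add the result to the 16-bit accumulator
	ld l,a
	jr nc,$+3
	inc h
	sra c             ;extend the mask down by one bit
	djnz loop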
;HL is 255*input-input*input
;add input
	ld c,e            ;b is 0 after djnz, so BC=input
	add hl,bc
;HL is 256*input-input*input
;negate HL
	xor a
	sub l
	ld l,a
	sbc a,a
	sub h
	ld h,a
;HL is input*input-256*input
;add input*256
	ld e,b            ;b=0 and d has rotated back to the input, so DE=256*input
	add hl,de
;HL is input*input
	ret
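
The comments in the routine above can be checked with a register-level model in Python. This is only a sketch: the helper name is made up, the input is taken from A as the code does, and it assumes that both H and L start at 0 before the loop.

```python
def l_sqrd_loop(a):
    """Register-level model of the looped Z80 routine (hypothetical helper name)."""
    e = a                                   # ld e,a
    d = ((a << 1) | (a >> 7)) & 0xFF        # rlca / ld d,a
    hl = 0                                  # assumption: H and L both start at 0
    c = 0x80                                # ld bc,0780h -> b=7, c=80h
    for _ in range(7):                      # djnz loop with b=7
        hl = (hl << 1) & 0xFFFF             # add hl,hl
        carry = d >> 7                      # rlc d: bit 7 goes to carry and to bit 0
        d = ((d << 1) | carry) & 0xFF
        acc = 0xFF if carry else 0x00       # sbc a,a
        acc = (acc ^ e) & c                 # xor e / and c
        hl = (hl + acc) & 0xFFFF            # add a,l / ld l,a / jr nc,$+3 / inc h
        c = (c >> 1) | (c & 0x80)           # sra c keeps bit 7
    assert hl == 255 * a - a * a            # the comment right after the loop
    hl = (hl + e) & 0xFFFF                  # ld c,e / add hl,bc (b is 0 after djnz)
    hl = (-hl) & 0xFFFF                     # xor a / sub l / ld l,a / sbc a,a / sub h / ld h,a
    hl = (hl + (d << 8)) & 0xFFFF           # ld e,b / add hl,de (d is the input again)
    return hl


assert all(l_sqrd_loop(a) == a * a for a in range(256))
```

The model relies on D ending up equal to the input again after the initial rlca plus seven rlc d rotations, which is what lets the final add hl,de add input*256.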

557

ASM / Fast 8-bit Squaring

« on: January 28, 2014, 09:03:43 am »

Last night I was thinking again about my Sine Approximation routine and holy crud, I must have been really tired not to recognize this:

In the approximation, I was taking the upper 8 bits, truncating propagation. This was so that I had a fixed-point routine. But if I had taken the lower 8 bits, I would have had a useful 8-bit integer routine for -x²-x.

But wait.

If I start the accumulator with x (which is actually 3 cycles faster to do), then I end up with -x².

Well then, anybody wanna optimize this? ^^ :

Code: [Select]

L_sqrd:
;Input: L
;Output: L*L->A
;159 t-states
;39 bytes
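	ld h,l            ;H feeds the low bits of the input into carry
	ld c,l            ;start the accumulator with the input
;first iteration
	rr h
	sbc a,a
	xor l
	add a,c
	ld c,a
	rl l              ;junk rotated into the low bits of L is masked off later

;second iteration
	rr h
	sbc a,a
	xor l
	and %11111000
	add a,c
	ld c,a
	rl l

;third iteration
	rr h
	sbc a,a
	xor l
	and %11100000
	add a,c
	ld c,a
	rl l

;fourth iteration, only bit 7 is affected so XOR replaces the addition
	ld a,h
	rrca
	xor l
	and 128
	xor c
	neg
	ret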

It might be easier to follow if I explain that it is computing:

Code: [Select]

=(hhhhhhhh^abcdefgh)+(gggggg00^bcdefg00)+(ffff0000^cdef0000)+(ee000000^de000000)
=(hhhhhhhh^abcdefgh)+(ggggg000^bcdef000)+(fff00000^cde00000)+(e0000000^d0000000)
=((hhhhhhhh^abcdefgh)+(ggggg000^bcdef000)+(fff00000^cde00000))^e0000000^d0000000

abcdefgh are the bits of the input (a is the most significant bit)
^ is XOR
+ is addition modulo 256
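
As a sanity check of the expression above, here is a small Python sketch that evaluates its last line for every 8-bit input. The function name is hypothetical. On its own the expression gives -x²-x modulo 256, and adding x (the starting value of the accumulator) turns it into -x².

```python
def xor_sum(x):
    """Evaluate ((h^x)+(g^x<<1 & F8h)+(f^x<<2 & E0h)) ^ e0000000 ^ d0000000 for 8-bit x."""
    def fill(bit):                           # hhhhhhhh, ggggg000, ... before masking
        return 0xFF if (x >> bit) & 1 else 0x00

    t0 = fill(0) ^ x                         # hhhhhhhh ^ abcdefgh
    t1 = (fill(1) ^ (x << 1)) & 0xF8         # ggggg000 ^ bcdef000
    t2 = (fill(2) ^ (x << 2)) & 0xE0         # fff00000 ^ cde00000
    t3 = (fill(3) ^ (x << 3)) & 0x80         # e0000000 ^ d0000000
    return ((t0 + t1 + t2) ^ t3) & 0xFF


for x in range(256):
    assert xor_sum(x) == (-x * x - x) & 0xFF          # the expression alone
    assert (-(x + xor_sum(x))) & 0xFF == (x * x) & 0xFF  # start with x, then NEG
```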

EDIT: Cross-posted:
I thought of a way to optimize the first iteration. Saved 2 bytes, 8 t-states.
Basically, after the first iteration, C=-1 (if L is odd) or C=2L (if L is even), then I shift L up one bit for the next iteration. I got rid of the initial ld c,l and use the first iteration to compute C. To do this, I just shift L at the beginning, then after the "sbc a,a" I just OR that with L. If A was $FF, the result is $FF (which is -1), else it is 2*L:

Code: [Select]

L_sqrd:
;Input: L
;Output: L*L->A
;151 t-states
;37 bytes
	ld h,l
;First iteration, get the lowest 3 bits
	sla l
	rr h
	sbc a,a
	or l
;second iteration, get the next 2 bits
	ld c,a
	rr h
	sbc a,a
	xor l
	and $F8
	add a,c
;third iteration, get the next 2 bits
	ld c,a
	sla l
	rr h
	sbc a,a
	xor l
	and $E0
	add a,c
;fourth iteration, get the last bit
	ld c,a
	ld a,l
	add a,a
	rrc h
	xor h
	and $80
	xor c
	neg
	ret

Also, as a note, if you stop at any iteration and perform NEG, you can mask to get the lower bits of the square. So for example, the first iteration gives you -L*L mod 8->A, the second returns -L*L mod 32->A, the third gives it mod 128.
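
The early-exit property can be illustrated with a short Python sketch that follows the optimized routine one iteration at a time. The helper name and the `iterations` parameter are assumptions made for this example only.

```python
def neg_square_partial(x, iterations):
    """Value left in A after 1 to 4 iterations of the optimized routine, before NEG."""
    def fill(bit):                           # sbc a,a: $FF if the bit was set, else 0
        return 0xFF if (x >> bit) & 1 else 0x00

    acc = (fill(0) | (x << 1)) & 0xFF        # first iteration: -1 or 2*L
    if iterations >= 2:
        acc = (acc + ((fill(1) ^ (x << 1)) & 0xF8)) & 0xFF
    if iterations >= 3:
        acc = (acc + ((fill(2) ^ (x << 2)) & 0xE0)) & 0xFF
    if iterations >= 4:
        acc ^= (fill(3) ^ (x << 3)) & 0x80   # only bit 7 changes
    return acc


# After NEG, masking gives the square modulo 8, 32, 128 and 256 respectively.
for x in range(256):
    for iterations, modulus in zip((1, 2, 3, 4), (8, 32, 128, 256)):
        assert (-neg_square_partial(x, iterations)) % modulus == (x * x) % modulus
```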

558

Minecraft Discussion / Re: Minecraft Recruiting

« on: January 26, 2014, 02:54:54 pm »

Unfortunately, Omnimaga is not very strictly devoted to calculators. In fact, it is less so than other sites. We do computer programming and music among other things. Primarily, we make and play games and those games have inspired a number of programs.

559

TI Z80 / Re: Snow Demo

« on: January 26, 2014, 01:51:01 pm »

Yep, that is what I am using. By the way, I am just running through the program 3 times and timing it with the given rand seeds. (I am doing it all in one program, though)

560

TI Z80 / Re: Snow Demo

« on: January 26, 2014, 01:43:46 pm »

Do you generate a new snowflake with probability .5 ?

561

TI Z80 / Re: Snow Demo

« on: January 26, 2014, 01:39:59 pm »

[discussion in IRC]
[time passes]
Here is my program. It just does one at a time for competition, and it spits out a new flake at each iteration with probability 1/2.

562

TI Z80 / Re: Snow Demo

« on: January 26, 2014, 01:09:15 pm »

That appears to be a bug with collision detection. Feel free to try to figure it out.

Theoretically, the top spots should only be full if all the rest below are full.

563

TI Z80 / Re: Snow Demo

« on: January 26, 2014, 12:53:52 pm »

It generates 2 each cycle by default.
There are 128 positions.
This will never happen.

And in case you wanted to do a number that doesn't evenly divide into 128, this piece of the code allows exiting as soon as 128 flakes are on screen:

Code: [Select]

E+1→E
If E=128
S→B

That exits the For() loop to prevent generating any more snow flakes, then if you look at the main Repeat loop conditions:

Code: [Select]

Repeat E>=128 or getKey=45

This exits if [Clear] is pressed or E is 128 or larger. E counts how many flakes have been drawn.

564

TI Z80 / Re: Snow Demo

« on: January 26, 2014, 12:39:05 pm »

It shouldn't freeze... I mean, theoretically, with a perfect random number generator it can, but the probability that it hasn't found an opening after n iterations is (15/16)ⁿ.

It has always worked pretty quickly for me at finding the opening. Also, my program exits when the screen is full (or clear is pressed).
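
To put numbers on the (15/16)ⁿ estimate, here is a tiny Python sketch. It assumes the worst case of exactly one free column out of 16 in the top row and an ideal random number generator; the function and parameter names are made up for the example.

```python
def p_still_searching(n, free_columns=1, columns=16):
    """Chance that n random picks in a row all land on occupied columns."""
    return ((columns - free_columns) / columns) ** n


for n in (10, 50, 100):
    print(n, p_still_searching(n))
```

With more than one free column the chance of still searching drops off even faster, since the base of the power gets smaller.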

565

TI Z80 / Re: Snow Demo

« on: January 26, 2014, 12:27:00 pm »

Uh, yup, I forgot to add one line:

Code: [Select]

Ans→L2(Ans

This should be inserted before (or after, doesn't really matter) the line redrawing the new location of the snowflake.

566

Axe / Re: [Axe] Plane deformations are fun

« on: January 26, 2014, 12:01:55 pm »

Okay, cool, thanks!

567

TI Z80 / Re: Snow Demo

« on: January 26, 2014, 11:37:02 am »

From Source Coder:

Code: [Select]

{16,8→dim([A]
DelVar EE[A]→[A]
ClrHome
2→S
Repeat E>=128 or getKey=45
For(B,1,S
Repeat not([A](Ans,1
randInt(1,16
End
Ans→[A](Ans,1
Output(1,Ans,"*
E+1→E
If E=128
S→B
End
Matr>list([A],8,L2
For(A,7,1,-1
Matr>list([A],A,L1
L1*(L1 and not(L2→L3
L1-Ans→L1
While max(L3
max(L3→B
0→L3(Ans
0→[A](B,A
Output(A,B," 
min(16,max(1,B+1-2int(2rand
If L2(Ans
B
Ans→[A](Ans,A+1
Ans→L2(Ans
Output(A+1,Ans,"*
End
L1→L2
End
End

EDIT: Forgot a line.

568

Axe / Re: [Axe] Plane deformations are fun

« on: January 26, 2014, 09:56:09 am »

That really does look cool! Is there a source? If I ever get free time, I might want to look into optimizing it with assembly or something.

569

TI Z80 / Re: Snow Demo

« on: January 26, 2014, 08:37:54 am »

Mine draws the flakes when they are newly added at the top of the screen, then it searches the matrix for snowflakes that can move down. It erases those ones and draws their new position.

Also, DJ_O, you didn't need the "1=" part of those If statements.

* Xeda112358 runs

To test for flakes that can move, I store the matrix rotated so that matrix column 1 is the first row of snowflakes on the homescreen. Each snowflake is then represented by the homescreen column it is in (so a flake in homescreen column 6 is stored as a 6 in the matrix). My code is then:

Code: [Select]

Matr>list([A],8,L2	;get the last row
For(A,7,1,-1
Matr>list([A],A,L1
L1*(L1 and not(L2→L3	;This checks if anything in L1 can move down into L2
		;"not(L2" leaves a 0 where there is already a flake, else 1 if empty
		;Then "L1 and not(L2" leaves a 1 if there is a snowflake above an empty space
L1-Ans→L1	;remove the snowflakes that can move down from L1
While max(L3
max(L3→B	;location of the snowflake furthest to the right
0→L3(Ans	;remove the snowflake from L3, the list of moveable flakes
0→[A](B,A	;remove it from the matrix
Output(A,B," 	;Erase it
min(16,max(1,B+1-2int(2rand	;randomly move left/right
If L2(Ans	;check if the space is occupied
B
Ans→[A](Ans,A+1	;write the snowflake to the new coordinate
Ans→L2(Ans	;mark the space as occupied in the lower row
Output(A+1,Ans,"*
End
L1→L2		;now this row becomes the new lower row
End
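
The list trick used to find the movable flakes can be sketched in Python like this. It only covers the detection step (no drawing and no random left/right move), and the function name is hypothetical.

```python
def split_movable(upper, lower):
    """Split one row into (movable, staying) flakes, given the row below it.

    Cells hold 0 for empty or the column number of the flake, as in the post.
    """
    # L1*(L1 and not(L2 : keep the column number where a flake sits above an empty cell
    movable = [u * int(bool(u) and not l) for u, l in zip(upper, lower)]
    # L1-Ans : remove the movable flakes from the upper row
    staying = [u - m for u, m in zip(upper, movable)]
    return movable, staying


upper = [1, 0, 3, 4]
lower = [0, 0, 3, 0]
print(split_movable(upper, lower))   # ([1, 0, 0, 4], [0, 0, 3, 0])
```

Storing the column number instead of a plain 1 is what lets max( return the position of a movable flake directly.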

570

TI Z80 / Re: ORG: online Z80 IDE and assembler

« on: January 26, 2014, 08:21:08 am »

Is that due to things like me using ORG a few dozen times in a short period of time to generate and download files?
