L_sqrd:
;Input: L
;Output: L*L->A
;159 t-states
;39 bytes
    ld h,l
    ld c,l
    rr h
    sbc a,a
    xor l
    add a,c
    ld c,a
    rl l
    rr h
    sbc a,a
    xor l
    and %11111000
    add a,c
    ld c,a
    rl l
    rr h
    sbc a,a
    xor l
    and %11100000
    add a,c
    ld c,a
    rl l
    ld a,h
    rrca
    xor l
    and 128
    xor c
    neg
    ret
=(hhhhhhhh^abcdefgh)+(gggggg00^bcdefg00)+(ffff0000^cdef0000)+(ee000000^de000000)
=(hhhhhhhh^abcdefgh)+(ggggg000^bcdef000)+(fff00000^cde00000)+(e0000000^d0000000)
=((hhhhhhhh^abcdefgh)+(ggggg000^bcdef000)+(fff00000^cde00000))^e0000000^d0000000

^ is XOR
+ is addition modulo 256
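The formula can be checked against every 8-bit input with a small model. This is only a sketch in Python, not part of the Z80 routine. It assumes, as the routine above does, that the terms are added to a copy of L and the total is negated at the end. The names low_byte_square and fill are made up for illustration.

```python
def low_byte_square(n: int) -> int:
    """Low 8 bits of n*n, built from the shift/XOR/mask terms of the formula."""

    def fill(bit: int) -> int:
        # Models SBC A,A after the bit was rotated into carry:
        # 0xFF if that bit of n is set, otherwise 0x00.
        return 0xFF if (n >> bit) & 1 else 0x00

    total = n                                   # the routine seeds C with L
    total += fill(0) ^ n                        # hhhhhhhh ^ abcdefgh
    total += (fill(1) ^ (n << 1)) & 0xF8        # ggggg000 ^ bcdef000
    total += (fill(2) ^ (n << 2)) & 0xE0        # fff00000 ^ cde00000
    # Only bit 7 is left, and adding into the top bit modulo 256 is an XOR.
    total ^= (fill(3) ^ (n << 3)) & 0x80        # e0000000 ^ d0000000
    return -total & 0xFF                        # NEG, kept to 8 bits


# Compare with ordinary multiplication for every possible input.
mismatches = [n for n in range(256) if low_byte_square(n) != (n * n) & 0xFF]
print("mismatches:", mismatches)
```

Python integers do not wrap, so the model lets the sum grow and masks to 8 bits only at the end; that matches working modulo 256 throughout.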
L_sqrd:
;Input: L
;Output: L*L->A
;151 t-states
;37 bytes
    ld h,l
;First iteration, get the lowest 3 bits
    sla l
    rr h
    sbc a,a
    or l
;second iteration, get the next 2 bits
    ld c,a
    rr h
    sbc a,a
    xor l
    and $F8
    add a,c
;third iteration, get the next 2 bits
    ld c,a
    sla l
    rr h
    sbc a,a
    xor l
    and $E0
    add a,c
;fourth iteration, get the last bit
    ld c,a
    ld a,l
    add a,a
    rrc h
    xor h
    and $80
    xor c
    neg
    ret
A_sqrd:
;Input: A is an x-bit number (for example 8-bit, 16-bit, etc.)
;Output: A*A
    A<<1→B                          ;Store A shifted once into B.
    0→ACC                           ;Initialise the accumulator with 0
    1<<(x-1)→MASK                   ;Set only the top bit of MASK
    For (x-1) Iterations:
        ACC<<1→ACC                  ;shift ACC (same as *2 or ACC+ACC)
        B<<1→B                      ;shift left once, keep carry
        ((SBC(0,0)^A)&MASK)+ACC→ACC ;SBC(0,0) is "subtract with carry" and will yield either -1 (carry flag set) or 0 (carry flag reset).
        (MASK>>1)|MASK→MASK         ;arithmetic right shift
    Return ((A<<x)-A)-ACC           ;ACC holds (2^x-1)*A-A*A at this point
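To make the pseudocode concrete, here is a rough Python sketch of the same loop. It is not a drop-in for any of the Z80 routines. The name square_by_masked_xor and the default width of 8 are my own choices, and "B<<1, keep carry" is modelled by reading the bits of A from bit x-2 down to bit 0 rather than by keeping a separate B register.

```python
def square_by_masked_xor(a: int, x: int = 8) -> int:
    """Square an x-bit unsigned integer following the A_sqrd pseudocode."""
    top = 1 << (x - 1)           # top bit of an x-bit register
    ones = (1 << x) - 1          # x-bit register with every bit set (the "-1" case)
    acc = 0                      # wide accumulator; Python ints never overflow
    mask = top                   # MASK starts with only the top bit set
    for i in range(x - 1):
        acc <<= 1
        # Carry out of B<<1: bit x-2 of a on the first pass, then x-3, ... down to bit 0.
        carry = (a >> (x - 2 - i)) & 1
        sbc = ones if carry else 0           # SBC(0,0): all ones if carry is set, else 0
        acc += (sbc ^ a) & mask
        mask = (mask >> 1) | top             # arithmetic right shift within x bits
    # acc is now (2^x - 1)*a - a*a, so undo that to get a*a.
    return ((a << x) - a) - acc


print(square_by_masked_xor(13))          # 8-bit input
print(square_by_masked_xor(1000, 16))    # 16-bit input
```

Running it over all 8-bit inputs and a spread of 16-bit inputs is a quick way to confirm that the final correction, ((A<<x)-A)-ACC, is right.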
L_sqrd:
;Input: A
;Output: A*A->HL
;It's really slow compared to a generic 8-bit multiplication
;It's also pretty big
;But it is a novel approach!
    ld e,a          ;this just stays a constant for the algorithm
    rlca
    ld d,a          ;this will be rotated to get the bits of the input
    ld hl,0         ;the accumulator (H has to start at 0 as well as L)
    ld bc,0780h     ;b is the loop counter (7), c is a mask
loop:
    add hl,hl
    rlc d
    sbc a,a
    xor e
    and c
    add a,l
    ld l,a
    jr nc,$+3
    inc h
    sra c
    djnz loop
;HL is 255*input-input*input
;add input
    ld c,e          ;b is 0 after djnz, so bc=input
    add hl,bc
;HL is 256*input-input*input
;negate HL
    xor a
    sub l
    ld l,a
    sbc a,a
    sub h
    ld h,a
;HL is input*input-256*input
;add input*256
    ld e,b          ;d has been rotated 8 times and holds the input again, so de=input*256
    add hl,de
;HL is input*input
    ret