Author Topic: 24 bit multiplication  (Read 18096 times)

0 Members and 1 Guest are viewing this topic.

Offline ACagliano

  • LV8 Addict (Next: 1000)
  • ********
  • Posts: 919
  • Rating: +32/-2
    • View Profile
    • ClrHome Productions
Re: 24 bit multiplication
« Reply #15 on: December 08, 2011, 03:11:20 pm »
Well I am almost positive that using my algorithm would require the use of RAM, even if you used all the shadow registers, too.
Can you help me, conceptually with doing math higher than 2 bytes? Maybe I'll write it myself.

Offline Xeda112358

  • they/them
  • Moderator
  • LV12 Extreme Poster (Next: 5000)
  • ************
  • Posts: 4704
  • Rating: +719/-6
  • Calc-u-lator, do doo doo do do do.
    • View Profile
Re: 24 bit multiplication
« Reply #16 on: December 08, 2011, 03:54:25 pm »
Okay, here is some pseudo code for an algorithm that will always work and is fast (each cycle gives the next digit):
Code: [Select]
    0→A
     0→D          ;This is where the result will go
     0→C          ;This is a carry
     For(B,1,12)  ;12 is half the number of input bits
     D+D→D        ;We want to shift D left
     D+D+1→C      ;The difference of consecutive squares?
     A<<E         ;rotate E left, then rotate the carry into A
     A<<E         ;In z80, "rlc e \ rla"
     If A>=C      ;If A is greater than or equal to C
       D+1→D
       A-C→A
     End
;You will need an input       (24 bits in the above example)
;You will need an output      (12 bits in the above example)
;You will need a counter      (4 bits in the above example)
;You will need an accumulator (24 bits in the above example)
;You will need a carry        (13 bits in the above example)
;
;You will need at least       (77 bits in the above example)
So for a simple 8-bit implementation in Z80 code:
Code: [Select]
;===============================================================
sqrtE:
;===============================================================
;Input:
;     E is the value to find the square root of
;Outputs:
;     A is E-D^2
;     B is 0
;     D is the result
;     E is not changed
;     HL is not changed
;Destroys:
;     C=2D+1 if D is even, 2D-1 if D is odd

        xor a               ;1      4         4
        ld d,a              ;1      4         4
        ld c,a              ;1      4         4
        ld b,4              ;2      7         7
sqrtELoop:
        rlc d               ;2      8        32
        ld c,d              ;1      4        16
        scf                 ;1      4        16
        rl c                ;2      8        32

        rlc e               ;2      8        32
        rla                 ;1      4        16
        rlc e               ;2      8        32
        rla                 ;1      4        16

        cp c                ;1      4        16
        jr c,$+4            ;4    12|15      48+3x
          inc d             ;--    --        --
          sub c             ;--    --        --
        djnz sqrtELoop      ;2    13|8       47
        ret                 ;1     10        10
If you want to refine the accuracy to round to the nearest integer, you can add this code to the end of the Z80 code:
Code: [Select]
       cp d                ;1      4         4
        jr c,$+3            ;3    12|11     12|11
          inc d             ;--    --        --
It takes advantage of the linear differences of a quadratic (meaning (a(x+1)2+b(x+1)+c)-(ax2+bx+c) is a lnear equation)

Anywho, I am being told I need to finish this up, so I hope this helps until I can find time later >.>

Offline ACagliano

  • LV8 Addict (Next: 1000)
  • ********
  • Posts: 919
  • Rating: +32/-2
    • View Profile
    • ClrHome Productions
Re: 24 bit multiplication
« Reply #17 on: December 08, 2011, 03:57:06 pm »
Ok, I'm confused. I'll leave this to the more advanced programmers for now.

Offline thepenguin77

  • z80 Assembly Master
  • LV10 31337 u53r (Next: 2000)
  • **********
  • Posts: 1594
  • Rating: +823/-5
  • The game in my avatar is bit.ly/p0zPWu
    • View Profile
Re: 24 bit multiplication
« Reply #18 on: December 08, 2011, 07:40:32 pm »
Edit2:
  Dang, I need to read closer. Oh well, at least mine is way bigger than jacobly's.

These things actually aren't that difficult to write if you understand the principle by which they work.

Code: [Select]

multBCDbyEHL:
ld a, b
ld b, c
ld c, d
ld d, a
multDBCbyEHL:

push de
push hl

ld ix, $8000
ld (ix), l
xor a
ld h, a
ld l, a

call do8Bits

ld (ix+1), a
pop af
ld (ix), a
push de
ld a, (ix+1)

call do8Bits

ld (ix+1), a
pop af ;least sig number
ex de, hl
ex (sp), hl ;D and 2nd least sig in for new number
ld (ix), l
pop hl ;D and 2nd least sig
push af ;least sig number
push hl ;2nd least sig number
ex de, hl
ld a, (ix+1)

call do8Bits

ld b, a
ld c, h
ld d, l
pop hl
ld a, l
pop hl
ld h, a
ret





;####
;input: DBC = 1 number
; AHL = running number
; (ix) = to multiply by
;output: AHLE = output
; E is done
; DBC = 1 number


do8Bits:
ld (ix+1), 8
loop:
srl (ix)
jr nc, skip

add hl, bc
adc a, d
skip:
rra
rr h
rr l
rr e
dec (ix+1)
jr nz, loop
ret

I had to use 2 bytes of memory, so sorry about that. If you don't like it, you could just replace (ix) with ixl and (ix+1) with ixh and it would be ram independent, but then it wouldn't run on the Nspire.

Also, to my extreme delight, this thing worked on the first try ;D

(The result is in BCDEHL)

Edit:
   Added to wikiti so people will forever be able to multiply large numbers.
« Last Edit: December 08, 2011, 08:03:57 pm by thepenguin77 »
zStart v1.3.013 9-20-2013 
All of my utilities
TI-Connect Help
You can build a statue out of either 1'x1' blocks or 12'x12' blocks. The 1'x1' blocks will take a lot longer, but the final product is worth it.
       -Runer112

Offline jacobly

  • LV5 Advanced (Next: 300)
  • *****
  • Posts: 205
  • Rating: +161/-1
    • View Profile
Re: 24 bit multiplication
« Reply #19 on: December 08, 2011, 08:35:58 pm »
My first multiplication routine takes 2746 - 4570 cycles, the second takes 1680 - 2880 cycles.
Edit: And yes, they are 24*24 bit integer multiplication routines (floating point routines are much more complicated :P).

As for square root, I should be able to do it using only af bc de hl ix (sp), or af bc de hl ix iy, just like the multiplication. Just give me a couple hours to port it to 48-bit.

Edit:
uses af bc de hl ix iy (for two bytes of memory)
84 bytes, 9186-9258 cycles
Code: [Select]
; ahl = sqrt(hldebc)
push bc ; ld c,ixl
pop ix ; ld b,ixh
push de
ld c,l
ld a,h
ld hl,0
ld b,h
ld e,l
ld d,h
ld (iy+asm_Flag1),d
ld (iy+asm_Flag2),24
Loop:
cp $40
push af
sbc hl,de
ld a,b
sbc a,(iy+asm_Flag1)
jr c,Restore
ld b,a
pop af
sub $40
scf
jr Skip
Restore:
pop af
adc hl,de
or a
Skip:
rl e
rl d
rl (iy+asm_Flag1)
add ix,ix
ex (sp),hl
adc hl,hl
ex (sp),hl
rl c
rla
adc hl,hl
rl b
add ix,ix
ex (sp),hl
adc hl,hl
ex (sp),hl
rl c
rla
adc hl,hl
rl b
dec (iy+asm_Flag2)
jr nz,Loop
pop hl
ld a,(iy+asm_Flag1)
ret

uses af bc b' de de' hl hl' ix
74 bytes, 5985-6777 cycles
Code: [Select]
; dea = sqrt(hldebc)
di
push bc ; ld c,ixl
pop ix ; ld b,ixh
ld bc,$40
ld a,l
ld l,h
ld h,b
exx
ld de,0
ld l,e
ld h,d
ld b,24
or a
Loop:
exx
sbc hl,bc
exx
sbc hl,de
jr nc,Skip
exx
add hl,bc
exx
adc hl,de
Skip:
exx
ccf
rl b
exx
rl e
rl d
exx
add ix,ix
rl e
rl d
rla
adc hl,hl
exx
adc hl,hl
exx
add ix,ix
rl e
rl d
rla
adc hl,hl
exx
adc hl,hl
djnz Loop
exx
ld a,b
exx
ei
ret
« Last Edit: December 08, 2011, 11:49:21 pm by jacobly »

Offline Xeda112358

  • they/them
  • Moderator
  • LV12 Extreme Poster (Next: 5000)
  • ************
  • Posts: 4704
  • Rating: +719/-6
  • Calc-u-lator, do doo doo do do do.
    • View Profile
Re: 24 bit multiplication
« Reply #20 on: December 09, 2011, 07:39:22 pm »
Okay, so this isn't too helpful, but it is an idea: I came up with a very simple algorithm that gets very accurate very fast (12 cycles for 16-bit numbers). On the Z80 it isn't that fast since it requires multiplication and division, but the algorithm looks like this:
Code: [Select]
B/2→A             ;start with anything, really
For(C,1,12
(A+B/A)/2→A
End

To implement this in assembly code, you will want to multiply the input by 4 (shifting left twice) and then divide the output by 2. Then if you want added precision, if there is a carry, increment by 1. Again, in z80, this algorithm isn't really worth it since it isn't as speed efficient as other methods, but it can be easier to grasp :) Plus, if you move on to coding for devices with the ability to multiply and divide, you will have a really tiny algorithm that uses a counter, input, and output as opposed to a bajillion registers.

Offline jacobly

  • LV5 Advanced (Next: 300)
  • *****
  • Posts: 205
  • Rating: +161/-1
    • View Profile
Re: 24 bit multiplication
« Reply #21 on: December 10, 2011, 02:45:24 am »
Actually, my routine only uses about 3 virtual registers. The only reason it uses so many z80 registers is because the virtual registers are so big. In fact, the first routine uses the same number of bits as your's would (don't forget that A and B in your routine would each be 48 bits wide). However, your routine does have the advantage of using similarly sized registers (and fewer iterations), so it probably would be more useful on other processors.

Offline cerzus69

  • LV2 Member (Next: 40)
  • **
  • Posts: 27
  • Rating: +6/-0
    • View Profile
Re: 24 bit multiplication
« Reply #22 on: December 10, 2011, 08:26:06 am »
Hey, jacobly, I tested both of your multiplication routines and it doesn't seem like they're doing the same thing... I've tried both of them in my program but they have different results.
« Last Edit: December 10, 2011, 09:15:25 am by cerzus69 »

Offline ACagliano

  • LV8 Addict (Next: 1000)
  • ********
  • Posts: 919
  • Rating: +32/-2
    • View Profile
    • ClrHome Productions
Re: 24 bit multiplication
« Reply #23 on: December 10, 2011, 09:24:34 am »
Ok, I'm actually going to be sticking with just two bytes for position and scrapping the sector. That means I'll need to do 2-byte subtraction (no problem), square a 2-byte (rendering a max 4-byte), then add three 4-bytes (do you only need a 4-byte output or should you go up to 5), then square root that. That should be easier.

√( (x2-x1)^2 + (y2-y1)^2 + (z2-z1)^2)

√( (16bit-16bit)^2 + (16bit-16bit)^2 + (16bit-16bit)^2)


Can someone explain to me the theory behind multiplying/dividing/sqrt'ing numbers? Is there a standard theorem, regardless of the bit-size of the number?
« Last Edit: December 10, 2011, 10:26:23 am by ACagliano »

Offline Xeda112358

  • they/them
  • Moderator
  • LV12 Extreme Poster (Next: 5000)
  • ************
  • Posts: 4704
  • Rating: +719/-6
  • Calc-u-lator, do doo doo do do do.
    • View Profile
Re: 24 bit multiplication
« Reply #24 on: December 10, 2011, 12:25:16 pm »
Okay, so multiplying and dividing you probably actually know-- you just don't know it. So you remember in grade school learning how to multiply large numbers:
Code: [Select]
  432
  x27
-----
Well that is actually the best method and fastest. What you do is multiply 2*432. THen you shift the digits left and throw a zero at the end. Then you multiply 432*7 and add the two values together. Voila, multiplication :) Now in binary, math is almost always easier. Because you are using 0 and 1, multiplication is easy:
Code: [Select]
11000101
11010111
so you take the last digit of the second number and you get 1. Multiply 1 times the first and you get 11000101. Now shift that left and check the next bit of the bottom number. If that is 1, add it to the accumulator (the accumulator in this case is the running total, not necessarily the a register).
So here is what it looks like in assembly code:
Code: [Select]
;D*E
     xor a       ;This is the accumulator
     ld b,8      ;this is the counter
MultLoop:
     add a,a     ;this is to rotate the accumulator left. Use rlca, too
     rlc e       ;This puts the next bit in E in the carry
     jr nc,$+3   ;If the bit in E was not 1, you add 0 (so don't add!)
       add a,d   ;1*D=D, so add D. Binary makes math easy.
     djnz MultLoop
     ret

When it comes to division in z80, you are pretty much just doing long division. Finding the square root, however, is something you probably didn't do in school since we have calculators and it was taken out of the school curriculum (I sound old D:). Anywho, I learned how to do it in decimal from a textbook from the 1950's and then I extended it to binary since it is much, much easier in binary. It is difficult to explain in a post (you kind of need to see it to understand it), but here is an excellent site:
http://www.homeschoolmath.net/teaching/square-root-algorithm.php

The second method is what you will want to use as it gives each consecutive digit every cycle. If you can understand that well, you will probably see why it is so much easier in binary and how to create a z80 algorithm :)


Offline jacobly

  • LV5 Advanced (Next: 300)
  • *****
  • Posts: 205
  • Rating: +161/-1
    • View Profile
Re: 24 bit multiplication
« Reply #25 on: December 10, 2011, 06:13:47 pm »
Hey, jacobly, I tested both of your multiplication routines and it doesn't seem like they're doing the same thing... I've tried both of them in my program but they have different results.
They both have different inputs. :)
The first is hla * cde, and the second is hlc * bde.
You probably want to add some code to the beginning of each so that the input works better for what you are doing.
For example:

   ld   a,c
   ld   c,b

at the beginning of the first routine causes them to have the same input (hlc and bde).

Edit: Here's some pseudo code that might help.
Code: [Select]
// Multiply a times b
temp = 0
repeat for each bit in a
temp <<= 1
if (high bit of a set)
temp += b
a <<= 1
return temp

// Divide a by b
temp = 0
repeat for each bit in a
temp <<= 1
temp += high bit of a
a <<= 1
if (temp >= b)
temp -= b
set low bit of a
return a

// Sqrt a
temp = 0
b = 0
repeat for every 2 bits in a
temp += high 2 bits of a
a <<= 2
test = b << 2 + 1
b <<= 1
if (temp >= test)
temp -= test
set low bit of b
return b
// Sqrt a, sometimes better with multiple-of-a-byte registers
temp = high byte of a
a <<= 8
b = 0
repeat for every 2 bits in a
test = b << 8 + 0x40
b <<= 1
if (temp >= test)
temp -= test
set low bit of b
temp += high 2 bits of a
a <<= 2
return b
The tricky part is figuring out how many bits are in each variable and allocating the z80 registers accordingly.
« Last Edit: December 10, 2011, 06:54:42 pm by jacobly »

Offline ACagliano

  • LV8 Addict (Next: 1000)
  • ********
  • Posts: 919
  • Rating: +32/-2
    • View Profile
    • ClrHome Productions
Re: 24 bit multiplication
« Reply #26 on: December 11, 2011, 11:31:56 am »
Ok, so let me just get something straight...

 00100110
x11011001
-------------
1. 00100110
2. 0,01001100
3. 00,10011000
4. 001,00110001
5. 0010,01100011
6. 00100,11000110
7. 001001,10001101
8. 0010011,00011011

Thus, the answer is %00010011 %00011011

Now, let's check: 38 x 217 = 4891  XX it's wrong??

Offline Xeda112358

  • they/them
  • Moderator
  • LV12 Extreme Poster (Next: 5000)
  • ************
  • Posts: 4704
  • Rating: +719/-6
  • Calc-u-lator, do doo doo do do do.
    • View Profile
Re: 24 bit multiplication
« Reply #27 on: December 11, 2011, 11:55:42 am »
Hmm, here is the algorithm in decimal:
Code: [Select]
333
x471
----
1) 4*333
   +Acc  = 1332
   Acc*10= 13320
2) 7*333
   +Acc  = 15651
   Acc*10= 156510
3) 1*333
   +Acc  = 156843
Here is the algorithm in binary:
Code: [Select]
00100110
x11011001
---------------
1) 1x00100110
     +Acc = 00100110
     Acc*2= 01001100
2) 1x00100110
     +Acc = 01001100+00100110=01110010
     Acc*2= 11100100
3) 0x00100110
     +Acc = 11100100
     Acc*2= 111001000
4) 1x00100110
     +Acc = 111101110
     Acc*2= 1111011100
5) 1x00100110
     +Acc = 10000000010
     Acc*2= 100000000100
6) 0x00100110
     +Acc = 100000000100
     Acc*2= 1000000001000
7) 0x00100110
     +Acc = 1000000001000
     Acc*2= 10000000010000
8) 1x00100110
     +Acc = 10000000110110

Offline ACagliano

  • LV8 Addict (Next: 1000)
  • ********
  • Posts: 919
  • Rating: +32/-2
    • View Profile
    • ClrHome Productions
Re: 24 bit multiplication
« Reply #28 on: December 11, 2011, 12:47:33 pm »
Ok, I think I get it.

You move through the second register. If the current bit is 1, you add 1 x a to a, then rla a. If the current bit is 0, just rla a.

Offline Xeda112358

  • they/them
  • Moderator
  • LV12 Extreme Poster (Next: 5000)
  • ************
  • Posts: 4704
  • Rating: +719/-6
  • Calc-u-lator, do doo doo do do do.
    • View Profile
Re: 24 bit multiplication
« Reply #29 on: December 11, 2011, 01:07:43 pm »
Yes :) Typically you do rla, then check the next bit and if it is set, add the other register :)

The rla at the beginning works because a starts at 0 so 0*0=0.