Author Topic: Efficient Computation of the Complex Exponential Function (Read 9493 times)

Xeda112358 · « **on:** December 03, 2014, 12:02:16 pm »

Overview
I was reading the BKM algorithm and I thought it was similar to my previous attempt at making an algorithm for the complex exponential function. However, I tried implementing it and failed miserably. So I altered my previous algorithm with an idea from the BKM algorithm, and I made my own algo that works as quickly, uses a smaller LUT, works on a wider range of values, and has a more straightforward implementation.

Why is the complex exponential function useful?
The complex exponential function can be evaluated as ##e^{x+iy}=e^{x}\left(\cos(y)+i \sin(y)\right)##. So for people who actually need a complex exponential function, this is useful, but it can also compute the real exponential function(take y=0), or real sine and cosine, simultaneously (take x=0).

Goals of the algorithm
Although I only advertise my algorithm as working on the rectangle [-1,1]+[-1,1]i, it actually works on approximately (-1.754365792,1.754365792)+(-1.213427912,1.213427912)i. Feel free to use this to your advantage.
As well, since shifting and adding are easy^{citation needed}, this is a shift and add algorithm. It takes approximately n iterations for n bits of accuracy, and 8 tables of n/2 floats (versus 18 tables suggested by the BKM algorithm).

The Algorithm
We will first give simple pseudo code, then proceed to optimize it.
As in my old algorithm and the BKM algorithm, we start by observing that e^z-s=e^ze^-s=e^z/e^s. So then if z-a-b-c-d-...=0, then we have e^z-a-b-c-...=e^z/(e^ae^be^c...), so then 1=e^z/(e^ae^be^c...), so e^ae^be^c...=e^z.

The trick is finding values for a,b,c,... so that the multiplications are "easy" as in, comprised of a shift and add.

Since we are working with complex numbers, take the complex number d where the real and imaginary parts are in the set {-1,0,1}. Multiplication by those values is even more trivial than a shift and add. What we will do is take e^a=1+d2^-n. Multiplication by 2^-n is a right shift by n.

If e^a=1+d2^-n, then log(e^a)=log(1+d2^-n), so a=log(1+d2^-n). There are 9 possible values for d, so you might think, "oh no Zeda, we need 9 tables of complex numbers which is basically 18 tables of floats," but we all know that math+programming=magic^{citation not needed}. So let's look at how the complex logarithm is defined and call the real part of d, s, and the imaginary part t. So d=s+t*i. By the way, references to "log" imply the natural logarithm in the math world, so log_e or ln() not log₁₀.
##\log(x+iy) = \log(\sqrt(x^{2}+y^{2}))+i\tan^{-1}(y/x)= \frac{1}{2}\log(x^{2}+y^{2})+i\tan^{-1}(\frac{y}{x})## (side note: atan has infinitely many valid outputs-- just add 2*k*pi for integers k, but we are only interested in k=0, the principle Arg).

So our tables would be with x=1+s2^-n, y=t2^-n.

Spoiler For tables for n:

But there are lots of repeated values, so we actually need tables for:

Spoiler For Actual tables needed:

For our purposes, we will note that to choose s and t, we will use s=sign(iPart(Re(z)*2^m+1)), t=sign(iPart(Im(z)*2^m+1)) where iPart() returns the integer part of the input, and sign() returns 1 if positive, -1 if negative, and 0 if 0. Without further ado, the algorithm:

Code: [Select]

1.0→v
1.0→w
For k,0,n-1
    sign(x)→s
    sign(y)→t
    if |x|<2^(1-k)
        0→s
    if |y|<2^(1-k)
        0→t
    if s≠0 & t≠0
        v→a
        if t=0
            x+log(1+s2^-k)→x
            v+s*v>>k→v
        if s=0
            x+log(1+4^-k)→x
            y+t*atan(2^-k)→y
            v-tw>>k→v
            w+ta>>k→w
        if s=-1
            x+log((1-2^-k)^2+4^-k)→x
            y+t*atan(1/(2^k-1))→y
            v-v>>k-tw>>k→w
            w-w>>k+ta>>k→v
        if s=1
            x+log((1+2^-k)^2+4^-k)→x
            y+t*atan(1/(2^k+1))→y
            v+v>>k-tw>>k→w
            w+w>>k+ta>>k→v

The result is v+iw to n bits of accuracy. It seems complicated, but it is basically just taking care of each case for s and t. As well, we can multiply x, y by 2 each iteration so instead of checking the nth bit, we can check if they are less than 1. Then we would need to multiply each table entry by 2^k+1. For further optimization, we note that to 2m bits of accuracy, when k>m-1, log(1+d2^-(k+1))=.5*log(1+d2^-k) So we can basically reuse the last table value, divide by 2 each iteration, then multiplying by 2 to get our 2^-(k+1). But wait, x/2*2=x, so we can directly use the last value in each table!
To clean things up a bit, lets name the tables and use an index to address each value:

Spoiler For tables:

For the optimized code to 2m bits of accuracy,

Code: [Select]

1.0→v
1.0→w
For k,0,2m-1
    if k<m
        k→n
    x<<1→x
    y<<1→y
    sign(x)→s
    sign(y)→t
    if |x|<1
        0→s
    if |y|<1
        0→t
    v→a
    if s=0
        if t=-1
            x+table6[n]→x
            y-table7[n]→y
            v+w>>k→v
            w-a>>k→w
        if t=1
            x+table6[n]→x
            y+table7[n]→y
            v-w>>k→w
            w+a>>k→v
    if s=-1
        if t=-1
            x+table0[n]→x
            y-table2[n]→y
            v-v>>k+w>>k→w
            w-w>>k-a>>k→v
        if t=0
            x+table4[n]→x
            v-v>>k→w
            w-w>>k→v
        if t=1
            x+table0[n]→x
            y+table2[n]→y
            v-v>>k-w>>k→w
            w-w>>k+a>>k→v
    if s=1
        if t=-1
            x+table1[n]→x
            y-table3[n]→y
            v+v>>k+w>>k→w
            w+w>>k-a>>k→v
        if t=0
            x+table1[n]→x
            v+v>>k→w
            w+w>>k→v
        if t=1
            x+table1[n]→x
            y+table3[n]→y
            v+v>>k-w>>k→w
            w+w>>k+a>>k→v

Summary:
For floating point math with floats in base 2, arbitrary shifting is trivial (just change the exponent). So each iteration requires at most 6 trivial shifts, 6 trivial compares, 6 non-trivial additions/subtractions. So for 2m bits of precison, the total speed cost is 12mA+epsilon, where A is the speed of an addition or subtraction, and epsilon is a small, positive number. The total space cost is 8m Floats. Compared to the BKM algorithm, the speed is the same (with just a change in epsilon), but the BKM uses 18m Floats, so we save 10 Floats. We also work on a larger space than the BKM algorithm which works on a trapezoid region of the complex pain, entirely contained in the set (a proper subset) on which this algorithm works.

Future Plans
I plan to write this up more formally, including the proofs as to why we can use m elements for 2m bits of accuracy, why we can just look at the sign and the nth bits of the real and complex component of the input to choose our d. This last bit was a difficult task for the creator of the BKM algorithm and in their paper, they mentioned that the bounds were theoretical and only as tight as they could manage. In practice they found that the bounds were loose and they suspected that they could expand their bounds, but they couldn't prove it. I found an easy way to prove this for my algorithm which I suspect will apply to the BKM algorithm since at this point they are nearly identical.

Gosh I hope I don't have typos, formatting issues, or errors in the algorithms I posted. I have been typing and revising this for hours .__.

TheCoder1998 · « **Reply #1 on:** December 03, 2014, 12:15:44 pm »

This is brilliant!
I really enjoy reading your posts

Xeda112358 · « **Reply #2 on:** December 03, 2014, 12:30:19 pm »

Thanks! Next I plan to make a similar algorithm for the complex logarithm using the same LUTs. It's pretty similar.

EDIT: Complex log will offer a logarithm function (real and complex) as well as an arctangent function (real).

Xeda112358 · « **Reply #3 on:** December 08, 2014, 01:52:53 pm »

So for the actual, cleaned up code:

First, take the following tables:
##L1[k]:=2^{k}\log((1-2^{-k})^{2}+4^{-k})##
##L2[k]:=2^{k+1}\log(1-2^{-k})##
##L3[k]:=2^{k}\log(1+4^{-k})##
##L4[k]:=2^{k}\log((1+2^{-k})^{2}+4^{-k})##
##L5[k]:=2^{k+1}\log(1+2^{-k})##
##A1[k]:=2^{k+1}\arctan(\frac{1}{2^{k}-1})##
##A2[k]:=2^{k+1}\arctan(2^{-k})##
##A3[k]:=2^{k+1}\arctan(\frac{1}{2^{k}+1})##

Code: [Select]

1→v
0→w
    for k,0,2m-1:
        if k<m:k→n
        2x→x
        2y→y
        sign(x)→s
        sign(y)→t
        if |x|<1:0→s
        if |y|<1:0→t
        v→a
        if s=-1
            if t=-1
                x-L1[n]→x
                y+A1[n]→y
                v-v>>k+w>>k→v
                w-w>>k-a>>k→w
            if t=0
                x-L2[n]→x
                v-v>>k→v
                w-w>>k→w
            if t=1
                x-L1[n]→x
                y-A1[n]→y
                v-v>>k-w>>k→v
                w-w>>k+a>>k→w
        if s=0
            if t=-1
                x-L3[n]→x
                y+A2[n]→y
                v+w>>k→v
                w-a>>k→w
            if t=1
                x-L3[n]→x
                y-A2[n]→y
                v-w>>k→v
                w+a>>k→w
        if s=1
            if t=-1
                x-L4[n]→x
                y+A3[n]→y
                v+v>>k+w>>k→v
                w+w>>k-a>>k→w
            if t=0
                x-L5[n]→x
                v+v>>k→v
                w+w>>k→w
            if t=1
                x-L4[n]→x
                y-A3[n]→y
                v+v>>k-w>>k→v
                w+w>>k+a>>k→w

It might also be convenient to move all the v,w calculations outside of the cases, since they are all essentially the same with components multiplied by s or t which are just multiplications by {1,0,-1}.

Code: [Select]

1→v
0→w
    for k,0,2m-1:
        if k<m:k→n
        2x→x
        2y→y
        sign(x)→s
        sign(y)→t
        if |x|<1:0→s
        if |y|<1:0→t
        v→a
        v+sv>>k-tw>>k→v
        w+sw>>k+ta>>k→w
        if s=-1
            if t=-1
                x-L1[n]→x
                y+A1[n]→y
            if t=0
                x-L2[n]→x
            if t=1
                x-L1[n]→x
                y-A1[n]→y
        if s=0
            if t=-1
                x-L3[n]→x
                y+A2[n]→y
            if t=1
                x-L3[n]→x
                y-A2[n]→y
        if s=1
            if t=-1
                x-L4[n]→x
                y+A3[n]→y
            if t=0
                x-L5[n]→x
            if t=1
                x-L4[n]→x
                y-A3[n]→y

Also, in the last m iterations, can be even more optimized if you are using a binary floating point format for x and y. Convert the mantissas so that you have m*2^0 (so 1.1011*2^-5 would have a mantissa of 110110000.... but would be converted to 000011011000...). Now use these as your new, converted x and y value, treating them as ints. If the function "MSb(x)" returns the most significant bit in x (so for 64-bit numbers, bit 63):

Code: [Select]

sign(x)→b
sign(y)→c
convert(|x|)→x
convert(|y|)→y
    for k,m,2m-1:
        MSb(x)→s
        MSb(y)→t
        x<<1→x
        y<<1→y
        v→a
        v+s*b*v>>k-t*c*w>>k→v
        w+s*b*w>>k+t*c*a>>k→w

In the last part, s,t are either 0 or 1, and b,c are either -1, 0, or 1. If any of them are 0, of course the algorithm can be optimized further (just plug in 0 to see how it reduces). For example, suppose x=0 upon entering:

Code: [Select]

    for k,m,2m-1:
        MSb(y)→t
        y<<1→y
        v→a
        v-t*c*w>>k→v
        w+t*c*a>>k→w

ivelegacy · « **Reply #4 on:** September 02, 2016, 10:10:02 am »

hi Xeda112358
have you already implemented this algorithm in C/Matlab ?
here I have already implemented CORDIC.sin-cos in VHDL
and tested on RTL simulator and fpga

let me know

Xeda112358 · « **Reply #5 on:** September 03, 2016, 02:50:39 pm »

Quote from: ivelegacy on September 02, 2016, 10:10:02 am

hi Xeda112358
have you already implemented this algorithm in C/Matlab ?
here I have already implemented CORDIC.sin-cos in VHDL
and tested on RTL simulator and fpga

let me know

I have not. Are you the person who emailed me? I was about to point you here since I'm too exhausted to work on this at the moment (worked 16+ hours in a 24 hour span of time).

Author Topic: Efficient Computation of the Complex Exponential Function (Read 9493 times)

Xeda112358

Efficient Computation of the Complex Exponential Function

TheCoder1998

Re: Efficient Computation of the Complex Exponential Function

Xeda112358

Re: Efficient Computation of the Complex Exponential Function

Xeda112358

Re: Efficient Computation of the Complex Exponential Function

ivelegacy

Re: Efficient Computation of the Complex Exponential Function

Xeda112358

Re: Efficient Computation of the Complex Exponential Function