Author Topic: Fermat Factoring (Zeda's Method)  (Read 3278 times)

Offline Xeda112358
Fermat Factoring (Zeda's Method)
« on: December 31, 2013, 03:45:50 pm »
As some have seen me discussing on IRC, I have my own rendition of Fermat factoring. Compared to what I understand to be the typical approach, my method asymptotically replaces a square-root operation with a subtraction and an increment. In complexity terms:

Square root is on the level of multiplication and division, so O(M(n))
Subtraction is O(n)
Increment, if my calculations are correct, is amortized O(1) (in the case of the algorithm described below, to increment by 2: invert bit 1; if that returns 0, invert bit 2, and so on up the carry chain; see the sketch below)
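
For illustration, here is a minimal Python sketch of that carry-rippling increment-by-2 on a little-endian bit list (the helper name increment_by_two is mine, purely for demonstration):
Code: [Select]
def increment_by_two(bits):
    """Add 2 to a little-endian bit list by flipping from bit 1 upward.

    Each 1 flipped to 0 carries into the next bit; the first 0 flipped
    to 1 stops the carry. Averaged over a long run of calls, only O(1)
    bits are touched per increment."""
    i = 1                          # bit 0 is untouched, so parity is preserved
    while i < len(bits) and bits[i] == 1:
        bits[i] = 0                # carry ripples past set bits
        i += 1
    if i == len(bits):
        bits.append(1)             # the number grew by one bit
    else:
        bits[i] = 1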

So while Fermat factoring isn't as efficient as the general number field sieve, the method I have is more efficient than a typical approach to Fermat factoring.

To describe the idea, notice that a^2-b^2=(a-b)(a+b). This means that, given our number c, if we can find a difference of two squares that equals c, then we have a pair of factors for c. For example, 10^2-5^2=75, so 75=(10-5)(10+5)=5*15. If c>1 and is a positive odd integer, it will always have a trivial solution (the differences between consecutive squares are 1,3,5,7,..., so every odd number is a difference of consecutive squares; that is the "trivial" one). Any other solution is non-trivial and implies that c is composite. If only the trivial solution exists, c is prime.
Spoiler For proof:
Assume c is composite. Then c=nm where n,m are natural numbers greater than 1. If we let n be the larger of the two, then we want:
n=a+b
m=a-b
Adding 2b to the second equation: m+2b=a-b+2b=a+b=n, so n-m=2b, giving:
(n-m)/2=b
Substituting into the first: n=a+(n-m)/2, so n-(n-m)/2=a, giving:
(n+m)/2=a
Since c is odd, n and m are both odd, so (n+m)/2 and (n-m)/2 are integers.

The trivial solution is a=(c+1)/2, b=(c-1)/2, factoring c as 1*c=c. If c is composite, the construction above gives a non-trivial solution. Now the other direction: if c is prime, it has no non-trivial solution. Indeed, if there were a non-trivial solution, with a not equal to (c+1)/2 and b not equal to (c-1)/2, then c=(a-b)(a+b) would be a factorisation with both factors greater than 1, which is a contradiction.

Summary:
If c is prime, it has only the trivial solution.
If c is composite, it has non-trivial solutions.
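
To make the summary concrete, here is a small brute-force Python check (the helper square_pairs is mine, for illustration only, and is far slower than the algorithm below):
Code: [Select]
import math

def square_pairs(c):
    """All (a, b) with a*a - b*b == c, for odd c > 1, by brute force."""
    sols = []
    for b in range((c + 1) // 2):
        a2 = c + b * b
        a = math.isqrt(a2)
        if a * a == a2:
            sols.append((a, b))
    return sols

print(square_pairs(75))   # [(10, 5), (14, 11), (38, 37)]: composite
print(square_pairs(13))   # [(7, 6)]: only the trivial solution, so prime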
The next few steps toward solving this are the typical approach, and I do the same thing. We notice that a>=sqrt(c).
Spoiler For proof:
Rewrite a^2-b^2=c as a^2-c=b^2. Then we see that a^2-c is non-negative, so:
a^2>=c,
a>=sqrt(c).
So then with our algorithm, we can initialise a=ceiling(sqrt(c)). That is the smallest that a can be, so now we enter the naive algorithm, checking whether the difference a^2-c is a perfect square:
Code: [Select]
ceiling(sqrt(c))→a
While notperfectsquare(a*a-c)
  a+1→a
EndWhile
sqrt(a*a-c)→b
Return {a-b,a+b}
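
For reference, a minimal Python sketch of this naive loop might look like the following (math.isqrt gives the exact integer square root; the function name fermat_naive is mine):
Code: [Select]
import math

def fermat_naive(c):
    """Naive Fermat factoring for odd c > 1: search for a such that
    a*a - c is a perfect square."""
    a = math.isqrt(c)
    if a * a < c:              # round up to ceiling(sqrt(c))
        a += 1
    while True:
        b2 = a * a - c
        b = math.isqrt(b2)     # a square root on every iteration
        if b * b == b2:
            return (a - b, a + b)
        a += 1

print(fermat_naive(75))        # (5, 15)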
However, the next step away from the naive approach is to notice that every time you add 1 to a, the difference a^2-c increases by 2a+1. So you can save some computation by keeping track of the difference and updating it at each iteration:
Code: [Select]
ceiling(sqrt(c))→a
a*a-c→b2
While notperfectsquare(b2)
  b2+2a+1→b2
  a+1→a
EndWhile
sqrt(b2)→b
Return {a-b,a+b}
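
In Python, again as a rough sketch under the same assumptions (odd c > 1; the name fermat_incremental is mine):
Code: [Select]
import math

def fermat_incremental(c):
    """Fermat factoring, maintaining b2 = a*a - c incrementally."""
    a = math.isqrt(c)
    if a * a < c:
        a += 1                 # a = ceiling(sqrt(c))
    b2 = a * a - c
    while True:
        b = math.isqrt(b2)     # still one square root per iteration
        if b * b == b2:
            return (a - b, a + b)
        b2 += 2 * a + 1        # (a+1)^2 - c = (a^2 - c) + 2a + 1
        a += 1

print(fermat_incremental(14111))   # (103, 137)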
However, that notperfectsquare() routine will likely require computing a square root at least once (twice if done naively). My approach takes the optimisation another step further by also keeping track of how far b^2 is below a^2-c. Note that if b^2 is the largest perfect square not exceeding a^2-c, then the next perfect square up is b^2+2b+1, so whenever the tracked difference (a^2-c)-b^2 reaches 2b+1 or more, we need to increment b and adjust the difference:
Code: [Select]
ceiling(sqrt(c))→a
a*a-c→diff
floor(sqrt(diff))→b
diff-b*b→diff
While diff>0
  diff+a+a+1→diff
  a+1→a
  While diff>=(2b+1)
    diff-b-b-1→diff
    b+1→b
  EndWhile
EndWhile
Return {a-b,a+b}
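
Here is a direct Python translation of the above, as a sketch (odd c > 1 assumed; the name fermat_zeda is mine):
Code: [Select]
import math

def fermat_zeda(c):
    """Fermat factoring tracking diff = (a*a - c) - b*b, so the loop
    body uses only additions, subtractions, and increments."""
    a = math.isqrt(c)
    if a * a < c:
        a += 1                     # a = ceiling(sqrt(c))
    diff = a * a - c
    b = math.isqrt(diff)           # the only square roots are up front
    diff -= b * b
    while diff > 0:
        diff += 2 * a + 1          # advance a by one
        a += 1
        while diff >= 2 * b + 1:   # advance b while b*b lags behind
            diff -= 2 * b + 1
            b += 1
    return (a - b, a + b)

print(fermat_zeda(14111))          # (103, 137)
print(fermat_zeda(75))             # (5, 15)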
As it is, the inner while loop converges to iterating one time per outer iteration pretty quickly (after an initial stretch bounded by roughly the square root of the difference between the initial a^2 and c). We can say, then, that the square root required at each iteration in the original is replaced by 1 subtraction and 1 increment.

Rewriting this for some more efficiency, we can do:
Code: [Select]
ceiling(sqrt(c))→a
a*a-c→diff
floor(sqrt(diff))→b
diff-b*b→diff
2a+1→a
2b+1→b
While diff>0
  diff+a→diff
  a+2→a
  While diff>=b
    diff-b→diff
    b+2→b
  EndWhile
EndWhile
floor(a/2)→a
floor(b/2)→b
Return {a-b,a+b}
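
The same rewrite in Python, again as a sketch (A and B are my names for the running odd numbers 2a+1 and 2b+1):
Code: [Select]
import math

def fermat_zeda_odd(c):
    """Same algorithm, but carrying A = 2a+1 and B = 2b+1 so the loop
    needs only +2 updates, which can be increments at the chip level."""
    a = math.isqrt(c)
    if a * a < c:
        a += 1
    diff = a * a - c
    b = math.isqrt(diff)
    diff -= b * b
    A, B = 2 * a + 1, 2 * b + 1
    while diff > 0:
        diff += A
        A += 2
        while diff >= B:
            diff -= B
            B += 2
    a, b = A // 2, B // 2          # floor halving recovers a and b
    return (a - b, a + b)

print(fermat_zeda_odd(75))         # (5, 15)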
At the chip level, those +2 can also be increments requiring O(1) time instead of additions requiring O(n) time. This means that each iteration of the algorithm requires about:
- 2 add/sub operations
- 2 increments
- 3 compares (that can be changed to simply checking for overflow, so these can be trivially removed)

And the maximum number of iterations is determined by how far the actual a value is from the original estimate. If we know c is an odd composite (as, for example, RSA moduli), then a pretty crummy worst-case upper bound is roughly c/6-sqrt(c) outer iterations, attained when c=3p, since the final a is then (p+3)/2.
Spoiler For Example 1:
So let's try to factor 14111. First, estimate a=119. Then 119^2-14111=50, so our initial b is 7, with a difference of 1 (50-7^2=1).
a=119, b=7, diff=1. Increment a, add 2*119+1 to diff
a=120, b=7, diff=240. 240>=7*2+1, so increment b, subtract 15 from diff
a=120, b=8, diff=225. increment b, subtract 17 from diff
a=120, b=9, diff=208. increment b, subtract 19 from diff
a=120, b=10, diff=189. increment b, subtract 21 from diff
a=120, b=11, diff=168. increment b, subtract 23 from diff
a=120, b=12, diff=145. increment b, subtract 25 from diff
a=120, b=13, diff=120. increment b, subtract 27 from diff
a=120, b=14, diff=93. increment b, subtract 29 from diff
a=120, b=15, diff=64. increment b, subtract 31 from diff
a=120, b=16, diff=33. increment b, subtract 33 from diff
a=120, b=17, diff=0. diff=0, so we have found a and b

So 14111=(120-17)(120+17)=103*137.
Spoiler For Example 2:
c=75. Then a=9 initially, 81-75=6, so b=2, diff=2
a=9, b=2, diff =2
a=10, b=2, diff =21
a=10, b=3, diff =16
a=10, b=4, diff =9
a=10, b=5, diff =0

So 75=(10-5)(10+5)=5*15.
Spoiler For More Notes:
I came up with this algorithm a few years ago while working on Grammer, before Grammer had Line() or Circle() drawing routines, during the summer (when I didn't have internet). Grammer had a Circle() routine long before Line() because the Circle() routine was derived from my Fermat factoring routine above. The routine above can be thought of as interpolating a curve of the form y=sqrt(x^2-c), compared to y=sqrt(z-x^2) for a circle, where z is the radius squared. However, the algorithm above stops when it lands exactly on a point (when its error is 0), whereas circle drawing requires the whole curve to be drawn. I believe that my method of drawing circles and lines is like the Bresenham method, so I made a tutorial saying as much. However, I haven't validated that, so I haven't officially released it.
0,1,4,9, Fermat, Bresenham, Circles, and Lines

My circle and line drawing routines take advantage of overflow (the carry flag) to avoid making comparisons. Likewise:
Code: [Select]
If 0==c%2
  Return {2,c/2}
sqrt(c)→a
If floor(a)=a
  Return {a,a}
1+floor(a)→a
a*a-c→diff
floor(sqrt(diff))→b
diff-b*b→diff
2a-1→a
2b-1→b
diff-b→diff
Loop0:
  a+2→a
  diff+a→diff
  if carry_set
Loop1:
    b+2→b
    diff-b→diff
    GOTO Loop1 IF carry_not_set
  GOTO Loop0 IF zero_not_set
a>>1→a
b>>1→b
Return {a-b,a+b}
I should also note that at the assembly level, my integer square root routines typically also return the difference between the original input and the next smaller perfect square, so that also helps speed things up a little (getting rid of all of the multiplications).
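
In Python terms, such a routine would look something like this sketch (isqrt_rem is a hypothetical name):
Code: [Select]
import math

def isqrt_rem(n):
    """Integer square root plus remainder: r = floor(sqrt(n)) and
    n - r*r, the distance above the next smaller perfect square."""
    r = math.isqrt(n)
    return r, n - r * r

print(isqrt_rem(50))    # (7, 1): 50 = 7*7 + 1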

Sorry for the code nested in a spoiler tag :/