Author Topic: [x86] Hiding the Else in a NOP  (Read 3290 times)

0 Members and 1 Guest are viewing this topic.

Offline harold

  • LV5 Advanced (Next: 300)
  • *****
  • Posts: 226
  • Rating: +41/-3
    • View Profile
[x86] Hiding the Else in a NOP
« on: August 14, 2012, 01:46:39 pm »
I'm trying to implement 0x8000000000000000 >> nlz(x) well.

What I came up with might be a bit unorthodox:
Code: [Select]
 mov r11d, 1
  bsr rcx, rax
  jz _iszero
  shl r11, cl
  .db 0F, 1F, 80 ; nop [rax+sdword] with the sdword being the next shl
_iszero:
  shl r11, 63  ; 49 D3 E3 3F so 4 bytes
Because BSR is retarded and returns something useless when the argument is zero, I have to handle that case with a branch. But this gets rid of the branch I'd otherwise use to skip the second shl.
An other way to do this is shl-ing by 63 in all cases (or it could be a 64bit mov) and then shr back in the nonzero case. That means xor-ing the result of bsr with 63 though - not a disaster, but more instructions.

Is there any reason not to do it this way? (besides "maintainability", I'm the only person who's ever going to read it anyway and I certainly know what this means)
Any unexpected slowdowns on some micro-architectures? Are trace caches OK with this?
Is the other way I described better?
« Last Edit: August 14, 2012, 02:41:03 pm by harold »
Blog about bitmath: bitmath.blogspot.nl
Check the haroldbot thread for the supported commands and syntax.
You can use haroldbot from this website.