Ok, so I was checking the isqrtEst64 against isqrt(), and they are exactly the same when P = 0.
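However, because I am using SDL2's "SDL_MostSignificantBitIndex32(n)" to get log2, the input gets cast to 32 bits, so large integers (anything that doesn't fit in a u32) are truncated and give wrong results.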
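u64 sqrt_lo(const u32 n) noexcept
{
    // SDL_MostSignificantBitIndex32 returns -1 for 0, which must not be used as a shift count.
    if (n == 0)
        return 0;
    const u32 log2floor = SDL_MostSignificantBitIndex32(n);
    // 2^(floor(log2(n)) / 2) is a lower bound on sqrt(n).
    return u64(1) << (log2floor >> 1);
}
u32 isqrtEst64(u64 n, u8 P = 0)
{
    u32 est = sqrt_lo((u32)n); // only the low 32 bits of n are used here
    u32 b = (u32)n;
    ...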
The nice thing is that the larger values don't occur in fights. I guess they happen in map generation and pathing, and probably in a lot of other places. As far as I can tell the patch is working, but I am sure there are more improvements possible.
So @Stan` I would need to come up with a 64-bit equivalent to SDL_MostSignificantBitIndex32(n) in order to see whether this sqrt approach could be faster than the current isqrt().
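For what it's worth, here is a rough sketch of what such a 64-bit equivalent could look like. None of these names exist in SDL or in the patch: msb_index32 is only a stand-in for SDL_MostSignificantBitIndex32 so the snippet compiles without SDL, and msb_index64 and sqrt_lo64 are hypothetical helpers. The idea is simply to look at the high 32 bits first and fall back to the low 32 bits. I have not benchmarked it, so whether it helps against isqrt() is still an open question.

```cpp
#include <cstdint>

// Stand-in for SDL_MostSignificantBitIndex32 (assumption: same contract,
// index of the highest set bit, or -1 when x is 0).
inline int msb_index32(std::uint32_t x) noexcept
{
    int index = -1;
    while (x != 0)
    {
        x >>= 1;
        ++index;
    }
    return index;
}

// Hypothetical 64-bit version built from two 32-bit lookups.
inline int msb_index64(std::uint64_t x) noexcept
{
    const std::uint32_t hi = static_cast<std::uint32_t>(x >> 32);
    if (hi != 0)
        return 32 + msb_index32(hi);
    return msb_index32(static_cast<std::uint32_t>(x));
}

// Hypothetical 64-bit counterpart of sqrt_lo:
// returns 2^(floor(log2(n)) / 2), or 0 when n is 0.
inline std::uint64_t sqrt_lo64(std::uint64_t n) noexcept
{
    if (n == 0)
        return 0;
    return std::uint64_t(1) << (msb_index64(n) >> 1);
}
```

With something like this, isqrtEst64 could call sqrt_lo64(n) directly instead of casting n down to u32 first. In the real code the loop in msb_index32 would be replaced by the SDL call or a compiler intrinsic.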