A perfect power, and then?
Torbjorn Granlund
tg at gmplib.org
Sat Oct 27 17:00:20 CEST 2012
nisse at lysator.liu.se (Niels Möller) writes:
a^{1/k}, it should work fine to use a table indexed by low bits of a and
*low bits only* of k.
I think the underlying reason is that
\phi(2^m) = 2^{m-1},
hence
a^n (mod 2^m) = a^{n mod 2^{m-1}} (mod 2^m)
Cute.
My implementation constructs a 4-bit starting value as
r0 = 1 + (((n << 2) & ((a0 << 1) ^ (a0 << 2))) & 8);
(here, a0 is the low input limb, r0 is the low output limb, and the
iteration computes a^{1/n-1} mod a power of two.
That's a clever formula, but it might not be faster than a tiny table if
less than 7 operations can be used.
We should be able to get a 8-bit starting value using a table lookup on
at most 13 bits (18 KByte). But maybe it's not worth the effort; a
single iteration getting from 4 bits to 8 shouldn't be terribly
expensive.
18 kByte is too much.
BTW, for large n one ought to use n mod the right power of 2 for the
powering in the first few iterations, to avoid doing lots of useless
work in powering.
Perhaps as a comment add that to the file?
> (2) iterate single limb code before entering the mpn loop.
One should definitely have an initial single-limb loop. Similar to how
it's doen with binvert and binvert_limb.
I have become convinced that we need a mpn_broot, for example in a
mpn_rootexact. I am not convinced that perfpow's early inversion and
use of binv_root is worse than using mpn_broot and a mullo for each k.
I suggest that we make four new files in mpn/generic:
broot.c: mpn_broot and your mpn_xxx that computes a^{1/n-1},
perhaps call the latter mpn_brootinvm1
brootinv.c: mpn_brootinv
bsqrt.c: mpn_bsqrt (which probably calls mpn_bsqrtinv, mpn_mullo)
bsqrtinv.c: mpn_bsqrtinv
--
Torbjörn
More information about the gmp-devel
mailing list