a 32-bit arch on a 64-bit platform, e.g. NEON on AArch64. It's probably
more useful for cross-platform testing, though.
The breakdown is as follows:
* decaf_bool_t, decaf_word_t and decaf_error_t are as defined in the API.
  * DECAF_WORD_BITS is the size in bits of a decaf_word_t.
  * decaf_word_t is used for scalars, so on every curve the scalar impls
    are the same (i.e. they follow the API's word size).
  * The SC_LIMB macro always takes a 64-bit word.
* Non-prefixed word_t, mask_t, etc. are as defined by the per-curve arch.
  * ARCH_WORD_BITS is the size in bits of a word_t.
  * word_t is used for gf elements, so the curves may have different guts.
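As a rough sketch of the split above, assuming a 64-bit API word and a
32-bit arch word: every ex_/EX_ name below is a hypothetical stand-in,
not the library's actual definition. It shows why a limb macro that
always takes a 64-bit constant lets the same scalar table compile at
either word size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are NOT the library's definitions. */
typedef uint64_t ex_api_word_t;   /* role of decaf_word_t (scalars) */
#define EX_API_WORD_BITS 64       /* role of DECAF_WORD_BITS */

typedef uint32_t ex_arch_word_t;  /* role of word_t (gf elements) */
#define EX_ARCH_WORD_BITS 32      /* role of ARCH_WORD_BITS */

/* Both macros take one 64-bit constant. The 64-bit form expands to one
 * initializer, the 32-bit form to two (low half first). */
#define EX_LIMBS_64(x) ((uint64_t)(x##ULL))
#define EX_LIMBS_32(x) ((uint32_t)(x##ULL)), ((uint32_t)((x##ULL) >> 32))

/* The same constants written once, laid out for each word size. */
static const uint64_t ex_scalar_64[2] = {
    EX_LIMBS_64(0x0123456789abcdef), EX_LIMBS_64(0x0fedcba987654321)
};
static const uint32_t ex_scalar_32[4] = {
    EX_LIMBS_32(0x0123456789abcdef), EX_LIMBS_32(0x0fedcba987654321)
};

/* Returns 1 if the 32-bit limbs reassemble into the 64-bit limbs. */
static int ex_limbs_agree(void) {
    assert(sizeof(ex_api_word_t) * 8 == EX_API_WORD_BITS);
    assert(sizeof(ex_arch_word_t) * 8 == EX_ARCH_WORD_BITS);
    for (size_t i = 0; i < 2; i++) {
        uint64_t joined = (uint64_t)ex_scalar_32[2 * i]
                        | ((uint64_t)ex_scalar_32[2 * i + 1] << 32);
        if (joined != ex_scalar_64[i]) return 0;
    }
    return 1;
}
```

Calling ex_limbs_agree() checks that both layouts describe the same value.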
I'm kind of torn about this change, because it adds a bunch of
fairly complex code that's only needed for esoteric use cases,
and it makes Elligator more complex, if mostly only for testing
purposes. Basically, this is because Elligator is approximately
8-to-1 when its domain is 56 bytes: a factor of 2 because the domain is
[0..p+small] instead of [0..(p-1)/2], and a factor of 4 for cofactor
removal. So when you call the inverse on a point, you need to say which
inverse you want, i.e. give a "hint".
Of course, the inverse fails with probability 1/2.
To make round-tripping a possibility (I'm not sure why you'd need this),
the Elligator functions now return an unsigned char hint. This means
that you can call Elligator, then invert it with the hint it returned,
and get the same buffer back out. This adds a bunch of complexity to
Elligator, which didn't previously need to compute hints. The hinting is
reasonably well tested, but it is known not to work for inputs which are
very "large", i.e. which end in about 28 0xFF bytes (FIXME. Or roll back
hinting...). There's also a significant chance that I'll revise the
hinting mechanism.
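To illustrate the calling shape only, here is a toy 8-to-1 map on a
single byte. It is not Elligator and shares none of its math; the
toy_ names are made up for this sketch. The forward map returns a hint
in 0..7, and the inverse needs that hint to pick the right preimage.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy 8-to-1 map: keeps the low 5 bits, returns the top 3 as the hint.
 * NOT Elligator; it only mimics the "map returns a hint" interface. */
static unsigned char toy_map(uint8_t *out, uint8_t in) {
    assert(out != NULL);
    *out = (uint8_t)(in & 0x1f);      /* 32 possible outputs */
    return (unsigned char)(in >> 5);  /* which of the 8 preimages it was */
}

/* Inverse: returns 1 and writes the preimage selected by hint, or
 * returns 0 if the output or the hint is out of range. */
static int toy_invert(uint8_t *recovered, uint8_t out, unsigned char hint) {
    assert(recovered != NULL);
    if (out > 0x1f || hint > 7) return 0;
    *recovered = (uint8_t)((hint << 5) | out);
    return 1;
}

/* Returns 1 if map-then-invert gives back every possible input byte. */
static int toy_round_trips(void) {
    for (unsigned in = 0; in < 256; in++) {
        uint8_t out, back;
        unsigned char hint = toy_map(&out, (uint8_t)in);
        if (!toy_invert(&back, out, hint)) return 0;
        if (back != (uint8_t)in) return 0;
    }
    return 1;
}
```

Inverting with a different hint yields one of the other seven preimages
rather than the original input, which is why the hint has to travel
with the output if you want the same buffer back.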
Create functions:
decaf_448_invert_elligator_nonuniform
decaf_448_invert_elligator_uniform
decaf::Ed448::Point::invert_elligator
decaf::Ed448::Point::steg_encode
for inverting Elligator. This last one encodes to Point::STEG_BYTES = 64
bytes in a way which is supposed to be indistinguishable from random, so
long as your point is random on the curve.
Inverting Elligator costs about 2 square roots for nonuniform. For
uniform, it's just Elligator -> diff -> invert, so it's 3 square roots.
Stegging fails about half the time, and so costs about twice that, but
the benchmark underreports it because it ignores outliers.
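The "about twice" figure is just the mean of a geometric distribution.
A minimal sketch of that arithmetic, assuming independent attempts that
each succeed with probability 1/2 (the function names are hypothetical
and the per-attempt cost is a parameter, not a measured number):

```c
#include <assert.h>

/* Expected number of attempts until the first success, as a truncated
 * sum of k * p * (1-p)^(k-1). For p = 1/2 this converges to 2. */
static double expected_attempts(double p_success, int max_terms) {
    assert(p_success > 0.0 && p_success <= 1.0);
    double total = 0.0;
    double p_all_failed_so_far = 1.0;
    for (int k = 1; k <= max_terms; k++) {
        total += k * p_success * p_all_failed_so_far;
        p_all_failed_so_far *= 1.0 - p_success;
    }
    return total;
}

/* Expected total cost when every attempt costs cost_per_attempt units
 * (e.g. square roots), whether or not it succeeds. */
static double expected_cost(double cost_per_attempt, double p_success) {
    return cost_per_attempt * expected_attempts(p_success, 200);
}
```

With a per-attempt cost of 3 square roots and success probability 1/2,
expected_cost(3.0, 0.5) comes out at about 6. A benchmark that drops
outliers discards the long retry runs, so it reports less than this mean.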
The code is tested, but I haven't checked over the indistinguishability
from random (I've only proved it correct...). There could well be a way
to break the steg even without taking advantage of "very large" inputs
or similar.