a 32-bit arch on a 64-bit platform, e.g. NEON on AARCH64. It's probably
more useful for cross-platform testing, though.
The breakdown is as follows:
* decaf_bool_t, decaf_word_t and decaf_error_t are as defined in the API.
* DECAF_WORD_BITS is the size of a decaf_word_t.
* decaf_word_t is used for scalars, so on every curve the scalar impls are the same
(i.e. they follow the API's word size).
* The SC_LIMB macro always takes a 64-bit word.
* non-prefixed word_t, mask_t, etc are as defined by the per-curve arch.
* ARCH_WORD_BITS is the size of a word_t.
* word_t is used for gf elements, so the curves may have different guts.
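To make the breakdown above concrete, here is a minimal sketch of the two word layers. Every `sketch_`/`SKETCH_` name is hypothetical and is not the library's actual definition; `SKETCH_FORCE_32_BIT` only stands in for a flag like DECAF_FORCE_32_BIT, and the 128-bit scalar size is chosen purely for illustration. It shows an SC_LIMB-style macro that always takes a 64-bit constant and expands to one or two API words, while the arch word is picked independently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a build flag such as DECAF_FORCE_32_BIT. */
#ifndef SKETCH_FORCE_32_BIT
#define SKETCH_FORCE_32_BIT 1
#endif

/* API layer: this word type is used for scalars on every curve. */
#if SKETCH_FORCE_32_BIT
typedef uint32_t sketch_api_word_t;
#define SKETCH_API_WORD_BITS 32
/* A 64-bit constant expands to two 32-bit limbs, low limb first. */
#define SKETCH_SC_LIMB(x) \
    (uint32_t)((x) & 0xFFFFFFFFu), (uint32_t)((uint64_t)(x) >> 32)
#else
typedef uint64_t sketch_api_word_t;
#define SKETCH_API_WORD_BITS 64
#define SKETCH_SC_LIMB(x) (uint64_t)(x)
#endif

/* Arch layer: chosen per curve, independently of the API word (assumption). */
typedef uint64_t sketch_arch_word_t;
#define SKETCH_ARCH_WORD_BITS 64

/* Illustrative 128-bit scalar; the table is written once in 64-bit chunks. */
#define SKETCH_SCALAR_BITS 128
#define SKETCH_SCALAR_LIMBS (SKETCH_SCALAR_BITS / SKETCH_API_WORD_BITS)

static const sketch_api_word_t sketch_scalar[SKETCH_SCALAR_LIMBS] = {
    SKETCH_SC_LIMB(0x0123456789abcdefULL),
    SKETCH_SC_LIMB(0xfedcba9876543210ULL)
};

/* Reassemble 64-bit chunk i from the API words, whatever their width. */
static uint64_t sketch_scalar_chunk(const sketch_api_word_t *s, size_t i) {
    const size_t per = 64 / SKETCH_API_WORD_BITS;
    uint64_t out = 0;
    assert(i < SKETCH_SCALAR_BITS / 64);
    for (size_t j = 0; j < per; j++) {
        out |= (uint64_t)s[i * per + j] << (j * SKETCH_API_WORD_BITS);
    }
    return out;
}
```

Because the constants are always written as 64-bit chunks, the same table source works for both API word sizes; only the macro expansion changes.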
Currently compiles and passes tests on x86_64 with arch_32 and
DECAF_FORCE_32_BIT=1 (as well as the native settings of course),
so that's a start.
Want to make the serialization routine cross-arch. Need to check that
perf is good enough (likely). The current routine in p25519/arch_32
is almost cross-arch, but has known bugs (FIXMEs). It needs to take
into account separate p and, for NEON, the LIMBPERM.
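One possible shape for a cross-arch serializer is sketched below. This is not the routine in p25519/arch_32; the function name, the per-limb `bits` table and the `perm` table (standing in for something like LIMBPERM) are all assumptions made for illustration. It assumes the limbs are already fully reduced and walks them through a small bit buffer, so the limb width and storage order become data rather than code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Pack nlimbs fully reduced limbs into out_len little-endian bytes.
 *   bits[k] : number of significant bits in logical limb k (hypothetical table)
 *   perm[k] : index in memory where logical limb k is stored (hypothetical,
 *             playing the role of a limb permutation on vectorized arches)
 */
static void sketch_serialize(uint8_t *out, size_t out_len,
                             const uint32_t *limbs, const uint8_t *bits,
                             const uint8_t *perm, size_t nlimbs) {
    uint64_t buffer = 0; /* pending bits, least significant first */
    unsigned fill = 0;   /* number of valid bits in buffer */
    size_t j = 0;        /* next logical limb to consume */

    for (size_t i = 0; i < out_len; i++) {
        /* Refill until at least one whole byte is available. */
        while (fill < 8 && j < nlimbs) {
            assert(bits[j] <= 32);
            buffer |= (uint64_t)limbs[perm[j]] << fill;
            fill += bits[j];
            j++;
        }
        out[i] = (uint8_t)buffer;
        buffer >>= 8;
        fill = (fill >= 8) ? fill - 8 : 0;
    }
}
```

A variable-width `bits` table would also cover fields whose limbs alternate in size, at the cost of a table lookup per limb; whether that is fast enough is the perf question noted above.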
Want to decouple arches for each curve/field. Currently the split
between decaf_word_t and word_t makes this fraught with peril. Fix
is probably to rename decaf_word_t to decaf_api_word_t and fix it
to either uint32_t or uint64_t, then make internal things separate per
field. That way we don't have to try arch detection in the header,
which is nice.
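A rough sketch of what that decoupling could look like follows. `decaf_api_word_t` is the proposed name from the note above and does not exist yet; `SKETCH_API_WORD_BITS`, the per-field typedefs and the conversion helper are hypothetical. The idea is that the public header fixes the API word from a build-time setting with no arch detection, and each field privately picks its own limb type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Public header (sketch): width fixed by the build, no arch detection. */
#ifndef SKETCH_API_WORD_BITS
#define SKETCH_API_WORD_BITS 64
#endif

#if SKETCH_API_WORD_BITS == 64
typedef uint64_t decaf_api_word_t; /* proposed name, not yet in the API */
#elif SKETCH_API_WORD_BITS == 32
typedef uint32_t decaf_api_word_t;
#else
#error "SKETCH_API_WORD_BITS must be 32 or 64"
#endif

/* Private per-field headers (sketch): each field chooses independently. */
typedef uint32_t sketch_p25519_word_t; /* e.g. a 32-bit backend */
typedef uint64_t sketch_p448_word_t;   /* e.g. a 64-bit backend */

/* Boundary helper: split API words into 32-bit internal words, low half first. */
static void sketch_api_to_u32(uint32_t *dst, const decaf_api_word_t *src,
                              size_t n_api_words) {
    const size_t per = sizeof(decaf_api_word_t) / sizeof(uint32_t);
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n_api_words; i++) {
        for (size_t j = 0; j < per; j++) {
            dst[i * per + j] = (uint32_t)(src[i] >> (32 * j));
        }
    }
}
```

With this layout, only explicit boundary helpers ever see both word types, which is where the current decaf_word_t / word_t mix-ups tend to arise.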
Need to make decaf_gen_tables use SC_LIMB. Might as well get rid
of API_NS there too.