|
- Important work items for Ed448-Goldilocks:
-
- * Better architecture detection / factoring of arch-related headers.
- [PROGRESS]
-
- * Better factoring of high-level vs low-level library.
-
- * Factor out hash, crandom from core library?
-
- * Signed 32-bit NEON implementation to avoid bias/reduce after subtract
-
- * Documentation: write high-level API docs, and internal docs to help
- other implementors.
- * Partial progress on Doxygenating the code.
-
- * Documentation: write a spec or add to Watson's
-
- * Cleanup: rename everything consistently.
- * namespace_op or op_namespace? namespace_op_type?
- * We don't have to be super-careful with the namespacing, because
- symbols will be scrubbed by exported.sym.
-
- * Cleanup: hard-coded tables (probably?)
- * This reduces the work required for goldilocks_init() at the expense
- of library size.
-
- * Makes error-handling and thread safety easier.
-
- * Use the SAGE tool?
-
- * Cleanup: unify intrinsics code
- * word_t, mask_t, bigregister_t, etc.
- * Generate asm intrinsics with a script?
-
- * [DONE] Bugfix: make sure that init() and randomization are thread-safe.
-
- * [DONE] Security: check on deserialization that points are < p.
- * [NEEDS TESTING] Check also that they're nonzero or otherwise non-pathological?
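A minimal sketch of the range check named above, not the library's actual code. It assumes the serialized value is a 56-byte little-endian integer and that "p" is the Goldilocks field prime 2^448 - 2^224 - 1; the name field_elem_is_canonical and the constant FIELD_BYTES are hypothetical. The check subtracts p with a byte-wise borrow chain, so the running time does not depend on the input value.

```c
#include <stddef.h>
#include <stdint.h>

#define FIELD_BYTES 56  /* assumed encoding width: 448 bits, little-endian */

/*
 * Hypothetical helper: returns 1 if the little-endian integer in ser is
 * strictly less than p = 2^448 - 2^224 - 1, and 0 otherwise.
 * In little-endian bytes, p is 0xFF everywhere except byte 28, which is 0xFE.
 */
static int field_elem_is_canonical(const uint8_t ser[FIELD_BYTES]) {
    uint16_t borrow = 0;
    for (size_t i = 0; i < FIELD_BYTES; i++) {
        /* The index i is public, so selecting on it leaks nothing secret. */
        uint8_t p_byte = (i == 28) ? 0xFE : 0xFF;
        /* Compute ser[i] - p_byte - borrow; bit 8 is set iff it went negative. */
        uint16_t diff = (uint16_t)(ser[i] - p_byte - borrow);
        borrow = (diff >> 8) & 1;
    }
    /* A final borrow means ser - p < 0, i.e. ser < p. */
    return (int)borrow;
}
```

A deserializer would reject the input when this returns 0. The nonzero / non-pathological check from the sub-item above is a separate test and is not covered here.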
-
- * Testing:
- * Corner-case testing
- * More bulk random testing
- * Negative testing.
- * SAGE-(auto?)-generated test vectors
- * Test the Barrett fields
-
- * Safety: add static analysis attributes for compilers that support them
- * Most functions now warn on an ignored return value.
-
- * Safety:
- * [DONE] Check for init() if it's still required once we've done the above
- * Decide what to do about RNG failures
- * abort
- * return error and zeroize
- * return error but continue if RNG is kind of mostly OK
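For comparison, the "return error and zeroize" option could look roughly like the sketch below. Everything in it is an assumption for illustration: the rng_fn callback type, generate_secret, zeroize, and the two demo RNG stubs are hypothetical names, not part of the library.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical RNG callback: returns 0 on success, nonzero on failure. */
typedef int (*rng_fn)(uint8_t *out, size_t len);

/* Wipe through a volatile pointer so the stores are not optimized away. */
static void zeroize(void *buf, size_t len) {
    volatile uint8_t *p = (volatile uint8_t *)buf;
    while (len--) *p++ = 0;
}

/*
 * Fill key[0..len) from rng. On RNG failure, wipe whatever partial
 * output was written and report the error to the caller.
 */
static int generate_secret(uint8_t *key, size_t len, rng_fn rng) {
    if (rng(key, len) != 0) {
        zeroize(key, len);
        return -1;
    }
    return 0;
}

/* Demo stubs standing in for a real entropy source. */
static int ok_rng(uint8_t *out, size_t len) {
    for (size_t i = 0; i < len; i++) out[i] = (uint8_t)(i + 1);
    return 0;
}

static int failing_rng(uint8_t *out, size_t len) {
    if (len) out[0] = 0xAA;  /* simulate a partial write before failing */
    return 1;
}
```

The "abort" option would replace the zeroize-and-return branch with a call to abort(). The "continue if mostly OK" option needs a definition of "mostly OK" that this sketch does not attempt.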
-
- * Flexibility: decide which API options are good.
- * [DONE?] E.g., should functions take nbits and table sizes?
-
- * [DONE] Remove hardcoded adjustments from comb control.
- * These adjustments make the output wrong when it's not 450 bits.
-
- * Other slow Barrett fields? Montgomery fields?
-
- * Mid-level API
- * Make it easier to work with untwisted Edwards objects.
- * Probably use extended or projective, not extensible coordinates.
- * Scalarmul with other cofactor modes.
-
- * High-level API:
- * SHA512 Elligator Edition? Maybe write a paper first.
-
- * Elligator.
- * Need to write Elligator inverse. Might not be Elligator-2S.
-
- * FHMQV? Is this patented?
-
- * What low-level APIs to expose?
- * Edwards points with add, sub, scalarmul, =, ==, ser/deser?
-
- * Portability: test and make clean with other compilers
- * Using a fair amount of __attribute__ code.
- * [DONE] Should work for GCC now.
-
- * Portability: try to make the vector code as portable as possible
- * Currently using clang ext_vector_type.
- * I can't get a simple for-loop to autovectorize :-/
- * SAGE tool?
-
- * Portability: make the inner layers of the code 32-bit clean.
- * Write new versions of the field code.
- * [DONE] 28-bit limbs give less headroom for carries.
- * [DONE] Now have a vectorless ARM version; need NEON.
- * Improve speed of 32-bit field code.
-
- * [DONE] Run through the SAGE tool to generate new bias & bound.
-
- * [DONE] Portability: make the outer layers of the code 32-bit clean.
-
- * [DONE] Performance/flexibility: decide which parameters should be hard-coded.
- * Perhaps useful for comb precomputation.
-
- * Performance: Improve SHA512.
- * [DONE?] Improve portability.
- * Improve speed.
- * Except not, because this adds too much code size.
- * Link OpenSSL if a fast SHA is desired.
-
- * Protocol:
- * Decide what things to stir into hashes for various functions.
-
- * Performance: improve the Barrett field code.
- * Support other primes?
- * Capture prime shape into a struct instead of passing 3 params.
- * [DONE] Make 32-bit clean.
-
- * Automation:
- * Improve the SAGE tool to cover more cases
- * Real SSA classes to cover branching and looping
- * Constant-time selection
- * Intrinsics code
- * Field code?
-
- * SAGE tool is impossibly slow on 32-bit
- * Currently stuck on Elligator after 19 hours.
- * [FIXED] at least for now.
-
- * Vector-mul-chains
- * Negation "bubble pushing" optimization
-
- * Clear other TODO/FIXME/HACK/PERF items in the code
-
- * [DONE?] Submit to SUPERCOP
|