-rw-r--r-- doc/proposals/2021-11-remove-arbitrary-bytes.md | 32
1 files changed, 32 insertions, 0 deletions
diff --git a/doc/proposals/2021-11-remove-arbitrary-bytes.md b/doc/proposals/2021-11-remove-arbitrary-bytes.md
new file mode 100644
index 0000000..bcdf3cc
--- /dev/null
+++ b/doc/proposals/2021-11-remove-arbitrary-bytes.md
@@ -0,0 +1,32 @@
+# Remove arbitrary bytes
+A leaf's checksum is currently an opaque array of 32 arbitrary bytes. We would
+like to change this to H(checksum), so that no logged bytes are arbitrary. As a
+result, the threat of log poisoning goes from unlikely to very unlikely.
+
+## Details
+New leaf:
+- Shard hint
+- H(checksum), was "just checksum"
+- Signature
+- H(public key)
+
+A signer's signed statement must be for H(checksum), not checksum. In other
+words, a signer basically signs H(H(data)), then checksum<-H(data) is submitted
+on our current add-leaf endpoint. The log computes H(checksum) for incoming
+add-leaf requests. No other changes are required for the log's leaf endpoints.
+
+Monitors locate data externally based on H(checksum), not checksum. Note that
+monitors can verify observed signatures as before without locating the data.
+This is important so that we can be sure a signing operation actually happened.
+
+Verifiers need the same (meta)data distributed, but in the verification step
+H(checksum) must be computed to verify signatures and inclusion proofs.
+
+Witnesses are not affected by this change.
+
+## Other
+A different approach would be to submit data and let the log hash that. Not
+letting the log see data is a feature:
+- The data cannot be analyzed by the log unless its location is known
+- The data cannot be expected to be stored in the future
+- Each logging request becomes cheaper
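
Note on the hashing flow described in the proposal: the chain from data to the logged value can be sketched as below. This is a minimal illustration only, assuming H is SHA-256 (the proposal only states that the checksum is 32 bytes). The function names `submitter_checksum` and `logged_checksum_hash` are hypothetical, and signing is left out.

```python
import hashlib


def H(data: bytes) -> bytes:
    # Assumption: H is SHA-256, which yields the 32-byte output the proposal mentions.
    return hashlib.sha256(data).digest()


def submitter_checksum(data: bytes) -> bytes:
    # Hypothetical helper: the value submitted on add-leaf, checksum <- H(data).
    return H(data)


def logged_checksum_hash(checksum: bytes) -> bytes:
    # Hypothetical helper: the value the log computes and stores, H(checksum).
    # This is also the value the signer's statement covers.
    if len(checksum) != 32:
        raise ValueError("checksum must be exactly 32 bytes")
    return H(checksum)


data = b"example data"
checksum = submitter_checksum(data)          # sent to the log
leaf_value = logged_checksum_hash(checksum)  # stored in the leaf

# The logged value equals H(H(data)), so no logged byte is chosen freely
# by the submitter.
assert leaf_value == H(H(data))
```

A verifier holding the data would recompute the same two hashes before checking the signature and the inclusion proof, while the log itself only ever sees `checksum`.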