Diffstat (limited to 'doc')
-rw-r--r-- | doc/proposals/2021-11-remove-arbitrary-bytes.md | 22 |
1 files changed, 13 insertions, 9 deletions
diff --git a/doc/proposals/2021-11-remove-arbitrary-bytes.md b/doc/proposals/2021-11-remove-arbitrary-bytes.md
index bcdf3cc..bee7540 100644
--- a/doc/proposals/2021-11-remove-arbitrary-bytes.md
+++ b/doc/proposals/2021-11-remove-arbitrary-bytes.md
@@ -1,19 +1,24 @@
-# Remove arbitrary bytes
+**Title**: Remove arbitrary bytes </br>
+**Date**: 2021-12-04 </br>
+**State**: Aborted </br>
+
+# Summary
 A leaf's checksum is currently an opaque array of 32 arbitrary bytes. We would
 like to change this to H(checksum), so that no logged bytes are arbitrary. As a
 result, the threat of log poisoning goes from unlikely to very unlikely.
 
-## Details
+# Detailed description
 New leaf:
 - Shard hint
 - H(checksum), was "just checksum"
 - Signature
 - H(public key)
 
-A signer's signed statement must be for H(checksum), not checksum. In other
-words, a signer basically signs H(H(data)), then checksum<-H(data) is submitted
-on our current add-leaf endpoint. The log computes H(checksum) for incoming
-add-leaf requests. No other changes are required for the log's leaf endpoints.
+A signer's signed statement would be for shard hint and H(checksum), not shard
+hint and checksum. The same inputs are provided to the log for add-leaf
+submissions. The log hashes the submitted checksum and then does all
+verification as before. The hashed checksum is stored in the log's leaf. As
+such, it becomes computationally expensive to craft many arbitrary leaf bytes.
 
 Monitors locate data externally based on H(checksum), not checksum. Note that
 monitors can verify observed signatures as before without locating the data.
@@ -24,9 +29,8 @@ H(checksum) must be computed to verify signatures and inclusion proofs.
 
 Witnesses are not affected by this change.
 
-## Other
-A different approach would be to submit data and let the log hash that. Not
-letting the log see data is a feature:
+Note: a different approach would have been to submit data and let the log hash
+that. Not letting the log see data is a feature:
 - The data cannot be analyzed by the log unless its location is known
 - The data cannot be expected to be stored in the future
 - Each logging request becomes cheaper
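
The add-leaf handling described in the revised proposal text could be sketched roughly as follows. This is only an illustration of the (aborted) proposal, not the log's actual code. Assumptions not stated in the diff: H is taken to be SHA-256 (the text only says a checksum is 32 bytes), the signed message is serialized as an 8-byte big-endian shard hint followed by H(checksum), and the names `Leaf`, `signed_message`, `make_leaf`, and the caller-supplied `verify` callback are all hypothetical.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable


def H(data: bytes) -> bytes:
    # Assumption: H is SHA-256; the proposal does not name the hash function.
    return hashlib.sha256(data).digest()


@dataclass(frozen=True)
class Leaf:
    shard_hint: int
    checksum_hash: bytes  # H(checksum), was "just checksum"
    signature: bytes
    key_hash: bytes       # H(public key)


def signed_message(shard_hint: int, checksum: bytes) -> bytes:
    # Hypothetical serialization: 8-byte big-endian shard hint || H(checksum).
    return shard_hint.to_bytes(8, "big") + H(checksum)


def make_leaf(
    shard_hint: int,
    checksum: bytes,
    signature: bytes,
    public_key: bytes,
    verify: Callable[[bytes, bytes, bytes], bool],
) -> Leaf:
    # The submitter sends the same inputs as before (including the plain checksum).
    if len(checksum) != 32:
        raise ValueError("checksum must be 32 bytes")
    # The log hashes the submitted checksum, then verifies the signature
    # over shard hint and H(checksum) using the supplied callback.
    message = signed_message(shard_hint, checksum)
    if not verify(public_key, message, signature):
        raise ValueError("invalid signature")
    # Only the hashed checksum is stored, so the submitter does not directly
    # choose those 32 bytes of the leaf.
    return Leaf(shard_hint, H(checksum), signature, H(public_key))
```

The `verify` callback stands in for whatever signature scheme the log uses, which keeps the sketch free of third-party dependencies. A monitor in this sketch would look up data by `leaf.checksum_hash` rather than by the submitted checksum.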