aboutsummaryrefslogtreecommitdiff
diff options
context:
space:
mode:
-rw-r--r--doc/proposals/2021-11-remove-arbitrary-bytes.md32
1 files changed, 32 insertions, 0 deletions
diff --git a/doc/proposals/2021-11-remove-arbitrary-bytes.md b/doc/proposals/2021-11-remove-arbitrary-bytes.md
new file mode 100644
index 0000000..bcdf3cc
--- /dev/null
+++ b/doc/proposals/2021-11-remove-arbitrary-bytes.md
@@ -0,0 +1,32 @@
+# Remove arbitrary bytes
+A leaf's checksum is currently an opaque array of 32 arbitrary bytes. We would
+like to change this to H(checksum), so that no logged bytes are arbitrary. As a
+result, the threat of log poisoning goes from unlikely to very unlikely.
+
+## Details
+New leaf:
+- Shard hint
+- H(checksum), was "just checksum"
+- Signature
+- H(public key)
+
+A signer's signed statement must be for H(checksum), not checksum. In other
+words, a signer basically signs H(H(data)), then checksum<-H(data) is submitted
+on our current add-leaf endpoint. The log computes H(checksum) for incoming
+add-leaf requests. No other changes are required for the log's leaf endpoints.
+
+Monitors locate data externally based on H(checksum), not checksum. Note that
+monitors can verify observed signatures as before without locating the data.
+This is important so that we can be sure a signing operation actually happened.
+
+Verifiers need the same (meta)data distributed, but in the verification step
+H(checksum) must be computed to verify signatures and inclusion proofs.
+
+Witnesses are not affected by this change.
+
+## Other
+A different approach would be to submit data and let the log hash that. Not
+letting the log see data is a feature:
+- The data cannot be analyzed by the log unless its location is known
+- The data cannot be expected to be stored in the future
+- Each logging request becomes cheaper