-rw-r--r-- doc/api.md 206
1 files changed, 116 insertions, 90 deletions
diff --git a/doc/api.md b/doc/api.md
index b5d54e6..174d2c9 100644
--- a/doc/api.md
+++ b/doc/api.md
@@ -8,28 +8,18 @@ This is a work-in-progress document that may be moved or modified.
## Overview
The log implements an HTTP(S) API:
-- Requests that add data to the log use the HTTP POST method. The HTTP content
-type is `application/x-www-form-urlencoded`. The posted data are key-value
-pairs. Binary data must be base64-encoded.
-- Requests that retrieve data from the log use the HTTP GET method. The HTTP
-content type is `application/x-www-form-urlencoded`. Input parameters are
-key-value pairs.
-- Responses are JSON objects. The HTTP content type is `application/json`.
-- Error messages are human-readable strings. The HTTP content type is
-`text/plain`.
-
-We decided to use these web formats for requests and responses because the log
-is running as an HTTP(S) service. In other words, anyone that interacts with
-the log is most likely using these formats already. The other benefit is that
-all requests and responses are human-readable. This makes it easier to
-understand the protocol, troubleshoot issues, and copy-paste. We favored
-compatibility and understandability over a wire-efficient format.
-
-Note that we are not using JSON for signed and/or logged data. In other words,
-a submitter that wishes to distribute log responses to their user base in a
-different format may do so. The forced (de)serialization parser on _end-users_
-is a small subset of Trunnel. Trunnel is an "idiot-proof" wire-format that the
-Tor project uses.
+- Requests that add data to the log use the HTTP POST method.
+- Requests that retrieve data from the log use the HTTP GET method.
+- The HTTP content type is `application/x-www-form-urlencoded` for requests and
+responses. This means that all input and output are expressed as key-value
+pairs. Binary data must be hex-encoded.
+
+We decided to use percent encoding for requests and responses because it is a
+_simple format_ that is commonly used on the web. We are not using percent
+encoding for signed and/or logged data. In other words, a submitter may
+distribute log responses to their end-users in a different format that suits
+them. The forced (de)serialization parser on _end-users_ is a small subset of
+Trunnel. Trunnel is an "idiot-proof" wire-format that the Tor project uses.
## Primitives
### Cryptography
@@ -49,6 +39,13 @@ padding. Supporting RSA is suboptimal, but excluding it would make the log
useless for many possible adopters.
### Serialization
+Log requests and responses are percent encoded. Percent encoding is a smaller
+dependency than an alternative parser like JSON. It is comparable to rolling
+your own minimalistic line-terminated format. Some input and output data is
+binary: cryptographic hashes and signatures. Binary data must be expressed as
+hex before percent-encoding it. We decided to use hex as opposed to base64
+because it is simpler, favoring simplicity over efficiency on the wire.
+
We use the [Trunnel](https://gitweb.torproject.org/trunnel.git) [description language](https://www.seul.org/~nickm/trunnel-manual.html)
to define (de)serialization of data structures that need to be signed or
inserted into the Merkle tree. Trunnel is more expressive than the
@@ -62,13 +59,12 @@ A fair summary of our Trunnel usage is as follows.
All integers are 64-bit, unsigned, and in network byte order. A fixed-size byte
array is put into the serialization buffer in-order, starting from the first
-byte. These basic types are concatenated to form a collection. You should not
-need a general-purpose Trunnel (de)serialization parser to work with this
-format. If you have one, you may use it though. The main point of using
-Trunnel is that it makes a simple format explicit and unambiguous.
-
-TODO: URL-encode _or_ JSON? I think we should only need one. Always doing HTTP
-POST would also ensure that input parameters don't show up in web server logs.
+byte. A variable-length byte array first declares its length as an integer,
+which is then followed by that number of bytes. These basic types are
+concatenated to form a collection. You should not need a general-purpose
+Trunnel (de)serialization parser to work with this format. If you have one, you
+may use it though. The main point of using Trunnel is that it makes a simple
+format explicit and unambiguous.
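The summary above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the log's implementation: the helper names and the example collection (an integer, a 32-byte array, and a variable-length array) are hypothetical and do not correspond to any structure defined in this document.

```python
import struct


def put_u64(value: int) -> bytes:
    """Serialize an unsigned 64-bit integer in network (big-endian) byte order."""
    return struct.pack(">Q", value)


def put_fixed(data: bytes, size: int) -> bytes:
    """Serialize a fixed-size byte array in order, starting from the first byte."""
    if len(data) != size:
        raise ValueError(f"expected {size} bytes, got {len(data)}")
    return data


def put_var(data: bytes) -> bytes:
    """Serialize a variable-length byte array: its length as an integer, then the bytes."""
    return put_u64(len(data)) + data


def serialize_example(counter: int, digest: bytes, blob: bytes) -> bytes:
    """Concatenate basic types to form a (hypothetical) collection."""
    return put_u64(counter) + put_fixed(digest, 32) + put_var(blob)
```

Because every field is either fixed-size or length-prefixed, a reader can split the buffer back into fields without a general-purpose parser.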
#### Merkle tree head
Tree heads are signed by the log and its witnesses. It contains a timestamp, a
@@ -160,91 +156,124 @@ that it must be a valid HTTP(S) URL that can have the `/st/v0/<endpoint>` suffix
appended. For example, a complete endpoint URL could be
`https://log.example.com/2021/st/v0/get-signed-tree-head`.
+The HTTP status code is 200 OK to indicate success. A different HTTP status
+code is used to indicate failure. On failure, the log should set the "error"
+key to a human-readable value that describes what went wrong. For example,
+`error=invalid+signature`, `error=rate+limit+exceeded`, or
+`error=unknown+leaf+hash`.
+
### get-signed-tree-head
```
GET <base url>/st/v0/get-signed-tree-head
```
-Input key-value pairs:
-- `type`: either the string "latest", "stable", or "cosigned".
- - "latest": ask for the most recent signed tree head.
- - "stable": ask for a recent signed tree head that is fixed for some period
+Input:
+- "type": one of the strings "latest", "stable", or "cosigned".
+ - latest: ask for the most recent signed tree head.
+ - stable: ask for a recent signed tree head that is fixed for some period
of time.
- - "cosigned": ask for a recent cosigned tree head.
-
-Output:
-- On success: status 200 OK and a signed tree head. The response body is
-defined by the following [schema](https://github.com/system-transparency/stfe/blob/design/doc/schema/sth.schema.json).
-- On failure: a different status code and a human-readable error message.
+ - cosigned: ask for a recent cosigned tree head.
+
+Output on success:
+- "timestamp": `tree_head.timestamp` as a human-readable number.
+- "tree_size": `tree_head.tree_size` as a human-readable number.
+- "root_hash": `tree_head.root_hash` in hex.
+- "signature": an Ed25519 signature over `tree_head`. The result is
+hex-encoded.
+- "key_hash": a hash of the public verification key that can be used to verify
+the signature. The public verification key is serialized as in RFC 8032, then
+hashed using SHA256. The result is hex-encoded.
+
+The "signature" and "key_hash" fields may repeat. The first signature
+corresponds to the first key hash, the second signature corresponds to the
+second key hash, etc. The number of signatures and key hashes must match.
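A client could decode such a response roughly as follows. This is a sketch that assumes the response body is a single `application/x-www-form-urlencoded` string with `&` separators; the function name `parse_signed_tree_head` and the shape of the returned dictionary are illustrative choices, not part of the API.

```python
from urllib.parse import parse_qs


def parse_signed_tree_head(body: str) -> dict:
    """Decode a percent-encoded signed tree head into Python values."""
    fields = parse_qs(body, strict_parsing=True)

    # These keys are expected exactly once.
    for key in ("timestamp", "tree_size", "root_hash"):
        if len(fields.get(key, [])) != 1:
            raise ValueError(f"expected exactly one {key!r} value")

    # "signature" and "key_hash" may repeat; the i-th signature belongs to
    # the i-th key hash, so both lists must have the same length.
    signatures = fields.get("signature", [])
    key_hashes = fields.get("key_hash", [])
    if len(signatures) != len(key_hashes):
        raise ValueError("number of signatures and key hashes must match")

    return {
        "timestamp": int(fields["timestamp"][0]),
        "tree_size": int(fields["tree_size"][0]),
        "root_hash": bytes.fromhex(fields["root_hash"][0]),
        # List of (key_hash, signature) pairs in the order they were received.
        "signatures": [
            (bytes.fromhex(k), bytes.fromhex(s))
            for k, s in zip(key_hashes, signatures)
        ],
    }
```

The sketch only decodes and pairs values. Verifying each Ed25519 signature over the serialized `tree_head` is a separate step.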
### get-proof-by-hash
```
POST <base url>/st/v0/get-proof-by-hash
```
-Input key-value pairs:
-- `leaf_hash`: a base64-encoded leaf hash that identifies which `tree_leaf` the
+Input:
+- "leaf_hash": a hex-encoded leaf hash that identifies which `tree_leaf` the
log should prove inclusion for. The leaf hash is computed using the RFC 6962
-hashing strategy. In other words, `H(0x00 | tree_leaf)`.
-- `tree_size`: the tree size of a tree head that the proof should be based on.
+hashing strategy. In other words, `SHA256(0x00 | tree_leaf)`.
+- "tree_size": a human-readable tree size of the tree head that the proof should
+be based on.
-Output:
-- On success: status 200 OK and an inclusion proof. The response body is
-defined by the following [schema](https://github.com/system-transparency/stfe/blob/design/doc/schema/inclusion_proof.schema.json).
-- On failure: a different status code and a human-readable error message.
+Output on success:
+- "tree_size": human-readable tree size that the proof is based on.
+- "leaf_index": human-readable zero-based index of the leaf that the proof is
+based on.
+- "inclusion_path": a node hash in hex.
+
+The "inclusion_path" may be omitted or repeated to represent an inclusion proof
+of zero or more node hashes. The order of node hashes follows from our hash
+strategy, see RFC 6962.
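To illustrate how a client might use these values, the sketch below computes a leaf hash and checks an inclusion proof against a root hash. It assumes the RFC 6962 node hash `SHA256(0x01 | left | right)` and follows the verification steps written out in RFC 9162, section 2.1.3.2. The function names are hypothetical, and whether this exact procedure matches the log's implementation is an assumption.

```python
import hashlib


def leaf_hash(tree_leaf: bytes) -> bytes:
    """RFC 6962 leaf hash: SHA256(0x00 | tree_leaf)."""
    return hashlib.sha256(b"\x00" + tree_leaf).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    """RFC 6962 interior node hash: SHA256(0x01 | left | right)."""
    return hashlib.sha256(b"\x01" + left + right).digest()


def verify_inclusion(leaf: bytes, leaf_index: int, tree_size: int,
                     inclusion_path: list, root_hash: bytes) -> bool:
    """Return True if inclusion_path proves that leaf is at leaf_index."""
    if leaf_index >= tree_size:
        return False
    fn, sn = leaf_index, tree_size - 1
    r = leaf
    for p in inclusion_path:
        if sn == 0:
            return False  # more node hashes than the tree shape allows
        if fn & 1 or fn == sn:
            r = node_hash(p, r)
            if not fn & 1:
                # Skip levels where this node has no right sibling.
                while fn & 1 == 0 and fn != 0:
                    fn >>= 1
                    sn >>= 1
        else:
            r = node_hash(r, p)
        fn >>= 1
        sn >>= 1
    return sn == 0 and r == root_hash
```

The `inclusion_path` argument is the list of decoded "inclusion_path" values in the order they appear in the response, and `root_hash` comes from a signed tree head with the same tree size.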
### get-consistency-proof
```
POST <base url>/st/v0/get-consistency-proof
```
-Input key-value pairs:
-- `new_size`: the tree size of a newer tree head.
-- `old_size`: the tree size of an older tree head that the log should prove is
-consistent with the newer tree head.
+Input:
+- "new_size": human-readable tree size of a newer tree head.
+- "old_size": human-readable tree size of an older tree head that the log should
+prove is consistent with the newer tree head.
+
+Output on success:
+- "new_size": human-readable tree size of a newer tree head that the proof
+is based on.
+- "old_size": human-readable tree size of an older tree head that the proof is
+based on.
+- "consistency_path": a node hash in hex.
-Output:
-- On success: status 200 OK and a consistency proof. The response body is
-defined by the following [schema](https://github.com/system-transparency/stfe/blob/design/doc/schema/consistency_proof.schema.json).
-- On failure: a different status code and a human-readable error message.
+The "consistency_path" may be omitted or repeated to represent a consistency
+proof of zero or more node hashes. The order of node hashes follows from our
+hash strategy, see RFC 6962.
### get-leaves
```
POST <base url>/st/v0/get-leaves
```
-Input key-value pairs:
-- `start_size`: zero-based index of the first leaf to retrieve.
-- `end_size`: index of the last leaf to retrieve.
+Input:
+- "start_size": human-readable index of the first leaf to retrieve.
+- "end_size": human-readable index of the last leaf to retrieve.
-Output:
-- On success: status 200 OK and a list of leaves. The response body is
-defined by the following [schema](https://github.com/system-transparency/stfe/blob/design/doc/schema/leaves.schema.json).
-- On failure: a different status code and a human-readable error message.
+Output on success:
+- "shard_hint": `tree_leaf.message.shard_hint` as a human-readable number.
+- "checksum": `tree_leaf.message.checksum` in hex.
+- "signature_scheme": human-readable number that identifies a signature scheme.
+- "signature": `tree_leaf.signature` in hex.
+- "key_hash": `tree_leaf.key_hash` in hex.
-The log may truncate the list of returned leaves. However, it must not be an
-empty list on success.
+All fields may be repeated to return more than one leaf. The first value in
+each list refers to the first leaf, the second value in each list refers to the
+second leaf, etc. The size of each list must match.
+
+The log may return fewer leaves than requested. At least one leaf must be
+returned on HTTP status code 200 OK.
### add-leaf
```
POST <base url>/st/v0/add-leaf
```
-Input key-value pairs:
-- `shard_hint`: the shard hint that the submitter selected.
-- `checksum`: the checksum that the submitter wants to log in base64.
-- `signature_scheme`: the signature scheme that the submitter wants to use.
-- `signature`: the submitter's signature over `tree_leaf.message` in base64.
-- `verification_key`: the submitter's public verification key. It is serialized
-as described in the corresponding RFC, then base64-encoded.
-- `domain_hint`: a domain name that indicates where the public verification-key
-hash can be downloaded in base64. Supported methods: DNS and HTTPS
-(TODO: docdoc).
-
-Output:
-- On success: HTTP 200. The log will _try_ to incorporate the submitted leaf
-into its Merkle tree.
-- On failure: a different status code and a human-readable error message.
+Input:
+- "shard_hint": human-readable number in the log's shard interval that the
+submitter selected.
+- "checksum": the cryptographic checksum that the submitter wants to log in hex.
+- "signature_scheme": human-readable number that identifies the submitter's
+signature scheme.
+- "signature": the submitter's signature over `tree_leaf.message`. The result
+is hex-encoded.
+- "verification_key": the submitter's public verification key. It is serialized
+as described in the corresponding RFC. The result is hex-encoded.
+- "domain_hint": a domain name that indicates where `tree_leaf.key_hash` can be
+retrieved as a DNS TXT resource record in hex.
+
+Output on success:
+- None
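A submitter could assemble the request body along these lines. This is a sketch only: the function name is hypothetical, and the values a caller passes for `shard_hint` and `signature_scheme` are placeholders, since the valid shard interval and the scheme numbers are not specified in this section.

```python
from urllib.parse import urlencode


def build_add_leaf_body(shard_hint: int, checksum: bytes, signature_scheme: int,
                        signature: bytes, verification_key: bytes,
                        domain_hint: str) -> str:
    """Build the percent-encoded key-value body for an add-leaf request."""
    return urlencode({
        # Numbers are sent in human-readable (decimal) form.
        "shard_hint": str(shard_hint),
        "signature_scheme": str(signature_scheme),
        # Binary values are expressed as hex before percent-encoding.
        "checksum": checksum.hex(),
        "signature": signature.hex(),
        "verification_key": verification_key.hex(),
        "domain_hint": domain_hint,
    })
```

The returned string would be sent as the body of the POST request with the content type `application/x-www-form-urlencoded`.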
The submitted entry will not be accepted if the signature is invalid or if the
downloaded verification-key hash does not match. The submitted entry may also
@@ -260,23 +289,20 @@ Public logging should not be assumed until an inclusion proof is available. An
inclusion proof should not be relied upon unless it leads up to a trustworthy
signed tree head. Witness cosigning can make a tree head trustworthy.
-TODO: the log may allow no `domain_hint`? Especially useful for v0 testing.
-
### add-cosignature
```
POST <base url>/st/v0/add-cosignature
```
-Input key-value pairs:
-- `signature`: a base64-encoded signature over a `tree_head` that is fixed for
-some period of time. The cosigning witness retrieves the tree head using the
-`get-signed-tree-head` endpoint with the "stable" type.
-- `key_hash`: a base64-encoded hash of the public verification key that can be
-used to verify the signature.
+Input:
+- "signature": an Ed25519 signature over `tree_head`. The result is
+hex-encoded.
+- "key_hash": a hash of the public verification key that can be used to verify
+the signature. The public verification key is serialized as in RFC 8032, then
+hashed using SHA256. The result is hex-encoded.
-Output:
-- HTTP status 200 OK on success. Otherwise a different status code and a
-human-readable error message.
+Output on success:
+- None
The key-hash can be used to identify which witness signed the log's tree head.
A key-hash, rather than the full verification key, is used to force the verifier