Diffstat (limited to 'doc')
-rw-r--r-- doc/api.md 206
1 file changed, 116 insertions, 90 deletions
@@ -8,28 +8,18 @@ This is a work-in-progress document that may be moved or modified.
 
 ## Overview
 The log implements an HTTP(S) API:
-- Requests that add data to the log use the HTTP POST method.  The HTTP content
-type is `application/x-www-form-urlencoded`.  The posted data are key-value
-pairs.  Binary data must be base64-encoded.
-- Requests that retrieve data from the log use the HTTP GET method.  The HTTP
-content type is `application/x-www-form-urlencoded`.  Input parameters are
-key-value pairs.
-- Responses are JSON objects.  The HTTP content type is `application/json`.
-- Error messages are human-readable strings.  The HTTP content type is
-`text/plain`.
-
-We decided to use these web formats for requests and responses because the log
-is running as an HTTP(S) service.  In other words, anyone that interacts with
-the log is most likely using these formats already.  The other benefit is that
-all requests and responses are human-readable.  This makes it easier to
-understand the protocol, troubleshoot issues, and copy-paste.  We favored
-compatibility and understandability over a wire-efficient format.
-
-Note that we are not using JSON for signed and/or logged data.  In other words,
-a submitter that wishes to distribute log responses to their user base in a
-different format may do so.  The forced (de)serialization parser on _end-users_
-is a small subset of Trunnel.  Trunnel is an "idiot-proof" wire-format that the
-Tor project uses.
+- Requests that add data to the log use the HTTP POST method.
+- Request that retrieve data from the log use the HTTP GET method.
+- The HTTP content type is `application/x-www-form-urlencoded` for requests and
+responses.  This means that all input and output are expressed as key-value
+pairs.  Binary data must be hex-encoded.
+
+We decided to use percent encoding for requests and responses because it is a
+_simple format_ that is commonly used on the web.  We are not using percent
+encoding for signed and/or logged data.  In other words, a submitter may
+distribute log responses to their end-users in a different format that suit
+them.  The forced (de)serialization parser on _end-users_ is a small subset of
+Trunnel.  Trunnel is an "idiot-proof" wire-format that the Tor project uses.
 
 ## Primitives
 ### Cryptography
@@ -49,6 +39,13 @@ padding.  Supporting RSA is suboptimal, but excluding it would make the log
 useless for many possible adopters.
 
 ### Serialization
+Log requests and responses are percent encoded.  Percent encoding is a smaller
+dependency than an alternative parser like JSON.  It is comparable to rolling
+your own minimalistic line-terminated format.  Some input and output data is
+binary: cryptographic hashes and signatures.  Binary data must be expressed as
+hex before percent-encoding it.  We decided to use hex as opposed to base64
+because it is simpler, favoring simplicity over efficiency on the wire.
+
 We use the [Trunnel](https://gitweb.torproject.org/trunnel.git) [description language](https://www.seul.org/~nickm/trunnel-manual.html)
 to define (de)serialization of data structures that need to be signed or
 inserted into the Merkle tree.  Trunnel is more expressive than the
@@ -62,13 +59,12 @@ A fair summary of our Trunnel usage is as follows.
 
 All integers are 64-bit, unsigned, and in network byte order.  A fixed-size byte
 array is put into the serialization buffer in-order, starting from the first
-byte.  These basic types are concatenated to form a collection.  You should not
-need a general-purpose Trunnel (de)serialization parser to work with this
-format.  If you have one, you may use it though.  The main point of using
-Trunnel is that it makes a simple format explicit and unambiguous.
-
-TODO: URL-encode _or_ JSON?  I think we should only need one.  Always doing HTTP
-POST would also ensure that input parameters don't show up in web server logs.
+byte.  A variable length byte array first declares its length as an integer,
+which is then followed by that number of bytes.  These basic types are
+concatenated to form a collection.  You should not need a general-purpose
+Trunnel (de)serialization parser to work with this format.  If you have one, you
+may use it though.  The main point of using Trunnel is that it makes a simple
+format explicit and unambiguous.
 
 #### Merkle tree head
 Tree heads are signed by the log and its witnesses.  It contains a timestamp, a
@@ -160,91 +156,124 @@ that it must be a valid HTTP(S) URL that can have the `/st/v0/<endpoint>` suffix
 appended.  For example, a complete endpoint URL could be
 `https://log.example.com/2021/st/v0/get-signed-tree-head`.
 
+The HTTP status code is 200 OK to indicate success.  A different HTTP status
+code is used to indicate failure.  The log should set the "error" key to a
+human-readable value that describes what went wrong.  For example,
+`error=invalid+signature`, `error=rate+limit+exceeded`, or
+`error=unknown+leaf+hash`.
+
 ### get-signed-tree-head
 ```
 GET <base url>/st/v0/get-signed-tree-head
 ```
 
-Input key-value pairs:
-- `type`: either the string "latest", "stable", or "cosigned".
-	- "latest": ask for the most recent signed tree head.
-	- "stable": ask for a recent signed tree head that is fixed for some period
+Input:
+- "type": either the string "latest", "stable", or "cosigned".
+	- latest: ask for the most recent signed tree head.
+	- stable: ask for a recent signed tree head that is fixed for some period
 	  of time.
-	- "cosigned": ask for a recent cosigned tree head.
-
-Output:
-- On success: status 200 OK and a signed tree head.  The response body is
-defined by the following [schema](https://github.com/system-transparency/stfe/blob/design/doc/schema/sth.schema.json).
-- On failure: a different status code and a human-readable error message.
+	- cosigned: ask for a recent cosigned tree head.
+
+Output on success:
+- "timestamp": `tree_head.timestamp` as a human-readable number.
+- "tree_size": `tree_head.tree_size` as a human-readable number.
+- "root_hash": `tree_head.root_hash` in hex.
+- "signature": an Ed25519 signature over `tree_head`.  The result is
+hex-encoded.
+- "key_hash": a hash of the public verification key that can be used to verify
+the signature.  The public verification key is serialized as in RFC 8032, then
+hashed using SHA256.  The result is hex-encoded.
+
+The "signature" and "key_hash" fields may repeat. The first signature
+corresponds to the first key hash, the second signature corresponds to the
+second key hash, etc.  The number of signatures and key hashes must match.
 
 ### get-proof-by-hash
 ```
 POST <base url>/st/v0/get-proof-by-hash
 ```
 
-Input key-value pairs:
-- `leaf_hash`: a base64-encoded leaf hash that identifies which `tree_leaf` the
+Input:
+- "leaf_hash": a hex-encoded leaf hash that identifies which `tree_leaf` the
 log should prove inclusion for.  The leaf hash is computed using the RFC 6962
-hashing strategy.  In other words, `H(0x00 | tree_leaf)`.
-- `tree_size`: the tree size of a tree head that the proof should be based on.
+hashing strategy.  In other words, `SHA256(0x00 | tree_leaf)`.
+- "tree_size": a human-readable tree size of the tree head that the proof should
+be based on.
 
-Output:
-- On success: status 200 OK and an inclusion proof.  The response body is
-defined by the following [schema](https://github.com/system-transparency/stfe/blob/design/doc/schema/inclusion_proof.schema.json).
-- On failure: a different status code and a human-readable error message.
+Output on success:
+- "tree_size": human-readable tree size that the proof is based on.
+- "leaf_index": human-readable zero-based index of the leaf that the proof is
+based on.
+- "inclusion_path": a node hash in hex.
+
+The "inclusion_path" may be omitted or repeated to represent an inclusion proof
+of zero or more node hashes.  The order of node hashes follow from our hash
+strategy, see RFC 6962.
 
 ### get-consistency-proof
 ```
 POST <base url>/st/v0/get-consistency-proof
 ```
 
-Input key-value pairs:
-- `new_size`: the tree size of a newer tree head.
-- `old_size`: the tree size of an older tree head that the log should prove is
-consistent with the newer tree head.
+Input:
+- "new_size": human-readable tree size of a newer tree head.
+- "old_size": human-readable tree size of an older tree head that the log should
+prove is consistent with the newer tree head.
+
+Output on success:
+- "new_size": human-readable tree size of a newer tree head that the proof
+is based on.
+- "old_size": human-readable tree size of an older tree head that the proof is
+based on.
+- "consistency_path": a node hash in hex.
 
-Output:
-- On success: status 200 OK and a consistency proof.  The response body is
-defined by the following [schema](https://github.com/system-transparency/stfe/blob/design/doc/schema/consistency_proof.schema.json).
-- On failure: a different status code and a human-readable error message.
+The "consistency_path" may be omitted or repeated to represent a consistency
+proof of zero or more node hashes.  The order of node hashes follow from our
+hash strategy, see RFC 6962.
 
 ### get-leaves
 ```
 POST <base url>/st/v0/get-leaves
 ```
 
-Input key-value pairs:
-- `start_size`: zero-based index of the first leaf to retrieve.
-- `end_size`: index of the last leaf to retrieve.
+Input:
+- "start_size": human-readable index of the first leaf to retrieve.
+- "end_size": human-readable index of the last leaf to retrieve.
 
-Output:
-- On success: status 200 OK and a list of leaves.  The response body is
-defined by the following [schema](https://github.com/system-transparency/stfe/blob/design/doc/schema/leaves.schema.json).
-- On failure: a different status code and a human-readable error message.
+Output on success:
+- "shard_hint": `tree_leaf.message.shard_hint` as a human-readable number.
+- "checksum": `tree_leaf.message.checksum` in hex.
+- "signature_scheme": human-readable number that identifies a signature scheme.
+- "signature": `tree_leaf.signature` in hex.
+- "key_hash": `tree_leaf.key_hash` in hex.
 
-The log may truncate the list of returned leaves.  However, it must not be an
-empty list on success.
+All fields may be repeated to return more than one leaf.  The first value in
+each list refers to the first leaf, the second value in each list refers to the
+second leaf, etc.  The size of each list must match.
+
+The log may return fewer leaves than requested.  At least one leaf must be
+returned on HTTP status code 200 OK.
 
 ### add-leaf
 ```
 POST <base url>/st/v0/add-leaf
 ```
 
-Input key-value pairs:
-- `shard_hint`: the shard hint that the submitter selected.
-- `checksum`: the checksum that the submitter wants to log in base64.
-- `signature_scheme`: the signature scheme that the submitter wants to use.
-- `signature`: the submitter's signature over `tree_leaf.message` in base64.
-- `verification_key`: the submitter's public verification key.  It is serialized
-as described in the corresponding RFC, then base64-encoded.
-- `domain_hint`: a domain name that indicates where the public verification-key
-hash can be downloaded in base64.  Supported methods: DNS and HTTPS
-(TODO: docdoc).
-
-Output:
-- On success: HTTP 200.  The log will _try_ to incorporate the submitted leaf
-into its Merkle tree.
-- On failure: a different status code and a human-readable error message.
+Input:
+- "shard_hint": human-readable number in the log's shard interval that the
+submitter selected.
+- "checksum": the cryptographic checksum that the submitter wants to log in hex.
+- "signature_scheme": human-readable number that identifies the submitter's
+signature scheme.
+- "signature": the submitter's signature over `tree_leaf.message`.  The result
+is hex-encoded.
+- "verification_key": the submitter's public verification key.  It is serialized
+as described in the corresponding RFC.  The result is hex-encoded.
+- "domain_hint": a domain name that indicates where `tree_leaf.key_hash` can be
+retrieved as a DNS TXT resource record in hex.
+
+Output on success:
+- None
 
 The submitted entry will not be accepted if the signature is invalid or if the
 downloaded verification-key hash does not match.  The submitted entry may also
@@ -260,23 +289,20 @@ Public logging should not be assumed until an inclusion proof is available.  An
 inclusion proof should not be relied upon unless it leads up to a trustworthy
 signed tree head.  Witness cosigning can make a tree head trustworthy.
 
-TODO: the log may allow no `domain_hint`?  Especially useful for v0 testing.
-
 ### add-cosignature
 ```
 POST <base url>/st/v0/add-cosignature
 ```
 
-Input key-value pairs:
-- `signature`: a base64-encoded signature over a `tree_head` that is fixed for
-some period of time. The cosigning witness retrieves the tree head using the
-`get-signed-tree-head` endpoint with the "stable" type.
-- `key_hash`: a base64-encoded hash of the public verification key that can be
-used to verify the signature.
+Input:
+- "signature": an Ed25519 signature over `tree_head`.  The result is
+hex-encoded.
+- "key_hash": a hash of the public verification key that can be used to verify
+the signature.  The public verification key is serialized as in RFC 8032, then
+hashed using SHA256.  The result is hex-encoded.
 
-Output:
-- HTTP status 200 OK on success.  Otherwise a different status code and a
-human-readable error message.
+Output on success:
+- None
 
 The key-hash can be used to identify which witness signed the log's tree head.
 A key-hash, rather than the full verification key, is used to force the verifier
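Reviewer note: to illustrate what the new percent-encoded format asks of a client, here is a minimal Python sketch. It is not part of the patch. It assumes that key-value pairs are joined with `&`, as is usual for `application/x-www-form-urlencoded`. The function names `parse_signed_tree_head`, `encode_add_cosignature`, and `leaf_hash` are hypothetical. The sketch parses a `get-signed-tree-head` response with repeated "signature" and "key_hash" fields, builds an `add-cosignature` request body, and computes the RFC 6962 leaf hash `SHA256(0x00 | tree_leaf)`.

```python
import hashlib
from urllib.parse import parse_qs, urlencode


def parse_signed_tree_head(body: str) -> dict:
    """Parse a percent-encoded get-signed-tree-head response body.

    Assumption: pairs are separated by "&" and binary values are hex-encoded.
    """
    # parse_qs collects repeated keys into lists, in order of appearance.
    fields = parse_qs(body, strict_parsing=True)
    signatures = [bytes.fromhex(s) for s in fields.get("signature", [])]
    key_hashes = [bytes.fromhex(k) for k in fields.get("key_hash", [])]
    if len(signatures) != len(key_hashes):
        raise ValueError("number of signatures and key hashes must match")
    return {
        "timestamp": int(fields["timestamp"][0]),
        "tree_size": int(fields["tree_size"][0]),
        "root_hash": bytes.fromhex(fields["root_hash"][0]),
        # The i-th signature corresponds to the i-th key hash.
        "cosignatures": list(zip(key_hashes, signatures)),
    }


def encode_add_cosignature(signature: bytes, key_hash: bytes) -> str:
    """Build an add-cosignature request body: hex first, then percent-encode."""
    return urlencode({"signature": signature.hex(), "key_hash": key_hash.hex()})


def leaf_hash(tree_leaf: bytes) -> bytes:
    """RFC 6962 leaf hash over a serialized tree_leaf: SHA256(0x00 | tree_leaf)."""
    return hashlib.sha256(b"\x00" + tree_leaf).digest()
```

Because hex only uses characters that need no escaping, the percent-encoding step leaves hex values unchanged; only human-readable values such as error strings are affected by it.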
