Diffstat (limited to 'doc')
| -rw-r--r-- | doc/api.md | 323 | 
1 files changed, 181 insertions, 142 deletions
@@ -1,7 +1,9 @@
 # System Transparency Logging: API v0
-This document describes details of the System Transparency logging API,
-version 0.  The broader picture is not explained here.  We assume that you have
-read the System Transparency Logging design document.  It can be found [here](https://github.com/system-transparency/stfe/blob/design/doc/design.md).
+This document describes details of the System Transparency logging
+API, version 0.  The broader picture is not explained here.  We assume
+that you have read the System Transparency Logging design document.
+It can be found
+[here](https://github.com/system-transparency/stfe/blob/design/doc/design.md).
 
 **Warning.**  This is a work-in-progress document that may be moved or modified.
 
@@ -17,24 +19,28 @@ The log implements an HTTP(S) API:
 - Binary data is hex-encoded before being transmitted.
 
 The motivation for using a text based key/value format for request and
-response data is that it's simple to parse.  Note that this format is not being
-used for the serialization of signed or logged data, where a more
-well defined and storage efficient format is desirable.
-A submitter may distribute log responses to their end-users in any
+response data is that it's simple to parse.  Note that this format is
+not being used for the serialization of signed or logged data, where a
+more well defined and storage efficient format is desirable.  A
+submitter may distribute log responses to their end-users in any
 format that suits them.  The (de)serialization required for
 _end-users_ is a small subset of Trunnel.  Trunnel is an "idiot-proof"
 wire-format in use by the Tor project.
 
 ## Primitives
 ### Cryptography
-The log uses the same Merkle tree hash strategy as [RFC 6962, §2](https://tools.ietf.org/html/rfc6962#section-2).
-The hash functions must be [SHA256](https://csrc.nist.gov/csrc/media/publications/fips/180/4/final/documents/fips180-4-draft-aug2014.pdf).
-The log must sign tree heads using [Ed25519](https://tools.ietf.org/html/rfc8032).
-The log's witnesses must also sign tree heads using Ed25519.
-
-All other parts that are not Merkle tree related also use SHA256 as the hash
-function.  Using more than one hash function would increases the overall attack
-surface: two hash functions must be collision resistant instead of one.
+The log uses the same Merkle tree hash strategy as
+[RFC 6962,§2](https://tools.ietf.org/html/rfc6962#section-2).
+The hash functions must be
+[SHA256](https://csrc.nist.gov/csrc/media/publications/fips/180/4/final/documents/fips180-4-draft-aug2014.pdf).
+The log must sign tree heads using
+[Ed25519](https://tools.ietf.org/html/rfc8032).  The log's witnesses
+must also sign tree heads using Ed25519.
+
+All other parts that are not Merkle tree related also use SHA256 as
+the hash function.  Using more than one hash function would increases
+the overall attack surface: two hash functions must be collision
+resistant instead of one.
 
 ### Serialization
 Log requests and responses are transmitted as ASCII-encoded key/value
@@ -45,32 +51,36 @@ encoding.  Using hex as opposed to base64 is motivated by it being
 simpler, favoring ease of decoding and encoding over efficiency on the
 wire.
 
-We use the [Trunnel](https://gitweb.torproject.org/trunnel.git) [description language](https://www.seul.org/~nickm/trunnel-manual.html)
+We use the
+[Trunnel](https://gitweb.torproject.org/trunnel.git) [description language](https://www.seul.org/~nickm/trunnel-manual.html)
 to define (de)serialization of data structures that need to be signed or
 inserted into the Merkle tree.  Trunnel is more expressive than the
 [SSH wire format](https://tools.ietf.org/html/rfc4251#section-5).
-It is about as expressive as the [TLS presentation language](https://tools.ietf.org/html/rfc8446#section-3).
-A notable difference is that Trunnel supports integer constraints.  The Trunnel
-language is also readable by humans _and_ machines.  "Obviously correct code"
-can be generated in C and Go.
+It is about as expressive as the
+[TLS presentation language](https://tools.ietf.org/html/rfc8446#section-3).
+A notable difference is that Trunnel supports integer constraints.
+The Trunnel language is also readable by humans _and_ machines.
+"Obviously correct code" can be generated in C and Go.
 
 A fair summary of our Trunnel usage is as follows.
 
-All integers are 64-bit, unsigned, and in network byte order.  Fixed-size byte
-arrays are put into the serialization buffer in-order, starting from the first
-byte.  Variable length byte arrays first declare their length as an integer,
-which is then followed by that number of bytes.  These basic types are
-concatenated to form a collection.  You should not need a general-purpose
-Trunnel (de)serialization parser to work with this format.  If you have one, you
-may use it though.  The main point of using Trunnel is that it makes a simple
-format explicit and unambiguous.
+All integers are 64-bit, unsigned, and in network byte order.
+Fixed-size byte arrays are put into the serialization buffer in-order,
+starting from the first byte.  Variable length byte arrays first
+declare their length as an integer, which is then followed by that
+number of bytes.  These basic types are concatenated to form a
+collection.  You should not need a general-purpose Trunnel
+(de)serialization parser to work with this format.  If you have one,
+you may use it though.  The main point of using Trunnel is that it
+makes a simple format explicit and unambiguous.
 
 #### Merkle tree head
-Tree heads are signed by the log and its witnesses.  It contains a timestamp, a
-tree size, and a root hash.  The timestamp is included so that monitors can
-ensure _liveliness_.  It is the time since the UNIX epoch (January 1, 1970
-00:00:00 UTC) in seconds.  The tree size specifies the current number of
-leaves.  The root hash fixes the structure and content of the Merkle tree.
+Tree heads are signed by the log and its witnesses.  It contains a
+timestamp, a tree size, and a root hash.  The timestamp is included so
+that monitors can ensure _liveliness_.  It is the time since the UNIX
+epoch (January 1, 1970 00:00:00 UTC) in seconds.  The tree size
+specifies the current number of leaves.  The root hash fixes the
+structure and content of the Merkle tree.
 
 ```
 struct tree_head {
@@ -80,14 +90,16 @@ struct tree_head {
 };
 ```
 
-The serialized tree head must be signed using Ed25519.  A witness must not
-cosign a tree head if it is inconsistent with prior history or if the timestamp
-is backdated or future-dated more than 12 hours.
+The serialized tree head must be signed using Ed25519.  A witness must
+not cosign a tree head if it is inconsistent with prior history or if
+the timestamp is backdated or future-dated more than 12 hours.
 
 #### Merkle tree leaf
-The log supports a single leaf type.  It contains a shard hint, a checksum over whatever the submitter wants to log a checksum for,
-a signature that the submitter computed over the shard hint and the checksum, and a hash of the
-submitter's public verification key, that can be used to verify the signature.
+The log supports a single leaf type.  It contains a shard hint, a
+checksum over whatever the submitter wants to log a checksum for, a
+signature that the submitter computed over the shard hint and the
+checksum, and a hash of the submitter's public verification key, that
+can be used to verify the signature.
 
 ```
 struct message {
@@ -102,23 +114,26 @@ struct tree_leaf {
 }
 ```
 
-Unlike X.509 certificates which already have validity ranges, a checksum does not
-carry any such information.  Therefore, we require that the submitter selects a
-_shard hint_.  The selected shard hint must be in the log's _shard interval_.  A
-shard interval is defined by a start time and an end time.  Both ends of the
-shard interval are inclusive and expressed as the number of seconds since
-the UNIX epoch (January 1, 1970 00:00 UTC).
-
-Sharding simplifies log operations because it becomes explicit when a log can be
-shutdown.  A log must only accept logging requests that have valid shard hints.
-A log should only accept logging requests during the predefined shard interval.
-Note that _the submitter's shard hint is not a verified timestamp_.  The
-submitter should set the shard hint as large as possible.  If a roughly verified
-timestamp is needed, a cosigned tree head can be used.
-
-Without a shard hint, the good Samaritan could log all leaves from an earlier
-shard into a newer one.  Not only would that defeat the purpose of sharding, but
-it would also become a potential denial-of-service vector.
+Unlike X.509 certificates which already have validity ranges, a
+checksum does not carry any such information.  Therefore, we require
+that the submitter selects a _shard hint_.  The selected shard hint
+must be in the log's _shard interval_.  A shard interval is defined by
+a start time and an end time.  Both ends of the shard interval are
+inclusive and expressed as the number of seconds since the UNIX epoch
+(January 1, 1970 00:00 UTC).
+
+Sharding simplifies log operations because it becomes explicit when a
+log can be shutdown.  A log must only accept logging requests that
+have valid shard hints.  A log should only accept logging requests
+during the predefined shard interval.  Note that _the submitter's
+shard hint is not a verified timestamp_.  The submitter should set the
+shard hint as large as possible.  If a roughly verified timestamp is
+needed, a cosigned tree head can be used.
+
+Without a shard hint, the good Samaritan could log all leaves from an
+earlier shard into a newer one.  Not only would that defeat the
+purpose of sharding, but it would also become a potential
+denial-of-service vector.
 
 The signed message is composed of the chosen `shard_hint` and the
 submitter's `checksum`.  It must be possible to verify
@@ -136,9 +151,10 @@ verifier to locate the appropriate key and make an explicit trust
 decision.
 
 ## Public endpoints
-Every log has a base URL that identifies it uniquely.  The only constraint is
-that it must be a valid HTTP(S) URL that can have the `/st/v0/<endpoint>` suffix
-appended.  For example, a complete endpoint URL could be
+Every log has a base URL that identifies it uniquely.  The only
+constraint is that it must be a valid HTTP(S) URL that can have the
+`/st/v0/<endpoint>` suffix appended.  For example, a complete endpoint
+URL could be
 `https://log.example.com/2021/st/v0/get-signed-tree-head`.
 
 Input data (in requests) is sent as ASCII key/value pairs as HTTP
@@ -151,11 +167,11 @@ format as the input data, i.e. as ASCII key/value pairs on the format
 `Key: Value`. Example: For sending `tree_size=4711` as output a log
 would send an HTTP message body consisting of `stlog-tree_size: 4711`.
 
-The HTTP status code is 200 OK to indicate success.  A different HTTP status
-code is used to indicate failure.  The log should set the "error" key to a
-human-readable value that describes what went wrong.  For example,
-`error=invalid+signature`, `error=rate+limit+exceeded`, or
-`error=unknown+leaf+hash`.
+The HTTP status code is 200 OK to indicate success.  A different HTTP
+status code is used to indicate failure.  The log should set the
+"error" key to a human-readable value that describes what went wrong.
+For example, `error=invalid+signature`, `error=rate+limit+exceeded`,
+or `error=unknown+leaf+hash`.
 
 ### get-tree-head-cosigned
 Returns the latest cosigned tree head. Used together with
@@ -169,17 +185,22 @@ Input:
 - None
 
 Output on success:
-- "timestamp": `tree_head.timestamp` ASCII-encoded decimal number, seconds since the UNIX epoch.
+- "timestamp": `tree_head.timestamp` ASCII-encoded decimal number,
+  seconds since the UNIX epoch.
 - "tree_size": `tree_head.tree_size` ASCII-encoded decimal number.
 - "root_hash": `tree_head.root_hash` hex-encoded.
-- "signature": hex-encoded Ed25519 signature over `tree_head` serialzed as described in section `Merkle tree head`.
-- "key_hash": a hash of the public verification key (belonging to either the log or to one of its witnesses), which can be used to verify
-the most recent `signature`.  The key is encoded as defined in [RFC 8032, section 5.1.2](https://tools.ietf.org/html/rfc8032#section-5.1.2), and then
-hashed using SHA256.  The hash value is hex-encoded.
+- "signature": hex-encoded Ed25519 signature over `tree_head`
+  serialzed as described in section `Merkle tree head`.
+- "key_hash": a hash of the public verification key (belonging to
+  either the log or to one of its witnesses), which can be used to
+  verify the most recent `signature`.  The key is encoded as defined
+  in [RFC 8032, section 5.1.2](https://tools.ietf.org/html/rfc8032#section-5.1.2),
+  and then hashed using SHA256.  The hash value is hex-encoded.
 
 The "signature" and "key_hash" fields may repeat. The first signature
-corresponds to the first key hash, the second signature corresponds to the
-second key hash, etc.  The number of signatures and key hashes must match.
+corresponds to the first key hash, the second signature corresponds to
+the second key hash, etc.  The number of signatures and key hashes
+must match.
 
 ### get-tree-head-to-sign
 Returns the latest tree head to be signed by log witnesses. Used by
@@ -193,20 +214,24 @@ Input:
 - None
 
 Output on success:
-- "timestamp": `tree_head.timestamp` ASCII-encoded decimal number, seconds since the UNIX epoch.
+- "timestamp": `tree_head.timestamp` ASCII-encoded decimal number,
+  seconds since the UNIX epoch.
 - "tree_size": `tree_head.tree_size` ASCII-encoded decimal number.
 - "root_hash": `tree_head.root_hash` hex-encoded.
-- "signature": hex-encoded Ed25519 signature over `tree_head` serialzed as described in section `Merkle tree head`.
-- "key_hash": a hash of the log's public verification key, which can be used to verify
-`signature`.  The key is encoded as defined in [RFC 8032, section 5.1.2](https://tools.ietf.org/html/rfc8032#section-5.1.2), and then
-hashed using SHA256.  The hash value is hex-encoded.
+- "signature": hex-encoded Ed25519 signature over `tree_head`
+  serialzed as described in section `Merkle tree head`.
+- "key_hash": a hash of the log's public verification key, which can
+  be used to verify `signature`.  The key is encoded as defined in
+  [RFC 8032, section 5.1.2](https://tools.ietf.org/html/rfc8032#section-5.1.2),
+  and then hashed using SHA256.  The hash value is hex-encoded.
 
 There is exactly one `signature` and one `key_hash` field. The
 `key_hash` refers to the log's public verification key.
 
 
 ### get-tree-head-latest
-Returns the latest tree head, signed only by the log. Used for debugging purposes.
+Returns the latest tree head, signed only by the log. Used for
+debugging purposes.
 
 ```
 GET <base url>/st/v0/get-tree-head-latest
@@ -216,14 +241,16 @@ Input:
 - None
 
 Output on success:
-- "timestamp": `tree_head.timestamp` ASCII-encoded decimal number, seconds since the UNIX epoch.
+- "timestamp": `tree_head.timestamp` ASCII-encoded decimal number,
+  seconds since the UNIX epoch.
 - "tree_size": `tree_head.tree_size` ASCII-encoded decimal number.
 - "root_hash": `tree_head.root_hash` hex-encoded.
-- "signature": hex-encoded Ed25519 signature over `tree_head` serialzed as described in section `Merkle tree head`.
+- "signature": hex-encoded Ed25519 signature over `tree_head`
+  serialzed as described in section `Merkle tree head`.
 - "key_hash": a hash of the log's public verification key that can be
-used to verify `signature`.  The key is encoded as defined in
-[RFC 8032, section 5.1.2](https://tools.ietf.org/html/rfc8032#section-5.1.2),
-and then hashed using SHA256.  The hash value is hex-encoded.
+  used to verify `signature`.  The key is encoded as defined in
+  [RFC 8032, section 5.1.2](https://tools.ietf.org/html/rfc8032#section-5.1.2),
+  and then hashed using SHA256.  The hash value is hex-encoded.
 
 There is exactly one `signature` and one `key_hash` field. The
 `key_hash` refers to the log's public verification key.
@@ -235,21 +262,22 @@ POST <base url>/st/v0/get-proof-by-hash
 ```
 
 Input:
-- "leaf_hash": a hex-encoded leaf hash that identifies which `tree_leaf` the
-log should prove inclusion for.  The leaf hash is computed using the RFC 6962
-hashing strategy.  In other words, `SHA256(0x00 | tree_leaf)`.
-- "tree_size": a human-readable tree size of the tree head that the proof should
-be based on.
+- "leaf_hash": a hex-encoded leaf hash that identifies which
+  `tree_leaf` the log should prove inclusion for.  The leaf hash is
+  computed using the RFC 6962 hashing strategy.  In other words,
+  `SHA256(0x00 | tree_leaf)`.
+- "tree_size": a human-readable tree size of the tree head that the
+  proof should be based on.
 
 Output on success:
 - "tree_size": human-readable tree size that the proof is based on.
-- "leaf_index": human-readable zero-based index of the leaf that the proof is
-based on.
+- "leaf_index": human-readable zero-based index of the leaf that the
+  proof is based on.
 - "inclusion_path": a node hash in hex.
 
-The "inclusion_path" may be omitted or repeated to represent an inclusion proof
-of zero or more node hashes.  The order of node hashes follow from our hash
-strategy, see RFC 6962.
+The "inclusion_path" may be omitted or repeated to represent an
+inclusion proof of zero or more node hashes.  The order of node hashes
+follow from our hash strategy, see RFC 6962.
 
 ### get-consistency-proof
 ```
@@ -258,19 +286,19 @@ POST <base url>/st/v0/get-consistency-proof
 
 Input:
 - "new_size": human-readable tree size of a newer tree head.
-- "old_size": human-readable tree size of an older tree head that the log should
-prove is consistent with the newer tree head.
+- "old_size": human-readable tree size of an older tree head that the
+  log should prove is consistent with the newer tree head.
 
 Output on success:
-- "new_size": human-readable tree size of a newer tree head that the proof
-is based on.
-- "old_size": human-readable tree size of an older tree head that the proof is
-based on.
+- "new_size": human-readable tree size of a newer tree head that the
+  proof is based on.
+- "old_size": human-readable tree size of an older tree head that the
+  proof is based on.
 - "consistency_path": a node hash in hex.
 
-The "consistency_path" may be omitted or repeated to represent a consistency
-proof of zero or more node hashes.  The order of node hashes follow from our
-hash strategy, see RFC 6962.
+The "consistency_path" may be omitted or repeated to represent a
+consistency proof of zero or more node hashes.  The order of node
+hashes follow from our hash strategy, see RFC 6962.
 
 ### get-leaves
 ```
@@ -282,18 +310,21 @@ Input:
 - "end_size": human-readable index of the last leaf to retrieve.
 
 Output on success:
-- "shard_hint": `tree_leaf.message.shard_hint` as a human-readable number.
+- "shard_hint": `tree_leaf.message.shard_hint` as a human-readable
+  number.
 - "checksum": `tree_leaf.message.checksum` in hex.
-- "signature_scheme": human-readable number that identifies a signature scheme.
+- "signature_scheme": human-readable number that identifies a
+  signature scheme.
 - "signature": `tree_leaf.signature` in hex.
 - "key_hash": `tree_leaf.key_hash` in hex.
 
-All fields may be repeated to return more than one leaf.  The first value in
-each list refers to the first leaf, the second value in each list refers to the
-second leaf, etc.  The size of each list must match.
+All fields may be repeated to return more than one leaf.  The first
+value in each list refers to the first leaf, the second value in each
+list refers to the second leaf, etc.  The size of each list must
+match.
 
-The log may return fewer leaves than requested.  At least one leaf must be
-returned on HTTP status code 200 OK.
+The log may return fewer leaves than requested.  At least one leaf
+must be returned on HTTP status code 200 OK.
 
 ### add-leaf
 ```
@@ -301,31 +332,38 @@ POST <base url>/st/v0/add-leaf
 ```
 
 Input:
-- "shard_hint": human-readable decimal number in the log's shard interval that the
-submitter selected.
-- "checksum": the cryptographic checksum that the submitter wants to log in hex. note: fixed length 64 bytes, validated by the server somehow
-- "signature": the submitter's signature over `tree_leaf.message`.  The result
-is hex-encoded.
-- "verification_key": the submitter's public verification key.  The key is encoded as defined in [RFC 8032, section 5.1.2](https://tools.ietf.org/html/rfc8032#section-5.1.2).  The result is hex-encoded.
-- "domain_hint": a domain name that indicates where `tree_leaf.key_hash` can be
-retrieved as a DNS TXT resource record in hex.
+- "shard_hint": human-readable decimal number in the log's shard
+  interval that the submitter selected.
+- "checksum": the cryptographic checksum that the submitter wants to
+  log in hex. note: fixed length 64 bytes, validated by the server
+  somehow
+- "signature": the submitter's signature over `tree_leaf.message`.
+  The result is hex-encoded.
+- "verification_key": the submitter's public verification key.  The
+  key is encoded as defined in
+  [RFC 8032, section 5.1.2](https://tools.ietf.org/html/rfc8032#section-5.1.2).   The result is hex-encoded.
+- "domain_hint": a domain name that indicates where
+  `tree_leaf.key_hash` can be retrieved as a DNS TXT resource record
+  in hex.
 
 Output on success:
 - None
 
-The submitted entry will not be accepted if the signature is invalid or if the
-downloaded verification-key hash does not match.  The submitted entry may also
-not be accepted if the second-level domain name exceeded its rate limit.  By
-coupling every add-leaf request with a second-level domain, it becomes more
-difficult to spam the log.  You would need an excessive number of domain names.
-This becomes costly if free domain names are rejected.
+The submitted entry will not be accepted if the signature is invalid
+or if the downloaded verification-key hash does not match.  The
+submitted entry may also not be accepted if the second-level domain
+name exceeded its rate limit.  By coupling every add-leaf request with
+a second-level domain, it becomes more difficult to spam the log.  You
+would need an excessive number of domain names.  This becomes costly
+if free domain names are rejected.
 
-The log does not publish domain-name to key bindings because key management is
-more complex than that.
+The log does not publish domain-name to key bindings because key
+management is more complex than that.
 
-Public logging should not be assumed until an inclusion proof is available.  An
-inclusion proof should not be relied upon unless it leads up to a trustworthy
-signed tree head.  Witness cosigning can make a tree head trustworthy.
+Public logging should not be assumed until an inclusion proof is
+available.  An inclusion proof should not be relied upon unless it
+leads up to a trustworthy signed tree head.  Witness cosigning can
+make a tree head trustworthy.
 
 ### add-cosignature
 ```
@@ -334,25 +372,26 @@ POST <base url>/st/v0/add-cosignature
 
 Input:
 - "signature": an Ed25519 signature over `tree_head`.  The result is
-hex-encoded.
-- "key_hash": a hash of the witness' public verification key that can be used
-to verify the signature.  The key is encoded as defined in [RFC 8032,
-section 5.1.2](https://tools.ietf.org/html/rfc8032#section-5.1.2), and
-then hashed using SHA256.  The hash value is hex-encoded.
+  hex-encoded.
+- "key_hash": a hash of the witness' public verification key that can
+  be used to verify the signature.  The key is encoded as defined in
+  [RFC 8032, section 5.1.2](https://tools.ietf.org/html/rfc8032#section-5.1.2),
+  and then hashed using SHA256.  The hash value is hex-encoded.
 
 Output on success:
 - None
 
-The key-hash can be used to identify which witness signed the log's tree head.
-A key-hash, rather than the full verification key, is used to force the verifier
-to locate the appropriate key and make an explicit trust decision.
+The key-hash can be used to identify which witness signed the log's
+tree head.  A key-hash, rather than the full verification key, is used
+to force the verifier to locate the appropriate key and make an
+explicit trust decision.
 
 ## Summary of log parameters
-- **Public key**: an Ed25519 verification key that can be used to verify the
-log's tree head signatures.
+- **Public key**: an Ed25519 verification key that can be used to
+  verify the log's tree head signatures.
 - **Log identifier**: the hashed public verification key using SHA256.
-- **Shard interval**: the time during which the log accepts logging requests.
-The shard interval's start and end are inclusive and expressed as the number of
-seconds since the UNIX epoch.
-- **Base URL**: where the log can be reached over HTTP(S).  It is the prefix
-before a version-0 specific endpoint.
+- **Shard interval**: the time during which the log accepts logging
+  requests.  The shard interval's start and end are inclusive and
+  expressed as the number of seconds since the UNIX epoch.
+- **Base URL**: where the log can be reached over HTTP(S).  It is the
+  prefix before a version-0 specific endpoint.
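
Note for reviewers: the following is a minimal sketch, in Python, of three computations that the API text in this diff describes. It is not taken from the repository. The function names are hypothetical. The `tree_head` field order (timestamp, tree size, root hash) is an assumption based on the order the prose lists them in, since the struct body falls outside the hunks shown here. Integers are packed as 64-bit unsigned big-endian values, the leaf hash is `SHA256(0x00 | tree_leaf)`, and a key hash is SHA256 over the 32-byte Ed25519 public key encoding, hex-encoded.

```python
import hashlib
import struct


def serialize_tree_head(timestamp: int, tree_size: int, root_hash: bytes) -> bytes:
    """Serialize a tree head for signing.

    Assumed layout (not confirmed by the hunks above):
    u64 timestamp | u64 tree_size | 32-byte root hash,
    with integers in network byte order.
    """
    if len(root_hash) != 32:
        raise ValueError("root_hash must be a 32-byte SHA256 digest")
    # ">QQ" packs two unsigned 64-bit integers, big-endian.
    return struct.pack(">QQ", timestamp, tree_size) + root_hash


def leaf_hash(serialized_leaf: bytes) -> str:
    """Hex-encoded RFC 6962 leaf hash: SHA256(0x00 | tree_leaf)."""
    return hashlib.sha256(b"\x00" + serialized_leaf).hexdigest()


def key_hash(public_key: bytes) -> str:
    """Hex-encoded SHA256 of a 32-byte Ed25519 public key encoding."""
    if len(public_key) != 32:
        raise ValueError("an encoded Ed25519 public key is 32 bytes")
    return hashlib.sha256(public_key).hexdigest()


if __name__ == "__main__":
    # Placeholder values only; a real root hash comes from the log.
    head = serialize_tree_head(1617000000, 4711, bytes(32))
    print(len(head), head[:16].hex())
    print(leaf_hash(b"example serialized leaf"))
    print(key_hash(bytes(32)))
```

The hex strings returned by `leaf_hash` and `key_hash` are in the form the endpoints expect for "leaf_hash" and "key_hash"; the bytes returned by `serialize_tree_head` would be the input to Ed25519 signing or verification, which is left out of this sketch.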
