Diffstat (limited to 'doc/api.md')
-rw-r--r-- doc/api.md 323
1 files changed, 181 insertions, 142 deletions
diff --git a/doc/api.md b/doc/api.md
index 638b753..5b7cb19 100644
--- a/doc/api.md
+++ b/doc/api.md
@@ -1,7 +1,9 @@
# System Transparency Logging: API v0
-This document describes details of the System Transparency logging API,
-version 0. The broader picture is not explained here. We assume that you have
-read the System Transparency Logging design document. It can be found [here](https://github.com/system-transparency/stfe/blob/design/doc/design.md).
+This document describes details of the System Transparency logging
+API, version 0. The broader picture is not explained here. We assume
+that you have read the System Transparency Logging design document.
+It can be found
+[here](https://github.com/system-transparency/stfe/blob/design/doc/design.md).
**Warning.**
This is a work-in-progress document that may be moved or modified.
@@ -17,24 +19,28 @@ The log implements an HTTP(S) API:
- Binary data is hex-encoded before being transmitted.
The motivation for using a text based key/value format for request and
-response data is that it's simple to parse. Note that this format is not being
-used for the serialization of signed or logged data, where a more
-well defined and storage efficient format is desirable.
-A submitter may distribute log responses to their end-users in any
+response data is that it's simple to parse. Note that this format is
+not being used for the serialization of signed or logged data, where a
+more well defined and storage efficient format is desirable. A
+submitter may distribute log responses to their end-users in any
format that suits them. The (de)serialization required for
_end-users_ is a small subset of Trunnel. Trunnel is an "idiot-proof"
wire-format in use by the Tor project.
## Primitives
### Cryptography
-The log uses the same Merkle tree hash strategy as [RFC 6962, §2](https://tools.ietf.org/html/rfc6962#section-2).
-The hash functions must be [SHA256](https://csrc.nist.gov/csrc/media/publications/fips/180/4/final/documents/fips180-4-draft-aug2014.pdf).
-The log must sign tree heads using [Ed25519](https://tools.ietf.org/html/rfc8032).
-The log's witnesses must also sign tree heads using Ed25519.
-
-All other parts that are not Merkle tree related also use SHA256 as the hash
-function. Using more than one hash function would increases the overall attack
-surface: two hash functions must be collision resistant instead of one.
+The log uses the same Merkle tree hash strategy as
+[RFC 6962, §2](https://tools.ietf.org/html/rfc6962#section-2).
+The hash functions must be
+[SHA256](https://csrc.nist.gov/csrc/media/publications/fips/180/4/final/documents/fips180-4-draft-aug2014.pdf).
+The log must sign tree heads using
+[Ed25519](https://tools.ietf.org/html/rfc8032). The log's witnesses
+must also sign tree heads using Ed25519.
+
+All other parts that are not Merkle tree related also use SHA256 as
+the hash function. Using more than one hash function would increase
+the overall attack surface: two hash functions must be collision
+resistant instead of one.
### Serialization
Log requests and responses are transmitted as ASCII-encoded key/value
@@ -45,32 +51,36 @@ encoding. Using hex as opposed to base64 is motivated by it being
simpler, favoring ease of decoding and encoding over efficiency on the
wire.
-We use the [Trunnel](https://gitweb.torproject.org/trunnel.git) [description language](https://www.seul.org/~nickm/trunnel-manual.html)
+We use the
+[Trunnel](https://gitweb.torproject.org/trunnel.git) [description language](https://www.seul.org/~nickm/trunnel-manual.html)
to define (de)serialization of data structures that need to be signed or
inserted into the Merkle tree. Trunnel is more expressive than the
[SSH wire format](https://tools.ietf.org/html/rfc4251#section-5).
-It is about as expressive as the [TLS presentation language](https://tools.ietf.org/html/rfc8446#section-3).
-A notable difference is that Trunnel supports integer constraints. The Trunnel
-language is also readable by humans _and_ machines. "Obviously correct code"
-can be generated in C and Go.
+It is about as expressive as the
+[TLS presentation language](https://tools.ietf.org/html/rfc8446#section-3).
+A notable difference is that Trunnel supports integer constraints.
+The Trunnel language is also readable by humans _and_ machines.
+"Obviously correct code" can be generated in C and Go.
A fair summary of our Trunnel usage is as follows.
-All integers are 64-bit, unsigned, and in network byte order. Fixed-size byte
-arrays are put into the serialization buffer in-order, starting from the first
-byte. Variable length byte arrays first declare their length as an integer,
-which is then followed by that number of bytes. These basic types are
-concatenated to form a collection. You should not need a general-purpose
-Trunnel (de)serialization parser to work with this format. If you have one, you
-may use it though. The main point of using Trunnel is that it makes a simple
-format explicit and unambiguous.
+All integers are 64-bit, unsigned, and in network byte order.
+Fixed-size byte arrays are put into the serialization buffer in-order,
+starting from the first byte. Variable length byte arrays first
+declare their length as an integer, which is then followed by that
+number of bytes. These basic types are concatenated to form a
+collection. You should not need a general-purpose Trunnel
+(de)serialization parser to work with this format. If you have one,
+you may use it though. The main point of using Trunnel is that it
+makes a simple format explicit and unambiguous.
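The summary above fits in a few lines of code. The following Python sketch is an illustration only: the helper names `put_u64`, `put_fixed`, and `put_var` are made up for this example and are not part of Trunnel or of the log's code base.

```python
import struct


def put_u64(value: int) -> bytes:
    """Encode an integer as 64-bit, unsigned, network byte order."""
    return struct.pack(">Q", value)


def put_fixed(data: bytes, size: int) -> bytes:
    """Emit a fixed-size byte array in order, starting from the first byte."""
    if len(data) != size:
        raise ValueError(f"expected {size} bytes, got {len(data)}")
    return data


def put_var(data: bytes) -> bytes:
    """Emit a variable-length byte array: its length as an integer, then the bytes."""
    return put_u64(len(data)) + data


# A collection is the concatenation of its basic types.
example = put_u64(4711) + put_fixed(b"\x00" * 32, 32) + put_var(b"abc")
```

Because every integer has the same width and byte order, a reader can decode such a buffer with simple offset arithmetic.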
#### Merkle tree head
-Tree heads are signed by the log and its witnesses. It contains a timestamp, a
-tree size, and a root hash. The timestamp is included so that monitors can
-ensure _liveliness_. It is the time since the UNIX epoch (January 1, 1970
-00:00:00 UTC) in seconds. The tree size specifies the current number of
-leaves. The root hash fixes the structure and content of the Merkle tree.
+Tree heads are signed by the log and its witnesses. A tree head contains a
+timestamp, a tree size, and a root hash. The timestamp is included so
+that monitors can ensure _liveliness_. It is the time since the UNIX
+epoch (January 1, 1970 00:00:00 UTC) in seconds. The tree size
+specifies the current number of leaves. The root hash fixes the
+structure and content of the Merkle tree.
```
struct tree_head {
@@ -80,14 +90,16 @@ struct tree_head {
};
```
-The serialized tree head must be signed using Ed25519. A witness must not
-cosign a tree head if it is inconsistent with prior history or if the timestamp
-is backdated or future-dated more than 12 hours.
+The serialized tree head must be signed using Ed25519. A witness must
+not cosign a tree head if it is inconsistent with prior history or if
+the timestamp is backdated or future-dated more than 12 hours.
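As a rough illustration of the two rules above, the Python sketch below serializes a tree head and applies the 12-hour timestamp check. The field order (timestamp, tree size, root hash) is an assumption taken from the order in which the prose lists the fields, and the function names are hypothetical. The consistency check against prior history is not shown.

```python
import struct
import time

ROOT_HASH_SIZE = 32        # SHA256 output size in bytes
MAX_SKEW = 12 * 60 * 60    # 12 hours, in seconds


def serialize_tree_head(timestamp: int, tree_size: int, root_hash: bytes) -> bytes:
    """Serialize a tree head; the field order is assumed, not taken from the spec."""
    if len(root_hash) != ROOT_HASH_SIZE:
        raise ValueError("root hash must be a SHA256 digest")
    # Two 64-bit unsigned integers in network byte order, then the fixed-size hash.
    return struct.pack(">QQ", timestamp, tree_size) + root_hash


def timestamp_is_acceptable(timestamp: int, now=None) -> bool:
    """Return False if the timestamp is more than 12 hours in the past or future."""
    if now is None:
        now = int(time.time())
    return abs(now - timestamp) <= MAX_SKEW
```

The bytes returned by `serialize_tree_head` are what an Ed25519 implementation would sign or verify.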
#### Merkle tree leaf
-The log supports a single leaf type. It contains a shard hint, a checksum over whatever the submitter wants to log a checksum for,
-a signature that the submitter computed over the shard hint and the checksum, and a hash of the
-submitter's public verification key, that can be used to verify the signature.
+The log supports a single leaf type. It contains a shard hint, a
+checksum over whatever the submitter wants to log a checksum for, a
+signature that the submitter computed over the shard hint and the
+checksum, and a hash of the submitter's public verification key, that
+can be used to verify the signature.
```
struct message {
@@ -102,23 +114,26 @@ struct tree_leaf {
}
```
-Unlike X.509 certificates which already have validity ranges, a checksum does not
-carry any such information. Therefore, we require that the submitter selects a
-_shard hint_. The selected shard hint must be in the log's _shard interval_. A
-shard interval is defined by a start time and an end time. Both ends of the
-shard interval are inclusive and expressed as the number of seconds since
-the UNIX epoch (January 1, 1970 00:00 UTC).
-
-Sharding simplifies log operations because it becomes explicit when a log can be
-shutdown. A log must only accept logging requests that have valid shard hints.
-A log should only accept logging requests during the predefined shard interval.
-Note that _the submitter's shard hint is not a verified timestamp_. The
-submitter should set the shard hint as large as possible. If a roughly verified
-timestamp is needed, a cosigned tree head can be used.
-
-Without a shard hint, the good Samaritan could log all leaves from an earlier
-shard into a newer one. Not only would that defeat the purpose of sharding, but
-it would also become a potential denial-of-service vector.
+Unlike X.509 certificates which already have validity ranges, a
+checksum does not carry any such information. Therefore, we require
+that the submitter selects a _shard hint_. The selected shard hint
+must be in the log's _shard interval_. A shard interval is defined by
+a start time and an end time. Both ends of the shard interval are
+inclusive and expressed as the number of seconds since the UNIX epoch
+(January 1, 1970 00:00 UTC).
+
+Sharding simplifies log operations because it becomes explicit when a
+log can be shut down. A log must only accept logging requests that
+have valid shard hints. A log should only accept logging requests
+during the predefined shard interval. Note that _the submitter's
+shard hint is not a verified timestamp_. The submitter should set the
+shard hint as large as possible. If a roughly verified timestamp is
+needed, a cosigned tree head can be used.
+
+Without a shard hint, the good Samaritan could log all leaves from an
+earlier shard into a newer one. Not only would that defeat the
+purpose of sharding, but it would also become a potential
+denial-of-service vector.
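The shard rules above reduce to two inclusive range checks. The Python sketch below uses hypothetical function and parameter names; the example interval is arbitrary.

```python
def shard_hint_is_valid(shard_hint: int, shard_start: int, shard_end: int) -> bool:
    """A shard hint is valid if it lies in the inclusive shard interval."""
    return shard_start <= shard_hint <= shard_end


def log_accepts_requests(now: int, shard_start: int, shard_end: int) -> bool:
    """A log should only accept logging requests during its shard interval."""
    return shard_start <= now <= shard_end


# Example interval: calendar year 2021 in seconds since the UNIX epoch.
SHARD_START = 1609459200  # 2021-01-01 00:00:00 UTC
SHARD_END = 1640995199    # 2021-12-31 23:59:59 UTC
```

A submitter that follows the advice to pick the largest possible value would set the shard hint to `SHARD_END`.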
The signed message is composed of the chosen `shard_hint` and the
submitter's `checksum`. It must be possible to verify
@@ -136,9 +151,10 @@ verifier to locate the appropriate key and make an explicit trust
decision.
## Public endpoints
-Every log has a base URL that identifies it uniquely. The only constraint is
-that it must be a valid HTTP(S) URL that can have the `/st/v0/<endpoint>` suffix
-appended. For example, a complete endpoint URL could be
+Every log has a base URL that identifies it uniquely. The only
+constraint is that it must be a valid HTTP(S) URL that can have the
+`/st/v0/<endpoint>` suffix appended. For example, a complete endpoint
+URL could be
`https://log.example.com/2021/st/v0/get-signed-tree-head`.
Input data (in requests) is sent as ASCII key/value pairs as HTTP
@@ -151,11 +167,11 @@ format as the input data, i.e. as ASCII key/value pairs on the format
`Key: Value`. Example: For sending `tree_size=4711` as output a log
would send an HTTP message body consisting of `stlog-tree_size: 4711`.
-The HTTP status code is 200 OK to indicate success. A different HTTP status
-code is used to indicate failure. The log should set the "error" key to a
-human-readable value that describes what went wrong. For example,
-`error=invalid+signature`, `error=rate+limit+exceeded`, or
-`error=unknown+leaf+hash`.
+The HTTP status code is 200 OK to indicate success. A different HTTP
+status code is used to indicate failure. The log should set the
+"error" key to a human-readable value that describes what went wrong.
+For example, `error=invalid+signature`, `error=rate+limit+exceeded`,
+or `error=unknown+leaf+hash`.
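A client could read the `Key: Value` output format described above with a parser along the following lines. This is a minimal Python sketch under stated assumptions: `parse_key_values` is a hypothetical helper, keys are kept exactly as sent (including any prefix such as `stlog-`), and repeated keys are collected in order because several endpoints allow fields to repeat.

```python
def parse_key_values(body: str) -> dict:
    """Parse lines of the form `Key: Value` into a dict of lists.

    Each key maps to the list of its values in order of appearance,
    so repeated fields keep their relative order.
    """
    fields = {}
    for line in body.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"malformed line: {line!r}")
        fields.setdefault(key.strip(), []).append(value.strip())
    return fields
```

For the example in the text, `parse_key_values("stlog-tree_size: 4711")` yields a single key `stlog-tree_size` with the one-element list `["4711"]`.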
### get-tree-head-cosigned
Returns the latest cosigned tree head. Used together with
@@ -169,17 +185,22 @@ Input:
- None
Output on success:
-- "timestamp": `tree_head.timestamp` ASCII-encoded decimal number, seconds since the UNIX epoch.
+- "timestamp": `tree_head.timestamp` ASCII-encoded decimal number,
+ seconds since the UNIX epoch.
- "tree_size": `tree_head.tree_size` ASCII-encoded decimal number.
- "root_hash": `tree_head.root_hash` hex-encoded.
-- "signature": hex-encoded Ed25519 signature over `tree_head` serialzed as described in section `Merkle tree head`.
-- "key_hash": a hash of the public verification key (belonging to either the log or to one of its witnesses), which can be used to verify
-the most recent `signature`. The key is encoded as defined in [RFC 8032, section 5.1.2](https://tools.ietf.org/html/rfc8032#section-5.1.2), and then
-hashed using SHA256. The hash value is hex-encoded.
+- "signature": hex-encoded Ed25519 signature over `tree_head`
+ serialized as described in section `Merkle tree head`.
+- "key_hash": a hash of the public verification key (belonging to
+ either the log or to one of its witnesses), which can be used to
+ verify the most recent `signature`. The key is encoded as defined
+ in [RFC 8032, section 5.1.2](https://tools.ietf.org/html/rfc8032#section-5.1.2),
+ and then hashed using SHA256. The hash value is hex-encoded.
The "signature" and "key_hash" fields may repeat. The first signature
-corresponds to the first key hash, the second signature corresponds to the
-second key hash, etc. The number of signatures and key hashes must match.
+corresponds to the first key hash, the second signature corresponds to
+the second key hash, etc. The number of signatures and key hashes
+must match.
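The pairing rule and the key-hash definition above can be sketched as follows. The Python below is illustrative: `key_hash` and `pair_cosignatures` are hypothetical names, and the inputs are assumed to be the already-parsed lists of repeated field values.

```python
import hashlib

ED25519_PUBLIC_KEY_SIZE = 32  # bytes, per the RFC 8032 section 5.1.2 encoding


def key_hash(public_key: bytes) -> str:
    """Hex-encoded SHA256 hash of an encoded Ed25519 public key."""
    if len(public_key) != ED25519_PUBLIC_KEY_SIZE:
        raise ValueError("an encoded Ed25519 public key is 32 bytes")
    return hashlib.sha256(public_key).hexdigest()


def pair_cosignatures(signatures: list, key_hashes: list) -> list:
    """Pair the i-th signature with the i-th key hash."""
    if len(signatures) != len(key_hashes):
        raise ValueError("number of signatures and key hashes must match")
    return list(zip(key_hashes, signatures))
```

A verifier would look up each returned key hash among the keys it already trusts and only then verify the paired signature.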
### get-tree-head-to-sign
Returns the latest tree head to be signed by log witnesses. Used by
@@ -193,20 +214,24 @@ Input:
- None
Output on success:
-- "timestamp": `tree_head.timestamp` ASCII-encoded decimal number, seconds since the UNIX epoch.
+- "timestamp": `tree_head.timestamp` ASCII-encoded decimal number,
+ seconds since the UNIX epoch.
- "tree_size": `tree_head.tree_size` ASCII-encoded decimal number.
- "root_hash": `tree_head.root_hash` hex-encoded.
-- "signature": hex-encoded Ed25519 signature over `tree_head` serialzed as described in section `Merkle tree head`.
-- "key_hash": a hash of the log's public verification key, which can be used to verify
-`signature`. The key is encoded as defined in [RFC 8032, section 5.1.2](https://tools.ietf.org/html/rfc8032#section-5.1.2), and then
-hashed using SHA256. The hash value is hex-encoded.
+- "signature": hex-encoded Ed25519 signature over `tree_head`
+ serialized as described in section `Merkle tree head`.
+- "key_hash": a hash of the log's public verification key, which can
+ be used to verify `signature`. The key is encoded as defined in
+ [RFC 8032, section 5.1.2](https://tools.ietf.org/html/rfc8032#section-5.1.2),
+ and then hashed using SHA256. The hash value is hex-encoded.
There is exactly one `signature` and one `key_hash` field. The
`key_hash` refers to the log's public verification key.
### get-tree-head-latest
-Returns the latest tree head, signed only by the log. Used for debugging purposes.
+Returns the latest tree head, signed only by the log. Used for
+debugging purposes.
```
GET <base url>/st/v0/get-tree-head-latest
@@ -216,14 +241,16 @@ Input:
- None
Output on success:
-- "timestamp": `tree_head.timestamp` ASCII-encoded decimal number, seconds since the UNIX epoch.
+- "timestamp": `tree_head.timestamp` ASCII-encoded decimal number,
+ seconds since the UNIX epoch.
- "tree_size": `tree_head.tree_size` ASCII-encoded decimal number.
- "root_hash": `tree_head.root_hash` hex-encoded.
-- "signature": hex-encoded Ed25519 signature over `tree_head` serialzed as described in section `Merkle tree head`.
+- "signature": hex-encoded Ed25519 signature over `tree_head`
+ serialized as described in section `Merkle tree head`.
- "key_hash": a hash of the log's public verification key that can be
-used to verify `signature`. The key is encoded as defined in
-[RFC 8032, section 5.1.2](https://tools.ietf.org/html/rfc8032#section-5.1.2),
-and then hashed using SHA256. The hash value is hex-encoded.
+ used to verify `signature`. The key is encoded as defined in
+ [RFC 8032, section 5.1.2](https://tools.ietf.org/html/rfc8032#section-5.1.2),
+ and then hashed using SHA256. The hash value is hex-encoded.
There is exactly one `signature` and one `key_hash` field. The
`key_hash` refers to the log's public verification key.
@@ -235,21 +262,22 @@ POST <base url>/st/v0/get-proof-by-hash
```
Input:
-- "leaf_hash": a hex-encoded leaf hash that identifies which `tree_leaf` the
-log should prove inclusion for. The leaf hash is computed using the RFC 6962
-hashing strategy. In other words, `SHA256(0x00 | tree_leaf)`.
-- "tree_size": a human-readable tree size of the tree head that the proof should
-be based on.
+- "leaf_hash": a hex-encoded leaf hash that identifies which
+ `tree_leaf` the log should prove inclusion for. The leaf hash is
+ computed using the RFC 6962 hashing strategy. In other words,
+ `SHA256(0x00 | tree_leaf)`.
+- "tree_size": a human-readable tree size of the tree head that the
+ proof should be based on.
Output on success:
- "tree_size": human-readable tree size that the proof is based on.
-- "leaf_index": human-readable zero-based index of the leaf that the proof is
-based on.
+- "leaf_index": human-readable zero-based index of the leaf that the
+ proof is based on.
- "inclusion_path": a node hash in hex.
-The "inclusion_path" may be omitted or repeated to represent an inclusion proof
-of zero or more node hashes. The order of node hashes follow from our hash
-strategy, see RFC 6962.
+The "inclusion_path" may be omitted or repeated to represent an
+inclusion proof of zero or more node hashes. The order of node hashes
+follows from our hash strategy, see RFC 6962.
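To make the use of this output concrete, the Python sketch below verifies an inclusion proof with the RFC 6962 hash strategy: leaves are hashed with a `0x00` prefix and interior nodes with a `0x01` prefix. The verification loop follows the iterative algorithm published for RFC 6962-style trees. The function names are illustrative, and all inputs are assumed to be already decoded from hex.

```python
import hashlib


def leaf_hash(tree_leaf: bytes) -> bytes:
    """SHA256(0x00 | tree_leaf)."""
    return hashlib.sha256(b"\x00" + tree_leaf).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    """SHA256(0x01 | left | right)."""
    return hashlib.sha256(b"\x01" + left + right).digest()


def verify_inclusion(leaf_index, tree_size, leaf, inclusion_path, root_hash):
    """Check that the leaf hash `leaf` at `leaf_index` leads up to `root_hash`."""
    if leaf_index >= tree_size:
        return False
    fn, sn = leaf_index, tree_size - 1
    r = leaf
    for p in inclusion_path:
        if sn == 0:
            return False  # the path is longer than the tree is deep
        if (fn & 1) or fn == sn:
            # The current node is a right child: the sibling goes on the left.
            r = node_hash(p, r)
            # Skip levels where this node has no right sibling.
            while fn != 0 and not (fn & 1):
                fn >>= 1
                sn >>= 1
        else:
            r = node_hash(r, p)
        fn >>= 1
        sn >>= 1
    return sn == 0 and r == root_hash
```

The root hash passed in should come from a tree head that the verifier trusts, for example a cosigned tree head of the same tree size.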
### get-consistency-proof
```
@@ -258,19 +286,19 @@ POST <base url>/st/v0/get-consistency-proof
Input:
- "new_size": human-readable tree size of a newer tree head.
-- "old_size": human-readable tree size of an older tree head that the log should
-prove is consistent with the newer tree head.
+- "old_size": human-readable tree size of an older tree head that the
+ log should prove is consistent with the newer tree head.
Output on success:
-- "new_size": human-readable tree size of a newer tree head that the proof
-is based on.
-- "old_size": human-readable tree size of an older tree head that the proof is
-based on.
+- "new_size": human-readable tree size of a newer tree head that the
+ proof is based on.
+- "old_size": human-readable tree size of an older tree head that the
+ proof is based on.
- "consistency_path": a node hash in hex.
-The "consistency_path" may be omitted or repeated to represent a consistency
-proof of zero or more node hashes. The order of node hashes follow from our
-hash strategy, see RFC 6962.
+The "consistency_path" may be omitted or repeated to represent a
+consistency proof of zero or more node hashes. The order of node
+hashes follows from our hash strategy, see RFC 6962.
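A consistency proof can be checked in a similar way. The Python sketch below follows the iterative verification algorithm published for RFC 6962-style trees. It is an illustration with hypothetical names, and it assumes that the two root hashes and the path have already been decoded from hex.

```python
import hashlib


def node_hash(left: bytes, right: bytes) -> bytes:
    """SHA256(0x01 | left | right), the RFC 6962 interior node hash."""
    return hashlib.sha256(b"\x01" + left + right).digest()


def verify_consistency(old_size, new_size, old_root, new_root, consistency_path):
    """Check that the tree of `new_size` leaves extends the tree of `old_size` leaves."""
    if old_size == new_size:
        return old_root == new_root and not consistency_path
    if old_size == 0 or old_size > new_size:
        return False  # no proof is defined for these sizes
    path = list(consistency_path)
    if not path:
        return False
    # If the old tree is a complete subtree, its root is implied by the proof.
    if (old_size & (old_size - 1)) == 0:
        path.insert(0, old_root)
    fn, sn = old_size - 1, new_size - 1
    while fn & 1:
        fn >>= 1
        sn >>= 1
    fr = sr = path[0]  # candidate old root and candidate new root
    for c in path[1:]:
        if sn == 0:
            return False
        if (fn & 1) or fn == sn:
            fr = node_hash(c, fr)
            sr = node_hash(c, sr)
            while fn != 0 and not (fn & 1):
                fn >>= 1
                sn >>= 1
        else:
            sr = node_hash(sr, c)
        fn >>= 1
        sn >>= 1
    return sn == 0 and fr == old_root and sr == new_root
```

A witness could run such a check between the last tree head it cosigned and the new tree head before cosigning again.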
### get-leaves
```
@@ -282,18 +310,21 @@ Input:
- "end_size": human-readable index of the last leaf to retrieve.
Output on success:
-- "shard_hint": `tree_leaf.message.shard_hint` as a human-readable number.
+- "shard_hint": `tree_leaf.message.shard_hint` as a human-readable
+ number.
- "checksum": `tree_leaf.message.checksum` in hex.
-- "signature_scheme": human-readable number that identifies a signature scheme.
+- "signature_scheme": human-readable number that identifies a
+ signature scheme.
- "signature": `tree_leaf.signature` in hex.
- "key_hash": `tree_leaf.key_hash` in hex.
-All fields may be repeated to return more than one leaf. The first value in
-each list refers to the first leaf, the second value in each list refers to the
-second leaf, etc. The size of each list must match.
+All fields may be repeated to return more than one leaf. The first
+value in each list refers to the first leaf, the second value in each
+list refers to the second leaf, etc. The size of each list must
+match.
-The log may return fewer leaves than requested. At least one leaf must be
-returned on HTTP status code 200 OK.
+The log may return fewer leaves than requested. At least one leaf
+must be returned on HTTP status code 200 OK.
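The two rules above, matching list sizes and at least one leaf, can be enforced while regrouping the repeated fields into per-leaf records. The Python sketch below is illustrative: `group_leaves` is a hypothetical helper that takes a dict mapping each output key to its list of values in order of appearance.

```python
LEAF_KEYS = ("shard_hint", "checksum", "signature_scheme", "signature", "key_hash")


def group_leaves(fields: dict) -> list:
    """Turn parallel lists of repeated fields into one dict per leaf."""
    lists = [fields.get(key, []) for key in LEAF_KEYS]
    count = len(lists[0])
    if count == 0:
        raise ValueError("a 200 OK response must contain at least one leaf")
    if any(len(values) != count for values in lists):
        raise ValueError("the size of each list must match")
    # The i-th value of every list belongs to the i-th leaf.
    return [dict(zip(LEAF_KEYS, values)) for values in zip(*lists)]
```

Since the log may return fewer leaves than requested, a caller would compare the number of returned leaves with the requested range and ask again for the remainder.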
### add-leaf
```
@@ -301,31 +332,38 @@ POST <base url>/st/v0/add-leaf
```
Input:
-- "shard_hint": human-readable decimal number in the log's shard interval that the
-submitter selected.
-- "checksum": the cryptographic checksum that the submitter wants to log in hex. note: fixed length 64 bytes, validated by the server somehow
-- "signature": the submitter's signature over `tree_leaf.message`. The result
-is hex-encoded.
-- "verification_key": the submitter's public verification key. The key is encoded as defined in [RFC 8032, section 5.1.2](https://tools.ietf.org/html/rfc8032#section-5.1.2). The result is hex-encoded.
-- "domain_hint": a domain name that indicates where `tree_leaf.key_hash` can be
-retrieved as a DNS TXT resource record in hex.
+- "shard_hint": human-readable decimal number in the log's shard
+ interval that the submitter selected.
+- "checksum": the cryptographic checksum that the submitter wants to
+ log in hex. note: fixed length 64 bytes, validated by the server
+ somehow
+- "signature": the submitter's signature over `tree_leaf.message`.
+ The result is hex-encoded.
+- "verification_key": the submitter's public verification key. The
+ key is encoded as defined in
+ [RFC 8032, section 5.1.2](https://tools.ietf.org/html/rfc8032#section-5.1.2). The result is hex-encoded.
+- "domain_hint": a domain name that indicates where
+ `tree_leaf.key_hash` can be retrieved as a DNS TXT resource record
+ in hex.
Output on success:
- None
-The submitted entry will not be accepted if the signature is invalid or if the
-downloaded verification-key hash does not match. The submitted entry may also
-not be accepted if the second-level domain name exceeded its rate limit. By
-coupling every add-leaf request with a second-level domain, it becomes more
-difficult to spam the log. You would need an excessive number of domain names.
-This becomes costly if free domain names are rejected.
+The submitted entry will not be accepted if the signature is invalid
+or if the downloaded verification-key hash does not match. The
+submitted entry may also not be accepted if the second-level domain
+name exceeded its rate limit. By coupling every add-leaf request with
+a second-level domain, it becomes more difficult to spam the log. You
+would need an excessive number of domain names. This becomes costly
+if free domain names are rejected.
-The log does not publish domain-name to key bindings because key management is
-more complex than that.
+The log does not publish domain-name to key bindings because key
+management is more complex than that.
-Public logging should not be assumed until an inclusion proof is available. An
-inclusion proof should not be relied upon unless it leads up to a trustworthy
-signed tree head. Witness cosigning can make a tree head trustworthy.
+Public logging should not be assumed until an inclusion proof is
+available. An inclusion proof should not be relied upon unless it
+leads up to a trustworthy signed tree head. Witness cosigning can
+make a tree head trustworthy.
### add-cosignature
```
@@ -334,25 +372,26 @@ POST <base url>/st/v0/add-cosignature
Input:
- "signature": an Ed25519 signature over `tree_head`. The result is
-hex-encoded.
-- "key_hash": a hash of the witness' public verification key that can be used
-to verify the signature. The key is encoded as defined in [RFC 8032,
-section 5.1.2](https://tools.ietf.org/html/rfc8032#section-5.1.2), and
-then hashed using SHA256. The hash value is hex-encoded.
+ hex-encoded.
+- "key_hash": a hash of the witness' public verification key that can
+ be used to verify the signature. The key is encoded as defined in
+ [RFC 8032, section 5.1.2](https://tools.ietf.org/html/rfc8032#section-5.1.2),
+ and then hashed using SHA256. The hash value is hex-encoded.
Output on success:
- None
-The key-hash can be used to identify which witness signed the log's tree head.
-A key-hash, rather than the full verification key, is used to force the verifier
-to locate the appropriate key and make an explicit trust decision.
+The key-hash can be used to identify which witness signed the log's
+tree head. A key-hash, rather than the full verification key, is used
+to force the verifier to locate the appropriate key and make an
+explicit trust decision.
## Summary of log parameters
-- **Public key**: an Ed25519 verification key that can be used to verify the
-log's tree head signatures.
+- **Public key**: an Ed25519 verification key that can be used to
+ verify the log's tree head signatures.
- **Log identifier**: the hashed public verification key using SHA256.
-- **Shard interval**: the time during which the log accepts logging requests.
-The shard interval's start and end are inclusive and expressed as the number of
-seconds since the UNIX epoch.
-- **Base URL**: where the log can be reached over HTTP(S). It is the prefix
-before a version-0 specific endpoint.
+- **Shard interval**: the time during which the log accepts logging
+ requests. The shard interval's start and end are inclusive and
+ expressed as the number of seconds since the UNIX epoch.
+- **Base URL**: where the log can be reached over HTTP(S). It is the
+ prefix before a version-0 specific endpoint.