aboutsummaryrefslogtreecommitdiff
path: root/doc/api.md
blob: 92344c5910c7f75d47b78ea2d06368c470ad1b47 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
# System Transparency Logging: API v0
This document describes details of the System Transparency logging
API, version 0.  The broader picture is not explained here.  We assume
that you have read the System Transparency Logging design document.
It can be found
[here](https://github.com/system-transparency/stfe/blob/design/doc/design.md).

**Warning.**
This is a work-in-progress document that may be moved or modified.

## Overview
Logs implement an HTTP(S) API for accepting requests and sending
responses.

- Input data in requests and output data in responses are expressed as
  ASCII-encoded key/value pairs.
- Requests with input data use HTTP POST to send the data to a log.
- Binary data is hex-encoded before being transmitted.

The motivation for using a text based key/value format for request and
response data is that it's simple to parse.  Note that this format is
not being used for the serialization of signed or logged data, where a
more well defined and storage efficient format is desirable.  A
submitter may distribute log responses to their end-users in any
format that suits them.  The (de)serialization required for
_end-users_ is a small subset of Trunnel.  Trunnel is an "idiot-proof"
wire-format in use by the Tor project.

## Primitives
### Cryptography
Logs use the same Merkle tree hash strategy as
[RFC 6962,§2](https://tools.ietf.org/html/rfc6962#section-2).
The hash functions must be
[SHA256](https://csrc.nist.gov/csrc/media/publications/fips/180/4/final/documents/fips180-4-draft-aug2014.pdf).
Logs must sign tree heads using
[Ed25519](https://tools.ietf.org/html/rfc8032).  Log witnesses
must also sign tree heads using Ed25519.

All other parts that are not Merkle tree related also use SHA256 as
the hash function.  Using more than one hash function would increases
the overall attack surface: two hash functions must be collision
resistant instead of one.

### Serialization
Log requests and responses are transmitted as ASCII-encoded key/value
pairs, for a smaller dependency than an alternative parser like JSON.
Some input and output data is binary: cryptographic hashes and
signatures.  Binary data must be Base16-encoded, also known as hex
encoding.  Using hex as opposed to base64 is motivated by it being
simpler, favoring ease of decoding and encoding over efficiency on the
wire.

We use the
[Trunnel](https://gitweb.torproject.org/trunnel.git) [description language](https://www.seul.org/~nickm/trunnel-manual.html)
to define (de)serialization of data structures that need to be signed or
inserted into the Merkle tree.  Trunnel is more expressive than the
[SSH wire format](https://tools.ietf.org/html/rfc4251#section-5).
It is about as expressive as the
[TLS presentation language](https://tools.ietf.org/html/rfc8446#section-3).
A notable difference is that Trunnel supports integer constraints.
The Trunnel language is also readable by humans _and_ machines.
"Obviously correct code" can be generated in C and Go.

A fair summary of our Trunnel usage is as follows.

All integers are 64-bit, unsigned, and in network byte order.
Fixed-size byte arrays are put into the serialization buffer in-order,
starting from the first byte.  Variable length byte arrays first
declare their length as an integer, which is then followed by that
number of bytes.  These basic types are concatenated to form a
collection.  You should not need a general-purpose Trunnel
(de)serialization parser to work with this format.  If you have one,
you may use it though.  The main point of using Trunnel is that it
makes a simple format explicit and unambiguous.

#### Merkle tree head
Tree heads are signed both by a log and its witnesses.  It contains a
timestamp, a tree size, and a root hash.  The timestamp is included so
that monitors can ensure _liveliness_.  It is the time since the UNIX
epoch (January 1, 1970 00:00 UTC) in seconds.  The tree size
specifies the current number of leaves.  The root hash fixes the
structure and content of the Merkle tree.

```
struct tree_head {
	u64 timestamp;
	u64 tree_size;
	u8 root_hash[32];
};
```

The serialized tree head must be signed using Ed25519.  A witness must
not cosign a tree head if it is inconsistent with prior history or if
the timestamp is backdated or future-dated more than 12 hours.

#### Merkle tree leaf
Logs support a single leaf type.  It contains a shard hint, a
checksum over whatever the submitter wants to log a checksum for, a
signature that the submitter computed over the shard hint and the
checksum, and a hash of the submitter's public verification key, that
can be used to verify the signature.

```
struct message {
    u64 shard_hint;
    u8 checksum[32];
};

struct tree_leaf {
    struct message;
    u8 signature_over_message[64];
    u8 key_hash[32];
}
```

`message` is composed of the `shard_hint`, chosen by the submitter to
match the shard interval for the log it's submitting to, and the
submitter's `checksum` to be logged.

`signature_over_message` is a signature over `message`, using the
submitter's verification key. It must be possible to verify the
signature using the submitter's public verification key, as indicated
by `key_hash`.

`key_hash` is a hash of the submitter's verification key used for
signing `message`. It is included in `tree_leaf` so that the leaf can
be attributed to the submitter.  A hash, rather than the full public
key, is used to motivate verifiers to locate the appropriate key and
make an explicit trust decision.

## Public endpoints
Every log has a base URL that identifies it uniquely.  The only
constraint is that it must be a valid HTTP(S) URL that can have the
`/st/v0/<endpoint>` suffix appended.  For example, a complete endpoint
URL could be
`https://log.example.com/2021/st/v0/get-tree-head-cosigned`.

Input data (in requests) is POST:ed in the HTTP message body as ASCII
key/value pairs.

Output data (in replies) is sent in the HTTP message body in the same
format as the input data, i.e. as ASCII key/value pairs on the format
`Key=Value`

The HTTP status code is 200 OK to indicate success.  A different HTTP
status code is used to indicate failure, in which case a log should
respond with a human-readable string describing what went wrong using
the key `error`. Example: `error=Invalid signature.`.

### get-tree-head-cosigned
Returns the latest cosigned tree head. Used together with
`get-proof-by-hash` and `get-consistency-proof` for verifying the tree.

```
GET <base url>/st/v0/get-tree-head-cosigned
```

Input:
- None

Output on success:
- `timestamp`: `tree_head.timestamp` ASCII-encoded decimal number,
  seconds since the UNIX epoch.
- `tree_size`: `tree_head.tree_size` ASCII-encoded decimal number.
- `root_hash`: `tree_head.root_hash` hex-encoded.
- `signature`: hex-encoded Ed25519 signature over `tree_head`
  serialzed as described in section `Merkle tree head`.
- `key_hash`: a hash of the public verification key (belonging to
  either the log or to one of its witnesses), which can be used to
  verify the most recent `signature`.  The key is encoded as defined
  in [RFC 8032, section 5.1.2](https://tools.ietf.org/html/rfc8032#section-5.1.2), 
  and then hashed using SHA256.  The hash value is hex-encoded.

The `signature` and `key_hash` fields may repeat. The first signature
corresponds to the first key hash, the second signature corresponds to
the second key hash, etc.  The number of signatures and key hashes
must match.

### get-tree-head-to-sign
Returns the latest tree head to be signed by log witnesses. Used by
witnesses.

```
GET <base url>/st/v0/get-tree-head-to-sign
```

Input:
- None

Output on success:
- `timestamp`: `tree_head.timestamp` ASCII-encoded decimal number,
  seconds since the UNIX epoch.
- `tree_size`: `tree_head.tree_size` ASCII-encoded decimal number.
- `root_hash`: `tree_head.root_hash` hex-encoded.
- `signature`: hex-encoded Ed25519 signature over `tree_head`
  serialzed as described in section `Merkle tree head`.
- `key_hash`: a hash of the log's public verification key, which can
  be used to verify `signature`.  The key is encoded as defined in
  [RFC 8032, section 5.1.2](https://tools.ietf.org/html/rfc8032#section-5.1.2),
  and then hashed using SHA256.  The hash value is hex-encoded.

There is exactly one `signature` and one `key_hash` field. The
`key_hash` refers to the log's public verification key.


### get-tree-head-latest
Returns the latest tree head, signed only by the log. Used for
debugging purposes.

```
GET <base url>/st/v0/get-tree-head-latest
```

Input:
- None

Output on success:
- `timestamp`: `tree_head.timestamp` ASCII-encoded decimal number,
  seconds since the UNIX epoch.
- `tree_size`: `tree_head.tree_size` ASCII-encoded decimal number.
- `root_hash`: `tree_head.root_hash` hex-encoded.
- `signature`: hex-encoded Ed25519 signature over `tree_head`
  serialzed as described in section `Merkle tree head`.
- `key_hash`: a hash of the log's public verification key that can be
  used to verify `signature`.  The key is encoded as defined in
  [RFC 8032, section 5.1.2](https://tools.ietf.org/html/rfc8032#section-5.1.2),
  and then hashed using SHA256.  The hash value is hex-encoded.

There is exactly one `signature` and one `key_hash` field. The
`key_hash` refers to the log's public verification key.


### get-proof-by-hash
```
POST <base url>/st/v0/get-proof-by-hash
```

Input:
- `leaf_hash`: leaf identifying which `tree_leaf` the log should prove
  inclusion of, hex-encoded.
- `tree_size`: tree size of the tree head that the proof should be
  based on, as an ASCII-encoded decimal number.

Output on success:
- `tree_size`: tree size that the proof is based on, as an
  ASCII-encoded decimal number.
- `leaf_index`: zero-based index of the leaf that the proof is based
  on, as an ASCII-encoded decimal number.
- `inclusion_path`: node hash, hex-encoded.

The leaf hash is computed using the RFC 6962 hashing strategy.  In
other words, `SHA256(0x00 | tree_leaf)`.

`inclusion_path` may be omitted or repeated to represent an inclusion
proof of zero or more node hashes.  The order of node hashes follow
from the hash strategy, see RFC 6962.

Example: `echo "leaf_hash=241fd4538d0a35c2d0394e4710ea9e6916854d08f62602fb03b55221dcdac90f
tree_size=4711" | curl --data-binary @- localhost/st/v0/get-proof-by-hash`

### get-consistency-proof
```
POST <base url>/st/v0/get-consistency-proof
```

Input:
- `new_size`: tree size of a newer tree head, as an ASCII-encoded
  decimal number.
- `old_size`: tree size of an older tree head that the log should
  prove is consistent with the newer tree head, as an ASCII-encoded
  decimal number.

Output on success:
- `new_size`: tree size of the newer tree head that the proof is based
  on, as an ASCII-encoded decimal number.
- `old_size`: tree size of the older tree head that the proof is based
  on, as an ASCII-encoded decimal number.
- `consistency_path`: node hash, hex-encoded.

`consistency_path` may be omitted or repeated to represent a
consistency proof of zero or more node hashes.  The order of node
hashes follow from the hash strategy, see RFC 6962.

Example: `echo "new_size=4711
old_size=42" | curl --data-binary @- localhost/st/v0/get-consistency-proof`

### get-leaves
```
POST <base url>/st/v0/get-leaves
```

Input:
- `start_size`: index of the first leaf to retrieve, as an
  ASCII-encoded decimal number.
- `end_size`: index of the last leaf to retrieve, as an ASCII-encoded
  decimal number.

Output on success:
- `shard_hint`: `tree_leaf.message.shard_hint` as an ASCII-encoded
  decimal number.
- `checksum`: `tree_leaf.message.checksum`, hex-encoded.
- `signature`: `tree_leaf.signature_over_message`, hex-encoded.
- `key_hash`: `tree_leaf.key_hash`, hex-encoded.

All fields may be repeated to return more than one leaf.  The first
value in each list refers to the first leaf, the second value in each
list refers to the second leaf, etc.  The size of each list must
match.

A log may return fewer leaves than requested.  At least one leaf
must be returned on HTTP status code 200 OK.

Example: `echo "start_size=42
end_size=4711" | curl --data-binary @- localhost/st/v0/get-leaves`

### add-leaf
```
POST <base url>/st/v0/add-leaf
```

Input:
- `shard_hint`: number within the log's shard interval as an
  ASCII-encoded decimal number.
- `checksum`: the cryptographic checksum that the submitter wants to
  log, hex-encoded.
- `signature_over_message`: the submitter's signature over
  `tree_leaf.message`, hex-encoded.
- `verification_key`: the submitter's public verification key.  The
  key is encoded as defined in
  [RFC 8032, section 5.1.2](https://tools.ietf.org/html/rfc8032#section-5.1.2)
  and then hex-encoded.
- `domain_hint`: domain name indicating where `tree_leaf.key_hash`
  can be found as a DNS TXT resource record.

Output on success:
- None

The submission will not be accepted if `signature_over_message` is
invalid or if the key hash retrieved using `domain_hint` does not
match a hash over `verification_key`.

The submission may also not be accepted if the second-level domain
name exceeded its rate limit.  By coupling every add-leaf request to
a second-level domain, it becomes more difficult to spam logs.  You
would need an excessive number of domain names.  This becomes costly
if free domain names are rejected.

Logs don't publish domain-name to key bindings because key
management is more complex than that.

Public logging should not be assumed to have happened until an
inclusion proof is available.  An inclusion proof should not be relied
upon unless it leads up to a trustworthy signed tree head.  Witness
cosigning can make a tree head trustworthy.

Example: `echo "shard_hint=1640995200
checksum=cfa2d8e78bf273ab85d3cef7bde62716261d1e42626d776f9b4e6aae7b6ff953
signature_over_message=c026687411dea494539516ee0c4e790c24450f1a4440c2eb74df311ca9a7adf2847b99273af78b0bda65dfe9c4f7d23a5d319b596a8881d3bc2964749ae9ece3
verification_key=c9a674888e905db1761ba3f10f3ad09586dddfe8581964b55787b44f318cbcdf
domain_hint=example.com" | curl --data-binary @- localhost/st/v0/add-leaf`

### add-cosignature
```
POST <base url>/st/v0/add-cosignature
```

Input:
- `signature`: Ed25519 signature over `tree_head`, hex-encoded.
- `key_hash`: hash of the witness' public verification key that can be
  used to verify `signature`.  The key is encoded as defined in
  [RFC 8032, section 5.1.2](https://tools.ietf.org/html/rfc8032#section-5.1.2),
  and then hashed using SHA256. The hash value is hex-encoded.

Output on success:
- None

`key_hash` can be used to identify which witness signed the tree
head.  A key-hash, rather than the full verification key, is used to
motivate verifiers to locate the appropriate key and make an explicit
trust decision.

Example: `echo "signature=d1b15061d0f287847d066630339beaa0915a6bbb77332c3e839a32f66f1831b69c678e8ca63afd24e436525554dbc6daa3b1201cc0c93721de24b778027d41af
key_hash=662ce093682280f8fbea9939abe02fdba1f0dc39594c832b411ddafcffb75b1d" | curl --data-binary @- localhost/st/v0/add-cosignature`

## Summary of log parameters
- **Public key**: The Ed25519 verification key to be used for
  verifying tree head signatures.
- **Log identifier**: The public verification key `Public key` hashed
  using SHA256.
- **Shard interval start**: The earliest time at which logging
  requests are accepted as the number of seconds since the UNIX epoch.
- **Shard interval end**: The latest time at which logging
  requests are accepted as the number of seconds since the UNIX epoch.
- **Base URL**: Where the log can be reached over HTTP(S).  It is the
  prefix to be used to construct a version 0 specific endpoint.