moved top-most README.md into doc/sketch

author: Rasmus Dahlberg <rasmus.dahlberg@kau.se> 2021-03-16 18:34:43 +0100
committer: Rasmus Dahlberg <rasmus.dahlberg@kau.se> 2021-03-16 18:34:43 +0100
commit: 5596c7504ed8b4720f041560969e651b27a33259 (patch)
tree: e22ac7ccb771e5571bfb6998100185c6d6f01cb1 /doc
parent: 82343b7cc96dc10135caaaa109a5ff255c8d6950 (diff)
1 files changed, 372 insertions, 0 deletions
diff --git a/doc/sketch.md b/doc/sketch.md
new file mode 100644
index 0000000..31964e0
--- /dev/null
+++ b/doc/sketch.md
@@ -0,0 +1,372 @@
+# System Transparency Logging
+This document provides a sketch of System Transparency (ST) logging.  The basic
+idea is to insert hashes of system artifacts into a public, append-only, and
+tamper-evident transparency log, such that any enforcing client can be sure that
+they see the same system artifacts as everyone else.  A system artifact could
+be a browser update, an operating system image, a Debian package, or more
+generally something that is opaque.
+
+We take inspiration from the Certificate Transparency Front-End
+([CTFE](https://github.com/google/certificate-transparency-go/tree/master/trillian/ctfe))
+that implements [RFC 6962](https://tools.ietf.org/html/rfc6962) for
+[Trillian](https://transparency.dev).
+
+## Log parameters
+An ST log is defined by the following parameters:
+- `log_identifier`: a `Namespace` of type `ed25519_v1` that defines the log's
+signing algorithm and public verification key.
+- `supported_namespaces`: a list of namespace types that the log supports.
+Entities must use a supported namespace type when posting signed data to the
+log.
+- `base_url`: prefix used by clients that contact the log, e.g.,
+example.com:1234/log.
+- `final_cosigned_tree_head`: an `StItem` of type `cosigned_tree_head_v*`.  Not
+set until the log is turned into read-only mode in preparation for a shutdown.
+
+ST logs use the same hash strategy as described in RFC 6962: SHA256 with `0x00`
+as leaf node prefix and `0x01` as interior node prefix.
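The hash strategy above can be sketched in a few lines of Python. This is only an illustration of the RFC 6962 construction with the two prefixes; the function names `leaf_hash`, `node_hash`, and `merkle_root` are hypothetical and not part of the specification.

```python
import hashlib
from typing import Sequence


def leaf_hash(data: bytes) -> bytes:
    """Hash a leaf: SHA256 over 0x00 followed by the leaf data."""
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    """Hash an interior node: SHA256 over 0x01 followed by both children."""
    return hashlib.sha256(b"\x01" + left + right).digest()


def merkle_root(leaves: Sequence[bytes]) -> bytes:
    """Compute the RFC 6962 Merkle tree hash over a list of leaf data."""
    n = len(leaves)
    if n == 0:
        # The hash of an empty tree is the hash of the empty string.
        return hashlib.sha256(b"").digest()
    if n == 1:
        return leaf_hash(leaves[0])
    # Split at the largest power of two strictly smaller than n.
    k = 1
    while k * 2 < n:
        k *= 2
    return node_hash(merkle_root(leaves[:k]), merkle_root(leaves[k:]))
```

The distinct prefixes keep leaf hashes and interior node hashes in separate domains, so a leaf cannot be mistaken for an interior node.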
+
+In contrast to Certificate Transparency (CT), **there is no Maximum Merge Delay
+(MMD)**.  New entries are merged into the log as soon as possible, and no client
+should trust that something is logged until an inclusion proof can be provided
+that references a trustworthy signed tree head (STH).  Therefore, **there are no
+"promises" of public logging** as in CT.
+
+To produce trustworthy STHs a simple form of [witness
+cosigning](https://arxiv.org/pdf/1503.08768.pdf) is built into the log.
+Witnesses poll the log for the next stable STH, and verify that it is consistent
+before posting a cosignature that can then be served by the log.
+
+## Acceptance criteria and scope
+A log should accept a leaf submission if it is:
+- Well-formed, see data structure definitions below.
+- Digitally signed by a registered namespace.
+
+Rate limits may be applied per namespace to combat spam.  Namespaces may also be
+used by clients to determine which entries belong to whom.  It is up to the
+submitters to communicate trusted namespaces to their own clients.  In other
+words, there are no mappings from namespaces to identities built into the log.
+There is also no revocation of namespaces: **we facilitate _detection_ of
+compromised signing keys by making artifact hashes public, which is not to be
+confused with _prevention_ or even _recovery_ after detection**.
+
+## Data structure definitions
+Data structures are defined and serialized using the presentation language in
+[RFC 5246, §4](https://tools.ietf.org/html/rfc5246).  A definition of the log's
+Merkle tree can be found in [RFC 6962,
+§2](https://tools.ietf.org/html/rfc6962#section-2).
+
+### Namespace
+A _namespace_ is a versioned data structure that contains a public verification
+key (or fingerprint), as well as enough information to determine its format,
+signing, and verification operations.  Namespaces are used as identifiers, both
+for the log itself and the parties that submit artifact hashes and cosignatures.
+
+```
+enum {
+	reserved(0),
+	ed25519_v1(1),
+	(2^16-1)
+} NamespaceFormat;
+
+struct {
+	NamespaceFormat format;
+	select (format) {
+		case ed25519_v1: Ed25519V1;
+	} message;
+} Namespace;
+```
+
+Our namespace format is inspired by Keybase's
+[key-id](https://keybase.io/docs/api/1.0/kid).
+
+#### Ed25519V1
+At this time the only supported namespace type is based on Ed25519.  The
+namespace field contains the full verification key.  Signing operations and
+serialized formats are defined by [RFC
+8032](https://tools.ietf.org/html/rfc8032).
+```
+struct {
+	opaque namespace[32]; // public verification key
+} Ed25519V1;
+```
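As a rough sketch of how a `Namespace` of type `ed25519_v1` would look on the wire under the RFC 5246 presentation language: the enum has a maximum of `2^16-1` and therefore occupies two bytes, and the fixed-length key follows without a length prefix. The helper names below are hypothetical.

```python
import struct

ED25519_V1 = 1  # NamespaceFormat.ed25519_v1


def serialize_namespace(public_key: bytes) -> bytes:
    """Serialize a Namespace: 2-byte format, then the 32-byte key."""
    if len(public_key) != 32:
        raise ValueError("Ed25519 verification keys are 32 bytes")
    return struct.pack("!H", ED25519_V1) + public_key


def parse_namespace(buf: bytes) -> bytes:
    """Parse a serialized ed25519_v1 Namespace and return the key."""
    if len(buf) != 34:
        raise ValueError("unexpected namespace length")
    (fmt,) = struct.unpack("!H", buf[:2])
    if fmt != ED25519_V1:
        raise ValueError("unsupported namespace format")
    return buf[2:]
```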
+
+### `StItem`
+A general-purpose `TransItem` is defined in [RFC 6962/bis,
+§4.5](https://tools.ietf.org/html/draft-ietf-trans-rfc6962-bis-34#section-4.5).
+We define our own `TransItem`, but name it `StItem` to emphasize that they are
+not the same.
+
+```
+enum {
+	reserved(0),
+	signed_tree_head_v1(1),
+	cosigned_tree_head_v1(2),
+	consistency_proof_v1(3),
+	inclusion_proof_v1(4),
+	signed_checksum_v1(5), // leaf type
+	(2^16-1)
+} StFormat;
+
+struct {
+	StFormat format;
+	select (format) {
+		case signed_tree_head_v1: SignedTreeHeadV1;
+		case cosigned_tree_head_v1: CosignedTreeHeadV1;
+		case consistency_proof_v1: ConsistencyProofV1;
+		case inclusion_proof_v1: InclusionProofV1;
+		case signed_checksum_v1: SignedChecksumV1;
+	} message;
+} StItem;
+
+struct {
+	StItem items<0..2^32-1>;
+} StItemList;
+```
+
+#### `signed_tree_head_v1`
+We use the same tree head definition as in [RFC 6962/bis,
+§4.9](https://tools.ietf.org/html/draft-ietf-trans-rfc6962-bis-34#section-4.9).
+The resulting _signed_ tree head is packaged differently: a namespace is used as
+log identifier, and it is communicated in a `SignatureV1` structure.
+```
+struct {
+	TreeHeadV1 tree_head;
+	SignatureV1 signature;
+} SignedTreeHeadV1;
+
+struct {
+	uint64 timestamp;
+	uint64 tree_size;
+	NodeHash root_hash;
+	Extension extensions<0..2^16-1>;
+} TreeHeadV1;
+opaque NodeHash<32..2^8-1>;
+
+struct {
+	Namespace namespace;
+	opaque signature<1..2^16-1>;
+} SignatureV1;
+```
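To make the `TreeHeadV1` layout concrete, here is a minimal serialization sketch. It assumes the usual RFC 5246 rules: `NodeHash` has a ceiling of `2^8-1` and so gets a one-byte length prefix, and `extensions` has a ceiling of `2^16-1` and so gets a two-byte length prefix. The function name is hypothetical, and extensions are treated as already-serialized bytes.

```python
import struct


def serialize_tree_head(timestamp: int, tree_size: int, root_hash: bytes,
                        extensions: bytes = b"") -> bytes:
    """Serialize a TreeHeadV1 structure."""
    if not 32 <= len(root_hash) <= 255:
        raise ValueError("NodeHash must be 32..255 bytes")
    if len(extensions) > 0xFFFF:
        raise ValueError("extensions too long")
    return (
        struct.pack("!QQ", timestamp, tree_size)        # two uint64 fields
        + struct.pack("!B", len(root_hash)) + root_hash  # opaque <32..2^8-1>
        + struct.pack("!H", len(extensions)) + extensions  # <0..2^16-1>
    )
```

With a 32-byte root hash and no extensions, the serialized tree head is 8 + 8 + 1 + 32 + 2 = 51 bytes.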
+
+#### `cosigned_tree_head_v1`
+Transparency logs were designed to be cryptographically verifiable in the
+presence of a gossip-audit model that ensures everyone observes _the same
+cryptographically verifiable log_.  The gossip-audit model is largely undefined
+in today's existing transparency logging ecosystems, which means that the logs
+must be trusted to play by the rules.  We wanted to avoid that outcome in our
+ecosystem.  Therefore, a gossip-audit model is built into the log.
+
+The basic idea is that an STH should only be considered valid if it is cosigned
+by a number of witnesses that verify the append-only property.  Which witnesses
+to trust and under what circumstances is defined by a client-side _witness
+cosigning policy_.  For example:
+- "require no witness cosigning",
+- "must have at least `k` signatures from witnesses A...J", and
+- "must have at least `k` signatures from witnesses A...J where one is from
+witness B".
+
+Witness cosigning policies are beyond the scope of this specification.
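Although policies are out of scope, the example policies above are simple enough to sketch. The function below is purely illustrative: its name and parameters are assumptions, and it expects the caller to have already verified each cosignature, so it only reasons about which witnesses signed.

```python
from typing import AbstractSet


def satisfies_policy(verified: AbstractSet[str], trusted: AbstractSet[str],
                     k: int, required: AbstractSet[str] = frozenset()) -> bool:
    """Check a 'k signatures from trusted, including required' policy.

    verified: witnesses whose cosignatures were already checked.
    trusted:  witnesses that the client is willing to count.
    k:        minimum number of trusted witnesses that must have signed.
    required: witnesses that must be among the signers.
    """
    valid = verified & trusted
    return len(valid) >= k and required <= valid
```

Setting `k` to zero with no required witnesses corresponds to "require no witness cosigning".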
+
+A cosigned STH is composed of an STH and a list of cosignatures.  A cosignature
+must cover the serialized STH as an `StItem`, and be produced with a witness
+namespace of type `ed25519_v1`.
+
+```
+struct {
+	SignedTreeHeadV1 signed_tree_head;
+	SignatureV1 cosignatures<0..2^32-1>; // vector of cosignatures
+} CosignedTreeHeadV1;
+```
+
+#### `consistency_proof_v1`
+For the most part we use the same consistency proof definition as in [RFC
+6962/bis,
+§4.11](https://tools.ietf.org/html/draft-ietf-trans-rfc6962-bis-34#section-4.11).
+There are two modifications: our log identifier is a namespace rather than an
+[OID](https://tools.ietf.org/html/draft-ietf-trans-rfc6962-bis-34#section-4.4),
+and a consistency proof may be empty.
+
+```
+struct {
+	Namespace log_id;
+	uint64 tree_size_1;
+	uint64 tree_size_2;
+	NodeHash consistency_path<0..2^16-1>;
+} ConsistencyProofV1;
+```
+
+#### `inclusion_proof_v1`
+For the most part we use the same inclusion proof definition as in [RFC
+6962/bis,
+§4.12](https://tools.ietf.org/html/draft-ietf-trans-rfc6962-bis-34#section-4.12).
+There are two modifications: our log identifier is a namespace rather than an
+[OID](https://tools.ietf.org/html/draft-ietf-trans-rfc6962-bis-34#section-4.4),
+and an inclusion proof may be empty.
+```
+struct {
+	Namespace log_id;
+	uint64 tree_size;
+	uint64 leaf_index;
+	NodeHash inclusion_path<0..2^16-1>;
+} InclusionProofV1;
+```
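A client that receives an `InclusionProofV1` still has to check it against a tree head. The sketch below follows the inclusion proof verification algorithm described in RFC 6962/bis, using the SHA256 interior node prefix `0x01`. The function name and argument order are assumptions. An empty path is accepted only for a tree of size one.

```python
import hashlib
from typing import Sequence


def _node(left: bytes, right: bytes) -> bytes:
    """Interior node hash with the 0x01 prefix."""
    return hashlib.sha256(b"\x01" + left + right).digest()


def verify_inclusion(leaf_hash: bytes, leaf_index: int, tree_size: int,
                     path: Sequence[bytes], root_hash: bytes) -> bool:
    """Return True if path proves leaf_hash is at leaf_index in the tree."""
    if leaf_index >= tree_size:
        return False
    fn, sn = leaf_index, tree_size - 1
    r = leaf_hash
    for p in path:
        if sn == 0:
            return False  # path is longer than the tree is deep
        if fn & 1 or fn == sn:
            r = _node(p, r)  # sibling is on the left
            if not fn & 1:
                # Skip levels where this node has no right sibling.
                while fn != 0 and not fn & 1:
                    fn >>= 1
                    sn >>= 1
        else:
            r = _node(r, p)  # sibling is on the right
        fn >>= 1
        sn >>= 1
    return sn == 0 and r == root_hash
```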
+
+#### `signed_checksum_v1`
+A checksum entry contains a package identifier like `foobar-1.2.3` and an
+artifact hash.  It is then signed so that clients can distinguish artifact
+hashes from two different software publishers A and B.  For example, the
+`signed_checksum_v1` type can help [enforce public binary logging before
+accepting a new software
+update](https://wiki.mozilla.org/Security/Binary_Transparency).
+
+```
+struct {
+	ChecksumV1 data;
+	SignatureV1 signature;
+} SignedChecksumV1;
+
+struct {
+	opaque identifier<1..128>;
+	opaque checksum<1..64>;
+} ChecksumV1;
+```
+
+It is assumed that clients know how to find the real artifact source (if not
+already at hand), such that the logged hash can be recomputed and compared for
+equality.  The log is not aware of how artifact hashes are computed, which means
+that it is up to the submitters to define hash functions, data formats, and
+such.
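As an illustration of the `ChecksumV1` layout and of the client-side comparison described above, consider the sketch below. Both vectors have ceilings below 256 and therefore use one-byte length prefixes. The helper names are hypothetical, and SHA256 over the raw artifact bytes is only an assumed submitter convention, since the log does not define how artifact hashes are computed.

```python
import hashlib
import hmac


def serialize_checksum(identifier: bytes, checksum: bytes) -> bytes:
    """Serialize a ChecksumV1: two length-prefixed opaque vectors."""
    if not 1 <= len(identifier) <= 128:
        raise ValueError("identifier must be 1..128 bytes")
    if not 1 <= len(checksum) <= 64:
        raise ValueError("checksum must be 1..64 bytes")
    return (bytes([len(identifier)]) + identifier
            + bytes([len(checksum)]) + checksum)


def artifact_matches(artifact: bytes, logged_checksum: bytes) -> bool:
    """Recompute the artifact hash (assumed SHA256) and compare it."""
    recomputed = hashlib.sha256(artifact).digest()
    return hmac.compare_digest(recomputed, logged_checksum)
```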
+
+## Public endpoints
+Clients talk to the log using HTTP(S).  The log responds to successfully
+processed requests with HTTP status code `200 OK`, and any returned data is
+TLS-serialized.  Endpoints without input parameters use HTTP GET requests.
+Endpoints with input parameters use HTTP POST requests that carry a
+TLS-serialized data structure.  The HTTP content type
+`application/octet-stream` is used when sending data.
+
+### add-entry
+```
+POST https://<base url>/st/v1/add-entry
+```
+
+Input:
+- An `StItem` of type `signed_checksum_v1`.
+
+No output.
+
+### add-cosignature
+```
+POST https://<base url>/st/v1/add-cosignature
+```
+
+Input:
+- An `StItem` of type `cosigned_tree_head_v1`.  The list of cosignatures must
+be of length one, the witness signature must cover the item's STH, and that STH
+must additionally match the log's stable STH that is currently being cosigned.
+
+No output.
+
+### get-latest-sth
+```
+GET https://<base url>/st/v1/get-latest-sth
+```
+
+No input.
+
+Output:
+- An `StItem` of type `signed_tree_head_v1` that corresponds to the most
+recent STH.
+
+### get-stable-sth
+```
+GET https://<base url>/st/v1/get-stable-sth
+```
+
+No input.
+
+Output:
+- An `StItem` of type `signed_tree_head_v1` that corresponds to a stable STH
+that witnesses should cosign.  The same STH is returned for a period of time.
+
+### get-cosigned-sth
+```
+GET https://<base url>/st/v1/get-cosigned-sth
+```
+
+No input.
+
+Output:
+- An `StItem` of type `cosigned_tree_head_v1` that corresponds to the most
+recent cosigned STH.
+
+### get-proof-by-hash
+```
+POST https://<base url>/st/v1/get-proof-by-hash
+```
+
+Input:
+```
+struct {
+	opaque hash[32]; // leaf hash
+	uint64 tree_size; // tree size that the proof should be based on
+} GetProofByHashV1;
+```
+
+Output:
+- An `StItem` of type `inclusion_proof_v1`.
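A request to this endpoint could be assembled roughly as follows. The sketch only builds the request object and does not send it; the function name is hypothetical, and `example.com:1234/log` is the sample base URL from the log parameters section.

```python
import struct
import urllib.request


def build_get_proof_by_hash(base_url: str, leaf_hash: bytes,
                            tree_size: int) -> urllib.request.Request:
    """Build a POST request that carries a serialized GetProofByHashV1."""
    if len(leaf_hash) != 32:
        raise ValueError("leaf hash must be 32 bytes")
    # opaque hash[32] has no length prefix; tree_size is a big-endian uint64.
    body = leaf_hash + struct.pack("!Q", tree_size)
    return urllib.request.Request(
        f"https://{base_url}/st/v1/get-proof-by-hash",
        data=body,
        headers={"Content-Type": "application/octet-stream"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send the request. The response body would then need to be parsed as an `StItem` of type `inclusion_proof_v1`.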
+
+### get-consistency-proof
+```
+POST https://<base url>/st/v1/get-consistency-proof
+```
+
+Input:
+```
+struct {
+	uint64 first; // first tree size that the proof should be based on
+	uint64 second; // second tree size that the proof should be based on
+} GetConsistencyProofV1;
+```
+
+Output:
+- An `StItem` of type `consistency_proof_v1`.
+
+### get-entries
+```
+POST https://<base url>/st/v1/get-entries
+```
+
+Input:
+```
+struct {
+	uint64 start; // 0-based index of first entry to retrieve
+	uint64 end; // 0-based index of last entry to retrieve
+} GetEntriesV1;
+```
+
+Output:
+- An `StItem` list where each entry is of type `signed_checksum_v1`.  The first
+`StItem` corresponds to the start index, the second one to `start+1`, etc.  The
+log may return fewer entries than requested.
+
+# Appendix A
+In the future other namespace types might be supported.  For example, we could
+add [RSASSA-PKCS1-v1_5](https://tools.ietf.org/html/rfc3447#section-8.2) as
+follows:
+1. Add `rsa_v1` format and `RSAV1` namespace.  This is what we would register on
+the server-side such that the server knows the namespace and complete key.
+```
+struct {
+	opaque namespace[32]; // key fingerprint
+	// + some encoding of public key
+} RSAV1;
+```
+2. Add `rsassa_pkcs1_5_v1` format and `RSASSAPKCS1_5V1`.  This is what the
+submitter would use to communicate namespace and RSA signature mode.
+```
+struct {
+	opaque namespace[32]; // key fingerprint
+	// + necessary parameters, e.g., SHA256 as hash function
+} RSASSAPKCS1_5V1;
+```