| -rw-r--r-- | README.md | 224 | 
1 files changed, 86 insertions, 138 deletions
@@ -1,140 +1,88 @@
-# System Transparency Front-End (STFE)
-**TODO:** update README to reflect the most up-to-date design
-[motivation](https://github.com/system-transparency/stfe/blob/design/doc/design.md),
-[specification](https://github.com/system-transparency/stfe/blob/design/doc/api.md),
-and current status.
-
-STFE is a [Trillian](https://transparency.dev/#trillian)
-[personality](https://github.com/google/trillian/blob/master/docs/Personalities.md)
-that allows you to log signed checksums.  What a checksum represents is up to
-the submitter.  For example, it could be a Firefox update, a Debian package, or
-a document.  A log leaf contains:
-- A _checksum_ that represents a data item of opaque type.
-- An _identifier_ that is tied to what the checksum represents.
-- A _signature_ over `checksum` and `identifier` using the submitter's secret
-signing key.
-- A _namespace_ that is tied to the submitter's verification key, e.g., think of
-it as a hashed public key.
-
-The log only verifies that an entry's checksum and identifier are
-cryptographically signed based on the specified namespace.  A client that wishes
-to enforce transparency logging could require that, say, a valid Debian package
-is only used if its checksum appears in the log with a correct namespace and
-identifier.  This allows us to:
-1. **Facilitate detection of compromised signing keys**, e.g., a software
-publisher can inspect the log to see if there are any unexpected checksums in
-their own signing namespace(s).
-2. **Ensure that everyone observe the same checksums**, e.g., there should never
-be two log entries with identical namespaces and identifiers but checksums that
-differ.
-
-## Current status
-STFE is at the proof-of-concept stage.  We have a
-[sketch](https://github.com/system-transparency/stfe/blob/main/doc/sketch.md) of
-the log's API, which basically defines data structures, data formats, and
-HTTP(S) endpoints.  Be warned that it is a living design document that may be
-incomplete and subject to major revisions.  For example, we are currently
-thinking about data formats and which parsers are reasonable to (not) force onto
-client-side tooling as well as server-side implementers and operators.
-
-There is a (very) basic client which can be used to interact with the
-log, e.g., to add entries and verify inclusion proofs against an STH.  We have
-yet to add client-side support for STFE's witness cosigning APIs.  Witness
-cosigning is part of the log's _gossip-audit model_, which must be well-defined
-to keep the log honest.<sup>[1](#footnote-1)</sup>
-
-In the near future we will set up a public STFE prototype with zero promises of
-uptime, stability, etc.  In the meantime you may get your hands dirty by running
-STFE locally.  Rough documentation is available
-[here](https://github.com/system-transparency/stfe/blob/main/server/README.md).
+# Signature Transparency Logging
+Signature Transparency Logging allows you to add signed checksums into an
+append-only and tamper-evident log.  What a checksum represents is up to the
+submitter.  For example, it could be a cryptographic hash of a Firefox update, a
+Debian package, or a document.  You can use Signature Transparency Logging to:
+1. Discover which signatures were produced by what secret signing keys.
+2. Be sure that everyone observes the same signed checksums.
+
+We abbreviate Signature Transparency Logging as _siglog_.
+
+## How it works
+Suppose that you develop software and publish binaries.  You sign those binaries
+and make them available to users in a package repository.  You are committed to
+distribute the same signed binaries to every user.  That is an easy claim to
+make.  However, word is cheap and sometimes things go wrong.  How would you even
+know if your signing infrastructure got compromised?  A few select users might
+already receive maliciously signed binaries that include a backdoor.  This is
+where siglog can help by adding transparency in the future.
+
+For each binary you can log a signed checksum that corresponds to that binary.
+If a signed checksum appears in the log that you did not expect: excellent, now
+you know that your signing infrastructure was compromised at some point.  Anyone
+can also detect if a logged checksum is unaccounted for in your package
+repository by inspecting the log.  In other words, the claim that the same
+binaries are published for everyone can be _verified_.
+
+Adding signed checksums into a log is already an improvement without any
+end-user enforcement.  Honest mistakes can be detected.  However, end-users need
+to enforce public logging to get the most out of siglog.  This means that a
+binary in the above example would be rejected unless a corresponding signed
+checksum is logged.
 
 ## Design considerations
-The following is a non-exhaustive list of design considerations that we had in
-mind while developing STFE.
-
-### Gossip-audit model
-Simply adding something into a transparency log is a great start that has merit
-on its own.  But, to make the most of a transparency log we should keep the
-following factors in mind as the ecosystem bootstraps and develops:
-1. Clients should verify that the signed checksums appear in a log.  This
-requires inclusion proof verification.  STFE forces inclusion proof verification
-by not issuing _promises to log_ as in [Certificate
-Transparency](https://tools.ietf.org/html/rfc6962).<sup>[2](#footnote-2)</sup>
-2. Clients should verify that the log is append-only.  This requires consistency
-proof verification.
-3. Clients should verify that they see the _same_ append-only log as everyone
-else.  This requires a well-defined gossip-audit model.
-
-The third point is often overlooked.  While transparency logs are verifiable in
-theory due to inclusion and consistency proofs, _it is paramount that the
-different parties interacting with the log see the same entries and
-cryptographic proofs_.  Therefore, we built a proactive gossip-audit model
-directly into STFE: _witness cosigning_.<sup>[3](#footnote-3)</sup>
-The idea is that many independent witnesses _cosign_ the log's STH if and only
-if they see a consistent append-only log.  If enough reputable parties run
-witnesses that signed-off the same STH, you can be pretty sure that you see the
-same log (and thus the same checksums) as everyone else.
-
-Moreover, if you rely on witness cosigning for security, all you need from, say,
-a software publisher, is an artifact, a public verification key, a cosigned STH,
-and an inclusion proof up to that STH.  To clarify why that is excellent:
-client-side verification becomes completely non-interactive!
-
-### Ecosystem robustness
-Our long-term aspiration is that clients should _fail-closed_ if a checksum is
-not transparency logged.  This requires a _robust log ecosystem_.  As more
-parties get involved by operating compatible logs and witnesses, the overall
-reliability and availability improves for everyone.  An important factor to
-consider is therefore the _minimal common denominator_ to transparency log
-checksums.  As far as we can tell the log's leaf entry must at minimum indicate:
-1. What public key should the checksum be attributed to.
-2. What opaque data does the checksum _refer to_ such that the log entry can be
-analyzed by monitors.
-
-Additional metadata needs can be included in the data that the checksum
-represents, and the data itself can be stored in a public unauthenticated
-archive.  Log APIs and data formats should also follow the principle of minimal
-common denominator.  We are still in the process of analyzing this further.
-
-### Spam and log poisoning
-Trillian personalities usually have an _admission criteria_ that determines who
-can include what in the log.  Without an admission criteria, the log is subject
-to both spam (large volumes of data) and poisoning (harmful data).
-
-The advantage of a small leaf is that spamming the log to such an extend that it
-becomes a significant storage and bandwidth burden becomes harder.  It also
-makes the log's policy easier, e.g., a max data limit is not necessary.
-
-Because every leaf is signed it is possible to apply rate limits per namespace.
-As a toy example one could require that a namespace is registered before use,
-and that the registration component enforces a single namespace per top-level
-domain.  To spam the log you would need an excessive number of domain names.
-
-A more subtle advantage of not logging the actual data is that it becomes more
-difficult to poison the log with something harmful.  Transparency logs are
-really cryptographic, append-only, and tamper-evident data structures: nothing
-can be removed or modified until the log shuts down.  Therefore, as few bytes as
-possible should be arbitrary in the log's leaf.  A reasonable goal could be to
-not take on a larger risk than Certificate Transparency.
-
-##
-<a name="footnote-1">1</a>:
-The lack of gossip-audit models that prevent and/or detect _split-views_ is
-documented quite well with regards to Certificate Transparency.  See, for
-example, the work of
-[Chuat _et al._](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7346853),
-[Nordberg _et al._](https://tools.ietf.org/html/draft-ietf-trans-gossip-05), and
-[Dahlberg et al.](https://sciendo.com/article/10.2478/popets-2021-0024).
-
-<a name="footnote-2">2</a>:
-So-called SCTs are signed promises that the log will merge a submitted entry
-within a Maximum Merge Delay (MMD), e.g., 24 hours.  This adds significant system
-complexity because the client needs to either verify that these promises were
-honored after the MMD has passed, or the client must trust that the log is
-honest.
-
-<a name="footnote-3">3</a>:
-Witness cosigning was initially proposed by [Syta _et al._](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7546521).
-The approach of [Meiklejohn _et al._](https://arxiv.org/pdf/2011.04551.pdf)
-is closer to ours but the details differ.  For example, witnesses poll STFE for
-STHs rather than waiting for a single broadcast.
+We had several design considerations in mind while developing siglog.  A short
+preview is listed below.  Please refer to our [design document](https://github.com/system-transparency/stfe/blob/main/doc/design.md)
+and [API specification](https://github.com/system-transparency/stfe/blob/main/doc/api.md)
+for additional details.  Feedback is welcomed and encouraged!
+- **Preserved data flows:** an end-user can enforce transparent logging without
+making additional outbound network connections.  Proofs of public logging should
+be provided using the same distribution mechanism as the data.  In the above
+example the software publisher would put these proofs into their package
+repository.
+- **Sharding to simplify log life cycles:** starting to operate a log is easier
+than closing it down in a reliable way.  We have a predefined sharding interval
+that determines the time during which the log will be active.
+- **Defenses against log spam and poisoning:** to maximize a log's utility it
+should be open for anyone to use.  However, accepting logging requests from
+anyone at arbitrary rates can lead to abusive usage patterns.  We store as
+little metadata as possible to combat log poisoning.  We piggyback on DNS to
+combat log spam.
+- **Built-in mechanisms that ensure a globally consistent log:** transparency
+logs rely on gossip protocols to detect forks.  We built a proactive gossip
+protocol directly into the log.  It is based on witness cosigning.
+- **No cryptographic agility**: the only supported signature scheme is Ed25519.
+The only supported hash function is SHA256.  Not having any cryptographic
+agility makes the protocol simpler and more secure.
+- **Few and simple (de)serialization parsers:** complex (de)serialization
+parsers increase attack surfaces and make the system more difficult to use in
+constrained environments.  End-users need a small subset of Trunnel to work with
+signed and logged data.  The log's network clients also need to parse ASCII
+key-value pairs.
+
+## Public prototype
+We implemented siglog as a [Trillian](https://transparency.dev/#trillian)
+[personality](https://github.com/google/trillian/blob/master/docs/Personalities.md).
+A public prototype is up and running with zero promises of uptime, stability,
+etc.  The log's base URL is `http://tlog-poc.system-transparency.org:4780/st/v0`.
+The log's public verification key is `bc9308dab23781b8a13d59a9e67bc1b8c1585550e72956525a20e479b1f74404`.
+
+You can talk to the log by passing ASCII key-value pairs.  For example,
+fetch a tree head and a log entry:
+```
+$ curl http://tlog-poc.system-transparency.org:4780/st/v0/get-tree-head-latest
+timestamp=1623053394
+tree_size=1
+root_hash=f337c7045b3233a921acc64688b729816a10f95f8be00910418aaa3c71245d5d
+signature=50e88b935f6010dedb61314685371d16bf180be99bbd3463a0b6934be78c11ebf8cc81688e7d11b0dc593f2ea0453f6be8ed60abb825b5a08535a68cc007e20e
+key_hash=2c27a6bafcbe210753c64666ca108025c68f28ded8933ebb2c4ef0987d7a6302
+$
+$ printf "start_size=0\nend_size=0\n" | curl --data-binary @- http://tlog-poc.system-transparency.org:4780/st/v0/get-leaves
+shard_hint=0
+checksum=0000000000000000000000000000000000000000000000000000000000000000
+signature_over_message=0e0424c7288dc8ebec6b2ebd45e14e7d7f86dd7b0abc03861976a1c0ad8ca6120d4efd58aeab167e5e84fcffd0fab5861ceae85dec7f4e244e7465e41c5d5207
+key_hash=9d6c91319b27ff58043ff6e6e654438a4ca15ee11dd2780b63211058b274f1f6
+```
+
+We are currently working on tooling that makes it easier to interact with the
+log.
