Filippo Valsorda 0:57 Yes. 0:57 Fantastic. 0:58 OK, hi, I'm Filippo. 1:01 I do a few things. 1:02 In my day job, I'm an open source maintainer. 1:04 I'm one of the maintainers of the Go cryptography standard library. 1:08 And I used to be the lead of the Go security team at Google. 1:11 And then in 2022, I quit because I had opinions on how you could fund open source maintenance. 1:17 And people said it was never going to work. 1:19 And I'm that kind of person. 1:21 Anyway, that's a completely different story, and that's what Geomys is about now. 1:25 It's a small organization of open source maintainers, but it's not what we're here talking about. 1:31 You might have also heard that I am on the board of the PLC organization that was just formed. 1:37 Woo! 1:37 Exactly. 1:41 Yes. 1:41 And that is closer to what I'm going to talk to you about, because I'm going to tell you about transparency logs. 1:47 Transparency logs are something I am really, really into because they meet the moment for providing accountability for the systems that we have and that we can make for users. 2:01 But a step back. 2:02 What is a transparency log? 2:03 A transparency log is a Merkle tree. 2:05 Stop me if you've heard this one. 2:07 It's an append-only Merkle tree, too. 2:10 It is not a Merkle search tree. 2:11 It's just a Merkle tree of entries that keeps growing. 2:15 And the entries are things that you want to have a list of, visible to everybody, and where everybody has the same list of things. 2:24 That is a very useful primitive because, for example, think about package management. 2:29 I'm involved in Go, and usually how package management works is that you have a registry and you make an account, you upload a thing, but that's very centralized. 2:40 Go has a completely different thing. 2:41 It has a decentralized system where the name of the module is where you fetch it, and there's a mechanism for fetching that data. 2:49 Stop me if you've heard this one. 
2:51 But the problem with that is that then you have problems like left-pad. 2:55 You have the problem that if you connect to that website, it knows that you connected to it and that you're using it, which is not great. 3:02 You can fix those things with a centralized system because you can put something that saves those modules and keeps them available. 3:10 So sometimes having a centralizing relay (stop me if you've heard this one) is useful. 3:17 However, the problem is that now you need to trust that centralized party, just like you needed to trust PyPI and npm. 3:24 Well, that's what we use transparency logs for. 3:28 There's a transparency log called the Go checksum database, which is just a long list of every Go module version and its hash. 3:35 It's append-only. 3:36 Everybody sees the same list, 3:39 and a client will only install a Go module if it has a cryptographic proof of inclusion in that list. 3:47 This already works. 3:48 You are using this if you're developing in Go. 3:50 And a lot of people don't know, because that's the beauty of tlogs, that you can build them on top of a centralized experience, and they just work. 3:59 Python has tried so many times to get package authors to sign their thing, and it never works, because either people don't check the signature, or if people check the signature, now you're terrified of losing your key. 4:10 So nobody signs, and you don't go anywhere. 4:13 With tlogs, one day we turned them on. 4:15 They're on. 4:16 And that works, and that's what the Go checksum database is. 4:20 This is derived from Certificate Transparency, which is what Sunlight was about, and there are links later. 4:27 Now, so the point... ah, damn it. 4:34 Ha! 4:34 The point of this is that you can use transparency logs to provide accountability where it's hard to provide decentralized trust. 
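To make "cryptographic proof of inclusion" concrete, here is a minimal sketch of an append-only Merkle tree with inclusion proofs, using the hashing and verification rules from RFC 6962 / RFC 9162 (the Certificate Transparency tree). The entry strings and function names are hypothetical, and this is not the Go checksum database's actual client code or leaf format.

```python
import hashlib

def leaf_hash(entry: bytes) -> bytes:
    # RFC 6962: leaves are hashed with a 0x00 prefix.
    return hashlib.sha256(b"\x00" + entry).digest()

def node_hash(left: bytes, right: bytes) -> bytes:
    # RFC 6962: interior nodes are hashed with a 0x01 prefix.
    return hashlib.sha256(b"\x01" + left + right).digest()

def _split(n: int) -> int:
    # Largest power of two strictly smaller than n (requires n >= 2).
    k = 1
    while k * 2 < n:
        k *= 2
    return k

def tree_hash(entries: list) -> bytes:
    # Root hash of the tree over all entries, in log order.
    n = len(entries)
    if n == 0:
        return hashlib.sha256(b"").digest()
    if n == 1:
        return leaf_hash(entries[0])
    k = _split(n)
    return node_hash(tree_hash(entries[:k]), tree_hash(entries[k:]))

def inclusion_proof(entries: list, index: int) -> list:
    # Sibling hashes from the leaf up to the root (what the log serves).
    n = len(entries)
    if n == 1:
        return []
    k = _split(n)
    if index < k:
        return inclusion_proof(entries[:k], index) + [tree_hash(entries[k:])]
    return inclusion_proof(entries[k:], index - k) + [tree_hash(entries[:k])]

def verify_inclusion(entry: bytes, index: int, tree_size: int,
                     proof: list, root: bytes) -> bool:
    # Client side: recompute the root from the entry and the proof
    # (RFC 9162 section 2.1.3.2), then compare with the trusted root.
    if index >= tree_size:
        return False
    fn, sn = index, tree_size - 1
    r = leaf_hash(entry)
    for p in proof:
        if sn == 0:
            return False
        if (fn & 1) or fn == sn:
            r = node_hash(p, r)
            if not (fn & 1):
                while fn != 0 and not (fn & 1):
                    fn >>= 1
                    sn >>= 1
        else:
            r = node_hash(r, p)
        fn >>= 1
        sn >>= 1
    return sn == 0 and r == root

# Hypothetical log entries: one line per module version and hash.
log = [b"example.com/a v1.0.0 h1:aaa", b"example.com/a v1.1.0 h1:bbb",
       b"example.com/b v0.3.0 h1:ccc"]
root = tree_hash(log)
ok = verify_inclusion(log[1], 1, len(log), inclusion_proof(log, 1), root)
```

A client that only has the root (in practice, a signed checkpoint) and the short proof can check an entry without downloading the whole list, while a monitor can download everything and recompute the same root.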
4:42 So instead of saying we are going to make a decentralized system where you have to connect to different package authors and there's all this problem, we said Google runs a proxy mirror. 4:54 But then we also said that proxy mirror is held honest by the checksum database so that if they ever put a fake module in there, the author of that module can notice. 5:03 Because the list that every client checks inclusion in is the same list that you can just sign up for, download the whole thing, and go like, yeah, that's my module. 5:12 Wait, that's not the module version I developed. 5:15 Google, what are you doing? 5:17 And Google is staking its reputation on the system. 5:22 To a cryptocurrency crowd, sometimes I describe it as there's proof of work, proof of stake, and proof of reputation. 5:31 Tlogs remove all of the crypto economics, remove all of the extra work, and instead say, look, are you willing to be accountable for the content of this data, for this whole dataset? 5:43 If yes, great. 5:45 We will make a system such that if a client trusts something, it means it's forever in a public list where it can be audited. 5:53 And if you put something wrong in there, you will be held accountable for that. 5:57 Which in practice would probably mean somebody forks Go to change who the checksum database operator is, because there is cryptographic proof that its steward abused that power. 6:08 Cool. 6:08 So that's why I like transparency logs. 6:10 I also like AT Proto. 6:12 So can we build transparency logs for AT Proto collections? 6:15 What is an example use case? 6:17 For example, if you're making a package manager on top of AT Proto, but also if you have any other situation where you would like all of your collections to be append-only, never change, and be auditable forever. 6:30 That is not the case for likes and posts. 6:33 That's fine. 
6:34 Sometimes you want to delete the cringe post from yesterday or undo the like that you did while scrolling somebody's profile without realizing it was not your following feed. 6:43 So those are deletable. 6:44 Fantastic. 6:45 But instead, there are other applications where it makes sense to not want deletions, not want things to change. 6:50 So let's say that you made some foo collection and abc collection 6:56 that you would like as a client to use knowing that those records can't go away. 7:01 Here's a sketch of how that could work in protocol. 7:05 Alice is developing a client that wants to consume foo and abc records, and she requests that a tlog be operated. 7:15 So she makes a record called a tlog config and says, hey, here are some collections, some NSIDs, some DIDs, can somebody please run a tlog out of these? 7:26 And a tlog operator, which can be multiple ones because permissionlessness allows picking different operators, like Geomys, could say, sure, we'll run tlogs for free up to 1 million entries in the tlog. 7:40 It's going to be what? 7:42 How much can a banana cost? 7:44 200 megabytes? 7:45 And it will create a tlog. 7:48 So it will take all of these records using Tap, put them into that append-only Merkle tree, and publish that Merkle tree using the C2SP tlog specifications, which are already an ecosystem that the Go checksum database uses, that static CT uses, and so on. 8:06 And after publishing that, it makes a record saying, yeah, all right, I'm operating an instance for that tlog config. 8:12 It's the tlog instance record. 8:14 It has a strong ref to the config and the URL where you can find that tlog. 8:19 Alice goes like, great, OK, I'm going to configure my clients to require an inclusion proof, a cryptographic inclusion proof in this tlog, before they trust a record. 8:31 And this is nice because even if our Tap instance messes something up and we lose this record over here, it doesn't end up in the tlog. 
8:40 So it's as good as if it didn't exist because the client will not accept that record, will not process it, will not trust it, will not install it if it's a package, will not show it if it's an attestation or a blue note or something like that because it doesn't have an inclusion proof. 8:55 So all of the things that get actually consumed by the client, you can be sure that they're in that list. 9:00 And then someone else can go and monitor that list. 9:03 And if they're the package author, they can be like, hey, no, wait, wait, wait, my PDS published something I didn't want to publish or anything of the sort. 9:11 The leaves of this tlog, so the actual things in the Merkle tree, would be the hash of the DID and the hash of the NSID. Why hashes? 9:19 Because you can never change the stuff in here. 9:22 And it's really nice not to get user-controlled data in places where you can't delete it. 9:28 Enough said. 9:30 And then the rkey, which probably would have to be a timestamp ID for the same reason, and the CID. 9:37 The records are not actually in there, but you have all of the cryptographic pointers to the records. 9:43 And if a record ever gets deleted, sure, this collection becomes unauditable, but the other ones are still standing because you can still check what the full set of those were by monitoring the tlog. 9:58 Then I guess we can also put the checkpoint back into the protocol because why not? 10:06 Yes, this is a thing that we can build very easily because tlog tooling, we've been developing it now for years to make it easy and cheap to run. 10:15 And there are a bunch of libraries now. 10:17 There is deployed software that already uses this: the Go checksum database, Sigstore, soon probably Merkle tree certificates, which will probably replace the WebPKI because quantum computers are coming. 10:31 Different story, different talk. 10:33 So, a bunch of links. 10:34 It's just a loose leaf from leaflet because this stuff works. 10:41 Right? 10:41 I love it. 
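The leaf layout described above (hash of the DID, hash of the NSID, the rkey, and the CID) could be sketched as follows. The talk does not specify a byte encoding, so the length-prefixed format, the choice of SHA-256, the `LeafEntry` name, and all the example values here are assumptions for illustration only.

```python
import hashlib
from dataclasses import dataclass

@dataclass(frozen=True)
class LeafEntry:
    did_hash: bytes   # hash of the DID, so the log never stores it verbatim
    nsid_hash: bytes  # hash of the collection NSID, for the same reason
    rkey: str         # record key, assumed to be a timestamp ID
    cid: str          # content identifier: the pointer to the record itself

def make_leaf(did: str, nsid: str, rkey: str, cid: str) -> LeafEntry:
    # SHA-256 is an assumption; the talk only says "the hash of".
    return LeafEntry(
        did_hash=hashlib.sha256(did.encode()).digest(),
        nsid_hash=hashlib.sha256(nsid.encode()).digest(),
        rkey=rkey,
        cid=cid,
    )

def _field(value: bytes) -> bytes:
    # Hypothetical framing: 2-byte big-endian length, then the bytes,
    # so that field boundaries are unambiguous.
    return len(value).to_bytes(2, "big") + value

def encode_leaf(leaf: LeafEntry) -> bytes:
    # The bytes that would be appended to the Merkle tree as one entry.
    return b"".join(_field(f) for f in (
        leaf.did_hash, leaf.nsid_hash, leaf.rkey.encode(), leaf.cid.encode(),
    ))

# All values below are made-up placeholders, not real identifiers.
leaf = make_leaf("did:plc:exampleexampleexample", "com.example.foo",
                 "3kexamplerkey", "bafyexamplecid")
entry_bytes = encode_leaf(leaf)
```

A monitor that already knows a DID and NSID can recompute the two hashes and find its own entries, while the log itself never holds those strings in a place where they cannot be deleted.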
10:42 And yeah, you can find a bunch of links and reach out if you— what I'm looking for next is somebody that has an ATproto use case for this, somebody that has an ATproto collection that they would like to be able to say the list is universally known and anybody can audit all of the things that a client could ever have trusted. 11:02 Thank you.
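As a rough sketch of the record flow from the talk (a tlog config published by the client developer, a tlog instance record published by an operator, and a client that refuses anything without an inclusion proof), the shapes might look like this. Every field name is hypothetical, and treating the config's scope as "any listed DID in any listed collection" is an assumption. The only borrowed convention is that an AT Protocol strong ref pairs a record URI with its CID.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class StrongRef:
    uri: str  # at:// URI of the referenced record
    cid: str  # CID pinning the exact version of that record

@dataclass(frozen=True)
class TlogConfig:
    collections: frozenset  # NSIDs the client wants logged
    dids: frozenset         # repositories whose records are in scope

@dataclass(frozen=True)
class TlogInstance:
    config: StrongRef  # which config this operator is serving
    url: str           # where the published tlog can be fetched

def instance_serves(instance: TlogInstance, config_ref: StrongRef) -> bool:
    # The operator's record must point at exactly the config version
    # the client was built against.
    return instance.config == config_ref

def in_scope(config: TlogConfig, did: str, nsid: str) -> bool:
    # Assumed scope rule: any listed DID, in any listed collection.
    return did in config.dids and nsid in config.collections

def should_trust(config: TlogConfig, did: str, nsid: str,
                 has_valid_inclusion_proof: bool) -> bool:
    # A record without an inclusion proof is treated as if it did not exist.
    return in_scope(config, did, nsid) and has_valid_inclusion_proof

# Placeholder values for illustration only.
config = TlogConfig(collections=frozenset({"com.example.foo", "com.example.abc"}),
                    dids=frozenset({"did:plc:alice-example"}))
config_ref = StrongRef(uri="at://did:plc:alice-example/com.example.tlogConfig/1",
                       cid="bafyexampleconfig")
instance = TlogInstance(config=config_ref, url="https://tlog.example/foo-abc")
```

Because the instance record is just another record, several operators could each publish one for the same config, and the client developer picks which ones the client requires proofs from.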