Understory
Everything Everywhere All at Once - an impromptu session - panproto.dev
30 min
Speaker A
0:00
is like a description plus some sort of start or initiation time and then some sort of end time can now be converted and viewed.
0:07
And that means that any Atom feed that has extensions or posts that could describe this information can then be accessible.
0:16
And that's something that can be available to every application out there.
0:19
I think that's very enabling.
0:22
That's my spiel.
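As a sketch of what Speaker A is describing (all field names here are invented, not any real lexicon), a one-way lens that lifts event-shaped feed posts into calendar events might look like:

```typescript
// Hypothetical sketch: a tiny one-way lens from a feed post that happens to
// carry event fields into a calendar-style event record.
type FeedPost = {
  text: string;
  startsAt?: string; // ISO 8601, present when the post describes an event
  endsAt?: string;
};

type CalendarEvent = {
  description: string;
  start: string;
  end: string;
};

// Returns null when the post doesn't carry enough structure to be an event.
function postToEvent(post: FeedPost): CalendarEvent | null {
  if (!post.startsAt || !post.endsAt) return null;
  return { description: post.text, start: post.startsAt, end: post.endsAt };
}

const post: FeedPost = {
  text: "Lens workshop",
  startsAt: "2025-05-01T10:00:00Z",
  endsAt: "2025-05-01T10:30:00Z",
};
const event = postToEvent(post);
```

Any feed whose posts can describe a description plus start and end times could, in principle, be viewed through a lens of this shape.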
Speaker B
0:24
Yeah.
0:28
Repeat for people online.
Speaker C
0:35
Yeah.
0:36
You can't do— yeah, sorry.
0:38
Can you do arbitrary computation in these lenses?
0:40
It is not set up to do— it is not Turing complete.
0:44
No.
0:45
At least this is not set up that way.
Speaker D
0:49
You cannot invert all computation.
0:51
If you could, the NSA would like to speak to you.
0:54
They have— you can probably make a lot of money very quickly.
0:58
And you should not be here.
0:59
You should be capitalizing on that and then going into hiding.
1:04
But no.
1:04
So like fundamentally, this— I think the key thing actually to take from this is that this cannot solve all the problems.
1:12
Also, just because you can lens two schemas together doesn't mean that it's meaningful.
1:17
So there are— what we're doing is we're shifting from a hard technical problem to a hard social problem.
1:23
By removing the technical barriers, we will create a host of new miseries.
1:29
But those are preferable miseries to the ones that we have, I believe.
1:33
The other thing is that there are cases where you can technically lens data, but socially it would be unwise or unacceptable.
1:41
I think, like, gender fields.
1:42
If you go from a system which classifies everybody as M or F and you move to a system where it's a more open gender field, mapping people back just because you can technically does not mean that socially this will not create problems.
1:55
There are lots of other sort of less political examples.
1:58
If you want to take phone numbers without area codes, you could say: anything that had no area code, from before we had to track that, we add a default area code.
2:06
You go back, and it's like, well, now you've thrown away the area code, and now it doesn't work.
2:09
So you are going to have cases where you cannot map the domains.
2:15
And so you're going to need to augment this approach with other things.
2:17
But you're going again from a 0% solution to, I would argue without evidence, a 90%+ solution.
2:25
And that's a step at least in the right direction.
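The area-code example can be made concrete. In this invented sketch, the forward direction fills in a default area code, and the backward direction cannot tell a defaulted record from a genuine one; the information is simply gone:

```typescript
// Sketch of a non-invertible lens (all names and the default are invented).
type LocalPhone = { number: string };           // legacy schema: no area code
type FullPhone = { areaCode: string; number: string };

const DEFAULT_AREA_CODE = "212"; // illustrative default for legacy records

// Forward: fill the default when no area code exists.
function forward(p: LocalPhone | FullPhone): FullPhone {
  if ("areaCode" in p) return p;
  return { areaCode: DEFAULT_AREA_CODE, number: p.number };
}

// Backward: the old schema has no area-code field, so real "212" numbers and
// defaulted ones collapse together; the round trip is lossy.
function backward(p: FullPhone): LocalPhone {
  return { number: p.number };
}

const legacy: LocalPhone = { number: "555-0100" };
const migrated = forward(legacy);        // { areaCode: "212", number: "555-0100" }
const roundTripped = backward(migrated); // area code thrown away
```

This is the general shape of the problem: some domain mappings are not invertible, and no lens machinery can fix that.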
Speaker B
2:34
I've got a statement, not a question.
2:36
Which is, Peter was kind of trolling and saying, like, oh, you standards people argued about the perfect schema.
2:43
And to me, I mean, the value, I don't know, I've got maybe my archivist hat on or something, but the value is always in the data that's out there.
2:49
That's the good stuff, right?
2:51
And like, you can't argue with it anymore.
2:53
It's out there.
2:55
And, you know, to me the real problems aren't, like, oh, everyone chose a different field name or capitalization or these other things.
3:03
It's things like, I don't know, this book was published in summer 1970.
3:08
Is summer before or after July 5th, and you want to do something like list books in order, and you just can't do that.
3:16
You know, so you're getting these cases, and I think the value that's exciting to me about this is this noticing thing, being able to deal with raw data and at least surface these problems rapidly without having to spend hours and hours and hours sorting through them.
3:30
It'd be nice to know, you're like, all right, I've got this catalog with 20 million books in it.
3:35
Like how many have that summer problem?
3:37
And this can maybe get at that: dealing with lots of data in different formats and seeing how much of it you can munge and how much you can't, I don't know.
3:46
Is this all, like, are there good munging tools in here?
3:49
I guess superficially this looks like, oh, this will solve a problem.
3:54
It'll let you represent stuff.
3:56
But it's also like a power toolkit for munging existing data.
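The "how many have that summer problem" question is the kind of thing a small scan can answer. A rough sketch, with invented catalog data:

```typescript
// Scan a catalog for records whose publication date is a fuzzy season
// ("summer 1970") rather than something sortable. Data is invented.
type Book = { title: string; published: string };

const SEASON = /^(spring|summer|fall|autumn|winter)\s+\d{4}$/i;

function fuzzyDateCount(books: Book[]): number {
  return books.filter((b) => SEASON.test(b.published.trim())).length;
}

const catalog: Book[] = [
  { title: "A", published: "1970-07-05" },
  { title: "B", published: "summer 1970" },
  { title: "C", published: "Winter 1969" },
];
const count = fuzzyDateCount(catalog); // 2 of 3 records have the problem
```

Run over 20 million records, a check like this surfaces how much of the archive even admits a clean ordering before you try to impose one.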
Speaker E
4:00
That raises a really interesting point. So part of what got me really excited about this whole area was that I worked at Condé Nast some years ago.
4:15
And Condé Nast is actually 15 companies in a trench coat.
4:19
And each of the brands, so like Wired, New Yorker, et cetera, had web teams that had their own identities and their own technical challenges and everything and their own websites.
4:32
And their own Markdown formats.
4:35
And they forked the Markdown formats and they were mutually incompatible.
4:39
So when we went to build like a unified backend, that was fun.
4:45
And actually it was worse than what I've sort of just described because we couldn't stop the world and do a full migration of all of Condé Nast's archives, like take down the sites.
4:57
It's a lot of data.
5:01
And even if we could, we couldn't do it because there were a bunch of legacy applications that were never going to get updated.
5:08
There was no way.
5:09
We didn't even know how these things ran, let alone where the source code was or, you know, what we were going to do about them.
5:15
But they expected this old broken format.
5:18
And so we used, basically, a very, very primitive version of this lensing approach.
5:25
We left the Markdown in place.
5:27
We tagged it and then we lensed it into our common sort of meta format and then lensed it back out into whatever format the consumer was looking for.
5:39
And it meant that these legacy applications could keep running and the new applications could move on and like, you know, play with new schemas and stuff.
5:47
And so it unblocked that organizational piece.
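The hub-and-spoke arrangement being described can be sketched roughly like this (the dialect names and the bold-syntax difference are invented; the real lenses were far richer):

```typescript
// Hub-and-spoke: each Markdown dialect gets a lens into a shared meta format
// and back out, so legacy content stays in place and every consumer gets the
// dialect it expects.
type Meta = { bold: string[] };

// Invented difference: dialect A marks bold with **stars**, dialect B with
// __underscores__.
const dialectA = {
  toMeta: (src: string): Meta => ({
    bold: [...src.matchAll(/\*\*(.+?)\*\*/g)].map((m) => m[1]),
  }),
  fromMeta: (m: Meta): string => m.bold.map((b) => `**${b}**`).join(" "),
};
const dialectB = {
  toMeta: (src: string): Meta => ({
    bold: [...src.matchAll(/__(.+?)__/g)].map((m) => m[1]),
  }),
  fromMeta: (m: Meta): string => m.bold.map((b) => `__${b}__`).join(" "),
};

// A legacy document tagged as dialect A, served to a consumer expecting B.
const converted = dialectB.fromMeta(dialectA.toMeta("**hello** and **world**"));
```

The point of the hub is that adding an Nth dialect means writing one pair of lenses, not N pairs.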
5:49
And I think that like the archiving piece is really connected there because a lot of those archives are connected to old software that expects the old archive format.
5:58
And so you actually don't want to just like, oh, we've come up with a new format and we migrated everything and now everything old is broken.
6:03
So that's a piece.
6:05
There's another piece.
6:07
We haven't talked about the version control piece here.
6:11
For now, I'm going to let that be an exercise left to the audience.
6:16
But Daniel had a question.
Speaker D
6:20
Yeah.
Speaker F
6:20
One of the big benefits of Lexicon to me over like defining generic posts or follows or something like that is, well, one, you get to bake whatever application-specific semantics you want into it, but it also makes it clear, like, kind of the context that the thing is used in.
6:36
And, you know, probably the easiest lens to create is, like, transitioning from one follow to another follow.
6:43
Like, they're going to have the exact same fields in it probably.
6:46
Um, but I don't want all of my LinkedIn follows to be my Bluesky follows, right?
6:52
And so, uh, I'm basically checking my understanding of this.
6:56
It's like the app's prerogative whether it wants to apply a lens to something.
7:01
So it's always the app's choice. If you are, like, Twitter or a Bluesky app and you want to interoperate with LinkedIn follows, it's the Bluesky app's prerogative to apply that lens, right?
Speaker A
7:15
It's similar to Golang interfaces.
7:20
You have a local interface that you're using to enforce the local contract, and if something comes in that matches that contract and you want it to, then you're, you know, you can do so.
7:28
And if something doesn't, then it doesn't follow the shape of that data in whatever prescription is applicable.
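The Golang-interface analogy maps naturally onto structural typing. In this invented sketch, any record with the right shape satisfies the local contract, whatever schema it came from:

```typescript
// Structural typing as the local contract: the Follow shape here is
// hypothetical, not a real lexicon.
type Follow = { subject: string; createdAt: string };

// Runtime check mirroring the compile-time shape.
function isFollow(v: unknown): v is Follow {
  return (
    typeof v === "object" && v !== null &&
    typeof (v as Follow).subject === "string" &&
    typeof (v as Follow).createdAt === "string"
  );
}

// A record from some other system: extra fields are fine, missing ones are not.
const foreign = {
  subject: "did:example:alice",
  createdAt: "2025-01-01",
  origin: "linkedin",
};
const notAFollow = { name: "alice" };

const accepted = isFollow(foreign);    // true: matches the contract's shape
const rejected = isFollow(notAFollow); // false: doesn't follow the shape
```

As with Go interfaces, the contract is enforced locally: matching data can be admitted if you want it, and non-matching data simply doesn't fit.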
7:35
Also, realistically, you know, some, some data types, data structures, and data will require defaults or more involved field mapping or validation.
7:47
So one of the first implementations of this in Lexicon Garden is I changed the diff mechanism from one version of a lexicon to another, paying attention to repo rev as the major indicator or the major hint.
8:00
And now it's able to say, oh, hey, this is not just a change of a field that was added to this component, but it actually changes the validation mechanism, because it says, hey, it went from 500 characters to 300 characters.
8:16
Thus, that's effectively a breaking change.
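A toy version of that diff check, with an invented and much-simplified field definition, might look like:

```typescript
// Classify a change between two versions of a (simplified, invented) field
// definition. Shrinking maxLength is breaking; growing it is not.
type FieldDef = { type: string; maxLength?: number };

type Change = "compatible" | "breaking";

function classify(oldDef: FieldDef, newDef: FieldDef): Change {
  if (oldDef.type !== newDef.type) return "breaking";
  if (
    oldDef.maxLength !== undefined &&
    newDef.maxLength !== undefined &&
    newDef.maxLength < oldDef.maxLength
  ) {
    // Existing 500-character records would fail the new 300-character rule.
    return "breaking";
  }
  return "compatible";
}

const verdict = classify(
  { type: "string", maxLength: 500 },
  { type: "string", maxLength: 300 },
); // "breaking"
```

A real diff over lexicons would walk nested definitions and many more constraint kinds, but the asymmetry is the heart of it: tightening validation breaks existing data, loosening it does not.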
8:19
So you have the ability to apply— I think we're going to have to have the ability to apply additional lens structure for specific cases.
8:29
It's a nice thought to say, oh, everything can become everything everywhere all at once.
8:34
But in practice, because you have the publication of lexicons coming down the event stream and you can witness them, you can more easily update your CI/CD process to say, oh, hey, here's a backlog of 20 new things.
8:47
Hopefully not, right?
8:49
That have the general shape.
8:50
Let's see if we want to integrate and provide that additional functionality.
Speaker E
8:56
Just to add to that and point back to what Peter was saying: that's a social problem.
9:02
And we need to decide what the social translation boundaries are.
9:09
We can automatically build the graph.
9:11
Just because we can doesn't mean we should.
9:14
But I think some of the stuff that Aaron was pointing out, like you can say in the graph of all of the composable lenses that actually this one doesn't make sense.
9:26
And you can, yeah, so you can sort of have that in the graph and that can be a human layer on all of this stuff.
Speaker D
9:34
Maybe, I'm sorry if I'm jumping the queue, if anybody had their hand up.
9:37
One other thing that I think is cool about all this stuff is that if you like engineer the built environment correctly, the lenses, these translations can be end user developed and end-user integrated, which means the app developer of the recipe app Blaine's making doesn't have to say, I have chosen how to integrate this other recipe corpus's schema.
10:07
A user should be able to say, I just got some recipes here from a thing I made, or that I got from a friend, or from this other app, and yeah, this is how you translate it into your schema.
10:16
And then it just— Blaine's app never finds out that that happened.
10:20
It just gets some data in a schema it recognizes.
10:23
And for example, if these lenses are content addressed and composable and crawlable, then if anybody anywhere has done this and you can discover them, then you gain the ability for anyone to link up new applications and new data into this system.
10:40
And over time, we might need to find different ways to refine them, right?
10:43
Like, you know, one schema defines it as tags, another schema defines it as flags, and then somebody maps them together, and later you realize they weren't quite the same thing.
10:53
Like, there's gonna be problems.
10:54
I'm not trying to pretend there aren't problems.
10:57
But it gives— it removes the responsibility of having a central authority even at the app level and allows many different participants in the ecosystem to contribute.
11:07
And I think that's quite a compelling property.
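A minimal sketch of content-addressed, composable lenses (the registry keys, schema names, and lens bodies are all invented stand-ins):

```typescript
// Each lens is stored under a key derived from its definition, so anyone who
// has built a translation can publish it and others can discover and chain it.
type Lens = { from: string; to: string; apply: (v: any) => any };

const registry = new Map<string, Lens>();

// Stand-in for a real content hash (e.g. a CID); a string key for illustration.
function publish(lens: Lens): string {
  const key = `lens:${lens.from}->${lens.to}`;
  registry.set(key, lens);
  return key;
}

// Lenses compose: app1 -> common plus common -> app2 gives app1 -> app2.
function compose(a: Lens, b: Lens): Lens {
  return { from: a.from, to: b.to, apply: (v) => b.apply(a.apply(v)) };
}

publish({ from: "app1.recipe", to: "common.recipe", apply: (v) => ({ title: v.name }) });
publish({ from: "common.recipe", to: "app2.recipe", apply: (v) => ({ heading: v.title }) });

const chain = compose(
  registry.get("lens:app1.recipe->common.recipe")!,
  registry.get("lens:common.recipe->app2.recipe")!,
);
const out = chain.apply({ name: "Garlic soup" }); // { heading: "Garlic soup" }
```

If the registry is crawlable, an app that has never seen `app1.recipe` can still discover a published chain into a schema it recognizes.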
Speaker E
11:11
And that reminds me of a really important property here that I'm really excited about happening on atproto: because all of these lenses are defined as lexicons, they're discoverable over the network.
11:25
So you don't need to install code.
11:26
One of the challenges that we had with atjson, the thing that we did at Condé, was that the software management of a bunch of lenses was a total nightmare.
11:38
It was just really, really hard.
11:40
And the lexicon piece of this just kind of magically solves it, which is really, really awesome.
11:46
So with the relational tech stuff, you can have never seen a text format before and, in your browser, in real time, discover a path through the graph to that format and translate it.
12:00
And now you can edit it bidirectionally, collaboratively.
12:04
Any other questions?
Speaker G
12:14
I love all this.
12:15
It really resonates with me and I think with our project where we've been kind of wrestling for 6 months trying to define a bunch of standard lexicons and try and get it absolutely right because once it's set, it's set and you can't break stuff.
12:32
And in that light, the first thing that came to me when I saw this lenses idea was not so much interoperability between different apps and different use cases, but just with a single use case, just like schema migrations and schema versioning.
12:48
And you know, you said it's the exercise for the reader, so maybe I'm already kind of venturing into that spot, but it seems like an obvious—
Speaker D
13:00
There's industry prior art here.
13:02
If you read the Cambria paper, we name-check a project out of Stripe where they had this problem, which is everybody installed their Stripe API clients and then they want to add new endpoints, they want to change things, but the old clients, they can't break them.
13:14
There's money on that, right?
13:16
And so their strategy was they would only ship and maintain the current version of the API.
13:23
But every version of the API, you had to build a lens effectively back to the previous one, and those would stack.
13:30
So if you came from a really old API, you'd hit some middleware on their server that would like gradually pop, pop, pop, pop all the way up to the current API.
13:38
So you don't have that shotgun typing supporting random flags like Avro, like everything is— Protobuf has this problem.
13:48
Every field is optional.
13:49
'Cause old clients and new clients need to interop.
13:51
So it's just like, you know, your type is question mark on every field.
13:57
Like, cool, thanks.
13:59
Right?
13:59
And so the nice thing with the Stripe approach is it deals with this complexity at the boundary of the system.
14:03
And within the system, you have good types.
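The Stripe-style chain described here can be sketched as stacked one-step migrations that "pop" an old payload up to the current version (the versions and fields are invented):

```typescript
// Only the current shape lives in the core; each release ships one step-up
// migration, and old requests pop through them in order at the boundary.
type Migration = { from: string; to: string; up: (v: any) => any };

const chain: Migration[] = [
  // 2023-01 -> 2024-01: a currency field was added, defaulting to "usd".
  { from: "2023-01", to: "2024-01", up: (v) => ({ ...v, currency: v.currency ?? "usd" }) },
  // 2024-01 -> 2025-01: amounts moved from whole units to cents.
  { from: "2024-01", to: "2025-01", up: (v) => ({ ...v, amountCents: v.amount * 100, amount: undefined }) },
];

function upgrade(version: string, payload: any): any {
  let i = chain.findIndex((m) => m.from === version);
  if (i === -1) return payload; // already current
  let v = payload;
  for (; i < chain.length; i++) v = chain[i].up(v); // pop, pop, pop up to current
  return v;
}

const modern = upgrade("2023-01", { amount: 5 });
// modern.currency === "usd", modern.amountCents === 500
```

The complexity lives entirely at the boundary; everything inside the system works against the one current, well-typed shape.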
14:06
And so what we're doing is trying to generalize that to be bidirectional and scalable, which is just as relevant for your own single-use application.
14:14
And one maybe interesting thing about this is if you think just a little bit more expansively about your applications, even in your own sort of traditional software engineering environment, you actually have lots of versions running at once.
14:26
There's what's in dev.
14:28
There's what's in your dev.
14:29
There's what's in my dev.
14:30
I want to test your code, right?
14:32
And right now we just can't use real data in those environments.
14:36
And then there's the staging problem, which is slightly more real.
14:39
And then we go to prod.
14:40
And a principled approach to this system would allow us to work with real data in development safely and securely, which would give us a lot of, I think, more confidence in how we build software.
14:53
Now, this doesn't apply to every domain.
14:55
I know there's cases where this isn't true, but this, this model, I think, is very promising.
Speaker E
15:00
Do you maybe— Aaron, go ahead.
Speaker C
15:03
Yeah, I was just going to say something about that because so the— there is a VCS built into the system as well.
15:12
It's, it's intentionally very Git-like.
15:18
And so, yeah, I would love for people to play with that, because one of the main points of the VCS is actually to do schema versioning and allow you to fork.
15:27
And yeah.
Speaker D
15:32
Thanks.
Speaker G
15:35
Yeah, that is really great.
15:38
And I think, like, this is a bit of feedback from someone working on a project. We're relatively new to atproto, so, you know, the last 9 months or something, and we've had, like, significant challenges with this whole, yeah, developing lexicons in test, staging, prod.
15:56
We're right in the thick of that now and, and really feeling the pain.
15:59
And if we had some versioning system, that would be great.
16:02
Um, I also had this kind of devil's advocate thought.
16:07
So on the spectrum from "you cowards, we should put the work in and define a schema and solve it at the technical level" all the way through to "let's solve it at the social level."
16:19
Do you think there's a bit of a danger, if we go too far into the social side of things, that we kind of ruin interop? You know, it gives everyone the excuse not to bother trying to converge to some degree, and then you get this kind of pluralism approach and fragmentation and—
Speaker E
16:35
Yeah, I'll speak to that.
16:38
And I feel like Peter might have some thoughts.
16:42
Yeah, absolutely.
16:43
So it's definitely a danger, but I think— and I'll use a specific example— I think what this does is it allows experimentation, permissionless experimentation, to kind of go out and diverge.
16:59
But it also allows convergence, because now you can take your custom schema and move it back into, you know, something with more consensus.
17:14
And I think this manifests in a bunch of different places.
17:17
So with the recipe thing, in parsing all of those recipes, there's a bunch of really bad data, right?
17:23
Like just a really simple example.
17:26
There's— I saw an ingredient that's minced garlic, right?
17:33
And it, you know, the recipe thing automatically computes nutritional information.
17:39
It doesn't— the USDA database doesn't know what minced garlic is.
17:44
And so I've got a superseded-record lexicon record that basically says this ingredient has been superseded by this other ingredient.
17:55
And so I added a new preparation attribute to an ingredient record.
18:01
So instead of just having a text string that's like, what's the ingredient?
18:05
Now it has a preparation.
18:07
So it's garlic minced.
18:10
And actually that record itself can point back to the original garlic with a preparation note of minced.
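The superseded-ingredient idea might look roughly like this, with invented record shapes and a deliberately naive text split:

```typescript
// A raw "minced garlic" string becomes a structured ingredient plus a
// preparation, with a pointer back to the record it supersedes.
type RawIngredient = { id: string; text: string };

type Ingredient = {
  id: string;
  name: string;        // something a nutrition database can look up
  preparation?: string;
  supersedes?: string; // id of the record this one replaces
};

function refine(raw: RawIngredient): Ingredient {
  // Naive split for illustration: "minced garlic" -> prep "minced", name "garlic".
  const [prep, ...rest] = raw.text.split(" ");
  return {
    id: `${raw.id}-v2`,
    name: rest.join(" "),
    preparation: prep,
    supersedes: raw.id,
  };
}

const refined = refine({ id: "ing-1", text: "minced garlic" });
// { id: "ing-1-v2", name: "garlic", preparation: "minced", supersedes: "ing-1" }
```

Because the new record keeps the back-reference, the original free-text data is never destroyed; it's gardened.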
18:16
And so that's, you know, it's sort of this minimization of free energy principle where actually we can, we can collectively enhance the quality of the data irrespective of which schemas we're using.
18:31
And we can, we can like, you know, now we can actually garden our data.
18:35
We can tend to our data.
18:36
And I think that's a really, really important property.
18:38
And it hopefully speaks to the real risk.
18:42
Like if we don't do the gardening, we're going to end up with chaos.
18:45
But, you know, it gives us the opportunity to do that gardening, I think.
Speaker D
18:52
Yeah, I guess I just think we already lost.
18:57
And I think Bluesky is exciting because it's like the door opened up a crack again.
19:04
But the semantic level that the protocol operates on is very low.
19:08
And so people are like, oh, we get another chance.
19:11
We're going to do the semantic web again.
19:12
No one— sorry, we didn't say that.
19:13
No one admits that out loud in public, I know.
19:15
But it's like there's this feeling like, oh, we'll get it right this time.
19:18
We'll do books.
19:20
Yeah, we'll do recipes.
19:21
It's gonna work this time.
19:22
It's gonna— we'll solve it.
19:23
But it's like, I can't open my Apple Notes in any other text editor.
19:30
That is wild if you think about it.
19:35
And so, like, I think it's not enough to solve this problem for microblogging or, like, publishing on atproto.
19:44
I think this is a problem for all of our systems on all of our computers, and we should treat it with the seriousness of, like, to-do apps, note-taking apps, recipe apps, workout apps.
19:59
God.
20:00
And then you end up just recording your workouts in an Apple Note anyway.
20:03
And now like it's semi-structured data.
20:05
How do you enhance it?
20:07
So I think, like, there's a lot of transformational relationships in data as well.
20:11
And I think Blaine's idea that we should aspire to perfectly encoding all data is not really credible to me.
20:20
And I love you, Blaine.
20:22
I think it's a great vision.
20:24
But people are lazy and informal and they don't know the answer to things and that's okay.
20:28
And so I think the vision of schemas that can be permissive of informal data and don't impose— you know, maybe minced garlic is different because you buy it in a jar.
20:38
And so it actually has different nutritional information.
20:40
And that's important for some people in some context, right?
20:43
But like, it's, you know, we don't know what the future holds and we shouldn't impose a vision of like modernist formality on all data and assume that anything less than that is falling short.
20:54
We should embrace that formality has costs, and the value of formality varies tremendously with your context and your goals and your needs.
21:02
And so we should embrace the idea that data comes in all forms of formality and for all kinds of applications.
21:10
And so that's my take.
21:11
I don't think it's— you know, I was being droll when I sort of said we should abandon this project.
21:15
I actually think it's deeper than that, which is that there is a huge amount of opportunity if we can figure out how to enrich and relax our understanding of data in different places and different contexts.
21:27
But that's another project for another session, I think.
Speaker E
21:34
I fully agree with Peter.
21:37
I think the thing that this gets us out of is like we have standardized on Markdown, which is sort of a data solipsism, right?
21:44
Like, it's just, we want these open-ended data formats, and what's the lowest common denominator?
21:52
And it's just like the text that we can type into because building more complicated things is hard.
21:58
And I think that that works for us as developers a lot of the time, and it completely fails non-developers.
22:06
And it centers developers in terms of the power dynamic of building software.
22:10
And I guess my hope for this and for atproto is that we can kind of center someone else besides developers in navigating these questions, really.
Speaker A
22:25
Yeah, to sort of go back to your point a few minutes ago, I would have agreed with you 5 to 10 years ago.
22:34
But there's two sort of things that really stand out in my mind.
22:37
And that's first is we as a developer ecosystem, we are bought into the whole locked open construct.
22:45
Users have data that applications can access.
22:48
And there's a certain amount of interop
22:50
baked into things through the lowest-level primitives of the protocol and the agreements that we have as applications.
22:58
And I think second— there's three, sorry, three points.
23:02
The second is the premise that lenses benefit the receiver.
23:07
They benefit the person who's reading the data.
23:09
The person who's reading the data is going to put in the effort where there's interest in converting the data as they see fit and tending to the garden as they see fit and doing all of that stuff.
23:20
Lastly, the third point, and there's not a fourth, is that agentic tooling has changed the game in terms of how we process, store, label, annotate, et cetera.
23:31
So I'm going to use the word burden in a very superficial way.
23:37
The burden for us as developers to do complicated tool-supported, AI-supported, machine-learning-supported processing is way, way lower.
23:49
It's really easy to have, you know, Claude with an MCP go through all of the events and say, convert these to posts, because it can process them.
24:01
It knows how to use the libraries and tools that support this feature and then go about doing it and then look for errors because it's a big old, you know, pattern matching machine.
24:12
Or maybe not errors, but outliers in the data, to then flag and process differently.
24:18
And I think that that's just not something that we think about sometimes because it hasn't been possible forever.
Speaker G
24:28
Yeah.
24:29
Just to be clear, I'm not advocating for the, you know, "let's not be cowards" position.
24:34
And like, yeah, I think that's understood.
24:36
And in fact, about 3 years ago, I gave a talk at a conference on, like, the need for a universal data schema for carbon credits, right?
24:48
And actually, the title of the talk was almost a troll, because basically what I was presenting was that we're never going to have a universal standard, and the only solution is to make sure that all the standards are discoverable and, you know, transformable between them.
25:05
Like, if we maximize for that, then that, that's the only realistic solution.
25:09
So I've already been advocating for the same thing in a completely different, um, context.
25:14
My point, really, in taking the devil's advocate position is just to make sure that we're aware that with the lens approach, it's kind of an n-squared problem: you need a lens between every pair of formats, or you build some kind of network, right?
25:35
So yeah, you can optimize it, but the more formats you have, the more you have to solve the problem of, like, where's the path from any given format to any other format.
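The path question just raised is ordinary graph search. A breadth-first sketch over an invented lens graph:

```typescript
// With a lens per edge rather than per pair, breadth-first search finds a
// translation route between formats. Format names are invented.
const edges: Array<[string, string]> = [
  ["app1.recipe", "common.recipe"],
  ["common.recipe", "app2.recipe"],
  ["app2.recipe", "app3.recipe"],
];

function findPath(from: string, to: string): string[] | null {
  const queue: string[][] = [[from]];
  const seen = new Set([from]);
  while (queue.length > 0) {
    const path = queue.shift()!;
    const node = path[path.length - 1];
    if (node === to) return path;
    for (const [a, b] of edges) {
      if (a === node && !seen.has(b)) {
        seen.add(b);
        queue.push([...path, b]);
      }
    }
  }
  return null; // no lens route exists between these formats
}

const route = findPath("app1.recipe", "app3.recipe");
// ["app1.recipe", "common.recipe", "app2.recipe", "app3.recipe"]
```

With a hub format, the graph stays sparse and routes stay short; without one, the edge count tends toward n-squared.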
25:49
And it's totally solvable, and I totally agree with your point about AI making this way easier. But, you know, maybe there's, not necessarily a middle ground, but
26:00
something closer to "let's lens the hell out of everything" than to "let's standardize on one thing." But maybe, you know, a bit of balance between the two.
26:11
Yeah.
Speaker A
26:11
And I don't think this will— I don't think any of this negates the need to have— negates— oh, I'm terribly sorry.
26:16
Yes, I, I totally agree with, uh, everything you said.
26:20
Um, I don't think any of this negates the need to also have the back references and the original content as, you know, a direct association.
26:31
Because when we talk about migrating and converting data, we talk about all of these complex transformations, especially if we look at the explosion or compression of multiple data pieces, right?
26:42
And how that fits into some of this.
26:46
Things like, you know, labelers.
26:48
What do they do for the references that are either split up or composed from multiple things?
26:54
Or how does that functionally get resolved from a user's point of view when they start seeing their stuff split up or aggregated in different ways?
27:04
So I think there's going to be a lot of stuff to work out.
27:07
And none of this— it reduces some work in some ways for app developers and also increases the work in other ways.
27:17
And not all technical issues are solved, but I still think it's very exciting.
Speaker C
27:22
Yeah, another thing I wanted to mention about the point-to-point point is that not only are they composable, but they are genericizable.
27:33
So this is the point of abstracting out from a lens between specific schemas.
27:39
So the schemas themselves are— sorry, I should say the schema languages themselves are decomposed into kind of natural families of constraints.
27:49
And when you kind of abstract the lens out from being from one schema language to another, what it needs to just look at is which family of constraints this particular schema language satisfies.
28:04
So there's still some amount of point-to-point, but then you're just talking about families of constraints.
28:14
But there's fewer of them at least than there are schema languages.
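One way to picture genericizing over families of constraints rather than concrete schema languages (everything here is invented):

```typescript
// A generic lens targets a constraint family; any schema language declaring
// that family can use it, so lenses scale with families, not languages.
type ConstraintFamily = "boundedLength" | "enumValues";

type SchemaLang = { name: string; families: ConstraintFamily[] };

const lexicon: SchemaLang = { name: "lexicon", families: ["boundedLength", "enumValues"] };
const looseLang: SchemaLang = { name: "loose", families: [] };

// A generic lens is applicable wherever its required families are satisfied.
function applicable(required: ConstraintFamily[], lang: SchemaLang): boolean {
  return required.every((f) => lang.families.includes(f));
}

const ok = applicable(["boundedLength"], lexicon);      // true
const notOk = applicable(["boundedLength"], looseLang); // false
```

The point-to-point work moves up a level: there are still pairings to consider, but between a handful of families rather than every pair of schema languages.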
Speaker G
28:18
Yeah.
Speaker E
28:22
So I think maybe we'll stop there.
28:25
We're probably over time anyways.
28:27
Thanks very much for coming in and listening.
28:31
Definitely, this is the start of a conversation, as I said.
28:36
Cambria was a really inspiring project, and I don't want to mischaracterize it, but it wasn't something that we felt like we could use.
28:51
And the ideas were like, this is something that I want, this is something that we all want, but it wasn't real.
29:00
And I think in the last 4 weeks, 2 weeks.
29:04
Uh, I have seen some of this, uh, you know.