Jessie Rushing 3:04 Maybe.

Speaker B 6:48 All right.

Jessie Rushing 6:50 Hi, everybody. 6:52 Welcome to the lightning talk section. 6:54 My name is Jessie. 6:56 I am going to be giving a quick lightning talk on hallucinating the AT Protocol: grounding your AI agents with the official AT Proto docs. 7:07 If you met me last year, I was at the conference talking about OAuth. 7:11 Today I am here because I chose to build and serve, for free, a remote MCP server for the AT Protocol documentation that is available online. 7:21 I'm hoping it is a helpful tool for developers. 7:25 And you may be wondering, why would somebody decide to build a remote MCP server for the documentation? 7:32 I'm hopefully going to explain.

7:36 So first, if you are not familiar with what RAG is, I'm not going to get too deep into this because it's a lightning talk, but RAG stands for Retrieval Augmented Generation. 7:46 And you can think of Retrieval Augmented Generation as adding indexing to your source content. 7:52 So the same way that adding a SQL index to a large SQL table will make querying that data more efficient, RAG makes LLM queries more efficient by preprocessing the source content into an index of vector embeddings. 8:10 You get more efficient retrieval of content that is specifically related to your query. 8:14 So the way this works is it allows you to combine multiple reference sources into a single index. 8:21 You can have that content supersede whatever the model was trained on, so you can have more up-to-date, more relevant content. 8:28 And then you can also make sure that your retrievals are getting grounded in that content. 8:33 So you're reducing hallucinations.

8:37 Quickly, what is an MCP server? 8:40 So, if you're not familiar with MCP, it came out in late 2024. 8:44 Basically, MCP servers expose tools, prompts, and resources to your agents.
8:49 So, if you are in Claude Code or ChatGPT or whatever your tool of choice is, you can often install an MCP server that will expose these additional tools, prompts, and resources to your queries.

9:05 So let's compare what happens when you do a URL search to when you're using something like a RAG instance. 9:12 So say you tell your agent, "I'm building an AT Proto app and I want you to search the AT Proto documentation for lexicons," and then you're incorporating that into your query. 9:24 Basically, the key point here, you can see this whole little chart of what's happening: starting from your prompt, your agent has to go get the docs, it has to tokenize that information, it has to process it. 9:35 And basically the takeaway here is that everything that's going into your chat context is getting tokenized and embedded, 9:42 and that could be preprocessed. 9:45 And that's what the RAG approach allows us to do. 9:47 So instead of your agent tokenizing and embedding all of those documentation sources, all of that tokenization and vector embedding creation has already happened. 10:00 And it's just being served by the MCP server.

10:04 MCP servers come in two flavors. 10:06 They come in local servers that we run on our local machines and remote servers that are hosted somewhere in the cloud. 10:14 By essentially preprocessing a RAG index and then caching it and serving it in the cloud as a remote server, you're effectively caching that tokenization process for anyone in the world who wants to consume it.

10:27 So this is why I built an AT Proto docs MCP server last year, when I was first getting into MCP and trying to explore it and study it. 10:38 I thought Cloudflare's documentation server was really interesting. 10:43 They had some templates I wanted to clone to see if I could play with them. 10:46 This was also in the spirit of "we can just build things." 10:49 So you can find these online on GitHub at ember-mcp-app-proto-docs and the worker.
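The preprocessing idea described above can be sketched in a few lines of Python. This is a toy illustration, not the server's actual code: `embed` is a hypothetical bag-of-words stand-in for a real embedding model, and `DOC_CHUNKS`, `build_index`, and `search` are names invented for this example. What it shows is that the documentation is embedded once, ahead of time, and each query then only has to embed the query text itself.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words term count."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(chunks: list[str]) -> list[tuple[str, Counter]]:
    """Preprocessing step: embed every chunk once and keep the vectors."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def search(index: list[tuple[str, Counter]], query: str, top_k: int = 1) -> list[str]:
    """Query step: embed only the query and rank the stored vectors."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]


# Placeholder chunks written for this example (not copied from the real docs).
DOC_CHUNKS = [
    "Lexicons are schemas that describe record types and XRPC methods.",
    "App passwords let a client sign in without the main account password.",
    "A repository stores the records that belong to one account.",
]

# Built once, ahead of time; a remote server could cache and serve this.
INDEX = build_index(DOC_CHUNKS)

if __name__ == "__main__":
    print(search(INDEX, "what are lexicons"))
```

A real deployment would replace `embed` with a proper embedding model and store the vectors in a vector database, but the split between a one-time indexing step and a cheap per-query step stays the same.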
10:58 How does this actually work? 10:59 There are three pieces to my MCP server that serves the documentation. 11:04 There's a cron scraper that actually goes and scrapes the documentation sites once a week so that it's kept up to date as new documentation is added. 11:14 There's the RAG index, where a different model preprocesses those documentation source sites into a RAG index. 11:23 And then there's the actual MCP server itself, which exposes that RAG index for your agents to query.

11:31 So the cron scraper, it lives at /app/protodox/worker. 11:36 Right now, there are only three sources here that are being scraped. 11:40 I'm actually making a list at this conference of sources that people want added to this. 11:45 So if you have sources that you would like added or feel should be included, please let me know. 11:50 You can also open PRs on this public repo.

11:54 Then we have the RAG index, and I am using a tool that Cloudflare has made very affordable right now. 12:00 I don't expect it to stay this affordable forever, but right now it's lovely. 12:05 And so I am using the open source Qwen3 model to do the vector embedding that creates the index. 12:13 Cloudflare has some really easy tools that make it very simple to do this. 12:17 So I'm kind of using some of their stuff.

12:21 And then the MCP server itself: mine only exposes a single tool. 12:26 And that tool is called search documentation. 12:29 You can make MCP servers as complicated and as interesting as you want. 12:34 But mine was just kind of a prototype. 12:36 So it just does search documentation.

12:39 To install this in your agent, what you do depends on your agent. 12:44 But for example, in Claude, you add a JSON config which points at the MCP server. 12:51 Mine lives at this URL.

12:55 And I was not the first, nor will I be the last, person to create an MCP server for AT Proto.
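As a rough sketch of what an agent sends when it calls a single search tool like the one described above: MCP messages are JSON-RPC 2.0, and tool invocations use the `tools/call` method with a tool name and an arguments object. The tool identifier `search_documentation` and the `query` argument are assumptions made for illustration; the talk only gives the tool's spoken name, and the real input schema may differ. `build_tool_call` is a hypothetical helper, not part of any SDK.

```python
import json
from itertools import count

# Simple increasing request ids, as JSON-RPC expects an id per request.
_ids = count(1)


def build_tool_call(tool_name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(request)


# Assumed tool name and argument; check the server's tool listing for the real ones.
message = build_tool_call("search_documentation", {"query": "How do lexicons work?"})
print(message)
```

In practice the agent's MCP client builds and sends this message for you once the server is listed in its config; the sketch only shows the shape of the request.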
13:02 So I wanted to highlight some of the other very awesome MCP servers that exist in the community. 13:07 I've added two more since I've been here. 13:10 If you have one that should be on this list, this is published as a public gist that I'm going to share on Bluesky after the talk. 13:16 So feel free to tell me about your MCP server and I'll make sure it gets included.

13:21 We have mine, which does search documentation. 13:25 We have Lexicon Garden's MCP server, which actually serves lexicons and has a describe lexicon tool, as well as some authenticated methods where you can actually create and invoke XRPC endpoints. 13:42 AshEx has an AT Proto MCP, which is a local documentation MCP server. 13:50 So this is one that you install locally. 13:52 You run it in Python. 13:54 You perform the RAG indexing locally. 13:57 And you have your own local version of that. 13:59 That one actually works off of the GitHub repos, as opposed to the HTML content of the doc sites.

14:06 We also have some different ones. 14:10 Cameron has a very extensive one that actually has a disclaimer to not use it as a REST API, because it has so many tools and features. 14:22 We have the one that has the most forks, which is Brian's, which is actually more like a Bluesky client. 14:31 A lot of the ones that act as Bluesky clients use app password authentication.

14:38 So yeah, there are a bunch of really cool MCP servers out there. 14:41 And you can attach them to your agents to reduce hallucinations and ground your development in the official AT Proto documentation and official lexicons. 14:51 That's it.

Speaker B 14:52 Thank you. 16:12 Sounds good. 16:12 Do I need to speak into the mic? 16:13 Yes. 16:14 Um, I hope this isn't too loud for folks in the room. 16:22 Okay, so we're 0 for 2. 16:24 Okay. 16:26 Okay, I will try to be loud, but while speaking into a mic, this feels very counterintuitive.
16:32 Hi folks, it is nice to see all your wonderful faces. 16:35 I should start off by saying that this is the same presentation that I gave yesterday. 16:39 So if anyone was here yesterday,