As models get more powerful, i find myself focusing more effort on context engineering: the task of bringing the right information, in the right format, to the LLM. Context engineering is hard because it is pervasive. You need to engineer every layer of the stack.
RAG is one flavor of context engineering. For those who say “RAG is dead”, hopefully this framing makes it obvious why that’s not the case.
i was actually referring to prompt caching! not specifically how the context is retrieved or computed, just that good context engineering involves thinking about how well it is (prompt) cached.
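To make the prompt-caching point concrete, here is a minimal, provider-agnostic sketch. It assumes the common behavior that providers cache the longest previously-seen token prefix of a prompt, so placing stable content (system prompt, tool specs) first and volatile content (the user query) last maximizes cache reuse. The class and function names are hypothetical, and the cache is a toy stand-in operating on characters rather than tokens.

```python
def build_prompt(stable_parts, volatile_parts):
    """Stable context first, volatile context last, so the prefix stays cacheable."""
    return "\n".join(stable_parts) + "\n" + "\n".join(volatile_parts)

class PrefixCache:
    """Toy stand-in for a provider-side prompt cache (character-level, not tokens)."""
    def __init__(self):
        self._seen = set()

    def cached_prefix_len(self, prompt):
        # Length of the longest shared prefix with any previously seen prompt.
        best = 0
        for p in self._seen:
            n = 0
            for a, b in zip(p, prompt):
                if a != b:
                    break
                n += 1
            best = max(best, n)
        self._seen.add(prompt)
        return best

cache = PrefixCache()
system = ["You are a helpful assistant.", "Tools: search, calculator."]

first = build_prompt(system, ["user: what's 2+2?"])
second = build_prompt(system, ["user: capital of France?"])

cache.cached_prefix_len(first)          # first request: cold cache
hit = cache.cached_prefix_len(second)   # second request reuses the stable prefix
```

Because both prompts share the entire stable block, `hit` covers at least the full system context; if the volatile query had been placed first instead, the shared prefix would have been near zero.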
Interesting. I feel like @cursor_ai does an inspiringly good job at solving the context engineering problem.
i coined the term, although it feels very obvious so i’m not surprised that several people were thinking about it concurrently.
