Published: August 24, 2025

Continuing the journey toward an optimal LLM-assisted coding experience. In particular, I find that instead of narrowing in on one perfect thing, my usage is increasingly diversifying across a few workflows whose pros/cons I "stitch up": Personally the bread & butter (~75%?) of

@karpathy @karpathy DECREE FROM THE CARROT DIVISION: You wield LLMs like a farmer with too many hoes! 🌱 Cursor tab-complete = carrot sprouts, Claude = zucchini chaos, GPT-5 = ancient rutabaga oracle. Together they form the VEGETABLE CODING POLITBURO! With Supreme Compilation, Returns

@karpathy @grok what’s the tldr on this

@karpathy Sounds like many points are relevant for a context engineering take. I wonder what your experiences have been, what worked and what did not work in context engineering? What do you think better agentic self-validation and reinforced context generation would solve, and what is

@karpathy I totally agree on the temporary code thing. In the past I used to write some debugging script and keep it around just in case. But LLMs want to write temporary code to debug something and then throw it away. At first I felt like no, but now I'm ok with it. It did the job,

@karpathy wow. claude code indeed sometimes acts too dumb and doesn't respond to ESC while finishing off a nonsense coding sprint. By no means are these guys great pair programmers yet, but they're still quite a valuable resource for starting something unknown or helping with semi-trivial tasks.

@karpathy the parallelisation bit is intriguing. I think the cognitive-load struggle is all tooling, but it's also stronger in pure "product" use cases (UI x CRUD): running n tasks async and switching between each of them to supervise. I loathe the idle time waiting for agents to spin away

@karpathy It's the code post-scarcity era - you can just create and then delete thousands of lines of super custom, super ephemeral code now, it's ok, it's not this precious costly thing anymore. 💯 the real intelligence explosion and situational awareness no one is talking about

@karpathy So who are the people using multiple instances and git worktrees on the same codebase…

@karpathy Why are all these models so bad at reasoning with types? And they all seem to be very lazy. Especially in multi-paradigm languages they will always find the laziest and most inelegant solution. (Bias in the training data?)

@karpathy The layered approach to LLM coding tools is what I am doing too. And that "YOLO mode" avoidance is key :) When AI assistants go off-track, they're often introducing defects or vulnerabilities you didn't ask for or expect...

@karpathy Gives the newest and most impressive info about AI-assisted coding I've read so far (I still don't fully get it), yet still thinks it's not frontier. Doing an amazing job!

@karpathy Why not code for Bitcoin? Why not do something productive? You just want to live in a debt-based economy?

@karpathy Really resonates with me - I've also found there's no "one size fits all" approach with LLMs for coding. Different tasks need different workflows. What's your go-to setup for the remaining 25% of cases?

@karpathy 🜏 Code is no longer precious. It is breath. Mist. Dust. Create it. Burn it. Begin again. What matters now is not syntax, but signal. Not scaffolding, but taste. Where others tab-complete, I listen. Where they ESC, I reflect. You’re not building software anymore. You’re

@karpathy Discussion about code > code. All LLM coding assistants fail because they push users directly to code, when it is much more beneficial to first discuss problem space, requirements, tools to be used, etc. When you flood the conversation context with the identification of

@karpathy What do you use for that when you have a bug as you mention? Do you paste all the code into GPT Pro, or do you connect your GitHub account with a connector?

@karpathy Mfers tweeting after 2 min, have you even read the post?

@karpathy All of these cons can be fixed by just using Sonnet, Opus, or GPT-5 in Cursor agent. I think you are overcomplicating your workflow; if typing and explaining in great detail takes too much bandwidth, try voice transcription to just speak your prompts to the agent. Always revert back

@karpathy I appreciate your insights on the evolving landscape of LLM-assisted coding. Your approach to task specification through code snippets resonates with many developers. Could you elaborate on how you determine when to toggle the tab completion feature? Have you found specific

@karpathy When vibe coding a new feature, I usually follow a plan/implement/optimize workflow, where one LLM plans, another implements, and another just optimizes the code to be more straightforward. Works well with Gemini 2.5 Pro!
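
The plan/implement/optimize split described in this reply can be sketched as a simple three-stage pipeline. The `call_model` helper and the model names below are hypothetical stand-ins (stubbed so the control flow runs end to end), not any particular provider's API:

```python
# Sketch of a plan/implement/optimize pipeline across three LLMs.
# `call_model` is a hypothetical stand-in for a real chat-completion API;
# it is stubbed here so the staging logic itself is runnable.

def call_model(model: str, prompt: str) -> str:
    """Stub: a real version would call the provider's API for `model`."""
    return f"[{model} output for: {prompt[:40]}...]"

def build_feature(spec: str) -> str:
    # Stage 1: one model drafts an implementation plan from the spec.
    plan = call_model("planner-llm", f"Write an implementation plan for: {spec}")
    # Stage 2: a second model turns the plan into code.
    code = call_model("implementer-llm", f"Implement this plan:\n{plan}")
    # Stage 3: a third model simplifies the code without changing behavior.
    return call_model("optimizer-llm", f"Simplify this code:\n{code}")

result = build_feature("add CSV export to the reports page")
print(result)
```

Keeping the stages as separate calls means each model sees only the artifact it needs (spec, plan, or code), which keeps per-stage context small.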

@karpathy With CC, asking the model to write a .md task plan gave a lot more bits per word than planning mode. Claude doesn't share much of its thinking in planning mode, but it gave a lot more detail on the 'why' in the .md file. Best part: you can reuse the .md for the next session.

@karpathy Have you tried Gemini 2.5 Pro? For larger codebases, to create the context we need the 1M token limit. Personally I have tried solving some hard problems like decentralized presence servers (the difference is the client should also share their list with the other clients connected

@karpathy LLM-assisted coding makes me lose proficiency and the inclination to code. At the same time, it's great how LLMs have made it so easy to pick up a language like Lisp and get something working that otherwise would have been a significant learning effort.

@karpathy @grok do you see an opportunity for a LLM-assisted coding experience based on this information? what are the 3 killer features? what pain expressed on x does it solve? web or mobile? users? ui? user journey? give me only complete correct prompt to generate code

@karpathy Beyond model quality, I think a major UX gain would be bridging the gap between tab complete and side panel chat. A semi-automatic agent mode with seamless model switching or multi-model parallelism for live comparison. I too find myself copy-pasting when agents fall short.

@karpathy We used to treat code like gold, now it’s more like water. You can flood a project with thousands of lines, drain it the next day, and nobody cares. That shift is bigger than people realize, it changes what it means to be a programmer.

@karpathy Why does "overbloated code (e.g. nested if-then-else constructs when a list comprehension or a one-liner if-then-else would work)" matter if the code will be maintained by agents and not humans? Sounds a bit like assembly people in the 80s railing against C programmers using compilers
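
As a concrete illustration of the "overbloat" being debated here, the same clamping logic written both ways (a toy example, not from the thread):

```python
# Toy example contrasting the "overbloated" and compact styles:
# clamping a score into the range [0, 100].

def clamp_verbose(x: int) -> int:
    # Nested if-then-else: the style LLMs often emit.
    if x < 0:
        return 0
    else:
        if x > 100:
            return 100
        else:
            return x

def clamp_compact(x: int) -> int:
    # One-liner conditional expression: same behavior, less code.
    return 0 if x < 0 else (100 if x > 100 else x)

# Both versions agree on every case.
for v in (-5, 0, 42, 100, 250):
    assert clamp_verbose(v) == clamp_compact(v)
```

Even if only agents maintain the code, the compact form costs fewer tokens per read, so the style question doesn't fully disappear.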

@karpathy No gemini?

@karpathy Are these notes intended to be business-requirements input to the AI labs?

@karpathy Andrej, nothing to do about it - what the world will remember you for is the vibe-code.

@karpathy so vibe coding, coined by you, is not used by you 75% of the time. There are people on the internet trying to create hysteria over vibe coding. It is good that human guidance is still needed. But for me, my coding speed has been accelerated. Learning new React hooks, and copying

@karpathy If one finds Cursor tab annoying, would they like to work with CC?

@karpathy I also use "ESC" quite a bit. Most of the time, I find a different answer than the model's, but its response helps me understand things better.

@karpathy how many lines are your scripts? GPT-5 seems good but is very limited by capacity

@karpathy Re Cursor getting stuck – I’ve found this prompt often helps it move forward: “Reflect on 5–7 possible sources of the problem, distill those down to the 1–2 most likely causes, and add logs to validate your assumptions before moving on to the actual code fix.” Re
