Sequential operations are more powerful than parallel operations.
@_arohan_ Skill issue
@vitaliychiley Sync error
@_arohan_ Under the same compute conditions, I would say not often
@_arohan_ Hits everyone hard lol
@_arohan_ yes!!!!!!!!!!!!!11
@_arohan_ Make the old lstms work, pls.
@_arohan_ also slower
@_arohan_ parallel clears
@_arohan_ i think about this a lot: https://en.wikipedia.org/wiki/...
@_arohan_ strictly true as they're the same, but you get a data dependency!
@_arohan_ i've been thinking about this paper lately https://arxiv.org/abs/2404.157... , even if we gave 1000000 ... as input to the model that could be processed super fast in parallel, there will still be some problems that need some kind of recurrence/symbolic lowering
@_arohan_ no u
@_arohan_ Seymour Cray would agree.
@_arohan_ @yacineMTB Systems beat anything
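The "data dependency" point in the replies above can be sketched with a toy example (the function names and the running-sum update are mine, purely for illustration): an elementwise map has no cross-step dependency and parallelizes trivially, while a recurrence must be evaluated in order because each step reads the previous step's result.

```python
# Independent ops: each output depends only on its own input,
# so the work could be spread across parallel workers.
def square_all(xs):
    return [x * x for x in xs]

# Dependent ops: each step reads the previous step's result (a data
# dependency), so the chain must be evaluated sequentially.
def running_square_sum(xs):
    acc, out = 0, []
    for x in xs:
        acc = acc + x * x  # depends on acc from the prior iteration
        out.append(acc)
    return out

print(square_all([1, 2, 3]))          # elementwise, order-free: [1, 4, 9]
print(running_square_sum([1, 2, 3]))  # serial chain: [1, 5, 14]
```

Both compute "the same" amount of arithmetic, but only the first can be split across workers without communication; the second's dependency chain is exactly what makes recurrences slow on parallel hardware.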
I love Codex and Claude taking care of all the boilerplate part of coding that wastes time and is booooooring. Come to think of it, maybe Java would in theory be the perfect language for LLM coding? Extremely verbose boilerplate - very annoying for humans, but good for LLMs?
Gemini 3.0 about to drop
Are we really running out of data??? No. We're just not using it correctly. The solution: let the model learn which data it needs to learn!!! 1/n
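One simple way to make "let the model learn which data it needs" concrete (my own sketch under assumptions, not necessarily what this thread proposes): score each example in the pool by the model's current loss on it and train on the hardest ones, so the model's own errors drive data selection. `model_loss` here is a hypothetical scoring callable.

```python
def select_hardest(model_loss, pool, k):
    """Return the k examples with the highest current loss.
    model_loss: callable mapping one example to a float (hypothetical)."""
    return sorted(pool, key=model_loss, reverse=True)[:k]

# Toy demo: pretend the model's loss on integer x is just x itself,
# i.e. it has "mastered" small numbers but not large ones.
pool = list(range(10))
print(select_hardest(lambda x: float(x), pool, 3))  # [9, 8, 7]
```

In practice the scores would be recomputed periodically as the model trains, so the selected subset shifts toward whatever the model currently finds hard.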