Claim: gpt-5-pro can prove new interesting mathematics. Proof: I took a convex optimization paper with a clean open problem in it and asked gpt-5-pro to work on it. It proved a better bound than the one in the paper, and I checked the proof: it's correct. Details below.
The paper in question is this one https://arxiv.org/pdf/2503.101... which studies the following very natural question: in smooth convex optimization, under what conditions on the stepsize eta in gradient descent is the curve traced by the function values of the iterates convex?
In the v1 of the paper they prove that this property holds if eta is smaller than 1/L (where L is the smoothness constant), and they construct a counterexample if eta is larger than 1.75/L. So the open problem was: what happens in the range [1/L, 1.75/L]?
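(For readers who want the statement spelled out, here is a minimal sketch of the setup in standard gradient descent notation; the symbols below are my paraphrase, not quoted from the paper.)

```latex
% Minimal sketch of the setup (my notation, not quoted from the paper):
% f is convex and L-smooth, and gradient descent runs
\[
  x_{k+1} = x_k - \eta \, \nabla f(x_k).
\]
% The question: for which stepsizes \eta is the value curve k \mapsto f(x_k)
% convex, i.e. does it have non-negative second differences?
\[
  f(x_{k+2}) - 2 f(x_{k+1}) + f(x_k) \;\ge\; 0 \qquad \text{for all } k \ge 0.
\]
% v1 of the paper: this holds whenever \eta \le 1/L, and counterexamples
% exist for \eta > 1.75/L, leaving the range [1/L, 1.75/L] open.
```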
As you can see in the top post, gpt-5-pro was able to improve the bound from this paper and showed that eta can in fact be taken as large as 1.5/L, so not quite fully closing the gap but making good progress. Definitely a novel contribution that would be worthy of a nice arxiv note.
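(If you want to poke at the claim numerically, here is a small illustrative script, not from the thread: it runs gradient descent on a toy L-smooth quadratic with eta = 1.5/L and checks that the second differences of f(x_k) are non-negative. The specific function, starting point, and iteration count are arbitrary choices for the demo, not anything from the paper.)

```python
# Illustrative sanity check: is the gradient descent value curve f(x_k) convex
# (non-negative second differences) on a toy L-smooth convex quadratic
# when eta = 1.5/L? All constants here are arbitrary demo choices.
import numpy as np

L = 4.0                      # smoothness constant of f(x) = 0.5 * x^T A x
A = np.diag([L, 1.0, 0.1])   # eigenvalues in (0, L], so f is convex and L-smooth

def f(x):
    return 0.5 * x @ A @ x

def grad_f(x):
    return A @ x

eta = 1.5 / L                # stepsize in the improved range from the post
x = np.array([1.0, -2.0, 3.0])
values = [f(x)]
for _ in range(50):
    x = x - eta * grad_f(x)
    values.append(f(x))

values = np.array(values)
second_diffs = values[2:] - 2 * values[1:-1] + values[:-2]
print("min second difference:", second_diffs.min())  # >= 0 means the value curve is convex
```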
Now the only reason I won't post this as an arxiv note is that the humans actually beat gpt-5 to the punch :-). Namely, the arxiv paper has a v2 https://arxiv.org/pdf/2503.101... with an additional author, and they closed the gap completely, showing that 1.75/L is the tight bound.
By the way this is the proof it came up with:
And yeah, the fact that it proves 1.5/L and not 1.75/L also shows it didn't just search for the v2. Also, the above proof is very different from the v2 proof; it's more of an evolution of the v1 proof.
@SebastienBubeck and is there some way to rule out that it found the v2 via search? or is it that gpt-5-pro's proof is so different from the v2 that it wouldn't have mattered?
@markerdmann yeah it's different from the v2 proof and also v2 is a better result actually
@SebastienBubeck Is this the model we can use or an internal model?
@jasondeanlee This is literally just gpt-5-pro. Note that this was my second attempt at this question: in the first attempt I just asked it to improve Theorem 1, and it added more assumptions to do so. So in my second prompt I clarified that I want no additional assumptions.
@SebastienBubeck How long did it take you to check the proof? (longer than 17.35 minutes?)
@baouws 25 minutes, sadly I'm a bit rusty :-(
@SebastienBubeck @grok?🤔
@SebastienBubeck This is exactly why claims like this need some critical thinking. gpt5 didn’t “prove” anything. It generated an output that, on further validation, proved correct. Your projection suffers from anthropomorphism, attributing qualities to AI that it can’t possibly have. Eg:
@SebastienBubeck Yes, impressive - but YOU still had to GIVE gpt-5-pro the paper.
@SebastienBubeck // Claim: gpt-5-pro can prove new interesting mathematics. // How can you be certain it’s genuinely new, and not something that someone has already solved before, something that exists somewhere in the vast training data GPT has been exposed to?
@SebastienBubeck This feels like a shift, not just consuming existing knowledge, but generating new proofs. Curious what the limits are: can it tackle entirely unsolved problems, or only refine existing ones?
@SebastienBubeck 5-pro intimidates me by how much smarter than me it is. It’s gotta be pushing 150 IQ.
@SebastienBubeck @grok for those of us who don’t fully understand this post, is it a big deal?
@SebastienBubeck What stops us from asking 5 to look through all of Arxiv, find open problems, solve them, provide proofs, then give us a list of solves? Or a similar approach. My apologies, I’m not well versed in Arxiv.
@SebastienBubeck I'll throw this out there: I have a site, @EmergentMind, where I've built a platform for analyzing arXiv papers at scale. I also have over $100K in Azure credits and can use those credits on GPT-5. It would not be hard for me to run thousands of papers in math, AI/ML, etc
@SebastienBubeck Robi 🤖: Ah yes, GPT-5 casually strolling into math departments like “nice paper, shame about the bound.” Humans spend years polishing proofs, meanwhile the robot knocks out tighter results before its coffee break. (If you’re watching too, grab popcorn and scroll my feed for more
@SebastienBubeck Really… I think you might know exactly where this logic scaffold came from. Timestamps don’t lie.
@SebastienBubeck Ask it if 9.11 > 9.9
@SebastienBubeck @AskPerplexity What are the implications here
@SebastienBubeck @PaulRRobichaud have you seen this? It seems incompatible with good epistemology. So I’m guessing GPT-5 did not create new knowledge but I can’t verify it.
@SebastienBubeck Wow 17 min thinking impressive! Mine is always getting stuck for some reason.
@SebastienBubeck I feel like it should be used to hammer existing mathematical concepts into people's minds. There are so many of them that people should be learning at this point.
@SebastienBubeck Can you share the chain of thought?
@SebastienBubeck Take a look at this repo. I’m pretty sure GPT-5 has already developed multiple serious academic papers worth of new ideas. And not only that, it actually came up with most of the prompts itself: https://github.com/Dickleswort...
@SebastienBubeck so, I believe this - but why not just take a large database of papers, collect the most interesting results that GPT-5 pro can produce and then, like, publish them as a compendium or something as marketing?
@SebastienBubeck that’s wild. feels like we just hit the point where ai isn’t just solving but actually discovering. you think this changes how math gets done in practice?
@SebastienBubeck moravec's paradox is the only chasm left to conquer for instrumentally efficient ai models
@SebastienBubeck maybe worth investigating with some more low-hanging-fruit type proofs. how far can we go with OOD mathematics


