After two years of work, we’ve made an AI Scientist that runs for days and makes genuine discoveries. Working with external collaborators, we report seven externally validated discoveries across multiple fields. It is available right now for anyone to use. 1/5
We’ve been unusually quiet for ~5 months because we didn’t want to just announce something and not let people use it. So we built it at scale (thanks @m_skarlinski and @ludomitch and eng🫠), and we are letting edu users try it a bunch. 2/5
In June we had a two-day brainstorming session about what is preventing long-running agents. We took an idea from RL: world models. We mapped the process of discovery to building a world model. This got us to coherent agents at >10M tokens, and to genuinely deep hypotheses. 3/5
The excellent @MichaelaThinks then worked with our collaborators to find discoveries. We worked on materials, biochemistry, and biology - we had a lot of success. We found that Kosmos (our system) agrees with humans about 80% of the time in analysis. 4/5
There is so much more to say about us spinning this out, what we've had to change in our AI agents to work together, and how we want to scale this up! We're building a new company around this 5/5
Read the tech report: https://arxiv.org/abs/2511.028... Use it: https://platform.edisonscienti... See our new website with much more info: https://edisonscientific.com/
@andrewwhite01 I took a peek at the paper but I’m not really seeing anything strongly defining what the world model even is. You’re describing what it does but I’m presuming this world model is proprietary. If not, is it some adaptation of an LLM system?
@andrewwhite01 What happens if two users prompt it similarly? Who gets the discovery?
@andrewwhite01 We've gone from "AI can write code" to "AI can discover new scientific knowledge." The gap between automation and augmentation just collapsed. This is what AGI looks like in practice.
@andrewwhite01 How do you evaluate such a project? Given the amount of tokens generated and the high chance of hallucination, it almost seems impossible.
@andrewwhite01 Autonomous science was a sci-fi dream a few years ago. Now AI is not just analyzing data, it’s designing and executing experiments. Massive leap for humanity’s rate of discovery.
@andrewwhite01 Incredible! Really enjoying this paper, and organizing a journal club on it, before teams of testers/users are organized to dive into 🌌space biological and 🌘space health questions, tasks, workflows, etc. Cheers to all at @EdisonSci @SGRodriques
@andrewwhite01 Andrew, this seems to be a really powerful force multiplier. Amazing! Can you please elaborate on the scientific fields it is relevant to? I briefly scanned through the paper, and it appears to work for experimental biology / chemistry / physics. Am I right?
@andrewwhite01 Amazing work, excited to learn more and try it out!
@andrewwhite01 Super dope
@andrewwhite01 That’s interesting
@andrewwhite01 congrats!!!!
@andrewwhite01 Yes!
@andrewwhite01 Wow. What's the price to use it? Is it subscription based?
@andrewwhite01 Breaking down barriers and accelerating discoveries
@andrewwhite01 Insanely important and inspiring work!!
@andrewwhite01 How significant are the discoveries it can make? Are they truly novel? I’m guessing not, because Dario Amodei said he expects Nobel Prize-winning discoveries from AI around 26/27
@andrewwhite01 Execution on a programmed track is sophisticated automation, not scientific autonomy. Genuine discovery demands self-directed skepticism and choosing the next question, not just validating a pre-selected path. Seven validated results prove efficiency, not intellect.
@andrewwhite01 Summarise the genuine discoveries made so far?
@andrewwhite01 Congratulations to the team! Exciting. Any reason why Biomni was not used in the benchmarking? 😊
@andrewwhite01 Looks interesting! Looking forward to trying it, any sense of how it'll do with clinical neuroscience?
@andrewwhite01 One nitpick: something’s off with the website scrolling. It lags about 0.2s after the mouse movement
@andrewwhite01 Exciting to see AI moving from analysis to genuine discovery
@andrewwhite01 Kosmos reads more papers, but Robin may be more code-efficient.
@andrewwhite01 Super exciting!!!
@andrewwhite01 hi andrew, I've been working on something similar - would love to connect
@andrewwhite01 Awesome! Incredible work, future is here now
@andrewwhite01 Can I have your datasets?
@andrewwhite01 The lev is newer
@andrewwhite01 Nice! Also check out our open source Claude Scientific Skills that turn Claude Code into a powerful AI Scientist on the desktop for free! https://github.com/K-Dense-AI/...
@andrewwhite01 Genuine discoveries and runs for days. So, it finally finished a gradient descent and found a known local minimum. Congrats on the uptime.


