lmarena.ai's Twitter Thread

🧬 BiomedArena is here! We’re honored to partner with @DataTecnica and @NIH CARD, who developed BiomedArena to evaluate LLMs for biomedical discovery, and to help expand this domain-specific track in community-driven evaluations. 🧪 Biomedical science is complex, high-stakes,

BiomedArena focuses on real-world biomedical workflows, from literature review to disease modeling, using open, reproducible methods trusted by scientists. It's already in use at NIH’s Intramural Research Program. We're proud to support and help expand this work around: 📈

This is just the beginning. Join us as we build the infrastructure for safe, transparent, and rigorous AI in more expert domains to come. If you’re building or deploying biomedical LLMs, or researching AI evaluation in a specific field, read more about this partnership on our

@lmarena_ai @DataTecnica @NIH Exciting to see focused LLM evaluation in biomedicine. This careful approach could truly unlock potential while mitigating risks.

@lmarena_ai @DataTecnica @NIH Good stuff

@lmarena_ai @DataTecnica @NIH https://x.com/ZeroGravityYzz/s...

@lmarena_ai @DataTecnica @NIH Please allow file upload

@lmarena_ai @DataTecnica @NIH Been following BiomedArena's benchmarking—real progress for biomedical AI. Community-led evaluations like this will help models improve usability in real healthcare settings.

