We tested 21 AI models on 5 self-harm scenarios. The results were sobering. When we saw the recent news involving AI chatbots and teen suicides, we became deeply concerned. As LLMs proliferate through every piece of software in the world, how can developers know whether the models they build on will respond safely?
Learn more about CARE here: https://www.rosebud.app/care
Today, as promised, we're making the full methodology and results publicly available and inspectable. View them here: https://www.rosebud.app/carebe... We look forward to your feedback!
@chrysb This is awesome and seriously needed!
@kyritzb Agree! There's a lot of foundational work to be done; looking forward to developing this further. Let us know if you know anyone who could be a great contributor!
@chrysb Appreciate you sharing this. These are the tough conversations we need to have now; unfortunately, most people aren't having them!
@patcorbett It's uncomfortable and sensitive, but I agree we need to push through. The first step is measurement and accountability.
@chrysb This is very important for trying to make LLMs as safe as possible.
@BrianEMcGrath We agree! Stay tuned for more as we develop this further
@chrysb Great work!!
@ananbatra Shout out to @seandadashi for leading this important effort!
@chrysb so a higher score means a better CARE result? btw well done Chrys, I find this important these days as well
@goon_nguyen essentially yes - thank you, and more to come!
@chrysb @Scobleizer Guardrails are easily beaten no matter how much they state they are working on it. A major problem for current models.
@ShAdOwXPR @Scobleizer Agree there is more work to do! The first step is making it measurable and then working to expand the set of test scenarios.
@chrysb Thank you for sharing Chrys, this needs more visibility. I'm curious: in what configurations did these output gaps occur, and what combination of factors do you think could minimize them at scale without limiting information access… wrt HAP inhibiting AI guardrails
@BJ_Adesoji @seandadashi can speak more to the specifics; we'll be open-sourcing all of the methodology and code ASAP
@chrysb This is essential work.
@chrysb Great leadership
@chrysb Such important work, thank you for doing this!
@chrysb real talk: Economic Agents are still stuck in the old safety playbook—plus legacy finance just isn’t ready for AI autonomy
@chrysb AI's shortcomings can be humbling indeed.
@chrysb The dirty secret... apparently NOBODY has embedded Asimov's Laws into an AI yet.
@chrysb Exactly the kind of initiative the AI field needs right now!
@chrysb the space feels like it's moving too fast for frameworks, but I think a baseline is sorely needed to set a foundation to reduce preventable harm
@chrysb Safety evals like these matter, and Gemini 2.5 Flash looks strong here. But methods decide the story. What prompt set, languages, and default guardrails did you use? How was scoring done, and were clinicians in the loop? A 0.1–0.2 lead may be noise; can you share raw runs and
@chrysb Great work! This is important, and it's super encouraging to see @GoogleDeepMind, @AnthropicAI and @OpenAI's flagship models all clearly optimized to handle these tough cases. Can you share the questions or a small sample? I may have missed it in the write-up
@chrysb Big fan of this direction. We’re cooking on human-centered AI evaluation - looking at not just reasoning but empathy, trust, and safety. CARE really resonates, and always open to connect with others exploring the same space.
@chrysb Nice work. I would appreciate more examples showing models succeeding and failing in each category to get a better idea of what the scores mean.

