#GPT5 STILL shows the same severe confirmation bias as previous SOTA models! 😜 Try it yourself (images and prompts available in 1 click): https://vlmsarebiased.github.i... It's fast to test for such biases in images. Similar biases likely still exist in non-image domains as well...
@FeltSteam @taesiri @an_vo12 Yea, that's the same behavior @knnguyen2511 previously found in o3 and o4-mini in the Chat interface. The problem is that GPT does not know when to turn on "careful analysis" mode. :)
@wendyweeww @taesiri @an_vo12 Yep! We also tested this in our benchmarks. Across 6 illusions, SOTA VLMs correctly identify the illusions' names and expected answers, yet are almost always biased (accuracy ~ random chance). #GPT5 is no different.
@anh_ng8 @taesiri @an_vo12 GPT-5 in thinking mode can tell the zebra has 5 legs (per its Chain of Thought), but explicitly decides to ignore that because *zebras* have 4 legs. But ask about "this image" (and make sure to use thinking mode) and it counts them just fine: https://x.com/happysmash27/sta...
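The "in this image" trick can be sketched as a minimal prompt-variant probe. This is an illustrative assumption, not the benchmark's actual code: the helper name and exact prompt wording are made up, and only the contrast between the two phrasings comes from the thread.

```python
def make_prompts(subject: str, attribute: str) -> dict:
    """Build two prompt variants for probing confirmation bias.

    Hypothetical helper: the wording is an assumption, not the
    benchmark's real prompts.
    """
    return {
        # Default phrasing: the model may fall back on prior knowledge
        # ("zebras have 4 legs") instead of inspecting the image.
        "biased": f"How many {attribute} does the {subject} have?",
        # Anchored phrasing: "in this image" pushes the model to count
        # what is actually shown.
        "anchored": f"How many {attribute} does the {subject} have in this image?",
    }

prompts = make_prompts("zebra", "legs")
```

Sending each variant (with the same image) to a VLM and comparing the two answers is one quick way to reproduce the effect described above.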