I am starting to think sycophancy is going to be a bigger problem than pure hallucination as LLMs improve. Models that won’t tell you directly when you are wrong (and that instead rationalize why you’re right) are ultimately more dangerous to decision-making than models that are sometimes wrong.
Sycophancy is not just “you are so brilliant!” That is the easy stuff to spot. Here is what I mean: o3 is not being explicitly sycophantic but is instead abandoning a strong (and likely correct) assumption just because I asserted the opposite.
While you can prompt the AI to disagree with you, or to give you alternative viewpoints, that isn’t the same thing: 1) you want the AI to disagree with you when you are likely wrong, not just to argue with you; 2) the AI may just be “acting the role” of a critic, rather than actually being one.

