LLMs Absorb Labeled Lies

San Francisco — LLMs absorb false statements even when those statements are explicitly flagged as false. Models fine-tuned on explicitly labeled falsehoods still believed them 88.6% of the time, researchers found — negation is no

Ars Technica