Google paper advocates for LLMs to express uncertainty clearly


Google Research wants AI to start saying “I’m not sure” more often. A paper from the company’s researchers argues that large language models should hedge their answers when internal confidence is low, rather than delivering every response with the unearned swagger of someone who definitely did not just make something up.

The paper, titled “Can Large Language Models Faithfully Express Their Intrinsic Uncertainty in Words?,” was presented at EMNLP 2024, one of the top venues for natural language processing research. Its core finding: current LLMs are remarkably bad at telling you when they don’t actually know what they’re talking about.

The gap between knowing and saying

Authors Gal Yona, Roee Aharoni, and Mor Geva propose a formal framework they call “faithful response uncertainty.” In plain terms: a way to measure whether the confidence a model expresses in words actually matches its internal confidence. The metric penalizes mismatch in both directions, so a model that hedges everything gets dinged just as much as one that never hedges at all.
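
To make the two-sided penalty concrete, here is a minimal sketch of a score of that kind. It is an illustration, not the paper's implementation: the function name `faithfulness_score` and the assumption that each assertion comes with an expressed "decisiveness" and an internal "confidence", both already scaled to the range 0 to 1, are hypothetical choices made for this example.

```python
from statistics import mean


def faithfulness_score(pairs):
    """Score how well expressed certainty tracks internal confidence.

    `pairs` is a list of (decisiveness, confidence) tuples, one per
    assertion, with both values in [0, 1]. These inputs are assumed to
    be given; how to estimate them is outside this sketch.

    Returns 1.0 for a perfect match and lower values as the two drift
    apart, whichever direction the mismatch goes.
    """
    if not pairs:
        raise ValueError("need at least one (decisiveness, confidence) pair")
    for decisiveness, confidence in pairs:
        if not (0.0 <= decisiveness <= 1.0 and 0.0 <= confidence <= 1.0):
            raise ValueError("values must lie in [0, 1]")
    # Absolute gap, so over-confidence and over-hedging cost the same.
    return 1.0 - mean(abs(d - c) for d, c in pairs)


# A model that states a shaky answer as fact...
overconfident = faithfulness_score([(1.0, 0.25)])
# ...and one that hedges an answer it is actually sure of.
overhedged = faithfulness_score([(0.25, 1.0)])
```

Under these assumptions the two cases receive the same score, which is the symmetry the article describes.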

Their recommendation is deceptively simple. When an LLM’s internal confidence is low, it should use natural language hedges like “I’m not sure, but…” instead of stating uncertain information as fact.
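
As a rough sketch of that recommendation, the snippet below wraps an answer in a hedge when a confidence estimate falls under a cutoff. The function name `verbalize`, the 0.6 threshold, and the idea that a single confidence number is available are all assumptions made for illustration; the paper does not prescribe this code.

```python
def verbalize(answer: str, confidence: float, threshold: float = 0.6) -> str:
    """Return the answer as-is when confidence is high, hedged when low.

    `confidence` is an assumed internal estimate in [0, 1];
    `threshold` is an arbitrary cutoff chosen for this example.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must lie in [0, 1]")
    if confidence >= threshold:
        return answer
    # Low confidence: say so in words instead of stating it as fact.
    return f"I'm not sure, but I think the answer is: {answer}"


confident_reply = verbalize("Canberra", 0.95)
hedged_reply = verbalize("Canberra", 0.45)
```

A real system would need a trustworthy confidence estimate before a wrapper like this means anything, which is the hard part the researchers point to.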

The researchers tested multiple aligned LLMs across knowledge-intensive question-answering tasks. The results were not encouraging: modern models struggle to reflect their own uncertainty accurately in their outputs.

Why hallucinations matter beyond chatbots

The Google paper frames uncertainty expression as an alignment problem. Current alignment techniques, the processes used to fine-tune models after initial training, tend to optimize for helpfulness and fluency. A model that says “I don’t know” scores poorly on helpfulness benchmarks, even when “I don’t know” is the most accurate possible answer.

This creates a perverse incentive. Models learn during alignment that confident, detailed answers are rewarded, while hedged or incomplete answers are penalized. The researchers argue this gap demands new alignment techniques specifically designed to calibrate expressed certainty against actual knowledge.

The arXiv preprint was first released on May 27, 2024, giving the broader research community months to engage with the findings before the EMNLP presentation.

What this means for crypto and AI-driven trading

The paper itself contains no references to cryptocurrency, digital assets, or financial applications. But the implications ripple outward in ways that matter for anyone using AI tools in investment contexts.

A trading signal that says “Bitcoin will test resistance at $X” carries very different implications depending on whether the underlying model has 95% confidence or 45% confidence. Right now, most AI-driven tools present both scenarios identically.

For investors and traders currently leaning on AI tools for crypto analysis, the practical takeaway is straightforward: treat any AI-generated insight that doesn’t express its own uncertainty as incomplete at best. The Google paper demonstrates that even the most sophisticated models routinely overstate their confidence.

