The video “One Prompt Change That Forces Claude to Be Honest” by Dylan Davis addresses the “honesty gap” in AI—where models become smarter but also more confident in guessing rather than admitting they don’t know an answer. This leads to “automation bias,” where users trust AI blindly and fail to check for errors.
To combat this, the video outlines three specific prompt rules to ensure accuracy, especially when extracting information from source documents:
Rule 1: Force Blank Answers for Uncertainty
Instead of allowing the AI to guess or provide a “confidence score” (which can also be faked), instruct it to leave a field blank if the information is missing, ambiguous, or unclear.
The “Reason” Column: Require the AI to add a column explaining exactly why it left a field blank. This allows the user to quickly identify and resolve specific conflicts or missing data without reviewing the entire output.
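Rule 1 can be sketched as a reusable prompt fragment. This is a minimal illustration in Python; the exact instruction wording and the helper name are assumptions, not text quoted from the video.

```python
# A hedged sketch of Rule 1: forbid guessing, require blanks plus a
# "Reason" column. Wording is illustrative, not taken from the video.

RULE_1 = (
    "If a field is missing, ambiguous, or unclear in the source document, "
    "leave it blank. Do not guess and do not report a confidence score. "
    "For every blank field, fill in a 'Reason' column explaining exactly "
    "why it was left blank."
)

def build_extraction_prompt(fields, document):
    """Assemble an extraction prompt that enforces blanks over guesses."""
    field_list = ", ".join(fields)  # e.g. "invoice_date, total"
    return (
        f"Extract the following fields from the document: {field_list}.\n"
        f"{RULE_1}\n\n"
        f"Document:\n{document}"
    )
```

The same fragment can be prepended to any extraction task, so the honesty rule travels with the prompt rather than being rewritten each time.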
Rule 2: Change the Incentive Mechanism
By default, AI models treat a wrong answer and a blank answer as equally costly, so there is no incentive to admit uncertainty. To fix this, you must explicitly change the “penalty” for errors in your prompt.
The 3x Rule: Tell the AI that a wrong answer is “three times worse” than a blank answer. This encourages the model to default to “I don’t know” rather than providing a hallucinated or incorrect response to please the user.
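The incentive change is just one more instruction appended to the prompt. The sketch below keeps the video's “three times worse” framing but makes the multiplier a parameter; the helper name is an assumption for illustration.

```python
# Hedged sketch of Rule 2: make a wrong answer explicitly costlier than a
# blank one. The "3x" default follows the video; the function is illustrative.

def penalty_instruction(multiplier: int = 3) -> str:
    """Return an instruction that makes wrong answers costlier than blanks."""
    return (
        f"Scoring: a wrong answer is {multiplier} times worse than a blank "
        "answer. When you are not certain, leave the field blank rather "
        "than guessing."
    )
```

Appending `penalty_instruction()` after the extraction instructions gives the model a stated reason to default to “I don't know” instead of hallucinating.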
Rule 3: Force Source Attribution and Safety Nets
On complex tasks, AI tends to drift away from strict instructions and begins to “infer” or interpret details instead of extracting them verbatim.
The “Source” Column: Require a column that labels every value as either “Extracted” (word-for-word) or “Inferred” (derived from context).
Evidence for Inference: If the AI labels something as inferred, it must provide a one-sentence explanation of its reasoning. This acts as a safety net, allowing you to skim only the “Inferred” rows to validate the AI’s logic.
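The review workflow Rule 3 enables can be sketched as a small filter over the model's output rows. The row format (dicts with `field`, `value`, `source`, and optional `evidence`/`reason` keys) is an assumption made for this example, not a format prescribed by the video.

```python
# Sketch of Rule 3's safety net: each row carries a "source" label
# ("Extracted" or "Inferred") plus one-sentence evidence for inferences,
# so human review can be limited to inferred rows and blanks.

def rows_needing_review(rows):
    """Return only rows the model marked as inferred, or left blank."""
    return [
        row for row in rows
        if row.get("source") == "Inferred" or row.get("value") in (None, "")
    ]

# Hypothetical extraction output for an invoice.
rows = [
    {"field": "total", "value": "120.00", "source": "Extracted"},
    {"field": "currency", "value": "EUR", "source": "Inferred",
     "evidence": "Prices elsewhere in the document use the euro symbol."},
    {"field": "po_number", "value": "", "source": "Extracted",
     "reason": "No purchase-order number appears in the document."},
]

# Only the inferred currency and the blank po_number need a human look;
# the word-for-word total can be trusted without re-checking.
review = rows_needing_review(rows)
```

This is the shift the video describes: instead of verifying every cell, you verify only the rows the model itself flagged as uncertain or derived.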
By implementing these rules, users can shift from checking every single data point to only reviewing blanks and inferences, significantly increasing both trust and efficiency.
