Downes.ca ~ Stephen's Web ~ How truthful is GPT-3? A benchmark for language models

How truthful is GPT-3? A benchmark for language models

Owain Evans, AI Alignment Forum, Oct 15, 2021
Commentary by Stephen Downes

One fairly minimal condition for educational content is that it be true (of course, there are many real-world exceptions to that rule even today, but let's leave that aside). So a major challenge for AI-generated content will be ensuring that it is true. Will it be? This article studies GPT-3 from the perspective of truthfulness, and the results so far are not encouraging. From the paper (35-page PDF): "The best model was truthful on 58% of questions, while human performance was 94%. Models generated many false answers that mimic popular misconceptions and have the potential to deceive humans. The largest models were generally the least truthful."
