Close

    alignmentforum.org

    • How truthful is GPT-3? A benchmark for language models - AI Alignment Forum

      This is an edited excerpt of a new ML paper (pdf, code) by Stephanie Lin (FHI Oxford), Jacob Hilton (OpenAI) and Owain Evans (FHI Oxford). The paper is under review at NeurIPS. TITLE: TRUTHFULQA: MEASURING HOW MODELS MIMIC HUMAN FALSEHOODS ABSTRACT We propose a benchmark to measure whether a language model is truthful in generating answers to questions. The benchmark comprises 817 questions that span 38 categories, including health, law, finance and politics (see Figure 1). We crafted questio...