Subcultured

How truthful is GPT-3? A benchmark for language models - AI Alignment Forum

This is an edited excerpt of a new ML paper (pdf, code) by Stephanie Lin (FHI Oxford), Jacob Hilton (OpenAI) and Owain Evans (FHI Oxford). The paper is under review at NeurIPS. TITLE: TRUTHFULQA: MEASURING HOW MODELS MIMIC HUMAN FALSEHOODS ABSTRACT We propose a benchmark to measure whether a language model is truthful in generating answers to questions. The benchmark comprises 817 questions that span 38 categories, including health, law, finance and politics (see Figure 1). We crafted questio...

Subcultured

alignmentforum.org

How truthful is GPT-3? A benchmark for language models - AI Alignment Forum

Subcultured