Karina Nguyen

Currently building Thoughtful.

I care deeply about helping the AI ecosystem thrive as an angel investor and advisor to startups and neolabs, while also backing projects in design, fashion, hospitality, and the arts.

Previously, I was an AI researcher at OpenAI, where I worked across post-training, practical alignment, reinforcement learning, and product, contributing to the o-series reasoning models, GPT-4o, Canvas, and Tasks.

I joined Anthropic in summer 2022 and spent two years on post-training for Claude 1–3, research on model behavior (e.g., honesty, harmlessness, reducing hallucinations), and early products such as claude.ai and Claude in Slack.

I began my career through internships in software engineering and product design at Primer.ai, Dropbox, Square, The New York Times, and Fireflies.ai.

AI Research

Find all published research here.

PostTrainBench: Can LLM Agents Automate LLM Post-Training?

Ben Rank, Hardik Bhatnagar, Ameya Prabhu, Shira Eisenberg, Karina Nguyen, Matthias Bethge, Maksym Andriushchenko

Funded and led by Thoughtful. We test the ability of frontier AI to automate LLM post-training on weaker open models. Received a Best Paper Award at the ICLR 2026 Workshop on AI with Recursive Self-Improvement. Featured by Jack Clark and OpenAI.

SimpleQA: Measuring Short-form Factuality in Large Language Models

Jason Wei*, Karina Nguyen*, Hyung Won Chung, Yunxin Joy Jiao, Spencer Papay, Amelia Glaese, John Schulman, William Fedus

A factuality benchmark called SimpleQA that measures the ability of language models to answer short, fact-seeking questions.

Question Decomposition Improves the Faithfulness of Model-Generated Reasoning

Ansh Radhakrishnan*, Karina Nguyen*, +18 more, Jared Kaplan, Jan Brauner, Samuel R. Bowman, Ethan Perez

By forcing the model to answer simpler subquestions in separate contexts, we greatly increase the faithfulness of model-generated reasoning over chain-of-thought (CoT), while still achieving some of the performance gains of CoT.

Discovering Language Model Behaviors with Model-Written Evaluations

Ethan Perez, Sam Ringer*, Kamilė Lukošiūtė*, Karina Nguyen*, Edwin Chen, Scott Heiner, +55 more, Nicholas Schiefer, Jared Kaplan

We test LMs using >150 LM-written evaluations, finding cases of inverse scaling, where models exhibit sycophantic behaviors.

ACL'23 (Findings)

Investigations

Earlier, I worked in visual forensics and OSINT (open-source intelligence), contributing to Pulitzer Prize-winning reporting. That work involved investigating war crimes and crimes against humanity through extensive data collection, evidence verification, satellite analysis, 3D reconstructions, legal submissions, investigative tools, and applied remote sensing.

Say Hi!