Karina Nguyen
Currently building Thoughtful.
I care deeply about helping the AI ecosystem thrive as an angel investor and advisor to startups and neolabs, while also backing projects in design, fashion, hospitality, and the arts.
Previously, I was an AI researcher at OpenAI, where I worked across post-training, practical alignment, reinforcement learning, and product, contributing to the o-series reasoning models, GPT-4o, Canvas, and Tasks.
I joined Anthropic in summer 2022 and spent two years on post-training for Claude 1–3, research on model behavior (e.g. honesty, harmlessness, reducing hallucinations), and early products such as claude.ai and Claude in Slack.
I began my career through internships in software engineering and product design at Primer.ai, Dropbox, Square, The New York Times, and Fireflies.ai.
AI Research
Find all published research here.
PostTrainBench: Can LLM Agents Automate LLM Post-Training?
Funded and led by Thoughtful. We test the ability of frontier AI to automate LLM post-training on weaker open models. Received a Best Paper Award at the ICLR 2026 Workshop on AI with Recursive Self-Improvement. Featured by Jack Clark and OpenAI.
SimpleQA: Measuring Short-form Factuality in Large Language Models
A factuality benchmark that measures the ability of language models to answer short, fact-seeking questions.
Question Decomposition Improves the Faithfulness of Model-Generated Reasoning
By forcing the model to answer simpler subquestions in separate contexts, we greatly increase the faithfulness of model-generated reasoning over chain-of-thought (CoT), while still achieving some of the performance gains of CoT.
Investigations
Earlier, I worked in visual forensics and OSINT (open-source intelligence), contributing to Pulitzer Prize-winning reporting. This work involved investigations of war crimes and crimes against humanity with extensive data collection, evidence verification, satellite analysis, 3D reconstructions, legal submissions, investigative tools, and applied remote sensing.
- Bloomberg CityLab
- Wired
- New York Times
- Washington Post
- CNN
- Associated Press
- Bellingcat
- SITU Research
- The Atlantic Council
- Amnesty International
Say Hi!