Karina Nguyen
I am a research lead at OpenAI building product research moonshots (hiring!).
My contributions so far (1.3 yrs): creative writing model (unreleased), ChatGPT Tasks, Canvas (first fully synthetic post-training), SimpleQA factuality benchmark, 4o (Nov '24) w/ improved writing, o-series (o3 w/ tools, o1)
Research: GenOS/GenUI, synthetic envs in RL, new methods in RLAIF via multimodality, consumer coding agents, teaching the model to write novels, methods for grading subjective tasks in RL, post-training behavior science (ICL, IF, tool use, etc.)
Previously, I worked at Anthropic:
- Post-training (RLAIF/RLHF) and leading evaluations for the Claude 3 and Claude 2 model families
- Training & productionizing cost-performance-efficient models: Claude 3 Haiku, Claude Instant 1.2 and 1.1 in the API
- Researching Constitutional AI scaling laws, character/model personality, reducing hallucinations (before the search tool), honesty, self-correction, refusals, and faithful reasoning
- Building the file-uploads feature to productionize the 100K-token long-context capability
- Developing long-horizon & human feedback interfaces, Claude in Slack, and other unreleased products
- Writing the first 50,000 lines of code for claude.ai and the developer console
- +100 more things that you’d expect to do in a very fast-growing startup!
I also worked on R&D prototypes, engineering tools, and product features with teams at Primer.ai, Dropbox, Square, and the New York Times.
AI Research
Find all published research here.
SimpleQA: Measuring Short-form Factuality in Large Language Models
A factuality benchmark called SimpleQA that measures the ability of language models to answer short, fact-seeking questions.
Question Decomposition Improves the Faithfulness of Model-Generated Reasoning
By forcing the model to answer simpler subquestions in separate contexts, we greatly increase the faithfulness of model-generated reasoning over chain-of-thought (CoT), while still achieving some of the performance gains of CoT.
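A minimal sketch of that decomposition loop, assuming a hypothetical `ask_model` helper standing in for a single LLM call; this illustrates the general idea, not the paper's actual implementation:

```python
def ask_model(prompt: str) -> str:
    """Placeholder for one LLM completion call; swap in your provider's API."""
    raise NotImplementedError

def answer_by_decomposition(question: str) -> str:
    # 1. Ask the model to break the question into simpler subquestions.
    plan = ask_model(
        f"List the simpler subquestions needed to answer: {question}\n"
        "One subquestion per line."
    )
    subquestions = [line.strip() for line in plan.splitlines() if line.strip()]

    # 2. Answer each subquestion in its own fresh context, so no single
    #    completion can rely on unstated reasoning from the others.
    subanswers = [ask_model(f"Answer concisely: {sq}") for sq in subquestions]

    # 3. Recompose: the final answer may use only the recorded subanswers,
    #    keeping the stated reasoning closer to the actual computation.
    notes = "\n".join(f"Q: {q}\nA: {a}" for q, a in zip(subquestions, subanswers))
    return ask_model(f"Using only the notes below, answer: {question}\n\n{notes}")
```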
Towards Measuring the Representation of Subjective Global Opinions in Language Models
We develop a method to test global opinions represented in language models.
In submission 2023
Discovering Language Model Behaviors with Model-Written Evaluations
We test LMs using >150 LM-written evaluations, finding cases of inverse scaling in which larger models exhibit more sycophantic behavior.
ACL'23 (Findings)
Towards Semantically-Aware UI Design Tools: Design, Implementation and Evaluation of Semantic Grouping Guidelines
We develop a computational metric to measure violations of semantic grouping guidelines in UI designs.
ICML Workshop'23
Investigations
My work in visual forensics & human rights contributed to Pulitzer Prize-winning reporting. It involved investigating war crimes and crimes against humanity through extensive data collection, evidence verification, satellite analysis, 3D reconstructions, legal submissions, investigative tools, and applied remote sensing.
- Bloomberg CityLab
- Wired
- New York Times
- Washington Post
- CNN
- Associated Press
- Bellingcat
- SITU Research
- The Atlantic Council
- Amnesty International
Say Hi!