Thoughts — Jason Wei

Jason Wei

Thoughts

2025

AI will not takeoff fast
Work in the office
AI should create unicorns
The asymmetry of AI research
Fact-checking good AI researchers
The description-execution gap
RL environment specs
The inverse 80-20 rule
Method-driven vs problem-driven research
AI research is a max-performance domain
ATLA is the ultimate benchmark
Measurement is all you need
AlphaEvolve is thought-provoking
Binary-choice questions for AI research taste
The craziest chain-of-thought
The best hard-to-solve easy-to-verify benchmark
Flavors of AI for scientific innovation
Debugging-prioritized AI research
When scientific understanding catches up with models
Butterfly effect of AI researchers’ backgrounds
Benchmarks quickly get saturated
Deep browsing models
Unstoppable RL optimization vs unhackable RL environment
Dopamine cycle in AI research
Find the right dataset

2024

Biggest lessons in AI in past five years
Solving hallucinations via self-calibration
Cooking with AI mindset
OpenAI o3
Value of safety research
RL all the time
Transition to AI for science
Information density & flow of papers
CoT before and after o1
SimpleQA
The o1 paradigm
Inspiring words from a young OpenAI engineer
Levels and expectations
Bet on AI research experiments
History of Flan-2
When I don’t sleep enough
Thinking about history makes me appreciate AI
Advice from Bryan Johnson
Sora is like GPT-2 for video generation
A typical day at OpenAI
Yolo runs
Uniform information density for CoT
Inertia bias in AI research
Compute-bound, not headcount-bound
Magic of language models
Why you should write tests
Co-founders who still write code

2023

Hyung Won
Read informal write-ups
Relationship board of directors
Reinventing myself
Good prompting techniques
10k citations
Manually inspect data
Language model evals
Amusing nuggets from being an AI resident
When to use task-specific models
Benefits of pair programming
Many great managers do IC work
Why I’m 100% transparent with my manager
My girlfriend is a reward model
Better citation metrics than h-index
My strengths are communication and prioritization
Emergence (dunk on Yann LeCun)
UX for researchers
My refusal
The evolution of prompt engineering
Prompt engineering battle
Incumbents don’t have a big advantage in AI research
Potential research directions for PhD students
Best AI skillset

2022

Add an FAQ section to your research papers
Prompt engineering is black magic
What work withstands the bitter lesson
A skill to unlearn
Advice on choosing a topic