The journey of a form along the line of time
a form impregnating the barren line of time
a form conscious of an image
that returns from a feast in a mirror.
— Forough Farrokhzad
I am a Senior Researcher at the University of Oxford, where I lead the Technical Safety & Governance (TSG) Lab and serve as Technical Director of the AI Governance Initiative. I am also Principal Scientist at Martian. My research focuses on understanding what happens inside neural networks and using those insights to make AI systems safer and more governable. My work is supported by OpenAI, Anthropic, Schmidt Sciences, NVIDIA, and others.
Affiliations: Centre for the Study of Existential Risk, University of Cambridge; Digital Trust Centre, Nanyang Technological University; School of Informatics, University of Edinburgh; and ELLIS (European Laboratory for Learning and Intelligent Systems).
I am looking for students. If my research speaks to your interests, I would love to hear which directions excite you and how your work connects. I am committed to partnering with researchers from underrepresented and disadvantaged backgrounds.
News
| Date | Update |
|---|---|
| Jul 2026 | Invited speaker at EIML@ICML 2026 (2nd Workshop on Epistemic Intelligence in Machine Learning). Talk: “Understanding model behaviour in the age of AGI.” |
| Jul 2026 | Teaching at two summer schools: invited lecture at the ML Summer School on Reliability and Safety, Kraków (Jul 1–4); and at the Oxford Machine Learning Summer School (OxML 2026), Track 02: Representation Learning & Generative AI (Jul 15–18). |
| Jun 2026 | Several upcoming invited talks: debating AI governance at the Oxford Union (Connected Life Summit, Jun 25); invited talk at ETH Zürich on automated interpretability (Jun 25, remote); speaking at BLISS (Berlin Learning and Intelligent Systems Society) Speaker Series, Berlin (Jun 30); and a talk at the Foresight Secure & Sovereign AI Workshop, Berlin (Jul 18–19). |
| May 2026 | Paper accepted at ACL 2026: Make Mechanistic Interpretability Auditable. |
| May 2026 | 7 papers accepted at ICML 2026, including 2 Spotlights: There Are Futures That Benchmark-Driven AI Cannot See and Don’t Just “Fix it in Post”: A Science of AI Must Study Learning Dynamics. |
| Apr 2026 | New paper in Science Robotics: Beyond Alignment: Why Robotic Foundation Models Need Context-Aware Safety. |
| Apr 2026 | Gave an invited talk at the Barcelona Supercomputing Centre (Severo Ochoa Research Seminar); a seminar at the University of Cambridge (Language Technology Lab) follows in May. |
| Apr 2026 | Featured in The Independent and the Irish Independent on AI safety and existential risk. |