LASSP & AEP Seminar: Gautam Reddy (Princeton)

Written by cida on October 19, 2024.

In-Context Learning: Insights from the Analysis of Tiny Transformers

Transformer models pretrained on large amounts of language data display a powerful feature known as in-context learning: the ability to parse new information presented in the context with no additional weight updates. In-context learning contrasts with traditional weight-based learning paradigms in neuroscience, which usually involve learning rules designed to solve specific problems, and motivates new models of rapid learning. In this talk, I will present a detailed, quantitative analysis of small transformer models trained on simplified tasks. I will discuss how these models implement in-context learning, how this ability emerges during learning and why it appears even in scenarios where memorizing the dataset is optimal.

Bio:
Gautam Reddy is an assistant professor of physics at Princeton University. He studied at the Indian Institute of Technology, Bombay, India, earning a degree in engineering physics. He received his Ph.D. in physics from the University of California, San Diego, and then served as an NSF-Simons Fellow at Harvard and as a research scientist at NTT Research’s Physics and Informatics Labs. Drawing upon a diverse set of problems in neuroscience, evolution and machine learning, his research is focused on understanding how living and artificial systems process high-dimensional information to solve goal-oriented tasks. He runs the Reddy lab, which develops novel physics-inspired theory and tools to build phenomenological models of learning and decision-making in collaboration with experimental biologists, and uses machine learning models as ‘experimental systems’ to motivate new theory.
