Advanced Natural Language Processing
Spring 2026
Instructors: Sewon Min, Alane Suhr
Class hours: TuTh 15:30–17:00 (15:40–17:00, accounting for Berkeley time)
Class location: SODA 306
Instructor OH: Immediately after each lecture, in SODA 306
GSI OH: Monday 12:30–1:00 PM, Wednesday 11:30 AM–12:00 PM | Zoom Link
Ed link: https://edstem.org/us/join/XvztdK (Please use Ed for any class-related questions)
Gradescope link: Gradescope (entry code: J4N7E2)
Lecture Recordings: https://www.youtube.com/playlist?list=PLnocShPlK-Fv9YZIX7qdOyc2GJqnT3D-8 (requires a Berkeley login; Lecture 1 coming soon)
If you are interested in taking the course and can’t directly enroll, please submit this form.
This course provides a graduate-level introduction to Natural Language Processing (NLP), covering techniques from foundational methods to modern approaches. We begin with core concepts such as word representations and neural network–based NLP models, including recurrent networks and attention mechanisms. We then study modern Transformer-based models, focusing on pre-training, fine-tuning, prompting, scaling laws, and post-training. The course concludes with recent advances in NLP, including retrieval-augmented models, reasoning models, and multimodal systems involving vision and speech.
Prerequisites: CS 288 assumes prior experience in machine learning, including familiarity with neural networks, and proficiency in PyTorch and NumPy; no introductory tutorials will be provided.
Schedule (Tentative)
All deadlines are at 5:59 PM Pacific Time.
- 01/20 Tue
- Introduction & n-gram LM
- 01_Intro 02_ngram_LM
- 01/22 Thu
- Word representation
- 03_Word_Representation
- 01/27 Tue
- Text classification
- 04_Text Classification
- Assignment 1 released
- 01/29 Thu
- Sequence models (Key concepts: Recurrent neural networks)
- 05_Sequence Models
- 02/03 Tue
- Case study 1: Machine Translation (Key concepts: Encoder-decoder, Attention)
- 02/05 Thu
- Case study 2: Question answering
- 02/10 Tue
- Transformers
- Assignment 1 due; team matching survey due; Assignment 2 released
- 02/12 Thu
- Transformers (cont’d) & Pre-training
- 02/17 Tue
- Pre-training (cont’d), Fine-tuning, & Prompting
- 02/19 Thu
- Scaling laws & Data curation
- 02/24 Tue
- Guest lecture (TBA)
- Assignment 2 due
- 02/26 Thu
- Experimental design & Human annotation
- 03/03 Tue
- Retrieval and RAG
- Project Checkpoint 1 (abstract) due; Assignment 3 released
- 03/05 Thu
- Post-training
- 03/10 Tue
- Inference methods & Evaluation
- 03/12 Thu
- Mixture-of-Experts
- 03/17 Tue
- Guest lecture (TBA)
- Assignment 3 early milestone due
- 03/19 Thu
- Test-time compute & Reasoning models
- Assignment 3 due
- 03/24 Tue
- No class: Spring break
- 03/26 Thu
- No class: Spring break
- 03/31 Tue
- LLM agents
- 04/02 Thu
- Vision-language models
- 04/07 Tue
- Interactive embodied agents
- 04/09 Thu
- Guest lecture: “Advancing the Capability and Safety of Computer-Use Agents” by Huan Sun (OSU)
- Project Checkpoint 2 (midpoint report) due
- 04/14 Tue
- Guest lecture: “Memory in Language Models: Representation and Extraction” by Jack Morris (Cornell → Stealth)
- 04/16 Thu
- Pragmatics
- 04/21 Tue
- Impact & Social implications
- 04/23 Thu
- Guest lecture: “Speech” by Gopala Anumanchipalli (UC Berkeley)
- 04/28 Tue
- Project presentation
- 04/30 Thu
- Project presentation
- Project report due by 05/07 (Thu)
Acknowledgements
The class materials, including lectures and assignments, are largely based on the following courses, whose instructors have generously made their materials publicly available. We are deeply grateful to them for sharing their work with the broader community:
- Princeton COS 484 Natural Language Processing by Danqi Chen, Tri Dao, Vikram Ramaswamy
- CMU Advanced Natural Language Processing by Graham Neubig & Sean Welleck
- Stanford CS336 Language Modeling from Scratch by Tatsunori Hashimoto & Percy Liang
- Cornell LM-class by Yoav Artzi
- An earlier offering of UC Berkeley CS 288 Natural Language Processing by Dan Klein and Alane Suhr
We are grateful to VESSL AI and Google Cloud for providing compute credits to support our final projects.
