Advanced Natural Language Processing
Spring 2026
Instructors: Sewon Min, Alane Suhr
Class hours: Tu/Th 15:30–17:00 (15:40–17:00 considering Berkeley time)
Class location: SODA 306
Instructor OH: Immediately after lecture, in SODA 306
GSI OH: Monday 12:30–1:00 PM, Wednesday 11:30 AM–12:00 PM | Zoom Link
Ed link: https://edstem.org/us/join/XvztdK (Please use Ed for any class-related questions)
Gradescope link: Gradescope (Use code: J4N7E2)
Lecture recordings: https://www.youtube.com/playlist?list=PLnocShPlK-Fv9YZIX7qdOyc2GJqnT3D-8 (requires Berkeley login; lecture 1 coming soon)
Final project: Logistics and reference topics: https://docs.google.com/document/d/1C8Dl6DX0_F5g3HDR-Gwr1fTmKGgscxzbU9AiUpvxV0k/edit?usp=sharing
This course provides a graduate-level introduction to Natural Language Processing (NLP), covering techniques from foundational methods to modern approaches. We begin with core concepts such as word representations and neural network–based NLP models, including recurrent networks and attention mechanisms. We then study modern Transformer-based models, focusing on pre-training, fine-tuning, prompting, scaling laws, and post-training. The course concludes with recent advances in NLP, including retrieval-augmented models, reasoning models, and multimodal systems involving vision and speech.
Prerequisites: CS 288 assumes prior experience in machine learning and proficiency in PyTorch and NumPy. Students should already be familiar with neural networks; no introductory tutorials will be provided.
Schedule (Tentative)
All deadlines are at 5:59 PM Pacific Time.
- 01/20 Tue
- Introduction & n-gram LM
- 01_Intro 02_ngram_LM
- 01/22 Thu
- Word representation
- 03_Word_Representation
- 01/27 Tue
- Text classification
- 04_Text Classification
- Assignment 1 released
- 01/29 Thu
- Sequence models (recurrent neural networks)
- 05_Sequence Models
- 02/03 Tue
- Sequence-to-sequence models
- 06_Seq2Seq
- 02/05 Thu
- Sequence-to-sequence models (cont’d) & Transformers
- 02/10 Tue
- Transformers (cont’d)
- 07_Transformers
- Assignment 1 due
- Team matching survey due
- Assignment 2 released
- 02/12 Thu
- Pre-training, Fine-tuning, & Prompting
- 08_Pretraining/FT/Prompting
- 02/17 Tue
- Pre-training, Fine-tuning, & Prompting (cont’d)
- 02/19 Thu
- Pre-training advanced topics
- 09_Pretraining_Advanced
- 02/24 Tue
- Post-training
- Assignment 2 due
- 02/26 Thu
- Inference methods & Evaluation
- 03/03 Tue
- Experimental design & Human annotation
- Project Checkpoint 1 (abstract) due
- Assignment 3 released
- 03/05 Thu
- Architecture advanced topics 1: Retrieval and RAG
- 03/10 Tue
- Architecture advanced topics 2: Mixture-of-Experts and other Transformer variants
- 03/12 Thu
- Impact & Social implications
- 03/17 Tue
- No class: EECS faculty retreat
- Assignment 3 early milestone due
- 03/19 Thu
- Test-time compute & Reasoning models
- Assignment 3 due
- 03/24 Tue
- No class: Spring break
- 03/26 Thu
- No class: Spring break
- 03/31 Tue
- LLM agents
- 04/02 Thu
- Vision-language models
- 04/07 Tue
- Interactive embodied agents
- 04/09 Thu
- Guest lecture: “Advancing the Capability and Safety of Computer-Use Agents” by Huan Sun (OSU)
- Project Checkpoint 2 (midpoint report) due
- 04/14 Tue
- Guest lecture: “Memory in Language Models: Representation and Extraction” by Jack Morris (Cornell → Stealth)
- 04/16 Thu
- Pragmatics
- 04/21 Tue
- Guest lecture: “Continual Learning” by Akshat Gupta (UC Berkeley)
- 04/23 Thu
- Guest lecture: “Speech” by Gopala Anumanchipalli (UC Berkeley)
- 04/28 Tue
- Project presentation
- 04/30 Thu
- Project presentation
- Project report due by 05/07 (Thu)
Acknowledgement
The class materials, including lectures and assignments, are largely based on the following courses, whose instructors have generously made their materials publicly available. We are deeply grateful to them for sharing their work with the broader community:
- Princeton COS 484 Natural Language Processing by Danqi Chen, Tri Dao, Vikram Ramaswamy
- CMU Advanced Natural Language Processing by Graham Neubig & Sean Welleck
- Stanford CS336 Language Modeling from Scratch by Tatsunori Hashimoto & Percy Liang
- Cornell LM-class by Yoav Artzi
- An earlier offering of UC Berkeley CS 288 Natural Language Processing by Dan Klein and Alane Suhr
We are grateful to VESSL AI and Google Cloud for providing compute credits to support our final projects.
