Advanced Natural Language Processing
Spring 2026
Instructors: Sewon Min, Alane Suhr
Class hours: Tu/Th 15:30–17:00 (15:40–17:00 considering Berkeley time)
Class location: SODA 306
Instructor OH: Immediately after lecture, in SODA 306
GSI OH: Monday 12:30–1:00 PM, Wednesday 11:30 AM–12:00 PM | Zoom Link
Ed link: https://edstem.org/us/join/XvztdK (Please use Ed for any class-related questions)
Gradescope link: Gradescope (Use code: J4N7E2)
Lecture recordings: https://www.youtube.com/playlist?list=PLnocShPlK-Fv9YZIX7qdOyc2GJqnT3D-8 (requires Berkeley login; lecture 1 coming soon)
Final project: Logistics and reference topics: https://docs.google.com/document/d/1C8Dl6DX0_F5g3HDR-Gwr1fTmKGgscxzbU9AiUpvxV0k/edit?usp=sharing
This course provides a graduate-level introduction to Natural Language Processing (NLP), covering techniques from foundational methods to modern approaches. We begin with core concepts such as word representations and neural network–based NLP models, including recurrent networks and attention mechanisms. We then study modern Transformer-based models, focusing on pre-training, fine-tuning, prompting, scaling laws, and post-training. The course concludes with recent advances in NLP, including retrieval-augmented models, reasoning models, and multimodal systems involving vision and speech.
Prerequisites: CS 288 assumes prior experience in machine learning and proficiency in PyTorch and NumPy. Students should already be familiar with neural networks; no introductory tutorials will be provided.
Schedule (Tentative)
All deadlines are at 5:59 PM Pacific Time.
- 01/20 Tue
- Introduction & n-gram LM
- 01_Intro 02_ngram_LM
- 01/22 Thu
- Word representation
- 03_Word_Representation
- 01/27 Tue
- Text classification
- 04_Text Classification
- Assignment 1 released
- 01/29 Thu
- Sequence models (recurrent neural networks)
- 05_Sequence Models
- 02/03 Tue
- Sequence-to-sequence models
- 06_Seq2Seq
- 02/05 Thu
- Sequence-to-sequence models (cont’d) & Transformers
- 02/10 Tue
- Transformers (cont’d)
- 07_Transformers
- Assignment 1 due
- Team matching survey due
- Assignment 2 released
- 02/12 Thu
- Pre-training, Fine-tuning, & Prompting
- 08_Pretraining/FT/Prompting
- 02/17 Tue
- Pre-training, Fine-tuning, & Prompting (cont’d)
- 02/19 Thu
- Pre-training advanced topics
- 09_Pretraining_Advanced
- 02/24 Tue
- Post-training
- Assignment 2 due
- 02/26 Thu
- Inference methods & Evaluation
- 03/03 Tue
- Experimental design & Human annotation
- Project Checkpoint 1 (abstract) due
- Assignment 3 released
- 03/05 Thu
- Architecture advanced topics 1: Retrieval and RAG
- 03/10 Tue
- Architecture advanced topics 2: Mixture-of-Experts and other Transformer variants
- 03/12 Thu
- Impact & Social implications
- 03/17 Tue
- No class: EECS faculty retreat
- Assignment 3 early milestone due
- 03/19 Thu
- Test-time compute & Reasoning models
- Assignment 3 due
- 03/24 Tue
- No class: Spring break
- 03/26 Thu
- No class: Spring break
- 03/31 Tue
- LLM agents
- 04/02 Thu
- Vision-language models
- 04/07 Tue
- Interactive embodied agents
- 04/09 Thu
- Guest lecture: “Advancing the Capability and Safety of Computer-Use Agents” by Huan Sun (OSU)
- Project Checkpoint 2 (midpoint report) due
- 04/14 Tue
- Guest lecture: “Memory in Language Models: Representation and Extraction” by Jack Morris (Cornell → Stealth)
- 04/16 Thu
- Pragmatics
- 04/21 Tue
- Guest lecture: “Continual Learning” by Akshat Gupta (UC Berkeley)
- 04/23 Thu
- Guest lecture: “Speech” by Gopala Anumanchipalli (UC Berkeley)
- 04/28 Tue
- Project presentation
- 04/30 Thu
- Project presentation
- Project report due by 05/07 (Thu)
Acknowledgement
The class materials, including lectures and assignments, are largely based on the following courses, whose instructors have generously made their materials publicly available. We are deeply grateful to them for sharing their work with the broader community:
- Princeton COS 484 Natural Language Processing by Danqi Chen, Tri Dao, Vikram Ramaswamy
- CMU Advanced Natural Language Processing by Graham Neubig & Sean Welleck
- Stanford CS336 Language Modeling from Scratch by Tatsunori Hashimoto & Percy Liang
- Cornell LM-class by Yoav Artzi
- An earlier offering of UC Berkeley CS 288 Natural Language Processing by Dan Klein and Alane Suhr
We are grateful to VESSL AI and Google Cloud for providing compute credits to support our final projects.
