
Policies

Table of contents

  1. Description
  2. Readings
  3. Office Hours
  4. Late Work Policy

Description

This course will explore current statistical techniques for the automatic analysis of natural (human) language data. The dominant modeling paradigm is corpus-driven statistical learning. This term, we are introducing a few new projects to give increased hands-on experience with a greater variety of NLP tasks and commonly used techniques.

This course assumes a good background in basic machine learning and a strong ability to program in Python. Prior experience with linguistics or natural languages is helpful, but not required. There will be a lot of statistics, algorithms, and coding in this class. The recommended background is CS 188 (or CS 281A) and CS 170 (or CS 270). An A in CS 188 (or CS 281A) is required. This course will be more work-intensive than most graduate or undergraduate courses.

Readings

The primary recommended texts for this course are:

Both texts are currently free online.

Office Hours

Professor office hours: TBA

GSI office hours: TBA

Late Work Policy

Everyone will have 7 slip days they may use on any project throughout the semester, but you may only use up to 2 slip days on an individual project without prior consent from the instructors.

If you need to use more than 2 slip days on a project, please make a private post on Piazza before the project deadline to let us know.

Slip days are counted by rounding up to the next whole day; i.e., if a project is due at 11:59pm on Tuesday and you submit at 12:00am on Wednesday, this will use up one of your slip days. Weekend days count towards slip days as well, so there is no notion of business days here.
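As a rough illustration of the slip-day arithmetic described above, here is a minimal Python sketch. It is not an official grading tool: the function names, the constants' names, and the choice to measure lateness from the exact deadline timestamp are assumptions made for this example.

```python
import math
from datetime import datetime, timedelta

# Values taken from the policy text; the constant names are hypothetical.
TOTAL_SLIP_DAYS = 7
MAX_PER_PROJECT_WITHOUT_CONSENT = 2


def slip_days_used(deadline: datetime, submitted: datetime) -> int:
    """Return the slip days a submission consumes.

    Any lateness is rounded up to a whole day. Weekends are not
    treated specially, matching the "no business days" rule.
    Assumption: lateness is measured from the exact deadline timestamp.
    """
    late = submitted - deadline
    if late <= timedelta(0):
        return 0  # on time or early
    return math.ceil(late / timedelta(days=1))


def needs_prior_consent(days_used: int) -> bool:
    """True if a single project exceeds the per-project allowance."""
    return days_used > MAX_PER_PROJECT_WITHOUT_CONSENT


def remaining_slip_days(days_used_per_project: list[int]) -> int:
    """Slip days left in the semester budget (negative if overdrawn)."""
    return TOTAL_SLIP_DAYS - sum(days_used_per_project)


# Example from the policy: due Tuesday 11:59pm, submitted Wednesday 12:00am.
deadline = datetime(2024, 1, 2, 23, 59)   # a Tuesday (date chosen arbitrarily)
submitted = datetime(2024, 1, 3, 0, 0)    # one minute late
print(slip_days_used(deadline, submitted))
```

Under these assumptions, a submission one minute past the deadline consumes one slip day, and a submission just over 24 hours late consumes two.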