Policies

Table of contents

  1. Description
  2. Readings
  3. Office Hours
  4. Late Work Policy

Description

This course will explore current statistical techniques for the automatic analysis of natural (human) language data. The dominant modeling paradigm is corpus-driven statistical learning. This term, we are introducing a few new projects to give increased hands-on experience with a greater variety of NLP tasks and commonly used techniques.

This course assumes a good background in basic machine learning and a strong ability to program in Python. Prior experience with linguistics or natural languages is helpful, but not required. There will be a lot of statistics, algorithms, and coding in this class. The recommended background is CS 188 (or CS 281A) and CS 170 (or CS 270). An A in CS 188 (or CS 281A) is required. This course will be more work-intensive than most graduate or undergraduate courses.

Readings

The primary recommended texts for this course are:

Both texts are currently free online.

Office Hours

Professor office hours: Tuesdays 3:30-4:30pm; 781 Soda Hall

GSI office hours: Thursdays 5:00-6:00pm; Soda Alcove 341B

Late Work Policy

Programming projects must be turned in electronically by midnight on the listed due date. You have a total of 7 slip days for these projects, of which at most four can be used on any single project. Slip days are counted in whole days, with any partial day rounded up. For example, for a project due at midnight on Thursday, a submission between Thursday midnight and Friday midnight uses one slip day, a submission between Friday midnight and Saturday midnight uses two slip days, and so on. A submission after the following Monday midnight receives no credit. We may allow more days to be used on a single project if you seek prior permission.
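
The slip-day arithmetic can be sketched in a few lines of Python. This is only an illustration of the counting rule, not the course's actual grading script. The function names, the constants, and the example dates are hypothetical. The sketch assumes that "midnight on Thursday" means the end of Thursday, and that a submission made exactly at a midnight boundary counts toward the earlier day.

```python
from datetime import datetime, timedelta

TOTAL_SLIP_DAYS = 7   # total budget across all projects (from the policy)
MAX_PER_PROJECT = 4   # per-project cap without prior permission (from the policy)

ONE_DAY = timedelta(days=1)


def slip_days_used(due: datetime, submitted: datetime) -> int:
    """Return the slip days consumed, counting any partial day as a full day."""
    if submitted <= due:
        return 0  # on time (assumption: exactly at the deadline is on time)
    # divmod on two timedeltas yields (whole days, leftover timedelta).
    whole_days, remainder = divmod(submitted - due, ONE_DAY)
    return whole_days + (1 if remainder else 0)


def receives_credit(days_used: int, days_remaining: int = TOTAL_SLIP_DAYS) -> bool:
    """Check a submission against the per-project cap and the remaining budget."""
    return days_used <= min(MAX_PER_PROJECT, days_remaining)


# Hypothetical deadline: end of Thursday, 1 Feb 2024 (00:00 on Friday).
due = datetime(2024, 2, 2, 0, 0)

friday_morning = datetime(2024, 2, 2, 10, 30)
saturday_noon = datetime(2024, 2, 3, 12, 0)
monday_night = datetime(2024, 2, 5, 23, 0)
tuesday_morning = datetime(2024, 2, 6, 9, 0)

for label, when in [
    ("Friday morning", friday_morning),
    ("Saturday noon", saturday_noon),
    ("Monday night", monday_night),
    ("Tuesday morning", tuesday_morning),
]:
    used = slip_days_used(due, when)
    print(f"{label}: {used} slip day(s), credit: {receives_credit(used)}")
```

Under these assumptions, a Friday-morning submission uses one slip day, a Monday-night submission uses the full four, and a Tuesday-morning submission would need five and so receives no credit. Passing a smaller days_remaining value models a student who has already spent part of the 7-day budget on earlier projects.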