CS287: Machine Learning for Natural Language

Alexander Rush, Harvard University

Time: Mon/Wed 10:30-11:45am

Location: Pierce 301

Course Info

Application
  • Course Application (CS 287 will be capped at 35 students this semester. We are looking for interested and dedicated students. If you think this is you, please take the time to fill out the application carefully.)
Lab Times
  • Friday 2:30-3:30pm, MD 2nd Floor Lounge
Instructor
  • Alexander "Sasha" Rush
  • Email: Slack preferred or srush at seas.harvard.edu
Teaching Assistants
  • Justin (OH: Monday 3pm-4pm, MD 2nd floor lounge), Yuntian (OH: Wednesday 4pm-5pm, MD 2nd floor lounge)
Office Hours
  • Monday 12-1:30pm: MD 217 (Sasha)
Grading
  • Assignments (20%)
  • Presentation and Participation (15%)
  • Midterm Exam (15%)
  • Final Project (50%)
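
As an illustration of how the weights above combine, here is a minimal Python sketch. The weights are taken from the grading list; the function name, the 0-100 score scale, and the example scores are assumptions made for illustration and are not part of any official grading policy.

```python
# Weights copied from the grading list; everything else here is hypothetical.
WEIGHTS = {
    "assignments": 0.20,
    "presentation_participation": 0.15,
    "midterm": 0.15,
    "final_project": 0.50,
}


def weighted_grade(scores):
    """Combine per-component scores (assumed to be on a 0-100 scale)."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Made-up example scores, for illustration only.
example_scores = {
    "assignments": 90,
    "presentation_participation": 80,
    "midterm": 70,
    "final_project": 85,
}
print(round(weighted_grade(example_scores), 2))
```

Since the final project carries half of the weight, a change in that score moves the total more than the same change in any other component.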

Time and Location

  • Thursday 5-6pm: Pierce Hall 320
  • Friday 11-11:59am: MD 223

Date | Location | Topic | Materials
Sep. 1, 10-11am (Mark) | Pierce 301 | Math Review (Linear Algebra, Calculus, Probability Theory) |
Sep. 4, 5-6pm (Zhirui) | Pierce 301 | Math Review (Linear Algebra, Calculus, Probability Theory) |
Sep. 7, 5-6pm (Rachit) | Pierce 320 | Code Review (Python, Numpy, Matplotlib, PyTorch) |
Sep. 8, 11-11:59am (Rachit) | MD 223 | Code Review (Python, Numpy, Matplotlib, PyTorch) |

You will form groups of three to work on a project (three is preferred; for exceptions, please ask Sasha). The ideal outcome of this project would be a paper that could be submitted to a top-tier natural language or machine learning conference such as ACL, EMNLP, NIPS, ICML, or UAI. There are different ways to approach this project, which are discussed in a more comprehensive document available on the course website. There are four separate components of the project.

You will upload these materials via Canvas. Please see the syllabus (linked on the course website) for a more thorough description of the final project and of related policies such as collaboration.


Important Dates

Date | Due | Description
March 27 | Abstract and Status Report | A three- to four-page document that contains a draft of your final abstract, as well as a brief status report on the progress of your project.
May 13 | Final Report | A report of up to ten pages, in the style of a mainstream CS conference paper. Please use the provided template.

Our syllabus this semester consists of two parts. The first part of the semester will be an accelerated background on applied deep learning for natural language processing, with a series of Kaggle competitions. The second part of the semester will consist of student-led paper presentations on the topic of text generation and transfer.

Date Area Topic Demos Required Readings Assignment
Jan. 28 Intro
Jan. 30 Classification Basics notebook
Feb. 4 CNNs notebook
Feb. 6 Sequences NNLMs notebook
Feb. 11 RNNs notebook Classification (Kaggle)
Feb. 13 Translation (Yuntian)
Feb. 20 Attention notebook Modeling (Kaggle)
Feb. 21 Talk Sasha (Thurs 3pm G115)
Feb. 25 Guest Lecture: Alec Radford
Feb. 27 Latent Variables Variational Autoencoders (Yoon Kim)
Mar. 4 Latent Variables 2 notebook
Mar. 6 Frontier of Tasks Problem and Datasets Translation (Kaggle)
Mar. 11 Midterm
Mar. 13 Projects Discussion Sign-up 9:30am-2pm
Mar. 13 Yann LeCun Talk
Mar. 25 Project / Office Hours
Mar. 27 EthiCS (Fairness + Bias) Final Project Abstracts
Mar. 28 Timnit Gebru (3pm MD 115)
Apr. 1 Probing Models
Apr. 3 Project Attention Ethics (Kaggle)
Apr. 8 Project
Apr. 10 Project
Apr. 15 Project
Apr. 17 Project
Apr. 22 Project
Apr. 24 Project
Apr. 29 Project
May 1 Project
May 11
May 13 Final Paper Due