CS 181 provides a broad and rigorous introduction to machine learning, probabilistic reasoning and decision making in uncertain environments. We will discuss the motivations behind common machine learning algorithms, and the properties that determine whether or not they will work well for a particular task. You will derive the mathematical underpinnings for many common methods, as well as apply machine learning to challenges with real data. In doing so, our goal is that you gain a strong conceptual understanding of machine learning methods that can empower you to pursue future theoretical and practical directions.
The goal of CS 181 is to combine mathematical derivation and coding assignments to provide a strong and rigorous conceptual grounding in machine learning (e.g. being able to reason about how different methods should behave in different circumstances). Students interested primarily in theory may prefer Stat195 and other learning theory offerings. Students interested primarily in practice may prefer CS109a and other data science offerings. Students interested in a more advanced, optimization-based orientation may prefer CS 183. Students looking for specialized topics may prefer CS28x and other graduate seminars.
The material is aimed at an advanced undergraduate level. Students should be comfortable with writing non-trivial programs (e.g., CS 51, CS 61, or equivalent). All staff-provided code will be in Python. Students should also have a background in probability theory (e.g., STAT 110 or equivalent), and familiarity with calculus and linear algebra (e.g., AM 22a or Math 21ab, or equivalent).
Motivated students without all of these prerequisites may also be able to fill in gaps in their knowledge. Part I of Math for Machine Learning is a useful resource for mathematical background (specifically Sections 2.1-2.6; 3.1-3.5; 4.1-4.2; 5.1-5.6; 6.3). This year we are also planning additional homework-zero-style material, as well as additional sections throughout the semester, to help with mathematical background.
Team
The CS181 team consists of two course instructors, Finale Doshi-Velez and David Parkes, as well as a large staff of TFs led by two co-head TFs. We are all dedicated to helping you learn the fundamentals of machine learning.
Lectures
Lectures will be used to introduce new content as well as explore the
content through conceptual questions. They will be given over
Zoom and also recorded, and involve both slides and iPad-based
discussion.
We plan two sessions each week, in the designated class time. The
instructors will endeavor to hang around after class to answer
questions. Students are encouraged to attend live
so that they can ask questions, including over chat.
We recognize that not everyone is comfortable with using their
video camera during class. But we strongly encourage this and would
like 'video on' to be a norm.
Attending live lecture is an expectation of CS 181 students. You are expected to attend at least 7 live lectures (~1/3 of lectures) with your camera turned on, unless your circumstances don’t permit. If you would like to request an attendance exemption, see this Ed post for instructions.
Sections
Sections will employ a flipped classroom format, in which students will work on questions that will be good preparation for homework and the midterms. The teaching staff will introduce the questions, assist students in solving them, and wrap up with the solutions. These solutions will be posted. The section cycle “restarts” each Monday, when a new section begins. Each week’s section covers the previous week's Tuesday and Thursday lectures. So for example, the sections from 3/1 to 3/4 cover content from Tuesday, 2/23 and Thursday, 2/25's lectures.
Office Hours
We will be holding a lot of office hours on Zoom. Please make use of these office hours! We have in mind structuring them with different break-out rooms per problem set question or topic of interest.
Zoom Policy
We strongly prefer you participate on Zoom with your camera turned on, unless your circumstances don’t permit.
Tablet
We expect students to have access to a tablet, to help with communication with staff and each other during office hours and sections. We have notified the Office of Undergraduate Education about this so that they can be prepared to help if you do not have access to one.
Textbook
There is no official textbook for the course. There is a set of course notes available here. We should emphasize, though, that these are due to the awesome effort of a past CS181 student who decided to create a course textbook as an (unusually ambitious!) senior thesis. There may still be some bugs, and if you find any, please be a good citizen and put in a pull request.
Course Website
The course website will be used for posting section notes and links to assignments, and includes pointers to other resources we'll use, including Ed and GradeScope.
GradeScope
GradeScope will be used for submitting assignments and posting grades.
Ed
Most communications with the course staff should go via Ed rather than email. In particular, the Ed site for the course will be used for three purposes:
Ed is not a formally secure, private, or confidential form of communication, and what you send may be seen by the entire course staff. If you have a sensitive concern, please also directly email the two co-instructors.
The main grading components of the course are the six homework assignments (10% each), one practical (10%), and two midterms (15% each, in March and April). Participation in section, office hours, Ed, and lecture may be used to bump up the grade of a student who ends up near a letter-grade boundary. Similarly, any bonus component of the course, such as an exceptionally creative practical solution, will only be a factor for students on grade boundaries.
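As a quick sanity check on the weights above, the following Python sketch confirms that the components add up to 100% and shows one way a weighted total could be computed. The helper name `weighted_total` and the example scores are hypothetical, and the sketch ignores the boundary-case adjustments described above. It is an illustration only, not the staff's grading code.

```python
# Component weights as stated in the syllabus (fractions of the final grade).
WEIGHTS = {
    "homework": [0.10] * 6,   # six homework assignments, 10% each
    "practical": [0.10],      # one practical, 10%
    "midterm": [0.15] * 2,    # two midterms, 15% each
}


def weighted_total(scores):
    """Return the weighted course total on a 0-100 scale.

    `scores` maps each component name to a list of percentages (0-100),
    one per item, in the same order as WEIGHTS. Hypothetical helper.
    """
    total = 0.0
    for name, weights in WEIGHTS.items():
        items = scores[name]
        if len(items) != len(weights):
            raise ValueError(f"expected {len(weights)} scores for {name!r}")
        total += sum(w * s for w, s in zip(weights, items))
    return total


# The weights should account for the whole grade.
assert abs(sum(sum(w) for w in WEIGHTS.values()) - 1.0) < 1e-9

# Made-up scores, for illustration only.
example = {
    "homework": [90, 85, 100, 80, 95, 90],
    "practical": [88],
    "midterm": [75, 85],
}
print(round(weighted_total(example), 2))  # 86.8
```

With these made-up scores, homework contributes 54 points, the practical 8.8, and the midterms 24.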
Grading errors
If you believe there has been a grading error, submit a regrade request through GradeScope. However, please note that a) we will regrade the entire assignment, which may result in your total grade going up or down, and b) we will only allow one regrade request per problem set. Regrade requests are due one week after grades are released.
The homework assignments help you practice the core concepts that we cover in the course. They involve components that are theoretical and conceptual and also require some programming. Homework solutions must be submitted in LaTeX and will be returned with grades and solutions. Due to the volume of the grading, it may not always be possible for the staff to provide detailed feedback. It is your responsibility to look at the solutions, identify gaps, and come to office hours to fill in those gaps. We also have one "practical" assignment, which can be done with one other student, and that is more open-ended in nature. You will be asked to explore different machine learning algorithms on a particular data set, with a passing grade for beating some baselines and bonuses for an especially creative or successful approach.
Collaboration Policy
You may work with others, but your write-up must be entirely written by yourself in your own words. You may help each other debug code, but again, the code must be written by you. Include the names of anyone you worked with in your write-up. It is an honor code violation to copy parts of another person's assignment or jointly type up an assignment. You can make use of textbooks and online sources to help in answering questions, but you must cite your sources (and you should be ready to explain your answer to a member of the teaching staff). It is an honor code violation to look up solutions to the specific questions that we ask from the internet or other sources (e.g. friends from previous years).
Late Days Policy
Homework should be submitted electronically on the due date, via the Gradescope course website. This is a strict deadline, enforced by the site, so submit early enough that you don't accidentally discover that your local clock is slow. You have six late days that can be used for homework assignments. Up to two late days can be used on any assignment. Start early and plan ahead! At their discretion, the staff will give 50% credit to assignments turned in past their late days. It is almost always in your interest to turn in partial or late homework rather than not turning in any homework at all. It is an honor code violation to look at the solutions if you haven't yet turned in your assignment.
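To make the late-day arithmetic concrete, here is a minimal Python sketch that checks whether a planned use of late days fits the two limits stated above (six in total, at most two on any one assignment). The helper name `check_late_day_plan` is hypothetical, and the sketch assumes late days are counted in whole days, which the policy does not spell out. It does not model the discretionary 50% credit.

```python
TOTAL_LATE_DAYS = 6        # late-day budget for the semester
MAX_PER_ASSIGNMENT = 2     # cap on late days for any single assignment


def check_late_day_plan(days_late):
    """Return (ok, remaining) for a list of whole days late per homework.

    ok is True when every entry is within the per-assignment cap and the
    total stays within the semester budget. remaining is the unused budget
    (negative if the plan overspends). Hypothetical helper; whole-day
    counting is an assumption.
    """
    if any(d < 0 for d in days_late):
        raise ValueError("days late cannot be negative")
    within_cap = all(d <= MAX_PER_ASSIGNMENT for d in days_late)
    remaining = TOTAL_LATE_DAYS - sum(days_late)
    return within_cap and remaining >= 0, remaining


print(check_late_day_plan([0, 2, 1, 0, 2, 1]))  # (True, 0)
print(check_late_day_plan([3, 0, 0, 0, 0, 0]))  # (False, 3)
print(check_late_day_plan([2, 2, 2, 1, 0, 0]))  # (False, -1)
```

The second plan fails because of the per-assignment cap even though budget remains. The third fails because it would need seven late days in total.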
Sickness (and other Life Events) Policy
In general, we expect you to use your late days when you are sick. At the same time, we understand that sometimes life throws a set of circumstances at you that impacts your performance in the course, and all the more so given the current global pandemic and working and studying conditions. Should this become a problem for you, please let the two co-instructors know, via email, so that we can help determine a plan to navigate a tough situation. If you find that you have used up all your late days, for example, and are ill again, then please reach out to us. Most likely, we will ask you to start a correspondence with your resident dean to verify their support of your extra needs (we would not need a doctor's note, but rather your resident dean would provide us with what we need to know to appropriately adjust).
Midterms are a chance to demonstrate what you have
learned. Midterms will be closed-book, timed (1hr 20 mins), and proctored via Zoom breakout rooms.
We’ll have two time windows to handle different time zones and load balance.
To the extent possible, we will also provide
you with what we think you need to be able to answer the question
without needing to memorize too many things. It is an honor code
violation to communicate with anyone about the midterm while
you take the midterm, and to communicate in any way with other
students. You should also be careful not to share information about the midterm with any students
who need to take a midterm at a different time.
Illness
If you have an acute illness at the time of a
midterm, then you must let the co-instructors know in advance
of the midterm and get a doctor's note and
send it to us as soon as possible.
We will likely also follow up
with your resident dean and determine the best way to handle
the situation.
The goal of the course is to instill a strong technical background for you to robustly, successfully, and responsibly apply machine learning in the world. Thus, in addition to the derivations and the practical components, each class will include some illustrations and discussion of real-world applications of machine learning. There will also be a lecture and part of an assignment devoted to the ethical implications of machine learning as part of the Embedded EthiCS program.
Given the increasing use of machine learning systems, the users and developers of these systems must hold themselves to high professional and ethical standards. One can cause real harm by pursuing a good cause via poor engineering choices. Quoting one of our favorite superheroes: with great power (to run any kind of analysis) comes great responsibility (to do it properly)!
Relatedly, we expect all participants in this course (co-instructors, teaching staff, and students) to be committed to an open, professional, and inclusive environment. We want everyone to be comfortable in the course and empowered to learn. These qualities take cultivation and effort. We welcome constructive feedback on improving the course environment and want you to reach out to the two co-instructors, or members of the teaching staff, with any concerns.