Here is a list of potential topics for each exam. Make sure you read the textbook, review the slides and handouts, and take advantage of office hours.
Midterm Exam 1
- For a given problem, identify:
  - What type of learning would you apply (supervised vs. unsupervised)?
  - What model is appropriate (linear regression, decision trees, k-nearest neighbors, k-means clustering) and why?
  - What type of evaluation makes sense (MAE, MSE, accuracy, precision, recall, F1) and why?
- Data
  - Shape of the input for a given problem
  - k-fold cross-validation
  - Batch, mini-batch, and stochastic gradient descent
- Metrics
  - MSE equation
  - Precision, recall, and F1 equations
  - Precision/recall trade-off
  - How to calculate Jaccard similarity
- k-nearest neighbors (KNN)
  - Describe the algorithm
  - Calculate the prediction for a new point from the data
- k-means
  - Describe the algorithm
  - Different initialization schemes
  - Procedure for determining cluster membership for a new point
- Linear regression
  - Formula
  - Interpretation of the formula (e.g., the gender pay gap lab)
  - Trade-offs between closed-form and gradient descent optimization
- Decision trees
  - Apply a decision tree
  - Apply the CART algorithm
  - Gini impurity equation and its calculation for a decision boundary
- Recommender systems
  - Pros and cons of different approaches
  - Predict ratings with user-based and item-based collaborative filtering (both simple average and similarity-weighted)
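
As a study aid, the formulas named in the list above (MSE, precision/recall/F1, Jaccard similarity, and Gini impurity) can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the course's reference implementation: the function names, the binary-label setup, and the conventions for zero denominators and empty inputs are all chosen here for illustration.

```python
from collections import Counter


def mse(y_true, y_pred):
    """Mean squared error: average of squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall, and F1 for one positive class.

    Assumption: a metric with a zero denominator is reported as 0.0.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def jaccard(a, b):
    """Jaccard similarity: |A intersect B| / |A union B|.

    Assumption: two empty sets are treated as identical (similarity 1.0).
    """
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def gini(labels):
    """Gini impurity of one node: 1 - sum over classes of p_k squared."""
    n = len(labels)
    if n == 0:
        return 0.0  # assumption: an empty node counts as pure
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def split_gini(left, right):
    """Size-weighted Gini impurity of a candidate split into two children."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)
```

For example, `gini([0, 0, 1, 1])` is 0.5 for a perfectly mixed two-class node, while `split_gini([0, 0], [1, 1])` is 0.0 because both children are pure. CART-style tree building compares candidate boundaries by this weighted value and prefers the lowest.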