Here is a list of potential topics for each exam. Make sure to read the textbook, review the slides and handouts, and take advantage of office hours.

Midterm Exam 1

  • For a problem, identify
    • What type of learning you would apply (supervised vs. unsupervised)
    • What model is appropriate (linear regression, decision trees, k-nearest neighbors, k-means clustering) and why
    • What type of evaluation makes sense (MAE, MSE, accuracy, precision, recall, F1) and why
  • Data
    • Shape of input in a problem
    • k-fold cross validation
    • batch, mini-batch, stochastic gradient descent
  • Metrics
    • MSE equation
    • precision/recall/F1 equations
    • precision/recall trade-off
    • How to calculate Jaccard Similarity
  • KNN
    • Describe
    • Calculate prediction from data for new point
  • k-means
    • Describe
    • Different initialization schemes
    • Procedure for determining cluster membership for new point
  • Linear regression
    • formula
    • interpretation of formula (e.g., the gender pay gap lab)
    • trade-offs between closed-form vs. gradient descent optimization
  • Decision trees
    • Apply a decision tree
    • Apply the CART algorithm
    • Gini impurity equation and calculation for a decision boundary
  • Recommender Systems
    • Pros and cons to different approaches
    • Predict ratings for user-based and item-based collaborative filtering (both simple average and similarity weighted)
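
As a study aid for the Metrics and Decision trees items above, here is a minimal Python sketch of the standard textbook forms of MSE, precision/recall/F1, Jaccard similarity, and Gini impurity. The function names and the toy inputs are illustrative assumptions, not taken from the course materials; if the slides define a formula or an edge case differently, follow the slides.

```python
from collections import Counter


def mse(y_true, y_pred):
    """Mean squared error: average of squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Returning 0.0 on an empty denominator is an assumed convention.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def jaccard(a, b):
    """Jaccard similarity: |A intersect B| / |A union B|."""
    a, b = set(a), set(b)
    union = a | b
    # Returning 0.0 for two empty sets is an assumed convention.
    return len(a & b) / len(union) if union else 0.0


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def split_gini(left, right):
    """Size-weighted Gini impurity of the two sides of a candidate split."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


if __name__ == "__main__":
    print(mse([1, 2, 3], [1, 2, 5]))
    print(precision_recall_f1([1, 1, 0, 0], [1, 0, 1, 0]))
    print(jaccard({1, 2, 3}, {2, 3, 4}))
    print(gini([0, 0, 1, 1]), split_gini([0, 0], [1, 1]))
```

When comparing candidate splits in a CART-style procedure, the split with the lowest weighted impurity is the one usually chosen; working a few small cases by hand and checking them against these functions is a reasonable way to practice.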