Codelet 3: Matrix Factorization Recommender System using PyTorch

The due date for this codelet is Sunday, Oct 5 at 11:59PM.

Introduction

The aim of this codelet is to build on your PyTorch skills via the concrete implementation of parts of a matrix factorization recommender system. You should draw on our course notes, Chapter 21.3 of Dive into Deep Learning, and the PyTorch documentation. You may, and likely should, build on your code from codelet2.

Important: Only use PyTorch, sklearn, matplotlib, torchvision, and built-in Python. The use of any other library, including the book's d2l library, results in an automatic unsatisfactory grade for this assignment. A core goal of this class is to build your competencies as a machine learning engineer; I want to minimize abstractions from other libraries so that you build these skills.

Outline

Your assignment

Your task is to:

  1. Download codelet3.zip from the course website and open it. You will find these instructions, codelet3.py which has scaffolding for you, and a folder data which contains the MovieLens data you will use.
  2. Complete each of the 3 broad tasks below in codelet3.py and include a file called codelet3.pdf with your answers to the written questions.

Grading

When assessing your work, satisfactory achievement is demonstrated, in part, by:

Matrix Factorization

You will complete an implementation of matrix factorization for a recommender system trained using gradient descent. You should build on codelet3.py. You have three tasks:

  1. Complete the model class
  2. Implement training
  3. Implement a prediction function

Matrix Factorization Model

No basic class structure is provided; instead, adapt yours from codelet2.py. You must use PyTorch's Embedding layer and Parameter class where appropriate to handle your model's weights (and, later, to compute your model's output). Note that, unlike in codelet2.py, you should not implement your own loss function inside your model class.

To evidence completion of this task, and to check your own progress, please write out the function that your class implements, labeling the parameters with their meanings.
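As one point of reference, a common parameterization predicts a rating as the dot product of user and item latent factors plus bias terms. The sketch below is illustrative only, not the required design; the class name, attribute names, and the inclusion of a global-mean term are my own choices:

```python
import torch
from torch import nn

class MatrixFactorization(nn.Module):
    """Illustrative sketch: latent factors plus user/item biases."""

    def __init__(self, num_users, num_items, k=16):
        super().__init__()
        # Latent factor matrices: P for users, Q for items
        self.P = nn.Embedding(num_users, k)
        self.Q = nn.Embedding(num_items, k)
        # Per-user and per-item bias terms
        self.user_bias = nn.Embedding(num_users, 1)
        self.item_bias = nn.Embedding(num_items, 1)
        # Global mean rating as a learnable Parameter (an assumption here)
        self.mu = nn.Parameter(torch.zeros(1))

    def forward(self, user_id, item_id):
        # r_hat(u, i) = p_u . q_i + b_u + b_i + mu
        dot = (self.P(user_id) * self.Q(item_id)).sum(dim=-1)
        return (dot
                + self.user_bias(user_id).squeeze(-1)
                + self.item_bias(item_id).squeeze(-1)
                + self.mu)
```

Note that the loss is computed outside this class, in the training loop, which is why no loss function appears here.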

Implement Training

You will build on your skills from codelet2.py and implement a train function for your model. In particular, you should implement mini-batch gradient descent. Please note two things:

  1. We are using PyTorch’s DataLoader class, which takes care of batches for you. (I’ve included code that loads the dataset as a dataloader instance in codelet3.py). For example, given a dataloader data, you can loop over batches, separating out the information, as below:
for batch in data:
    user_id, item_id, rating = batch
  2. L2 regularization is handled by the weight_decay argument in PyTorch's optimizers. You specify the weight you want to apply to the magnitudes of the parameters (λ in our class notation). For example,
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, weight_decay=1e-5) 

Your train function should:

  1. Print the average loss per batch each epoch
  2. Run for 20 epochs
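Putting these pieces together, a training loop along the following lines satisfies both requirements. This is a sketch only; the function signature and variable names are illustrative, and it assumes the model is called as model(user_id, item_id) and trained with mean squared error:

```python
import torch
from torch import nn

def train(model, data, num_epochs=20, lr=1e-3, weight_decay=1e-5):
    """Illustrative mini-batch training loop for a rating predictor."""
    loss_fn = nn.MSELoss()
    # weight_decay applies L2 regularization (lambda in our class notation)
    optimizer = torch.optim.SGD(model.parameters(), lr=lr,
                                weight_decay=weight_decay)
    for epoch in range(1, num_epochs + 1):
        total_loss, num_batches = 0.0, 0
        for batch in data:                      # DataLoader yields batches
            user_id, item_id, rating = batch
            optimizer.zero_grad()               # clear old gradients
            pred = model(user_id, item_id)
            loss = loss_fn(pred, rating.float())
            loss.backward()                     # backpropagate
            optimizer.step()                    # update parameters
            total_loss += loss.item()
            num_batches += 1
        print(f"Epoch {epoch} Avg Loss {total_loss / num_batches}")
        print("-" * 80)
```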

To evidence completion of this task, and to check your own progress, please train your model using your train function. You should add to the pdf accompanying your code a screenshot of the output of your code. Mine, for example, looks as follows (showing a subset of the output for space):

Epoch 1 Avg Loss 4.826377685975847
--------------------------------------------------------------------------------
Epoch 2 Avg Loss 4.506148238345658
--------------------------------------------------------------------------------
...
Epoch 19 Avg Loss 3.5349286765882595
--------------------------------------------------------------------------------
Epoch 20 Avg Loss 3.530479103970558
--------------------------------------------------------------------------------


Epoch 1 Avg Loss 17.257477978436132
--------------------------------------------------------------------------------
Epoch 2 Avg Loss 12.937543778134724
--------------------------------------------------------------------------------
...
Epoch 19 Avg Loss 9.94724932710582
--------------------------------------------------------------------------------
Epoch 20 Avg Loss 9.824600744429205
--------------------------------------------------------------------------------

Implement Prediction

Finally, you should complete the function predict, which prints the top k movie recommendations for a given user id. The @torch.no_grad() above the function is a Python decorator that stops PyTorch from tracking gradients (which we don't need when using the model for prediction).
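One way to structure this (a sketch under my own assumptions: the exact signature in codelet3.py may differ, and I take movie_titles to be a list indexed by item id) is to score every movie for the user in one batch and take the k largest scores with torch.topk:

```python
import torch

@torch.no_grad()  # no gradient tracking needed at inference time
def predict(model, user_id, movie_titles, k=10):
    """Illustrative sketch: score every movie for one user, print the top k."""
    item_ids = torch.arange(len(movie_titles))          # all candidate items
    user_ids = torch.full_like(item_ids, user_id)       # repeat the user id
    scores = model(user_ids, item_ids)                  # predicted ratings
    top_scores, top_items = torch.topk(scores, k)       # k best items
    for rank, (i, s) in enumerate(zip(top_items.tolist(),
                                      top_scores.tolist()), start=1):
        print(f"{rank}. {movie_titles[i]}. Predicted rating: {s}")
    return top_items.tolist()                           # returned for convenience
```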

To evidence completion of this task, and to check your own progress, please provide screenshots of the output of your function in the pdf accompanying your code. Mine, for example, looks like:

1. Camp Rock (2008). Predicted rating: 11.521324157714844
2. City of Lost Souls, The (Hyôryuu-gai) (2000). Predicted rating: 10.81906509399414
3. Celebration, The (Festen) (1998). Predicted rating: 9.90432357788086
4. Nightcrawler (2014). Predicted rating: 9.83299446105957
5. Neighbors (1981). Predicted rating: 9.778596878051758
6. Jonah Hex (2010). Predicted rating: 9.653367042541504
7. Conspiracy Theory (1997). Predicted rating: 9.449722290039062
8. Green Mile, The (1999). Predicted rating: 9.151823997497559
9. Dogfight (1991). Predicted rating: 9.054815292358398
10. Black Stallion Returns, The (1983). Predicted rating: 8.991623878479004