The due date for this codelet is Wednesday, Feb 12 at 11:59PM.
The aim of this codelet is to build on your PyTorch skills via the concrete implementation of parts of softmax regression. You should draw on Chapter 4 of the textbook and the PyTorch documentation. You may, and likely should, build on your code from codelet2.
Important: Only use PyTorch, sklearn, matplotlib, torchvision, and in-built Python. The use of any other libraries, including the books d2l library results in an automatic unsatisfactory grade for this assignment. A core goal of this class is to build your competencies as a machine learning engineer. I want to minimize abstractions from other libraries so that you build these skills.
Your task is to:
codelet3.zip
from the course website and open
it. You will find these instructions and codelet3.py
which
has scaffolding for you.codelet3.py
and include a file called codelet3.pdf
with your answers to
the written questions.When assessing your work, satisfactory achievement is demonstrated, in part, by:
You will complete an implementation of softmax regression trained
using gradient descent. You should build on codelet3.py
.
You have three tasks:
No basic class structure is provided. You should instead adapt yours
from codelet2.py
. Carefully review the provided train
function when consider your implementation. Like with
codelet2
, you must use PyTorch’s Linear
layer to handle your model’s weights (and compute your model’s output).
Additionally, you must use PyTorch’s Softmax
in creating your output (which should be the probability of each label
for each sample).
To evidence completion of this task, and to check your own progress,
please train your model using the train
method. The
train
method should work without any tweaks. Do not modify
or otherwise change that function. You should add to the pdf
accompanying your code a screenshot of the output of your code. Mine,
for example, looks as follows (skipping parts in the middle):
batch 100 loss: 1.672284483909607
batch 200 loss: 1.1067351633310318
batch 300 loss: 0.9611724829673767
batch 400 loss: 0.8704170614480973
batch 500 loss: 0.8233233571052552
batch 600 loss: 0.7921148812770844
batch 700 loss: 0.7688001942634582
batch 800 loss: 0.734564779996872
batch 900 loss: 0.6996732714772225
Epoch 1 Train Avg Loss 0.929627036304077 Validation Loss -1
--------------------------------------------------------------------------------
batch 100 loss: 0.6923972260951996
batch 200 loss: 0.6773819029331207
batch 300 loss: 0.6719656646251678
batch 400 loss: 0.6548320659995079
batch 500 loss: 0.6613071784377098
batch 600 loss: 0.6577168375253677
batch 700 loss: 0.6382966065406799
batch 800 loss: 0.6172605124115944
batch 900 loss: 0.6231981575489044
Epoch 2 Train Avg Loss 0.6551079480027185 Validation Loss -1
--------------------------------------------------------------------------------
...
...
--------------------------------------------------------------------------------
batch 100 loss: 0.49717589110136035
batch 200 loss: 0.5047877758741379
batch 300 loss: 0.48548407793045045
batch 400 loss: 0.48455618917942045
batch 500 loss: 0.4776347289979458
batch 600 loss: 0.47942177444696427
batch 700 loss: 0.4911013525724411
batch 800 loss: 0.4988816994428635
batch 900 loss: 0.48646573185920716
Epoch 10 Train Avg Loss 0.49039147889664997 Validation Loss -1
--------------------------------------------------------------------------------
Notice that the valid_loss
function returns -1 in the
provided code. Your task is to complete this function. It should meet
the following specifications:
Notice, PyTorch automatically calculates the gradient when generating predictions. You should research how to disable this for a whole function or for a block of code (either way works).
To evidence completion of this task, and to check your own progress,
please train your model using the train
function. You
should add to the pdf accompanying your code a screenshot of the output
of your code. Mine, for example, looks as follows (ignoring the
remaining epochs):
batch 100 loss: 1.6615215480327605
batch 200 loss: 1.099790757894516
batch 300 loss: 0.9497145491838456
batch 400 loss: 0.8745188522338867
batch 500 loss: 0.8291540437936783
batch 600 loss: 0.7907151341438293
batch 700 loss: 0.7683601039648056
batch 800 loss: 0.7313806009292603
batch 900 loss: 0.7262751519680023
Epoch 1 Train Avg Loss 0.928412394119047 Validation Loss 0.7187230858453519
--------------------------------------------------------------------------------
...
To further evaluate your model’s performance, you should complete two things:
To evidence completion of this task, and to check your own progress, please provide screenshots of the output of 1 and 2 in the pdf accompanying your code. Mine, for example, look like:
Initial metrics:
Acc 0.0625 P 0.07361111111111111 R 0.04892156862745098 F1 0.038225108225108224
Final metrics:
Acc 0.78125 P 0.7961111111111111 R 0.7908730158730158 F1 0.7836813186813186
Average probability of correct predictions: 83.2%
Average probability of incorrect predictions: 57.67%
For 3, include images of your plots and a description of any patterns you see or explanation you may have for your model’s mislabeling. Notice, a plotting function has been provided with the class for the data.