The due date for this codelet is Friday, Sep 19 at 11:59PM.
The aim of this codelet is to build your PyTorch skills via the concrete implementation of parts of linear regression. You should draw on Chapter 3 of the textbook and the PyTorch documentation.
Important: Only use PyTorch and built-in Python. The use of any other library, including the book's d2l library, results in an automatic unsatisfactory grade for this assignment. A core goal of this class is to build your competencies as a machine learning engineer; I want to minimize abstractions from other libraries so that you build these skills.
Your task is to:

1. Download codelet2.zip from the course website and open it. You will find these instructions and codelet2.py, which has scaffolding for you.
2. Complete codelet2.py and include a file called codelet2.pdf with your answers to the written questions.

When assessing your work, satisfactory achievement is demonstrated, in part, by completion of the tasks described below.
You will complete an implementation of linear regression trained using gradient descent. You should build on codelet2.py. You have two tasks.
The basic structure of your model is provided in the class LinearRegression. Your task is to complete this class with any relevant methods. Do not add methods that serve no substantive purpose in your code or that go unused (that is, draw from the book with care). For example, you should not add a plotting method. Note that the loss function is correctly implemented and should not be modified. In your code, you must use PyTorch's Linear layer to handle your model's weights (and to compute your model's output). As noted in the introduction of this codelet, you may only use PyTorch in completing your implementation.
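To make the expected shape concrete, here is a minimal sketch of such a class. This is not the scaffolding's actual code: the constructor argument num_inputs and the attribute name self.linear are assumptions, and the provided loss function is omitted because it must not be modified.

import torch
from torch import nn

class LinearRegression(nn.Module):
    # Sketch only; argument and attribute names are hypothetical.

    def __init__(self, num_inputs):
        super().__init__()
        # nn.Linear owns both the weight vector w and the bias b
        self.linear = nn.Linear(num_inputs, 1)

    def forward(self, X):
        # The model's output is the affine map Xw + b
        return self.linear(X)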
In addition to completing your model, you must write code that returns the error in your model's estimation of w and b. See below for the formatting (it draws heavily on the example provided in the book).
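As a sketch, one way such an error computation might look (again assuming the hypothetical self.linear attribute from above, and that the generating parameters are available as true_w and true_b):

def print_param_errors(model, true_w, true_b):
    # no_grad so the printed tensors are plain values, not autograd nodes
    with torch.no_grad():
        w_hat = model.linear.weight.reshape(true_w.shape)
        print(f'error in estimating w: {true_w - w_hat}')
        print(f'error in estimating b: {true_b - model.linear.bias}')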
To evidence completion of this task, and to check your own progress, please train your model using the train method. The train method should work without any tweaks; do not modify or otherwise change that function. You should add to the pdf accompanying your code a screenshot of the output of your code. Mine, for example, looks as follows:
0 39.86579513549805
10 12.93497085571289
20 2.6193177700042725
error in estimating w: tensor([-0.5257, 0.3273, 0.6399])
error in estimating b: tensor([0.4122])
Your second task is to adapt the train function to (1) implement mini-batch gradient descent in the function minibatch_train, and (2) implement stochastic gradient descent in the function sgd_train.
To help you think through this process, you must copy the code from batch_train to your pdf document and annotate the function, noting what each line does. Make sure to note which line calculates the gradient and which line updates the weights of your model.
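For orientation, a generic full-batch training loop has the shape sketched below. This is not batch_train itself (the names X, y, loss_fn, lr, and num_epochs are assumptions), but the two lines your annotation must identify play the same roles here:

for epoch in range(num_epochs):
    loss = loss_fn(model(X), y)       # forward pass over the entire dataset
    model.zero_grad()                 # clear gradients from the previous epoch
    loss.backward()                   # computes the gradient of the loss
    with torch.no_grad():
        for param in model.parameters():
            param -= lr * param.grad  # updates the weights of the model
    print(epoch, loss.item())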
In implementing mini-batch gradient descent, here are your decision specifications:

- Your function must take a batch_size argument and work for any reasonable batch_size (batch sizes less than the total size of your dataset, for example). You can assume the user only uses reasonable batch_size values (see the sketch below).

In implementing stochastic gradient descent, here are your decision specifications:

- Use a learning rate of 0.001.
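Under the same assumptions as the sketch above, one epoch of mini-batch gradient descent might look like the following; stochastic gradient descent is the special case in which each batch is a single example:

indices = torch.randperm(len(X))               # reshuffle the dataset each epoch
for start in range(0, len(X), batch_size):
    batch = indices[start:start + batch_size]  # rows in this mini-batch
    loss = loss_fn(model(X[batch]), y[batch])
    model.zero_grad()
    loss.backward()                            # gradient over this mini-batch only
    with torch.no_grad():
        for param in model.parameters():
            param -= lr * param.grad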
To evidence completion of this task, and to check your own progress, please train your model using your two train functions and print the error of each of your parameters, as before. You should add to the pdf accompanying your code a screenshot of the output of your code. Mine, for example, looks as follows:
Epoch 0 Total Loss 1479.2000402931476
Epoch 1 Total Loss 0.05384931225699052
Stochastic gradient descent error
error in estimating w: tensor([-6.6924e-04, -3.6597e-05, -5.9795e-04, -9.9182e-05])
error in estimating b: tensor([0.0003])
----------------------------------------------------------------------------------
0 10.669316709041595
1 5.0789901435375215
2 1.4392380893230439
3 0.3736846551299095
4 0.15271076932549477
5 0.05610000006854534
6 0.018682969850488007
7 0.008071314846165478
8 0.003019134764326736
9 0.0008185547601897269
10 0.00028801323351217433
11 0.00015633701550541446
12 8.788212180661503e-05
13 6.127286615082994e-05
14 5.611547276203055e-05
15 5.720405024476349e-05
16 5.878164884052239e-05
17 5.920631556364242e-05
18 5.879780437680893e-05
19 5.825129701406695e-05
Mini-batch gradient descent error
error in estimating w: tensor([ 0.0003, 0.0014, -0.0008, -0.0025])
error in estimating b: tensor([0.0002])
Finally, you should reflect on the performance of the training methods. You should concretely answer the following questions in the pdf accompanying your code:

1. How do the loss and parameter error compare across batch_train, minibatch_train, and sgd_train?
2. Change the learning rate in sgd_train to 5. What happens to your loss and error? Why might this be the case?