Codelet 1: PyTorch Basics with KNN

The due date for this codelet is Friday, September 12 at 11:59PM.

Introduction

This codelet is meant to help you get started working with PyTorch. As you work, you should draw on the textbook (Chapter 2) and the PyTorch documentation.

Your assignment

Your task is to:

  1. Download codelet1.zip from the course website and unzip it. You will find these instructions along with codelet1.py, KNN.py, and utils.py, which contain scaffolding for you.
  2. Complete each of the exercises using the scaffolding provided in those files. Submit your completed utils.py, KNN.py, and codelet1.py, and include a file called codelet1.pdf that contains your answer to the KNN question below.

Grading

When assessing your code, satisfactory achievement is demonstrated, in part, by:

PyTorch Exercises

You should find sufficient detail in the docstrings in codelet1.py to complete the four functions:

  1. makeTensor
  2. reshapeTensor
  3. indexTensor
  4. sumTensor
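
As a warm-up, here is a minimal sketch of the kinds of PyTorch operations these four exercises touch on: creating, reshaping, indexing, and summing a tensor. It uses only standard torch calls. The variable names and values are illustrative assumptions and are not the signatures from codelet1.py, so follow the docstrings there for what each function should actually accept and return.

```python
import torch

# Build a 2x3 tensor from a nested Python list.
x = torch.tensor([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])

# Reshape to 3x2; the row-major element order is preserved.
y = x.reshape(3, 2)

# Index a single element (row 1, column 2) and slice a whole column.
element = x[1, 2]
first_column = x[:, 0]

# Sum over every element, or along one dimension only.
total = x.sum()
row_sums = x.sum(dim=1)

print(y)
print(element, first_column)
print(total, row_sums)
```

Note that indexing and summing return tensors, not plain Python numbers; calling .item() on a single-element tensor gives you the underlying number.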

KNN Implementation

Your task is to complete an implementation of k-nearest neighbors using PyTorch. I’ve provided code to help scaffold your approach in two files, utils.py and KNN.py. To demonstrate completion of this task, you should include in a PDF called codelet1.pdf your answer to the following question:

How does the number of features change the performance of a KNN model?

To answer this question, you should include a concrete reference to model performance, obtained using your code and the provided functions in utils.py (e.g., create_data and results). Put the code that uses your KNN model in the function runKNN().

KNN Organization

The implementation of KNN has been distributed across two files. In the first, utils.py, you will implement a distance measure (euclidean_distance) and the function that determines the predicted label for a data point (mode). The docstrings for these functions include doctest examples to help you test your code.
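
To make the two ideas concrete, here is a rough sketch of a Euclidean distance and a most-frequent-label function written with standard torch calls. The names euclidean_distance_sketch and mode_sketch are hypothetical, and the argument and return types are assumptions; the docstrings in utils.py define what your versions must do, and they may differ from this.

```python
import torch

def euclidean_distance_sketch(a, b):
    """Euclidean (L2) distance between two 1-D tensors of equal length."""
    # Square the element-wise differences, add them up, take the square root.
    return torch.sqrt(torch.sum((a - b) ** 2))

def mode_sketch(labels):
    """Most frequent value in a 1-D tensor of labels."""
    # torch.unique returns the sorted distinct values and how often each occurs.
    values, counts = torch.unique(labels, return_counts=True)
    # Pick the value with the highest count.
    return values[torch.argmax(counts)]

d = euclidean_distance_sketch(torch.tensor([0.0, 0.0]), torch.tensor([3.0, 4.0]))
m = mode_sketch(torch.tensor([1, 0, 1, 2, 1]))
print(d, m)
```

This sketch does nothing special about ties between labels; check the docstring for mode to see what behavior is expected there. One way to run the doctest examples in the provided file is python -m doctest -v utils.py from the folder containing it.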

With these in place, you can then complete the model in KNN.py. We are using the same basic structure as sklearn (which we will use in lab), so this will help you in the rest of the semester. Your aim is to implement one method of the class, predict, which takes a test dataset and assigns a label to each data point. Notice that your distance function is passed to the class as a parameter and stored as an attribute, and that you can call your mode function as utils.mode().

To use your KNN model after finishing the class, make sure you first fit it to the training data by calling its fit method; only then can you call predict on test data.
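
The fit-then-predict workflow can be sketched as follows. Everything here is an assumption made for illustration: the class name KNNSketch, its constructor arguments (k, distance), the helper l2, and the toy data are all hypothetical, and the scaffold in KNN.py may be organized differently. Treat this as one possible shape of the idea, not as the expected solution.

```python
import torch

class KNNSketch:
    """Hypothetical stand-in for the class in KNN.py; the real scaffold may differ."""

    def __init__(self, k, distance):
        self.k = k
        self.distance = distance  # distance function stored as an attribute

    def fit(self, X, y):
        # KNN has no training step beyond remembering the data.
        self.X_train = X
        self.y_train = y
        return self

    def predict(self, X):
        predictions = []
        for point in X:
            # Distance from this test point to every training point.
            dists = torch.stack([self.distance(point, row) for row in self.X_train])
            # Indices of the k smallest distances.
            nearest = torch.topk(dists, self.k, largest=False).indices
            # Majority vote among the labels of those neighbors.
            values, counts = torch.unique(self.y_train[nearest], return_counts=True)
            predictions.append(values[torch.argmax(counts)])
        return torch.stack(predictions)

def l2(a, b):
    # Hypothetical distance helper used only for this sketch.
    return torch.sqrt(torch.sum((a - b) ** 2))

# Toy data: two well-separated clusters with labels 0 and 1.
X_train = torch.tensor([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                        [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]])
y_train = torch.tensor([0, 0, 0, 1, 1, 1])
X_test = torch.tensor([[0.5, 0.5], [5.5, 5.5]])

model = KNNSketch(k=3, distance=l2)
model.fit(X_train, y_train)     # fit first ...
preds = model.predict(X_test)   # ... then predict
print(preds)
```

Calling predict before fit in this sketch would fail because the training data has not been stored yet, which is why the order matters.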