Revisions

2026-06-08 15:15:52 -04:00
parent 1c99ca8bd3
commit 49c4e43f45
12 changed files with 209 additions and 34 deletions
--- a/questions.md
+++ b/questions.md
@@ -22,7 +22,7 @@

 *Your answer:*

-**4. Which digits were classified correctly most often? Which were most often confused for each other? (Use `digits models.handpicked.HandPickedClassifier -a` to see misclassified examples.)**
+**4. Which digits were classified correctly most often? Which were most often confused for each other? (Use `digits models.features.FeatureClassifier -a` to see misclassified examples.)**

 *Your answer:*

@@ -51,9 +51,29 @@

 ---

-## Checkpoint 3: Multi-Layer Perceptron
+## Checkpoint 3: Single Hidden Layer

-**9. Sketch your MLP architecture here (fill in the layer sizes you used):**
+**9. Record your results for at least three combinations of hidden layer size and number of epochs:**
+
+| Hidden size | Epochs | Test accuracy |
+|------------|--------|--------------|
+| | | |
+| | | |
+| | | |
+
+**10. What happened with a very small hidden layer (8 or 16 neurons)? With a very large one (512)? What does that suggest about what the hidden layer is doing?**
+
+*Your answer:*
+
+**11. Each hidden neuron is a learned feature. How does the number of features available to this network compare to the one or two you designed by hand in Checkpoint 1—and how does that help explain the difference in accuracy?**
+
+*Your answer:*
+
+---
+
+## Checkpoint 4: Multi-Layer Perceptron
+
+**12. Sketch your MLP architecture here (fill in the layer sizes you used):**

 ```
 Input layer:    _____ neurons (one per pixel)
@@ -62,18 +82,18 @@ Hidden layer 2: _____ neurons, ReLU activation  [if you used one]
 Output layer:   _____ neurons (one per digit)
 ```

-**10. Record your best results:**
+**13. Record your best results:**

 | Hidden sizes | Epochs | Val accuracy (final) | Test accuracy | F1 (macro) |
 |-------------|--------|---------------------|--------------|------------|
 | | | | | |
 | | | | | |

-**11. Both the MLP and the pixel classifier see the same 784 numbers. What does the MLP do with them that the pixel classifier cannot?**
+**14. Both the MLP and the pixel classifier see the same 784 numbers. What does the MLP do with them that the pixel classifier cannot?**

 *Your answer:*

-**12. The MLP still flattens the image into a vector of 784 numbers before its first layer ever sees it—it has no idea that pixel 0 and pixel 28 are vertical neighbors. Did stacking layers fix the limitation you identified in Checkpoint 2, or just hide it better?**
+**15. The MLP still flattens the image into a vector of 784 numbers before its first layer ever sees it—it has no idea that pixel 0 and pixel 28 are vertical neighbors. Did stacking layers fix the limitation you identified in Checkpoint 2, or just hide it better?**

 *Your answer:*

@@ -81,7 +101,7 @@ Output layer:   _____ neurons (one per digit)

 ## Final Questions

-**13. Sketch the CNN architecture (label each layer with its type and dimensions):**
+**16. Sketch the CNN architecture (label each layer with its type and dimensions):**

 ```
 Input: ___x___x___ (height × width × channels)
@@ -101,13 +121,13 @@ Fully connected: _____ → 10
 Output: 10 class probabilities (softmax)
 ```

-**14. Record your CNN results:**
+**17. Record your CNN results:**

 | Epochs | Val accuracy (final) | Test accuracy | F1 (macro) |
 |--------|---------------------|--------------|------------|
 | | | | |

-**15. Fill in the final comparison table with every classifier you built in this lab:**
+**18. Fill in the final comparison table with every classifier you built in this lab:**

 | Classifier | Hyperparameters | Test accuracy | F1 (macro) | Notes |
 |-----------|----------------|--------------|------------|-------|
@@ -116,14 +136,14 @@ Output: 10 class probabilities (softmax)
 | MLP | hidden= | | | |
 | CNN | | | | |

-**16. Architecture comparison: the MLP and CNN both ultimately process the same 784-pixel images, but the CNN reliably outperforms the MLP. What does the CNN know about images that the MLP does not?**
+**19. Architecture comparison: the MLP and CNN both ultimately process the same 784-pixel images, but the CNN reliably outperforms the MLP. What does the CNN know about images that the MLP does not?**

 *Your answer:*

-**17. Model selection: if you needed to deploy a digit classifier on a device with very limited memory and compute (e.g., a microcontroller), which classifier would you choose, and why? (Consider model size, prediction speed, and accuracy.)**
+**20. Model selection: if you needed to deploy a digit classifier on a device with very limited memory and compute (e.g., a microcontroller), which classifier would you choose, and why? (Consider model size, prediction speed, and accuracy.)**

 *Your answer:*

-**18. Real-world applications: CNNs are used for object detection, face recognition, and medical imaging. What properties of CNNs make them well suited for these applications—and what would have to change to handle images that aren't neatly centered and cropped, the way MNIST's are?**
+**21. Real-world applications: CNNs are used for object detection, face recognition, and medical imaging. What properties of CNNs make them well suited for these applications—and what would have to change to handle images that aren't neatly centered and cropped, the way MNIST's are?**

 *Your answer:*
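
Note on question 15: the statement that pixel 0 and pixel 28 are vertical neighbors follows from row-major flattening of a 28×28 image. The sketch below illustrates that index arithmetic only. It assumes row-major order (the lab's image-loading code is not shown in this diff), and `flat_index` and `grid_position` are hypothetical helper names, not part of the `digits` package.

```python
HEIGHT, WIDTH = 28, 28  # MNIST image size: 28 * 28 = 784 pixels


def flat_index(row, col, width=WIDTH):
    """Hypothetical helper: index of pixel (row, col) after row-major flattening."""
    return row * width + col


def grid_position(index, width=WIDTH):
    """Hypothetical helper: inverse of flat_index, returning (row, col)."""
    return divmod(index, width)


# Build a toy "image" whose pixel values are their own (row, col) coordinates,
# then flatten it row by row, the way an MLP input vector is assumed to be built.
image = [[(row, col) for col in range(WIDTH)] for row in range(HEIGHT)]
flat = [pixel for image_row in image for pixel in image_row]

top_left = flat_index(0, 0)      # first pixel of the first row
directly_below = flat_index(1, 0)  # first pixel of the second row
to_the_right = flat_index(0, 1)    # second pixel of the first row

print(len(flat), top_left, directly_below, to_the_right)
```

Horizontal neighbors stay adjacent in the flat vector, while vertical neighbors end up `WIDTH` positions apart, and nothing in a fully connected layer's input marks that distance as special.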