Initial commit

2026-06-22 16:14:58 -04:00
commit 42bc2e7a50
14 changed files with 2049 additions and 0 deletions
--- a/training_log.md
+++ b/training_log.md
@@ -0,0 +1,102 @@
+# Forager Training Log
+
+Document each training attempt below. For each attempt, write your hypothesis
+before you run the experiment, then fill in the evidence and analysis after.
+
+Use `retro-gamer info runs/forager/` to see a summary of your run, and
+`cat runs/forager/training.log` to see the full log.
+
+---
+
+## Attempt 1
+
+### Hypothesis
+
+*Before training, predict what will happen with the default configuration.
+Will the agent learn to find the food? How quickly? What might go wrong?*
+
+*Your prediction:*
+
+### Configuration
+
+*Copy the relevant sections of `runs/forager/config.toml` here.*
+
+```toml
+
+```
+
+### Evidence
+
+*Paste the first and last few lines of your training log, and any interesting
+moments in between.*
+
+```
+
+```
+
+### Analysis
+
+*What happened? How do the numbers (avg_reward, avg_steps, epsilon, avg_loss)
+tell the story of what the agent learned? Did the result match your prediction?*
+
+---
+
+## Attempt 2
+
+### Hypothesis
+
+*Based on what you observed in Attempt 1, what will you change and why?
+Predict the outcome.*
+
+### Configuration
+
+```toml
+
+```
+
+### Evidence
+
+```
+
+```
+
+### Analysis
+
+---
+
+## Attempt 3 (if needed)
+
+### Hypothesis
+
+### Configuration
+
+```toml
+
+```
+
+### Evidence
+
+```
+
+```
+
+### Analysis
+
+---
+
+## Final analysis
+
+**Which attempt produced the best-trained agent? Run `retro-gamer play` on your
+best run's checkpoints and describe what the agent does.**
+
+*Your answer:*
+
+**Compare two of your attempts. What changed between them, and how did that
+change affect the training curve?**
+
+*Your answer:*
+
+**If you had more time, what would you try next to improve the agent further?
+Refer to specific hyperparameters or configuration options.**
+
+*Your answer:*