Forager Training Log

Document each training attempt below. For each attempt, write your hypothesis before you run the experiment, then fill in the evidence and analysis after.

Use retro-gamer info runs/forager/ to see a summary of your run, and cat runs/forager/training.log to see the full log.

Attempt 1

Hypothesis

Before training, predict what will happen with the default configuration. Will the agent learn to find the food? How quickly? What might go wrong?

Your prediction:

Configuration

Copy the relevant sections of runs/forager/config.toml here.

Evidence

Paste the first and last few lines of your training log, and any interesting moments in between.
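
One way to pull out the first and last few lines without scrolling through the whole file is sketched below. It assumes only that runs/forager/training.log is a plain-text file. The helper name head_and_tail and the default of five lines are illustrative choices, not part of retro-gamer.

```python
from pathlib import Path


def head_and_tail(lines, n=5):
    """Return the first n and last n entries of a list of log lines."""
    if len(lines) <= 2 * n:
        # Short log: return everything once rather than repeating lines.
        return lines, []
    return lines[:n], lines[-n:]


if __name__ == "__main__":
    # Path taken from the instructions at the top of this log.
    log_path = Path("runs/forager/training.log")
    if log_path.exists():
        head, tail = head_and_tail(log_path.read_text().splitlines())
        print("\n".join(head))
        if tail:
            print("...")
            print("\n".join(tail))
```

Interesting moments in between still need to be picked out by reading the log yourself.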

Analysis

What happened? How do the numbers — avg_reward, avg_steps, epsilon, avg_loss — tell the story of what the agent learned? Did the result match your prediction?
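
To compare the start and end of training numerically, a sketch like the following may help. It assumes each log line contains name=value pairs such as avg_reward=0.12. The actual format of training.log is not specified here, so check your own log and adjust the pattern if it differs. The names parse_metrics and metric_change are hypothetical helpers, not retro-gamer functions.

```python
import re

# ASSUMPTION: metrics appear as name=value tokens, e.g. "avg_reward=-0.5 epsilon=0.9".
# This format is a guess; adapt the pattern to what training.log really contains.
METRIC_PATTERN = re.compile(r"(\w+)=(-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)")


def parse_metrics(line):
    """Extract name=value pairs from one log line into a dict of floats."""
    return {name: float(value) for name, value in METRIC_PATTERN.findall(line)}


def metric_change(first_line, last_line):
    """Return (last - first) for every metric present in both lines."""
    first = parse_metrics(first_line)
    last = parse_metrics(last_line)
    return {name: last[name] - first[name] for name in first if name in last}
```

Calling metric_change on the first and last lines of the log gives a rough picture of the overall trend: for example, whether avg_reward went up while epsilon went down.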

Attempt 2

Hypothesis

Based on what you observed in Attempt 1, what will you change and why? Predict the outcome.

Configuration

Evidence

Analysis

Attempt 3 (if needed)

Hypothesis

Configuration

Evidence

Analysis

Final analysis

Which attempt produced the best-trained agent? Run retro-gamer play on your best run's checkpoints and describe what the agent does.

Your answer:

Compare two of your attempts. What changed between them, and how did that change affect the training curve?
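
If it is hard to remember exactly what changed between two attempts, a small comparison of the two configurations can help. The sketch below works on already-loaded configuration dictionaries. The function names flatten and config_diff are hypothetical, and no particular config keys are assumed.

```python
def flatten(config, prefix=""):
    """Flatten nested dicts into dotted keys, e.g. {"a": {"b": 1}} -> {"a.b": 1}."""
    flat = {}
    for key, value in config.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def config_diff(old, new):
    """Map each dotted key whose value differs to an (old_value, new_value) pair."""
    old_flat, new_flat = flatten(old), flatten(new)
    return {
        key: (old_flat.get(key), new_flat.get(key))
        for key in sorted(set(old_flat) | set(new_flat))
        if old_flat.get(key) != new_flat.get(key)
    }
```

On Python 3.11 or later, each config.toml can be loaded into a dictionary with tomllib.load on a file opened in binary mode, and the two results passed to config_diff. A key missing from one side shows up with None in that position.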

Your answer:

If you had more time, what would you try next to improve the agent further? Refer to specific hyperparameters or configuration options.

Your answer:
