Updates across the board

Chris Proctor
2026-06-22 16:41:31 -04:00
parent 5ca97dc5d0
commit 73624d1a0c
33 changed files with 3104 additions and 643 deletions


@@ -100,12 +100,12 @@ matters.
**Observation design** determines what information is available to the
agent. If you leave a character out of the ``character_set``, the agent
will not distinguish it from empty space. If you include a game-state
variable in ``observe_state``, the agent can see it directly rather than
having to infer it from the board. The consequences of these choices for
what the agent can learn are reasonably predictable, and making and
checking those predictions is exactly the kind of reasoning the tool is
designed to support.
will not distinguish it from empty space. If the game module defines a
``get_state()`` function, the agent also receives those computed values
as part of its observation. The consequences of these choices for what
the agent can learn are reasonably predictable, and making and checking
those predictions is exactly the kind of reasoning the tool is designed
to support.
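A minimal sketch of the observation-design point above, under stated
assumptions: only the names ``character_set`` and ``get_state()`` come
from the text. The helper ``build_observation``, the integer encoding,
and the dictionary layout are hypothetical and are not the tool's actual
API.

```python
def build_observation(board, character_set, get_state=None):
    """Encode a text board as integer codes, plus optional state values.

    Hypothetical helper for illustration only.
    """
    # Code 0 stands for empty space AND for any character that is not
    # listed in character_set, so the agent cannot tell them apart.
    codes = {ch: i + 1 for i, ch in enumerate(character_set)}
    grid = [[codes.get(ch, 0) for ch in row] for row in board]
    observation = {"board": grid}
    # If the game module supplies get_state(), its computed values are
    # passed to the agent alongside the board.
    if get_state is not None:
        observation["state"] = dict(get_state())
    return observation


board = ["#..#", "#@$#"]

# '$' is listed, so it gets its own code.
with_coin = build_observation(board, "#@$")

# '$' is left out, so it collapses to 0, the same code as '.'.
without_coin = build_observation(
    board, "#@", get_state=lambda: {"score": 3}
)
```

Comparing ``with_coin["board"]`` and ``without_coin["board"]`` is one way
to check a prediction about what the agent can and cannot distinguish
before training it.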
**Reward engineering** is the craft of specifying what counts as doing
well in a way the agent can actually optimize. Using score as the reward