Updates across the board
@@ -100,12 +100,12 @@ matters.
 
 **Observation design** determines what information is available to the
 agent. If you leave a character out of the ``character_set``, the agent
-will not distinguish it from empty space. If you include a game-state
-variable in ``observe_state``, the agent can see it directly rather than
-having to infer it from the board. The consequences of these choices for
-what the agent can learn are reasonably predictable—and making and
-checking those predictions is exactly the kind of reasoning the tool is
-designed to support.
+will not distinguish it from empty space. If the game module defines a
+``get_state()`` function, the agent also receives those computed values
+as part of its observation. The consequences of these choices for what
+the agent can learn are reasonably predictable — and making and checking
+those predictions is exactly the kind of reasoning the tool is designed
+to support.
 
 **Reward engineering** is the craft of specifying what counts as doing
 well in a way the agent can actually optimize. Using score as the reward