generated from mwc/lab_tic_tac_toe
Add assessment
This commit is contained in:
parent
c2ae97588a
commit
0661473603
@@ -0,0 +1,33 @@
# Tic Tac Toe Lab Assessment
## Checkpoint 1
Looks good.
## Checkpoint 2
> What I do not understand is what state is and why we need to use
> state["board"][index] to access something on the board

I think you asked a similar question in Discord, so you may already understand
this, but state is just a dict storing what's on the board and whose turn it is.
So `state["board"][index]` looks up the board stored in that dict, and then the
square at `index`. State isn't related to syntax; Python doesn't care about state
at all. Instead, state is a design feature of this program. We intentionally
design `TTTGame` so that it encodes the rules of the game but doesn't know
anything about the particular game(s) actually being played. That information is
stored in state.
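
To make that split concrete, here is a minimal sketch of a rules-only game class. It is not the lab's actual `TTTGame`: the class name `SketchGame`, the `"player"` key, and the `get_next_state` method are assumptions made up for illustration. Only the `"board"` key and `get_initial_state` appear in the lab material quoted here.

```python
# Hypothetical sketch: SketchGame, the "player" key, and get_next_state are
# assumptions for illustration, not the lab's real TTTGame API.
class SketchGame:
    """Encodes the rules, but holds no game in progress."""

    def get_initial_state(self):
        # The state is a plain dict: what's on the board and whose turn it is.
        return {"board": ["-"] * 9, "player": "X"}

    def get_next_state(self, state, index):
        # Return a NEW state; the state passed in is left unchanged.
        if state["board"][index] != "-":
            raise ValueError("square already taken")
        board = list(state["board"])
        board[index] = state["player"]
        next_player = "O" if state["player"] == "X" else "X"
        return {"board": board, "player": next_player}


game = SketchGame()
start = game.get_initial_state()
# One game object can describe two different futures of the same state.
after_center = game.get_next_state(start, 4)
after_corner = game.get_next_state(start, 0)
```

Because the game object stores nothing about a game in progress, `start`, `after_center`, and `after_corner` can all exist at once, which is what a strategy needs in order to compare possible futures.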

The benefit of organizing a program this way is that we can then reason about
multiple future states of the game, for example in the lookahead strategy introduced
later in the lab. We can draw an expanding chain of states out into the future,
and think about which futures are best. To see t
## Checkpoint 3
Actually, the current and future reward (the two are added together) for the initial state is 0.
This means that if both players play perfectly, the game will end in a draw every time.
You can check this using:
```
>>> from ttt.game import TTTGame
>>> from strategy.lookahead_strategy import LookaheadStrategy
>>> g = TTTGame()
>>> s = LookaheadStrategy(g)
>>> s.get_current_and_future_reward(g.get_initial_state())
```
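
If it helps to see why the answer is 0, here is a self-contained sketch of the same idea. It does not use the lab's `LookaheadStrategy`. The board-as-string representation, the function names, and the +1/-1/0 reward scale are assumptions for illustration. In this sketch a reward is only given when the game ends, so the "current" part is 0 for every unfinished position.

```python
from functools import lru_cache

# All eight ways to get three in a row on a 3x3 board indexed 0..8.
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return "X" or "O" if that player has three in a row, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != "-" and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)
def current_and_future_reward(board, player):
    """Reward from X's point of view, assuming both sides play perfectly.

    Assumed scale: +1 if X wins, -1 if O wins, 0 for a draw.
    """
    won = winner(board)
    if won == "X":
        return 1
    if won == "O":
        return -1
    if "-" not in board:
        return 0  # board is full with no winner: a draw
    other = "O" if player == "X" else "X"
    # Look ahead: evaluate every state reachable in one move.
    values = [
        current_and_future_reward(board[:i] + player + board[i + 1:], other)
        for i, square in enumerate(board)
        if square == "-"
    ]
    # X picks the best future for X; O picks the worst future for X.
    return max(values) if player == "X" else min(values)


initial_reward = current_and_future_reward("-" * 9, "X")
print(initial_reward)
```

The board is a 9-character string rather than a dict so that `lru_cache` can remember positions it has already evaluated.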