Tic Tac Toe Lab Assessment

Checkpoint 1

Looks good.

Checkpoint 2

Nice! The ranges take advantage of the useful patterns in the indices of row- and column-based wins.
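For reference, the index patterns can be sketched roughly as follows. This is only an illustration, not the submitted solution: it assumes the nine squares are numbered 0 to 8, row by row, and the function name `winning_lines` is hypothetical.

```python
def winning_lines():
    """Return the eight winning index triples on a 3x3 board.

    Assumes squares are numbered 0..8, row by row (an assumption,
    not necessarily the layout used in the lab).
    """
    # Each row starts at 0, 3, or 6 and covers three consecutive indices.
    rows = [list(range(start, start + 3)) for start in range(0, 9, 3)]
    # Each column starts at 0, 1, or 2 and steps by 3.
    cols = [list(range(start, 9, 3)) for start in range(3)]
    # The diagonals step by 4 (0, 4, 8) and by 2 (2, 4, 6).
    diags = [list(range(0, 9, 4)), list(range(2, 7, 2))]
    return rows + cols + diags


for line in winning_lines():
    print(line)
```

Rows are runs of consecutive indices and columns are runs with a step of 3, which is why `range` expresses them so compactly.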

Checkpoint 3

The current and future reward for this state is 0 which tells me there are as many games where x wins as loses and most games end in a draw? Does this mean it's a fair game?

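Almost, but this isn't a probabilistic process. The reward is not an average over all possible games; it assumes each player makes their best available move. Having the initial state's current and future reward be zero means that if everyone plays perfectly, the game will always be a draw.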
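One way to see this is with a small minimax sketch. It is not the lab's code: it assumes a board stored as a 9-character string with "-" for empty squares, a reward of +1 when X wins, -1 when O wins, and 0 for a draw, and the names `winner` and `value` are hypothetical.

```python
from functools import lru_cache

# Winning index triples, assuming squares are numbered 0..8 row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return "X" or "O" if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != "-" and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)
def value(board, player):
    """Reward for X from this state when both sides play perfectly."""
    won = winner(board)
    if won == "X":
        return 1
    if won == "O":
        return -1
    if "-" not in board:
        return 0  # board is full with no winner: a draw
    other = "O" if player == "X" else "X"
    outcomes = [value(board[:i] + player + board[i + 1:], other)
                for i, square in enumerate(board) if square == "-"]
    # X picks the move that maximizes the reward; O picks the minimum.
    return max(outcomes) if player == "X" else min(outcomes)


print(value("-" * 9, "X"))  # prints 0
```

The result comes from taking the max or min over moves, not from counting how many games each side wins, so a value of 0 says that neither player can force a win.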
