generated from mwc/lab_tic_tac_toe
Tic Tac Toe Lab Assessment
Checkpoint 1
Looks good.
Checkpoint 2
Nice! The ranges take advantage of the regular patterns in the indices of row- and column-based wins.
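For reference, here is a minimal sketch of the index patterns this comment refers to. It assumes the board is stored as a flat sequence of nine cells indexed 0 to 8, left to right and top to bottom; that layout and the names `row_wins` and `column_wins` are illustrative assumptions, not necessarily what the lab code uses.

```python
# Assumed layout (hypothetical): a flat board with indices
#   0 1 2
#   3 4 5
#   6 7 8

# Each row is three consecutive indices starting at a multiple of 3.
row_wins = [list(range(3 * r, 3 * r + 3)) for r in range(3)]

# Each column starts at its column number and steps by 3.
column_wins = [list(range(c, 9, 3)) for c in range(3)]

print(row_wins)
print(column_wins)
```

The diagonals do not fit either range pattern as neatly, so they are usually listed by hand.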
Checkpoint 3
The current and future reward for this state is 0 which tells me there are as many games where x wins as loses and most games end in a draw? Does this mean it's a fair game?
Almost, but this isn't a probabilistic process. Having the initial state's current and future reward be zero means that if both players play perfectly, the game will always be a draw.
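One way to see the difference between "as many wins as losses" and "a draw under perfect play" is a small minimax sketch. This is an illustration only, not the lab's implementation: the string board representation, the `winner` and `value` functions, and the reward convention (+1 if X wins, -1 if O wins, 0 for a draw) are all assumptions made for this example.

```python
from functools import lru_cache

# All eight winning lines on an assumed flat 0..8 board.
WIN_LINES = (
    [tuple(range(3 * r, 3 * r + 3)) for r in range(3)]   # rows
    + [tuple(range(c, 9, 3)) for c in range(3)]          # columns
    + [(0, 4, 8), (2, 4, 6)]                             # diagonals
)

def winner(board):
    """Return 'X' or 'O' if that player has three in a line, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != "-" and board[a] == board[b] == board[c]:
            return board[a]
    return None

@lru_cache(maxsize=None)
def value(board, player):
    """Reward of `board` when `player` moves next and both sides play perfectly."""
    w = winner(board)
    if w == "X":
        return 1
    if w == "O":
        return -1
    if "-" not in board:
        return 0  # board is full with no winner: a draw
    next_player = "O" if player == "X" else "X"
    outcomes = [
        value(board[:i] + player + board[i + 1:], next_player)
        for i, cell in enumerate(board)
        if cell == "-"
    ]
    # X picks the best outcome for X; O picks the worst outcome for X.
    return max(outcomes) if player == "X" else min(outcomes)

initial_value = value("-" * 9, "X")
print(initial_value)
```

No averaging over games happens anywhere: each player simply picks their best move, so the value of the empty board describes the single outcome of perfect play rather than a win/loss ratio.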