Checkpoint 3

What I changed
I changed the computer to use the lookahead strategy rather than the random picker strategy.

Why I changed it
It's boring to play against the computer when all it does is pick a random available spot. When the computer
plays intelligently, the game is more interesting and more difficult.

Estimate for remaining time to finish assignment: [30 minutes to an hour]
Thomas Naber 2024-02-25 21:52:47 -05:00
parent 0dbc59d637
commit c2ae97588a
4 changed files with 14 additions and 9 deletions

.DS_Store (new binary file, not shown)


@@ -21,20 +21,24 @@ an array that I'm accessing, and that indexes 0 through 9 correspond to the diff
 board starting on the top left and going across the row, then down to the left of the next row, etc.
 What I do not understand is what state is and why we need to use state["board"][index] to access
 something on the board
 ### TTT Strategy
 For each of the following board states, if you are playing as X
 and it's your turn, which action would you take? Why?
-   | O | O       |   | O       | X |      X | O |
+ 0 | O | O     0 | 1 | O     0 | X | 2     X | O | 2
 ---+---+---   ---+---+---   ---+---+---   ---+---+---
- X | X |         | X |       X | O | O       |   |
+ X | X | 5     3 | X | 5     X | O | O     3 | 4 | 5
 ---+---+---   ---+---+---   ---+---+---   ---+---+---
-   |   |         |   | O       |   |         |   |
+ 6 | 7 | 8     6 | 7 | O     6 | 7 | 8     6 | 7 | 8
+1 - You should play in space 5, since you'll win the game.
+2 - You should play in space 5, since that will block the O player from winning and also give you two in a row.
+3 - You should play in space 0, since that will give you two chances to win on your next move and the other player can only block one of them.
+4 - Play in space 4. This blocks the other player from using the O they placed.
 ### Initial game state
 You can get the initial game state using game.get_initial_state().
 What is the current and future reward for this state? What does this mean?
+I think the current reward is 1, meaning X wins, and the future reward is 0. This means that at the beginning of the game X, who is the first player, is likely to win, but after the computer tests all of its cases, the game is likely to end in a draw. This makes sense to me: it wasn't too difficult to tie with the computer, but it was difficult to beat it.
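
The reasoning in answers 1 to 3 can be checked mechanically. The following is a minimal sketch, not code from this repository: it assumes a board is a plain 9-item list indexed 0 to 8 (left to right, top to bottom) with `None` for an empty space, and the names `LINES`, `winning_moves`, and `threats_after` are hypothetical helpers introduced only for illustration. The three boards are transcribed from the numbered version of the table in the notes.

```python
# Hypothetical helpers, not part of the ttt package.
# A board is a list of 9 items indexed 0-8; None marks an empty space.
LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]


def winning_moves(board, player):
    """Return the empty spaces where `player` would complete a line."""
    moves = set()
    for line in LINES:
        marks = [board[i] for i in line]
        if marks.count(player) == 2 and marks.count(None) == 1:
            moves.add(line[marks.index(None)])
    return sorted(moves)


def threats_after(board, player, move):
    """Return the winning moves `player` would have after playing `move`."""
    after = list(board)
    after[move] = player
    return winning_moves(after, player)


E = None  # empty space
board1 = [E, "O", "O", "X", "X", E, E, E, E]
board2 = [E, E, "O", E, "X", E, E, E, "O"]
board3 = [E, "X", E, "X", "O", "O", E, E, E]

print(winning_moves(board1, "X"))     # [5]    X wins by playing 5
print(winning_moves(board2, "O"))     # [5]    O threatens 5, so X should block there
print(threats_after(board3, "X", 0))  # [2, 6] playing 0 creates two threats at once
```

On board 3, O can block only one of spaces 2 and 6, which is the point answer 3 makes.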


@@ -3,7 +3,7 @@ from ttt.view import TTTView
 from ttt.player import TTTHumanPlayer, TTTComputerPlayer
 player0 = TTTHumanPlayer("Player 1")
-player1 = TTTHumanPlayer("Player 2")
+player1 = TTTComputerPlayer("Player 2")
 game = TTTGame()
 view = TTTView(player0, player1)


@@ -1,5 +1,6 @@
 from click import Choice, prompt
 from strategy.random_strategy import RandomStrategy
+from strategy.lookahead_strategy import LookaheadStrategy
 from ttt.game import TTTGame
 import random
@@ -24,7 +25,7 @@ class TTTComputerPlayer:
     def __init__(self, name):
         "Sets up the player."
         self.name = name
-        self.strategy = RandomStrategy(TTTGame())
+        self.strategy = LookaheadStrategy(TTTGame())
     def choose_action(self, state):
         "Chooses a random move from the moves available."
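
The diff swaps `RandomStrategy` for `LookaheadStrategy`, but the implementation of `LookaheadStrategy` is not shown in this commit. As a rough illustration of what a lookahead player can do, here is a minimal minimax sketch under stated assumptions: the board is a 9-item tuple with `None` for empty spaces, a finished game is worth 1 if X wins, -1 if O wins, and 0 for a draw, and the names `play`, `winner`, `lookahead_value`, and `best_move` are hypothetical and do not come from the `strategy` or `ttt` packages.

```python
from functools import lru_cache

# Hypothetical sketch of lookahead (minimax); not the LookaheadStrategy class.
LINES = (
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
)


def play(board, move, player):
    """Return a new board tuple with `player` placed at `move`."""
    return board[:move] + (player,) + board[move + 1:]


def winner(board):
    """Return "X" or "O" if a line is complete, otherwise None."""
    for a, b, c in LINES:
        if board[a] is not None and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)
def lookahead_value(board, player):
    """Value of `board` with `player` to move: 1 X wins, -1 O wins, 0 draw."""
    won = winner(board)
    if won is not None:
        return 1 if won == "X" else -1
    moves = [i for i, mark in enumerate(board) if mark is None]
    if not moves:
        return 0  # board is full and nobody won
    other = "O" if player == "X" else "X"
    values = [lookahead_value(play(board, m, player), other) for m in moves]
    # X picks the highest value, O picks the lowest.
    return max(values) if player == "X" else min(values)


def best_move(board, player):
    """Return the move with the best looked-ahead value for `player`."""
    other = "O" if player == "X" else "X"
    moves = [i for i, mark in enumerate(board) if mark is None]
    pick = max if player == "X" else min
    return pick(moves, key=lambda m: lookahead_value(play(board, m, player), other))


E = None
EMPTY = (E,) * 9
board2 = (E, E, "O", E, "X", E, E, E, "O")  # second board from the notes

print(lookahead_value(EMPTY, "X"))  # 0: perfect play from the start is a draw
print(best_move(board2, "X"))       # 5: the only move that stops O from winning
```

The cache keeps the search fast because many move orders reach the same position. A value of 0 for the empty board matches the experience described in the notes: tying a lookahead player is achievable, while beating it is not.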