generated from mwc/lab_tic_tac_toe
The computer player now uses a more effective lookahead strategy.

What I changed: I replaced the RandomStrategy class with the LookaheadStrategy class.

Why I changed it: With the classes swapped, the computer player no longer responds randomly; instead it searches ahead through future game states to choose effective moves.

Estimate for remaining time to finish assignment: 1-2 hours depending on peer assistance.
This commit is contained in:
parent b84640e4b3
commit ab309c8be8
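The swap described in the commit message amounts to changing which strategy object the player delegates to. A minimal sketch of the idea follows; only the class names RandomStrategy/LookaheadStrategy and the `choose_action(state)` interface come from this diff, while the toy SubtractGame (take 1 or 2 stones, taking the last stone wins) and all method bodies are illustrative assumptions, not the course's actual implementation:

```python
import random

class SubtractGame:
    """Hypothetical toy game: remove 1 or 2 from a pile; taking the last stone wins."""
    def get_actions(self, pile):
        return [a for a in (1, 2) if a <= pile]

    def next_state(self, pile, action):
        return pile - action

class RandomStrategy:
    """What the commit removes: pick any legal move at random."""
    def __init__(self, game):
        self.game = game

    def choose_action(self, state):
        return random.choice(self.game.get_actions(state))

class LookaheadStrategy:
    """What the commit adds: search future states and pick a winning move."""
    def __init__(self, game):
        self.game = game

    def choose_action(self, state):
        # A move wins if it leaves the opponent in a losing position.
        for action in self.game.get_actions(state):
            if self.losing(self.game.next_state(state, action)):
                return action
        # No winning move exists: play anything legal.
        return self.game.get_actions(state)[0]

    def losing(self, state):
        """A position is losing if every available move leads to a winning one."""
        if state == 0:
            return True  # the previous player took the last stone and won
        return all(
            not self.losing(self.game.next_state(state, a))
            for a in self.game.get_actions(state)
        )
```

On a pile of 4, the lookahead player takes 1 (leaving the losing pile 3), where the random player would take 1 or 2 with equal probability; that difference is the whole point of the commit.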
notes.md | 8
@@ -21,15 +21,17 @@ A player can choose which action to play on a turn from within the TTTHumanPlaye
 For each of the following board states, if you are playing as X
 and it's your turn, which action would you take? Why?
 
-   | O | O |  | O |  X | X |  O |
+   | O | O |  | O x  | X | X |  O |
 ---+---+---  ---+---+---  ---+---+---  ---+---+---
- X | X |  | X | X  | O | O |  |
+ X | X | x  | X | x  X | O | O |  x |
 ---+---+---  ---+---+---  ---+---+---  ---+---+---
   |  |  |  | O |  |  |  |
 
+For the first board, it is simple to choose 6 to win the game. The second board is also 6, but that move is made to prevent O from winning while setting up a potential win. The third board simultaneously blocks O from making meaningful advances while setting up a guaranteed win. The fourth board sets up a potential win that O will be forced to block, leaving X free to create a situation similar to the third board.
 
 ### Initial game state
 
 You can get the initial game state using game.get_initial_state().
 
 What is the current and future reward for this state? What does this mean?
 
+The current and future reward for this state is 0, which implies that the game is as fair as possible: with optimal play, neither X nor O is favored.
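The claim in the notes that the initial state has reward 0 can be checked directly with a small minimax search. This is a self-contained sketch, not the course's LookaheadStrategy: the board is a 9-character string ("-" for empty) and the value is scored from X's perspective, +1 if X can force a win, -1 if O can, 0 for a draw:

```python
from functools import lru_cache

# All eight winning lines on a 3x3 board, as index triples.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return 'X' or 'O' if that mark completes a line, else None."""
    for a, b, c in LINES:
        if board[a] != "-" and board[a] == board[b] == board[c]:
            return board[a]
    return None

@lru_cache(maxsize=None)
def value(board, player):
    """Minimax value from X's perspective, with `player` to move."""
    w = winner(board)
    if w == "X":
        return 1
    if w == "O":
        return -1
    moves = [i for i, cell in enumerate(board) if cell == "-"]
    if not moves:
        return 0  # board full with no winner: a draw
    children = (
        value(board[:i] + player + board[i + 1:],
              "O" if player == "X" else "X")
        for i in moves
    )
    return max(children) if player == "X" else min(children)

# Tic-tac-toe is a draw under optimal play, so the empty board
# has value 0, matching the notes' answer.
print(value("-" * 9, "X"))  # prints 0
```

Memoizing on the (board, player) pair keeps the search small enough to run instantly, since only a few thousand distinct positions are reachable.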
@@ -3,7 +3,7 @@ from ttt.view import TTTView
 from ttt.player import TTTHumanPlayer, TTTComputerPlayer
 
 player0 = TTTHumanPlayer("Player 1")
-player1 = TTTHumanPlayer("Player 2")
+player1 = TTTComputerPlayer("Player 2")
 game = TTTGame()
 view = TTTView(player0, player1)
 
@@ -1,5 +1,5 @@
 from click import Choice, prompt
-from strategy.random_strategy import RandomStrategy
+from strategy.lookahead_strategy import LookaheadStrategy
 from ttt.game import TTTGame
 import random
 
@@ -24,7 +24,7 @@ class TTTComputerPlayer:
     def __init__(self, name):
         "Sets up the player."
         self.name = name
-        self.strategy = RandomStrategy(TTTGame())
+        self.strategy = LookaheadStrategy(TTTGame())
 
     def choose_action(self, state):
         "Chooses a random move from the moves available."