Integrating a Trained Model
===========================

Once you have trained a model, you can use it in two ways:

- **PolicyInput**: the model replaces the keyboard, driving an existing
  player-controlled agent. Use this to watch a trained agent play, or to run
  automated evaluations.
- **TrainedPolicy in play_turn**: call ``get_action(game)`` from inside any
  agent's ``play_turn`` to embed the model as an autonomous character (for
  example, a smart enemy) alongside human-controlled or other agents.

Loading a trained model
-----------------------

Both approaches start by creating a :class:`retro_gamer.TrainedPolicy`:

.. code-block:: python

   from retro_gamer import TrainedPolicy

   ai = TrainedPolicy("runs/snake/")

This reads ``config.toml``, rebuilds the network, and loads the latest
checkpoint. To load a specific checkpoint instead:

.. code-block:: python

   ai = TrainedPolicy("runs/snake/", checkpoint="ep_0500")

PolicyInput: model as player
----------------------------

:class:`retro_gamer.PolicyInput` is an input source: it implements the same
interface as keyboard input, but chooses actions using the trained model.
Pass it to ``game.play()`` and everything else works exactly as usual:

.. code-block:: python

   from retro.examples.snake import create_game
   from retro_gamer import TrainedPolicy, PolicyInput

   ai = TrainedPolicy("runs/snake/")
   game = create_game()
   game.play(input_source=PolicyInput(ai, game))

On each turn, ``PolicyInput`` observes the current board and game state, runs
the model, and sends the chosen action to the game exactly as if the player
had pressed that key.

TrainedPolicy in play_turn: model as autonomous character
---------------------------------------------------------

To embed a trained model as an autonomous game character, create a
``TrainedPolicy`` at module level and call ``get_action(game)`` from inside
the agent's ``play_turn``. Placing it at module level means the model is
loaded from disk once, not once per episode.

.. code-block:: python

   from retro.game import Game
   from retro.examples.snake import Apple, SnakeHead
   from retro_gamer import TrainedPolicy

   _ai = TrainedPolicy("runs/snake/")

   class AISnake(SnakeHead):
       def handle_keystroke(self, k, game):
           pass  # ignore keyboard

       def play_turn(self, game):
           # Ask the model for a key, then steer as the keyboard would.
           key = _ai.get_action(game)
           if key == 'KEY_RIGHT':
               self.direction = (1, 0)
           elif key == 'KEY_LEFT':
               self.direction = (-1, 0)
           elif key == 'KEY_UP':
               self.direction = (0, -1)
           elif key == 'KEY_DOWN':
               self.direction = (0, 1)
           super().play_turn(game)

   human_snake = SnakeHead()
   ai_snake = AISnake()
   ai_snake.position = (16, 8)
   apple = Apple()
   game = Game([human_snake, ai_snake, apple], {"score": 0}, board_size=(32, 16))
   apple.relocate(game)
   game.play()

Training an enemy model
~~~~~~~~~~~~~~~~~~~~~~~~

You can use the same training pipeline to produce a model for an enemy agent.
``retro-gamer`` does not care *which* character it is training; it only cares
that it can control one character through the keyboard and read a reward
signal from the game state.

To train an enemy:

1. **Create an enemy-perspective game variant.** Write (or add) a
   ``create_game`` function, in a separate file or alongside your main one,
   where the enemy agent is the keyboard-driven character and the reward key
   in the game state reflects the enemy's objective (for example, a bonus for
   catching the player). The human player can be absent, replaced by a
   random-moving agent, or driven by a ``TrainedPolicy`` once you have a
   trained player model.

   .. code-block:: python

      from retro.game import Game

      # EnemyAgent and RandomPlayer are your own agent classes,
      # defined or imported in this module.

      def create_enemy_training_game():
          enemy = EnemyAgent()     # the character the trainer will control
          player = RandomPlayer()  # a stand-in; no human involved
          game = Game([enemy, player], {'enemy_reward': 0}, board_size=(32, 16))
          return game

2. **Train normally against this variant.**

   .. code-block:: console

      % retro-gamer create --game my_game:create_enemy_training_game \
            --output runs/enemy/
      % retro-gamer train runs/enemy/

3. **Embed the trained model in your main game** using ``get_action``,
   exactly as shown above.

.. note::

   Because ``retro-gamer`` injects actions through the game's global input
   source, *all* keyboard-listening agents in the training game will receive
   the trainer's keystrokes. The cleanest approach is to make the enemy the
   only keyboard-driven character in the training variant; any other
   characters should advance on their own without reading from the keyboard.

Adversarial training
~~~~~~~~~~~~~~~~~~~~~

Once you have separate training runs for the player and the enemy, you can
train them *against each other* iteratively. The idea is simple: train the
player against the current enemy model, then train the enemy against the
updated player model, and repeat. Each side is forced to improve against an
increasingly capable opponent.

The key technique is to load the opponent's model at module level in each
training game variant, so it is loaded from disk once per run rather than
once per episode:

.. code-block:: python

   # enemy_training_game.py
   from retro.game import Game
   from retro_gamer import TrainedPolicy

   # EnemyAgent and AIPlayer are your own agent classes,
   # defined or imported in this module.

   _player = TrainedPolicy("runs/player/")  # loaded once when the module is imported

   def create_game():
       enemy = EnemyAgent()
       player = AIPlayer(_player)  # uses _player.get_action in play_turn
       return Game([enemy, player], {'enemy_reward': 0}, board_size=(32, 16))

You then alternate training runs:

.. code-block:: console

   % retro-gamer train runs/player/   # train player against current enemy
   % retro-gamer train runs/enemy/    # train enemy against updated player
   % retro-gamer train runs/player/   # train player again
   # ...

How many episodes to run before switching is itself a design decision: too
few and neither model has time to adapt; too many and each side overfits to
its current opponent. Watching how the strategies evolve, and asking *why*
each model behaves as it does at each stage, connects directly to concepts in
multi-agent reinforcement learning and adversarial training.

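The alternating runs can also be scripted. The following is a minimal sketch,
not part of ``retro-gamer``: ``alternating_schedule`` and ``run_schedule`` are
hypothetical helper names, and the sketch assumes that each
``retro-gamer train`` invocation continues from the run's latest checkpoint,
as the alternating commands suggest.

.. code-block:: python

   import subprocess
   from typing import Callable, List, Sequence

   def alternating_schedule(run_dirs: Sequence[str], rounds: int) -> List[List[str]]:
       """Build the train commands for `rounds` passes over `run_dirs`, in order."""
       commands = []
       for _ in range(rounds):
           for run_dir in run_dirs:
               # Same invocation as typed by hand: retro-gamer train <run_dir>
               commands.append(["retro-gamer", "train", run_dir])
       return commands

   def run_schedule(commands: Sequence[List[str]],
                    runner: Callable = subprocess.run) -> None:
       """Run each command in turn; check=True stops at the first failure."""
       for command in commands:
           runner(command, check=True)

   # Example: three player/enemy alternations.
   # run_schedule(alternating_schedule(["runs/player/", "runs/enemy/"], rounds=3))

Keeping the schedule as plain data makes it easy to print and inspect before
committing to a long sequence of training runs.
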
Differences between the two approaches
---------------------------------------

.. list-table::
   :header-rows: 1
   :widths: 35 65

   * - ``PolicyInput``
     - ``TrainedPolicy`` in ``play_turn``
   * - Replaces human input for the whole game
     - One autonomous agent among many
   * - Game code is unchanged
     - Agent's ``play_turn`` calls ``get_action``
   * - One model drives all player-controlled agents
     - Each agent instance has its own model
   * - Simpler: just pass to ``game.play()``
     - More flexible: mix human and AI characters
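
To make the per-agent model row concrete, here is a minimal, self-contained
sketch in which each agent holds its own policy object. ``StubPolicy`` and
``PolicyDrivenAgent`` are hypothetical stand-ins written for this
illustration, not classes from ``retro`` or ``retro_gamer``; the only thing
assumed of a real ``TrainedPolicy`` is the ``get_action(game)`` call used
earlier on this page. The key-to-direction table mirrors the ``if``/``elif``
chain in ``AISnake``.

.. code-block:: python

   from typing import Dict, Optional, Tuple

   # Same mapping as the if/elif chain in AISnake.play_turn.
   KEY_DIRECTIONS: Dict[str, Tuple[int, int]] = {
       "KEY_RIGHT": (1, 0),
       "KEY_LEFT": (-1, 0),
       "KEY_UP": (0, -1),
       "KEY_DOWN": (0, 1),
   }

   class StubPolicy:
       """Hypothetical stand-in for a trained policy: always returns one key."""

       def __init__(self, key: Optional[str]):
           self.key = key

       def get_action(self, game) -> Optional[str]:
           return self.key

   class PolicyDrivenAgent:
       """Hypothetical agent that owns its policy instead of sharing a global."""

       def __init__(self, policy, direction: Tuple[int, int] = (1, 0)):
           self.policy = policy
           self.direction = direction

       def play_turn(self, game) -> None:
           key = self.policy.get_action(game)
           # Unrecognised keys leave the current direction unchanged.
           self.direction = KEY_DIRECTIONS.get(key, self.direction)

   # Two agents in the same game, each steered by a different policy.
   chaser = PolicyDrivenAgent(StubPolicy("KEY_UP"))
   guard = PolicyDrivenAgent(StubPolicy("KEY_LEFT"))
   for agent in (chaser, guard):
       agent.play_turn(game=None)

Passing the policy to the constructor is one design choice among several; a
module-level policy, as in the ``AISnake`` example, is simpler when every
instance of a class should share the same model.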