retro-gamer/memory/project_architecture.md
Chris Proctor 5ca97dc5d0 Initial commit
2026-05-08 14:07:17 -04:00


name: retro-gamer architecture
description: Two-package structure, key design decisions, and how the packages interact
type: project

retro-gamer trains DQN agents to play games built with the retro-games framework. Two packages are involved:

retro (/Users/chrisp/Repos/MWC/packages/retro) — the game framework, modified:

  • retro/input.py: InputSource protocol, TerminalInput, and ProgrammaticInput (for RL). ProgrammaticInput.press(key) queues a keystroke for the next step.
  • retro/views/: View protocol, TerminalView (moved from view.py), and HeadlessView (reads the board into a board_characters list of lists without terminal output). view.py is kept as a compatibility shim.
  • retro/game.py: Game.step() runs one turn (uses self.input_source and calls self.view.render() if a view is set). Game.play() loops over step() with its own TerminalInput/TerminalView. Game.start() must be called before the first step().
  • retro/examples/snake.py: added a create_game(**kwargs) factory function that returns an initialized Game.
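
The step-driven control flow described above can be sketched with stand-in classes. This is an illustration only, not the retro source: both class bodies are hypothetical stubs that mimic the documented behavior (press(key) queues a keystroke for the next step, start() must precede step(), render() is called only if a view is set). The get_keys() method and the turn and keys_seen attributes are assumed names.

```python
from collections import deque


class ProgrammaticInput:
    """Hypothetical stand-in: queues keys that the next step will consume."""

    def __init__(self):
        self._queue = deque()

    def press(self, key):
        # Queue a keystroke for the next step.
        self._queue.append(key)

    def get_keys(self):
        # Assumed method name: drain and return everything queued so far.
        keys = list(self._queue)
        self._queue.clear()
        return keys


class Game:
    """Hypothetical stand-in for a step-driven game."""

    def __init__(self, input_source, view=None):
        self.input_source = input_source
        self.view = view
        self.started = False
        self.turn = 0
        self.keys_seen = []

    def start(self):
        self.started = True

    def step(self):
        # Run exactly one turn; refuse to run before start().
        if not self.started:
            raise RuntimeError("call start() before the first step()")
        self.keys_seen.extend(self.input_source.get_keys())
        self.turn += 1
        if self.view is not None:
            self.view.render(self)


source = ProgrammaticInput()
game = Game(source)  # no view set, so nothing is rendered
game.start()
for key in ["KEY_RIGHT", "KEY_UP"]:
    source.press(key)  # the agent picks an action...
    game.step()        # ...and the game advances one turn
```

Because the caller owns the loop, an RL environment can read the board and compute a reward between any two steps, which Game.play() does not allow.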

retro-gamer (/Users/chrisp/Repos/MWC/packages/retro-gamer) — the RL toolkit:

  • metadata.py: GameMetadata dataclass (board_size, actions, reward, character_set, spatial, observe_state). TOML load/save via from_toml()/to_toml().
  • env.py: GameEnvironment(game_factory, metadata) with reset()→obs and step(action)→(obs, reward, done). Manages ProgrammaticInput and HeadlessView internally. The reward is the change in state[reward_key].
  • observation.py: one-hot encodes the board to an (H,W,C) array; for spatial games, transposes to (C,H,W) before flattening; state keys are appended. Always returns a flat 1D np.ndarray.
  • network.py: build_network(metadata, hp)→(model, rationale_str). _SpatialNet uses Conv2d→flatten→MLP; _FlatNet uses an MLP only. The first C·H·W elements of the flat observation vector are the board (channel-first); the remainder is state.
  • memory.py: ReplayMemory (FIFO deque) and PrioritizedReplayMemory (alpha/beta sampling).
  • trainer.py: DQNTrainer. Discovers character_set if not given. Writes the architecture rationale to training.log on init. Saves config.toml (merging with any existing file to preserve the game section). Checkpoints every 100 episodes, plus a final checkpoint.
  • cli.py: retro-gamer create/train/play/info. create writes config.toml with the game module name. train loads the config and calls DQNTrainer. play runs the model against a terminal view, using ProgrammaticInput and HeadlessView for observations and TerminalView for display.
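
The observation layout described for observation.py and network.py can be sketched as follows. This is a minimal illustration under stated assumptions, not the package's code: encode_observation and its parameters are hypothetical names, every board character is assumed to be present in character_set, and state values are assumed to be plain numbers.

```python
import numpy as np


def encode_observation(board_characters, character_set, state_values, spatial):
    """One-hot encode a board and append state values; returns a flat 1D array."""
    height = len(board_characters)
    width = len(board_characters[0])
    index = {ch: i for i, ch in enumerate(character_set)}

    # (H, W, C): one channel per known character.
    one_hot = np.zeros((height, width, len(character_set)), dtype=np.float32)
    for row, line in enumerate(board_characters):
        for col, ch in enumerate(line):
            one_hot[row, col, index[ch]] = 1.0

    if spatial:
        # Channel-first layout (C, H, W), the order a Conv2d layer expects.
        one_hot = one_hot.transpose(2, 0, 1)

    state = np.asarray(state_values, dtype=np.float32)
    return np.concatenate([one_hot.reshape(-1), state])


board = [[" ", "@"],
         ["*", " "]]
chars = [" ", "@", "*"]
obs = encode_observation(board, chars, state_values=[3.0], spatial=True)

# Recover the board planes from the first C*H*W elements, as a spatial net would.
c, h, w = len(chars), 2, 2
planes = obs[: c * h * w].reshape(c, h, w)
state = obs[c * h * w:]
```

Keeping the observation flat means the replay memory and the environment interface never need to know whether a game is spatial; only the network has to split the vector back into board planes and state.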

Why: Designed as a pedagogy tool for students learning RL. Students specify GameMetadata and hyperparameters; the trainer makes architecture decisions and logs rationale.

How to apply: When adding features, keep RL concepts out of the retro framework (input.py and views/ should not reference RL). When extending the trainer, log all design decisions to training.log.
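
One way to follow the logging guideline is to have each decision helper return its rationale alongside its choice, mirroring the (model, rationale_str) shape of build_network. The sketch below is an assumption about how that could look, not the trainer's code: choose_architecture and make_training_logger are hypothetical names, and a temporary directory stands in for the run directory.

```python
import logging
import tempfile
from pathlib import Path


def choose_architecture(spatial):
    """Hypothetical decision helper: returns (choice, rationale)."""
    if spatial:
        return "conv", "Board positions matter, so convolution layers are used."
    return "mlp", "Board is not spatial, so a plain MLP is used."


def make_training_logger(run_dir):
    """Create a logger that appends to training.log inside run_dir."""
    logger = logging.getLogger(f"training.{run_dir}")
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(Path(run_dir) / "training.log")
    handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
    logger.addHandler(handler)
    return logger, handler


run_dir = tempfile.mkdtemp()  # stand-in for a real run directory
logger, handler = make_training_logger(run_dir)

# Every decision is logged together with the reason it was made.
choice, rationale = choose_architecture(spatial=True)
logger.info("architecture=%s: %s", choice, rationale)

logger.removeHandler(handler)
handler.close()
log_text = (Path(run_dir) / "training.log").read_text()
```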