| name | description | type |
|---|---|---|
| retro-gamer architecture | Two-package structure, key design decisions, and how the packages interact | project |
retro-gamer trains DQN agents to play games built with the retro-games framework. Two packages are involved:
retro (/Users/chrisp/Repos/MWC/packages/retro) — the game framework, modified:
- `retro/input.py`: `InputSource` protocol, `TerminalInput`, `ProgrammaticInput` (for RL). `ProgrammaticInput.press(key)` queues a keystroke for the next step.
- `retro/views/`: `View` protocol, `TerminalView` (moved from `view.py`), `HeadlessView` (reads the board into a `board_characters` list of lists without terminal output). `view.py` is kept as a compatibility shim.
- `retro/game.py`: `Game.step()` runs one turn (uses `self.input_source`, calls `self.view.render()` if a view is set). `Game.play()` loops over `step()` with its own `TerminalInput`/`TerminalView`. `Game.start()` must be called before the first `step()`.
- `retro/examples/snake.py`: added a `create_game(**kwargs)` factory function that returns an initialized `Game`.
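The `start()`/`press()`/`step()` flow described above can be illustrated with stand-in classes. This is a minimal sketch, not the retro framework's code: `QueuedInput`, `ListView`, `TinyGame`, the `take_keys()` method, the key names, and the argument passed to `render()` are all hypothetical. Only `press(key)`, `step()`, `start()`, `input_source`, `view`, and `board_characters` are names taken from the notes.

```python
from collections import deque


class QueuedInput:
    """Stand-in for ProgrammaticInput: press() queues keys for the next step."""

    def __init__(self):
        self._pending = deque()

    def press(self, key):
        self._pending.append(key)

    def take_keys(self):
        # Hypothetical method name: hand over the queued keys and clear the queue.
        keys = list(self._pending)
        self._pending.clear()
        return keys


class ListView:
    """Stand-in for HeadlessView: keeps the board as a list of lists, no terminal."""

    def __init__(self):
        self.board_characters = []

    def render(self, rows):
        # Assumption: the real render() may take different arguments.
        self.board_characters = [list(row) for row in rows]


class TinyGame:
    """Toy game (hypothetical): an '@' cursor moves along a 1x5 board."""

    WIDTH = 5

    def __init__(self, input_source, view=None):
        self.input_source = input_source
        self.view = view
        self.position = 0
        self.started = False

    def start(self):
        self.position = 0
        self.started = True

    def step(self):
        # One turn: consume queued keys, update state, render if a view is set.
        if not self.started:
            raise RuntimeError("call start() before the first step()")
        for key in self.input_source.take_keys():
            if key == "KEY_RIGHT":
                self.position = min(self.position + 1, self.WIDTH - 1)
            elif key == "KEY_LEFT":
                self.position = max(self.position - 1, 0)
        if self.view is not None:
            row = "".join("@" if x == self.position else "." for x in range(self.WIDTH))
            self.view.render([row])


keys, view = QueuedInput(), ListView()
game = TinyGame(keys, view)
game.start()
keys.press("KEY_RIGHT")  # queued now, consumed by the next step()
game.step()
print(view.board_characters)
```

Because the input source and the view are plain objects handed to the game, an RL environment can drive turns one at a time and read the board back without the framework knowing anything about RL.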
retro-gamer (/Users/chrisp/Repos/MWC/packages/retro-gamer) — the RL toolkit:
- `metadata.py`: `GameMetadata` dataclass (`board_size`, `actions`, `reward`, `character_set`, `spatial`, `observe_state`). TOML load/save via `from_toml()`/`to_toml()`.
- `env.py`: `GameEnvironment(game_factory, metadata)` with `reset()` → `obs` and `step(action)` → `(obs, reward, done)`. Manages `ProgrammaticInput` + `HeadlessView` internally. Reward is the delta of `state[reward_key]`.
- `observation.py`: one-hot encodes the board to an (H, W, C) array; for spatial games transposes to (C, H, W), then flattens; state keys appended. Always returns a flat 1D `np.ndarray`.
- `network.py`: `build_network(metadata, hp)` → `(model, rationale_str)`. `_SpatialNet` uses Conv2d → flatten → MLP; `_FlatNet` uses MLP only. The first C·H·W elements of the flat obs vector are the board (channel-first); the remainder is state.
- `memory.py`: `ReplayMemory` (FIFO deque), `PrioritizedReplayMemory` (alpha/beta sampling).
- `trainer.py`: `DQNTrainer`. Discovers `character_set` if not given. Writes the architecture rationale to `training.log` on init. Saves `config.toml` (merges with the existing file to preserve the `game` section). Checkpoints every 100 episodes + final.
- `cli.py`: `retro-gamer create/train/play/info`. `create` writes `config.toml` with the game module name. `train` loads the config and calls `DQNTrainer`. `play` runs the model against the terminal view, using `ProgrammaticInput` + `HeadlessView` for obs and `TerminalView` for display.
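The observation layout described for `observation.py` and `network.py` can be sketched in NumPy. This is an illustration under stated assumptions, not the package's implementation: the function names `encode_observation` and `split_observation`, their parameters, and the sample board are hypothetical. It shows the one-hot (H, W, C) encoding, the channel-first transpose for spatial games, the appended state values, and how a network could recover the board planes from the first C·H·W elements.

```python
import numpy as np


def encode_observation(board, character_set, state_values=(), spatial=True):
    """One-hot encode a character board and append state values (hypothetical helper).

    board: list of rows, each a list of single characters (H x W).
    character_set: ordered characters; a character's position is its channel.
    Raises KeyError if the board contains a character not in character_set.
    """
    channel_of = {ch: i for i, ch in enumerate(character_set)}
    h, w, c = len(board), len(board[0]), len(character_set)
    one_hot = np.zeros((h, w, c), dtype=np.float32)
    for y, row in enumerate(board):
        for x, ch in enumerate(row):
            one_hot[y, x, channel_of[ch]] = 1.0
    if spatial:
        one_hot = one_hot.transpose(2, 0, 1)  # (H, W, C) -> (C, H, W)
    state = np.asarray(state_values, dtype=np.float32)
    # Always a flat 1D vector: board first, state values after it.
    return np.concatenate([one_hot.ravel(), state])


def split_observation(obs, channels, height, width):
    """Recover channel-first board planes and the trailing state values."""
    board_len = channels * height * width
    planes = obs[:board_len].reshape(channels, height, width)
    return planes, obs[board_len:]


# Hypothetical 2x2 board: empty cells, a snake segment "S", and an apple "*".
board = [[" ", "S"],
         ["*", " "]]
obs = encode_observation(board, [" ", "S", "*"], state_values=[3.0])
planes, state = split_observation(obs, channels=3, height=2, width=2)
print(obs.shape)   # flat vector of length C*H*W + number of state values
print(planes[1])   # plane for the "S" character
```

Keeping the observation flat in both cases means the replay memory and trainer can treat every game the same way; only the spatial network needs to know the (C, H, W) shape to reshape the leading slice.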
Why: Designed as a pedagogical tool for students learning RL. Students specify `GameMetadata` and hyperparameters; the trainer makes architecture decisions and logs its rationale.
How to apply: When adding features, keep RL concepts out of the retro framework (`input.py` and `views/` should not reference RL). When extending the trainer, log all design decisions to `training.log`.