Initial commit
3
memory/MEMORY.md
Normal file
@@ -0,0 +1,3 @@
# Memory Index
- [Project architecture](project_architecture.md) — Two-package structure (retro + retro-gamer), key design decisions, component responsibilities
26
memory/project_architecture.md
Normal file
@@ -0,0 +1,26 @@
---
name: retro-gamer architecture
description: Two-package structure, key design decisions, and how the packages interact
type: project
---
retro-gamer trains DQN agents to play games built on the retro-games framework. Two packages are involved:
**retro** (`/Users/chrisp/Repos/MWC/packages/retro`) — the game framework, modified:
- `retro/input.py` — `InputSource` protocol, `TerminalInput`, `ProgrammaticInput` (for RL). `ProgrammaticInput.press(key)` queues a keystroke for the next step.
- `retro/views/` — `View` protocol, `TerminalView` (moved from `view.py`), `HeadlessView` (reads board into `board_characters` list-of-lists without terminal output). `view.py` kept as compat shim.
- `retro/game.py` — `Game.step()` runs one turn (uses `self.input_source`, calls `self.view.render()` if set). `Game.play()` loops over `step()` with its own TerminalInput/TerminalView. `Game.start()` must be called before first `step()`.
- `retro/examples/snake.py` — added `create_game(**kwargs)` factory function returning an initialized Game.
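
The stepping contract in the list above can be sketched with stand-in classes. This is a minimal sketch under stated assumptions, not the retro source: only `InputSource`, `ProgrammaticInput.press(key)`, `HeadlessView.board_characters`, `Game.start()`, `Game.step()`, and `view.render()` are named in these notes. The `get_keys()` method, the `StubGame` class, the key names, the board contents, and every constructor signature are hypothetical.

```python
from collections import deque
from typing import List, Optional, Protocol


class InputSource(Protocol):
    """Anything that can report the keys pressed for the current turn."""

    def get_keys(self) -> List[str]:  # hypothetical method name
        ...


class ProgrammaticInput:
    """Input source driven by code (e.g. an RL agent) instead of a terminal."""

    def __init__(self) -> None:
        self._queue = deque()

    def press(self, key: str) -> None:
        # Queue a keystroke to be consumed by the next step.
        self._queue.append(key)

    def get_keys(self) -> List[str]:
        # Drain everything queued since the previous step.
        keys = list(self._queue)
        self._queue.clear()
        return keys


class HeadlessView:
    """Copies the board into a list of lists and prints nothing."""

    def __init__(self) -> None:
        self.board_characters: List[List[str]] = []

    def render(self, board: List[List[str]]) -> None:
        self.board_characters = [row[:] for row in board]


class StubGame:
    """Hypothetical 1x3 game: a marker moves left or right on key presses."""

    def __init__(self, input_source: InputSource,
                 view: Optional[HeadlessView] = None) -> None:
        self.input_source = input_source
        self.view = view
        self.position = 1
        self.started = False

    def start(self) -> None:
        self.started = True

    def step(self) -> None:
        # Run exactly one turn: read input, update state, render if a view is set.
        if not self.started:
            raise RuntimeError("call start() before the first step()")
        for key in self.input_source.get_keys():
            if key == "KEY_LEFT":
                self.position = max(0, self.position - 1)
            elif key == "KEY_RIGHT":
                self.position = min(2, self.position + 1)
        if self.view is not None:
            board = [["@" if i == self.position else "." for i in range(3)]]
            self.view.render(board)


source = ProgrammaticInput()
view = HeadlessView()
game = StubGame(source, view)
game.start()
source.press("KEY_RIGHT")
game.step()
print(view.board_characters)
```

Keeping the input queue and the view as plain collaborators means the game loop itself never needs to know whether a person or an agent is driving it.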
**retro-gamer** (`/Users/chrisp/Repos/MWC/packages/retro-gamer`) — the RL toolkit:
- `metadata.py` — `GameMetadata` dataclass (board_size, actions, reward, character_set, spatial, observe_state). TOML load/save via `from_toml()`/`to_toml()`.
- `env.py` — `GameEnvironment(game_factory, metadata)` with `reset()→obs`, `step(action)→(obs,reward,done)`. Manages `ProgrammaticInput` + `HeadlessView` internally. Reward is the change in `state[reward_key]` between consecutive steps.
- `observation.py` — one-hot encodes board to (H,W,C) array; for spatial games transposes to (C,H,W) then flattens; state keys appended. Always returns flat 1D np.ndarray.
- `network.py` — `build_network(metadata, hp)→(model, rationale_str)`. `_SpatialNet` uses Conv2d→flatten→MLP; `_FlatNet` uses MLP only. The flat obs vector's first C*H*W elements are board (channel-first), remainder is state.
- `memory.py` — `ReplayMemory` (FIFO deque), `PrioritizedReplayMemory` (alpha/beta sampling).
- `trainer.py` — `DQNTrainer`. Discovers character_set if not given. Writes architecture rationale to `training.log` on init. Saves `config.toml` (merges with existing to preserve `game` section). Checkpoints every 100 episodes + final.
- `cli.py` — `retro-gamer create/train/play/info`. `create` writes config.toml with game module name. `train` loads config, calls DQNTrainer. `play` runs model vs terminal view using ProgrammaticInput+HeadlessView for obs and TerminalView for display.
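
The observation layout that `observation.py` and `network.py` share can be illustrated with a small sketch. It uses plain Python lists rather than `np.ndarray` so that it runs without dependencies, and it is an assumption-laden illustration rather than the package's code: the function name `encode_observation`, its parameters, the handling of unknown characters (all zeros), and the non-spatial ordering (a direct flatten of the (H,W,C) encoding) are all hypothetical.

```python
from typing import List, Sequence


def encode_observation(board: Sequence[Sequence[str]],
                       character_set: Sequence[str],
                       state_values: Sequence[float],
                       spatial: bool = True) -> List[float]:
    """One-hot encode a board and append state values as a flat vector."""
    index = {ch: i for i, ch in enumerate(character_set)}
    height, width, channels = len(board), len(board[0]), len(character_set)

    def cell(r: int, c: int, k: int) -> float:
        # 1.0 when the character at (r, c) is the k-th character, else 0.0.
        # Characters missing from character_set encode as all zeros.
        return 1.0 if index.get(board[r][c]) == k else 0.0

    if spatial:
        # Channel-first (C, H, W) order: the first C*H*W elements are the
        # board, one full H x W plane per character.
        flat = [cell(r, c, k)
                for k in range(channels)
                for r in range(height)
                for c in range(width)]
    else:
        # Assumed (H, W, C) order for non-spatial games.
        flat = [cell(r, c, k)
                for r in range(height)
                for c in range(width)
                for k in range(channels)]
    # State values (e.g. a score) follow the board elements.
    return flat + [float(v) for v in state_values]


board = [["@", " "],
         [" ", "*"]]
obs = encode_observation(board, [" ", "@", "*"], [3])
print(len(obs), obs)
```

With this layout a convolutional network can recover the board by reshaping the first C*H*W elements to (C, H, W) and treating the remainder as the state vector.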
**Why:** Designed as a pedagogy tool for students learning RL. Students specify GameMetadata and hyperparameters; the trainer makes architecture decisions and logs rationale.
**How to apply:** When adding features, keep RL concepts out of the retro framework (`input.py` and `views/` should not reference RL). When extending the trainer, log all design decisions to `training.log`.
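
One way to picture that boundary is a framework-side game that exposes only state and a step method, with an RL-side wrapper that derives the reward from outside. The sketch below is hypothetical throughout: `CounterGame`, `DeltaRewardEnv`, the `pending` and `over` attributes, the `"score"` key, and the simplified constructor (a reward key instead of a full `GameMetadata`) are assumptions for illustration, not the actual `GameEnvironment` API.

```python
from typing import Callable, Dict, List, Tuple


class CounterGame:
    """Framework-side stand-in: knows about input and state, not about RL."""

    def __init__(self) -> None:
        self.state: Dict[str, int] = {"score": 0}
        self.pending: List[str] = []
        self.turns = 0
        self.over = False

    def start(self) -> None:
        self.state["score"] = 0
        self.turns = 0
        self.over = False

    def step(self) -> None:
        # Each KEY_UP is worth two points; the game ends after three turns.
        for key in self.pending:
            if key == "KEY_UP":
                self.state["score"] += 2
        self.pending.clear()
        self.turns += 1
        self.over = self.turns >= 3


class DeltaRewardEnv:
    """RL-side wrapper: reward is the change in state[reward_key] per step."""

    def __init__(self, game_factory: Callable[[], CounterGame],
                 reward_key: str) -> None:
        self.game_factory = game_factory
        self.reward_key = reward_key

    def reset(self) -> Dict[str, int]:
        # Build a fresh game and remember the starting value of the reward key.
        self.game = self.game_factory()
        self.game.start()
        self._last = self.game.state[self.reward_key]
        return dict(self.game.state)

    def step(self, action: str) -> Tuple[Dict[str, int], int, bool]:
        self.game.pending.append(action)
        self.game.step()
        current = self.game.state[self.reward_key]
        reward = current - self._last  # delta since the previous step
        self._last = current
        return dict(self.game.state), reward, self.game.over


env = DeltaRewardEnv(CounterGame, "score")
env.reset()
for action in ["KEY_UP", "KEY_DOWN", "KEY_UP"]:
    obs, reward, done = env.step(action)
    print(action, obs, reward, done)
```

Nothing in `CounterGame` mentions rewards or episodes, so the reward definition can change without touching the framework side.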