AGENT BRIEFING

ScummBench

Harness for ScummVM games. Play the Monkey Island 1 demo or upload your own game, then drive it through the window.__scumm* API.

Quick start

  1. Open a game — either the pre-baked /game?game=monkey1-demo (Monkey Island 1 demo) or /game to upload your own ScummVM game folder.
  2. Wait for __scummActionsReady() to return true
  3. Read state: __scummRead()
  4. Act: __scummDoSentence({verb, objectA})
  5. Observe: __scummEventsSince(cursor)
  6. Repeat 3-5
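
The loop above can be sketched as two small helpers. This is a minimal illustration, not the harness's own code: the names waitUntilReady, step, and settleMs are hypothetical, the fixed settle delay is an assumption, and the briefing does not state what the first cursor value should be (0 is assumed in the usage note).

```javascript
// Sketch of the quick-start loop. `api` defaults to the page global that
// carries the __scumm* functions; pass a stub when running outside the page.
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Step 2: wait until actions are usable, but give up after timeoutMs.
async function waitUntilReady(api = globalThis, timeoutMs = 8000, pollMs = 100) {
  const start = Date.now();
  while (!api.__scummActionsReady()) {
    if (Date.now() - start > timeoutMs) return false;
    await sleep(pollMs);
  }
  return true;
}

// Steps 3-5: read state, issue one sentence, then collect the events that followed.
// settleMs is an assumed fixed delay, not a documented requirement.
async function step(api, cursor, sentence, settleMs = 500) {
  const state = api.__scummRead();
  api.__scummDoSentence(sentence);
  await sleep(settleMs);
  const { events, cursor: next } = api.__scummEventsSince(cursor);
  return { state, events, cursor: next };
}
```

A possible call sequence is: await waitUntilReady(window), then const r = await step(window, 0, {verb, objectA}), feeding r.cursor into the next step.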

For local dev you can also pre-stage any other game folder via scripts/add-game.sh and load it with /game?game=<id>.

Read API

__scummRead(): Full state snapshot (room, ego, roomObjects, verbs, inventory, actors, dialogChoices).
__scummEventsSince(cursor): Returns {events[], cursor}. Pass cursor back on the next call for incremental reads.
__scummActionsReady(): True when WASM is loaded and actions work.
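
Incremental reads can be wrapped so the cursor is never handled by hand. The sketch below assumes only the documented {events, cursor} return shape; createEventReader is a hypothetical helper name, and the starting cursor of 0 is an assumption because the briefing does not specify an initial value.

```javascript
// Keeps the cursor between calls so each read returns only unseen events.
// startCursor = 0 is an assumption; the briefing does not state the initial value.
function createEventReader(api = globalThis, startCursor = 0) {
  let cursor = startCursor;
  return function readNew() {
    const result = api.__scummEventsSince(cursor);
    cursor = result.cursor; // pass this value back on the next call
    return result.events;
  };
}
```

Calling the returned function after each action yields just the events produced since the previous call.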

Record API — state changes over time

Poll the snapshot at a configurable interval and buffer structural diffs between ticks. Useful for catching transient changes that don't emit events — e.g. an NPC walking, or an object animating after a trigger (step on the wood, the bird flies away).

Use __scummRecordSummary() to read. It returns net-change-per-path across the whole window and drops oscillating paths (SCUMM animates by flipping state between 0 and 1 each tick — dozens of noise rows per second). __scummRecordRead() returns the per-tick log; use it only for forensic replay.

__scummRecordStart({intervalMs?, clear?}): Start polling. Default interval 200ms, min 50ms. Clears prior buffer unless clear:false.
__scummRecordStop(): Stop polling. Entries remain readable.
__scummRecordSummary({includeAnimation?}): Preferred. Returns {windowMs, ticksRecorded, changes, filteredAnimationPaths}. Each change is {path, from, to, ticks, oscillated}. Oscillating paths are dropped unless includeAnimation:true.
__scummRecordRead(sinceIndex?): Per-tick log (verbose). Returns {startedAt, entries, nextIndex, total, running}. Each entry is {dt, diff: [{path, from, to, op?}]} where dt is ms offset from startedAt.
__scummRecordStatus(): Returns {running, intervalMs, entries, startedAt}.
__scummRecordClear(): Drop all buffered entries.
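
A fixed recording window is one way to use these calls together. The sketch below is illustrative only: recordWindow is a hypothetical helper, and the 2000ms default window is an arbitrary choice rather than a recommended value.

```javascript
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Record for a fixed window, stop, and return the net-change summary.
async function recordWindow(api = globalThis, windowMs = 2000, intervalMs = 200) {
  // 50ms is the documented minimum interval, so clamp smaller requests.
  api.__scummRecordStart({ intervalMs: Math.max(50, intervalMs) });
  try {
    await sleep(windowMs);
  } finally {
    api.__scummRecordStop(); // entries remain readable after stop
  }
  return api.__scummRecordSummary({ includeAnimation: false });
}
```

Trigger the action first, then await recordWindow(window) to see what changed while the game reacted.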

Paths are JSON-pointer-style arrays. For the id-keyed top-level arrays (roomObjects, inventory, verbs, dialogChoices, actors) items are matched by id, so segments are {id: N} — e.g. ["roomObjects", {id: 10}, "box", "x"]. op is "add" or "remove" on membership changes in the per-tick log.
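
Because paths mix plain keys with {id: N} segments, a small formatter makes summaries easier to scan. This is a sketch under the path shape described above; formatPath, changesUnder, and the bracketed output format are hypothetical conveniences, not part of the API.

```javascript
// Render a recorder path such as ["roomObjects", {id: 10}, "box", "x"]
// as the string "roomObjects[id=10].box.x".
function formatPath(path) {
  return path
    .map((segment, index) => {
      // Id-keyed segments are objects of the form {id: N}.
      if (segment !== null && typeof segment === "object") return `[id=${segment.id}]`;
      return index === 0 ? String(segment) : `.${segment}`;
    })
    .join("");
}

// Keep only the changes whose path starts at one top-level key, e.g. "actors".
function changesUnder(summary, topLevelKey) {
  return summary.changes.filter((change) => change.path[0] === topLevelKey);
}
```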

Transient messages and spatial motion are never treated as animation. These high-signal paths (message text and positions such as pos.x / pos.y) bypass the oscillation filter and include a seenValues array of every distinct value the path held during the window.

So an NPC that zigzags across the room reports its full pos.x / pos.y trajectory in seenValues, and a message that flashed is reported in full rather than lost as "null → null". Object-level state flips (fog, candles, flames) are NOT on this list — they stay filtered as animation noise.
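
One way to pull out just the high-signal changes is to keep the summary entries that carry a seenValues array. The helper name highSignalChanges is hypothetical, and the sketch assumes seenValues appears directly on each change object, which the description above implies but does not spell out.

```javascript
// Return path + seenValues for every change that bypassed the oscillation filter.
// Assumes seenValues sits directly on the change object (not confirmed by the briefing).
function highSignalChanges(summary) {
  return summary.changes
    .filter((change) => Array.isArray(change.seenValues))
    .map((change) => ({ path: change.path, values: change.seenValues }));
}
```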

Action API

__scummDoSentence({verb, objectA, objectB?}): Preferred. Atomic verb+object, auto-walks ego.
__scummSelectDialog(index): Pick dialog choice (0-indexed into dialogChoices[]).
__scummSkipMessage(): Dismiss current dialog text.
__scummWalkTo(x, y): Walk ego to room coordinates.
__scummClickAt(x, y): Last resort. Click at room coordinates.
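
Actions can be guarded so they are only issued when the game can accept them. This sketch assumes inputLocked and dialogChoices are top-level fields of the __scummRead() snapshot, as they are used elsewhere in this briefing; tryDoSentence and its reason strings are hypothetical.

```javascript
// Issue a sentence only when the engine is ready, input is unlocked, and no
// dialog is open. Returns {ok: true} or {ok: false, reason}.
// Assumes inputLocked and dialogChoices are top-level snapshot fields.
function tryDoSentence(api, sentence) {
  if (!api.__scummActionsReady()) return { ok: false, reason: "not-ready" };
  const state = api.__scummRead();
  if (state.inputLocked) return { ok: false, reason: "input-locked" };
  if (state.dialogChoices && state.dialogChoices.length > 0) {
    return { ok: false, reason: "dialog-open" };
  }
  api.__scummDoSentence(sentence);
  return { ok: true };
}
```

Returning a reason instead of throwing lets the caller decide whether to wait, answer the dialog, or pick a different action.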

Key state fields

Key events

Events are coarse, not exhaustive. The stream covers the major engine-level transitions above and nothing else. It does not emit for: NPC movement along a path, transient flavor messages that auto-dismiss (the flavor text a game shows when you step on something), object animation frames, or per-object state flips. For exact observation of those, use the recorder (__scummRecordSummary).

Patterns

Look at / use an object

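const s = __scummRead();
// Match the verb by substring so the exact label does not matter.
const verb = s.verbs.find(v => v.name.toLowerCase().includes("look"));
const obj = s.roomObjects.find(o => o.name === "poster");
// find() returns undefined on a miss; only act when both were found.
if (verb && obj) __scummDoSentence({ verb: verb.id, objectA: obj.id });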

Conversation

// 1. Talk to NPC: __scummDoSentence({ verb: talkId, objectA: npcId })
// 2. Poll __scummRead() until dialogChoices.length > 0
// 3. Pick: __scummSelectDialog(0)
// 4. Advance text: __scummSkipMessage() when haveMsg > 0
// 5. Repeat until dialogChoices empty and haveMsg === 0
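
The conversation steps can be sketched as one bounded loop that runs after the talk sentence has been issued. This is an illustration under assumptions: runConversation and the pickChoice callback are hypothetical, haveMsg is assumed to be a top-level snapshot field as the steps imply, and the 8000ms limit mirrors the timeout suggested under Rules.

```javascript
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Drive an open conversation until no choices and no message remain.
// pickChoice(choices) is a hypothetical callback returning the index to select.
// Resolves true when the conversation ended, false on the hard timeout.
async function runConversation(api, pickChoice, { timeoutMs = 8000, pollMs = 100 } = {}) {
  const start = Date.now();
  let active = false; // becomes true once a choice or message has been seen
  while (Date.now() - start <= timeoutMs) {
    const s = api.__scummRead();
    if (s.dialogChoices.length > 0) {
      active = true;
      api.__scummSelectDialog(pickChoice(s.dialogChoices));
    } else if (s.haveMsg > 0) {
      active = true;
      api.__scummSkipMessage();
    } else if (active) {
      return true; // dialogChoices empty and haveMsg === 0 after some activity
    }
    await sleep(pollMs);
  }
  return false;
}
```

The active flag keeps the loop from reporting success in the short gap between issuing the talk sentence and the first choice or message appearing.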

Room navigation

const s = __scummRead();
const door = s.roomObjects.find(o => o.name === "door");
const walk = s.verbs.find(v => v.name.toLowerCase().includes("walk"));
// find() returns undefined on a miss; only act when both were found.
if (door && walk) __scummDoSentence({ verb: walk.id, objectA: door.id });
// Wait for the roomEntered event, then re-read state.
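
Waiting for the room change can be done by polling the event stream with a hard timeout. In this sketch waitForEvent is a hypothetical helper, and the caller supplies the matching predicate because the briefing does not show the event object's shape; the type field in the usage note is an assumption.

```javascript
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Poll __scummEventsSince until an event satisfies `matches` or the timeout hits.
// Resolves {event, cursor}; event is null when nothing matched in time.
async function waitForEvent(api, cursor, matches, { timeoutMs = 8000, pollMs = 100 } = {}) {
  const start = Date.now();
  for (;;) {
    const result = api.__scummEventsSince(cursor);
    cursor = result.cursor; // always carry the newest cursor forward
    const hit = result.events.find(matches);
    if (hit) return { event: hit, cursor };
    if (Date.now() - start > timeoutMs) return { event: null, cursor };
    await sleep(pollMs);
  }
}
```

If events expose their name in a type field (an assumption), the navigation wait becomes await waitForEvent(window, cursor, e => e.type === "roomEntered"), followed by a fresh __scummRead().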

Strategy

  1. Use __scummRead() as your primary orientation tool — inspect room objects, actors, verbs, and inventory to understand where you are and what you can do. Do not take screenshots for routine orientation.
  2. Use __scummEventsSince(cursor) to efficiently catch up on what happened after an action — dialog text, room changes, cutscene starts/ends. This is far cheaper than re-reading full state or taking screenshots.
  3. Use screenshots only as a fallback when the API state is ambiguous (e.g. spatial layout unclear, need to visually identify something the state doesn't describe). Screenshots are expensive in tokens.
  4. Plan before acting: read the full state, identify available objects and NPCs, form a goal, then execute. Don't wander blindly.
  5. Collect everything you can. The majority of puzzle solutions involve using inventory items — on objects, on other items, or giving them to NPCs. Pick up anything not nailed down.
  6. Build a mental map of room connections as you explore. Track which exits lead where.
  7. Talk to NPCs to gather information — adventure games progress through conversation and item use.
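
The room map from point 6 can be kept as a small graph. This is a sketch only: createRoomMap and its methods are hypothetical, and rooms and exits are keyed by whatever identifiers you choose to read from the snapshot (the briefing does not specify the shape of the room field).

```javascript
// Minimal room graph. Room and exit identifiers are caller-chosen.
function createRoomMap() {
  const exits = new Map(); // fromRoom -> Map(exitName -> toRoom)
  return {
    // Remember that taking exitName in fromRoom led to toRoom.
    record(fromRoom, exitName, toRoom) {
      if (!exits.has(fromRoom)) exits.set(fromRoom, new Map());
      exits.get(fromRoom).set(exitName, toRoom);
    },
    // Breadth-first search: shortest list of exit names from one room to
    // another, or null when no known route exists.
    route(from, to) {
      const queue = [[from, []]];
      const seen = new Set([from]);
      while (queue.length > 0) {
        const [room, path] = queue.shift();
        if (room === to) return path;
        for (const [exitName, next] of exits.get(room) ?? []) {
          if (!seen.has(next)) {
            seen.add(next);
            queue.push([next, [...path, exitName]]);
          }
        }
      }
      return null;
    },
  };
}
```

Call record() each time a room change is observed, and route() when you need to get back to a room visited earlier.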

Rules

  1. Check __scummActionsReady() before first action.
  2. Check inputLocked before each action.
  3. While dialogChoices is non-empty the conversation is open — only __scummSelectDialog and __scummSkipMessage are allowed. doSentence, walkTo, clickAt, and clickObject will be rejected until a choice is picked.
  4. Prefer doSentence over clickAt.
  5. Use events (not polling state) to detect action results.
  6. Avoid repeating failed actions.
  7. Never write an unbounded polling loop. If you wait on a condition (bird.pos.x > 280 && !inputLocked, haveMsg === 0, etc.) and it never holds, the Promise hangs the entire tool call. Always include a hard timeout (Date.now() - start > 8000) — or skip polling entirely and use a fixed setTimeout + __scummRecordSummary(), which captures everything that happened in the window without waiting on a specific state.
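
Rule 7 can be sketched as a generic bounded wait. The helper name waitFor and the 100ms poll interval are assumptions; the 8000ms default follows the timeout quoted in the rule.

```javascript
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Poll `condition` until it returns true or the hard timeout expires.
// Resolves true on success and false on timeout, so the Promise never hangs.
async function waitFor(condition, { timeoutMs = 8000, pollMs = 100 } = {}) {
  const start = Date.now();
  while (!condition()) {
    if (Date.now() - start > timeoutMs) return false;
    await sleep(pollMs);
  }
  return true;
}
```

For example, await waitFor(() => __scummRead().haveMsg === 0) resolves false after the timeout instead of blocking the tool call, and the caller can then fall back to __scummRecordSummary().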

Machine-readable brief

Same data as JSON in #agent-brief below.
