Playground

Playground mode is an interactive testing environment where you build and refine scenarios step-by-step. Instead of writing a complete scenario upfront and running it, you watch the agent play your game live and evolve the test as you go — adding, editing, and reordering checkpoints in real time. Think of it as a live workshop: the agent plays, you observe, and together you shape the test plan.

Starting a Playground Session

To start a playground session, navigate to a scenario and select Playground. You’ll configure:

Build — Which version of your game to test
Device — A single device to run on
Agent Template — Which agent configuration to use
Skills — Behavioral instructions for the agent (editable before starting)
Knowledge — Game context (can be overridden inline)
Version — Optionally start from a specific scenario version instead of the current draft
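
As a rough illustration, the options above can be pictured as a single configuration record. This is a minimal sketch for reasoning about the settings, not the product's API: the `PlaygroundConfig` class, its field names, and the sample values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PlaygroundConfig:
    """Hypothetical model of the options chosen before clicking Run."""
    build: str                       # which version of the game to test
    device: str                      # a playground session runs on one device
    agent_template: str              # which agent configuration to use
    skills: List[str] = field(default_factory=list)  # editable before starting
    knowledge: Optional[str] = None  # None: no inline override of game context
    version: Optional[str] = None    # None: start from the current draft

    def starting_point(self) -> str:
        # A specific scenario version wins; otherwise the draft is used.
        return self.version if self.version is not None else "current draft"


# Sample values are made up for illustration.
config = PlaygroundConfig(build="1.4.2", device="Pixel 8", agent_template="default")
pinned = PlaygroundConfig(build="1.4.2", device="Pixel 8",
                          agent_template="default", version="v2")
print(config.starting_point(), "|", pinned.starting_point())
```

Only the Version option changes where the session starts; everything else describes how the run is carried out.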

Click Run and you’ll be taken to the live session view.

The Playground Loop

The core workflow is an iterative loop:

1. Watch the Agent

Once the session starts, you see the live device screen streamed to your browser. The agent begins working through the scenario’s checkpoints. You can follow along with:

The live video feed
Checkpoint progress in the sidebar
The agent’s reasoning and actions

2. Pause and Adjust

At any point, you can pause the agent. While paused:

Completed checkpoints are locked — checkpoints the agent has already passed are frozen with their results (timestamp, screenshot, duration). You can’t edit or remove them.
Pending checkpoints are editable — you can modify goals and instructions, reorder them, or remove them entirely.
Add new checkpoints — insert new milestones for the agent to work toward.
Add verifications — attach assertions to any checkpoint to check game behavior.
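
The locking rule above can be sketched in a few lines. This is an illustrative model only: the `Checkpoint` class, the status strings, and the helper functions are assumptions, not the product's data model. It treats any checkpoint that is no longer pending as locked.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Checkpoint:
    goal: str
    status: str = "pending"  # assumed values: "pending", "completed", "failed"

    @property
    def locked(self) -> bool:
        # Anything the agent has already finished with is frozen.
        return self.status != "pending"


def edit_goal(checkpoint: Checkpoint, new_goal: str) -> None:
    if checkpoint.locked:
        raise ValueError("locked checkpoints cannot be edited")
    checkpoint.goal = new_goal


def remove_checkpoint(checkpoints: List[Checkpoint], index: int) -> None:
    if checkpoints[index].locked:
        raise ValueError("locked checkpoints cannot be removed")
    del checkpoints[index]


plan = [
    Checkpoint("Finish tutorial", status="completed"),
    Checkpoint("Open the shop"),
    Checkpoint("Buy a sword"),
]
edit_goal(plan[1], "Open the shop from the main menu")  # pending: allowed
remove_checkpoint(plan, 2)                              # pending: allowed
```

Calling `edit_goal(plan[0], ...)` in this sketch raises `ValueError`, mirroring the rule that completed checkpoints stay frozen with their results.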

3. Resume

When you’re ready, click Resume to send the agent back to work with your updated checkpoints. If you made changes while paused, you’ll be asked how to handle them:

Save & Resume — Apply your checkpoint edits and continue
Discard & Resume — Throw away edits and continue with the previous checkpoints
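
The two choices amount to picking which checkpoint list the agent continues with. The sketch below is a hypothetical model of that decision; the `resume` function and the `"save"` / `"discard"` strings are invented for illustration.

```python
from typing import List


def resume(previous: List[str], edited: List[str], choice: str) -> List[str]:
    """Return the checkpoint list the agent continues with (illustrative only)."""
    if choice == "save":
        return list(edited)      # apply the edits made while paused
    if choice == "discard":
        return list(previous)    # drop the edits, keep the earlier list
    raise ValueError(f"unknown choice: {choice!r}")


previous = ["Finish tutorial", "Open the shop"]
edited = ["Finish tutorial", "Open the shop", "Buy a sword"]

after_save = resume(previous, edited, "save")
after_discard = resume(previous, edited, "discard")
```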

4. Repeat

Continue the watch-pause-adjust-resume cycle until you’re satisfied with the scenario. The agent will keep working through checkpoints as you refine them.

Auto-Sync

You don’t always need to pause. When you edit checkpoints while the agent is running, your changes are automatically synced to the agent after a short delay. The agent picks up the updated instructions on its next decision cycle without interruption.
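
One common way to implement "sync after a short delay" is debouncing: wait until edits have been quiet for a moment, then send only the latest state. Whether the product works this way is an assumption. The `AutoSync` class, its `delay` value, and the explicit `now` timestamps below are hypothetical and exist only to make the idea concrete.

```python
from typing import List, Optional


class AutoSync:
    """Hypothetical debouncer: sends the latest edit once edits go quiet."""

    def __init__(self, delay: float) -> None:
        self.delay = delay                       # quiet period in seconds (assumed)
        self.pending: Optional[List[str]] = None
        self.last_edit_at: float = 0.0
        self.sent: List[List[str]] = []          # stands in for "sent to the agent"

    def edit(self, checkpoints: List[str], now: float) -> None:
        # Each edit replaces the pending state and restarts the quiet period.
        self.pending = list(checkpoints)
        self.last_edit_at = now

    def tick(self, now: float) -> None:
        # Called periodically; flushes once the quiet period has elapsed.
        if self.pending is not None and now - self.last_edit_at >= self.delay:
            self.sent.append(self.pending)
            self.pending = None


sync = AutoSync(delay=2.0)
sync.edit(["Reach main menu"], now=0.0)
sync.edit(["Reach main menu", "Open shop"], now=1.0)
sync.tick(now=2.0)  # only 1.0 s since the last edit: nothing is sent
sync.tick(now=3.0)  # quiet for 2.0 s: the latest edit is sent once
print(sync.sent)    # [['Reach main menu', 'Open shop']]
```

In this model, rapid successive edits collapse into a single update, so the agent would only ever see the most recent checkpoint list.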

Saving to Scenario

When the agent finishes (or you’re happy with the results), click Save to Scenario. This:

Saves checkpoints — Writes the current checkpoint list back to the scenario
Syncs skills — Updates, creates, or links any skills that changed during the session
Promotes verifications — Converts any assertions you added during the session into persistent scenario verifications
Creates a new version — Snapshots the saved state as a new scenario version (e.g., v2, v3)
Links the run — Associates this playground run with the new version
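
The steps above can be pictured as one operation that updates the scenario and snapshots it. The sketch below is a simplified, hypothetical model: the `Scenario` class, the `save_to_scenario` function, the `run_id` value, and the version-label scheme are assumptions, and skill syncing is left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Scenario:
    checkpoints: List[str] = field(default_factory=list)
    verifications: List[str] = field(default_factory=list)
    versions: List[Dict] = field(default_factory=list)


def save_to_scenario(scenario: Scenario, session_checkpoints: List[str],
                     session_assertions: List[str], run_id: str) -> str:
    # 1. Write the current checkpoint list back to the scenario.
    scenario.checkpoints = list(session_checkpoints)
    # 2. Promote session assertions to persistent verifications (no duplicates).
    for assertion in session_assertions:
        if assertion not in scenario.verifications:
            scenario.verifications.append(assertion)
    # 3. Snapshot the saved state as a new version and link the run to it.
    label = f"v{len(scenario.versions) + 1}"
    scenario.versions.append({
        "label": label,
        "checkpoints": list(scenario.checkpoints),
        "verifications": list(scenario.verifications),
        "run_id": run_id,
    })
    return label


scenario = Scenario(checkpoints=["Finish tutorial"], versions=[{"label": "v1"}])
label = save_to_scenario(
    scenario,
    session_checkpoints=["Finish tutorial", "Open the shop"],
    session_assertions=["Gold balance is shown in the shop"],
    run_id="run-001",  # made-up identifier
)
print(label)  # v2
```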

After saving, your scenario is updated and ready for formal runs.

When to Use Playground

Building a new scenario from scratch — Start with a rough idea, watch the agent, and add checkpoints as you discover what matters
Debugging a failing scenario — Run the failing scenario in playground, watch where it goes wrong, and adjust checkpoints on the spot
Exploring a new build — Use playground to poke around a new version of your game and see how the agent handles changes
Refining checkpoints — Watch the agent struggle with a vague goal, then rewrite it with better instructions while the session is still live
Adding verifications — Watch the game behavior during a run and add assertions for things you want to check automatically in future runs

Quick Reference

Concept	What it is
Playground session	An interactive run where you can edit checkpoints live
Locked checkpoint	A completed or failed checkpoint — frozen, can’t be edited
Pending checkpoint	A checkpoint the agent hasn’t reached yet — fully editable
Auto-sync	Checkpoint edits are sent to the agent automatically
Save to Scenario	Writes your playground changes back to the scenario and creates a new version
