A run is a single execution of a scenario against a specific build, on a specific device. When you press “Start”, a run is created — the agent takes control of the device and works through the scenario’s checkpoints. Think of a scenario as the test script and a run as the test session. You can run the same scenario many times, each time against a different build or device configuration.

What You Configure

Before starting a run, you select:
  • Scenario — Which test plan to execute
  • Build — Which version of your game to test (APK, IPA, or web URL)
  • Device — Which device to run on (model, platform)
  • Skills — Which behavioral instructions to give the agent for this run
  • Knowledge — Auto-selected based on the build version (always included)
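The selections above can be pictured as a small configuration record. The following is only an illustrative sketch: `RunConfig`, `BuildKind`, and `build_kind` are hypothetical names invented for this example and are not part of any documented Duzz API. Knowledge is left out on purpose, since it is selected automatically from the build version.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class BuildKind(Enum):
    """The three build formats mentioned above (hypothetical enum)."""
    APK = "apk"
    IPA = "ipa"
    WEB_URL = "web_url"


def build_kind(build: str) -> BuildKind:
    """Classify a build reference as an APK, an IPA, or a web URL."""
    lowered = build.lower()
    if lowered.startswith(("http://", "https://")):
        return BuildKind.WEB_URL
    if lowered.endswith(".apk"):
        return BuildKind.APK
    if lowered.endswith(".ipa"):
        return BuildKind.IPA
    raise ValueError(f"unrecognized build reference: {build!r}")


@dataclass(frozen=True)
class RunConfig:
    """Hypothetical record of what is chosen before starting a run."""
    scenario: str                    # which test plan to execute
    build: str                       # APK, IPA, or web URL
    device_model: str                # which device to run on
    device_platform: str             # platform of that device
    skills: Tuple[str, ...] = ()     # behavioral instructions for the agent
    # No "knowledge" field: it is auto-selected from the build version.


config = RunConfig(
    scenario="Tutorial completion",
    build="game-1.4.2.apk",
    device_model="Pixel 8",
    device_platform="android",
    skills=("skip-ads",),
)
print(build_kind(config.build))
```

Because the record is frozen, re-running with a different build means creating a new `RunConfig`, which mirrors the idea that each run is tied to one specific build and device.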

Run Lifecycle

Every run moves through a clear sequence of stages:

1. Pending

The run has been created but hasn’t started yet.

2. Device Preparing

A remote device is being provisioned and set up. This includes:
  • Installing support services on the device
  • Installing your game build
  • Launching the app
  • Confirming the video stream is working

3. Agent Running

The AI agent is actively playing your game, working through checkpoints. During this stage you can:
  • Watch live — See the device screen in real time
  • Track progress — See which checkpoints have been reached, are in progress, or are pending
  • View agent reasoning — See the agent’s thought process as it decides what to do next

4. Completed

The run has finished. The final status depends on checkpoint results:
  • Passed — All required checkpoints were completed successfully
  • Failed — One or more required checkpoints failed
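The four stages above form a simple forward-only sequence, which can be sketched as a small state machine. This is a minimal illustration, not the product's implementation: `RunStage` and `next_stage` are hypothetical names, and cancellation is deliberately not modelled.

```python
from enum import Enum


class RunStage(Enum):
    """The lifecycle stages listed above, in order (hypothetical enum)."""
    PENDING = "pending"
    DEVICE_PREPARING = "device_preparing"
    AGENT_RUNNING = "agent_running"
    COMPLETED = "completed"


# Each stage has exactly one successor; COMPLETED is terminal.
_NEXT_STAGE = {
    RunStage.PENDING: RunStage.DEVICE_PREPARING,
    RunStage.DEVICE_PREPARING: RunStage.AGENT_RUNNING,
    RunStage.AGENT_RUNNING: RunStage.COMPLETED,
}


def next_stage(stage: RunStage) -> RunStage:
    """Return the stage that follows `stage`, or raise if the run is over."""
    if stage is RunStage.COMPLETED:
        raise ValueError("a completed run has no next stage")
    return _NEXT_STAGE[stage]


# Walk a run from creation to completion.
stage = RunStage.PENDING
visited = [stage]
while stage is not RunStage.COMPLETED:
    stage = next_stage(stage)
    visited.append(stage)
print([s.value for s in visited])
```

The Passed or Failed status is a property of the Completed stage rather than a stage of its own, so it is not part of this enum.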

Pass/Fail Logic

The rule is simple: all checkpoints must pass for the run to pass.
  • If any checkpoint fails (timeout, skip, or agent failure), the run stops and is marked as failed
  • The run result gives you a clear breakdown: how many checkpoints passed, failed, or were skipped
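As a rough sketch of the rule above, the function below walks checkpoint results in order, stops at the first one that did not pass, and reports the breakdown. `CheckpointResult`, `RunSummary`, and `evaluate_run` are hypothetical names for illustration only. How checkpoints after the stopping point are reported is not stated here, so this sketch simply does not count them.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum
from typing import Iterable


class CheckpointResult(Enum):
    PASSED = "passed"
    FAILED = "failed"    # for example a timeout or an agent failure
    SKIPPED = "skipped"


@dataclass(frozen=True)
class RunSummary:
    run_passed: bool
    passed: int
    failed: int
    skipped: int


def evaluate_run(results: Iterable[CheckpointResult]) -> RunSummary:
    """Apply the all-checkpoints-must-pass rule to an ordered result list."""
    counts = Counter()
    for result in results:
        counts[result] += 1
        if result is not CheckpointResult.PASSED:
            break  # the run stops at the first checkpoint that did not pass
    failed = counts[CheckpointResult.FAILED]
    skipped = counts[CheckpointResult.SKIPPED]
    return RunSummary(
        run_passed=(failed == 0 and skipped == 0),
        passed=counts[CheckpointResult.PASSED],
        failed=failed,
        skipped=skipped,
    )


P, F = CheckpointResult.PASSED, CheckpointResult.FAILED
print(evaluate_run([P, P, P]))
print(evaluate_run([P, F, P]))
```

In the second call the third checkpoint is never evaluated, because the run ends as soon as the second one fails.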

Watching a Run in Progress

While the agent is running, you get real-time visibility into:
  • Live video — The device screen, streamed to your browser
  • Checkpoint progress — A checklist showing which milestones have been reached
  • Agent reasoning — The agent’s thought process for each action it takes
  • Action log — Every tap, swipe, and interaction the agent performs

Interactive Device Control

The device screen is more than a video feed: it is fully interactive. You can click, drag, swipe, and type on the device directly from your browser. If you see that the agent is stuck on a screen, you can step in and navigate to where it needs to be. The agent will pick up from there and continue on its own.

Run Artifacts

After a run completes, several artifacts are saved for review:
| Artifact | What it contains |
| --- | --- |
| Video recording | Full recording of the device screen throughout the run |
| Device logs | System and app logs captured from the device |
| Agent reasoning | The agent’s decision-making process: what it saw, what it considered, what it chose to do |
| Action log | Every interaction the agent performed (taps, swipes, text input), with timestamps |
| Checkpoint screenshots | A screenshot captured at the moment each checkpoint was reached or failed |

Playground Mode vs Formal Runs

Duzz offers two ways to run a scenario:
| | Formal Run | Playground Mode |
| --- | --- | --- |
| Purpose | Structured testing with recorded results | Interactive exploration and debugging |
| When agent finishes | Run ends, device shuts down automatically | Agent pauses at the last checkpoint; device stays alive for you to interact with |
| Device behavior | Released immediately after the run | Stays available until you close it (30-minute idle timeout) |
| Checkpoints | Locked: uses the scenario’s checkpoints as-is | Editable: you can add, remove, and reorder checkpoints before and during the session |
| Best for | Regression testing, CI pipelines, formal QA passes | Iterating on scenarios, debugging failures, exploring the game |
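The behavioral differences between the two modes can also be written down as data, which may help when reasoning about what to expect from each. This is an illustrative sketch only: `RunMode`, `ModeBehavior`, and `MODE_BEHAVIOR` are hypothetical names, not product settings.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class RunMode(Enum):
    FORMAL = "formal"
    PLAYGROUND = "playground"


@dataclass(frozen=True)
class ModeBehavior:
    """How a mode behaves, per the comparison above (hypothetical record)."""
    device_released_on_finish: bool       # is the device released when the agent finishes?
    checkpoints_editable: bool            # can checkpoints be changed for this session?
    idle_timeout_minutes: Optional[int]   # idle limit while paused, if any


MODE_BEHAVIOR = {
    RunMode.FORMAL: ModeBehavior(
        device_released_on_finish=True,
        checkpoints_editable=False,
        idle_timeout_minutes=None,
    ),
    RunMode.PLAYGROUND: ModeBehavior(
        device_released_on_finish=False,
        checkpoints_editable=True,
        idle_timeout_minutes=30,
    ),
}

for mode, behavior in MODE_BEHAVIOR.items():
    print(mode.value, behavior)
```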

Cancelling and Re-running

  • Cancel — Stop a run that’s in progress. The device is released and partial results are saved
  • Re-run — Start a new run with the same configuration. Useful after fixing a build or tweaking a scenario

Timeouts

  • Checkpoint timeout — Each checkpoint has its own timeout (default 5 minutes). If the agent can’t reach the goal in time, that checkpoint fails
  • Global timeout — Runs have an overall time limit of 1.5 hours. This prevents runaway sessions from consuming resources indefinitely
  • Pause timeout — In playground mode, if you don’t interact with the paused device for 30 minutes, it shuts down automatically
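The three limits above can be sketched as constants plus a single check. The durations come from this page; everything else is an assumption made for illustration, including the function name `expired_timeout`, the order in which the limits are checked, and the choice to treat a limit as expired once elapsed time reaches it.

```python
from datetime import timedelta
from typing import Optional

CHECKPOINT_TIMEOUT = timedelta(minutes=5)        # default, per checkpoint
GLOBAL_TIMEOUT = timedelta(hours=1, minutes=30)  # overall run limit
PAUSE_TIMEOUT = timedelta(minutes=30)            # playground idle limit


def expired_timeout(
    run_elapsed: timedelta,
    checkpoint_elapsed: timedelta,
    idle_elapsed: Optional[timedelta] = None,
    checkpoint_timeout: timedelta = CHECKPOINT_TIMEOUT,
) -> Optional[str]:
    """Return the name of the first expired limit, or None if none expired.

    `idle_elapsed` is only meaningful for a paused playground session;
    pass None otherwise. The check order here is an assumption.
    """
    if run_elapsed >= GLOBAL_TIMEOUT:
        return "global"
    if idle_elapsed is not None and idle_elapsed >= PAUSE_TIMEOUT:
        return "pause"
    if checkpoint_elapsed >= checkpoint_timeout:
        return "checkpoint"
    return None


print(expired_timeout(timedelta(minutes=20), timedelta(minutes=6)))
print(expired_timeout(timedelta(minutes=20), timedelta(minutes=2)))
```

Passing a custom `checkpoint_timeout` reflects that each checkpoint has its own limit, with 5 minutes as the default.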

Quick Reference

| Concept | What it is |
| --- | --- |
| Run | One execution of a scenario on a build + device |
| Build | A specific version of your game (APK, IPA, or URL) |
| Device | The mobile device or browser the agent plays on |
| Playground | Interactive mode: agent pauses when done, device stays alive |
| Formal Run | Structured mode: run ends cleanly, device released |
| Artifacts | Video, logs, screenshots, and reasoning saved after a run |