Agent Journeys (Test)

Name: Colter
Author: Colter

Run hosted agent journeys on Pro or higher, or local BYOK journeys, from the CLI. Ten default personas, three AI model families, actionable results.

TL;DR: Run colter test https://your-store.com to send hosted AI shopping personas through your store and score the experience. Hosted journeys run on Colter's infrastructure, require Pro or higher, and use COLTER_API_KEY or a saved colter auth login session. Use --local for the free BYOK path with your own ANTHROPIC_API_KEY, OPENAI_API_KEY, or GEMINI_API_KEY. Add --json for structured output, --pdf for a report, or --fix to generate follow-up fixes automatically.

colter.test returns structured JSON built for hosted agent journeys, regression checks, and CI jobs.

What Agent Journeys Do

colter test runs real shopping journeys across multiple model families and scores the outcomes. It answers a different question than Check: not just "is the protocol there?" but "does the agent succeed when it tries to use it?"

Personas

Ten personas run by default, with the_comparer available as an opt-in persona. Each persona makes live storefront or protocol requests, and the browser verification persona adds screenshots when the required tool surface is present.

Persona group	Focus
Platform shoppers	Protocol flows, browser flow, mobile flow
Intent shoppers	Security, pricing clarity, data quality, returns, edge cases

Scenarios

Typical scenarios include:

discovery
product info
policy comprehension
checkout readiness
competitive comparison
recommendation
edge cases

Requirements

For hosted journeys:

COLTER_API_KEY=col_live_..., colter auth login, or --api-key
Pro, Agency, or Enterprise plan

For local/BYOK runs:

ANTHROPIC_API_KEY, OPENAI_API_KEY, or GEMINI_API_KEY
No Colter-hosted LLM calls
Cost-controlled defaults unless you widen --models, --personas, or --scenarios
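
As a minimal sketch of how a wrapper script might choose between the two paths, the shell function below checks only the environment variables listed above. The function name pick_journey_mode is hypothetical and not part of the Colter CLI. It cannot see a saved colter auth login session or a key passed with --api-key, so treat it as a starting point rather than a complete check.

```shell
# Hypothetical helper (not part of the Colter CLI): pick a run mode from
# the environment variables named in the requirements above.
# Prints "hosted", "local", or "none"; returns 1 when nothing is set.
pick_journey_mode() {
  if [ -n "${COLTER_API_KEY:-}" ]; then
    # A Colter key is present, so the hosted path is possible
    # (the plan level still has to be Pro or higher).
    echo "hosted"
  elif [ -n "${ANTHROPIC_API_KEY:-}${OPENAI_API_KEY:-}${GEMINI_API_KEY:-}" ]; then
    # At least one provider key is present, so --local can be used.
    echo "local"
  else
    echo "none"
    return 1
  fi
}
```

A script could then run colter test "$url" when the result is hosted and colter test "$url" --local when it is local.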

CLI

colter test <url> [flags]

Common Flags

Flag	Purpose
`--models LIST`	Choose model families as a comma-separated list: `claude`, `gpt`, `gemini`
`--api-key KEY`	Override `COLTER_API_KEY` for this run
`--local`	Run local/BYOK tests with your provider key instead of Colter-hosted LLMs
`--personas LIST`	Filter personas
`--scenarios LIST`	Filter scenarios
`--json`	Structured output
`--parallel N`	Concurrent persona runs
`--timeout DURATION`	Client wait timeout for hosted results
`--budget AMOUNT`	Max spend in USD
`--threshold N`	Exit with code 1 when the final score is below N
`--fix`	Generate fix plans for weak dimensions
`--fix-threshold N`	Cutoff used with `--fix`
`--apply`	Apply the generated fixes after the test (only with `--fix`)
`--dry-run`	Generate fix content without writing
`--pdf`	Create a PDF report
`--pdf-out PATH`	Set the PDF path
`--browser`	Include `browser_shopper` even when `--personas` or `--scenarios` filters are set
`--api-url URL`	Override API base URL

Examples

colter test https://store.example.com

colter test https://store.example.com --models claude,gemini --json

colter test https://store.example.com --threshold 70 --json

colter test https://store.example.com --fix --fix-threshold 75

colter test https://store.example.com --pdf --pdf-out report.pdf

GEMINI_API_KEY=... colter test https://store.example.com --local

GEMINI_API_KEY=... colter query-rank https://store.example.com

GEMINI_API_KEY=... colter ucp-qa https://store.example.com

Output Highlights

The JSON payload includes:

overall score
per-persona results
per-scenario results
per-model scores
recommendations
token and cost totals

CI

colter test exits with code 1 when the final score is below --threshold.

colter test https://store.example.com --threshold 70 --json
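
One way to wire this into a CI job is sketched below. The function name run_journey_gate, its default threshold of 70, and the report filename are assumptions for illustration. The sketch also assumes --json writes the payload to standard output. Only exit code 1 for a below-threshold score is documented here, so any other non-zero status is passed through unchanged.

```shell
# Hypothetical CI helper (not part of the Colter CLI).
# Usage: run_journey_gate <store-url> [min-score] [report-path]
run_journey_gate() {
  store_url="$1"
  min_score="${2:-70}"                     # assumed default for this sketch
  report="${3:-journey-report.json}"       # assumed artifact filename

  # Save the JSON payload (assumed to arrive on stdout) as a build artifact,
  # and let the command's exit status decide whether the gate passes.
  if colter test "$store_url" --threshold "$min_score" --json > "$report"; then
    echo "journey gate passed (threshold $min_score)"
  else
    status=$?
    echo "journey gate failed with exit code $status (threshold $min_score)" >&2
    return "$status"
  fi
}
```

Calling run_journey_gate https://store.example.com 70 as a pipeline step fails the job when the score falls below 70 and still leaves the JSON report on disk for later inspection.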

Recommended Flow

1. Run Check.
2. Run hosted or local agent journeys to see interaction failures.
3. Run Fix on the weak areas.
4. Re-run the journey.
5. Use Lens for live traffic after launch.

Pricing

Agent journeys are:

available on Pro, Agency, and Enterprise plans when run on Colter-hosted infrastructure
available locally through the BYOK commands when you supply your own model provider key