Autonomous skill optimization — your agent mutates skill config, measures P&L, and keeps what works.
Pro feature. Autoresearch requires a Simmer Pro plan. Calls to autoresearch API endpoints from free-plan accounts return HTTP 403.
Autoresearch lets your agent optimize its own trading skills. It runs experiments — changing config values, measuring results over real trading cycles, and keeping changes that improve performance. Think of it as automated A/B testing for your trading strategy.
Replay historical trades against new config thresholds without live execution. Returns simulated P&L in seconds — use this for fast config tuning before committing to live experiments.
Backtest requires trades with signal_data. Skills must pass structured signal data on client.trade() calls (SDK 0.9.17+). All official Simmer skills include signal_data as of March 2026.
| Parameter | Required | Description |
| --- | --- | --- |
| skill_slug | Yes | Skill to backtest |
| config | Yes | Config overrides to test (e.g., {"min_edge": 0.05}) |
| days | No | Days of history to replay (default: 7, max: 30) |
| venue | No | sim or polymarket (default: sim) |
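As a rough illustration of the parameters above, the following sketch assembles and validates a backtest request locally before it is sent. The helper name `build_backtest_params`, the example slug, and the lower bound of 1 day are assumptions made for illustration; only the parameter names, defaults, and the 30-day maximum come from the table.

```python
from typing import Any

ALLOWED_VENUES = {"sim", "polymarket"}
DEFAULT_DAYS = 7
MAX_DAYS = 30


def build_backtest_params(
    skill_slug: str,
    config: dict[str, Any],
    days: int = DEFAULT_DAYS,
    venue: str = "sim",
) -> dict[str, Any]:
    """Hypothetical helper: apply documented defaults and check documented limits."""
    if not skill_slug:
        raise ValueError("skill_slug is required")
    if not config:
        raise ValueError("config is required")
    # The lower bound of 1 is an assumption; the docs only state the maximum.
    if not 1 <= days <= MAX_DAYS:
        raise ValueError(f"days must be between 1 and {MAX_DAYS}")
    if venue not in ALLOWED_VENUES:
        raise ValueError(f"venue must be one of {sorted(ALLOWED_VENUES)}")
    return {"skill_slug": skill_slug, "config": config, "days": days, "venue": venue}


# "my-skill" is a placeholder slug, not a real Simmer skill.
params = build_backtest_params("my-skill", {"min_edge": 0.05}, days=14)
```

Validating on the client side like this is optional; it simply surfaces an out-of-range `days` or an unknown `venue` before the request is made.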
Config threshold convention:
min_edge: 0.05 → only include trades where signal_data.edge >= 0.05
max_probability: 0.85 → only include trades where signal_data.probability <= 0.85
Bare keys (e.g., edge: 0.10) → treated as a minimum threshold, equivalent to min_edge: 0.10
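The threshold convention above can be sketched as a small filter over recorded trades. This is an illustrative reimplementation, not the server's code: the trade record shape (a dict with a `signal_data` key) and the choice to exclude trades whose signal field is missing or non-numeric are assumptions.

```python
from typing import Any


def passes_thresholds(signal_data: dict[str, Any], config: dict[str, float]) -> bool:
    """Return True if signal_data satisfies every threshold in config."""
    for key, threshold in config.items():
        if key.startswith("min_"):
            field, is_min = key[len("min_"):], True
        elif key.startswith("max_"):
            field, is_min = key[len("max_"):], False
        else:
            # Bare keys are treated as minimum thresholds.
            field, is_min = key, True

        value = signal_data.get(field)
        # Assumption: a missing or non-numeric field excludes the trade.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return False
        if is_min and value < threshold:
            return False
        if not is_min and value > threshold:
            return False
    return True


def filter_trades(trades: list[dict[str, Any]], config: dict[str, float]) -> list[dict[str, Any]]:
    """Keep only the trades whose signal_data passes the config thresholds."""
    return [t for t in trades if passes_thresholds(t.get("signal_data", {}), config)]


# Hypothetical trade records for illustration only.
sample_trades = [
    {"id": "a", "signal_data": {"edge": 0.04, "probability": 0.60}},
    {"id": "b", "signal_data": {"edge": 0.07, "probability": 0.90}},
    {"id": "c", "signal_data": {"edge": 0.08, "probability": 0.70}},
]
kept = filter_trades(sample_trades, {"min_edge": 0.05, "max_probability": 0.85})
```

With these sample records, trade `a` fails the edge floor and trade `b` fails the probability ceiling, so only trade `c` remains.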
Skills can include structured signal data on each trade to enable backtest replay. This is optional, and trades work fine without it, but it is required for the backtest_experiment tool.
Additional skill-specific fields are freeform. Values must be strings or numbers (flat dict, no nesting).
Signal data is private: it is visible only to the trade owner via authenticated API calls and is never exposed publicly.
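The flat-dict rule for signal data can be checked locally before a trade is submitted. The validator below is a minimal sketch: the function name is hypothetical, rejecting booleans is an assumption (Python treats `bool` as a subclass of `int`), and the `model` field is an invented example of a freeform skill-specific field.

```python
from typing import Any


def validate_signal_data(signal_data: dict[str, Any]) -> dict[str, Any]:
    """Raise ValueError unless signal_data is a flat dict of strings and numbers."""
    if not isinstance(signal_data, dict):
        raise ValueError("signal_data must be a dict")
    for key, value in signal_data.items():
        if not isinstance(key, str):
            raise ValueError(f"key {key!r} must be a string")
        # Assumption: booleans are rejected even though bool subclasses int.
        if isinstance(value, bool) or not isinstance(value, (str, int, float)):
            raise ValueError(
                f"value for {key!r} must be a string or number, got {type(value).__name__}"
            )
    return signal_data


# "model" is a hypothetical freeform field; edge and probability appear in the docs.
checked = validate_signal_data({"edge": 0.07, "probability": 0.62, "model": "v3"})
```

Nested values such as `{"features": {"a": 1}}` or lists raise `ValueError` under this sketch, matching the "no nesting" constraint.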
v2 uses SKILL.md behavioral instructions instead of CLI commands. Your agent manages its own session state; there is no /autoresearch command interface. Include the autoresearch SKILL.md in your agent’s context to wire up the research loop behavior.
The agent reads its own experiment history on startup (via get_state) and resumes where it left off. To reset, call init_experiment with a new session name.
Experiments are capped at AUTORESEARCH_MAX_EXPERIMENTS (default 50) per session. At 80% of the cap, your agent gets a warning. At the limit, run_experiment is blocked.
Set AUTORESEARCH_MAX_EXPERIMENTS=0 to disable the cap (not recommended for unattended agents).
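The cap behavior described above can be sketched as a small status function. This is an illustration under assumptions rather than the actual implementation: the function names and the status strings are hypothetical, and it assumes the count being compared is the number of experiments already run in the session.

```python
import os


def read_experiment_cap() -> int:
    """Read AUTORESEARCH_MAX_EXPERIMENTS, falling back to the documented default of 50."""
    return int(os.environ.get("AUTORESEARCH_MAX_EXPERIMENTS", "50"))


def experiment_cap_status(experiments_run: int, cap: int) -> str:
    """Return "ok", "warning", or "blocked" for the current session (hypothetical labels)."""
    if cap == 0:
        # A cap of 0 disables the limit entirely.
        return "ok"
    if experiments_run >= cap:
        return "blocked"
    # Integer form of experiments_run >= 0.8 * cap, avoiding float rounding.
    if experiments_run * 5 >= cap * 4:
        return "warning"
    return "ok"


status_at_default_cap = [experiment_cap_status(n, 50) for n in (39, 40, 50)]
```

With the default cap of 50, the warning starts at 40 experiments and `run_experiment` would be blocked once 50 have been run.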
The server cross-checks self-reported P&L metrics against the Simmer API. If the agent-reported metric diverges significantly from actual trade data, a warning is logged. This prevents metric gaming — the agent can’t inflate results by changing how metrics are calculated.
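To make the cross-check concrete, here is one way such a divergence test could look. The documentation does not say what counts as "significantly", so the 10% relative tolerance and the 0.01 absolute floor below are placeholder assumptions, and the function name is hypothetical.

```python
def metric_diverges(
    reported_pnl: float,
    actual_pnl: float,
    rel_tol: float = 0.10,  # assumed tolerance, not a documented value
    abs_tol: float = 0.01,  # assumed floor so tiny P&L values do not trigger warnings
) -> bool:
    """Return True if the agent-reported P&L is too far from the API-derived P&L."""
    allowed_gap = max(abs_tol, rel_tol * abs(actual_pnl))
    return abs(reported_pnl - actual_pnl) > allowed_gap


# Hypothetical numbers: the first pair is within tolerance, the second is not.
small_gap = metric_diverges(12.0, 11.5)
large_gap = metric_diverges(25.0, 11.5)
```

Comparing against trade data the agent does not control is what makes the check meaningful: changing how the agent computes its own metric does not change the reference value.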