Backtest a Pine strategy from Claude Code in 90 seconds
Walk-through of the @pineforge/codegen-mcp server: install in one npx command, ask Claude to transpile your Pine, run a Docker backtest, and read the trade list back. Your OHLCV never leaves the machine.
The PineForge codegen has lived behind a curl for the last few months. As of this week,
it also lives behind a Model Context Protocol server, which means you can drive a backtest
from inside any MCP-aware client — Claude Desktop, Claude Code, Cursor, Continue.dev, and
the growing list of others.
This is a walkthrough of what that looks like end-to-end. Total wall-clock time from "empty repo" to "I have a JSON backtest report": about 90 seconds, plus the one-time Docker pull.
What the server actually does
The MCP server (`@pineforge/codegen-mcp` on npm) is a thin local stdio bridge. It exposes
four tools to your AI client:
| Tool | Runs on | Cost |
|---|---|---|
| `transpile_pine` | Hosted codegen API | counts against your quota (refunded on compile error) |
| `get_quota` | Hosted codegen API | free |
| `backtest_pine` | Your local Docker daemon | counts as 1 transpile (for the codegen call inside) |
| `pull_engine_image` | Your local Docker daemon | free |
The privacy surface is the bit worth pausing on. The Pine source travels to the hosted codegen at codegen.pineforge.dev. The OHLCV CSV does not — `backtest_pine` runs the resulting `strategy.cpp` against the file you point it at via your local Docker daemon, with the file mounted into the container as a read-only volume. No network access from the runtime container. Your data stays on your laptop.
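To make that concrete, the container invocation behind `backtest_pine` amounts to something like the sketch below. Treat it as an illustration of the isolation guarantees rather than the package's exact command line: the in-container mount paths are assumptions, and the image name is the one the report's `_meta` block shows later in this post.

```sh
# Sketch of the assumed docker run (verify against the package source):
#   --rm            throw the container away afterwards
#   --network none  the runtime container gets no network access
#   :ro             strategy and OHLCV are mounted read-only
docker run --rm --network none \
  -v "$PWD/strategy.cpp:/work/strategy.cpp:ro" \
  -v "$PWD/eth_15m.csv:/work/ohlcv.csv:ro" \
  ghcr.io/fullpass-4pass/pineforge-engine:latest
```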
Install (one command)
```sh
npx -y @pineforge/codegen-mcp
```
That's it. The package downloads on first run, compiles its TypeScript on the fly, and starts speaking MCP over stdio. No global install required. No build step.
You'll need three things present:
- Node ≥ 20 (most recent macOS/Linux distros qualify)
- A Docker daemon running locally (the `backtest_pine` tool exec's `docker run`)
- A PineForge API key — `pf_…` — that you got via the pineforge.dev waitlist
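If you want to sanity-check the first two before wiring anything into a client:

```sh
node --version   # should print v20.x or newer
docker info      # fails fast if the daemon isn't running
```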
Wire it into Claude Desktop
Open the config file:
- macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`
- Windows: `%APPDATA%\Claude\claude_desktop_config.json`
Add an entry under `mcpServers`:

```json
{
  "mcpServers": {
    "pineforge-codegen": {
      "command": "npx",
      "args": ["-y", "@pineforge/codegen-mcp"],
      "env": {
        "PINEFORGE_API_KEY": "pf_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
      }
    }
  }
}
```
Restart Claude Desktop. The server will be listed under "Available tools" in the sidebar. The first tool call will hit the hosted codegen and run `docker pull` for the runtime image — a one-time ~250 MB download.
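If you'd rather not sit through that download mid-conversation, warm the cache up front, either by asking the model to call `pull_engine_image` or by pulling manually (image name as reported in the backtest `_meta` block below):

```sh
docker pull ghcr.io/fullpass-4pass/pineforge-engine:latest
```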
Wire it into Claude Code (CLI)
If you're already in a terminal session, this is shorter:
```sh
claude mcp add pineforge-codegen \
  --transport stdio \
  --env PINEFORGE_API_KEY=pf_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx \
  -- npx -y @pineforge/codegen-mcp
```
Now `claude` knows about the server for this project. Re-launch the session and the four tools become available.
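To confirm the registration took, the Claude Code CLI can list what it knows about (if your version lays the subcommands out differently, `claude mcp --help` will show the local equivalent):

```sh
claude mcp list   # pineforge-codegen should appear in the output
```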
Cursor
Settings → MCP → New MCP Server → paste the same JSON block as the Claude Desktop config above. Cursor handles the lifecycle.
Drive it with prose
Once the server is wired in, the interaction model is just English prompts. A realistic exchange (slightly compressed):
**You:** I have a Pine v6 strategy in `./strategy.pine` and 7 days of 15-minute ETHUSDT data in `./eth_15m.csv`. Backtest it and tell me the Sharpe and max drawdown.

**Claude:** Running `backtest_pine` on those files…

*[tool call: `backtest_pine({ source: "...", ohlcv_csv_path: "./eth_15m.csv" })` returns in ~3 seconds]*

Backtest complete. 49 trades over the 7-day window. Net PnL −$190.85, Sharpe 0.21, max drawdown $312. The strategy enters short on the 3rd losing trade and never recovers — would you like me to look at what triggered that entry, or sweep the entry threshold?
This is the productive surface. The model has the tool descriptions, knows what `backtest_pine` returns, and can chain the next call without you typing the curl yourself. The friction of "edit Pine → save → switch to TradingView → reload chart → read numbers → switch back to editor" collapses into a single conversation.
What the report looks like
`backtest_pine` returns the same JSON shape the standalone Docker image produces. The summary block:

```json
{
  "engine": "pineforge",
  "summary": {
    "total_trades": 49,
    "net_pnl": -190.85,
    "max_drawdown": 312.0,
    "sharpe": 0.21,
    "profit_factor": 0.78,
    "win_rate": 0.43
  },
  "trades": [
    /* 49 entries with timestamps, prices, PnL */
  ],
  "elapsed_seconds": 0.0042,
  "_meta": {
    "strategy_cpp_bytes": 5079,
    "image": "ghcr.io/fullpass-4pass/pineforge-engine:latest"
  }
}
```
The model gets the full structure. It can answer follow-ups like "what was the worst trade?" by scanning `trades[]` itself, no second tool call needed.
Quota awareness
Every conversation that uses the codegen API counts against your quota. The `get_quota` tool exists so the model can check before re-running an expensive parameter sweep. The free tier is 100 transpiles per month — plenty for hobby work and CI smoke tests, less so when you're driving an iterative optimization loop.

A useful pattern: in your project's `CLAUDE.md` (or equivalent), add a hint like:

> If asked to optimize a strategy, call `get_quota` first and report remaining budget before kicking off >5 transpile calls.
What's deliberately NOT in the server
- No live order placement. The server is read-only against your strategy and data; it doesn't connect to a broker. Live trading is a separate concern.
- No data fetching. You bring the OHLCV. The server doesn't pull from exchanges or feed providers — you control where the bytes come from and what gets backtested.
- No state between calls. Every `backtest_pine` run is a fresh Docker container. No persisted cache, no shared session.
These are deliberate scope limits. They keep the surface auditable and the failure modes simple.
Where this becomes interesting
A few patterns this enables that are awkward to do by hand:
- "Tighten the stop and re-run." The model edits one number in the Pine
source, calls
backtest_pineagain, compares Sharpe. Loop until you're tired of suggesting variations. - "Try this on the last 30 days instead of the last 7." You point at a different CSV, the model re-runs. No filter UI to navigate.
- "What's the parameter sensitivity here?" Sweep one input across a range, collect Sharpe per value, print a small markdown table. Model handles the iteration.
- "Compare against this other strategy I have." Two
backtest_pinecalls, side-by-side report.
None of these require new infrastructure. They emerge from the model + four tools + your prose.
Try it
- Get a free codegen API key via the pineforge.dev waitlist (key arrives by email)
- Full /ai setup page (Claude Desktop / Claude Code / Cursor configs)
- The npm package: `@pineforge/codegen-mcp`
The package is open source. The server itself is small enough to read in one sitting if you want to audit what gets sent over the wire.
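A quick way to do that audit without reading code is to speak MCP to the server by hand. The sketch below assumes the standard MCP stdio framing (newline-delimited JSON-RPC, with the protocol version string current as of this writing); it is a debugging aid, not a supported interface.

```sh
# Handshake, then list the four tools; responses print to stdout.
(
  printf '%s\n' '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"audit","version":"0.0.0"}}}'
  printf '%s\n' '{"jsonrpc":"2.0","method":"notifications/initialized"}'
  printf '%s\n' '{"jsonrpc":"2.0","id":2,"method":"tools/list"}'
) | PINEFORGE_API_KEY=pf_… npx -y @pineforge/codegen-mcp
```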