
Backtest a Pine strategy from Claude Code in 90 seconds

Walk-through of the @pineforge/codegen-mcp server: install in one npx command, ask Claude to transpile your Pine, run a Docker backtest, and read the trade list back. Your OHLCV never leaves the machine.

7 min read · #mcp #ai #claude #cursor #tooling

The PineForge codegen has lived behind a curl for the last few months. As of this week, it also lives behind a Model Context Protocol server, which means you can drive a backtest from inside any MCP-aware client — Claude Desktop, Claude Code, Cursor, Continue.dev, and the growing list of others.

This is a walkthrough of what that looks like end-to-end. Total wall-clock time from "empty repo" to "I have a JSON backtest report": about 90 seconds, plus the one-time Docker pull.

What the server actually does

The MCP server (@pineforge/codegen-mcp on npm) is a thin local stdio bridge. It exposes four tools to your AI client:

| Tool | Runs on | Cost |
| --- | --- | --- |
| transpile_pine | Hosted codegen API | counts against your quota (refunded on compile error) |
| get_quota | Hosted codegen API | free |
| backtest_pine | Your local Docker daemon | counts 1 (for the transpile inside) |
| pull_engine_image | Your local Docker daemon | free |

The privacy surface is the bit worth pausing on. The Pine source travels to the hosted codegen at codegen.pineforge.dev. The OHLCV CSV does not — backtest_pine runs the resulting strategy.cpp against the file you point it at via your local Docker daemon, with the file mounted into the container as a read-only volume. No network access from the runtime container. Your data stays on your laptop.
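To make the isolation concrete, here is roughly what backtest_pine does with your files, expressed as a hand-rolled docker invocation. This is a sketch, not the server's actual command line: the --network none and read-only (:ro) mount flags are real Docker flags matching the guarantees described above, but the container paths and entrypoint are assumptions.

```shell
# Illustrative only: approximately what backtest_pine runs locally.
# The CSV and generated strategy.cpp are mounted read-only, and the
# runtime container is started with networking disabled entirely.
docker run --rm --network none \
  -v "$PWD/eth_15m.csv:/data/ohlcv.csv:ro" \
  -v "$PWD/strategy.cpp:/src/strategy.cpp:ro" \
  ghcr.io/fullpass-4pass/pineforge-engine:latest
```

The read-only mounts mean the container can read your candles but can't modify them, and --network none means even a compromised engine image has nowhere to send them.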

Install (one command)

npx -y @pineforge/codegen-mcp

That's it. The package downloads on first run, compiles its TypeScript on the fly, and starts speaking MCP over stdio. No global install required. No build step.
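Since the server speaks plain JSON-RPC over stdio, you can sanity-check it without any MCP client by piping an initialize request at it. A hypothetical one-liner — the protocolVersion string shown here is one published MCP revision and may differ for your client:

```shell
# Hand-rolled MCP handshake: a healthy server answers on stdout with a
# JSON-RPC result containing its serverInfo and capabilities.
echo '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"smoke-test","version":"0.0.0"}}}' \
  | npx -y @pineforge/codegen-mcp
```

If you get a JSON response back instead of a stack trace, the wiring below will work.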

You'll need three things present:

- Node.js (npx ships with it)
- a running Docker daemon, for backtest_pine and pull_engine_image
- a PineForge API key, exported as PINEFORGE_API_KEY

Wire it into Claude Desktop

Open the config file — claude_desktop_config.json, found under ~/Library/Application Support/Claude/ on macOS or %APPDATA%\Claude\ on Windows:

Add an entry under mcpServers:

{
  "mcpServers": {
    "pineforge-codegen": {
      "command": "npx",
      "args": ["-y", "@pineforge/codegen-mcp"],
      "env": {
        "PINEFORGE_API_KEY": "pf_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
      }
    }
  }
}

Restart Claude Desktop. The server will be listed under "Available tools" in the sidebar. The first tool call will hit the hosted codegen and run docker pull for the runtime image — a one-time ~250 MB download.

Wire it into Claude Code (CLI)

If you're already in a terminal session, this is shorter:

claude mcp add pineforge-codegen \
  --transport stdio \
  --env PINEFORGE_API_KEY=pf_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx \
  -- npx -y @pineforge/codegen-mcp

Now claude knows about the server for this project. Re-launch the session and the four tools become available.

Cursor

Settings → MCP → New MCP Server → paste the same JSON block as the Claude Desktop config above. Cursor handles the lifecycle.

Drive it with prose

Once the server is wired in, the interaction model is just English prompts. A realistic exchange (slightly compressed):

You: I have a Pine v6 strategy in ./strategy.pine and 7 days of 15-minute ETHUSDT data in ./eth_15m.csv. Backtest it and tell me the Sharpe and max drawdown.

Claude: Running backtest_pine on those files…

[tool call: backtest_pine({ source: "...", ohlcv_csv_path: "./eth_15m.csv" }) returns in ~3 seconds]

Backtest complete. 49 trades over the 7-day window. Net PnL −$190.85, Sharpe 0.21, max drawdown $312. The strategy enters short on the 3rd losing trade and never recovers — would you like me to look at what triggered that entry, or sweep the entry threshold?

This is the productive surface. The model has the tool descriptions, knows what backtest_pine returns, and can chain the next call without you typing the curl yourself. The friction of "edit Pine → save → switch to TradingView → reload chart → read numbers → switch back to editor" collapses into a single conversation.

What the report looks like

backtest_pine returns the same JSON shape the standalone Docker image produces. The summary block:

{
  "engine": "pineforge",
  "summary": {
    "total_trades": 49,
    "net_pnl": -190.85,
    "max_drawdown": 312.0,
    "sharpe": 0.21,
    "profit_factor": 0.78,
    "win_rate": 0.43
  },
  "trades": [
    /* 49 entries with timestamps, prices, PnL */
  ],
  "elapsed_seconds": 0.0042,
  "_meta": {
    "strategy_cpp_bytes": 5079,
    "image": "ghcr.io/fullpass-4pass/pineforge-engine:latest"
  }
}

The model gets the full structure. It can answer follow-ups like "what was the worst trade?" by scanning trades[] itself, no second tool call needed.
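That follow-up is a one-line scan in any language. A minimal Python sketch of what the model does internally, using an abbreviated report with the shape shown above — note the per-trade "pnl" and "entry_time" field names are assumptions, since the sample truncates the trades[] entries:

```python
import json

# An abbreviated backtest_pine report; the real one has 49 trade entries.
report = json.loads("""
{
  "summary": {"total_trades": 3, "net_pnl": -190.85},
  "trades": [
    {"entry_time": "2024-05-01T00:00:00Z", "pnl": 12.4},
    {"entry_time": "2024-05-01T06:15:00Z", "pnl": -120.5},
    {"entry_time": "2024-05-01T09:45:00Z", "pnl": -82.75}
  ]
}
""")

# "Worst trade" is just the minimum PnL across the trades array.
worst = min(report["trades"], key=lambda t: t["pnl"])
print(worst["entry_time"], worst["pnl"])  # → 2024-05-01T06:15:00Z -120.5
```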

Quota awareness

Every conversation that uses the codegen API counts against your quota. The get_quota tool exists so the model can check before re-running an expensive parameter sweep. Free tier is 100 transpiles per month — plenty for hobby work and CI smoke tests, less when you're driving an iterative optimization loop.
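The budget check itself is trivial client-side logic. A sketch, assuming get_quota returns something like remaining/limit counts — the actual response field names are a guess, and should_run_sweep is a hypothetical helper, not part of the package:

```python
# Hypothetical: assume get_quota returns {"remaining": int, "limit": int}.
def should_run_sweep(quota: dict, planned_transpiles: int, reserve: int = 10) -> bool:
    """Only start a parameter sweep if it leaves a reserve of transpiles
    for ordinary iteration afterwards."""
    return quota["remaining"] - planned_transpiles >= reserve

# Free tier is 100 transpiles/month.
print(should_run_sweep({"remaining": 37, "limit": 100}, planned_transpiles=20))  # True
print(should_run_sweep({"remaining": 12, "limit": 100}, planned_transpiles=20))  # False
```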

A useful pattern: in your project's CLAUDE.md (or equivalent), add a hint like "If asked to optimize a strategy, call get_quota first and report remaining budget before kicking off >5 transpile calls."

What's deliberately NOT in the server

These are deliberate scope limits. They keep the surface auditable and the failure modes simple.

Where this becomes interesting

A few patterns this enables that are awkward to do by hand:

None of these require new infrastructure. They emerge from the model + four tools + your prose.

Try it

The package is open source. The server itself is small enough to read in one sitting if you want to audit what gets sent over the wire.