
AlphaForge – Overview

Source: Notion | Last edited: 2025-12-06 | ID: 2b22d2dc-3ef...


AlphaForge is an AI-agent-centric, DSL-driven quantitative research and execution infrastructure.

It provides a unified way to:

  • Describe trading ideas in a declarative strategy DSL (Domain-Specific Language)
  • Compile them into canonical DAGs (Directed Acyclic Graphs) with fingerprints
  • Run backtests, simulations, and live trading across multiple venues
  • Track every experiment as a first-class object – identifiable, reproducible, and comparable

AlphaForge’s core innovation is to let humans and AI agents build trading strategies by assembling and evolving tested “Lego-like” plugins in a DSL, instead of hand-coding fragile one-off backtests.

  • Strategies are composed from well-tested plugins (data, features, signals, execution, metrics), not raw scripts.

  • The DSL specifies “what” the strategy should do; plugins encapsulate “how” it’s implemented.

  • When existing plugins are not enough, users and agents can experiment in an escape bundle, then promote stable logic back into plugins.

This has several implications:

  • Far fewer implementation bugs – most complexity lives in reusable, tested plugins.

  • AI agents don’t have to write full programs in a general-purpose language; they just choose plugins, set parameters, and wire them together.

  • Over time, AlphaForge helps teams grow a large, high-quality plugin library, which both humans and agents can reuse.

  • With a mature DSL + plugin ecosystem, each user can leverage AI-driven “armies of quants” to implement and iterate on trading ideas quickly and safely.
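To make the "plugins, parameters, wiring" idea concrete, here is a minimal sketch of what a declarative strategy spec might look like, together with the kind of structural validation the DSL enables. All plugin names, fields, and the `validate_spec` helper are invented for illustration; AlphaForge's actual DSL syntax may differ.

```python
# Hypothetical declarative strategy spec: the "what", with the "how"
# delegated to named plugins. Every name here is illustrative.
strategy_spec = {
    "universe": {"venue": "binance-futures", "instruments": ["BTCUSDT", "ETHUSDT"]},
    "data": [{"plugin": "ohlcv", "params": {"freq": "1h"}}],
    "features": [
        {"plugin": "momentum", "params": {"lookback": 24}},
        {"plugin": "volatility", "params": {"window": 72}},
    ],
    "signal": {"plugin": "rank_long_short", "params": {"top_k": 1}},
    "execution": {"plugin": "twap", "params": {"slices": 4}},
    "backtest": {"start": "2024-01-01", "end": "2024-06-30", "fee_bps": 2},
}

def validate_spec(spec: dict) -> list[str]:
    """Structured specs are checkable: report missing top-level sections."""
    required = {"universe", "data", "features", "signal", "execution", "backtest"}
    return sorted(required - spec.keys())
```

Because the spec is data rather than code, both humans and agents can diff it, validate it, and mutate it field-by-field.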


Traditional quant pipelines are often:

  • Ad-hoc – every researcher writes their own backtest loop and data plumbing.

  • Non-reproducible – hard to know exactly what was run, with what config, under which environment.

  • Non-composable – reusing data, features, execution, and risk components is painful.

  • Opaque – experiments and results are scattered across notebooks, scripts, and logs.

  • Feature-duplicated – the same “factor” gets re-implemented many times with subtle, undocumented differences.

AlphaForge aims to be the operating system for trading research:

  • One language (DSL) to describe strategies and experiments.

  • One compiler pipeline to normalize, fingerprint, and index them.

  • One orchestrator to manage runs, resources, and lineage.

  • One data + feature + plugin layer to serve consistent, well-defined components.

  • One execution gateway to bridge research and live trading without rewriting strategies.


At a high level, AlphaForge is organized as:

  • A DSL + compiler + IR (intermediate representation) layer that turns strategy intent into a canonical DAG.
  • A runtime / orchestrator that schedules DAGs as jobs across compute and data backends.
  • A data & feature service exposing well-typed series and panels via contracts.
  • An execution gateway that maps abstract orders / portfolios into venue-specific actions.
  • An experiment / artifact store that tracks configs, runs, and results as first-class entities.
  • A plugin and capability-pack system that extends AlphaForge without modifying the core.

  • Strategy logic, data dependencies, and evaluation plans are defined in a declarative DSL.

  • Implementation details live behind plugins and capability packs.

  • Researchers and AI agents reason in terms of “what” they want, not “how” to implement it.

This makes the strategy surface:

  • Structured – explicit fields and sections instead of arbitrary code.

  • Validatable – easier to check for missing pieces, inconsistent assumptions, and invalid combinations.

  • Composable – strategies assemble from reusable blocks rather than being monolithic scripts.

When the DSL or existing plugins are not expressive enough, AlphaForge provides an escape bundle:

  • A sandbox for new custom logic:
    • Features / factors
    • Labels and targets
    • Signal or strategy components
  • Users and agents iterate there first; once the logic proves stable and useful, it is promoted into a proper plugin with clear contracts and versioning, and registered in the plugin registry.

This keeps the DSL focused on intent and wiring, while still allowing rapid experimentation in code when needed.
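The escape-bundle-to-plugin path can be sketched as: prototype a feature as plain code, then "promote" it by attaching an explicit contract. The `FeatureContract` dataclass, `promote` decorator, and registry below are all hypothetical stand-ins for whatever AlphaForge's real promotion machinery looks like.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class FeatureContract:
    """Hypothetical contract metadata attached at promotion time."""
    name: str
    version: str
    inputs: tuple[str, ...]   # required input series
    lag: int                  # bars of lookback the feature consumes

REGISTRY: dict[str, tuple[FeatureContract, Callable]] = {}

def promote(contract: FeatureContract):
    """Move escape-bundle logic into the shared plugin registry."""
    def wrap(fn: Callable):
        REGISTRY[f"{contract.name}@{contract.version}"] = (contract, fn)
        return fn
    return wrap

# --- escape-bundle prototype, later promoted with its logic unchanged ---
@promote(FeatureContract("zscore_momentum", "0.1.0", inputs=("close",), lag=24))
def zscore_momentum(close: list[float], lookback: int = 24) -> float:
    window = close[-lookback:]
    mean = sum(window) / len(window)
    var = sum((x - mean) ** 2 for x in window) / len(window)
    return 0.0 if var == 0 else (close[-1] - mean) / var ** 0.5
```

The point of the pattern is that promotion adds contracts and versioning around code that already proved itself in experiments, rather than requiring a rewrite.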

DSL + plugins are particularly well-suited to AI agents:

  • Structured search space Agents work inside a constrained, typed configuration space (DSL fields, enums, plugin parameters) instead of arbitrary Python code. This reduces:

    • Syntax errors
    • Missing imports / wrong APIs
    • Subtle type and shape mismatches
  • Plugin-based composition Agents don’t implement low-level details; they select and combine existing plugins:

    • “Use this data plugin, these feature plugins, that signal plugin, under this risk plugin.”
    • The heavy lifting lives in code that has already been tested and deployed.
  • Escape bundle → plugin pipeline When agents (or humans) need something new, they can:

    • Prototype it in the escape bundle
    • Validate it through experiments
    • Promote it into a plugin that becomes part of the shared Lego set
  • Compounding leverage over time As the plugin library grows:

    • Each new strategy is more about choosing the right building blocks than reinventing the wheel.
    • AI agents can efficiently explore more of the strategy space by recombining high-quality components.

The result: once the DSL and plugin ecosystem are in place, users can orchestrate multiple AI agents as a coordinated quant research team, safely exploring and implementing ideas at scale.
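One way to see why a constrained, typed configuration space helps agents: candidate strategies become an enumerable product of plugin choices, not free-form programs. The plugin names and parameter grids below are invented for illustration.

```python
from itertools import product

# Hypothetical typed search space: each slot offers a few plugins/parameters.
search_space = {
    "feature": [("momentum", {"lookback": lb}) for lb in (12, 24, 48)],
    "signal": [("rank_long_short", {"top_k": k}) for k in (1, 2)],
    "execution": [("twap", {"slices": s}) for s in (1, 4)],
}

def enumerate_candidates(space: dict) -> list[dict]:
    """Cartesian product over plugin slots -> concrete strategy configs."""
    keys = list(space)
    return [dict(zip(keys, combo)) for combo in product(*(space[k] for k in keys))]

candidates = enumerate_candidates(search_space)
# 3 feature x 2 signal x 2 execution = 12 syntactically valid configurations,
# every one of them well-formed by construction -- no syntax or import errors.
```

An agent can then rank or backtest these candidates instead of debugging generated code.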

2. Compile → Canonicalize → Fingerprint → Vectorize


Every strategy or experiment goes through the same pipeline:

  1. Parse & compile
  • DSL → IR (intermediate representation) + DAG
  2. Canonicalize
  • Deterministic ordering and normalization
  • Prune irrelevant differences (e.g., formatting, non-semantic reordering)
  3. Fingerprint
  • Content-addressed identity for strategies, components, and DAG nodes
  4. Vectorize
  • Embed the IR / DAG into a vector space for similarity and novelty queries

This gives AlphaForge:

  • Run de-duplication – “Has this already been run?”

  • Similarity search – “Have we tried something like this before?”

  • Novelty metrics – “How new is this configuration compared to our existing corpus?”

  • Traceability and caching at the DAG-node level, not just at the script or repository level.

  • Auditability and governance – every result is tied back to a canonical experiment spec and environment, with full lineage.
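The canonicalize-then-fingerprint step can be sketched in a few lines, assuming strategies compile to JSON-serializable IR dicts. Real canonicalization would also normalize DAG node ordering and prune non-semantic fields; here it is just deterministic serialization.

```python
import hashlib
import json

def canonicalize(ir: dict) -> str:
    """Deterministic serialization: sorted keys, no insignificant whitespace."""
    return json.dumps(ir, sort_keys=True, separators=(",", ":"))

def fingerprint(ir: dict) -> str:
    """Content-addressed identity: semantically identical IRs hash alike."""
    return hashlib.sha256(canonicalize(ir).encode()).hexdigest()[:16]

a = {"signal": "momentum", "params": {"lookback": 24}}
b = {"params": {"lookback": 24}, "signal": "momentum"}  # same content, reordered
assert fingerprint(a) == fingerprint(b)   # run de-duplication: "already run?"
```

Because identity is derived from content, caching and lineage can work at the level of individual DAG nodes rather than whole scripts.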


3. Core, Plugins, Capability Packs, and the Registry


Core includes:

  • Strategy DSL and compiler / IR

  • Orchestrator and runtime

  • Data abstractions and contracts

  • Execution gateway and portfolio / risk abstractions

  • Experiment and artifact store

Plugins are the primary extension mechanism. Typical plugin types include:

  • Data plugins – data sources, ingestion, and normalization

  • Feature / factor plugins – reusable transformations, factors, and engineered signals

  • Strategy plugins – combine, aggregate, and adjust signals into position-decision logic

  • Execution plugins – order routing, execution engines, and risk controls

  • Metrics / reporting plugins – custom curves, diagnostics, and reporting

Plugins are managed via a registry with strong contracts (schema, units, lag, availability, parameters), so they can be:

  • Discovered and reused across strategies

  • Versioned and compared

  • Swapped or upgraded without breaking DSL definitions

Capability packs (separate repos) bundle cohesive sets of plugins and configurations for specific domains. For example:

  • A mid-frequency crypto pack

  • An HFT pack

  • An on-chain / DeFi pack

Each pack contributes plugins and templates (data, features, signals, execution adapters, metrics) without modifying the core, so organizations can add their own domain expertise cleanly on top of AlphaForge.

Over time, both humans and AI agents can rapidly grow this plugin ecosystem, turning domain knowledge into reusable, testable building blocks.


AlphaForge is designed to be both human-friendly and agent-friendly:

  • The DSL is structured and explicit, making it easy for AI agents to read, modify, and generate.

  • The compiler, orchestrator, plugin registry, and experiment store expose clean APIs for programmatic access.

Experiments, runs, and metrics form a machine-navigable index of what has been tried:

  • Agents can search past experiments by configuration, fingerprint, similarity, or performance.

  • They can avoid re-running equivalent or near-duplicate configurations.

  • They can propose new experiments that explore novel but relevant regions of the space.

Over time, the experiment and plugin corpus becomes an institutional memory of research, rather than a pile of ad-hoc notebooks and scripts. As this corpus grows, users can bring multiple AI agents into the workflow to propose strategies, assemble DSL specs, refine plugins, and iterate toward robust trading systems.
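Similarity and novelty queries over the experiment index can be sketched with plain cosine similarity, assuming each experiment's IR has already been vectorized into a fixed-length embedding (the toy vectors below stand in for real embeddings).

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy corpus of past experiments, keyed by run id.
corpus = {
    "exp-001": [1.0, 0.0, 0.2],
    "exp-002": [0.9, 0.1, 0.3],
    "exp-003": [0.0, 1.0, 0.0],
}

def novelty(candidate: list[float]) -> float:
    """1 - max similarity to any past experiment: high means unexplored."""
    return 1.0 - max(cosine(candidate, v) for v in corpus.values())

# A near-duplicate of exp-002 scores near zero; an orthogonal idea scores high.
assert novelty([0.9, 0.1, 0.3]) < 0.05
assert novelty([0.0, 0.0, 1.0]) > 0.5
```

An agent can use the low-novelty signal to skip near-duplicate runs and the high-novelty signal to prioritize genuinely new regions of the space.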


Backtesting and live trading share:

  • The same data contracts and schemas

  • The same portfolio and risk abstractions

  • The same execution gateway (with different adapters, venues, and risk policies)

Goals:

  • Minimize the backtest–live gap

  • Allow multiple execution engines (e.g. third-party engines, in-house services, or direct exchange APIs) to be plugged in behind stable abstractions

  • Keep strategies expressed in the DSL, without redesigning them for each execution path
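The "one interface, many adapters" idea behind the execution gateway can be sketched as a small abstract base class. Class and method names here (`ExecutionAdapter`, `submit`, the `place_order` call on the client) are invented for illustration.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Order:
    instrument: str
    qty: float          # signed: positive = buy, negative = sell
    kind: str = "market"

class ExecutionAdapter(ABC):
    """Stable interface shared by backtest and live execution paths."""
    @abstractmethod
    def submit(self, order: Order) -> str: ...

@dataclass
class BacktestAdapter(ExecutionAdapter):
    fills: list[Order] = field(default_factory=list)
    def submit(self, order: Order) -> str:
        self.fills.append(order)          # simulate an immediate fill
        return f"sim-{len(self.fills)}"

class LiveAdapter(ExecutionAdapter):
    def __init__(self, client):
        self.client = client              # e.g. some exchange API client
    def submit(self, order: Order) -> str:
        return self.client.place_order(order.instrument, order.qty, order.kind)

# The strategy-side code is identical for both paths:
def rebalance(adapter: ExecutionAdapter) -> str:
    return adapter.submit(Order("BTCUSDT", 0.5))
```

Because strategies only see `ExecutionAdapter`, swapping a simulator for a live venue is a configuration change, which is what narrows the backtest-live gap.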


1. Strategy DSL

Define, in a single declarative spec:

  • Universe and instruments

  • Data sources and sampling frequencies

  • Features, labels, and signal definitions

  • Position rules and portfolio constraints

  • Backtest configuration (horizon, fees, slippage, costs, risk constraints)

2. Compiler pipeline

  • DSL → IR → canonical DAG → fingerprints and embeddings

  • Index experiments and components for de-duplication, similarity, and novelty queries

3. Data & feature layer

  • Integration with a columnar time-series / panel store and object storage (e.g. a ClickHouse-like backend + blob storage)

  • Data ingestion and normalization for supported venues

  • A factor / feature registry backed by plugins and contracts

4. Experiment orchestrator

  • Run backtests and evaluations as jobs managed by the orchestrator

  • Log metrics, curves, exposures, and artifacts

  • Designed for team workflows – experiments can be shared and revisited across users and roles

  • Provide research-friendly workflows:

    • Compare experiments side-by-side (config diff, metrics diff, exposure diff)
    • Treat experiments as shareable objects (clone, tweak, tag, comment)

5. Execution gateway (baseline)
  • Shared abstractions for orders, positions, portfolios, and risk limits

  • Adapters to one or more execution engines (e.g. Nautilus-style engines, in-house execution services, or direct exchange APIs), all behind the same interface


AlphaForge is currently being applied to mid-frequency strategies in crypto futures/perpetuals.

The architecture is intentionally asset- and venue-agnostic, and is designed to extend to equities, futures, options, and on-chain venues over time.


  • Writes DSL strategies.
  • Launches backtests and evaluations.
  • Inspects results and iterates on ideas.
  • Extends the platform with plugins and capability packs.
  • Owns infrastructure, data integration, and performance optimizations.
  • Consumes higher-level views on performance, risk, exposures, and scenarios.
  • Enforces risk policies consistently across strategies and venues.
  • Interact with the DSL, compiler APIs, plugin registry, and experiment store.
  • Automate research workflows end-to-end: generate, run, monitor, and refine experiments.
  • Help teams grow and exploit a rich ecosystem of well-tested plugins and strategies.