System Architecture Diagram
Source: Notion | Last edited: 2025-12-03 | ID: 29b2d2dc-3ef...
1.1 Mermaid (paste into Notion / GitHub / Markdown renderers)
```mermaid
flowchart LR
  subgraph Sources["Data Sources"]
    M1["Markets<br/>(trades / quotes / orderbooks / bars)"]
    R1["Reference<br/>(fundamentals / corp actions / calendars)"]
    A1["Alt Data<br/>(news / social / RSS)"]
    C1["On-chain<br/>(RPC / indexers)"]
  end

  subgraph Stream["Streaming Layer"]
    K1["Redpanda / Kafka"]
    SR1["Schema Registry<br/>(Apicurio / Confluent)"]
    SP1["Stream Processing<br/>(Materialize / Flink / Bytewax)"]
  end

  subgraph Lakehouse["Lakehouse Storage"]
    S3["S3 / MinIO"]
    IC["Iceberg Catalog<br/>(Glue / Nessie)"]
  end

  subgraph OLAP["OLAP / Timeseries"]
    CH["ClickHouse"]
    SQL["Trino / Athena / DuckDB"]
  end

  subgraph Vector["Unstructured / Vector"]
    OBJ["Raw Objects<br/>S3 URIs"]
    VDB["pgvector / Qdrant"]
  end

  subgraph Feature["Feature Layer"]
    OFF["Offline Feature Store<br/>(Iceberg Tables)"]
    ONL["Online Feature Store<br/>(Feast → Redis / ClickHouse)"]
  end

  subgraph Access["Access & Serving"]
    AF["Arrow Flight / Flight SQL"]
    GRPC["gRPC / REST"]
    META["OpenMetadata + OpenLineage<br/>+ Great Expectations"]
  end

  subgraph QuantOS["Consumers"]
    AG["AI Agents"]
    RUN["QuantOS Runner"]
    RE["Research Notebooks"]
  end

  M1 --> K1
  R1 --> K1
  A1 --> K1
  C1 --> K1
  K1 --> SR1
  K1 --> SP1
  SP1 --> S3
  K1 --> S3
  S3 --> IC
  S3 --> CH
  CH --> OFF
  S3 --> OFF
  OFF --> ONL
  OBJ --> VDB
  A1 --> OBJ
  S3 --> SQL
  CH --> AF
  OFF --> AF
  ONL --> AF
  CH --> GRPC
  OFF --> GRPC
  ONL --> GRPC
  S3 --> META
  K1 --> META
  SP1 --> META
  AF --> AG
  GRPC --> AG
  AF --> RUN
  GRPC --> RUN
  SQL --> RE
```

1.2 Quick ASCII View (for docs that don’t render Mermaid)
```text
[SOURCES]  Markets | Reference | Alt/News | On-chain
      ↓ (Avro/Protobuf + Schema Registry)
[STREAM BUS] Redpanda/Kafka → [STREAM PROC] Materialize/Flink
      ↓                                ↘ alerts/quality
[LAKEHOUSE] S3/MinIO + Iceberg Catalog (Glue/Nessie)
      ↓                ↓                     ↓
[Trino/Athena]    [ClickHouse]     [Objects + Vector DB]
      ↓                ↓                     ↓
  Ad-hoc SQL   Rollups/low-latency   Unstructured + RAG
      \_____________________ ______________________/
                            \/
[FEATURE LAYER] Offline (Iceberg) + Online (Feast→Redis/CH)
                            \/
[ACCESS LAYER] Arrow Flight / gRPC / REST / Metadata(Lineage/Quality)
                            \/
[QuantOS Runner | AI Agents | Notebooks]
```

2) Terraform-Style Resource Layout
Opinionated for AWS + EKS, cloud-neutral where possible.
You can swap managed for self-managed later without changing the public interfaces.
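As one illustration of that swap, the sketch below shows a module that hides the managed-versus-self-managed choice behind a single output. It is a hypothetical version of `modules/redis`, not the actual module: the `managed` and `self_managed_host` variables, the `online-fs` cluster ID, and the in-cluster hostname are all assumptions, and networking (subnet groups, security groups) is omitted. The only part taken from the layout itself is that callers read `module.redis.endpoint`.

```hcl
# modules/redis/main.tf (hypothetical sketch; variable names are assumptions)

variable "managed" {
  type        = bool
  default     = false
  description = "true = ElastiCache, false = Redis running on EKS"
}

variable "size" {
  type    = string
  default = "cache.m6g.large" # ElastiCache node type; only used when managed = true
}

variable "self_managed_host" {
  type    = string
  default = "redis-master.feature-store.svc.cluster.local" # assumed Service DNS name
}

# Created only for the managed variant.
resource "aws_elasticache_cluster" "managed" {
  count           = var.managed ? 1 : 0
  cluster_id      = "online-fs"
  engine          = "redis"
  node_type       = var.size
  num_cache_nodes = 1
}

# The public interface: the same output name regardless of backend.
output "endpoint" {
  value = try(
    aws_elasticache_cluster.managed[0].cache_nodes[0].address,
    var.self_managed_host
  )
}
```

Because consumers such as the Feast module only depend on the `endpoint` output, flipping `managed` should not require any change in the root configuration.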
```text
infra/
├─ envs/
│  ├─ dev/
│  │  ├─ main.tf
│  │  ├─ variables.tf
│  │  └─ terraform.tfvars
│  └─ prod/
│     ├─ main.tf
│     ├─ variables.tf
│     └─ terraform.tfvars
├─ modules/
│  ├─ vpc/
│  ├─ eks/
│  ├─ redpanda/              # Helm on EKS (or MSK alternative)
│  ├─ schema-registry/       # Apicurio/Confluent Helm
│  ├─ s3-iceberg/            # S3 buckets + IAM + Glue/Nessie
│  ├─ clickhouse/            # ClickHouse operator + cluster
│  ├─ trino/                 # Trino Helm (optional early)
│  ├─ materialize/           # Materialize Helm (or Flink)
│  ├─ feast/                 # Feast On EKS
│  ├─ redis/                 # Online FS low-latency store
│  ├─ pgvector/              # RDS Postgres with pgvector (or Qdrant Helm)
│  ├─ openmetadata/          # Metadata catalog
│  ├─ openlineage/           # Marquez/OpenLineage
│  ├─ great-expectations/    # Data quality runner job
│  ├─ arrow-flight-gateway/  # Custom k8s svc for Flight/grpc
│  └─ observability/         # Prometheus/Grafana/Loki/Tempo
└─ README.md
```

2.1 envs/dev/main.tf (root wiring example)
```hcl
terraform {
  required_version = ">= 1.8.0"
  required_providers {
    aws        = { source = "hashicorp/aws", version = "~> 5.0" }
    kubernetes = { source = "hashicorp/kubernetes", version = "~> 2.30" }
    helm       = { source = "hashicorp/helm", version = "~> 2.13" }
  }
}

provider "aws" {
  region = var.aws_region
}

module "vpc" {
  source = "../../modules/vpc"
  name   = "${var.prefix}-vpc"
  cidr   = var.vpc_cidr
}

module "eks" {
  source          = "../../modules/eks"
  cluster_name    = "${var.prefix}-eks"
  vpc_id          = module.vpc.id
  private_subnets = module.vpc.private_subnets
  public_subnets  = module.vpc.public_subnets
}

# Lakehouse foundation: S3 + Glue Iceberg catalog
module "s3_iceberg" {
  source              = "../../modules/s3-iceberg"
  prefix              = var.prefix
  bucket_lake         = "${var.prefix}-lake"
  enable_glue_catalog = true
}

# Streaming: Redpanda (or swap to MSK)
module "redpanda" {
  source         = "../../modules/redpanda"
  cluster_name   = "${var.prefix}-redpanda"
  eks_cluster    = module.eks.name
  eks_kubeconfig = module.eks.kubeconfig
}

module "schema_registry" {
  source         = "../../modules/schema-registry"
  eks_kubeconfig = module.eks.kubeconfig
  provider_type  = "apicurio" # or "confluent"
}

# OLAP: ClickHouse
module "clickhouse" {
  source         = "../../modules/clickhouse"
  eks_kubeconfig = module.eks.kubeconfig
  storage_size   = "2Ti"
}

# SQL access (optional early): Trino
module "trino" {
  source         = "../../modules/trino"
  eks_kubeconfig = module.eks.kubeconfig
}

# Stream processing (start with Materialize)
module "materialize" {
  source         = "../../modules/materialize"
  eks_kubeconfig = module.eks.kubeconfig
}

# Feature Store: Feast + Redis (or ClickHouse as online store)
module "redis" {
  source         = "../../modules/redis"
  eks_kubeconfig = module.eks.kubeconfig
  size           = "cache.m6g.large"
}

module "feast" {
  source                = "../../modules/feast"
  eks_kubeconfig        = module.eks.kubeconfig
  offline_store_catalog = module.s3_iceberg.glue_catalog_name
  online_store_endpoint = module.redis.endpoint
}

# Vector: RDS Postgres + pgvector (swap to Qdrant later if needed)
module "pgvector" {
  source        = "../../modules/pgvector"
  db_name       = "vectors"
  instance_type = "db.r6g.large"
  vpc_id        = module.vpc.id
  subnets       = module.vpc.private_subnets
}

# Metadata, lineage, quality
module "openmetadata" {
  source         = "../../modules/openmetadata"
  eks_kubeconfig = module.eks.kubeconfig
}

module "openlineage" {
  source         = "../../modules/openlineage"
  eks_kubeconfig = module.eks.kubeconfig
}

module "great_expectations" {
  source                 = "../../modules/great-expectations"
  eks_kubeconfig         = module.eks.kubeconfig
  lake_bucket            = module.s3_iceberg.bucket_lake
  expectations_s3_prefix = "dq/expectations/"
}

# Arrow Flight / gRPC gateway (your public API to QuantOS)
module "arrow_flight_gateway" {
  source         = "../../modules/arrow-flight-gateway"
  eks_kubeconfig = module.eks.kubeconfig
  depends_on     = [module.clickhouse, module.feast]
}
```

envs/dev/variables.tf (excerpt)
```hcl
variable "prefix" {
  type = string
}

# Single-line blocks may hold only one argument, so these are written out in full.
variable "aws_region" {
  type    = string
  default = "us-west-2"
}

variable "vpc_cidr" {
  type    = string
  default = "10.30.0.0/16"
}
```

2.2 Example module snippets
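Before the individual modules, it may help to spell out the outputs that the root wiring in 2.1 reads from the `vpc` and `eks` modules (`module.vpc.id`, `module.vpc.private_subnets`, `module.vpc.public_subnets`, `module.eks.name`, `module.eks.kubeconfig`). The output names come from that wiring; everything on the right-hand side below is a placeholder, since neither module's internals are shown here.

```hcl
# modules/vpc/outputs.tf (sketch; aws_vpc.this and the subnet resources are placeholder names)
output "id" {
  value = aws_vpc.this.id
}

output "private_subnets" {
  value = aws_subnet.private[*].id
}

output "public_subnets" {
  value = aws_subnet.public[*].id
}

# modules/eks/outputs.tf (sketch; aws_eks_cluster.this is a placeholder name)
output "name" {
  value = aws_eks_cluster.this.name
}

# Assumed to be a path to a kubeconfig file, because modules/redpanda
# passes eks_kubeconfig to the Helm provider's config_path.
output "kubeconfig" {
  value = local.kubeconfig_path # hypothetical local
}
```

Keeping these names stable is what lets a module's internals change without touching `envs/dev/main.tf`.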
modules/s3-iceberg/main.tf
```hcl
resource "aws_s3_bucket" "lake" {
  bucket = var.bucket_lake

  # Lets `terraform destroy` delete a non-empty bucket; convenient in dev, risky in prod.
  force_destroy = true
}

resource "aws_s3_bucket_versioning" "v" {
  bucket = aws_s3_bucket.lake.id
  versioning_configuration {
    status = "Enabled"
  }
}

# Optional: AWS Glue Data Catalog as Iceberg catalog
resource "aws_glue_catalog_database" "iceberg_db" {
  count = var.enable_glue_catalog ? 1 : 0
  name  = "${var.prefix}_iceberg"
}

output "bucket_lake" { value = aws_s3_bucket.lake.bucket }

# null when the Glue catalog is disabled
output "glue_catalog_name" { value = try(aws_glue_catalog_database.iceberg_db[0].name, null) }
```

modules/redpanda/main.tf (Helm on EKS)
```hcl
# Note: a module that contains its own provider block cannot be called with
# count, for_each, or depends_on. Configuring the provider in the root module
# avoids that restriction.
provider "helm" {
  kubernetes {
    config_path = var.eks_kubeconfig
  }
}

resource "helm_release" "redpanda" {
  name             = "redpanda"
  repository       = "https://charts.redpanda.com"
  chart            = "redpanda"
  namespace        = "streaming"
  create_namespace = true

  values = [yamlencode({
    statefulset = { replicas = 3 }
    storage     = { persistentVolume = { size = "500Gi" } }
    external    = { enabled = true }
  })]
}
```

modules/clickhouse/main.tf
```hcl
resource "helm_release" "clickhouse_operator" {
  name             = "clickhouse-operator"
  repository       = "https://charts.altinity.com"
  chart            = "altinity-clickhouse-operator"
  namespace        = "olap"
  create_namespace = true
}

resource "helm_release" "clickhouse" {
  name       = "clickhouse"
  repository = "https://charts.altinity.com"
  chart      = "clickhouse"
  namespace  = "olap"

  values = [yamlencode({
    replicas    = 3
    persistence = { size = var.storage_size }
    resources = {
      requests = { cpu = "4", memory = "16Gi" }
      limits   = { cpu = "8", memory = "32Gi" }
    }
  })]
}
```

modules/feast/values.yaml (conceptual)
```yaml
offlineStore:
  type: iceberg
  catalog: glue
  warehouse: s3://<lake-bucket>/
onlineStore:
  type: redis
  host: <redis-host>
  port: 6379
```

modules/arrow-flight-gateway/deployment.yaml (conceptual)
```yaml
apiVersion: apps/v1
kind: Deployment
metadata: { name: arrow-flight-gateway, namespace: serving }
spec:
  replicas: 2
  selector: { matchLabels: { app: flight-gw } }
  template:
    metadata: { labels: { app: flight-gw } }
    spec:
      containers:
        - name: flight
          image: ghcr.io/yourorg/arrow-flight-gateway:latest
          ports: [{ containerPort: 31337 }]
          env:
            - { name: CLICKHOUSE_DSN, valueFrom: { secretKeyRef: { name: ch-secrets, key: dsn } } }
            - { name: FEAST_ONLINE_ADDR, value: "redis:6379" }
            - { name: ICEBERG_CATALOG, value: "glue://..." }
---
apiVersion: v1
kind: Service
metadata: { name: arrow-flight-gateway, namespace: serving }
spec:
  type: LoadBalancer
  ports: [{ port: 31337, targetPort: 31337, name: flight }]
  selector: { app: flight-gw }
```
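The Deployment above expects a `serving` namespace and a Secret named `ch-secrets` with a `dsn` key to exist already. One possible way to provision both from the same module is sketched below using the Terraform Kubernetes provider. The `clickhouse_dsn` variable is hypothetical, and the sketch assumes the provider is already configured against the EKS cluster.

```hcl
# Hypothetical addition to modules/arrow-flight-gateway.
# Names mirror the manifest: namespace "serving", Secret "ch-secrets", key "dsn".

variable "clickhouse_dsn" {
  type      = string
  sensitive = true # hidden in plan output, but still stored in Terraform state
}

resource "kubernetes_namespace" "serving" {
  metadata {
    name = "serving"
  }
}

resource "kubernetes_secret" "ch_secrets" {
  metadata {
    name      = "ch-secrets"
    namespace = kubernetes_namespace.serving.metadata[0].name
  }

  # The provider base64-encodes `data` values, so pass the plain DSN here.
  data = {
    dsn = var.clickhouse_dsn
  }
}
```

Because the DSN ends up in Terraform state, teams that already run an external secrets manager may prefer to create `ch-secrets` there and keep only the namespace in Terraform.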