ragweld

Open-source MLOps Engineering Platform for retrieval + agent systems.

API-first orchestration for retrieval, training, evals, tracing, and model routing, with MCP built in for agent workflows.

Synthetic Data Lab, dual training studios, semantic cache controls, and guardrailed indexing in one operational surface.

Operator tool

Crucible plans GPU fit, wall-clock time, and spend before you burn the credits.

It is the range-planning surface for expensive training runs: VRAM envelopes, provider comparisons, and source-backed cost assumptions in one place.

Open Crucible

• API-first platform
• MCP integration
• Synthetic Data Lab
• Benchmarks + Evals
• Eval Drilldowns
• Run Diffs + Analysis
• Dual training studios
• Tracing + Grafana
• Webhooks + Alerts
• Semantic cache + Recall gates
• Model routing + catalog

Live Demo (Epstein Files 1 corpus)

Live search and chat are backed by the Epstein Files 1 corpus. Append ?mock=1 to the URL for offline demo mode.

What ragweld is now

A full MLOps engineering platform for retrieval + agent systems. The product center is API-first orchestration for indexing, retrieval, evals, training, tracing, and routing.

MCP support is first-class, but it sits on top of the API contract rather than replacing it. The panels below map core workflows to their matching docs destinations.

API docs first → MCP integration docs → Parameter glossary →

RAG retrieval generation panel showing HTTP and MCP model routing overrides.

API-first core • API > MCP

Use the HTTP API as the contract. MCP rides on top of it.

ragweld is API-first in production: routing, retrieval, evals, and observability all anchor to /api. MCP is built in for agent tooling, but the API surface is the primary integration layer.

API docs MCP integration

Indexing screen with embedding mismatch warnings, index contract lock notices, and model assignment controls.

Indexing guardrails

Deep tuning surface, but guarded against easy failures

Embedding mismatch detection, index contract locking, and guided reindex actions keep high-option indexing from drifting silently in production.
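
As a rough illustration of what embedding mismatch detection can involve, the sketch below compares the embedding settings recorded when an index was built against the currently configured ones. The `IndexContract` shape and the `check_contract` helper are hypothetical names chosen for this example, not ragweld's actual API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexContract:
    """Embedding settings recorded at index build time (hypothetical shape)."""
    model: str
    dimensions: int


def check_contract(index: IndexContract, configured: IndexContract) -> list[str]:
    """Return mismatch warnings; an empty list means the index matches the config."""
    warnings = []
    if index.model != configured.model:
        warnings.append(
            f"embedding model changed: {index.model} -> {configured.model}"
        )
    if index.dimensions != configured.dimensions:
        warnings.append(
            f"embedding dimensions changed: {index.dimensions} -> {configured.dimensions}"
        )
    return warnings


# Example: the index was built with one model, the config now names another.
built = IndexContract(model="model-a", dimensions=768)
current = IndexContract(model="model-b", dimensions=1024)
for warning in check_contract(built, current):
    print("reindex recommended:", warning)
```

Any non-empty result would be a natural trigger for a guided reindex prompt rather than silently querying with incompatible vectors.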

Indexing manual Indexing config reference

Learning Agent Studio with run HUD status panel, run timeline, and training controls.

Training studios

Train reranker and agent adapters in-product

Dual training studios support live telemetry, promotion gates, rollback paths, and run inspection so quality moves are operational, not guesswork.

Training workflow Training config

Graph explorer with entity table and interactive relationship visualization.

Graph + retrieval

Inspect entities, relationships, and retrieval context directly

Graph explorer and retrieval controls are in the same workbench, so structural signals are measurable, debuggable, and part of normal iteration.

Retrieval overview Graph search config

Chat recall settings with retrieval gate toggles, intensity controls, and recency weighting.

Recall + semantic cache

Control memory retrieval, not just chat history storage

Smart gating and intensity controls decide when conversational memory is injected, while semantic cache controls token spend in repeated query patterns.
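
To make the semantic cache idea concrete, here is a minimal sketch that reuses a stored answer when a new query embedding is close enough to a cached one. The `SemanticCache` class, the cosine-similarity lookup, and the 0.95 default threshold are assumptions for illustration; the page does not describe how ragweld implements its cache.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SemanticCache:
    """Hypothetical in-memory cache keyed by query embedding similarity."""

    def __init__(self, threshold=0.95):
        self.threshold = threshold  # minimum similarity that counts as a hit
        self.entries = []           # list of (embedding, answer) pairs

    def get(self, embedding):
        """Return the answer of the most similar cached query, or None on a miss."""
        best_answer, best_score = None, self.threshold
        for cached, answer in self.entries:
            score = cosine(embedding, cached)
            if score >= best_score:
                best_answer, best_score = answer, score
        return best_answer

    def put(self, embedding, answer):
        """Store an answer under the embedding of the query that produced it."""
        self.entries.append((list(embedding), answer))
```

A hit skips the model call entirely, which is where the token savings on repeated query patterns would come from; a lower threshold trades more hits for a higher risk of serving an answer to a different question.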

Chat settings reference Semantic cache config

Infrastructure services panel with Postgres, Neo4j, Prometheus, and runtime controls.

Infra + observability

Operate services and diagnostics from one runtime surface

Infrastructure controls, service states, and observability integration stay in the same operator UI so debug loops stay short when routing, indexing, or model behavior shifts.

Operations docs Observability docs

Workbench snapshots

Current production surfaces across API routing, indexing, training, graph inspection, and recall policy.

RAG routing panel with API-first model routing and MCP override controls. — API routing control plane

Route model traffic per channel with API-first defaults, MCP overrides, and provider readiness visible in one place.

API docs

Code indexing panel with embedding mismatch alerts and model controls. — Indexing guardrails in production mode

Embedding mismatch detection, index contract lock controls, and guided reindex actions reduce high-cost indexing errors.

Indexing manual

Learning Agent Studio showing run HUD status panel, run timeline, and training controls. — Learning Agent Studio run HUD

Track completed and failed runs side by side with run status, duration, active model path, and promotion controls.

Training reference

Chat recall settings panel with retrieval gate controls and intensity options. — Recall gating controls

Control how conversational memory is indexed and injected with intensity, recency, and skip-behavior gates.

Chat config docs

Knowledge graph canvas showing entities and connections for retrieval debugging. — Knowledge graph canvas

Inspect neighborhood structure directly to validate entity relationships and graph-retrieval behavior.

Graph retrieval docs

Dock chooser modal listing workspace tabs for custom layout composition. — Dock chooser for custom layouts

Recompose the workspace quickly by docking the exact tabs and diagnostics needed for the current investigation.

UI manual

Under the hood

Three-leg retrieval is the spine: vector, sparse, and graph signals are fused and reranked, then surfaced through a workbench that lets you measure and iterate.

Tri-brid retrieval engine

Index once, then retrieve through vector, sparse, and graph legs in parallel. Fusion and reranking happen with inspectable knobs so changes are measurable instead of opaque.
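
One common way to fuse ranked lists from several retrieval legs is reciprocal rank fusion (RRF). The sketch below is a stand-in to show the shape of the fusion step; the page does not say which fusion method ragweld uses, and the function name and the `k=60` constant are assumptions for this example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of document ids into one list, best first.

    Each document scores 1 / (k + rank) in every list it appears in;
    the scores are summed across lists. Ties break on document id.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# One ranked list per leg (illustrative ids only).
vector_hits = ["a", "b", "c"]
sparse_hits = ["b", "a", "d"]
graph_hits = ["b", "c", "a"]

fused = reciprocal_rank_fusion([vector_hits, sparse_hits, graph_hits])
print(fused)  # ['b', 'a', 'c', 'd']
```

Here `k` is the kind of inspectable knob the paragraph above refers to: larger values flatten the advantage of top-ranked results, so a document that appears in several legs can outrank one that tops a single leg.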

Vector + sparse + graph retrieval orchestration

Learning reranker and agent studio in the same loop

Per-run inspection with configurable promotion gates

MLOps iteration loop

The workbench is built for repeated cycles: synthetic data generation, eval runs, tracing, and routing updates all feed the same decision surface. API routes stay primary; MCP rides on top for agent clients.

Synthetic data recipes + eval dataset workflows

Tracing + Grafana support for live diagnostics

Semantic cache and recall gates for token efficiency

Choose your path

ragweld is open source, self-hostable, and built as an API-first MLOps Engineering Platform.

Open source

Self-hosted / on-prem

MIT licensed. Run it on your infrastructure. Keep your corpora, embeddings, and model traffic where you want them.

• API-first orchestration with MCP support layered in
• Three-leg retrieval: vector + sparse + graph
• Synthetic Data Lab + eval workflows
• Dual training studios: reranker + agent adapters
• Benchmark + eval suites with run comparisons
• Deep config + parameter glossary + tooltips
• Multi-corpus + per-corpus overrides
• Learning reranker: LoRA training + yes/no logits scoring
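
Yes/no logits scoring usually means asking a language model whether a document answers the query and comparing the logits it assigns to the "yes" and "no" tokens. The sketch below shows only the scoring arithmetic, assuming those two logits have already been extracted; the prompt, the model call, and the function names are hypothetical and not taken from ragweld.

```python
import math


def yes_no_score(yes_logit: float, no_logit: float) -> float:
    """Two-way softmax over the yes/no logits, equal to sigmoid(yes - no)."""
    diff = yes_logit - no_logit
    # Branch on the sign so math.exp never receives a large positive argument.
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    e = math.exp(diff)
    return e / (1.0 + e)


def rerank(candidates):
    """Sort (doc_id, yes_logit, no_logit) triples by relevance score, best first."""
    scored = [(doc_id, yes_no_score(y, n)) for doc_id, y, n in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

The score lands in the range 0 to 1, which makes it easy to threshold or to compare across training runs of a reranker adapter.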

Docs GitHub

Enterprise

Managed deployments + custom integrations

For teams pushing RAG into production: deployment patterns, hardened ops, and integration work tailored to your stack.

• Cloud or on-prem deployments (your network, your controls)
• Observability + alerting hooks that plug into your on-call
• Benchmarks/evals as a release gate (regressions don’t ship)
• API contracts first; MCP tooling for agent ecosystems
• MCP + automation workflows for agents and internal tools
• Trainable reranking on Apple Silicon (MLX) + Linux fallback

Contact enterprise Launch demo

Ready to see ragweld in action?

Explore the live demo, start from the API docs, then layer MCP integrations where they fit your agent stack.

Launch demo Docs Contact enterprise