Dograh: The Open-Source Vapi Alternative You Run on Your Own Server

What problem does it solve

“Dograh is an open source alternative to Vapi, not a clone though. Vapi/Retell are closed platforms; this is open source infra you self-host and modify. Like saying n8n is a clone of Zapier because they solve the same problem. Same category, but fundamentally different model.” —...

You know that feeling when your voice agent vendor charges per minute, routes your customers' audio through its infrastructure, and hands you a compliance checklist you cannot satisfy because the data never touched your servers? Every managed platform (Vapi, Retell, Bland) controls the audio pipeline, charges for usage, and owns the compliance certifications. Building the full stack yourself means stitching together Twilio webhooks, streaming STT, an LLM call loop, TTS synthesis, WebRTC signaling, and telephony provider SDKs, then maintaining all of it whenever a provider changes its API. Dograh wraps that entire integration in a Docker Compose setup with a visual workflow builder, so you own the deployment without assembling the plumbing from scratch.

voice-ai · open-source · python · self-hosted · telephony · webrtc · fastapi

How it works

You define your voice agent as a directed graph in the UI (start node, LLM call nodes, tool call nodes, conditional edges), and Dograh runs that graph on top of Pipecat, an open-source real-time audio pipeline. When a call arrives via Twilio, Vonage, or WebRTC, Pipecat handles audio framing and feeds the audio to your configured STT provider (Deepgram, Speechmatics) to get a transcript. That transcript goes to your LLM (OpenAI, Gemini, OpenRouter), the response goes to your TTS provider (Cartesia, Dograh native TTS), and the synthesized audio goes back to the caller. Per the maintainer, end-to-end latency is 500–600ms on fast model configurations. The entire stack runs in Docker Compose on a server you control: `docker compose up` brings up the FastAPI backend and the Next.js UI at port 3010.
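To make the "agent as a directed graph" idea concrete, here is a minimal sketch of nodes with conditional edges and a walker that follows the first edge whose condition holds. Everything in it is an assumption for illustration: `Node`, `walk`, `build_demo_graph`, and the node names are hypothetical and are not Dograh's or Pipecat's actual schema or API, and the "LLM" node is a keyword check standing in for a real model call.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical types for illustration only; Dograh's real workflow schema may differ.
State = Dict[str, str]
Edge = Tuple[Callable[[State], bool], str]  # (condition, target node name)


@dataclass
class Node:
    name: str
    kind: str  # "start", "llm", or "tool"
    run: Callable[[State], State]  # returns the keys this node adds or updates
    edges: List[Edge] = field(default_factory=list)  # checked in order


def walk(nodes: Dict[str, Node], start: str, state: State, max_steps: int = 20) -> List[str]:
    """Run nodes from `start`, following the first edge whose condition is true."""
    visited: List[str] = []
    current: Optional[str] = start
    for _ in range(max_steps):  # step limit guards against accidental cycles
        if current is None:
            break
        node = nodes[current]
        visited.append(node.name)
        state.update(node.run(state))
        current = next((target for cond, target in node.edges if cond(state)), None)
    return visited


def always(_: State) -> bool:
    return True


def build_demo_graph() -> Dict[str, Node]:
    def greet(state: State) -> State:
        return {"reply": "Hi, how can I help?"}

    def classify(state: State) -> State:
        # Stand-in for an LLM call: a keyword check on the transcript.
        text = state.get("transcript", "").lower()
        return {"intent": "refund" if "refund" in text else "other"}

    def lookup_order(state: State) -> State:
        # Stand-in for a tool call.
        return {"reply": "Looking up your order for a refund."}

    def fallback(state: State) -> State:
        return {"reply": "Let me connect you with a person."}

    return {
        "start": Node("start", "start", greet, [(always, "classify")]),
        "classify": Node("classify", "llm", classify, [
            (lambda s: s["intent"] == "refund", "lookup_order"),
            (always, "fallback"),
        ]),
        "lookup_order": Node("lookup_order", "tool", lookup_order),
        "fallback": Node("fallback", "llm", fallback),
    }


if __name__ == "__main__":
    state: State = {"transcript": "I want a refund for my order"}
    print(walk(build_demo_graph(), "start", state), state["reply"])
```

Ordering the edges and taking the first match keeps the conditional branch deterministic; a catch-all `always` edge at the end plays the role of a default route.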

Key takeaways

01. Multi-provider STT/LLM/TTS: you swap Deepgram for Speechmatics or OpenAI for Gemini without rewriting your agent logic, so a provider outage or price change does not force a rebuild.

02. Docker Compose deployment: `docker compose up` brings the full backend and UI online in one command on any server you control, with the Next.js UI accessible at port 3010.

03. Drag-and-drop workflow graph builder: you wire agent logic visually (start node → LLM node → conditional edge → tool call) instead of writing orchestration code, reducing time to first working agent.

04. Telephony and WebRTC both supported: Twilio, Vonage, Telnyx, Cloudonix, and Asterisk ARI all plug into the same agent graph, so phone and browser calls share one codebase.

05. MCP tool integration: v1.31.0 adds generic MCP (Model Context Protocol) tool sources with per-node function filtering, letting your agent call external services without writing custom connector code.

06. Embeddings-based RAG knowledge base: attach a document corpus to your agent to ground responses in your own data, without building a separate retrieval pipeline.

07. ElevenLabs Data Residency support: added in v1.29.0, routing TTS through a compliant endpoint for EU or HIPAA-adjacent audio processing without switching providers.
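The multi-provider point in the list above is easiest to see as an interface boundary: if the agent logic only talks to STT, LLM, and TTS interfaces, swapping a vendor is a one-line change. The sketch below assumes that design; the `SpeechToText`, `LanguageModel`, `TextToSpeech`, and `VoiceTurn` names and the fake providers are hypothetical and do not reflect how Dograh or Pipecat implement provider selection.

```python
from dataclasses import dataclass
from typing import Protocol


# Hypothetical interfaces for illustration; not Dograh's or Pipecat's real API.
class SpeechToText(Protocol):
    def transcribe(self, audio: bytes) -> str: ...


class LanguageModel(Protocol):
    def reply(self, transcript: str) -> str: ...


class TextToSpeech(Protocol):
    def synthesize(self, text: str) -> bytes: ...


@dataclass
class VoiceTurn:
    """One conversational turn: audio in, audio out, providers injected."""
    stt: SpeechToText
    llm: LanguageModel
    tts: TextToSpeech

    def handle(self, audio: bytes) -> bytes:
        transcript = self.stt.transcribe(audio)
        answer = self.llm.reply(transcript)
        return self.tts.synthesize(answer)


# Fake providers so the sketch runs without network access or API keys.
class FakeSttA:
    def transcribe(self, audio: bytes) -> str:
        return audio.decode("utf-8")


class FakeSttB:
    def transcribe(self, audio: bytes) -> str:
        return audio.decode("utf-8").strip().lower()


class EchoLlm:
    def reply(self, transcript: str) -> str:
        return f"You said: {transcript}"


class BytesTts:
    def synthesize(self, text: str) -> bytes:
        return text.encode("utf-8")


if __name__ == "__main__":
    turn = VoiceTurn(stt=FakeSttA(), llm=EchoLlm(), tts=BytesTts())
    print(turn.handle(b"Hello"))
    turn.stt = FakeSttB()  # swap the STT provider; handle() is unchanged
    print(turn.handle(b" Hello "))
```

In a real deployment each fake would wrap a vendor SDK or HTTP client, and streaming audio would replace the single `bytes` payload used here for brevity.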

Should you care?

Who it’s for

If you are building voice automation for BPO call centres, inbound support lines, or outbound dialing campaigns and you need call audio to stay on your own servers, Dograh gives you the managed-platform feature set under a BSD 2-Clause license you can modify and ship. It is also a fit if you want to avoid per-minute SaaS pricing at high call volume. It is not the right pick if you need SOC 2/HIPAA/PCI compliance certifications out of the box (Vapi and Bland cover this), or if you need sub-400ms turn latency (Bland claims 400ms vs. Dograh's 500–600ms on fast models per the maintainer).

Worth exploring

Worth a serious look if your use case requires self-hosted voice infrastructure and you have engineering time to patch the three open security audit issues (#330, #331, #340) before going live. The core Docker setup, workflow builder, and provider integrations work — but ElevenLabs TTS is currently broken (issue #334), so use Cartesia or Deepgram TTS as your default stack. If you need production deployment without hands-on patching, wait 1–2 months for the security backlog to clear.

