The platform at a glance
The single most important line in that diagram is Backend → runtime config → Fleet → results → Backend. The voice fleet fetches everything it needs at the start of each call and hands back a complete record at the end. It stores nothing in between. That one property is what lets the fleet scale horizontally and lets a crashed worker cost you exactly one call.
The four services
| Service | Plane | What it owns | Stack |
|---|---|---|---|
| Console | Operator surface | The dashboard operators use: bot builder, campaign builder, call logs, reports, settings, knowledge bases | React + Vite single-page app |
| Backend API | Control plane | Auth, bots, campaigns, system settings, per-call runtime config, durable call records, CRM integrations, API keys, recordings access, fleet selection | Python · FastAPI |
| Voice fleet | Runtime / media plane | Live call workers, transport adapters, the speech pipeline, recording upload, post-call packaging, result delivery | Python · FastAPI · Pipecat |
| Dialler | Campaign execution plane | Campaign leases, predictive pacing, retry scheduling, outbound SIP dialing, answering-machine screening, attaching answered calls to the fleet | Python · asyncio worker |
Control plane = the source of truth
The Backend and its data layer (MongoDB, Redis, vector DB, object storage) hold everything durable: who the bots are, what the campaigns contain, every call record. If it must survive a restart, it lives here.
Runtime plane = disposable muscle
The fleet and dialler do the heavy, real-time work but keep no durable state. They can be scaled out, restarted, or replaced freely — they always re-fetch config from the control plane.
The life of a call
Two call shapes converge on the same pipeline and the same post-call record. This is the flow that ties every service together.A call begins
Either the dialler places an outbound campaign call (paced so it never overruns the fleet or the carrier), or a customer dials in to a configured number. Both arrive at the telephony plane.
A worker picks it up
The call lands on exactly one free fleet worker. The worker immediately asks the Backend for that bot’s runtime config — prompts, voice settings, tools, and any CRM data for this contact.
The conversation runs
Inside the worker, the speech pipeline runs the loop: speech-to-text → language model (with tools) → text-to-speech, with voice-activity and turn detection deciding who’s speaking. The audio is recorded.
Inside a fleet host
The fleet’s scaling model is deliberately simple. Each host runs many single-call worker processes behind a reverse proxy that hands each worker one call at a time.| Property | Value | Why it matters |
|---|---|---|
| Calls per worker | 1 | A stuck or crashed worker can only ever lose one call, never a batch. |
| Worker count | N per host (commonly 16 on a 4-CPU / 8 GB host) | Total host capacity = number of workers. |
| Proxy rule | max_conns=1 | Stops a busy worker from being handed a second call. |
| Worker state | none | Workers re-fetch config every call, so any worker can take any call. |
How a change reaches production
Code does not move by hand. Every change flows through version control and a build server before it lands on the servers — see the developer and DevOps sections for the full detail.Where to go next
Run it locally
For developers: prerequisites, cloning the repos, and running the stack on your own machine.
Deploy it
For DevOps: the Bitbucket → Jenkins pipeline, host deployment, configuration, and the operations runbook.
Repository map
The four repositories, their boundaries, and the contracts between them.
Operations manual
For the team running day-to-day calling: bots, campaigns, calls, and reports.