AI Bot Integration (AudioSocket + Pipecat)

This guide documents how AstraPBX connects phone calls to AI voice bots via Asterisk AudioSocket and the Pipecat Gateway.


Overview

When a caller dials extension 1003 (or a queue times out with failover to 1003), the call is connected to an AI voice bot powered by Pipecat + Gemini Live. The audio path uses Asterisk's native AudioSocket for reliable bidirectional PCM audio.

graph LR
    A[Caller] --> B[Asterisk PBX]
    B -->|ARI Stasis| C[AstraPBX Node.js]
    C -->|AudioSocket TCP| B
    B -->|Raw PCM 8kHz| C
    C -->|Binary WebSocket| D[Pipecat Gateway]
    D -->|Gemini Live API| E[Google AI]

Architecture

Audio Flow

Caller <-> Asterisk <-> AudioSocket (TCP, raw PCM) <-> Node.js Relay <-> WebSocket (binary PCM) <-> Pipecat

| Segment | Protocol | Format |
|---|---|---|
| Caller to Asterisk | SIP/RTP | G.711 ulaw |
| Asterisk to Node.js | AudioSocket (TCP) | 16-bit signed linear PCM, 8kHz mono |
| Node.js to Pipecat | WebSocket (binary) | 16-bit signed linear PCM, 8kHz mono |
| Pipecat to Gemini | WebSocket | Gemini Live native audio |

AudioSocket Protocol

Asterisk's AudioSocket sends/receives messages over TCP with a simple framing protocol:

[1 byte type][2 bytes length (big-endian)][payload]

| Type | Hex | Description |
|---|---|---|
| UUID | 0x01 | Connection UUID (16 bytes binary) |
| Audio | 0x10 | PCM audio data |
| Hangup | 0x00 | Call ended |
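
The framing above can be sketched in a few lines. This is a minimal illustration in Python rather than the relay's actual Node.js code; the names `encode_frame` and `FrameParser` are hypothetical. Because TCP is a byte stream, one read may contain a partial frame or several frames, so the parser buffers input until a full frame is available.

```python
import struct

# Message types from the AudioSocket table in this guide.
TYPE_HANGUP = 0x00
TYPE_UUID = 0x01
TYPE_AUDIO = 0x10


def encode_frame(msg_type: int, payload: bytes = b"") -> bytes:
    """Build one frame: 1-byte type, 2-byte big-endian length, payload."""
    if len(payload) > 0xFFFF:
        raise ValueError("payload does not fit in a 16-bit length field")
    return struct.pack(">BH", msg_type, len(payload)) + payload


class FrameParser:
    """Reassembles frames from a TCP byte stream (hypothetical helper)."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def feed(self, data: bytes) -> list[tuple[int, bytes]]:
        """Append received bytes and return every complete (type, payload) frame."""
        self._buf += data
        frames = []
        while len(self._buf) >= 3:
            msg_type, length = struct.unpack_from(">BH", self._buf, 0)
            if len(self._buf) < 3 + length:
                break  # header seen, but the payload has not fully arrived yet
            payload = bytes(self._buf[3:3 + length])
            del self._buf[:3 + length]
            frames.append((msg_type, payload))
        return frames
```

A relay would call `feed()` on every TCP read and forward only `TYPE_AUDIO` payloads to the WebSocket, treating `TYPE_HANGUP` as the signal to close both sides.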

AstraPBX Custom Serializer

Instead of using Pipecat's Twilio serializer (which adds ulaw, base64, and JSON overhead), AstraPBX uses a custom AstraPBXSerializer that passes raw PCM directly in binary WebSocket frames. This removes the ulaw, base64, and JSON conversion steps in each direction and reduces latency.

| Direction | Twilio Path | AstraPBX Path |
|---|---|---|
| Caller → Bot | PCM → ulaw → base64 → JSON → parse → base64 decode → ulaw → PCM | PCM → binary WS → PCM |
| Bot → Caller | PCM → ulaw → base64 → JSON → parse → base64 decode → ulaw → PCM | PCM → binary WS → PCM |

The serializer is selected automatically: the gateway uses it when the WebSocket start message's customParameters contain provider: "astrapbx".


Components

1. AstraPBX API (Node.js) — ariClient.js

The handleAiAgentCall method in ariClient.js orchestrates the connection:

  1. Answers the channel immediately (prevents hangup during WSS setup)
  2. Creates a TCP server on a random port for AudioSocket
  3. Opens a WebSocket to the Pipecat Gateway URL
  4. Sends Twilio-format handshake (connected + start events) for transport detection
  5. Sets Asterisk channel variables AUDIOSOCKET_UUID and AUDIOSOCKET_SERVICE
  6. Redirects channel to the [audiosocket-ai] dialplan context via continueInDialplan
  7. Relays audio: AudioSocket TCP PCM ↔ Binary WebSocket PCM

// Key flow in handleAiAgentCall:
await this.answerChannel(channel.id);           // Answer immediately
const tcpServer = net.createServer(/* ... */);   // AudioSocket TCP relay
ws = new WebSocket(wssUrl);                      // Connect to pipecat
// Send handshake, set channel vars, continueInDialplan to audiosocket-ai
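
The Twilio-format handshake in step 4 can be sketched as two JSON text messages sent before any binary audio. The sketch is in Python for consistency with the gateway-side code. The `connected` and `start` event names follow Twilio's Media Streams message format, but the exact fields AstraPBX sends are not shown in this guide, so `build_handshake`, the stream ID scheme, and the `mediaFormat` values are assumptions.

```python
import json
import uuid


def build_handshake(call_id: str, sample_rate: int = 8000) -> list[str]:
    """Return the two JSON text messages sent before any binary audio."""
    stream_sid = f"astrapbx-{uuid.uuid4()}"  # hypothetical ID scheme
    connected = {"event": "connected", "protocol": "Call", "version": "1.0.0"}
    start = {
        "event": "start",
        "streamSid": stream_sid,
        "start": {
            "streamSid": stream_sid,
            "callSid": call_id,
            # Marks the connection so the gateway picks AstraPBXSerializer.
            "customParameters": {"provider": "astrapbx"},
            # Assumed values: advertise the raw PCM format used on this path.
            "mediaFormat": {
                "encoding": "audio/x-l16",
                "sampleRate": sample_rate,
                "channels": 1,
            },
        },
    }
    return [json.dumps(connected), json.dumps(start)]
```

Sending these as text frames and all audio as binary frames lets the receiver tell control messages and PCM apart by frame type alone.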

2. Asterisk Dialplan — extensions.conf

The [audiosocket-ai] context handles the AudioSocket connection:

[audiosocket-ai]
exten => s,1,NoOp(AudioSocket AI Agent: UUID=${AUDIOSOCKET_UUID} Service=${AUDIOSOCKET_SERVICE})
 same => n,Answer()
 same => n,AudioSocket(${AUDIOSOCKET_UUID},${AUDIOSOCKET_SERVICE})
 same => n,Hangup()

The dialplan generator creates AI routing for users with routing_type: 'ai_agent':

; Generated for extension 1003 (AI Bot user)
exten => 1003,1,NoOp(Calling AI Bot)
 same => n,Stasis(pbx_api,ai_agent,ws://bots.astradial.com/ws/...)
 same => n,Hangup()
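
A generator for the lines above might look like the following sketch. The function name `ai_agent_dialplan` and its parameters are hypothetical; only the output format is taken from the generated example. The comma check is an added precaution, since a comma in the URL would be read by Asterisk as an extra Stasis argument.

```python
def ai_agent_dialplan(extension: str, gateway_url: str, stasis_app: str = "pbx_api") -> str:
    """Render the extensions.conf lines that hand an extension to the ARI app."""
    # Dialplan application arguments are comma-separated, so a comma inside
    # the URL would split it into two arguments.
    if "," in gateway_url:
        raise ValueError("gateway URL must not contain a comma")
    lines = [
        f"exten => {extension},1,NoOp(Calling AI Bot)",
        f" same => n,Stasis({stasis_app},ai_agent,{gateway_url})",
        " same => n,Hangup()",
    ]
    return "\n".join(lines)
```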

3. Pipecat Gateway — pipeline.py

The gateway detects AstraPBX connections and uses the custom serializer:

is_astrapbx = call_data.get("body", {}).get("provider") == "astrapbx"
if is_astrapbx:
    serializer = AstraPBXSerializer()
else:
    serializer = TwilioFrameSerializer(...)

4. AstraPBX Serializer — astrapbx_serializer.py

The serializer lives in gateway/astrapbx_serializer.py and handles binary WebSocket frames containing raw 16-bit PCM:

  • serialize(): Converts AudioRawFrame to raw PCM bytes
  • deserialize(): Converts raw PCM bytes to InputAudioRawFrame
  • Resampling: Handles sample rate conversion if pipeline rate differs from 8kHz
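
The three responsibilities above can be illustrated with a self-contained sketch. It does not use Pipecat's real classes: `AudioFrame` is a stand-in for the Pipecat frame types, `RawPcmSerializer` is a hypothetical name, and the linear-interpolation resampler is a simple placeholder with no anti-alias filter, not necessarily what the real serializer uses.

```python
from array import array
from dataclasses import dataclass


@dataclass
class AudioFrame:
    """Stand-in for Pipecat's audio frame types (not the real classes)."""
    audio: bytes          # 16-bit signed PCM, mono
    sample_rate: int
    num_channels: int = 1


def resample_linear(pcm: bytes, src_rate: int, dst_rate: int) -> bytes:
    """Resample mono 16-bit PCM by linear interpolation (no anti-alias filter)."""
    if src_rate == dst_rate or not pcm:
        return pcm
    src = array("h")  # native byte order; little-endian on typical hosts
    src.frombytes(pcm)
    n_out = max(1, len(src) * dst_rate // src_rate)
    out = array("h")
    for i in range(n_out):
        pos = i * src_rate / dst_rate       # fractional index into the source
        lo = int(pos)
        hi = min(lo + 1, len(src) - 1)
        frac = pos - lo
        out.append(round(src[lo] * (1 - frac) + src[hi] * frac))
    return out.tobytes()


class RawPcmSerializer:
    """Sketch of a pass-through serializer for binary WebSocket frames."""

    def __init__(self, wire_rate: int = 8000, pipeline_rate: int = 8000) -> None:
        self.wire_rate = wire_rate          # rate on the AstraPBX side
        self.pipeline_rate = pipeline_rate  # rate the bot pipeline expects

    def serialize(self, frame: AudioFrame) -> bytes:
        """Bot to caller: frame to raw PCM bytes at the wire rate."""
        return resample_linear(frame.audio, frame.sample_rate, self.wire_rate)

    def deserialize(self, data: bytes) -> AudioFrame:
        """Caller to bot: raw PCM bytes to a frame at the pipeline rate."""
        audio = resample_linear(data, self.wire_rate, self.pipeline_rate)
        return AudioFrame(audio=audio, sample_rate=self.pipeline_rate)
```

When both rates are 8 kHz the bytes pass through untouched, which is the case the latency argument for this serializer relies on.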

Database Configuration

AI Bot User

-- User record for AI Bot extension
INSERT INTO users (id, org_id, username, extension, routing_type, routing_destination)
VALUES (
  '73c9ef8e-003f-420a-a742-fe710186ccf0',
  'ba50c665-7ab4-4f04-a301-eccc395dc42b',
  'AI Bot',
  '1003',
  'ai_agent',
  'ws://bots.astradial.com/ws/66c1779e-f56c-490a-aba3-f2506d2093a8/a40df67c-aab6-4122-8575-7e1a0aeab240?key=ak_...'
);

Queue Failover to AI

The reception queue (5001) fails over to the AI bot when no agents answer:

UPDATE queues SET
  timeout_destination = '1003',
  timeout_destination_type = 'extension'
WHERE id = '4552f60b-cc18-4c47-8bee-2d95b94512d2';

The dialplan generator produces failover routing:

; Queue timeout — route to AI bot
exten => 5001,n(timeout),NoOp(Queue timeout - routing to failover: 1003)
exten => 5001,n,Goto(org_mnd5khym__internal,1003,1)

Call Flow: Inbound to AI Bot via Queue Failover

graph TD
    A[Caller dials 08065978002] --> B[Tata PSTN → NUC → WireGuard → Cloud Asterisk]
    B --> C[Match DID → Route to Queue 5001]
    C --> D[Ring Hari ext 1001 - 15s]
    D -->|No answer| E[Ring Surya ext 1002 - 15s]
    E -->|No answer| F[Queue timeout → Failover to 1003]
    F --> G[Stasis: ai_agent with WSS URL]
    G --> H[Node.js: Answer channel, open WSS to Pipecat]
    H --> I[AudioSocket TCP relay established]
    I --> J[Pipecat Gateway: AstraPBXSerializer]
    J --> K[Gemini Live: AI responds with voice]
    K --> L[Audio flows bidirectionally]
    D -->|Answer| M[Call connected to Hari]
    E -->|Answer| N[Call connected to Surya]

Bugs Fixed During Setup

WSS Connection Timeout (DNS)

The Contabo server's default DNS resolvers (213.136.95.x) took about 6 seconds to resolve Cloudflare-hosted domains. With that delay added to connection setup, the 10-second WSS timeout expired before the connection could complete.

Fix: Added Google/Cloudflare DNS as primary resolvers:

# /etc/systemd/resolved.conf.d/dns.conf
[Resolve]
DNS=8.8.8.8 1.1.1.1
FallbackDNS=213.136.95.10 213.136.95.11

DNS resolution went from 6185ms to 28ms.
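
A quick way to check resolver latency from the server is to time a single lookup. The helper below is a minimal sketch (the name `time_dns_lookup` is hypothetical); it goes through the system resolver, so a cached answer will report a much lower figure than a cold lookup.

```python
import socket
import time


def time_dns_lookup(hostname: str, port: int = 443) -> float:
    """Return how long one getaddrinfo() call takes, in milliseconds."""
    start = time.perf_counter()
    socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    return (time.perf_counter() - start) * 1000.0


if __name__ == "__main__":
    print(f"{time_dns_lookup('bots.astradial.com'):.0f} ms")
```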

Channel Hangup Before WSS Connected

The channel sat unanswered in Stasis while the WSS connection was being established. Asterisk/caller timed out and hung up (cause 32).

Fix: Answer the channel immediately before starting WSS setup:

// Before: answered after WSS connected (too late)
// After: answer immediately
await this.answerChannel(channel.id);
// Then set up WSS, TCP server, etc.

AudioSocket Protocol Type Bytes Wrong

The initial implementation used 0x10 for UUID and 0x11 for audio. The correct values are 0x01 for UUID and 0x10 for audio. As a result, audio frames (type 0x10) were handled as UUID messages, logged as binary garbage, and never forwarded to Pipecat.

StasisEnd Killing WSS Connection

When the channel moved from Stasis to the AudioSocket dialplan (via continueInDialplan), the handleStasisEnd handler closed the WebSocket to Pipecat. For AI agent calls, StasisEnd only means the channel moved to a different dialplan context; the call is still active.

Fix: Skip cleanup in handleStasisEnd for AI agent calls and clean up on the actual ChannelHangupRequest instead.
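
The rule can be sketched as a small registry that defers cleanup for AI agent channels until hangup. This is an illustration in Python, not the ariClient.js code; `CallRegistry` and its method names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class CallRegistry:
    """Tracks per-channel cleanup callbacks (hypothetical helper)."""
    _cleanups: dict[str, Callable[[], None]] = field(default_factory=dict)
    _ai_channels: set[str] = field(default_factory=set)

    def register(self, channel_id: str, cleanup: Callable[[], None],
                 ai_agent: bool = False) -> None:
        self._cleanups[channel_id] = cleanup
        if ai_agent:
            self._ai_channels.add(channel_id)

    def on_stasis_end(self, channel_id: str) -> None:
        # AI agent channels leave Stasis for [audiosocket-ai]; the call is still up.
        if channel_id in self._ai_channels:
            return
        self._run_cleanup(channel_id)

    def on_hangup(self, channel_id: str) -> None:
        # A real hangup always tears the session down, AI agent or not.
        self._run_cleanup(channel_id)

    def _run_cleanup(self, channel_id: str) -> None:
        self._ai_channels.discard(channel_id)
        cleanup = self._cleanups.pop(channel_id, None)
        if cleanup is not None:
            cleanup()  # runs at most once per channel
```

Popping the callback before calling it makes cleanup idempotent, so a duplicate hangup event does not close the WebSocket twice.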

Queue Members Showing "Invalid" State

Static queue members were written in a 5-field format that included a paused value. Asterisk expects 4 fields here: interface,penalty,membername,state_interface.

Fix: Removed the paused field from generateQueueMemberString() and removed the invalid monitor-join=yes option from the queue config.
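
A four-field member string can be built as in the sketch below. It is written in Python for illustration and is not the actual generateQueueMemberString() code; the `PJSIP/1001` interface in the usage line and the choice to default state_interface to the interface are assumptions.

```python
def queue_member_string(interface: str, penalty: int = 0,
                        member_name: str = "", state_interface: str = "") -> str:
    """Join the four fields: interface,penalty,membername,state_interface."""
    # Assumption: fall back to the interface when no state interface is given.
    fields = [interface, str(penalty), member_name, state_interface or interface]
    for value in fields:
        if "," in value:
            raise ValueError(f"field must not contain a comma: {value!r}")
    return ",".join(fields)


member_line = "member => " + queue_member_string("PJSIP/1001", 0, "Hari")
```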


Date

Setup completed: 2026-03-31