Session - Apr 23 - IVR visual builder → prod + 11 staging fixes
Full ship day. The IVR visual builder feature (previously staging-only) landed on prod via PR #63 after clearing 11 staging UAT bugs, 1 auth bug, and 1 hand-edit of the prod Asterisk config. Production orgs: Om Chamber, GrandEstancia, Zauto AI. Zero customer-reported incidents.
This page is the reference for the fixes + troubleshooting paths uncovered. For the prod ship runbook, see Prod Direct-Edit. For the complete test suite, see Test Cases.
Shipped to prod (PR #62 + PR #63)
PR #62: Auth auto-logout fixes
| Fix | File(s) | Summary |
|---|---|---|
| Admin-impersonated JWT expiry now fires | editor/lib/auth/authStore.ts, editor/components/auth/AuthExpiryWatcher.tsx | Watcher was bailing on any admin key presence, silently letting the 24h impersonation JWT expire. Now schedules whenever pbx_org_token exists; handleUnauthorized distinguishes normal user vs impersonating admin vs pure admin. |
| 24h admin session auto-logout | editor/lib/auth/authStore.ts, editor/components/auth/AuthExpiryWatcher.tsx, editor/app/layout.tsx, editor/app/dashboard/[orgId]/layout.tsx, editor/app/dashboard/page.tsx | Tracks admin_session_start at login; at +24h, handleAdminSessionExpiry does full Firebase signOut + gateway key clear. Watcher moved to root layout to cover admin-only pages. |
| Sidebar + Settings logout full state-clear | editor/components/layout/Sidebar.tsx, editor/app/dashboard/[orgId]/settings/page.tsx | Previously cleared 3-4 keys, leaked auth_type/user_role/user_permissions etc. Now branches on isImpersonatingAdmin() — clears only impersonation state for impersonating admins, full logout otherwise. |
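A minimal sketch of how the three session kinds and the 24h cutoff could be derived from stored keys. It is not the code from authStore.ts: the rule that gateway_admin_key plus pbx_org_token means impersonation, the epoch-millisecond format of admin_session_start, and the names classifySession, msUntilAdminExpiry and KeyValueStore are all assumptions made for illustration.

```typescript
// Hypothetical sketch; the real logic lives in editor/lib/auth/authStore.ts.
type SessionKind = "user" | "impersonating-admin" | "admin" | "none";

// Minimal storage shape so the sketch also runs outside a browser.
interface KeyValueStore {
  getItem(key: string): string | null;
}

// Assumption: admin key + org token together mean "admin impersonating an org".
function classifySession(store: KeyValueStore): SessionKind {
  const hasAdminKey = Boolean(store.getItem("gateway_admin_key"));
  const hasOrgToken = Boolean(store.getItem("pbx_org_token"));
  if (hasAdminKey && hasOrgToken) return "impersonating-admin";
  if (hasAdminKey) return "admin";
  if (hasOrgToken) return "user";
  return "none";
}

const ADMIN_SESSION_MS = 24 * 60 * 60 * 1000;

// Milliseconds until the 24h admin session ends; 0 means it has already expired.
// Returns null when no admin session start was recorded.
function msUntilAdminExpiry(store: KeyValueStore, now: number): number | null {
  const raw = store.getItem("admin_session_start");
  if (raw === null) return null;
  const start = Number(raw); // assumed to be epoch milliseconds
  if (!isFinite(start)) return null;
  return Math.max(0, start + ADMIN_SESSION_MS - now);
}
```

In the browser, window.localStorage satisfies KeyValueStore, so classifySession(localStorage) works as-is; a watcher could then arm a timer for msUntilAdminExpiry(localStorage, Date.now()) milliseconds.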
PR #63: IVR visual builder (14 commits) + UAT bug fixes
Ten feature commits for IVR Phase 1/2 + React Flow builder + misc UX, plus four session-fix commits covering the bugs below:
| Bug | File(s) | Root cause | Fix |
|---|---|---|---|
| DIDs IVR dropdown didn't exist; input deselected per keystroke | editor/app/dashboard/[orgId]/dids/page.tsx | DestinationField declared inside DidsPage — React remounted the Input on every parent render. Also no ivr branch. | Hoisted DestinationField out of parent; added ivr branch that lists IVRs with value={ivr.id} so UUID is stored. |
| Call to IVR DID said "number not in service" | same | Free-text input let users type "7002" (extension) but dialplan looks up IVR by UUID. | Dropdown forces UUID. Plus one-time SQL update of the broken DID row on staging. |
| Max retries was free-text number | editor/app/dashboard/[orgId]/ivr/[ivrId]/page.tsx | UI gap. | Replaced with Select options "1/2/3 attempts" + helper text. |
| DIDs table showed raw IVR UUID | editor/app/dashboard/[orgId]/dids/page.tsx | Column rendered did.routing_destination verbatim. | Added displayDestination helper that resolves UUID → "<ext> — <name>". |
| IVR greeting silently failed (DID call hung up) | api/src/services/asterisk/dialplanGenerator.js | Background(greeting_ivr_<uuid>) used bare filename, but TTS writes under /var/lib/asterisk/sounds/greetings/ subdir. Asterisk only searches <astsoundsdir>/<lang>/. File never found. | Prefixed greetings/ in the generator. Also regenerated the 7002 .wav that had never been written on staging. |
| SIP phone → IVR extension = SIP 404 | api/src/services/asterisk/dialplanGenerator.js | Internal context included _outbound + _queue but not _ivr. | Added include => <prefix>_ivr. |
| Outbound SIP endpoint rejected by pjsip on staging | api/src/services/asterisk/sipTrunkService.js | Splatted every key in trunk.configuration as pjsip option. Staging row had {system_trunk:true, nuc_gateway:true, channels:10} → Asterisk error "Could not find option suitable for category ... named 'system_trunk'". | Added METADATA_KEYS deny list. Prod trunks had configuration={} so the change is a no-op for prod. |
| Zoiper dialing +919944... → "extension not found" | api/src/services/asterisk/dialplanGenerator.js | _X. pattern doesn't match literal + prefix. | Added _+X. catch-all at top of outbound context that strips + and Gotos ${EXTEN:1},1. |
| SIP → IVR extension (7002) went to PSTN instead of IVR | api/src/services/asterisk/dialplanGenerator.js | Internal context include order was _outbound → _queue → _ivr. Asterisk matches includes in declaration order, returns on first include with a match. _X. in outbound caught 7002 before _ivr was even searched. | Reversed: _ivr → _queue → _outbound. Exact-match contexts first, wildcard last. Also fixed an existing Om Chamber bug where dialling queue number 5001 from internal SIP went to PSTN. |
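The greeting fix couples two paths: the name passed to Background() and the file the TTS step writes. A minimal sketch of that coupling follows. The function names are hypothetical and the constants are copied from the paths quoted in the table above, so treat it as an illustration rather than the code in dialplanGenerator.js.

```typescript
// Assumed constants, taken from the paths quoted in the table above.
const SOUNDS_DIR = "/var/lib/asterisk/sounds";
const GREETINGS_SUBDIR = "greetings";

// Prompt name as used in the dialplan: greeting_ivr_<uuid>, no file extension.
function greetingPromptName(ivrId: string): string {
  return "greeting_ivr_" + ivrId;
}

// Application string for the dialplan: carries the subdirectory, never a bare filename.
function backgroundApp(ivrId: string): string {
  return "Background(" + GREETINGS_SUBDIR + "/" + greetingPromptName(ivrId) + ")";
}

// Where the TTS output has to exist for that Background() call to find it.
function expectedWavPath(ivrId: string): string {
  return SOUNDS_DIR + "/" + GREETINGS_SUBDIR + "/" + greetingPromptName(ivrId) + ".wav";
}
```

Deriving both strings from one helper keeps the dialplan reference and the file on disk from drifting apart again.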
Prod Asterisk hand-edit (not in monorepo)
One config file outside /opt/<app> was edited after explicit user authorization:
| File | Change | Reason |
|---|---|---|
| /etc/asterisk/ext_from_cloud.conf on prod 89.116.31.109 | Goto(org_mna9x47k__outbound,${EXTEN},1) → Goto(staging-outbound,${EXTEN},1) | Staging outbound calls were 403-Forbidden because the Goto target context didn't exist on prod (AstraPrivate was never provisioned as a prod org). staging-outbound context already existed and correctly dials PJSIP/${EXTEN}@tata_gateway. |
Backup saved at /etc/asterisk/ext_from_cloud.conf.bak-2026-04-23 on prod.
Troubleshooting reference (hard-won lessons)
"IVR call lands but no greeting plays, just hangs up"
Diagnostic:
1. asterisk -rx "dialplan show <ivr_ext>@<prefix>__ivr": is there a Background(...) priority?
2. Check the filename passed: Background(greetings/<prompt>). It must have the greetings/ prefix. A bare filename fails silently because Asterisk looks only in <astsoundsdir>/<lang>/, not in subdirs.
3. ls /var/lib/asterisk/sounds/greetings/<prompt>.wav: does the file exist?

Fix:
- If the path is missing the greetings/ prefix: bug in dialplanGenerator.js:567 (fixed in PR #63). Regenerate the org's dialplan via configDeploymentService.deployOrganizationConfiguration.
- If the file is missing: TTS was never generated. Either re-click "Generate greeting" in the IVR UI, or call TTSService directly.

"SIP phone dials IVR / queue extension → 404 Not Found"
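Diagnostic: run asterisk -rx "dialplan show <ext>@<org>_internal" and note which context, if any, the extension resolves in.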
Fix paths:
| Symptom | Cause | Fix |
|---|---|---|
| "There is no existence of context" | internal context missing or wrong prefix | Regenerate the org's dialplan |
| Resolves to _X. outbound instead of exact match | Include order places _outbound before the exact-match context | Fixed in PR #63. Regenerate for each org. |
| Resolves in _ivr / _queue but call still fails with 404 | pjsip endpoint not loaded for the dial target; check pjsip show endpoint <name> | Fix pjsip config (see next section) |
"Outbound call returns 403 / 404: pjsip endpoint unknown"
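Diagnostic: run asterisk -rx "pjsip show endpoint <trunk-peer-name>" ("Unable to find object" means the endpoint is not loaded), then search /var/log/asterisk/full.log for the load error.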
Common error: Could not find option suitable for category '...' named 'X' means pjsip rejected the endpoint config at load time due to an unknown option. The endpoint is then missing from runtime — any Dial(...@endpoint) returns congestion.
Most common cause (historically): sip_trunks.configuration column contains metadata keys (system_trunk, nuc_gateway, channels, etc.) that get splatted into the pjsip file. Fixed in PR #63 (sipTrunkService.js now filters these). After the fix is deployed, regenerate the org's config and asterisk -rx "pjsip reload".
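As a rough illustration of the deny-list idea (not the actual sipTrunkService.js code), the sketch below drops metadata keys before rendering options. The list contents beyond the three keys seen on staging, the key=value output format, and the function name renderPjsipOptions are assumptions.

```typescript
// Hypothetical deny list; these three keys are the ones seen in the staging row.
const METADATA_KEYS: string[] = ["system_trunk", "nuc_gateway", "channels"];

type TrunkConfiguration = { [key: string]: string | number | boolean };

// Render only the keys meant to be pjsip options, as "key=value" lines.
function renderPjsipOptions(configuration: TrunkConfiguration): string[] {
  const lines: string[] = [];
  for (const key of Object.keys(configuration)) {
    if (METADATA_KEYS.indexOf(key) !== -1) continue; // app metadata, not a pjsip option
    lines.push(key + "=" + String(configuration[key]));
  }
  return lines;
}

// The staging row renders to nothing, and so does an empty prod configuration.
const stagingLines = renderPjsipOptions({ system_trunk: true, nuc_gateway: true, channels: 10 });
const prodLines = renderPjsipOptions({});
```

A deny list keeps unknown-but-valid pjsip options flowing through unchanged; the trade-off is that any new metadata key has to be added to the list before it reaches a trunk row.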
"Admin impersonating — users don't load, no auto-logout"¶
Symptom: Admin impersonated an org 24h+ ago. Page shows empty users list and zero call stats. Not kicked to login.
Diagnostic:
// In browser devtools on editor.astradial.com
localStorage.getItem('pbx_org_token_exp') // < Date.now() = expired
localStorage.getItem('gateway_admin_key') // truthy = admin session active
JSON.parse(localStorage.getItem('org_access') ?? '{}').impersonating // true (the fallback avoids a TypeError when the key is absent)
If all three are true and you're not being redirected, the watcher isn't firing — pre-PR-#62 behaviour. Fix landed in PR #62: ship to prod via CI/CD.
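The three checks can also be wrapped in one predicate. This is a sketch, not code from the editor: the function name is hypothetical, and it assumes pbx_org_token_exp holds epoch milliseconds (it is compared against Date.now() above) and that org_access is JSON.

```typescript
// Minimal storage shape; window.localStorage satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
}

// True when an impersonating admin holds an expired org token, i.e. the state
// in which the pre-PR-#62 watcher failed to redirect to login.
function isStuckImpersonation(store: KeyValueStore, now: number): boolean {
  const exp = Number(store.getItem("pbx_org_token_exp")); // missing key becomes 0
  const tokenExpired = isFinite(exp) && exp > 0 && exp < now;
  const adminActive = Boolean(store.getItem("gateway_admin_key"));

  let impersonating = false;
  try {
    const access = JSON.parse(store.getItem("org_access") || "null");
    impersonating =
      access !== null && typeof access === "object" && access.impersonating === true;
  } catch (e) {
    impersonating = false; // malformed JSON counts as "not impersonating"
  }
  return tokenExpired && adminActive && impersonating;
}
```

In devtools this would be called as isStuckImpersonation(localStorage, Date.now()).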
"Outbound dial with + prefix (E.164) doesn't match"¶
Zoiper and other softphones dial international numbers with a leading +. Asterisk _X. patterns do NOT match + (it's a literal char, not a digit).
Fix (PR #63): outbound context gets a _+X. catch-all that strips + and re-enters at ${EXTEN:1},1.
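To see why _X. misses a leading + and what the catch-all does, here is a simplified matcher. It is a sketch under stated assumptions: only X, Z, N, a trailing . and literal characters are handled, the function names are made up, and +15550100 is a placeholder number.

```typescript
// Match an extension against a simplified Asterisk pattern.
// Supports X (0-9), Z (1-9), N (2-9), a trailing "." (one or more of anything)
// and literal characters; character sets like [1-5] are left out.
function matchesPattern(pattern: string, exten: string): boolean {
  if (pattern.charAt(0) !== "_") return pattern === exten; // not a pattern: exact match
  const body = pattern.slice(1);
  for (let i = 0; i < body.length; i++) {
    const p = body.charAt(i);
    if (p === ".") return exten.length > i; // needs at least one more character
    const c = exten.charAt(i);
    if (c === "") return false; // extension too short
    if (p === "X") { if (c < "0" || c > "9") return false; }
    else if (p === "Z") { if (c < "1" || c > "9") return false; }
    else if (p === "N") { if (c < "2" || c > "9") return false; }
    else if (p !== c) return false; // literal character, e.g. "+"
  }
  return exten.length === body.length;
}

// Mimics the outbound context: _+X. strips the first character (the
// ${EXTEN:1} step) and re-enters; _X. then accepts the digits.
function resolveOutbound(exten: string): string | null {
  if (matchesPattern("_+X.", exten)) {
    return resolveOutbound(exten.slice(1));
  }
  return matchesPattern("_X.", exten) ? exten : null;
}

const direct = matchesPattern("_X.", "+15550100"); // false: "+" is not a digit
const viaCatchAll = resolveOutbound("+15550100");   // "15550100"
```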
"Staging outbound hits prod but prod says 'congested' / 'number not in service'"¶
Flow: staging dials PJSIP/<num>@org_mna9x47k_tata → sends INVITE to prod's 10.10.10.1:5060 over WireGuard → prod matches endpoint cloud-endpoint-stage (by IP) → enters [from-cloud] context.
If prod's ext_from_cloud.conf Gotos a context that doesn't exist, the call dies. As of 2026-04-23 it Gotos staging-outbound (defined in ext_staging_outbound.conf), which dials tata_gateway. If that file is reverted or Gotos a per-org context for an org not provisioned on prod, staging outbound stops working. Check the Goto target in /etc/asterisk/ext_from_cloud.conf on prod, and confirm the target context is loaded with asterisk -rx "dialplan show <target-context>".
"Dialplan regen works on one org but fails on another with 'Unknown column'"
Check for schema drift. Staging has gone through more migration-like ALTERs than prod. Example seen today: prod ivrs table was missing greeting_text / greeting_language / greeting_voice columns because the idempotent ALTERs in the migration never ran on prod.
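One way to spot this kind of drift before a regen fails is to diff the column list against what the migration should have produced. A hedged sketch: the expected list comes from the example above, while the function name and the way the actual column list is obtained (for instance from information_schema.columns) are assumptions.

```typescript
// Columns the migration is expected to add to the ivrs table (from the example above).
const EXPECTED_IVR_COLUMNS: string[] = ["greeting_text", "greeting_language", "greeting_voice"];

// Return the expected columns that are absent from the actual column list.
function missingColumns(expected: string[], actual: string[]): string[] {
  return expected.filter(function (col) {
    return actual.indexOf(col) === -1;
  });
}

// Hypothetical column list for a table where only one of the three ALTERs ran.
const drift = missingColumns(EXPECTED_IVR_COLUMNS, ["id", "name", "greeting_text"]);
```

An empty result means the table has every column the migration adds; anything else names the columns to look for in the migration file.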
Fix:
scp api/database/migrations/<migration>.sql root@<vps>:/tmp/
mariadb -u<user> -p<pw> pbx_api_db < /tmp/<migration>.sql
The migrations are idempotent (CREATE TABLE IF NOT EXISTS, ADD COLUMN IF NOT EXISTS). Safe to re-run.
"Om Chamber queue 5001 call silently goes to PSTN"
This was a prod bug revealed by the include-order fix. Before PR #63, dialling queue number 5001 from an internal SIP phone on Om Chamber matched _X. in the outbound context (because _outbound was included first). The queue never received the call. After the fix, exact 5001 in _queue wins.
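The ordering effect can be reproduced with a toy model of include lookup. This assumes, as described above, that the first included context with a match wins; the context contents and function names are illustrative only.

```typescript
type Context = { name: string; patterns: string[] };

// Deliberately tiny matcher: exact extensions plus the "_X." wildcard only.
function matches(pattern: string, exten: string): boolean {
  if (pattern === "_X.") return /^[0-9].+$/.test(exten); // a digit, then one or more characters
  return pattern === exten;
}

// Walk the includes in declaration order and stop at the first context with a match.
function resolve(includes: Context[], exten: string): string | null {
  for (const ctx of includes) {
    for (const pattern of ctx.patterns) {
      if (matches(pattern, exten)) return ctx.name;
    }
  }
  return null;
}

const ivr: Context = { name: "ivr", patterns: ["7002"] };
const queue: Context = { name: "queue", patterns: ["5001"] };
const outbound: Context = { name: "outbound", patterns: ["_X."] };

const oldOrder = [outbound, queue, ivr]; // pre-PR-#63: wildcard context first
const newOrder = [ivr, queue, outbound]; // exact-match contexts first, wildcard last

const oldTarget = resolve(oldOrder, "5001"); // "outbound": the queue is never reached
const newTarget = resolve(newOrder, "5001"); // "queue"
```

With the new order, a number that matches no exact extension still falls through to the outbound wildcard, so PSTN dialling is unaffected.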
Verification after any dialplan change
Minimum verification after regenerating or editing dialplan:
# 1. Reload cleanly
asterisk -rx "dialplan reload" 2>&1 | tail
asterisk -rx "pjsip reload" 2>&1 | tail
# 2. Spot-check each routing type
asterisk -rx "dialplan show <queue>@<org>_internal" # should hit _queue
asterisk -rx "dialplan show <ivr_ext>@<org>_internal" # should hit _ivr
asterisk -rx "dialplan show <pstn_number>@<org>_internal" # should hit _outbound _X.
asterisk -rx "dialplan show _+X.@<org>_outbound" # should exist
asterisk -rx "pjsip show endpoint <trunk-peer-name>" # should be loaded, not "Unable to find object"
# 3. Tail logs for errors
tail -100 /var/log/asterisk/full.log | grep -iE "error|warning" | grep -v "tata_gateway/tag"
# 4. A live PSTN call — irreplaceable
Deployment timeline (for audit)
All times IST. Prod = 89.116.31.109, Staging = 94.136.188.221.
| Time | Action |
|---|---|
| ~14:06 | Backups taken on prod (editor, astrapbx, asterisk configs, DB dump of IVR tables) |
| ~14:07 | Direct patches to prod editor files: authStore.ts, AuthExpiryWatcher.tsx (auth fix #1) |
| ~14:27 | Direct patches: Sidebar.tsx, settings/page.tsx (manual logout full state-clear) |
| ~15:38 | Staging fix chain: dialplan greeting path, _ivr include, TTS regen for 7002, sipTrunk metadata filter |
| ~15:45 | Staging: _+X. catch-all for E.164 outbound |
| ~15:55 | Staging: include order reorder (_ivr first) |
| ~16:15 | Prod edit (user-authorized): ext_from_cloud.conf Goto target → staging-outbound |
| ~16:25 | PR #62 merged → deploy-editor-prod ran → live in 2m9s |
| ~16:40 | PR #63 merged → deploy-editor-prod + deploy-api-prod ran → live |
| ~16:45 | Prod ivrs schema migration: added 3 columns + enum extension |
| ~16:50 | Dialplan regenerated for Om Chamber, GrandEstancia, Zauto AI |
| ~16:55 | QA agent verification: GREEN (22/22 automated checks) |
| ~17:00 | Human-verified: Om Chamber IVR creation + call test |
Related docs
- Test Cases (full suite) — run these against any future IVR / dialplan change.
- Prod Direct-Edit — the runbook for working on prod VPS directly.
- Staging Direct-Edit — same idea for staging.
- Staging Environment — architecture.