
Fail2Ban Runbook

Operational reference for the fail2ban configuration protecting the PROD cloud Asterisk. The setup is intentionally split so that real attacks get banned fast while legitimate customer retry storms (IP changes, network blips, softphone session starts) cannot trip the ban threshold.

Why two asterisk jails

Asterisk's PJSIP emits "No matching endpoint found" on every legitimate first REGISTER; that is the standard SIP handshake (request without auth, server responds 401, client retries with auth). The same log line is also emitted by username scanners doing brute-force discovery.

A single jail can't tell them apart by log line alone — it can only distinguish by rate. So we split:

Jail | Catches | maxretry | findtime | bantime
asterisk-auth | Real auth failures: wrong password, ACL violation, security events, "hacking attempt detected" | 3 | 10 min | 24 h
asterisk-scan | Endpoint-discovery patterns: "No matching endpoint found", "No matching peer", "Not a local domain", extension-not-found | 50 | 1 h | 1 h

A real attacker doing username scanning produces hundreds of probes per minute → still banned fast by asterisk-scan. A customer PBX with 10 phones re-registering after an IP change produces ~10–30 such lines in a few seconds → well under 50 in an hour → no ban.
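
To judge which side of the threshold a given source sits on, per-IP counts of the scan pattern can be pulled from the log. The following is a minimal sketch, not part of the deployed tooling: `count_scan_hits` is a hypothetical helper, it assumes the PJSIP NOTICE line carries the source as `failed for 'IP:port'` (IPv4 only), and it counts over whatever input it is given rather than over a findtime window.

```shell
# Hypothetical helper: count "No matching endpoint found" lines per source IP.
# Assumed log shape (IPv4 only):
#   ... Request 'REGISTER' from '...' failed for '203.0.113.7:5060' (callid: ...) - No matching endpoint found
# Reads log lines on stdin; prints "COUNT IP", busiest source first.
count_scan_hits() {
    grep -F 'No matching endpoint found' \
        | sed -n "s/.*failed for '\([0-9.]*\):[0-9]*'.*/\1/p" \
        | sort | uniq -c | sort -rn \
        | awk '{ print $1, $2 }'
}
```

On PROD this would be run as `count_scan_hits < /var/log/asterisk/messages.log`. A source with thousands of hits is a scanner; a customer IP with a few dozen is a retry storm.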

File layout on PROD (89.116.31.109)

Path | Contents
/etc/fail2ban/jail.local | Jail definitions for asterisk-auth, asterisk-scan, sshd plus [DEFAULT] ignoreip list
/etc/fail2ban/filter.d/asterisk-auth.local | Regex set for real auth failures
/etc/fail2ban/filter.d/asterisk-scan.local | Regex set for endpoint-discovery patterns
/etc/fail2ban/filter.d/asterisk.conf | Distro-default monolithic filter; disabled, kept on disk for reference
/etc/fail2ban/jail.local.bak-YYYY-MM-DD-* | Pre-change backups
/var/log/fail2ban.log | Action log (bans, unbans, reloads)
/var/log/asterisk/messages.log | Source the jails monitor
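
For orientation, the two asterisk jail definitions in jail.local might look roughly like the output of the sketch below. It is reconstructed from the thresholds and paths stated in this runbook, not copied from the live file: the function name is hypothetical, findtime and bantime are written in seconds, and the sshd jail plus any port, action, or backend settings are left out because they are not documented here.

```shell
# Sketch only: prints jail definitions reconstructed from this runbook's tables.
# NOT a copy of the live /etc/fail2ban/jail.local. Port, action and backend
# settings are omitted because the runbook does not state them.
print_split_jails_sketch() {
    cat <<'EOF'
[DEFAULT]
# VSEVEN HOTELS (V7): current and previous uplink
ignoreip = 127.0.0.1/8 ::1 59.93.255.0/24 103.197.113.0/24

[asterisk-auth]
enabled  = true
filter   = asterisk-auth
logpath  = /var/log/asterisk/messages.log
maxretry = 3
findtime = 600
bantime  = 86400

[asterisk-scan]
enabled  = true
filter   = asterisk-scan
logpath  = /var/log/asterisk/messages.log
maxretry = 50
findtime = 3600
bantime  = 3600
EOF
}
```

Printing the sketch next to the live file (`print_split_jails_sketch | less`) is a quick way to spot a threshold that has drifted from what this runbook documents.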

Customer-PBX whitelist (ignoreip)

Customer PBXes are a special case: they have many endpoints behind one public IP, and when their ISP-assigned IP changes (common in India for customers without a static IP), all endpoints re-register simultaneously. Even with a 50-strike scan threshold, a large enough storm can still trigger a ban, and a ban on that one IP cuts off every endpoint behind it. The ignoreip list is the belt-and-suspenders safeguard against that.

Current entries (as of 2026-05-11):

ignoreip = 127.0.0.1/8 ::1 59.93.255.0/24 103.197.113.0/24

CIDR | Customer | Notes
127.0.0.0/8, ::1 | localhost | Always whitelisted
59.93.255.0/24 | VSEVEN HOTELS (V7) | Current uplink
103.197.113.0/24 | VSEVEN HOTELS (V7) | Previous uplink; kept for resilience if their ISP rotates back

Whitelist at most a /24 (256 addresses), never a /16 (65,536 addresses). A /16 is too broad: it would whitelist many unrelated hosts on the same ISP, defeating the purpose.
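
The address counts follow directly from the prefix length: an IPv4 /N covers 2^(32-N) addresses. A small sketch (the function name `cidr_size` is made up for illustration):

```shell
# Number of IPv4 addresses covered by a prefix length (0-32).
cidr_size() {
    echo $(( 1 << (32 - $1) ))
}

for prefix in 16 24 28 29; do
    echo "/$prefix covers $(cidr_size "$prefix") addresses"
done
# /16 covers 65536 addresses
# /24 covers 256 addresses
# /28 covers 16 addresses
# /29 covers 8 addresses
```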

Adding a new customer

When onboarding a customer whose PBX has a known public IP / range:

  1. Note the customer's public-IP block (ask their IT; typical SMB has a /29 or /28 from the ISP)
  2. SSH to PROD:
    ssh root@89.116.31.109
    cp /etc/fail2ban/jail.local /etc/fail2ban/jail.local.bak-$(date +%F)
    
  3. Edit /etc/fail2ban/jail.local:
       • Add the CIDR to the ignoreip = … line under [DEFAULT]
       • Add a comment above the line documenting which customer it's for
  4. Reload:
    fail2ban-client reload
    fail2ban-client get asterisk-auth ignoreip   # verify
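
A typo in jail.local can leave the jails broken after a reload, so it can be worth testing the configuration first. The sketch below assumes a fail2ban version whose client supports the `-t` (test configuration) option; `reload_if_valid` and the `F2B` override variable are hypothetical names introduced here, not part of the PROD setup.

```shell
# F2B is a hypothetical override so the function can be rehearsed with a stub.
F2B="${F2B:-fail2ban-client}"

# Reload only if the configuration parses, then print the ignoreip list
# of each asterisk jail so the new CIDR can be checked by eye.
reload_if_valid() {
    if ! "$F2B" -t; then
        echo "configuration test failed; not reloading" >&2
        return 1
    fi
    "$F2B" reload || return 1
    for jail in asterisk-auth asterisk-scan; do
        "$F2B" get "$jail" ignoreip
    done
}
```

Run `reload_if_valid` on PROD in place of the bare reload. If the test step fails, the running jails are left untouched.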
    

Removing an entry

Same flow: back up jail.local, delete the CIDR (and its comment) from the ignoreip line, then reload. The change takes effect as soon as the reload completes.

Common operations

Check status

ssh root@89.116.31.109

fail2ban-client status                       # list all jails
fail2ban-client status asterisk-auth         # counters for strict jail
fail2ban-client status asterisk-scan         # counters for lenient jail
fail2ban-client get asterisk-auth ignoreip   # current whitelist
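
For a quick health check it is often enough to see just the ban counter. A minimal sketch, assuming the usual `fail2ban-client status <jail>` layout with a "Currently banned:" line; `currently_banned` is a hypothetical helper name.

```shell
# Hypothetical helper: extract the "Currently banned" figure from
# `fail2ban-client status <jail>` output supplied on stdin.
currently_banned() {
    awk -F':' '/Currently banned/ { gsub(/[ \t]/, "", $2); print $2 }'
}
```

Example: `fail2ban-client status asterisk-scan | currently_banned`.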

Unban an IP

fail2ban-client set asterisk-auth unbanip <IP>
fail2ban-client set asterisk-scan unbanip <IP>
# or — unban an IP from every jail at once:
fail2ban-client unban <IP>

Add a temporary runtime ignoreip (lost on restart)

fail2ban-client set asterisk-auth addignoreip <IP>
fail2ban-client set asterisk-scan addignoreip <IP>
# remove:
fail2ban-client set asterisk-auth delignoreip <IP>
fail2ban-client set asterisk-scan delignoreip <IP>

For a permanent entry, edit jail.local instead.
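
Because the runtime entry has to be added to, and later removed from, both asterisk jails, a small wrapper avoids forgetting one of them. This is a sketch only; `runtime_ignore` and the `F2B` override are hypothetical names.

```shell
# F2B is a hypothetical override so the wrapper can be dry-run (e.g. F2B=echo).
F2B="${F2B:-fail2ban-client}"

# Apply addignoreip or delignoreip to both asterisk jails.
# Usage: runtime_ignore add|del IP-or-CIDR
runtime_ignore() {
    case "$1" in
        add|del) ;;
        *) echo "usage: runtime_ignore add|del IP-or-CIDR" >&2; return 2 ;;
    esac
    if [ -z "$2" ]; then
        echo "usage: runtime_ignore add|del IP-or-CIDR" >&2
        return 2
    fi
    for jail in asterisk-auth asterisk-scan; do
        "$F2B" set "$jail" "${1}ignoreip" "$2" || return 1
    done
}
```

Setting `F2B=echo` first prints the commands instead of running them, which is a cheap way to check the arguments before touching PROD.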

Watch live ban activity

tail -f /var/log/fail2ban.log
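
The raw log also contains "Found" lines and events from the sshd jail. To watch only ban and unban events for the two asterisk jails, the stream can be filtered. The sketch assumes the standard fail2ban.log action lines of the form `[jail] Ban IP`, `[jail] Unban IP`, and `[jail] Restore Ban IP`; the function name is hypothetical.

```shell
# Keep only ban/unban events for the two asterisk jails.
# Reads fail2ban.log lines on stdin.
asterisk_ban_events() {
    grep -E '\[asterisk-(auth|scan)\] (Restore )?(Ban|Unban) '
}
```

Example: `tail -f /var/log/fail2ban.log | asterisk_ban_events`.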

Test whether a regex matches a log line

fail2ban-regex /var/log/asterisk/messages.log /etc/fail2ban/filter.d/asterisk-auth.local
fail2ban-regex /var/log/asterisk/messages.log /etc/fail2ban/filter.d/asterisk-scan.local

Useful when adding a new regex or debugging "why didn't this get banned" / "why did this get banned".

Tuning guidance

If you find legitimate traffic getting banned by asterisk-scan:

  • First check whether the customer's CIDR is in ignoreip. If a known production customer is hitting the threshold, whitelist them.
  • Don't lower the strict-jail (asterisk-auth) threshold. It is already at 3, and going lower is dangerous.
  • Consider raising asterisk-scan maxretry from 50 → 100 if multiple customers have flash crowds. Don't go above 200; at that point the scan jail is effectively disabled.

If you find a real attacker NOT getting banned fast enough:

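  • Check findtime. The scan jail uses 1 h with maxretry 50, so a slow scanner doing 10 probes/hour never reaches the threshold and evades the ban. Shortening findtime makes this worse, because even fewer probes fall inside the window. Lengthen it instead (for example to 21600, i.e. 6 h, so that 10 probes/hour adds up to 60) or lower maxretry, keeping in mind that either change also moves customer retry storms closer to the threshold.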
  • Add a recidive jail — fail2ban has a built-in [recidive] jail that bans repeat offenders banned by other jails. Worth enabling if scanner abuse rises.
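
How findtime and maxretry interact can be checked with simple arithmetic before touching the config. The sketch below uses a deliberately rough model: a source probing at a steady rate is banned once the number of matches inside one findtime window reaches maxretry. `would_ban` is a hypothetical helper, and the 6 h window is an illustrative value rather than a tested recommendation.

```shell
# Rough model: steady-rate source, banned when hits within one findtime
# window reach maxretry.
# Usage: would_ban PROBES_PER_HOUR FINDTIME_SECONDS MAXRETRY
would_ban() {
    hits=$(( $1 * $2 / 3600 ))   # matches inside one window, rounded down
    if [ "$hits" -ge "$3" ]; then
        echo "ban"
    else
        echo "no ban"
    fi
}

would_ban 6000 3600  50   # fast scanner (~100 probes/min), 1 h window   -> ban
would_ban 10   3600  50   # slow scanner, 1 h window (10 hits)           -> no ban
would_ban 10   600   50   # slow scanner, 10 min window (1 hit)          -> no ban
would_ban 10   21600 50   # slow scanner, 6 h window (60 hits)           -> ban
```

Real traffic is bursty rather than steady, so treat the result as a first estimate and confirm against the log.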

Rollback to the pre-split single jail

If anything goes wrong:

ssh root@89.116.31.109
cp /etc/fail2ban/jail.local.bak-<date>-pre-split /etc/fail2ban/jail.local
rm /etc/fail2ban/filter.d/asterisk-auth.local /etc/fail2ban/filter.d/asterisk-scan.local
fail2ban-client reload

The distro-default [asterisk] jail will resume; it will re-create the false-ban risk for IP-change retry storms, so plan a replacement before rolling back.
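
A slightly more defensive variant of the rollback is sketched below, under stated assumptions: it first saves the current split config so the rollback itself can be undone, moves the two filter files aside instead of deleting them, and reloads only if the config test passes. `rollback_to`, `F2B`, and `F2B_ETC` are hypothetical names, and `fail2ban-client -t` is assumed to be supported by the installed version.

```shell
# Hypothetical overrides so the procedure can be rehearsed in a scratch directory.
F2B="${F2B:-fail2ban-client}"
F2B_ETC="${F2B_ETC:-/etc/fail2ban}"

# Usage: rollback_to BACKUP_FILE
rollback_to() {
    backup=$1
    if [ ! -f "$backup" ]; then
        echo "backup not found: $backup" >&2
        return 1
    fi
    stamp=$(date +%F-%H%M%S)
    # Save the split config first so the rollback can be reversed.
    cp "$F2B_ETC/jail.local" "$F2B_ETC/jail.local.bak-$stamp-pre-rollback" || return 1
    cp "$backup" "$F2B_ETC/jail.local" || return 1
    # Move the split filters aside rather than deleting them.
    for f in asterisk-auth asterisk-scan; do
        if [ -f "$F2B_ETC/filter.d/$f.local" ]; then
            mv "$F2B_ETC/filter.d/$f.local" \
               "$F2B_ETC/filter.d/$f.local.disabled-$stamp" || return 1
        fi
    done
    # Reload only if the restored configuration parses.
    "$F2B" -t && "$F2B" reload
}
```

It is called with the path of the chosen pre-split backup as its only argument.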

History

Date | Change
2026-05-06 | Bumped maxretry from 3 → 10 on the monolithic [asterisk] jail as a stopgap after the PJSIP-reload + fail2ban storm incident. See Error 52.
2026-05-11 | Split into asterisk-auth (strict) + asterisk-scan (lenient) + customer ignoreip whitelist after V7's IP change triggered another false-ban. See Error 55.