Kill Switch Recovery
The kill switch is an emergency stop that blocks all agent traffic instantly. This runbook covers activation, investigation, and recovery.
Activation
Via dashboard
Settings → Kill Switch → Activate
The kill switch can be scoped:
| Scope | Effect |
|---|---|
| All | Blocks all agent traffic across all connectors |
| Connector (e.g., github) | Blocks all traffic to a specific connector |
| Agent (e.g., DevOps Bot) | Blocks all traffic from a specific agent |
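To make the scoping rules concrete, here is a minimal Python sketch of how a scoped kill switch decides which requests it matches. The `KillSwitch` class, its field names, and the `blocks` method are hypothetical and only illustrate the table above; they are not TameFlare's actual schema or API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class KillSwitch:
    """Hypothetical model of an active kill switch (not TameFlare's real schema)."""

    scope: str                    # "all", "connector", or "agent"
    target: Optional[str] = None  # connector or agent name; unused for "all"

    def blocks(self, agent: str, connector: str) -> bool:
        """Return True if a request from `agent` to `connector` is in scope."""
        if self.scope == "all":
            return True
        if self.scope == "connector":
            return connector == self.target
        if self.scope == "agent":
            return agent == self.target
        raise ValueError(f"unknown scope: {self.scope!r}")


# A connector-scoped switch blocks every agent, but only on that connector.
github_only = KillSwitch(scope="connector", target="github")
print(github_only.blocks(agent="DevOps Bot", connector="github"))  # True
print(github_only.blocks(agent="DevOps Bot", connector="stripe"))  # False
```

The narrower the scope, the less legitimate traffic is interrupted while you investigate.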
Via CLI
The kill switch is managed from the dashboard, not from the CLI. Use tf logs to open the dashboard quickly.
Warning
The kill switch takes effect immediately. All matching in-flight requests are terminated, and matching pending approvals are denied.
What happens when the kill switch is active
| Component | Behavior |
|---|---|
| Gateway | All matching requests return 403 with X-TameFlare-Kill-Switch: active header |
| Dashboard | Kill switch status shown in Settings and header health indicator |
| Audit log | kill_switch.activated event recorded with scope and activating user |
| Pending approvals | Matching pending approvals are auto-denied |
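Agent code may want to tell a kill switch denial apart from an ordinary policy denial, for example to stop retrying. The sketch below checks for the 403 status and the X-TameFlare-Kill-Switch: active header described in the table. The function name is hypothetical, and the case-insensitive comparison of the header value is an assumption.

```python
from typing import Mapping

KILL_SWITCH_HEADER = "X-TameFlare-Kill-Switch"


def blocked_by_kill_switch(status: int, headers: Mapping[str, str]) -> bool:
    """Return True if a gateway response looks like a kill switch denial.

    HTTP header names are case-insensitive, so normalize them before lookup.
    """
    normalized = {name.lower(): value for name, value in headers.items()}
    value = normalized.get(KILL_SWITCH_HEADER.lower(), "")
    return status == 403 and value.strip().lower() == "active"


# Example: a blocked response versus an ordinary 403.
print(blocked_by_kill_switch(403, {"X-TameFlare-Kill-Switch": "active"}))  # True
print(blocked_by_kill_switch(403, {"Content-Type": "application/json"}))   # False
```

An agent that sees this response should stop and wait for an operator instead of retrying, since retries will keep failing until the kill switch is deactivated.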
Investigation runbook
After activating the kill switch, follow this process to investigate and recover:
1. Confirm the kill switch is active
tf status
# Should show: Kill switch: ACTIVE (scope: all)
2. Review recent traffic
Open the dashboard traffic page (tf logs or visit tameflare.com/dashboard/traffic).
Look for:
- Unusual action types (e.g., github.repo.delete, stripe.transfer.create)
- High request volume from a single agent
- Actions that were denied by policy but retried repeatedly
- Requests to unexpected domains
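If you have the traffic records available outside the dashboard, two of the patterns above (high volume from one agent, and denied actions retried repeatedly) are easy to check with a short script. This is a sketch only: the record fields `agent`, `action`, and `decision`, and both thresholds, are assumptions and not a documented TameFlare format.

```python
from collections import Counter
from typing import Dict, Iterable, Mapping, Tuple

Record = Mapping[str, str]  # hypothetical fields: "agent", "action", "decision"


def noisy_agents(records: Iterable[Record], threshold: int) -> Dict[str, int]:
    """Return agents with at least `threshold` requests, with their counts."""
    counts = Counter(r["agent"] for r in records)
    return {agent: n for agent, n in counts.items() if n >= threshold}


def repeated_denials(
    records: Iterable[Record], min_retries: int = 3
) -> Dict[Tuple[str, str], int]:
    """Return (agent, action) pairs denied at least `min_retries` times."""
    denied = Counter(
        (r["agent"], r["action"]) for r in records if r["decision"] == "denied"
    )
    return {pair: n for pair, n in denied.items() if n >= min_retries}


# Illustrative data only; real records come from your own traffic export.
sample = [
    {"agent": "DevOps Bot", "action": "github.repo.delete", "decision": "denied"},
    {"agent": "DevOps Bot", "action": "github.repo.delete", "decision": "denied"},
    {"agent": "DevOps Bot", "action": "github.repo.delete", "decision": "denied"},
    {"agent": "Billing Bot", "action": "stripe.transfer.create", "decision": "allowed"},
]
print(noisy_agents(sample, threshold=3))
print(repeated_denials(sample))
```

Tune both thresholds to your normal traffic; a value that flags a loop for one agent may be routine for another.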
3. Check the audit log
# Via dashboard: Audit Log page
# Filter by event type, agent, or time range
# Export to CSV for offline analysis
Key events to look for:
- action.denied - what was the agent trying to do?
- action.allowed - were any dangerous actions allowed before the kill switch?
- approval.responded - were any approvals granted that shouldn't have been?
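Once the audit log is exported to CSV, the three key event types can be pulled out for offline review. The sketch below groups rows by event type using Python's standard csv module. The column names `event_type`, `agent`, and `timestamp` are assumptions; check the header row of your actual export and adjust them.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, List

KEY_EVENTS = ("action.denied", "action.allowed", "approval.responded")


def group_key_events(csv_text: str) -> Dict[str, List[Dict[str, str]]]:
    """Group audit rows by event type, keeping only the key event types.

    Assumes a header row that includes an `event_type` column (hypothetical).
    """
    grouped: Dict[str, List[Dict[str, str]]] = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["event_type"] in KEY_EVENTS:
            grouped[row["event_type"]].append(dict(row))
    return dict(grouped)


# Illustrative export; real data comes from the Audit Log page.
export = (
    "timestamp,event_type,agent\n"
    "2024-01-01T10:00:00Z,action.allowed,DevOps Bot\n"
    "2024-01-01T10:00:05Z,action.denied,DevOps Bot\n"
    "2024-01-01T10:01:00Z,kill_switch.activated,\n"
)
for event_type, rows in group_key_events(export).items():
    print(event_type, len(rows))
```

To read from a file instead, pass the file's contents, for example `group_key_events(open("audit.csv", newline="").read())`, where `audit.csv` stands in for your exported file.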
4. Identify the root cause
| Symptom | Likely cause | Action |
|---|---|---|
| Agent making unexpected API calls | Prompt injection or hallucination | Review agent prompts, add policy rules |
| High volume of requests | Agent in a loop | Fix agent code, add rate limit policy |
| Requests to unknown domains | Agent discovered new APIs | Add connector or block domain |
| Approved actions that shouldn't have been | Approver error or social engineering | Review approval policies, restrict approver group |
5. Fix the issue
Before deactivating the kill switch:
- Update policies to prevent the problematic behavior
- Revoke or suspend the affected agent if needed
- Rotate credentials if you suspect credential compromise
- Add stricter permissions for the affected connector
6. Deactivate the kill switch
In the dashboard: Settings → Kill Switch → Deactivate
Note
Only users with the owner role can activate or deactivate the kill switch.
7. Verify recovery
# Check status
tf status
# Run a test action
tf run -- curl https://api.github.com/zen
# Should succeed if permissions are configured
Notification options
TameFlare can notify you when the kill switch is activated:
| Method | Configuration |
|---|---|
| Slack | Configure Slack integration in Settings. Kill switch events are sent to the configured channel. |
| Webhook | Set webhook_url on action requests. Kill switch denials include the reason in the webhook payload. |
| Dashboard | Kill switch status is shown in the header health indicator with auto-refresh (10s). |
| Audit log | kill_switch.activated and kill_switch.deactivated events are always recorded. |
Preventing false alarms
To avoid unnecessary kill switch activations:
- Use scoped kill switches - block one agent or connector instead of everything
- Use the monitor enforcement level first - log without blocking to understand traffic patterns
- Set up approval workflows - require human approval for high-risk actions instead of blocking everything
- Review policies regularly - overly permissive policies lead to unexpected behavior
Next steps
- Security - full security model and threat analysis
- Troubleshooting - common issues and solutions
- Audit Log - review events after an incident