Detection is the easy part
Most teams we meet have detection figured out. Cisco XDR raises a clean incident, the kill chain is laid out, and the affected host and user are right there. That part takes seconds.
It's everything after the alert that eats the afternoon. Someone has to decide to act. Someone has to get approval to pull a production machine off the network. Someone has to log it, change-control it, and make sure it's reversible. Detection took seconds; the response took the rest of the day.
We built an integration that closes that gap for a customer running Cisco XDR alongside ServiceNow. The idea is simple to say and surprisingly fiddly to get right: let an XDR incident drive containment — but keep a human approval in the middle, and keep change control where it already lives.
The one rule: automate the busywork, not the decision
We were clear from the start about what we wouldn't do. We would not let a workflow quarantine production endpoints on its own. Isolating the wrong machine at the wrong time is just a new incident.
So the work splits along a clean line. XDR does detection, classification, and the containment only it can perform. ServiceNow does what it's already the system of record for: the ticket, the change request, the approval. The automation removes the copy-paste and the waiting-room friction. It does not remove the human yes.
How it actually flows
An incident fires in XDR. The workflow enriches and classifies it against the customer's response rules, and decides whether this is even the kind of incident worth automating. Most aren't — those just get routed for analyst review with a note saying why.
The classification is deliberately conservative. A handful of signals decide the path — and the default, by far the most common, is "hand it to a human":
| Signal | What it means | Path |
|---|---|---|
| Critical + high score + relevant technique on a workstation | Likely a contained-blast-radius endpoint threat | Ticket → approval → contain device |
| Credential / privilege-escalation on a server | Lateral-movement risk | Ticket (P1 if service account, else P2) → log; identity action stays in ServiceNow |
| Anything else | The long tail | Route to analyst with a note |
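The table above can be read as a small decision function. The sketch below is illustrative only: the field names (`severity`, `score`, `relevantTechnique`, `category`, `assetType`, `isServiceAccount`) and the score cut-off are assumptions made for the example, not the XDR incident schema or the customer's actual response rules.

```javascript
// Sketch only: field names and the score cut-off are assumptions, not XDR schema.
const HIGH_SCORE_THRESHOLD = 800; // assumed meaning of "high score"

function classifyIncident(incident, highScore = HIGH_SCORE_THRESHOLD) {
  const {
    severity,
    score = 0,
    relevantTechnique = false,
    category,
    assetType,
    isServiceAccount = false,
  } = incident;

  // Row 1: critical, high score, relevant technique, on a workstation.
  if (
    severity === 'Critical' &&
    score >= highScore &&
    relevantTechnique &&
    assetType === 'workstation'
  ) {
    return { path: 'contain-device', steps: ['ticket', 'approval', 'contain'] };
  }

  // Row 2: credential or privilege-escalation activity on a server.
  const identityThreat =
    category === 'credential' || category === 'privilege-escalation';
  if (identityThreat && assetType === 'server') {
    return {
      path: 'ticket-and-log',
      priority: isServiceAccount ? 'P1' : 'P2',
      steps: ['ticket', 'log'],
    };
  }

  // Row 3, the default: hand it to a human, with a note saying why.
  return {
    path: 'analyst-review',
    note: 'No automation rule matched; routed for analyst review.',
  };
}
```

The ordering is what keeps it conservative: an incident has to match one of the two explicit rules, and anything that fails to match falls through to analyst review instead of to an action.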
The ones that qualify become a real ServiceNow incident, carrying the original XDR identifier so the two systems stay tied together. ServiceNow turns that into a change request and runs it through the same change advisory board (CAB) approval the org already trusts. Nothing about that process had to be weakened; the security flow just feeds into it.
And here's the part that matters: while ServiceNow runs the approval, the XDR side pauses. It's created the ticket and now it waits — on purpose — for a human. When the change is approved or rejected, ServiceNow calls back, and only then does XDR resume. Approve, and it contains the threat and logs the result. Reject, and it records that nothing was done, and why.
Why we made it wait
The tempting shortcut is to contain immediately and treat the change request as paperwork afterward. We didn't. An approval gate is worthless the moment the action happens before the approval. Holding the workflow open until the real decision comes back is what makes the gate a real gate: production stays untouched until a person signs off.
That pause-and-resume is the heart of it. It also has to be correlated carefully: when an approval comes back, the system has to resume the exact paused action it belongs to, even with several incidents in flight. A correlation key rides along from the moment the incident is created, so one decision can never resolve the wrong thing.
The callback itself is intentionally tiny — just enough to say which incident and what the human decided:
    {
      "incident_id": "incident-<id>",
      "action": "Approve",
      "change_number": "CHG00xxxxx",
      "approver": "j.doe"
    }
That's the whole contract crossing the boundary. The resume logic keys off incident_id to find the one paused action waiting on it, and action decides whether to contain or stand down.
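To make the correlation concrete, here is a minimal in-memory model of pause-and-resume keyed by `incident_id`. It is a sketch under assumptions, not how the XDR workflow engine stores paused actions: the `PendingActions` class and its handler callbacks are hypothetical names invented for the example.

```javascript
// Hypothetical in-memory model of paused actions, keyed by the correlation id.
class PendingActions {
  constructor() {
    this.pending = new Map(); // incident_id -> { onApprove, onReject }
  }

  // Called right after the ticket is created: register the action and wait.
  pause(incidentId, onApprove, onReject) {
    if (this.pending.has(incidentId)) {
      throw new Error('Already waiting on ' + incidentId);
    }
    this.pending.set(incidentId, { onApprove, onReject });
  }

  // Called when the callback arrives with the human decision.
  resume(decision) {
    const { incident_id: incidentId, action } = decision;
    const entry = this.pending.get(incidentId);
    if (!entry) {
      // Unknown or already-resolved incident: never act on a guess.
      return { resumed: false, reason: 'no paused action for this incident' };
    }
    if (action !== 'Approve' && action !== 'Reject') {
      // Unrecognized decision: leave the action paused.
      return { resumed: false, reason: 'unknown action' };
    }
    // Remove first, so a replayed callback cannot trigger the action twice.
    this.pending.delete(incidentId);
    const result =
      action === 'Approve' ? entry.onApprove(decision) : entry.onReject(decision);
    return { resumed: true, action, result };
  }

  get size() {
    return this.pending.size;
  }
}
```

With two incidents in flight, a decision for one resolves only that one, and a replayed callback for an already-resolved incident is reported as unmatched and triggers nothing.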
Keeping the two systems at arm's length
One discipline we held throughout: ServiceNow never holds XDR's credentials, and vice versa. A compromise of the change system shouldn't hand someone the security platform's response capability. The only thing that crosses the boundary is a single, narrowly scoped, authenticated callback: enough to signal XDR, never enough to drive it. The first time an auditor asks "what could the ServiceNow side do if it were breached?", that answer is short and reassuring.
On the ServiceNow side, that callback is just a few lines that fire once, on the approval decision — sending the decision out and holding nothing more than a webhook key:
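    // Fires once, on the transition into approved/rejected.
    // Holds only the callback URL and webhook key, read from system properties.
    var r = new sn_ws.RESTMessageV2();
    r.setEndpoint(gs.getProperty('xdr.callback_url'));
    r.setHttpMethod('POST');
    r.setRequestHeader('Content-Type', 'application/json');
    r.setRequestHeader('x-api-key', gs.getProperty('xdr.callback_key'));
    r.setRequestBody(JSON.stringify({
        incident_id: current.getValue('u_xdr_incident_id'), // correlation key
        action: current.approval == 'approved' ? 'Approve' : 'Reject',
        change_number: current.getValue('number'),
        approver: gs.getUserName() // user whose update triggered this script
    }));
    // Log a failed callback instead of dropping it silently; otherwise the
    // paused XDR action keeps waiting with no record of why.
    var response = r.execute();
    if (response.getStatusCode() >= 300) {
        gs.error('XDR callback failed for ' + current.getValue('number') +
            ': HTTP ' + response.getStatusCode());
    }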
No XDR API client, no response capability — just a one-way signal carrying the decision and the correlation id.
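On the receiving side, the same narrowness can be enforced before anything resumes. The following is a sketch of one way to check the callback, assuming a Node.js-style receiver; `validateCallback` and `keysMatch` are hypothetical helpers written for this example, and the real XDR webhook handling may look quite different.

```javascript
const crypto = require('node:crypto');

// Compare keys in constant time. Hashing first gives equal-length buffers,
// which crypto.timingSafeEqual requires.
function keysMatch(presented, expected) {
  const a = crypto.createHash('sha256').update(String(presented)).digest();
  const b = crypto.createHash('sha256').update(String(expected)).digest();
  return crypto.timingSafeEqual(a, b);
}

// Hypothetical validator: accepts only the four-field decision, nothing else.
function validateCallback(headers, rawBody, expectedKey) {
  if (!keysMatch(headers['x-api-key'] ?? '', expectedKey)) {
    return { ok: false, status: 401, error: 'invalid key' };
  }

  let payload;
  try {
    payload = JSON.parse(rawBody);
  } catch {
    return { ok: false, status: 400, error: 'body is not JSON' };
  }
  if (payload === null || typeof payload !== 'object') {
    return { ok: false, status: 400, error: 'body is not an object' };
  }

  const { incident_id, action, change_number, approver } = payload;
  if (typeof incident_id !== 'string' || incident_id === '') {
    return { ok: false, status: 400, error: 'missing incident_id' };
  }
  if (action !== 'Approve' && action !== 'Reject') {
    return { ok: false, status: 400, error: 'action must be Approve or Reject' };
  }

  // Copy only the known fields, so extra input never reaches the resume logic.
  return {
    ok: true,
    status: 200,
    decision: { incident_id, action, change_number, approver },
  };
}
```

The only inputs the caller can influence are which incident and which of two decisions, which is the "enough to signal, never enough to drive" property in code form.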
What only shows up once you build it
The happy path hides the hard cases. Approve-and-contain is the easy 80%. The real work is the reject path, the approval that never comes, the device you can't reach, the same incident arriving twice.
A couple of others worth passing on. Test environments love to silently auto-approve, which makes your callback look instant and convinces you the gate works when it has never once waited for a human. Confirming that it genuinely pauses was a test in itself. And reaching a system isn't the same as being allowed to act on it: the plumbing is the first 90%, but the permission model is what actually makes it work. Everything writes a record on both sides, because an automated response you can't audit is a liability, not a feature.
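Two of the hard cases mentioned above, the same incident arriving twice and the approval that never comes, are easy to illustrate. This is a sketch under stated assumptions: `openApproval` and `expireOverdue` are hypothetical helpers, the store is a plain `Map`, and treating an expired approval as "stand down and escalate to an analyst" is one conservative choice assumed for the example, not a description of what the integration does.

```javascript
// Hypothetical helpers; the Map stands in for whatever durable store is used.

// Open an approval for an incident, or return the existing one if the same
// incident arrives again (no second ticket, no second paused action).
function openApproval(store, incidentId, now, ttlMs) {
  if (store.has(incidentId)) {
    return { created: false, entry: store.get(incidentId) };
  }
  const entry = { incidentId, state: 'waiting', deadline: now + ttlMs };
  store.set(incidentId, entry);
  return { created: true, entry };
}

// Mark approvals whose deadline has passed. An expired approval is never
// treated as a yes; the caller is expected to stand down and escalate.
function expireOverdue(store, now) {
  const expired = [];
  for (const entry of store.values()) {
    if (entry.state === 'waiting' && now >= entry.deadline) {
      entry.state = 'expired';
      expired.push(entry.incidentId);
    }
  }
  return expired;
}
```

Passing `now` in explicitly rather than reading the clock inside the functions keeps both cases simple to test, which matters given how easily a test environment can hide the waiting path.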
What the customer gets
No more copying details between tools. Containment still needs a human yes — but it's one click, in the system they already use. The action is change-controlled, the CAB process intact, and both systems carry a clean, correlated audit trail. Mean time to respond drops, without trading away the control that makes the response safe.
That's the whole point. A lot of "SOAR" stories are really just "we removed the human." This isn't that. The human still holds the off switch — we just gave them their time back.
Detection that fires fast but a response that drags? Especially if "get approval to contain" is where your hours vanish — that's the gap we like to close.