
The Assumed-Breach Red Team Engagement: What It Actually Looks Like

This is what an assumed-breach engagement actually looks like from the buyer's seat — the rhythm of the work, what your blue team will see, and what you get at the end.

Who this is for

This post is for the person whose CISO just said "we should run a red team next quarter" and whose immediate question is "what does that actually look like?" If you've never bought a red team engagement before, the proposals all read the same and the sales pitches are full of words like "adversary emulation" and "TTPs" and "left of boom." This is the un-marketed version: what an assumed-breach engagement actually involves, day by day, from the buyer's seat.

Specifically: an assumed-breach red team. There are other models — external-only, full-scope including phishing, physical — but assumed-breach is the most common starting point and the one that gives you the most signal per dollar. The premise is that the attacker has already gotten in (because in the real world, they will), and the question is what they can do from there.

1. What a red team actually is, vs a pentest

A penetration test is broad and finding-focused. The goal is to cover the attack surface and produce a list of vulnerabilities. A red team is narrow and objective-focused. The goal is to achieve a specific outcome — exfiltrate customer data, escalate to domain admin, pivot to production from a developer laptop — while measuring whether your detection and response can catch it.

In a pentest, the report is the product. In a red team, the report is half the product. The other half is the answer to "did your blue team see us coming?" That answer is what makes red teaming worth the higher cost. You're not paying for a list of bugs; you're paying for a real measurement of your defensive program against real adversary behavior.

If your defensive program is immature — no SOC, no SIEM, no detection content — a red team will mostly tell you that, expensively. Run a pentest first, fix the obvious gaps, build a basic detection capability, and then run a red team to measure it. We tell prospects this often, even when it means a smaller engagement.

2. Setting the objective

The first conversation in any red team scoping call is about objectives. What does "success" look like for the attacker? The answer should be a concrete, measurable outcome. Bad answers include "show us our weaknesses" and "test our security." Good answers include:

  • "Exfiltrate the customer database without triggering an alert in our SIEM."
  • "Pivot from a compromised developer laptop to read/write access on our production Kubernetes cluster."
  • "Achieve domain admin in our corporate Active Directory and demonstrate access to the finance share."
  • "From a compromised CI runner, demonstrate the ability to deploy malicious code to production without code review."

Each objective implies a different set of techniques, a different scope, and a different measurement. Pick one or two — not five. The depth of testing is more valuable than the breadth.

3. The rules of engagement document

Before any testing happens, you'll sign a rules-of-engagement (RoE) document. This is the contract between you and the red team that says what's in scope, what's out, what techniques are allowed, who knows about the test, and what the escalation path looks like if something goes wrong. It is not boilerplate. Read it carefully.

The key fields:

  • Scope: which systems, networks, applications, accounts, and identities are valid targets. If it isn't listed, it's out.
  • Out of scope: systems that must not be touched. Production payment processing, customer support tools, anything subject to specific regulatory restrictions.
  • Techniques permitted: what the red team is allowed to do. Phishing? Physical entry? In-memory tooling and process injection? Live malware? The default is conservative; expand it deliberately.
  • Trusted agents: the small list of people on your side who know the engagement is happening. Usually 2–4 names — typically the CISO, the head of security operations, and one or two trusted contacts who can vouch for the test if it gets noticed.
  • Stop conditions: events that pause or terminate the engagement: a production outage caused by red-team activity, accidental impact on a customer, or discovery by an employee outside the trusted-agent list who escalates it as a real incident.
  • Escalation contacts: 24/7 phone numbers on both sides. If the red team triggers a real incident, you need to be able to call them and call it off.
  • Get-out-of-jail letter: a signed document the red team carries (or stores digitally) authorizing their activity, in case they're discovered and someone wants to know who gave them permission.

If the red team you're hiring doesn't insist on a written RoE before any payment changes hands, they're not the team you want.

4. How initial access is staged in an assumed-breach

The "assumed" part of assumed-breach means you give the red team a foothold to start from. They don't have to phish their way in; you hand them the equivalent of what an attacker would have after a successful phish. The reason is efficiency — you don't pay them to spend a week rediscovering that phishing works. You pay them to operate from the foothold and tell you what they could do next.

A typical assumed-breach starting position looks like one of these:

  • A managed laptop, joined to your corporate domain, with a standard user account and the same software baseline an employee would have. The red team gets it shipped to them or accesses it over a VPN.
  • A standing user identity in your IdP — a real-looking but disposable account that exists in Okta, Microsoft Entra ID, or Google Workspace, with whatever group memberships an average employee in a given role would have.
  • A compromised low-privilege Kubernetes pod, simulating an exploited application. The red team gets a shell into the pod and starts from there.
  • A valid but deliberately low-permission cloud credential, simulating a stolen token from a developer's laptop or a leaked GitHub secret.

The choice of starting position should match your highest-anxiety threat model. If you worry most about phishing, start from a managed laptop. If you worry about supply-chain attacks against your applications, start from a compromised pod.
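
If you go the disposable-identity route, staging the foothold is a small amount of work on your side. Here is a minimal sketch, assuming AWS and boto3; the user name, policy, and tag are illustrative, and the exact values belong in the RoE:

```python
# A sketch of staging a disposable cloud identity as the foothold, assuming
# AWS and boto3. The user name, policy, and tag are illustrative; pin down
# the real values in the RoE.
import boto3

iam = boto3.client("iam")

USER_NAME = "rt-foothold-q3"  # hypothetical disposable identity
BASELINE_POLICY = "arn:aws:iam::aws:policy/ReadOnlyAccess"  # example baseline

# Create the identity and tag it so it's auditable and easy to tear down
# when the engagement ends.
iam.create_user(
    UserName=USER_NAME,
    Tags=[{"Key": "purpose", "Value": "red-team-foothold"}],
)

# Grant roughly what an average employee in the simulated role would have.
iam.attach_user_policy(UserName=USER_NAME, PolicyArn=BASELINE_POLICY)

# Mint the access key you hand to the red team. Deliver it out of band and
# revoke it on the engagement's stop date.
key = iam.create_access_key(UserName=USER_NAME)["AccessKey"]
print(key["AccessKeyId"])  # the secret goes to the red team, never to a log
```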

5. The week-by-week rhythm

A typical assumed-breach engagement runs 4–8 weeks of active testing. Here's what each week usually looks like.

Week 1: foothold and reconnaissance

The red team takes the foothold and spends most of the week looking around. What's on the laptop? What's in the user's email? What can the user identity see in the cloud console? What groups is it a member of? What documents does it have access to? They're not making noise yet — they're building a map.
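
What does "looking around" mean in practice? For a cloud-identity foothold, it is mostly read-only API calls. A minimal sketch, assuming an AWS IAM-user foothold and boto3:

```python
# A sketch of week-1 reconnaissance from a cloud foothold, assuming an AWS
# IAM-user credential and boto3. Every call here is read-only: it lands in
# CloudTrail, but few SOCs alert on any of it.
import boto3

session = boto3.Session()  # uses the foothold's credentials

# Who am I? The first question asked with any stolen credential.
identity = session.client("sts").get_caller_identity()
print(identity["Arn"])

# What am I allowed to touch? Enumerate group memberships and policies.
iam = session.client("iam")
user = identity["Arn"].rsplit("/", 1)[-1]
for group in iam.list_groups_for_user(UserName=user)["Groups"]:
    print("group:", group["GroupName"])
for pol in iam.list_attached_user_policies(UserName=user)["AttachedPolicies"]:
    print("policy:", pol["PolicyName"])

# What's in reach? A read-only inventory of storage is classic recon.
for bucket in session.client("s3").list_buckets()["Buckets"]:
    print("bucket:", bucket["Name"])
```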

From the blue-team side, this week is mostly quiet. There's some baseline activity that might get logged but probably won't trigger alerts. If your SOC catches the red team in week 1, it's an excellent sign.

Weeks 2–3: lateral movement and privilege escalation

The red team starts moving. They use legitimate-looking tooling (no malware, just commands that an admin might run) to enumerate Active Directory groups, kerberoast service accounts, find misconfigured ACLs in cloud IAM, and look for over-permissioned credentials in CI. They chain low-severity findings into higher-severity ones.

This is where most blue teams notice something. The good ones notice the kerberoasting or the enumeration spike. The honest assessment is that fewer than half of the SOCs we've worked with catch this phase on the first run. That isn't a failure — it's a measurement. It tells you exactly which detection content to invest in.
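
As an example of what that detection investment looks like: a hedged sketch of a Kerberoasting detection over exported Windows Security events. The JSON export format and field names are assumptions; adapt them to however your SIEM stores event 4769.

```python
# A sketch of Kerberoasting detection content, assuming Windows Security
# events exported as JSON. Field names follow the 4769 event XML; adjust
# them to your SIEM's schema.
import json
from collections import Counter

RC4_HMAC = "0x17"  # legacy encryption type that Kerberoasting tools request

def likely_kerberoast(events: list[dict]) -> list[dict]:
    """Flag 4769 (service ticket requested) events with RC4 encryption,
    excluding machine accounts, which use Kerberos constantly and legitimately."""
    hits = []
    for ev in events:
        if ev.get("EventID") != 4769:
            continue
        if ev.get("TicketEncryptionType") != RC4_HMAC:
            continue
        if ev.get("ServiceName", "").endswith("$"):  # machine account
            continue
        hits.append(ev)
    return hits

with open("security_events.json") as f:  # hypothetical export path
    events = json.load(f)

# A single RC4 service-ticket request is suspicious; a burst from one user
# against many services within minutes is the classic Kerberoasting signature.
by_user = Counter(h.get("TargetUserName", "?") for h in likely_kerberoast(events))
for user, count in by_user.most_common():
    print(f"{user}: {count} RC4 service-ticket requests")
```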

Weeks 4–5: objective execution

The red team has enough access to attempt the objective. They demonstrate it carefully — usually in a way that doesn't actually impact production but proves capability. Exfiltrating "the customer database" might mean reading a table and pasting a hash of the contents into the report, rather than copying the data anywhere. The point is to prove that the read was possible, not to walk out with the data.
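
One way that proof-without-exfiltration works in practice, sketched with SQLite standing in for the real database:

```python
# A sketch of proving the read without taking the data, with SQLite standing
# in for the real target. Only the digest leaves the environment.
import hashlib
import sqlite3

conn = sqlite3.connect("customers.db")  # hypothetical target
digest = hashlib.sha256()

# Stream rows through the hash in a stable order so the defender can rerun
# the same query later and reproduce the digest, confirming what was read.
for row in conn.execute("SELECT * FROM customers ORDER BY id"):
    digest.update(repr(row).encode())

print("sha256 of customers table:", digest.hexdigest())
```

Because the digest is reproducible, your own team can rerun the same query after the engagement and confirm exactly what was readable.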

If the objective is achieved without detection, that's the headline finding. If the objective is detected and stopped, that's also a headline finding — and a good one.

Weeks 6+: reporting and debrief

After active testing ends, the red team writes the report. This usually takes 1–2 weeks. Then come the readouts, usually two:

  • Executive readout: 60–90 minutes, leadership audience, narrative-focused. Here's what we did, here's what we found, here's what it means for the business.
  • Technical readout / purple team workshop: half a day, technical audience. We walk your detection engineers and SOC through every action we took, mapped to MITRE ATT&CK, with the exact telemetry that should have caught us. This is the highest-value session in the entire engagement, and you should not skip it.
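
One way to structure that walkthrough, sketched with illustrative technique IDs and log sources (the record layout is our suggestion, not a standard):

```python
# A sketch of the purple-team walkthrough's backbone: every red-team action
# as a record with its ATT&CK technique, the telemetry that should have
# caught it, and whether the SOC actually saw it. Entries are illustrative.
from dataclasses import dataclass

@dataclass
class Action:
    day: int
    technique: str    # MITRE ATT&CK technique ID
    description: str
    telemetry: str    # the log source that should have caught this
    detected: bool

actions = [
    Action(3, "T1087.002", "Enumerated domain admin group members",
           "Security 4661 / LDAP query logs", detected=False),
    Action(8, "T1558.003", "Kerberoasted three service accounts",
           "Security 4769 with RC4 encryption", detected=True),
]

# The workshop walks this table row by row: every detected=False line is a
# detection-engineering work item with its log source already identified.
for a in actions:
    status = "caught" if a.detected else "MISSED"
    print(f"day {a.day}  {a.technique}  {status}  -> {a.telemetry}")
```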

6. What your blue team will actually see

If you've never run a red team before, your SOC's reaction to seeing one is unpredictable. We've had SOCs page on a single LDAP query and shut the engagement down on day two. We've had SOCs miss every single action and only find out at the readout. Both are useful data.

The trusted-agent decision matters a lot here. If your detection lead is a trusted agent, they can quietly observe what their team catches without intervening, and the test is honest. If nobody on the SOC side knows, the test is even more honest but you risk a real incident response, including page-outs at 3 a.m. and a potentially disruptive containment effort. Most mature programs go with "the SOC lead knows but the analysts don't" — that's the sweet spot.

7. How to read the report

A good red team report has four sections. If the one you receive is missing any of these, something is wrong.

  1. Executive summary. One page, no jargon. What was the objective, was it achieved, what does it mean. If your CFO can't understand it, it's a bad summary.
  2. Attack narrative. A chronological story of the engagement. "On day three, we discovered…", "On day five, we used credential X to access service Y…". Mapped to MITRE ATT&CK techniques throughout. This is the section detection engineers will read most closely.
  3. Findings. Each individual finding the team produced — vulnerabilities, misconfigurations, missing detections — with a severity, an impact, and a remediation. Some findings are vulnerabilities (fix them in code or config); others are detection gaps (write a rule).
  4. Recommendations. Beyond individual fixes, what should change about how your security program is run. Better instrumentation in a specific control? A new detection category? A change to incident response runbooks?

8. What to do with the report

The temptation is to triage findings by severity and ship the criticals. That's necessary but not sufficient. The bigger value is in the detection gaps. For every action the red team took that your SOC didn't see, ask: what would have caught it? Write that detection. Test it against the same technique. The next red team should be harder.

The other thing to do is rerun the red team in 6–12 months. The first run gives you a baseline; the second run measures whether the program improved. If your detection coverage doesn't get better between runs, the investment isn't paying off and you need to figure out why.
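
The comparison between runs can be embarrassingly simple and still be useful. A sketch, assuming both engagements produced an action log like the one above:

```python
# A sketch of the baseline-vs-rerun comparison. The measure is deliberately
# crude: the share of attempted techniques that were detected. Technique IDs
# and results are illustrative.
def coverage(actions: list[tuple[str, bool]]) -> float:
    """Fraction of red-team actions the SOC detected."""
    return sum(detected for _, detected in actions) / len(actions)

run_1 = [("T1087.002", False), ("T1558.003", False), ("T1021.002", False)]
run_2 = [("T1087.002", True), ("T1558.003", True), ("T1021.002", False)]

print(f"run 1 coverage: {coverage(run_1):.0%}")  # the baseline
print(f"run 2 coverage: {coverage(run_2):.0%}")  # did the program improve?

# Per-technique regressions matter more than the headline number: anything
# caught in run 1 but missed in run 2 means a detection rotted in between.
regressions = [t for (t, d1), (_, d2) in zip(run_1, run_2) if d1 and not d2]
print("regressions:", regressions or "none")
```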

9. Timelines and what you're actually getting

An assumed-breach red team typically runs 4–8 weeks of active testing plus 2 weeks of reporting. What you should expect across that window is the focused attention of one or two senior practitioners — not a rotating cast of juniors running scanners, and not a large team running the same playbook on a bigger scope.

What you're actually getting is judgment. A senior red teamer knows which techniques will produce signal in your environment, which will produce only noise, and which will get them caught immediately. You can't get that from a scanner. You can barely get it from a junior consultant reading a playbook.

10. How to spot a bad red team vendor

Warning signs from a sales conversation:

  • They can't articulate the difference between a pentest and a red team without falling back on marketing language.
  • They don't insist on a written RoE before any work begins.
  • They promise to "find every vulnerability" — that's a pentest claim, not a red team claim.
  • The proposal is pages of certifications and team bios, not pages of methodology.
  • They won't tell you the names of the specific people who will run the engagement, or those people are clearly junior.
  • They don't offer a purple team readout.
  • The deliverable is described as "a report" with no mention of detection mapping or recommendations.

The right vendor will spend the first call asking you questions about your environment and your objectives, not pitching their service. If you finish a discovery call and the vendor knows more about your goals than you knew before the call started, that's a good sign.

The short version

An assumed-breach red team is a 4–8 week engagement where a senior practitioner takes a pre-staged foothold in your environment and tries to achieve a specific business-impact objective while you measure how well your detection and response perform. You'll sign a written RoE before testing starts. You'll get a report with an executive summary, a chronological attack narrative mapped to MITRE ATT&CK, individual findings, and recommendations. The highest-value moment is the purple-team readout, where the red team walks your detection engineers through every action they took. The goal isn't to find every bug — it's to measure your defensive program against a real attacker, and get an honest answer about how it performs.

If your program is mature enough to handle the answer, run it. If it isn't, run a pentest first, fix the easy stuff, and come back when you're ready to measure something harder.

Considering your first red team?

MITRE ATT&CK-aligned, objective-based, and run by a senior practitioner. We'll scope it together, run it without breaking your environment, and leave you with a measurable view of how your detection and response actually performed.