Detection Engineering’s Quant Era
The story isn't that AI writes detections faster. It's that the economics of the whole craft just flipped, compressing finance's fifty-year arc into a fraction of the time.
A detection engineer I work with told me his output went from two good detections a week to twenty. Most people react to that the same way, and that reaction is the mistake this post is about.
Twenty is the boring part. The interesting part is that they are high-quality detections produced at that speed, and what that does to the economics of cybersecurity. What changed was not his typing speed or regex patterns. His hours stopped going into writing rules and started going into judging them. The machine drafts. He decides what is worth keeping, what is brittle, and what a real adversary would evade without noticing. Each of the twenty is better than the two used to be, because his judgment now goes to the part of the work that needed it instead of being burned on syntax.
If the speed is what impressed you, you have already misread it. You will build a faster version of the problem detection engineering has spent a decade fighting, and you will build it at the wrong bottleneck. You will industrialize security theater.
This is a story about what happens to a craft when the cost of doing it collapses. We have a very good map for that, and it is not in security. It is in finance.
What Finance Has to Do with Detection
Donald MacKenzie wrote a history of financial models with the perfect title: An Engine, Not a Camera. His point was that a model of a market does not sit outside it taking a picture. Once people trade on the model, the model becomes part of the market and changes the thing it claims to describe. The famous options-pricing formula did not photograph how options were priced. It taught the market how to price them, and prices moved toward the formula.
A detection is the same kind of object. It is not a camera pointed at the adversary, it is an engine that changes the adversary. Stand up a detection program and the adversary’s behavior bends around it, because what you build to observe the world becomes a force acting on it. That is what makes detection reflexive. Exactly how the bending happens, through what channel and on which ground, I answer later. The obvious answer is wrong and the real one is far more interesting. For now: a detection is an engine, not a camera.
I run this whole post through a computational finance analogy, because finance has spent fifty years on this ground and has already run the arc detection engineering is starting. You do not need to know anything about finance. The next section tells the whole story.
One caveat, because the analogy holds in only one corner of finance. Most of finance is not adversarial. The market does not design a trade to ruin your specific position, and a threat actor does. But a few niche parts of Wall Street are adversarial, and how that side evolved as computers moved its choke points is the fascinating part.
The Whole Finance Story, in One Read
No finance background required. I promise this is about cybersecurity.
It is a story about computers entering finance and redrawing the boundaries of the industry.
Start in 1979. VisiCalc, the first spreadsheet, ships on the Apple II. Before it, testing a financial idea meant recalculating a page of math by hand every time you changed a number, which took hours. After the spreadsheet, it took a second. That sounds small. But it made a new kind of deal possible, the leveraged buyout, where a small group borrows an enormous amount of money against a company’s own assets to buy it, wring it out, and sell it. You could not run those numbers fast enough before the spreadsheet. After it, a wave of corporate raiders did exactly that. The tool did not just speed finance up. It made a new game thinkable.
The math had already gotten deeper. In 1973 a pair of academics published Black-Scholes, a formula for pricing options, the contracts that pay off if a stock moves a certain way. Risk stopped being something you carried and became something you could measure, slice, and sell. That kicked off the derivatives era, and it needed a new kind of employee to run it: people with physics and math doctorates, the ones everyone came to call the quants.
The quants needed a way to say how much the whole firm could lose, so the industry settled on a single number called Value at Risk. One figure that said, in effect, we will not lose more than this much on a normal day. Executives loved it because it was simple and fit on one line. It was also a lie of omission, because it hid the rare, catastrophic days it could not predict. Keep that number in mind. It has a twin in cybersecurity.
Here is the change that matters most. For a long time trading meant betting on where the world would go, rates up, this stock down. Then a smarter game took over, arbitrage. Instead of betting on the world, you bet that two related prices were out of line, that the gap was someone else’s mistake, and that it would close. You started to read the people reading the market. That is a harder game, and it is the one that truly needed the quants, because you cannot find those gaps by instinct.
The shop that perfected it was the Arbitrage Group at Salomon Brothers, run by John Meriwether. While the rest of Wall Street bet on where the world would go, Meriwether’s arbitrageurs bet on the gaps, and they were unfairly good at it. By the early 1990s, his desk earned more than the entire rest of Salomon combined, all from a small room of quants. That is how good the game was, if played right.
The most famous arbitrage shop of all was Long-Term Capital Management, founded in 1994 when Meriwether took that desk private, joined by Myron Scholes and Robert Merton, who would share the 1997 Nobel for the options-pricing work from earlier. On paper, they were the smartest firm that had ever existed. In 1998 they collapsed in weeks and nearly took the financial system with them. How they collapsed is the part that matters. They did not blow up because their trades were wrong. Most were right. They blew up because they had borrowed enormously to make those trades bigger, which finance calls leverage, and because every clever firm was crowded into the same trades. When they were all forced to sell, everyone was selling the same thing at the same instant and there was no one left to buy. Their genius and their fragility were the same thing. Ten years later, in 2008, the whole system did it again at planetary scale, everyone holding the same correlated bets, trusting the same risk models, all of it going down together.
That is the arc, and underneath it is one pattern, the lens for the rest of this post. Each time a computer tool entered finance it had the same four effects, and each fed the others:
It dissolved whatever the bottleneck was and exposed the next one behind it.
It raised the game to a higher level, from single trades to portfolios to firm-wide risk.
It crowned a new elite at the new bottleneck and hollowed out the old one.
And the scale that produced the brilliance produced the catastrophe, inseparably.
Detection engineering is about to run this same arc, faster.
Detection’s Turn
Start with the spreadsheet move, the one most misread. AI did not hand detection engineering a better map of the adversary. ATT&CK is the same map it always was, useful and deeply tautological, a catalog of what has been seen rather than a forecast of what is coming. What AI collapsed is the cost of operating over the rule libraries we already have, the same way the spreadsheet collapsed the cost of operating over the math finance already had. The instant that cost falls, the unit of analysis climbs, from the single rule to the whole portfolio of coverage. Exactly the climb finance made from a single trade to the book.
A tool does not just speed a bottleneck up, it moves it, and the bottleneck it dissolves was quietly dictating the shape of the work. The new winners form wherever it lands next. So the question was never how fast AI writes rules. It is where detection’s bottlenecks actually were, what shape they forced on the work, and where they relocate once they fall.
Keep one thing in mind. Like any trading position, a detection is a wasting asset, its edge decaying the instant the adversary adapts.
Front Office, Middle Office, Back Office
Detection has bottlenecks in three places, and finance has the same three. Banks split the work into a front office that makes the trades, a back office that clears and settles them, and a middle office in between that prices risk and reconciles the books. Detection engineering has all three of these offices as well.
The two that flip the whole regime sit on opposite ends. The front is authoring cost, the spreadsheet move. The back is false-positive economics, the one AI SOCs are only now bringing into the discussion. Detection’s depth-to-fidelity ratio was never a deliberate choice. It came out of the triage bottleneck. Every false positive cost an analyst time, so a rational engineer wrote narrow, brittle, low-volume rules (the great ones managed wider behavioral detections that were still low-volume). Narrow rules miss exactly what the adversary varies. That is a perfect example of a bottleneck reaching back up the pipeline and dictating the shape of the product. The field was structurally bad at catching anything new, and it was not a failure of skill. It was triage cost doing what bottlenecks do.
When AI absorbs triage at near-zero marginal cost, per-false-positive cost falls one to two orders of magnitude and the optimal shape inverts. It goes from a small corpus of brittle low-volume rules to a large corpus of broad, behavioral detections, high on the pyramid and continuously validated. Handling that volume is not one thing. A rule that fires twenty-five times a month wants depth, every hit investigated. A rule that fires two thousand times a month wants statistics, baselines and clustering and the anomalous needle. A human SOC could only ever afford the first, so the broad net that actually catches variation got tuned to death or never written. AI can run both lenses at once, which is what finally makes the broad net survivable at scale.
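The two lenses can be sketched in a few lines. This is a minimal illustration, not a production triage design: the cutoff of 100 hits a month, the z-score threshold, and the function names are all my assumptions. The post only fixes the two extremes, twenty-five and two thousand.

```python
from statistics import mean, pstdev

# Assumed cutoff between the two lenses; the post gives only the two extremes.
DEPTH_MAX_MONTHLY_HITS = 100


def choose_lens(monthly_hits: int) -> str:
    """Low-volume rules get every hit investigated; high-volume rules get baselined."""
    return "depth" if monthly_hits <= DEPTH_MAX_MONTHLY_HITS else "statistics"


def anomalous_entities(daily_counts: dict[str, list[int]], z_cutoff: float = 3.0) -> list[str]:
    """Statistical lens: flag entities whose latest daily hit count breaks their own baseline.

    Each list holds one entity's daily hit counts, oldest first; the last value is today.
    """
    flagged = []
    for entity, counts in daily_counts.items():
        if len(counts) < 3:
            continue  # not enough history to call anything a baseline
        history, today = counts[:-1], counts[-1]
        mu, sigma = mean(history), pstdev(history)
        if sigma == 0:
            deviates = today != mu
        else:
            deviates = abs(today - mu) / sigma >= z_cutoff
        if deviates:
            flagged.append(entity)
    return flagged
```

A per-entity z-score is the crudest possible baseline. Clustering or seasonality-aware models would slot into the same place; the point is only that the high-volume rule is read as a distribution rather than hit by hit.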
The finance precedent is the same. The quant era did not arrive when computers got fast, it needed two cost collapses on opposite ends. Computation cost fell first, letting the quants price strategies. But that changed nothing structural, because the cost of actually executing trades still made the strategies unprofitable to run. That execution cost, the back-office bottleneck, had quietly dictated which strategies were viable at all, the same way false-positive economics dictated which detections were. It flipped only when execution cost collapsed too, when US stocks moved from fractions to penny increments and trading went electronic. Two bottlenecks, opposite ends, and the industry changed only when both moved. This is why the single-sided “AI makes authoring cheap” story is the shallow read.
That leaves the middle. In a bank the middle office is the floor between the traders and the settlement clerks. It prices risk, reconciles the books, and tells you what you actually own versus what you think you own. It is where managed detection engineering quietly lived or died, because the middle office is the per-client operating overhead, and that overhead is the difference between MDR as cheap security theater and MDR as real boutique risk management. The front-office collapse gets the attention because it is visible. The middle office is where the economics actually sat.
Every middle-office job has the same hidden shape, a human judgment about the relationship between a rule and an environment, made one relationship at a time by a person who can hold a handful in working memory and decides differently on a Friday than a Monday. Does this rule apply here? Not “it carries a T1078 tag and the client runs Okta, so we are covered”, which is tag-matching, the same half-truth as a coverage percentage. The real answer requires reading the rule’s logic and asking whether it would actually fire on the threat actor’s behavior in this environment, given this client’s log sources and field availability. That is slow and inconsistent between engineers, and it just was not done at most shops. Worse is the failure tag-matching can never see. A rule can be perfectly written, correctly tagged, deployed to the right client, and fire nothing, because the parser for that source never populates the field the rule keys on. On paper it is coverage, in production it is a green cell over a dead detection. Security theater. Confirming the telemetry actually carries what the rule assumes is pure reconciliation, the detection version of checking the book against the vault, and it is exactly the tedious work no human did consistently.
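The green-cell-over-a-dead-detection check is simple enough to sketch. The version below assumes you can pull a sample of parsed events per log source as plain dictionaries and that you know which fields a rule keys on. The function names, the field names, and the one percent floor are hypothetical.

```python
def field_fill_rates(events: list[dict], fields: list[str]) -> dict[str, float]:
    """Fraction of sampled events in which each field is present and non-empty."""
    total = len(events)
    rates = {}
    for field in fields:
        filled = sum(1 for event in events if event.get(field) not in (None, ""))
        rates[field] = filled / total if total else 0.0
    return rates


def dead_fields(rule_fields: list[str], events: list[dict], min_rate: float = 0.01) -> list[str]:
    """Fields the rule depends on that the parser effectively never populates."""
    rates = field_fill_rates(events, rule_fields)
    return [field for field in rule_fields if rates[field] < min_rate]


# Hypothetical sample from one client's identity log source.
sample = [
    {"user": "alice", "src_ip": "10.0.0.5"},
    {"user": "bob", "src_ip": ""},
]
# A rule keyed on mfa_result can never fire here, whatever its tag says.
unpopulated = dead_fields(["user", "src_ip", "mfa_result"], sample)
```

Any rule with a non-empty result is deployed but blind on that client, which is exactly the case tag-matching cannot see.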
The rest of the middle office is the same move at portfolio scale:
Gap analysis that means something: not counting empty cells but joining rule coverage, log-source capability, and live threat intel to rank what is missing by what actors in this client’s sector actually do.
Parity: whether every client actually has the coverage the fleet claims. That is a rule-by-source-by-client matrix humans compute badly and machines compute exactly. The same computation scores a prospect’s existing rules against yours during a migration.
Tuning: deciding whether a new firing pattern is noise to suppress or signal to widen. Under deadline this got resolved the lazy way, by narrowing the rule until it went quiet, which is also how you blind it. The machine can cut false positives while refusing the tunings that buy quiet by opening an evasion path.
Metrics: replace the single coverage number with a dual signal, library breadth against logic-verified applicability, and let the machine write the report. That matters because the report is the only part of the work the client ever sees.
Ingest: the largest line item in the deployment, with expensive low-value logs running into the millions. It can be optimized per source against coverage, evasion exposure, and hunt value instead of guessed.
Every one of those takes a judgment a human made locally, inconsistently, under time pressure, and turns it into a computation over the entire rule-by-source-by-coverage matrix, the detection book held whole as a portfolio. That is why this is more than cheaper. A human holds a handful of relationships, the machine holds the portfolio. So the output is more coherent across the fleet, and more correct, with one caveat that becomes the back half of this essay. It is more correct only to the extent the centralized method is correct. Coherence cuts both ways. A right method is now right everywhere at once, and a wrong assumption is now wrong everywhere at once.
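To make the matrix concrete, here is a toy version of the parity and dual-signal computation. It is a sketch under heavy assumptions: rules are reduced to a log source plus the fields they need, clients to the fields each source populates, and every name in it is invented for illustration. Real applicability means reading the rule's logic, which this stands in for with a field check.

```python
def applicable(rule: dict, client: dict) -> bool:
    """A rule counts only if the client ships its source and populates every field it needs."""
    populated = client["sources"].get(rule["source"])
    return populated is not None and set(rule["fields"]) <= populated


def parity_report(rules: list[dict], clients: dict[str, dict]) -> dict[str, dict]:
    """Per client: claimed breadth, logic-verified share, and deployed rules that cannot fire."""
    report = {}
    for name, client in clients.items():
        deployed = [r for r in rules if r["id"] in client["deployed"]]
        verified = {r["id"] for r in deployed if applicable(r, client)}
        report[name] = {
            "breadth": len(deployed) / len(rules) if rules else 0.0,
            "verified": len(verified) / len(rules) if rules else 0.0,
            "green_but_dead": sorted(r["id"] for r in deployed if r["id"] not in verified),
        }
    return report


# Hypothetical two-rule library and two-client fleet.
rules = [
    {"id": "R1", "source": "okta", "fields": ["actor", "outcome"]},
    {"id": "R2", "source": "edr", "fields": ["parent_image", "image"]},
]
clients = {
    "acme": {"deployed": {"R1", "R2"},
             "sources": {"okta": {"actor", "outcome"}, "edr": {"image"}}},
    "globex": {"deployed": {"R1"},
               "sources": {"okta": {"actor", "outcome"}}},
}
report = parity_report(rules, clients)
```

In this toy fleet acme reports full breadth but only half of it verifies, and the difference is listed by rule ID. That gap between the two numbers is the dual signal, and it is also where a wrong `applicable` would be wrong for every client at once.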
The good version of this was not impossible before AI. A few mature in-house programs did real applicability analysis and parity and ingest tuning by hand, with senior engineers and patience. It was available only to the orgs that could afford to burn senior-engineer hours on book-keeping, which is almost none. Most MDRs did a bad version or skipped it, not from incompetence, but because every one of these jobs was per-client variable cost. The only way to scale a variable cost is to minimize it, even when that means shipping worse coverage. The middle office is where the industry’s quality was quietly sacrificed to make the unit economics work. AI is what makes the good version also the cheap version. This is where real personalization comes from, since per-client coverage that is genuine rather than cosmetic just falls out of running these as a computation per environment. This is also where the hyperscaler compounds, since the matrix gets smarter with every environment it sees and that knowledge is shared across the fleet. For the same reason, it is also the layer that manufactures the monoculture, a debt the last section comes to collect.
Why now, and not as gradual efficiency creep? Because detection is a volume problem wearing a quality problem’s clothes, and the collapse is pushed by a volume crisis. Finance has that precedent too. In 1968, Wall Street’s back offices drowned in paper stock certificates, and the exchange shortened trading hours to dig out. The Paperwork Crisis forced settlement automation before anyone thought it was ready. Telemetry and alert volume are doing that to triage right now. Automation as the response to a control crisis is an old pattern, and it means this collapse is being forced on the industry, not chosen by it.
The Ground the Adversary Cannot Leave
Now the question I deferred. How does a private detection change an adversary who never sees it? Your rules are private, nobody breaching a network is reading their logic. So whatever reflexivity detection has, it does not run through your rule.
It runs through shared knowledge. When the security community converges on a specific technique, when a tool or procedure gets documented in ATT&CK, written up by a vendor, shipped as commodity coverage, its cost rises everywhere and any competent threat actor abandons it. But notice the word specific. This only works on the avoidable, the named tool, the IOC, the one obfuscation trick, the LOLBAS they can swap for any of a hundred others. Cover those and your coverage goes quietly dead, because covering them is what drives the adversary off them. Your success is what makes the coverage worthless.
The ground that does not go dead is the ground the adversary cannot leave. Every intrusion has to authenticate, execute, move across trust boundaries, and get the data out. Those are the points where a hundred paths converge and there is nowhere left to migrate. No amount of community convergence drives the adversary off them, because there is no off.
The real point is that threat actors do not perform MITRE techniques. A technique is a label we apply after the fact, not a thing the adversary picks off a menu. On the unavoidable chokepoints, the adversary cannot abandon the technique, so they detach the action from its signature. They rename powershell.exe to something benign, so the rule keyed on the name sees nothing. They load the scripting engine as a library inside their own process, the way Cobalt Strike and Sliver do, so powershell.exe never runs at all. They log in with stolen credentials, so the session is indistinguishable from yours. The coverage cell stays green the whole time, the malicious instance hiding inside the benign one.
Detection on the durable ground is not a signature question (did powershell.exe run) but a structural one. What does this execution encode, in its lineage, its timing, its deviation from baseline? That is exactly the broad, high-volume, noisy net the false-positive economics made impossible, because authenticating and executing happen constantly and innocently. The whole optimistic argument narrows to one line here. AI does not just make coverage cheaper, it lets you finally fight on the chokepoints the adversary cannot abandon, fighting them structurally at a volume no human SOC could triage. That is the only ground where detection holds a lasting edge, because it is also the only ground the adversary cannot walk away from.
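A small sketch of the difference between the signature question and the structural one, using the PowerShell evasions described earlier. The event shape is hypothetical, loosely modeled on Sysmon-style process-create and image-load records, and the allowlist of expected hosts is an assumption you would have to build per environment.

```python
# Field names are hypothetical, loosely modeled on Sysmon process-create and
# image-load events. Adapt them to whatever your pipeline actually emits.
POWERSHELL_ORIGINAL_NAME = "powershell.exe"
ENGINE_DLL_PREFIX = "system.management.automation"
# Assumed allowlist of binaries expected to host the PowerShell engine.
EXPECTED_ENGINE_HOSTS = {"powershell.exe", "pwsh.exe", "powershell_ise.exe"}


def basename(path: str) -> str:
    """Lower-cased file name from a Windows-style path."""
    return path.replace("/", "\\").rsplit("\\", 1)[-1].lower()


def structural_findings(event: dict) -> list[str]:
    """Ask what the execution is, not what it is called."""
    findings = []
    image = basename(event.get("image", ""))
    original = event.get("original_file_name", "").lower()
    # The binary's embedded original name says PowerShell, the file name does not.
    if original == POWERSHELL_ORIGINAL_NAME and image != original:
        findings.append("renamed_powershell_host")
    # The PowerShell engine assembly is loaded by a process that is not a known host.
    loaded = basename(event.get("image_loaded", ""))
    if loaded.startswith(ENGINE_DLL_PREFIX) and image not in EXPECTED_ENGINE_HOSTS:
        findings.append("powershell_engine_in_unexpected_host")
    return findings
```

The second check will fire on plenty of legitimate management software. That is the point of the section: it is a broad, noisy net that only becomes usable once triage is cheap enough to baseline it instead of tuning it away.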
That tells you which kind of engine to be. The bad engine chases the avoidable and watches its coverage die the moment it works. The good engine watches the unavoidable and forces the adversary to keep paying, in effort and exposure, to camouflage what they cannot stop doing. The rest of this post is an argument for the second kind.
Detection’s Quant
Remember why arbitrage summoned the quants. Ordinary trading was a bet on the world. Arbitrage was a bet on other people’s mistakes, modeling the modelers, which you cannot do on instinct. So the harder game pulled in a new kind of mind and crowned the new winners.
Detection has spent twenty years at the lower level, treating adversaries as a mostly static catalog of techniques to write rules against, an opponent who does not really respond. That is why intrusion reports are full of the same living-off-the-land tradecraft reused for years. Nobody ever made the stale moves cost anything. AI pushes detection up a level, to where you model an adversary who is modeling what a competent defender would watch for. What they model is a belief about your coverage, not your private logic. At that level the lazy reused move can finally be punished. The value lands at the one place left, the theory layer: the person who supplies the causal model of the adversary and the judgment about which gap matters. That person is detection’s quant. Remember, Meriwether’s arbitrageurs and the firm that blew up were the same people. The elite a new level crowns is also the one that discovers its new failure mode first.
Mark to Market
The finance analogy left one number hanging, Value at Risk, the single figure for how much a bank could lose on a normal day. It was loved because it fit on one line and dangerous because it hid the abnormal ones. Detection has the same number. “We are at seventy-eight percent ATT&CK coverage” is the Value at Risk of security, one legible figure for the board, hiding all the adversary variety it cannot represent. We tolerated it for the same reason finance did. Actually modeling whether a detection fires against real behavior, continuously, was too expensive.
That cost is what AI removes, so detection skips the trap. Instead of asserting coverage you test it, continuously, per client. Here is what is proven-detected and proven-missed against emulated behavior, today. Coverage stops being a claim and becomes a measured property, marked to market. This is the oracle the last few sections kept circling, the daily price that pops the green-but-dead cell before a breach does it for you.
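In code, marking coverage to market is just a test harness run on a schedule. The sketch below assumes rules can be called as functions over a list of events and that each emulated behavior comes with the telemetry it produced. Both are simplifications, and every name here is hypothetical.

```python
from typing import Callable


def mark_to_market(
    rules: dict[str, Callable[[list[dict]], bool]],
    emulations: dict[str, list[dict]],
) -> dict:
    """Replay each emulated behavior's telemetry through every rule and record what fired."""
    detected, missed = {}, []
    for behavior, events in emulations.items():
        fired = sorted(name for name, rule in rules.items() if rule(events))
        if fired:
            detected[behavior] = fired
        else:
            missed.append(behavior)
    return {"proven_detected": detected, "proven_missed": sorted(missed)}


def powershell_by_name(events: list[dict]) -> bool:
    """A deliberately brittle rule: keyed on the process file name only."""
    return any(e.get("image", "").lower().endswith("powershell.exe") for e in events)


# Hypothetical emulation results for one client, one day.
emulations = {
    "powershell_plain": [{"image": r"C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe"}],
    "powershell_renamed": [{"image": r"C:\Users\Public\notes.exe"}],
}
todays_mark = mark_to_market({"powershell_by_name": powershell_by_name}, emulations)
```

On a coverage matrix both behaviors would sit under the same green cell. The mark shows one proven-detected and one proven-missed, which is the difference between a claim and a price.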
The Reflexivity Engine
Strip reflexivity down and it comes to one thing. The only quantity that matters is the live relationship between your decision function and the current state of the world, because both move. A static rule is a decision function frozen the day it was written, fighting a world that has moved on. The thing that keeps detections indexed to the present is a reflexivity engine, the reflexivity from the top of this essay seen from the practical side.
It runs in two modes. The first is reactive and shaped like news. A piece of threat intelligence lands and the system treats it like a market-moving headline. In the time it takes one analyst to read the report, the whole fleet is re-marked: who has the telemetry to see this, who already has proven coverage, who has a real gap, whose configuration is exposed. Then the system drafts the detection, tests it against the behavior, deploys it where it proves out, and flags where it does not. The old loop was a report, a human, and maybe a rule for some clients in a week. The new one is same-day, across everyone, tuned to each. The second mode is proactive, and it did not exist before at any price. Run the adversary models against each client’s actual environment as a near-free background process, ask what this actor would do here that you cannot currently see, and build the behavioral coverage before the technique is ever used. That is climbing the pyramid on purpose instead of waiting for the indicator to show up after the fact.
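The reactive mode's first step, re-marking the fleet against one new piece of intel, can be sketched as a sort into buckets. This is a simplification under stated assumptions: the intel is reduced to a behavior name and the log sources needed to see it, each client to its available sources and its proven-detected behaviors, and the configuration-exposure check is left out. All names are hypothetical.

```python
def remark_fleet(intel: dict, clients: dict[str, dict]) -> dict[str, list[str]]:
    """Sort every client into one bucket for a single new piece of threat intel."""
    buckets = {"proven_coverage": [], "gap": [], "no_telemetry": []}
    for name, client in sorted(clients.items()):
        if not intel["required_sources"] <= client["sources"]:
            # The client cannot see this behavior at all; a rule would be decoration.
            buckets["no_telemetry"].append(name)
        elif intel["behavior"] in client["proven_detected"]:
            buckets["proven_coverage"].append(name)
        else:
            # Telemetry exists but nothing has been shown to fire: draft and test here.
            buckets["gap"].append(name)
    return buckets


# Hypothetical intel item and three-client fleet.
intel = {"behavior": "token_theft_via_device_code", "required_sources": {"idp_signin"}}
fleet = {
    "acme": {"sources": {"idp_signin", "edr"}, "proven_detected": {"token_theft_via_device_code"}},
    "globex": {"sources": {"idp_signin"}, "proven_detected": set()},
    "initech": {"sources": {"edr"}, "proven_detected": set()},
}
marks = remark_fleet(intel, fleet)
```

Only the gap bucket needs a new detection drafted and tested. The no-telemetry bucket is a different conversation, about ingest rather than rules.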
Scale
Run all of that across a thousand clients and the economics break open. Managed detection always forced a brutal choice. You scaled cheaply, shipping near-identical fragile coverage to everyone and calling the sameness a methodology, or you personalized deeply, a boutique only a handful could pay for. Personalization and scale were mutually exclusive, and the reason is everything in the three offices above. It was all per-client variable cost, and variable per-unit cost is what makes a service scale multiplicatively and badly.
Collapse all of it and the binding constraint relocates off the per-client cost and onto the centralized method, built once and amortized across the fleet. This is how index funds beat stock pickers. Passive management did not win by choosing better companies, it won by driving the cost of holding one more client’s money toward zero until owning the whole market cheaply was the product. The endgame is something like BlackRock’s Aladdin, the single risk platform watching a vast share of the world’s professionally managed money. Applied to security operations, the payoff is concrete. Boutique-grade, client-specific coverage at index-fund marginal cost, onboarding in days instead of weeks, metrics that are continuous and tested, ingest optimized against coverage instead of guessed. The value lands at the theory layer, not in the hours. Hold that, because the one shared method that makes all of this work is the thing about to try to kill us.
When Genius Fails
The failure mode is not the one people reach for. The reflex is to worry the AI writes a bad rule. But one bad rule on some clients is contained and survivable, a single trade gone wrong, and every model is wrong eventually. That is not what kills.
What kills is what capsized LTCM. They did not die because their trades were wrong, mostly the trades were right. They died because they were enormously leveraged and every clever firm was crowded into the same positions, so the unwind took them all down together. A detection hyperscaler has both halves. The leverage is the ratio of consequence to independent defense, a platform backing the security of most of the institutions you can name, all of it resting on one shared method, with no diverse reserve to absorb a failure. The correlation is that shared method’s blind spots, which are not one client’s but every client’s at once. So failure does not arrive one client at a time. One sufficiently novel bypass, an EternalBlue-class flaw that worms through everyone at once before anyone has a detection for it, or an AI-driven intrusion that does something it would be pure speculation to describe today, hits the shared method and detonates the whole book together.
Personalization does not save you either. Each client’s rule set looks different, but that difference lives at the surface while the tail comes through the shared method underneath, so a blind spot in the method is inherited by every client no matter how custom their rules look. Cosmetic variety over a monoculture is still a monoculture where it counts. Personalization de-correlates the failures you would have survived anyway and does almost nothing about the one that ends you.
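A toy probability model makes the claim easier to see. Assume, purely for illustration, that an attack is missed by a given client if the shared method is blind to it (probability q, one draw for the whole fleet) or, failing that, if the client's own rules miss it (probability r, independent per client). The model and the numbers are mine, not measurements.

```python
def expected_miss_rate(q_shared: float, r_local: float) -> float:
    """Chance that one given client misses the attack."""
    return q_shared + (1 - q_shared) * r_local


def p_fleet_wide_miss(q_shared: float, r_local: float, n_clients: int) -> float:
    """Chance that every client in the fleet misses the same attack."""
    return q_shared + (1 - q_shared) * r_local ** n_clients


# Illustrative numbers only: personalization cuts the local miss rate tenfold.
before = (expected_miss_rate(0.01, 0.20), p_fleet_wide_miss(0.01, 0.20, 1000))
after = (expected_miss_rate(0.01, 0.02), p_fleet_wide_miss(0.01, 0.02, 1000))
```

With these inputs the per-client miss rate drops from about 21 percent to about 3 percent, while the chance of the whole fleet missing together stays at essentially q in both cases. The local term vanishes across a thousand independent clients and the shared term does not move.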
This is 2008, the planetary version. Everything correlated goes down at once and no hedging inside the system prevents it, because the correlation is not incidental, it is intrinsic to the same scale that creates the value. The homogeneity that gives you coverage and quality and index-fund economics is the homogeneity that makes the fleet fail together. That is the lesson, the brilliance and the fragility are one thing, and you do not get the genius without manufacturing the tail. One consequence, worth its own post, is that a platform at that scale is critical infrastructure the way a too-big-to-fail bank is. It will eventually draw the same treatment, stress tests, diversity requirements, a drift toward more nationally governed security, probably after a crash teaches everyone why.
The Frontier
So if the tail cannot be eliminated, what does detection engineering become? You de-risk the tail, knowing you can only reduce and survive it, not erase it. You spend your effort on the unknowns instead of the knowns. You harden speed, building circuit breakers and isolation into the fleet so a machine-speed correlated attack can be cut off instead of propagated, the way exchanges install trading halts for the moments the system moves faster than judgment. You deliberately de-lever by keeping genuine diversity in your methods, paying for it in the efficiency you give up. And you invent the strange, security-specific forms of hedging this domain will need and does not yet have names for.
One last thing, the move that closes the loop with the unavoidable ground. When the chokepoints the adversary cannot leave are finally watched at scale, the cheap reused tradecraft stops working, and the adversary’s only moves left are to pay even more to camouflage what they cannot stop doing, or to invent something genuinely new. The stale meta breaks. The frontier of the craft becomes detecting the move the adversary is forced to invent, and that work is irreducibly human, the part of the theory layer no machine supplies.




