Can Satellites and AI Replace Arms Control Inspections? A Reality Check

A recent Wired article asks a deliberately provocative question: “AI Is Here to Replace Nuclear Treaties. Scared Yet?” With the expiration of New START and the collapse of routine on-site inspections, the article explores an idea gaining traction among arms control technologists: that satellite surveillance, remote sensing, and AI-assisted analysis, backed by human review, could substitute for many of the functions once performed by inspectors and data exchanges.
The argument is not utopian. Its proponents describe it explicitly as a fallback. In a world where intrusive inspections are politically unacceptable and formal treaties are stalled, satellites already observe missile fields and bases. AI systems could help scale that monitoring, detect changes, and flag anomalies for analysts. If states are willing to cooperate at least minimally, for example by opening silo lids on a schedule or placing agreed equipment in view, remote verification might preserve some constraints where traditional arms control has broken down.
It is a serious proposal and one worth engaging carefully. It is also a good moment for a reality check.
What follows is a feasibility assessment of satellites and AI as a substitute for traditional arms control verification. I treat the idea as a concrete claim: that remote sensing, machine learning, and limited cooperative gestures could replace most on-site inspections and detailed data exchanges.
The conclusion is narrower than the framing sometimes suggests. Satellites and AI are useful and increasingly capable. But they cannot by themselves deliver what arms control has historically required.
What verification actually has to establish
Arms control is not simply about observing military activity. It is about producing evidence for specific legally defined claims under adversarial incentives. Historically this has required the ability to establish several different kinds of facts at once.
First are numerical limits. These include how many deployed launchers exist, how many warheads are attributed to those launchers, and how many non-deployed items are in storage.
Second are attributes. Verification must distinguish treaty-accountable systems from non-accountable ones. It must determine whether a platform is nuclear capable or conventional, whether a bomber has been converted, and whether a silo has been eliminated or merely covered.
Third is irreversibility. Arms control has required confidence that warheads were dismantled, fissile material dispositioned, and launchers destroyed rather than preserved for rapid reuse.
Fourth is timeliness. Changes must be detected quickly enough to matter for breakout scenarios, upload strategies, or dispersal.
Finally there is dispute resolution. The evidence produced must be credible and acceptable to both sides in situations where incentives to cheat or to accuse are strong.
Remote sensing can support some of these requirements. It is structurally weak for others, especially warhead counts, attribution, and dismantlement, unless paired with cooperative measures that closely resemble inspections conducted by other means.
What satellites can realistically observe
The strongest case for satellite-based verification is also the most limited. Satellites perform best when monitoring large, fixed, outdoor objects at known locations.
Today both commercial and national imagery already provide meaningful insight into several categories of activity. At silo fields, imagery can reveal new construction, excavation, changes in security perimeters, support vehicles, and whether silo lids are open at the moment of a pass. At air bases, satellites can show bomber presence, apron activity, shelter use, runway upgrades, and dispersal patterns. At shipyards, they can track submarine hull modules, drydock occupancy, and new pier infrastructure. At missile test ranges, they can detect pre-test buildup, transporter activity, and new instrumentation. At industrial complexes with distinctive signatures, they can reveal large-scale building changes, cooling towers, substations, and new road or rail access.
This is where AI genuinely adds value: not by interpreting intent or strategy, but by scaling analysis. Automated change detection, object counting for large visible items, and correlation across sensors and time allow analysts to manage vastly larger volumes of data.
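The core of this kind of automated change detection can be sketched in a few lines. The function and threshold values below are illustrative assumptions, not any deployed system; real pipelines must also handle image registration, lighting, cloud cover, and seasonal drift.

```python
import numpy as np

def changed_fraction(before: np.ndarray, after: np.ndarray,
                     threshold: float = 0.2) -> float:
    """Fraction of pixels whose intensity changed by more than `threshold`.

    Assumes two co-registered images normalized to [0, 1]. This is a toy
    differencing step, not a production remote-sensing algorithm.
    """
    diff = np.abs(after.astype(float) - before.astype(float))
    return float((diff > threshold).mean())

def flag_sites(site_pairs: dict, alert_level: float = 0.05) -> list:
    """Rank monitored sites by changed area; return those above an alert level."""
    scores = {site: changed_fraction(b, a) for site, (b, a) in site_pairs.items()}
    return sorted((s for s in scores if scores[s] > alert_level),
                  key=scores.get, reverse=True)

# Synthetic example: one corner of a hypothetical "silo field" changes
# between two satellite passes; a second site is unchanged.
rng = np.random.default_rng(0)
before = rng.random((64, 64)) * 0.1
after = before.copy()
after[:16, :16] += 0.5          # simulated new construction
sites = {"field_a": (before, after), "field_b": (before, before)}
print(flag_sites(sites))        # only field_a is flagged for human review
```

The point of the sketch is the workflow, not the math: machines rank candidate changes, and humans adjudicate what the changes mean.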
The picture becomes far less reliable once verification moves beyond fixed infrastructure.
Mobile systems can sometimes be observed but cannot be counted with confidence. Road-mobile missiles are intermittently visible and easily concealed using shelters, decoys, timing, and terrain. Submarines can be monitored at bases and shipyards but not reliably at sea. Warhead upload and download operations usually occur indoors or under cover. Support vehicles may be visible, but inferring whether nuclear or non-nuclear payloads are involved is highly uncertain.
Some tasks are simply not achievable at treaty standards without cooperation. Warhead counts, dismantlement, fissile material accounting, and often even system attribution collapse into ambiguity once states adapt camouflage and operational security. Satellites can indicate that something has changed, but they cannot reliably determine what changed or by how much.
What AI can and cannot do
Machine learning does not fail in this domain because it is immature. It struggles because arms control verification is adversarial by nature.
AI systems are well suited to several supporting tasks. They can detect changes over time across large imagery sets. They can identify and count large visible objects such as aircraft or vehicles when those objects are unobscured. They can fuse data from multiple sensors and produce ranked lists of anomalies for human review.
These capabilities are valuable. They increase throughput and allow analysts to focus attention more efficiently.
However, treaty-grade reliance encounters several hard constraints.
One is training data. Verification demands extremely low error rates under deliberate evasion. There is no large, well-labeled ground-truth dataset covering every relevant system type, camouflage regime, and environmental condition. Models tend to become bespoke to specific countries and sites, making them brittle and expensive to maintain.
Another is the base rate problem. True violations are rare relative to the volume of observations. Even highly accurate systems monitoring thousands of locations daily will generate large numbers of false positives. This creates political risks. Constant alarms can provoke crises or lead to complacency when warnings are repeatedly dismissed.
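The base rate problem is just Bayes' theorem, and a worked example makes the scale of it concrete. All the numbers below are assumptions chosen for illustration, not estimates of any real monitoring system.

```python
def posterior_violation(prior: float, tpr: float, fpr: float) -> float:
    """P(violation | alarm) via Bayes' theorem.

    prior: base rate of genuine violations per observation
    tpr:   true-positive rate (detector fires given a real violation)
    fpr:   false-positive rate (detector fires given no violation)
    """
    p_alarm = tpr * prior + fpr * (1 - prior)
    return tpr * prior / p_alarm

# Illustrative assumption: one genuine violation per 10,000 site-days,
# monitored by a detector that is quite good by ML standards.
prior = 1 / 10_000
p = posterior_violation(prior, tpr=0.95, fpr=0.01)
print(f"P(violation | alarm) = {p:.3f}")   # roughly 0.009
```

Under these assumed numbers, fewer than one alarm in a hundred corresponds to a real violation: even an accurate detector, applied to a rare event, produces a stream of false positives that someone must politically manage.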
A third constraint is deception. States can move activities indoors, use temporary covers, deploy decoys, alter paint and geometry, manage thermal signatures, or time operations between satellite passes. Once a monitoring regime becomes binding, adaptation should be expected.
There is also the issue of explainability and contestability. Treaties require evidence that can be independently assessed and accepted by rivals. Outputs that rely heavily on opaque model behavior are unlikely to be persuasive in disputes.
Finally, there is cybersecurity. A shared or trusted verification pipeline would be a high value target for data poisoning, model theft, and inference attacks. These risks matter because they can either conceal violations or manufacture plausible allegations.
AI can assist monitoring. It does not on its own create the kind of shared adjudicable truth that arms control depends on.
The central role of cooperation
The most important part of the proposal discussed in Wired is also the one that tends to be understated. Remote verification only works if states cooperate.
If the idea is to replace inspections entirely with satellites and AI, it fails beyond coarse confidence building at fixed sites.
If instead the idea is to embed satellites and AI within a cooperative technical framework, it becomes more plausible. However, this requires adding elements that look very familiar.
These typically include declarations of sites and inventories so monitoring can be targeted. They include managed access actions such as opening silo lids, towing items into view, or parking aircraft in the open. They may involve cryptographic commitments to inventories and movements that can be checked over time. They often rely on information barriers that confirm treaty accountable objects without revealing design secrets. And they require dispute resolution mechanisms for cases where sensing and declarations diverge.
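The cryptographic commitment element above can be illustrated with a simple hash-based commit-and-reveal sketch. This is a toy, not a deployable treaty protocol; the site name and inventory string are invented, and real schemes would need vetted, auditable constructions.

```python
import hashlib
import secrets

def commit(inventory: str) -> tuple[str, str]:
    """Publish a commitment to an inventory without revealing it.

    A random nonce hides the committed value; SHA-256 binds the
    committer to it. The digest is shared now, the nonce held back.
    """
    nonce = secrets.token_hex(16)
    digest = hashlib.sha256((nonce + inventory).encode()).hexdigest()
    return digest, nonce

def verify(digest: str, nonce: str, claimed_inventory: str) -> bool:
    """Later, check a revealed inventory against the earlier commitment."""
    check = hashlib.sha256((nonce + claimed_inventory).encode()).hexdigest()
    return check == digest

# Hypothetical usage: a state commits to a declaration at time T,
# and the declaration is checked against observations at time T+1.
digest, nonce = commit("site_07: 12 deployed launchers")
print(verify(digest, nonce, "site_07: 12 deployed launchers"))  # True
print(verify(digest, nonce, "site_07: 15 deployed launchers"))  # False
```

The design point is that the commitment can be published without disclosing sensitive details, yet a later reveal can be checked against it, which is one way remote sensing and declarations can be tied together over time.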
Once these elements are included, the system begins to resemble arms control as it has always existed, simply with less physical presence and more remote corroboration. That may be an evolution worth pursuing. It is not a technological substitute for political agreement.
What such a system could realistically deliver
Used honestly and with clear limits, a satellite and AI based regime could achieve several meaningful outcomes.
It could detect major expansions at known fixed sites. It could monitor compliance with simple observable actions that states agree to perform. It could provide early warning of destabilizing posture changes such as large dispersals or unusual readiness patterns across multiple bases.
What it cannot reliably guarantee are precise warhead ceilings, limits on non-deployed stockpiles, verified dismantlement, monitoring of survivable forces at sea, or detection of fast, distributed breakout pathways designed to remain indoors.
It reduces uncertainty relative to having no verification at all. It does not reproduce the level of assurance provided by past treaty regimes.
A realistic conclusion
Satellites and AI are technically feasible as tools to augment monitoring and to support limited cooperative confidence building, especially for fixed sites and visible infrastructure changes.
They are not technically feasible as a full replacement for verification of warhead limits, dismantlement, and detailed accountable inventories unless states are already willing to cooperate in ways that reintroduce many of the same political challenges inspections were designed to manage.
The deeper issue is not technological. It is political.
If states are willing to accept constraints and tolerate verification, workable systems can be designed using a mix of remote sensing, managed access, and human judgment. If they are not, no amount of imagery or computation will resolve the underlying mistrust.
Satellites and AI can tell observers that something has changed. Arms control requires proving exactly what changed, by how much, and whether it violated a rule in a way adversaries accept.
That gap is not a problem of sensors or algorithms. It is the central problem of arms control in a world of rivalry and suspicion.

