Method Essay · Shadow Bridge · v0.1

If local shadows recover Faraday, what do they recover for AI?

The Shadow Faraday experiment landed Branch A: local, gauge-invariant shadow data suffices to close Faraday's law on the registered classical-vacuum domain, without reconstructing the electromagnetic potential globally. This essay translates that receipt into an AI-safety method template. It does not derive new physics or prove a theorem about AI; it asks whether the same three properties that made Branch A earnable in EM — local gauge-invariant readout, structural zero, and named quarantine — are also the properties that distinguish robust AI behavioral guarantees from empirical tolerance tests.

SKETCH · interpretation, not derivation
IP rights reserved · patent filing candidate
Faraday Branch A is the load-bearing example
Class B candidate · local Bucket 1 ready
citation hook staged in good faith

The Gap

Why an essay between the receipt and the headline claim.

Sundog's stack of experiments — atlas, three-body, Isotrophy K_facet, mesa, structural-failure, capset, Faraday — is a trail of receipts. Each one is honest about what it does and does not show. None of them, on its own, makes a public-facing claim about AI safety: that would betray the rigor that earns the receipts in the first place.

But the receipts are the wrong size for a curious electrical engineer or a working alignment researcher to extrapolate from. The Faraday page proves Branch A; it does not explain why Branch A matters outside electrodynamics. That gap is what this essay tries to close — once, in one place — by writing the translation down.

The receipts are not the safety claim. The receipts are the proof that the safety claim's method is honest.

01 Local, gauge-invariant readout, no global reconstruction.

The shadow operator P_shadow reads a gauge-invariant scalar from a small stencil of the electromagnetic potential. It never asks for a globally consistent choice of A. The readout is local by construction (the four edges of one plaquette) and gauge-invariant by Stokes: a gauge shift A → A + dχ adds an exact form, and the holonomy of an exact form vanishes on every closed loop.

In Faraday (Branch A)

The plaquette holonomy ∮ A on a coordinate square equals the flux ∫ F through it. The potential A is never reconstructed globally; only its line integral on one closed loop appears. Gauge invariance is exact, not approximate.
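The invariance described above can be checked on a toy lattice. The sketch below is an illustration only, not the Faraday page's P_shadow implementation: the grid size, the link values, and the function names (`plaquette_holonomy`, `gauge_transform`) are assumptions introduced here. It uses exact rationals so that "unchanged" means equal, not close.

```python
from fractions import Fraction

def plaquette_holonomy(ax, ay, i, j):
    """Oriented sum of link values around the unit square with corner (i, j)."""
    return ax[i][j] + ay[i + 1][j] - ax[i][j + 1] - ay[i][j]

def gauge_transform(ax, ay, chi):
    """Return A + d(chi): each link gains chi(end) - chi(start)."""
    n = len(chi)
    gx = [[ax[i][j] + chi[i + 1][j] - chi[i][j] for j in range(n)]
          for i in range(n - 1)]
    gy = [[ay[i][j] + chi[i][j + 1] - chi[i][j] for j in range(n - 1)]
          for i in range(n)]
    return gx, gy

N = 4  # sites per side (illustrative assumption)

# Arbitrary exact link values (assumptions, not data from the experiment).
# ax[i][j] sits on the link (i, j) -> (i+1, j); ay[i][j] on (i, j) -> (i, j+1).
ax = [[Fraction(i * j + 1, 3) for j in range(N)] for i in range(N - 1)]
ay = [[Fraction(i - 2 * j, 5) for j in range(N - 1)] for i in range(N)]

# An arbitrary gauge function chi on the sites.
chi = [[Fraction(i * i + 3 * j, 2) for j in range(N)] for i in range(N)]

gx, gy = gauge_transform(ax, ay, chi)

# The potential itself changes under the gauge shift...
potential_changed = (gx != ax) or (gy != ay)

# ...but every four-edge readout is exactly the same.
holonomy_unchanged = all(
    plaquette_holonomy(ax, ay, i, j) == plaquette_holonomy(gx, gy, i, j)
    for i in range(N - 1)
    for j in range(N - 1)
)
```

Each readout touches four links and no others, which is the sense of "local" used in this section; the contributions of chi cancel corner by corner around the loop.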

Translated to AI behavior

A behavioral verifier reads from a small stencil of the agent's trajectory or output, and does not require reconstructing the agent's full latent world-model. Two agents whose internals differ by an irrelevant reparameterisation (the AI analog of a gauge) produce the same verdict.

the load-bearing rhyme: locality + invariance, not depth of inspection

Today's alignment work mostly picks one of two extremes: take the model apart (mechanistic interpretability) or stare at its outputs (behavioral evals). The shadow position is the third option that earned Branch A in EM — small stencil, exact invariance — and whether it is reachable for any nontrivial AI behavior is an open empirical question, not a Sundog claim. The three-body workbench shows the same local-probe lineage in a chaotic dynamical system; the σ₃ detector there is a non-EM cousin of P_shadow.

02 Structural zero, not tolerance threshold.

The Phase 3 derivation gives R_F^0(S) = 0 by an algebraic identity: dF = d(dA) = 0. The zero is not a number below a tolerance. It is a structural consequence of the operator's typing. A reader does not have to trust a threshold; the residual is zero by construction. This is the same distinction that earned Isotrophy K_facet's 20 structural-zero receipts and the same one routed through a different apparatus on the alignment Bayes comparator.
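The identity behind the structural zero can be written out in components. This is the standard textbook form, offered as a reading aid and not as a transcription of the Phase 3 derivation; the index notation is an assumption of this sketch, and the residual R_F^0(S) is not redefined here.

```latex
\begin{aligned}
F_{\mu\nu} &= \partial_\mu A_\nu - \partial_\nu A_\mu, \\
(dF)_{\lambda\mu\nu}
  &= \partial_\lambda F_{\mu\nu} + \partial_\mu F_{\nu\lambda} + \partial_\nu F_{\lambda\mu} \\
  &= (\partial_\lambda\partial_\mu - \partial_\mu\partial_\lambda) A_\nu
   + (\partial_\mu\partial_\nu - \partial_\nu\partial_\mu) A_\lambda
   + (\partial_\nu\partial_\lambda - \partial_\lambda\partial_\nu) A_\mu
   = 0 .
\end{aligned}
```

Every bracket vanishes because mixed partial derivatives commute wherever A is sufficiently smooth. With the usual identification of E and B with the components of F, the components carrying one time index reproduce Faraday's law, \nabla \times \mathbf{E} + \partial_t \mathbf{B} = 0, and the purely spatial component gives \nabla \cdot \mathbf{B} = 0. No tolerance appears anywhere in the cancellation.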

In Faraday (Branch A)

The closure residual evaluates to an algebraic zero. The Phase 4 battery's tolerance = 1e-9 is a sanity-check floor for the supporting numerical spot-checks, not the standard the Branch A claim is held to.
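The difference between an algebraic zero and a tolerance floor can be made concrete on a lattice cube. The sketch below is a hedged illustration, not the Phase 4 battery: the potentials `exact_a` and `float_a`, the helper names, and the chosen site are assumptions. The same residual is evaluated once in exact rational arithmetic and once in floating point.

```python
from fractions import Fraction
import math

TOLERANCE = 1e-9  # mirrors the floor quoted above; its use here is illustrative

def shift(site, mu):
    """Neighbouring site one step along axis mu."""
    s = list(site)
    s[mu] += 1
    return tuple(s)

def flux(a, site, mu, nu):
    """Plaquette flux F_{mu nu} at site: oriented sum of a over its four links."""
    return (a(site, mu) + a(shift(site, mu), nu)
            - a(shift(site, nu), mu) - a(site, nu))

def cube_residual(a, site):
    """Discrete dF on the unit cube at site: net flux through its six faces."""
    total = 0
    for mu, nu, lam in ((1, 2, 0), (2, 0, 1), (0, 1, 2)):
        total += flux(a, shift(site, lam), mu, nu) - flux(a, site, mu, nu)
    return total

def exact_a(site, mu):
    """Arbitrary rational-valued potential (an assumption for illustration)."""
    x, y, z = site
    return Fraction(x * x * (mu + 1) + 3 * y * z - mu * z, 7)

def float_a(site, mu):
    """Arbitrary floating-point potential (an assumption for illustration)."""
    x, y, z = site
    return math.sin(0.3 * x + 0.7 * y * (mu + 1) - 1.1 * z)

exact_residual = cube_residual(exact_a, (1, 2, 3))  # a Fraction, compared with ==
float_residual = cube_residual(float_a, (1, 2, 3))  # a float, compared with <=

structural_zero = exact_residual == 0
within_floor = abs(float_residual) <= TOLERANCE
```

In the exact run every link value enters the sum twice with opposite signs, so the residual is zero for any potential whatsoever. The floating-point run only supports the weaker statement that rounding error stays under the floor, which is the role the text assigns to the numerical spot-checks.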

Translated to AI behavior

The distinction is between behavioral guarantees that come from the architecture (a constraint the model cannot violate) and guarantees that come from empirical testing (a constraint the model has not been observed to violate at scale). The first is a structural zero; the second is a tolerance threshold.

the load-bearing rhyme: identity, not benchmark

Most public AI-safety work today is tolerance-based: test on 10,000 prompts, report no observed violation, publish. The shadow apparatus offers a different target — show the violation is ruled out by the operator's structure — and registers in advance what would count as a structural failure versus a tolerance failure. Reaching that target for any nontrivial AI behavior is hard. This essay does not claim it has been done. It claims that the distinction is real and worth drawing.
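A deliberately trivial sketch of the two kinds of guarantee follows. Everything in it is hypothetical: `raw_policy`, `guarded_policy`, the bound `LIMIT`, and the sampling range are assumptions chosen so that the contrast is visible. Clamping an output is far simpler than any behavioral property of interest, so this illustrates the distinction and makes no claim that a structural zero has been reached for a real system.

```python
import random

LIMIT = 1.0  # hypothetical bound on the action magnitude

def raw_policy(x):
    """Hypothetical policy: well-behaved on typical inputs, unbounded in general."""
    return 0.001 * x

def guarded_policy(x):
    """Same policy behind a clamp; the bound holds by construction."""
    return max(-LIMIT, min(LIMIT, raw_policy(x)))

def tolerance_test(policy, n, rng):
    """Empirical check: no violation observed on n sampled inputs."""
    return all(abs(policy(rng.uniform(-100, 100))) <= LIMIT for _ in range(n))

rng = random.Random(0)
raw_passes_sampling = tolerance_test(raw_policy, 10_000, rng)

# An input outside the sampled range: the tolerance result says nothing about it.
rare_input = 5000
raw_violates = abs(raw_policy(rare_input)) > LIMIT
guard_holds = abs(guarded_policy(rare_input)) <= LIMIT
```

The sampled test passes for `raw_policy` and is still silent about `rare_input`. The clamp needs no sampling at all, because the bound follows from how `guarded_policy` is composed.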

03 Named quarantine, pre-registered before the algebra.

Phase 2 enumerated five failure modes before Phase 3 began: regularity, topology, monopole, operator-stencil commutator, and motional EMF. Each is paired with the branch it would force (B-quarantine or C-bounded-failure). When the Phase 4 monopole probe tripped, it tripped a hook that already had a name and a verdict. The precedent is the K_facet v0.3h O_617 quarantine: that audit landed 20 structural zeros and one named quarantine — not because the audit failed, but because the failure was registered in advance with a specific representation-level reason.

In Faraday (Branch A)

The five quarantine hooks are not edge cases discovered after the fact. They were filed in the ledger before the derivation ran, so any surprise in Phase 3 had to land in a pre-registered category. No mid-experiment invention of new failure classes.

Translated to AI behavior

A pre-mortem for behavioral failure: name the kinds of inputs, conditions, or mode-shifts under which the behavioral guarantee is expected to fail, and the specific way it fails, before training or evaluation. A surprise that has no pre-registered class is a class-C result — the prior is updated, the scope is corrected.
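One way to make the pre-mortem mechanical is a ledger that is sealed before evaluation starts. The sketch below is a minimal illustration under stated assumptions: the `Ledger` and `FailureClass` names, the dictionary shape of a failure record, and the two example classes are hypothetical and are not part of the Sundog tooling.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class FailureClass:
    name: str
    forced_branch: str               # e.g. "B-quarantine" or "C-bounded-failure"
    matches: Callable[[dict], bool]  # does an observed failure belong here?

class Ledger:
    def __init__(self):
        self._classes = []
        self._sealed = False

    def register(self, failure_class):
        # New classes are accepted only before the ledger is sealed.
        if self._sealed:
            raise RuntimeError("ledger sealed: no new failure classes mid-experiment")
        self._classes.append(failure_class)

    def seal(self):
        self._sealed = True

    def classify(self, failure):
        if not self._sealed:
            raise RuntimeError("seal the ledger before classifying failures")
        for fc in self._classes:
            if fc.matches(failure):
                return f"{fc.name} -> {fc.forced_branch}"
        # No pre-registered class: recorded as a class-C result, per the text.
        return "unregistered -> class C"

ledger = Ledger()
# Hypothetical failure classes for a hypothetical refusal property.
ledger.register(FailureClass("language-shift", "B-quarantine",
                             lambda f: f.get("language") != "en"))
ledger.register(FailureClass("long-context", "C-bounded-failure",
                             lambda f: f.get("tokens", 0) > 8000))
ledger.seal()

registered_hit = ledger.classify({"language": "de", "tokens": 120})
surprise = ledger.classify({"language": "en", "tokens": 300})
```

Sealing is the point of the design: once evaluation begins, an observed failure either lands in a class that already had a name and a verdict, or it is recorded as a surprise that corrects the scope.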

the load-bearing rhyme: pre-registered failure beats post-hoc explanation

This is the cheapest translation to start practising in AI work today, and the one most often elided. A safety case that names its Aharonov-Bohm analog, its monopole analog, and its moving-boundary analog before the model is trained is a different artifact than a safety case that explains each observed failure as it appears.

Convergent Ground

Same shape, different substrate.

Sundog's name comes from the atmospheric halo phenomenon: a 3D ice-crystal geometry casts a 2D shadow that nevertheless preserves enough of the original structure to be read off locally. The shadow apparatus on the Faraday page formalises that move algebraically: a small stencil on a local patch recovers gauge-invariant content without reconstructing the global potential.

A separate, on-the-ground line of work is asking what the same move looks like inside an autoregressive transformer's residual stream. The shape the territory is converging on is that bottleneck layers act, effectively, as companders — compressing activations into a low-rank subspace and expanding them back out — and the subspaces that emerge in that bottleneck appear to separate into two orthogonal pieces: categorical centroids on one side, generator algebras on the other. Among the measured generators, rotations (so(3)) rank near the top across many models.

This essay is not the source of that finding, does not cite it (the paper has not been pulled into the citation rail yet, by mutual courtesy), and does not claim it proves any of the three translations above. The reason it is worth mentioning at all is that the convergent shape is a load-bearing signal: the same body/shadow decomposition that earned Branch A in Faraday may have a cousin inside the model the AI translations are about. When the citation lands, this card becomes sharper. Until then, the territory is described in the territory's own vocabulary, without borrowing the credit.

Good-Faith Hooks

What the bridge is allowed to borrow.

The essay is useful only if it stays courteous to the work around it. The Faraday receipt supplies a worked method; the pending compander citation supplies a possible mechanistic neighbor when it becomes public. Neither is allowed to launder a stronger AI-safety claim than the receipts support.

Credit waits for the source.

The compander-shaped territory is described without naming or citing private work. The rollout hook is staged, but the named card only appears after a public citation or explicit go-ahead.

No borrowed proof.

Branch A proves the electromagnetic receipt in its registered domain. A transformer-substrate citation, when named, can sharpen the analogy; it cannot prove the three AI translations.

Falsifiers stay in front.

Each of the three translations above keeps a next experiment and a failure branch attached. A surprise that misses the named hooks weakens the bridge instead of being explained away after the fact.

Claim Boundary

Not a proof of AI safety.

The Faraday receipt is a proof. This essay is a translation of the receipt's method into AI-safety language. The translation has not been run against a single nontrivial AI system, and until it is, it is a hypothesis.

Not a theory of interpretability.

Mechanistic interpretability inspects model internals; behavioral evals inspect outputs. The shadow position sits between the two, with explicit locality and exact invariance. It is a different target, not a refutation of the others.

Not a recipe.

There is no method here for building a structural zero in any specific AI system. The three claims are what to aim at, not how to achieve it. The how is what the next experiments would test.

Not yet externally promoted.

The local share surface is now prepared: bespoke og:image, metadata, sitemap coverage, and the /index → /faraday → /safety-method path are present. Post-deploy validators and the citation ratchet still gate any external push.

What Would Test The Translation

Three pre-registered next experiments.

A bridge essay that cannot be falsified is not a bridge. The three claims above are each falsifiable. The cheapest experiments that would either strengthen or kill the translation are listed below. None has been run.

tests claim 01

Shadow verifier on a toy policy.

Pick a small RL or sequence policy where the local stencil can be a sliding window of actions, and ask whether a constraint can be checked stencil-only and invariant under a registered equivalence. Falsifier: any policy where the constraint requires unbounded lookback.
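A toy version of this experiment might look like the sketch below. The action vocabulary, the constraint (no two consecutive "fire" actions), the window width, and the left/right relabeling used as the registered equivalence are all assumptions picked for illustration; none of this has been run against a real policy.

```python
def window_ok(window):
    """Local predicate on one stencil: the window is not all 'fire'."""
    return not all(action == "fire" for action in window)

def verify(trajectory, width=2):
    """Stencil-only check: slide a fixed-width window along the trajectory."""
    return all(
        window_ok(trajectory[i:i + width])
        for i in range(len(trajectory) - width + 1)
    )

def relabel(trajectory):
    """Registered equivalence (assumed): swapping 'left' and 'right' is irrelevant."""
    swap = {"left": "right", "right": "left"}
    return [swap.get(action, action) for action in trajectory]

safe = ["left", "fire", "right", "fire", "left"]
unsafe = ["left", "fire", "fire", "right"]

verdict_safe = verify(safe)      # no window of two consecutive 'fire' actions
verdict_unsafe = verify(unsafe)  # the window at positions 1-2 fails

# The verdict does not change under the registered equivalence.
invariant = all(verify(t) == verify(relabel(t)) for t in (safe, unsafe))
```

A constraint such as "fire at most three times in the whole episode" cannot be decided from any fixed-width window, which is the kind of case the falsifier above names.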

tests claim 02

Structural vs tolerance audit.

Take an existing alignment evaluation and classify each predicate as structural-by-construction or tolerance-by-empirical-floor. Falsifier: an evaluation where every predicate is already structural — in which case the distinction adds no information.
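The audit can be sketched as a small tally. The predicate names and their labels below are invented for illustration and do not describe any existing evaluation; the only logic taken from the text is the falsifier, under which the distinction is uninformative when every predicate is already structural.

```python
from collections import Counter
from dataclasses import dataclass

KINDS = {"structural", "tolerance"}

@dataclass(frozen=True)
class Predicate:
    name: str
    evidence: str  # "structural" (by construction) or "tolerance" (empirical floor)

def audit(predicates):
    """Tally predicates by evidence kind and evaluate the falsifier."""
    counts = Counter(p.evidence for p in predicates)
    unknown = set(counts) - KINDS
    if unknown:
        raise ValueError(f"unrecognised evidence kinds: {sorted(unknown)}")
    return {
        "structural": counts["structural"],
        "tolerance": counts["tolerance"],
        # Falsifier from the text: if nothing is tolerance-based, the
        # classification adds no information about this evaluation.
        "distinction_informative": counts["tolerance"] > 0,
    }

# Hypothetical predicates for a hypothetical evaluation.
report = audit([
    Predicate("output length capped by the decoder", "structural"),
    Predicate("no policy violation on held-out prompts", "tolerance"),
    Predicate("no refusal bypass observed in red-team set", "tolerance"),
])
```

The labelling of each predicate is the actual work and remains a human judgment; the code only keeps the count honest and surfaces the falsifier condition.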

tests claim 03

Pre-mortem catalog for one behavioral claim.

Pick a single behavioral property (e.g., refusal under a specific prompt class) and write the five-to-seven named failure classes before running any evaluation. Falsifier: an observed failure that does not land in any pre-registered class.

Inspection Trail

Shadow Faraday Zero-Out · the worked example anchoring this essay
Phase 5 Chapter Close · receipts catalog, limitations, extensions, fidelity audit
Shadow Faraday roadmap v0.1
Alignment workbench · Bayes comparator and tolerance-vs-structural framing
Isotrophy K_facet v0.3h · precedent for the named-quarantine discipline
Three-body workbench · local-probe pattern lineage
Sundog Legend · Shadow Faraday vocabulary cluster
Scientific Criteria
Claims and Scope Policy
Sundog repository on GitHub