# Phase 1 LQG Proof Draft

> Phase 1 artifact for
> [`COARSE_GRAINING_PROOF_ROADMAP.md`](../COARSE_GRAINING_PROOF_ROADMAP.md).
> Status: reviewed, 2026-05-16. Phase 0 definitions are locked in
> [`POSTULATE1_DEFINITIONS.md`](POSTULATE1_DEFINITIONS.md). This records a
> positive LQG existence proof and a qualified Postulate 6 toy check; the
> reviewer pass is complete (see *Exit Status*) and Phase 1 closes positive.
> Not yet final theorem prose — public-prose citations are deferred per
> roadmap §5.

## Entry And Gate

Phase 1 enters from the Phase 0 convention that `X` is the decision information
state, not necessarily the raw hidden plant state. The pre-registered negative
for this phase is:

> If signature-only LQG control does not reach Bayes-optimal cost on the
> `𝓕_σ`-measurable set, the schema is false in its easiest case — halt the
> whole roadmap and file the falsification against
> [`SCIENTIFIC_CRITERIA.md`](../SCIENTIFIC_CRITERIA.md) ▸ Falsifiable
> Expectations.

In the uncoarsened full-belief signature the `𝓕_σ`-measurable set is the
*entire* evaluation support, so the gate's "on the measurable set" qualifier is
met everywhere; the off-set boundary is reached only by deliberate coarsening
and is deferred to Phase 3 (see *Boundary Inside The Toy*).

Draft verdict: **negative not triggered** under standard finite-horizon LQG
assumptions. The Kalman belief is a sufficient statistic for Bayes-optimal
control, so the optimal policy is measurable with respect to the signature
sigma-algebra generated by that belief statistic.

## Model

Use the standard finite-horizon partially observed linear-Gaussian control
problem:

```text
z_{t+1} = F_t z_t + B_t a_t + w_t
y_t     = H_t z_t + v_t
```

where `z_t ∈ R^n` is the hidden plant state, `a_t ∈ R^m` is the control, and
`y_t` is the observation. The prior on `z_0` is Gaussian, and `w_t` and `v_t` are
zero-mean Gaussian noises with known covariances, independent of each other,
across time, and of `z_0`. The objective is quadratic:

```text
J(π) = E[ z_T' Q_T z_T + Σ_{t=0}^{T-1} (z_t' Q_t z_t + a_t' R_t a_t) ].
```

Assume `Q_t` is positive semidefinite and `R_t` is positive definite. The
controller's decision information at time `t` is the history
`h_t = (y_0, ..., y_t, a_0, ..., a_{t-1})`.

For this phase:

| Phase 0 symbol | LQG instantiation |
| --- | --- |
| `X_t` | The set of admissible histories `h_t`, equivalently their Gaussian belief states. |
| `Φ_t(h_t)` | The Kalman posterior statistic `(m_t, P_t)`, where `m_t = E[z_t \| h_t]` and `P_t = Cov(z_t \| h_t)`. |
| `Σ_t` | The space of admissible Gaussian belief parameters `(m, P)`. |
| `J` | The quadratic LQG objective above. |
| `π*` | The Bayes-optimal LQG controller under the history information regime. |

With a fixed model and a fixed observation schedule, the Kalman covariance
recursion does not involve the realized observations or controls, so `P_t` is a
deterministic function of `t` and the minimal signature reduces to `m_t`. This
draft keeps `(m_t, P_t)` in `Φ_t` so the sufficiency statement remains explicit.
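
As a minimal sketch of the statistic `Φ_t`, the one-step Kalman update below maps `(m_t, P_t, a_t, y_{t+1})` to `(m_{t+1}, P_{t+1})`. It assumes a time-invariant model. The noise covariances `W` and `V`, the function name `kalman_step`, and the numeric values are illustrative assumptions, not part of the Phase 0 ledger. The filter gain is named `L_gain` to keep it apart from the control gain.

```python
import numpy as np

def kalman_step(m, P, a, y_next, F, B, H, W, V):
    """One Kalman update: (m_t, P_t, a_t, y_{t+1}) -> (m_{t+1}, P_{t+1}).

    W and V are the process- and observation-noise covariances (assumed names).
    """
    # Time update: push the belief through the linear dynamics.
    m_pred = F @ m + B @ a
    P_pred = F @ P @ F.T + W
    # Measurement update: condition on the new observation.
    innov_cov = H @ P_pred @ H.T + V
    L_gain = P_pred @ H.T @ np.linalg.inv(innov_cov)  # filter gain, not the control gain
    m_next = m_pred + L_gain @ (y_next - H @ m_pred)
    P_next = (np.eye(len(m)) - L_gain @ H) @ P_pred
    return m_next, P_next

# Illustrative scalar model (all numbers are assumptions).
F = np.array([[1.0]])
B = np.array([[1.0]])
H = np.array([[1.0]])
W = np.array([[1.0]])
V = np.array([[2.0]])
m0, P0 = np.array([0.0]), np.array([[1.0]])

m1, P1 = kalman_step(m0, P0, np.array([1.0]), np.array([3.0]), F, B, H, W, V)
m1_alt, P1_alt = kalman_step(m0, P0, np.array([-4.0]), np.array([0.5]), F, B, H, W, V)
# P1 and P1_alt coincide: the covariance update never reads a or y_next.
```

In this toy run the two calls use different actions and observations, so the posterior means differ while the posterior covariances agree.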

### Local Symbols

Phase-1-local symbols, defined once so Phase 2 can import the notation without
collision (Phase 0 review standard: every symbol defined once):

| Symbol | Definition |
| --- | --- |
| `m_t` | Kalman posterior mean, `E[z_t \| h_t]`. |
| `P_t` | Kalman posterior (filter) covariance, `Cov(z_t \| h_t)`. |
| `Π_t` | **Control** Riccati matrix from the LQG backward recursion. Distinct from the filter covariance `P_t` — same neighbourhood of the alphabet, different object; do not conflate. |
| `K_t` | LQG feedback gain derived from `Π_t`. |

Note the `S` overload, carried intentionally and not redefined here: the legacy
scalar readout `S(x)` (Phase 0 ledger) and the toy readout `S(τ)` in the
Postulate 6 section are the *same* scalar-coordinate-of-`Φ` device, not a
control-Riccati `S`. The control Riccati matrix is `Π_t`, never `S`.

## Theorem

In finite-horizon LQG, the Bayes-optimal policy is `𝓕_σ`-measurable for
`σ_t = Φ_t(h_t) = (m_t, P_t)`. Therefore LQG is Sundog-solvable under the Kalman
belief signature: there exists a measurable `g_t : Σ_t → A` such that

```text
π*_t(h_t) = g_t(Φ_t(h_t))
```

for each decision time `t`.

## Proof

1. **Belief closure.** By the Kalman filter, the conditional distribution of
   `z_t` given the full history `h_t` is Gaussian with parameters `(m_t, P_t)`.
   The map from `(m_t, P_t, a_t, y_{t+1})` to `(m_{t+1}, P_{t+1})` is a
   deterministic function under the fixed model, and the conditional law of
   `y_{t+1}` given `(h_t, a_t)` depends on the history only through
   `(m_t, P_t)`. Thus the belief statistic is a controlled Markov state for
   the partially observed problem.

2. **Quadratic value form.** Dynamic programming over the belief state gives a
   cost-to-go of the form

   ```text
   V_t(m_t, P_t) = m_t' Π_t m_t + tr(Π_t P_t) + c_t
   ```

   where `Π_t` is the **control** Riccati matrix (the LQG backward recursion),
   *not* the filter covariance `P_t`. The `tr(Π_t P_t)` term and `c_t` are
   written schematically: the exact estimation-error / noise-trace bookkeeping
   is standard LQG and is not load-bearing for this existence proof — the
   load-bearing facts are the minimizer form (step 3) and cost attainment
   (step 5). Final theorem text, if promoted to public prose, should either
   carry the full Riccati bookkeeping or cite a standard LQG reference for it.
   Under the standard LQG separation principle, `P_t` enters the cost-to-go only
   through terms that do not depend on the action, so the minimizing action is
   the same linear feedback applied to the posterior mean regardless of `P_t`.

3. **Optimal action depends only on the signature.** The minimizer has the form

   ```text
   a_t* = -K_t m_t
   ```

   in the zero-offset case, with the corresponding affine term in nonzero-target
   coordinates. Define `g_t(m, P) = -K_t m` (or its affine version). Then
   `π*_t(h_t) = g_t(Φ_t(h_t))`.

4. **Measurability.** The Kalman update and the linear feedback map are Borel
   measurable, and by step 3 `π*_t` factors through `Φ_t`, so `π*_t` is
   measurable with respect to the sigma-algebra generated by `Φ_t`. This is
   exactly the Phase 0 Sundog-solvable predicate.

5. **Cost equality.** Because `g_t(Φ_t(h_t))` is the Bayes-optimal action at each
   dynamic-programming step, a controller that receives only `Φ_t(h_t)` reaches
   the Bayes-optimal LQG cost. No reconstruction of the realized hidden state
   `z_t` is required.

This proves the computable-case existence result the roadmap asks for: a
many-to-one signature can be control-sufficient even when it is not
state-reconstructive.
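
Steps 2, 3 and 5 can be checked numerically on a small instance. The sketch below assumes a time-invariant model and uses the textbook backward Riccati recursion. The helper names (`control_riccati`, `g`, `expected_q`), the process-noise covariance `W`, and all numeric values are assumptions made for illustration. `expected_q` omits action-independent constants from future estimation error, which do not affect the minimizer.

```python
import numpy as np

def control_riccati(F, B, Q, R, Q_T, T):
    """Backward recursion for the control Riccati matrices Pi_t and gains K_t."""
    Pi = [None] * (T + 1)
    K = [None] * T
    Pi[T] = Q_T
    for t in range(T - 1, -1, -1):
        BtPi = B.T @ Pi[t + 1]
        K[t] = np.linalg.solve(R + BtPi @ B, BtPi @ F)
        Pi[t] = Q + F.T @ Pi[t + 1] @ (F - B @ K[t])
    return Pi, K

def g(t, m, P, K):
    """Signature-only policy g_t(m, P) = -K_t m; P is accepted but never used."""
    return -K[t] @ m

def expected_q(m, P, a, F, B, Q, R, Pi_next, W):
    """E[z' Q z + a' R a + z_next' Pi_next z_next] under the belief N(m, P)."""
    z_next_mean = F @ m + B @ a
    return (m @ Q @ m + np.trace(Q @ P) + a @ R @ a
            + z_next_mean @ Pi_next @ z_next_mean
            + np.trace(Pi_next @ (F @ P @ F.T + W)))

# Illustrative double-integrator-like instance (all numbers are assumptions).
F = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[0.5]])
W = 0.01 * np.eye(2)
Pi, K = control_riccati(F, B, Q, R, Q, T=10)

m = np.array([1.0, -2.0])
P_small, P_large = 0.1 * np.eye(2), 5.0 * np.eye(2)
a_star = g(0, m, P_small, K)   # identical to g(0, m, P_large, K)
```

On this instance `g` returns the same action for `P_small` and `P_large`, and perturbing that action raises `expected_q` under either covariance, which is the step 3 claim that the covariance does not move the minimizer.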

## Boundary Inside The Toy

The proof also names the failure boundary. If `Φ_t` is coarsened below the
Kalman belief so that two histories with the same signature require different
Bayes-optimal feedback actions on positive measure, then no signature-only
policy over that coarser `Φ_t` can be Bayes-optimal. In LQG terms, this happens
when the coarsening discards a direction of `m_t` that lies outside the null
space of the gain `K_t`, so that `K_t m_t` is no longer determined by the
signature.

That boundary is the LQG version of the pushable-occluder pattern: the missing
bit is not cosmetic state detail; it changes the optimal action.
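
The boundary can be made concrete with a two-component posterior mean. In this sketch the gain `K`, the coarsening `coarse_phi` (which keeps only the first mean component), and the numbers are all hypothetical, chosen so that the discarded component carries control authority.

```python
import numpy as np

K = np.array([[0.8, 1.5]])      # hypothetical gain: both mean components matter

def coarse_phi(m):
    """Hypothetical coarsened signature: keep only the first mean component."""
    return m[:1]

m_a = np.array([1.0, 0.0])
m_b = np.array([1.0, 2.0])

# Same coarse signature, different Bayes-optimal actions -K m.
same_signature = np.array_equal(coarse_phi(m_a), coarse_phi(m_b))
action_a, action_b = -K @ m_a, -K @ m_b

# Contrast: if the gain ignores the dropped component, the coarsening is harmless.
K_blind = np.array([[0.8, 0.0]])
harmless = np.allclose(-K_blind @ m_a, -K_blind @ m_b)
```

Any policy that sees only `coarse_phi(m)` must issue one action for both `m_a` and `m_b`, so it cannot match both `action_a` and `action_b`. With `K_blind` the two optimal actions coincide and the same coarsening costs nothing.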

## Postulate 6 Toy Check

The anniversary thread asks whether the founding `H(x) = ∂S/∂τ` line is really
an information term. In the linear-Gaussian toy, the honest statement is a
metric one.

Let a local scalar or vector torque coordinate `τ` affect a scalar/vector
signature readout `S(τ)` near the optimum, and suppose the measured signature is

```text
y = S(τ) + ε,     ε ~ N(0, R).
```

Let `G = ∂S/∂τ` be the local signature Jacobian. Here `R` is the readout-noise
covariance, not the control weight `R_t` of the *Model* section. The Fisher
information of the signature channel with respect to `τ` is

```text
I_Φ(τ) = G' R^{-1} G.
```

For the scalar case, this is

```text
I_Φ(τ) = (1 / R) (∂S/∂τ)^2.
```

Recorded result: **qualified pass.** Fisher information is the noise-weighted
pullback metric of the signature Jacobian. In the scalar constant-noise toy the
magnitude of the founding `∂S/∂τ` term is therefore the square-root information
scale, `|∂S/∂τ| = sqrt(R · I_Φ)`; in the vector toy it is the Jacobian whose
metric contraction gives Fisher information. A literal signed-scalar reading,
`I_Φ ∝ ∂S/∂τ`, is not the invariant statement: Fisher information is quadratic
in the Jacobian, so it discards the sign/orientation that `∂S/∂τ` carries, and
linear proportionality fails as general wording.
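
A short numerical sketch of the Fisher formula, under the Gaussian readout model above. The function name `fisher_information` and the example Jacobians and noise covariances are assumptions for illustration only.

```python
import numpy as np

def fisher_information(G, R):
    """Fisher information G' R^{-1} G of y = S(tau) + eps, eps ~ N(0, R).

    G is the Jacobian dS/dtau with shape (readout_dim, tau_dim); scalars are
    promoted to 1x1 arrays.
    """
    G = np.atleast_2d(np.asarray(G, dtype=float))
    R = np.atleast_2d(np.asarray(R, dtype=float))
    return G.T @ np.linalg.solve(R, G)

# Scalar toy: I = (1/R) (dS/dtau)^2, so flipping the slope's sign changes nothing.
info_pos = fisher_information(3.0, 2.0)
info_neg = fisher_information(-3.0, 2.0)

# Vector toy: 2-dimensional readout, 2-dimensional tau.
G = np.array([[1.0, 0.0],
              [2.0, 1.0]])
R = np.diag([1.0, 4.0])
info_vec = fisher_information(G, R)
```

Here `info_pos` and `info_neg` are equal, which is the sense in which the information cannot be linearly proportional to a signed `∂S/∂τ`.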

Safe downstream wording after this draft:

> In the LQG toy, the founding `H` term becomes the local signature Jacobian;
> Fisher information is its noise-weighted metric contraction.

**Scope limit on `τ`.** The founding anniversary line `H(x) = ∂S/∂τ` had `τ` as
*proper time* (the year-one entropy arc). This toy reinterprets `τ` as a local
torque coordinate. The Jacobian/metric reading is therefore vindicated *only
under the torque-coordinate reading*; the toy does **not** by itself rescue the
original proper-time/entropy interpretation. This is an independent reason the
roadmap §4 strengthening clause must not be read as "the year-one
`H`-was-an-entropy arc is closed."

Do **not** promote the stronger anniversary line that `H` "was entropy all
along" unless Phase 1 review accepts this metric/Jacobian interpretation and
later phases preserve it.

## Exit Status

Phase 1 proof draft: **positive under standard LQG assumptions.**

Postulate 6 toy check: **qualified pass for the metric/Jacobian reading; fail
for literal signed-scalar proportionality.**

Reviewer pass: **complete, 2026-05-16.** The measurability chain (steps 1–4),
cost attainment (step 5), the gate, and the Postulate 6 Fisher derivation were
checked and are correct. Symbol hygiene, the schematic value-form, the gate
quote, and the `τ` reinterpretation were flagged and are now resolved in this
draft (Local Symbols ledger; step 2 control-Riccati naming + schematic note;
Entry And Gate consequence + measurable-set scope; Postulate 6 `τ` scope limit).
The roadmap §4 strengthening clause was reconciled to the qualified-pass reality.

Resolved review-item dispositions:

1. **Keep `(m_t, P_t)` as `Φ_t`** in the theorem statement. Under
   deterministic-covariance schedules `P_t` is a known function of `t`, so the
   *effective* signature reduces to `m_t`; but keeping both makes sufficiency
   robust to the cost convention and to observation-dependent covariance
   (intermittent / missing sensing) and matches the Phase 0 handoff wording.
2. **Citations deferred, non-blocking.** The proof track is research-internal
   (roadmap §5); citations are required only if this reaches the Phase 5 public
   gate.
3. **Phase 2 must reuse the decision-information-state convention** — confirmed
   and pre-committed: finite-MDP `X` is the information state / belief simplex,
   predicate = constancy of `π*` on `Φ`-fibers. This is binding so Phase 2 does
   not fork notation (roadmap §2 no-fork constraint).

Phase 1 closes **positive**; Phase 2 entry (Phase 1 exit positive) is satisfied.

