Alignment & Bayes - Sundog Research Lab

Alignment, Generally

The broad problem

In machine learning and control, alignment is the question of whether a system's behavior remains coupled to the intended objective, constraints, and users when uncertainty, optimization pressure, partial observation, and distribution shift enter the room.

That broad field includes reward design, reinforcement learning, imitation learning, Bayesian filtering, POMDP planning, robustness testing, interpretability, oversight, red teaming, and formal verification. Sundog is not a replacement for that field.

Sundog's smaller claim is about route fidelity under hidden-state control.

A run is not aligned merely because it succeeds. It is aligned, in the Sundog sense, only if the system used the admitted trace, stayed inside the stated information lane, and exposed the boundary where that trace no longer supports action.

What Sundog Means

The local definition

Sundog studies cases where the decisive state is hidden but the world still leaks useful structure. The lab builds small workbenches where each information source is typed before stronger language is allowed.

For AI mathematics under traceability, the cap-set workbench is the analogous public surface: a hands-on primer that credits the outside discovery while keeping Sundog's role to apparatus, boundary, and interpretation.

Hidden target: The controller is denied the decisive variable: target coordinates, pole angle, true mine layout, full orbital state, or internal truth labels.
Admitted trace: The controller gets only the declared signal: light, shadow, pressure, local field readings, compressed game state, or bounded action history.
Route check: The result must survive no-leak audits, baselines, interventions, or structural failure predictions. Decodable is not the same as used.
Failure boundary: The public claim names where the trace aliases, collapses, delays, or stops being enough.
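
The gap between "decodable" and "used" can be made concrete with a toy intervention. The sketch below is an illustration under stated assumptions, not a Sundog audit: the two policies, the scoring rule, and the shuffle intervention are all hypothetical. A policy that really acts from the trace loses score when the trace is scrambled; a shortcut policy does not, even though the trace still encodes the hidden direction.

```python
import random

def score(policy, traces, targets):
    """Fraction of steps where the policy's action matches the hidden direction."""
    hits = sum(1 for trace, target in zip(traces, targets) if policy(trace) == target)
    return hits / len(targets)

def trace_policy(trace):
    """Acts from the admitted trace: steer toward the sign of the reading."""
    return 1 if trace > 0 else -1

def shortcut_policy(trace):
    """Ignores the trace entirely and always steers the same way."""
    return 1

def use_gap(policy, traces, targets, rng):
    """Score lost when the trace is shuffled across steps (the intervention)."""
    shuffled = traces[:]
    rng.shuffle(shuffled)
    return score(policy, traces, targets) - score(policy, shuffled, targets)

rng = random.Random(7)
targets = [rng.choice([-1, 1]) for _ in range(2000)]  # hidden direction per step
traces = [0.8 * t for t in targets]                   # hypothetical noise-free signal

trace_gap = use_gap(trace_policy, traces, targets, rng)
shortcut_gap = use_gap(shortcut_policy, traces, targets, rng)
print(f"trace policy gap: {trace_gap:.3f}, shortcut policy gap: {shortcut_gap:.3f}")
```

A large gap is evidence that the route runs through the trace; a zero gap says the behavior never depended on it, whatever a probe might decode.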

Existing Techniques

Not a strawman

The comparison is only useful if the other methods are treated strongly. Bayes does not require full sight. Reinforcement learning is not just brute force. Interpretability probes are useful, but they do not by themselves prove causal use.

Reward and RL: Optimizes behavior against a signal. Sundog asks whether the learned behavior remains coupled to the intended trace instead of a shortcut.

Bayesian inference: Maintains belief over hidden causes. Sundog asks when action can be controlled directly from response without reconstructing the full hidden world.

Model-based control: Uses dynamics and sensor models. Sundog shares the control discipline, but makes the information lane and failure map the central public artifact.

Interpretability: Can show a representation contains a variable. Sundog requires causal steerability or structural failure coincidence before calling the route traceable.

Bayes And Sundog

Belief versus response

The fair head-to-head is not "Bayes gets the world and Sundog gets shadows." A fair Bayesian baseline receives the same partial observations, then maintains a posterior over hidden state. Sundog receives the same observations, then acts from the local response field.

Bayes

Belief lane: P(hidden world | signal) ∝ likelihood × prior

Where is the hidden thing, probably? A particle Bayes lane answers by carrying many weighted guesses.
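
As a minimal sketch of "carrying many weighted guesses", the example below runs a one-dimensional particle update in which each guess is reweighted by likelihood times its prior weight. The Gaussian noise model, the identity signal model, and every name here are assumptions made for illustration. It omits resampling and is not the benchmark's particle Bayes lane.

```python
import math
import random

def gaussian_likelihood(observed, expected, sigma):
    """Unnormalized Gaussian likelihood of an observation given an expected signal."""
    return math.exp(-0.5 * ((observed - expected) / sigma) ** 2)

def signal_model(position):
    """Hypothetical sensor: the signal is the hidden position itself."""
    return position

def update_weights(particles, weights, observed, sigma):
    """One Bayes step: posterior weight is proportional to likelihood * prior weight."""
    unnormalized = [
        w * gaussian_likelihood(observed, signal_model(p), sigma)
        for p, w in zip(particles, weights)
    ]
    total = sum(unnormalized)
    if total == 0.0:
        # Every guess is incompatible with the evidence; fall back to uniform.
        return [1.0 / len(particles)] * len(particles)
    return [u / total for u in unnormalized]

def posterior_mean(particles, weights):
    return sum(p * w for p, w in zip(particles, weights))

rng = random.Random(0)
hidden = 3.0                                   # never shown to the filter directly
particles = [rng.uniform(0.0, 10.0) for _ in range(500)]
weights = [1.0 / len(particles)] * len(particles)

for _ in range(20):
    observed = signal_model(hidden) + rng.gauss(0.0, 0.5)
    weights = update_weights(particles, weights, observed, sigma=0.5)

estimate = posterior_mean(particles, weights)
print(f"posterior mean estimate: {estimate:.2f}")
```

The filter answers "where is the hidden thing, probably?" and leaves the choice of action to a separate planning step.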

Sundog

Trace lane: response coupling ≈ change in signal / change in action

Which action makes the trace improve?
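
One way to read "response coupling" is as a finite-difference probe: nudge the actuator, watch the admitted signal, and step in the direction that improves it. The sketch below assumes a hypothetical one-dimensional plant with a quadratic signal; `ToyWorld`, `probe`, and `gain` are illustrative names, not part of any Sundog workbench. The controller never reads the hidden target, only the signal.

```python
class ToyWorld:
    """Hypothetical plant: the signal peaks when position equals the hidden target."""

    def __init__(self, target, position=0.0):
        self._target = target      # hidden from the controller
        self.position = position

    def apply_action(self, delta):
        self.position += delta

    def read_signal(self):
        return -(self.position - self._target) ** 2

def response_coupling(world, probe):
    """Two-sided probe: estimated change in signal per unit change in action."""
    world.apply_action(+probe)
    up = world.read_signal()
    world.apply_action(-2 * probe)
    down = world.read_signal()
    world.apply_action(+probe)     # undo the probe before acting
    return (up - down) / (2 * probe)

def steer(world, steps=60, probe=0.05, gain=0.2):
    """Repeatedly step in the direction that makes the trace improve."""
    for _ in range(steps):
        world.apply_action(gain * response_coupling(world, probe))
    return world.position

world = ToyWorld(target=2.0)
final_position = steer(world)
print(f"final position: {final_position:.4f}")
```

No belief over the target is maintained at any point, which is also why this kind of controller has nothing to fall back on when the trace aliases or goes flat.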

Bayes (forecast map): Which hidden cause is most likely after the evidence?

Sundog (steering trace): Which action makes the admitted signal improve?

Ledger (claim hygiene): Which public sentence is actually licensed?

Bayes is for knowing under uncertainty; Sundog is for acting under partial sight. They are complementary, and the ledger below reports where each earns its keep, where each fails, and where neither earns the headline.

Reader glossary: misspecification is a wrong-world assumption; decoy / alias is a wrong or ambiguous trace; substrate-conditional means the claim belongs to the tested regime. Atmospheric-optics terms such as parhelion, circumzenithal arc, and 22° halo live in the halo legend.

Terminal accuracy and time-to-threshold are separate measurements, which is why the core task can tie on final accuracy while Bayes still wins on speed. Model-family collapse (s3) is not a normal Bayes loss; it is the model-less limit.
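
The two measurements can be sketched as separate functions over a per-step performance trace. The traces, the threshold, and the averaging window below are invented for illustration and are not benchmark data.

```python
def time_to_threshold(trace, threshold):
    """Index of the first step at or above the threshold, or None if never reached."""
    for step, value in enumerate(trace):
        if value >= threshold:
            return step
    return None

def terminal_accuracy(trace, window):
    """Mean performance over the final `window` steps."""
    tail = trace[-window:]
    return sum(tail) / len(tail)

# Hypothetical traces: both settle at the same level, one gets there sooner.
fast_lane = [0.50, 0.92, 0.95, 0.95, 0.95, 0.95]
slow_lane = [0.10, 0.30, 0.50, 0.70, 0.95, 0.95]

fast_time = time_to_threshold(fast_lane, threshold=0.9)
slow_time = time_to_threshold(slow_lane, threshold=0.9)
fast_final = terminal_accuracy(fast_lane, window=2)
slow_final = terminal_accuracy(slow_lane, window=2)
print(fast_time, slow_time, fast_final, slow_final)
```

Here the lanes tie on terminal accuracy while the fast lane crosses the threshold three steps earlier, which is the shape of result the ledger reports as a speed win rather than an accuracy win.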

Pressure Mines side-by-side

Pressure Mines is the clearest current Bayes plus Sundog alignment example because both lanes now run on the same admitted pressure field. The Phase 13 bundle is not just extra experiment data; it tells the public page which interpretation is allowed, then feeds the Pressure Mines ledger row.

Trace lane

Sundog lane

Phase 10 keeps the public claim local: in the confirmed pocket, sundog_minimal improved budget-adjusted safe-tile progress by +7.21875 versus naive_pressure, and sundog_lean also cleared the gate at +6.3125.

This is route evidence for pressure-field action, not a board-clearance claim.

Belief lane

Bayes lane

The Phase 12 all-cell reducer ran 14,720 same-field trials across 46 Phase 10 cells. bayes_frontier_pressure had zero negative mean-regret cells versus naive_pressure and sundog_minimal.

The hard lane is pressure-floor parity; sundog_lean stays a diagnostic headroom lane.
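
A count of negative mean-regret cells can be sketched as a paired per-cell reduction. The sign convention here is an assumption: each trial's difference is the candidate's score minus the comparator's on the same field, so a negative cell mean means the candidate trails in that cell. The cell labels, lane names, and scores are invented and do not come from the Phase 12 run.

```python
from collections import defaultdict
from statistics import mean

def mean_regret_by_cell(trials, candidate, comparator):
    """Per-cell mean of (candidate score - comparator score) over paired trials."""
    by_cell = defaultdict(list)
    for trial in trials:
        scores = trial["scores"]
        by_cell[trial["cell"]].append(scores[candidate] - scores[comparator])
    return {cell: mean(diffs) for cell, diffs in by_cell.items()}

def negative_cells(per_cell):
    """Cells where the candidate trails the comparator on average."""
    return sorted(cell for cell, value in per_cell.items() if value < 0)

# Hypothetical same-field trials: every lane is scored on the same field per trial.
trials = [
    {"cell": "A", "scores": {"bayes_floor": 5.0, "naive": 3.0, "headroom": 6.0}},
    {"cell": "A", "scores": {"bayes_floor": 4.0, "naive": 4.0, "headroom": 6.0}},
    {"cell": "B", "scores": {"bayes_floor": 2.0, "naive": 1.0, "headroom": 1.0}},
]

versus_naive = mean_regret_by_cell(trials, "bayes_floor", "naive")
versus_headroom = mean_regret_by_cell(trials, "bayes_floor", "headroom")
print(negative_cells(versus_naive), negative_cells(versus_headroom))
```

Pairing on the same field is what makes the comparison same-information: a cell can only count for or against a lane when both lanes saw the same admitted trace.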

Ledger

Allowed read

Bayes prevents the Mines result from resting on a weak naive comparator, but it does not make the stronger claim that Sundog beats posterior inference.

In the promoted pocket, Bayes is 0 versus sundog_minimal and -1.515625 versus sundog_lean: a shared same-field floor, not posterior dominance.

Alignment reading: the side-by-side says the admitted pressure trace contains enough structure for both a legal posterior floor and a Sundog controller family to act, while the boundary remains visible. That is stronger than a naive-only result and weaker than posterior dominance.

Standalone Bayes-vs-Sundog side-by-side

Phase 2 mismatch; Phase 3 decoy / alias; Phase 4b hybrid dropped; Phase 5b boundary; Phase 6 core task; Phase 6b structure; Phase 6c mismatch crossover

Comparator Ledger

Current public read

Not a victory table — a running ledger: what Sundog did, what Bayes or other ML baselines did against the same information lane, and the interpretation that holds.

Surface	Sundog readout	Comparator state	Safe interpretation
Photometric mirror alignment	No target-position access; terminal target intensity is comparable to target-aware analytic and noisy-oracle baselines in the controlled MuJoCo result — and, under an injected mirror-calibration mismatch, retains terminal accuracy as the oracle collapses (operating-envelope crossover, 30 seeds/level).	The serious same-observation Bayesian comparator is not the primary public baseline yet. The existing target-aware baseline wins hard on acquisition speed: median threshold time about 11.5 steps versus about 188 for photometric feedback.	Core result, not a Bayes win. It proves the apparatus shape and motivates the fair Bayesian comparison. The mismatch crossover is operating-envelope evidence, not proof: the oracle still wins near nominal, and under mismatch the photometric controller only degrades gracefully on terminal accuracy rather than solving the task.
Standalone Bayes-vs-Sundog benchmark	Narrow separation only under specific synthetic misspecification: Phase 2 anisotropic, Phase 3 decoy/alias failure boundary, and Phase 5b clean-only model-family collapse. Hybrid did not earn a niche.	Correctly specified adaptive or particle Bayes dominates the synthetic envelope and the core photometric task. On the core task it stays faster at comparable terminal accuracy under parameter stress and non-degenerate structural misspecification; response control flips only against degenerate model-less Bayes.	Real but narrow and substrate-conditional: stronger than a naive-only comparator result, weaker than posterior dominance. The response edge does not transfer to the core task except degenerately.
Balance	Shadow controller cleared the Phase 10 operating-envelope claim and keeps overhead-light / degradation lanes visible.	Phase 15 same-shadow Bayesian-floor lock ran 27,200 trials with audits passing, 56/56 hard-gate cells admitted, zero negative mean-regret cells, and aggregate regret near parity at +0.00395 normalized survival versus `sundog_shadow`.	Earned same-information Bayes receipt. The careful claim is near-floor behavior inside the mapped envelope, with reported-only degradation boundaries preserved.
Pressure Mines	Phase 10 confirmed a narrow pressure-field pocket: density 0.16, pressure noise 2.0, dropout 0.2 improved budget-adjusted safe-tile progress versus `naive_pressure`, beside a matched failure cell.	Phase 13 now publishes the same-field Bayesian pressure-floor receipt: 14,720 trials across all 46 Phase 10 cells, no-leak pass, zero negative cells versus `naive_pressure` / `sundog_minimal`, and visible `sundog_lean` headroom.	Earned pressure-floor parity — not posterior dominance or field clearance: the comparator lane is validated, the claim stays narrow.
Three-Body / Coarse Proof	Guarded accelerometer-proxy TRACK improved survival over passive and naive local baselines in the tested high-velocity near-escape pocket, with harms exposed outside it.	The proof-track Bayesian floor exists as a staged particle-MPC evaluator and passed smoke, but the BF-4b information-accessibility diagnostic and BF-5 full lock remain blockers.	Operating-envelope evidence, not proof closure. The Bayesian-floor gate is exactly why the proof trunk stays labeled open.
Mesa / learned policy work	Signature-trained and mixed-objective policy studies map where trace-conditioned behavior holds, collapses, or localizes causally in the tested architecture.	This is an ML alignment surface rather than a Bayes comparator. The relevant comparison is reward-only, signature-only, mixed objectives, probes, and interventions under controlled capacity and selection pressure.	Use it to discuss causal route fidelity and operating envelopes. Do not treat it as evidence that Sundog replaces mainstream ML alignment.

The ledger should get stricter over time. A weak comparator does not make Sundog stronger; it makes the claim less interpretable. When Bayes wins, the site should say Bayes wins.

Result Shape

What would count

The Bayes roadmap is strongest when it pre-registers wins and losses before the run. The closed comparison now publishes its public stamp sequence from the generated receipt bundle.

Pre-registration means the command, thresholds, cell slate, and allowed interpretations were frozen before the claim-bearing run. In this page it is the difference between a result that reports what happened and a story rewritten after the outcome was visible.
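
One simple way to make such a freeze checkable is to hash a canonical serialization of the plan before the run and verify the hash afterwards. The sketch below is an assumption about how that could be done, not a description of Sundog's receipt tooling; the plan fields, command string, and threshold are hypothetical placeholders.

```python
import hashlib
import json

def freeze(plan):
    """Return a SHA-256 stamp over a canonical JSON form of the plan."""
    canonical = json.dumps(plan, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def verify(plan, stamp):
    """True only if the plan is byte-for-byte what was stamped."""
    return freeze(plan) == stamp

# Hypothetical plan, written down before any claim-bearing run.
plan = {
    "command": "run_comparison --seed 0",          # placeholder command
    "thresholds": {"min_mean_regret": 0.0},        # placeholder gate
    "cell_slate": ["A", "B"],                      # placeholder cells
    "allowed_interpretations": ["parity", "candidate trails", "candidate leads"],
}
stamp = freeze(plan)

# After the run: any edit to thresholds or interpretations breaks the stamp.
edited = dict(plan, thresholds={"min_mean_regret": -1.0})
print(verify(plan, stamp), verify(edited, stamp))
```

Sorting keys before hashing keeps the stamp stable across dictionary orderings, so only a change in content can invalidate it.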

Comparator trail: standalone benchmark row, photometric ledger row, and Pressure Mines ledger row.

Inspection Path

Source trail

Use these links to inspect the current comparator status and the comparisons that are still open.

Alignment means route fidelity, not just task success.