.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_tutorials/08_uncertainty_calibration.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_tutorials_08_uncertainty_calibration.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_tutorials_08_uncertainty_calibration.py:


08. Calibration Methods
=======================

This tutorial explains CaliBrain's calibration modes through the high-level
``UncertaintyCalibrator`` API.

It covers all implemented modes conceptually:

- ``precal``
- ``post_oracle``
- ``post_pooled``
- ``post_pooled_mismatch``
- ``post_fixed``

and demonstrates two of them concretely:

- ``precal``: evaluate raw empirical coverage without fitting a recalibration map;
- ``post_oracle``: fit on a matched train split and evaluate on a matched eval split.

The remaining post-calibration modes are described as workflow variants that
differ only in how the train and eval splits are defined; they all use the same
recalibration mechanism. More general evaluation questions are handled
separately by ``MetricEvaluator``.

.. GENERATED FROM PYTHON SOURCE LINES 30-51

Scientific motivation
---------------------

``UncertaintyEstimator`` gives pre-calibration empirical coverage curves, but
these curves are often misaligned with the nominal coverage grid. CaliBrain's
calibration step uses isotonic regression to fit a monotone recalibration map
on a training split and then evaluates the corrected nominal coverages on a
held-out evaluation split.
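
The monotone fit itself can be illustrated without any library. The following
is a minimal sketch of the pool-adjacent-violators algorithm, a standard way to
solve unweighted isotonic regression. The function name ``pava`` and the toy
curve are illustrative; this is not CaliBrain's internal implementation.

.. code-block:: Python

    def pava(values):
        """Return the non-decreasing least-squares fit to ``values``.

        Pool-adjacent-violators: scan left to right and merge neighbouring
        blocks whenever their means violate monotonicity.
        """
        # Each block stores [sum of values, number of values].
        blocks = []
        for v in values:
            blocks.append([float(v), 1])
            # Merge while the previous block's mean exceeds the last block's mean.
            while len(blocks) > 1 and (
                blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
            ):
                total, count = blocks.pop()
                blocks[-1][0] += total
                blocks[-1][1] += count
        # Expand each block back to one fitted value per input point.
        fitted = []
        for total, count in blocks:
            fitted.extend([total / count] * count)
        return fitted


    # A hypothetical, slightly non-monotone coverage curve.
    noisy_curve = [0.0, 0.30, 0.25, 0.60, 0.55, 1.0]
    print(pava(noisy_curve))

Each out-of-order pair is replaced by its mean, so the fitted curve is
non-decreasing while staying as close as possible to the input in the
least-squares sense.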

The workflow-level calibration modes used in the documentation are:

- ``precal``: no fit, evaluate raw empirical coverage only;
- ``post_oracle``: fit and evaluate on matched conditions;
- ``post_pooled``: fit on pooled heads, evaluate on one head;
- ``post_pooled_mismatch``: fit on mismatched pooled conditions, evaluate on target data;
- ``post_fixed``: fit once at a reference setting and reuse that mapping across a sweep.

These names refer to benchmark-style train/evaluation configurations around a
common isotonic recalibration step; they are not five different calibration
algorithms. This tutorial demonstrates two representative cases with the
current high-level API: ``UncertaintyCalibrator.calibrate(...)``.

.. GENERATED FROM PYTHON SOURCE LINES 51-69

.. code-block:: Python


    import matplotlib.pyplot as plt
    import numpy as np
    from mne.io.constants import FIFF

    from calibrain import (
        MetricEvaluator,
        SensorSimulator,
        SourceEstimator,
        SourceSimulator,
        UncertaintyCalibrator,
        UncertaintyEstimator,
        gamma_map_sflex,
    )


    RANDOM_SEED = 61


.. GENERATED FROM PYTHON SOURCE LINES 70-80

Build a lightweight fixed-orientation calibration fixture
---------------------------------------------------------

To keep the tutorial fast and self-contained, we build tiny matched-condition
datasets directly in memory. Both the train and eval datasets use the same
solver, orientation, source coordinates, leadfield, and ERP settings. Only the
random seed differs, so the simulated sources and sensor noise are independent
draws from the same condition.

This corresponds to the conceptual logic of ``post_oracle``: calibration is
fitted and evaluated under the same condition family.

.. GENERATED FROM PYTHON SOURCE LINES 80-201

.. code-block:: Python


    erp_config = {
        "tmin": -0.1,
        "tmax": 0.8,
        "stim_onset": 0.0,
        "sfreq": 100,
        "fmin": 2,
        "fmax": 8,
        "amplitude_distribution": {
            "median": 8.0,
            "sigma": 0.15,
            "clip": [2.0, 20.0],
        },
        "random_erp_timing": False,
        "erp_min_length": 20,
    }

    nominal_coverages = np.linspace(0.0, 1.0, 11)
    ue = UncertaintyEstimator(nominal_coverages=nominal_coverages)
    me = MetricEvaluator(ue)
    source_simulator = SourceSimulator(ERP_config=erp_config)
    sensor_simulator = SensorSimulator()

    rng = np.random.default_rng(RANDOM_SEED)
    n_sensors = 16
    n_sources = 32
    src_coords = rng.normal(scale=0.04, size=(n_sources, 3))
    leadfield_fixed = rng.normal(scale=0.03, size=(n_sensors, n_sources))
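    # Normalize each leadfield column to unit norm (guarding against division
    # by zero), then rescale all columns by a common factor.
    leadfield_fixed /= np.maximum(
        np.linalg.norm(leadfield_fixed, axis=0, keepdims=True),
        np.finfo(float).eps,
    )
    leadfield_fixed *= 0.6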

    sensor_simulator.set_sensor_metadata(
        kind=FIFF.FIFFV_EEG_CH,
        units=FIFF.FIFF_UNIT_V,
        unitmult=FIFF.FIFF_UNITM_MU,
        coil_type=FIFF.FIFFV_COIL_EEG,
    )
    x_true_train, _ = source_simulator.simulate(
        n_sources=n_sources,
        nnz=4,
        orientation_type="fixed",
        seed=RANDOM_SEED,
    )
    y_clean_train, y_noisy_train, noise_train, _ = sensor_simulator.simulate(
        x=x_true_train,
        L=leadfield_fixed,
        alpha_SNR=0.7,
        sensor_white_noise_std=0.2,
        seed=RANDOM_SEED,
    )
    noise_var_train = float(np.var(noise_train))
    estimator_train = SourceEstimator(
        solver=gamma_map_sflex,
        solver_params={"max_iter": 150, "tol": 1e-7, "sigma": 0.01, "src_coords": src_coords},
        noise_var=noise_var_train,
        n_orient=1,
    )
    estimator_train.fit(leadfield_fixed, y_noisy_train)
    result_train = estimator_train.predict()
    train_dataset = {
        "orientation_type": "fixed",
        "coil_type": None,
        "x_true": x_true_train,
        "x_hat": result_train["posterior_mean"],
        "posterior_var": ue.posterior_variance_from_cov(result_train["posterior_cov"]),
        "posterior_cov": result_train["posterior_cov"],
        "n_sources": x_true_train.shape[0],
        "n_times": x_true_train.shape[1],
        "noise_var": noise_var_train,
        "alpha_SNR": 0.7,
        "seed": RANDOM_SEED,
        "solver": "gamma_map_sflex",
        "noise_type": "oracle",
    }

    x_true_eval, _ = source_simulator.simulate(
        n_sources=n_sources,
        nnz=4,
        orientation_type="fixed",
        seed=RANDOM_SEED + 1,
    )
    y_clean_eval, y_noisy_eval, noise_eval, _ = sensor_simulator.simulate(
        x=x_true_eval,
        L=leadfield_fixed,
        alpha_SNR=0.7,
        sensor_white_noise_std=0.2,
        seed=RANDOM_SEED + 1,
    )
    noise_var_eval = float(np.var(noise_eval))
    estimator_eval = SourceEstimator(
        solver=gamma_map_sflex,
        solver_params={"max_iter": 150, "tol": 1e-7, "sigma": 0.01, "src_coords": src_coords},
        noise_var=noise_var_eval,
        n_orient=1,
    )
    estimator_eval.fit(leadfield_fixed, y_noisy_eval)
    result_eval = estimator_eval.predict()
    eval_dataset = {
        "orientation_type": "fixed",
        "coil_type": None,
        "x_true": x_true_eval,
        "x_hat": result_eval["posterior_mean"],
        "posterior_var": ue.posterior_variance_from_cov(result_eval["posterior_cov"]),
        "posterior_cov": result_eval["posterior_cov"],
        "n_sources": x_true_eval.shape[0],
        "n_times": x_true_eval.shape[1],
        "noise_var": noise_var_eval,
        "alpha_SNR": 0.7,
        "seed": RANDOM_SEED + 1,
        "solver": "gamma_map_sflex",
        "noise_type": "oracle",
    }

    print("train dataset keys:", sorted(train_dataset.keys()))
    print("eval dataset keys:", sorted(eval_dataset.keys()))
    print("train posterior_var shape:", train_dataset["posterior_var"].shape)
    print("eval posterior_var shape:", eval_dataset["posterior_var"].shape)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    train dataset keys: ['alpha_SNR', 'coil_type', 'n_sources', 'n_times', 'noise_type', 'noise_var', 'orientation_type', 'posterior_cov', 'posterior_var', 'seed', 'solver', 'x_hat', 'x_true']
    eval dataset keys: ['alpha_SNR', 'coil_type', 'n_sources', 'n_times', 'noise_type', 'noise_var', 'orientation_type', 'posterior_cov', 'posterior_var', 'seed', 'solver', 'x_hat', 'x_true']
    train posterior_var shape: (32,)
    eval posterior_var shape: (32,)


.. GENERATED FROM PYTHON SOURCE LINES 202-208

Mode 1: ``precal``
------------------

``precal`` means: do **not** fit a recalibration map. Evaluate the raw
empirical coverage on the evaluation split only. In the class API, this
corresponds to calling ``calibrate`` with only ``test_data`` and ``fit=False``.

.. GENERATED FROM PYTHON SOURCE LINES 208-223

.. code-block:: Python


    precal_calibrator = UncertaintyCalibrator(ue, me)
    precal_results = precal_calibrator.calibrate(
        test_data=eval_dataset,
        fit=False,
    )

    print("precal nominal coverages:", precal_results["pre_calibration"]["nominal_coverages"])
    print("precal empirical coverages:", precal_results["pre_calibration"]["empirical_coverages"])
    print("precal post block equals pre block:", np.allclose(
        precal_results["pre_calibration"]["empirical_coverages"],
        precal_results["post_calibration"]["empirical_coverages"],
    ))
    print("precal recalibrated nominal coverages:", precal_results["post_calibration"]["recalibrated_nominal_coverages"])


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    precal nominal coverages: [0.  0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1. ]
    precal empirical coverages: [0.      0.59375 0.59375 0.78125 0.875   0.90625 0.90625 0.90625 0.90625
     0.96875 1.     ]
    precal post block equals pre block: True
    precal recalibrated nominal coverages: [0.  0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1. ]
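
To make the notion of empirical coverage concrete, here is a minimal sketch
under simplifying assumptions: each source has a scalar true value and a
Gaussian posterior with the given mean and variance, and coverage is the
fraction of sources whose true value falls inside the central credible
interval. The helper ``empirical_coverage`` and the toy values are
hypothetical; ``UncertaintyEstimator`` may construct intervals and handle the
time axis differently.

.. code-block:: Python

    from statistics import NormalDist


    def empirical_coverage(x_true, x_hat, posterior_var, nominal_coverages):
        """Fraction of sources inside the central Gaussian interval per level."""
        curve = []
        for level in nominal_coverages:
            if level >= 1.0:
                # A 100 % interval is unbounded and contains every value.
                curve.append(1.0)
                continue
            # Two-sided interval: half-width is z * std, with z the
            # (1 + level) / 2 quantile of the standard normal.
            z = NormalDist().inv_cdf((1.0 + level) / 2.0)
            hits = sum(
                abs(t - m) <= z * var ** 0.5
                for t, m, var in zip(x_true, x_hat, posterior_var)
            )
            curve.append(hits / len(x_true))
        return curve


    # Toy values (illustrative only, not taken from the tutorial run).
    x_true = [0.0, 1.0, 2.0, 3.0]
    x_hat = [0.5, 2.0, 3.5, 5.5]
    posterior_var = [1.0, 1.0, 1.0, 1.0]
    print(empirical_coverage(x_true, x_hat, posterior_var, [0.0, 0.5, 0.9, 1.0]))

With only a handful of sources the curve moves in coarse steps, which is also
why the coverages printed above are multiples of 1/32.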


.. GENERATED FROM PYTHON SOURCE LINES 224-230

Mode 2: ``post_oracle``
-----------------------

``post_oracle`` means: fit a recalibration map on a matched train split and
evaluate it on a matched eval split. This uses the same ``calibrate`` method,
now with ``train_data``, ``test_data``, and ``fit=True``.

.. GENERATED FROM PYTHON SOURCE LINES 230-243

.. code-block:: Python


    post_oracle_calibrator = UncertaintyCalibrator(ue, me)
    post_oracle_results = post_oracle_calibrator.calibrate(
        train_data=train_dataset,
        test_data=eval_dataset,
        fit=True,
    )

    print("post_oracle train empirical coverages:", post_oracle_results["train_empirical_coverages"])
    print("post_oracle pre empirical coverages:", post_oracle_results["pre_calibration"]["empirical_coverages"])
    print("post_oracle post empirical coverages:", post_oracle_results["post_calibration"]["empirical_coverages"])
    print("post_oracle recalibrated nominal coverages:", post_oracle_results["post_calibration"]["recalibrated_nominal_coverages"])


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    post_oracle train empirical coverages: [0.      0.5625  0.78125 0.84375 0.9375  1.      1.      1.      1.
     1.      1.     ]
    post_oracle pre empirical coverages: [0.      0.59375 0.59375 0.78125 0.875   0.90625 0.90625 0.90625 0.90625
     0.96875 1.     ]
    post_oracle post empirical coverages: [0.      0.5     0.53125 0.5625  0.59375 0.59375 0.59375 0.59375 0.625
     0.875   1.     ]
    post_oracle recalibrated nominal coverages: [0.         0.01777778 0.03555556 0.05333333 0.07111111 0.08888889
     0.11714286 0.16285714 0.23       0.36       1.        ]
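
The recalibrated nominal coverages printed above can be reproduced by inverting
the training curve with piecewise-linear interpolation: for each target
coverage, find the nominal level at which the training split reached it. The
sketch below assumes this is the mechanism (the training curve here is already
monotone, so an isotonic fit leaves it unchanged). The helper ``invert_curve``
is illustrative and not part of CaliBrain.

.. code-block:: Python

    def invert_curve(targets, empirical, nominal):
        """Interpolate, per target coverage, the nominal level that reached it.

        Assumes ``empirical`` is non-decreasing and aligned with ``nominal``.
        """
        recalibrated = []
        for t in targets:
            if t >= empirical[-1]:
                # At or beyond the last point: clamp to the final nominal level.
                recalibrated.append(nominal[-1])
                continue
            if t <= empirical[0]:
                recalibrated.append(nominal[0])
                continue
            # Find the segment [empirical[i], empirical[i + 1]) that contains t.
            for i in range(len(empirical) - 1):
                lo, hi = empirical[i], empirical[i + 1]
                if lo <= t < hi:
                    frac = (t - lo) / (hi - lo)
                    recalibrated.append(
                        nominal[i] + frac * (nominal[i + 1] - nominal[i])
                    )
                    break
        return recalibrated


    nominal = [i / 10 for i in range(11)]
    # Training empirical coverages copied from the output above.
    train_empirical = [0.0, 0.5625, 0.78125, 0.84375, 0.9375,
                       1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
    print([round(v, 8) for v in invert_curve(nominal, train_empirical, nominal)])

Because the training curve rises well above the diagonal, most targets map to
much smaller nominal levels, for example a target of 0.9 maps to 0.36.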


.. GENERATED FROM PYTHON SOURCE LINES 244-259

The other workflow modes
------------------------

The remaining modes differ only in how the workflow constructs ``train_data``
and ``test_data`` from aggregated datasets:

- ``post_pooled``: pooled train split across heads, matched eval condition;
- ``post_pooled_mismatch``: pooled train split from mismatched conditions;
- ``post_fixed``: one reference train split, reused across many eval splits.

They still rely on the same high-level calibration logic demonstrated above.
Evaluation itself can then be carried out in several ways: by inspecting raw
and recalibrated coverage curves, by computing summary calibration metrics, or
by comparing transfer behavior across conditions. Those broader evaluation
choices are covered in :doc:`metric evaluation </auto_tutorials/09_metric_evaluation>`.
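
The difference between the modes can be sketched as a pure split-selection
step. The example below is hypothetical: the ``(head, snr)`` condition keys,
the ``select_splits`` helper, and the choice of SNR as the mismatched or swept
setting are assumptions made for illustration, not CaliBrain's workflow code.

.. code-block:: Python

    def select_splits(mode, conditions, target, reference=None):
        """Return (train_conditions, eval_condition) for a calibration mode.

        ``conditions`` is a list of (head, snr) tuples and ``target`` is the
        condition to evaluate on. All names here are illustrative.
        """
        _, snr = target
        if mode == "precal":
            train = []  # no recalibration map is fitted
        elif mode == "post_oracle":
            train = [target]  # matched condition (a separate split of it)
        elif mode == "post_pooled":
            # Pool every head that shares the target's setting.
            train = [c for c in conditions if c[1] == snr]
        elif mode == "post_pooled_mismatch":
            # Pool every head, but only from settings that differ from the target's.
            train = [c for c in conditions if c[1] != snr]
        elif mode == "post_fixed":
            train = [reference]  # one reference fit reused for every target
        else:
            raise ValueError(f"unknown mode: {mode!r}")
        return train, target


    conditions = [(h, s) for h in ("head_a", "head_b") for s in (0.3, 0.7)]
    modes = ("precal", "post_oracle", "post_pooled", "post_pooled_mismatch", "post_fixed")
    for mode in modes:
        train, evaluation = select_splits(
            mode, conditions, target=("head_a", 0.7), reference=("head_a", 0.3)
        )
        print(f"{mode:>22}: train={train} eval={evaluation}")

Whatever the selection returns, the subsequent step is the same: fit the
recalibration map on the training side (if any) and evaluate on the target.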

.. GENERATED FROM PYTHON SOURCE LINES 261-267

Plot ``precal`` and ``post_oracle``
-----------------------------------

The first panel shows raw pre-calibration coverage. The second panel compares
the matched post-calibration result against the raw eval curve and the
training empirical curve used to fit the isotonic map.

.. GENERATED FROM PYTHON SOURCE LINES 267-311

.. code-block:: Python


    fig, axes = plt.subplots(1, 2, figsize=(11, 4.2), sharey=True)

    axes[0].plot([0, 1], [0, 1], "--", color="0.5", label="perfect calibration")
    axes[0].plot(
        precal_results["pre_calibration"]["nominal_coverages"],
        precal_results["pre_calibration"]["empirical_coverages"],
        marker="o",
        label="precal",
    )
    axes[0].set(
        xlabel="Nominal coverage",
        ylabel="Empirical coverage",
        title="Mode: precal",
    )
    axes[0].legend(loc="best")

    axes[1].plot([0, 1], [0, 1], "--", color="0.5", label="perfect calibration")
    axes[1].plot(
        post_oracle_results["pre_calibration"]["nominal_coverages"],
        post_oracle_results["pre_calibration"]["empirical_coverages"],
        marker="o",
        label="eval pre",
    )
    axes[1].plot(
        post_oracle_results["pre_calibration"]["nominal_coverages"],
        post_oracle_results["train_empirical_coverages"],
        marker="s",
        label="train empirical",
    )
    axes[1].plot(
        post_oracle_results["post_calibration"]["nominal_coverages"],
        post_oracle_results["post_calibration"]["empirical_coverages"],
        marker="^",
        label="eval post",
    )
    axes[1].set(
        xlabel="Nominal coverage",
        title="Mode: post_oracle",
    )
    axes[1].legend(loc="best")

    fig.tight_layout()


.. image-sg:: /auto_tutorials/images/sphx_glr_08_uncertainty_calibration_001.png
   :alt: Mode: precal, Mode: post_oracle
   :srcset: /auto_tutorials/images/sphx_glr_08_uncertainty_calibration_001.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 312-317

Inspect metric summaries
------------------------

``UncertaintyCalibrator`` also returns the default calibration metrics from
``MetricEvaluator`` for both the pre- and post-calibration curves.

.. GENERATED FROM PYTHON SOURCE LINES 317-322

.. code-block:: Python


    print("precal metrics:", precal_results["pre_calibration"]["calibration_metrics"])
    print("post_oracle pre metrics:", post_oracle_results["pre_calibration"]["calibration_metrics"])
    print("post_oracle post metrics:", post_oracle_results["post_calibration"]["calibration_metrics"])


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    precal metrics: {'max_underconfidence_deviation': 0.0, 'max_overconfidence_deviation': 0.49375, 'mean_absolute_deviation': 0.26704545454545453, 'mean_signed_deviation': 0.26704545454545453}
    post_oracle pre metrics: {'max_underconfidence_deviation': 0.0, 'max_overconfidence_deviation': 0.49375, 'mean_absolute_deviation': 0.26704545454545453, 'mean_signed_deviation': 0.26704545454545453}
    post_oracle post metrics: {'max_underconfidence_deviation': 0.17500000000000004, 'max_overconfidence_deviation': 0.4, 'mean_absolute_deviation': 0.14488636363636367, 'mean_signed_deviation': 0.08806818181818178}
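
The four summary numbers above can be reproduced from the coverage curves
alone. The formulas in this sketch are inferred from the printed values (the
signed deviation is taken as empirical minus nominal, and the reported
``max_overconfidence_deviation`` matches the largest positive deviation); they
are not copied from ``MetricEvaluator``'s source, and the helper name is
illustrative.

.. code-block:: Python

    def calibration_metrics(nominal, empirical):
        """Summarize deviations of an empirical coverage curve from the diagonal."""
        signed = [e - n for n, e in zip(nominal, empirical)]
        return {
            # Largest shortfall of empirical below nominal (0 if none).
            "max_underconfidence_deviation": max(0.0, max(-d for d in signed)),
            # Largest excess of empirical above nominal (0 if none).
            "max_overconfidence_deviation": max(0.0, max(signed)),
            "mean_absolute_deviation": sum(abs(d) for d in signed) / len(signed),
            "mean_signed_deviation": sum(signed) / len(signed),
        }


    nominal = [i / 10 for i in range(11)]
    # Post-calibration empirical coverages copied from the post_oracle output.
    post_empirical = [0.0, 0.5, 0.53125, 0.5625, 0.59375, 0.59375,
                      0.59375, 0.59375, 0.625, 0.875, 1.0]
    for name, value in calibration_metrics(nominal, post_empirical).items():
        print(f"{name}: {value:.5f}")

The mean signed deviation keeps the direction of the miscalibration, whereas
the mean absolute deviation measures its overall size regardless of sign.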


.. GENERATED FROM PYTHON SOURCE LINES 323-337

Summary
-------

``UncertaintyCalibrator`` is the high-level API that implements CaliBrain's
calibration modes.

In this tutorial:

- ``precal`` evaluated raw empirical coverage without fitting a map;
- ``post_oracle`` fitted isotonic recalibration on a matched train split and
  evaluated the recalibrated curve on a matched eval split;
- ``post_pooled``, ``post_pooled_mismatch``, and ``post_fixed`` were
  explained as workflow-level variations in how the train/eval splits are
  constructed.


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 0.267 seconds)


.. _sphx_glr_download_auto_tutorials_08_uncertainty_calibration.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: 08_uncertainty_calibration.ipynb <08_uncertainty_calibration.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: 08_uncertainty_calibration.py <08_uncertainty_calibration.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: 08_uncertainty_calibration.zip <08_uncertainty_calibration.zip>`