
.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_tutorials/08_metric_evaluation.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_tutorials_08_metric_evaluation.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_tutorials_08_metric_evaluation.py:


08. Metric Evaluation
=====================

This tutorial demonstrates the high-level ``MetricEvaluator`` API for
quantitative evaluation of uncertainty and calibration outputs.

It covers:

- calibration curves via ``MetricEvaluator.calibration_curve``;
- calibration summary metrics via ``MetricEvaluator.calibration_metrics_4``;
- combined error and uncertainty summaries via ``MetricEvaluator.evaluate_all``;
- fixed-orientation and free-orientation EEG examples.

.. GENERATED FROM PYTHON SOURCE LINES 20-33

Scientific motivation
---------------------

After source estimation and uncertainty estimation, CaliBrain needs a compact
way to answer three questions:

- how accurate are the reconstructed source signals?
- how large is the posterior uncertainty on average?
- how well do empirical coverages match nominal coverages?

``MetricEvaluator`` is the high-level class that summarizes these quantities.
It wraps ``UncertaintyEstimator`` and exposes workflow-facing evaluation
methods.

.. GENERATED FROM PYTHON SOURCE LINES 33-50

.. code-block:: Python


    import matplotlib.pyplot as plt
    import numpy as np
    from mne.io.constants import FIFF

    from calibrain import (
        MetricEvaluator,
        SensorSimulator,
        SourceEstimator,
        SourceSimulator,
        UncertaintyEstimator,
        gamma_map_sflex,
    )

    RANDOM_SEED = 71


.. GENERATED FROM PYTHON SOURCE LINES 51-60

Build a lightweight evaluation fixture
--------------------------------------

The tutorial generates two small synthetic examples:

- one fixed-orientation example;
- one free-orientation EEG example.

Both use the active ``gamma_map_sflex`` solver and the same uncertainty grid.

.. GENERATED FROM PYTHON SOURCE LINES 60-110

.. code-block:: Python


    erp_config = {
        "tmin": -0.1,
        "tmax": 0.8,
        "stim_onset": 0.0,
        "sfreq": 100,
        "fmin": 2,
        "fmax": 8,
        "amplitude_distribution": {
            "median": 8.0,
            "sigma": 0.15,
            "clip": [2.0, 20.0],
        },
        "random_erp_timing": False,
        "erp_min_length": 20,
    }

    nominal_coverages = np.linspace(0.0, 1.0, 11)
    ue = UncertaintyEstimator(nominal_coverages=nominal_coverages)
    metric_evaluator = MetricEvaluator(ue)
    source_simulator = SourceSimulator(ERP_config=erp_config)
    sensor_simulator = SensorSimulator()

    times = np.arange(erp_config["tmin"], erp_config["tmax"], 1.0 / erp_config["sfreq"])
    rng = np.random.default_rng(RANDOM_SEED)
    n_sensors = 16
    n_sources = 32
    src_coords = rng.normal(scale=0.04, size=(n_sources, 3))

    leadfield_fixed = rng.normal(scale=0.03, size=(n_sensors, n_sources))
    leadfield_fixed /= np.maximum(
        np.linalg.norm(leadfield_fixed, axis=0, keepdims=True),
        np.finfo(float).eps,
    )
    leadfield_fixed *= 0.6

    leadfield_free_eeg = rng.normal(scale=0.015, size=(n_sensors, n_sources, 3))
    leadfield_free_eeg /= np.maximum(
        np.linalg.norm(leadfield_free_eeg, axis=0, keepdims=True),
        np.finfo(float).eps,
    )
    leadfield_free_eeg *= 0.4

    sensor_simulator.set_sensor_metadata(
        kind=FIFF.FIFFV_EEG_CH,
        units=FIFF.FIFF_UNIT_V,
        unitmult=FIFF.FIFF_UNITM_MU,
        coil_type=FIFF.FIFFV_COIL_EEG,
    )


.. GENERATED FROM PYTHON SOURCE LINES 111-118

Fixed-orientation evaluation
----------------------------

For fixed orientation, ``MetricEvaluator`` works with:

- ``x_true`` and ``x_hat`` of shape ``(N, T)``;
- uncertainty as either ``posterior_var`` or full ``posterior_cov``.

.. GENERATED FROM PYTHON SOURCE LINES 118-164

.. code-block:: Python


    x_true_fixed, active_fixed = source_simulator.simulate(
        n_sources=n_sources,
        nnz=4,
        orientation_type="fixed",
        seed=RANDOM_SEED,
    )
    y_fixed_clean, y_fixed_noisy, fixed_noise, fixed_eta = sensor_simulator.simulate(
        x=x_true_fixed,
        L=leadfield_fixed,
        alpha_SNR=0.7,
        sensor_white_noise_std=0.2,
        seed=RANDOM_SEED,
    )
    fixed_noise_var = float(np.var(fixed_noise))

    fixed_estimator = SourceEstimator(
        solver=gamma_map_sflex,
        solver_params={"max_iter": 150, "tol": 1e-7, "sigma": 0.01, "src_coords": src_coords},
        noise_var=fixed_noise_var,
        n_orient=1,
    )
    fixed_estimator.fit(leadfield_fixed, y_fixed_noisy)
    fixed_result = fixed_estimator.predict()

    fixed_curve = metric_evaluator.calibration_curve(
        x_true=x_true_fixed,
        x_hat=fixed_result["posterior_mean"],
        posterior_uncert=fixed_result["posterior_cov"],
        setting="fixed",
        mode="aggregated",
    )
    fixed_summary = metric_evaluator.evaluate_all(
        x_true=x_true_fixed,
        x_hat=fixed_result["posterior_mean"],
        posterior_uncert=fixed_result["posterior_cov"],
        setting="fixed",
        mode="aggregated",
    )

    print("fixed calibration curve keys:", sorted(fixed_curve.keys()))
    print("fixed calibration metrics:", fixed_curve["metrics_4"])
    print("fixed evaluate_all keys:", sorted(fixed_summary.keys()))
    print("fixed mean posterior std:", fixed_summary["mean_posterior_std"])


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    fixed calibration curve keys: ['empirical', 'metrics_4', 'nominal']
    fixed calibration metrics: {'max_underconfidence_deviation': 0.0, 'max_overconfidence_deviation': 0.4, 'mean_absolute_deviation': 0.19318181818181815, 'mean_signed_deviation': 0.19318181818181815}
    fixed evaluate_all keys: ['calibration', 'mae', 'mean_posterior_std', 'mse', 'rmae', 'rmse']
    fixed mean posterior std: 0.019154597617136844

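
A calibration curve pairs each nominal coverage with the fraction of true
source values that fall inside the matching posterior interval. The sketch
below illustrates that idea for the fixed-orientation case under a Gaussian
assumption. It is not CaliBrain's implementation: ``empirical_coverage`` is a
hypothetical helper, it takes a posterior standard deviation rather than
``posterior_cov``, and it pools over all sources and time points, which may
differ from what ``mode="aggregated"`` does.

.. code-block:: Python

    from statistics import NormalDist

    import numpy as np


    def empirical_coverage(x_true, x_hat, posterior_std, nominal):
        """Fraction of entries inside the central Gaussian interval per level.

        Hypothetical helper for illustration; not part of CaliBrain.
        """
        abs_err = np.abs(np.asarray(x_true) - np.asarray(x_hat))
        coverage = []
        for p in nominal:
            # Half-width of the central interval with mass p, in std units.
            z = NormalDist().inv_cdf(0.5 + p / 2.0)
            coverage.append(float(np.mean(abs_err <= z * posterior_std)))
        return np.array(coverage)


    # Synthetic, well-calibrated data: errors are drawn with the stated std.
    rng = np.random.default_rng(0)
    posterior_std = 0.02
    x_hat = rng.normal(size=(32, 90))
    x_true = x_hat + rng.normal(scale=posterior_std, size=x_hat.shape)

    # Interior levels only: inv_cdf is undefined at probabilities 0 and 1.
    nominal = np.linspace(0.1, 0.9, 9)
    empirical = empirical_coverage(x_true, x_hat, posterior_std, nominal)
    print(np.round(empirical - nominal, 3))

With calibrated uncertainties the printed differences stay close to zero.
Under the same Gaussian assumption, a posterior standard deviation that is too
large pushes the empirical coverage above the nominal one, and one that is too
small pushes it below.
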

.. GENERATED FROM PYTHON SOURCE LINES 165-174

Free-orientation EEG evaluation
-------------------------------

For free-orientation EEG, ``MetricEvaluator`` supports two interval types:

- ``marginal``: pooled component-wise intervals;
- ``full_cov``: local 3D covariance blocks.

Both are evaluated below in aggregated mode.

.. GENERATED FROM PYTHON SOURCE LINES 174-231

.. code-block:: Python


    x_true_free, active_free = source_simulator.simulate(
        n_sources=n_sources,
        nnz=4,
        orientation_type="free",
        coil_type=FIFF.FIFFV_COIL_EEG,
        seed=RANDOM_SEED + 1,
    )
    y_free_clean, y_free_noisy, free_noise, free_eta = sensor_simulator.simulate(
        x=x_true_free,
        L=leadfield_free_eeg,
        alpha_SNR=0.7,
        sensor_white_noise_std=0.05,
        seed=RANDOM_SEED + 1,
    )
    free_noise_var = float(np.var(free_noise))

    free_estimator = SourceEstimator(
        solver=gamma_map_sflex,
        solver_params={"max_iter": 150, "tol": 1e-7, "sigma": 0.01, "src_coords": src_coords},
        noise_var=free_noise_var,
        n_orient=3,
    )
    free_estimator.fit(leadfield_free_eeg, y_free_noisy)
    free_result = free_estimator.predict()

    free_curve_marginal = metric_evaluator.calibration_curve(
        x_true=x_true_free,
        x_hat=free_result["posterior_mean_reshaped"],
        posterior_uncert=free_result["posterior_cov"],
        setting="eeg_free",
        mode="aggregated",
        free_interval_type="marginal",
    )
    free_curve_full_cov = metric_evaluator.calibration_curve(
        x_true=x_true_free,
        x_hat=free_result["posterior_mean_reshaped"],
        posterior_uncert=free_result["posterior_cov"],
        setting="eeg_free",
        mode="aggregated",
        free_interval_type="full_cov",
    )
    free_summary = metric_evaluator.evaluate_all(
        x_true=x_true_free,
        x_hat=free_result["posterior_mean_reshaped"],
        posterior_uncert=free_result["posterior_cov"],
        setting="eeg_free",
        mode="aggregated",
        free_interval_type="full_cov",
    )

    print("free marginal calibration metrics:", free_curve_marginal["metrics_4"])
    print("free full_cov calibration metrics:", free_curve_full_cov["metrics_4"])
    print("free evaluate_all mse:", free_summary["mse"])
    print("free evaluate_all mean posterior std:", free_summary["mean_posterior_std"])


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    free marginal calibration metrics: {'max_underconfidence_deviation': 0.0, 'max_overconfidence_deviation': 0.74375, 'mean_absolute_deviation': 0.34753787878787873, 'mean_signed_deviation': 0.34753787878787873}
    free full_cov calibration metrics: {'max_underconfidence_deviation': 0.0, 'max_overconfidence_deviation': 0.74375, 'mean_absolute_deviation': 0.3352272727272727, 'mean_signed_deviation': 0.3352272727272727}
    free evaluate_all mse: 0.004249153282609895
    free evaluate_all mean posterior std: 0.0140126642270974


.. GENERATED FROM PYTHON SOURCE LINES 232-238

Compare calibration summary metrics directly
--------------------------------------------

``calibration_metrics_4`` can also be called directly on nominal and empirical
coverage arrays. This is useful when calibration curves are already available
and only the summary metrics need to be recomputed.

.. GENERATED FROM PYTHON SOURCE LINES 238-251

.. code-block:: Python


    fixed_metrics_direct = metric_evaluator.calibration_metrics_4(
        fixed_curve["nominal"],
        fixed_curve["empirical"],
    )
    free_metrics_direct = metric_evaluator.calibration_metrics_4(
        free_curve_full_cov["nominal"],
        free_curve_full_cov["empirical"],
    )

    print("fixed direct metrics:", fixed_metrics_direct)
    print("free direct metrics:", free_metrics_direct)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    fixed direct metrics: {'max_underconfidence_deviation': 0.0, 'max_overconfidence_deviation': 0.4, 'mean_absolute_deviation': 0.19318181818181815, 'mean_signed_deviation': 0.19318181818181815}
    free direct metrics: {'max_underconfidence_deviation': 0.0, 'max_overconfidence_deviation': 0.74375, 'mean_absolute_deviation': 0.3352272727272727, 'mean_signed_deviation': 0.3352272727272727}

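
The four summary values can be cross-checked by hand. The NumPy sketch below
recomputes them for the fixed-orientation curve, using the nominal and
empirical arrays that this tutorial prints for ``fixed_summary["calibration"]``.
The formulas are inferred from the printed numbers (signed deviation taken as
empirical minus nominal), not from the CaliBrain source, and
``summarize_calibration`` is a hypothetical name.

.. code-block:: Python

    import numpy as np


    def summarize_calibration(nominal, empirical):
        """Four deviation summaries of a calibration curve (illustrative only)."""
        signed = np.asarray(empirical) - np.asarray(nominal)
        return {
            # Largest shortfall of empirical below nominal (naming inferred).
            "max_underconfidence_deviation": float(max(0.0, (-signed).max())),
            # Largest excess of empirical above nominal (naming inferred).
            "max_overconfidence_deviation": float(max(0.0, signed.max())),
            "mean_absolute_deviation": float(np.abs(signed).mean()),
            "mean_signed_deviation": float(signed.mean()),
        }


    # Fixed-orientation curve as printed by ``evaluate_all`` in this tutorial.
    nominal = np.linspace(0.0, 1.0, 11)
    empirical = np.array([0.0, 0.5, 0.53125, 0.53125, 0.6875, 0.75,
                          0.84375, 0.875, 0.90625, 1.0, 1.0])

    metrics = summarize_calibration(nominal, empirical)
    for name, value in metrics.items():
        print(f"{name}: {value:.6f}")

Because the empirical coverage is never below the nominal one on this curve,
the mean absolute and mean signed deviations coincide, which matches the
identical values in the printed output.
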

.. GENERATED FROM PYTHON SOURCE LINES 252-257

Plot calibration and evaluation summaries
-----------------------------------------

The first panel compares the fixed and free-EEG calibration curves. The second
panel summarizes the default calibration metrics.

.. GENERATED FROM PYTHON SOURCE LINES 257-307

.. code-block:: Python


    metric_names = [
        "mean_absolute_deviation",
        "mean_signed_deviation",
        "max_underconfidence_deviation",
        "max_overconfidence_deviation",
    ]
    fixed_metric_values = [fixed_curve["metrics_4"][name] for name in metric_names]
    free_metric_values = [free_curve_full_cov["metrics_4"][name] for name in metric_names]

    fig, axes = plt.subplots(1, 2, figsize=(11, 4.5))

    axes[0].plot([0, 1], [0, 1], "--", color="0.5", label="perfect calibration")
    axes[0].plot(fixed_curve["nominal"], fixed_curve["empirical"], marker="o", label="fixed")
    axes[0].plot(
        free_curve_full_cov["nominal"],
        free_curve_full_cov["empirical"],
        marker="s",
        label="free EEG full_cov",
    )
    axes[0].plot(
        free_curve_marginal["nominal"],
        free_curve_marginal["empirical"],
        marker="^",
        label="free EEG marginal",
    )
    axes[0].set(
        xlabel="Nominal coverage",
        ylabel="Empirical coverage",
        title="Calibration curves",
    )
    axes[0].legend(loc="best")

    bar_positions = np.arange(len(metric_names))
    bar_width = 0.38
    axes[1].bar(bar_positions - bar_width / 2, fixed_metric_values, width=bar_width, label="fixed")
    axes[1].bar(bar_positions + bar_width / 2, free_metric_values, width=bar_width, label="free EEG full_cov")
    axes[1].set(
        xticks=bar_positions,
        xticklabels=[
            "MAD",
            "MSD",
            "Max under",
            "Max over",
        ],
        ylabel="Metric value",
        title="Calibration summary metrics",
    )
    axes[1].legend(loc="best")

    fig.tight_layout()



.. image-sg:: /auto_tutorials/images/sphx_glr_08_metric_evaluation_001.png
   :alt: Calibration curves, Calibration summary metrics
   :srcset: /auto_tutorials/images/sphx_glr_08_metric_evaluation_001.png
   :class: sphx-glr-single-img

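
Besides calibration, ``evaluate_all`` reports ``mse``, ``mae``, ``rmse`` and
``rmae``. The values this tutorial prints for the fixed-orientation example are
numerically consistent with ``rmse = sqrt(mse)`` and ``rmae = sqrt(mae)``. The
sketch below computes the four quantities on that assumption.
``error_summary`` is a hypothetical helper, and averaging over all sources and
time points is an assumption rather than documented CaliBrain behavior.

.. code-block:: Python

    import numpy as np


    def error_summary(x_true, x_hat):
        """Pooled reconstruction errors (illustrative, not CaliBrain's code)."""
        err = np.asarray(x_hat, dtype=float) - np.asarray(x_true, dtype=float)
        mse = float(np.mean(err ** 2))
        mae = float(np.mean(np.abs(err)))
        return {
            "mse": mse,
            "mae": mae,
            "rmse": float(np.sqrt(mse)),
            # Square root of the MAE, matching the printed rmae/mae relation.
            "rmae": float(np.sqrt(mae)),
        }


    # Tiny hand-checkable example: errors are 1, -1, 2 and -2.
    x_true = np.zeros((2, 2))
    x_hat = np.array([[1.0, -1.0], [2.0, -2.0]])
    summary = error_summary(x_true, x_hat)
    print(summary)

For this toy input the squared errors average to 2.5 and the absolute errors
to 1.5, so the two root quantities are the square roots of those numbers.
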

.. GENERATED FROM PYTHON SOURCE LINES 308-313

Inspect the combined evaluation output
--------------------------------------

``evaluate_all`` combines error metrics, uncertainty summary, and calibration
summary into one dictionary.

.. GENERATED FROM PYTHON SOURCE LINES 313-317

.. code-block:: Python


    for key in ["mse", "mae", "rmse", "rmae", "mean_posterior_std", "calibration"]:
        print(f"fixed evaluate_all[{key!r}] =", fixed_summary[key])


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    fixed evaluate_all['mse'] = 0.0004689190050340884
    fixed evaluate_all['mae'] = 0.012077275043368751
    fixed evaluate_all['rmse'] = 0.02165453774695014
    fixed evaluate_all['rmae'] = 0.10989665619739643
    fixed evaluate_all['mean_posterior_std'] = 0.019154597617136844
    fixed evaluate_all['calibration'] = {'nominal': array([0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1. ]), 'empirical': array([0. , 0.5 , 0.53125, 0.53125, 0.6875 , 0.75 , 0.84375, 0.875 , 0.90625, 1. , 1. ]), 'metrics_4': {'max_underconfidence_deviation': 0.0, 'max_overconfidence_deviation': 0.4, 'mean_absolute_deviation': 0.19318181818181815, 'mean_signed_deviation': 0.19318181818181815}}


.. GENERATED FROM PYTHON SOURCE LINES 318-330

Summary
-------

``MetricEvaluator`` is the high-level evaluation class used after uncertainty
estimation and calibration.

In this tutorial it was used to:

- compute aggregated calibration curves;
- summarize them with the default four calibration metrics;
- compare ``marginal`` and ``full_cov`` free-EEG uncertainty;
- collect reconstruction and uncertainty summaries with ``evaluate_all``.


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 0.344 seconds)


.. _sphx_glr_download_auto_tutorials_08_metric_evaluation.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: 08_metric_evaluation.ipynb <08_metric_evaluation.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: 08_metric_evaluation.py <08_metric_evaluation.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: 08_metric_evaluation.zip <08_metric_evaluation.zip>`