PIT-CP

pitcp is a Python package for conformal prediction using probability integral transform (PIT) pivotal scores. Given any black-box nonconformity score, it fits a conditional density estimator on the score distribution and maps raw scores to PIT values, yielding valid marginal coverage at any user-specified level.

Features

  • PIT Conformal Prediction: Maps base nonconformity scores through a learned conditional CDF, producing asymptotically exact conditional coverage.

  • Model-agnostic: Works with any callable nonconformity score s(x, y), including distance, residual, or likelihood-based scores.

  • Flexible Density Estimation: Supports normalizing flows and mixture density networks from the zuko library.

  • Marginal Coverage Guarantee: Provably valid conformal coverage at any target level via finite-sample calibration.

  • scikit-learn Compatible: Native BaseEstimator integration with a familiar fit / conformalize / predict API.
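
To illustrate why mapping scores through a conditional CDF helps, here is a minimal standard-library sketch that does not use pitcp. It assumes an oracle conditional CDF in place of the learned one, and the noise scale sigma, the helper names, and the region |x| > 0.8 are all hypothetical choices made for illustration. It compares coverage in a high-noise region when thresholding the raw score |y| versus thresholding its PIT value.

```python
import math
import random

random.seed(0)


def sigma(x):
    """Hypothetical noise scale: small near x = 0, large near |x| = 1."""
    return 0.2 + abs(x)


def pit(s, x):
    """Oracle conditional CDF of s = |y| for y ~ N(0, sigma(x)^2)."""
    return math.erf(s / (sigma(x) * math.sqrt(2.0)))


def sample(n):
    """Draw (x, s) pairs with x ~ U(-1, 1) and s = |y|."""
    xs = [random.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, abs(random.gauss(0.0, sigma(x)))) for x in xs]


def conformal_quantile(values, level):
    """ceil((n + 1) * level)-th smallest value, with the rank clipped to n."""
    ordered = sorted(values)
    rank = min(len(ordered), math.ceil((len(ordered) + 1) * level))
    return ordered[rank - 1]


level = 0.9
cal, test = sample(5000), sample(5000)

# One global threshold on the raw score versus one on the PIT value
raw_threshold = conformal_quantile([s for _, s in cal], level)
pit_threshold = conformal_quantile([pit(s, x) for x, s in cal], level)

# Coverage restricted to the high-noise region |x| > 0.8
hard = [(x, s) for x, s in test if abs(x) > 0.8]
raw_cov = sum(s <= raw_threshold for _, s in hard) / len(hard)
pit_cov = sum(pit(s, x) <= pit_threshold for x, s in hard) / len(hard)
print(f"raw score: {raw_cov:.3f}, PIT score: {pit_cov:.3f}")
```

Under these assumptions the raw-score threshold undercovers where the noise is large, while the PIT threshold stays close to the target level. With a learned CDF the conditional behaviour depends on how well the density estimator fits.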

Installation

You can install the package via pip:

pip install pitcp

Usage

Example:

import torch
import zuko
from pitcp import PITCP


def std(x):
    return torch.where((x > -0.9) & (x < 0.9), torch.cos(torch.pi * x / 2), 1.0)


def gen_data(n):
    x = torch.rand(n, 1) * 2 - 1
    return x, torch.randn(n, 1) * std(x)


torch.manual_seed(42)

(X_train, y_train), (X_cal, y_cal), (X_test, y_test) = [
    gen_data(5000) for _ in range(3)
]


# Define a nonconformity score
def score(x, y):
    return y.abs()


# Build a normalizing flow density estimator
model = zuko.flows.NSF(features=1, context=1, bins=4, hidden_features=(32, 32, 32))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

# Compute nonconformity scores
s_train = score(X_train, y_train)
s_cal = score(X_cal, y_cal)
s_test = score(X_test, y_test)

# Fit and conformalize
pitcp = PITCP(model, optimizer, n_epochs=10, batch_size=128)
pitcp.fit(X_train, s_train)
pitcp.conformalize(X_cal, s_cal)

# Predict conformal regions (max score thresholds) at multiple quantiles
limits = pitcp.predict(X_test, quantile=[0.7, 0.8, 0.9])

# Predict conformal coverage at multiple quantiles (single float accepted)
covered = pitcp.predict_coverage(X_test, s_test, quantile=[0.7, 0.8, 0.9])
print(f"Empirical coverages: {covered.mean(axis=0)}")

API Reference

class PITCP(estimator, optimizer, *, n_epochs=10, batch_size=None, verbose=True)

Bases: BaseEstimator, Module

PIT conformal predictor using a normalizing flow or mixture density estimator.

This class implements probability integral transform (PIT) conformal prediction. Given a potentially black-box nonconformity score, it fits a conditional density estimator on the score distribution over a training set, then uses the learned conditional CDF to map raw scores to PIT values. Conformal coverage guarantees are obtained by comparing test PIT values against a calibration quantile.

The estimator must be an instance of a zuko class derived from either zuko.flows.Flow (a normalizing flow) or zuko.mixtures.GMM (a mixture density network). The class detects which family is used and applies the matching CDF computation.

Density estimation settings:
  • estimator: A zuko lazy distribution instance conditioned on features, used to model the score distribution. Must be from zuko.flows or zuko.mixtures.

  • optimizer: Optimizer used to train the density estimator via maximum likelihood (negative log-likelihood/forward KL divergence minimization).
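
The equivalence between negative log-likelihood and forward KL minimization mentioned above can be written out as follows. The symbols are assumptions introduced here for illustration: p(s | x) is the true conditional score density, q_θ(s | x) is the estimator with parameters θ, and (x_i, s_i) are the n training pairs.

```latex
\begin{aligned}
\hat{\theta} &= \arg\min_{\theta} \; -\frac{1}{n} \sum_{i=1}^{n} \log q_{\theta}(s_i \mid x_i), \\
\mathbb{E}_{x}\!\left[ \mathrm{KL}\big( p(\cdot \mid x) \,\|\, q_{\theta}(\cdot \mid x) \big) \right]
  &= \underbrace{\mathbb{E}_{x,s}\!\left[ \log p(s \mid x) \right]}_{\text{independent of } \theta}
   \; - \; \mathbb{E}_{x,s}\!\left[ \log q_{\theta}(s \mid x) \right].
\end{aligned}
```

The first term of the KL expansion does not depend on θ, so minimizing the expected negative log-likelihood and minimizing the expected forward KL divergence select the same θ. The empirical average in the first line is the sample estimate of that expectation.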

Training settings:
  • n_epochs: Number of full passes over the training data.

  • batch_size: Mini-batch size used during both training and inference.

  • verbose: Whether to display a tqdm progress bar during fit.

Variables:
  • estimator (Flow | GMM) – Conditional density estimator from zuko.flows or zuko.mixtures.

  • optimizer (torch.optim.Optimizer) – Optimizer for training the estimator.

  • n_epochs (int) – Number of training epochs.

  • batch_size (int | None) – Batch size for data loading. None means full-batch training.

  • verbose (bool | int) – Whether to display a progress bar during training.

  • estimator_type_ (str) – Either "flow" or "mixture", set at initialization based on the type of estimator.

  • scores_ (torch.Tensor | None) – Calibration PIT scores stored after calling conformalize.

Parameters:
  • estimator (Flow | GMM)

  • optimizer (Optimizer)

  • n_epochs (int)

  • batch_size (int | None)

  • verbose (bool | int)

scores_: ndarray
estimator: Flow | GMM
optimizer: Optimizer
n_epochs: int
batch_size: int | None
verbose: bool | int
estimator_type_: str
set_fit_request(*, s='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • s (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for s parameter in fit.

  • self (PITCP)

Returns:

self – The updated object.

Return type:

object

set_predict_request(*, quantile='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the predict method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to predict.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • quantile (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for quantile parameter in predict.

  • self (PITCP)

Returns:

self – The updated object.

Return type:

object

fit(X, s)

Fits the conditional density estimator on nonconformity scores.

Parameters:
  • X (np.typing.ArrayLike) – Training features.

  • s (np.typing.ArrayLike) – Training scores.

Returns:

The fitted estimator.

Return type:

Self

conformalize(X, s)

Computes and stores calibration PIT scores from a held-out dataset.

Parameters:
  • X (np.typing.ArrayLike) – Calibration features.

  • s (np.typing.ArrayLike) – Calibration scores.

Returns:

The updated estimator.

Return type:

Self

threshold(quantile=0.9)

Computes the PIT threshold for a given quantile.

Parameters:

quantile (float | Sequence[float], optional) – Target coverage level(s). Defaults to 0.9.

Returns:

PIT threshold values.

Return type:

np.ndarray
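
As a rough illustration of what a calibration threshold can look like, the sketch below applies the standard split-conformal rule: take the ceil((n + 1) * level)-th smallest calibration value. This is an assumption about the general technique, not a description of how pitcp computes its threshold. The function name conformal_threshold and the fallback to 1.0 (the largest possible PIT value) are hypothetical.

```python
import math


def conformal_threshold(pit_values, level):
    """Return the ceil((n + 1) * level)-th smallest calibration PIT value.

    Falls back to 1.0, the largest possible PIT value, when that rank
    exceeds the number of calibration points.
    """
    n = len(pit_values)
    rank = math.ceil((n + 1) * level)
    if rank > n:
        return 1.0
    return sorted(pit_values)[rank - 1]


# Nine hypothetical calibration PIT values
pit_cal = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85]

# One threshold per target coverage level
thresholds = [conformal_threshold(pit_cal, q) for q in (0.5, 0.75, 0.95)]
print(thresholds)  # [0.45, 0.75, 1.0]
```

With nine calibration points, level 0.95 asks for rank 10, which does not exist, so the sketch returns the trivial threshold that covers every score.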

predict(X, *, quantile=0.9)

Predicts conformal regions for test points.

Parameters:
  • X (np.typing.ArrayLike) – Test features.

  • quantile (float | Sequence[float], optional) – Target coverage level(s). Defaults to 0.9.

Returns:

Maximum base score threshold values.

Return type:

np.ndarray
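
Conceptually, turning a PIT threshold into a maximum base score means inverting the conditional CDF at each test point. The sketch below shows this for a closed-form case, assuming the score is |y| with y Gaussian, so the conditional CDF is half-normal. The scales, the PIT threshold, and both helper names are hypothetical. A learned flow or mixture would generally need its own inverse or a numerical root search, and how pitcp does this is not shown here.

```python
import math
from statistics import NormalDist


def half_normal_cdf(s, scale):
    """CDF of s = |y| for y ~ N(0, scale^2)."""
    return math.erf(s / (scale * math.sqrt(2.0)))


def half_normal_quantile(p, scale):
    """Inverse of half_normal_cdf in s, valid for 0 <= p < 1."""
    return scale * NormalDist().inv_cdf((1.0 + p) / 2.0)


pit_threshold = 0.9       # hypothetical output of the calibration step
scales = [0.5, 1.0, 2.0]  # hypothetical conditional scales at three test points

# Largest score at each test point whose PIT value stays within the threshold
limits = [half_normal_quantile(pit_threshold, scale) for scale in scales]
for scale, limit in zip(scales, limits):
    print(f"scale={scale}: limit={limit:.3f}, PIT at limit={half_normal_cdf(limit, scale):.3f}")
```

The limit grows with the conditional scale, which is how a single PIT threshold produces regions that adapt to each test point.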

predict_coverage(X, s, *, quantile=0.9)

Predicts conformal coverage for test points.

Parameters:
  • X (np.typing.ArrayLike) – Test features.

  • s (np.typing.ArrayLike) – Test scores.

  • quantile (float | Sequence[float], optional) – Target coverage level(s). Defaults to 0.9.

Returns:

Coverage indicators.

Return type:

np.ndarray
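
The usage example averages the returned indicators over axis 0, which suggests one row per test point and one column per coverage level. Assuming that layout, a plain-Python sketch of the computation looks like this. The helper names and the toy numbers are hypothetical, and the score limits stand in for the values predict would return.

```python
def coverage_indicators(scores, limits):
    """Return one row per test point with a 0/1 entry per coverage level."""
    return [[int(s <= lim) for lim in row] for s, row in zip(scores, limits)]


def column_means(rows):
    """Average each column, giving one empirical coverage per level."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


# Four hypothetical test scores and their score limits at two coverage levels
scores = [0.2, 1.1, 0.7, 2.5]
limits = [[0.5, 1.0], [1.0, 1.5], [0.5, 1.0], [1.0, 2.0]]

covered = coverage_indicators(scores, limits)
print(covered)                # [[1, 1], [0, 1], [0, 1], [0, 0]]
print(column_means(covered))  # [0.25, 0.75]
```

A test point counts as covered at a given level when its score does not exceed the limit for that level, which is equivalent to its PIT value not exceeding the PIT threshold when the conditional CDF is increasing.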