Splitting

Helpers that define how data is assigned to fit and calibration sets.

class splitting.BaseSplitter(random_state=None)

Abstract structure of a splitter. A splitter provides a function that assigns data points to fit and calibration sets.
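The splitter contract described above can be sketched as an abstract callable. This is a minimal illustration in pure Python, not the library's implementation: the names SketchBaseSplitter and FirstHalfSplitter are hypothetical, and the only details taken from this reference are the random_state argument and the (X_fit, y_fit, X_calib, y_calib) tuple layout.

```python
from abc import ABC, abstractmethod
from typing import Iterable, List, Optional, Tuple


class SketchBaseSplitter(ABC):
    """Hypothetical stand-in for an abstract splitter (not the library class)."""

    def __init__(self, random_state: Optional[int] = None) -> None:
        # Seed kept for subclasses that need reproducible randomness.
        self.random_state = random_state

    @abstractmethod
    def __call__(self, X: Iterable, y: Iterable) -> List[Tuple]:
        """Return a list of (X_fit, y_fit, X_calib, y_calib) tuples."""


class FirstHalfSplitter(SketchBaseSplitter):
    """Toy subclass: the first half goes to fit, the rest to calibration."""

    def __call__(self, X, y):
        X, y = list(X), list(y)
        mid = len(X) // 2
        return [(X[:mid], y[:mid], X[mid:], y[mid:])]


splits = FirstHalfSplitter()([1, 2, 3, 4], [0, 1, 0, 1])
```

Returning a list even for a single split lets code that consumes splitters iterate over one-split and multi-split strategies in the same way.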

class splitting.IdSplitter(X_fit, y_fit, X_calib, y_calib)

Identity splitter that wraps an already existing data assignment.

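Parameters:

X_fit (Iterable) – features of the fit subset.
y_fit (Iterable) – labels of the fit subset.
X_calib (Iterable) – features of the calibration subset.
y_calib (Iterable) – labels of the calibration subset.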

__call__(X=None, y=None)

Wraps the provided fit and calibration subsets into a splitter.

Parameters:

X (Iterable) – features array. Not needed here, just a placeholder for interoperability.
y (Iterable) – labels array. Not needed here, just a placeholder for interoperability.

Returns:

List of one tuple of deterministic subsets (X_fit, y_fit, X_calib, y_calib).

Return type:

List[Tuple[Iterable]]
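As an illustration of the identity behaviour, here is a minimal pure-Python sketch, assuming the splitter simply stores the four subsets and returns them unchanged. SketchIdSplitter is a hypothetical name, not the library class.

```python
class SketchIdSplitter:
    """Hypothetical identity splitter: returns the subsets it was given."""

    def __init__(self, X_fit, y_fit, X_calib, y_calib):
        # Store the pre-assigned subsets as a one-element list of tuples.
        self._split = [(X_fit, y_fit, X_calib, y_calib)]

    def __call__(self, X=None, y=None):
        # X and y are ignored; they exist only so the call signature
        # matches the other splitters.
        return self._split


splitter = SketchIdSplitter([[0.1], [0.2]], [0, 1], [[0.3]], [1])
id_splits = splitter()
X_fit, y_fit, X_calib, y_calib = id_splits[0]
```

Calling the splitter with or without X and y gives the same result, which is why both arguments default to None.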

class splitting.RandomSplitter(ratio, random_state=None)

Random splitter that assigns samples to the fit and calibration sets according to a given ratio.

Parameters:

ratio (float) – fraction of the data assigned to the fit set; the remaining 1 - ratio is assigned to the calibration set.
random_state (int) – seed to control random generation.

__call__(X, y)

Implements a random split strategy.

Parameters:

X (Iterable) – features array.
y (Iterable) – labels array.

Returns:

List of one tuple of random subsets (X_fit, y_fit, X_calib, y_calib).

Return type:

List[Tuple[Iterable]]
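A ratio-based random split can be sketched as follows. This is a pure-Python illustration under stated assumptions, not the library's code: SketchRandomSplitter is a hypothetical name, the fit-set size is assumed to be int(ratio * n) (rounded down), and Python's random module stands in for whatever generator the library uses.

```python
import random


class SketchRandomSplitter:
    """Hypothetical random splitter (illustrative only)."""

    def __init__(self, ratio, random_state=None):
        self.ratio = ratio
        self.random_state = random_state

    def __call__(self, X, y):
        X, y = list(X), list(y)
        n = len(X)
        # A dedicated generator keeps the split reproducible for a given seed.
        rng = random.Random(self.random_state)
        indices = list(range(n))
        rng.shuffle(indices)
        # Assumption: the fit-set size is rounded down.
        n_fit = int(self.ratio * n)
        fit_idx, calib_idx = indices[:n_fit], indices[n_fit:]
        return [(
            [X[i] for i in fit_idx],
            [y[i] for i in fit_idx],
            [X[i] for i in calib_idx],
            [y[i] for i in calib_idx],
        )]


X = list(range(10))
y = [v % 2 for v in X]
random_splits = SketchRandomSplitter(ratio=0.8, random_state=0)(X, y)
X_fit, y_fit, X_calib, y_calib = random_splits[0]
```

With ratio=0.8 and ten samples, eight samples land in the fit set and two in the calibration set; reusing the same random_state reproduces the same assignment.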

class splitting.KFoldSplitter(K, random_state=None)

KFold data splitter.

Parameters:

K (int) – number of folds.
random_state (int) – seed to control random generation.

__call__(X, y)

Implements a K-fold split strategy.

Parameters:

X (Iterable) – features array.
y (Iterable) – labels array.

Returns:

List of K folds. Each fold is a tuple (X_fit, y_fit, X_calib, y_calib).

Return type:

List[Tuple[Iterable]]
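The K-fold strategy can be sketched in pure Python as below. This is an illustration, not the library's implementation. SketchKFoldSplitter is a hypothetical name, and two behaviours are assumptions: in each fold the held-out part serves as the calibration set while the remaining K-1 parts form the fit set, and the data is shuffled only when a seed is provided.

```python
import random


class SketchKFoldSplitter:
    """Hypothetical K-fold splitter (illustrative only)."""

    def __init__(self, K, random_state=None):
        self.K = K
        self.random_state = random_state

    def __call__(self, X, y):
        X, y = list(X), list(y)
        indices = list(range(len(X)))
        # Assumption: shuffle only when a seed is given.
        if self.random_state is not None:
            random.Random(self.random_state).shuffle(indices)
        # Partition the indices into K disjoint parts of near-equal size.
        parts = [indices[k::self.K] for k in range(self.K)]
        splits = []
        for k in range(self.K):
            # Assumption: the held-out part is the calibration set.
            calib_idx = parts[k]
            fit_idx = [i for j, part in enumerate(parts) if j != k for i in part]
            splits.append((
                [X[i] for i in fit_idx],
                [y[i] for i in fit_idx],
                [X[i] for i in calib_idx],
                [y[i] for i in calib_idx],
            ))
        return splits


X = list(range(10))
y = [v % 2 for v in X]
kfold_splits = SketchKFoldSplitter(K=5, random_state=0)(X, y)
```

Each sample appears in exactly one calibration set across the K folds, so every data point is used once for calibration and K-1 times for fitting.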