Splitting
Helpers to define how to assign data into fit and calibration sets.
- class splitting.BaseSplitter(random_state=None)
Abstract structure of a splitter. A splitter provides a function that assigns data points to fit and calibration sets.
- Parameters:
random_state (int) – seed to control random generation.
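The contract described above can be illustrated with a minimal sketch. This is not the library's code: `SketchBaseSplitter` and `FirstHalfSplitter` are hypothetical names, and the only things taken from this reference are the `random_state` constructor argument and the `(X_fit, y_fit, X_calib, y_calib)` tuple layout used by the concrete splitters.

```python
from abc import ABC, abstractmethod
from typing import Iterable, List, Optional, Tuple


class SketchBaseSplitter(ABC):
    """Hypothetical stand-in for an abstract splitter (not the library class)."""

    def __init__(self, random_state: Optional[int] = None) -> None:
        # Seed stored so that subclasses can make their splits reproducible.
        self.random_state = random_state

    @abstractmethod
    def __call__(self, X: Iterable, y: Iterable) -> List[Tuple[Iterable, ...]]:
        """Return a list of (X_fit, y_fit, X_calib, y_calib) tuples."""


class FirstHalfSplitter(SketchBaseSplitter):
    """Toy subclass: first half goes to fit, second half to calibration."""

    def __call__(self, X, y):
        X, y = list(X), list(y)
        mid = len(X) // 2
        return [(X[:mid], y[:mid], X[mid:], y[mid:])]


splits = FirstHalfSplitter(random_state=0)([1, 2, 3, 4], [0, 1, 0, 1])
```

Keeping the return value a list, even when it holds a single tuple, lets single-split and multi-fold strategies be consumed by the same loop.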
- class splitting.IdSplitter(X_fit, y_fit, X_calib, y_calib)
Identity splitter that wraps an already existing data assignment.
- Parameters:
X_fit (Iterable) – Fit features.
y_fit (Iterable) – Fit labels.
X_calib (Iterable) – Calibration features.
y_calib (Iterable) – Calibration labels.
- __call__(X=None, y=None)
Wraps the provided fit and calibration subsets into a splitter.
- Parameters:
X (Iterable) – features array. Ignored here; kept only as a placeholder for interoperability with other splitters.
y (Iterable) – labels array. Ignored here; kept only as a placeholder for interoperability with other splitters.
- Returns:
List of one tuple of deterministic subsets (X_fit, y_fit, X_calib, y_calib).
- Return type:
List[Tuple[Iterable]]
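As an illustration of the identity behavior documented above, here is a minimal sketch. `SketchIdSplitter` is a hypothetical stand-in written for this example, not the library class; it only mirrors the documented constructor and `__call__` signatures.

```python
class SketchIdSplitter:
    """Hypothetical stand-in that wraps a pre-existing fit/calibration split."""

    def __init__(self, X_fit, y_fit, X_calib, y_calib):
        # Store the user-provided assignment as a single-element list of tuples.
        self._split = [(X_fit, y_fit, X_calib, y_calib)]

    def __call__(self, X=None, y=None):
        # X and y are ignored: they exist only so that this splitter can be
        # called like any other splitter.
        return self._split


splitter = SketchIdSplitter([[0.1], [0.2]], [0, 1], [[0.3]], [1])
X_fit, y_fit, X_calib, y_calib = splitter()[0]
```

Because the arguments to `__call__` are ignored, the result is the same whether or not data is passed in.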
- class splitting.RandomSplitter(ratio, random_state=None)
Random splitter that assigns samples to the fit and calibration sets according to a given ratio.
- Parameters:
ratio (float) – fraction of the data assigned to the fit set; the remaining 1 - ratio is assigned to the calibration set.
random_state (int) – seed to control random generation.
- __call__(X, y)
Implements a random split strategy.
- Parameters:
X (Iterable) – features array.
y (Iterable) – labels array.
- Returns:
List of one tuple of random subsets (X_fit, y_fit, X_calib, y_calib).
- Return type:
List[Tuple[Iterable]]
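The ratio-based strategy can be sketched in plain Python as follows. `random_split` is a hypothetical helper, not part of the library; the shuffling method and the rounding of the fit-set size (truncation toward zero) are assumptions made for this example.

```python
import random


def random_split(X, y, ratio, random_state=None):
    """Hypothetical sketch: shuffle indices, then cut at ratio * n."""
    X, y = list(X), list(y)
    if len(X) != len(y):
        raise ValueError("X and y must have the same length")
    if not 0 < ratio < 1:
        raise ValueError("ratio must lie strictly between 0 and 1")

    # A dedicated generator keeps the split reproducible for a given seed.
    rng = random.Random(random_state)
    indices = list(range(len(X)))
    rng.shuffle(indices)

    n_fit = int(ratio * len(X))  # assumption: truncate toward zero
    fit_idx, calib_idx = indices[:n_fit], indices[n_fit:]

    return [(
        [X[i] for i in fit_idx],
        [y[i] for i in fit_idx],
        [X[i] for i in calib_idx],
        [y[i] for i in calib_idx],
    )]


X_fit, y_fit, X_calib, y_calib = random_split(range(8), range(8), 0.75, random_state=0)[0]
```

With 8 samples and a ratio of 0.75, six samples go to the fit set and two to the calibration set; the same seed yields the same assignment.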
- class splitting.KFoldSplitter(K, random_state=None)
K-fold data splitter.
- Parameters:
K (int) – number of folds to generate.
random_state (int) – seed to control random generation.
- __call__(X, y)
Implements a K-fold split strategy.
- Parameters:
X (Iterable) – features array.
y (Iterable) – labels array.
- Returns:
List of K split folds. Each fold is a tuple (X_fit, y_fit, X_calib, y_calib).
- Return type:
List[Tuple[Iterable]]
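A K-fold strategy returning tuples of this shape might look like the sketch below. `kfold_split` is a hypothetical helper, not the library's implementation. It assumes the usual K-fold convention, in which each fold serves once as the calibration set while the other K - 1 folds form the fit set, and it assumes the indices are shuffled before being dealt into folds.

```python
import random


def kfold_split(X, y, K, random_state=None):
    """Hypothetical sketch: K tuples of (X_fit, y_fit, X_calib, y_calib)."""
    X, y = list(X), list(y)
    n = len(X)
    if len(y) != n:
        raise ValueError("X and y must have the same length")
    if not 2 <= K <= n:
        raise ValueError("K must be between 2 and the number of samples")

    # Shuffle once (assumption), then deal indices into K folds round-robin.
    indices = list(range(n))
    random.Random(random_state).shuffle(indices)
    folds = [indices[k::K] for k in range(K)]

    splits = []
    for k in range(K):
        calib_idx = folds[k]
        # Every index outside fold k belongs to the fit set.
        fit_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
        splits.append((
            [X[i] for i in fit_idx],
            [y[i] for i in fit_idx],
            [X[i] for i in calib_idx],
            [y[i] for i in calib_idx],
        ))
    return splits


splits = kfold_split(range(10), range(10), K=5, random_state=0)
```

Under these assumptions every sample appears in exactly one calibration set across the K folds, and in the fit set of the remaining K - 1 folds.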