factory module

This module contains factory functions that make it easier to build IndicesInput objects.

from_numpy(x, y, feature_names=None, model=None, target=None)

Builds an IndicesInput from numpy arrays.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| x | | numpy array containing the samples to analyse. | required |
| y | | numpy array containing the labels. Can be None if no labels are provided. | required |
| feature_names | | list of str giving the feature names of x. When None, features are named with numbers. | None |
| model | | function that can be applied to the dataframe and returns a series with the same shape as y. | None |
| target | | one of the targets from the utils.fairness_objective module. | None |

Returns:

| Type | Description |
| --- | --- |
| IndicesInput | an IndicesInput object that can be used to compute sensitivity indices. |
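
The `model` parameter is described only as a function applied to the dataframe that returns one output per sample. As a rough illustration, and not something taken from the library, a callable of that shape might look like the sketch below. The name `linear_model`, its weights, and the sample values are hypothetical.

```python
import numpy as np
import pandas as pd


def linear_model(features: pd.DataFrame) -> pd.Series:
    """Hypothetical model: a fixed linear combination of three features."""
    weights = np.array([0.5, -1.0, 2.0])
    # One prediction per row, aligned with the index of the input dataframe.
    values = features.to_numpy() @ weights
    return pd.Series(values, index=features.index, name="target")


# Two samples with three features each (made-up values).
x = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
features = pd.DataFrame(x)
predictions = linear_model(features)
print(predictions.tolist())  # [4.5, 9.0]
```

Any callable that accepts the feature dataframe and returns a result with one entry per row should fit the same description, for example the bound `predict` method of a trained estimator wrapped to return a pandas object.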

Source code in deel\fairsense\data_management\factory.py
def from_numpy(x, y, feature_names=None, model=None, target=None) -> IndicesInput:
    """
    Builds an IndicesInput from numpy arrays.

    Args:
        x: numpy array containing the samples to analyse.
        y: numpy array containing the labels. Can be None if no labels are provided.
        feature_names: list of str giving the feature names of x. When None,
            features are named with numbers.
        model: function that can be applied to the dataframe and returns a series
            with the same shape as y.
        target: one of the targets from the utils.fairness_objective module.

    Returns:
        an IndicesInput object that can be used to compute sensitivity indices.

    """
    # wrap the samples in a dataframe; columns are numbered when feature_names is None
    df = pd.DataFrame(x, columns=feature_names)
    # keep y as None when no labels are given: pd.DataFrame(None, ...) would build
    # an empty dataframe, which from_pandas would treat as provided labels
    if y is not None:
        y = pd.DataFrame(y, columns=["target"])
    return from_pandas(dataframe=df, y=y, model=model, target=target)
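
To make the effect of `feature_names` concrete, the sketch below reproduces only the two pandas calls that `from_numpy` applies to its inputs, without calling the library itself. The arrays and the feature names `age` and `gender` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Three samples with two features each, plus one label per sample (made-up values).
x = np.array([[0.1, 1.0], [0.2, 0.0], [0.3, 1.0]])
y = np.array([0, 1, 0])

# feature_names=None: pandas numbers the columns 0, 1, ...
unnamed = pd.DataFrame(x, columns=None)
# feature_names given: the list must have one entry per column of x.
named = pd.DataFrame(x, columns=["age", "gender"])
# Labels are wrapped in a one-column dataframe named "target".
labels = pd.DataFrame(y, columns=["target"])

print(list(unnamed.columns))  # [0, 1]
print(list(named.columns))    # ['age', 'gender']
print(labels.shape)           # (3, 1)
```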

from_pandas(dataframe, y, model=None, target=None)

Builds an IndicesInput from a pandas dataframe.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| dataframe | pd.DataFrame | DataFrame containing the samples to analyse. | required |
| y | Union[str, pd.DataFrame, pd.Series, None] | When str, the name of the column containing the labels, which must be present in dataframe. When pd.DataFrame or pd.Series, the labels, provided in the same order as the rows of dataframe. When None, no labels are provided. | required |
| model | Optional[Callable] | function that can be applied to the dataframe and returns a series with the same shape as y. | None |
| target | Callable | one of the targets from the utils.fairness_objective module. | None |

Returns:

| Type | Description |
| --- | --- |
| IndicesInput | an IndicesInput object that can be used to compute sensitivity indices. |

Source code in deel\fairsense\data_management\factory.py
def from_pandas(
    dataframe: pd.DataFrame,
    y: Union[str, pd.DataFrame, pd.Series, None],
    model: Optional[Callable] = None,
    target: Optional[Callable] = None,
) -> IndicesInput:
    """
    Builds an IndicesInput from a pandas dataframe.

    Args:
        dataframe: DataFrame containing the samples to analyse.
        y: when str, the name of the column containing the labels, which must be
            present in dataframe. When pd.DataFrame or pd.Series, the labels,
            provided in the same order as the rows of dataframe. When None, no
            labels are provided.
        model: function that can be applied to the dataframe and returns a series
            with the same shape as y.
        target: one of the targets from the utils.fairness_objective module.

    Returns:
        an IndicesInput object that can be used to compute sensitivity indices.

    """
    if y is None:
        # without labels, the outputs can only come from the model
        assert model is not None, "model must be defined when y is None"
        x = dataframe
    elif isinstance(y, str):
        # y names the label column: split it from the features, keeping column
        # order (indexing with a set loses the order and fails on recent pandas)
        x = dataframe.drop(columns=[y])
        y = dataframe[y]
    elif isinstance(y, (pd.DataFrame, pd.Series)):
        x = dataframe
        y = pd.DataFrame(y)
    else:
        raise RuntimeError("type of y must be DataFrame, Series, str or None")
    return IndicesInput(x=x, y_true=y, model=model, objective=target)
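
The three accepted forms of `y` can be compared side by side with a small stand-in. The helper `split_features_and_labels` below is hypothetical: it mirrors only the branching described above using plain pandas, does not build an IndicesInput, and uses made-up column names and values.

```python
from typing import Optional, Tuple, Union

import pandas as pd

Labels = Union[pd.DataFrame, pd.Series, None]


def split_features_and_labels(
    dataframe: pd.DataFrame, y: Union[str, pd.DataFrame, pd.Series, None]
) -> Tuple[pd.DataFrame, Labels]:
    """Hypothetical stand-in for the handling of y described above."""
    if y is None:
        # No labels: every column is a feature.
        return dataframe, None
    if isinstance(y, str):
        # y names a column: remove it from the features and use it as labels.
        return dataframe.drop(columns=[y]), dataframe[y]
    if isinstance(y, (pd.DataFrame, pd.Series)):
        # Labels supplied separately, row-aligned with the dataframe.
        return dataframe, pd.DataFrame(y)
    raise TypeError("y must be a DataFrame, a Series, a str or None")


data = pd.DataFrame(
    {"age": [25, 40, 31], "income": [30, 55, 42], "label": [0, 1, 0]}
)
features_only = data[["age", "income"]]

x_a, y_a = split_features_and_labels(data, "label")                 # column name
x_b, y_b = split_features_and_labels(features_only, data["label"])  # separate Series
x_c, y_c = split_features_and_labels(features_only, None)           # no labels

print(list(x_a.columns))  # ['age', 'income']
print(list(y_b.columns))  # ['label']
print(y_c)                # None
```

In the first two cases the features end up identical; only the place where the labels come from differs. In the third case the real function additionally requires a `model`, since there would otherwise be no outputs to analyse.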