data_describe.misc.preprocessing

data_describe.misc.preprocessing.preprocess(data, target, impute='simple', encode='label')

Simple preprocessing pipeline for ML.

Parameters
  • data – A pandas DataFrame

  • target – Name of the target feature

  • impute – Method to use for imputing numeric variables. Only 'simple' is implemented.

  • encode – Method to use for encoding categorical variables. Only 'label' is implemented.

Raises
  • NotImplementedError – The requested imputation or encoding method is not implemented.

  • ValueError – No columns left to preprocess.

Returns

A tuple (X, y) of NumPy arrays
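Example

Going by the signature above, a call would look like X, y = preprocess(df, target="label"). The sketch below is not the library's implementation. It is a hypothetical stand-in, preprocess_sketch, that mirrors the documented contract using only pandas and NumPy, so the behavior can be tried without data_describe installed. Two points are assumptions rather than documented facts: that 'simple' imputation fills missing numeric values with the column mean, and that 'label' encoding replaces each category with an integer code. The sample DataFrame and its column names are also made up for illustration.

```python
import numpy as np
import pandas as pd


def preprocess_sketch(data, target, impute="simple", encode="label"):
    """Hypothetical stand-in that mirrors the documented contract of preprocess()."""
    if impute != "simple":
        raise NotImplementedError(f"Imputation method not implemented: {impute}")
    if encode != "label":
        raise NotImplementedError(f"Encoding method not implemented: {encode}")

    # Everything except the target column is treated as a feature.
    features = data.drop(columns=[target])
    if features.shape[1] == 0:
        raise ValueError("No columns left to preprocess.")

    numeric = features.select_dtypes(include="number")
    categorical_cols = features.select_dtypes(exclude="number").columns

    # Assumption: 'simple' means filling missing numeric values with the column mean.
    numeric = numeric.fillna(numeric.mean())

    # Assumption: 'label' means replacing each category with an integer code.
    encoded = pd.DataFrame(
        {col: pd.factorize(features[col])[0] for col in categorical_cols},
        index=features.index,
    )

    X = pd.concat([numeric, encoded], axis=1).to_numpy()
    y = data[target].to_numpy()
    return X, y


# Made-up sample data: one numeric column with a gap, one categorical column.
df = pd.DataFrame(
    {
        "age": [25.0, np.nan, 35.0],
        "city": ["Oslo", "Lima", "Oslo"],
        "label": [0, 1, 0],
    }
)

X, y = preprocess_sketch(df, target="label")
print(X.shape, y.shape)
```

In this sketch, passing any other value for impute or encode raises NotImplementedError, and passing a DataFrame whose only column is the target raises ValueError, matching the Raises section above.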