data_describe.core.importance

importance(data, target: str, preprocess_func=None, estimator=None, return_values: bool = False, truncate: bool = True, top_features: Optional[int] = None, compute_backend: Optional[str] = None, viz_backend: Optional[str] = None, **kwargs)

Variable importance chart.

This feature fits a simple model to the dataset to generate an estimate of feature importance (predictive power). Note that these results depend on the accuracy of the fitted model and should be refined during modeling.

Parameters
  • data – A Pandas data frame

  • target – Name of the response column, as a string

  • preprocess_func – A custom preprocessing function that takes a Pandas dataframe and the name of the target/response column as a string, and returns X and y as a tuple

  • estimator – A custom sklearn estimator. Default is Random Forest Classifier

  • return_values – If True, return only the importance values as a numpy array

  • truncate – If True, negative importance values will be truncated (set to zero)

  • top_features – Return the top N most important features. Default is None (all features)

  • compute_backend – The compute backend

  • viz_backend – The visualization backend

  • **kwargs – Other arguments to be passed to the preprocess function
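
The preprocess_func contract described above (a dataframe and a target name in, an (X, y) tuple out) can be sketched as follows. This is only an illustration under assumptions: the name simple_preprocess and the fill_value argument are hypothetical and not part of data_describe, and the library's default preprocessing may differ.

```python
import pandas as pd


def simple_preprocess(data: pd.DataFrame, target: str, fill_value: float = 0.0):
    """Hypothetical preprocess_func: returns (X, y) for the given target column."""
    # Separate the response column from the predictors.
    y = data[target]
    X = data.drop(columns=[target])
    # Keep numeric predictors only and fill missing values so that an
    # sklearn estimator can be fitted on X directly.
    X = X.select_dtypes(include="number").fillna(fill_value)
    return X, y
```

Going by the parameter descriptions, such a function would be supplied as preprocess_func=simple_preprocess, and an extra keyword argument such as fill_value would reach it through **kwargs.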

Returns

Matplotlib figure, or the importance values as a numpy array if return_values is True
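
The truncate and top_features parameters can be read as a post-processing step on the computed importance values. The sketch below illustrates that reading in plain Python. It is not the library's implementation: the helper postprocess_importances is hypothetical, and the descending sort order is an assumption.

```python
from typing import List, Optional, Sequence, Tuple


def postprocess_importances(
    names: Sequence[str],
    values: Sequence[float],
    truncate: bool = True,
    top_features: Optional[int] = None,
) -> List[Tuple[str, float]]:
    """Illustrative helper (not part of data_describe)."""
    # truncate=True: negative importance values are set to zero.
    if truncate:
        values = [max(v, 0.0) for v in values]
    # Rank features from most to least important (ordering is an assumption).
    ranked = sorted(zip(names, values), key=lambda pair: pair[1], reverse=True)
    # top_features=None keeps every feature; otherwise keep only the top N.
    return ranked if top_features is None else ranked[:top_features]
```

For example, with importance values 0.2, -0.1 and 0.5, the negative value is first clipped to zero, and top_features=2 then keeps only the two highest-ranked features.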