data_describe.core.distributions

distribution(data, diagnostic=True, compute_backend=None, viz_backend=None, **kwargs)

Distribution Plots.

class data_describe.core.distributions.DistributionWidget(input_data=None, spike_value=None, skew_value=None, spike_factor=None, skew_factor=None, viz_backend=None)

Bases: data_describe._widget.BaseWidget

Container for distributions.

An instance of this class is returned by the distribution function. The attributes documented below can be accessed on the returned object.

input_data

The input data.

spike_value

Measure of “spikiness”, used to diagnose spiky histograms in which the tallest bin is n times taller than the average bin.
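The ratio described above can be illustrated with a minimal sketch. The helper names `histogram_counts` and `spike_ratio`, the equal-width binning, and the default of 10 bins are assumptions made for illustration only; this is not necessarily how data_describe computes spike_value.

```python
def histogram_counts(values, bins=10):
    """Count values in `bins` equal-width bins (hypothetical helper)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        if width == 0 or v == hi:
            # Constant data, or the maximum value, goes in the last bin.
            idx = bins - 1
        else:
            idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


def spike_ratio(values, bins=10):
    """Height of the tallest bin divided by the average bin height."""
    counts = histogram_counts(values, bins)
    average = sum(counts) / len(counts)
    return max(counts) / average


# 90 zeros plus the integers 1..10: one bin dominates the histogram.
spiky = [0] * 90 + list(range(1, 11))
print(spike_ratio(spiky))
```

A perfectly flat histogram gives a ratio of 1, and larger values indicate a more dominant tallest bin.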

skew_value

Measure of the skewness metric.
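As a rough illustration of a skewness measure, the sketch below computes the Fisher-Pearson moment coefficient of skewness. Whether data_describe uses this exact estimator is an assumption; the function name `sample_skewness` is hypothetical.

```python
import math


def sample_skewness(values):
    """Fisher-Pearson coefficient: third central moment / variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # population variance
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    if m2 == 0:
        # Skewness is undefined for constant data.
        return math.nan
    return m3 / m2 ** 1.5


print(sample_skewness([1, 2, 3, 4, 5]))   # symmetric data
print(sample_skewness([1, 1, 1, 1, 10]))  # long right tail
```

Symmetric data yields a value of zero, a long right tail yields a positive value, and a long left tail yields a negative value.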

spike_factor

The threshold factor used to diagnose “spikiness”.

skew_factor

The threshold factor used to diagnose skew.
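One plausible way to read the two threshold factors is as cut-offs applied to the corresponding measures. The sketch below assumes that a feature is flagged when its measure exceeds the factor, and that skew is compared by absolute value; both the comparison rule and the function name `diagnose` are assumptions, not the library's documented behavior.

```python
def diagnose(spike_value, skew_value, spike_factor, skew_factor):
    """Flag a feature whose measures exceed the given thresholds (assumed rule)."""
    return {
        "spikey": spike_value > spike_factor,
        # Assumption: strong skew in either direction counts as skewed.
        "skewed": abs(skew_value) > skew_factor,
    }


# The threshold values here are placeholders chosen for the example.
print(diagnose(spike_value=9.0, skew_value=0.5, spike_factor=5, skew_factor=1))
```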

show(self, viz_backend=None, **kwargs)

The default display for this output.

Displays a summary of diagnostics.

Parameters
  • viz_backend (str, optional) – The visualization backend.

  • **kwargs – Keyword arguments.

plot_distribution(self, x: Optional[str] = None, contrast: Optional[str] = None, viz_backend: Optional[str] = None, **kwargs)

Generate distribution plot(s).

Numeric features will be visualized using a histogram/violin plot, and any other types will be visualized using a categorical bar plot.
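The dispatch rule described above can be sketched as follows. The function name `choose_plot_kind` and the type check are hypothetical; the library may detect numeric features differently (for example, from data frame column dtypes).

```python
from numbers import Number


def choose_plot_kind(values):
    """Pick a plot family from the values of a single feature (illustrative)."""
    is_numeric = bool(values) and all(
        # bool is a Number subclass in Python, so exclude it explicitly.
        isinstance(v, Number) and not isinstance(v, bool)
        for v in values
    )
    return "histogram/violin" if is_numeric else "bar"


print(choose_plot_kind([1, 2.5, 3]))
print(choose_plot_kind(["red", "green", "red"]))
```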

Parameters
  • x (str, optional) – The feature name to plot. If None, will plot all features.

  • contrast (str, optional) – The name of the feature by which to contrast (compare) histograms.

  • mode (str) – The type of plot to display, one of {‘combo’, ‘violin’, ‘hist’}. Defaults to a combined histogram/violin plot.

  • hist_kwargs (dict, optional) – Keyword args for seaborn.histplot.

  • violin_kwargs (dict, optional) – Keyword args for seaborn.violinplot.

  • viz_backend (str, optional) – The visualization backend.

  • **kwargs – Additional keyword arguments for the visualization backend.

Returns

Histogram plot(s).

data_describe.core.distributions.distribution(data, diagnostic=True, compute_backend=None, viz_backend=None, **kwargs) → DistributionWidget

Distribution Plots.

Visualizes univariate distributions. This function can generate several types of plots, including histograms, violin plots, and bar (count) plots.

Parameters
  • data – The input data frame.

  • diagnostic – If True, will run diagnostics to select “interesting” plots.

  • compute_backend – The compute backend.

  • viz_backend – The visualization backend.

  • **kwargs – Keyword arguments.

Raises

ValueError – Invalid input data type.

Returns

DistributionWidget
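A typical workflow, based only on the signatures documented on this page, might look like the sketch below. The wrapper name `describe_distributions` is hypothetical, and the sketch assumes that `df` is a data frame type accepted by the active compute backend (for example, a pandas DataFrame) and that the package is installed.

```python
def describe_distributions(df, feature=None):
    """Run diagnostics on `df` and plot one feature, or all if None."""
    # Imported lazily so this sketch can be loaded without data_describe installed.
    from data_describe.core.distributions import distribution

    # Returns a DistributionWidget; raises ValueError for an invalid input type.
    widget = distribution(df, diagnostic=True)

    # Default display: a summary of the diagnostics.
    widget.show()

    # Histogram only; 'combo' and 'violin' are the other documented modes.
    return widget.plot_distribution(x=feature, mode="hist")
```

The widget attributes documented above (spike_value, skew_value, spike_factor, skew_factor) remain available on the object returned by distribution for further inspection.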