Introduction

data-describe is a Python toolkit for Exploratory Data Analysis (EDA). It aims to accelerate data exploration and analysis by providing automated and opinionated analysis widgets.

Main Features

The main features of data-describe are organized as the “core”. These features are expected to be commonly used in most EDA applications on tabular data.

Example Usage

The core features (functions) are exported and can be used directly:

import data_describe as dd
# df: a DataFrame of tabular data loaded beforehand (e.g. with pandas)
dd.data_summary(df)

Non-core features need to be imported explicitly. For example, for text preprocessing:

from data_describe.text.text_preprocessing import preprocess_texts
# TEXT_COLUMN stands in for the name of a text column in df
preprocess_texts(df.TEXT_COLUMN)
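
The two import styles above can also be resolved programmatically, which may be handy in scripts that should keep running when data-describe is not installed. The following is a minimal sketch, not part of data-describe: the helper name load_feature is hypothetical, and the assumption that a missing package surfaces as an ImportError is ours. Only the module paths and function names already shown above are taken from this page.

```python
import importlib


def load_feature(module_path, name):
    """Return attribute `name` from `module_path`, or None if unavailable.

    `load_feature` is a hypothetical helper for illustration only;
    it is not part of data-describe.
    """
    try:
        module = importlib.import_module(module_path)
    except ImportError:
        # Assumption: the package (or something it imports) is not installed.
        return None
    return getattr(module, name, None)


# Core feature: exported from the top-level package.
data_summary = load_feature("data_describe", "data_summary")

# Non-core feature: lives in a submodule and is addressed by its full path.
preprocess_texts = load_feature(
    "data_describe.text.text_preprocessing", "preprocess_texts"
)

# Report which features could be resolved in this environment.
for label, func in [
    ("data_summary", data_summary),
    ("preprocess_texts", preprocess_texts),
]:
    print(label, "available" if func is not None else "not available")
```

When a feature resolves, it can be called exactly as in the snippets above, for example data_summary(df). When it does not, the script can skip that step instead of failing at import time.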

Extended Features

Additional features of data-describe include sensitive data detection (e.g. PII), text analysis, dimensionality reduction, and more. For more information on using these, check out the Examples or API Reference sections.