Scatter Plots
[1]:
import pandas as pd
import data_describe as dd
UserWarning: The Dask Engine for Modin is experimental.
UserWarning: The extension "jupyterlab-plotly" was not found and is required for Plotly-based visualizations.
[2]:
from sklearn.datasets import load_diabetes
data = load_diabetes()
df = pd.DataFrame(data.data, columns=list(data.feature_names))
df['target'] = data.target
df.shape
[2]:
(442, 11)
[3]:
df.head(2)
[3]:
| | age | sex | bmi | bp | s1 | s2 | s3 | s4 | s5 | s6 | target |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.038076 | 0.050680 | 0.061696 | 0.021872 | -0.044223 | -0.034821 | -0.043401 | -0.002592 | 0.019908 | -0.017646 | 151.0 |
| 1 | -0.001882 | -0.044642 | -0.051474 | -0.026328 | -0.008449 | -0.019163 | 0.074412 | -0.039493 | -0.068330 | -0.092204 | 75.0 |
Scatterplot Matrix
[4]:
dd.scatter_plots(df, mode='matrix')
[4]:
<seaborn.axisgrid.PairGrid at 0x23b2f3bd6c8>
[Figure: ../_images/examples_scatter_plots_5_1.png]
Show all plots
[9]:
df_subset = df.iloc[:, :3]  # Keep only the first three columns to limit the number of plots in this notebook
dd.scatter_plots(df_subset, mode='all')
[9]:
[<seaborn.axisgrid.JointGrid at 0x23b419380c8>,
<seaborn.axisgrid.JointGrid at 0x23b441f40c8>,
<seaborn.axisgrid.JointGrid at 0x23b4453a048>]
[Figure: ../_images/examples_scatter_plots_7_1.png]
[Figure: ../_images/examples_scatter_plots_7_2.png]
[Figure: ../_images/examples_scatter_plots_7_3.png]
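The subset keeps the output short because mode='all' appears to draw one plot per pair of columns: three columns produced the three JointGrid objects above. A minimal sketch of that count, assuming one plot per unordered column pair (the helper count_pairwise_plots is hypothetical and not part of data_describe):

```python
from itertools import combinations


def count_pairwise_plots(columns):
    """Count unordered column pairs, assuming one scatter plot per pair."""
    return sum(1 for _ in combinations(columns, 2))


# Column names of the diabetes DataFrame built earlier in this notebook.
all_columns = ["age", "sex", "bmi", "bp", "s1", "s2", "s3", "s4", "s5", "s6", "target"]

subset_count = count_pairwise_plots(all_columns[:3])  # the three-column subset
full_count = count_pairwise_plots(all_columns)        # the full 11-column frame
print(subset_count, full_count)
```

Under that assumption, the full DataFrame would produce 55 individual plots, compared with 3 for the subset.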
Show plots of interest using scatterplot diagnostics
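Filter plots by a diagnostic. In mode='diagnostic', threshold takes a dictionary that maps a diagnostic name (such as Outlying or Striated) to a cutoff value, and only the plots that pass that cutoff are returned.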
[6]:
dd.scatter_plots(df, mode='diagnostic', threshold={'Outlying': 0.5})
[6]:
[<seaborn.axisgrid.JointGrid at 0x23b3c76a548>]
[Figure: ../_images/examples_scatter_plots_9_1.png]
[7]:
dd.scatter_plots(df, mode='diagnostic', threshold={'Striated': 0.9})
[7]:
[<seaborn.axisgrid.JointGrid at 0x23b3ee14848>,
<seaborn.axisgrid.JointGrid at 0x23b41338ec8>]
[Figure: ../_images/examples_scatter_plots_10_1.png]
[Figure: ../_images/examples_scatter_plots_10_2.png]
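The diagnostic calls above can be read as a filter: score every pair of columns, then keep only the pairs whose score passes the threshold. The sketch below illustrates that idea only. It assumes a pair is kept when its score is strictly greater than the threshold, uses absolute Pearson correlation as a stand-in score (this is not one of the scatterplot diagnostics such as Outlying or Striated), and runs on made-up toy data. The helpers pearson and filter_pairs are hypothetical and are not data_describe functions.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def filter_pairs(columns, threshold):
    """Return column-name pairs whose stand-in score exceeds the threshold."""
    kept = []
    for (name_x, x), (name_y, y) in combinations(columns.items(), 2):
        score = abs(pearson(x, y))  # stand-in score, NOT a scatterplot diagnostic
        if score > threshold:       # assumption: strictly above the cutoff
            kept.append((name_x, name_y))
    return kept


# Toy data for illustration only (not the diabetes dataset).
toy = {
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 10],   # perfectly correlated with "a"
    "c": [1, -1, 0, -1, 1],  # uncorrelated with "a" and "b"
}

selected = filter_pairs(toy, threshold=0.9)
print(selected)
```

Raising the threshold returns fewer pairs and lowering it returns more, which mirrors how the two calls above return different numbers of plots for different diagnostics and cutoffs.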