data_describe.dimensionality_reduction.dimensionality_reduction
dim_reduc | Reduces the number of dimensions of the input data.
run_pca | Reduces the number of dimensions of the input data using PCA.
run_ipca | Reduces the number of dimensions of the input data using Incremental PCA.
run_tsne | Reduces the number of dimensions of the input data using t-SNE.
run_tsvd | Reduces the number of dimensions of the input data using TSVD.

data_describe.dimensionality_reduction.dimensionality_reduction.dim_reduc(data, n_components: int, dim_method: str, apply_tsvd: bool = True, compute_backend=None)
Reduces the number of dimensions of the input data.
- Parameters
data – The dataframe
n_components – Desired dimensionality for the output dataset
dim_method – The reduction method, one of {'pca', 'ipca', 'tsne', 'tsvd'}:
  pca – Principal Component Analysis
  ipca – Incremental Principal Component Analysis. Recommended for very large datasets
  tsne – T-distributed Stochastic Neighbor Embedding
  tsvd – Truncated Singular Value Decomposition
apply_tsvd – If True, TSVD will be run before t-SNE. This is highly recommended when running t-SNE
- Returns
The dimensionally reduced dataframe and the applied reduction object
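The `dim_method` values listed above each name a standard technique. As a rough illustration of how such a string could be mapped to an estimator, here is a minimal sketch using scikit-learn classes. The `REDUCERS` table and `make_reducer` helper are hypothetical names invented for this example; they are not part of data_describe, and the library's own dispatch may differ.

```python
from sklearn.decomposition import PCA, IncrementalPCA, TruncatedSVD
from sklearn.manifold import TSNE

# Hypothetical lookup table: method name -> scikit-learn estimator class.
REDUCERS = {
    "pca": PCA,
    "ipca": IncrementalPCA,
    "tsne": TSNE,
    "tsvd": TruncatedSVD,
}


def make_reducer(dim_method: str, n_components: int):
    """Return an unfitted estimator for the requested method (illustrative only)."""
    try:
        reducer_cls = REDUCERS[dim_method]
    except KeyError:
        valid = ", ".join(sorted(REDUCERS))
        raise ValueError(
            f"Unknown dim_method {dim_method!r}; expected one of: {valid}"
        ) from None
    return reducer_cls(n_components=n_components)


reducer = make_reducer("tsvd", n_components=2)
```

Rejecting unknown method names up front keeps a typo such as `'svd'` from silently falling through to a default.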

data_describe.dimensionality_reduction.dimensionality_reduction.run_pca(data, n_components, compute_backend=None)
Reduces the number of dimensions of the input data using PCA.
- Parameters
data – The dataframe
n_components – Desired dimensionality for the data set prior to modeling
- Returns
reduc_df: The dimensionally reduced dataframe
pca: The applied PCA object
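To make the two return values concrete, the following sketch reproduces the same shape of result with scikit-learn and pandas directly. `pca_sketch` and the `component_N` column names are assumptions made for this example, not the library's implementation or naming.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA


def pca_sketch(data: pd.DataFrame, n_components: int):
    """Hypothetical stand-in that returns (reduced dataframe, fitted PCA)."""
    pca = PCA(n_components=n_components)
    reduced = pca.fit_transform(data)
    # Column names are an assumption made for this example.
    columns = [f"component_{i + 1}" for i in range(n_components)]
    reduc_df = pd.DataFrame(reduced, columns=columns, index=data.index)
    return reduc_df, pca


rng = np.random.default_rng(0)
data = pd.DataFrame(rng.normal(size=(50, 6)), columns=list("abcdef"))
reduc_df, pca = pca_sketch(data, n_components=2)
```

Returning the fitted object alongside the dataframe lets the caller inspect attributes such as `pca.explained_variance_ratio_` or reuse the transform on new rows.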

data_describe.dimensionality_reduction.dimensionality_reduction.run_ipca(data, n_components, compute_backend=None)
Reduces the number of dimensions of the input data using Incremental PCA.
- Parameters
data – The dataframe
n_components – Desired dimensionality for the data set prior to modeling
- Returns
reduc_df: The dimensionally reduced dataframe
ipca: The applied IncrementalPCA object
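Incremental PCA suits large datasets because it can be fitted one batch at a time. The sketch below shows that pattern with scikit-learn's `IncrementalPCA`. Whether `run_ipca` batches the data this way internally is not stated here, and the batch count of 4 is an arbitrary choice for illustration.

```python
import numpy as np
from sklearn.decomposition import IncrementalPCA

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))

ipca = IncrementalPCA(n_components=3)

# Fit on one chunk at a time so the full array never has to be
# decomposed in a single step. Each batch needs at least
# n_components rows.
for batch in np.array_split(X, 4):
    ipca.partial_fit(batch)

reduced = ipca.transform(X)
```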

data_describe.dimensionality_reduction.dimensionality_reduction.run_tsne(data, n_components, apply_tsvd=True, compute_backend=None)
Reduces the number of dimensions of the input data using t-SNE.
- Parameters
data – The dataframe
n_components – Desired dimensionality for the output dataset
apply_tsvd – If True, TSVD will be run before t-SNE. This is highly recommended when running t-SNE
- Returns
reduc_df: The dimensionally reduced dataframe
tsne: The applied t-SNE object
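Running TSVD before t-SNE shrinks the feature space that t-SNE has to process. A minimal sketch of that two-step pipeline with scikit-learn follows. The intermediate width of 10, the perplexity of 10, and the `ts1`/`ts2` column names are assumptions chosen for this small example, not values taken from data_describe.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import TruncatedSVD
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
data = pd.DataFrame(
    rng.normal(size=(60, 40)),
    columns=[f"f{i}" for i in range(40)],
)

# Step 1: compress 40 features to an intermediate width with TSVD.
tsvd = TruncatedSVD(n_components=10, random_state=0)
compressed = tsvd.fit_transform(data)

# Step 2: embed the compressed data with t-SNE.
# perplexity must be smaller than the number of samples.
tsne = TSNE(n_components=2, perplexity=10, random_state=0)
embedded = tsne.fit_transform(compressed)

reduc_df = pd.DataFrame(embedded, columns=["ts1", "ts2"], index=data.index)
```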

data_describe.dimensionality_reduction.dimensionality_reduction.run_tsvd(data, n_components, compute_backend=None)
Reduces the number of dimensions of the input data using TSVD.
- Parameters
data – The dataframe
n_components – Desired dimensionality for the output dataset
- Returns
reduc_df: The dimensionally reduced dataframe
tsvd: The applied TSVD object
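For reference, scikit-learn's `TruncatedSVD` does not center the data, so it can be applied to sparse matrices directly. The sketch below shows this with a random sparse matrix. It illustrates the underlying technique only; whether `run_tsvd` accepts sparse input is not stated here and should not be assumed.

```python
from scipy import sparse
from sklearn.decomposition import TruncatedSVD

# Random sparse matrix: 100 rows, 30 features, about 10% non-zero entries.
X = sparse.random(100, 30, density=0.1, format="csr", random_state=0)

tsvd = TruncatedSVD(n_components=5, random_state=0)
reduced = tsvd.fit_transform(X)  # dense array with 5 columns
```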