Hello,
Is there a way to apply multiple transforms to multiple columns, treating each one separately?
I was able to implement it using sklearn's ColumnTransformer as follows:
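from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# NAHandler is a custom transformer, not part of scikit-learn (definition not shown)
ct = ColumnTransformer(
    [
        (
            'numeric',
            Pipeline([
                ('handle_na', NAHandler(is_train=True, nan_cols=[])),
                ('standardize', StandardScaler()),
                ('PCA', PCA(n_components=4)),
            ]),
            ['col1', 'col2', 'col3', 'col4'],
        ),
    ],
    remainder='passthrough',
)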
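For reference, here is a minimal self-contained sketch of the same ColumnTransformer idea on made-up data. `SimpleImputer` is only a stand-in for the custom `NAHandler`, and the toy DataFrame, the `other` column, and `n_components=2` are assumptions chosen for illustration, not part of my real setup.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Toy data (made up): four numeric columns plus one column left untouched.
rng = np.random.default_rng(0)
numeric_cols = ["col1", "col2", "col3", "col4"]
df = pd.DataFrame(rng.normal(size=(20, 4)), columns=numeric_cols)
df.loc[0, "col1"] = np.nan          # one missing value for the imputer to fill
df["other"] = np.arange(20)         # handled by remainder='passthrough'

# All four columns go through the same chain of steps as one group.
numeric_pipeline = Pipeline([
    ("handle_na", SimpleImputer(strategy="mean")),  # stand-in for NAHandler
    ("standardize", StandardScaler()),
    ("pca", PCA(n_components=2)),
])

ct = ColumnTransformer(
    [("numeric", numeric_pipeline, numeric_cols)],
    remainder="passthrough",
)

out = ct.fit_transform(df)
print(out.shape)  # two PCA components followed by the passthrough column
```

The important part is that the pipeline receives the four columns together, so PCA sees a 4-column matrix rather than one column at a time.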
However, the sklearn-pandas documentation doesn't point me to anything like this.
I can see there are two sections: one for a single column with multiple transforms, and another for multiple columns with a single transform.
I couldn't find anything for multiple columns with multiple transforms.
For now I am able to do what I intend by writing the transforms for each and every column separately, i.e.:
mapper = DataFrameMapper(
    [
        (
            ['col1'],
            [NAHandler(is_train=True, nan_cols=[]), StandardScaler(), PCA(n_components=4)]
        ),
        (
            ['col2'],
            [NAHandler(is_train=True, nan_cols=[]), StandardScaler(), PCA(n_components=4)]
        ),
        # ........
    ],
    input_df=True,
    df_out=True,
    default=None
)
But what I was actually looking for is ColumnTransformer-like usage, something like this:
mapper = DataFrameMapper(
    [
        (
            ['col1', 'col2', 'col3', 'col4'],
            [NAHandler(is_train=True, nan_cols=[]), StandardScaler(), PCA(n_components=4)]
        )
    ],
    input_df=True,
    df_out=True,
    default=None
)
Will such functionality be supported in upcoming builds? It would be very helpful!