Dataset with floating point values #12

Open
itlchriss opened this issue Aug 18, 2024 · 1 comment

Comments

@itlchriss

Hi,

This work is great, and I would like to explore using it on more complicated datasets. I tried it on the Wisconsin breast cancer dataset. However, because the dataset contains many distinct floating-point values, get_dummies produces a large number of feature names with those values appended. I also tried removing the check at explain.py:90, but then no rules were found. Are there any limitations on using this work with datasets that contain floating-point values?

@groshanlal
Collaborator

groshanlal commented Aug 23, 2024

TE2Rules can handle both continuous and categorical features. Regarding the Wisconsin breast cancer dataset, most of the features are continuous. Please use get_dummies only to transform categorical features into one-hot encoded features. Do not use it on all features, since it would make the continuous features unusable.
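
To illustrate the point above, here is a minimal sketch of one-hot encoding only the categorical columns with pandas. The toy data and the column names (`mean_radius`, `mean_texture`, `tumor_site`) are made up for illustration and are not taken from the actual dataset or from TE2Rules.

```python
import pandas as pd

# Hypothetical toy data: two continuous columns and one categorical column.
df = pd.DataFrame({
    "mean_radius": [17.99, 20.57, 11.42],
    "mean_texture": [10.38, 17.77, 20.38],
    "tumor_site": ["left", "right", "left"],
})

# Encode only the categorical column. Columns not listed in `columns`
# are passed through unchanged, so the continuous values stay numeric.
categorical_cols = ["tumor_site"]
encoded = pd.get_dummies(df, columns=categorical_cols, dtype=int)

feature_names = list(encoded.columns)
print(feature_names)
```

Calling `pd.get_dummies(df)` without `columns` only encodes object, string, and category columns by default, but listing the categorical columns explicitly makes the intent clear and avoids surprises if a continuous column was loaded as strings.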

If you are using get_dummies, make sure that the transformed feature names do not contain hyphens ("-"). TE2Rules expects feature names to contain only alphanumeric characters and underscores. Replace hyphens ("-") with underscores ("_") in feature names.
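
A rough sketch of that renaming step in plain Python is shown here. The example names are hypothetical, and the validity check assumes that "alphanumeric" means ASCII letters and digits, which is an assumption rather than something confirmed in this thread.

```python
import re

# Hypothetical feature names, similar to what get_dummies may produce
# when category values contain hyphens.
names = ["workclass_Self-emp-inc", "mean_radius", "marital-status_Never-married"]

# Replace hyphens with underscores.
clean_names = [name.replace("-", "_") for name in names]

def is_valid_feature_name(name: str) -> bool:
    # Assumed rule: only ASCII letters, digits, and underscores.
    return re.fullmatch(r"[A-Za-z0-9_]+", name) is not None

# Collect any names that still contain other characters.
invalid = [name for name in clean_names if not is_valid_feature_name(name)]
print(clean_names, invalid)
```

With a pandas DataFrame, the cleaned list can be assigned back with `df.columns = clean_names` before the feature names are passed on.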
