Ordinal encoding not working as expected #275
Thanks for catching this. I believe it's highly likely because I use I will confirm this and try to implement
@ablaom could you check now on the entityembeds branch
Okay, there is still a problem. For some reason the vectors, although consisting now of
julia> X.Column2
5-element Vector{AbstractFloat}:
 1.0f0
 2.0f0
 3.0f0
 4.0f0
 5.0f0
julia> [X.Column2...]
5-element Vector{Float32}:
 1.0
 2.0
 3.0
 4.0
 5.0
I looked into this, and this seems to arise from the nature of
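The narrowing seen in the session above can be reproduced in plain Julia, independent of any package. The following is a minimal sketch, assuming a stand-in vector `v` in place of `X.Column2`; it compares the splatting trick with broadcasting `identity`, which also re-infers the element type from the values actually stored.

```julia
# A vector whose declared element type is abstract,
# although every stored value is a Float32 (stand-in for X.Column2).
v = AbstractFloat[1.0f0, 2.0f0, 3.0f0, 4.0f0, 5.0f0]

# The declared element type is abstract, hence not concrete.
@show eltype(v)
@show isconcretetype(eltype(v))

# Splatting into a vector literal lets Julia pick the element type
# from the values themselves.
narrowed_splat = [v...]

# Broadcasting `identity` narrows the element type as well,
# without splatting every element into a function call.
narrowed_bcast = identity.(v)

@show eltype(narrowed_splat)
@show eltype(narrowed_bcast)
```

Either form only repairs the symptom; the values were already `Float32`, so the more useful question is why the container was created with an abstract element type in the first place.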
Thanks for your initiative in applying this fix. Yes, I will revisit for MLJTransforms. I do believe that the ordinal encoder there should return values still wrapped in What I am thinking about now is whether doing
If you need "my trick" you are already doing something suboptimal. I didn't check this, but maybe the problem is not
Quite embarrassingly, that turned out to be the case. Somehow it completely slipped my mind that annotating the dict with an abstract type forces the values to keep that supertype. I will aim to remove the abstract type annotation in the dict for MLJTransforms as well... @ablaom I already did so here in the entityembeds branch, so I believe we no longer need the trick...
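To illustrate the root cause described in this comment, here is a minimal sketch in plain Julia. It is not the package's code: `levels`, `data` and the helper `encode` are hypothetical, and the sketch assumes the encoded column is allocated with the dictionary's value type.

```julia
levels = ["low", "medium", "high"]
data = ["high", "low", "medium", "low"]

# Abstract value type in the annotation: the values are Float32,
# but the dictionary's value type stays AbstractFloat.
abstract_map = Dict{String, AbstractFloat}(l => Float32(i) for (i, l) in enumerate(levels))

# No annotation: the value type is taken from the pairs, giving Float32.
concrete_map = Dict(l => Float32(i) for (i, l) in enumerate(levels))

# Hypothetical encoder: the output vector is typed by the mapping's value type.
encode(mapping, column) = valtype(mapping)[mapping[x] for x in column]

@show eltype(encode(abstract_map, data))
@show eltype(encode(concrete_map, data))
```

With the annotation removed, the output column has a concrete element type and no narrowing trick is needed afterwards.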
Okay, thanks. I think the easiest is for me to separately fix the dict type in #276, which I've not merged yet.
Although resolved, I'm re-opening this issue because the test suite currently has no test that would catch a recurrence, and this has already led to an issue at #281.
Is it implied that I should add tests of a particular nature to the current test suite?
Not sure I understand the question. I am proposing that new tests be added to the test suite to catch errors like the one encountered in the opening post. Nothing more, nothing less.
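One possible shape for such a regression test is sketched below, using only the `Test` standard library. The helper `all_columns_concrete` and the two sample tables are hypothetical; a real test would apply the check to the output of the encoding step rather than to hand-built tables.

```julia
using Test

# Hypothetical helper: true when every column of a column table
# (a NamedTuple of vectors) has a concrete element type.
all_columns_concrete(table) = all(col -> isconcretetype(eltype(col)), values(table))

@testset "encoded columns have concrete element types" begin
    # Columns as they should look after encoding.
    good = (Column1 = Float32[1, 2, 3], Column2 = Float32[1, 2, 3])
    # A column reproducing the reported symptom: Float32 values, abstract eltype.
    bad = (Column1 = Float32[1, 2, 3], Column2 = AbstractFloat[1.0f0, 2.0f0, 3.0f0])

    @test all_columns_concrete(good)
    @test !all_columns_concrete(bad)
end
```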
In stepping through fit for NeuralNetworkRegressor, using the data at the top of the test file regressors.jl, I am getting some unexpected behaviour.
Here is a minimal version of that data giving the same behaviour:
And the model:
Okay, now the following lines are copied from fit, as given in "src/mlj_model_interface.jl" on the dev branch:
At this point I expect X to have Continuous scitype - no more categoricals. However:
The raw element type is Float32 but these are getting wrapped as categorical vectors.