Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GTSAM_DT_MERGING Flag #1501

Merged
merged 53 commits into from
Nov 11, 2023
Merged

GTSAM_DT_MERGING Flag #1501

merged 53 commits into from
Nov 11, 2023

Conversation

varunagrawal
Copy link
Collaborator

@varunagrawal varunagrawal commented Mar 26, 2023

This PR is a discussion venue for issue #1496.

I have added a unit test which demonstrates an issue with the number of assignments being set for the decision tree.
This seems to be because the decision tree rearranges the keys to be highest to lowest but doesn't account for the implicit pruning performed. If I disable the implicit pruning in the decision tree, this issue doesn't appear.

This PR adds a new CMake flag GTSAM_DT_MERGING which enables/disables DecisionTree merging of branches when the leaves have the same value. This flag is generally useful for debugging and maybe useful for other things.

Accordingly I have updated the tests and the config files.

I have also removed the nrAssignments_ variable in DecisionTree due to it needing significant refactor and the issue it was originally intended for (hybrid pruning) being resolved via alternative means.

@varunagrawal varunagrawal self-assigned this Mar 26, 2023
@@ -299,6 +299,15 @@ namespace gtsam {
/// Return the number of leaves in the tree.
size_t nrLeaves() const;

/**
* @brief Return the number of total leaf assignments.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please give examples as to what this is, I think "number of assignments" needs to be explained.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO(Frank): Add comments to mention that this is a convenience function and not used for any major operation.

@dellaert
Copy link
Member

dellaert commented Jun 4, 2023

@varunagrawal I think we discussed this and is no more relevant? If so, please close issue and PR. If not, let's chat again as I lost context.

@varunagrawal
Copy link
Collaborator Author

This is still a bug, one that I introduced accidentally.

@varunagrawal
Copy link
Collaborator Author

varunagrawal commented Jun 8, 2023

So I managed to figure this out after a lot of debugging.

The big issue was that when building the tree, we call Unique which does the pruning, but once the tree is built, pruned nodes are added back if the labels are not in decreasing order (this is handled by compose).

To address this, I made Unique a recursive method that now operates only after the whole tree is built and the labels reordered. The tree construction process is now: build tree bottom-up, order labels, prune top-down. The previous approach was: build and prune bottom-up, order labels.

@varunagrawal varunagrawal added the bugfix Fixes an issue or bug label Jun 8, 2023
@varunagrawal varunagrawal linked an issue Jun 8, 2023 that may be closed by this pull request
@varunagrawal
Copy link
Collaborator Author

#1555 details some of the pending issues I have with nrAssignments. I'll get back to those after I update Hybrid to use the TableFactor.

@dellaert
Copy link
Member

Please comment on what the status is of this PR after you merged all other 3 we discussed. Also, might want to merge in develop after that to make this PR much smaller?

@varunagrawal
Copy link
Collaborator Author

Merged in develop. The nrAssignments is still messed up due to the binary apply method, and I'll push a unit test to this PR so we can track it later.

@dellaert
Copy link
Member

Still confused about status of this PR. Changes in 15 files, many tests now DISABLED.

@varunagrawal
Copy link
Collaborator Author

This PR is still a WIP. I'll update it as I go along.

@varunagrawal
Copy link
Collaborator Author

I removed nrAssignments since the issues go much deeper and doing a full fix would take far too long, time which I currently don't have. Also the original reason I added that feature was to improve the pruning speed, but that was accomplished in another way.

@varunagrawal varunagrawal changed the title Fixing nrAssignments in DecisionTree GTSAM_DT_MERGING Flag Nov 6, 2023
@varunagrawal varunagrawal mentioned this pull request Nov 10, 2023
Copy link
Member

@dellaert dellaert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I like the simplification.

@varunagrawal varunagrawal merged commit 1121ece into develop Nov 11, 2023
26 checks passed
@varunagrawal varunagrawal deleted the fix-1496 branch November 11, 2023 15:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugfix Fixes an issue or bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Odd Behavior for nrAssignments in DecisionTree
2 participants