[Feature]: Add generalized gamma index likelihood #497

Open
Cole-Monnahan-NOAA opened this issue Aug 18, 2023 · 7 comments
Labels
statistics related to logL wishlist request new feature; bigger than revision; OK to remove after adding to Milestone

@Cole-Monnahan-NOAA

Describe the solution you would like.

In many cases I believe it is more statistically justifiable to use an index likelihood that is more flexible than the lognormal. One appealing alternative is the generalized gamma distribution (GGD). It requires reading in a third parameter ("Q") and has a more complicated probability density function.
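For readers unfamiliar with the GGD, here is a minimal sketch of its log-density. It assumes the (mu, sigma, Q) parameterization of Prentice (1974), as used by the R package flexsurv; the issue only names a third parameter "Q", so the parameterization, the function name `gengamma_logpdf`, and the helper `integrate_density` are illustrative assumptions, not the code on the prototype branch. In this form Q = 0 is the lognormal limit and Q = sigma gives a gamma distribution.

```python
import math
import numpy as np

def gengamma_logpdf(x, mu, sigma, Q):
    """Log-density of the generalized gamma in the assumed (mu, sigma, Q) form."""
    x = np.asarray(x, dtype=float)
    w = (np.log(x) - mu) / sigma  # standardized log-observation
    if Q == 0.0:
        # Limiting case: lognormal with log-mean mu and log-sd sigma
        return -np.log(sigma * x) - 0.5 * math.log(2.0 * math.pi) - 0.5 * w**2
    qi = 1.0 / (Q * Q)
    qw = Q * w
    return (-np.log(sigma * x)
            + math.log(abs(Q)) * (1.0 - 2.0 * qi)
            + qi * (qw - np.exp(qw))
            - math.lgamma(qi))

def integrate_density(mu, sigma, Q, n=20001):
    """Trapezoid-rule check, on the log scale, that the density integrates to 1."""
    y = np.linspace(mu - 20.0 * sigma, mu + 20.0 * sigma, n)
    f = np.exp(gengamma_logpdf(np.exp(y), mu, sigma, Q) + y)  # density of y = log(x)
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(y)))
```

Calling `integrate_density(0.0, 0.3, 0.5)` is a quick sanity check that a given (mu, sigma, Q) combination yields a proper density under this assumed form.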

I forked the repo and have added a prototype on this branch and it appears to work well on the few models I've tested. It also does not appear to break backwards compatibility because the new data are only read in with the new likelihood option.

The point of this issue is to gauge the SS3 team's interest in incorporating this feature into a future release of SS3. If so, how should I proceed with development (git workflow, timing, testing, etc.)?

I think the only other main component to add is a simulator for the bootstrapper, and perhaps some reporting? I'll need help with the latter. I also need to further comment the code and update the documentation.

Describe alternatives you have considered

None so far

Statistical validity, if applicable

In many (most?) cases there is no statistical justification for assuming a lognormal index resulting from a design-based or model-based estimator. This is because sums of positive random variables are not necessarily lognormal, even if each one is lognormal. Thus a more flexible distribution is needed, one that can more accurately convey the information in the survey biomass indices.
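The claim about sums can be checked with a small simulation. The sketch below is only an illustration under arbitrary assumptions (independent, identically distributed terms; the values of `n_terms`, `sigma`, and the seed are made up, and the function names are hypothetical); it is not a model of any actual survey estimator. If the sum were lognormal, its log would be normal and have zero skewness, so a clearly nonzero skewness of the log-sum indicates a departure from lognormality.

```python
import numpy as np

def skewness(x):
    """Sample skewness: mean of the cubed standardized values."""
    z = (x - x.mean()) / x.std()
    return float(np.mean(z**3))

def log_sum_skewness(n_terms=10, sigma=1.5, n_draws=200_000, seed=1):
    """Skewness of log(X) for one lognormal term and for a sum of n_terms terms."""
    rng = np.random.default_rng(seed)
    draws = rng.lognormal(mean=0.0, sigma=sigma, size=(n_draws, n_terms))
    single = skewness(np.log(draws[:, 0]))        # exactly normal on the log scale
    total = skewness(np.log(draws.sum(axis=1)))   # log of the sum of lognormals
    return single, total

single, total = log_sum_skewness()
print(f"skewness of log(single term): {single:.3f}")
print(f"skewness of log(sum):         {total:.3f}")
```

The single term serves as a control: its log-scale skewness should be near zero, while the log of the sum should be right-skewed.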

Describe if this is needed for a management application

No response

Additional context

No response

@Rick-Methot-NOAA Rick-Methot-NOAA added wishlist request new feature; bigger than revision; OK to remove after adding to Milestone statistics related to logL labels Aug 18, 2023
@Rick-Methot-NOAA
Collaborator

Good suggestion, Cole. I've long been interested in seeing more use of fat-tailed distributions and introduced the t-distribution for survey logL. It hasn't seen much use, but with your interest in gamma we might be able to spur interest in both. We are planning a new release shortly, and the next would not be until early 2024. Let's move your branch into our org so evaluation will be easier.

@e-perl-NOAA
Collaborator

I have merged @Cole-Monnahan-NOAA's branch onto a new branch in the stock-synthesis repo.

@Cole-Monnahan-NOAA
Author

OK, I'll continue to develop on my fork and open a PR when I think it's ready for a more thorough review. Any tips on testing locally before doing that? Thanks!

@Rick-Methot-NOAA
Collaborator

You should be able to use any test model that has an index time series. You can grab one from our test_model repo or use your own. We have a GitHub Action ready to run all our test models when we do a PR. The PR will not test your feature, but will let us know if you broke something :). What we will want is a small demo showing that the new feature works and some text for the User Manual; it would be great if you could also show what a useful range of model parameters would be.

Rick

@e-perl-NOAA
Collaborator

@Cole-Monnahan-NOAA, another option for testing "locally", so to speak, is to use GitHub Codespaces, which gives you a Linux machine, and to run some of the tests that we use with GitHub Actions (see GitHub Actions workflows here) on that codespace. Aside from the GitHub Actions steps that are pre-built and available (such as actions/checkout@v3, r-lib/actions/setup-r@v2, etc.), the rest should be runnable in the Linux terminal or from R. I have a codespace template that already has R loaded on it here if you would like to copy the devcontainer.json file to your own codespace.

@Cole-Monnahan-NOAA
Author

@Rick-Methot-NOAA This reminded me to push my latest changes, which is good. I got it working for biomass and numerical indices on real examples at the AFSC. One tricky bit is that the gengamma seems not to like giant values (it is numerically unstable), so I scaled the expected biomass by 1e-6 (here); likewise, the user has to input the mean in millions of metric tons instead of tons.

We will need to think about how to deal with this on the user end.

I'm hoping to return to this project in about a month; once the paper is in internal review I will devote some time to SS3 development.

@e-perl-NOAA
Collaborator

@Cole-Monnahan-NOAA I have synced the changes that you pushed to your forked repo branch to this branch in the stock-synthesis repo, which will run the build GitHub Action.

@Rick-Methot-NOAA Rick-Methot-NOAA modified the milestones: 3.30.23, 3.30.24 Aug 9, 2024