Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JP-3581: badpix_selfcal step #8500

Merged
merged 21 commits into from
Jun 14, 2024
Merged

JP-3581: badpix_selfcal step #8500

merged 21 commits into from
Jun 14, 2024

Conversation

emolter
Copy link
Collaborator

@emolter emolter commented May 21, 2024

Resolves JP-3581

Closes #8369

This PR adds a new optional step to the spec2pipeline that applies a median filter in the spectral direction of IFU data to flag warm and cool pixels.

Checklist for PR authors (skip items if you don't have permissions or they are not applicable)

  • added entry in CHANGES.rst within the relevant release section
  • updated or added relevant tests
  • updated relevant documentation
  • added relevant milestone
  • added relevant label(s)
  • ran regression tests, post a link to the Jenkins job below.
    How to run regression tests on a PR
  • All comments are resolved
  • Make sure the JIRA ticket is resolved properly

@emolter
Copy link
Collaborator Author

emolter commented May 21, 2024

@drlaw1558 @jemorrison This first draft of the step implements median filtering in the spectral direction to find bad pixels, then flags them as "WARM". If an association is passed in, as is done in the regression test, then the flagging will occur on the pixelwise MIN of the background exposures. Flagging gets applied to both the science and background exposures, which are saved into separate .fits files if the save_flagged_bkgd parameter is set True. If a single _rate file is passed, then this will operate on that rate file, i.e., true self-calibration. A user could also choose to run the .process() method with a list of background filenames, which might be easier in some cases than writing your own association.

Let me know if the current behavior of the code is what is desired, and what needs to be changed.

@melanieclarke suggested to use the "other bad pixel" flag instead of simply "do not use"; should I go ahead and do that? edit: currently I set the pixels to "warm" although this step should catch other types of bad pixels, too. Still not sure what the appropriate thing is.

I ran into some issues with file I/O that I am unsure of how to resolve yet. In short, if I pass an association into calwebb_spec2 (as opposed to just this step), then assign_wcs will pass an IFUImageModel into this step instead of a ModelContainer, meaning that the step never knows about the background exposures and reverts to operating on the science data alone. If I run the step on an association, it's not obvious to me how to format the output such that it's easily usable downstream (filenames to use, how to save background exposures, how to update the association if at all).
edit: got some clarity about this at stand-up; current version should be better.

Still marked as draft until those issues can be resolved, and I'd appreciate any suggestions you have.

I'm working on adding a docs page now.

@emolter emolter requested a review from melanieclarke May 21, 2024 20:13
Copy link

codecov bot commented May 21, 2024

Codecov Report

Attention: Patch coverage is 88.07339% with 13 lines in your changes missing coverage. Please review.

Project coverage is 58.63%. Comparing base (93574c3) to head (88f5754).
Report is 332 commits behind head on master.

Files with missing lines Patch % Lines
jwst/badpix_selfcal/badpix_selfcal_step.py 85.50% 10 Missing ⚠️
jwst/pipeline/calwebb_spec2.py 66.66% 2 Missing ⚠️
jwst/badpix_selfcal/badpix_selfcal.py 96.77% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #8500      +/-   ##
==========================================
+ Coverage   58.61%   58.63%   +0.02%     
==========================================
  Files         391      391              
  Lines       39066    39081      +15     
==========================================
+ Hits        22897    22914      +17     
+ Misses      16169    16167       -2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@emolter
Copy link
Collaborator Author

emolter commented May 22, 2024

Regression tests started here. Marking this ready for review now.

@emolter emolter marked this pull request as ready for review May 22, 2024 20:23
@emolter emolter requested a review from a team as a code owner May 22, 2024 20:23
@drlaw1558
Copy link
Collaborator

Thanks @emolter ; that was super fast!

I think there would probably need to be some changes on the association-generator front to ensure that this step is getting passed an asn that includes all of the files that would be helpful. If the pipeline notebooks that we're providing people with are modified to create them properly that would work, but it would be useful if the default spec2 association files downloaded from MAST also had the right information for this to work. I.e., include all science, background, leakcal, etc files (probably not target acq though as those may be much shallower, noisier, and generally not like the science data) associated with a given observation in the spec2 asn file with exptype='selfcal'. At the moment these (e.g., for MIRI MRS) often contain just a single 'science' member and nothing else. That exptype should be ignored by all other spec2 steps, but then this self-calibration step can simply use all members of the asn list that aren't target_acq for use in the algorithm.

I don't think we want to explicitly apply the flagging to both science and background exposures here; at least not from a single asn file. E.g., for PID 1523, Observation 1 is the science data, and Observation 2 the background. The association files generated for Observation 2 are currently just a single background exposure, which has exptype=Science (since it's processed as science data in spec2 and gets used as a background in spec3). As such, I can imagine file collisions if re-flagged data for the Obs2 exposures is written out both during Obs 1 and Obs 2 processing. This should generally be the case I think when we care about applying the flagging to the background data too- when they get processed as Science data in spec2 so that they can be used as backgrounds in spec3.

We probably don't even need to explicitly ask the code to use only 'background' exposures if those exist, which would be simpler for the code. By definition, the MIN algorithm should be able to handle being passed both science and background together as background should be fainter than source observations. (If not, it suggests the backgrounds are bad and shouldn't be used alone anyway). Essentially, things marked as exptype=Background aren't special in any way; flagging should just get applied to whatever the exposure is with exptype=Science in the lvl2 asn file, using everything in the asn file to generate the mask. Does that make sense?

Setting other_bad_pixel in addition to DO_NOT_USE sounds fine to me. It's a catch-all flag, and this is a pretty catch-all algorithm.

@emolter
Copy link
Collaborator Author

emolter commented May 23, 2024

Thanks for the detailed review @drlaw1558 ! Some quick responses point-by-point:

  • Setting other_bad_pixel - sounds good
  • Using all exposures in the asn to compute the min - sounds good
  • Not flagging exposures marked as background - This was originally @jemorrison's suggestion. I don't initially see how, as written, the file conflict you bring up would happen, because the spec2pipeline does not support marking more than one file in an association as science. In the present code version, only a single output file is written representing the science exposure unless the optional save_flagged_bkgd parameter is True, and in that case a different file extension should be used (although I need to check this). When the whole spec2pipeline is run, these flagged backgrounds are passed as a list into the background step.
  • Updates to association rules: it sounds like you want an entirely new classification of exposures for the association jus for this step. That sounds good in principle, although I wonder if adding all these extra files by default would confuse users. Perhaps it should be optional for association generation too, and turned off by default?
  • PID 1523 - can you point me to an association file or files where one science exposure is used as a background for the other? I don't see any association files in the Box folder you linked to the ticket, and the one I was able to download for 1523 from MAST just has a single exposure in it

@melanieclarke
Copy link
Collaborator

I'm also wondering about the effects of including all the files in the default association files in MAST. I think the pipeline complains if a file is in the association but isn't on disk, even if it isn't used in the pipeline -- this already happens for TA files in spec3. If everything that could be used for selfcal was listed, that might make it more difficult to use MAST association files for users not interested in doing selfcal. I agree that some user support and documentation would be helpful though, because association files are not necessarily trivial to create.

Copy link
Collaborator

@melanieclarke melanieclarke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some initial comments/suggestions here. I can test with real data and review again when things are settled.

docs/jwst/badpix_selfcal/description.rst Outdated Show resolved Hide resolved
jwst/badpix_selfcal/badpix_selfcal_step.py Outdated Show resolved Hide resolved
jwst/badpix_selfcal/badpix_selfcal.py Show resolved Hide resolved
jwst/badpix_selfcal/badpix_selfcal.py Outdated Show resolved Hide resolved
jwst/badpix_selfcal/badpix_selfcal_step.py Outdated Show resolved Hide resolved
jwst/pipeline/badpix_selfcal.cfg Outdated Show resolved Hide resolved
jwst/pipeline/calwebb_spec2.py Outdated Show resolved Hide resolved
jwst/pipeline/calwebb_spec2.py Outdated Show resolved Hide resolved
@drlaw1558
Copy link
Collaborator

drlaw1558 commented May 24, 2024

As a slightly more normal science example (PID 1523 is a somewhat unusual mosaic), see PID 1328 Observations 15 and 16 (asns now in the Box folder). Observation 15 is the science exposures on a galaxy target, and Observation 16 is the dedicated background exposures. Both Observation 15 and Observation 16 get processed through spec2 as 'science' type data, and it's only in spec3 that Observation 16 is treated as a background for purposes of the master_background step. As such, having Observation 15 write out flagged backgrounds is unnecessary- it should happen when Observation 16 is processed.

asns are certainly the trick though; I agree changing everything to add a bunch more entries to an association is potentially risky. At the same time, not doing so makes it harder for users to take advantage of the step...

Maybe for the time being we should take the safe route; rely on users creating a spec2 association that can take advantage of the code. I can certainly add that into the MRS notebook that we provide to the community, and add some documentation to JDox about how to do it more generally. If the selfcal step is passed an association with just one member (or simply passed a file instead of an association), it should print an error message and skip the step. There are a lot of failure modes using just a single exposure (emission lines in particular), and I can see users trying to turn it on without setting it up properly getting caught by that. We could have some parameter that says "I know what I'm doing, really really run on just a single input image" to enable it.

@emolter
Copy link
Collaborator Author

emolter commented May 24, 2024

@drlaw1558 all of that sounds good. One idea for the notebook is to show an example where the .process() method accepts a simple list of filenames or ImageModels as backgrounds, bypassing the need to create an association. My sense is that this may be easier for users.

Copy link
Collaborator

@jemorrison jemorrison left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have much to add on the code changes. Melanie did a thorough job on that front. Personally I would hold off on association rule changes for now. That could have many unintended consequences that could led to errors. If you can provide documentation and maybe a notebook to users on how to construct an association to run this option step that might be a first step. Then after we get feed back from folks we can see if the community would want this easier to run and think then about an association rules change (a general association rules change to make this step run frankly scares me a bit). When I was using David's notebook to flag warm pixels I constructed by own associations containing the backgrounds. It is fairly easy to do.

@emolter
Copy link
Collaborator Author

emolter commented May 27, 2024

I just pushed a new version that attempts to respond to all the comments. The changes are:

  • set DQ flag to "DO_NOT_USE" + "OTHER_BAD_PIXEL"
  • removed option to save flagged background exposures to file
  • added warning and step skip if a single science exposure is passed in, added force_single parameter to override this
  • step now uses science and background exposures identically when creating the MIN exposure
  • added mention of the step to calwebb_spec2 pipeline docs

Some additional things that I feel are still up in the air:

  • I retained the behavior that background exposures are flagged when passed through the entire calwebb_spec2 pipeline; the effect of this is that the background_sub step would see background exposures with the flags applied. It seemed to me like that is a good thing; however, if the notebook examples we create are re-using exptype="background" for e.g. leakcal files, then that might cause problems
  • That notebook still needs to be created. IMO it might be best for either @melanieclarke or @drlaw1558 to make a draft of this, or at least an example of what you would like the input user-created asn files to look like; then I can test that through both the individual step and the whole calwebb_spec2 pipeline to ensure that it behaves as intended.
  • A regression test is still needed. I'm holding off on this until everything else is finalized, simply because it's tedious to continually modify the inputs and truth files that are uploaded to the artifactory.

Copy link
Collaborator

@braingram braingram left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Just one more comment about the background list files in the regtest.

jwst/regtest/test_miri_mrs_badpix_selfcal.py Outdated Show resolved Hide resolved
@emolter
Copy link
Collaborator Author

emolter commented Jun 13, 2024

started a new regtest run here. Hopefully I got all the artifactory uploads in place.

@emolter
Copy link
Collaborator Author

emolter commented Jun 13, 2024

new regtest run with all suggested changes is here

Copy link
Collaborator

@braingram braingram left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (pending regression test results)!

@emolter
Copy link
Collaborator Author

emolter commented Jun 14, 2024

Latest regtest result is clean

jwst/badpix_selfcal/badpix_selfcal_step.py Outdated Show resolved Hide resolved
jwst/badpix_selfcal/badpix_selfcal.py Outdated Show resolved Hide resolved
jwst/badpix_selfcal/badpix_selfcal.py Outdated Show resolved Hide resolved
jwst/badpix_selfcal/badpix_selfcal.py Outdated Show resolved Hide resolved
jwst/badpix_selfcal/badpix_selfcal.py Outdated Show resolved Hide resolved
jwst/badpix_selfcal/badpix_selfcal_step.py Outdated Show resolved Hide resolved
jwst/badpix_selfcal/badpix_selfcal_step.py Show resolved Hide resolved
jwst/pipeline/calwebb_spec2.py Outdated Show resolved Hide resolved
jwst/pipeline/calwebb_spec2.py Show resolved Hide resolved
CHANGES.rst Show resolved Hide resolved
Copy link
Collaborator

@hbushouse hbushouse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks good; RTD doc pages look good; regtest results look good. I approve!

@hbushouse
Copy link
Collaborator

@emolter Feel free to merge whenever you want.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants