I got a request from FXE: the JUNGFRAU calibration pipeline had failed (because of issues with GPFS, outside our control), and so when they went to access data with open_run(..., data='all'), the JUNGFRAU data it offered them was from raw instead of from proc. They specifically wanted the corrected data, so it would have been clearer to say that no JF data was found.
The immediate issue is that EXtra-data doesn't know which sources are meant to be in proc, so any source that isn't found in proc is exposed from raw. One obvious way round this is to let users specify that sources are expected to be in proc - e.g. proc_only='*/DET/JNGFR*'. But this is a clumsy workaround - things mostly work without it, so people won't bother setting it until they hit the problem, and it's easy to exclude stuff you might want (e.g. a .../DET/JNGFRCTRL source which is not meant to go in proc).
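To make the trade-off concrete, here is a minimal sketch of how a proc_only glob could behave. It is not EXtra-data code: merge_sources is a hypothetical helper, and the HYPO/... source names are made up for illustration. It only assumes shell-style glob matching, as the pattern above suggests.

```python
from fnmatch import fnmatchcase

def merge_sources(raw, proc, proc_only=()):
    """Map each source name to 'proc' or 'raw' (hypothetical helper).

    Sources found in proc always win. A raw source is used as a fallback
    unless it matches one of the proc_only glob patterns.
    """
    merged = {src: 'proc' for src in proc}
    for src in raw:
        if src in merged:
            continue
        if any(fnmatchcase(src, pat) for pat in proc_only):
            continue  # expected in proc, so don't fall back to raw
        merged[src] = 'raw'
    return merged

# Made-up source names; proc is empty because calibration failed.
raw = {'HYPO/DET/JNGFR01', 'HYPO/DET/JNGFRCTRL', 'HYPO/XGM/DOOCS'}
proc = set()

default = merge_sources(raw, proc)
strict = merge_sources(raw, proc, proc_only=['*/DET/JNGFR*'])
```

With these inputs, default silently offers the JUNGFRAU data from raw, while strict hides it. But strict also hides the JNGFRCTRL source, which was never meant to be in proc, which is the over-exclusion problem described above.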
In the longer term, I want correction to write data with a new source name (e.g. .../CORR/JNGFR01), so you can clearly refer to raw/corrected data as separate things. But this is going to be a big change in offline correction. We might want to offer something in EXtra-data before that.
If we decide what the source names for corrected data will look like, we might try to 'rename' them in EXtra-data before the change in the files. This may get fiddly and confusing, though - so far, EXtra-data has always reflected what's in the files.
We could add run.raw and run.proc (or .corr?) attributes, which would point to separate DataCollection objects for the raw and proc data, so you could use run.proc['.../DET/JNGFR01'] to ensure you were getting the proc data. It would be pretty simple to do this in a crude way, but if you wanted e.g. run.select() to affect the separate raw & proc data collections, it would get much more involved.
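A rough sketch of the crude version, using toy classes rather than the real DataCollection. ToyCollection, ToyRun and the HYPO/... source names are all hypothetical; only the .raw / .proc attribute idea comes from the proposal above.

```python
class ToyCollection:
    """Stand-in for a data collection: maps source names to data."""

    def __init__(self, sources):
        self.sources = dict(sources)

    def __getitem__(self, name):
        try:
            return self.sources[name]
        except KeyError:
            raise KeyError(f"No source {name!r} in this collection") from None

    def select(self, names):
        # Returns a plain ToyCollection, so .raw and .proc are not carried over
        return ToyCollection(
            {n: self.sources[n] for n in names if n in self.sources}
        )


class ToyRun(ToyCollection):
    """Combined view (proc takes priority) with .raw and .proc attached."""

    def __init__(self, raw, proc):
        super().__init__({**raw.sources, **proc.sources})
        self.raw = raw
        self.proc = proc


# Made-up source names; proc has no JUNGFRAU data because correction failed.
raw = ToyCollection({'HYPO/DET/JNGFR01': 'raw data', 'HYPO/XGM/DOOCS': 'xgm'})
proc = ToyCollection({})
run = ToyRun(raw, proc)

try:
    run.proc['HYPO/DET/JNGFR01']
    proc_lookup_failed = False
except KeyError:
    proc_lookup_failed = True  # clear failure instead of silently using raw

selected = run.select(['HYPO/XGM/DOOCS'])
```

In this sketch, run['HYPO/DET/JNGFR01'] still falls back to raw, but run.proc[...] fails loudly. The result of select() has no .raw or .proc at all, which illustrates why making selections carry through to the separate collections is the more involved part.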
We might also add a new high-level function like open_run that defaults to opening raw & proc together - it's hard to change open_run without breaking existing code.