Bulk Rest-API does not stage files with broken disk locations #7607

Open
christianvoss opened this issue Jun 24, 2024 · 2 comments

@christianvoss

Hi,

We've been observing some curious behaviour with the bulk staging service. It appears that the bulk service does not trigger a stage if dCache knows of a disk location for the file, even when the pool holding that location is offline. We noticed this recently when a storage node had to be taken out of production for a week and we wanted to stage back some files needed by our users.

I've also reproduced this with the latest 9.2 dCache release, 9.2.21. This is what we see when we try to stage a NEARLINE file:

{
  "nextId": -1,
  "uid": "f9b987ee-02b5-4ba8-a334-df4b24ed4b6a",
  "arrivedAt": 1719238168621,
  "startedAt": 1719238168720,
  "lastModified": 1719238168753,
  "status": "COMPLETED",
  "targetPrefix": "/",
  "targets": [
    {
      "target": "/pnfs/desy.de/exfel/archive/XFEL/raw/FXE/201802/p002271/r0081/RAW-R0081-LPD09-S00003.h5",
      "state": "SKIPPED",
      "submittedAt": 1719238168635,
      "startedAt": 1719238168635,
      "finishedAt": 1719238168750,
      "id": 242049
    }
  ]
}

The operation is always skipped, even though dCache correctly reports the file as NEARLINE: "fileLocality": "NEARLINE".
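Because the request as a whole still ends up as COMPLETED, a skipped target is easy to miss. As a minimal sketch for checking a response like the one shown above, the following Python snippet lists every target whose state is SKIPPED. The helper name `skipped_targets` is hypothetical, and the snippet only assumes the JSON layout visible in the response above (trimmed here to the relevant fields); it does not talk to dCache itself.

```python
import json


def skipped_targets(bulk_response: str) -> list[str]:
    """Return the paths of all targets reported as SKIPPED.

    Hypothetical helper: it relies only on the "targets", "target" and
    "state" keys seen in the bulk request response above.
    """
    request = json.loads(bulk_response)
    return [
        entry["target"]
        for entry in request.get("targets", [])
        if entry.get("state") == "SKIPPED"
    ]


# Trimmed copy of the response from this report.
response = """
{
  "uid": "f9b987ee-02b5-4ba8-a334-df4b24ed4b6a",
  "status": "COMPLETED",
  "targetPrefix": "/",
  "targets": [
    {
      "target": "/pnfs/desy.de/exfel/archive/XFEL/raw/FXE/201802/p002271/r0081/RAW-R0081-LPD09-S00003.h5",
      "state": "SKIPPED",
      "id": 242049
    }
  ]
}
"""

for path in skipped_targets(response):
    print("skipped:", path)
```

Checking the per-target state rather than the top-level "status" is the point here: the request above is COMPLETED although nothing was staged.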

In contrast, staging via SRM triggers a restore from tape immediately:

[vossc@naf-it01] [dev/vossc/no-macaroon-voms-directly] pnfs_qos_api $ srm-bring-online -lifetime=864000 srm://dcache-door-xfel01.desy.de:8443/pnfs/desy.de/exfel/archive/XFEL/raw/FXE/201802/p002271/r0081/RAW-R0081-LPD09-S00003.h5

[dcache-head-xfel02] (local) vossc > \sn pnfsidof /pnfs/desy.de/exfel/archive/XFEL/raw/FXE/201802/p002271/r0081/RAW-R0081-LPD09-S00003.h5
00005283EB13A8A943E9938C32E0BFFF47FC

[dcache-head-xfel02] (local) vossc > \sp rc ls
00005283EB13A8A943E9938C32E0BFFF47FC@world-net-/ m=1 r=0 [dcache-xfel499-01] [Waiting for stage: dcache-xfel499-01 06.24 16:10:40] {0,}

[dcache-head-xfel02] (local) vossc > \s dcache-xfel499-01 rh ls
a928e3c3-6454-4151-b186-0f3ab7b93757 ACTIVE Mon Jun 24 16:10:40 CEST 2024 Mon Jun 24 16:10:40 CEST 2024 00005283EB13A8A943E9938C32E0BFFF47FC xfel:FXE-2018

Is it possible for bulk to behave like SRM did in the past, or would the procedure be to 'disable' the location in chimera before triggering the stage?

Thanks a lot,
Christian

@DmitryLitvintsev
Member

Yes. Bulk relies purely on the location information in chimera.
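To make the difference concrete, here is a toy Python model of the two decisions being discussed. It is not dCache code: both function names and the pool names are hypothetical, and the second function only illustrates the behaviour the report asks for, on the assumption that the set of online pools is known.

```python
def needs_stage_namespace_only(disk_locations: set[str]) -> bool:
    """Toy model of the behaviour described in this issue.

    A stage is requested only when the namespace records no disk
    location at all; whether those pools are reachable is ignored.
    """
    return len(disk_locations) == 0


def needs_stage_availability_aware(
    disk_locations: set[str], online_pools: set[str]
) -> bool:
    """Toy model of the requested behaviour.

    A stage is requested when none of the recorded disk locations
    sits on a pool that is currently online.
    """
    return disk_locations.isdisjoint(online_pools)


# Hypothetical scenario: the only disk copy is on a pool that is offline.
recorded = {"pool-a"}
online = {"pool-b", "pool-c"}

print(needs_stage_namespace_only(recorded))              # location known, so no stage
print(needs_stage_availability_aware(recorded, online))  # copy unreachable, so stage
```

In this scenario the namespace-only check decides that nothing needs to be done, which would match the SKIPPED state seen above, while the availability-aware check asks for a restore from tape.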

@christianvoss
Author

Hi Dmitry,

thanks a lot for the clarification. I guess it is supposed to stage from tape even when a pool is offline?

Thanks a lot,
Christian
