net: remember the name of the lock chain (nftables) #2550
base: criu-dev
Conversation
Force-pushed from c483710 to 0305093.
Hi Adrian! Please, can you tell how and in which circumstances you've caught this issue? As far as I understand, the idea of your fix is to ensure that we keep the nftables table name in the inventory image file instead of dynamically recalculating it on restore (using the real PID). A first question I have after going through this is: "How has this worked before?"
P.S. I'll take a closer look into this. I haven't yet spent enough time to fully understand what's going on there.
It probably never did. We are not running all the tests on a system without iptables, where the nftables locking backend is used. Only two to four tests are running with the nftables backend.
I am trying to switch the default locking backend in Fedora and CentOS >= 10 to nftables from iptables because iptables is no longer installed by default.
Yes. The table name makes sense if the locking and unlocking happen in the same CRIU run, but between CRIU runs it does not work with the existing approach.
Ah, thanks for the clarification! I wonder if we can do something like this:
Yes, it's not a forward-compatible change and will break restore of images which were dumped with an older CRIU. In this form it only works for experimental purposes (and we would have to check for images dumped without the new field).
My idea is that instead of introducing a new field, we could reuse an identifier that is already stored in the image, such as the pid namespace id.
@mihalicyn I am happy to use whatever makes most sense. What is the identifier you have in mind?
I don't think we have to worry about this. Currently it doesn't work at all. Let me know which ID makes most sense and I can rework this PR. I think the important part is that it has to come from some value in the checkpoint image and not be generated during restore.
@mihalicyn I think I understood your proposal now. The PR could be really simple as pid_ns_id is already in the image. Let me try it out.
With this line it also passes all the zdtm test cases (besides a couple of tests which call iptables, which I did not install) if I switch to the nftables locking backend:
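(The exact line is not preserved in this thread; purely as a hypothetical illustration of the kind of change being discussed, with every identifier below being an assumption rather than actual CRIU code, it could look roughly like this:)

```c
/* Hypothetical sketch only, not the line referenced above: build the
 * nftables lock table name from an id that is already stored in the
 * checkpoint image (the pid namespace id) instead of the real PID. */
#include <stdio.h>

static void nft_lock_table_name(char *buf, size_t len, unsigned int pid_ns_id)
{
	snprintf(buf, len, "CRIU-%u", pid_ns_id);
}
```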
That brings it down to a one-line change. Very good idea @mihalicyn. Thanks. How long can the pid_ns_id be? Currently the variable holding the table name has a fixed size.
@mihalicyn Tests are happy, but the value of pid_ns_id is not really unique. So that is not really a good idea, I think.
Hey Adrian,
Yes, precisely.
We don't, as we already have it in the image anyway.
Are we 100% sure that it doesn't work and never worked in any circumstances?
Hmm, it's not the pid namespace inode number.
That's my bad, actually; to get the pid namespace inode number you need something like:
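(The original snippet is not shown; a minimal sketch of one way to read the pid namespace inode number from procfs, with the helper name and error handling being assumptions:)

```c
/* Sketch, assuming procfs is mounted: the inode number of
 * /proc/<pid>/ns/pid identifies the task's pid namespace. */
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

static int get_pid_ns_ino(pid_t pid, unsigned long *ino)
{
	struct stat st;
	char path[64];

	snprintf(path, sizeof(path), "/proc/%d/ns/pid", pid);
	if (stat(path, &st) < 0)
		return -1;

	*ino = st.st_ino;
	return 0;
}
```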
But yes, I don't think that even with this change having pid_ns_id would be enough; I think we still need to add a new field to the inventory image.
Also, we have criu_run_id.
Ah, okay. So let's use the criu_run_id then.
I don't know. All tests with open TCP connections just hang during restore because the network locking cannot be disabled. According to zdtm it is so broken that it doesn't currently work.
As an additional field in the nft table name? Or instead of the PID?
Would it be possible to add a CI workflow or modify an existing one to run all tests with the nftables backend?
Using libnftables the chain to lock the network is composed of ("CRIU-%d", real_pid). This leads to around 40 zdtm tests failing with errors like this:

Error: No such file or directory; did you mean table 'CRIU-62' in family inet?
delete table inet CRIU-86

The reason is that as soon as a process is running in a namespace the real PID can be anything and only the PID in the namespace is restored correctly. Relying on the real PID does not work for the chain name.

Using the PID of the innermost namespace would lead to the chain being called 'CRIU-1' most of the time, which is also not really unique.

With this commit the chain is now named using the already existing CRIU run ID. To be able to correctly restore the process and delete the locking table, the CRIU run ID from checkpointing is now stored in the inventory as dump_criu_run_id.

Signed-off-by: Adrian Reber <[email protected]>
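(A hedged sketch of the naming scheme described above; dump_criu_run_id follows the commit message, while the function names and the exact type of the run ID are assumptions, not the actual patch:)

```c
#include <inttypes.h>
#include <stdio.h>

/* Old scheme: the table name is derived from the real PID, which is
 * generally different between dump and restore once pid namespaces
 * are involved, so restore cannot find the table to delete it. */
static void table_name_from_pid(char *buf, size_t len, int real_pid)
{
	snprintf(buf, len, "CRIU-%d", real_pid);
}

/* New scheme as described in the commit message: the name is derived
 * from the CRIU run ID recorded at dump time (dump_criu_run_id in the
 * inventory image), so restore can rebuild exactly the same name. */
static void table_name_from_run_id(char *buf, size_t len, uint64_t dump_criu_run_id)
{
	snprintf(buf, len, "CRIU-%" PRIu64, dump_criu_run_id);
}
```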
Force-pushed from 0305093 to 30e76fd.
@mihalicyn What do you think about the latest version? This works in my tests just as well as the previous version. Now using criu_run_id as suggested.
Using libnftables the chain to lock the network is composed of ("CRIU-%d", real_pid). This leads to around 40 zdtm tests failing with errors like this:

Error: No such file or directory; did you mean table 'CRIU-62' in family inet?
delete table inet CRIU-86
The reason is that as soon as a process is running in a namespace the real PID can be anything and only the PID in the namespace is restored correctly. Relying on the real PID does not work for the chain name.
Using the PID of the innermost namespace would lead to the chain being called 'CRIU-1' most of the time, which is also not really unique.
The uniqueness of the name was always problematic. With this change, all tests which rely on network locking work again when the nftables backend is used.