Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

hydra-queue-runner: fix building on machines specified with the ssh-ng protocol #918

Closed
wants to merge 1 commit into from

Conversation

Ma27
Copy link
Member

@Ma27 Ma27 commented Apr 17, 2021

Most of my machines in /etc/nix/machines are specified as
ssh-ng://foobar where foobar is defined in e.g. ~/.ssh/config. The
problem here is that while it's possible to use the remote Nix
installation via SSH, hydra-queue-runner refuses to do so since ssh(1)
can't resolve hostnames that begin with ssh-ng://.

To work around this problem, this prefix will be drop if it exists on a
machine declaration that was chosen for a build step.

cc @grahamc @edolstra

…ng` protocol

Most of my machines in `/etc/nix/machines` are specified as
`ssh-ng://foobar` where `foobar` is defined in e.g. `~/.ssh/config`. The
problem here is that while it's possible to use the remote Nix
installation via SSH, `hydra-queue-runner` refuses to do so since `ssh(1)`
can't resolve hostnames that begin with `ssh-ng://`.

To work around this problem, this prefix will be drop if it exists on a
machine declaration that was chosen for a build step.
@grahamc
Copy link
Member

grahamc commented Apr 19, 2021

It seems like we would be better served by using Nix's own store abstractions instead of running our own ssh incantation. I don't suppose there is a way we could test this behavior?

@Ma27
Copy link
Member Author

Ma27 commented Apr 19, 2021

I don't suppose there is a way we could test this behavior?

Well, I have an entry like this in my /etc/nix/machines:

ssh-ng://builder-ng x86_64-linux - 10 4 big-parallel,kvm,nixos-test

With this patch on a Hydra, hydra-queue-runner happily picks up this server to build derivations :)

It seems like we would be better served by using Nix's own store abstractions instead of running our own ssh incantation

Not sure if I agree here. First of all, this nix-store --serve via ssh approach is also used by e.g. nix.sshServe (IIRC). Also, to me this currently sounds like yet another thing to catch up with (rather rapid) changes in nixUnstable.

@knedlsepp
Copy link
Member

Fixes: #688

@edolstra
Copy link
Member

This PR just makes hydra-queue-runner treat ssh-ng:// as ssh://, which IMHO is not all that useful and probably isn't what users expect.

Currently hydra-queue-runner uses the nix-store --serve protocol, so it can only use ssh://. I have a branch somewhere that makes it use the Store abstraction. However, we can't use it because of some build log issues (fixable) and lack of build step cancellation (not fixable without some major changes to the Store API - currently build steps are cancelled by sending a signal to the corresponding SSH process).

@edolstra edolstra closed this Apr 21, 2021
@Ma27 Ma27 deleted the build-remote-ssh-ng branch April 21, 2021 08:04
@Ma27
Copy link
Member Author

Ma27 commented Apr 21, 2021

Hmm yeah, so I know the difference between ssh-ng and ssh, however this was supposed to be a cheap fix for existing /etc/nix/machines where I use ssh-ng by default for good reasons.

But if you've got already a draft for using Nix's store abstraction in Hydra (and hence allow both protocols), that's way better, so 👍 for closing it then :)

@edolstra
Copy link
Member

To be clear, the code that I wrote isn't usable because it doesn't support cancellation.

@ajs124
Copy link
Member

ajs124 commented Apr 21, 2021

The current code supports cancellation? I'm not sure that's ever worked for me.

@grahamc
Copy link
Member

grahamc commented Apr 21, 2021

It does support cancellation, and it often does work :)

@dasJ
Copy link
Member

dasJ commented Apr 21, 2021

I thought the exact log message was "maybe cancelling build" and it finishes the current step and aborts the rest?

@Ericson2314
Copy link
Member

Ericson2314 commented Apr 21, 2021

To be clear, the code that I wrote isn't usable because it doesn't support cancellation.

Where is that branch again? I've been thinking if some of legacy ssh store and remote store methods were split up to separate the parts hydra needs to do separately, it could work. That split would be of no use to Nix itself, but I think the separation could be worth it for hydra's sake alone.

@edolstra
Copy link
Member

Looks like I never committed anything, but I recovered this from my stash: https://pastebin.com/suP2ZX2D

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants