
Add an HTTP download fallback mechanism to avoid proxy authentication failures #4072

Open
JasonGantner opened this issue Nov 8, 2024 · 4 comments
Labels
enhancement New feature or request

Comments


Is your feature request related to a problem? Please describe.

Go's HTTP proxy support only implements Basic authentication, so proxies that restrict authentication to Digest, NTLM, or SPNEGO (Kerberos) are unusable. In many cases, downgrading security is not an option.

Describe the solution you'd like

Ideally, adding support for the other schemes would be the simplest solution from a user standpoint.
However, it would mean pulling in an additional Go library for each scheme (e.g. jcmturner/gokrb5 for SPNEGO, Azure/go-ntlmssp for NTLM), which increases the maintenance workload and might even be impossible because of licensing issues.

Describe alternatives you've considered

The simplest workaround I have thought of would be to use cURL as a fallback, enabled through a configuration variable similar to the existing git/age fallbacks.

Additional context

Behaviour difference between chezmoi and curl, with the proxy correctly defined in environment variables:

~$ chezmoi apply --dry-run --verbose
chezmoi: Get "https://github.com/opentofu/opentofu/releases/download/v1.8.5/tofu_1.8.5_linux_amd64.tar.gz": authentication required

~$ curl -LJO https://github.com/opentofu/opentofu/releases/download/v1.8.5/tofu_1.8.5_linux_amd64.tar.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100 23.8M  100 23.8M    0     0  8348k      0  0:00:02  0:00:02 --:--:-- 10.7M
JasonGantner added the enhancement (New feature or request) label on Nov 8, 2024
twpayne (Owner) commented on Nov 8, 2024

This is an interesting idea but there are a number of issues:

  • chezmoi caches HTTP responses to avoid re-downloading the same URL when it has not changed. This is critically important for performance, as the contents of files retrieved by HTTP need to be inspected every time the user runs chezmoi apply, chezmoi diff, or chezmoi status. It's not clear how to achieve caching with an external curl binary.
  • chezmoi adds its own OAuth2 tokens when making requests to the GitHub API. These would also need to be passed to curl somehow, and only when needed.
  • chezmoi adds its own User-Agent header as a courtesy to webmasters.

Basically, the HTTP API that chezmoi needs is quite complex, and does not easily map onto replacing calls to Go's net/http library with invocations of curl.

Are there other workarounds you can use? For example:

  • Use a run_ script to download the archive with curl and unpack that?
  • Run a local proxy that transparently adds the authentication layer?

JasonGantner (Author) commented on Nov 8, 2024

> • chezmoi caches HTTP responses to avoid re-downloading the same URL when it has not changed. This is critically important for performance as the contents of files retrieved by HTTP need to be inspected every time the user runs chezmoi apply, chezmoi diff, chezmoi status. It's not clear how to achieve caching with an external curl binary.

Without having looked at the caching internals of chezmoi, I would guess $XDG_CACHE_HOME/chezmoi/ is a good place to store the downloaded files/responses.
To avoid re-downloading an unchanged file, curl currently offers two mechanisms (a rough sketch follows the list):

  • based on the modification date, with curl -z "$file" "$url" or curl -z "$datetime" "$url"
  • based on ETags, with curl --etag-compare "$etag_file" --etag-save "$etag_file" "$url"
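
For example, an ETag-based download step could look roughly like this (the cache layout and file names are made up for the example, not chezmoi's actual ones):

    # Rough sketch only: cache an archive and skip the transfer when the ETag still matches.
    # (On the first run the ETag file does not exist yet; recent curl versions accept that.)
    cache_dir="${XDG_CACHE_HOME:-$HOME/.cache}/chezmoi"
    mkdir -p "$cache_dir"
    url="https://github.com/opentofu/opentofu/releases/download/v1.8.5/tofu_1.8.5_linux_amd64.tar.gz"
    # Write to a temporary file so a "not modified" answer cannot clobber the cached copy.
    if curl -fsSL \
            --etag-compare "$cache_dir/tofu.etag" \
            --etag-save "$cache_dir/tofu.etag" \
            -o "$cache_dir/tofu.tar.gz.part" \
            "$url" \
        && [ -s "$cache_dir/tofu.tar.gz.part" ]; then
        mv "$cache_dir/tofu.tar.gz.part" "$cache_dir/tofu.tar.gz"
    else
        rm -f "$cache_dir/tofu.tar.gz.part"
    fi
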
> • chezmoi adds its own OAuth2 tokens when making requests to the GitHub API. These would also need to be passed to curl somehow, and only when needed.

I admit I had not thought beyond downloading external files.
Still, passing the token is just a matter of setting the right header with curl -H "Authorization: Bearer $token" "$url"; the logic for when to send it would have to be implemented outside the call to curl.

> • chezmoi adds its own User-Agent header as a courtesy to webmasters.

This one is the easiest to solve with curl --user-agent "$UA"
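
Putting the token and User-Agent points together, a (bash) wrapper could decide what to attach roughly like this, assuming $url, $output and $GITHUB_TOKEN are already set (the variable names, the User-Agent string, and the host check are purely illustrative):

    # Illustrative sketch only: build the curl arguments, adding the token only for GitHub.
    ua="chezmoi-curl-fallback/0.1"   # made-up User-Agent value
    args=(--fail --location --silent --show-error --user-agent "$ua")
    case "$url" in
        https://api.github.com/*|https://github.com/*)
            # only send the token to GitHub, never to arbitrary hosts
            args+=(--header "Authorization: Bearer $GITHUB_TOKEN")
            ;;
    esac
    curl "${args[@]}" --output "$output" "$url"
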

> Basically, the HTTP API that chezmoi needs is quite complex, and does not easily map onto replacing calls to Go's net/http library with invocations of curl.

The maintenance overhead of building each call to curl in addition to the Go implementation does seem hard to justify given the limited use case it would cover.

> Are there other workarounds you can use? For example:
>
> • Use a run_ script to download the archive with curl and unpack that?

That was my first thought, but at that point there is almost no benefit to using chezmoi to run the script. In my example above, chezmoi downloads, verifies the checksum, extracts and installs the files with just a few lines of config.
We end up with the tofu executable in .local/bin/tofu while the associated documentation (license, readme, ...) goes to .local/share/tofu.
My idea was to delegate only the download to curl and let chezmoi handle the rest of the operations.

> • Run a local proxy that transparently adds the authentication layer?

While this does seem like the "simple" workaround, it raises some additional security concerns.

Even if we ignore the security aspect, it can be hard to find a fitting proxy:

  • for NTLM, cntlm is often mentioned but has been unmaintained since around 2012
  • for SPNEGO, I found montag451/spnego-proxy, which hasn't seen development for four years now, and genotrance/px, which doesn't integrate with Kerberos on Linux

As a side note, spnego-proxy is written in Go and MIT-licensed, which could make it a useful reference for assessing the cost/complexity of implementing SPNEGO proxy authentication in chezmoi.

TL;DR

chezmoi calling curl as a fallback adds too much development overhead to be a valid solution.

Relying exclusively on a run_ script brings almost no benefit compared to running the script without chezmoi.

Using a local proxy isn't a viable alternative when security is a concern.

Would it be a viable alternative to have a "local archive(-file)" external type that checks against a path on the host instead of retrieving a remote file?
The local file could be downloaded with a run_before_ script, which would also enable downloading through otherwise unsupported protocols/tools such as sftp or svn.
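
To make that concrete, the run_before_ half could be as small as something like the following (the script name, URL and paths are only examples, and the "local archive" external type itself does not exist yet):

    #!/bin/sh
    # Illustrative example only, e.g. a run_before_10-download-tofu.sh script:
    # fetch the archive through the authenticating proxy with curl; a hypothetical
    # "local archive" external would then verify, extract and install it from $dest.
    set -eu
    url="https://github.com/opentofu/opentofu/releases/download/v1.8.5/tofu_1.8.5_linux_amd64.tar.gz"
    dest="${XDG_CACHE_HOME:-$HOME/.cache}/downloads/tofu_1.8.5_linux_amd64.tar.gz"
    mkdir -p "$(dirname "$dest")"
    if [ -e "$dest" ]; then
        # only transfer if the remote file is newer than the copy we already have
        curl -fsSL -R -z "$dest" -o "$dest.part" "$url"
    else
        curl -fsSL -R -o "$dest.part" "$url"
    fi
    # a "not modified" answer leaves nothing useful behind, so only keep non-empty downloads
    if [ -s "$dest.part" ]; then mv "$dest.part" "$dest"; else rm -f "$dest.part"; fi
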

twpayne (Owner) commented on Nov 9, 2024

Thank you for the detailed further investigation!

I totally agree with your analysis and think there are several good ways to build on this, which can be pursued independently:

Way 1

> Would it be a viable alternative to have a "local archive(-file)" external type that checks against a path on the host instead of retrieving a remote file?

Yes, absolutely. It should be fairly straightforward to add support for file:// URLs in chezmoi's externals, so that an external can point to a local file. I would propose extending chezmoi's externals to allow a list of URLs to fetch the archive from.

For example, as well as:

[".vim/autoload/plug.vim"]
    type = "file"
    url = "https://raw.githubusercontent.com/junegunn/vim-plug/master/plug.vim"

you could write:

[".vim/autoload/plug.vim"]
    type = "file"
    urls = [
        "file:///home/user/Downloads/plug.vim",
        "https://raw.githubusercontent.com/junegunn/vim-plug/master/plug.vim",
    ]

where chezmoi would try urls in order and use the first one that succeeds. If both url and urls are specified, then, for backwards compatibility, chezmoi would try url first, then urls.
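
In shell terms, the intended behaviour is roughly the following (purely an illustration of the semantics, not of how chezmoi would implement it internally):

    # try each candidate URL in order and keep the first one that succeeds
    mkdir -p "$HOME/.vim/autoload"
    for url in \
        "file:///home/user/Downloads/plug.vim" \
        "https://raw.githubusercontent.com/junegunn/vim-plug/master/plug.vim"; do
        if curl -fsSL -o "$HOME/.vim/autoload/plug.vim" "$url"; then
            break
        fi
    done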

Way 2

> chezmoi calling curl as a fallback adds too much development overhead to be a valid solution.

Let's not abandon this yet. You're right that there are ways for chezmoi to use curl, but adding pluggable HTTP clients is more work than Way 1.

Way 3

> Even if we ignore the security aspect, it can be hard to find a fitting proxy:

Yup, these two projects have not been updated in a while. It might be because they are abandoned; it could also be because they are stable and just work as they are. I agree that integrating them into chezmoi adds a lot of complexity, and I would prefer to avoid it.

Conclusion

Let's do Way 1 first, hold Way 2 in mind, and try to avoid Way 3.

What do you think?

JasonGantner (Author) commented on Nov 10, 2024

Your idea of how to implement way 1 works even better than I imagined, and it even allows using mirrors for HTTP resources!

IMHO way 3 should not be about integrating local proxies into chezmoi but about adding support for more HTTP authentication schemes.
The benefit I see in way 3 over way 2 is that it keeps chezmoi self-contained, not relying on external tools (unless desired).
On the other hand, way 2 is somewhat future-proof, since curl is well maintained and we can expect its CLI and API to be stable. It also adds support for some extra protocols that may be of interest (SFTP, FTP/S, SMB/S, ..., even GOPHER/S).

Regarding my specific issue, way 1 is more than enough. I'll let you be the judge of whether way 2 and/or way 3 are needed.
