
[WIP] refactor with caddy labels + remove dependency on systemd #26

Closed
wants to merge 1 commit into from

Conversation

unixfox
Member

@unixfox unixfox commented Mar 1, 2020

Closes #9
Closes #23

This commit uses the Docker image lucaslorentz/caddy-docker-proxy to automatically generate the Caddyfile based on Docker labels.
It also removes the dependency on systemd and allows any Linux distribution compatible with Docker to work with searx-docker.

Changes

  • Switch to the lucaslorentz/caddy-docker-proxy docker image.
  • Remove the main Caddyfile and move most of the directives to Docker labels.
  • Use caddy/conf.d as a way to include additional Caddy directives.
  • Refactor the README to reflect the new changes.
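To illustrate the caddy/conf.d idea (a hedged sketch, not lines from this PR): extra user-maintained directives can be pulled into the generated site block with Caddy v1's import directive, assuming the conf.d directory is mounted into the container at the path shown:

```
# sketch: the site block is generated from Docker labels, and any
# user-maintained extras (e.g. tls.conf, csp.conf) live in files
# under caddy/conf.d, included via a glob import
example.com {
    import /etc/caddy/conf.d/*.conf
}
```

This keeps user customizations out of the tracked files, so a git pull doesn't conflict with them.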

Some changes that I would like to make but am not sure about:

  • Remove the SEARX_TLS environment variable: it is annoying because it isn't really possible to include it directly as a Docker label, due to a generation error when the variable is empty. It would be better to ask the user to set it inside caddy/conf.d/tls.conf.
  • Remove the SEARX_PROTOCOL environment variable: it's useless, and the protocol (HTTP or HTTPS) can already be included in SEARX_HOSTNAME.

Breaking changes:

  • Users that disabled the image proxy and modified the Caddyfile will probably get a git conflict when updating, and will need to modify the CSP directive again.
  • Maybe other things that I didn't think about.

@unixfox
Member Author

unixfox commented Mar 1, 2020

@dalf When you are available please review my PR 😃

@dalf dalf self-requested a review March 4, 2020 13:35
Contributor

@dalf dalf left a comment


Remove the SEARX_PROTOCOL environment variable: it's useless and the protocol (HTTP or HTTPS) can already be included in SEARX_HOSTNAME.

Yes, this one can be removed, and SEARX_HOSTNAME can be https://somewhere

Switch to lucaslorentz/caddy-docker-proxy docker image.

I'm puzzled by this image:

  • I'm not sure we need to support the dynamic configuration provided by this image: everything is static.
  • It doesn't support Caddy v2 ( Caddy v2? lucaslorentz/caddy-docker-proxy#130 ): no big deal, except when Caddy v1 reaches end of life.

Maybe I'm missing something.

ports:
- 80:80
- 443:443
network_mode: host
Contributor


Without network_mode: host, Filtron won't see the user IP.
In the log, if I don't accept the self-signed certificate:

2020/03/04 13:26:05 http: TLS handshake error from 172.19.0.1:39616: remote error: tls: bad certificate

It should not be 172.19.0.1

Member Author


This isn't due to using a self-signed certificate. It is due to the fact that IPv6 support in Docker is poor. I already encountered this issue, and to solve it I had to use this docker image: https://github.com/robbertkl/docker-ipv6nat. It allows passing the correct IPv6 address to the Docker container.

I will include it in this PR: even if the Linux server doesn't have a public IPv6 address, it won't interfere with anything, because docker-ipv6nat only acts on IPv6.

Proof that it isn't about having a self-signed certificate
(first request over IPv6 without docker-ipv6nat, second request over IPv4):

caddy_1          | 2020/03/04 17:18:36 http: TLS handshake error from 172.18.0.1:42496: remote error: tls: bad certificate
caddy_1          | 2020/03/04 17:20:57 http: TLS handshake error from 87.64.x.x:63301: remote error: tls: bad certificate

Contributor


Sorry, my message was not clear: the thing I don't understand is how caddy gets the user's IPv4 address, since it is in a docker network:

  • that's the reason for network_mode: host
  • and because of network_mode: host here, the configuration for filtron / searx / morty is much more complicated.

Thank you for the IPv6 information: I had tried to configure IPv6 without success, and now I understand why (also partly because Kimsufi provides only one IPv6 address).
As I understand it, with docker-ipv6nat, caddy may see a different (IPv4, source port) pair for each IPv6 client?

Member Author

@unixfox unixfox Mar 5, 2020


The "proxy" which forwards requests from the external network to the container's internal network doesn't work over IPv6: moby/moby#17666.
That's why you are seeing the IP of the "load balancer": 172.19.0.1 or 172.18.0.1 (depending on how the network was created).
This proxy is called the "userland proxy" and you can learn more about it here: https://windsock.io/the-docker-proxy/

Using network_mode: host is considered very bad practice because it disables the network isolation, publishes every single port of the container to the "internet", and puts the container on the same network as the other services.
What if there is a vulnerability in Searx which allows making requests from the Searx instance to any host? The attacker would be able to map the network where Searx resides, and if there are other services that aren't properly secured but listen only on localhost, the attacker would be able to gain access to them.
Docker isn't only about making "things simple" by allowing programs to run on every possible Linux server; it's also an isolation system (not a perfect one, for sure).
This "option" should be used as a last resort, when no other solution is available.

Contributor


Thank you for the link, it's much clearer to me now.

Just to sum up:

  • currently, caddy can open any port, but has no capabilities except NET_BIND_SERVICE and DAC_OVERRIDE (sadly it runs as the root user inside the container, which is not mandatory, another bad point for the abiosoft image).
  • to remove this --network host, the full solution is to include docker-ipv6nat, which requires the privileged option, so it is more or less root on the host?

So are we moving from some trust in caddy to blind trust in docker-ipv6nat?

(without docker-ipv6nat, filtron can't work).

Member Author

@unixfox unixfox Mar 7, 2020


The main reason I want to switch to docker-ipv6nat is that hardcoded IP addresses are considered bad practice.

About the security of docker-ipv6nat: you aren't required to give it all the permissions. As explained here: https://github.com/robbertkl/docker-ipv6nat#docker-container, all that's needed is the capabilities NET_RAW, NET_ADMIN and SYS_MODULE.
docker-ipv6nat has a much smaller attack surface than caddy: it doesn't listen to the outside world whereas Caddy does, and it's essentially a bunch of iptables commands.
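For reference, a minimal sketch (based on the docker-ipv6nat README and the capabilities mentioned above) of how the service could be declared in docker-compose, granting only those three capabilities instead of --privileged; the service name is illustrative:

```yaml
ipv6nat:
  image: robbertkl/ipv6nat        # image name from the docker-ipv6nat README
  restart: always
  network_mode: host              # needs the host network to manage ip6tables rules
  cap_drop:
    - ALL                         # drop everything, then add back only what is needed
  cap_add:
    - NET_RAW
    - NET_ADMIN
    - SYS_MODULE
  volumes:
    - /var/run/docker.sock:/var/run/docker.sock:ro  # watches containers to build NAT rules
```

This avoids the blanket --privileged flag while still letting the tool manage IPv6 NAT rules on the host.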

Also, I wanted to correct some of my previous statements. The Searx container isn't on the same network as the Linux server (network_mode: host), so attacks like making requests from the Searx instance to any host (like an internal host) don't apply to it, but they still apply to caddy if one day a flaw is found in it.

Side note:

(without docker-ipv6nat, filtron can't work).

In an IPv4-only environment it does work, but not properly in a dual-stack (IPv4 & IPv6) environment.

Contributor

@dalf dalf Mar 7, 2020


In IPv4 only environment it does work but not properly in dual stack (IPv4 & IPv6) environment.

Yes, and the problem is the silent failure: everything works except the bot protection when the bots use IPv6 (actually, I'm not sure about the filtron behaviour without docker-ipv6nat).

[...]but still correct for caddy if one day there is a flaw in it.

Yes (if caddy runs in an empty docker image, it limits the attack surface).

hardcoded IP address are considered as bad practice

Here is a docker-compose.yml without hardcoded IPs: https://gist.github.com/dalf/28ffe27e8675553928fcbfbcd5098179
I know there is still the --network host and the environment variable duplication.

Comparison of the permissions:

| | docker-ipv6nat (this PR) | lucaslorentz/caddy-docker-proxy (this PR) | caddy (master branch) | Description |
| --- | --- | --- | --- | --- |
| --privileged | yes | no | no | Also lifts all the limitations enforced by the device cgroup controller; the container can then do almost everything that the host can do. |
| --network host | yes | no | yes | The container's network stack is not isolated from the Docker host (the container shares the host's networking namespace) and the container does not get its own IP address allocated. |
| -v /var/run/docker.sock | yes | yes | no | Full access to docker. |
| NET_BIND_SERVICE | | yes | yes | Bind a socket to Internet domain privileged ports (port numbers less than 1024). |
| DAC_OVERRIDE | | yes | yes | Bypass file read, write, and execute permission checks (DAC is an abbreviation of "discretionary access control"). |
| NET_RAW | yes | | | Use RAW and PACKET sockets; bind to any address for transparent proxying. |
| NET_ADMIN | yes | | | Perform various network-related operations: interface configuration; administration of IP firewall, masquerading, and accounting; modify routing tables; bind to any address for transparent proxying; set type-of-service (TOS); clear driver statistics; set promiscuous mode; enable multicasting; use setsockopt(2) to set SO_DEBUG, SO_MARK, SO_PRIORITY (for a priority outside the range 0 to 6), SO_RCVBUFFORCE, and SO_SNDBUFFORCE. |
| SYS_MODULE | yes | | | Load and unload kernel modules (see init_module(2) and delete_module(2)). |

I understand that --network host with listening on 127.0.0.1 breaks the docker philosophy, but I wish to keep the number of permissions minimal.

Actually, if docker-ipv6nat were part of the docker project, it would be the de facto solution. And yes, even though it is not, there is far less code to audit in docker-ipv6nat than in caddy (there is an exec.Cmd, but caddy has one too).

But it still feels weird to me to add permissions in order to remove permissions.
I guess there is no real alternative? (except hacks like hardcoded iptables rules).

# the code might be out of sync with the current running services
systemctl stop "${SERVICE_NAME}"
./stop.sh
Contributor


searx-docker.service.template must change Restart=always to something like Restart=on-failure, otherwise after ./stop.sh systemd will restart everything (only for people using systemd, of course).

But with that change, some people surely won't upgrade the service file, and it will be a mess when they call this script.

One idea to solve this issue:

  • if systemd exists, call systemctl stop "${SERVICE_NAME}", otherwise call ./stop.sh.
  • I don't like it because the script would behave differently depending on whether systemd is installed.
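The Restart change above would be a one-line edit in searx-docker.service.template; a sketch of the relevant section (other directives omitted):

```ini
[Service]
# on-failure: restart only after unclean exits, so a deliberate
# ./stop.sh (clean stop) is not immediately undone by systemd
Restart=on-failure
```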

Member Author

@unixfox unixfox Mar 4, 2020


Or we could remove systemd support altogether; this would avoid having to test on two different types of systems, which was the idea behind using Docker in the first place.
Removing systemd support isn't that complicated: display a warning message pointing to the new system if the user uses the original systemd service file, and make the start.sh script a no-op when started by systemd.

Contributor


we could remove the systemd

I don't think a file and a few command lines explaining how to install this project using systemd would hurt.

But I do agree to put the systemd stuff aside.

@unixfox
Member Author

unixfox commented Mar 4, 2020

It doesn't support caddy v2 ( lucaslorentz/caddy-docker-proxy#130 ): no big deal except when caddy v1 will reach end of life.

The current Caddyfile doesn't support Caddy v2 either (see: caddyserver/caddy/wiki/v2:-Caddyfile-examples). If Caddy v1 becomes unsupported, then whether we switch to caddy-docker-proxy or stick with the current docker image, changes will be needed to move to the next Caddy version. As stated in caddyserver/caddy/releases/tag/v2.0.0-beta1:

Caddy 2 is not backwards-compatible with Caddy 1.

Moreover, a web server doesn't need that much maintenance; once it reaches a ready state, the maintainer can still push small bug/security fixes from time to time. Even the author of caddy stated that he has already stopped working on it: caddyserver/caddy#3073.

I'm not sure we need to support the dynamic configuration provided by this image: everything is static.

This isn't true: there is a .env file which gets populated by the user, and this environment file is then used to generate a dynamic Caddy configuration. The ability to write the Caddyfile from a label allows removing things like this:

environment:
      - SEARX_HOSTNAME=${SEARX_HOSTNAME}
      - SEARX_PROTOCOL=${SEARX_PROTOCOL:-}
      - SEARX_TLS=${SEARX_TLS:-}
      - FILTRON_USER=${FILTRON_USER}
      - FILTRON_PASSWORD=${FILTRON_PASSWORD}

by using docker-compose's own ability to interpolate environment variables directly into the file, which in my view is much cleaner than passing each environment variable twice (in the docker-compose file and in the Caddyfile).
Unfortunately, I still had to do it the old way for the SEARX_TLS environment variable, which is why I asked for it to be removed.
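A sketch of what this looks like with caddy-docker-proxy; the label names follow that project's caddy.* convention, and the service and port shown are illustrative, not copied from this PR:

```yaml
filtron:
  image: dalf/filtron
  labels:
    # caddy-docker-proxy reads these labels and renders the Caddyfile
    # itself, so ${SEARX_HOSTNAME} is interpolated once by docker-compose
    # instead of being passed again as an environment variable for a
    # separate template step to consume
    caddy.address: "${SEARX_HOSTNAME}"
    caddy.targetport: "4040"
```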

header / {
# CSP (see http://content-security-policy.com/ )
Content-Security-Policy "upgrade-insecure-requests; default-src 'none'; script-src 'self'; style-src 'self' 'unsafe-inline'; form-action 'self'; font-src 'self'; frame-ancestors 'self'; base-uri 'self'; connect-src 'self' https://overpass-api.de; img-src 'self' data: https://*.tile.openstreetmap.org; frame-src https://www.youtube-nocookie.com https://player.vimeo.com https://www.dailymotion.com https://www.deezer.com https://www.mixcloud.com https://w.soundcloud.com https://embed.spotify.com"
}
Contributor


Why not add the other headers?

Strict-Transport-Security, X-XSS-Protection, X-Content-Type-Options, X-Frame-Options, Feature-Policy, Pragma, Referrer-Policy, X-Robots-Tag, -Server, Cache-Control, Access-Control-Allow-Methods, Access-Control-Allow-Origin

Member Author


It's already here: https://github.com/searx/searx-docker/pull/26/files#diff-eb672b1ddd3aa7dae5f2577f38daa271R51.

The csp.conf config file is only there so that users can change the CSP headers if they disable the image proxy.

Contributor


I've seen that, but why is Content-Security-Policy the only HTTP header in caddy.conf.d?

Either:

  • put Content-Security-Policy in docker-compose.yaml, or
  • put the other headers in a caddy.conf.d/headers.conf file.

Maybe I'm missing the point of a separate file for the CSP header.
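For illustration, a caddy.conf.d/headers.conf along those lines might look like this, using Caddy v1's header directive; the values shown are common defaults and assumptions, not taken from this PR:

```
header / {
    # illustrative values; adjust to the project's actual policy
    Strict-Transport-Security "max-age=31536000"
    X-XSS-Protection "1; mode=block"
    X-Content-Type-Options "nosniff"
    X-Frame-Options "SAMEORIGIN"
    Referrer-Policy "no-referrer"
    X-Robots-Tag "noindex, nofollow"
    # a leading "-" removes a header in Caddy v1
    -Server
}
```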

Member Author


If the CSP is put in docker-compose.yaml, then when docker-compose.yaml is updated with new changes and the user has changed the CSP because they disabled the image proxy, there will be a git conflict, as you described (https://github.com/searx/searx-docker#custom-docker-composeyaml):

Do not modify docker-compose.yaml otherwise you won't be able to update easily from the git repository.

I don't mind putting the other headers into a separate file, but is it really useful for the user?
I don't think anyone will want to change the other headers by hand, because all of them are "standard" and don't need to be customized. This also guarantees that, when creating an issue, the user has the same headers as the ones in the repository.
Moreover, if one day we need to generate a header "dynamically" based on the .env, we won't have to remove it from headers.conf and add it back as a Docker label.

@dalf
Contributor

dalf commented Mar 5, 2020

The current Caddyfile doesn't support Caddy v2 too

Sure. I'm just afraid of the case where there is a CVE:

  • if there is a fix in the git repo:
    • currently, we have to wait for an update of abiosoft/caddy (not good).
    • with this PR: we have to wait for an update of caddy-docker-proxy.
    • if we built caddy without plugins, we would just have to wait for the fix to land in the git repo.
  • if there is no fix:
    • we can move to version 2; we "just" have to rewrite the Caddyfile (and there is an official caddy v2 docker image).
    • with this PR: we have to find another solution, and in the meantime the CVE is not fixed.

I know that may never happen.

This isn't true, there is a .env which gets populated by the user.

By "support dynamic configuration", I mean that after ./start.sh there is no new docker image to start and publish using caddy-docker-proxy. After ./start.sh everything is static, so caddy-docker-proxy could be just a template renderer.

@unixfox
Member Author

unixfox commented Mar 5, 2020

Let's imagine a CVE happens right now.

Looking at the activity of both abiosoft/caddy and lucaslorentz/caddy-docker-proxy, it's clear that lucaslorentz/caddy-docker-proxy would be updated first:

  • (personal experience) lucaslorentz is pretty quick at responding to issues on his repository, so he would probably update his image if someone opened an issue to warn about the vulnerability.

But thanks to the open-source world, we aren't required to rely on lucaslorentz/caddy-docker-proxy: we could build our own caddy-docker-proxy docker image and update it when there is a CVE. I already did it for myself, so setting it up for Searx isn't an issue, but in that case we "take the responsibility" for updating the Caddy image when there is a vulnerability.

Caddy v1 isn't going away any time soon; the web server is used by millions of users, and if a vulnerability is discovered there is a high chance that the caddy team will release a new version fixing it. Switching to a new major version in a hurry isn't a good solution either, because it is more likely to break something than waiting slightly longer for a fixed release.


By "support dynamic configuration", I mean after ./start.sh, there is no new docker image to start and publish using caddy-docker-proxy. After ./start.sh everything is static, so caddy-docker-proxy could be a template renderer.

Using Docker labels to generate the Caddyfile is still useful to avoid passing environment variables twice, even if you don't consider the .env file to be "dynamic configuration".

@unixfox
Member Author

unixfox commented Apr 9, 2020

Sorry for the long wait on this PR.
I switched to Kubernetes for my own setup, so I stopped using Docker Swarm / Docker Compose and I won't work on this PR anymore.

Anyway, if you want the services in the docker-compose to be as little privileged as possible, keep network host but reduce the attack surface by using a dumb TCP proxy with PROXY protocol support (in order to preserve the client IP address), like this project: https://github.com/kazeburo/ppdp.
Then put Caddy in its own network with the other services, bind it to 127.0.0.1:8080 (127.0.0.1:8080:80), and have the TCP proxy in network host forward to 127.0.0.1:8080.
Also, please remove this hardcoded network: https://github.com/searx/searx-docker/blob/master/docker-compose.yaml#L108
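A sketch of that layout in docker-compose terms; the service names and network name are illustrative, and the ppdp command line is deliberately left as a placeholder since its flags are documented in its own README:

```yaml
caddy:
  image: abiosoft/caddy:1.0.3-no-stats
  ports:
    # published on loopback only: reachable by the host-network proxy,
    # not directly from the outside
    - "127.0.0.1:8080:80"
  networks:
    - searx          # shares a network with filtron / searx / morty only

tcp-proxy:
  image: kazeburo/ppdp          # illustrative image reference
  network_mode: host            # sees the real client IP on ports 80/443
  # command: <forward the host's public ports to 127.0.0.1:8080 with
  #          PROXY protocol; see the ppdp README for the actual flags>
```

Only the dumb TCP proxy runs on the host network, so the far larger Caddy attack surface stays inside an isolated Docker network.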

@@ -3,61 +3,95 @@ version: '3.7'
services:

caddy:
container_name: caddy
image: abiosoft/caddy:1.0.3-no-stats
image: lucaslorentz/caddy-docker-proxy:latest


Be careful when using the latest tag; it is preferable to pin a specific version, as latest might bring breaking changes.
Latest is now caddy v2, which is not compatible with caddy v1.

Successfully merging this pull request may close these issues:

  • Not all distros have systemd
  • Use caddy-docker-proxy to remove the hardcoded IPv4 addresses

3 participants