
[WIP] refactor with caddy labels + remove dependency on systemd #26

Closed
wants to merge 1 commit into from

Conversation

unixfox
Member

@unixfox unixfox commented Mar 1, 2020

Closes #9
Closes #23

This commit uses the Docker image lucaslorentz/caddy-docker-proxy to automatically generate the Caddyfile based on Docker labels.
It also removes the dependency on systemd and allows any Linux distribution compatible with Docker to work with searx-docker.

Changes

  • Switch to the lucaslorentz/caddy-docker-proxy docker image.
  • Remove the main Caddyfile and move most of the directives to Docker labels.
  • Use caddy/conf.d as a way to include additional Caddy directives.
  • Refactor the README to reflect the new changes.
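To illustrate the caddy/conf.d idea (a hedged sketch, not lines from this PR): extra user-maintained directives can be pulled into the generated site block with Caddy v1's import directive, assuming the conf.d directory is mounted into the container at the path shown:

```
# sketch: the site block is generated from Docker labels, and any
# user-maintained extras (e.g. tls.conf, csp.conf) live in files
# under caddy/conf.d, included via a glob import
example.com {
    import /etc/caddy/conf.d/*.conf
}
```

This keeps user customizations out of the tracked files, so a git pull doesn't conflict with them.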

Some changes that I would like to make but am not sure about:

  • Remove the SEARX_TLS environment variable: it is annoying because it isn't really possible to include it directly as a Docker label, due to a generation error when the variable is empty. It would be better to ask the user to set it inside caddy/conf.d/tls.conf.
  • Remove the SEARX_PROTOCOL environment variable: it's useless, and the protocol (HTTP or HTTPS) can already be included in SEARX_HOSTNAME.

Breaking changes:

  • Users that disabled the image proxy and modified the Caddyfile will probably get a git conflict when updating, and will need to modify the CSP directive again.
  • Maybe other things that I didn't think about.

@unixfox
Member Author

unixfox commented Mar 1, 2020

@dalf When you are available please review my PR 😃

@dalf dalf self-requested a review March 4, 2020 13:35
Contributor

@dalf dalf left a comment


Remove the SEARX_PROTOCOL environment variable: it's useless and the protocol (HTTP or HTTPS) can already be included in SEARX_HOSTNAME.

Yes, this one can be removed, and SEARX_HOSTNAME can be https://somewhere

Switch to lucaslorentz/caddy-docker-proxy docker image.

I'm puzzled by this image:

  • I'm not sure we need to support the dynamic configuration provided by this image: everything is static.
  • It doesn't support Caddy v2 ( Caddy v2? lucaslorentz/caddy-docker-proxy#130 ): no big deal, except when Caddy v1 reaches end of life.

Maybe I'm missing something.

ports:
- 80:80
- 443:443
network_mode: host
Contributor


Without network_mode: host, Filtron won't see the user IP.
In the log, if I don't accept the self-signed certificate:

2020/03/04 13:26:05 http: TLS handshake error from 172.19.0.1:39616: remote error: tls: bad certificate

It should not be 172.19.0.1

Member Author


This isn't due to using a self-signed certificate. It is due to the fact that IPv6 support in Docker is poor. I already encountered this issue, and to solve it I had to use this docker image: https://github.com/robbertkl/docker-ipv6nat. It allows passing the correct IPv6 address to the Docker container.

I will include it in this PR: even if the Linux server doesn't have a public IPv6 address, it won't interfere with anything, because docker-ipv6nat only acts on IPv6.

Proof that it isn't about having a self-signed certificate
(first request over IPv6 without docker-ipv6nat, second request over IPv4):

caddy_1          | 2020/03/04 17:18:36 http: TLS handshake error from 172.18.0.1:42496: remote error: tls: bad certificate
caddy_1          | 2020/03/04 17:20:57 http: TLS handshake error from 87.64.x.x:63301: remote error: tls: bad certificate

Contributor


Sorry, my message was not clear: the thing I don't understand is how caddy gets the user's IPv4 address, since it is in a docker network:

  • that's the reason for network_mode: host
  • and because of network_mode: host here, the configuration for filtron / searx / morty is much more complicated.

Thank you for the IPv6 information: I had tried to configure IPv6 without success, and now I understand why (also partly because Kimsufi provides only one IPv6 address).
As I understand it, with docker-ipv6nat, caddy may see a different (IPv4, source port) pair for each IPv6 client?

Member Author

@unixfox unixfox Mar 5, 2020


The "proxy" which forwards requests from the external network to the container's internal network doesn't work over IPv6: moby/moby#17666.
That's why you are seeing the IP of the "load balancer": 172.19.0.1 or 172.18.0.1 (depending on how the network was created).
This proxy is called the "userland proxy" and you can learn more about it here: https://windsock.io/the-docker-proxy/

Using network_mode: host is considered very bad practice because it disables the network isolation, publishes every single port of the container to the "internet", and puts the container on the same network as the other services.
What if there is a vulnerability in Searx which allows making requests from the Searx instance to any host? The attacker would be able to map the network where Searx resides, and if there are other services that aren't properly secured but listen only on localhost, the attacker would be able to gain access to them.
Docker isn't only about making "things simple" by allowing programs to run on every possible Linux server; it's also an isolation system (not a perfect one, for sure).
This "option" should be used as a last resort, when no other solution is available.

Contributor


Thank you for the link, it's much clearer to me now.

Just to sum up:

  • currently, caddy can open any port, but has no capabilities except NET_BIND_SERVICE and DAC_OVERRIDE (sadly it runs as the root user inside the container, which is not mandatory, another bad point for the abiosoft image).
  • to remove this --network host, the full solution is to include docker-ipv6nat, which requires the privileged option, so it is more or less root on the host?

So are we moving from some trust in caddy to blind trust in docker-ipv6nat?

(without docker-ipv6nat, filtron can't work).

Member Author

@unixfox unixfox Mar 7, 2020


The main reason I want to switch to docker-ipv6nat is that hardcoded IP addresses are considered bad practice.

About the security of docker-ipv6nat: you aren't required to give it all the permissions. As explained here: https://github.com/robbertkl/docker-ipv6nat#docker-container, all that's needed is the capabilities NET_RAW, NET_ADMIN and SYS_MODULE.
docker-ipv6nat has a much smaller attack surface than caddy: it doesn't listen to the outside world whereas Caddy does, and it's essentially a bunch of iptables commands.
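For reference, a minimal sketch (based on the docker-ipv6nat README and the capabilities mentioned above) of how the service could be declared in docker-compose, granting only those three capabilities instead of --privileged; the service name is illustrative:

```yaml
ipv6nat:
  image: robbertkl/ipv6nat        # image name from the docker-ipv6nat README
  restart: always
  network_mode: host              # needs the host network to manage ip6tables rules
  cap_drop:
    - ALL                         # drop everything, then add back only what is needed
  cap_add:
    - NET_RAW
    - NET_ADMIN
    - SYS_MODULE
  volumes:
    - /var/run/docker.sock:/var/run/docker.sock:ro  # watches containers to build NAT rules
```

This avoids the blanket --privileged flag while still letting the tool manage IPv6 NAT rules on the host.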

Also, I wanted to correct some of my previous statements. The Searx container isn't on the same network as the Linux server (network_mode: host), so attacks like making requests from the Searx instance to any host (like an internal host) don't apply to it, but they still apply to caddy if one day a flaw is found in it.

Side note:

(without docker-ipv6nat, filtron can't work).

In an IPv4-only environment it does work, but not properly in a dual-stack (IPv4 & IPv6) environment.

Contributor

@dalf dalf Mar 7, 2020


In IPv4 only environment it does work but not properly in dual stack (IPv4 & IPv6) environment.

Yes, and the problem is the silent failure: everything works except the bot protection when the bots use IPv6 (actually, I'm not sure about the filtron behaviour without docker-ipv6nat).

[...]but still correct for caddy if one day there is a flaw in it.

Yes (if caddy runs in an empty docker image, it limits the attack surface).

hardcoded IP address are considered as bad practice

Here is a docker-compose.yml without hardcoded IPs: https://gist.github.com/dalf/28ffe27e8675553928fcbfbcd5098179
I know there is still the --network host and the environment variable duplication.

Comparison of the permissions:

| | docker-ipv6nat (this PR) | lucaslorentz/caddy-docker-proxy (this PR) | caddy (master branch) | Description |
| --- | --- | --- | --- | --- |
| --privileged | yes | no | no | Also lifts all the limitations enforced by the device cgroup controller; the container can then do almost everything that the host can do. |
| --network host | yes | no | yes | The container's network stack is not isolated from the Docker host (the container shares the host's networking namespace) and the container does not get its own IP address allocated. |
| -v /var/run/docker.sock | yes | yes | no | Full access to docker. |
| NET_BIND_SERVICE | | yes | yes | Bind a socket to Internet domain privileged ports (port numbers less than 1024). |
| DAC_OVERRIDE | | yes | yes | Bypass file read, write, and execute permission checks (DAC is an abbreviation of "discretionary access control"). |
| NET_RAW | yes | | | Use RAW and PACKET sockets; bind to any address for transparent proxying. |
| NET_ADMIN | yes | | | Perform various network-related operations: interface configuration; administration of IP firewall, masquerading, and accounting; modify routing tables; bind to any address for transparent proxying; set type-of-service (TOS); clear driver statistics; set promiscuous mode; enable multicasting; use setsockopt(2) to set SO_DEBUG, SO_MARK, SO_PRIORITY (for a priority outside the range 0 to 6), SO_RCVBUFFORCE, and SO_SNDBUFFORCE. |
| SYS_MODULE | yes | | | Load and unload kernel modules (see init_module(2) and delete_module(2)). |

I understand that --network host with listening on 127.0.0.1 breaks the docker philosophy, but I wish to keep the number of permissions minimal.

Actually, if docker-ipv6nat were part of the docker project, it would be the de facto solution. And yes, even though it is not, there is far less code to audit in docker-ipv6nat than in caddy (there is an exec.Cmd, but caddy has one too).

But it still feels weird to me to add permissions in order to remove permissions.
I guess there is no real alternative? (except hacks like hardcoded iptables rules).

# the code might be out of sync with the current running services
systemctl stop "${SERVICE_NAME}"
./stop.sh
Contributor


searx-docker.service.template must change Restart=always to something like Restart=on-failure, otherwise after ./stop.sh systemd will restart everything (only for people using systemd, of course).

But with that change, some people surely won't upgrade the service file, and it will be a mess when they call this script.

One idea to solve this issue:

  • if systemd exists, call systemctl stop "${SERVICE_NAME}", otherwise call ./stop.sh.
  • I don't like it because the script would behave differently depending on whether systemd is installed.
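The Restart change above would be a one-line edit in searx-docker.service.template; a sketch of the relevant section (other directives omitted):

```ini
[Service]
# on-failure: restart only after unclean exits, so a deliberate
# ./stop.sh (clean stop) is not immediately undone by systemd
Restart=on-failure
```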

Member Author

@unixfox unixfox Mar 4, 2020


Or we could remove systemd support altogether; this would avoid having to test on two different types of systems, which was the idea behind using Docker in the first place.
Removing systemd support isn't that complicated: display a warning message pointing to the new system if the user uses the original systemd service file, and make the start.sh script a no-op when started by systemd.

Contributor


we could remove the systemd

I don't think a file and a few command lines explaining how to install this project using systemd would hurt.

But I do agree to put the systemd stuff aside.

@unixfox
Member Author

unixfox commented Mar 4, 2020

It doesn't support caddy v2 ( lucaslorentz/caddy-docker-proxy#130 ): no big deal except when caddy v1 will reach end of life.

The current Caddyfile doesn't support Caddy v2 either (see: caddyserver/caddy/wiki/v2:-Caddyfile-examples). If Caddy v1 becomes unsupported, then whether we switch to caddy-docker-proxy or stick with the current docker image, changes will be needed to move to the next Caddy version. As stated in caddyserver/caddy/releases/tag/v2.0.0-beta1:

Caddy 2 is not backwards-compatible with Caddy 1.

Moreover, a web server doesn't need that much maintenance; once it reaches a ready state, the maintainer can still push small bug/security fixes from time to time. Even the author of caddy stated that he has already stopped working on it: caddyserver/caddy#3073.

I'm not sure we need to support the dynamic configuration provided by this image: everything is static.

This isn't true: there is a .env file which gets populated by the user, and this environment file is then used to generate a dynamic Caddy configuration. The ability to write the Caddyfile from a label allows removing things like this:

environment:
      - SEARX_HOSTNAME=${SEARX_HOSTNAME}
      - SEARX_PROTOCOL=${SEARX_PROTOCOL:-}
      - SEARX_TLS=${SEARX_TLS:-}
      - FILTRON_USER=${FILTRON_USER}
      - FILTRON_PASSWORD=${FILTRON_PASSWORD}

by using docker-compose's own ability to interpolate environment variables directly into the file, which in my view is much cleaner than passing each environment variable twice (in the docker-compose file and in the Caddyfile).
Unfortunately, I still had to do it the old way for the SEARX_TLS environment variable, which is why I asked for it to be removed.
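A sketch of what this looks like with caddy-docker-proxy; the label names follow that project's caddy.* convention, and the service and port shown are illustrative, not copied from this PR:

```yaml
filtron:
  image: dalf/filtron
  labels:
    # caddy-docker-proxy reads these labels and renders the Caddyfile
    # itself, so ${SEARX_HOSTNAME} is interpolated once by docker-compose
    # instead of being passed again as an environment variable for a
    # separate template step to consume
    caddy.address: "${SEARX_HOSTNAME}"
    caddy.targetport: "4040"
```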

header / {
# CSP (see http://content-security-policy.com/ )
Content-Security-Policy "upgrade-insecure-requests; default-src 'none'; script-src 'self'; style-src 'self' 'unsafe-inline'; form-action 'self'; font-src 'self'; frame-ancestors 'self'; base-uri 'self'; connect-src 'self' https://overpass-api.de; img-src 'self' data: https://*.tile.openstreetmap.org; frame-src https://www.youtube-nocookie.com https://player.vimeo.com https://www.dailymotion.com https://www.deezer.com https://www.mixcloud.com https://w.soundcloud.com https://embed.spotify.com"
}
Contributor


Why not add the other headers?

Strict-Transport-Security, X-XSS-Protection, X-Content-Type-Options, X-Frame-Options, Feature-Policy, Pragma, Referrer-Policy, X-Robots-Tag, -Server, Cache-Control, Access-Control-Allow-Methods, Access-Control-Allow-Origin

Member Author


It's already here: https://github.com/searx/searx-docker/pull/26/files#diff-eb672b1ddd3aa7dae5f2577f38daa271R51.

The csp.conf config file is only there so that users can change the CSP headers if they disable the image proxy.

Contributor


I've seen that, but why is Content-Security-Policy the only HTTP header in caddy.conf.d?

Either:

  • put Content-Security-Policy in docker-compose.yaml, or
  • put the other headers in a caddy.conf.d/headers.conf file.

Maybe I'm missing the point of a separate file for the CSP header.
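For illustration, a caddy.conf.d/headers.conf along those lines might look like this, using Caddy v1's header directive; the values shown are common defaults and assumptions, not taken from this PR:

```
header / {
    # illustrative values; adjust to the project's actual policy
    Strict-Transport-Security "max-age=31536000"
    X-XSS-Protection "1; mode=block"
    X-Content-Type-Options "nosniff"
    X-Frame-Options "SAMEORIGIN"
    Referrer-Policy "no-referrer"
    X-Robots-Tag "noindex, nofollow"
    # a leading "-" removes a header in Caddy v1
    -Server
}
```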

Member Author


If the CSP is put in docker-compose.yaml, then when docker-compose.yaml is updated with new changes and the user has changed the CSP because they disabled the image proxy, there will be a git conflict, as you described (https://github.com/searx/searx-docker#custom-docker-composeyaml):

Do not modify docker-compose.yaml otherwise you won't be able to update easily from the git repository.

I don't mind putting the other headers into a separate file, but is it really useful for the user?
I don't think anyone will want to change the other headers by hand, because all of them are "standard" and don't need to be customized. This also guarantees that, when creating an issue, the user has the same headers as the ones in the repository.
Moreover, if one day we need to generate a header "dynamically" based on the .env, we won't have to remove it from headers.conf and add it back as a Docker label.

@dalf
Contributor

dalf commented Mar 5, 2020

The current Caddyfile doesn't support Caddy v2 too

Sure. I'm just afraid of the case where there is a CVE:

  • if there is a fix in the git repo:
    • currently, we have to wait for an update of abiosoft/caddy (not good).
    • with this PR: we have to wait for an update of caddy-docker-proxy.
    • if we built caddy without plugins, we would just have to wait for the fix to land in the git repo.
  • if there is no fix:
    • we can move to version 2; we "just" have to rewrite the Caddyfile (and there is an official caddy v2 docker image).
    • with this PR: we have to find another solution, and in the meantime the CVE is not fixed.

I know that may never happen.

This isn't true, there is a .env which gets populated by the user.

By "support dynamic configuration", I mean that after ./start.sh there is no new docker image to start and publish using caddy-docker-proxy. After ./start.sh everything is static, so caddy-docker-proxy could be just a template renderer.

@unixfox
Member Author

unixfox commented Mar 5, 2020

Let's imagine a CVE happens right now.

Looking at the activity of both abiosoft/caddy and lucaslorentz/caddy-docker-proxy, it's clear that lucaslorentz/caddy-docker-proxy would be updated first:

  • (personal experience) lucaslorentz is pretty quick at responding to issues on his repository, so he would probably update his image if someone opened an issue to warn about the vulnerability.

But thanks to the open-source world, we aren't required to rely on lucaslorentz/caddy-docker-proxy: we could build our own caddy-docker-proxy docker image and update it when there is a CVE. I already did it for myself, so setting it up for Searx isn't an issue, but in that case we "take the responsibility" for updating the Caddy image when there is a vulnerability.

Caddy v1 isn't going away any time soon; the web server is used by millions of users, and if a vulnerability is discovered there is a high chance that the caddy team will release a new version fixing it. Switching to a new major version in a hurry isn't a good solution either, because it is more likely to break something than waiting slightly longer for a fixed release.


By "support dynamic configuration", I mean after ./start.sh, there is no new docker image to start and publish using caddy-docker-proxy. After ./start.sh everything is static, so caddy-docker-proxy could be a template renderer.

Using Docker labels to generate the Caddyfile is still useful to avoid passing environment variables twice, even if you don't consider the .env file to be "dynamic configuration".

@unixfox
Member Author

unixfox commented Apr 9, 2020

Sorry for the long wait on this PR.
I switched to Kubernetes for my own setup, so I stopped using Docker Swarm / Docker Compose and I won't work on this PR anymore.

Anyway, if you want the services in the docker-compose to be as little privileged as possible, keep network host but reduce the attack surface by using a dumb TCP proxy with PROXY protocol support (in order to preserve the client IP address), like this project: https://github.com/kazeburo/ppdp.
Then put Caddy in its own network with the other services, bind it to 127.0.0.1:8080 (127.0.0.1:8080:80), and have the TCP proxy in network host forward to 127.0.0.1:8080.
Also, please remove this hardcoded network: https://github.com/searx/searx-docker/blob/master/docker-compose.yaml#L108
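A sketch of that layout in docker-compose terms; the service names and network name are illustrative, and the ppdp command line is deliberately left as a placeholder since its flags are documented in its own README:

```yaml
caddy:
  image: abiosoft/caddy:1.0.3-no-stats
  ports:
    # published on loopback only: reachable by the host-network proxy,
    # not directly from the outside
    - "127.0.0.1:8080:80"
  networks:
    - searx          # shares a network with filtron / searx / morty only

tcp-proxy:
  image: kazeburo/ppdp          # illustrative image reference
  network_mode: host            # sees the real client IP on ports 80/443
  # command: <forward the host's public ports to 127.0.0.1:8080 with
  #          PROXY protocol; see the ppdp README for the actual flags>
```

Only the dumb TCP proxy runs on the host network, so the far larger Caddy attack surface stays inside an isolated Docker network.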

@@ -3,61 +3,95 @@ version: '3.7'
services:

caddy:
container_name: caddy
image: abiosoft/caddy:1.0.3-no-stats
image: lucaslorentz/caddy-docker-proxy:latest


Be careful when using the latest tag; it is preferable to pin a specific version, as latest might bring breaking changes.
Latest is now caddy v2, which is not compatible with caddy v1.

Successfully merging this pull request may close these issues:

  • Not all distros have systemd
  • Use caddy-docker-proxy to remove the hardcoded IPv4 addresses

3 participants