[WIP] refactor with caddy labels + remove dependency on systemd #26
Conversation
@dalf When you are available, please review my PR 😃
Remove the SEARX_PROTOCOL environment variable: it's useless and the protocol (HTTP or HTTPS) can already be included in SEARX_HOSTNAME.
Yes, this one can be removed, and SEARX_HOSTNAME can be https://somewhere.
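For example, the .env would then need only one entry (the hostname is illustrative):

```
SEARX_HOSTNAME=https://searx.example.com
```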
Switch to lucaslorentz/caddy-docker-proxy docker image.
I'm puzzled by this image:
- I'm not sure we need to support the dynamic configuration provided by this image: everything is static.
- It doesn't support caddy v2 (Caddy v2? lucaslorentz/caddy-docker-proxy#130): no big deal, except when caddy v1 reaches end of life.

Maybe I'm missing something.
```yaml
ports:
  - 80:80
  - 443:443
network_mode: host
```
Without `network_mode: host`, Filtron won't have the user IP.
In the log, if I don't accept the self-signed certificate:

2020/03/04 13:26:05 http: TLS handshake error from 172.19.0.1:39616: remote error: tls: bad certificate

It should not be 172.19.0.1.
This isn't due to using a self-signed certificate. It is due to the fact that IPv6 support in Docker is bad. I already encountered this issue, and to solve it I had to use this docker image: https://github.com/robbertkl/docker-ipv6nat. It allows passing the correct IPv6 address to the Docker container.
I will include it in this PR: even if the Linux server doesn't have a public IPv6 address, it won't interfere with anything, because docker-ipv6nat only acts on IPv6.
Proof that it isn't about having a self-signed certificate (first request over IPv6 without docker-ipv6nat, second request over IPv4):

caddy_1 | 2020/03/04 17:18:36 http: TLS handshake error from 172.18.0.1:42496: remote error: tls: bad certificate
caddy_1 | 2020/03/04 17:20:57 http: TLS handshake error from 87.64.x.x:63301: remote error: tls: bad certificate
Sorry, my message was not clear: the thing I don't understand is how caddy gets the user's IPv4 address, since it is inside a docker network:
- that's the reason for `network_mode: host`
- and because of `network_mode: host` here, the configuration is way more complicated for filtron / searx / morty.

Thank you for the IPv6 information: I had tried to configure IPv6 without success, and now I understand why (also partly because kimsufi provides only one IPv6 address).
As I understand docker-ipv6nat, caddy may see a different (IPv4, source port) pair for each IPv6 client?
The "proxy" which forwards requests from the external network to the internal network of the container doesn't work in IPv6: moby/moby#17666.
That's why you are seeing the IP of the "load balancer": 172.19.0.1
or 172.18.0.1
(depends on how the network is created).
This proxy is called "userland proxy" and you can learn more about it here: https://windsock.io/the-docker-proxy/
Using network_mode: host
is considered as a very bad practice because it disables the initial network isolation, publish every single port of the container to the "internet" and make the container in the same network as the other services.
What if there is a vulnerability in Searx which allows doing requests from the Searx instance to any host? The attacker will be able to map the network where Searx reside and if there are other services which aren't properly secured but listen only on localhost then the attacker will be able to gain access to them.
Docker isn't only about making "things simple" by allowing programs to run on every possible Linux servers but it's also an isolation system (not the perfect one for sure).
This "option" should be used as a last resort if there is no other solution available.
Thank you for the link, that is much clearer to me.
Just to sum up:
- currently, caddy can open any port, but has no capabilities except NET_BIND_SERVICE and DAC_OVERRIDE (sadly running as the root user inside the container, which is not mandatory; another bad point for the abiosoft image).
- to remove this `--network host`, the full solution is to include docker-ipv6nat, which requires the privileged option, so it is more or less root on the host?

So are we moving from some trust in caddy to blind trust in docker-ipv6nat?
(Without docker-ipv6nat, filtron can't work.)
The main reason why I want to switch to docker-ipv6nat is that hardcoded IP addresses are considered bad practice, because:
- The network `10.10.10.0/24` could already exist, and IPv4 network collisions already impact Docker: https://stackoverflow.com/questions/50514275/docker-bridge-conflicts-with-host-network, https://www.lullabot.com/articles/fixing-docker-and-vpn-ip-address-conflicts, and more articles on your preferred search engine. Letting Docker create the network itself is the way to go to avoid any collision.
- In the Docker mindset, everything should be "temporary" and not tied to real IP addresses; that's why it's better to delegate the work to a DNS server, as sketched below.
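A minimal sketch of that approach, assuming filtron's standard `-listen`/`-api`/`-target` flags (the exact command line in this repository may differ):

```yaml
services:
  filtron:
    image: dalf/filtron
    # "searx" is resolved by Docker's embedded DNS on the default
    # compose network; no hardcoded 10.10.10.x address is needed
    command: -listen 0.0.0.0:4040 -api 0.0.0.0:4041 -target searx:8080
  searx:
    image: searx/searx
```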
About the security of docker-ipv6nat: you aren't required to give it all the permissions. As explained here: https://github.com/robbertkl/docker-ipv6nat#docker-container, all that's needed is the NET_RAW, NET_ADMIN and SYS_MODULE capabilities.
docker-ipv6nat has way less attack surface than caddy, because it doesn't listen to the external world whereas Caddy does, and it's just a bunch of iptables commands.
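For reference, a sketch of how the service could be declared in docker-compose, following the run command in the docker-ipv6nat README (the mounts and restart policy are taken from that README and should be double-checked):

```yaml
  ipv6nat:
    image: robbertkl/ipv6nat
    restart: unless-stopped
    network_mode: host        # must manage the host's ip6tables rules
    cap_drop:
      - ALL
    cap_add:                  # only the three capabilities listed above
      - NET_RAW
      - NET_ADMIN
      - SYS_MODULE
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro   # watch container events
      - /lib/modules:/lib/modules:ro                   # needed by SYS_MODULE
```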
Also, I want to correct some of my previous thoughts. The Searx container isn't in the same network as the Linux server (it doesn't use `network_mode: host`), so the attack I described (making requests from the Searx instance to any host, like an internal one) doesn't apply to it; it does still apply to caddy, if one day there is a flaw in it.
Side note:
"(Without docker-ipv6nat, filtron can't work.)"
In an IPv4-only environment it does work, but not properly in a dual-stack (IPv4 & IPv6) environment.
In an IPv4-only environment it does work, but not properly in a dual-stack (IPv4 & IPv6) environment.

Yes, and the problem is the silent failure: everything works except the bot protection when a bot uses IPv6 (actually, I'm not sure about the filtron behaviour without docker-ipv6nat).

[...] it does still apply to caddy, if one day there is a flaw in it.

Yes (if caddy runs in an empty docker image, it limits the attack surface).

hardcoded IP addresses are considered bad practice

Here is a docker-compose.yml without hardcoded IPs: https://gist.github.com/dalf/28ffe27e8675553928fcbfbcd5098179
I know there is still the `--network host` and the duplicated environment variables.
Comparison of the permissions:
| | docker-ipv6nat (this PR) | lucaslorentz/caddy-docker-proxy (this PR) | caddy (master branch) | Description |
|---|---|---|---|---|
| `--privileged` | yes | no | no | It also lifts all the limitations enforced by the device cgroup controller. In other words, the container can then do almost everything that the host can do. |
| `--network host` | yes | no | yes | The container's network stack is not isolated from the Docker host (the container shares the host's networking namespace), and the container does not get its own IP address allocated. |
| `-v /var/run/docker.sock` | yes | yes | no | Full access to docker. |
| capabilities | | NET_BIND_SERVICE | NET_BIND_SERVICE | Bind a socket to Internet domain privileged ports (port numbers less than 1024). |
| | | DAC_OVERRIDE | DAC_OVERRIDE | Bypass file read, write, and execute permission checks. (DAC is an abbreviation of "discretionary access control".) |
| | NET_RAW | | | Use RAW and PACKET sockets; bind to any address for transparent proxying. |
| | NET_ADMIN | | | Perform various network-related operations: interface configuration; administration of IP firewall, masquerading, and accounting; modify routing tables; bind to any address for transparent proxying; set type-of-service (TOS); clear driver statistics; set promiscuous mode; enable multicasting; use setsockopt(2) to set the following socket options: SO_DEBUG, SO_MARK, SO_PRIORITY (for a priority outside the range 0 to 6), SO_RCVBUFFORCE, and SO_SNDBUFFORCE. |
| | SYS_MODULE | | | Load and unload kernel modules (see init_module(2) and delete_module(2)). |
I understand that `--network host` and listening on 127.0.0.1 break the docker philosophy, but I wish to keep the number of permissions minimal.
Actually, if docker-ipv6nat were part of the docker project, it would be the de facto solution. And yes, even though it isn't, there is far less code to audit in docker-ipv6nat than in caddy (there is an exec.Cmd, but caddy has one too anyway).
Still, it feels weird to me to add permissions in order to remove permissions.
I guess there is no other real solution? (Except hacks like hard-coded iptables rules.)
```sh
# the code might be out of sync with the current running services
systemctl stop "${SERVICE_NAME}"
./stop.sh
```
searx-docker.service.template must change Restart=always to something like Restart=on-failure, otherwise after ./stop.sh systemd will restart everything (only for people using systemd, of course).
But with that change, some people surely won't upgrade the service, and it will be a mess when they call this script.
One idea to solve this issue (sketched below):
- if systemd exists, then call systemctl stop "${SERVICE_NAME}", otherwise call ./stop.sh.
- I don't like it because the script would behave differently depending on whether systemd is installed.
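A sketch of that idea, using `SERVICE_NAME` as in the repository's scripts:

```sh
# Stop through systemd when the unit exists and is active,
# otherwise fall back to the plain script.
if command -v systemctl >/dev/null 2>&1 && systemctl is-active --quiet "${SERVICE_NAME}"; then
    systemctl stop "${SERVICE_NAME}"
else
    ./stop.sh
fi
```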
Or we could remove the systemd support altogether; this would avoid having to test on two different types of system, which is the very idea behind using Docker in the first place.
Removing the systemd support isn't that complicated: display a warning telling the user to switch to the new system if they still use the original systemd service file, and make the start.sh script a no-op when started by systemd.
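One hedged way to implement that no-op: systemd (since v232) exports $INVOCATION_ID to the processes it starts, so start.sh could bail out early (the warning text is illustrative):

```sh
# At the top of start.sh: do nothing when launched by the legacy systemd unit.
if [ -n "${INVOCATION_ID:-}" ]; then
    echo "Warning: searx-docker is no longer managed by systemd." >&2
    echo "Please disable the old unit and run ./start.sh directly." >&2
    exit 0
fi
```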
we could remove the systemd
I don't think a file with a few command lines explaining how to install this project using systemd would hurt.
But I do agree with putting the systemd stuff aside.
The current Caddyfile doesn't support Caddy v2 either (see: caddyserver/caddy/wiki/v2:-Caddyfile-examples), so if caddy v1 gets unsupported we will have to migrate the configuration whenever we switch to v2 anyway.
Moreover, a web server doesn't need that much maintenance; once it reaches a ready state, the maintainer can still push small bug/security fixes from time to time. Even the author of caddy stated that he has already stopped working on it: caddyserver/caddy#3073.
This isn't true, there is:

```yaml
environment:
  - SEARX_HOSTNAME=${SEARX_HOSTNAME}
  - SEARX_PROTOCOL=${SEARX_PROTOCOL:-}
  - SEARX_TLS=${SEARX_TLS:-}
  - FILTRON_USER=${FILTRON_USER}
  - FILTRON_PASSWORD=${FILTRON_PASSWORD}
```

We can get rid of this block by utilizing the capability of including environment variables directly in the docker-compose itself, which from my point of view is way cleaner than passing the environment variables twice (once in the docker-compose and once in the Caddyfile).
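For illustration, roughly what this looks like with caddy-docker-proxy v1 labels (label names follow that project's v1 conventions; the values here are assumptions, not the PR's exact file):

```yaml
  filtron:
    environment:
      - FILTRON_USER=${FILTRON_USER}
      - FILTRON_PASSWORD=${FILTRON_PASSWORD}
    labels:
      # ${SEARX_HOSTNAME} is interpolated once, right here, instead of
      # being exported a second time for a separate Caddyfile
      caddy.address: ${SEARX_HOSTNAME}
      caddy.targetport: "4040"
```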
```
header / {
    # CSP (see http://content-security-policy.com/ )
    Content-Security-Policy "upgrade-insecure-requests; default-src 'none'; script-src 'self'; style-src 'self' 'unsafe-inline'; form-action 'self'; font-src 'self'; frame-ancestors 'self'; base-uri 'self'; connect-src 'self' https://overpass-api.de; img-src 'self' data: https://*.tile.openstreetmap.org; frame-src https://www.youtube-nocookie.com https://player.vimeo.com https://www.dailymotion.com https://www.deezer.com https://www.mixcloud.com https://w.soundcloud.com https://embed.spotify.com"
}
```
Why not add the other headers?
Strict-Transport-Security, X-XSS-Protection, X-Content-Type-Options, X-Frame-Options, Feature-Policy, Pragma, Referrer-Policy, X-Robots-Tag, -Server, Cache-Control, Access-Control-Allow-Methods, Access-Control-Allow-Origin
They are already here: https://github.com/searx/searx-docker/pull/26/files#diff-eb672b1ddd3aa7dae5f2577f38daa271R51.
The csp.conf config file is only there so the user can change the CSP header if they disable the image proxy.
I've seen that, but then why is Content-Security-Policy the only HTTP header in caddy.conf.d?
Either:
- put Content-Security-Policy in docker-compose.yaml, or
- put the other headers in a caddy.conf.d/headers.conf file.

Maybe I'm missing the point of a dedicated file for the CSP header.
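For reference, a sketch of what such a caddy.conf.d/headers.conf could contain (Caddy v1 header syntax; the values are illustrative, not the repository's):

```
header / {
    Strict-Transport-Security "max-age=31536000; includeSubDomains"
    X-Content-Type-Options "nosniff"
    X-XSS-Protection "1; mode=block"
    X-Frame-Options "SAMEORIGIN"
    Referrer-Policy "no-referrer"
    -Server
}
```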
If the CSP is put in docker-compose.yaml, then when docker-compose.yaml is updated with new changes and the user has changed the CSP because they disabled the image proxy, there will be a git conflict, as you described (https://github.com/searx/searx-docker#custom-docker-composeyaml):

Do not modify docker-compose.yaml otherwise you won't be able to update easily from the git repository.

I don't mind putting the other headers into a separate file, but is it really useful for the user? I don't think anyone will want to change the other headers by hand, because all of them are "standard" and don't need to be customized. This also allows us to be sure that, when someone creates an issue, they have the same headers as the ones in the repository.
Moreover, if one day we need to generate a header "dynamically" based on the .env, we wouldn't be required to remove it from the headers.conf file and add it back as a Docker label.
Sure. I'm just afraid of the day there is a CVE; I know that may never happen.
By "support dynamic configuration", I mean that after ./start.sh there is no new docker image to start and publish through caddy-docker-proxy. After ./start.sh everything is static, so caddy-docker-proxy could just as well be a template renderer.
Let's imagine there is a CVE happening right now. From looking at the activity of both projects, a quick fix isn't guaranteed. But thanks to the open source world, we aren't required to rely on a single maintainer: Caddy v1 isn't going away any time soon, the web server is used by millions of users, and if a vulnerability is discovered there is a high chance that the caddy team will release a new version to fix it. Switching to a new version in a hurry isn't a good solution either, because it is more likely to break something than waiting slightly longer for a release that fixes the vulnerability.
Using Docker labels for generating the Caddyfile is still useful in order to avoid passing the environment variables twice, even if you don't consider the dynamic configuration useful.
Sorry for the long wait on this PR. Anyway, if you want the services in the docker-compose to be as unprivileged as possible, then use network host anyway, but try to reduce the possible attack surface by putting a dumb TCP proxy with PROXY protocol support in front (in order to preserve the client IP address), like this project: https://github.com/kazeburo/ppdp.
```diff
@@ -3,61 +3,95 @@ version: '3.7'
 services:

   caddy:
     container_name: caddy
-    image: abiosoft/caddy:1.0.3-no-stats
+    image: lucaslorentz/caddy-docker-proxy:latest
```
Be careful when using the latest tag. It is preferable to pin a specific version, as latest might get breaking changes.
latest is now caddy v2, which is not compatible with caddy v1.
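For example (the tag shown is illustrative; pick the latest v1-compatible release from the image's tag list):

```yaml
image: lucaslorentz/caddy-docker-proxy:0.3.6
```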
Closes #9
Closes #23
This commit uses the Docker image lucaslorentz/caddy-docker-proxy to automatically generate the Caddyfile based on Docker labels. It also removes the dependency on systemd and allows any Linux distribution compatible with Docker to work with searx-docker.

Changes
- Switch to the lucaslorentz/caddy-docker-proxy docker image.
- Add caddy/conf.d as a way to include additional Caddy directives.

Some changes that I would like to make but am not sure about:
- Remove the SEARX_TLS environment variable: it is annoying because it's not really possible to include it directly as a Docker label, due to a generation error when the variable is empty. It would be better to ask the user to set it inside caddy/conf.d/tls.conf (see the sketch after this list).
- Remove the SEARX_PROTOCOL environment variable: it's useless and the protocol (HTTP or HTTPS) can already be included in SEARX_HOSTNAME.
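As a sketch, caddy/conf.d/tls.conf could then contain a single Caddy v1 tls directive (the email below is a placeholder):

```
# ACME certificate with a contact email; "tls self_signed" also works for testing
tls admin@example.com
```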
Breaking changes: