Usage examples #4

Open
pablomfc opened this issue Apr 5, 2024 · 4 comments

Comments

pablomfc commented Apr 5, 2024

Hi, I really liked this project, but I would like to ask for an example of how to use it in practice, if possible with the inclusion of a deployment manifest example. I completed all the steps in the documentation, but in the end I didn't quite understand how to configure a sidecar with the help of vpn-operator.

Thank you !!

Owner

thavlik commented Apr 5, 2024

Howdy! Thank you for your interest in the project. vpn-operator was originally intended to be used in tandem with ytdl-operator. Such a combination of tools allows one to mass download videos from any of the hundreds of backends supported by yt-dlp. Note that this is already an option with yt-dlp on desktop (i.e. without kubernetes), but this does not scale horizontally and can occupy your PC for days at a time. You probably want the videos in an S3 bucket anyway, so there isn't much of a point to downloading them to your local hard drive first.

Downloading videos in bulk is a common task for deep learning researchers working with video data, such as myself. Moving this aspect of my research to k8s means that bulk download operations can be scaled and also don't require babysitting (the advantage of k8s is that you can automate behavior in case of failures). I haven't had the time/energy, or the need, to bring ytdl-operator to completion.

vpn-operator, however, is ready to be used. I will explain how it works and plan on updating the readme to clarify these matters.

The design assumes you are making use of a VPN sidecar like gluetun. Here we define the primary container as the one running your job (e.g. downloading a video) and sidecar containers as any additional (non-primary) containers running within the same kubernetes Pod. All containers in a Pod share the same network namespace. Therefore, when the gluetun container connects to the VPN, your primary container will also utilize the VPN connection. This way, you don't have to fool around with vpn stuff within the yt-dlp container, and you don't have to modify the default gluetun image. Taking advantage of the "multiple containers inside a single Pod" feature gives an elegant separation of concerns here.

In my mind, the best way to utilize this project is to write another operator on top of it. Create a CRD that represents the work that needs to be done, and then your custom operator automates the process of reserving VPN credentials for each job.
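To make the "reserving VPN credentials for each job" idea concrete, here is a minimal in-memory sketch of the bookkeeping such a custom operator would need. It is not vpn-operator's API: `Credential`, `CredentialPool`, and the account names are hypothetical, and a real operator would keep this state in cluster resources rather than in a Python object.

```python
from dataclasses import dataclass, field


@dataclass
class Credential:
    """One VPN account; max_connections models the provider's connection limit."""
    name: str
    max_connections: int
    holders: set = field(default_factory=set)


class CredentialPool:
    """Hypothetical stand-in for the state an operator would track in the cluster."""

    def __init__(self, credentials):
        self.credentials = list(credentials)

    def reserve(self, job):
        """Return the name of a credential for this job, or None if every slot is taken."""
        # A job that already holds a slot keeps it, so retried reconciles are idempotent.
        for cred in self.credentials:
            if job in cred.holders:
                return cred.name
        # Otherwise hand out the first credential that still has a free slot.
        for cred in self.credentials:
            if len(cred.holders) < cred.max_connections:
                cred.holders.add(job)
                return cred.name
        return None

    def release(self, job):
        """Free the job's slot once its work is finished or has failed for good."""
        for cred in self.credentials:
            cred.holders.discard(job)
```

A job for which `reserve` returns `None` would simply stay pending until another job calls `release`.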

If you are looking to merely connect a pod to a VPN, then all you need is a gluetun sidecar. This is trivial and entails adding a gluetun container to your Pod and then specifying the credentials as environment variables. What this project handles is the logistical complexity of distributing VPN credentials across many pods. If you don't need the scale or automation of k8s, you probably won't benefit from this project.
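For the plain "gluetun sidecar" case, a Pod manifest might look roughly like the one built below. This is a sketch under assumptions, not a tested deployment: the function name, the Secret name, and its `username`/`password` keys are hypothetical, and the environment variable names follow gluetun's documentation as I understand it, so check the gluetun wiki for your provider.

```python
import json


def pod_with_gluetun(name, primary_image, secret_name, provider):
    """Build a Pod manifest with one primary container and a gluetun sidecar."""

    def from_secret(key):
        # Read the value from a Kubernetes Secret instead of hard-coding it.
        return {"secretKeyRef": {"name": secret_name, "key": key}}

    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [
                # The primary container needs no VPN configuration of its own.
                {"name": "primary", "image": primary_image},
                {
                    "name": "gluetun",
                    "image": "qmcgaw/gluetun",
                    # gluetun needs NET_ADMIN to set up the tunnel and firewall rules.
                    "securityContext": {"capabilities": {"add": ["NET_ADMIN"]}},
                    "env": [
                        {"name": "VPN_SERVICE_PROVIDER", "value": provider},
                        {"name": "OPENVPN_USER", "valueFrom": from_secret("username")},
                        {"name": "OPENVPN_PASSWORD", "valueFrom": from_secret("password")},
                    ],
                },
            ]
        },
    }


if __name__ == "__main__":
    manifest = pod_with_gluetun(
        "downloader", "example/downloader:latest", "vpn-credentials", "private internet access"
    )
    print(json.dumps(manifest, indent=2))
```

Since `kubectl apply -f` accepts JSON as well as YAML, the printed output can be saved to a file and applied directly.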

Before I started working on vpn-operator, the architecture & behavior of VPNs on kubernetes was also very opaque to me. The readme definitely needs better explanations, as I don't think even I would have easily understood this project before figuring it all out for myself.

Regarding installation, the project is shipped as a helm chart. chart/values.yaml has the default values that are used when the chart is installed with helm install. The exact command is available in the README.

Thanks again for the feedback. Please let me know if something needs more clarification or you have any other questions.

Author

pablomfc commented Apr 5, 2024

Thomas, thank you very much for taking the time to answer me so precisely. I now understand the purpose and the right way to use it. Unfortunately it is beyond my capacity to develop an operator at the moment.
I had assumed it would work like an istio/linkerd mesh. For example, in the case of Linkerd, we add an annotation such as "linkerd.io/inject: enabled" to the deployment manifest (https://linkerd.io/2-edge/features/proxy-injection/), and in a similar way gluetun would be automatically injected as a sidecar. It would be a simpler way to provision a VPN without having to manually deal with sidecar manifests and manage authentication.
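The annotation-driven injection described here boils down to a small mutation applied at admission time. The sketch below shows only that mutation logic in isolation; the annotation key `vpn.example.com/inject` and the function name are hypothetical, and this is neither how Linkerd implements injection nor something vpn-operator currently does.

```python
import copy

INJECT_ANNOTATION = "vpn.example.com/inject"  # hypothetical annotation key


def inject_sidecar(deployment, sidecar):
    """Return the Deployment unchanged, or a patched copy if its pod template opted in."""
    template = deployment["spec"]["template"]
    annotations = template.get("metadata", {}).get("annotations", {})
    if annotations.get(INJECT_ANNOTATION) != "enabled":
        return deployment
    patched = copy.deepcopy(deployment)
    containers = patched["spec"]["template"]["spec"]["containers"]
    # Skip the append if a container with this name exists, so re-running is harmless.
    if all(c["name"] != sidecar["name"] for c in containers):
        containers.append(copy.deepcopy(sidecar))
    return patched
```

In a real cluster this function would sit behind a mutating admission webhook, or be replaced by a policy engine that performs the same patch.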

Thanks,
Pablo

Owner

thavlik commented Apr 25, 2024

Pablo,

I like your idea! Perhaps this functionality does belong in vpn-operator. Right now, the sidecars must be configured manually, and the idea of it being automatic is attractive. This would require writing another controller. I'll keep it in the back of my mind, and of course, anyone inspired by this thread is more than welcome to take a stab at it.

Best,

Tom

Author

pablomfc commented Apr 27, 2024

Thomas,

I'll show you what I ended up doing. I did some research on admission control in Kubernetes and discovered that Kyverno has a feature called ClusterPolicy that can be used to inject a sidecar into a deployment.
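For readers unfamiliar with Kyverno, a mutating ClusterPolicy of this kind might look roughly like the manifest built below. It is a sketch, not the contents of the repository linked further down: the policy name, rule name, label selector, and function name are assumptions, and the field layout follows Kyverno's `mutate.patchStrategicMerge` rules as I understand them.

```python
import json


def gluetun_injection_policy(label_key, label_value, sidecar):
    """Build a Kyverno ClusterPolicy that adds `sidecar` to matching Deployments."""
    return {
        "apiVersion": "kyverno.io/v1",
        "kind": "ClusterPolicy",
        "metadata": {"name": "inject-gluetun-sidecar"},  # hypothetical name
        "spec": {
            "rules": [
                {
                    "name": "inject-gluetun",
                    # Only Deployments carrying the chosen label are mutated.
                    "match": {
                        "any": [
                            {
                                "resources": {
                                    "kinds": ["Deployment"],
                                    "selector": {"matchLabels": {label_key: label_value}},
                                }
                            }
                        ]
                    },
                    # Containers are merged by name, so the sidecar is added
                    # alongside the containers the Deployment already declares.
                    "mutate": {
                        "patchStrategicMerge": {
                            "spec": {"template": {"spec": {"containers": [sidecar]}}}
                        }
                    },
                }
            ]
        },
    }


if __name__ == "__main__":
    sidecar = {"name": "gluetun", "image": "qmcgaw/gluetun"}
    print(json.dumps(gluetun_injection_policy("vpn", "enabled", sidecar), indent=2))
```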

In my use case I needed to activate VPNs for about 10 to 15 pods, so VPN providers that limit accounts to 5-10 simultaneous connections did not meet my needs.

I made a list of providers that did not limit simultaneous connections:

Private Internet Access (PIA)
Surfshark
Windscribe
IPVanish

I also tested with Mullvad (https://mullvad.net/). Although they state a limit of 5 simultaneous connections, in practice I did not see it enforced, but I preferred not to take any risks and selected PIA as the best option.

I created a repository so you can see my implementation:
https://github.com/pablomfc/gluetun-sidecar

One thing I would like to do is be able to subscribe to several providers and find a way to randomly distribute the credentials among Pods, but for now I don't know how to do this. So I'm only using one provider.
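One simple way to sketch the random distribution is to expand each provider into as many "slots" as it allows connections, shuffle the slots, and hand one to each Pod. The function name, provider keys, and capacities below are made up for illustration, and this only computes the mapping; wiring the result into Secrets or sidecar environment variables is left out.

```python
import random


def assign_credentials(pods, capacity, seed=None):
    """Randomly map each pod to a provider without exceeding any provider's cap."""
    # One list entry per available connection, e.g. {"a": 2} becomes ["a", "a"].
    slots = [provider for provider, n in capacity.items() for _ in range(n)]
    if len(pods) > len(slots):
        raise ValueError("not enough VPN connection slots for all pods")
    # A private Random instance keeps the global random state untouched;
    # passing a seed makes the assignment reproducible.
    random.Random(seed).shuffle(slots)
    return dict(zip(pods, slots))


if __name__ == "__main__":
    pods = [f"worker-{i}" for i in range(12)]
    print(assign_credentials(pods, {"provider-a": 5, "provider-b": 10}, seed=0))
```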

Best regards,
Pablo
