-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Bubblewrap implementation for sandboxing #153
base: master
Are you sure you want to change the base?
Conversation
Required as temporary directory is used for storing some data
0d510b1
to
84719c2
Compare
84719c2
to
261a5df
Compare
Seeemmmss to work, at least to some degree. Tested it with the following context file: import subprocess
import socket
from datetime import timedelta
from pathlib import Path
import numpy as np
from damnit_ctx import Variable
@Variable(title="Trains")
def n_trains(run):
return len(run.train_ids)
@Variable(title="Proposals")
def list_proposals(run):
return str(list(Path("/gpfs/exfel/exp/").glob("*")))
@Variable(title="Slurm")
def srun(run):
return subprocess.run(
["sacct"],
text=True,
stdout = subprocess.PIPE,
stderr = subprocess.STDOUT,
).stdout
@Variable(title="Web")
def ping(run):
return subprocess.run(
["curl", "example.com"],
text=True,
stdout = subprocess.PIPE,
stderr = subprocess.STDOUT,
).stdout
@Variable(title="SSH")
def ping(run):
return subprocess.run(
["ssh", "max-exfl", "hostname"],
text=True,
stdout = subprocess.PIPE,
stderr = subprocess.STDOUT,
).stdout
@Variable(title="Slurm Executed", cluster=True)
def ping(run):
host = socket.getfqdn()
proposals = len(list(Path("/gpfs/exfel/exp/").glob("*/*/*")))
return f"{host} - {proposals}" Which updates the sqlite database and creates the extracted data files successfully. The output is: Full output
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One other thing, could you write some docs about this in backend.md
? Including how to enable/disable it.
Also, your code is outstanding :) I love all the documentation and type hints ❤️
self.extract_procs_queue.put((proposal, run, extract_proc)) | ||
|
||
def listen(): | ||
def listen(sandbox: bool): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we move the sandbox
setting to the databases metameta
table? That way we can set it relatively easily with amore-proto db-config
and we wouldn't have to restart the listener after changing it. It would also make it simpler to move to a centralized listener in the future and have different sandbox settings for different proposals.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm yeah, that was my first idea, but it depends on how 'secure' the configuration should be, since if it's in the database then users could enable/disable the sandboxing as easily as we could, unless the db is read only to the DAMNIT user, which would break a lot of things.
Keeping the setting as part of how the process is started means that the user running the listener could have sandboxing on, while others can still modify the DB/run reprocessing themselves.
I think options are:
- Make it part of the table with tables as read-only, enforces sandboxing for processes, but breaks user write permissions to the table.
- Make it part of the table, still read-write, risk of users disabling sandboxing.
- Some extra configuration file in the proposal directory/for the listener which is read-only to the DAMNIT user. This kind of config may end up being required as part of the move to a central listener anyway.
- Something set as part of the supervisor config?
- Leave it as a flag.
With all options ending in "for now" 😛
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say lets go with option 2 for now, and later we can move it to the centralized listener settings (which should be inaccessible by users).
|
||
|
||
@pytest.fixture | ||
def bubblewrap(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpicking, could the fixtures go into conftest.py
with the others?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One other thing, could you write some docs about this in
backend.md
? Including how to enable/disable it.
Yep, I'll add some stuff there, including notes on when it should be disabled.
Also, your code is outstanding :) I love all the documentation and type hints ❤️
Aha thanks, my IDE blinds me with warnings if I don't 😂 😂 good motivator
damnit/cli.py
Outdated
|
||
elif args.subcmd == 'reprocess': | ||
# Hide some logging from Kafka to make things more readable | ||
logging.getLogger('kafka').setLevel(logging.WARNING) | ||
|
||
from .backend.extract_data import reprocess | ||
reprocess(args.run, args.proposal, args.match, args.mock) | ||
reprocess(args.run, args.proposal, args.match, args.mock, args.no_sandbox) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👀
self.extract_procs_queue.put((proposal, run, extract_proc)) | ||
|
||
def listen(): | ||
def listen(sandbox: bool): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm yeah, that was my first idea, but it depends on how 'secure' the configuration should be, since if it's in the database then users could enable/disable the sandboxing as easily as we could, unless the db is read only to the DAMNIT user, which would break a lot of things.
Keeping the setting as part of how the process is started means that the user running the listener could have sandboxing on, while others can still modify the DB/run reprocessing themselves.
I think options are:
- Make it part of the table with tables as read-only, enforces sandboxing for processes, but breaks user write permissions to the table.
- Make it part of the table, still read-write, risk of users disabling sandboxing.
- Some extra configuration file in the proposal directory/for the listener which is read-only to the DAMNIT user. This kind of config may end up being required as part of the move to a central listener anyway.
- Something set as part of the supervisor config?
- Leave it as a flag.
With all options ending in "for now" 😛
|
||
|
||
@pytest.fixture | ||
def bubblewrap(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure!
if venv == "False": | ||
return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be surprising that this method is a complete no-op if the Python you point to isn't a venv. Especially if people use conda envs - they're technically not venvs, but I think people usually expect things to work in the same way.
Would it make sense to turn this into add_bind_python
and try to do the right thing with the target Python, venv or not?
It could also start by adding the main env directory (i.e. the bit before bin/python
) and only add extra paths if they're not under that; it should be the same, but a shorter command is easier to make sense of if we ever need to.
(Also, I think you'd actually end up with "False\n"
here if the target is not a venv, so the check would fail. It might be worth using JSON to send details back, just to avoid fiddly details like this.)
@@ -106,6 +108,24 @@ def extract_in_subprocess( | |||
for m in match: | |||
args.extend(['--match', m]) | |||
|
|||
if sandbox: | |||
bubblewrap = Bubblewrap() | |||
with contextlib.suppress(Exception): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
with contextlib.suppress(Exception): | |
with contextlib.suppress(FileNotFoundError): |
Is that what we're trying to catch? Do we expect it to come up in real use, or only in testing?
Adds sandboxing via bubblewrap for context file execution, closes #150.
Main functionality is implemented via a
Bubblewrap
class which is a convenience class for building up the CLI arguments/flags/mounts required to sandbox a python process to only have access to the data from a single proposal directory.There's also a new flag
--no-sandbox
to disable the sandboxing feature. This can be set at launch time for the listener and will be propagated to other subprocess commands down to the final relevantextract_in_subprocess
call.Summary
Bubblewrap
implements:__init__
: initializes with some default flags and bind mounts for running the sandboxed process, these defaults are all security/isolation related (unshare/disable namespaces) or required binds/settings (e.g. share network, bind/bin
,/lib
, etc...).add_bind
: main method for adding some bind mount with asource
,destination
, and a flag to set the mount to read only or not.add_bind_proposal
which takes in a proposal number, finds the directory, bind mounts the directory, and resolves the top-level symlinks in the proposal directory, then bind mounts those in as well.add_bind_venv
which takes the path to a python executable and, if it is running in a venv, bind mounts all paths required by it to the sandbox.build_command
takes in the command to sandbox and prepends the full bubblewrap command to it, then returns a list that can be called bysubprocess
to start the sandboxed command.There are some tests to check that the command is built (at least somewhat) correctly, but they're not that robust since I'm not sure how reliable testing with something bubblewrap is when running in a CI environment.
Example
An example of building up the commands would be:
Which creates:
Questions
Main one is... where should this go? - Went with making it part of `extract_in_subprocess`
It can be added anywhere there is a subprocess command, or even in the
ctxrunner
:DAMNIT/damnit/backend/listener.py
Line 150 in 377af28
DAMNIT/damnit/backend/extract_data.py
Line 56 in 377af28
DAMNIT/damnit/backend/extract_data.py
Line 283 in 377af28
DAMNIT/damnit/ctxsupport/ctxrunner.py
Line 410 in 377af28
--sandbox
which, if present, re-executes itself within the sandbox.Other questions are:
venv
(if there is one). Alternative would be to mount everything as read only by default, and explicitly pass a single output directory which is writable. - spoke to James, decided to leave proposal dir as normal mount instead of having everything read only and one output directory due to other tools potentially writing to other directories.--die-with-parent
is set to kill the sandbox if the parent (DAMNIT) process dies, I assume that's desirable? - spoke to James, seems fine.--unshare-all
and some of the binds break (on purpose) authentication, slurm, ssh, etc..., I assume people don't expect to be able to have some function in a context file that runs a subprocess with ssh or something? - spoke to James, mentioned that xwiz will run slurm commands on its own which would break, added in option to disable sandboxing when the listener is spawned.TODO: