Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Delete unused containers #173

Closed
fnothaft opened this issue Aug 29, 2016 · 13 comments
Closed

Delete unused containers #173

fnothaft opened this issue Aug 29, 2016 · 13 comments
Assignees

Comments

@fnothaft
Copy link
Contributor

This was bothering me.

  • bwa (replaced by bwakit)
  • dupe containers:
    • mapsplice-2.0.1.9
    • rsem-1.1.13
    • samtools-1.3
    • ubu-1.0

@jvivian I grepped through toil-scripts and didn't see any of these, so I assume they're good to remove:

fnothaft$ grep samtools-1.3 -R src/
fnothaft$ grep rsem-1.1.1.3 -R src/
fnothaft$ grep rsem-1.1.13 -R src/
fnothaft$ grep mapsplice-2.0 -R src/
fnothaft$ grep ubu-1.0 -R src/
fnothaft$ pwd
/Users/fnothaft/IdeaProjects/toil-scripts
fnothaft added a commit to fnothaft/cgl-docker-lib that referenced this issue Aug 29, 2016
@jvivian
Copy link
Collaborator

jvivian commented Aug 29, 2016

Samtools 1.3 is used in the SplAdder pipeline, which hasn't been integrated into the general purpose RNA-seq pipeline yet. (quay.io/ucsc_cgl/samtools:1.3--...)

The other tools are all part of the UNC pipeline (which we no longer support as a production pipeline). The duplicates are embarrassing though, please torch em.

@hannes-ucsc
Copy link
Contributor

How do we recreate those images if we lose the binaries?

@fnothaft
Copy link
Contributor Author

Well, we'd check out a revision of this repo from before we deleted the Dockerfiles, and we'd build from there. That being said, I think there's a pretty good argument for deleting them:

  • None of the dockerfiles listed above (except for samtools-1.3, which I will remove from Delete unused containers (resolves #173) #174) are used anymore in pipelines that are supported
  • All of the dockerfiles deleted except for samtools-1.3 have problems:
    • Except for bwa and samtools-1.3, the Dockerfiles we're deleting in Delete unused containers (resolves #173) #174 pull down binaries from weblinks, which could vanish at any point in time.
    • bwa builds from lh3/bwa ToT, which isn't best practices

I don't think the question (what if quay went away?) posted over on PR #174 is terribly important. I mean, yes, quay disappearing could happen, but if quay did go away, and all our pushed images disappeared, would we go back in time and rebuild all containers ever used in deprecated pipelines? I wager not. If we would do that, then we should have an internal backup of all containers that we've ever built on ToT, because there's no guarantee that we could do a fresh build of the containers that download resources from the web.

I don't think this is a terribly reasonable contingency to plan for. Not that I think quay is immune from failure—companies are fundamentally ephemeral—but, it's like asking what would happen if GitHub shut down. Do we mirror all GitHub repositories that we've ever built artifacts from in order to tolerate the disappearance of GitHub? Etc etc.

@fnothaft
Copy link
Contributor Author

As an aside, contingent on space, this:

If we would do that, then we should have an internal backup of all containers that we've ever built on ToT, because there's no guarantee that we could do a fresh build of the containers that download resources from the web.

Would not be unreasonable to do.

@hannes-ucsc
Copy link
Contributor

Re quay.io sinking being an unrealistic contingency: agreed. But there still is the possibility of phat-phingering and rogue/broken scripts that cause accidental deletion of images. Backup should be the contingency plan for that.

Re backup of images. Let's do that. I propose backing up each top-level image and each image it's based on as an object in S3. IOW, the entire image tree.

I also don't want to drag around unused ballast. It's not sustainable.

Checking out a specific commit is acceptable if you can find the specific commit. That can be laborious without tags to help you. We do tag the images with the commit of the project repo. That should be included in the backup.

And we need a policy governing which tool versions are kept and which are deleted.

@hannes-ucsc
Copy link
Contributor

See #175 for backup plan.

What do you think about the following policy:

A tool version can be deleted if it is no longer used by a currently supported version of toil-scripts (or the successor projects it is soon going to be broken up into). Deletions have to be submitted as separate PRs and require the explicit consent of all contributors.

@fnothaft
Copy link
Contributor Author

What do you think about the following policy:

A tool version can be deleted if it is no longer used by a currently supported version of toil-scripts (or the successor projects it is soon going to be broken up into). Deletions have to be submitted as separate PRs and require the explicit consent of all contributors.

So, I am not very good at making mistakes, but this seems like one of those policies where someone might still make a mistake. Since we are using revision control, it would be impossible to recover from a mistake, thus I propose that we strengthen the protocol further:

  • Any merges that delete code must be done by three people. One person must have the secret passcodes. When a merge is going to happen, this person must send the passcodes to two separate people, who then execute the merge command within 5 seconds of each other.
  • PRs that delete a docker image must be submitted as a hardcopy patch by registered mail. On the day it is received, it will be read aloud on scrum.
  • In addition to obtaining the explicit consent of all contributors, the PR must be confirmed by the Senate.

Sorry for the snark, but the proposal was a bit much. I'm all for protocol, but this is way overboard.

Deletions have to be submitted as separate PRs

Concretely, I don't think it makes sense to do them as separate PRs, but I think it would make sense to remove each tool in a separate commit. This way, if we discover that we need to pull back the Dockerfile for a tool we removed, we just revert that single commit, which would be pretty clean. If we really wanted, we could have a standardized commit message (to make the log easier to grep for said commit), but it should be sufficient to ensure that the commit message has the name of the tool we're removing in it.

Checking out a specific commit is acceptable if you can find the specific commit. That can be laborious without tags to help you. We do tag the images with the commit of the project repo. That should be included in the backup.

The "it's too hard to find the commit that removed a tool" is overstated. Since we have all of the Dockerfiles nested in subdirectories, it should take <30sec with git bisect and ls <docker-image-name> to find the last commit where a tool existed.

A tool version can be deleted if it is no longer used by a currently supported version of toil-scripts (or the successor projects it is soon going to be broken up into)

The downstream refactor of toil-scripts into toil-lib and a bunch of constituent projects should make it easy to track what Docker images we need. Specifically, we're moving all tools into toil_lib.tools, so, if the tool ain't in there, the image is free game to get deleted. I think there might be a few exceptions (vg is the one that comes to mind), but I think my next comment will handle this.

Quick aside before getting to my actual next comment: We should start tagging this repository, and the tags should line up with (future) toil-lib tags. We should make a pass and clean up all unused images after tagging.

require the explicit consent of all contributors.

Define contributors. Is that all 10 people who've committed code to this repo? If so, man, I'd hate to have to wrangle 10 people to get a minor change approved.

What I'd suggest instead, is that we make sure that all dockerfiles in this repo have up-to-date and comprehensive MAINTAINER directives, and then we ask the maintainers of the image to +1 the proposed change.

I'd be fine if we allowed anyone who's contributed to toil-scripts/cgl-docker-lib/etc to -1 a change as well, that seems reasonable.

@hannes-ucsc
Copy link
Contributor

Sorry for the snark, but the proposal was a bit much. I'm all for protocol, but this is way overboard.

Bear with me as I have good intentions. I want to ensure that we can reliably revert deletions and that we ask the right set of people, those who might be adversely affected by a deletion.

Concretely, I don't think it makes sense to do them as separate PRs, but I think it would make sense to remove each tool in a separate commit

Yes, that is better.

The "it's too hard to find the commit that removed a tool" is overstated.

This is a misquote of what I actually said.

it should take <30sec with git bisect and ls to find the last commit where a tool existed

It implies that you know where to look and that we never rename anything or restructure the source tree, break up the repository etc. etc. It assumes that the image was built from the revision of the file at the commit preceding the commit that deletes the tool.

I prefer tagging the image with the commit as we currently do now.

Define contributors

I take that back. For PRs with deletions, I propose pinging the people who we think might have an interest in the tools that are being deleted. This is not a definition but I hope we can just use common sense. I'm afraid that only asking people in MAINTAINERS is not enough.

@fnothaft
Copy link
Contributor Author

I want to ensure that we can reliably revert deletions

I think we're in agreement that the simple way to do this is to remove each tool in a separate commit, right?

and that we ask the right set of people, those who might be adversely affected by a deletion.

We're not deleting the image in quay, we're just deleting the Dockerfile for a tool in ToT. If someone needs to update it at a later point in time, we revert the deletion. As you pointed out, we currently tag all images with the revision that they were built from, so I really don't understand why we're debating whether it is tractable to find a commit where a tool existed? Unless I've missed something, no one is proposing changing the tagging scheme.

The "it's too hard to find the commit that removed a tool" is overstated.

This is a misquote of what I actually said.

Isn't that what we're literally debating here? Whether it is hard or not to find the last commit where a Dockerfile for a tool existed?

I take that back. For PRs with deletions, I propose pinging the people who we think might have an interest in the tools that are being deleted. This is not a definition but I hope we can just use common sense. I'm afraid that only asking people in MAINTAINERS is not enough.

If we tag the correct MAINTAINERS and keep the maintainers up to date, the maintainers should CC in the affected people...

@hannes-ucsc
Copy link
Contributor

hannes-ucsc commented Aug 30, 2016

so I really don't understand why we're debating whether it is tractable to find a commit where a tool existed? Unless I've missed something, no one is proposing changing the tagging scheme.

I got worried when you brought up git bissect as the contingency plan for deletions because in the scenarios that play out in my head it will neither be "tractable" (for the reasons I mentioned in my previous comment) nor useful. Say, when we're asked to reproduce a certain pipeline result but deleted both the Dockerfile and the image from quay.io. The commit that matters will be the one that the image was built from, not the one where the Dockerfile was deleted. As long as we agree to tag images and back that information up, we're good. In that sense it doesn't even matter whether the deletion was done in a separate commit. I proposed the single PR because I wanted deletions to have sufficient community exposure. But if you expect git bissect to be the weapon of choice, I have no qualm with a deletion in separate commit.

If we tag the correct MAINTAINERS and keep the maintainers up to date, the maintainers should CC in the affected people..

Team members using a tool in an actively maintained pipeline should also be considered. I personally think a MAINTAINERS file is pointless because GH makes that information is readily apparent.

@fnothaft
Copy link
Contributor Author

I got worried when you brought up git bissect as the contingency plan for deletions because in the scenarios that play out in my head it will neither be "tractable" (for the reasons I mentioned in my previous comment) nor useful. Say, when we're asked to reproduce a certain pipeline result but deleted both the Dockerfile and the image from quay.io. The commit that matters will be the one that the image was built from, not the one where the Dockerfile was deleted. As long as we agree to tag images and back that information up, we're good. In that sense it doesn't even matter whether the deletion was done in a separate commit. I proposed the single PR because I wanted deletions to have sufficient community exposure. But if you expect git bissect to be the weapon of choice, I have no qualm with a deletion in separate commit.

IDK, we're literally talking about 30 seconds with git bisect here. I mean, even if we decide to rearrange the repo to put images in completely randomly named directories, instead of doing ls <imagename> we do something like grep -q quay.io/ucsc_cgl/<imagename>. This just isn't rocket science, and we have the images tagged with their commit hashes anyways. I mean, sure, we could talk about bizarre contingency plans, but I don't think that's a good use of time.

Anyways, if we're worried about not being able to build old docker images in the future, we should probably fix all of the places where we're depending on URLs that are not guaranteed to be stable.

Team members using a tool in an actively maintained pipeline should also be considered. I personally think a MAINTAINERS file is pointless because GH makes that information is readily apparent.

I was referring to the MAINTAINER directive in the Dockerfile. I agree that this information is also available through GitHub. My point is more that the person who is maintaining the image should be responsible for knowing where it is used, and should be able to authoritatively comment whether it is OK to delete the container or not. If they can't be responsible for that, then that's a management problem.

@hannes-ucsc
Copy link
Contributor

I don't think we'll be able to agree on a policy so lets table that for now.

Pinging @jvivian, @arkal and @jpfeil for a comments. If they don't veto this issue, this has my consent but I'll +1 the PR explicitly. @jvivian mentioned that samtools-1.3 is in use. I think we should not delete it.

@fnothaft
Copy link
Contributor Author

Yeah, I need to break this out into separate commits anyways, so this isn't ready for a merge.

hannes-ucsc added a commit that referenced this issue Nov 3, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants