-
Notifications
You must be signed in to change notification settings - Fork 47
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Log Pipeline Extensions #121
Conversation
Note that this commit is WIP and although the goal is to have this merged, I would consider this as the common ground to get a feeling about some conventions monasca-docker follows. |
Interesting work! I have no objections to merging once everything's ready. One quick issue: CI/CD integration is using If I'm interpreting things right, the log pipeline is in a separate docker-compose yaml file? We may need to make some changes to |
@timothyb89 ideally I wanted to make create another yml file called metric-pipeline.yml. In other words there would be a YML that would describe all components for metric processing. That said, the 3rd YML would be for common stuff (zookeeper, kafka AFAIR) - however, that might be a bit too much for now and in overall - so left that for now. Going back to this PR. Yeah, there is seperate log-pipeline.yml. My goal was to minimize the impact over what you've already done. There is just one common part - the kafka topics. Maybe there's a way to extend a service to include just those new components, or we might try and define a single-job container that would depend on kafka, do what already has been done for metric pipeline and do some extra job required by log pipeline. Anyway that's up to you I guess. I would just ask, from my part, to take closer look at monasca-log-api image, that present a bit different approach to to the topic of building python-based image. I am curious about your opinion on that. |
@timothyb89 I ran into the problem with multi-stage build. Does docker in travis supports that ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re: multi-stage builds, it should be possible to upgrade according to https://docs.travis-ci.com/user/docker/#Installing-a-newer-Docker-version - I'll look into that soon
I think it would be good to move the kafka topics script into it's own init container like mysql-init/influx-init/etc. Then it should be possible to reuse it for any optional components.
monasca-log-api/Dockerfile
Outdated
python setup.py install && \ | ||
cd / && \ | ||
rm -rf /monasca-log-api && \ | ||
apk del build-dep |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main caveat here is that apk del build-dep
is run in another layer so without --squash
we'll end up including all the build dependencies in the resulting image anyway. The log API image ends up being ~30 MiB larger than the vanilla monasca-api's right now (as measured by docker hub, 70 MB vs 100 MB) since the version of Docker on Travis doesn't have --squash
available/enabled.
In general I like the idea of REBUILD_...
args (I made a point of supporting them in dbuild which runs our CI builds) however they depend on docker build --squash
to produce small images. Unfortunately squash functionality is still experimental and seemed to behave poorly when I last evaluated it. (I'd tested it with the thresh container but ended up removing the REBUILD instructions to keep builds small in Travis).
It might be worthwhile to revisit enabling --squash
in Travis (especially with #80), so I think there's 3 options:
- don't use
REBUILD_...
args yet (longer build times for devs, smaller images from Travis) - use
REBUILD_
, ignore large images from Travis (shorter build times for devs, larger images from Travis) - update Docker and enable experimental flags in Travis (short and small builds)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
monasca-log-api/Dockerfile
Outdated
@@ -0,0 +1,58 @@ | |||
FROM python:2-alpine |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think installing python via alpine's package manager results in a (slightly) smaller image than python:alpine but probably not enough to worry about. I would say we should use python:2-alpine3.6
though, that base tag uses alpine:3.4.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
About that I had mixed feeling.
From one side I'd prefer reusing official images instead of manually installing Python.
On the other side my biggest concern is lack of possibility to easily build Python3 based container.
monasca-api does not have Py3 support, but log-api is different beast, it was much easier to have it implemented there. Not to mention that recently we've managed to enable monasca-notification (still some job is needed, but at least we won't have regression of failing tests).
Having all that said, I think I will simply rollback to monasca-api approach here to enable that possibility.
monasca-log-api/Dockerfile
Outdated
pip install --no-cache-dir Jinja2 gunicorn -c $CONSTRAINTS_FILE | ||
|
||
ARG REBUILD_CHECKOUT=1 | ||
RUN git clone $LOG_API_REPO --depth 1 --branch $LOG_API_BRANCH monasca-log-api && \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've been using a four-step clone in most of our other Dockerfiles since it gives more options for ..._BRANCH
. For instance here it would let us pass --build-arg LOG_API_BRANCH=refs/changes/43/485443/2
to build directly from an OpenStack patch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Point taken.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, rollback to 4-step-clone will be in next commit.
@mhoppal @timothyb89 - I've reached, or at least I believe, I reached a point where is pretty much nothing that I could add over here. There is, though, one glitch - that I am not really sure. The fact that even if you lunch only metric-pipeline (a.k.a. If an answer is no-no ( ;-) ) I will simply write down all the documentation requires here and someone will pick up kafka-init container (actually I could start working on this now, but I'd need some help with helm later on). What do you think ? |
runtime. | ||
|
||
The config file sources are available [in the repository][5]. If necessary, the | ||
generated config files can be viewed at runtime by running: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
by running?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
| `ACCESS_LOG_FORMAT` | `%(asctime)s [%(process)d] gunicorn.access [%(levelname)s] %(message)s` | Log format for access log | | ||
| `ACCESS_LOG_FIELDS` | `%(h)s %(l)s %(u)s %(t)s %(r)s %(s)s %(b)s "%(f)s" "%(a)s" %(L)s` | Access log fields | | ||
|
||
If additional values need to be overridden, new config files or jinja2 templates |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be nice to have some info how to do this replacements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I copied that over from the api documentation. Can we do this in another PR ?
|---------------------------|------------------|------------------------------------| | ||
| `ZOOKEEPER_URI` | `zookeeper:2181` | An URI to Zookeeper server | | ||
| `KAFKA_URI` | `kafka:9092` | The host and port for kafka | | ||
| `KAFKA_WAIT_FOR_TOPICS` | `log-transformed,metrics` | Topics to wait on at startup | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be transformed-log
?
| Variable | Default | Description | | ||
|----------------------------|-------------------|-------------------------------------| | ||
| `ZOOKEEPER_URI` | `zookeeper:2181` | An URI to Zookeeper server | | ||
| `KAFKA_WAIT_FOR_TOPICS` | `log-transformed` | Topics to wait on at startup | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
transformed-log
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
ELASTICSEARCH_SNIFFING=true \ | ||
ELASTICSEARCH_SNIFFING_DELAY=5 \ | ||
ZOOKEEPER_URI=zookeeper:2181 \ | ||
KAFKA_WAIT_FOR_TOPICS=log-transformed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
transformed-log
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be transformed-log
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh my...got lost in all that :/
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT | ||
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the | ||
# License for the specific language governing permissions and limitations | ||
# under the License. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add empty line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
monasca-log-transformer/template.py
Outdated
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the | ||
# License for the specific language governing permissions and limitations | ||
# under the License. | ||
from __future__ import print_function |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add empty line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT | ||
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the | ||
# License for the specific language governing permissions and limitations | ||
# under the License. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add empty line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT | ||
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the | ||
# License for the specific language governing permissions and limitations | ||
# under the License. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add empty line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
monasca-log-api/README.md
Outdated
Tags | ||
---- | ||
|
||
TBD |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fill with info
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done 👍
monasca-log-transformer/build.yml
Outdated
variants: | ||
- tag: 0.0.1 | ||
aliases: | ||
- latest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be - :latest
- causing a tag parse error in CI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, done - will be in next commit.
@timothyb89 any idea what might gone wrong this time :/. I am looking at the logs but cannot determine the cause :/ |
Full log pipeline to monasca-docker. Pipeline is defined in docker-compose extensions `log-pipeline.yml` to minimize an impact on metric pipeline. log-pipeline features: * monasca-log-{api,persister,transformer,metrics} * elasticsearch + template loading * kibana
Following commit completes several items: * added build.yml to integrate into existing CI * removed all images, apart from master and stable/ocata * made constraints control a little bit easier with just contstrains branch variable
Previously tags were build in wrong way. Now they are correctly written using `:` as prefix
Introducing 3 new checkpoints for monasca-log-api docker image: - REBUILD_DEPENDENCIES - REBUILD_CHECKOUT - REBUILD_CONFIG
Commit adjusts monasca-log-{metrics,persister,transformer} to: * include build.yml to be picked up by CI * shortened images names (removed monasca- prefix)
Commits introduces build checkpoints for: * monasca-log-metrics * monasca-log-persister * monasca-log-transformer
log-pipeline.yml includes bad references to monasca-log-{metrics,transformer,persister} images
Right now it is possible to enable or disable monasca-kibana-plugin with environment variable MONASCA_PLUGIN_ENABLED. By default the plugin is disabled, to allow to access Kibana.
* removed FROM python image, using alpine instead and installing Python by hand * removed one REBUILD_ step, for now it is better to keep usage of these low until newer docker is available in travis
All images for monasca-docker should be placed under monasca namespace
monasca-api variable for the port name starts with `MONASCA_CONTAINER_` prefix. Make equaivalent log-api variable to follow the same principle.
Being explic in build.yml
Now, when there is a separate image to create topic in kafka, log pipeline should now affect the metric deployment.
kibana plugin -i monasca-kibana-plugin -u file:///monasca-kibana-plugin.tar.gz && \ | ||
rm -rf /monasca-kibana-plugin.tar.gz | ||
|
||
CMD /start |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use ENTRYPOINT if you don't want developers to change the executable that is run when the container starts. Prefer ENTRYPOINT than CMD when building executable Docker image
$KEYSTONE_ADMIN_DOMAIN \ | ||
$MONASCA_CONTAINER_LOG_API_PORT | ||
|
||
CMD ["/start.sh"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use ENTRYPOINT if you don't want developers to change the executable that is run when the container starts. Prefer ENTRYPOINT than CMD when building executable Docker image
COPY log-metrics* /etc/monasca/ | ||
COPY template.py start.sh kafka_wait_for_topics.py / | ||
|
||
CMD ["/start.sh"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use ENTRYPOINT if you don't want developers to change the executable that is run when the container starts. Prefer ENTRYPOINT than CMD when building executable Docker image
COPY log-transformer* /etc/monasca/ | ||
COPY template.py start.sh kafka_wait_for_topics.py / | ||
|
||
CMD ["/start.sh"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use ENTRYPOINT if you don't want developers to change the executable that is run when the container starts. Prefer ENTRYPOINT than CMD when building executable Docker image
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In every image there's a CMD
is used. In that's a bad practice example, as you told me face-to-face, I guess we should face cover other images, in separate PR. And, as usual, after that update that PR to include this solution.
@timothyb89 what do you think about that ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally I tend to like CMD since you can more easily override with e.g. sh
if needed for test/debugging. Plus eventually I'd like to see our containers all use tini which recommends setting the entrypoint to a different executable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@matrixik could you take & run this one today ? I wonder is there anything more I could've missed ;/ |
Same as monasca-api, monasca-log-api should use memcache to speed up token related operations.
@timothyb89 can I ask you to take a closer look at this one ? We're kind of approaching a deadline @ Fujitsu and we would like to have this inside the repository. I believe we have some items to cover here (launching a tempests / CI, include all that inside travis etc - but that's a lot of items we need to discuss anyway [should I write an issue for that ?]). Thx. |
Looks like it's working for me. |
Just some random complain: I hate github PRs for reviewing longer/bigger code... In compare Gerrit feels like heaven... |
I wonder if we should add @timothyb89 do you agree ? |
@timothyb89 @mhoppal friendly ping ;) |
@kornicameister yeah, we'll need to add them after since we won't know the timestamps in advance |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other than the kibana image tag I think this is working fine on my machine. Will merge now since we'll need to add log pipeline versions in .env
in a follow-up PR anyway.
Thanks for the change!
- kafka | ||
|
||
kibana: | ||
image: monasca/kibana:4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be monasca/kibana:4.6.3-master
? Everything seems to be working on my end with the tag here changed to something valid.
Unrelated to this patch, but the agent seems to be broken right now, keystone auth is failing for some reason. There might be CI failures until we figure out the cause. |
Full log pipeline to monasca-docker.
Pipeline is defined in docker-compose extensions
log-pipeline.yml
to minimize an impact onmetric pipeline.
log-pipeline features: