Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: TopicOperator's use of SSL and auth for Cruise Control cannot be disabled anymore #10289

Open
Pinimo opened this issue Jul 1, 2024 · 11 comments

Comments

@Pinimo
Copy link

Pinimo commented Jul 1, 2024

Bug Description

Starting with #9483 (which holds awesome work otherwise 馃槄), we found the TopicOperator will necessarily speak SSL with Cruise Control. This seems to be a regression, because in some setups, it is not easy to use SSL on Cruise Control backends, especially when using a CC front-end UI (as mentioned in this blog post's prerequisites).

When changing a topic replication factor, the TopicOperator will throw the following error:

ReplicasChangeHandler:299 - Replicas change failed, javax.net.ssl.SSLHandshakeException: Remote host terminated the handshake

Indeed, there is no SSL on Cruise Control side, so the Topic Operator's attempt to negociate SSL is in vain.

We tried to change the TopicOperator's SSL behaviour with the following configuration, but it turned out to be impossible, because the STRIMZI_CRUISE_CONTROL_SSL_ENABLED environment variable is hard-coded to true here. In the case of the following scenario, this error is thrown: User defined container template environment variable STRIMZI_CRUISE_CONTROL_SSL_ENABLED is already in use and will be ignored.

apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: test-kafka
  namespace: strimzi-test
spec:
  cruiseControl:
    config:
      webserver.security.enable: false  <-- ask Cruise Control backend to use no auth
      webserver.ssl.enable: false       <-- nor any SSL
  entityOperator:
    template:
      topicOperatorContainer:
        env:
        - name: STRIMZI_CRUISE_CONTROL_SSL_ENABLED
          value: "false"    <-- ask topicOperator not to speak SSL to Cruise Control

Thank you for your help 鉂わ笍

I guess it will be necessary to modify or enrich the way the addContainerEnvsToExistingEnvs function works (for instance by making it possible to add the "true" value as a default value) or is called (for instance by calling the method only after the template vars are already loaded in the configuration).

PS. Beyond auth and SSL, other parameters are also affected by this issue.

Steps to reproduce

Deploy the following resource:

apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: test-kafka
  namespace: strimzi-bug-cruise-control-no-ssl
spec:
  cruiseControl:
    config:
      webserver.security.enable: false
      webserver.ssl.enable: false
  entityOperator:
    template:
      topicOperatorContainer:
        env:
        - name: STRIMZI_CRUISE_CONTROL_SSL_ENABLED
          value: "false"
  kafka:
    replicas: 3
    resources:
      limits:
        memory: 2Gi
      requests:
        cpu: 250m
        memory: 2Gi
---
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  labels:
    app: test-kafka
    strimzi.io/cluster: test-kafka
  name: test-topic
  namespace: strimzi-bug-cruise-control-no-ssl
spec:
  partitions: 12
  replicas: 1

Once the cluster is created, kubectl edit kt/test-topic to change the replication factor to 2.

Monitor the logs of the TopicOperator, especially to see the configuration printed at startup.

Expected behavior

As the default value of STRIMZI_CRUISE_CONTROL_SSL_ENABLED is false, one expects this variable not to be true. But when printed at startup, it appears to be true. Digging through, I saw the cluster operator actually adds the environment variable as a result from a hard-coded setting.

If the environment variable is manually added in the Kafka spec (see first code sample above), the cluster operator ignores it.

Strimzi version

0.41

Kubernetes version

v1.28.9-gke.1289000

Installation method

Helm Chart

Infrastructure

GKE

Configuration files and logs

No response

Additional context

The security issue with Cruise Control is addressed using native Kubernetes filtering.

@Pinimo Pinimo changed the title [Bug]: TopicOperator use of SSL and auth for Cruise Control cannot be disabled anymore [Bug]: TopicOperator's use of SSL and auth for Cruise Control cannot be disabled anymore Jul 1, 2024
@scholzj
Copy link
Member

scholzj commented Jul 1, 2024

Honestly, I'm not sure we want to support all these different variants. They put a lot of effort on maintenance, testing etc. The STRIMZI_CRUISE_CONTROL_SSL_ENABLED variable is intended when using Topic Operator as a standalone installation outside of the Strimzi Kafka resource. It is not expected to be user-configurable when Topic OPerator is deployed through the Kafka CR.

@fvaleri
Copy link
Contributor

fvaleri commented Jul 1, 2024

Hi @Pinimo, thanks for reaching out.

That's not a regression because the RF change is a new feature, and it is working as designed. This is consistent with how Strimzi enforces the use TLS for its components, which can't be disabled. As @scholzj suggested, you can use the standalone deployment for maximum flexibility.

@scholzj
Copy link
Member

scholzj commented Jul 1, 2024

Let's discuss on the next community call (or the one afterwards as many maintainers might be missing next week) what to do about it. If we do not want to support this, we should probably also remove the guidance how to disable CC security in our docs. It might be also less needed after #10117 is merged in Strimzi 0.43.

@Pinimo
Copy link
Author

Pinimo commented Jul 2, 2024

Hello @scholzj and @fvaleri, many thanks for your quick answers. I understand indeed that those parameters add some risk and require more testing; I also understand that you designed this with SSL as a strong default in mind.

However, from a basic user point of view, I think it is reasonable to see it as a "regression" of the Strimzi ecosystem altogether. Indeed, as things are, it will not be possible to use a Cruise Control UI with Strimzi anymore, by lack of configuration flexibility.

If we do not want to support this, we should probably also remove the guidance how to disable CC security in our docs.

As far as I understand, the consequence is rather that we should remove the guidance to use the CC UI at all. I tried to activate TLS (and auth) everywhere, but the UI does not support this. This project is not well maintained and I doubt it will be easy for anyone to integrate it with Strimzi using SSL or auth any time soon. However, as of today, our team (BlaBlaCar) does not wish to stop using the UI, because it contains interesting monitoring information and can help with some operations.

Note: fixing the CC UI could also be an option to ensure consistency of the Strimzi ecosystem, if tweakable SSL is not an option for the TopicOperator. This seems technically possible by patching the proxy configuration of CC UI and mounting some secrets into its pod. (However, if the CC UI is deployed in another namespace than the one or multiple backends, the secret must first be replicated. For this reason, it might also be more practical to have the CC UI be managed by the operator directly instead of being deployed in standalone.)

@scholzj
Copy link
Member

scholzj commented Jul 2, 2024

TBH, I'm not aware of any guidance for how to use Cruise Control UI in Strimzi. I'm not aware of it being covered in the documentation or anything like that. As far as I know, there was just one blog post a long time ago. I also don't understand why would the Cruise Control UI not support TLS. If it indeed doesn't, that is maybe something what should be contributed to Cruise Control in the first place?

But keep in mind that from the beginning, the Cruise Control integration in Strimzi was designed around the operator pattern and around the KafkaRebalance resources. Using it differently might cause problems and conflicts with the operator.

@Pinimo
Copy link
Author

Pinimo commented Jul 2, 2024

TBH, I'm not aware of any guidance for how to use Cruise Control UI in Strimzi. I'm not aware of it being covered in the documentation or anything like that. As far as I know, there was just one blog post a long time ago.

Yes, indeed, it's not in the docs, my bad, but rather in this 2023 blog post: https://strimzi.io/blog/2023/01/11/hacking-for-cruise-control-ui/

I agree that part of my problem is around the CC UI not being updated anymore. However, I think several users could fall against the hard-coded env variable override, that felt a bit counter-intuitive when I understood it (because not really documented). As I understood it (but I might have been wrong), the env variables defined in a component's containerTemplate seemed to have precedence over defaults. But here, it's not the case. Is it really what was intended, and does it not bring some confusion?

@scholzj
Copy link
Member

scholzj commented Jul 2, 2024

As I understood it (but I might have been wrong), the env variables defined in a component's containerTemplate seemed to have precedence over defaults. But here, it's not the case. Is it really what was intended, and does it not bring some confusion?

No, the environment variables set by the operator have always a priority over what you configure int he template. The template is basically designed for additional environment variables, not to overwrite what the operator does.

@scholzj
Copy link
Member

scholzj commented Jul 2, 2024

This is also listed in the docs: https://strimzi.io/docs/operators/latest/full/configuring.html#type-ContainerTemplate-reference

If you set a custom environment variable that is already in use by Strimzi, it is ignored and a warning is recorded in the log.

@Pinimo
Copy link
Author

Pinimo commented Jul 2, 2024

You mentioned a link with PR #10117, and I also see in the blog post mentioned above:

In a future version, the Strimzi community hopes to add the option for creating custom Cruise Control API users. This would remove the requirement for the Cruise Control API security settings to be disabled for applications like the Cruise Control UI to work with Strimzi.

That sounds nice! But I'm a bit confused about how the UI is supposed to use user credentials, as I did not see any auth mechanism in its code (at first view).

(sorry, the message "slipped away" during edition - published too early)

@Pinimo
Copy link
Author

Pinimo commented Jul 2, 2024

At any rate, thank you for having answered my questions 馃槈 I'm not yet sure how we are going to upgrade to 0.41, but your support is highly appreciable 馃

@scholzj
Copy link
Member

scholzj commented Jul 2, 2024

That sounds nice! But I'm a bit confused about how the UI is supposed to use user credentials, as I did not see any auth mechanism in its code (at first view).

Honestly, I do not know anything about the Cruise Control UI. So I do not know what kind of authentication does it support. Also, if it doesn't support TLS, the API users will not solve that either.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants