[LIVY-702]: Submit Spark apps to Kubernetes #451
Can this change be merged if there are no objections?
@askhatri Thanks for pushing this feature forward! As a side note, could you please add the original author, @jahstreet, to the author list? We've been using this feature for a year and truly appreciate Alex's initial effort in bringing it to Livy.
Hi @jshmchenxi, I have added the original author, @jahstreet, to the author list as you suggested. Please let me know if any further changes or corrections are required.
Thx for mentioning mate, appreciate the credits 🙏.
// When istio-proxy restarts, the access to K8s API from livy could be down
// until envoy comes back, which could take up to 30 seconds
Is istio/envoy or any ingress controller a prerequisite to using Livy with K8s?
istio/envoy is optional for using Livy with K8s. It is only used to collect the Livy logs for audit purposes.
great functionality, this will surely be beneficial
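The diff comment at the top of this thread notes that access to the K8s API can be down for up to 30 seconds while envoy comes back. As a rough illustration of how a caller might ride out such a window, here is a minimal sketch of a bounded retry. `RetrySketch`, `withRetry`, and its parameters are hypothetical names chosen for this example and are not taken from the Livy code in this PR.

```scala
import scala.annotation.tailrec
import scala.util.{Failure, Success, Try}

// Hypothetical helper, not Livy's implementation: retry a call (for example,
// a request to the K8s API) a bounded number of times, sleeping between attempts.
object RetrySketch {
  @tailrec
  def withRetry[T](attemptsLeft: Int, delayMs: Long)(call: () => T): T = {
    Try(call()) match {
      case Success(value) => value
      case Failure(_) if attemptsLeft > 1 =>
        // The API may be briefly unreachable; wait and try again.
        Thread.sleep(delayMs)
        withRetry(attemptsLeft - 1, delayMs)(call)
      case Failure(error) =>
        // Out of attempts: surface the last error to the caller.
        throw error
    }
  }
}
```

With, say, 7 attempts and a 5000 ms delay, this sketch would sleep for at most 30 seconds in total before giving up; those numbers are illustrative only.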
val KUBERNETES_GRAFANA_LOKI_ENABLED = Entry("livy.server.kubernetes.grafana.loki.enabled", false)
val KUBERNETES_GRAFANA_URL = Entry("livy.server.kubernetes.grafana.url", "http://localhost:3000")
val KUBERNETES_GRAFANA_LOKI_DATASOURCE =
  Entry("livy.server.kubernetes.grafana.loki.datasource", "loki")
val KUBERNETES_GRAFANA_TIME_RANGE = Entry("livy.server.kubernetes.grafana.timeRange", "6h")
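To make the defaults in the snippet above easier to reason about, here is a small, self-contained sketch of how such entries could resolve against user overrides (for example, values parsed from `livy.conf`). The keys and default values are copied from the diff; the `Entry` case class, `GrafanaConfSketch`, and `resolve` are simplified, hypothetical stand-ins and not Livy's actual `LivyConf` machinery.

```scala
// Simplified stand-in for a config entry: a key plus its default value.
final case class Entry(key: String, default: Any)

object GrafanaConfSketch {
  // Keys and defaults as they appear in the diff above.
  val LokiEnabled = Entry("livy.server.kubernetes.grafana.loki.enabled", false)
  val GrafanaUrl = Entry("livy.server.kubernetes.grafana.url", "http://localhost:3000")
  val LokiDatasource = Entry("livy.server.kubernetes.grafana.loki.datasource", "loki")
  val TimeRange = Entry("livy.server.kubernetes.grafana.timeRange", "6h")

  // Look the key up in the user-supplied overrides and fall back to the
  // entry's default when the key is absent.
  def resolve(overrides: Map[String, String], entry: Entry): String =
    overrides.getOrElse(entry.key, entry.default.toString)
}
```

Under this sketch, a deployment that only sets `livy.server.kubernetes.grafana.loki.enabled` to `true` would still pick up the default Grafana URL, datasource name, and time range.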
It would be great to add some docs on what is expected of the Grafana and Loki installation in order to make use of these configs.
The same goes for installing Livy on K8s: we should give the community a guide on how to set this up, at least locally, to play with. That will help raise adoption.
I can help you with connecting the dots, ping me if you wanna discuss.
Ahh, I see it in https://github.com/askhatri/livycluster . It would be great to at least leave a link to it somewhere.
Sure, I will try to add this as part of LIVY-979. I will connect with you via email.
@askhatri it would be better to discuss it on LIVY-979 itself so that the whole community is across it.
Sure @vikas-saxena02.
// side car container for spark pods enabled?
val KUBERNETES_SPARK_SIDECAR_ENABLED =
  Entry("livy.server.kubernetes.spark.sidecar.enabled", true)
What is the use case for this flag?
A sidecar container in Kubernetes is a secondary container that runs alongside the main application container in the same pod. We can set the sidecar configuration in Livy using this flag.
What could be the intention for running a sidecar?
A sidecar container can be used to collect logs from the Spark application and forward them to a centralized logging system such as the Elasticsearch, Fluentd, and Kibana (EFK) stack.
Understood, thx for explaining. Does Livy add this sidecar, or should Livy handle Spark Pods that have a sidecar differently? What would happen if the Spark Pods have a sidecar but this flag is false, and how is that different from setting it to true? Just trying to understand it better.
Livy does not add a sidecar container for Spark. Instead, Livy uses a flag to determine the status of the Spark pod. If the Spark pod is running with a sidecar, the flag is set to true; otherwise, it is set to false.
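One way to picture why such a flag can matter: in Kubernetes, a pod whose sidecar is still running stays in the `Running` phase even after the main container has exited, so the pod phase alone may not reflect the Spark application's state. The sketch below models that idea with plain case classes. It is only one plausible reading of the answer above, and `ContainerStatus`, `PodStatus`, `SidecarSketch`, `appFinished`, and the container names are all hypothetical rather than Livy's actual types or logic.

```scala
// Hypothetical, simplified model of what a K8s client might report for a pod.
final case class ContainerStatus(name: String, terminated: Boolean)
final case class PodStatus(phase: String, containers: Seq[ContainerStatus])

object SidecarSketch {
  // sparkContainer is an assumed parameter naming the main Spark container.
  def appFinished(pod: PodStatus, sidecarEnabled: Boolean, sparkContainer: String): Boolean =
    if (sidecarEnabled) {
      // With a sidecar, the pod can stay "Running" after Spark exits,
      // so look at the Spark container on its own.
      pod.containers.exists(c => c.name == sparkContainer && c.terminated)
    } else {
      // Without a sidecar, the pod phase alone is taken as the answer.
      pod.phase == "Succeeded" || pod.phase == "Failed"
    }
}
```

In this model, a pod with a terminated Spark container and a live log-forwarding sidecar is treated as finished only when the flag is true.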
Co-authored-by: Alex Sasnouskikh <[email protected]>
@askhatri the latest push was forced and I can see many of the checks have failed, can you please look into this?
Hi @vikas-saxena02, I have re-triggered the checks now by adding ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION=true.
@jahstreet @askhatri is there any documentation you can share that explains how to use Livy to run jobs on Spark on Kubernetes?
/approve
Yes, we will talk about packaging it.
/assign @gyogal
Hi @vikas-saxena02, I have assigned to @gyogal as suggested by you...!
Thanks @askhatri !! Hopefully this should be merged soon.
Thanks everyone for your input! It seems like there are no objections and the overall feedback is positive. If you find any issues once this PR is merged, please feel free to raise a ticket.
This pull request (PR) is the foundational PR for adding Kubernetes support in Apache Livy, originally found here (apache#249). This update includes a newer version of the Kubernetes client and adds code to display the Spark UI.

## Summary of the Proposed Changes

This PR introduces a method to submit Spark applications to a Kubernetes cluster. The key points covered include:

* Submitting batch sessions
* Submitting interactive sessions
* Monitoring sessions, collecting logs, and gathering diagnostic information
* Restoring session monitoring after restarts
* Garbage collection (GC) of created Kubernetes resources

JIRA link: https://issues.apache.org/jira/browse/LIVY-702

## How was this patch tested?

* Unit Tests: The patch has been verified through comprehensive unit tests.
* Manual Testing: Conducted manual testing using Kubernetes on Docker Desktop.
* Environment: Helm charts. For detailed instructions on testing using Helm charts, please refer to the documentation available at https://github.com/askhatri/livycluster

Co-authored-by: Asif Khatri <[email protected]>
Co-authored-by: Alex Sasnouskikh <[email protected]>
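For readers wondering what "submitting batch sessions" looks like from the client side, the sketch below builds (but does not send) a POST request to Livy's `/batches` REST endpoint. The `file`, `className`, and `conf` fields are part of Livy's batch API; the object name `BatchSubmitSketch`, its helper functions, and any jar path, class name, or URL passed in are assumptions for illustration and are not taken from this PR. It needs JDK 11 or newer for `java.net.http`.

```scala
import java.net.URI
import java.net.http.HttpRequest

// Sketch only: assembles a batch-submission request without sending it.
object BatchSubmitSketch {
  // Escape backslashes and double quotes so the value is a valid JSON string.
  private def quote(s: String): String =
    "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\""

  // Build the JSON body for POST /batches; conf keys are sorted for stable output.
  def batchBody(file: String, className: String, conf: Map[String, String]): String = {
    val confJson = conf.toSeq.sortBy(_._1)
      .map { case (k, v) => s"${quote(k)}:${quote(v)}" }
      .mkString("{", ",", "}")
    s"""{"file":${quote(file)},"className":${quote(className)},"conf":$confJson}"""
  }

  // Wrap the body in an HTTP POST against the given Livy base URL.
  def batchRequest(livyUrl: String, body: String): HttpRequest =
    HttpRequest.newBuilder(URI.create(s"$livyUrl/batches"))
      .header("Content-Type", "application/json")
      .POST(HttpRequest.BodyPublishers.ofString(body))
      .build()
}
```

A caller would pass the resulting request to a `java.net.http.HttpClient`. How the Livy server is pointed at the Kubernetes cluster is a server-side configuration matter and is deliberately left out of this sketch.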