[LIVY-702]: Submit Spark apps to Kubernetes #451

Merged: gyogal merged 1 commit into apache:master from LIVY-702 on Jul 10, 2024

askhatri
Contributor

This pull request (PR) is the foundational PR for adding Kubernetes support in Apache Livy, originally proposed in PR #249. This update includes a newer version of the Kubernetes client and adds code to display the Spark UI.

Summary of the Proposed Changes

This PR introduces a method to submit Spark applications to a Kubernetes cluster. The key points covered include:

  • Submitting batch sessions
  • Submitting interactive sessions
  • Monitoring sessions, collecting logs, and gathering diagnostic information
  • Restoring session monitoring after restarts
  • Garbage collection (GC) of created Kubernetes resources
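
As a rough illustration of the batch submission path listed above, the sketch below builds a request body for Livy's `POST /batches` REST endpoint, which accepts `file`, `className`, and `conf` fields. The object `BatchPayloadSketch`, its helper names, and every value passed to it are hypothetical and are not taken from this PR; `spark.kubernetes.namespace` and `spark.kubernetes.container.image` are standard Spark-on-Kubernetes properties, used here only as examples of what a client might put in `conf`.

```scala
// Hypothetical helper, not part of this PR: builds the JSON body
// that a client could send to Livy's POST /batches endpoint.
object BatchPayloadSketch {
  // Minimal JSON string escaping (backslashes and double quotes only).
  private def quote(s: String): String =
    "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\""

  // Renders the request body; conf keys are sorted for stable output.
  def batchJson(file: String, className: String, conf: Map[String, String]): String = {
    val confJson = conf.toSeq.sortBy(_._1)
      .map { case (k, v) => s"${quote(k)}:${quote(v)}" }
      .mkString("{", ",", "}")
    s"""{"file":${quote(file)},"className":${quote(className)},"conf":$confJson}"""
  }

  // Example call with placeholder values (assumptions, not from the PR).
  val example: String = batchJson(
    "local:///opt/spark/examples/jars/spark-examples.jar",
    "org.apache.spark.examples.SparkPi",
    Map(
      "spark.kubernetes.namespace" -> "default",
      "spark.kubernetes.container.image" -> "example/spark:latest"))
}
```

The resulting string could be sent with any HTTP client using a `Content-Type: application/json` header; the networking part is left out to keep the sketch self-contained.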

JIRA link: https://issues.apache.org/jira/browse/LIVY-702

How was this patch tested?

  • Unit Tests: The patch has been verified through comprehensive unit tests.
  • Manual Testing: Conducted manual testing using Kubernetes on Docker Desktop.
    • Environment: Helm charts.

For detailed instructions on testing using Helm charts, please refer to the documentation available at https://github.com/askhatri/livycluster.

@gyogal
Contributor

gyogal commented Jun 27, 2024

Thanks for working on this, @askhatri, and for retesting and updating the original code in PR #249. If there are no objections, we could merge this PR, and any smaller updates or fixes could be added in follow-up tickets before the 0.9 release.

@vikas-saxena02

Can this change be merged if there are no objections?

@vikas-saxena02

@gyogal @lmccay , can you please take a look at this PR?

@jshmchenxi

@askhatri Thanks for pushing this feature forward!

As a side note, could you please add the original author, @jahstreet, to the author list? We've been using this feature for a year and truly appreciate Alex's initial effort in bringing it to Livy.

@askhatri
Contributor Author

askhatri commented Jul 9, 2024

Hi @jshmchenxi, I have added the original author, @jahstreet, to the author list as you suggested. Please let me know if any further changes or corrections are required.

@jahstreet
Contributor

> Hi @jshmchenxi, I have added the original author, @jahstreet, to the author list as you suggested. Please let me know if any further changes or corrections are required.

Thx for mentioning mate, appreciate the credits 🙏 .

Comment on lines +137 to +138
// When istio-proxy restarts, the access to K8s API from livy could be down
// until envoy comes back, which could take upto 30 seconds
Contributor

Is istio/envoy or any ingress controller a prerequisite to using Livy with K8s?

Contributor Author

istio/envoy is optional for using Livy with K8s. It is only used to collect the Livy logs for audit purposes.

great functionality, this will surely be beneficial
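
For readers following this thread, the code comment on lines +137 to +138 describes Kubernetes API calls failing for up to 30 seconds while istio-proxy restarts. One common way to tolerate such an outage is a retry loop bounded by a deadline. The sketch below is an illustration only and is not the code from this PR; `RetrySketch`, `withRetry`, and both parameters are hypothetical names.

```scala
import scala.annotation.tailrec
import scala.util.{Failure, Success, Try}

// Hypothetical sketch, not Livy's implementation.
object RetrySketch {
  /** Retries `call` until it succeeds or roughly `maxWaitMs` has elapsed. */
  def withRetry[T](maxWaitMs: Long, delayMs: Long)(call: () => T): Try[T] = {
    val deadline = System.nanoTime() + maxWaitMs * 1000000L

    @tailrec
    def loop(): Try[T] = Try(call()) match {
      case ok @ Success(_) => ok
      case failed @ Failure(_) =>
        // Give up if another sleep would take us past the deadline.
        if (System.nanoTime() + delayMs * 1000000L > deadline) failed
        else { Thread.sleep(delayMs); loop() }
    }

    loop()
  }
}
```

With `maxWaitMs = 30000` the loop would cover the 30 second window mentioned in the comment; the delay between attempts is a tuning choice.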

Comment on lines +303 to +307
val KUBERNETES_GRAFANA_LOKI_ENABLED = Entry("livy.server.kubernetes.grafana.loki.enabled", false)
val KUBERNETES_GRAFANA_URL = Entry("livy.server.kubernetes.grafana.url", "http://localhost:3000")
val KUBERNETES_GRAFANA_LOKI_DATASOURCE =
  Entry("livy.server.kubernetes.grafana.loki.datasource", "loki")
val KUBERNETES_GRAFANA_TIME_RANGE = Entry("livy.server.kubernetes.grafana.timeRange", "6h")
Contributor

It would be great to add some docs about what is expected of the Grafana and Loki installation in order to make use of these configs.
The same goes for installing Livy on K8s: we should give the community some guidance on how to set this up, at least locally, to play with. That will help raise adoption.
I can help you connect the dots; ping me if you want to discuss.

Contributor

Ahh, I see it in https://github.com/askhatri/livycluster. It would be great to at least leave a link to it somewhere.

Contributor Author

Sure, I will try to add this as part of LIVY-979. I will connect with you via email.

@askhatri it would be better to discuss it on LIVY-979 itself so that the whole community is across it.
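
Until such documentation exists, the four configuration keys quoted in this thread can be summarised with their defaults as below. This is a sketch only: the keys and default values are copied from the `Entry` definitions on lines +303 to +307, while `GrafanaLokiSettingsSketch` and `resolve` are hypothetical names that do not reflect how Livy's `LivyConf` actually resolves values.

```scala
// Hypothetical sketch, not Livy's LivyConf.
object GrafanaLokiSettingsSketch {
  // Keys and defaults copied from the Entry definitions quoted above.
  val defaults: Map[String, String] = Map(
    "livy.server.kubernetes.grafana.loki.enabled" -> "false",
    "livy.server.kubernetes.grafana.url" -> "http://localhost:3000",
    "livy.server.kubernetes.grafana.loki.datasource" -> "loki",
    "livy.server.kubernetes.grafana.timeRange" -> "6h")

  // User-supplied values win over defaults; unknown keys are ignored.
  def resolve(overrides: Map[String, String]): Map[String, String] =
    defaults ++ overrides.filter { case (k, _) => defaults.contains(k) }
}
```

The sketch makes one thing visible: the Loki integration is off by default, so a deployment that wants it would have to override at least the `enabled` key and, most likely, the Grafana URL.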

Comment on lines +309 to +311
// side car container for spark pods enabled?
val KUBERNETES_SPARK_SIDECAR_ENABLED =
  Entry("livy.server.kubernetes.spark.sidecar.enabled", true)
Contributor

What is the use case for this flag?

Contributor Author

A sidecar container in Kubernetes is a secondary container that runs alongside the main application container in the same pod. We can set the sidecar configuration in Livy using this flag.

What would be the purpose of running a sidecar?

Contributor Author

A sidecar container can be used to collect logs from the Spark application and forward them to a centralized logging system such as the Elasticsearch, Fluentd, and Kibana (EFK) stack.

Contributor

@jahstreet jahstreet Jul 9, 2024

Understood, thx for explaining. Does Livy add this sidecar, or should Livy handle Spark Pods that have a sidecar differently? What would happen if the Spark Pods have a sidecar but this flag is false, and how is that different from setting it to true? Just trying to understand it better.

Contributor Author

Livy does not add a sidecar container for Spark. Instead, Livy uses a flag to determine the status of the Spark pod. If the Spark pod is running with a sidecar, the flag is set to true; otherwise, it is set to false.
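
One plausible reading of this answer, offered here as an assumption rather than a description of the merged code: when a sidecar keeps running, the pod can remain in the `Running` phase after the Spark container has exited, so the pod phase alone no longer tells whether the Spark workload has finished. The sketch below models that distinction with simplified, hypothetical types (`Pod`, `Container`, `sparkFinished`); it does not use the Kubernetes client API and is not Livy's implementation.

```scala
// Hypothetical sketch; the types below are simplified stand-ins,
// not the Kubernetes client model and not Livy's code.
object SidecarStatusSketch {
  final case class Container(name: String, terminated: Boolean)
  final case class Pod(phase: String, containers: Seq[Container])

  /** True when the Spark workload in `pod` should be treated as finished. */
  def sparkFinished(pod: Pod, sidecarEnabled: Boolean, sparkContainer: String): Boolean =
    if (sidecarEnabled)
      // With a sidecar the pod may stay "Running" after Spark exits,
      // so inspect the Spark container itself.
      pod.containers.exists(c => c.name == sparkContainer && c.terminated)
    else
      // Without a sidecar the pod phase is enough.
      pod.phase == "Succeeded" || pod.phase == "Failed"
}
```

Under this reading, leaving the flag at false for pods that do carry a sidecar could make a finished application look as if it were still running, which may be the difference the question above is asking about.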

@vikas-saxena02

@askhatri the latest push was a force-push, and I can see that many of the checks have failed. Can you please look into this?

@askhatri
Contributor Author

askhatri commented Jul 9, 2024

Hi @vikas-saxena02, I have re-triggered the checks now by adding ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION=true.

@vikas-saxena02

@jahstreet @askhatri is there any documentation that you can share that explains how to use Livy to run jobs on Spark on Kubernetes?

@vikas-saxena02

/approve

@jahstreet
Contributor

jahstreet commented Jul 9, 2024

> @jahstreet @askhatri is there any documentation that you can share that explains how to use Livy to run jobs on Spark on Kubernetes?

Yes, we will talk about packaging it.
Until then, people can refer to:

@vikas-saxena02

vikas-saxena02 commented Jul 9, 2024

/assign @gyogal

@askhatri
Contributor Author

Hi @vikas-saxena02, I have assigned this to @gyogal as you suggested.

@vikas-saxena02

Thanks @askhatri !! Hopefully this should be merged soon.

@gyogal
Contributor

gyogal commented Jul 10, 2024

Thanks everyone for your input! It seems like there are no objections and the overall feedback is positive. If you find any issues once this PR is merged, please feel free to raise a ticket.

@gyogal gyogal merged commit b089dd6 into apache:master Jul 10, 2024
3 checks passed
@askhatri askhatri deleted the LIVY-702 branch July 14, 2024 13:23
jimenefe pushed a commit to onedot-data/incubator-livy that referenced this pull request Oct 15, 2024
Co-authored-by: Asif Khatri <[email protected]>
Co-authored-by: Alex Sasnouskikh <[email protected]>