Make Quiesce time configurable #25523

Open
39 tasks
NottyCode opened this issue Jun 20, 2023 · 6 comments · Fixed by #26554 · May be fixed by #26848
Assignees
Labels
Aha Idea Design Review Request Epic Used to track Feature Epics that are following the UFO process in:Kernel/Bootstrap paused Features that have been paused from development release:230012-beta target:beta The Epic or Issue is targetted for the next beta target:230012-beta

Comments

@NottyCode
Member

NottyCode commented Jun 20, 2023

Description

Currently the quiesce time is not configurable; it is hard-coded to 30 seconds. Making it configurable lets users set the maximum time their threads are given to finish in-flight work.


Documents

When available, add links to required feature documents. Use "N/A" to mark particular documents which are not required by the feature.


Process Overview

General Instructions

The process steps occur roughly in the order as presented. Process steps occasionally overlap.

Each process step has a number of tasks which must be completed or must be marked as not applicable ("N/A").

Unless otherwise indicated, the tasks are the responsibility of the Feature Owner or a Delegate of the Feature Owner.

If you need assistance, reach out to the OpenLiberty/release-architect.

Important: Labels are used to trigger particular steps and must be added as indicated.


Prioritization (Complete Before Development Starts)

The (OpenLiberty/chief-architect) and area leads are responsible for prioritizing the features and determining which features are being actively worked on.

Prioritization

  • Feature added to the "New" column of the Open Liberty project board
    • Epics can be added to the board in one of two ways:
      • From this issue, use the "Projects" section to select the appropriate project board.
      • From the appropriate project board click "Add card" and select your Feature Epic issue
  • Priority assigned
    • Attend the Liberty Backlog Prioritization meeting

Design (Complete Before Development Starts)

Design preliminaries determine whether a formal design, which will be provided by an Upcoming Feature Overview (UFO) document, must be created and reviewed. A formal design is required if the feature requires any of the following: UI, Serviceability, SVT, Performance testing, or non-trivial documentation/ID. Furthermore, each identified item places a blocking requirement on another team so it must be identified early in the process. The feature owner may check-off the item if they know it doesn't apply, but otherwise they should work with the focal point to determine what work, if any, will be necessary and make them aware of it.

Design Preliminaries

Design

  • [x] POC Design / UFO review requested.
    • Feature owner adds label Design Review Request
  • [x] POC Design / UFO review scheduled.
    • Follow the instructions in POC-Forum repo
  • [x] POC Design / UFO review completed.
  • POC / UFO Review follow-ons completed.
  • POC Design / UFO approval requested.
    • Feature owner adds label Design Approval Request
  • Design / UFO approved. (OpenLiberty/chief-architect) or N/A
    • (OpenLiberty/chief-architect) adds label Design Approved
    • Add the public link to the UFO in Box to the Documents section.
    • The UFO must always accurately reflect the final implementation of the feature. Any changes must be first approved. Afterwards, update the UFO by creating a copy of the original approved slide(s) at the end of the deck and prepend "OLD" to the title(s). A single updated copy of the slide(s) should take the original's place, and have its title(s) prepended with "UPDATED".

No Design

  • No Design requested.
    • Feature owner adds label No Design Approval Request
  • No Design / No UFO approved. (OpenLiberty/chief-architect) or N/A
    • Approver adds label No Design Approved
  • Feature / Capability stabilization or discontinuation or N/A
    • Feature owner adds label Product Management Approval Request and notifies OpenLiberty/product-management
    • Approver adds label Product Management Approved (OpenLiberty/product-management)
    • Note: For stabilized, superseded, and discontinued feature/capability, skip the Beta section of the template (you may delete it). Otherwise, proceed as normal.

FAT Documentation


Implementation

A feature must be prioritized before any implementation work may begin to be delivered (inaccessible/no-ship). However, a design focused approach should still be applied to features, and developers should think about the feature design prior to writing and delivering any code.
Besides being prioritized, a feature must also be socialized (or No Design Approved) before any beta code may be delivered. All new Liberty content must be inaccessible in our GA releases until it is Feature Complete by either marking it kind=noship or beta fencing it.
Code may not GA until this feature has obtained the Design Approved or No Design Approved label, along with all other tasks outlined in the GA section.

Feature Development Begins

  • [x] Add the In Progress label

Legal and Translation

In order to avoid last minute blockers and significant disruptions to the feature, the legal items need to be done as early in the feature process as possible, either in design or as early into the development as possible. Similarly, translation is to be done concurrently with development. Both MUST be completed before Beta or GA is requested.

Legal (Complete before Feature Complete Date)

  • Changed or new open source libraries are cleared and approved, or N/A. (Legal Release Services/Cass Tucker/Release PM).

Innovation (Complete 1 week before Feature Complete Date)

  • Consider whether any aspects of the feature may be patentable. If any identified, disclosures have been submitted.

Translation (Complete by Feature Complete Date)

  • PII (Program Integrated Information) updates are merged (i.e. all English strings due for translation have been delivered), or N/A.

Beta

In order to facilitate early feedback from users, all new features and functionality should first be released as part of a beta release.

Beta Code

  • [x] Beta fence the functionality
    • E.g. kind=beta, ibm:beta, ProductInfo.getBetaEdition()
  • [x] Beta development complete and feature ready for inclusion in a beta release
    • Add label target:beta and the appropriate target:YY00X-beta (where YY00X is the targeted beta version).
  • Feature delivered into beta

Beta Blog (Complete by beta eGA)

  • [x] Beta blog issue created and populated using the Open Liberty BETA blog post template.
    • Add a link to the beta blog issue in the Documents section.
    • Note: This is for inclusion into the overall beta release blog post. If, in addition, you'd also like to create a dedicated blog post about your feature, then follow the "Standalone Feature Blog Post" instructions under the Other Deliverables section.

GA

A feature is ready to GA after it is Feature Complete and has obtained all necessary Focal Point Approvals.

Feature Complete

  • Feature implementation and tests completed.
    • All PRs are merged.
    • All epic and child issues are closed.
    • All stop ship issues are completed.
  • Legal: all necessary approvals granted.
  • Translation: Feature may only proceed to GA if it has either Translation - Complete or Translation - Missing label
    • If all translation has been delivered to release branch, feature owner adds label Translation - Complete.
    • If missing translation does not cause a break in functionality, nor a security or production outage risk, feature owner adds label Translation - Missing.
      • Once all missing translations are delivered, the Translation - Missing label is replaced with Translation - Complete.
    • If missing translation could cause a break in functionality or a security or production outage risk, feature owner adds the Translation - Blocked label.
      • Features with Translation - Blocked may NOT proceed to GA until the label has been replaced with either Translation - Missing or Translation - Complete.
    • For further guidance, contact Globalization focal point or the Release Architect.
  • GA development complete and feature ready for inclusion in a GA release
    • Add label target:ga and the appropriate target:YY00X (where YY00X is the targeted GA version).
    • Inclusion in a release requires the completion of all Focal Point Approvals.

Focal Point Approvals (Complete by Feature Complete Date)

These occur only after GA of this feature is requested (by adding a target:ga label). GA of this feature may not occur until all approvals are obtained.

All Features

  • APIs/Externals - Externals have been reviewed or N/A. (OpenLiberty/externals-approvers)
    • Approver adds label focalApproved:externals
  • Demo - Demo is scheduled for an upcoming EOI or N/A. (OpenLiberty/demo-approvers)
    • Add comment @OpenLiberty/demo-approvers Demo scheduled for EOI [Iteration Number] to this issue.
    • Approver adds label focalApproved:demo.
  • FAT - All Tests complete and running successfully in SOE or N/A. (OpenLiberty/fat-approvers)
    • Approver adds label focalApproved:fat.

Design Approved Features

  • ID - Documentation is complete or N/A. (OpenLiberty/id-approvers)
    • Approver adds label focalApproved:id.
    • NOTE: If only trivial documentation changes are required, you may reach out to the ID Feature Focal to request an ID Required - Trivial label. Unlike features with a regular ID requirement, those with the ID Required - Trivial label do not have a hard requirement for a Design/UFO.

  • InstantOn - InstantOn capable or N/A. (OpenLiberty/instantOn-approvers)
    • Approver adds label focalApproved:instantOn.
  • Performance - Performance testing is complete or N/A. (OpenLiberty/performance-approvers)
    • Approver adds label focalApproved:performance.
  • Serviceability - Serviceability has been addressed or N/A. (OpenLiberty/serviceability-approvers)
    • Approver adds label focalApproved:sve.
  • STE - Skills Transfer Education chart deck is complete or N/A. (OpenLiberty/ste-approvers)
    • Approver adds label focalApproved:ste.
  • SVT - System Verification Test is complete or N/A. (OpenLiberty/svt-approvers)
    • Approver adds label focalApproved:svt.

Remove Beta Fencing (Complete by Feature Complete Date)

  • Beta guards are removed, or N/A
    • Only after all necessary Focal Point Approvals have been granted.

GA Blog (Complete by Friday after GM)

  • GA Blog issue created and populated using the Open Liberty GA release blog post template.
    • Add a link to the GA Blog issue in the Documents section.
    • Note: This is for inclusion into the overall release blog post. If, in addition, you'd also like to create a dedicated blog post about your feature, then follow the "Standalone Feature Blog Post" instructions under the Other Deliverables section.

Post GA


Other Deliverables


@NottyCode NottyCode added Epic Used to track Feature Epics that are following the UFO process Aha Idea labels Jun 20, 2023
@jimblye jimblye self-assigned this Aug 30, 2023
@cbridgha cbridgha added the In Progress Items that are in active development. label Sep 18, 2023
@jimblye
Member

jimblye commented Oct 12, 2023

Currently the quiesce timeout is hard-coded to 30 seconds in

  • dev/com.ibm.ws.threading/src/com/ibm/ws/threading/internal/ExecutorServiceImpl.java
    

How the timeout will be configurable:

server.xml


     <executor quiesceTimeOut="1m30s"/>


From a user's perspective there are two other timeout values to consider: the channel chain quiesce timeout and the server stop command client timeout.

chainQuiesceTimeout - Time to wait for channel chains to stop. The default value is 30 seconds.

server.xml

    <channelfw chainQuiesceTimeout="1m"/>  

server stop command timeout - The server stop command timeout is the amount of time the server command client script waits for the server to stop. The default timeout is 30 seconds. If the server has not stopped within the timeout period, the client will terminate with an error code.

server stop --timeout=30s
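
The three values above use the same duration notation (1m30s, 1m, 30s). To compare them, it can help to convert each to milliseconds. The following is a minimal sketch under stated assumptions: `DurationSketch` and `toMillis` are hypothetical names, the accepted unit set (h, m, s, ms) is assumed, and this is not Liberty's own duration parser.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Liberty: converts strings such as "1m30s" to milliseconds.
final class DurationSketch {
    // A number followed by a unit; "ms" is listed first so it is tried before "m" and "s".
    private static final Pattern PART = Pattern.compile("(\\d+)(ms|h|m|s)");

    static long toMillis(String text) {
        Matcher m = PART.matcher(text);
        long total = 0;
        int pos = 0;
        while (m.find()) {
            if (m.start() != pos) {
                // Something other than a number+unit pair sits between two parts.
                throw new IllegalArgumentException("Unexpected text in duration: " + text);
            }
            long value = Long.parseLong(m.group(1));
            switch (m.group(2)) {
                case "h": total += value * 3_600_000L; break;
                case "m": total += value * 60_000L; break;
                case "s": total += value * 1_000L; break;
                default:  total += value; // "ms"
            }
            pos = m.end();
        }
        if (pos == 0 || pos != text.length()) {
            throw new IllegalArgumentException("Not a duration: " + text);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(toMillis("1m30s")); // executor quiesce timeout from the example
        System.out.println(toMillis("1m"));    // chainQuiesceTimeout from the example
        System.out.println(toMillis("30s"));   // server stop --timeout from the example
    }
}
```

With the example values, the stop command's 30s client timeout is shorter than the 1m30s quiesce timeout, so a user who raises the quiesce timeout would probably want to raise `--timeout` as well to avoid the client exiting with an error first.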

Implementation

Two services need access to the configurable quiesceTimeout: ExecutorService and RuntimeUpdateManager.

ExecutorService isn't a problem. The quiesceTimeout is configured on the executor element in server.xml and that gets passed into ExecutorServiceImpl.java.

ExecutorServiceImpl.java
protected void activate(Map<String, Object> componentConfig) {
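
A minimal sketch of how `activate` might pick up the value, assuming the configuration map carries the duration as a number of milliseconds under the key `quiesceTimeout` and that 30 seconds stays the default. The key name, the value type, and the `QuiesceConfigSketch` class are assumptions for illustration, not the actual ExecutorServiceImpl code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; not the real ExecutorServiceImpl.
final class QuiesceConfigSketch {
    // The current hard-coded value, kept as the fallback.
    static final long DEFAULT_QUIESCE_TIMEOUT_MS = 30_000L;

    // Assumes the duration arrives as a Number of milliseconds (an assumption).
    static long readQuiesceTimeoutMillis(Map<String, Object> componentConfig) {
        Object value = componentConfig.get("quiesceTimeout");
        if (value instanceof Number) {
            return ((Number) value).longValue();
        }
        // Attribute missing or of an unexpected type: fall back to 30 seconds.
        return DEFAULT_QUIESCE_TIMEOUT_MS;
    }

    public static void main(String[] args) {
        Map<String, Object> config = new HashMap<>();
        config.put("quiesceTimeout", 90_000L);
        System.out.println(readQuiesceTimeoutMillis(config));
        System.out.println(readQuiesceTimeoutMillis(new HashMap<>()));
    }
}
```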

RuntimeUpdateManager is a bit trickier ( @tjwatson ). It waits a hard-coded 30 seconds for quiesceListenerFutures to complete. We'll need a way to pass the configurable quiesceTimeout to the service when the service is activated.

RuntimeUpdateManagerImpl.java
protected void activate(BundleContext ctx) {
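
Once the value reaches the service, the wait on the quiesce listener futures could take it as a parameter instead of a constant. The sketch below shows one way to wait on several futures against a single shared deadline. `QuiesceWaitSketch` and `awaitAll` are hypothetical names, and this is not how RuntimeUpdateManagerImpl is actually written.

```java
import java.util.List;
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration of waiting on several futures with one overall timeout.
final class QuiesceWaitSketch {
    // Returns true if every future finished before the shared deadline expired.
    static boolean awaitAll(List<? extends Future<?>> futures, long timeout, TimeUnit unit) {
        long deadline = System.nanoTime() + unit.toNanos(timeout);
        for (Future<?> future : futures) {
            long remaining = Math.max(deadline - System.nanoTime(), 0L);
            try {
                future.get(remaining, TimeUnit.NANOSECONDS);
            } catch (TimeoutException e) {
                return false; // the overall timeout elapsed
            } catch (ExecutionException | CancellationException e) {
                // A failed or cancelled listener is no longer running; keep going.
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for callers
                return false;
            }
        }
        return true;
    }
}
```

Using one deadline for the whole list, rather than the full timeout per future, keeps the total wait bounded by the configured value regardless of how many listeners are registered.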

Messages

The 30-second quiesce timeout is also hard-coded in messages. The messages are displayed by com.ibm.ws.runtime.update.RuntimeUpdateManagerImpl. We'll need to update messages and pass in the timeout value.
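
As a rough illustration of passing the timeout into a message instead of hard-coding it, the sketch below uses a `java.text.MessageFormat` placeholder. The message wording, the class `QuiesceMessageSketch`, and the choice to report whole seconds are all assumptions; the actual Liberty message text and keys are not shown here.

```java
import java.text.MessageFormat;
import java.util.Locale;

// Hypothetical illustration; the pattern below is placeholder wording, not a real Liberty message.
final class QuiesceMessageSketch {
    static final String PATTERN = "Quiesce did not complete within {0} seconds.";

    // Substitutes the configured timeout (given in milliseconds) as whole seconds.
    static String timeoutMessage(long timeoutMillis) {
        long seconds = timeoutMillis / 1000L;
        return new MessageFormat(PATTERN, Locale.ENGLISH).format(new Object[] { seconds });
    }

    public static void main(String[] args) {
        System.out.println(timeoutMessage(90_000L));
    }
}
```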

@tjwatson
Member

I doubt this is the first time in Liberty where we need more than one component to share the same configuration data. It should be possible for both components to get configuration from the com.ibm.ws.threading pid. I am not sure how our configuration admin implementation treats the configuration objects generated from Liberty configuration (i.e. server.xml). If they are treated as multi-location (https://docs.osgi.org/specification/osgi.cmpn/8.1.0/service.cm.html#service.cm-location.binding), then it should be possible to update RuntimeUpdateManagerImpl to use configuration-pid=com.ibm.ws.threading so that it can receive the same configuration value for <executor quiesceTimeOut="1m30s"/>.

@jimblye jimblye added the No Design Approval Request Must NOT need: SVT/Perf testing, new UI, Servicibility considerations, major doc updates label Oct 13, 2023
@jimblye
Member

jimblye commented Oct 16, 2023

UFO is in box, but from a user's perspective this doesn't need a lot of explanation.
No design approval requested. @NottyCode

@jimblye
Member

jimblye commented Oct 18, 2023

related to:
#27134. Being able to increase the quiesce timeout will at least mitigate the issue.
#15188. Duplicate

@jimblye
Member

jimblye commented Oct 18, 2023

Created doc issue for updates to existing doc

@donbourne
Member

UFO review - Make quiesce timeout configurable #79

presenter: jim blye

To-Be
    nathan: will it make sense to customer to look at executor element to set this timeout
    thomas: where is app startup timeout configured (for slow running apps)?
    jared: app manager has a stop timeout.  would make sense to have quiesce timeout there
    ACTION: consider if the config should be set elsewhere that might be more intuitive (with nathan / thomas / jared / emily)

Communication
    how does developer know they need to configure this new timeout?
        error messages in the logs indicating things timed out
        ACTION: can we update the user action for that message (or for the message logged when the quiesce begins)

    not clear why we have a separate chainQuiesceTimeout since it sounds like quiesceTimeout will trigger killing of the channel framework
        ACTION: document, in chainQuiesceTimeout doc, the relationship with quiesceTimeout

Automated Testing
    ACTION: add a test to verify that quiesceTimeout is working with new values

Serviceability
    ACTION: log a message if user tries to set value of quiesceTimeout below 30s (may already be handled by the setting on the metatype for a duration type. metatype minimums are also documented, so that's helpful)

ACTION: doc should provide info on when to set the quiesceTimeout (what would make user need to use this new capability?)

Migration
    do we have a quiesce timeout in tWAS that we might want to migrate to Liberty?
    ACTION: migration toolkit team should be made aware that we should migrate the quiesce timeout from tWAS (David Zavala knows about this timeout in tWAS)

@jimblye jimblye moved this to Kernel Features in Open Liberty Kernel Team May 21, 2024
@NottyCode NottyCode moved this to Core Runtime in Open Liberty Roadmap Aug 26, 2024
@malincoln malincoln added paused Features that have been paused from development and removed In Progress Items that are in active development. labels Sep 18, 2024
Projects
Status: Kernel Features
Status: Core Runtime
9 participants