Observability docs need re-writing/organising to reflect user-focused OL docs approach and the deprecation of MP Metrics #7659

Open
lauracowen opened this issue Oct 31, 2024 · 1 comment

@lauracowen
Member

lauracowen commented Oct 31, 2024

At some point, the DEVELOPMENT > Observability section has morphed into Knowledge Center-style docs with unnecessary task topics and an unhelpful container topic. It also doesn't seem to recognise the different user needs of developers vs operations. I know Alasdair has raised some issues (such as #7657) about this so I'll try and pull together here what I think needs doing and refer to his issues as I go.

Overall, the Observability docs now read like they've been largely copied and pasted from the developer's notes. The developer should be providing the info but they aren't technical writers and will give info that isn't needed and can be omitted, and they'll leave gaps that you might need to fill. The technical writer needs to understand it enough to ask questions about what should and should not be documented (including doing some background reading to understand the topic if necessary). Then discuss with the developer as needed to work out what info each relevant user audience needs in order to get up and running as quickly as possible. Then also what info they might need in typical more advanced situations (focusing on the 80% need, not the 20% edge cases - point them off to the comprehensive reference/API docs for that - and make sure those docs are sufficient for that too).

Overview/observability topic

The overview Observability topic needs to be more useful and give an introduction to what observability is and why you need to care about it as a developer of microservices/apps. I think that's what the [Microservice observability with metrics](https://openliberty.io/docs/latest/microservice-observability-metrics.html) topic was originally, though perhaps it still didn't do that well enough.

The collated list of topic titles (Knowledge Center-style) currently in this topic is overwhelming and not very helpful (it places them all as equal), and really duplicates what's in the nav tree:


[image]

The topic needs to provide far more useful information in its own right, and then link off to the other topics at the relevant points. That's the style of the Open Liberty docs. It should not provide a flat list of everything that sits hierarchically under the container topic.

MP Metrics has been deprecated and is no longer under active development; MP Telemetry is now the way we should be guiding developers to go. It might be useful to provide a brief explanation of this early in the topic (not in the shortdesc/abstract). Everything about Observability should now assume MP Telemetry, not MP Metrics. The change in the product shouldn't just mean adding more docs; the existing docs need updating, changing, or removing, whatever is needed to make them coherent and useful.

Always keep in mind that this section of the docs is DEVELOPMENT, so this topic (and any topics in this section) must directly address the concerns of developers, not operations. (In the OPERATIONS section, topics should address the concerns of operations people.)

This topic needs to talk about how MP Telemetry is used (at a high level) and what it gives to the developer and the app. To paraphrase the Telemetry guide (which I don't think phrases it quite right):

Automatic instrumentation instruments Jakarta RESTful web services and MicroProfile REST clients. To instrument other operations, such as database calls, you can add custom instrumentation to your source code.

That context is (as far as I have seen so far) completely missing from these docs. It needs to be really, really clear in the OL docs what the very minimum is that a developer needs to do in order to get up and running with Telemetry for their app as quickly as possible. Actually, I've now found it further down, in this topic: https://openliberty.io/docs/latest/telemetry-trace.html (a lot of the info in that topic should be in the Overview topic instead, introducing how you add observability to an app, but not the details from the Manual section).

Prepare your development environment for MicroProfile Telemetry

This topic says in many words what can be said in a sentence plus links to the config reference topic for that feature (if that reference topic doesn't provide enough info, it should be improved). This is not a task anyone wants to do; it's a step in a larger, more useful task. See #7657 for more detailed feedback.

Define custom MicroProfile Telemetry metrics

There is currently no context given for why a developer would want to do what's documented in this topic. This lack of context and its high-up listing in the Observability topic implies that using Telemetry probably requires you to create custom metrics (though it also adds uncertainty because "custom" implies in addition to some kind of easy/automatic metrics that haven't previously been described).

This topic (on custom metrics) should start by reiterating that most users/applications will be fine with the automatic instrumentation. Then go on to say that, however, for some operations such as database calls [and add some other concrete examples too], you need to define custom metrics to monitor them. Then give a brief overview of what it means to define custom metrics (briefly explain the concept of spans and anything else relevant).

Then the example needs far more explanation about what it does and why you'd need to define it as a custom metric. This is so that the developer can see how it maps (or not) to what s/he wants to do.

Confusingly, there's a bunch of info about "manual instrumentation" in a different topic (https://openliberty.io/docs/latest/telemetry-trace.html), which I think is referring to the same thing as defining custom metrics (based on my brief reading on the subject here and in the OpenTelemetry docs).

Don't link to the API docs early in the topic (that's a "for more information also see" type of link late in the topic or even in the related links section).

Do add a link to the Telemetry guide, either at a relevant point for illustration of that particular point, or at the end of the topic in related links (or both if the topic is long). Always be aware of what guides we have on the OL site that can give a more hands-on code-based illustration of what the docs might describe.

Don't link to the specs docs from user documentation at all. Those specs are aimed at the developers implementing Liberty, not at users of Liberty. If they want to go looking for that level of detail, fine, that's on them. But if we link to the specs, how are they to know how much of it we're indicating they should read or care about? If they need to know about it, include it in the Liberty docs. That doesn't mean including every detail; just what they most typically need to know - it comes back to understanding the user.

OPERATIONS > Code instrumentation for MicroProfile Telemetry tracing

https://openliberty.io/docs/latest/telemetry-trace.html

This topic seems to appear in both DEVELOPMENT and OPERATIONS sections of the docs?

This topic is not OPERATIONS info. This is info for the developer. And the "Manual instrumentation" section is about defining custom metrics, which should be handled as an exception as I've described above.

Enabling observability with MP Telemetry

https://openliberty.io/docs/latest/microprofile-telemetry.html

I agree with Alasdair's points in #7655. In summary, this topic is long-winded and seems to duplicate a lot with little explanation, and gives unnecessary alternatives instead of focusing on what they're most likely to want to do and documenting that path.

This topic needs to consider what an operations/devops person needs to know and what they're trying to do. I don't know (you need to check) but I'm guessing that MP Telemetry has to be enabled by the developer in the app, then the ops person can only turn things on and off (or override settings already set in the app by the developer).

Microservice observability with metrics and Choose your own monitoring tools with MicroProfile Metrics

https://openliberty.io/docs/latest/microservice-observability-metrics.html
https://openliberty.io/docs/latest/micrometer-metrics.html

If this info is entirely about MP Metrics and therefore deprecated, the whole topic(s) needs to be indicated as such and not highlighted as relevant to the developer in the Observability overview topic. Basically, deprecate the topic. But first check whether there's useful explanatory info in here that could be used in the newer/changed topics about Telemetry.

Monitoring with metrics

https://openliberty.io/docs/latest/introduction-monitoring-metrics.html

This topic needs to be updated more thoroughly to reflect the switch from MP Metrics to MP Telemetry. I think talking to someone like Don Bourne would help in reframing the focus to introduce metrics with this context in mind. At this stage, it probably also needs a brief explanation of the deprecation for an operations person.

This topic was previously the top-level concept/intro to observability with metrics, as the Logs topic introduces logging. The introduction of MP Telemetry combines metrics and logging into a single thing, so I can see why there's been a topic about Telemetry added at the same level. I'd be inclined to consider whether there needs to be a separate topic of "Enable observability with MicroProfile Telemetry". If there's too much common info to go in the Logs and Metrics topics, a separate topic is fine as long as it's better titled (the "Enabling" topics should not be about MP Telemetry per se; it's about monitoring using metrics/trace info; MP Telemetry is just the implementation we now use).

dmuelle added this to the 24.0.0.12 milestone Oct 31, 2024
dmuelle self-assigned this Oct 31, 2024
@dmuelle
Member

dmuelle commented Nov 4, 2024

Hi Laura, I'll be working on this feedback as part of the larger rewrite of the observability topics to prioritize MP Telemetry and address the structural concerns you and Alasdair raised in the other issues.

Has MicroProfile Metrics been officially deprecated? I know new development focuses on MP Telemetry as a single solution, which therefore supersedes MPM. But I haven't heard that it's formally deprecated, which requires special documentation and ideally updates to the generated doc. I just want to be careful about using that term in the public-facing doc unless it's gone/going through the POC approval process. But either way, the docs should make clear that the MPM info is really just for apps that were already using it and haven't transitioned to MP Telemetry.

The topic about code instrumentation is about tracing, not metrics. You can only use the auto-instrumentation that MP Telemetry provides to trace JAX-RS apps; anything else requires manual or agent instrumentation. I don't see this topic in the OPS section.

The "Prepare your development environment" info can move to the feature page until we figure out how to automate Maven coordinates. Then a revised Observability overview, which explains MP Telemetry as a single solution for logs/metrics/traces and positions MicroProfile Metrics properly (whether it's officially deprecated or not), can point to the feature topic for all the config.

I'll work through your feedback here and synthesize with what Alasdair has in #7655 and then send for review to both of you and Don Bourne.

dmuelle modified the milestones: 24.0.0.12, 25.0.0.1 Nov 27, 2024