JMX implementation : feature parity for target systems #12158

SylvainJuge · 2024-09-03T09:15:07Z

SylvainJuge · 2024-09-03T09:56:13Z

Ping @robsunday I can't yet co-assign you as you are not part of the otel contributors group.

SylvainJuge · 2024-09-03T12:10:45Z

For Tomcat, the mapping is not the same but almost equivalent; there isn't anything we need to add for 1:1 support beyond aligning the metrics themselves.

Side note: using JMX object names and attributes is a convenient way to identify elements, as it's a common part between the two mappings.

JMX: Catalina:type=Manager,host=localhost,context=* or Tomcat:type=Manager,host=localhost,context=*
- activeSessions: tomcat.sessions (no attribute) <==> http.server.tomcat.sessions.activeSessions with context attribute
JMX: Catalina:type=GlobalRequestProcessor,name=* or Tomcat:type=GlobalRequestProcessor,name=*
- JMX Gatherer: name => proto_handler, JMX Insight: name => name
- errorCount: tomcat.errors with proto_handler attribute <==> http.server.tomcat.errorCount with name attribute
- requestCount: tomcat.request_count with proto_handler attribute <==> http.server.tomcat.requestCount with name attribute
- maxTime: tomcat.max_time with proto_handler attribute <==> http.server.tomcat.maxTime with name attribute
- processingTime: tomcat.processing_time with proto_handler attribute <==> http.server.tomcat.processingTime with name attribute
- bytesReceived, bytesSent: tomcat.traffic with proto_handler and direction = received|sent <==> http.server.tomcat.traffic with name, direction identical
JMX: Catalina:type=ThreadPool,name=* or Tomcat:type=ThreadPool,name=*
- JMX Gatherer: name => proto_handler, JMX Insight: name => name
- currentThreadCount: tomcat.threads with state = idle <==> http.server.tomcat.threads with name and state identical (state=idle reports the total number of threads, which is a bug mentioned here and here)
- currentThreadsBusy: tomcat.threads with state = busy <==> http.server.tomcat.threads with name and state identical

Given the mapping differences, I think we probably need to leave it as-is for now.
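To illustrate the side note about object names being the common part of both mappings, here is a minimal sketch using only the standard `javax.management.ObjectName` API. It checks a concrete MBean name against the two ThreadPool patterns listed above and extracts the `name` key, which the JMX Gatherer exposes as `proto_handler` and JMX Insight as `name`. The class name, the method name and the sample object names are hypothetical and only serve the illustration; this is not how either implementation is written.

```java
import java.util.List;
import java.util.Optional;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

final class TomcatBeanPatterns {

    // The two ThreadPool patterns quoted in the thread (Catalina and Tomcat domains).
    static final List<String> THREAD_POOL_PATTERNS = List.of(
            "Catalina:type=ThreadPool,name=*",
            "Tomcat:type=ThreadPool,name=*");

    /**
     * Returns the (unquoted) value of the "name" key if the object name matches
     * one of the ThreadPool patterns, or an empty Optional otherwise.
     */
    static Optional<String> nameOf(String objectName) {
        try {
            ObjectName candidate = new ObjectName(objectName);
            for (String pattern : THREAD_POOL_PATTERNS) {
                if (new ObjectName(pattern).apply(candidate)) {
                    String raw = candidate.getKeyProperty("name");
                    // Key values may be quoted in the object name; strip the quotes if so.
                    return Optional.of(raw.startsWith("\"") ? ObjectName.unquote(raw) : raw);
                }
            }
            return Optional.empty();
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException("Invalid object name: " + objectName, e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical object names, for illustration only.
        System.out.println(nameOf("Catalina:type=ThreadPool,name=\"http-nio-8080\""));
        System.out.println(nameOf("Tomcat:type=Manager,host=localhost,context=/app"));
    }
}
```

Whichever metric attribute name ends up being used, the value comes from the same object name key, so aligning the two mappings is mostly a question of naming.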

robsunday · 2024-09-03T12:31:13Z

I'll look at Jetty.

SylvainJuge · 2024-09-03T13:42:27Z

For Wildfly, the mapping is also not the same but equivalent; there isn't anything we need to add for 1:1 support beyond aligning the metrics themselves.

JMX: jboss.as:deployment=*,subsystem=undertow
- Both map deployment => deployment attribute
- sessionsCreated: wildfly.session.count <==> wildfly.session.sessionsCreated
- activeSessions: wildfly.session.active <==> wildfly.session.activeSessions
- expiredSessions: wildfly.session.expired <==> wildfly.session.expiredSessions
- rejectedSessions: wildfly.session.rejected <==> wildfly.session.rejectedSessions
JMX: jboss.as:subsystem=undertow,server=*,http-listener=*
- Both map server => server attribute and http-listener => value of listener
- requestCount: wildfly.request.count <==> wildfly.request.requestCount
- processingTime: wildfly.request.time <==> wildfly.request.processingTime
- errorCount: wildfly.request.server_error <==> wildfly.request.errorCount
- bytesSent: wildfly.network.io with extra state = out attribute <==> same
- bytesReceived: wildfly.network.io with extra state = in attribute <==> same
JMX: jboss.as:subsystem=datasources,data-source=*,statistics=pool
- Both map data-source => value of data_source
- ActiveCount : wildfly.jdbc.connection.open with state = active <==> wildfly.db.client.connections.usage with state = used
- IdleCount : wildfly.jdbc.connection.open with state = idle <==> wildfly.db.client.connections.usage with state = idle
- WaitCount: wildfly.jdbc.request.wait <==> wildfly.db.client.connections.WaitCount
JMX: jboss.as:subsystem=transactions
- numberOfTransactions: wildfly.jdbc.transaction.count <==> wildfly.db.client.transaction.NumberOfTransactions
- numberOfSystemRollbacks: wildfly.jdbc.rollback.count with cause = system <==> wildfly.db.client.rollback.count with cause = system
- numberOfResourceRollbacks: wildfly.jdbc.rollback.count with cause = resource <==> wildfly.db.client.rollback.count with cause = resource
- numberOfApplicationRollbacks: wildfly.jdbc.rollback.count with cause = application <==> wildfly.db.client.rollback.count with cause = application

SylvainJuge · 2024-09-04T09:01:10Z

For JVM metrics, JMX Insight does not provide a YAML file; the feature is implemented in the runtime-metrics module of instrumentation (link). The current definition is aligned with the semantic conventions for JVM metrics.

JMX Gatherer provides the following metrics, which are not aligned with semconv; all of them can easily be captured with the YAML configuration:

java.lang:type=ClassLoading:
- LoadedClassCount : jvm.classes.loaded
java.lang:type=GarbageCollector,* :
- CollectionCount: jvm.gc.collections.count with name => name
- CollectionTime: jvm.gc.collections.elapsed with name => name
java.lang:type=Memory
- HeapMemoryUsage: jvm.memory.heap
- NonHeapMemoryUsage: jvm.memory.nonheap
java.lang:type=MemoryPool,*
- Usage: jvm.memory.pool with name => name
java.lang:type=Threading:
- ThreadCount : jvm.threads.count
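As a rough illustration of what these mappings read, the sketch below fetches a few of the listed attributes from the platform MBean server of the current JVM with the standard `java.lang.management` and `javax.management` APIs. The class and method names are hypothetical, and picking the `used` item of `HeapMemoryUsage` is an assumption made for the example; the thread does not say which items of the composite value the JMX Gatherer reports.

```java
import java.lang.management.ManagementFactory;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.openmbean.CompositeData;

final class JvmGathererAttributes {

    private static final MBeanServer SERVER = ManagementFactory.getPlatformMBeanServer();

    /** Reads a plain numeric attribute such as ThreadCount or LoadedClassCount. */
    static long readLong(String bean, String attribute) {
        try {
            Object value = SERVER.getAttribute(new ObjectName(bean), attribute);
            return ((Number) value).longValue();
        } catch (JMException e) {
            throw new IllegalStateException("Cannot read " + attribute + " of " + bean, e);
        }
    }

    /** Reads one item ("init", "used", "committed" or "max") of a memory usage attribute. */
    static long readMemoryItem(String bean, String attribute, String item) {
        try {
            // MemoryUsage values are exposed as CompositeData through the MBean server.
            CompositeData usage = (CompositeData) SERVER.getAttribute(new ObjectName(bean), attribute);
            return (Long) usage.get(item);
        } catch (JMException e) {
            throw new IllegalStateException("Cannot read " + attribute + " of " + bean, e);
        }
    }

    public static void main(String[] args) {
        // Metric names on the left are the JMX Gatherer names listed in the thread.
        System.out.println("jvm.classes.loaded = "
                + readLong("java.lang:type=ClassLoading", "LoadedClassCount"));
        System.out.println("jvm.threads.count = "
                + readLong("java.lang:type=Threading", "ThreadCount"));
        System.out.println("jvm.memory.heap (used, assumed item) = "
                + readMemoryItem("java.lang:type=Memory", "HeapMemoryUsage", "used"));
    }
}
```

Each value is a single attribute read on a fixed object name, which is why these metrics should be straightforward to express as YAML rules.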

SylvainJuge · 2024-09-04T09:15:13Z

As a side note, after reviewing the differences for JVM, Tomcat and Wildfly, it is becoming more and more obvious to me that there are too many differences to fix. Also, some of the Groovy definitions haven't been modified in 2 or 3 years, which means they are very probably obsolete or not really used in practice.

As a consequence, I think the better option for now is to:

- finish reviewing the mapping to ensure we can reproduce it with YAML in JMX Gatherer

The steps that will likely follow are:

- build a new module that will use the JMX Insight implementation in contrib next to JMX Gatherer
- provide a set of YAML definitions for this new module to capture the metrics as they currently are (just to preserve compatibility)
- modify the collector jmxreceiver implementation to use this new way to capture JMX metrics
- start deprecating the current JMX Gatherer
- start improving the metrics definitions so we have a set of common YAML definitions that can be reused between Instrumentation and Contrib (from the consumer side of those metrics, they should be exactly the same)

robsunday · 2024-09-04T09:31:12Z

Here are my findings regarding jetty:

JMX: org.eclipse.jetty.server.session:context=*,type=sessionhandler,id=*
- MBean property: sessionsCreated --> YAML: jetty.session.sessionsCreated <==> Groovy: jetty.session.count
- MBean property: sessionTimeTotal --> YAML: jetty.session.sessionTimeTotal <==> Groovy: jetty.session.time.total
  - minor difference in type: YAML: counter / Groovy: UpDownCounter
- MBean property: sessionTimeMax --> YAML: jetty.session.sessionTimeMax <==> Groovy: jetty.session.time.max
- MBean property: sessionTimeMean --> YAML: jetty.session.sessionTimeMean, not used in Groovy
JMX: org.eclipse.jetty.util.thread:type=queuedthreadpool,id=*
- MBean property: busyThreads --> YAML: jetty.threads.busyThreads <==> Groovy: jetty.thread.count with extra state=busy attribute
  - minor difference in type: YAML: updowncounter / Groovy: Value
- MBean property: idleThreads --> YAML: jetty.threads.idleThreads <==> Groovy: jetty.thread.count with extra state=idle attribute
  - minor difference in type: YAML: updowncounter / Groovy: Value
- MBean property: maxThreads --> YAML: jetty.threads.maxThreads, not used in Groovy
- MBean property: queueSize --> YAML: jetty.threads.queueSize <==> Groovy: jetty.thread.queue.count
  - minor difference in type: YAML: updowncounter / Groovy: Value
JMX: org.eclipse.jetty.io:context=*,type=managedselector,id=*
- MBean property: selectCount --> YAML: jetty.io.selectCount <==> Groovy: jetty.select.count
  - difference in units: YAML: 1 / Groovy: {operations}
JMX: org.eclipse.jetty.logging:type=jettyloggerfactory,id=* not used in Groovy

SylvainJuge · 2024-09-04T11:23:21Z

For hbase, there isn't anything in JMX Insight. The mappings are simple, and it should be quite straightforward (but a bit tedious) to produce a YAML equivalent of hbase.groovy.

SylvainJuge · 2024-09-04T11:40:40Z

For hadoop:

JMX attribute tag.Hostname is always mapped to node_name metric attribute in both implementations.

JMX: Hadoop:service=NameNode,name=FSNamesystem
- CapacityUsed: hadoop.name_node.capacity.usage <==> hadoop.capacity.CapacityUsed
- CapacityTotal: hadoop.name_node.capacity.limit <==> hadoop.capacity.CapacityTotal
- BlocksTotal: hadoop.name_node.block.count <==> hadoop.block.BlocksTotal
- MissingBlocks: hadoop.name_node.block.missing <==> hadoop.block.MissingBlocks
- CorruptBlocks: hadoop.name_node.block.corrupt <==> hadoop.block.CorruptBlocks
- VolumeFailuresTotal: hadoop.name_node.volume.failed <==> hadoop.volume.VolumeFailuresTotal
- FilesTotal: hadoop.name_node.file.count <==> hadoop.file.FilesTotal
- TotalLoad: hadoop.name_node.file.load <==> hadoop.file.TotalLoad
- NumLiveDataNodes: hadoop.name_node.data_node.count with state = live <==> hadoop.datenode.Count, same state value (yes, there is a typo in datanode)
- NumDeadDataNodes: hadoop.name_node.data_node.count with state = dead <==> hadoop.datenode.Count, same state value

SylvainJuge · 2024-09-04T12:03:29Z

For cassandra:

There is no mapping in YAML. The existing mapping is verbose, and the lack of support for templates or string interpolation would make it quite tedious to write in YAML, but that is more an annoyance than a real blocker.

For example, here are a few of the MBeans:

- org.apache.cassandra.metrics:type=ClientRequest
- org.apache.cassandra.metrics:type=ClientRequest,scope=RangeSlice
- org.apache.cassandra.metrics:type=ClientRequest,scope=Read
- org.apache.cassandra.metrics:type=ClientRequest,scope=Write
- each of the scope= entries above has 3 variants, obtained by adding ,name= with a value of Unavailables, Timeouts or Failures
- org.apache.cassandra.metrics:type=Storage,name=Load

There isn't anything that could not be mapped using YAML syntax.
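To give an idea of why the lack of templating is tedious rather than blocking, this small sketch enumerates the ClientRequest object names that result from combining the three scopes with the three name values mentioned above. Without interpolation, each of these would have to be spelled out by hand in a mapping. The class and method names are hypothetical, and the sketch says nothing about the metric names each bean should map to.

```java
import java.util.ArrayList;
import java.util.List;

final class CassandraClientRequestBeans {

    // Scope and name values taken from the MBean examples in the thread.
    static final List<String> SCOPES = List.of("RangeSlice", "Read", "Write");
    static final List<String> NAMES = List.of("Unavailables", "Timeouts", "Failures");

    /** Builds every scope/name combination as a full JMX object name string. */
    static List<String> clientRequestBeans() {
        List<String> beans = new ArrayList<>();
        for (String scope : SCOPES) {
            for (String name : NAMES) {
                beans.add("org.apache.cassandra.metrics:type=ClientRequest,scope="
                        + scope + ",name=" + name);
            }
        }
        return beans;
    }

    public static void main(String[] args) {
        // Prints the nine object names that a mapping would need to cover.
        clientRequestBeans().forEach(System.out::println);
    }
}
```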

robsunday · 2024-09-04T12:50:41Z

For activemq, everything except the property descriptions seems to be in sync.
Metric attributes are consistent.

JMX: org.apache.activemq:type=Broker,brokerName=*,destinationType=Queue,destinationName=* and org.apache.activemq:type=Broker,brokerName=*,destinationType=Topic,destinationName=*
- ProducerCount: activemq.producer.count <==> activemq.ProducerCount
- ConsumerCount: activemq.consumer.count <==> activemq.ConsumerCount
- MemoryPercentUsage: activemq.memory.usage <==> activemq.memory.MemoryPercentUsage
- QueueSize: activemq.message.current <==> activemq.message.QueueSize
- ExpiredCount: activemq.message.expired <==> activemq.message.ExpiredCount
- EnqueueCount: activemq.message.enqueued <==> activemq.message.EnqueueCount
- DequeueCount: activemq.message.dequeued <==> activemq.message.DequeueCount
- AverageEnqueueTime: activemq.message.wait_time.avg <==> activemq.message.AverageEnqueueTime

All desc fields in the properties need to be synchronized because the wording is different.

JMX: org.apache.activemq:type=Broker,brokerName=*
- CurrentConnectionsCount: activemq.connection.count <==> activemq.connections.CurrentConnectionsCount
- StorePercentUsage: activemq.disk.store_usage <==> activemq.disc.StorePercentUsage
- TempPercentUsage: activemq.disk.temp_usage <==> activemq.disc.TempPercentUsage

robsunday · 2024-09-04T13:16:29Z

The solr case is very similar to hbase: there is no YAML at the moment, but creating it should not be an issue.

SylvainJuge · 2024-09-04T14:20:18Z

For kafka, the YAML is kafka-broker.yaml

JMX: kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec
- Count: kafka.message.count
JMX: kafka.server:type=BrokerTopicMetrics,name=TotalProduceRequestsPerSec
- Count: kafka.request.count with type = produce
JMX: kafka.server:type=BrokerTopicMetrics,name=TotalFetchRequestsPerSec
- Count: kafka.request.count with type = fetch
JMX: kafka.server:type=BrokerTopicMetrics,name=FailedProduceRequestsPerSec
- Count: kafka.request.failed with type = produce
JMX: kafka.server:type=BrokerTopicMetrics,name=FailedFetchRequestsPerSec
- Count: kafka.request.failed with type = fetch

I haven't checked in detail all the others, but they look identical between the two implementations.

I discovered that we have a way to use multiple MBean names with the same metric definition, as seen in kafka-broker.yaml.

For kafka-consumer.groovy and kafka-producer.groovy there is no equivalent YAML mapping though.

SylvainJuge self-assigned this Sep 3, 2024

breedx-splk assigned robsunday Sep 3, 2024

SylvainJuge mentioned this issue Sep 5, 2024

[WIP] JMX scraper open-telemetry/opentelemetry-java-contrib#1445