Skip to content

meoww-bot/hadoop_exporter

 
 

Repository files navigation

Hadoop Exporter for Prometheus

Exports hadoop metrics via HTTP for Prometheus consumption.

Build

How to build

go mod tidy
make build

or build individual exporter

make build-namenode
make build-resourcemanager
make build-journalnode 
make build-datanode

Help

Help on flags of namenode_exporter:

-krb5.keytab.path string
    	Kerberos keytab file path
-krb5.principal string
    	Principal ([email protected])
-namenode.jmx.url string
    Hadoop JMX URL. (default "http://nn01.example.com:50070/jmx")
-web.listen-address string
    Address on which to expose metrics and web interface. (default ":9070")
-web.telemetry-path string
    Path under which to expose metrics. (default "/metrics")

Help on flags of datanode_exporter:

-datanode.jmx.url string
    Hadoop JMX URL. (default "http://localhost:50075/jmx")
-web.listen-address string
    Address on which to expose metrics and web interface. (default ":9070")
-web.telemetry-path string
    Path under which to expose metrics. (default "/metrics")

Help on flags of resourcemanager_exporter:

-resourcemanager.url string
    Hadoop ResourceManager URL. (default "http://localhost:8088")
-web.listen-address string
    Address on which to expose metrics and web interface. (default ":9088")
-web.telemetry-path string
    Path under which to expose metrics. (default "/metrics")

Help on flags of journalnode_exporter:

-journalnodeJmxUrl.url string
    Hadoop ResourceManager URL. (default "http://localhost:8088")
-web.listen-address string
    Address on which to expose metrics and web interface. (default ":9088")
-web.telemetry-path string
    Path under which to expose metrics. (default "/metrics")

Metrics Map

指标定义准则

  1. 将同一指标的不同维度放到标签里面,降低基数
  2. 指标定义: <hadoop service>_<component>_<jmx beans modelerType>_<metrics> 比如:BlocksTotal 对应的 prometheus 指标 hdfs_namenode_fsname_system_blocks_total
    hadoop service: hdfs
    component: namenode
    jmx beans modelerType: FSNamesystem -> fsname_system
    metrics: BlocksTotal -> blocks_total
  3. prometheus 指标全部是小写字母,使用 _ 下划线分隔
  4. 如果指标有单位,尽量带单位,比如 count,millisecond,bytes

NameNode

Hadoop:service=NameNode,name=FSNamesystem

Jmx Metric Prometheus Metric Description Chinese Description
MissingBlocks hdfs_namenode_fsname_system_missing_blocks Current number of missing blocks
UnderReplicatedBlocks hdfs_namenode_fsname_system_under_replicated_blocks Current number of blocks under replicated
CapacityTotal hdfs_namenode_fsname_system_capacity_bytes{mode="Total"} Current raw capacity of DataNodes in bytes
CapacityUsed hdfs_namenode_fsname_system_capacity_bytes{mode="Used"} Current used capacity across all DataNodes in bytes
CapacityRemaining hdfs_namenode_fsname_system_capacity_bytes{mode="Remaining"} Current remaining capacity in bytes
CapacityUsedNonDFS hdfs_namenode_fsname_system_capacity_bytes{mode="UsedNonDFS"} Current space used by DataNodes for non DFS purposes in bytes
BlocksTotal hdfs_namenode_fsname_system_blocks_total Current number of allocated blocks in the system
FilesTotal hdfs_namenode_fsname_system_files_total Current number of files and directories
CorruptBlocks hdfs_namenode_fsname_system_corrupt_blocks Current number of blocks with corrupt replicas
ExcessBlocks hdfs_namenode_fsname_system_excess_blocks Current number of excess blocks
StaleDataNodes hdfs_namenode_fsname_system_stale_datanodes Current number of DataNodes marked stale due to delayed heartbeat
tag.HAState hdfs_namenode_fsname_system_hastate (HA-only) Current state of the NameNode: initializing or active or standby or stopping state

Hadoop:service=NameNode,name=JvmMetrics

Jmx Metric Prometheus Metric Description Chinese Description
GcCountParNew hdfs_namenode_jvm_metrics_gc_count{type="ParNew"} ParNew GC count
GcCountConcurrentMarkSweep hdfs_namenode_jvm_metrics_gc_count{type="ConcurrentMarkSweep"} ConcurrentMarkSweep GC count
GcTimeMillisParNew hdfs_namenode_jvm_metrics_gc_time_milliseconds{type="ParNew"} ParNew GC time in milliseconds
GcTimeMillisConcurrentMarkSweep hdfs_namenode_jvm_metrics_gc_time_milliseconds{type="ConcurrentMarkSweep"} ConcurrentMarkSweep GC time in milliseconds

java.lang:type=Memory

Jmx Metric Prometheus Metric Description Chinese Description
HeapMemoryUsage{committed} hdfs_namenode_memory_heap_memory_usage_bytes{mode="committed"}
HeapMemoryUsage{init} hdfs_namenode_memory_heap_memory_usage_bytes{mode="init"}
HeapMemoryUsage{max} hdfs_namenode_memory_heap_memory_usage_bytes{mode="max"}
HeapMemoryUsage{used} hdfs_namenode_memory_heap_memory_usage_bytes{mode="used"}

Hadoop:service=NameNode,name=NameNodeStatus

Jmx Metric Prometheus Metric Description Chinese Description
LastHATransitionTime hdfs_namenode_namenode_status_last_ha_transition_time

Hadoop:service=NameNode,name=RpcActivityForPort8020/8060

Jmx Metric Prometheus Metric Description Chinese Description
ReceivedBytes hdfs_namenode_rpc_activity_received_bytes Total number of received bytes
SentBytes hdfs_namenode_rpc_activity_sent_bytes Total number of sent bytes
RpcQueueTimeNumOps hdfs_namenode_rpc_activity_call_count{method="QueueTime"} Total number of RPC calls
RpcQueueTimeAvgTime hdfs_namenode_rpc_activity_avg_time_milliseconds{method="RpcQueueTime"} Average queue time in milliseconds
RpcProcessingTimeAvgTime hdfs_namenode_rpc_activity_avg_time_milliseconds{method="RpcProcessingTime"} Average Processing time in milliseconds
NumOpenConnections hdfs_namenode_rpc_activity_open_connections_count Current number of open connections
CallQueueLength hdfs_namenode_rpc_activity_call_queue_length Current length of the call queue

Requirements

golang 1.20

Releases

No releases published

Packages

No packages published

Languages

  • Go 95.1%
  • Makefile 3.0%
  • Dockerfile 1.3%
  • Shell 0.6%