Docker image: OpenJDK 17 + Cassandra 5.0.2 + Prometheus JMX exporter fully configurable through env's + tracing with Jaeger

smok-serwis/cassandra

OpenJDK 17 + Cassandra 5.0.2 + Prometheus JMX exporter + Jolokia exporter + jemalloc2 + Jaeger tracing

Current version: Cassandra v5.0.2, now with more configurability through the envs!

Because of the many different licenses involved here, please take a look at the summary detailed here.

If you're migrating from Cassandra 4, see "Migrating to Cassandra 5" at the bottom.

Ports it listens on

  • 7199 - JMX
  • 7198 - Prometheus metrics exporter
  • 9042 - Native transport
  • 7000 - Internode communications
  • 9160 - Thrift client (disabled by default, set env START_RPC to true to enable it)
  • 8080 - Jolokia (enabled by default, define env DISABLE_JOLOKIA to disable it)

Volumes of interest

  • /var/lib/cassandra - data partition
  • /var/lib/cassandra/commitlog - commitlog partition
  • /var/lib/cassandra/logs - logs
  • /var/lib/cassandra/heapdump - heap dumps in case Cassandra crashes

Usage

Since this image uses OpenJDK 17, you no longer need to set any weird environment variables. Just enjoy! The G1 garbage collector is enabled by default. You don't need to build your own images based on this one: cassandra.yaml is filled in from the environment variables you set, so just set envs as needed. See Dockerfile and entrypoint.py for details.

This image exports three volumes:

  1. for data (/var/lib/cassandra),
  2. for commitlog (/var/lib/cassandra/commitlog),
  3. for logs (/var/log/cassandra)

It is best to mount them as bind mounts.

Recommended options are --network host --privileged, although a bridge network also works fine if you pass the external host IP in the BROADCAST_ADDRESS variables and use auto ONLY for the normal (non-broadcast) addresses.

NEVER USE auto if HOST NETWORKING IS ENABLED!
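The options above can be sketched as a small helper that assembles a `docker run` command line. This is only an illustration under stated assumptions: the image tag `example/cassandra:5.0.2`, the host directory `/srv/cassandra` and the helper name `docker_run_args` are hypothetical and are not part of this project. The container paths are the three exported volumes listed above, and the guard mirrors the warning about auto with host networking.

```python
import shlex

# Address variables named in this README.
ADDRESS_VARS = ("LISTEN_ADDRESS", "RPC_ADDRESS",
                "BROADCAST_ADDRESS", "RPC_BROADCAST_ADDRESS")

# Container paths of the three exported volumes listed above.
VOLUMES = {
    "data": "/var/lib/cassandra",
    "commitlog": "/var/lib/cassandra/commitlog",
    "logs": "/var/log/cassandra",
}


def docker_run_args(image, env, host_network=True, host_root="/srv/cassandra"):
    """Build an argv list for `docker run` (illustrative, not the project's tooling)."""
    if host_network and any(env.get(name) == "auto" for name in ADDRESS_VARS):
        raise ValueError("do not use 'auto' addresses with host networking")
    args = ["docker", "run", "-d"]
    if host_network:
        args += ["--network", "host", "--privileged"]
    for subdir, container_path in VOLUMES.items():
        # Bind mount: hypothetical host directory -> container volume.
        args += ["-v", f"{host_root}/{subdir}:{container_path}"]
    for name, value in sorted(env.items()):
        args += ["-e", f"{name}={value}"]
    args.append(image)
    return args


if __name__ == "__main__":
    argv = docker_run_args(
        "example/cassandra:5.0.2",  # hypothetical image tag
        {"LISTEN_ADDRESS": "10.0.0.5", "SEED_NODES": "10.0.0.5,10.0.0.6"},
    )
    print(shlex.join(argv))
```

Printing the argv instead of executing it makes it easy to review the command before running it by hand.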

Any arguments passed to the entry point are treated as if Cassandra itself had been called: they are passed on to Cassandra, after cassandra -f.

Alternatively, if you pass bash as the command, you get a bash shell with the required envs set.

Parameters

For the love of God, disable ADDRESS_FOR_ALL while setting up a second or third node. It sets SEED_NODES to point to this node. It's also deprecated.

Set ADDRESS_FOR_ALL to a value that will replace all *_ADDRESS variables.

The values of the following envs will be placed in cassandra.yaml verbatim (i.e. without adding quotes):

  • BROADCAST_ADDRESS, LISTEN_ADDRESS, RPC_ADDRESS, RPC_BROADCAST_ADDRESS
  • CLUSTER_NAME (will be automatically escaped with quotes), default is Test Cluster
  • SEED_NODES - list of comma separated IP addresses to bootstrap the cluster from

In general, if a name is found in cassandra.yaml with a dollar sign preceding it, it is safe to assume that the environment variable with that name will be substituted for it.

If you need quotes, bring them with you. See for example how CLUSTER_NAME is set.
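The substitution rule above can be pictured with Python's `string.Template`, which uses the same dollar-sign placeholder syntax. This is a minimal sketch and not the actual code in entrypoint.py: the YAML fragment and the `render` helper are hypothetical, and only the placeholder idea is taken from this README. Note how the value for CLUSTER_NAME brings its own quotes.

```python
from string import Template

# Hypothetical fragment in the style of a templated cassandra.yaml.
YAML_FRAGMENT = """\
cluster_name: $CLUSTER_NAME
listen_address: $LISTEN_ADDRESS
num_tokens: $NUM_TOKENS
"""


def render(template_text, env):
    """Replace $NAME placeholders with values from env, verbatim.

    Placeholders without a matching entry are left untouched.
    """
    return Template(template_text).safe_substitute(env)


if __name__ == "__main__":
    print(render(YAML_FRAGMENT, {
        "CLUSTER_NAME": "'Prod Cluster'",  # quotes are part of the value
        "LISTEN_ADDRESS": "10.0.0.5",
        "NUM_TOKENS": "256",
    }))
```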

Extra parameters for RTFM

Note that where sizes are required, you should suffix them with MiB or KiB. Where times are required, use milliseconds (ms).

  • NUM_TOKENS - by default 256, but take care
  • START_RPC - whether to start classic Cassandra Thrift RPC. Default is false, but you might wish to use true
  • RPC_PORT - port on which to start Thrift RPC, if it is enabled.
  • DISK_OPTIMIZATION_STRATEGY - pass spinning or ssd, any other option will fail with an error. Default is ssd
  • ENDPOINT_SNITCH - endpoint snitch to use, by default it's SimpleSnitch
  • AUTHENTICATOR - by default AllowAllAuthenticator, can use also PasswordAuthenticator
  • AUTHORIZER - by default AllowAllAuthorizer, can use also CassandraAuthorizer
  • PARTITIONER - partitioner to use, by default org.apache.cassandra.dht.Murmur3Partitioner
  • ROW_CACHE_SIZE - row cache size to use. Default is 0MiB, which means disabled.
  • TOMBSTONE_WARN_THRESHOLD and TOMBSTONE_FAIL_THRESHOLD - there's no unit. RTFM
  • COLUMN_INDEX_SIZE - RTFM, default is 64KiB
  • BATCH_SIZE_FAIL_THRESHOLD - batch size above which Cassandra will fail the batch. Unit is KiB. RTFM
  • BATCHLOG_REPLAY_THROTTLE - maximum speed at which the batchlog will be replayed. Default is 512 MiB, which means 512 MiB/s.
  • REQUEST_SCHEDULER - defaults to org.apache.cassandra.scheduler.NoScheduler
  • READ_REQUEST_TIMEOUT - defaults to 5000ms
  • RANGE_REQUEST_TIMEOUT - defaults to 10000ms
  • STREAM_THROUGHPUT_OUTBOUND - defaults to 25MiB/s
  • WRITE_REQUEST_TIMEOUT - defaults to 2000ms
  • MAX_HEAP_SIZE - defaults to 48g
  • NEW_HEAP_SIZE - defaults to 10g. Don't confuse it with HEAP_NEWSIZE!
  • COUNTER_WRITE_REQUEST_TIMEOUTS - defaults to 5000ms
  • JMX_AUTH - defaults to yes, set to no to disable JMX auth
  • CAS_CONTENTION_TIMEOUT - defaults to 2000ms
  • TRUNCATE_REQUEST_TIMEOUT - defaults to 60000ms
  • REQUEST_TIMEOUT - defaults to 15000ms
  • COMPACTION_THROUGHPUT - defaults to 64MiB/s
  • MAX_HINT_WINDOW - defaults to 3h
  • ENABLE_USER_DEFINED_FUNCTIONS - defaults to false
  • ENABLE_SCRIPTED_USER_DEFINED_FUNCTIONS - defaults to false
  • COMMITLOG_SEGMENT_SIZE - size of a commit log segment. Defaults to 32MiB.
  • DISABLE_PROMETHEUS_EXPORTER - if set, Prometheus' exporter will be disabled
  • KEY_CACHE_SIZE - default is auto, unit is MiB
  • FILE_CACHE_SIZE - size of chunk cache, unit is MiB
  • COMMITLOG_TOTAL_SPACE - space to use for commit log. Please specify the values, the defaults are difficult to explain.
  • COMMITLOG_SYNC - RTFM. Defaults to periodic
  • MEMTABLE_HEAP_SIZE - heap space to use for memtables. Default is 1024MiB. Postfix it with MiB please.
  • MEMTABLE_OFF_HEAP_SIZE - size of off-heap memtables. Default is 512MiB. Postfix it with MiB please.
  • STORAGE_COMPATIBILITY_MODE - used when upgrading, see "Migrating to Cassandra 5" at the end of this README. Default is NONE (bootstrap in a Cassandra 5 cluster)
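Since many of these values now carry units, a quick pre-flight check before starting a node can catch a missing suffix. The following is a rough sketch under stated assumptions: the helper `check_env` is hypothetical, it covers only a subset of the variables listed above, and the image itself may accept or reject values differently.

```python
import re

# Assumed formats, following the note above: sizes end in KiB or MiB, times in ms.
SIZE_RE = re.compile(r"^\d+(KiB|MiB)$")
TIME_RE = re.compile(r"^\d+ms$")

# Illustrative subsets of the variables documented in this README.
SIZE_VARS = ("ROW_CACHE_SIZE", "COLUMN_INDEX_SIZE", "COMMITLOG_SEGMENT_SIZE",
             "MEMTABLE_HEAP_SIZE", "MEMTABLE_OFF_HEAP_SIZE")
TIME_VARS = ("READ_REQUEST_TIMEOUT", "RANGE_REQUEST_TIMEOUT",
             "WRITE_REQUEST_TIMEOUT", "CAS_CONTENTION_TIMEOUT",
             "TRUNCATE_REQUEST_TIMEOUT", "REQUEST_TIMEOUT")


def check_env(env):
    """Return a list of problems found; an empty list means nothing was flagged."""
    problems = []
    for name in SIZE_VARS:
        if name in env and not SIZE_RE.match(env[name]):
            problems.append(f"{name}={env[name]!r}: expected a size such as 64KiB or 512MiB")
    for name in TIME_VARS:
        if name in env and not TIME_RE.match(env[name]):
            problems.append(f"{name}={env[name]!r}: expected milliseconds such as 5000ms")
    strategy = env.get("DISK_OPTIMIZATION_STRATEGY")
    if strategy is not None and strategy not in ("spinning", "ssd"):
        problems.append(f"DISK_OPTIMIZATION_STRATEGY={strategy!r}: expected spinning or ssd")
    return problems


if __name__ == "__main__":
    for problem in check_env({"MEMTABLE_HEAP_SIZE": "1024", "READ_REQUEST_TIMEOUT": "5000ms"}):
        print(problem)
```

Variables that are not set are skipped, so the defaults documented above are never flagged.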

Enabling JMX

To enable JMX [without SSL], set the environment variable LOCAL_JMX to no, and the environment variable JMX_REMOTE_PASSWORD to the remote password you want to use.

This way you will have two users created - monitorRole with read-only permissions, and controlRole with read-write JMX permissions, both having the password that you set.

If you want JMX to bind to a specific interface, define JMX_ADDRESS.

Optionals

The following envs are nice to have, but are not required:

  • CASSANDRA_DC - name of this DC that Cassandra is in. dc1 by default.
  • CASSANDRA_RACK - name of the rack that Cassandra is in, rack1 by default.

Jolokia

Jolokia is enabled by default and listens on port 8080. If you define the env DISABLE_JOLOKIA, it won't be loaded.

Extra arguments

This simply launches cassandra with a -f flag, and passes any extra arguments to that cassandra.

Health check

This container sports a built-in health check. It invokes "nodetool status" and checks its exit code. It assumes that 30 minutes is enough time for your Cassandra to start up, read its commit logs and initialize. If this is not the case, start the container with a suitable docker run --health-start-period.

To enable health check just set the environment variable HEALTHCHECK_ENABLE to 1.

If you choose not to enable the health check, the container will always be marked as healthy.
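The behaviour described above can be approximated as follows. This is a sketch only, not the script shipped in the image: the function name `healthcheck` and the handling of a missing binary are assumptions, while the use of "nodetool status" and HEALTHCHECK_ENABLE comes from this README.

```python
import os
import subprocess


def healthcheck(env, command=("nodetool", "status")):
    """Return 0 for healthy, non-zero otherwise.

    When HEALTHCHECK_ENABLE is not set to 1 the check is skipped and the
    container is reported as healthy.
    """
    if env.get("HEALTHCHECK_ENABLE") != "1":
        return 0
    try:
        result = subprocess.run(
            list(command),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except FileNotFoundError:
        # Assumption: a missing binary counts as unhealthy.
        return 1
    return result.returncode


if __name__ == "__main__":
    raise SystemExit(healthcheck(os.environ))
```

The `command` parameter exists only so the logic can be exercised without a running Cassandra node.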

Disabling Prometheus exporter

Just define an env called DISABLE_PROMETHEUS.

Enabling Jaeger tracing

In order to enable Jaeger tracing just define the envs JAEGER_AGENT_HOST, and optionally JAEGER_AGENT_PORT, which is 6831 by default.

Note that this uses our custom version of cassandra-jaeger-tracing.

In order to trace spans that don't fit within a UDP packet (you'll see lots of logs saying that the frame is too large), just define an env called JAEGER_ENDPOINT that points directly to the collector, e.g. http://jaeger_collector:14268/api/traces. In this case you don't need to set either JAEGER_AGENT_HOST or JAEGER_AGENT_PORT.

If none of these are set, Cassandra's default tracing is used.
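The three cases above can be summarized in a few lines. This is a hedged sketch: the helper `jaeger_target` is hypothetical, and letting JAEGER_ENDPOINT take precedence when both styles are set is an assumption, not documented behaviour of the image.

```python
def jaeger_target(env):
    """Decide where traces go, based on the envs described above.

    Returns a (mode, address) tuple; address is None when Jaeger is not used.
    """
    endpoint = env.get("JAEGER_ENDPOINT")
    if endpoint:
        # Assumption: a collector endpoint wins over agent settings.
        return ("collector", endpoint)
    host = env.get("JAEGER_AGENT_HOST")
    if host:
        port = env.get("JAEGER_AGENT_PORT", "6831")  # documented default
        return ("agent", f"{host}:{port}")
    return ("default", None)


if __name__ == "__main__":
    print(jaeger_target({"JAEGER_AGENT_HOST": "jaeger-agent"}))
    print(jaeger_target({"JAEGER_ENDPOINT": "http://jaeger_collector:14268/api/traces"}))
    print(jaeger_target({}))
```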

Bash

If you invoke this container with a single argument of "bash", it will drop you to a shell without starting anything.

Extra JVM_OPTS

If you set an env called EXTRA1 it will get automatically appended to cassandra-env.sh, producing an extra line of:

JVM_OPTS="$JVM_OPTS ${EXTRA1}"\n

You can add any number of these, numbering them from EXTRA1 upwards, without any limit. It is important that the numbers are consecutive. Each one simply extends your JVM_OPTS. You can use this, for example, to replace a dead node.
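To see why the numbers have to be consecutive, here is one way such a scan could work. It is a minimal sketch, not the code in entrypoint.py: the function `extra_jvm_opts_lines` is hypothetical, and it assumes the scan stops at the first missing number.

```python
def extra_jvm_opts_lines(env):
    """Return the lines to append to cassandra-env.sh for EXTRA1, EXTRA2, ...

    Scanning stops at the first gap, so EXTRA4 is ignored when EXTRA3 is missing.
    """
    lines = []
    index = 1
    while f"EXTRA{index}" in env:
        # Produces e.g.  JVM_OPTS="$JVM_OPTS ${EXTRA1}"
        lines.append(f'JVM_OPTS="$JVM_OPTS ${{EXTRA{index}}}"\n')
        index += 1
    return lines


if __name__ == "__main__":
    env = {
        # A JVM option commonly used when replacing a dead node; the address is made up.
        "EXTRA1": "-Dcassandra.replace_address_first_boot=10.0.0.9",
        "EXTRA2": "-Dexample.flag=1",  # hypothetical option
        "EXTRA4": "-Dignored=1",       # skipped: EXTRA3 is missing
    }
    print("".join(extra_jvm_opts_lines(env)), end="")
```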

Enabling assertions

Assertions are disabled by default in order to provide a modest speed-up. To enable them, use an env called ENABLE_ASSERTIONS and set it to 1.

Logging GC

GC logging is controlled by the env LOG_GC:

  • LOG_GC=none - GC is not logged (default)
  • LOG_GC=file - GC is logged to the file /var/log/cassandra.gc
  • LOG_GC=stdout - GC is logged to standard output

Correct sysctl settings

  • vm.max_map_count = 1048575
  • echo 8 > /sys/block/sda/queue/read_ahead_kb for the drive storing Cassandra data

Migrating to Cassandra 5

Remember to change the env of JAVA_HOME to /usr/lib/jvm/java-11-openjdk-amd64!

For every node:

1. Stop it
2. Check that environment variables match (they changed a lot, they added units, this README details them all)
3. Set `STORAGE_COMPATIBILITY_MODE` to `UPGRADING`
4. Start the node, wait for it to join the cluster.
5. Run `nodetool upgradesstables` on all SSTables that this node has.

Then, once all nodes have gone through the steps above, for every node:

1. Stop it
2. Change `STORAGE_COMPATIBILITY_MODE` to `NONE`
3. Start it
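The two passes above can be written down as an ordered checklist, which makes it explicit that no node switches to NONE before every node has finished the first pass. This is only a planning sketch: the function `upgrade_plan` and the node names are hypothetical, and nothing here talks to a real cluster.

```python
def upgrade_plan(nodes):
    """Return the ordered (node, action) steps for the two-pass migration above."""
    steps = []
    # First pass: bring every node up on Cassandra 5 in UPGRADING mode.
    for node in nodes:
        steps += [
            (node, "stop"),
            (node, "check that environment variables match"),
            (node, "set STORAGE_COMPATIBILITY_MODE=UPGRADING"),
            (node, "start and wait for it to join the cluster"),
            (node, "run nodetool upgradesstables"),
        ]
    # Second pass: only after the first pass is complete on all nodes.
    for node in nodes:
        steps += [
            (node, "stop"),
            (node, "set STORAGE_COMPATIBILITY_MODE=NONE"),
            (node, "start"),
        ]
    return steps


if __name__ == "__main__":
    for number, (node, action) in enumerate(upgrade_plan(["node1", "node2"]), start=1):
        print(f"{number:2d}. {node}: {action}")
```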
