Merge pull request #914 from replicatedhq/61207-kurl
Updates to Rook requirements
paigecalvert authored Nov 16, 2022
2 parents d090c65 + 94685a0 commit 410b670
Showing 2 changed files with 42 additions and 23 deletions.
44 changes: 31 additions & 13 deletions src/markdown-pages/add-ons/rook.md
@@ -10,12 +10,6 @@ addOn: "rook"
The [Rook](https://rook.io/) add-on creates and manages a Ceph cluster along with a storage class for provisioning PVCs.
It also runs the Ceph RGW object store to provide an S3-compatible store in the cluster.

The [EKCO](/docs/add-ons/ekco) add-on is recommended when installing Rook. EKCO is responsible for performing various operations to maintain the health of a Ceph cluster.
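
To include both add-ons in an installer, list them together in the kURL spec. The following is a minimal sketch, with illustrative version strings (see the flags table under Advanced Install Options for the full set of fields):

```yaml
spec:
  rook:
    version: "1.4.3"    # illustrative version string
  ekco:
    version: "latest"   # EKCO performs maintenance operations for the Ceph cluster
```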

## Advanced Install Options
@@ -36,11 +30,16 @@ flags-table
## Block Storage
For Rook versions 1.4.3 and later, block storage is required.
For Rook versions earlier than 1.4.3, block storage is recommended in production clusters.
You can enable and disable block storage for Rook versions earlier than 1.4.3 with the `isBlockStorageEnabled` field in the kURL spec.

When the `isBlockStorageEnabled` field is set to `true`, or when using Rook versions 1.4.3 and later, Rook starts an OSD for each discovered disk.
This can result in multiple OSDs running on a single node.
Rook ignores block devices that already have a filesystem on them.

The following provides an example of a kURL spec with block storage enabled for Rook:

```yaml
spec:
  rook:
    isBlockStorageEnabled: true
    blockDeviceFilter: sd[b-z]
```

In the example above, the `isBlockStorageEnabled` field is set to `true`.
Additionally, `blockDeviceFilter` instructs Rook to use only block devices that match the specified regex.
For more information about the available options, see [Advanced Install Options](#advanced-install-options) above.

The Rook add-on waits for a disk before continuing with installation.
If you attached a disk to your node, but the installer is waiting at the Rook add-on installation step, see [OSD pods are not created on my devices](https://rook.io/docs/rook/v1.0/ceph-common-issues.html#osd-pods-are-not-created-on-my-devices) in the Rook documentation for troubleshooting information.
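
A common cause is a `blockDeviceFilter` that does not match the name of the attached device. As a sketch, the filter can be adjusted to match the disk on your node (the device name `nvme1n1` below is hypothetical):

```yaml
spec:
  rook:
    isBlockStorageEnabled: true
    blockDeviceFilter: nvme1n1   # hypothetical device name; use a regex that matches your disk
```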

## Filesystem Storage

By default, for Rook versions earlier than 1.4.3, the cluster uses the filesystem for Rook storage.
However, block storage is recommended for Rook in production clusters.
For more information, see [Block Storage](#block-storage) above.
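
As a sketch, a kURL spec that keeps the default filesystem storage on a pre-1.4.3 Rook version might look like the following (the version string is an illustrative placeholder):

```yaml
spec:
  rook:
    version: "1.0.4"              # illustrative pre-1.4.3 version
    isBlockStorageEnabled: false  # filesystem storage, the default for versions earlier than 1.4.3
```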

When using the filesystem for storage, each node in the cluster has a single OSD backed by a directory in `/opt/replicated/rook`.
Nodes with a Ceph Monitor also use `/var/lib/rook`.

At minimum, 10GB of disk space must be available to `/var/lib/rook` for the Ceph Monitors and other configs.
We recommend a separate partition to prevent a disruption in Ceph's operation as a result of `/var` or the root partition running out of space.

**Note**: All disks used for storage in the cluster should be of similar size.
A cluster with large discrepancies in disk size may fail to replicate data to all available nodes.

## Shared Filesystem

21 changes: 11 additions & 10 deletions src/markdown-pages/install-with-kurl/system-requirements.md
@@ -22,8 +22,9 @@ title: "System Requirements"

* 4 AMD64 CPUs or equivalent per machine
* 8 GB of RAM per machine
* 40 GB of Disk Space per machine
* The Rook add-on version 1.4.3 and later requires block storage on each node in the cluster.
For more information about how to enable block storage for Rook, see [Block Storage](/docs/add-ons/rook#block-storage) in _Rook Add-On_.
* TCP ports 2379, 2380, 6443, 10250, 10251 and 10252 open between cluster nodes
* **Note**: When [Flannel](/docs/add-ons/flannel) is enabled, UDP port 8472 open between cluster nodes
* **Note**: When [Weave](/docs/add-ons/weave) is enabled, TCP port 6783 and UDP port 6783 and 6784 open between cluster nodes
@@ -38,7 +39,7 @@ For more information see [kURL Advanced Install Options](/docs/install-with-kurl
## Networking Requirements
### Firewall Openings for Online Installations

The following domains need to be accessible from servers performing online kURL installs.
IP addresses for these services can be found in [replicatedhq/ips](https://github.com/replicatedhq/ips/blob/master/ip_addresses.json).

| Host | Description |
@@ -50,9 +51,9 @@ IP addresses for these services can be found in [replicatedhq/ips](https://githu
No outbound internet access is required for airgapped installations.
### Host Firewall Rules

The kURL install script will prompt to disable firewalld.
Note that firewall rules can affect communications between containers on the **same** machine, so it is recommended to disable these rules entirely for Kubernetes.
Firewall rules can be added after or preserved during an install, but because installation parameters like pod and service CIDRs can vary based on local networking conditions, there is no general guidance available on default requirements.
See [Advanced Options](/docs/install-with-kurl/advanced-options) for installer flags that can preserve these rules.

The following ports must be open between nodes for multi-node clusters:
@@ -103,15 +104,15 @@ In addition to the networking requirements described in the previous section, op

### Control Plane HA

To operate the Kubernetes control plane in HA mode, it is recommended to have a minimum of 3 primary nodes.
In the event that one of these nodes becomes unavailable, the remaining two will still be able to function with an etcd quorum.
As the cluster scales, consider dedicating these primary nodes to control-plane-only workloads by using the `noSchedule` taint.
This will affect the number of nodes that need to be provisioned.

### Worker Node HA

The number of required secondary nodes is primarily a function of the desired application availability and throughput.
By default, primary nodes in kURL also run application workloads.
At least 2 nodes should be used for data durability for applications that use persistent storage (for example, databases) deployed in-cluster.

### Load Balancers
@@ -125,7 +126,7 @@ graph TB
A -->|Port 6443| D[Primary Node]
```

Highly available cluster setups that do not leverage EKCO's [internal load balancing capability](/docs/add-ons/ekco#internal-load-balancer) require a load balancer to route requests to healthy nodes.
The following requirements need to be met for load balancers used on the control plane (primary nodes):
1. The load balancer must be able to route TCP traffic, as opposed to Layer 7/HTTP traffic.
1. The load balancer must support hairpinning, i.e., nodes referring to each other through the load balancer IP.
@@ -134,7 +135,7 @@ The following requirements need to be met for load balancers used on the control
1. The load balancer should target each primary node on port 6443.
1. In accordance with the above firewall rules, port 6443 should be open on each primary node.

The IP or DNS name and port of the load balancer should be provided as an argument to kURL during the HA setup.
See [Highly Available K8s](/docs/install-with-kurl/#highly-available-k8s-ha) for more install information.
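
As a sketch, assuming the address is supplied through the `loadBalancerAddress` field of the `kubernetes` add-on in the kURL spec, with a placeholder host and port:

```yaml
spec:
  kubernetes:
    loadBalancerAddress: lb.example.com:6443   # placeholder IP or DNS name and port of the TCP load balancer
```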

For more information on configuring load balancers in the public cloud for kURL installs see [Public Cloud Load Balancing](/docs/install-with-kurl/public-cloud-load-balancing).
